Introduction to Data Warehouse Design
Data Warehousing practice has its own Development Life Cycle flow for designing and implementing the Data Warehouse systems. It involves the basic steps like Requirement Analysis, Data Source Identification, ETL Processing, Data Modeling for to elect the data model based on the requirement and data sources, and design approach for selecting the design approach based on which the Data Warehouse is to be implemented, that is, either ‘top-down approach’ or ‘bottom-up approach.’
Data warehouse has two major concepts:
- OLAP: Online Analytical Processing
- OLTP: Online Transactional Processing
Both are online processing systems but have some differences. OLTP manages the transactional application like ATM, OLAP uses for analytical processing like reporting, forecasting, etc.
- Requirement gathering is one phase in data warehouse design. It needs to determine the criteria and implement them successfully. Thus, there will be two strategies used for data warehouse design: the business, and the technical.
- The business strategy focuses on the long-term business view and helps to increase the profit for growth. The technical strategy requirement is based on user reporting, analysis, hardware selection, development method, testing technique, implementation environment, and user training.
- When we determined the business and technical strategy, we also need to design the BCP (Disaster Recovery) plan. When there is a disaster happen by human or natural, we need to have a plan to recover the data quickly and ensure no data lose. Developing the disaster recovery plan is one of the challenges and makes a trust for the organization.
- Once we gathered the data for data warehouse design, we need to make a proper environment setup for development, testing, and production. Preferably there should be a separate system for application, database, and separate for reporting/ETL as well.
- When we are building a separate environment for each, it ensures all the changes can be developed/ tested and then move on to production.
- If we have a single environment which designed for all these activities, it could be end-up with issue and data lose. For example, when there is an incident that occurred in the system, we couldn’t able to navigate and find out the way to fix it, and it makes it more complex.
- Once the requirement gathering and environment are set up, the next is to design how to connect the data source, process, and store in the data warehouse. This technique is called data modelling. It can be an analysis of the object and the relationship between the others.
- When the design of the data warehouse, the engineers designed how and where the data needs to be stored. On the same occasion, we should also define the possible way to retrieve the data from the data warehouse. Once the source is identified, the team can build the logic and create a structure schema view.
Types of Data Model
There are three types:
The Three Types of Data Model are mentioned below:
1. Conceptual: It says WHAT the system contains, and it’s designed by business Architects to define the scope for business strategy.
2. Logical: This define HOW the logical can be created in DBMS; it will be designed by a Business Analyst and Data Architect to create a set of rules to store/retrieve the data
3. Physical: This defines HOW the system can be implemented.
Use of Data Warehouse Design
Being a good data warehouse design can be time-consuming when retrieving the data. Each step has to follow effectively to make the system a good one. It will help the organization to handle complex types of data and improve productivity based on the trend analysis. So each step in DWH architecture design is important and more conscious in the selection method. The organization steps into each flow subsequently and leads to successfully implementing the data warehouse.
There are few important applications uses of Data Warehouse:
1. Banking Industry: Most of the banks are using the data warehouse for storing a large amount of transaction data and the ability to retrieve the query data much faster. It can be managed like customer data, market trends, reports, analysis, etc.
2. Finance Industry: It is similar to banking, but the only focus is to improve financial changes by analyzing the customer data.
3. Government: Nowadays Government managing a lot of data online and stores it in the relational database. Each data have a relationship with each other like Aadhaar; PAN is linked to many sources.
4. Healthcare: Healthcare managers and services so much information. It maintains the clinical details, customer records and helps them to predict the outcomes, analyse the feedback and generate the reports.
5. Insurance: Insurance company primarily used for data patterns, customer trends and maintaining records.
6. Manufacturing and Distribution Industry: It is most widely used in all industries for storing item information and helps them to predict the demand item for manufacture and sales. Analyzing the sold item which gives better decision-making techniques.
7. Retailers Services: Retailers are the middleman between the producer and the customer. Data warehouse helps them with promotions and item buying trends.
8. Telephone Industry: Telephone industries manage a lot of historical data, which helps for making the customer data trend and target to push advertising campaigns.
Advantages and Disadvantages of Data Warehouse
Given below are the advantages and disadvantages mentioned:
- Delivers Enhanced Business Intelligence.
- Ensures Data Quality and Consistency.
- Saves Time and Money.
- Tracks Historically Intelligent Data.
- Generates high ROI.
- Extra Report Work.
- Inflexibility and homogenization of Data.
- Ownership Concerns.
- Demands for Large Amounts of Resources.
- Hidden Issues Consume Time.
This is a guide to Data Warehouse Design. Here we discuss the data warehouse design technique, requirement gathering, environment set-up, types, uses, with advantages and disadvantages. You may also look at the following article to learn more –