EDUCBA

EDUCBA

MENUMENU
  • Blog
  • Free Courses
  • All Courses
  • All in One Bundle
  • Login
Home Data Science Data Science Tutorials Data Warehouse Tutorial Data Warehouse Components

Data Warehouse Components

Data Warehouse Components

Introduction to Data Warehouse Components

In the article Data Warehouse Components Data warehouse (abbreviated as DWH) is the process of capturing and maintaining business data in Relational Database for Business Intelligence (BI) practices. DWH databases are efficiently designed to store large amounts of historical business transactions for data analysis and to support Business decisions. Moreover, DWH databases are architected and optimized for query processing and multi-dimensional analysis which helps businesses with comparative study, historical data pattern analyzes, and improved decision making.

There are several related industry terms in the BI world, such as

Start Your Free Data Science Course

Hadoop, Data Science, Statistics & others

  • Decision Support system(DSS)
  • Business Information system
  • Historical database
  • Enterprise Data warehouse (EDW)

Types of Database

There are two types of databases:

    1. OLTP(Online Transnational Processing)

  • These types of databases are designed for real-time business operations or transactions.
  • Optimized for single record insert or retrieve at a time from database table and business validation rules are applied on these tables.
  • These databases generally support a very high number of concurrent user interactions to support regular business activities.
  • Examples of an OLTP will be processing and string an accounts receivable transaction or an invoice.

     2. OLAP (Online Analytical Processing)

  • These databases are designed for reporting and analysis of business data.
  • Optimized for Bulk load and complex query processing to support reporting needs on large volumes data set.
  • The OLAP databases are configured for specific users who are part of the Business Intelligence or Data Analysis department.
  • Examples of OLAP would be marinating sales data or order returned data for the business.

Basic Architecture

Basic Architecture

The basic DWH architecture consists of the following four major components

  • Source systems
  • ETL tools
  • DWH databases
  • BI Tools

1. Source Systems

The source systems are generally business applications like ERP and CRM system, which captures the transaction data from the front ad applications. The transaction data are stored mainly in relational databases or files. ETL tools connect these sources through the connectors and process these data.

2. ETL Tools

There is a popular concept associated with DWH which is called ETL(Extract, Transform & Load) which performs the Data movement from Sources systems or Databases or Files to DWH database. There are several industry demand enterprise ETL tools that are popular like Informatica power center, IBM Data Stage, Ab Initio, Alteryx, Oracle Data Integrator, and Open source tools like Talend.

3. DWH Databases

DWH is RDBMS systems that are generally created using popular database vendors like Oracle, SQL Server, or DB2. Also by the popularity of the cloud, there are several offerings from cloud vendors like AWS which provides Amazon Redshift database, or Microsoft Azure SQL Data warehouse solution are growing in demand.

4. BI Tools

Similarly, There are several BI tools that consume EDW data for BI Reports and charts creation which is referred by Business users to make data-driven business decisions. Some of the BI or Visualization tools are Tableau, Microsoft Power BI, SAP Crystal Reports, Quick View. Some organizations prefer using MS Excel also for basic Data Analysis tasks.

Also, there are several best practices that enhance the basic architecture of DWH, by introducing staging Tables and Datamart.

5. Staging table

These generally work as pre-process data tables which are useful to develop and maintain efficient ETL pipelines.

6. Data Mart

Datamart is a practice pertaining to a single subject-specific or focused business unit data. Whereas DWH is a broader scope of Data which is a superset of the data mart.

Examples:  Revenue department data analysis or Sales Data analysis of a particular business unit. whereas DWH example would be a financial analysis of the organization.

Implementation Methodology – Data Warehouse Components

Based on the Datamart approach, there are two types of implementations methods or designs that are being used.

1. Inmon Methodology

  • This is also known as the Top-down approach or Enterprise wise approach.
  • It involves first creating DWH and subsequently building multiple Data marts.

2. Kimball Methodology

  • This is also known as the Bottom-up approach or Data Mart approach.
  • It involves first creating multiple data marts and subsequently building the DWH.

Data Modelling

Next, we will discuss the data modeling part of the DWH which is the design principle for managing analysis-ready data in a larger volume. There are two popular types of schema which are being used.

  • Star Schema
  •  Snowflake Schema

Before elaborating on these schema let’s understand a few important tables type based upon the data it captures, which are the components of the schema design.

Fact Table

  • These are the type of table which stores the measures or facts.
  • These tables include numeric measure columns and foreign keys to other dimension tables.
  • Example – Sales Data fact table which has Sales in USD and timestamp columns.
  • It’s possible to have a fact table without actual fact columns, these are called factless fact tables.
  • Example-In an educational system DWH for students, teachers, and course data analysis.
  • Also, there are different types of facts
    • Additive fact –these are the fact that can be summed up through all the dimensions in the fact table
    • Nonadditive fact-These are the facts that cannot be summed up with any of the dimensions in the fact table Example- Height, width, percentage.
    • Semi additive fact-these are the facts that are summed up with some of the dimensions in the Fact table

Dimension Table

  • These are the type of table which stores the descriptive or textual value of the business attributes and most of the data are static in nature.
  • These tables are joined to the fact table through the foreign key reference.
  • Example – Product Dimension table which contains the product name, product category, product subcategory.
  • Also, there is a specific type of dimension table is being used, which is called Confirmed Dimension Table.
    • These are the dimension table which is shared by more than one data mart or more than one fact table.
    • Example: buying a product and return the product.

Star Schema

  • The fact table is surrounded by a dimension table in a star-like structure called a star schema. These dimensions are generally de-normalized for DWH.
  • This schema supports the ease of developing reports and efferent summarization of queries.
  • Following is a diagrammatical representation of Star schema for Sales Data analysis.

Star Schema

Snowflake Schema

  • The fact table is surrounded by de-normalized dimension tables and if anyone or more of the dimension tables are subdivided into another dimension table then it is called snowflake schema.
  • This schema supports further drill down of reports by retrieving the related data from the dimension table.
  • Example: Sales Data DWH with Snowflake schema where customer dimensions are subdivided further to countries dimension.
  • Following is a diagrammatical representation of Snowflake schema for sales data analysis.

Snowflake Schema

In an organization Data modelers or Data Architect are the professionals who develop these types of schemas for Multi-dimensional DWH. And they generally follow these three stages of processes for its designing.

  • Conceptual Modelling –  the conceptual level of the data model or schema.
  • Logical Modelling – the logical level of the data model or schema.
  • Physical Modelling – database-specific data model or schema.

Conclusion – Data Warehouse Components

The Data warehouse is in practice in the industry for a very long. It supports BI commonly through Batch processing or processes historical data for analysis. There are several advancements happening in the data warehouse for real-time data analysis or capturing and processing heterogeneous structures of data. The concept and technologies related to Data Lake is the latest trend in the industry which is an advancement of enterprise data warehouse concepts.

Recommended Articles

This has been a guide to the Data Warehouse Components. Here we also discuss their types, architecture, implementation methodology, and data modeling. You may also have a look at the following articles to learn more–

  1. Oracle Data Warehousing
  2. Informatica vs Datastage
  3. AWS Databases
  4. Data Warehouse Modeling
All in One Excel VBA Bundle
500+ Hours of HD Videos
15 Learning Paths
120+ Courses
Verifiable Certificate of Completion
Lifetime Access
Financial Analyst Masters Training Program
1000+ Hours of HD Videos
43 Learning Paths
250+ Courses
Verifiable Certificate of Completion
Lifetime Access
All in One Data Science Bundle
1500+ Hour of HD Videos
80 Learning Paths
360+ Courses
Verifiable Certificate of Completion
Lifetime Access
All in One Software Development Bundle
3000+ Hours of HD Videos
149 Learning Paths
600+ Courses
Verifiable Certificate of Completion
Lifetime Access
Primary Sidebar
All in One Data Science Bundle1500+ Hour of HD Videos | 80 Learning Paths | 360+ Courses | Verifiable Certificate of Completion | Lifetime Access
Financial Analyst Masters Training Program1000+ Hours of HD Videos | 43 Learning Paths | 250+ Courses | Verifiable Certificate of Completion | Lifetime Access
Footer
About Us
  • Blog
  • Who is EDUCBA?
  • Sign Up
  • Live Classes
  • Corporate Training
  • Certificate from Top Institutions
  • Contact Us
  • Verifiable Certificate
  • Reviews
  • Terms and Conditions
  • Privacy Policy
  •  
Apps
  • iPhone & iPad
  • Android
Resources
  • Free Courses
  • Database Management
  • Machine Learning
  • All Tutorials
Certification Courses
  • All Courses
  • Data Science Course - All in One Bundle
  • Machine Learning Course
  • Hadoop Certification Training
  • Cloud Computing Training Course
  • R Programming Course
  • AWS Training Course
  • SAS Training Course

ISO 10004:2018 & ISO 9001:2015 Certified

© 2023 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

By continuing above step, you agree to our Terms of Use and Privacy Policy.
*Please provide your correct email id. Login details for this Free course will be emailed to you
Let’s Get Started

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you
EDUCBA Login

Forgot Password?

By signing up, you agree to our Terms of Use and Privacy Policy.

This website or its third-party tools use cookies, which are necessary to its functioning and required to achieve the purposes illustrated in the cookie policy. By closing this banner, scrolling this page, clicking a link or continuing to browse otherwise, you agree to our Privacy Policy

Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more