EDUCBA

EDUCBA

MENUMENU
  • Free Tutorials
  • Free Courses
  • Certification Courses
  • 360+ Courses All in One Bundle
  • Login
Home Data Science Data Science Tutorials Data Warehouse Tutorial Snowflake Schema
Secondary Sidebar
Data Warehouse Tutorial
  • Basic
    • What is Data Warehouse
    • Data Warehouse tools
    • Career in Data Warehousing
    • Benefits of Data Warehouse
    • Data Warehouse Architecture
    • Data Warehouse Design
    • Data Warehouse Implementation
    • Data Warehouse Features
    • Data Warehouse Modeling
    • Data Warehouse Software
    • Data Warehousing
    • Types of Data Warehouse
    • 10 Popular Data Warehouse Tools
    • Data Lake Architecture
    • Three Tier Data Warehouse Architecture
    • Data Warehouse Process
    • Database Parallelism
    • What is OLTP
    • What is OLAP
    • OLAP Tools
    • Types of OLAP
    • Operations in OLAP
    • MOLAP
    • HOLAP
    • Data Warehouse Schema
    • Data Warehouse Components
    • Snowflake Schema
    • Snowflake Architecture
    • What is Star Schema
    • Galaxy Schema
    • What is Fact Table
    • Kimball Methodology
    • Data Warehouse Testing
    • Operational Data Stores
  • ETL
    • What is Data Mart
    • What is Data Cube
    • What is a Data Lake
    • What is Data Integration
    • What is ETL
    • What is ETL Testing
    • ETL Testing Tools
    • ETL architecture
    • Dimension Table
    • Multidimensional Data Model
    • Fact Constellation Schema
    • ETL Process
  • Interview Questions
    • Data Warehouse Interview Questions
    • ETL Interview Questions
    • ETL Testing Interview Questions
    • Data Warehousing Interview Questions

Related Courses

Business Intelligence Course

All in One Data Science Course

Data Visualization Certification Courses

Snowflake Schema

By Priya PedamkarPriya Pedamkar

snowflake schema

Introduction to Snowflake Schema

Snowflake is one of the many schema types used for the implementation of the Data Warehouse systems Architecture. In this type of schema, the data warehouse structure contains one fact table in the middle, multiple dimension tables connected to it and connected with one another as well. It should have all the dimension tables to be normalized to the last level of normalization, until there is no more space for further normalization.

Snowflake Schema must contain a single Fact Table in the center, with single or multiple levels of Dimension Table. All the Dimension Tables are completely Normalized that can lead to any number of levels. Normalization is nothing but breaking down one Dimension table into two or more Dimension tables, to make sure minimum or no redundancy. While all the first level Dimension tables are linked to the center Fact table, all the other Dimension tables can be linked to one another if required. This Structure resembles a Snowflake (Fig. 01), hence the name ‘Snowflake Schema’.

Snowflake Schema

Why Snowflake Schema?

Snowflake Schema type is selected based on multiple limitations that are considered crucial for the specific Project, by the Project Management team. Here are the basic characteristics of the Snowflake

Start Your Free Data Science Course

Hadoop, Data Science, Statistics & others

The schema that can help in this decision-making process,

  • This model can involve only one fact table and multiple dimension tables which must be further normalized until there is no more room for further normalization.
  • Snowflake Schema makes it possible for the data in the Database to be more defined, in contrast to other schemas, as normalization is the main attribute in this schema type.
  • Normalization is the key feature that distinguishes Snowflake schema from other schema types available in the Database Management System Architecture.
  • The Fact Table will have all the facts/ measures, while the Dimension Tables will have foreign keys to connect with the Fact Table.
  • Snowflake Schema allows the Dimension Tables to be linked to other Dimension tables, except for the Dimension Tables in the first level.
  • This Multidimensional nature makes it easy to implement on complex Relational Database systems, thus resulting in effective Analysis & Reporting processes.
  • In terms of Accessibility, Complex multiple levels of Join queries are required to fetch aggregated data from the central fact table, using the foreign keys to access all the required Dimension tables.
  • Multiple Dimension tables, which are created as a result of normalization, serve as lookup tables when querying with Joins.
  • The process of breaking down all the Dimension tables into multiple small Dimensions until it is completely normalized takes up a lot of storage space compared to other schemas.
  • As the querying process is complex, the pace for Data Retrieval is by far low.

Workflow of Snowflake Schema

Here we will discuss the Workflow of Snowflake Schema by explaining how to create snowflake schema along with the pros and cons.

All in One Data Science Bundle(360+ Courses, 50+ projects)
Python TutorialMachine LearningAWSArtificial Intelligence
TableauR ProgrammingPowerBIDeep Learning
Price
View Courses
360+ Online Courses | 50+ projects | 1500+ Hours | Verifiable Certificates | Lifetime Access
4.7 (86,241 ratings)

How to Create a Snowflake Schema?

When the requirement is to create a schema with a fact table ‘A’ that has 6 dimension tables ‘B, C, D, E, F, G’, and each of these dimension tables has furthermore normalization in-scope, then Snowflake schema will be the right pick in this case.

These Dimension tables ‘B, C, D, E, F, G’ are further disintegrated into further more Dimension tables. This process continues up until there is no further approach to break the already normalized Dimension tables.

Say our ‘A’ is a ‘Clothing Sales’, it could have the below dimensions as its ‘B, C, D, E, F, G’, which has the scope for further normalization –

  • Employees
  • Customer
  • Store
  • Products
  • Sales
  • Exchange

Now let us design a Snowflake Schema for this –

snowflake schema

The above Dimension tables can be further broken as –

  • Stores – Owned & Rented, which can be further broken into location, country, state, region, city/ town, etc in each level, depending on the available data and requirements.
  • Sales – Limited Editions & other Branded, which can be further broken into seasonal, nonseasonal, etc.
  • Exchanges – Reasons as the second level, Exchange for ‘money-back’ & ‘different product’ as the third level of Dimensions.
  • Products – ‘Product Types’ table as second-level Dimensions, and levels for each type of product. This can be continued until the last level of normalization.
  • Customers – ‘customer types’ as Men & Women, which can be additionally split as members, non-members, types of membership, etc.
  • Employees – the type of employees as ‘Permanent’, ‘Temporary/ Part-time’ employees. The next level here can be departments, location, Salary Grade, etc.

This can be further normalized to its final level of dimension tables, as it helps in reducing redundancy in final data. This Schema can be used for Analysis or Reporting when the focus is mainly on the Clothing Sales alone (fact table), and the first level dimensions as specified above.

Pros and Cons of Snowflake Schema

The following pros and cons are mention below –

  1. Minimum or no redundancy, as a result of Normalization, which is the core quality for Snowflake Schema.
  2. Snowflake Schema is a complex system, as it can have any number of levels of normalization depending on the depth of the given database.
  3. Data Quality will be exceptional, as Normalization grants the benefit for the well-defined form of tables/ data.
  4. If any new requirement creates a need for denormalization, data quality will be taken back and redundancy may occur. This may be lead to restructuring the entire Schema.
  5. When queried with Joins, clear & accurate data is retrieved.
  6. Maintenance is difficult as the higher-level dimensions need to be expanded constantly.
  7. High Data quality & accuracy helps in facilitating efficient Reporting & Analysis.
  8. Low performance as it required complex Join queries.
  9. Easy implementation process when provided with multipart Relational Databases.
  10. Large storage space is required for full Normalization and elaborate querying process.

Conclusion

To sum up, if the requirement comes with options for more storage, tolerance for low performance, complex tables which allow structuring with the single fact table, time & space for complete normalization, Snowflake Schema will be the best option. Though it uses complex Join queries, the output will be an accurate compilation of data that can make Analysis and Report exceptionally efficient.

Recommended Articles

This is a guide to Snowflake Schema. Here we discuss the Workflow of Snowflake Schema by explaining how to create snowflake schema. You can also go through our other suggested articles to learn more –

  1. What is Star Schema?
  2. What is MySQL Schema?
  3. MD5 Algorithm
  4. Tableau Joins
  5. Guide to Snowflake Architecture
Popular Course in this category
All in One Data Science Bundle (360+ Courses, 50+ projects)
  360+ Online Courses |  1500+ Hours |  Verifiable Certificates |  Lifetime Access
4.7
Price

View Course

Related Courses

Business Intelligence Training (12 Courses, 6+ Projects)4.9
Data Visualization Training (15 Courses, 5+ Projects)4.8
0 Shares
Share
Tweet
Share
Primary Sidebar
Footer
About Us
  • Blog
  • Who is EDUCBA?
  • Sign Up
  • Live Classes
  • Corporate Training
  • Certificate from Top Institutions
  • Contact Us
  • Verifiable Certificate
  • Reviews
  • Terms and Conditions
  • Privacy Policy
  •  
Apps
  • iPhone & iPad
  • Android
Resources
  • Free Courses
  • Database Management
  • Machine Learning
  • All Tutorials
Certification Courses
  • All Courses
  • Data Science Course - All in One Bundle
  • Machine Learning Course
  • Hadoop Certification Training
  • Cloud Computing Training Course
  • R Programming Course
  • AWS Training Course
  • SAS Training Course

ISO 10004:2018 & ISO 9001:2015 Certified

© 2022 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.

EDUCBA
Free Data Science Course

SPSS, Data visualization with Python, Matplotlib Library, Seaborn Package

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA Login

Forgot Password?

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA
Free Data Science Course

Hadoop, Data Science, Statistics & others

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

EDUCBA

*Please provide your correct email id. Login details for this Free course will be emailed to you

By signing up, you agree to our Terms of Use and Privacy Policy.

Let’s Get Started

By signing up, you agree to our Terms of Use and Privacy Policy.

This website or its third-party tools use cookies, which are necessary to its functioning and required to achieve the purposes illustrated in the cookie policy. By closing this banner, scrolling this page, clicking a link or continuing to browse otherwise, you agree to our Privacy Policy

Loading . . .
Quiz
Question:

Answer:

Quiz Result
Total QuestionsCorrect AnswersWrong AnswersPercentage

Explore 1000+ varieties of Mock tests View more