EDUCBA

CAP Theorem

By Priya Pedamkar

Overview of CAP Theorem

Consistency, availability, and partition tolerance are three main aspects of a modern distributed data system. The CAP theorem was introduced by Eric Brewer in 2000 to describe the trade-offs involved in maintaining network-based database systems. In the era of petabyte-scale data, it has become immensely important to develop and maintain distributed data systems that can handle the load. In this article, we will discuss the key points of the CAP theorem, how it differs from ACID, and why it is important in the current technological landscape.

Key Points on CAP Theorem

The three aspects of the CAP theorem are consistency, availability, and partition tolerance. Let’s first discuss each of these separately, and then join the pieces.

1. Consistency

According to this principle, all connected nodes of the distributed system see the same value at the same time: a read returns the most recent successful write, no matter which node serves it. In practice this also means that a partially applied transaction must never become visible. Suppose a transaction has multiple steps and, due to some malfunction, an intermediate operation is corrupted. If some of the connected nodes read the corrupted value, the data will be inconsistent and misleading. So according to the CAP principle, such a transaction is not allowed. A transaction cannot be executed partially; it is always ‘all or none’ (the behavior that ACID calls atomicity). If something goes wrong during the execution of a transaction, the whole transaction needs to be rolled back.
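The guarantee described above can be illustrated with a minimal in-memory sketch. The `Replica` and `ConsistentStore` classes are hypothetical names introduced only for this example; a real distributed database would need a coordination protocol over the network, which is assumed away here.

```python
class Replica:
    """One node holding a copy of the data (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ConsistentStore:
    """Toy store that finishes a write only after every replica has applied it."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Assumption: all replicas are reachable and the loop cannot fail midway.
        for replica in self.replicas:
            replica.data[key] = value

    def read(self, key, replica_index):
        # Any replica may serve the read; all of them hold the latest value.
        return self.replicas[replica_index].data.get(key)


store = ConsistentStore([Replica("n1"), Replica("n2"), Replica("n3")])
store.write("balance", 100)
values = [store.read("balance", i) for i in range(3)]
```

Because the write touches every replica before it returns, each entry in `values` is the same, whichever node answers the read.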

2. Availability

According to this principle, the connected or distributed system should remain operational all the time. Every client request should receive a response, regardless of whether any particular node is available. In a practical scenario, how much availability is needed depends on the traffic requirements. The key point is that every functioning node must return a response to all read and write requests in a reasonable amount of time.
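From the client's point of view, availability can be sketched as follows: if one node is down, another functioning node still answers. The `Node` class and the `read_from_any` helper are hypothetical and simulate node failure with an in-memory flag rather than a real network.

```python
class Node:
    """A simulated node; `up` is a stand-in for real health (an assumption)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def get(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)


def read_from_any(nodes, key):
    """Return (node name, value) from the first functioning node."""
    for node in nodes:
        try:
            return node.name, node.get(key)
        except ConnectionError:
            continue  # try the next node instead of failing the request
    raise RuntimeError("no functioning node could answer")


nodes = [Node("n1", up=False), Node("n2"), Node("n3")]
for node in nodes:
    node.data["user:1"] = "alice"

answer = read_from_any(nodes, "user:1")
```

Here `n1` is unavailable, so the request is served by `n2`. The request fails only when no node at all is functioning.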

3. Partition tolerance

According to the partition tolerance principle, if part of the network is compromised, the entire distributed system should not go down. A partition-tolerant system keeps operating when messages between nodes are lost or delayed, and recovers quickly from a partial outage. In practical scenarios partition tolerance cannot be an optional criterion; it must always be maintained. So adhering to the CAP theorem always becomes a choice between high consistency and high availability.

Why is the CAP theorem important?

After the internet boom in 2005, the size of data has been growing exponentially. In the early stages, the only option for keeping up with the ever-changing scale of data and planning capacity properly was to increase capacity vertically, which means upgrading a machine with more CPU, memory, or storage. But this is not always feasible or cost-effective. The newer approach is to add capacity horizontally, which means adding more machines and leveraging distributed computing. Once data is spread across machines connected by a network, the trade-offs described by the CAP theorem have to be taken into account.

We cannot maintain all three principles of the CAP theorem simultaneously. Theoretically, a system can guarantee only two of them at a time: CA, CP, or AP.

  • Consistency and Availability: These are systems with high consistency and very little downtime, but partition tolerance is not enforced. For example, a network problem can take down an entire distributed RDBMS.
  • Consistency and Partition tolerance: These systems adhere to high consistency and partition tolerance, but there is a risk of some data being unavailable. MongoDB is often cited as an example.
  • Availability and Partition tolerance: These systems adhere to high availability and partition tolerance, but there is a risk of reading inconsistent data. Cassandra is often cited as an example.

How is the CAP theorem different from ACID properties?

Before we talk about the differences, let’s talk about the ACID properties in brief.

1. Atomicity

All changes to data are performed as a single operation. That is, all or none: either all of the operations are performed or none of them is. For example, in an application that transfers funds from one account to another, the atomicity property ensures that if a debit is made successfully from one account, then the associated credit is also made to the other account.
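The funds transfer example can be sketched with Python's built-in `sqlite3` module. The table layout, account names, and the `transfer` helper are assumptions made for illustration. Used as a context manager, a `sqlite3` connection commits the transaction if the block succeeds and rolls it back if an exception is raised, which gives the all-or-none behavior described above.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        if cur.rowcount != 1:
            # The credit did not happen, so the debit must be undone as well.
            raise ValueError(f"unknown destination account: {dst}")


def balance(conn, name):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


conn = open_bank()
transfer(conn, "alice", "bob", 30)       # both updates are committed together
try:
    transfer(conn, "alice", "carol", 10)  # credit fails, so the debit is rolled back
except ValueError:
    pass
```

After the failed second transfer, the debit from `alice` is not kept: her balance is the same as it was after the first, successful transfer.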

2. Consistency

For each transaction, the system should move from one consistent state to another consistent state.

3. Isolation

All transactions should be executed in isolation from other transactions. During concurrent execution, the intermediate results of one transaction should not be visible to transactions running in parallel. Failure of one transaction should not affect another transaction.

4. Durability

After every successful transaction, the changes made in the database should persist. Even if the system is compromised or fails, the effects of successfully committed transactions should persist.

Now we can see that these terms technically refer to different things. They are related in that a distributed database system that guarantees ACID transactions must choose consistency over availability according to the CAP theorem (i.e., it is a CP system).

On the other hand, if a distributed database chooses availability over consistency in accordance with the CAP theorem (i.e., it is an AP system), it cannot strictly follow the ACID properties.

Conclusion

In this article we have discussed the principles of the CAP theorem and why it is still important in the current context. We also discussed how the CAP theorem differs from, and relates to, another set of database design principles (ACID). In most practical use cases partition tolerance always needs to be maintained, so it becomes a choice between high availability and high consistency.

© 2020 - EDUCBA. ALL RIGHTS RESERVED. THE CERTIFICATION NAMES ARE THE TRADEMARKS OF THEIR RESPECTIVE OWNERS.