Introduction to Cassandra Interview Questions
This article presents Cassandra interview questions and answers. Apache Cassandra is a highly available, distributed NoSQL database management system. It is open source and designed to handle large amounts of data while providing high availability with no single point of failure. Cassandra became a top-level Apache project in 2010. It is written in Java and can therefore run on a wide range of operating systems and platforms. It can serve as a real-time data store for online applications and as a read source for business intelligence systems.
List of top 10 frequently asked 2019 Cassandra Interview Questions and Answers:
So you have finally found your dream Cassandra job but are wondering how to crack the 2019 Cassandra interview and what the likely Cassandra interview questions might be. Every Cassandra interview is different, and so is the scope of each job. With this in mind, we have put together the most common Cassandra interview questions and answers to help you succeed in your interview.
Part 1 – Cassandra Interview Questions (Basics)
This first part covers the basic Interview Questions.
1. What is NoSQL? How many types of NoSQL databases are there?
NoSQL (sometimes expanded to “not only SQL”) is a broad category of database management systems that differ from the classic relational database management system (RDBMS) model in some significant ways. NoSQL systems typically:
– Are designed specifically for high load
– Natively support horizontal scalability
– Do not usually store data in relational tables
– Often offer eventual consistency rather than ACID transactions
– Store data in a denormalized manner
In contrast to RDBMS, NoSQL systems:
• Usually do not offer support for distributed transactions
• Do not always guarantee strong data consistency
• Often lack some advanced RDBMS features, such as triggers, views, and stored procedures
NoSQL implementations can be categorized by their manner of implementation:
1. Document Stores (MongoDB, Couchbase)
2. Key-Value Stores (Redis, Voldemort)
3. Column Stores (Cassandra)
4. Graph Stores (Neo4j, Giraph)
5. Multivalued databases
6. Object databases
7. Tuple store
2. Explain what Cassandra is. Why is Cassandra preferred over other NoSQL databases such as HBase?
Apache Cassandra is a highly available, distributed NoSQL database management system. It is open source and designed to handle large amounts of data while providing high availability with no single point of failure. Cassandra was developed at Facebook, and after Facebook open-sourced the code, Cassandra became a top-level Apache project in 2010. Cassandra is written in Java and can run on a wide range of operating systems and platforms. It can serve as both:
• A real-time data store for online applications
• A read source for business intelligence systems
Cassandra is designed for performance and availability with large-scale distributed data, and it is optimized for very fast writes.
The main reasons for choosing Cassandra are:
- Scalability from gigabytes to petabytes
- It is a column-oriented database
- No single point of failure
- No need for a separate caching layer
- Flexible schema design
- It has flexible data storage, easy data distribution, and fast writes
- It provides atomic, isolated, and durable writes at the partition level, with tunable consistency rather than full ACID (Atomicity, Consistency, Isolation, and Durability) transactions
- Multi-datacentre and cloud capable
- Data compression
3. What is SSTable?
SSTable stands for ‘Sorted String Table’. It is the on-disk file to which memtables are flushed, and SSTables exist for each Cassandra table. Being immutable, SSTables do not allow any further addition or removal of data items once written. For every SSTable, Cassandra creates three supporting structures: a partition index, a partition summary, and a bloom filter.
4. Define the memtable in Cassandra.
The memtable is a memory-resident data structure. After a write is recorded in the commit log, the data is written to the memtable. The memtable is an in-memory, write-back cache that holds content in key and column format. The data in a memtable is sorted by key, and each column family has its own memtable, which retrieves column data via the key.
5. How does Cassandra store data?
- All data is stored as bytes.
- When you specify a validator, Cassandra ensures that those bytes are encoded as required.
- A composite is just a byte array with a specific encoding: for each component it stores a two-byte length, followed by the byte-encoded component, followed by a one-byte end-of-component marker.
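The composite layout described above can be sketched in a few lines of Python. This is an illustration only, not Cassandra's own code: the function names are hypothetical, the length is assumed to be big-endian, and the terminator is written here as a single zero byte.

```python
import struct

END_OF_COMPONENT = b"\x00"  # assumed terminator value for this sketch


def encode_composite(components):
    """Encode each component as: 2-byte big-endian length, raw bytes, terminator."""
    out = bytearray()
    for comp in components:
        if len(comp) > 0xFFFF:
            raise ValueError("component does not fit in a two-byte length")
        out += struct.pack(">H", len(comp))  # two-byte length prefix
        out += comp                          # the byte-encoded component
        out += END_OF_COMPONENT              # terminator after every component
    return bytes(out)


def decode_composite(data):
    """Reverse encode_composite and return the list of component byte strings."""
    components = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        components.append(data[pos:pos + length])
        pos += length + 1  # skip the value and its terminator
    return components
```

For example, encode_composite([b"us", b"2019"]) produces the length 2, the bytes of "us", a terminator, then the length 4, the bytes of "2019", and a final terminator.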
Part 2 – Cassandra Interview Questions (Advanced)
Let us now have a look at the advanced Interview Questions
6. What are CQL collections in Cassandra?
Cassandra provides a command-line shell, cqlsh, in which you can execute Cassandra Query Language (CQL) statements. In Cassandra, you can use CQL collections in the following ways:
- List: used when the order of the data has to be maintained and a value may be stored multiple times (holds an ordered list of elements that need not be unique)
- Set: used for a group of unique elements that are stored and returned in sorted order
- Map: a data type used to store key-value pairs
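As a rough analogy only, the three collection types behave much like Python's built-in containers. The snippet below does not talk to Cassandra, and the variable names and values are made up for illustration.

```python
# List: keeps insertion order and allows the same value more than once.
emails = ["a@example.com", "b@example.com", "a@example.com"]

# Set: duplicates collapse to one element; sorted() mimics the sorted
# order in which a CQL set is returned.
tags = sorted({"nosql", "cassandra", "nosql"})

# Map: key-value pairs looked up by key.
phones = {"home": "555-0100", "work": "555-0199"}
```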
7. Explain the Cassandra data model.
The Cassandra data model consists of four main components: cluster, keyspace, column, and column family.
- Cluster: A cluster contains many nodes (machines) and can contain multiple keyspaces.
- Keyspace: A keyspace is a namespace that groups multiple column families.
- Column: A column contains a name, a value, and a timestamp.
- Column family: A column family contains multiple columns referenced by a row key.
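The nesting of these four pieces can be pictured with a small in-memory model. This is a sketch under assumptions: the class and method names are hypothetical, and a real cluster distributes this data across nodes rather than holding it in one object.

```python
from dataclasses import dataclass, field
import time


@dataclass
class Column:
    name: str
    value: bytes
    timestamp: int  # write time, in microseconds for this sketch


@dataclass
class ColumnFamily:
    rows: dict = field(default_factory=dict)  # row key -> {column name -> Column}

    def put(self, row_key, name, value, timestamp=None):
        ts = timestamp if timestamp is not None else time.time_ns() // 1000
        self.rows.setdefault(row_key, {})[name] = Column(name, value, ts)

    def get(self, row_key, name):
        return self.rows[row_key][name]


@dataclass
class Keyspace:
    column_families: dict = field(default_factory=dict)  # name -> ColumnFamily


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)      # machine names
    keyspaces: dict = field(default_factory=dict)  # name -> Keyspace


# Usage: cluster -> keyspace -> column family -> row key -> column
cluster = Cluster(nodes=["node1", "node2"])
cluster.keyspaces["shop"] = Keyspace()
cluster.keyspaces["shop"].column_families["users"] = ColumnFamily()
users = cluster.keyspaces["shop"].column_families["users"]
users.put("user:1", "email", b"a@example.com", timestamp=1)
```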
8. Explain how Cassandra writes data.
Cassandra first writes data to a commit log on disk and then to an in-memory structure called a memtable. A write is successful when both of these steps are complete. Writes are batched in memory and later flushed to disk in a table structure called an SSTable (sorted string table). Memtables and SSTables are created per column family. If a fault occurs while writing to the SSTable, Cassandra can simply replay the commit log. With this design, Cassandra keeps disk I/O to a minimum and offers high-speed write performance, because the commit log is append-only and Cassandra does not seek on writes.
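The write path can be illustrated with a toy, single-process model. Everything in it is a simplifying assumption: the class name Table, the memtable_limit flush trigger, and the in-memory lists standing in for files on disk are all hypothetical.

```python
class Table:
    """Toy write path: commit log append, memtable update, flush to an SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory key -> value
        self.sstables = []     # each flush adds one immutable, key-sorted tuple
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the commit log
        self.memtable[key] = value            # 2. update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable as a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need replaying

    def recover(self):
        # After a crash the memtable is lost; rebuild it from the commit log.
        self.memtable = {}
        for key, value in self.commit_log:
            self.memtable[key] = value

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            for k, v in sstable:
                if k == key:
                    return v
        return None
```

Note that write never modifies an existing SSTable: it only appends to the log and updates memory, which is why the write path avoids disk seeks.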
9. Explain how Cassandra deletes data.
SSTables are immutable, so a row cannot be removed from them in place. When a row has to be deleted, Cassandra writes a special marker called a tombstone in place of the column value. When the data is read, a value marked with a tombstone is treated as deleted. The data is physically removed later, during compaction.
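A minimal sketch of tombstone handling follows. It assumes each SSTable is a plain dictionary that is never modified after creation, and it ignores the grace period that Cassandra waits before dropping tombstones. The names TOMBSTONE, read, and compact are hypothetical.

```python
TOMBSTONE = object()  # sentinel standing in for Cassandra's tombstone marker


def read(sstables, key):
    """Return the newest value for key, or None if it is absent or deleted."""
    for sstable in reversed(sstables):  # newest SSTable first
        if key in sstable:
            value = sstable[key]
            return None if value is TOMBSTONE else value
    return None


def compact(sstables):
    """Merge SSTables, keeping the newest entry per key and dropping tombstones."""
    merged = {}
    for sstable in sstables:  # oldest first, so newer entries overwrite older ones
        merged.update(sstable)
    return {k: v for k, v in merged.items() if v is not TOMBSTONE}


# A delete is just a newer SSTable entry holding the tombstone.
older = {"user:1": "alice", "user:2": "bob"}
newer = {"user:1": TOMBSTONE}
```

Here read([older, newer], "user:1") returns None even though the old value is still present in the older SSTable, and compact([older, newer]) keeps only "user:2".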
10. What is tunable consistency in Cassandra? How many types of tunable consistency are supported in Cassandra?
Tunable consistency is a standout feature of Cassandra that makes it a popular choice. Consistency refers to how up to date and synchronized a row of data is across all of its replicas. Cassandra’s tunable consistency lets users pick the consistency level best suited to their use case.
It supports two consistencies: Eventual Consistency and Strong Consistency.
Eventual Consistency – Eventual consistency guarantees that, if no new updates are made to a given data item, all accesses eventually return the last updated value. Systems with eventual consistency are said to have achieved replica convergence.
Strong Consistency – Cassandra provides strong consistency when the following condition holds:
R + W > N
N: Number of replicas
W: Number of nodes that need to agree for a successful write
R: Number of nodes that need to agree for a successful read
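The condition can be checked with simple arithmetic. The helper names below are hypothetical; the sketch only encodes the inequality R + W > N, which means every set of replicas read overlaps every set of replicas written by at least one node.

```python
def is_strongly_consistent(r, w, n):
    """True when read and write replica sets must overlap: R + W > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


def quorum(n):
    """Size of a majority of n replicas."""
    return n // 2 + 1
```

With N = 3, reading and writing at quorum (R = 2, W = 2) satisfies the condition because 2 + 2 > 3, while R = 1 and W = 1 does not, so a read may miss the latest write.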
This has been a guide to Cassandra interview questions and answers. Here we have covered a few commonly asked interview questions with detailed answers to help candidates crack interviews with ease.