Introduction to Bitmap Indexing
The bit is the smallest unit used to represent data that has 0 or 1 as its values. It is the bit arrays used to answer queries by performing bitwise logical operations on bitmaps. Indexing refers to a data structure that organizes data based on a specific schema. This bitmap indexing technique is used in huge databases when the column has low cardinality and is frequently used in the query. Low cardinality columns refer to columns with few unique values. We shall discuss why it is needed, its uses, advantages, and disadvantages of it in today’s IT field.
It is used in Data warehousing applications with a large amount of data and complex queries but low-level transactions, for such kind of applications, it helps in reducing response time for complex queries, space reduction compared to other techniques of indexing, increase in hardware performance with less number of CPU’s and memory and effective maintenance. Low cardinality columns have several distinct values absolutely or relatively to the number of records containing data. At extreme, low cardinality can also have Boolean values i.e. true or false.
The index provides pointers to rows in a table that contain given key value. A normal index stores list of row ids for each key parallel to the row with key-value whereas it stores each key-value replacing its list of row ids.
The need for Bitmap Indexing
Let us consider that a Public school holds Student data having columns as ‘student name’, ‘student’, ‘specialization’, ‘marks’. On the assumption that students will have to give exams only once in an academic year, hence the table will be updated once and remain static throughout the year.
As columns will be frequently used for querying to retrieve data, like a Total number of male students which needs file organization method to give quicker results. As traditional file organization method is not fast comparatively, hence comes into picture Bitmap Indexing for higher-level storage and retrieval of data.
Consider below an example of bitmap indexing which can explain Use of it,
Consider a table with columns, employee, employeeName, gender, employmentStatus.
The gender column has only two distinct values ‘Male’ or ‘Female’, hence it is useful for bitmap indexing. On comparing the whole employee table, they have no unique value except for the gender column.
The employee table has only 5000 rows, the number of indices on the column will be equal to the number of distinct values in the column. The gender column will have 2 bitmap indices, one for male and other for female. If there is any matching value in the column for a row, that particular row bit will be ‘1’ or ‘0’. Bit value for ‘male’ index be ‘1’ and for ‘female’ be ‘0’.
As we are considering to index gender column, bitmap index ‘F’ has value ‘1010’ whereas ‘M’ has ‘0101’
Suppose we need to find details of Employee with ‘Weekly based’ employee status and is a male.
SELECT * FROM employee WHERE gender=’M’ and employeeStatus=’Weekly based’;
Hence, the Bitmap Index for Male: 0101, similarly index for Weekly based is 0001. In DBMS, logical AND operation is performed to find the output of the query.
0101 Bitmap index for gender ‘Male’
0001 Bitmap index for employment Status as ‘Weekly based’
= 0001 The result here represents that the last column has to be retrieved.
Lesser the unique values in the column, lesser is the cardinality. Hence Cardinality is directly proportional to a count of unique values. Even with less cardinality, these columns are used for querying often. A bitmap indexing can also be built on columns with higher cardinality, consider a table with million rows having 50000 rows with distinct values. B-tree indexing can be used for data with high cardinality.
High cardinality bitmap indexing would not work as there has to be a new data structure record for each unique value in the column, as the number of unique value increases, the index will take up more space leading to inefficiency.
Advantages and Disadvantages of Bitmap Indexing
Below are the advantages and disadvantages:
- It helps in faster retrieval of data when fewer cardinality columns and are used in querying frequently
- It is efficient even if the table has a huge number of records.
- It would be much more efficient if table columns are involved in insertion/ updating/ deletion operations
- Multiple indices can be combined for executing a query.
- A Bitmap join index can be created on a temporary table
- Bitmap indexing is not suitable for tables with fewer records, as DBMS forces full scan over the table for result instead of using bitmap indexing
- Multiple insertions/ updating/ deletion operation from different users can cause deadlock
- To perform DML operations, bitmap indexing takes time to perform the transaction and update its bitmap index.
- A large number of records can cause overhead in maintaining it.
- It has to be modified throughout if the user enters a new record which is time-consuming and difficult.
- Only one table can be updated by different transactions while using the bitmap join index.
- It can not be created with the UNIQUE attribute in the table.
As we have come to an end of the article, let us summarize what have we gone through. We have seen what Bitmap Indexing means and what it does. How it is performed in SQL via AND operation has been illustrated above. Need for it, its uses, advantages, and disadvantages.
This is a guide to Bitmap Indexing. Here we discuss an introduction to Bitmap Indexing, needs with explanation, advantages, and disadvantages. You can also go through our other related articles to learn more –