Updated April 10, 2023

Introduction to Data Integrity

Data integrity is the maintenance and assurance of data accuracy and consistency over the data's entire life cycle. It is a critical aspect of the design, implementation, and use of any system that stores, processes, or retrieves data. It also applies to data protection and security in the context of regulatory compliance. Data integrity is maintained by a set of procedures, guidelines, and specifications put in place during the requirements and design phases. It ensures that the information is protected from outside influences.

Data stored in a database must be correct and consistent, so it must satisfy certain rules. The database management system provides various ways to implement these rules, which improves the integrity of the database.

Why do we need Data Integrity?

Data integrity is essential for the following reasons:

All key business processes are managed using electronic control systems.
Computer systems increasingly interact with one another through the internet, mobile devices, enterprise systems, wireless systems, etc.
Regulatory bodies are focusing more on data integrity-related issues such as errors in the transcription of raw data, data flows, etc.

Types of Data Integrity

The types of data integrity are listed below:

Physical integrity: Physical integrity refers to safeguarding the completeness and accuracy of data during storage and retrieval. This type of integrity is jeopardized when natural disasters occur, hackers disrupt database functions, or the power goes out.

Entity integrity: Entity integrity relies on primary keys, unique values that identify each row, so that no record is stored more than once and no key field is left empty. It is a characteristic of relational databases, which store information in tables that can be connected and used in a number of ways.
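Entity integrity can be illustrated with a primary key. The following is only a minimal sketch, assuming SQLite through Python's standard sqlite3 module; the customers table, its columns, and the helper functions are hypothetical names chosen for the example.

```python
import sqlite3


def open_customer_db():
    """Create an in-memory database whose customers table has a primary key."""
    conn = sqlite3.connect(":memory:")
    # customer_id identifies each row; email may be neither missing nor repeated.
    conn.execute(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            email       TEXT NOT NULL UNIQUE
        )
        """
    )
    return conn


def add_customer(conn, customer_id, email):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO customers (customer_id, email) VALUES (?, ?)",
            (customer_id, email),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_customer_db()
print(add_customer(conn, 1, "ada@example.com"))    # first row is accepted
print(add_customer(conn, 1, "grace@example.com"))  # duplicate primary key is rejected
print(add_customer(conn, 2, None))                 # missing email is rejected
```

The database, not the application, rejects the duplicate and the incomplete row, so every program that writes to the table is held to the same rule.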

Domain integrity: In this context, a domain is the set of acceptable values that a column may contain. Constraints and other measures that restrict the format, type, and amount of data entered can be used to enforce it.

Referential integrity: Referential integrity refers to the set of procedures that ensures that data is stored and used consistently. Rules governing the use of foreign keys are embedded in the database so that only appropriate additions, changes, or deletions of data are made. These rules can prevent the duplication of data, ensure data accuracy, and prohibit the entry of data that does not apply.
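Domain and referential integrity can both be expressed as constraints in the table definition. The sketch below again assumes SQLite via Python's sqlite3 module; the tables, the allowed status values, and the helper names are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3


def open_shop_db():
    """Create customers and orders tables with domain and foreign-key rules."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign-key enforcement off unless it is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # CHECK constraints define the domain of quantity and status;
    # REFERENCES ties every order to an existing customer.
    conn.execute(
        """
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            quantity    INTEGER NOT NULL CHECK (quantity > 0),
            status      TEXT NOT NULL CHECK (status IN ('new', 'shipped', 'cancelled'))
        )
        """
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
    return conn


def is_accepted(conn, sql, params=()):
    """Run one statement; return False if an integrity rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_shop_db()
print(is_accepted(conn, "INSERT INTO orders VALUES (1, 1, 2, 'new')"))    # valid order
print(is_accepted(conn, "INSERT INTO orders VALUES (2, 99, 1, 'new')"))   # unknown customer
print(is_accepted(conn, "INSERT INTO orders VALUES (3, 1, 0, 'new')"))    # quantity outside domain
print(is_accepted(conn, "DELETE FROM customers WHERE customer_id = 1"))   # would orphan order 1
```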

User-defined integrity: User-defined integrity refers to the rules and constraints that users create to meet their specific requirements. When it comes to securing data, entity, referential, and domain integrity are not always enough.

Logical integrity: In a relational database, logical integrity ensures that the data remains intact as it is used in various ways. Like physical integrity, logical integrity protects the data from human error and hackers.
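A user-defined integrity rule is often a business rule that spans more than one table. One way to enforce such a rule is a trigger. The sketch below assumes SQLite via Python's sqlite3 module; the credit-limit rule, the table layout, and the names enforce_credit_limit and place_order are hypothetical.

```python
import sqlite3


def open_credit_db():
    """Create tables plus a trigger that enforces a custom business rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, credit_limit INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, "
        "amount INTEGER NOT NULL)"
    )
    # Hypothetical rule: an order may not exceed the customer's credit limit.
    conn.execute(
        """
        CREATE TRIGGER enforce_credit_limit
        BEFORE INSERT ON orders
        WHEN NEW.amount > (SELECT credit_limit FROM customers
                           WHERE customer_id = NEW.customer_id)
        BEGIN
            SELECT RAISE(ABORT, 'order exceeds the customer credit limit');
        END
        """
    )
    conn.execute("INSERT INTO customers VALUES (1, 500)")
    return conn


def place_order(conn, order_id, customer_id, amount):
    """Return True if the order was stored, False if the trigger rejected it."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer_id, amount)
        )
        return True
    except sqlite3.DatabaseError:
        return False


conn = open_credit_db()
print(place_order(conn, 1, 1, 200))  # within the limit, accepted
print(place_order(conn, 2, 1, 900))  # above the limit, rejected by the trigger
```

The same rule could instead be checked in application code; placing it in the database means it applies to every application that writes to the table.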

Advantages

The advantages are listed below:

Data integrity helps ensure the privacy and safety of customers.
It also helps ensure the quality of the product.
It helps to protect data during end-to-end transfer over a transmission medium.
Stored procedures can be used to keep complete control over data access.
It improves reusability and maintainability.

Disadvantages

The disadvantages are listed below:

Implementing a data integrity program is complex.
The database management system must be able to enforce data integrity for every application that uses the data.
With so many new information and communication technologies, data integrity is difficult to implement across an entire system.
It lacks structural independence.

Risks to Data Integrity

Human error: Data integrity is put in jeopardy when users enter data incorrectly, duplicate or delete data, make mistakes while implementing procedures meant to safeguard information, or do not follow the appropriate protocol.

Bugs and viruses: Spyware, malware, and viruses are pieces of software that can invade a computer and steal, alter, or delete data.

Transfer errors: A transfer error occurs when data is not successfully transferred from one location in a database to another. For example, in a relational database, a transfer error has occurred when a piece of data is present in the destination table but not in the source table.
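A transfer between two tables can be checked by comparing the rows on each side. The following is a minimal sketch in plain Python, assuming the rows have already been read into lists of tuples; compare_transfer and the sample rows are hypothetical.

```python
from collections import Counter


def compare_transfer(source_rows, destination_rows):
    """Report rows that differ between the source and the destination.

    Counter keeps duplicates, so a row copied twice is also reported.
    """
    source = Counter(source_rows)
    destination = Counter(destination_rows)
    return {
        "missing_in_destination": sorted((source - destination).elements()),
        "unexpected_in_destination": sorted((destination - source).elements()),
    }


source_rows = [(1, "ada"), (2, "grace"), (3, "linus")]
destination_rows = [(1, "ada"), (3, "linus"), (4, "ken")]

report = compare_transfer(source_rows, destination_rows)
print(report["missing_in_destination"])     # rows that were lost in transfer
print(report["unexpected_in_destination"])  # rows with no counterpart in the source
```

An empty report on both sides indicates that the two sets of rows match. For large tables the same comparison is usually done inside the database rather than in application memory.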

Compromised hardware: Hardware is compromised by significant failures such as a sudden computer or server crash or problems with how a computer or other device functions. Compromised hardware may render data incorrectly or incompletely, make information hard to use, or limit or eliminate access to the data.
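One common way to detect data that was damaged in storage is to keep a checksum alongside it and recompute the checksum on every read. The sketch below uses SHA-256 from Python's standard hashlib module; the function names and the sample record are hypothetical, and a checksum only detects damage, it does not repair it.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_checksum: str) -> bool:
    """Return True if the data still matches the checksum stored earlier."""
    return checksum(data) == expected_checksum


# Checksum computed when the record was written.
original = b"customer_id,balance\n1,250.00\n"
stored_checksum = checksum(original)

# Simulate a record that was altered by a faulty disk or a failed write.
damaged = original.replace(b"250.00", b"950.00")

print(is_intact(original, stored_checksum))  # unchanged data passes the check
print(is_intact(damaged, stored_checksum))   # altered data fails the check
```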

Conclusion

In this article, we have discussed the basic concepts of data integrity and its types. We have also discussed its advantages, disadvantages, and risks. Data integrity refers to the structure of the data and how well it matches the schema of the database.