Data Integrity Integrity Rules Codd's 12 Rules


Data Integrity, Integrity Rules and Codd's 12 rules

Data Integrity
Data integrity refers to the process of ensuring that a database remains an accurate reflection of the universe of discourse it is modeling or representing. In other words, there is a close correspondence between the facts stored in the database and the real world it models.

Database integrity ensures that data entered into the database is accurate, valid, and consistent. Any applicable integrity constraints and data validation rules must be satisfied before permitting a change to the database.
Three basic types of database integrity constraints are:
Entity integrity: not allowing multiple rows to have the same identity within a table.
Domain integrity: restricting data to predefined data types, e.g. dates.
Referential integrity: requiring the existence of a related row in another table, e.g. a customer for a given customer ID.

It is important that data adhere to a predefined set of rules, as determined by the database administrator or application developer.

Example
As an example of data integrity, consider the tables employees and departments and the business rules for the information in each of the tables, as illustrated.

Note that some columns in each table have specific rules that constrain the data contained within them.

Types of Data Integrity


Null Rule
A null rule is a rule defined on a single column that allows or disallows inserts or updates of rows containing a null (the absence of a value) in that column.
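A null rule can be tried out in a few lines. The sketch below is only an illustration: it assumes SQLite through Python's standard sqlite3 module rather than Oracle, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: last_name disallows nulls, phone allows them.
conn.execute(
    "CREATE TABLE employees ("
    " emp_id INTEGER PRIMARY KEY,"
    " last_name TEXT NOT NULL,"
    " phone TEXT)"
)

# A null in the nullable column is accepted.
conn.execute("INSERT INTO employees (last_name, phone) VALUES ('Smith', NULL)")

# A null in the NOT NULL column is rejected by the database itself.
try:
    conn.execute("INSERT INTO employees (last_name, phone) VALUES (NULL, '555-0100')")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(null_rejected, row_count)
```

Only the first row is stored; the second insert fails before any data is written.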

Unique Column Values


A unique value rule defined on a column (or set of columns) allows the insert or update of a row only if it contains a unique value in that column (or set of columns).
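As a rough illustration of a unique value rule on a set of columns, the sketch below again assumes SQLite via Python's sqlite3 module; the departments table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rule: the pair (dept_name, location) must be unique.
conn.execute(
    "CREATE TABLE departments ("
    " dept_id INTEGER PRIMARY KEY,"
    " dept_name TEXT,"
    " location TEXT,"
    " UNIQUE (dept_name, location))"
)

conn.execute("INSERT INTO departments (dept_name, location) VALUES ('Sales', 'Boston')")
# Same name but a different location: the combination is still unique.
conn.execute("INSERT INTO departments (dept_name, location) VALUES ('Sales', 'Denver')")

# Repeating an existing combination violates the rule.
try:
    conn.execute("INSERT INTO departments (dept_name, location) VALUES ('Sales', 'Boston')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

dept_count = conn.execute("SELECT COUNT(*) FROM departments").fetchone()[0]
print(duplicate_rejected, dept_count)
```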

Primary Key Values


A primary key value rule defined on a key (a column or set of columns) specifies that each row in the table can be uniquely identified by the values in the key.
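The point of a primary key is that a key value picks out exactly one row, even when other columns repeat. A minimal sketch, assuming SQLite via Python's sqlite3 module and a hypothetical employees table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)"
)

# Two employees share a last name; the key still tells them apart.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)", [(100, "Smith"), (101, "Smith")]
)

# Reusing an existing key value is rejected.
try:
    conn.execute("INSERT INTO employees VALUES (100, 'Jones')")
    duplicate_key_rejected = False
except sqlite3.IntegrityError:
    duplicate_key_rejected = True

# A lookup by key returns exactly one row.
matches = conn.execute(
    "SELECT last_name FROM employees WHERE emp_id = ?", (101,)
).fetchall()
print(duplicate_key_rejected, matches)
```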

Referential Integrity Rules
A referential integrity rule is a rule defined on a key (a column or set of columns) in one table that guarantees that the values in that key match the values in a key in a related table (the referenced value). Referential integrity also includes the rules that dictate what types of data manipulation are allowed on referenced values and how these actions affect dependent values. The rules associated with referential integrity are:
1. Restrict: Disallows the update or deletion of referenced data.
2. Set to Null: When referenced data is updated or deleted, all associated dependent data is set to NULL.
3. Set to Default: When referenced data is updated or deleted, all associated dependent data is set to a default value.
4. Cascade: When referenced data is updated, all associated dependent data is correspondingly updated. When a referenced row is deleted, all associated dependent rows are deleted.
5. No Action: Disallows the update or deletion of referenced data. This differs from RESTRICT in that it is checked at the end of the statement, or at the end of the transaction if the constraint is deferred. (Oracle uses No Action as its default action.)
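Three of the referential actions listed above can be observed side by side. This is a sketch under stated assumptions: it uses SQLite via Python's sqlite3 module (where foreign keys must be switched on per connection), not Oracle, and all table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY,
    dept_id INTEGER REFERENCES departments(dept_id) ON DELETE SET NULL
);
CREATE TABLE budgets (
    budget_id INTEGER PRIMARY KEY,
    dept_id INTEGER REFERENCES departments(dept_id) ON DELETE CASCADE
);
CREATE TABLE offices (
    office_id INTEGER PRIMARY KEY,
    dept_id INTEGER REFERENCES departments(dept_id) ON DELETE RESTRICT
);
INSERT INTO departments VALUES (10, 'Sales'), (20, 'Research');
INSERT INTO employees VALUES (1, 10);
INSERT INTO budgets VALUES (1, 10);
INSERT INTO offices VALUES (1, 20);
""")

# Deleting department 10 nulls the employee's reference and removes the budget.
conn.execute("DELETE FROM departments WHERE dept_id = 10")
employee_dept = conn.execute(
    "SELECT dept_id FROM employees WHERE emp_id = 1"
).fetchone()[0]
budget_rows = conn.execute("SELECT COUNT(*) FROM budgets").fetchone()[0]

# Deleting department 20 is refused because an office still references it.
try:
    conn.execute("DELETE FROM departments WHERE dept_id = 20")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

print(employee_dept, budget_rows, restrict_blocked)
```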

Complex Integrity Checking
Complex integrity checking is a user-defined rule for a column (or set of columns) that allows or disallows inserts, updates, or deletes of a row based on the value it contains for the column (or set of columns). Most of these rules are easily defined using integrity constraints or database triggers.
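A user-defined rule of this kind might look as follows. The business rule itself (salary must be positive, commission may not exceed salary) is invented for illustration, and the sketch assumes SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rules: one on a single column, one spanning two columns.
conn.execute(
    "CREATE TABLE employees ("
    " emp_id INTEGER PRIMARY KEY,"
    " salary REAL NOT NULL CHECK (salary > 0),"
    " commission REAL,"
    " CHECK (commission IS NULL OR commission <= salary))"
)
conn.execute("INSERT INTO employees VALUES (1, 5000, 500)")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


negative_salary_rejected = violates("INSERT INTO employees VALUES (2, -10, NULL)")
high_commission_rejected = violates(
    "UPDATE employees SET commission = 9000 WHERE emp_id = 1"
)
print(negative_salary_rejected, high_commission_rejected)
```

The check applies to updates as well as inserts, so an existing valid row cannot be changed into an invalid one.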

Integrity Constraints
An integrity constraint is a declarative method of defining a rule for a column of a table. Oracle supports the following integrity constraints:
NOT NULL constraints for the rules associated with nulls in a column
UNIQUE key constraints for the rule associated with unique column values
PRIMARY KEY constraints for the rule associated with primary identification values
FOREIGN KEY constraints for the rules associated with referential integrity. Oracle supports the use of FOREIGN KEY integrity constraints to define the referential integrity actions, including:
    Update and delete No Action
    Delete CASCADE
    Delete SET NULL
CHECK constraints for complex integrity rules

Introduction to Triggers
You can write triggers that fire whenever one of the following operations occurs:
DML statements (INSERT, UPDATE, DELETE) on a particular table or view, issued by any user
DDL statements (CREATE or ALTER primarily) issued either by a particular schema/user or by any schema/user in the database
Database events, such as logon/logoff, errors, or startup/shutdown, also issued either by a particular schema/user or by any schema/user in the database

Triggers are similar to stored procedures. A trigger stored in the database can include SQL and PL/SQL or Java statements to run as a unit and can invoke stored procedures. However, procedures and triggers differ in the way that they are invoked. A procedure is explicitly run by a user, application, or trigger. Triggers are implicitly fired by Oracle when a triggering event occurs, no matter which user is connected or which application is being used.
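The implicit firing of a DML trigger can be sketched as below. Note the assumptions: this uses SQLite trigger syntax through Python's sqlite3 module, not Oracle PL/SQL, and the audit table and trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, salary REAL);
CREATE TABLE salary_audit (emp_id INTEGER, old_salary REAL, new_salary REAL);

-- Fires implicitly after every salary update, whoever issues it.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON employees
BEGIN
    INSERT INTO salary_audit VALUES (OLD.emp_id, OLD.salary, NEW.salary);
END;

INSERT INTO employees VALUES (1, 4000), (2, 5000);
""")

# The application only updates employees; it never touches salary_audit.
conn.execute("UPDATE employees SET salary = salary + 500 WHERE emp_id = 1")

audit_rows = conn.execute(
    "SELECT emp_id, old_salary, new_salary FROM salary_audit"
).fetchall()
print(audit_rows)
```

The audit row appears without any explicit call, which is the difference from a stored procedure described above.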

Codd's 12 rules
The evolution of relational data storage began in 1970 with the work of Dr. E. F. Codd, who proposed the relational model for representing data and the relationships between pieces of data. Codd's work formed the basis for the development of systems to manage data. Today, Relational Database Management Systems (RDBMS) are the result of Codd's vision.
Codd's twelve rules, which he published in 1985, are actually a set of thirteen rules (numbered zero to twelve). Proposed by Edgar F. Codd, a pioneer of the relational model for databases, they define what is required from a database management system in order for it to be considered relational, i.e., a relational database management system (RDBMS). They are sometimes jokingly referred to as "Codd's Twelve Commandments".

Rules in five functional areas

Foundational rules
Structural rules
Integrity rules
Data manipulation rules
Data independence rules

Foundational rules (Rule 0 and Rule 12)


These rules provide a test to assess whether a system is a relational DBMS. If they are not satisfied, the product should not be considered relational.

Rule 0: The system must qualify as relational, as a database, and as a management system. For a system to qualify as a relational database management system (RDBMS), that system must use its relational facilities (exclusively) to manage the database.
Rule 12: The nonsubversion rule: If the system provides a low-level (record-at-a-time) interface, then that interface cannot be used to subvert the system, for example, bypassing a relational security or integrity constraint. A low-level (single-record-at-a-time) language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language (multiple-records-at-a-time). All database access is controlled by the DBMS so that the integrity of the database cannot be compromised without the knowledge of the user or the Database Administrator (DBA). However, this does not prohibit the use of a language with a record-at-a-time interface.

Structural rules (Rule 1 and Rule 6)

The fundamental structural concept is the relation. An RDBMS must support several structural features, including relations, domains, primary keys, and foreign keys. There should be a primary key for each relation in the database.

Rule 1: The information rule:


All information in the database is to be represented in only one way, namely by values in column positions within rows of tables. All information is represented explicitly at the logical level by values in tables. All information, even metadata, must be stored as relations, and managed by the same operational functions used to maintain data. Here, "logical level" means that physical constructs, such as indexes, are not represented and need not be explicitly referenced by a user in a retrieval operation, even if they exist.
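Metadata stored as relations can be seen in practice by querying a system catalog with an ordinary SELECT. The sketch below assumes SQLite, whose catalog table is named sqlite_master, accessed via Python's sqlite3 module; the two user tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT)")

# The catalog is itself a table, read with the same query language as user data.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```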

Rule 6: The view updating rule:


All views that are theoretically updatable must be updatable by the system. No system truly supports this feature, because conditions have not been found yet to identify all theoretically updatable views.
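One system that falls short of this rule is SQLite, which treats every view as read-only unless the designer supplies the translation by hand. The sketch below, using Python's sqlite3 module and hypothetical table, view, and trigger names, shows a view that is clearly updatable in theory (a projection that keeps the primary key) being refused, and then made updatable with an INSTEAD OF trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, last_name TEXT, salary REAL);
INSERT INTO employees VALUES (1, 'Smith', 4000);
-- A simple projection: each view row maps to exactly one base row.
CREATE VIEW employee_names AS SELECT emp_id, last_name FROM employees;
""")

# SQLite refuses to modify a view directly.
try:
    conn.execute("UPDATE employee_names SET last_name = 'Jones' WHERE emp_id = 1")
    direct_update_ok = True
except sqlite3.Error:
    direct_update_ok = False

# An INSTEAD OF trigger tells SQLite how to translate the view update.
conn.executescript("""
CREATE TRIGGER employee_names_update
INSTEAD OF UPDATE ON employee_names
BEGIN
    UPDATE employees SET last_name = NEW.last_name WHERE emp_id = OLD.emp_id;
END;
""")
conn.execute("UPDATE employee_names SET last_name = 'Jones' WHERE emp_id = 1")

new_name = conn.execute(
    "SELECT last_name FROM employees WHERE emp_id = 1"
).fetchone()[0]
print(direct_update_ok, new_name)
```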

Integrity rules (Rule 3 and Rule 10)

Support of data integrity is an important criterion when assessing the suitability of a product. The more integrity constraints maintained by the DBMS product, rather than by application programs, the better the guarantee of data quality.

Rule 3: Systematic treatment of null values: The DBMS must allow each field to remain null (or empty). Specifically, it must support a representation of "missing information and inapplicable information" that is systematic, distinct from all regular values (for example, "distinct from zero or any other number", in the case of numeric values), and independent of data type. It is also implied that such representations must be manipulated by the DBMS in a systematic way.
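The following sketch illustrates how a null differs from zero and is handled uniformly by comparisons and aggregates. It assumes SQLite via Python's sqlite3 module (where SQL NULL surfaces as Python's None); the readings table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO readings (value) VALUES (?)", [(0,), (None,), (7,)]
)


def scalar(sql):
    """Run a query and return the single value it produces."""
    return conn.execute(sql).fetchone()[0]


# Comparing null with null yields null (unknown), not true.
null_equals_null = scalar("SELECT NULL = NULL")

# Zero and null are distinct: each predicate matches a different row.
zero_rows = scalar("SELECT COUNT(*) FROM readings WHERE value = 0")
null_rows = scalar("SELECT COUNT(*) FROM readings WHERE value IS NULL")

# Aggregates skip nulls systematically.
counted = scalar("SELECT COUNT(value) FROM readings")
average = scalar("SELECT AVG(value) FROM readings")
print(null_equals_null, zero_rows, null_rows, counted, average)
```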
Rule 10: Integrity independence: Integrity constraints must be specified separately from application programs and stored in the catalog. It must be possible to change such constraints as and when appropriate without unnecessarily affecting existing applications.

Data manipulation rules (Rule 2, Rule 4, Rule 5, and Rule 7)


An ideal relational DBMS should support 18 manipulation features. These features define the completeness of the query language. Adherence to these rules insulates the user and application programs from the physical and logical mechanisms that implement the data management capabilities.

Rule 2: The guaranteed access rule: All data must be accessible. This rule is essentially a restatement of the fundamental requirement for primary keys. It says that every individual scalar value in the database must be logically addressable by specifying the name of the containing table, the name of the containing column and the primary key value of the containing row.
Rule 4: Active online catalog based on the relational model: The system must support an online, inline, relational catalog that is accessible to authorized users by means of their regular query language. That is, users must be able to access the database's structure (catalog) using the same query language that they use to access the database's data.
Rule 5: The comprehensive data sublanguage rule: The system must support at least one relational language that:
Has a linear syntax,
Can be used both interactively and within application programs,
Supports data definition operations (including view definitions), data manipulation operations (update as well as retrieval), security and integrity constraints, and transaction management operations (begin, commit, and rollback).
Rule 7: High-level insert, update, and delete: The system must support set-at-a-time insert, update, and delete operators. This means that data can be retrieved from a relational database in sets constructed of data from multiple rows and/or multiple tables. This rule states that insert, update, and delete operations should be supported for any retrievable set rather than just for a single row in a single table.
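Rules 2 and 7 can be illustrated together: one value addressed by table name, column name, and primary key value, and one statement that updates a whole set of rows. This is a sketch assuming SQLite via Python's sqlite3 module, with a hypothetical employees table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
INSERT INTO employees VALUES
    (1, 'Sales', 4000), (2, 'Sales', 4200), (3, 'Research', 5000);
""")

# Rule 2: table (employees) + column (salary) + key value (2) address one scalar.
salary_of_2 = conn.execute(
    "SELECT salary FROM employees WHERE emp_id = 2"
).fetchone()[0]

# Rule 7: a single statement updates every row in the selected set.
cursor = conn.execute(
    "UPDATE employees SET salary = salary + 100 WHERE dept = 'Sales'"
)
rows_updated = cursor.rowcount

salaries = conn.execute(
    "SELECT salary FROM employees ORDER BY emp_id"
).fetchall()
print(salary_of_2, rows_updated, salaries)
```

No row-by-row loop appears in the application code; the set to be changed is described once and the DBMS applies the change to all of it.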

Data independence rules (Rule 8, Rule 9, and Rule 11)

These rules specify the independence of data from the applications that use the data. Adherence to these rules ensures that both users and developers are protected from having to change the applications following low-level reorganizations of the database.

Rule 8: Physical data independence: Changes to the physical level (how the data is stored, whether in arrays or linked lists etc.) must not require a change to an application based on the structure.
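A small illustration of physical data independence: adding an index changes how the data is stored and searched, yet the application's query text and its result stay the same. The sketch assumes SQLite via Python's sqlite3 module; the table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, last_name TEXT);
INSERT INTO employees VALUES (1, 'Smith'), (2, 'Jones'), (3, 'Smith');
""")

# The application's query never mentions any physical structure.
query = "SELECT emp_id FROM employees WHERE last_name = 'Smith' ORDER BY emp_id"
before = conn.execute(query).fetchall()

# Physical change: a new index on the searched column.
conn.execute("CREATE INDEX idx_employees_last_name ON employees (last_name)")

after = conn.execute(query).fetchall()
print(before, after)
```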
Rule 9: Logical data independence: Changes to the logical level (tables, columns, rows, and so on) must not require a change to an application based on the structure. Logical data independence is more difficult to achieve than physical data independence.
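One modest case of logical data independence is adding a column: an application query that names the columns it needs keeps working unchanged. The sketch assumes SQLite via Python's sqlite3 module; the table, the added hire_date column, and the helper function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, last_name TEXT);
INSERT INTO employees VALUES (1, 'Smith'), (2, 'Jones');
""")


def employee_names(connection):
    """Application query that names exactly the columns it uses."""
    return connection.execute(
        "SELECT emp_id, last_name FROM employees ORDER BY emp_id"
    ).fetchall()


before = employee_names(conn)

# Logical change: the table gains a new column.
conn.execute("ALTER TABLE employees ADD COLUMN hire_date TEXT")

after = employee_names(conn)
print(before, after)
```

Harder cases, such as splitting a table in two, typically need a view to preserve the old shape, which is part of why logical independence is more difficult to achieve.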

Rule 11: Distribution independence: The distribution of portions of the database to various locations should be invisible to users of the database. Existing applications should continue to operate successfully:
when a distributed version of the DBMS is first introduced; and
when existing distributed data are redistributed around the system.
Distribution independence means that an application program that accesses the DBMS on a single computer should also work without modification, even if the data is moved about from computer to computer, in a network environment.

Thank You
