Skip to content

Page380

Referential, Semantic, and Entity Integrity

Databases must ensure the integrity of the data in the tables: this is called data integrity, discussed in the “Database Integrity” section below. There are three additional specific integrity issues that must be addressed beyond the correctness of the data itself: Referential, Semantic, and Entity Integrity. These are tied closely to the logical operations of the DBMS.

Referential integrity means that every foreign key in a secondary table matches a primary key in the parent table: if this is not true, referential integrity has been broken. Semantic integrity means that each attribute (column) value is consistent with the attribute data type. Entity integrity means each tuple has a unique primary key that is not null. The HR database table shown in Table 9.2, as seen previously, has referential, semantic, and entity integrity. Table 9.3, on the other hand, has multiple problems: one tuple violates referential integrity, one tuple violates semantic integrity, and the last two tuples violate entity integrity.

Table. 9.3 Database Table Lacking Integrity.

SSN Vacation Time Sick Time
467-51-9732 7 days 14 days
737-54-2268 3 days Nexus 6
133-73-1337 16 days 22 days
133-73-1337 15 days 20 days

The tuple with the foreign key 467-51-9732 has no matching entry in the employee database table. This breaks referential integrity: there is no way to link this entry to a name or title. Cell “Nexus 6” violates semantic integrity: the sick time attribute requires values of days, and “Nexus 6” is not a valid amount of sick days. Finally, the last two tuples both have the same primary key (primary to this table; foreign key to the parent employees table); this breaks entity integrity.

Database Normalization

Database normalization seeks to make the data in a database table logically concise, organized, and consistent. Normalization removes redundant data, and improves the integrity and availability of the database. Normalization has three rules, called forms (see http://www.informit.com/articles/article.aspx?p=30646 for more information):

  • First Normal Form (1NF): Divide data into tables.
  • Second Normal Form (2NF): Move data that is partially dependent on the primary key to another table. The HR Database (Table 9.2) is an example of 2NF.
  • Third normal Form (3NF): Remove data that is not dependent on the primary key [26].

Database Views

Database tables may be queried; the results of a query are called a database view. Views may be used to provide a constrained user interface: for example, non-management employees can be shown their individual records only via database views. Table 9.4 shows the database view resulting from querying the employee table “Title” attribute with a string of “Detective.” While employees of the HR department may be able to view the entire employee table, this view may be authorized for the captain of the detectives, for example.

Table. 9.4 Employee Table Database View "Detective.

SSN Name Title
425-22-8422 Gaff Detective
737-54-2268 Rick Deckard Detective