picture by Greg via Flickr
While working on database projects, we often find ourselves doing reverse engineering. Reverse engineering is the inverse to normal development. Developers start with an application and work backwards to understand the software and infer its intent. Reverse engineering can apply to a variety of artifacts, such as hardware, programming code, and databases. Our focus here is on relational databases.
There are many reasons for database reverse engineering. One reason is to assess software quality. For example, you may want to assess the quality of a vendor product or an internal legacy application. Information systems revolve about a database, so you can use database quality as an indicator of software quality.
This article is the first in a series of two blogs that present our grading scale for database quality. We assign separate grades for the quality of a database design (this blog) and the underlying model (next blog). The design grade measures the quality of the database syntax. The model grade measures the quality of the semantic concepts underlying the database. Applications can have different design and model grades.
Most business leaders do not want to hear about database gore, such as primary keys, foreign keys, nulls, and indexing. Furthermore, even if they understand such technology, they may find it difficult to weigh their importance. We have found it helpful to quantify a database evaluation in terms of a grade. Business leaders readily understand the meaning of the grade, realize that we have the supporting technical details, and can learn about the details at their leisure.
Here is our grading scale for database designs. We use grades of “A”, “B”, “C”, “D”, and “F”. “A” is the best grade and “F” is the worst grade with “B”, “C”, and “D” denoting intermediate quality.
|Grade||Explanation||Design Flaw Examples|
|A||The database has no major flaws. The style is reasonable and uniform.|
|B||The database has flaws that are not apparent in the software’s operation. The flaws can be easily repaired.||• Data types and lengths are inconsistent.
• Required fields are not defined as not null.
• Unique keys and enumerations are unspecified.
• Columns have cryptic names.
|C||The database has major flaws that are difficult to fix and cause noticeable problems (bugs, slow performance, difficult maintenance) in the application.||• Primary keys are not defined.
• Indexing is haphazard; many foreign keys lack indexes and some indexes are subsumed by other indexes.
• Foreign keys have mismatching data types.
|D||The database has severe flaws that compromise the application.||• The database has much redundant data (non-DW).
• The database has binary data from programming code.
• There may be gross denormalization (non-DW).
• Tables have dangling foreign key references.
|F||The database is appalling. The application runs only because of brute-force programming effort.||• There are gross design errors.|
A database design with a grade of ‘B’ is still quite good, but not perfect. Some might regard the distinction between ‘A’ and ‘B’ as quibbling.A database design with a grade of ‘A’ is expertly crafted. Given the demonstrated proficiency, it’s likely that the corresponding software is also of high quality.
A database design with a grade of ‘C’ is mediocre. It serves business needs, but should be better.
A database design with a grade of ‘D’ is defective. It is poor quality work. Redundancy and denormalization are acceptable for data warehouses, but not for day-to-day operational applications.
A database design with a grade of ‘F’ is an embarrassment. Such a design is a hack and reflects a lack of understanding of databases.
As experts, we must convey our technical judgment to management for their decisions. The best way to communicate is by summarizing the results and providing details upon request. We have found database grades to be helpful as their meaning is intuitive.