picture by Travis via Flickr
In the database literature, the lore has been to first design a schema and then predict the query mix to decide on a tuning strategy. This approach can work, but it is risky and error prone. A small misjudgment of the query mix can lead to vastly inferior performance. For many applications a better strategy is to index every foreign key. That is often sufficient and further tuning is unnecessary.
This advice applies to operational applications and not data warehouses. Data warehouses have tuning strategies of their own.
What is a Foreign Key?
A foreign key is a reference to a unique identifier that is usually the primary key. Foreign-to-primary-key bindings couple the tables of a relational database. For example, a database can have Customer and Order tables. A Customer can have many Orders; an Order is for one Customer. The Order table has a foreign key to the primary key of Customer.
What is an Index?
An index is similar to a phone book that is ordered by last name and then first name. With a phone book, it’s quick to find someone given their name. Considering the placement of the name in the alphabet we can try various pages in the phone book and usually find a person with a few tries. The phone book example illustrates the database mechanics of using an index to quickly find a record.
In contrast, it’s much slower to find someone using their address or phone number. We would have to scan the phone book, page by page, until we finally found them. If the address or phone number was there, we’d have to search half the phone book on average. If the data was not there, we’d have to search the entire book for confirmation.
Now consider that there are two phone books. One is the current phone book for a city. The other is a phone book for the same city, but ten years old. Using the two phone books, it’s easy to look up a name in each and find out if a person lives at the same address now as they did ten years ago. Since both phone books are ordered by name, we can quickly find a person’s entry and join information from one phone book to the other. If there are indexes, databases use similar logic in joining foreign keys to primary keys. Without indexes databases have to resort to record by record searches.
Index Every Foreign Key
In database queries, joins provide the usual means for connecting tables. In practice, most joins are traversals from a foreign key to its primary key. These joins are efficient if every primary key and foreign key has an index. Most databases create an index as a side effect of a primary key declaration. But you have to explicitly index the foreign keys. A missing index can slow performance by orders of magnitude.
So make sure that every foreign key has an index. Do not make any exceptions. Do not decide that some foreign keys are unimportant and omit their indexes. Some database design tools, such as ERwin, have an option to index all foreign keys. If you follow our advice, many applications will have excellent performance from the start. For the remaining applications, you can study the bottlenecks and modify the schema, tuning, and queries for a specific solution.