picture by Kanban Tool via Flickr
We define agile analytics as the iterative and rapid processing of data for decision support. There is minimal upfront investment. Developers instead process source data and build queries incrementally as business needs emerge.
Big Data and data warehouses are both technologies for realizing agile analytics. They differ in several ways.
Continue reading Agile Analytics
picture by Abby Sasser via Flickr
Few database projects start with a clean slate. Many operational applications have legacy databases as a source of data and ideas. Analytical applications have the operational databases that are feeding data. Developers often encounter existing databases that are poorly documented and need to figure them out. We use the term database archaeology to refer to the study of database artifacts.
Continue reading Database Archaeology
I’m proficient with ERwin and recently had the opportunity to try ER/Studio. I must say that I was impressed. ER/Studio is the best data modeling tool that I’ve seen to date. My experiments did not cover every aspect. But I did look at enough features to form a clear opinion.
Continue reading Comments on ER/Studio
A data warehouse is a database dedicated to decision support and business analysis. The inputs to a data warehouse are data from the day-to-day operational systems. A data warehouse integrates the input data and restructures it so that it is amenable to data mining.
With an agile approach developers build a data warehouse rapidly to get it into the hands of business users so they can give feedback. Where possible, SQL code substitutes for programming and ETL scripts.
The course is organized about a threaded case study for the retail industry.
picture by Alejandro via Flickr
Over the years we’ve seen a number of projects where application architects use a generic layer to hide a database. This is a common approach with object-oriented languages accessing a relational database. Application code accesses the layer which in turn accesses the database. The use of a generic layer can be a valuable technique, but it is overused. Some architects seem to be unaware that there are other possibilities.
Continue reading Be Wary of Generic Database Layers
picture by lili chin via Flickr
The UML is a popular notation for modeling software. Even though the UML was mostly developed for programming, it is also relevant for databases. This article takes a critical look at using the UML for databases.
Continue reading When To Use the UML for Databases
picture by Rudolf Vicek via Flickr
Generally the most pressing problems for software development concern quality, time to market, and cost. If you define referential integrity (RI) in your software you can improve all three of these items. RI improves quality by ensuring that data references truly exist and cannot be dangling. RI also reduces development time and lowers cost as it takes much less effort to define RI than to program the equivalent application code.
Continue reading Do Use Referential Integrity
picture by shinichi via Flickr
I often work with programmers to build applications. It never ceases to amaze me that many programmers look at SQL as a store and retrieve mechanism and see nothing more. Their approach is to write programming code for all their logic. It’s not just a matter of programmers being unable to write complex SQL. It’s also a matter of them not imagining what is possible.
Continue reading SQL Is More than Just Store and Retrieve
picture by Travis via Flickr
In the database literature, the lore has been to first design a schema and then predict the query mix to decide on a tuning strategy. This approach can work, but it is risky and error prone. A small misjudgment of the query mix can lead to vastly inferior performance. For many applications a better strategy is to index every foreign key. That is often sufficient and further tuning is unnecessary.
Continue reading Please Index Foreign Keys
picture by Thor via Flickr
Here’s a chapter from my book Patterns of Data Modeling. This chapter presents six data modeling patterns for trees — UML data models, IE data models, SQL queries, and examples. The book has much more information about patterns including patterns for directed graphs, patterns for undirected graphs, antipatterns, and archetypes.