A data warehouse is a database that is dedicated to data analysis and reporting. It combines data from multiple operational applications and provides one location for decision-support data. A warehouse should include staging tables — one staging table for each source table or file.
Do you use SQL as part of your work? If so, would you like to use it with greater proficiency?
Check out my new free video course on Udemy — Advanced SQL Queries – Subtleties of Joins. In this course we study different ways of phrasing database joins and their trade-offs.
picture by Kanban Tool via Flickr
I’ve been practicing agile database techniques for about twenty years now. My use of agile techniques didn’t start as an explicit plan. Rather it evolved over time as I was working on consulting projects. It made sense to look for ways of working faster and better and with greater customer interaction.
I can think of at least three kinds of agile database techniques.
- For data modeling.
- For data warehouse development.
- For database reverse engineering.
picture by xianrendujia via Flickr
Data warehouses have a much different architecture and different business motivations than operational applications. For example, operational applications manage the day-to-day data needed to support the business. In contrast, data warehouse restructure operational data and place it in a format amenable to data mining and deep analysis. Operational applications rapidly read and write transactions with small amounts of data. In contrast, users only read data warehouses and can have extensive queries involving large data sets running for multiple minutes.
My new video course — Agile Data Warehouse Design — is now on the market.
A data warehouse is a database dedicated to decision support and business analysis. The inputs to a data warehouse are data from the day-to-day operational systems. A data warehouse integrates the input data and restructures it so that it is amenable to data mining.
With an agile approach developers build a data warehouse rapidly to get it into the hands of business users so they can give feedback. Where possible, SQL code substitutes for programming and ETL scripts.
The course is organized about a threaded case study for the retail industry.
We define agile analytics as the iterative and rapid processing of data for decision support. There is minimal upfront investment. Developers instead process source data and build queries incrementally as business needs emerge.
Big Data and data warehouses are both technologies for realizing agile analytics. They differ in several ways.
picture by shinichi via Flickr
I often work with programmers to build applications. It never ceases to amaze me that many programmers look at SQL as a store and retrieve mechanism and see nothing more. Their approach is to write programming code for all their logic. It’s not just a matter of programmers being unable to write complex SQL. It’s also a matter of them not imagining what is possible.