During our third week, we have been finishing Part 1 of our first project, which has us designing a database with data capture, or OLTP (online transaction processing), in mind. This initial design relies primarily on what we learned last week about normalizing databases to reduce data redundancy. By contrast, this week’s readings have focused on OLAP (online analytical processing), which serves a different purpose than OLTP. My understanding of these differences is outlined in the table below:
OLTP | OLAP |
---|---|
Purpose is operational record keeping | Purpose is analytical decision making / evaluation of performance (answer how/what/why questions) |
Optimized to process transactions quickly, usually one transaction at a time | Optimized for queries that search and summarize hundreds or thousands of rows (or more) at a time |
Updated to maintain the current state (historical records are typically not kept) | Historical records are important to provide context for the evaluation of performance over time |
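
To make the last row of the table concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `account` and `balance_history` tables and their columns are hypothetical names invented for illustration; they do not come from the readings or from our project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- OLTP-style table: one row per account, overwritten in place.
    CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance INTEGER);
    -- Warehouse-style table: one row per snapshot, never overwritten.
    CREATE TABLE balance_history (account_id INTEGER, as_of TEXT, balance INTEGER);
    INSERT INTO account VALUES (1, 100);
    INSERT INTO balance_history VALUES (1, '2024-01-01', 100);
""")

# Operational write: a single transaction updates a single row,
# so the previous balance is lost.
conn.execute("UPDATE account SET balance = 80 WHERE account_id = 1")

# Analytical load: the new state is appended, so the old state survives.
conn.execute("INSERT INTO balance_history VALUES (1, '2024-01-02', 80)")

current = conn.execute(
    "SELECT balance FROM account WHERE account_id = 1"
).fetchone()[0]
history = conn.execute(
    "SELECT as_of, balance FROM balance_history "
    "WHERE account_id = 1 ORDER BY as_of"
).fetchall()
```

After this runs, `current` holds only the latest balance, while `history` still contains both snapshots, which is what makes evaluation over time possible.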
OLTP and OLAP differ fundamentally because the goals of the end user are different. While OLTP systems typically use relational databases that are normalized (usually to third normal form) to reduce data redundancy and to process single transactions quickly, OLAP systems make use of dimensional modeling to deliver data that is understandable and appealing to the end user.
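As a rough illustration of dimensional modeling, the sketch below builds a tiny star schema with Python's built-in `sqlite3` module: one fact table of measurements surrounded by descriptive dimension tables. The table names, column names, and sample rows are all assumptions made up for this example. Note that `category` is stored directly on the product dimension rather than normalized into its own table, which trades some redundancy for simpler queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive context, deliberately denormalized.
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
    -- Fact table: numeric measurements plus keys into each dimension.
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        sales_amount INTEGER);

    INSERT INTO dim_product VALUES
        (1, 'Espresso', 'Beverage'), (2, 'Bagel', 'Food');
    INSERT INTO dim_date VALUES
        (20240101, '2024-01-01', '2024-01'),
        (20240201, '2024-02-01', '2024-02');
    INSERT INTO fact_sales VALUES
        (20240101, 1, 2, 6), (20240101, 2, 1, 3), (20240201, 1, 3, 9);
""")

# A typical analytical question: total sales by month and category.
sales_by_category_month = conn.execute("""
    SELECT d.month, p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY d.month, p.category
    ORDER BY d.month, p.category
""").fetchall()
```

The query reads almost like the business question it answers, which is the kind of understandability that dimensional modeling aims for.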
Kimball and Ross (2013) use a restaurant metaphor to describe one form of OLAP architecture: Kimball’s Data Warehouse/Business Intelligence (DW/BI) architecture.
In the restaurant metaphor, the kitchen is the staging area of the DW/BI system. Just as kitchen staff must ensure that the ingredients they procure are high quality and are prepared in a manner that suits the diner’s palate, we who design the data staging area are concerned with cleaning the data that we extract from multiple sources (e.g., by fixing misspellings or formatting inconsistencies). This work is carried out by the ETL (Extract, Transform, and Load) system.
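The kind of cleaning described above can be sketched in a few lines of Python. This is only a toy illustration, not how a production ETL system is built: the source rows, the `CITY_FIXES` lookup, and the function names are all hypothetical, and the "load" step just appends to a list standing in for the presentation area.

```python
from datetime import datetime

# Hypothetical lookup of known misspellings (lowercased) to the correct spelling.
CITY_FIXES = {"indianapolus": "Indianapolis"}
# Date formats we assume the source systems might use.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def extract():
    """Stand-in for pulling rows from several source systems."""
    return [
        {"customer": "  alice SMITH ", "city": "Indianapolus", "signup": "03/15/2024"},
        {"customer": "Bob Jones", "city": "chicago", "signup": "2024-03-16"},
    ]


def clean_date(text):
    """Normalize any recognized date format to ISO 8601 (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def transform(row):
    """Fix spacing, capitalization, known misspellings, and date formats."""
    city = row["city"].strip()
    return {
        "customer": " ".join(row["customer"].split()).title(),
        "city": CITY_FIXES.get(city.lower(), city.title()),
        "signup": clean_date(row["signup"]),
    }


def load(rows, target):
    """Stand-in for writing cleaned rows to the presentation area."""
    target.extend(rows)


presentation_rows = []
load([transform(row) for row in extract()], presentation_rows)
```

The diner, so to speak, only ever sees `presentation_rows`; the messy inputs and the cleaning logic stay in the kitchen.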
Just as we would not expect diners to wander into the back kitchen, we would not expect users to have access to the staging area or the ETL system. Diners belong in the dining room, where they are presented with prepared food. A DW/BI user’s domain is the presentation area, where they should be presented with data in a meaningful and easy-to-understand format.
References
Kimball, R. and Ross, M. (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). Indianapolis, IN: John Wiley & Sons.
Ross, M. (2004). Differences of Opinion. Kimball Group. https://www.kimballgroup.com/2004/03/differences-of-opinion/