OLTP Datamarts

A SQL Server Datamart is a OLTP Database that is structured for in a star or snowflake schema and mostly is used as a data-source for OLAP Datacubes. It can be used as a source for SQLRS. The real power of combining OLAP with a Data Warehouse is that you can traverse a OLAP cube and allow users to drill down to the root data in the data mart that OLAP cube was aggregated from. OLAP also aggregates the data so that queries are incredibly fast in joining many dimensions to a fact.

A dimension is a hierarchy [time].[year].[month].[day].[hour] for example or [product group].[product name]

A fact is a point in time containing an amount or count of warehouse inventory that contains ids of the time and product name. You have to use the lowest levels of the dimensions [hour] and [product name]  but use the ids associated with them. The goal is all members (lowest levels of dimensions that are in a fact) are unique. Before it gets confusing lets look at time

Time Dimension OLTP Table

First thing in a data mart is to create a schema that defines what the table contains

This is a dimension so

CREATE SCHEMA [dimension] AUTHORIZATION [dbo]
GO

Next the time dimension table

dimension_time

Whats great about this is the table does all the work for you. the date column is the only field you need to fill out. If you need a to add financial quarters just add a column and a calculation as a default for the column.

Using my SSIS example as a model for the rest of the data we can see the rest of the metadata we have, Source, Destination, Action,Date (out time dimension), Session and Instance

SSIS Normalize Session Metadata

 

First thing to consider is do we care about the raw data? From my SSIS example we have a session feed. I dont need to audit my data so I will put it into a aggregated datamart. If you need to keep the data before it is aggregated you should put it in a staging table before it is transformed.

Looking at Source and Destination they are just web addresses so they should be combined into one dimension. Because my feed does not contain any parameters I can just build a dimension with my urls and make it unique. If you have parameters you may have to massage the data to strip off parameters to make sure you don’t have dups. Example (@mortgage:t123116) would be transformed into mortgage

Again we can create schema’s for facts, staging and temp table to help organize our OLTP datamart

CREATE SCHEMA [fact] AUTHORIZATION [dbo]
GO

In this example I am only going to use 2 dimensions Time and Destination(Web Page)

For web page I am going to create a simple dimension for Destination

dimension.destination

Now that I have 2 dimension tables I can create my fact table

Next I insert the url’s I am tracking

insert dimension.destination

So this is what our destination table looks like

destination_id url
1                            http://www.derekevans.us/
2                           http://www.derekevans.us/database-administration/
3                           http://www.derekevans.us/sql-server-integration-services/
4                          http://www.derekevans.us/server-setup/
5                          http://www.derekevans.us/category/ssrs/

Next we need to insert unique dates into our date dimension

insert dimension.date

We are now ready for our facts the facts will contain 2 measures. Time and Destination. Normally I would use SSIS and lookups and inserts to process the raw data to datamart but I will use sql here

My raw data looks like this

SELECT [impression_id]
,[date]
,[destination]
FROM [DataMart].[staging].[impression]
GO

1 2011-02-01 08:16:00.000 http://www.derekevans.us/
2 2011-02-01 08:16:00.000 http://www.derekevans.us/
3 2011-02-01 08:16:00.000 http://www.derekevans.us/
4 2011-02-01 08:16:00.000 http://www.derekevans.us/
5 2011-02-01 08:16:00.000 http://www.derekevans.us/
6 2011-02-01 08:16:00.000 http://www.derekevans.us/
7 2011-02-01 08:16:00.000 http://www.derekevans.us/
8 2011-02-01 08:16:00.000 http://www.derekevans.us/
9 2011-02-01 08:16:00.000 http://www.derekevans.us/
10 2011-02-01 08:16:00.000 http://www.derekevans.us/
11 2011-02-01 08:16:00.000 http://www.derekevans.us/
12 2011-02-01 08:16:00.000 http://www.derekevans.us/
13 2011-02-01 08:16:00.000 http://www.derekevans.us/
14 2011-02-01 08:16:00.000 http://www.derekevans.us/

Create our fact table

fact.impression

Insert data int fact.impression

insert fact.impression

We now have a star schema datamart with 2 dimension tables and 1 fact

 

 

 

 

 

 

 

We now a a simple star schema OLTP Database that is ready for SSAS