Archive for January, 2012

Cool code peice to analyze table size

–Greate script to analyze table space used and what percentage of space the table uses in regards to the whole database

— Create the temp table for further querying
rec_id  int IDENTITY (1, 1),
table_name varchar(128),
nbr_of_rows int,
data_space decimal(15,2),
index_space decimal(15,2),
total_size decimal(15,2),
percent_of_db decimal(15,12),
db_size  decimal(15,2))

— Get all tables, names, and sizes
EXEC sp_msforeachtable @command1=”insert into #temp(nbr_of_rows, data_space, index_space) exec sp_mstablespace ‘?'”,
@command2=”update #temp set table_name = ‘?’ where rec_id = (select max(rec_id) from #temp)”

— Set the total_size and total database size fields
UPDATE #temp
SET total_size = (data_space + index_space), db_size = (SELECT SUM(data_space + index_space) FROM #temp)

— Set the percent of the total database size
UPDATE #temp
SET percent_of_db = (total_size/db_size) * 100

— Get the data
FROM #temp
ORDER BY total_size DESC

— Comment out the following line if you want to do further querying




Slowly Changing Dimension

Dimensions are basically lookup tables. they contain data such as status(open, closed) or (married, single) or employee (employee_id int, fname varchar(50), lname varchar(50), position varchar(50), salary float)

When we get into the last examples (position, salary) there are situations where we have to be able to capture the changes to the employee dimension. SQL Server Analysis Services (SSIS) will not do this automatically. We have to change the dimension to a slowly changing dimensions (SCD).

The difference between a dimension and a SCD is that each change in the dimension is captured by a start (when the data was created) and a end when the data was changed.

The first thing we need is a surrogate key to keep track of changes to the primary key (employeeid)

Next we need startdate and enddate columns to keep track of when the status of the column was live. So our employee dimension would look like

(employee_sk int,  employeeid int, fname varchar(50), lname varchar(50), position varchar(50), salary float, startdate datetime, enddate datetime)

CREATE TABLE [dimension].[employee](  [employee_sk] [int] PRIMARY KEY IDENTITY(1,1) NOT NULL,  [employee_id] [int] NOT NULL,  [fname] [varchar](50) NOT NULL,  [lname] [varchar](50) NOT NULL,  [postion] [varchar](50) NOT NULL,  [salary] [float] NOT NULL,  [start_date] datetime NOT NULL,  [end_date] datetime  )

When you assign your  surrogate key you must update all fact tables to use this as the foreign key instead of the employeeid

At this point we are ready to use a SCD Task in SSIS the fact table will now use the surrogate key and will show the state of the row at the time of the change and a new row is created with the same properties as old but with a blank end date.