|
|
|
Lecture 6 |
|
Professor Burns |
|
|
|
|
|
Traditional Reports – focus of the chapter |
|
Pre-defined vs Adhoc |
|
Pivot Tables – OLAP; Drilling |
|
Charts |
|
Pies |
|
Plots |
|
Lines |
|
Areas |
|
Bars |
|
Variations of the above |
|
Plus – Portals |
|
http://www.cognos.com/products/tours/index.html |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Simple Report – Report 13.1 |
|
Fully-Additive Measures – stored and aggregated
in SQL statement – Body rows |
|
Non-Additive Measures – components are
fully-additive; calc in the same SQL statement or in the application |
|
Summary Rows – Report 13.2 |
|
One dimension - Total |
|
Multiple dimensions – multiple levels of
aggregation |
|
Sub-Totals and Total |
|
Non-additive measures – always take the ratio of
the sums of the additive components, not the sum of the ratios (Margin %) |
|
Recall Chap 3 - Storing Margin Dollars vs. calc
Margin Dollars – computation of summary rows for a non-additive measure
requires access to its fully additive components, even if they will not be
displayed |
|
|
|
|
|
|
cont. Report 13.2 |
|
Computational Approach |
|
Retrieve data for columns 1(Brand),2 (Product),
3 (Sales Dollars),4 (Margin Dollars) |
|
Compute summary rows (Row Sub-Totals and Total) |
|
Compute column 5 – non-additive measure Margin % |
|
Ease of Use Test – does the tool manage these
details? In other words, does it
provide functionality that masks this complexity from the end user? |
|
|
|
|
|
What happens when the values summarized are not
fully-additive, or not composed of fully-additive components? – Report 13.3 |
|
Computational Approach doesn’t work well here
because we’re averaging in the body rows themselves and the domain of
customers vary across brands and products |
|
Re-query Approach |
|
Role of the Application/Tool |
|
|
|
|
|
|
|
cont. Report 13.3 |
|
Re-Query Approach |
|
First, construct the body rows: query Product,
Brand and Average Customer Sales. |
|
Next, we query Brand and Average Customer Sales. |
|
Lastly, we query and compute the Average
Customer Sales for all products. |
|
3 SQL queries. |
|
Each query is successively simpler |
|
If we use a well-designed set of aggregate
tables, each succesive summary level will enjoy a performance boost over
the previous. |
|
|
|
|
|
|
More on averages |
|
Recall Chap 3. – semi-additive measures require
averages |
|
Bank Account Balances are not additive over time |
|
Basket counts are not additive across products |
|
Averages are a ratio; non-additive |
|
Average = sum of a qty / cardinality |
|
SQL’s Average function often gets the
cardinality wrong; therefore, averages are best constructed by retrieving
the sum and cardinality components, then calculating the ratio. |
|
|
|
|
|
|
|
Suppose we are trying to calculate the average
inventory for a product in all warehouses |
|
SQL Average function would count up the rows in
the fact table and that would be the cardinality; but is it? What if a warehouse didn’t have a
product? This is an issue of sparsity. |
|
Solution:
Load zero rows. |
|
No. We
would balloon the size of the fact tbl. |
|
What if we want the average inventory for a
product across the company (assume our grain is inventory by product by
month and we have 4 warehouses)? |
|
Now we have another problem – a grain
problem. We want to compute an
average at a higher level of summarization than our fact table. In other words, our cardinality using the SQL Average
function will be 4 x 12 = 48 when we really want 12 |
|
Solution is retrieve the sum and cardinality
components, then calculate the ratio |
|
Cardinality could be relative (ie. period
averages); cardinality could computed based on dimension data or based on
facts (ie. Average product sales by Brand for 1/01 - number of products in
a brand would need to be the cardinality in the average for each
brand). The latter would require 2
queries. |
|
|
|
|
|
|
|
Where clause |
|
Simple:
Adding a line to restrict Report 13.1 to only 1/98. |
|
Note: In
situations using multiple fact tables (drill across), each query must be
constrained in the same manner. |
|
|
|
|
|
|
Based on Comparision to Summary or Rank |
|
Avoid complex SQL query; let the application
handle it. |
|
Show me sales by Product for Jan, only for
Products that account for more than 30% of their brand totals |
|
Interrow comparisions – see Report 13.4 |
|
Filter out rows < 30% |
|
|
|
|
|
|
|
|
|
Based on Group Membership |
|
Show me the average balance of new savings
accounts opened by recipient’s on January’s Super-Saver mailing |
|
SQL |
|
Create a study group table – households part of
the Super-Saver mailing |
|
Join the study group table to the fact table |
|
Application |
|
Set Operations – union, intersect, minus –
reduce complexity and increase performance |
|
|
|
|