Aggregation Functions
Microsoft SQL Server Analysis Services provides functions to aggregate measures along the dimensions that are contained in measure groups. By default, measures are summed along each dimension. However, the AggregateFunction property lets you modify this behavior. The additivity of an aggregation function determines how the measure is aggregated across all the dimensions in the cube. Aggregation functions fall into three levels of additivity:
Additive
An additive measure, also called a fully additive measure, as name suggest can be aggregated along all the dimensions that are included in the measure group that contains the measure, without restriction.
Semiadditive
A semiadditive measure can be aggregated along some, but not all, dimensions that are included in the measure group that contains the measure. For example, a measure that represents the quantity available for inventory can be aggregated along a geography dimension to produce a total quantity available for all warehouses, but the measure cannot be aggregated along a time dimension because the measure represents a periodic snapshot of quantities available. Aggregating such a measure along a time dimension would produce incorrect results.
Nonadditive
A nonadditive measure cannot be aggregated along any dimension in the measure group that contains the measure. Instead, the measure must be individually calculated for each cell in the cube that represents the measure. For example, a calculated measure that returns a percentage, such as profit margin, cannot be aggregated from the percentage values of child members in any dimension.
The following table lists the aggregation functions in Analysis Services, and describes both the additivity and expected output of the function.
SUM
If a measure's Aggregate Function property value is Sum, the measure value for a cube cell is calculated by adding the values in the measure's source column from only the rows for the combination of members that defines the cell and the descendants of those members.
Fig1. Shows [Sales Amount] as the sum of values of categories for a particular day of month.
Count
If a measure's Aggregate Function property value is Count, the measure value for a cube cell is calculated by adding the number of values in the measure's source column from only the rows for the combination of members that defines the cell and the descendants of those members.
Fig2. Shows [Customer Count] as the no. of records particular to a product category in a state.It is a count of rows instead of sum of the values in those rows.
Min
If a measure's Aggregate Function property value is Min, the measure value for a cube cell is calculated by taking the lowest value in the measure's source column from only the rows for the combination of members that defines the cell and the descendants of those members.
SELECT min(SalesAmount) FROM FactResellerSales WHERE ProductKey in (SELECT ProductKey FROM DimProduct WHERE ProductSubcategoryKey=28)and ShipDateKey in (SELECT TimeKey FROM DimTime WHERE DayNumberOfWeek=1)
This gives the min (SalesAmount) for “Bottles and Cages” for the 1st day of the week. It is the minimum salesamout value for the entire data present for “1st day of week” and for “Bottles and Cages”
Max
If a measure's Aggregate Function property value is Max, the measure value for a cube cell is calculated by taking the highest value in the measure's source column from only the rows for the combination of members that defines the cell and the descendants of those members.
SELECT max(SalesAmount) FROM FactResellerSales WHERE ProductKey in (SELECT ProductKey FROM DimProduct WHERE ProductSubcategoryKey=28)and ShipDateKey in (SELECT TimeKey FROM DimTime WHERE DayNumberOfWeek=1)
Distinct Count
If a measure's Aggregate Function property value is Distinct Count, the measure value for a cube cell is calculated by adding the number of unique values in the measure's source column from only the rows for the combination of members that defines the cell and the descendants of those members.A measure with an Aggregate Function property value of Distinct Count is called a distinct count measure. A distinct count measure can be used to count occurrences of a dimension's lowest-level members in the fact table. Because the count is distinct, if a member occurs multiple times, it is counted only once.Distinct count measures are commonly used to determine for each member of a dimension how many distinct, lowest-level members of another dimension share rows in the fact table. For example, in a Sales cube, for each customer and customer group, how many distinct products were purchased? (That is, for each member of the Customers dimension, how many distinct, lowest-level members of the Products dimension share rows in the fact table?) In this example, the second dimension's lowest-level members are counted by a distinct count measure.This kind of analysis need not be limited to two dimensions. In fact, a distinct count measure can be separated and sliced by any combination of dimensions in the cube, including the dimension that contains the counted members. A distinct count measure that counts members is based on a foreign key column in the fact table. (That is, the measure's Source Column property identifies this column.) This column joins the dimension table column that identifies the members counted by the distinct count measure. Because distinct count measures are nonadditive, the presence of a distinct count measure significantly restricts the ability of Microsoft® SQL Server™ 2000 Analysis Services to preaggregate the cube. For this reason, it is recommended that each distinct count be placed in its own cube with no other measures. These cubes with distinct count measures can then be joined together with other cubes in a virtual cube that efficiently manages all of the measures.
Note: If a cube uses a distinct count measure, the entire cube is treated as nonadditive. Nonadditive cubes do not support dynamically created members, therefore MDX functions, such as VisualTotals, which dynamically create members, will return an error if used on a nonadditive cube. This also affects other features, such as enabling visual totals in dimension security, which involve dynamically created members.
SELECT DISTINCT CustomerKey FROM dbo.FactInternetSales WHERE ProductKey in (SELECT ProductKey FROM DimProduct WHERE ProductSubcategoryKey='28')and SalesTerritoryKey='6'
/* Bottles & Cages have ProductSubcategoryKey as 28 */
This query gives the distinct count of “Bottles and Cages” for Canada territory.
Fig3. Shows the distinct is based on some columnId ie; CustomerKey
Fig4. Shows the count of distinct customer for a particular product in a particular country.
568 is the distinct total count for Bottle and cages for Canada. The query given above will clear how Distinct count works.
Defining Semiadditive Behavior
Semiadditive measures, which do not uniformly aggregate across all dimensions, are very common in many business scenarios. Every cube that is based on snapshots of balances over time exhibits this problem. You can find these snapshots in applications dealing with securities, account balances, budgeting, human resources, insurance policies and claims, and many other business domains.Add semiadditive behavior to a cube to define an aggregation method for individual measures or members of the account type attribute. If the cube contains an account dimension, you can automatically set semiadditive behavior based on the account type.To add semiadditive behavior, you use the Business Intelligence Wizard, and select the Define semiadditive behavior option on the Choose Enhancement page. This wizard then guides you through the steps of identifying which measures have semiadditive behavior. Define Semi-additive Behavior On the Define Semiadditive Behavior page of the wizard, you select how to define semiadditivity by selecting one of the following options:
Turn off semiadditive behavior Removes semiadditive behavior from a cube in which semiadditive behavior was previously defined. This selection resets a measure to SUM if it is set to any of the following aggregation function types:
• By Account
• Average of Children
• First Child
• Last Child
• Last Nonempty Child
• First Nonempty Child
• None
This option does not change measures with a regular aggregation function: Sum, Min, Max, Count, or Distinct Count.
The wizard has detected the 'Account" account dimension, which contains semiadditive members. The server will aggregate members of this dimension according to the semiadditive behavior specified for each account type. Causes the system to set all measures from a measure group dimensioned by an Account type dimension to the By Account aggregation function and the server will aggregate members of the dimension according to the semiadditive behavior specified for each account type.
Define semi additive behavior for individual measures Selects the semiadditive behavior of each measure individually. The default setting is SUM (fully additive).
For each measure, you can select from the types of semiadditive functionality described in the following table.Any existing semiadditive behavior is overwritten when you complete the wizard. AverageOfChildren:
The AverageOfChildren is semi-additive, ie. it acts like a "sum" across all dimensions except time. For example, in Adventure Works if you add an AverageOfChildren measure on the OrderQuantity field of FactSalesSummary; its value matches that of [Measures]. [Order Quantity] at the leaf Date level. But for July 2001 as a whole, [Measures].[Order Quantity] = 966, whereas the AverageOfChildren measure = 31 (which is 966 divided over 31 days of July).
Fig 5: The measure [Order Quantity] for various products in various territories.
Here the aggregate function AverageOfChildren works as the sum when it is not related to time dimension.The AverageOfChildren aggregation type is also useful for inventory counts or other measures that represent a snapshot in time. Since inventory isn't additive across time, the AverageOfChildren aggregation type allows you to apply semiadditive aggregation where a simple sum or average would not be applicable.This behavior could occur because AverageOfChildren, FirstChild, LastChild, FirstNonEmpty and LastNonEmpty are semi-additive and treats the Time dimension different from the other dimensions.
The AverageOfChildren only applies when aggregating via Time dimension. Actually, when you try to create a new measure in cube, when you select usage, you could see "average over time" which is for AverageOfChildren. To get the result of average behavior you want, you may want define a Sum and a Count measure, then create a calculated measure (in the cube script) which divides the two base measures.
Fig 6: Shows aggregate function ‘AverageOfChildren’ works as average with Time dimension.
In the screens here, you can find that the parent member have value which is equal to the average of their child members (only for the children for which the record is present).
Fig 7: [Order quantity] has aggregate function as AverageOfChildren.
Fig 8: Time dimension taken on columns. Work as average.
LastNonEmpty:
Retrieves the value of the last non-empty child member.
Fig 9: Shows the Last non empty value of a particular row.
By account:
Calculates the aggregation according to the aggregation function assigned to the account type for a member in an account dimension. If no account type dimension exists in the measure group, treated as the none aggregation function.For more information about account dimensions, see Account (Analysis Services - Multidimensional Data).
Fig 10: Shows the distribution by Account.
No comments:
Post a Comment