Calculating Quartiles in Analysis Services - statistics

I'm using MDX code to calculate quartiles, as in this blog post:
https://electrovoid.wordpress.com/2011/06/24/ssas-quartile/
This is what I'm doing:
WITH SET OrderedData AS
ORDER
(
NONEMPTY
(
[Dim Parameter].[id].[id]
*[Dim Result].[Id].[Id].ALLMEMBERS,
[Measures].[Value]
),
[Measures].[Value],
BASC
)
MEMBER [Measures].[RowCount] AS COUNT (OrderedData)
MEMBER [Measures].[i25] AS ( .25 * ( [RowCount] - 1 ) ) + 1
MEMBER [Measures].[i25Lo] AS FIX([i25]) - 1
MEMBER [Measures].[i25Rem] AS ([i25] - FIX([i25]))
MEMBER [Measures].[n25Lo] AS (OrderedData.Item([i25Lo]), [Value])
MEMBER [Measures].[n25Hi] AS (OrderedData.Item([i25Lo] + 1), [Value])
MEMBER [Measures].[Quartile1] AS [n25Lo] + ( [i25Rem] * ( [n25Hi] - [n25Lo] ))
,FORMAT_STRING='Currency'
MEMBER [Measures].[Quartile2] AS MEDIAN(OrderedData, [Value])
,FORMAT_STRING='Currency'
MEMBER [Measures].[i75] AS ( .75 * ( [RowCount] - 1 ) ) + 1
MEMBER [Measures].[i75Lo] AS FIX([i75]) - 1
MEMBER [Measures].[i75Rem] AS ([i75] - FIX([i75]))
MEMBER [Measures].[n75Lo] AS (OrderedData.Item([i75Lo] ),[Value])
MEMBER [Measures].[n75Hi] AS (OrderedData.Item([i75Lo] + 1),[Value])
MEMBER [Measures].[Quartile3] AS [n75Lo] + ( [i75Rem] * ( [n75Hi] - [n75Lo] ))
,FORMAT_STRING='Currency'
MEMBER [Measures].[RIC] As ([Quartile3]-[Quartile1] )
MEMBER [Measures].[Ls] As ([Quartile3]+ ([RIC]*1.5) )
MEMBER [Measures].[Li] As ([Quartile1]- ([RIC] *1.5))
MEMBER [Measures].[MAX] as MAX (Filter(OrderedData ,[value]<=[LS]),[value])
MEMBER [Measures].[Min] as MIn(Filter(OrderedData ,[value]>=[Li]),[value])
MEMBER [Measures].[out] as MAX (Filter(OrderedData ,[value]>[LS]),[value])
What I want is to add the date dimension so that the quartiles are calculated for each month, something like this:
SELECT {
[Measures].[Quartile1],[Measures].[Quartile2],[Measures].[Quartile3], [min],
[MAX] , [out] , [Measures].[ValueAVG],[RowCount],[Measures].[Recuento Fact Result]
} ON 0 ,
[Dim Parameter].[Reference].[Reference] *
[Dim Parameter].[Section ES].[Section ES] *
[Id Distribution Date].[DateJ].[Month] ON 1
FROM [Tess Tek DW Dev]
But it didn't work. How can I calculate quartiles for different date ranges in a single MDX query?

You need to work the Date into your WITH statement somehow.
Try first adding your target months, that will be on the rows, into a named set:
WITH
SET [TargetSet] AS
{
[Id Distribution Date].[DateJ].[Month].[Jan-2015],
[Id Distribution Date].[DateJ].[Month].[Feb-2015]
}
Then I'd add another set taking TargetSet into account:
SET [NonEmptyIds] AS
NonEmpty(
[Dim Parameter].[id].[id]
*[Dim Result].[Id].[Id].ALLMEMBERS
,
{[Measures].[Value]} * [TargetSet]
)
Then feed this set into your current set:
SET [OrderedData] AS
ORDER
(
[NonEmptyIds],
[Measures].[Value],
BASC
)
Then try amending the rows snippet to use the TargetSet:
[Dim Parameter].[Reference].[Reference] *
[Dim Parameter].[Section ES].[Section ES] *
[TargetSet] ON 1
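To check the MDX output against a small data extract, the interpolation used by the question's calculated members can be mirrored outside the cube. The Python sketch below is an illustration only and not part of the original answer: `quartile` and `quartiles_by_month` are hypothetical helper names, and the input is assumed to be (month, value) pairs with at least one value per month.

```python
from math import floor

def quartile(values, p):
    """Interpolated quantile mirroring the [i25], [i25Lo] and [i25Rem] members."""
    data = sorted(values)              # same role as ORDER(..., BASC)
    pos = p * (len(data) - 1) + 1      # 1-based position, like [i25]
    lo = floor(pos) - 1                # 0-based index, like [i25Lo]
    rem = pos - floor(pos)             # fractional part, like [i25Rem]
    hi = min(lo + 1, len(data) - 1)    # guard against running past the end
    return data[lo] + rem * (data[hi] - data[lo])

def quartiles_by_month(rows):
    """rows: iterable of (month, value). Returns {month: (Q1, Q2, Q3)}."""
    groups = {}
    for month, value in rows:
        groups.setdefault(month, []).append(value)
    return {
        month: (quartile(vals, 0.25), quartile(vals, 0.5), quartile(vals, 0.75))
        for month, vals in groups.items()
    }
```

The point of the sketch is that the ordered set has to be rebuilt for each month before the positions are computed, which is what a single static OrderedData set cannot do.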

SQL Server 2017 - Dynamically generate a string based on the number of columns in another string

I have the following table & data:
CREATE TABLE dbo.TableMapping
(
[GenericMappingKey] [nvarchar](256) NULL,
[GenericMappingValue] [nvarchar](256) NULL,
[TargetMappingKey] [nvarchar](256) NULL,
[TargetMappingValue] [nvarchar](256) NULL
)
INSERT INTO dbo.TableMapping
(
[GenericMappingKey]
,[GenericMappingValue]
,[TargetMappingKey]
,[TargetMappingValue]
)
VALUES
(
'Generic'
,'Col1Source|Col1Target;Col2Source|Col2Target;Col3Source|Col3Target;Col4Source|Col4Target;Col5Source|Col5Target;Col6Source|Col6Target'
,'Target'
,'Fruit|Apple;Car|Red;House|Bungalo;Gender|Female;Material|Brick;Solution|IT'
)
I would need to be able to automatically generate my GenericMappingValue string dynamically based on the number of column pairs in the TargetMappingValue column.
Currently, there are 6 column mapping pairs. However, if I only had two mapping column pairs in my TargetMapping such as the following...
'Fruit|Apple;Car|Red'
then I would like for the GenericMappingValue to be automatically generated (updated) such as the following since, as a consequence, I would only have 2 column pairs in my string...
'Col1Source|Col1Target;Col2Source|Col2Target'
I've started building the following query logic:
DECLARE @Mapping nvarchar(256)
SELECT @Mapping = [TargetMappingValue] from TableMapping
print @Mapping
SELECT count(*) ColumnPairCount
FROM String_split(@Mapping, ';')
The above query gives me a correct count of 6 for my column pairs.
How would I be able to continue my logic to achieve my automatically generated mapping string?
I think I understand what you are after. This should get you moving in the right direction.
Since you've tagged 2017 you can use STRING_AGG()
You'll want to split your TargetMappingValue using STRING_SPLIT() with ROW_NUMBER() in a sub-query. (NOTE: Order is not guaranteed when using STRING_SPLIT() with ROW_NUMBER(), but that does not matter for this situation. Example below using OPENJSON if we need to ensure accurate order.)
You can then use that ROW_NUMBER() as the column indicator/number in a CONCAT().
Then bring it all back together using STRING_AGG()
Have a look at this working example:
DECLARE @TableMapping TABLE
(
[GenericMappingKey] [NVARCHAR](256) NULL
, [GenericMappingValue] [NVARCHAR](256) NULL
, [TargetMappingKey] [NVARCHAR](256) NULL
, [TargetMappingValue] [NVARCHAR](256) NULL
);
INSERT INTO @TableMapping (
[GenericMappingKey]
, [GenericMappingValue]
, [TargetMappingKey]
, [TargetMappingValue]
)
VALUES ( 'Generic'
, 'Col1Source|Col1Target;Col2Source|Col2Target;Col3Source|Col3Target;Col4Source|Col4Target;Col5Source|Col5Target;Col6Source|Col6Target'
, 'Target'
, 'Fruit|Apple;Car|Red;House|Bungalo;Gender|Female;Material|Brick;Solution|IT' );
SELECT [col].[GenericMappingKey]
, STRING_AGG(CONCAT('Col', [col].[ColNumber], 'Source|Col', [col].[ColNumber], 'Target'), ';') AS [GeneratedGenericMappingValue]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
FROM (
SELECT *
, ROW_NUMBER() OVER ( ORDER BY (
SELECT 1
)
) AS [ColNumber]
FROM @TableMapping
CROSS APPLY STRING_SPLIT([TargetMappingValue], ';')
) AS [col]
GROUP BY [col].[GenericMappingKey]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue];
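The split, number, concatenate and aggregate steps can also be sketched outside SQL Server, which may help when checking the expected string for a given input. This Python version is an illustration only; `generic_mapping` is a hypothetical helper name and empty segments are assumed to be ignorable.

```python
def generic_mapping(target_mapping_value, sep=";"):
    """Build 'Col1Source|Col1Target;...' with one entry per pair in the input."""
    # Split on the pair separator and drop empty segments (e.g. a trailing ';').
    pairs = [p for p in target_mapping_value.split(sep) if p]
    # Number the pairs from 1 and join the generated entries back together.
    return sep.join(
        f"Col{n}Source|Col{n}Target" for n in range(1, len(pairs) + 1)
    )
```

For example, `generic_mapping('Fruit|Apple;Car|Red')` yields a string with two column pairs.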
Here's an example of what an update would look like assuming your primary key is the GenericMappingKey column:
--This is what an update would look like
--Assuming your primary key is the [GenericMappingKey] column
UPDATE [upd]
SET [upd].[GenericMappingValue] = [g].[GeneratedGenericMappingValue]
FROM (
SELECT [col].[GenericMappingKey]
, STRING_AGG(CONCAT('Col', [col].[ColNumber], 'Source|Col', [col].[ColNumber], 'Target'), ';') AS [GeneratedGenericMappingValue]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
FROM (
SELECT *
, ROW_NUMBER() OVER ( ORDER BY (
SELECT 1
)
) AS [ColNumber]
FROM @TableMapping
CROSS APPLY STRING_SPLIT([TargetMappingValue], ';')
) AS [col]
GROUP BY [col].[GenericMappingKey]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
) AS [g]
INNER JOIN @TableMapping [upd]
ON [upd].[GenericMappingKey] = [g].[GenericMappingKey];
Shnugo brings up a great point in the comments: sort order is not guaranteed when using STRING_SPLIT() with ROW_NUMBER(). In this particular situation it doesn't matter, as the output mapping is generic. But if you needed to use elements from your "TargetMappingValue" column in the final "GenericMappingValue", you would need to make sure the sort order was accurate.
Here's an example showing how to use OPENJSON() and its "key", which guarantees that order, following Shnugo's example:
SELECT [col].[GenericMappingKey]
, STRING_AGG(CONCAT('Col', [col].[colNumber], 'Source|Col', [col].[colNumber], 'Target'), ';') AS [GeneratedGenericMappingValue]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue]
FROM (
SELECT [tm].*
, [oj].[Key] + 1 AS [colNumber] --Use the key as our order/column number, adding 1 as it is zero based.
, [oj].[Value] -- and if needed we can bring the split value out.
FROM @TableMapping [tm]
CROSS APPLY OPENJSON('["' + REPLACE([tm].[TargetMappingValue], ';', '","') + '"]') [oj] --Basically turn the column value into JSON string.
) AS [col]
GROUP BY [col].[GenericMappingKey]
, [col].[TargetMappingKey]
, [col].[TargetMappingValue];
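The string-to-JSON trick is easy to try in isolation. The Python sketch below applies the same REPLACE-style conversion and then reads the array positions; it is an illustration only, `split_with_position` is a hypothetical name, and, like the T-SQL version, it assumes the value contains no double quotes or backslashes.

```python
import json

def split_with_position(value, sep=";"):
    """Return (position, item) tuples, with positions starting at 1."""
    # Turn 'a;b;c' into the JSON text '["a","b","c"]'.
    json_text = '["' + value.replace(sep, '","') + '"]'
    # Array indexes are zero based, so add 1 as the query does with [Key] + 1.
    return [(index + 1, item) for index, item in enumerate(json.loads(json_text))]
```

Because the position comes from the array index rather than from a row number assigned afterwards, the order always matches the original string.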
If the data is already in the table and you want to break it out into columns, this should work:
select
v.value
,left(v.value, charindex('|',v.value) -1) col1
,reverse(left(reverse(v.value), charindex('|',reverse(v.value)) -1)) col2
from String_split(@Mapping,';') v
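The LEFT/REVERSE expressions take the text before the first '|' and the text after the last '|'. A minimal Python sketch of the same idea follows; `split_pair` is a hypothetical name and every pair is assumed to contain the separator.

```python
def split_pair(pair, sep="|"):
    """Return (text before the first separator, text after the last separator)."""
    col1 = pair.partition(sep)[0]    # like LEFT(value, CHARINDEX('|', value) - 1)
    col2 = pair.rpartition(sep)[2]   # like the REVERSE(LEFT(REVERSE(...))) expression
    return col1, col2
```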

Running total of a measure DAX

I am trying to calculate a running total of a measure that I have created. I can get the pivot table to show it using "Show Values As"; however, I cannot achieve the same using a new measure, and I need a new measure so that I can then calculate a year-to-date average.
Using the following formula I get the sum of individual rows rather than the sum of the aggregated rows per month.
Any help would be appreciated
Thanks
Measure2:=CALCULATE(SUMX(Table,[measure1]),FILTER (
ALL( Calendar[Date]),
Calendar[Date] <= MAX (Calendar[Date] )
)
)
Measure1:=if(sum(AMOUNT)=0,Blank(),if(sum(AMOUNT)<0,[WAT],if([Countback]=1,(SUM(AMOUNT)/[CumulativSales1])*[Sum of Days],
if([Countback]=2,[Sum of Days]+((SUM(AMOUNT))-[CumulativSales1])/([CumulativSales2]-[CumulativSales1])*[Days Previous],
if([Countback]=3,[Sum of Days]+[Days Previous]+((SUM(AMOUNT))-[CumulativSales2])/([CumulativSales3]-[CumulativSales2])*[Days Previous2],
if([Countback]=4,[Sum of Days]+[Days Previous]+[Days Previous2]+((SUM(AMOUNT))-[CumulativSales3])/([CumulativSales4]-[CumulativSales3])*[Days Previous3],
if([Countback]=5,[Sum of Days]+[Days Previous]+[Days Previous2]+[Days Previous3]+((SUM(AMOUNT))-[CumulativSales4])/([CumulativSales5]-[CumulativSales4])*[Days Previous4],200)))))))
AverageSaleWeight2:=if(HASONEVALUE(Calendar[Date]),
CALCULATE(sum(INVOICE[Days Given * Amount])/sum(INVOICE[Amount GBP]),
DATEADD(Calendar[Date],-2,MONTH)),BLANK())
AverageSaleWeight3:=if(HASONEVALUE(Calendar[Date]),
CALCULATE(sum(INVOICE[Days Given * Amount])/sum(INVOICE[Amount GBP]),
DATEADD(Calendar[Date],-3,MONTH)),BLANK())
.....
Countback:=IF(DIVIDE([CumulativSales1],SUM(Aging[OPEN_DOM_AMOUNT]))>=0.9999,1,
IF(DIVIDE([CumulativSales2],SUM(Aging[OPEN_DOM_AMOUNT]))>=0.9999,2,
IF(DIVIDE([CumulativSales3],SUM(Aging[OPEN_DOM_AMOUNT]))>=0.9999,3,
IF(DIVIDE([CumulativSales4],SUM(Aging[OPEN_DOM_AMOUNT]))>=0.9999,4,
IF(DIVIDE([CumulativSales5],SUM(Aging[OPEN_DOM_AMOUNT]))>=0.9999,5,6)))))
CumulativSales1:=CALCULATE(SUM(INVOICE[Amount GBP]),
DATESINPERIOD(Calendar[Date],
LASTDATE(Calendar[Date]),-1,MONTH))
CumulativSales2:=CALCULATE(SUM(INVOICE[Amount GBP]),
DATESINPERIOD(Calendar[Date],
LASTDATE(Calendar[Date]),-2,MONTH))
WAT:=if(sum([AMOUNT])=0,Blank(),IF([Countback]=1,[AverageSaleWeight],IF([Countback]=2,[AverageSaleWeight1],IF([Countback]=3,[AverageSaleWeight2],IF([Countback]=4,[AverageSaleWeight3],IF([Countback]=5,
[AverageSaleWeight4],IF([Countback]=6,[AverageSaleWeight5],30)))))))
Days Previous:=CALCULATE(SUM(Calendar[Days]),
DATESINPERIOD(Calendar[Date],
LASTDATE(Calendar[Date]),-2,MONTH))-CALCULATE(SUM(Calendar[Days]),
DATESINPERIOD(Calendar[Date],
LASTDATE(Calendar[Date]),-1,MONTH))
Days Previous2:=CALCULATE(SUM(Calendar[Days]),
DATESINPERIOD(Calendar[Date],
LASTDATE(Calendar[Date]),-3,MONTH))-CALCULATE(SUM(Calendar[Days]),
DATESINPERIOD(Calendar[Date],
LASTDATE(Calendar[Date]),-2,MONTH))
....
Try this revised version and see if you get the desired result:
Measure3 := CALCULATE(
SUMX( VALUES(Calendar[Month]), [measure1] )
, FILTER(
ALL(Calendar)
, Calendar[Date] <= MAX(Calendar[Date])
&& Calendar[Year] = MAX(Calendar[Year])
)
)
Latest Edit: added SUMX VALUES
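The difference between iterating rows and iterating months matters whenever the inner measure is not additive. The Python sketch below illustrates the month-by-month evaluation that SUMX over VALUES(Calendar[Month]) performs; it is an illustration only, and `measure1` here is a hypothetical stand-in (absolute value of the month's net amount), not the asker's actual measure.

```python
def measure1(amounts):
    # Hypothetical non-additive measure: absolute value of the month's net amount.
    return abs(sum(amounts))

def running_total_by_month(rows):
    """rows: iterable of (year, month, amount). Returns {(year, month): YTD total}."""
    monthly = {}
    for year, month, amount in rows:
        monthly.setdefault((year, month), []).append(amount)
    running = {}   # one accumulator per year, so the total restarts each year
    result = {}
    for year, month in sorted(monthly):
        # Evaluate the measure once per month, then accumulate the monthly results.
        running[year] = running.get(year, 0) + measure1(monthly[(year, month)])
        result[(year, month)] = running[year]
    return result
```

Evaluating `measure1` on each individual row and summing would give a different figure whenever a month contains offsetting amounts, which is the symptom described in the question.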

Datevalue calculation in MDX query

I want to create a dynamic query which updates each day.
So to filter on today's report, I use
[Report Date].[Report Date].&[4226]
The 4226 is coming from:
=DATEVALUE("28-07-2017")-38718 or =TODAY()-38718 (convert to number)
38718 is just an arbitrary number to get the correct date from the cube.
EDIT:
Here is my current query:
SELECT NON EMPTY { [Measures].[Price FC] } ON COLUMNS
FROM ( SELECT ( -{ [Agency].[Nationality - Consortium - Agency].[Nationality].&[111],
[Agency].[Nationality - Consortium - Agency].[Nationality].&[116],
[Agency].[Nationality - Consortium - Agency].[Nationality].&[242],
[Agency].[Nationality - Consortium - Agency].[Nationality].&[134] } ) ON COLUMNS
FROM ( SELECT ( { StrToMember("[Report Date].[Report Date].&[" + Str(DateValue(Format(Now(), "dd-MM-yyyy")) - 38718) + "]") } ) ON COLUMNS
FROM ( SELECT ( { [Market].[Market].[Market].&[103] } ) ON COLUMNS
FROM ( SELECT ( { [Travel Type].[Travel Type].&[101],
[Travel Type].[Travel Type].&[102],
[Travel Type].[Travel Type].&[103] } ) ON COLUMNS
FROM ( SELECT ( { [Departure Date].[Year].&[2017] } ) ON COLUMNS
FROM [Booking])))))
WHERE ( [Departure Date].[Year].&[2017],
[Travel Type].[Travel Type].CurrentMember,
[Market].[Market].[Market].&[103],
StrToMember("[Report Date].[Report Date].&[" + Str(DateValue(Format(Now(), "dd-MM-yyyy")) - 38718) + "]") )
But it says that there is no column detected in the statement. I have also tried different date formats, any ideas?
Following the tips from the thread "VBA Date as integer", I used CDbl instead of DateValue, which gave me the desired result:
StrToMember("[Report Date].[Report Date].&[" + Str(Int(CDbl(Now()) - 38718)) + "]")
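The arithmetic behind the member key can be checked independently of the cube. The Python sketch below assumes the usual VBA/Excel date serial, which counts days from 1899-12-30, and reuses the 38718 offset quoted in the question; `report_date_member` is a hypothetical helper name.

```python
from datetime import date

OLE_EPOCH = date(1899, 12, 30)   # day 0 of the VBA/Excel date serial
CUBE_OFFSET = 38718              # offset quoted in the question

def report_date_member(day):
    """Build the member unique name for a given calendar day."""
    serial = (day - OLE_EPOCH).days          # same number as Int(CDbl(date))
    return f"[Report Date].[Report Date].&[{serial - CUBE_OFFSET}]"
```

For 28-07-2017 the serial is 42944, which gives the key 4226 used in the question.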

PowerPivot Filter Function

In PowerPivot for Excel 2016 I wrote a formula to summarize year-to-date sales using the FILTER function, as below:
SalesYTD:=CALCULATE (
[Net Sales],
FILTER (
ALL ( Sales),
'sales'[Year] = MAX ( 'Sales'[Year] )
&& 'Sales'[Date] <= MAX ( 'Sales'[Date] )
)
)
It works perfectly. However, my data has a field called "Channel" that I want to filter on in my pivot table, and that filter has no effect.
Does anybody know how I should fix this formula?
Thanks in advance...
Try:
SalesYTD:=CALCULATE (
[Net Sales],
FILTER (
ALLEXCEPT ( 'Sales', 'Sales'[Channel] ),
'sales'[Year] = MAX ( 'Sales'[Year] )
&& 'Sales'[Date] <= MAX ( 'Sales'[Date] )
)
)
ALLEXCEPT removes all context filters in the table except filters that have been applied to the specified columns, in this case the [Channel] column.
Let me know if this helps.
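The effect of keeping the channel filter while widening the date filter can be illustrated with plain data. The Python sketch below is an analogy, not DAX semantics in full: `sales_ytd` is a hypothetical function, rows are assumed to be (channel, date, net_sales) tuples, and `channel=None` plays the role of having no slicer selection.

```python
from datetime import date

def sales_ytd(rows, as_of, channel=None):
    """Sum net sales from 1 January of as_of's year up to as_of, optionally per channel."""
    return sum(
        net
        for row_channel, day, net in rows
        if day.year == as_of.year and day <= as_of          # year-to-date window
        and (channel is None or row_channel == channel)     # channel filter is preserved
    )
```

With ALL ( Sales ) the last condition is effectively dropped, so every channel shows the same total; ALLEXCEPT keeps it in place.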

Identify first occurrence of event based on multiple criteria

I have a dataset in PowerPivot and need to find a way to flag ONLY the first occurrence of a customer sub event
Context: Each event (COLUMN A) can have X number of sub events (COLUMN B),
I already have a flag that identifies a customer event based on multiple criteria (COLUMN D). What I need is a way to flag only the first occurrence of a customer sub event within each event. I've added a fake COLUMN E to illustrate how the flagging should work.
UPDATE
Additional situation - Having duplicated customer sub_events but only need to flag the first sub_event... should look like this:
Create a calculated column in your model using the following expression:
=
IF (
[Customer_Event] = 1
&& [Sub_Event]
= CALCULATE (
FIRSTNONBLANK ( 'Table'[Sub_Event], 0 ),
FILTER (
'Table',
'Table'[Event] = EARLIER ( 'Table'[Event] )
&& [Customer_Event] = 1
)
),
1,
0
)
If the Sub_Event column is numeric, replace FIRSTNONBLANK ( 'Table'[Sub_Event], 0 ) with MIN('Table'[Sub_Event]).
Also, if your machine's regional settings use ; (semicolon) as the list separator, replace every , (comma) in my expression with a semicolon to match your settings.
UPDATE: Repeated values in Sub_Event column.
I think we can use the CaseRow# column to get the first occurrence of the Sub_Event value:
=
IF (
[Customer_Event] = 1
&& [Sub_Event]
= CALCULATE (
FIRSTNONBLANK ( 'Table'[Sub_Event], 0 ),
FILTER (
'Table',
'Table'[Event] = EARLIER ( 'Table'[Event] )
&& [Customer_Event] = 1
)
)
&& [CaseRow#]
= CALCULATE (
MIN ( 'Table'[CaseRow#] ),
FILTER (
'Table',
'Table'[Event] = EARLIER ( 'Table'[Event] )
&& [Customer_Event] = 1
)
),
1,
0
)
It is not tested but should work.
Let me know if this helps.
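The logic of the second calculated column can be traced on a few rows outside the model. The Python sketch below mirrors it under stated assumptions: rows are dictionaries with hypothetical keys `event`, `sub_event`, `customer_event` and `case_row`, and "first" means the smallest value, as FIRSTNONBLANK and MIN return in the expression.

```python
def flag_first_sub_event(rows):
    """Return a 0/1 flag per row, mirroring the calculated column."""
    min_sub, min_row = {}, {}
    # First pass: per event, smallest Sub_Event and CaseRow# among customer-event rows.
    for r in rows:
        if r["customer_event"] == 1:
            e = r["event"]
            min_sub[e] = min(min_sub.get(e, r["sub_event"]), r["sub_event"])
            min_row[e] = min(min_row.get(e, r["case_row"]), r["case_row"])
    # Second pass: flag a row only if it is a customer event and matches both minima.
    return [
        int(
            r["customer_event"] == 1
            and r["sub_event"] == min_sub[r["event"]]
            and r["case_row"] == min_row[r["event"]]
        )
        for r in rows
    ]
```

Note that a row is flagged only when it holds both minima at once, so the result depends on CaseRow# increasing in the same order as Sub_Event within an event.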
