DAX Count function making mistake somewhere - excel

File: count.xlsx, located in the GitHub repository
Software: MS Excel 2016 Power Pivot
I know for a fact that there are 10,921 rows in the Excel sheet.
When I create the DAX measure Total_Incidents:=Count(Graffiti[CREATED_DATE]), the value comes to 10,921. I count CREATED_DATE because it has no NULL values.
There are three statuses (Open, Pending, Closed), which are calculated as follows:
Total_Closed:=sumx(FILTER(Graffiti,Graffiti[STATUS]="Closed"),[Total_Incidents])
Total_Closed= 5354, <- correct
However, Total_Opened is incorrect:
Total_Opened:=sumx(FILTER(Graffiti,Graffiti[STATUS]="Open"),[Total_Incidents])
Total_Opened= 4483, but it is supposed to be 4481
However, Total_Pending is correct:
Total_Pending:=sumx(FILTER(Graffiti,Graffiti[STATUS]="Pending"),[Total_Incidents])
Total_Pending= 75, <- correct
When I add the totals I get 2 extra incidents because of Total_Opened:
Total_Calc:=[Total_Closed]+[Total_Opened]+[Total_Pending]
Total_Calc= 10923 <- incorrect, should be 10921
Why is there a discrepancy in Total_Opened? I cannot figure it out.

I originally answered with this:
"I know you said you count CREATED_DATE because there is no NULL value, but did you check for blanks in your CREATED_DATE column? I reproduced your problem by having blank dates. You could have 2 blank dates.
You could use COUNTBLANK(Graffiti[CREATED_DATE]) to check whether you do have blanks."
Then I noticed you had a link to your Excel file on GitHub, so...
I downloaded it and looked for blanks in your dates. There were none.
So I added measures for Total_Closed, Total_Opened, Total_Pending and Total_Calc. (I used your formulas, but with countx instead of sumx in each one, so I could simply compare row counts.)
Total_Incidents:=Count(Graffiti[CREATED_DATE])
Total_Closed:=countx(FILTER(Graffiti,Graffiti[STATUS]="Closed"),[Total_Incidents])
Total_Opened:=countx(FILTER(Graffiti,Graffiti[STATUS]="Open"),[Total_Incidents])
Total_Pending:=countx(FILTER(Graffiti,Graffiti[STATUS]="Pending"),[Total_Incidents])
Total_Calc:=[Total_Closed]+[Total_Opened]+[Total_Pending]
Here's what I got:
Total_Incidents: 10921
Total_Closed: 6365
Total_Opened: 4481
Total_Pending: 75
Total_Calc: 10921
These counts look correct.
I'm guessing you figured out and corrected your problem.
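One possible explanation for the 4483 versus 4481 gap, which the thread does not confirm: a measure referenced inside SUMX is evaluated with context transition, so for each row it counts every row that is identical in all columns. If the table contains one pair of fully duplicated "Open" rows, each of the two rows contributes 2 instead of 1, which adds exactly 2 to the total, while COUNTX still counts each row once. The Python sketch below imitates that behaviour on a hypothetical four-row table (the data and function names are assumptions for illustration only):

```python
# Toy stand-in for the Graffiti table: (CREATED_DATE, STATUS) rows.
# The data is hypothetical; the first two rows are exact duplicates.
rows = [
    ("2016-01-05", "Open"),
    ("2016-01-05", "Open"),
    ("2016-01-06", "Open"),
    ("2016-01-07", "Closed"),
]

def total_incidents_for(row, table):
    """Mimic [Total_Incidents] evaluated in a row context: context
    transition keeps only the rows equal to `row` in every column."""
    return sum(1 for other in table if other == row)

def sumx_by_status(table, status):
    """Mimic SUMX(FILTER(table, STATUS = status), [Total_Incidents])."""
    return sum(total_incidents_for(r, table) for r in table if r[1] == status)

def countx_by_status(table, status):
    """Mimic COUNTX(...): count the rows whose expression is non-blank."""
    return sum(
        1 for r in table
        if r[1] == status and total_incidents_for(r, table) > 0
    )

sumx_open = sumx_by_status(rows, "Open")      # duplicates are counted twice each
countx_open = countx_by_status(rows, "Open")  # each row is counted once
print(sumx_open, countx_open)
```

If that is what is happening, a measure such as CALCULATE([Total_Incidents], Graffiti[STATUS]="Open") avoids the row-by-row iteration and should count each row once.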

Excel CUBEVALUE & CUBESET count records greater than a number

I am writing a series of queries against my workbook's data model to retrieve the number of documents by Category_Name that are at least a certain number of days old (e.g. >=650).
Currently this formula (entered in cell C3) returns the correct number for a single Days Old value (=3).
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[34]")
How do I return the number of documents for Days Old values >=650?
The worksheet looks like:
   A          B    C
1  Date       PL   Count of Docs
2  10/1/2018  ALD  3
3  ...
UPDATE: I tried the suggestion in @ama's answer below, but the expression in step B did not work.
However, I created a subset of the Days Old values using
=CUBESET("ThisWorkbookDataModel",
"{[EDD_Report_10-01-18].[Days Old].[all].[650]:[EDD_Report_10-01-18].[Days Old].[All].[3647]}")
The cell containing this cube set is referenced as the third Member_expression of the original CUBEVALUE formula. The limitation now is that the beginning and end values must be members of the Days Old set.
This is restrictive: I was hoping for a more general test for >=650, and there is no way to guarantee that specific values of Days Old will be present in the query.
This is the first time I have heard of the CUBE functions, so you got me curious and I did some digging. I am definitely not an expert, but here is what I found:
The MDX language should allow you to provide value ranges in the form {[Table].[Field].[All].[LowerBound]:[Table].[Field].[All].[UpperBound]}.
A. Get the total number of entries:
D3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[All]")
B. Get the number of entries less than 650:
E3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"{[EDD_Report_10-01-18].[Days Old].[All].[0]:[EDD_Report_10-01-18].[Days Old].[All].[649]}")
Note: I found something about using .[All].[650].lag(1)}, but I think your data might need to be sorted for it to work properly.
C. Subtract:
C3 =D3-E3
Alternatively, go for the quick and dirty:
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"{[EDD_Report_10-01-18].[Days Old].[All].[650]:[EDD_Report_10-01-18].[Days Old].[All].[99999]}")
Hope this helps, and do let me know how it goes. I am still curious!
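The update in the question notes that both ends of the range must be existing members of Days Old. One way around that, sketched here in Python outside of Excel, is to pick the smallest existing member at or above the threshold and the largest existing member, then build the set expression from those. The function name and the sample member list are hypothetical; only the hierarchy name and the range syntax come from the thread.

```python
from bisect import bisect_left

def range_set_expression(members, threshold, hierarchy):
    """Build an MDX range set covering all existing members >= threshold.

    `members` is the list of Days Old values actually present in the data.
    Returns None when no member reaches the threshold.
    """
    ordered = sorted(set(members))
    first = bisect_left(ordered, threshold)  # index of first member >= threshold
    if first == len(ordered):
        return None
    lower, upper = ordered[first], ordered[-1]
    return (
        "{" + f"{hierarchy}.[All].[{lower}]"
        + ":" + f"{hierarchy}.[All].[{upper}]" + "}"
    )

# Hypothetical list of Days Old values present in the model.
days_old = [3, 34, 655, 700, 3647]
expr = range_set_expression(days_old, 650, "[EDD_Report_10-01-18].[Days Old]")
print(expr)
```

The resulting string has the same shape as the CUBESET argument used in the update, but its bounds are guaranteed to exist in the data.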

Difference time view from Excel to VBA

I can't get the average time from start to end of some activities.
I have tried a thousand ways, but the result is never correct: every time I am one day short.
The image can explain it better than my English.
In this example the sum of my activities is 480:52:56 hours. In VBA I get a different result: for VBA the date is "19/01/1900 00:52:56", which corresponds to 456:52:56 hours,
24 hours less.
Why is there a difference, and how can I obtain the same result?
Thanks for any suggestions.
Dates are stored as serial numbers, where the first valid date has a value of 1. Excel reads this value as 01/01/1900, and VBA reads it as 31/12/1899. In Excel, the value 60 returns 29/02/1900, a date that does not exist in VBA, so from 61 onwards every value returns the same date in VBA and Excel.
Edit: also, the maximum value is 2958465 (31/12/9999); higher values return an error rather than a valid date.
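The one-day shift can be reproduced outside Excel. The Python sketch below converts a duration into the date each system would display, using the epochs described above (Excel serial 1 = 01/01/1900 with a fictitious 29/02/1900 at serial 60; VBA serial 1 = 31/12/1899). The function names are illustrative, and serial 0 and negative values are not handled.

```python
from datetime import datetime, timedelta

def _format(day, month, year, seconds):
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{day:02d}/{month:02d}/{year} {h:02d}:{m:02d}:{s:02d}"

def vba_display(duration):
    """Date that VBA shows for a serial number equal to `duration` in days."""
    d = datetime(1899, 12, 30) + timedelta(days=duration.days)
    return _format(d.day, d.month, d.year, duration.seconds)

def excel_display(duration):
    """Date that Excel (1900 date system) shows for the same serial number."""
    if duration.days == 60:
        # Excel treats 1900 as a leap year, so serial 60 is a fictitious date.
        return _format(29, 2, 1900, duration.seconds)
    # Before serial 60 Excel is one day ahead of VBA; after it they agree.
    base = datetime(1899, 12, 31) if duration.days < 60 else datetime(1899, 12, 30)
    d = base + timedelta(days=duration.days)
    return _format(d.day, d.month, d.year, duration.seconds)

total = timedelta(hours=480, minutes=52, seconds=56)  # value from the question
print(excel_display(total))  # 20/01/1900 00:52:56
print(vba_display(total))    # 19/01/1900 00:52:56
```

Read as "days elapsed", the Excel date gives 20 days (480 hours) while the VBA date gives 19 days (456 hours), which matches the 24-hour gap reported in the question.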
Thanks to your comments I understand that the problem affects dates before March 1, 1900, so I changed the select to:
Select [DataAttesa] as Data, avg(iif([totHours] > 1 and [totHours] < 61, dateadd("d",-1,CDate([totHours])) , [totHours])) as nr FROM [db_In$] Where TypeTrasp = "AOG" group by [DataAttesa] Order by [DataAttesa] asc
Now, when I write the recordset results to Excel, the values are correct.
Thanks to all.

Index Match to return MAX Date with multiple criteria

Apologies for this; I'm assuming it is simple, but a few hours of searching SO and Google haven't helped.
Enough whining from me.
Consider the following dataset:
ID, Date, Review, Review Status
1, 01/02/18, "Cool", Positive
1, 01/03/18, "Awesome", Positive
1, 01/01/18, "Cumbersome", Negative
1, 01/02/18, "Rubbish!", Negative
I'm currently using an array-type INDEX/MATCH to get the latest review based on a few conditions.
I have two columns, one for positive and one for negative.
In each of these columns I would like to return the latest matching review (for example, the latest positive review), but I'm unsure how to get the max date within the formula below:
{=index(C2:C4, MATCH(1,(1 = A2:A4)*("Positive" = D2:D4)*(maxdatehere = B2:B4),0))}
The data I have is around 7k rows of Google Review data and pretty much matches the example above.
I'd rather not use VBA, as I've never really used it before (but will do so grudgingly).
As this is Excel I've not created a Google demo sheet, but I'm happy to do so to make it easier for the experts, and for others to benefit if they find their way here one day.
If you have Office 365:
=INDEX(C2:C5, MATCH(1,(1 = A2:A5)*("Positive" = D2:D5)*(MAXIFS(B:B,A:A,1,D:D,"Positive") = B2:B5),0))
Confirm with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
If you have Excel 2010 or later:
=INDEX(C:C,AGGREGATE(15,7,ROW(C2:C5)/((A2:A5=1)*(D2:D5="Positive")*(AGGREGATE(14,7,B2:B5/((A2:A5=1)*(D2:D5="Positive")),1)=B2:B5)),1))
Entered normally.
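Both formulas follow the same logic: restrict the rows to the ID and review status, find the maximum date among them, and return the review on that row. A minimal Python sketch of that logic on the sample data, assuming the dates are day/month/year (the function name is illustrative):

```python
from datetime import date

# Sample rows from the question: (ID, Date, Review, Review Status).
# Dates are assumed to be day/month/year.
reviews = [
    (1, date(2018, 2, 1), "Cool", "Positive"),
    (1, date(2018, 3, 1), "Awesome", "Positive"),
    (1, date(2018, 1, 1), "Cumbersome", "Negative"),
    (1, date(2018, 2, 1), "Rubbish!", "Negative"),
]

def latest_review(rows, review_id, status):
    """Return the review text with the latest date for this ID and status."""
    matches = [r for r in rows if r[0] == review_id and r[3] == status]
    if not matches:
        return None  # no row satisfies both criteria
    # max() on the date column plays the role of MAXIFS / AGGREGATE(14, ...).
    return max(matches, key=lambda r: r[1])[2]

print(latest_review(reviews, 1, "Positive"))  # Awesome
print(latest_review(reviews, 1, "Negative"))  # Rubbish!
```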

Moving total which uses two calculated fields and also uses its previous value-Spotfire

Hi folks, I am new to Spotfire and am having difficulty replicating one of my Excel formulas in Spotfire.
Sample data (Excel):
https://docs.google.com/spreadsheets/d/1KSdrIYKlRYG9c3wIM3NwQcLP_Ob8Z2UZ5Cjrdjy8UO8/edit?usp=sharing
I am trying to replicate the column [Steady Repay-Option Scenario].
Formula used in Excel:
=IF(B6-IF(C3>0,C2,0)>0,B6-IF(C3>0,C2,0),0)
The above is the formula I have in Excel, where each subsequent column is calculated from the previous value and the current values of [Monthly impact on cash] and [running total].
This is the formula I have created in Spotfire:
if((Sum([Scenario opening balance]) over (allPrevious([Document_Date_Number])) - (If(Sum([Rolling_total_cash_calculated]) over (AllPrevious([Document_Date_Number]))>0,Sum([Monthly_impact_on_cash_calculated]) over (AllPrevious([Document_Date_Number])),0)))>0,Sum([Scenario opening balance]) over (allPrevious([Document_Date_Number])) - (If(sum([Rolling_total_cash_calculated]) over (allPrevious([Document_Date_Number]))>0,Sum([Monthly_impact_on_cash_calculated]) OVER (allPrevious([Document_Date_Number])),0)),0)
Assumptions:
The data has been pivoted into three columns ([Document_Date_Number], [Monthly_impact_on_cash_calculated] and [Rolling_total_cash_calculated])
where:
[Scenario opening balance] = 150000000 (fixed)
[Document_Date_Number] = Jan, Feb, Mar, etc.
[Rolling_total_cash_calculated] = Rolling total (Excel)
[Monthly_impact_on_cash_calculated] = Monthly impact on cash (Excel)
But I get incorrect results for some reason.
Results in Spotfire: (image)
Expected (correct) result in Excel: (image)
So although the results match until Oct, as shown above, they do not match afterwards.
Please let me know what I can do to get the same values. Any help is deeply appreciated.
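To make the difference concrete, here is a Python sketch of the recurrence the Excel formula describes, assuming B6 is the previous period's scenario value and C2/C3 are the current period's monthly impact and rolling total. It is compared with a simplified cumulative-sum version, which is roughly what an expression built only from running totals computes. The numbers are hypothetical and the second function is only an approximation of the Spotfire expression.

```python
def steady_repay_recursive(opening, impacts, rolling):
    """Excel-style recurrence: each value depends on the previous value."""
    balance = opening
    result = []
    for impact, roll in zip(impacts, rolling):
        repay = impact if roll > 0 else 0
        balance = max(balance - repay, 0)  # floor at zero, then carry forward
        result.append(balance)
    return result

def steady_repay_cumulative(opening, impacts):
    """Simplified running-total version: opening minus the cumulative impact."""
    result = []
    running = 0
    for impact in impacts:
        running += impact
        result.append(max(opening - running, 0))
    return result

# Hypothetical data: the rolling total is the running sum of the impacts.
impacts = [30, 90, -20]
rolling = [30, 120, 100]
recursive = steady_repay_recursive(100, impacts, rolling)
cumulative = steady_repay_cumulative(100, impacts)
print(recursive)   # [70, 0, 20]
print(cumulative)  # [70, 0, 0]
```

The two versions agree until the balance is first floored at zero and can diverge afterwards, because the recurrence restarts from the floored value while the cumulative form does not. That may be why the results match only up to a certain month.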

Excel - 2 tables - If 2 cells in a single row match, return another cell of same row

I am working with 2 separate datasets (with duplicates).
Records in each dataset are identified by a device ID.
There may not be an entry for the timestamp I require.
The datasets are quite large and, because of the duplicates, I can't use VLOOKUP.
Samples:
Table 1:
Device Name |Time Bracket |On/Off?
ID1         |06:20:00     |
ID2         |06:20:00     |
ID3         |06:30:00     |
Table 2:
Device Name |Timestamp    |On/Off?
ID1         |06:20:00     |On
ID2         |06:50:00     |Off
ID3         |07:20:00     |Off
What I want to achieve:
I want an if statement to check if:
1) device ID matches AND
2) timestamp matches
If so, return the value of On/Off from Table 2.
If not, then I want it to return the value of the cell above it IF it's the same device, otherwise just put "absent" into the cell.
I thought I could do this with some IF statements like so:
=if(HOUR([#[Time Bracket]]) = HOUR(Table13[#[Timestamp Rounded (GMT)]]) and
minute([#[Time Bracket]]) = minute(Table13[#[Timestamp Rounded (GMT)]]) and
[#[Device Name]]=Table13[#[Device Name]], Table13[#[On/Off?]],
IF([#[Device Name]]=Table13[#[Device Name]], INDIRECT("B" and Rows()-1), "absent"))
(I put some newlines in there for readability)
However, this doesn't seem to resolve at all... what am I doing wrong?
Is this even the correct way of achieving this?
I've also tried something similar with a VLookUp, but that failed horribly.
Thanks all!
To avoid array formulas or concatenated strings (which can produce wrong matches, though not in your case), I suggest using COUNTIFS, since you have only a very small number of possible outcomes (just On or Off).
For the first table (starting at A1, so the formula goes in C2):
=IFERROR(CHOOSE(
OR(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"On"))+
OR(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"Off"))*2
,"On","Off","Error"),IF(A1=[#[Device Name]],C1,"Absent"))
This will also show "Error" if a match is found for both "On" and "Off". To skip that check and improve speed, you could also use:
=IF(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"On"),"On",
IF(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"Off"),"Off",
IF(A1=[#[Device Name]],C1,"Absent")))
For both formulas, "Device Name" is in column A, "Time Bracket" in column B and "On/Off?" in column C, with the table starting at row 1. If that is not the case for you, change A1 and C1 so they match.
(I also inserted line breaks for readability.)
Picture to show the layout: (image)
I picked the second formula to show how it works. Also, this formula should not be able to return 0's, so I'm confused.
A couple of good suggestions; however, using the helper column suggested in the topic by Scott Craner above worked.
I created a helper column of the concatenated device ID and timestamp in both tables, then did a simple VLOOKUP.
Another lesson learned: think outside the box and go with simple solutions, rather than trying to be too clever like I was doing... :)
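The helper-column idea amounts to looking up a combined (device, timestamp) key, then falling back to the previous row's value for the same device, or to "absent". A rough Python sketch of that logic follows. The second ID1 row in Table 1 is a hypothetical addition to show the carry-forward, and the function name is illustrative.

```python
def fill_status(table1, table2):
    """Return the On/Off? value for each (device, time bracket) row of table1."""
    # Combined key, like the concatenated helper column used with VLOOKUP.
    # If table2 has duplicate keys, the last one wins here.
    lookup = {(device, ts): status for device, ts, status in table2}
    result = []
    prev_device, prev_status = None, None
    for device, bracket in table1:
        status = lookup.get((device, bracket))
        if status is None:
            # No exact match: reuse the row above if it is the same device.
            status = prev_status if device == prev_device else "absent"
        result.append(status)
        prev_device, prev_status = device, status
    return result

table1 = [
    ("ID1", "06:20:00"),
    ("ID1", "06:30:00"),  # hypothetical extra row, not in the question
    ("ID2", "06:20:00"),
    ("ID3", "06:30:00"),
]
table2 = [
    ("ID1", "06:20:00", "On"),
    ("ID2", "06:50:00", "Off"),
    ("ID3", "07:20:00", "Off"),
]
statuses = fill_status(table1, table2)
print(statuses)  # ['On', 'On', 'absent', 'absent']
```

As in the worksheet formulas, this assumes Table 1 is sorted so that rows for the same device are adjacent.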
