Multiple pivots in U-SQL to output multiple columns?

Is it possible to perform multiple pivots in U-SQL without doing a UNION? Something along the lines of:
SELECT Email
FROM #somedata
    PIVOT (
        MIN(EventTimestamp) FOR EventType IN ("A" AS FirstATime, "B" AS FirstBTime)
    ),
    PIVOT (
        MAX(EventTimestamp) FOR EventType IN ("A" AS LastATime, "B" AS LastBTime)
    )
GROUP BY Email
The resulting columns should be:
Email, FirstATime, FirstBTime, LastATime, LastBTime

You can compose PIVOT expressions. Note that PIVOT is a rowset expression, so it takes a rowset as its left-hand argument.
It looks like you want to apply two different aggregations over the same data, though. In that case, I think you will need two SELECT statements, each with its own PIVOT, followed by an OUTER UNION ALL BY NAME ON (Email) to merge the rows.
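To make the target shape concrete, here is a minimal sketch in Python rather than U-SQL. The sample rows, the `pivot` helper, and the integer timestamps are all hypothetical; the final step collapses the two pivoted results into one row per Email, which is my reading of the desired output rather than something the answer spells out.

```python
from collections import defaultdict

# Hypothetical sample rows: (Email, EventType, EventTimestamp).
events = [
    ("a@example.com", "A", 3),
    ("a@example.com", "A", 1),
    ("a@example.com", "B", 7),
    ("b@example.com", "B", 2),
    ("b@example.com", "B", 5),
]

def pivot(rows, agg, names):
    """Aggregate timestamps per (Email, EventType); one output column per type."""
    groups = defaultdict(list)
    for email, event_type, ts in rows:
        groups[(email, event_type)].append(ts)
    out = defaultdict(dict)
    for (email, event_type), stamps in groups.items():
        if event_type in names:
            out[email][names[event_type]] = agg(stamps)
    return out

# Two separate pivots over the same data, one per aggregation.
first = pivot(events, min, {"A": "FirstATime", "B": "FirstBTime"})
last = pivot(events, max, {"A": "LastATime", "B": "LastBTime"})

# Merge by Email; a column with no matching events stays None.
columns = ["FirstATime", "FirstBTime", "LastATime", "LastBTime"]
merged = {}
for email in sorted(set(first) | set(last)):
    row = {**first.get(email, {}), **last.get(email, {})}
    merged[email] = {c: row.get(c) for c in columns}
```

With this data, `merged["a@example.com"]` holds all four columns, while `merged["b@example.com"]` has `None` for the two "A" columns.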

Related

calculate sum with criteria from many columns of another filtered table DAX PowerBi

Hello, I want to sum a column, but I need to filter the table based on data from another table.
I have Table1, where I want to sum points, and I want to sum only the records whose dates, names, and classes I find in Table2.
I am using a measure like this:
Measure 3 = CALCULATE(sum(Table1[points]);Table1[name] in (ALLSELECTED(Table2[name]));Table1[date] in (ALLSELECTED(Table2[date]));Table1[class] in (ALLSELECTED(Table2[class])))
but it does not filter properly.
Is there a better way to do this?
One way would be to create a relationship between the two tables. I think Power BI doesn't support multiple relationships between two tables, so you have to add a custom column to both tables to serve as the key and foreign key. In your case, as you mentioned, it would be built from the name, date, and class (in the query editor):
Key = [name] & [date] & [class]
In my sample here I just use the name as key column.
If the relationship is set you can use the following measure:
You can use TREATAS to filter Table1 based on Table2. No relationship is needed.
Total Points Filtered By Table2 =
CALCULATE (
    SUM ( Table1[points] ),
    TREATAS (
        SUMMARIZE ( Table2, Table2[name], Table2[date], Table2[class] ),
        Table1[name], Table1[date], Table1[class]
    )
)
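As a rough illustration of the difference, the Python sketch below contrasts filtering each column independently with filtering on whole (name, date, class) combinations, which is what the TREATAS version does. The sample rows are hypothetical, and the sketch ignores DAX filter-context details such as ALLSELECTED; it only suggests one plausible reason the original measure over-counts.

```python
# Hypothetical Table1 rows: (name, date, class, points).
table1 = [
    ("Ann", "2020-01-01", "X", 10),  # combination present in table2
    ("Ann", "2020-01-02", "X", 20),  # each value appears in table2, but not together
    ("Bob", "2020-01-02", "X", 40),  # combination present in table2
    ("Bob", "2020-01-01", "Y", 30),  # class Y never appears in table2
]

# Hypothetical Table2 rows: (name, date, class).
table2 = [
    ("Ann", "2020-01-01", "X"),
    ("Bob", "2020-01-02", "X"),
]

# Column-by-column filtering: every column is checked on its own.
names = {r[0] for r in table2}
dates = {r[1] for r in table2}
classes = {r[2] for r in table2}
per_column = sum(
    p for n, d, c, p in table1 if n in names and d in dates and c in classes
)

# Combination filtering: the whole tuple must exist in table2.
combos = set(table2)
per_combo = sum(p for n, d, c, p in table1 if (n, d, c) in combos)
```

Here `per_column` is 70 because the second row slips through, while `per_combo` is 50.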

Power Pivot / DAX - Distinct Count of one dim column from multiple fact tables

Using Power Pivot in Excel 2016
I have one dim table called "Roster" with 218 unique 'Employee Names' and other attributes for the employees
I have three fact tables called "Forecast," "Actual," and "Invoice," each with the related 'Employee Name' columns, in addition to many other attributes and values. I want a distinct count of 'Employee Name' across all three of those tables, depending on what I'd like to pivot them by in my pivot table, like 'Project' or 'Company.' I've read about counting across multiple columns from one table, but I'm trying to count across multiple tables.
When I create measure of:
Headcount:=calculate(DISTINCTCOUNT('Roster_Table'[Employee Name]),'Actual','Forecast','Invoice'), and throw it in my pivot table, I get a very small count of 14, which is probably the ones that are unique to only one of the three tables.
When I create measure of: Headcount:=DISTINCTCOUNT('Roster_Table'[Employee Name]), I get all 218
The true number should be around 170. Any ideas of how to make this work?
Thanks
Please try the following DAX calculation:
COUNTROWS(
    DISTINCT(
        UNION(
            VALUES('Forecast'[Employee Name]),
            VALUES('Actual'[Employee Name]),
            VALUES('Invoice'[Employee Name])
        )
    )
)
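The idea behind that expression is a distinct count over the union of the three name columns. A minimal Python sketch of the same logic, using hypothetical name lists in place of the fact tables:

```python
# Hypothetical 'Employee Name' values from each fact table (duplicates allowed).
forecast = ["Ann", "Bob", "Cid"]
actual = ["Bob", "Cid", "Dee"]
invoice = ["Cid", "Eve", "Eve"]

# Union the three columns, then count each name once.
headcount = len(set(forecast) | set(actual) | set(invoice))
```

A name is counted once no matter how many tables it appears in, so `headcount` is 5 here.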

Power Pivot "dynamic" joins

I have a situation where I kind of need a many-to-many join - which I know isn't possible.
I have one fact table and two dimension tables.
The fact table contains account numbers (as in GL accounts) and amounts. Plus a date field, so the account numbers are not unique.
The first dimension table has just one column listing the reports that can be created by combining the accounts in different ways.
The second dimension table could be called a "roll-up" table. It has 3 columns: report, account, and a line item description field. The latter defines which line on the respective report that the account should be mapped to.
So I want to have a pivot table that has the line item description in the row area and the amount in the values area. With a mechanism for the user to specify which report they want to view. But the join on the account field between the roll-up table and the fact table is many-to-many. If the roll-up table were somehow filtered based on the specific report that the user has selected, THEN it would become one-to-many. Hence the "dynamic" joins in my title.
I've been trying to come up with a connecting table of some kind, but without any luck so far. If anybody has any suggestions/pointers, that would be much appreciated.
I figured out a way to do it using a DAX formula that calculates the field to be placed in the Values area. It uses FILTER and CROSSJOIN combinations to effect the dynamic joins. Note that in order to use CROSSJOIN I added prefix letters to a couple of the field names (to make them unique). I also set it up so that the Report table (the first dimension table I described) has only one row, containing the report that the user wishes to view.
The DAX formula is as follows:
SUMX (
    FILTER (
        CROSSJOIN (
            fBalances,
            FILTER (
                CROSSJOIN (
                    dRollUp,
                    dReport
                ), dRollUp[Report] = dReport[uReport]
            )
        ), fBalances[fAccount] = dRollUp[Account]
    ), fBalances[Amount]
)
Subsequent update: I moved it into Power BI where I added a parameter (called myReport) for the user to specify the report. Consequently I deleted the dReport table.
So the Power BI DAX formula becomes:
SUMX (
    FILTER (
        CROSSJOIN (
            fBalances,
            FILTER (
                CROSSJOIN (
                    dRollUp,
                    myReport
                ), dRollUp[Report] = FIRSTNONBLANK ( myReport[myReport], TRUE() )
            )
        ), fBalances[fAccount] = dRollUp[Account]
    ), fBalances[Amount]
)
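The logic of both formulas can be sketched in Python: filter the roll-up rows to the selected report first, so that each account maps to at most one line item, then join to the balances and sum. The account numbers, report names, and the `report_totals` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical fBalances rows: (account, amount). Accounts repeat across dates.
balances = [("1000", 50.0), ("1000", 25.0), ("2000", 10.0), ("3000", 5.0)]

# Hypothetical dRollUp rows: (report, account, line_item).
rollup = [
    ("BalanceSheet", "1000", "Cash"),
    ("BalanceSheet", "2000", "Receivables"),
    ("CashFlow", "1000", "Operating"),
    ("CashFlow", "3000", "Investing"),
]

def report_totals(selected_report):
    """Sum amounts per line item for the one report the user selected."""
    # After filtering by report, the account-to-line mapping is one-to-many
    # from the roll-up side, so a plain dict lookup is enough.
    account_to_line = {
        acct: line for rep, acct, line in rollup if rep == selected_report
    }
    totals = defaultdict(float)
    for acct, amount in balances:
        line = account_to_line.get(acct)
        if line is not None:  # account not used on this report
            totals[line] += amount
    return dict(totals)
```

For example, `report_totals("BalanceSheet")` gives Cash 75.0 and Receivables 10.0, while `report_totals("CashFlow")` maps the same account 1000 to Operating instead.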

How to compare one row to others in DAX in Excel

I have this table, which has foreign keys from several other tables:
Basically, this table shows which students registered in which module run by which teacher in what term.
I want to query the following:
How many students have registered for more than one module run by a given tutor?
It will look something like this:
For example, Vasiliy Kuznetsov runs two modules: FunPro and NO. If one student registers for both of them, he is counted as one.
My SQL-oriented mind is telling me this: count all the rows in which student_id and tutor_id are the same. For example, in one row student_id is 5 and tutor_id is 10, and the same is true for the third row. Then I count it as one.
How can I do that with DAX formulas?
RowCount:=
COUNTROWS( ModuleRegistration )
StudentsWithTwoOrMoreRegistrations:=
COUNTROWS(
    FILTER(
        VALUES( ModuleRegistration[Student_ID] )
        ,[RowCount] >= 2
    )
)
I refer to arguments positionally, thus the first argument to a function is (1), the second (2), and so on.
So, [RowCount] is trivial.
[StudentsWithTwoOrMoreRegistrations] is a bit more involved. DAX, being a functional language, is best understood inside-out.
FILTER() takes a table expression in (1) and evaluates a boolean predicate, (2), for each row in (1). It returns all rows from (1) for which (2) evaluates to true.
Our FILTER()'s (1) is VALUES( ModuleRegistration[Student_ID] ). VALUES() returns the unique rows from a field based on current filter context (it respects slicers and filters in the pivot table). Thus, we will return some subset of the unique list of [Student_ID]s.
Our FILTER()'s (2) is [RowCount] >= 2. For each [Student_ID] in (1), we'll evaluate [RowCount], checking how many times that student appears in ModuleRegistration. [RowCount] is evaluated in the combination of filter context from the pivot table (the [Faculty Name] field in your sample pivot provides filter context) and row context from FILTER()'s (1). Thus it counts how many times the student appears in ModuleRegistration for the [Faculty Name] on the pivot table row.
We check that [RowCount] is >= 2.
You've not indicated if your measure needs to handle grand totals, or how you might want to see that. If you need more help for the grand total to get it to behave the way you like, let me know.
Edit for grand total
There are a few ways you might want to handle grand totals. I'm going to assume that you want a unique count of students.
StudentsWithTwoOrMoreRegistrations:=
COUNTROWS(
    SUMMARIZE(
        FILTER(
            SUMMARIZE(
                ModuleRegistration
                ,ModuleRegistration[Tutor_ID]
                ,ModuleRegistration[Student_ID]
            )
            ,[RowCount] >= 2
        )
        ,ModuleRegistration[Student_ID]
    )
)
WTF happened to our measure?
Let's examine:
Starting with the innermost SUMMARIZE(). SUMMARIZE() navigates relationships outward from the table in (1) and groups by the columns listed in (2)-(N) (these don't have to be from the table in (1), but must be reachable by navigating relationships).
This is equivalent to the following in SQL:
SELECT
    mr.Tutor_ID
    ,mr.Student_ID
FROM ModuleRegistration mr
GROUP BY
    mr.Tutor_ID
    ,mr.Student_ID
We use FILTER() on this table like earlier. [RowCount] is evaluated in the combination of filter context from the pivot table and the row in the table, defined by our SUMMARIZE() above.
Now our row context is a student-tutor pair rather than just a student. A pair will have a [RowCount] >= 2 when the student has taken more than one module from that tutor.
Our FILTER() returns the pairs which have a [RowCount] >= 2. This output table has two fields, [Tutor_ID] and [Student_ID], but we want to count distinct [Student_ID]s out of this.
Thus, we use the table from FILTER() as our (1) in the outer SUMMARIZE(). We group only by the values of [Student_ID]. We then count the rows of this table.
When only one [Faculty_Name] is in context, e.g. on a pivot table row, then our inner SUMMARIZE() is grouping by a single value of [Tutor_ID] and whatever [Student_ID]s are associated with it. This is identical to our earlier measure.
When we have many [Tutor_ID]s in context, like in the grand total, then we'll see the appropriate behavior of only counting each [Student_ID] once.
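The same counting logic can be sketched in Python to show why the grand total is not simply the sum of the per-tutor rows. The registration tuples and the `students_with_repeat_tutor` helper are hypothetical.

```python
from collections import Counter

# Hypothetical ModuleRegistration rows: (student_id, module, tutor_id).
registrations = [
    (5, "FunPro", 10),
    (5, "NO", 10),
    (6, "FunPro", 10),
    (6, "DB", 11),
    (7, "DB", 11),
    (7, "Nets", 11),
    (5, "DB", 11),
    (5, "Nets", 11),
]

def students_with_repeat_tutor(rows):
    """Distinct students with two or more registrations under the same tutor."""
    # Inner grouping: count rows per (tutor, student) pair.
    pair_counts = Counter((tutor, student) for student, _module, tutor in rows)
    # Keep pairs with a count of at least 2, then collapse to distinct students.
    return {student for (_tutor, student), n in pair_counts.items() if n >= 2}

# One tutor in context, as on a pivot table row.
per_tutor = {
    tutor: len(students_with_repeat_tutor([r for r in registrations if r[2] == tutor]))
    for tutor in sorted({r[2] for r in registrations})
}

# All tutors in context, as in the grand total.
grand_total = len(students_with_repeat_tutor(registrations))
```

Student 5 qualifies under both tutors, so the per-tutor counts are 1 and 2, yet the grand total is 2 rather than 3.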

how to join two or more tables and result set having all distinct values

I have some 20 Excel files containing data. All the tables have the same columns, like id, name, age, location, etc. Each file has distinct data, but I don't know whether data in one file is repeated in another file. So I want to join all the files, and the result set should contain distinct values. Please help me out with this problem as soon as possible. I want the result set to be stored in an Access database.
I would recommend either linking the sheets in Access or importing the sheets as tables.
From there, use a DISTINCT select over the tables/sheets to determine the keys required, and select only the records you need.
In SQL, you can use JOIN or NATURAL JOIN to join tables. I would look into NATURAL JOIN since you said all tables have the same columns.
After that you can use DISTINCT to get distinct values.
I'm not sure if this is what you're looking for though: your question asks about Excel but you've tagged it with SQL.
If you can use all the tables in one query, you can use a union to get the distinct rows:
select id, name, age, location from Table1
union
select id, name, age, location from Table2
union
select id, name, age, location from Table3
union
...
You can insert the records directly from the result:
insert into ResultTable
select id, name, age, location from Table1
union
....
If you only can select from one table at a time, you can skip the insert of rows that are already in the table:
insert into ResultTable
select t.id, t.name, t.age, t.location from Table1 as t
left join ResultTable as r on r.id = t.id
where r.id is null
(Assuming that id is a unique field identifying the record.)
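Both approaches can be tried out with Python's built-in sqlite3 module. SQLite here is only a stand-in for Access, and the table contents are hypothetical; the SQL statements themselves follow the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, name TEXT, age INTEGER, location TEXT);
    CREATE TABLE Table2 (id INTEGER, name TEXT, age INTEGER, location TEXT);
    CREATE TABLE ResultTable (id INTEGER, name TEXT, age INTEGER, location TEXT);
    INSERT INTO Table1 VALUES (1, 'Ann', 30, 'Oslo'), (2, 'Bob', 41, 'Rome');
    INSERT INTO Table2 VALUES (2, 'Bob', 41, 'Rome'), (3, 'Cid', 25, 'Lima');
""")

# Approach 1: UNION (without ALL) removes duplicate rows across the tables.
union_rows = conn.execute("""
    SELECT id, name, age, location FROM Table1
    UNION
    SELECT id, name, age, location FROM Table2
    ORDER BY id
""").fetchall()

# Approach 2: load one table at a time, skipping ids already in ResultTable.
# Table names come from a fixed tuple, so the f-string is safe here.
for source in ("Table1", "Table2"):
    conn.execute(f"""
        INSERT INTO ResultTable
        SELECT t.id, t.name, t.age, t.location FROM {source} AS t
        LEFT JOIN ResultTable AS r ON r.id = t.id
        WHERE r.id IS NULL
    """)

result_rows = conn.execute(
    "SELECT id, name, age, location FROM ResultTable ORDER BY id"
).fetchall()
```

Both `union_rows` and `result_rows` end up with three rows: Bob, who appears in both source tables, is stored only once.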
It seems the unique set of data you want is this:
SELECT T1.name, T1.loc
FROM [Excel 8.0;HDR=YES;IMEX=1;DATABASE=C:\db1.xls;].[Sheet1$] AS T1
UNION
SELECT T1.name, T1.loc
FROM [Excel 8.0;HDR=YES;IMEX=1;DATABASE=C:\db2.xls;].[Sheet1$] AS T1
...but that you then want to arbitrarily apply a sequence of integers as id (rather than using the id values from the Excel tables).
Because the Access Database Engine does not support common table expressions and Excel does not support VIEWs, you will have to repeat that UNION query as derived tables (hopefully the optimizer will recognize the repetition), e.g. using a correlated subquery to get the row number:
SELECT (
    SELECT COUNT(*) + 1
    FROM (
        SELECT T1.name, T1.loc
        FROM [Excel 8.0;HDR=YES;IMEX=1;DATABASE=C:\db1.xls;].[Sheet1$] AS T1
        UNION
        SELECT T1.name, T1.loc
        FROM [Excel 8.0;HDR=YES;IMEX=1;DATABASE=C:\db2.xls;].[Sheet1$] AS T1
    ) AS DT1
    WHERE DT1.name < DT2.name
) AS id,
DT2.name, DT2.loc
FROM (
    SELECT T2.name, T2.loc
    FROM [Excel 8.0;HDR=YES;IMEX=1;DATABASE=C:\db1.xls;].[Sheet1$] AS T2
    UNION
    SELECT T2.name, T2.loc
    FROM [Excel 8.0;HDR=YES;IMEX=1;DATABASE=C:\db2.xls;].[Sheet1$] AS T2
) AS DT2;
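The row-numbering trick can be checked with Python's sqlite3 module, again only as a stand-in for the Access Database Engine. The `Sheet1` and `Sheet2` tables and their rows are hypothetical replacements for the two Excel sheets, and the UNION text is repeated as two derived tables just as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sheet1 (name TEXT, loc TEXT);
    CREATE TABLE Sheet2 (name TEXT, loc TEXT);
    INSERT INTO Sheet1 VALUES ('Bob', 'Rome'), ('Ann', 'Oslo');
    INSERT INTO Sheet2 VALUES ('Cid', 'Lima'), ('Ann', 'Oslo');
""")

# The distinct-rows query, reused as both derived tables (no CTE or VIEW).
distinct_rows = "SELECT name, loc FROM Sheet1 UNION SELECT name, loc FROM Sheet2"

# id = 1 + number of distinct rows whose name sorts before this row's name.
numbered = conn.execute(f"""
    SELECT (SELECT COUNT(*) + 1
            FROM ({distinct_rows}) AS DT1
            WHERE DT1.name < DT2.name) AS id,
           DT2.name, DT2.loc
    FROM ({distinct_rows}) AS DT2
    ORDER BY id
""").fetchall()
```

One caveat follows directly from the comparison on name alone: two distinct rows that share a name but differ in loc would receive the same id.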
Note that you wrote: "I want the result set to be stored in an Access database."
Then maybe you should migrate the Excel data into a staging table in your Access database and do the data scrubbing from there. At least you could put that derived table into a VIEW :)
A join combines two tables by matching the values in corresponding columns. The result is a merged table that consists of the first table plus the matched rows copied from the second table. You can use the DIGBD add-in for Excel.
