powerquery group by rows with formula on multiple columns - excel

I am trying to group/merge two rows by dividing the values in each based on another column (Eligible) value.
From the initial raw data, I have reached this level with different steps (by unpivoting etc.) in power query.
Now I need to have a ratio per employee (eligible/not-eligible) for each month.
So for employee A, "Jan-14" will be -10/(-10 + -149) and so on. Any ideas will be appreciated. Thanks
Really appreciate the response. Interestingly, I used your other answer to reach this stage from the raw data.
Since we are calculating how much time an employee worked on eligible activities each month, we will be grouping on Employee. The employee name was only there for reference; I took it out and will later join with the employee query to get the names if required. There was a typo in the image: the last row should also be an employee with ID 2.
When there is a matching row, we use the formula to calculate the percentage of time spent on eligible activities, but:
If there isn't a matching row with Eligible=1, then the outcome should be 0.
If there isn't a matching row with Eligible=0, then the outcome should be 1 (100%).

Try this and modify as needed. It assumes you are starting with all Eligible=0 rows and only picks up a matching Eligible=1 row. If there is an E=1 without an E=0, it is removed. It also assumes we match on both Employee and EmployeeName.
~ ~ ~ ~ ~
Click to select the first three columns (Employee, EmployeeName, Eligible), right-click ... Unpivot Other Columns
Add custom column with name "One" and formula =1+[Eligible]
Merge the table onto itself, Home .. Merge Queries... with join kind Left Outer
Click to match Employee, EmployeeName and Attribute columns in the two boxes, and match One column in the top box to the Eligible Column in the bottom box
In the new column, use the arrows atop the column to expand, choosing [x] only the Value column. Name the column Custom.Value
Add column .. custom column ... formula = [Custom.Value] / ( [Custom.Value] + [Value])
Filter Eligible to only pick up the zeroes using the arrow atop that column
Remove extra columns
Click select Attribute column, Transform ... pivot ... use custom as the values column
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Employee", "EmployeeName", "Eligible"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "One", each 1+[Eligible]),
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Employee", "EmployeeName", "Attribute", "One"},#"Added Custom",{"Employee", "EmployeeName", "Attribute", "Eligible"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"Value"}, {"Custom.Value"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Added Custom", "Custom", each [Custom.Value]/([Custom.Value]+[Value])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Eligible] = 0)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Eligible", "Value", "One", "Custom.Value"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Custom", List.Sum)
in #"Pivoted Column"
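The follow-up comment asks for 0 when there is no Eligible=1 row and 1 when there is no Eligible=0 row, which the self-merge above does not cover. A minimal sketch of one way to handle this is to group instead of merge. It assumes the data is already unpivoted and that EmployeeName has been removed, as the comment describes; the sample values and step names are hypothetical.

```powerquery
let
    // Hypothetical sample in the unpivoted shape: one row per Employee / Eligible / month
    Source = #table(
        type table [Employee = Int64.Type, Eligible = Int64.Type, Attribute = text, Value = number],
        {
            {1, 1, "Jan-14", -10},
            {1, 0, "Jan-14", -149},
            {2, 0, "Jan-14", -50},   // no Eligible=1 row -> ratio 0
            {3, 1, "Jan-14", -20}    // no Eligible=0 row -> ratio 1
        }),
    // One group per employee and month; List.Sum of an empty list returns null
    Grouped = Table.Group(Source, {"Employee", "Attribute"}, {
        {"Ratio", (t) =>
            let
                eligible = List.Sum(Table.SelectRows(t, each [Eligible] = 1)[Value]),
                notEligible = List.Sum(Table.SelectRows(t, each [Eligible] = 0)[Value])
            in
                if eligible = null then 0
                else if notEligible = null then 1
                else eligible / (eligible + notEligible),
         type number}}),
    // Spread the months back out into columns
    Pivoted = Table.Pivot(Grouped, List.Distinct(Grouped[Attribute]), "Attribute", "Ratio")
in
    Pivoted
```

For employee 1 this evaluates -10/(-10 + -149), the same formula as in the question; employees 2 and 3 get 0 and 1 respectively.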

Related

Aggregate multiple (many!) pairs of columns (Excel)

I have a table that consists of pairs of date and value columns (pair 1, pair 2, pair 3, and so on).
What I need is the sum of all values for the same date.
The total table has 3146 columns (so 1573 pairs of value and date)!! with up to 186 entries on row level.
Thankfully, the first column contains all possible date values.
Considering the 3146 columns, I am not sure how to do that without a massive number of small steps :(
This shows a different method of creating the two column table that you will group by Date and return the Sum. Might be faster than the List.Accumulate method. Certainly worth a try in view of your comment above.
Unpivot the original table
Add 0-based Index column; then IntegerDivide by 2
Group by the IntegerDivide column and extract the Date and Value to separate columns.
Then group by date and aggregate by sum
let
Source = Excel.CurrentWorkbook(){[Name="Table12"]}[Content],
//assuming only columns are Date and Value, this will set the data types for any number of columns
Types = List.Transform(List.Alternate(Table.ColumnNames(Source),1,1,1), each {_, type date}) &
List.Transform(List.Alternate(Table.ColumnNames(Source),1,1,0), each {_, type number}),
#"Changed Type" = Table.TransformColumnTypes(Source,Types),
//Unpivot all columns to create a two column table
//The Value.1 column will alternate the related Date/Value
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {}, "Attribute", "Value.1"),
//add a column to group the pairs of values
//below two lines => a column in sequence of 0,0,1,1,2,2,3,3, ...
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1, Int64.Type),
#"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "Integer-Division", each Number.IntegerDivide([Index], 2), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Integer-Division",{"Index"}),
// Group by the "pairing" sequence,
// Extract the Date and Value to new columns
// => a 2 column table
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Integer-Division"}, {
{"Date", each [Value.1]{0}, type date},
{"Value", each [Value.1]{1}, type number}}),
#"Removed Columns1" = Table.RemoveColumns(#"Grouped Rows",{"Integer-Division"}),
//Group by Date and aggregate by Sum
#"Grouped Rows1" = Table.Group(#"Removed Columns1", {"Date"}, {{"Sum Values", each List.Sum([Value]), type number}}),
//Sort into date order
#"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"Date", Order.Ascending}})
in
#"Sorted Rows"
A quick Google search shows "Number of columns per table 16,384" for Power Query and 16,000 for Power BI, so I'm thinking you have to split your input data somehow first, or perhaps this is not the tool for you; maybe AWK.
Assuming that works, an M version of what you are looking for. It stacks the columns in groups of 2, then groups and sums them
let Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
Combo = List.Split(Table.ColumnNames(Source),2),
#"Added Custom" =List.Accumulate(
Combo,
#table({"Column1"}, {}),
(state,current)=> state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)),1)
),
#"Grouped Rows" = Table.Group(#"Added Custom", {"Column1"}, {{"Sum", each List.Sum([Column2]), type number}})
in #"Grouped Rows"
186 rows * 1573 pairs of columns = 292,578 records.
Assuming not a very old version of Excel, 293k rows is fine, so it can be done with formulae:
Insert five columns to the left, so data starts in F3.
In A3 put zero, in A4 put 1, select the two and drag down to A188.
In A189 put =A3.
In B3 put 0, and drag down to B188.
In B189 put =B3+2 (each block of 186 rows moves two columns to the right, to the next Date/Value pair)
"Drag"* down A189 and B189 to row 292580
In C3 put =OFFSET($F$3,A3,B3)
In D3 put =OFFSET($F$3,A3,B3+1)
Select those two cells and click on the cross at bottom right to copy them to the end of column B.
Then put Date and Value headers in C2 and D2, directly above the formulas, and use a Pivot Table to get totals, averages, or whatever you need.
Any blank cells in the original input do not matter.
* To "drag" down hundreds of thousands of cells:
Copy A189 and B189
Goto (F5) A292580
Paste
Pin (F8)
CTRL-up arrow
Enter
And rather than $F$3 I would name that cell Origin, and use "Origin" in the two Offset formulae, but many people seem to consider that too complicated.
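The helper columns A and B are just a row offset and a column offset fed to OFFSET: the row offset cycles through 0..185, and the column offset has to advance by 2 for each block of 186 rows so that it lands on the next Date/Value pair. As a rough sketch, the same arithmetic written in M looks like this; PairCount is a hypothetical small value so the result is easy to inspect (the question has 1573 pairs).

```powerquery
let
    RowsPerBlock = 186,   // rows of data under each Date/Value pair (from the question)
    PairCount = 3,        // hypothetical; the question has 1573 pairs
    Offsets = List.Transform(
        {0 .. RowsPerBlock * PairCount - 1},
        (i) => [
            RowOffset = Number.Mod(i, RowsPerBlock),                // column A: 0..185, repeating
            ColOffset = 2 * Number.IntegerDivide(i, RowsPerBlock)   // column B: 0, 2, 4, ...
        ]),
    Result = Table.FromRecords(Offsets)
in
    Result
```

Row 0 of the result is (0, 0) and row 186 is (0, 2), which is what the worksheet cells A3:B3 and A189:B189 need to hold.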

Group by column A value, transpose column B, column C row values for each grouped column A value

This is in Excel 2016. I have a spreadsheet where each row represents a response to two questions "Qa" and "Qb" from a unique student. The spreadsheet columns are: "Section" (class section student is in), "Qa", and "Qb".
Thus, if three students answered from the same class section, that section will be listed three times under "Section", with each student's answers in the other columns.
I want to group by section and spread the answers to each question across a single row in separate columns. The number of columns to create will default to the section with the most unique responses
In this case, 10003 has the greatest number of responses, so I want to get the following end result.
I am at a loss with how to get this going. Something like grouping by the section but transposing the rows within that group?
As @ScottCraner pointed out, you can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
//Change table name in next row to actual table name in workbook
Source = Excel.CurrentWorkbook(){[Name="Table20"]}[Content],
//set data type
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Section", Int64.Type}, {"Qa", type text}, {"Qb", type text}}),
//Group by Section
//Add a 1-based Index column to each Group
#"Grouped Rows" = Table.Group(#"Changed Type", {"Section"}, {
{"Row", each Table.AddIndexColumn(_,"Row",1,1)}}),
//Expand the grouped tables
#"Expanded Row" = Table.ExpandTableColumn(#"Grouped Rows", "Row", {"Qa", "Qb", "Row"}, {"Qa", "Qb", "Row"}),
//Unpivot
//Merge Row and Attribute columns to create the q-number headers
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Expanded Row", {"Section", "Row"}, "Attribute", "Value"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Unpivoted Other Columns",
{{"Row", type text}}, "en-US"),{"Attribute", "Row"},
Combiner.CombineTextByDelimiter("-", QuoteStyle.None),"Merged"),
//Pivot on the Sorted Merged column with no aggregation
#"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Sort(List.Distinct(#"Merged Columns"[Merged])), "Merged", "Value")
in
#"Pivoted Column"
Note that there are no empty columns (in other words, there is no Qa-4).
If you really need an empty column, insert a step at the beginning that replaces nulls with an empty string:
let
//Change table name in next row to actual table name in workbook
Source = Excel.CurrentWorkbook(){[Name="Table20"]}[Content],
//set data type
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Section", Int64.Type}, {"Qa", type text}, {"Qb", type text}}),
//if you really need a blank Qa column since you have four distinct Qb rows but only 3 Qa rows,
// then we insert the next line
#"Replaced Value" = Table.ReplaceValue(#"Changed Type",null,"",Replacer.ReplaceValue,{"Qa", "Qb"}),
//Group by Section
//Add a 1-based Index column to each Group
#"Grouped Rows" = Table.Group(#"Replaced Value", {"Section"}, {
{"Row", each Table.AddIndexColumn(_,"Row",1,1)}}),
//Expand the grouped tables
#"Expanded Row" = Table.ExpandTableColumn(#"Grouped Rows", "Row", {"Qa", "Qb", "Row"}, {"Qa", "Qb", "Row"}),
//Unpivot
//Merge Row and Attribute columns to create the q-number headers
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Expanded Row", {"Section", "Row"}, "Attribute", "Value"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Unpivoted Other Columns",
{{"Row", type text}}, "en-US"),{"Attribute", "Row"},
Combiner.CombineTextByDelimiter("-", QuoteStyle.None),"Merged"),
//Pivot on the Sorted Merged column with no aggregation
#"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Sort(List.Distinct(#"Merged Columns"[Merged])), "Merged", "Value")
in
#"Pivoted Column"
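The reason the null replacement matters is that Table.UnpivotOtherColumns does not emit a row for a null cell, so a missing answer never reaches the pivot. A small sketch on a hypothetical one-row sample shows the difference:

```powerquery
let
    // Hypothetical sample: the student answered Qb but left Qa empty
    Sample = #table({"Section", "Qa", "Qb"}, {{10003, null, "yes"}}),
    // Without the replacement step, the null Qa cell produces no row at all
    Dropped = Table.UnpivotOtherColumns(Sample, {"Section"}, "Attribute", "Value"),
    // With nulls replaced by "", both Qa and Qb survive the unpivot
    Replaced = Table.ReplaceValue(Sample, null, "", Replacer.ReplaceValue, {"Qa", "Qb"}),
    Kept = Table.UnpivotOtherColumns(Replaced, {"Section"}, "Attribute", "Value"),
    Result = [WithoutReplace = Table.RowCount(Dropped), WithReplace = Table.RowCount(Kept)]
in
    Result   // [WithoutReplace = 1, WithReplace = 2]
```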

Excel calculate value for new category based on other group categories

I am getting data from a database that is provided in long format and I need to get ratios from values that are given different categories. E.g. I want the average price based on revenues and quantity sold.
Is there an easy way to calculate this in a pivot once I have the data?
My MWE would look like this
And I would like to calculate the new rows with the category price.
One way would probably be to do this in MS SQL beforehand, but I am not that skilled with it, and I need my colleagues to be able to do this in Excel themselves.
In Power Query, you can
Group the Rows by Year
From the resultant tables, divide the 1st Value by the 2nd.
Paste the code below into the Advanced Editor; and change the table name in Line 2 to reflect the actual table name of your data. Then you can explore the "Applied Steps" in the UI to see how the code was generated.
Changing the data table will change the Query results, but you will need to "Refresh" the query. This can be done from the Ribbon, or you can create a button on the worksheet.
M-Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"Year"}, {{"Grouped", each _, type table [Year=number, Category=text, Value=number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Price",
each Table.Column([Grouped],"Value"){0} /
Table.Column([Grouped],"Value"){1})
in
#"Added Custom"
Edit: From your comments, it seems you might have more than just Revenue/Quantity pairs of categories for each year. And I suppose it is possible you might have more than a single Revenue/Quantity pair.
Below is code that will take that into account; breaking the Quantity and Revenue from each year into two columns, then dividing one by the other which would result in a weighted average price for each year:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//needed only if you have blank rows in the table
#"Filtered Rows" = Table.SelectRows(Source, each ([Year] <> null)),
//Group by Year
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Year"}, {{"Grouped", each _, type table [Year=number, Category=text, Value=number]}}),
//Extract Revenue and Quantity into two new columns of Lists
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Revenue", each Table.Column(Table.SelectRows([Grouped], each ([Category] = "Revenue")),"Value")),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Quantity", each Table.Column(Table.SelectRows([Grouped], each ([Category] = "Quantity")),"Value")),
//Sum the values in each List of Revenue and divide by the sum of the List of Quantity
//This will result in a weighted average if there is more than one Revenue/Quantity pair in a year
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Price", each List.Sum([Revenue]) / List.Sum([Quantity])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Grouped", "Revenue", "Quantity"}),
//Some cleanup
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Year", Int64.Type}, {"Price", Currency.Type}})
in
#"Changed Type"
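To see the weighted average on concrete numbers, here is a compact variant that does the sum and the division inside the Table.Group aggregation itself. It is only a sketch: the sample rows are hypothetical, and it assumes the categories are spelled exactly "Revenue" and "Quantity".

```powerquery
let
    // Hypothetical long-format sample; 2020 has two Revenue/Quantity pairs
    Source = #table(
        type table [Year = Int64.Type, Category = text, Value = number],
        {
            {2019, "Revenue", 200}, {2019, "Quantity", 40},
            {2020, "Revenue", 100}, {2020, "Quantity", 10},
            {2020, "Revenue", 50},  {2020, "Quantity", 5}
        }),
    // Total revenue divided by total quantity per year
    Priced = Table.Group(Source, {"Year"}, {
        {"Price", (t) =>
            List.Sum(Table.SelectRows(t, each [Category] = "Revenue")[Value]) /
            List.Sum(Table.SelectRows(t, each [Category] = "Quantity")[Value]),
         type number}})
in
    Priced
```

With these rows, 2019 gives 200/40 = 5 and 2020 gives (100 + 50)/(10 + 5) = 10.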

How to Merge Query for multiple column

I have a table of information like this:
And a lookup table for user names to IDs:
How do I do a Merge on each column to lookup the values from the other table so I get this result:
I do not want to manually apply an action to each role column, because the list of roles may grow or shrink. So the solution needs to apply to all columns (except the first) in the table.
Can this be done?
Basically this calls for unpivot on the Project data, merge to the other table, then re-pivot to get back in proper order
Steps:
Load in the ID data; here I am assuming it is loaded in query ID_Table
Load in Project data; here I am assuming it is loaded in range Projects
In the project query, right-click the first (project) column, unpivot other columns
Home ... Merge queries...
Merge the two tables using the Value column in the project query and the Person column in the ID_Table query, and use Left Outer merge
Expand results using double arrows atop column and uncheck all except ID
Right-click the value column and remove
Click attribute column ... transform .. pivot column...
Use ID as value column ... advanced options ... don't aggregate
sample code
let Source = Excel.CurrentWorkbook(){[Name="Projects"]}[Content],
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Promoted Headers", {"Project"}, "Attribute", "Value"),
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns",{"Value"},ID_Table,{"Person"},"ID_Table",JoinKind.LeftOuter),
#"Expanded ID_Table" = Table.ExpandTableColumn(#"Merged Queries", "ID_Table", {"ID"}, {"ID"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded ID_Table",{"Value"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "ID")
in #"Pivoted Column"
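As an alternative to unpivot/merge/pivot (a different technique, not the one in the answer above), the lookup table can be turned into a record and applied to every column except the first. This is a sketch under assumptions: the sample tables and role names are hypothetical, Person values must be unique text, and names missing from the lookup come back as null.

```powerquery
let
    // Hypothetical sample data standing in for the Projects range and the ID_Table query
    Projects = #table({"Project", "Manager", "Developer"}, {
        {"Alpha", "Ann", "Bob"},
        {"Beta", "Bob", null}}),
    ID_Table = #table({"Person", "ID"}, {{"Ann", 1}, {"Bob", 2}}),
    // Build a record such as [Ann = 1, Bob = 2]
    Lookup = Record.FromList(ID_Table[ID], ID_Table[Person]),
    // Every column except the first is treated as a role column
    RoleColumns = List.Skip(Table.ColumnNames(Projects), 1),
    // Replace each name with its ID; empty cells stay null
    Result = Table.TransformColumns(Projects,
        List.Transform(RoleColumns, (c) =>
            {c, each if _ = null then null else Record.FieldOrDefault(Lookup, _, null)}))
in
    Result
```

Because the role columns are read from the table at refresh time, adding or removing a role column needs no change to the query, and the original row and column order is kept.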

How can you get Power Query/Excel to separate and associate two unique IDs from the same column?

I have a dataset that exports with a single column including personnel IDs and job IDs.
I want to use Power Query to separate Person_ID into one column and Job_ID into another column. People are associated with the job that appears closest above them. Job IDs are a 6-character text string, Person IDs are 9 characters. The same Job_ID can apply to multiple people, but Person_ID is unique (only one job per person, multiple people for some jobs).
Example data structure:
Hope someone's got something!
Step by step
Highlight input data
Data...From Table/Range... do not check [] my table has headers
Add Column...Custom Column... using column name Custom, with formula
Text.Length([Column1])
Add Column...Custom Column... using column name Custom.1, with formula
if [Custom]=6 then [Column1] else null
Click on Custom.1 column, right click and do fill...down...
Use arrow next to Custom column and uncheck [] 6 leaving just [x]11
Click column Custom, right click and choose remove columns
file...close and load
Code produced:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each Text.Length([Column1])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if [Custom]=6 then [Column1] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom.1"}),
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Custom] =11)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom"})
in #"Removed Columns"
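The question describes Person IDs as 9 characters while the steps above filter on a length of 11, so it may be safer to keep every row that is not a 6-character job ID instead of matching one specific person-ID length. A minimal sketch of that variation, using hypothetical sample IDs and assuming Column1 contains text only:

```powerquery
let
    // Hypothetical sample: 6-character job IDs, each followed by the people assigned to it
    Source = #table({"Column1"}, {
        {"JOB001"}, {"P00000001"}, {"P00000002"},
        {"JOB002"}, {"P00000003"}}),
    // A row is a job header when its text is exactly 6 characters long
    Tagged = Table.AddColumn(Source, "Job_ID",
        each if Text.Length([Column1]) = 6 then [Column1] else null, type nullable text),
    // Carry each job ID down onto the person rows beneath it
    Filled = Table.FillDown(Tagged, {"Job_ID"}),
    // Keep only the person rows, whatever their length
    People = Table.SelectRows(Filled, each Text.Length([Column1]) <> 6),
    Renamed = Table.RenameColumns(People, {{"Column1", "Person_ID"}})
in
    Renamed
```

With the sample above, the first two people are paired with JOB001 and the third with JOB002.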
