I have a table where rows are duplicated and I need to merge these into one row.
Download Example data
In this example I have condensed it to January and February, which I need to merge. In my actual data there is one column for each month.
I can do this in Excel, but I would like to do it in Power Query instead, if possible.
So far I have tried Group By and transposing the rows, but either I get an error or I end up with the same result: one row per month.
Try this on your source data file. It groups on every non-month column, then fills the month columns up within each group and keeps only the first row:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Month columns to merge. "July" is not in this list; add it if your table has that column.
    Months = {"December", "November", "October", "September", "August", "June", "May", "April", "March", "February", "January"},
    // Group on the columns that identify a record; per group, fill month values upward and keep the first row
    #"Grouped Rows" = Table.Group(Source,
        {"ContractNumber", "Year", "Quarter", "Product", "SerialNumber", "PersonFullName", "PersonCostCarrierCode", "CompanyName", "CompanyOrgNo", "Cost"},
        {{"data", each Table.FirstN(Table.FillUp(Table.SelectColumns(_, Months), Months), 1), type table}}),
    // Bring the merged month columns back next to the grouping columns
    #"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", Months, Months)
in
    #"Expanded data"
You can simply filter the [January] column on "January" and then replace all null values in the [February] column with "February" to get your expected result. All other data is duplicate anyhow, except for the MonthNr, but you are discarding that information.
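A minimal sketch of that filter-and-replace idea, assuming the month cells hold the month name as text (as the answer implies) and that every record has a February entry. The two-row sample table and the step names are made up for illustration:

```powerquery
let
    // Hypothetical sample: one record split across a January row and a February row
    Source = #table(
        {"ContractNumber", "January", "February"},
        {{"C-1", "January", null}, {"C-1", null, "February"}}
    ),
    // Keep only the rows that carry the January value
    JanuaryRows = Table.SelectRows(Source, each [January] = "January"),
    // Fill the empty February cells with the literal text "February"
    Result = Table.ReplaceValue(JanuaryRows, null, "February", Replacer.ReplaceValue, {"February"})
in
    Result
```

This only gives the right answer when the replaced value really is the same for every row; for months whose values differ per record, a grouping approach is safer.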
Here is one way of doing this:
Unpivot the month columns
After doing this, the Attribute and Value columns will be identical, so delete one of them
Group by the columns that define your "duplicate rows"
In the aggregation, create a record of the month data:
e.g. {"Records", each Record.FromList([Month],[Month]), type record}
Then expand that record column, using a list of the months for the column names
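The steps above could be put together roughly as follows. This is a sketch under assumptions: the sample table, the single key column ContractNumber, and the step names are illustrative, and the unpivoted attribute column is named "Month" to match the aggregation shown in the answer.

```powerquery
let
    // Hypothetical sample: each contract is split across one row per month
    Source = #table(
        {"ContractNumber", "January", "February"},
        {
            {"C-1", "January", null}, {"C-1", null, "February"},
            {"C-2", "January", null}, {"C-2", null, "February"}
        }
    ),
    // Unpivot the month columns; null cells are dropped by the unpivot
    Unpivoted = Table.UnpivotOtherColumns(Source, {"ContractNumber"}, "Month", "Value"),
    // Attribute and value are identical here, so keep only one of them
    OneColumn = Table.RemoveColumns(Unpivoted, {"Value"}),
    // Group by the key column(s) and build a record whose field names and values are the months
    Grouped = Table.Group(OneColumn, {"ContractNumber"},
        {{"Records", each Record.FromList([Month], [Month]), type record}}),
    // Expand the record using the list of month names as column names
    MonthNames = {"January", "February"},
    Expanded = Table.ExpandRecordColumn(Grouped, "Records", MonthNames)
in
    Expanded
```

In real data the grouping list would contain every column that defines a duplicate row, and MonthNames would list all month columns.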
I would like to sort an array of months in regular or reverse calendar order.
At the moment I have the following array:
["August", "July", "June", "May", "April", "March", "February", "January", "September"]
I would like it to be:
["September", "August", "July", "June", "May", "April", "March", "February", "January"]
I used the sort filter, but it only sorts in ascending or descending order.
I need to merge these project description rows into a single row so that every record has the same number of rows and I can transpose them into proper columns with Power Query (see image). I know how to transpose in Power Query when the number of rows is consistent across records, but I cannot figure out how to do it when the number of rows differs. The data comes from a horribly formatted PDF that breaks the Project Description information into separate rows. THAT IS THE KEY PROBLEM; the rest is a piece of cake. See the snippet to see what I mean.
Each transposed record will have seven columns:
Director Analysis
Address
Project
Area
Notice Date
Project Description
Appeal
I can get everything I need, including the headers. I just can't figure out how to merge the rows under Project Description so that I can proceed with the transpose.
Here is the link to view a screenshot of my sheet.
This is a kludge, but it seems to work. It assumes the column we want to operate on is named a in Power Query.
It looks for anything between the row that contains "Project Description" and the row that contains "Appeals must be".
Create a shifted copy of the column, so we can see what is on the row above
Add an index
Use custom columns to determine which rows need filtering out, and which rows are the start and end rows to combine, based on the first column and the shifted first column
Merge the text together based on that info, merge that back into the original table, then remove the extra rows
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// create shifted row
shiftedList = {null} & List.RemoveLastN(Source[a],1),
custom3 = Table.ToColumns(Source) & {shiftedList},
custom4 = Table.FromColumns(custom3,Table.ColumnNames(Source) & {"Next Row Header"}),
#"Added Index" = Table.AddIndexColumn(custom4, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try if Text.Contains([Next Row Header],"Project Description" ) then [Index] else if Text.Contains([a],"Appeals must be") then [Index] else null otherwise 0),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
#"Added Custom1" = Table.AddColumn(#"Filled Down", "Custom.1", each try if Text.Contains([Next Row Header],"Project Desc") then "remove" else if Text.Contains([a],"Appeals must be") then "keep" else null otherwise "keep"),
#"Filled Down1" = Table.FillDown(#"Added Custom1",{"Custom.1"}),
#"Filtered Rows1" = Table.SelectRows(#"Filled Down1", each ([Custom.1] = "remove")),
#"Grouped Rows1" = Table.Group(#"Filtered Rows1", {"Custom"}, {{"Count", each Text.Combine(List.Transform([a], Text.From), ","), type text}}),
#"Merged Queries" = Table.NestedJoin(#"Filled Down1", {"Index"}, #"Grouped Rows1", {"Custom"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Count"}, {"Count"}),
#"SwapValue"= Table.ReplaceValue( #"Expanded Table2", each [Custom.1], each if [Count] = null then [Custom.1] else "keep", Replacer.ReplaceValue,{"Custom.1"}),
#"Final Swap"=Table.ReplaceValue(#"SwapValue",each [a], each if [Count]=null then [a] else [Count] , Replacer.ReplaceValue,{"a"}),
#"Filtered Rows" = Table.SelectRows(#"Final Swap", each ([Custom.1] = "keep")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Next Row Header", "Index", "Custom", "Custom.1", "Count"})
in #"Removed Columns"
Apologies if this has been asked before, although I tried searching on the forum and didn't find anything.
Let's say I have two tables having identical columns. The only difference is that these are for two different dates hence the values might change.
Table 1: TODAY
ID   Name  Dept   Salary  Date
X02  Jim   HR     40,000  03/31/2021
X03  Ray   Admin  45,000  03/31/2021
X04  Mark  Sales  55,000  03/31/2021
Table 2: YDAY
ID   Name  Dept   Salary  Date
X01  John  Sales  50,000  03/31/2020
X02  Jim   HR     40,000  03/31/2020
X03  Ray   Admin  45,000  03/31/2020
Now I use a Power Query merge, select ID as the lookup value and perform a full outer join (i.e. pick up all the IDs). However, when I do that it keeps the two lookup columns separate. What I want is to merge them into one unified ID column that contains all of the IDs in both data sets (i.e. X01, which was present in YDAY but not in TODAY, and X04, which is present in TODAY but was not in YDAY). See below for the desired result.
Can you please help or point out in the right direction?
My desired result is as follows.
ID   Name  Dept   Salary  Date        Name_Prev  Dept_Prev  Salary_Prev  Date_Prev
X01                                   John       Sales      50,000       03/31/2020
X02  Jim   HR     40,000  06/30/2021  Jim        HR         40,000       03/31/2020
X03  Ray   Admin  45,000  06/30/2021  Ray        Admin      45,000       03/31/2020
X04  Mark  Sales  55,000  06/30/2021
You could create a consolidated list of IDs from both tables as a separate query, and then merge it with Table1 and Table2 to get your desired output. Not sure if it would be any more efficient but it's an option.
let
#"Table1 IDs" = Table.SelectColumns(Table1,{"ID"}),
#"Table2 IDs" = Table.SelectColumns(Table2,{"ID"}),
#"Appended Query" = Table.Combine({#"Table1 IDs", #"Table2 IDs"}),
#"Removed Duplicates" = Table.Distinct(#"Appended Query"),
#"Merge with Table1" = Table.NestedJoin(#"Removed Duplicates", {"ID"}, Table1, {"ID"}, "Table1", JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merge with Table1", "Table1", {"Name", "Dept", "Salary", "Date"}, {"Name", "Dept", "Salary", "Date"}),
#"Merge with Table2" = Table.NestedJoin(#"Expanded Table1", {"ID"}, Table2, {"ID"}, "Prev", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merge with Table2", "Prev", {"Name", "Dept", "Salary", "Date"}, {"Name_Prev", "Dept_Prev", "Salary_Prev", "Date_Prev"}),
#"Sorted Rows" = Table.Sort(#"Expanded Table2",{{"ID", Order.Ascending}})
in
#"Sorted Rows"
I have a question:
I have 60+ tables in dbf with columns: year, product, value. Tables have different years data.
EXAMPLE.
Table 1
Year product value
1993 Apple 98.45
1994 Mushrooms 67.54
Table 2
Year product value
1992 Apple 95.45
2021 Melon 112.0
I need to consolidate all the tables into one table, for use in a pivot table.
My way:
Let
DatesList={1992, 1993,1994,1995,2021},
Tbl=Odbc.Query("dsn=my_custom_dsn", "select * from c:\data\1993.dbf"),
Result=List.Accumulate (DatesList, Tbl, (state, current) =>Table.Join(Tbl, "product", Query.odbc("dsn=my_custom_dsn", "select * from c:\data\" +Text.From(current) +".dbf", "product")
in
Result
It runs, but the result only contains data for the last date. How do I keep the table between dates?
Please help.
I think you are overcomplicating.
Try this:
let
DatesList = Table.FromList({1992,1993,1994,1995,2021}, Splitter.SplitByNothing(), {"Year"}, null, ExtraValues.Error)
, #"Added Custom" = Table.AddColumn(DatesList, "Data", each Odbc.Query("dsn=my_custom_dsn", "select * from c:\data\" & Number.ToText([Year]) & ".dbf"))
, #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Data", {"product", "value"}, {"product", "value"})
in
#"Expanded Data"
Note that in your question you misspelled Odbc.Query as Query.odbc.
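As for why the original List.Accumulate attempt only returned the last date: its accumulator function combines each new table with Tbl, the fixed 1993 table, instead of with state, so every iteration throws away the previous result. A minimal sketch of an accumulator that carries state forward, using made-up in-memory tables in place of the ODBC calls and appending rows rather than joining:

```powerquery
let
    // Hypothetical stand-ins for the per-year DBF tables
    Tables = {
        #table({"Year", "product", "value"}, {{1993, "Apple", 98.45}, {1994, "Mushrooms", 67.54}}),
        #table({"Year", "product", "value"}, {{1992, "Apple", 95.45}, {2021, "Melon", 112.0}})
    },
    // Seed with an empty table that has the same columns
    Empty = #table({"Year", "product", "value"}, {}),
    // Append each table to the running result held in "state"
    Combined = List.Accumulate(Tables, Empty, (state, current) => Table.Combine({state, current}))
in
    Combined
```

When all that is needed is an append, Table.Combine(Tables) on the list gives the same rows without an explicit accumulator.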
I am trying to group/merge two rows by dividing the values in each based on another column (Eligible) value.
From the initial raw data, I have reached this level with different steps (by unpivoting etc.) in power query.
Now I need to have a ratio per employee (eligible/not-eligible) for each month.
So for employee A, "Jan-14" will be -10/(-10 + -149) and so on. Any ideas will be appreciated. Thanks
Really appreciate the response. Interestingly, I have used your other answer to reach this stage from the raw data.
Since we are calculating how much time an employee worked on eligible activities each month, we will be grouping on Employee. The employee name was just for reference; I took it out and will later join with the employee query to get the names if required. There was a typo in the image: the last row should also be an employee with id 2.
So when there is a matching row, we use the formula to calculate the percentage of time spent on eligible activities, but:
If there isn't a matching row with eligible=1, then the outcome should be 0
If there isn't a matching row with eligible=0, then the outcome should be 1 (100%)
Try this and modify as needed. It assumes you start from the Eligible=0 rows and only picks up the matching Eligible=1 rows. If there is an Eligible=1 row without an Eligible=0 row, it is removed. It also assumes we match on both Employee and EmployeeName.
~ ~ ~ ~ ~
Select the first three columns (Employee, EmployeeName, Eligible), right-click ... Unpivot Other Columns
Add a custom column named "One" with the formula = 1 + [Eligible]
Merge the table onto itself: Home ... Merge Queries ... with join kind Left Outer
Match the Employee, EmployeeName and Attribute columns in the two boxes, and match the One column in the top box to the Eligible column in the bottom box
In the new column, use the arrows atop the column to expand, choosing [x] only the Value column. Name the column Custom.Value
Add column ... Custom column ... formula = [Custom.Value] / ([Custom.Value] + [Value])
Filter Eligible to keep only the zeroes, using the arrow atop that column
Remove the extra columns
Select the Attribute column, Transform ... Pivot Column ... use Custom as the values column
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Employee", "EmployeeName", "Eligible"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "One", each 1+[Eligible]),
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Employee", "EmployeeName", "Attribute", "One"},#"Added Custom",{"Employee", "EmployeeName", "Attribute", "Eligible"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"Value"}, {"Custom.Value"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Added Custom", "Custom", each [Custom.Value]/([Custom.Value]+[Value])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Eligible] = 0)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Eligible", "Value", "One", "Custom.Value"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Custom", List.Sum)
in #"Pivoted Column"
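The query above drops an Eligible=1 row that has no Eligible=0 partner and returns null when the Eligible=1 row is missing. One possible alternative that covers the two edge cases from the question (0 when there is no eligible row, 1 when there is no non-eligible row) is to group instead of self-merging. This is only a sketch: the sample data, the column name "Month", and grouping on Employee alone are assumptions.

```powerquery
let
    // Hypothetical sample: employee 1 has both kinds of row, employee 2 only Eligible = 0
    Source = #table(
        {"Employee", "Eligible", "Jan-14", "Feb-14"},
        {{1, 1, -10, -20}, {1, 0, -149, -60}, {2, 0, -50, -40}}
    ),
    Unpivoted = Table.UnpivotOtherColumns(Source, {"Employee", "Eligible"}, "Month", "Value"),
    // Per employee and month: eligible total divided by overall total
    Grouped = Table.Group(Unpivoted, {"Employee", "Month"}, {{"Ratio", (t) =>
        let
            total = List.Sum(t[Value]),
            // List.Sum of an empty list is null, which happens when no Eligible = 1 row exists
            eligible = List.Sum(Table.SelectRows(t, each [Eligible] = 1)[Value])
        in
            if total = null or total = 0 then null
            else (if eligible = null then 0 else eligible) / total,
        type nullable number}}),
    // One column per month again
    Pivoted = Table.Pivot(Grouped, List.Distinct(Grouped[Month]), "Month", "Ratio")
in
    Pivoted
```

For employee 1 in "Jan-14" this evaluates -10 / (-10 + -149), matching the formula in the question. An employee with only Eligible=1 rows gets 1, because the eligible total equals the overall total.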