Adding unique index in Excel Power Query - excel

I have a table which looks like
Date
9/4/2016
9/11/2016
9/18/2016
9/25/2016
10/2/2016
10/9/2016
10/16/2016
10/23/2016
10/30/2016
11/6/2016
11/13/2016
11/20/2016
11/20/2016
I'm trying to assign unique index values to the Date column, but couldn't do it using 'Add custom index value' in Power Query, which doesn't check for duplicates. I also tried Date.WeekOfYear, which gives the week number within the year, but I want consecutive numbers starting from 1, with duplicate dates sharing the same number, like this:
Date Custom_weeknumber
9/4/2016 1
9/11/2016 2
9/18/2016 3
9/25/2016 4
10/2/2016 5
10/9/2016 6
10/16/2016 7
10/23/2016 8
10/30/2016 9
11/6/2016 10
11/13/2016 11
11/20/2016 12
11/20/2016 12
Any help would be appreciated, thanks!

Assuming:
Your dates are sorted.
The row after duplicates will get the Custom_weeknumber from the duplicates + 1.
Then you can group by dates (with new column name e.g. "DateGroups" and Operation "All Rows"), add an index column, expand the "DateGroups" field and remove the "DateGroups" field.
Code example created in Power Query in Excel:
let
Source = Excel.CurrentWorkbook(){[Name="Dates"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Date", type date}}),
Grouped = Table.Group(Typed, {"Date"}, {{"DateGroups", each _, type table}}),
Numbered = Table.AddIndexColumn(Grouped, "Custom_weeknumber", 1, 1),
Expanded = Table.ExpandTableColumn(Numbered, "DateGroups", {"Date"}, {"DateGroups"}),
Removed = Table.RemoveColumns(Expanded,{"DateGroups"})
in
Removed

I'd do it this way (which seems a bit simpler to me; I don't like nested tables unless absolutely needed):
Group By Date column
Optional: Sort
Add index
Join source table
Code:
let
//Source = Excel.CurrentWorkbook(){[Name="YourTableName"]}[Content],
Source = #table(type table [Date = date], {{#date(2016, 10, 12)}, {#date(2016, 10, 13)}, {#date(2016,10,14)}, {#date(2016, 10, 14)}}),
GroupBy = Table.RemoveColumns(Table.Group(Source, "Date", {"tmp", each null, type any}), {"tmp"}),
//Optional: sort to ensure values are ordered
Sort = Table.Sort(GroupBy,{{"Date", Order.Ascending}}),
Index = Table.AddIndexColumn(Sort, "Custom_weeknumber", 1, 1),
JoinTables = Table.Join(Source, {"Date"}, Index, {"Date"}, JoinKind.Inner)
in
JoinTables
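A third variant, not taken from either answer above, looks up each date's position in the list of distinct dates. This is a minimal sketch that assumes a single Date column and numbers the dates in order of first appearance, so sort first if the source isn't ordered. The inline sample table is made up for illustration.

```powerquery
let
    // Hypothetical sample data; replace with your own table
    Source = #table(type table [Date = date],
        {{#date(2016, 11, 6)}, {#date(2016, 11, 13)}, {#date(2016, 11, 20)}, {#date(2016, 11, 20)}}),
    // Distinct dates in order of first appearance, buffered so the list is evaluated once
    DistinctDates = List.Buffer(List.Distinct(Source[Date])),
    // List.PositionOf is 0-based, so add 1 to start numbering at 1
    Numbered = Table.AddColumn(Source, "Custom_weeknumber",
        each List.PositionOf(DistinctDates, [Date]) + 1, Int64.Type)
in
    Numbered
```

For the four sample rows this yields 1, 2, 3, 3. No grouping or join is needed, at the cost of one list lookup per row.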

Related

IF condition met take value from row above.....Power Query

I'm new to this forum and I need your help with Power Query.
What I would like to do:
I have a list of expected stock changes per article. A change can be an increase of the stock, when goods come in from an order I placed with my suppliers, or a decrease, when goods go out for a sales order. I would like a list that shows the article code, the actual stock, the amount of the increase or decrease, and the stock of the article after the change.
For example:
Row Column1 (A) Column2(B) Column3 (C) Column4 (D)
1 Article actual stock amount of change stock after change
2 A 5 -1 4
3 A 5 -2 2
4 A 5 -1 1
5 B 4 -1 3
The actual stock is always the same because the changes are only expected, i.e. they lie in the future.
In Excel this would be an easy calculation for column 4:
D2 = IF(A1=A2;D1+C2;B2+C2)
So I reference the value in the row above in column 4 if the article is still the same. But how do I do this in Power Query?
What I tried in Power Query: I added two index columns to the table:
Column1 (A) Column2(B) Column3 (C) Column4 (D) Index(colum5) Index.1(column6)
Article actual stock amount of change stock after change
A 5 -1 4 0 1
A 5 -2 2 1 2
A 5 -1 1 2 3
B 4 -1 3 3 4
I combined these two tables by the index and added the article again to this table.
I added a new conditional column called "Häufigkeit" that contains "Double" if the article in column 1 is equal to the article in the newly added column 7, and "single" otherwise. To get the value from the row above for my calculation I tried this code:
"= Table.AddColumn(#"addcolum", ""ConditionalColumn", each if [Häufigkeit]="Double" then ([colum4]{-1}+[colum3]) else [colum2+colum3])"
This doesn't work.
So how can I fix this problem? Thanks in advance.
See if this helps. It pulls the value from the row below if the article matches.
If you need the row above, change each instance of [Index]+1 to [Index]-1.
Otherwise, please make your question less confusing.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try if [article]= #"Added Index"{[Index]+1}[article] then #"Added Index"{[Index]+1}[value] else null otherwise null),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"})
in #"Removed Columns"
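For the running stock that the question describes, a grouped running total avoids row-by-row lookups altogether. The following is a minimal sketch, not the answerer's method. The column names Article, Stock and Change and the sample rows are assumptions modelled on the question's table. It also assumes that all rows of an article belong together (Table.Group groups globally by default) and that the starting stock is the same in every row of an article.

```powerquery
let
    // Hypothetical sample data mirroring the table in the question
    Source = #table(
        type table [Article = text, Stock = number, Change = number],
        {{"A", 5, -1}, {"A", 5, -2}, {"A", 5, -1}, {"B", 4, -1}}),
    // For one article's rows: starting stock plus the cumulative sum of changes
    AddRunning = (t as table) as table =>
        let
            changes = t[Change],
            start = t[Stock]{0},
            running = List.Transform(
                List.Positions(changes),
                each start + List.Sum(List.FirstN(changes, _ + 1)))
        in
            // Append the running values as a new column
            Table.FromColumns(
                Table.ToColumns(t) & {running},
                Table.ColumnNames(t) & {"StockAfterChange"}),
    // Apply the helper to each article's sub-table, then stack the results
    Grouped = Table.Group(Source, {"Article"}, {{"Rows", AddRunning, type table}}),
    Result = Table.Combine(Grouped[Rows])
in
    Result
```

With the sample rows, StockAfterChange comes out as 4, 2, 1 for article A and 3 for article B, matching column 4 of the question's example.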

Add Custom and Dynamic columns

I have two tables and am trying to figure out how to create custom code to add dynamic columns with custom names that are based off of row values in another table. I then need to use the values of rows in Table 2 to not only create the column names but also fill the new dynamic Columns with a value from another column in Table 2. Hopefully my pictures below help
Table 1 has a varying number of rows depending on what the user inputs.
Table 2 has a varying number of rows depending on how many values the user inputs.
Table 1 Before
Col1     Col2     Col 3
stuff 1  stuff 2  stuff 3
stuff 4  stuff 5  stuff 6
.        .        .
.        .        .
Table 2
Name   Values
Name1  100
Name2  500
.      .
NameX  Y
Table 1 After
Col1     Col2     Col 3    "Column" & Name1  "Column" & Name2  ...  "Column" & NameX
stuff 1  stuff 2  stuff 3  100               500               ...  Y
stuff 4  stuff 5  stuff 6  100               500               ...  Y
.        .        .        100               500               ...  Y
.        .        .        100               500               ...  Y
Here "Column" & Name1 means I want to concatenate the text "Column" with the values in the Name column of Table 2.
You can make this dynamic by not referring to the absolute column names, but rather using the Table.ColumnNames function to return those names.
I did assume that the column names in Table 2 are fixed. If not, that code can be changed.
Read the code comments and examine the Applied Steps window to better understand the methods used. There are examples of setting the data type, and also re-naming columns without referring to a hard-coded column name.
M Code
let
//read in the two tables and set the data types
Source1 = Excel.CurrentWorkbook(){[Name="Table_2"]}[Content],
Table2 =Table.TransformColumnTypes(Source1,
{{"Name", type text},{"Values", type any}}),
Source = Excel.CurrentWorkbook(){[Name="Table_1_Before"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,
List.Transform(Table.ColumnNames(Source), each {_, type text})),
//create the extra columns by
//Transpose Table2
// Use first row as headers
xpose = Table.Transpose(Table2),
#"Promoted Headers" = Table.PromoteHeaders(xpose, [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",
List.Transform(Table.ColumnNames(#"Promoted Headers"), each {_, type any})),
//rename the columns
renameNameCols = Table.RenameColumns(#"Changed Type1",
List.Zip(
{Table.ColumnNames(#"Changed Type1"),
List.Transform(Table.ColumnNames(#"Changed Type1"), each "Column " & _)})),
//Combine the tables
combine = Table.Combine({#"Changed Type",renameNameCols}),
//fill up the original table 2 columns and remove the blank Table 1 rows
#"Filled Up" = Table.FillUp(combine,Table.ColumnNames(renameNameCols)),
#"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([Col1] <> null))
in
#"Filtered Rows"
Original Tables
Results
Note that I did NOT add logic to avoid prepending the ... with the word "Column", as shown in your desired output, but that is easily added if really needed.
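My version
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
// Table2 is not defined in this query, so it has to exist as a separate query in the workbook
// (assumed here to hold Table 2 already transposed, with the names as column headers)
custom = Table.FromColumns(Table.ToColumns(Source) & Table.ToColumns(Table2), List.Combine({Table.ColumnNames(Source),Table.ColumnNames(Table2)}) ),
#"Filled Down" = Table.FillDown(custom,Table.ColumnNames(Table2))
in #"Filled Down"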
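Another way to get one constant column per row of Table 2, without transposing or filling, is to fold over the rows of Table 2 and add a column for each. This is a minimal sketch and not either answerer's method. The inline sample tables are made up for illustration, and it assumes Table 2 has the columns Name and Values as in the question.

```powerquery
let
    // Hypothetical sample data; replace with Excel.CurrentWorkbook() lookups
    Table1 = #table({"Col1", "Col2", "Col 3"},
        {{"stuff 1", "stuff 2", "stuff 3"}, {"stuff 4", "stuff 5", "stuff 6"}}),
    Table2 = #table({"Name", "Values"}, {{"Name1", 100}, {"Name2", 500}}),
    // Start from Table1 and add one column per record of Table2:
    // the column is named "Column" & Name and filled with that record's Values
    Result = List.Accumulate(
        Table.ToRecords(Table2),
        Table1,
        (state, r) => Table.AddColumn(state, "Column" & r[Name], each r[Values]))
in
    Result
```

With the sample data the result has the extra columns ColumnName1 (100 in every row) and ColumnName2 (500 in every row). Nothing is hard-coded except the two column names of Table 2.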

Sum multiple rows based on duplicate column data without formula

Based on the data in columns A to D (there could be hundreds of such columns), I want to sum all the rows for columns E to K (again, there could be hundreds of columns).
The rows should be summed where the data in columns A to D is duplicated; the required result is shown below.
This is easy to do with SUMIF, but I would like to know whether it is possible natively in Excel or Power Query, without creating a unique ID for each column or using SUMIF or a formula of any sort.
In Power Query: unpivot, group, pivot, done.
More detail:
Select the first 4 columns, right-click, Unpivot Other Columns
Select the first 4 columns and the new Attribute column, right-click, Group By
Use Operation: Sum on Column: Value, with new column name Count, and hit OK
Select the Attribute column, then Transform > Pivot Column, and choose Count as the values column
File > Close & Load
Full sample code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Code1", "Code2", "Code3", "Code4"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Code1", "Code2", "Code3", "Code4", "Attribute"}, {{"Count", each List.Sum([Value]), type number}}),
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[Attribute]), "Attribute", "Count", List.Sum)
in #"Pivoted Column"
To solve a problem like this, I first do a concrete example and then generalize it. I made a small table in Excel like so:
Code1 Code2 2-Jul-20 3-Jul-20 4-Jul-20 5-Jul-20 6-Jul-20
ERT EXC 10 6 15 2
ERT EXC 2 3 23 1
CON HOR 3
CON HOR 6 2 356 3
Then I clicked within the table and created a Power Query referencing it. After opening the Power Query Editor, there is a Group By function on the Home tab. It's pretty straightforward to choose the columns you want and the Sum function in a toy example like this.
Then, I opened the Advanced Editor to see what code was auto-generated. It looked something like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows orig" = Table.Group(Source, {"Code1", "Code2"}, {{"2-Jul-20", each List.Sum([#"2-Jul-20"]), type nullable number}, {"3-Jul-20", each List.Sum([#"3-Jul-20"]), type nullable number}, {"4-Jul-20", each List.Sum([#"4-Jul-20"]), type nullable number}, {"5-Jul-20", each List.Sum([#"5-Jul-20"]), type nullable number}, {"6-Jul-20", each List.Sum([#"6-Jul-20"]), type nullable number}})
in
#"Grouped Rows orig"
Typically, a Power Query expression is a series of transformations applied to a table, where each one operates on the table as returned from the previous. Here, we start with the original table as "Source" and then do the grouping. The parameters are a little messy, but what we have is: (1) the input table, (2) a list of the column names to group by, and (3) a list of 3-item lists, each of which describes an aggregated column. The sublists have the output column name, the function that does the aggregation, and the data type.
In Power Query, "each" is syntactic sugar for a single parameter function whose parameter is just an underscore. But also, when you have a record or row, you can just use [column] instead of _[column].
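The two shorthands can be seen side by side in a small standalone query. This is only an illustration; the names double1, double2, rec and f are made up.

```powerquery
let
    // "each" is shorthand for a one-parameter function whose parameter is named _
    double1 = each _ * 2,
    double2 = (_) => _ * 2,
    // Inside "each", [a] is shorthand for _[a]
    rec = [a = 1, b = 2],
    f = each [a] + [b],   // same as (_) => _[a] + _[b]
    Result = {double1(3), double2(3), f(rec)}
in
    Result
```

The query returns the list {6, 6, 3}: both spellings of the doubling function behave identically, and f reads the fields a and b from the record it is given.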
So how to generalize the operation you want to do? My first thought is that a convenient grouping function should have two parameters, based on your description. The first is the table to group, and the second is the number of columns starting from the left to group by. If you don't have them arranged contiguously, of course, you could do something else.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
sumFromColumn = (t as table, n as number) as table => let
cList = Table.ColumnNames(t),
toGroup = List.FirstN(cList, n),
toSum = List.RemoveFirstN(cList, n),
// Table.Column reads a column by name from the sub-table of each group
sumFunc = (cName) => {cName, each List.Sum(Table.Column(_, cName)), type nullable number}
in Table.Group(t, toGroup, List.Transform(toSum, each sumFunc(_))),
#"Grouped Rows" = sumFromColumn(Source, 2) // Group by the first 2 columns and sum the rest
in
#"Grouped Rows"
The sumFromColumn step above is the generalized function I made; it appears to match the original Table.Group operation that was generated by the interface.
The let statement arranges things for readability but does not imply a particular sequence that they happen in. Power Query figures out the dependencies and executes the statements in whatever order is needed.
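A tiny query, made up for illustration, shows that the written order of the steps does not matter:

```powerquery
let
    // Steps are deliberately listed "backwards"; M resolves them by dependency
    c = a + b,
    b = a * 2,
    a = 1
in
    c
```

This evaluates to 3: a is resolved first, then b, then c, regardless of where each line appears in the let.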
The list of column names of the table is defined as cList, and split into toGroup and toSum. Then, sumFunc is defined as a function taking a column name and returning the 3-item list needed to define an aggregation operation. In Power Query, functions can return other functions any which way. So here we are defining a function that returns a list, with a function in it. Then we can use List.Transform to take the list of aggregated columns and turn it into the appropriate parameters for Table.Group.
Finally, the actual group by is done with a call like sumFromColumn(Source, 2), which is equivalent to the original statement that hard-codes the column names.
Code1 Code2 2-Jul-20 3-Jul-20 4-Jul-20 5-Jul-20 6-Jul-20
ERT EXC 12 3 6 38 3
CON HOR 6 5 356 3
This can easily be changed to sumFromColumn(Source, 1), in which case the result is reduced to two rows, but the second column, being non-numeric, will then contain error values.
Or you can use sumFromColumn(Source, 3), which will not add anything up because the group-by columns taken together are distinct.
This way you can easily aggregate any number of columns without caring about their names. I recommend both the Power Query M documentation on microsoft.com and reading about functional programming in general.

Excel VBA Power Query - How to create a query that dynamically returns only the sale's rows of the last minute?

I have a comma separated csv file with the following structure:
Col Headers:
ProdDate, ProdTime, OLEDATETIME, ProdBuyPrice, ProdSellPrice, ProdBoughtQTY, ProdSoldQTY, etc
09/21/2019, 13:54:22, 43729.5801, 12.45, 12.61, 8, 9, etc.
This CSV file is updated many times per minute (5 to 70 times), meaning that it can contain 5 to 70 lines within the last minute of sales. So I can't use an arbitrary fixed number in "keep first lines" to return only the rows that arrived in the last minute, and I have never done this with Power Query before. I need a finished recipe for this, but my googling has turned up nothing so far.
Any suggestion?
This is an example of how you can identify a dynamic row number. In this example, we have a table that shows fruit sales by store. We want to create a query that returns the highest number of bananas sold.
This is what our data table looks like.
Step 1 - Add an index column starting from 1. This assigns row numbers.
Add Column > Index Column > From 1
Step 2 - Filter and Sort the data.
Remove any columns that are unnecessary.
Filter the Item column for Bananas.
Sort the Values column in descending order.
Right-click on the first value in the Index column and choose Drill-Down.
RESULT
Now you have a dynamic row #. You could also instead choose the value itself to return the sales instead of the index. To apply this to other scenarios, just keep filtering and sorting until you get to the result you need.
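In M, the steps above could look roughly like the following. This is a sketch under assumptions: the original data table is not shown here, so the sample rows and the Store column are made up, and only the Item and Values column names come from the steps.

```powerquery
let
    // Hypothetical fruit sales by store
    Source = #table(type table [Store = text, Item = text, Values = number],
        {{"North", "Apples", 10}, {"North", "Bananas", 7},
         {"South", "Bananas", 12}, {"East", "Bananas", 9}}),
    // Step 1: add an index column starting from 1
    Indexed = Table.AddIndexColumn(Source, "Index", 1, 1),
    // Step 2: filter for Bananas and sort the values in descending order
    Bananas = Table.SelectRows(Indexed, each [Item] = "Bananas"),
    Sorted = Table.Sort(Bananas, {{"Values", Order.Descending}}),
    // Drill down: take the Index of the first row after sorting
    TopIndex = Sorted{0}[Index]
in
    TopIndex
```

With these sample rows the query returns 3, the original row number of the highest banana sale. Replacing [Index] with [Values] in the last step returns the sales figure itself.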
This is how you filter for records that fall within the last minute before the latest time in the table.
let
Source = Excel.CurrentWorkbook(){[Name="t_DatesAndTimes"]}[Content],
ChangedTypes_ColData = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}}),
AddCol_DateAndTime = Table.AddColumn(ChangedTypes_ColData, "Date and Time", each [Date] & [Time], type datetime),
LatestTime_ofReport_MinusOneMinute = List.Max(AddCol_DateAndTime[Date and Time])-#duration(0,0,1,0),
FilterRows_KeepTimesInLastMinute = Table.SelectRows(AddCol_DateAndTime, each [Date and Time] >= LatestTime_ofReport_MinusOneMinute)
in
FilterRows_KeepTimesInLastMinute
Data Table needing to be filtered
Table filtered for time in the last minute of times listed in the report.

Excel PowerQuery (or dax is just as perfect) - add column with unique ID

I have a table (formatted as table) for the inputs.
I want to add a unique ID column to my table.
Constraints:
it should not use any other columns
it should be constant and stable, meaning that inserting a new row won't change any row IDs, but only adds a new one.
Anything calculated from another column's value is not useful because there will be typos. So changing the ID will mean data loss in other tables connected to this one.
Simply adding an index in the query editor is not useful because rows will be inserted in the middle, and the IDs are recalculated when that happens.
I am also open to any VBA solution. I tried to write a custom function that would add a new ID into a "rowID" column in the same row if there is no ID yet, but I failed at referencing the cells from a function called from a Table.
My suggestion would be to use a self referencing query.
Query "Data" below imports Excel table "Data" and also outputs to Excel table "Data".
In order to create such a query, first create a query "Data" that imports some Excel table (let's say Table1), run the query so table "Data" is created. Now you can adjust the query source from Table1 to Data and maintain this table in Excel (leaving blank IDs for new rows) and run the query to generate new IDs.
Otherwise the query should be pretty straightforward; if not: let me know where you need additional explanation.
let
Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Col1", Int64.Type}, {"Col2", type text}, {"ID", Int64.Type}}),
MaxID = if List.NonNullCount(Typed[ID]) = 0 then 0 else List.Max(Typed[ID]), // start from 0 when no IDs exist yet
OriginalSort = Table.AddIndexColumn(Typed, "OriginalSort",1,1),
OldRecords = Table.SelectRows(OriginalSort, each ([ID] <> null)),
NewRecords = Table.SelectRows(OriginalSort, each ([ID] = null)),
RemovedNullIDs = Table.RemoveColumns(NewRecords,{"ID"}),
NewIDs = Table.AddIndexColumn(RemovedNullIDs, "ID", MaxID + 1, 1),
NewTable = OldRecords & NewIDs,
OriginalSortRestored = Table.Sort(NewTable,{{"OriginalSort", Order.Ascending}}),
RemovedOriginalSort = Table.RemoveColumns(OriginalSortRestored,{"OriginalSort"})
in
RemovedOriginalSort
