Related
I have the following data with duplicates which I wish to identify. I do not wish to remove these so unique value only won't work. I want to be able to identify them but just saying null.
I have attempted to self-reference the code but end up just duplicating the original result.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Removed Duplicates" = Table.Distinct(#"Changed Type"),
#"Merged Queries" = Table.NestedJoin(Source, {"Column1"}, #"Removed Duplicates", {"Column1"}, "Removed Duplicates", JoinKind.LeftOuter)
in
#"Merged Queries"
You can use List.Generate to generate a list with your requirements. And then you can either replace the first column or add the list as a second column.
This needs to be done in the Advanced Editor.
Please note there is a difference between the text string "null" and the power query null value. Based on your example screenshot, I assumed you wanted the "null" text string. If you prefer the null value, remove the surrounding quotes in the code
M Code
let
//Change next line to reflect your actual data source
Source = Excel.CurrentWorkbook(){[Name="Table13"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//change 2nd and later duplicates to null
dupsNull = List.Generate(
()=>[v=#"Changed Type"[Column1]{0}, idx=0],
each [idx]<Table.RowCount(#"Changed Type"),
each [v=if List.PositionOf(#"Changed Type"[Column1],#"Changed Type"[Column1]{[idx]+1},Occurrence.First) = [idx]+1
then #"Changed Type"[Column1]{[idx]+1} else "null", idx=[idx]+1],
each [v]),
//either add as a column or replace the first column
#"add de-duped" = Table.FromColumns(
Table.ToColumns(#"Changed Type") & {dupsNull},
type table[Column1=text, Column2=text])
in
#"add de-duped"
Here's another way. First, add an index column. Then add another column using List.PositionOf to get the row of the first occurrence of each value in the column. Then add one last column to compare the index and List.PositionOf, to determine which row entries should be null.
Let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each List.PositionOf(#"Added Index"[Column1],[Column1])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if [Index] = [Custom] then [Column1] else null)
in
#"Added Custom1"
Here a solution that doesn't require to add a new column. It returns the same column just with duplicated values replaced with "null":
let
Source = Excel.CurrentWorkbook(){[Name="TB_INPUT"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
removeDups = (lst as list) =>
List.Accumulate(lst, {}, (x, y) => x & {if List.Contains(x, y) then "null" else y}),
replacedValues = removeDups(Table.Column(#"Changed Type", "Column1")),
#"replaced Values" = Table.FromList(replacedValues, null, type table[Column1 = Text.Type ])
in
#"replaced Values"
it uses a List.Accumulate function to simplify the process to generate the corresponding list with the specified requirements.
The output will be the following in Power Query:
and in Excel:
If you want an empty cell instead of "null" token, then in the function removeDups replace "null" with null.
If you want to consider a more general case, lets say you have more than one column in the input Excel Table and you want to replace duplicated values in more than one column at the same time.
Let's say we have the following input in Excel:
The following code can be used to replace duplicates in Column1 and Column2:
let
Source = Excel.CurrentWorkbook(){[Name="TB_GralCase"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}}),
listOfColumns = {"Column1", "Column2"},
remainingColumns = List.Difference(Table.ColumnNames(#"Changed Type"), listOfColumns),
removeDups = (lst as list) =>
List.Accumulate(lst, {}, (x, y) => x & {if List.Contains(x, y) then "null" else y}),
replacedValues = List.Transform(listOfColumns, each removeDups(Table.Column( #"Changed Type", _))),
#"replaced values" = Table.FromColumns(
replacedValues & Table.ToColumns(Table.SelectColumns( #"Changed Type", remainingColumns)),
listOfColumns & remainingColumns
)
in
#"replaced values"
In listOfColumns variable, you define the list of columns you want to replace duplicates.
The the output in Power Query will be:
What I am trying to achieve is to obtain "matches/pairs" from two tables. One (source 1)is data table with Date/Time and Pressure value columns and the other (source 2) is like Date/Time and Info value Columns. Second table has so called "pairs" , start and stop in certain time. I want to get exact matches when is found in source 1 or approximate match when is not exact as in source 1 (seconds can be a problem).
Lets say you are matching/lookup two tables, give me everything that falls between for instance 15.01.2022 06:00:00 and 15.01.2022 09:15:29.
Where I have a problem is more likely exact match and seconds. It is skipping or cant find any pair if the seconds are not matching. So my question is how to make if not seconds then lookup for next availablee match, can be a minute too as long as they are in the given range (start stop instances).
That is a reason I am getting this Expression error. Or is there a way to skip that error and proceed with Query??
Link to download the data:
https://docs.google.com/spreadsheets/d/1Jv5j7htAaEFktN0ntwOZCV9jesF43tEP/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
On the code below is what I am trying to do:
let
//Be sure to change the table names in the Source= and Source2= lines to be the actual table names from your workbook
Source = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = Excel.CurrentWorkbook(){[Name="Log_Original"]}[Content],
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "NextLineStart", each if Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) = true
and Text.Contains(#"Added Index"[Info]{[Index]+1},"start",Comparer.OrdinalIgnoreCase) = true
then "delete"
else null),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([NextLineStart] = null)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Index", "NextLineStart"}),
//create a list of all the relevant start/stop times
filterTimes = List.Combine(
List.Generate(
()=> [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{0},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{1}-#"Removed Columns1"[#"Date/Time"]{0})+1,
#duration(0,0,0,1)), IDX = 0],
each [IDX] < Table.RowCount(#"Removed Columns1"),
each [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{[IDX]+2},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{[IDX]+3}-#"Removed Columns1"[#"Date/Time"]{[IDX]+2})+1,
#duration(0,0,0,1)), IDX = [IDX]+2],
each [times]
)
),
//filter the table using the list
filterTimesCol = Table.FromList(filterTimes,Splitter.SplitByNothing()),
filteredTable = Table.Join(#"Changed Type","Date/Time",filterTimesCol,"Column1",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(filteredTable,{"Column1"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each DateTime.ToText([#"Date/Time"],"dd-MMM-yy")),
#"Filtered Rows2" = Table.SelectRows(#"Added Custom1", each [#"Date/Time"] > #datetime(2019, 01, 01, 0, 0, 0)),
#"Sorted Rows" = Table.Sort(#"Filtered Rows2",{{"Date/Time", Order.Ascending}})
in
#"Sorted Rows"
I set up the below to return a sorted table with all results between the start and ending date/times. You can then select the first or middle or bottom row of each table if you want from this point. Its hard to tell from your question if you are looking for the value closest to the start value, closest to the end value or something inbetween. You can wrap my Table.Sort with a Table.FirstN or Table.LastN to pick up the first or last row.
I left most of your starting code alone
let Source = Table.Buffer(T1),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = T2,
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
// shift Info up one row for comparison
shiftedList = List.RemoveFirstN( #"Added Index"[Info],1),
custom1 = Table.ToColumns( #"Added Index") & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames( #"Added Index") & {"NextInfo"}),
#"Filtered Rows2" = Table.SelectRows(custom2, each not (Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) and Text.Contains([NextInfo],"start", Comparer.OrdinalIgnoreCase))),
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Type", each if Text.Contains(Text.Lower([Info]),"start") then "start" else if Text.Contains(Text.Lower([Info]),"finished") then "finished" else null),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom3",{"Info", "NextInfo"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns2", "Custom", each if [Type]="start" then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Type]), "Type", "Date/Time"),
#"Added Custom2" = Table.AddColumn(#"Pivoted Column","Table",(i)=>Table.Sort(Table.SelectRows(T1, each [#"Date/Time"]>=i[start] and [#"Date/Time"]<=i[finished]),{{"Date/Time", Order.Ascending}}) , type table )
in #"Added Custom2"
I need a help from you with correction/suggestion of query I am using to get a data from folder in CSV format. Warning upfront: I don't know, how to write this shortly.
Few informations first:
Tools are limited for Power Query, Excel, VBA
Data query will run once in a month, so bigger loading time is not a BIG issue, although lower time is ofc preferable
I have chosen Power Query approach, because the source data have to be used in another Excel file, but with different set of rules (and this is part of my current issue).
Basic issue with my code is that it runs for really long time, there are big amount of conditions that need to be met and I have to use similar approach for another reason/tool/file. And I want the people to just press Refresh to get the information needed.
Description:
I have source of data in CSV files in a folder. Naming convention doesn't exist, because multiple people do the export of the data from system. Because of that I've used folder option in PQ.
The size of the data is currently around 400-600 MB. Name of the columns might be changing, for which are the first line in M-code to get around.
My main struggle is:
There are several conditions, that need to be implemented. I didn't want to write multiple if statements, because the code would get really ugly, and the number of conditions is in tenths and across multiple columns. For that reason I've implemented (let's call it TT) translation table where I have all columns where filtering could be used and last column of that TT is concatenation of all columns. If in the condition I don't care about one of the columns, I fill it with wildcard "*".
So the TT might be looking like:
| PC | CLIENT | FN | TC | STRING |
|----|--------|-----|----|-------------|
| 11 | * | NEW | AC | 11*NEWAC |
| 47 | 000001 | NEW | * | 47000001NEW*|
etc...
PC is PoC, FN is FUNCTION, TC is Transaction code (in code below).
Then in the code I am replacing the wildcard with appropriate column's value from PQ and check, if the concatenated string from same columns in PQ is contained in TT (last column is made into a list).
Code below works for the easier solution, but it's pretty hardcoded, because I've wanted to know if it's even possible.
After data update I run VBA macro to append the data into "database" table (ofc check for existing values is there) so the data load can be minimized. For that reason the first part code is used.
Basically the code I could split into three parts:
Basic transformation: Loading from folder, getting rid of unconventional names and checking with other folder if it contains the same named files to minimize load.
Filtering data: Consists of merging the PQ table with TT table, replacing the wildcards with correct column and then creating filtering string to check if the text in concatenated PQ table contains at least one value from the TT list.
Final transformation of used data to get the information I need (It's mainly about late settlements from market)
Whole M-Code with comments
let
/*Here starts basic data transformation to limit errors in CSV files due to
different conventions */
Source = Folder.Files(source),
#"Uppercased Text1" = Table.TransformColumns(Source,{{"Name", Text.Upper, type text}}),
#"Merged Queries2" = Table.NestedJoin(#"Uppercased Text1", {"Name"}, q_Archive, {"Name"}, "q_Archive", JoinKind.LeftAnti),
#"Added Custom" = Table.AddColumn(#"Merged Queries2", "Data", each Csv.Document(File.Contents([Folder Path] & "\" & [Name]),[Delimiter=";", Encoding = 1252, QuoteStyle = QuoteStyle.None])),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Data"}),
#"Added Custom1" = Table.AddColumn(#"Removed Other Columns", "Table", each Table.PromoteHeaders([Data])),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom1",{"Table"}),
#"Added Custom2" = Table.AddColumn(#"Removed Other Columns1", "Upper", each Table.TransformColumnNames([Table],Text.Upper)),
#"Removed Other Columns2" = Table.SelectColumns(#"Added Custom2",{"Upper"}),
#"Expanded Upper" = Table.ExpandTableColumn(#"Removed Other Columns2", "Upper", {"19A AMOUNT", "19A CURRENCY CODE", "35B ISIN", "CLIENT", "EXP.SETTL.DATE", "FUNCTION", "INSTR.ID", "MESSAGE FUNCTION", "POC", "RECEPTION DATE", "SETTL.AMOUNT", "SETTL.CUR.", "TRANSACTION CODE"}, {"19A AMOUNT", "19A CURRENCY CODE", "35B ISIN", "CLIENT", "EXP.SETTL.DATE", "FUNCTION", "INSTR.ID", "MESSAGE FUNCTION", "POC", "RECEPTION DATE", "SETTL.AMOUNT", "SETTL.CUR.", "TRANSACTION CODE"}),
#"Renamed Columns1" = Table.RenameColumns(#"Expanded Upper",{{"SETTL.AMOUNT", "SETTL.AMOUNT2"}, {"SETTL.CUR.", "SETTL.CUR.2"}, {"19A CURRENCY CODE", "19A CURRENCY CODE2"}, {"19A AMOUNT", "19A AMOUNT2"}}),
#"Added Custom10" = Table.AddColumn(#"Renamed Columns1", "19A AMOUNT", each if[SETTL.AMOUNT2]=null then [19A AMOUNT2] else [SETTL.AMOUNT2]),
#"Added Custom11" = Table.AddColumn(#"Added Custom10", "19A CURRENCY CODE", each if [SETTL.CUR.2] = null then [19A CURRENCY CODE2] else [SETTL.CUR.2]),
#"Renamed Columns" = Table.RenameColumns(#"Added Custom11",{{"FUNCTION", "FUNCTION2"}}),
#"Added Custom8" = Table.AddColumn(#"Renamed Columns", "FUNCTION", each if[FUNCTION2]=null then [MESSAGE FUNCTION] else[FUNCTION2]),
#"Removed Other Columns3" = Table.SelectColumns(#"Added Custom8",{"35B ISIN", "CLIENT", "EXP.SETTL.DATE", "INSTR.ID", "POC", "RECEPTION DATE", "TRANSACTION CODE", "19A AMOUNT", "19A CURRENCY CODE", "FUNCTION"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns3",{"POC", "CLIENT", "FUNCTION", "TRANSACTION CODE", "EXP.SETTL.DATE", "RECEPTION DATE", "19A AMOUNT", "19A CURRENCY CODE"}),
#"Replaced Value" = Table.ReplaceValue(#"Reordered Columns","""","",Replacer.ReplaceText,{"POC", "CLIENT", "INSTR.ID", "35B ISIN"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","=","",Replacer.ReplaceText,{"POC", "CLIENT", "INSTR.ID", "35B ISIN"}),
#"Uppercased Text" = Table.TransformColumns(#"Replaced Value1",{{"POC", Text.Upper, type text}, {"CLIENT", Text.Upper, type text}, {"FUNCTION", Text.Upper, type text}, {"TRANSACTION CODE", Text.Upper, type text}}),
#"Filtered Rows" = Table.SelectRows(#"Uppercased Text", each ([FUNCTION] = "NEWM")),
#"Merged Queries" = Table.NestedJoin(#"Filtered Rows", {"POC"}, tbl_setup_pocList, {"PocList"}, "tbl_setup_pocList", JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"tbl_setup_pocList"}),
/* Here ends the data transformation part
and the part for list transformations start*/
#"Added condition" = Table.AddColumn(#"Removed Columns","COND", each (
((Table.FromRecords({
[PC = List.ReplaceValue(Table.Column(tbl_filtering_string, "POC"),"*",[POC], Replacer.ReplaceText),
CL = List.ReplaceValue(Table.Column(tbl_filtering_string, "CLIENT"),"*",[CLIENT], Replacer.ReplaceText),
FN = List.ReplaceValue(Table.Column(tbl_filtering_string, "FUNCTION"),"*",[FUNCTION], Replacer.ReplaceText),
TC = List.ReplaceValue(Table.Column(tbl_filtering_string, "TRANSACTION CODE"),"*",[TRANSACTION CODE], Replacer.ReplaceText)]}
))))),
#"Expanded COND" = Table.ExpandTableColumn(#"Added condition", "COND", {"PC", "CL", "FN", "TC"}, {"PC", "CL", "FN", "TC"}),
#"Added Custom3" = Table.AddColumn(#"Expanded COND", "Test", each (List.Combine(
{
{_[PC]},{_[CL]},{_[FN]},{_[TC]}
}
))),
#"Expanded Test" = Table.AddColumn(#"Added Custom3", "Test2", each (Table.FromColumns(_[Test],null))),
#"Removed Columns2" = Table.RemoveColumns(#"Expanded Test",{"PC", "CL", "FN", "TC", "Test"}),
#"Added Custom4" = Table.AddColumn(#"Removed Columns2", "String", each Table.ToList([Test2],Combiner.CombineTextByDelimiter(""))),
#"Removed Columns3" = Table.RemoveColumns(#"Added Custom4",{"Test2"}),
#"Added Custom6" = Table.AddColumn(#"Removed Columns3", "CONTAIN_STR", each [POC]&[CLIENT]&[FUNCTION]&[TRANSACTION CODE]),
#"Added Custom5" = Table.AddColumn(#"Added Custom6", "Cond", each List.Contains(_[String],[CONTAIN_STR])),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom5", each ([Cond] = false)),
/*Here the code for filtering ends and final transformations occur */
#"Removed Columns4" = Table.RemoveColumns(#"Filtered Rows1",{"String", "CONTAIN_STR", "Cond"}),
#"Merged Queries1" = Table.NestedJoin(#"Removed Columns4", {"POC"}, tbl_setup_exotics, {"Exotic_PoC"}, "tbl_setup_exotics", JoinKind.LeftOuter),
#"Expanded tbl_setup_exotics" = Table.ExpandTableColumn(#"Merged Queries1", "tbl_setup_exotics", {"Exotic_PoC"}, {"Exotic_PoC"}),
#"Replaced Value2" = Table.ReplaceValue(#"Expanded tbl_setup_exotics",null, "Non Exotic",Replacer.ReplaceValue,{"Exotic_PoC"}),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value2", {"EXP.SETTL.DATE", "RECEPTION DATE"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Errors",{{"EXP.SETTL.DATE", type date}, {"RECEPTION DATE", type date}}),
#"Added Custom7" = Table.AddColumn(#"Changed Type", "RD", each (if [Exotic_PoC] <> "Non Exotic" then Date.AddDays([RECEPTION DATE],1)else [RECEPTION DATE])),
#"Filtered Rows2" = Table.AddColumn(#"Added Custom7", "LB" , each if [RD]>=[EXP.SETTL.DATE] then "Late" else "Not"),
#"Added Custom9" = Table.AddColumn(#"Filtered Rows2", "DAYS_LATE", each [RD]-[EXP.SETTL.DATE]),
#"Inserted Year" = Table.AddColumn(#"Added Custom9", "Year", each Date.Year([EXP.SETTL.DATE]), Int64.Type),
#"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([EXP.SETTL.DATE]), Int64.Type),
#"Changed Type1" = Table.TransformColumnTypes(#"Inserted Month",{{"19A AMOUNT", type number}}),
#"Grouped Rows" = Table.Group(#"Changed Type1", {"Year", "Month", "POC", "19A CURRENCY CODE", "DAYS_LATE", "LB"}, {{"Count", each Table.RowCount(_), type number}, {"Countervalue", each List.Sum([19A AMOUNT]), type text}, {"ISIN", each Text.Combine([35B ISIN],";"), type text}, {"INSTR.ID", each Text.Combine([INSTR.ID], ";"), type text}}),
#"Merged Queries3" = Table.NestedJoin(#"Grouped Rows", {"Year", "Month", "19A CURRENCY CODE"}, q_Xrates, {"Year", "Month", "Currency"}, "q_Xrates", JoinKind.LeftOuter),
#"Expanded q_Xrates" = Table.ExpandTableColumn(#"Merged Queries3", "q_Xrates", {"Rate"}, {"Rate"}),
#"Replaced Value3" = Table.ReplaceValue(#"Expanded q_Xrates",null,1,Replacer.ReplaceValue,{"Rate"}),
#"Added Col" = Table.AddColumn(#"Replaced Value3", "CV", each [Countervalue]/[Rate]),
#"Remove Countervalue" = Table.RemoveColumns(#"Added Col", {"Countervalue"})
in
#"Remove Countervalue"
Questions
I know this approach sounds over-complicated, but it makes it work (unfortunately it takes a long time to refresh). But is it really good? Aren't there other options, considering limited tool usage mentioned in the beginning?
How can I make this code better? I believe it could be partially re-made into function, but since I am quite a beginner in PQ, I cannot imagine how.
How can I use same approach, for same source data, but with bigger complexity? You can understand that as more columns to add to the filtering string.
Do you have other suggestions?
End comments
I am now pretty desperate and my written text might be confusing sometimes.
I don't have any issue providing some kind of Visio chart to show my logic in more graphical way (I am more familiar with that) and also with relationship overview.
I also don't have issue provide anonymized data (since it might be partially confidential). If you'd need that one, please refer to preferred service.
I don't mind working on my code, if I am pushed in correct direction. For that Q. #1 is priority. So basically is this good approach and can it be easily adjustable for another same, but more complicated purpose?
I really appreciate your time.
*/ MK */
If I were to do this, I would write a function that compiles the filter condition table into a function, then apply it with Table.SelectRows.
// Compile the condition table into a function that can be applied in row filtering.
filterCondition = compileFilterConditionTable(tbl_filtering_string),
#"Filtered Rows" = Table.SelectRows(#"Table after Preceding Steps", filterCondition)
Isn't this looking much easier to trace the steps?
Below is an example code of a function that compiles condition table into a logical function. I'm not sure this works correctly for your case, because I'm not completely understanding the requirement.
compileFilterConditionTable =
let compileFilterConditionTable = (filterConditionTable as table) as function =>
let recordConditions = List.Transform(
Table.ToRecords(filterConditionTable),
compileFilterConditionRecord)
in applyCombine(recordConditions, List.AnyTrue),
compileFilterConditionRecord = (cond as record) as function =>
let fieldNameValues = List.Transform(
Record.FieldNames(cond),
each [Name = _, Value = Record.Field(cond, Name)]
),
fieldConditions = List.Transform(fieldNameValues, compileFieldCondition)
in applyCombine(fieldConditions, List.AllTrue),
compileFieldCondition = (fieldNameValue as record) as function =>
let name = fieldNameValue[Name],
value = fieldNameValue[Value]
in
if value = "*" then (record as record) as logical => true
else (record as record) as logical => Record.Field(record, name) = value,
applyCombine = (functions as list, combiner as function) as function =>
(value) => combiner(List.Transform(functions, (f) => f(value)))
in compileFilterConditionTable
Anyway, M is a functional programming language, so it would help to think and code it in functional way. Break down the entire logic into small parts, so that each small parts will be easy enough to understand. Write your code as reusable small functions, and combine them to build the whole.
I have data in an excel table, generated by some software we use on site, the report is timestamped like this
6/05/2018 6:23:00 AM
As we are a 24 hour operation, I need PowerQuery to be able to recognise that the timestamps from 12:00:00 AM to 7:00:00 AM belong to the night shift of the previous day.
The problem I am facing is that although PowerQuery can handle Date/Time as a data type, it is seems to be truncating the time when importing data, so that the result for the above example within my query (and obviously in my output) is
6/05/2018 12:00:00 AM
Most of the stuff I can find on the net is about how to strip the time away - I want to keep it!!!
The purpose of this is so I can display records in chronological order for a night shift of production in a pivot table. At the moment I am having to add another column with the time alone, which causes data from midnight to 7:00am to preceed data from 7:00pm to midnight - when in fact it occurred after.
Cheers,
Mat
Edit: adding pictures of my problem, as I type I cannot see the images in the thread so I hope they are in the correct spots!
Example of my source data, timestamp is in the "Time" column.The other date columsn on the right are me getting something working so at least I have the shift date and the actual time together.
Source data
The following is my query, there is a lot in here that appends the source location to a predefined set of attributes, basically what I am struggling with is that the "Time" field gets imported, but loses all the data after the decimal point so I just get a date. I want to keep that time appended to that date, as well as have another field which is the shift date as described above.
let
// Removes unwanted characters.
CharsToRemove = List.Transform({33..45,47,58..126}, each Character.FromNumber(_)),
// The query is all based on the current month's portion of the current 13wk.
Source = Location_Data,
// Set some data fields, not all are changed here as it affects later calculations.
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Group Name", type text}, {"Name", type text}, {"Waste tonnes", type number}, {"Total Ore Tonnes", Int64.Type}, {"Dil cu_pct", type number}, {"Dil au", type number}, {"Dil ag", type number}, {"Dil fe_pct", type number}, {"Dil zn_pct", type number}, {"Density", type number}, {"Material", type text}, {"Type", type text}, {"Active from", Int64.Type}, {"Active to", Int64.Type}, {"Comments", type text}}),
// This steps add the MTD trucking data, and correlates it with our claim grades and density based on two fields, "Name" and "Material".
#"Merged Queries" = Table.NestedJoin(Source,{"Name", "Material"},LoadTrak_Data,{"Load Origin", "Material"},"NewColumn",JoinKind.LeftOuter),
// This step expands the trucking data so we can work with the each column individually, such as truck ID, Date/time, Load Volume, etc.
#"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Record", "Time", "Dir.", "Operator", "Truck ID", "Load (m3)", "Truck Operator", "Crew", "Shift", "Material", "Load Origin", "Dumped At", "Day", "Shift Time", "Calc Shift"}, {"Record", "Time", "Dir.", "Operator", "Truck ID", "Load (m3)", "Truck Operator", "Crew", "Shift", "Material.1", "Load Origin", "Dumped At", "Day", "Shift Time", "Calc Shift"}),
#"Changed Type3" = Table.TransformColumnTypes(#"Expanded NewColumn",{{"Time", type number}}),
#"Sorted Record low to high" = Table.Sort(#"Changed Type3",{{"Record", Order.Ascending}}),
#"Added Error Volume Column" = Table.AddColumn(#"Sorted Record low to high", "Error Volume", each if [#"Load (m3)"] = null then "28.2" else null ),
#"Changed Error Volume to decimal number" = Table.TransformColumnTypes(#"Added Error Volume Column",{{"Error Volume", type number}}),
#"Added Custom1" = Table.AddColumn(#"Changed Error Volume to decimal number", "DMT", each if [#"Dir."] = null then null else if [#"Load (m3)"] = null then ([Error Volume]*[Density] * 0.7) else [#"Load (m3)"] * [Density] * 0.7),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Level Loaded", each if [Load Origin] = "Empty" then 0 else (Text.Start([Load Origin],4))),
#"Creates shift date" = Table.AddColumn(#"Added Custom2", "Shift Date", each if [Shift Time] is null then null else if [Shift Time] < 0.2916 then [Time] -1 else [Time]),
#"Added Custom" = Table.AddColumn(#"Creates shift date", "Correct location", each if [#"Dir."] = null then null else if ([Shift Date]) < ([Active from]) or ([Shift Date]) > (([Active to])+0.999999) then "No" else "Yes"),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Level Loaded", type number}}),
#"Level Dumped" = Table.AddColumn(#"Changed Type1", "Level Dumped", each if [Dumped At] = "ROM" then 5270 else if [Dumped At] = "PAF" then 5170 else if [Dumped At] = "Paste" then 5270 else if [Dumped At] = "Waste" then 5270 else Text.Start([Dumped At], 4)),
#"Change Level Dumped to decimal number" = Table.TransformColumnTypes(#"Level Dumped",{{"Level Dumped", type number}}),
#"Added TKMs column" = Table.AddColumn(#"Change Level Dumped to decimal number", "TKMs", each if [Dumped At] = "Waste" then (((([Level Dumped] - [Level Loaded])*7)+300)/1000 * [DMT]) else if [Dumped At] = "ROM" then (((([Level Dumped] - [Level Loaded])*7)+150)+300)/1000 * [DMT] else if [Dumped At] = "PAF" then (((([Level Dumped] - [Level Loaded])*7)+150)+300)/1000 * [DMT] else if [Dumped At] = "Paste" then (((([Level Dumped] - [Level Loaded])*7)+150)+300)/1000 * [DMT] else (([Level Dumped] - [Level Loaded])*7)/1000 * [DMT]),
#"Changed TKMs to decimal number" = Table.TransformColumnTypes(#"Added TKMs column",{{"TKMs", type number}}),
#"Filtered Correct Location to remove incorrect duplicates" = Table.SelectRows(#"Changed TKMs to decimal number", each [Correct location] <> "No"),
#"Added load count helper column" = Table.AddColumn(#"Filtered Correct Location to remove incorrect duplicates", "Load", each if [Correct location] = "Yes" then 1 else ""),
#"Filtered Rows1" = Table.SelectRows(#"Added load count helper column", each true),
#"Filtered non-null Group Name rows" = Table.SelectRows(#"Filtered Rows1", each [Group Name] <> null and [Group Name] <> ""),
#"Converts date fields to type date" = Table.TransformColumnTypes(#"Filtered non-null Group Name rows",{{"Active from", type date}, {"Active to", type date}, {"Time", type date}, {"Shift Date", type date}}),
#"Merged with Sched_13wk" = Table.NestedJoin(#"Converts date fields to type date",{"Name", "Material"},Sched_13wk,{"Name", "Material"},"Sched_13wk",JoinKind.LeftOuter),
#"Expanded Sched_13wk" = Table.ExpandTableColumn(#"Merged with Sched_13wk", "Sched_13wk", {"Dil cu_pct", "Material", "Scheduled Tonnes"}, {"Sched_13wk.Dil cu_pct", "Sched_13wk.Material", "Sched_13wk.Scheduled Tonnes"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Expanded Sched_13wk",{{"Shift Time", type time}, {"Day", type date}, {"Time", type date}})
in
#"Changed Type2"
Urgghhh that looks horrible, not sure how better to include it though. Lol the notes in there are to remind me what I am doing when I have to edit it. Still not finished, wanting to add a load more so that if I were to change position, the nest person sitting in my chair will have an idea of what is happening in the query!
And this is how the data is in the query output, see that all time values are gone.
Output with no time values.
Okay I really hope that helps. As you have guessed I am not a programmer by trade!
Report data which is copied into my excel workbook, with additional info typed in and then queried
Report generated by software which is copied and pasted into workbook containing query.
The problem may be resolved by using Power Query to import from the original text file. You should be able to resolve the timestamp issues, as well as more easily format the results.
Copy/Paste is frequently NOT the best way to handle this sort of problem.
My problem:
Through New Query -> From Other Sources -> From Web, I entered a static URL that allowed me to load approximately 60k "IDs" from a webpage in JSON format.
I believe each of these IDs corresponds to an item.
So they're all loaded and organised in a column, with one ID per line, inside a Query tab.
For the moment, no problem.
Now I need to import information from a dynamic URL that depends on the ID.
So I need to import from URL in this form:
http://www.example.com/xxx/xxxx/ID
This imports the following for each ID:
name of correspond item,
average price,
supply,
demand,
etc.
After research I came to the conclusion that I had to use the "Advanced Editor" inside the query editor to reference the ID query tab.
However I have no idea how to put together the static part with the ID, and how to repeat that over the 60k lines.
I tried this:
let
Source = Json.Document(Web.Contents("https://example.com/xx/xxxx/" & ID)),
name1 = Source[name]
in
name1
This returns an error.
I think it's because I can't add a string and a column.
Question: How do I reference the value of the cell I'm interested in and add it to my string ?
Question: Is what I'm doing viable?
Question: How is Excel going to handle loading 60k queries?
Each query is only a few words to import.
Question: Is it possible to load information from 60k different URLs with one query?
EDIT : thank you very much for answer Alexis, was very helpful. So to avoid copying what you posted I did it without the function (tell me what you think of it) :
let
Source = Json.Document(Web.Contents("https://example.com/all-ID.json")),
items1 = Source[items],
#"Converted to Table" = Table.FromList(items1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "ID"}}),
#"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "URL", each Text.Combine({"http://example.com/api/item/", Text.From([ID], "fr-FR")}), type text),
#"Added Custom" = Table.AddColumn(#"Inserted Merged Column", "Item", each Json.Document(Web.Contents([URL]))),
#"Expanded Item" = Table.ExpandRecordColumn(#"Added Custom", "Item", {"name"}, {"Item.name"})
in
#"Expanded Item"
Now the problem I have is that it takes ages to load up all the information I need from all the URLs.
As it turns out it's possible to extract from multiple IDs at once using this format : http://example.com/api/item/ID1,ID2,ID3,ID4,...,IDN
I presume that trying to load from an URL containing all of the IDs at once would not work out because the URL would contain way too many characters to handle.
So to speed things up, what I'm trying to do now is concatenate every Nth row into one cell, for example with N=3 :
205
651
320165
63156
4645
31
6351
561
561
31
35
would become :
205, 651, 320165
63156, 4645, 31
6351, 561, 561
31, 35
The "Group by" functionnality doesn't seem to be what I'm looking for, and I'm not sure how to automatise that throught Power Query
EDIT 2
So after a lot of testing I found a solution, even though it might not be the most elegant and optimal :
I created an index with a 1 step
I created another costum column, I associated every N rows with an N increasing number
I used "Group By" -> "All Rows" to create a "Count" column
Created a costum column "[Count][ID]
Finally I excracted values from that column and put a "," separator
Here's the code for N = 10 000 :
let
Source = Json.Document(Web.Contents("https://example.com/items.json")),
items1 = Source[items],
#"Converted to Table" = Table.FromList(items1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "ID"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"ID", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Added Conditional Column" = Table.AddColumn(#"Added Index", "Custom", each if Number.RoundDown([Index]/10000) = [Index]/10000 then [Index] else Number.IntegerDivide([Index],10000)*10000),
#"Reordered Columns" = Table.ReorderColumns(#"Added Conditional Column",{"Index", "ID", "Custom"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns", {"Custom"}, {{"Count", each _, type table}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom.1", each [Count][ID]),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom.1", each Text.Combine(List.Transform(_, Text.From), ","), type text})
in
#"Extracted Values"
I think what you want to do here is create a custom function that you invoke with each of your ID values.
Let me give a similar example that should point you in the right direction.
Let's say I have a table named ListIDs which looks like this:
ID
----
1
2
3
4
5
6
7
8
9
10
and for each ID I want to pull some information from Wikipedia (e.g. for ID = 6 I want to lookup https://en.wikipedia.org/wiki/6 and return the Cardinal, Ordinal, Factorization, and Divisors of 6).
To get this for just one ID value my query would look like this (using 6 again):
let
Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/6")),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Column2] = "Cardinal" or [Column2] = "Divisors" or [Column2] = "Factorization" or [Column2] = "Ordinal")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Column1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "Property"}, {"Column3", "Value"}}),
#"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Property]), "Property", "Value")
in
#"Pivoted Column"
Now we want to convert this into a function so that we can use it as many times as we want without creating a bunch of queries. (Note: I've named this query/function WikiLookUp as well.) To do this, change it to the following:
let
WikiLookUp = (ID as text) =>
let
Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/" & ID)),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Column2] = "Cardinal" or [Column2] = "Divisors" or [Column2] = "Factorization" or [Column2] = "Ordinal")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Column1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "Property"}, {"Column3", "Value"}}),
#"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Property]), "Property", "Value")
in
#"Pivoted Column"
in
WikiLookUp
Notice that all we did is wrap it in another set of let...in and defined the parameter ID = text which gets substituted into the Source line near the end. The function should appear like this:
Now we can go back to our table which we've imported into the query editor and invoke our newly created function in a custom column. (Note: Make sure you convert your ID values to text type first since they're being appended to a URL.)
Add a custom column with the following definition (or use the Invoke Custom Function button)
= WikiLookUp([ID])
Expand that column to bring in all the columns you want and you're done!
Here's what that query's M code looks like:
let
Source = Excel.CurrentWorkbook(){[Name="ListIDs"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each WikiLookUp([ID])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Cardinal", "Ordinal", "Factorization", "Divisors"}, {"Cardinal", "Ordinal", "Factorization", "Divisors"})
in
#"Expanded Custom"
The query should look like this: