Power Query: appending files gives an error (Excel)

I am trying to append close to 10,000 Excel files (each 50-100 KB). About halfway through the append, Power Query (PQ) raises an error, and I cannot tell which .xlsx file is causing it.
PQ's Queries and Connections pane shows the following error at the same time:
How can I resolve this without loading the files into PQ one by one until I find the file(s) causing the error? Thanks for reading!

I've frequently run into issues where PQ outright fails when it hits "error" cells in Excel workbooks, even if you've tried to remove errors in earlier steps. I'm not clear on the criteria that cause this, but it could be the case here, since the message mentions a "#VALUE!" error. While PQ should probably handle this more gracefully, I made a couple of queries that take a directory and return the workbook, sheet, and row of every cell error in every Excel file in that directory. I've never tried it with 10k Excel files, but if my code were cleaned up to be more efficient it would probably run quickly enough.
The query that gets all the raw excel file data looks like this:
let
Source = Folder.Files(YOUR DIRECTORY HERE),
#"Filtered Rows1" = Table.SelectRows(Source, each not Text.StartsWith([Name], "~")),
#"Filtered Rows" = Table.SelectRows(#"Filtered Rows1", each Text.EndsWith([Extension], ".xlsx") or Text.EndsWith([Extension], ".xlsm")),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "WorkbookData", each Excel.Workbook([Content])),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Folder Path", "Name", "WorkbookData"}),
#"Expanded WorkbookData" = Table.ExpandTableColumn(#"Removed Other Columns", "WorkbookData", {"Data", "Hidden", "Item", "Kind", "Name"}, {"WorkbookData.Data", "WorkbookData.Hidden", "WorkbookData.Item", "WorkbookData.Kind", "WorkbookData.Name"}),
#"Filtered Rows2" = Table.SelectRows(#"Expanded WorkbookData", each ([WorkbookData.Kind] = "Sheet")),
#"Removed Other Columns1" = Table.SelectColumns(#"Filtered Rows2",{"Folder Path", "Name", "WorkbookData.Name", "WorkbookData.Data"}),
ExpandedData = Table.ExpandTableColumn(#"Removed Other Columns1", "WorkbookData.Data", Table.ColumnNames(Table.Combine(#"Removed Other Columns1"[WorkbookData.Data]))),
IdentifySheets = Table.AddColumn(ExpandedData, "UniqueSheet", each [Folder Path]&[Name]&[WorkbookData.Name]),
SheetRowCounts = Table.Group(IdentifySheets, {"UniqueSheet"}, {{"Count", each Table.RowCount(_), type number}}),
#"Added Custom2" = Table.AddColumn(SheetRowCounts, "PerSheetRow", each List.Numbers(1, [Count], 1)),
#"Expanded PerSheetIndex" = Table.ExpandListColumn(#"Added Custom2", "PerSheetRow"),
IndexBase = Table.AddIndexColumn(#"Expanded PerSheetIndex", "Index", 0, 1),
#"Added Index" = Table.AddIndexColumn(IdentifySheets, "Index", 0, 1),
#"Merged Queries" = Table.NestedJoin(#"Added Index",{"Index"},IndexBase,{"Index"},"NewColumn",JoinKind.LeftOuter),
#"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"PerSheetRow"}, {"PerSheetRow"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded NewColumn",{"UniqueSheet", "Index"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns", List.Combine({{"Folder Path", "Name", "WorkbookData.Name", "PerSheetRow"}, List.RemoveMatchingItems(Table.ColumnNames(ExpandedData), {"Folder Path", "Name", "WorkbookData.Name"})}))
in
#"Reordered Columns"
That query is set up as connection-only, since I don't want to load the data of every sheet of every workbook I'm checking.
The query I use to load the rows with errors in it looks like this:
let
Source = NAME OF THE QUERY ABOVE,
#"Kept Errors" = Table.SelectRowsWithErrors(Source, Table.ColumnNames(Source)),
ColumnList = Table.FromList(Table.ColumnNames(#"Kept Errors")),
#"Added Custom" = Table.AddColumn(ColumnList, "Custom", each "ERROR"),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Replacements", each Record.FieldValues(_)),
ErrorReplacements = Table.SelectColumns(#"Added Custom1",{"Replacements"}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Kept Errors", ErrorReplacements[Replacements]),
#"Renamed Columns" = Table.RenameColumns(#"Replaced Errors",{{"PerSheetRow", "SheetRow"}, {"Name", "Workbook"}, {"WorkbookData.Name", "Sheet"}})
in
#"Renamed Columns"
I couldn't find a way to get PQ to convert the "error" cells to a string naming the specific error (probably possible, I just don't know how), so instead I replace all the error cells with "ERROR" and use conditional formatting on my sheet to highlight them.
I can't say how well this would work for your case, but it has helped me numerous times to find error cells in sets of Excel files.
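On the open point about recovering which specific error a cell holds: one option is M's `try` expression, which returns a record whose `Error` field carries the message. The following is a minimal sketch, not part of the queries above. The inline table, the column name `Value`, and the added `ErrorMessage` column are all made up for illustration, and the exact message text Excel cell errors produce is an assumption here.

```powerquery
let
    // Hypothetical sample: the second row holds an error cell
    Source = #table({"Value"}, {{1}, {error "Invalid cell value '#VALUE!'."}, {3}}),
    // try captures the error raised when the cell is read
    Described = Table.AddColumn(Source, "ErrorMessage",
        each let attempt = try [Value]
             in if attempt[HasError] then attempt[Error][Message] else null),
    // Keep only the rows where an error was found
    OnlyErrors = Table.SelectRows(Described, each [ErrorMessage] <> null),
    // Drop the failing column so the result loads without error cells
    Result = Table.RemoveColumns(OnlyErrors, {"Value"})
in
    Result
```

Applied to the real data, the same `try` pattern would have to be repeated per data column (or run over an unpivoted version of the table), which may be slow on thousands of workbooks.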

Related

Combine multiple .xls files that are formatted as txt files into one pivot?

I have files that I get from a third party every day; they have been building up for over a year, and I need to combine them into summary pivots, one file/pivot per month.
I have ~30 files with an .xls extension, but I think they are actually text files: when I open them I get the notification below, and when I save them the default format is tab-delimited text.
Example of notification
Each file has the same formatting and the same column headers. My current slow strategy is to open one at a time and paste the contents all into one file, then create a pivot. I know I should be doing this faster using either Power Pivot/Power Query or VB. Which one should I use and can anyone give me hints on how to get started?
You can do this in Power Query with either of the options below. Go to Data > Get Data > From Other Sources > Blank Query, then Home > Advanced Editor.
To read and combine all Excel (.xlsx) files in a directory:
//read all files in specified directory you fill in here
let Source = Folder.Files("C:\directory\subdirectory"),
//filter only filetype xlsx
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Name", "Content"}),
#"Added Custom" = Table.AddColumn(#"Removed Other Columns", "GetFileData", each Excel.Workbook([Content],true)),
#"Expanded GetFileData" = Table.ExpandTableColumn(#"Added Custom", "GetFileData", {"Data", "Hidden", "Item", "Kind", "Name"}, {"Data", "Hidden", "Item", "Kind", "Sheet"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded GetFileData",{"Content", "Hidden", "Item", "Kind"}),
List = List.Union(List.Transform(#"Removed Columns"[Data], each Table.ColumnNames(_))),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Columns", "Data", List,List)
in #"Expanded Data"
To read and combine all text files in a directory (you specify the extension):
let Source = Folder.Files("C:\directory\subdirectory"),
//filter only filetype txt
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".txt")),
#"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Custom", each Table.AddIndexColumn(Csv.Document(File.Contents([Folder Path]&"\"&[Name]),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),"Index")),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom1", "Custom", {"Column1", "Index"}, {"Column1", "Index"}),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded Custom",{"Column1", "Index", "Name"}),
#"Pivoted Column" = Table.Pivot(#"Removed Other Columns", List.Distinct(#"Removed Other Columns"[Name]), "Name", "Column1"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in #"Removed Columns"

Excel PowerQuery - Dynamically Change Source from Cell Value

For a project, I'm trying to query an Access database from Excel using PowerQuery. The path to the file will be set in a cell on the sheet and each user will change it as necessary.
I've tried the following code below, as well as endless examples from Google, however it always results in the error: DataFormat.Error: The supplied file path must be a valid absolute path. Details: ‪D:\Downloads\Database.accdb
let
//FilePath = Text.From(Excel.CurrentWorkbook(){[Name="File"]}[Content]{0}[Column1]),
//Name='File' refers to a named cell called File with the value of 'D:\Downloads\'
Path = Excel.CurrentWorkbook(){[Name="File"]}[Content]{0}[Column1],
FilePath = Text.From("" & Path & ""),
Source = Access.Database(File.Contents(FilePath & "Database.accdb"), [CreateNavigationProperties=true]),
_Stores = Source{[Schema="",Item="Stores"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(_Stores,{{"Open Time", type time}, {"Close Time", type time}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"State", "Routes(City)", "Routes(City) 2"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"City", "Store"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "CurTime", each DateTime.LocalNow()),
#"Inserted Time" = Table.AddColumn(#"Added Custom", "Time", each DateTime.Time([CurTime]), type time),
#"Removed Columns1" = Table.RemoveColumns(#"Inserted Time",{"CurTime"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns1", "Hours Until Close", each Duration.Hours(Duration.From(DateTime.Time([Close Time])-DateTime.Time(DateTime.LocalNow())))),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom1",{"Time"})
in
#"Removed Columns2"
Any assistance would be much appreciated!
I cannot reproduce the error. This code works just fine for me.
let
Path = Excel.CurrentWorkbook(){[Name="File"]}[Content]{0}[Column1],
FilePath = Text.From("" & Path & ""),
Source = Access.Database(File.Contents(FilePath & "Database.accdb"))
// Source = Access.Database(File.Contents("C:\demo\Database.accdb"), [CreateNavigationProperties=true])
in
Source
Check for spelling and missing characters in your variables.
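One thing worth ruling out when checking for stray characters: a path pasted into a cell can carry an invisible character (for example U+202A, which sometimes comes along when a path is copied from a Windows properties dialog). Whether that is what happened here is an assumption, not something the thread confirms. A minimal sketch of cleaning the value before use, with a made-up `RawPath` standing in for the cell value:

```powerquery
let
    // Hypothetical cell value with an invisible U+202A character in front of the path
    RawPath = Character.FromNumber(8234) & "D:\Downloads\",
    // Remove the invisible character and any surrounding whitespace
    CleanPath = Text.Trim(Text.Remove(RawPath, {Character.FromNumber(8234)})),
    // Make sure the folder ends with exactly one backslash before appending the file name
    FullPath = Text.TrimEnd(CleanPath, "\") & "\Database.accdb"
in
    FullPath
```

In the real query, `RawPath` would be the `Path` step read from the named cell, and `FullPath` would be passed to `File.Contents`.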

Excel Power Query only imports column titles, not data

I am trying to use Power Query in Excel 2013 to import a folder full of 121 text files. Each text file has a column of numbers:
24
2.0000E+07
1.0000E+07
5.0000E+06
2.0000E+06
1.0000E+06
1.0000E+05
1.0000E+04
1.0000E+03
1.0000E+02
1.0000E+01
1.0000E+00
6.2500E-01
5.0000E-01
4.0000E-01
3.0000E-01
2.0000E-01
1.0000E-01
8.0000E-02
6.0000E-02
4.0000E-02
3.0000E-02
2.0000E-02
1.0000E-02
2.0000E-04
1.0000E-05
1.0516E-05
9.3907E-06
3.3497E-04
1.8445E-03
1.3411E-03
5.4756E-03
9.4254E-03
1.2390E-02
1.4350E-02
1.5677E-02
1.7293E-02
4.0507E-03
2.0602E-03
2.1823E-03
3.1392E-03
7.5455E-03
9.1609E-02
7.5750E-02
1.2536E-01
1.9400E-01
1.2207E-01
1.2811E-01
1.1341E-01
5.2564E-02
56
2.0000E+07
6.4300E+06
4.3000E+06
3.0000E+06
1.8500E+06
1.5000E+06
1.2000E+06
8.6100E+05
7.5000E+05
6.0000E+05
4.7000E+05
3.3000E+05
2.7000E+05
2.0000E+05
5.0000E+04
2.0000E+04
1.7000E+04
3.7400E+03
2.2500E+03
1.9200E+02
1.8800E+02
1.1800E+02
1.1600E+02
1.0500E+02
1.0100E+02
6.7500E+01
6.5000E+01
3.7100E+01
3.6000E+01
2.1800E+01
2.1200E+01
2.0500E+01
7.0000E+00
6.8800E+00
6.5000E+00
6.2500E+00
5.0000E+00
1.1300E+00
1.0800E+00
1.0100E+00
6.2500E-01
4.5000E-01
3.7500E-01
3.5000E-01
3.2500E-01
2.5000E-01
2.0000E-01
1.5000E-01
1.0000E-01
8.0000E-02
6.0000E-02
5.0000E-02
4.0000E-02
2.5300E-02
1.0000E-02
4.0000E-03
1.0000E-05
I want to use Power Query to import the entire folder into Excel, with the data in each text file having its own column, and the column header being the name of the text file.
Like this
The problem is that Power Query only seems to import the file names, but not the data within them.
So I get something like:
this
With no data underneath its respective column. What am I doing wrong? Would it have something to do with Power Query seeing the data as 'binary' instead of 'text'?
This should do what you want: it reads all .txt files in a directory and places the values from each into its own column, with the filename as the column header.
Obviously, change the path in the first step
Assumes a single column of data in each source file
let Source = Folder.Files("C:\directory\subdirectory\"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".txt")),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each Table.AddIndexColumn(Csv.Document(File.Contents([Folder Path]&"\"&[Name]),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),"Index",1)),
#"Expanded Custom.1" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Column1", "Index"}, {"Column1", "Index"}),
#"Removed Other Columns" = Table.SelectColumns(#"Expanded Custom.1",{"Name", "Column1", "Index"}),
#"Pivoted Column" = Table.Pivot(#"Removed Other Columns", List.Distinct(#"Removed Other Columns"[Name]), "Name", "Column1"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in #"Removed Columns"
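As a possible alternative to the index-and-pivot steps above, each file's binary content can be split into lines and the resulting lists assembled directly into columns. This is only a sketch under assumptions: the `Files` table below is a made-up stand-in for the output of `Folder.Files`, built with `Text.ToBinary` so the example does not depend on a real folder.

```powerquery
let
    // Hypothetical stand-in for Folder.Files output: one binary per text file
    Files = #table({"Name", "Content"}, {
        {"a.txt", Text.ToBinary("24#(lf)2.0000E+07#(lf)1.0000E+07")},
        {"b.txt", Text.ToBinary("56#(lf)2.0000E+07")}
    }),
    // Split each file's content into a list of lines
    Columns = List.Transform(Files[Content], each Lines.FromBinary(_)),
    // One column per file, named after the file; shorter files are padded with null
    Result = Table.FromColumns(Columns, Files[Name])
in
    Result
```

The values arrive as text, so a type conversion step (for example with `Table.TransformColumnTypes`) would still be needed to treat them as numbers.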

Excel Power Query - Count number of matching multiple columns

I have a datasource from an external Excel file that I have added to an Excel worksheet. I need to add new custom columns that compare the data to a table ("My_Table") in another worksheet that is manually updated. I used the Power Query Editor and created a new column that checks if there is a matching entry in My_Table based on matching 3 columns and gives a True/False result (ie for each row of the datasource, if the acctName, projectName, and boardName match a corresponding row in My_Table, then it returns true):
#"Added Custom" = Table.AddColumn(#"Reordered Columns", "Tracked", each Table.Contains( My_Table, [Customer=[acctName], Project=[projectName], Board=[boardName]]))
What I would like to do now is the exact same thing, but counting how many times those three columns match in "My_Table". I thought Table.RowCount would work, but I'm not sure that's the right way to do it, as I get either an error or a zero result.
dolomike, Here's another shot at it...
I started with this as Table1:
...and this as My_Table:
...and used this M code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Merged Queries" = Table.NestedJoin(Source, {"acctName", "projectName", "boardName"}, My_Table, {"Customer", "Project", "Board"}, "My_Table", JoinKind.LeftOuter),
#"Expanded My_Table" = Table.ExpandTableColumn(#"Merged Queries", "My_Table", {"Customer", "Project", "Board"}, {"My_Table.Customer", "My_Table.Project", "My_Table.Board"}),
#"Grouped Rows" = Table.Group(#"Expanded My_Table", {"My_Table.Customer", "My_Table.Project", "My_Table.Board"}, {{"Count", each Table.RowCount(_), type number}, {"AllData", each _, type table [acctName=text, projectName=text, boardName=text, My_Table.Customer=text, My_Table.Project=text, My_Table.Board=text]}}),
Custom2 = Table.TransformColumns(#"Grouped Rows",{"Count", each if _ = List.Max(#"Grouped Rows"[Count]) then 0 else _}),
#"Removed Other Columns" = Table.SelectColumns(Custom2,{"Count", "AllData"}),
#"Expanded AllData" = Table.ExpandTableColumn(#"Removed Other Columns", "AllData", {"acctName", "projectName", "boardName", "My_Table.Customer", "My_Table.Project", "My_Table.Board"}, {"acctName", "projectName", "boardName", "My_Table.Customer", "My_Table.Project", "My_Table.Board"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Expanded AllData",{"Count", "acctName", "projectName", "boardName"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns1",{"acctName", "projectName", "boardName", "Count"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"acctName", "Customer"}, {"projectName", "Project"}, {"boardName", "Board"}}),
#"Removed Duplicates" = Table.Distinct(#"Renamed Columns")
in
#"Removed Duplicates"
...to get this result:
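A more direct variant, closer to the `Table.RowCount` idea from the question, is to count the matching rows of My_Table for each source row. This is a sketch rather than the answer's method; the two inline tables are made-up samples that only reuse the column names from the question, and `MatchCount` is an illustrative column name.

```powerquery
let
    // Hypothetical sample tables using the column names from the question
    Table1 = #table({"acctName", "projectName", "boardName"},
        {{"A", "P1", "B1"}, {"A", "P2", "B1"}, {"C", "P9", "B9"}}),
    My_Table = #table({"Customer", "Project", "Board"},
        {{"A", "P1", "B1"}, {"A", "P1", "B1"}, {"A", "P2", "B1"}}),
    // Buffer the lookup table so it is not re-evaluated for every row
    Lookup = Table.Buffer(My_Table),
    // For each source row, count the lookup rows whose three columns all match
    Counted = Table.AddColumn(Table1, "MatchCount",
        (row) => Table.RowCount(Table.SelectRows(Lookup,
            each [Customer] = row[acctName]
                and [Project] = row[projectName]
                and [Board] = row[boardName])),
        Int64.Type)
in
    Counted
```

The outer function uses an explicit `row` parameter because the inner `each` needs its own `_` for the rows of the lookup table. Rows without a match get a count of 0 rather than an error.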

Way to filter multiple conditions in Power Query from folder containing CSV files

I need help with corrections/suggestions for a query I am using to get data from a folder of CSV files. Warning upfront: I don't know how to write this briefly.
Some information first:
Tools are limited to Power Query, Excel, and VBA
The data query will run once a month, so a longer loading time is not a BIG issue, although a shorter one is of course preferable
I have chosen the Power Query approach because the source data has to be used in another Excel file, but with a different set of rules (and this is part of my current issue).
The basic issue with my code is that it runs for a really long time, there is a large number of conditions that need to be met, and I have to use a similar approach for another reason/tool/file. And I want people to just press Refresh to get the information they need.
Description:
I have source data in CSV files in a folder. There is no naming convention, because multiple people export the data from the system. Because of that I've used the folder option in PQ.
The size of the data is currently around 400-600 MB. The column names might change, which the first lines of the M code are there to work around.
My main struggle is:
There are several conditions that need to be implemented. I didn't want to write multiple if statements, because the code would get really ugly, and the conditions number in the tens and span multiple columns. For that reason I've implemented a translation table (let's call it TT) that holds all columns where filtering could be used; its last column is the concatenation of all the others. If a condition doesn't care about one of the columns, I fill it with the wildcard "*".
So the TT might look like:
| PC | CLIENT | FN  | TC | STRING       |
|----|--------|-----|----|--------------|
| 11 | *      | NEW | AC | 11*NEWAC     |
| 47 | 000001 | NEW | *  | 47000001NEW* |
etc...
PC is PoC, FN is FUNCTION, TC is Transaction code (in code below).
Then in the code I replace the wildcard with the corresponding column's value from PQ and check whether the concatenated string from the same columns in PQ is contained in the TT (whose last column is turned into a list).
The code below works for the simpler case, but it's pretty hardcoded, because I wanted to know whether it's even possible.
After the data update I run a VBA macro to append the data to a "database" table (with a check for existing values, of course) so the data load can be minimized. That is what the first part of the code is for.
The code can basically be split into three parts:
Basic transformation: loading from the folder, getting rid of unconventional names, and checking against another folder for files with the same name to minimize the load.
Filtering data: merging the PQ table with the TT table, replacing the wildcards with the correct column, and then building the filtering string to check whether the concatenated text in the PQ table matches at least one value from the TT list.
Final transformation of the data to get the information I need (mainly about late settlements from the market).
Whole M-Code with comments
let
/*Here starts basic data transformation to limit errors in CSV files due to
different conventions */
Source = Folder.Files(source),
#"Uppercased Text1" = Table.TransformColumns(Source,{{"Name", Text.Upper, type text}}),
#"Merged Queries2" = Table.NestedJoin(#"Uppercased Text1", {"Name"}, q_Archive, {"Name"}, "q_Archive", JoinKind.LeftAnti),
#"Added Custom" = Table.AddColumn(#"Merged Queries2", "Data", each Csv.Document(File.Contents([Folder Path] & "\" & [Name]),[Delimiter=";", Encoding = 1252, QuoteStyle = QuoteStyle.None])),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Data"}),
#"Added Custom1" = Table.AddColumn(#"Removed Other Columns", "Table", each Table.PromoteHeaders([Data])),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom1",{"Table"}),
#"Added Custom2" = Table.AddColumn(#"Removed Other Columns1", "Upper", each Table.TransformColumnNames([Table],Text.Upper)),
#"Removed Other Columns2" = Table.SelectColumns(#"Added Custom2",{"Upper"}),
#"Expanded Upper" = Table.ExpandTableColumn(#"Removed Other Columns2", "Upper", {"19A AMOUNT", "19A CURRENCY CODE", "35B ISIN", "CLIENT", "EXP.SETTL.DATE", "FUNCTION", "INSTR.ID", "MESSAGE FUNCTION", "POC", "RECEPTION DATE", "SETTL.AMOUNT", "SETTL.CUR.", "TRANSACTION CODE"}, {"19A AMOUNT", "19A CURRENCY CODE", "35B ISIN", "CLIENT", "EXP.SETTL.DATE", "FUNCTION", "INSTR.ID", "MESSAGE FUNCTION", "POC", "RECEPTION DATE", "SETTL.AMOUNT", "SETTL.CUR.", "TRANSACTION CODE"}),
#"Renamed Columns1" = Table.RenameColumns(#"Expanded Upper",{{"SETTL.AMOUNT", "SETTL.AMOUNT2"}, {"SETTL.CUR.", "SETTL.CUR.2"}, {"19A CURRENCY CODE", "19A CURRENCY CODE2"}, {"19A AMOUNT", "19A AMOUNT2"}}),
#"Added Custom10" = Table.AddColumn(#"Renamed Columns1", "19A AMOUNT", each if[SETTL.AMOUNT2]=null then [19A AMOUNT2] else [SETTL.AMOUNT2]),
#"Added Custom11" = Table.AddColumn(#"Added Custom10", "19A CURRENCY CODE", each if [SETTL.CUR.2] = null then [19A CURRENCY CODE2] else [SETTL.CUR.2]),
#"Renamed Columns" = Table.RenameColumns(#"Added Custom11",{{"FUNCTION", "FUNCTION2"}}),
#"Added Custom8" = Table.AddColumn(#"Renamed Columns", "FUNCTION", each if[FUNCTION2]=null then [MESSAGE FUNCTION] else[FUNCTION2]),
#"Removed Other Columns3" = Table.SelectColumns(#"Added Custom8",{"35B ISIN", "CLIENT", "EXP.SETTL.DATE", "INSTR.ID", "POC", "RECEPTION DATE", "TRANSACTION CODE", "19A AMOUNT", "19A CURRENCY CODE", "FUNCTION"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns3",{"POC", "CLIENT", "FUNCTION", "TRANSACTION CODE", "EXP.SETTL.DATE", "RECEPTION DATE", "19A AMOUNT", "19A CURRENCY CODE"}),
#"Replaced Value" = Table.ReplaceValue(#"Reordered Columns","""","",Replacer.ReplaceText,{"POC", "CLIENT", "INSTR.ID", "35B ISIN"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","=","",Replacer.ReplaceText,{"POC", "CLIENT", "INSTR.ID", "35B ISIN"}),
#"Uppercased Text" = Table.TransformColumns(#"Replaced Value1",{{"POC", Text.Upper, type text}, {"CLIENT", Text.Upper, type text}, {"FUNCTION", Text.Upper, type text}, {"TRANSACTION CODE", Text.Upper, type text}}),
#"Filtered Rows" = Table.SelectRows(#"Uppercased Text", each ([FUNCTION] = "NEWM")),
#"Merged Queries" = Table.NestedJoin(#"Filtered Rows", {"POC"}, tbl_setup_pocList, {"PocList"}, "tbl_setup_pocList", JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"tbl_setup_pocList"}),
/* Here ends the data transformation part
and the part for list transformations start*/
#"Added condition" = Table.AddColumn(#"Removed Columns","COND", each (
((Table.FromRecords({
[PC = List.ReplaceValue(Table.Column(tbl_filtering_string, "POC"),"*",[POC], Replacer.ReplaceText),
CL = List.ReplaceValue(Table.Column(tbl_filtering_string, "CLIENT"),"*",[CLIENT], Replacer.ReplaceText),
FN = List.ReplaceValue(Table.Column(tbl_filtering_string, "FUNCTION"),"*",[FUNCTION], Replacer.ReplaceText),
TC = List.ReplaceValue(Table.Column(tbl_filtering_string, "TRANSACTION CODE"),"*",[TRANSACTION CODE], Replacer.ReplaceText)]}
))))),
#"Expanded COND" = Table.ExpandTableColumn(#"Added condition", "COND", {"PC", "CL", "FN", "TC"}, {"PC", "CL", "FN", "TC"}),
#"Added Custom3" = Table.AddColumn(#"Expanded COND", "Test", each (List.Combine(
{
{_[PC]},{_[CL]},{_[FN]},{_[TC]}
}
))),
#"Expanded Test" = Table.AddColumn(#"Added Custom3", "Test2", each (Table.FromColumns(_[Test],null))),
#"Removed Columns2" = Table.RemoveColumns(#"Expanded Test",{"PC", "CL", "FN", "TC", "Test"}),
#"Added Custom4" = Table.AddColumn(#"Removed Columns2", "String", each Table.ToList([Test2],Combiner.CombineTextByDelimiter(""))),
#"Removed Columns3" = Table.RemoveColumns(#"Added Custom4",{"Test2"}),
#"Added Custom6" = Table.AddColumn(#"Removed Columns3", "CONTAIN_STR", each [POC]&[CLIENT]&[FUNCTION]&[TRANSACTION CODE]),
#"Added Custom5" = Table.AddColumn(#"Added Custom6", "Cond", each List.Contains(_[String],[CONTAIN_STR])),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom5", each ([Cond] = false)),
/*Here the code for filtering ends and final transformations occur */
#"Removed Columns4" = Table.RemoveColumns(#"Filtered Rows1",{"String", "CONTAIN_STR", "Cond"}),
#"Merged Queries1" = Table.NestedJoin(#"Removed Columns4", {"POC"}, tbl_setup_exotics, {"Exotic_PoC"}, "tbl_setup_exotics", JoinKind.LeftOuter),
#"Expanded tbl_setup_exotics" = Table.ExpandTableColumn(#"Merged Queries1", "tbl_setup_exotics", {"Exotic_PoC"}, {"Exotic_PoC"}),
#"Replaced Value2" = Table.ReplaceValue(#"Expanded tbl_setup_exotics",null, "Non Exotic",Replacer.ReplaceValue,{"Exotic_PoC"}),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value2", {"EXP.SETTL.DATE", "RECEPTION DATE"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Errors",{{"EXP.SETTL.DATE", type date}, {"RECEPTION DATE", type date}}),
#"Added Custom7" = Table.AddColumn(#"Changed Type", "RD", each (if [Exotic_PoC] <> "Non Exotic" then Date.AddDays([RECEPTION DATE],1)else [RECEPTION DATE])),
#"Filtered Rows2" = Table.AddColumn(#"Added Custom7", "LB" , each if [RD]>=[EXP.SETTL.DATE] then "Late" else "Not"),
#"Added Custom9" = Table.AddColumn(#"Filtered Rows2", "DAYS_LATE", each [RD]-[EXP.SETTL.DATE]),
#"Inserted Year" = Table.AddColumn(#"Added Custom9", "Year", each Date.Year([EXP.SETTL.DATE]), Int64.Type),
#"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([EXP.SETTL.DATE]), Int64.Type),
#"Changed Type1" = Table.TransformColumnTypes(#"Inserted Month",{{"19A AMOUNT", type number}}),
#"Grouped Rows" = Table.Group(#"Changed Type1", {"Year", "Month", "POC", "19A CURRENCY CODE", "DAYS_LATE", "LB"}, {{"Count", each Table.RowCount(_), type number}, {"Countervalue", each List.Sum([19A AMOUNT]), type number}, {"ISIN", each Text.Combine([35B ISIN],";"), type text}, {"INSTR.ID", each Text.Combine([INSTR.ID], ";"), type text}}),
#"Merged Queries3" = Table.NestedJoin(#"Grouped Rows", {"Year", "Month", "19A CURRENCY CODE"}, q_Xrates, {"Year", "Month", "Currency"}, "q_Xrates", JoinKind.LeftOuter),
#"Expanded q_Xrates" = Table.ExpandTableColumn(#"Merged Queries3", "q_Xrates", {"Rate"}, {"Rate"}),
#"Replaced Value3" = Table.ReplaceValue(#"Expanded q_Xrates",null,1,Replacer.ReplaceValue,{"Rate"}),
#"Added Col" = Table.AddColumn(#"Replaced Value3", "CV", each [Countervalue]/[Rate]),
#"Remove Countervalue" = Table.RemoveColumns(#"Added Col", {"Countervalue"})
in
#"Remove Countervalue"
Questions
I know this approach sounds over-complicated, but it works (unfortunately it takes a long time to refresh). But is it really good? Are there other options, considering the limited tools mentioned at the beginning?
How can I make this code better? I believe it could be partially rewritten as a function, but since I am quite a beginner in PQ, I cannot imagine how.
How can I use the same approach, for the same source data, but with greater complexity? By that I mean more columns added to the filtering string.
Do you have other suggestions?
End comments
I am pretty desperate by now, and my text might be confusing in places.
I have no problem providing some kind of Visio chart to show my logic more graphically (I am more familiar with that), along with a relationship overview.
I also have no problem providing anonymized data (since it might be partially confidential). If you need it, please tell me your preferred service.
I don't mind working on my code if I am pushed in the right direction. For that, question #1 is the priority: is this a good approach, and can it easily be adapted for another, similar but more complicated purpose?
I really appreciate your time.
*/ MK */
If I were to do this, I would write a function that compiles the filter condition table into a function, then apply it with Table.SelectRows.
// Compile the condition table into a function that can be applied in row filtering.
filterCondition = compileFilterConditionTable(tbl_filtering_string),
#"Filtered Rows" = Table.SelectRows(#"Table after Preceding Steps", filterCondition)
Isn't this much easier to trace?
Below is example code for a function that compiles the condition table into a logical function. I'm not sure it works correctly for your case, because I don't completely understand the requirement.
compileFilterConditionTable =
let compileFilterConditionTable = (filterConditionTable as table) as function =>
let recordConditions = List.Transform(
Table.ToRecords(filterConditionTable),
compileFilterConditionRecord)
in applyCombine(recordConditions, List.AnyTrue),
compileFilterConditionRecord = (cond as record) as function =>
let fieldNameValues = List.Transform(
Record.FieldNames(cond),
each [Name = _, Value = Record.Field(cond, Name)]
),
fieldConditions = List.Transform(fieldNameValues, compileFieldCondition)
in applyCombine(fieldConditions, List.AllTrue),
compileFieldCondition = (fieldNameValue as record) as function =>
let name = fieldNameValue[Name],
value = fieldNameValue[Value]
in
if value = "*" then (record as record) as logical => true
else (record as record) as logical => Record.Field(record, name) = value,
applyCombine = (functions as list, combiner as function) as function =>
(value) => combiner(List.Transform(functions, (f) => f(value)))
in compileFilterConditionTable
Anyway, M is a functional programming language, so it helps to think and code in a functional way. Break the logic down into small parts, so that each part is easy to understand. Write your code as small reusable functions, and combine them to build the whole.
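To make the idea concrete, here is a compact, self-contained sketch of the same wildcard matching applied to sample data. It is a simplified variant, not the function above verbatim: the tables are made up, `matchesCondition` and `matchesAny` are illustrative names, and it assumes the condition table's column names match the data's column names exactly (so a concatenated STRING column like the one in the question would have to be removed first).

```powerquery
let
    // Hypothetical condition table; "*" means "any value"
    Conditions = #table({"POC", "CLIENT", "FUNCTION"},
        {{"11", "*", "NEW"}, {"47", "000001", "*"}}),
    // Hypothetical data rows
    Data = #table({"POC", "CLIENT", "FUNCTION", "AMOUNT"}, {
        {"11", "000123", "NEW", 10},
        {"47", "000002", "NEW", 20},
        {"47", "000001", "CANC", 30}
    }),
    // Read the conditions once instead of once per data row
    ConditionRecords = List.Buffer(Table.ToRecords(Conditions)),
    // True when every non-wildcard field of the condition equals the row's field
    matchesCondition = (row as record, cond as record) as logical =>
        List.AllTrue(List.Transform(Record.FieldNames(cond),
            (name) => Record.Field(cond, name) = "*"
                or Record.Field(row, name) = Record.Field(cond, name))),
    // True when at least one condition matches the row
    matchesAny = (row as record) as logical =>
        List.AnyTrue(List.Transform(ConditionRecords,
            (cond) => matchesCondition(row, cond))),
    Matching = Table.SelectRows(Data, matchesAny)
in
    Matching
```

With this sample, the first and third data rows satisfy a condition. The question's query keeps the rows where `[Cond] = false`, so to reproduce that behavior the last step would be `Table.SelectRows(Data, each not matchesAny(_))` instead. Adding another filter column then only means adding a column to the condition table.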

Resources