When importing files from a folder, Power Query automatically generates four helper objects for the main query.
The question is: how does the "Transform File" function receive the content from "Transform Sample File"?
Sample File -> Parameter 1 -> Transform File()
actual Query calls Transform File()
The content of "Transform Sample File" magically appears within Transform File() ...
Object called "Sample File"
let
Source = Folder.Files("C:\OneDrive\A\d_LAB\timestamp_cgc_cat"),
Navigation1 = Source{0}[Content]
in
Navigation1
Parameter called "Parameter 1" ( receives "Sample File" as argument)
#"Sample File" meta [IsParameterQuery=true, BinaryIdentifier=#"Sample File", Type="Binary", IsParameterQueryRequired=true]
Function called "Transform File"
let
Source = (Parameter1 as binary) => let
Source = Csv.Document(Parameter1,[Delimiter="|", Columns=5, Encoding=65001, QuoteStyle=QuoteStyle.None]),
A = custom_step_a,
B = custom_step_b,
C = custom_step_c
in
C
in
Source
Query called "Transform Sample File"
let
Source = Csv.Document(Parameter1,[Delimiter="|", Columns=5, Encoding=65001, QuoteStyle=QuoteStyle.None]),
A = custom_step_a,
B = custom_step_b,
C = custom_step_c
in
C
Actual Query that the average human gets to fumble with
let
Source = Folder.Files("C:\folder_with_csv_files"),
filter_hidden = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(filter_hidden, "Transform File", each #"Transform File"([Content])),
X = custom_step_x,
Y = custom_step_y,
Z = custom_step_z
in
Z
asking out of curiosity ...
If you are asking where you can select the Sample File:
Instead of clicking on the Combine & Transform Data button, click on the Transform Data button. Then, after you click the button on the Content column to combine the files from the folder into a table, you'll get the Combine Files screen. At the top left of that screen, you'll see a drop-down selector where you can choose the file from the folder contents that you're combining, to use as the sample file.
If you want to change the file to be used for the Sample File at a later time:
Click Sample File, under Queries; then click your Filtered Rows step, under Applied Steps; then double-click the corresponding Binary for it in the Content column. This will create an applied step with the path and name of the file that you used as the Sample File, and one or more other applied steps below it. Delete any such additional steps so that your last applied step is the one named with the path and name of the file and you see an icon representation of your file.
By the way... You may find it more useful to "fumble with" the Transform Sample File instead of the one you were thinking, as that is the template, so to speak, for the actions that will be done to every file brought into the query from the folder.
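As for how the content gets from one to the other, here is a rough way to picture it. This is a sketch based on the objects listed in the question, not a statement about Power Query internals. "Transform Sample File" references Parameter1 like any other query, and Parameter1 evaluates to "Sample File". "Transform File" carries the same steps with Parameter1 turned into a function argument, so the main query can pass each row's [Content] instead. The step names FromSample, Files and PerFile are made up for illustration.

```powerquery
let
    // Passing the sample binary explicitly should give the same table
    // that "Transform Sample File" shows in its preview.
    FromSample = #"Transform File"(#"Sample File"),

    // The main query does the same thing once per file in the folder,
    // passing each row's [Content] binary as the Parameter1 argument.
    Files = Folder.Files("C:\folder_with_csv_files"),
    PerFile = Table.AddColumn(Files, "Transform File", each #"Transform File"([Content]))
in
    PerFile
```

FromSample is never used by the final expression, so it is not evaluated; it is only there to show the equivalence.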
Related
In our lab we have a system that many people use and that produces an oddly shaped txt file with the results. I created a Power Query query that cleans the file, and I would like to share it with others (not very computer savvy) so they can apply it to the files they generate.
What can I do to make it as easy as possible for other users to select the file they want the query to be applied to? For example, is there an easy way to create a button that opens a dialog requesting the file location? Right now I have to edit the query source to select the data; this approach is clunky and will be confusing for some of my colleagues.
let
Source = Table.FromColumns({Lines.FromBinary(File.Contents("X:\foo\foo.txt"), null, null, 1252)}),
#"Removed Top Rows2" = Table.Skip(Source,32),
#"Removed Bottom Rows" = Table.RemoveLastN(#"Removed Top Rows2",16),
#"Other Steps" = ...
Thanks!
You can read a file path directly from a named range cell, without a helper function:
let
NameValue= Excel.CurrentWorkbook(){[Name="rangenamehere"]}[Content]{0}[Column1],
Source = Table.FromColumns({Lines.FromBinary(File.Contents(NameValue), null, null, 1252)}),
Or, if you want the VBA route with a file prompt:
1 Create a range name, here aaa
2 Use VBA to populate it using a file prompt
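Sub prompt()
' Ask the user to pick an existing file; the result is False if the dialog is cancelled
Dim FName As Variant
FName = Application.GetOpenFilename("Data file (*.xl*),*.xl*", 1)
If FName = False Then
MsgBox "No file selected"
Exit Sub
End If
' Write the chosen path into the named range that the query reads
Range("aaa").Value = FName
End Sub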
3 Refer to the named range in powerquery you set up
let
NameValue= Excel.CurrentWorkbook(){[Name="aaa"]}[Content]{0}[Column1],
Source = Table.FromColumns({Lines.FromBinary(File.Contents(NameValue), null, null, 1252)}),
4 Tack on code at end of VBA to refresh all queries or specific query
ActiveWorkbook.RefreshAll
or
ActiveWorkbook.Queries("QueryNameHere").Refresh
I found this post from 2014 that works pretty well. You write a function in Power Query (fnGetParameter) that reads the file location from a table, and then you feed it to the query that processes the data.
All the user needs to do is enter the file location and name in the table and refresh.
I changed the first two lines of my Power Query code to look like this:
Fileloc = fnGetParameter("File Path"),
Source = Table.FromColumns({Lines.FromBinary(File.Contents(Fileloc), null, null, 1252)}),
Any suggestions to make it even better?
You can make the fnGetParameter into a one-liner:
= ( getValue as text ) => Excel.CurrentWorkbook(){[Name="Parameters"]}[Content]{[Parameter=getValue]}?[Value]?
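A hedged usage sketch for the one-liner: as far as I understand the optional `?` lookups, it returns null rather than raising an error when the parameter name is not found, so the calling query can check for that. This assumes the workbook table is named Parameters with columns Parameter and Value, as in the one-liner; the error message text is made up for illustration.

```powerquery
let
    Fileloc = fnGetParameter("File Path"),
    // Stop with a readable message if the parameter row is missing
    Source =
        if Fileloc = null then
            error "No 'File Path' row found in the Parameters table"
        else
            Table.FromColumns({Lines.FromBinary(File.Contents(Fileloc), null, null, 1252)})
in
    Source
```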
I have been trying to get the current directory into Power Query, but it doesn't work. How can I get the current directory so that the path in my query is dynamic and, if the file is moved to a new directory, I can still retrieve the data? Here is the code I tried:
let
Source = Excel.CurrentWorkbook(){[Name="pathTable"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Path", type text}}),
Path = Table.ReplaceValue(#"Changed Type","\[PathForSummaryFiles.xlsx]Path","\summary.xlsx",Replacer.ReplaceText,{"Path"})
GetFilesFromFolder = Folder.Files(Path)
in
GetFilesFromFolder
The code above throws an error.
Is this what you are trying to do?
In Excel, name a cell DirectoryRangeName using Formulas > Name Manager.
In that cell, enter a formula to capture the path of the workbook:
=LEFT(CELL("filename",A1),FIND("[",CELL("filename",A1))-1)
or enter your own path such as:
c:\directory\subdirectory\
Then, in Power Query, read that value and combine it with whatever you want to reference, such as:
let Directory = Excel.CurrentWorkbook(){[Name="DirectoryRangeName"]}[Content]{0}[Column1],
Source = Excel.Workbook(File.Contents(Directory & "abs.xlsx"), null, true)
in Source
Or
let Directory = Excel.CurrentWorkbook(){[Name="DirectoryRangeName"]}[Content]{0}[Column1],
GetFilesFromFolder = Folder.Files(Directory)
in GetFilesFromFolder
Instead of a formula in the range name, you could simply put a full filepath such as c:\temp\a.xlsx
and then read it within powerquery
let FilePath= Excel.CurrentWorkbook(){[Name="DirectoryRangeName"]}[Content]{0}[Column1],
Source = Excel.Workbook(File.Contents(FilePath), null, true)
in Source
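One small variation, offered as a sketch rather than a required step: if users type the directory by hand, they may leave off the trailing backslash. The query below adds it only when missing. The step name NormalizedDir is hypothetical; the range name and the file name abs.xlsx are taken from the examples above.

```powerquery
let
    Directory = Excel.CurrentWorkbook(){[Name="DirectoryRangeName"]}[Content]{0}[Column1],
    // Add the trailing backslash only when it is missing
    NormalizedDir = if Text.EndsWith(Directory, "\") then Directory else Directory & "\",
    Source = Excel.Workbook(File.Contents(NormalizedDir & "abs.xlsx"), null, true)
in
    Source
```

The CELL("filename") formula above already ends at the backslash before the "[", so the check mainly helps with hand-typed paths.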
Power BI junior here
How to look in each excel file from a SharePoint list and extract contents from predefined cells.
I am currently accessing a few intranet SharePoint libraries containing .xlsx files, and I am doing some reporting with the metadata of those files. For example, a library contains 10 Excel files, so I can graph who uploaded them, when they were uploaded, and what category they were assigned to...
However, is there a way with Power Query to look into each of the files, take the value from, say, cell A1 of the workbook, and add it as a new column "CellA1Content"? That is, make your own metadata from the content of the files and add it to the imported metadata table.
I've found some functions that I possibly might need:
File.Contents
Excel.CurrentWorkbook
However I am not well-versed enough in Power Query to put it all together, if it's even possible at all. I would have to do a foreach operation of some kind.
Edit: Solution
This worked. I selected the first non-hidden sheet in the excel and I also made the function so that I can pass the column and row number.
Main query:
let
Source = SharePoint.Contents("http://mysharepoint", [Implementation=null, ApiVersion=15]),
... ... ...
//Open each excel and get cell D5
#"AddedColumn1" = Table.AddColumn(#"Filtered Rows", "AddedColumn1", each GetCellContent([Content],4,5))
in
AddedColumn1
Blank query in Power BI, called GetCellContent:
let
Source = (binaryParameter,col,row) => let
Source = Excel.Workbook(binaryParameter, null, false),
UnhiddenSheets = Table.SelectRows(Source, each [Hidden] = false and [Kind] = "Sheet"),
Sheet = UnhiddenSheets{0}[Data],
Column = Table.SelectColumns(Sheet,{Text.Combine({"Column",Number.ToText(col)})}),
Cell = Record.Field(Column{row-1}, Text.Combine({"Column",Number.ToText(col)}) )
in
Cell
in
Source
You'll need a Function used in a column like this.
This is my local interpretation of your problem, without sharepoint. The same logic is shared though.
Main Query
let
Source = Folder.Contents("YourDirectory"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Content", "Name"}),
#"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Row1Col1", each PullRow1Col1([Content]))
in
#"Added Custom"
PullRow1Col1:
let
Source = (binaryParameter) => let
Source = Excel.Workbook(binaryParameter, null, false),
Sheet1_sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
Column1 = Sheet1_sheet{0}[Column1]
in
Column1
in
Source
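If some workbooks in the folder might not contain a sheet named Sheet1, the lookup inside the function would raise an error for those rows. A possible way to handle that, assuming the PullRow1Col1 function above, is to wrap the call in `try ... otherwise` in the main query:

```powerquery
let
    Source = Folder.Contents("YourDirectory"),
    #"Filtered Rows" = Table.SelectRows(Source, each [Extension] = ".xlsx"),
    // A workbook that fails (for example, no "Sheet1") yields null instead of an error cell
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Row1Col1", each try PullRow1Col1([Content]) otherwise null)
in
    #"Added Custom"
```

Whether null is the right fallback depends on the report; it hides the reason for the failure, so it may be worth filtering those rows out and inspecting them separately.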
I am trying to come up with a small function that will update my main query. There is one line in my code that is currently static, and I would like it to update as I add files to my folder. The number depends on the number of files in the folder: 1 - "Files in the Folder". Currently there are 14 files in my folder, so the line looks like this:
= Table.RenameColumns(#"Added LAST_FIRST_MID NAME",{{"1", "Report Count"}, {"-13", "Index"}})
So the number in "-13" needs to be a function of 1 - Count(Files in Folder). The problem is that I have no idea how to create a function from scratch. I tried creating a query with the following code and creating a function from that table.
let
Source = Folder.Files("P:\CREDIT DEPT\Credit Bureau - Analysis\BOSP CC DataFiles"),
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, -1)
in
#"Added Index"
This is my first attempt at trying to create a function, I think that I'm on the right path, but I'm not sure where to go from here. Thanks in advance for your help.
If you are trying to count the number of files and then use that value, try the code below.
Use Home > Advanced Editor to paste the first line in above the step where you intend to reference the result:
FileCount = Text.From(1 - Table.RowCount(Folder.Files("P:\CREDIT DEPT\Credit Bureau - Analysis\BOSP CC DataFiles"))),
#"Rename" = Table.RenameColumns(#"Added LAST_FIRST_MID NAME",{{"1", "Report Count"}, {FileCount, "Index"}})
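To see the arithmetic in isolation, here is a self-contained sketch that can be pasted into a blank query. The tables are stand-ins built with #table (three hypothetical files instead of the real folder), so none of the names below come from the original workbook.

```powerquery
let
    // Stand-in for Folder.Files(...): three hypothetical files
    Files = #table({"Name"}, {{"a.csv"}, {"b.csv"}, {"c.csv"}}),
    // 1 - 3 = -2, converted to text because column names are text
    FileCount = Text.From(1 - Table.RowCount(Files)),
    // Stand-in for the table whose columns are named "1" and "-2"
    Data = #table({"1", FileCount}, {{10, 20}}),
    Renamed = Table.RenameColumns(Data, {{"1", "Report Count"}, {FileCount, "Index"}})
in
    Renamed
```

With 14 files the same expression gives "-13", which matches the hard-coded column name in the question.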
I want to import multiple text files from a folder, each file containing two columns, into a single excel sheet, so that every new file starts in a new column. Ideally, I need the two columns from the first file and only the second column from every additional text file.
In powerquery, I tried to use the "Import From Folder (Import metadata and links about files in a folder)" functionality followed by query editor and expanding the binaries and the result was that every new file was appended at the end of the previous one. But I want every file to start a new column in the same sheet and I don't know how to do that.
How can I direct powerquery to do that?
Thanks in advance for your help!
My proposal includes 2 rather difficult steps added via the advanced editor, but it is dynamic with regard to the number of .txt files in the folder. I added a ton of comments so it should be self explanatory.
/* In this query, .txt files from a folder are combined.
Each source file has 2 columns.
The resulting table consists of both columns from the first file and each second column from the other files.
Tables are joined using each first column as key and with a left outer join
It is assumed that each file has column headers in the first row, that the first column header is the same for each file
and, preferably, the second column header differs per file, although this is not necessary.
This query is tested with the following file contents:
File1.txt:
ID,File1
1,A
2,B
3,C
4,D
File2.txt:
ID,File2
1,W
2,X
3,Y
Another file was added later on, to test for .txt files being added to the folder: works fine!
*/
let
// Standard UI:
Source = Folder.Files("C:\Users\Marcel\Documents\Forum bijdragen\StackOverflow Power Query\Multiple files in 1 folder"),
// Standard UI; step renamed
FilteredTxt = Table.SelectRows(Source, each [Extension] = ".txt"),
// Standard UI; step renamed
RemovedColumns = Table.RemoveColumns(FilteredTxt,{"Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
// UI add custom column "FileContents" with formula Csv.Document([Content]); step renamed
AddedFileContents = Table.AddColumn(RemovedColumns, "FileContents", each Csv.Document([Content])),
// Standard UI; step renamed
RemovedBinaryContent = Table.RemoveColumns(AddedFileContents,{"Content"}),
// In the next 3 steps, temporary names for the new columns are created ("Column2", "Column3", etcetera)
// Standard UI: add custom Index column, start at 2, increment 1
#"Added Index" = Table.AddIndexColumn(RemovedBinaryContent, "Index", 2, 1),
// Standard UI: select Index column, Transform tab, Format, Add Prefix: "Column"
#"Added Prefix" = Table.TransformColumns(#"Added Index", {{"Index", each "Column" & Text.From(_, "en-US"), type text}}),
// Standard UI:
#"Renamed Columns" = Table.RenameColumns(#"Added Prefix",{{"Index", "ColumnName"}}),
// Now we have the names for the new columns
// Advanced Editor: create a list with records with FileContents (tables) and ColumnNames (text) (1 list item (or record) per txt file in the folder)
// From this list, the resulting table will be built in the next step.
ListOfRecords = Table.ToRecords(#"Renamed Columns"),
// Advanced Editor: use List.Accumulate to build the table with all columns,
// starting with Column1 of the first file (Table.FromList(ListOfRecords{0}[FileContents][Column1], each {_}),)
// adding Column2 of each file for all items in ListOfRecords.
BuildTable = List.Accumulate(ListOfRecords,
Table.FromList(ListOfRecords{0}[FileContents][Column1], each {_}),
(TableSoFar,NewColumn) =>
Table.ExpandTableColumn(Table.NestedJoin(TableSoFar, "Column1", NewColumn[FileContents], "Column1", "Dummy", JoinKind.LeftOuter), "Dummy", {"Column2"}, {NewColumn[ColumnName]})),
// Standard UI
#"Promoted Headers" = Table.PromoteHeaders(BuildTable)
in
#"Promoted Headers"
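For readers new to List.Accumulate, a minimal sketch of the same folding pattern on plain numbers may help. It takes a list, a starting value, and a function that combines the running state with the current item; the parameter names state and current are arbitrary.

```powerquery
let
    // Starts at 0, then adds 1, 2 and 3 in turn
    Total = List.Accumulate({1, 2, 3}, 0, (state, current) => state + current)
in
    Total   // 6
```

In the query above, the starting value is the one-column table built from the first file, TableSoFar plays the role of state, and NewColumn is the current record holding one file's table and its temporary column name.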