I am querying a folder containing many Excel files that all have the same structure. I want to reference a row that is always in the same place (row 5) on the same sheet in every file.
How can I do that? There is no reference point, such as a certain word, that I could filter for; I just need row 5. The row is sometimes empty, sometimes partially filled, and sometimes completely filled in. I need it in all three states.
Can anyone help me?
Thanks!
= Table.FromRecords(List.Transform(Folder.Files("YourFolderPath")[Content],each Excel.Workbook(_){0}[Data]{4}))
Maybe this approach will help.
I'm assuming you selected your query's source folder path and clicked Combine & Edit...
and picked a Sample file Parameter sheet and clicked OK to combine files...
So you'd see your appended worksheets result...something like this below. Note that my different workbooks in this example contain your different conditions for the 5th rows--empty, partially or completely filled.
All I think you really need to do from here is to add an index in the "Transform Sample File from..." query--in this example, it is the "Transform Sample File from Test" query since Test is what my folder was named and therefore what my query name was defaulted as.
Just select the "Transform Sample File from..." query, then Add Column > Index Column.
Then, when you click back on your main query (Test, in this example), you will see the index numbers, which you can use along with the Source.Name values for easy reference.
For instance, since the default index starts at 0, you could filter for Index value 4 (row 5 of each worksheet) to see:
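For anyone who prefers the Advanced Editor, the index-and-filter idea can be sketched roughly as below. This is only a sketch under assumptions: "YourFolderPath" is the placeholder from the formula above, the files are assumed to be .xlsx workbooks, the first item returned by Excel.Workbook is assumed to be the sheet you want with its data starting at row 1, and the step names (Workbooks, Row5PerFile and so on) are made up for illustration.

```powerquery
let
    // Placeholder path; replace with your own folder.
    Source = Folder.Files("YourFolderPath"),
    // Assumption: the files of interest are .xlsx workbooks.
    Workbooks = Table.SelectRows(Source, each Text.Lower([Extension]) = ".xlsx"),
    // For each file: open the first item in the workbook, number its rows from 0,
    // keep the row with index 4 (worksheet row 5), and tag it with the file name.
    Row5PerFile = Table.AddColumn(Workbooks, "Row5", each
        let
            FileName = [Name],
            Sheet = Excel.Workbook([Content]){0}[Data],
            Indexed = Table.AddIndexColumn(Sheet, "Index", 0, 1),
            Row5 = Table.SelectRows(Indexed, (r) => r[Index] = 4)
        in
            Table.AddColumn(Row5, "Source.Name", (r) => FileName)),
    // Stack the single-row tables from all files into one table.
    Combined = Table.Combine(Row5PerFile[Row5])
in
    Combined
```

Because the row is picked by position rather than by content, it should come through whether its cells are filled or null. If a sheet has fewer than five rows of data, Table.SelectRows returns no row for that file instead of raising an error.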
I am learning Alteryx and have run into my first issue. I have an Excel file that I am using as one source. The file has two sheets with the same data, but the second sheet does not have headers.
I wanted to see if there was a way to combine the two sheets into one, within Alteryx using column position instead of headers since the second does not have them. Any help is very much appreciated.
Yes, both their Join (https://help.alteryx.com/20213/designer/join-tool) and Union (https://help.alteryx.com/20213/designer/union-tool) tools have a "Record Position" option which is exactly what you're requesting. See the links for details.
You have to input the file twice, once for each sheet.
For the second sheet, make sure to select the option stating that the first row contains data.
Then you can use the Union tool --> Auto Config by position --> Set a specific order (Check). See image links below.
First Row Contains Data
Union Tool Configuration
Sheet 1 Example Input
Sheet 2 Example Input
Output
I'm working with hundreds of .txt files and I have to combine them into a single .csv file.
Above is a sample of the text file format (there are only 2 columns, but hundreds more rows).
I'm required to first transpose the contents of each .txt file, and then merge all the results into one table, where they all share a common header row (the column of 31330_at, 31385_at, 31463_s_at etc).
This is my first time working with Power Query and I'm not entirely sure how to do this. I've tried importing all the files and transposing them all at once, but it doesn't work.
let
Source = Folder.Files("Directory"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".TXT")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Content", "Name"}),
#"Invert" = Table.TransformColumns(#"Removed Other Columns", {{"Content", each Table.Transpose(_)}}),
.....
I've tried the code above, but it fails at the #"Invert" step with the error "Expression.Error: We cannot convert a value of type Binary to type Table."
For reference, it's the same concept as this link https://stackguides.com/questions/57805673/how-to-transpose-multiple-csv-files-and-combine-in-excel-power-query
How do I fix this?
It looks like you already know how to pull the files from the folder, as shown by your Source = Folder.Files("Directory").
After combining the Binary files, you probably see something like this on your screen, where all of the txt files are appended one after the other:
But that's not what you want, right? You want the files appended based upon a transposed view of each file. I understand from your description above that the first column of each file contains the same information as the first column of every other file, and that you want that first column's information to be used as the header for the information that is initially listed in column 2 of each file but will be transposed into appended rows.
It looks like you are trying to do your transposing in the query that is generated and listed under Other Queries (probably called Directory for you).
Don't do the transposing there. Instead, look for the query called Transform Sample File, which should be listed under Helper Queries, and do the transposing in it.
Click on the query named Transform Sample File.
Then click Transform -> Transpose, to transpose your table.
Then click the ribbon button for Use First Row as Headers, to make the first row your headers.
Then click on that earlier query that is listed under Other Queries (probably called Directory for you)
...and you will see this error message:
This error occurs because the final step of the query is trying to change types using the old column names. So look to the right side of your screen and delete the Changed Type step by clicking on the X before Changed Type. (If you need to, you can change column types later, for the columns that need it.)
Then you should see what I understand you are wanting to see as a result.
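The same result can also be sketched as a single hand-written query. The original error arises because the Content column holds each file as a Binary, while Table.Transpose expects a table, so each Binary has to be parsed into a table first. The sketch below assumes the files are tab-delimited with no header row and that the first column holds the common labels; "Directory" is the placeholder path from the question, and the step names are made up for illustration.

```powerquery
let
    // Placeholder path from the question; replace with your own folder.
    Source = Folder.Files("Directory"),
    // Compare the extension case-insensitively so both .txt and .TXT match.
    TxtFiles = Table.SelectRows(Source, each Text.Upper([Extension]) = ".TXT"),
    // Parse each Binary into a table (assumption: tab-delimited text),
    // transpose it, then promote the first row to column headers.
    Parsed = Table.AddColumn(TxtFiles, "Data", each
        Table.PromoteHeaders(
            Table.Transpose(
                Csv.Document([Content], [Delimiter = "#(tab)"])))),
    // Append the transposed tables from all files into one table.
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```

If your files use a different delimiter, change the Delimiter value accordingly.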
Before I explain: I am not providing any data yet, because I first want to know whether this is possible (and whether it is easy).
Imagine you have one folder containing X subfolders, each holding twenty text files (.txt) with the same structure and length. What I normally do is load one subfolder via Data > From File > From Folder in Excel, apply certain transformations, and save the result in an Excel file. I repeat that X times, once per subfolder. It is not very time-consuming, because I know how to change the path in the Advanced Editor, and the Refresh button works smoothly. BUT...
Suppose I want a drop-down list of those subfolders, so that whenever I change the selection and press Refresh, my data set is refreshed within a minute. How can I do that? With parameters or a function in Power Query?
That would mean I could avoid opening the Power Query editor or changing the source manually...
Any ideas or suggestions?
You can create a named range in Excel which is just a cell with the subfolder name. Using data validation you can make that cell into a dropdown based on a list you define in a separate range.
Once you've done that, you can load that named range into Power Query and insert it as part of the folder path as in this question related to using a cell value in a query URL.
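A minimal sketch of that idea follows. The named range name SubfolderName and the parent path C:\Data\ are hypothetical and only there for illustration; the sketch also assumes the named range is a single cell containing the subfolder name as text.

```powerquery
let
    // Read the single-cell named range from the current workbook.
    // "SubfolderName" is a hypothetical name; a named range without headers
    // is returned as a table whose first column is called Column1.
    SubfolderName = Text.From(
        Excel.CurrentWorkbook(){[Name = "SubfolderName"]}[Content]{0}[Column1]),
    // Hypothetical parent folder that contains all the subfolders.
    ParentFolder = "C:\Data\",
    // Build the folder path from the dropdown value and list its files.
    Source = Folder.Files(ParentFolder & SubfolderName)
in
    Source
```

After changing the dropdown, pressing Refresh re-evaluates the query against the newly selected subfolder, and the existing transformation steps can follow the Source step unchanged.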
I have an Excel file with two sheets. The second sheet (Report) contains data validation cells based on the first sheet (Data). On the second sheet, the drop-down list displayed in Select XXX depends on the selection in Generate Report. When Generate Report is set to anything beyond the first five items in its list, Select XXX displays Year as the default list (no problem with this) via the code ...INDIRECT("Year").... The problem is that Excel does not allow me to add more code (it seems I hit the limit). The question is: how can I change this code to accommodate every option in Generate Report? Or is there another method I could use?
The data validation source code for the drop-down list is =IF($B$4=Data!$Q$5,INDIRECT("Client"), IF($B$4=Data!$Q$6,INDIRECT("Month"), IF($B$4=Data!$Q$7,INDIRECT("Product_Service"), IF($B$4=Data!$Q$8,INDIRECT("Sector"), IF($B$4=Data!$Q$9,INDIRECT("Trans_Type"),INDIRECT("Year"))))))
Please, see the sample file at https://drive.google.com/file/d/1VKkGHjlJzLQqx4J9kyd_bCKG4r0Q7HkG/view?usp=sharing
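What you could do is put the range names in column R, next to the matching values in column Q (for example, Client in R5, Month in R6, Product_Service in R7, Sector in R8 and Trans_Type in R9), and VLOOKUP them: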
=IFERROR(INDIRECT(VLOOKUP($B$4,Data!$Q$5:$R$9,2,FALSE)),INDIRECT("Year"))
You could then have as many item lists as you wish.
I have several Excel sheets that all have the same structure. From each Excel sheet I need two tables.
I know how to use Power Query to combine two tables of the same file and I know how to combine several files within one folder.
But I do not know how to set up the query such that Power Query first combines the two tables of one file and then repeats this step for all files of the folder such that I get as a result both tables of all files combined in one table.
Any suggestions or hints?
Thank you!
If you had posted some code, it would be easier to answer. I assume you mean:
I have several Excel workbooks (not sheets) that all have the same structure. From each Excel workbook I need two tables.
because you go on to say:
...combines the two tables of one file and then repeats this step for all files of the folder such that I get as a result both tables of all files ...
Below is what I think you're trying to achieve.
Say I have two tables (see blue table and yellow table below) in some Excel workbook, which I want to combine.
I can combine my two tables using some code like below (provided I load the query in an Excel workbook different to the Excel workbook containing the tables):
let
someExcelFile = Excel.Workbook(File.Contents("C:\someFolder\Book1.xlsx")),
firstTable = someExcelFile{[Name="Table1"]}[Data], // Or however you're getting your first table.
secondTable = someExcelFile{[Name="Table2"]}[Data], // Or however you're getting your second table.
combineTwoTables = Table.Combine({firstTable, secondTable}) // Or however you're combining the two tables.
in
combineTwoTables
(I presume you have something like the above. In the code above I'm identifying the tables by their name, but you might be identifying them in some other way.)
Then say I also have multiple Excel workbooks in some folder, each containing two tables (just like the ones shown above) which also need extracting/combining.
In order to use the above on every Excel workbook in my folder, one approach might be to change it into a function which accepts any file as an argument. Something like:
let
CombineTwoTablesInSomeExcelFile = (someFile as binary) as table =>
let
someExcelFile = Excel.Workbook(someFile),
firstTable = someExcelFile{[Name="Table1"]}[Data], // Or however you're getting your first table.
secondTable = someExcelFile{[Name="Table2"]}[Data], // Or however you're getting your second table.
combineTwoTables = Table.Combine({firstTable, secondTable}) // Or however you're combining the two tables.
in combineTwoTables
in
CombineTwoTablesInSomeExcelFile
The reason the function accepts a binary type argument is that Folder.Files (used below to access files in a folder) returns a column containing each file (in that folder) as a binary. In other words, this is convenient and we can pass the values in that column directly into our function (hope that makes sense).
To call the function against all files in a folder and combine the results into one table, we can use something like:
let
CombineTwoTablesInSomeExcelFile = (someFile as binary) as table =>
let
someExcelFile = Excel.Workbook(someFile),
firstTable = someExcelFile{[Name="Table1"]}[Data], // Or however you're getting your first table.
secondTable = someExcelFile{[Name="Table2"]}[Data], // Or however you're getting your second table.
combineTwoTables = Table.Combine({firstTable, secondTable}) // Or however you're combining the two tables.
in combineTwoTables,
filesInFolder = Folder.Files("C:\someFolder\"), // Change to whatever the folder is on your computer.
relevantFiles = Table.SelectRows(filesInFolder, each List.Contains({".xlsx"}, [Extension])),
invokedFunction = Table.AddColumn(relevantFiles, "toCombine", each CombineTwoTablesInSomeExcelFile([Content]), type table),
combinedAllTables = Table.Combine(invokedFunction[toCombine])
in
combinedAllTables
Some points in closing:
I tried to filter the files in the folder to only include certain file extensions. In your case, you may need to add file extensions to the list.
There is no error handling implemented within the function. So if you pass the function a file which is not an Excel file containing two tables (or if the steps required to extract the two tables differ from the steps/logic in the function), then you will likely get an error. (You should also not include the workbook containing this query in the folder.)
Here's another way.
Open Excel.
Click Data > New Query > From File > From Folder.
Browse to and select the folder that has the Excel files you want to work with. Once the folder is listed in the Folder Path text input box, click OK.
Click Transform Data.
Here, you can filter the information in the columns to restrict the files to only the ones you want to work with.
Click the Combine Files button at the top right of the Content column.
Select the first table listed and click OK.
Click on the query "Transform Sample File from Folder" to open it for editing. This is where all the transformations that will be applied to every file are worked out.
Select the Navigation applied step then Transform > Detect Data Type.
This detection of data types step is needed between the Navigation step, which brought in the first table you'll use, and the next step, which will bring in the second table. Without a step between these two steps, working with the graphical user interface, the second table would just replace the first in the Navigation Step. I'm not sure why, but it does.
To make sure you understand what you see below, my tables were named Table1 and Table2 in all of the spreadsheets. In the first spreadsheet, each table had entries using the following convention:
T(for table) # C(for column) # R(for row) #
So for Spreadsheet1, Table1, Column1:
T1C1R1
T1C1R2
...
Select the Navigation applied step again, and then copy what is in the formula bar.
Click the fx to the left of the formula bar and paste what you just copied over what appears in the formula bar. In other words, replace what appears with what you just copied. Then replace the table name listed after Item= with the second table's name and press enter.
Now you can combine the two tables: the one initially brought in at the Navigation step, and the one brought in later (in my example, the Applied Step where the second table was brought in is called Custom1). I would use the versions of each where the type was changed, so Changed Type for the first table and Changed Type1 for the second.
So now, to append Table1 and Table2:
Click on the Changed Type1 Applied Step, then Home > Append Queries, and select Transform Sample File from Folder (Current) from the dropdown, then click OK.
Then, in the formula bar, change the first #"Changed Type1" to #"Changed Type" and press enter.
Now go back to the original query. Mine was called folder. You'll see that all of your spreadsheets, each with its two tables appended, have been appended to each other.
Just so you understand what you're seeing in the completely appended listing above, for the other spreadsheets, I added a file number. I used the convention:
F(for file) # T(for table) # C(for column) # R(for row) #
So for Spreadsheet2, Table1, Column1:
F2T1C1R1
F2T1C1R2
...