I have multiple similar data files with a large overlap of similar rows. I'd like to combine them so that a given column from each set appears in a distinct column in a new table. Essentially this is very similar to a standard pivot table where the source is a column field and the values of the field are those of the original files where present.
So for 2 source files:
File 1
and File 2
I'd like to end up with:
So all the common data is in the row and there is one column for each file containing the "Status" (or blank if that row isn't present).
I want to have this as a data source that I can then pivot. Is this possible? I know how to combine the files into a single source using Get Data -> From Folder and I know I can then pivot that data, but it doesn't get me to the final solution above.
Assuming you have two separate queries bringing in the data from the two source files listed above, the first step is to add a 'File' column to each of them, e.g. Table.AddColumn(#"Previous Step", "File", each "File 1", type text) for the first query (use "File 2" for the second, and so on), so that you end up with:
and
Then append the two adjusted tables to give you this:
Select the 'File' column, go to Transform => Pivot Column, and in the pivot window choose 'Status' as the Values column and 'Don't Aggregate' as the Aggregate Function.
This gives you your desired result.
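For anyone who wants to see the logic outside the Power Query UI, here is a minimal plain-Python sketch of the same three steps (tag each row with its file, append, pivot without aggregating). The sample rows and the `ID`/`Name` columns are made up for illustration, and `pivot_by_file` is a hypothetical helper, not a Power Query function.

```python
# Hypothetical sample data; the real columns come from your own files.
file1 = [
    {"ID": 1, "Name": "Alpha", "Status": "Open"},
    {"ID": 2, "Name": "Beta", "Status": "Closed"},
]
file2 = [
    {"ID": 2, "Name": "Beta", "Status": "Open"},
    {"ID": 3, "Name": "Gamma", "Status": "Closed"},
]


def pivot_by_file(sources, key_columns, value_column="Status"):
    """Append the rows of every file, then spread value_column into one column per file."""
    file_names = list(sources)
    pivoted = {}
    seen = set()
    for file_name, rows in sources.items():      # file_name plays the role of the 'File' column
        for row in rows:
            key = tuple(row[c] for c in key_columns)
            if (key, file_name) in seen:
                # Without an aggregate function, duplicate rows cannot be resolved.
                raise ValueError(f"duplicate row {key} in {file_name}")
            seen.add((key, file_name))
            if key not in pivoted:
                # Common columns first, then one blank (None) column per file.
                pivoted[key] = dict(zip(key_columns, key))
                pivoted[key].update({name: None for name in file_names})
            pivoted[key][file_name] = row[value_column]
    return list(pivoted.values())


result = pivot_by_file({"File 1": file1, "File 2": file2}, ["ID", "Name"])
for row in result:
    print(row)
```

Rows that exist in only one file keep `None` in the other file's column, which corresponds to the blank cells in the desired output.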
The data I am using is split into 3 sheets (STORE1, STORE2, STORE3, which is not ideal) and is not formatted as a table either.
I'm looking for a way in Power Query to turn the multiple columns that hold the sales per month per product into a single column containing all the months (if that isn't clear, the screenshots should help).
Basically transforming this :
Initial Sheets
into this :
Final Sheet
The "products" are linked to "type" via another table, and I would also like to have "store number" in the new table format as a header instead of having different sheets
As of now, I haven't found a way to do it.
Thanks already for the help,
Unpivot is the command you are looking for.
Make a query for each sheet. The only operation you have to do in each one is to add a custom column (Add Column -> Custom Column): in the formula write just 1 for the Store 1 query, 2 for the Store 2 query, etc., then rename the new column to "store number".
Append the three queries.
Select all the columns except the ones from January to December and click Transform -> drop-down menu under Unpivot -> Unpivot Other Columns.
I am not sure what you mean when you say that the "products" are linked to "type" via another table, but if you mean that you have a lookup table with one column that matches "products" and another that holds the "type", then you can merge it with your query. I would suggest making sure you don't have any duplicates in your lookup table.
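As a rough illustration of what these steps do to the data, here is a small plain-Python sketch: add the store number, append the sheets, then unpivot every column that is not a key column. The sample figures, the shortened month list and the helper names are assumptions made for the example.

```python
# Hypothetical, shortened sample of two store sheets (only two months shown).
store1 = [{"Product": "A", "January": 10, "February": 12}]
store2 = [{"Product": "A", "January": 7, "February": None}]


def add_store_number(rows, store_number):
    """Equivalent of the custom column: the same constant on every row."""
    return [{**row, "Store number": store_number} for row in rows]


def unpivot_other_columns(rows, keep, attribute="Month", value="Sales"):
    """Turn every column not listed in keep into (attribute, value) rows."""
    out = []
    for row in rows:
        kept = {column: row[column] for column in keep}
        for column, cell in row.items():
            if column in keep or cell is None:
                continue  # key columns stay put; empty cells produce no row
            out.append({**kept, attribute: column, value: cell})
    return out


appended = add_store_number(store1, 1) + add_store_number(store2, 2)
long_rows = unpivot_other_columns(appended, keep=["Product", "Store number"])
for row in long_rows:
    print(row)
```

Because the key columns are the ones named explicitly, extra month columns that appear later are picked up without changing the code, which is the same reason for choosing Unpivot Other Columns in the UI.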
I have a set of excel files inside ADLS. The format looks similar to the one below:
The first 4 rows would always be the document header information and the last 3 will be 2 empty rows and the end of the document indicator. The number of rows for the employee information is indefinite. I would like to delete the first 4 rows and the last 3 rows using ADF.
Can anyone help me with what the expressions in the Derived Column / Select should be?
My Excel file:
Source dataset settings (enter A5 as the range and select first row as header):
SourceDataSetProperties
Make sure to refresh schema in the source data set.
Schema
After the schema refresh, if you preview the source data, you will see all rows from row 5 onward. This includes the footer too, which we can filter out in the data flow.
Next, add a filter transformation with the expression below:
!startsWith(sno,'dummy') && sno!=''
This filters out the rows starting with 'dummy' (in your case, the end-of-document indicator). The empty rows are also dropped by the check sno!=''.
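The same idea (start reading at row 5, treat that row as the header, then drop blank rows and the footer marker) can be sketched in plain Python. This is only an illustration of the logic, not ADF code; the sheet contents, the `sno` column and the `dummy` footer prefix are sample values.

```python
# Hypothetical sheet: 4 header rows, the table, 2 empty rows, then the footer marker.
sheet = [
    ["Report", "", ""],
    ["Generated", "2021-01-01", ""],
    ["Department", "HR", ""],
    ["", "", ""],
    ["sno", "name", "dept"],
    ["1", "Ann", "HR"],
    ["2", "Bob", "IT"],
    ["", "", ""],
    ["", "", ""],
    ["dummy end of document", "", ""],
]


def trim_sheet(rows, header_rows=4, key="sno", footer_prefix="dummy"):
    """Skip the document header, then filter out empty rows and the footer."""
    header, *body = rows[header_rows:]           # like setting the range to A5
    records = [dict(zip(header, row)) for row in body]
    # Same condition as the filter: !startsWith(sno,'dummy') && sno!=''
    return [r for r in records if r[key] != "" and not r[key].startswith(footer_prefix)]


trimmed = trim_sheet(sheet)
for record in trimmed:
    print(record)
```

Filtering on content rather than on a fixed row count is what lets the number of employee rows vary.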
Final Preview after filter:
How about this? Under the 'Source' tab, choose the number of lines you want to skip.
How can I convert a table with multiple columns into a two-column table in Excel? As shown in the image, the two-column table should be based on the first column of the multi-column table.
use these steps
select the data,
get & transform tab on data menu
from table
data will be opened in power query window
select columns 2 to column 5, at once
right click, use unpivot
save and load
see the following screenGIFs
Edit in view of comments below
To keep null values in the output (unpivot drops them), you can use the following workaround:
replace the null values with a placeholder value that does not otherwise occur in the data, say -
then unpivot
see the following GIF
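To make the effect of the workaround concrete, here is a small plain-Python sketch that contrasts the two behaviours: empty cells dropped during the unpivot, and empty cells kept by substituting a placeholder first. The sample rows are invented, and the sketch assumes the attribute (original column name) column is discarded so that only two columns remain.

```python
# Hypothetical wide table; None stands for an empty cell.
wide = [
    {"Item": "A", "Col2": 1, "Col3": None},
    {"Item": "B", "Col2": None, "Col3": 4},
]


def to_two_columns(rows, key, placeholder=None):
    """Unpivot every column except key into (key value, cell value) pairs.

    With placeholder=None, empty cells are dropped. With a placeholder,
    empty cells are replaced first and therefore survive.
    """
    pairs = []
    for row in rows:
        for column, cell in row.items():
            if column == key:
                continue
            if cell is None:
                if placeholder is None:
                    continue
                cell = placeholder
            pairs.append((row[key], cell))
    return pairs


dropped = to_two_columns(wide, "Item")
kept = to_two_columns(wide, "Item", placeholder="-")
print(dropped)
print(kept)
```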
I need to transform Excel files to ESRI FileGDB using FME.
The problem is that my Excel worksheets contain more than one table.
Example: At row 1, I have the attributes of the first table. Rows 2 to 4 contain the values.
At row 6 I have the attributes of the second table. The next 45 rows are the values.
And the same thing for the third table.
These rows can change. I could have the attributes of the second table at any row.
I think the best solution would be to have a process that splits the .xls file into three different files so I can transform them directly into ESRI format.
Is there a transformer that could perform this task or should I code it myself in Python?
PS: This process will be called from a REST service, so I can't do this manually. Also, the column names will always be the same.
Thanks
FME reads the Excel rows in order, so I would add a Counter transformer after reading the Excel file.
The column names don't change, so you could check at which row (number given by the Counter) the new table begins.
Then it is just a matter of filtering the features with a TestFilter.
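If you did end up scripting it, the same approach (number the rows, find the rows that match the known column names, then assign everything in between to that table) might look like the plain-Python sketch below. It does not use any FME API; the sample rows, table names and the assumption that every table has a distinct header row are illustrative only.

```python
# Hypothetical worksheet rows; "" marks an empty cell.
rows = [
    ["id", "name", ""],
    ["1", "a", ""],
    ["2", "b", ""],
    ["", "", ""],
    ["code", "x", "y"],
    ["c1", "1.0", "2.0"],
]
# Known, fixed column names per table (assumed to differ between tables).
headers = {"first": ("id", "name"), "second": ("code", "x", "y")}


def non_empty(row):
    """Cells of a row without blanks, used to recognise header rows."""
    return tuple(cell for cell in row if cell not in ("", None))


def split_tables(rows, headers):
    lookup = {tuple(columns): name for name, columns in headers.items()}
    # Pass 1: number the rows (the Counter step) and note where each header sits.
    starts = [(i, lookup[non_empty(row)])
              for i, row in enumerate(rows) if non_empty(row) in lookup]
    # Pass 2: rows between one header and the next belong to that table.
    tables = {}
    for position, (start, name) in enumerate(starts):
        end = starts[position + 1][0] if position + 1 < len(starts) else len(rows)
        columns = headers[name]
        tables[name] = [dict(zip(columns, row))
                        for row in rows[start + 1:end] if non_empty(row)]
    return tables


tables = split_tables(rows, headers)
for name, records in tables.items():
    print(name, records)
```

Because the header positions are discovered at run time, the tables can start at any row, which matches the requirement that the layout may change.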
I am trying to integrate several tables across multiple worksheets, but all in one workbook. I am currently using Power Query to get data from tables on all the sheets to appear in an overview on the first sheet.
For example, consider the following:
Table 1 -
Date Time Note
01/02/03 13:59 First entry
03/04/05 08:36 Second entry
Table 2 -
Date Time Type
02/03/04 19:19 Cold
06/07/08 07:22 Hot
Overview -
Date Time Entries
01/02/03 13:59 First entry
02/03/04 19:19 Cold
03/04/05 08:36 Second entry
04/05/06 07:22 Hot
I am currently able to merge columns together (though I am having trouble when merging columns containing numbers with columns containing text...), as can be seen under "Entries" in the Overview table.
What I would like to do is be able to add another column based on the source for each row in the Overview table.
This would look like:
Overview -
Date Time Entries Source
01/02/03 13:59 First entry Table 1
02/03/04 19:19 Cold Table 2
03/04/05 08:36 Second entry Table 1
04/05/06 07:22 Hot Table 2
Additionally, it would be nice if the rows sourced from Table 1 could be in red, while the rows sourced from Table 2 could be in blue.
Is there a way I can use Power Query to format the individual cell contents, as well as entire rows based on the source of entries?
If the tables have the same structure, you can use Append, rather than Merge. Before appending, set the columns to the same data type. I don't quite see how there are any numbers in your text columns, though.
In Power Query:
create a query from Table 1
add a column called "Source" with the formula ="Table 1"
rename "Note" to "Entries" and set it as type "text"
Save the query as a connection only
create another query from Table 2
add a column called "Source" with the formula ="Table 2"
rename "Type" to "Entries" and set it as type "text"
append the query from above
sort as desired
save and load to the workbook
In the resulting worksheet use conditional formatting for coloring based on the value in the Source column.
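As a sketch of what the append produces, here is the same sequence in plain Python using the sample rows from the question: tag each table with its source, rename the differing column to "Entries" as text, append, and sort. `tag_and_rename` is a hypothetical helper, and sorting the dates as plain strings is an assumption that happens to work for this sample; real dates should be parsed first.

```python
table1 = [
    {"Date": "01/02/03", "Time": "13:59", "Note": "First entry"},
    {"Date": "03/04/05", "Time": "08:36", "Note": "Second entry"},
]
table2 = [
    {"Date": "02/03/04", "Time": "19:19", "Type": "Cold"},
    {"Date": "06/07/08", "Time": "07:22", "Type": "Hot"},
]


def tag_and_rename(rows, source, entry_column):
    """Give both tables the same columns and record where each row came from."""
    return [
        {
            "Date": row["Date"],
            "Time": row["Time"],
            "Entries": str(row[entry_column]),   # same data type in both tables
            "Source": source,
        }
        for row in rows
    ]


overview = sorted(
    tag_and_rename(table1, "Table 1", "Note") + tag_and_rename(table2, "Table 2", "Type"),
    key=lambda row: (row["Date"], row["Time"]),  # string sort, fine for this sample only
)
for row in overview:
    print(row)
```

The colouring itself stays outside the query: the "Source" value is just data, and the worksheet's conditional formatting decides how each row looks.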