Excel Power Query: Add columns from another table

Need some guidance, please. I have two Power Query tables in Excel, and I'm looking to add columns from table B to table A where the Customer# matches. I don't want to use VLOOKUP formulas for performance reasons, so I was wondering whether this is possible with Power Query.
Here is an example:
Thanks a lot!

The M code for merging TableA with TableB would look like this:
let
Source = Excel.CurrentWorkbook(){[Name="TableA"]}[Content],
chgType = Table.TransformColumnTypes(Source,{{"Customer Number", type text}, {"Certification", type text}}),
mergeQueries = Table.NestedJoin(chgType, {"Customer Number"}, TableB, {"Customer Number"}, "TableB", JoinKind.LeftOuter),
extendTbl = Table.ExpandTableColumn(mergeQueries, "TableB", {"Crd Limit"}, {"Crd Limit"})
in
extendTbl
You need to import TableB into Power Query beforehand as well, because the merge step refers to it as a query named TableB.
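For reference, the TableB query that the merge step relies on could be as small as the sketch below. It assumes TableB is an Excel table with the columns "Customer Number" and "Crd Limit" and that the query is saved under the name TableB; adjust the names to your workbook.

```powerquery
let
    // read the Excel table named TableB (assumed name)
    Source = Excel.CurrentWorkbook(){[Name="TableB"]}[Content],
    // type the key as text so it matches the key column type used for TableA
    chgType = Table.TransformColumnTypes(Source, {{"Customer Number", type text}, {"Crd Limit", Currency.Type}})
in
    chgType
```

Typing the key column the same way on both sides matters: a text "1001" does not match a numeric 1001 in a merge.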
Further reading on this
Microsoft documentation
Excel Guru Blog

I think you can get what you want by doing a join with JoinKind.Inner.
Note that this will return only the customer IDs that are present in both tables. If you also want unmatched IDs to be returned, you'll need to do a Table.NestedJoin with JoinKind.FullOuter and then expand the resulting table column.
E.g.:
let
Source1 = Excel.CurrentWorkbook(){[Name="TableA"]}[Content],
tabA = Table.TransformColumnTypes(Source1,{{"Customer Number", type text}, {"Certification", type text}}),
Source2 = Excel.CurrentWorkbook(){[Name="TableB"]}[Content],
tabB = Table.TransformColumnTypes(Source2,{{"Customer Number", type text}, {"Crd Limit", Currency.Type}}),
joinTbl = Table.Join(tabA,"Customer Number",tabB,"Customer Number",JoinKind.Inner)
in
joinTbl
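The full outer variant mentioned above might look roughly like this. It is only a sketch under the same assumptions as the inner join (tables TableA and TableB, key column "Customer Number"); the names "Customer Number B" and "Customer" are made up here for illustration.

```powerquery
let
    Source1 = Excel.CurrentWorkbook(){[Name="TableA"]}[Content],
    tabA = Table.TransformColumnTypes(Source1, {{"Customer Number", type text}, {"Certification", type text}}),
    Source2 = Excel.CurrentWorkbook(){[Name="TableB"]}[Content],
    tabB = Table.TransformColumnTypes(Source2, {{"Customer Number", type text}, {"Crd Limit", Currency.Type}}),
    // nested full outer join: keeps unmatched rows from both tables
    nested = Table.NestedJoin(tabA, {"Customer Number"}, tabB, {"Customer Number"}, "TableB", JoinKind.FullOuter),
    // expand TableB's key under a new (hypothetical) name so it does not clash with TableA's key
    expanded = Table.ExpandTableColumn(nested, "TableB", {"Customer Number", "Crd Limit"}, {"Customer Number B", "Crd Limit"}),
    // take the key from whichever side has it
    withKey = Table.AddColumn(expanded, "Customer", each if [Customer Number] <> null then [Customer Number] else [Customer Number B], type text),
    result = Table.RemoveColumns(withKey, {"Customer Number", "Customer Number B"})
in
    result
```

Rows that exist only in TableB end up with a null Certification, and rows that exist only in TableA end up with a null Crd Limit.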

Related

Power Query - Nested IF function

I am currently working in Power Query and trying to determine a specific field to output based on multiple conditions.
I have a table as follows:
What we can see is that we have multiple rows per order and 2 status keys.
I need to determine the MAX Date per item based on this logic:
As we can see, each type has a specific key combination to get the MAX Date.
The result would be like this:
Order 100 is 04.02.2021
Is there any alternative to a very long IF Clause?
There are 7 Product types in total each with specific key combinations.
How would you do that in Power Query?
Just Join the two tables on the relevant keys:
let
//read original data and set data types
Source = Excel.CurrentWorkbook(){[Name="Table10"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Order", Int64.Type}, {"Product Type", type text},
{"Status Key 1", Int64.Type}, {"Status Key 2", Int64.Type},
{"Date", type date}}),
//specification table
//hard coded here but
// could read it in from Excel instead
spec = Table.FromColumns({
{"Metals","Pipes"},
{80,50},
{30,30}},
type table[Material=text, Key1 = Int64.Type, Key2=Int64.Type]),
//now just do the join and remove the unneeded columns
join = Table.Join(#"Changed Type",{"Product Type","Status Key 1","Status Key 2"}, spec,{"Material","Key1","Key2"},JoinKind.RightOuter),
#"Removed Columns" = Table.RemoveColumns(join,{"Material", "Key1", "Key2"})
in
#"Removed Columns"
Note: Depending on what you want to happen when a specification is not matched at all, or is matched multiple times, you may need to change the type of join.
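If the end goal is a single MAX Date per order, a Table.Group step can follow the join. The sketch below is self-contained and uses made-up sample rows in place of Table10; the column names are the ones from the query above, and "Max Date" is a hypothetical output name.

```powerquery
let
    // made-up sample rows standing in for Table10
    data = #table(
        type table [Order = Int64.Type, #"Product Type" = text, #"Status Key 1" = Int64.Type, #"Status Key 2" = Int64.Type, Date = date],
        {
            {100, "Metals", 80, 30, #date(2021, 2, 1)},
            {100, "Metals", 80, 30, #date(2021, 2, 4)},
            {100, "Metals", 10, 30, #date(2021, 3, 1)},
            {200, "Pipes", 50, 30, #date(2021, 1, 15)}
        }),
    // specification table: the key combination that counts for each product type
    spec = #table(
        type table [Material = text, Key1 = Int64.Type, Key2 = Int64.Type],
        {{"Metals", 80, 30}, {"Pipes", 50, 30}}),
    // inner join keeps only rows whose key combination matches a specification
    joined = Table.Join(data, {"Product Type", "Status Key 1", "Status Key 2"}, spec, {"Material", "Key1", "Key2"}, JoinKind.Inner),
    // latest date per order among the matching rows
    maxPerOrder = Table.Group(joined, {"Order"}, {{"Max Date", each List.Max([Date]), type nullable date}})
in
    maxPerOrder
```

With an inner join, the third sample row (key 10/30) drops out before grouping, so it cannot influence the maximum.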

Restructuring a table with PowerQuery

I am taking my first steps in Power Query, so here's my problem. I have a raw data table which lists countries and certain products. For each product there is the "market" value followed by a MyValue (meaning my own sales of that product in that country). An example here:
raw table
What I was trying to obtain with PowerQuery is a table that unpivots the products category and leaves two columns, one for Market and one for MyValue.
I tried in many ways and the closest to the result I could get was splitting the original table in two, one for the Market and one for MyValues. Then unpivot each one of them in PowerQuery so that I could get them in this way:
Market
And
MyValue
I then tried to merge the two tables but couldn't work it out. Of course I could do that manually, but I'm sure there is a way to do it with Power Query, either by splitting into 2 tables, unpivoting and then merging or, even better, with a single query.
The result I'm aiming at is like
Desired Result
You are close.
After you unpivot, you need to create a custom column that you can pivot on, and also modify the names in the resultant "attribute" column.
Read the comments in the code and explore the Applied Steps window to understand the algorithm
M Code
let
Source = Excel.CurrentWorkbook(){[Name="rawTable"]}[Content],
//generalized "typer" in case you add other Items
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Country", type text}, {"Date", type date}} &
List.Transform(List.RemoveFirstN(Table.ColumnNames(Source),2),each {_, Int64.Type})),
//Unpivot all except Country|Date
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Country", "Date"}, "Item", "Value"),
//Add Custom Column to create Pivot column for "Market" and "MyValue"
//Items whose name starts with "My" hold your own values; all others are market values
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Custom", each
if Text.StartsWith([Item],"My")
then "MyValue"
else "Market"),
//Replace "My" so Item Labels will be consistent
#"Replaced Value" = Table.ReplaceValue(#"Added Custom","My","",Replacer.ReplaceText,{"Item"}),
//Pivot with no aggregation (unless you want to)
#"Pivoted Column" = Table.Pivot(#"Replaced Value", List.Distinct(#"Replaced Value"[Custom]), "Custom", "Value"),
//Sort "Items" to original Column Order
itemSortOrder = List.Distinct(#"Replaced Value"[Item]),
sorted = Table.Sort(#"Pivoted Column",
{{"Country", Order.Ascending},
each List.PositionOf(itemSortOrder,[Item])
})
in
sorted
Hopefully, this is what you want for a result
Thank you so much for having spent your time to help me.
I think I solved my problem using the List.Zip function. The solution was not mine; I took it from THIS video. With this trick, I don't even have to split the original source data into two tables (market & MyShare).
It does exactly what I needed, with little if any effort for data cleaning...

Grouping and summing Cells based on MATCH criteria

As an example, I have created a small set of data in B3:F20 with component, type, count, etc. I have assigned the name "TypeP" to B24:B25.
My goal is to group the components by type and sum their counts from the input B3:F20. To show the final goal, I have manually added the result in L3:N7. In L4, multiple (here 2) instances of component DEF with the same type PA are grouped and the count is summed.
I was able to achieve my goal partially, as in H3:J11, where the data is filtered by TypeP, but I still need to group the rows that share the same component and type.
Formula I have used in H3 is
=FILTER(INDEX(B3:F20;SEQUENCE(ROWS(B3:F20));{1\2\3});(ISNUMBER(MATCH(C3:C20;TypeP;0))=TRUE))
How can I achieve the result as shown in L3:N7?
L3: =UNIQUE(H3:I11)
N3: =SUMIFS($J$3:$J$11,$H$3:$H$11,L3,$I$3:$I$11,M3)
Select N3 and fill down as far as needed.
You could also do this in Power Query
To use Power Query
Make your TypeP a Named Range (or a Table)
Select some cell in your Data Table
Data => Get&Transform => from Table/Range
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Read main table
//Change table name in next line to real name of your table in the workbook
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//set data types
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Component", type text}, {"Type", type text}, {"Count", Int64.Type},
{"Others1", type text}, {"Others2", type any}}),
//read in the types to filter by from a "Named Range"
// Range name is `TypeP` in the workbook
typeP = Excel.CurrentWorkbook(){[Name="TypeP"]}[Content][Column1],
//Filter for the desired types
filter = Table.SelectRows(#"Changed Type", each List.Contains(typeP,[Type])),
//Group by "component and type"
//Then sum the Count column
#"Grouped Rows" = Table.Group(filter, {"Component", "Type"},
{{"Count", each List.Sum([Count]), type nullable number}})
in
#"Grouped Rows"

How do I reconstruct a data-set based on unique ID

Looking for a solution either in Excel or IBM SPSS:
I have a dataset with around 95,000 rows. Each row is one response from a participant on a particular question. For example, Row 2 is the response from participant A, on Question 1, where they indicated a score of 2. As pictured.
Ideally I need 1 line of responses per participant as pictured here:
I've tried VLOOKUP and then a macro to delete #N/A and move up the values but memory can't even handle the VLOOKUP, so it's not a viable option.
I feel out of options on what to do, but without laying out my data-set like this, I can't do later analysis (Later I need to average across all participants where Q5 = 80 etc [Q5 is a category code]).
You can do this with a Pivot Table.
Using Power Query (Excel 2010+) (aka Get&Transform in Excel 2016+) gives you a bit more flexibility in, for example, automating the naming of the column Headers.
If you will always have the same five questions, you can use the GUI to just pivot the QuestionNumber column. But if the number of questions might vary from run to run, the code to handle that needs to be entered through the Advanced Editor:
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"UserID", type text}, {"QuestionNumber", Int64.Type}, {"Score", Int64.Type}}),
//convert QuestionNumber to text once so its values can serve as column names
questionAsText = Table.TransformColumnTypes(#"Changed Type", {{"QuestionNumber", type text}}, "en-US"),
#"Pivoted Column" = Table.Pivot(questionAsText, List.Distinct(questionAsText[QuestionNumber]), "QuestionNumber", "Score", List.Sum),
Renames = List.Transform(List.Skip(Table.ColumnNames(#"Pivoted Column"),1), each {_, "Q" &_}),
#"New Headers" = Table.RenameColumns(#"Pivoted Column", Renames)
in
#"New Headers"
SPSS ANSWER:
Run this code in a new syntax window:
casestovars /id=userid /index=questionNum /separator="".

How can I have the information from 2 different columns represented as 1 in a pivot table?

In my data, I have 2 columns that represent a country visited before and a country visited after the cities that I am studying.
Here's a picture of my data sample: https://i.imgur.com/kS4K9uK.png
I'd like to represent in my pivot table all the countries linked to each city (so both before and after the city). I'd like to have the cities as my rows, all the countries that can possibly be visited before and after as my columns, and the count of those as my values.
Here is a picture of what I'd like to achieve, but I can only do it for one of the columns (country after in that case). I'd like the same format but having the data of both before and after (but it's important to know that it's not necessarily the same countries in the 2 columns so I can't just have one of the country columns as the head and both as the values): https://i.imgur.com/PUjhSmB.png
When I place the cities in the rows and the 2 country columns in both values and columns, the table is very difficult to read, as the before and after countries are all separate and might even be counted as a pair. And if they are not in the pivot table columns, I only get the count of countries before and after, not the count per country, which is not what I'm looking for.
Here is a picture of the result of the pivot table: https://i.imgur.com/3j4BD3k.png
I also tried to create a new field by doing «Country before» + «Country after» but it doesn't seem to work as the data is in text.
OK, I think I understand the output now. You essentially want a count of the number of occurrences of each country in columns B+C, grouped by the city. I'll provide a few ways so you can select what suits you best.
Simplest method
The easiest way I can think of is simply paste the second column under the first column and then pivot on this new table.
COUNTIF
A more repeatable way would be to essentially make your own pivot table and use the COUNTIF function to count the instances of each country.
=COUNTIFS($A$1:$A$6,$F2,$B$1:$B$6,G$1)+COUNTIFS($A$1:$A$6,$F2,$C$1:$C$6,G$1)
Power Query
The most repeatable way is to use Power Query. This will enable you to refresh the data at the click of a button. To do this (assuming you have Excel 2016), go to the Data tab and, with your data selected, click "From Table/Range". The Power Query window will open. At the top left of the screen is a button labeled Advanced Editor. When you open it, you'll see the following code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"City", type text}, {"Country Before", type text}, {"Country After", type text}})
in
#"Changed Type"
Replace the code with the following code. Note that your table may be called something different. You can see what it's called on the second line of the code. The code below uses "Table1" - you can replace this with the name of your table.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"City", type text}, {"Country Before", type text}, {"Country After", type text}}),
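//each helper table keeps City plus one of the two country columns
#"Before" = Table.RemoveColumns(#"Changed Type",{"Country After"}),
#"After" = Table.RemoveColumns(#"Changed Type",{"Country Before"}),
//stack the two helper tables; the missing country column is filled with null
#"Append" = Table.Combine({#"Before",#"After"}),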
#"Inserted Merged Column" = Table.AddColumn(Append, "Country", each Text.Combine({[Country After], [Country Before]}, ""), type text),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Country After", "Country Before"})
in
#"Removed Columns"
Hope that helps.
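If you would rather have Power Query produce the counts directly instead of pivoting afterwards in Excel, an unpivot followed by a group and a pivot is one option. This is a different technique from the append shown above, and the sample rows are made up for illustration; "Direction" and "Count" are hypothetical column names.

```powerquery
let
    // made-up sample rows; replace with Excel.CurrentWorkbook(){[Name="Table1"]}[Content]
    src = #table(
        type table [City = text, #"Country Before" = text, #"Country After" = text],
        {
            {"Paris", "Spain", "Italy"},
            {"Paris", "Italy", "Germany"},
            {"Rome", "Spain", "Spain"}
        }),
    // stack both country columns into a single Country column (null cells are dropped)
    unpivoted = Table.UnpivotOtherColumns(src, {"City"}, "Direction", "Country"),
    // count occurrences of each country per city, regardless of before/after
    counts = Table.Group(unpivoted, {"City", "Country"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    // one column per country, cities as rows
    wide = Table.Pivot(counts, List.Distinct(counts[Country]), "Country", "Count", List.Sum)
in
    wide
```

Cities that never pair with a given country get null in that country's column. Stop at the counts step if you prefer to build the pivot table in Excel.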
