I’m attempting to compare two columns of strings for differences.
I have two lists of states and need to extract the differences between them. I need to see if Column F is missing anything from Column G. Is there a way to do this with a formula, without running a macro? Thanks all!
You can do so using Power Query.
Please refer to this article to find out how to use Power Query on your version of Excel. It is available in Excel 2010 Professional Plus and later versions. My demonstration is using Excel 2016.
Steps are:
1. Use the From Table function to add your source table to the Power Query editor, then remove all irrelevant columns, leaving only Column F and Column G. Below is a small example I am using:
2. Right-click the header of Column G and make a Duplicate Column of Column G.
3. Use the Split Column function under the Transform tab to split Column G by comma (,), and make sure the output is set to be put into Rows:
4. Add a custom column under the Add Column tab with this formula: =Text.PositionOf([Col_F],[Col_G])>=0. You should then have something like below:
5. Click the filter button on the right-hand side of the Custom column header and filter the column to show FALSE results only.
6. Use the Merge Columns function under the Transform tab to merge Column F and Column G - Copy with semicolon (;) as the delimiter. You should then have:
7. Use the Group By function to group Column G by the merged column with the following set-up. Once done, you will notice the Sum column shows errors, which is expected.
8. Go to the formula bar and replace the formula with this one: = Table.Group(#"Merged Columns", {"Merged"}, {{"Sum", each Text.Combine([Col_G],","), type text}}). You should then have something like the following:
9. Split the merged column by semicolon (;) to retrieve the original Column F and Column G.
10. Rename each column as desired and then Close & Load the output to a new worksheet (the default). You should then have something like the following:
The above steps can be simplified if you do not need to show Column G in the output table: you can skip Steps 2, 6 and 9. The tricky bit is Step 8, where you are not creating a new step but rather changing the formula used in the previous step. The key is to replace the original List.Sum function with the Text.Combine function in the formula to get the desired output.
Here is the Power Query M code behind the scenes:
let
Source = Excel.CurrentWorkbook(){[Name="Table11"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Col_F", type text}, {"Col_G", type text}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Col_G", "Col_G - Copy"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Duplicated Column", {{"Col_G", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Col_G"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Col_G", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Custom", each Text.PositionOf([Col_F],[Col_G])>=0),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = false)),
#"Merged Columns" = Table.CombineColumns(#"Filtered Rows",{"Col_F", "Col_G - Copy"},Combiner.CombineTextByDelimiter(";", QuoteStyle.None),"Merged"),
#"Grouped Rows" = Table.Group(#"Merged Columns", {"Merged"}, {{"Sum", each Text.Combine([Col_G],","), type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Grouped Rows", "Merged", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Merged.1", "Merged.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Merged.1", type text}, {"Merged.2", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Merged.1", "Col_F"}, {"Merged.2", "Col_G"}, {"Sum", "Col_Missing"}})
in
#"Renamed Columns"
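As a shorter alternative (a sketch, not the method used above), the same comparison can probably be done with List.Difference, which returns the items of one list that do not appear in another. The sample rows and the ToList helper below are made up for illustration; replace the #table with your own source step.

```powerquery
let
    // Hypothetical sample data standing in for the real table
    Source = #table(
        type table [Col_F = text, Col_G = text],
        {
            {"AR,CO,FL,TX", "AR,TX,LA"},
            {"NM,OK", "NM,OK"}
        }
    ),
    // Helper (assumed name): split a comma-separated string into trimmed items
    ToList = (s as text) as list => List.Transform(Text.Split(s, ","), Text.Trim),
    // Items listed in Col_G that do not appear in Col_F, joined back into one string
    AddedMissing = Table.AddColumn(
        Source,
        "Col_Missing",
        each Text.Combine(List.Difference(ToList([Col_G]), ToList([Col_F])), ","),
        type text
    )
in
    AddedMissing
```

With these sample rows, Col_Missing should be LA for the first row and empty for the second. Unlike the Text.PositionOf test, this compares whole items rather than substrings.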
Let me know if you have any questions. Cheers :)
You can use a VLOOKUP function to determine which strings are missing from the other set.
Assuming your data starts in row 2, write the following:
=VLOOKUP(G2, [ highlight all the data in column F and lock it], 1, 0)
Any #N/A that appears means that data is not in the other data set.
In addition, to see specifically which data is missing, you can use this formula:
=IF(ISNUMBER(SEARCH(MID(TRIM(F1),1,2),TRIM(G1),1)),"",MID(TRIM(F1),1,2))&","&IF(ISNUMBER(SEARCH(MID(TRIM(F1),4,2),TRIM(G1),1)),"",MID(TRIM(F1),4,2))&","&IF(ISNUMBER(SEARCH(MID(TRIM(F1),7,2),TRIM(G1),1)),"",MID(TRIM(F1),7,2))&","&IF(ISNUMBER(SEARCH(MID(TRIM(F1),10,2),TRIM(G1),1)),"",MID(TRIM(F1),10,2))&","&IF(ISNUMBER(SEARCH(MID(TRIM(F1),13,2),TRIM(G1),1)),"",MID(TRIM(F1),13,2))&","&IF(ISNUMBER(SEARCH(MID(TRIM(F1),16,2),TRIM(G1),1)),"",MID(TRIM(F1),16,2))&","&IF(ISNUMBER(SEARCH(MID(F1,19,2),TRIM(G1),1)),"",MID(F1,19,2))
One bit that I recently have tried is:
=IF(B2="TX",REPLACE(F2,SEARCH("CO,",F2),3,""))
This finds “CO,” in my string in F2 and pulls it out leaving me with “AR,FL,KY,LA,NJ,NM,OK,TX,”
May not be the most elegant way, but . . . How can I extend this to also pull out “TX,” “AR,” “LA,” etc (the remaining string from G2) from F2 as well?
I’m looking to see if anything from G2 is missing from F2. So my net result should be a blank cell in this case.
I'm trying to use Power Query to make a table that automatically subtracts A1 (10) from each cell in column B. The results will be in column C.
In regular Excel, this would simply be B2-$A$2 and so on, but I'm not sure how to do that in PQ. Thanks in advance!
I tried looking it up online, but the examples I found didn't subtract from one specific cell; they subtracted from the corresponding cell in each row.
To refer to the first cell in Column A, you use a construction like:
TableName[ColumnName]{index into column}
The 'Index' in PQ is zero-based, so more likely:
=#"Previous Step"[A]{0}
Or, in this case, the previous step is #"Changed Type" (as you can see from the screenshot):
Equivalent M-Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "C", each [B]-#"Changed Type"[A]{0}, Int64.Type)
in
#"Added Custom"
I'm trying to concatenate multiple values in one cell based on a lookup search (ID). The thing is, this ID might sometimes be alone in a cell, but other times it might be among multiple IDs in the same cell (at the beginning, middle or end, separated by commas). I've been using the formula below, but it only returns the first value, and only when the ID is alone.
Current formula --> =TEXTJOIN(",",TRUE,IFERROR(XLOOKUP(A2,D:D,E:E),"ID not found"))
Hope you can help.
Thanks.
Rows highlighted in blue, yellow and green are the expected results (I did them manually).
Row 7 is the actual result (wrong/incomplete) for the current formula.
You can try below approach to get results:
=TEXTJOIN(",",TRUE,IF(ISNUMBER(SEARCH(A2,D:D,1)),E:E,""))
One suggestion would be to limit the ranges (for example, D2:D100 and E2:E100 rather than the entire columns D:D and E:E) to improve formula speed.
This can also be done via power query. I used different ID values from your table on the right, but it should still work correctly.
Create a table that has ID's in Col1 and the Value to be Returned in Col2
Open the table in Power Query (Data tab > Get & Transform Data via From Table/Range).
If you don't have power query, reference this Complete Guide to Installing Power Query.
Power Query Steps
Change type of Col: Value to be Returned to text
Split Col: ID by Delimiter = ','
You should now have multiple ID columns (ID.X). Select all ID columns > Right click on header > select Unpivot Other Columns
Remove any unnecessary columns
Select the column with all of your ID's > right click and select Group By
Under new column name, enter a new column header. Change the Operation to Sum. For Column, select your column that contains your Values to be Returned
Reference this guide for the next step. You need to manually configure the M code in the formula bar & change the formula from List.Sum([COL]).. to Text.Combine([COL], ",")..
The last step is to make sure that your new column is a text column, not a number.
I've attached a copy of my workbook, which should hopefully help. If not, I've pasted my code from the Advanced Editor below in case that is helpful. Be sure to update Table & Column names accordingly based on your workbook.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Value to be returned", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "ID", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"ID.1", "ID.2", "ID.3"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"ID.1", type text}, {"ID.2", type text}, {"ID.3", type text}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Value to be returned"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Value", "Value to be returned"}),
#"Grouped Rows" = Table.Group(#"Reordered Columns", {"Value"}, {{"Value.1", each Text.Combine([Value to be returned], ", "), type text}}),
#"Changed Type2" = Table.TransformColumnTypes(#"Grouped Rows",{{"Value.1", type text}})
in
#"Changed Type2"
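If the number of IDs per cell is not fixed, the hard-coded ID.1 to ID.3 columns above may drop values beyond the third. One possible variation, sketched below with made-up sample data, splits each ID cell into a list and expands it into rows instead of columns, so no unpivot step is needed. The step names are illustrative only.

```powerquery
let
    // Hypothetical sample data: one or more comma-separated IDs per cell
    Source = #table(
        type table [ID = text, #"Value to be returned" = text],
        {
            {"1,2", "Apple"},
            {"2", "Banana"},
            {"3,1", "Cherry"}
        }
    ),
    // Turn each ID cell into a list of trimmed IDs
    SplitIDs = Table.TransformColumns(
        Source,
        {{"ID", each List.Transform(Text.Split(_, ","), Text.Trim)}}
    ),
    // One row per individual ID
    Expanded = Table.ExpandListColumn(SplitIDs, "ID"),
    // Join all values that share an ID into one text cell
    Grouped = Table.Group(
        Expanded,
        {"ID"},
        {{"Values", each Text.Combine([Value to be returned], ", "), type text}}
    )
in
    Grouped
```

Because the aggregation is written as Text.Combine from the start and typed as text, the final type-change step is not needed in this variant.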
Let me know if this works or if you have any follow up questions.
I am working with an Excel file having import and export data between different countries for various shipping products, which looks pretty much like this:
The goal is to create a consolidated list of state pairs that have a trade relation between them. So the final list for the above example should look something like this:
What would be the best way to go about this?
Try below formula-
=COUNTIFS(A:A,E4,B:B,F4)+COUNTIFS(A:A,F4,B:B,E4)
So assume your data is in a&b in the first screenshot, put this formula in column c
=a1&b1 then hit enter, then drag the bottom right hand corner down so the formula works for all cells in that column. Then copy the column and paste it as values.
Then in column d, use the formula =countifs(c:c,c1) and drag it down in the same way. Now paste column d as values.
Finally, on the data tab, remove duplicates based on column c and then delete column c.
You can do this really easily with Pivot Tables. Just put the field Importing Country into the rows section and also the values section (make sure it's counting), and the field Exporting Country into the rows section.
Choose Tabular design, deactivate subtotals and activate Repeat labels. This is what you get:
You can do this with Power Query, available in Windows Excel 2010+ and O365
Sort each row horizontally
Group by the two resultant country rows, aggregating by Count
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Importing Country", type text}, {"Exporting Country", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Country", each List.Sort({[Importing Country],[Exporting Country]})),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Importing Country", "Exporting Country"}),
#"Extracted Values" = Table.TransformColumns(#"Removed Columns", {"Country", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Country", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Country.1", "Country.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Country.1", type text}, {"Country.2", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type1", {"Country.1", "Country.2"}, {{"Occurrences", each Table.RowCount(_), Int64.Type}})
in
#"Grouped Rows"
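A possible shortcut for the "sort each row horizontally" step, offered as a sketch with made-up sample rows: List.Min and List.Max put each pair into a fixed order directly, which avoids combining the sorted list into text and splitting it again.

```powerquery
let
    // Hypothetical sample data standing in for Table3
    Source = #table(
        type table [#"Importing Country" = text, #"Exporting Country" = text],
        {
            {"India", "Japan"},
            {"Japan", "India"},
            {"India", "China"}
        }
    ),
    // Order each pair so that A-B and B-A produce the same key
    AddFirst = Table.AddColumn(
        Source, "Country.1",
        each List.Min({[Importing Country], [Exporting Country]}), type text
    ),
    AddSecond = Table.AddColumn(
        AddFirst, "Country.2",
        each List.Max({[Importing Country], [Exporting Country]}), type text
    ),
    // Count how many rows share each ordered pair
    Grouped = Table.Group(
        AddSecond,
        {"Country.1", "Country.2"},
        {{"Occurrences", each Table.RowCount(_), Int64.Type}}
    )
in
    Grouped
```

With these sample rows, the India/Japan pair should be counted twice and the China/India pair once.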
I have three columns of data. Column A is a list of computers. Column B is the list of user IDs. Column C is the user permissions. What I want to do is concatenate the values in Column C when there is a match for A & B. Attached is a simple screenshot of what I am trying to do. Please advise the easiest way to achieve this. I am new to Excel formulas so any assistance is appreciated!
Your desired results don't make sense given your original data.
In particular, you only have two bbb123 in your data, and the aaa123 has two different matches.
If that is typo related, what you want can be done easily with Power Query aka Get&Transform which is a part of versions of Excel since 2010.
Except for the Custom Column, most of the code below can be generated automatically from the User Interface.
Algorithm:
Group the data by machine and account
-This forms a table of the grouped "permissions"
Convert the table into a list (this is what the Custom Column does)
Expand the list by extracting the values and using comma as the delimiter
Delete the column containing the tables
The formula for the custom column is: (Add Column/Custom Column from the UI)
=Table.Column([Grouped],"permissions")
The M-Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"machine", type text}, {"account", type text}, {"permissions", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"machine", "account"}, {{"Grouped", each _, type table}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Column([Grouped],"permissions")),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
#"Removed Columns" = Table.RemoveColumns(#"Extracted Values",{"Grouped"})
in
#"Removed Columns"
Original Data
Grouped Data
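The grouping can likely be shortened by doing the concatenation inside Table.Group itself, so that no intermediate table column or custom column is needed. This is a sketch with invented sample rows, not the code generated by the UI:

```powerquery
let
    // Hypothetical sample data using the same column names as above
    Source = #table(
        type table [machine = text, account = text, permissions = text],
        {
            {"pc1", "aaa123", "read"},
            {"pc1", "aaa123", "write"},
            {"pc2", "bbb123", "read"}
        }
    ),
    // Within each machine/account group, join the permissions column into one string
    Grouped = Table.Group(
        Source,
        {"machine", "account"},
        {{"permissions", each Text.Combine([permissions], ","), type text}}
    )
in
    Grouped
```

With these sample rows, pc1/aaa123 should come out as read,write and pc2/bbb123 as read.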
I have an Excel sheet which has values from a database. One cell has responses from a multiple-choice question, pipe delimited. E.g. a user could be asked "What colour cars have you had?" with options such as:
A. Black
B. White
C. Red
D. Yellow
So a user can respond with A, C and D. These values are stored in one cell as "A. Black|C. Red|D. Yellow". I want to separate each of these values into an A column, B column, C column and D column.
I tried using the Text to Columns feature, but it does not know that the A column should only contain As.
I think I need to add a formula to each column which looks for the "A.", "B.", "C." or "D." and then finds the next available pipe character. I think I need a substring of some sort maybe. Something like this maybe:
=LEFT(C2,LEN(C2)-FIND("A.",C2))
But I don't know how to find the next occurrence of the pipe symbol - any ideas? Is there a nextIndexOf function in excel maybe?
Many thanks in advance
I think this would require either VBA or Power Query, because you are talking about having to analyze the data to determine what column it goes into. It's actually very easy to do in Power Query, but if you've never even heard of Power Query (it's built into Excel 2016+ and is a free add-in for 2013) then this solution is probably out of scope without an extensive explanation of what Power Query is.
That being said, for anyone interested, I took this dummy set of data, and converted it into an output table with Power Query which I believe is what was being asked in the question.
This is the code for the Query. A quick summary is we split it by the pipe delimiters and index it. Then the main trick is using the Group function combined with the Transpose function to get the split columns into an unpivoted data set. After expanding back out from the Group we can then split out the "ABCD" section from the answer, and pivot off that to get the ABCD columns aligned with the original entries.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 1, 1),
#"Grouped Rows" = Table.Group(#"Added Index", {"Index"}, {{"Rows", each Table.Transpose(Table.RemoveColumns(_, {"Index"})), type table}}),
#"Expanded Rows" = Table.ExpandTableColumn(#"Grouped Rows", "Rows", {"Column1"}, {"Rows.Column1"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Rows", each ([Rows.Column1] <> null)),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Rows.Column1", Splitter.SplitTextByDelimiter(". ", QuoteStyle.Csv), {"Selection", "Answer"}),
#"Pivoted Column" = Table.Pivot(#"Split Column by Delimiter1", List.Distinct(#"Split Column by Delimiter1"[Selection]), "Selection", "Answer")
in
#"Pivoted Column"
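An alternative that skips the Group and Transpose trick, as a rough sketch: split each cell into a list, expand the list into rows, derive the option letter, then pivot. The sample rows are invented, and Text.BeforeDelimiter is assumed to be available, which may not hold in older Power Query builds.

```powerquery
let
    // Hypothetical sample data: pipe-delimited answers in one column
    Source = #table(
        type table [Column1 = text],
        {
            {"A. Black|C. Red|D. Yellow"},
            {"B. White"},
            {"A. Black|B. White"}
        }
    ),
    // Keep track of which original row each answer came from
    Indexed = Table.AddIndexColumn(Source, "Index", 1, 1),
    // Split on the pipe and expand to one row per answer
    ToLists = Table.TransformColumns(Indexed, {{"Column1", each Text.Split(_, "|")}}),
    Expanded = Table.ExpandListColumn(ToLists, "Column1"),
    // The option letter is the text before ". "
    AddSelection = Table.AddColumn(
        Expanded, "Selection",
        each Text.BeforeDelimiter([Column1], ". "), type text
    ),
    // One column per option letter, holding the full answer text
    Pivoted = Table.Pivot(
        AddSelection,
        List.Sort(List.Distinct(AddSelection[Selection])),
        "Selection",
        "Column1"
    )
in
    Pivoted
```

This keeps the full answer (for example "A. Black") in each lettered column and leaves null where an option was not chosen. It also does not depend on a fixed maximum of four answers per cell.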