I have a column which looks something like below:
1235hytfgf ui
3434jhjhjh ui
6672jhjkhj ty
I need to name the first four characters numbers, the next six characters type, and the last two id. Which function should I use to say that index 0-3 is numbers and index 4-9 is type?
LEFT(), RIGHT() and MID() are three functions used to rip out portions of a string.
If your data starts in B2, you can use the following formula in an empty cell to pull the first 4 characters.
LEFT(B2,4)
Now that will pull the first 4 characters and leave them as text. If you want the digits converted to a numeric value, one of the easy ways is to send the string through a math operation that does not change its value: *1, +0, -0, /1 and -- all work. Your formula may look like one of the following:
--LEFT(B2,4)
LEFT(B2,4)+0
LEFT(B2,4)*1
LEFT(B2,4)/1
LEFT(B2,4)-0
To grab the middle portion of the string, use the MID function. Since you already know the starting position and length of the string to pull you can hard code the information into your formula and it will look as follows:
MID(B2,5,6)
5 is the starting position for which character to start pulling from, and 6 is the length of the string to pull or number of characters to pull.
To get the last 2 characters, similar to the first function, use RIGHT(). The formula would look as follows:
RIGHT(B2,2)
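For anyone who wants to check the positions outside Excel, here is a minimal Python sketch of the same three slices. It is only an illustration: the function name split_fixed is made up, and it assumes every record has the same fixed layout as the sample rows.

```python
def split_fixed(value):
    """Split one record into its three fixed-width fields."""
    return {
        "numbers": int(value[:4]),  # LEFT(B2,4), converted to a number
        "type": value[4:10],        # MID(B2,5,6): 1-based start 5 is 0-based index 4
        "id": value[-2:],           # RIGHT(B2,2)
    }


print(split_fixed("1235hytfgf ui"))
```

Note that Excel positions are 1-based while the question's index(0-3) and Python slices are 0-based, which is why MID starts at 5 but the slice starts at 4.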
If you are dealing with a large amount of data, say thousands of records, try Power Query in Excel.
Please refer to this article to find out how to use Power Query on your version of Excel. It is available in Excel 2010 Professional Plus and later versions. My demonstration is using Excel 2016.
Steps are:
Add your one-column data to the Power Query Editor;
Highlight the column, use the Split Column function and select By Digit to Non-Digit; it will separate the leading numeric characters from the rest of the string;
Highlight the second column, use the Split Column function again, select By Delimiter and use space as the delimiter; it will separate the type and id as desired;
Rename the columns, and you should have something like the following:
You can then Close & Load the output to a new worksheet (by default).
Here is the Power Query M code behind the scenes, for reference only. All the functions used are available in the GUI, so the steps should be easy to reproduce.
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Split Column by Character Transition" = Table.SplitColumn(#"Changed Type", "Column1", Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c)), {"Column1.1", "Column1.2"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Split Column by Character Transition", "Column1.2", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Column1.2.1", "Column1.2.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", Int64.Type}, {"Column1.2.1", type text}, {"Column1.2.2", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"Column1.1", "numbers"}, {"Column1.2.1", "type"}, {"Column1.2.2", "id"}})
in
#"Renamed Columns"
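The two splits can also be sketched in Python to see what they do to one value. This is an illustration of the idea, not a translation of the M code: the function name split_record and the regular expression are my own assumptions, and it expects leading digits followed by one space-separated pair.

```python
import re


def split_record(value):
    """Leading digits, then the text up to the first space, then the rest."""
    match = re.fullmatch(r"(\d+)([^ ]*) (.*)", value)
    if match is None:
        raise ValueError(f"unexpected layout: {value!r}")
    numbers, type_, id_ = match.groups()
    return int(numbers), type_, id_


print(split_record("3434jhjhjh ui"))
```

Unlike the fixed-position formulas, this does not care how many digits or letters there are, which is the same advantage the digit to non-digit split gives you.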
Let me know if you have any questions. Cheers :)
In my Power Query I have a column that shows different durations for certain items, but it displays an error when I attempt to convert it to time or duration.
As a workaround, next to my Excel table I created a formula that converts the duration to the format I want, but I have not been able to translate the formula into a language that Power Query can understand (I am pretty new to Power Query).
This is how the data is pulled from source:
But I would like it to show like this:
The Excel Formula I am using to accomplish this is:
=IF(LEN([#Age])=7,"0"&[#Age],IF(LEN([#Age])=5,"00:"&[#Age],IF(LEN([#Age])=4,"00:0"&[#Age],IF(LEN([#Age])=3,"00:00"&[#Age],[#Age]))))
It would be nice to have it in Power Query instead of the Excel sheet, as it serves as a learning opportunity.
I am teaching myself Power Query in Excel, so any help is welcome.
EDIT: In case the duration is more than 24:00:00, how should I approach it?
Here is the error code it returns
You can add a custom column with the formula:
Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
3),
":"))
The formula:
Splits the text string by the colon into a list
Replaces blanks with "00" and also prepends a "00" element to the list
Retrieves the last three elements and combines them into a colon-separated text string
Uses the Duration.FromText function to convert the result to a duration
Then set the data type of the column to duration
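The padding logic in those steps can be sketched in Python as follows. This is only an illustration of the list manipulation: the function name to_duration is made up, and it assumes (as the sample data suggests) that every value contains at least one colon.

```python
from datetime import timedelta


def to_duration(age):
    """Pad a partial h:mm:ss string on the left and convert it."""
    parts = ["00" if p == "" else p for p in age.split(":")]  # blanks become "00"
    h, m, s = (["00"] + parts)[-3:]                           # prepend, keep last three
    return timedelta(hours=int(h), minutes=int(m), seconds=int(s))


for sample in (":30", "1:30", "01:30", "1:01:30"):
    print(sample, "->", to_duration(sample))
```

A value with no colon at all would leave only two elements after padding and raise an error here, so that case would need its own rule.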
In the PQ Editor, a duration has the format d.hh:mm:ss, but when you load it back into Excel you can change that to [hh]:mm:ss.
You can accomplish all of the above in the PQ user interface.
Here is M-Code that does the same thing:
let
Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Duration", each Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),
3),
":"))),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Age"})
in
#"Removed Columns"
You can even do it (using M-Code in the Advanced Editor) without adding a column by using the Table.TransformColumns function:
let
Source = Excel.CurrentWorkbook(){[Name="Table16"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Age", type text}}),
#"Change to Duration" = Table.TransformColumns(#"Changed Type",
{"Age", each Duration.FromText(
Text.Combine(
List.LastN(
{"00"} & List.ReplaceValue(Text.Split(_,":"),"","00",Replacer.ReplaceValue),
3),
":")), type duration})
in
#"Change to Duration"
All result in:
Edit
With your modified data, now showing duration values of more than 23 hours (not allowed in a duration literal in PQ), the transformation will be different. We have to check the hours and break it into days and hours if it is more than 23.
Note: the edit below also assumes that nothing will ever be entered in the day position, and that entries for minutes and seconds will always be within range. If there might be day values, you will need to add them to the "overflow" from the hours entry.
So we change the Custom Column formula to check for that:
let
split = List.LastN({"00","00"} & List.ReplaceValue(Text.Split([Age],":"),"","00",Replacer.ReplaceValue),4),
s = Number.From(List.Last(split)),
m = Number.From(List.LastN(split,2){0}),
hTotal = Number.From(List.LastN(split,3){0}),
h = Number.Mod(hTotal,24),
d = Number.IntegerDivide(hTotal,24)
in #duration(d,h,m,s)
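The day/hour arithmetic is the part that is easy to get wrong, so here is a small Python sketch of just that step. The function name split_hours is hypothetical, and the same assumptions apply as in the note above (no day field, minutes and seconds in range).

```python
def split_hours(age):
    """Return (days, hours, minutes, seconds) for an h:mm:ss string."""
    parts = ["00" if p == "" else p for p in age.split(":")]
    h_total, m, s = (int(p) for p in (["00", "00"] + parts)[-3:])
    d, h = divmod(h_total, 24)  # same as Number.IntegerDivide and Number.Mod
    return d, h, m, s


print(split_hours("30:15:00"))
print(split_hours(":45"))
```

For example, 30 hours becomes 1 day and 6 hours, which is what #duration(d,h,m,s) needs.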
If you might have illegal values for minutes or seconds, you can add logic to check for that as well.
Also, if you will be loading this into Excel and you might have total days > 31, you will need to format it (in Excel) as [hh]:mm:ss, because with the format d.hh:mm:ss Excel cannot display more than 31 days (although the proper value will be stored in the cell).
I am importing a lot of tables from PDF files into Excel with Power Query, which works pretty well.
Besides several other migrations, I have the following task which I am not able to solve:
In some cases, especially after page breaks, single values that should go into separate cells (one below the other) are placed into one cell joined by line breaks, and the cells below are empty.
I need to split the values of such a cell (the cell content contains line breaks) and put the 2nd to nth values into the corresponding empty cells below it.
(It's kind of a "split drill-down" ...)
I am pretty new to M (not to VBA or programming) but I am not able to find a working solution.
This is difficult to do robustly, but you can expand using Text.Split on the line feed delimiter as #horseyride suggests, remove the blank rows in that second column, and then smash the columns back together with Table.FromColumns.
Here's an example you can paste into the Advanced Editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUUoEYkMDpVgdCDcJiI0gXCMgMxkkawrnpsTkpcbkpYEEjeCCIJ4FMs8IImcMZKaDJI3h3AwQ11wpNhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Week = _t, A = _t, B = _t]),
TransformA = List.Select(List.Combine(List.Transform(Source[A], each Text.Split(_, "#(lf)"))), each Text.Length(_) > 0),
FromCols = Table.FromColumns({Source[Week], TransformA, Source[B]}, {"Week", "A", "B"})
in
FromCols
This takes a starting table like this:
Transforms the A column as a list, splitting each element on the line feed character, combining each result back together, and filtering out null and empty strings:
The final step takes columns Week and B from the original table and sticks the transformed column A in the middle:
You'll run into trouble if the number of extra expanded rows doesn't exactly match the number of blank rows removed but this should work under the assumption that they do match.
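To make that assumption explicit, here is a rough Python sketch of the same idea with a length check added. The names expand_cells and rebuild and the sample values are made up for illustration. Raising an error on a mismatch is my own choice, not what the M code above does.

```python
def expand_cells(column):
    """Split every cell on line feeds and drop the empty leftovers."""
    values = []
    for cell in column:
        values.extend(cell.split("\n"))
    return [v for v in values if v != ""]


def rebuild(week, a, b):
    """Put the expanded A column back between Week and B."""
    expanded = expand_cells(a)
    if len(expanded) != len(week):
        raise ValueError("expanded rows do not match the blank rows removed")
    return list(zip(week, expanded, b))


print(rebuild(["w1", "w2", "w3"], ["a\nb\nc", "", ""], ["x", "y", "z"]))
```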
Right-click the column and change its data type to text.
Right-click the column ... Split Column ... By Delimiter ... under the advanced options, tick "split using special characters" and choose to split into rows.
Then use the arrow at the top of that column to filter out null rows.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Column2", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column2"),
#"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Column2] <> null))
in
#"Filtered Rows"
I have 1500 rows in one column that contain the work notes from help desk tickets. The contents explain how the agent helped a caller. Sometimes they paste an HTTP link (not clickable, as it's only text), and the link can be any length. I need to delete the ENTIRE character string making up the hyperlink.
I've tried various SEARCH and TRIM formulas with no success.
The contents could look like:
Caller reported trouble completing form as a requirement of their job.
Remoted in to the callers desktop and attempted various fixes with no success.
Eventually found a fix on an external website that corrected the issue.
https://troubleshootingfixfoundhere.com/!this_should_fix_the_issue_or_it_may_not32_40maybe
The caller was able to fix own problem by using the steps found on that website
What formula can find and delete the entire http string, that will always be a variable length?
Thanks in advance!
If you just have one substring with http then possibly:
Formula in B1:
=REPLACE(A1,FIND("http",A1),IFERROR(FIND(" ",A1,FIND("http",A1)),LEN(A1))-FIND("http",A1)+1,"")
You can obviously first implement a check to see if HTTP is a substring.
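Here is a small Python sketch of what the formula does, with the "is http present" check included. It is an illustration only, and the function name remove_first_url is made up.

```python
def remove_first_url(text):
    """Remove the first http... token and the space that follows it."""
    start = text.find("http")
    if start == -1:
        return text                 # no link: leave the text alone
    end = text.find(" ", start)     # first space after the link
    if end == -1:
        return text[:start]         # link runs to the end of the text
    return text[:start] + text[end + 1:]


print(remove_first_url("see https://x.com/a now"))
```

Like the formula, it only handles the first link and only treats a space as the end of the link.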
In case a text can contain multiple HTTP addresses AND you have access to TEXTJOIN, you could use the following:
Formula in B1:
=TEXTJOIN(" ",TRUE,IF(ISNUMBER(FIND("http",FILTERXML("<t><s>"&SUBSTITUTE($A$1," ","</s><s>")&"</s></t>","//s"))),"",FILTERXML("<t><s>"&SUBSTITUTE($A$1," ","</s><s>")&"</s></t>","//s")))
Note: It's an array formula and needs to be confirmed with Ctrl+Shift+Enter.
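The idea behind the TEXTJOIN version is simply "split on spaces, drop every piece containing http, join what is left". A minimal Python sketch of that idea, with a made-up function name:

```python
def remove_all_urls(text):
    """Drop every space-separated token that contains 'http'."""
    tokens = text.split(" ")
    # Empty tokens are skipped too, like TEXTJOIN with ignore_empty = TRUE.
    return " ".join(t for t in tokens if t and "http" not in t)


print(remove_all_urls("a http://x b https://y c"))
```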
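Assuming your text is in cell A2 and the URL is followed by a space (" "), you could use the formula below. It uses ; as the argument separator; use , instead if your regional settings require it.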
=CONCATENATE(LEFT(A2;SEARCH("http";A2)-1);MID(A2;FIND(" ";A2;SEARCH("http";A2));(LEN(A2)-FIND(" ";A2;SEARCH("http";A2)))+1))
You can use Power Query to solve the problem. Power Query is available in Excel 2010 Professional Plus and later versions of Excel including Excel 2013.
Please note the following solution works on the assumption that each target string contains a forward slash / and is separated from the other words in the text by a space " ".
Therefore my solution will not work on the following cases:
I need to remove this www.test.com from the sentence. My solution cannot detect the link without forward slash /.
nor
I need to remove thishttp://www.test.com from the sentence. My solution will remove thishttp://www.test.com from the sentence as there is no space between this and the link.
nor
I/TerryW need to remove this www.test.com from the sentence. My solution will remove I/TerryW from the sentence but not the link as the former contains forward slash "/".
If you can tolerate the above limitations, let's begin:
Load your data (a column with thousands of rows) into the Power Query Editor;
Add an Index Column to assign a unique number to each string;
Use the Replace Values function in the Transform tab to replace special characters such as Line Break or Carriage Return with a space " "; repeat this step as many times as needed to make sure there is a space between the target string and the other words;
Use the Split Column function to split the column by the delimiter space " ", and make sure to put the results into Rows in the advanced options;
Add a custom column using the formula =Text.Contains([Column1],"/"), where [Column1] is the column that was split in the last step. It returns TRUE for any string that contains a forward slash /;
Filter the custom column to show FALSE results only;
Use the Group By function to group [Column1] by the index column as shown below:
Go back to the last step in the Power Query Editor, go to the formula bar, replace this part of the formula, List.Sum([Column1]), with Text.Combine([Column1]," "), and remove the index column;
Once done, you can close and load the output to a new worksheet (by default).
Here is the sample data I used:
String1
Caller reported trouble completing form as a requirement of their job. Remoted in to the callers desktop and attempted various fixes with no success. Eventually found a fix on an external website that corrected the issue. https://troubleshootingfixfoundhere.com/!this_should_fix_the_issue_or_it_may_not32_40maybe
The caller was able to fix own problem by using the steps found on that website
String2
01-18-2009 13:17:09 – Jim Bob (Work notes) Request is completed. This PTASK can be closed. 01-18-2009 13:16:08 – Jim Bob (Work notes) Request RITM9999999 created for DVR team to create a incidents in case below two URL's have failures xyz-zyx-iib.xyzint.net:7501/xyz/xyz/authenticatePin xyz-xyz-iib.xyzint.net:7501/xyz/xyz/activatePin
String3
01-25-1942 09:26:06 - Van Shoes (Work notes) 1.disabled the services that should not be running BDO side: 1.alarming is tuam to enable integration and email alarming is turned on. 2.An EMA service request has been created servicenow.xyzint.net/… for service integration. onenote:bdo.wss.xyzint.net/sites/RUN/DistributedServices/ContactCenter/…{D0333A-900-4C8-83E-45A6B0}&page-id={8D646}&end
And here is my result:
Here is the Power Query M code behind the scenes, for reference only:
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),
#"Replaced Value" = Table.ReplaceValue(#"Added Index","#(lf)"," ",Replacer.ReplaceText,{"Column1"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Column1", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Custom", each Text.Contains([Column1],"/")),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = false)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Index"}, {{"Combined", each Text.Combine([Column1]," "), type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Index"})
in
#"Removed Columns"
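If it helps to see the rule and its limitations in one place, here is a short Python sketch of the same token filter. It only illustrates the logic, and the function name strip_slash_tokens is made up.

```python
def strip_slash_tokens(text):
    """Replace line feeds with spaces, then drop every token containing '/'."""
    tokens = text.replace("\n", " ").split(" ")
    return " ".join(t for t in tokens if "/" not in t)


# The link is removed because it contains a forward slash.
print(strip_slash_tokens("fix found\nhttps://a.b/c and applied"))
# Limitation: the word with a slash is removed, the slash-free link is kept.
print(strip_slash_tokens("I/TerryW need to remove this www.test.com"))
```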
Actually, there are workarounds for the limitations I mentioned before. For instance, you can use the Text.Replace function in the PQ Editor to replace http with (space)http, which adds a space in front of each target string. You can also add an extra step that evaluates the length of every string containing a forward slash /: an http link is normally longer than a natural word, so this lets you exclude the cases where / is used between natural words. Sorry for being lazy, but given that there is insufficient sample data (and I am running out of spare time), I will not elaborate on all potential solutions.
Let me know if you have any questions. Cheers :)
I'm trying to get from a cell just the values of the 'id' tag, separated by ';'.
The data is as follows:
Cell:
A1: {"id":1145585,"label":"1145585: Project Z"}
A2: {"id":1150322,"label":"1150322: Project Waka 1"}|{"id":1150365,"label":"1150365: Project Waka 2"}
A3: {"id":1149240,"label":"1149240: Analysis of Technical Options"}|{"id":1149258,"label":"1149258: Check and Report"}
A4: {"id":1148925,"label":"1148925: Change Management Review"}|{"id":1148920,"label":"1148920: Follow-Up Meetings"}|{"id":1148923,"label":"1148923: Launch Date Definition"}
I have tried to use left, mid and find functions, however the number of 'IDs' can vary from 1 to 1000. I'm also trying to avoid using vba, but it seems to be the only option. So any solution is great!
The result should be:
Cell:
A1: 1145585
A2: 1150322;1150365
A3: 1149240;1149258
A4: 1148925;1148920;1148923
Any ideas?
Thanks!
Sounds like a task for #powerquery. Please refer to this article to find out how to use Power Query on your version of Excel. It is available in Excel 2010 Professional Plus and later versions. My demonstration is using Excel 2016.
The steps are:
Load the source data to the Power Query Editor, which should look like the following:
Use the Index Column function under the Add Column tab to add an Index column;
Use the Split Column function under the Transform tab to split the column by the custom delimiter "id": and put the results into Rows as shown below:
Use the Extract function under the Transform tab to extract the first 7 characters of the column (every id in the sample data is 7 digits long);
Change the Data Type to Whole Number, remove Errors, and then change the Data Type back to Text;
Use the Group By function under the Transform tab to group Column1 by Index as set out below. Don't panic if the result is an error, as that is expected.
Go back to the last step and replace the original formula in the formula bar with the following one, as Text.Combine is not one of the operations offered in the Group By dialog:
= Table.Group(#"Changed Type3", {"Index"}, {{"Sum", each Text.Combine([Column1],";"), type text}})
Close & Load the output to a new worksheet (by default), and you should have the following:
Here is the Power Query M code behind the scenes. Most of the steps are performed using built-in functions, except the last step of manually replacing the formula with the correct one. Let me know if you have any questions. Cheers :)
let
Source = Excel.CurrentWorkbook(){[Name="Table10"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Added Index", {{"Column1", Splitter.SplitTextByDelimiter("""id"":", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1", type text}}),
#"Extracted First Characters" = Table.TransformColumns(#"Changed Type1", {{"Column1", each Text.Start(_, 7), type text}}),
#"Changed Type2" = Table.TransformColumnTypes(#"Extracted First Characters",{{"Column1", Int64.Type}}),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type2", {"Column1"}),
#"Changed Type3" = Table.TransformColumnTypes(#"Removed Errors",{{"Column1", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type3", {"Index"}, {{"Sum", each Text.Combine([Column1],";"), type text}})
in
#"Grouped Rows"
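The split / extract / keep-numbers / combine steps can be sketched in Python like this. It is only an illustration with a made-up function name, and like the steps above it assumes every id is exactly 7 digits long.

```python
def ids_by_split(cell):
    """Split on the "id": marker and keep pieces that start with 7 digits."""
    pieces = cell.split('"id":')
    ids = [p[:7] for p in pieces if p[:7].isdigit()]  # non-numeric pieces are dropped
    return ";".join(ids)


a2 = '{"id":1150322,"label":"1150322: Project Waka 1"}|{"id":1150365,"label":"1150365: Project Waka 2"}'
print(ids_by_split(a2))
```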
Based on #TerryW's comment, here is a solution using the FILTERXML function, available in Excel 2013+. It also requires TEXTJOIN, which did not appear until later versions of Excel 2016 (and Office 365).
It relies on the fact that the id string is always followed by a comma.
A disadvantage is that FILTERXML will return the numeric ids as numeric values, so leading zeros will be dropped. If there is always a fixed number of digits in the id and leading zeros need to be present, this can be mitigated by using the TEXT function.
We construct an XML string by splitting on both id and comma.
We then use an XPath expression to return the node that follows the node containing id.
=TEXTJOIN(";",TRUE,FILTERXML("<t><s>" & SUBSTITUTE(SUBSTITUTE(A1,"""id"":",",id,"),",","</s><s>")&"</s></t>","//s[text()='id']/following-sibling::*[1]"))
Since this is an array formula, you need to "confirm" it by holding down Ctrl + Shift while hitting Enter. If you do this correctly, Excel will place braces {...} around the formula, as observed in the formula bar.
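The "node that follows the id node" idea can be sketched in Python without any XML: do the same SUBSTITUTE, split on commas, and take whatever comes right after each id entry. This is an illustration with a made-up function name, and it returns the ids as text rather than numbers.

```python
def ids_by_sibling(cell):
    """Take the element that follows each 'id' element after splitting on commas."""
    nodes = cell.replace('"id":', ",id,").split(",")
    return ";".join(nodes[i + 1] for i, n in enumerate(nodes[:-1]) if n == "id")


a4 = ('{"id":1148925,"label":"1148925: Change Management Review"}'
      '|{"id":1148920,"label":"1148920: Follow-Up Meetings"}'
      '|{"id":1148923,"label":"1148923: Launch Date Definition"}')
print(ids_by_sibling(a4))
```

Because it does not depend on the length of the id, this approach keeps working if the ids are not all 7 digits.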
Source
Results
I have a list of accounts with 2-digit modifiers. Some accounts will have more than one modifier. I am looking for accounts with certain combinations of modifiers.
So I have a list of accounts in column B.
I have the modifiers in column C.
Example
Act # Modifier
111 80
111 56
111
222 55
222
333 51
333 50
333
I have some code that works great until I get too many rows.
In this sample formula I have 8 Modifier groups.
50,22,51,62
51,22,62
54,50,51
55,50,51
56,50,51
80,50,51
"AS",50,51
59,50
=IF(OR(SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{50,22,51,62}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{51,22,62}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{54,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{55,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{56,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{80,50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{"AS",50,51}))>=2,SUMPRODUCT(COUNTIFS(B:B,B3,C:C,{59,50}))>=2),"Check","")
This formula puts "Check" by any account that has 2 or more of the modifiers from any of the 8 groups. It has to be 2 modifiers from the same group, though.
I was just wondering if there is a better way to write this. Instead of repeating all these SUMPRODUCT terms, can I use a single OR over the different modifier criteria I am looking for?
Something like
=COUNTIFS(H:H,H5,I:I,OR({59,50},{"AS",50,51}))
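For reference, the rule that the long formula encodes (at least 2 rows for the same account whose modifiers fall in the same group) can be written down in Python as below. This is only a sketch to pin down the rule, with made-up names; modifiers are treated as text so that "AS" fits in with the numbers.

```python
GROUPS = [
    {"50", "22", "51", "62"},
    {"51", "22", "62"},
    {"54", "50", "51"},
    {"55", "50", "51"},
    {"56", "50", "51"},
    {"80", "50", "51"},
    {"AS", "50", "51"},
    {"59", "50"},
]


def accounts_to_check(rows):
    """rows is a list of (account, modifier) pairs; blank modifiers are ignored."""
    by_account = {}
    for account, modifier in rows:
        if modifier:
            by_account.setdefault(account, []).append(modifier)
    return sorted(
        account
        for account, mods in by_account.items()
        if any(sum(m in group for m in mods) >= 2 for group in GROUPS)
    )


rows = [("111", "80"), ("111", "56"), ("111", ""), ("222", "55"), ("222", ""),
        ("333", "51"), ("333", "50"), ("333", "")]
print(accounts_to_check(rows))
```

With the example data only account 333 qualifies, because 80 and 56 never appear together in one group.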
As requested by #SkysLastChance, I will post my solution using Power Query (PQ) even though the question was tagged to Excel-Formula.
Please note you MUST use Excel 2010 or a later version, otherwise you will not be able to use Power Query. My answer might not be detailed enough for people who have not used PQ before, so feel free to leave a question if any particular step is unclear.
Step 1
Convert the Account List and Modifier Group in the example into Tables in your Excel worksheet. One way of doing that is to highlight the data including headers and press Ctrl+T. Then you should get two tables as shown below. I have named the first table Tbl_ActList and the second one Tbl_MoGrp.
Please note I have added some data to the Account List table for result testing purpose.
Step 2
Select any cell within a table, go to the Data tab on top of your excel (mine is Excel 2016), click From Table in the Get & Transform section. It will load and add the table to the built-in PQ Editor. You can exit the editor (and keep the changes), and repeat this step to add the second table to the PQ Editor. Alternatively you can add a new query in the PQ Editor and find the second table from your excel worksheet. I will not demonstrate this process as you can google the know-how later on.
Step 3
Once you have added both tables to the editor, you can start editing/transforming data in each table/query using built-in functions and/or advanced coding. In this case I only used built-in functions.
For the Modifier Group table, I want to transform the original data into a 2-Column list with one column showing which Group the modifier belongs to, and the other column showing a single modifier.
Firstly, use the Split Column function in the Transform tab to split the original modifier groups into single value by using , (comma) as the delimiter.
The new table is in a matrix structure, which is not ideal for lookup purposes, so I used the Unpivot Columns function in the Transform tab to convert it into a list structure. What I actually did was highlight the Grp column and select Unpivot Other Columns to get the list. Alternatively, you can highlight the first four columns and use Unpivot Columns to get the same list.
Then I renamed the Value column as Modifier, and removed the Attribute column to end up with a 2-column table.
Please note all data in each table/query in this example has been set to the 'Text' data type. PQ is strict about data types, and an incorrect data type may lead to errors.
Here is the full code behind the scene. All steps are performed using the built-in functions without any advanced coding:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_MoGrp"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Modifier", type text}, {"Grp", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Modifier", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Modifier.1", "Modifier.2", "Modifier.3", "Modifier.4"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Modifier.1", type text}, {"Modifier.2", type text}, {"Modifier.3", type text}, {"Modifier.4", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Grp"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Value", "Modifier"}}),
#"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"Attribute"})
in
#"Removed Columns"
Step 4
With the Modifier Group list ready, we can look up the modifier group in the Account List table for each modifier using Merge Queries function in the Home tab. The logic is to find the link between two tables to conduct a look up.
Firstly, select/highlight the column (Modifier) that contains the look up value in the origin table (Tbl_ActList), then select the table (Tbl_MoGrp) that you want to look up from, then select/highlight the corresponding column (Modifier) in the second table, and then click OK to continue.
Please note before merging I have filtered the Modifier column in the Account List table to get rid of cells showing null (blank) as they are not useful for the look up.
After merging the queries there is a new column added to the Account List table. It may look like a column but it contains all data from the Modifier Group table stored in Grp column and Modifier column. As we want to look up the modifier group only, we can Expand the column to show the Grp column only.
Click on the little square box on the right hand side of the header of the last column to trigger the Expand function, then select the Grp column only and click OK to continue.
Now we have a table showing account number, modifier, and modifier group. We can then use the Group By function in the Home tab to find out for each account number how many modifiers have appeared in each applicable modifier group.
Please See below screenshot for the settings for the Group By function.
Then I sorted the table ascending by the Act # column, and filtered the Count column to show values greater than or equal to 2, i.e. at least 2 modifiers linked to that account number have appeared in a modifier group.
Here is the full code behind the scene:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_ActList"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Act #", type text}, {"Modifier", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Modifier] <> null)),
#"Merged Queries" = Table.NestedJoin(#"Filtered Rows", {"Modifier"}, Tbl_MoGrp, {"Modifier"}, "Tbl_Grp", JoinKind.LeftOuter),
#"Expanded Tbl_Grp" = Table.ExpandTableColumn(#"Merged Queries", "Tbl_Grp", {"Grp"}, {"Grp"}),
#"Grouped Rows" = Table.Group(#"Expanded Tbl_Grp", {"Act #", "Grp"}, {{"Count", each Table.RowCount(_), type number}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Act #", Order.Ascending}}),
#"Filtered Rows1" = Table.SelectRows(#"Sorted Rows", each [Count] >= 2)
in
#"Filtered Rows1"
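For readers who find it easier to follow the unpivot / merge / group / filter sequence in code, here is a rough Python sketch of the same pipeline. The function name and the group labels G1 and G8 are made up for illustration. It also drops modifiers that belong to no group, which is closer to an inner join than the left outer join used above.

```python
from collections import Counter


def flag_accounts(account_rows, group_rows):
    """account_rows: (account, modifier) pairs; group_rows: (grp, "50,22,...") pairs."""
    # Step 3: unpivot each comma-separated list into a modifier -> groups lookup.
    lookup = {}
    for grp, modifiers in group_rows:
        for m in modifiers.split(","):
            lookup.setdefault(m, []).append(grp)
    # Step 4: merge on modifier, then count rows per (account, group).
    counts = Counter(
        (account, grp)
        for account, m in account_rows
        if m
        for grp in lookup.get(m, [])
    )
    # Keep accounts with at least 2 modifiers in the same group, without duplicates.
    return sorted({account for (account, _), n in counts.items() if n >= 2})


groups = [("G1", "50,22,51,62"), ("G8", "59,50")]
accounts = [("111", "80"), ("111", "56"), ("111", ""),
            ("333", "51"), ("333", "50"), ("333", "")]
print(flag_accounts(accounts, groups))
```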
Step 5
The answer could stop at Step 4 as the table has shown the account number that we are looking for. However if there are thousands of account numbers, then it is better to Remove Other Columns except the Act # column, and Remove Duplicates within the column, and then Close & Load the result to a new worksheet. The final result may look like this:
A tip here: before you Close & Load any query for the first time, it is better to set the following in your Query Options. It will prevent the PQ Editor from loading each of your queries to a separate worksheet by default. Just imagine how long it would take if you had 20 queries in your PQ Editor, each with more than a thousand lines of data.
Once you change the default option, PQ Editor will only create connections for your queries after you click Close & Load, and you can manually load a specific query result to a worksheet as shown below:
Conclusion
I believe if this question had been tagged PowerQuery, there might be more concise or 'fancier' answers than mine. Regardless, the things I like most about PQ are that it is part of Excel (an add-in for Excel 2010 and 2013, built in from Excel 2016), and that it is scalable, replicable and more powerful when it comes to data cleansing and transforming.
Cheers :)