Excel extracting an exact string from cells with random structure - excel

In column A, every cell has a different composition, and the text before and after the value I need to extract is never the same. For example:
ODODODODEFGH4OGOGOG
LALALALALABCDE12-1ALALALALA
IRIRIRIRIJKLMNOROROR
I need to extract the following strings, which are listed in another sheet (they are SKUs combining text and numbers, with variable length), from column A and list each one in column B next to its source cell:
ABCDE12-1
EFGH4
IJKLMN
I've tried the FIND, MID, LOOKUP, and INDEX functions but can't seem to find a solution. Any help is much appreciated!

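Let's say your Sheet1 and Sheet2 look like this.
Put this formula in cell B1 of Sheet1 and fill it down. It assumes each cell contains exactly one of the SKUs, because the row numbers of all matching SKUs are summed: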
=IF(SUMPRODUCT(COUNTIF(A1,"*"&Sheet2!$A$1:$A$3&"*")),INDEX(Sheet2!A:A,SUMPRODUCT(COUNTIF(A1,"*"&Sheet2!$A$1:$A$3&"*")*ROW(Sheet2!$A$1:$A$3))),"")

OK, now that we know you have a lookup table, set up the following:
On some sheet, list your valid SKUs in a vertical named range, e.g. ValidSKU refers to: Sheet2!$A$2:$A$100
Then, with your gibberish string in A1 on some sheet, return the valid SKU from the string with:
B1: =INDEX(ValidSKU,LOOKUP(9E+307,FIND(ValidSKU,A1),ROW(INDIRECT("1:10000"))))
The "10000" argument in the above formula needs to be a number at least as large as the number of SKUs in your list. So if you have 5000 valid SKUs, use some number greater than that.
Then fill down as far as needed.
This method has a weakness: if there are overlapping SKUs, it returns the matching one that sits lowest in the list. So it would be best to put your longest SKUs at the bottom of the list.
In other words, if you have two SKUs, ABCDE12-1 and ABCDE12, both are found in your second string. Whichever comes last in the ValidSKU list is the one returned. I don't know of any way (other than position) to differentiate these two possibilities.
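Outside Excel, the same lookup can be sketched in a few lines of Python. This is only an illustration of the logic, not a replacement for the formulas above: the function name `extract_sku` is hypothetical, and `ABCDE12` is added to the list purely to show the overlap case. Unlike the position-based formula, this sketch prefers the longest match.

```python
def extract_sku(text, valid_skus):
    """Return the longest SKU from valid_skus found inside text, or "" if none."""
    # Case-sensitive substring test, similar in spirit to Excel's FIND.
    matches = [sku for sku in valid_skus if sku in text]
    # Prefer the longest match so "ABCDE12-1" wins over "ABCDE12".
    return max(matches, key=len, default="")


# "ABCDE12" is a made-up overlapping SKU, added only for illustration.
valid_skus = ["ABCDE12", "ABCDE12-1", "EFGH4", "IJKLMN"]
cells = [
    "ODODODODEFGH4OGOGOG",
    "LALALALALABCDE12-1ALALALALA",
    "IRIRIRIRIJKLMNOROROR",
]
for cell in cells:
    print(cell, "->", extract_sku(cell, valid_skus))
```

Picking the longest match removes the need to order the SKU list by length, at the cost of assuming that the longer SKU is always the intended one.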

Related

Generate a unique ID (as much as possible) from a string in Excel using string functions

Let's say I have two strings in two cells
Cell A1 = Customer Country
Cell B1 = Customer City
I need to generate a unique ID using the Excel string functions (LEN, LEFT, MID, RIGHT etc.) or any other (CONCAT etc.) along with the ROW function.
Get first letter & last letter of each word, remove spaces and dashes, get the row number and return a unique string.
If I use
=IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=0,LEFT(A$1,1),IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=1,LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1),LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1)&MID(A$1,FIND(" ",A$1,FIND(" ",A$1)+1)+1,1))) &ROW(A$1)
I get CC1 as the result in both cases. How would I get a unique ID in such a case?
The idea in the comment section by @JosWoolley is a good one. Be careful how and where you add a column index, though. If you just append the column index number, you create confusion between, say, CC111 from row 11, column 1 and CC111 from row 1, column 11. Appending the actual address of the cell instead of these indices helps, but can still cause confusion if you don't add a delimiter first. Therefore I'd suggest something along the lines of:
Formula in D1:
=CONCAT(LEFT(TEXTSPLIT(A1," ")),"|",ADDRESS(ROW(A1),COLUMN(A1),4))
Note: If you don't yet have access to TEXTSPLIT(), you can swap it for FILTERXML(). Also, you mentioned CONCAT(), but in Excel 2019 you may need to enter the formula with Ctrl+Shift+Enter (CSE).
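To make the delimiter idea concrete, here is a minimal Python sketch of the same ID scheme: the first letter of each word, a "|" delimiter, then the cell address. The helper names `column_letters` and `make_id` are hypothetical and exist only for this illustration.

```python
def column_letters(col):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def make_id(text, row, col):
    """First letter of each space-separated word, '|', then the cell address."""
    initials = "".join(word[0] for word in text.split(" ") if word)
    return f"{initials}|{column_letters(col)}{row}"


print(make_id("Customer Country", 1, 1))  # cell A1
print(make_id("Customer City", 1, 2))     # cell B1
```

Both strings share the initials "CC", so only the address part after the delimiter distinguishes them.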

OFFSET / INDIRECT function trouble

I have two sheets within a workbook, the first with several thousand lines of expenses, separated by individuals, and the second a summary of totals and such.
On the second sheet, I've created a reference to the first to insert each individual's name (e.g. B4: ='Card Transactions'!D89). I'm having difficulty with the syntax for returning each individual's total, which is in a predictable cell in the first sheet relative to the name (down 1, right 7).
I've tried the following:
=offset(indirect(B4),1,7) with only a reference error in return. This seems like it should be relatively simple but I'm not having any luck. . . any suggestions?
Use this:
=OFFSET(INDIRECT(MID(FORMULATEXT(B4),2,300)),1,7)
Note:
This only works if the formula in B4 contains a single cell reference.
OFFSET and INDIRECT are volatile functions, so this will cause a noticeable lag in calculations if used too many times.
The following should work for you as long as your data follows these rules:
Your columns have headers
The names are all in the same column
And you are able to set the range with row numbers and not just full columns
Let's say your first sheet is set out like this:
And you want your second sheet like this:
And your sheets are named:
Sheet1
Sheet2
This is the formula in B2 of Sheet2:
=INDEX(Sheet1!$A$1:$H$9,MATCH(A1,Sheet1!$A$1:$A$9,0)+1,MATCH("Column 8",Sheet1!$A$1:$H$1,0))
And here's what it does:
Your index array is the entire blue area. It can span the whole sheet but can't be a full-column reference: the row numbers must be specified. In this example, the index array is $A$1:$H$9, and the $ signs mean the range won't move when you drag the formula down, so they are important!
Your first MATCH finds the row number. It uses the name (in this case 'bart') as the lookup value and the purple area as the array. In this example the row array is $A$1:$A$9, and its row numbers must match the row numbers in the index array. The MATCH has a "+1" at the end, so it finds the matching row and then moves one row down to get your offset.
Your second MATCH finds the column number, using the name of your column. In this example the column array is $A$1:$H$1, and its column letters must match the column letters in the index array.
Let me know if this doesn't fit your problem, I'm sure we can figure it out.
Thanks.
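The INDEX/MATCH logic described in this answer can be sketched in Python as a row lookup plus a column lookup on a small grid. The table contents, the totals, and the function name `total_for` are made up for illustration; only 'bart' and "Column 8" come from the answer.

```python
def total_for(name, table, header="Column 8", row_offset=1):
    """Find name in the first column, move row_offset rows down,
    and return the value under the given header."""
    col = table[0].index(header)                   # like MATCH(header, header row, 0)
    row = [r[0] for r in table].index(name)        # like MATCH(name, first column, 0)
    return table[row + row_offset][col]            # like INDEX(array, row + 1, col)


def make_row(first, total=""):
    """Build an 8-column row: a label, six blanks, and an optional total."""
    return [first] + [""] * 6 + [total]


# Hypothetical layout: each total sits one row below the name, in column 8.
table = [
    ["Name"] + [f"Column {i}" for i in range(2, 9)],
    make_row("bart"),
    make_row("", 120.5),
    make_row("lisa"),
    make_row("", 80.0),
]

print(total_for("bart", table))
print(total_for("lisa", table))
```

Like the formula, this raises an error when the name or the header is missing, which is the analogue of MATCH returning #N/A.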

Excel: Search for many text strings in a cell, and return all positive results

I need a function that will search a cell for many keyword text strings (model numbers) and return each model number that it finds. In all my research I have only found solutions that provide one matching keyword, but I would like all matching keywords.
An example of a solution only finding one keyword: Excel: Search for a list of strings within a particular string using array formulas?
Example of what I would like:
Cell to search in (A1) contains:
A-007858 CustomerCompanyName D1001, S1135, BE60 and R235 New 6 and 8 Packs
Search Keywords (on separate worksheet A1-A70):
A32: D1001
A43: S1135
A6: BE60
A64: R235
Desired Output:
Each model number found (D1001, S1135, BE60, R235) displayed in cells B1, C1, D1, and E1 next to the cell that was searched (A1). The order of the model numbers is not important. I would prefer an Excel function solution rather than VBA.
Put this formula in B1 and copy over:
=IFERROR(INDEX(Sheet2!$A$1:$A$70,AGGREGATE(15,6,ROW(Sheet2!$A$1:$A$70)/(ISNUMBER(SEARCH(Sheet2!$A$1:$A$70,$A1))),COLUMN(A:A))),"")
Replace Sheet2 with the name of the sheet on which your list resides.
It will be in order of the list on the other sheet.
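As a rough Python analogue of this formula, the sketch below returns every keyword found in the text, in list order. It assumes a plain case-insensitive substring test, so it ignores the wildcard support that SEARCH has. The function name `find_keywords` and the keyword `X999` are hypothetical.

```python
def find_keywords(text, keywords):
    """Return every keyword that occurs in text, in keyword-list order."""
    lowered = text.lower()
    # Skip blank entries; compare case-insensitively, as SEARCH does.
    return [kw for kw in keywords if kw and kw.lower() in lowered]


cell = "A-007858 CustomerCompanyName D1001, S1135, BE60 and R235 New 6 and 8 Packs"
keywords = ["BE60", "X999", "D1001", "S1135", "R235"]  # "X999" is a made-up non-match
print(find_keywords(cell, keywords))
```

Each element of the returned list corresponds to one of the cells B1, C1, D1, and so on when the formula is copied across.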
If you don't have too many keywords you can do this fairly simply:
B1 = IF(ISERROR(SEARCH("D1001",A1)),"","D1001")
where you can replace "D1001" with a reference to the cell in the other sheet. C1:E1 would be analogous.
If you have a lot, then you'll need something more involved, like @ScottCraner suggests.

Sort Order formula to alphabetise in Excel

I am currently drawing up a spreadsheet that will automatically remove duplicates and alphabetize a list:
I am using the COUNTIF() function in column G to create a sort order and then VLOOKUP() to find the sort in column J.
The problem I am having is that I can't seem to get my SortOrder column to function properly. At the moment it creates an index for two number 1's meaning the cell highlighted in yellow is missed out and the last entry in the sorted list is null:
If anyone can find and rectify this mistake for me I'll be very grateful as it has been driving me insane all day! Many thanks.
I'll provide my usual method for doing an automatic pulling-in of raw data into a sorted, duplicate-removed list:
Assume the raw data is in column A. In column B, use this formula to increase the counter each time the row shows a non-duplicate item in column A. Hardcode B2 to be "1", then put this formula in B3 and drag it down.
=if(iserror(match(A3,$A$2:A2,0)),B2+1,B2)
This takes advantage of the fact that when we refer to this row counter in our revised list, we will use the MATCH function, which only returns the first matching number. Then say you want your new list of data in column D (usually I do this for display purposes, so either 'group out' [hide] the columns that hold the formulas, or do this on another tab). You can avoid this step, but if you are already using helper columns I usually do each step in a different column, which is easier to document. In column C, starting in C3 [C2 hardcoded to 1] and dragged down, just have a simple counter that stops at the end of your list:
=if(C2<max(B:B),C2+1," ")
Then in column D, starting at D2 and dragged down:
=iferror(index(A:A,match(C2,B:B,0)),"")
The index function is like half of the vlookup function - it pulls the result out of a given array, when you provide it with a row number. The match function is like the other half of the vlookup function - it provides you with the row number where an item appears in a given array.
Hope this helps you in the future as well.
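The helper-column method can be mirrored step by step in Python. This is a sketch under the assumption that comparisons are exact (MATCH in Excel ignores case), and the function name `dedupe_with_counter` is hypothetical.

```python
def dedupe_with_counter(values):
    """Mimic columns B and D: a running counter of first occurrences,
    then a lookup of the first row where each counter value appears."""
    if not values:
        return []
    counter = []  # plays the role of column B
    for i, value in enumerate(values):
        if i == 0:
            counter.append(1)                # B2 is hardcoded to 1
        elif value not in values[:i]:
            counter.append(counter[-1] + 1)  # new item: increase the counter
        else:
            counter.append(counter[-1])      # duplicate: repeat the counter
    # Column D: for k = 1..max(B), take the row of the first match of k in B.
    return [values[counter.index(k)] for k in range(1, counter[-1] + 1)]


raw = ["pear", "apple", "pear", "fig", "apple"]
unique = dedupe_with_counter(raw)
print(unique)          # first-occurrence order
print(sorted(unique))  # alphabetized
```

The deduplicated list keeps first-occurrence order; sorting it is a separate step, as in the spreadsheet.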
The actual reason that this is going wrong as implied by Jeeped's comment is that you can't meaningfully compare a string to a number unless you do a conversion because they are stored differently. So COUNTIF counts numbers and text separately.
20212 will give a count of 1 because it is the only (or lowest) number.
CS10Z002 will give a count of 1 because it is the first text string in alphabetical order.
Another approach is to add the count of numbers to the count if the current cell contains text:-
=COUNTIF(INDIRECT("$D$2:$D$"&$F$3),"<="&D2)+ISTEXT(D2)*COUNT(INDIRECT("$D$2:$D$"&$F$3))
It's easier to show the result of three different conversions with some test data:-
(0) No conversion - just use COUNTIF
=COUNTIF(D$2:D$7,"<="&D2)
"999"<"abc"<"def", 999<1000
(1) Count everything as text
=SUMPRODUCT(--(D$2:D$7&""<=D2&""))
"1000"<"999"
(2) Count numbers before text
=COUNTIF(D$2:D$7,"<="&D2)+ISTEXT(D2)*COUNT(D$2:D$7)
999<1000<"999"
(3) Count everything as text but convert numbers with leading zeroes
=SUMPRODUCT(--(TEXT(D$2:D$7,"000000")<=TEXT(D2,"000000")))
"000999" = "000999", "000999"<"001000"
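Approach (2), counting numbers before text, can be sketched in Python as follows. This is a simplified model: it assumes text is compared case-insensitively by code point, which is not guaranteed to match Excel's collation, and the function name `sort_rank` and the sample values other than 20212 and CS10Z002 are made up.

```python
def sort_rank(value, values):
    """Rank value within values, with all numbers ordered before all text."""
    numbers = [v for v in values if isinstance(v, (int, float))]
    texts = [v for v in values if isinstance(v, str)]
    if isinstance(value, str):
        # Text ranks after every number, then among the other text entries.
        return len(numbers) + sum(t.lower() <= value.lower() for t in texts)
    # Numbers are only compared with other numbers.
    return sum(n <= value for n in numbers)


data = [20212, "CS10Z002", 999, "abc"]
for item in data:
    print(item, "->", sort_rank(item, data))
```

Adding the count of numbers to every text entry is what stops a number and a text string from both receiving rank 1.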

comparing two columns in excel sheet (text/string) and return the matched element in third column

I have two columns in Excel which I am trying to compare with one another, returning the common elements in a third column. For example, my sheet looks like this:
How do I compare column D with column E so that any matching string is printed in column F?
Edit 1: What function should I use for both case-sensitive and case-insensitive comparisons?
This is kind of crude, but it will tell you if the first word in the first cell appears in the second cell. You can swap LEFT for MID or RIGHT depending on your values.
=FIND(LEFT(C199,FIND(" ",C199,1)),O199,1)
In cell F1 place this formula:
=IF(ISERROR(MATCH(D1,$E$1:$E$100,0)),"",D1)
Then copy it down. This will show all case-insensitive matches for a column E list that has 100 values. Change the 100 to however long your column E list really is.
To do the case-sensitive comparison, try this (note that LOOKUP assumes column E is sorted in ascending order):
=IF(EXACT(D1,LOOKUP(D1,$E$1:$E$100)),D1,"")
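For comparison, the two kinds of matching can be sketched in Python. The function name `common_items` and the sample values are hypothetical. Unlike the LOOKUP-based formula, this sketch does not depend on the second list being sorted.

```python
def common_items(col_d, col_e, case_sensitive=False):
    """For each entry in col_d, return it if it also appears in col_e, else ""."""
    if case_sensitive:
        lookup = set(col_e)                    # exact comparison, like EXACT
        return [d if d in lookup else "" for d in col_d]
    lookup = {e.lower() for e in col_e}        # case-insensitive, like MATCH
    return [d if d.lower() in lookup else "" for d in col_d]


col_d = ["Apple", "pear", "Fig"]
col_e = ["apple", "Fig", "kiwi"]
print(common_items(col_d, col_e))                       # case-insensitive
print(common_items(col_d, col_e, case_sensitive=True))  # case-sensitive
```

The returned list lines up row by row with column D, which corresponds to what column F would show.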
