Concatenate part of cell 1 by multiple criteria with all of cell 2, finishing with cell 1’s remainder - excel

Introduction
This is a continuation of the question posted and answered here; @Jordan advised posting it as a separate question.
Goal: Join part of cell 1’s contents with all of cell 2’s, finishing with the remainder of cell 1.
Twist: Multiple criteria have to be applied to cell 1.
Problem
After successfully altering Jordan’s excellent answer to accommodate concatenated names for joined Thinglag, the following function performs the task, as long as there is only a single criterion to identify:
IF(F2="",E2,CONCATENATE(LEFT(E2,SEARCH(" T",E2))&"("&F2&")"&MID(E2,SEARCH(" T",E2),Len(E2)-SEARCH(" T",E2)+1)))
However, for parishes and annexes, multiple criteria are needed, viz. the following:
Sogn
Hovedsogn
Annex
Præstegjeld
Structure
AB–AH: Sogn_anx_[1–7]
AI–AO: Sogn_anx_[1–7]_altnvn
AP–AV: Sogn_anx_[1–7]_hele
As with the original post, I have a source giving current official names for the area, as well as previously used names (providing etymological information for the current name). In the source, where the old name is included, it is given as a parenthetical remark, e.g.:
‘Søndeløvs (Sundaleid) Annex’
‘Tromø (Thrumø) Annex’
‘Hvitesø (Hviteseids) Hovedsogn’
‘Attraa (Attrod) Hovedsogn’
‘Thjølings (Thjodaling) Sogn’
These have been entered into the database using three columns:
One for the official name
One for the old name
One showing the name as printed
This is to allow for better searchability when this information is to be made publicly available.
Example data:
Sogn_anx_1 | Sogn_anx_2 | Sogn_anx_3 | … | Sogn_anx_1_altnvn | Sogn_anx_2_altnvn | Sogn_anx_3_altnvn | … | Sogn_anx_1_hele | Sogn_anx_2_hele | Sogn_anx_3_hele
AB | AC | AD | … | AI | AJ | AK | … | AP | AQ | AR
Soleims Hovedsogn | | | … | Solheims | | | … | Soleims (Solheims) Hovedsogn | |
Meleims Annex | | | … | Medelheims | | | … | Meleims (Medelheims) Hovedsogn | |
Holdens Hovedsogn | Romenæs Annex | Holdens Hovedsogn | … | | Rumenæs | Hollen | … | Holdens Hovedsogn | Romenæs (Rumenæs) Annex | Holdens (Hollen) Hovedsogn
As can be seen, the first set of columns contains the official name; the second set (altnvn = alt_name) contains the old name, which the source gives as a parenthetical remark; and the third set (hele = entire/whole) contains the full, concatenated name, which includes the alternative name in parentheses wherever one exists.
Desired result
I would like to perform the same task in the third column as done in the post referenced, only this time it has to be able to perform the search by looking for any of the four criteria, so " T" would have to be replaced by all four variants: " So", " Ho", " An" or " Pr" (note: spaces are intentional). I have tried editing the original function using OR, but this—to no surprise—fails.

It might be simpler with a VBA solution. But your immediate problem, to find one of several defined words using SEARCH, can be accomplished by using an array constant for find_text and appending the terms to within_text. If you are not guaranteed that find_text will always appear, you'll need to check that the result is less than the length of the original within_text.
You might also consider using the case-sensitive FIND function, or longer find_text strings, in case there is some ambiguity.
=MIN(SEARCH({" So"," Ho"," An"," Pr"},AB3&" So Ho An Pr"))
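The array-constant trick above can be sketched outside Excel to make the logic easier to follow. The following Python is only an illustration, not part of the original answer: the function names are hypothetical, and it assumes the same four markers and the same "append the markers as a sentinel" idea, combined with the single-criterion insertion from the question.

```python
MARKERS = (" So", " Ho", " An", " Pr")  # the four criteria from the question
SENTINEL = "".join(MARKERS)             # " So Ho An Pr", appended so every search succeeds


def first_marker_pos(name: str) -> int:
    """0-based index of the earliest marker; a value >= len(name) means none was found."""
    haystack = (name + SENTINEL).lower()  # SEARCH is case-insensitive
    return min(haystack.find(m.lower()) for m in MARKERS)


def full_name(official: str, old: str) -> str:
    """Insert "(old)" in front of the first marker, like the single-criterion formula."""
    if old == "":
        return official
    pos = first_marker_pos(official)
    if pos >= len(official):  # only the appended sentinel matched
        return official
    # Keep everything up to and including the space, then "(old)", then the rest.
    return official[: pos + 1] + "(" + old + ")" + official[pos:]


print(full_name("Soleims Hovedsogn", "Solheims"))  # Soleims (Solheims) Hovedsogn
```

The check against len(official) plays the role of the "result is less than the length of the original within_text" test mentioned above.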

Related

Excel: How to find six different combinations of words in string?

I have been working for several days on this and have researched everything looking for this answer. I'd appreciate any help you can give.
In Excel I am searching a string of text in column A:
Bought 1 HD Sep 3 2021 325.0 Call # 2.75
I am detecting the first word (in this case "Bought") and detecting the last word before "#" symbol (in this case "Call").
I am then detecting the price following the "#" symbol (in this case "2.75"). This number will go into column B (header "Open") or column C (header "Close") depending on the combination of words found:
Sold/Put=Close
Sold/Call=Open
Bought/Put=Open
Bought/Call=Close
Sold (by itself)=Open
Sold (by itself)=Close.
Bought 1 HD Sep 3 2021 325.0 Call # 2.75
The combination found in the above string is: "Bought Call". Therefore the number at the end ("2.75"), goes into "Open" column.
Here's another example:
Sold 4 AI Sep 17 2021 50.0 Put # 1.5
The combination found in the above string is: "Sold Put". Therefore the number at the end ("1.5") goes into "Close" column.
I am currently using this formula to determine if the string contains "Sold" and "Call" and get the desired number and it does work:
=IF(AND(
ISNUMBER(SEARCH({"Sold","Call"},A10))),
TRIM(MID(A10,SEARCH("#",A10)+LEN("#"),255))," ")
But, I don't know how to search for all the other possible combinations.
The point behind this is to be able to paste the transaction from the broker and have most of the entry process automated. I'm sure many will benefit from this as I've not found anything like this.
I'd appreciate any help and if possible, an explanation of the formula so I can better learn.
Thanks!
I think you have the right idea, but would just extend the IF statement.
Something like the below might work for you:
=IF(ISNUMBER(SEARCH("Call", $A1)),
IF(ISNUMBER(SEARCH({"Bought","Sold"}, $A1)),
NUMBERVALUE(RIGHT($A1, LEN($A1)-SEARCH("#", $A1))),""),
IF(ISNUMBER(SEARCH({"!!!","!!!","Bought","Sold"}, $A1)),
NUMBERVALUE(RIGHT($A1, LEN($A1)-SEARCH("#", $A1))),""))
Just enter in column B and drag down; columns B through E should fill as needed.
Note that "!!!" is just a string of arbitrary characters; it can be anything that is unlikely to appear in the string.
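The formulas below refer to a lookup table shown in the original screenshots. Judging by the ranges used, H7:H12 and I7:I12 hold the first and (optional) second keyword, J7:J12 and K7:K12 hold the lookup results, and B4 holds the text to parse.
(Requires a Microsoft 365 version of Excel, for LET.)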
Main lookup
=LET(fn_1,MATCH("*"&$H$7:$H$12&"*",B4,0),fn_2,MATCH("*"&$I$7:$I$12&"*",B4,0),IFERROR(INDEX($J$7:$J$12,MATCH(1,IF($I$7:$I$12="",fn_1*ISNUMBER(fn_2),fn_1*fn_2),0)),))
EDIT:
Other Excel versions:
=IFERROR(INDEX($J$7:$J$12,MATCH(1,IF($I$7:$I$12="",MATCH("*"&$H$7:$H$12&"*",B4,0)*ISNUMBER(MATCH("*"&$I$7:$I$12&"*",B4,0)),MATCH("*"&$H$7:$H$12&"*",B4,0)*MATCH("*"&$I$7:$I$12&"*",B4,0)),0)),)
(The only thing that falls away is the LET wrapper: fn_1 and fn_2 are replaced by their respective MATCH expressions inside the INDEX formula, which makes the formula somewhat longer but otherwise identical.)
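As a rough illustration of what the main lookup does, here is a Python sketch under stated assumptions: the rule rows are hypothetical stand-ins for the H/I/J lookup table (the four keyword pairs are copied from the list in the question, the last row is an illustrative single-keyword fallback), and the function name is made up. As in the formula, a blank second keyword matches anything and the first matching row wins.

```python
from typing import List, Optional, Tuple

# Hypothetical lookup rows: (first keyword, second keyword or "", result).
RULES: List[Tuple[str, str, str]] = [
    ("Sold", "Put", "Close"),
    ("Sold", "Call", "Open"),
    ("Bought", "Put", "Open"),
    ("Bought", "Call", "Close"),
    ("Sold", "", "Open"),  # blank second keyword: the first keyword alone decides
]


def classify(text: str, rules: List[Tuple[str, str, str]] = RULES) -> Optional[str]:
    """Return the result of the first rule whose keywords all occur in text."""
    lowered = text.lower()  # wildcard MATCH is case-insensitive
    for first, second, result in rules:
        if first.lower() in lowered and (second == "" or second.lower() in lowered):
            return result
    return None  # the formula's IFERROR branch


print(classify("Sold 4 AI Sep 17 2021 50.0 Put # 1.5"))  # Close
```

Because the first match wins, single-keyword rows belong below the two-keyword rows; otherwise they would shadow them.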
Example applications
Here are two examples of how one might customize this to insert a number into one of the columns. The key part of this question is really how to do the lookup in the first place; from there on it is a matter of fine-tuning and taking the appropriate action.
Assuming calls/buys are a "long" position with the strike price going in the first column (here, D), and puts/sales are a "short" position with the strike price going in the second column (here, E):
Long - insert strike price col D
=IF(LET(fn_1,MATCH("*"&$H$7:$H$12&"*",B4,0),fn_2,MATCH("*"&$I$7:$I$12&"*",B4,0),IFERROR(INDEX($K$7:$K$12,MATCH(1,IF($I$7:$I$12="",fn_1*ISNUMBER(fn_2),fn_1*fn_2),0)),))=1,MID(SUBSTITUTE(B4," ",""),SEARCH("#",SUBSTITUTE(B4," ",""))+1,LEN(SUBSTITUTE(B4," ",""))),"")
EDIT
Other Excel versions:
=IF(IFERROR(INDEX($K$7:$K$12,MATCH(1,IF($I$7:$I$12="",MATCH("*"&$H$7:$H$12&"*",B4,0)*ISNUMBER(MATCH("*"&$I$7:$I$12&"*",B4,0)),MATCH("*"&$H$7:$H$12&"*",B4,0)*MATCH("*"&$I$7:$I$12&"*",B4,0)),0)),)=1,MID(SUBSTITUTE(B4," ",""),SEARCH("#",SUBSTITUTE(B4," ",""))+1,LEN(SUBSTITUTE(B4," ",""))),"")
Short - insert strike price col E
=IF(LET(fn_1,MATCH("*"&$H$7:$H$12&"*",B4,0),fn_2,MATCH("*"&$I$7:$I$12&"*",B4,0),IFERROR(INDEX($K$7:$K$12,MATCH(1,IF($I$7:$I$12="",fn_1*ISNUMBER(fn_2),fn_1*fn_2),0)),))=2,MID(SUBSTITUTE(B4," ",""),SEARCH("#",SUBSTITUTE(B4," ",""))+1,LEN(SUBSTITUTE(B4," ",""))),"")
EDIT
Other Excel versions:
Follow the same routine as in the previous edits (remove LET and replace fn_1 and fn_2 with their respective formulae).
Note the similarity between all three equations above: the second and third contain the first. Effectively they wrap a big IF statement around the first, use the lookup_2 column (here, column K), and use MID/SEARCH to extract the rate after the hash sign.
This assumes there are no other hash signs in the sentence.
Customize as required.
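The MID/SEARCH/SUBSTITUTE part of the formulas above (strip the spaces, then take everything after the first "#") can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; unlike the Excel formula, which would raise an error, it returns an empty string when no "#" is present.

```python
def price_after_hash(text: str) -> str:
    """Return the text after the first "#", with all spaces removed."""
    compact = text.replace(" ", "")  # SUBSTITUTE(B4," ","")
    pos = compact.find("#")          # SEARCH("#", ...)
    if pos == -1:
        return ""                    # assumption: blank instead of an error
    return compact[pos + 1:]         # MID(..., pos + 1, LEN(...))


print(price_after_hash("Bought 1 HD Sep 3 2021 325.0 Call # 2.75"))  # 2.75
```

As with the formula, a second "#" earlier in the text would shift the result, so the single-hash assumption still applies.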

Finding Last Name while ignoring Suffixes

I have a field that has first and last names. Some names include a middle initial, some names include a suffix.
I am trying to find a formula that only pulls the last name regardless of which format it is in.
Example format
Donald P Bellisario --> Bellisario
Dale Earnhardt Jr --> Earnhardt
Jimmy M Butler III --> Butler
Kanye E West --> West
Joseph Biden --> Biden
Formula 1: =TRIM(RIGHT(SUBSTITUTE(AS9," ",REPT(" ",LEN(AS9))),LEN(AS9)))
Formula 2: =RIGHT(AS9,LEN(AS9)-FIND("*",SUBSTITUTE(AS9," ","*",LEN(AS9)-LEN(SUBSTITUTE(AS9," ","")))))
Formulas 1 and 2 do not ignore suffixes and return the suffix when one exists, e.g. Jack Smith Jr --> Jr
Formula 3: =SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(TRIM(SUBSTITUTE($AS9,IFERROR(RIGHT($AS9,LEN(AS9)-FIND(" ",$AS9)-10),""),""))," ",REPT(" ","99")),99)),",","")
Formula 3 only returns up to 10 characters after the end of the first name, without displaying the middle initial, so longer last names are cut off (e.g. Heisenberger --> Heisenberg).
Truth is, working with names can be subject to various edge-cases that will prove a working solution wrong at some point. But for those samples shown I'd use FILTERXML() to "split" these input strings on the spaces and use xpath expressions to filter out those substrings:
Formula in B1:
=FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ.', '')!=''][translate(., 'aeiouAEIOU', '')!=.][last()]")
The trick here is that there are three coherent xpath expressions working together:
[translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ.', '')!=''] - Assert that node is not nothing when all uppercase and dots have been substituted with nothing.
[translate(., 'aeiouAEIOU', '')!=.] - Assert that node is not equal to its original node when all vowels (upper- and lowercase) have been substituted with nothing.
[last()] - The last() function returns an integer equal to the context size from the expression evaluation context, so this predicate returns the last node that passed the previous two tests.
I'd guess that depending on possible edge-cases you could add more rules to the equation. For a more comprehensive insight on these expressions you could have a look here.
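To make the three xpath rules above concrete, here is a small Python sketch that applies the same filters to the space-separated words. It is only an illustration of the logic, not a replacement for FILTERXML; the function name is hypothetical, and it returns an empty string where the formula would return an error.

```python
# Deletion tables that mirror the two translate() calls in the xpath.
UPPER_AND_DOT = str.maketrans("", "", "ABCDEFGHIJKLMNOPQRSTUVWXYZ.")
VOWELS = str.maketrans("", "", "aeiouAEIOU")


def last_name(full_name: str) -> str:
    """Return the last word that passes both filters, or "" if none does."""
    candidates = [
        word
        for word in full_name.split(" ")
        if word.translate(UPPER_AND_DOT) != ""  # drops initials and "III"
        and word.translate(VOWELS) != word      # drops vowel-less words such as "Jr"
    ]
    return candidates[-1] if candidates else ""  # [last()]


for name in ("Donald P Bellisario", "Dale Earnhardt Jr", "Jimmy M Butler III"):
    print(name, "-->", last_name(name))
```

The same edge cases apply as for the formula: a last name written entirely in capitals, or one without any vowels, would be filtered out.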
Good luck.

Formula to look for a word within a sentence

Here is the Sample Google sheet file
https://docs.google.com/spreadsheets/d/1B0CQyFeqxg2wgYHJpFxLIzw_8Pv067p0cwacWk0Nc4o/edit?usp=sharing
I have an Excel Sheet where I need to find Arabic Words and separate them.
For example, I have data like this:
//olyservice/GIS-TANSIQ01/Storage/46-أمانة منطقة عسير -بلدية بللحمر/حدود القري المطلوب اعتمادهاالمعتمد مسمايتها بالوزارة.rar
I'm looking for:
1st column: أمانة منطقة عسير
2nd column: بلدية بللحمر
3rd column: RAR
If there is no أمانة and بلدية words, the columns should be blank.
I tried these methods, without success:
=RIGHT(MID(A2,FIND("-",A2,20)+1,255),25)
and
=TRIM(MID(SUBSTITUTE(A2,"",REPT(" ",99)),MAX(1,FIND("-",SUBSTITUTE(A2,"",REPT(" ",99)))+21),99))
Since you specify certain key words to be found, we can look for those key words and then the relevant delimiter, based on your example.
In your example, أمانة is followed by the dash, and بلدية by the slash ("followed by" in terms of the right-to-left orientation of Arabic words).
Try this:
Col1: =MID(A1,FIND("أمانة",A1),FIND(CHAR(1),SUBSTITUTE(A1,"-",CHAR(1),LEN(A1) - LEN(SUBSTITUTE(A1,"-",""))))-FIND("أمانة",A1))
Col2: =MID(A1,FIND("بلدية",A1),FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))))-FIND("بلدية",A1))
Col3: =TRIM(RIGHT(SUBSTITUTE(A1,".",REPT(" ",99)),99))
If the keywords are not found, the formula will return an Error. So you can just "wrap" the formula in IFERROR to have it return a blank if the key words are not present.
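The three formulas above share one idea: start at the keyword, stop at the last occurrence of a delimiter, and return a blank when the keyword is missing. A minimal Python sketch of that idea follows; the function names are hypothetical, and the sample path is a shortened stand-in for the one in the question (the file name "file.rar" is made up).

```python
KW_AMANA = "أمانة"
KW_BALADIYA = "بلدية"
AMANA = KW_AMANA + " منطقة عسير"
BALADIYA = KW_BALADIYA + " بللحمر"
# Shortened, hypothetical path with the same layout as the question's example.
PATH = "//olyservice/GIS-TANSIQ01/Storage/46-" + AMANA + " -" + BALADIYA + "/file.rar"


def from_keyword_to_last(text: str, keyword: str, delimiter: str) -> str:
    """Text from keyword up to (not including) the last delimiter, or ""."""
    start = text.find(keyword)   # FIND(keyword, A1)
    end = text.rfind(delimiter)  # position of the last delimiter
    if start == -1 or end < start:
        return ""                # what wrapping the formula in IFERROR gives
    return text[start:end]


def extension(text: str) -> str:
    """Text after the last ".", like the Col3 formula."""
    return text.rsplit(".", 1)[-1].strip()


col1 = from_keyword_to_last(PATH, KW_AMANA, "-")     # keeps the space before the dash
col2 = from_keyword_to_last(PATH, KW_BALADIYA, "/")
col3 = extension(PATH)
```

Like the Col1 formula, the first result keeps the trailing space that precedes the dash; trimming it is a separate step.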
Edit:
The actual workbook does not have the same pattern as the sample you posted. Try this for column 2 data:
=MID(A2,FIND("بلدية",A2),99)
or with error suppression:
Col1: =IFERROR(MID(A2,FIND("أمانة",A2),FIND("-",A2,FIND("أمانة",A2))-FIND("أمانة",A2)),"")
Col2: =IFERROR(MID(A2,FIND("بلدية",A2),99),"")
And, the cells that are still returning the #VALUE! error do not have that keyword in the line.
For example:
A6: //olyservice/GIS-TANSIQ01/Storage/103-أمانة منطقة عسير -أحد رفيدة
does not contain بلدية
BTW, those formulas seem to both work on Sheets also.
Edit2:
Since you also posted an example in Sheets, if you can implement this in Sheets, you can use Regular Expressions to account for multiple terminations.
In that case, you would use:
=iferror(REGEXEXTRACT(A2,"(أمانة.*?)\s*(?:[-/\\.]|$)"),"")
or
=iferror(REGEXEXTRACT(A2,"(بلدية.*?)\s*(?:[-/\\.\w]|$)"),"")
for the columns.
The regex extracts the pattern that begins with the key phrase, up to a terminator, which can be any character in the set -/\. (plus, in the second formula, any word character \w, i.e. A-Za-z0-9_) or the end of the line. That seems to cover the examples in your sample worksheet, but if there are other terminators, you can add them to the set.
In Excel, this would require a VBA UDF to implement the Regex engine.
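For readers who want to try the two patterns outside Sheets, here is a hedged Python sketch. It assumes Python's re module as a stand-in for the Sheets regex engine; the re.ASCII flag is needed so that \w and \s only match ASCII characters, since otherwise \w would also match Arabic letters and cut the second match short. The function name is hypothetical and the sample string is rebuilt from the A6 example above.

```python
import re

# The two patterns from the formulas above; re.ASCII keeps \w and \s ASCII-only.
AMANA_RE = re.compile(r"(أمانة.*?)\s*(?:[-/\\.]|$)", re.ASCII)
BALADIYA_RE = re.compile(r"(بلدية.*?)\s*(?:[-/\\.\w]|$)", re.ASCII)


def extract(pattern: "re.Pattern[str]", text: str) -> str:
    """First capture group of the first match, or "" (the iferror fallback)."""
    match = pattern.search(text)
    return match.group(1) if match else ""


AMANA = "أمانة منطقة عسير"
# Same layout as the A6 example: the أمانة phrase, a dash, and no بلدية keyword.
A6 = "//olyservice/GIS-TANSIQ01/Storage/103-" + AMANA + " -" + "أحد رفيدة"

first = extract(AMANA_RE, A6)      # the أمانة phrase without the trailing space
second = extract(BALADIYA_RE, A6)  # "" because the keyword is absent
```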

VLOOKUP with conditions

I have an issue at the moment which I'm not able to resolve even with multiple combinations of If and Vlookups. I'm not doing this right.
I have a sheet which has the names of the products and an empty column for the Sl Number. The Sl number needs to be retrieved from Sheet 2 if it matches the value in the adjacent cell of the formula (This I know can be possible with Vlookup). However, I am trying to display the value even if the match is not exact. By that I mean if the product name has all the values as on the sheet 1 but also has additional information in brackets, then the value should still be displayed.
Sheet 1
Formula in A2 - A7 = "=VLOOKUP(B2, Sheet2!B:E, 2, 0)"
Sheet 2
The complete data
Is this possible?
Thanks in advance.
Apologies, I'm new here and not sure how this works. So trying to do the right thing but may take some time.
Thanks Frank and Tim. I have another extended question to this.
Is there a way to retrieve the value by ignoring text in brackets on the lookup cell itself?
For example:
Sheet 1
Sl Number | Name
123454    | Cream SPF 30+ 50g
**NA**    | Bar Chocolate 70g X 6 (Sample)
234256    | Hand Wash 150ml
26786     | Toothpaste - Whitening 110g
Sheet 2
ID | Name                  | Sl number | Manufacturer | Quantity
8  | Collagen Essence 10ml | 456788    | AL           | 87
9  | Hand Wash 150ml       | 234256    | AD           | 23
10 | Bar Chocolate 70g X 6 | 835424    | AU           | 234
Row 2 on Sheet 1 has the name that includes (Sample) and the same product on sheet 2 does not contain the (Sample) for that product. Is there a way I can use lookup in the above scenario?
Thank you
Tim's comment:
Use =VLOOKUP(B2 & "*", Sheet2!B:E, 2, 0) as long as the "Extra" info is tagged onto the end of the name, and none of your product names is a substring of another product name. (Tim Williams)
That will get what you are looking for. As for getting rid of the text between "(...)", use
=IFERROR(IF(FIND("(",A2),LEFT(A2,FIND("(",A2)-1),A2),A2)
This creates a new column with anything in parentheses "(...)" cut out. It presumes that every "(...)" sits at the end of the entry, i.e. on the far right.
As you are new, I presume you might be interested in an explanation. I'll explain what Tim and I did. If I am incorrect, anyone is free to edit.
Based on your question, it would appear that you are familiar with Excel but not the site. That said, my understanding is that the key difference between your attempt and Tim's =VLOOKUP(B2 & "*", Sheet2!B:E, 2, 0) is specifically the & "*". This appends a wildcard to the lookup value. So if you typed "Bob" but the actual entry was "Bob's Burger", the "*" allows ['s Burger] to be included as part of the match. Note that wildcards only work when VLOOKUP is set to exact match, which is what the final argument, 0, does.
As for my part, IFERROR is effectively a catch-all for errors in the expression it wraps. If there is an error, it returns the fallback instead. In this case, FIND throws an error if it does not find "(" in the cell, so the original cell is displayed.
As for IF(FIND("(",A2),LEFT(A2,FIND("(",A2)-1),A2): it asks Excel to look for "(" in cell A2. If it finds one, LEFT keeps only the characters before the first "(", thus removing the "(...)". Any space in front of the "(" is kept; wrapping the result in TRIM removes it.
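The two steps discussed above (cut off the bracketed text, then do a wildcard lookup) can be sketched together. This Python is only an illustration with hypothetical function names; the rows are the Sheet 2 sample from the question. Note one deliberate difference: the sketch also trims the space left in front of the "(", which the formula as written leaves in place.

```python
from typing import List, Optional, Tuple

# (Name, Sl number) pairs taken from the question's Sheet 2 sample.
SHEET2: List[Tuple[str, int]] = [
    ("Collagen Essence 10ml", 456788),
    ("Hand Wash 150ml", 234256),
    ("Bar Chocolate 70g X 6", 835424),
]


def strip_brackets(name: str) -> str:
    """Keep only the text before the first "(", trimmed; unchanged if there is none."""
    pos = name.find("(")
    return name if pos == -1 else name[:pos].strip()


def lookup_prefix(key: str, rows: List[Tuple[str, int]]) -> Optional[int]:
    """Like VLOOKUP(key & "*", ..., 0): first row whose name starts with key."""
    wanted = key.lower()  # VLOOKUP text matching is case-insensitive
    for name, sl_number in rows:
        if name.lower().startswith(wanted):
            return sl_number
    return None  # where VLOOKUP would return #N/A


print(lookup_prefix(strip_brackets("Bar Chocolate 70g X 6 (Sample)"), SHEET2))  # 835424
```

The prefix match has the limitation Tim mentions: if one product name is the start of another, the first row found wins.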

Excel SUMIF with exact word but not case sensitive match

I have a simplified table for this problem example
Column A       |  Column B  |  Column C
war            | 1          | war
War            | 2
warred         | 3
war and peace  | 4
awful war      | 5
dead war horse | 6
Now I need to find all rows containing the word "war" that is not case sensitive, but must be a separate word, not a part of another word.
For example
=SUMIF(A1:A6;C1;B1:B6)
right now finds only values "war" and "War" and SUM is 3.
I want it to find also values "war and peace", "awful war" and "dead war horse" since they all contain the word "war" and the SUM value should be 18.
I can't use search term
"*war*"
since this also includes the value "warred" and this is a separate word and shouldn't match.
One possibility is to create 4 different SUMIF-s with terms
war
war_*
*_war
*_war_*
_ is space
and then sum those four, but this is not that elegant.
I thought SUMPRODUCT with EXACT would work, but this seems to work over columns, not rows, and EXACT isn't suitable, I think.
Is there a way to match row based on word that is not case sensitive and then sum all the values in Column B that have a matching row?
You could use:
=SUMPRODUCT((ISNUMBER(SEARCH(" "&C1&" "," "&A1:A6&" ")))*B1:B6)
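The formula works by padding both the search word and each cell with a space on either side, so that "war" can only match as a whole word. The same idea as a short Python sketch, using the sample table from the question (the function name is hypothetical):

```python
from typing import List, Tuple

ROWS: List[Tuple[str, int]] = [
    ("war", 1),
    ("War", 2),
    ("warred", 3),
    ("war and peace", 4),
    ("awful war", 5),
    ("dead war horse", 6),
]


def sum_whole_word(rows: List[Tuple[str, int]], word: str) -> int:
    """Sum the values of rows whose text contains word as a separate word."""
    needle = " " + word.lower() + " "  # " " & C1 & " "
    return sum(
        value
        for text, value in rows
        if needle in " " + text.lower() + " "  # SEARCH is case-insensitive
    )


print(sum_whole_word(ROWS, "war"))  # 18
```

Only spaces count as word boundaries here, exactly as in the formula, so a cell such as "war," with punctuation attached would not match.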

Resources