Finding Exact Matches in Text Strings with Search - excel

I've included example screenshots and the forumlas bellow. I'm basically trying to run a list of Competitors against a list of searches. The competitor list is static and the searches list will grow and change overtime.
I'm finding it challenging to get consistent results with the search formula. It either is too loose (example A) or too strict (example B).
Is there a way to use the formula to exact match whole words in a text string? Or do I need to get sneaky with a VBA (something I'm not too familiar with). Or should I be doing this in another program?
Thanks!
Example A
So here, my formula works and says that someone searched containing avamere. However, It gives me a false positive on "roseville communities" because it has seville in roseville.
Example A Formula:
SUMPRODUCT(--ISNUMBER(SEARCH($C$2:$C$5,B2)))>0
Example B
So I tried to normalize the competitors with spaces:
Now they're all false.
Example B Formula:
SUMPRODUCT(--ISNUMBER(SEARCH(" "&$C$10:$C$13&" ",B10)))>0

Related

Find value within cell from a series in Excel

I have a list of addresses, such as this:
Lake Havasu,  Lake Havasu City,  Arizona.
St. Johns River,  Palatka,  Florida.
Tennessee River,  Knoxville,  Tennessee.
I would like to extract the State from these addresses and then have a column showing the abbreviated State name (AZ, FL, TN etc.).
I have a table that has the States with their abbreviation and once I extract the State, doing a simple INDEX MATCH to get the abbreviation is easy. I don't want to use text-to-columns because this file will constantly have values added to it and it would be much easier to just have a formula that does the extraction for me.
The ways I've tried to approach this that have failed so far are:
Some kind of SEARCH() function that looks at the full State list and tries to find a value that exists in the cell
A MID or RIGHT approach to only capture the last section but I can't work out how to have FIND only look for the second ", "
A version of INDEX MATCH but that fails because I can't find a good way to search or find the values as per approach (1)
Any help would be appreciated!
Please try this formula, where A2 is the original text.
=FILTERXML("<data><a>" & SUBSTITUTE(A2,", ","</a><a>") & "</a></data>","data/a[3]")
An alternative would be to look for the 2nd comma as shown below. Note that the "50" in the formula is an irrelevant number required by the MID() function. It shouldn't be smaller than the number of characters you need to return, however.
Char(160) is a character that wouldn't (shouldn't) naturally occur in your text, as it might if the text comes from a UNIX database. You can replace it with another one that fits the description.
=TRIM(MID(A2, FIND(CHAR(160),SUBSTITUTE(A2,",",CHAR(160),2)) + 1,50))
The following variation of the above would remove the final period. It will fail if there is anything following the period, such as an unwanted blank. That could be accommodated within the formula as well but it would be easier to treat the original data, if that is an option for you.
=TRIM(MID(LEFT(A2, LEN(A2)-1), FIND(CHAR(160),SUBSTITUTE(A2,",",CHAR(160),2)) + 1,50))
To find the abbreviation I would recommend to use VLOOKUP rather than INDEX/MATCH.
Use this (screenshot refers):
=MID(MID(B3,1,LEN(B3)-1),SEARCH(",",B3,SEARCH(",",B3,1)+1)+3,LEN(B3))

Finding the right range from excel table

What is the best way to find the right column for the travelled miles using visual basic coding or some excel function and return the price from that column? HLOOKUP can't be used here because the lookup value isn't exact and the ranges in the table are also not with specific intervals (If they were, I could use e.g. FLOOR(travelled miles/100)*100 and find the price with HLOOKUP). Obviously, it's easy to find the price manually with a small table but with a big table computer will be faster.
Note that, if x is between a and b, then MEDIAN(x,a,b)=x. Combine this with some nested IFs:
=IF(MEDIAN(B5,B1,C1-1)=B5,B2,IF(MEDIAN(B5,C1,D1-1)=B5,C2,IF(MEDIAN(B5,D1,E1-1)=B5,D2)))
I'm on my phone, so just done the first three cases, but hopefully you can see how it continues.
(should note you need to remove the dashes for this to work)
Edit:
I also want to answer your question in the comments above. You can use the following to keep the dash, but get a number to work with.
Assume cell A1 has got the value 10-. We can use the FIND function to work out where the - occurs and then use the LEFT function to only return the characters from before the dash:
=LEFT(A1,FIND("-",A1)-1)
This will return the value 10, but it will return it as a string, not a number - basically Excel will think it is text. To force Excel to consider it as a number, we can simply multiply the value by one. Our formula above therefore becomes:
=(LEFT(A1,FIND("-",A1)-1))*1
You may also see people use a double minus sign, like this:
=--LEFT(A1,FIND("-",A1)-1)
I don't recommend this because it's a bit complex, but combining with the formula above would give:
=IF(MEDIAN(B5,--LEFT(B1,FIND("-",B1)-1),--LEFT(C1,FIND("-",C1)-1)-1)=B5,B2,IF(MEDIAN(B5,--LEFT(C1,FIND("-",C1)-1,--LEFT(D1,FIND("-",D1)-1-1)=B5,C2,IF(MEDIAN(B5,--LEFT(D1,FIND("-",D1)-1,--LEFT(E1,FIND("-",E1)-1-1)=B5,D2)))

Need to understand fomula re: excel search for a string in a table and return string if true

I've adapted this solution from a couple of years ago:
=LOOKUP(2^15,FIND(Keywords,A2),Categories)
I use this for searching within a description field for keywords in a named list, in order to return a corresponding category from an adjacent named list.
However I do not understand the significance of 2^15. Can someone explain?
Also it's unclear in what order the search operates. If two keyword options were "check" and "deposit," and they were assigned to different categories, but both appeared in the same description field cell, how do I know which will be found first? Is it placement in the string, or order in the list?
2^15 is simply an arbitrarily large number, which lookup attempts to find - when it can't find it, it takes the next lowest number.
Effectively your formula looks at Keywords, and attempts to find the value in A2. For each word that actually matches A2, it provides a non-error message. Then out of the whole list, it attempts to find that line number in categories, resulting in many errors, and a single correct value. Lookup picks the value by using 2^15. Though this seems to be a weird way of doing it; it is likely a holdover of pre-2007, as Lookup is generally used now only for backwards compatibility purposes. Also using 1 instead of 2^15 worked for a couple of simple cases that I tried when writing this up.

Excel, Numberplate Clarification

I am working on an excel document for fuel cards at the minute and my current issue is to write in a formula for validating number plates based on UK standard plates (two letters followed by two numbers then three letters i.e. BK08JWZ). At this point in time we are not considering personal plates in this just to keep things simple.
Ideally I need excel to look at the text in the box and confirm it to an agreed layout but I am struggling to find the right formula. The plates are in column 'I' and I have already added in another column after titled 'approved plates' in column 'J'but this can be deleted if it's not needed.
Results wise, I can do this one of two ways, to either get the excel document to highlight and number plates that do not match the DVLA standard , or have a column next to the number plate column that registers a boolean response to the recognition i.e. If it is valid (true) or if not (false).
Either way the plate needs to be able to be seen as it was currently, so if there is something wrong with it, it needs to be visible, not throw up an error message.
Any help would be very welcome.
All the information on UK standard number plates are on this site:
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/359317/INF104_160914.pdf
I would do it like this:
1) create a lookup sheet with data from the booklet. One column for allowed "memory tag" identiffiers (first two letters), one column for the allowed "age identiffiers" (first two numbers), and one column for allowed random letters (last three letters, full alphabet except I and Q)
2) strip spaces from the number plate for comparison
3) Use MID(numberplate,1,2), MID(numberplate,3,2) and MID(numberplate,5,3) to compare to each lookup list repectively (using INDEX()>0).
4) when all 3 parts are found in lookup lists the number plate is valid.
Try researching Regular Expressions or RegEx. This is a powerful programming tool to determine whether strings match specific patterns. You can use RegEx expressions to extract the pattern, replace the pattern or test for the pattern. Very efficient but not for the faint-hearted although there is plenty of help on-line. Try this article for starters.
The following RegEx may be what you need..
(?^[A-Z]{2}[0-9]{2}[A-Z]{3}$)|(?^[A-Z][0-9]{1,3}[A-Z]{3}$)|(?^[A-Z]{3}[0-9]{1,3}[A-Z]$)|(?^[0-9]{1,4}[A-Z]{1,2}$)|(?^[0-9]{1,3}[A-Z]{1,3}$)|(?^[A-Z]{1,2}[0-9]{1,4}$)|(?^[A-Z]{1,3}[0-9]{1,3}$)
This was copied from this article which gives a very full explanation using DVLA rules.
EDIT:
To use RegEx within Excel. In the IDE, Tools menu, select References and add the Microsoft VBScript Regular Expressions 5.5 reference.
With acknowlegement to user3616725s helpful observation.

Excel 2013, how to us the "search" function like vlookup

Essentially, I am looking for a way to use the "search" function like the "vlookup" function. In my case, I am have a long list, of say, 1000 descriptions of different types of fasteners and I want to classify them according to what they are, (ie. Nut, bolt, washer etc.). However, I can't sort by description or partnumber because they, alphanumerically, don't line up by class. But he descripotion field does say at some point in it, what it is(ie. Nut, bolt, washer etc.).
As said, I have a table of classes, and I am looking for a formula that would look in the "description" field for all the values in the table,and then return that value, or one associated with it (like vlookup does with cell values).
So that, if it found "nut" in the description, it would return "nut", or if it found "bolt" it would return "bolt."
I hope that this question makes sense. Let me also say that I found a way "manually" do this using the search function, along with others, but the formula was very long and each value in my table had to be specially called out. However, I will include the formula I used to make clear what I was trying to do.
See below.
=IF(ISNUMBER(SEARCH($G$2,C3)),$G$2,IF(ISNUMBER(SEARCH($G$3,C3)),$G$3,IF(ISNUMBER(SEARCH($G$4,C3)),$G$4,IF(ISNUMBER(SEARCH($G$5,C3)),$G$5,...IF(ISNUMBER(SEARCH($G$13,C3)),$G$13,"MISC"))))))))))))
You see that with each item you add to your table, you have to add another if loop. I am hoping there is a better way. (I would call it "vsearch" :-) )
Try this formula
=IFERROR(LOOKUP(2^15,SEARCH($G$3:$G$13,C3),$G$3:$G$13),"MISC")
SEARCH returns an array of numbers or errors depending on whether each term is found in C3. By searching for "bignum" (in this case 2^15) which won't be found, the match is always with the last number, i.e. the last matching term in G3:G13.
MATCH can be used to find the text, and INDEX can be used to return the text
using your example, where you are searching in G:
=MATCH("*"&C3&"*",$G:$G,0)
and then index to return the text
=INDEX($G:$G,MATCH("*"&C3&"*",$G:$G,0))
and as a finishing touch, the #VALUE! replacement
=IFERROR(INDEX($G:$G,MATCH("*"&C3&"*",$G:$G,0)),"MISC")

Resources