Comparing value similarities in Excel - excel-formula

I have 3 columns in Excel that I'm trying to compare against.
E.g.
I want to make sure that the email address should be more similar to Enquirer Name than it is to Client Name.
My current formula looks at if Enquirer Name differs to Client Name, then show me email addresses that match more than 5 characters with Client Name, which doesn't quite eliminate enquirers related to clients where their email address satisfies the equation.
Is Excel capable of returning something like Enquirer-Email match 9 characters, Client-Email match 5 characters, therefore no need to alert. Whereas if it's the other way around, conditional formatting to highlight the line?
Many thanks in advance!

Related

Lookup several keywords in a column, if found, return 4 columns

I've got two databases, one with 11,000 entries, and another that I've narrowed down to around 600. They are databases with Company name, and contact information for people at that company. So column A - Company Name. Column B - last name, column C - first name, column D - position, and column E - email address. What I'd like to do is search column D - position for several keywords - vice, benefit, resource, and if found, return the four columns - last name, first name, position and email address. So for company name X, we might have 10 contact names, and a couple may have contact info for people that fall into those keywords. I'd like to return those specific people.
I've managed to get the formatting between the company names normalized between the two lists, using brute force, and some index matching formulas (that was fun!), so those are the same, and I could probably do something like add 5 or 6 rows after each unique company name to accommodate the potential number of contacts we might have for each company, but I have no idea how to return multiple specific cells for a keyword search.
I think something like this might work -
=index(columntoreturn, small(if(isnumber(search(keywords, columntosearch)), match(row(column), row(column))), rows(array)))
But that will only return an individual cell, rather than the four I would need.
Here's an example of the two databases I'm working with.
as you asked the same have been closed as answered thorugh comment behalf of #Scott Craner
Answer
As I stated in my first comment, Advanced Filter is designed for this. You can put code to automatically do it based on certain cells on the page changing values. A formula is not ideal in that it would require it to be an array formula and the more array formulas and the larger the dataset would bog down the calc times. See here for an example on how to set up advanced filter using vba.

Excel Formula for Validating Email Addresses

I would like to check if anyone can help with the below.
I would like to have a validation formula for email addresses. After combing through the internet and other threads, I found something that works.
However, I'd like the data validation to check for comma and flag that as error too. The current formula only flag spaces.
Any advice/suggestion to tweak this formula?
=AND(FIND(“#”,A2),FIND(“.”,A2),ISERROR(FIND(” “,A2)))
To "tweak" your formula to also check for a comma, you can use an array constant:
EDIT: to correct formula to allow for its use in Data Validation (remove array constant)
=AND(ISNUMBER(FIND("#",A2)),ISNUMBER(FIND(".",A2)),ISERROR(OR(FIND(" ",A2),FIND(",",A2))))
I provide this only as a method of showing one way of testing to exclude multiple characters from a string. This is NOT, in my opinion, a good way to validate an email address.
Note that your formula, if it does NOT find an "#" or a ".", will return a #VALUE! error. If you would rather it return FALSE, suggest you wrap those FIND's in an ISNUMBER function.
However, and this is my opinion only, I think the best way to ensure that an email address is both valid and correct is to use a system which includes an activation email. That can avoid both typos and malformed data.
Of course, your formula will allow certain types of invalid emails. For example, it does not test that the # is in the middle of the string, nor the dot. Nor that the dot precedes the #. etc, etc, etc

Best match of a string from a predefined list of values?

I am doing a survey on household appliances, visiting stores. I note down the full model number by hand in a notebook. Back home, I need to fill these in a spreadsheet. I have a partial list of the model numbers. I want to minimise my effort by entering only unique parts of the model number string.
I'm using vlookup with * wildcard in excel for non-exact match. But problem is that it returns the first match, even though a better match is available.
Also, I might miss out on a hyphen or "/" and I need some mechanism to correct for that.
Is there any solution for Excel, Libreoffice/Openoffice or Google Sheets?
Assuming your model numbers are in column A and your search value is in B2, you can use this. It will handle the "/".
=query(A:A,"select A where lower(A) contains '"& lower(B2) &"'")

Excel formula to search partial match and align row

I have 2 column data in Excel like this:
Can somebody help me write a formula in column C that will take the first name or the last name from column A and match it with column B and paste the values in column C. The following picture will give you the exact idea what I am trying to do. Thanks
Since your data is not "regular", you can try this formula which uses wild card to look for just the last name.
=INDEX($B$1:$B$4,MATCH("*" &MID(A1,FIND(" ",A1)+1,99)&"*",$B$1:$B$4,0))
It would be simpler if the first part followed some rule, but some have the first initial of the first name at the beginning; and others at the end.
Edit: (explanation added)
The FIND returns the character number of the first space
Add 1 to that to get the character number of the next word
Use MID to extract that word
Use that word in MATCH (with wild-cards before and after), to find it in the array of email addresses. This will return it's position in the array (row number)
Use that row number as an argument to the INDEX function to return the actual email address.
If you want to first examine the email address, you will need to determine which of the letters comprise the last name. Since this is not regular according to your example, you will need to check both options.
You will not be able to look for the first name from the email, as it is not present.
If you can guarantee that the first part will be follow the rule of one or the other, eg: either
FirstInitialLastName or
LastNameFirstInitial
Then you can try this:
=IFERROR(INDEX($B$1:$B$4,MATCH(MID(A1,FIND(" ",A1)+1,99)& LEFT(A1,1) &"*",$B$1:$B$4,0)),
INDEX($B$1:$B$4,MATCH( LEFT(A1,1)&MID(A1,FIND(" ",A1)+1,99) &"*",$B$1:$B$4,0)))
This seems to do what you want.
=IFERROR(VLOOKUP(LOWER(MID(A1,(SEARCH(" ",A1)+1),LEN(A1)))&LOWER(MID(A1,1,1))&"*",$B$1:$B$4,1,FALSE),VLOOKUP(LOWER(MID(A1,1,1))&LOWER(MID(A1,(SEARCH(" ",A1)+1),LEN(A1)))&"*",$B$1:$B$4,1,FALSE))
Its pretty crazy long and would likely be easier to digest and debug broken up into columns instead of one huge formula.
It basically makes FLast and FirstL out of the name field by splitting on the space.
LastF:
=LOWER(MID(A1,(SEARCH(" ",A1)+1),LEN(A1)))&LOWER(MID(A1,1,1))
And FirstL:
=LOWER(MID(A1,1,1))&LOWER(MID(A1,(SEARCH(" ",A1)+1),LEN(A1)))
We then make 2 vlookups for these by using wildcards:
LastF:
=VLOOKUP([lastfirst equation from above]&"*",$B$1:$B$4,1,FALSE)
And FirstL:
=VLOOKUP([firstlast equation from above]&"*",$B$1:$B$4,1,FALSE)
And then wrap those in IfError so they both get tried:
=IfError([firstLast vlookup],[lastfirst vlookup])
The rub is that's going to be hell to edit if you ever need to, which is why I suggest doing each piece in another column then referencing the previous one.
You also need to be aware that this answer will get tripped up by essentially the same name - e.g. Sam Smith and Sasha Smith would both match whatever the first entry for ssmith was. Any solution here will likely have the same pitfall.

Excel: How to parse/cast text as a formula?

Is it possible to parse/cast text (like "=A1+A2") as a formula in MS Excel? I want to build a formula from pieces of text - some of which will only be typed in later by a user.
If the INDIRECT() function did not only work for referencing cells, then I could have typed this =INDIRECT("=A1+A2").
I know you can a work around this problem by simply adding a lot more hidden columns to do sub calculations. But for the sake scalability and efficiency, I would rather do something like the above.
I found a similar questions here and here, yet they don't solve my problem.
The Real-world problem:
Read on for a better understanding as to why you would want to do the above
Scenario
Each item in the list consists of a string, which contains anywhere from 1 to 5 account names each. Each account name is followed by an account number in brackets. The length of the number determines the type of account. Part of the account number is a date, of which the date format depends on the type of account. Further more, each account type may have more that 1 account-number length associated with it, although each number-length[*] is only associated with 1 account type.
Objectives
Extract account-names and their respective account-numbers and account-types from a list.
Make an assumption as to the account-type from the account-number
Validate this assumption by inspecting the build of the number and elements in the name
Check the validity of the account-numbers depending on their type.
The tricky part (this is where my problem lies)
The account-types and their respective account-number-lengths are not known before hand, and are typed into a table by the user of the sheet, specifying a type of account and the number-lengths associated with this account-type. The user should type this into a list - not go and tinker around with delicate formulas
Done so far
Column A: Contains the raw data (each cell has up to 5 names and numbers)
Columns B..F: Each column extracts 1 name, remains empty if all are already extracted
Columns G..K: Each column extracts 1 number corresponding to its name in columns B..F, remains empty if all are already extracted
Columns L..P: Each column calculates the length of the corresponding number in columns
G..K
Now the user would type the following details into a table which assigns certain number-lengths an account type:
TYPE2, BUSINESS, (OR(length=13,length=6))
where length will later be replaced with the cell address which contains the calculated account number-length.
What I want to do now
Columns Q..U:
Should all indicate the account-type of the corresponding account-number in columns G..K. The idea is to build a nested if-elseIf-elseIf formula using the criteria typed in by the user as specified above. Example of one of the elseIF statements:
SUBSTITUTE(CONCATENATE("=IF(",criteria,",",type,",",errCode)),"length","O10"))
All of these elseIf statements will then be concatenated together to form a master formula which will then need to be parsed/cast as a formula to calculate the account-type
This proposal uses only 5 columns (1 for each account-number, containing the master formula) and a table specifying account-types and criteria, also keeping the user away from formulas. Editing 1 line of code (the criteria) will update all formulas. Efficient & Scalable.
Since the user should never tinker around with the formulas under the hood, a simple 1 column if-elseIf-elseIf is out of the question. The alternative to the above would be to make a separate column to test for each account-type for each account-number. Separating/Abstracting out each test to its own column results in much better readability, easier editing & much less debugging - Unless you like multi-screen-wide-formulas. Example: 5 account-numbers * 10 possible account types = 50 extra columns.
Each edit to any criteria needs to copied to 4 other non-adjacent columns and drag-filled down 10,000 rows (columns can not be adjacent since it is effectively a 5x5 array of columns). Not Efficient nor scalable. Unless I'm missing some elegant way of updating non-adjacent formulas in a single click
The rest of the validations error indications are trivial.
Sample data
Tshepo Trust (6901/2005) Marlene Mead (8602250646085)
Great Force Inv 67 Pty Ltd (200602258007)
Jane (870811) Livingstone (6901/2005) Janette Appel (8503250647056) James (900111)
I know all this would probably be much easier to achieve with clever usage of VBA, eliminating all the need to simulate abstraction, encapsulation, multi-dimensional arrays and functional programming on a spreadsheet. But until I can program in VBA, worksheet formulas will be my refuge.
[*]: account number-length could also be described as the amount of digits in the number or as indicated by this formula: LEN(accNumber)
In VBA you have access to Cell.Formula.
I usually used Range to peek a cell by address.
I'm not sure if this would answer your question(it's a very detailed question!), but if your user was entering the account numbers in a table (I'm calling it 'RefTable') , that was:
Length of account number | business type
----------------------------------------
6 | Accountant
8 | Advisor
Then you could just use a vlookup on the length of the account number, given you've already separated them out.
=vlookup(len(accNumber), Reftable, 2, false)
Make sure that you either use a dynamic range name, or specify plenty of space below in RefTable, so that when your users add types, they don't get lost.
Also, if you have two different accounts with the same length, this could get you into trouble.

Resources