How to find the last 2 letters in an alphanumeric string? - excel

I have a column of alphanumeric addresses with no punctuation. In all cells, the state is the last 2 letters (abbreviation), followed by the ZIP. However, sometimes the ZIP is 5 digits and sometimes it's xxxxx-xxxx, so a fixed MID() or RIGHT() offset won't work. Can anyone think of a formula that will work?

If those are the only two options: ##### and #####-#### then:
=MID(A1,LEN(A1)-IF(ISNUMBER(--MID(A1,LEN(A1)-5,1)),11,6),2)
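For readers who want to check the offsets outside Excel, here is a minimal Python sketch of the same logic. It assumes, as the offsets 6 and 11 in the formula imply, that the state abbreviation sits directly in front of the ZIP with no space in between; the function name `state_abbrev` and the sample addresses are made up for illustration.

```python
def state_abbrev(address):
    """Mirror of the MID/LEN formula: pick the two letters before the ZIP."""
    n = len(address)
    # Excel's 1-based position LEN-5 is 0-based index n-6.
    # A digit there means the ZIP is the long #####-#### form.
    back = 11 if address[n - 6].isdigit() else 6
    # MID(A1, n-back, 2) starts at 1-based position n-back.
    start = n - back - 1
    return address[start:start + 2]


print(state_abbrev("123MAINSTSPRINGFIELDIL62704"))
print(state_abbrev("123MAINSTSPRINGFIELDIL62704-1234"))
```

Both calls should print the same two-letter abbreviation, since only the offset from the end changes.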

Related

How to trim prefix in Google Sheets using various conditions

I have data as follows in Excel/Google Sheets.
Numbers that have a length of 19 characters need to be manipulated as follows:
For all strings with a length of 19, the last 6 digits need to be trimmed (I can easily do that),
and the leading prefix, which is either 200 or 20000, needs to be removed.
For example:
2005507187528000001 to 5507187528
2000017303364000001 to 17303364
I have no idea how to remove the prefix. I tried trimming the last 14 digits to get 20000 or 20055 and using this to determine whether I need to take out the first 3 or first 6, but had no success.
Please help!
Thanks
If I understood your question correctly you want to remove the first N characters whether it is 200 or 20000.
Try:
=IF(LEFT(A2,5)="20000",RIGHT(A2,LEN(A2)-5),RIGHT(A2,LEN(A2)-3))
Drag the formula down the column.
Explanation:
Using the LEFT() function you can extract the first 5 characters. You can then use IF() to check whether they equal 20000, and a combination of RIGHT() and LEN() to remove the first N characters: if the first 5 characters are 20000, remove the first 5 characters; otherwise remove the first 3.
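The same decision can be sketched in Python to sanity-check it against the two sample values from the question. This sketch also trims the last 6 digits, which the formula above deliberately leaves out; the function name `trim_code` is hypothetical, and leaving non-19-character values untouched is an assumption.

```python
def trim_code(value):
    """Drop the last 6 digits, then the leading 20000 or 200 prefix."""
    s = str(value)
    if len(s) != 19:
        return s  # assumption: only 19-character values are changed
    core = s[:-6]  # remove the trailing 6 digits
    # Same test as IF(LEFT(A2,5)="20000", ...): 5 characters off, else 3.
    return core[5:] if core.startswith("20000") else core[3:]


print(trim_code("2005507187528000001"))  # 5507187528
print(trim_code("2000017303364000001"))  # 17303364
```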
Using an ArrayFormula:
=ARRAYFORMULA(IF(A2:A="","",IF(LEFT(A2:A,5)="20000",RIGHT(A2:A,LEN(A2:A)-5),RIGHT(A2:A,LEN(A2:A)-3))))
Here's a way using arrayformula so you don't have to drag down/copy to cells below. This of course still needs to be adjusted to your range.
Note: I have not included the formula to remove the last 6 characters, since according to your question you already have it, so you can just combine this formula with yours:
"For all strings with a length of 19 last 6 digits need to be trimmed, ( i can easily do it )"
References:
Remove the First N Characters in a Cell in Google Sheets - Multiple ways to remove the first N characters, refer to this link.
LEFT()
IF()
try:
=INDEX(IFERROR(REGEXEXTRACT(A2:A&""; ".{6}$")))
update:
=REGEXEXTRACT(F905&""; "^20+(\d.*)\d{6}")
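To see what the pattern in the updated formula captures, here is a small sketch using Python's `re` module. Google Sheets uses RE2 rather than Python's engine, so treat this as an approximation that should agree for this simple pattern; the helper name `extract_core` is made up.

```python
import re

# Same pattern as in the REGEXEXTRACT formula: a leading 2 followed by
# one or more zeros, then the part to keep, then the final six digits.
PATTERN = re.compile(r"^20+(\d.*)\d{6}")


def extract_core(value):
    """Return the captured middle part, or None when nothing matches."""
    match = PATTERN.search(str(value))
    return match.group(1) if match else None


print(extract_core("2005507187528000001"))  # 5507187528
print(extract_core("2000017303364000001"))  # 17303364
```

Because `0+` is greedy, the pattern strips every zero that follows the leading 2, so it relies on the kept part not starting with a zero.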

How to make partial match between two strings with misspelled characters

I have a list of 12-character alphanumeric codes that I need to match against a list of entries where the codes might be misspelled.
For example, if the exact code is "K4I3T9OTG9GZ" the entry I have to check might be "K413T90TGS" (1 instead of capital I, 0 instead of capital O, S instead of Z).
I need to do a partial match to be able to find the right code.
Any ideas?
I already tried VLOOKUP with wildcards which worked for most entries with at least five consecutive right characters, but I still have a couple of hundred entries with no match.
Maybe this will help (array formula - Ctrl+Shift+Enter):
=SUMPRODUCT(--ISNUMBER(MATCH(MID($B$2,ROW($A$1:$A$12),1),MID($B$1,ROW($A$1:$A$12),1),0)))
The formula takes each character of the entry, one by one, and checks whether it appears anywhere in the "original"/"exact" code (MATCH tests membership, not position). In your example the result would be 7, as seven characters of the entry are found in the exact code:
K 4 - 3 T 9 - T G -
{1;1;0;1;1;1;0;1;1;0;0;0}
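A rough Python equivalent may make the scoring easier to reason about. It assumes, judging from the result array above, that B2 holds the entry to check and B1 the exact code. The second function is not in the formula; it is an alternative that compares characters position by position. Both names are hypothetical.

```python
def shared_char_count(entry, exact):
    """Count entry characters that occur anywhere in the exact code,
    which is what the MATCH-based formula measures (case-insensitive)."""
    exact_chars = set(exact.upper())
    return sum(1 for ch in entry.upper() if ch in exact_chars)


def positional_match_count(entry, exact):
    """Alternative: count positions where both strings agree."""
    return sum(a == b for a, b in zip(entry.upper(), exact.upper()))


print(shared_char_count("K413T90TGS", "K4I3T9OTG9GZ"))       # 7
print(positional_match_count("K413T90TGS", "K4I3T9OTG9GZ"))  # 7
```

The two scores happen to coincide for this example, but they can differ when characters are shifted or repeated, so the positional count is usually the stricter signal.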

How to get rid of only certain zeros with a formula?

OK, so I had a nice formula until a problem came along. Basically, I needed to get rid of the zeros in the middle of a 10-character string/range, e.g. AB00005879. To do that I used the formula SUBSTITUTE(NameRange,"0",""), which gave me a nice AB5879. Sometimes the number at the end is only 3 digits long, e.g. AB00000975, so my formula gives me AB975. All great until I stumbled on a problem: some of the strings come in a form such as AB00004020, so my formula removes every zero, leaving me with AB42. Is there a way to remove only the first four zeros in the middle and always keep the number at the end, so that the last scenario would look like AB4020? Thanks in advance.
=SUBSTITUTE(NameRange,"0","")
If you always have two characters at the start and then some zeros and then some numbers, all of which you want to keep, this should work
=LEFT(A1,2) & VALUE(RIGHT(A1,LEN(A1)-2))
EDIT #2
If your string always starts with two letters such as AB, followed by a random number of zeros and then a number string that you want to keep, try
=LEFT(A1,2)&RIGHT(A1,11-AGGREGATE(15,6,ROW($3:$10)/(--MID(A1,ROW($3:$10),1)>0),1))
Replace A1 with your actual cell reference.
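As a cross-check, the intended result (keep the two-letter prefix, drop only the zeros in front of the number) can be sketched in Python. The function name `strip_middle_zeros` is made up, and returning a single 0 for an all-zero number is an assumption that mirrors what VALUE() would give.

```python
def strip_middle_zeros(code):
    """Keep the first two characters and strip the zeros that follow them."""
    prefix, rest = code[:2], code[2:]
    # lstrip removes only leading zeros, so zeros inside the number survive.
    return prefix + (rest.lstrip("0") or "0")


for sample in ("AB00005879", "AB00000975", "AB00004020"):
    print(sample, "->", strip_middle_zeros(sample))
```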

Determining Soundex conversion

When converting the name 'Lukasieicz' to Soundex (LETTER,DIGIT,DIGIT,DIGIT,DIGIT), I come up with L2222.
However, my lecture slides say that the actual answer is supposed to be L2220.
Please explain why my answer is incorrect, or whether the lecture answer was just a typo or something.
my steps:
Lukasieicz
remove and keep L
ukasieicz
Remove contiguous duplicate characters
ukasieicz
remove A,E,H,I,O,U,W,Y
KSCZ
convert up to first four remaining letters to soundex (as described in lecture directions)
2222
append beginning letter
L2222
If this is American Soundex as defined by the National Archives, you're both wrong. American Soundex contains one letter and three numbers, so you can have neither L2222 nor L2220. It's L222.
But let's say they added another number for some reason.
The basic substitution gives L2222. But you're supposed to collapse adjacent letters with the same numbers (step 3 below) and then pad with zeros if necessary (step 4).
3. If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by 'h' or 'w' are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.
4. If you have too few letters in your word that you can't assign [four] numbers, append with zeros until there are [four] numbers. If you have more than [4] letters, just retain the first [4] numbers.
Lukasieicz # the original word
L_2_2___22 # replace with numbers, leave the gaps in
L_2_2___2 # apply step 3 and squeeze adjacent numbers
L2220 # apply step 4 and pad to four numbers
We can check how conventional (i.e. three-number) Soundex implementations behave with the shorter Lukacz, which becomes L_2_22. Following rules 3 and 4, it should be L220.
The National Archives recommends an online Soundex calculator which produces L220. So does PostgreSQL and Text::Soundex in both its original flavor and NARA implementations.
$ perl -wle 'use Text::Soundex; print soundex("Lukacz"); print soundex_nara("Lukacz")'
L220
L220
MySQL, predictably, is doing its own thing and returns L200.
This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.
In conclusion, you forgot the squeeze step.
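The squeeze and pad steps can be checked with a short Python sketch. This is an illustrative implementation of the rules quoted above, not the lecture's or any library's code; the `digits` parameter is an assumption added so the four-number variant from the question can be tried alongside the usual three-number form.

```python
CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name, digits=3):
    """American Soundex following rules 3 and 4 quoted above."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = []
    prev = CODES.get(first, "")  # rule 3 also applies to the first letter
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same number
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeat after it is coded again
    # Rule 4: truncate to the wanted length, then pad with zeros.
    return first + "".join(out)[:digits].ljust(digits, "0")


print(soundex("Lukasieicz", digits=4))  # L2220
print(soundex("Lukacz"))                # L220
```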

Replace string with special character in Excel

I want to partially mask names in Excel after concatenating:
A1: David Goliath
B1 (output): Dav*******ath
Please help. I need the first 3 and last 3 characters shown and the rest replaced by a special character. Since this formula will be applied to a long list, the length of the names will vary.
Formula
=LEFT(A1,3)&REPT("*", LEN(A1)-6)&RIGHT(A1,3)
How it works
This formula relies on string manipulation to grab the first 3 characters, the last 3 characters, and a string of * in the middle. This assumes that the entries are at least 6 characters long. If you want it to work for fewer than 6 characters, you would need to decide how to hide the middle.
The only real trick is knowing that the number of * you need is 6 less than the length of the string since you are taking 3 characters from the front and back.
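The same arithmetic in a short Python sketch, for anyone who wants to test it on sample names. The function name `mask_name` and the `keep` parameter are made up, and returning short names unchanged is only one possible choice for the case the formula does not cover.

```python
def mask_name(name, keep=3, mask_char="*"):
    """Show the first and last `keep` characters and mask the rest."""
    hidden = len(name) - 2 * keep  # same as LEN(A1)-6 when keep is 3
    if hidden <= 0:
        return name  # assumption: names too short to mask are left as-is
    return name[:keep] + mask_char * hidden + name[-keep:]


print(mask_name("David Goliath"))  # Dav*******ath
```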

Resources