Split prefix and surname (based on array data) - excel

I have a column of names, which may or may not contain a surname prefix. Those prefixes can consist of multiple words. I have a list of all possible prefixes, but now I need to split the prefix from the surname and put them in 2 separate columns.
I wrote an Excel formula like the following:
=IF(
RIGHT(A1;7) = " van de"
;
RIGHT(A1;6)
;
IF(
RIGHT(A1;4) = " van"
;
RIGHT(A1;3)
;
IF(
RIGHT(A1;3) = " de"
;
RIGHT(A1;2)
;
--Insert more nested If statements here--
)
)
)
Data of the surnames can look like the following:
Name1 van de
Name1 van
Name1
Name1 Name2 van
Name1-Name2 Name3 van de
Name1 Name2 Name3
What I want:
OriginalName | Name | Prefix
-----------------|--------|----------
a b | a | b
a b c | a | b c
First of all, this is a pretty inefficient method, but I automated the creation of this formula, so that wasn't a problem anymore. Now I have found out that there's a limit to the number of nested IF statements a formula can have, and I need more than that limit allows.
How should I solve this problem?
I have an array with the possible prefixes. Maybe this will help?

I made the assumption that you wanted to separate the "van" and "de" prefixes from the rest of the name. If I misunderstood, please provide more examples of your problem/question...
The following solution requires a helper column to determine where the "Prefix" starts, but you can hide it if necessary:
First, put the prefix values (van; de) in a range anywhere on the sheet (mine are in A8:A9) and name that range prefix so it can be referenced in the following formulas.
The formula in C1 is an array formula (use Ctrl+Shift+Enter):
=MIN(IF(ISNUMBER(SEARCH(prefix,A1)),SEARCH(prefix,A1)))
The formulas in D1 and E1 are normal formulas:
=IF(C1>0,LEFT(A1,C1-2),A1)
=IF(C1>0,MID(A1,C1,LEN(A1)),"")

Put your list of prefixes in order from the longest to the shortest. I put mine in E1:E3.
Then use this array formula:
=TRIM(IFERROR(SUBSTITUTE(A1,INDEX($E$1:$E$3,MATCH(TRUE,ISNUMBER(SEARCH($E$1:$E$3,A1)),0)),""),A1))
Then to get the prefix:
=IFERROR(INDEX($E$1:$E$3,MATCH(TRUE,ISNUMBER(SEARCH($E$1:$E$3,A1)),0)),"")
Being array formulas, they need to be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.

I know this has already been answered, but I did this yesterday and didn't have time to submit it (had to run for the bus).
All Prefix values have a space before them...
Column B formula (Array formula - Ctrl+Shift+Enter in formula bar)
=INDEX(SUBSTITUTE(A1,Prefix,""),MATCH(SMALL(LEN(SUBSTITUTE(A1,Prefix,"")),1),LEN(SUBSTITUTE(A1,Prefix,"")),0))
Column C formula - =TRIM(SUBSTITUTE(A1,B1,""))
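Outside Excel, the longest-prefix-first idea behind these answers can be sketched in a few lines of Python. This is only an illustration: the function name `split_prefix` and the sample prefix list are assumptions, and unlike Excel's SEARCH it matches case-sensitively and only at the end of the name (as the RIGHT-based formula in the question does).

```python
def split_prefix(full_name, prefixes):
    """Return (name, prefix); prefix is "" when no prefix matches."""
    # Try longer prefixes first so that "van de" wins over "de".
    for prefix in sorted(prefixes, key=len, reverse=True):
        suffix = " " + prefix
        if full_name.endswith(suffix):
            # Cut the prefix (and the space before it) off the end.
            return full_name[: -len(suffix)], prefix
    return full_name, ""


# Hypothetical prefix list, mirroring the examples in the question.
prefixes = ["van", "de", "van de"]
for name in ["Name1 van de", "Name1 van", "Name1", "Name1-Name2 Name3 van de"]:
    print(split_prefix(name, prefixes))
```

Because the list is sorted by length inside the function, the prefix list itself can be kept in any order.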

Related

Transpose with default headers in excel

I have some simple data:
Name Age
Venky 20
Anil 22
Output should be like :
Name : Venky
Age : 20
Name : Anil
Age : 22
Note: each record should have the header values.
I have tried multiple ways apart from macros.
Can you please give me your input?
Without more specific context from the question, we can combine BYROW, MAP, TOCOL, TEXTJOIN, TEXTSPLIT and SUBSTITUTE to get the expected result (some of them are relatively new Excel functions):
=LET(y,SUBSTITUTE(TEXTJOIN(";",,
BYROW(A2:C4, LAMBDA(x, TEXTJOIN(";",,MAP(TOCOL(A1:C1), TOCOL(x),
LAMBDA(a,b, a & ":"& b & ";")))))), ";;",";"),
TEXTSPLIT(LEFT(y, LEN(y) - 1),,";"))
The main idea is to create on the fly, for each row of input data (A1:B3 including the headers), a new array via MAP with the following structure (let's call it tempArray):
| Name:name1; |
|-------------|
| Age:age1; |
Note: It works for more than two columns in the header, if that is the case a row per header item will be created.
For example:
=MAP(TOCOL(A1:B1), TOCOL(A2:B2), LAMBDA(a,b, a & ":"& b & ";"))
For the first row of the input data will return:
| Name:Venky; |
| ------------|
| Age:20; |
Then convert tempArray into text via TEXTJOIN, so every element of BYROW (i.e. x) is converted to a string in which each item is delimited by ;. For the first row it will be:
Name:Venky;;Age:20;
SUBSTITUTE is used to remove the extra ; generated by TEXTJOIN. For example, for the first row it would be:
Name:Venky;Age:20;
After executing SUBSTITUTE the intermediate result would be:
Name:Venky;Age:20;Name:Anil;Age:22;
We use LEFT to remove the extra ; at the end. Now we have a string that has each row delimited by ;, so the string is ready to be converted back to an array using TEXTSPLIT.
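The join, substitute and split steps described above can be traced with a short Python sketch. It is not the Excel formula itself; the function name `unpivot_with_headers` is an assumption, and each line is annotated with the Excel function it is meant to mimic.

```python
def unpivot_with_headers(headers, rows):
    """Return one 'header:value' string per cell, record by record."""
    # MAP + inner TEXTJOIN: "Name:Venky;;Age:20;" for each row.
    per_row = [";".join(f"{h}:{v};" for h, v in zip(headers, row)) for row in rows]
    # Outer TEXTJOIN over the BYROW results.
    joined = ";".join(per_row)
    # SUBSTITUTE: collapse the doubled delimiters.
    cleaned = joined.replace(";;", ";")
    # LEFT drops the trailing ";", TEXTSPLIT breaks the string apart again.
    return cleaned[:-1].split(";")


headers = ["Name", "Age"]
rows = [["Venky", 20], ["Anil", 22]]
for line in unpivot_with_headers(headers, rows):
    print(line)
```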
Sample adding an additional record and Last Name as a new header item:

Rearrange excel table cells - reordering

I don't know excel very well and I am trying to take something like this (with a lot of entries):
Field ......Value ....... ID
A .......... blabla1 .......1
B ...........blabla2 .......1
C ...........blabla3 .......1
D ...........blabla4 .......1
A ...........blabla5 .......2
B ...........blabla6 .......2
C ...........blabla7 .......2
D ...........blabla8 .......2
and turn into something more readable like this:
ID -----A -------------B ---------------- C ---------------- D
1 ------blabla1 -----blabla2 -------- blabla3 --------blabla4
2 ------blabla5----- blabla6 -------- blabla7-------- blabla8
Does anyone know a good way to do that? Thank you
(sorry about the bad formatting)
The exact delimiter between each word is key if the text is not already split into separate cells.
Assuming there are numerous words in place of '.....', with each word separated by a single space (a different delimiter would be required if the blablas represented sentences containing one or more spaces), you could achieve the desired table representation as follows
(several functions in this solution require an Office 365 compatible version of Excel;
the lookup in step 3 does not require Office 365, but without it the IDs and Fields may need to be entered manually, or VB could be used):
Starting position (after removing blank rows):
Field Value ID
A blabla1 1
B blabla2 1
C blabla3 1
D blabla4 1
A blabla5 2
B blabla6 2
C blabla7 2
D blabla8 2
1) Split cells according to delimiter (skip this step if not relevant)
=TRANSPOSE(FILTERXML("<x><y>"&SUBSTITUTE(F3," ","</y><y>")&"</y></x>","//y"))
(replace the " " inside the substitute function with a different delimiter if required/desired)
2) Obtain unique IDs (rows) and Fields (columns)
=UNIQUE(K4:K11)
=TRANSPOSE(UNIQUE(I4:I11))
3) Index lookup for table content
=INDEX(J4:J11,MATCH(M4#&N3#,K4:K11&I4:I11,0),0)
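For comparison, steps 2 and 3 (collect the unique IDs and Fields, then look up each ID/Field pair) can be sketched in Python. This is a rough illustration only; the function name `pivot` and the tuple layout (field, value, id) are assumptions chosen to match the sample table.

```python
def pivot(records):
    """records: iterable of (field, value, id) tuples, in sheet order."""
    ids, fields, cells = [], [], {}
    for field, value, rec_id in records:
        # Like UNIQUE: remember each ID and Field the first time it appears.
        if rec_id not in ids:
            ids.append(rec_id)
        if field not in fields:
            fields.append(field)
        # Like the INDEX/MATCH lookup key (ID & Field).
        cells[(rec_id, field)] = value
    header = ["ID"] + fields
    body = [[rec_id] + [cells.get((rec_id, f), "") for f in fields] for rec_id in ids]
    return [header] + body


records = [
    ("A", "blabla1", 1), ("B", "blabla2", 1), ("C", "blabla3", 1), ("D", "blabla4", 1),
    ("A", "blabla5", 2), ("B", "blabla6", 2), ("C", "blabla7", 2), ("D", "blabla8", 2),
]
for row in pivot(records):
    print(row)
```

A missing ID/Field combination is left as an empty string here, where the Excel lookup would return an error.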

Excel adding more conditions in a formula

I believe the conditions will make the formula quite long, and I am not really good at writing long formulas.
There are 6 columns I've used, which are D, E, M, N, O, P.
Sample data:
D3=123456 (this value varies; it can also be 12345, 12345A or 123456A)
E3=1
M3=31
N3=_
O3=00
P3=0
The formula is designed around the column D field (this is the value that changes). Let's say
the length of D3 = 6; then (this is the current formula I've written):
=IF(LEN(D3)=6,CONCATENATE(M3,D3,N3,O3,E3),CONCATENATE(M3,D3,O3,E3))
The outcome for this will be 31123456_001. If, say, the D value is changed to 123456A (the else branch in the formula shown, which does not concatenate N3),
then the outcome will be 31123456A001.
I have added column P so that I can use it to concatenate to the format that I need.
There are a few more conditions I need to add,
which are:
1. If D3 = 12345, the outcome will be 31012345_001 (concatenate M3,P3,D3,N3,O3,E3)
2. If D3 = 12345A, the outcome will be 31012345A001 (concatenate M3,P3,D3,O3,E3)
3. In a D3 value such as 12345A, the letter A can be any letter A-Z.
This is the list of all conditions and outcomes that I require in one formula:
1. D3 = 123456 then the outcome will be 31123456_001
2. D3 = 123456A then outcome will be 31123456A001
3. D3 = 12345 then outcome will be 31012345_001
4. D3 = 12345A then outcome will be 31012345A001
Additional info:
These are just formats, as it can be any combination of digits; the last letter can be A-Z.
D3 = 123456
D3 = 123456A
D3 = 12345
D3 = 12345A
As I couldn't quite catch all the conditions and outcomes, here is an example of how your formula could look:
=IF(LEN(D3)=5,Outcome_1_Concatenation,IF(LEN(D3)=7,Outcome_2_Concatenation,IF(ISNUMBER(VALUE(RIGHT(D3,1))),Outcome_3_Concatenation,Outcome_4_Concatenation)))
Outcome_1_Concatenation => replace with formula when LEN = 5
Outcome_2_Concatenation => replace with formula when LEN = 7
Outcome_3_Concatenation => replace with formula when LEN = 6 and all are numbers
Outcome_4_Concatenation => replace with formula when LEN = 6 and last is character
If you give all examples in a condition => outcome list, I would be glad to help further.
I would look at creating a lookup table range with 3 options for lengths of 5,6,7.
I named my lookup table range "Length".
First setup this lookup table like this:
5 |
=CONCATENATE(M$3,P$3,D$3,IF(ISNUMBER(VALUE(RIGHT(D3,1))),N3,""),O$3,E$3)
6 |
=CONCATENATE(M$3,IF(ISNUMBER(VALUE(RIGHT(D$3,1))),"",P$3),D$3,IF(ISNUMBER(VALUE(RIGHT(D3,1))),N$3,""),O$3,E$3)
7 |
=CONCATENATE(M$3,D$3,IF(ISNUMBER(VALUE(RIGHT(D$3,1))),N$3,""),O$3,E$3)
For any D3 value, it is checking if that last character is a letter, and if not it will insert N3, otherwise it leaves it out.
Also, for any 6 character value, it checks if the last character is a letter, and if so, it will insert P3, otherwise it leaves it out.
Then, your output formula should be:
=VLOOKUP(LEN(D3),Length,2,FALSE)
This makes it clean and simple.
This is your formula plus the added conditions 1 and 2:
=IF(D3=12345,CONCATENATE(M3,P3,D3,N3,O3,E3),IF(D3="12345A",CONCATENATE(M3,P3,D3,O3,E3),IF(LEN(D3)=6,CONCATENATE(M3,D3,N3,O3,E3),CONCATENATE(M3,D3,O3,E3))))
If you want a more generalized version, you can check whether D3 is a number, its length, and whether it ends with a letter, and replace the nested IFs according to your needs.
I got my answer; it's
=IF(AND(LEN(D3)>=6,ISNUMBER(RIGHT(D3,1)*1)),M3&D3&N3&O3&E3,IF(AND(LEN(D3)<6,ISNUMBER(RIGHT(D3,1)*1)),M3&P3&D3&N3&O3&E3,IF(AND(LEN(D3)=6,ISTEXT(RIGHT(D3,1))),M3&P3&D3&O3&E3,M3&D3&O3&E3)))
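The four cases listed in the question can also be read as two independent decisions: pad with P3 when the numeric part has 5 digits, and insert N3 only when there is no trailing letter. The Python sketch below follows that reading, which is an interpretation of the examples rather than something stated in the thread; the function name `build_code` and its default arguments (taken from the sample cells) are assumptions, and the value in D is assumed to be non-empty.

```python
def build_code(d, e="1", m="31", n="_", o="00", p="0"):
    """Build the output string from the D value; defaults mirror the sample cells."""
    d = str(d)
    ends_with_letter = d[-1].isalpha()
    # Numeric part of D, without the optional trailing letter.
    digits = d[:-1] if ends_with_letter else d
    # P3 is only inserted when the numeric part is 5 digits long.
    pad = p if len(digits) == 5 else ""
    # N3 is only inserted when there is no trailing letter.
    sep = "" if ends_with_letter else n
    return m + pad + d + sep + o + e


for value in ["123456", "123456A", "12345", "12345A"]:
    print(value, "->", build_code(value))
```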

Compare multiple columns, pull out only cells that appear in every column

I have 10 or so columns in my worksheet. Each column contains about 200 names, and there is no other data on the sheet.
What I'd like to do is create a new column that only contains the names that are common to all the columns. So essentially compare each cell in each column to all the other cells in all the other columns, and only return the common cells.
For example:
Column1 : name_A, name_C, name_F
Column2: name_C, name_B, name_D
Column3: name_C, name_Z, name_X
So in this example, the new column would only contain name_C, because it's the only value common to all three columns.
Is there any way to do this? My knowledge of Excel is quite poor, and I can't find anything similar to my problem online so I would appreciate any help.
Thanks for reading,
N
Putting everything on a single spreadsheet and creating a pivot table is probably more efficient than the algorithm you have in mind.
Here is my mock-up. I added extra names to demonstrate better.
D(formula) has the easiest version. This will list only values that appear in all columns, but they will appear on the same rows as the corresponding name in column A, with blanks in between, and not sorted (giving D(result)).
If you would like all the names to appear at the top, as shown here in column E, you can either sort your table (you will have to re-sort if the columns change) or use my solution below:
Get yourself the MoreFunc add-on for Excel (here is the last working download link I found, and here is a good installation walk-through video).
Once all is done, select cells E1:E8, click the formula bar and type the following: =UNIQUEVALUES(IF(COUNTIF(A2:C9,A2:A9)=3,A2:A9,""))
Accept the formula by pressing Ctrl-Shift-Enter (this will create an array formula and curly braces will appear around your formula).
  | A      | B      | C      | D(formula)                         | D(result) | E(result - sorted)
--|--------|--------|--------|------------------------------------|-----------|-------------------
1 | name_A | name_C | name_C | =IF(COUNTIF($A$1:$C$8,A1)=3,A1,"") |           | name_m
2 | name_C | name_B | name_Z | =IF(COUNTIF($A$1:$C$8,A2)=3,A2,"") | name_C    | name_C
3 | name_F | name_D | name_X | =IF(COUNTIF($A$1:$C$8,A3)=3,A3,"") |           |
4 | name_t | name_o | name_g | =IF(COUNTIF($A$1:$C$8,A4)=3,A4,"") |           |
5 | name_y | name_p | name_h | =IF(COUNTIF($A$1:$C$8,A5)=3,A5,"") |           |
6 | name_u | name_k | name_7 | =IF(COUNTIF($A$1:$C$8,A6)=3,A6,"") |           |
7 | name_i | name_5 | name_9 | =IF(COUNTIF($A$1:$C$8,A7)=3,A7,"") |           |
8 | name_m | name_m | name_m | =IF(COUNTIF($A$1:$C$8,A8)=3,A8,"") | name_m    |
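The underlying operation is a set intersection across the columns, which a few lines of Python make explicit. This is only a sketch with an assumed function name (`common_names`). Unlike the COUNTIF test, which counts occurrences over the whole range and so assumes no name is repeated within a column, the intersection is not affected by duplicates inside a column.

```python
def common_names(columns):
    """Return the names present in every column, in first-column order."""
    common = set(columns[0])
    for column in columns[1:]:
        # Keep only the names that also occur in this column.
        common &= set(column)
    # dict.fromkeys removes duplicates while preserving the original order.
    return [name for name in dict.fromkeys(columns[0]) if name in common]


columns = [
    ["name_A", "name_C", "name_F"],
    ["name_C", "name_B", "name_D"],
    ["name_C", "name_Z", "name_X"],
]
print(common_names(columns))
```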

Destination, prefix lookup via Phone number - Excel

I have two tables.
Table one contains: phone number list
Table Two contains: prefix and destination list
I want to look up the prefix and destination for each phone number.
The raw data tables and the expected result table are given below.
Table 01 ( Phone Number List)
Phone Number
------------
12426454407
12865456546
12846546564
14415332165
14426546545
16496564654
16896546564
16643216564
Table 02 (Prefix and Destination List)
PREFIX |COUNTRY
-------+---------------------
1 |Canada_USA_Fixed
1242 |Bahamas
1246 |Barbados
1268 |Antigua
1284 |Tortola
1340 |Virgin Islands - US
1345 |Cayman Island
144153 |Bermuda-Mobile
1473 |Grenada
1649 |Turks and Caicos
1664 |Montserrat
Table 03 (Result)
Phone Number | PREFIX | COUNTRY
--------------+--------+-------------------
12426454407 | 1242 | Bahamas
12865456546 | 1 | Canada_USA_Fixed
12846546564 | 1284 | Tortola
14415332165 | 144153 | Bermuda-Mobile
14426546545 | 1 | Canada_USA_Fixed
16496564654 | 1649 | Turks and Caicos
16896546564 | 1 | Canada_USA_Fixed
16643216564 | 1664 | Montserrat
Let's assume the phone numbers are in column A; in column B you need to extract the prefix. Something like this:
=LEFT(A1, 4)
However your Canada_USA_Fixed creates problems as does the Antigua mobile. I'll let you solve this issue yourself. Start with IF statements.
Now that you have extracted the prefix you can easily use VLOOKUP() to get the country.
Assuming that the longest prefix is 6 digits long, you can add 6 columns (B:G) next to the column with the phone numbers in table 1 (I assume this is column A). In column B you'd show the first 6 characters using =LEFT(A2,6), in the next column you show 5 characters, etc.
Then you add another 6 columns (H:M) , each doing a =MATCH(B2,Table2!A:A,0) to see if this prefix is in the list of prefixes.
Now if any of the 6 potential prefixes match, you'll get the row number of the prefix; otherwise you'll get an #N/A error. Put the following formula in column N: {=INDEX(H2:M2,MATCH(FALSE,ISERROR(H2:M2),0))} and enter it as an array formula, i.e. instead of pressing Enter after entering it, press Ctrl-Shift-Enter. You'll then see these {} around the formula, so don't enter those manually!
Column N now contains the row of the matching prefix, or #N/A if no prefix matches. Therefore, put =IF(ISNA(N2),"No matching prefix",INDEX(Table2!B:B,N2)) in the next column and you'll be done.
You could also use the above approach with fewer columns but more complex formulas, but I wouldn't recommend it.
I'm also doing longest prefix matches and, like everyone else that Google has turned up, it's also for international phone number prefixes!
My solution is working for my table of 200 prefixes (including world zone 1, ie. having 1 for US/Canada and 1242 for Bahamas, etc).
Firstly you need this array formula (which I'm going to call "X" in the following but you'll want to type out in full)
(LEFT(ValueToFind,LEN(PrefixArray))=PrefixArray)*LEN(PrefixArray)
This uses the trick of multiplying a logical value by an integer so the result is zero if there's no match. You use this to find the maximum value in one cell (which I'm calling "MaxValue").
{=MAX(X)}
If MaxValue is more than zero (and therefore some sort of match was found), you can then find the position of the maximum value in your prefix array.
{=MATCH(MaxValue,X,0)}
I've not worried about duplicates here - you can check for them in your PrefixArray separately.
Notes for neophytes:
PrefixArray should be an absolute reference, either stated with lots of $ or as a "named range".
I'm assuming you'll make ValueToFind, MaxValue and the resultant index into PrefixArray as cells on the same row, and therefore have a $ against their column letter but not their row number. This allows easy pasting for lots of rows of ValueToFind.
Array formulas are indicated by curly braces, but are entered by typing the text without the curly braces and then pressing Ctrl-Shift-Enter.
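The longest-prefix match that these array formulas implement can be sketched in Python as follows. The function name `longest_prefix` is an assumption, and the dictionary holds only part of the prefix table from the question; phone numbers and prefixes are treated as text, as LEFT does in Excel.

```python
def longest_prefix(number, table):
    """Return (prefix, destination) for the longest matching prefix, or None."""
    best = ""
    for prefix in table:
        # Same idea as (LEFT(number, LEN(prefix)) = prefix) * LEN(prefix):
        # keep the match with the greatest length.
        if number.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return (best, table[best]) if best else None


# Part of the prefix table from the question.
table = {
    "1": "Canada_USA_Fixed",
    "1242": "Bahamas",
    "1284": "Tortola",
    "144153": "Bermuda-Mobile",
    "1649": "Turks and Caicos",
    "1664": "Montserrat",
}
for number in ["12426454407", "12865456546", "14415332165", "14426546545"]:
    print(number, longest_prefix(number, table))
```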
