Destination, prefix lookup via Phone number - Excel - excel

I have two tables.
Table one contains: phone number list
Table Two contains: prefix and destination list
I want to look up the prefix and destination for each phone number.
The raw data tables and the expected result table are given below.
Table 01 ( Phone Number List)
Phone Number
------------
12426454407
12865456546
12846546564
14415332165
14426546545
16496564654
16896546564
16643216564
Table 02 (Prefix and Destination List)
PREFIX |COUNTRY
-------+---------------------
1 |Canada_USA_Fixed
1242 |Bahamas
1246 |Barbados
1268 |Antigua
1284 |Tortola
1340 |Virgin Islands - US
1345 |Cayman Island
144153 |Bermuda-Mobile
1473 |Grenada
1649 |Turks and Caicos
1664 |Montserrat
Table 03 (Result)
Phone Number | PREFIX | COUNTRY
--------------+--------+-------------------
12426454407 | 1242 | Bahamas
12865456546 | 1 | Canada_USA_Fixed
12846546564 | 1284 | Tortola
14415332165 | 144153 | Bermuda-Mobile
14426546545 | 1 | Canada_USA_Fixed
16496564654 | 1649 | Turks and Caicos
16896546564 | 1 | Canada_USA_Fixed
16643216564 | 1664 | Montserrat

Let's assume the phone numbers are in column A. In column B you need to extract the prefix. Something like this:
=LEFT(A1, 4)
However, your Canada_USA_Fixed prefix (1 digit) creates problems, as does Bermuda-Mobile (6 digits), because neither is 4 digits long. I'll let you solve this issue yourself. Start with IF statements.
Now that you have extracted the prefix you can easily use VLOOKUP() to get the country.

Assuming that the longest prefix is 6 digits long, you can add 6 columns (B:G) next to the column with the phone numbers in table 1 (I assume this is column A). In column B you'd show the first 6 characters using =LEFT(A2,6), in the next column you show 5 chars, etc.
Then you add another 6 columns (H:M) , each doing a =MATCH(B2,Table2!A:A,0) to see if this prefix is in the list of prefixes.
Now if any of the 6 potential prefixes match, you'll get the row number of the prefix - else you'll get an #N/A error. Put the following formula in column N: {=INDEX(H2:M2,MATCH(FALSE,ISERROR(H2:M2),0))} - enter the formula as an array formula, i.e. instead of pressing Enter after entering it, press Ctrl-Shift-Enter - you'll see these {} around the formula then, so don't enter those manually!
Column N now contains the row of the longest matching prefix, or #N/A if no prefix matches. Therefore, put =IF(ISNA(N2),"No matching prefix",INDEX(Table2!B:B,N2)) in the next column and you'll be done.
You could also implement the above approach with fewer columns but more complex formulas, but I wouldn't recommend it.
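To sanity-check the helper-column logic outside Excel, here is a minimal Python sketch of the same idea: try the 6-, 5-, ... 1-character prefixes in turn and keep the first one found. The function name lookup and the PREFIXES dictionary are made up for illustration, and only part of Table 02 is copied in.

```python
# Hypothetical subset of Table 02 (prefix -> country), stored as text keys.
PREFIXES = {
    "1": "Canada_USA_Fixed",
    "1242": "Bahamas",
    "1284": "Tortola",
    "144153": "Bermuda-Mobile",
    "1649": "Turks and Caicos",
    "1664": "Montserrat",
}


def lookup(number, prefixes=PREFIXES, max_len=6):
    """Return (prefix, country) for the longest prefix that matches number."""
    # Mirrors columns B:G (LEFT with 6..1 characters) and H:M (MATCH per column):
    # the first candidate that exists in the table is the longest match.
    for length in range(max_len, 0, -1):
        candidate = number[:length]
        if candidate in prefixes:
            return candidate, prefixes[candidate]
    return None, "No matching prefix"


print(lookup("12426454407"))
print(lookup("12865456546"))
```

Note that the keys are strings, just as LEFT() returns text in Excel. If the prefix column in Table 02 holds numbers rather than text, the MATCH in columns H:M would need the same type on both sides.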

I'm also doing longest prefix matches and, like everyone else that Google has turned up, it's also for international phone number prefixes!
My solution is working for my table of 200 prefixes (including world zone 1, i.e. having 1 for US/Canada and 1242 for Bahamas, etc.).
Firstly you need this array formula (which I'm going to call "X" in the following but you'll want to type out in full)
(LEFT(ValueToFind,LEN(PrefixArray))=PrefixArray)*LEN(PrefixArray)
This uses the trick of multiplying a logical value by an integer, so the result is zero if there's no match. You use this to find the maximum value in one cell (which I'm calling "MaxValue").
{=MAX(X)}
If MaxValue is more than zero (and therefore some sort of match was found), you can then find the position of the maximum value in your prefix array.
{=MATCH(MaxValue,X,0)}
I've not worried about duplicates here - you can check for them in your PrefixArray separately.
Notes for neophytes:
PrefixArray should be an absolute reference, either stated with lots of $ or as a "named range".
I'm assuming you'll make ValueToFind, MaxValue and the resultant index into PrefixArray as cells on the same row, and therefore have a $ against their column letter but not their row number. This allows easy pasting for lots of rows of ValueToFind.
Array formulas are indicated by curly braces, but are entered by typing the text without the curly braces and then hitting Ctrl-Shift-Enter.
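If it helps to see what the array formula computes, here is a rough Python equivalent of "X", MAX(X) and MATCH(MaxValue,X,0). The function name longest_prefix_index is invented for this sketch, and the returned position is 1-based to mirror MATCH.

```python
def longest_prefix_index(value_to_find, prefix_array):
    """Return the 1-based position of the longest matching prefix, or None."""
    # X: for each prefix, (LEFT(value, LEN(prefix)) = prefix) * LEN(prefix).
    # True * n is n and False * n is 0, the same trick as in the formula.
    x = [(value_to_find[:len(p)] == p) * len(p) for p in prefix_array]
    max_value = max(x, default=0)          # {=MAX(X)}
    if max_value == 0:                     # no prefix matched at all
        return None
    return x.index(max_value) + 1          # {=MATCH(MaxValue, X, 0)}


prefixes = ["1", "1242", "1246", "144153"]
print(longest_prefix_index("12426454407", prefixes))
```

As in the spreadsheet version, a duplicated prefix is not detected here: the first occurrence simply wins.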

Related

Is there a way to find any one of a set of characters using an excel formula

I have data that uses a range, or a less than symbol to denote 'between 0 and number'. But multiple characters are used for the same purpose.
It looks like below (first two columns), plus a column showing the results I want:
Country | Average hotdog consumption | Desired output
--------------------+----------------------------+---------------
Madeupaland | 10-200 | 105
Exampledesh | 50—1000 | 525
Republic of Notreal | <1000 | 500
Inventia | ≤5000 | 2500
Plus many rows where the data in the second column is purely numerical and doesn't need finessing into a number
I can use this formula to calculate the midpoint where there is a range:
=IFERROR(AVERAGE(LEFT(C2,FIND("–",C2)-1),RIGHT(C2, LEN(C2)-FIND("–",C2))), A2)
But this only covers one kind of dash (- and not —). Similarly, if I want to halve the numbers in rows with < and ≤, I'd need to replicate a formula there.
Is there a way of finding any one of several different characters from a set? My understanding is that FIND looks for the whole string of characters. SUBSTITUTE is a workaround, but I'd have to substitute every different value in the 'character set'.
In regex this would just be [-—].
I'm using Excel 2013 if that matters
It's not a perfect solution but you can try the following. This replaces those patterns of text with replacements representing which formula to use:
Create a Reference Table (I have made this in I1:K5)
|Pattern |Pattern Name |Substitution Rule |
|------- |------------ |----------------- |
|— |double dash |/2+0.5* |
|- |dash |/2+0.5* |
|< |lt |0.5* |
|≤ |lte |0.5* |
In your third column enter the following array formula (Using Ctrl + Shift + Enter to confirm)
=IF(ISNUMBER(B2),B2,"'="&SUBSTITUTE(B2,INDEX($I$2:$I$5,MIN(IF(ISNUMBER(FIND($I$2:$I$5,B2)),ROW($I$2:$I$5)-1,99))),INDEX($K$2:$K$5,MIN(IF(ISNUMBER(FIND($I$2:$I$5,B2)),ROW($I$2:$I$5),99)-1))))
Copy your third column and paste values into a fourth column
Replace all the 's with nothing using Ctrl + H, so that the expressions are evaluated
My Result:
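Country | Average hotdog consumption | Desired output | Formula Paste | Output after replacing 's
--------------------+----------------------------+----------------+-----------------+--------------------------
Madeupaland | 10-200 | 105 | '=10/2+0.5*200 | 105
Exampledesh | 50—1000 | 525 | '=50/2+0.5*1000 | 525
Republic of Notreal | <1000 | 500 | '=0.5*1000 | 500
Inventia | ≤5000 | 2500 | '=0.5*5000 | 2500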
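For comparison, this is roughly what the regex idea from the question looks like outside Excel. It is only a sketch in Python: the function name midpoint is invented, and it assumes a cell holds either a plain number, a "low-high" range with a hyphen, en dash or em dash, or a value prefixed with < or ≤.

```python
import re


def midpoint(cell):
    """Return the midpoint of a range, half of an upper bound, or the number itself."""
    text = str(cell).strip()
    # Character class covering hyphen, en dash and em dash.
    parts = re.split(r"[-–—]", text)
    if len(parts) == 2 and all(parts):
        low, high = (float(p) for p in parts)
        return (low + high) / 2
    # "<1000" and "≤5000" mean "between 0 and the number".
    if text and text[0] in "<≤":
        return float(text[1:]) / 2
    return float(text)


for sample in ["10-200", "50—1000", "<1000", "≤5000", "42"]:
    print(sample, midpoint(sample))
```

The range branch computes (low + high) / 2, which is the same value as the low/2 + 0.5*high expression produced by the substitution rules in the reference table.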

how to vlookup if prefix found in the list?

Hi.
How can I return the "company name" (column H) in column B if any of the "Prefix" values (column G) is found in the "con no" (column A)?
A sample of the outcome I need is shown in column B.
Sample:
620011113 = DD
CN1234 = BB
thanks
=INDEX($H:$H,AGGREGATE(15,6,ROW($G$1:$G$7)/(--(FIND($G$1:$G$7,$A2)=1)*--(LEN($G$1:$G$7)>0)),1),1)
Breaking this down, the INDEX retrieves the Nth item from Column H (Company name). To find the value of N, we are using the AGGREGATE function.
AGGREGATE is a weird function - it lets us use things like MAX or LARGE or SUM while ignoring any error values. In this case, we will be using it for SMALL (first argument, 15), while Ignoring Error Values (second argument, 6). We will want the very smallest value, so the fourth argument will be 1. (If we wanted the second smallest, it would be 2, and so on)
=INDEX($H:$H,AGGREGATE(15,6, <SOMETHING> ,1),1)
So, all we need now is a list of values to compare! To make things slightly simpler, I'll break that bit of the code out for you here:
ROW($G$1:$G$7) / (--(FIND($G$1:$G$7,$A2)=1) * --(LEN($G$1:$G$7)>0))
There are 3 parts to this. The first, ROW($G$1:$G$7) is the actual value we want to retrieve - these will be the Row Numbers for each Prefix that matches your value. On its own, however, it will be all the row numbers. Since we are skipping errors, we want any Rows that don't match the prefix to throw an error. The easiest way to do this is to Divide by Zero.
At the start of --(FIND($G$1:$G$7,$A2)=1) and --(LEN($G$1:$G$7)>0) we have a double-negative. This is a quick way to convert True and False to 1 and 0. Only when both tests are True will we not divide by 0, as this table shows:
A | B | A*B
1 | 1 | 1
1 | 0 | 0
0 | 1 | 0
0 | 0 | 0
Starting with the second test first (it's easier), we have LEN($G$1:$G$7)>0 - basically "don't look at blank cells".
The other test (FIND($G$1:$G$7,$A2)=1) will search for the Prefix in the Con No, and return where it is found (or a #VALUE! error if it isn't). We then check "is this at position 1" - in other words, "Is this at the start of the Con No, rather than in the middle". We don't want to say Con No CNQ6060 is part of Company AA instead of Company BB by mistake!
So, if the Prefix is at the Start of the Con No, AND it isn't Blank (because there is an infinite amount of Nothing Before, After, and Between every number and letter), then we get it added to our list of Rows. We then take the smallest row (i.e. closest to the top - change AGGREGATE(15 to AGGREGATE(14 if you want the closest to the bottom!), and use that to get the Company Name.
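The same selection logic can be sketched in Python to check it against a few values. The prefix and company lists below are made-up stand-ins for columns G and H (the real table is not shown in the question), and company_for is a name invented for this sketch.

```python
# Hypothetical stand-ins for columns G (Prefix) and H (Company name).
PREFIXES = ["6060", "CN", "", "6200"]
COMPANIES = ["AA", "BB", "", "DD"]


def company_for(con_no, prefixes=PREFIXES, companies=COMPANIES):
    """Return the company of the top-most prefix that starts con_no, else None."""
    # Keep only rows whose prefix is non-blank (LEN > 0) and sits at
    # position 1 of con_no (FIND(...) = 1); other rows are skipped, just
    # as AGGREGATE skips the divide-by-zero errors.
    rows = [i for i, p in enumerate(prefixes) if p and con_no.startswith(p)]
    if not rows:
        return None
    return companies[min(rows)]   # SMALL(..., 1): the row closest to the top


print(company_for("CNQ6060"))
print(company_for("620011113"))
```

Like FIND, startswith is case-sensitive, so "cn1234" would not match the prefix "CN" in this sketch.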
You could try the formula below:
=VLOOKUP(IF(LEFT(A3,1)="6",LEFT(A3,4),IF(LEFT(A3,1)="C",LEFT(A3,2),IF(LEFT(A3,1)="E",LEFT(A3,7)))),$G$3:$H$7,2,0)
Keep in mind that you have to put ' before the cell values in columns A and G to convert them to text, so that VLOOKUP returns the correct results.

Pulling out specific characters that are combined in a single column

I have a data extract that resulted in fields being combined into column A like this:
Sales Figures Report pg 121
Walmart Inc. 001230134 99 Associates Parkway 56.12 20.00 10.00 86.12 00 1
1400.25 262.40 14.50 1677.15 02 9
50.00 100.25 10.00 160.25 00 1
1400.25 262.40 14.50 1677.15 02 9
There are over 50,000 rows in this sort of format, some are a little different as they'll start with vendor information and then have those values after (still all in col. A). In the above example 1677.15 is the combined value of the three numbers before that, i.e. the total amount due.
Originally I wanted to basically separate out each value using things like left(), mid(), right() etc. however at this point all I want is the total figure, i.e. the $1677.15 in the above example. What is the best non-vba method of doing this?
Two issues:
The amounts are not always the same number of digits (they can range from $xx.xx to $xxx,xxx.xx).
Since there are multiple "." characters, you can't use search() to find the correct character location.
In your example, the Total figure is always the third group from the end. That being the case, you can use the following formula and defined names:
=IFERROR(--INDEX(TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",99)),seq_99,99)),LEN(A1)-LEN(SUBSTITUTE(A1," ",""))-1),"")
Defined Names
seq refers to: =ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,255,1))
seq_99 refers to: =IF(seq=1,1,(seq-1)*99)
In the formula,
SUBSTITUTE: Replace each space with a large number (99) of spaces
MID: Return an array consisting of each word (space separated)
TRIM: Trim the words to remove the extra spaces
INDEX: Return the item that is third from the end
IFERROR: Handle lines such as your first, which does not contain the relevant pattern
seq and seq_99 return the arrays {1,2,3,...} and {1,99,198,...} to be used in the formula
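As a cross-check of what the formula is meant to return, here is a short Python sketch of the same "third group from the end" rule. The function name total_from_line is invented, and stripping thousands separators before converting is an assumption about how amounts like xxx,xxx.xx appear in the extract.

```python
def total_from_line(line):
    """Return the third space-separated item from the end as a number, else ''."""
    words = line.split()          # note: collapses repeated spaces
    if len(words) < 3:
        return ""
    try:
        # Assumed: amounts may carry thousands separators such as 1,677.15.
        return float(words[-3].replace(",", ""))
    except ValueError:
        # Same role as IFERROR: header lines do not end in the amount pattern.
        return ""


lines = [
    "Sales Figures Report pg 121",
    "Walmart Inc. 001230134 99 Associates Parkway 56.12 20.00 10.00 86.12 00 1",
    "1400.25 262.40 14.50 1677.15 02 9",
]
for line in lines:
    print(repr(total_from_line(line)))
```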

Rearrange excel table cells - reordering

I don't know excel very well and I am trying to take something like this (with a lot of entries):
Field ......Value ....... ID
A .......... blabla1 .......1
B ...........blabla2 .......1
C ...........blabla3 .......1
D ...........blabla4 .......1
A ...........blabla5 .......2
B ...........blabla6 .......2
C ...........blabla7 .......2
D ...........blabla8 .......2
and turn into something more readable like this:
ID -----A -------------B ---------------- C ---------------- D
1 ------blabla1 -----blabla2 -------- blabla3 --------blabla4
2 ------blabla5----- blabla6 -------- blabla7-------- blabla8
Does anyone know a good way to do that? Thank you
(sorry about the bad formatting)
The exact delimiter between each word is key if the text is not already split into separate cells.
Assuming there are numerous words in place of '.....', with each word separated by a single space (a different delimiter would be required if the blablas represented sentences containing one or more spaces), you could achieve the desired table representation as follows
(several functions in this solution require an Office 365 compatible version of Excel;
the lookup in step 3 does not require Office 365, but then the IDs and Fields may need to be entered manually, or VBA could be used):
Starting position (after removing blank rows):
Field Value ID
A blabla1 1
B blabla2 1
C blabla3 1
D blabla4 1
A blabla5 2
B blabla6 2
C blabla7 2
D blabla8 2
1) Split cells according to delimiter (skip this step if not relevant)
=TRANSPOSE(FILTERXML("<x><y>"&SUBSTITUTE(F3," ","</y><y>")&"</y></x>","//y"))
(replace the " " inside the substitute function with a different delimiter if required/desired)
2) Obtain unique IDs (rows) and Fields (columns)
=UNIQUE(K4:K11)
=TRANSPOSE(UNIQUE(I4:I11))
3) Index lookup for table content
=INDEX(J4:J11,MATCH(M4#&N3#,K4:K11&I4:I11,0),0)
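To make the three steps concrete, here is a small Python sketch of the same reshaping: collect the unique IDs and Fields in order of first appearance, then look up the value for each ID/Field pair. The function name pivot is invented, and the rows are the sample data from the question.

```python
ROWS = [
    ("A", "blabla1", 1), ("B", "blabla2", 1), ("C", "blabla3", 1), ("D", "blabla4", 1),
    ("A", "blabla5", 2), ("B", "blabla6", 2), ("C", "blabla7", 2), ("D", "blabla8", 2),
]


def pivot(rows):
    """Return (fields, {id: [value per field]}) from (field, value, id) rows."""
    ids, fields, cells = [], [], {}
    for field, value, id_ in rows:
        if id_ not in ids:            # like UNIQUE on the ID column
            ids.append(id_)
        if field not in fields:       # like TRANSPOSE(UNIQUE(...)) on the Field column
            fields.append(field)
        # Keep the first value for an ID/Field pair, as MATCH(..., 0) would.
        cells.setdefault((id_, field), value)
    return fields, {i: [cells.get((i, f), "") for f in fields] for i in ids}


fields, table = pivot(ROWS)
print("ID", *fields)
for id_, values in table.items():
    print(id_, *values)
```

A missing ID/Field combination shows up as an empty string here, whereas the INDEX/MATCH formula would return #N/A for that cell.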

Extracting a substring from a string of arbitrary length

I have just a hair over 30,000 tweets. I have one column that has the actual tweet. There are two things that I would like to accomplish with this column.
First here is a snippet of sample data:
RT #Just_Sports: Cool page for fans of early pro #baseball. https://t.co/QCMYFQNSq8 #mlb #vintage #Chicago #Detroit #Boston #Brooklyn #Phil…
#brettjuliano you already know #unity #newengland #hiphop #boston #watertown #network
I have a column that uses the following formula to see if the message starts out with RT meaning a re-tweet. It returns 1 for yes and 0 for no.
What I would like to accomplish is to create a formula in two columns. One that will get the username if the RT column has a value of 1 and in the second column the username if the RT column has a value of 0. Since usernames are of arbitrary length I am unsure of how to go about this.
Example
RT #Just_Sports: | 1 | #Just_Sports | 0
#brettjuliano | 0 | | #brettjuliano
Take a look at Excel's FIND function. You can use this to identify the position of the #, then using a specified delimiter, match the end of the user name:
=MID(A1, FIND("#",A1), FIND(":",A1,FIND("#",A1)) - FIND("#",A1))
Where A1 is the cell containing the tweet, and ":" is your delimiter.
You can use the same function to check for the existence of the "RT" identifier. FIND returns a #VALUE! error (not 0) when the text is missing, so wrap it in ISNUMBER:
=ISNUMBER(FIND("RT",A1))
This returns TRUE if "RT" is found and FALSE otherwise. You may want to consider a search for " RT " (spaces), or some other variation, since there is no standard for using this in a tweet:
=OR(ISNUMBER(FIND("RT",A1)),ISNUMBER(FIND(" RT",A1)),ISNUMBER(FIND("RT ",A1)),ISNUMBER(FIND(" RT ",A1)))
But beware of false positives: ART, START, ARTOO, etc...
Additionally, your "RT" may be lower/upper/mixed case, in which case you'll want to normalize that search:
=OR(ISNUMBER(FIND("RT",UPPER(A1))),ISNUMBER(FIND(" RT",UPPER(A1))),ISNUMBER(FIND("RT ",UPPER(A1))),ISNUMBER(FIND(" RT ",UPPER(A1))))
My OR check is different from the 0/1 check you say you already have, so you can just add IF to that to convert to the 0/1 as needed:
=IF(OR(ISNUMBER(FIND("RT",A1)),ISNUMBER(FIND(" RT",A1)),ISNUMBER(FIND("RT ",A1)),ISNUMBER(FIND(" RT ",A1))),1,0)
Once you know you have the RT check correct, and your second column is filled properly, you can add to my original formula:
Case for 1 in 2nd column:
=IF(B1=1,MID(A1, FIND("#",A1), FIND(":",A1,FIND("#",A1)) - FIND("#",A1)),"")
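Case for 0 in 2nd column (a non-retweet such as the second sample has no ":" after the username, so a space is used as the delimiter here):
=IF(B1=0,MID(A1, FIND("#",A1), FIND(" ",A1&" ",FIND("#",A1)) - FIND("#",A1)),"")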
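Here is a rough Python sketch of the whole split, in case it is useful for checking the expected output on a handful of tweets. The function name split_tweet is invented, and it assumes a retweet is a message that starts with "RT " and that a username ends at the first ":" or space.

```python
def split_tweet(tweet):
    """Return (rt_flag, retweeted_user, own_user) for one tweet."""
    is_rt = tweet.startswith("RT ")      # assumed definition of a re-tweet
    start = tweet.find("#")              # like FIND("#", A1), but -1 if missing
    if start == -1:
        return int(is_rt), "", ""
    # Scan to the delimiter: ":" after a retweeted name, a space otherwise.
    end = start
    while end < len(tweet) and tweet[end] not in ": ":
        end += 1
    user = tweet[start:end]
    return (1, user, "") if is_rt else (0, "", user)


print(split_tweet("RT #Just_Sports: Cool page for fans of early pro #baseball."))
print(split_tweet("#brettjuliano you already know #unity #newengland"))
```

Checking only the start of the message avoids the ART/START false positives mentioned above, at the cost of missing an "RT" that appears later in the text.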

Resources