Extract two numbers out of a string in Excel - excel

I have a string from which I need to extract two numbers and put them into two columns, like this:
ID:1234567 RXN:89012345
ID:12345 RXN:678901
Column 1 Column 2
1234567 89012345
12345 678901
The numbers can vary in length. I was able to get the column 2 number by using the following function:
=RIGHT(G3,FIND("RXN:",G3)-5)
However, I'm having a hard time getting the ID number separated.
Also, I need this to be a function as I will be using a macro to use over many spreadsheets.

One way to do this:
Select all your data, assuming each cell always holds one string with both the ID and RXN numbers. So if you have 100 rows of such data, select all of them.
Go to the Data tab and choose Text to Columns.
Choose Delimited >> Next >> check Space and, under Other, type a colon (:) >> Finish.
You will get "ID" in the first column, the ID number in the second, "RXN" in the third and the RXN number in the fourth, for every row.
Delete the unwanted columns.

With data in column A, in B1 enter:
=MID(A1,FIND("ID:",A1)+LEN("ID:"),FIND(" ",A1,FIND("ID:",A1)+LEN("ID:"))-FIND("ID:",A1)-LEN("ID:"))
and copy down. In C1 enter:
=MID(A1,FIND("RXN:",A1)+LEN("RXN:"),9999)
and copy down.
The column B formulas are a pretty standard way to capture a sub-string encapsulated by two other sub-strings.
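For checking the logic outside Excel, the same "sub-string between two markers" idea can be sketched in Python. This is only an illustration under assumptions: the helper name `between` is hypothetical, and the sample rows are the ones from the question.

```python
def between(text: str, start: str, end: str = " ") -> str:
    """Return the text after `start`, up to the next `end` or the end of the string."""
    begin = text.find(start)
    if begin == -1:
        return ""                      # marker not present
    begin += len(start)                # same role as FIND(...)+LEN(...) in the formula
    stop = text.find(end, begin)       # same role as the second FIND with a start position
    return text[begin:] if stop == -1 else text[begin:stop]


for row in ["ID:1234567 RXN:89012345", "ID:12345 RXN:678901"]:
    print(between(row, "ID:"), between(row, "RXN:"))
```

Like the column C formula, the RXN case simply runs to the end of the string because no closing space follows it.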

If your format is always as you show it, then:
B1: =TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1," ",REPT(" ",99)),":",REPT(" ",99)),99,99))
C1: =TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1," ",REPT(" ",99)),":",REPT(" ",99)),3*99,99))
We substitute a long string of spaces for the space and : in the original string. Then we extract the 2nd and 4th items and trim off the extra spaces.
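To see why the padding trick picks out the right pieces, here is a rough Python model of the two formulas. The function name `nth_token` is made up for this sketch, and it assumes, as the formulas do, that no single item is longer than the padding width.

```python
def nth_token(text: str, n: int, pad: int = 99) -> str:
    """Mimic TRIM(MID(SUBSTITUTE(SUBSTITUTE(text," ",REPT(" ",pad)),":",REPT(" ",pad)), n*pad, pad))."""
    # Replace every space and colon with a run of `pad` spaces.
    padded = text.replace(" ", " " * pad).replace(":", " " * pad)
    # Excel's MID is 1-based, Python slicing is 0-based, hence the -1.
    start = n * pad - 1
    return padded[start:start + pad].strip()


row = "ID:1234567 RXN:89012345"
print(nth_token(row, 1))  # window starting at position 99, as in B1
print(nth_token(row, 3))  # window starting at position 3*99, as in C1
```

Each 99-character window is wide enough to contain exactly one item plus surrounding padding, which the final trim removes.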

Related

Auto increment after concatenating 3 strings

Problem: I need a formula to automatically increment when dragging down. Since the 3 strings are joined by concatenating, it doesn't seem to work.
=CONCATENATE("Sheet2!", SUBSTITUTE(ADDRESS(1,MATCH("String to Search For", Sheet2!$13:$13,0),4),1,""),"17")
String 1 is a separate sheet reference (Sheet2!)
String 2 is the column number converted to column letter where string to be searched is found using MATCH, ADDRESS, and SUBSTITUTE. In this case it was column 2 converted to B.
String 3 is the row number that I need information from IF string searched is found
After concatenating these, I need to drag it down 5000 rows and increment String 3 (the row number) but because the reference is concatenated, it will not increment. I've tried everything! Please help!
Try adding a ROWS function, e.g. if you put the first formula in Z2 use this version copied down
=CONCATENATE("Sheet2!", SUBSTITUTE(ADDRESS(1,MATCH("String to Search For", Sheet2!$13:$13,0),4),1,""),"17"+ROWS(Z$2:Z2)-1)
Change the Z$2:Z2 reference to match your actual start cell.
The ROWS function, used this way, increments by 1 each row and is more robust than alternatives using ROW, for example because it still starts at 1 if rows are later inserted above the formula.
Assuming your formula is in row 2, you could do this:
=CONCATENATE("Sheet2!", SUBSTITUTE(ADDRESS(1,MATCH("String to Search For", Sheet2!$13:$13,0),4),1,""),text(row()+15,"#"))
If your formula starts on a different row, just change the 15 as needed.

Return Dates of Three Consecutive Values in a Row

I have a data file and I need to return the dates of when the value (MaxT) is greater than or equal to 30 (>=30) for 3 consecutive days.
Data File:
Date, MaxT
1872-03-01,31
1872-03-02,29
1872-03-03,37
1872-03-04,40
1872-03-05,22
1872-03-06,9
1872-03-07,28
1872-03-08,31
1872-03-09,35
1872-03-10,37
1872-03-11,44
1872-03-12,29
1872-03-13,35
1872-03-14,48
1872-03-15,33
1872-03-16,31
1872-03-17,38
1872-03-18,31
1872-03-19,42
1872-03-20,20
1872-03-21,24
1872-03-22,31
I have attempted to figure this out using the following code but, I do not think I'm even in the ballpark...
Attempted Code:
=SUMPRODUCT(--(FREQUENCY(IF(B2:B23>=30,ROW(B2:B23)),IF(B2:B23>=30,ROW(B2:B23)))=3))
I'm assuming that your data file consists of two columns, Date and MaxT. If they are delimited by commas, you need to split them into two separate columns using Text to Columns with a comma as the delimiter.
The Date should be in column A and MaxT in column B.
Enter the formula below in cell C2 and drag it down:
=IF(AND(B2>=30,B3>=30,B4>=30),"Consecutive Range","")
The start of each run of three consecutive values of 30 or more will be flagged in the output. You could then use a filter or some other Excel function such as INDEX/MATCH to get the corresponding dates. Hope this helps.
Alright, I got it to work, but I'm not entirely sure how you would make it work without separating the formula into multiple cells.
One potential solution would be to write some of the formulas into a sheet that's in the background, place the final part of the formula in the front sheet and have it reference the "hidden" bits of the formula.
First, I wrote the data in columns... "Date" in Column A, "MaxT" in Column B.
The first part of the formula is written in cell D2:
=IF(B2>=30,B2,"")
The next part of the formula is written in cell E2:
=COUNT(D2:D4)
The last part of the formula is written in cell F2:
=IF(E2=3,A2&","&A3&","&A4,"")
As a result, 7 cells in column F each contain three dates separated by commas.
Note that you can make any character or string of text separate the three displayed dates by replacing the commas that are in-between the ampersand, quote text:
(&","&) can become (&"anything you want"&)
From here, auto-fill the formulas to the relevant cells.
EDIT:
One way to shorten the code is to fold the COUNT formula into the last IF statement, like this (shown for row 2):
=IF(COUNT(D2:D4)=3,A2&","&A3&","&A4,"")
I do still think that the first IF statement will need to be separate from the rest of the formula, though.
EDIT #2
Here is the code in one single cell:
=IF(AND(B2>=30,B3>=30,B4>=30), A2&","&A3&","&A4,"")
Which will display three dates that are located within Column A, current row & the next two rows below it.
This code still produces 7 lines of results with the data that you've provided.
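As a cross-check of the count of 7, the same three-row window test can be sketched in Python over the sample data from the question. The function name `runs_of_three` is hypothetical; the threshold of 30 and the window of three rows come from the question.

```python
readings = [
    ("1872-03-01", 31), ("1872-03-02", 29), ("1872-03-03", 37), ("1872-03-04", 40),
    ("1872-03-05", 22), ("1872-03-06", 9), ("1872-03-07", 28), ("1872-03-08", 31),
    ("1872-03-09", 35), ("1872-03-10", 37), ("1872-03-11", 44), ("1872-03-12", 29),
    ("1872-03-13", 35), ("1872-03-14", 48), ("1872-03-15", 33), ("1872-03-16", 31),
    ("1872-03-17", 38), ("1872-03-18", 31), ("1872-03-19", 42), ("1872-03-20", 20),
    ("1872-03-21", 24), ("1872-03-22", 31),
]


def runs_of_three(rows, threshold=30):
    """Return 'date,date,date' for each row that starts three consecutive values >= threshold."""
    results = []
    for i in range(len(rows) - 2):
        window = rows[i:i + 3]                      # current row plus the next two
        if all(value >= threshold for _, value in window):
            results.append(",".join(date for date, _ in window))
    return results


hits = runs_of_three(readings)
print(len(hits))  # 7
for line in hits:
    print(line)
```

The windows overlap, just as they do when the formula is filled down, so a run of four qualifying days yields two result lines.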

EXCEL Formulas Sum Everything above specific row

I want to SUM everything above a cell that contains the word "SUMTOTAL". So if I have 50 columns, I want it to go to the first row that has the text "SUMTOTAL" in it and then sum everything above that word. Is it possible?
Use a MATCH formula to find the row and subtract one from it, then use an INDIRECT formula to build the address string and pass it to SUM, like this:
=SUM(INDIRECT("A1:A" & MATCH("SUMTOTAL",B:B,0)-1))
Assumption:
SUMTOTAL is in column B somewhere
The numbers you want to sum are in column A
Your data starts at row 1.
You are summing ONE column. To expand simply change "A1:A" to "A1:X" if you wanted to sum columns A to X
I assume that all your data is located in A1:N20, and SUMTOTAL appears somewhere inside this area (you can easily change the desired data location). The following formula does the summation of all numbers directly above SUMTOTAL, i.e., in the same column.
=SUM(OFFSET($A$1,0,SUMPRODUCT(COLUMN($A$1:$N$20)*($A$1:$N$20="SUMTOTAL"))-1,SUMPRODUCT(ROW($A$1:$N$20)*($A$1:$N$20="SUMTOTAL"))-1))
If you want to sum all numbers above SUMTOTAL, no matter if in the same column or not, use
=SUM(OFFSET($A$1,0,0,SUMPRODUCT(ROW($A$1:$N$20)*($A$1:$N$20="SUMTOTAL"))-1,COLUMNS($A$1:$N$20)))
Alternatively, to sum everything above the cell that holds the formula, in the same column:
=SUM(INDIRECT(ADDRESS(1,COLUMN())&":"&ADDRESS(ROW()-1,COLUMN())))
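The "find the marker row, then sum everything above it" idea from the MATCH/INDIRECT answer can be sketched in Python as follows. The function name and the sample lists are invented for illustration; the sketch assumes one label column and one value column, as in that answer.

```python
def sum_above_marker(labels, values, marker="SUMTOTAL"):
    """Sum the numeric entries of `values` above the first row whose label equals `marker`."""
    row = labels.index(marker)        # 0-based position; raises ValueError if the marker is absent
    # Like SUM, skip anything that is not a number.
    return sum(v for v in values[:row] if isinstance(v, (int, float)))


labels = ["", "", "", "SUMTOTAL", ""]   # stands in for column B
values = [10, 20, 5, 99, 7]             # stands in for column A
print(sum_above_marker(labels, values))  # 35
```

One difference to keep in mind: this comparison is case-sensitive, whereas MATCH with an exact match type is not.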

vlookup with multiple columns

I have the following formula in my B:B column
=VLOOKUP(A1;'mySheet'!$A:$B;2;FALSE)
It does output in B:B the values found in the mySheet!B:B where A:A = mySheet!A:A. It works fine. Now, I would like to also get the third column. It works if I add the following formula to the whole C:C column:
=VLOOKUP(A1;'mySheet'!$A:$C;3;FALSE)
However, I'm working with more than 100k lines and about 40 columns. I don't want to do 100k * 40 * VLOOKUP, I would like to only do it 100k and not have to multiply this by all the columns. Is there a way (with array-formulas maybe) to just do the VLOOKUP once per line to get all the columns I need?
data example
ID|Name
-------
1|AB
2|CB
3|DF
4|EF
ID|Column 1|Column 2
--------------------
1|somedata|whatever1
4|somedate|whatever2
3|somedaty|whatever3
I would like to get:
ID|Name|Column 1|Column 2
-------------------------
1|AB |somedata|whatever1
2|CB | |
3|DF |somedaty|whatever3
4|EF |somedate|whatever2
INDEX works faster than VLOOKUP, so I would recommend using that. It'll reduce the strain that many VLOOKUPs would put on your system.
First find the row that contains what you need in a helper column with MATCH:
=MATCH(A1,'mySheet'!$A:$A,0)
Then an INDEX using that number, that you can drag across and populate all your columns with:
=INDEX('mySheet'!B:B,$B1)
Your output would be akin to:
ID|Name|Match |Column 1 |Column 2
-------------------------
1|AB |Match1|IndexCol1|IndexCol2
2|CD |Match2|IndexCol1|IndexCol2
3|EF |Match3|IndexCol1|IndexCol2
Also! I'd recommend setting these ranges to actually cover the data, rather than referencing the whole column, for additional speed gains. Anchor the rows with $ so the range does not shift when the formula is copied down, e.g.:
=INDEX('mySheet'!B$1:B$100000,$B1)
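The reason the helper column pays off is that the expensive search happens once per row and every extra column is then a cheap positional read. A minimal Python sketch of that idea, using the sample data from the question (all variable and function names here are illustrative, not part of any Excel API):

```python
source_ids = [1, 4, 3]
source_columns = {
    "Column 1": ["somedata", "somedate", "somedaty"],
    "Column 2": ["whatever1", "whatever2", "whatever3"],
}

# The MATCH step: find each key's position once.
position = {key: i for i, key in enumerate(source_ids)}


def lookup_row(key):
    """Return every source column for `key`, or blanks if the key is missing."""
    i = position.get(key)
    if i is None:
        return ["" for _ in source_columns]
    # The INDEX step: reuse the same position for each column.
    return [column[i] for column in source_columns.values()]


for id_, name in [(1, "AB"), (2, "CB"), (3, "DF"), (4, "EF")]:
    print(id_, name, *lookup_row(id_))
```

With 40 columns, this is still one search per row followed by 40 direct reads, which is the saving the MATCH helper column is aiming for.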
I was thinking more about your problem, and if you have control over the data you're looking up, I have another suggestion you could try.
In 'mySheet', where the raw data is kept, add a new column that concatenates each column into one cell, with some sort of unique divider that does not occur in your data:
=B1&"+"&C1&"+"&D1&"+"&E1 etc...
Then you could do one VLOOKUP or INDEX/MATCH for each row, instead of 40.
Once you have it in your new sheet, you could split the results back out.
Splitting without formulas
Copy/Paste the results of the lookup formulas as Values in the next column.
Select that column, and in the Data tab on your ribbon, select Text to Columns.
Leave it on Delimited, hit Next. Uncheck Tab, check Other, and input your delimiter (+ in my example).
Click Finish.
Splitting with formulas
Use =FIND() to locate each delimiter, and =MID() to pull out the text between each pair of delimiters, using the position of the previous delimiter to work out the Start_num.
Definitely the more complex of the two methods.
If I'm understanding correctly, one thing I would do to start is use =VLOOKUP($A1;'mySheet'!$A:LastColumn;COLUMN(B1);FALSE), where LastColumn stands for the last column of your source data and the lookup value is anchored as $A1. This way the column index moves as you drag your VLOOKUP to the right.
A formula only produces output in the cell that contains it, so there is no way to apply a formula to one column only and get results in the others.
The other feasible way is to put the formula in one cell, use $ signs intelligently, and drag it across all cells in a jiffy, without having to type the VLOOKUP 40 times.
VLOOKUP takes four arguments:
1- Lookup value. Use $A1 (put the $ on the A and not on the 1).
2- Source data. Put $ signs everywhere.
3- Column index number. Add an empty row just above your data, in the first row, and put 1 in A1, 2 in B1, 3 in C1 and so on. In the formula, instead of typing 2 or 3 manually, reference these cells. Put the $ on the row number and not on the column (B$1).
4- Range lookup. Type FALSE or 0.
Then drag this across everywhere.
Lookup value. Use $A1 (put the $ on the A and not on the 1).
Source data. Put $ signs everywhere.
Column index number. Just use the COLUMN function on the column the data needs to be pulled from (e.g. COLUMN(B1) if the lookup value is in column A and you want the value from column B).
Range lookup. Type FALSE or 0.

count all rows in excel that contain some string subset

I'm not even sure if this is possible in Excel.
I want to go through all the rows in a column and count those whose string contains "the".
I know I can do this:
=count(B0:B99)
Is there a way to do the count over B0 to BN?
I found (~df.col3.str.contains('u|z')).sum() while trying to make this work, so I know it's possible.
Now, can I count only the rows in B0 to BN that contain the string "the"? Let's assume all the rows contain strings.
My backup plan is exporting the data and writing a Ruby script to do it, but I feel like I should be able to do this in Excel.
Note: everything is in column B.
=SUM(IF(IFERROR(SEARCH("the ",B:B),0)<>0,1,0))
Enter this as an array formula by pressing Ctrl+Shift+Enter in the formula bar.
Result should look like this:
{=SUM(IF(IFERROR(SEARCH("the ",B:B),0)<>0,1,0))}
This will count all cells in your column that contain the string "the":
=COUNTIF(B1:B99,"*the*")
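For comparison with the pandas and Ruby ideas mentioned in the question, here is a small Python sketch of what the wildcard COUNTIF does: a case-insensitive "contains" test on each cell. The function name and sample cells are made up for illustration.

```python
def count_containing(cells, needle="the"):
    """Count cells whose text contains `needle`, ignoring case (like COUNTIF with "*the*")."""
    needle = needle.lower()
    return sum(1 for cell in cells if needle in str(cell).lower())


cells = ["the cat", "Another", "dog", "THE END", ""]
print(count_containing(cells))          # 3
print(count_containing(cells, "the "))  # 2
```

Note that "Another" is counted because the match is on the substring, not the whole word; searching for "the " with a trailing space, as the array formula above does, is one rough way to narrow that down.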
Add an extra column (D) with a formula such as =IFERROR(SEARCH("the",C1),0) in each row, where column C contains the text you want to search. Then add a summary formula, =SUMIF(D1:D100,">0",B1:B100), in a single cell, where column B contains the numbers you want to sum.
