Splitting concatenated entries in a single column into separate rows - excel-formula

Can anyone help me with this in Excel? I need a formula to extract the correct data from this single column:
03127|16|D02|2|0025003128|1|D02|1|00008
03128|2|D02|1|00107
03131|3|D02|2|0020703132|1|D02|1|00000
03132|9|D02|2|0022803132|10|D02|2|00232
03131|3|D02|2|0020703132|10|D02|1|00000
03132|11|D02|2|0022803132|10|D02|2|00235
Screenshot:
I want a result like this:
03127|16|D02|2|00250
03128|1|D02|1|00008
03128|2|D02|1|00107
03131|3|D02|2|00207
03132|1|D02|1|00000
03132|9|D02|2|00228
03132|10|D02|2|00232
03131|3|D02|2|00207
03132|10|D02|1|00000
03132|11|D02|2|00228
03132|10|D02|2|00235

Ignoring the screenshot, which does not seem to match the other examples, and assuming the first entry is in A1, then for the longer strings this formula:
=IF(LEN(A1)<25,A1,LEFT(A1,FIND("|",A1,FIND("|",A1,FIND("|",A1,FIND("|",A1)+1)+1)+1)+5)&"#"&MID(A1,FIND("|",A1,FIND("|",A1,FIND("|",A1,FIND("|",A1)+1)+1)+1)+6,LEN(A1)))
should insert a hash # five characters after the fourth pipe | and simply copy the shorter strings unchanged. Note that it inserts only one hash, so it handles at most two entries per cell, as in the sample data. Copy the results into Word as text only, select all, then use INSERT > Tables > Table > Convert Text to Table..., with Separate text at Other: # and Number of columns: 1, and copy the resulting table back into Excel.
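Outside Excel, the same split can be sketched in Python. This is only a minimal sketch, assuming every record has exactly five pipe-delimited fields and that the last field is always five characters wide, as in the sample data. The function name split_records and its parameters are illustrative and not part of the answer above. Unlike the formula, it loops, so it would also cope with more than two records in one cell.

```python
def split_records(cell, fields=5, last_width=5):
    """Split a cell holding one or more concatenated pipe-delimited records."""
    records = []
    rest = cell
    while rest:
        # Find the pipe that precedes the last field of the next record.
        pos = -1
        for _ in range(fields - 1):
            pos = rest.find("|", pos + 1)
            if pos == -1:
                break
        if pos == -1:
            # Malformed tail: keep it rather than silently dropping it.
            records.append(rest)
            break
        end = pos + 1 + last_width
        records.append(rest[:end])
        rest = rest[end:]
    return records


column = [
    "03127|16|D02|2|0025003128|1|D02|1|00008",
    "03128|2|D02|1|00107",
]
rows = [record for cell in column for record in split_records(cell)]
```

Each element of rows is then one record, ready to be written back as a separate row.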

Related

Find common text within a range of cells (range containing blanks as well)

This is the problem I am facing with an Excel formula.
In column F, I want to find the common text across A2 to E2 (which contains blanks).
My question:
Is there a simple way to get the result without VB?
Any help is appreciated, thanks.
I found that Google Sheets has some really useful functions for this.
If you put the formula =SPLIT(A1, ",", TRUE,FALSE) in the cell after your row of common text (or probably even on a different sheet; I haven't tried that, though it should work), the result fills that cell and the next x cells, where x is the number of "," in A1, because "," is the delimiter.
Then put the formula =IF(SUM(ARRAYFORMULA(if(REGEXMATCH($A$1:$D$1,F1),1,0)))=COUNTA($A$1:$D$1),F1,"") into an equal number of cells after that (it is probably simplest to fill the maximum number you might need), and =CONCATENATE(I1:L1) into the last cell.
To adapt this for yourself, here is how it works. ARRAYFORMULA lets you pass an array where a function inside it would normally take a single cell. I have read that it behaves like a for loop, but I can't really vouch for that. Here it lets REGEXMATCH (a Boolean check of whether the cell you give it matches the given regular expression) test each cell in the array.
The SUM adds up the matches, and the IF compares that total with COUNTA to check whether the number of cells containing the string equals the number of non-empty cells.
The CONCATENATE at the end joins all the cells containing the IF formula, and since the only non-empty cells are the ones holding a common string, that is what this cell returns (with no spaces).
If you specifically need this in Excel, this won't help.
We can use Power Query to achieve the desired result:
Unpivot the columns in Power Query.
Split all the columns by the comma delimiter.
Create a custom column to check whether the first column's records exist in the remaining columns.
Use the function Text.Contains.
Sample function: =Text.Contains([column.1],[column.1]&[column.2]&[column.3])
If the above function returns TRUE, take the first column's result (this is the expected result) and load the data back into Excel.
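Both answers follow the same idea: split each cell on commas, then keep the items that appear in every non-blank cell. A minimal Python sketch of that logic is below. The sample cell values are made up, since the original data was only posted as an image, and the function name common_items is illustrative. It compares whole items rather than substrings, which is stricter than the REGEXMATCH approach.

```python
def common_items(cells, sep=","):
    """Return the items present in every non-blank cell, in first-cell order."""
    # Blank cells (None, empty, or whitespace only) are skipped entirely.
    item_lists = [
        [part.strip() for part in cell.split(sep) if part.strip()]
        for cell in cells
        if cell and cell.strip()
    ]
    if not item_lists:
        return []
    first, *others = item_lists
    other_sets = [set(items) for items in others]
    result = []
    for item in first:
        # Keep an item once, and only if every other non-blank cell has it.
        if item not in result and all(item in s for s in other_sets):
            result.append(item)
    return result


# Hypothetical row A2:E2 with two blank cells.
row = ["red,green,blue", "", "green,blue,black", None, "blue,green"]
shared = common_items(row)
```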

Determine sequence of most used words in Excel data set

I need to rank the most frequently occurring words in an Excel data set, e.g.:
A B C D
Joyce Bremner Lewis Chapman Claire Harper
Lesley Brown Brian Clough Natalie Hassan
Emma Cartwright Janet Coldwell Gillian Hedley
Lewis Chapman Sheena Doig Lesley Brown
Brian Clough Karen German Emma Cartwright
Janet Coldwell James Gledhill Lewis Chapman
Sheena Doig Maggie Gowan Brian Clough
Which name occurs most often, which is second, and so on?
I have found a solution for ranking the most frequent words when only one column is considered, but I am struggling to extend it to multiple columns. Below are the formulas used:
Enter this array formula in C2:
=IFERROR(INDEX(A2:A10,MODE(MATCH(A2:A10,A2:A10,0)+{0,0})),"")
Enter this array formula in C3 and copy down until you get blanks:
=IFERROR(INDEX(A$2:A$10,MODE(IF(COUNTIF(C$2:C2,A$2:A$10)=0,MATCH(A$2:A$10,A$2:A$10,0)+{0,0}))),"")
Just in case a single column and a pivot table are feasible:
Put all the names in one column.
Format as a table (my preference).
Select "Summarize with PivotTable".
Add the "Name" field to both Rows and Values.
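For comparison, the "stack everything into one column, then count" idea can be sketched in Python with collections.Counter. The rows below are the sample data from the question, assuming each name is two words and each row holds three names; the function name rank_names is illustrative.

```python
from collections import Counter

# Sample data from the question, one inner list per sheet row.
rows = [
    ["Joyce Bremner", "Lewis Chapman", "Claire Harper"],
    ["Lesley Brown", "Brian Clough", "Natalie Hassan"],
    ["Emma Cartwright", "Janet Coldwell", "Gillian Hedley"],
    ["Lewis Chapman", "Sheena Doig", "Lesley Brown"],
    ["Brian Clough", "Karen German", "Emma Cartwright"],
    ["Janet Coldwell", "James Gledhill", "Lewis Chapman"],
    ["Sheena Doig", "Maggie Gowan", "Brian Clough"],
]


def rank_names(rows):
    """Count every non-empty cell across all columns, most frequent first."""
    counts = Counter(name for row in rows for name in row if name)
    # Ties keep the order in which the names were first seen.
    return counts.most_common()


ranked = rank_names(rows)
```

ranked is a list of (name, count) pairs, so the first entry is the most frequent name, the second entry the runner-up, and so on.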

Extract two numbers out of a string in Excel

I have a string from which I need two numbers extracted and separated into two columns, like this:
ID:1234567 RXN:89012345
ID:12345 RXN:678901
Column 1 Column 2
1234567 89012345
12345 678901
The numbers can have a varying number of digits. I was able to get the column 2 number by using the following function:
=RIGHT(G3,FIND("RXN:",G3)-5)
However, I'm having a hard time separating out the ID number.
Also, I need this to be a function, as I will be using a macro to apply it across many spreadsheets.
One way to do this:
Select all your data, assuming each cell always holds one string with both the ID and RXN numbers. So if you have 100 rows of such data, select all of them.
Go to the Data tab and choose Text to Columns.
Choose Delimited >> Next >> tick Space, and in Other type a colon (:) >> Finish.
You will get "ID" in the first column, the ID number in the second, "RXN" in the third and the RXN number in the fourth, for every row.
Delete the unwanted columns.
With data in column A, in B1 enter:
=MID(A1,FIND("ID:",A1)+LEN("ID:"),FIND(" ",A1,FIND("ID:",A1)+LEN("ID:"))-FIND("ID:",A1)-LEN("ID:"))
and copy down. In C1 enter:
=MID(A1,FIND("RXN:",A1)+LEN("RXN:"),9999)
and copy down.
The column B formula is a pretty standard way to capture a sub-string enclosed by two other sub-strings.
If your format is always as you show it, then:
B1: =TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1," ",REPT(" ",99)),":",REPT(" ",99)),99,99))
C1: =TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1," ",REPT(" ",99)),":",REPT(" ",99)),3*99,99))
We substitute a long string of spaces for the space and : in the original string. Then we extract the 2nd and 4th items and trim off the extra spaces.
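The same extraction can be sketched in Python with a regular expression. This assumes the format is always "ID:" followed by digits, whitespace, then "RXN:" followed by digits, as in the examples; the function name extract_ids is illustrative.

```python
import re

# One or more digits after each label; the labels are taken from the examples.
PATTERN = re.compile(r"ID:(\d+)\s+RXN:(\d+)")


def extract_ids(text):
    """Return (id_number, rxn_number) as strings, or None if the format differs."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


pairs = [extract_ids(s) for s in ["ID:1234567 RXN:89012345", "ID:12345 RXN:678901"]]
```

The numbers are kept as strings so that any leading zeros survive.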

vlookup with multiple columns

I have the following formula in my B:B column
=VLOOKUP(A1;'mySheet'!$A:$B;2;FALSE)
It outputs in B:B the values found in mySheet!B:B where A:A = mySheet!A:A. It works fine. Now I would also like to get the third column. It works if I add the following formula to the whole C:C column:
=VLOOKUP(A1;'mySheet'!$A:$C;3;FALSE)
However, I'm working with more than 100k rows and about 40 columns. I don't want to run 100k * 40 VLOOKUPs; I would like to do only 100k lookups rather than multiply that by the number of columns. Is there a way (with array formulas maybe) to do the VLOOKUP just once per row and get all the columns I need?
data example
ID|Name
-------
1|AB
2|CB
3|DF
4|EF
ID|Column 1|Column 2
--------------------
1|somedata|whatever1
4|somedate|whatever2
3|somedaty|whatever3
I would like to get:
ID|Name|Column 1|Column 2
-------------------------
1|AB |somedata|whatever1
2|CB | |
3|DF |somedaty|whatever3
4|EF |somedate|whatever2
INDEX works faster than VLOOKUP, so I would recommend using it. It will reduce the strain that so many VLOOKUPs would put on your system.
First find the row that contains what you need in a helper column with MATCH:
=MATCH(A1,'mySheet'!$A:$A,0)
Then an INDEX using that number, that you can drag across and populate all your columns with:
=INDEX('mySheet'!B:B,$B1)
Your output would be akin to:
ID|Name|Match |Column 1 |Column 2
-------------------------
1|AB |Match1|IndexCol1|IndexCol2
2|CD |Match2|IndexCol1|IndexCol2
3|EF |Match3|IndexCol1|IndexCol2
Also, I'd recommend setting these ranges to cover only the actual data, rather than referencing the whole column, for additional speed gains, e.g.:
=INDEX('mySheet'!B1:B100000,$B1)
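The MATCH-once, INDEX-many pattern corresponds to building a key-to-row index a single time and then reading every column from the matched row. A minimal Python sketch using the sample tables from the question follows; the function names build_row_index and join_tables are illustrative. Missing IDs are filled with empty strings here, whereas the Excel formulas would return #N/A.

```python
def build_row_index(table, key_col=0):
    """Map each key to the position of its first row, like MATCH(..., 0)."""
    index = {}
    for pos, row in enumerate(table):
        index.setdefault(row[key_col], pos)  # first match wins
    return index


def join_tables(left, right, right_width):
    """Append the right table's non-key columns to each left row."""
    index = build_row_index(right)  # one pass over the lookup table
    joined = []
    for row in left:
        pos = index.get(row[0])  # one lookup per row, not one per column
        if pos is None:
            extra = [""] * right_width
        else:
            extra = list(right[pos][1:])
        joined.append(list(row) + extra)
    return joined


names = [[1, "AB"], [2, "CB"], [3, "DF"], [4, "EF"]]
details = [
    [1, "somedata", "whatever1"],
    [4, "somedate", "whatever2"],
    [3, "somedaty", "whatever3"],
]
result = join_tables(names, details, right_width=2)
```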
I was thinking more about your problem, and if you have control over the data you're looking up, I have another suggestion you could try.
In 'mySheet', where the raw data is kept, add a new column that concatenates all the columns into one cell, separated by some unique divider that does not occur in your data:
=B1&"+"&C1&"+"&D1&"+"&E1 etc...
Then you could do one VLOOKUP or INDEX/MATCH for each row, instead of 40.
Once you have it in your new sheet, you could split the results back out.
Splitting without formulas
Copy/Paste the results of the lookup formulas as Values in the next column.
Select that column, and in the Data tab on your ribbon, select Text to Columns.
Leave it on Delimited and hit Next. Uncheck Tab, check Other, and enter your delimiter (+ in my example).
Click Finish.
Splitting with formulas
Use =FIND() to locate each delimiter, and =MID() to pull out the text between each pair of delimiters, using the position of the previous delimiter as the Start_num.
Definitely the more complex of the two methods.
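The FIND/MID approach amounts to walking the string from one delimiter to the next. A small Python sketch of that loop is below, assuming "+" as the delimiter as in the example above; the function name split_by_find is illustrative. In practice str.split does the same job, and the loop is spelled out only to mirror the formula logic.

```python
def split_by_find(text, delimiter="+"):
    """Split text by repeatedly locating the next delimiter (FIND) and slicing (MID)."""
    parts = []
    start = 0
    while True:
        pos = text.find(delimiter, start)  # like FIND with a Start_num
        if pos == -1:
            parts.append(text[start:])  # everything after the last delimiter
            return parts
        parts.append(text[start:pos])  # like MID between two delimiters
        start = pos + len(delimiter)


fields = split_by_find("somedata+whatever1+more")
```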
If I'm understanding correctly, one thing I would do to start is use =VLOOKUP(A1;'mySheet'!$A:LastColumn;COLUMN(B1);FALSE). This way the column index will move as you drag your VLOOKUP to the right.
No formula, no output. So there is no way to apply a formula to one column only and get results in the others.
The other feasible way is to put the formula in one cell, use $ signs intelligently, and drag it across all the cells in a jiffy, without having to write the VLOOKUP 40 times.
VLOOKUP takes 4 arguments:
1- Lookup value. Use $A1 (put the $ on the A and not on the 1).
2- Source data. Put $ signs everywhere.
3- Column index number. Just above your data, add an empty first row. Put the value 1 in A1, 2 in B1, 3 in C1 and so on. Now in the formula, instead of manually typing "2" or "3", reference these cells. Put the $ on the row number and not on the column (B$1).
4- Type FALSE or 0.
Then drag this across everywhere.
Alternatively, without the helper row:
1- Lookup value. Use $A1 (put the $ on the A and not on the 1).
2- Source data. Put $ signs everywhere.
3- Column index number. Use COLUMN() on the column the data needs to be pulled from (e.g. COLUMN(B1) if the lookup value is in column A and you want the value from column B).
4- Type FALSE or 0.

If third number is 1 then insert text X

I have varying entries (SF123456, SF142365, ...). Each digit of an entry corresponds to a specific code. For each digit of each entry I need to enter the corresponding code in a separate cell (download the example sheet here: www.nivpat.com/Example.zip). How can I create an automatic function, as I have thousands of entries to divide into codes? Thanks!
Alright. This is what I did to solve it:
Remove the '=' sign in your match table so that VLOOKUP can work on it;
Add the position of the digit you want to look up in row 9, right above the headings. You might want to hide this row for a cleaner presentation;
I used the following formula in the cells to extract the values:
=VLOOKUP(VALUE(MID($A11, B$9, 1)), $A$2:$B$7, 2, 0)
The VLOOKUP does the lookup on your table in A2:B7. MID() extracts exactly one character, starting at the position specified in B9 (in this case 3). And VALUE() converts the text string to a number so that it can be matched against the table above.
The only thing left to do is drag your formulas across, and it works!
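The same digit-by-digit lookup can be sketched in Python. The code table below is entirely hypothetical, because the real one lives in the example sheet; the function name codes_for is illustrative. Positions are 1-based to match MID(), and a digit missing from the table gives an empty string here, where VLOOKUP would return #N/A.

```python
# Hypothetical digit-to-code table standing in for A2:B7 of the example sheet.
CODES = {1: "X", 2: "Y", 3: "Z", 4: "P", 5: "Q", 6: "R"}


def codes_for(entry, positions, table):
    """Look up the code for the digit at each 1-based position of entry."""
    result = []
    for pos in positions:
        digit = int(entry[pos - 1])  # MID(entry, pos, 1) wrapped in VALUE()
        result.append(table.get(digit, ""))
    return result


# Digits sit at positions 3 to 8 because of the two-letter "SF" prefix.
row = codes_for("SF142365", range(3, 9), CODES)
```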
