Creating a space in between values in Excel spreadsheet - excel

There are two types of values in cell A of my spreadsheet
Value type 1: Have a space in between the postcode district and sub postcode, as the postcode district is less than ten (i.e. MK1-MK9)
MK1 1AS
Value type 2: Have no space in between, as the postcode district is ten or greater (i.e. MK10-MK46)
MK170DB
What would be the best way of splitting the second group of values into something like this:
MK17 0DB
I was thinking of some pseudo code in the vein of:
if the 4th character (counting from the right) in MK170DB is not a space
then insert a space before the last three characters, leaving it like this: MK17 0DB
if not, then presume that the 4th character is a space (i.e. MK1 1AS) and leave it
As I just need to run this operation once to cleanse my data, I was thinking about creating a formula in column B that references column A and does the necessary cleansing. I would then replace the values in col A with what I have in col B.
Can anyone tell me whether the logic I have proposed can be executed in Excel or if there is a better way of doing it?
Thanks.

With your data in A1, in B1 enter:
=IF(ISERROR(FIND(" ",A1)),LEFT(A1,4) & " " & MID(A1,5,999),A1)

Something I came up with, which appears to work...
=IF(MID(A1,4, 1)=" ",A1,REPLACE(A1,5,0," "))
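For anyone who wants to check the rule outside Excel, here is a minimal Python sketch of the asker's pseudo code (look at the 4th character from the right). The function name add_postcode_space is made up for illustration, and it assumes the part after the space is always three characters long, as in the examples above.

```python
def add_postcode_space(value: str) -> str:
    """Insert a space before the last three characters when it is missing."""
    # Same test as the pseudo code: is the 4th character from the right a space?
    if len(value) >= 4 and value[-4] != " ":
        return value[:-3] + " " + value[-3:]
    # Already spaced (e.g. "MK1 1AS") or too short to split: leave untouched.
    return value


print(add_postcode_space("MK170DB"))  # MK17 0DB
print(add_postcode_space("MK1 1AS"))  # MK1 1AS
```

Unlike the two formulas above, which count from the left and so rely on the district part being exactly four characters, this version counts from the right.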

Related

Generate a unique ID (As much As possible) from a string in Excel using string functions

Let's say I have two strings in two cells
Cell A1 = Customer Country
Cell B1 = Customer City
I need to generate a unique ID using the Excel string functions (LEN, LEFT, MID, RIGHT etc.) or any other (CONCAT etc.) along with the ROW function.
Get first letter & last letter of each word, remove spaces and dashes, get the row number and return a unique string.
If I use
=IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=0,LEFT(A$1,1),IF(LEN(A$1)-LEN(SUBSTITUTE(A$1," ",""))=1,LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1),LEFT(A$1,1)&MID(A$1,FIND(" ",A$1)+1,1)&MID(A$1,FIND(" ",A$1,FIND(" ",A$1)+1)+1,1))) &ROW(A$1)
I get results as CC1 in both cases. How would I get a unique ID in such a case?
The idea in the comment section by @JosWoolley is a good one, though be careful how and where you add a column index. If you just append the column index number, you create confusion between, say, CC111 from row 11, column 1 and the same string from row 1, column 11. Using the actual cell address instead of these indices helps, but can still cause confusion if you don't add a delimiter first. Therefore I'd suggest something along the lines of:
Formula in D1:
=CONCAT(LEFT(TEXTSPLIT(A1," ")),"|",ADDRESS(ROW(A1),COLUMN(A1),4))
Note: If you don't yet have access to TEXTSPLIT() you can swap this with FILTERXML(). Also, you mentioned CONCAT(), but in Excel 2019 you may need to confirm the formula as an array formula with Ctrl+Shift+Enter (CSE).
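As a rough illustration of what the formula in D1 produces, here is a Python sketch of the same idea: first letter of each word, a "|" delimiter, then the cell address. The names column_letters and make_id are hypothetical, and the row and column are passed in by hand because there is no worksheet here.

```python
def column_letters(col: int) -> str:
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def make_id(text: str, row: int, col: int) -> str:
    """First letter of each word, then '|', then the relative cell address."""
    initials = "".join(word[0] for word in text.split(" ") if word)
    return f"{initials}|{column_letters(col)}{row}"


print(make_id("Customer Country", 1, 1))  # CC|A1
print(make_id("Customer City", 1, 2))     # CC|B1
```

The delimiter is what keeps the initials from running into the address, which is the point made above about CC111.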

How to detect a specific string text in a cell and remove it in excel?

I have a list of locations in the format A2: [ City State ], where the City or State can be made up of more than two words. All of this is in a single cell. I have the total data in thousands. Now, I also have a list of State with me. What I need to do is remove the State from A2 and just have the remaining string there. Thus leaving me with only City. I need help with this as the data is in millions and I have a list of around 30K cities.
You can create 2 cells next to the cell that has the data, one for City and one for State. Enter the formulas and autofill down.
Formula to extract City
=left(A1,find(" ",A1,1)-1)
Formula to extract State
=right(A1,len(A1)-find(" ",A1,1))
These formulas take the text to the left of the first space and the text to the right of the first space. You will have a problem if the city has a space e.g. "New York" but it is very difficult to work around that.
The other way to do this is to loop through all the cells using VBA code, run the same logic on each cell, delete the State from that cell and place the State in a different cell. It's a more complicated option.
To solve the problem of names with 2 words
I'm assuming that the States are US States and so I've created a string of all the US states that contain 2 words
[New Hampshire][New Jersey][New Mexico][New York][North Carolina][North Dakota][Rhode Island][South Carolina][South Dakota][West Virginia]
If you paste these states into cell B1 then you can use the formula below in cell B3 to check the contents of A3 and extract the city.
=IF(ISNUMBER(FIND("[" & RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ",""))))) & "]",$B$1,1)),LEFT(A3,FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ",""))))-1),LEFT(A3,FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-LEN(SUBSTITUTE(A3," ",""))))-1))
Then paste the following formula into cell C3 to check the contents of A3 and extract the State.
=IF(ISNUMBER(FIND("[" & RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ",""))))) & "]",$B$1,1)),RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ",""))))),RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-LEN(SUBSTITUTE(A3," ",""))))))
Formulas to deal with States with 3 words
The formula is getting really long but all it's doing is checking for a match with a split at 3 spaces from the end and then 2 spaces and if a state is not found splitting at 1 space.
Extract City
=IF(ISNUMBER(FIND("["&RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-2-LEN(SUBSTITUTE(A3," ","")))))&"]",$B$1,1)),LEFT(A3,FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-2-LEN(SUBSTITUTE(A3," ",""))))-1),IF(ISNUMBER(FIND("["&RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ","")))))&"]",$B$1,1)),LEFT(A3,FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ",""))))-1),LEFT(A3,FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-LEN(SUBSTITUTE(A3," ",""))))-1)))
Extract State
=IF(ISNUMBER(FIND("["&RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-2-LEN(SUBSTITUTE(A3," ","")))))&"]",$B$1,1)),RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-2-LEN(SUBSTITUTE(A3," ",""))))),IF(ISNUMBER(FIND("["&RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ","")))))&"]",$B$1,1)),RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-1-LEN(SUBSTITUTE(A3," ",""))))),RIGHT(A3,LEN(A3)-FIND("☃",SUBSTITUTE(A3," ","☃",LEN(A3)-LEN(SUBSTITUTE(A3," ","")))))))
How it works
The formula finds the second-to-last space by using an instance number one less than that of the last space (thanks to user m4573r for the last space formula). It then takes everything to the right of that second-to-last space, wraps it in "[]" brackets and checks it against the State list text in B1. If it exists there, the formula uses the second-to-last space as the point at which to cut the text in two and return either the City from the left or the State from the right. Otherwise it cuts at the last space.
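The same "try the longest State name first, then fall back to the last space" logic can be sketched in Python, which may be easier to read than the nested IFs. The function name split_city_state and the max_words parameter are assumptions for illustration; the state list is the two-word list given above.

```python
TWO_WORD_STATES = {
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Rhode Island",
    "South Carolina", "South Dakota", "West Virginia",
}


def split_city_state(location, multiword_states=TWO_WORD_STATES, max_words=3):
    """Return (city, state), preferring the longest known multi-word State."""
    words = location.split(" ")
    # Check a 3-word tail, then a 2-word tail, against the known State names.
    for n in range(max_words, 1, -1):
        if len(words) > n and " ".join(words[-n:]) in multiword_states:
            return " ".join(words[:-n]), " ".join(words[-n:])
    # No multi-word State matched: split at the last space.
    return " ".join(words[:-1]), words[-1]


print(split_city_state("Albany New York"))    # ('Albany', 'New York')
print(split_city_state("Austin Texas"))       # ('Austin', 'Texas')
print(split_city_state("New York New York"))  # ('New York', 'New York')
```

Because the match is made from the right, a multi-word City such as "New York" in front of the State is left intact.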

Find and remove unique values within a phrase across multiple cells

Have hit a complete roadblock on this one - I am trying to remove unique values within a Product Name column by comparing multiple cells. The end result is to create a logical 'category' in column B for each row, something like this:
Category Result Example:
Where I am coming unstuck is that there may be duplicates at various points in the string that clash, e.g. "Shirt Blue" in A2 & "Shirt Blue" in A4.
What might save me is the first 2 words, or roughly 10 characters, of 'like' products are always identical across cells so in essence I am trying to find a formula that will check if the first 10 characters are identical and then remove any remaining unique values from all cells in the range
This will not really do everything you need, but if you just want to pick up the first 2 words, you can find everything before the second space using this formula:
=LEFT(A1,FIND(" ", A1,FIND(" ", A1)+1)-1)
To increase the number of words to look for, you just need to add more FINDs in, like so:
=LEFT(A1,FIND(" ", A1, FIND(" ", A1,FIND(" ", A1)+1)+1)-1)
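The nested FIND calls amount to a loop that steps from one space to the next. A small Python sketch of that idea follows; first_words is a made-up name, and returning the whole text when there are too few spaces is my own choice (the Excel formula would return #VALUE! in that case).

```python
def first_words(text: str, n: int = 2) -> str:
    """Return everything before the n-th space, like the nested FIND formula."""
    pos = -1
    for _ in range(n):
        # Look for the next space after the previous one.
        pos = text.find(" ", pos + 1)
        if pos == -1:
            # Fewer than n spaces: return the text unchanged.
            return text
    return text[:pos]


print(first_words("Shirt Blue Large"))        # Shirt Blue
print(first_words("Shirt Blue Large XL", 3))  # Shirt Blue Large
```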

Create a table splitting comma and finding unique elements

I have the following data
Person  Week1
P1      L,L
P2      M,H
Output I would like is
Person  Week1
        L  M  H
P1      2  0  0
P2      0  1  1
My intention is to create a chart based on the output so I can figure out how many codes a person got per week. Pivot tables do not seem to work for this case.
Thanks
This is a pure formula approach.
It's based on two basic formulas. The first counts the number of times a string A occurs within a string B. This is done by counting the number of characters in string B, then counting the number of characters left in string B after every occurrence of string A has been replaced by nothing (""). The difference is the number of characters removed, so if string A is more than 1 character long you need to divide the result by the length of string A. That gives us this formula:
=(LEN(STRING B)-LEN(SUBSTITUTE(STRING B, STRING A, "")))/LEN(STRING A)
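To see the length-difference trick on its own, here is a minimal Python sketch. The function name count_occurrences is hypothetical; str.replace plays the role of SUBSTITUTE and is likewise case-sensitive.

```python
def count_occurrences(string_b: str, string_a: str) -> int:
    """Count how often string_a occurs in string_b via the length difference."""
    removed = len(string_b) - len(string_b.replace(string_a, ""))
    # Each occurrence removes len(string_a) characters, so divide by that length.
    return removed // len(string_a)


print(count_occurrences("L,L", "L"))       # 2
print(count_occurrences("M,H", "L"))       # 0
print(count_occurrences("AB,AB,C", "AB"))  # 2
```

Note that this counts substrings, so a code that is contained in another code (say "L" inside "XL") would be counted as well.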
Now we know how to count the number of times L, M or H occur, as they are string A. Next we need to determine string B.
If we look at the first table, it has nice row headers and column headers. We could take a shortcut and just assume everything is in order; however, I am going to go with the more generic approach in case the headers happen to be in a random order.
Basically we need to find out which column in the first table matches the header in our second table, i.e. is Week2 really the second column? Is P1 still the first row? In order to do that we use the following:
=MATCH("WEEK X",$B$1:$D$1,0)
and
=MATCH("PX",$A$2:$A$3,0)
Those will return integers, which we can then drop into an INDEX function to locate the text in the first table:
=INDEX($B$2:$D$3,MATCH("PX",$A$2:$A$3,0),MATCH("WEEK X",$B$1:$D$1,0))
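For readers less used to INDEX/MATCH, the lookup can be sketched in Python with list.index standing in for MATCH and list indexing standing in for INDEX. The variable names and the lookup function are assumptions, and only the Week1 column from the question's sample data is included.

```python
week_headers = ["Week1"]    # header row of the first table
people = ["P1", "P2"]       # first column of the first table
cells = [["L,L"], ["M,H"]]  # body of the first table, one row per person


def lookup(person: str, week: str) -> str:
    """Find the cell for a person and week, whatever order the headers are in."""
    row = people.index(person)     # like MATCH(person, people range, 0)
    col = week_headers.index(week)  # like MATCH(week, header range, 0)
    return cells[row][col]          # like INDEX(body, row, col)


print(lookup("P2", "Week1"))  # M,H
```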
AWESOME, we now know how to find the text from the table to drop into the counting formula that we started with. That last formula gets substituted in wherever there is STRING B!
=(LEN(INDEX($B$2:$D$3,MATCH("PX",$A$2:$A$3,0),MATCH("WEEK X",$B$1:$D$1,0)))-LEN(SUBSTITUTE(INDEX($B$2:$D$3,MATCH("PX",$A$2:$A$3,0),MATCH("WEEK X",$B$1:$D$1,0)), STRING A, "")))/LEN(STRING A)
Yeah, it's getting a little ugly, isn't it! String A is then whatever cell L is in in your second table. Replace "WEEK X" with your week header in your second table. Replace "PX" with the name of your person in your second table.
I would do the first formula, then copy it over under the M and under the H. Go into the M and H formulas and adjust them so they point at the right week header in each. Lock the row but not the column references for the week header and the string A cells. Lock the column but not the row for the person's name. Once you have that set up, copy all three formulas and paste them under each week. Then just copy your entire first row of table two down for the number of people you have and voila!
Proof of concept
The formulas I used in cells H3, I3, and J3 respectively
=(LEN(INDEX($B$2:$D$3,MATCH($G3,$A$2:$A$3,0),MATCH(H$1,$B$1:$D$1,0)))-LEN(SUBSTITUTE(INDEX($B$2:$D$3,MATCH($G3,$A$2:$A$3,0),MATCH(H$1,$B$1:$D$1,0)),H$2,"")))/LEN(H$2)
=(LEN(INDEX($B$2:$D$3,MATCH($G3,$A$2:$A$3,0),MATCH(H$1,$B$1:$D$1,0)))-LEN(SUBSTITUTE(INDEX($B$2:$D$3,MATCH($G3,$A$2:$A$3,0),MATCH(H$1,$B$1:$D$1,0)),I$2,"")))/LEN(I$2)
=(LEN(INDEX($B$2:$D$3,MATCH($G3,$A$2:$A$3,0),MATCH(H$1,$B$1:$D$1,0)))-LEN(SUBSTITUTE(INDEX($B$2:$D$3,MATCH($G3,$A$2:$A$3,0),MATCH(H$1,$B$1:$D$1,0)),J$2,"")))/LEN(J$2)
Here is another proof of concept with an expanded range, showing rows out of order, multiple-letter strings to search for, and more than two entries. Same formulas, I just had to adjust the lookup ranges for the increased table size.
If using VBA is acceptable, splitting the comma separated data using TextToColumns should help as a first processing step.
Then using a pivot table gives you the output you want.

Extract two numbers out of a string in Excel

I have a string from which I need two numbers extracted and separated into two columns like this.
ID:1234567 RXN:89012345
ID:12345 RXN:678901
Column 1   Column 2
1234567    89012345
12345      678901
The numbers can be varying number of characters. I was able to get column 2 number by using the following function:
=RIGHT(G3,FIND("RXN:",G3)-5)
However, I'm having a hard time getting the ID number separated.
Also, I need this to be a function as I will be using a macro to use over many spreadsheets.
A way to do this is:
Select all your data, assuming each cell always holds one string with both the ID and RXN numbers. So if you have 100 rows of such data, select all of them
Go to the Data tab, Text to Columns
Choose Delimited >> Next >> tick Space, and in Other type a colon (:) >> Finish
You will get "ID" in the first column of every row, the ID number in the second column, "RXN" in the third column and the RXN number in the fourth column.
Delete unwanted columns
With data in column A, in B1 enter:
=MID(A1,FIND("ID:",A1)+LEN("ID:"),FIND(" ",A1,FIND("ID:",A1)+LEN("ID:"))-FIND("ID:",A1)-LEN("ID:"))
and copy down. In C1 enter:
=MID(A1,FIND("RXN:",A1)+LEN("RXN:"),9999)
and copy down.
The column B formulas are a pretty standard way to capture a sub-string encapsulated by two other sub-strings.
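The "sub-string between two other sub-strings" pattern looks like this when sketched in Python. The helper names between and after are made up; they mirror the MID/FIND formulas for columns B and C respectively.

```python
def between(text: str, start_marker: str, end_marker: str) -> str:
    """Return the text after start_marker and before the next end_marker."""
    start = text.index(start_marker) + len(start_marker)
    end = text.index(end_marker, start)
    return text[start:end]


def after(text: str, marker: str) -> str:
    """Return everything after marker, like MID(..., 9999)."""
    return text[text.index(marker) + len(marker):]


value = "ID:1234567 RXN:89012345"
print(between(value, "ID:", " "))  # 1234567
print(after(value, "RXN:"))        # 89012345
```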
If your format is always as you show it, then:
B1: =TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1," ",REPT(" ",99)),":",REPT(" ",99)),99,99))
C1: =TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1," ",REPT(" ",99)),":",REPT(" ",99)),3*99,99))
We substitute a long string of spaces for the space and : in the original string. Then we extract the 2nd and 4th items and trim off the extra spaces.
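Here is a rough Python sketch of the same padding trick, in case it helps to see why the fixed offsets work. The function name nth_field and the width parameter are assumptions; the start position is shifted by one because Python slices are 0-based while MID is 1-based.

```python
def nth_field(text: str, n: int, width: int = 99) -> str:
    """Return the n-th item (1-based) after padding each delimiter to `width` spaces."""
    pad = " " * width
    padded = text.replace(" ", pad).replace(":", pad)
    # Excel's MID(padded, (n-1)*width, width), converted to a 0-based slice.
    start = max((n - 1) * width - 1, 0)
    return padded[start:start + width].strip()


value = "ID:1234567 RXN:89012345"
print(nth_field(value, 2))  # 1234567
print(nth_field(value, 4))  # 89012345
```

Each window of 99 characters is wide enough to contain exactly one item plus padding, provided no item is longer than the padding width.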

Resources