Keeping leading zeros with find and replace - excel

I'm using Excel's Find and Replace to remove hyphens from long numbers. They are a mix of birth dates and organisation numbers that have been padded with leading zeros so that they all have the same number of characters. There are a LOT of numbers, so Find and Replace seems to be the simplest way to remove the hyphens.
But when I use Find and Replace the leading zeros are dropped, and I have not found a way to keep them afterwards.
For example, I have:
19551230-1234
01234567-8901
and after find and replace I have
1,95512E+11
12345678901
but want the format as:
195512301234
012345678901
So I want to keep the leading zeros after find and replace. I've tried formatting the cells as text, but it doesn't work as the find and replace automatically truncates the leading zero and keeps the remaining characters, so the zero is completely removed. I am using Excel 2010, but answers for several versions are appreciated.

Just put a single quote in front of the number, e.g. '01234. Excel will then take the entry literally as text, and the quote will not show in the cell.

Use the SUBSTITUTE formula instead of Find and Replace like so:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1," ",""),"/",""),")",""),"(",""),"-","")
The result is text.
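The reason this keeps the zeros is that the result stays a text string and is never re-parsed as a number. Here is a minimal Python sketch of the same idea. Python only stands in for the worksheet formula, and the name strip_hyphens is purely illustrative:

```python
def strip_hyphens(value: str) -> str:
    """Remove hyphens while keeping the value as text, so leading zeros survive."""
    return value.replace("-", "")


samples = ["19551230-1234", "01234567-8901"]
cleaned = [strip_hyphens(s) for s in samples]
print(cleaned)  # ['195512301234', '012345678901']

# Converting to a number is what loses the zero, which mirrors what
# Find and Replace does when Excel re-parses the edited cell.
as_number = int(cleaned[1])
print(as_number)  # 12345678901
```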

Related

How can I substitute multiple occurrences of junk strings in Excel?

In the image, 'muddle' is the string containing junk words and the strings I want to extract. There is a fixed list of junk words - the good strings could be literally anything.
You can see this formula has correctly extracted "moo" and "coo", which are not in the list of junk words. The formula is below.
=LET(junkStart,FILTER(SEARCH(Table1[junkwords],Table2[muddle]),ISNUMBER(SEARCH(Table1[junkwords],Table2[muddle]))),
junkEnd,FILTER(SEARCH(Table1[junkwords],Table2[muddle])+LEN(Table1[junkwords])-1,ISNUMBER(SEARCH(Table1[junkwords],Table2[muddle])+LEN(Table1[junkwords])-1)),
goodstart,FILTER(junkEnd+1,(junkEnd+1<=LEN(Table2[muddle]))*(ISERROR(XMATCH(junkEnd+1,junkStart)))),
goodend,FILTER(junkStart-1,(junkStart-1>=LEN(1))*(ISERROR(XMATCH(junkStart-1,junkEnd))))+1,
goodchars,goodend-goodstart,
TEXTJOIN("; ",TRUE,MID(Table2[muddle],goodstart,goodchars)))
This works well, but it falls down if a junk word occurs more than once. See below.
The only difference is that 'woo' occurs twice in the second example.
I need a single cell solution. VBA is not an option for me. Using the name manager would be untidy, as would nested formulas.
I've got this far with formulas, which as far as I can tell is the furthest anyone has got with the 'removing multiple words from a cell' problem. I can see the issue - once SEARCH locates the start of a string in a cell, it doesn't go looking for a second occurrence of that string. But I don't know how to find the start of every instance of every string. Can anyone help?
REDUCE is perfect for this:
=REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(m,j,SUBSTITUTE(m,j,"")))
REDUCE starts with the Table2[muddle] value as m, then substitutes the first value of Table1[junkwords] (j) with "". The outcome becomes the new m, which then has the second value of j substituted, and so on until every junk word has been processed.
If you want the result comma-separated it becomes more complicated, but it can be done like this:
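The same fold can be sketched in Python with functools.reduce, which may make the accumulator idea easier to see. The sample strings below are hypothetical stand-ins for the table columns, not data from the question:

```python
from functools import reduce
from typing import Iterable


def remove_junk(muddle: str, junk_words: Iterable[str]) -> str:
    """Fold over the junk words; each step removes every occurrence of one word."""
    # m is the accumulator (the partly cleaned string), j is the current junk word.
    return reduce(lambda m, j: m.replace(j, ""), junk_words, muddle)


# Hypothetical example: "woo" occurs twice and both occurrences are removed.
print(remove_junk("boomoowoocoowoo", ["boo", "woo"]))  # moocoo
```

Like SUBSTITUTE, str.replace is case-sensitive and replaces all occurrences, which is why repeated junk words are no longer a problem.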
=LET(t,SUBSTITUTE(","&REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(x,y,SUBSTITUTE(x,y,",")))&",",",,",","),
MID(t,2,LEN(t)-2))
This does almost the same as the previous solution, but instead of substituting junk words with nothing it substitutes them with a comma, and then replaces each double comma (,,) with a single one, so that two junk words that follow each other result in one comma (longer runs of junk words would need this replacement repeated). If the first and/or last part of the string is a junk word, the result would have a leading and/or trailing comma. This is handled by adding a comma at the front and back before replacing the double commas with singles. The result t is then wrapped in MID, which removes the first and last character (both being a comma).
Alternate solution:
=LET(t,REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(x,y,SUBSTITUTE(x,y," "))),
SUBSTITUTE(TRIM(t)," ",","))
Or in one go if you don't want to use LET:
=SUBSTITUTE(TRIM(REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(x,y,SUBSTITUTE(x,y," "))))," ",",")
This replaces the junk words with a space. Regardless of how many junk words sit between words, or how many leading or trailing spaces there are, TRIM reduces it to the words separated by a single space. Substituting the spaces with commas then gives your result.
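As a rough Python sketch of this replace-with-space, trim, then join approach (the function name and sample data are hypothetical):

```python
from functools import reduce
from typing import Iterable


def extract_good(muddle: str, junk_words: Iterable[str], sep: str = ",") -> str:
    """Replace each junk word with a space, then join what is left with sep."""
    spaced = reduce(lambda m, j: m.replace(j, " "), junk_words, muddle)
    # split() with no argument drops leading/trailing whitespace and collapses
    # runs of whitespace, which plays the role of TRIM here.
    return sep.join(spaced.split())


print(extract_good("boomoowoocoowoo", ["boo", "woo"]))  # moo,coo
```

As with the formula, this assumes the good strings themselves contain no spaces. Note also that split() treats any whitespace as a separator, whereas TRIM only handles spaces.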
There's no single-formula solution if the junkwords list is not fixed.
Instead, you may choose to use the SUBSTITUTE() function on each cell of the "Extracted Strings" column to remove all occurrences of each junk word from muddle: substitute "boo" in muddle, then substitute "voo" in the resulting string, then "noo" in that result, and so on. The last cell then holds the final result.
One point to note, though: you need to ensure there is no substring / partial-string problem among the junk words, or you need to define the order of processing for the solution to be "complete". Consider the following:
junk words = abc, def, cde
muddle = 1234abcdef5678
If you process the junk words in the above order, you get "12345678".
If you process the junk words in reverse order, you get "1234abf5678".
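The order dependence is easy to reproduce. Here is a small Python sketch using the junk words and muddle from the example (the helper name remove_in_order is illustrative only):

```python
from functools import reduce


def remove_in_order(muddle, junk_words):
    """Remove each junk word in turn, in the order given."""
    return reduce(lambda m, j: m.replace(j, ""), junk_words, muddle)


junk = ["abc", "def", "cde"]
muddle = "1234abcdef5678"

forward = remove_in_order(muddle, junk)
backward = remove_in_order(muddle, list(reversed(junk)))

print(forward)   # 12345678
print(backward)  # 1234abf5678  ("cde" is removed first, so "abc" and "def" no longer match)
```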

how can I calculate how many characters trimstart removes

I have a string, and I need to calculate the number of spaces that I remove when I do trimStart.
For example, I have the following string \t\t \tabcs
so I have two tabs, two spaces and another tab that will be removed by trimStart (the rest are non-whitespace characters).
I need to know how many spaces (columns) will be removed. Since I don't know how wide a \t is, I can't just count it as a single character.
(My purpose is to calculate the column shift of a string due to the trimming action. Obviously, comparing the lengths before and after the trim will not give me the desired result.)
Do you have any ideas?
Thanks!
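One possible approach, sketched here in Python as an assumption rather than an answer from the thread: isolate the leading whitespace and expand its tabs to the next tab stop for a chosen tab width. The tab width is something the caller has to decide (4 and 8 below are just examples), lstrip() stands in for trimStart, and the sketch assumes the leading whitespace contains no line breaks.

```python
def leading_columns(text: str, tab_width: int = 4) -> int:
    """Return how many display columns stripping leading whitespace removes."""
    stripped = text.lstrip()
    leading = text[: len(text) - len(stripped)]
    # expandtabs() advances each tab to the next multiple of tab_width,
    # so the length of the expanded prefix is the column shift.
    return len(leading.expandtabs(tab_width))


line = "\t\t  \tabcs"  # two tabs, two spaces, one tab
print(leading_columns(line, 4))  # 12
print(leading_columns(line, 8))  # 24
```

Only five characters are removed in this example, but the column shift depends on where each tab lands relative to the tab stops.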

How in Excel to Remove only 1st comma (exact character, symbol or string)

I have numbers which, when they reach 1000, have a comma "," before the hundreds, like 1,234,00
How can I remove only the first comma (or keep only the second one) so that the value becomes 1234,00, or whatever form Excel accepts as a number, e.g. with a space as the thousands separator?
So far I have this formula for extracting the number:
=MID(LEFT($A604;FIND(" on ";$A604)-1);FIND("?";$A604)+1;LEN($A604))*1
To remove the commas I wrapped it in SUBSTITUTE, but that removes all of them and makes the number wrong (too high), like 123400:
=SUBSTITUTE(MID(LEFT($A604;FIND(" on ";$A604)-1);FIND("?";$A604)+1;LEN($A604));",";"")*1
The issue is that the format #,##0 puts a comma before every third digit. You need to treat the value as a string.
Try this in B2:
=IF(A2<1000,A2,CONCATENATE(MID(A2,1,LEN(A2)-3),",",MID(A2,LEN(A2)-2,3)))
Depending on your use, it might be best to remove the IF.
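For the question as asked (dropping the thousands comma but keeping the decimal comma), note that Excel's SUBSTITUTE accepts an optional fourth argument, instance_num, which limits the replacement to one occurrence. The Python sketch below uses a slightly different technique: it removes every comma except the last one, so that values under 1000 such as 234,00 keep their decimal comma. It assumes the last comma is always the decimal separator, and the function names are hypothetical.

```python
def drop_thousands_commas(value: str) -> str:
    """Remove every comma except the last one (assumed to be the decimal comma)."""
    head, sep, tail = value.rpartition(",")
    # With no comma at all, rpartition returns ("", "", value), so value is unchanged.
    return head.replace(",", "") + sep + tail


def to_number(value: str) -> float:
    """Convert a decimal-comma string such as '1234,00' to a float."""
    return float(drop_thousands_commas(value).replace(",", "."))


print(drop_thousands_commas("1,234,00"))  # 1234,00
print(drop_thousands_commas("234,00"))    # 234,00
print(to_number("1,234,00"))              # 1234.0
```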

Convert general text to number when prefixed with a ~

I have a column in Excel which is formatted as General and contains numbers, some of which are prefixed with a ~. I know that this character represents leading zeroes, but in some cases it stands for one, and in others for two, three or more leading zeroes.
Is there a way to convert this to a number and preserve the correct number of leading zeroes? I need this to lookup on another list and match them, and the format must be identical.
There needs to be a way to determine how many leading zeros are required. If you want to replace each tilde with a single leading 0, use the Replace dialog: search for two tildes (~~, because ~ is the escape character in Find, so a literal tilde has to be doubled) and replace with a single quote followed by 0, i.e. '0.
(Use as many zeros as are required.)

Deleting everything after whitespace in Excel

I have a massive list of dates that are in a few different formats. What I would like to do is get rid of anything past the first whitespace character, whether it be a space, newline, tab, etc. I've found a lot of answers detailing how to get rid of whitespace, but not much about deleting substrings based on the location of whitespace. Example below:
BEFORE AFTER
37893 37893
37801 37801
37710 37710
37620 37620
36980 36980
06/30/2014\nUSD 06/30/2014
03/31/2014\nUSD 03/31/2014
12/31/2013\nUSD 12/31/2013
09/30/2013\nUSD 09/30/2013
06/30/2013\nUSD 06/30/2013
03/31/2013\nUSD 03/31/2013
12/31/2012\nUSD 12/31/2012
etc...
For your example data, this would suffice:
=LEFT(A1,10)
To format as dates, you could do this:
=TEXT(LEFT(A1,10),"mm/dd/yyyy")
Here is a possible formula solution.
=IFERROR(--REPLACE(A1, IFERROR(FIND(CHAR(10), A1),LEN(A1)+1),LEN(A1), ""),REPLACE(A1, IFERROR(FIND(CHAR(10), A1),LEN(A1)+1),LEN(A1), ""))
That might seem overly complex but it guards against cells that may or may not have a line feed as well as attempting to convert numbers to numbers and dates to dates while leaving text alone. You will have to format the cells to change returned values like 41820 to 6/30/2014.
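The formulas above look for a line feed specifically. For the more general request (cut at the first whitespace character of any kind), the logic can be sketched in Python as follows; the function name is illustrative and the sample values come from the question's table:

```python
def before_whitespace(value: str) -> str:
    """Return the text up to the first whitespace character (space, tab, newline, ...)."""
    # split(None, 1) splits once on any run of whitespace; leading whitespace is skipped.
    parts = value.split(None, 1)
    return parts[0] if parts else ""


print(before_whitespace("06/30/2014\nUSD"))  # 06/30/2014
print(before_whitespace("37893"))            # 37893
```

Values without any whitespace pass through unchanged, which matches the numeric rows in the example.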
