I have numerous files where the address field is a single line of text, for the most part separated by commas. My first step is to use the 'Replace' function in Excel to replace the commas with a carriage return. This turns an address from a single line into multiple lines.
The issue I'm looking for help with is that, after I complete the steps above, a leading space often remains on every line from the second onwards. I would like to know the best way to remove these leading spaces while keeping the multi-line format of the addresses.
I have tried using TRIM; however, this returns the address to a single line.
To show the pre- and post-transformation data I've added an image below, as I can't seem to get the format to show correctly in this post. Because my profile is new I also can't embed the image, so there is a link below showing the data before and after the transformation, and the leading-space issue I'm seeking help with.
As @Anonymous mentioned in a comment, replace the comma and the following space together using the SUBSTITUTE() function, and apply the Wrap Text format to the resulting cell.
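A minimal sketch of that suggestion, assuming the single-line address is in A1 (the cell reference is an assumption, not from the question). The inner SUBSTITUTE turns each comma-plus-space into a line feed, and the outer one catches any remaining commas that had no space after them:

```excel
=SUBSTITUTE(SUBSTITUTE(A1,", ",CHAR(10)),",",CHAR(10))
```

With Wrap Text switched on for the formula cell, each part of the address appears on its own line with no leading space.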


How can I substitute multiple occurrences of junk strings in Excel?

In the image, 'muddle' is the string containing junk words and the strings I want to extract. There is a fixed list of junk words - the good strings could be literally anything.
You can see this formula has correctly extracted "moo" and "coo", which are not in the list of junk words. The formula is below.
TEXTJOIN("; ",TRUE,MID(Table2[muddle],goodstart,goodchars)))
This works well, but it falls down if a junk word occurs more than once. See below.
The only difference is that 'woo' occurs twice in the second example.
I need a single cell solution. VBA is not an option for me. Using the name manager would be untidy, as would nested formulas.
I've got this far with formulas, which as far as I can tell is the furthest anyone has got with the 'removing multiple words from a cell' problem. I can see the issue - once SEARCH locates the start of a string in a cell, it doesn't go looking for a second occurrence of that string. But I don't know how to find the start of every instance of every string. Can anyone help?
REDUCE is perfect for this:
REDUCE starts with the Table2[muddle] value as m, then substitutes the first value of Table1[junkwords], j, with "". The outcome becomes the new m, which then has the second value of j substituted. That result becomes the new m, and so on.
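The formula itself is not shown above. A minimal sketch consistent with the description, using the parameter names m and j from the explanation and the table references from the question, might be:

```excel
=REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(m,j,SUBSTITUTE(m,j,"")))
```

Because SUBSTITUTE replaces every occurrence of j, a junk word that appears twice is removed both times. Note that the remaining good strings are joined without a separator, and that REDUCE and LAMBDA require a Microsoft 365 version of Excel.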
If you want the result comma-separated it becomes more complicated, but you can achieve it with:
This does almost the same as the previous solution, but instead of substituting with blanks it substitutes with a comma, and then replaces every double comma (,,) with a single one, so that consecutive substitutions result in one comma. Also, if the first and/or last part is substituted by a single comma, the result would have a leading and/or trailing comma. This is solved by first adding a comma at the front and back before replacing the double commas with singles. The result t is then wrapped in MID, which removes the first and last character (both being a comma).
Alternate solution:
=LET(t,REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(x,y,SUBSTITUTE(x,y," "))),
SUBSTITUTE(TRIM(t)," ",","))
Or in one go if you don't want to use LET:
=SUBSTITUTE(TRIM(REDUCE(Table2[muddle],Table1[junkwords],LAMBDA(x,y,SUBSTITUTE(x,y," "))))," ",",")
This replaces the junk words with a space. Regardless of how many junk words sit between the good words, or how many leading or trailing spaces result, TRIM reduces the string to the words separated by a single space. Substituting a comma for each space then gives your result.
There's no single-formula solution if the junkwords list is not fixed.
Instead, you may choose to use the SUBSTITUTE() function on each cell of the "Extracted Strings" column to remove all occurrences of each junk word in muddle, i.e. substitute "boo" in muddle, then substitute "voo" in the resulting string, then "noo" in that result, and so on. The last cell holds the final result.
One point to note, though: you need to ensure there are no substring or partial-string overlaps among the junk words, or you need to define the processing rules for the solution to be "complete". Consider the following:
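The chain of substitutions can also be written as one nested formula. A minimal sketch, assuming the muddle string is in A2 (the cell reference is an assumption) and using the three junk words named above:

```excel
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"boo",""),"voo",""),"noo","")
```

Each additional junk word needs one more SUBSTITUTE wrapped around the formula, which is why this only suits a short, fixed list.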
junk words = abc, def, cde
muddle = 1234abcdef5678
If you process the junk words in the above order, you get "12345678".
If you process the junk words in reverse order, you get "1234abf5678".
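The order dependence can be checked directly with nested SUBSTITUTE calls. This is only an illustration of the example above, with the muddle string typed in as a literal:

```excel
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE("1234abcdef5678","abc",""),"def",""),"cde","")
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE("1234abcdef5678","cde",""),"def",""),"abc","")
```

The first formula removes abc, then def, and returns 12345678. The second removes cde first, which breaks up both abc and def, so it returns 1234abf5678.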

How to search for items with multiple "-" in excel or VBA?

I have a list of item numbers (100K) like this:
Some of the items (thousands) have a format like SAG571A-244-4 and need to be filtered out so I can delete them and keep only the items that have ONE hyphen per SKU. How can I isolate the items that have two instances of "-" in their SKU? I'm open to solutions within Excel or using VBA.
Native text filters don't seem to be capable of this. I'm stumped.
As per John Coleman's comment, "*-*-*" can be used to isolate strings that have at least two dashes in them.
I would add that if you're entering them as a custom text filter, you should lose the double quotes (so just *-*-*) as otherwise the field seems to interpret the quotes literally.
Seems to work for me.
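The same wildcard pattern can also drive a helper column, which may be easier to filter or sort on across a long list. A minimal sketch, assuming the item numbers are in column A starting at A1:

```excel
=COUNTIF(A1,"*-*-*")>0
```

This returns TRUE for SAG571A-244-4 and FALSE for an item with fewer than two hyphens. Fill down, filter on TRUE, and delete those rows.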
If you just want an Excel formula to verify this and give you the number of hyphens (0, 1, or 2+), here is one:
Replace A1 with your relevant column, then fill down. This is a rather poor way to do it performance-wise, but you avoid using VBA and possibly xlsm files.
The formula first checks whether there is one hyphen; if there is, it checks whether there is another hyphen after the position where the first one was found. Looking for multiple hyphens in this manner is cumbersome and I don't recommend it.
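The formula is not shown above. A minimal sketch that matches the description, assuming the SKU is in A1, might look like the first line below. The second line is a common alternative that counts every hyphen directly by comparing the text length before and after removing them:

```excel
=IF(ISNUMBER(FIND("-",A1)),IF(ISNUMBER(FIND("-",A1,FIND("-",A1)+1)),"2+",1),0)
=LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))
```

For SAG571A-244-4 the first formula returns "2+" and the second returns 2.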

Trim text after Bracket in excel

I would like to trim off the text which is after the bracket in the cell Value
The formula I'm currently using keeps giving me an error and does not extract the targeted string.
=LEFT([@[Name ]],FIND("(",[@[Name ]]))
I want to go shopping (Today)
Goal: Is to remove
Expected Result:
I want to go shopping
One of these should do.
=TRIM(LEFT([name], FIND("(", [name]&"(")-1))
=TRIM(REPLACE([name], FIND("(", [name]&"("), LEN([name]), TEXT(,)))
Note that I suffixed the original text with the character that the FIND is looking for. In this manner, it will always be found even if it is not in the original text.
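To see why the appended bracket and the -1 matter, here is the first formula as a sketch against a plain cell reference. A1 is an assumption standing in for the Name column:

```excel
=TRIM(LEFT(A1,FIND("(",A1&"(")-1))
```

With "I want to go shopping (Today)" in A1, FIND returns 23, LEFT keeps the first 22 characters, and TRIM drops the trailing space, giving "I want to go shopping". If A1 contains no bracket, FIND locates the appended one at LEN(A1)+1, so the whole text is returned instead of an error.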
You may find that you have a rogue trailing space in the Name header label.

Vlookup Not working on text between two tables

This is not your average vlookup error.
I have two Power Query tables that I've setup. One is coming from a CSV file with a list of names. The other is from a website pulling a list of names.
=John Smith = John Smith would not be true for some reason.
The VLOOKUP should be able to find the name easily. I've tried PROPER, UPPER, CLEAN, TRIM, Text to Columns and everything else I could think of. I've changed data types to no avail.
I know that one query is causing the issue. I can type the name exactly and do a vlookup from one, and it works. The second query that I do this to doesn't return anything on the typed text.
Anyone encounter this issue while using Power Query?
EDIT: See Jeeped's Answer - When I replace the space from the web query with a normal space it works.
@Jeeped's comment has a good answer:
Assuming you have already trimmed off leading and trailing spaces, one of the John Smith entries (likely the one from the web) uses a non-breaking space (i.e. CHAR(160) or ASCII 0xA0) instead of a regular space (i.e. CHAR(32) or ASCII 0x20). Use
=CODE(MID(A$1, ROW(1:1), 1))
on both, fill down to get an ASCII code for each character, and compare the numbers.
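Once the comparison shows a 160 where the other name has a 32, the non-breaking space can be swapped for a regular one before the lookup. A minimal sketch, assuming the web-sourced name is in A1:

```excel
=TRIM(SUBSTITUTE(A1,CHAR(160),CHAR(32)))
```

Running the VLOOKUP against this cleaned value, rather than the raw web text, should then match the name from the CSV.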

CSV - need comma at the start when first field is not present - csv generated from excel

I am using excel to generate comma separated values. I have a parser to parse the csv data and insert to database.
The issue I face is, the first field in my csv is not mandatory. When the first field is null, the generated csv has no comma BEFORE the second field and for the parser the second field becomes the first field.
When the first field is null, I expect the data to be like below.
I have tried:
Putting a space in the first field. In this case I will have to change my parser.
Putting a static header. The comma then appears as expected in the underlying rows when the first field is null, but a change to the parser will be required.
Putting a comma in the first field, but this is written out as ",". :-)
Can someone suggest some solutions or workarounds?
Quick workaround: why don't you check how many values are present? If one is missing, assume it's the first one.
I've found this question that may help you. In a nutshell: apply any formatting to the range of cells you are using so Excel doesn't skip any of them when exporting. Also, I think that if you can swap the first column (optional) with any other one (required), it will work, too.
