Eliminate Characters In Excel

I have thousands of file names like this one: LO_Oszukane_169_Pol___MP2_.mpg
Notice how there are 3 underscores after Pol. I need to remove all excess underscores and leave just one.
How could I achieve this in Excel?
I've attempted REPLACE and SUBSTITUTE.
First time using StackOverflow, looking forward to seeing your responses!

To replace an arbitrary number of underscores, you can use:
=SUBSTITUTE(TRIM(SUBSTITUTE(A1,"_"," "))," ","_")
This assumes that you don't have spaces in your filenames (or, if you do, that you also want them replaced with underscores) and that there are no underscores at the beginning or end that you want to keep. Also note that it keeps one underscore if it sits right before the extension.
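For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same three steps (underscores to spaces, trim, spaces back to underscores). The function name `collapse_underscores` is hypothetical, and the trimming step only approximates TRIM() by handling the ordinary space character.

```python
def collapse_underscores(name: str) -> str:
    # Step 1, like SUBSTITUTE(A1,"_"," "): turn every underscore into a space.
    spaced = name.replace("_", " ")
    # Step 2, like TRIM(): drop leading/trailing spaces and collapse runs of
    # spaces; splitting on " " and discarding empty pieces does both.
    words = [piece for piece in spaced.split(" ") if piece]
    # Step 3, like SUBSTITUTE(...," ","_"): rejoin with single underscores.
    return "_".join(words)


print(collapse_underscores("LO_Oszukane_169_Pol___MP2_.mpg"))
# prints LO_Oszukane_169_Pol_MP2_.mpg
```

As in the formula, any underscore at the very start or end of the name is dropped, and existing spaces come back as underscores.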

You need to replace three underscores with one. Use the SUBSTITUTE() function as shown below.
=SUBSTITUTE(A1,"___","_")
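One caveat worth checking: a single pass of this replacement only fully collapses runs of exactly three underscores. The Python sketch below uses `str.replace`, which replaces non-overlapping occurrences from left to right in one pass; the function name `replace_triple` and the sample names are made up for illustration.

```python
def replace_triple(name: str) -> str:
    # Same idea as SUBSTITUTE(A1,"___","_"): one left-to-right pass.
    return name.replace("___", "_")


print(replace_triple("Pol___MP2"))   # prints Pol_MP2
print(replace_triple("Pol__MP2"))    # prints Pol__MP2 (run of two is untouched)
print(replace_triple("Pol____MP2"))  # prints Pol__MP2 (run of four leaves two)
```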

Related

In Excel, finding if cell contains any character other than letters, a dot . , single quotation and space

I have the following table, and would like to identify the cells (as HIT) that contain characters other than:
letters
dot .
single quotation
Which formula can I use for this? I've tried different functions, they don't seem to work.
I think there will be a bunch of possibilities. Here is one using the logic that we will check every character in your string against all characters you'd like to exclude:
Formula in B2:
=IF(SUM(--ISERROR(SEARCH(MID(A2,SEQUENCE(LEN(A2)),1),"abcdefghijklmnopqrstuvwxyz'. "))),"Hit","No Hit")
Note: I deliberately included a space in the list since you seem to want to allow that too.
Other options could be:
=IF(REDUCE(LOWER(A2),MID("abcdefghijklmnopqrstuvwxyz.' ",SEQUENCE(29),1),LAMBDA(a,b,SUBSTITUTE(a,b,"")))<>"","Hit","No Hit")
Or with FILTERXML():
=IF(ISERROR(FILTERXML("<t><s>"&LOWER(A2)&"</s></t>","//s[translate(., ""abcdefghijklmnopqrstuvwxyz.' "", '')!='']")),"No Hit","Hit")
These options are more verbose, though, and both SUBSTITUTE() and FILTERXML() are case-sensitive whereas SEARCH() is not, which is why they wrap A2 in LOWER().
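The character-by-character check used in the first formula can be sketched in Python as follows. This is only an illustration of the logic, not a replacement for the formulas: the names `ALLOWED` and `classify` are hypothetical, and lowercasing each character stands in for the case-insensitive behavior of SEARCH().

```python
# Characters that do NOT count as a hit: letters, apostrophe, dot, space.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz'. ")


def classify(cell: str) -> str:
    # Flag the cell as soon as any character falls outside the allowed set.
    has_other = any(ch.lower() not in ALLOWED for ch in cell)
    return "Hit" if has_other else "No Hit"


print(classify("O'Neil Jr."))  # prints No Hit
print(classify("John3"))       # prints Hit
```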
So, perhaps easier to do:
=IF(MAX(IFERROR(FIND(".",A1,1),0),IFERROR(FIND("'",A1,1),0))>0,"No Hit","Hit")
I did not include a space but that is easily edited in.

How can I split a phrase into a new line every x characters on Google Sheets?

I am translating a game, and the game's text box only supports 50 characters max per line. Is there a way to use a formula to split the entire sentence every 50 characters or whole word (49, 48, 47, etc)?
I am currently working with this formula.
=JOIN(CHAR(10),SPLIT(REGEXREPLACE(A1, "(.{50})", "/$1"),"/"))
The problem with this formula is that it splits at exactly 50 characters (one time) and will split in the middle of a word.
So again, my goal is to have it not split on the 50th character IF the 50th character is in the middle of a word, and for the rule to apply to the rest of the lines too, because at the moment it only applies to the first line.
Please take a look at this test google sheet to get an example of what I am talking about.
If it's impossible to do it on Google Sheets, I don't mind moving to Excel provided I get a functioning code.
For the record, I did ask in Google's product forums 2 days ago, and still haven't received an answer.
=REGEXREPLACE(A1, "(.{1,50})\b", "$1" & CHAR(10))
{50} matches exactly 50 times, but what you need is 50 or fewer, which is {1,50}.
\b is a word boundary: it matches between an alphanumeric and a non-alphanumeric character.
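To see how this pattern behaves, here is a small Python sketch. It is an approximation under assumptions: Python's `re` module is not the regex engine Google Sheets uses, the function name `wrap_at_word_boundary` is hypothetical, and the sample sentence is borrowed from the question.

```python
import re


def wrap_at_word_boundary(text: str, width: int = 50) -> str:
    # Match 1..width characters that end on a word boundary, then append
    # a line feed after each match (the CHAR(10) in the formula).
    pattern = r"(.{1," + str(width) + r"})\b"
    return re.sub(pattern, lambda m: m.group(1) + "\n", text)


sample = ("I am translating a game, and the game's text box only supports "
          "50 characters max per line.")
for line in wrap_at_word_boundary(sample).split("\n"):
    # repr() makes trailing spaces visible
    print(len(line), repr(line))
```

In this Python run, no line exceeds 50 characters, but the space before a break stays at the end of the line, and punctuation at the very end of the text can be left on its own after the last break, because \b only considers word characters.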
=REGEXEXTRACT(A1,"(?ism)^"&REPT("([\w\d'\(\),. ]{0,49}\s)", ROUNDUP(LEN(A1)/50,0))&"([\w\d'\(\),. ]{0,49})$")
Tested with various expressions and works as intended. Note that only these characters are allowed: [a-zA-Z0-9_'(),.] and the space, which means that - and any other characters not listed will not work. If you need them, add them to the character class inside the REPT expression and to the one in the final part of the regex. Otherwise, this will work perfectly.
You are pretty close. I'm not an expert in Sheets, so not sure if this is the best way, but your Regex is wrong for what you want.
Also, you need to be certain that you don't use a split character that might appear in the phrase itself. However, using CHAR(10) for the replace character allows you to insert LF without going through the JOIN SPLIT sequence.
The formula below works in these steps:
Replace any line feeds, carriage returns and spaces with a single space.
Match strings that start with a non-space character followed by up to 49 more characters, which are followed by a space or the end of the string.
Replace the match with the capture group followed by CHAR(10) (deleting the space that followed).
There will be an extra CHAR(10) at the end, which you can strip off.
EDIT: The regex changed slightly because of a difference in behavior between Google's RE and what I am used to (probably related to how a non-backtracking regex works). The problem showed up on your example:
=regexreplace(REGEXREPLACE(REGEXREPLACE(A1 & " ","[\r\n\s]+"," "),"(\S.{0,49})\s","$1" & char(10)),"\n+\z","")
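The same three stages can be sketched in Python to see what each one contributes. This is an illustration under assumptions: Python's `re` module is a backtracking engine rather than the one Google Sheets uses, the function name `wrap_text` and the `width` parameter are hypothetical, and the sample sentence is taken from the question.

```python
import re


def wrap_text(text: str, width: int = 50) -> str:
    # Stage 1: append a space, then collapse every run of whitespace
    # (spaces, CR, LF) into a single space.
    normalized = re.sub(r"\s+", " ", text + " ")
    # Stage 2: match a non-space character plus up to width-1 more characters,
    # followed by whitespace; replace that whitespace with a line feed.
    pattern = r"(\S.{0," + str(width - 1) + r"})\s"
    wrapped = re.sub(pattern, lambda m: m.group(1) + "\n", normalized)
    # Stage 3: strip the extra line feed(s) left at the end.
    return re.sub(r"\n+\Z", "", wrapped)


sample = ("I am translating a game, and the game's text box only supports "
          "50 characters max per line.")
for line in wrap_text(sample).split("\n"):
    print(len(line), line)
# prints:
# 48 I am translating a game, and the game's text box
# 41 only supports 50 characters max per line.
```

A single word longer than the width has no place to break, so a line containing one can still exceed the limit.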

Keeping leading zeros with find and replace

I'm using Excel's find and replace to remove hyphens from long numbers. They are a mix of birth dates and organisation numbers that have been padded with leading zeros to have the same number of characters. There are a LOT of numbers, so find and replace seems to be the simplest way to remove the hyphens.
But when I use find and replace the leading zeros are truncated, and I have not found a way to keep them afterwards.
For example I have:
19551230-1234
01234567-8901
and after find and replace I have
1,95512E+11
12345678901
but want the format as:
195512301234
012345678901
So I want to keep the leading zeros after find and replace. I've tried formatting the cells as text, but that doesn't work, because find and replace automatically drops the leading zero and keeps the remaining characters, so the zero is removed completely. I am using Excel 2010, but answers for other versions are appreciated.
Just put a single quote in front of the number, e.g. '01234. Excel will take the value literally as entered, and the quote will not show in the cell.
Use the SUBSTITUTE formula instead of Find and Replace like so:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1," ",""),"/",""),")",""),"(",""),"-","")
The result is text.
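The reason a text result matters can be shown with a short Python sketch: as long as the value stays a string, removing the hyphen cannot drop the zero, while converting it to a number does. The function name `strip_hyphens` is hypothetical, and the sample value comes from the question.

```python
def strip_hyphens(value: str) -> str:
    # The value stays text, so every digit (including leading zeros) survives.
    return value.replace("-", "")


print(strip_hyphens("19551230-1234"))       # prints 195512301234
print(strip_hyphens("01234567-8901"))       # prints 012345678901
print(int(strip_hyphens("01234567-8901")))  # prints 12345678901 (zero lost)
```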

How to replace wildcharacter in CSV

I have the string below in CSV files:
Part Number WP1166496 (AP6005317) replaces 1166496, 1156976.
Expected Output -
Part Number WP1166496 replaces 1166496, 1156976.
I want to replace (AP6005317) with a blank.
There are many rows with different values, so how can I replace every such bracketed string with a blank?
I don't know how to achieve this in Microsoft Excel.
If you look at the find and replace feature, you will most probably see an option to replace using regular expressions.
Enable the regular expression option and replace \(.*\) with a single space. This should solve your problem.
Note: This was tested and verified in LibreOffice Calc.
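To preview what the pattern does before running it over many rows, here is a minimal Python sketch using the sample row from the question. The second pattern is a hypothetical variant, not part of the answer: it also consumes one preceding space and stops at the first closing bracket.

```python
import re

line = "Part Number WP1166496 (AP6005317) replaces 1166496, 1156976."

# Pattern from the answer: replace everything from the first "(" to the
# last ")" with a single space. The spaces around the brackets remain.
as_answered = re.sub(r"\(.*\)", " ", line)

# Hypothetical variant: also remove one preceding space and stop at the
# first ")" so several bracketed groups on one line are handled separately.
tidy = re.sub(r" ?\([^)]*\)", "", line)

print(repr(as_answered))
print(tidy)  # prints Part Number WP1166496 replaces 1166496, 1156976.
```

The variant reproduces the expected output exactly, whereas the answer's pattern leaves extra spaces where the bracketed text used to be.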

What would be the exact formula for removing and adding these special characters in Excel?

Say I have this text (excluding quotes).
"XE Premium (TT) 2.0T"
I want the above text to show up like this. I am replacing the spaces and the period with dashes, and removing the brackets completely.
"XE-Premium-TT-2-0T"
So far I only know how to do one of those things at a time like this.
=SUBSTITUTE(TRIM(A38)," ","-")
How do I do all of them at the same time in Excel?
You can do all at once:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(A38)," ","-"),")",""),"(",""),".","-")
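The nesting can be easier to follow when written as a chain. Below is a minimal Python sketch that applies the same replacements in the same order, innermost first; the function name `slugify_model` is hypothetical, and the first line only approximates TRIM() for ordinary spaces.

```python
def slugify_model(name: str) -> str:
    # Like TRIM(): drop leading/trailing spaces and collapse repeated spaces.
    trimmed = " ".join(piece for piece in name.split(" ") if piece)
    # Same order as the nested SUBSTITUTE() calls, innermost first.
    return (trimmed.replace(" ", "-")
                   .replace(")", "")
                   .replace("(", "")
                   .replace(".", "-"))


print(slugify_model("XE Premium (TT) 2.0T"))  # prints XE-Premium-TT-2-0T
```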
