Removing Unnecessary Characters from Excel Cell - excel

Below is a list of some cells with unnecessary text. The text to remove would be the forward slash and whatever follows it (/%%), any dash (-), and empty spaces.
Text and Result
| Text | Result |
|:--------|:---------|
| DW80R201UB/AA| DW80R201UB |
| DW80R201UW/AA| RDW80R201UW |
| DWT24PNA12| RDWT24PNA12 |
| DV-2A/XAA| RDV2A |
| 1DV-MCK/A1| RDVMCK |
| 1HAFCU1/XAA| RHAFCU1 |
| HAF-CIN/EXP| RHAFCIN |
For entries with the forward slash, I use =SUBSTITUTE(A1,RIGHT(A1,LEN(A1)-FIND("/",A1)+1),"") since there can be more than one character after the forward slash.
For everything else, I would use =SUBSTITUTE(SUBSTITUTE(A1,"-","")," ","").
I'll usually use the first formula, and then filter the column to only get #VALUE results and use the second formula. I'm just wondering if there is an easier way to get all the models with one nested function.

Take all characters to the left of a forward slash. If there's no forward slash, then take the original value. From there, substitute any dash or space with an empty string.
=SUBSTITUTE(SUBSTITUTE(IFERROR(LEFT(A1,FIND("/",A1,1)-1),A1),"-","")," ","")
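For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same three steps. The function name `clean_model` is made up for illustration, and the sketch only mirrors what the formula does (it does not add or drop any other characters).

```python
def clean_model(value: str) -> str:
    """Mirror the formula: keep what is left of the first "/", then drop "-" and " "."""
    # split("/", 1)[0] returns the whole string when there is no "/",
    # which plays the role of the IFERROR fallback in the formula.
    head = value.split("/", 1)[0]
    return head.replace("-", "").replace(" ", "")


for raw in ["DW80R201UB/AA", "DV-2A/XAA", "HAF-CIN/EXP", "DWT24PNA12"]:
    print(raw, "->", clean_model(raw))
```

Doing the split first and the substitutions second matches the order in the formula, so a dash that appears after the slash never needs to be handled.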

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1;"/";"");"-";"");" ";"");"%";"");";";"")
If your locale uses a comma as the argument separator, change the semicolons to commas.
This will remove all the characters at once.
Your first formula is not working for me.

Related

Excel - Formula to search for multiple terms and show the matching term

I have the challenge that I need to search in Excel for multiple terms and to get the result back for each cell which of the different terms has matched.
I know there is a formula combination to search for multiple terms, but it does not give me the matched term back. The example below only returns "0" or "1".
=IF(ISNUMBER(SEARCH({"TermA","TermB","TermC"},A1)),"1","0")
| | A | B |
|---|---|---|
| 1 | This is TermA | TermA |
| 2 | Some TermB Text | TermB |
| 3 | And TermA Text | TermA |
| 4 | another TermC | TermC |
Background: I have to normalize the values and am therefore looking for a formula that can identify the values and list the match. The values used for the search should later live on another page so the list can be easily extended.
Thank you for any hints and approaches that point me in the right direction.
To return all matching terms:
=INDEX(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.='TermA' or .='TermB' or .='TermC']"),COLUMN(A1))
Wrap in an IFERROR() if no match is found at all.
If you have Excel O365 and refer to a range, things get a lot easier:
Formula in E1:
=TRANSPOSE(FILTER(C$1:C$3,ISNUMBER(FIND(C$1:C$3,A1))))
=INDEX(FILTER(C:C,C:C<>""),MATCH(1, COUNTIF(A1, "*"&FILTER(C:C,C:C<>"")&"*"), 0))
This is for use in the Office 365 version. In earlier versions, replace FILTER(C:C,C:C<>"") with C$1:C$4 for your example, or whatever your range of search values may be. A table reference is also possible.
The formula checks whether the text in A1 contains any value from your list, anywhere in that text, and returns the first list value that matches.
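The two behaviours discussed in these answers (return every matching term, or only the first one in list order) can be sketched in Python as follows. The function names are hypothetical. The case handling is an assumption chosen to mirror the formulas: FIND is case-sensitive, while COUNTIF with wildcards is not.

```python
def matching_terms(text, terms):
    """All terms contained in text, in list order (case-sensitive, like FIND)."""
    return [term for term in terms if term and term in text]


def first_match(text, terms):
    """First list entry contained in text, ignoring case (like COUNTIF wildcards).

    Returns None when nothing matches, where the formula would give an error.
    """
    lowered = text.lower()
    for term in terms:
        if term and term.lower() in lowered:
            return term
    return None


terms = ["TermA", "TermB", "TermC"]
print(matching_terms("Some TermB Text", terms))
print(first_match("another TermC", terms))
```

Keeping the search terms in a separate list corresponds to keeping them in their own range, so the list can grow without touching the lookup logic.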

Subtracting part of a cell

Let's say that in one row I have data in two cells, and I want to keep only the last two "_"-separated parts of each value, as in this example:
| | A | B |
|---|:----------:|:---------------------:|
| 1 | 75875_QUWR | LALAHF_FHJ_75378_WZ44 | <- Input
| 2 | 75875_QUWR | 75378_WZ44 | <- Expected output
I tried using the =RIGHT() function, but then it also removes text from the first cell, and so on. How can I write this function? Maybe I could compare against the old cell and, if the result in the second row is empty because the function deleted everything, copy the value from the first row? No idea.
Try:
=MID("_"&A1,FIND("#",SUBSTITUTE("_"&A1,"_","#",LEN("_"&A1)-LEN(SUBSTITUTE("_"&A1,"_",""))-1))+1,100)
Regardless of how many times "_" appears in your string, this returns the last two "words" in the string.
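The idea of keeping the last two "words" can be illustrated with a short Python sketch. The name `last_two_parts` is hypothetical, and how a value with no underscore at all is treated (returned unchanged here) is a choice of this sketch, not something the formula defines.

```python
def last_two_parts(value: str, sep: str = "_") -> str:
    """Return the last two sep-separated parts of value, joined by sep."""
    parts = value.split(sep)
    # Slicing with [-2:] keeps everything when there are fewer than two parts.
    return sep.join(parts[-2:])


for cell in ["75875_QUWR", "LALAHF_FHJ_75378_WZ44"]:
    print(cell, "->", last_two_parts(cell))
```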
Use the following formula.
=TRIM(MID(A1,SEARCH("#",SUBSTITUTE(A1,"_","#",2))+1,100))

Excel: select the last number or numbers from a cell

Given the following examples,
16A6
ECCB15
I would only like to extract the last number or numbers from the string value. So the end result that I'm looking for is:
6
15
I've been trying to find a way, but can't seem to find the correct one.
Use this formula:
=MID(A1,AGGREGATE(14,7,ROW($Z$1:INDEX($ZZ:$ZZ,LEN(A1)))/(NOT(ISNUMBER(--MID(A1,ROW($Z$1:INDEX($ZZ:$ZZ,LEN(A1))),1)))),1)+1,LEN(A1))
Try this:
=--RIGHT(A2,SUMPRODUCT(--ISNUMBER(--RIGHT(SUBSTITUTE(A2,"E",";"),ROW(INDIRECT("1:"&LEN(A2)))))))
or this (avoid using INDIRECT):
=--RIGHT(A2,SUMPRODUCT(--ISNUMBER(--RIGHT(SUBSTITUTE(A2,"E",";"),ROW($A$1:INDEX($A:$A,LEN(A2)))))))
Replace A2 in the above formula to suit your case.
Here are the data for testing:
| String |
|-----------|
| 16A6 |
| ECCB15 |
| BATT5A6 |
| 16 |
| A1B2C3E0 |
| 16E |
| TEST00004 |
I have an even shorter version: --RIGHT(A2,SUMPRODUCT(--ISNUMBER(--RIGHT(SUBSTITUTE(A2,"E",";"),ROW(INDIRECT("1:"&LEN(A2)))))))
The difference is the use of SUBSTITUTE in my final formula. I used SUBSTITUTE to replace the letter E with a symbol because, for the fifth string in the above list, the RIGHT function in my formula returns the following: {"0";"E0";"3E0";"C3E0";"2C3E0";"B2C3E0";"1B2C3E0";"A1B2C3E0"}. The third string, 3E0, is read as a number in scientific notation, so ISNUMBER returns TRUE for it, and this results in an incorrect answer. Therefore I need to get rid of the letter E first.
Let me know if you have any questions. Cheers :)
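As a cross-check of the expected results, here is a small Python sketch that extracts the trailing digits with a regular expression instead of the array approach used in the formulas. The function names are made up for illustration. The second helper shows the same scientific-notation pitfall described above, since Python's `float` also accepts a string such as "3E0".

```python
import re


def trailing_number(value: str):
    """Return the digits at the end of value as an int, or None if there are none."""
    match = re.search(r"[0-9]+$", value)
    return int(match.group()) if match else None


def looks_numeric(text: str) -> bool:
    """True when text parses as a number, including scientific notation."""
    try:
        float(text)
        return True
    except ValueError:
        return False


for s in ["16A6", "ECCB15", "BATT5A6", "16", "A1B2C3E0", "16E", "TEST00004"]:
    print(s, "->", trailing_number(s))

# "3E0" parses as 3.0, which is why the formulas replace "E" before testing.
print(looks_numeric("3E0"), looks_numeric("C3E0"))
```

Returning None for a value such as "16E" is a choice of this sketch; the formulas would produce an error for that input instead.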

Replace specific characters (a list of them) in a string without removing blanks

I'm trying to work with manually inserted strings in SAS and I need to remove specific special characters (maybe by inserting a list of them) without removing blank spaces between words.
I've found a possible solution with a combination of compbl and transtrn to remove special characters and substitute them with blanks, reduced to one by compbl but this requires multiple steps.
I'm wondering if there is a function that allows me to do this in a single step. I've tried with the compress function (with the 'k' modifier to keep only letters and digits) but it removes blanks between words.
I'd like to go from a string like this one:
O'()n?e /, ^P.iece
To:
One Piece
With a single blank between the two words.
If someone can help me it would be awesome!
Use the following modifiers with the compress function:
k -- Keep the listed characters instead of removing them
a -- Alphabetic chars
s -- Space characters
d -- Digits
After that, apply the COMPBL function to reduce multiple blanks to one.
Code:
data have;
value="O'()n?e /, ^P.iece";
run;
data want;
set have;
value_want=COMPBL(compress(value,,"kasd"));
run;
So:
+--------------------+------------+
| value | value_want |
+--------------------+------------+
| O'()n?e /, ^P.iece | One Piece |
+--------------------+------------+
You could use regex and prxchange.
data have;
value="O'()n?e /, ^P.iece";
run;
data want;
set have;
value_want=prxchange("s/\s\s+/ /",-1,prxchange("s/[^a-zA-Z0-9\s]*//",-1,value));
run;
Result:
+--------------------+------------+
| value | value_want |
+--------------------+------------+
| O'()n?e /, ^P.iece | One Piece |
+--------------------+------------+

Specific concatenation of text cells by formula

I'm having difficulty producing a CONCATENATE formula that combines text cells in the way that I want. There are five fields that I want to concatenate: Title, Forename, RegnalNumber, Surname, and Alias, in that order. I'm no regex expert, so excuse the poor formatting, but this is a rough way of expressing what I'm trying to achieve:
(title)? (forename) (regnalnumber)? (surname)?, (alias).
The only field that can't be null is the forename field, although it might have the value "?", in which case it shouldn't output anything in the concatenation, i.e. it should be treated as blank. Hopefully the following test cases should demonstrate the output I'm trying to achieve: the output on the right is what it should look like:
| Title | Forename | RN | Surname | Alias | CONCATENATE |
+--------+----------+----+-----------+--------------+---------------------------------------+
| Ser | Jaime | | Lannister | Kingslayer | Ser Jaime Lannister, Kingslayer |
| | Pate | | | | Pate |
| Lord | ? | | Vance | | Lord Vance |
| King | Aerys | II | Targaryen | The Mad King | King Aerys II Targaryen, The Mad King |
| Lord | Jon | | Arryn | | Lord Jon Arryn |
| | Garth | | | Of Oldtown | Garth, Of Oldtown |
I've experimented for ages trying to make this concatenation work, but haven't been able to get it right. This is the current formula, with cell references replaced by the field name for comprehensibility:
=CONCATENATE(IF(Title<>"",Title&" ",""),IF(AND(Forename<>"",Forename<>"?"),Forename,""),IF(RN<>""," "&RN,""), IF(OR(AND(Forename<>"", Forename<>"?"), Surname<>"", RN<>""), " ",""), IF(Surname<>"",Surname,""),IF(AND(Alias<>"",OR(Alias<>"",AND(Forename<>"", Forename<>"?"),Surname<>"")),", "&Alias, Alias))
There is one case where it doesn't work: if the Surname and RN are null but the Forename and Alias are non-null. For example, if the Forename is Garth, and the Alias is Of Oldtown, the concatenation outputs: Garth , Of Oldtown. It's the same if the title is non-null. It shouldn't have a space before the comma.
Can you help me to fix this formula so it works as expected? If you can find a way to simplify it, even better! I know I'm probably overcomplicating this a great deal. I'm using LibreOffice Calc 4.3.1.2, not Excel.
The best way imho to solve situations like this is to divide the problem over multiple simple columns, rather than 1 huge complex formula. Remember you can always hide the columns that you don't want to see.
So create a column for Title that says =if(a2="","",a2&" ").
That can be extended for all the other columns, except:
for Forename, where you want to treat the "?" as blank, as follows: =if(b2="?","",b2&" ")
for Alias, where you want to include the leading ", ": =if(e2="","",", "&e2)
Lastly just concatenate each of your working columns with something like: =f2&g2&h2&i2&j2.
This breaks the problem down into very simple components, and makes it easy to debug. If you want to add extra functionality at a later stage, it is easy to swap out one of your formulae for something else.
I know this is only a bit of fun, but can I suggest a more algorithmic approach?
The algorithm is:-
If a field is empty or ?, do nothing
Else
    If concatenation so far is empty, add field to concatenation
    Else
        Add a space followed by the field to concatenation
which leads to this formula in G2 :-
=IF(OR(A2="",A2="?"),F2,IF(F2="",A2,F2&" "&A2))
(need to put single apostrophes in column F to make it work)
which is then copied across to the right, one column per field.
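The algorithm described in this answer can also be sketched in Python, which makes it easy to run against the test cases from the question. The function name `join_name` is hypothetical. Joining the alias with ", " instead of a plain space is an assumption added here to match the expected output in the question; the formula above uses a space for every field.

```python
def join_name(title, forename, regnal, surname, alias):
    """Build the display name, skipping fields that are empty or "?"."""
    result = ""
    for field in (title, forename, regnal, surname):
        if field in ("", "?"):
            continue  # empty or unknown field: do nothing
        # Only add a separating space when something is already there.
        result = field if result == "" else result + " " + field
    # Assumption: the alias is attached with ", " rather than a space.
    if alias not in ("", "?"):
        result = alias if result == "" else result + ", " + alias
    return result


print(join_name("Ser", "Jaime", "", "Lannister", "Kingslayer"))
print(join_name("", "Garth", "", "", "Of Oldtown"))
```

Because a separator is only ever added in front of a non-empty field, the stray space before the comma described in the question cannot occur.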
