Excel: Replace strings with numbers in a formula? - excel

I have color strings in one of my columns, like red, purple and so on. I want to replace those colors with corresponding numbers. Red becomes 1, purple becomes 2 and so on.
That's not so hard; I used SUBSTITUTE, like this:
SUBSTITUTE(E3;"red";"1")
Now the problem is that some cells contain two or more colors, like "red purple", so I tried using:
SUBSTITUTE(E3;"red";"1")&SUBSTITUTE(E3;"purple";"2")
That results in a value that looks like 1red: the original text of the cell is appended once for each &SUBSTITUTE I add. If I add another color, like this:
SUBSTITUTE(E3;"red";"1")&SUBSTITUTE(E3;"purple";"2")&SUBSTITUTE(E3;"green";"3")
it becomes 1redred.
How can I solve this issue? I want to replace each color string with its corresponding number.
Thanks!

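Try this (SUBSTITUTE is case-sensitive, so the search text must match the case used in the cell; use ; instead of , if your locale requires it):
=SUBSTITUTE(SUBSTITUTE(E3,"purple","2"),"red","1")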
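The nesting matters because each replacement has to operate on the output of the previous one rather than on the original cell text. A minimal Python sketch of that idea, outside Excel; the COLOR_CODES mapping and the replace_colors helper are hypothetical names used only for illustration:

```python
# Color-to-number pairs taken from the question; the names are illustrative.
COLOR_CODES = {"red": "1", "purple": "2", "green": "3"}

def replace_colors(text: str) -> str:
    """Apply every replacement to the running result, like nested SUBSTITUTE calls."""
    result = text
    for color, code in COLOR_CODES.items():
        # str.replace is case-sensitive, just like SUBSTITUTE.
        result = result.replace(color, code)
    return result

print(replace_colors("red purple"))  # 1 2
```

Concatenating separate replacements with & instead would join several partially replaced copies of the cell, which is what produced 1red in the question.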

Please consider the following more compact solutions (assuming the tested cell is A2). Both apply when the cell contains a single color.
Using MATCH: if you need sequential numbers like 1, 2, 3, ..., this formula will do the job:
=IFERROR(MATCH(A2,{"Red","Green","Blue"},0),"UNKNOWN COLOR")
You may also add a multiplier or a constant to the returned value. The number returned is the position of the color in the list of strings.
Using VLOOKUP: if you need a specific set of returned values, define them in a two-dimensional array constant:
=IFERROR(VLOOKUP(A2,{"Red",10;"Green",20;"Blue",30},2,0),"UNKNOWN COLOR")
For this example, 10, 20 or 30 will be returned.
Both formulas include error handling for colors that are not listed.
A sample file is shared: https://www.dropbox.com/s/77aj1vl6c5gek5c/ColorsLookup.xlsx
P.S. I'm not sure about the correct array delimiters, since my local settings use different ones, but the formulas in the sample file work fine.
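As a rough illustration of what the VLOOKUP variant does, here is a Python sketch of a table lookup with a fallback for unknown colors. The COLOR_VALUES table mirrors the array constant above; the lookup_color helper is a hypothetical name:

```python
# Values mirror the VLOOKUP array constant in the answer.
COLOR_VALUES = {"Red": 10, "Green": 20, "Blue": 30}

def lookup_color(name: str):
    """Return the mapped value, or a marker string for colors not in the table."""
    # MATCH and VLOOKUP compare text case-insensitively, so normalize case here.
    normalized = {key.lower(): value for key, value in COLOR_VALUES.items()}
    return normalized.get(name.lower(), "UNKNOWN COLOR")

print(lookup_color("Green"))  # 20
print(lookup_color("Pink"))   # UNKNOWN COLOR
```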

Related

Display text different to actual value when actual value is *

That's basically it.
I have a * in some cells to represent "All" data, because I need it for some formulas later, but I want to use it in the titles so it would be nice if it could display "Total" but still maintain the * below.
As you can see from the picture, the cell should show "Total" (red arrow) while keeping the value "*" (blue circle).
I tried a few tutorials but nothing seems to work. I'm not sure whether that's because it's an asterisk.
Edit: the tutorial I mostly followed is this one. In my case, instead of 1, 2, 3, 4, etc., the value is * and I want to display "Total" instead.

Custom number formats in Excel have four sections, separated by semicolons:
Positive values
Negative values
Zero values
Text values
* is a text value, so you want the fourth section. You get it by putting three semicolons before the required format, which leaves sections 1, 2 and 3 empty, so you end up with:
;;;"Total"
where A2 and A4 actually contain an asterisk.

You have cells that have contents X but you want Excel to say, "no, no. I'm going to show you Y"?
Aside from sounding like a maintenance nightmare, I am not aware of a way to do that.
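For readers unfamiliar with the section rule used in the first answer, here is a rough Python sketch of how a four-section format such as ;;;"Total" selects what to display. The pick_section helper is hypothetical, and it only handles empty sections and quoted literal text, not real format codes such as 0.00 or @:

```python
def pick_section(value, number_format: str) -> str:
    """Return the display text chosen by a positive;negative;zero;text format."""
    # Naive split: assumes exactly four sections and no semicolons inside quotes.
    positive, negative, zero, text = number_format.split(";")
    if isinstance(value, str):
        section = text
    elif value > 0:
        section = positive
    elif value < 0:
        section = negative
    else:
        section = zero
    # An empty section displays nothing; a quoted literal displays its text.
    return section.strip('"')

print(pick_section("*", ';;;"Total"'))      # Total
print(repr(pick_section(5, ';;;"Total"')))  # ''
```

The stored value is never changed; only the displayed text differs, which is why formulas that reference the cell still see the asterisk.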

FIND formula, Correlation Table

I have a correlation table like so:
I want a static formula that produces the following outcome for a large correlation table. Each label has two parts, the # and the text. I imagine both parts need to be parsed out before the formula is applied.
If the #s are equal, produce 1; if not, produce .35
If the texts are equal, produce 1; if not, produce .99
If both are equal, produce 1.
I've tried something like IF(A2=B1,1,.99) so far, but this misses a lot of what I'm looking for.
Any help would be appreciated.

Use this formula:
=IF(LEFT(B$1,FIND("_",B$1)-1)=LEFT($A2,FIND("_",$A2)-1),1,0.35)*IF(MID(B$1,FIND("_",B$1)+1,LEN(B$1))=MID($A2,FIND("_",$A2)+1,LEN($A2)),1,0.99)
Then copy over and down.
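The formula splits each label at the underscore and multiplies a factor for the # part by a factor for the text part. A Python sketch of the same logic, assuming labels of the form number_text as implied by the FIND("_", ...) calls; the sample labels and the correlation helper are hypothetical:

```python
def correlation(row_label: str, col_label: str) -> float:
    """Multiply the number-part factor by the text-part factor."""
    # Split each label at the first underscore: "12_ABC" -> ("12", "ABC").
    row_num, _, row_text = row_label.partition("_")
    col_num, _, col_text = col_label.partition("_")
    # Excel's = compares text case-insensitively, so lower-case both sides.
    number_factor = 1.0 if row_num.lower() == col_num.lower() else 0.35
    text_factor = 1.0 if row_text.lower() == col_text.lower() else 0.99
    return number_factor * text_factor

print(correlation("1_A", "1_A"))  # 1.0
print(correlation("1_A", "2_A"))  # 0.35
print(correlation("1_A", "1_B"))  # 0.99
```

Note that when both parts differ, the product is 0.35 * 0.99 = 0.3465, a case the question does not spell out.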

Excel: Multiple Substitute without nesting and without VBA

I have cells that have data that I want to remove. The cells typically look like this:
alm1 105430_65042M
1993_5689IB
ALM99 3455 344C
0001_4555Alm5
Some but not all of the cells contain text like "almj" where "j" is a positive integer. I want to remove that part. I want the output to look like this:
105430_65042M
1993_5689IB
3455 344C
0001_4555
So something like this works:
SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"alm1",""),"ALM99",""),"Alm5","")
But for my full data set I would need a function nested several hundred levels deep, because "alm" is sometimes in capital letters and sometimes not, and the integer can vary from 1 to 100. This seems like a painful way to do it.
Is there a way to tell it to look for text listed in a column (so I can have a column in a different sheet that just goes alm, alm1, ..., alm100), ignore capitalization, and then just replace whatever it finds?
I tried referencing a column $M$1:$M$100 in the second argument of SUBSTITUTE, but it's not working:
SUBSTITUTE(A2,$M$1:$M$100,"")
where column M contains "alm", "alm1", ..., "alm100", etc.

This should work for you; it returned the expected results on your sample data:
=IF(COUNTIF(A2,"*alm*")=0,A2,TRIM(SUBSTITUTE(A2&" ",MID(A2&" ",SEARCH("alm",A2),FIND(" ",A2&" ",SEARCH("alm",A2))-SEARCH("alm",A2)+1),"")))

Excel has lousy string-handling capabilities. Regular expressions would make this task much easier.
In JavaScript, it's as simple as:
string.replace(/ *alm[0-9]+ */gi,'')
This does a global (g), case-insensitive (i) search for:
0 or more spaces
followed by "alm" (in any case)
followed by 1 or more digits
followed by 0 or more spaces
You can paste your data into the HTML box in this fiddle, and it will fix it:
http://jsfiddle.net/xs71bvan/2/
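For anyone who prefers to run the same regular expression outside the browser, here is an equivalent sketch in Python using the standard re module. The strip_alm helper name is hypothetical; the sample strings are the ones from the question:

```python
import re

# Same pattern as the JavaScript answer: optional spaces, "alm" in any case,
# one or more digits, optional spaces.
ALM_PATTERN = re.compile(r" *alm[0-9]+ *", re.IGNORECASE)

def strip_alm(text: str) -> str:
    """Remove every 'alm<digits>' token together with the spaces around it."""
    return ALM_PATTERN.sub("", text)

samples = ["alm1 105430_65042M", "1993_5689IB", "ALM99 3455 344C", "0001_4555Alm5"]
for sample in samples:
    print(strip_alm(sample))
```

Because the pattern also consumes the spaces on both sides, a token sitting between two other words would cause its neighbors to be joined; the sample data only has the token at the start or the end.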

Match 2 columns based on a % difference within the value

I am looking for a method to match two Excel tables.
I basically have two systems where the values do not match exactly; only some IDs do. The values in system 2 usually differ from those in system 1 by 10-20%.
Here is what the sheet looks like:
I tried using VLOOKUP on the IDs and then going through the values by hand to check whether they match, filtering by ID. However, this takes extremely long and is very cumbersome.
Any recommendations on how to match these two tables more easily?
I really appreciate your replies!

A formula for G3 would involve D3:E3 and A:B (where A10:B10 hold the matching values).
When someone states that they are looking for a percentage, it is helpful to know "a percentage of what?". You get a different result if the calculation is ABS(12 - 15)/15 instead of ABS(12 - 15)/12. One may be within tolerance and the other may not.
In any event, the formula for G3 would be something like,
=ABS(E3-VLOOKUP(D3,A:B, 2, FALSE))/E3
... or,
=ABS(E3-VLOOKUP(D3,A:B, 2, FALSE))/VLOOKUP(D3,A:B, 2, FALSE)
That produces a result of 0.25 or 0.20 (25% or 20%) depending on how you calculate the percentage. You could wrap that in an IF statement for a YES/NO text result, or use a custom number format like [Color3][>0.2]\NO;;[Color10]\Y\E\S;# which will show a red NO for values greater than 20% and a green YES for values between 0 and 20%. Negative values do not have to be accounted for, as ABS removes them from consideration.
       
I've only reproduced a minimum of your sample data for demonstration purposes but perhaps you can get an idea on how to proceed from that.
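To make the "percentage of what?" point concrete, here is a small Python sketch of the two calculations using the 12 and 15 from the example above. The function names and the 20% default tolerance are assumptions for illustration:

```python
def relative_difference(value: float, reference: float) -> float:
    """Absolute difference expressed as a fraction of the chosen reference."""
    return abs(value - reference) / reference

def within_tolerance(value: float, reference: float, tolerance: float = 0.2) -> bool:
    """True when the difference is at most the tolerance (20% assumed here)."""
    return relative_difference(value, reference) <= tolerance

# The same pair of numbers gives different results depending on the denominator.
print(relative_difference(12, 15))  # 0.2
print(relative_difference(15, 12))  # 0.25
```

Whichever system is treated as the reference should be chosen once and used consistently, since a pair can pass with one denominator and fail with the other.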

Is there any way to label dummy variable in Excel?

The following column contains a dummy variable for gender:
gender
1
0
2
0
where 1 is male, 2 is female, 0 is unknown.
I want to assign labels to the values, like in Stata, so that when I construct a chart, the legend shows Male instead of 1. Also, it would be nice if the dataset displayed the string Male but used the value 1 for calculations.
How can I do that?

What you want is not (as far as I know) supported in Excel. What you need is an ID value plus a display value, as commonly found in database-oriented controls such as a list box in Access. This does not exist in a single Excel cell; you would have to add another column or replace the values.
BUT...
In this specific case you could reach the goal with the following 'hack':
A cell's number format supports different format strings for positive values, negative values and 0. For your three-valued logic you could make use of that, but you would have to change your IDs to 1 (male), -1 (female) and 0 (unknown). Then use "Male";"Female";"Unknown" as a custom format string.
... if this does not work, use a separate column with a simple formula instead.
