Excel: Multiple Substitute without nesting and without VBA - excel

I have cells that have data that I want to remove. The cells typically look like this:
alm1 105430_65042M
1993_5689IB
ALM99 3455 344C
0001_4555Alm5
Some but not all of the cells contain text like "almj" where "j" is a positive integer. I want to remove that part. I want the output to look like this:
105430_65042M
1993_5689IB
3455 344C
0001_4555
So something like this works
SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"alm1",""),"ALM99",""),"Alm5","")
But for my full data set I would need this to be several-hundred-deep nested function because my "alm" is sometimes in capital letters and sometimes not and the integer can vary from 1 to 100. This seems like a painful way to do this.
Is there a way that I can tell it to look for text listed in a column (so I can have a column in a different sheet that just goes from alm, alm1, ..., alm100 ) and then also ask it to ignore capitalization and then just replace stuff?
I tried referencing a column $M$1:$M$100 in the second argument of the function substitute but it's not working.
SUBSTITUTE(A2,$M$1:$M$100,"")
Where the column M contains "alm" alm1", ..., "alm100" etc.

This should work for you, it returned expected results on your provided sample data:
=IF(COUNTIF(A2,"*alm*")=0,A2,TRIM(SUBSTITUTE(A2&" ",MID(A2&" ",SEARCH("alm",A2),FIND(" ",A2&" ",SEARCH("alm",A2))),"")))

Excel has lousy string-handling capabilities. Regular expressions would make this task much easier.
In JavaScript, it's as simple as:
string.replace(/ *alm[0-9]+ */gi,'')
This does a global (g) case-insensitive (i) search of everything with:
0 or more spaces
followed by "alm" (in any case)
followed by 1 or more numbers
followed by 0 or more spaces
You can paste your data in the HTML box in this fiddle, and it will fix it:
http://jsfiddle.net/xs71bvan/2/

Related

Counting number of occurences of a specific search string

I'm building a monitoring system that takes a log (where people register their work in a set format) and returns a counter, which I can use for analysis. The monitor and log are two separate workbooks. The log has entries like this: INITALS;DATE;HOUR:RESULT|
Each cell can contain multiple entries.
My first attempt was to do a simple countif and look for a string (note that I use ; instead of , in formulas since I work on a Dutch excel):
=COUNTIF('LOCATION'!Table[LOG];"*NB;??/??/????;??:??:#A*|*")
This worked fine, but the formula only counted the number of cells where this string was present, not the actual number of occurences. I then tried this solution.
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"NB";"")))
This indeed counted the number of times "NB" was present in the LOG. However, when I tried to use the original search string, this solution stopped working:
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"*NB;??/??/????;??:??:#A*|*";"")))
It seems to me that SUM does not recognize symbols like ? or * which are necessary to define the correct search string. Where did I go wrong? Or can this be solved in another way? I can still look into VBA, but the workbooks are slow as hell already.
"?" and "*" are wildcards. Some functions support these (like COUNTIFS()) where others don't. Like you found out, SUBSTITUTE() does not.
Here is one way to count, assuming ms365:
Formula in C1:
=REDUCE(0,A1:A2,LAMBDA(a,b,a+LET(X,SEQUENCE(LEN(b)),SUM(--(IFERROR(SEARCH("NB;??/??/????;??:??:#A*|*",b,X),0)=X)))))
Note: I removed the asterisk in front of "NB" just to make searching for a position valid in comparison to what i called variable "X".

How to replace numbers with corresponding text values in Excel without making it treat a number individually?

Okay, so here's the thing. I have a list of compounds which are listed as compound_1,compound_2,compound_3 and so on. I would like to change it to the corresponding compound names using Excel. There are over 140 compounds and in the datatable, the compound repeats itself for a number of times.
The problem arises when the Excel treats two digit numbers as two individual numbers instead of treating them as a one whole unit and substitutes value accordingly. I'll give you an example. Say, compound_2 is Nicotine and compound_20 is quercetin, it treats 20 as 2 and 0, and the cell is replaced with Nicotine0 instead of quercetin.
Is there a way to replace it without this hassle?
P.S: The table looks something like this:
compound_1 Nicotine
compound_2 β- sitosterol
compound_3 D - mannitol
compound_4 Stigma Sterol
compound_5 Betulinic Acid
And, I've attached a collage comparing the column values before(left) and after(right) I tried replacing compound_2 with Nicotine.(This is only for example and not the real pair)image

How can I check that a string only contains a defined set of substrings by Excel formula?

I have a dictionary containing lots of words - I want the user to be able to input a list of substrings, and then a filtered list will be updated, containing only words that contain those substrings and nothing else. Any words that contain extra characters the user didn't specify, should not appear. Cell F3 will use a FILTER function to create the list. As in the mock-up below:
What I need is a formula that would generate the TRUE or FALSE flags from the yellow section (B3:B9), but I'm not sure how to go about this.
I'm sure this could be solved by VBA or Regex using Google Sheets, but I want to know if there's a way to do this by formula, as I don't want this to require a button press or script execution, and my spreadsheet can't be hosted on Google sheets due to its size. Any ideas?
You can also use a combination of ISNUMBER and SUMPRODUCT:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(INDEX(A:A,1,1):INDEX(A:A,LEN(A3),1)),1),$D$3:$D$5,0)))
Adjusted formula:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(A$1:INDEX(A:A,LEN(A3))),1),$D$3:$D$5,0)))
The result:
The test being ran below is subtracting each instance of your dictionary from the length of original string. If the result is 0, this returns TRUE. If not, this returns FALSE. This is not case sensitive - a & A will be treated equally here.
=NOT(LEN(A1)-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D1,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D2,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D3,""))))
The equation works fine although I don't know if it is an optimal solution for you, but posting as answer in case it is for somebody else. The issue with this approach is the equation gets longer and longer for each character you add to your dictionary. Depending on the size of dictionary and strings to test against, this can get sloppy and calc heavy really quick.
Have you considered a UDF in VBA?

How to search for items with multiple "-" in excel or VBA?

I have a list of item numbers (100K) like this:
Some of the items have format like SAG571A-244-4 (thousands) which need to be filtered so I can delete them and only keep the items that have ONE hyphen per SKU. How can I isolate the items that have two instances of "-" in it's SKU? I'm open to solutions within Excel or using VBA as well.
Native text filters don't seem to be capable of this. I'm stumped.
As per John Coleman's comment, "*-*-*" can be used to isolate strings that have at least two dashes in them.
I would add that if you're entering them as a custom text filter, you should lose the double quotes (so just *-*-*) as otherwise the field seems to interpret the quotes literally.
Seems to work for me.
If you want just an excel formula to verify this and give you a result of the number of hyphens (0, 1, or 2+), here is one:
=IF(ISERROR(SEARCH("-",A1)),"0",IF(ISERROR(SEARCH("-",A1,IFERROR(SEARCH("-",A1)+1,LEN(A1)))),"1","2+"))
Replace A1 with your relevant column, then fill down. This is kind of a terrible way to do this performance wise, but you avoid using VBA and possibly xlsm files.
The code first checks to see if there is one hyphen, then if there is it checks to see if there is another hyphen after the position the first one was found. Looking for multiple hyphens in this manner is cumbersome and I don't recommend it.

Finding the right range from excel table

What is the best way to find the right column for the travelled miles using visual basic coding or some excel function and return the price from that column? HLOOKUP can't be used here because the lookup value isn't exact and the ranges in the table are also not with specific intervals (If they were, I could use e.g. FLOOR(travelled miles/100)*100 and find the price with HLOOKUP). Obviously, it's easy to find the price manually with a small table but with a big table computer will be faster.
Note that, if x is between a and b, then MEDIAN(x,a,b)=x. Combine this with some nested IFs:
=IF(MEDIAN(B5,B1,C1-1)=B5,B2,IF(MEDIAN(B5,C1,D1-1)=B5,C2,IF(MEDIAN(B5,D1,E1-1)=B5,D2)))
I'm on my phone, so just done the first three cases, but hopefully you can see how it continues.
(should note you need to remove the dashes for this to work)
Edit:
I also want to answer your question in the comments above. You can use the following to keep the dash, but get a number to work with.
Assume cell A1 has got the value 10-. We can use the FIND function to work out where the - occurs and then use the LEFT function to only return the characters from before the dash:
=LEFT(A1,FIND("-",A1)-1)
This will return the value 10, but it will return it as a string, not a number - basically Excel will think it is text. To force Excel to consider it as a number, we can simply multiply the value by one. Our formula above therefore becomes:
=(LEFT(A1,FIND("-",A1)-1))*1
You may also see people use a double minus sign, like this:
=--LEFT(A1,FIND("-",A1)-1)
I don't recommend this because it's a bit complex, but combining with the formula above would give:
=IF(MEDIAN(B5,--LEFT(B1,FIND("-",B1)-1),--LEFT(C1,FIND("-",C1)-1)-1)=B5,B2,IF(MEDIAN(B5,--LEFT(C1,FIND("-",C1)-1,--LEFT(D1,FIND("-",D1)-1-1)=B5,C2,IF(MEDIAN(B5,--LEFT(D1,FIND("-",D1)-1,--LEFT(E1,FIND("-",E1)-1-1)=B5,D2)))

Resources