Counting number of occurrences of a specific search string - excel

I'm building a monitoring system that takes a log (where people register their work in a set format) and returns a counter, which I can use for analysis. The monitor and log are two separate workbooks. The log has entries like this: INITIALS;DATE;HOUR:RESULT|
Each cell can contain multiple entries.
My first attempt was a simple COUNTIF that looks for a string (note that I use ; instead of , in formulas since I work in a Dutch version of Excel):
=COUNTIF('LOCATION'!Table[LOG];"*NB;??/??/????;??:??:#A*|*")
This worked fine, but the formula only counted the number of cells where this string was present, not the actual number of occurrences. I then tried this solution:
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"NB";"")))
This indeed counted the number of times "NB" was present in the LOG. However, when I tried to use the original search string, this solution stopped working:
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"*NB;??/??/????;??:??:#A*|*";"")))
It seems to me that SUM does not recognize symbols like ? or * which are necessary to define the correct search string. Where did I go wrong? Or can this be solved in another way? I can still look into VBA, but the workbooks are slow as hell already.

"?" and "*" are wildcards. Some functions support these (like COUNTIFS()) whereas others don't. As you found out, SUBSTITUTE() does not.
Here is one way to count, assuming ms365:
Formula in C1:
=REDUCE(0,A1:A2,LAMBDA(a,b,a+LET(X,SEQUENCE(LEN(b)),SUM(--(IFERROR(SEARCH("NB;??/??/????;??:??:#A*|*",b,X),0)=X)))))
Note: I removed the asterisk in front of "NB" so that the position returned by SEARCH() can be compared directly against the start position held in the variable I called "X".
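To check the formula's result against sample data outside Excel, the same idea can be sketched in Python: for every start position in a cell, test whether the wildcard pattern matches beginning exactly there, then sum the hits. The helper names and sample log entries are hypothetical, and the sketch ignores Excel's "~" escape character.

```python
import re


def wildcard_to_regex(pattern):
    """Translate Excel-style wildcards (? and *) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # ? matches exactly one character
        elif ch == "*":
            parts.append(".*")      # * matches any run of characters
        else:
            parts.append(re.escape(ch))
    # SEARCH() is case-insensitive, so the regex is as well.
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def count_occurrences(cells, pattern):
    """Count the start positions at which the pattern matches, across all cells."""
    rx = wildcard_to_regex(pattern)
    return sum(
        1
        for cell in cells
        for start in range(len(cell))
        if rx.match(cell, start)    # match must begin exactly at `start`
    )


# Hypothetical log cells in the INITIALS;DATE;HOUR:RESULT| format.
sample_cells = [
    "NB;01/02/2023;10:30:#A1|NB;03/02/2023;11:00:#A2|",
    "NB;04/02/2023;09:15:#B7|",
    "JD;01/02/2023;10:30:#A1|",
]
print(count_occurrences(sample_cells, "NB;??/??/????;??:??:#A*|*"))
```

Separately, the LEN/SUBSTITUTE difference measures the number of characters removed, so for a two-character token such as "NB" it would need to be divided by LEN("NB") to give a count of occurrences.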

Related

Remove everything after hyphen in hyphenated names in Excel using Formula

I am in charge of adding new employees to our speech recognition and gamification systems.
When I get a batch of tickets, I compile a bunch of data into a spreadsheet that I then reference when adding those users to the systems (which unfortunately do not have a JSON/CSV upload option or anything similar).
To save some time with compiling, I've started exporting a bunch of data from our database and our HR management system into that sheet, and then using the new employee's email to XLOOKUP all the other data fields.
One of our systems has a strict character limit, and the format for the username is "cde\firstname.lastname". This is normally no problem to CONCATENATE, but because of the limit, if the user has a hyphenated last name, I basically dump everything after the hyphen.
At first I tried a simple formula using a combination of LEFT and FIND -1 to find the hyphen, and then take everything to the left of it. This obviously doesn't end up working because I get a #VALUE! for anyone without a hyphen in their last name.
I tried using IFERROR to say "OK try to return the last name without a hyphen, otherwise just return the last name", but for some reason when I put the reference in the Return_If_Error portion, it doesn't recognize it as a reference.
So I am looking for a formula that will work with a LOOKUP'd value and only give me what's before a hyphen, but otherwise will still just give me the last name.
The baseline formula I have, that just looks up and concatenates the first and last into the "cde\firstname.lastname" is:
=CONCATENATE("cde\",LOWER(XLOOKUP(G578,Sheet4!M:M,Sheet4!B:B)),".",LOWER(XLOOKUP(G578,Sheet4!M:M,Sheet4!C:C)))
To expand on the comments, you've got the right idea, just use an IF statement for testing if the string contains "-", then use the normal string functions like FIND, LEFT, etc. to pick out the things you want.
For example:
="cde/"&
LEFT(H1,FIND(".",H1)-1)&
IF(ISNUMBER(FIND("-",H1)),MID(H1,FIND(".",H1),FIND("-",H1)-FIND(".",H1)),
MID(H1,FIND(".",H1),FIND("#",H1)-FIND(".",H1)))
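As a way to sanity-check expected usernames outside Excel, the rule described in the question (keep only the part of the last name before the first hyphen, otherwise keep the whole last name) can be sketched in Python. The function name and sample names are hypothetical; the "cde\" prefix is taken from the question.

```python
def build_username(first_name, last_name, prefix="cde\\"):
    """Build prefix + firstname.lastname, dropping anything after a hyphen in the last name."""
    # split("-", 1)[0] returns the whole string when there is no hyphen,
    # so no error handling is needed for non-hyphenated names.
    base_last = last_name.split("-", 1)[0]
    return f"{prefix}{first_name.lower()}.{base_last.lower()}"


print(build_username("Anna", "Smith-Jones"))
print(build_username("Bob", "Lee"))
```

Splitting on the hyphen sidesteps the #VALUE! problem from the question, because a missing hyphen leaves the name unchanged instead of raising an error.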

How can I check that a string only contains a defined set of substrings by Excel formula?

I have a dictionary containing lots of words - I want the user to be able to input a list of substrings, and then a filtered list will be updated, containing only words that contain those substrings and nothing else. Any words that contain extra characters the user didn't specify, should not appear. Cell F3 will use a FILTER function to create the list. As in the mock-up below:
What I need is a formula that would generate the TRUE or FALSE flags from the yellow section (B3:B9), but I'm not sure how to go about this.
I'm sure this could be solved by VBA or by Regex in Google Sheets, but I want to know if there's a way to do this by formula, as I don't want this to require a button press or script execution, and my spreadsheet can't be hosted on Google Sheets due to its size. Any ideas?
You can also use a combination of ISNUMBER and SUMPRODUCT:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(INDEX(A:A,1,1):INDEX(A:A,LEN(A3),1)),1),$D$3:$D$5,0)))
Adjusted formula:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(A$1:INDEX(A:A,LEN(A3))),1),$D$3:$D$5,0)))
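Because the formula above walks the word with MID(...,1), it tests one character at a time, so it fits the case where every allowed entry is a single character. A minimal Python sketch of that check, with a hypothetical function name and sample data, might look like this:

```python
def only_allowed_chars(word, allowed):
    """Return True when every character of `word` appears in `allowed`."""
    # MATCH(..., 0) ignores case for text, so compare in lowercase.
    allowed_set = {entry.lower() for entry in allowed}
    return all(ch.lower() in allowed_set for ch in word)


print(only_allowed_chars("abba", ["a", "b", "c"]))
print(only_allowed_chars("abd", ["a", "b", "c"]))
```

In the formula, a single unmatched character makes MATCH return an error, which propagates through SUMPRODUCT and turns ISNUMBER into FALSE; `all(...)` plays the same role here.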
The result:
The test below subtracts the number of characters accounted for by each dictionary entry from the length of the original string. If the result is 0, this returns TRUE. If not, this returns FALSE. This is not case sensitive: a and A will be treated equally here.
=NOT(LEN(A1)-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D1,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D2,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D3,""))))
The equation works fine although I don't know if it is an optimal solution for you, but posting as answer in case it is for somebody else. The issue with this approach is the equation gets longer and longer for each character you add to your dictionary. Depending on the size of dictionary and strings to test against, this can get sloppy and calc heavy really quick.
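The length-subtraction test can be sketched in Python to see how it behaves on sample words. The function name and inputs are hypothetical. Like the formula, the sketch adds up the characters removed for each entry separately, so overlapping entries (for example "A" and "AN") are counted twice and make the test return FALSE.

```python
def only_contains(word, substrings):
    """Mirror NOT(LEN(word) - sum of characters removed per dictionary entry)."""
    upper = word.upper()
    removed = sum(
        len(upper) - len(upper.replace(entry.upper(), ""))
        for entry in substrings
    )
    # NOT(x) is TRUE only when x is exactly 0.
    return len(upper) - removed == 0


print(only_contains("banana", ["B", "A", "N"]))
print(only_contains("banana", ["B", "A"]))
```

Looping over the dictionary keeps the check the same length no matter how many entries there are, which is the part that grows unwieldy in the nested formula.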
Have you considered a UDF in VBA?

How to search for items with multiple "-" in excel or VBA?

I have a list of item numbers (100K) like this:
Some of the items (thousands of them) have a format like SAG571A-244-4 and need to be filtered out so I can delete them and only keep the items that have ONE hyphen per SKU. How can I isolate the items that have two instances of "-" in their SKU? I'm open to solutions within Excel or using VBA as well.
Native text filters don't seem to be capable of this. I'm stumped.
As per John Coleman's comment, "*-*-*" can be used to isolate strings that have at least two dashes in them.
I would add that if you're entering them as a custom text filter, you should lose the double quotes (so just *-*-*) as otherwise the field seems to interpret the quotes literally.
Seems to work for me.
If you want just an excel formula to verify this and give you a result of the number of hyphens (0, 1, or 2+), here is one:
=IF(ISERROR(SEARCH("-",A1)),"0",IF(ISERROR(SEARCH("-",A1,IFERROR(SEARCH("-",A1)+1,LEN(A1)))),"1","2+"))
Replace A1 with the first cell of your relevant column, then fill down. This is kind of a terrible way to do this performance-wise, but you avoid using VBA and possibly xlsm files.
The code first checks to see if there is one hyphen, then if there is it checks to see if there is another hyphen after the position the first one was found. Looking for multiple hyphens in this manner is cumbersome and I don't recommend it.
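For comparison, counting hyphens directly gives the same 0 / 1 / 2+ buckets without nested searches. The sketch below is in Python with hypothetical function names and sample SKUs; it also shows the "*-*-*" wildcard test from the comment using the standard-library fnmatch module.

```python
from fnmatch import fnmatchcase


def hyphen_bucket(sku):
    """Classify a SKU as "0", "1", or "2+" by its number of hyphens."""
    count = sku.count("-")
    return "2+" if count >= 2 else str(count)


def keep_single_hyphen(skus):
    """Keep only the SKUs that contain exactly one hyphen."""
    return [sku for sku in skus if sku.count("-") == 1]


sample_skus = ["SAG571A-244-4", "SAG571A-244", "SAG571A"]
print([hyphen_bucket(sku) for sku in sample_skus])
print(keep_single_hyphen(sample_skus))
# Same test as the *-*-* custom text filter: at least two hyphens.
print([fnmatchcase(sku, "*-*-*") for sku in sample_skus])
```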

Need to understand formula re: excel search for a string in a table and return string if true

I've adapted this solution from a couple of years ago:
=LOOKUP(2^15,FIND(Keywords,A2),Categories)
I use this for searching within a description field for keywords in a named list, in order to return a corresponding category from an adjacent named list.
However I do not understand the significance of 2^15. Can someone explain?
Also it's unclear in what order the search operates. If two keyword options were "check" and "deposit," and they were assigned to different categories, but both appeared in the same description field cell, how do I know which will be found first? Is it placement in the string, or order in the list?
2^15 is simply an arbitrarily large number that LOOKUP attempts to find; when it can't find it, it takes the next lowest number.
Effectively, your formula takes each entry in Keywords and attempts to find it in A2. Each keyword that actually appears in A2 returns a number (its position) rather than an error, so the result is an array of mostly errors with a number for each match. LOOKUP then uses 2^15 to pick a numeric entry from that array and returns the value at the same position in Categories. This seems like a weird way of doing it, though; it is likely a holdover from pre-2007, as LOOKUP is now generally used only for backwards compatibility. Using 1 instead of 2^15 also worked for a couple of simple cases that I tried when writing this up.
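On the ordering question, my understanding of this idiom (worth confirming on your own data) is that when the lookup value is larger than every number in the array, LOOKUP settles on the last numeric entry and skips the errors. 2^15 is 32768, one more than the 32,767 characters a cell can hold, so no FIND position can reach it. Under that assumption, the winner is the matching keyword that comes last in the Keywords list, regardless of where it sits in the description. A Python sketch of that behaviour, with hypothetical names and data:

```python
def lookup_category(description, keywords, categories):
    """Return the category of the last keyword (in list order) found in the description."""
    result = None  # stands in for #N/A when nothing matches
    for keyword, category in zip(keywords, categories):
        # FIND is case-sensitive, and so is Python's `in` on strings.
        if keyword in description:
            result = category  # later list entries overwrite earlier ones
    return result


text = "check deposit at branch"
print(lookup_category(text, ["check", "deposit"], ["Checks", "Deposits"]))
print(lookup_category(text, ["deposit", "check"], ["Deposits", "Checks"]))
```

Reordering the keyword list changes the result while the description stays the same, which is a quick way to test the assumption in a real workbook.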

Sharepoint: Calculated Column replace all spaces

Seems like it would be a simple thing really (and it may be), but I'm trying to take the string data of a column and then through a calculated column, replace all the spaces with %20's so that the HTML link in the workflow produced email will actually not break off at the first space.
For example, we have this in our source column:
file:///Z:/data/This is our report.rpt
And would like to end up with this in the calculated column:
file:///Z:/data/This%20is%20our%20report.rpt
Already used the REPLACE, and made up a ghastly super nested REPLACE/SEARCH version, but the problem there is that you have to nest for EACH potential space, and if you don't know how many up front, it doesn't work, or will miss some.
Have any of you come across this scenario and how did you handle it?
Thanks in advance!
As far as I know there is no generic solution using the calculated-column syntax. The standard solution for this situation is using an ItemAdded (/ItemUpdated) event and initializing the field value from code.
I was able to solve this issue for my circumstances by using a series of calculated columns.
In the first calculated column (C1) I entered a formula to remove the first space, something like this:
=IF(ISNUMBER(FIND(" ",[Title])),REPLACE([Title],FIND(" ",[Title]),1,"%20"),[Title])
In the second Calculated column (C2) I used:
=IF(ISNUMBER(FIND(" ",[C1])),REPLACE([C1],FIND(" ",[C1]),1,"%20"),[C1])
In my case, I wanted to encode up to four spaces, so I used 3 calculated columns (C1, C2, C3) in the same fashion and got the desired result.
This is not as efficient as using a single calculated column, but if SUBSTITUTE will not work in your SharePoint environment, and you cannot use an event handler or workflow, it may offer a workable alternative.
I actually used a slightly different formula, but it was on a work machine to which I don't have access at the moment, so I just grabbed this formula from a similar S.O. question. Any formula that will replace the first occurrence of a space with "%20" will work, the trick is to a) make sure the formula returns the original string unchanged if it does not have more spaces in it, and b) test, test, test. Create a view of your list that has the field you are trying to encode, plus the calculated fields, and see if you are getting the results you want.
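The chaining idea can be sketched in Python to see how many columns a given string needs. Each step mirrors one calculated column: it replaces the first remaining space with "%20" and returns the text unchanged when no space is left. The function names are hypothetical; the sample path is the one from the question.

```python
def replace_first_space(text):
    """One calculated column: encode the first space, or return the text unchanged."""
    position = text.find(" ")
    if position == -1:
        return text
    return text[:position] + "%20" + text[position + 1:]


def chain_columns(text, column_count):
    """Apply the single-space replacement once per chained calculated column."""
    for _ in range(column_count):
        text = replace_first_space(text)
    return text


path = "file:///Z:/data/This is our report.rpt"
print(chain_columns(path, 3))
print(chain_columns(path, 2))
```

Each column handles exactly one space, so a string with more spaces than there are chained columns keeps its remaining spaces, which is why testing against real data matters.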
Regarding "so that the HTML link in the workflow produced email will actually not break off at the first space":
The browser only does this if you have not enclosed your link in quotes.
If you wrap the link in quotes, it does not cut off at the first space.
In a SharePoint Formula it would be:
="""file:///Z:/data/This is our report.rpt"""
because two quotes are the SharePoint escape notation to output a quote.
You can use this formula (start the first MID at position 1; in my case it was 4):
=IF(ISBLANK([EUR Amount]),"",(TRIM(MID([EUR Amount],4,2))&TRIM(MID([EUR Amount],6,2))&TRIM(MID([EUR Amount],8,2))&TRIM(MID([EUR Amount],10,2))&TRIM(MID([EUR Amount],12,2))&TRIM(MID([EUR Amount],14,2)))*1)

Resources