How to extract unique values from an array EXCEPT specified values? - excel

I am already abel to extract unique values, in excel, from an array using this function:
{=INDEX(list,MATCH(0,COUNTIF(uniques,list),0))}
However, I want to specify certain values for excel not to return. Is there any way to specify values that I don't want to be found within the already specified "list"? The ideal outome would be something like this:
I am also using excel version 2101.
Any information is helpful, thanks!

From your example, I ASSUME you want to exclude the lines starting with Round.
Try:
=LET(x,UNIQUE(List),FILTER(x,LEFT(x,5)<>"Round"))
or
=UNIQUE(FILTER(List,(LEFT(List,5)<>"Round")))
I'm not sure if it is more efficient to filter a smaller list, as is done in the first formula; or to avoid using LET as is done in the second formula.
EDIT
This can also be done using FILTERXML and TEXTJOIN which should be present in all Windows versions 2016+
=FILTERXML("<t><s>"&TEXTJOIN("</s><s>",,list)&"</s></t>","//s[not(starts-with(.,'Round')) and not(preceding-sibling::*=.)]")
the xPath
not(starts-with(.,'Round')) : should be obvious
Return only unique values:
and not(preceding-sibling::*=.) : do not return a node if any preceding-sibling matches the current node being tested

Related

Counting number of occurences of a specific search string

I'm building a monitoring system that takes a log (where people register their work in a set format) and returns a counter, which I can use for analysis. The monitor and log are two separate workbooks. The log has entries like this: INITALS;DATE;HOUR:RESULT|
Each cell can contain multiple entries.
My first attempt was to do a simple countif and look for a string (note that I use ; instead of , in formulas since I work on a Dutch excel):
=COUNTIF('LOCATION'!Table[LOG];"*NB;??/??/????;??:??:#A*|*")
This worked fine, but the formula only counted the number of cells where this string was present, not the actual number of occurences. I then tried this solution.
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"NB";"")))
This indeed counted the number of times "NB" was present in the LOG. However, when I tried to use the original search string, this solution stopped working:
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"*NB;??/??/????;??:??:#A*|*";"")))
It seems to me that SUM does not recognize symbols like ? or * which are necessary to define the correct search string. Where did I go wrong? Or can this be solved in another way? I can still look into VBA, but the workbooks are slow as hell already.
"?" and "*" are wildcards. Some functions support these (like COUNTIFS()) where others don't. Like you found out, SUBSTITUTE() does not.
Here is one way to count, assuming ms365:
Formula in C1:
=REDUCE(0,A1:A2,LAMBDA(a,b,a+LET(X,SEQUENCE(LEN(b)),SUM(--(IFERROR(SEARCH("NB;??/??/????;??:??:#A*|*",b,X),0)=X)))))
Note: I removed the asterisk in front of "NB" just to make searching for a position valid in comparison to what i called variable "X".

Matching Pairs Problem in excel using basic functions

May I know how to check whether the Current Pairs match with the Ideal Pairs by using excel functions?
For example: The Current Pair of 1&8 does not match with the Ideal Pair of 1&14 and 1&29
I would also like to count the total number of matching pairs at the end. I have tried out different methods by using countifs() or match(), but in vain.
Appreciate your answer!
You can try below formula to count all matching pairs.
=SUM(--(IFERROR(XMATCH(B2:B7&C2:C7,F2:F7&G2:G7,0),0)>0))

How can I check that a string only contains a defined set of substrings by Excel formula?

I have a dictionary containing lots of words - I want the user to be able to input a list of substrings, and then a filtered list will be updated, containing only words that contain those substrings and nothing else. Any words that contain extra characters the user didn't specify, should not appear. Cell F3 will use a FILTER function to create the list. As in the mock-up below:
What I need is a formula that would generate the TRUE or FALSE flags from the yellow section (B3:B9), but I'm not sure how to go about this.
I'm sure this could be solved by VBA or Regex using Google Sheets, but I want to know if there's a way to do this by formula, as I don't want this to require a button press or script execution, and my spreadsheet can't be hosted on Google sheets due to its size. Any ideas?
You can also use a combination of ISNUMBER and SUMPRODUCT:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(INDEX(A:A,1,1):INDEX(A:A,LEN(A3),1)),1),$D$3:$D$5,0)))
Adjusted formula:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(A$1:INDEX(A:A,LEN(A3))),1),$D$3:$D$5,0)))
The result:
The test being ran below is subtracting each instance of your dictionary from the length of original string. If the result is 0, this returns TRUE. If not, this returns FALSE. This is not case sensitive - a & A will be treated equally here.
=NOT(LEN(A1)-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D1,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D2,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D3,""))))
The equation works fine although I don't know if it is an optimal solution for you, but posting as answer in case it is for somebody else. The issue with this approach is the equation gets longer and longer for each character you add to your dictionary. Depending on the size of dictionary and strings to test against, this can get sloppy and calc heavy really quick.
Have you considered a UDF in VBA?

Using excel and modifying string based on search function

I am trying to get a value or all values similar to below in excel:
#123 maybe some text and date 12/17/209
#048309 maybe some text and date 12/17/209
#9385 maybe some text and date 12/17/209
I want to get the value proceeding the # however, I am not sure if there is an easier function? I want it to find the # then get however many numbers proceeds it. I am familiar with regex not with excel functions unfortunately.
Sorry for vagueness:
I was trying to use an IF() supplying a # as the find operation for the character I just couldnt manage to get the number as I was trying to use RIGHT() to filter after the #. What I found with the RIGHT() function is that it expects a parameter for count and so would have to be dynamic so I dropped that idea.
This formula will get the numbers directly after #:
=--MID(A1,FIND("#",A1)+1,FIND(" ",A1,FIND("#",A1))-FIND("#",A1))

How to find a string within a string

I have the list with like 100,000 site link strings
Each link is unique, but it has consistent ?Item=
Then, it's either nothing or it continues after & symbol.
My question is: How do I pull out the item numbers?
I know replace function can offer similar functionality, but it works with Fixed sizes, in my case string can be different in size.
Link example:
www.site.com?sadfsf?sdfsdf&adfasfd?Item=JGFGGG55555
or
www.site.com?sadfsf?sdfsdf&adfasfd?Item=JGFGGG55555&sdafsdfsdfsdf
In both cases I need to get JGFGGG55555 only
If this always is the last portion of the string, you can use the following:
=MID(A1, FIND("?Item=", A1) + 6, 99)
This assumes:
no item numbers will be over 99 digits.
no additional fields follow the item number.
Edit:
With the update to your question, it is apparent you have some strings with additional data after the ?Item= field. Without using VBA there is not a simple means of using MID and FIND to extract this.
However you could create a column which acts as a placeholder.
For example, create a column using:
=MID(A1,FIND("?Item=",A1)+6,99)
This gets you the following value: JGFGGG55555&sdafsdfsdfsdf
Next, create a column using:
=IF(ISERROR(FIND("&",B2)),B2,LEFT(B2,FIND("&",B2)-1))
This produces: JGFGGG55555 by searching the first value for a & and using the portion before it. If it is not found, the first value is simply repeated.
This formula should work for both the examples given:
=MID(A1,FIND("=",A1),IFERROR(LEN(A1)-FIND("&",A1,FIND("=",A1))-1,LEN(A1)+1-FIND("=",A1)))

Resources