How to count exact text contained in a string [Excel] - excel

I have already used the formulas below to count cells that contain an exact text, but they still count wrongly. For example, I would like to count the "ZIKA" test code in the table below; the answer should be two. But the formula also counts ZIKA2 as ZIKA. How can I exclude ZIKA2 from the count?
TEST
HS2, CCAL, EGFR, AFB
ZIKA, AG21
PPB, ZIKA2
ZIKA, AG21
I have already tried these formulas:
=SUMPRODUCT(--(ISNUMBER(FIND("ZIKA",F:F))))
and also
=COUNTIF(F:F,"ZIKA")

You could count the exact ZIKA cells plus the comma-separated variations:
=COUNTIF(F:F,"ZIKA")+COUNTIF(F:F,"ZIKA,*")+COUNTIF(F:F,"*, ZIKA")+COUNTIF(F:F,"*, ZIKA,*")

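I assume your data follows this format, with a space after each comma:
xxx, yyy, zzz
You may need to split your formula into three parts:
=COUNTIF(F:F,"ZIKA,*")+COUNTIF(F:F,"*, ZIKA")+COUNTIF(F:F,"ZIKA")
The first part counts the cells that start with ZIKA, the second counts the cells that end with ZIKA, and the last counts the cells that contain only ZIKA. A cell with ZIKA in the middle (xxx, ZIKA, zzz) is not covered by these three parts; it would need a further term such as COUNTIF(F:F,"*, ZIKA,*").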
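The idea behind the COUNTIF variations (ZIKA must be a complete comma-separated entry, not a prefix of a longer code) can be sketched in Python. This is only an illustration under the same assumption about the data format; the function name count_cells_with_code and the list cells are hypothetical and not part of any Excel API.

```python
def count_cells_with_code(cells, code, sep=","):
    """Count cells whose separator-delimited entries include `code` exactly."""
    count = 0
    for cell in cells:
        # Split the cell into entries and trim the space that follows each comma.
        entries = [part.strip() for part in cell.split(sep)]
        if code in entries:
            count += 1
    return count


# Sample column from the question (header row excluded).
cells = ["HS2, CCAL, EGFR, AFB", "ZIKA, AG21", "PPB, ZIKA2", "ZIKA, AG21"]
print(count_cells_with_code(cells, "ZIKA"))  # prints 2
```

Because each entry is compared as a whole, ZIKA2 never matches ZIKA, and the position of the code within the cell does not matter.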

Try this regex; it may need a helper column. I have not tested it much yet.
Press ALT + F11 to open the VBA editor.
Click Insert -> Module and copy and paste the code below.
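Function Regex(Cell, Search)
' Returns every whole-word match of Search found in Cell, joined by commas.
Dim RE As Object, Matches As Object, res As Variant
Set RE = CreateObject("vbscript.regexp")
' \b is a word boundary, so searching for ZIKA does not match ZIKA2.
RE.Pattern = "(\b" & Search & "\b)"
RE.Global = True
RE.IgnoreCase = True
Set Matches = RE.Execute(Cell)
For Each res In Matches
Regex = Regex & "," & res
Next res
' Drop the leading comma.
Regex = Mid(Regex, 2)
End Function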
It will return "ZIKA" if it finds ZIKA in the cell you run it on.
You can then count the ZIKAs in the helper column.
Updated with new code that lets you change the search term.
Use it with =Regex(A1, "ZIKA")
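For comparison, the same word-boundary technique can be sketched in Python with the standard re module. The helper name has_whole_word and the list cells are hypothetical; the sketch only demonstrates why \b keeps ZIKA2 out of the count.

```python
import re


def has_whole_word(text, word):
    """Return True if `word` occurs in `text` as a whole word (case-insensitive)."""
    # re.escape guards against regex metacharacters in the search term.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Sample column from the question (header row excluded).
cells = ["HS2, CCAL, EGFR, AFB", "ZIKA, AG21", "PPB, ZIKA2", "ZIKA, AG21"]
print(sum(has_whole_word(cell, "ZIKA") for cell in cells))  # prints 2
```

In ZIKA2 the A is followed by another word character, so there is no boundary after ZIKA and the pattern does not match.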

Related

Delete character after ':' but not delete the rest

For example my excel column is
CodeandPrice
3&12|4&200|2&
5&|2&
4&|
2&12|35&744
Here & separates the code from the price, and | separates two items.
I only want to get the code, i.e. the characters before each &.
CodeandPrice
3&|4&|2&
5&|2&
4&|
2&|35&
I've googled, but what I found removes all characters after or before a delimiter. What I want is
to remove the characters after each &, but not everything, since another code follows.
For Excel 2010, maybe the easiest is to quickly throw together a UDF and invoke it as a function in your sheet, for example:
Function RegexReplace(s_in As String, pat As String, repl As String) As String
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = pat
RegexReplace = .Replace(s_in, repl)
End With
End Function
Invoke like: =RegexReplace(A2,"&\d+","&"), which removes the digits after each & and keeps the & itself.
If one happens to have newer functionality available try:
Formula in B2:
=MAP(A2:A5,LAMBDA(a,TEXTJOIN("|",,TAKE(TEXTSPLIT(a,"&","|"),,1)&"&")))
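As a cross-check of the expected output, here is a minimal Python sketch of the same transformation. It assumes prices consist of digits only; the function name keep_codes is hypothetical.

```python
import re


def keep_codes(value):
    """Remove the digits (the price) that follow each '&', keeping the '&'."""
    return re.sub(r"&\d+", "&", value)


# Rows from the question's CodeandPrice column.
for row in ["3&12|4&200|2&", "5&|2&", "4&|", "2&12|35&744"]:
    print(keep_codes(row))
```

Anchoring the pattern on & means the codes before each & are never touched, even though they are digits too.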

Replace non-printable characters with " (Inch sign) VBA Excel

I need to replace non-printable characters with " (Inch sign).
I tried to use the Excel CLEAN function and other UDFs, but they only remove the characters and do not replace them.
Note: the non-printable characters are highlighted in blue in the photo above, and their position in the cells is random.
This is a sample string file: Link
The expected correct output should be 12"x14" LPG . OUTLET OCT-SEP# process
Thanks in advance for any useful comments and answers.
As per my comment, you can try:
=SUBSTITUTE(A1,CHAR(25)&CHAR(25),CHAR(34))
Or the VBA pseudo-code:
[A1] = [A1].Replace(Chr(25) & Chr(25), Chr(34))
Where [A1] is the obvious placeholder for the range-object you would want to use with proper and absolute referencing.
With ms365 newest functions, we could also use:
=TEXTJOIN(CHAR(34),,TEXTSPLIT(A1,CHAR(25)))
You can use Regular Expressions within a UDF to create a flexible method to replace "bad" characters, when you don't know exactly what they are.
In the UDF below, I show two pattern options, but others are possible.
The first replaces all characters with a character code >127;
the second replaces all characters with a character code >255.
Option Explicit
Function ReplaceBadChars(str As String, replWith As String) As String
Dim RE As Object
Set RE = CreateObject("Vbscript.Regexp")
With RE
.Pattern = "[\u0080-\uFFFF]" 'to replace all characters with code >127 or
'.Pattern = "[\u0100-\uFFFF]" 'to replace all characters with code >255
.Global = True
ReplaceBadChars = .Replace(str, replWith)
End With
End Function
On the worksheet you can use, for example:
=ReplaceBadChars(A1,"""")
Or you could use it in a macro if you wanted to process a column of data without adding an extra column.
Note: I am uncertain whether there might be an efficiency difference when using a smaller negated character class (e.g. [^\x00-\x7F]) instead of the character class I showed in the code. But if, as written, execution seems slow, I'd try this change.
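The character-class approach can be sketched in Python as well. This is a minimal illustration using the negated ASCII class; the function name replace_bad_chars is hypothetical, and the sample string is made up (it uses U+2033 as a stand-in, since the actual characters in the asker's file are not known).

```python
import re

# Matches any character outside the 7-bit ASCII range (code > 127).
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def replace_bad_chars(text, repl):
    """Replace every non-ASCII character in `text` with `repl`."""
    return NON_ASCII.sub(repl, text)


# Hypothetical sample: U+2033 stands in for the unknown "bad" character.
sample = "12\u2033x14\u2033 LPG"
print(replace_bad_chars(sample, '"'))  # prints 12"x14" LPG
```

Note that this class does not cover control characters with codes below 128; those would need their own class or a plain string replacement.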
You can try this:
Cells.Replace What:="[The character to replace]", Replacement:=""""

Filter phone numbers from open text field - Power BI, excel, VBA

I have a text field in a table where I need to substitute phone numbers where applicable.
For example the text field could have:
Call me on 08588812885 immediately
Call me on 07525812845
I need assistance please contact me
Good service
Sometimes a phone number will be in the text but not always and the phone number entered will always be different.
Is there a measure I can use to replace the phone numbers with no text?
Ideally the solution would be in Power BI, but it can also be done in the raw data using Excel or VBA.
A regular expression in VBA (Excel) or Python (Power BI) is a straightforward solution.
I had never used Power BI with Python before, but I managed to write the following Python script.
In the Power BI transformation steps, I created a new column that copies the [message] column and named it [noPhoneNumber]; the next step then ran this Python script:
import re

def removePhone(x):
    # Replace any run of 10 or 11 digits with a placeholder.
    return re.sub(r'\d{10,11}', "**number removed**", x)

# 'dataset' is the DataFrame that Power BI passes to the script.
dataset["noPhoneNumber"] = dataset["noPhoneNumber"].apply(removePhone)

So the column "noPhoneNumber"
Call me on 08588812885 immediately
Call me on 07525812845
I need assistance please contact me
Good service
becomes
Call me on **number removed** immediately
Call me on **number removed**
I need assistance please contact me
Good service
In VBA, preferably create a UDF (user-defined function) rather than a subroutine; a subroutine would be too error-prone for this kind of problem.
[Added]
If you need an Excel-based solution, you can create a UDF like so:
(remember to add a reference to "Microsoft VBScript Regular Expressions 5.5" in the VBA editor so that the early-bound RegExp type is available)
Function removePhoneNumber(text As String, Optional replacement As String = "**number removed**") As String
Dim regex As New RegExp
' Match any run of 10 or 11 digits.
regex.Pattern = "\d{10,11}"
' Replace every match, not only the first one.
regex.Global = True
removePhoneNumber = regex.Replace(text, replacement)
End Function
...and then use it as an Excel function like so:
=removePhoneNumber(A2)
=removePhoneNumber(A3)
and so on...
A simple VBA function alternative
Function removePhone(s As String) As String
Const DELIM As String = " "
Dim i As Long, tokens As Variant
tokens = Split(s, DELIM)
For i = LBound(tokens) To UBound(tokens)
If IsNumeric(tokens(i)) Then
tokens(i) = "*Removed*" ' << change to your needs
Exit For ' assuming a single phone number per string
End If
Next
removePhone = Join(tokens, DELIM)
End Function
You can do this in Power Query. Create a custom column with the code below. I have assumed the column name is comments; please adjust it to your column name.
if Text.Length(Text.Select([comments], {"0".."9"})) = 11
then
Text.Replace(
[comments],
Text.Select([comments], {"0".."9"}),
""
)
else [comments]
You can also replace the phone numbers with other text such as #### to anonymize them.
NOTE
This will only work if there is exactly one number in the string and its length is 11 (you can adjust the length in the code as required).
This will not work if there is more than one number in the string.
If there is one number in the string but its length is not 11, the whole string is kept as it was.
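To make the limitations in the note concrete, here is a minimal Python sketch of the same logic (collect all digits, then remove them only if there are exactly 11). The function name remove_single_number and its parameters are hypothetical; this mirrors the Power Query expression but is not Power Query code.

```python
def remove_single_number(comment, expected_len=11, replacement=""):
    """Remove the digits from `comment` if it contains exactly `expected_len` digits."""
    # Equivalent of Text.Select([comments], {"0".."9"}): keep only the digits.
    digits = "".join(ch for ch in comment if ch in "0123456789")
    if len(digits) == expected_len:
        # Works only when the digits form one contiguous run in the text.
        return comment.replace(digits, replacement)
    return comment


print(remove_single_number("Call me on 08588812885 immediately"))
print(remove_single_number("Good service"))
print(remove_single_number("Call 07525812845 or 07525812846"))
```

The third call returns its input unchanged: the two numbers together contribute 22 digits, so the length check fails, which is the multiple-number limitation described above.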

Excel find and replace function correct formula

I wish to use the find and replace function in excel to remove example sentences from cells similar to this:
text <br>〔「text」text,「text」text〕<br>(1)text「sentence―sentence/sentence」<br>(2)text「sentence―sentence」
Sentences are in between 「」brackets and will include a ― and / character somewhere inside the brackets.
I have tried 「*―*/*」 however this will delete everything from the right of the〔
Is there any way to target and delete these specific sentence brackets, with the find and replace tool?
Desired outcome:
text <br>〔「text」text,「text」text〕<br>(1)text<br>(2)text「sentence―sentence」
Quite a long formula but in Excel O365 you could use:
=SUBSTITUTE(CONCAT(FILTERXML("<t><s>"&SUBSTITUTE(CONCAT(IF(MID(A1,SEQUENCE(LEN(A1)),1)="「","</s><s>「",IF(MID(A1,SEQUENCE(LEN(A1)),1)="」","」</s><s>",MID(A1,SEQUENCE(LEN(A1)),1)))),"<br>","|$|")&"</s></t>","//s[not(contains(., '「') and contains(., '―') and contains(., '/') and contains(., '」'))][node()]")),"|$|","<br>")
As long as you have access to CONCAT you could also do this in Excel 2019 but you'll have to swap SEQUENCE(LEN(A1)) for ROW(A$1:INDEX(A:A,LEN(A1)))
This formula won't work in every case, but if your strings follow the same rules as in your example, try this:
=SUBSTITUTE(C5,"「" & INDEX(TRIM(MID(SUBSTITUTE(","&SUBSTITUTE(C5,"」","「"),"「",REPT(" ",99)),(ROW(A1:INDEX(A1:A100,LEN(C5)-LEN(SUBSTITUTE(C5,"」",""))))*2-1)*99,99)),MATCH("*―*/*",TRIM(MID(SUBSTITUTE(","&SUBSTITUTE(C5,"」","「"),"「",REPT(" ",99)),(ROW(A1:INDEX(A1:A100,LEN(C5)-LEN(SUBSTITUTE(C5,"」",""))))*2-1)*99,99)),0)) & "」","")
How it works:
Split the string between the characters "「" and "」" into an array.
Use MATCH("*―*/*",,0) to find the position of the string (note that it returns only one value if a match exists; if you have multiple strings, you can replace MATCH("*―*/*",) with SEARCH("*―*/*",..) and use it in an extra column to get the matching strings).
Use INDEX(array,MATCH("*―*/*",..)) to get the string that needs replacing (result).
Replace the result found in the original string with =SUBSTITUTE(txt,result,"").
Or,
In B1 enter formula :
=SUBSTITUTE(A1,"「"&TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND("」",A1,FIND("/",A1))),"「",REPT(" ",99)),99)),"")
You did not tag [VBA], but if you are not averse, you could write a User Defined Function that would do what you want using Regular Expressions.
To enter this User Defined Function (UDF), alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like =replStr(A1) in some cell.
Option Explicit
Function replStr(str As String) As String
Dim RE As Object
Const sPat As String = "\u300C(?:(?=[^\u300D]*\u002F)(?=[^\u300D]*\u2015)[^\u300D]*)\u300D"
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.Pattern = sPat
replStr = .Replace(str, "")
End With
End Function
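The two-lookahead pattern is easier to read with the literal characters in place of the \u escapes. Here is a minimal Python sketch of the same technique, run on the sample string from the question; the names PATTERN and remove_examples are hypothetical.

```python
import re

# An opening bracket, then two lookaheads requiring that both "/" and "―"
# occur before the next closing bracket, then the bracket contents.
PATTERN = re.compile(r"「(?=[^」]*/)(?=[^」]*―)[^」]*」")


def remove_examples(text):
    """Delete every 「...」 group that contains both a '―' and a '/'."""
    return PATTERN.sub("", text)


sample = "text <br>〔「text」text,「text」text〕<br>(1)text「sentence―sentence/sentence」<br>(2)text「sentence―sentence」"
print(remove_examples(sample))
```

Because every part of the pattern uses [^」]*, a match can never run past the first closing bracket, which is what the 「*―*/*」 wildcard in Find and Replace could not guarantee.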

Remove text appearing between two characters - multiple instances - Excel

In Microsoft Excel file, I have a text in rows that appears like this:
1. Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]} 2. Rxc8 {[%emt 0:00:01]} Qxc8 {} 3. Qe7# 1-0
I need to remove any text appearing within the flower brackets { and }, including the brackets themselves.
In the above example, there are three instances of such flower brackets. But some rows might have more than that.
I tried =MID(LEFT(A2,FIND("}",A2)-1),FIND("{",A2)+1,LEN(A2))
This outputs: [%emt 0:00:05]. As you see, this is the very first instance of text between those flower brackets.
If we use this within SUBSTITUTE, adjusted to include the brackets, like this: =SUBSTITUTE(A2,MID(LEFT(A2,FIND("}",A2)),FIND("{",A2),LEN(A2)),"")
I get an output like this:
1. Rc8 Rxc8 {[%emt 0:00:01]} 2. Rxc8 {[%emt 0:00:01]} Qxc8 {} 3. Qe7# 1-0
If you have noticed, only one instance is removed. How do I make it work for all instances? thanks.
Highlight everything
Go to Replace
Enter {*} in the "Find what" box
Leave "Replace with" blank
This should replace all flower brackets and anything in between them
It is not that easy without VBA, but there is still a way.
Either (as suggested by yu_ominae) just use a formula like this and auto-fill it:
=IFERROR(SUBSTITUTE(A2,MID(LEFT(A2,FIND("}",A2)),FIND("{",A2),LEN(A2)),""),A2)
Another way would be iterative calculation (go to Options -> Formulas and check the "Enable iterative calculation" box).
To do it in one cell, you need one helper cell (in my example, C1); then use a formula like this in B2 and auto-fill down:
=IF($C$1,A2,IFERROR(SUBSTITUTE(B2,MID(LEFT(B2,FIND("}",B2)),FIND("{",B2),LEN(B2)),""),B2))
Put "1" in C1 and all formulas in B:B will show the values of A:A. Now go to C1 and hit the del-key several times (you will see the "{}"-parts disappearing) till all looks like you want it.
EDIT: To do it via VBA but without regex you can simply put this into a module:
Public Function DELBRC(ByVal str As String) As String
While InStr(str, "{") > 0 And InStr(str, "}") > InStr(str, "{")
str = Left(str, InStr(str, "{") - 1) & Mid(str, InStr(str, "}") + 1)
Wend
DELBRC = Trim(str)
End Function
and then in the worksheet directly use:
=DELBRC(A2)
If you still have any questions, just ask ;)
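The loop in DELBRC can be sketched in Python to show the logic step by step. This is a direct transcription for illustration only; the function name del_braces is hypothetical.

```python
def del_braces(text):
    """Repeatedly cut out the span from the first '{' to the first '}'."""
    while True:
        start = text.find("{")
        end = text.find("}")
        # Stop when there is no '{' left, or no '}' after the first '{'.
        if start == -1 or end <= start:
            break
        text = text[:start] + text[end + 1:]
    return text.strip()


row = "1. Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]} 2. Rxc8 {[%emt 0:00:01]} Qxc8 {} 3. Qe7# 1-0"
print(del_braces(row))
```

Like the VBA version, this leaves the spaces on both sides of each removed group in place, and it stops early if a stray "}" appears before the first "{".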
Try a user-defined function. In VBA, add a reference to "Microsoft VBScript Regular Expressions 5.5". Then add this code in a module.
Function RemoveTags(ByVal Value As String) As String
Dim rx As New RegExp
rx.Global = True
rx.Pattern = " ?{.*?}"
RemoveTags = Trim(rx.Replace(Value, ""))
End Function
On the worksheet in the cell enter: =RemoveTags(A1) or whatever the address is where you want to remove text.
If you want to test it in VBA:
Sub test()
Dim a As String
a = "Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]}"
Debug.Print RemoveTags(a)
End Sub
Outputs "Rc8 Rxc8"
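The same non-greedy pattern works in Python's re module, where the braces are escaped to make them literal. A minimal sketch, with the hypothetical function name remove_tags:

```python
import re


def remove_tags(value):
    """Remove every {...} group, plus one preceding space if present."""
    # .*? is non-greedy, so each match stops at the nearest closing brace.
    return re.sub(r" ?\{.*?\}", "", value).strip()


print(remove_tags("Rc8 {[%emt 0:00:05]} Rxc8 {[%emt 0:00:01]}"))  # prints Rc8 Rxc8
```

With a greedy .* instead, a single match would run from the first opening brace to the last closing brace and swallow the moves in between.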
