Replace non-printable characters with " (Inch sign) VBA Excel - excel

I need to replace non-printable characters with " (Inch sign).
I tried Excel's CLEAN function and other UDFs, but they only remove the characters instead of replacing them.
Note: the non-printable characters are highlighted in blue in the photo above, and their position within the cells is random.
This is a sample string file: Link
The expected output is 12"x14" LPG . OUTLET OCT-SEP# process
Thanks in advance for any useful comments and answers.

As per my comment, you can try:
=SUBSTITUTE(A1,CHAR(25)&CHAR(25),CHAR(34))
Or, in VBA:
[A1].Value = Replace([A1].Value, Chr(25) & Chr(25), Chr(34))
Where [A1] is a placeholder for the Range object you actually want to use, with proper and absolute referencing.
With ms365 newest functions, we could also use:
=TEXTJOIN(CHAR(34),,TEXTSPLIT(A1,CHAR(25)))

You can use Regular Expressions within a UDF to create a flexible method to replace "bad" characters, when you don't know exactly what they are.
In the UDF below, I show two pattern options, but others are possible.
One replaces all characters with a character code > 127;
the other replaces all characters with a character code > 255.
Option Explicit
Function ReplaceBadChars(str As String, replWith As String) As String
    Dim RE As Object
    Set RE = CreateObject("Vbscript.Regexp")
    With RE
        .Pattern = "[\u0080-\uFFFF]" 'to replace all characters with code >127 or
        '.Pattern = "[\u0100-\uFFFF]" 'to replace all characters with code >255
        .Global = True
        ReplaceBadChars = .Replace(str, replWith)
    End With
End Function
On the worksheet you can use, for example:
=ReplaceBadChars(A1,"""")
Or you could use it in a macro if you wanted to process a column of data without adding an extra column.
Note: I am uncertain whether there is an efficiency difference when using a smaller negated character class (e.g. [^\x00-\x7F]) instead of the character class shown in the code. But if execution seems slow as written, I'd try that change.
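As a rough sketch of the macro approach mentioned above, the following loops over a range and rewrites text cells in place. The sub name, the range A1:A100 on the active sheet, and the choice of the negated class [^\x00-\x7F] are assumptions for illustration, not part of the original answer; adjust them to your data.

```vba
' Hypothetical macro: replaces every character with a code > 127 by a
' double quote, directly in the cells (no helper column).
Sub ReplaceBadCharsInPlace()
    Dim RE As Object
    Dim cell As Range

    Set RE = CreateObject("VBScript.RegExp")
    RE.Pattern = "[^\x00-\x7F]"   ' anything outside the 7-bit ASCII range
    RE.Global = True

    ' Assumed range; change to the column you actually need.
    For Each cell In ActiveSheet.Range("A1:A100").Cells
        ' Skip formulas, numbers, errors and empty cells.
        If Not cell.HasFormula Then
            If VarType(cell.Value) = vbString Then
                If RE.Test(cell.Value) Then
                    cell.Value = RE.Replace(cell.Value, Chr(34))
                End If
            End If
        End If
    Next cell
End Sub
```

Because the macro overwrites the cells, it is worth trying it on a copy of the data first.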

You can try this:
Cells.Replace What:="[The character to replace]", Replacement:=""""

Related

Delete character after ':' but not delete the rest

For example my excel column is
CodeandPrice
3&12|4&200|2&
5&|2&
4&|
2&12|35&744
Here & separates the code from the price, and | separates two items.
I want to keep only the code, i.e. the characters before each &:
CodeandPrice
3&|4&|2&
5&|2&
4&|
2&|35&
I've googled this, but everything I found removes all characters after/before a delimiter. What I want is
to remove the characters after each & but not everything, since another code follows.
For Excel 2010, maybe the easiest is to quickly throw together a UDF and invoke it as a function in your sheet, for example:
Function RegexReplace(s_in As String, pat As String, repl As String) As String
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = pat
RegexReplace = .Replace(s_in, repl)
End With
End Function
Invoke like: =RegexReplace(A2,"&\d+","&")
If one happens to have newer functionality available try:
Formula in B2:
=MAP(A2:A5,LAMBDA(a,TEXTJOIN("|",,TAKE(TEXTSPLIT(a,"&","|"),,1)&"&")))
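If regular expressions are not wanted, the same result can probably be reached with plain string functions. The sketch below is an alternative to the answers above, not a restatement of them; the function name CodesOnly is made up for illustration. It splits the cell text on |, cuts each item off right after its first &, and joins the items again.

```vba
' Hypothetical UDF: keeps only the code and the "&" of every item.
Function CodesOnly(ByVal s As String) As String
    Dim parts() As String
    Dim i As Long
    Dim pos As Long

    parts = Split(s, "|")                ' one element per item
    For i = LBound(parts) To UBound(parts)
        pos = InStr(parts(i), "&")
        ' Drop everything after the first "&" (the price), if there is one.
        If pos > 0 Then parts(i) = Left$(parts(i), pos)
    Next i
    CodesOnly = Join(parts, "|")
End Function

Private Sub TestCodesOnly()
    Debug.Assert CodesOnly("3&12|4&200|2&") = "3&|4&|2&"
    Debug.Assert CodesOnly("4&|") = "4&|"
    Debug.Assert CodesOnly("2&12|35&744") = "2&|35&"
End Sub
```

On the worksheet it would be used as =CodesOnly(A2).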

Excel find and replace function correct formula

I wish to use the find and replace function in excel to remove example sentences from cells similar to this:
text <br>〔「text」text,「text」text〕<br>(1)text「sentence―sentence/sentence」<br>(2)text「sentence―sentence」
Sentences are in between 「」brackets and will include a ― and / character somewhere inside the brackets.
I have tried 「*―*/*」 however this will delete everything from the right of the〔
Is there any way to target and delete these specific sentence brackets, with the find and replace tool?
Desired outcome:
text <br>〔「text」text,「text」text〕<br>(1)text<br>(2)text「sentence―sentence」
Quite a long formula but in Excel O365 you could use:
=SUBSTITUTE(CONCAT(FILTERXML("<t><s>"&SUBSTITUTE(CONCAT(IF(MID(A1,SEQUENCE(LEN(A1)),1)="「","</s><s>「",IF(MID(A1,SEQUENCE(LEN(A1)),1)="」","」</s><s>",MID(A1,SEQUENCE(LEN(A1)),1)))),"<br>","|$|")&"</s></t>","//s[not(contains(., '「') and contains(., '―') and contains(., '/') and contains(., '」'))][node()]")),"|$|","<br>")
As long as you have access to CONCAT you could also do this in Excel 2019 but you'll have to swap SEQUENCE(LEN(A1)) for ROW(A$1:INDEX(A:A,LEN(A1)))
This formula won't work in many cases, but if the string has matching rules as in your example, then try this:
=SUBSTITUTE(C5,"「" & INDEX(TRIM(MID(SUBSTITUTE(","&SUBSTITUTE(C5,"」","「"),"「",REPT(" ",99)),(ROW(A1:INDEX(A1:A100,LEN(C5)-LEN(SUBSTITUTE(C5,"」",""))))*2-1)*99,99)),MATCH("*―*/*",TRIM(MID(SUBSTITUTE(","&SUBSTITUTE(C5,"」","「"),"「",REPT(" ",99)),(ROW(A1:INDEX(A1:A100,LEN(C5)-LEN(SUBSTITUTE(C5,"」",""))))*2-1)*99,99)),0)) & "」","")
How it works:
Split the string between the characters "「" and "」" into an array.
Use MATCH("*―*/*",…,0) to find the position of the matching string (note that it returns only one value if a match exists; if you have multiple matching strings, you can replace MATCH("*―*/*",…) with SEARCH("*―*/*",…) and use it as an extra column to get the matching strings).
Use INDEX(array,MATCH("*―*/*",…)) to get the string that needs replacing (result).
Remove the result from the original string with =SUBSTITUTE(txt,result,"").
Or,
In B1 enter formula :
=SUBSTITUTE(A1,"「"&TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND("」",A1,FIND("/",A1))),"「",REPT(" ",99)),99)),"")
You did not tag [VBA], but if you are not averse, you could write a User Defined Function that would do what you want using Regular Expressions.
To enter this User Defined Function (UDF), alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like =replStr(A1) in some cell.
Option Explicit
Function replStr(str As String) As String
Dim RE As Object
Const sPat As String = "\u300C(?:(?=[^\u300D]*\u002F)(?=[^\u300D]*\u2015)[^\u300D]*)\u300D"
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.Pattern = sPat
replStr = .Replace(str, "")
End With
End Function
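A small check along these lines may help confirm the pattern behaves as intended. It assumes the replStr function above is in the same module; the sub name and the shortened sample string are made up for illustration. The brackets and the horizontal bar are built with ChrW so the test does not depend on how the VBA editor displays them.

```vba
Private Sub TestReplStr()
    Dim lb As String, rb As String, bar As String
    lb = ChrW(&H300C)    ' left corner bracket
    rb = ChrW(&H300D)    ' right corner bracket
    bar = ChrW(&H2015)   ' horizontal bar

    Dim keepMe As String, dropMe As String
    keepMe = lb & "a" & bar & "b" & rb      ' bar but no slash: should stay
    dropMe = lb & "a" & bar & "b/c" & rb    ' bar and slash: should be removed

    Debug.Assert replStr("(1)text" & dropMe & "<br>(2)text" & keepMe) = "(1)text<br>(2)text" & keepMe
End Sub
```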

Turkish Excel version: transforming dotted upper i, "İ" to dotless upper i "I" with VBA

I identified a problem of special characters with “I” on a turkish version of Windows 10 with a Turkish Excel version. “i” gives another letter in Turkish when it is translated to uppercase: “İ” and not “I”, as for instance when "i" is converted to upper case in English.
The problem is that when I use case-insensitive Excel search formulas (for instance MATCH or COUNTIF), the Turkish Excel looks for “i” or “İ” (with a dot), which it doesn’t find because the letters in the lookup range are uppercase (and there's no "İ"), while the English version looks for “i” or “I”, which it does find.
To summarize, I search for "i" in a non-case sensitive way, and my Turkish colleague doesn't find any result, because his computer looks for "i" and "İ", and all the expected results are with "I".
I cannot ask the users having this problem to change the language in Excel or in Windows, nor change the lookup source or target ranges. But I can change the formulas (match and countifs are used).
I'm not sure where this lower to uppercase conversion is coming from, if it's from Excel or Windows. But after installing a Turkish version of Excel on a German version of Windows, I didn't have the problem. So I presume the problem is coming from Windows (some language settings, in the end, knowing that is interesting but won't help much)...
I was thinking of writing a VBA function that changes text to uppercase AND changes dotted capital “İ” to dotless capital “I” with Replace, and then using this string in my search functions. But I cannot find the dotted capital “İ” with the Chr() function... I think the Chr() function doesn't cover the extended characters, just the standard ones. See the kind of function I intended to use below.
Function upper_i(myStr As String) As String
upper_i = UCase(myStr)
upper_i = Replace(upper_i, Chr(???), "I")
End Function
How can I tell Excel I want a dotted capital "İ" here?
Thanks for your help!
Assuming I'm reading correctly, maybe try:
Function upper_i(myStr As String) As String
upper_i = UCase$(myStr)
upper_i = Replace(upper_i, ChrW(304), "I")
End Function
Seems to pass the test below at least:
Private Sub TestFunction()
Dim someText As String
someText = "ok, ok, " & ChrW(304)
Debug.Assert someText <> "OK, OK, I"
someText = upper_i(someText)
Debug.Assert someText = "OK, OK, I"
End Sub
I didn't really understand why you're making the string uppercase, but maybe I need to read your question a few more times.

How to count exact text contain in string [Excel]

I already use the formulas below to count an exact text within a string, but they still count wrongly. For example, I would like to count the "ZIKA" test code in the table; the answer should be two. But the formula also counts ZIKA2 as ZIKA. How can I exclude ZIKA2 from the count?
TEST
HS2, CCAL, EGFR, AFB
ZIKA, AG21
PPB, ZIKA2
ZIKA, AG21
I have already tried these formulas:
=SUMPRODUCT(--(ISNUMBER(FIND("ZIKA",F:F))))
and also
=COUNTIF(F:F,"ZIKA")
You could count the exact ZIKA and its comma-separated variations:
=COUNTIF(F:F,"ZIKA")+COUNTIF(F:F,"ZIKA,*")+COUNTIF(F:F,"*, ZIKA")+COUNTIF(F:F,"*, ZIKA,*")
I assume your data follows this format:
xxx, yyy, zzz
(a space after each comma)
You may need to split your formula into 3 parts:
=COUNTIF(F:F,"ZIKA,*")+COUNTIF(F:F,"*, ZIKA")+COUNTIF(F:F,"ZIKA")
The first part counts the cells that start with ZIKA, the second counts those that end with ZIKA, and the last counts those that contain only ZIKA.
Try this regex; it may need a helper column. I have not tested it much yet.
Press ALT + F11 to open the VBA editor.
Click Insert -> Module and paste the code below.
Function Regex(Cell, Search)
    Dim RE As Object
    Dim Matches As Object ' MatchCollection returned by Execute
    Dim res As Object     ' a single Match
    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = "(\b" & Search & "\b)" ' whole-word match only
    RE.Global = True
    RE.IgnoreCase = True
    Set Matches = RE.Execute(Cell)
    For Each res In Matches
        Regex = Regex & "," & res
    Next res
    Regex = Mid(Regex, 2) ' drop the leading comma
End Function
It will return "ZIKA" if it finds ZIKA in the cell you run it on.
And then you just count the ZIKAs in the helper column.
Updated with new code in which you can change the search term.
Use it with =regex(A1, "ZIKA")
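As an alternative without a helper column, a UDF could split each cell on commas and compare the trimmed pieces directly. This is only a sketch under the assumption that codes are comma-separated as in the sample; the function name CountExactCode is made up. Note that it counts occurrences, so a cell listing the same code twice would count twice.

```vba
' Hypothetical UDF: counts comma-separated entries equal to code.
Function CountExactCode(ByVal rng As Range, ByVal code As String) As Long
    Dim cell As Range
    Dim parts() As String
    Dim i As Long

    ' Limit whole-column references such as F:F to the used part of the sheet.
    Set rng = Intersect(rng, rng.Worksheet.UsedRange)
    If rng Is Nothing Then Exit Function

    For Each cell In rng.Cells
        If VarType(cell.Value) = vbString Then
            parts = Split(cell.Value, ",")
            For i = LBound(parts) To UBound(parts)
                ' Trim removes the space after each comma; comparison ignores case.
                If StrComp(Trim$(parts(i)), code, vbTextCompare) = 0 Then
                    CountExactCode = CountExactCode + 1
                End If
            Next i
        End If
    Next cell
End Function
```

On the worksheet it would be called as =CountExactCode(F:F,"ZIKA"); with the sample data, ZIKA2 is not equal to ZIKA after trimming, so it is not counted.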

Identify pattern in words

I have a question I believe is quite simple, but I don't know the proper way to do it.
Basically, I would like my program to be able to identify words with a certain pattern in it, and if so, to extract what's before the pattern.
The pattern would be, in this case /F, specifically at the end of the word, and it would extract what's before.
For example, if the program finds 21/F, it will identify it as a good match and will extract 21. But if the word was 21/Fudge, it wouldn't do anything.
Do you know the way to look for a match at a specific position in the word?
I would do:
If str Like "*/F" Then
    before = Left(str, Len(str) - Len("/F"))
Else
'No match!
End If
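The same idea can be wrapped in a small function. This is a sketch with made-up names (TextBeforeSuffix and its test sub); it swaps Like for a Right$ comparison so that characters such as * or # in the suffix are not treated as wildcards, and it returns an empty string when there is no match.

```vba
' Hypothetical helper: returns the text before suffix if word ends with it.
Function TextBeforeSuffix(ByVal word As String, ByVal suffix As String) As String
    ' Require at least one character before the suffix.
    If Len(word) > Len(suffix) And Right$(word, Len(suffix)) = suffix Then
        TextBeforeSuffix = Left$(word, Len(word) - Len(suffix))
    End If
End Function

Private Sub TestTextBeforeSuffix()
    Debug.Assert TextBeforeSuffix("21/F", "/F") = "21"
    Debug.Assert TextBeforeSuffix("21/Fudge", "/F") = ""
End Sub
```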
I would use a regular expression, something like this:
\b\w*?(\d+)\/F\b
This will help you match any digits before "/F" and ignore the rest of the word. In order to use it in VBA you will need to add a reference to 'Microsoft VBScript Regular Expressions 5.5' and here's the VBA behind this. Pattern is "\b\w*?(\d+)/F\b"
Public Sub Extract(Pattern As String, Text As String)
    Dim regEx As VBScript_RegExp_55.RegExp
    Dim matches As VBScript_RegExp_55.MatchCollection
    Set regEx = CreateObject("VBScript.RegExp") ' Create a regular expression.
    regEx.Pattern = Pattern
    regEx.Global = True ' Find every match, not only the first one.
    Set matches = regEx.Execute(Text)
    Dim i As Long
    For i = 0 To (matches.Count - 1)
        Debug.Print matches.Item(i).SubMatches(0) ' The digits captured before "/F".
    Next i
End Sub
Hope this helps.

Resources