Remove characters from string (String normalization?) - excel

I am attempting to remove characters from a string in Excel using a VBA macro. For example, the string is "UOZV3A-WB1○1.8ml vbn958Xzlv2" and I need it to return "UOZV3A-WB1". It is pretty straightforward; the code I am using is:
For Each c In Range("D2:D69")
    If InStr(c.Value, "?") > 0 Then
        c.Value = Left(c.Value, InStr(c.Value, "?") - 1)
    End If
Next c
The issue I am running into is that a single character in the string ("○") is not recognized by the macro. The string is entered into the cell by scanning a QR code. I suspect that "○" is a sort of placeholder that is recognized/interpreted as "○" in Excel but interpreted differently in VBA. If I try to copy and paste the character into VBA, I get a "?".
Is there a way to manipulate or interpret that character in VBA? Some of the other posts I read seemed to indicate that the string could be normalized but the coding was over my head.
Thanks!

You need to understand what character you are parsing on:
Sub junkkiller()
    For Each c In Range("D2:D69")
        If InStr(c.Value, ChrW(9675)) > 0 Then
            c.Value = Left(c.Value, InStr(c.Value, ChrW(9675)) - 1)
        End If
    Next c
End Sub
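To find out which code point to pass to ChrW in the first place, you can ask VBA for it. The following is a minimal sketch, assuming the scanned text sits in the active cell; the sub name ListNonAsciiChars is made up for illustration.

```vba
' Hypothetical helper: prints the position and code point of every
' character in the active cell that lies outside plain ASCII.
Sub ListNonAsciiChars()
    Dim s As String
    Dim i As Long
    Dim code As Long

    s = CStr(ActiveCell.Value)
    For i = 1 To Len(s)
        ' AscW returns a signed Integer, so mask it into the range 0..65535
        code = AscW(Mid$(s, i, 1)) And &HFFFF&
        If code > 127 Then
            Debug.Print "Position " & i & ": ChrW(" & code & ")"
        End If
    Next i
End Sub
```

For the sample string in the question, this should report position 11 with ChrW(9675), which is the value used in the macro above.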

Related

How to check the value of Σ character in an Excel Cell? - VBA

I have a cell in Excel that contains Σh. I am not sure how to check for it in VBA. I have tried with .Value and .Text, but the check is never true.
If (tRange.Value = (ChrW(931) + "h")) Then
    Exit Sub
End If
When debugging, ActiveCell.Value shows as: Sh
(a) You have to use .Value to get the content of the cell.
(b) You should use the ampersand character (&) to concatenate strings in VBA. The plus-sign works also, but only if all operands are strings.
(c) ChrW(931) & "h" (or ChrW(931) + "h") should work. VBA is able to handle characters even if the VBA-environment cannot show them.
Seems to me that either the Sigma-character is composed with a different character, or your cell contains invisible characters like space, newline, tabs...
You can dump the content of the cell with the following code to get an idea why your If-statement fails:
Sub DumpString(s As String)
    Dim i As Long
    For i = 1 To Len(s)
        Dim c As String
        c = Mid(s, i, 1)
        Debug.Print i, AscW(c), c
    Next
End Sub
When you enter the following command into the immediate window, you will see output like this:
DumpString activecell.Value
1 931 S
2 104 h
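If the dump shows extra characters around the text you expect (spaces, non-breaking spaces, line breaks, tabs), one option is to strip them before comparing. This is a rough sketch: CleanForCompare is a hypothetical helper name, and the list of characters it removes is an assumption you may need to extend.

```vba
' Hypothetical helper: removes common invisible characters so that
' a plain "=" comparison is not defeated by stray whitespace.
Function CleanForCompare(ByVal s As String) As String
    s = Replace(s, ChrW(160), " ")   ' non-breaking space -> normal space
    s = Replace(s, vbCr, "")         ' carriage return
    s = Replace(s, vbLf, "")         ' line feed
    s = Replace(s, vbTab, "")        ' tab
    CleanForCompare = Trim$(s)       ' drop leading and trailing spaces
End Function
```

It could then be used as: If CleanForCompare(tRange.Value) = ChrW(931) & "h" Then ...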
This should check whether the cell value contains the substring 'Σh':
If tRange.Value Like "*" & ChrW(931) & "h*" Then
    Exit Sub
End If
Another, maybe simpler, way for some folks:
If InStr(1, tRange.Value, ChrW(931) & "h") <> 0 Then
    Exit Sub
End If
You have to use an ampersand to join the two characters:
If (tRange.Value = ChrW(931) & "h") Then
    Exit Sub
End If

Removing text with certain pattern in cell

I want to remove parts of the text in a cell that match patterns such as [cid:image003.gif#01D863CC.CAE51sd0] and [https://xxxx=0].
They may appear several times in each cell, at random positions.
I read some material (Remove Text Within Cell Starting with String and Ending with Character), but I have no idea how to write the loop and process the text line by line within a cell.
I prepared 2 samples.
Sample A:
Hi xxx,
This is Ken
[cid:image003.gif#01D863CC.CAE51sd0]
[https://xxxx=0]
[cid:imagedddd0]
Expected:
Hi xxx,
This is Ken
Sample B:
[cid:image003.gif#01D863CC.CAE51sd0]
[https://xxxx=0]
Hi xxx,
This is Ken
[cid:imagedddd0]
Expected:
Hi xxx,
This is Ken
If I understand you correctly, you want to process any cell that contains the characters "[" and "]", removing both brackets and everything between them.
Example data in the active sheet:
The cells with that kind of data (highlighted in yellow) are scattered around the active sheet.
If your data is similar to the image above, and the image below is your expected result after running the sub:
then the sub is something like this:
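Sub test()
    Dim c As Range
    Dim pos1 As Long, pos2 As Long
    Do
        ' Loop-A: find the next cell that still contains "["
        Set c = ActiveSheet.UsedRange.Find("[", LookAt:=xlPart)
        If Not c Is Nothing Then
            Do
                ' Loop-B: strip one "[...]" block per pass
                pos1 = InStr(c.Value, "["): If pos1 = 0 Then Exit Do
                pos2 = InStr(pos1, c.Value, "]")   ' first "]" after the "["
                If pos2 = 0 Then Exit Sub          ' unmatched "[": stop instead of failing in Mid
                c.Replace What:=Mid(c.Value, pos1, pos2 - pos1 + 1), Replacement:="", LookAt:=xlPart
            Loop
        End If
    Loop Until c Is Nothing
End Sub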
There are two loops in the sub.
Loop-A finds any cell in the active sheet that contains the "[" character and assigns it to the variable c.
Loop-A stops when no cell containing "[" is left.
Loop-B processes the found cell for as long as it contains "[".
Loop-B stops when the found cell has no more "[" characters.
Inside Loop-B, the sub stores the position of "[" in pos1 and the position of "]" in pos2. It then replaces the "[", the "]", and whatever text lies between them in the found cell (the variable c) with nothing ("").
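The same Loop-B idea can also be applied to a plain string instead of going through Range.Replace. The sketch below is not part of the original answer; StripBracketed is a made-up name, and it assumes an unmatched "[" should simply be left in place.

```vba
' Hypothetical helper: removes every "[...]" block from a string.
Function StripBracketed(ByVal s As String) As String
    Dim pos1 As Long, pos2 As Long
    Do
        pos1 = InStr(s, "[")
        If pos1 = 0 Then Exit Do          ' no more opening brackets
        pos2 = InStr(pos1, s, "]")        ' closing bracket after the opening one
        If pos2 = 0 Then Exit Do          ' unmatched "[": leave the rest untouched
        ' keep the text before "[" and the text after "]"
        s = Left$(s, pos1 - 1) & Mid$(s, pos2 + 1)
    Loop
    StripBracketed = s
End Function
```

It could be used as c.Value = StripBracketed(c.Value). Note that it does not remove the empty lines left behind where a bracketed block stood on a line of its own.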
After seeing the sample data, I think it's better to do this in MS Word. So I searched the internet for how to write VBA in Word. I'm not sure the syntax is entirely correct, but the code below (in a Word VBA module) seems to work as expected.
Sub test()
    Dim pos1 As Long, pos2 As Long
    Dim txt As String, slice As String, rpl As String
    Do
        pos1 = InStr(ActiveDocument.Content, "[")
        If pos1 = 0 Then Exit Do
        pos2 = InStr(ActiveDocument.Content, "]")
        txt = Mid(ActiveDocument.Content, pos1, pos2 - pos1 + 1)
        If Len(txt) > 250 Then
            slice = Left(txt, 250): rpl = "["
        Else
            slice = txt: rpl = ""
        End If
        With ActiveDocument.Content.Find
            .Execute FindText:=slice, ReplaceWith:=rpl, _
                Format:=True, Replace:=wdReplaceAll
        End With
    Loop
End Sub
The process is similar to the one in Excel. The difference is that this sub checks whether the text to be removed (the txt variable) is longer than 250 characters, presumably because Word's Find text is limited to about 255 characters. If it is, the sub puts the first 250 characters into the slice variable and uses "[" as the replacement (the rpl variable), so the remainder is picked up on the next pass.
If txt is not longer than 250 characters, slice is the same as txt and rpl is simply nothing ("").
WARNING:
On my computer, the sub takes almost 2 minutes to finish the job in MS Word with the data from column A of your Excel sample sheet.

Adding a space between two words once

I completed code to remove any data in front of a string, add some text (with a space) to the front and store it back in the cell.
However, every time I run the macro (to check if changes that I've made are working for example), a new space is added in between the words.
The code below removes anything before the name and adds the required string. I call InStr and store the result in the integer pos. Note that this runs in a loop over a specific range.
If pos > 0 Then
    'Removes anything before the channel name
    cellValue.Offset(0, 2) = Right(cell, Len(cell) - InStr(cell, pos) - 2)
    'Add "DA" to the front of the channel name
    cellValue.Offset(0, 0) = "DA " & Right(cell, Len(cell) - InStr(cell, pos) - 2)
    'Aligns the text to the right
    cellValue.Offset(0, 2).HorizontalAlignment = xlRight
End If
An additional "DA" is not being added and I haven't made any other functions to add spaces anywhere. The extra space is not added if adding "DA " is changed to "DA".
I'd prefer not to add another function/sub/something somewhere to search and remove any extra spaces.
What the string is AND what is in front of the string is unknown. It could be numbers, characters, spaces or exactly what I want it to be. For example, it could be "Q-Quincey", "BA Bob", "DA White" etc. I thought that searching through the cell for the string I want (Quincey, Bob, White) and altering the cell as needed would be the best way.
Solution that you all helped me come up with:
If pos > 0 Then
    modString = Right(cell, Len(cell) - InStr(cell, pos) - 2)
    'Removes anything before the channel name and places it in the last column
    cellValue.Offset(0, 2) = modString
    'Aligns the last column text to the right
    cellValue.Offset(0, 2).HorizontalAlignment = xlRight
    cellValue.Offset(0, 2).Font.Size = 8
    'Add "DA" to the front of the channel name in the rightmost column
    If StartsWith(cell, "DA ") = True Then
        cellValue.Replace cell, "DA" & modString
    Else
        cellValue.Replace cell, "DA " & modString
    End If
End If
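Note that StartsWith is not a built-in VBA function, so the solution above presumably relies on a helper defined elsewhere in the asker's project. A minimal sketch of what such a helper might look like, assuming a case-sensitive comparison is wanted:

```vba
' Hypothetical helper: True when s begins with prefix (case-sensitive).
Function StartsWith(ByVal s As String, ByVal prefix As String) As Boolean
    ' vbBinaryCompare makes the result independent of any Option Compare setting
    StartsWith = (StrComp(Left$(s, Len(prefix)), prefix, vbBinaryCompare) = 0)
End Function
```

With this definition, StartsWith("DA White", "DA ") is True and StartsWith("BA Bob", "DA ") is False.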
Maybe this is something you can work with.
Sample code:
Sub Test()
    With Sheet1.Range("A1:A4")
        .Replace "*quincey", "AD Quincey"
    End With
End Sub
In your examples, it seems you want to replace the first "word" in the string with something else. If that is always the case, the following function, which makes use of Regular Expressions, can do that:
Option Explicit
Function replaceStart(str As String, replWith As String) As String
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = False
        .MultiLine = True
        .Pattern = "^\S+\W(?=\w)"
        replaceStart = .Replace(str, replWith)
    End With
End Function
Sub test()
    Debug.Print replaceStart("Q-Quincy", "DA ")
    Debug.Print replaceStart("BA Bob", "DA ")
    Debug.Print replaceStart("DA White", "DA ")
End Sub
Debug.Print outputs:
DA Quincy
DA Bob
DA White
The regular expression matches everything up to but not including the first "word" character that follows a non-word character. This should be the second word in the string.
A "word" character is anything in the set of [A-Za-z0-9_]
Seems to work on the examples you present.
If you wanted to go about it through a loop, you should remove some redundancies in your code. For instance, referring to cell.Offset(0,0) doesn't make sense.
I would set the target cells to a range and simply edit each cell without placing the unwanted strings in another cell.
EDIT:
I'd try something like this.
nameiwant = "Quincy"
Set cell = Range("A1")
If InStr(cell, nameiwant) > 0 And Left(cell, 3) <> "DA " Then
    cell.Value = "DA " & nameiwant
End If

Keeping Specific Text line in a cell for specific starting letters with excel vba

I have multiple text lines in a cell in column A. I want to keep only the one line that starts with specific letters (see Picture 1). For example, first I would like to check whether the cell has a line starting with "MB". If it does, I would like to keep only that line. If it doesn't, it should search in turn for the letters "SA", then "PQ", and so on. I am trying to implement this in Excel VBA.
I have found some clues using built-in functions. For example:
*Remove everything after the first comma
=LEFT(A1,FIND(",",A1)-1)
*Remove everything before the second occurrence comma
=RIGHT(SUBSTITUTE(A1, ",", CHAR(9), 2), LEN(A1)- FIND(CHAR(9), SUBSTITUTE(A1, ",", CHAR(9), 2), 1) + 1)
However, those are not the solution I am looking for. I would highly appreciate it if anyone could help me.
Regards,
Oliver
Google-Sheets
I know you didn't ask about Google Sheets, but in this case it could be a nice way out too. (I'm no GS expert whatsoever, but I tried something that seems interesting for you.)
Formula in B2:
=IFERROR(ARRAYFORMULA(TRANSPOSE(REGEXEXTRACT(A2,"\b"&TRANSPOSE($B$1:$D$1)&"[^\s]+"))),"")
This way you can extend the parts you are interested in (extend the range B1:D1), and it will give you all the regex matches available in the input.
Drag the formula down.
Excel
Within Excel you could use a UDF with a regular expression; here is a quick example:
Function GetRegEx(str As String) As String
    Dim regex As Object, matches As Object
    Dim MyArr() As String
    Dim X As Long
    Set regex = CreateObject("VBScript.RegExp")
    ' Prefixes are tried in this order; reorder the list to match your priority
    MyArr = Split("MB,PQ,SA", ",")
    For X = LBound(MyArr) To UBound(MyArr)
        With regex
            .Pattern = "\b" & MyArr(X) & "[^\s]+"
            .Global = True
        End With
        Set matches = regex.Execute(str)
        If matches.Count > 0 Then
            GetRegEx = matches(0).Value
            Exit Function
        End If
    Next X
End Function
Call it like =getregex(A2) and drag down...
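Because the pattern stops at the first whitespace, GetRegEx returns only the token that starts with the prefix, not the rest of the line. If the whole line is wanted, as the question describes, a non-regex alternative could look like the sketch below. KeepLineByPrefix is a hypothetical name, and it assumes the lines inside the cell are separated by line feeds (Alt+Enter).

```vba
' Hypothetical UDF: returns the first line of cellText that starts with
' one of the comma-separated prefixes, tried in the order given.
' Returns an empty string when no line matches.
Function KeepLineByPrefix(ByVal cellText As String, ByVal prefixList As String) As String
    Dim prefixes() As String, parts() As String
    Dim p As Long, i As Long

    prefixes = Split(prefixList, ",")
    ' Drop carriage returns, then split the cell text into lines
    parts = Split(Replace(cellText, vbCr, ""), vbLf)

    For p = LBound(prefixes) To UBound(prefixes)
        For i = LBound(parts) To UBound(parts)
            If Left$(Trim$(parts(i)), Len(prefixes(p))) = prefixes(p) Then
                KeepLineByPrefix = Trim$(parts(i))
                Exit Function
            End If
        Next i
    Next p
End Function
```

Call it like =KeepLineByPrefix(A2,"MB,SA,PQ") and drag down; the prefix order in the second argument is the search priority.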

how to remove "-" and "/" characters from excel sheet

I want to remove "-" and "/" from numbers such as 011-2729729 and 011/2729729, converting them to 0112729729 in Excel. I tried the SUBSTITUTE function but could not get the correct result.
Already attempted formula: =SUBSTITUTE(A1,"/"," ",4)
If you truly prefer a formula over a macro, use this; otherwise, Gary's Student provided a nice macro.
If you are specifically searching position 4, then use REPLACE rather than SUBSTITUTE, with a simple IF check at the start to see whether position 4 is a "/" or "-":
=IF(OR(MID(A1,4,1)="/",MID(A1,4,1)="-"),REPLACE(A1,4,1,""),A1)
Notes:
SUBSTITUTE is great when you want to replace certain text with other text
REPLACE is great when you want to replace a certain position with other text
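The difference between the two functions can be tried out from VBA through Application.WorksheetFunction. This is only an illustrative sketch; the sub name and the sample value are made up.

```vba
' Illustrative only: the same worksheet functions, called from VBA.
Sub CompareSubstituteAndReplace()
    Dim s As String
    s = "011-2729729"

    ' SUBSTITUTE works by text: every "-" is removed, wherever it is
    Debug.Print Application.WorksheetFunction.Substitute(s, "-", "")

    ' REPLACE works by position: 1 character starting at position 4 is removed
    Debug.Print Application.WorksheetFunction.Replace(s, 4, 1, "")
End Sub
```

Both lines print 0112729729 for this input. They would differ if the separator sat at another position or occurred more than once.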
Sometimes removing "/" and "-" will create a string that Excel will treat as a number, and you can lose leading zeros. This small macro will fix the cells "in place":
Sub FixValues()
    Dim r As Range, v As String
    For Each r In ActiveSheet.UsedRange
        v = r.Text
        If InStr(v, "-") > 0 Or InStr(v, "/") > 0 Then
            r.NumberFormat = "@"   ' Text format, so leading zeros are kept
            r.Value = Replace(Replace(v, "-", ""), "/", "")
        End If
    Next r
End Sub
