Dealing with special characters in String - string

I am downloading some files which have some chineses characters sometimes and apparently String doesn't recognize them.
Any ideas on how I could tell VBA the following :
If there are some unknown characters in the filename then delete them and only keep the first part of the filename which contains normal characters.
Actually each of those special characters will be replaced by a "?".
But the problem is that Msgbox InStr(1, AttachmentName, "?") will return 0 even though MsgBox AttachmentName will display some "?".
I did the following, but as I said above, the "?" are displayed on MsgBox but not truly there so it never satisfies the condition...
If InStr(1, AttachmentName, "?") <> 0 Then
AttachmentName = Mid(AttachmentName, 1, InStr(1, AttachmentName, "?") - 1) & "unknown characters "
End If

This sub removes all Chinese characters from a string.
Private Sub RemoveChinese()
Dim Fun As String
Dim Txt As String
Dim Ch As String
Dim n As Integer
Txt = Selection.Text
For n = 1 To Len(Txt)
Ch = Mid(Txt, n, 1)
If Asc(Ch) = AscW(Ch) Then Fun = Fun & Ch
Next n
MsgBox Fun
End Sub
The point is that Chinese characters are represented by 2 bytes whereas it takes only one to write a Latin character. You must have Chinese language support installed on your computer in order to be able to actually depict them. Hence the ? inserted for unrecognised characters.

Related

How to replace certain character in a string

I am trying to replace not each space in a single string with line break. String is taken from specific cell, and looks like:
Now, Im trying to replace each space after abbreviation to line break. The abbreviation can be any, so the best way for precaching which space I intend to replace is like: each space after number and before a letter?
The output I want to get is like:
Below is my code, but it will change every space to line break in cell.
Private Sub Workbook_SheetChange(ByVal Sh As Object, ByVal Target As Range)
On Error GoTo Exitsub
If Not Intersect(Target, .Columns(6)) Is Nothing Then
Application.EnableEvents = False
Target.Value = Replace(Target, " ", Chr(10))
End If
Application.EnableEvents = True
Exitsub:
Application.EnableEvents = True
End Sub
You can try
Target.Value = Replace(Target, "kg ", "kg" & Chr(10))
If you can have other abbreviations like "g" or "t", do something similar for them (maybe in a Sub), just be cautious with the order (replace first "kg", then "g")
Update: If you don't know in advance the possible abbreviations, one attempt is to use regular expressions. I'm not really good with them, but the following routine seems to do:
Function replaceAbbr(s As String) As String
Dim regex As New RegExp
regex.Global = True
regex.Pattern = "([a-z]+) "
replaceAbbr = regex.Replace(s, "$1" & Chr(10))
End Function
The below will replace every 2nd space with a carriage return. For reason unknown to me The worksheet function Replace will work as intended, but the VBA Replace doesnt
This will loop through every character in the defined area, you can change this to whatever you want.
The if statement is broken down as such
(SpaceCount Mod 2) = 0 this part is what enable it to get every 2nd character.
As a side note (SpaceCount Mod 3) = 0 will get the 3rd character and (SpaceCount Mod 2) = 1 will do the first character then every other character
Cells(1, 1).Characters(CountChr, 1).Text = " " is to make sure we are replacing a space, if the users enters something funny that looks like a space but isn't, that's on them
I believe something like this will work as intended for you
Private Sub Workbook_SheetChange(ByVal Sh As Object, ByVal Target As Range)
On Error GoTo Exitsub
Application.EnableEvents = False
For CountChr = 1 To Len(Target.Value)
If Target.Characters(CountChr, 1).Text = " " Then
Dim SpaceCount As Integer
SpaceCount = SpaceCount + 1
If (SpaceCount Mod 2) = 0 Then
Target.Value = WorksheetFunction.Replace(Target.Value, CountChr, 1, Chr(10))
End If
End If
Next CountChr
Application.EnableEvents = True
Exitsub:
Application.EnableEvents = True
End Sub
Identify arbitrary abbreviation first
"abbreviations aren't determined ..."
Knowing the varying abbreviation which, however is the same within each string (here e.g. kg ) actually helps following the initial idea to look at the blanks first: but instead of replacing them all by vbLf or Chr(10), this approach
a) splits the string at this " " delimiter into a zero-based tmp array and immediately identifies the arbitrary abbreviation abbr as second token, i.e. tmp(1)
b) executes a negative filtering to get the numeric data and eventually
c) joins them together using the abbreviation which is known now for the given string.
So you could change your assignment to
'...
Target.Value = repl(Target) ' << calling help function repl()
Possible help function
Function repl(ByVal s As String) As String
'a) split into tokens and identify arbitrary abbreviation
Dim tmp, abbr As String
tmp = Split(s, " "): abbr = tmp(1)
'b) filter out abbreviation
tmp = Filter(tmp, abbr, Include:=False)
'c) return result string
repl = Join(tmp, " " & abbr & vbLf) & abbr
End Function
Edit // responding to FunThomas ' comment
ad a): If there might be missing spaces between number and abbreviation, the above approach could be modified as follows:
Function repl(ByVal s As String) As String
'a) split into tokens and identify arbitrary abbreviation
Dim tmp, abbr As String
tmp = Split(s, " "): abbr = tmp(1)
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'b) renew splitting via found abbreviation (plus blank)
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tmp = Split(s & " ", abbr & " ")
'c) return result string
repl = Join(tmp, abbr & vbLf): repl = Left(repl, Len(repl) - 1)
End Function
ad b): following OP citing e.g. "10 kg 20 kg 30,5kg 15kg 130,5 kg" (and as already remarked above) assumption is made that the abbreviation is the same for all values within one string, but can vary from item to item.

How do I make IsNumeric check the whole string/variable instead of just the first character?

I am using IsNumeric to check if a part of a variable are numbers or not. Unfortunately it only seems to check the first character of the string part instead of the whole bit.
It currently accepts i.e. Q123 1234567 and QWER 1QWERTYR (and other varients of that). While I need the first 4 characters to be all letters and the others to be all numbers.
I have no idea what I am missing still. Please add extra comments if at all possible, my understanding of vba is below basic still.
Dim ConNr As String
Dim Space As String
Dim Four As String
Dim Six As String
Dim One As String
Dim Container As String
ConNr = Me.txtContainer.Value
Space = " "
Four = Left(Me.txtContainer.Value, 4)
Four = UCase(Four)
Six = Mid(Me.txtContainer.Value, 5, 6)
One = Right(Me.txtContainer.Value, 1)
'Check if all 4 are letters
If IsNumeric(Four) = True Then
MsgBox "First 4 need to be letters."
Me.txtContainer.SetFocus
Exit Sub
Else
'MsgBox "Four Letters " + Four
'Check if 6 characters are numbers
If IsNumeric(Six) = False Then
MsgBox "4 Letters followed by 6 numbers."
'MsgBox "These Six " + Six
Me.txtContainer.SetFocus
Exit Sub
Else
'MsgBox "Six Numbers " + Six
'Last number is number
If IsNumeric(One) = False Then
MsgBox "Last character needs to be a number."
Me.txtContainer.SetFocus
Exit Sub
Else
'MsgBox "Last Number " + One
ConNr = Four & Space & Six & Space & One
Container = ConNr
End If
End If
End If
Edit based on JvdV
When I tried "[A-Za-z][A-Za-z][A-Za-z][A-Za-z] ###### #" the output was empty.
I dont want to force the user to use the correct format. (Caps, spaces.) But the 4 letters/7 numbers are required.
Dim ConNr As String: ConNr = Me.txtContainer.Value
If ConNr Like "[A-Za-z][A-Za-z][A-Za-z][A-Za-z]#######" Then ‘Without spaces, else it doesn’t post.
Container = UCase(ConNr)
Else
MsgBox "YOU FAILED."
Me.txtContainer.SetFocus
Exit Sub
End If
‘Output should become ASDF 123456 7. Currently gives me ASDF1234567.
As per my comment, hereby a simple sample code to demonstrate the use of the Like operator:
Sub Test()
Dim str As String: str = "QWER 1234567"
Dim arr As Variant: arr = Split(str, " ")
If arr(0) Like "[A-Z][A-Z][A-Z][A-Z]" And IsNumeric(arr(1)) Then
Debug.Print str & " is passed!"
End If
End Sub
Btw, if you want to allow for upper- and lowercase you could use: [A-Za-z][A-Za-z][A-Za-z][A-Za-z]
Edit
If you looking for a pattern of 4 alphabetic chars, then a space, then 6 digits, you can even do something more simplistic:
Sub Test()
Dim str As String: str = "QWER 123456"
If str Like "[A-Z][A-Z][A-Z][A-Z] ######" Then
Debug.Print str & " is passed!"
End If
End Sub
Extend the expression if you want to include another space/digit. You are talking about:
"ConNr = Four & Space & Six & Space & One"
So [A-Z][A-Z][A-Z][A-Z] ###### # would work for you in that case.
As per your comment, you don't want to force a specific format on the users, as long as they have 4 alpha and 7 numeric characters in their string. In any form.
So I figured, since there are so many places to put spaces, it's best to get rid of them using Application.Substitute. Your code might look like:
If Application.Substitute(Me.txtContainer.Value, " ", "") Like "[A-Za-z][A-Za-z][A-Za-z][A-Za-z]#######" Then
Debug.Print str & " is passed!"
End If
If you don't want to forec upper cases but want to return it nonetheless then use the UCase function to cap the whole string at once!
Debug.Print UCase(Application.Substitute(Me.txtContainer.Value, " ", ""))
It's hard to hide the fact that this resembles RegEx a lot.
In this solution approval of the contract number format is provided by a function that returns True if the number is good, or False. If the number isn't good the function tells what's wrong with it. If found acceptable the calling procedure gets on with the program. Note that the function accommodates missing or extra spaces and converts lower case letters to upper.
Option Explicit
Private Sub TestConNumber()
Dim ConNr As String
' ConNr = Me.txtContainer.Value
ConNr = "QAAK 781234 x"
If GetConNumber(ConNr) Then
MsgBox "The Contract number is " & ConNr
End If
End Sub
Private Function GetConNumber(ConNr As String) As Boolean
' return Not True if incorrect
Dim Fun As Boolean ' function return value
Dim Nr As String
Dim Msg As String
Dim Arr(1 To 3) As String
Nr = UCase(Replace(ConNr, " ", ""))
If Len(Nr) = 11 Then
Arr(1) = Left(Nr, 4)
If Arr(1) Like "[A-Z][A-Z][A-Z][A-Z]" Then
If IsNumeric(Right(Nr, 7)) Then
Arr(2) = Mid(Nr, 2, 6)
Arr(3) = Right(Nr, 1)
ConNr = Join(Arr)
Fun = True
Else
Msg = "The last 7 digits must be numbers."
End If
Else
Msg = "The first 4 characters must be non-numeric"
End If
Else
Msg = "Input must have 11 characters"
End If
If Not Fun Then
MsgBox Msg, vbExclamation, "Wrong input"
End If
GetConNumber = Fun
End Function

Swapping delimited strings in an excel column

I have a column in a very large excel spreadsheet that is in some cases incorrectly formatted. It should contain first a street address, then a name, separated by a hyphen, as shown:
123 Main St-Smith
However, some are formatted in reverse, such as:
Jones-231 High St
All the addresses start with a numeric and all the names start with an alpha. I am looking for a macro or code that would swap only the name and address where it is incorrectly formatted. I have tried turning it into a comma delimited to separate them out, but since they only occur intermittently I am still left with fixing them one by one manually.
Any suggestions? I am by no means an Excel macro expert. Thanks!
Split the string on the hyphen then look for spaces in the second element.
dim i as long, tmp as variant
with worksheets("sheet1")
for i = 2 to .cells(.rows.count, "a").end(xlup).row
tmp = split(.cells(i, "a").value2, "-")
if cbool(instr(1, tmp(1), " ")) then _
.cells(i, "a") = join(array(tmp(1), tmp(0)), "-")
next i
end with
As you wrote
Street name is any string that begins with a digit and ends with either a hyphen or the end of the string
Name is any string that starts with a non-digit and ends with either a hyphen or the end of the string
This can be done using just native VBA, (although at first I was going to use Regular Expressions)
split on the hyphen
rearrange depending on if first starts with a number or not
do some error checking in case no hyphen present or don't have the number and non-number start as specified.
Option Explicit
Function fmtAddressName2(S As String) As String
Dim sAddr As String, sName As String
Dim v As Variant
v = Split(S, "-")
On Error GoTo badFormat
If IsNumeric(Left(v(0), 1)) And Not IsNumeric(Left(v(1), 1)) Then
sAddr = v(0)
sName = v(1)
ElseIf Not IsNumeric(Left(v(0), 1)) And IsNumeric(Left(v(1), 1)) Then
sAddr = v(1)
sName = v(0)
Else
GoTo badFormat
End If
fmtAddressName2 = sAddr & "-" & sName
Exit Function
badFormat:
'return unchanged string
fmtAddressName2 = S
'or could return an error message
End Function

vba search replace character

I'm trying to prepare a spreadsheet for a report in excel vba. Unforturnately there are some wierd characters here that need to be replaced. Easy enough, except for this chracter:
¦
I can't seem to be able to paste that character into the editor into a string replace function. When I try, the output is _. I then thought to refer to it by it's Chr code. A quick look up said it was Chr(166). http://www.gtwiki.org/mwiki/?title=VB_Chr_Values
Replace(s, "â€" + Chr(166), "...")
But this is not that character at all (at least on Mac excel). I tried:
For i = 1 To 255
Debug.Print Chr(i)
Next i
And I didn't see this character anywhere. Does anyone know how I can reference this character in vba code in order to replace it?
Not sure if regexp is available for vba-mac, but you could simplify your existing code greatly as below.
Uses a sample Strin
Dim strIn As String
strIn = "1â€1â€x123"
Do While InStr(strIn, "â€") > 0
Mid$(strIn, InStr(strIn, "â€"), 3) = "..."
Loop
Click on a cell containing your miscreant character and run this small macro:
Sub WhatIsIt()
Dim s As String, mesage As String
Dim L As Long
s = ActiveCell.Text
L = Len(s)
For i = 1 To L
ch = Mid(s, i, 1)
cd = Asc(ch)
mesage = mesage & ch & " " & cd & vbCrLf
Next i
MsgBox mesage
End Sub
It should reveal the characters in the cell and their codes.
It's dirty, but here's the workaround that I used to solve this problem. I knew that my issue character was always after "â€", so the idea was to replace the character that came after those 2. I don't really know how to replace a character at a position in a string, so my idea was to covert the string to an array of characters and replace the array at those specific indexes. Here's what it looks like:
Do While InStr(s, "â€") > 1
num2 = InStr(s, "â€")
arr = stringToArray(s)
arr(num2 - 1) = "<~>"
arr(num2) = "<~>"
arr(num2 + 1) = "<~>"
s = Replace(arrayToString(arr), "<~><~><~>", "...")
Loop
...
Function stringToArray(ByVal my_string As String) As Variant
Dim buff() As String
ReDim buff(Len(my_string) - 1)
For i = 1 To Len(my_string)
buff(i - 1) = Mid$(my_string, i, 1)
Next
stringToArray = buff
End Function
Function arrayToString(ByVal arr As Variant) As String
Dim s As String
For Each j In arr
s = s & j
Next j
arrayToString = s
End Function
In practice, what I replaced those indexes with is something that had to be unique but recognizable. Then i can replace my unique characters with whatever I want. There are sure to be edge cases, but for now it gets the job done. stringToArray function pulled from: Split string into array of characters?

Capitalise first letter of words without changing currently Capitalised

Ive got some items of text in Excel and id like to capitalise the first letter of each word. However, a lot of text contains the phrase 'IT' and using current capitalisation methods (PROPER) it changes this to 'It'. Is there a way to only capitalise the first letter of each word without DE capitalising the other letters in each word?
Here is a VBA way, add it to a module & =PrefixCaps("A1")
Public Function PrefixCaps(value As String) As String
Dim Words() As String: Words = Split(value, " ")
Dim i As Long
For i = 0 To UBound(Words)
Mid$(Words(i), 1, 1) = UCase$(Mid$(Words(i), 1, 1))
Next
PrefixCaps = Join(Words, " ")
End Function
Used the website http://www.textfixer.com/tools/capitalize-sentences.php and pasted it all in instead
That was all a bit complicated, but I did find if your spreadsheet is pretty simple, you can copy and paste it into word and use it's editing features and then copy and paste that back in to Excel. Worked quite well for me.
Fixes double spaces in the text:
Public Function PrefixCaps(value As String) As String
Dim Words() As String: Words = Split(value, " ")
Dim i As Long
For i = 0 To UBound(Words)
If Len(Words(i)) > 0 Then
Mid$(Words(i), 1, 1) = UCase$(Mid$(Words(i), 1, 1))
End If
Next
PrefixCaps = Join(Words, " ")
End Function`

Resources