extracting from a string without delimiters excel - string

This site has been a veritable treasure chest of answers and ideas to many of my vba problems in the past, but i have not been able to find any concerning what i am sure is for many, if not most, here in this forum a simple task. I have to deal with a lot of xml report files that all have a header string and my problem is how to parse the string for the nuggest i require for my macro.
This is a sample string:
<Function IDREF="TST_RxRccsMatrix_Rx64" Start="2011-04-07T14:21:35.593000+02:00" Status="Success" Tags="SystemSerialNumber:41009" End="2011-04-07T14:29:16.625000+02:00">
I need to extract
- the report type: TST_RxRccsMatrix (length of this string is not constant)
- the start date-time stamp: 2011-04-07T14:21:35.593000+02:00 (length is constant)
- the serial number: 41009 (length is constant)
I have tried methods using Split and InStr and Find but none produce the desired results for all three extractions.
I truely appreciate any help on this!

The old fashion way is to use instr to find beginning. Then use instr to find ending. Then use mid to suck it out.
Begin = instr(1,xmlstring,"IDREF=") + Len("IDREF")
'look for first space after IDREF= in string
End = instr(Begin, xmlstring, " ")
Report = mid(xmlstring, begin, end - begin)
I didn't test it.
But I's split on space, then go through the array splitting on =. That will give you an array of 2 element arrays with value name in (0) and value in (1).
But xml has it's own query language and libraries to access stuff.
This is some code splitting a command line and then splitting 320x200 into 300 and 200.
CmdLine = Command()
A = Split(CmdLine, Chr(32), 2, 1)
B = Split(A(0), "x", 2, 1)

xmlstring = "<Function IDREF=""TST_RxRccsMatrix_Rx64"" Start=""2011-04-07T14:21:35.593000+02:00"" Status=""Success"" Tags=""SystemSerialNumber:41009"" End=""2011-04-07T14:29:16.625000+02:00"">"
Set regEx = New RegExp
regEx.Pattern = "IDREF=""([a-z0-9_]+)"""
regEx.IgnoreCase = True
regEx.Global = True
Set Matches = regEx.Execute(xmlstring)
If Matches.count <> 1 then msgbox "no match or too many"
For Each Match in Matches
Msgbox match.submatches(0)
Next
I answered your qustions. The other person deleted two easier ways of doing it.
Ask Oded to put back my explanation of this code. And to restore the MS tutorial on how to do it with XML DOM objects. I showed FOUR ways.

After some polishing:
Private Sub GetFileInfo()
Dim fso As New FileSystemObject, strText As Variant, i As Integer
Dim X(0 To 2) As String, Y(0 To 2) As String, B, E As Variant
'get header string from xml file
'FName (file name) was ascertained by a previous sub and is made public
Set strText = fso.OpenTextFile(FName, ForReading, False)
'header string is in second (i = 2) line of file
For i = 1 To 2: [A1] = strText.ReadLine: Next: strText.Close: Set fso = Nothing
'User Oded's search and extract routine
X(0) = "IDREF=": X(1) = "Start=": X(2) = "Tags="
For i = LBound(X(), 1) To UBound(X(), 1)
B = InStr(1, [A1], X(i)) + Len(X(i)) + 1 ' + 1 includes trailing " character
E = InStr(B, [A1], " ") - 1 ' - 1 includes leading " character
'required if a search string in X() is at the end of the header which ends with a ">"
If (InStr(B, [A1], " ") - 1) < 0 Then E = InStr(B, [A1], ">")
Y(i) = Mid([A1], B, E - B)
Next
[D1] = "Test = " & Y(0)
[D2] = "Tested on : " & Left(Y(1), 10) & " at " & Mid(Y(1), 12, 8)
[D2] = [D2] & " - " & Y(2)
End Sub

Related

How to format strings with numbers and special characters in Excel or Access using VBA?

I have a mathematical problem: these five strings are IDs for the same object. Due to these differences, objects appear multiple times in my Access table/query. Although there are a lot of these mutations, but I take this as an example.
76 K 6-18
76 K 6-18(2)
0076 K 0006/ 2018
0076 K 0006/2018
76 K 6/18
How would the VBA-code have to look like to recognize that these numbers stand for the same thing , so a general formatting with "RegEx()" or "format()" or "replace()"...but they must not only refer to this example but to the kind.
The common factor of these and all other mutations is always the following:
1) includes "-", no zeros left of "-", just 18 an not 2018 (year) at the end.
2) is like the first but with (2) (which can be dropped).
3) includes "/", zeros left of "/", and 2018 as year at the end.
4) is like third, but without space after "/".
5) is like the first one, but with a "/" instead of "-".
Character is always one single "K"! I suppose the best way would be to convert all 5 strings to 76 K 6 18 or in ohter cases for example to 1 K 21 20 or 123 K 117 20 . Is this possible with one elegant code or formula? Thanks
Here is a fun alternative using a rather complex but intuitive regular expression:
^0*(\d+) (K) 0*(\d+)[-\/] ?\d{0,2}(\d\d)(?:\(\d+\))?$
See an online demo
^ - Start line anchor.
0* - 0+ zeros to catch any possible leading zeros.
(\d+) - A 1st capture group of 1+ digits ranging 0-9.
- A space character.
(K) - 2nd Capture group capturing the literal "K".
- A space character.
(\d+) - A 3rd capture group of 1+ digits ranging 0-9.
[-\/] - Character class of either a hyphen or forward slash.
? - An optional space character.
\d{0,2} - 0-2 digits ranging from 0-9.
(\d\d) - A 4th capture group holding exactly two digits.
(?:\(\d+\))? - An optional non-capture group holding 1+ digits inside literal paranthesis.
$ - End line anchor.
Now just replace the whole string by the 4 capture groups with spaces in between.
Let's test this in VBA:
'A code-block to call the function.
Sub Test()
Dim arr As Variant: arr = Array("76 K 6-18", "76 K 6-18(2)", "0076 K 0006/ 2018", "0076 K 0006/2018", "76 K 6/18")
For x = LBound(arr) To UBound(arr)
Debug.Print Transform(CStr(arr(x)))
Next
End Sub
'The function that transform the input.
Function Transform(StrIn As String) As String
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = "^0*(\d+) (K) 0*(\d+)[-\/] ?\d{0,2}(\d\d)(?:\(\d+\))?$"
Transform = .Replace(StrIn, "$1 $2 $3 $4")
End With
End Function
All the elements from the initial array will Debug.Print "76 K 6 18".
Hope it helps, happy coding!
EDIT: If your goal is just to check if your string compiles against the pattern, the pattern itself can be shortened a little and you can return a boolean instead:
'A code-block to call the function.
Sub Test()
Dim arr As Variant: arr = Array("76 K 6-18", "76 K 6-18(2)", "0076 K 0006/ 2018", "0076 K 0006/2018", "76 K 6/18")
For x = LBound(arr) To UBound(arr)
Debug.Print Transform(CStr(arr(x)))
Next
End Sub
'The function that checks the input.
Function Transform(StrIn As String) As Boolean
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = "^0*\d+ K 0*\d+[-\/] ?\d{2,4}(?:\(\d+\))?$"
Transform = .Test(StrIn)
End With
End Function
As #Vincent has suggested, look at using a custom function to convert all of the different data to be consistent. Based on what you have described, the following seems to work:
Function fConvertFormula(strData As String) As String
On Error GoTo E_Handle
Dim astrData() As String
strData = Replace(strData, "/", " ")
strData = Replace(strData, "-", " ")
strData = Replace(strData, " ", " ")
astrData = Split(strData, " ")
If UBound(astrData) = 3 Then
astrData(0) = CLng(astrData(0))
astrData(2) = CLng(astrData(2))
If InStr(astrData(3), "(") > 0 Then
astrData(3) = Left(astrData(3), InStr(astrData(3), "(") - 1)
End If
If Len(astrData(3)) = 4 Then
astrData(3) = Right(astrData(3), 2)
End If
fConvertFormula = Join(astrData, " ")
End If
fExit:
On Error Resume Next
Exit Function
E_Handle:
MsgBox Err.Description & vbCrLf & vbCrLf & "fConvertFormula", vbOKOnly + vbCritical, "Error: " & Err.Number
Resume fExit
End Function
It starts by replacing "field" delimiters with spaces, and then does a replace of double spaces. It then removes any leading zeroes from the first and third elements, if there is a bracket in the last element then delete that part, and finally converts to a 2 digit value before joining it all back up.
You may have other cases that you need to deal with, so I would suggest creating a query with the original data and the data converted by this function, and seeing what it throws out.
This function unifies the given string by the rules you defined in your question:
Public Function UnifyValue(ByVal inputValue As String) As String
'// Remove all from "(" on.
inputValue = Split(inputValue, "(")(0)
'// Replace / by blank
inputValue = Replace(inputValue, "/", " ")
'// Replace - by blank
inputValue = Replace(inputValue, "-", " ")
'// Replace double blanks by one blank
inputValue = Replace(inputValue, " ", " ")
'// Split by blank
Dim splittedInputValue() As String
splittedInputValue = Split(inputValue, " ")
'// Create the resulting string
UnifyValue = CLng(splittedInputValue(0)) & _
" " & splittedInputValue(1) & _
" " & CLng(splittedInputValue(2)) & _
" " & Right(CLng(splittedInputValue(3)), 2)
End Function
It always returns 76 K 6 18 regarding to your sample values.

Faster alternatives to Characters object

I am required to extract passages of text from the contents of Excel cells in which the originator has essentially done a manual Track Changes using Strikethrough font. The passages are identifiable with certain character patterns, but I have to ignore Strikethrough characters to see them. The Strikethrough characters do not appear in regular locations within each cell, so are essentially randomly dispersed with normal font text.
I have achieved my goal using VBA for Excel, but the solution is extremely (and impracticably) slow. Having searched this site and the wider web for answers, it seems the use of the Characters object is to blame.
So my question is: has anyone found a way of parsing such text that does not involve the Characters object?
The sub I wrote to do the parsing is too long to post here, but following is some test code which uses the Characters object in a similar way. This takes 60 s to parse a cell with 3000 characters in it. At that speed, it would take 50 hours to process the entire spreadsheet I've been given.
Private Sub FindLineBreakChars(TargetCell As Excel.Range)
Dim n As Integer
Dim ch As String
Dim st As Boolean
If TargetCell.Cells.Count <> 1 Then
Call MsgBox("Error: more or less than one cell in range specified.")
Else
If IsEmpty(TargetCell.Value) Then
Call MsgBox("Error: target cell is empty.")
Else
If Len(TargetCell.Value) = 0 Then
Call MsgBox("Error: target cell contains an empty string.")
Else
'Parse the characters in the cell one by one.
For n = 1 To TargetCell.Characters.Count
ch = TargetCell.Characters(n, 1).Text
st = TargetCell.Characters(n, 1).Font.Strikethrough
If ch = vbCr Then
Debug.Print "#" & n & ": Carriage Return (vbCr)" & ", strikethrough = " & st & vbCrLf
ElseIf ch = vbLf Then
Debug.Print "#" & n & ": Line Feed (vbLf)" & ", strikethrough = " & st & vbCrLf
End If
Next n
End If
End If
End If
End Sub
You're right, the access to Characters is very slow, so your goal should be to reduce it's usage as much as possible.
I don't understand your requirement details, but the following code should get you an idea how you could speed up the code. It reads the content of a cell only once, split the text into separate lines, calculates the position of the single linefeed characters and look at that position for the formatting. As far as I know there is no way to access the formatting all at once, but now the access to the characters-object is reduced to one per line:
With TargetCell
Dim lines() As String, lineNo As Integer, textLen As Long
lines = Split(.Value2, vbLf)
textLen = Len(lines(0)) + 1
For lineNo = 1 To UBound(lines)
Dim st
st = .Characters(textLen, 1).Font.Strikethrough
Debug.Print "#" & textLen & ": LineFeed (vbLf) strikethrough = " & st
textLen = textLen + Len(lines(lineNo)) + 1
Next lineNo
End With
To my knowledge, Excel stores Linebreaks in a cell using just the LineFeed character, so the code is checking only that.
This might meet your performance needs: it calls a function which parses the XML representation of the cell content, removes the struck-out sections, and returns the remaining text.
It will be much faster than looping over Characters
Sub Tester()
Debug.Print NoStrikeThrough(Range("A1"))
End Sub
'Needs a reference to Microsoft XML, v6.0
' in your VBA Project references
Function NoStrikeThrough(c As Range) '
Dim doc As New MSXML2.DOMDocument60, rv As String
Dim x As MSXML2.IXMLDOMNode, s As MSXML2.IXMLDOMNode
'need to add some namespaces
doc.SetProperty "SelectionNamespaces", _
"xmlns:ss='urn:schemas-microsoft-com:office:spreadsheet' " & _
"xmlns:ht='http://www.w3.org/TR/REC-html40'"
doc.LoadXML c.Value(11) 'cell data as XML
Set x = doc.SelectSingleNode("//ss:Data")'<< cell content
Set s = x.SelectSingleNode("//ht:S") '<< strikethrough
Do While Not s Is Nothing
Debug.Print "Struck:", s.Text
x.RemoveChild s '<< remove struck section
Set s = x.SelectSingleNode("//ht:S")
Loop
NoStrikeThrough = doc.Text
End Function
EDIT: here's another way to go at it, by breaking up the text into "blocks" and checking each block to see if it has any strikethrough. How much faster this is than going character-by-character may depend on block size and the distribution of struck-out text in each cell.
Function NoStrikeThrough2(c As Range)
Const BLOCK As Long = 50
Dim L As Long, i As Long, n As Long, pos As Long, x As Long
Dim rv As String, s As String, v
L = Len(c.Value)
n = Application.Ceiling(L / BLOCK, 1) 'how many blocks to check
pos = 1 'block start position
For i = 1 To n
v = c.Characters(pos, BLOCK).Font.Strikethrough
If IsNull(v) Then
'if strikethough is "mixed" in this block - parse out
' character-by-character
s = ""
For x = pos To pos + BLOCK
If Not c.Characters(x, 1).Font.Strikethrough Then
s = s & c.Characters(x, 1).Text
End If
Next x
rv = rv & s
ElseIf v = False Then
'no strikethrough - take the whole block
rv = rv & c.Characters(pos, BLOCK).Text
End If
pos = pos + BLOCK 'next block position.
Next i
NoStrikeThrough2 = rv
End Function
EDIT2: if you need to make sure all newline characters are not struck out before processing the cell -
Sub ClearParaStrikes(c As Range)
Dim pos As Long
pos = InStr(pos + 1, c.Value, vbLf)
Do While pos > 0
Debug.Print "vbLf at " & pos
c.Characters(pos, 1).Font.Strikethrough = False
pos = InStr(pos + 1, c.Value, vbLf)
Loop
End Sub

vba search replace character

I'm trying to prepare a spreadsheet for a report in excel vba. Unforturnately there are some wierd characters here that need to be replaced. Easy enough, except for this chracter:
¦
I can't seem to be able to paste that character into the editor into a string replace function. When I try, the output is _. I then thought to refer to it by it's Chr code. A quick look up said it was Chr(166). http://www.gtwiki.org/mwiki/?title=VB_Chr_Values
Replace(s, "â€" + Chr(166), "...")
But this is not that character at all (at least on Mac excel). I tried:
For i = 1 To 255
Debug.Print Chr(i)
Next i
And I didn't see this character anywhere. Does anyone know how I can reference this character in vba code in order to replace it?
Not sure if regexp is available for vba-mac, but you could simplify your existing code greatly as below.
Uses a sample Strin
Dim strIn As String
strIn = "1â€1â€x123"
Do While InStr(strIn, "â€") > 0
Mid$(strIn, InStr(strIn, "â€"), 3) = "..."
Loop
Click on a cell containing your miscreant character and run this small macro:
Sub WhatIsIt()
Dim s As String, mesage As String
Dim L As Long
s = ActiveCell.Text
L = Len(s)
For i = 1 To L
ch = Mid(s, i, 1)
cd = Asc(ch)
mesage = mesage & ch & " " & cd & vbCrLf
Next i
MsgBox mesage
End Sub
It should reveal the characters in the cell and their codes.
It's dirty, but here's the workaround that I used to solve this problem. I knew that my issue character was always after "â€", so the idea was to replace the character that came after those 2. I don't really know how to replace a character at a position in a string, so my idea was to covert the string to an array of characters and replace the array at those specific indexes. Here's what it looks like:
Do While InStr(s, "â€") > 1
num2 = InStr(s, "â€")
arr = stringToArray(s)
arr(num2 - 1) = "<~>"
arr(num2) = "<~>"
arr(num2 + 1) = "<~>"
s = Replace(arrayToString(arr), "<~><~><~>", "...")
Loop
...
Function stringToArray(ByVal my_string As String) As Variant
Dim buff() As String
ReDim buff(Len(my_string) - 1)
For i = 1 To Len(my_string)
buff(i - 1) = Mid$(my_string, i, 1)
Next
stringToArray = buff
End Function
Function arrayToString(ByVal arr As Variant) As String
Dim s As String
For Each j In arr
s = s & j
Next j
arrayToString = s
End Function
In practice, what I replaced those indexes with is something that had to be unique but recognizable. Then i can replace my unique characters with whatever I want. There are sure to be edge cases, but for now it gets the job done. stringToArray function pulled from: Split string into array of characters?

InStrRev() problems

I am trying to write a Do Loop that enters into a long string and returns the place of the space before an email address and the place of the space after. Using InStr() I have been able to locate the end of the email address. Now I need to locate the beginning to then use Mid() to pull the address out. I see that InStrRev() should start at the end of a string and then search, but looking at actual manuals it appears it just gives the second instance of a character. For example:
My string is:
please shoot me an email. My E-Mail: fake#gmail.com If you cannot make it call me.
What I have done so far is returned the place of the # , which in this case if 42. Then I used InStr() to return the place of the first " " after the #. Which in this case is, 52. I wish to return the place of the first " " BEFORE the #. In this case it should be 37. My plan then is to use Mid(37, 15). Fifteen being the difference of 52 & 37. I have tried using InStrRev() to return 37 but cannot get it to work. Any suggestions? below is my code.
x = 2
Do
Cells(x, 11).Select
Cells(x, 11).Value = (InStrRev(Cells(x, 9), Cells(x, 2), " "))
x = x + 1
On Error Resume Next
Loop Until Cells(x, 2).Value = ""
Where (x,9) is the place of the # and (x, 2) is the string.
or maybe if all you need is the email address:
Function GetEmail(longstr As String) As String
GetEmail = Filter(Split(longstr, " "), "#")(0)
End Function
Generally, looping should be avoided in Excel as it is slow, the below will do what your code does without the loop:
Columns(12).Cells(1).Resize(Columns(11).Cells(2).End(xlDown).Row - 1, 1).Offset(1).Value = _
Application.Transpose(Filter(Split(Join(Application.Transpose(Columns(11).Value), " "), " "), "#"))
How about:
MyArray = Split(Mystring," ")
For i=0 To Ubound(MyArray)
If Instr(MyArray(i)),"#")>0 Then
''Email
End If
Next
' Find the # symbol
Dim atPosition As Integer
atPosition = InStr(cellValue, "#")
' check if found here
' Find the space after the #
Dim secondSpacePosition As Integer
secondSpacePosition = InStr(atPosition, cellValue, " ")
' check if found here
' Find the space before the #
Dim firstSpacePosition As Integer
firstSpacePosition = InstrRev(cellValue, " ", atPosition) ' beware, the arguments differ a little
' check if found here
Dim email As String
email = Mid(cellvalue, firstSpacePosition + 1, secondSpacePosition - firstSpacePosition - 1)

How to remove spaces in between text?

Why trim is not working in VBA?
for i = 3 to 2000
activesheet.cells(i,"C").value = trim(Activesheet.cells(i,"C").value)
next i
It is unable to remove the spaces in between the text.
hiii how ' even after trying trim the o/p is still these
hiii how
I need to remove the extra spaces so I found Trim to do it but it is not working while ltrim and rtrim are.
The VBA Trim function is different than Excel's. Use Excel's Application.WorksheetFunction.Trim function instead.
Excel Trim will remove all spaces except a single space between words. VBA Trim will remove leading and trailing spaces.
Thank MS for using the same keyword for different functions.
Trim removes extra spaces at start and end, not in the middle of a string.
Function CleanSpace(ByVal strIn As String) As String
strIn = Trim(strIn)
' // Replace all double space pairings with single spaces
Do While InStr(strIn, " ")
strIn = Replace(strIn, " ", " ")
Loop
CleanSpace = strIn
End Function
From here.
PS. It's not the most efficient way to remove spaces. I wouldn't use on many, very long strings or in a tight loop. It might be suitable for your situation.
I know this question is old but I just found it and thought I'd add what I use to remove multiple spaces in VBA....
cleanString = Replace(Replace(Replace(Trim(cleanString), _
" ", " |"), "| ", ""), " |", " ") 'reduce multiple spaces chr(32) to one
When you call Trim() VBA is actually calling Strings.Trim(). This function will only remove leading and trailing spaces. To remove excessive spaces within a string, use
Application.Trim()
Are all your other functions leaving whitespace behind?
Get CleanUltra!
CleanUltra removes all whitespace and non-printable characters including whitespace left behind by other functions!
I hope you find this useful. Any improvements are welcome!
Function CleanUltra( _
ByVal stringToClean As String, _
Optional ByVal removeSpacesBetweenWords As Boolean = False) _
As String
' Removes non-printable characters and whitespace from a string
' Remove the 1 character vbNullChar. This must be done first
' if the string contains vbNullChar
stringToClean = Replace(stringToClean, vbNullChar, vbNullString)
' Remove non-printable characters.
stringToClean = Application.Clean(stringToClean)
' Remove all spaces except single spaces between words
stringToClean = Application.Trim(stringToClean)
If removeSpacesBetweenWords = True Then _
stringToClean = Replace(stringToClean, " ", vbNullString)
CleanUltra = stringToClean
End Function
Here's an example of it's usage:
Sub Example()
Dim myVar As String
myVar = " abc d e "
MsgBox CleanUltra(myVar)
End Sub
Here's a test I ran to verify that the function actually removed all whitespace. vbNullChar was particularly devious. I had to set the function to remove it first, before the CLEAN and TRIM functions were used to stop them from removing all characters after the vbNullChar.
Sub Example()
Dim whitespaceSample As String
Dim myVar As String
' Examples of various types of whitespace
' (vbNullChar is particularly devious!)
whitespaceSample = vbNewLine & _
vbCrLf & _
vbVerticalTab & _
vbFormFeed & _
vbCr & _
vbLf & _
vbNullChar
myVar = " 1234" & _
whitespaceSample & _
" 56 " & _
"789 "
Debug.Print "ORIGINAL"
Debug.Print myVar
Debug.Print "Character Count: " & Len(myVar)
Debug.Print
Debug.Print "CLEANED, Option FALSE"
Debug.Print CleanUltra(myVar)
Debug.Print CleanUltra(myVar, False)
' Both of these perform the same action. If the optional parameter to
' remove spaces between words is left blank it defaults to FALSE.
' Whitespace is removed but spaces between words are preserved.
Debug.Print "Character Count: " & Len(CleanUltra(myVar))
Debug.Print
Debug.Print "CLEANED, Option TRUE"
Debug.Print CleanUltra(myVar, True)
' Optional parameter to remove spaces between words is set to TRUE.
' Whitespace and all spaces between words are removed.
Debug.Print "Character Count: " & Len(CleanUltra(myVar, True))
End Sub
My related issue was that the last character was a chr(160) - a non-breaking space. So trim(replace(Str,chr(160),"")) was the solution.
I know this question is old but I just want to share my solution on how to deal and fix with this issue.
Maybe you might wondering why sometimes TRIM function isn't working, remember that it will only remove spaces and spaces are equivalent to ASCII 32. So if these ASCII 13 or ASCII 10 exists in the Beginning or end of your string value then TRIM function will not work on it.
Function checkASCIItoBeRemoved(myVal) As String
Dim temp As String
temp = Replace(Trim(myVal), Chr(10), Empty)
temp = Replace(temp, Chr(13), Empty)
checkASCIItoBeRemoved = temp
End Function
With this code it works for me, by the way if this might not work on your side then try to check the ASCII of you string value because it might have another invisible special char that might not covered on my code to replace on it, kindly add on it to work.
Please see reference for some invisible special char.
I know this is quite old but thought I'd add in something else rather than all these replace options.
Using trim (or trim$) in VBA will remove the leading and trailing spaces, which as mentioned is different from =TRIM in Excel.
If you need to remove spaces (as mentioned below not necessarily all whitespace) from inside a string simply use WorksheetFunction.Trim.
Sometimes what looks to be a space is not a space but a character that cannot be displayed.
Use the ASC function to get the integer value of the character. Then use the following code:
Function CleanSpace(ByVal StrIn As String) As String
StrIn = Trim(StrIn)
' Searches string from end and trims off excess ascii characters
Dim StrLength As Integer
Dim SingleChar As Integer
Dim StrPosition As Integer
SingleChar = 1
StrLength = Len(StrIn)
StrPosition = StrLength - 1
Do Until Asc(Mid(StrIn, StrPosition, SingleChar)) <> 0
StrPosition = StrPosition - 1
Loop
StrIn = Mid(StrIn, 1, StrPosition)
End Function
If You are familiar with collections, i once wrote a quick code that process the whole sheet even if it is huge and remove all double spaces, lead and trail spaces and invisible characters from all cells. Just take care it will remove the format of your text, i also did not do much testing and it's exhaustive but it worked for my short task and worked fast.
This is an Auxiliary function that loads the sheet into a collection
Function LoadInCol() As Collection
Dim currColl As Collection
Dim currColl2 As Collection
Set currColl = New Collection
Set currColl2 = New Collection
With ActiveSheet.UsedRange
LastCol = .Columns(.Columns.Count).Column
lastrow = .Rows(.Rows.Count).Row
End With
For i = 1 To lastrow
For j = 1 To LastCol
currColl.Add Cells(i, j).Value
Next
currColl2.Add currColl
Set currColl = New Collection
Next
Set LoadInCol = currColl2
End Function
And this is the main Sub that removes the spaces
Sub RemoveDSpaces()
'Removes double spaces from the whole sheet
Dim Col1 As Collection
Dim Col2 As Collection
Dim Col3 As Collection
Dim StrIn As String
Dim Count As Long
Set Col1 = New Collection
Set Col2 = New Collection
Set Col3 = New Collection
Set Col1 = LoadInCol()
Count = Col1.Count
i = 0
For Each Item In Col1
i = i + 1
If i >= Count + 1 Then Exit For
Set Col2 = Item
For Each Item2 In Col2
StrIn = WorksheetFunction.Clean(Trim(Item2))
Do Until InStr(1, StrIn, " ", vbBinaryCompare) = 0
StrIn = Replace(StrIn, " ", Chr(32))
Loop
Col3.Add StrIn
Next
Col1.Remove (1)
Col1.Add Col3
Set Col3 = New Collection
Next
'Store Results
Cells.ClearContents
Z = 1
m = 1
Set Col3 = New Collection
For Each Item In Col1
Set Col3 = Item
For Each Item2 In Col3
Cells(Z, m) = Item2
m = m + 1
Next
m = 1
Z = Z + 1
Next
End Sub

Resources