Assign line numbers to items in text - string

I'm quite stuck with a fairly simple task but I'm not entirely sure how to make this function. I have a simple string as seen below:
{
"0":{"variable1":"ABC1","variable2":"AA","variable3":"BB"},
"5":{"variable1":"ABC2","variable2":"AA","variable3":"BB"},
"3":{"variable1":"BC3","variable2":"AA","variable3":"BB"},
"1":{"variable1":"DC4","variable2":"AA","variable3":"BB"},
"4":{"variable1":"DD5","variable2":"AA","variable3":"BB"}
}
What I'm trying to do, in VB.NET, is to create a loop that finds each line and arranges those first numbers "0", "1", etc. in order depending on what line it is on then simply replaces whatever number is in it, with the correct order number.
In simple:
1) Find how many lines the string has. Let's say 20 lines, for example.
2) Find the number within "" at the start of each line and replace it with that line's number, 1-20 in this example.
Using the example at the top, the output would look like this:
{
"2":{"variable1":"ABC1","variable2":"AA","variable3":"BB"}, //"2" because it is the second line within the string
"3":{"variable1":"ABC2","variable2":"AA","variable3":"BB"},
"4":{"variable1":"BC3","variable2":"AA","variable3":"BB"},
"5":{"variable1":"DC4","variable2":"AA","variable3":"BB"},
"6":{"variable1":"DD5","variable2":"AA","variable3":"BB"}
}
Any ideas for a quick fix?

The question may be a case of "I have X and I need Y" where X is the item which needs attention.
If the string really is as you presented it, then
Imports System.Text
Module Module1
    Sub Main()
        ' The continuation lines of the literal stay flush left so that
        ' each one really does start with a quote character.
        Dim s = "{
""0"":{""variable1"":""ABC1"",""variable2"":""AA"",""variable3"":""BB""},
""5"":{""variable1"":""ABC2"",""variable2"":""AA"",""variable3"":""BB""},
""3"":{""variable1"":""BC3"",""variable2"":""AA"",""variable3"":""BB""},
""1"":{""variable1"":""DC4"",""variable2"":""AA"",""variable3"":""BB""},
""4"":{""variable1"":""DD5"",""variable2"":""AA"",""variable3"":""BB""}
}"
        ' Assumes the lines are separated by CRLF.
        Dim t = s.Split({vbCrLf}, StringSplitOptions.None)
        Dim u As New StringBuilder
        For i = 0 To t.Length - 1
            If t(i).StartsWith("""") Then
                ' Split off the old key at the first colon only.
                Dim parts = t(i).Split({":"c}, 2)
                If parts.Length = 2 Then
                    ' i is zero-based, so i + 1 is the line number.
                    u.AppendLine($"""{i + 1}"":{parts(1)}")
                End If
            Else
                u.AppendLine(t(i))
            End If
        Next
        Console.WriteLine(u.ToString().TrimEnd())
        Console.ReadLine()
    End Sub
End Module
outputs:
{
"2":{"variable1":"ABC1","variable2":"AA","variable3":"BB"},
"3":{"variable1":"ABC2","variable2":"AA","variable3":"BB"},
"4":{"variable1":"BC3","variable2":"AA","variable3":"BB"},
"5":{"variable1":"DC4","variable2":"AA","variable3":"BB"},
"6":{"variable1":"DD5","variable2":"AA","variable3":"BB"}
}
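For comparison, here is the same line-numbering idea as a minimal Python sketch. It is not part of the VB.NET answer; the function name `renumber_keys` and the shortened sample string are made up for illustration. It assumes, like the answer above, that every data line starts with a quoted key followed by a colon.

```python
import re


def renumber_keys(text: str, start: int = 1) -> str:
    """Replace the leading quoted key on each line with that line's number."""
    out = []
    for line_no, line in enumerate(text.splitlines(), start=start):
        # Only lines that begin with "key": are changed; others pass through.
        out.append(re.sub(r'^"[^"]*":', f'"{line_no}":', line, count=1))
    return "\n".join(out)


# Shortened, hypothetical sample in the same shape as the question's string.
sample = '{\n"0":{"variable1":"ABC1"},\n"5":{"variable1":"ABC2"}\n}'
print(renumber_keys(sample))
```

Because the opening brace counts as line 1, the first key becomes "2", which matches the output the question asked for.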

Related

How to identify and remove a single letter from a string in a cell?

I have a dataset of names in a column in Excel.
However, only some but not all of the names have a letter attached to the end of it (e.g. John Doe A, Kai Jin, Johnny Desplat Lang B, etc).
Can anyone think of a method to remove the letter from the end of the name from each row, if it is there? Such that, using the example above, I will be left with: John Doe, Kai Jin, Johnny Desplat Lang, etc.
I am fairly familiar with VBA and Excel and would be open to trying anything at all.
Thank you for your help with this question! Apologies beforehand if this seems like an elementary question but I have no idea how to begin to solve it.
"I am fairly familiar with VBA and Excel and would be open to trying anything at all."
If so, then this can be done with a simple formula if you wish to avoid VBA. With your value in A1:
=IF(MID(A1,LEN(A1)-1,1)=" ",LEFT(A1,LEN(A1)-2),A1)
If you must use VBA, I think the Like operator comes in handy:
Sub Test()
    Dim arr As Variant: arr = Array("John Doe A", "Kai Jin", "Johnny Desplat Lang B")
    For Each el In arr
        If el Like "* ?" Then 'Or "* [A-Z]" if you must check for uppercase alpha.
            Debug.Print Left(el, Len(el) - 2)
        Else
            Debug.Print el
        End If
    Next
End Sub
Just for fun and in order to demonstrate another approach via the Split() function:
Function ShortenName(ByVal FullName As Variant) As String
    'Purpose: remove a single last letter
    Dim n: n = Split(FullName, " "): n = Len(n(UBound(n)))
    ShortenName = Left(FullName, Len(FullName) + 2 * (n = 1))
End Function
Explanation
Applying the Split() function to the full name and isolating the last name token (via UBound()) makes it possible to check whether that token is a single letter (variable n).
The function returns the string shortened by 2 characters (the last letter plus the preceding space) in the case of a single letter, because the condition n = 1 then evaluates to True, which equals -1. Alternatively, you could have coded: ShortenName = Left(FullName, Len(FullName) - IIf(n = 1, 2, 0))
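The split-and-check-the-last-token logic can also be sketched in Python, purely as an illustration. The name `shorten_name` is hypothetical, and the extra guard for a one-token name is my own addition, not something the VBA function does.

```python
def shorten_name(full_name: str) -> str:
    """Drop a trailing single-character token and the space before it."""
    tokens = full_name.split(" ")
    # Added guard (an assumption): never strip a name that is a single token.
    if len(tokens) > 1 and len(tokens[-1]) == 1:
        return full_name[:-2]  # remove the letter plus the preceding space
    return full_name


for name in ["John Doe A", "Kai Jin", "Johnny Desplat Lang B"]:
    print(shorten_name(name))
```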

Quickly remove unnecessary whitespace from a (very large) string

I'm working with very large (45,000,000+ character) strings in VBA, and I need to remove superfluous whitespace.
One space (aka, ASCII Code 32) is okay but any sections with two or more consecutive spaces should be reduced to only one.
I found a similar question here, although that OP's definition of a "very long string" was only 39,000 characters. The accepted answer was a loop using Replace:
Function MyTrim(s As String) As String
    Do While InStr(s, "  ") > 0
        s = Replace$(s, "  ", " ")
    Loop
    MyTrim = Trim$(s)
End Function
I tried this method and it "worked", but was painfully slow:
Len In: 44930886
Len Out: 35322469
Runtime: 247.6 seconds
Is there a faster way to remove whitespace from a "very large" string?
I suspect the performance problem is due to creating a very large number of large intermediate strings. So, any method that creates no intermediate strings, or far fewer of them, should perform better.
A Regex replace has a good chance of that.
Option Explicit
Sub Test(ByVal text As String)
    Static Regex As Object
    If Regex Is Nothing Then
        Set Regex = CreateObject("VBScript.RegExp")
        Regex.Global = True
        Regex.MultiLine = True
    End If
    Regex.Pattern = " +" ' space, one or more times
    Dim result As String: result = Regex.Replace(text, " ")
    Debug.Print Len(result), Left(result, 20)
End Sub
With an input string of 45 million characters, this takes about a second.
Runner:
Sub Main()
    Const ForReading As Integer = 1
    Const FormatUTF16 As Integer = -1 ' aka TriStateTrue
    Dim fso As Object: Set fso = CreateObject("Scripting.FileSystemObject")
    Dim file As Object: Set file = fso.OpenTextFile("C:\ProgramData\test.txt", ForReading, False, FormatUTF16)
    Dim text As String: text = file.ReadAll()
    Set file = Nothing
    Set fso = Nothing
    Debug.Print Len(text), Left(text, 20)
    Test (text)
End Sub
Test data creator (C#):
var substring = "××\n× ×× ";
var text = String.Join("", Enumerable.Repeat(substring, 45_000_000 / substring.Length));
var encoding = new UnicodeEncoding(false, false);
File.WriteAllText(@"C:\ProgramData\test.txt", text, encoding);
BTW—Since VBA (VB4, Java, JavaScript, C#, VB, …) uses UTF-16, the space character is the one UTF-16 code unit ChrW(32). (Any similarity to or comparison with ASCII, is unnecessary mental gymnastics, and if put into code as ANSI [Chr(32)], unnecessary conversion behind the scenes, with different behavior for different machines, users and times.)
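The single-pass idea behind the regex answer is not specific to VBA. As a rough sketch in Python (the name `collapse_spaces` is made up, and no timings are claimed here), one compiled pattern rewrites every run of spaces in one sweep instead of rescanning the whole string on each loop iteration:

```python
import re

# Two or more consecutive spaces; single spaces are left untouched.
_SPACE_RUN = re.compile(r" {2,}")


def collapse_spaces(text: str) -> str:
    """Reduce every run of spaces to one space in a single pass."""
    return _SPACE_RUN.sub(" ", text)


print(collapse_spaces("a  b   c d"))
```

Only the space character is matched, so line breaks and tabs are left alone, in line with the question's requirement.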
In VBA, the size of a String is limited to approximately 2 Billion Characters. The "Replace-Loop" method above took 247 seconds for a 45 Million character string, which is over 4 minutes.
Theoretically, that means a 2 Billion character string would take at least 3 hours — if it even finished without crashing — so it's not exactly practical.
Excel has a built-in worksheet function Trim which is not the same as VBA's Trim function.
Worksheet function Trim removes all spaces from text except for single spaces between words.
The problem is that Trim, like all functions called with Application.WorksheetFunction, has a size limit of 32,767 characters, and this [unfortunately] applies even when calling the function from VBA with a string that's not even in a cell.
However, we can still use the function if we use it to loop through our "gigantic string" in sections, like this:
EDIT: Don't even bother with this crap (my function, below)! See the RegEx answer above.
Function bigTrim(strIn As String) As String
    Const maxLen = 32766
    Dim loops As Long, x As Long
    loops = Int(Len(strIn) / maxLen)
    If (Len(strIn) / maxLen) <> loops Then loops = loops + 1
    For x = 1 To loops
        bigTrim = bigTrim & _
            Application.WorksheetFunction.Trim(Mid(strIn, _
            ((x - 1) * maxLen) + 1, maxLen))
    Next x
End Function
Running this function on the same string that was used with the "Replace-Loop" method yielded much better results:
Len In: 44930886
Len Out: 35321845
Runtime: 33.6 seconds
That's more than 7x faster than the "Replace-Loop" method, and managed to remove 624 spaces that were somehow missed by the other method.
(I thought about looking into why the first method missed characters, but since I know my string isn't missing anything, and the point of this exercise was to save time, that would be silly!) ☺
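One possible explanation for the difference in output length, not verified against the original data: the worksheet Trim also strips leading and trailing spaces, so when it is applied chunk by chunk, a legitimate single space that happens to sit at a chunk boundary gets removed. The Python sketch below only imitates that rule; `excel_like_trim` and `chunked_trim` are hypothetical names, and the chunk size of 6 is chosen just to make the effect visible.

```python
import re


def excel_like_trim(text: str) -> str:
    """Imitate the worksheet Trim rule: collapse space runs, strip both ends."""
    return re.sub(r" +", " ", text).strip(" ")


def chunked_trim(text: str, max_len: int) -> str:
    """Trim fixed-size chunks independently and concatenate the results."""
    return "".join(
        excel_like_trim(text[i:i + max_len])
        for i in range(0, len(text), max_len)
    )


text = "alpha beta gamma"
whole = excel_like_trim(text)
chunked = chunked_trim(text, 6)
print(repr(whole), repr(chunked))
```

Here the first chunk is "alpha " and its trailing space is stripped, so the chunked result joins two words. If that is what happened, the extra removed characters were not superfluous spaces.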

VBA Greater Than Function Not Working

I have an issue where I am trying to compare values that can be alphanumeric, only numeric, or only alphabetic.
The code originally worked fine for comparing anything within the same hundreds group (i.e. 1-99 with alphabetic components). However, when I included values of 100 and above, it malfunctioned.
The current part of the code reads:
For j = 1 To thislength
    If lennew < j Then
        enteredval = Left("100A", lennew)
    ElseIf lennew >= j Then
        enteredval = Left("100A", j)
    End If
    If lenold < j Then
        cellval = Left("67", lenold)
    ElseIf lenold >= j Then
        cellval = Left("67", j)
    End If
    'issue occurs here
    If enteredval >= cellval Then
        newrow = newrow + 1
    End If
Next j
The issue occurs in the last if statement.
When cycling through, 100 is greater than 67, but the comparison still skips over it. I tried declaring them both as strings (above this part of the code) to see if that would help, but it didn't.
What I am trying to accomplish is to sort through a bunch of rows and find where the new value should go, i.e. 100A should go between 100 and 100B.
Sorry: lennew = Len("100A") and lenold = Len("67"), and thislength = 4, or whichever of the two lengths is larger.
The problem is that you're trying to solve the comparison problem by attacking specific values, and that's going to be a problem to maintain. I'd make the problem more generic by creating a function that takes two values and returns -1 if the first operand is "before" the second, 0 if they are the same, and 1 if the first operand is "after" the second, per your rules.
You could then restructure your code to eliminate the specific hardcoded prefix testing and just call the comparison function directly, e.g. as below. (This is COMPLETELY untested, off-the-cuff, and my VBA is VERRRRRY stale :) but the idea is there. It also assumes the existence of a simple string function called StripPrefix that just takes a string and strips off any leading digits, which I suspect you can spin up fairly readily yourself.)
Function CompareCell(Cell1 As String, Cell2 As String) As Integer
    Dim result As Integer
    Dim suffix1 As String
    Dim suffix2 As String
    If Val(Cell1) < Val(Cell2) Then
        result = -1
    ElseIf Val(Cell1) > Val(Cell2) Then
        result = 1
    Else
        ' The numeric prefixes are equal, so compare what follows them.
        ' You must supply StripPrefix, but it's pretty simple;
        ' I just omitted it here for clarity.
        suffix1 = StripPrefix(Cell1) ' e.g. returns "ABC" for "1000ABC"
        suffix2 = StripPrefix(Cell2)
        If suffix1 < suffix2 Then
            result = -1
        ElseIf suffix1 > suffix2 Then
            result = 1
        Else
            result = 0
        End If
    End If
    ' VBA returns a value by assigning it to the function name.
    CompareCell = result
End Function
A function like this then allows you to take any two cell references and compare them directly to make whatever decision you need:
If CompareCell(enteredval, cellval) >= 0 Then
    newrow = newrow + 1
End If
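To make the idea concrete: comparing "100A" with "67" as strings goes character by character, and "1" sorts before "6", so the string test says 100A is smaller. Splitting each value into a numeric prefix and a text suffix avoids that. Below is a minimal Python sketch of the same comparison. The names `split_prefix` and `compare_cell` are illustrative, and treating a value with no leading digits as prefix 0 is an assumption that mirrors how Val behaves.

```python
import re


def split_prefix(value: str):
    """Split '100A' into (100, 'A'); a missing numeric prefix counts as 0."""
    match = re.match(r"(\d*)(.*)", value, re.DOTALL)
    digits, suffix = match.group(1), match.group(2)
    return (int(digits) if digits else 0, suffix)


def compare_cell(a: str, b: str) -> int:
    """Return -1, 0 or 1: numeric prefix first, then the text suffix."""
    key_a, key_b = split_prefix(a), split_prefix(b)
    return (key_a > key_b) - (key_a < key_b)


print("100A" >= "67")              # plain string comparison
print(compare_cell("100A", "67"))  # prefix-aware comparison
print(sorted(["100B", "67", "100", "100A"], key=split_prefix))
```

With this ordering, 100A lands between 100 and 100B, which is the placement the question asks for.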

Comparing Strings in VBA

I have a basic programming background and have been self sufficient for many years but this problem I can't seem to solve. I have a program in VBA and I need to compare two strings. I have tried using the following methods to compare my strings below but to no avail:
//Assume Cells(1, 1).Value = "Cat"
Dim A As String, B As String
A="Cat"
B=Cell(1, 1).Value
If A=B Then...
If A Like B Then...
If StrCmp(A=B, 1)=0 Then...
I've even tried inputting the Strings straight into the code to see if it would work:
If "Cat" = "Cat" Then...
If "Cat" Like "Cat" Then...
If StrCmp("Cat" = "Cat", 1) Then...
VBA for some reason does not recognize these strings as equals. When going through Debugger it shows that StrComp returns 1. Do my strings have different Char lengths? Thanks for any help.
Posting as answer because it doesn't fit in the comments:
I find it hard to believe that something like:
MsgBox "Cat" = "Cat"
Would not display True on your machine. Please verify.
However, I do observe that you are most certainly using StrComp function incorrectly.
The proper use is StrComp(string, anotherstring, [comparison type optional])
When you do StrComp(A=B, 1) you are essentially asking it to compare whether a boolean (A=B will either evaluate to True or False) is equivalent to the integer 1. It is not, nor will it ever be.
When I run the following code, all four message boxes confirm that each statement evaluates to True.
Sub CompareStrings()
    Dim A As String, B As String
    A = "Cat"
    B = Cells(1, 1).Value
    MsgBox A = B
    MsgBox A Like B
    MsgBox StrComp(A, B) = 0
    MsgBox "Cat" = "Cat"
End Sub
Update from comments
I don't see anything odd happening if I use an array, just FYI. Example data used in the array:
Modified routine to use an array:
Sub CompareStrings()
    Dim A As String, B() As Variant
    A = "Cat"
    B = Application.Transpose(Range("A1:A8").Value)
    For i = 1 To 8
        MsgBox A = B(i)
        MsgBox A Like B(i)
        MsgBox StrComp(A, B(i)) = 0
        MsgBox "Cat" = B(i)
    Next
End Sub
What I would check is how you're instantiating the array. Range arrays (as per my example) are base 1. If it is assigned some other way, it is most likely base 0, so check to make sure that you're comparing the correct array index.

How to use 2 variables in vbscript for InStr Function in VBScript

I need to be able to use two variables (strings) in the InStr function, but it will not return proper values. The first example is the style I need, but I can't seem to get it to work. Any help would be greatly appreciated. I've been working on this for 3 days... it's filling me with rage.
Option Explicit
dim message, searchTerm, position
message = "bob dole was here"
searchTerm = "dole"
position = InStr(message, searchTerm)
'This always returns 0
position = InStr("bob dole was here", searchTerm)
'This returns 5, which is accurate
position = InStr(message, "dole")
'This returns 0,
By default, InStr(str1, str2) performs a binary comparison.
Try performing a textual comparison, like so:
position = InStr(1, message, searchTerm, 1)
[I'm wondering if it is an encoding issue. Where is the message string coming from?]
