Trimming strings and chars

I'm working on a small program. I have a List of Strings that I read from a .txt file; a word is only read if it contains 4 or more characters. That part works. I then need to write all the strings (words) to another file, which also works, but I have a problem.
For example, I have the word School (6 chars) and I need to trim leading characters from it, one at a time:
School = chool, hool, etc.
Program = rogram, ogram, gram, etc.
That is the result I'm after. My code below only works for the first character, not for the others in the loop.
For example, I get Program = rogram, but not ogram, gram, etc.
My question is: how do I get all the trimmed words from my list of words? In the input .txt file I have, for example:
Program,
school,
etc
and in the output .txt file I need to get something like this:
rogram,
ogram,
gram,
chool,
hool,
Here's the code.
Dim path As String = "input_words.txt"
Dim write As String = "trim_words.txt"
Dim lines As New List(Of String)
'reading file'
Using sr As StreamReader = New StreamReader(path)
Do While sr.Peek() >= 4
lines.Add(sr.ReadLine())
Loop
End Using
'writing file'
Using sw As StreamWriter = New StreamWriter(write)
For Each line As String In lines
sw.WriteLine(line.Substring(1, 5))
Next
End Using

The While Loop
A simple way to approach the problem is a While loop that runs as long as the length is greater than 4.
While the current string is longer than 4 characters:
We write it
We remove the first character
For Each line As String In lines
Dim current As String = line
While current.Length > 4
Console.Write(current & ",")
current = current.Remove(0, 1)
End While
Console.Write(current & vbNewLine)
Next
The For Loop
The second way to approach this is a For loop that counts down from the length of the current word to 5 with a step of -1. On each iteration:
We write it
We remove the first character
For Each line As String In lines
Dim current As String = line
For i As Integer = line.Length To 5 Step -1
Console.Write(current & ",")
current = current.Remove(0, 1)
Next
Console.Write(current & vbNewLine)
Next
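Putting the loop together with the file handling from the question could look roughly like the sketch below. It assumes one word per line in the input file with no trailing commas, and that the word itself should not be written, only its shortened forms. The helper name Suffixes and the minLength parameter are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module TrimWordsDemo
    ' Hypothetical helper: returns every shortened form of word that still has
    ' at least minLength characters, dropping one leading character at a time.
    Function Suffixes(word As String, minLength As Integer) As List(Of String)
        Dim result As New List(Of String)
        ' If the word is too short, the upper bound is below 1 and the loop is skipped.
        For start As Integer = 1 To word.Length - minLength
            result.Add(word.Substring(start))
        Next
        Return result
    End Function

    Sub Main()
        ' File names taken from the question.
        Dim inputPath As String = "input_words.txt"
        Dim outputPath As String = "trim_words.txt"
        Using sw As New StreamWriter(outputPath)
            For Each line As String In File.ReadLines(inputPath)
                For Each suffix As String In Suffixes(line.Trim(), 4)
                    sw.WriteLine(suffix)
                Next
            Next
        End Using
    End Sub
End Module
```

With "Program" and "school" as input lines, this writes rogram, ogram, gram, chool and hool, one per line.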

I came up with a different solution. Because I needed to limit the output to 4 characters, I wrote this:
Dim path As String = "input_words.txt"
Dim write As String = "trim_words.txt"
Dim lines As New List(Of String)
'reading file'
Using sr As StreamReader = New StreamReader(path)
Do While sr.Peek() >= 4
lines.Add(sr.ReadLine())
Loop
End Using
'writing file'
Using sw As StreamWriter = New StreamWriter(write)
For Each line As String In lines
Dim iStringLength = line.Length
Dim iPossibleStrings = iStringLength - 5
For i = 0 To iPossibleStrings
sw.WriteLine(line.Substring(i, 4))
Next
Next
End Using
Thanks for the help, @Mederic!

Related

Put a space in front of every capital letter in a string

I want to put spaces before each capital letter in a string.
So turn this: TheQuickBrownFox
into this: The Quick Brown Fox
This is the code I have so far: it finds uppercase chars in the string and shows each upper letter in a Message Box.
I can't figure out where to go from here:
Dim input As String = "TheQuickBrownFox"
For i As Integer = 0 To input.Length - 1
Dim c As Char = input(i)
If Char.IsUpper(c) Then
MsgBox(c)
End If
Next
I've googled around but I wasn't able to find a solution for visual basic.
A couple of examples using LINQ:
Imports System.Linq
Imports System.Text
Dim input As String = "TheQuickBrownFox"
In case you don't know it, a String is a collection of Chars, so you can iterate the string content using a ForEach loop (e.g., For Each c As Char In input).
▶ Generate a string from a collection of chars (Enumerable(Of Char)), without adding a space before an uppercase char that is the first char in the string.
The second parameter of a Select() method, when specified (in Select(Function(c, i) ...)), represents the index of the element currently processed.
String.Concat() rebuilds a string from the Enumerable(Of Char) that the Select() method generates:
Dim result = String.Concat(input.Select(Function(c, i) If(i > 0 AndAlso Char.IsUpper(c), ChrW(32) + c, c)))
The same, not considering the position of the first uppercase char:
Dim result = String.Concat(input.Select(Function(c) If(Char.IsUpper(c), ChrW(32) + c, c)))
▶ With an aggregation function that uses a StringBuilder as accumulator (still considering the position of the first uppercase char).
When processing strings, a StringBuilder used as storage can make the code more efficient (creates way less garbage) and more performant.
See, e.g., here: How come for loops in C# are so slow when concatenating strings?
➨ Note that I'm adding an Array of chars to the StringBuilder:
Dim result = input.Aggregate(New StringBuilder(),
Function(sb, c) sb.Append(If(sb.Length > 0 AndAlso Char.IsUpper(c), {ChrW(32), c}, {c})))
➨ result is a StringBuilder object: extract the string with result.ToString().
Or, as before, without considering the position:
Dim result = input.Aggregate(New StringBuilder(),
Function(sb, c) sb.Append(If(Char.IsUpper(c), {ChrW(32), c}, {c})))
▶ The two examples above are roughly equivalent to a simple loop that iterates over all chars in the string and either creates a new string or uses a StringBuilder as storage:
Dim sb As New StringBuilder()
For Each c As Char In input
sb.Append(If(sb.Length > 0 AndAlso Char.IsUpper(c), {ChrW(32), c}, {c}))
Next
Change the code as described before if you want to add a space to the first uppercase Char without considering its position.
Yet another option is a language extension method.
Imports System.Text.RegularExpressions
Module Module1
Sub Main()
Demo()
Console.ReadLine()
End Sub
Private Sub Demo()
Dim data = New List(Of String) From
{
"ThisIsASentence",
"TheQuickBrownFox",
"ApplesOrangesGrapes"}
data.ForEach(Sub(item) Console.WriteLine($"{item,-25}[{item.SplitCamelCase}]"))
End Sub
End Module
Public Module StringExtensions
<Runtime.CompilerServices.Extension>
Public Function SplitCamelCase(sender As String) As String
Return Regex.Replace(Regex.Replace(sender,
"(\P{Ll})(\P{Ll}\p{Ll})", "$1 $2"),
"(\p{Ll})(\P{Ll})",
"$1 $2")
End Function
End Module
You can use a regular expression:
Dim input = "TheQuickBrownFox"
Dim withSpaces = Regex.Replace(input, "(?<!^)([A-Z])", " $1")
The regex finds any uppercase A-Z that is not at the start of the string and captures it into a group. The replacement takes the group content and prefixes it with a space.
If you don't want to use the negative lookbehind, you can trim the result:
Dim withSpaces = Regex.Replace(input, "([A-Z])", " $1").TrimStart()
Or don't trim if you don't care that the string starts with a space.
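One caveat, offered as a sketch rather than a correction: [A-Z] only covers ASCII capitals, whereas Char.IsUpper in the question's loop also accepts non-ASCII uppercase letters. If that matters for your data, the Unicode category \p{Lu} can be swapped in. The sample string below is made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UnicodeCapitalsDemo
    Sub Main()
        ' Made-up sample containing a non-ASCII capital (É).
        Dim input As String = "CaféÉlanVital"

        ' [A-Z] only matches the ASCII capital V here.
        Console.WriteLine(Regex.Replace(input, "(?<!^)([A-Z])", " $1"))

        ' \p{Lu} matches any Unicode uppercase letter, so É gets a space too.
        Console.WriteLine(Regex.Replace(input, "(?<!^)(\p{Lu})", " $1"))
    End Sub
End Module
```

The first call yields "CaféÉlan Vital" and the second "Café Élan Vital".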
To expand on your code:
Dim input As String = "TheQuickBrownFox"
Dim outputSB as new StringBuilder
For i As Integer = 0 To input.Length - 1
Dim c As Char = input(i)
If i > 0 AndAlso Char.IsUpper(c) Then 'skip the first char to avoid a leading space
outputSB.Append(" ")
'MsgBox(c)
End If
outputSB.Append(c)
Next
Console.Writeline(outputSB.ToString())

Removing everything to left of a string up to a certain character in Visual Studio 2017 (Visual Basic)

The string in question is a file path taken from an XML file.
..\Folder\Folder\Folder\filename.ext
The code I have takes this file path element from each XML entry and adds it to a ListView.
What I need to do is remove everything to the left of the last "\" so that all that is added to the ListView is
filename.ext
The file path element is never the same length and doesn't have the same number of "\" characters every time; for example, it could be
..\Folder\Folder\Folder\Folder\biglongfilename.ext
..\Folder\Folder\filename.ext
I have searched and tried some string manipulation, such as
'Dim pos As Integer
'pos = Rname.IndexOf("\") - 1
'Dim RnameSH As String = Rname.Substring(10, pos)
This only gives me the correct result if I specify the number of characters I need (which could be different every time).
Dim RnameSH As String = ""
Dim i As Integer
Dim pos As Integer
For i = 1 To 200
If (Microsoft.VisualBasic.Strings.Right(Rname, i)) <> "\" Then
i = i + 1
ElseIf (Microsoft.VisualBasic.Strings.Right(Rname, i)) = "\" Then
pos = i
'Next i
End If
Next i
RnameSH = Microsoft.VisualBasic.Strings.Right(Rname, pos)
I also tried the old-fashioned way (the loop above), but that doesn't seem to work.
Nothing I try gives me only the filename.ext that I need. This is my first attempt at real programming, and I've read about string manipulation on various sites, but I just can't figure this out.
Thanks for any help you can give with this or for pointing me in the right direction.
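One possible approach, as a minimal sketch: find the last "\" with LastIndexOf and keep everything after it. The function name AfterLastBackslash is made up for illustration, and the sample paths are the ones from the question.

```vbnet
Imports System

Module FileNameDemo
    ' Hypothetical helper: returns the text after the last backslash.
    ' LastIndexOf returns -1 when there is no backslash, so Substring(0)
    ' then returns the whole string unchanged.
    Function AfterLastBackslash(fullPath As String) As String
        Return fullPath.Substring(fullPath.LastIndexOf("\"c) + 1)
    End Function

    Sub Main()
        Console.WriteLine(AfterLastBackslash("..\Folder\Folder\Folder\Folder\biglongfilename.ext"))
        Console.WriteLine(AfterLastBackslash("..\Folder\Folder\filename.ext"))
    End Sub
End Module
```

This prints biglongfilename.ext and filename.ext. On Windows, System.IO.Path.GetFileName should give the same result for these paths; the manual version is shown because it does not depend on the platform's directory separator.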

Split String New Line After 3 Space in VB.net

I have a problem splitting a string onto new lines in VB.NET.
Right now I can split it on every single space. I want to start a new line after every 3 spaces.
Dim s As String = "SOMETHING BIGGER THAN YOUR DREAM"
Dim words As String() = s.Split(New Char() {" "c})
For Each word As String In words
Console.WriteLine(word)
Next
Output:
SOMETHING
BIGGER
THAN
YOUR
DREAM
Desired output:
SOMETHING BIGGER THAN
YOUR DREAM
Another alternative to the existing answers might be:
Dim separator As Char = CChar(" ")
Dim sArr As String() = "SOMETHING BIGGER THAN YOUR DREAM".Split(separator)
Dim indexOfSplit As Integer = 3
Dim sFinal As String = Join(sArr.Take(indexOfSplit).ToArray, separator) & vbNewLine &
Join(sArr.Skip(indexOfSplit).ToArray, separator)
Console.WriteLine(sFinal)
You can split your input string, then loop the array of parts generated and add them to a StringBuilder object.
When the number of parts read so far is a multiple of a defined value (wordsPerLine, here), you append vbNewLine after the current part instead of a space.
When the loop completes, print the content of the StringBuilder to the Console:
Dim input As String = "SOMETHING BIGGER THAN YOUR DREAM, NOT MORE THAN YOUR ACCOUNT BALANCE"
Dim wordsPerLine As Integer = 3
Dim wordsCounter As Integer = 1
Dim sb As StringBuilder = New StringBuilder()
For Each word As String In input.Split()
sb.Append(word & If(wordsCounter Mod wordsPerLine = 0, vbNewLine, " "))
wordsCounter += 1
Next
Console.WriteLine(sb.ToString())
Prints:
SOMETHING BIGGER THAN
YOUR DREAM, NOT
MORE THAN YOUR
ACCOUNT BALANCE
Instead of using split, you might capture 3 words in a capturing group and match the trailing whitespace chars.
In the replacement use the group followed by a newline.
Pattern
(\S+(?:\s+\S+){2})\s*
That will match:
( Capture group 1
\S+ Match 1+ non whitespace chars
(?:\s+\S+){2} Repeat 2 times matching 1+ whitespace chars and 1+ non whitespace chars
) Close group 1
\s* Match trailing whitespace chars
Example code
Dim s As String = "SOMETHING BIGGER THAN YOUR DREAM"
Dim output As String = Regex.Replace(s, "(\S+(?:\s+\S+){2})\s*", "$1" + Environment.NewLine)
Console.WriteLine(output)
Output
SOMETHING BIGGER THAN
YOUR DREAM
String.Join has an overload that will help you.
First parameter is the separator string to use between elements of your array.
Second parameter is the array you wish to join.
Third parameter is the starting position, for the first line in your desired output this would be the element at index 0.
Fourth parameter is the length to use, for the first line we want three array elements.
Private Sub OPCode()
Dim s As String = "SOMETHING BIGGER THAN YOUR DREAM"
Dim words As String() = s.Split(New Char() {" "c})
Dim line1 As String = String.Join(" ", words, 0, 3)
Console.WriteLine(line1)
Dim line2 As String = String.Join(" ", words, 3, words.Length - 3)
Console.WriteLine(line2)
End Sub
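The same String.Join overload can be put in a loop to handle any number of words. This is a sketch under the assumption that words are separated by spaces and that wordsPerLine is at least 1. The names WrapWords and wordsPerLine are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module WordsPerLineDemo
    ' Hypothetical helper: groups the words of text into lines of
    ' at most wordsPerLine words each (wordsPerLine must be >= 1).
    Function WrapWords(text As String, wordsPerLine As Integer) As List(Of String)
        Dim words As String() = text.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
        Dim lines As New List(Of String)
        For start As Integer = 0 To words.Length - 1 Step wordsPerLine
            ' The last group may hold fewer than wordsPerLine words.
            Dim count As Integer = Math.Min(wordsPerLine, words.Length - start)
            lines.Add(String.Join(" ", words, start, count))
        Next
        Return lines
    End Function

    Sub Main()
        For Each line As String In WrapWords("SOMETHING BIGGER THAN YOUR DREAM", 3)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

For the sample sentence this prints "SOMETHING BIGGER THAN" followed by "YOUR DREAM".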

Quickly remove unnecessary whitespace from a (very large) string

I'm working with very large (45,000,000+ character) strings in VBA, and I need to remove superfluous whitespace.
One space (aka, ASCII Code 32) is okay but any sections with two or more consecutive spaces should be reduced to only one.
I found a similar question here, although that OP's definition of a "very long string" was only 39,000 characters. The accepted answer was a loop using Replace:
Function MyTrim(s As String) As String
Do While InStr(s, "  ") > 0
s = Replace$(s, "  ", " ")
Loop
MyTrim = Trim$(s)
End Function
I tried this method and it "worked", but was painfully slow:
Len In: 44930886
Len Out: 35322469
Runtime: 247.6 seconds
Is there a faster way to remove whitespace from a "very large" string?
I suspect the performance problem is due to creating a very large number of large intermediate strings. So, any method that does things without creating intermediate strings or with much fewer would perform better.
A Regex replace has a good chance of that.
Option Explicit
Sub Test(ByVal text As String)
Static Regex As Object
If Regex Is Nothing Then
Set Regex = CreateObject("VBScript.RegExp")
Regex.Global = True
Regex.MultiLine = True
End If
Regex.Pattern = " +" ' space, one or more times
Dim result As String: result = Regex.Replace(text, " ")
Debug.Print Len(result), Left(result, 20)
End Sub
With an input string of 45 million characters, this takes about a second.
Runner:
Sub Main()
Const ForReading As Integer = 1
Const FormatUTF16 As Integer = -1 ' aka TriStateTrue
Dim fso As Object: Set fso = CreateObject("Scripting.FileSystemObject")
Dim file As Object: Set file = fso.OpenTextFile("C:\ProgramData\test.txt", ForReading, False, FormatUTF16)
Dim text As String: text = file.ReadAll()
Set file = Nothing
Set fso = Nothing
Debug.Print Len(text), Left(text, 20)
Test (text)
End Sub
Test data creator (C#):
var substring = "××\n× ×× ";
var text = String.Join("", Enumerable.Repeat(substring, 45_000_000 / substring.Length));
var encoding = new UnicodeEncoding(false, false);
File.WriteAllText(@"C:\ProgramData\test.txt", text, encoding);
BTW—Since VBA (VB4, Java, JavaScript, C#, VB, …) uses UTF-16, the space character is the one UTF-16 code unit ChrW(32). (Any similarity to or comparison with ASCII, is unnecessary mental gymnastics, and if put into code as ANSI [Chr(32)], unnecessary conversion behind the scenes, with different behavior for different machines, users and times.)
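To illustrate the point about intermediate strings, here is a single-pass sketch. It is written in VB.NET rather than VBA, so it is not a drop-in replacement for the code above, and it has not been benchmarked against the regex version. The function name CollapseSpaces is made up for illustration.

```vbnet
Imports System
Imports System.Text

Module CollapseSpacesDemo
    ' Hypothetical helper: copies text once into a StringBuilder,
    ' skipping any space that directly follows another space.
    Function CollapseSpaces(text As String) As String
        Dim sb As New StringBuilder(text.Length)
        Dim previousWasSpace As Boolean = False
        For Each c As Char In text
            Dim isSpace As Boolean = (c = " "c)
            If Not (isSpace AndAlso previousWasSpace) Then
                sb.Append(c)
            End If
            previousWasSpace = isSpace
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine("[" & CollapseSpaces("a   b  c d") & "]")
    End Sub
End Module
```

This prints [a b c d]. Only one output buffer is allocated, regardless of how many runs of spaces the input contains.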
In VBA, the size of a String is limited to approximately 2 Billion Characters. The "Replace-Loop" method above took 247 seconds for a 45 Million character string, which is over 4 minutes.
Theoretically, that means a 2 Billion character string would take at least 3 hours — if it even finished without crashing — so it's not exactly practical.
Excel has a built-in worksheet function Trim which is not the same as VBA's Trim function.
Worksheet function Trim removes all spaces from text except for single spaces between words.
The problem is that Trim, like all functions called with Application.WorksheetFunction, has a size limit of 32,767 characters, and this [unfortunately] applies even when calling the function from VBA with a string that's not even in a cell.
However, we can still use the function if we use it to loop through our "gigantic string" in sections, like this:
EDIT: Don't even bother with this crap (my function, below)! See the RegEx answer above.
Function bigTrim(strIn As String) As String
Const maxLen = 32766
Dim loops As Long, x As Long
loops = Int(Len(strIn) / maxLen)
If (Len(strIn) / maxLen) <> loops Then loops = loops + 1
For x = 1 To loops
bigTrim = bigTrim & _
Application.WorksheetFunction.Trim(Mid(strIn, _
((x - 1) * maxLen) + 1, maxLen))
Next x
End Function
Running this function on the same string that was used with the "Replace-Loop" method yielded much better results:
Len In: 44930886
Len Out: 35321845
Runtime: 33.6 seconds
That's more than 7x faster than the "Replace-Loop" method, and managed to remove 624 spaces that were somehow missed by the other method.
(I thought about looking into why the first method missed characters, but since I know my string isn't missing anything, and the point of this exercise was to save time, that would be silly!) ☺

Split a new string into the part it shares with an old string and the part that is new

I am a newbie in programming and would like to ask how I could separate the string below. I am using Visual Basic. Basically, I have two strings:
String 1: gOldStr = TEDDYBEARBLACKCAT
String 2: gNewStr = BLACKCATWHITECAT
I want to split String 2 by looking for the part that also appears in String 1,
so that I get the part of String 2 that is also in String 1 = BLACKCAT
and the part of String 2 that is new = WHITECAT
I have tried the script below, but it doesn't work all the time. Could you suggest better logic? Thanks
For i=1 to Len(gOldStr)
TempStr = Left$(gNewStr,i)
Ctr1 = InStr(gOldStr, TempStr)
gTemporary = Mid$(gOldStr,Ctr1)
gTemporary = Trim(gTemporary)
Ctr2 = StrComp(gOldStr, gTemporary)
If Ctr2=1 Then
gTemporary2 = Replace(gNewStr,gTemporary,"")
Exit For
End If
Next i
If the common part is always at the end of the first one and at the beginning of the 2nd one, you could look from the end, like this:
Dim strMatchedWord As String
For i=1 to Len(gOldStr)
If i>Len(gNewStr) Then Exit For 'We need this to avoid an error
Test1 = Right$(gOldStr, i)
Test2 = Left$(gNewStr, i)
If Test1 = Test2 Then
strMatchedWord = Test1 'Store the match (Test1)
End If
Next
Debug.Print strMatchedWord 'Once the loop finishes it contains the longest match
I modified the code so that the loop does not exit until it has gone through the full string. This way it gives you the longest match by the end of the loop.
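The same idea can be sketched in VB.NET, assuming (as above) that the shared part is at the end of the old string and at the beginning of the new one. Searching from the longest candidate downward lets the function return on the first hit. The name LongestOverlap is made up for illustration.

```vbnet
Imports System

Module OverlapDemo
    ' Hypothetical helper: returns the longest suffix of oldStr that is
    ' also a prefix of newStr, or an empty string if there is none.
    Function LongestOverlap(oldStr As String, newStr As String) As String
        Dim maxLength As Integer = Math.Min(oldStr.Length, newStr.Length)
        For length As Integer = maxLength To 1 Step -1
            Dim candidate As String = newStr.Substring(0, length)
            If oldStr.EndsWith(candidate, StringComparison.Ordinal) Then
                Return candidate
            End If
        Next
        Return String.Empty
    End Function

    Sub Main()
        Dim gOldStr As String = "TEDDYBEARBLACKCAT"
        Dim gNewStr As String = "BLACKCATWHITECAT"
        Dim sharedPart As String = LongestOverlap(gOldStr, gNewStr)
        Dim newPart As String = gNewStr.Substring(sharedPart.Length)
        Console.WriteLine(sharedPart)
        Console.WriteLine(newPart)
    End Sub
End Module
```

For the strings in the question this prints BLACKCAT and WHITECAT.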

Resources