Get only letters from a string in VB.NET

I want to get only the letters from a string.
For example, if the string is: 123abc456d
I want to get: abcd
I'm looking for something like this, but for letters instead of digits:
Dim mytext As String = "123a123"
Dim myChars() As Char = mytext.ToCharArray()
For Each ch As Char In myChars
If Char.IsDigit(ch) Then
MessageBox.Show(ch)
End If
Next
Thanks

You can do it like this :
Dim mytext As String = "123a123"
Dim RemoveChars As String = "0123456789" ' the characters you want to remove from mytext
Dim FinalResult As String = "" ' start from an empty string rather than Nothing
Dim myChars() As Char = mytext.ToCharArray()
For Each ch As Char In myChars
If Not RemoveChars.Contains(ch) Then
FinalResult &= ch
End If
Next
MsgBox(FinalResult)
OR :
Dim mytext As String = "1d23ad123d"
Dim myChars() As Char = mytext.ToCharArray()
Dim FinalResult As String = ""
For Each ch As Char In myChars
If Not Char.IsDigit(ch) Then
FinalResult &= ch
End If
Next
MsgBox(FinalResult)
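Both give the same result for this input. Note that they only strip digits, so any punctuation or whitespace in the string would remain; test with Char.IsLetter(ch) instead if you want letters only.
Hope that helps :)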

You can use a regular expression to solve this. The pattern [^a-zA-Z] matches any character that is not an ASCII letter, and Regex.Replace removes every match by replacing it with an empty string.
Dim mytext As String = "123a123"
Dim Result As String = Regex.Replace(mytext, "[^a-zA-Z]", "")
MessageBox.Show(Result) ' shows "a"
Make sure you have this at the top of your code: Imports System.Text.RegularExpressions

Here is a LINQ one-liner:
Debug.Print(String.Concat("123abc456d".Where(AddressOf Char.IsLetter)))
Result: abcd.
Here, .Where(AddressOf Char.IsLetter) treats the string as a list of chars, and only keeps letters in the list. Then, String.Concat re-builds the string out of the char list by concatenating the chars.
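One difference between the approaches on this page is worth seeing side by side: Char.IsLetter accepts any Unicode letter, while the pattern [^a-zA-Z] keeps ASCII letters only. The sketch below is only an illustration; the helper names KeepLetters and KeepAsciiLetters and the module name are made up for this example.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module LettersOnlySketch
    ' Hypothetical helper: keeps every character that Char.IsLetter accepts
    ' (this includes accented and other non-ASCII letters).
    Public Function KeepLetters(input As String) As String
        Return String.Concat(input.Where(Function(c) Char.IsLetter(c)))
    End Function

    ' Hypothetical helper: keeps only the ASCII letters a-z and A-Z.
    Public Function KeepAsciiLetters(input As String) As String
        Return Regex.Replace(input, "[^a-zA-Z]", "")
    End Function

    Sub Main()
        Console.WriteLine(KeepLetters("123abc456d"))      ' abcd
        Console.WriteLine(KeepAsciiLetters("123abc456d")) ' abcd
        Console.WriteLine(KeepLetters("café 42"))         ' café
        Console.WriteLine(KeepAsciiLetters("café 42"))    ' caf
    End Sub
End Module
```

Which one is right depends on whether accented characters should count as letters in your data.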

Related

vb.net check if word exists in string and act accordingly

I read a text file, remove all punctuation, and then read all the words into a String() array. I want to count the words, so I need something like a String() with two fields, word and frequency. Before I add a word, I count the number of times it occurs in the text with the function CountMyWords. If the word is already in the String(), I don't want to add it again, just increase its frequency.
Private Sub CreateWordList()
Dim text As String = File.ReadAllText("C:\Users\Gebruiker\Downloads\shakespear.txt")
text = Regex.Replace(text, "[^A-Za-z']+", " ")
Dim words As String() = text.Split(New Char() {" "c})
Dim i As Integer
For Each word As String In words
If Len(word) > 5 Then
word = word.ToLower()
'now check if the word already exists
If words.Contains(word) = True Then
End If
i = CountMyWords(text, word)
Console.WriteLine("{0}", word + " " + i.ToString)
End If
Next
End Sub
Private Function CountMyWords(input As String, phrase As String) As Integer
Dim Occurrences As Integer = 0
Dim intCursor As Integer = 0
Do Until intCursor >= input.Length
Dim strCheckThisString As String = Mid(LCase(input), intCursor + 1, (Len(input) - intCursor))
Dim intPlaceOfPhrase As Integer = InStr(strCheckThisString, phrase)
If intPlaceOfPhrase > 0 Then
Occurrences += 1
intCursor += (intPlaceOfPhrase + Len(phrase) - 1)
Else
intCursor = input.Length
End If
Loop
CountMyWords = Occurrences
End Function
Any thoughts on how to do that?
As with Steve's answer, I suggest using a Dictionary, but you might not need the overhead of having a class as the value in the dictionary.
Also, if you're using fairly large files, you can process them one line at a time with the File.ReadLines method instead of reading the whole lot into RAM.
You can make the processing of the text a little terser with some LINQ, like this:
Imports System.IO
Imports System.Text.RegularExpressions
Module Module1
Sub Main()
' using https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/shakespeare.txt
Dim src = "C:\temp\TheSonnets.txt"
Dim wordsWithCounts As New Dictionary(Of String, Integer)
For Each line In File.ReadLines(src)
Dim text = Regex.Replace(line, "[^A-Za-z']+", " ")
Dim words = text.Split({" "c}).
Where(Function(s) s.Length > 5).
Select(Function(t) t.ToLower())
For Each w In words
If wordsWithCounts.ContainsKey(w) Then
wordsWithCounts(w) += 1
Else
wordsWithCounts.Add(w, 1)
End If
Next
Next
' extracting some data as an example...
Dim mostUsedFirst = wordsWithCounts.
Where(Function(x) x.Value > 18).
OrderByDescending(Function(y) y.Value)
For Each w As KeyValuePair(Of String, Integer) In mostUsedFirst
Console.WriteLine(w.Key & " " & w.Value)
Next
Console.ReadLine()
End Sub
End Module
With the example text, this outputs:
beauty 52
should 44
though 33
praise 28
love's 26
nothing 19
better 19
I would use a different approach. The first thing to do is to create a class that represents the word frequency. It is just a string for the word and an integer to count the word's repetitions:
Public Class WordFrequency
Public Property Word As String
Public Property Frequency As Integer
End Class
Now you can create a dictionary where the key is the word and the value is an instance of the WordFrequency class. A dictionary makes it cheap to check whether an item already exists in the collection: you index it with a syntax similar to the one used for arrays, and methods such as ContainsKey find an element by its key. So your code becomes simply:
' Declared at the global class level
Dim wordCounter As Dictionary(Of String, WordFrequency) = New Dictionary(Of String, WordFrequency)
.....
Private Sub CreateWordList()
Dim text As String = File.ReadAllText("C:\Users\Gebruiker\Downloads\shakespear.txt")
text = Regex.Replace(text, "[^A-Za-z']+", " ")
' remove any blank entries eventually created by the replace
Dim words As String() = text.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
For Each word As String In words
If word.Length > 5 Then
word = word.ToLower()
' If we don't have the word in the dictionary, create the entry
If Not wordCounter.ContainsKey(word) Then
wordCounter.Add(word, New WordFrequency With
{
.Word = word,
.Frequency = 0
})
End If
' just increment the property frequency from the dictionary Value
wordCounter(word).Frequency += 1
End If
Next
End Sub
Note that instead of having a Value of type WordFrequency you can just use an integer for the frequency, but I prefer to have a class because, if you ever need to expand the information kept in the dictionary, a class is easy to extend.
I suggest a Dictionary:
Public Function CountWords(words As IEnumerable(Of String)) As Dictionary(Of String, Integer)
Dim result As New Dictionary(Of String, Integer)()
For Each word As String In words
If result.ContainsKey(word) Then
result(word) += 1
Else
result.Add(word, 1)
End If
Next
Return result
End Function
Private Sub CreateWordList(filePath As String)
Dim text As String = File.ReadAllText(filePath).ToLower()
text = Regex.Replace(text, "[^a-z']+", " ")
Dim words As IEnumerable(Of String) = text.Split(New Char() {" "c}).
Where(Function(w) w.Length > 5)
Dim wordCounts As Dictionary(Of String, Integer) = CountWords(words)
For Each kvp As KeyValuePair(Of String, Integer) In wordCounts
Console.WriteLine($"{kvp.Key} {kvp.Value}")
Next
End Sub
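As an alternative to the explicit ContainsKey loop used in the answers above, the same counts can be sketched with LINQ's GroupBy and ToDictionary. This is a minimal illustration rather than a drop-in replacement; the function name CountWordsWithLinq and the module name are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module WordCountSketch
    ' Hypothetical helper: groups identical words, then stores the size of
    ' each group as that word's frequency.
    Public Function CountWordsWithLinq(words As IEnumerable(Of String)) As Dictionary(Of String, Integer)
        Return words.
            GroupBy(Function(w) w).
            ToDictionary(Function(g) g.Key, Function(g) g.Count())
    End Function

    Sub Main()
        Dim counts = CountWordsWithLinq({"beauty", "should", "beauty"})
        Console.WriteLine(counts("beauty")) ' 2
        Console.WriteLine(counts("should")) ' 1
    End Sub
End Module
```

GroupBy is case-sensitive by default, so the words should already be lowercased, as the answers above do.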

Delete part of string by indicator in VBA

I'm trying to delete the ending part of a string where the beginning and end are variable but have several indicators. The string I'm working with is "Num1_xc_min_20201229_112401.rdf".
The Num1 is variable and the 20201229_112401 is variable (since it's a date). Num1 will not always have four characters. The end result I want is "Num1". xc is always constant.
Here is the code I'm working with:
Sub Macro1()
Dim input1 As String
Dim remove1 As String
Dim result1 As String
input1 = Range("C2").Value
remove1 = "_xc_*" 'this is the char to be removed
result1 = Replace(input1, LCase(remove1), "")
Range("C2").Value = result1
End Sub
This hasn't worked because Replace treats "_xc_*" as literal text; it does not accept wildcards the way a Like statement does.
Split by the remove1 value and return the first (0-th) element:
Sub Macro1()
Dim input1 As String
Dim remove1 As String
input1 = "Num1_xc_min_20201229_112401.rdf"
remove1 = "_xc_"
Debug.Print Split(input1, remove1)(0)
End Sub
Returns Num1.

Issues stripping special characters from text in VBA

I have an Excel file that pulls in data from a csv, manipulates it a bit, and then saves it down as a series of text files.
There are some special characters in the source data that trip things up so I added this to strip them out
Const SpecialCharacters As String = "!,#,#,$,%,^,&,*,(,),{,[,],},?,â,€,™"
Function ReplaceSpecialCharacters(myString As String) As String
Dim newString As String
Dim char As Variant
newString = myString
For Each char In Split(SpecialCharacters, ",")
newString = Replace(newString, char, "")
Next
ReplaceSpecialCharacters = newString
End Function
The issue is that this doesn't catch all of them. When I try to process the following text it slips through the above code and causes Excel to error out.
Hero’s Village
I think the issue is that the special character isn't being recognized by Excel itself. I was only able to get the text to look like it does above by copying it out of Excel and pasting it into a different IDE. In Excel it displays as:
In the workbook
In the edit field
In the immediate window
Based on this site it looks like it's having issues displaying the ' character, but how do I get it to fix/filter it out if it can't even read it properly in VBA itself?
Option Explicit
Dim regex As New RegExp
Private Function rgclean(ByVal mystring As String) As String
' Removes every character that matches the pattern below and returns the cleaned string
With regex
.Global = True
.Pattern = "[^ \w]" ' matches anything that is not a space or a word character (letter, digit, underscore)
End With
rgclean = regex.Replace(mystring, "") ' replace every match with ""
End Function
Try using a regular expression.
Make sure you enable regular expressions under:
Tools > References > checkbox: "Microsoft VBScript Regular Expressions 5.5"
Pass the string into the function rgclean. The function finds anything that is not a space or a word character (\w covers letters [A-Za-z], digits [0-9], and the underscore), replaces it with "", and returns the resulting string.
In other words, the function removes pretty much any symbol in the string; digits, spaces, and letters are kept.
Here is the opposite approach. Remove ALL characters that are not included in this group of 62:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
The code:
Const ValidCharacters As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
Function ReplaceSpecialCharacters(myString As String) As String
Dim newString As String, L As Long, i As Long
Dim char As Variant
newString = myString
L = Len(newString)
For i = 1 To L
char = Mid(newString, i, 1)
If InStr(ValidCharacters, char) = 0 Then
newString = Replace(newString, char, "#")
End If
Next i
ReplaceSpecialCharacters = Replace(newString, "#", "")
End Function
Note:
You can also add characters to the string ValidCharacters if you want to retain them.

VB.NET remove specific chars between two characters in a string

In VB.NET, how do I remove a character from a string when it occurs between two known characters? For example, how do you remove the commas from the numbers that occur between the hash signs in this string:
Balance,#163,464.24#,Cashbook Closing Balance:,#86,689.45#,Money,End
You can use this simple and efficient approach using a loop and a StringBuilder (which requires Imports System.Text):
Dim text = "Balance,#163,464.24#,Cashbook Closing Balance:,#86,689.45#,Money,End"
Dim textBuilder As New StringBuilder()
Dim inHashTag As Boolean = False
For Each c As Char In text
If c = "#"c Then inHashTag = Not inHashTag ' reverse Boolean
If Not inHashTag OrElse c <> ","c Then
textBuilder.Append(c) ' append if we aren't in hashtags or the char is not a comma
End If
Next
text = textBuilder.ToString()
Because I'm bad at regex:
Dim str = "Balance,#163,464.24#,Cashbook Closing Balance:,#86,689.45#,Money,End"
Dim split = str.Split("#"c)
If UBound(split) > 1 Then
For i = 1 To UBound(split) Step 2
split(i) = split(i).Replace(",", "")
Next
End If
str = String.Join("#", split)
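For completeness, here is a hedged sketch of a regular-expression version of the same idea. It assumes the hash signs always come in pairs, as in the sample string; the function name RemoveCommasBetweenHashes and the module name are made up for this example. The pattern #[^#]*# matches one hash-delimited section at a time, and the match evaluator strips the commas from just that section.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HashCommaSketch
    ' Hypothetical helper: removes commas inside each #...# pair and leaves
    ' the rest of the string untouched. Assumes the # characters are paired.
    Public Function RemoveCommasBetweenHashes(input As String) As String
        Return Regex.Replace(input, "#[^#]*#",
                             Function(m As Match) m.Value.Replace(",", ""))
    End Function

    Sub Main()
        Dim text = "Balance,#163,464.24#,Cashbook Closing Balance:,#86,689.45#,Money,End"
        Console.WriteLine(RemoveCommasBetweenHashes(text))
        ' Balance,#163464.24#,Cashbook Closing Balance:,#86689.45#,Money,End
    End Sub
End Module
```

If a string could contain an unpaired #, the loop-based answers above are easier to reason about.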

How to tokenize a string in Lotus Notes Script

I need to split a string into several tokens, just like the Java code below:
StringTokenizer st = new StringTokenizer(mystring);
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
You can use the function Split(myString, " "), where the first parameter is your string and the second one the token delimiter.
Here's the solution:
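Dim myString As String
Dim myTokens As Variant
myString = myDocument.myField(0) ' LotusScript does not allow initializing a variable in the Dim statement
myTokens = Split(myString, " ")
Dim firstToken As String
Dim secondToken As String
firstToken = myTokens(0)
secondToken = myTokens(1)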
Here's the code I implemented from the answers around for IBM Lotus Notes 7:
Function isTokenInStr(tokenStr As String, strToSearch As String) As Boolean
isTokenInStr = True
Dim tokenArr As Variant
tokenArr = Split(tokenStr, " ")
Dim idxTokenArr As Integer
For idxTokenArr = LBound(tokenArr) To UBound(tokenArr)
Dim tokenElementStr As String
tokenElementStr = tokenArr(idxTokenArr)
If InStr(strToSearch, tokenElementStr) <= 0 Then
isTokenInStr = False
Exit For
End If
Next
End Function

Resources