First of all, I am not from programming background and totally new to VB. For some reasons, I have to do a small scripting task at office.
I have gone through many posts here as well on other forums but not able to find the required information.
Ok so here's what I need to do:
I want to find an integer from a string (the integer can be any from 5 to 4095) and replace it with an integer of my choice.
I have done some analysis but not able to find a function which can search for "any" integer from a string. Searching for fixed integers is working fine.
e.g:
Convert: "There are 10 apples" - where 10 can be any number from 5 to 4095, not known to me.
To: "There are 5 apples" - 5 is the number I will manually give.
Any help is appreciated.
Thank you.
Edit
Final Code:
Set objFSO = CreateObject("Scripting.FileSystemObject")
strFile = "C:\Users\inkasap\Desktop\temp\IOParams_Backup.xml"
Set objFile = objFSO.OpenTextFile(strFile)
Do Until objFile.AtEndOfStream strLine = objFile.ReadLine
Dim re, strLine
Set re = New RegExp
if InStr(strLine,"name=""BIOSCUPP") > 0 Then
re.Pattern = "\d+"
Set colMatch = re.Execute(strLine)
For Each objMatch In colMatch
strLine = Replace(strLine, objMatch.Value, "30")
Next end
if WScript.Echo strLine
Loop
Well, you can use regular expressions to find any numerical value, and then replace it:
Dim re, test
Set re = New RegExp
re.Pattern = "\d+"
test = "There are 4000 apples"
Set colMatch = re.Execute(test)
For Each objMatch In colMatch
MsgBox(Replace(test,objMatch.Value,"5"))
Next
This page contains all informations you need.
You can use a RegularExpressions with Regex.Replace method:
Dim str = "There are 10 apples"
Dim regex = New System.Text.RegularExpressions.Regex("\d+", System.Text.RegularExpressions.RegexOptions.Compiled)
Dim replaced = regex.Replace(str, "5") ' "There are 5 apples" '
Edit: Just only seen that you need a vbscript instead of VB.NET approach.
Related
First Things First: The macro is run from Excel VBA editor, but performs the biggest part of its job on the previously opened Word document, where it's goal is to find the full names of the people who are the contracting parties in the agreement being analized.
The issue I'm experiencing with the code is that it is variable number of words, that I need to pull from every consecutive paragraph. If the name is Will SMITH, then its two words I need to pull, when it's Carrie Ann MOSS, then it's three words, sometimes it can be Anna Nicole SMITH BURKE, than its four words but when it's Anna Nicole SMITH-BURKE, than its five words and so on.
The other idea to get this full name is, that it always ends with a coma, and this coma is always the first coma in this paragraph, where the full name appears.
ATTENTION !!! The Paragraphs we work with are not ListParagraphs. They are the normal/ordinary ones albeit indented and numbered. I get these contracts from people who don't care to use numbered list :-(
So for the last time: The numbered list is not enabled on those paragraphs we work with.
This is how it looks like in Word and the selected words are the names and surnames that the macro is supposed to extract from the document - excluding the coma after the last surname.
Sub FindNamesCleanDraftWithLoop()
'Variables declaration
Dim WordApp As Word.Application
Dim WordDoc As Word.Document
Dim ExcelApp As Excel.Application
Dim MySheet As Excel.Worksheet
Dim Para As Word.Paragraph
Dim Rng As Word.Range
Dim RngStart As Word.Range
Dim RngEnd As Word.Range
Dim TextToFind1 As String
Dim TextToFind2 As String
Dim firstName As String
Dim startPos As Long
Dim endPos As Long
Application.ScreenUpdating = False
'Assigning object variables
Set WordApp = GetObject(, "Word.Application")
Set ExcelApp = GetObject(, "Excel.Application")
Set WordDoc = WordApp.ActiveDocument
Set MySheet = Application.ActiveWorkbook.ActiveSheet
'Set MySheet = ExcelApp.ActiveWorkbook.ActiveSheet
Set Rng = WordApp.ActiveDocument.Content
TextToFind1 = "REGON 364061169, NIP 951-24-09-783,"
TextToFind2 = "- ad."
'InStr function returns a Variant (Long) specifying the position of the first occurrence of one string within another.
startPos = InStr(1, Rng, TextToFind1) - 1 'here we get 1421, we're looking 4 "TextToFind1"
endPos = InStr(1, Rng, TextToFind2) - 1 'here we get 2246, we're looking 4 "- ad."
If startPos = 0 Or endPos = 0 Then Exit Sub
Rng.SetRange Start:=startPos, End:=endPos
Debug.Print Rng.Paragraphs.Count
If startPos = 0 Or endPos = 0 Then
MsgBox ("Client's names were not found!")
Else
'somewhere here I need your help to write some lines that will
'recognize how many words need to be pulled to extract the full
'name/names + surname/surnames and nothing else - we end on the first coma.
For Each Para In Rng.Paragraphs
firstName = Trim$(Para.Range.Words(3))
Debug.Print Para.Range.Words(1) & Para.Range.Words(2) & _
Para.Range.Words(3) & Para.Range.Words(4) & _
Para.Range.Words(5) & Para.Range.Words(6)
Next Para
End If
End Sub
There in the For Each Para ... Next Para loop, I need your help to write some lines that will recognize how many words need to be pulled to extract the full name/names + surname/surnames and nothing else - we end on the first coma - that means excluding the coma after the last surname.
This might do what you want, if I have understood correctly.
If the Number and Dot are NOT part of the paragraph, then to extract the full name, you could use:
Debug.Print Left(Para, InStr(Para, ",") - 1)
If the Number and Dot ARE part of the paragraph, then:
Dim Start As Long, Length As Long
Start = InStr(Para, ".") + 1
Length = InStr(Para, ",") - Start
Debug.Print Trim(Mid(Para, Start, Length))
If you want to split the capitalized portion of the name from the rest, post several samples of the actual strings as TEXT as I am unable to copy/paste your screenshots into Excel.
This is actually question on Natural Language Processing - NLP (as scientific field) - problem of Tokenization you already solved, but you are now facing with: Part of Sentence (POS) tagging (and construction of semantic Chunk-s). Issue of Person Identity recognition is one of first since very dawn of the discipline, so, there are many libraries for this, but I'm skeptic you'll find anything you can use from VBA that easy. Also, for comprehensive/robust solution you'll need proper lexicon with names and lastnames (suitable resource would be a Morpho-sintactic Lexicon/Dictionary, extracted only for PName LName tags).
To solve this, you'll have to do some research on keywords mentioned above. I have no experience with VBA, but asume C# and .NET would be the easiest to integrate. Therefore, Standford CoreNLP for .NET would be starting point I recon: https://sergey-tihon.github.io/Stanford.NLP.NET/StanfordCoreNLP.html
Highly relevant lexical source for English:
https://wordnet.princeton.edu/
You can query WordNet lexicon using WordNet.NET. I've used both and its very good resource&library.
This seems obvious and straightforward, but it's not working. I'm working from in Excel VBA. I get the open Word application and extract text from a bookmarked spot within a table in Word. Then the trouble starts. The resulting string is 5 chr(32) spaces. But whatever I try, I can't get rid of the spaces. Why are the spaces not being trimmed or replaced?
Dim Wd As Word.Application
Set Wd = GetObject(, "Word.Application")
Dim doc As Word.Document
'Dim r As Word.Range
'Dim p As Word.Paragraph
Dim tb As Word.Table
Set doc = Wd.ActiveDocument
Set tb = doc.Tables(1)
'tb.Select
Dim Place As String
Place = Trim(doc.Bookmarks("County").Range.Text)
'outputs length 5
Debug.Print Len(Place)
'this outputs 32 5 times so I know we have chr(32) and not something else
Dim i As Integer
For i = 1 To Len(Place)
Debug.Print Asc(Mid(Place, i, 1))
Next
'try trim
Place = Trim(Place)
Debug.Print Len(Place)
'still 5 spaces
'try replace
Dim s As String
s = VBA.Replace(Place, Chr(32), "")
Debug.Print Len(Place)
'still 5 spaces
What is happening with my code?
Probably a unicode space, consider U2000 EN QUAD Whitespace:
x="W" & chrw(&h2000) & "W"
?x
W W
?asc(mid(x,2,1))
32 <= normalized
?ascw(mid(x,2,1))
8192 <= true character
So examine the char with ascw and replace with chrw
I need to pick a particular data from a text file. But this text file has data which is more than 1024 characters on a single line.
For eg: I need data between string text1 and text 2. My code takes only the first data between text1 & text2 in the huge line, and moves to next line. But previous huge line has multiple instances of text1 & text2. I am not able to get those data. Please help. Find below my code:
Sub Macro1()
Dim dat As String
Dim fn As String
fn = "C:\Users\SAMUEL\Desktop\123\Source1.TXT" '<---- change here
With CreateObject("Scripting.FileSystemObject").OpenTextFile(fn)
Do While Not .AtEndOfStream
dat = .Readline
If InStr(1, dat, "text1", vbTextCompare) > 0 Then
x = InStr(dat, "text1") + 8
y = InStr(dat, "text2")
Z = y - x
MsgBox Mid(dat, x, Z)
End If
Loop
.Close
End With
End Sub
I want to pick the data between Text1 and Text2 to a specific cell.
The data looks like "This is an Text1 awesome Text2 website. I like this Text1 website Text2."
This is a huge data which I copied from a website. When I save in a Text file, one line of this web data is more than 4000 characters. So the line in text file ends at 1024 characters and data moves to next line that becomes 3 lines. But My macro takes first 1024 in string "dat" and moves to second line of web data, that means it skips all data after 1024 characters to 4000 characters. The data I want which exists between Text1 and Text2 could be anywhere in whole 4000 characters, But It will be in same pattern. It will never be like Text1...Text1...Text2..
Using a regexp is a useful way to quickly replace all matches in a single shot, or work through each match (including multiple matches per line) as this sample does below.
Sub DisappearingSwannie()
Dim objFSO As Object
Dim objFil As Object
Dim objRegex As Object
Dim objRegMC As Object
Dim objRegM As Object
Dim strIn As String
Dim X
Dim lngCnt As Long
Dim fn As String
fn = "C:\temp\test.TXT" '<---- change here
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objRegex = CreateObject("vbscript.regexp")
Set objFil = objFSO.OpenTextFile(fn)
X = Split(objFil.readall, vbNewLine)
With objRegex
.Global = True
.Pattern = "text1(.+?)text2"
End With
For lngCnt = 1 To UBound(X)
If objRegex.test(X(lngCnt)) Then
Set objRegMC = objRegex.Execute(X(lngCnt))
For Each objRegM In objRegMC
Debug.Print "line " & lngCnt & " position:" & objRegM.firstindex
Next
End If
Next
End Sub
Here is a macro that looks in A1 and B1 for Text1 and Text2. It then allows you to pick a file to process and parses out the substrings from text1 to text2 inclusive. Finally, it splits them into chunks of no more than 1024 characters (ensuring that each chunk ends with a space, so as not to split words), and writes them into a series of rows in column A starting in A2.
Both the parsing of the substrings, and the breaking them up into 1024 character chunks, are accomplished using regular expressions. The "work" is done in VBA arrays as this is faster than going back and forth to the worksheet.
Since the length of a string variable can be approximately 2^31 characters, I doubt you will have any problem reading the entire document into a single variable and then processing it, instead of going line by line.
Since the macro has arguments, you will need to call it from another macro; or it should be trivial for you to change the code to allow different methods of input for text1 and text2.
There is no error checking.
If you do not want to include Text1 and Text2 in the results, a minor change in the regular expression pattern is all that would be required.
I used early binding so as to take advantage of the "hints" while writing the macro. This requires setting references as noted in the macro. However, it should be simple for you to change this to late binding if you wish.
You might also consider a modification so that the multi-row chunks are somehow differentiated from the single row chunks.
Enjoy
Option Explicit
'Set Reference to Microsoft Scripting Runtime
'Set Reference ot Microsoft VBScript Regular Expressions 5.5
Sub ExtractPhrases(Text1 As String, Text2 As String)
Dim FSO As FileSystemObject
Dim TS As TextStream
Dim FN As File, sFN As String
Dim RE As RegExp, MC As MatchCollection, M As Match
Dim RE2 As RegExp, MC2 As MatchCollection, M2 As Match
Dim sPat As String
Dim S As String, sTemp As String
Dim V() As Variant, vRes() As Variant
Dim I As Long, J As Long, K As Long
Dim C As Range
Dim rRes As Range
'Get File path
With Application.FileDialog(msoFileDialogFilePicker)
.AllowMultiSelect = False
.ButtonName = "Process File"
.Filters.Add "Text", "*.txt", 1
.FilterIndex = 1
.InitialView = msoFileDialogViewDetails
If .Show = -1 Then sFN = .SelectedItems(1)
End With
'Read File into String variable
Set FSO = New FileSystemObject
Set TS = FSO.OpenTextFile(FileName:=sFN, IOMode:=ForReading, Create:=False)
S = TS.ReadAll
'Get results
Set RE = New RegExp
Set RE2 = New RegExp
With RE2
.Global = True
.MultiLine = False
.Pattern = "(\S[\s\S]{1,1023})(?:\s+|$)"
End With
With RE
.Global = True
.IgnoreCase = True
.Pattern = "\b" & Text1 & "\b([\s\S]+?)\b" & Text2 & "\b"
If .Test(S) = True Then
ReDim vRes(0)
Set MC = RE.Execute(S)
For I = 1 To MC.Count
Set MC2 = RE2.Execute(MC(I - 1))
ReDim V(1 To MC2.Count)
For J = 1 To MC2.Count
V(J) = MC2(J - 1).SubMatches(0)
Next J
ReDim Preserve vRes(UBound(vRes) + J - 1)
For J = 1 To MC2.Count
K = K + 1
vRes(K) = V(J)
Next J
Next I
End If
End With
vRes(0) = "Phrases"
'transpose vRes
ReDim V(1 To UBound(vRes) + 1, 1 To 1)
For I = 0 To UBound(vRes)
V(I + 1, 1) = vRes(I)
Next I
Set rRes = Range("a2").Resize(rowsize:=UBound(V))
Range(rRes(1), Cells(Rows.Count, rRes.Column)).Clear
rRes = V
End Sub
I've been trying to write a program that opens a text file and inserts spaces after the 5th, 6th and 7th word for every line beginning with C_PIN.
I feel that I have it almost complete but I've benn getting run-time errors 5 and 438.
Example text being read:
COMP C48 66250110810 cap sc_cap.0603_H9 43.3959 74.3331 1 0
C_PROP (PB_FREE,"Y") (VALUE,"10nF") (TOLER,"10%") (PART_NAME,"06035C103K4T2A")
C_PIN C48-1 43.3959 75.0951 1 1 0 sp.0603_H9.1 /N$1567
C_PIN C48-2 43.3959 73.5711 1 1 0 sp.0603_H9.2 GN
An extra space is required after the 1 1 0
Here's where I think the problem in my code lies:
x = " "
Do While Not EOF(infilenum%)
Line Input #infilenum%, a$
If Left$(a$, 5) = "C_PIN" Then
For Each x In InStr
'If InStr(strText, " ") Then
w = w + 1
'w = strText.Split
For w = 5 To w = 7
My.Computer.FileSystem.WriteAllText (infilename$)
strText = My.Computer.FileSystem.ReadAllText(infilename$).Replace(w, x + w)
vb.newline
Any help is much appreciated!
For Each x In InStr
isn't valid in any way!!!
You can use Split and Join:
If Left(a, 5) = "C_PIN" Then
va = Split(a, " ")
va(4) = va(4) & " " 'Add extra space
va(5) = va(4) & " "
va(6) = va(4) & " "
a = Join(va, " ") 'Join with extra spaces added
End If
Now you can write the string.
I played with an alternative version that uses a regexp to make a single shot replacement
While it looks a little complicated, the plus - other than a single shot replacement - is that it only will alter lines that
start with C_PIN
have (at least) a further 6 words
This sample take your initial file, and saves a second version with the padded spacing.
Upated for additional requirement, using two separate regexp replacements
Sub ReDo()
Dim objFso As Object
Dim objFil As Object
Dim objFil2 As Object
Dim objRegex As Object
Dim strFile As String
Dim strAll As String
strFil = "c:\temp\REnglish.txt"
strFil2 = "c:\temp\REnglish2.txt"
Set objFso = CreateObject("Scripting.FileSystemObject")
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\nC_PIN\s)((\b[^\s]+\b\s){3})(\b[^\s]+\b\s)(\b[^\s]+\b\s)(\b\d\b\s)"
.Global = True
Set objFil = objFso.OpenTextFile(strFil)
strAll = objFil.ReadAll
Set objFil2 = objFso.createtextfile(strFil2)
strAll = .Replace(strAll, "$1$2$3$4 $5 $6 ")
.Pattern = "(\nC_PIN\s)((\b[^\s]+\b\s){3})(\b[^\s]+\b\s)(\b[^\s]+\b\s)(\b\d{2,}\b\s)"
objFil2.write .Replace(strAll, "$1$2$3$4 $5 $6")
End With
objFil.Close
objFil2.Close
End Sub
I want to search several words at the same time in a text file.
For example think I want to search these 3 words: Majid,superuser,device
Normally I should search for them one by one, and I can not search all of them at the same time. So I want to search these words at the same time in a text file.
I want to enter those 3 words in a text file, one word per line. Let's name it SearchText. Now I have a Target Text which I want to search those words in it. Let's name it TargetText.
I want to tell an app or something similar to get words from SearchText and find them in TargetText and Highlights them or gives me the find result.
I hope I'm clear. So can anyone hep me?
You're clear. I think the best option would be Regex.
Try this:
Option Explicit
Dim oFso : Set oFso = CreateObject("Scripting.FileSystemObject")
Dim srcPath : SrcPath = oFso.GetParentFolderName(WScript.ScriptFullName)
Dim sWordList : sWordList = ofso.OpenTextFile(oFso.BuildPath(srcPath, "search.txt")).ReadAll()
Dim sTargFile : sTargFile = ofso.OpenTextFile(oFso.BuildPath(srcPath, "target.txt")).ReadAll()
Dim strWords : strWords = Join(Split(sWordList, vbCrLf), "|")
Dim oReg : Set oReg = New RegExp
Dim oDict : Set oDict = CreateObject("Scripting.Dictionary") 'for found words
oDict.CompareMode = vbTextCompare 'case insensitive
With oReg
.Global = True
.IgnoreCase = True
.Pattern = "("& strWords &")" ' (word1|word2|word3|etc..)
'if the words contain some special chars for regex, you may need to escape with \ char
'see the information below: http://goo.gl/cqGVp
Dim collFND : Set collFND = oReg.Execute(sTargFile)
Dim Fnd
'WScript.Echo String(50, "-")
For Each Fnd In collFND
With Fnd
'WScript.Echo "Word : "& .Value, ", Position: ("& .FirstIndex &","& .Length &")"
oDict(.Value) = ""
End With
Next
Set collFND = Nothing
'WScript.Echo String(50, "-"), vbCrLf, "Higlighted Output:", oReg.Replace(sTargFile, "[$1]")
ofso.CreateTextFile(oFso.BuildPath(srcPath, "foundwords.txt"),True).Write(Join(oDict.Keys, vbCrLf)) 'found words saved
End With