I've been trying to write a program that opens a text file and inserts spaces after the 5th, 6th and 7th word for every line beginning with C_PIN.
I feel that I have it almost complete but I've benn getting run-time errors 5 and 438.
Example text being read:
COMP C48 66250110810 cap sc_cap.0603_H9 43.3959 74.3331 1 0
C_PROP (PB_FREE,"Y") (VALUE,"10nF") (TOLER,"10%") (PART_NAME,"06035C103K4T2A")
C_PIN C48-1 43.3959 75.0951 1 1 0 sp.0603_H9.1 /N$1567
C_PIN C48-2 43.3959 73.5711 1 1 0 sp.0603_H9.2 GN
An extra space is required after the 1 1 0
Here's where I think the problem in my code lies:
x = " "
Do While Not EOF(infilenum%)
Line Input #infilenum%, a$
If Left$(a$, 5) = "C_PIN" Then
For Each x In InStr
'If InStr(strText, " ") Then
w = w + 1
'w = strText.Split
For w = 5 To w = 7
My.Computer.FileSystem.WriteAllText (infilename$)
strText = My.Computer.FileSystem.ReadAllText(infilename$).Replace(w, x + w)
vb.newline
Any help is much appreciated!
For Each x In InStr
isn't valid in any way!!!
You can use Split and Join:
If Left(a, 5) = "C_PIN" Then
va = Split(a, " ")
va(4) = va(4) & " " 'Add extra space
va(5) = va(4) & " "
va(6) = va(4) & " "
a = Join(va, " ") 'Join with extra spaces added
End If
Now you can write the string.
I played with an alternative version that uses a regexp to make a single shot replacement
While it looks a little complicated, the plus - other than a single shot replacement - is that it only will alter lines that
start with C_PIN
have (at least) a further 6 words
This sample take your initial file, and saves a second version with the padded spacing.
Upated for additional requirement, using two separate regexp replacements
Sub ReDo()
Dim objFso As Object
Dim objFil As Object
Dim objFil2 As Object
Dim objRegex As Object
Dim strFile As String
Dim strAll As String
strFil = "c:\temp\REnglish.txt"
strFil2 = "c:\temp\REnglish2.txt"
Set objFso = CreateObject("Scripting.FileSystemObject")
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\nC_PIN\s)((\b[^\s]+\b\s){3})(\b[^\s]+\b\s)(\b[^\s]+\b\s)(\b\d\b\s)"
.Global = True
Set objFil = objFso.OpenTextFile(strFil)
strAll = objFil.ReadAll
Set objFil2 = objFso.createtextfile(strFil2)
strAll = .Replace(strAll, "$1$2$3$4 $5 $6 ")
.Pattern = "(\nC_PIN\s)((\b[^\s]+\b\s){3})(\b[^\s]+\b\s)(\b[^\s]+\b\s)(\b\d{2,}\b\s)"
objFil2.write .Replace(strAll, "$1$2$3$4 $5 $6")
End With
objFil.Close
objFil2.Close
End Sub
Related
i've got a large (by number of lines) plain text file that I'd like to split into smaller files, also by number of lines. So if my file has around 2M lines, I'd like to split it up into 10 files that contain 200k lines, or 100 files that contain 20k lines (plus one file with the remainder; being evenly divisible doesn't matter).
I found the following code online but it takes some time and it´s not the most efficient way, anyway can anyone help on another way for me to do this ?
ReDim outputlines(maxRows - 1) As String
p = InStrRev(inputFile, ".")
part = 0
n = 0
Set FSO = CreateObject("Scripting.FileSystemObject")
Set TSRead = FSO.OpenTextFile(inputFile)
While Not TSRead.AtEndOfStream
outputlines(n) = TSRead.ReadLine
n = n + 1
If n = maxRows Then
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p) & " .csv"
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines, vbCrLf)
TSWrite.Close
ReDim outputlines(maxRows - 1) As String
n = 0
' Set wb = Workbooks.Open(inputFile, 0, True, 5)
'wb.Activate
'MEF
'wb.Close (False)
End If
Wend
TSRead.Close
If n > 0 Then
ReDim outputlines2(n - 1) As String
For i = 0 To n - 1
outputlines2(i) = outputlines(i)
Next
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p) & " .csv"
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines2, vbCrLf)
TSWrite.Close
' Set wb = Workbooks.Open(inputFile, 0, True, 5)
'wb.Activate
'MEF
'wb.Close (False)
End If
MsgBox "Done"
I would avoid the array as a "middleman" altogether. Please see the following solution, which avoids that. It should (WARNING: untested) read each line of the input file, and write to consecutive output files, creating new output files as it progresses. I created the function BuildOutputName() because it is an excellent practice to avoid repeating code.
I must admit, I really didn't follow your purpose to writing to the output file twice, but that's a moot point since this replaces that solution.
Function BuildOutputName(inputName as string, partNumber as integer) as String
' Using the Part Number, build a new filename based on the original filename.
p = InStrRev(inputFile, ".")
Return Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p) & ".csv"
End Function
ReDim outputlines(maxRows - 1) As String
p = InStrRev(inputFile, ".")
part = 1
n = 0
Set FSO = CreateObject("Scripting.FileSystemObject")
Set TSRead = FSO.OpenTextFile(inputFile)
OutputFile = BuildOutputName(inputFile, Part)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
While Not TSRead.AtEndOfStream
outputline = TSRead.ReadLine
n = n + 1
If n > maxRows Then
' This line would cause us to exceed our maximum. Close the current output
' file, and create a new one. This will be our first line in that file.
TSWrite.Close
part = part + 1
outputFile = BuildOutputName(inputFile, Part)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
n = 0
End If
TSWrite.Write outputline + vbCrLf
Wend
TSWrite.Close
TSRead.Close
MsgBox "Done"
I have a function that will add up all dollar amounts within an Excel comment box. However I have some notes written in the comment box that causes an error due to the fact that it does not start with $XX.xx is there a way to either ignore entire strings (separated by enter) or possibly make a "comment out string" special character? For example if I start a string with ; then ignore all text after that util the next line?
Here are my current functions:
Function CleanString(strIn As String) As String
Dim objRegex
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "[^0-9" & Application.DecimalSeparator & "]"
CleanString = .Replace(strIn, vbCrLf)
End With
End Function
Function commentSum(cmt As Comment) As Double
Dim vDat As Variant
Dim i As Long
Dim res As Double
vDat = Split(CleanString(cmt.Text), vbCrLf)
For i = LBound(vDat) To UBound(vDat)
If Len(vDat(i)) > 0 Then
res = res + CDbl(vDat(i))
End If
Next i
commentSum = res
End Function
Replace:
If Len(vDat(i)) > 0 Then
With:
If Len(vDat(i)) > 0 And Not Left(vDat(i), 1) = ";" Then
then it will ignore any line starting with ;
I am trying to go read a text file and count the number of times a phrase/string (not word) occurs in the text file, but so far what I have is this:
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("D:\VBscript project\testing.txt", ForReading)
strContents = objFile.ReadAll
objFile.Close
i = 0
arrLines = Split(strContents, "")
For Each strLine in arrLines
If InStr(strLine, "hi there") Then
i = i + 1
End If
Next
WScript.Echo "Number of times word occurs: " & i
This will only allow me to count the number of times a word occurs, which does not work when I try to tweak it to count phrases.
Consider the below example:
strPath = "D:\VBscript project\testing.txt"
strPhrase = "hi there"
strContent = ReadTextFile(strPath, 0)
arrContent = Split(strContent, strPhrase)
MsgBox "Number of times phrase occurs: " & UBound(arrContent)
Function ReadTextFile(strPath, lngFormat)
' lngFormat -2 - System default, -1 - Unicode, 0 - ASCII
With CreateObject("Scripting.FileSystemObject").OpenTextFile(strPath, 1, False, lngFormat)
ReadTextFile = ""
If Not .AtEndOfStream Then ReadTextFile = .ReadAll
.Close
End With
End Function
Note that Split-based method is case-sensitive.
strPath = "D:\VBscript project\testing.txt"
strPhrase = "hi there"
strContent = ReadTextFile(strPath, 0)
arrContent = Split(strContent, strPhrase)
MsgBox "Number of times phrase occurs: " & UBound(arrContent)
Function ReadTextFile(strPath, lngFormat)
' lngFormat -2 - System default, -1 - Unicode, 0 - ASCII
With CreateObject("Scripting.FileSystemObject").OpenTextFile(strPath, 1, False, lngFormat)
ReadTextFile = ""
If Not .AtEndOfStream Then ReadTextFile = .ReadAll
.Close
End With
End Function
If I understood you correctly and what you are asking for is really as simple as it looks, you could just change the "hi there" string to a parameter. This way you can dynamically tell your function what to look for.
EDIT: Thanks to #omegastripes I noticed a flaw in my previous code, so this is one that would work.
The code would be like this:
Sub yourSubName (pstrTextToCount)
Const ForReading = 1
Dim objFSO : Set objFSO = CreateObject("Scripting.FileSystemObject")
Dim objFile : Set objFile = objFSO.OpenTextFile("D:\VBscript project\testing.txt", ForReading)
Dim strContents : strContents = objFile.ReadAll
objFile.Close
' You don't need these objects anymore, so release them
Set objFile = Nothing
Set objFSO = Nothing
Dim intTextPosition : intTextPosition = 0
Dim i : i = -1
Do
i = i + 1
intTextPosition = InStr(intTextPosition + 1, strContents, pstrTextToCount)
Loop While (intTextPosition > 0)
Wscript.Echo "Number of times '" & pstrTextToCount & "' occurs: " & i
End Sub
I am assuming your Sub will only do that and this is why I enclosed it into the Sub, End Sub statements. You can add any other coding that you need, but only remember to add your required parameter on the signature of the Sub for it to work.
PS: As a good practice, always Dim your variables and release memory of objects that are no longer needed with Set objName = Nothing
Here a version using Regular Expressions so you can specify if the search needs to be case sensitive.
For testpurpose I use the contents of the script itself as input.
Dim path, phrase, content
path = Wscript.ScriptFullName
phrase = "hi there\^$*+?{}.()|[]"
content = CreateObject("Scripting.FileSystemObject").OpenTextFile(path).ReadAll
Function NumberOfPhrasesInString(phrase, text, IgnoreCase)
Dim regexpr, matches
Set regexpr = New RegExp
phrase = RegExEscape(phrase)
With regexpr
.Pattern = phrase
.Global = True
.IgnoreCase = IgnoreCase
Set matches = .Execute(text)
End With
NumberOfPhrasesInString = matches.count
End Function
Function RegExEscape(str)
Dim special
RegExEscape = str
special = "\^$*+?{.()|[]"
For i=1 To Len(special)
RegExEscape = replace(RegExEscape, Mid(special, i, 1), "\" & Mid(special, i, 1))
Next
End Function
Wscript.Echo "Number of times phrase occurs: " & NumberOfPhrasesInString(phrase, content, false)
As a bonus, since I'm switched to Ruby here also that version
path = __FILE__ # the path to this script for test purposes
phrase = 'HI THERE \ ^ $ * + ? { . ( | ['
puts phrase
content = File.read path
def number_of_phrases_in_string(phrase, text, ignoreCase=false )
escaped = Regexp.escape(phrase)
text.scan(Regexp.new(escaped, ignoreCase)).count.to_s
end
puts "Number of times phrase occurs: " + number_of_phrases_in_string(phrase, content, true)
Or in a single line
puts File.read(__FILE__).scan(Regexp.new(Regexp.escape(phrase), true)).count
The true in the last line defines casesensitivity
I need to pick a particular data from a text file. But this text file has data which is more than 1024 characters on a single line.
For eg: I need data between string text1 and text 2. My code takes only the first data between text1 & text2 in the huge line, and moves to next line. But previous huge line has multiple instances of text1 & text2. I am not able to get those data. Please help. Find below my code:
Sub Macro1()
Dim dat As String
Dim fn As String
fn = "C:\Users\SAMUEL\Desktop\123\Source1.TXT" '<---- change here
With CreateObject("Scripting.FileSystemObject").OpenTextFile(fn)
Do While Not .AtEndOfStream
dat = .Readline
If InStr(1, dat, "text1", vbTextCompare) > 0 Then
x = InStr(dat, "text1") + 8
y = InStr(dat, "text2")
Z = y - x
MsgBox Mid(dat, x, Z)
End If
Loop
.Close
End With
End Sub
I want to pick the data between Text1 and Text2 to a specific cell.
The data looks like "This is an Text1 awesome Text2 website. I like this Text1 website Text2."
This is a huge data which I copied from a website. When I save in a Text file, one line of this web data is more than 4000 characters. So the line in text file ends at 1024 characters and data moves to next line that becomes 3 lines. But My macro takes first 1024 in string "dat" and moves to second line of web data, that means it skips all data after 1024 characters to 4000 characters. The data I want which exists between Text1 and Text2 could be anywhere in whole 4000 characters, But It will be in same pattern. It will never be like Text1...Text1...Text2..
Using a regexp is a useful way to quickly replace all matches in a single shot, or work through each match (including multiple matches per line) as this sample does below.
Sub DisappearingSwannie()
Dim objFSO As Object
Dim objFil As Object
Dim objRegex As Object
Dim objRegMC As Object
Dim objRegM As Object
Dim strIn As String
Dim X
Dim lngCnt As Long
Dim fn As String
fn = "C:\temp\test.TXT" '<---- change here
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objRegex = CreateObject("vbscript.regexp")
Set objFil = objFSO.OpenTextFile(fn)
X = Split(objFil.readall, vbNewLine)
With objRegex
.Global = True
.Pattern = "text1(.+?)text2"
End With
For lngCnt = 1 To UBound(X)
If objRegex.test(X(lngCnt)) Then
Set objRegMC = objRegex.Execute(X(lngCnt))
For Each objRegM In objRegMC
Debug.Print "line " & lngCnt & " position:" & objRegM.firstindex
Next
End If
Next
End Sub
Here is a macro that looks in A1 and B1 for Text1 and Text2. It then allows you to pick a file to process and parses out the substrings from text1 to text2 inclusive. Finally, it splits them into chunks of no more than 1024 characters (ensuring that each chunk ends with a space, so as not to split words), and writes them into a series of rows in column A starting in A2.
Both the parsing of the substrings, and the breaking them up into 1024 character chunks, are accomplished using regular expressions. The "work" is done in VBA arrays as this is faster than going back and forth to the worksheet.
Since the length of a string variable can be approximately 2^31 characters, I doubt you will have any problem reading the entire document into a single variable and then processing it, instead of going line by line.
Since the macro has arguments, you will need to call it from another macro; or it should be trivial for you to change the code to allow different methods of input for text1 and text2.
There is no error checking.
If you do not want to include Text1 and Text2 in the results, a minor change in the regular expression pattern is all that would be required.
I used early binding so as to take advantage of the "hints" while writing the macro. This requires setting references as noted in the macro. However, it should be simple for you to change this to late binding if you wish.
You might also consider a modification so that the multi-row chunks are somehow differentiated from the single row chunks.
Enjoy
Option Explicit
'Set Reference to Microsoft Scripting Runtime
'Set Reference ot Microsoft VBScript Regular Expressions 5.5
Sub ExtractPhrases(Text1 As String, Text2 As String)
Dim FSO As FileSystemObject
Dim TS As TextStream
Dim FN As File, sFN As String
Dim RE As RegExp, MC As MatchCollection, M As Match
Dim RE2 As RegExp, MC2 As MatchCollection, M2 As Match
Dim sPat As String
Dim S As String, sTemp As String
Dim V() As Variant, vRes() As Variant
Dim I As Long, J As Long, K As Long
Dim C As Range
Dim rRes As Range
'Get File path
With Application.FileDialog(msoFileDialogFilePicker)
.AllowMultiSelect = False
.ButtonName = "Process File"
.Filters.Add "Text", "*.txt", 1
.FilterIndex = 1
.InitialView = msoFileDialogViewDetails
If .Show = -1 Then sFN = .SelectedItems(1)
End With
'Read File into String variable
Set FSO = New FileSystemObject
Set TS = FSO.OpenTextFile(FileName:=sFN, IOMode:=ForReading, Create:=False)
S = TS.ReadAll
'Get results
Set RE = New RegExp
Set RE2 = New RegExp
With RE2
.Global = True
.MultiLine = False
.Pattern = "(\S[\s\S]{1,1023})(?:\s+|$)"
End With
With RE
.Global = True
.IgnoreCase = True
.Pattern = "\b" & Text1 & "\b([\s\S]+?)\b" & Text2 & "\b"
If .Test(S) = True Then
ReDim vRes(0)
Set MC = RE.Execute(S)
For I = 1 To MC.Count
Set MC2 = RE2.Execute(MC(I - 1))
ReDim V(1 To MC2.Count)
For J = 1 To MC2.Count
V(J) = MC2(J - 1).SubMatches(0)
Next J
ReDim Preserve vRes(UBound(vRes) + J - 1)
For J = 1 To MC2.Count
K = K + 1
vRes(K) = V(J)
Next J
Next I
End If
End With
vRes(0) = "Phrases"
'transpose vRes
ReDim V(1 To UBound(vRes) + 1, 1 To 1)
For I = 0 To UBound(vRes)
V(I + 1, 1) = vRes(I)
Next I
Set rRes = Range("a2").Resize(rowsize:=UBound(V))
Range(rRes(1), Cells(Rows.Count, rRes.Column)).Clear
rRes = V
End Sub
I am working in classic ASP and have a string as below :
ABC
Now I want to split the string, I have tried vbCrLf, vbNewline , vblf and but none of them work.
Please suggest me an alternative to split the string. I am in a bad fix.
Are you sure, you have newlines in the string?
First you can output all character codes to find out, by which character to split:
dim i, c
for i = 1 to len(my_string)
c = mid(my_string, i, 1)
Response.Write "CHAR: " & ASC(c) & " = " & c
next
Then you have 2 options:
If you can split by one character (e.g. char num 10), you can use:
a_result = split(my_string, CHR(10))
You can grab values out of your string by using regular expression matching. This is much overhead, but if all else fails, here is how you could do that:
function findStrings(s_text, s_pattern)
dim a_out, obj_regex, obj_matches
dim obj_match, n_index
set obj_regex = New RegExp
obj_regex.IgnoreCase = true
obj_regex.Global = true
obj_regex.MultiLine = true
obj_regex.Pattern = s_pattern
set obj_matches = obj_regex.execute(s_text)
if obj_matches.Count>0 then
redim a_out(obj_matches.Count-1)
n_index = 0
for each obj_match in obj_matches
a_out(n_index) = cvStr(obj_match.Value)
n_index = n_index + 1
next
end if
findStrings = a_out
set obj_regex = Nothing
end function
a_result = findStrings(my_string, "\w+")
This assumes, that there is no whitespace in the strings you are looking for.
This happens more often than you think, you need to remove the vbcr first then replace only vblf and forget about spliting on vbcrlf because it wont work for 100% of the user envrioments out there.
A
B
C
' assuming about is in the variable str
split(replace(str,vbcr,""),vblf)