I am working in classic ASP and have a string as below:
ABC
Now I want to split the string. I have tried vbCrLf, vbNewLine and vbLf, but none of them work.
Please suggest an alternative way to split the string. I am in a bad fix.
Are you sure you have newlines in the string?
First, you can output all the character codes to find out which character to split by:
dim i, c
for i = 1 to len(my_string)
c = mid(my_string, i, 1)
Response.Write "CHAR: " & ASC(c) & " = " & c
next
Then you have two options:
If you can split by a single character (e.g. character code 10), you can use:
a_result = split(my_string, CHR(10))
You can grab the values out of your string with regular expression matching. This carries more overhead, but if all else fails, here is how you could do it:
function findStrings(s_text, s_pattern)
dim a_out, obj_regex, obj_matches
dim obj_match, n_index
set obj_regex = New RegExp
obj_regex.IgnoreCase = true
obj_regex.Global = true
obj_regex.MultiLine = true
obj_regex.Pattern = s_pattern
set obj_matches = obj_regex.execute(s_text)
if obj_matches.Count>0 then
redim a_out(obj_matches.Count-1)
n_index = 0
for each obj_match in obj_matches
a_out(n_index) = CStr(obj_match.Value)
n_index = n_index + 1
next
end if
findStrings = a_out
set obj_regex = Nothing
end function
a_result = findStrings(my_string, "\w+")
This assumes that there is no whitespace in the strings you are looking for.
This happens more often than you think. You need to remove the vbCr first and then split only on vbLf. Forget about splitting on vbCrLf, because it won't work in 100% of the user environments out there.
A
B
C
' assuming the above is in the variable str
split(replace(str,vbcr,""),vblf)
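The same idea can be wrapped in a small helper. This is only a minimal sketch: the name SplitLines is hypothetical, and it goes one step further than the one-liner above by also treating a lone vbCr as a line break (an assumption about what the input may contain). It is written without type declarations so that it should run in classic ASP (VBScript) as well as in VBA.

```vb
' Hypothetical helper: split text into lines regardless of line-ending style.
Function SplitLines(text)
    Dim normalized
    ' Turn CRLF into LF first, then any remaining lone CR into LF.
    normalized = Replace(text, vbCrLf, vbLf)
    normalized = Replace(normalized, vbCr, vbLf)
    ' Every line break is now a single LF, so one Split is enough.
    SplitLines = Split(normalized, vbLf)
End Function
```

It could be used as a_result = SplitLines(str), after which a_result(0) to a_result(UBound(a_result)) hold the individual lines.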
I've got a large (by number of lines) plain text file that I'd like to split into smaller files, also by number of lines. So if my file has around 2M lines, I'd like to split it up into 10 files that contain 200k lines, or 100 files that contain 20k lines (plus one file with the remainder; being evenly divisible doesn't matter).
I found the following code online, but it takes some time and it's not the most efficient way. Can anyone suggest another way for me to do this?
ReDim outputlines(maxRows - 1) As String
p = InStrRev(inputFile, ".")
part = 0
n = 0
Set FSO = CreateObject("Scripting.FileSystemObject")
Set TSRead = FSO.OpenTextFile(inputFile)
While Not TSRead.AtEndOfStream
outputlines(n) = TSRead.ReadLine
n = n + 1
If n = maxRows Then
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p) & " .csv"
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines, vbCrLf)
TSWrite.Close
ReDim outputlines(maxRows - 1) As String
n = 0
' Set wb = Workbooks.Open(inputFile, 0, True, 5)
'wb.Activate
'MEF
'wb.Close (False)
End If
Wend
TSRead.Close
If n > 0 Then
ReDim outputlines2(n - 1) As String
For i = 0 To n - 1
outputlines2(i) = outputlines(i)
Next
part = part + 1
outputFile = Left(inputFile, p - 1) & " PART" & part & Mid(inputFile, p) & " .csv"
Set TSWrite = FSO.CreateTextFile(outputFile, True)
TSWrite.Write Join(outputlines2, vbCrLf)
TSWrite.Close
' Set wb = Workbooks.Open(inputFile, 0, True, 5)
'wb.Activate
'MEF
'wb.Close (False)
End If
MsgBox "Done"
I would avoid the array as a "middleman" altogether. Please see the following solution. It should (WARNING: untested) read each line of the input file and write to consecutive output files, creating new output files as it progresses. I created the function BuildOutputName() because it is good practice to avoid repeating code.
I must admit, I didn't really follow your purpose in writing to the output file twice, but that's a moot point since this replaces that solution.
Function BuildOutputName(ByVal inputName As String, ByVal partNumber As Integer) As String
' Using the part number, build a new filename based on the original filename.
Dim p As Long
p = InStrRev(inputName, ".")
' VBA returns a value by assigning to the function name (Return is VB.NET syntax).
BuildOutputName = Left(inputName, p - 1) & " PART" & partNumber & Mid(inputName, p) & ".csv"
End Function
' The buffer array and the dot position are no longer needed here.
part = 1
n = 0
Set FSO = CreateObject("Scripting.FileSystemObject")
Set TSRead = FSO.OpenTextFile(inputFile)
OutputFile = BuildOutputName(inputFile, Part)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
While Not TSRead.AtEndOfStream
outputline = TSRead.ReadLine
n = n + 1
If n > maxRows Then
' This line would cause us to exceed our maximum. Close the current output
' file, and create a new one. This will be our first line in that file.
TSWrite.Close
part = part + 1
outputFile = BuildOutputName(inputFile, Part)
Set TSWrite = FSO.CreateTextFile(outputFile, True)
n = 1 ' the line about to be written counts as line 1 of the new file
End If
TSWrite.Write outputline + vbCrLf
Wend
TSWrite.Close
TSRead.Close
MsgBox "Done"
I'm trying to determine if a column has a header or not via VBA. Basically the column will have data following an unknown but identical regex pattern. My plan is to test if A2 has the same type regex string as A1. It would likely even be the same ID + 1. Eg
A1 = X001
A2 = X002
Func IsHeader("A") = True
A1 = ID's
A2 = X001
Func IsHeader("A") = False
I've got an idea to utilize an existing script I made to generate a regex pattern based on an input alphanumeric string, but I'm interested to see what other ideas/ways people might have of solving the issue. I realize there isn't much code, but I know I can do this and I'm working on it now. If you're not interested in answering, that's OK!
Update: Posted an answer, but I'm looking for more than a code review, as I realize there is an exchange for that. I'd like to know better ways to achieve the goal with a different attack vector.
This is what I got! I'm not sure how SO feels about code reviews, but I'm interested in what people think and how else they could "skin the cat", so please feel free to post an answer.
Sub Test()
If IsHeader = True Then
MsgBox "Has Header"
Else
MsgBox "No Header"
End If
End Sub
Public Function IsHeader() As Boolean
A1Pattern = RegExPattern(Range("A1").Value)
A2Pattern = RegExPattern(Range("A2").Value)
If A1Pattern = A2Pattern Then
IsHeader = True
End If
End Function
Public Function RegExPattern(my_string) As String
RegExPattern = ""
'''Special Character Section'''
Dim special_charArr() As String
Dim special_char As String
special_char = "!,#,#,$,%,^,&,*,+,/,\,;,:"
special_charArr() = Split(special_char, ",")
'''Special Character Section'''
'''Alpha Section'''
Dim regexp As Object
Set regexp = CreateObject("vbscript.regexp")
Dim strPattern As String
strPattern = "([a-z])"
With regexp
.ignoreCase = True
.Pattern = strPattern
End With
'''Alpha Section'''
Dim buff() As String
'my_string = "test1*1#"
ReDim buff(Len(my_string) - 1)
Dim i As Variant
For i = 1 To Len(my_string)
buff(i - 1) = Mid$(my_string, i, 1)
char = buff(i - 1)
If IsNumeric(char) = True Then
'MsgBox char & " = Number"
RegExPattern = RegExPattern & "([0-9])"
End If
For Each Key In special_charArr
special = InStr(char, Key)
If special = 1 Then
If Key <> "*" Then
'MsgBox char & " = Special NOT *"
RegExPattern = RegExPattern & "^[!##$%^&()].*$"
Else
'MsgBox char & " = *"
RegExPattern = RegExPattern & "."
End If
End If
Next
If regexp.Test(char) Then
'MsgBox char & " = Alpha"
RegExPattern = RegExPattern & "([a-z])"
End If
Next
'RegExPattern = Chr(34) & RegExPattern & Chr(34)
'MsgBox RegExPattern
End Function
I have a requirement to remove the text between two strings.
An example of the text is:
Abc%678x”cv ","#metadata":{abxkl "DataArray"}},{"columnName":"
The requirement is to start removing text from ,"#met till "}
The requirement is to use ,"#met and "} as start and end identifiers and remove text between them including the identifiers.
There could be multiple occurrences of these start and end identifiers within the file.
The output should look like this:
Abc%678x”cv "},{"columnName":"
How to write an Excel formula or simple VBA script to remove text between two strings, including identifiers?
Formula:
=LEFT(A1,FIND(",""#met",A1)-1)&RIGHT(A1,LEN(A1)-FIND("}",A1,FIND("""#met",A1)))
VBA function:
Function RemoveBetweenSeparators( _
ByVal MyString As String, _
ByVal SepL As String, _
ByVal SepR As String) _
As String
Dim sL As String
Dim sR As String
sL = Split(MyString, SepL)(0)
sR = Replace(MyString, sL, "")
sR = Replace(sR, Split(sR, SepR)(0) & SepR, "")
RemoveBetweenSeparators = sL & sR
End Function
Which can be used like this:
=RemoveBetweenSeparators(A1,",""#met","}")
EDIT: I also missed the 'multiple occurrences' requirement the first time round! That makes it a little trickier, but try this:
Function RemoveBetweenSeparatorsMultiple( _
ByVal MyString As String, _
ByVal SepL As String, _
ByVal SepR As String) _
As String
Dim sOut As String
Dim sL As String
Do Until InStr(MyString, SepL) = 0
sL = Split(MyString, SepL)(0)
sOut = sOut & sL
MyString = Replace(MyString, sL & SepL, "", 1, 1)
sL = Split(MyString, SepR)(0)
MyString = Replace(MyString, sL & SepR, "", 1, 1)
Loop
RemoveBetweenSeparatorsMultiple = sOut & MyString
End Function
My apologies, I didn't notice that there could be multiple occurrences. I'll edit my answer later.
Assuming the original text is stored in A1.
A2=LEFT(A1,FIND(",""#met",A1)-1)&RIGHT(A1,LEN(A1)-FIND("""}",A1)-1)
Note: To make Excel treat a double quote mark as normal text inside a string, you have to type two " to represent one ".
If there may be multiple occurrences, try this:
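Private Function RemoveText(ByVal tgtString As String, ByVal StartText As String, ByVal EndText As String) As String
Dim startPos As Long, endPos As Long
startPos = InStr(1, tgtString, StartText)
Do While startPos > 0
' Look for the end marker only after the start marker
endPos = InStr(startPos + Len(StartText), tgtString, EndText)
If endPos = 0 Then Exit Do ' unmatched start marker: leave the rest untouched
tgtString = Left(tgtString, startPos - 1) & Mid(tgtString, endPos + Len(EndText))
startPos = InStr(1, tgtString, StartText)
Loop
RemoveText = tgtString
End Function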
Private Sub test()
'remove certain string in A1 and store the result in A2
Range("A2").Value = RemoveText(Range("A1").Value, ",""#met", """}")
End Sub
Maybe something like this (not tested, though!):
Function cleanedStr(inpStr As String, beginDel As String, endDel As String) As String
Dim idx as long
Dim take as boolean
Dim outStr as String
Dim myCh as String
take = true
outStr = ""
for idx = 1 to len(inpStr)
myCh = mid(inpStr, idx, 1)
if myCh = beginDel then take = false
if take then
outStr = outStr & myCh
else
if myCh = endDel then take = true
end if
next idx
cleanedStr = outStr
end Function
Mind that each identifier can be 1 character only.
beginDel would be # and endDel would be }
This can be done easily using VBA and regular expressions:
Option Explicit
Function RemoveBetweenDelimiters(S As String) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.ignorecase = True
.Pattern = ",""#met[^}]+}"
RemoveBetweenDelimiters = .Replace(S, "")
End With
End Function
The regex interpretation:
,"#met[^}]+}
Options: Case insensitive; ^$ match at line breaks
Match the character string “,"#met” literally: ,"#met
Match any character that is NOT a “}”: [^}]+
Between one and unlimited times, as many times as possible, giving back as needed (greedy): +
Match the character “}” literally: }
Created with RegexBuddy
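If the delimiters should not be hard-coded, the regex approach can be parameterized. The following is only a sketch under assumptions: the names RemoveBetween and EscapeRegex are hypothetical, the delimiters are escaped so that they match literally, and a non-greedy [\s\S]*? is swapped in for [^}]+ so that the end marker may be longer than one character.

```vb
Option Explicit

' Hypothetical generalization: remove everything from startText through endText,
' including both markers, for every occurrence in s.
Function RemoveBetween(ByVal s As String, ByVal startText As String, ByVal endText As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' [\s\S]*? matches any characters (line breaks included), as few as possible
    re.Pattern = EscapeRegex(startText) & "[\s\S]*?" & EscapeRegex(endText)
    RemoveBetween = re.Replace(s, "")
End Function

' Put a backslash in front of regex metacharacters so the markers match literally.
Private Function EscapeRegex(ByVal s As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "([\\\^\$\.\|\?\*\+\(\)\[\]\{\}])"
    EscapeRegex = re.Replace(s, "\$1")
End Function
```

From a worksheet it could be called as =RemoveBetween(A1,",""#met","""}"), which uses the start and end identifiers from the question.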
I need to programmatically remove non-printable characters such as:
tabs - char(9)
line breaks - char(10)
carriage return - char(13)
data link escape - char(16)
I started a generic function that will be called from the LostFocus event of the MS Access form field.
I have not figured out how to identify when the string contains the unwanted characters.
Function RemoveNonPrintableCharacters(ByVal TextData) As String
Dim dirtyString As String
Dim cleanString As String
Dim iPosition As Integer
If IsNull(TextData) Then
Exit Function
End If
dirtyString = TextData
cleanString = ""
For iPosition = 1 To Len(dirtyString)
Select Case Asc(Mid(dirtyString, iPosition, 1))
Case 9 ' Char(9)
Case 10 ' Char(10)
Case 13 ' Char(13)
Case 16 ' Char(16)
Case Else ' Add character to clean field.
cleanString = cleanString & Mid(dirtyString, iPosition, 1)
End Select
Next
RemoveNonPrintableCharacters = cleanString
End Function
These are 2 strings I have been using whilst testing:
This line, has multiple, tabs that need to be removed
This line, has multiple,
line
breaks
that
need to be removed
This line, has multiple, tabs that need to be removed
And
Also contains
multiple,
line
breaks
that
need to be removed
This is the top Google result when I search for a quick function to use. I've had a good look around, but nothing that fully solves my issue has come up.
The main issue is that all of these functions touch the original string even if there's nothing to fix, which slows things down.
I've rewritten it so that it only amends the string when it finds a bad character, and expanded it to cover all non-printable characters and characters beyond standard ASCII.
Public Function Clean_NonPrintableCharacters(Str As String) As String
'Removes non-printable characters from a string
Dim cleanString As String
Dim i As Integer
cleanString = Str
For i = Len(cleanString) To 1 Step -1
'Debug.Print Asc(Mid(Str, i, 1))
Select Case Asc(Mid(Str, i, 1))
Case 1 To 31, Is >= 127
'Bad stuff
'https://www.ionos.com/digitalguide/server/know-how/ascii-codes-overview-of-all-characters-on-the-ascii-table/
cleanString = Left(cleanString, i - 1) & Mid(cleanString, i + 1)
Case Else
'Keep
End Select
Next i
Clean_NonPrintableCharacters = cleanString
End Function
A = Chr(9) & "Cat" & Chr(10) & vbCrLf
' Replace needs three arguments: the text, what to find, and what to put in its place
A = Replace(A, Chr(10), "")
A = Replace(A, Chr(13), "")
A = Replace(A, Chr(9), "")
MsgBox A
This is how one normally does it.
Your code is creating a lot of implicit variables.
It seems as if this should be much simpler, using the Excel Clean function. The following also works:
myString = Worksheets("Sheet1").Range("A" & tRow).Value
myString = Application.WorksheetFunction.Clean(myString)
You can also use other normal and home-grown Excel functions:
myString = Application.WorksheetFunction.Trim(myString)
Still haven't gotten the Substitute function to work in this way, but I'm working on it.
Function RemoveNonPrintableCharacters(ByVal TextData) As String
Dim dirtyString As String
Dim cleanString As String
Dim iPosition As Integer
If IsNull(TextData) Then
Exit Function
End If
dirtyString = TextData
cleanString = ""
For iPosition = 1 To Len(dirtyString)
Select Case Asc(Mid(dirtyString, iPosition, 1))
Case 9, 10, 13, 16
cleanString = cleanString & " "
Case Else
cleanString = cleanString & Mid(dirtyString, iPosition, 1)
End Select
Next
RemoveNonPrintableCharacters = cleanString
End Function
'first you need to find a character
YourStr = "Bla bla bla..."
if instr(YourStr, chr(10)) > 0 then
NewStr = Replace(YourStr, Chr(10),"")
end if
I'm replacing non-printable characters with a space character chr(32) but you can alter this to suit your needs.
Function RemoveNonPrintableCharacters(ByVal TextData) As String
Dim sClean$
sClean = Replace(TextData, Chr(9), Chr(32))
sClean = Replace(sClean, Chr(10), Chr(32))
sClean = Replace(sClean, Chr(13), Chr(32))
sClean = Replace(sClean, Chr(16), Chr(32))
RemoveNonPrintableCharacters = sClean
End Function
This removes non-printing characters from the right side of the string only and does not replace them with spaces.
Function fRemoveNonPrintableCharacters(ByVal TextData) As String
Dim dirtyString As String
Dim cleanString As String
Dim iPosition As Integer
If IsNull(TextData) Then
Exit Function
End If
dirtyString = TextData
cleanString = ""
For iPosition = Len(dirtyString) To 1 Step -1
Select Case Asc(Mid(dirtyString, iPosition, 1))
Case 9, 10, 13, 16, 32, 160
cleanString = cleanString
Case Else
cleanString = Left(dirtyString, iPosition)
Exit For
End Select
Next
fRemoveNonPrintableCharacters = cleanString
End Function
When Unicode characters appear, the code presented here needs to be modified. My proposal also removes characters that the program does not recognize:
Public Function Clean_NonPrintableCharacters(Str As String) As String
'Removes non-printable characters from a string
Dim cleanString As String
Dim i As Integer
cleanString = Str
For i = Len(cleanString) To 1 Step -1
If Chr(Asc(Mid(cleanString, i, 1))) <> Mid(cleanString, i, 1) Then
cleanString = Left(cleanString, i - 1) & Mid(cleanString, i + 1)
End If
Next i
Clean_NonPrintableCharacters = WorksheetFunction.Clean(cleanString)
End Function
It can be solved with a regular expression (add "Microsoft VBScript Regular Expressions 5.5" under Tools > References in the VBE):
Function NormalString(text As String, Optional filler = vbNullString) As String
Dim re As New RegExp
With re
.Pattern = "([\x00-\x1F\xA0])"
.Global = True
text = .Replace(text, filler)
End With
NormalString = text
End Function
If there are special characters, then replace them with spaces
If InStr(TextData, Chr(9)) > 0 Then TextData = Replace(TextData, Chr(9), Chr(32))
If InStr(TextData, Chr(10)) > 0 Then TextData = Replace(TextData, Chr(10), Chr(32))
If InStr(TextData, Chr(13)) > 0 Then TextData = Replace(TextData, Chr(13), Chr(32))
If InStr(TextData, Chr(16)) > 0 Then TextData = Replace(TextData, Chr(16), Chr(32))
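The repeated Replace calls above can also be driven by a list of character codes, which makes it easier to add or remove codes later. This is a minimal sketch, not a drop-in replacement: the name ReplaceControlChars and the optional filler parameter are hypothetical, and the code list is simply the four codes from the question.

```vb
' Hypothetical helper: replace each listed control character with filler.
Function ReplaceControlChars(ByVal TextData As Variant, _
                             Optional ByVal filler As String = " ") As String
    Dim code As Variant
    ' A Null field (e.g. an empty Access control) yields an empty string.
    If IsNull(TextData) Then Exit Function
    ReplaceControlChars = TextData
    ' 9 = tab, 10 = line feed, 13 = carriage return, 16 = data link escape
    For Each code In Array(9, 10, 13, 16)
        ReplaceControlChars = Replace(ReplaceControlChars, Chr(code), filler)
    Next code
End Function
```

Calling ReplaceControlChars(TextData, "") would remove the characters instead of replacing them with spaces.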
I've been trying to write a program that opens a text file and inserts spaces after the 5th, 6th and 7th word for every line beginning with C_PIN.
I feel that I have it almost complete, but I've been getting run-time errors 5 and 438.
Example text being read:
COMP C48 66250110810 cap sc_cap.0603_H9 43.3959 74.3331 1 0
C_PROP (PB_FREE,"Y") (VALUE,"10nF") (TOLER,"10%") (PART_NAME,"06035C103K4T2A")
C_PIN C48-1 43.3959 75.0951 1 1 0 sp.0603_H9.1 /N$1567
C_PIN C48-2 43.3959 73.5711 1 1 0 sp.0603_H9.2 GN
An extra space is required after the 1 1 0
Here's where I think the problem in my code lies:
x = " "
Do While Not EOF(infilenum%)
Line Input #infilenum%, a$
If Left$(a$, 5) = "C_PIN" Then
For Each x In InStr
'If InStr(strText, " ") Then
w = w + 1
'w = strText.Split
For w = 5 To w = 7
My.Computer.FileSystem.WriteAllText (infilename$)
strText = My.Computer.FileSystem.ReadAllText(infilename$).Replace(w, x + w)
vb.newline
Any help is much appreciated!
For Each x In InStr
isn't valid in any way!!!
You can use Split and Join:
If Left(a, 5) = "C_PIN" Then
va = Split(a, " ")
va(4) = va(4) & " " 'Add extra space after the 5th word
va(5) = va(5) & " " 'and after the 6th word
va(6) = va(6) & " " 'and after the 7th word
a = Join(va, " ") 'Join with extra spaces added
End If
Now you can write the string.
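Putting the Split and Join idea into a complete read/write loop could look roughly like this. It is a sketch only: the procedure name PadCPinLines and its two path parameters are hypothetical, and it assumes the words on a C_PIN line are separated by single spaces, as in the sample text.

```vb
' Hypothetical wrapper: copy inPath to outPath, adding an extra space after
' the 5th, 6th and 7th word of every line that starts with C_PIN.
Sub PadCPinLines(ByVal inPath As String, ByVal outPath As String)
    Dim inNum As Integer, outNum As Integer
    Dim a As String, va() As String, i As Long

    inNum = FreeFile
    Open inPath For Input As #inNum
    outNum = FreeFile
    Open outPath For Output As #outNum

    Do While Not EOF(inNum)
        Line Input #inNum, a
        If Left$(a, 5) = "C_PIN" Then
            va = Split(a, " ")
            ' Only touch lines that really have at least 7 words
            If UBound(va) >= 6 Then
                For i = 4 To 6      ' zero-based indexes of words 5 to 7
                    va(i) = va(i) & " "
                Next i
                a = Join(va, " ")
            End If
        End If
        Print #outNum, a            ' every line is copied, changed or not
    Loop

    Close #inNum
    Close #outNum
End Sub
```

Writing to a separate output file keeps the original untouched, so the result can be compared against the input before replacing it.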
I played with an alternative version that uses a regexp to make a single-shot replacement.
While it looks a little complicated, the advantage (other than the single-shot replacement) is that it will only alter lines that
start with C_PIN
have (at least) a further 6 words
This sample takes your initial file and saves a second version with the padded spacing.
Updated for an additional requirement, using two separate regexp replacements.
Sub ReDo()
Dim objFso As Object
Dim objFil As Object
Dim objFil2 As Object
Dim objRegex As Object
Dim strFil As String
Dim strFil2 As String
Dim strAll As String
strFil = "c:\temp\REnglish.txt"
strFil2 = "c:\temp\REnglish2.txt"
Set objFso = CreateObject("Scripting.FileSystemObject")
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\nC_PIN\s)((\b[^\s]+\b\s){3})(\b[^\s]+\b\s)(\b[^\s]+\b\s)(\b\d\b\s)"
.Global = True
Set objFil = objFso.OpenTextFile(strFil)
strAll = objFil.ReadAll
Set objFil2 = objFso.createtextfile(strFil2)
' $3 is the last repeat captured inside $2, so it is left out to avoid duplicating a word
strAll = .Replace(strAll, "$1$2$4 $5 $6 ")
.Pattern = "(\nC_PIN\s)((\b[^\s]+\b\s){3})(\b[^\s]+\b\s)(\b[^\s]+\b\s)(\b\d{2,}\b\s)"
objFil2.write .Replace(strAll, "$1$2$4 $5 $6")
End With
objFil.Close
objFil2.Close
End Sub