save csv as pipe delimited using vba - excel

I have a csv file which has data as shown below:
PPIC,11/20/2013 10:23,11431,10963,,Tremors ,
PPIC,11/20/2013 10:23,11431,11592,,"Glioblastoma, Barin ",
The key difference is that the last column of row 1 contains a single word, whereas in row 2 it contains text enclosed in double quotes that itself includes a comma.
This causes my bulk import routine to import the second row incorrectly: when BULK INSERT runs, it splits the quoted value at the embedded comma, so the second row ends up with too many columns.
I have read a lot of posts on Stack Overflow, and many of them suggest using a pipe-delimited file as the input for BULK INSERT, which avoids any inconsistencies with the quoted text.
How can I convert this comma-separated file into a pipe-delimited file using an Excel VBA macro? I want to keep the process automated (it should take the input CSV file, convert it to pipe-delimited, and then pass the file on for importing).
OR
How can I address the inconsistent quotes to be used whilst doing a BULK Insert?
Any thoughts / help appreciated.

This would do it:
Sub MySub()
Dim FileString As String
Dim Pattern As String
Dim ReplacementPattern As String
Dim ChangedStr As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
FileString = "PPIC,11/20/2013 10:23,11431,10963,,Tremors ," & vbCrLf & "PPIC,11/20/2013 10:23,11431,11592,,""Glioblastoma, Barin "","
Pattern = "(""[^""]*""|[^"",]*)?,"
ReplacementPattern = "$1|"
RE.Pattern = Pattern
RE.Global = True
RE.MultiLine = True
ChangedStr = RE.Replace(FileString, ReplacementPattern)
End Sub
Hope this does the trick
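To keep the process automated, the same replacement can be applied to a file rather than a hard-coded string. This is only a rough sketch: the two paths are made-up placeholders, and it assumes the file fits in memory and that quoted fields never contain doubled quotes ("") or pipe characters.

```vba
' Sketch only: InPath and OutPath are hypothetical placeholders.
Sub ConvertCsvToPipe()
    Const InPath As String = "C:\temp\input.csv"    ' assumed input file
    Const OutPath As String = "C:\temp\output.txt"  ' assumed output file
    Dim fileNum As Integer
    Dim content As String
    Dim re As Object

    ' Read the whole file into one string.
    fileNum = FreeFile
    Open InPath For Binary Access Read As #fileNum
    content = Space$(LOF(fileNum))
    Get #fileNum, , content
    Close #fileNum

    ' Replace only the commas that end a field; commas inside
    ' double-quoted text are consumed as part of the field.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "(""[^""]*""|[^"",]*)?,"
    content = re.Replace(content, "$1|")

    ' Write the converted text; the trailing semicolon stops
    ' Print from appending an extra line break.
    fileNum = FreeFile
    Open OutPath For Output As #fileNum
    Print #fileNum, content;
    Close #fileNum
End Sub
```

Note that the double quotes around the quoted field are still present in the output, so the import step either has to tolerate them or they have to be stripped separately.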

Related

Fastest way to transfer array to text file

I have a one dimensional array with more than 3 million items and I would like to transfer it to a text file. I tried a FileSystemObject method, which is not fast enough for me. So I tried to write to cells in a worksheet and export it as txt file, but I am still searching for a faster way to write an array to a txt file.
Please also try Put (and perhaps Get later on, for reading the data back):
Private Sub TestPut(myArray() As String)
Dim handle As Long
handle = FreeFile
Open Application.Defaultfilepath & "\Whatever.txt" For Binary As #handle
Put #handle, , myArray
Close #handle
End Sub
You may join your array into a single string to prevent unwanted descriptors (see the Put documentation) and to choose CR, CRLF or any other delimiter,
but only if the resulting string's length does not exceed 2,147,483,647 bytes:
Put #handle, , Join(myArray, vbCrLf)
Try something like this:
FilePath = "C:\output.txt"
Set FileStream = CreateObject("ADODB.Stream")
FileStream.Open
FileStream.Type = 2 'Text
FileStream.Charset = "utf-8"
FileStream.WriteText VBA.Strings.Join(YourArray, vbCrLf) 'one item per line; without a delimiter argument Join uses a space
FileStream.SaveToFile FilePath
FileStream.Close

Excel VBA extract date from CSV txt

I'm having trouble extracting this text, exactly as it appears, from a CSV. There are similar questions posted on SO but they don't match my requirements:
I want to extract "31 January 2017" from this row:
4,'31 January 2017','Funds Received/Credits',56,,401.45,
Currently, VBA considers it "31 Jan" without the year. I've tried applying .NumberFormat to the cell (general, text, date).
SOLUTION REQUIREMENTS:
No user action required -- Interact with the file only using VBA (not using File > Import > Wizard)
Compatible with VBA Excel 2003
Extract the full text regardless of Excel or operating system date settings
Thank you for your ideas
You can use the split function, using the comma as a delimiter like this:
sResult = Split("4,'31 January 2017','Funds Received/Credits',56,,401.45, ", ",")(1)
If you don't want the single quotes, wrap the result in the Replace function like this:
sResult = Replace(Split("4,'31 January 2017','Funds Received/Credits',56,,401.45, ", ",")(1), "'", "")
If you include the "Microsoft VBScript Regular Expressions 5.5" Reference, you can set up a pattern that will extract the whole date if it is found. For example:
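Dim tstring As String
Dim myregexp As RegExp
Dim StrMatch As Object
tstring = "4,'31 January 2017','Funds Received/Credits',56,,401.45," 'Line from the CSV, or entire CSV as one string
Set myregexp = New RegExp
myregexp.Global = True 'return every date, not just the first one
myregexp.IgnoreCase = True 'otherwise [A-Z] will not match the lowercase letters in "January"
myregexp.Pattern = "\d{1,2} [A-Z]{3,9} \d{4}"
Set StrMatch = myregexp.Execute(tstring)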
The benefit of this method is that, provided myregexp.Global is set to True, all the dates in the CSV are pulled out in one call, which is typically faster than splitting line by line. Each date can then be accessed by using
DateStr = StrMatch.Item(index)
for the whole match, or capture groups can be added to the pattern to get specific parts of the date (such as day, month, year):
myregexp.Pattern = "(\d{1,2}) ([A-Z]{3,9}) (\d{4})"
Set StrMatch = myregexp.Execute(tstring)
DateStr = StrMatch.Item(index1).SubMatches(index2)
It is a very powerful tool, with a simple set of symbols for development of patterns. I highly suggest you familiarize yourself with it for manipulation of large strings.
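For the Excel 2003 requirement, the same idea can be wrapped in a small late-bound function, so no reference needs to be ticked. This is just a sketch: the function name ExtractDateText is made up for illustration, and it assumes the date always appears as day, month name, four-digit year.

```vba
' Hypothetical helper: returns the first "d Month yyyy" text found in a line,
' or an empty string when nothing matches.
Function ExtractDateText(ByVal csvLine As String) As String
    Dim re As Object
    Dim matches As Object

    Set re = CreateObject("VBScript.RegExp")  ' late binding, no reference required
    re.IgnoreCase = True                      ' let [A-Z] match "January" as well as "JANUARY"
    re.Pattern = "\d{1,2} [A-Z]{3,9} \d{4}"

    Set matches = re.Execute(csvLine)
    If matches.Count > 0 Then ExtractDateText = matches(0).Value
End Function
```

Calling ExtractDateText("4,'31 January 2017','Funds Received/Credits',56,,401.45,") returns the string 31 January 2017. Because the result is a String, it is never reinterpreted as a date unless it is written to a cell that is not formatted as text.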

Why does Excel treat double spaces as a comma?

I wrote an export to CSV file in my vb.net application, and I then exported it into Outlook.
The issue I've got is that when the CSV file is being written, my code checks for a comma in the current field, but it also seems to mistake a double space for a comma, or a space followed by the Enter key being pressed (for multiline textboxes).
An example would be if the notes section of a customer has 4 lines of text and one of them ends in a space. The user has then pressed Enter to go to the next line, but the program takes the next line of text and creates a new record for it, as if it had found a comma.
What is the reason for this? This means that data has to be super validated (ie checking for no double spaces etc) before it can be exported, which is far too time consuming.
Hopefully this makes sense!
This is the code:
Dim result As Boolean = True
Try
Dim sb As New StringBuilder()
Dim separator As String = ","
Dim group As String = """"
Dim newLine As String = Environment.NewLine
For Each column As DataColumn In dtable.Columns
sb.Append(wrapValue(column.ColumnName, group, separator) & separator)
Next
sb.Append(newLine)
For Each row As DataRow In dtable.Rows
For Each col As DataColumn In dtable.Columns
sb.Append(wrapValue(row(col).ToString(), group, separator) & separator)
Next
sb.Append(newLine)
Next
The code for wrapValue
Function wrapValue(value As String, group As String, separator As String) As String
If value.Contains(separator) Then
If value.Contains(group) Then
value = value.Replace(group, group + group)
End If
value = group & value & group
End If
Return value
End Function
Based on the fact that it's shortening it by 430 lines, I'd suggest it's something to do with the fact you're adding a load of "" before and after the value variable.
If it's removing a value at the start, then it will be removing a " before the first column header. As to why it's importing one record as you mentioned in the comments, I'm not entirely sure, however, I would suggest the issue lies in your wrapValue code.
Can you try changing
value = group & value & group
to
value = value
and see if that changes anything?
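One thing that is visible in the posted wrapValue is that a field is only quoted when it contains the separator. A note that contains a line break but no comma is therefore written unquoted, and the line break then ends the record in the output file. A possible adjustment is sketched below in VBA to match the other examples on this page; the parameter names are illustrative, and the same logic carries over to VB.NET using Contains and Replace.

```vba
' Sketch: quote a field when it contains the separator, the quote character,
' or a line break, doubling any embedded quote characters.
Function WrapValue(ByVal fieldText As String, ByVal quoteChar As String, _
                   ByVal separator As String) As String
    If InStr(fieldText, separator) > 0 Or InStr(fieldText, quoteChar) > 0 _
       Or InStr(fieldText, vbCr) > 0 Or InStr(fieldText, vbLf) > 0 Then
        fieldText = quoteChar & Replace(fieldText, quoteChar, quoteChar & quoteChar) & quoteChar
    End If
    WrapValue = fieldText
End Function
```

Whether the program reading the file accepts line breaks inside quoted fields still has to be checked on that side.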

Replace String Two Different Parts

I am extracting a column of data from a range of filenames. All my filenames are strings in the form:
Temporary PSD Report 'Month' 2011.xls
I am using Replace to extract the month from each. At the moment I am doing it in two stages, which works but seems a bit clumsy. Is there a way to use some kind of AND for multiple replacements in the same string?
Dim strfilename As String
Dim mnth As String
Dim mnthshrt As String
mnth = Replace(strfilename, "Temporary PSD Report ", "")
mnthshrt = Replace(mnth, " 2011.xls", "")
I've tried using & and AND to reference both parts to be removed but it either has no effect on the original string or produces an error.
You could also split the string at each space character and take the 4th word (index starts at 0):
s = "Temporary PSD Report 'Month' 2011.xls"
mth = Split(s, " ")(3)
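If you would rather stay with Replace, the two calls can be nested so that the result of the inner one feeds the outer one. A small sketch using the variable names from the question (the Sub name is made up):

```vba
Sub ExtractMonth()
    Dim strfilename As String
    Dim mnthshrt As String

    strfilename = "Temporary PSD Report 'Month' 2011.xls"

    ' The inner Replace strips the prefix, the outer Replace strips the suffix.
    mnthshrt = Replace(Replace(strfilename, "Temporary PSD Report ", ""), " 2011.xls", "")

    Debug.Print mnthshrt   ' prints 'Month'
End Sub
```

This does the same thing as the two-stage version, just without the intermediate variable.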

VB.Net Split A Group Of Text

I am looking to split up multiple lines of text to single them out, for example:
Url/Host:ftp://server.com/1
Login:Admin1
Password:Password1
Url/Host:ftp://server.com/2
Login:Admin2
Password:Password2
Url/Host:ftp://server.com/3
Login:Admin3
Password:Password3
How can I split each section into a different textbox, so that section one would be put into TextBox1.Text on its own:
Url/Host:ftp://server.com/1
Login:Admin1
Password:Password1
If that's the exact format you could just split it on two newlines in a row:
Dim text As String ' your text
Dim sep() As String = {vbNewLine & vbNewLine}
Dim sections() As String = text.Split(sep, StringSplitOptions.None)
and then just loop through them and put one value in each Textbox.
