Create text files from excel - excel

I need to create text files from Excel rows.
Column 1 should contain the file names and column 2 the content of the text files. Each row will have either a new file name or new content for that text file. Also, the content should be split into several lines.
How can I accomplish this? Thank you.

Edited solution that splits the text into lines.
The sample treats the following characters as line separators:
; : , \ / |
Add further separators to the RegEx pattern as required. The full code is below:
Sub Text2Files()
    Dim FileStream As Object
    Dim FileContent As String
    Dim i As Long
    Dim SavePath As String
    Dim RegX_Split As Object
    Set RegX_Split = CreateObject("VBScript.RegExp")
    RegX_Split.Pattern = "[\;\:\,\\\/\|]" 'List of used line separators as \X
    RegX_Split.IgnoreCase = True
    RegX_Split.Global = True
    Set FileStream = CreateObject("ADODB.Stream")
    SavePath = "D:\DOCUMENTS\" 'Set existing folder with trailing "\"
    For i = 1 To ThisWorkbook.ActiveSheet.Cells(1, 1).CurrentRegion.Rows.Count
        'Replace every separator in column 2 with a line break
        FileContent = RegX_Split.Replace(ThisWorkbook.ActiveSheet.Cells(i, 2).Text, vbNewLine)
        FileStream.Open
        FileStream.Type = 2 'Text
        FileStream.Charset = "UTF-8" 'Change encoding as required
        FileStream.WriteText FileContent
        'Column 1 holds the file name; option 2 overwrites an existing file
        FileStream.SaveToFile SavePath & ThisWorkbook.ActiveSheet.Cells(i, 1).Text, 2
        FileStream.Close
    Next i
End Sub
Read more about ADO Stream Object:
http://www.w3schools.com/ado/ado_ref_stream.asp
RegEx for beginners:
http://www.jose.it-berater.org/scripting/regexp/regular_expression_syntax.htm
Sample file with the above code is here: https://www.dropbox.com/s/kh9cq1gqmg07j20/Text2Files.xlsm
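To try the separator pattern on its own before running the full macro, the replace step can be pulled into a small function. This is only a sketch: the name SplitToLines is hypothetical, and the pattern is the same character class as above, written without the optional escapes.

```vba
Option Explicit

' Hypothetical helper: replaces each separator character with a line break.
Function SplitToLines(ByVal source As String) As String
    Dim rx As Object
    Set rx = CreateObject("VBScript.RegExp")
    rx.Pattern = "[;:,\\/|]"   ' ; : , \ / | (only the backslash needs escaping here)
    rx.Global = True           ' replace every occurrence, not just the first
    SplitToLines = rx.Replace(source, vbNewLine)
End Function

Sub DemoSplitToLines()
    ' Prints each part on its own line in the Immediate window
    Debug.Print SplitToLines("first;second|third")
End Sub
```

Two adjacent separators produce an empty line. If that is not wanted, changing the pattern to "[;:,\\/|]+" should collapse a run of separators into a single line break.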

Related

Remove rows from csv document using VBA

I currently have the following VBA code, which picks up a CSV file from a dedicated directory.
Sub UploadData()
' Define the relative variables
Application.ScreenUpdating = False
Application.Calculation = xlManual
' Define the variables necessary
Dim Path1 As String
Dim DataFile1 As String
Dim datasheet As String
Dim Temp_File_Name1 As String
Dim File_Name1 As String
' Set the path locations to grab the 151_.csv file
Path1 = Worksheets("FileNames").Cells(25, 3).Value
DataFile1 = Worksheets("FileNames").Cells(29, 3).Value
' ------------------------------------------- Send the csv file as an email ------------------------------------------------- '
' Assign the path directory to the csv file
Temp_File_Name1 = Path1 & DataFile1 & "*.csv"
File_Name1 = Dir(Temp_File_Name1)
File_Name1 = Path1 & File_Name1
End Sub
The retrieved CSV file is in the following format. My goal is to remove the rows that have the "pipeline_point_code" 30000001PC, as depicted in the following image.
Is it possible to remove these rows from the CSV through the VBA code? If not, how can I paste the CSV whose path is stored in the "File_Name1" variable into my Excel sheet labelled "Data"?
If I understand correctly this isn't about reading content of a .CSV file. Instead, you want to loop over all the filenames in a folder, list the filenames in a sheet, and then exclude certain filenames, is that correct?
You can use Dir() without parameters to iterate over the files, like so:
Dim FileName As String
Dim folder As String
Dim rownum As Integer
' get starting folder
folder = Worksheets("FileNames").Cells(25, 3).Value
' start iterating all files
FileName = Dir(folder & "\*.csv", vbDirectory)
rownum = 1
' iterate all files
Do While FileName <> ""
    ' add to Data sheet, but filter out "pipeline_point_code" file name
    If (FileName <> "pipeline_point_code") Then
        rownum = rownum + 1
        Worksheets("Data").Cells(rownum, 1).Value = FileName
    End If
    ' next file
    FileName = Dir()
Loop
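If the aim is instead to drop rows from inside the CSV, as the question describes, one possible approach is to copy the file line by line and skip the lines that contain the unwanted code. The sketch below makes several assumptions: the procedure name and its parameters are hypothetical, the file is plain line-oriented text that VBA's native file I/O can read, and the code (for example 30000001PC) does not appear in any other column of the rows that should be kept.

```vba
Option Explicit

' Hypothetical helper: copies srcPath to dstPath, skipping lines that contain unwantedCode.
Sub FilterCsvRows(ByVal srcPath As String, ByVal dstPath As String, ByVal unwantedCode As String)
    Dim fIn As Integer, fOut As Integer
    Dim lineText As String

    fIn = FreeFile
    Open srcPath For Input As #fIn
    fOut = FreeFile                  ' next free number, since fIn is now in use
    Open dstPath For Output As #fOut

    Do Until EOF(fIn)
        Line Input #fIn, lineText
        ' Keep the line only when the code is absent (case-insensitive search)
        If InStr(1, lineText, unwantedCode, vbTextCompare) = 0 Then
            Print #fOut, lineText
        End If
    Loop

    Close #fIn
    Close #fOut
End Sub
```

With the variables from the question, a call might look like FilterCsvRows File_Name1, Path1 & "filtered.csv", "30000001PC", where "filtered.csv" is a made-up output name. The filtered file could then be opened or imported into the "Data" sheet.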

Removing CrLf in whole csv in VBA - ADODB.Stream (Excel Macro)

I have a VBA macro that adds a BOM to a UTF-8 CSV file, which is needed for the file to open successfully in Excel.
The problem is that when there is a CrLf mark at the end of a line, Excel starts a new line below that line.
What I need is to remove all CrLf marks (no Cr or Lf should be added instead). This should help because only Cr will exist in the CSV; CrLf exists only in the faulty lines.
Can you please help with my source code? What should I add so that the source CSV is saved as the target CSV with a BOM and without any CrLf?
Dim fsT, tFileToOpen, tFileToSave As String
tFileToOpen = "C:\source_NO_BOM.csv"
tFileToSave = "C:\target_WITH_BOM.csv"
tFileToOpenPath = tFileToOpen
tFileToSavePath = tFileToSave
Set fsT = CreateObject("ADODB.Stream"): 'Create Stream object
fsT.Type = 2: 'Specify stream type – we want To save text/string data.
fsT.Charset = "utf-8": 'Specify charset For the source text data.
fsT.Open: 'Open the stream
fsT.LoadFromFile tFileToOpenPath: 'And write the file to the object stream
fsT.SaveToFile tFileToSavePath, 2: 'Save the data to the named path
You can read the text from the first stream into a string, remove the CrLf sequences from it, and then write the result out through a second stream object.
Dim fsT As Object, Newfst As Object
Dim tFileToOpen As String, tFileToSave As String
Dim s As String
tFileToOpen = "C:\source_NO_BOM.csv"
tFileToSave = "C:\target_WITH_BOM.csv"
Set fsT = CreateObject("ADODB.Stream") 'Create Stream object
With fsT
    .Type = 2 'Specify stream type - we want text/string data
    .Charset = "utf-8" 'Specify charset for the source text data
    .Open 'Open the stream
    .LoadFromFile tFileToOpen 'Load the file into the stream
    s = .ReadText 'Read the whole file as text
    .Close
End With
s = Replace(s, vbCrLf, "") 'Remove every CrLf pair
Set Newfst = CreateObject("ADODB.Stream")
With Newfst
    .Type = 2
    .Charset = "utf-8"
    .Open
    .WriteText s
    .SaveToFile tFileToSave, 2 'Overwrite the target if it exists
    .Close
End With

Why is VBA script adding double quotes at start/end of files

Hello Dear StackOverFlowers,
My question may be trivial, but I'm out of ideas after searching all afternoon.
Context: I have an Excel worksheet with 120 rows or so that I need to use to create files.
Data is structured as follows:
The A column contains destination file names
B column has the corresponding data that needs to be written in each file
Giving us the following general layout
(image: data file layout)
So, to get the data from the B column written into the files named in the A column, I wrote the following VBA snippet:
Option Explicit
Sub writeExportedMsgToXML()
' wrote that tiny script not to have to copy paste 117 messages by hand to have ops put them back on Q
Dim currentRow As Integer
' modify to match your data row start and end
For currentRow = 2 To 11
Dim messageID As String
Dim messageitSelf As String
messageID = Trim(ActiveSheet.Range("A" & currentRow))
messageitSelf = ActiveSheet.Range("B" & currentRow)
Dim subDirectory As String
subDirectory = "xmls"
Dim filePath As String
filePath = ActiveWorkbook.Path & "\" & subDirectory & "\" & messageID & ".xml"
MsgBox (messageitSelf) ' for test purpose
Open filePath For Output As #1
Write #1, messageitSelf
Close #1
Next currentRow
End Sub
The script does mostly what it's intended to do, except (and this is the source of my question today) that it encloses the file content in double quotes, as you can see below:
(image: file content enclosed in double quotes)
So, in a case where a file named F1.xml should just contain <foo><bar>Baz</bar></foo>
my script writes it as "<foo><bar>Baz</bar></foo>"
What I tried
Replacing file writing part with the following
Dim objStream
Set objStream = CreateObject("ADODB.Stream")
objStream.Charset = "UTF-8"
Dim subDirectory As String
subDirectory = "xmls"
Dim filePath As String
filePath = ActiveWorkbook.Path & "\" & subDirectory & "\" & messageID & ".xml"
objStream.Open
objStream.WriteText messageitSelf
objStream.SaveToFile filePath
objStream.Close
With the same outcome
Any clues on what I'm missing or doing wrong?
Should I declare messageitSelf as a different type?
Any help would be appreciated :)
Thank you
Write# statements surround strings with double quotes:
Unlike the Print # statement, the Write # statement inserts commas between items and quotation marks around strings as they are written to the file.
Use Print# instead:
Dim fn As Long
fn = VBA.FreeFile
Open filePath For Output As #fn
Print #fn, messageitSelf
Close #fn
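The difference is easy to see by writing the same string with both statements into a scratch file. This is a minimal sketch; the procedure name and the path parameter are made up for the demonstration.

```vba
Option Explicit

' Hypothetical demo: writes the same string with Write # and with Print #.
Sub DemoWriteVsPrint(ByVal demoPath As String)
    Dim fn As Integer
    fn = FreeFile
    Open demoPath For Output As #fn
    Write #fn, "<foo/>"     ' stored with surrounding double quotes
    Print #fn, "<foo/>"     ' stored as-is, followed by a line break
    Print #fn, "<foo/>";    ' trailing semicolon: no line break is appended
    Close #fn
End Sub
```

It can be called as DemoWriteVsPrint Environ$("TEMP") & "\write_vs_print_demo.txt" and the file inspected in a text editor. The trailing semicolon form is worth knowing if the XML file must not end with an extra line break.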

How to read .txt file with Chinese characters?

I have a subroutine that reads text files and extracts certain data from them. Here is an example:
NamePrefix = "Example"
OutputPath = "C:\Example"
DbSize = 65536
LstStr = ""
Dim Success() As Boolean
Dim Value() As Double
ReDim Success(1 to DbSize)
ReDim Value(1 to DbSize)
For ID = 1 to DbSize
'Read string
FileName = NamePrefix & Format(ID,"000000") & ".lst"
FilePath = OutputPath & "\" & FileName
Open FilePath For Input As 1
LstStr = Input(LOF(1),1)
Close 1
'Extract data
If InStr(1, LstStr, "SUCCESS") <> 0 Then Success(ID) = True Else Success(ID) = False
Pos1 = InStr(1, LstStr, "TH 1 value: ") 'Position varies for each file
Value(ID) = Val(Mid(LstStr, Pos1 + 13, 10)) 'Value in scientific notation
Next ID
The use of InStr to locate strings by position works perfectly when the files contain only letters, numbers and symbols. However, sometimes the files contain Chinese characters, and then the Input function returns an empty string "" to LstStr. I tried some other suggested methods, but in vain (e.g. Extract text from a text file with Chinese characters using vba). How can I read files with Chinese characters successfully, without having to modify the other parts of the code that extract data by position? Thanks!
This would be an alternative way to read the string. Make sure that .Charset is set to the charset of the file you want to read.
To use ADODB with early binding you need to add the reference Microsoft ActiveX Data Objects 6.1 Library (the version can differ) under Tools › References in the VBA editor.
Dim adoStream As ADODB.Stream
Set adoStream = New ADODB.Stream
adoStream.Charset = "UTF-8" 'set the correct charset
adoStream.Open
adoStream.LoadFromFile FilePath
LstStr = adoStream.ReadText
adoStream.Close
Set adoStream = Nothing
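The same read can also be wrapped in a function so that only one line of the original loop changes. The sketch below uses late binding (CreateObject), so no reference needs to be set. The function name ReadAllText and its parameters are hypothetical, and the default of "UTF-8" is only a guess: files with Chinese text are often stored as "big5" or "gb2312" instead, so the actual encoding has to be checked.

```vba
Option Explicit

' Hypothetical helper: returns the whole file as a string, decoded with charsetName.
Function ReadAllText(ByVal filePath As String, _
                     Optional ByVal charsetName As String = "UTF-8") As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 2                 ' 2 = text stream
    stm.Charset = charsetName    ' must match the encoding of the file
    stm.Open
    stm.LoadFromFile filePath
    ReadAllText = stm.ReadText   ' reads to the end of the stream by default
    stm.Close
End Function
```

In the loop from the question, the Open / Input / Close lines would then be replaced by LstStr = ReadAllText(FilePath), and the InStr-based extraction can stay as it is.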

How to extract specific words from text files into xls spreadsheet

I'm new to VBA. Before posting my question here, I spent almost 3 days searching the Internet.
I have 300+ text files (text converted from PDF using OCR). From each text file I need to get all words that contain both letters and digits (for example KT315A, KT-315-a, etc.) along with the source reference (the txt file name).
What I need is:
1. add a "smart filter" that copies only words that contain letters and digits
2. paste the copied data to column A
3. add the reference file name to column B
I have found code below that can copy all data from text files into excel spreadsheet.
text files look like
"line from 252A-552A to ddddd, ,,, #,#,rrrr, 22 , ....kt3443 , fff,,,etc"
final result in xls should be
A | B
252A-552A | file1
kt3443 | file1
Option Explicit
Const sPath = "C:\outp\" 'remember end backslash
Const delim = "," 'comma delimited text file - EDIT
'Const delim = vbTab 'for TAB delimited text files
Sub ImportMultipleTextFiles()
Dim wb As Workbook
Dim sFile As String
Dim inputRow As Long
RefreshSheet
On Error Resume Next
sFile = Dir(sPath & "*.txt")
Do Until sFile = ""
inputRow = Sheets("Temp").Range("A" & Rows.Count).End(xlUp).Row + 1
'open the text file
'format=6 denotes a text file
Set wb = Workbooks.Open(Filename:=sPath & sFile, _
Format:=6, _
Delimiter:=delim)
'copy and paste
wb.Sheets(1).Range("A1").CurrentRegion.Copy _
Destination:=ThisWorkbook.Sheets("Temp").Range("A" & inputRow)
wb.Close SaveChanges:=False
'get next text file
sFile = Dir()
Loop
Set wb = Nothing
End Sub
Sub RefreshSheet()
'delete old sheet and add a new one
On Error Resume Next
Application.DisplayAlerts = False
Sheets("Temp").Delete
Application.DisplayAlerts = True
Worksheets.Add
ActiveSheet.Name = "Temp"
On Error GoTo 0
End Sub
thanks!
It's a little tough to tell exactly what constitutes a word from your example. It clearly can contain characters other than letters and numbers (eg the dash), but some of the items have dots preceding, so it cannot be defined as being delimited by a space.
I defined a "word" as a string that
Starts with a letter or digit and ends with a letter or digit
Contains both letters and digits
Might also contain any other non-space characters except a comma
To do this, I first replaced all the commas with spaces, and then applied an appropriate regular expression. However, this might accept undesired strings, so you might need to be more specific in defining exactly what is a word.
Also, instead of reading the entire file into an Excel workbook, by using the FileSystemObject we can process one line at a time, without reading 300 files into Excel. The base folder is set, as you did, by a constant in the VBA code.
But there are other ways to do this.
Be sure to set the references for early binding as noted in the code:
Option Explicit
'Set References to:
' Microsoft Scripting Runtime
' Microsoft VBscript Regular Expressions 5.5
Sub SearchMultipleTextFiles()
    Dim FSO As FileSystemObject
    Dim TS As TextStream, FO As Folder, FI As File, FIs As Files
    Dim RE As RegExp, MC As MatchCollection, M As Match
    Dim WS As Worksheet, RW As Long
    Const sPath As String = "C:\Users\Ron\Desktop"
    Set FSO = New FileSystemObject
    Set FO = FSO.GetFolder(sPath)
    Set WS = ActiveSheet
    WS.Columns.Clear
    Set RE = New RegExp
    With RE
        .Global = True
        .Pattern = "(?:\d(?=\S*[a-z])|[a-z](?=\S*\d))+\S*[a-z\d]"
        .IgnoreCase = True
    End With
    For Each FI In FO.Files
        If FI.Name Like "*.txt" Then
            Set TS = FI.OpenAsTextStream(ForReading)
            Do Until TS.AtEndOfStream
                'Changing .ReadLine to .ReadAll *might* make this run faster
                ' but would need to be tested.
                Set MC = RE.Execute(Replace(TS.ReadLine, ",", " "))
                If MC.Count > 0 Then
                    For Each M In MC
                        RW = RW + 1
                        WS.Cells(RW, 1) = M
                        WS.Cells(RW, 2) = FI.Name
                    Next M
                End If
            Loop
            TS.Close 'Release the file before moving to the next one
        End If
    Next FI
End Sub
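To check what the pattern accepts before pointing it at 300 files, it can be tried on a single line such as the sample from the question. The sketch below is only for experimenting: the function name ExtractMixedWords is made up, and it uses late binding so it runs without setting the references.

```vba
Option Explicit

' Hypothetical helper: returns the "words" that the answer's pattern finds in one line.
Function ExtractMixedWords(ByVal lineText As String) As Collection
    Dim re As Object, m As Object
    Dim found As New Collection
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "(?:\d(?=\S*[a-z])|[a-z](?=\S*\d))+\S*[a-z\d]"
    ' Commas are turned into spaces first, as in the answer
    For Each m In re.Execute(Replace(lineText, ",", " "))
        found.Add m.Value
    Next m
    Set ExtractMixedWords = found
End Function

Sub DemoExtractMixedWords()
    Dim w As Variant
    ' Prints each match on its own line in the Immediate window
    For Each w In ExtractMixedWords("line from 252A-552A to ddddd, ,,, #,#,rrrr, 22 , ....kt3443 , fff,,,etc")
        Debug.Print w
    Next w
End Sub
```

Running the demo against a few real lines from the OCR output is a quick way to see whether the definition of a "word" needs tightening.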
