How to change encoding from UTF-8 to UTF-8-BOM of exported *.txt files from Excel? - excel

Text files exported from Excel are encoded as UTF-8.
I need them encoded as UTF-8 with BOM (UTF-8-BOM).
I think a line needs to be inserted into the code, similar to this one for Java:
<?xml version="1.0" encoding="UTF-8"?>
(from "Jasperreport CSV UTF-8 without BOM instead of UTF-8")
or this one for HTML5:
<meta charset="utf-8">
(from "Bad UTF-8 without BOM encoding")
Sub export_data()
Dim row, column, i, j As Integer
Dim fullPath, myFile As String
fullPath = "C:\Workspace"
row = 21
column = 5
For i = 1 To column
myFile = Cells(1, i).Value + ".txt"
myFile = fullPath + "/" + myFile
Open myFile For Output As #1
For j = 2 To row
Print #1, Cells(j, i).Value
Next j
Close #1
Next i
End Sub
How do I write a line that sets the encoding to UTF-8-BOM, and where should I put it?
Thank You.

Instead of printing the file line by line, it might be more efficient to do one of the following:
- Save your selected range as CSV UTF-8 (you might need to change the file type after saving).
- Use ADO to process the file as UTF-8.
Either will add a BOM automatically.
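As a rough sketch of the ADO route, assuming the layout from the question (header in row 1, data from row 2 down) and late binding so that no library reference is needed. The procedure name and its parameters are made up for illustration:

```vba
' Sketch: write one worksheet column to a UTF-8 text file via ADODB.Stream.
' ExportColumnAsUtf8 and its parameters are hypothetical names.
Sub ExportColumnAsUtf8(ByVal columnIndex As Long, ByVal lastRow As Long, _
                       ByVal filePath As String)
    Const adTypeText As Long = 2
    Const adWriteLine As Long = 1
    Const adSaveCreateOverWrite As Long = 2

    Dim stm As Object
    Dim j As Long

    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "utf-8"      ' the stream writes the EF BB BF preamble itself
    stm.Open

    ' One value per line, starting below the header row
    For j = 2 To lastRow
        stm.WriteText CStr(Cells(j, columnIndex).Value), adWriteLine
    Next j

    stm.SaveToFile filePath, adSaveCreateOverWrite
    stm.Close
End Sub
```

It could be called from the existing loop with something like ExportColumnAsUtf8 i, 21, "C:\Workspace\" & Cells(1, i).Value & ".txt".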
EDIT
If you are unfamiliar with this, you could perform the save as CSV UTF-8 manually with the macro recorder turned on. Then examine what was recorded and make appropriate edits.
Another way of adding the BOM, in the context of your existing code, would be to write it directly as a byte array at the start of the file.
For example:
Dim BOM(0 To 2) As Byte 'EF BB BF
BOM(0) = &HEF
BOM(1) = &HBB
BOM(2) = &HBF
Open myFile For Binary Access Write As #1
Put #1, 1, BOM
Close #1
will put the BOM at the beginning of the file.
You should then change the mode in your subsequent Print code to Append.
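Put together, the two steps might look like the sketch below. The file name is a placeholder. Note that Print # converts strings to the system ANSI code page rather than UTF-8, so this only yields valid UTF-8 when the data is plain ASCII; for other characters the ADO approach is the safer choice.

```vba
' Sketch: write the BOM in binary mode, then append text with Print.
Sub WriteBomThenAppend()
    Dim myFile As String
    Dim fileNum As Integer
    Dim BOM(0 To 2) As Byte    ' EF BB BF

    myFile = "C:\Workspace\example.txt"   ' hypothetical file name
    BOM(0) = &HEF
    BOM(1) = &HBB
    BOM(2) = &HBF

    ' Binary mode does not truncate, so remove any earlier version first
    If Dir(myFile) <> "" Then Kill myFile

    fileNum = FreeFile
    Open myFile For Binary Access Write As #fileNum
    Put #fileNum, 1, BOM
    Close #fileNum

    ' Append keeps the three BOM bytes and writes after them
    fileNum = FreeFile
    Open myFile For Append As #fileNum
    Print #fileNum, "first line"
    Close #fileNum
End Sub
```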
I suggest you read about the pros and cons of using Print vs Write.
You should also read about declaration statements. In yours, only the last variable on each line is being declared as the specified type; the preceding variables are being implicitly declared as being of type Variant.
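A small demonstration of the declaration point, using throwaway variable names:

```vba
Sub DeclarationDemo()
    Dim a, b As Integer               ' a is a Variant, only b is an Integer
    Dim c As Integer, d As Integer    ' both are Integers

    Debug.Print TypeName(a)           ' Empty (an uninitialised Variant)
    Debug.Print TypeName(b)           ' Integer
    Debug.Print TypeName(c)           ' Integer
End Sub
```

So to type every variable, repeat the As clause for each one, for example Dim fullPath As String, myFile As String.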

Related

VBA Excel, when I read from a text file into a (string) variable, why does it not read Carriage return and line feed characters (0d 0a)

I am working on a macro that takes data from a textbox and writes it to a file. The data includes carriage return/Linefeed characters. At a later stage, I need to read the data from the file and put it back in the textbox in exactly the same form. When I do this, the Cr/Lf characters (0d 0a) are missing. I have established that they are written to the file but not read back. I am using the following snippet to write the file:
Print #1, Temstg
Close #1
and reading it using the following snippet:
Do Until EOF(1)
Line Input #1, T
Temstg = Temstg & T
Loop
Close #1
Where am I going wrong?
Rob
As noted in the comments, Line Input treats the carriage return/linefeed sequence as the line separator and does not include it in the string it returns. See the documentation for Line Input.
The following code reads the complete text file in one go, including the newline characters.
Sub ReadFile()
    Dim fileName As String: fileName = "C:\temp\yourfile.txt"
    Dim fileContent As String
    Dim File As Integer: File = FreeFile
    Open fileName For Input As #File
    ' read the whole file, line breaks included, into one string
    fileContent = Input(LOF(File), File)
    Close #File
    ' example of how to split the content into lines afterwards
    Dim vDat As Variant
    vDat = Split(fileContent, vbNewLine)
End Sub
For further reading, see the documentation for the Input function.
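If you would rather keep the Line Input loop from the question, another option is to put the separator back yourself. A minimal sketch, reusing the variable names T and Temstg from the question and a placeholder path:

```vba
Sub ReadKeepingLineBreaks()
    Dim fileNum As Integer
    Dim lineCount As Long
    Dim T As String, Temstg As String

    fileNum = FreeFile
    Open "C:\temp\yourfile.txt" For Input As #fileNum
    Do Until EOF(fileNum)
        Line Input #fileNum, T
        ' Line Input drops the CR/LF, so re-insert it between lines
        If lineCount > 0 Then Temstg = Temstg & vbCrLf
        Temstg = Temstg & T
        lineCount = lineCount + 1
    Loop
    Close #fileNum
End Sub
```

This leaves out the final CR/LF that Print adds after the last line when writing, which is usually what you want when restoring the textbox content.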

VB .Net when exporting to CSV issue when viewing in MS Excel

I have encountered something really weird. When exporting to CSV, my top line shows the quotation marks, yet the lines below do not.
I use UTF-8 encoding and manually add double quotation marks so that each value is enclosed in them.
The code being used is:
Dim fs As New IO.FileStream(GenericValueEditorExportFilename.Value, IO.FileMode.Create)
Dim writer As New IO.StreamWriter(fs, Encoding.UTF8)
fs.Write(Encoding.UTF8.GetPreamble(), 0, Encoding.UTF8.GetPreamble().Length)
....
....
....
While reader.Read
If reader("TargetLanguageID") = targetLanguageID Then
writer.WriteLine(Encode(reader("SourcePhrase")) & ", " & Encode(reader("TargetPhrase")))
End If
....
....
....
Friend Shared Function Encode(ByVal value As String) As String
Return ControlChars.Quote & value.Replace("""", """""") & ControlChars.Quote
End Function
The result when displayed in Excel is shown at (https://ibb.co/ntMYdw).
When I open the file in Notepad++, each line is displayed differently. Why does the first row display the quotation marks while the second does not? The Notepad++ result is shown at (https://ibb.co/fMkWWG).
Excel is treating the first line as headers.
https://stackoverflow.com/a/24923167/2319909
So the issue was caused by the BOM that I wrote manually to the file stream, intending to set the encoding before writing any content:
fs.Write(Encoding.UTF8.GetPreamble(), 0, Encoding.UTF8.GetPreamble().Length)
Removing this line resolved my issue. The file remains in the desired UTF-8 encoding because the encoding is set on the StreamWriter, which writes its own BOM, so there is no need to add one by hand.
Something like this should work for you.
Dim str As New StringBuilder
For Each dr As DataRow In Me.NorthwindDataSet.Customers
For Each field As Object In dr.ItemArray
str.Append(field.ToString & ",")
Next
str.Replace(",", vbNewLine, str.Length - 1, 1)
Next
Try
My.Computer.FileSystem.WriteAllText("C:\temp\testcsv.csv", str.ToString, False)
Catch ex As Exception
MessageBox.Show("Write Error")
End Try

Copying lines from Wordpad into Excel using VBA

I am writing some code that imports TMX files (TMX is an XML-based format).
I have tried various options:
a) using Open FileName For Input, but this messes up the character encoding;
b) opening the file and copying the data using the msoDialog, but this returns an error if the file is too large (which is often the case) and puts the data in an utterly messy layout;
c) opening the file in Notepad, but copying the whole file into Excel has the same limitations as the previous option.
I am now trying to use a Shell call to WordPad.
My issue right now is that I need to copy the file line by line so that I can process its content according to my needs (hopefully without losing the character encoding).
Would someone know how to copy every single line from the file opened in WordPad and paste it, after processing (selection of the relevant elements), into Excel?
Thank you
For large files you can use this solution:
Public Sub ImportTMXtoExcel()
Call Application.FileDialog(msoFileDialogOpen).Filters.Clear
Call Application.FileDialog(msoFileDialogOpen).Filters.Add("TMX Files", "*.tmx")
Application.FileDialog(msoFileDialogOpen).Title = "Select a file to import..."
Application.FileDialog(msoFileDialogOpen).AllowMultiSelect = False
intChoice = Application.FileDialog(msoFileDialogOpen).Show
If intChoice <> 0 Then
strFileToImport = Application.FileDialog(msoFileDialogOpen).SelectedItems(1)
Else
Exit Sub
End If
intPointer = FreeFile()
Open strFileToImport For Input Access Read Lock Read As #intPointer
intCounter = 0
Do Until EOF(intPointer)
Line Input #intPointer, strLine
intCounter = intCounter + 1
Worksheets(1).Cells(intCounter + 1, 1).Value2 = strLine
Loop
Close intPointer
End Sub
For other encodings you can use ADO's Stream as described in this solution:
VB6/VBScript change file encoding to ansi
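For illustration, reading a file line by line through ADO's Stream could look roughly like this. The sketch assumes the file is UTF-8 (set Charset to whatever the file actually uses, for example "unicode" for UTF-16), and the path is a placeholder:

```vba
' Sketch: read a text file with a known encoding line by line into column A.
Sub ImportWithAdoStream()
    Const adTypeText As Long = 2
    Const adReadLine As Long = -2

    Dim stm As Object
    Dim rowNum As Long

    Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "utf-8"                     ' assumption: the file is UTF-8
    stm.Open
    stm.LoadFromFile "C:\temp\example.tmx"    ' hypothetical path

    Do Until stm.EOS
        rowNum = rowNum + 1
        ' ReadText with adReadLine returns one line without its separator
        Worksheets(1).Cells(rowNum, 1).Value2 = stm.ReadText(adReadLine)
    Loop
    stm.Close
End Sub
```

LoadFromFile reads the whole file into memory, which is why very large files may need splitting first.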
If you have large files which require ADO's Stream then you might want to consider breaking down the large files first as described in this solution:
How to split a large text file into smaller files with equal number of lines?
The following website provides a tool which mimics the Unix command split for Windows in command prompt: https://www.fourmilab.ch/splits/

How to create a file from hex code in Excel/VBA

I have code in hexadecimal format in an excel workbook. Each 2-digit piece of code is in a separate cell, so it looks like 4D|54|68|64|00|00|00|06 etc. There may also be a few cells with 4 or 6 digit pieces, if it makes it any simpler. Is there any way to code a file from here? Effectively I need it so that opening the file in a hex editor will reveal the code. I have a feeling this may involve HEX2DEC or HEX2BIN, but even then I wouldn't know where to go from there.
Might not be the most efficient solution in terms of converting the strings to bytes, but just opening a file in binary mode and putting bytes to it works just fine:
Sub HexStringToBinaryFile()
Dim hex_val As String
hex_val = "4D|54|68|64|00|00|00|06"
Dim output() As String
output = Split(hex_val, "|")
Dim handle As Long
handle = FreeFile
Open "C:\Dev\test.bin" For Binary As #handle
Dim i As Long
For i = LBound(output) To UBound(output)
Put #handle, , CByte("&H" & output(i))
Next i
Close #handle
End Sub
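If the values sit in separate cells as described in the question, a variation could collect them into a Byte array and write it in one go. This is only a sketch: the range A1:H1 is an assumed location, and it handles 2-digit cells only.

```vba
Sub HexCellsToBinaryFile()
    Dim cellRange As Range
    Dim bytes() As Byte
    Dim handle As Integer
    Dim i As Long

    Set cellRange = Worksheets(1).Range("A1:H1")   ' hypothetical location
    ReDim bytes(0 To cellRange.Cells.Count - 1)

    ' Convert each 2-digit hex cell, e.g. "4D", to a byte
    For i = 1 To cellRange.Cells.Count
        bytes(i - 1) = CByte("&H" & cellRange.Cells(i).Text)
    Next i

    ' Binary mode does not truncate, so remove any earlier version first
    If Dir("C:\Dev\test.bin") <> "" Then Kill "C:\Dev\test.bin"

    handle = FreeFile
    Open "C:\Dev\test.bin" For Binary Access Write As #handle
    Put #handle, 1, bytes      ' writes the raw bytes only
    Close #handle
End Sub
```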
You didn't actually say what you tried, but the text-file functions perform character conversions. Use a stream object in binary mode to write to disk, and use its Write method to put a byte array into it. As with the text-file functions, avoid VBA's character and string functions here.
Sub test()
    Dim ByteArray(3) As Byte    ' four elements, indices 0 to 3
    Dim BS As Object
    ByteArray(0) = CByte(55)
    ByteArray(1) = CByte(55)
    ByteArray(2) = CByte(55)
    ByteArray(3) = CByte(55)
    Set BS = CreateObject("ADODB.Stream")
    BS.Type = 1                 ' adTypeBinary
    BS.Open
    BS.Write ByteArray
    BS.SaveToFile "c:\users\test", 2    ' adSaveCreateOverWrite
    BS.Close
End Sub

Retrieving last line of a txt file with a VBscript

First of all, I want to say that I am not a programmer, so this may be basic for some people but surely not for me!
The task I want to accomplish is to retrieve some characters from a data file that is imported automatically from a server.
Data is stored in lines in a CSV or tab-delimited .txt file. Each line consists of a date and some numeric values. The format is always the same; the file only grows by one line each time a new value is entered.
What I need the script to do is open that file (whose path is known and constant), find the last line, extract a string from that line, and write it to a different .txt file, from where I can import it into another specific piece of software as a raw value.
The part in the middle (extracting the string) is fairly simple, but opening the file and isolating the last line is far too much for me.
Thanks everybody for helping!
dim path
path = "fileName.txt"
otherOption(path)
function otherOption(fileName)
const read = 1
dim arrFileLines()
set objArgs = CreateObject("scripting.FileSystemObject")
if objArgs.FileExists(fileName) then
set objFile = objArgs.OpenTextFile(fileName,read)
i=0
do until objFile.AtEndOfStream
redim preserve arrFileLines(i)
arrFileLines(i) = objFile.ReadLine
i = i + 1
loop
objFile.Close
' echo the last line only if the file contained at least one line
if i > 0 then wscript.Echo arrFileLines(i-1)
end if
end function
