I have a VBA macro that adds a BOM to a UTF-8 CSV file; Excel needs the BOM to open the file correctly.
The problem is that whenever a line ends with a CrLf mark, Excel starts a new row below that line.
I need to remove all CrLf marks, with no Cr or Lf inserted in their place. That should help, because only Cr will then remain in the CSV; CrLf occurs only in the faulty lines.
Can you please help with my source code? What should I add so that the CrLf marks from the source CSV are removed and the target CSV is saved with a BOM and without any CrLf?
Dim fsT As Object
Dim tFileToOpen As String, tFileToSave As String
Dim tFileToOpenPath As String, tFileToSavePath As String
tFileToOpen = "C:\source_NO_BOM.csv"
tFileToSave = "C:\target_WITH_BOM.csv"
tFileToOpenPath = tFileToOpen
tFileToSavePath = tFileToSave
Set fsT = CreateObject("ADODB.Stream") 'Create the Stream object
fsT.Type = 2 'Stream type 2 = text/string data
fsT.Charset = "utf-8" 'Charset of the source text data
fsT.Open 'Open the stream
fsT.LoadFromFile tFileToOpenPath 'Load the source file into the stream
fsT.SaveToFile tFileToSavePath, 2 'Save to the target path (2 = overwrite an existing file)
Read the text into a string, strip the CrLf sequences from it, then write the cleaned string to a second stream object and save that one.
Dim fsT As Object, Newfst As Object
Dim tFileToOpen As String, tFileToSave As String
Dim s As String
tFileToOpen = "C:\source_NO_BOM.csv"
tFileToSave = "C:\target_WITH_BOM.csv"
Set fsT = CreateObject("ADODB.Stream") 'Create the source stream
With fsT
.Type = 2 'Stream type 2 = text/string data
.Charset = "utf-8" 'Charset of the source text data
.Open 'Open the stream
.LoadFromFile tFileToOpen 'Load the source file into the stream
s = .ReadText 'Read the whole content into a string
.Close
End With
s = Replace(s, vbCrLf, "") 'Remove every CrLf pair, inserting nothing in its place
Set Newfst = CreateObject("ADODB.Stream") 'Second stream for the cleaned text
With Newfst
.Type = 2
.Charset = "utf-8"
.Open
.WriteText s
.SaveToFile tFileToSave, 2 '2 = overwrite an existing file
.Close
End With
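Note that Replace with vbCrLf removes only the two-character Cr+Lf sequence; a Cr or Lf that stands alone is left untouched. A minimal sketch to confirm this in the VBA editor (the procedure name and the sample string are made up for illustration):

```vba
Sub DemoStripCrLf()
    Dim s As String
    ' Hypothetical sample: a lone Lf, then a CrLf pair, then a lone Cr
    s = "a" & vbLf & "b" & vbCrLf & "c" & vbCr & "d"
    s = Replace(s, vbCrLf, "")
    ' Only the CrLf pair is gone; the lone Lf and the lone Cr survive
    Debug.Assert s = "a" & vbLf & "bc" & vbCr & "d"
    Debug.Print Len(s) ' 8 characters before, 6 after
End Sub
```

If lone Cr or Lf characters also have to go, they would need their own Replace calls after the CrLf one.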
My system is Windows 10 English-US.
I need to write some non-printable ASCII characters to a text file. For example, for the ASCII value 28, I want \u001Cw to be written to the file. In Java I don't have to do anything special for this. Below is my code in VBS:
Dim objStream
Set objStream = CreateObject("ADODB.Stream")
objStream.Open
objStream.Type = 2
objStream.Position = 0
objStream.CharSet = "utf-16"
objStream.WriteText ChrW(28) 'Need this to appear as \u001Cw in the output file
objStream.SaveToFile "C:\temp\test.txt", 2
objStream.Close
You need a read-write stream so that writing to it and saving it to file both work.
Const adModeReadWrite = 3
Const adTypeText = 2
Const adSaveCreateOverWrite = 2
Sub SaveToFile(text, filename)
With CreateObject("ADODB.Stream")
.Mode = adModeReadWrite
.Type = adTypeText
.Charset = "UTF-16"
.Open
.WriteText text
.SaveToFile filename, adSaveCreateOverWrite
.Close
End With
End Sub
text = Chr(28) & "Hello" & Chr(28)
SaveToFile text, "C:\temp\test.txt"
Other notes:
I like to explicitly define with Const all the constants in the code. Makes reading so much easier.
A With block saves quite some typing here.
Setting the stream type to adTypeText is not really necessary, that's the default anyway. But explicit is better than implicit, I guess.
Setting the Position to 0 on a new stream is superfluous.
It's unnecessary to use ChrW() for ASCII-range characters. The stream's Charset decides the byte width when you save the stream to file. In RAM, everything is Unicode anyway (yes, even in VBScript).
There are two UTF-16 encodings supported by ADODB.Stream: little-endian UTF-16LE (which is the default and synonymous with UTF-16) and big-endian UTF-16BE, with the byte order reversed.
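The point that the stream's Charset, not ChrW(), decides the byte width can be checked by asking the stream for its size after writing. The following is only a sketch in VBA; EncodedSize and ShowEncodedSizes are hypothetical helper names, and the reported sizes include whatever byte order mark the stream adds for the chosen charset.

```vba
' Returns the number of bytes the stream holds for the given text
' once it has been encoded with the given charset.
Function EncodedSize(ByVal text As String, ByVal charsetName As String) As Long
    Const adTypeText = 2
    With CreateObject("ADODB.Stream")
        .Type = adTypeText
        .Charset = charsetName
        .Open
        .WriteText text
        EncodedSize = .Size ' size in bytes, not in characters
        .Close
    End With
End Function

Sub ShowEncodedSizes()
    ' Same string, two charsets: only the Charset differs between the calls
    Debug.Print "UTF-16: " & EncodedSize("Hello", "UTF-16") & " bytes"
    Debug.Print "UTF-8:  " & EncodedSize("Hello", "UTF-8") & " bytes"
End Sub
```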
You can achieve the same result with the FileSystemObject and its CreateTextFile() method:
Set FSO = CreateObject("Scripting.FileSystemObject")
Sub SaveToFile(text, filename)
' CreateTextFile(filename [, Overwrite [, Unicode]])
With FSO.CreateTextFile(filename, True, True)
.Write text
.Close
End With
End Sub
text = Chr(28) & "Hello" & Chr(28)
SaveToFile text, "C:\temp\test.txt"
This is a little bit simpler, but it only offers a Boolean Unicode parameter, which switches between UTF-16 and ANSI (not ASCII, as the documentation incorrectly claims!). The solution with ADODB.Stream gives you fine-grained encoding choices, for example UTF-8, which is impossible with the FileSystemObject.
For the record, there are two ways to create a UTF-8-encoded text file:
The way Microsoft likes to do it, with a 3-byte Byte Order Mark (BOM) at the start of the file. Most, if not all, Microsoft tools do that when they offer "UTF-8" as an option; ADODB.Stream is no exception.
The way everyone else does it: without a BOM. This is correct for most uses.
To create a UTF-8 file with BOM, the first code sample above can be used with the Charset set to "UTF-8". To create a UTF-8 file without BOM, we can use two stream objects:
Const adModeReadWrite = 3
Const adTypeBinary = 1
Const adTypeText = 2
Const adSaveCreateOverWrite = 2
Sub SaveToFile(text, filename)
Dim iStr: Set iStr = CreateObject("ADODB.Stream")
Dim oStr: Set oStr = CreateObject("ADODB.Stream")
' one stream for converting the text to UTF-8 bytes
iStr.Mode = adModeReadWrite
iStr.Type = adTypeText
iStr.Charset = "UTF-8"
iStr.Open
iStr.WriteText text
' one stream to write bytes to a file
oStr.Mode = adModeReadWrite
oStr.Type = adTypeBinary
oStr.Open
' switch first stream to binary mode and skip UTF-8 BOM
iStr.Position = 0
iStr.Type = adTypeBinary
iStr.Position = 3
' write remaining bytes to file and clean up
oStr.Write iStr.Read
oStr.SaveToFile filename, adSaveCreateOverWrite
oStr.Close
iStr.Close
End Sub
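To tell which of the two UTF-8 variants an existing file uses, one option is to read its first three bytes in binary mode and compare them with the UTF-8 BOM (EF BB BF). This is a sketch in VBA, not part of the answer above; HasUtf8Bom is a hypothetical helper name.

```vba
' Returns True if the file starts with the UTF-8 byte order mark EF BB BF.
Function HasUtf8Bom(ByVal filePath As String) As Boolean
    Const adTypeBinary = 1
    Dim bytes() As Byte
    With CreateObject("ADODB.Stream")
        .Type = adTypeBinary
        .Open
        .LoadFromFile filePath
        If .Size >= 3 Then
            bytes = .Read(3) ' first three bytes of the file
            HasUtf8Bom = (bytes(0) = &HEF And bytes(1) = &HBB And bytes(2) = &HBF)
        End If
        .Close
    End With
End Function
```

A call such as HasUtf8Bom("C:\temp\test.txt") could then decide whether the first three bytes need to be skipped or added.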
I am trying to import special characters from a txt file into Excel.
I've tried many things, but the characters break in Excel.
Example of my string:
in txt: Changjíhuízúzìzhìzhou
converts in excel to: ChangjÃhuÃzúzìzhìzhou
So I tried moving the values over bit by bit, but no luck.
Sub ImportTXTFile()
Dim file As Variant
Dim EXT As String
Dim Direct As String ' directory...
Direct = "C:\FilePath\Here\"
EXT = ".txt"
Dim COL As Long
Dim row As Long
COL = 1
row = 1
file = Dir(Direct)
Do While (file <> "") ' Cycle through files until no more files
If InStr(file, "Data.txt") > 0 Then
'
Open Direct & "Data.txt" For Input As #1
'
While Not EOF(1)
Line Input #1, DataLine ' Read in line
Do While DataLine <> ""
If InStr(DataLine, ",") = 0 Then ' Drop value into excel upto the first ,
Sheets("test").Cells(row, COL).Value = DataLine
DataLine = ""
Else
Sheets("test").Cells(row, COL).Value = Left(DataLine, InStr(DataLine, ",") - 1)
DataLine = Right(DataLine, Len(DataLine) - InStr(DataLine, ",")) ' rebuild array without data upto first ,
End If
COL = COL + 1 ' next column
Loop
COL = 1 ' reset column
row = row + 1 ' write to next row
Wend
'
Close #1 ' Close files straight away
End If
file = Dir
Loop
MsgBox "Data Updated"
End Sub
So I want to cry, because all this converting of UTF-8 to ASCII can be avoided simply by:
opening the txt file in Notepad++
going to the encoding tab
clicking convert to ASCII
running my original code.
BLAM
everything is perfect.
Thank you danieltakeshi for all your help!
Based on the first link I gave you, here is some test code that I ran successfully. It uses the charset CdoISO_8859_1:
Dim objStream As Object
Dim strData As String
Set objStream = CreateObject("ADODB.Stream")
objStream.Charset = "iso-8859-1"
objStream.Open
objStream.LoadFromFile ("C:\Users\user_name\Desktop\test.txt")
strData = objStream.ReadText()
Debug.Print strData & " Compare to: Changjíhuízúzìzhìzhou"
The output was:
EDIT:
Check the encoding of your .txt file and import it into Excel with the matching charset. For example, I changed test.txt to UTF-8 and imported it successfully with .Charset set to "utf-8".
You can also use Save As on your .txt file and choose the encoding there.
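Putting this together with the original import macro, the file can be read through the stream line by line and split on commas. This is only a sketch under assumptions: the file is UTF-8 with CrLf line endings (the stream's default line separator), the path and the sheet name "test" are taken from the question, and ImportUtf8TextFile is a hypothetical name.

```vba
Sub ImportUtf8TextFile()
    Const adTypeText = 2
    Const adReadLine = -2 ' read up to the next line separator
    Dim parts() As String
    Dim textLine As String
    Dim r As Long

    r = 1
    With CreateObject("ADODB.Stream")
        .Type = adTypeText
        .Charset = "utf-8" ' assumption: must match the file's real encoding
        .Open
        .LoadFromFile "C:\FilePath\Here\Data.txt"
        Do Until .EOS
            textLine = .ReadText(adReadLine)
            If Len(textLine) > 0 Then
                parts = Split(textLine, ",")
                ' Write all values of this line into one row, starting at column 1
                Sheets("test").Cells(r, 1).Resize(1, UBound(parts) + 1).Value = parts
            End If
            r = r + 1
        Loop
        .Close
    End With
End Sub
```

If the file uses Lf-only line endings, the stream's LineSeparator property would have to be changed before reading, otherwise the whole file comes back as one line.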
I have a subroutine that reads text files and extracts certain data from them. Here is an example:
NamePrefix = "Example"
OutputPath = "C:\Example"
DbSize = 65536
LstStr = ""
Dim Success() As Boolean
Dim Value() As Double
ReDim Success(1 to DbSize)
ReDim Value(1 to DbSize)
For ID = 1 to DbSize
'Read string
FileName = NamePrefix & Format(ID,"000000") & ".lst"
FilePath = OutputPath & "\" & FileName
Open FilePath For Input As 1
LstStr = Input(LOF(1),1)
Close 1
'Extract data
If InStr(1, LstStr, "SUCCESS") <> 0 Then Success(ID) = True Else Success(ID) = False
Pos1 = InStr(1, LstStr, "TH 1 value: ") 'Position varies for each file
Value(ID) = Val(Mid(LstStr, Pos1 + 13, 10)) 'Value in scientific notation
Next ID
Using InStr to locate strings by position works perfectly when the files contain only letters, numbers and symbols. However, some files contain Chinese characters, and then the Input function returns an empty string "" to LstStr. I tried some other suggested methods, but in vain (e.g. Extract text from a text file with Chinese characters using vba). How can I read files with Chinese characters successfully, without having to modify the other parts of the code that extract data by position? Thanks!
This is an alternative way to read the string. Make sure that .Charset is set to the charset of the file you want to read.
To use ADODB with early binding, add a reference to the Microsoft ActiveX Data Objects 6.1 Library (the version number may differ) under Tools › References in the VBA editor.
Dim adoStream As ADODB.Stream
Set adoStream = New ADODB.Stream
adoStream.Charset = "UTF-8" 'set the correct charset
adoStream.Open
adoStream.LoadFromFile FilePath
LstStr = adoStream.ReadText
adoStream.Close
Set adoStream = Nothing
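If adding the reference is not an option, the same read can be done with late binding. A minimal sketch, assuming the charset of the .lst files is known; ReadAllText is a hypothetical helper name, and the charset passed in has to match the files' actual encoding (for example "utf-8", "gb2312" or "big5").

```vba
' Reads a whole text file into a string using the given charset.
' Late-bound, so no library reference is required.
Function ReadAllText(ByVal filePath As String, ByVal charsetName As String) As String
    Const adTypeText = 2
    With CreateObject("ADODB.Stream")
        .Type = adTypeText
        .Charset = charsetName
        .Open
        .LoadFromFile filePath
        ReadAllText = .ReadText ' default: read to the end of the stream
        .Close
    End With
End Function
```

In the loop from the question, a line such as LstStr = ReadAllText(FilePath, "utf-8") would then stand in for the Open / Input / Close lines, leaving the InStr-based extraction unchanged.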
I have the following encrypted text in cell A1:
ԓԗՃխՓ՛ՐեՐ՞՚ըՖՑ՟խՑՙՔը՟՝ՇխՑ
I am trying to write this to a text file, but the text shows up as question marks (?????????????????) in the text file.
Here is my code:
textFilePath = ThisWorkbook.Path & "\customfile.txt"
FF = VBA.FreeFile
Open textFilePath For Output As #FF
Print #FF, CStr(Sheet1.Cells(1,1).Value)
Close #FF
FYI: If I manually copy and paste the value of cell A1 in notepad, the text shows fine.
You cannot write this with Print # like ordinary text: it writes in the system's ANSI code page, so characters outside that code page come out as question marks. Write the file as UTF-8 instead:
Dim fsT As Object
Set fsT = CreateObject("ADODB.Stream")
fsT.Type = 2 'Stream type 2 = text/string data
fsT.Charset = "utf-8" 'Charset used to encode the text
fsT.Open 'Open the stream
fsT.WriteText Sheet1.Cells(1, 1).Value 'Write the cell text to the stream
fsT.SaveToFile "e:\0\customfile.txt", 2 'Save the stream to disk (2 = overwrite)
fsT.Close
taken from:
Save text file UTF-8 encoded with VBA
I need to create text files from Excel rows.
Column 1 should contain the file names and column 2 the content of the text files. Each row holds a new file name and the new content for that text file. The content should also be split into several lines.
How to accomplish this? Thank you.
Edited solution: the text is now split into lines.
The sample uses the following characters as line separators:
;:,/|
Add further separators to the RegEx pattern as required. The full code is below:
Sub Text2Files()
Dim FileStream As Object
Dim FileContent As String
Dim i As Long
Dim SavePath As String
Dim RegX_Split As Object
Set RegX_Split = CreateObject("VBScript.RegExp")
RegX_Split.Pattern = "[\;\:\,\\\/\|]" 'List of line separators, each escaped as \X (this pattern also matches a backslash)
RegX_Split.IgnoreCase = True
RegX_Split.Global = True
Set FileStream = CreateObject("ADODB.Stream")
SavePath = "D:\DOCUMENTS\" 'Set existing folder with trailing "\"
For i = 1 To ThisWorkbook.ActiveSheet.Cells(1, 1).CurrentRegion.Rows.Count
FileContent = RegX_Split.Replace(ThisWorkbook.ActiveSheet.Cells(i, 2).Text, vbNewLine)
FileStream.Open
FileStream.Type = 2 'Text
FileStream.Charset = "UTF-8" 'Change encoding as required
FileStream.WriteText FileContent
FileStream.SaveToFile SavePath & ThisWorkbook.ActiveSheet.Cells(i, 1).Text, 2 'Will overwrite the existing file
FileStream.Close
Next i
End Sub
Read more about ADO Stream Object:
http://www.w3schools.com/ado/ado_ref_stream.asp
RegEx for beginners:
http://www.jose.it-berater.org/scripting/regexp/regular_expression_syntax.htm
Sample file with the above code is here: https://www.dropbox.com/s/kh9cq1gqmg07j20/Text2Files.xlsm