Convert Xlsx to CSV UTF-8 format - excel

I want to convert my XLSX file to CSV UTF-8 format using VBScript or macros.
if WScript.Arguments.Count < 2 Then
WScript.Echo "Error! Please specify the source path and the destination. Usage: XlsToCsv SourcePath.xls Destination.csv"
Wscript.Quit
End If
Dim oExcel
Set oExcel = CreateObject("Excel.Application")
Dim oBook
Set oBook = oExcel.Workbooks.Open(Wscript.Arguments.Item(0))
oBook.SaveAs WScript.Arguments.Item(1), 6
oBook.Close False
oExcel.Quit
WScript.Echo "Done"
The above script works fine for normal formats.
Please help me convert the file into UTF-8 format.
I have also tried the code below, but it produces junk characters:
Public Sub convert_UnicodeToUTF8()
Dim parF1, parF2 As String
parF1 = "C:\shrangi\SX_Hospital.xlsx"
parF2 = "C:\shrangi\SX_Hospital.csv"
Const adSaveCreateOverWrite = 2
Const adTypeText = 2
Dim streamSrc, streamDst ' Source / Destination
Set streamSrc = CreateObject("ADODB.Stream")
Set streamDst = CreateObject("ADODB.Stream")
streamDst.Type = adTypeText
streamDst.Charset = "UTF-8"
streamDst.Open
With streamSrc
.Type = adTypeText
.Charset = "UTF-8"
.Open
.LoadFromFile parF1
.copyTo streamDst
.Close
End With
streamDst.SaveToFile parF2, adSaveCreateOverWrite
streamDst.Close
Set streamSrc = Nothing
Set streamDst = Nothing
End Sub

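Simply (in Excel versions that support the xlCSVUTF8 file format, whose value is 62; from VBScript pass the number 62, because Excel's named constants are not defined there):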
ActiveWorkbook.SaveAs Filename:="C:\yourPath\yourFileName.csv", FileFormat:=xlCSVUTF8
More Info:
MSDN: Workbook.SaveAs Method

Since you are converting an external file to an external file, you don't need to do it within Excel with VBA. That opens up some possibilities. With the OpenXML SDK you don't even need Excel.
OpenXML SDK is a bit hard to use, so there are a few wrappers for it to simplify workbook programming. EPPlus has a PowerShell wrapper around it called PSExcel. It makes this task really easy in PowerShell.
One-time setup, typically as an Administrator:
Install-Module PSExcel
Once per PowerShell session:
Import-Module PSExcel
Then:
Import-XLSX 'C:\shrangi\SX_Hospital.xlsx' | Export-CSV 'C:\shrangi\SX_Hospital.csv' -Encoding UTF8 -NoTypeInformation
For a simple workbook, that's all you need.
Side note on CSV: Converting from xlsx to csv throws out almost all the metadata and introduces the need for more metadata. Along with the file, you need to communicate the character encoding, the data types of each column, whether there is a header row, the line terminator, the field separator (not always comma), the culture-specific numeric formatting, the quote character (aka "text qualifier"), and the quote character escape mechanism. You can see all of these questions in the text import wizard, where Excel has to ask them.
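To make the side note concrete, here is a minimal Python sketch that writes and re-reads a small CSV with every convention spelled out. The sample rows and the choice of a semicolon separator are assumptions for illustration only, not something taken from the question.

```python
import csv
import io

# Hypothetical sample rows; the second field contains the delimiter and a quote.
rows = [["name", "note"], ["Zoë", 'said "hi"; left']]

# Spell out every convention the consumer of the file needs to know.
buffer = io.StringIO(newline="")
writer = csv.writer(
    buffer,
    delimiter=";",          # field separator (not always a comma)
    quotechar='"',          # quote character, aka "text qualifier"
    doublequote=True,       # escape a quote character by doubling it
    lineterminator="\r\n",  # line terminator
    quoting=csv.QUOTE_MINIMAL,
)
writer.writerows(rows)
text = buffer.getvalue()

# The character encoding is applied when the text becomes bytes.
data = text.encode("utf-8")

# Reading the file back only works with the same settings.
decoded = data.decode("utf-8")
parsed = list(
    csv.reader(io.StringIO(decoded, newline=""), delimiter=";", quotechar='"')
)
print(parsed == rows)
```

Column data types and numeric formatting are still not carried by the file; both sides have to agree on them separately.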

Related

Numbers get converted to strings while xlsx to csv conversion using vbs

Hi, I am trying to convert an .xlsx file to a .csv file and then read this CSV file in Python. Below is my code in VBS:
if WScript.Arguments.Count < 2 Then
WScript.Echo "Error! Please specify the source path and the destination. Usage: XlsToCsv SourcePath.xls Destination.csv"
Wscript.Quit
End If
Dim oExcel
Set oExcel = CreateObject("Excel.Application")
Dim oBook
Set oBook = oExcel.Workbooks.Open(Wscript.Arguments.Item(0))
oBook.SaveAs WScript.Arguments.Item(1), 6
oBook.Close False
oExcel.Quit
WScript.Echo "Done"
This code works properly, but when I read the converted CSV file in Python the numbers come through as strings. Example:
$ 266,74245.4545 -> '$266,74245'
Is there any way I can change the VBS code so that these numbers are written to the CSV file as they are? For my application I need the whole number when reading.
I took the above code from "https://stackoverflow.com/questions/1858195/convert-xls-to-csv-on-command-line".
I found a solution to this question: I change the currency ($) format to a number format in the opened worksheet and then convert it to CSV.
This is how the code looks (the third script argument is the address of the range to reformat):
Dim oBook
Set oBook = oExcel.Workbooks.Open(Wscript.Arguments.Item(0))
oBook.ActiveSheet.Range(WScript.Arguments.Item(2)).Select
oExcel.Selection.NumberFormat = "0000000000.00000"
oBook.SaveAs WScript.Arguments.Item(1), 6
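If the CSV cannot be regenerated, the cleanup could also happen on the Python side. The following is only a sketch: the helper name parse_currency and the sample values are hypothetical, and it assumes US-style formatting (dollar sign, comma as thousands separator, period as decimal point).

```python
from decimal import Decimal


def parse_currency(text):
    """Strip the dollar sign, commas, and surrounding spaces, then parse."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned)


# Hypothetical value as it might appear in a currency-formatted CSV cell.
value = parse_currency("$ 1,234.50")
print(value)
```

Decimal is used instead of float so that the digits in the file are preserved exactly.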

VBScript Importing .txt file to .xlsx file [duplicate]

I'm trying to convert pipe-delimited files to xls (Excel) with batch file and vbscript. Unfortunately, my "output.xls" file is still showing the pipe delimiter in the table and the data are not organized.
srccsvfile = Wscript.Arguments(0)
tgtxlsfile = Wscript.Arguments(1)
'Create Spreadsheet
'Look for an existing Excel instance.
On Error Resume Next ' Turn on the error handling flag
Set objExcel = GetObject(,"Excel.Application")
'If not found, create a new instance.
If Err.Number = 429 Then '> 0
Set objExcel = CreateObject("Excel.Application")
End If
objExcel.Visible = false
objExcel.displayalerts=false
'Import CSV into Spreadsheet
Set objWorkbook = objExcel.Workbooks.open(srccsvfile)
Set objWorksheet1 = objWorkbook.Worksheets(1)
'Adjust width of columns
Set objRange = objWorksheet1.UsedRange
objRange.EntireColumn.Autofit()
'This code could be used to AutoFit a select number of columns
'For intColumns = 1 To 17
' objExcel.Columns(intColumns).AutoFit()
'Next
'Make Headings Bold
objExcel.Rows(1).Font.Bold = TRUE
'Freeze header row
With objExcel.ActiveWindow
.SplitColumn = 0
.SplitRow = 1
End With
objExcel.ActiveWindow.FreezePanes = True
'Add Data Filters to Heading Row
objExcel.Rows(1).AutoFilter
'set header row gray
objExcel.Rows(1).Interior.ColorIndex = 15
'-0.249977111117893
'Save Spreadsheet, 51 = Excel 2007-2010
objWorksheet1.SaveAs tgtxlsfile, 51
'Release Lock on Spreadsheet
objExcel.Quit()
Set objWorksheet1 = Nothing
Set objWorkbook = Nothing
Set ObjExcel = Nothing
Source: http://www.tek-tips.com/viewthread.cfm?qid=1682555
A pipe is not a comma. Excel natively knows what to do with a CSV, but not with a pipe-delimited file.
All is not lost: record a macro while you open the file manually. Once it is open, highlight column A and click Data / Text To Columns, choose Delimited, put a pipe in the "Other" box and click Next. Choose the column formats (formatting numbers as text is useful for things like postcodes and phone numbers), then click Finish.
Now stop the recorder and look at the code it generated. Port this over to your Excel object in your script.
Excel is a little picky when it comes to reading CSV files. If you have a delimited file with the extension .csv Excel will only open it correctly via the Open method if the delimiter is the character configured in the system's regional settings.
The Open method has optional parameters that allow you to specify a custom delimiter character (credit to @Jeeped for pointing this out):
set objWorkbook = objExcel.Workbooks.Open(srccsvfile, , , 6, , , , , "|")
You can also use the OpenText method (which will be used when recording the action as a macro):
objExcel.Workbooks.OpenText srccsvfile, , , 1, , , , , , , True, "|"
Set objWorkbook = objExcel.Workbooks(1)
Note that the OpenText method does not return a workbook object, so you must assign the workbook to a variable yourself after opening the file.
Important: either way your file must not have the extension .csv if your delimiter character differs from your system's regional settings, otherwise the delimiter will be ignored.
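The same principle applies outside Excel: a parser has to be told the delimiter explicitly. As a small illustration in Python (the sample data is made up), the standard csv module reads pipe-delimited text once the delimiter is named:

```python
import csv
import io

# Hypothetical pipe-delimited content with CRLF line endings.
sample = "id|name|city\r\n1|Ann|Oslo\r\n2|Bob|Lima\r\n"

# newline="" leaves line endings untouched so the csv module can handle them.
reader = csv.reader(io.StringIO(sample, newline=""), delimiter="|")
rows = list(reader)
print(rows[0])
```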

Save .txt locally (UTF-8) [duplicate]

My system is Windows 10 English-US.
I need to write some non-printable ASCII characters to a text file. For example, for the ASCII value 28, I want to write \u001Cw to the file. I don't have to do anything special when coding this in Java. Below is my code in VBS:
Dim objStream
Set objStream = CreateObject("ADODB.Stream")
objStream.Open
objStream.Type = 2
objStream.Position = 0
objStream.CharSet = "utf-16"
objStream.WriteText ChrW(28) 'Need this to appear as \u001Cw in the output file
objStream.SaveToFile "C:\temp\test.txt", 2
objStream.Close
You need a read-write stream so that writing to it and saving it to file both work.
Const adModeReadWrite = 3
Const adTypeText = 2
Const adSaveCreateOverWrite = 2
Sub SaveToFile(text, filename)
With CreateObject("ADODB.Stream")
.Mode = adModeReadWrite
.Type = adTypeText
.Charset = "UTF-16"
.Open
.WriteText text
.SaveToFile filename, adSaveCreateOverWrite
.Close
End With
End Sub
text = Chr(28) & "Hello" & Chr(28)
SaveToFile text, "C:\temp\test.txt"
Other notes:
I like to explicitly define with Const all the constants in the code. Makes reading so much easier.
A With block saves quite some typing here.
Setting the stream type to adTypeText is not really necessary, that's the default anyway. But explicit is better than implicit, I guess.
Setting the Position to 0 on a new stream is superfluous.
It's unnecessary to use ChrW() for ASCII-range characters. The stream's Charset decides the byte width when you save the stream to file. In RAM, everything is Unicode anyway (yes, even in VBScript).
There are two UTF-16 encodings supported by ADODB.Stream: little-endian UTF-16LE (which is the default and synonymous with UTF-16) and big-endian UTF-16BE, with the byte order reversed.
You can achieve the same result with the FileSystemObject and its CreateTextFile() method:
Set FSO = CreateObject("Scripting.FileSystemObject")
Sub SaveToFile(text, filename)
' CreateTextFile(filename [, Overwrite [, Unicode]])
With FSO.CreateTextFile(filename, True, True)
.Write text
.Close
End With
End Sub
text = Chr(28) & "Hello" & Chr(28)
SaveToFile text, "C:\temp\test.txt"
This is a little bit simpler, but it only offers a Boolean Unicode parameter, which switches between UTF-16 and ANSI (not ASCII, as the documentation incorrectly claims!). The solution with ADODB.Stream gives you fine-grained encoding choices, for example UTF-8, which is impossible with the FileSystemObject.
For the record, there are two ways to create a UTF-8-encoded text file:
The way Microsoft likes to do it, with a 3-byte Byte Order Mark (BOM) at the start of the file. Most, if not all, Microsoft tools do that when they offer "UTF-8" as an option; ADODB.Stream is no exception.
The way everyone else does it: without a BOM. This is correct for most uses.
To create a UTF-8 file with a BOM, the first code sample above can be used with the Charset set to "UTF-8". To create a UTF-8 file without a BOM, we can use two stream objects:
Const adModeReadWrite = 3
Const adTypeBinary = 1
Const adTypeText = 2
Const adSaveCreateOverWrite = 2
Sub SaveToFile(text, filename)
Dim iStr: Set iStr = CreateObject("ADODB.Stream")
Dim oStr: Set oStr = CreateObject("ADODB.Stream")
' one stream for converting the text to UTF-8 bytes
iStr.Mode = adModeReadWrite
iStr.Type = adTypeText
iStr.Charset = "UTF-8"
iStr.Open
iStr.WriteText text
' one stream to write bytes to a file
oStr.Mode = adModeReadWrite
oStr.Type = adTypeBinary
oStr.Open
' switch first stream to binary mode and skip UTF-8 BOM
iStr.Position = 0
iStr.Type = adTypeBinary
iStr.Position = 3
' write remaining bytes to file and clean up
oStr.Write iStr.Read
oStr.SaveToFile filename, adSaveCreateOverWrite
oStr.Close
iStr.Close
End Sub
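To see what the 3-byte BOM actually is, here is a short Python sketch (Python is used only for illustration) that encodes the same text both ways and compares the bytes. Python's "utf-8-sig" codec writes the BOM and "utf-8" does not.

```python
# Same sample text as in the VBScript example: control character 28 around "Hello".
text = chr(28) + "Hello" + chr(28)

with_bom = text.encode("utf-8-sig")  # UTF-8 preceded by the byte order mark
without_bom = text.encode("utf-8")   # plain UTF-8

print(with_bom[:3])                  # the three BOM bytes
print(with_bom[3:] == without_bom)   # the rest is identical
```

Skipping the first three bytes of the BOM-prefixed output is exactly what the two-stream approach above does with Position = 3.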

VBS OpenTextFile returns unexpected result

This is my code:
Set fso = CreateObject("Scripting.FileSystemObject")
strText = fso.OpenTextFile(strLocalFolderName & "\" & Oudste).ReadAll()
msgbox strText
But strText contains rubbish after these lines.
How can that be?
Darn! The boolean option within OpenTextFile examples is often left out!
fso.OpenTextFile(Path, ForReading, False, TriStateTrue)
Path is the path to the file. ForReading should be 1 for read only.
The False is the often-omitted Boolean create flag (False means the file is not created if it does not exist).
Only when that Boolean is supplied in the right position can you pick the format of the text file.
In my case the file is Unicode, so I pick -1 (TristateTrue) for the format.
Tip: if you ever get weird results with text files, open the file in Notepad and choose Save As; the dialog will reveal what encoding the file actually has.
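The Notepad check can also be approximated in code by looking at the first bytes of the file. The sketch below is in Python and rests on assumptions: the helper name sniff_bom is hypothetical, it only recognizes UTF-8 and UTF-16 byte order marks, and a file without a BOM yields None, so the encoding still has to be known some other way.

```python
import codecs


def sniff_bom(data):
    """Return a codec name based on a leading byte order mark, or None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec uses the BOM to pick the byte order
    return None


# Hypothetical file content: a little-endian UTF-16 BOM followed by the text.
sample = codecs.BOM_UTF16_LE + "héllo".encode("utf-16-le")
encoding = sniff_bom(sample)
print(encoding, sample.decode(encoding))
```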
Your problem can have many causes, such as the encoding of the target file. One of the most common encodings is UTF-8; you can change the encoding with Notepad++:
How do I convert an ANSI encoded file to UTF-8 with Notepad++?
I think you should add some validation code to find the real problem. I suggest this code:
ForReading=1 'Open a file for reading only. You can't write to this file.
ForWriting=2 'Open a file for writing.
ForAppending=8 'Open a file and write to the end of the file.
CreateIfNotExist=TRUE 'If you use FALSE you get error if not exist
set fso = CreateObject("Scripting.FileSystemObject")
if (fso.fileexists(".\test.txt")) then
set ts = fso.OpenTextFile(".\test.txt", ForReading, CreateIfNotExist)
if NOT ts.AtEndOfStream then
s = ts.ReadAll
msgbox s
else
msgbox "End of file"
end if
else
msgbox "File not found"
end if

How do I read CRLF delimited lines from MSXML.ResponseText in VBA/Excel

Most of the examples I see with MSXML have to do with JavaScript or jQuery, but I'm writing an Excel 2010 macro that doesn't use either.
My goal is to download a file (as shown below) and parse a medium-sized (5 to 15 MB) CSV file. I ultimately want to save the CSV data in a hidden data tab.
I made a little progress with this CSV VBA sample here, but I don't know how to glue the output of MSXML's ResponseText to that sample.
Here is my VBA/Macro code
Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
'objHttp.SetRequestHeader "Content-Type", "text/csv"
'objHttp.SetRequestHeader "charset", "gb2312"
Call objHttp.Open("GET", fileURL, False)
Call objHttp.Send("")
'Call MsgBox(objHttp.ResponseText)
How do I get Excel to work with ResponseText and read only one line at a time?
I say, don't mix things. First download the CSV file, then read it.
From your question it isn't obvious what your goal is. If you want to parse the file, then you can read and parse it line by line like this using native VBA statements:
Dim filePath As String
Dim fn As Integer
Dim myLine As String
Dim myParsedLine() As String
filePath = "C:\DatabaseWeeklyStats.csv"
fn = FreeFile()
Open filePath For Input As #fn
Do Until EOF(fn)
Line Input #fn, myLine
myParsedLine = Split(myLine, ",")
' Line is now parsed. Do stuff.
Loop
Close #fn ' release the file handle when done
If you just want to stick the entire CSV file in a new sheet in your workbook without necessarily "parsing" it (i.e. interpreting its contents) beforehand, then you can do this:
Dim dbSheet As Worksheet
Dim targetSheet As Worksheet
Workbooks.Open Filename:="C:\DatabaseWeeklyStats.csv", _
Format:=2 ' use comma delimiters
Set dbSheet = ActiveSheet
Set targetSheet = Workbooks("Book1").Sheets(3) ' wherever you want to move it to
dbSheet.Move After:=targetSheet
' dbSheet is now in your workbook.
' Hide it.
Set dbSheet = ActiveSheet
dbSheet.Visible = xlSheetHidden
Dim opener As New FileSystemObject
Dim fContainer
Set fContainer = opener.OpenTextFile("c:\DatabaseWeeklyStats.csv")
Do Until fContainer.AtEndOfStream
sText = fContainer.ReadLine
Debug.Print sText
Loop
' This requires reference to Microsoft Scripting Runtime
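For comparison, the general idea of walking CRLF-delimited text one line at a time without going through a file can be sketched in Python. This is an illustration only, not a VBA solution: the variable response_text stands in for the downloaded body and its content is made up.

```python
import csv
import io

# Hypothetical response body with CRLF line endings.
response_text = "date,visits\r\n2012-01-01,10\r\n2012-01-08,12\r\n"

# Plain line splitting: splitlines() handles CRLF, LF, and CR alike.
lines = response_text.splitlines()

# CSV-aware parsing, one row at a time, directly from the in-memory text.
rows = []
for row in csv.reader(io.StringIO(response_text, newline="")):
    rows.append(row)

print(len(lines), rows[0])
```

The CSV-aware loop is the safer of the two when quoted fields may contain commas or line breaks.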

Resources