How to convert .xls to .csv using Powershell without Excel installed - excel

Is there a way to convert .xls to .csv without Excel being installed using Powershell?
I don't have access to Excel on a particular machine so I get an error when I try:
New-Object -ComObject excel.application
New-Object : Retrieving the COM class factory for component with CLSID
{00000000-0000-0000-0000-000000000000} failed due to the following
error: 80040154 Class not registered (Exception from HRESULT:
0x80040154 (REGDB_E_CLASSNOTREG)).

Foreword
Depending on what you already have installed on your system you might need the Microsoft Access Database Engine 2010 Redistributable for this solution to work. That will give you access to the provider: "Microsoft.ACE.OLEDB.12.0"
Disclaimer: Not super impressed with the result and someone with more background could make this answer better but here it goes.
Code
$strFileName = "C:\temp\Book1.xls"
$strSheetName = 'Sheet1$'
$strProvider = "Provider=Microsoft.ACE.OLEDB.12.0"
$strDataSource = "Data Source = $strFileName"
$strExtend = "Extended Properties='Excel 8.0;HDR=Yes;IMEX=1';"
$strQuery = "Select * from [$strSheetName]"
$objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
$sqlCommand.Connection = $objConn
$objConn.open()
$da = New-Object system.Data.OleDb.OleDbDataAdapter($sqlCommand)
$dt = New-Object system.Data.datatable
[void]$da.fill($dt)
$objConn.close()
$dt
This creates an OLE DB connection to the Excel file $strFileName. You need to know your sheet name and assign it to $strSheetName, which is used to build $strQuery. We then use several objects to open the connection and extract the data from the sheet as a System.Data.DataTable. In my test file, with one populated sheet, I had two columns of data. After running the code, the output of $dt is:
letter number
------ ------
a 2
d 34
b 0
e 4
You could then pipe that table to Export-Csv:
$dt | Export-Csv c:\temp\data.csv -NoTypeInformation
This was built based on information gathered from:
Scripting Guy
PowerShell Code Repository
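One possible reason the exported file looks untidy: when DataRow objects are piped to Export-Csv, PowerShell may append the row's bookkeeping properties (RowError, RowState, Table, ItemArray, HasErrors) as extra columns. A minimal sketch of one way around that, selecting only the real column names first. The in-memory table below is a stand-in for the $dt filled from the sheet, and $columnNames and $csvLines are names introduced here for illustration:

```powershell
# Stand-in for the DataTable filled from the worksheet (assumed sample data).
$dt = New-Object System.Data.DataTable
[void]$dt.Columns.Add('letter', [string])
[void]$dt.Columns.Add('number', [int])
[void]$dt.Rows.Add('a', 2)
[void]$dt.Rows.Add('d', 34)

# Collect only the column names defined on the table itself.
$columnNames = @($dt.Columns | ForEach-Object { $_.ColumnName })

# Project each row onto those columns before converting to CSV text.
$csvLines = $dt.Rows | Select-Object -Property $columnNames | ConvertTo-Csv -NoTypeInformation
$csvLines
```

To write a file instead of printing the lines, the last pipeline stage could be swapped for Export-Csv with a path and -NoTypeInformation, as in the command above.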

Related

Excel Open CSV file using TEXT format with powershell

I want to open the CSV file using powershell Excel.Application.
my code is like this:
$csv = "csv name"
$xlsx = "output excel name"
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$wb = $excel.Workbooks.Open($csv)
$wb.SaveAs($xlsx,51)
$excel.Quit()
But it turns out that the value "004" in the CSV is loaded as 4.
Can anyone think of a way to fix this?
Note that there are several special cases in my CSV:
There are values like "004" and "01234678" in the CSV, and I would like to import all of them as text.
There are commas within the data, like "FlatA, 7/F".
There are newline characters within the data, like
"abcdef
def
ghi"
You can also suggest your own solution for loading the CSV into Excel using PowerShell, as long as it handles all of the above cases.
Thanks a lot. You will save my life if you are able to do this.

Merge multiple CSV into one without using Excel.Application

I created a PowerShell script that allows me to merge multiple .CSV into one .XLSX file.
It works well on my computer:
$path = "C:\Users\Francesco\Desktop\CSV\Results\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
Write-Host " "$csvs.Name"`n"
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
for ($i=1;$i -le $y;$i++) {
$worksheet = $xlsx.Worksheets.Item($i)
$worksheet.Name = $csvs[$i-1].Name
$file = (Import-Csv $csvs[$i-1].FullName)
$file | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | clip
$worksheet.Cells.Item(1).PasteSpecial() | Out-Null
}
$output = "C:\Users\Francesco\Desktop\CSV\Results\Results.xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
The problem is that I need to run this on several servers and servers are well known for not having Office installed so I cannot use Excel.Application.
Is there a way to merge multiple CSV into one CSV or XLSX without using Excel.Application and saving each CSV into a different sheet?
@AnsgarWiechers is right: ImportExcel is powerful and not difficult to use. However, for your specific case you can use a more limited approach, using OleDb (or ODBC or ADO) to write to an Excel file as if it were a database. Here is some sample code showing how to write to an Excel file using OleDb.
$provider = 'Microsoft.ACE.OLEDB.12.0'
$dataSource = 'C:\users\user\OleDb.xlsb'
$connStr = "Provider=$provider;Data Source=$dataSource;Extended Properties='Excel 12.0;HDR=YES'"
$objConn = [Data.OleDb.OleDbConnection]::new($connStr)
$objConn.Open()
$cmd = $objConn.CreateCommand()
$sheetName = 'Demo'
$cmd.CommandText = "CREATE TABLE $sheetName (Name TEXT,Age NUMBER)"
$cmd.ExecuteNonQuery()
$cmd.CommandText = "INSERT INTO demo (Name,Age) VALUES ('Adam', 20)"
$cmd.ExecuteNonQuery()
$cmd.CommandText = "INSERT INTO demo (Name,Age) VALUES ('Bob',30)"
$cmd.ExecuteNonQuery()
$cmd.Dispose()
$objConn.Close()
$objConn.Dispose()
You didn't say much about the CSV files you'll be processing. If column data varies, to create the table you'll have to get the attribute (column) names from the CSV header (either by reading the first line of the CSV file, or by enumerating the properties of the first item returned by Import-CSV).
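As a rough sketch of the second option (enumerating the properties of the first item returned by Import-Csv), the CREATE TABLE statement could be assembled like this. The sample data, the $csvText, $columnDefs, and $createSql names, and the choice to declare every column as TEXT are assumptions made for illustration; ConvertFrom-Csv on an inline string stands in for Import-Csv on a real file:

```powershell
# Hypothetical sample data standing in for one of the CSV files.
$csvText = @'
Name,Age,Home City
Adam,20,Rome
Bob,30,Milan
'@
$rows = @($csvText | ConvertFrom-Csv)

# Column names are the property names of the first imported object.
$columnNames = @($rows[0].PSObject.Properties | ForEach-Object { $_.Name })

# Assumption: every column is declared as TEXT; brackets guard names containing spaces.
$sheetName  = 'Demo'
$columnDefs = ($columnNames | ForEach-Object { "[$_] TEXT" }) -join ', '
$createSql  = "CREATE TABLE [$sheetName] ($columnDefs)"
$createSql
```

The resulting string would then be assigned to $cmd.CommandText in place of the hard-coded statement. Whether the provider accepts every possible header name this way has not been tested here.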
If your CSV files have a large number of lines, writing one line at a time may be slow. In that case, using a DataSet and OleDbDataAdapter might improve performance (but I haven't tested this). At that point you might as well use OleDb to read the .csv directly into a DataSet, create an OleDbDataAdapter, set the adapter's InsertCommand property, and finally call the adapter's Update method. I don't have time to write and test all that.
This is not intended as a full solution, just a demo of how to use OleDb to write to an Excel file.
Note: I tested this on a server that didn't have Office or Excel installed. The Office data providers pre-installed on that machine were 32-bit, but I was using 64-bit PowerShell. To get 64-bit drivers I installed the Microsoft Access Database Engine 2016 Redistributable and that's what I used for testing.
Time has passed and I have found a new solution: Install-Module -Name ImportExcel
This way the module takes care of the job like in this script.

Is it possible to speed up an Excel query within PowerShell?

I am currently using the following query to add a specific row (with 10 columns) from an Excel spreadsheet (~1500 rows and hosted on SharePoint) to an array in PowerShell.
$connection = New-Object System.Data.OleDb.OleDbConnection
$connectstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=$Source;Extended Properties='Excel 12.0 Xml;HDR=YES'"; $connection.ConnectionString = $connectstring
$connection.open()
$cmdObject = New-Object System.Data.OleDb.OleDbCommand
$query = "Select * from [Sheet1$]
WHERE [Sheet1$].[Test] = '$Test'"
$cmdObject.CommandText = $query
$cmdObject.CommandType = "Text"
$cmdObject.Connection = $connection
$oReader = $cmdObject.ExecuteReader()
[void]$oReader.Read()
$Global:oData = New-Object PSObject
$oData | Add-Member NoteProperty List1 $oReader[0]
...
$oData | Add-Member NoteProperty List10 $oReader[9]
$oReader.Close()
$cmdObject.Dispose()
$connection.Close()
$connection.Dispose()
This works absolutely fine, however it is often quite slow. Is there any way in which I could speed up the query? I can't add the entire excel sheet into an array, as the data changes throughout the day and is queried regularly.
I've found other questions, such as Speed up reading an Excel File in Powershell however that doesn't seem relevant to this particular issue.
Appreciate any help.

Process .xlsx to csv with Powershell using rename and set delimiter

I have an Excel file that I receive and want to process it to a CSV using Powershell.
I have to alter it quite specifically so it can be a reliable input for a program that will process the csv info.
I don't know the exact headers, but i know there can be duplicates.
What I do is open the xlsx file with excel and save it as CSV:
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $True
$objExcel.DisplayAlerts = $True
$Workbook = $objExcel.Workbooks.open($xlsx1)
$WorkSheet = $WorkBook.sheets.item($sheet)
$xlCSV = 6
$Workbook = $objExcel.Workbooks.open($xlsx2)
$WorkSheet = $WorkBook.sheets.item($sheet)
$WorkBook.SaveAs($csv2,$xlCSV)
Now, the XLSX file will contain commas, so first I want to change them to dots.
I tried this, but it's not working:
$objRange = $worksheet.UsedRange
$objRange.Replace ",", "."
It errors out saying: Unexpected token '", "'.
Then, when saving, I want to set the delimiter to a comma, as it uses ";" by default.
With something like:
$WorkBook.SaveAs($csv2,$xlCSV) -delimiter ","
The last problem is the duplicate headers; these prevent PowerShell from using Import-Csv. Here is what I tried; it works when the file is separated with commas:
Get-Content $downloads\BBKS_DIR_AUTO_COMMA.csv -totalcount 1 >$downloads\Headers.txt
But then I need to rename the duplicate names; for example, I can have Regio, Regio, Regio.
I want to change this to Regio, Regio2, Regio3.
My plan was to look up the data in the txt file, search for duplicates, and then add an incremental number.
In the end I need to add a column with incremental numbers, always four digits long, like 0001, 0002, 0010, 0020, 0200, 1500. I won't exceed 9999. How can this be done?
If you can help me, even only partially, I'd be very happy.
Further, I'm running Windows 7 x64, PowerShell 3.0, Excel 2016 (if relevant).
If it's easier, it's fine to go back to Command Prompt for some tasks.
Personally, I wouldn't try and work with Excel sheets via Excel itself and COM - I'd use the excellent module https://github.com/dfinke/ImportExcel
Then you can import from the sheet straight to a native Powershell object array, and re-export with Export-Csv -Delimiter.
Edit: To answer follow ups :
Once you've loaded the module you can do "Get-Module ImportExcel | Select-Object -ExpandProperty ExportedCommands" to see what it makes available.
To import your Excel in the first place, do something like :
$WorkBook = Import-Excel
And if you need to take care of duplicate column names, you can do :
$WorkBook = Import-Excel -Header @("Regio1", "Regio2", "Regio")
Where the array you pass to -Header needs to include every column you want from the workbook.
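If the header names are not known in advance, the duplicate renaming and the four-digit counter from the question can also be done in plain PowerShell once the data is in CSV form. A minimal sketch, assuming a comma-separated header line without quoted fields; the sample header, the sample rows, and the column name Nr are made up for illustration:

```powershell
# Hypothetical header line, e.g. as read with Get-Content -TotalCount 1.
$headerLine = 'Regio,Naam,Regio,Regio'

# Rename repeats: the first occurrence keeps its name, later ones get 2, 3, ...
$seen = @{}
$uniqueHeaders = foreach ($name in $headerLine -split ',') {
    if ($seen.ContainsKey($name)) {
        $seen[$name]++
        "$name$($seen[$name])"
    } else {
        $seen[$name] = 1
        $name
    }
}

# Hypothetical data lines (the file content without its header line).
$dataLines = 'Noord,Jan,Oost,West', 'Zuid,Piet,Noord,Oost'

# Parse with the de-duplicated headers and add a zero-padded sequence column.
$counter = 0
$rows = $dataLines | ConvertFrom-Csv -Header $uniqueHeaders | ForEach-Object {
    $counter++
    $_ | Add-Member -NotePropertyName Nr -NotePropertyValue ('{0:D4}' -f $counter) -PassThru
}
$rows
```

The result can be piped to Export-Csv -Delimiter ',' -NoTypeInformation. This sketch does not handle a generated name colliding with an existing one (for example, a file that already has a Regio2 column).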

powershell excel access without installing Excel

I need to be able to read an existing (password protected) Excel spreadsheet (an .xlsx file) from Powershell - but I don't want to install Excel. Every approach I've found assumes that Excel is installed on the workstation where the script is running.
I've tried the Excel viewer, but it doesn't seem to work; it won't invoke properly. I've looked at other solutions on stackoverflow, but all of them seem to want to update the excel spreadsheet, and I'm hoping I don't have to go that far.
Am I missing something obvious?
See the detailed article from Scripting Guy here. It reads the workbook through an OLE DB provider, using the System.Data.OleDb classes (ADO.NET) from your PowerShell script.
Hey, Scripting Guy! How Can I Read from Excel Without Using Excel?
Relevant Powershell Snippet:
$strFileName = "C:\Data\scriptingGuys\Servers.xls"
$strSheetName = 'ServerList$'
$strProvider = "Provider=Microsoft.Jet.OLEDB.4.0"
$strDataSource = "Data Source = $strFileName"
$strExtend = "Extended Properties=Excel 8.0"
$strQuery = "Select * from [$strSheetName]"
$objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
$sqlCommand.Connection = $objConn
$objConn.open()
$DataReader = $sqlCommand.ExecuteReader()
While($DataReader.read())
{
$ComputerName = $DataReader[0].Tostring()
"Querying $computerName ..."
Get-WmiObject -Class Win32_Bios -computername $ComputerName
}
$dataReader.close()
$objConn.close()
That said, you have stated that your Excel file is password protected.
According to this Microsoft Support article, you cannot open password protected Excel files using OLEDB Connections.
From the Article:
On the Connection tab, browse to your workbook file. Ignore the "User
ID" and "Password" entries, because these do not apply to an Excel
connection. (You cannot open a password-protected Excel file as a data
source. There is more information on this topic later in this
article.)
If you don't have Excel installed, EPPlus is the best solution I know of to access Excel files from PowerShell. Refer to my answer here to setup EPPlus for PowerShell.
The following code creates a password-protected Excel file containing the output of Get-Process and then reads the process information back from the password-protected file:
# Load EPPlus
$DLLPath = "C:\Windows\System32\WindowsPowerShell\v1.0\Modules\EPPlus\EPPlus.dll"
[Reflection.Assembly]::LoadFile($DLLPath) | Out-Null
$FileName = "$HOME\Downloads\Processes.xlsx"
$Passwort = "Excel"
# Create Excel file with password
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage
$Worksheet = $ExcelPackage.Workbook.Worksheets.Add("FromCSV")
$ProcessesString = Get-Process | ConvertTo-Csv -NoTypeInformation | Out-String
$Format = New-Object -TypeName OfficeOpenXml.ExcelTextFormat -Property @{TextQualifier = '"'}
$null=$Worksheet.Cells.LoadFromText($ProcessesString,$Format)
$ExcelPackage.SaveAs($FileName,$Passwort)
# Open Excel file with password
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage -ArgumentList $FileName,$Passwort
# Select First Worksheet
$Worksheet = $ExcelPackage.Workbook.Worksheets[1]
# Get Process data from Cells
$Processes = 0..($Worksheet.Dimension.End.Row - $Worksheet.Dimension.Start.Row) | % {
# Get all Cells in a row
$Row = $Worksheet.Cells[($Worksheet.Dimension.Start.Row+$_),$Worksheet.Dimension.Start.Column,($Worksheet.Dimension.Start.Row+$_),$Worksheet.Dimension.End.Column]
# Join values of all Cells in a row to a comma separated string
($Row | select -ExpandProperty Value) -join ','
} | ConvertFrom-Csv
Refer to my answer here for more options to protect Excel files.
