import xlsx converted to csv to powershell - excel

I need to import a CSV file into my PowerShell script. The same script has a function that converts an xlsx file to CSV, and that CSV is the file I need to import, but I can't find a way to do it. I tried three approaches:
$CSVLicence = Import-Csv (ConvertXLSX -File $Licence) -Delimiter "," -Encoding UTF8
Where ConvertXLSX is the function to convert xlsx to csv, $Licence is a variable defined at the beginning of the script. The second approach is this:
$CSVfile = ConvertXLSX -File $Licence
$CSVLicence = Import-Csv $CSVfile -Delimiter "," -Encoding UTF8
In both cases I get an error message on the column right after "Import-Csv", that argument Path is null or empty.
The third approach was defining the path literally, which is not really a good solution, but should work, if executor of the file runs it in correct folder and uses correct file name. In this case I get an error "The member "Proemail Exchg Business" is already present" in front of the "Import-Csv" cmdlet
The csv file gets created and it looks precisely as it should, so there's obviously not an error in the converting function.
The ConvertXLSX function is defined like this:
$Licence = "licence"
Function ConvertXLSX ($File)
{
$PWD = "r:\Licence\"
$ExcelFile = $PWD + $File + ".xlsx"
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
$wb = $Excel.Workbooks.Open($ExcelFile)
foreach ($ws in $wb.Worksheets | where {$_.name -eq "Pridani"})
{
$ws.SaveAs($PWD + $File + ".csv",6)
}
$Excel.Quit()
}
ConvertXLSX -File $Licence
The function creates the csv file successfully, without any problems

You're currently using the wrong cmdlet. Import-Csv is designed to be used with a file.csv, not a string object or otherwise. There is a cmdlet that exists for that purpose, ConvertFrom-Csv, that takes an -InputObject parameter (indicating it takes pipeline input).
PS C:\> Get-Help -Name 'ConvertFrom-Csv'
SYNTAX
ConvertFrom-Csv [-InputObject] <PSObject[]> [[-Delimiter] <Char>] [-Header <String[]>]
Used in your example:
## Declaration of the delimiter is unnecessary when it's a comma
$CSVLicense = ConvertXLSX -File $Licence | ConvertFrom-Csv
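To illustrate the difference, here is a minimal sketch with made-up sample data (the column names and values are assumptions, not taken from the question). ConvertFrom-Csv parses CSV text that is already in memory, whereas Import-Csv expects the path of a file on disk.

```powershell
# Sample CSV text held in memory (hypothetical data for illustration)
$csvText = @(
    'Name,Seats'
    'Contoso,5'
    'Fabrikam,12'
)

# ConvertFrom-Csv takes the lines from the pipeline and emits one object per data row
$rows = $csvText | ConvertFrom-Csv

$rows | ForEach-Object { '{0} has {1} seats' -f $_.Name, $_.Seats }
```

Note that this only helps if the upstream command actually emits CSV text; a command that emits nothing gives ConvertFrom-Csv nothing to parse.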

TheIncorrigible1 explained the issue quite well. Now that you've edited the question to add the ConvertXLSX function, we can build on that to get a solution that fits.
$ws.SaveAs($PWD + $File + ".csv",6)
This part is the save to the filename $PWD + $File + ".csv". The number 6 represents the CSV XlFileFormat Enumeration.
The function does not return anything; it only performs an action on an external file. That's why you are getting the null or empty error. We can get it to return the file path, or to return the data in an object and use ConvertFrom-Csv.
Given your code, the object would require more work as you already have the file path there.
To return file path, change:
}
$Excel.Quit()
}
To:
}
$Excel.Quit()
return $($PWD + $File + ".csv")
}
Your code should then work.
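The effect of the return value can be reproduced without Excel. In this sketch, Convert-Stub is a hypothetical stand-in for ConvertXLSX (the file name and sample rows are assumptions too): it writes a small CSV to the temp folder and returns the path, so Import-Csv receives a non-empty path.

```powershell
# Hypothetical stand-in for ConvertXLSX: writes a CSV and returns its path
function Convert-Stub ($File) {
    $csvPath = Join-Path ([System.IO.Path]::GetTempPath()) ($File + '.csv')
    Set-Content -Path $csvPath -Value 'User,Licence', 'anna,E3' -Encoding UTF8
    return $csvPath   # without this line the caller receives $null
}

$csvFile    = Convert-Stub -File 'licence-demo'
$csvLicence = Import-Csv -Path $csvFile -Delimiter ',' -Encoding UTF8
$csvLicence
```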
You could improve its readability and conciseness further:
Assign the CSV filename to a variable.
Avoid assigning to $PWD, as it is an automatic variable that holds the current location.
Get rid of the foreach. You are only getting one sheet and saving that sheet at the moment. If you need to do this for multiple sheets, you will have to think about how to pass the multiple paths/objects to Import-Csv/ConvertFrom-Csv.
Putting it all together:
$Licence = "licence"
Function ConvertXLSX ($File)
{
$rootDir = "r:\Licence\"
$ExcelFile = $rootDir + $File + ".xlsx"
$CSVFile = $ExcelFile.Replace(".xlsx",".csv")
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
$wb = $Excel.Workbooks.Open($ExcelFile)
$ws = $wb.Worksheets | where {$_.name -eq "Pridani"}
$ws.SaveAs($CSVFile,6)
$Excel.Quit()
return $CSVFile
}
ConvertXLSX -File $Licence
I get an error "The member "Proemail Exchg Business" is already present" in front of the "Import-Csv" cmdlet
This one is handled in this StackOverflow post. Short version is you have two columns with "Proemail Exchg Business" in the first row.
PowerShell creates an object from the CSV: the column headers must be unique. Either fix in the excel sheet, or manually specify unique headers using -Header as in linked post.
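As a rough illustration of the -Header route, the sketch below uses made-up CSV text with a repeated column name (the replacement header names are assumptions). Because -Header makes the parser treat the original first row as data, that row is skipped afterwards. Import-Csv accepts the same -Header parameter when reading from a file.

```powershell
# Hypothetical CSV text whose first row repeats a column name
$lines = @(
    'User,Proemail Exchg Business,Proemail Exchg Business'
    'anna,1,0'
    'ben,0,1'
)

# Supply unique names, then drop the original header row, which is now read as data
$header = 'User', 'ProemailBusiness1', 'ProemailBusiness2'
$rows = $lines | ConvertFrom-Csv -Header $header | Select-Object -Skip 1
$rows
```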

Related

Issues pulling value of cell using excel com objects in powershell

I am writing a script that scans each cell in an excel file for PII. I've got most of it working, but I am experiencing two issues which may be related.
First of all, I am not convinced that the "Do" loop is performing as intended. The goal here is if the text in a cell matches the regex string, create a PSCustomObject with the location information, then use the object to add a line to a csv file.
It appears that the loop is running for every file, regardless of whether or not it actually found a match.
The other issue is that I can't seem to actually pull the cell value for the matched cell. I've tried several different variables and methods, the latest attempt being "$target.text," but the value of the variable is always null.
I've been racking my brain on this for days, but I'm sure it'll be obvious once I see it.
Any help here would be appreciated.
Thanks.
$searchtext = "\b(?!0{3}|6{3})([0-6]\d{2}|7([0-6]\d|7[012]))([ -]?)(?!00)\d\d\3(?!0000)\d{4}\b"
$xlsFiles = Get-ChildItem $searchpath -recurse -include *.xlsx, *.xls, *.xlsm | Select-object -Expand FullName
$Excel = New-Object -ComObject Excel.Application
$excel.DisplayAlerts = $false;
$excel.AskToUpdateLinks = $false;
foreach ($xlsfile in $xlsfiles) {
Write-host (Get-Date -f yyyyMMdd:HHmm) $xlsfile
try{
$Workbook = $Excel.Workbooks.Open($xlsFile, 0, 0, 5, "password")
}
Catch {
Write-host $xlsfile 'is password protected. Skipping...' -ForegroundColor Yellow
continue
}
ForEach ($Sheet in $($Workbook.Sheets)) {
$i = $sheet.Index
$Range = $Workbook.Sheets.Item($i).UsedRange
$Target = $Sheet.UsedRange.Find($Searchtext)
$First = $Target
Do {
$Target = $Range.Find($Target)
$Violation = [PSCustomObject]@{
Path = $xlsfile
Line = "SSN Found" + $target.text
LineNumber = "Sheet: " + $i
}
$Violation | Select-Object Path, Line, LineNumber | export-csv $outputpath\$PIIFile -append -NoTypeInformation
}
While ($NULL -ne $Target -and $Target.AddressLocal() -ne $First.AddressLocal())
}
$Excel.Quit()
}
Figured it out. Just a simple case of faulty logic in the loops.
Thanks to everyone who looked at this.

Powershell - Excel SaveAs csv with specified delimiter

Afternoon all,
Is it possible to save a CSV file using Powershell with a different delimiter, in my case "§". I am using the following script to open and change items in an XLSX file and then wish to save as a "§" delimited CSV. The find and replace method does not work in my case ( (Get-Content -Path $CSVfile).Replace(',','§') | Set-Content -Path $CSVfile2)
$Path = "C:\ScriptRepository\CQC\DataToLoad\"
$FileName = (Get-ChildItem $path).FullName
$FileName2 = (Get-ChildItem $path).Name
$CSVFile = "$Path\$Filename2.csv"
$Excel = New-Object -ComObject Excel.Application -Property @{Visible = $false}
$Excel.displayalerts=$False
$Workbook = $Excel.Workbooks.Open($FileName)
$WorkSheet = $WorkBook.Sheets.Item(2)
$Worksheet.Activate()
$worksheet.columns.item('G').NumberFormat ="m/d/yyyy"
$Worksheet.Cells.Item(1,3).Value = "Site ID"
$Worksheet.Cells.Item(1,4).Value = "Site Name"
$Worksheet.SaveAs($CSVFile, [Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
$workbook.Save()
$workbook.Close()
$Excel.Quit()
Running the following command, will let you save the CSV file using the delimiter §
Import-CSV filename.csv | ConvertTo-CSV -NoTypeInformation -Delimiter "§" | Out-File output_filename.csv
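The same idea can be tried on in-memory data first. This is a minimal sketch with made-up rows; the delimiter is written as [char]0x00A7 (the § character) only to sidestep script-file encoding issues, which is a choice of this example rather than a requirement.

```powershell
# [char]0x00A7 is the section sign (§)
$delimiter = [char]0x00A7

# Hypothetical sample rows standing in for Import-Csv filename.csv
$rows = 'Site ID,Site Name', '1,North', '2,South' | ConvertFrom-Csv

# Re-serialise the objects with the custom delimiter
$converted = $rows | ConvertTo-Csv -NoTypeInformation -Delimiter $delimiter
$converted
```

When the result is written to disk, it is probably worth passing an explicit -Encoding (for example UTF8) to Out-File or Set-Content so the § character survives.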
You should check out ImportExcel - PowerShell module to import/export Excel spreadsheets, without Excel. It makes working with excel files easier using powershell.
I know this is an older post but here is an option I recently came across:
Just update e:\projects\dss\pse&g.xlsx with the source location and file, and file.csv with the output location and file name. Finally, change [Sheet1$] if your worksheet is named differently.
$oleDbConn = New-Object System.Data.OleDb.OleDbConnection
$oleDbCmd = New-Object System.Data.OleDb.OleDbCommand
$oleDbAdapter = New-Object System.Data.OleDb.OleDbDataAdapter
$dataTable = New-Object System.Data.DataTable
$oleDbConn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=e:\projects\dss\pse&g.xlsx;Extended Properties=Excel 12.0;Persist Security Info=False"
$oleDbConn.Open()
$oleDbCmd.Connection = $OleDbConn
$oleDbCmd.commandtext = 'Select * from [Sheet1$]'
$oleDbAdapter.SelectCommand = $OleDbCmd
$ret=$oleDbAdapter.Fill($dataTable)
Write-Host "Rows returned:$ret" -ForegroundColor green
$dataTable | Export-Csv file.csv -Delimiter ';'
$oleDbConn.Close()
Source
I was using SaveAs(file.csv,6) but couldn't change the delimiter. Also Ishan's resolution works but I wanted something more OOB as this is going to be used within an SSIS package for myself across different systems and this just works. =)

Remove known Excel passwords with PowerShell

I have this PowerShell code that loops through Excel files in a specified directory; references a list of known passwords to find the correct one; and then opens, decrypts, and saves that file to a new directory.
But it's not executing as quickly as I'd like (it's part of a larger ETL process and it's a bottleneck). At this point I can remove the passwords faster manually as the script takes ~40 minutes to decrypt 40 workbooks while referencing a list of ~50 passwords.
Is there a cmdlet or function (or something) that's missing which would speed this up, an overlooked flaw in the processing, or is PowerShell, perhaps, just not the right tool for this job?
Original Code (updated code can be found below):
$ErrorActionPreference = "SilentlyContinue"
CLS
# Paths
$encrypted_path = "C:\PoShTest\Encrypted\"
$decrypted_Path = "C:\PoShTest\Decrypted\"
$original_Path = "C:\PoShTest\Originals\"
$password_Path = "C:\PoShTest\Passwords\Passwords.txt"
# Load Password Cache
$arrPasswords = Get-Content -Path $password_Path
# Load File List
$arrFiles = Get-ChildItem $encrypted_path
# Create counter to display progress
[int] $count = ($arrfiles.count -1)
# Loop through each file
$arrFiles| % {
$file = get-item -path $_.fullname
# Display current file
write-host "Processing" $file.name -f "DarkYellow"
write-host "Items remaining: " $count `n
# Excel xlsx
if ($file.Extension -eq ".xlsx") {
# Loop through password cache
$arrPasswords | % {
$passwd = $_
# New Excel Object
$ExcelObj = $null
$ExcelObj = New-Object -ComObject Excel.Application
$ExcelObj.Visible = $false
# Attempt to open file
$Workbook = $ExcelObj.Workbooks.Open($file.fullname,1,$false,5,$passwd)
$Workbook.Activate()
# if password is correct - Save new file without password to $decrypted_Path
if ($Workbook.Worksheets.count -ne 0) {
$Workbook.Password=$null
$savePath = $decrypted_Path+$file.Name
write-host "Decrypted: " $file.Name -f "DarkGreen"
$Workbook.SaveAs($savePath)
# Close document and Application
$ExcelObj.Workbooks.close()
$ExcelObj.Application.Quit()
# Move original file to $original_Path
move-item $file.fullname -Destination $original_Path -Force
}
else {
# Close document and Application
write-host "PASSWORD NOT FOUND: " $file.name -f "Magenta"
$ExcelObj.Close()
$ExcelObj.Application.Quit()
}
}
}
$count--
# Next File
}
Write-host "`n Processing Complete" -f "Green"
Updated code:
# Get current Excel process IDs so they are not affected by the script's cleanup
# SilentlyContinue in case there are no active Excels
$currentExcelProcessIDs = (Get-Process excel -ErrorAction SilentlyContinue).Id
$a = Get-Date
$ErrorActionPreference = "SilentlyContinue"
CLS
# Paths
$encrypted_path = "C:\PoShTest\Encrypted"
$decrypted_Path = "C:\PoShTest\Decrypted\"
$processed_Path = "C:\PoShTest\Processed\"
$password_Path = "C:\PoShTest\Passwords\Passwords.txt"
# Load Password Cache
$arrPasswords = Get-Content -Path $password_Path
# Load File List
$arrFiles = Get-ChildItem $encrypted_path
# Create counter to display progress
[int] $count = ($arrfiles.count -1)
# New Excel Object
$ExcelObj = $null
$ExcelObj = New-Object -ComObject Excel.Application
$ExcelObj.Visible = $false
# Loop through each file
$arrFiles| % {
$file = get-item -path $_.fullname
# Display current file
write-host "`n Processing" $file.name -f "DarkYellow"
write-host "`n Items remaining: " $count `n
# Excel xlsx
if ($file.Extension -like "*.xls*") {
# Loop through password cache
$arrPasswords | % {
$passwd = $_
# Attempt to open file
$Workbook = $ExcelObj.Workbooks.Open($file.fullname,1,$false,5,$passwd)
$Workbook.Activate()
# if password is correct, remove $passwd from array and save new file without password to $decrypted_Path
if ($Workbook.Worksheets.count -ne 0)
{
$Workbook.Password=$null
$savePath = $decrypted_Path+$file.Name
write-host "Decrypted: " $file.Name -f "DarkGreen"
$Workbook.SaveAs($savePath)
# Added to keep Excel process memory utilization in check
$ExcelObj.Workbooks.close()
# Move original file to $processed_Path
move-item $file.fullname -Destination $processed_Path -Force
}
else {
# Close Document
$ExcelObj.Workbooks.Close()
}
}
}
$count--
# Next File
}
# Close Document and Application
$ExcelObj.Workbooks.close()
$ExcelObj.Application.Quit()
Write-host "`nProcessing Complete!" -f "Green"
Write-host "`nFiles w/o a matching password can be found in the Encrypted folder."
Write-host "`nTime Started : " $a.ToShortTimeString()
Write-host "Time Completed : " $(Get-Date).ToShortTimeString()
Write-host "`nTotal Duration : "
New-TimeSpan -Start $a -End $(Get-Date)
# Remove any stale Excel processes created by this script's execution
Get-Process excel -ErrorAction SilentlyContinue | Where-Object{$currentExcelProcessIDs -notcontains $_.id} | Stop-Process
If nothing else I do see one glaring performance issue that should be easy to address. You are opening a new excel instance for testing each individual password for each document. 40 workbooks with 50 passwords mean you have opened 2000 Excel instances one at a time.
You should be able to keep using the same instance without a functionality hit. Move this code out of your innermost loop:
# New Excel Object
$ExcelObj = $null
$ExcelObj = New-Object -ComObject Excel.Application
$ExcelObj.Visible = $false
as well as the snippet that would close the process. It would need to be out of the loop as well.
$ExcelObj.Close()
$ExcelObj.Application.Quit()
If that does not help enough you would have to consider doing some sort of parallel processing with jobs etc. I have a basic solution in a CodeReview.SE answer of mine doing something similar.
Basically what it does is run several excels at once where each one works on a chunk of documents which runs faster than one Excel doing them all. Just like I do in the linked answer I caution the automation of Excel COM with PowerShell. COM objects don't always get released properly and locks can be left on files or processes.
You are looping for all 50 passwords regardless of success or not. That means you could find the right password on the first go but you are still going to try the other 49! Set a flag in the loop to break that inner loop when that happens.
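A minimal sketch of that early exit, runnable without Excel: Test-Password is a hypothetical stand-in for the Workbooks.Open attempt, and the password list is made up. A foreach statement is used here because break inside a ForEach-Object (%) script block does not simply leave that one loop.

```powershell
# Hypothetical stand-in for the attempt to open the workbook with a password
function Test-Password ($Password) { $Password -eq 'bravo' }

$passwords = 'alpha', 'bravo', 'charlie', 'delta'
$attempts  = 0
$found     = $null

foreach ($passwd in $passwords) {
    $attempts++
    if (Test-Password $passwd) {
        $found = $passwd
        break   # stop trying the remaining passwords
    }
}

"Matched '$found' after $attempts attempt(s)"
```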
As far as the password logic goes you say that
At this point I can remove the passwords faster manually since the script takes ~40 minutes
Why can you do it faster? What do you know that the script does not? I don't see you being able to outperform the script by doing exactly what it does.
With what I see another suggestion would be to keep/track successful passwords and associated file name. So that way when it gets processed again you would know the first password to try.
This solution uses the modules ImportExcel for easier working with Excel files, and PoshRSJob for multithreaded processing.
If you do not have these, install them by running:
Install-Module ImportExcel -scope CurrentUser
Install-Module PoshRSJob -scope CurrentUser
I've raised an issue on the ImportExcel module GitHub page where I've proposed a solution to open encrypted Excel files. The author may propose a better solution (and consider the impact across other functions in the module, but this works for me). For now, you'll need to make a modification to the Import-Excel function yourself:
Open: C:\Username\Documents\WindowsPowerShell\Modules\ImportExcel\2.4.0\ImportExcel.psm1 and scroll to the Import-Excel function. Replace:
[switch]$DataOnly
With
[switch]$DataOnly,
[String]$Password
Then replace the following line:
$xl = New-Object -TypeName OfficeOpenXml.ExcelPackage -ArgumentList $stream
With the code suggested here. This will let you call the Import-Excel function with a -Password parameter.
Next we need our function to repeatedly try and open a singular Excel file using a known set of passwords. Open a PowerShell window and paste in the following function (note: this function has a default output path defined, and also outputs passwords in the verbose stream - make sure no-one is looking over your shoulder or just remove that if you'd prefer):
function Remove-ExcelEncryption
{
[CmdletBinding()]
Param
(
[Parameter(Mandatory=$true)]
[String]
$File,
[Parameter(Mandatory=$false)]
[String]
$OutputPath = 'C:\PoShTest\Decrypted',
[Parameter(Mandatory=$true)]
[Array]
$PasswordArray
)
$filename = Split-Path -Path $file -Leaf
foreach($Password in $PasswordArray)
{
Write-Verbose "Attempting to open $file with password: $Password"
try
{
$ExcelData = Import-Excel -path $file -Password $Password -ErrorAction Stop
Write-Verbose "Successfully opened file."
}
catch
{
Write-Verbose "Failed with error $($Error[0].Exception.Message)"
continue
}
try
{
$null = $ExcelData | Export-Excel -Path $OutputPath\$filename
return "Success"
}
catch
{
Write-Warning "Could not save to $OutputPath\$filename"
}
}
}
Finally, we can run code to do the work:
$Start = get-date
$PasswordArray = @('dj7F9vsm','kDZq737b','wrzCgTWk','DqP2KtZ4')
$files = Get-ChildItem -Path 'C:\PoShTest\Encrypted'
$files | Start-RSJob -Name {$_.Name} -ScriptBlock {
Remove-ExcelEncryption -File $_.Fullname -PasswordArray $Using:PasswordArray -Verbose
} -FunctionsToLoad Remove-ExcelEncryption -ModulesToImport ImportExcel | Wait-RSJob | Receive-RSJob
$end = Get-Date
New-TimeSpan -Start $Start -End $end
For me, if the correct password is first in the list it runs in 13 seconds against 128 Excel files. If I call the function in a standard foreach loop, it takes 27 seconds.
To view which files were successfully converted we can inspect the output property on the RSJob objects (this is the output of the Remove-ExcelEncryption function where I've told it to return "Success"):
Get-RSJob | Select-Object -Property Name,Output
Hope that helps.

Convert multiple xls to csv using powershell

I'm trying to convert multiple Excel files (xls) located in a folder to csv using PowerShell.
I can convert a single file, but I need advice on how to convert multiple files in a folder.
$ExcelWB = new-object -comobject excel.application
$Workbook = $ExcelWB.Workbooks.Open("c:\temp\temp.xls")
$Workbook.SaveAs("c:\temp\temp.csv",6)
$Workbook.Close($false)
$ExcelWB.quit()
You can just wrap it in a loop that iterates over all the files and change the xls extension to csv:
foreach($file in (Get-ChildItem "C:\temp")) {
$newname = $file.FullName -replace '\.xls$', '.csv'
$ExcelWB = new-object -comobject excel.application
$Workbook = $ExcelWB.Workbooks.Open($file.FullName)
$Workbook.SaveAs($newname,6)
$Workbook.Close($false)
$ExcelWB.quit()
}
There are caveats with this untested code but it should help wrap your head around your issue
$ExcelWB = new-object -comobject excel.application
Get-ChildItem -Path c:\folder -Filter "*.xls" | ForEach-Object{
$Workbook = $ExcelWB.Workbooks.Open($_.Fullname)
$newName = ($_.Fullname).Replace($_.Extension,".csv")
$Workbook.SaveAs($newName,6)
$Workbook.Close($false)
}
$ExcelWB.quit()
Take the lines in between the first and last and build a loop. Use Get-ChildItem to grab your xls files and then build a new name by replacing the extension of the FullName of the file.
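One caveat with building the new name via String.Replace is that it swaps every occurrence of the extension text in the full path, not just the trailing one. A possible alternative, sketched here with made-up paths, is the .NET method [System.IO.Path]::ChangeExtension, which only touches the final extension. No files need to exist for this string manipulation.

```powershell
# Hypothetical paths; the second one has ".xls" in a folder name as well
$names = 'C:\temp\report.xls', 'C:\data.xls\report.xls'

$csvNames = foreach ($name in $names) {
    # Only the final extension is swapped; the rest of the path is left alone
    [System.IO.Path]::ChangeExtension($name, '.csv')
}
$csvNames
```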
The conversion from xlsx files to csv can be done far quicker and without COM Objects - so without Excel installed - using the ImportExcel module developed by Doug Finke:
Install-Module -Name ImportExcel -RequiredVersion 5.4.2
gci *.xlsx | %{Import-Excel $_ | Export-Csv ($_.basename + ".csv")}
Or the other way around:
gci *.csv | %{Import-Csv $_ | Export-Excel ($_.basename + ".xlsx")}
Parameters available for the Import-Excel cmdlet:
WorksheetName
Specifies the name of the worksheet in the Excel workbook to import. By default, if no name is provided, the first worksheet will be imported.
DataOnly
Import only rows and columns that contain data, empty rows and empty columns are not imported.
HeaderName
Specifies custom property names to use, instead of the values defined in the column headers of the TopRow.
NoHeader
Automatically generate property names (P1, P2, P3, ..) instead of the ones defined in the column headers of the TopRow.
StartRow
The row from where we start to import data, all rows above the StartRow are disregarded. By default this is the first row.
EndRow
By default all rows up to the last cell in the sheet will be imported. If specified, import stops at this row.
StartColumn
The number of the first column to read data from (1 by default).
EndColumn
By default the import reads up to the last populated column, -EndColumn tells the import to stop at an earlier number.
Password
Accepts a string that will be used to open a password protected Excel file.
Expanding on the answer from @arco444, if you are doing this in bulk you should create the Excel object outside the loop for a much more performant conversion:
$ExcelWB = new-object -comobject excel.application
foreach($file in (Get-ChildItem "C:\temp")) {
$newname = $file.FullName -replace '\.xls$', '.csv'
$Workbook = $ExcelWB.Workbooks.Open($file.FullName)
$Workbook.SaveAs($newname,6)
$Workbook.Close($false)
}
$ExcelWB.quit()
Apologies I can't comment and edit queue has been full for some time, so posting as an answer instead.

How to export a CSV to Excel using Powershell

I'm trying to export a complete CSV to Excel by using PowerShell. I'm stuck at a point where static column names are used, which doesn't work if my CSV has generic, unknown header names.
Steps to reproduce
Open your PowerShell ISE and copy & paste the following standalone code. Run it with F5
"C:\Windows\system32\WindowsPowerShell\v1.0\powershell_ise.exe"
Get-Process | Export-Csv -Path $env:temp\process.csv -NoTypeInformation
$processes = Import-Csv -Path $env:temp\process.csv
$Excel = New-Object -ComObject excel.application
$workbook = $Excel.workbooks.add()
$i = 1
foreach($process in $processes)
{
$excel.cells.item($i,1) = $process.name
$excel.cells.item($i,2) = $process.vm
$i++
}
Remove-Item $env:temp\process.csv
$Excel.visible = $true
What it does
The script will export a list of all active processes as a CSV to your temp folder. This file is only for our example. It could be any CSV with any data
It reads in the newly created CSV and saves it under the $processes variable
It creates a new and empty Excel workbook where we can write data
It iterates through all rows (?) and writes all values from the name and vm column to Excel
My questions
What if I don't know the column headers? (In our example name and vm). How do I address values where I don't know their header names?
How do I count how many columns a CSV has? (after reading it with Import-Csv)
I just want to write an entire CSV to Excel with Powershell
Oops, I entirely forgot this question. In the meantime I got a solution.
This Powershell script converts a CSV to XLSX in the background
Gimmicks are
Preserves all CSV values as plain text like =B1+B2 or 0000001.
You don't see #Name or anything like that. No autoformating is done.
Automatically chooses the right delimiter (comma or semicolon) according to your regional setting
Autofit columns
PowerShell Code
### Set input and output path
$inputCSV = "C:\somefolder\input.csv"
$outputXLSX = "C:\somefolder\output.xlsx"
### Create a new Excel Workbook with one empty sheet
$excel = New-Object -ComObject excel.application
$workbook = $excel.Workbooks.Add(1)
$worksheet = $workbook.worksheets.Item(1)
### Build the QueryTables.Add command
### QueryTables does the same as when clicking "Data » From Text" in Excel
$TxtConnector = ("TEXT;" + $inputCSV)
$Connector = $worksheet.QueryTables.add($TxtConnector,$worksheet.Range("A1"))
$query = $worksheet.QueryTables.item($Connector.name)
### Set the delimiter (, or ;) according to your regional settings
$query.TextFileOtherDelimiter = $Excel.Application.International(5)
### Set the format to delimited and text for every column
### A trick to create an array of 2s is used with the preceding comma
$query.TextFileParseType = 1
$query.TextFileColumnDataTypes = ,2 * $worksheet.Cells.Columns.Count
$query.AdjustColumnWidth = 1
### Execute & delete the import query
$query.Refresh()
$query.Delete()
### Save & close the Workbook as XLSX. Change the output extension for Excel 2003
$Workbook.SaveAs($outputXLSX,51)
$excel.Quit()
I am using excelcnv.exe to convert csv into xlsx and that seemed to work properly.
You will have to change the directory to where your excelcnv is. If 32 bit, it goes to Program Files (x86)
Start-Process -FilePath 'C:\Program Files\Microsoft Office\root\Office16\excelcnv.exe' -ArgumentList "-nme -oice ""$xlsFilePath"" ""$xlsToxlsxPath"""
This topic really helped me, so I'd like to share my improvements.
All credit goes to nixda; this is based on his answer.
For those who need to convert multiple CSVs in a folder, just modify the directory. Output file names will be identical to the input names, just with another extension.
Take care with the cleanup at the end: if you would like to keep the original CSVs, you might not want to remove them.
It can easily be modified to save the xlsx in another directory.
$workingdir = "C:\data\*.csv"
$csv = dir -path $workingdir
foreach($inputCSV in $csv){
$outputXLSX = $inputCSV.DirectoryName + "\" + $inputCSV.Basename + ".xlsx"
### Create a new Excel Workbook with one empty sheet
$excel = New-Object -ComObject excel.application
$excel.DisplayAlerts = $False
$workbook = $excel.Workbooks.Add(1)
$worksheet = $workbook.worksheets.Item(1)
### Build the QueryTables.Add command
### QueryTables does the same as when clicking "Data » From Text" in Excel
$TxtConnector = ("TEXT;" + $inputCSV)
$Connector = $worksheet.QueryTables.add($TxtConnector,$worksheet.Range("A1"))
$query = $worksheet.QueryTables.item($Connector.name)
### Set the delimiter (, or ;) according to your regional settings
### $Excel.Application.International(3) = ,
### $Excel.Application.International(5) = ;
$query.TextFileOtherDelimiter = $Excel.Application.International(5)
### Set the format to delimited and text for every column
### A trick to create an array of 2s is used with the preceding comma
$query.TextFileParseType = 1
$query.TextFileColumnDataTypes = ,2 * $worksheet.Cells.Columns.Count
$query.AdjustColumnWidth = 1
### Execute & delete the import query
$query.Refresh()
$query.Delete()
### Save & close the Workbook as XLSX. Change the output extension for Excel 2003
$Workbook.SaveAs($outputXLSX,51)
$excel.Quit()
}
## To exclude an item, use the '-exclude' parameter (wildcards if needed)
remove-item -path $workingdir -exclude *Crab4dq.csv
Why would you bother? Load your CSV into Excel like this:
$csv = Join-Path $env:TEMP "process.csv"
$xls = Join-Path $env:TEMP "process.xlsx"
$xl = New-Object -COM "Excel.Application"
$xl.Visible = $true
$xl.Workbooks.OpenText($csv)
$wb = $xl.ActiveWorkbook
$wb.SaveAs($xls, 51)
You just need to make sure that the CSV export uses the delimiter defined in your regional settings. Override with -Delimiter if need be.
Edit: A more general solution that should preserve the values from the CSV as plain text. Code for iterating over the CSV columns taken from here.
$csv = Join-Path $env:TEMP "input.csv"
$xls = Join-Path $env:TEMP "output.xlsx"
$xl = New-Object -COM "Excel.Application"
$xl.Visible = $true
$wb = $xl.Workbooks.Add()
$ws = $wb.Sheets.Item(1)
$ws.Cells.NumberFormat = "@"
$i = 1
Import-Csv $csv | ForEach-Object {
$j = 1
foreach ($prop in $_.PSObject.Properties) {
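# Write the header row once, from the first object's property names
if ($i -eq 1) {
$ws.Cells.Item(1, $j).Value = $prop.Name
}
# Data rows start at row 2 so the first record is not lost
$ws.Cells.Item($i + 1, $j++).Value = $prop.Value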
}
$i++
}
$wb.SaveAs($xls, 51)
$wb.Close()
$xl.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)
Obviously this second approach won't perform too well, because it's processing each cell individually.
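One way to cut down on per-cell calls is to collect everything into a two-dimensional array first. The sketch below covers only that preparation step and runs without Excel; the sample data is made up, and the final range assignment is shown as a comment because it is an assumption about how the grid would be handed to Excel, not something tested here. It also shows how to read the header names and column count when they are not known in advance.

```powershell
# Hypothetical sample data standing in for Import-Csv $csv
$rows = @('Name,VM', 'pwsh,1024', 'code,2048' | ConvertFrom-Csv)

# Header names come from the first object, so they need not be known in advance
$headers  = @($rows[0].PSObject.Properties.Name)
$rowCount = $rows.Count + 1      # one extra row for the headers
$colCount = $headers.Count       # number of columns in the CSV

$grid = New-Object 'object[,]' $rowCount, $colCount
for ($c = 0; $c -lt $colCount; $c++) {
    $grid[0, $c] = $headers[$c]
    for ($r = 0; $r -lt $rows.Count; $r++) {
        $target = $r + 1
        $grid[$target, $c] = $rows[$r].($headers[$c])
    }
}

# With Excel available, the grid could then be written in a single assignment, e.g.:
# $ws.Range($ws.Cells.Item(1, 1), $ws.Cells.Item($rowCount, $colCount)).Value2 = $grid
```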
If you want to convert CSV to Excel without Excel being installed, you can use the great .NET library EPPlus (under LGPL license) to create and modify Excel Sheets and also convert CSV to Excel really fast!
Preparation
Download the latest stable EPPlus version
Extract EPPlus to your preferred location (e.g. to $HOME\Documents\WindowsPowerShell\Modules\EPPlus)
Right Click EPPlus.dll, select Properties and at the bottom of the General Tab click "Unblock" to allow loading of this dll. If you don't have the rights to do this, try [Reflection.Assembly]::UnsafeLoadFrom($DLLPath) | Out-Null
Detailed Powershell Commands to import CSV to Excel
# Create temporary CSV and Excel file names
$FileNameCSV = "$HOME\Downloads\test.csv"
$FileNameExcel = "$HOME\Downloads\test.xlsx"
# Create CSV File (with first line containing type information and empty last line)
Get-Process | Export-Csv -Delimiter ';' -Encoding UTF8 -Path $FileNameCSV
# Load EPPlus
$DLLPath = "$HOME\Documents\WindowsPowerShell\Modules\EPPlus\EPPlus.dll"
[Reflection.Assembly]::LoadFile($DLLPath) | Out-Null
# Set CSV Format
$Format = New-object -TypeName OfficeOpenXml.ExcelTextFormat
$Format.Delimiter = ";"
# use Text Qualifier if your CSV entries are quoted, e.g. "Cell1","Cell2"
$Format.TextQualifier = '"'
$Format.Encoding = [System.Text.Encoding]::UTF8
$Format.SkipLinesBeginning = '1'
$Format.SkipLinesEnd = '1'
# Set Preferred Table Style
$TableStyle = [OfficeOpenXml.Table.TableStyles]::Medium1
# Create Excel File
$ExcelPackage = New-Object OfficeOpenXml.ExcelPackage
$Worksheet = $ExcelPackage.Workbook.Worksheets.Add("FromCSV")
# Load CSV File with first row as heads using a table style
$null=$Worksheet.Cells.LoadFromText((Get-Item $FileNameCSV),$Format,$TableStyle,$true)
# Load CSV File without table style
#$null=$Worksheet.Cells.LoadFromText($file,$format)
# Fit Column Size to Size of Content
$Worksheet.Cells[$Worksheet.Dimension.Address].AutoFitColumns()
# Save Excel File
$ExcelPackage.SaveAs($FileNameExcel)
Write-Host "CSV File $FileNameCSV converted to Excel file $FileNameExcel"
This is a slight variation that worked better for me.
$csv = Join-Path $env:TEMP "input.csv"
$xls = Join-Path $env:TEMP "output.xlsx"
$xl = new-object -comobject excel.application
$xl.visible = $false
$Workbook = $xl.workbooks.open($CSV)
$Worksheets = $Workbook.worksheets
$Workbook.SaveAs($XLS,51)
$Workbook.Saved = $True
$xl.Quit()
I had some problems getting the other examples to work.
EPPlus and other libraries produce OpenDocument XML format, which is not the same as what you get when you save from Excel as xlsx.
mack's example of opening the CSV and just re-saving didn't work; I never managed to get the ',' delimiter to be used correctly.
Ansgar Wiechers' example has a slight error, which I found the answer for in the comments.
Anyway, this is a complete working example. Save this in a File CsvToExcel.ps1
param (
[Parameter(Mandatory=$true)][string]$inputfile,
[Parameter(Mandatory=$true)][string]$outputfile
)
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$wb = $excel.Workbooks.Add()
$ws = $wb.Sheets.Item(1)
$ws.Cells.NumberFormat = "@"
write-output "Opening $inputfile"
$i = 1
Import-Csv $inputfile | Foreach-Object {
$j = 1
foreach ($prop in $_.PSObject.Properties)
{
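# Write the header row once, from the first object's property names
if ($i -eq 1) {
$ws.Cells.Item(1, $j) = $prop.Name
}
# Data rows start at row 2 so the first record is not lost
$ws.Cells.Item($i + 1, $j) = $prop.Value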
$j++
}
$i++
}
$wb.SaveAs($outputfile,51)
$wb.Close()
$excel.Quit()
write-output "Success"
Execute with:
.\CsvToExcel.ps1 -inputfile "C:\Temp\X\data.csv" -outputfile "C:\Temp\X\data.xlsx"
I found this while passing and looking for answers on how to compile a set of csvs into a single excel doc with the worksheets (tabs) named after the csv files. It is a nice function. Sadly, I cannot run them on my network :( so i do not know how well it works.
Function Release-Ref ($ref)
{
([System.Runtime.InteropServices.Marshal]::ReleaseComObject(
[System.__ComObject]$ref) -gt 0)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
}
Function ConvertCSV-ToExcel
{
<#
.SYNOPSIS
Converts one or more CSV files into an excel file.
.DESCRIPTION
Converts one or more CSV files into an excel file. Each CSV file is imported into its own worksheet with the name of the
file being the name of the worksheet.
.PARAMETER inputfile
Name of the CSV file being converted
.PARAMETER output
Name of the converted excel file
.EXAMPLE
Get-ChildItem *.csv | ConvertCSV-ToExcel -output 'report.xlsx'
.EXAMPLE
ConvertCSV-ToExcel -inputfile 'file.csv' -output 'report.xlsx'
.EXAMPLE
ConvertCSV-ToExcel -inputfile @("test1.csv","test2.csv") -output 'report.xlsx'
.NOTES
Author: Boe Prox
Date Created: 01SEPT210
Last Modified:
#>
#Requires -version 2.0
[CmdletBinding(
SupportsShouldProcess = $True,
ConfirmImpact = 'low',
DefaultParameterSetName = 'file'
)]
Param (
[Parameter(
ValueFromPipeline=$True,
Position=0,
Mandatory=$True,
HelpMessage="Name of CSV/s to import")]
[ValidateNotNullOrEmpty()]
[array]$inputfile,
[Parameter(
ValueFromPipeline=$False,
Position=1,
Mandatory=$True,
HelpMessage="Name of excel file output")]
[ValidateNotNullOrEmpty()]
[string]$output
)
Begin {
#Configure regular expression to match full path of each file
[regex]$regex = "^\w\:\\"
#Find the number of CSVs being imported
$count = ($inputfile.count -1)
#Create Excel Com Object
$excel = new-object -com excel.application
#Disable alerts
$excel.DisplayAlerts = $False
#Keep the Excel application hidden
$excel.Visible = $False
#Add workbook
$workbook = $excel.workbooks.Add()
#Remove other worksheets
$workbook.worksheets.Item(2).delete()
#After the first worksheet is removed,the next one takes its place
$workbook.worksheets.Item(2).delete()
#Define initial worksheet number
$i = 1
}
Process {
ForEach ($input in $inputfile) {
#If more than one file, create another worksheet for each file
If ($i -gt 1) {
$workbook.worksheets.Add() | Out-Null
}
#Use the first worksheet in the workbook (also the newest created worksheet is always 1)
$worksheet = $workbook.worksheets.Item(1)
#Add name of CSV as worksheet name
$worksheet.name = "$((GCI $input).basename)"
#Open the CSV file in Excel, must be converted into complete path if not already done
If ($regex.ismatch($input)) {
$tempcsv = $excel.Workbooks.Open($input)
}
ElseIf ($regex.ismatch("$($input.fullname)")) {
$tempcsv = $excel.Workbooks.Open("$($input.fullname)")
}
Else {
$tempcsv = $excel.Workbooks.Open("$($pwd)\$input")
}
$tempsheet = $tempcsv.Worksheets.Item(1)
#Copy contents of the CSV file
$tempSheet.UsedRange.Copy() | Out-Null
#Paste contents of CSV into existing workbook
$worksheet.Paste()
#Close temp workbook
$tempcsv.close()
#Select all used cells
$range = $worksheet.UsedRange
#Autofit the columns
$range.EntireColumn.Autofit() | out-null
$i++
}
}
End {
#Save spreadsheet
$workbook.saveas("$pwd\$output")
Write-Host -Fore Green "File saved to $pwd\$output"
#Close Excel
$excel.quit()
#Release processes for Excel
$a = Release-Ref($range)
}
}
