Convert multiple xls to csv using powershell - excel

I'm trying to convert multiple Excel files (xls) located in a folder to csv using PowerShell.
I can convert a single file, but I need advice on how to convert multiple files in a folder. This is what I use for a single file:
$ExcelWB = new-object -comobject excel.application
$Workbook = $ExcelWB.Workbooks.Open("c:\temp\temp.xls")
$Workbook.SaveAs("c:\temp\temp.csv",6)
$Workbook.Close($false)
$ExcelWB.quit()

You can just wrap it in a loop that iterates over all the files and change the xls extension to csv:
foreach($file in (Get-ChildItem "C:\temp" -Filter "*.xls")) {
$newname = $file.FullName -replace '\.xls$', '.csv'
$ExcelWB = new-object -comobject excel.application
$Workbook = $ExcelWB.Workbooks.Open($file.FullName)
$Workbook.SaveAs($newname,6)
$Workbook.Close($false)
$ExcelWB.quit()
}

There are caveats with this untested code, but it should help you wrap your head around the issue:
$ExcelWB = new-object -comobject excel.application
Get-ChildItem -Path c:\folder -Filter "*.xls" | ForEach-Object{
$Workbook = $ExcelWB.Workbooks.Open($_.Fullname)
$newName = ($_.Fullname).Replace($_.Extension,".csv")
$Workbook.SaveAs($newName,6)
$Workbook.Close($false)
}
$ExcelWB.quit()
Take the lines between the first and last and build a loop around them. Use Get-ChildItem to grab your xls files, then build a new name by replacing the extension in the FullName of the file.
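As a hedged sketch of the renaming step only: `.Replace($_.Extension, ".csv")` replaces every occurrence of the extension text in the full path, so a folder whose name happens to contain ".xls" would be renamed as well. The `[System.IO.Path]::ChangeExtension` method swaps only the final extension. The helper name `Get-CsvPath` and the sample path are made up for illustration and are not part of the answers above.

```powershell
# Hypothetical helper (not from the answers above): derive the CSV path
# by swapping only the final extension of the file name.
function Get-CsvPath {
    param([Parameter(Mandatory)][string]$Path)
    [System.IO.Path]::ChangeExtension($Path, '.csv')
}

# Made-up path whose folder name also contains ".xls".
$source = 'C:\archive.xls\report.xls'

$viaChangeExtension = Get-CsvPath -Path $source
$viaReplace         = $source.Replace('.xls', '.csv')

$viaChangeExtension   # only the file extension changes
$viaReplace           # the folder name changes too
```

Inside the loop, the result of `Get-CsvPath $_.FullName` could be passed to `SaveAs` in place of `$newName`.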

The conversion from xlsx files to csv can be done far quicker and without COM objects (so without Excel installed) using the ImportExcel module developed by Doug Finke:
Install-Module -Name ImportExcel -RequiredVersion 5.4.2
gci *.xlsx | %{Import-Excel $_ | Export-Csv ($_.basename + ".csv")}
Or the other way around:
gci *.csv | %{Import-Csv $_ | Export-Excel ($_.basename + ".xlsx")}
Parameters available for the Import-Excel cmdlet:
WorksheetName
Specifies the name of the worksheet in the Excel workbook to import. By default, if no name is provided, the first worksheet will be imported.
DataOnly
Import only rows and columns that contain data, empty rows and empty columns are not imported.
HeaderName
Specifies custom property names to use, instead of the values defined in the column headers of the TopRow.
NoHeader
Automatically generate property names (P1, P2, P3, ..) instead of the ones defined in the column headers of the TopRow.
StartRow
The row from where we start to import data, all rows above the StartRow are disregarded. By default this is the first row.
EndRow
By default all rows up to the last cell in the sheet will be imported. If specified, import stops at this row.
StartColumn
The number of the first column to read data from (1 by default).
EndColumn
By default the import reads up to the last populated column, -EndColumn tells the import to stop at an earlier number.
Password
Accepts a string that will be used to open a password protected Excel file.

Expanding on the answer from @arco444: if you are doing this in bulk, you should create the Excel object outside the loop for a much more performant conversion:
$ExcelWB = new-object -comobject excel.application
foreach($file in (Get-ChildItem "C:\temp" -Filter "*.xls")) {
$newname = $file.FullName -replace '\.xls$', '.csv'
$Workbook = $ExcelWB.Workbooks.Open($file.FullName)
$Workbook.SaveAs($newname,6)
$Workbook.Close($false)
}
$ExcelWB.quit()
Apologies I can't comment and edit queue has been full for some time, so posting as an answer instead.

Related

Powershell append data into existing XLSX with multiple sheets

New to Powershell and looking to learn.
Goal:
Trying to take the Data out of a .csv file (14 cells of data per row) and import into an existing .xlsx file Starting on the second row columns (A2:N2).
The .xlsx file has 4 sheets with the one I am looking to edit being labeled "Data". Data sheet/tab has 18 columns, the first 14 are where I would like the imported data starting on row (A2:N2-> End will vary).
Looking for a way to automate the report by filling rows A-N with data from a file (.csv) which gets generated automatically.
Sample of "Data" tab with some values:
The current process is to open one xls file and copy/paste into cells starting at A2. I'm looking to automate this and have already automated the report: it emails an .xls file, which I convert to .csv, removing some titles and extra info that is not needed, using the following code:
Function ExcelCSV ($File)
{
$pwd = "C:\Users\..." #Removed local path
$excelFile = "$pwd\" + $File + ".xls"
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
$wb = $Excel.Workbooks.Open($excelFile)
foreach ($ws in $wb.Worksheets)
{
$ws.SaveAs("$pwd\" + $File + ".csv", 6)
}
$Excel.Quit()
}
$TestFile = (Get-Content .\FileName.xls) -replace 'null',''
$TestFile | Out-File Test.xls
$FileName = "Test"
ExcelCSV -File $FileName
Get-Content Test.csv | Select-Object -Skip 2 | Select-Object -SkipLast 3 | Set-Content Test2.csv
Please use the great ImportExcel PowerShell module.
Using it, you can achieve your goal simply by doing this:
$csv = Import-Csv <YourImportParameters>
$csv | Export-Excel -Path $pwd -Show -StartRow 2 -StartColumn 2 -WorksheetName $sheetname
The above takes the object and exports it to the Excel file $pwd, sheet $sheetname, starting from the second row of the second column.
If you want to send that via mail to someone afterwards, PowerShell can help you do that in one line too :)

Consolidate excel workbooks data to csv file from folder using power shell

In a folder I have around 20 Excel workbooks, each containing a sheet named "MIS for Upload". I want to consolidate the data from that sheet in every workbook into a new csv file using PowerShell.
Below is the code I have tried, but I want to use a "Browse for a Folder" method.
#Get a list of files to copy from
$Files = GCI 'C:\Users\r.shishodia\Desktop\May 2018' | ?{$_.Extension -Match "xlsx?"} | select -ExpandProperty FullName
#Launch Excel, and make it do as it's told (suppress confirmations)
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $True
$Excel.DisplayAlerts = $False
#Open up a new workbook
$Dest = $Excel.Workbooks.Add()
#Loop through files, opening each, selecting the Used range, and only grabbing the first 6 columns of it. Then find next available row on the destination worksheet and paste the data
ForEach($File in $Files[0..20]){
$Source = $Excel.Workbooks.Open($File,$true,$true)
If(($Dest.ActiveSheet.UsedRange.Count -eq 1) -and ([String]::IsNullOrEmpty($Dest.ActiveSheet.Range("A1").Value2))){ #If there is only 1 used cell and it is blank select A1
$Source.WorkSheets.item("MIS for Upload").Activate()
[void]$source.ActiveSheet.Range("A1","R$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A1").Select()
}Else{ #If there is data go to the next empty row and select Column A
$Source.WorkSheets.item("MIS for Upload").Activate()
[void]$source.ActiveSheet.Range("A2","R$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A$(($Dest.ActiveSheet.UsedRange.Rows|Select -last 1).row+1)").Select()
}
[void]$Dest.ActiveSheet.Paste()
$Source.Close()
}
$Dest.SaveAs("C:\Users\r.shishodia\Desktop\Book2.xlsx",51)
$Dest.close()
$Excel.Quit()
For this purpose you could use ImportExcel module - installation guide included in repo README.
Once you install this module you can easily use Import-Excel cmdlet like this:
$Files = GCI 'C:\Users\r.shishodia\Desktop\May 2018' | ?{$_.Extension -Match "xlsx?"} | select -ExpandProperty FullName
$Temp = @()
ForEach ($File in $Files[0..20]) { # or 19 if you want to have exactly 20 files imported
$Temp += Import-Excel -Path $File -WorksheetName 'MIS for Upload' `
| Select Property0, Property1, Property2, Property3, Property4, Property5
}
To export (you wrote CSV but your destination file format says xlsx):
$Temp | Export-Excel 'C:\Users\r.shishodia\Desktop\Book2.xlsx'
or
$Temp | Export-Csv 'C:\Users\r.shishodia\Desktop\Book2.csv'
That ImportExcel module is really handy ;-)

import xlsx converted to csv to powershell

I need to import csv file to my PS script. In the same script I have a function to convert an xlsx file to csv, which I need to import. But I can't find a way to do that. I tried three approaches:
$CSVLicence = Import-Csv (ConvertXLSX -File $Licence) -Delimiter "," -Encoding UTF8
Where ConvertXLSX is the function to convert xlsx to csv, $Licence is a variable defined at the beginning of the script. The second approach is this:
$CSVfile = ConvertXLSX -File $Licence
$CSVLicence = Import-Csv $CSVfile -Delimiter "," -Encoding UTF8
In both cases I get an error message on the column right after "Import-Csv", that argument Path is null or empty.
The third approach was defining the path literally, which is not really a good solution, but should work, if executor of the file runs it in correct folder and uses correct file name. In this case I get an error "The member "Proemail Exchg Business" is already present" in front of the "Import-Csv" cmdlet
The csv file gets created and it looks precisely as it should, so there's obviously not an error in the converting function.
The ConvertXLSX function is defined like this:
$Licence = "licence"
Function ConvertXLSX ($File)
{
$PWD = "r:\Licence\"
$ExcelFile = $PWD + $File + ".xlsx"
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
$wb = $Excel.Workbooks.Open($ExcelFile)
foreach ($ws in $wb.Worksheets | where {$_.name -eq "Pridani"})
{
$ws.SaveAs($PWD + $File + ".csv",6)
}
$Excel.Quit()
}
ConvertXLSX -File $Licence
The function creates the csv file successfully, without any problems
You're currently using the wrong cmdlet. Import-Csv is designed to be used with a file.csv, not a string object or otherwise. There is a cmdlet that exists for that purpose, ConvertFrom-Csv, that takes an -InputObject parameter (indicating it takes pipeline input).
PS C:\> Get-Help -Name 'ConvertFrom-Csv'
SYNTAX
ConvertFrom-Csv [-InputObject] <PSObject[]> [[-Delimiter] <Char>] [-Header <String[]>]
Used in your example:
## Declaration of the delimiter is unnecessary when it's a comma
$CSVLicense = ConvertXLSX -File $Licence | ConvertFrom-Csv
TheIncorrigible1 explained the issue quite well. Now that you've edited the question to add the ConvertXLSX function, we can build on that to get a solution that fits.
$ws.SaveAs($PWD + $File + ".csv",6)
This part is the save to the filename $PWD + $File + ".csv". The number 6 represents the CSV XlFileFormat Enumeration.
The function does not return anything, just performs an action on external file. That's why you are getting the null or empty error. We can get it to return the file path, or to return the data in an object and use ConvertFrom-Csv.
Given your code, the object would require more work as you already have the file path there.
To return file path, change:
}
$Excel.Quit()
}
To:
}
$Excel.Quit()
return $($PWD + $File + ".csv")
}
Your code should then work.
You could improve its readability and conciseness further:
Assign the CSV filename to a variable.
Avoid assigning to $PWD, as it's an automatic variable that holds the current location.
Get rid of the foreach. You are only getting one sheet and saving that sheet at the moment. If you want to handle multiple sheets, you will have to think about how to pass the multiple paths/objects to Import-Csv/ConvertFrom-Csv.
Putting it all together:
$Licence = "licence"
Function ConvertXLSX ($File)
{
$rootDir = "r:\Licence\"
$ExcelFile = $rootDir + $File + ".xlsx"
$CSVFile = $ExcelFile.Replace(".xlsx",".csv")
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
$wb = $Excel.Workbooks.Open($ExcelFile)
$ws = $wb.Worksheets | where {$_.name -eq "Pridani"}
$ws.SaveAs($CSVFile,6)
$Excel.Quit()
return $CSVFile
}
ConvertXLSX -File $Licence
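To illustrate why the original call produced the "Path is null or empty" error, here is a minimal sketch that needs neither Excel nor the real workbook. The two functions are hypothetical stand-ins for ConvertXLSX: both write a CSV file, but only the second one outputs the path. The file name and CSV contents are made up.

```powershell
# Hypothetical stand-ins for ConvertXLSX: both write a CSV file, but only
# the second one emits the path to the caller.
function Convert-WithoutReturn ([string]$Path) {
    Set-Content -Path $Path -Value 'Name,Licence', 'Alice,E3'
}

function Convert-WithReturn ([string]$Path) {
    Set-Content -Path $Path -Value 'Name,Licence', 'Alice,E3'
    return $Path
}

# Made-up file in the temp folder, removed again at the end.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'licence-demo.csv'

$nothing  = Convert-WithoutReturn -Path $csvPath   # $null: the function output nothing
$returned = Convert-WithReturn -Path $csvPath      # the CSV path

# Import-Csv needs a path, so only the second result can be passed to it.
$rows = @(Import-Csv -Path $returned)
Remove-Item -Path $csvPath
```

Passing `$nothing` to Import-Csv would reproduce the null-path error from the question, while `$returned` behaves like the fixed function above.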
I get an error "The member "Proemail Exchg Business" is already present" in front of the "Import-Csv" cmdlet
This one is handled in this StackOverflow post. Short version is you have two columns with "Proemail Exchg Business" in the first row.
PowerShell creates an object from the CSV: the column headers must be unique. Either fix in the excel sheet, or manually specify unique headers using -Header as in linked post.
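A small sketch of the -Header approach, assuming made-up data: the CSV text and the replacement column names below are illustrative only, not taken from the real licence file. The original header row is skipped, otherwise it would be read as a data row.

```powershell
# Made-up CSV text with a duplicated column name in the header row.
$lines = @(
    'User,Proemail Exchg Business,Proemail Exchg Business'
    'alice,1,0'
    'bob,0,1'
)

# Hypothetical replacement names; any set of unique names works.
$header = 'User', 'ProemailBusiness1', 'ProemailBusiness2'

# Skip the original header row, then supply the unique names instead.
$rows = @($lines | Select-Object -Skip 1 | ConvertFrom-Csv -Header $header)

$rows | Format-Table
```

For a file on disk, the same pattern should work by reading the lines with Get-Content first and piping them through the same two steps.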

Finding content of Excel file in Powershell

I am currently working on a fairly large powershell script. However, I got stuck at one part. The issue is the following.
I have various reports with the same file name, they just have a different time stamp at the end. Within the report, I have a field displaying the date from when to when the report is from.
---> 2/1/2015 5:00:00AM to 3/1/2015 5:00:00AM <--- This is what it looks like.
This field is randomly placed on the Excel Sheet. Pretty much in the range of A5 to Z16. What I would like the script to do is:
Read the file / Check the range of cells for the dates, if the date is found and it matches my search criteria, close the sheet and move it to a different folder / If date does not match, close and check next XLS file
This is what I got so far:
$File = "C:\test.XLS"
$SheetName = "Sheet1"
# Set up Excel, open $File and get its worksheets
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $true
$Workbook = $Excel.workbooks.open($file)
$Worksheets = $Workbook.worksheets
$WorkSheet = $WorkBook.sheets.item($SheetName)
$SearchString = "AM" #just for test purposes since it is in every report
$Range = $Worksheet.Range("A1:Z1").EntireColumn
$Search = $Range.find($SearchString)
If you want it to search the entire column for A to Z you would specify the range:
$Range = $Worksheet.Range("A:Z")
Then you should be able to execute a $Range.Find($SearchText) and if the text is found it will spit back the first cell it finds it in, otherwise it returns nothing. So start Excel like you did, then do a ForEach loop, and inside that open a workbook, search for your text, if it is found close it, move it, stop the loop. If it is not found close the workbook, and move to the next file. The following worked just fine for me:
$Destination = 'C:\Temp\Backup'
$SearchText = '3/23/2015 10:12:19 AM'
$Excel = New-Object -ComObject Excel.Application
$Files = Get-ChildItem "$env:USERPROFILE\Documents\*.xlsx" | Select -Expand FullName
$counter = 1
ForEach($File in $Files){
Write-Progress -Activity "Checking: $file" -Status "File $counter of $($files.count)" -PercentComplete ($counter*100/$files.count)
$Workbook = $Excel.Workbooks.Open($File)
If($Workbook.Sheets.Item(1).Range("A:Z").Find($SearchText)){
$Workbook.Close($false)
Move-Item -Path $File -Destination $Destination
"Moved $file to $destination"
break
}
$workbook.close($false)
$counter++
}
I even got ambitious enough to add a progress bar in there so you can see how many files it has to potentially look at, how many it's done, and what file it's looking at right then.
Now this does all assume that you know exactly what the string is going to be (at least a partial) in that cell. If you're wrong, then it doesn't work. Checking for ambiguous things takes much longer, since you can't use Excel's matching function and have to have PowerShell check each cell in the range one at a time.
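If the exact string is not known and each cell's text has to be checked in PowerShell, the comparison itself could look roughly like the sketch below. It assumes the cell text has the "start to end" shape shown in the question; the function name `Test-ReportRange` and the date format string are assumptions, not part of the answer above.

```powershell
# Hypothetical helper: returns $true when $Date falls inside a range written
# as '2/1/2015 5:00:00AM to 3/1/2015 5:00:00AM' (format assumed from the question).
function Test-ReportRange {
    param(
        [Parameter(Mandatory)][string]$CellText,
        [Parameter(Mandatory)][datetime]$Date
    )
    $format  = 'M/d/yyyy h:mm:sstt'
    $culture = [System.Globalization.CultureInfo]::InvariantCulture

    if ($CellText -match '^\s*(.+?)\s+to\s+(.+?)\s*$') {
        try {
            $from = [datetime]::ParseExact($Matches[1], $format, $culture)
            $to   = [datetime]::ParseExact($Matches[2], $format, $culture)
            return ($Date -ge $from -and $Date -le $to)
        } catch {
            # Text matched the 'x to y' shape but did not contain two dates.
            return $false
        }
    }
    return $false
}

$sample = '2/1/2015 5:00:00AM to 3/1/2015 5:00:00AM'
Test-ReportRange -CellText $sample -Date ([datetime]'2015-02-15')
```

Each cell value read from the A5 to Z16 range would be passed as `-CellText`; cells that hold anything else simply return `$false`.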

Powershell 2.0 write back to xlsx

Cross post from powershell.org..
I am trying to have Powershell read an xlsx for username info, convert to a csv (to be imported) and then write back something to the xlsx so next time it won't reimport the same users.
I don't want to delete the users in the xlsx but am thinking to add a date column or some other data maybe the word "created" and have powershell write this data in an available column. But then I would have to have my script ignore this new column if contains a old date or the word created?
Current xlsx column headers:
     A            B                  C
1    Full Name    Personal Email     "write back data"
2    John Doe     Jdoe@gmail.com     Created (sample write back data)
3    Don Juan     Djuan@gmail.com    Date Imported (sample write back data)
Convert to csv code (This part is working fine.)
$File = "C:\Scripts\Excel\Accounts.xlsx"
$Savepath1 = "C:\Scripts\Csv\Employee Accounts.csv"
$SheetName1 = "Employee Accounts"
$ObjExcel = New-Object -ComObject Excel.Application
$Objexcel.Visible = $false
$Objworkbook=$ObjExcel.Workbooks.Open($File)
$Objworksheet=$Objworkbook.worksheets.item($Sheetname1)
$Objworksheet.Activate()
$Objexcel.application.DisplayAlerts= $False
$Objworkbook.SaveAs($SavePath1,6)
$Objworkbook.Close($True)
$ObjExcel.Quit()
Here is my current import-csv code
$EmployeeAccounts = Import-Csv $savepath1 | Where-Object { $_.Fullname -and $_.PersonalEmail}
Things to consider:
There might be additional concatenated info in additional fields added to the xlsx. Therefore excel might count these as used rows if the fields have formulas in them. So I only want to write the data to the new column if there is a username and email address in columns A & B.
Thanks!
To be honest it's going to be simpler to import the whole thing, perform your process filtering for entries that don't have anything in the Updated field, update the "Updated" field for each entry that you processed, and then just write the entire thing back to the file. So, something like:
$Users = Import-CSV $FilePath
$Users | ?{[String]::IsNullOrEmpty($_.Updated)} | %{
# Do stuff here
$_.Updated = Get-Date
}
$Users | Export-CSV $FilePath -Force -NoTypeInfo
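A self-contained sketch of that round trip, under stated assumptions: the file is a throwaway CSV in the temp folder, and the column names FullName, PersonalEmail and Updated as well as the sample rows are made up for the demonstration.

```powershell
# Build a small demo CSV; the last row has an empty Updated field.
$filePath = Join-Path ([System.IO.Path]::GetTempPath()) 'accounts-demo.csv'
@(
    'FullName,PersonalEmail,Updated'
    'John Doe,jdoe@example.com,2015-03-01'
    'Don Juan,djuan@example.com,'
) | Set-Content -Path $filePath

$users   = @(Import-Csv -Path $filePath)
$pending = @($users | Where-Object { [string]::IsNullOrEmpty($_.Updated) })

foreach ($user in $pending) {
    # Account creation would go here; afterwards the row is stamped.
    $user.Updated = (Get-Date).ToString('yyyy-MM-dd')
}

# Write every row back, including the ones that were already stamped.
$users | Export-Csv -Path $filePath -NoTypeInformation

$reloaded = @(Import-Csv -Path $filePath)
Remove-Item -Path $filePath
```

On the next run only rows with an empty Updated field are picked up, so users stamped earlier are not imported again.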
Edit: Well then, that does complicate things a little bit. This one does take a plugin, but it's one that I wholeheartedly feel should be included in almost any installation that's used regularly. The PowerShell Community Extensions (PSCX) can be downloaded from pscx.codeplex.com and give you access to the command Out-Clipboard, which is awesome for what you want to do.
$Users = Import-CSV $FilePath
$Users | ?{[String]::IsNullOrEmpty($_.Updated)} | %{
# Do stuff here
$_.Updated = Get-Date
}
$Users | Select Updated | ConvertTo-CSV -Delimiter "`t" -NoTypeInformation | Out-Clipboard
$SheetName1 = "Employee Accounts"
$ObjExcel = New-Object -ComObject Excel.Application
$Objexcel.Visible = $false
$Objworkbook=$ObjExcel.Workbooks.Open($File)
$Objworksheet=$Objworkbook.worksheets.item($Sheetname1)
$Objworksheet.Activate()
$Range = $Objworksheet.Range("C1","C1")
$Objworksheet.Paste($Range, $false)
$Objexcel.DisplayAlerts = $false
$Objworkbook.Save()
$Objexcel.DisplayAlerts = $true
$Objworkbook.Close()
$Objexcel.Quit()
[void][System.Runtime.Interopservices.Marshal]::FinalReleaseComObject($Objexcel)
That will paste your data, header included, into column C.
