I receive an Excel file and want to convert it to CSV using PowerShell.
I have to alter it quite specifically so that it is a reliable input for a program that will process the CSV data.
I don't know the exact headers, but I know there can be duplicates.
What I do is open the xlsx file with Excel and save it as CSV:
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $True
$objExcel.DisplayAlerts = $True
$Workbook = $objExcel.Workbooks.open($xlsx1)
$WorkSheet = $WorkBook.sheets.item($sheet)
$xlCSV = 6
$Workbook = $objExcel.Workbooks.open($xlsx2)
$WorkSheet = $WorkBook.sheets.item($sheet)
$WorkBook.SaveAs($csv2,$xlCSV)
Now, the XLSX file will contain commas, so first I want to change them to dots.
I tried this, but it's not working:
$objRange = $worksheet.UsedRange
$objRange.Replace ",", "."
It errors out saying: Unexpected token '", "'.
Then, when saving, I want to set the delimiter to a comma, as it uses ";" by default.
With something like:
$WorkBook.SaveAs($csv2,$xlCSV) -delimiter ","
The last problem is the duplicate headers; they prevent PS from using Import-CSV. Here is what I tried; when the file is separated with commas, it works:
Get-Content $downloads\BBKS_DIR_AUTO_COMMA.csv -totalcount 1 >$downloads\Headers.txt
But then I need to rename the duplicate names; for example, I can have Regio, Regio, Regio.
I want to change this to Regio, Regio2, Regio3.
My plan was to look up the data in the txt file, search for duplicates, and then add an incremental number.
In the end I need to add a column with incremental numbers, but always with four digits, like 0001, 0002, 0010, 0020, 0200, 1500. I won't exceed 9999. How can this be done?
If you can help me, even if only partially, I'll be very happy.
Further, I'm running Windows 7 x64, PowerShell 3.0, Excel 2016 (if relevant).
If it is easier, it's fine to go back to the Command Prompt for some tasks.
Personally, I wouldn't try and work with Excel sheets via Excel itself and COM - I'd use the excellent module https://github.com/dfinke/ImportExcel
Then you can import from the sheet straight to a native Powershell object array, and re-export with Export-Csv -Delimiter.
Edit: To answer follow ups :
Once you've loaded the module you can do "Get-Module ImportExcel | Select-Object -ExpandProperty ExportedCommands" to see what it makes available.
To import your Excel in the first place, do something like :
$WorkBook = Import-Excel
And if you need to take care of duplicate column names, you can do :
$WorkBook = Import-Excel -Header @("Regio1", "Regio2", "Regio")
Where the array you pass to -Header needs to include every column you want from the workbook.
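For the duplicate-header and four-digit-counter parts of the question, a minimal sketch in plain PowerShell could look like the following. The function name Get-UniqueHeader and the sample header list are hypothetical, chosen only for illustration. It assumes a generated name such as Regio2 does not already exist as a real column.

```powershell
# Hypothetical helper: append a counter to repeated header names
# (Regio, Regio, Regio -> Regio, Regio2, Regio3).
function Get-UniqueHeader {
    param([string[]]$Header)
    $seen = @{}   # name -> how many times it has been seen so far
    foreach ($name in $Header) {
        if ($seen.ContainsKey($name)) {
            $seen[$name]++
            "$name$($seen[$name])"
        } else {
            $seen[$name] = 1
            $name
        }
    }
}

# Sample header names (made up for this example)
$headers = Get-UniqueHeader -Header 'Regio', 'Naam', 'Regio', 'Regio'

# Zero-padded, four-digit counter: the D4 format pads integers to 4 digits
$ids = 1..3 | ForEach-Object { '{0:D4}' -f $_ }
```

The resulting $headers array can be passed to -Header when importing, and the '{0:D4}' format string can be used inside a calculated property or loop to fill the extra column.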
Related
I want to open the CSV file using powershell Excel.Application.
my code is like this:
$csv = "csv name"
$xlsx = "output excel name"
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$wb = $excel.Workbooks.Open($csv)
$wb.SaveAs($xlsx,51)
$excel.Quit()
But it turns out that the value "004" in the CSV gets loaded as 4.
Can anyone think of a way to fix this?
Note that there are several special cases in my CSV:
there are values like "004" and "01234678" in the CSV, and I would like to import all of them as text.
there are commas within the data, like "FlatA, 7/F"
there are newline characters within the data, like
"abcdef
def
ghi"
You can also give your own solution that loads the CSV into Excel using PowerShell, as long as it handles all the above cases.
Thanks a lot. You will save my life if you are able to do this.
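The three special cases listed above can be checked against PowerShell's own CSV parser before Excel gets involved. The sketch below uses made-up sample data and a temporary file. It only covers the parsing side, where Import-Csv returns every field as a string. Writing those values into a workbook as text is a separate step that is not shown here.

```powershell
# Sample data (made up) covering leading zeros, an embedded comma,
# and a quoted field that spans several lines.
$csvText = @'
Code,Address,Notes
"004","FlatA, 7/F","abcdef
def
ghi"
"01234678","FlatB","single line"
'@

# Write the sample to a temporary file and read it back with Import-Csv
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value $csvText
$rows = @(Import-Csv -Path $path)
Remove-Item -Path $path

# Every property is a [string], so "004" keeps its leading zeros
$rows | ForEach-Object { '{0} | {1}' -f $_.Code, $_.Address }
```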
I created a PowerShell script that allows me to merge multiple .CSV into one .XLSX file.
It works well on my computer:
$path = "C:\Users\Francesco\Desktop\CSV\Results\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
Write-Host " "$csvs.Name"`n"
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
for ($i=1;$i -le $y;$i++) {
$worksheet = $xlsx.Worksheets.Item($i)
$worksheet.Name = $csvs[$i-1].Name
$file = (Import-Csv $csvs[$i-1].FullName)
$file | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | clip
$worksheet.Cells.Item(1).PasteSpecial() | Out-Null
}
$output = "C:\Users\Francesco\Desktop\CSV\Results\Results.xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
The problem is that I need to run this on several servers and servers are well known for not having Office installed so I cannot use Excel.Application.
Is there a way to merge multiple CSV into one CSV or XLSX without using Excel.Application and saving each CSV into a different sheet?
@AnsgarWiechers is right, ImportExcel is powerful and not difficult to use. However, for your specific case you can use a more limited approach, using OleDb (or ODBC or ADO) to write to an Excel file like a database. Here is some sample code showing how to write to an Excel file using OleDb.
$provider = 'Microsoft.ACE.OLEDB.12.0'
$dataSource = 'C:\users\user\OleDb.xlsb'
$connStr = "Provider=$provider;Data Source=$dataSource;Extended Properties='Excel 12.0;HDR=YES'"
$objConn = [Data.OleDb.OleDbConnection]::new($connStr)
$objConn.Open()
$cmd = $objConn.CreateCommand()
$sheetName = 'Demo'
# Double quotes are needed so that $sheetName is expanded
$cmd.CommandText = "CREATE TABLE $sheetName (Name TEXT,Age NUMBER)"
$cmd.ExecuteNonQuery()
$cmd.CommandText = "INSERT INTO demo (Name,Age) VALUES ('Adam', 20)"
$cmd.ExecuteNonQuery()
$cmd.CommandText = "INSERT INTO demo (Name,Age) VALUES ('Bob',30)"
$cmd.ExecuteNonQuery()
$cmd.Dispose()
$objConn.Close()
$objConn.Dispose()
You didn't say much about the CSV files you'll be processing. If column data varies, to create the table you'll have to get the attribute (column) names from the CSV header (either by reading the first line of the CSV file, or by enumerating the properties of the first item returned by Import-CSV).
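As a rough sketch of the second option (enumerating the properties of the first imported item), the statements could be built as plain strings like this. The sample rows, the choice of TEXT for every column, and the bracket quoting of names are assumptions made for the example. Nothing here touches OleDb; it only produces the SQL text that would be assigned to $cmd.CommandText.

```powershell
# Sample rows (made up) standing in for the output of Import-Csv
$rows = 'Name,Age', 'Adam,20', "O'Brien,30" | ConvertFrom-Csv
$sheetName = 'Demo'

# Column names come from the properties of the first imported object
$columns = $rows[0].PSObject.Properties.Name

# Assumption: every column is created as TEXT
$columnDefs = ($columns | ForEach-Object { "[$_] TEXT" }) -join ', '
$createSql  = "CREATE TABLE [$sheetName] ($columnDefs)"

# One INSERT per row; single quotes inside values are doubled
$columnList = ($columns | ForEach-Object { "[$_]" }) -join ', '
$insertSql = foreach ($row in $rows) {
    $values = ($columns | ForEach-Object {
        "'" + ($row.$_ -replace "'", "''") + "'"
    }) -join ', '
    "INSERT INTO [$sheetName] ($columnList) VALUES ($values)"
}
```

For anything beyond a quick script, parameterized commands would be a safer choice than concatenated values.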
If your CSV files have a large number of lines, writing one line at a time may be slow. In that case using a DataSet and OleDbDataAdapter might improve performance (but I haven't tested). But at that point you might as well use OleDb to read the .csv directly into a DataSet, create an OleDbDataAdapter, set the adapter's InsertCommand property, and finally call the adapter's Update method. I don't have time to write and test all that.
This is not intended as a full solution, just a demo of how to use OleDb to write to an Excel file.
Note: I tested this on a server that didn't have Office or Excel installed. The Office data providers pre-installed on that machine were 32-bit, but I was using 64-bit PowerShell. To get 64-bit drivers I installed the Microsoft Access Database Engine 2016 Redistributable and that's what I used for testing.
Time has passed and I have found a new solution: Install-Module -Name ImportExcel
This way the module takes care of the job like in this script.
I am writing data to a CSV file via PowerShell. Generally things go well.
I have exported the data to a CSV file. It contains about 10 columns. When I open it with MS Excel, everything ends up in the first column. I want to split it into several columns programmatically via PowerShell (the same thing the GUI offers). I could loop over every row, split it, and put the values into the appropriate cells, but that would take far too much time.
I believe there should be an elegant solution to split one column into multiple columns. Is there a way to do it in one simple step without looping?
This is what I came up with so far:
PS, The CSV file is 100% FINE. The delimiter is ','
Get-Service | Export-Csv -NoTypeInformation c:\1.csv -Encoding UTF8
$xl = New-Object -comobject Excel.Application
$xl.Visible = $true
$xl.DisplayAlerts = $False
$wb = $xl.Workbooks.Open('c:\1.csv')
$ws = $wb.Sheets|?{$_.name -eq '1'}
$ws.Activate()
$col = $ws.Cells.Item(1,1).EntireColumn
This will get you the desired functionality; add to your code. Check out the MSDN page for more information on TextToColumns.
# Select column
$columnA = $ws.Range("A1").EntireColumn
# Enumerations
$xlDelimited = 1
$xlTextQualifier = 1
# Convert Text To Columns
$columnA.texttocolumns($ws.Range("A1"),$xlDelimited,$xlTextQualifier,$true,$false,$false,$true,$false)
$ws.columns.autofit()
I had to create a CSV which had "","" as delimiter to test this out. The file with "," was fine in excel.
# Opens with all fields in column A, used to test TextToColumns works
"Name,""field1"",""field2"",""field3"""
"Test,""field1"",""field.2[]"",""field3"""
# Opens fine in Excel
Name,"field1","field2","field3"
Test,"field1","field.2[]","field3"
Disclaimer: Tested with $ws = $wb.Worksheets.item(1)
I have two CSV files: File1.csv has one column with 4000+ rows. File2.csv has 200 columns with 10000+ rows of content. I want to add the one column in File1.csv as an additional column to File2.csv. I am OK with adding it to the end (rightmost) of the existing file. I have found several options online, but none has worked as desired. I can get it done with the Import-Csv cmdlet and adding a property, but that takes more than an hour to execute. Is there any way to do this without having to convert the CSV content into PSObjects? I have used Get-Content and Set-Content in the past, but that will append one file to the bottom of the other one. Is there any way I could do something similar but appending to the right of the existing file?
Here is the piece of code that has gotten me closer to what I need. The problem with this one is Excel is not saving or closing. Any ideas on how this problem can be solved either by fixing the code below or an easier/more efficient way to do it?
$source = "C:\Users\Desktop\Script_Development\04-16-2015\Bit.csv"
$dest = "C:\Users\Desktop\Script_Development\04-16-2015\MergedwithHeader_04-16-2015.csv"
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $false
$Workbooksource = $excel.Workbooks.open($source)
$Worksheetsource = $Workbooksource.WorkSheets.item("Bit")
$Worksheetsource.activate()
$range = $Worksheetsource.Range("A1").EntireColumn
$range.Copy() | out-null
$Workbookdest = $excel.Workbooks.open($dest)
$Worksheetdest = $Workbookdest.Worksheets.item("MergedwithHeader_04-16-2015")
$Range = $Worksheetdest.Range("FT1")
$Worksheetdest.Paste($range)
$Workbookdest.SaveAs("C:\Users\Desktop\Script_Development\04-16-2015\MergedwithHeader_04-16-2015.xls")
$Excel.quit()
The following code will loop through the lines of a file. You could use this to read each line into an ArrayList.
$FileData = Get-Content $Filename
foreach ($line in $FileData)
{
    # DoSomethingWithLine is a placeholder for your own per-line processing
    DoSomethingWithLine $line
}
Then you loop through the other file, and combine each line with a line that is stored in the ArrayList, concatenating it with the necessary commas and quotes, and append each line to a new file using Add-Content.
There would be numerous other and more sophisticated ways to do this.
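A minimal sketch of the line-by-line approach described above follows. The two arrays are made-up stand-ins for the output of Get-Content on File2.csv and File1.csv. It assumes no field contains an embedded line break, since that would split one record across several lines.

```powershell
# Sample lines (made up); in practice these would come from
# Get-Content on File2.csv (wide) and File1.csv (one column).
$wide   = 'A,B', '1,2', '3,4', '5,6'
$narrow = 'C', 'x', 'y'

$merged = for ($i = 0; $i -lt $wide.Count; $i++) {
    # Use an empty field once the one-column file has run out of rows
    $extra = if ($i -lt $narrow.Count) { $narrow[$i] } else { '' }
    $wide[$i] + ',' + $extra
}

# $merged | Set-Content 'Merged.csv'   # hypothetical output file name
```

Collecting the lines first and writing them with a single Set-Content call avoids reopening the output file once per line, which is what repeated Add-Content calls would do.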
I have imported a comma-separated CSV file using PowerShell.
It gets imported and looks as it should. The problem is that the cells contain formulas,
like =20+50+70. They don't get calculated unless I press Enter in the top field.
Another problem is that some of the cells contain numbers like =50,2+70,5. Excel doesn't understand these cells at all. It can't calculate them unless I remove the , or replace it with a dot (.). But this is not a possibility.
How do I fix this?
The csv file is imported with powershell using this:
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
$wbpath=Join-Path "$psscriptroot" 'file.xlsx'
$importcsv=Join-Path "$psscriptroot" 'file.csv'
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.Workbooks.OpenText($importcsv)
$xl.DisplayAlerts = $false
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
$xl.ActiveWorkbook.SaveAs($wbpath,51)
$xl.Quit()
while([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)){'released'}
The
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
is necessary, or I will get errors because my system locale is not US.
Thank you.
CSV Sample:
name1.name1.name1,"=20","=7,65","=20,01"
name2.name2.name2,"=20+10","=4,96+0,65","=20,01+10"
name3.name3.name3,"=20","=4,96+0,88","=21,01+11"
Sounds like you need to
a) Force the worksheet to calculate
b) If you're going to stick with en-US locale then you need to replace those commas with decimal points. That's the GB/US standard and how Excel will interpret decimals. I'd strongly advise however that you stick to the locale that your data is set up in.
(untested as I'm currently on a Mac)
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
$wbpath=Join-Path "$psscriptroot" 'file.xlsx'
$importcsv=Join-Path "$psscriptroot" 'file.csv'
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
# OpenText returns nothing, so pick up the opened workbook afterwards
$xl.Workbooks.OpenText($importcsv)
$wb = $xl.ActiveWorkbook
$xl.DisplayAlerts = $false
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
$sh = $wb.Sheets.Item(1)
# loop through the used range and replace any commas with decimals
foreach ($cell in $sh.usedRange)
{
[string]$formula = $cell.formula
# -replace returns a new string, so assign the result back
$formula = $formula -replace ',','.'
$cell.formula = $formula
}
# force the sheet to calculate
$sh.Calculate()
$xl.ActiveWorkbook.SaveAs($wbpath,51)
$xl.Quit()
while([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)){'released'}
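The replace step can be tried on plain strings before touching Excel. The sketch below is a slightly narrower variant than the loop above, not the answer's exact method: it only changes commas that sit between two digits, and only in values that start with =, so ordinary text is left alone. The sample values are taken from the CSV sample in the question.

```powershell
# Values as they appear in the question's CSV sample
$cells = '=20', '=7,65', '=4,96+0,65', 'name1.name1.name1'

$fixed = $cells | ForEach-Object {
    if ($_ -like '=*') {
        # Lookbehind/lookahead: replace a comma only when it is between digits
        $_ -replace '(?<=\d),(?=\d)', '.'
    } else {
        $_
    }
}
```

A formula with function arguments such as =SUM(1,2) would still be altered by this pattern, so it only suits data like the sample, where every comma is a decimal separator.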
As with the previous answer, you have to account for locale; not all .csv files use the same formatting, which depends on the locale of the country they were created in. While UTF is standard, in some respects CSV is a "legacy format", even if it's the most lightweight, simple way to transfer data using plain text.
Sam already answered the majority of the difficult stuff, so I'll just add a few things. If you are making an automated solution and work with multiple countries, there are a few ways you can determine how a file is encoded. You can go the more technically proficient route and implement a custom function similar to this one https://gist.github.com/jpoehls/2406504 or, because it's a CSV, you can make a decent guess, since the most common encoding formats use different delimiters; I believe the one you are mentioning uses tabs as its delimiter.
I'll focus on the options within Excel's import feature because those weren't mentioned. There's a fairly neat function in the Data tab that allows you to customize your import based on which delimiters the file uses. In the third step, when you press Advanced, you can tell it which decimal separator (comma or period) the source data uses. Once you select that and press Finish, it will convert the result to whatever your locale is set to in Excel and properly evaluate functions. So, the workflow would be to open a new Excel workbook, select Data > From Text and proceed from there. It will convert the text from the locale you choose (in your case 1252 is likely) into whatever decimal format you specify.