Merging CSV Files into an XLSX with Tabs

I currently have 5 Registry CSV files which are created during a PowerShell script:
HKCC
HKCR
HKCU
HKLM
HKU
I need these CSV files to open at the end of the script, but I would prefer them all to be contained in one XLSX file with 5 separate tabs.
Is there a way to combine the files through PowerShell?
I understand how to read the data from the CSV files, but I don't understand how to merge or convert them. Here are some of the variables that I believe may be helpful:
$Date = Get-Date -Format "d.MMM.yyyy"
$DIR = $WPFlistview.Selecteditem.Ransomware
$path = "F:\Registry_Export\Results\$DIR\$Date\*"
$csvs = Get-ChildItem $path -Include *.csv
$output = "F:\Registry_Export\Results\$DIR\$Date\Results.Xlsx"
Paths to the CSV files if needed:
F:\Registry_Export\Results\$DIR\$Date\HKCR.CSV
F:\Registry_Export\Results\$DIR\$Date\HKCU.CSV
F:\Registry_Export\Results\$DIR\$Date\HKLM.CSV
F:\Registry_Export\Results\$DIR\$Date\HKU.CSV
F:\Registry_Export\Results\$DIR\$Date\HKCC.CSV
This is what I have tried so far. However, it completely scrambles my data into the wrong rows and cells:
function MergeCSV {
$Date = Get-Date -Format "d.MMM.yyyy"
$DIR = $WPFlistview.Selecteditem.Ransomware
$path = "F:\Registry_Export\Results\$DIR\$Date\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
foreach ($csv in $csvs) {
Write-Host " "$csv.Name
}
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
$sheet = 1
foreach ($csv in $csvs) {
$row = 1
$column = 1
$worksheet = $xlsx.Worksheets.Item($sheet)
$worksheet.Name = $csv.Name
$file = (Get-Content $csv)
foreach ($line in $file) {
$linecontents = $line -split ',(?!\s*\w+")'
foreach ($cell in $linecontents) {
$worksheet.Cells.Item($row,$column) = $cell
$column++
}
$column = 1
$row++
}
$sheet++
}
$output = "F:\Registry_Export\Results\$DIR\$Date\Results.Xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
}
How the CSV looks
https://gyazo.com/177c7c3bb21ddf06d0ebacbb7f4d537b
How the XLSX looks
https://gyazo.com/cd5fb48d61f93aac5ec3034d81811094

So, using the Excel.Application ComObject still, what I would suggest is loading each CSV as a CSV, not using Get-Content like you are. Then use the ConvertTo-CSV cmdlet, specifying to use tab as the delimiter, and copy that to the clipboard. Then just paste into Excel, and it will paste in fairly nicely. You may want to adjust column size, but the data will show up just as you would expect it to. I would also use a For loop instead of a ForEach loop, since Excel plays nice with numbers for the tabs (though it is 1 based instead of PowerShell's 0 base). Here's what I would end up with after making those modifications:
function MergeCSV {
$Date = Get-Date -Format "d.MMM.yyyy"
$DIR = $WPFlistview.Selecteditem.Ransomware
$path = "F:\Registry_Export\Results\$DIR\$Date\*"
$csvs = Get-ChildItem $path -Include *.csv
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
Write-Host " "$csvs.Name"`n"
$outputfilename = "Final Registry Results"
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.SheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
for($i=1;$i -le $y;$i++) {
$worksheet = $xlsx.Worksheets.Item($i)
$worksheet.Name = $csvs[$i-1].Name
$file = (Import-Csv $csvs[$i-1].FullName)
$file | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Clip
$worksheet.Cells.Item(1).PasteSpecial()|out-null
}
$output = "F:\Registry_Export\Results\$DIR\$Date\Results.Xlsx"
$xlsx.SaveAs($output)
$excelapp.Quit()
}
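The scrambling in the question comes from splitting each line on commas by hand: a quoted field that itself contains a comma gets cut in two, so every later cell shifts to the right. Here is a minimal sketch of the difference, using made-up sample data (the column names and values are illustrative, not taken from the real registry export):

```powershell
# Illustrative sample only: a header line and one record whose
# middle field contains a comma inside quotes.
$lines = @(
    'Key,Data,Type'
    '"HKLM\Software\Foo","Value, with comma","1"'
)

# Naive approach: split the record on every comma.
$naive = $lines[1] -split ','

# CSV-aware approach: let PowerShell apply the quoting rules.
$parsed = $lines | ConvertFrom-Csv

"Naive split produced $($naive.Count) cells"
"CSV parser produced $(@($parsed.PSObject.Properties).Count) cells; Data = '$($parsed.Data)'"
```

Import-Csv applies the same parsing to a file on disk, which is why loading each CSV as a CSV keeps the columns aligned.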

You could use ImportExcel by Doug Finke, and then replace the Export-CSV calls in your original script with Export-Excel -WorksheetName:
Install-Module ImportExcel
Export-Excel "F:\Registry_Export\Results\$DIR\$Date\Results.xlsx" -worksheetname "HKCR"
Export-Excel "F:\Registry_Export\Results\$DIR\$Date\Results.xlsx" -worksheetname "HKCU"
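Export-Excel needs data piped into it, so if the CSV files already exist, one option is to feed each one through Import-Csv. The sketch below assumes the ImportExcel module is installed and uses a placeholder folder (`$resultsDir` is hypothetical; substitute the real results folder). It does nothing if the module or the files are missing.

```powershell
# Hypothetical folder for illustration; replace with the real results path.
$resultsDir = Join-Path ([IO.Path]::GetTempPath()) 'Registry_Export_Demo'
$xlsxPath   = Join-Path $resultsDir 'Results.xlsx'
$hives      = 'HKCC', 'HKCR', 'HKCU', 'HKLM', 'HKU'

# One entry per hive: which CSV to read and which worksheet to write.
$plan = foreach ($hive in $hives) {
    [pscustomobject]@{
        Csv   = Join-Path $resultsDir "$hive.csv"
        Sheet = $hive
    }
}

# Only call Export-Excel when the ImportExcel module is actually available.
$haveImportExcel = [bool](Get-Command Export-Excel -ErrorAction SilentlyContinue)

foreach ($item in $plan) {
    if ($haveImportExcel -and (Test-Path $item.Csv)) {
        # Each call adds (or replaces) one worksheet in the same workbook.
        Import-Csv $item.Csv |
            Export-Excel -Path $xlsxPath -WorksheetName $item.Sheet -AutoSize
    }
}
```

Because no Excel COM object is involved, this approach should also work on machines without Excel installed.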

Related

How to use powershell to select range and dump that to csv file

Actually, this is a version of question here:
How to use powershell to select and copy columns and rows in which data is present in new workbook.
The goal is to grab certain columns from multiple Excel workbooks and dump everything to one csv file. Columns are always the same.
I'm doing that manually:
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$ws = $wb.Worksheets.item('Calls') # => This specific worksheet
$rowMax = ($ws.UsedRange.Rows).count
for ($i=1; $i -le $rowMax-1; $i++) {
$newRow = New-Object -Type PSObject -Property @{
'Type' = $ws.Cells.Item(1+$i,1).text
'Direction' = $ws.Cells.Item(1+$i,2).text
'From' = $ws.Cells.Item(1+$i,3).text
'To' = $ws.Cells.Item(1+$i,4).text
}
$newRow | Export-Csv -Path $("$output_folder\$ESO_Output") -Append -noType -Force
}
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
Question:
This works, but it is extremely slow, because Excel has to select every cell and copy it, and then PowerShell has to create an object and save it row by row to the output CSV file.
Is there a method to select a range in Excel (number of columns times ($ws.UsedRange.Rows).count), cut header line and just append this range (array?) to csv file to make everything much faster?
Here is the final solution. The script is 22 times faster than the original.
I hope somebody finds it useful :)
PasteSpecial is used to filter out empty rows, since there is no need to save them into the CSV.
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
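$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "") # open the workbook first, as in the question's script
try {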
$new_ws1 = $wb.Worksheets.add()
$ws = $wb.Worksheets.item('Calls')
$rowMax = ($ws.UsedRange.Rows).count
$range = $ws.Range("A1:O$rowMax")
$x = $range.copy()
$y = $new_ws1.Range("A1:O$rowMax").PasteSpecial([System.Type]::Missing,[System.Type]::Missing,$true,$false)
$wb.SaveAs("$($output_folder)\$($_.Basename)",[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
$xl.Quit()
The part above generates a set of CSV files, one per workbook.
The part below reads these files in a separate loop and combines them into one.
-exclude takes an array ($excluded) of file names I want to omit.
Remove-Item deletes the temporary files.
The code below is based on this post: https://stackoverflow.com/a/27893253/6190661
$getFirstLine = $true
Get-ChildItem "$output_folder\*.csv" -exclude $excluded | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "$($output_folder)\MERGED_CSV_FILE.csv" $linesToWrite
Remove-Item $_.FullName
}
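The merge step can be tried in isolation. This is a self-contained sketch of the same idea (keep the header from the first file, skip it in the rest); the temp folder, file names, and sample rows are invented for the demonstration.

```powershell
# Throwaway demo folder with two small CSV files sharing one header.
$demoDir = Join-Path ([IO.Path]::GetTempPath()) ("csvmerge_" + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $demoDir | Out-Null
Set-Content -Path (Join-Path $demoDir 'a.csv') -Value 'Type,From', 'Call,Alice'
Set-Content -Path (Join-Path $demoDir 'b.csv') -Value 'Type,From', 'SMS,Bob'

$merged = Join-Path $demoDir 'MERGED_CSV_FILE.csv'

# Collect the input files before the merged file exists,
# so the output is never picked up as an input.
$files = @(Get-ChildItem -Path $demoDir -Filter *.csv | Sort-Object Name)

$first = $true
foreach ($file in $files) {
    $lines = @(Get-Content -Path $file.FullName)
    if (-not $first) {
        # Every file after the first contributes data rows only.
        $lines = @($lines | Select-Object -Skip 1)
    }
    $first = $false
    Add-Content -Path $merged -Value $lines
}

$result = @(Get-Content -Path $merged)
$result
```

Listing the files up front is a small design choice: it avoids relying on an exclusion list to keep the merged file out of its own input.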

Multiple csv files into a xlsx file but different sheets using powershell

I have 20 CSV files, each unrelated to the others. How do I combine them into one XLSX file with 20 sheets, each named after its CSV file?
$root = "C:\Users\abc\Desktop\testcsv"
$CSVfiles = Get-ChildItem -Path $root -Filter *.csv
$xlsx = "C:\Users\abc\Desktop\testxl.xlsx" #output location
$delimiter = "," #delimiter
#Create a excel
$xl=New-Object -ComObject Excel.Application
$xl.Visible=$true
#add a workbook
$wb=$xl.WorkBooks.add(1)
ForEach ($csv in $CSVfiles){
#name the worksheet
$ws=$wb.WorkSheets.item(1)
$ws.Name = [io.path]::GetFileNameWithoutExtension($csv)
$TxtConnector = ("TEXT;" + $csv)
$Connector = $ws.QueryTables.add($TxtConnector,$ws.Range("A1"))
$query = $ws.QueryTables.item($Connector.name)
$query.TextFileOtherDelimiter = $delimiter
$query.TextFileParseType = 1
$query.TextFileColumnDataTypes = ,1 * $ws.Cells.Columns.Count
$query.AdjustColumnWidth = 1
# Execute & delete the import query
$query.Refresh()
$query.Delete()
$wb.SaveAs($xlsx,51)
}
# Save & close the Workbook as XLSX.
$xl.Quit()
Try it this way: change the first line to the folder where you store those 20 CSV files, and then run the script.
$path="c:\path\to\folder" #target folder
cd $path;
$csvs = Get-ChildItem .\* -Include *.csv
$y=$csvs.Count
Write-Host "Detected the following CSV files: ($y)"
foreach ($csv in $csvs)
{
Write-Host " "$csv.Name
}
$outputfilename = $(get-date -f yyyyMMdd) + "_" + $env:USERNAME + "_combined-data.xlsx" #creates file name with date/username
Write-Host Creating: $outputfilename
$excelapp = new-object -comobject Excel.Application
$excelapp.sheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
$sheet=1
foreach ($csv in $csvs)
{
$row=1
$column=1
$worksheet = $xlsx.Worksheets.Item($sheet)
$worksheet.Name = $csv.Name
$file = (Get-Content $csv)
foreach($line in $file)
{
$linecontents=$line -split ',(?!\s*\w+")'
foreach($cell in $linecontents)
{
$worksheet.Cells.Item($row,$column) = $cell
$column++
}
$column=1
$row++
}
$sheet++
}
$output = $path + "\" + $outputfilename
$xlsx.SaveAs($output)
$excelapp.quit()
cd \ #returns to drive root
The answer at https://stackoverflow.com/a/51094040/5995160 is too slow when dealing with CSVs that contain a lot of data, so I modified that solution to use https://github.com/dfinke/ImportExcel. This greatly improved the performance of this task, at least for me.
Install-Module ImportExcel -scope CurrentUser
$csvs = Get-ChildItem .\* -Include *.csv
$csvCount = $csvs.Count
Write-Host "Detected the following CSV files: ($csvCount)"
foreach ($csv in $csvs) {
Write-Host " -"$csv.Name
}
$excelFileName = $(get-date -f yyyyMMdd) + "_" + $env:USERNAME + "_combined-data.xlsx"
Write-Host "Creating: $excelFileName"
foreach ($csv in $csvs) {
$csvPath = ".\" + $csv.Name
$worksheetName = $csv.Name.Replace(".csv","")
Write-Host " - Adding $worksheetName to $excelFileName"
Import-Csv -Path $csvPath | Export-Excel -Path $excelFileName -WorkSheetname $worksheetName
}
This solution assumes that the user has already changed to the directory where all the CSVs live.
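One detail to watch when naming sheets after files: Excel limits worksheet names to 31 characters and rejects the characters \ / ? * [ ] and :. A minimal sketch of a helper that derives a safe name (the function name `ConvertTo-SheetName` and the underscore replacement are my own choices, not part of any module):

```powershell
function ConvertTo-SheetName {
    param([string]$FileName)

    # Drop the extension: "HKLM.csv" -> "HKLM".
    $name = [IO.Path]::GetFileNameWithoutExtension($FileName)

    # Replace characters Excel does not allow in worksheet names.
    $name = $name -replace '[\\/?*\[\]:]', '_'

    # Truncate to Excel's 31-character limit.
    if ($name.Length -gt 31) { $name = $name.Substring(0, 31) }

    $name
}

ConvertTo-SheetName -FileName 'HKLM.csv'
ConvertTo-SheetName -FileName 'sales[2024].csv'
```

The result could be passed to -WorkSheetname, or assigned to a worksheet's Name property in the COM-based scripts. Truncation can produce duplicate names for long, similar file names, which this sketch does not handle.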
See below for a solution which uses the OpenText method.
At least two things to note:
I'm assuming your workbook is created with a single sheet by default. If it is created with more than that, you will need to modify the script so that the additional sheets are deleted from the end result.
The way you specify TextFileColumnDataTypes is quite clever. You will need to modify it and feed the array to the FieldInfo argument below. See the Workbooks.OpenText documentation for the kind of array it is expecting.
$CSVfiles = Get-ChildItem -Path $root -Filter *.csv
$xlsx = "C:\Users\abc\Desktop\testxl.xlsx" #output location
#Create a excel
$xl = New-Object -ComObject Excel.Application
$xl.Visible=$true
#add a workbook
$wb = $xl.WorkBooks.add(1)
# how many worksheets do you have in your original workbook? Assuming one:
$ws = $wb.Worksheets.Item(1)
ForEach ($csv in $CSVfiles){
# OpenText method does not work well with csv files
Copy-Item -Path $csv.FullName -Destination ($csv.FullName).Replace(".csv",".txt") -Force
# Use OpenText method. FieldInfo will need to be amended to suit your needs
$xl.WorkBooks.OpenText(`
($csv.FullName).Replace(".csv",".txt"), # Filename
2, # Origin
1, # StartRow
1, # DataType
1, # TextQualifier
$false, # ConsecutiveDelimiter
$false, # Tab
$false, # Semicolon
$true, # Comma
$false, # Space
$false, # Other
$false, # OtherChar
@() # FieldInfo
)
$tempBook = $xl.ActiveWorkbook
$tempBook.worksheets.Item(1).Range("A1").Select() | Out-Null
$tempBook.worksheets.Item(1).Move($wb.Worksheets.Item(1)) | Out-Null
# name the worksheet
$xl.ActiveSheet.Name = $csv.BaseName
Remove-Item -Path ($csv.FullName).Replace(".csv",".txt") -Force
}
$ws.Delete()
# Save & close the Workbook as XLSX.
$wb.SaveAs($xlsx,51)
$wb.Close()
$xl.Quit()

Passing CSV to Excel Workbook (Not From File)

I have a folder of CSV files that contain log entries. For each entry of the CSV, if the Risk property is not Low and not None then I put it in an accumulation CSV object. From there, I want to import it into an Excel Workbook directly WITHOUT having to save the CSV to file.
$CSVPaths = (Split-Path $PSCommandPath)
$AccumulateExportPath = (Split-Path $PSCommandPath)
$FileName="Accumulate"
$Acc=@()
Foreach ($csv in (Get-ChildItem C:\Scripts\Nessus\Sheets |? {$_.Extension -like ".csv" -and $_.BaseName -notlike "$FileName"}))
{
$Content = Import-CSV $csv.FullName
Foreach ($Log in $Content)
{
If ($Log.Risk -ne "None" -and $Log.Risk -ne "Low")
{
$Acc+=$Log
}
}
}
$CSV = $ACC |ConvertTo-CSV -NoTypeInformation
Add-Type -AssemblyName Microsoft.Office.Interop.Excel
$Script:Excel = New-Object -ComObject Excel.Application
$Excel.Visible=$True
#$Excel.Workbooks.OpenText($CSV) What should replace this?
Is there a method like OpenText() that lets me pass a CSV object instead of a file path to a CSV file, or am I going to have to write my own conversion function?
Interesting question. I'm not aware of a method that allows you to pass a CSV object.
However, if your result CSV is not too big and you are using PowerShell 5.0+, you could convert the object to a string and leverage Set-Clipboard:
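# $csv must hold the objects themselves (e.g. $Acc from the question), not the strings returned by ConvertTo-CSV
$headers = ($csv | Get-Member | Where-Object {$_.MemberType -eq "NoteProperty"}).Name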
$delim = "`t"
# headers
foreach($header in $headers){
$myString += $header + $delim
}
# trim delimiter at the end, and add new line
$myString = $myString.TrimEnd($delim)
$myString = $myString + "`n"
# loop over each line and repeat
foreach($line in $csv){
foreach($header in $headers){
$myString += $line.$header + $delim
}
$myString = $myString.TrimEnd($delim)
$myString = $myString + "`n"
}
# copy to clipboard
Set-Clipboard $myString
# paste into the first worksheet of a new workbook
$Excel.Workbooks.Add().Worksheets.Item(1).Paste()
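The string-building loops can probably be replaced by ConvertTo-Csv with a tab delimiter, which produces the same kind of clipboard-ready text. A minimal sketch with invented sample records (the property names `Name` and `Risk` and their values are illustrative only); the clipboard and Excel calls are left as comments because they need Windows and Excel:

```powershell
# Illustrative records standing in for the filtered log entries.
$acc = @(
    [pscustomobject]@{ Name = 'srv01'; Risk = 'High' }
    [pscustomobject]@{ Name = 'srv02'; Risk = 'Medium' }
)

# One tab-delimited line per record, header first, joined into one string.
$tabLines = @($acc | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation)
$tabText  = $tabLines -join "`r`n"

$tabText

# On a Windows machine with Excel, the text could then be pasted, e.g.:
#   Set-Clipboard -Value $tabText
#   $sheet.Paste()    # $sheet is a hypothetical worksheet object
```

ConvertTo-Csv wraps each value in double quotes by default; Excel generally treats those as text qualifiers when pasting, but it is worth checking the result with your own data.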
Here is another way to create an Excel spreadsheet from PowerShell without writing a .csv file.
$dirs = 'C:\src\t', 'C:\src\sql'
$records = $()
$records = foreach ($dir in $dirs) {
Get-ChildItem -Path $dir -File '*.txt' -Recurse |
Select-Object @{Expression={$_.FullName}; Label="filename"}
}
#open excel
$excel = New-Object -ComObject excel.application
$excel.visible = $false
#add a default workbook
$workbook = $excel.Workbooks.Add()
#remove worksheet 2 & 3
$workbook.Worksheets.Item(3).Delete()
$workbook.Worksheets.Item(2).Delete()
#give the remaining worksheet a name
$uregwksht = $workbook.Worksheets.Item(1)
$uregwksht.Name = 'File Names'
# Start on row 1
$i = 1
# the .appendix to $record refers to the column header in the csv file
foreach ($record in $records) {
$excel.cells.item($i,1) = $record.filename
$i++
}
#adjusting the column width so all data's properly visible
$usedRange = $uregwksht.UsedRange
$usedRange.EntireColumn.AutoFit() | Out-Null
#saving & closing the file
$outputpath = Join-Path -Path $Env:USERPROFILE -ChildPath "desktop\exceltest.xlsx"
$workbook.SaveAs($outputpath)
$excel.Quit()

Powershell script using Excel running slow

I have a script that I wrote on my laptop, where it works just fine. Its job is to combine two .csv files into one .xls file.
Running the script with two .csv files containing a couple of thousand rows takes a few seconds at most.
But when I try to run it on the server where it should be located, it takes hours. I haven't done a full run, but writing one line to the .xls file takes maybe 2-3 seconds.
So what I'm wondering is what is causing the huge increase in runtime. I'm monitoring the CPU load while the script is running, and it's at 50-60%.
The server has plenty of RAM and two CPU cores.
How can I speed this up?
The script looks like this:
$path = "C:\test\*"
$path2 = "C:\test"
$date = Get-Date -Format d
$csvs = Get-ChildItem $path -Include *.csv | Sort-Object LastAccessTime -Descending | Select-Object -First 2
$y = $csvs.Count
Write-Host "Detected the following CSV files: ($y)"
foreach ($csv in $csvs) {
Write-Host " "$csv.Name
}
$outputfilename = "regSCI " + $date
Write-Host Creating: $outputfilename
$excelapp = New-Object -ComObject Excel.Application
$excelapp.sheetsInNewWorkbook = $csvs.Count
$xlsx = $excelapp.Workbooks.Add()
$sheet = 1
$xlleft = -4131
foreach ($csv in $csvs) {
$row = 1
$column = 1
$worksheet = $xlsx.Worksheets.Item($sheet)
$worksheet.Name = $csv.Name
$worksheet.Rows.HorizontalAlignment = $xlleft
$file = (Get-Content $csv)
Write-Host Worksheet created: $worksheet.Name
foreach($line in $file) {
Write-Host Writing Line
$linecontents = $line -split ',(?!\s*\w+")'
foreach($cell in $linecontents) {
Write-Host Writing Cell
$cell1 = $cell.Trim('"')
$worksheet.Cells.Item($row, $column) = $cell1
$column++
}
$column = 1
$row++
$WorkSheet.UsedRange.Columns.Autofit() | Out-Null
}
$sheet++
$headerRange = $worksheet.Range("a1", "q1")
$headerRange.AutoFilter() | Out-Null
}
$output = $path2 + "\" + $outputfilename
Write-Host $output
$xlsx.SaveAs($output)
$excelapp.Quit()
To speed up your existing code, add these just after creating Excel object:
$excelapp.ScreenUpdating = $false
$excelapp.DisplayStatusBar = $false
$excelapp.EnableEvents = $false
$excelapp.Visible = $false
And these just before SaveAs:
$excelapp.ScreenUpdating = $true
$excelapp.DisplayStatusBar = $true
$excelapp.EnableEvents = $true
This stops Excel from rendering the worksheet in real time and firing events every time you change the contents. DisplayStatusBar and ScreenUpdating most probably don't matter once the application is invisible, but I included them just in case.
Also, you're running Autofit() after every line. This certainly doesn't help with performance.
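Beyond those switches, most of the time in the original script goes into one COM call per cell. A commonly used alternative is to build a two-dimensional array in PowerShell and assign it to a range in a single call. The sketch below covers only the array-building part, which runs without Excel; the function name `ConvertTo-Grid` is made up, and the range assignment in the trailing comments is an assumption about how it would be wired into the script above.

```powershell
function ConvertTo-Grid {
    param([string[]]$CsvLines)

    # Parse with the CSV parser so quoted commas stay inside one cell.
    $rows  = @($CsvLines | ConvertFrom-Csv)
    $names = @($rows[0].PSObject.Properties.Name)

    $rowCount = $rows.Count + 1      # +1 for the header row
    $colCount = $names.Count
    $grid = New-Object -TypeName 'object[,]' -ArgumentList $rowCount, $colCount

    # Header row.
    for ($c = 0; $c -lt $colCount; $c++) { $grid[0, $c] = $names[$c] }

    # Data rows.
    for ($r = 0; $r -lt $rows.Count; $r++) {
        for ($c = 0; $c -lt $colCount; $c++) {
            $grid[($r + 1), $c] = $rows[$r].($names[$c])
        }
    }

    # The unary comma keeps PowerShell from flattening the 2-D array.
    , $grid
}

$grid = ConvertTo-Grid -CsvLines 'A,B', '1,"x,y"'
"{0} rows x {1} columns" -f $grid.GetLength(0), $grid.GetLength(1)

# Assumed usage inside the existing loop (requires Excel, not run here):
#   $grid  = ConvertTo-Grid -CsvLines (Get-Content $csv.FullName)
#   $last  = $worksheet.Cells.Item($grid.GetLength(0), $grid.GetLength(1))
#   $worksheet.Range($worksheet.Cells.Item(1, 1), $last).Value2 = $grid
```

With this shape, Autofit() and AutoFilter() would be called once per sheet after the assignment rather than once per line.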

Import csv into excel and specify cell format

I am trying to import multiple csv files into their own tabs in 1 excel workbook. I am having an issue with long number fields being displayed as exponential data and changing the last digit to 0. For example I have a 16 digit account number (1234567890123456) it is being displayed in excel as an exponential number (1.23457E+15). When I look at the actual number in the cell it is (1234567890123450). I assume if I make the column text before I bring it in, it will work, but I'm not sure how to do that. Here is my code.
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$excel.displayalerts=$False
$workbook = $excel.workbooks.add()
$sheets = $workbook.sheets
$sheetCount = $Sheets.Count
$mySheet = 1
$mySheetName = "Sheet" + $mySheet
$s1 = $sheets | where {$_.name -eq $mySheetName }
$s1.Activate()
If($sheetCount -gt 1)
{
#Delete other Sheets
$Sheets | ForEach-Object {
$tmpSheetName = $_.Name
$tmpSheet = $_
If($tmpSheetName -ne "Sheet1"){$tmpSheet.Delete()}
}
}
#import csv files
$files = dir -Path $csvDir*.csv
ForEach($file in $files){
If($mySheet -gt 1){$s1 = $workbook.sheets.add()}
$s1.Name = $file.BaseName
$s1.Activate()
$s1Data = Import-Csv $file.FullName
$s1data | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Clip
$s1.cells.item(1,1).Select()
$s1.Paste()
$mySheet ++
if (test-path $file ) { rm $file }
}
$workbook.SaveAs($excelTMGPath)
$workbook.Close()
$workbook = $null
#$excel.quit()
while ([System.Runtime.InteropServices.Marshal]::FinalReleaseComObject($excel)) {}
$excel = $null
Try this, provided $s1 points to the correct sheet:
$s1.cells.item(1,1).NumberFormat="@"
If that does not work, set NumberFormat on whichever cells need it, using the format you prefer ("@" means Text).
Alternatively, change the file extension from .csv to .txt and adjust the file filter in the code accordingly:
$files = dir -Path $csvDir*.txt
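Excel keeps only 15 significant digits for numbers, which is why the 16th digit of the account number turns into 0. Formatting a single cell as Text will not cover a whole pasted column, so it may help to work out which columns need the Text format before pasting. A minimal sketch, using a made-up helper name (`Get-LongNumberColumn`) and invented sample data:

```powershell
function Get-LongNumberColumn {
    param(
        [object[]]$Rows,        # objects as returned by Import-Csv
        [int]$MaxDigits = 15    # Excel keeps 15 significant digits
    )

    # Matches values made only of digits and longer than $MaxDigits.
    $pattern = '^\d{' + ($MaxDigits + 1) + ',}$'
    $names   = @($Rows[0].PSObject.Properties.Name)

    for ($i = 0; $i -lt $names.Count; $i++) {
        $name = $names[$i]
        $hit = $Rows |
            Where-Object { "$($_.$name)" -match $pattern } |
            Select-Object -First 1
        if ($hit) { $i + 1 }    # emit the 1-based Excel column index
    }
}

# Illustrative data: only the first column holds a 16-digit value.
$rows = 'Account,Name', '1234567890123456,Alice', '42,Bob' | ConvertFrom-Csv
$textColumns = @(Get-LongNumberColumn -Rows $rows)
"Columns to format as Text: $($textColumns -join ', ')"

# Assumed usage before the paste step (requires Excel, not run here):
#   foreach ($n in $textColumns) { $s1.Columns.Item($n).NumberFormat = '@' }
```

Setting the format before the paste matters: once Excel has already converted a value to a number, switching the cell to Text does not bring the lost digit back.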
