I have a CSV file that has similar products within it and quantities of each product beside it.
Sample from CSV file
Qty Ordered Product/Item Description Top row (header)
7 Product1
3 Product2
5 Product1
3 Product3
I need a method to find all the similar product#s, add up their Quantities, and place the total of each similar product in a new row.
Add-Type -AssemblyName System.Windows.Forms
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog -Property
#{
Multiselect = $false # Multiple files can be chosen
Filter = 'Excel (*.csv, *.xlxs)|*.csv;*.xlsx' # Specified file types
}
[void]$FileBrowser.ShowDialog()
$file = $FileBrowser.FileNames;
[Reflection.Assembly]::LoadWithPartialName
("Microsoft.Office.Interop.Excel")|Out-Null
$excel = New-Object Microsoft.Office.Interop.Excel.ApplicationClass
$excel.Visible = $true
$wb = $excel.Workbooks.Open($file)
$ws = $wb.ActiveSheet
$c = $ws.Columns
$c.Item(2).hidden = $true
This code, asks the user to select the csv file, hides useless columns and auto-sizes the important columns as well.
Rather than using Excel as a COM Object you could use Import-CSV and then Group-Object. Then loop through the groups for the information you need.
Add-Type -AssemblyName System.Windows.Forms
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog -Property #{
Multiselect = $false # Multiple files can be chosen
Filter = 'Excel (.csv, *.xlxs)|.csv;*.xlsx' # Specified file types
}
[void]$FileBrowser.ShowDialog()
ForEach ($file in $FileBrowser.FileNames) {
$CSV = Import-CSV $file | Add-Member -Name Total -Value 0 -MemberType NoteProperty
$Groups = $CSV | Group-Object "Product/Item Description"
$NewCSV = Foreach ($Group in $Groups) {
$Count = 0
$Group.Group."Qty Ordered" | ForEach-Object {$Count += $_}
Foreach ($value in $CSV) {
If ($value."Product/Item Description" -eq $Group.Name) {
$value.Total = $Count
$value
}
}
}
Export-CSV "$filenew" -NoTypeInformation
}
Related
I'm trying to create an Excel workbook, then populate the cells with data found from searching many txt files.
I read a file and extract all comments AFTER I find "IDENTIFICATION DIVISION" and BEFORE I find "ENVIRONMENT DIVISION"
I then populate two cells in my excel workbook. cell one if the file and cell two is the comments extracted.
I have 256GB of memory on the work server. less than %5 is being used before Powershell throws the memory error.
Can anyone see where I'm going wrong?
Thanks,
-Ron
$excel = New-Object -ComObject excel.application
$excel.visible = $False
$workbook = $excel.Workbooks.Add()
$diskSpacewksht= $workbook.Worksheets.Item(1)
$diskSpacewksht.Name = "XXXXX_Desc"
$col1=1
$diskSpacewksht.Cells.Item(1,1) = 'Program'
$diskSpacewksht.Cells.Item(1,2) = 'Description'
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$Flowerbox = #()
ForEach($CBLFile in $CBLFileList) {
$treat = $false
Write-Host "Processing ... $CBLFile" -foregroundcolor green
Get-content -Path $CBLFile.FullName |
ForEach-Object {
if ($_ -match 'IDENTIFICATION DIVISION') {
# Write-Host "Match IDENTIFICATION DIVISION" -foregroundcolor green
$treat = $true
}
if ($_ -match 'ENVIRONMENT DIVISION') {
# Write-Host "Match ENVIRONMENT DIVISION" -foregroundcolor green
$col1++
$diskSpacewksht.Cells.Item($col1,1) = $CBLFile.Name
$diskSpacewksht.Cells.Item($col1,2) = [String]$Flowerbox
$Flowerbox = #()
continue
}
if ($treat) {
if ($_ -match '\*(.{62})') {
Foreach-Object {$Flowerbox += $matches[1] + "`r`n"}
$treat = $false
}
}
}
}
$excel.DisplayAlerts = 'False'
$ext=".xlsx"
$path="C:\Desc.txt"
$workbook.SaveAs($path)
$workbook.Close
$excel.DisplayAlerts = 'False'
$excel.Quit()
Not knowing what the contents of the .CBL files could be, I would suggest not to try and do all of this using an Excel COM object, but create a CSV file instead to make things a lot easier.
When finished, you can simply open that csv file in Excel.
# create a List object to collect the 'flowerbox' strings in
$Flowerbox = [System.Collections.Generic.List[string]]::new()
$treat = $false
# get a list of the .cbl files and loop through. Collect all output in variable $result
$CBLFileList = Get-ChildItem -Path 'C:\XXXXX\XXXXX' -Filter '*.cbl' -File -Recurse
$result = foreach ($CBLFile in $CBLFileList) {
Write-Host "Processing ... $($CBLFile.FullName)" -ForegroundColor Green
# using switch -File is an extremely fast way of testing a file line by line.
# instead of '-Regex' you can also do '-WildCard', but then add asterikses around the strings
switch -Regex -File $CBLFile.FullName {
'IDENTIFICATION DIVISION' {
# start collecting Flowerbox lines from here
$treat = $true
}
'ENVIRONMENT DIVISION' {
# stop colecting Flowerbox lines and output what we already have
# output an object with the two properties you need
[PsCustomObject]#{
Program = $CBLFile.Name # or $CBLFile.FullName
Description = $Flowerbox -join [environment]::NewLine
}
$Flowerbox.Clear() # empty the list for the next run
$treat = $false
}
default {
# as I have no idea what these lines may look like, I have to
# assume your regex '\*(.{62})' is correct..
if ($treat -and ($_ -match '\*(.{62})')) {
$Flowerbox.Add($Matches[1])
}
}
}
}
# now you have everything in an array of PSObjects so you can save that as Csv
$result | Export-Csv -Path 'C:\Desc.csv' -UseCulture -NoTypeInformation
Parameter -UseCulture ensures you can double-click the file so it will open correctly in your Excel
You can also create an Excel file from this csv programmatically like:
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open('C:\Desc.csv')
$worksheet = $workbook.Worksheets.Item(1)
$worksheet.Name = "XXXXX_Desc"
# save as .xlsx
# 51 ==> [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
# see: https://learn.microsoft.com/en-us/office/vba/api/excel.xlfileformat
$workbook.SaveAs('C:\Desc.xlsx', 51)
# quit Excel and remove all used COM objects from memory
$excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Actually, this is a version of question here:
How to use powershell to select and copy columns and rows in which data is present in new workbook.
The goal is to grab certain columns from multiple Excel workbooks and dump everything to one csv file. Columns are always the same.
I'm doing that manually:
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
$wb = $xl.Workbooks.Open($_.FullName, 0, 1, 5, "")
try {
$ws = $wb.Worksheets.item('Calls') # => This specific worksheet
$rowMax = ($ws.UsedRange.Rows).count
for ($i=1; $i -le $rowMax-1; $i++) {
$newRow = New-Object -Type PSObject -Property #{
'Type' = $ws.Cells.Item(1+$i,1).text
'Direction' = $ws.Cells.Item(1+$i,2).text
'From' = $ws.Cells.Item(1+$i,3).text
'To' = $ws.Cells.Item(1+$i,4).text
}
$newRow | Export-Csv -Path $("$output_folder\$ESO_Output") -Append -noType -Force
}
}
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
Question:
This works, but is extremely slow because Excel has to select every cell, copy that, then Powershell has to create array and save row by row in output csv file.
Is there a method to select a range in Excel (number of columns times ($ws.UsedRange.Rows).count), cut header line and just append this range (array?) to csv file to make everything much faster?
So that's the final solution
Script is 22 times faster!!! than original solution.
Hope somebody will find that useful :)
PasteSpecial is to filter out empty rows. There is no need to save them into csv
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.DisplayAlerts = $false
$counter = 0
$input_folder = "C:\Users\user\Documents\excelfiles"
$output_folder = "C:\Users\user\Documents\csvdump"
Get-ChildItem $input_folder -File |
Foreach-Object {
$counter++
try {
$new_ws1 = $wb.Worksheets.add()
$ws = $wb.Worksheets.item('Calls')
$rowMax = ($ws.UsedRange.Rows).count
$range = $ws.Range("A1:O$rowMax")
$x = $range.copy()
$y = $new_ws1.Range("A1:O$rowMax").PasteSpecial([System.Type]::Missing,[System.Type]::Missing,$true,$false)
$wb.SaveAs("$($output_folder)\$($_.Basename)",[Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSVWindows)
} catch {
Write-host "No such workbook" -ForegroundColor Red
# Return
}
}
$xl.Quit()
Part above will generate a bunch of csv files.
Part below will read these files in separate loop and combine them together into one.
-exclude is an array of something I want to omit
Remove-Item to remove temporary files
Answer below is based on this post: https://stackoverflow.com/a/27893253/6190661
$getFirstLine = $true
Get-ChildItem "$output_folder\*.csv" -exclude $excluded | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "$($output_folder)\MERGED_CSV_FILE.csv" $linesToWrite
Remove-Item $_.FullName
}
My goal is to add a version number to the file properties of an Excel file that can then be read externally with PowerShell.
If I run (Get-Item "example.xls").VersionInfo I get blank ProductVersion and FileVersion.
ProductVersion FileVersion FileName
-------------- ----------- --------
example.xls
I cannot find a way to set these attributes from VBA. I did find a way to get\set a Revision Number:
Public Function FileVersion() As String
With ThisWorkbook.BuiltinDocumentProperties
FileVersion = .Item("Revision Number").Value
End With
End Function
Public Sub UpdateFileVersion()
With ThisWorkbook.BuiltinDocumentProperties
.Item("Revision Number").Value = .Item("Revision Number").Value + 1
End With
End Sub
However, I can't find a way to read the Revision Number from PowerShell. I either need to read Revision Number from PowerShell or I need to set ProductVersion and FileVersion from VBA. I would accept any combination of things that results in setting a file version in Excel that is visible outside of Excel, ideally I would like to be able to use all of these properties.
You can see the Revision Number I am trying to get from PowerShell and also the Version Number that I cannot set from VBA here:
If you right-click a file and hit properties in the Details tab, you see all that is available.
If you don't want to have to COM into the EOM (Excel Object Model), then you need to assign these in the EOM first, then hit them via PowerShell just as Windows Explorer shows them or enum metadata.
So, something like...
### Get file properties
##
Get-ItemProperty -Path 'D:\Temp' -filter '*.xl*' |
Format-list -Property * -Force
Or
### Enumerate file properties in PowerShell
# get the first file
(
$Path = ($FileName = (Get-ChildItem -Path 'D:\Temp' -Filter '*.xl*').FullName ) |
Select-Object -First 1
)
$shell = New-Object -COMObject Shell.Application
$folder = Split-Path $path
$file = Split-Path $path -Leaf
$shellfolder = $shell.Namespace($folder)
($shellfile = $shellfolder.ParseName($file))
<#
You'll need to know what the ID of the extended attribute is.
This will show you all of the ID's:
#>
0..287 |
Foreach-Object { '{0} = {1}' -f $_, $shellfolder.GetDetailsOf($null, $_) }
# Once you find the one you want you can access it like this:
$shellfolder.GetDetailsOf($shellfile, 216)
As for this...
Thanks but your list, and the one I got from running this on my Excel
file, do not contain Revision
... try it this way.
Gleened from here:
Hey, Scripting Guy! How Can I List All the Properties of a Microsoft
Word Document?
and here:
# Getting specific properties fomr MS Word
$Path = "D:\Temp"
$ObjectProperties = "Author","Keywords","Revision number"
$Application = New-Object -ComObject Word.Application
$Application.Visible = $false
$Binding = "System.Reflection.BindingFlags" -as [type]
$Select = "Name","Created"
$Select += $ObjectProperties
ForEach ($File in (Get-ChildItem $Path -Include '*.docx' -Recurse))
{ $Document = $Application.Documents.Open($File.Fullname)
$Properties = $Document.BuiltInDocumentProperties
$Hash = #{}
$Hash.Add("Name",$File.FullName)
$Hash.Add("Created",$File.CreationTime)
ForEach ($Property in $ObjectProperties)
{ $DocProperties = [System.__ComObject].InvokeMember("item",$Binding::GetProperty,$null,$Properties,$Property)
Try {$Value = [System.__ComObject].InvokeMember("value",$binding::GetProperty,$null,$DocProperties,$null)}
Catch {$Value = $null}
$Hash.Add($Property,$Value)
}
$Document.Close()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($Properties) |
Out-Null
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($Document) |
Out-Null
New-Object PSObject -Property $Hash |
Select $Select
}
$Application.Quit()
# Results
<#
Name : D:\Temp\Test.docx
Created : 06-Feb-20 14:23:55
Author : ...
Keywords :
Revision number : 5
#>
# Getting specific properties fomr MS Excel
$Path = "D:\Temp"
$ObjectProperties = "Author","Keywords","Revision number"
$Application = New-Object -ComObject excel.Application
$Application.Visible = $false
$Binding = "System.Reflection.BindingFlags" -as [type]
$Select = "Name","Created"
$Select += $ObjectProperties
ForEach ($File in (Get-ChildItem $Path -Include '*.xlsx' -Recurse))
{ $Document = $Application.Workbooks.Open($File.Fullname)
$Properties = $Document.BuiltInDocumentProperties
$Hash = #{}
$Hash.Add("Name",$File.FullName)
$Hash.Add("Created",$File.CreationTime)
ForEach ($Property in $ObjectProperties)
{ $DocProperties = [System.__ComObject].InvokeMember("item",$Binding::GetProperty,$null,$Properties,$Property)
Try {$Value = [System.__ComObject].InvokeMember("value",$binding::GetProperty,$null,$DocProperties,$null)}
Catch {$Value = $null}
$Hash.Add($Property,$Value)
}
$Document.Close()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($Properties) |
Out-Null
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($Document) |
Out-Null
New-Object PSObject -Property $Hash |
Select $Select
}
$Application.Quit()
# Results
<#
Name : D:\Temp\Test.xlsx
Created : 25-Nov-19 20:47:15
Author : ...
Keywords :
Revision number : 2
#>
Point of note: I meant to add sources:
Regarding setting properties, see this Word example from the MS PowerShellgallery.com, which can be tweaked of course for other Office docs.
Set specific word document properties using PowerShell
The attached script uses the Word automation model to set a specific
BuiltIn Word document property. It is provided as an example of how to
do this. You will need to modify the pattern used to find the files,
as well as the built-in Word property and value you wish to assign.
As note above, getting is the same thing...
Get Word built-in document properties
This script will allow you to specify specific Word built-in document
properties. It returns an object containing the specified word
document properties as well as the path to those documents. Because a
PowerShell object returns, you can filter and search different
information fr
Thanks to #postanote for pointing me in the right direction. None of the code offered worked out of the box for me.
This is what I ended up doing to pull the Revision Number from my Excel document:
<# Get-Excel-Property.ps1 v1.0.0 by Adam Kauffman 2020-02-03
Returns the property value from an Excel File
#>
param(
[Parameter(Mandatory=$true, Position=0)][string]$FilePath,
[Parameter(Mandatory=$true, Position=1)][string]$ObjectProperties
)
Function Get-Property-Value {
[CmdletBinding()]Param (
[Parameter(Mandatory = $true)]$ComObject,
[Parameter(Mandatory = $true)][String]$Property
)
$Binding = "System.Reflection.BindingFlags" -as [type]
Try {
$ObjectType = $ComObject.GetType()
$Item = $ObjectType.InvokeMember("Item",$Binding::GetProperty,$null,$ComObject,$Property)
return $ObjectType.InvokeMember("Value",$Binding::GetProperty,$null,$Item,$null)
}
Catch {
return $null
}
}
# Main
$Application = New-Object -ComObject Excel.Application
$Application.Visible = $false
$Document = $Application.Workbooks.Open($FilePath)
$Properties = $Document.BuiltInDocumentProperties
$Hash = #{}
$Hash.Add("Name",$FilePath)
ForEach ($Property in $ObjectProperties)
{
$Value = Get-Property-Value -ComObject $Properties -Property $Property
$Hash.Add($Property,$Value)
}
# COM Object Cleanup
if ($null -ne $Document) {
$Document.Close($false)
Remove-Variable -Name Document
}
if ($null -ne $Properties) {
Remove-Variable -Name Properties
}
if ($null -ne $Application) {
$Application.Quit()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($Application) | Out-Null
Remove-Variable -Name Application
}
[gc]::collect()
[gc]::WaitForPendingFinalizers()
# Show collected information
New-Object PSObject -Property $Hash
I have a csv file which contains all data in 1 column.
This is the format,
EPOS SKU QTY ReferenceNr
---- --- --- -----------
717 30735002 1 S04-457312
700 30777125 1 S06-457360
700 25671933 1 S06-457389
716 25672169 1 S09-457296
716 25440683 1 S09-457296
I would like to separate those data into 4 columns with these following headers and save/export to csv or xlsx via powershell script.
Thank you for your help
This should work:
Add-Type -AssemblyName Microsoft.Office.Interop.Excel
$inputFile = $PSScriptRoot + '\rawtext.txt'
$csvFile = $PSScriptRoot + '\rawtext.csv'
$excelFile = $PSScriptRoot + '\rawtext.xlsx'
# Create datatable
$dt = New-Object system.Data.DataTable
[void]$dt.Columns.Add('EPOS',[string]::empty.GetType() )
[void]$dt.Columns.Add('SKU',[string]::empty.GetType() )
[void]$dt.Columns.Add('QTY',[string]::empty.GetType() )
[void]$dt.Columns.Add('ReferenceNr',[string]::empty.GetType() )
# loop file
foreach($line in [System.IO.File]::ReadLines($inputFile))
{
if( $line -match '^\d+' ) {
$contentArray = $line -split ' +'
$newRow = $dt.NewRow()
$newRow.EPOS = $contentArray[0]
$newRow.SKU = $contentArray[1]
$newRow.QTY = $contentArray[2]
$newRow.ReferenceNr = $contentArray[3]
[void]$dt.Rows.Add( $newRow )
}
}
# create csv
$dt | Export-Csv $outputFile -Encoding UTF8 -Delimiter ';' -NoTypeInformation
#create excel
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$excel.DisplayAlerts = $false
[void]$excel.Workbooks.Open($csvFile).SaveAs($excelFile, [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault)
[void]$excel.Workbooks.Close()
[void]$excel.Quit()
# remove csv
Remove-Item -Path $csvFile -Force | Out-Null
With the Export-Csv instead of Format-Table solved.
$ftr = Get-Content -Path $pathfile |
select -Skip 1 |
ConvertFrom-Csv -Delimiter '|' -Header 'Detail', 'LineNr', 'EPOS', 'SKU',
'SKUName', 'QTY', 'StoreName', 'Contact', 'ReferenceNr' |
Select-Object -Property EPOS, SKU, QTY, ReferenceNr |
Export-Csv -Path $target$ArvName.csv -NoTypeInformation
If your question is regarding Excel... (It is not clear for me)
Just rename the file from *.csv to *.txt and open it on Excel.
On the Text Assistant choose "My data has headers" and "Delimited" and it will be correctly imported with each data on one column. As you ask for.
Later on, save as csv or xlsx.
I have multiple Excel files with different names in path.
e.g. C:\Users\XXXX\Downloads\report
Each file has a fixed number of columns.
e.g. Date | Downtime | Response
I want to create a new Excel file with merge of all Excel data. New column should be added with client name in which i want to enter file name. Then each Excel file data append below one by one.
e.g. Client name | Date | Downtime | Response
Below code can able to append all excel data but now need to add Client name column.
$path = "C:\Users\XXXX\Downloads\report"
#Launch Excel, and make it do as its told (supress confirmations)
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $True
$Excel.DisplayAlerts = $False
$Files = Get-ChildItem -Path $path
#Open up a new workbook
$Dest = $Excel.Workbooks.Add()
#Loop through files, opening each, selecting the Used range, and only grabbing the first 5 columns of it. Then find next available row on the destination worksheet and paste the data
ForEach($File in $Files)
{
$Source = $Excel.Workbooks.Open($File.FullName,$true,$true)
If(($Dest.ActiveSheet.UsedRange.Count -eq 1) -and ([String]::IsNullOrEmpty($Dest.ActiveSheet.Range("A1").Value2)))
{
#If there is only 1 used cell and it is blank select A1
[void]$source.ActiveSheet.Range("A1","E$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A1").Select()
}
Else
{
#If there is data go to the next empty row and select Column A
[void]$source.ActiveSheet.Range("A2","E$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A$(($Dest.ActiveSheet.UsedRange.Rows|Select -last 1).row+1)").Select()
}
[void]$Dest.ActiveSheet.Paste()
$Source.Close()
}
$Dest.SaveAs("$path\Merge.xls")
$Dest.close()
$Excel.Quit()
Suggest any effective way to do this. Please provide links if available.
Convert XLS to XLSX :
$xlFixedFormat = [Microsoft.Office.Interop.Excel.XlFileFormat]::xlWorkbookDefault
$excel = New-Object -ComObject excel.application
$excel.visible = $true
$folderpath = "C:\Users\xxxx\Downloads\report\*"
$filetype ="*xls"
Get-ChildItem -Path $folderpath -Include $filetype |
ForEach-Object `
{
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
"Converting $path to $filetype..."
$workbook = $excel.workbooks.open($_.fullname)
$workbook.saveas($path, $xlFixedFormat)
$workbook.close()
}
$excel.Quit()
$excel = $null
[gc]::collect()
[gc]::WaitForPendingFinalizers()
If you are willing to use the external module Import-Excel, you could simply loop through the files like so:
$report_directory = ".\reports"
$merged_reports = #()
# Loop through each XLSX-file in $report_directory
foreach ($report in (Get-ChildItem "$report_directory\*.xlsx")) {
# Loop through each row of the "current" XLSX-file
$report_content = foreach ($row in Import-Excel $report) {
# Create "custom" row
[PSCustomObject]#{
"Client name" = $report.Name
"Date" = $row."Date"
"Downtime" = $row."Downtime"
"Response" = $row."Response"
}
}
# Add the "custom" data to the results-array
$merged_reports += #($report_content)
}
# Create final report
$merged_reports | Export-Excel ".\merged_report.xlsx"
Please note that this code is not optimized in terms of performance but it should allow you to get started