How to modify Excel data and export to text file using PowerShell script?

First time poster here. Apologies if I am not following best practices for posting this question.
I am very new to scripting and PowerShell.
Problem:
I have data in an Excel sheet in this format.
Excel Data Image Link
I want to modify this data and export it to a text file in this format.
Required Output Image Link
So far I have tried to modify the Excel data by accessing each cell, using code similar to the following.
for (($i = 1); $i -lt 4; $i++)
{
$column=$ExcelWorkSheet.Columns.Item(1).Rows.Item($i).Text
$dataType=$ExcelWorkSheet.Columns.Item(2).Rows.Item($i).Text
$c1=("`"" + "$column" + "`""+":")
$c2=("`"" + "$dataType" + "`"" + ",")
$ExcelWorkSheet.Columns.Item(1).Rows.Item($i).Value=$c1
$ExcelWorkSheet.Columns.Item(2).Rows.Item($i).Value=$c2
}
I am still not sure if this is the correct way to go.
What would be the best way to solve this?
Just want to understand what I should do to solve this problem. I am not looking for the exact code.
Step by step instructions or some resources would be helpful.
Thanks!

This might help... maybe...
# Import Stuff
$Data = Import-Csv -Path .\Desktop\data.csv
# New Array
$Output = @()
# Run through Unique Owners
foreach ($Owner in ($Data | Select-Object OWNER -Unique)) {
$Lines = $Data | Where-Object {$_.OWNER -eq $Owner.OWNER}
# Lazy way to do a bit of checking, if same then use it or Break
if ($Lines[0].TABLE_NAME -eq $Lines[1].TABLE_NAME) {
$Out_TableName = $Lines[0].TABLE_NAME
# ID and NAME data
$Out_ID = $Lines | Where-Object {$_.COLUMN_NAME -eq "ID"} | Select-Object COLUMN_NAME, DATA_TYPE, DATA_LENGTH
$Out_NAME = $Lines | Where-Object {$_.COLUMN_NAME -eq "NAME"} | Select-Object COLUMN_NAME, DATA_TYPE, DATA_LENGTH
} else {
# Show the user that something is wrong with the data
Write-Host "Problem with Owner ""$($Owner.OWNER)"" Data?!" -ForegroundColor Red
Break
}
# Output into the array in format
$Output += @"
"$($Owner.OWNER).$($Out_TableName)":{
"$($Out_ID.COLUMN_NAME)": "$($Out_ID.DATA_TYPE) ($($Out_ID.DATA_LENGTH))",
"$($Out_NAME.COLUMN_NAME)": "$($Out_NAME.DATA_TYPE) ($($Out_NAME.DATA_LENGTH))"
}
"@
}
# Put Output in a text file
$Output | Set-Content .\Desktop\output.txt -Force
I should add, that I had your data in a CSV like this...
OWNER,TABLE_NAME,COLUMN_NAME,DATA_TYPE,DATA_LENGTH
A,Employee,ID,NUMBER,22
A,Employee,NAME,VARCHAR2,22
B,Department,ID,NUMBER,23
B,Department,NAME,VARCHAR2,24
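If a table has more than the two ID and NAME columns, the same idea can be written more generally with Group-Object. This is only a minimal sketch, assuming the same CSV layout as above; the inline sample data and the variable names are for illustration only.

```powershell
# Sample data inline (assumption: same columns as the CSV above)
$Data = @'
OWNER,TABLE_NAME,COLUMN_NAME,DATA_TYPE,DATA_LENGTH
A,Employee,ID,NUMBER,22
A,Employee,NAME,VARCHAR2,22
B,Department,ID,NUMBER,23
B,Department,NAME,VARCHAR2,24
'@ | ConvertFrom-Csv

# One group per OWNER/TABLE_NAME pair
$Output = foreach ($Group in ($Data | Group-Object OWNER, TABLE_NAME)) {
    $First = $Group.Group[0]
    # One '"COLUMN": "TYPE (LENGTH)"' entry per row in the group
    $Columns = foreach ($Row in $Group.Group) {
        '"{0}": "{1} ({2})"' -f $Row.COLUMN_NAME, $Row.DATA_TYPE, $Row.DATA_LENGTH
    }
    # Opening line, the comma-separated column entries, closing brace
    '"{0}.{1}":{{' -f $First.OWNER, $First.TABLE_NAME
    $Columns -join ",`n"
    '}'
}
$Output
```

$Output can then be piped to Set-Content as in the script above.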

Related

How to change the background color of an Excel report

Basically, I am using this script to run through all the .csv files in a specific folder and merge them all together. But after the merge, I still want to change the background color of each .csv file.
The script that I got so far does not do that, can't figure out how to do it as I am really new in PowerShell.
# Get all the information from .csv files that are in the $IN_FILE_PATH skipping the first line:
$getFirstLine = $true
get-childItem "$IN_FILE_PATH\*.csv" | ForEach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch ($getFirstLine) {
$true { $lines }
$false { $lines | Select -Skip 1 }
}
# Import all the information... and transfer to the new workbook.
$Report_name = $((get-date).ToString("yyyy.MM.dd-hh.mm"))
$getFirstLine = $false
Add-Content "$OUT_FILE_PATH\Report $Report_Name.csv" $linesToWrite
}
e.g. The .csv file has this pattern:
Name Age
Richard 18
Carlos 20
Jonathan 43
Mathew 25
Making sure to understand that Richard (18 years old) and Carlos (20 years old) are from filenumber1.csv - Jonathan (43 years old) and Mathew (25 years old) are from filenumber2.csv
I want Carlos' and Richard's rows to be with a white background, whereas Jonathan's and Mathew's rows to be grey. So that repeats in white-grey-white-grey dividing it by each file.
I am trying to make the final report friendlier to read, so that you can see this separation from file to file more clearly.
Any ideas?
As Vivek Kumar Singh mentioned in comments, .csv doesn't contain any formatting options. It's recommended to work with Excel file instead. And for that purpose, the best module I know and use is ImportExcel.
The code to set formatting is as below (inspired by this thread):
$IN_FILE_PATH = "C:\SO\56870016"
# mkdir $IN_FILE_PATH
# cd $IN_FILE_PATH
# rm out.xlsx
# Define colors
$colors = "White", "Gray"
# Initialization
$colorsTable = @()
$data = @()
$n = 0
Get-ChildItem "$IN_FILE_PATH\*.csv" | % {
$part = Import-Csv $_
$data += $part
for ($i = 0; $i -lt ($part).Count; $i++) {
$colorsTable += $colors[$n%2]
}
$n++
}
$j = 0
$data | Export-Excel .\out.xlsx -WorksheetName "Output" -Append -CellStyleSB {
param(
$workSheet,
$totalRows,
$lastColumn
)
foreach($row in (2..$totalRows )) {
# The magic happens here
# We set colors based on the list which was created while importing
Set-CellStyle $workSheet $row $LastColumn Solid $colorsTable[$j]
$j++
}
}
Hopefully the comments in the code help you better understand what's going on.

Powershell comparing data in a CSV against files in a folder

I'm fairly new to powershell.
I'm trying to compare data in a CSV File against random files in a specific folder.
I want to see if and what has changed and then log that in another column called "Changed".
Here's what I've done below, it seems to create a new column called 'Changed' but doesn't input the changes in it.
$Spreadsheet = 'C:\Powershell\CSV\inv.csv'
$SpreadSheetPath = "C:\Powershell\CSV"
Import-Csv $Spreadsheet -Delimiter "|" -Encoding Default | ForEach-Object -
{
$Path += $_.Path
$Filename += $_.Filename
$DateModified += $_.DateModified
$FileSize += $_.FileSize
$MD5Hash += $_.MD5Hash
}
{
$Msg1 = "Path changed"
$Msg2 = "File Name changed"
$Msg3 = "Date Modified changed"
$Msg4 = "File Size changed"
$Msg5 = "MD5 changed"
$Msg6 = "Files are the same"
$psdata = "D:\ps-test\data\*.*"
}
If (($Path -eq $psdata))
{
Import-Csv C:\Powershell\CSV\inv.csv |
Select-Object *,@{Name='Changed';Expression={$Msg6}} |
Export-Csv C:\Powershell\CSV\NewSpreadsheet4.csv
}
Else
{
Import-Csv C:\Powershell\CSV\inv.csv |
Select-Object *,@{Name='Changed';Expression={$Msg1}} |
Export-Csv C:\Powershell\CSV\NewSpreadsheet4.csv
}
Here is an example of what the CSV looks like:
Path Filename Date Modified File Size MD5 Hash
D:\ps-test\data adminmodeinfo.htm 03/11/2010 22:42 1079 BD1C9468D71FD33BB35716630C4EC6AC
E:\ps-test\data admintoolinfo.htm 03/11/2010 22:42 868 24B99B6316F0C49C23F27FEA6FF1C6AC
E:\ps-test\data admin_ban.bmp 03/11/2010 22:42 63480 C856F1F3C58962B456E749F2EA9C933A
E:\ps-test\data baseline.dat 03/20/2010 03:18:33 173818 F13183D88AABD1A725437802F8551A06
E:\ps-test\data blueRule.gif 03/11/2010 22:42 815 D1AEFE884935095DAB42DAFD072AA46F
E:\ps-test\data deffactory.dat 03/20/2010 03:18:33 706 862D4DFD2F49021BB7C145BDAFE62F6F
E:\ps-test\data dividerArt.jpg 03/11/2010 22:42 367 F7050C596C097C0B01A443058CD15E35
There are many issues with your code. I will try to highlight a few of them, link to documentation and point you in the right direction so that you can resolve them. A proper solution would require getting many more requirements, or writing code (off-topic for StackOverflow).
Change
| ForEach-Object -
{
to
| ForEach-Object {
In the Foreach-Object, you are concatenating values from each line because you are using +=.
On the first run, $Path contains D:\ps-test\data.
After the second run, it contains D:\ps-test\dataE:\ps-test\data.
At the end of your test data, it contains D:\ps-test\dataE:\ps-test\dataE:\ps-test\dataE:\ps-test\dataE:\ps-test\dataE:\ps-test\dataE:\ps-test\data
The messages are contained in a script block, but it does not look like this is intentional as this is never executed. So after the scriptblock, the variable $Msg1 has not been created; it's blank.
If (($Path -eq $psdata))
The double brackets are not required.
The condition will always be false because the variable $psdata does not exist, as it was only assigned inside a script block that never runs.
It will also always be false because you are comparing the strings literally; your input does not literally contain "D:\ps-test\data\*.*". You probably want -like instead of -eq.
It will always be inaccurate because even if the paths are compared, there is no check that the file actually exists on the system.
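To make the difference between -eq and -like concrete, here is a small sketch using the pattern from the question. The file name is taken from the sample CSV; the result variable names are made up for illustration.

```powershell
$psdata = 'D:\ps-test\data\*.*'

# -eq compares the strings literally; the wildcard characters are not expanded
$literal = 'D:\ps-test\data\adminmodeinfo.htm' -eq $psdata

# -like treats * as a wildcard, so a full file path matches the pattern
$wildcard = 'D:\ps-test\data\adminmodeinfo.htm' -like $psdata

# The Path column alone holds only the folder, so it does not match the pattern
$folderOnly = 'D:\ps-test\data' -like $psdata

"literal: $literal, wildcard: $wildcard, folderOnly: $folderOnly"
```

So even with -like, the Path and Filename columns would need to be joined (for example with Join-Path) before comparing against a pattern like this.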
Useful links
Test-Path to check if file exists.
Get-FileHash to get MD5 hash and compare to file.
Get-ChildItem to get a list of directories/files in a directory.
Write-Output so that you can print variables and make sure they contain what you expect.
about_comparison_operators - -in and -contains will help you.
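As a rough illustration of how Test-Path and Get-FileHash could fit together, here is a minimal sketch. Get-FileStatus is a hypothetical helper name, the messages are borrowed from the question, and the temporary file only exists to keep the example self-contained.

```powershell
# Hypothetical helper: compares a file on disk with a recorded MD5 hash
function Get-FileStatus {
    param(
        [string]$Path,
        [string]$ExpectedMD5
    )
    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) {
        return 'File does not exist'
    }
    $actual = (Get-FileHash -LiteralPath $Path -Algorithm MD5).Hash
    if ($actual -ne $ExpectedMD5) {
        return 'MD5 changed'
    }
    return 'Files are the same'
}

# Self-contained demo using a temporary file
$file = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $file -Value 'original content'
$recorded = (Get-FileHash -LiteralPath $file -Algorithm MD5).Hash

$same = Get-FileStatus -Path $file -ExpectedMD5 $recorded
Set-Content -LiteralPath $file -Value 'edited content'
$changed = Get-FileStatus -Path $file -ExpectedMD5 $recorded
Remove-Item -LiteralPath $file
$missing = Get-FileStatus -Path $file -ExpectedMD5 $recorded

$same, $changed, $missing
```

In the real script the path would come from joining the Path and Filename columns, and the expected hash from the MD5Hash column.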
This is a suggestion to help you get started. It's not complete and not tested! Let me know if it works as expected and if you have any questions.
Import-Csv 'C:\Powershell\CSV\inv.csv' -Delimiter "|" -Encoding Default | foreach {
# Assign (=) rather than append (+=) so each row is handled on its own
$Path = $_.Path
$Filename = $_.Filename
$DateModified = $_.DateModified
$FileSize = $_.FileSize
$MD5Hash = $_.MD5Hash
$file = [System.IO.FileInfo](Join-Path $Path $Filename)
if (-not $file.Exists) {
$message = "File does not exist"
}
elseif ($file.LastWriteTime -ne [DateTime]$DateModified) {
$message = "Dates differ"
}
elseif ($file.Length -ne [int]$FileSize) {
$message = "Sizes differ"
}
# and so on...
# (You cannot really compare a changed file name, by the way)
New-Object -Type PSObject -Prop @{
Path = $Path
Filename = $Filename
DateModified = $DateModified
FileSize = $FileSize
MD5Hash = $MD5Hash
Message = $message
}
} | Export-CSV 'C:\Powershell\CSV\NewSpreadsheet4.csv'

List down column headers and get the maximum length of string per column

I'm looking for a translation of my Excel formula in a form of a script in Powershell, vbscript or Excel VBA. I'm trying to get the list of column headers and the max length of string under it.
Normally, what I do is manually open the .txt file in Excel; from there I can get the header names. Next, I create an array formula, =MAX(LEN(A1:A100,000)) for example. This gets the max length of string in the column. I apply the same formula to the other columns.
Right now I can't do this since the files have increased to 1GB in size and I can't open them anymore; my desktop crashes. It may also be because they have more than 1 million rows, which Excel can't handle. My friend suggested PowerShell but I have limited knowledge there. I don't know if it can be done in VBScript or Excel VBA.
Thanks in advance for your help.
Below code works for .csv files but does not with .txt delimited files -
$fileName = "C:\Desktop\EFile.csv"
<#
Sample format of c:\temp\data.csv
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
#>
$colCount = (Import-Csv $fileName | Get-Member | Where-Object {$_.MemberType -eq 'NoteProperty'} | Measure-Object).Count
$csv = Import-Csv $fileName
$csvHeaders = ($csv | Get-Member -MemberType NoteProperty).name
$dict = @{}
foreach($header in $csvHeaders) {
$dict.Add($header,0)
}
foreach($row in $csv)
{
foreach($header in $csvHeaders)
{
if($dict[$header] -le ($row.$header).Length)
{
$dict[$header] =($row.$header).Length
}
}
}
$dict.Keys | % { "key = $_ , Column Length = " + $dict.Item($_) }
This is how I get my data.
$data = @"
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
"@
$csv = ConvertFrom-Csv -Delimiter ',' $data
But you should get your data like this
$fileName = "C:\Desktop\EFile.csv"
$csv = Import-Csv -Path $fileName
And then
# Extract the header names
$headers = $csv | Get-Member -MemberType NoteProperty | Select-Object -ExpandProperty Name
# Capture output in $result variable
$result = foreach($header in $headers) {
# Select all items in the $header column and take the maximum.
# Note: for strings, Measure-Object -Maximum compares alphabetically, so this is the alphabetically last value, not necessarily the longest
$maximum = $csv | Select-Object -ExpandProperty $header | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum
# Generate new object holding the information.
# This will end up in $result
[pscustomobject]@{
Header = $header
Max = $maximum.Length
String = $maximum
}
}
# Simple output
$result | Format-Table
This is what I get:
Header Max String
------ --- ------
address 10 test134343
grade 7 Grade-9
id 1 3
name 4 John
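Note that the name row reports John (4) although Cathy (5) is longer, because the maximum is taken alphabetically rather than by length. If the longest value is what is needed, one possible variant is to sort each column by string length instead. This is a sketch on the same sample data, not a tested replacement for large files.

```powershell
$csv = @'
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
'@ | ConvertFrom-Csv

# Header names in file order
$headers = $csv[0].psobject.Properties.Name

$result = foreach ($header in $headers) {
    # Sort the column's values by string length and keep the longest one
    $longest = $csv | Select-Object -ExpandProperty $header |
        Sort-Object -Property Length -Descending | Select-Object -First 1
    [pscustomobject]@{
        Header = $header
        Max    = $longest.Length
        String = $longest
    }
}
$result | Format-Table
```

When several values share the maximum length (as in the id and grade columns), which of them ends up in String is not something to rely on.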
Alternatively, if you have memory issues dealing with large files, you may have to get a bit more dirty with the .NET framework. This snippet processes one csv line at a time, instead of reading the entire file into memory.
$fileName = "$env:TEMP\test.csv"
$delimiter = ','
# Open a StreamReader
$reader = [System.IO.File]::OpenText($fileName)
# Read the headers and turn it into an array, and trim away any quotes
$headers = $reader.ReadLine() -split $delimiter | % { $_.Trim('"''') }
# Prepare a hashtable for the results
$result = @{}
# So long as there's more data, keep running
while(-not $reader.EndOfStream) {
# Read a single line and process it as csv
$csv = $reader.ReadLine() | ConvertFrom-Csv -Header $headers -Delimiter $delimiter
# Determine if the item in the result hashtable is smaller than the current, using the header as a key
foreach($header in $headers) {
$item = $csv | Select-Object -ExpandProperty $header
if($result[$header].Maximum -lt $item.Length) {
$result[$header] = [pscustomobject]@{
Header = $header
Maximum = $item.Length
String = $item
}
}
}
}
# Clean up our spent resource
$reader.Close()
# Simple output
$result.Values | Format-Table

Return duplicate names (including partial matches)

Excel guy here who occasionally turns to automating PowerShell via VBA.
I tried to solve https://stackoverflow.com/q/36538022/641067 (now closed) and couldn't get there with my basic powershell knowledge and googlefu alone.
In essence the problem the OP presented is:
(1) There is a list of names in a text file.
(2) The aim is to capture only those names that occur more than once (so discard unique names, see point (3)).
(3) Names occurring more than once include partial matches, i.e. Will and William can be considered duplicates and should be retained, whereas Bill is not a duplicate of William.
I tried various approaches including
Group
Compare-Object (see example below)
But I was stymied by part (3). I suspect that a loop is required to do this but am curious whether there is a direct PowerShell approach.
Looking forward to hearing from the experts.
What I tried:
$a = Get-Content "c:\temp\in.txt"
$b = $a | select -unique
[regex] $a_regex = '(?i)(' + (($a | foreach {[regex]::escape($_)}) -join "|") + ')'
$c = $b -match $a_regex
Compare-Object -ReferenceObject $c -IncludeEqual $a
The following test script, using a loop, would work for the rules you outlined and looks foolproof to me:
$t = ('first', 'will', 'william', 'williamlong', 'unique', 'lieve', 'lieven')
$s = $t | sort-object
[String[]]$r = @()
$i = 0;
while ($i -lt $s.Count - 1) {
if ($s[$i+1].StartsWith($s[$i])) {
$r += $s[$i]
$r += $s[$i+1]
}
$i++
}
$r | Sort-Object -Unique
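One caveat with comparing only neighbouring entries of the sorted list: if an unrelated longer name sorts between a short name and its longer form (for example will, willa, william), the last one is not picked up. A possible variant that compares every name against all others is sketched below; the sample list (including willa and bill) is made up for illustration, and the nested loop is quadratic, so it is only meant for modest list sizes.

```powershell
$names = 'first', 'will', 'willa', 'william', 'unique', 'lieve', 'lieven', 'bill'

# Keep a name if it is a prefix of another entry, or another entry is a prefix of it
$result = foreach ($i in 0..($names.Count - 1)) {
    $name = $names[$i]
    foreach ($j in 0..($names.Count - 1)) {
        if ($i -eq $j) { continue }
        $other = $names[$j]
        if ($name.StartsWith($other, [System.StringComparison]::OrdinalIgnoreCase) -or
            $other.StartsWith($name, [System.StringComparison]::OrdinalIgnoreCase)) {
            $name    # emit the name once ...
            break    # ... and stop checking the remaining entries
        }
    }
}
$result
```

Exact duplicates in the input are emitted once per occurrence, so pipe the result through Sort-Object -Unique if each name should appear only once.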
And the following test script using a regex might get you started.
$content = "nomatch`nevenmatch1`nevenmatch12`nunevenmatch1`nunevenmatch12`nunevenmatch123"
$string = (($content.Split("`n") | Sort-Object -Unique) -join "`n")
$regex = [regex] '(?im)^(\w+)(\n\1\w+)+'
$matchdetails = $regex.Match($string)
while ($matchdetails.Success) {
$matchdetails.Value
$matchdetails = $matchdetails.NextMatch()
}

PowerShell - paste data into Excel

Today I have just thrown together this PowerShell script which
takes a tab-delimited text file,
reads it into memory,
makes a variable number of filter queries based on distinct values of a certain column
creates a new empty Excel workbook
adds each of the subsets of filtered data to a new Excel worksheet
The last step is where I am stuck. Currently my code puts a few lines of data into a range in the worksheet, in the form of unrolled/transposed "key: value" entries, resulting in a horizontal data layout. The same range of data is always overwritten.
I want data in the form of a vertical layout, i.e., data in columns, just the same way as if the CSV file was imported with the import-file-wizard of MS Excel.
Is there a simpler way to do it than below?
I admit, some of the PowerShell features are pasted in here in a cargo-cult mode of programming. Please note that I have no PowerShell experience whatsoever. I did some batchfile, VBScript, and VBA coding a few years back. So, other criticisms are also welcome.
PARAM (
[Parameter(ValueFromPipeline = $true)]
$infile = ".\04-2011\110404-13.txt"
)
PROCESS {
echo " $infile"
Write-Host "Num Args:" $args.Length;
$xl = New-Object -comobject Excel.Application;
$xl.Visible = $true;
$Workbook = $xl.Workbooks.Add();
$content = Import-Csv -delimiter "`t" $infile;
$ports = $content | Select-Object Port# | Sort-Object Port# -Unique -Descending;
$ports | ForEach-Object {
$p = $_;
Write-Host $p.{Port#};
$Worksheet = $Workbook.Worksheets.Add();
$workSheet.Name = [string]::Format("{0} {1}", "PortNo", $p.{Port#});
$filtered = $content | Where-Object {$_.{Port#} -eq $p.{Port#} };
$filtered | ForEach-Object {
Write-Host $_.{ObsDateTime}, $_.{Port#}
}
$filtered | clip.exe;
$range = $Workbook.ActiveSheet.Range("a2", "a$($filtered.count)");
$Workbook.ActiveSheet.Paste($range, $false);
}
$xl.Quit()
}
Data Output Example
Wrong
Port# : 1
Obs# : 1
Exp_Flux : 0,99
IV Cdry : 406.96
IV Tcham : 16.19
IV Pressure : 100.7
IV H2O : 9.748
IV V3 : 11.395
IV V4 : 0.759
IV RH : 53.12
Right
Port# Obs# Exp_Flux IV Cdry IV Tcham IV Pressure IV H2O IV V3 IV V4 IV RH
1 1 0,99 406.96 16.19 100.7 9.748 11.395 0.759 53.12
Try Export-Xls, it looks very nice. I never had the chance to use it, but (virtually) knowing the person who worked on it, I'm sure you will be very happy with it. If you go with it, feedback here would be appreciated.
POSSIBLE WORKAROUND FOR UNORDERED PROPERTIES IN Export-Xls
The function Add-Array2Clipboard could be changed so that it accepts a new input parameter: an array providing the name of the properties ordered as required.
Then you can change the section where Get-Member is used. Silly example:
"z", "a", "c" | %{ get-member -name $_ -inputobject $thecurrentobject }
This is just an example on how you can achieve ordered properties from get-member.
I've used the $Workbook.ActiveSheet.Cells.Item($row, $col).Value2 property to pinpoint more precisely where to put the data when exporting to Excel.
Something like
$row = 1
Get-Content $file | Foreach-Object {
$cols = $_.split("`t")
for ($i = 0; $i -lt $cols.Count; $i++)
{
$Workbook.ActiveSheet.Cells.Item($row, $i+1).Value2 = $cols[$i]
}
$row++
}
Warning: dry-coded! You'll probably need some try..catch as well.
I used a modified Export-Xls function, a bit different from what user empo suggested.
This is my call to it
Export-Xls $filtered -Path $outfile -WorksheetName "$wn" -SheetPosition "end" | Out-Null # -SheetPosition "end";
However, the current release of Export-Xls re-orders the columns of the in-memory representation of the CSV text file. I want the data columns of the text file in their original order, so I had to hack and simplify the original code as follows:
function Add-Array2Clipboard {
param (
[PSObject[]]$ConvertObject,
[switch]$Header
)
process{
$array = @();
$line =""
if ($Header) {
$line = @()
$row = $ConvertObject | Select -First 1
$row.psobject.properties | Foreach {$line += "$($_.Name)" }
$array += [String]::Join("`t", $line)
}
else {
foreach($row in $ConvertObject){
$line =""
$vals = @()
$row.psobject.properties | Foreach {$vals += $_.Value}
$array += [String]::Join("`t", $vals)
}
}
$array | clip.exe;
}
}
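The reason this keeps the original column order is that psobject.Properties enumerates properties in the order they were defined, whereas Get-Member lists them sorted by name. A small sketch of the difference, using a made-up object with properties z, a and c:

```powershell
# Made-up sample object; the property names echo the "z", "a", "c" example above
$row = [pscustomobject]@{ z = 1; a = 2; c = 3 }

# Get-Member lists the members sorted by name
$sorted = ($row | Get-Member -MemberType NoteProperty).Name

# psobject.Properties keeps the order in which the properties were defined
$original = $row.psobject.Properties.Name

# Tab-delimited header and data lines in the original order
$headerLine = $original -join "`t"
$dataLine   = $row.psobject.Properties.Value -join "`t"

"Get-Member order: $($sorted -join ', ')"
"Property order:   $($original -join ', ')"
```

Objects produced by Import-Csv behave the same way, so their properties come back in the column order of the file.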

Resources