Copying an entire column from one CSV file to another using Powershell - excel

I have two CSV files: File1.csv has one column with 4000+ rows. File2.csv has 200 columns with 10000+ rows of content. I want to add the one column in File1.csv as an additional column in File2.csv. I am OK adding it to the end (rightmost) of the existing file. I have found several options online, but none has worked as desired. I can get it done with the Import-Csv cmdlet and adding a property, but that takes more than an hour to execute. Is there any way to do this without having to convert the CSV content into PSObjects? I have used Get-Content and Set-Content in the past, but that appends one file to the bottom of the other. Is there any way to do something similar but append to the right of the existing file?
Here is the piece of code that has gotten me closest to what I need. The problem with this one is that Excel is not saving or closing. Any ideas on how to solve this, either by fixing the code below or with an easier/more efficient approach?
$source = "C:\Users\Desktop\Script_Development\04-16-2015\Bit.csv"
$dest = "C:\Users\Desktop\Script_Development\04-16-2015\MergedwithHeader_04-16-2015.csv"
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $false
$Workbooksource = $excel.Workbooks.open($source)
$Worksheetsource = $Workbooksource.WorkSheets.item("Bit")
$Worksheetsource.activate()
$range = $Worksheetsource.Range("A1").EntireColumn
$range.Copy() | out-null
$Workbookdest = $excel.Workbooks.open($dest)
$Worksheetdest = $Workbookdest.Worksheets.item("MergedwithHeader_04-16-2015")
$Range = $Worksheetdest.Range("FT1")
$Worksheetdest.Paste($range)
$Workbookdest.SaveAs("C:\Users\Desktop\Script_Development\04-16-2015\MergedwithHeader_04-16-2015.xls")
$Excel.quit()

The following code will loop through the lines of a file. You could use this to read each line into an ArrayList.
$FileData = Get-Content "$Filename"
foreach ($i in $FileData)
{
DoSomethingWithLine($i)
}
Then you loop through the other file, and combine each line with a line that is stored in the ArrayList, concatenating it with the necessary commas and quotes, and append each line to a new file using Add-Content.
There would be numerous other and more sophisticated ways to do this.
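A minimal sketch of the line-by-line approach described above, assuming neither file contains quoted fields with embedded line breaks (so one text line is one CSV row). The function name Join-CsvColumn and the sample data are made up for illustration.

```powershell
# Hypothetical helper: appends one value from $Column to the end of each line in $Rows.
function Join-CsvColumn {
    param(
        [string[]]$Rows,    # lines of the wide file (File2.csv in the question)
        [string[]]$Column   # lines of the single-column file (File1.csv)
    )
    for ($i = 0; $i -lt $Rows.Length; $i++) {
        # If the column file is shorter, pad the remaining rows with an empty field.
        $value = if ($i -lt $Column.Length) { $Column[$i] } else { '' }
        $Rows[$i] + ',' + $value
    }
}

# Sample data standing in for the two files.
$wide  = 'A,B', '1,2', '3,4'
$extra = 'Bit', 'x'
$merged = @(Join-CsvColumn -Rows $wide -Column $extra)
$merged
```

With real files the two arrays could come from Get-Content on each file and the result could be piped to Set-Content with a new output path of your choosing. Because it only concatenates strings, it avoids building an object per row.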

Related

Pull data from a specified row in Excel spreadsheet

I'm working on a PS script to take a row of data from an Excel spreadsheet and populate that data in certain places in a Word document. To elaborate, we have a contract tracking MASTER worksheet that among other things contains data such as name of firm, address, services, contact name. Additionally, we have another TASK worksheet in the same workbook that tracks information such as project owner, project name, contract number, task agree number.
I'm writing a script that does the following:
Ask the user through a message box what kind of contract is being written ("Master", or "Task")
Opens the workbook with the appropriate worksheet opened ("Master" tab or "Task" tab)
Asks the user through a VB InputBox from which Excel row of data they want to use to populate the Word contract
Extracts that row of data from Excel
Outputs certain portions of that row of data to certain location in a Word document
Saves the Word document
Opens the Word document so the user can continue editing it
My question is this: using something like PSExcel, how do I extract that row of data into variables that can be placed in a Word document? For reference, in case you're going to reply with a snippet of code, here is how the variables are defined in the Excel portion of my script:
$Filepath = "C:\temp\ContractScript\Subconsultant Information Spreadsheet.xlsx"
$Excel = New-Object -ComObject Excel.Application
$Workbook = $Excel.Workbooks.Open($Filepath)
$Worksheet = $Workbook.sheets.item($AgreementType)
$Excel.Visible = $true
#Choosing which row of data
[int]$RowNumber = [Microsoft.VisualBasic.Interaction]::InputBox("Enter the row of data from $AgreementType worksheet you wish to use", "Row")
Additionally, the first row of each Excel worksheet holds the column headings, in case it matters.
I've gotten this far so far:
import-module psexcel
$Consultant = new-object System.Collections.Arraylist
foreach ($data in (Import-XLSX -path $Filepath -Sheet $AgreementType -RowStart $RowNumber))
{
$Consultant.Add($data)
}
But I'm currently stuck because I can't figure out how to reference the data being added to $consultant.$data. Somehow I need to read in the column headings first so the $data variable can be defined in some way, so when I add the variable $consultant.Address in Word it finds it. Right now I think the variable name is going to end up "$Consultant.1402 S Broadway" which obviously won't work.
Thanks for any help. I'm fairly new to powershell scripting, so anything is much appreciated.
I have the same issue, and searching online for solutions is a royal PITA.
I'd love to find a simple way to loop through all of the rows like you're doing.
$myData = Import-XLSX -Path "path to the file"
foreach ($row in $myData.Rows)
{
$row.ColumnName
}
But sadly something logical like that doesn't seem to work. I see examples online that use ForEach-Object and Where-Object which is cumbersome. So any good answers to the OP's question would be helpful for me too.
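For what it's worth, here is a sketch of how header-named properties behave, using ConvertFrom-Csv on made-up rows as a stand-in for the spreadsheet. It assumes Import-XLSX likewise emits one object per data row with a property per column heading (worth checking against your PSExcel version). If so, the result is iterated directly rather than through a .Rows member, and a value is read as $row.Address.

```powershell
# Made-up sample standing in for the worksheet; the first line is the heading row.
$lines = 'Firm,Address,Contact',
         'Acme Consulting,1402 S Broadway,Jane Doe',
         'Example Engineering,77 Main St,John Roe'
$rows = @($lines | ConvertFrom-Csv)

# Each element is one data row; the column headings are its property names.
foreach ($row in $rows) {
    '{0} is at {1}' -f $row.Firm, $row.Address
}

# Picking a single row by its spreadsheet row number (row 1 is the headings,
# so spreadsheet row 2 is index 0).
$RowNumber = 2
$Consultant = $rows[$RowNumber - 2]
$Consultant.Address
```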
UPDATE:
Matthew, thanks for coming back and updating the OP with the solution you found. I appreciate it! That will help in the future.
For my current project, I went about this a different way since I ran into a lack of good examples for Import-XLSX. It's just quick code to do a local task when needed, so it's not in a production environment. I changed variable names, etc. to show an example:
$myDataField1 = New-Object Collections.Generic.List[String]
$myDataField2 = New-Object Collections.Generic.List[String]
# ...
$myDataField10 = New-Object Collections.Generic.List[String]
# PSExcel is a third-party module; install it first if needed
Import-Module PSExcel
# Get spreadsheet, workbook, then sheet
try
{
$mySpreadsheet = New-Excel -Path "path to my spreadsheet file"
$myWorkbook = $mySpreadsheet | Get-Workbook
$myWorksheet = $myWorkbook | Get-Worksheet -Name Sheet1
}
catch
{
# whatever error handling code you want
}
# calculate total number of records
$recordCount = $myWorksheet.Dimension.Rows
$itemCount = $recordCount - 1
# specify column positions
$r, $my1stColumn = 1, 1
$r, $my2ndColumn = 1, 2
# ...
$r, $my10thColumn = 1, 10
if ($recordCount -gt 1)
{
# loop through all rows and get data for each cell's value according to column
for ($i = 1; $i -le $recordCount - 1; $i++)
{
$myDataField1.Add($myWorksheet.Cells.Item($r + $i, $my1stColumn).text)
$myDataField2.Add($myWorksheet.Cells.Item($r + $i, $my2ndColumn).text)
# ...
$myDataField10.Add($myWorksheet.Cells.Item($r + $i, $my10thColumn).text)
}
}
#loop through all imported cell values
for ([int]$i = 0; $i -lt $itemCount; $i++)
{
# use the data
$myDataField1[$i]
$myDataField2[$i]
# ...
$myDataField10[$i]
}

Adding Same Column Header Names to all worksheets in an Excel Book

I have a script that will create worksheets based on the number of files that it finds in a directory. From there it changes the name of the sheets to the file name. During that process, I am attempting to add two column header values, "Hostname" and "IP Address", to every sheet. I can achieve this by activating each sheet individually, but this becomes rather cumbersome once the number of sheets goes past 20, so I am trying to find a dynamic way of doing this regardless of the number of sheets present.
This is the code that I have to do everything up to the column header portion:
$WorksheetCount = (Get-ChildItem $Path\Info*.txt).count
$TabNames = Get-ChildItem $Path\Info*.txt
$NewTabNames = Foreach ($file IN $TabNames.Name){$file.Substring(0,$file.Length-4)}
$Break = 0
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $true
$Workbook = $Excel.Workbooks.Add()
$null = $Excel.Worksheets.Add($MissingType, $Excel.Worksheets.Item($Excel.Worksheets.Count),
$WorksheetCount - $Excel.Worksheets.Count, $Excel.Worksheets.Item(1).Type)
1..$WorksheetCount
Start-Sleep -s 1
ForEach ($Name In $NewTabNames){
$Break++
$Excel.Worksheets.Item($Break).Name = $Name
}
I have attempted to insert my code as such:
ForEach ($Name In $NewTabNames){
$Break++
$Excel.Worksheets.Item($Break).Name = $Name
$cells=$Name.Cells
$cells.item(1,1)="Hostname"
$cells.item(1,2)="IP Address"
}
When I attempt to run the script, I get the following error:
You cannot call a method on a null-valued expression.
And then it proceeds to list each line of the code that I had put in. I thought that since I created a variable during the operation, that it was the issue:
$cells=$Name.Cells
I thought that perhaps moving it before the ForEach command would resolve it, but I still receive the same error. I have looked through various ways of selecting ranges of sheets within Excel via PowerShell but have not found anything helpful.
Would appreciate any assistance on this.
This is actually my first post on Stack Overflow and I'm pretty excited to finally help out. I made some small modifications to your code and it seems to work fine. I noticed some odd behavior when I removed the $null assignment, which had seemed strange to me: after removing it, my Outlook application opened by itself every time I ran the script. I also found the site where you got the code from, just to see if there were any changes to the original code.
I found this Microsoft documentation very helpful to figure this out.
This is what I modified
ForEach ($Name In $NewTabNames){
$Break++
$Excel.Worksheets($Break).Name = $Name
$Excel.Worksheets($Break).Cells(1,1).Font.Bold = $true
$Excel.Worksheets($Break).Cells(1,1) = "Hostname"
$Excel.Worksheets($Break).Cells(1,2).Font.Bold = $true
$Excel.Worksheets($Break).Cells(1,2) = "IP Address"
}
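As for why the original attempt failed: $Name holds only the tab name, which is a plain string, and a string has no Cells member, so $cells ends up $null and calling Item on it raises the error from the question. A small sketch that reproduces this without Excel (the tab name is made up, and it assumes the default non-strict mode, where reading a missing property yields $null instead of throwing):

```powershell
$Name = 'Info01'          # hypothetical tab name; just a string, not a worksheet
$cells = $Name.Cells      # strings have no Cells property, so this is $null
$cellsIsNull = $null -eq $cells

$message = $null
try {
    $cells.Item(1, 1)     # method call on $null
}
catch {
    # Capture the error text instead of letting the script stop.
    $message = $_.Exception.Message
}
$cellsIsNull
$message
```

Going through the worksheet object, as in the modified loop above, avoids the problem because the worksheet does expose its cells.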

Process .xlsx to csv with Powershell using rename and set delimiter

I have an Excel file that I receive and want to process it to a CSV using Powershell.
I have to alter it quite specifically so it can be a reliable input for a program that will process the csv info.
I don't know the exact headers, but I know there can be duplicates.
What I do is open the xlsx file with excel and save it as CSV:
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $True
$objExcel.DisplayAlerts = $True
$Workbook = $objExcel.Workbooks.open($xlsx1)
$WorkSheet = $WorkBook.sheets.item($sheet)
$xlCSV = 6
$Workbook = $objExcel.Workbooks.open($xlsx2)
$WorkSheet = $WorkBook.sheets.item($sheet)
$WorkBook.SaveAs($csv2,$xlCSV)
Now, the XLSX file will have commas, so first I want to change them to dots.
I tried this, but it's not working:
$objRange = $worksheet.UsedRange
$objRange.Replace ",", "."
It errors out saying: Unexpected token '", "'.
Then when saving I want to set the delimiter to a comma, as it uses ";" by default.
With something like:
$WorkBook.SaveAs($csv2,$xlCSV) -delimiter ","
The last problem is the duplicate headers; these prevent PowerShell from using Import-Csv. Here is what I tried; when the file is separated with commas it works:
Get-Content $downloads\BBKS_DIR_AUTO_COMMA.csv -totalcount 1 >$downloads\Headers.txt
But then I need to rename the duplicate names. For example, I can have Regio, Regio, Regio.
I want to change this to Regio, Regio2, Regio3.
My plan was to look up the data in the txt file, search for duplicates, and then add an incremental number.
In the end I need to add a column with incremental numbers, always with four digits, like 0001, 0002, 0010, 0020, 0200, 1500. I won't exceed 9999. How can this be done?
If you can help me, even if only partially, I'd be very happy.
Further, I'm running Windows 7 x64, Powershell 3.0, Excel 2016 (if relevant)
If it's easier, it's fine to go back to the Command Prompt for some tasks.
Personally, I wouldn't try and work with Excel sheets via Excel itself and COM - I'd use the excellent module https://github.com/dfinke/ImportExcel
Then you can import from the sheet straight to a native Powershell object array, and re-export with Export-Csv -Delimiter.
Edit: To answer follow ups :
Once you've loaded the module you can do "Get-Module ImportExcel | Select-Object -ExpandProperty ExportedCommands" to see what it makes available.
To import your Excel in the first place, do something like :
$WorkBook = Import-Excel -Path $xlsx2   # $xlsx2: the workbook path variable from the question
And if you need to take care of duplicate column names, you can do :
$WorkBook = Import-Excel -Header @("Regio1", "Regio2", "Regio")
Where the array you pass to -Header needs to include every column you want from the workbook.
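The two remaining items in the question, renaming duplicate headers and building a four-digit counter, could be sketched in plain PowerShell as below. The function name Rename-DuplicateHeader and the sample header list are hypothetical, and the sketch assumes a suffixed name such as Regio2 does not already exist as a header of its own.

```powershell
# Hypothetical helper: keeps the first occurrence of a name and suffixes later ones with 2, 3, ...
function Rename-DuplicateHeader {
    param([string[]]$Header)
    $seen = @{}   # name -> number of times seen so far (keys are case-insensitive)
    foreach ($name in $Header) {
        if ($seen.ContainsKey($name)) {
            $seen[$name]++
            "$name$($seen[$name])"
        }
        else {
            $seen[$name] = 1
            $name
        }
    }
}

# Made-up header row with duplicates.
$headers = @(Rename-DuplicateHeader -Header 'Regio', 'Naam', 'Regio', 'Regio')
$headers -join ','

# Four-digit, zero-padded counter values.
$ids = @(1, 2, 10, 1500 | ForEach-Object { '{0:D4}' -f $_ })
$ids -join ','
```

The renamed list can then be passed to Import-Csv -Header (skipping the file's original header line), and the D4 format string pads any integer up to 9999 to four digits.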

Excel com-object via powershell

I am exporting data to a CSV file via PowerShell. Generally things go well.
I have exported the data to a CSV file. It contains about 10 columns. When I open it with MS Excel, everything ends up in the first column. I want to split it into several columns programmatically via PowerShell (the same thing the Excel GUI offers). I could loop over every row, split it, and put the values into the appropriate cells, but that would take far too much time.
I believe there should be an elegant solution to split one column into multiple. Is there a way to do it in one simple step without looping?
This is what I came up with so far:
PS, The CSV file is 100% FINE. The delimiter is ','
Get-Service | Export-Csv -NoTypeInformation c:\1.csv -Encoding UTF8
$xl = New-Object -comobject Excel.Application
$xl.Visible = $true
$xl.DisplayAlerts = $False
$wb = $xl.Workbooks.Open('c:\1.csv')
$ws = $wb.Sheets|?{$_.name -eq '1'}
$ws.Activate()
$col = $ws.Cells.Item(1,1).EntireColumn
This will get you the desired functionality; add it to your code. Check out the MSDN page on TextToColumns for more information.
# Select column
$columnA = $ws.Range("A1").EntireColumn
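# Enumeration values
$xlDelimited = 1        # XlTextParsingType: xlDelimited
$xlTextQualifier = 1    # XlTextQualifier: xlTextQualifierDoubleQuote
# Convert Text To Columns
# Arguments: Destination, DataType, TextQualifier, ConsecutiveDelimiter, Tab, Semicolon, Comma, Space
$columnA.TextToColumns($ws.Range("A1"),$xlDelimited,$xlTextQualifier,$true,$false,$false,$true,$false)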
$ws.columns.autofit()
I had to create a CSV that had "","" as the delimiter to test this out. The file with "," opened fine in Excel.
# Opens with all fields in column A, used to test TextToColumns works
"Name,""field1"",""field2"",""field3"""
"Test,""field1"",""field.2[]"",""field3"""
# Opens fine in Excel
Name,"field1","field2","field3"
Test,"field1","field.2[]","field3"
Disclaimer: Tested with $ws = $wb.Worksheets.item(1)

Powershell Column Removal Script - SharePoint

I have a csv file that has multiple lines of text and columns. I want to write a script that will assign each column a variable from a single row one at a time and move to the next row. Basically, what should happen is that the first column should be stored as URL, the second as list and the third as field and then perform these tasks. Then move to the next row.
$web = Get-SPWeb (Your site URL)
$list = $web.Lists["Your List Name"]
$field = $list.Fields["Your Column Name"]
$field.AllowDeletion = $true
$field.Sealed = $false
$field.Delete()
$list.Update()
$web.Dispose()
I can do it line by line, but I would like to find out a better way to do this. I've tried writing code a number of ways, but I can't figure out how to do this with a "foreach" loop. Please help.
Without knowing the format of the CSV fields, it's hard to say for sure but something like this might work in terms of adapting your code (assuming here you have csv column names of URL, List and Column):
$csv = import-csv C:\yourpath\yourfile.csv
$csv | ForEach-Object {
$web = Get-SPWeb ($_.URL)
$list = $web.Lists[$_.List]
$field = $list.Fields[$_.Column]
# Use real booleans: the string "false" would be converted to $true
$field.AllowDeletion = $true
$field.Sealed = $false
$field.Delete()
$list.Update()
$web.Dispose()
}
I can't test the SharePoint elements, but importing the CSV (if it has a header row) and then iterating through each object/row with a foreach loop will work.
The header row matters because Import-Csv automatically creates a NoteProperty named after each heading, which you can then use to access that value in the current object/row.
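A dry-run sketch of that loop, written with the foreach statement the question asks about. The CSV content, URLs and field names are made up, and the SharePoint calls are replaced by a string describing what would be done, so it can be run anywhere to check that the heading-named properties resolve as expected.

```powershell
# Write a small made-up CSV to a temp file; the heading names match the answer's assumption.
$csvLines = 'URL,List,Column',
            'http://example.local/sites/a,Documents,Obsolete1',
            'http://example.local/sites/b,Tasks,Obsolete2'
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value $csvLines

$planned = @(foreach ($row in (Import-Csv -Path $path)) {
    # Each heading is a NoteProperty on $row; the SharePoint calls would go here.
    'Delete field {0} from list {1} at {2}' -f $row.Column, $row.List, $row.URL
})
Remove-Item -Path $path
$planned
```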
