PowerShell - Import Excel then Export CSV without using Excel or COM

I am developing a PowerShell script to import an Excel file and output the data to a flat file. The code that I have below works fine except that it fails to preserve leading zeros; when the CSV file is opened in a text editor, the leading zeros are not present. (Leading zeros are necessary for certain ID numbers, and the ID numbers are stored in Excel using a custom format.) Does anyone have any thoughts on how to get the ImportExcel module to preserve the leading zeros, or, perhaps another way of getting to the same goal? I would like to do this without using the COM object and without having to install Excel on the server; that's why I've been trying to make the ImportExcel module work.
$dataIn = "filename.xlsx"
$dataOut = "filename.csv"
Import-Excel -Path $dataIn | Export-Csv -Path $dataOut

I presume you're using the ImportExcel module?
I just did this and it worked. I created a spreadsheet like:
Name ID1 ID2
Steven 00012345 00012346
I gave them a custom number format of 00000000 then ran:
Import-Excel .\Book1.xlsx | Export-Csv .\book1.csv
When looking at the csv file I have both ID numbers as quoted strings:
"Name","ID1","ID2"
"Steven","00012345","00012346"
Is there anything else I need to do to reproduce this? Can you give the specifics of the custom number format?
Also, notwithstanding your answer to the above: you can modify the properties of each incoming object by converting them to strings. Assuming there's a fixed number of digits, you can use a format string with the .ToString() method, like:
(12345).ToString( "00000000" )
This will return "00012345"...
So redoing my test with regular numbers (no custom format):
# Note: $Input is a reserved automatic variable in PowerShell, so use a different name
$data = @(Import-Excel \\nynastech1\adm_only\ExAdm\Temp\Book1.xlsx)
$data |
ForEach-Object {
    $_.ID1 = $_.ID1.ToString( "00000000" )
    $_.ID2 = $_.ID2.ToString( "00000000" )
}
This will convert ID1 & ID2 into "00012345" & "00012346" respectively.
You can also use Select-Object, but you might need to rename the properties; a sketch of that approach follows.
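For instance, a minimal sketch of the Select-Object route with calculated properties (assuming the same Name/ID1/ID2 columns as in my test above):
Import-Excel .\Book1.xlsx |
    Select-Object Name,
        @{Name = 'ID1'; Expression = { $_.ID1.ToString('00000000') }},
        @{Name = 'ID2'; Expression = { $_.ID2.ToString('00000000') }} |
    Export-Csv .\book1.csv -NoTypeInformation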
Note: the @() wrapping in my example is because I only have the one object, and is partly force of habit.
Let me know how it goes.

Related

Strange characters found in XML file and PowerShell output after exporting from Excel

I have an XML file that I'm trying to read with PowerShell. However, when I read it, the output for some of the XML objects contains the following characters: â€‹
I simply downloaded an XML file I needed from a third party, which opens in Excel. I grab the columns I need and paste them into a new Excel workbook, map the fields with an XML schema, and then export it as an XML file, which I then use for scripting.
In the Excel spreadsheet my data looks clean, but when I export it and run the PS script, these strange characters appear in the output. The characters even appear in the actual XML file after exporting. What am I doing wrong?
I tried using -Encoding UTF8, but I'm relatively new to PowerShell and am not sure how to appropriately apply it to my script. Appreciate any help!
PowerShell
$xmlpath = 'Path\To\The\File.xml'
[xml]$xmldata = (Get-Content $xmlpath)
$xmldata.applications.application.name
Example of Output
â€‹ABC_DEF_GHIâ€‹.comâ€‹â€‹
â€‹JKL_MNO_PQRSâ€‹.comâ€‹
TUV_WXY_Z.com
AB_CD_EF_GHâ€‹.com
This is a prime example of why you shouldn't use the idiom [xml]$xmldata = (Get-Content $xmlpath), as convenient as it is.[1] The problem is indeed one of character encoding: your file is UTF-8-encoded, but in the absence of a BOM, Windows PowerShell's Get-Content cmdlet interprets it as ANSI-encoded; this answer explains the encoding part in detail.
Instead, to ensure that the XML file's character encoding is interpreted correctly, use the following:
# Note: If you know that $xmlPath contains a *full*, native path,
# you don't need the Convert-Path call.
($xmlData = [xml]::new()).Load((Convert-Path -LiteralPath $xmlPath))
This delegates interpretation of the character encoding to the System.Xml.XmlDocument.Load .NET method, which not only assumes the proper default for XML (UTF-8), but also respects any explicit encoding specification in the XML declaration, if present (e.g., <?xml version="1.0" encoding="iso-8859-1"?>).
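Put together with the query from the question, a minimal sketch:
$xmlPath = 'Path\To\The\File.xml'
($xmlData = [xml]::new()).Load((Convert-Path -LiteralPath $xmlPath))
$xmlData.applications.application.name   # now prints without the stray characters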
See also:
the bottom section of this answer for background information.
GitHub proposal #14505, which proposes introducing a New-Xml cmdlet that robustly parses XML files.
[1] If you happen to know the encoding of the input file ahead of time, you can get away with using Get-Content's -Encoding parameter in your original approach ([xml]$xmldata = (Get-Content -Encoding utf8 $xmlpath)), but the .Load()-based approach is much more robust.

export-csv powershell with custom column type

I'm exporting an AD report via PowerShell using the code below.
Get-ADUser -Filter 'enabled -eq $true' -SearchBase "OU=Staff,OU=Users,OU=OSA,DC=domain,DC=org" -Properties mail, employeeID | Select-Object employeeID, mail, ObjectGUID | Export-Csv "C:\Reports\ADExports\Students.csv" -NoTypeInformation
It outputs the CSV file and everything looks fine, except that the data type of all columns is set to 'Short Text'.
I require the Employee ID column type to be 'Number'. Is it possible to export a CSV with a custom field type?
I hope this makes sense.
Thanks in advance.
CSVs are plain text and do not contain type information. However, you can use the ImportExcel module, which provides an Export-Excel cmdlet. This cmdlet takes various Excel parameters, including -NumberFormat.
$x | Export-Excel -NumberFormat 'Number' -Path 'test.xlsx' # This worked for me.
You will probably have to play around with it a little depending on your exact use case. Good luck!
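For the report in the question, a sketch along those lines (assuming the ImportExcel module is installed; the output becomes an .xlsx instead of a .csv):
Get-ADUser -Filter 'enabled -eq $true' -SearchBase "OU=Staff,OU=Users,OU=OSA,DC=domain,DC=org" -Properties mail, employeeID |
    Select-Object employeeID, mail, ObjectGUID |
    Export-Excel "C:\Reports\ADExports\Students.xlsx" -NumberFormat 'Number'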
OK guys, I'm going to assume that you cannot export a CSV via PowerShell with custom data types.
However, I found a way around my issue. I've converted the data type when importing the csv in to access and managed to solve my issue.
If anyone's interested you can find the exact issue and solution here - Joining Two Tables with Different Data types MS ACCESS - 'type mismatch in expression' error
Thank you for bringing the 'ImportExcel' module to my attention. Now I know there's a module available for this, and you can do quite a bit of Excel manipulation with it.
Thank you all for your comments/answers.
Thanks.

Replace strings in text files with string literals and file names in powershell

My google-fu has failed me, so I'd love to get some help with this issue. I have a directory full of markup files (extension .xft). I need to modify these files by adding string literals and the filename (without the file extension) to each file.
For example, I currently have:
<headerTag>
<otherTag>Some text here </otherTag>
<finalTag> More text </finalTag>
What I need to end up with is:
<modifiedHeaderTag>
<secondTag> filenameGoesHere </secondTag>
<otherTag>Some text here </otherTag>
<finalTag> More text </finalTag>
So in this example,
"<modifiedHeaderTag>
<secondTag>"
would be my first string literal (this is a constant that gets inserted into each file in the same place),
filenameGoesHere
would be the variable string (the name of each file) and,
"</secondTag>"
would be my second constant string literal.
I was able to successfully replace text using:
(Get-Content *.xft).Replace("<headerTag>", "<modifiedHeaderTag>")
However, when I tried
(Get-Content *.xft).Replace("<headerTag>", "<modifiedHeaderTag> `n
<secondTag> $($_.Name) </secondTag>")
I just got an error message. Replacing $($_.Name) with ${$_.Name) also had no effect.
I've tried other things, but this method was the closest that I had gotten to success. I would appreciate any help that I can get. It's probably simple and I'm just not seeing something due to inexperience with Powershell, so a helping hand would be great.
If the above isn't clear enough, I'd be happy to provide more info, just let me know. Thanks everyone!
Here's my approach, assuming you have all of the XFT's in one folder and you want to write the updates back to the same file:
$path = "C:\XFTs_to_Modify"
$xfts = Get-ChildItem $path -Include "*.xft"
foreach ($xft in $xfts) {
$replace = "<modifiedHeaderTag>
<secondTag> $($xft.Name) </secondTag>"
(Get-Content *.xft).Replace("<headerTag>", $replace) | Set-Content $xft.FullName -Force
}
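If the inserted block ever grows beyond a couple of lines, a here-string keeps the literals readable. The same loop under the same assumptions:
foreach ($xft in Get-ChildItem $path -Filter "*.xft") {
    $replace = @"
<modifiedHeaderTag>
<secondTag> $($xft.BaseName) </secondTag>
"@
    (Get-Content $xft.FullName -Raw).Replace("<headerTag>", $replace) | Set-Content $xft.FullName -Force
}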

Trying to Export a CSV list of users using Active Directory Module for Windows Powershell

So the below is where I'm at so far:
import-module activedirectory
$domain = "ourdomain"
Get-ADUser -Filter {enabled -eq $true} -Properties whenCreated,EmailAddress,CanonicalName |
select-object Name,EmailAddress,CanonicalName,whenCreated | export-csv C:\Data\test.csv
Unfortunately, when I run the above I get dates in two different formats in the CSV, e.g.:
01/01/2017
1/01/2017 8:35:56 PM
The issue this poses is that there isn't really a clean way to sort them. Excel's formatting doesn't change either format to match the other, both because one includes a time and the other doesn't, and because the time-inclusive format doesn't zero-pad single-digit numbers while the time-exclusive format does.
We have an existing script that captures users using the LastLogonTimestamp attribute that does this correctly by changing the bottom line to the following:
select-object Name,EmailAddress,CanonicalName,@{Name="Timestamp"; Expression={[DateTime]::FromFileTime($_.whenCreated).ToString('yyyy-MM-dd_hh:mm:ss')}}
For some reason this expression runs properly when we query the LastLogonTimestamp attribute, but when we run this version querying the whenCreated attribute, we get an entirely blank column underneath the Timestamp header.
I'm not particularly knowledgeable about PowerShell itself, and my colleague who found the original script for LastLogonTimestamp just found it online and adapted it as minimally as possible to have it work for us, so I don't know if something in this line would work properly with one of these attributes and not the other. It seems strange to me, though, that two attributes storing dates in the same program would use different formats, so I'm not convinced that's it.
In any case, any help anyone can offer to help us get a uniform date format in the output of this script would be greatly appreciated - it needn't have the time included if it's easier to do away with it, though if they're equally easy we may as well keep it.
whencreated is already a [DateTime]. Notice the difference between the properties when you run something like this:
Get-ADUser TestUser -Properties lastlogon,whenCreated | select lastlogon,whenCreated | fl
(Get-ADUser TestUser -Properties lastlogon).lastlogon | gm
(Get-ADUser TestUser -Properties whenCreated).whenCreated | gm
This means that you don't have to convert to a DateTime before calling the ToString() method:
select-object @{Name="Timestamp"; Expression={$_.whenCreated.ToString('yyyy-MM-dd_hh:mm:ss')}}
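Applied to the full command from the question, a sketch of the corrected pipeline (note: using HH instead of hh gives 24-hour times, so the resulting strings sort correctly):
Import-Module ActiveDirectory
Get-ADUser -Filter {enabled -eq $true} -Properties whenCreated, EmailAddress, CanonicalName |
    Select-Object Name, EmailAddress, CanonicalName,
        @{Name = "Timestamp"; Expression = { $_.whenCreated.ToString('yyyy-MM-dd_HH:mm:ss') }} |
    Export-Csv C:\Data\test.csv -NoTypeInformation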

Adding a header to a '|' delimited CSV file in Powershell?

I was wondering if anybody knows a way to achieve this without breaking/messing with the data itself?
I have a CSV file delimited by '|' which was created by retrieving data from SharePoint with an SPQuery and exporting it using Out-File. (Export-Csv is not an option, since I would have to store the data in a variable, which would eat at the server's RAM; querying remotely unfortunately will not work either, so I have to do this on the server itself.) Nevertheless, I have the data I need, but I want to perform some manipulations, move and auto-calculate certain data within an Excel file, and export that Excel file.
The problem I have right now is that I sort of need a header to the file. I have tried using the following code:
$header ="Name|GF_GZ|GF_Title|GF_UniqueId|GF_OldId|GFURL|GF_ITEMREDIRECTID"
$file = Import-Csv inputfilename.csv -Header $header | Export-Csv D:\outputfilename.csv
In PowerShell, but the issue here is that when I perform the second Export-Csv, it delimits at anything that has a comma and thus mangles the data; I need the data to remain intact.
I have tried playing with the -Delimiter '|' setting on both the import and the export side, but no matter what I do it seems to cut off the data. Is there a better way to simply add a row at the top (a header) without messing with the existing file structure?
I have found that using a delimiter such as -Delimiter '°' or some other special character removes my problem entirely, but I can never be sure that such a character won't show up in the dataset, and thus (as stated already) I am looking for a more "elegant" solution.
Thanks
One option you have is to create the original CSV with the headers first. Then when you are exporting the SharePoint data, use the switch -Append in the Out-File command to append the SP data to the CSV.
I wouldn't even bother messing with it in csv format.
$header ="Name|GF_GZ|GF_Title|GF_UniqueId|GF_OldId|GFURL|GF_ITEMREDIRECTID"
$in_file = '.\inputfilename.csv'
$out_file = '.\outputfilename.csv'
$x = Get-Content $in_file
Set-Content $out_file -Value $header,$x
There's probably a more elegant/refined two-liner for some of this, but this should get you what you need.
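Once the header line is in place, just remember to specify the delimiter when reading the file back, e.g.:
Import-Csv $out_file -Delimiter '|'   # splits on '|' only; commas in the data stay intact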
