Convert csv to xlsb from Linux Server - excel

I have a client who wants a rather large report run that returns its results in CSV format. However, they are unable to open the .csv because the row count is too high. We found that we can import the CSV into MS Access and then load that database into Excel.
However, I am looking for a solution that lets me convert the .csv to .xlsb on the Linux server that runs this report. I searched around and haven't found anything fruitful.
I tried using a PowerShell script to perform the conversion; however, the file is still too large, and the PC runs out of resources when attempting the conversion.
#Define locations and delimiter
$csv = "source.csv" #Location of the source file
$xlsx = "output.xlsx" #Desired location of output
$delimiter = "," #Specify the delimiter used in the file
# Create a new Excel workbook with one empty sheet
$excel = New-Object -ComObject excel.application
$workbook = $excel.Workbooks.Add(1)
$worksheet = $workbook.worksheets.Item(1)
# Build the QueryTables.Add command and reformat the data
$TxtConnector = ("TEXT;" + $csv)
$Connector = $worksheet.QueryTables.add($TxtConnector,$worksheet.Range("A1"))
$query = $worksheet.QueryTables.item($Connector.name)
$query.TextFileOtherDelimiter = $delimiter
$query.TextFileParseType = 1
$query.TextFileColumnDataTypes = ,1 * $worksheet.Cells.Columns.Count
$query.AdjustColumnWidth = 1
# Execute & delete the import query
$query.Refresh()
$query.Delete()
# Save & close the Workbook as XLSX.
$Workbook.SaveAs($xlsx,51)
$excel.Quit()
I am open to any and all solutions that may help automate this process.
Thanks

Related

Executing excel formula using powershell taking too much time

I have the code below, which I am using for calculations:
$excel = new-object -comobject Excel.Application
$excel.visible = $false
$workbook = $excel.workbooks.open("C:\SLAFile.xlsx")
$worksheet = $workbook.Worksheets.Item(1)
$rows = $worksheet.range("D2").currentregion.rows.count
$STDSheet = $workbook.WorkSheets.Add()
$STDSheet.Name = 'STD'
###### Copy Date #####
$worksheet.activate()
$lastRow1 = $worksheet.UsedRange.rows.count
$range1 = $worksheet.Range("B2:B$lastRow1")
$range1.copy()
$STDSheet.activate()
$lastRow2 = $STDSheet.UsedRange.rows.count + 1
$range2 = $STDSheet.Range("A$($lastRow2)")
$STDSheet.Paste($range2)
###### Copy Date ######
###### Copy SYMM_ID ####
$worksheet.activate()
$lastRow3 = $worksheet.UsedRange.rows.count
$range3 = $worksheet.Range("A2:A$lastRow3")
$range3.copy()
$STDSheet.activate()
$lastRow4 = $STDSheet.UsedRange.rows.count + 1
$range4 = $STDSheet.Range("C$($lastRow2)")
$STDSheet.Paste($range4)
###### Copy SYMM_ID ####
####### STD Column Header #######
$STDSheet.Cells(1,1) = 'Time (DD.MM.YYYY HH:MM:SS)'
$STDSheet.Cells(1,2) = 'DATE (TEXT FORMAT)'
$STDSheet.Cells(1,3) = 'SYMM_ID'
$STDSheet.Cells(1,4) = 'STD_RT'
$STDSheet.Cells(1,5) = 'IO/sec'
$STDSheet.Cells(1,6) = 'DATA/sec'
$STDSheet.Cells(1,7) = 'Block Size'
$STDSheet.Cells(1,8) = 'MIN RT/6h'
####### STD Column Header #######
##### STD Report Part ####
$count=73
for($j=2;$j -le $rows;$j++)
{
$k = $j-1
$STDSheet.range("B$j:B$j").formula = '=TEXT(A'+$j+',"dd-mm-yyyy hh:mm")'
$STDSheet.range("E$j:E$j").formula = '=IFERROR(SUMIFS(SLA!$G:$G,SLA!$C:$C,"SG_STG_*_L_*STD",SLA!$B:$B,$B'+$j+',SLA!$D:$D,"<>0"),0)+IFERROR(SUMIFS(SLA!$H:$H,SLA!$C:$C,"SG_STG_*_L_*STD",SLA!$B:$B,$B'+$j+',SLA!$D:$D,"<>0"),0)'
$STDSheet.range("D$j:D$j").formula = '=IFERROR(AVERAGEIFS(SLA!$D:$D,SLA!$C:$C,"SG_STG_*_L_*STD",SLA!$B:$B,SLA!B'+$j+',SLA!$D:$D,"<>0"),"-")'
$STDSheet.range("F$j:F$j").formula = '=IFERROR(SUMIFS(SLA!$E:$E,SLA!$C:$C,"SG_STG_*_L_*STD",SLA!$B:$B,$B'+$j+',SLA!$D:$D,"<>0"),0)+IFERROR(SUMIFS(SLA!$F:$F,SLA!$C:$C,"SG_STG_*_L_*STD",SLA!$B:$B,$B'+$j+',SLA!$D:$D,"<>0"),0)'
$STDSheet.range("G$j:G$j").formula = '=IFERROR($F'+$j+'*1024/$E'+$j+',"-")'
$STDSheet.range("H$j:H$j").formula = '=IF(COUNTIF($D'+$j+':$D'+$count+',">0")=72,MIN($D'+$j+':$D'+$count+'),"-")'
$STDSheet.range("I1:I1").formula = '="Nb Period with Min RT > "&5&"ms"'
$STDSheet.range("I$j:I$j").formula = '=IFS($H'+$j+'="-","NA",$H'+$j+'<=5,"OK",$H'+$k+'>5,"NOK CONT",1,"NOK START")'
$STDSheet.range("j1:j1").formula = '=COUNTIF($I:$I,"NOK START")'
The problem is, when I keep only 1000-2000 records in the input file, the code executes very quickly.
But the actual input file contains around 90k records, and when I run the script against it, it takes far too long.
Is there anything wrong with the code? Is there any way to make it run faster?
As an experiment, I split the master Excel file into separate files of 1000 rows each and ran the script against those, but I still get the same slowness.
$GetExcels = Get-ChildItem -Path 'C:\xlfile\multiple'
foreach($EFile in $GetExcels)
{
}

How do I fix New-Object Com Class Error for Alteryx?

I have an Alteryx workflow that runs a PowerShell script to convert a CSV to Excel. It works locally, but when I publish that workflow to Alteryx Gallery, I get an error and my log produces something like this:
New-Object: Retrieving the COM class factory for component with CLSID....
$excel = New-Object -ComObject excel.application...
Resource Unavailable...
FullyQualifiedErrorId: NoCOMClassIdentified,Microsoft.PowerShell.Commands.NewObjectCommand...
This reads to me as if Excel is not installed on the Alteryx server. Is my assumption correct? What do I need to do to fix this error?
$SharedDriveFolderPath = "\\---\shares\Groups\---\---\---\---\--\B\"
$files = Get-ChildItem $SharedDriveFolderPath -Filter *.csv
foreach ($f in $files){
$outfilename = $f.BaseName +'.xlsx'
$outfilename
#Define locations and delimiter
$csv = "\\---\shares\Groups\---\---\---\---\--\B\$f" #Location of the source file
$xlsx = "\\---\shares\Groups\---\---\---\---\---\---\C\$outfilename" #Desired location of output
$delimiter = "," #Specify the delimiter used in the file
# Create a new Excel workbook with one empty sheet
$excel = New-Object -ComObject excel.application
$workbook = $excel.Workbooks.Add(1)
$worksheet = $workbook.worksheets.Item(1)
# Build the QueryTables.Add command and reformat the data
$TxtConnector = ("TEXT;" + $csv)
$Connector = $worksheet.QueryTables.add($TxtConnector,$worksheet.Range("A1"))
$query = $worksheet.QueryTables.item($Connector.name)
$query.TextFileOtherDelimiter = $delimiter
$query.TextFileParseType = 1
$query.TextFileColumnDataTypes = ,1 * $worksheet.Cells.Columns.Count
$query.AdjustColumnWidth = 1
# Execute & delete the import query
$query.Refresh()
$query.Delete()
# Save & close the Workbook as XLSX.
$Workbook.SaveAs($xlsx,51)
$excel.Quit()
}

How to resolve null-valued expression error

I have a PowerShell script (copied from the internet) that is run by a SQL Server Agent job. The script reads an Excel file and creates a CSV file from it.
Excel is not installed on the machine running this script, hence this method.
The code is:
$strFileName = "C:\xxxx\yyyy.xlsx"
$strSheetName = 'MasterCalendar$'
$strProvider = "Provider=Microsoft.ACE.OLEDB.12.0"
$strDataSource = "Data Source = $strFileName"
$strExtend = "Extended Properties='Excel 8.0;HDR=Yes;IMEX=1';"
$strQuery = "Select [Machine_ID],[1469] from [$strSheetName]"
$objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
$sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
$sqlCommand.Connection = $objConn
$objConn.open()
$da = New-Object system.Data.OleDb.OleDbDataAdapter($sqlCommand)
$dt = New-Object system.Data.datatable
[void]$da.fill($dt)
$dataReader.close()
$objConn.close()
$dt
$dt | Export-Csv C:\xxxx\data.csv -NoTypeInformation
When executed through SQL Server Agent, it throws the error below:
Message Executed as user: Test\User1. A job step received
an error at line 17 in a PowerShell script. The corresponding line is
'$dataReader.close() '. Correct the script and reschedule the job.
The error information returned by PowerShell is: 'You cannot call a
method on a null-valued expression. '. Process Exit Code -1. The
step failed.
I am new to scripting and PowerShell and cannot understand the issue: this exact code runs fine on my machine but throws an error when executed through a SQL Server Agent job.

Insufficient memory to continue the execution of the program. Creating Xlsx from CSV

I have a script that cycles through a folder and condenses multiple CSVs into one xlsx file, using the CSV names as worksheet names. However, when the script runs as part of a larger script, it fails when it refreshes the query.
$Query.Refresh()
On its own the script runs fine, but when added to the larger one it fails. Can anyone advise why this is the case?
Below is the error I get:
Insufficient memory to continue the execution of the program.
At C:\Temp\Scripts\Shares_Complete.psm1:254 char:13
+ $Query.Refresh()
+ ~~~~~~~~~~~~~~~~
+ CategoryInfo : OperationStopped: (:) [], OutOfMemoryException
+ FullyQualifiedErrorId : System.OutOfMemoryException
I have tried a single CSV with the same code and still get the same result.
$script:SP = "C:\Temp\Servers\"
$script:TP = "C:\Temp\Servers\Pc.txt"
$script:FSCSV = "C:\Temp\Server_Shares\Server Lists\"
$script:Message1 = "Unknown Hosts"
$script:Message2 = "Unable to connect"
$script:Message3 = "Unknown Errors Occurred"
$script:Txt = ".txt"
$script:OT = ".csv"
$script:FSERROR1 = $FSCSV+$Message1+$OT
$script:FSERROR2 = $FSCSV+$Message2+$OT
$script:FSERROR3 = $FSCSV+$Message2+$OT
$script:ERL3 = $E4 + "Shares_Errors_$Date.txt"
$script:ECL1 = $E4 + "Shares_Exceptions1_$Date.txt"
$script:ERL1 = $E4 + "Shares_Errors1_$Date.txt"
$script:ECL3 = $E4 + "Shares_Exceptions_$Date.txt"
function Excel-Write {
if ($V -eq "1") {
return
}
[System.GC]::Collect()
$RD = $FSCSV + "*.csv"
$CsvDir = $RD
$Ma4 = $FSCSV + "All Server Shares for Domain $CH4"
$csvs = dir -path $CsvDir # Collects all the .csv's from the directory
$FSh = $csvs | Select-Object -First 1
$FSh = ($FSh -Split "\\")[4]
$FSh = $FSh -replace ".{5}$"
$FSh
$outputxls = "$Ma4.xlsx"
$script:Excel = New-Object -ComObject Excel.Application
$Excel.DisplayAlerts = $false
$workbook = $excel.Workbooks.Add()
# Loops through each CSV, pulling all the data from each one
foreach ($iCsv in $csvs) {
$script:iCsv
$WN = ($iCsv -Split "\\")[-1]
$WN = $WN -replace ".{4}$"
if ($WN.Length -gt 30) {
$WN = $WN.Substring(0, [Math]::Min($WN.Length, 20))
}
$Excel = New-Object -ComObject Excel.Application
$Excel.DisplayAlerts = $false
$Worksheet = $workbook.Worksheets.Add()
$Worksheet.Name = $WN
$TxtConnector = ("TEXT;" + $iCsv)
$Connector = $worksheet.Querytables.Add($txtconnector,$worksheet.Range("A1"))
$query = $Worksheet.QueryTables.Item($Connector.Name)
$query.TextfileOtherDelimiter = $Excel.Application.International(5)
$Query.TextfileParseType = 1
$Query.TextFileColumnDataTypes = ,2 * $worksheet.Cells.Column.Count
$query.AdjustColumnWidth = 1
$Query.Refresh()
$Query.Delete()
$Worksheet.Cells.EntireColumn.AutoFit()
$Worksheet.Rows.Item(1).Font.Bold = $true
$Worksheet.Rows.Item(1).HorizontalAlignment = -4108
$Worksheet.Rows.Item(1).Font.Underline = $true
$Workbook.Save()
}
$Empty = $workbook.Worksheets.Item("Sheet1")
$Empty.Delete()
$Workbook.SaveAs($outputxls,51)
$Workbook.Close()
$Excel.Quit()
$ObjForm.Close()
Delete
}
The script should continue and create the xlsx.
Looking at your script, it doesn't surprise me that you eventually run out of memory, because you are continuously creating COM objects and never releasing them from memory.
Whenever you have created COM objects and are finished with them, use these lines to free up the memory:
$Excel.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
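If you need those lines in several places, they can be wrapped in a small helper. This is only a sketch: Clear-ComReference is a hypothetical name that does not exist in your script, and it assumes you pass the objects innermost first (worksheet, then workbook, then the Excel application) after calling $Excel.Quit().

```powershell
# Hypothetical helper (not part of the original script): releases every COM
# object passed to it, then forces a garbage collection pass.
function Clear-ComReference {
    param(
        [object[]]$ComObject
    )

    $released = 0
    foreach ($item in $ComObject) {
        # Skip $null and ordinary .NET objects; only COM wrappers need releasing.
        if ($null -ne $item -and
            [System.Runtime.InteropServices.Marshal]::IsComObject($item)) {
            [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($item)
            $released++
        }
    }

    [System.GC]::Collect()
    [System.GC]::WaitForPendingFinalizers()

    # Report how many references were actually released.
    return $released
}

# Intended use at the end of the Excel work (commented out because it needs Excel):
# $Excel.Quit()
# Clear-ComReference -ComObject $worksheet, $workbook, $Excel | Out-Null
```

Calling it from a finally block would make sure the release also happens when $Query.Refresh() throws.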
Also, take a good look at the code.
You create an Excel object with $Script:Excel = New-Object -ComObject excel.application before the foreach loop, and then create another new Excel object inside the loop on every iteration. There is absolutely no reason for that, since you can re-use the one you created before the loop.
As an aside: the following characters are not allowed in Excel worksheet names:
\/?*[]:
Length limitation is 31 characters.
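A minimal sketch of how those two rules could be applied before assigning $Worksheet.Name. ConvertTo-WorksheetName is a hypothetical helper name, not something from your script, and it covers only the character and length rules stated above; it does not check for other cases such as a name that ends up empty or duplicates an existing sheet.

```powershell
# Hypothetical helper (not part of the original script): turns an arbitrary
# string into a name that satisfies the two rules above.
function ConvertTo-WorksheetName {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Name
    )

    # Remove the characters Excel rejects in sheet names: \ / ? * [ ] :
    $clean = $Name -replace '[\\/?*\[\]:]', ''

    # Truncate to the 31-character limit.
    if ($clean.Length -gt 31) {
        $clean = $clean.Substring(0, 31)
    }

    return $clean
}

ConvertTo-WorksheetName -Name 'Server01: shares [all]?'   # Server01 shares all
```

In your loop this would replace the manual Substring logic, for example $Worksheet.Name = ConvertTo-WorksheetName -Name $WN.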
EDIT
I had a look at your project and especially the Shares_Complete.psm1 file.
Although I'm not willing of course to rewrite your entire project, I do have some remarks that may help you:
- [System.Net.Dns]::GetHostByName() is obsolete. Use GetHostEntry() instead.
- When you are done with a Windows form, call $ObjForm.Dispose() to clear it from memory.
- You call [System.GC]::Collect(); [System.GC]::WaitForPendingFinalizers() in many places for no reason.
- Why not use [System.Windows.MessageBox]::Show() instead of a COM object ($a = new-object -comobject wscript.shell)? Again, you leave that object lingering in memory.
- Use the Join-Path cmdlet instead of constructs like $RD = $FSCSV + "*.csv" or $Cop = $FSCSV + "*.csv".
- Remove invalid characters from Excel worksheet names (-replace '[\\/?*:[\]]', '').
- Use Verb-Noun names for your functions so it is clear what they do. Right now you have functions like Location, Delete and File whose names don't mean anything.
- You are calling functions before they are defined, as in line 65, where you call function Shares. At that point it does not exist yet because the function itself is defined in line 69.
- Add [System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet) | Out-Null in function Excel-Write.
- There is no need to put the variable $Excel in script scope ($Script:Excel = New-Object -ComObject excel.application) when it is used only locally in the function.
- You may need to look at the Excel specifications and limits.
- Fix the indentation of your code so it is clear where a loop or if starts and ends.
- I would recommend using more meaningful variable names. For an outsider, or even for yourself after a couple of months, two-letter variable names become confusing.
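To illustrate the Join-Path remark, here is a small sketch. The variable names and the file name are made up for the example, and the temp directory stands in for your $FSCSV folder so that the snippet runs anywhere.

```powershell
# Illustrative values only: the temp directory stands in for the CSV folder,
# and the output file name is a made-up example.
$csvFolder = [System.IO.Path]::GetTempPath()

# Join-Path inserts exactly one separator, whether or not the folder
# already ends with one, so no manual string concatenation is needed.
$csvPattern = Join-Path -Path $csvFolder -ChildPath '*.csv'
$outputFile = Join-Path -Path $csvFolder -ChildPath 'All Server Shares.xlsx'

$csvPattern
$outputFile
```

With string concatenation, a folder variable that lacks the trailing backslash silently produces a wrong path; Join-Path avoids that class of mistake.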

Script hangs during import of .tsv file to Excel

I have been using a script to import a data.tsv file into Excel which creates a workbook then closes it. It worked in PowerShell 4 under Windows 8.1. I've upgraded to Windows 10 with PowerShell 5. Now it is no longer working.
As far as I can tell, by debugging line by line in ISE, it is this line:
$xl.Workbooks.OpenText($importcsv)
The script:
#Import to xlsx
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
$wbpath=Join-Path "$psscriptroot" 'data.xlsx'
$importcsv=Join-Path "$psscriptroot\CPU\" 'data.tsv'
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
$xl.Workbooks.OpenText($importcsv)
$xl.DisplayAlerts = $false
[threading.thread]::CurrentThread.CurrentCulture = 'en-US'
$xl.ActiveWorkbook.SaveAs($wbpath,51)
$xl.Quit()
while([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)){'released'}
If I run the script, it just hangs. It opens Excel in the background and just sits there. No errors.
Any ideas?

Resources