How can I resize an Excel Table using GemBox.Spreadsheet?

I'm replacing Excel Table contents in an existing workbook with new contents from C# code using GemBox.Spreadsheet. Sometimes the data has more rows than the existing table, sometimes it has fewer. To resize the table, my first attempt has been to incrementally add or remove rows. However, this can be slow if the difference in the number of rows is large. Here's the code:
var workbook = ExcelFile.Load("workbook.xlsx");
var table = workbook.Sheets["Sheet1"].Tables["Table1"];
var lastWrittenRowIndex = 0;
for(var rowIndex = 0; rowIndex < data.Count; rowIndex++)
{
// If the table isn't big enough for this new row, add it
if (rowIndex == table.Rows.Count) table.Rows.Add();
// … Snipped code to add information in 'data' into 'table' …
lastWrittenRowIndex = rowIndex;
}
// All data written, now wipe out any unused rows
while (lastWrittenRowIndex + 1 < table.Rows.Count)
{
table.Rows.RemoveAt(table.Rows.Count - 1);
}
Adding a profiler shows that by far the slowest operation is table.Rows.Add(). I haven't yet profiled a situation where I need to remove the rows, but I anticipate the same.
I know how large my data is before writing, so how can I prepare the table to be of the correct size in a smaller operation? There are formulae and pivot tables referencing the table and I don't want to break them.

Try again with this latest version that was just released (Full version: 45.0.35.1010):
https://www.gemboxsoftware.com/spreadsheet/downloads/BugFixes.htm
It has a Table.Rows.Add overload that takes a count.
There are similar overloads for Insert and RemoveAt as well; see the following help page:
https://www.gemboxsoftware.com/spreadsheet/help/html/Methods_T_GemBox_Spreadsheet_Tables_TableRowCollection.htm
Lastly, as an FYI, you can also set the following:
workbook.AutomaticFormulaUpdate = false;
This should improve performance as well.
Note that setting this property to false also improves the performance of all ExcelWorksheet.Rows and ExcelWorksheet.Columns insert and remove methods.
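Since the size of the data is known before writing, the row-by-row resizing in the question can be reduced to one size comparison followed by a single bulk call. The sketch below is plain JavaScript rather than GemBox code: planTableResize is a hypothetical helper that only works out which bulk operation is needed, and how its result maps onto the count-taking Add and RemoveAt overloads is an assumption to check against the help page linked above.

```javascript
// Hypothetical helper, not part of GemBox.Spreadsheet: decide on one bulk
// operation instead of adding or removing rows one at a time.
function planTableResize(currentRowCount, targetRowCount) {
  if (currentRowCount < 0 || targetRowCount < 0) {
    throw new RangeError('row counts must be non-negative');
  }
  var delta = targetRowCount - currentRowCount;
  if (delta > 0) {
    // Table is too small: append `delta` rows in a single call.
    return { action: 'add', count: delta };
  }
  if (delta < 0) {
    // Table is too large: remove the trailing rows in a single call,
    // starting at the first row index that is no longer needed.
    return { action: 'remove', startIndex: targetRowCount, count: -delta };
  }
  return { action: 'none', count: 0 };
}
```

With the table sized once up front, the write loop no longer needs the per-row size check or the clean-up loop at the end.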

Related

How to keep table formatting when sorting table generated by PHPSpreadsheet?

I have generated an Excel table using PHPSpreadsheet including the style and the autofilter:
The problem is that when I sort the data by the second and third columns, the table formatting is lost. This is how it looks compared with applying a Table Style directly in Excel (using Home -> Format as Table):
Is there any way to keep the formatting when I sort the table generated from PHPSpreadsheet?
Relevant PHP Code:
for ($rowNumber = 0, $rowNumberMax = sizeof($rows); $rowNumber < $rowNumberMax; $rowNumber++) //rows (all data)
{
$columnNumber = 0; //1 = A
for ($i = 0, $j = sizeof($tableColumns); $i < $j; $i++) //loop through table header label
{
foreach ($rows[$rowNumber] as $rowKey => $rowValue) //loop through single row data
{
if($tableColumns[$i] == $rowKey)
{
$sheet->setCellValueByColumnAndRow($columnNumber + 1, ($rowNumber + 5), $rowValue);
$currentCell = Utilities::num2alpha($columnNumber) .''. ($rowNumber + 5);
$sheet->getStyle($currentCell)->getNumberFormat()->setFormatCode('#');
$sheet->getStyle($currentCell)->getAlignment()->setHorizontal(\PhpOffice\PhpSpreadsheet\Style\Alignment::HORIZONTAL_LEFT);
if(($rowNumber+5) % 2 == 0)
{
//even row
$sheet->getStyle($currentCell)->getFill()->setFillType(\PhpOffice\PhpSpreadsheet\Style\Fill::FILL_SOLID)->getStartColor()->setARGB('ffd9e1f2');
}
else
{
//odd row
}
$columnNumber++;
break;
}
}
}
}
//set autofilter
$headerFirstCellPosition = 'A4';
$tableLastCellPosition = Utilities::num2alpha(sizeof($tableColumns) - 1) . '' . (sizeof($rows) + 4);
$sheet->setAutoFilter($headerFirstCellPosition . ':' . $tableLastCellPosition);
The problem is you were just applying formatting to the cells based on if the row was even or odd, but it wasn't actually replicating a table in Excel. You would find the same result in Excel if you just formatted every other row like you did with your PHP code, where the "table" format would get lost.
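The effect can be reproduced without Excel. The sketch below is plain JavaScript with made-up data: each row gets its fill once, from its position, in the same way the PHP loop does. After sorting, the fills travel with their rows and the stripes no longer alternate.

```javascript
// Made-up rows for illustration; the fill is decided once, from the row's position.
var rows = [
  { name: 'delta' }, { name: 'alpha' }, { name: 'charlie' }, { name: 'bravo' }
].map(function (row, index) {
  return { name: row.name, fill: index % 2 === 0 ? 'blue' : 'none' };
});

// Summarise the fill of each row, top to bottom.
function fills(list) {
  return list.map(function (row) { return row.fill; }).join(',');
}

var before = fills(rows); // "blue,none,blue,none"

// Sorting moves each row together with the fill that was baked into it.
var sorted = rows.slice().sort(function (a, b) {
  return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
});
var after = fills(sorted); // "none,none,blue,blue"
```

A real Excel table avoids this because its stripes are derived from each row's current position instead of being stored on the cells.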
Somebody just recently implemented a first pass of the actual table feature in Excel: https://github.com/PHPOffice/PhpSpreadsheet/pull/2671
You need to be on PhpSpreadsheet version 1.23.0 or later to be able to use this.
Using that, you would have to modify your code but you can go to the Samples section in the code area and view how to implement it: https://github.com/PHPOffice/PhpSpreadsheet/tree/master/samples/Table
https://github.com/PHPOffice/PhpSpreadsheet/blob/master/samples/Table/01_Table.php
Here is the relevant code (I removed some of the lines and added additional comments from the 01_Table.php sample at the link provided).
Table styles can be found here: https://github.com/PHPOffice/PhpSpreadsheet/blob/master/src/PhpSpreadsheet/Worksheet/Table/TableStyle.php
// Create Table
$table = new Table('A1:D17', 'Sales_Data');
// Create Table Style
$tableStyle = new TableStyle();
// this line is the style type you want, you can verify this in Excel by clicking the "Format as Table" button and then hovering over the style you like to get the name
$tableStyle->setTheme(TableStyle::TABLE_STYLE_MEDIUM2);
// this gives you the alternate row color; I suggest to use either this or columnStripes as both together do not look good
$tableStyle->setShowRowStripes(true);
// similar to the alternate row color but does it for columns; I suggest to use either this or rowStripes as both together do not look good; I personally set to false and only used the rowStripes
$tableStyle->setShowColumnStripes(true);
// this will bold everything in the first column; I personally set to false
$tableStyle->setShowFirstColumn(true);
// this will bold everything in the last column; I personally set to false
$tableStyle->setShowLastColumn(true);
$table->setStyle($tableStyle);
Also make sure that you include the following to be able to use these:
use PhpOffice\PhpSpreadsheet\Worksheet\Table;
use PhpOffice\PhpSpreadsheet\Worksheet\Table\TableStyle;
Implementing that into your code will then allow you to sort using the auto filters and keep the formatting like you are expecting.
There are a few caveats such as:
Note that PreCalculateFormulas needs to be disabled when saving spreadsheets containing tables with formulae (totals or column formulae).
Also, as I am currently working on this myself, it doesn't look like you can apply an autofilter and have a table at the same time at this point.
That does appear to be on the todo list though: in the first link I provided, the contributor lists "Filter expressions similar to AutoFilter."
Otherwise, that should get you what you want and aside from being able to auto filter prior to creating the Excel file, it has worked well in my small testing.
Edit to add:
I think you can actually simplify your code a bit by using PhpSpreadsheet's ability to create a spreadsheet from an array.
Documentation from PHPSpreadsheet can be found here: https://phpspreadsheet.readthedocs.io/en/latest/topics/accessing-cells/#setting-a-range-of-cells-from-an-array
You'll need to change it so that the array that is holding the info starts with your headers, so I believe that would look similar to this for your code:
$rows = [
['header1', 'header2', 'header3', 'header4']
];
Then you can populate the $rows array with your data from the rows either with a loop or just a single declaration depending on what you are putting in there, but basically using the below to populate the array.
$rows[] = [
$field1Data,
$field2Data,
$field3Data,
$field4Data
];
After you do that, you can then generate the spreadsheet using the following:
$spreadsheet->getActiveSheet()
->fromArray(
$rows, // the data to set
NULL, // array values with this value will not be set
'A1', // top left coordinate of the worksheet range where we want to set these values (default is A1)
true // adds 0 to cell instead of blank if a 0 is the value
);
After doing the above, you can then add the code to create the table I posted and then save the file and you should be good.
Also, if you still need to use the autofilter (for instance, to pre-filter the file on one or more columns, which at this point you can't do with a table), you can make the autofilter call a bit easier.
// determine the number of rows in the active sheet
$highestRow = $spreadsheet->getActiveSheet()->getHighestRow();
// get the highest column letter
$highestColumn = $spreadsheet->getActiveSheet()->getHighestColumn();
// set autofilter range
$spreadsheet->getActiveSheet()->setAutoFilter('A1:'.$highestColumn.$highestRow);
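The range string passed to setAutoFilter is just a column letter plus a row number at each corner. As a sketch of that arithmetic in plain JavaScript (both function names are hypothetical; columnLetter assumes a zero-based index, which is what Utilities::num2alpha appears to take in the question):

```javascript
// Convert a zero-based column index to its letter(s): 0 -> A, 25 -> Z, 26 -> AA.
function columnLetter(index) {
  var letters = '';
  for (var n = index + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    letters = String.fromCharCode(65 + ((n - 1) % 26)) + letters;
  }
  return letters;
}

// Build the autofilter range for a block that starts in column A.
// headerRow is the 1-based row of the header; dataRowCount excludes the header.
function autoFilterRange(columnCount, headerRow, dataRowCount) {
  return 'A' + headerRow + ':' + columnLetter(columnCount - 1) + (headerRow + dataRowCount);
}

var example = autoFilterRange(4, 4, 10); // "A4:D14"
```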
I realize the additional edit goes beyond the question, but figured I'd point it out since there are some built-in methods that you could use to reduce some of your code.
-Matt

Excel Javascript (Office.js) - LastRow/LastColumn - better alternative?

I have been a fervent reader of StackOverflow over the last few years, and I was able to resolve pretty much everything in VBA Excel with a search and some adapting. I never felt the need to post any questions before, so I do apologize if this somehow duplicates something else, or there is an answer to this already and I couldn't find it.
Now I'm considering Excel-JS in order to create an add-in (or more), but I have to say that JavaScript is not exactly my bread and butter. In my time using VBA, I have found that one of the simplest and most common needs is to get the last row in a sheet or given range, and maybe less often the last column.
I've managed to put some code together in JavaScript to get similar functionality, and as it is... it works. There are two reasons I'm posting this:
Looking to improve the code, and my knowledge
Maybe someone else can make use of the code meanwhile
So... in order to get my lastrow/lastcolumn, I use global variables:
var globalLastRow = 0; //Get the last row in used range
var globalLastCol = 0; //Get the last column in used range
Populate the global variables with the function to return lastrow/lastcolumn:
function lastRC(wsName) {
return Excel.run(function (context) {
var wsTarget = context.workbook.worksheets.getItem(wsName);
//Get last row/column from used range
var uRange = wsTarget.getUsedRange();
uRange.load(['rowCount', 'columnCount']);
return context.sync()
.then(function () {
globalLastRow = uRange.rowCount;
globalLastCol = uRange.columnCount;
});
});
}
And lastly get the value where I need them in other functions:
var lRow = 0; var lCol = 0;
await lastRC("randomSheetName");
lRow = globalLastRow; lCol = globalLastCol;
I'm mainly interested in whether I can return the values directly from the function lastRC (and how...), rather than work around it with this solution.
Any suggestions are greatly appreciated (ideally if they don't come with stones attached).
EDIT:
I gave up on using an extra function for this for now, given that it uses an extra context.sync, and as I've read since this post, the fewer syncs, the better.
Also, the method above is only good as long as your used range starts in cell "A1" (or at least in the first row/column); otherwise a row/column count is not exactly helpful when you need the index.
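One way around that limitation is to load the used range's own rowIndex and columnIndex along with the counts, since the last index is simply the start index plus the count minus one. A minimal sketch, where lastIndices is a hypothetical helper and the plain object stands in for an Excel.Range whose four properties have already been loaded and synced:

```javascript
// `range` stands for a used range whose rowIndex, rowCount, columnIndex and
// columnCount have already been loaded. All indices are zero-based.
function lastIndices(range) {
  return {
    lastRowIndex: range.rowIndex + range.rowCount - 1,
    lastColumnIndex: range.columnIndex + range.columnCount - 1
  };
}

// Example: a used range of 10 rows by 3 columns whose top-left cell is "B3".
var example = lastIndices({ rowIndex: 2, rowCount: 10, columnIndex: 1, columnCount: 3 });
// example.lastRowIndex is 11 (row 12), example.lastColumnIndex is 3 (column D)
```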
Luckily, there is another method to get the last row/column:
var uRowsIndex = ws.getCell(0, 0).getEntireColumn().getUsedRange().getLastCell().load(['rowIndex']);
var uColsIndex = ws.getCell(0, 0).getEntireRow().getUsedRange().getLastCell().load(['columnIndex']);
To break down one of these examples, you are:
getCell(0, 0): starting at cell "A1"
getEntireColumn(): selecting the entire column "A:A"
getUsedRange(): selecting the used range in that column (e.g. "A1:A12")
getLastCell(): selecting the last cell in the used range (e.g. "A12")
load(['rowIndex']): loading the row index (for "A12", rowIndex = 11)
If your data is constant, and you don't need to check lastrow at specific column (or last column at specific row), then the shorter version of the above is:
var uIndex = ws.getUsedRange().getLastCell().load(['rowIndex', 'columnIndex']);
Lastly, keep in mind that usedrange will consider formatting as well, not just values, so if you have formatted rows under your data, expect the unexpected.
late edit - you can specify if you want your used range to be of values only (thanks Ethan):
getUsedRange(valuesOnly?: boolean): Excel.Range;
I have to say a big thank you to Michael Zlatkovsky, who has put a lot of work into a lot of documentation, which I'm far from having finished reading.

How can we include the cell formula while export to excel from .rdlc

My rdlc report has the following columns:
SlNo, Item, Uom, Qty, Rate, Amount
Here the Amount field is a formula (Rate*Qty).
The report works fine, and when I export to Excel the values are also displayed correctly.
My problem is that after exporting to Excel, when I change the Qty or Rate columns in the Excel file, the Amount does not update automatically, because the formula is missing from the Excel cell.
How can we include the formula in the Amount column when exporting to Excel from .rdlc?
I'm afraid that this required behaviour isn't really possible by just using the rdlc rendering.
In my search I stumbled upon this same link that QHarr posted: https://social.msdn.microsoft.com/Forums/en-US/3ddf11bf-e10f-4a3e-bd6a-d666eacb5ce4/report-viewer-export-ms-report-data-to-excel-with-formula?forum=vsreportcontrols
I haven't tried the project that they're suggesting but this might possibly be your best solution if it works. Unfortunately I do not have the time to test it myself, so if you test this please share your results.
I thought of the following workaround, which seems to work most of the time but isn't really reliable, because the formula sometimes gets displayed as plain text instead of being calculated. I guess this could be solved by editing the Excel file just after it is exported, either by changing the cell properties of the column containing the formula or by triggering a recalculation.
Using the built-in field Globals!RenderFormat.Name you can determine the render mode. This way you can display the computed result when the report is rendered to anything other than Excel, and when you export to Excel you can change the value of the cell to the actual formula.
You'll need to work out the formula itself on your own, but the RowNumber(Scope as String) function can be of use here to determine the row number of your cells.
Here is a possible example for the expression value of your amount column
=IIF(Globals!RenderFormat.Name LIKE "EXCEL*", "=E" & Cstr(RowNumber("DataSet1")+2) & "*F" & Cstr(RowNumber("DataSet1")+2) ,Fields!Rate.Value * Fields!Qty.Value )
Since this formula sometimes gets displayed as plain text, you'll probably have to edit the file post-rendering. If it's too complicated to determine which row/column the cell is on, you could also do that post-rendering. But I believe the above expression should be easy enough to use to get your desired result without having to do much after rendering.
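The string the expression above produces is easy to check outside the report. The following is a plain JavaScript sketch of the same concatenation, not RDLC code; amountFormula is a hypothetical helper, and the column letters and row offset are parameters because they depend on where the columns land in the exported sheet.

```javascript
// Build the Excel formula text for one detail row.
// rowNumber is 1-based, like RowNumber("DataSet1"); rowOffset is the number of
// sheet rows above the first detail row (2 in the expression above).
function amountFormula(rowNumber, rowOffset, qtyColumn, rateColumn) {
  var sheetRow = rowNumber + rowOffset;
  return '=' + qtyColumn + sheetRow + '*' + rateColumn + sheetRow;
}

// Same letters and offset as the expression above: first detail row is sheet row 3.
var firstRow = amountFormula(1, 2, 'E', 'F'); // "=E3*F3"
```

It is worth opening one exported file to confirm which letters the Qty and Rate columns actually receive, because a formula that refers to its own cell would be circular.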
Update: The following code could be used to force the calculation of the formula (post rendering)
var fpath = @"C:\MyReport.xlsx";
using (var fs = File.Create(fpath))
{
var lr = new LocalReport();
//Initializing your reporter
lr.ReportEmbeddedResource = "MyReport.rdlc";
//Rendering to excel
var fbytes = lr.Render("Excel");
fs.Write(fbytes, 0, fbytes.Length);
}
var xlApp = new Microsoft.Office.Interop.Excel.Application() { Visible = false };
var wb = xlApp.Workbooks.Open(fpath);
var ws = wb.Worksheets[1];
var range = ws.UsedRange;
foreach (var cell in range.Cells)
{
var cellv = cell.Text as string;
if (!string.IsNullOrWhiteSpace(cellv) && cellv.StartsWith("="))
{
cell.Formula = cellv;
}
}
wb.Save();
wb.Close(0);
xlApp.Quit();

vsto excel workbook project: How to write HUGE datatable into a excel sheet fairly quickly

I have a complex object (a tree structure) which I am flattening into a datatable to display on an Excel sheet. The datatable is huge, with around 20000 rows and 10000 columns.
Writing the data to Excel one cell at a time took forever. So, I am converting the complex object into a datatable and then writing it to the Excel sheet using the code below.
Is it possible to write 20K rows x 10K columns of data to an Excel sheet fairly quickly, in less than a minute or at least in under 5 minutes? What is the best technique to complete this task quickly?
Environment: Visual studio 2010, VSTO excel workbook project, .net framework 4.0, excel 2010/2007
EDIT:
Original source of data is a rest service response in json format. I am then deserializing json response into c# objects and finally flattening it into a datatable.
Using this Code to write datatable to an excel sheet:
Excel.Range oRange;
var oSheet = Globals.Sheet3;
int rowCount = 1;
foreach (DataRow dr in resultsDataTable.Rows)
{
rowCount += 1;
for (int i = 1; i < resultsDataTable.Columns.Count + 1; i++)
{
// Add the header the first time through
if (rowCount == 2)
{
oSheet.Cells[1, i] = resultsDataTable.Columns[i - 1].ColumnName;
}
oSheet.Cells[rowCount, i] = dr[i - 1].ToString();
}
}
// Resize the columns
oRange = oSheet.get_Range(oSheet.Cells[1, 1],
oSheet.Cells[rowCount, resultsDataTable.Columns.Count]);
oRange.EntireColumn.AutoFit();
Final Solution:
Used a 2D Object array instead of datatable and wrote it to the range.
In addition to freezing Excel's animation, and given the data source this is coming from, you can avoid looping through Excel.Range objects, which is bound to be a bottleneck. Instead of writing to a DataTable, write to a string[,], which Excel can write to a Range in one operation. Looping through a string[,] is much faster than looping through Excel cells.
string[,] importString = new string[yourJsonSource.Rows.Count, yourJsonSource.Columns.Count];
//populate the string[,] however you can
for (int r = 0; r < yourJsonSource.Rows.Count; r++)
{
for (int c = 0; c < yourJsonSource.Columns.Count; c++)
{
importString[r, c] = yourJsonSource[r][c].ToString();
}
}
var oSheet = Globals.Sheet3;
Excel.Range oRange = oSheet.get_Range(oSheet.Cells[1, 1],
oSheet.Cells[yourJsonSource.Rows.Count, yourJsonSource.Columns.Count]);
oRange.Value = importString;
I can't speak about using a datatable for the job, but if you want to use Interop, you definitely want to avoid writing cell by cell. Instead, create a 2-D array and write it to a range in one operation, which will give you a very significant performance improvement.
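The preparation step is the same in any language: flatten the records into one rectangular array, header row first, and only then touch the sheet. A sketch in plain JavaScript, where toGrid and the sample records are hypothetical; in the Interop code above, the equivalent structure is the string[,] assigned to oRange.Value.

```javascript
// Hypothetical helper: turn a list of records into a rectangular 2-D array,
// header row first, so it can be handed to a range in one assignment.
function toGrid(columns, records) {
  var grid = [columns.slice()];
  records.forEach(function (record) {
    grid.push(columns.map(function (name) {
      var value = record[name];
      // Missing values become empty strings so every row has the same width.
      return value === undefined || value === null ? '' : String(value);
    }));
  });
  return grid;
}

var grid = toGrid(['id', 'name'], [{ id: 1, name: 'first' }, { id: 2 }]);
// grid has 3 rows of 2 cells each: the header plus one row per record
```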
Another option you should consider is avoiding interop altogether, and using OpenXML. If you are working with Excel 2007 or above, this is typically a better approach to manipulate files.
VSTO is always going to take its time. The best tip I can share is to disable sheet refresh while you populate the data. One way to do this is to pop up a modal progress dialog box and refresh the sheet in the background, which will give you 50-70% better performance. Another thing you can do is update Visual Studio to SP1; it helps.

Can I import INTO excel from a data source without iteration?

Currently I have an application that takes information from a SQLite database and puts it into Excel. However, I'm having to take each DataRow, iterate through each item, put each value into its own cell, and determine highlighting. As a result, it takes 20 minutes to export a 9000-record file into Excel. I'm sure it can be done quicker than that. My thought is that I could use a data source to fill the Excel Range and then use the column headers and row numbers to format only those rows that need to be formatted. However, when I look online, no matter what I type, it always shows examples of using Excel as a database, and nothing about importing into Excel, unless I'm forgetting a keyword or two. Now, this function has to be done in code as it's part of a bigger application. Otherwise I would just have Excel connect to the DB and pull the information itself. Unfortunately that's not the case. Any information that could assist me in quickly loading an Excel sheet would be appreciated. Thanks.
Additional Information:
Another reason why the pulling of the information from the DB has to be done in code is that not every computer this is loaded on will have Excel on it. The person using the application may simply be told to export the data and email it to their supervisor. The setup app includes the needed DLLs for the application to produce the proper format.
Example Code (Current):
For Each strTemp In strColumns
excelRange = worksheet.Cells(1, nCounter)
excelRange.Select()
excelRange.Value2 = strTemp
excelRange.Interior.Color = System.Drawing.Color.Gray.ToArgb()
excelRange.BorderAround(Excel.XlLineStyle.xlContinuous, Excel.XlBorderWeight.xlThin, Excel.XlColorIndex.xlColorIndexAutomatic, Type.Missing)
nCounter += 1
Next
Now, this is only example code in terms of the iteration I'm doing. Where I'm really processing the information from the database I'm iterating through a dataTable's Rows, then iterating through the items in the dataRow and doing essentially the same as above; value by value, selecting the range and putting the value in the cell, formatting the cell if it's part of a report (not always gray), and moving onto the next set of data. What I'd like to do is put all of the data in the excel sheet (A2:??, not a row, but multiple rows) then iterate through the reports and format each row then. That way, the only time I iterate through all of the records is when every record is part of a report.
Ideal Code
excelRange = worksheet.Cells("A2", "P9000")
excelRange.DataSource = ds 'ds would be a queried dataSet, and I know there is no excelRange.DataSource.
'Iteration code to format cells
Update:
I know my examples were in VB, but it's because I was also trying to write a VB version of the application since my boss prefers VB. However, here's my final code using a Recordset. The ConvertToRecordset function was obtained from here.
private void CreatePartSheet(Excel.Worksheet excelWorksheet)
{
_dataFactory.RevertDatabase();
excelWorksheet.Name = "Part Sheet";
string[] strColumns = Constants.strExcelPartHeaders;
CreateSheetHeader(excelWorksheet, strColumns);
System.Drawing.Color clrPink = System.Drawing.Color.FromArgb(203, 192, 255);
System.Drawing.Color clrGreen = System.Drawing.Color.FromArgb(100, 225, 137);
string[] strValuesAndTitles = {/*...Column Names...*/};
List<string> lstColumns = strValuesAndTitles.ToList<string>();
System.Data.DataSet ds = _dataFactory.GetDataSet(Queries.strExport);
ADODB.Recordset rs = ConvertToRecordset(ds.Tables[0]);
excelRange = excelWorksheet.get_Range("A2", "ZZ" + rs.RecordCount.ToString());
excelRange.Cells.CopyFromRecordset(rs, rs.RecordCount, rs.Fields.Count);
int nFieldCount = rs.Fields.Count;
for (int nCounter = 0; nCounter < rs.RecordCount; nCounter++)
{
int nRowCounter = nCounter + 2;
List<ReportRecord> rrPartReports = _lstReports.FindAll(rr => rr.PartID == nCounter).ToList<ReportRecord>();
excelRange = (Excel.Range)excelWorksheet.get_Range("A" + nRowCounter.ToString(), "K" + nRowCounter.ToString());
excelRange.Select();
excelRange.NumberFormat = "#";
if (rrPartReports.Count > 0)
{
excelRange.Interior.Color = System.Drawing.Color.FromArgb(230, 216, 173).ToArgb(); //Light Blue
foreach (ReportRecord rr in rrPartReports)
{
if (lstColumns.Contains(rr.Title))
{
excelRange = (Excel.Range)excelWorksheet.Cells[nRowCounter, lstColumns.IndexOf(rr.Title) + 1];
excelRange.Interior.Color = rr.Description.ToUpper().Contains("TAG") ? clrGreen.ToArgb() : clrPink.ToArgb();
if (rr.Description.ToUpper().Contains("TAG"))
{
rs.Find("PART_ID=" + (nCounter + 1).ToString(), 0, ADODB.SearchDirectionEnum.adSearchForward, "");
excelRange.AddComment(Environment.UserName + ": " + _dataFactory.GetTaggedPartPrevValue(rs.Fields["POSITION"].Value.ToString(), rr.Title));
}
}
}
}
if (nRowCounter++ % 500 == 0)
{
progress.ProgressComplete = ((double)nRowCounter / (double)rs.RecordCount) * (double)100;
Notify();
}
}
rs.Close();
excelWorksheet.Columns.AutoFit();
progress.Message = "Done Exporting to Excel";
Notify();
_dataFactory.RestoreDatabase();
}
Can you use ODBC?
''http://www.ch-werner.de/sqliteodbc/
dbName = "c:\docs\test"
scn = "DRIVER=SQLite3 ODBC Driver;Database=" & dbName _
& ";LongNames=0;Timeout=1000;NoTXN=0;SyncPragma=NORMAL;StepAPI=0;"
Set cn = CreateObject("ADODB.Connection")
cn.Open scn
Set rs = CreateObject("ADODB.Recordset")
rs.Open "select * from test", cn
Worksheets("Sheet3").Cells(2, 1).CopyFromRecordset rs
BTW, Excel is quite happy with HTML and internal style sheets.
I have used the Excel XML file format in the past to write directly to an output file or stream. It may not be appropriate for your application, but writing XML is much faster and bypasses the overhead of interacting with the Excel Application. Check out this Introduction to Excel XML post.
Update:
There are also a number of libraries (free and commercial) which can make creating Excel documents easier, for example excellibrary, which doesn't support the new format yet. There are others mentioned in the answers to Create Excel (.XLS and .XLSX) file from C#
Excel can write all the data from an ADO or DAO recordset in a single operation using the CopyFromRecordset method.
Code snippet:
Sheets("Sheet1").Range("A1").CopyFromRecordset rst
I'd normally recommend using Excel to pull in the data from SQLite. Use Excel's "Other Data Sources". You could then choose your OLE DB provider, use a connection string, what-have-you.
It sounds, however, that the real value of your code is the formatting of the cells, rather than the transfer of data.
Perhaps refactor the process to:
have Excel import the data
use your code to open the Excel spreadsheet, and apply formatting
I'm not sure if that is an appropriate set of processes for you, but perhaps something to consider?
Try this out:
http://office.microsoft.com/en-au/excel-help/use-microsoft-query-to-retrieve-external-data-HA010099664.aspx
Perhaps post some code, and we might be able to track down any issues.
I'd consider this chain of events:
query the SQLite database for your dataset.
move the data out of ADO.NET objects, and into POCO objects. Stop using DataTables/Rows.
use For Each to insert into Excel.
