OleDb Jet - Float issues in reading Excel data

When I read a sheet into a DataTable using the OleDbDataReader, floating point numbers lose their precision.
I tried forcing OleDb to read the Excel data as strings, but although the data now ends up in a DataRow with each column typed as System.String, it still loses precision (18.125 -> 18.124962832).
Any idea how to avoid this behaviour?

I just tested your data and this method posted here worked.
i.e. the cell value kept its precision (18.124962832) when put into the DataSet.

I'm pretty sure that Jet tries to assign a datatype to each column based on what it sees in the first five rows. If something after those first five rows doesn't fall into that data type it will either convert it or return nothing at all.
Do the first five rows of your spreadsheet have a lower precision than the items that are being truncated?
Take a look at this post.
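For reference, here is a minimal sketch of forcing text-mode reads with the Jet provider. The file path and sheet name are placeholders, and the number of rows Jet samples is controlled by the TypeGuessRows registry value rather than the connection string, so treat this as an assumption to verify against your setup:
using System.Data;
using System.Data.OleDb;

// IMEX=1 asks the driver to treat mixed-type columns as text instead of
// guessing a numeric type from the first sampled rows.
var connectionString =
    @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\temp\Book1.xls;" +
    @"Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1""";

using (var connection = new OleDbConnection(connectionString))
using (var adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connection))
{
    var table = new DataTable();
    adapter.Fill(table);   // mixed-type columns arrive as strings when IMEX=1 is honoured
}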

The code below shows how to get both the underlying number and the formatted text with SpreadsheetGear for .NET.
Here is the output from the code:
x=18.124962832 y=18.124962832 formattedText=18.125
Here is the code:
namespace Program
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a new workbook and get a reference to Sheet1!A1.
            var workbook = SpreadsheetGear.Factory.GetWorkbook();
            var sheet1 = workbook.Worksheets[0];
            var a1 = sheet1.Cells["A1"];
            // Put the number in the cell and give it a three-decimal display format.
            double x = 18.124962832;
            a1.Value = x;
            a1.NumberFormat = "0.000";
            // Read back the underlying value and the formatted text.
            double y = (double)a1.Value;
            string formattedText = a1.Text;
            System.Console.WriteLine("x={0} y={1} formattedText={2}", x, y, formattedText);
        }
    }
}
You can see live SpreadsheetGear samples here and download the free trial here.
Disclaimer: I own SpreadsheetGear LLC

Related

Excel VSTO cell precision

When I read very small values from an Excel sheet, they are shown in scientific notation. For example, -0.00002 is always read as -2E05 using the Cells().Value function. Below are the conversion lines I have used, without any success. How do I get the actual value instead of the scientific format?
var canConvert = decimal.TryParse(ws.Cells[1, 1].Value.ToString(), out _); // results in false
var cellString = ws.Cells[1, 1].Value.ToString("R"); // -2E05
var cellString2 = ws.Cells[1, 2].Value.ToString(); // -2E05
It seems you need to set up the required NumberFormat first:
ws.Range("A17").NumberFormat = "General"
The format code is the same string as the Format Codes option in the Format Cells dialog box.
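In C# interop terms, that would look something like the rough sketch below; the worksheet variable ws is assumed to come from the question's context:
// Apply a number format first, then read the displayed text; the raw Value is
// unchanged, only its presentation differs.
var cell = (Microsoft.Office.Interop.Excel.Range)ws.Cells[1, 1];
cell.NumberFormat = "General";            // or an explicit format such as "0.00000"
string displayed = (string)cell.Text;     // formatted text as Excel would show it
double raw = (double)cell.Value;          // underlying double, e.g. -0.00002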
After several trials I came up with the following solution, which works well in my model.
double number = 0;
var canConvert = double.TryParse(ws.Cells[row, column].Value.ToString(), out number);
if (canConvert)
{
    string doubleAsString = number.ToString("F99").TrimEnd('0');
}
Bear in mind that the above handles up to 99 decimal places. Feel free to extend it to your needs.

How can we include the cell formula while export to excel from .rdlc

My rdlc report has the following columns:
SlNo, Item, Uom, Qty, Rate, Amount
Here the Amount field is a formula (Rate*Qty)
The report is working fine, and when I export to Excel the values are also displayed correctly.
My problem is that after exporting to Excel, when I change the Qty or Rate columns in the Excel file, the Amount does not change automatically, because the formula is missing from the Excel cell.
How can we include the formula in the Amount column when exporting to Excel from .rdlc?
I'm afraid that this required behaviour isn't really possible by just using the rdlc rendering.
In my search I stumbled upon this same link that QHarr posted: https://social.msdn.microsoft.com/Forums/en-US/3ddf11bf-e10f-4a3e-bd6a-d666eacb5ce4/report-viewer-export-ms-report-data-to-excel-with-formula?forum=vsreportcontrols
I haven't tried the project that they're suggesting but this might possibly be your best solution if it works. Unfortunately I do not have the time to test it myself, so if you test this please share your results.
I thought of the following workaround, which seems to work most of the time but isn't entirely reliable, because the formula sometimes gets displayed as plain text instead of being calculated. I guess this could be solved by editing the Excel file just after it has been exported, changing the cell properties of the column containing the formula or simply triggering a recalculation.
Using the built-in field Globals!RenderFormat.Name you can determine the render format; this way you can display the result correctly when the report is rendered to anything other than Excel, and switch the value of the cell to the actual formula when you export to Excel.
You'll need to work out the formula itself on your own, but the RowNumber(Scope as String) function can be of use here to determine the row number of your cells.
Here is a possible example for the expression value of your Amount column:
=IIF(Globals!RenderFormat.Name LIKE "EXCEL*", "=E" & Cstr(RowNumber("DataSet1")+2) & "*F" & Cstr(RowNumber("DataSet1")+2) ,Fields!Rate.Value * Fields!Qty.Value )
Considering that this formula sometimes gets displayed as plain text, you'll probably have to edit the file post-rendering anyway. If it's too complicated to determine which row/column a cell is on, you could also build the formula post-rendering. But I believe the expression above should be easy enough to use to get your desired result without having to do much after rendering.
Update: The following code could be used to force the calculation of the formulas post-rendering:
var fpath = @"C:\MyReport.xlsx";
using (var fs = File.Create(fpath))
{
var lr = new LocalReport();
//Initializing your reporter
lr.ReportEmbeddedResource = "MyReport.rdlc";
//Rendering to excel
var fbytes = lr.Render("Excel");
fs.Write(fbytes, 0, fbytes.Length);
}
var xlApp = new Microsoft.Office.Interop.Excel.Application() { Visible = false };
var wb = xlApp.Workbooks.Open(fpath);
var ws = wb.Worksheets[1];
var range = ws.UsedRange;
foreach (var cell in range.Cells)
{
var cellv = cell.Text as string;
if (!string.IsNullOrWhiteSpace(cellv) && cellv.StartsWith("="))
{
cell.Formula = cellv;
}
}
wb.Save();
wb.Close(0);
xlApp.Quit();

Opening excel file prompts a message box "content recovery of the workbook"

While I'm trying to open an Excel file, a message box prompts: "We found a problem with some content in file name. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes." What I actually do is this: I have an Excel template designed, I copy that file to a temp file, and then I insert data from the database into the temp file using Open XML.
I have tried the solutions provided on the net, but those fixes do not resolve my issue. My Excel version is 2010.
Any solution provided is much appreciated.
I had this problem. It was caused by the way I was storing numbers and strings in cells.
Numbers can be stored simply using cell.CellValue = new CellValue("5"), but for non-numeric text, you need to insert the string in the SharedStringTable element and get the index of that string. Then change the data type of the cell to SharedString, and set the value of the cell to the index of the string in the SharedStringTable.
// Here is the text I want to add.
string text = "Non-numeric text.";
// Find the SharedStringTable element and append my text to it.
var sharedStringTable = document.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First().SharedStringTable;
var item = sharedStringTable.AppendChild(new SharedStringItem(new Text(text)));
// Set the data type of the cell to SharedString.
cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
// Set the value of the cell to the index of the SharedStringItem.
cell.CellValue = new CellValue(item.ElementsBefore().Count().ToString());
This is explained in the documentation here: http://msdn.microsoft.com/en-us/library/office/cc861607.aspx
Another few cases that can cause this type of error:
Your sheet name is longer than 31 characters
You have invalid characters in the sheet name
You have cells with values longer than 32,767 characters
The issue is due to calling both
package.Save();
and
package.GetAsByteArray();
When we call package.GetAsByteArray(), it internally performs the following operations:
this.Workbook.Save();
this._package.Close();
this._package.Save(this._stream);
Hence, removing the extra package.Save(); call solves the "We found a problem with some content in file name. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes." problem.
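Assuming EPPlus (which is where GetAsByteArray comes from), a minimal sketch of the corrected flow looks like this; the file path and sheet name are placeholders:
using System.IO;
using OfficeOpenXml;

using (var package = new ExcelPackage())
{
    var ws = package.Workbook.Worksheets.Add("Sheet1");
    ws.Cells[1, 1].Value = "Hello";

    // Do NOT call package.Save() here; GetAsByteArray() already saves and
    // closes the package internally, and calling both corrupts the output.
    byte[] bytes = package.GetAsByteArray();
    File.WriteAllBytes(@"C:\temp\Report.xlsx", bytes);
}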
Another possible cause could be exceeding the maximum number of cell styles.
You can define:
up to 4000 styles in a .xls workbook
up to 64000 styles in a .xlsx workbook
In this case you should re-use the same cell style for multiple cells, instead of creating a new cell style for every cell.
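With the Open XML SDK (as used in the question), the idea is to add one CellFormat to the stylesheet and point many cells at its index instead of appending a new CellFormat per cell. A rough sketch, where row is whatever Row you are populating and sharedStyleIndex is assumed to reference a CellFormat you added to the stylesheet once:
// Reuse a single style index for every cell rather than creating a new
// CellFormat for each one, which can blow past the workbook style limit.
uint sharedStyleIndex = 1; // index of a CellFormat added to the stylesheet once
foreach (Cell cell in row.Elements<Cell>())
{
    cell.StyleIndex = sharedStyleIndex;
}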
I added the right cellReference and that fixed the issue for me:
string alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
for (int colInx = 0; colInx < reader.FieldCount; colInx++)
{
AppendTextCell(alpha[colInx] + "1", reader.GetName(colInx), headerRow);
}
private static void AppendTextCell(string cellReference, string cellStringValue, Row excelRow)
{
// Add a new Excel Cell to our Row
Cell cell = new Cell() { CellReference = cellReference, DataType = new EnumValue<CellValues>(CellValues.String) };
CellValue cellValue = new CellValue();
cellValue.Text = cellStringValue.ToString();
cell.Append(cellValue);
excelRow.Append(cell);
}
Same warning, but the problem in my case was that I was using client input (the name of a wave) as the sheet name for the file, and when a date was present in the name, the '/' character used as the date separator caused the issue.
I think Microsoft needs to provide a better error log to save people time investigating such minor issues. I hope my answer saves someone else's time.
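A small guard like the following sketch (the helper name is mine, not from any library) covers both the 31-character limit and the invalid characters mentioned above, including '/':
// Replace characters Excel does not allow in sheet names and trim to 31 characters.
static string SanitizeSheetName(string name)
{
    foreach (var invalid in new[] { ':', '\\', '/', '?', '*', '[', ']' })
    {
        name = name.Replace(invalid, '_');
    }
    return name.Length > 31 ? name.Substring(0, 31) : name;
}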
The issue was due to storing a string in the cell directly using cell.CellValue = new CellValue("Text"). It is possible to store numbers like this, but not strings. For a string, define the data type before assigning the text: Cell.DataType = CellValues.String;

vsto excel workbook project: How to write a HUGE datatable into an Excel sheet fairly quickly

I have a complex object (tree structure) which I am flattening into a DataTable to display it on an Excel sheet. The DataTable is huge: around 20,000 rows and 10,000 columns.
Writing the data to an Excel cell one at a time took forever, so I am converting the complex object into a DataTable and then writing it to the Excel sheet using the code below.
Is it possible to write 20K rows x 10K columns of data to an Excel sheet fairly quickly, ideally in less than a minute, or at least in under 5 minutes? What is the best technique to complete this task fast?
Environment: Visual Studio 2010, VSTO Excel workbook project, .NET Framework 4.0, Excel 2010/2007
EDIT:
The original source of the data is a REST service response in JSON format. I deserialize the JSON response into C# objects and finally flatten it into a DataTable.
Using this Code to write datatable to an excel sheet:
Excel.Range oRange;
var oSheet = Globals.Sheet3;
int rowCount = 1;
foreach (DataRow dr in resultsDataTable.Rows)
{
rowCount += 1;
for (int i = 1; i < resultsDataTable.Columns.Count + 1; i++)
{
// Add the header the first time through
if (rowCount == 2)
{
oSheet.Cells[1, i] = resultsDataTable.Columns[i - 1].ColumnName;
}
oSheet.Cells[rowCount, i] = dr[i - 1].ToString();
}
}
// Resize the columns
oRange = oSheet.get_Range(oSheet.Cells[1, 1],
oSheet.Cells[rowCount, resultsDataTable.Columns.Count]);
oRange.EntireColumn.AutoFit();
Final Solution:
Used a 2D Object array instead of datatable and wrote it to the range.
In addition to freezing Excel's screen updating, you can, given the data source this is coming from, save yourself the looping through the Excel.Range object, which is bound to be the bottleneck: instead of writing to a DataTable, write to a string[,], which Excel can use to fill a Range in one operation. Looping through a string[,] is much faster than looping through Excel cells.
string[,] importString = new string[yourJsonSource.Rows.Count, yourJsonSource.Columns.Count];
//populate the string[,] however you can
for (int r = 0; r < yourJsonSource.Rows.Count; r++)
{
for (int c = 0; c < yourJsonSource.Columns.Count; c++)
{
importString[r, c] = yourJsonSource[r][c].ToString();
}
}
var oSheet = Globals.Sheet3;
Excel.Range oRange = oSheet.get_Range(oSheet.Cells[1, 1],
oSheet.Cells[yourJsonSource.Rows.Count, yourJsonSource.Columns.Count]);
oRange.Value = importString;
I can't speak about using a datatable for the job, but if you want to use Interop, you definitely want to avoid writing cell by cell. Instead, create a 2-d array, and write it at once to a range, which will give you a very significant performance improvement.
Another option you should consider is avoiding interop altogether and using OpenXML. If you are working with Excel 2007 or above, this is typically a better approach for manipulating files.
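For illustration, here is a rough sketch of the OpenXML route using the Open XML SDK's streaming OpenXmlWriter; resultsDataTable is the DataTable from the question, and the output path and sheet name are placeholders:
using System.Data;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

using (var doc = SpreadsheetDocument.Create(@"C:\temp\Output.xlsx", SpreadsheetDocumentType.Workbook))
{
    var workbookPart = doc.AddWorkbookPart();
    workbookPart.Workbook = new Workbook();
    var worksheetPart = workbookPart.AddNewPart<WorksheetPart>();

    // Stream the rows out instead of building the whole DOM in memory.
    using (var writer = OpenXmlWriter.Create(worksheetPart))
    {
        writer.WriteStartElement(new Worksheet());
        writer.WriteStartElement(new SheetData());
        foreach (DataRow dr in resultsDataTable.Rows)
        {
            writer.WriteStartElement(new Row());
            foreach (var item in dr.ItemArray)
            {
                var cell = new Cell { DataType = CellValues.InlineString };
                cell.AppendChild(new InlineString(new Text(item.ToString())));
                writer.WriteElement(cell);
            }
            writer.WriteEndElement(); // Row
        }
        writer.WriteEndElement(); // SheetData
        writer.WriteEndElement(); // Worksheet
    }

    // Register the sheet so Excel can find the worksheet part.
    workbookPart.Workbook.AppendChild(new Sheets(new Sheet
    {
        Id = workbookPart.GetIdOfPart(worksheetPart),
        SheetId = 1U,
        Name = "Sheet1"
    }));
}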
VSTO is always going to take its time. The best tip I can share is to disable sheet refresh while you populate the data; one way to do this is to pop up a modal progress dialog and refresh your sheet in the background, which can give you 50-70% better performance. Another thing you can do is update Visual Studio to SP1; it helps.
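As a rough sketch of the "disable refresh while populating" idea, assuming a VSTO workbook project (so Globals.ThisWorkbook is available) and reusing the oRange and importString names from the string[,] answer above:
// Suspend redraw and recalculation around the bulk write, then restore them.
var app = Globals.ThisWorkbook.Application;
app.ScreenUpdating = false;
app.Calculation = Microsoft.Office.Interop.Excel.XlCalculation.xlCalculationManual;
try
{
    oRange.Value2 = importString;   // one bulk assignment instead of cell-by-cell writes
}
finally
{
    app.Calculation = Microsoft.Office.Interop.Excel.XlCalculation.xlCalculationAutomatic;
    app.ScreenUpdating = true;
}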

Can I import INTO excel from a data source without iteration?

Currently I have an application that takes information from an SQLite database and puts it into Excel. However, I'm having to take each DataRow, iterate through each item, put each value into its own cell, and determine highlighting. This means it takes 20 minutes to export a 9000-record file to Excel, and I'm sure it can be done quicker than that. My thoughts are that I could use a data source to fill an Excel Range and then use the column headers and row numbers to format only those rows that need to be formatted. However, when I look online, no matter what I seem to type, it always shows examples of using Excel as a database, nothing about importing into Excel, unless I'm forgetting a keyword or two. This function has to be done in code, as it's part of a bigger application; otherwise I would just have Excel connect to the DB and pull the information itself. Unfortunately that's not the case. Any information that could assist me in quickly loading an Excel sheet would be appreciated. Thanks.
Additional information: another reason why the pulling of the information from the DB has to be done in code is that not every computer this is loaded on will have Excel on it. The person using the application may simply be told to export the data and email it to their supervisor. The setup app includes the needed dlls for the application to produce the proper format.
Example Code (Current):
For Each strTemp In strColumns
excelRange = worksheet.Cells(1, nCounter)
excelRange.Select()
excelRange.Value2 = strTemp
excelRange.Interior.Color = System.Drawing.Color.Gray.ToArgb()
excelRange.BorderAround(Excel.XlLineStyle.xlContinuous, Excel.XlBorderWeight.xlThin, Excel.XlColorIndex.xlColorIndexAutomatic, Type.Missing)
nCounter += 1
Next
Now, this is only example code in terms of the iteration I'm doing. Where I'm really processing the information from the database, I'm iterating through a DataTable's Rows, then iterating through the items in each DataRow and doing essentially the same as above: value by value, selecting the range, putting the value in the cell, formatting the cell if it's part of a report (not always gray), and moving on to the next set of data. What I'd like to do is put all of the data into the Excel sheet at once (A2:??, not one row but multiple rows), then iterate through the reports and format each row. That way, the only time I iterate through all of the records is when every record is part of a report.
Ideal Code
excelRange = worksheet.Cells("A2", "P9000")
excelRange.DataSource = ds 'ds would be a queried dataSet, and I know there is no excelRange.DataSource.
'Iteration code to format cells
Update:
I know my examples were in VB, but it's because I was also trying to write a VB version of the application since my boss prefers VB. However, here's my final code using a Recordset. The ConvertToRecordset function was obtained from here.
private void CreatePartSheet(Excel.Worksheet excelWorksheet)
{
_dataFactory.RevertDatabase();
excelWorksheet.Name = "Part Sheet";
string[] strColumns = Constants.strExcelPartHeaders;
CreateSheetHeader(excelWorksheet, strColumns);
System.Drawing.Color clrPink = System.Drawing.Color.FromArgb(203, 192, 255);
System.Drawing.Color clrGreen = System.Drawing.Color.FromArgb(100, 225, 137);
string[] strValuesAndTitles = {/*...Column Names...*/};
List<string> lstColumns = strValuesAndTitles.ToList<string>();
System.Data.DataSet ds = _dataFactory.GetDataSet(Queries.strExport);
ADODB.Recordset rs = ConvertToRecordset(ds.Tables[0]);
excelRange = excelWorksheet.get_Range("A2", "ZZ" + rs.RecordCount.ToString());
excelRange.Cells.CopyFromRecordset(rs, rs.RecordCount, rs.Fields.Count);
int nFieldCount = rs.Fields.Count;
for (int nCounter = 0; nCounter < rs.RecordCount; nCounter++)
{
int nRowCounter = nCounter + 2;
List<ReportRecord> rrPartReports = _lstReports.FindAll(rr => rr.PartID == nCounter).ToList<ReportRecord>();
excelRange = (Excel.Range)excelWorksheet.get_Range("A" + nRowCounter.ToString(), "K" + nRowCounter.ToString());
excelRange.Select();
excelRange.NumberFormat = "#";
if (rrPartReports.Count > 0)
{
excelRange.Interior.Color = System.Drawing.Color.FromArgb(230, 216, 173).ToArgb(); //Light Blue
foreach (ReportRecord rr in rrPartReports)
{
if (lstColumns.Contains(rr.Title))
{
excelRange = (Excel.Range)excelWorksheet.Cells[nRowCounter, lstColumns.IndexOf(rr.Title) + 1];
excelRange.Interior.Color = rr.Description.ToUpper().Contains("TAG") ? clrGreen.ToArgb() : clrPink.ToArgb();
if (rr.Description.ToUpper().Contains("TAG"))
{
rs.Find("PART_ID=" + (nCounter + 1).ToString(), 0, ADODB.SearchDirectionEnum.adSearchForward, "");
excelRange.AddComment(Environment.UserName + ": " + _dataFactory.GetTaggedPartPrevValue(rs.Fields["POSITION"].Value.ToString(), rr.Title));
}
}
}
}
if (nRowCounter++ % 500 == 0)
{
progress.ProgressComplete = ((double)nRowCounter / (double)rs.RecordCount) * (double)100;
Notify();
}
}
rs.Close();
excelWorksheet.Columns.AutoFit();
progress.Message = "Done Exporting to Excel";
Notify();
_dataFactory.RestoreDatabase();
}
Can you use ODBC?
''http://www.ch-werner.de/sqliteodbc/
dbName = "c:\docs\test"
scn = "DRIVER=SQLite3 ODBC Driver;Database=" & dbName _
& ";LongNames=0;Timeout=1000;NoTXN=0;SyncPragma=NORMAL;StepAPI=0;"
Set cn = CreateObject("ADODB.Connection")
cn.Open scn
Set rs = CreateObject("ADODB.Recordset")
rs.Open "select * from test", cn
Worksheets("Sheet3").Cells(2, 1).CopyFromRecordset rs
BTW, Excel is quite happy with HTML and internal style sheets.
I have used the Excel XML file format in the past to write directly to an output file or stream. It may not be appropriate for your application, but writing XML is much faster and bypasses the overhead of interacting with the Excel Application. Check out this Introduction to Excel XML post.
Update:
There are also a number of libraries (free and commercial) which can make creating Excel documents easier, for example excellibrary, which doesn't support the new format yet. There are others mentioned in the answers to Create Excel (.XLS and .XLSX) file from C#.
Excel has the facility to write all the data from an ADO or DAO recordset in a single operation using the CopyFromRecordset method.
Code snippet:
Sheets("Sheet1").Range("A1").CopyFromRecordset rst
I'd normally recommend using Excel to pull in the data from SQLite. Use Excel's "Other Data Sources". You could then choose your OLE DB provider, use a connection string, what-have-you.
It sounds, however, as though the real value of your code is the formatting of the cells rather than the transfer of data.
Perhaps refactor the process to:
have Excel import the data
use your code to open the Excel spreadsheet, and apply formatting
I'm not sure if that is an appropriate set of processes for you, but perhaps something to consider?
Try this out:
http://office.microsoft.com/en-au/excel-help/use-microsoft-query-to-retrieve-external-data-HA010099664.aspx
Perhaps post some code, and we might be able to track down any issues.
I'd consider this chain of events:
query the SQLite database for your dataset.
move the data out of ADO.NET objects, and into POCO objects. Stop using DataTables/Rows.
use For Each to insert into Excel.
