Can I import INTO excel from a data source without iteration? - excel

Currently I have an application that takes information from a SQLite database and puts it to Excel. However, I'm having to take each DataRow, iterate through each item, and put each value into it's own cell and determine highlighting. What this is causing is 20 minutes to export a 9000 record file into Excel. I'm sure it can be done quicker than that. My thoughts are that I could use a data source to fill the Excel Range and then use the column headers and row numbers to format only those rows that need to be formatted. However, when I look online, no matter what I seem to type, it always shows examples of using Excel as a database, nothing about importing into excel. Unless I'm forgetting a key word or to. Now, this function has to be done in code as it's part of a bigger application. Otherwise I would just have Excel connect to the DB and pull the information itself. Unfortunately that's not the case. Any information that could assist me in quick loading an excel sheet would be appreciated. Thanks.Additional Information:Another reason why the pulling of the information from the DB has to be done in code is that not every computer this is loaded on will have Excel on it. The person using the application may simply be told to export the data and email it to their supervisor. The setup app includes the needed dlls for the application to make the proper format.Example Code (Current):
For Each strTemp In strColumns
excelRange = worksheet.Cells(1, nCounter)
excelRange.Select()
excelRange.Value2 = strTemp
excelRange.Interior.Color = System.Drawing.Color.Gray.ToArgb()
excelRange.BorderAround(Excel.XlLineStyle.xlContinuous, Excel.XlBorderWeight.xlThin, Excel.XlColorIndex.xlColorIndexAutomatic, Type.Missing)
nCounter += 1
Next
Now, this is only example code in terms of the iteration I'm doing. Where I'm really processing the information from the database I'm iterating through a dataTable's Rows, then iterating through the items in the dataRow and doing essentially the same as above; value by value, selecting the range and putting the value in the cell, formatting the cell if it's part of a report (not always gray), and moving onto the next set of data. What I'd like to do is put all of the data in the excel sheet (A2:??, not a row, but multiple rows) then iterate through the reports and format each row then. That way, the only time I iterate through all of the records is when every record is part of a report.
Ideal Code
excelRange = worksheet.Cells("A2", "P9000")
excelRange.DataSource = ds 'ds would be a queried dataSet, and I know there is no excelRange.DataSource.
'Iteration code to format cells
Update:
I know my examples were in VB, but it's because I was also trying to write a VB version of the application since my boss prefers VB. However, here's my final code using a Recordset. The ConvertToRecordset function was obtained from here.
private void CreatePartSheet(Excel.Worksheet excelWorksheet)
{
_dataFactory.RevertDatabase();
excelWorksheet.Name = "Part Sheet";
string[] strColumns = Constants.strExcelPartHeaders;
CreateSheetHeader(excelWorksheet, strColumns);
System.Drawing.Color clrPink = System.Drawing.Color.FromArgb(203, 192, 255);
System.Drawing.Color clrGreen = System.Drawing.Color.FromArgb(100, 225, 137);
string[] strValuesAndTitles = {/*...Column Names...*/};
List<string> lstColumns = strValuesAndTitles.ToList<string>();
System.Data.DataSet ds = _dataFactory.GetDataSet(Queries.strExport);
ADODB.Recordset rs = ConvertToRecordset(ds.Tables[0]);
excelRange = excelWorksheet.get_Range("A2", "ZZ" + rs.RecordCount.ToString());
excelRange.Cells.CopyFromRecordset(rs, rs.RecordCount, rs.Fields.Count);
int nFieldCount = rs.Fields.Count;
for (int nCounter = 0; nCounter < rs.RecordCount; nCounter++)
{
int nRowCounter = nCounter + 2;
List<ReportRecord> rrPartReports = _lstReports.FindAll(rr => rr.PartID == nCounter).ToList<ReportRecord>();
excelRange = (Excel.Range)excelWorksheet.get_Range("A" + nRowCounter.ToString(), "K" + nRowCounter.ToString());
excelRange.Select();
excelRange.NumberFormat = "#";
if (rrPartReports.Count > 0)
{
excelRange.Interior.Color = System.Drawing.Color.FromArgb(230, 216, 173).ToArgb(); //Light Blue
foreach (ReportRecord rr in rrPartReports)
{
if (lstColumns.Contains(rr.Title))
{
excelRange = (Excel.Range)excelWorksheet.Cells[nRowCounter, lstColumns.IndexOf(rr.Title) + 1];
excelRange.Interior.Color = rr.Description.ToUpper().Contains("TAG") ? clrGreen.ToArgb() : clrPink.ToArgb();
if (rr.Description.ToUpper().Contains("TAG"))
{
rs.Find("PART_ID=" + (nCounter + 1).ToString(), 0, ADODB.SearchDirectionEnum.adSearchForward, "");
excelRange.AddComment(Environment.UserName + ": " + _dataFactory.GetTaggedPartPrevValue(rs.Fields["POSITION"].Value.ToString(), rr.Title));
}
}
}
}
if (nRowCounter++ % 500 == 0)
{
progress.ProgressComplete = ((double)nRowCounter / (double)rs.RecordCount) * (double)100;
Notify();
}
}
rs.Close();
excelWorksheet.Columns.AutoFit();
progress.Message = "Done Exporting to Excel";
Notify();
_dataFactory.RestoreDatabase();
}

Can you use ODBC?
''http://www.ch-werner.de/sqliteodbc/
dbName = "c:\docs\test"
scn = "DRIVER=SQLite3 ODBC Driver;Database=" & dbName _
& ";LongNames=0;Timeout=1000;NoTXN=0;SyncPragma=NORMAL;StepAPI=0;"
Set cn = CreateObject("ADODB.Connection")
cn.Open scn
Set rs = CreateObject("ADODB.Recordset")
rs.Open "select * from test", cn
Worksheets("Sheet3").Cells(2, 1).CopyFromRecordset rs
BTW, Excel is quite happy with HTML and internal style sheets.

I have used the Excel XML file format in the past to write directly to an output file or stream. It may not be appropriate for your application, but writing XML is much faster and bypasses the overhead of interacting with the Excel Application. Check out this Introduction to Excel XML post.
Update:
There are also a number of libraries (free and commercial) which can make creating excel document easier for example excellibrary which doesn't support the new format yet. There are others mentioned in the answers to Create Excel (.XLS and .XLSX) file from C#

Excel has the facility to write all the data from a ADO or DAO recordset in a single operation using the CopyFromRecordset method.
Code snippet:
Sheets("Sheet1").Range("A1").CopyFromRecordset rst

I'd normally recommend using Excel to pull in the data from SQLite. Use Excel's "Other Data Sources". You could then choose your OLE DB provider, use a connection string, what-have-you.
It sounds, however, that the real value of your code is the formatting of the cells, rather than the transfer of data.
Perhaps refactor the process to:
have Excel import the data
use your code to open the Excel spreadsheet, and apply formatting
I'm not sure if that is an appropriate set of processes for you, but perhaps something to consider?

Try this out:
http://office.microsoft.com/en-au/excel-help/use-microsoft-query-to-retrieve-external-data-HA010099664.aspx

Perhaps post some code, and we might be able to track down any issues.
I'd consider this chain of events:
query the SQLite database for your dataset.
move the data out of ADO.NET objects, and into POCO objects. Stop using DataTables/Rows.
use For Each to insert into Excel.

Related

openxml sdk excel total sums

i am having difficulties with open xml sdk:
i am trying to generate excel file with several columns that have numbers and i want to have total sum at the end
i have tried to Generate Table Definition Part Content and inside define every column (id, name etC). If column has true for TotalColumn, it adds code (rough example)
var column = new TableColumn{
id = 1,
name = "example",
TotalsRowFunction = TotalsRowFunctionValues.Sum,
TotalsRowFormula = new TotalsRowFormula("=SUBTOTAL(109;[" + rowName + "])")
};
I can't get it to work, when i open excel it reports error, but it doesn't explicitly says what the problem is... I tried with microsoft validator but can't figure anything out...
I'd appreciate any help / example code since i can't google anything out
EDIT:
i use this at the end:
workbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
workbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
Why not use a cell formula?
E.g.
cell.DataType = new EnumValue<CellValues>(CellValues.Number);
cell.CellFormula = new CellFormula(string.Format("=SUBTOTAL({0};[{1}])", "109", rowName));
//This will force a full recalculation of all the cells
workbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
workbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
I ended using EPPlus for this as it seems to be working simple and efficient

Read data from excel in vb6 and put in a datatable

Is there any way to read all the data from excel and put it in the datatable or any other container so that i can filter the data based on the conditions required. As shown in attached image i want to get the CuValue of a Partnumber whose status is Success and i want the latest record based on the Calculation date(Latest calculation date). In the below example i want the CuValue 11292 as it is the latest record with status Success..lue.
Thanks in advance
Your question seems very broad, but you're right to ask because there are many different possibilities and pitfalls.
As you don't provide any sample code, i assume you are looking for a strategy, so here is it.
In short: create a database, a table and a stored procedure. Copy the
data you need in this table, and then query the table to get the
result.
You may use ADO for this task. If it is not available on your machine you can download and install the MDAC redistributable from the Microsoft web site.
The advantage vs. OLE Automation is that you doesn't need to install Excel on the target machine where the import shall be executed, so you can execute the import also server-side.
With ADO installed, you will need to create two Connection objects, a Recordset object to read the data from the Excel file and a Command object to execute a stored procedure which will do the INSERT or the UPDATE of the subset of the source fields in the destination table.
Following is a guideline which you should expand and adjust, if you find it useful for your task:
Option Explicit
Dim PartNo as String, CuValue as Long, Status as String, CalcDate as Date
' objects you need:
Dim srcConn As New ADODB.Connection
Dim cmd As New ADODB.Command
Dim rs As New ADODB.Recordset
Dim dstConn As New ADODB.Connection
' Example connection with your destination database
dstConn.Open *your connection string*
'Example connection with Excel - HDR is discussed below
srcConn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=C:\Scripts\Test.xls;" & _
"Extended Properties=""Excel 8.0; HDR=NO;"";"
rs.Open "SELECT * FROM [Sheet1$]", _
srcConn, adOpenForwardOnly, adLockReadOnly, adCmdText
' Import
Do Until rs.EOF
PartNo = rs.Fields.Item(0);
CuValue = rs.Fields.Item(1);
CalcDate = rs.Fields.Item(6);
Status = rs.Fields.Item(7);
If Status = "Success" Then
'NumSuccess = NumSuccess + 1
' copy data to your database
' using a stored procedure
cmd.CommandText = "InsertWithDateCheck"
cmd.CommandType = adCmdStoredProc
cmd(1) = PartNo
cmd(2) = CuValue
cmd(3) = CalcDate
cmd.ActiveConnection = dstConn
cmd.Execute
Else
'NumFail = NumFail + 1
End If
rs.MoveNext
Loop
rs.Close
Set rs = Nothing
srcConn.Close
Set srcConn = Nothing
dstConn.Close
Set dstConn = Nothing
'
By using a stored procedure to check the data and execute the insert or update in your new table, you will be able to read from Excel in fast forward-only mode and write a copy of the data with the minimum of time loss, delegating to the database engine half the work.
You see, the stored procedure will receive three values. Inside the stored procedure you should insert or update this values. Primary key of the table shall be PartNo. Check the Calculation Date and, if more recent, update CuValue.
By googling on the net you will find enough samples to write such a stored procedure.
After your table is populated, just use another recordset to get the data and whatever tool you need to display the values.
Pitfalls reading from Excel:
The provider of your Excel file shall agree to remove the first two or three rows, otherwise you will have some more work for the creation of a fictitious recordset, because the intelligent datatype recognition of Excel may fail.
As you know, Excel cells are not constrained to the same data type per-column as in almost all databases.
If you maintain the field names, use HDR=YES, without all the first three rows, use HDR=NO.
Always keep a log of the "Success" and "Fail" number of records read
in your program, then compare these values with the original overall
number of rows in Excel.
Feel free to ask for more details, anyway i think this should be enough for you to start.
There are lots ways you can do this.
1. You can create an access DB table and import by saving your sheet as can file first, into the access table. Then you can write queries.
2. You can create a sql DB and a table, write some code to import the sheet into that table.
3. You can Write some code in VBA and accomplish that task if your data is not very big.
4. You can write c# code to access the sheet using excel.application and office objects, create a data table and query that data table
Depends on what skills you want to employ to accomplish your task.

Creating an Excel Table with MATLAB

Since writing varying sized data ranges to a sheet seems to remove an Excel Table if the data range is larger than the existing Excel tables range, I want to create a Table in Excel every time I run the code. I'm currently having a fair bit of difficulty creating the tables. The code I have right now to try and create the ListObject:
eSheets = e.ActiveWorkbook.Sheets;
eSheet = eSheets.get('Item', j);
eSheet.Activate;
eSheet.Range(horzcat('A1:R',mat2str(size(obj,1)+1))).Select;
eSheet.Listobjects.Add;
eSheet.Listobjects.Item(1).TableStyle = 'TableStyleMedium2';
eSheet.ListObjects.Item(1).Name = tablename;
Any commentary or suggestions would be appreciated
I dont know about using eSheet in matlab but with the function
xlswrite(filename,A,sheet,xlRange)
you can also write your data from a matrix to an excel table http://de.mathworks.com/help/matlab/ref/xlswrite.html and with
[A,B] = xlsfinfo('foofoo.xlsx');
sheetValid = any(strcmp(B, 'foo2'));
you can also check if a table sheet already exist so that you wont override the old one and create a new one, as seen in https://de.mathworks.com/matlabcentral/answers/25848-how-to-check-existence-of-worksheet-in-excel-file
I am not sure if this is what you are looking for thougth
Alright, since the post got downvoted (not sure why...) I found my own answer with the help of some VBA forums and MATLAB Newsgroup. Here's what the final code looks like for anyone else that has issues:
e = actxserver('Excel.Application');
ewb = e.Workbooks.Open('Path/to/file');
eSheets = e.ActiveWorkbook.Sheets;
eSheet = eSheets.get('Item', j);
eSheet.Activate;
range = horzcat('A1:R',mat2str(size(obj,1)+1));
range_todelete = horzcat('A1:R',mat2str(size(obj,1)+300));
Range1 = eSheet.get('Range',range_todelete);
Range1.Value=[];
eSheet.Range(range).Select;
name = 'Table_Name';
try eSheet.ListObjects(name).Item(1).Delete
catch
end
eSheet.Listobjects.Add;
eSheet.ListObjects.Item(1).Name = name;
eSheet.ListObjects.Item(1).TableStyle = 'TableStyleMedium2';
Range = eSheet.get('Range',range);
Range.Value = cellarray;

Excel process stays alive even after calling Quit

I am creating Excel Sheet using Devexpress Exporter and then saving the file at a particular location.
After the creation of file, I have to open it, to add dropdownlist of items and then save it again in same location.
After all the operations, the file has to be emailed automatically to the email address from database.
Now if I have 1000 email addresses, and to automate this process, it is creating more than 10 instances of Excel.
How can I stop creation of those instance and how can I use excel operations without using more memory.
Code is as below:
protected string CreateExcelFile(string FilterName)
{
Random ranNumber = new Random();
int number = ranNumber.Next(0, 10000000);
string FileName = "TestDoc"+DateTime.Now.Year.ToString()+number.ToString()+DateTime.Now.Second.ToString()+".xls";
string path = #"c:\TestDocuments\"+FileName;
Directory.CreateDirectory(Path.GetDirectoryName(path));
FileStream fs = new FileStream(path, FileMode.OpenOrCreate);
XlsExportOptions options = new XlsExportOptions();
options.ExportHyperlinks = false;
ASPxExporter.WriteXls(fs, options);
fs.Close();
AddDropDownToExcel(path);
return path;
}
//Adding The Dropdownlist Of Items TO Generated Excel Sheet
protected void AddDropDownToExcel(string path)
{
Microsoft.Office.Interop.Excel.Application application = new Microsoft.Office.Interop.Excel.Application();
string fileName = path.Replace("\\", "\\\\");
string RowCount = "F" + (testgrid.VisibleRowCount + 1).ToString();
// Open Excel and get first worksheet.
var workbook = application.Workbooks.Open(fileName);
var worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
// Set range for dropdownlist
var rangeNewStatus = worksheet.get_Range("F2", RowCount);
rangeNewStatus.ColumnWidth = 20;
rangeNewStatus.Validation.Add(Microsoft.Office.Interop.Excel.XlDVType.xlValidateList, Microsoft.Office.Interop.Excel.XlDVAlertStyle.xlValidAlertStop,
Microsoft.Office.Interop.Excel.XlFormatConditionOperator.xlBetween, "Item1,Item2,Item3,Item4");
// Save.
workbook.Save();
workbook.Close(Microsoft.Office.Interop.Excel.XlSaveAction.xlSaveChanges, Type.Missing, Type.Missing);
application.Quit();
}
First, I sincerely hope this isn't running on a server.
Then, if your problem is that too many instances of Excel are created, a thought is "don't create an instance every single time". Instead of starting Excel every time AddDropDownToExcel is called, can you reuse the same instance?
The problem you are having shows up regularly in Excel interop scenario; even though you are done and tell Excel to close, it "stays alive". It's usually caused by your app still holding a reference to a COM object that hasn't been disposed, preventing Excel from closing. This StackOverflow answer provides some pointers: https://stackoverflow.com/a/158752/114519
In general, to avoid that problem, you want to follow the "one-dot" rule. For instance, in your code:
var workbook = application.Workbooks.Open(fileName);
will be a problem, because an "anonymous" wrapper for Workbooks is created, and will likely not be disposed properly. The "one-dot" rule would say "don't use more than one dot when working with Excel interop", in this case:
var workbooks = application.Workbooks;
var workbook = workbooks.Open(fileName);
A totally different thought - instead of using Interop, can't you use OpenXML to generate your Excel file? I have never tried it to create drop downs, but if it supports it, it will be massively faster than Interop, and the type of problems you have won't happen.
Hope this helps.
As I know the grow of number of runnig excel.exe processes is 'normal' situation to excel :)
The dumbest advice is just kill sometimes it's processes. BUT, this way will be absolutely unhelpful if you use excel during your app is working because of you rather don't get which one excel.exe is yours.

Unexpected null members when reading XLSX with Excel Data Reader (.Net)

I'm reading an XLSX (Microsoft Excel XML file) using the Excel Data Reader from http://exceldatareader.codeplex.com/ and am having a problem with missing data. Data which is in the source Excel spreadsheet is missing from the data set returned by the library.
Here's a bit more detail of what I'm doing:
Created a simple test spreadsheet in Excel with one sheet, a header row and two data rows. Save and close Excel.
Open the file and pass the stream into the CreateOpenXmlReader() method and get back an IExcelDataReader.
Call the AsDataSet() method on the IExcelDataReader and get back a DataSet.
Get the ItemArray from row 1 of table 0.
Loop through the ItemArray. Discovered there is data missing (i.e. there are System.DBNull members where I expected System.string members).
Here's a bit more analysis...
I debugged the code and looked inside the ExcelDataReader object model. Found a non-public string array called "SST" which appears to contain the data from the spreadsheet as a single linear (one-dimensional) array.
On closer inspection, I found that the data I was looking for was also missing from this array. In this raw data, the member does not exist at all.
My guess is that for some reason the parser is not picking up the data from the OOXML and concluding that the cell is empty. Looking at the OOXML itself, the data seems to be split across the sharedStrings.xml and sheet1.xml files, so perhaps the parser is having a tough time putting all this together.
Saving the file in binary format (Excel 97 to 2003) and reading that in solves the problem so on the surface that seems to confirm my suspicion is with reading the OOXML format.
Suggestions?
As a stop gap I'm converting all files to binary format, but that seems like a kludge. Is there some way to get my OOXML formatted Excel files to read in properly with Excel Data Reader?
To retrieve an Excel spreadsheet (.xlsx) and load it into a DataSet, you don't need to mess with XML readers or a separate library like Excel Data Reader. The code for reading an entire spreadsheet into a DataSet is pretty simple when using the normal OleDb functions in .NET:
Sub readInMyExcelFile
Dim xlsFile as string = "myexcelfile"
Dim conStr As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & xlsFile & ";Extended Properties=""Excel 12.0 Xml;HDR=YES"""
Dim dtSheets As New DataTable
Dim tmp As String
Dim sqlText as Sting
Using cn As New OleDbConnection(conStr)
cn.Open()
dtSheets = cn.GetSchema("Tables")
End Using
//Dataset for the spreadsheet
Dim ds as New DataSet
/Loop through the names of all the worksheets in the file.
For Each rw as DataRow in dtSheets.Rows
tmp = rw("TABLE_NAME")
tmp = "[" & tmp & "]"
Dim dt as New DataTable
Using cn as New OleDbConnection(conStr)
cn.Open
/Retrieve all the records from the worksheet.
sqlText = "SELECT * FROM " & tblName
Dim adp As New OleDbDataAdapter(sqlText, cn)
/Fill the data table with the all the data.
adp.Fill(dt)
End Using
ds.Tables.Add(dt)
Next
End Sub
It seems there is a bug in Excel Data Reader (it is first time I have heard about it). Do you have to use it? If not, EPPlus would be a better choice.
excel datareader from codeplex is used for reading data from the excel file directly on web application without any sort of caching on the server.the above code only stands when we can store the excel file somewhere.I have faced similar problems with exceldatareader where some of the data are missing.Most importanly i coludnt find any specific trend.All i cud see that if all the rows have values then there is no problem. Best chance is to convert xlsx to xls.

Resources