How to skip rows while importing ExcelSheet with variable sheetname? - excel

I have to import ExcelSheets with variable Sheetnames via SSIS. Sheetname is determined by scripttask and passed via User::Variable to ExcelSource. The problem is that data/headline always starts at row 11.
How is it possible to pass something like "$A10:AB50" (required data selection) to the sheetname delivered in User::Variable?
I´ve already tried to pass the required data-selection-string (e.g. "$A10:AB50") to OpenRowsetVariable, ExcelConnectionManager, ConnectionString... without success. I also tried skipping/splitting first 10 rows via conditional split, but couldn`t find any approach to this.
A test with a second ConnectionManager getting data from a defined file and that has data access via sql-query (e.q. "select f1,f2... from [sheetname$a1:ab50]") works great indeed.
This is how scripttask determines sheetname:
ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + FILEPATH + ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=0\"";
OleDbConnection cnn = new OleDbConnection(ConStr);
cnn.Open();
DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetname = "";
foreach (DataRow drSheet in dtSheet.Rows)
{
// sheetname only
if (drSheet["TABLE_NAME"].ToString().Contains("$") && !drSheet["TABLE_NAME"].ToString().Contains("Print_Area"))
{
sheetname = drSheet["TABLE_NAME"].ToString();
// return sheetname
//MessageBox.Show(sheetname);
Dts.Variables["User::rSheetName"].Value = sheetname;

Solved!
Simply added the decided data selection to the sheet-return-variable in scripttask:
if (drSheet["TABLE_NAME"].ToString().Contains("$") && !drSheet["TABLE_NAME"].ToString().Contains("Print_"))
{
SheetName = drSheet["TABLE_NAME"].ToString();
// concat datazone to sheetname
DataZone = "A10:AB50";
SnameValue = SheetName + DataZone;
SnameValue = SnameValue.Replace(#"'", "");
// return combined sheetname
MessageBox.Show(SnameValue);
Dts.Variables["User::rSheetName"].Value = SnameValue;
cnn.Close();
}
It was way to easy... thanks for thinking about it anyway ;-)

Related

Get Excel sheet names unsorted via delphi ADO [duplicate]

I'm using OleDb to read from an excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet; so If I have a file that looks like this;
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it alphabetically sorts them. The alpha-sort means I don't know which sheet number a particular name corresponds to. So I get;
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need it to be sorted is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
if I could use the office interop classes, this would be straightforward. Unfortunately, I can't because the interop classes don't work reliably in non-interactive environments such as windows services and ASP.NET sites, so I needed to use OLEDB.
Can you not just loop through the sheets from 0 to Count of names -1? that way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want too...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception ex)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from Article on the CodeProject.
Since above code do not cover procedures for extracting list of sheet name for Excel 2007,following code will be applicable for both Excel(97-2003) and Excel 2007 too:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
Above function returns list of sheet in particular excel file for both excel type(97,2003,2007).
Can't find this in actual MSDN documentation, but a moderator in the forums said
I am afraid that OLEDB does not preserve the sheet order as they were in Excel
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.
This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}
Another way:
a xls(x) file is just a collection of *.xml files stored in a *.zip container.
unzip the file "app.xml" in the folder docProps.
<?xml version="1.0" encoding="UTF-8" standalone="true"?>
-<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
-<HeadingPairs>
-<vt:vector baseType="variant" size="2">
-<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
-<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
-<TitlesOfParts>
-<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file is a german file (Arbeitsblätter = worksheets).
The table names (Tabelle3 etc) are in the correct order. You just need to read these tags;)
regards
I have created the below function using the information provided in the answer from #kraeppy (https://stackoverflow.com/a/19930386/2617732). This requires the .net framework v4.5 to be used and requires a reference to System.IO.Compression. This only works for xlsx files and not for the older xls files.
using System.IO.Compression;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}
I like the idea of #deathApril to name the sheets as 1_Germany, 2_UK, 3_IRELAND. I also got your issue to do this rename for hundreds of sheets. If you don't have a problem to rename the sheet name then you can use this macro to do it for you. It will take less than seconds to rename all sheet names. unfortunately ODBC, OLEDB return the sheet name order by asc. There is no replacement for that. You have to either use COM or rename your name to be in the order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
Dim prefix As String
prefix = i
If Len(prefix) < 4 Then
prefix = "000"
ElseIf Len(prefix) < 3 Then
prefix = "00"
ElseIf Len(prefix) < 2 Then
prefix = "0"
End If
Dim sheetName As String
sheetName = Sheets(i).Name
Dim names
names = Split(sheetName, "-")
If (UBound(names) > 0) And IsNumeric(names(0)) Then
'do nothing
Else
Sheets(i).Name = prefix & i & "-" & Sheets(i).Name
End If
Next
End Sub
UPDATE:
After reading #SidHoland comment regarding BIFF an idea flashed. The following steps can be done through code. Don't know if you really want to do that to get the sheet names in the same order. Let me know if you need help to do this through code.
1. Consider XLSX as a zip file. Rename *.xlsx into *.zip
2. Unzip
3. Go to unzipped folder root and open /docprops/app.xml
4. This xml contains the sheet name in the same order of what you see.
5. Parse the xml and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(#"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen
This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;
Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
Excel.XlFileFormat.xlWorkbookNormal
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}
As per MSDN, In a case of spreadsheets inside of Excel it might not work because Excel files are not real databases. So you will be not able to get the sheets name in order of their visualization in workbook.
Code to get sheets name as per their visual appearance using interop:
Add reference to Microsoft Excel 12.0 Object Library.
Following code will give the sheets name in the actual order stored in workbook, not the sorted name.
Sample Code:
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
}
I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but not according to the OOXML specification.
The workbook.xml file, on the other hand, includes the sheetId attribute, which does determine the sequence - from 1 to the number of sheets. This is according to the OOXML specification. workbook.xml is described as the place where the sequence of the sheets is kept.
So reading workbook.xml after it is extracted form the XLSX would be my recommendation. NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the element, as shown here -
`
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
- <bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
- <sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
- <definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
`

Allowing VB.NET app to convert Excel Files to Datatable

My VB.NET app currently allows me to convert CSV files to a datatable thanks to the code provided by David in this question I posted: Previous Question
Now I am trying to allow .XLSX files to be imported to a datatable as well. Currently the code looks like this:
Private Function ConvertCSVToDataTable(ByVal path As String) As DataTable
Dim dt As DataTable = New DataTable()
Using con As OleDb.OleDbConnection = New OleDb.OleDbConnection()
Try
If System.IO.Path.GetExtension(path) = ".csv" Then
con.ConnectionString = String.Format("Provider={0};Data Source={1};Extended Properties=""Text;HDR=YES;FMT=Delimited""", "Microsoft.Jet.OLEDB.4.0", IO.Path.GetDirectoryName(path))
ElseIf System.IO.Path.GetExtension(path) = ".xlsx" Then
con.ConnectionString = String.Format("Provider={0};Data Source={1};Extended Properties=""Excel 12.0 XML;HDR=Yes;""", "Microsoft.ACE.OLEDB.12.0", IO.Path.GetDirectoryName(path))
End If
Using cmd As OleDb.OleDbCommand = New OleDb.OleDbCommand("SELECT * FROM " & IO.Path.GetFileName(path), con)
Using da As OleDb.OleDbDataAdapter = New OleDb.OleDbDataAdapter(cmd)
con.Open()
da.Fill(dt)
con.Close()
End Using
End Using
Catch ex As Exception
Console.WriteLine(ex.ToString())
Finally
If con IsNot Nothing AndAlso con.State = ConnectionState.Open Then
con.Close()
End If
End Try
End Using
Return dt
End Function
However, when I run the code using the .XLSX file, I get the following error:
{"The Microsoft Office Access database engine cannot open or write to
the file 'C:\Users\XSLXFilePath'. It is already opened exclusively by
another user, or you need permission to view and write its data."}
The file is not open anywhere else to my knowledge. And the app also runs fine when .CSV file is put through it instead. How do I get the app to properly work for .XLSX, or any Excel file format?
I think that the error is that from the connection string and the OLEDB Command:
ConnectionString
You don't have to use IO.Path.GetDirectoryName(path) it returns the directory name, you have to provide the file full path:
con.ConnectionString = String.Format("Provider={0};Data Source={1};Extended Properties=""Excel 12.0 XML;HDR=Yes;""", "Microsoft.ACE.OLEDB.12.0", path)
Refer to this link for excel connectionstring generation function: import data from excel 2003 to dataTable
OLEDB Command
You must provide the Worksheet name in the Command instead of the Filename:
Using cmd As OleDb.OleDbCommand = New OleDb.OleDbCommand("SELECT * FROM [Sheet1$]" , con)
If the Sheet names is dynamic and you have to get the first sheet in the excel file:
Dim dbSchema as DataTable = con.GetOleDbSchemaTable (OleDbSchemaGuid.Tables, null)
Dim firstSheetname as String = dbSchema.Rows(0)("TABLE_NAME").ToString
Using cmd As OleDb.OleDbCommand = New OleDb.OleDbCommand("SELECT * FROM [" & firstSheetname & "]" , con)
References
Reading from excel using oledbcommand
Read and Write Excel Documents Using OLEDB
Use can use the following connection string for .xlsx file.
I have used it and working fine.
P_FIle = ( File Name with path )
P_Con_Str = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & P_File & ";Extended Properties=""Excel 12.0 XML;HDR=Yes;"""

Proper way to get excel sheet names using C# and oledb

I'm trying to figure out why the behavior I'm seeing and the "documented" behavior are different. I've read both of these articles:Read and Write Excel Documents Using OLEDB and Working with MS Excel(xls / xlsx) Using MDAC and Oledb and this is text from the second link.
If you read in the second link it says:
To Retrieve Schema Information of Excel Workbook :
You can get the worksheets that are present in the excel workbook using GetOleDbSchemaTable. Use the following snippet.
DataTable dtSchema = null;
dtSchema = conObj.GetOleDbSchemaTable(
OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
Here dtSchema will hold the list of all workbooks. Say we have two workbooks : wb1, wb2. The above code will return a list of wb1, wb1$,wb2,wb2$. We need to filter out $ elements.
However when I run this code I only get "wb1$ and wb2$". I can easily remove the $ in code but I'm trying to make sure I'm not going to have code that breaks when I put it on a different computers/OS/environment and it behaves as is documented. Can somebody tell my what or if something changed since these were written or if I'm missing some key piece. Something to note this is being developed in VS2015, Windows 7 Pro, and Office 2010 installed.
//Connection String
//string connstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties='Excel 8.0;HDR=NO;IMEX=1';"; // Extra blank space cannot appear in Office 2007 and the last version. And we need to pay attention on semicolon.
//string connstring = Provider = Microsoft.JET.OLEDB.4.0; Data Source = " + path + "; Extended Properties = 'Excel 8.0;HDR=NO;IMEX=1'; "; //This connection string is appropriate for Office 2007 and the older version. We can select the most suitable connection string according to Office version or our program.
using (OleDbConnection conn = new OleDbConnection(_connectionString))
{
conn.Open();
//DataTable sheetNames = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" }); //Get All Sheets Name
DataTable sheetNames = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null); //Get All Sheets Name
// Loop through all Sheets to get data
foreach (DataRow dr in sheetNames.Rows)
{
string sheetName = dr["TABLE_NAME"].ToString();
//if (!sheetName.EndsWith("$"))
// continue;
Debug.Print(sheetName);
}
return sheetNames;
Thanks
dbl

query for reading data from excel sheet in c#

Thanks Astander for replying to my query
I am here with more detailed query.
string cs = "Provider=Microsoft.ACE.OLEDB.12.0;" + "Data Source=" + #"D:\\sample.xls;" + "Excel 12.0;HDR=YES;";
OleDbConnection Excelcon = new OleDbConnection(cs);
OleDbDataAdapter ad = new OleDbDataAdapter();
ad.SelectCommand = new OleDbCommand("SELECT *FROM [Sheet1$]", Excelcon);
DataTable dt = new DataTable();
ad.Fill(dt);
return dt;
I am getting error at the select statement that :
The Microsoft Office Access database engine could not find the object 'Sheet1$'. Make sure the object exists and that you spell its name and the path name correctly.
Hope someone can help me find a solution.
What worked for me is,
when file was created, it was stored in some specific location. In my case,C:/Documents.
I had manually changed the location to D:
this was what I had written
string connStringExcel = #"Provider=Microsoft.ACE.OLEDB.12.0; Data Source=D:\example.xls;Extended Properties=""Excel 12.0;HDR=YES;""";`
So,the actual path should be
string connStringExcel = #"Provider=Microsoft.ACE.OLEDB.12.0; Data Source=C:\A\Documents\example.xls;Extended Properties=""Excel 12.0;HDR=YES;""";`
So on giving the path of correct location,my query was solved.
Hope it helps someone else too.
// Create connection string variable. Modify the "Data Source"
// parameter as appropriate for your environment.
String sConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + Server.MapPath("../ExcelData.xls") + ";" +
"Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
OleDbConnection objConn = new OleDbConnection(sConnectionString);
// Open connection with the database.
objConn.Open();
// The code to follow uses a SQL SELECT command to display the data from the worksheet.
// Create new OleDbCommand to return data from worksheet.
OleDbCommand objCmdSelect =new OleDbCommand("SELECT * FROM myRange1", objConn);
// Create new OleDbDataAdapter that is used to build a DataSet
// based on the preceding SQL SELECT statement.
OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
// Pass the Select command to the adapter.
objAdapter1.SelectCommand = objCmdSelect;
// Create new DataSet to hold information from the worksheet.
DataSet objDataset1 = new DataSet();
// Fill the DataSet with the information from the worksheet.
objAdapter1.Fill(objDataset1, "XLData");
// Bind data to DataGrid control.
DataGrid1.DataSource = objDataset1.Tables[0].DefaultView;
DataGrid1.DataBind();
// Clean up objects.
objConn.Close();
ref to thisLink

Why does one ADO.NET Excel query work and another does not?

I'm working on a SharePoint workflow, and the first step requires me to open an Excel workbook and read two things: a range of categories (from a range named, conveniently enough, Categories) and a category index (in the named range CategoryIndex). Categories is a list of roughly 100 cells, and CategoryIndex is a single cell.
I'm using ADO.NET to query the workbook
string connectionString =
"Provider=Microsoft.ACE.OLEDB.12.0;" +
"Data Source=" + temporaryFileName + ";" +
"Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";
OleDbConnection connection = new OleDbConnection(connectionString);
connection.Open();
OleDbCommand categoryIndexCommand = new OleDbCommand();
categoryIndexCommand.Connection = connection;
categoryIndexCommand.CommandText = "Select * From CategoryIndex";
OleDbDataReader indexReader = categoryIndexCommand.ExecuteReader();
if (!indexReader.Read())
throw new Exception("No category selected.");
object indexValue = indexReader[0];
int categoryIndex;
if (!int.TryParse(indexValue.ToString(), out categoryIndex))
throw new Exception("Invalid category manager selected");
OleDbCommand selectCommand = new OleDbCommand();
selectCommand.Connection = connection;
selectCommand.CommandText = "SELECT * FROM Categories";
OleDbDataReader reader = selectCommand.ExecuteReader();
if (!reader.HasRows || categoryIndex >= reader.RecordsAffected)
throw new Exception("Invalid category/category manager selected.");
connection.Close();
Don't judge the code itself too harshly; it's been through a lot. Anyway, the first command never executes correctly. It doesn't throw an exception. It just returns an empty data set. (HasRows is true, and Read() returns false, but there is no data there) The second command works perfectly. These are both named ranges.
They are populated differently, however. There's a web service call that fills up Categories. Those values are displayed in a dropdown box. The selected index goes into CategoryIndex. After hours of banging my head, I decided to write a couple of lines of code so that the dropdown's value goes into a different cell, then I copy the value using a couple of lines of C# into CategoryIndex, so that the data is set identically. That turned out to be a blind alley, too.
Am I missing something? Why would one query work perfectly and the other fail to return any data?
I have found the issue. Excel was apparently unable to parse the value in the cell, so it was returning nothing. What I had to do was adjust the connection string to the following:
string connectionString =
"Provider=Microsoft.ACE.OLEDB.12.0;" +
"Data Source=" + temporaryFileName + ";" +
"Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1\"";
It would have been helpful if it would have thrown an exception or given any indication of why it was failing, but that's beside the point now. The option IMEX=1 tells Excel to treat all values as strings only. I'm quite capable of parsing my own integers, thankyouverymuch, Excel, so I didn't need its assistance.

Resources