I'm using OleDb to read from an Excel workbook with many sheets.
I need to read the sheet names, but I need them in the order they are defined in the spreadsheet. So if I have a file that looks like this:
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
|_____|_____|____|____|____|____|____|____|____|
\__GERMANY__/\__UK__/\__IRELAND__/
Then I need to get the dictionary
1="GERMANY",
2="UK",
3="IRELAND"
I've tried using OleDbConnection.GetOleDbSchemaTable(), and that gives me the list of names, but it sorts them alphabetically. The alphabetical sort means I don't know which sheet number a particular name corresponds to. So I get:
GERMANY, IRELAND, UK
which has changed the order of UK and IRELAND.
The reason I need the original order is that I have to let the user choose a range of data by name or index; they can ask for 'all the data from GERMANY to IRELAND' or 'data from sheet 1 to sheet 3'.
Any ideas would be greatly appreciated.
If I could use the Office interop classes, this would be straightforward. Unfortunately, I can't, because the interop classes don't work reliably in non-interactive environments such as Windows services and ASP.NET sites, so I need to use OLEDB.
Can you not just loop through the sheets from 0 to the count of names - 1? That way you should get them in the correct order.
Edit
I noticed through the comments that there are a lot of concerns about using the Interop classes to retrieve the sheet names. Therefore here is an example using OLEDB to retrieve them:
/// <summary>
/// This method retrieves the excel sheet names from
/// an excel workbook.
/// </summary>
/// <param name="excelFile">The excel file.</param>
/// <returns>String[]</returns>
private String[] GetExcelSheetNames(string excelFile)
{
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
// Connection String. Change the excel file to the file you
// will search.
String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
// Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
// Open connection with the database.
objConn.Open();
// Get the data table containing the schema GUID.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if(dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
// Add the sheet name to the string array.
foreach(DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want to...
for(int j=0; j < excelSheets.Length; j++)
{
// Query each excel sheet.
}
return excelSheets;
}
catch(Exception)
{
return null;
}
finally
{
// Clean up.
if(objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if(dt != null)
{
dt.Dispose();
}
}
}
Extracted from an article on CodeProject.
Since the above code does not cover extracting the list of sheet names for Excel 2007, the following code works for both Excel 97-2003 and Excel 2007:
public List<string> ListSheetInExcel(string filePath)
{
OleDbConnectionStringBuilder sbConnection = new OleDbConnectionStringBuilder();
String strExtendedProperties = String.Empty;
sbConnection.DataSource = filePath;
if (Path.GetExtension(filePath).Equals(".xls"))//for 97-03 Excel file
{
sbConnection.Provider = "Microsoft.Jet.OLEDB.4.0";
strExtendedProperties = "Excel 8.0;HDR=Yes;IMEX=1";//HDR=ColumnHeader,IMEX=InterMixed
}
else if (Path.GetExtension(filePath).Equals(".xlsx")) //for 2007 Excel file
{
sbConnection.Provider = "Microsoft.ACE.OLEDB.12.0";
strExtendedProperties = "Excel 12.0;HDR=Yes;IMEX=1";
}
sbConnection.Add("Extended Properties",strExtendedProperties);
List<string> listSheet = new List<string>();
using (OleDbConnection conn = new OleDbConnection(sbConnection.ToString()))
{
conn.Open();
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dtSheet.Rows)
{
if (drSheet["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
listSheet.Add(drSheet["TABLE_NAME"].ToString());
}
}
}
return listSheet;
}
The above function returns the list of sheets in a given Excel file for both file types (97-2003 and 2007).
I can't find this in the actual MSDN documentation, but a moderator in the forums said:
"I am afraid that OLEDB does not preserve the sheet order as they were in Excel"
Excel Sheet Names in Sheet Order
Seems like this would be a common enough requirement that there would be a decent workaround.
This is short, fast, safe, and usable...
public static List<string> ToExcelsSheetList(string excelFilePath)
{
List<string> sheets = new List<string>();
using (OleDbConnection connection =
new OleDbConnection((excelFilePath.TrimEnd().ToLower().EndsWith("x"))
? "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + excelFilePath + "';" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'"
: "provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + excelFilePath + "';Extended Properties=Excel 8.0;"))
{
connection.Open();
DataTable dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drSheet in dt.Rows)
if (drSheet["TABLE_NAME"].ToString().Contains("$"))
{
string s = drSheet["TABLE_NAME"].ToString();
sheets.Add(s.StartsWith("'")?s.Substring(1, s.Length - 3): s.Substring(0, s.Length - 1));
}
connection.Close();
}
return sheets;
}
Another way:
An .xlsx file (unlike the older binary .xls format) is just a collection of *.xml files stored in a *.zip container.
Unzip it and open the file "app.xml" in the folder docProps:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
<TotalTime>0</TotalTime>
<Application>Microsoft Excel</Application>
<DocSecurity>0</DocSecurity>
<ScaleCrop>false</ScaleCrop>
<HeadingPairs>
<vt:vector baseType="variant" size="2">
<vt:variant>
<vt:lpstr>Arbeitsblätter</vt:lpstr>
</vt:variant>
<vt:variant>
<vt:i4>4</vt:i4>
</vt:variant>
</vt:vector>
</HeadingPairs>
<TitlesOfParts>
<vt:vector baseType="lpstr" size="4">
<vt:lpstr>Tabelle3</vt:lpstr>
<vt:lpstr>Tabelle4</vt:lpstr>
<vt:lpstr>Tabelle1</vt:lpstr>
<vt:lpstr>Tabelle2</vt:lpstr>
</vt:vector>
</TitlesOfParts>
<Company/>
<LinksUpToDate>false</LinksUpToDate>
<SharedDoc>false</SharedDoc>
<HyperlinksChanged>false</HyperlinksChanged>
<AppVersion>14.0300</AppVersion>
</Properties>
The file comes from a German installation (Arbeitsblätter = worksheets).
The sheet names (Tabelle3 etc.) are in the correct order. You just need to read these tags ;)
regards
I have created the function below using the information provided in the answer from @kraeppy (https://stackoverflow.com/a/19930386/2617732). It requires .NET Framework 4.5 and a reference to System.IO.Compression. It only works for xlsx files, not for the older xls files.
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
static IEnumerable<string> GetWorksheetNamesOrdered(string fileName)
{
//open the excel file
using (FileStream data = new FileStream(fileName, FileMode.Open))
{
//unzip
ZipArchive archive = new ZipArchive(data);
//select the correct file from the archive
ZipArchiveEntry appxmlFile = archive.Entries.SingleOrDefault(e => e.FullName == "docProps/app.xml");
//read the xml
XDocument xdoc = XDocument.Load(appxmlFile.Open());
//find the titles element
XElement titlesElement = xdoc.Descendants().Where(e => e.Name.LocalName == "TitlesOfParts").Single();
//extract the worksheet names
return titlesElement
.Elements().Where(e => e.Name.LocalName == "vector").Single()
.Elements().Where(e => e.Name.LocalName == "lpstr")
.Select(e => e.Value);
}
}
I like @deathApril's idea of naming the sheets 1_Germany, 2_UK, 3_IRELAND. I also understand the problem of doing this rename for hundreds of sheets. If renaming the sheets is acceptable, you can use this macro to do it for you; it renames all sheets within seconds. Unfortunately, ODBC and OLEDB return the sheet names in ascending alphabetical order, and there is no way around that. You either have to use COM or rename the sheets so that they sort in the right order.
Sub Macro1()
'
' Macro1 Macro
'
'
Dim i As Integer
For i = 1 To Sheets.Count
    ' Zero-pad the index to four digits so the names sort in sheet order
    Dim prefix As String
    prefix = Format(i, "0000")
    Dim sheetName As String
    sheetName = Sheets(i).Name
    Dim names
    names = Split(sheetName, "-")
    If (UBound(names) > 0) And IsNumeric(names(0)) Then
        'already prefixed: do nothing
    Else
        Sheets(i).Name = prefix & "-" & sheetName
    End If
Next
End Sub
UPDATE:
After reading @SidHoland's comment regarding BIFF, an idea occurred to me. The following steps can be done through code. I don't know if you really want to go that far to get the sheet names in order. Let me know if you need help doing this through code.
1. Treat the XLSX as a zip file: rename *.xlsx to *.zip
2. Unzip it
3. Go to the root of the unzipped folder and open docProps/app.xml
4. This XML contains the sheet names in the same order as you see them in Excel
5. Parse the XML and get the sheet names
UPDATE:
Another solution - NPOI might be helpful here
http://npoi.codeplex.com/
FileStream file = new FileStream(@"yourexcelfilename", FileMode.Open, FileAccess.Read);
HSSFWorkbook hssfworkbook = new HSSFWorkbook(file);
for (int i = 0; i < hssfworkbook.NumberOfSheets; i++)
{
Console.WriteLine(hssfworkbook.GetSheetName(i));
}
file.Close();
This solution works for xls. I didn't try xlsx.
Thanks,
Esen
This worked for me. Stolen from here: How do you get the name of the first page of an excel workbook?
object opt = System.Reflection.Missing.Value;
Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Excel.Workbook workbook = app.Workbooks.Open(WorkBookToOpen,
opt, opt, opt, opt, opt, opt, opt,
opt, opt, opt, opt, opt, opt, opt);
Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
string firstSheetName = worksheet.Name;
Try this. Here is the code to get the sheet names in order.
private Dictionary<int, string> GetExcelSheetNames(string fileName)
{
Excel.Application _excel = null;
Excel.Workbook _workBook = null;
Dictionary<int, string> excelSheets = new Dictionary<int, string>();
try
{
object missing = Type.Missing;
object readOnly = true;
_excel = new Excel.ApplicationClass();
_excel.Visible = false;
_workBook = _excel.Workbooks.Open(fileName, 0, readOnly, 5, missing,
missing, true, Excel.XlPlatform.xlWindows, "\\t", false, false, 0, true, true, missing);
if (_workBook != null)
{
int index = 0;
foreach (Excel.Worksheet sheet in _workBook.Sheets)
{
// Can get sheet names in order they are in workbook
excelSheets.Add(++index, sheet.Name);
}
}
}
catch (Exception e)
{
return null;
}
finally
{
if (_excel != null)
{
if (_workBook != null)
_workBook.Close(false, Type.Missing, Type.Missing);
_excel.Application.Quit();
}
_excel = null;
_workBook = null;
}
return excelSheets;
}
According to MSDN, this may not work for spreadsheets because Excel files are not real databases, so you will not be able to get the sheet names in the order in which they appear in the workbook.
Code to get the sheet names in their visual order using interop:
Add a reference to the Microsoft Excel 12.0 Object Library.
The following code returns the sheet names in the order stored in the workbook, not sorted by name.
Sample Code:
using System.Collections;
using Microsoft.Office.Interop.Excel;
string filename = "C:\\romil.xlsx";
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Excel.Application excel = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =excel.Workbooks.Open(filename, missing, missing, missing, missing,missing, missing, missing, missing, missing, missing, missing, missing, missing, missing);
ArrayList sheetname = new ArrayList();
foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in wb.Sheets)
{
sheetname.Add(sheet.Name);
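}
// Close the workbook without saving and shut down the Excel instance
wb.Close(false, missing, missing);
excel.Quit();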
I don't see any documentation that says the order in app.xml is guaranteed to be the order of the sheets. It PROBABLY is, but the OOXML specification does not say so.
The workbook.xml file, on the other hand, is described in the OOXML specification as the place where the sequence of the sheets is kept: the order of the <sheet> elements inside <sheets> is the tab order. Note that the sheetId attribute is only a unique identifier; it does not have to run from 1 to the number of sheets, and it stays with a sheet when the tabs are reordered, so rely on the element order rather than sorting by sheetId.
So reading workbook.xml after it is extracted from the XLSX would be my recommendation, NOT app.xml. Instead of docProps/app.xml, use xl/workbook.xml and look at the <sheets> element, as shown here:
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303" />
<workbookPr defaultThemeVersion="124226" />
<bookViews>
<workbookView xWindow="120" yWindow="135" windowWidth="19035" windowHeight="8445" />
</bookViews>
<sheets>
<sheet name="By song" sheetId="1" r:id="rId1" />
<sheet name="By actors" sheetId="2" r:id="rId2" />
<sheet name="By pit" sheetId="3" r:id="rId3" />
</sheets>
<definedNames>
<definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'By song'!$A$1:$O$59</definedName>
</definedNames>
<calcPr calcId="145621" />
</workbook>
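A minimal sketch of the workbook.xml approach, as an illustration rather than code from the answer above. The class and method names are hypothetical. It assumes .NET 4.5 or later with references to System.IO.Compression and System.Xml.Linq, and it only applies to .xlsx files, not to the binary .xls format. It returns the name attributes of the sheet elements in document order and ignores sheetId.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class WorkbookSheetOrder // hypothetical helper name
{
    // Returns the sheet names in the order of the <sheet> elements
    // found in xl/workbook.xml inside the .xlsx zip container.
    public static List<string> GetSheetNames(Stream xlsxStream)
    {
        using (var archive = new ZipArchive(xlsxStream, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = archive.GetEntry("xl/workbook.xml");
            if (entry == null)
                throw new InvalidDataException("xl/workbook.xml not found; is this an .xlsx file?");

            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);
                // Match on local names so the XML namespace does not matter.
                return doc.Root
                    .Elements().Where(e => e.Name.LocalName == "sheets")
                    .Elements().Where(e => e.Name.LocalName == "sheet")
                    .Select(e => (string)e.Attribute("name"))
                    .ToList();
            }
        }
    }

    // Convenience overload that opens the file read-only.
    public static List<string> GetSheetNames(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            return GetSheetNames(fs);
        }
    }
}
```

The position in the returned list plus one gives the sheet number, so a lookup such as the 1="GERMANY", 2="UK", 3="IRELAND" dictionary from the question can be built from the list with its indices.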
I need to get the column headers of CSV files with LINQ to Excel.
I use the code from https://github.com/paulyoder/LinqToExcel
This works perfectly for .XLSX files but not for CSV.
//Select file
var book = new LinqToExcel.ExcelQueryFactory(link + @"\" + fileName);
//Select first worksheet
var query = (from row in book.Worksheet(0) select row).ToList();
var workSheetName = book.GetWorksheetNames();
var columnNames = (from row in book.GetColumnNames(workSheetName.FirstOrDefault()) select row).ToList();
I also tried hardcoding the sheet name, calling the CSV sheet "Sheet1":
var columnNames = (from row in book.GetColumnNames("Sheet1") select row).ToList();
This breaks and gives me this error:
Message = "'54733658.csv' is not a valid worksheet name in file...
I double-checked that it is the correct path.
I then tried the following (it uses the file name without the extension as the worksheet name):
string extension = System.IO.Path.GetExtension(fileName);
string result = fileName.Substring(0, fileName.Length - extension.Length);
var colNames = book.GetColumnNames(result, "A1:F1").ToList();
This gives me the following error:
The Microsoft Jet database engine could not find the object '02119249$A1_Z1.txt'. Make sure the object exists and that you spell its name and the path name correctly.
I googled that error, but the results are not applicable.
I don't have enough time to figure out why I can't read the CSV column headers.
For those of you who have the same issue:
Read the first line and split it into a list of strings. Here is my method:
public List<string> ColumnNameGenerator(string FilePath)
{
string firstLine = "";
using (StreamReader reader = new StreamReader(FilePath))
{
firstLine = reader.ReadLine() ?? "";
}
return firstLine.Split(',').ToList();
}
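Splitting on ',' is enough for simple headers, but it breaks when a header is quoted and contains a comma. Below is a minimal sketch that also handles double-quoted fields. The class and method names are hypothetical, and it assumes the header fits on a single line and uses RFC 4180-style quoting (a doubled quote inside a quoted field stands for one quote character).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvHeader // hypothetical helper name
{
    // Splits one CSV line into fields, honoring double-quoted fields.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote inside a quoted field
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field separator
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());  // last field
        return fields;
    }

    // Reads the first line of the file and returns its column names.
    public static List<string> ReadColumnNames(string filePath)
    {
        using (var reader = new StreamReader(filePath))
        {
            return SplitLine(reader.ReadLine() ?? "");
        }
    }
}
```

For unquoted headers this behaves like the Split(',') version above, including returning a single empty string for an empty file.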
When trying to read input data from the Excel sheet while working with the OATS tool, the function always ends up in the catch block. The script is below. Please help us resolve this issue.
public String getInputfromExcel(int argColumnNumber,int argRowNumber)throws Exception
{
String inputExcelName = dataPath+".xlsx";
String cellContent = "12";
try
{
Workbook workbook = Workbook.getWorkbook(new File(inputExcelName));
Sheet sheet = workbook.getSheet(0);
Cell a1 = sheet.getCell(argColumnNumber, argRowNumber);
cellContent = (a1.getContents()).toString();
System.out.println(cellContent.toString());
workbook.close();
}
catch (Exception e)
{
addReport("Getting Input From Excel", "Fail","Exception while reading value from excel sheet");
}
return cellContent;
}
Axel has already made the main point. On a further note, the argument order in the question is correct for JExcelApi (jxl): sheet.getCell(column, row) takes the column index first and the row index second, and both are 0-based.
It's an old question, but I'm posting an answer in case it helps someone.
In Oracle Application Testing Suite there is NO NEED for external JARs to read/write data.
You can enable the DataTable module in the tool.
A complete explanation is given here: http://www.testinghive.com/how-to-read-write-excel-in-oats/
//Define the sheet name to be read; provide comma-separated names to read multiple sheets
String sheetName = "Sheet1";
//Path of the Excel file
String strFile= "C:\\Demo\\test.xls";
//Array list holding the sheets to read
List<String> sheetList = new ArrayList<String>();
sheetList.add(sheetName);
//Imports Sheet1
datatable.importSheets(strFile, sheetList, true, true);
//get rowcount
info("Total rows :"+datatable.getRowCount());
int rowcount=datatable.getRowCount();
//Loop to read all rows
for (int i=0;i<rowcount;i++)
{
//Set the current row from where you need to start reading; in this case start from the first row
datatable.setCurrentRow(sheetName, i);
String strCompany=(String) datatable.getValue(sheetName,i,"Company");
String strEmpFName=(String) datatable.getValue(sheetName,i,"FirstName");
String strEmpLName=(String) datatable.getValue(sheetName,i,"LastName");
String strEmpID=(String) datatable.getValue(sheetName,i,"EmpID");
String strLocation=(String) datatable.getValue(sheetName,i,"Location");
//prints first name and last name
System.out.println("First Name : "+strEmpFName+", Last Name : "+strEmpLName);
//Sets ACTIVE column in excel sheet to Y
String strActive="Y";
datatable.setValue(sheetName, i, datatable.getColumn(sheetName, datatable.getColumnIndex("Active")), strActive);
}
//Exports the sheet with the updated values, i.e. the ACTIVE column set to Y
datatable.exportToExcel("C:\\Demo\\test1.xlsx");
There is a DataGridView that has a CheckBox column. I'm generating the other columns of the DataGridView from a table. That works fine, but now I'm trying to check some of the checkboxes using the code below, and it is not working:
string query = "SELECT ID, Group_Name+' '+Phone_No as Info FROM Group_Info";
GenerateGridView(dataGridView1, query);
DataTable dt = GetTableData("SELECT Group_ID FROM tblGenerate");
foreach(DataRow rw in dt.Rows )
{
foreach (DataGridViewRow row in dataGridView1.Rows)
{
DataGridViewCheckBoxCell chk = (DataGridViewCheckBoxCell)(row.Cells[0].Value );
if (Convert.ToInt32(row.Cells[1].Value) == Convert.ToInt32(rw["Group_ID"]))
{
chk.Value = chk.TrueValue;
}
}
}
How can we do this?
You could try this:
DataGridViewCheckBoxColumn chk = new DataGridViewCheckBoxColumn();
dataGridView1.Columns.Add(chk);
chk.HeaderText = "Check Data";
chk.Name = "chk";
dataGridView1.Rows[2].Cells[3].Value = true;
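The cast in the question fails because row.Cells[0].Value is the value stored in the cell (initially null), not the cell object itself; assigning true to Value is what checks the box. Below is a minimal sketch of the matching loop, assuming the checkbox column and the ID column are at the indices passed in (0 and 1 in the question). The helper name and its parameters are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public static class GridCheckHelper // hypothetical helper name
{
    // Checks the box in checkBoxColumn for every row whose idColumn value is in ids.
    public static void CheckRows(DataGridView grid, ICollection<int> ids,
                                 int checkBoxColumn, int idColumn)
    {
        foreach (DataGridViewRow row in grid.Rows)
        {
            if (row.IsNewRow) continue; // skip the empty placeholder row

            object raw = row.Cells[idColumn].Value;
            if (raw == null || raw == DBNull.Value) continue;

            if (ids.Contains(Convert.ToInt32(raw)))
            {
                // Set the cell's value; do not cast the value to a cell type.
                row.Cells[checkBoxColumn].Value = true;
            }
        }
    }
}
```

With the question's data, the Group_ID values from tblGenerate could be collected into a HashSet<int> once and passed as ids, which also avoids the nested loop over both tables.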
I have a SharePoint (2007) list with some items in it. When I click on one of these items, it opens an Excel (2003) file with a lot of macros. I need to get the ID of this SharePoint item and send it to a cell of my Excel file. Then a macro will be executed and get all the data we need for this ID.
How can I send the item's ID to my Excel file?
Any ideas?
Thanks
I once wrote the tables of a DataSet into a new Excel file. You can change the function parameter from DataSet to SPList/SPListItem and write to an existing file (my current implementation writes to a new Excel file every time I execute this function). Also, make sure you add references to the Excel (COM) objects, e.g. the Microsoft Excel 12.0 Object Library. If you need more help, let me know.
public void excelgenerate(DataSet ds)
{
Microsoft.Office.Interop.Excel.Application oAppln;
//declaring work book
Microsoft.Office.Interop.Excel.Workbook oWorkBook;
//declaring worksheet
Microsoft.Office.Interop.Excel.Worksheet oWorkSheet;
oAppln = new Microsoft.Office.Interop.Excel.Application();
oWorkBook = (Microsoft.Office.Interop.Excel.Workbook)(oAppln.Workbooks.Add(true));
Microsoft.Office.Interop.Excel.Range wRange;
foreach (DataTable table in ds.Tables)
{
oWorkSheet = (Microsoft.Office.Interop.Excel.Worksheet)(oWorkBook.Worksheets.Add(Type.Missing, Type.Missing, Type.Missing, Type.Missing));
oWorkSheet.Name = table.TableName;
oWorkSheet.Activate();
DataRow dr = table.Rows[0];
string path = dr["Path"].ToString();
if (path.Length > 0)
{
string[] mylist = path.Split('\\');
var features = Array.FindLastIndex(mylist, str => str.Equals("Features"));
string stringmine = "Type ---> " + mylist[4]
+ "/" + mylist[5]
+ " Project Name ---> " + mylist[6]
+ " Feature Name ---> " + mylist[features + 1];
oWorkSheet.Cells[1, 1] = stringmine;
Microsoft.Office.Interop.Excel.Range colrange = oWorkSheet.get_Range(oWorkSheet.Cells[1, 1], oWorkSheet.Cells[1, 8]);
colrange.Merge(true);
}
int ColumnIndex = 0;
foreach (DataColumn col in table.Columns)
{
ColumnIndex++;
oWorkSheet.Cells[2, ColumnIndex] = col.ColumnName;
wRange = (Microsoft.Office.Interop.Excel.Range)oWorkSheet.Cells[2, ColumnIndex];
wRange.Font.Bold = true;
}
int rowIndex = 1;
foreach (DataRow row in table.Rows)
{
rowIndex++;
ColumnIndex = 0;
foreach (DataColumn col in table.Columns)
{
ColumnIndex++;
oWorkSheet.Cells[rowIndex + 1, ColumnIndex] = row[col.ColumnName].ToString();
}
}
oWorkSheet.Columns.AutoFit();
oWorkSheet.Rows.AutoFit();
}
string fileName = System.Guid.NewGuid().ToString().Replace("-", "") + ".xls";
Console.WriteLine("Number of sheets written : " + oWorkBook.Worksheets.Count);
oWorkBook.SaveAs(fileName, Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookNormal, null, null, false, false, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, false, false, null, null, null);
oWorkBook.Close(null, null, null);
oAppln.Quit();
}
For executing the macro using C# ASP.NET and SharePoint, I would recommend this article
Hope it will answer your question!
Unless there is a reason why you cannot link the SharePoint list data directly into a worksheet and bring your macros into that spreadsheet, I think the steps below will get you what you need. It seems too simple, so there may be a reason this does not work for what you're trying to do. In any case, here are the steps to make this work:
1) Make sure the SharePoint list actually has an indexed column that enforces unique values. You can check this by looking at the document library settings and making sure there is an index column under the columns listing. If there is not one, you can create one by selecting the "create new column" action, selecting your data type, and making sure that you select the radio button that says "enforce unique values".
2) Export the library to Excel using the "export to excel" option in the library's main page menu. This establishes a data link by default and stores an Excel query file at a default location on your machine, which you can find by going to the Data tab and selecting "Connections".
3) Copy the macro into the spreadsheet that is linked to your data source and adjust the references in your macro to extract the information you need from the SharePoint list.
Hope this helps.