I'm trying to get the header/footer parts from an Excel document so that I can do something with their contents, but I cannot seem to get anything from them.
I thought this would be pretty simple... Consider this code:
using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(filePath, true))
{
var headers = spreadsheet.GetPartsOfType<HeaderPart>().ToList();
foreach (var header in headers)
{
//do something
}
}
Even with a file that contains a header, headers is always empty. I've tried drilling down into the workbook -> worksheets -> etc., but I get nothing back. My test Excel file definitely has a header (headers are ghastly in Excel!).
Annoyingly, the OpenXML APIs for Excel seem to be worse than those for Word, since in a .docx you can get the headers by calling:
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
{
MainDocumentPart documentPart = wordDoc.MainDocumentPart;
var headerParts = wordDoc.MainDocumentPart.HeaderParts.ToList();
foreach (var headerPart in headerParts)
{
//do something
}
}
I've seen some people on google saying that I should query the worksheet's descendants (code from this link):
HeaderFooter hf = ws.Descendants<HeaderFooter>().FirstOrDefault();
if (hf != null)
{
//here you can add your code
//I just try to append here for demo
hf = new HeaderFooter();
ws.AppendChild<HeaderFooter>(hf);
}
But I cannot see any way of querying the workbook/sheet/anything with .Descendants, and none of the code examples on Google show how they got ws 🙃.
Any ideas? Thanks
HeaderFooter, as per your second example, is the correct way to read a Header or Footer from a Spreadsheet using OpenXML. The ws in your example refers to a Worksheet.
The following is an example that reads the HeaderFooter and dumps the InnerText to the console.
using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false))
{
WorkbookPart workbookPart = document.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
Worksheet ws = worksheetPart.Worksheet;
HeaderFooter hf = ws.Descendants<HeaderFooter>().FirstOrDefault();
if (hf != null)
{
Console.WriteLine(hf.InnerText);
}
}
I would highly recommend that you read the documentation for the HeaderFooter element as it's more complex than you might imagine. The documentation can be found in section 18.3.1.46 of the Fifth Edition of the Ecma Office Open XML Part 1 - Fundamentals And Markup Language Reference which can be found here.
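To give a feel for that complexity: the text stored in the header and footer elements (oddHeader, oddFooter and so on) is a single string with embedded format codes, where &L, &C and &R start the left, center and right sections and && is a literal ampersand. The following is only a rough sketch that splits such a string into its three sections using nothing but the base class library. The class and method names are made up for illustration, every other code (&P, &N, font codes, ...) is deliberately left untouched, and treating text before any section code as centered is an assumption to check against the specification.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HeaderFooterText
{
    // Splits a raw header/footer string such as "&LDraft&CReport&RPage &P"
    // into its left, center and right sections. Only &L, &C, &R and the
    // escaped ampersand (&&) are interpreted; all other codes stay in the text.
    public static Dictionary<char, string> SplitSections(string raw)
    {
        var sections = new Dictionary<char, StringBuilder>
        {
            ['L'] = new StringBuilder(),
            ['C'] = new StringBuilder(),
            ['R'] = new StringBuilder()
        };
        // Assumption: text that precedes any section code belongs to the center.
        char current = 'C';

        for (int i = 0; i < raw.Length; i++)
        {
            char ch = raw[i];
            if (ch == '&' && i + 1 < raw.Length)
            {
                char next = raw[i + 1];
                if (next == 'L' || next == 'C' || next == 'R')
                {
                    current = next; // switch to the named section
                    i++;
                    continue;
                }
                if (next == '&')
                {
                    sections[current].Append('&'); // && is a literal ampersand
                    i++;
                    continue;
                }
            }
            sections[current].Append(ch);
        }

        var result = new Dictionary<char, string>();
        foreach (var pair in sections)
        {
            result[pair.Key] = pair.Value.ToString();
        }
        return result;
    }

    public static void Main()
    {
        var parts = SplitSections("&LDraft&CQuarterly Report&RPage &P of &N");
        Console.WriteLine(parts['L']); // Draft
        Console.WriteLine(parts['C']); // Quarterly Report
        Console.WriteLine(parts['R']); // Page &P of &N
    }
}
```

In practice the input would be the text of one of the header or footer child elements rather than the combined InnerText of the whole HeaderFooter, which runs the individual headers and footers together.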
I'm working on a C# project that creates, encrypts and sends an Excel file as an e-mail attachment.
I'm not looking to protect the Excel file, but to encrypt it and set a password for opening it.
Is it even possible with the OpenXML SDK?
I tried the following, but I don't think it is meant for encryption:
string hexConvertedPassword = HexPasswordConversion("Azerty123");
WorkbookProtection WorkbookProt = new WorkbookProtection()
{
WorkbookPassword = hexConvertedPassword,
RevisionsPassword = hexConvertedPassword,
LockRevision = true,
LockStructure = true,
LockWindows = true,
};
Workbook workbook = new Workbook();
workbook.Append(WorkbookProt);
I have already searched the documentation, but may have missed it.
I know some libraries can do it, such as DevExpress, EasyXLS, Spire.Xls, etc., but those are very expensive.
You can try encrypting the content of the document by reading it cell by cell and generating a new .xlsx file from the encrypted values.
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
string text;
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
// Send the value to the encryption function here.
// CellValue is null for empty cells, so guard against that.
text = c.CellValue?.Text ?? string.Empty;
Console.Write(text + " ");
}
}
For more details visit: https://learn.microsoft.com/en-us/office/open-xml/how-to-parse-and-read-a-large-spreadsheet
In the end I used FreeSpire.XLS to load my existing Excel file and add encryption to it. Hopefully the free version places no limitation on loading and saving files.
Open XML is generating .xlsx files that can be read by Open Office, but not by Excel itself.
With this as my starting point (Export DataTable to Excel with Open Xml SDK in c#), I have added code to create a .xlsx file. When I attempt to open it with Excel, I'm asked if I want to repair the file. Saying yes gets "The workbook cannot be opened or repaired by Microsoft Excel because it's corrupt." After many hours of trying to jiggle the data from my table to make this work, I finally threw up my hands in despair and made a spreadsheet with a single number in the first cell.
Still corrupt.
Renaming it to .zip and exploring shows intact .xml files. On a whim, I took a legit .xlsx file created by Excel, unzipped it, rezipped it without changing the contents and renamed it back to .xlsx. Excel declared it corrupt. So this is clearly not a content issue, but a file format issue. Giving up on Friday, I sent some of the sample files home and opened them there with Libre Office. There were no issues at all. The file content was correct and Calc had no problem. I'm using Excel for Office 365, 32 bit.
// ignore the bits (var list) that get data from the database. I've reduced this to just the output of a single header line
List<ReportFilingHistoryModel> list = DB.Reports.Report.GetReportClientsFullHistoryFiltered<ReportFilingHistoryModel>(search, client, report, signature);
MemoryStream memStream = new MemoryStream();
using (SpreadsheetDocument workbook = SpreadsheetDocument.Create(memStream, SpreadsheetDocumentType.Workbook))
{
var workbookPart = workbook.AddWorkbookPart();
workbook.WorkbookPart.Workbook = new Workbook();
workbook.WorkbookPart.Workbook.Sheets = new Sheets();
var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
var sheetData = new SheetData();
sheetPart.Worksheet = new Worksheet(sheetData);
Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<Sheets>();
string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
uint sheetId = 1;
if (sheets.Elements<Sheet>().Count() > 0)
{
sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
Sheet sheet = new Sheet() { Id = relationshipId, SheetId = sheetId, Name = "History" };
sheets.Append(sheet);
Row headerRow = new Row();
foreach( var s in "Foo|Bar".Split('|'))
{
var cell = new Cell();
cell.DataType = CellValues.Number;
cell.CellValue = new CellValue("5");
headerRow.AppendChild(cell);
}
sheetData.AppendChild(headerRow);
}
memStream.Seek(0, SeekOrigin.Begin);
Guid result = DB.Reports.Report.AddClientHistoryList( "test.xlsx", memStream.GetBuffer(), "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
return Ok(result);
This should just work. I've noticed other Stack Overflow discussions that point back to the first link I mentioned above. I seem to be doing it right (and Calc concurs). There have been discussions of shared strings and whatnot, but since I'm using plain numbers I shouldn't be having those issues. What am I missing here?
In working on this, I had assumed that some extraneous junk on the end of a .zip file is harmless. 7-Zip, Windows Explorer and Libre Office all seem to agree (as does some other zip program I used at home whose name escapes me). Excel, however, does not. memStream.GetBuffer() returns the stream's whole internal buffer, which can be longer than the data written to it, so using the buffer itself was fine but using its length was not. (The preceding Seek() was unnecessary.) Limiting the write to the number of bytes actually written to the stream keeps Excel from going off the rails.
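The difference is easy to see in isolation. The sketch below uses only System.IO; the class and method names are invented for the illustration and are not part of the code in the question. It shows two ways of getting exactly the written bytes: MemoryStream.ToArray(), which copies Length bytes, and copying Length bytes out of GetBuffer() by hand.

```csharp
using System;
using System.IO;

public static class BufferVersusArray
{
    // ToArray() returns a copy containing exactly stream.Length bytes,
    // so none of the unused space in the internal buffer is included.
    public static byte[] WrittenBytes(MemoryStream stream)
    {
        return stream.ToArray();
    }

    // Equivalent result when starting from GetBuffer(): copy only Length bytes.
    public static byte[] WrittenBytesFromBuffer(MemoryStream stream)
    {
        byte[] exact = new byte[stream.Length];
        Array.Copy(stream.GetBuffer(), exact, exact.Length);
        return exact;
    }

    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(new byte[] { 1, 2, 3 }, 0, 3);

            Console.WriteLine(stream.Length);                       // bytes actually written
            Console.WriteLine(stream.GetBuffer().Length);           // internal capacity, usually larger
            Console.WriteLine(WrittenBytes(stream).Length);         // same as stream.Length
            Console.WriteLine(WrittenBytesFromBuffer(stream).Length);
        }
    }
}
```

Applied to the question's code, passing memStream.ToArray() instead of memStream.GetBuffer() would be one way to hand over only the bytes that belong to the .xlsx package.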
I want to add a new sheet to an existing CSV file, but I don't know how to go about it. I have already opened the .csv file and can access each element, so I want to create a new sheet in the existing .csv file and populate its cells with the data from the previous sheet.
class Program
{
static void Main(string[] args)
{
var reader = new StreamReader(File.OpenRead(@"C:\Users\Desktop\test.csv"));
List<string> listA = new List<string>();
List<string> listB = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
//line = line.Skip(1);
var values = line.Split(',');
listA.Add(values[0]);
listB.Add(values[1]);
}
// Print the collected columns once, after the whole file has been read.
listA.ForEach(Console.WriteLine);
listB.ForEach(Console.WriteLine);
Console.ReadLine();
}
}
I'm going to post this as an answer, even though it's kind of a non-answer. CSV files are simple flat-text files that are comma delimited. There are no higher-level concepts to this file type such as sheets, or cells, workbooks, or formulas.
Because a CSV file is nothing more than specially formatted text, the closest equivalent to a second sheet is a second file: you can create additional CSV files and name them accordingly.
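As a rough sketch of that idea, the code below writes each logical "sheet" to its own CSV file next to the original, for example test_ColumnA.csv and test_ColumnB.csv. The naming scheme, the class and the method names are all assumptions made for the illustration, and values are joined with plain commas, so they must not contain commas or quotes themselves.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvPerSheet
{
    // Builds the file name for one logical "sheet", e.g. test_Summary.csv.
    public static string SheetFileName(string basePath, string sheetName)
    {
        string directory = Path.GetDirectoryName(basePath) ?? "";
        string stem = Path.GetFileNameWithoutExtension(basePath);
        return Path.Combine(directory, stem + "_" + sheetName + ".csv");
    }

    // Writes every "sheet" (a list of rows, each row an array of values)
    // to its own CSV file derived from basePath.
    public static void WriteSheets(string basePath, IDictionary<string, List<string[]>> sheets)
    {
        foreach (var sheet in sheets)
        {
            var lines = new List<string>();
            foreach (string[] row in sheet.Value)
            {
                lines.Add(string.Join(",", row)); // no quoting or escaping is done here
            }
            File.WriteAllLines(SheetFileName(basePath, sheet.Key), lines);
        }
    }

    public static void Main()
    {
        string basePath = Path.Combine(Path.GetTempPath(), "test.csv");
        var sheets = new Dictionary<string, List<string[]>>
        {
            ["ColumnA"] = new List<string[]> { new[] { "a1" }, new[] { "a2" } },
            ["ColumnB"] = new List<string[]> { new[] { "b1" }, new[] { "b2" } }
        };

        WriteSheets(basePath, sheets);
        Console.WriteLine(SheetFileName(basePath, "ColumnA"));
        Console.WriteLine(SheetFileName(basePath, "ColumnB"));
    }
}
```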
If you want to create Excel files, and have individual sheets you can use various libraries or the COM Interops to do this.
COM Interop gives you a direct connection to a natively installed Excel. Here's an MSDN How-To for Excel. It lets you create a special object through which you can use Excel's API, even though that API is not a managed API of the .NET Framework.
Here's an example on how to add a sheet in that situation:
Microsoft.Office.Interop.Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
if (xlApp == null)
{
Console.WriteLine("EXCEL could not be started. Check that your office installation and project references are correct.");
return;
}
xlApp.Visible = true;
Workbook wb = xlApp.Workbooks.Add(XlWBATemplate.xlWBATWorksheet); //creates a new workbook that contains a single worksheet
Worksheet ws = (Worksheet)wb.Worksheets[1]; //access that worksheet linked into the workbook
if (ws == null)
{
Console.WriteLine("Worksheet could not be created. Check that your office installation and project references are correct.");
}
Another option is to use the Open XML SDK for Office, which can be used for the new Office formats (.xlsx for example). Personally, I've never used this library, but it plays a role for the .NET Framework similar to that of Apache POI for Java.
I've been looking on the web for 30 minutes now and can't find any explanation about that. Here is my problem :
I wrote an application with POI to parse some data from 200 Excel files or so and put some of it into a new file. I do some cell evaluation with FormulaEvaluator to know the content of the cells before choosing whether to keep them.
Now, when I test it on a test file with only values in the cells, the program works perfectly, but when I use it on my pile of files I get this error:
"could not resolve external workbook name"
Is there any way to ignore external workbook references, or to set up the environment so that it won't evaluate formulas with external references?
Because the ones I need don't contain references...
Thank you
Can you not just catch the error, and skip over that cell?
You're getting the error because you've asked POI to evaluate the formula in a cell, and that formula refers to a different file. However, you've not told POI where to find the file that's referenced, so it objects.
If you don't care about cells with external references, just catch the exception and move on to the next cell.
If you do care, you'll need to tell POI where to find your files. You do this with the setupEnvironment(String[],Evaluator[]) method - pass it an array of workbook names, and a matching array of evaluators for those workbooks.
In order for POI to be able to evaluate external references, it needs access to the workbooks in question. As these don't necessarily have the same names on your system as in the workbook, you need to give POI a map of external references to open workbooks, through the setupReferencedWorkbooks(java.util.Map<java.lang.String,FormulaEvaluator> workbooks) method.
I have done this. Please see the code below, which works fine on my side:
public static void writeWithExternalReference(String cellContent, boolean isRowUpdate, boolean isFormula)
{
try
{
File yourFile = new File("E:\\Book1.xlsx");
yourFile.createNewFile();
FileInputStream myxls = null;
myxls = new FileInputStream(yourFile);
XSSFWorkbook workbook = new XSSFWorkbook(myxls);
FormulaEvaluator mainWorkbookEvaluator = workbook.getCreationHelper().createFormulaEvaluator();
XSSFWorkbook workbook1 = new XSSFWorkbook(new File("E:\\elk\\lookup.xlsx"));
// Track the workbook references
Map<String,FormulaEvaluator> workbooks = new HashMap<String, FormulaEvaluator>();
workbooks.put("Book1.xlsx", mainWorkbookEvaluator);
workbooks.put("elk/lookup.xlsx", workbook1.getCreationHelper().createFormulaEvaluator());
// Attach them
mainWorkbookEvaluator.setupReferencedWorkbooks(workbooks);
XSSFSheet worksheet = workbook.getSheetAt(0);
XSSFRow row = null;
if (isRowUpdate) {
int lastRow = worksheet.getLastRowNum();
row = worksheet.createRow(++lastRow);
}
else {
row = worksheet.getRow(worksheet.getLastRowNum());
}
if (!isFormula) {
Cell cell = row.createCell(row.getLastCellNum()==-1 ? 0 : row.getLastCellNum());
cell.setCellValue(Double.parseDouble(cellContent));
} else {
XSSFCell cell = row.createCell(row.getLastCellNum()==-1 ? 0 : row.getLastCellNum());
System.out.println(cellContent);
cell.setCellFormula(cellContent);
mainWorkbookEvaluator.evaluateInCell(cell);
cell.setCellFormula(cellContent);
// mainWorkbookEvaluator.evaluateInCell(cell);
//System.out.println(cell.getCellFormula() + " = "+cell.getStringCellValue());
}
workbook1.close();
myxls.close();
FileOutputStream output_file =new FileOutputStream(yourFile,false);
//write changes
workbook.write(output_file);
output_file.close();
} catch (Exception e) {
e.printStackTrace();
}
}
I have some data that's currently stored in an Excel workbook. It makes sense for the data to be in Excel (in that it's easy to manage, easy to extend, do calcs, etc.) but some of the data there is required by an automated process, so from that point of view it would be more convenient if it were in a database.
To give the information more visibility, workflow, etc. I'm thinking of moving it to SharePoint. Actually turning it into a SharePoint form would be tedious & time-consuming, and then the flexibility/convenience would be lost; instead, I'm thinking of simply storing the current Excel file within a SharePoint library.
My problem then would be: how can the automated process extract the values it needs from the Excel workbook that now lives within the SharePoint library? Is this something that Excel Services can be used for? Or is there another/better way? And even if it can be done, is it a sensible thing to do?
Having gone through something similar, I can tell you it actually isn't that bad getting values out of an Excel file in a document library. I ended up writing a custom workflow action (used within a SharePoint Designer workflow) that reads values out of the Excel file for processing. I ended up choosing NPOI to handle all of the Excel operations.
Using NPOI, you can do something like this:
// get the document in the document library
SPList myList = web.Lists[listGuid];
SPListItem myItem = myList.GetItemById(ListItem);
SPFile file = myItem.File;
using (Stream stream = file.OpenBinaryStream())
{
HSSFWorkbook workbook = new HSSFWorkbook(stream);
HSSFSheet sheet = workbook.GetSheet("Sheet1");
CellReference c = new CellReference("A1");
HSSFRow row = sheet.GetRow(c.Row);
HSSFCell cell = row.GetCell(c.Col);
string cellValue = cell.StringCellValue;
// etc...
}
You could easily put this in a console application as well.
Yes, I am trying to extract a range of cells on several sheets within a workbook. I was able to use some of the code below in a console application and view the data within the command window. Now I need to dump the data to a SQL Table and was looking for some examples on how to accomplish this and make sure I am going down the correct coding path.
Here is a snapshot of the code I am using.
protected override ActivityExecutionStatus Execute(ActivityExecutionContext executionContext)
{
using (SPSite site = new SPSite(SPContext.Current.Site.Url))
{
using (SPWeb web = site.RootWeb)
{
SPList docList = web.Lists[__ListId];
SPListItem docItem = docList.GetItemById(__ListItem);
SPFile docFile = docItem.File;
using (Stream stream = docFile.OpenBinaryStream())
{
HSSFWorkbook wb = new HSSFWorkbook(stream);
//loop through each sheet in file, ignoring the first sheet
for (int i = 1; i < wb.NumberOfSheets; i++)
{
NPOI.SS.UserModel.Name name = wb.GetNameAt(i);
String sheet = wb.GetSheetName(i);
NPOI.SS.UserModel.Name nameRange = wb.CreateName();
nameRange.NameName = ("DispatchCells");
//start at a specific area on the sheet
nameRange.RefersToFormula = (sheet + "!$A$11:$AZ$100");
}
wb.Write(stream);
}
}
}
return ActivityExecutionStatus.Closed;
}