Selecting and editing cells in Excel with OpenXML SDK - c#-4.0

I'm inserting information from a dataset into an Excel document. We really don't want to use interop services, so my best option is the OpenXML SDK.
All I need to do is select a cell based on an id, name, or similar (I would rather not use the standard "A1" format) and then insert something into it. But for the life of me I can't figure out how to obtain a set of cell elements based on an attribute value.
Part 2 of this is that I then need to merge certain cells. That is much easier, since it simply involves appending a collection of MergeCell elements to the MergeCells element. But I want to make sure that no cell within a merge range is already part of another merged range, as this would cause problems.
I feel like this is a REALLY powerful tool, but the lack of documentation and examples makes it a difficult subject to approach.

The easiest interface for editing Excel documents seems to be the ClosedXML library.
Searching each worksheet for specific cells is a lot easier, and you can also search for named ranges.
It also handles cell merges automatically by deleting any previously merged ranges that conflict with the newest one.
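For anyone who still wants to do the overlap check from the question by hand, the logic is small. The following is only a sketch in Python, not OpenXML SDK or ClosedXML code: it assumes the existing merge ranges are available as plain A1-style strings such as "A1:B2", and the helper names (parse_range, ranges_overlap, find_conflicts) are made up for illustration.

```python
import re
from typing import List, Tuple

CELL_RE = re.compile(r"^([A-Z]+)([0-9]+)$")


def column_to_index(letters: str) -> int:
    """Convert column letters ("A", "Z", "AA") to a 1-based column number."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_range(ref: str) -> Tuple[int, int, int, int]:
    """Parse "B2:D4" (or a single cell "B2") into (min_col, min_row, max_col, max_row)."""
    parts = ref.upper().split(":")
    if len(parts) == 1:
        parts = parts * 2  # a single cell is a 1x1 range
    if len(parts) != 2:
        raise ValueError(f"not an A1-style range: {ref!r}")
    cells = []
    for part in parts:
        match = CELL_RE.match(part)
        if match is None:
            raise ValueError(f"not an A1-style reference: {ref!r}")
        cells.append((column_to_index(match.group(1)), int(match.group(2))))
    (c1, r1), (c2, r2) = cells
    return min(c1, c2), min(r1, r2), max(c1, c2), max(r1, r2)


def ranges_overlap(a: str, b: str) -> bool:
    """Two rectangles overlap when they intersect on both the column and row axes."""
    a_c1, a_r1, a_c2, a_r2 = parse_range(a)
    b_c1, b_r1, b_c2, b_r2 = parse_range(b)
    return a_c1 <= b_c2 and b_c1 <= a_c2 and a_r1 <= b_r2 and b_r1 <= a_r2


def find_conflicts(existing: List[str], candidate: str) -> List[str]:
    """Return the existing merge ranges that share at least one cell with candidate."""
    return [ref for ref in existing if ranges_overlap(ref, candidate)]


if __name__ == "__main__":
    print(find_conflicts(["A1:B2", "D1:E1"], "B2:C3"))
```

If find_conflicts returns an empty list, the new range can be appended safely; otherwise the conflicting ranges would have to be removed or the new merge rejected.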

Related

Using Office.js, how to get the exact styling of a table in Excel

I'm using Office.js to create an Add-in for Excel.
I'm able to get the exact styling (fill colors, font size, borders, etc) of individual cells of ranges and now need to do the same for cells of tables.
I'm able to get the table's style and then get the style's properties. However, besides the fact that this is only supported since ExcelAPI 1.7, it seems to describe the style only in general terms. That is, it doesn't seem to describe the detailed table style properties such as "First row stripe", "Total row", etc.
(Note that getting the table's individual cell styles doesn't work like for cells of ranges. The styling properties, such as fill.color, don't represent the effectively applied styling of the cell if the cell's styling from the table's base style hasn't been overridden.)
Things I have considered:
Convert the table to a range. But I guess that this is destructive to the worksheet and I see no way of performing an undo.
Convert the table to a range and back to a table again. This is destructive as well and I'm afraid I could somehow not exactly recreate the table as it was before.
Create a copy of the worksheet and then convert the tables to ranges. That could work but it's ExcelAPI 1.7 only.
Read the raw style info from the file's OOXML representation but I don't think that that's possible with Office.js alone.
Any suggestion on how to get an Excel table's exact styling information using only Office.js and with an API version as low as possible?
Edit: Turns out we do have access to the whole .xlsx file with Office.js using the Office.context.document.getFileAsync method. I'll try to get the full style info by reading the xl/styles.xml file but I'd still prefer a better solution.
You can still get a range from the table you want to handle. For example:
var range = worksheet.getRange("A1:E1");
Although "A1:E1" is part of a table, you can still get that area as a range and take further actions on it.

Comparison of data in Access

I have written some pretty lengthy VBA code in Excel for the comparison of 2 worksheets. My code does the following:
Lets you import 2 sheets for comparison
arranges the columns
removes departments which require different comparisons into a new worksheet
In sheet 1, checks whether any IDs appear more than once, decides which row to use for comparison based on the latest update, and deletes the older rows
compares the sheets by header and then by cell contents (the header names differ between sheets), and highlights differing values in red
finally gives me a breakdown, per column and per department, of the differences and of any IDs that are missing
I have now found that my data set is becoming too big, and I am looking to use MS Access. Is it possible to copy my VBA code over to Access? What do you suggest for this?
Any advice would be helpful.
From the nature of your question, it sounds like you may not have used a database before. If you were using Access, you would need to completely rewrite the code using SQL statements, e.g. an aggregating SQL SELECT statement to find the most recent update for each ID and ignore the rest.
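To give an idea of what such an aggregating SELECT looks like, here is a rough sketch. It runs against an in-memory SQLite database from Python purely so that it can be tried without Access; the table name sheet1, its columns, and the sample rows are all invented, and Access SQL may need small syntax changes.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical layout: one row per update, with the update date stored as ISO text.
LATEST_ROWS_SQL = """
SELECT s.id, s.department, s.amount, s.updated
FROM sheet1 AS s
JOIN (
    SELECT id, MAX(updated) AS latest
    FROM sheet1
    GROUP BY id
) AS m ON m.id = s.id AND m.latest = s.updated
ORDER BY s.id
"""


def latest_rows_per_id(rows: List[Tuple[int, str, float, str]]) -> List[tuple]:
    """Load rows into a temporary table and keep only the newest row for each id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sheet1 (id INTEGER, department TEXT, amount REAL, updated TEXT)"
        )
        conn.executemany("INSERT INTO sheet1 VALUES (?, ?, ?, ?)", rows)
        return conn.execute(LATEST_ROWS_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        (1, "Sales", 100.0, "2020-01-01"),
        (1, "Sales", 120.0, "2020-03-01"),  # newer duplicate of id 1
        (2, "HR", 50.0, "2020-02-01"),
    ]
    for row in latest_rows_per_id(sample):
        print(row)
```

The inner query finds the latest update per id, and the join keeps only the rows that match it, which replaces the "find duplicates, pick the newest, delete the rest" loop from the VBA version.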
You can use conditional formatting in an Access form, but it's no better than using it in Excel. How many rows does your data have? Will it fit in an Excel sheet?
You might use Access to pre-process the data and remove the unwanted rows before you use it in Excel, or use Power Query or SQL directly from Excel to remove them.
You have a way to go.
Harvy

Creating a multilingual Excel template spreadsheet

I have created an Excel template to collect data from multiple sources in a standard format. Pretty soon, I will be sending this template to people from different countries. I would like them to be able to select the language of the template directly from within the Excel workbook. This would enable them to have the headers translated in their own languages. I want to support 4 major languages and I can provide the translations of my headers in these languages.
Is there a good solution to do this? Could my Excel workbook embed a set of *.properties files containing the translations? Or should I use nasty formulas to retrieve the headers from a hidden sheet? Should I use VBA and how?
Of course, another solution would be for me to create 4 different files. But I feel this will become a nightmare when I want to support more languages or make changes to my template.
Thanks,
I would choose a nasty formula because your recipients might not appreciate the security risks of VBA, amongst other considerations. If you have a range with language names (or other references) in a column (say, a named range HLcol1) and the appropriate headers in a matrix alongside HLcol1 (with the entire array named, say, HeaderLanguage), then:
=INDEX(HeaderLanguage,MATCH($A$6,HLcol1,0),COLUMN())
in B6 and copied across might suit, where the chosen Language name (or other reference) is in A6.
Row 6 because rows 1-5 seem as good a place as anywhere to place the lookup array - these rows could be hidden.
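If it helps to see what the formula above computes, here is a small Python imitation of the INDEX/MATCH lookup. It is only a sketch: the translations are sample data, and it assumes, as the formula implies, that HeaderLanguage starts in column A so that its first column is the language name and COLUMN() in B6 selects the second column.

```python
from typing import List

# Sample lookup array (rows 1-4 in the worksheet). Column 1 is the language name,
# which is what HLcol1 covers; the remaining columns are the translated headers.
HEADER_LANGUAGE: List[List[str]] = [
    ["English", "Name", "Date", "Amount"],
    ["French", "Nom", "Date", "Montant"],
    ["German", "Name", "Datum", "Betrag"],
    ["Spanish", "Nombre", "Fecha", "Importe"],
]
HL_COL1: List[str] = [row[0] for row in HEADER_LANGUAGE]


def match_exact(lookup_value: str, lookup_column: List[str]) -> int:
    """Like MATCH(value, column, 0): 1-based position of the first exact match."""
    return lookup_column.index(lookup_value) + 1


def index(array: List[List[str]], row_num: int, column_num: int) -> str:
    """Like INDEX(array, row_num, column_num) with 1-based positions."""
    return array[row_num - 1][column_num - 1]


def header_for(language: str, worksheet_column: int) -> str:
    """Equivalent of =INDEX(HeaderLanguage,MATCH($A$6,HLcol1,0),COLUMN())."""
    return index(HEADER_LANGUAGE, match_exact(language, HL_COL1), worksheet_column)


if __name__ == "__main__":
    # A6 holds the chosen language; B6, C6 and D6 hold the copied formula.
    chosen = "French"
    print([header_for(chosen, column) for column in (2, 3, 4)])
```

Changing the value in A6 (here, the chosen variable) switches every header at once, which is why a single lookup table scales better than keeping one file per language.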

default value to a cell of a new row without a macro (Excel)

I would like to insert a default value, to a specific cell in my worksheet,
but this default value should take care of new rows inserted in the worksheet.
I must not use a macro for this.
thanks
Maybe I am wrong, but I cannot think of any way one would be able to do that. This is more of a typical database functionality. Without macros you would have to use a function or a format. A function, like a value, would not be copied when new rows are inserted; only formats would, so this narrows it down.
By the way, I interpreted your question as "default value on insertion of a new row", not "default value when writing data in an existing empty row".
So, as a kind of "default value behaviour", you could use user-defined cell formats.
For example, use ;;'x'; as a custom format and format your cell or column with it. This won't fill empty cells with 'x', but whenever you type in '0' it would be shown as 'x'.
However, I am very interested, if there is a better solution.
You can add validation to cells which can help you force a number into a particular cell, but as stated in the earlier answer a database is more designed for Default Values.
It seems to me you are trying to create a database in Excel. I wouldn't recommend this: Excel is very good at prototyping algorithms, but when it comes to structuring tables it can fail very quickly.
Use at least MS Access, namely because it comes with a database. Alternatives are rapid prototyping tools such as Eclipse or NetBeans, or Visual Studio if your budget can stretch that far. Couple the RAD tools with MySQL (namely for ease of use and because the community licence is good) and the system should be stable.

How to export SSIS to Microsoft Excel without additional software?

This question is long-winded because I have been updating it over a very long time while trying to get SSIS to properly export Excel data. I managed to solve this issue, although not correctly. Unless someone provides a correct answer, the solution listed in this question is not terrible.
The only answer I found was to create a single row named range wide enough for my columns. In the named range put sample data and hide it. SSIS appends the data and reads metadata from the single row (that is close enough for it to drop stuff in it). The data takes the format of the hidden single row. This allows headers, etc.
WOW what a pain in the butt. It will take over 450 days of exports to recover the time lost. However, I still love SSIS and will continue to use it because it is still way better than Filemaker LOL. My next attempt will be doing the same thing in the report server.
Original question notes:
If you are in the SQL Server Integration Services designer and want to export data to an Excel file starting on something other than the first line, let's say the fourth line, how do you specify this?
I tried going into the Excel Destination of the Data Flow, changed the AccessMode to OpenRowSet from Variable, then set the variable to "YPlatters$A4:I20000". This fails, saying it cannot find the sheet. The sheet is called YPlatters.
I thought you could specify (Sheet$)(Starting Cell):(Ending Cell)?
Update
Apparently in Excel you can select a set of cells and name them with the name box. This allows you to select the name instead of the sheet without the $ dollar sign. Oddly enough, whatever the range you specify, it appends the data to the next row after the range. Oddly, as you add data, it increases the named selection's row count.
Another odd thing is the data takes the format of the last line of the range specified. My header rows are bold. If I specify a range that ends with the header row, the data appends to the row below, and makes all the entries bold. If you specify one row lower, it puts a blank line between the header row and the data, but the data is not bold.
Another update
No matter what I try, SSIS samples the "first row" of the file and sets the metadata according to what it finds. However, if you have sample data that has a value of zero but is formatted as the first row, it treats that column as text and inserts numeric values with a single quote in front ('123.34). I also tried headers that do not reflect the data types of the columns. I tried changing the metadata of the Excel destination, but it always changes it back when I run the project, then fails saying it will truncate data. If I tell it to ignore errors, it imports everything except that column.
Several days of several hours a piece later...
Another update
I tried every combination. A mostly working example is to create the named range starting with the column headers. Format your column headers as you want the data to look as the data takes on this format. In my example, these exist from A4 to E4, which is my defined range. SSIS appends to the row after the defined range, so defining A4 to E68 appends the rows starting at A69. You define the Connection as having the first row contains the field names. It takes on the metadata of the header row, oddly, not the second row, and it guesses at the data type, not the formatted data type of the column, i.e., headers are text, so all my metadata is text. If your headers are bold, so is all of your data.
I even tried making a sample data row without success... I don't think anyone actually uses Excel with the default MS SSIS export.
If you could define the "insert range" (A5 to E5) with no header row and format those columns (currency, not bold, etc.) without it skipping a row in Excel, this would be very helpful. From what I gather, no one uses SSIS to export Excel without a third-party connection manager.
Any ideas on how to set this up properly so that data is formatted correctly, i.e., the metadata read from Excel is proper to the real data, and formatting inherits from the first row of data, not the headers in Excel?
One last update (July 17, 2009)
I got this to work very well. One thing I added to Excel was the IMEX=1 in the Excel connection string: "Excel 8.0;HDR=Yes;IMEX=1". This forces Excel (I think) to look at all rows to see what kind of data is in it. Generally, this does not drop information, say for instance if you have a zip code then about 9 rows down you have a zip+4, Excel without this blanks that field entirely without error. With IMEX=1, it recognizes that Zip is actually a character field instead of numeric.
And of course, one more update (August 27, 2009)
The IMEX=1 will succeed importing data with missing contents in the first 8 rows, but it will fail exporting data where no data exists. So, have it on your import connection string, but not your export Excel connection string.
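To keep the two settings from getting mixed up, the rule can be written down as a tiny helper. This is only an illustrative Python sketch of the string described above ("Excel 8.0;HDR=Yes;IMEX=1"), which as far as I know is the Extended Properties part of the connection string; the function name and parameters are made up and this is not SSIS code.

```python
def excel_extended_properties(for_import: bool, has_header: bool = True) -> str:
    """Build the "Excel 8.0;HDR=...;IMEX=1" style string discussed above.

    IMEX=1 is added only for import connections, following the observation
    that it helps on import but makes exports fail.
    """
    parts = ["Excel 8.0", "HDR=" + ("Yes" if has_header else "No")]
    if for_import:
        parts.append("IMEX=1")
    return ";".join(parts)


if __name__ == "__main__":
    print("import:", excel_extended_properties(for_import=True))
    print("export:", excel_extended_properties(for_import=False))
```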
I have to say, after so much fiddling, it works pretty well.
P.S. If you are using an x64 version, make sure you call DTExec from C:\Program Files\Microsoft SQL Server\90\DTS.x86\Binn. It will load the 32-bit Excel driver and work fine.
Would it be easier to create the Excel Workbook in a script task, then just pick it up later in the flow?
The engine part of SSIS is good but the integration with Excel is awful
"Using SSIS in conjunction with Excel is like having hot tar funnelled up your iHole in a road cone"
Dr. Zim, I believe you were the one who originally brought up this question. I totally feel your pain. I love SSIS overall, but I absolutely hate the limited tools that come standard for Excel. All I want to do is bold the heading (the Row 1 record) in Excel and not bold the following records. I have not found a great way to do that; granted, I am approaching this with no script tasks or custom extensions, but you would think something this simple would be a standard option. Looks like I may be forced to research and program up something fancy for a task that should be so fundamental. I've already spent a ridiculous amount of time on this myself. Does anyone know if you can use Excel XML with Excel versions 2000/XP/2003? Thanks.
This is an old thread, but what about using a flat file connection and writing the data out as a formatted HTML document? Set the MIME type in the page header to "application/excel". When you send the document as an attachment and the recipient opens the attachment, it will open a browser session but should pop Excel up over the top of it with the data formatted according to the style (CSS) specified in the page.
Can you have SSIS write the data to an Excel sheet starting at A1, then create another sheet, formatted as you like, that refers to the other sheet at A1, but displays it as A4? That is, on the "pretty" sheet, A4 would refer to A1 on the SSIS sheet.
This would allow SSIS to do what it's good for (manipulate table-based data), but allow the Excel to be formatted or manipulated however you'd like.
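The mapping between the two sheets is mechanical, so the formulas for the "pretty" sheet can be generated rather than typed. Below is a rough Python sketch that only produces the formula text; the sheet name SSISData, the function names, and the 3-row offset are assumptions for illustration, and the sheet name is assumed to contain no spaces so it needs no quoting.

```python
from typing import Dict


def column_letters(index: int) -> str:
    """Convert a 1-based column number to Excel letters (1 -> "A", 27 -> "AA")."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def mirror_formulas(source_sheet: str, n_rows: int, n_cols: int,
                    row_offset: int = 3) -> Dict[str, str]:
    """Map each cell on the formatted sheet to a formula reading the raw sheet.

    With row_offset=3, A4 on the formatted sheet refers to A1 on the raw sheet.
    """
    formulas: Dict[str, str] = {}
    for row in range(1, n_rows + 1):
        for col in range(1, n_cols + 1):
            letters = column_letters(col)
            target = f"{letters}{row + row_offset}"
            formulas[target] = f"={source_sheet}!{letters}{row}"
    return formulas


if __name__ == "__main__":
    for cell, formula in mirror_formulas("SSISData", n_rows=2, n_cols=2).items():
        print(cell, formula)
```

The generated formulas would be entered once on the formatted sheet; after that SSIS only ever touches the raw sheet.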
When Excel is the destination in SSIS, or the target export type in SSRS, you do not have much control over the formatting or over how the final file looks. I once wrote a custom Excel rendering engine for SSRS, as my client was so strict about the format of the final Excel report. I used "Excel XML" to get the job done inside my custom renderer. Maybe you can use XML output and convert it to Excel XML using XSLT.
I understand you would rather not use a script component so perhaps you could create your own custom task using the code that a script contains so that others can use this in the future. Check here for an example.
If this seems feasible the solution I used was CarlosAg Excel Xml Writer Library. With this you can create code which is similar to using the Interop library but produces excel in xml format. This avoids using the Interop object which can sometimes lead to excel processes hanging around.
Instead of the roundabout exercise of writing data to particular cells, formatting them, and styling them, which is indeed a very tedious effort considering the support SSIS has for Excel, we could go the "template" way.
Assume we need to write data to such-and-such a cell with all the custom formatting that's done on it. Have all the formatting in a sheet, say "SheetActual", where the cells that will hold the data contain lookups, references, or formulas that refer to the original data SSIS exports to a hidden sheet, say "SheetMasterHidden", in the same Excel connection. "SheetMasterHidden" essentially holds the master data in the default format in which SSIS writes data to Excel. This way you need not worry about formatting the data at runtime.
Formatting the Excel file is one-time work, provided the formatting doesn't change very often. If the format changes and is decided at runtime, this solution may not work very well.
The answer is in the question. Over time, it became a progress status. However, there is SSRS that will create Excel files if you create TABLE presentations. It works pretty well too.
