How do I return conditional formatting properties without adding additional rules in Excel VBA? - excel

If a cell has conditional formatting that uses an Icon Set (my current situation is using the Traffic Light Icon Set), is there a way to identify in VBA what particular icon is showing in that cell?
The motivation behind it is that it will correspond to a red/amber/green value which I'm exporting in a SQL statement, so I need to find it in VBA.
I can add new rules and select icon sets just fine:
Set Newiconset = Range("H3").FormatConditions.AddIconSetCondition
It's returning the properties of an existing set of rules that has me hung up.
Thanks for your help - I scoured StackOverflow for a solution and couldn't find it. If someone's solved this, let me know and I'll gladly remove my question.

Bad news: what I'm looking to do technically isn't possible.
Here's why:
Excel data is stored as XML files inside a Zip archive (you can experiment with this by renaming an xlsx file to zip and opening it). When you finally find your sheet in there, you can see that the conditional formatting is stored as the actual conditions themselves, with the range values and such. Excel then takes those and computes the result on the fly every time you look at that file. Unfortunately, the resulting states are not saved with the file. It's worth noting, though, that the current values of formulas are stored - I'm assuming this is how accessing values from external workbooks is handled.
This explains why you can set and read the rules just fine, but can't "get" the displayed icon: there is no stored value to read it from.
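If you want to see the "rules only, no state" point for yourself, here is a minimal Python sketch (standard library only). It packs a hand-written worksheet fragment into an in-memory zip and reads the icon set rule back out. The XML is an illustrative fragment, not a complete workbook, and the function names `build_package` and `icon_set_rules` are made up for this example; the part path `xl/worksheets/sheet1.xml` is the usual location, but treat it as an assumption for any real file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hand-written, minimal worksheet fragment (assumption: not a full workbook).
SHEET_XML = """<?xml version="1.0" encoding="UTF-8"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="3"><c r="H3"><v>42</v></c></row>
  </sheetData>
  <conditionalFormatting sqref="H3">
    <cfRule type="iconSet" priority="1">
      <iconSet iconSet="3TrafficLights1">
        <cfvo type="percent" val="0"/>
        <cfvo type="percent" val="33"/>
        <cfvo type="percent" val="67"/>
      </iconSet>
    </cfRule>
  </conditionalFormatting>
</worksheet>
"""


def build_package() -> bytes:
    """Zip the sample sheet the way an xlsx stores its parts."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buf.getvalue()


def icon_set_rules(package: bytes, part: str = "xl/worksheets/sheet1.xml"):
    """Return (range, icon set name, thresholds) for each icon set rule."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read(part))
    rules = []
    for cf in root.findall("m:conditionalFormatting", NS):
        for icon_set in cf.findall("m:cfRule/m:iconSet", NS):
            thresholds = [
                (v.get("type"), v.get("val"))
                for v in icon_set.findall("m:cfvo", NS)
            ]
            rules.append((cf.get("sqref"), icon_set.get("iconSet"), thresholds))
    return rules


rules = icon_set_rules(build_package())
print(rules)
```

Note that the file only gives you the rule and its thresholds; nothing in the cell element says which icon is currently shown.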

Related

VBA code in original file to track all copied files

Is it possible to insert code so we can track all copied Excel files in the future?
The reason why: we are creating a template Excel file that people can copy and fill in. The problem is that they regularly have to fill in the same information, so instead of starting from the template they copy an already filled-in file.
If we decide to change the template, we want to change all the files that were copied so there are no multiple versions going around.
All the files are stored on a server in subfolders, so we can access them all. File titles will vary based on the customer's wishes.
After reading your question, here is how I see it:
Summary:
You have one single template that everybody copies
You store all the filled-in templates on one server, in subfolders
The file titles vary with the customer's needs
Challenges:
For performance's sake, you might need a program other than Excel to manage those files
Otherwise, it is possible to use Excel VBA, but it is fairly complicated: you would need advanced skills and enough time to write everything that handles the subfolders and the varying file names if you wish to collect the data in one single Excel file.
Suggested Solution:
I recommend a locked worksheet and workbook for the Excel template, so your customers won't be able to edit its structure and all copies keep the same layout.
You should have some kind of standard naming convention for your Excel files, which will let you use the name later for searching, filtering and sorting.
You can also add a Reset button to the template that customers click to empty all the fields effortlessly.
In short, if you wish to track files being copied, you need more than Excel VBA: you would need something like a Windows service to track them.
Hope this will give you some ideas. All the Best!

Finding the cause of Excel file corruption

I have a feature that downloads things to an xls file using Apache POI. Mostly it works. But on one particular database, the resulting files are corrupted and won't open in Excel. I get the message "We found a problem with some content in 'DownloadFoo.xls'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes." Clicking Yes results in all the formatting, data validation, etc. being stripped out. On the other hand, if I open the file in Open Office Calc and save it, it's fine and can be opened in Excel from then on. (The people who want to use these files aren't allowed to download Open Office Calc, so this is not considered an acceptable workaround.)
I have tried narrowing it down to see which data is causing the problem, but it seems to occur whenever 10 or more items are downloaded, regardless of which items they are. (On other databases, it's fine to download 100+). Excluding some of the columns helps, but they are perfectly innocuous looking columns (and virtually identical to other columns which are fine) so this still hasn't got me to the bottom of it.
Are there any techniques I could use to find out what Excel has a problem with in the corrupted spreadsheets?
I can't make major changes like getting it to download to xlsx instead as this feature is going to be scrapped and replaced with something completely different in the near future, so I'd like to just focus on the problem at hand.
It turns out that the solution to the problem was to reset the data validation lists more often. Quite a lot of the cells in my spreadsheet have data validation. When the data validation lists are longer, they are stored on a hidden sheet. If several cells need the same validation, I try to get them referencing the same list in order to not write out too much stuff on the hidden sheet. However, Excel apparently dislikes it when too many cells reference the same list; it's not against the rules as far as I can tell, but it doesn't like it anyway. When I changed it to rewrite the validation lists for every 5 items, it started working.
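As a rough illustration of "rewrite the validation lists for every 5 items", here is a small Python sketch of the bookkeeping only. It is not the original POI code: the hidden sheet name `Hidden`, the use of column A, and the function name `list_range_for_item` are all assumptions made for the example. Each group of 5 items gets its own copy of the list on the hidden sheet, and the function returns the range that a given item's cells should reference.

```python
def list_range_for_item(item_index, list_length, items_per_copy=5, sheet="Hidden"):
    """Range of the validation-list copy that item `item_index` (0-based) uses.

    Copies of the list are assumed to be stacked one after another in
    column A of the hidden sheet, one copy per `items_per_copy` items.
    """
    copy_number = item_index // items_per_copy
    first_row = copy_number * list_length + 1
    last_row = first_row + list_length - 1
    return f"{sheet}!$A${first_row}:$A${last_row}"


# Items 0-4 share the first copy, items 5-9 the second, and so on.
ranges = [list_range_for_item(i, list_length=3) for i in range(7)]
print(ranges)
```

The trade-off is a larger hidden sheet in exchange for fewer cells pointing at any single list.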
The reason this database was different was that the items had an unusually high number of subitems, so they occupied a lot of rows even though it didn't seem like many things were being downloaded. Some of the problem columns just had true or false validation rather than using the lists on the hidden sheet, so I don't know what that was about, but resetting the validation lists helped anyway.
This doesn't really answer my question as I never managed to get any information from Excel about what the problem was, or use a particular technique, it was just a series of coincidental findings. I'm putting it here anyway in case anyone else has a similar problem. Also the thing that started me on the right track was finding an old comment when double checking that it doesn't do anything different for over 10 items (it doesn't) in response to Andrew Morton's comment, so thanks Andrew!

Structured references: Keep names through copy-paste

So I'm using structured references for user data. Eventually we'll have a proper import procedure, but I was hoping to push that back and use copy-paste of their raw data for now.
So I have a table, say tData, that contains a copy of the raw data output. That output already comes with headers, which are used in the structured references to the data throughout the workbook.
My problem is that the raw output isn't 100% stable - e.g. some columns may come and go. Those aren't used in calculations, but they slightly affect the table structure (position of columns). I can't control the raw output.
I was hoping to instruct them for now to copy-paste directly. As I am working with the headers and not positions, I thought the formulas would still work.
To summarize with the example from the screenshots: =tTest[Column2] is the formula set for the box on the right. It should refer to the content of Column2 (so 2). If I copy-paste different headers, with a "Column0" that shifts everything to the right, you can see that Excel actually uses a positional reference to refer to your data. It now returns "1" and Excel even changed the formula to "=tTest[Column1]".
Seems wrong to me - if you reference something by name, you don't expect it to actually be referenced by position.
I already tried tTest[[header]:[header]] and it doesn't work either.
So, in the end, it's better to write a proper import procedure, I think. I won't post it here because it's somewhat involved and there are various checks to perform. But the overall steps are:
File picker to select the data to input
Copy everything into ThisWorkbook, close input file
Match the column headers from input data with mytable.HeaderRowRange
Copy the raw data into the appropriate column if a match is found
So basically write a script to do a match on column headers, which is what I thought Excel would be doing. And I still think it should. However, thinking about it, it is probably a lot simpler to code it this way than to have Excel actually track named references dynamically instead of positions.
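The header-matching step can be sketched in a few lines. This is a language-neutral illustration in Python rather than the actual VBA procedure (which isn't posted); the function name `import_by_header` and the sample data are made up, and the file picker and workbook handling are left out.

```python
def import_by_header(table_headers, input_headers, input_rows):
    """Reorder raw input rows to match the table's headers by name.

    Columns present in the input but not in the table are dropped;
    table columns missing from the input are filled with None.
    """
    position = {name: i for i, name in enumerate(input_headers)}
    result = []
    for row in input_rows:
        result.append(
            [row[position[h]] if h in position else None for h in table_headers]
        )
    return result


# The raw output gained a "Column0" that shifts everything to the right.
table_headers = ["Column1", "Column2"]
input_headers = ["Column0", "Column1", "Column2"]
input_rows = [[0, 1, 2], [10, 11, 12]]

imported = import_by_header(table_headers, input_headers, input_rows)
print(imported)
```

Because the lookup is by header name, the extra column no longer changes which data ends up under Column2.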
As mentioned by Bob Phillips, you can use INDIRECT. However, that does involve adding an extra INDIRECT(reference_to_table_here) to, well, all references. Excel worksheet formulas are clunky enough, I think, without having to make them any clunkier.
Even though you write the formulas using column names, in the background Excel actually uses position of the column within your table for its calculations. Whenever you change the layout of the workbook, it updates the position for you.

Working with Office "open" XML - just how hard is it?

I'm considering replacing a (very) large body of Office-automation code with something that works with the Office XML format directly. I'm just starting out, but already I'm worried that it's too big a task.
I'll be dealing with Word, Excel and PowerPoint. So far I've only looked at Word and Excel. It looks like Word documents should be reasonably easy to manipulate, but Excel workbooks look like a nightmare. For example...
In Word, it looks like you could delete a paragraph simply by deleting the corresponding "w:p" tag. However, the supplied code snippet for deleting a row in Excel takes about 150 lines of code(!).
The reason the Excel code is so big is that deleting a row means updating the row indexes of all the subsequent rows, fixing up the "shared strings" table, etc. According to a comment at the top, the code snippet is not even complete, in that it won't deal with a workbook that has tables in it (I can live with that).
What I'm not clear on is whether that's the only restriction that the sample code has. For example, would there also be a problem if the workbook contained a Pivot Table? Or a chart that references data from the same sheet? Or some named ranges? Wouldn't you also have to update the formulae for any cells (etc.) that referenced a row whose row index had changed?
[That's not to mention the "calc chain", which (thankfully) I think you can simply delete since it's only a cache that can be re-built.]
And that's my question, woolly though it is. Just how hard do you have to work to do something as simple as deleting a row properly? Is it an insurmountable task?
Also, if there are other, similar issues either with Excel or with Word or PowerPoint, I'd love to hear about them now, before I waste too much time going down a blind alley. Thanks.
Having worked with the Open XML SDK 2.0 for almost two years now I can say that doing seemingly trivial tasks can take many hours and sometimes days to figure out how to do it properly. For example, deleting an Excel row should be fairly straightforward and easy to do right? Nope because not only do you need code to delete your row, but then you have to update all the row indices, update any merged cell references, update hyperlink references, etc. Our internal delete method is close to 500 lines of code to just delete a row and I'm sure we don't have all the cases accounted for either.
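To make the renumbering problem concrete, here is a minimal Python sketch (standard library only, not the Open XML SDK) that deletes a row from a hand-written sheetData fragment and shifts the row and cell references below it. The fragment and the function name `delete_row` are made up for illustration, and the sketch deliberately ignores everything else mentioned above: formulas, merged cells, hyperlinks, shared strings, named ranges, tables and charts.

```python
import re
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
Q = "{" + NS_URI + "}"

# Hand-written, minimal fragment (assumption: not a complete worksheet part).
SHEET = f"""<worksheet xmlns="{NS_URI}"><sheetData>
<row r="1"><c r="A1"><v>10</v></c></row>
<row r="2"><c r="A2"><v>20</v></c></row>
<row r="3"><c r="A3"><v>30</v></c><c r="B3"><v>31</v></c></row>
</sheetData></worksheet>"""


def delete_row(root, row_number):
    """Remove one row and renumber the rows and cell references below it."""
    sheet_data = root.find(Q + "sheetData")
    for row in list(sheet_data.findall(Q + "row")):
        r = int(row.get("r"))
        if r == row_number:
            sheet_data.remove(row)
        elif r > row_number:
            row.set("r", str(r - 1))
            for cell in row.findall(Q + "c"):
                # Keep the column letters, replace the row part of e.g. "B3".
                column = re.match(r"[A-Z]+", cell.get("r")).group()
                cell.set("r", f"{column}{r - 1}")


root = ET.fromstring(SHEET)
delete_row(root, 2)
remaining = [
    (row.get("r"), [c.get("r") for c in row.findall(Q + "c")])
    for row in root.iter(Q + "row")
]
print(remaining)
```

Even this toy version already needs the renumbering loop; each item on the "ignored" list adds another pass of its own, which is roughly where the hundreds of lines come from.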
The biggest complaint I have is the lack of documentation on how to do the most common tasks. The MSDN section on the Open XML SDK is very limited and whenever you need to do anything complicated you are really on your own. I've had to read the Open XML standard a lot to figure out what certain elements mean and how they should be implemented since I could find very little online.
The other challenging part is if you insert an element in a spot where it doesn't belong or put an invalid attribute on an element you will get a corrupt file when you try and open it. Most of the time you will not get any information on what caused the error and you will have to look at the Open XML standard spec to see what you did wrong.
If you need a fast turnaround time on converting that Office automation code into Open XML and what you are doing is not really basic, then I would say pass. If you have time and the patience to read up on the Word, Excel and PowerPoint XML structures and get familiar with how they relate then I say go for it. In my opinion it is really the only way to have very fine control over these office documents, but there will be a great learning curve when you start.
Oh and just for fun here is how much code is needed to add a comment to an Excel cell.
Just for completeness, here are some libraries I found for working with Excel XML:
www.extremexml.com - a layer on top of the Open XML SDK classes; focusses on injecting data into an existing spreadsheet; handles many of the cross-reference problems I identified in my question. Open source but GPL2 not LGPL. Code looks nice, and documentation is excellent. Does not appear terribly active on codeplex though.
Closed XML - another layer on top of the Open XML SDK - again open source, but with a less restrictive license (MIT). Looks nice, and looks more "active" than the above.
SpreadsheetLight - from what I can tell, a closed-source library sitting atop the Open XML SDK classes. Targeted more at those looking to create a spreadsheet from scratch rather than making changes to existing spreadsheets.
Here is another third party library dedicated to working with OpenXML:
http://www.officewriter.com
In the example cited by amurra above of deleting Excel spreadsheet rows, this is a single method call with this tool. It updates formulas and all the other references, which would otherwise seem to require some 500 lines of code.
The OpenXML SDK itself is a great tool for very simple things, but you still have to concern yourself with a lot of the internals of the file format and packaging structure to get things really right.
Here are some additional libraries that can manipulate OOXML formats:
- GemBox.Spreadsheet (XLSX)
- GemBox.Document (DOCX)
Also, GemBox published some articles that demonstrate how to manipulate the OOXML file format with pure .NET (without using any library). I think you'll find this interesting:
www.codeproject.com/Articles/15593/Read-and-write-Open-XML-files-MS-Office
(Introduction to SpreadsheetML format and an explanation on how we can read and write worksheet's cell content)
www.codeproject.com/Articles/649064/Show-Word-File-in-WPF
(Introduction to WordprocessingML format and demonstration on how we can read document's text)

How to export SSIS to Microsoft Excel without additional software?

This question is long winded because I have been updating the question over a very long time trying to get SSIS to properly export Excel data. I managed to solve this issue, although not correctly. Aside from someone providing a correct answer, the solution listed in this question is not terrible.
The only answer I found was to create a single row named range wide enough for my columns. In the named range put sample data and hide it. SSIS appends the data and reads metadata from the single row (that is close enough for it to drop stuff in it). The data takes the format of the hidden single row. This allows headers, etc.
WOW what a pain in the butt. It will take over 450 days of exports to recover the time lost. However, I still love SSIS and will continue to use it because it is still way better than Filemaker LOL. My next attempt will be doing the same thing in the report server.
Original question notes:
If you are in the SQL Server Integration Services designer and want to export data to an Excel file starting on something other than the first line, let's say the fourth line, how do you specify this?
I tried going in to the Excel Destination of the Data Flow, changed the AccessMode to OpenRowSet from Variable, then set the variable to "YPlatters$A4:I20000". This fails saying it cannot find the sheet. The sheet is called YPlatters.
I thought you could specify (Sheet$)(Starting Cell):(Ending Cell)?
Update
Apparently in Excel you can select a set of cells and name them with the name box. This allows you to select the name instead of the sheet without the $ dollar sign. Oddly enough, whatever the range you specify, it appends the data to the next row after the range. Oddly, as you add data, it increases the named selection's row count.
Another odd thing is that the data takes the format of the last line of the range specified. My header rows are bold. If I specify a range that ends with the header row, the data appends to the row below, and makes all the entries bold. If you specify one row lower, it puts a blank line between the header row and the data, but the data is not bold.
Another update
No matter what I try, SSIS samples the "first row" of the file and sets the metadata according to what it finds. However, if you have sample data that has a value of zero but is formatted as the first row, it treats that column as text and inserts numeric values with a single quote in front ('123.34). I also tried headers that do not reflect the data types of the columns. I tried changing the metadata of the Excel destination, but it always changes it back when I run the project, then fails saying it will truncate data. If I tell it to ignore errors, it imports everything except that column.
Several days of several hours a piece later...
Another update
I tried every combination. A mostly working example is to create the named range starting with the column headers. Format your column headers as you want the data to look as the data takes on this format. In my example, these exist from A4 to E4, which is my defined range. SSIS appends to the row after the defined range, so defining A4 to E68 appends the rows starting at A69. You define the Connection as having the first row contains the field names. It takes on the metadata of the header row, oddly, not the second row, and it guesses at the data type, not the formatted data type of the column, i.e., headers are text, so all my metadata is text. If your headers are bold, so is all of your data.
I even tried making a sample data row without success... I don't think anyone actually uses Excel with the default MS SSIS export.
If you could define the "insert range" (A5 to E5) with no header row and format those columns (currency, not bold, etc.) without it skipping a row in Excel, this would be very helpful. From what I gather, no one uses SSIS to export Excel without a third party connection manager.
Any ideas on how to set this up properly so that data is formatted correctly, i.e., the metadata read from Excel is proper to the real data, and formatting inherits from the first row of data, not the headers in Excel?
One last update (July 17, 2009)
I got this to work very well. One thing I added to Excel was the IMEX=1 in the Excel connection string: "Excel 8.0;HDR=Yes;IMEX=1". This forces Excel (I think) to look at all rows to see what kind of data is in it. Generally, this does not drop information, say for instance if you have a zip code then about 9 rows down you have a zip+4, Excel without this blanks that field entirely without error. With IMEX=1, it recognizes that Zip is actually a character field instead of numeric.
And of course, one more update (August 27, 2009)
The IMEX=1 will succeed importing data with missing contents in the first 8 rows, but it will fail exporting data where no data exists. So, have it on your import connection string, but not your export Excel connection string.
I have to say, after so much fiddling, it works pretty well.
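The two updates above boil down to "IMEX=1 on the import connection, not on the export connection". As a small sketch of that rule, here is a hypothetical Python helper (the function name `excel_extended_properties` is made up) that builds the extended-properties part of the connection string quoted above; it only assembles the string and does not talk to SSIS or the Excel driver.

```python
def excel_extended_properties(for_import: bool, header_row: bool = True) -> str:
    """Build the Excel extended properties, adding IMEX=1 only for imports."""
    parts = ["Excel 8.0", f"HDR={'Yes' if header_row else 'No'}"]
    if for_import:
        # Import: ask the driver to treat mixed-type columns as text.
        parts.append("IMEX=1")
    return ";".join(parts)


import_props = excel_extended_properties(for_import=True)
export_props = excel_extended_properties(for_import=False)
print(import_props)
print(export_props)
```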
P.S. If you are using an x64 version, make sure you call the DTExec from C:\Program Files\Microsoft SQL Server\90\DTS.x86\Binn. It will load the 32-bit Excel driver and work fine.
Would it be easier to create the Excel Workbook in a script task, then just pick it up later in the flow?
The engine part of SSIS is good but the integration with Excel is awful
"Using SSIS in conjunction with Excel is like having hot tar funnelled up your iHole in a road cone"
Dr. Zim, I believe you were the one that originally brought up this question. I totally feel your pain. I love SSIS overall, but I absolutely hate the limited tools that come standard for Excel. All I want to do is bold the heading or Row1 record in Excel, and not bold the following records. I have not found a great way to do that; granted, I am approaching this with no script tasks or custom extensions, but you would think something this simple would be a standard option. Looks like I may be forced to research and program up something fancy for a task that should be so fundamental. I've already spent a ridiculous amount of time on this myself. Does anyone know if you can use Excel XML with Excel versions 2000/XP/2003? Thanks.
This is an old thread but what about using a flat file connection and writing the data out as a formatted html document. Set the mime type in the page header to "application/excel". When you send the document as an attachment and the recipient opens the attachment, it will open a browser session but should pop Excel up over the top of it with the data formatted according to the style (CSS) specified in the page.
Can you have SSIS write the data to an Excel sheet starting at A1, then create another sheet, formatted as you like, that refers to the other sheet at A1, but displays it as A4? That is, on the "pretty" sheet, A4 would refer to A1 on the SSIS sheet.
This would allow SSIS to do what it's good for (manipulate table-based data), but allow the Excel to be formatted or manipulated however you'd like.
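To show what the "pretty" sheet would contain, here is a small Python sketch that generates the reference formulas for a block of cells shifted down by three rows, so A4 on the formatted sheet points at A1 on the sheet SSIS writes to. The sheet name `SSISData` and the function names are assumptions for the example; in practice you would just fill the formulas down once in the template.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def offset_formulas(n_rows, n_cols, row_offset=3, data_sheet="SSISData"):
    """Map each cell on the formatted sheet to a formula reading the data sheet."""
    formulas = {}
    for r in range(1, n_rows + 1):
        for c in range(1, n_cols + 1):
            col = column_letter(c)
            formulas[f"{col}{r + row_offset}"] = f"={data_sheet}!{col}{r}"
    return formulas


formulas = offset_formulas(n_rows=2, n_cols=2)
print(formulas)
```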
When Excel is the destination in SSIS, or the target export type in SSRS, you do not have much control over formatting or over how the final file should look. I once wrote a custom Excel rendering engine for SSRS, as my client was very strict about the format of the final Excel report. I used 'Excel XML' to get the job done inside my custom renderer. Maybe you can use XML output and convert it to Excel XML using XSLT.
I understand you would rather not use a script component so perhaps you could create your own custom task using the code that a script contains so that others can use this in the future. Check here for an example.
If this seems feasible the solution I used was CarlosAg Excel Xml Writer Library. With this you can create code which is similar to using the Interop library but produces excel in xml format. This avoids using the Interop object which can sometimes lead to excel processes hanging around.
Instead of the roundabout exercise of writing data to particular cells, then formatting and styling them, which is indeed a very tedious effort considering the support SSIS has for Excel, we could go the "template" way.
Assume we need to write data to a given cell with all the custom formatting that's applied to it. Have all the formatting in a sheet, say "SheetActual", whose data cells contain lookups, references or formulas that refer to the original data SSIS exports to a hidden sheet, say "SheetMasterHidden", of the same Excel connection. This "SheetMasterHidden" will essentially hold the master data in the default format in which SSIS writes data to Excel. This way you need not worry about formatting the data at runtime.
Formatting the Excel file is one-time work, provided the formatting doesn't change very often. If the format changes, or is decided at runtime, this solution may not work very well.
The answer is in the question. Over time, it became a progress status. However, there is SSRS that will create Excel files if you create TABLE presentations. It works pretty well too.
