Using Office.js, how to get the exact styling of a table in Excel


I'm using Office.js to create an Add-in for Excel.
I'm able to get the exact styling (fill colors, font size, borders, etc) of individual cells of ranges and now need to do the same for cells of tables.
I'm able to get the table's style and then get the style's properties. However, besides the fact that this is only supported since ExcelAPI 1.7, it seems to only describe the style in general terms. That is, it doesn't seem to describe the detailed table style properties such as "First row stripe", "Total row", etc.
(Note that getting the table's individual cell styles doesn't work like for cells of ranges. The styling properties, such as fill.color, don't represent the effectively applied styling of the cell if the cell's styling from the table's base style hasn't been overridden.)
Things I have considered:
Convert the table to a range. But I guess that this is destructive to the worksheet and I see no way of performing an undo.
Convert the table to a range and back to a table again. This is destructive as well and I'm afraid I could somehow not exactly recreate the table as it was before.
Create a copy of the worksheet and then convert the tables to ranges. That could work but it's ExcelAPI 1.7 only.
Read the raw style info from the file's OOXML representation but I don't think that that's possible with Office.js alone.
Any suggestion on how to get an Excel table's exact styling information using only Office.js and with an API version as low as possible?
Edit: Turns out we do have access to the whole .xlsx file with Office.js using the Office.context.document.getFileAsync method. I'll try to get the full style info by reading the xl/styles.xml file but I'd still prefer a better solution.
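To illustrate what reading xl/styles.xml involves, here is a minimal sketch in Python (not Office.js, just to show the file layout). It assumes the workbook uses a custom table style: as far as I know, built-in styles such as TableStyleMedium2 are only referenced by name and are not spelled out in the file. The sample XML, the style name "MyStyle" and the helper names are all made up for the demo.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical, heavily trimmed styles.xml used only for this demo.
SAMPLE_STYLES = """<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <dxfs count="2">
    <dxf><fill><patternFill><bgColor rgb="FFDDEBF7"/></patternFill></fill></dxf>
    <dxf><font><b/></font></dxf>
  </dxfs>
  <tableStyles count="1" defaultTableStyle="TableStyleMedium2">
    <tableStyle name="MyStyle" pivot="0" count="2">
      <tableStyleElement type="firstRowStripe" dxfId="0"/>
      <tableStyleElement type="totalRow" dxfId="1"/>
    </tableStyle>
  </tableStyles>
</styleSheet>"""


def build_sample_xlsx():
    """Build an in-memory zip holding only xl/styles.xml (not a full workbook)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/styles.xml", SAMPLE_STYLES)
    return buf.getvalue()


def read_table_styles(xlsx_bytes):
    """Map each custom table style to {element type: fill color or None}."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        root = ET.fromstring(zf.read("xl/styles.xml"))
    # dxfId on a tableStyleElement is an index into the <dxfs> list.
    dxfs = root.findall("m:dxfs/m:dxf", NS)
    styles = {}
    for style in root.findall("m:tableStyles/m:tableStyle", NS):
        elements = {}
        for el in style.findall("m:tableStyleElement", NS):
            dxf = dxfs[int(el.get("dxfId"))]
            # Assumption: solid fills in differential formats keep their
            # color in bgColor; theme/indexed colors are ignored here.
            color = dxf.find("m:fill/m:patternFill/m:bgColor", NS)
            elements[el.get("type")] = color.get("rgb") if color is not None else None
        styles[style.get("name")] = elements
    return styles


print(read_table_styles(build_sample_xlsx()))
```

In an add-in the same lookup would have to be done on the bytes returned by getFileAsync, with a zip and XML parser of your choice.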

You can still get range from the table you want to handle. For example,
var range = worksheet.getRange("A1:E1");
Although "A1:E1" is part of a table, you can still get the area as a range and do your further actions.

Related

Programmatically add rows to an Excel data model via C#

We're looking at allowing our customers to download an Excel file from our web application which contains a raw export of their data along with some basic charts and pivot tables based on that data.
The basic way, we want to make this work is that we have a fixed Excel file which contains all the reporting elements in one worksheet and have room for the underlying data in another worksheet. When the user requests their Excel report, we programmatically fill out the data worksheet with their results and send them the final Excel file.
Everything seemed a bit too easy when prototyping with a fixed set of data. The dataset we worked with was added to the Excel Data Model and we then set up the charts and other reporting elements. However, when using that file as the template for the generated Excel file in our application, we found that the data model definition stays fixed: we built the prototype with a table definition of $A$1:$T$5879, but when generating the report, that definition isn't changed to match the size of the added dataset.
We're using EPPlus to generate our Excel sheets and have so far been unable to find any solution to this problem. This might very much be due to us being Excel novices. The goal is for the charts and pivot tables in the Excel sheet to reflect the total dataset contained in the file without the user having to do anything.

Ok, I've actually found a solution for it.
The solution was right in front of us.
We define the dataset as a named range. This is done under "Formulas", inside the "Name Manager". The "Refers To" field of a named range can take a formula, so instead of giving it a fixed size, we use this: =OFFSET(Data!$A$1;0;0;COUNTA(Data!$A:$A);COUNTA(Data!$1:$1))
This counts the number of rows and columns, with reference to A1 in our Data worksheet. All our pivots are set to reload on startup and that seems to work.
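To make the OFFSET/COUNTA logic concrete, here is a small Python sketch of what the formula evaluates to. The grid is a hypothetical stand-in for the Data worksheet (a list of rows) and the function names are mine. Like COUNTA, it counts non-empty cells, so a blank cell in column A or row 1 would make the range too small.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def dynamic_range_address(grid):
    """Mimic OFFSET($A$1;0;0;COUNTA(A:A);COUNTA(1:1)) for a list-of-rows grid."""
    if not grid or not grid[0]:
        raise ValueError("grid must contain at least one cell")
    # COUNTA(Data!$A:$A): non-empty cells in the first column.
    row_count = sum(1 for row in grid if row and row[0] not in (None, ""))
    # COUNTA(Data!$1:$1): non-empty cells in the first row.
    col_count = sum(1 for value in grid[0] if value not in (None, ""))
    return f"$A$1:${column_letter(col_count)}${row_count}"


# Hypothetical data: one header row plus two data rows, two columns.
sample = [
    ["Id", "Name"],
    [1, "Alice"],
    [2, "Bob"],
]
print(dynamic_range_address(sample))
```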

Read excel cell colour into Power BI

I have an Excel file that I need to read into Power BI. Unfortunately I have no control over this file as it's auto-generated by another person.
Some of the cells in this file are just filled with colours and I want to be able to translate these colours when importing the data into Power BI.
For example if the colour is green in excel then show true in the corresponding power BI cell. At the moment it's just blank.
Does anyone know of a way to get cell "meta" data like colour from excel in Power BI?

Don't give up just yet...
I found an example that works in a roundabout way using Power Query in Excel. It will give you the meta data associated with each cell by its address (e.g. A1 is highlighted with color FFFFFF00). I relied on some Excel functions to associate the highlighted cell addresses with the cell values. Pulling the cell data with Power BI might take some additional work.
The technique is to use Power Query to open the Excel .xlsx file, which is basically a .zip file containing .xml documents. The color information for each cell can be extracted into a table. From there I was able to use INDIRECT() statements to read from the .xlsx workbook and extract the values from the colored cells. It worked quite well for me.
You can find a working example in the forum in the link below. The user defined DecompressFiles function in the sample uses the Binary.Decompress command to access the XML files within the .xlsx file.
https://www.excelguru.ca/forums/showthread.php?7047-Extract-Cell-Color-with-M&p=28875&viewfull=1#post28875
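The same idea (open the .xlsx as a zip, then follow each cell's style index to its fill) can be sketched outside Power Query. The Python below is only an illustration of the lookup chain, not the M code from the linked thread. The sample XML is made up and heavily trimmed, and the sheet path xl/worksheets/sheet1.xml is an assumption: real workbooks map sheet names to paths through the workbook relationships.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}

# Hypothetical minimal parts: A1 uses style 1 (yellow fill), B1 has no style.
SAMPLE_SHEET = f"""<worksheet xmlns="{MAIN_NS}"><sheetData>
  <row r="1"><c r="A1" s="1"><v>1</v></c><c r="B1"><v>2</v></c></row>
</sheetData></worksheet>"""

SAMPLE_STYLES = f"""<styleSheet xmlns="{MAIN_NS}">
  <fills count="3">
    <fill><patternFill patternType="none"/></fill>
    <fill><patternFill patternType="gray125"/></fill>
    <fill><patternFill patternType="solid"><fgColor rgb="FFFFFF00"/></patternFill></fill>
  </fills>
  <cellXfs count="2"><xf fillId="0"/><xf fillId="2"/></cellXfs>
</styleSheet>"""


def build_sample_xlsx():
    """Build an in-memory zip with just the two parts this demo reads."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/styles.xml", SAMPLE_STYLES)
        zf.writestr("xl/worksheets/sheet1.xml", SAMPLE_SHEET)
    return buf.getvalue()


def cell_fill_colors(xlsx_bytes, sheet_path="xl/worksheets/sheet1.xml"):
    """Return {cell address: ARGB fill color} for cells with an explicit RGB fill."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        styles = ET.fromstring(zf.read("xl/styles.xml"))
        sheet = ET.fromstring(zf.read(sheet_path))
    fills = styles.findall("m:fills/m:fill", NS)
    xfs = styles.findall("m:cellXfs/m:xf", NS)
    colors = {}
    for cell in sheet.iter(f"{{{MAIN_NS}}}c"):
        # Cell "s" attribute -> cellXfs entry -> fillId -> fills entry.
        xf = xfs[int(cell.get("s", "0"))]
        fill = fills[int(xf.get("fillId", "0"))]
        fg = fill.find("m:patternFill/m:fgColor", NS)
        # Theme and indexed colors carry no rgb attribute and are skipped.
        if fg is not None and fg.get("rgb"):
            colors[cell.get("r")] = fg.get("rgb")
    return colors


print(cell_fill_colors(build_sample_xlsx()))
```

Colors applied by conditional formatting are not stored this way, so this only finds fills set directly on the cell.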
In my situation, I had a database export of about 7,000 rows and 50 columns into Excel. Working offline, users then went through Excel and made changes, highlighting every cell they had changed. Then they wanted me to update the database with only the highlighted cells. The background color used by each person varied but I didn't care what the color was, just that it was colored.
For each changed cell I was able to generate SQL statements to update the database and also insert into a transaction log table. The main database table was mostly flat but the few foreign key lookup values that were modified I had to update manually.
Column F uses the Indirect formula to pull data from the source workbook. Note that the source workbook must be open for the Indirect formula to read from it.
=INDIRECT("'[" & Import_Filename & "]" & Sheet_Name & "'!"&[#[SheetCellRef.2]])
Column G refines the data in Column F by putting quotes around strings or NULL if the cell is blank.
Column H grabs the column heading to know what field to update.
Column K grabs the Record ID value from the row specified in Column E.
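What columns G, H and K feed into can be sketched roughly like this in Python. The table name, the RecordID key column and the function names are hypothetical, since the answer does not show the actual statements. Building SQL by string concatenation is only reasonable for a reviewed one-off script like this; parameterized queries are the safer route when executing directly.

```python
def sql_literal(value):
    """Quote strings, pass numbers through, and turn blanks into NULL."""
    if value is None or value == "":
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Double any single quotes so the literal stays valid SQL.
    return "'" + str(value).replace("'", "''") + "'"


def build_update(table, field, value, record_id, key_column="RecordID"):
    """Build one UPDATE statement for a single changed cell."""
    return (
        f"UPDATE {table} SET {field} = {sql_literal(value)} "
        f"WHERE {key_column} = {sql_literal(record_id)};"
    )


# Hypothetical changed cells: (column heading, new value, record id).
changes = [("City", "O'Fallon", 42), ("Phone", "", 43)]
for field, value, record_id in changes:
    print(build_update("Customers", field, value, record_id))
```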
I have had to run this process three different times for the users so my time invested paid off quickly. All I have to do is put their latest highlighted Excel file in the local folder and refresh the Power Query to generate new SQL statements.
Sorry I don't have a 'solution' posted right here. The process is still a little fragile and I'm trying to make a more robust example I can share. Stack Overflow doesn't seem to be set up for ongoing development of a solution. The point of this answer is to give hope to some of you who are desperate for a solution and won't take 'No' for an answer.

Sigh.
Color is not data. Unfortunately, many people color-code cells and then expect to be able to do things based on the color of the cell. But it's not that simple.
Although Excel now provides some ways to filter by cell color, it still cannot identify cell color with a worksheet formula.
Hence, you will need a VBA routine that evaluates all cells and records their colors in another table, which you will then need to push into your Power BI data model.
In the long run, it might be easier to talk to that other person who produces the color coded cells, and teach them a better way of doing things. Show them how to use conditional formatting based on cell values for color coding. The logic used for conditional formatting can also be applied to classify the data in Power BI.
From a data architecture point of view, the best solution is to address the problem at the source, instead of creating tools to handle bad data input.
Just sayin'.

How do I apply data filter to only the table range and not the whole row?

I have got two adjacent tables. When I apply data filter on first table, it filters the whole row hiding rows from 2nd table as well. How do I restrict filter to only the first table range?

To answer your direct question How do I restrict filter to only the first table range? the answer is - you can't.
Reading the comments it seems what you need is to display the filtered table data next to a chart and another table. There is a little-known tool in Excel that you can use to achieve this: the Camera Tool. With this you can create a dynamic image of a range and place it where you want. The image updates when a filter is applied to the source range, without affecting the rows on the Dashboard sheet.
Screenshots to demonstrate:
Setup with tables on separate sheets, and camera images beside chart on dashboard sheet
With Filter applied to Table A
The Camera tool is not on the Ribbon (Excel 2010) or the standard toolbars (Excel 2003). You need to add it using Customisation. (Add to Quick Access Toolbar in 2010 or Tools/Customisation Menu in 2003.)

Unfortunately you won't be able to do that. When you filter, it filters the entire row (something to think about would be how the row number would display if that weren't the case). You will need to restructure your setup if you wish to prevent that (not sure of your particular use case, so sorry I can't give a more specific suggestion).

I had a similar issue, where I had a table I wanted to remain static, like a key, but wanted to filter the main table.
To get around this, I copied the static table, and pasted it as an image. This way, when you filter on the main table, the image remains where you have put it.

A simple workaround for this general issue that others may have mentioned (but I don't see here):
You can't filter just a range (e.g. a few columns in a spreadsheet), but you can sort just a range. And by sorting the range, then deleting some blocks of unwanted cells in the range, then sorting the range back to the original order, you can fake a filter.
A bit clunky, but easy for some jobs if you're careful.

Selecting and editing cells in Excel with OpenXML SDK

I'm basically inserting information from a dataset into an excel document. We really don't want to use interop services, so my best option is to use the OpenXML SDK.
Basically all I need to do is select a cell based on an id/name/whatever (I would rather not use the standard "A1" format), and then insert something into it. But for the life of me I can't figure out how to obtain a set of cell elements based on an attribute value.
Part 2 of this is I then need to merge certain cells. Which is much easier to do since it involves simply appending a collection of mergecell elements to the mergecells table. But I'm looking to make sure that any cell within a merge range isn't a part of another merged cell range, as this would cause problems.
I feel like this is a REALLY powerful tool, but the lack of documentation and examples makes it a difficult subject to approach.
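The overlap check from part 2 is independent of the OpenXML SDK itself, so here is a minimal sketch of the logic in Python (the SDK is .NET; this only shows the rectangle test you would port). The function names are mine, and references are assumed to be plain A1-style strings like those stored in a mergeCell ref attribute.

```python
import re


def column_number(letters):
    """Convert Excel column letters to a 1-based index (A -> 1, AA -> 27)."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def parse_range(ref):
    """Parse 'B2:D5' (or a single cell 'B2') into (min_col, min_row, max_col, max_row)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)(?::([A-Z]+)(\d+))?", ref.upper())
    if not match:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    c1, r1, c2, r2 = match.groups()
    cols = sorted((column_number(c1), column_number(c2 or c1)))
    rows = sorted((int(r1), int(r2 or r1)))
    return cols[0], rows[0], cols[1], rows[1]


def ranges_overlap(ref_a, ref_b):
    """True if the two rectangles share at least one cell."""
    a_c1, a_r1, a_c2, a_r2 = parse_range(ref_a)
    b_c1, b_r1, b_c2, b_r2 = parse_range(ref_b)
    return not (a_c2 < b_c1 or b_c2 < a_c1 or a_r2 < b_r1 or b_r2 < a_r1)


def conflicts(new_ref, existing_refs):
    """List the existing merged ranges that a new merge would collide with."""
    return [ref for ref in existing_refs if ranges_overlap(new_ref, ref)]


print(conflicts("B2:C3", ["A1:B2", "D1:E2", "C3:C8"]))
```

Running `conflicts` before appending a new mergeCell element would tell you which existing merges need to be removed or rejected first.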

It seems the easiest interface for editing excel documents is the ClosedXML library.
Searching each worksheet for specific cells is a lot easier, and you can also search for named ranges.
It also automatically handles cell merges by deleting any previous merged ranges that conflict with the newest one.

Getting mixed tabular & non-tabular data from Excel into Access

My Access programming is a little rusty, & I've never worked with Excel files all that much.
I have a requirement to bring data from Excel spreadsheets into Access 2007. These spreadsheets have a fixed (predictable) format, but it includes a "header area" where I need to read single data items from specific cells, followed by a mass of tabular data (~500 rows in the one sample I've seen so far). I will be processing all of this into a set of tables that are normalized quite differently from the flat layout of the spreadsheet.
I know how to open an ADO recordset on the tabular data, and it should work fairly well for my purposes. I also figure that I can reference the Excel object model and open the sheets through Automation to get the "header area" data items.
My question is this: since I have to (I think) use the Automation approach for the "header area", am I better off just leaving it open in this mode to move on to the tabular data (with cell/range navigation), or closing that mode & going over to ADO? I suspect it's the latter--and I'd be more comfortable with it--but I don't want to do the wrong thing just because it's more familiar.
Edit
It seems I wasn't clear that I need to build this capability into the "application", as something that a user can repeat down the line. I'm assured that I can trust the format of the spreadsheet (though I'll include error trapping for graceful failure if that turns out to be false). These spreadsheets are "official design documents" for hardware, and my app needs to handle bringing in new &/or updated ones to track the things that are described in the tabular data in ways that the flat Excel format doesn't allow for.

Of those two options, I would choose the second simply because I find it more convenient to work with an ADO recordset. It should be fairly simple if you can assign a named range to your spreadsheet's tabular data.
Edit: If your spreadsheet includes field names, the recordset approach would be less prone to break due to spreadsheet changes such as one or more new columns inserted before or between the existing columns or a re-ordering of the existing columns.
But actually, I think the TransferSpreadsheet Method might be more convenient. You can specify the spreadsheet range as a named range or by cell address as in this example from the linked page:
DoCmd.TransferSpreadsheet acImport, 3, _
"Employees","C:\Lotus\Newemps.wk3", True, "A1:G12"
Also, you can choose between importing the spreadsheet range directly into an Access table, or linking to the range as a "virtual" table ... whichever best meets your application's needs.
Edit2: Creating a link (acLink instead of acImport) with TransferSpreadsheet would allow you to execute SQL statements against the link table:
INSERT INTO DestinationTable (field1, field2, field3)
SELECT foo, bar, bat FROM LinkedTable;

If the header information is really complicated, this can simplify your coding work:
In the official design Excel file, create a hidden tab.
In that tab, make a 1-row table connecting to all the header elements you're interested in. (e.g. set row 1 column 1 to "Document#" and row 2 column 1 to =Sheet1!A1)
Then you can re-use the same VBA procedure to import both your tabular data and your header data.

I would do it all via Automation. Why have two separate processes where one will do? After you've read the header information reading the tabular information will be quite easy.

I inherited an application back in mid-2000 that was built to import Excel spreadsheets that were basically reporting output from MYOB (an accounting program). What had been done was to simply create a template table that had all the columns necessary to accommodate the report, using text data type for all columns. Then the non-data rows were filtered out and processed into the eventual destination table.
It's not elegant, but it doesn't require a lot of programming, though the implementation I inherited used a dedicated temp table for each report layout that was being imported. You could easily replace all of those with a single table with 100 text columns of 255 (or memo fields, for that matter, if that was a requirement), and just re-use it.
I'm not sure if I'd recommend it or not, but it really is quite easy without requiring much in the way of code.
