Visually represented user re-sizable data ranges in Excel - excel

Short Version
Is there a way to have visually represented user re-sizable data ranges in Excel? (If so, via VSTO?)
Long version
I'm writing an add-in to Excel that helps with exporting data within arbitrary workbooks to existing database tables. The data is more or less tabular but it's almost always laid out differently. I'm looking to make the process as error free but quick as possible. For example, columns for ranges of tabular data have their header names ranked for similarity to a pre-determined field names list. The rankings are then fed into a solver for the assignment problem. This allows columns to be mapped to fields automatically with surprisingly high accuracy.
However, detecting the ranges of tabular data isn't feasible -- often not all of the data is wanted for export. Therefore, I'm looking to make a familiar yet quick to operate user interface for users to specify the tabular data ranges within a workbook.
One such user interface would be to have the user draw and re-size the ranges they'd like to export. Thus, I'm seeking to do exactly that. However, I'm open to other user interface ideas if they're more conducive to implementation yet still easy to use.
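To make the column-to-field mapping described above concrete, here is a minimal sketch in Python. It is not the add-in's actual code: the header and field names are made up, `difflib.SequenceMatcher` stands in for whatever similarity ranking is really used, and a brute-force search over permutations replaces a proper assignment-problem solver (such as the Hungarian algorithm), which is only reasonable for a handful of columns.

```python
from difflib import SequenceMatcher
from itertools import permutations


def similarity(a: str, b: str) -> float:
    """Score two names between 0.0 and 1.0, ignoring case and punctuation."""
    def norm(s: str) -> str:
        return "".join(ch for ch in s.lower() if ch.isalnum())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def map_columns(headers, fields):
    """Assign each header to a distinct field so the total similarity is maximal.

    Brute force over permutations: fine for a few columns, far too slow for many.
    Assumes len(headers) <= len(fields).
    """
    scores = [[similarity(h, f) for f in fields] for h in headers]
    best = max(
        permutations(range(len(fields)), len(headers)),
        key=lambda perm: sum(scores[i][j] for i, j in enumerate(perm)),
    )
    return {headers[i]: fields[j] for i, j in enumerate(best)}


# Hypothetical worksheet headers and database field names.
headers = ["Cust Name", "E-mail", "Phone No."]
fields = ["phone_number", "customer_name", "email"]
mapping = map_columns(headers, fields)
```

Solving the assignment as a whole, rather than picking the best field for each column independently, prevents two columns from claiming the same field.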

The solution I've ended up with is creating a scratch copy of the worksheet and managing its formatting in order to highlight various portions of it: instead of user-resizable areas, regions are "painted" by selecting cells. (The interface resembles a familiar paint program, with various tools that let you manipulate the cell information in certain ways.) A toggle button switches between the data export region mode and normal mode.
Under the hood it's not a pretty, elegant solution, but it's pretty slick for the end user.

Related

Scenario manager: can you create scenarios in a table, and get the Scenario Manager to read from there?

Microsoft's official site has an explanation on how to use scenarios in Excel.
If you name the input cells, the scenario manager will show the name, so it's easier to remember that $C$5 is, say, the price.
My question is: is it possible to set up the scenarios in a table somewhere in Excel, and get the scenario manager to read from there? Setting multiple scenarios in the scenario manager is very fiddly, time-consuming and error-prone, especially when the inputs are linked - e.g. setting 10 scenarios where each scenario is an x% change from the previous.
Any suggestions?
PS I know all these things can be done very easily in a scripting language like Python or R, but in this very specific case the calculations are not too complex and the file needs to be shared with other people, so I must use Excel.
VBA would be a last resort because some of these people have VBA disabled by default.
Edit
To clarify, what I'd need is a way to create a table like this below, where those in blue are the inputs, and those in grey are the outputs. I have put together a banal example below, along the lines of the example in the VBA macro answer given below, but the general idea is:
- define a number of scenarios, each being a combination of multiple inputs (more than 2);
- create a table showing, for each scenario, the inputs and some key outputs;
- note the table doesn't have all the possible combinations of all the inputs, like the macro given in one of the examples - that would be too much and wouldn't be very readable.
I could put together a quick VBA script that changes the inputs in the model, reads the result and creates the table, but I was wondering if there is a better way - VBA is typically not very robust, in the sense that just changing the location of one cell can often mess things up. I usually avoid Excel for the more complex models (this would be banal in any scripting language), but this I have to do in Excel.
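For comparison, this is roughly what the same loop looks like in a scripting language. It is only a sketch: `model` is a hypothetical stand-in for the workbook's calculation chain, and the input names (`price`, `volume`, `unit_cost`) are invented for the example.

```python
def model(price, volume, unit_cost):
    """Hypothetical stand-in for the many calculations behind the outputs."""
    revenue = price * volume
    profit = revenue - unit_cost * volume
    return {"revenue": revenue, "profit": profit}


def chained_scenarios(base, key, pct_change, count):
    """Build `count` scenarios where `key` changes by pct_change % from the previous one."""
    scenarios, current = [], dict(base)
    for _ in range(count):
        scenarios.append(dict(current))
        current[key] = current[key] * (1 + pct_change / 100)
    return scenarios


def scenario_table(scenarios):
    """One row per scenario: the inputs followed by the key outputs."""
    return [{**inputs, **model(**inputs)} for inputs in scenarios]


base = {"price": 100.0, "volume": 10, "unit_cost": 60.0}
table = scenario_table(chained_scenarios(base, "price", 10, 3))
```

Only the listed scenarios are evaluated, not every combination of every input, which is what keeps the resulting table readable.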
EDIT #2:
Trying to further clarify what I have in mind, I have put together the screenshot below. Each output is the result of many different calculations, and CANNOT be calculated as a small, simple formula - if it could, I would not have any issue, of course!
My issue is that:
- if I change an input, then all the many many calculations occurring behind the scenes change
- the outputs are read from all those calculations
- I cannot use two-way what-if tables
If even this is not clear, the only other thing I can try is to upload an Excel file, which is generally discouraged on SO.
Scenario Manager is a built-in feature with its own GUI.
For this reason, it is limited in what it can call (only data entered in the GUI).
VBA will allow you to manipulate this data, telling it where to pull the changing values from and what to change them by.
So the answer for your specific query:
Can I use Excel without VBA to perform Scenario Manager tasks not set by the GUI?
No.
But it doesn't mean fiddling with the Manager itself would be horrendous. There are ways to teach and learn with it, and if you save a macro-enabled document, users should be able to turn the macro on with the click of a button, so VBA can be an option too.
I hope this helps?

Excel tables vs plain data

I have a large Excel file consisting of multiple data sheets with plain data and a couple of dashboard sheets with various graphs and kpi's based on these data.
I am looking to make the file smaller and faster to work with. Should I convert the unformatted data to tables or not? I can't really find anything to support this.
Anyone got any ideas?
Tables have a lot of benefits but they are generally slower than plain data, (although the latest version of Excel 2016 has significant Table speed improvements).
You're actually asking the wrong question. The question you should be asking is "Why is Excel taking so long to recalculate, why is my filesize so large, and what can I do about it?"
And you don't give us much info about your symptoms. How large is your file in MB? How many rows of data in it? Lots of lookups on big ranges? Lots of volatile functions like OFFSET, INDIRECT at the head of long dependency chains? How slow is it? What version/SKU of Excel do you have?
If Excel runs slowly, it's generally because people have inadvertently programmed it to run slowly, due to poor formula choice and suboptimal layout. Converting to Tables or not isn't going to make a hell of a lot of difference, by the sound of things.
Common culprits that result in slow files include the following (note the last one re Tables) :
- Volatile functions with long calculation chains running off them. See my post at https://chandoo.org/wp/2014/03/03/handle-volatile-functions-like-they-are-dynamite/ for more on this.
- Inefficient lookups on multiple columns (such as using multiple VLOOKUPs to bring through data for the same record rather than using one MATCH and multiple INDEX functions).
- Lookups on very, very long arrays. See my post at http://dailydoseofexcel.com/archives/2015/04/23/how-much-faster-is-the-double-vlookup-trick/ to learn how sorting your lookup tables and using the binary match parameter can speed up lookups thousands fold.
- Overuse of resource-intensive formulas such as SUMPRODUCT when simpler alternatives exist (including SUMIF and its variants, or even better, PivotTables).
- Using IF and other functions to change the formatting of thousands of cells, instead of using custom number formatting.
- Using Data Tables. (These can really hog resources, and sometimes better alternatives exist.)
- Using many thousands of extra formulas to reference data input cells that might not be used, rather than using Excel Tables (aka ListObjects) that expand dynamically, automatically. I always use Tables to host my data and settings. They radically simplify referencing (including from VBA) and file maintainability.
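The two lookup points in the list above can be illustrated outside Excel. The following Python sketch uses made-up data and is only an analogy for how the worksheet functions behave: it finds the row once (the MATCH), reuses that position for several columns (the INDEX calls), and uses a binary search on sorted keys with an equality check, which is the idea behind the double-VLOOKUP trick.

```python
from bisect import bisect_left


def match_sorted(keys, wanted):
    """Binary-search a sorted key column; return the row index, or None if absent."""
    i = bisect_left(keys, wanted)
    # A binary (approximate) match is only trusted when the key it lands on
    # is exactly the one that was asked for.
    return i if i < len(keys) and keys[i] == wanted else None


# Hypothetical lookup table, sorted by id.
ids = [101, 104, 107, 110]
names = ["Ann", "Bob", "Cy", "Dee"]
prices = [9.5, 3.0, 7.25, 1.0]

row = match_sorted(ids, 107)        # one MATCH ...
record = (names[row], prices[row])  # ... then several cheap INDEX calls
```

A binary search needs about log2(n) comparisons instead of up to n, which is where the large speed-up on long sorted ranges comes from.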
You need to address the root cause, not the symptoms. A good place to start is this article by recalculation guru Charles Williams (who I see has dropped by):
https://msdn.microsoft.com/en-us/vba/excel-vba/articles/excel-tips-for-optimizing-performance-obstructions
In terms of file size, as Charles Williams puts it: To save memory and reduce file size, Excel tries to store information about the area only on a worksheet that was used. This is called the used range. Sometimes various editing and formatting operations extend the used range significantly beyond the range that you would currently consider used. This can cause performance obstructions and file-size obstructions.
You can check what Excel thinks is the used range by pushing Ctrl + End. If you find yourself miles below or to the right of where your data ends, then delete all the rows/columns between that point and the edge of your data:
- To quickly do the rows, select the entire row that lies beneath the bottom of your data, then push Ctrl + Shift + Down Arrow (which selects all the rows right to the bottom of the spreadsheet) and then use the Right-Click DELETE option.
- For columns, select the entire column to the immediate right of your data, use Ctrl + Shift + Right Arrow to select the unused bits, and then use the Right-Click DELETE option.
(Note that you’ve got to use the Right-Click DELETE option, and not just push the Delete key on the keyboard.)
When you’ve done this, push Ctrl + End again and see where you end up – hopefully close to the bottom right corner of your data. Sometimes it doesn’t work, in which case you need to push Alt + F11 (which opens the VBA editor), type Application.ActiveSheet.UsedRange in the Immediate Window and then push ENTER (and if you can’t see a window with the caption “Immediate” then push Ctrl + G).
Lastly, depending on what version and SKU of Excel you have, you may be able to use PowerPivot and PowerQuery to radically simplify things and drastically cut down on the amount of formulas in your workbook.

Sort text-based information into different sheets

I am creating a tracking document for artists' accommodation as part of an arts festival and would like to automate part of my work flow. Whilst we use event management/scheduling software for confirmed bookings, it's nice to do all my working in Excel.
I would like to have a master sheet (sheet 1), with a full list of artists and their respective accommodation - that can then be sorted into individual sheets (sheet 2, 3 etc) based on the name of the accommodation. The automatic sorting would also capture the other pieces of information in the row.
This would allow for each different sheet to show a report on who is staying in each type of accommodation and would be rather handy!
I would recommend one or more PivotTables as a simpler solution. Here a PT and two clones are shown on your Master Sheet, but they could each be on their own sheet:
Accom is in Report Filter, Company is in Row Labels and PAX (as Sum) is in Σ Values. Once you have clicked on PivotTable in Insert > Tables - PivotTable and chosen your range ('Master Sheet'!$A$2:$C$7) and Location, just drag the fields from the big box to the little ones.
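What that PivotTable layout computes is a group-and-sum. As a rough illustration only, here it is in Python; the column names Accom, Company and PAX come from the layout above, while the rows themselves are invented.

```python
from collections import defaultdict

# Hypothetical master-sheet rows: (Accom, Company, PAX).
rows = [
    ("Hotel A", "Troupe X", 4),
    ("Hostel B", "Troupe Y", 2),
    ("Hotel A", "Troupe Z", 3),
    ("Hotel A", "Troupe X", 1),
]


def pax_by_accom(rows):
    """Sum PAX per company within each accommodation (one 'report' per Accom)."""
    totals = defaultdict(lambda: defaultdict(int))
    for accom, company, pax in rows:
        totals[accom][company] += pax
    return {accom: dict(companies) for accom, companies in totals.items()}


report = pax_by_accom(rows)
```

Each key of `report` corresponds to one value chosen in the Report Filter, i.e. one per-accommodation sheet.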
This is feasible using Excel, but I don't recommend it; it is creating a maintenance nightmare in the long run.
From the question I can't gather whether the data is available in some kind of event management software package; if so you can use that one as a data source. Or create an Access or SQL database with a few tables. After that, you can use one of the following options to make the necessary overviews and as many more as you think up during the project:
- Use Excel with ODBC or a web query to retrieve data aggregated and sorted as you like. Make changes in the event management package, allowing others to see the same facts. Or do it in Access. When you change one thing, it automatically propagates into Excel as well.
- Similarly, you can use an Excel add-in such as Invantive Control (caution: I work at a supplier) to retrieve the data from the database using SQL or a web service, change it from within Excel and then synchronize the changes back, assuming you have write access. A similar solution is available as SQL*XL. Probably there are others too.
If the solution must be Excel only, I would recommend using vertical/horizontal lookups with the Excel functions vlookup / hlookup (Dutch: vert.zoeken, horiz.zoeken). These functions perform reasonably with a small amount of data, and performance can be improved by sorting. They also resemble SQL joins, so the database you get within Excel more easily conforms to the relational model.
I hope the event is successful and the people enjoy it.

What is the best way to import data from sophisticated formula enriched Excel files into SalesForce.com?

My current employer (to remain nameless) has a collection of incredibly sophisticated Microsoft Excel 2003 worksheets (developed by contractors, also to remain nameless).
The employer is replacing the Excel-based solution with a SalesForce-based solution (developed by other contractors, likewise to remain unnamed). The SalesForce solution is also very complex, using dozens of related objects and "Dynamic SOQL" to contain the data and formulas which were previously contained in the Excel-based solution.
The employer's problem, which has become my problem, is that the data from the Excel spreadsheets needs to be meticulously and tediously recreated in .CSV files so it can be imported into SalesForce.
While I've recently learned I can use CTRL-` to review formulas in Excel, this doesn't solve the problem that variables in Excel have cryptic names like $O$15. If I'm lucky, when I investigate $O$15, I'll find some metadata explaining it n cells up, and/or some other data m cells to the left, and/or (in rare instances) a comment on the cell.
Patterns within the Excel spreadsheets are very limited, rarely lasting more than 6 concurrent rows or columns and no two sheets which need to be imported have much similarity.
Documentation of all systems is very limited.
Without my revealing any confidential data, does anyone have any good ideas how I might optimize my workflow?
It's not clear exactly what you need to do: here are 3 possible scenarios, requiring increasing knowledge of Excel.
1. If all you want is to convert the Excel spreadsheets into CSV format then just save the worksheets as CSVs.
2. If you just want the data and not the formulae then it would be simple (using VBA) to output anything that isn't a formula (the cell.Formula won't start with =).
3. If you need to create a linkage excel-->csv-->existing Salesforce objects/SOQL then you will need to understand both the Excel Spreadsheets and the Salesforce objects/SOQL that have been created. This will be difficult unless you have good knowledge and experience of Excel and also understand what the salesforce App requires.
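The test in scenario 2 is just a check on the formula text. A minimal sketch of that filter in Python, assuming the cells have already been read into a dictionary of address to formula text (the addresses and contents here are made up):

```python
def constant_cells(cells):
    """Keep only cells whose formula text does not start with '=' (i.e. plain data)."""
    return {
        addr: text
        for addr, text in cells.items()
        if not str(text).startswith("=")
    }


# Hypothetical cell contents as they would appear in a cell's formula text.
cells = {"A1": "Price", "B1": "12.5", "C1": "=B1*1.2", "O15": "=SUM(B1:C1)"}
data_only = constant_cells(cells)
```

Everything that is filtered out is a calculation that has to be re-created on the Salesforce side rather than imported.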
Brian, if you're still working on this, here's one way to approach the problem. I use this kind of process often for updating data between SFDC and marketing automation apps.
1) Analyze the formulae that you're re-creating in Salesforce.com to determine what base data fields you need (stuff that doesn't have to be calculated from something else).
2) Find those columns/rows in your spreadsheets and use Paste Special -> Values in a new spreadsheet to create an upload file with values instead of formulae that you need for each data area (leads, prospects, accounts, etc.)
3) If you have to associate the info with leads or contacts or accounts and you have already uploaded or created those records in Salesforce.com, be sure to export them with their ID numbers. That makes it easy to use the vlookup formula in Excel to match up fields that you need to add and then re-upload the data into Salesforce.
Like data cleaning, this can be a tedious process. But if you take it step by step it shouldn't be too hard. Good luck.
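The ID-matching in step 3 is a keyed join, which is what the vlookup does. A rough sketch in Python, assuming both files share an Email column to match on (the column names, IDs and rows are hypothetical):

```python
def attach_ids(exported, updates, key="Email"):
    """Add the exported record Id to each update row, matching on `key` (case-insensitive)."""
    ids = {rec[key].strip().lower(): rec["Id"] for rec in exported}
    matched, unmatched = [], []
    for row in updates:
        rec_id = ids.get(row[key].strip().lower())
        if rec_id is None:
            unmatched.append(row)  # needs manual review before upload
        else:
            matched.append({"Id": rec_id, **row})
    return matched, unmatched


# Hypothetical export from Salesforce and values pasted from the spreadsheet.
exported = [
    {"Id": "ID-0001", "Email": "ann@example.com"},
    {"Id": "ID-0002", "Email": "bob@example.com"},
]
updates = [
    {"Email": "Bob@example.com", "Region": "West"},
    {"Email": "cy@example.com", "Region": "East"},
]
matched, unmatched = attach_ids(exported, updates)
```

Keeping the unmatched rows separate makes it obvious which records still need an ID before the re-upload.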

Getting mixed tabular & non-tabular data from Excel into Access

My Access programming is a little rusty, & I've never worked with Excel files all that much.
I have a requirement to bring data from Excel spreadsheets into Access 2007. These spreadsheets have a fixed (predictable) format, but it includes a "header area" where I need to read single data items from specific cells, followed by a mass of tabular data (~500 rows in the one sample I've seen so far). I will be processing all of this into a set of tables that are normalized quite differently from the flat layout of the spreadsheet.
I know how to open an ADO recordset on the tabular data, and it should work fairly well for my purposes. I also figure that I can reference the Excel object model and open the sheets through Automation to get the "header area" data items.
My question is this: since I have to (I think) use the Automation approach for the "header area", am I better off just leaving it open in this mode to move on to the tabular data (with cell/range navigation), or closing that mode & going over to ADO? I suspect it's the latter--and I'd be more comfortable with it--but I don't want to do the wrong thing just because it's more familiar.
Edit
It seems I wasn't clear that I need to build this capability into the "application", as something that a user can repeat down the line. I'm assured that I can trust the format of the spreadsheet (though I'll include error trapping for graceful failure if that turns out to be false). These spreadsheets are "official design documents" for hardware, and my app needs to handle bringing in new &/or updated ones to track the things that are described in the tabular data in ways that the flat Excel format doesn't allow for.
Of those two options, I would choose the second simply because I find it more convenient to work with an ADO recordset. It should be fairly simple if you can assign a named range to your spreadsheet's tabular data.
Edit: If your spreadsheet includes field names, the recordset approach would be less prone to break due to spreadsheet changes such as one or more new columns inserted before or between the existing columns or a re-ordering of the existing columns.
But actually, I think the TransferSpreadsheet Method might be more convenient. You can specify the spreadsheet range as a named range or by cell address as in this example from the linked page:
DoCmd.TransferSpreadsheet acImport, 3, _
"Employees","C:\Lotus\Newemps.wk3", True, "A1:G12"
Also, you can choose between importing the spreadsheet range directly into an Access table, or linking to the range as a "virtual" table ... whichever best meets your application's needs.
Edit2: Creating a link (acLink instead of acImport) with TransferSpreadsheet would allow you to execute SQL statements against the link table:
INSERT INTO DestinationTable (field1, field2, field3)
SELECT foo, bar, bat FROM LinkedTable;
If the header information is really complicated, this can simplify your coding work:
In the official design Excel file, create a hidden tab.
In that tab, make a 1-row table connecting to all the header elements you're interested in (i.e. set row 1 column 1 to "Document#" and row 2 column 1 to =Sheet1!A1).
Then you can re-use the same VBA procedure to import both your tabular data and your header data.
I would do it all via Automation. Why have two separate processes where one will do? After you've read the header information reading the tabular information will be quite easy.
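To show how little extra work the second half is, here is a sketch of the single-pass idea in Python rather than VBA. It assumes the sheet has already been read into a list of rows; the header cell positions, the table start row and the sample contents are all hypothetical.

```python
def read_sheet(grid, header_cells, table_top):
    """Read fixed-position header items, then the table that starts at row `table_top`."""
    header = {name: grid[r][c] for name, (r, c) in header_cells.items()}
    columns = grid[table_top]  # first table row holds the field names
    records = [
        dict(zip(columns, row))
        for row in grid[table_top + 1:]
        if any(cell != "" for cell in row)  # skip completely blank rows
    ]
    return header, records


# Hypothetical sheet contents: a header area followed by tabular data.
grid = [
    ["Document#", "HW-001", ""],
    ["Revision", "B", ""],
    ["", "", ""],
    ["Part", "Qty", "Notes"],
    ["Bolt", "4", ""],
    ["", "", ""],
    ["Nut", "4", "M6"],
]
header, records = read_sheet(
    grid, {"document": (0, 1), "revision": (1, 1)}, table_top=3
)
```

Because the records are keyed by field name, a re-ordered column in the table does not break the code that consumes them.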
I inherited an application back in mid-2000 that was built to import Excel spreadsheets that were basically reporting output from MYOB (an accounting program). What had been done was to simply create a template table that had all the columns necessary to accommodate the report, using text data type for all columns. Then the non-data rows were filtered out and processed into the eventual destination table.
It's not elegant, but it doesn't require a lot of programming, though the implementation I inherited used a dedicated temp table for each report layout that was being imported. You could easily replace all of those with a single table with 100 text columns of 255 (or memo fields, for that matter, if that was a requirement), and just re-use it.
I'm not sure if I'd recommend it or not, but it really is quite easy without requiring much in the way of code.
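The staging approach boils down to "load everything as text, then keep only the rows that look like data". A small sketch of that filter in Python, with invented report rows and a hypothetical rule for what counts as a data row:

```python
def is_data_row(row):
    """Hypothetical rule: a data row has an all-digit account code in the first column."""
    return row[0].strip().isdigit()


# Hypothetical staging table: every cell imported as text, report furniture included.
staging = [
    ["Profit & Loss", "", ""],
    ["Account", "Name", "Amount"],
    ["4000", "Sales", "1200.50"],
    ["", "", ""],
    ["6000", "Rent", "-300"],
    ["Total", "", "900.50"],
]

# Filter out the non-data rows and convert the rest to typed records.
destination = [
    {"account": int(r[0]), "name": r[1], "amount": float(r[2])}
    for r in staging
    if is_data_row(r)
]
```

The rule for recognising a data row is the only part that depends on the report layout, so a single staging table can be reused across layouts.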
