Spreadsheet with relationships - excel

I have to work with data in CSV files. They look like this:
sample
It represents products with their options, cars, etc. in a web store.
It has a lot of columns with duplicated values, and in my work I often need to copy part of this data to another sheet, deduplicate it, edit it, and then paste it back by matching on one of the columns that was left untouched. For this purpose I'm using the Ablebits suite for Excel.
Is it possible to automate this process with some Excel function, or is there other software that could handle it? I'm looking for something less complicated than a relational database like Access, but close to a spreadsheet editor with relationships.
I already tried Power Query in Excel and Power BI, but they seem to be analytics tools rather than data-editing tools.
Second edit:
The data has a layered structure with duplicates:
Title1|Part number 1|Car1
Title1|Part number 1|Car2
Title2|Part number 2|Option1
Title2|Part number 3|Option2
I want to be able to:
Edit duplicated values without using "Replace All", or at least have a more flexible "Find & Replace".
Extract columns, deduplicating them while keeping a reference to where the values came from, so that editing the extracted data also changes the original. For example, I have a lot of titles that I need to edit. Instead of copying them out with some ID to reference them, I want to open them the way they appear in filters, edit them, confirm, and have the edits applied throughout the column.

I would use Power Query (aka Get & Transform on the Data ribbon in Excel 2016). The only limitation I see with what you want to do is that Power Query will deliver a new Excel Table with the output of a Query - it can't update existing cells.
If you can get past that, Power Query is very flexible, easy to learn (WYSIWYG query editor), scales well and is integrated with other Microsoft products (as well as Power BI, there is integration with SQL Server Analysis Services in preview and hopefully SQL Server Integration Services one day).
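Outside Excel, the cycle described in the question (extract the distinct values of a column, edit them once, write the edits back to every row) can be sketched in a few lines of Python. This is only an illustration under assumptions: the rows copy the sample from the question, and the helper names unique_values and apply_edits and the replacement title are hypothetical.

```python
# Rows copied from the sample in the question: Title | Part number | Car/Option.
rows = [
    ["Title1", "Part number 1", "Car1"],
    ["Title1", "Part number 1", "Car2"],
    ["Title2", "Part number 2", "Option1"],
    ["Title2", "Part number 3", "Option2"],
]


def unique_values(rows, col):
    """Return the distinct values of one column, in first-seen order."""
    return list(dict.fromkeys(row[col] for row in rows))


def apply_edits(rows, col, edits):
    """Return new rows where each value in `col` is replaced via `edits`.

    Values without an entry in `edits` are left unchanged.
    """
    return [
        row[:col] + [edits.get(row[col], row[col])] + row[col + 1:]
        for row in rows
    ]


titles = unique_values(rows, 0)          # the deduplicated list to edit
edits = {"Title1": "Brake pad set"}      # hypothetical edit: old value -> new value
updated = apply_edits(rows, 0, edits)    # every row holding "Title1" is changed
```

The mapping from old value to new value plays the role of the "reference to the place they were taken": no separate ID column is needed, because the old value itself is the key.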

Related

Excel VBA: Create a table from the permutation of other two

I want to generate new entries from two tables, as described below.
It could work like a VLOOKUP, but I need more than one "result entry" per "original entry".
I would like to know whether there is a single formula for this, or whether I should use macros or maybe even Python.
I also didn't really know how to search for this, so a link would be useful too.
This example will explain what I need better than I can.
Example:
Table 1. SALES
Table 2. Tells which primary materials are used in each product (product components)
Result table: Tells the amount of primary materials by product and client
You could use Excel Power Query if you are looking to keep things simple.
We use Office 365 in our office and we frequently use Power Query for things like this.
I put your data into two tables and configured a Power Query query to output the data aggregated per client and per material.
The output is updated when you modify the tables and click the Refresh button on the Data tab.
Excel example (using Office 365)
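Since the question also mentions Python, here is a minimal sketch of the same join-and-aggregate logic. The tables in the question were screenshots, so the clients, products, materials and quantities below are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical data standing in for the two tables in the question.
sales = [  # (client, product, units sold)
    ("Client A", "Chair", 2),
    ("Client A", "Table", 1),
    ("Client B", "Chair", 5),
]
components = [  # (product, primary material, quantity per unit of product)
    ("Chair", "Wood", 3),
    ("Chair", "Screws", 8),
    ("Table", "Wood", 10),
    ("Table", "Screws", 12),
]

# Index the components by product so that one sale can fan out to several
# result rows (the "more than one result entry" that VLOOKUP cannot return).
by_product = defaultdict(list)
for product, material, qty in components:
    by_product[product].append((material, qty))

# Join each sale to its components and sum the material quantities.
totals = defaultdict(int)
for client, product, units in sales:
    for material, qty in by_product[product]:
        totals[(client, material)] += units * qty

for (client, material), amount in sorted(totals.items()):
    print(client, material, amount)
```

Keying the totals by (client, product, material) instead would give the per-product breakdown described for the result table.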

Querying single data points from the Excel Data Model / Power Query (Get & Transform Data)

I'm using an up-to-date version of Excel 2016 (via O365 E3 license) and using Power Query / Get & Transform Data. I can successfully create queries and load them to the page. I have also successfully created Power Pivot reports.
I would like to query single data points from the data loaded via Power Query. For instance, imagine a dataset called DivisionalRevenue with:
Date Division Revenue
2016-01-01 Alpha 1000
2016-01-02 Alpha 1500
2016-01-01 Beta 2000
2016-01-02 Beta 400
I could easily load that to an Excel workbook or include it in the data model and create a power pivot. However, Power Pivot doesn't always meet my requirements, particularly around how the data is displayed on the page. In order to achieve my goal I may want to be able to query individual data points.
I would like to have a cell on the page with a formula in it that I can use to query individual data points. If it was in a pivot table I could use something like:
=GETPIVOTDATA("Revenue",$A$3,"Date",DATE(2016,1,1),"Division","Alpha")
The lookup values (date and division) could be retrieved from a cell on the page or hard-coded into the formula. This is a requirement for several reports I'm working on.
Or, I could add a combined lookup column with Date and Division concatenated and use a vlookup to pull the values like:
=VLOOKUP("42371Alpha",I9:L13,4,FALSE)
Finally, I could use a combination of INDEX and MATCH to identify the correct row number and then pull the data.
All of these solutions require the data to be loaded onto a sheet. One requires a pivot table that has to be refreshed to work properly. The other two require creating arbitrary lookup columns so that you can match a row based on more than one field (date and division in this example), and you have to ensure that that lookup field's formula is properly extended down the length of the data table. In both cases I would have concerns when sharing this workbook with my colleagues in case someone affects the rather fragile setup of the pivot table or the lookup.
So, what I truly want to find is something equivalent to pivot table querying against a dataset.
** This doesn't exist, but I would like to know if something like it does **
=GETQUERYDATA("Revenue","DivisionalRevenue","Date",DATE(2016,1,1),"Division","Alpha")
Does such a thing exist? Can such a thing be done? Can I retrieve arbitrary data points from the dataset created through Power Query / Get & Transform Data?
I think that what you want are cube functions (CUBEVALUE, CUBEMEMBER and related functions):
Some background
How to easily create cube functions from a pivot table
There is a feature in Excel that allows you to query off of a PowerPivot model, but it's not highly advertised for some reason.
Once you have the data in your PowerPivot model, go to your Excel -> Data tab -> Existing Connections -> Tables tab
From there, choose the table that you want to start with. Once that table's data is on your excel sheet, you can actually right click that table -> go to "Table" -> "Edit DAX"
From there you can enter the following DAX function, as an example
EVALUATE
FILTER(SampleData, SampleData[Date]=DATE(2016,1,1) && SampleData[Division]="Alpha")
Make sure to choose Command Type=DAX in the drop-down.
To further improve your querying power, you can install the optional "DAX Studio" plugin for Excel, which allows you to write custom DAX queries and then export the results directly back to an Excel sheet.
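For comparison, the lookup the question asks for (one value identified by both date and division, with no helper column) is just a dictionary keyed on the pair of fields. The sketch below uses the DivisionalRevenue rows from the question; the function name get_query_data is hypothetical and only mirrors the imagined GETQUERYDATA formula.

```python
from datetime import date

# The DivisionalRevenue rows from the question.
divisional_revenue = [
    (date(2016, 1, 1), "Alpha", 1000),
    (date(2016, 1, 2), "Alpha", 1500),
    (date(2016, 1, 1), "Beta", 2000),
    (date(2016, 1, 2), "Beta", 400),
]

# Index once on the compound key (date, division).
revenue_index = {(d, division): revenue for d, division, revenue in divisional_revenue}


def get_query_data(index, when, division):
    """Return the revenue for one date and division; raises KeyError if absent."""
    return index[(when, division)]


alpha_first = get_query_data(revenue_index, date(2016, 1, 1), "Alpha")
```

The compound key does the job of the concatenated lookup column in the VLOOKUP variant, but it cannot fall out of step with the data because it is built from the rows themselves.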

Excel 2010: Automatically combine multiple tables into one dataset

I thought there would be a simple way of doing this, but unfortunately I have not come across one. My company has an Excel workbook with 12 sheets (1 for each month), into which I enter sales data as accounts are written. I reformatted each month's data into tables, thinking that this would provide an easy reference to gather the data into a pivot table that joins all the months and would be updated as I enter data; however, a pivot table based on multiple sets of data allows highly limited manipulation.
So what I want to do is create a new table that is automatically populated as I enter data in any of the 12 current tables, to combine them into a master listing. I have tried doing a query, but when I try to set up the data sources, it doesn't recognize my tables. I tried Power Query, but I couldn't get it to update the data as I updated the source. Consolidate also was not a useful feature, as it required all the data to be somehow calculated, and my columns need to simply be copied over, not summed or averaged.
As you can probably tell from my explanations and terminology, I'm no Excel expert. I don't know what VBA even is, let alone know how to use it, but I've seen it mentioned a lot, so I figure at some point in my life I should learn it.
Is there a formula or some other Excel 2010 feature that can automatically copy all of this data onto one running list, and keep it updating as I enter data in the source tables? It would have to run automatically.
I believe your end goal is to have a pivot table which consolidates data from each of the individual 12 sheets/tables and not really to have the intermediate "single running list which is an aggregation of all the 12 sheets".
If so, I suggest creating an Excel pivot table based directly on 'Multiple consolidation ranges'.
To start, create a new spreadsheet, select a cell (say A3) and press Alt+D, then P. This brings up the PivotTable and PivotChart Wizard; proceed using the third option, 'Multiple consolidation ranges'.
For detailed step-by-step instructions, I will have to refer you to this site: http://www.contextures.com/xlPivot08.html
Please be aware that the difficulty level of this solution is medium. I suggest you bookmark the instructions for maintainability reasons in case you choose to implement it.
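If the intermediate running list is wanted after all, the append step itself is simple: stack the monthly tables and tag each row with its month. The sketch below shows that logic in Python, outside Excel; the sheet names, column names and values are hypothetical, and it would have to be re-run (like a refresh) whenever the source data changes.

```python
# Hypothetical monthly tables: sheet name -> list of rows with identical columns.
monthly = {
    "Jan": [{"Account": "A-100", "Amount": 250.0}],
    "Feb": [
        {"Account": "A-101", "Amount": 400.0},
        {"Account": "A-102", "Amount": 75.5},
    ],
    "Mar": [],  # a month with no entries yet
}


def combine(monthly):
    """Stack every monthly table into one list, tagging each row with its month.

    Rows are copied as they are; nothing is summed or averaged.
    """
    master = []
    for month, rows in monthly.items():
        for row in rows:
            master.append({"Month": month, **row})
    return master


master = combine(monthly)
```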

Sort text-based information into different sheets

I am creating a tracking document for artists' accommodation as part of an arts festival and would like to automate part of my work flow. Whilst we use event management/scheduling software for confirmed bookings, it's nice to do all my working in Excel.
I would like to have a master sheet (sheet 1), with a full list of artists and their respective accommodation - that can then be sorted into individual sheets (sheet 2, 3 etc) based on the name of the accommodation. The automatic sorting would also capture the other pieces of information in the row.
This would allow for each different sheet to show a report on who is staying in each type of accommodation and would be rather handy!
I would recommend one or more PivotTables as a simpler solution. Here a PT and two clones are shown on your Master Sheet, but they could each be on their own sheet:
Accom is in Report Filter, Company is in Row Labels and PAX (as Sum) is in Σ Values. After clicking PivotTable in Insert > Tables and choosing your range ('Master Sheet'!$A$2:$C$7) and location, just drag the fields from the big box to the little ones.
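What the filtered pivot tables compute is a grouping of the master rows by accommodation, plus a sum of PAX per group. A minimal sketch of that logic in Python follows; the field names are taken from the description above, while the rows themselves are made up because the original data was a screenshot.

```python
from collections import defaultdict

# Hypothetical master list using the fields named above (Company, Accom, PAX).
master = [
    {"Company": "Troupe One", "Accom": "Hotel", "PAX": 4},
    {"Company": "Troupe Two", "Accom": "Hostel", "PAX": 6},
    {"Company": "Solo Act", "Accom": "Hotel", "PAX": 1},
]


def split_by(rows, key):
    """Group rows by the value of `key`, keeping every other field of each row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


reports = split_by(master, "Accom")  # one "sheet" of rows per accommodation
pax_per_accom = {
    accom: sum(row["PAX"] for row in rows) for accom, rows in reports.items()
}
```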
This is feasible using Excel, but I don't recommend it; it creates a maintenance nightmare in the long run.
From the question I can't gather whether the data is available in some kind of event management software package; if so you can use that one as a data source. Or create an Access or SQL database with a few tables. After that, you can use one of the following options to make the necessary overviews and as many more as you think up during the project:
Use Excel with ODBC or a web query to retrieve the data aggregated and sorted as you like. Make changes in the event management package, allowing others to see the same facts. Or do it in Access. When you change one thing, it automatically propagates into Excel as well.
Similarly, you can use an Excel add-in such as Invantive Control (caution: I work at a supplier) to retrieve the data from the database using SQL or a web service, change it from within Excel and then synchronize the changes back, assuming you have write access. A similar solution is available as SQL*XL. Probably there are others too.
If the solution must be Excel only, I would recommend using vertical/horizontal lookups with the Excel functions VLOOKUP / HLOOKUP (Dutch: vert.zoeken, horiz.zoeken). These functions perform reasonably with a small amount of data, and performance can be improved by sorting. They also resemble SQL joins, so the database you get within Excel conforms more easily to the relational model.
I hope the event is successful and the people enjoy it.

Getting mixed tabular & non-tabular data from Excel into Access

My Access programming is a little rusty, & I've never worked with Excel files all that much.
I have a requirement to bring data from Excel spreadsheets into Access 2007. These spreadsheets have a fixed (predictable) format, but it includes a "header area" where I need to read single data items from specific cells, followed by a mass of tabular data (~500 rows in the one sample I've seen so far). I will be processing all of this into a set of tables that are normalized quite differently from the flat layout of the spreadsheet.
I know how to open an ADO recordset on the tabular data, and it should work fairly well for my purposes. I also figure that I can reference the Excel object model and open the sheets through Automation to get the "header area" data items.
My question is this: since I have to (I think) use the Automation approach for the "header area", am I better off just leaving it open in this mode to move on to the tabular data (with cell/range navigation), or closing that mode & going over to ADO? I suspect it's the latter--and I'd be more comfortable with it--but I don't want to do the wrong thing just because it's more familiar.
Edit
It seems I wasn't clear that I need to build this capability into the "application", as something that a user can repeat down the line. I'm assured that I can trust the format of the spreadsheet (though I'll include error trapping for graceful failure if that turns out to be false). These spreadsheets are "official design documents" for hardware, and my app needs to handle bringing in new &/or updated ones to track the things that are described in the tabular data in ways that the flat Excel format doesn't allow for.
Of those two options, I would choose the second simply because I find it more convenient to work with an ADO recordset. It should be fairly simple if you can assign a named range to your spreadsheet's tabular data.
Edit: If your spreadsheet includes field names, the recordset approach would be less prone to break due to spreadsheet changes such as one or more new columns inserted before or between the existing columns or a re-ordering of the existing columns.
But actually, I think the TransferSpreadsheet Method might be more convenient. You can specify the spreadsheet range as a named range or by cell address as in this example from the linked page:
DoCmd.TransferSpreadsheet acImport, 3, _
"Employees","C:\Lotus\Newemps.wk3", True, "A1:G12"
Also, you can choose between importing the spreadsheet range directly into an Access table, or linking to the range as a "virtual" table ... whichever best meets your application's needs.
Edit2: Creating a link (acLink instead of acImport) with TransferSpreadsheet would allow you to execute SQL statements against the link table:
INSERT INTO DestinationTable (field1, field2, field3)
SELECT foo, bar, bat FROM LinkedTable;
If the header information is really complicated, this can simplify your coding work:
In the official design Excel file, create a hidden tab.
In that tab, make a one-row table (plus a row of field names) that links to all the header elements you're interested in (e.g. set row 1, column 1 to "Document#" and row 2, column 1 to =Sheet1!A1).
Then you can re-use the same VBA procedure to import both your tabular data and your header data.
I would do it all via Automation. Why have two separate processes where one will do? After you've read the header information reading the tabular information will be quite easy.
I inherited an application back in mid-2000 that was built to import Excel spreadsheets that were basically reporting output from MYOB (an accounting program). What had been done was to simply create a template table that had all the columns necessary to accommodate the report, using the text data type for all columns. Then the non-data rows were filtered out and processed into the eventual destination table.
It's not elegant, but it doesn't require a lot of programming, though the implementation I inherited used a dedicated temp table for each report layout that was being imported. You could easily replace all of those with a single table with 100 text columns of 255 characters (or memo fields, for that matter, if that was a requirement), and just re-use it.
I'm not sure if I'd recommend it or not, but it really is quite easy without requiring much in the way of code.
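The staging idea (load every cell as text, then pick the header items from fixed cells and filter out the non-data rows) can be sketched independently of Access. The Python below works on an in-memory grid rather than a real workbook; the cell positions, field names and values are all assumptions made for illustration.

```python
# Hypothetical sheet contents as a grid of text cells, one inner list per row.
# Assumed layout: header items in fixed cells, a row of column titles, data rows.
grid = [
    ["Document#", "HW-0042", ""],
    ["Revision", "C", ""],
    ["", "", ""],
    ["Part", "Qty", "Location"],
    ["Resistor", "10", "R1"],
    ["Capacitor", "4", "C3"],
    ["", "", ""],
]

HEADER_CELLS = {"document": (0, 1), "revision": (1, 1)}  # (row, column), assumed
TABLE_START = 3  # index of the row holding the column titles, assumed


def read_header(grid):
    """Pick the single header items out of their fixed cells."""
    return {name: grid[r][c] for name, (r, c) in HEADER_CELLS.items()}


def read_table(grid):
    """Return the data rows below the title row as dicts, skipping blank rows."""
    titles = grid[TABLE_START]
    records = []
    for row in grid[TABLE_START + 1:]:
        if not any(cell.strip() for cell in row):
            continue  # non-data (blank) row
        records.append(dict(zip(titles, row)))
    return records


header = read_header(grid)
records = read_table(grid)
```

Everything stays text at this stage; converting types and normalizing into the destination tables would be a separate step.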
