Classification and Grouping of Sorted Data

I have a dataset in H3:J12 where components are classified by Type. I have summed the count for identical components and sorted by Count using the UNIQUE and SORT formulae; the result is in L3:M7.
In my actual case there are several thousand such components, sorted as in L:L, and I would now like to add the Type column next to each Component with its sorted Count, as shown in P3:R12. Is it possible to extract this directly from L3:M7 or from H3:J12? I will not be able to do it manually.

Mechanical approaches include pivots, VB, etc. However, you could consider a more dynamic approach that doesn't require constant updating / refreshing (of code, pivots, etc.) whenever underlying data is appended/amended.
YES: you can retrieve directly from source data as follows:
=SORT(UNIQUE(C3:D20))
=SUMIFS(E3:E30,C3:C30,H3:H17,D3:D30,I3:I17)
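For readers who want to see the logic outside a worksheet, the two formulas above can be sketched in Python. This is only an illustration under assumptions: the sample rows, the column meanings (two key columns followed by a count) and the function name grouped_totals are hypothetical and not taken from the original sheet.

```python
from collections import defaultdict

# Hypothetical sample rows: (first key, second key, count).
rows = [
    ("Bolt", "M8", 4),
    ("Nut", "M8", 2),
    ("Bolt", "M8", 3),
    ("Bolt", "M6", 5),
    ("Nut", "M6", 1),
]

def grouped_totals(rows):
    """Sum the count for every distinct key pair, like SUMIFS with two criteria columns."""
    totals = defaultdict(int)
    for first, second, count in rows:
        totals[(first, second)] += count
    # Sorting the distinct pairs ascending mirrors SORT(UNIQUE(...)).
    return [(first, second, totals[(first, second)]) for first, second in sorted(totals)]

summary = grouped_totals(rows)
for line in summary:
    print(line)
```

As with the formulas, appending rows to the source changes the result the next time it is evaluated; nothing needs to be refreshed by hand.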
Notes:
This could equally be AVERAGEIF(S), COUNTIF(S), PERCENTILE, etc. Another significant advantage over the alternative methods (VB, pivots, etc.), besides its dynamic nature, is the flexibility regarding measures; pivots are substantially more restrictive than direct Excel calculation.
The disadvantage is that you cannot automatically chart the data with the pivot chart functionality that accompanies pivot tables.
See the opening line of the linked sheet for the conditional formatting sample used to recreate the 'look / feel' you might otherwise lose with this approach.
It seemed as though you were almost there! I'm not sure what relevance other tables have (ignored this in light of question).

Related

Excel pivot table with ranking

I'm in the process of creating a report for the company I work at, which has a rather complicated survey export file from which the data needs to be extracted in meaningful ways.
The table headers are as follows: https://docs.google.com/spreadsheets/d/1Et9Pg6k9CJA3HTO0aHcnSnOWVU05bmHYUsPS0wB2Nr8/edit?usp=sharing
It has respondents listing their top 3 most important options; the rest are left blank.
If anyone can help me figure out a way to potentially summarize this in a pivot table that would be great.

Your data is in a crosstab. Pivots don't like that kind of layout. You need to unpivot your data.
If you've got the PowerQuery add-in installed (or have Excel 2016 or Excel/Office 365 subscription) then you can use PowerQuery to do this. Google "PowerQuery" and "Unpivot" and you'll turn up a whole heap of videos.
Otherwise you can use VBA such as my Unpivot routine I've previously blogged about at http://dailydoseofexcel.com/archives/2013/11/21/unpivot-shootout/
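To make the idea of unpivoting concrete, here is a minimal Python sketch of what the operation does to a crosstab. It is not the PowerQuery or VBA routine mentioned above; the function name unpivot, the headers and the sample rows are hypothetical.

```python
def unpivot(headers, rows, id_column=0):
    """Turn a crosstab into (id, column_name, value) records, skipping blank cells."""
    records = []
    for row in rows:
        row_id = row[id_column]
        for index, value in enumerate(row):
            if index == id_column or value in ("", None):
                continue  # skip the id cell itself and any unanswered option
            records.append((row_id, headers[index], value))
    return records

# Hypothetical survey export: one row per respondent, one column per option.
headers = ["Respondent", "Price", "Quality", "Service", "Speed"]
rows = [
    ["R1", 1, 2, "", 3],
    ["R2", "", 1, 2, 3],
]

long_form = unpivot(headers, rows)
for record in long_form:
    print(record)
```

The resulting one-record-per-answer layout is the shape a pivot table can summarise directly.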

As always it depends what questions you want to ask in your analysis. Here are two suggestions.
What are the commonest first/second/third choices?
This assumes that the ranking is important, i.e. the first choice is ranked significantly higher than the second choice, so you want to analyse them separately.
You could add three extra columns to your data using this formula to convert the first choice to a single variable with 11 categories
=IFERROR(MATCH(COLUMNS($A:A),$A3:$K3,0),"")
in L3 and likewise with the second and third choices in M3 and N3.
In the event that a respondent (row) has fewer than three choices, it will give a blank for the second and/or third choice.
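The MATCH/IFERROR step can be sketched in Python as follows. The function name ranked_choices and the sample rows are hypothetical; the sketch assumes, as the formula does, that each row holds the ranks 1, 2 and 3 in the columns the respondent picked and blanks elsewhere.

```python
def ranked_choices(row, ranks=(1, 2, 3)):
    """Return the 1-based position of each rank within the row, or "" when it is absent."""
    result = []
    for rank in ranks:
        try:
            result.append(row.index(rank) + 1)  # MATCH returns a 1-based position
        except ValueError:
            result.append("")  # same role as IFERROR(..., "")
    return result

# Hypothetical rows with 11 option columns each.
full_row = ["", 2, "", "", 1, "", "", "", 3, "", ""]
short_row = ["", 1, "", "", "", "", "", 2, "", "", ""]

print(ranked_choices(full_row))
print(ranked_choices(short_row))
```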
What are the commonest choices regardless of ranking?
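This assumes that the ranking isn't so important - you just want to know which columns have been picked overall. The three choice columns can be stacked into a single column with
=INDEX($L$3:$N$10,INT((ROWS($1:1)-1)/3)+1,MOD(INT(ROWS($1:1)-1),3)+1)
entered in a spare column outside L:N (N3 itself already holds the third choice, so the formula would be circular there). It has to be filled down for 3N rows, where N is the number of rows in the original dataset.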
Then it would be a simple case of setting up pivot tables or charts for the four new variables.
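The INDEX arithmetic above walks the three choice columns row by row: item k (counting from zero) comes from row k div 3 and column k mod 3. A short Python sketch of that stacking, followed by a simple tally, is shown below; the function name stack_row_major and the sample block are hypothetical.

```python
from collections import Counter

def stack_row_major(block):
    """Flatten a rectangular block into one list, row by row, like the INDEX formula."""
    width = len(block[0])
    total = len(block) * width
    return [block[k // width][k % width] for k in range(total)]

# Hypothetical contents of the three helper columns for two respondents.
choices = [
    [5, 2, 9],
    [2, 8, ""],
]

stacked = stack_row_major(choices)
# Count how often each option was picked, ignoring blanks.
counts = Counter(value for value in stacked if value != "")
print(stacked)
print(counts.most_common())
```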

Excel: Create lookup based on list of criteria

I have a little bit of an Excel problem and would be happy about any suggestions.
Long version: I have a dataset with raw data representing journal entries. The structure of this dataset can be seen here:
Now, what I want to achieve is to assign each row (each journal entry) to a cost category (marketing, personnel, IT, depreciation, …) based on the values in the account number, type, and cost center columns and, in a second step, break down the categories once more, e.g. for labour costs, distinguish between direct and indirect labour costs.
The way my company does this right now is using an Excel sheet with several macros where the criteria are hardcoded in the VBA code to loop through the whole list, check if a row matches the criteria for a certain cost category, and if it does, copy the row to a new sheet (having one new sheet for each category), then using a second macro to break down the categories, assigning values to the “description”-column which is empty initially based on another set of criteria. Then, pivot tables are used on each of the new sub-datasets to calculate sums for each sub-category. These sums are finally used as input data for a management report (as seen in the image above) which is the ultimate goal of this whole ordeal.
Now, not only does this seem overly complicated to me, and running the macros and manually adjusting the input ranges for the pivot tables takes forever, but the criteria for allocating the costs can also change quite often, and opening the VBA editor to change the code is not really user-friendly.
The initial idea was to include some helper columns (one for each cost category) and somehow create an indicator variable that is one if the entry falls in the respective category and zero otherwise, and then use these columns for further calculations (e.g. for SUMIFS and such).
The problem is that a) combinations of account number and type are not unique, so that one account number can go along with various types, and one type can go along with various account numbers, so the criteria can be something like C6 = 544300 OR 544700 AND D6<>110246, etc. And b) criteria can change, meaning sometimes a new account number or type is added that also has to be assigned to an already existing category such as labor costs, which would make it necessary to include that criterion in all the formulas for that particular cost category. So, is it possible to somehow create a criteria table for each category that serves as input for some sort of IF/SUMIF or lookup function?
Short version: I have a data set (anywhere from 5,000 up to 100,000 rows, 8 columns) on which I want to perform a lookup based on various criteria. In addition, it would be nice if the criteria could be drawn from a separate list so that they can be modified fairly easily without having to change the formula itself. Is there a way to do so? Or do you think using the advanced filter might be the most suitable option?
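One way to picture the "criteria table" idea is the following Python sketch, in which the rules live in data rather than in code, so adding an account number means editing a list, not a formula or macro. Everything here is an assumption for illustration: the rule layout, the function name categorize, the account 610000, the types other than 110246 and the amounts are hypothetical; only the example criterion (account 544300 or 544700, type not 110246) comes from the question.

```python
from collections import defaultdict

# Hypothetical criteria table: one rule per category.
RULES = [
    {"category": "Labour", "accounts": {544300, 544700}, "excluded_types": {110246}},
    {"category": "Marketing", "accounts": {610000}, "excluded_types": set()},
]

def categorize(entry, rules):
    """Return the first category whose account list matches and whose exclusions do not."""
    for rule in rules:
        if entry["account"] in rule["accounts"] and entry["type"] not in rule["excluded_types"]:
            return rule["category"]
    return "Unassigned"

# Hypothetical journal entries.
entries = [
    {"account": 544300, "type": 110100, "amount": 100.0},
    {"account": 544700, "type": 110246, "amount": 50.0},
    {"account": 610000, "type": 110246, "amount": 25.0},
    {"account": 544700, "type": 110300, "amount": 40.0},
]

totals = defaultdict(float)
for entry in entries:
    totals[categorize(entry, RULES)] += entry["amount"]

print(dict(totals))
```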

PivotTable Nested Columns

I am interested in grouping columns together by a common factor. Below is a basic example:
I'm interested in nesting the vehicle brands under the categories Luxury and Frugal, making them collapsible like the rows can be. This is a basic example; the data I'm planning to work with has thousands of rows laid out like these and possibly 50-100 columns that could be collapsed into 5-10. Any guidance would be appreciated.

As has been suggested, adding Luxury/Frugal as a column in your source data would achieve what you require. However, that would require substantial rearrangement, make your source data occupy a much larger range, and still not give an ideal result: adding just the one column allows filtering by column in the PivotTable, but each filter view would still show all columns (though mostly blank).
So completely "flattening" the source data may be a better choice, if you are going to rearrange the source data at all:
The fields arranged like so:

How do I filter an Excel pivot table based on the combination of two OLAP dimensions?

I've been tasked with building some ad-hoc reports in Excel that are sourced from an SSAS OLAP cube. I don't have the ability to alter the design of the cube's dimensions currently. I've been receiving repeated requests to filter results based upon the combination of two different dimensions and their attributes.
For example:
One dimension lists locations with their hierarchies. Another dimension contains codes for the various insurance companies we work with. I'm given a list of combinations of these, concatenated with a hyphen separating them, and they are supposed to be the only combinations within the report. For example, I get things like "001-AB5". Unfortunately, there are duplicates of the codes, so I can't just pull the code, seeing that AB5 means different things for different locations, which I can't do anything about at this time either.
For some of the smaller data sets, I've used PowerPivot and just created a calculated column, and added a relationship to the list in another sheet. The issue is that now they want the drill-through actions that have been set up for the cube. Is it possible to create something like a calculated dimension in Excel (or some other means) that would be the concatenation of these without using PowerPivot?

I have recently been working on a data table in Excel containing measurements of fossil specimens. In addition to containing things like the specimen number, species name, etc., the table also contains measurements from the fossils in question. However, because several specimens have data from both the left and right sides of the specimen, I often end up with situations where a single entry spans multiple rows, which means I cannot sort the data.
I have looked elsewhere on the Internet for a solution, and the only response I have gotten is that Excel doesn't really work well with entries spanning multiple rows, and I should reorganize my data. I understand that, and I have been looking for an alternate way of organizing the data. However, I have not been able to find an easy way to organize the data. I have tried reorganizing the information so each entry spans multiple rows, but when I do this it becomes very easy to make mistakes and to lose track of the data. It also becomes difficult to compare the data, since the measurements on the left and right side of the specimen are essentially the same thing and I cannot easily compare them if one specimen has a bone only preserved on the right side and the other specimen has the same bone preserved but only on the left side.
I have also tried organizing the measurements into a separate sheet which could be accessed by a hyperlink from the main sheet, but this has also posed problems. Because in this case the measurement data still cannot be sorted by specimen number or species name, if a specimen number or species name changes (which it has in the past), I have to reorganize all the hyperlinks by hand.
Finally, I have also tried adding an identifier to the multi-row entries, but this has a tendency to get screwed up if I sort the data, and it also mixes up any equations I use in the sheet. I might be doing it wrong somehow.
The good news is I am not interested in sorting the specimens by measurements, so if there is any way to organize the table so it is sortable but the measurements cannot be sorted, that is fine. At the same time, because all specimens technically have a left and right side (plus the average measurement between them), I could also work with a system wherein each "entry" spanned a set number of rows or subrows.
I was also wondering if it would be possible to write a macro to sort the data (especially since I am just sorting by the first five columns or so), or else do the database in some other program like Microsoft Access. Any help would be greatly appreciated.

Everything you describe really breaks down to: "Excel is great for analysis; but I really should have stored my source data in a database." Accountants I have worked with almost always come to this conclusion eventually, once their data and reporting needs get sufficiently complex.
I suggest you invest the moderate effort to upload your data to a proper database, and learn how to download it as appropriate to Excel for specific analyses. The time and effort will be well spent, and simpler by far than coercing Excel into tasks for which it is ill-suited.
MS-Access, MySQL, and SQL Server Express are all suitable for this type of upgrade. MS-Access, if already available in your Office subscription, has the advantage of integrating even more easily with Excel than the other two, and also uses VBA as its macro language. The other two offer more complete and powerful implementations of the SQL language. All told, use the one most easily available to you.
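As a rough illustration of what the database layout might look like, here is a sketch using SQLite through Python's built-in sqlite3 module. SQLite is not one of the products named above; it stands in here only because it runs without any setup. The table names, column names and sample values are all hypothetical. Each specimen is a single sortable row, and left/right measurements sit in a separate table keyed by specimen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (
    specimen_id INTEGER PRIMARY KEY,
    specimen_number TEXT NOT NULL,
    species TEXT NOT NULL
);
CREATE TABLE measurement (
    specimen_id INTEGER NOT NULL REFERENCES specimen(specimen_id),
    bone TEXT NOT NULL,
    side TEXT NOT NULL CHECK (side IN ('left', 'right')),
    length_mm REAL NOT NULL
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO specimen VALUES (?, ?, ?)",
                 [(1, "S-002", "Species B"), (2, "S-001", "Species A")])
conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", [
    (1, "femur", "left", 101.0),
    (1, "femur", "right", 103.0),
    (2, "femur", "right", 88.0),
])

# One output row per specimen and bone: left, right and mean side by side,
# sorted by specimen number regardless of how the measurements were entered.
query = """
SELECT s.specimen_number, s.species, m.bone,
       MAX(CASE WHEN m.side = 'left' THEN m.length_mm END) AS left_mm,
       MAX(CASE WHEN m.side = 'right' THEN m.length_mm END) AS right_mm,
       AVG(m.length_mm) AS mean_mm
FROM specimen AS s
JOIN measurement AS m ON m.specimen_id = s.specimen_id
GROUP BY s.specimen_id, m.bone
ORDER BY s.specimen_number
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Because the measurements refer to the specimen by an internal id, renaming a specimen number or species only touches one row in the specimen table, and a bone preserved on only one side simply shows an empty value for the other.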
