Using Excel to GROUP BY and find date WHERE MAX - excel

My problem is that I have a table of data structured as below:
+---------------+------------+---------+
| recipe number | date | quality |
+---------------+------------+---------+
| 154 | 01/01/2020 | 2 |
| 154 | 01/03/2020 | 3 |
| 154 | 01/05/2020 | 1 |
| 154 | 01/07/2020 | 2 |
| 222 | 01/01/2020 | 3 |
| 222 | 01/03/2020 | 2 |
| 222 | 01/05/2020 | 2 |
| 222 | 01/07/2020 | 1 |
| 888 | 01/01/2020 | 1 |
| 888 | 01/03/2020 | 3 |
| 888 | 01/05/2020 | 2 |
| 888 | 01/07/2020 | 3 |
| 666 | 01/01/2020 | 2 |
| 666 | 01/03/2020 | 3 |
| 666 | 01/05/2020 | 3 |
| 666 | 01/07/2020 | 3 |
| 777 | 01/01/2020 | 1 |
| 777 | 01/03/2020 | 2 |
| 777 | 01/05/2020 | 3 |
| 777 | 01/07/2020 | 1 |
| 123 | 01/09/2020 | 3 |
| 254 | 01/01/2020 | 2 |
| 254 | 01/03/2020 | 3 |
| 745 | 01/01/2020 | 1 |
| 745 | 01/03/2020 | 3 |
| 745 | 01/05/2020 | 2 |
| 745 | 01/07/2020 | 3 |
| 578 | 01/11/2020 | 3 |
| 578 | 01/01/2021 | 3 |
| 578 | 01/03/2021 | 1 |
| 578 | 01/05/2021 | 3 |
| 678 | 01/07/2021 | 2 |
| 999 | 01/09/2021 | 1 |
| 999 | 01/11/2021 | 1 |
+---------------+------------+---------+
What I ultimately need is a table listing each recipe number with a simple yes/no for whether that recipe decreased in quality over time, at all.
Some recipes have only one entry, and others only increased in quality; these need to be answered "no".
E.g.:
+--------+------------+
| recipe | decreased? |
+--------+------------+
| 154 | yes |
| 666 | no |
+--------+------------+
Unfortunately, I'm limited to only Excel for this, though I understand doing it in other environments is probably easier.
I have tried a max(index+match) to see if I can return the highest quality for each recipe (and the lowest with a min). But I got stuck at trying to get Excel to return an array of qualities conditional on which recipe to look at.
I also tried PowerQuery but the problem seems too complex for that utility.
I've done some more thinking, and some pseudocode that would solve it is:
For each recipe number:
1) Find the max quality and the date where it happened
2) Find any quality lower than this number where the date it happened is after step 1
3) If the date of step 1 is earlier than (less than) the date of step 2, output "yes", otherwise "no"
Translating that into Excel 2016 is a bit difficult.

Edit: Thanks to @TomSharpe, who pointed out some logic errors that produced incorrect results, I have had to modify the M code and remove the formula method. I believe the code below satisfies your requirements.
This can be done in Power Query
Group by Recipe
Determine the first date of the highest quality
See if any dates after the first date have a lower quality
If so then "yes" else "no"
Examine the comments and step through the Applied Steps window to better understand the algorithm.
M Code
let
    //read in the data
    //change table name in next line to actual table name
    Source = Excel.CurrentWorkbook(){[Name="Table18"]}[Content],
    //set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"recipe number", Int64.Type}, {"date", type date}, {"quality", Int64.Type}}),
    //group by recipe number
    #"Grouped Rows" = Table.Group(#"Changed Type", {"recipe number"}, {
        //(t) is each subtable returned
        {"Quality Decrease", (t)=>
            let
                //List.Max(quality) => highest quality rating
                //Filter the subtable to only show qualities of that value
                //then, with List.Min(....[date]) return the earliest date
                firstBest= List.Min(Table.SelectRows(t, each [quality] = List.Max(t[quality]))[date]),
                //Filter the subtable to only show dates > firstBest date and with quality worse than the highest quality rating
                decrQual = Table.SelectRows(t, each [date] > firstBest and [quality] < List.Max(t[quality]))
            in
                //check if resultant table is empty
                if Table.IsEmpty(decrQual) then "no" else "yes", Text.Type
        }
    })
in
    #"Grouped Rows"
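If it helps to check the logic outside Excel, here is a rough Python sketch of the same algorithm (group by recipe, find the first date of the best quality, look for any later row with a lower quality). It is only an illustration, not a replacement for the M code. The function name `quality_decreased` is made up for this sketch, and I am assuming the dates in the question are day/month/year.

```python
from datetime import date

def quality_decreased(rows):
    """rows: list of (recipe, date, quality). Returns {recipe: "yes" or "no"}."""
    # Group the rows by recipe number
    by_recipe = {}
    for recipe, when, quality in rows:
        by_recipe.setdefault(recipe, []).append((when, quality))

    result = {}
    for recipe, entries in by_recipe.items():
        best = max(q for _, q in entries)
        # Earliest date on which the best quality was reached
        first_best = min(d for d, q in entries if q == best)
        # Any later row with a quality below the best counts as a decrease
        worse_later = any(d > first_best and q < best for d, q in entries)
        result[recipe] = "yes" if worse_later else "no"
    return result

# A few recipes from the question (dates assumed to be dd/mm/yyyy)
rows = [
    (154, date(2020, 1, 1), 2), (154, date(2020, 3, 1), 3),
    (154, date(2020, 5, 1), 1), (154, date(2020, 7, 1), 2),
    (666, date(2020, 1, 1), 2), (666, date(2020, 3, 1), 3),
    (666, date(2020, 5, 1), 3), (666, date(2020, 7, 1), 3),
    (123, date(2020, 9, 1), 3),
]
print(quality_decreased(rows))
```

Recipe 123 has a single entry, so nothing can come after its best date and it ends up as "no", which matches the requirement in the question.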

Related

Show text as value Power Pivot using DAX formula

Is there a way by using a DAX measure to create the column which contain text values instead of the numeric sum/count that it will automatically give?
In the example below the first name will appear as a value (in the first table) instead of their name as in the second.
Data table:
+----+------------+------------+---------------+-------+-------+
| id | first_name | last_name | currency | Sales | Stock |
+----+------------+------------+---------------+-------+-------+
| 1 | Giovanna | Christon | Peso | 10 | 12 |
| 2 | Roderich | MacMorland | Peso | 8 | 10 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 4 | 6 |
| 1 | Giovanna | Christon | Peso | 11 | 13 |
| 2 | Roderich | MacMorland | Peso | 9 | 11 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 5 | 7 |
| 1 | Giovanna | Christon | Peso | 15 | 17 |
| 2 | Roderich | MacMorland | Peso | 10 | 12 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 6 | 8 |
| 1 | Giovanna | Christon | Peso | 17 | 19 |
| 2 | Roderich | MacMorland | Peso | 11 | 13 |
| 3 | Bond | Arkcoll | Yuan Renminbi | 7 | 9 |
+----+------------+------------+---------------+-------+-------+
No DAX needed. Put the first_name field on Rows rather than on Values, and select Tabular View for the Report Layout.
After some searching I found 4 ways.
measure 1 (will return blank if values differ):
=IF(COUNTROWS(VALUES(Table1[first_name])) > 1, BLANK(), VALUES(Table1[first_name]))
measure 2 (will return blank if values differ):
=CALCULATE(
    VALUES(Table1[first_name]),
    FILTER(Table1,
        COUNTROWS(VALUES(Table1[first_name]))=1))
measure 3 (will show every single text value), thanks @Rory:
=CONCATENATEX(Table1,[first_name]," ")
measure 4 (for very large datasets this variant seems to work better):
=CALCULATE(CONCATENATEX(VALUES(Table1[first_name]),Table1[first_name]," "))
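To make the difference between these measures easier to see, here is a small Python sketch of what each one computes for the rows behind a single pivot cell. This is only an analogy, not DAX. The function names are invented for the sketch, and the first-seen ordering of the distinct variant is my own choice for illustration.

```python
def single_value_or_blank(names):
    """Like measures 1 and 2: return the name if all rows agree, else blank (None)."""
    distinct = set(names)
    return next(iter(distinct)) if len(distinct) == 1 else None

def concat_all_rows(names, sep=" "):
    """Like CONCATENATEX over the table: one entry per row, duplicates included."""
    return sep.join(names)

def concat_distinct(names, sep=" "):
    """Like CONCATENATEX over VALUES(...): one entry per distinct name."""
    # dict.fromkeys drops duplicates while keeping first-seen order
    return sep.join(dict.fromkeys(names))

cell_rows = ["Giovanna", "Giovanna", "Giovanna", "Giovanna"]
print(single_value_or_blank(cell_rows))  # Giovanna
print(concat_all_rows(cell_rows))        # Giovanna Giovanna Giovanna Giovanna
print(concat_distinct(cell_rows))        # Giovanna
```

So when the source table holds several rows per person, as the data table above does, the VALUES-based variant shows each name once, while the plain row-by-row concatenation repeats it.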

Spotfire - Identify each time a value changes in a particular pattern for a particular type

Apologies for the bad title; I'm struggling to describe exactly what I'm trying to do (which has also made it difficult to search for an existing answer).
I have a series of data with the columns "Asset", "Time", and "State", where "State" is a calculated column that changes based on several other values. The data is sourced from a constant, irregular stream (though in the table below I have created a sample with regular timestamps).
The below table shows the source data ("Asset", "Time" and "State"), as well as an intended calculated column "Event" that tracks each time an event starts. I do not simply want to count the number of times an "Asset" has a Bad "State", but identify each time an "Asset" changes from a "State" of Good to a state of Bad (note in the real data there is a large number of different states, but the pattern I'm trying to identify is consistent).
+-------+----------+-------+-------+
| Asset | Time | State | Event |
+-------+----------+-------+-------+
| 1 | 12:00:00 | Good | 0 |
| 2 | 12:00:00 | Good | 0 |
| 1 | 12:00:01 | Good | 0 |
| 2 | 12:00:01 | Good | 0 |
| 1 | 12:00:02 | Bad | 1 |
| 2 | 12:00:02 | Good | 0 |
| 2 | 12:00:03 | Good | 0 |
| 2 | 12:00:03 | Good | 0 |
| 1 | 12:00:04 | Bad | 0 |
| 1 | 12:00:04 | Good | 0 |
| 2 | 12:00:05 | Good | 0 |
| 2 | 12:00:05 | Bad | 1 |
| 2 | 12:00:06 | Bad | 0 |
| 1 | 12:00:06 | Good | 0 |
| 2 | 12:00:07 | Bad | 0 |
| 2 | 12:00:07 | Good | 0 |
| 2 | 12:00:08 | Good | 0 |
| 1 | 12:00:08 | Bad | 1 |
| 2 | 12:00:09 | Good | 0 |
| 1 | 12:00:09 | Bad | 0 |
| 2 | 12:00:10 | Good | 0 |
| 1 | 12:00:10 | Good | 0 |
+-------+----------+-------+-------+
I intend to create a chart for this data to show how many times an event occurs for a particular asset per day. I figured the easiest way to do that is to have a column that counts a 1 when the event starts, and then sum this column over the dates to create the visualisation.
My idea was for a calculated column that picks up when the "State" is bad, and then checks to see if the previous state for that "Asset" was good, and set as a 1 if so. I have so far been unable to find a way to identify what the value was for a previous row entry for a particular "Asset".
Note that there are roughly 100 (or more) individual "Asset" entries, so creating an individual calculated column to track each would not be feasible. I am also using Spotfire 7.1.
Thanks in advance, and sorry again if the way I've written this is confusing.
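To pin down the rule the "Event" column is meant to follow, here is a short Python sketch. It is not a Spotfire expression, just the logic spelled out: remember the last state seen per asset, and flag a row when the state is Bad and that asset's previous state was Good. The function name `mark_event_starts` is made up, and the sketch assumes the rows are already in time order.

```python
def mark_event_starts(rows):
    """rows: iterable of (asset, time, state), assumed sorted by time.
    Returns a list of 0/1 flags, 1 where an asset goes from Good to Bad."""
    last_state = {}  # most recent state seen for each asset
    flags = []
    for asset, _time, state in rows:
        started = state == "Bad" and last_state.get(asset) == "Good"
        flags.append(1 if started else 0)
        last_state[asset] = state
    return flags

# A few rows taken from the sample table
rows = [
    (1, "12:00:00", "Good"),
    (2, "12:00:00", "Good"),
    (1, "12:00:01", "Good"),
    (2, "12:00:01", "Good"),
    (1, "12:00:02", "Bad"),
    (2, "12:00:02", "Good"),
    (1, "12:00:04", "Bad"),
    (1, "12:00:04", "Good"),
]
print(mark_event_starts(rows))  # [0, 0, 0, 0, 1, 0, 0, 0]
```

Because the lookup is keyed on the asset, the same pass covers any number of assets without a separate column for each one.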

Excel to return list of items with specific repetition

I am trying to create a list of names that repeat a specific number of times, based on another variable. Basically, if I have the following:
Column A Column B
Amy 5
John 2
Carl 3
the result would be:
Amy
Amy
Amy
Amy
Amy
Carl
Carl
Carl
John
John
I have built the initial list using the Index-Small-Countif method to get an alphabetical, distinct list, and then another formula to determine how many times each item repeats. I know I need to use some sort of index/offset with reference to rows, but just can't quite get it to work out.
The list is dynamic and changes daily, so manually retyping the list each day would result in too much human error and time (list is about 50 distinct items, with total number of rows at the end being around 400). Ultimately, the list will be used for a number of sumproduct/vlookups.
I can do this fairly quickly in VBA, but the users of this document don't always trust VBA and trying to get them to Enable Macros each time is not something that is going to work.
Thank you very much for any help you can offer!
Based on your table:
+---+------+---+
| | A | B |
+---+------+---+
| 1 | Amy | 5 |
| 2 | John | 2 |
| 3 | Carl | 3 |
+---+------+---+
In column C, put a "0" in C4, then enter the formula =B1+C2 in C1 and copy it down to C3 (the row just before the 0):
+---+------+---+----+
| | A | B | C |
+---+------+---+----+
| 1 | Amy | 5 | 10 |
| 2 | John | 2 | 5 |
| 3 | Carl | 3 | 3 |
| 4 | | | 0 |
+---+------+---+----+
Now we have an upper bound of the row that each value should be placed on which we can use in a Match() formula which will feed an Index() formula.
In a new column (I'm using E), enter in E1: =INDEX($A$1:$A$3,MATCH(ROW(),$C$1:$C$3,-1),1) and copy down:
+----+------+---+----+--+------+
| | A | B | C | D | E |
+----+------+---+----+--+------+
| 1 | Amy | 5 | 10 | | Carl |
| 2 | John | 2 | 5 | | Carl |
| 3 | Carl | 3 | 3 | | Carl |
| 4 | | | 0 | | John |
| 5 | | | | | John |
| 6 | | | | | Amy |
| 7 | | | | | Amy |
| 8 | | | | | Amy |
| 9 | | | | | Amy |
| 10 | | | | | Amy |
+----+------+---+----+--+------+
The list is backwards because of the reversed running total ending in 0 that we built in Column C. That descending order is what makes the last Match() parameter of -1 (smallest value greater than or equal to the lookup value) work correctly.
I would imagine with some tweaking this could be done a little cleaner, but this should get you in the ballpark.
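For anyone who wants to see why the reversed running total plus Match(..., -1) works, here is a small Python sketch of the same two steps. The helper names are invented for illustration, and `match_descending` is only my approximation of how MATCH behaves with a match type of -1 on a descending list.

```python
def reverse_running_totals(counts):
    """Emulates column C: each entry is its own count plus everything below it."""
    totals = []
    running = 0
    for count in reversed(counts):
        running += count
        totals.append(running)
    return list(reversed(totals))

def match_descending(lookup, values):
    """Rough stand-in for MATCH(lookup, values, -1) on a descending list:
    1-based position of the smallest value that is >= lookup, or None."""
    position = None
    for i, value in enumerate(values, start=1):
        if value >= lookup:
            position = i
        else:
            break
    return position

def expand(names, counts):
    """Emulates column E: one output row per spreadsheet row number."""
    totals = reverse_running_totals(counts)
    return [names[match_descending(row, totals) - 1]
            for row in range(1, totals[0] + 1)]

print(reverse_running_totals([5, 2, 3]))            # [10, 5, 3]
print(expand(["Amy", "John", "Carl"], [5, 2, 3]))
```

The output lists Carl three times, then John twice, then Amy five times, the same backwards order as column E above.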
Although I would still be a big proponent of finding users who are capable of enabling macros. Ugh.

How to compose sales table for collections of items that are sold separately?

I want to compose a sales table for purchased and sold items to see the total profit. It's easy to do when items are purchased and sold individually or as a lot. But how should I handle the situation where one buys a collection of items and sells them one by one? For example, I buy a collection (C) of a hammer and a screwdriver and sell the tools separately. If I enter the data into a simple table as in the image, I get the wrong profit.
When there are only two items, I could split their purchase price arbitrarily, but when there are many items and not all of them are sold yet, I can't easily see whether the collection has already made a profit.
I expect the correct profit as output. In this case the collection cost 10 and the selling prices of all the collection's items total 13, so it should show a profit of 3, not -7. I was thinking of adding 2 new columns, like IsCollection and CollectionID, and then deriving a formula that either uses simple subtraction or checks the price of the whole collection and subtracts it from the sum of the items that belong to that collection. Deriving such a formula is another question... But maybe there is an easier way of accomplishing the same thing.
I added a COLLECTION column to identify the items that belong to a collection.
Then I used SUMIF to sum the sell prices of the items that belong to the same collection.
Then I used IF in the Profit column to choose between the summed sell price and the single sell price.
You need to specify a cell range in some of the formulas (see below).
Problem: you can't add up the Profit values to obtain the total profit.
I used OpenOffice Calc (it should be almost the same in Excel, except that Excel typically separates arguments with commas rather than semicolons).
Content of
SUM_COLL (row2):
=SUMIF($A$1:$A$22;"="&A2;$D$1:$D$22)
SUM_COLL (row3):
=SUMIF($A$1:$A$22;"="&A3;$D$1:$D$22)
and so on.
Profit (row2):
=IF(A2<>"";E2-C2;D2-C2)
Profit (row3):
=IF(A3<>"";E3-C3;D3-C3)
+------------+-----------+-------------+------------+----------+--------+
| COLLECTION | Item name | Purch Price | Sell Price | SUM_COLL | Profit |
+------------+-----------+-------------+------------+----------+--------+
| | A | 1 | 1.5 | 0 | 0.5 |
+------------+-----------+-------------+------------+----------+--------+
| | B | 2 | 2.1 | 0 | 0.1 |
+------------+-----------+-------------+------------+----------+--------+
| C | C1 | 10 | 7 | 27 | 17 |
+------------+-----------+-------------+------------+----------+--------+
| C | C2 | 10 | 6 | 27 | 17 |
+------------+-----------+-------------+------------+----------+--------+
| D | D1 | 7 | 15 | 23 | 16 |
+------------+-----------+-------------+------------+----------+--------+
| | E | 8 | 12 | 0 | 4 |
+------------+-----------+-------------+------------+----------+--------+
| C | C3 | 10 | 14 | 27 | 17 |
+------------+-----------+-------------+------------+----------+--------+
| D | D2 | 7 | 8 | 23 | 16 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
Update:
I added two more columns to make Profit summable:
COUNT_COLL (row2):
=COUNTIF($A$1:$A$22;"="&A2)
COUNT_COLL (row3):
=COUNTIF($A$1:$A$22;"="&A3)
Profit_SUMMABLE (row2):
=IF(A2<>"";(E2-C2)/G2;D2-C2)
Profit_SUMMABLE (row3):
=IF(A3<>"";(E3-C3)/G3;D3-C3)
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| COLLECTION | Item name | Purch Price | Sell Price | SUM_COLL | Profit | COUNT_COLL | Profit_SUMMABLE |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | A | 1 | 1.5 | 0 | 0.5 | 0 | 0.5 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | B | 2 | 2.1 | 0 | 0.1 | 0 | 0.1 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| C | C1 | 10 | 7 | 27 | 17 | 3 | 5.6666666667 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| C | C2 | 10 | 6 | 27 | 17 | 3 | 5.6666666667 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| D | D1 | 7 | 15 | 23 | 16 | 2 | 8 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | E | 8 | 12 | 0 | 4 | 0 | 4 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| C | C3 | 10 | 14 | 27 | 17 | 3 | 5.6666666667 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| D | D2 | 7 | 8 | 23 | 16 | 2 | 8 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | | | | 0 | 0 | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | | | | 0 | 0 | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | | | | 0 | 0 | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
...
...
| TOTAL | | | | | 87.6 | | 37.6 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
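As a cross-check of the Profit_SUMMABLE idea, here is a rough Python sketch of the same calculation. It mirrors the SUMIF / COUNTIF / IF formulas above rather than replacing them. The function name is made up, and it assumes, as in the table, that every row of a collection repeats the purchase price of the whole collection.

```python
def summable_profits(rows):
    """rows: list of (collection, item, purchase_price, sell_price).
    collection is "" for items bought on their own. For collection items,
    purchase_price is assumed to be the price of the WHOLE collection."""
    sell_total = {}  # like SUM_COLL: summed sell price per collection
    count = {}       # like COUNT_COLL: number of rows per collection
    for coll, _item, _purchase, sell in rows:
        if coll:
            sell_total[coll] = sell_total.get(coll, 0) + sell
            count[coll] = count.get(coll, 0) + 1

    profits = []
    for coll, _item, purchase, sell in rows:
        if coll:
            # Spread the collection's profit evenly over its rows
            profits.append((sell_total[coll] - purchase) / count[coll])
        else:
            profits.append(sell - purchase)
    return profits

rows = [
    ("", "A", 1, 1.5), ("", "B", 2, 2.1),
    ("C", "C1", 10, 7), ("C", "C2", 10, 6),
    ("D", "D1", 7, 15), ("", "E", 8, 12),
    ("C", "C3", 10, 14), ("D", "D2", 7, 8),
]
print(round(sum(summable_profits(rows)), 2))  # 37.6
```

The per-row values are only an even split of each collection's profit, so they are meaningful as a total rather than as the profit of any single item.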

Adding Columns to Excel As List From Other Sheet Grows

Background
I'm creating a grade book in Excel for my wife. I have sheets for the overall grade, classwork, exams, and participation.
The three sections of work (classwork, exams, and participation) each have a variable number of items, and each item has a different number of points possible. Each section has a weight in the overall grade.
I have this up and running with a fixed number of items per section, but I'd like to create a template that can be updated from class to class and year to year.
Here's the problem:
On the classwork sheet, I'd like to be able to enter new assignments and their point value and have that automatically update the master grade sheet on my first sheet tab. Is there any way to add columns in a section of one worksheet (the master grade sheet) when new rows are added to another worksheet (the list of assignments)?
It is possible to achieve this without using VBA. The reason you will have difficulty achieving this, however, is that you've violated normal form in the table you've already built. It appears the pertinent data you're looking for is each student's score on each assignment. If this is correct, the level of granularity you will want is on the Assignment, not on the Student.
There are some fairly quick ways to modify your existing work to account for this. I've written out some sample data below. Take a look and see if it helps.
Sample Original Table
+---------+------+------------+------------+
| Student | Quiz | Thumbnails | Watercolor |
+---------+------+------------+------------+
| Paul | 3 | 10 | 90 |
| Frank | 4 | 10 | 95 |
| Mary | 5 | 10 | 70 |
| Ellen | | 10 | 85 |
| Sue | 6 | 10 | 92 |
| Anton | 5 | 10 | 87 |
+---------+------+------------+------------+
Image of the data is below ( note I have highlighted the blank value ).
Sample Normal Table
+---------+-------------+-----------+-------+
| Student | Assignment | New_Score | Score |
+---------+-------------+-----------+-------+
| Paul | Quiz | | 3 |
| Frank | Quiz | | 4 |
| Mary | Quiz | | 5 |
| Ellen | Quiz | | 0 |
| Sue | Quiz | | 6 |
| Anton | Quiz | | 5 |
| Paul | Thumbnails | | 10 |
| Frank | Thumbnails | | 10 |
| Mary | Thumbnails | | 10 |
| Ellen | Thumbnails | | 10 |
| Sue | Thumbnails | | 10 |
| Anton | Thumbnails | | 10 |
| Paul | Watercolor | | 90 |
| Frank | Watercolor | | 95 |
| Mary | Watercolor | | 70 |
| Ellen | Watercolor | | 85 |
| Sue | Watercolor | | 92 |
| Anton | Watercolor | | 87 |
| Mary | ExtraCredit | 10 | 10 |
| Ellen | ExtraCredit | 8 | 8 |
| Sue | ExtraCredit | 9 | 9 |
| Anton | ExtraCredit | 10 | 10 |
+---------+-------------+-----------+-------+
The Score column reaches back to your old table and grabs the score you've already entered for the students, so you won't have to do this all manually. The formula for this is =INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0)).
This assumes you've formatted the old data into an Excel DataTable ( ctrl+t ) and named it non_normal ( alt+j+t+i ). Note the unsubmitted assignment for Ellen comes through with a score of zero using this method. I've added a column named New_Score so that you are able to add new student-assignment submission combinations to the table without having to modify your old non_normal table ( which was the trouble in the OP ). With this column added, the formula in the Score column can be changed to =IF(NOT(ISBLANK([@[New_Score]])),[@[New_Score]],INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0))) which will take the New_Score value if available and the original score if not.
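In case the nested formula is hard to read, here is the same decision written as a tiny Python sketch. It is only an analogy: the dictionary `wide` stands in for the non_normal table, and the function name `score` is invented for the example.

```python
def score(student, assignment, new_score, wide):
    """wide: {student: {assignment: score or None}}, standing in for non_normal.
    Prefer New_Score when present, otherwise fall back to the old table."""
    if new_score is not None:
        return new_score
    value = wide[student][assignment]
    # A blank cell in the old table comes through as zero
    return 0 if value is None else value

wide = {
    "Paul": {"Quiz": 3, "Thumbnails": 10, "Watercolor": 90},
    "Ellen": {"Quiz": None, "Thumbnails": 10, "Watercolor": 85},
    "Mary": {"Quiz": 5, "Thumbnails": 10, "Watercolor": 70},
}
print(score("Paul", "Quiz", None, wide))        # 3
print(score("Ellen", "Quiz", None, wide))       # 0
print(score("Mary", "ExtraCredit", 10, wide))   # 10
```

An assignment that exists in neither place raises a KeyError in this sketch, which plays the role of the lookup error you would see in the sheet.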
The orange cells are new student-assignment submission combinations. Note you do not need to add a row for every student, just add a row whenever a student submits an assignment.
Sample Assignments Table
+-------------+-----------------+
| Assignment | Points_Possible |
+-------------+-----------------+
| Quiz | 6 |
| Thumbnails | 10 |
| Watercolor | 100 |
| ExtraCredit | |
+-------------+-----------------+
I've added the ExtraCredit assignment with a possible max score of zero/blank ( since not completing extra credit shouldn't count against a student )
Payoff - Back to the Original Table
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Sum of Score | Column Labels | | | | | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Row Labels | Quiz | Thumbnails | Watercolor | ExtraCredit | Grand Total | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Anton | 5 | 10 | 87 | 10 | 112 | 96.6% |
| Ellen | 0 | 10 | 85 | 8 | 103 | 88.8% |
| Frank | 4 | 10 | 95 | | 109 | 94.0% |
| Mary | 5 | 10 | 70 | 10 | 95 | 81.9% |
| Paul | 3 | 10 | 90 | | 103 | 88.8% |
| Sue | 6 | 10 | 92 | 9 | 117 | 100.9% |
+--------------+---------------+------------+------------+-------------+-------------+--------+
You then pivot your newly normalized data into a Pivot Table ( alt+n+v ). Now, simply adding a new assignment to the normal_assignment DataTable will cause that assignment to appear in a new column when you refresh the Pivot Table ( alt+a+r+a ).
The % score on the right of the Pivot Table is calculated using the following formula ( with the sample Pivot Table starting in cell $M$2 ): =GETPIVOTDATA("Score",$M$2,"Student",M4)/SUM(assignment[Points_Possible])
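The percentage is just each student's total score divided by the total points possible. Here is a hedged Python sketch of that arithmetic using two students from the sample data. The function name is made up, and treating the blank ExtraCredit maximum as zero is my reading of the note above.

```python
def percent_scores(long_rows, points_possible):
    """long_rows: list of (student, assignment, score) from the normalized table.
    points_possible: {assignment: max points, or None when blank}."""
    # Blank maximums (e.g. ExtraCredit) add nothing to the denominator
    total_possible = sum(p for p in points_possible.values() if p)
    totals = {}
    for student, _assignment, value in long_rows:
        totals[student] = totals.get(student, 0) + value
    return {student: total / total_possible for student, total in totals.items()}

points_possible = {"Quiz": 6, "Thumbnails": 10, "Watercolor": 100, "ExtraCredit": None}
long_rows = [
    ("Anton", "Quiz", 5), ("Anton", "Thumbnails", 10),
    ("Anton", "Watercolor", 87), ("Anton", "ExtraCredit", 10),
    ("Sue", "Quiz", 6), ("Sue", "Thumbnails", 10),
    ("Sue", "Watercolor", 92), ("Sue", "ExtraCredit", 9),
]
for student, share in percent_scores(long_rows, points_possible).items():
    print(student, f"{share:.1%}")
```

This prints 96.6% for Anton and 100.9% for Sue, matching the Pivot Table. Extra credit can push a student over 100% because it adds to the score but not to the points possible.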
I've uploaded the raw sample file for this to my public repo if you'd like to pull it and take a peek at the source. Credit to sensefulsolutions for text-to-table conversion.
Hope this is what you need!
