Grouping a list of items in to as equal in numbers as possible - excel

Requirement: Split a list into 4 separate groups, based on a value for each row.
| Player | Skill |
| ------------- |:-------------:|
| Player 1 | 10000 |
| Player 2 | 50000 |
| Player 3 | 2000 |
| Player 4 | 11000 |
| Player 5 | 7525 |
| Player 6 | 100 |
| Player 7 | 999 |
| Player 8 | 14579 |
| Player 9 | 26700 |
So in the example above, these players would be split into 4 groups:
| Group | # of players |
| ------------- |:-------------:|
| Group1 | 2 |
| Group2 | 2 |
| Group3 | 2 |
| Group4 | 3 |
The number of players in each group needs to be as close as possible; however, each group's total Skill also needs to be roughly similar.
Before I go too far down the rabbit hole (wording a question like this in a simple Google search is not turning out very well): are there any built-in Excel functions that can be leveraged to achieve this, or possible approaches in VBA that can be explored to achieve the required result?

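This isn't an answer! But suppose you try a simple algorithm:
Calculate the average skill level per group (ASL): the total skill of all 9 players divided by the 4 groups.
Set TSG (total skill for group) to zero.
Loop: Take the largest skill level (LSL) among the remaining players not yet tried for this group.
If TSG+LSL>ASL and the group already has a player
Skip that player; once no remaining player fits, go to the next group and reset TSG to zero
Else
Add LSL to the total skill (TSG) for this group
Remove the player from the list
Repeat the loop until no players remain.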
If you apply this by hand to your data you should get:
Average=30725.75
+---------+---------+---------+---------+
| Group 1 | Group 2 | Group 3 | Group 4 |
+---------+---------+---------+---------+
| 50000 | 26700 | 14579 | 10000 |
| | 2000 | 11000 | 7525 |
| | 999 | | |
| | 100 | | |
| | | | |
| 50000 | 29799 | 25579 | 17525 |
+---------+---------+---------+---------+
Clearly there are a couple of issues - you might not want a single group containing only player with highest skill level. Also you might want to re-average the remaining players after taking out the most skilful player. Should be a starting point though - could be implemented fairly easily with formulas or VBA.
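As a rough illustration, the hand-worked table above can be reproduced with a short Python sketch. It is not Excel or VBA, the function name `greedy_groups` is made up for this example, and the handling of leftover players (dropping them into the currently weakest group) is an assumption that the answer does not cover.

```python
def greedy_groups(skills, n_groups):
    """Fill one group at a time, largest skill first, up to the per-group average."""
    target = sum(skills) / n_groups          # ASL in the answer: total skill / number of groups
    remaining = sorted(skills, reverse=True)
    groups = []
    for _ in range(n_groups):
        group, total, leftover = [], 0, []
        for skill in remaining:
            # The first player of a group is always accepted, even above the target.
            if not group or total + skill <= target:
                group.append(skill)
                total += skill
            else:
                leftover.append(skill)
        groups.append(group)
        remaining = leftover
    # Assumption (not in the answer): anyone still unplaced joins the weakest group.
    for skill in remaining:
        min(groups, key=sum).append(skill)
    return groups


skills = [10000, 50000, 2000, 11000, 7525, 100, 999, 14579, 26700]
for number, group in enumerate(greedy_groups(skills, 4), start=1):
    print(f"Group {number}: {group} total={sum(group)}")
```

The sketch shares the weaknesses already noted: group sizes are not balanced, and the strongest player ends up alone.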

Related

Formula to Repeat and increment Numbers and reset when ID changes

I am looking for a start in the right direction and I hope someone on this forum has run into this issue. I have an Excel table with 24K jobs on it and a technician assigned to every job. These technicians have 40 weeks to complete all the jobs assigned. I have a helper table with each technician’s ID and how many jobs per week they need to complete to finish all the work. I have sorted the jobs by geographic area for efficiency. I need a formula that will look at the technician ID and, if they are receiving 3 jobs per week, number the first 3 with a 1, the next 3 with a 2, and so on. When the technician changes, the counter should reset.
In the example below Tech 1 is assigned 3 jobs per week, and Tech 2 has 2 jobs per week.
| JobID | Tech | Grouping |
|-------|--------|----------|
| BK025 | Tech 1 | 1 |
| CD044 | Tech 1 | 1 |
| DE024 | Tech 1 | 1 |
| DE031 | Tech 1 | 2 |
| DE035 | Tech 1 | 2 |
| FT083 | Tech 1 | 2 |
| IR004 | Tech 2 | 1 |
| IR006 | Tech 2 | 1 |
| IR052 | Tech 2 | 2 |
| IR061 | Tech 2 | 2 |
| IR062 | Tech 2 | 3 |
| IR072 | Tech 2 | 3 |
I have been searching SO and Google for an answer but may not be using the correct keywords. I have found this formula, =ROUNDUP((ROW()-offset)/repeat,0) (found on exceljet), which will work, but to get it to work properly I would have to filter to each tech individually.
Assuming your helper table is in E1:F3 (technician ID in column E, jobs per week in column F) and the Tech column is column B with data starting in row 2, you could use an approach like the following:
=ROUNDUP(COUNTIF(B$2:B2,B2)/VLOOKUP(B2,$E$1:$F$3,2,0),0)
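To see why this works: COUNTIF(B$2:B2,B2) is a running count of how many times the current technician has appeared so far, and dividing by that technician's jobs per week and rounding up turns the count into a week number. A minimal Python sketch of the same logic follows; the function name `weekly_grouping` and the dictionary standing in for the helper table are assumptions made for illustration.

```python
import math
from collections import defaultdict


def weekly_grouping(techs, jobs_per_week):
    """Return the week number for each job, restarting the count for each technician."""
    seen = defaultdict(int)      # running count per technician, like COUNTIF(B$2:B2,B2)
    groups = []
    for tech in techs:
        seen[tech] += 1
        # ROUNDUP(count / jobs_per_week, 0); the dict lookup plays the role of VLOOKUP
        groups.append(math.ceil(seen[tech] / jobs_per_week[tech]))
    return groups


techs = ["Tech 1"] * 6 + ["Tech 2"] * 6
print(weekly_grouping(techs, {"Tech 1": 3, "Tech 2": 2}))
# [1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 3, 3]
```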

Spotfire - Identify each time a value changes in a particular pattern for a particular type

Apologies for the bad title, I'm struggling to describe exactly what I'm trying to do (which has also made it difficult to search for an existing answer).
I have a series of data with the columns "Asset", "Time", and a "State" that is a calculated column that changes based on several other values. The data is sourced from a constant, non-regular stream of data (though in the table below I have created a sample of timestamps that are regular).
The below table shows the source data ("Asset", "Time" and "State"), as well as an intended calculated column "Event" that tracks each time an event starts. I do not simply want to count the number of times an "Asset" has a Bad "State", but identify each time an "Asset" changes from a "State" of Good to a state of Bad (note in the real data there is a large number of different states, but the pattern I'm trying to identify is consistent).
+-------+----------+-------+-------+
| Asset | Time | State | Event |
+-------+----------+-------+-------+
| 1 | 12:00:00 | Good | 0 |
| 2 | 12:00:00 | Good | 0 |
| 1 | 12:00:01 | Good | 0 |
| 2 | 12:00:01 | Good | 0 |
| 1 | 12:00:02 | Bad | 1 |
| 2 | 12:00:02 | Good | 0 |
| 2 | 12:00:03 | Good | 0 |
| 2 | 12:00:03 | Good | 0 |
| 1 | 12:00:04 | Bad | 0 |
| 1 | 12:00:04 | Good | 0 |
| 2 | 12:00:05 | Good | 0 |
| 2 | 12:00:05 | Bad | 1 |
| 2 | 12:00:06 | Bad | 0 |
| 1 | 12:00:06 | Good | 0 |
| 2 | 12:00:07 | Bad | 0 |
| 2 | 12:00:07 | Good | 0 |
| 2 | 12:00:08 | Good | 0 |
| 1 | 12:00:08 | Bad | 1 |
| 2 | 12:00:09 | Good | 0 |
| 1 | 12:00:09 | Bad | 0 |
| 2 | 12:00:10 | Good | 0 |
| 1 | 12:00:10 | Good | 0 |
+-------+----------+-------+-------+
I intend to create a chart for this data to show how many times an event occurs for a particular asset per day. I figured the easiest way to do that is to have a column that counts a 1 when the event starts, and then sum this column over the dates to create the visualisation.
My idea was for a calculated column that picks up when the "State" is bad, and then checks to see if the previous state for that "Asset" was good, and set as a 1 if so. I have so far been unable to find a way to identify what the value was for a previous row entry for a particular "Asset".
Note that there are roughly 100 (or more) individual "Asset" entries, so creating an individual calculated column to track each would not be feasible. I am also using Spotfire 7.1.
Thanks in advance, and sorry again if the way I've written this is confusing.
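The idea described in the question (flag a row when its "State" is Bad and the previous "State" for the same "Asset" was Good) can be sketched in plain Python. This is not a Spotfire expression and makes no claim about what Spotfire 7.1 supports. The function name `mark_events` and the sample rows are made up for illustration, the rows are assumed to be sorted by time, and an asset whose very first row is Bad is assumed not to count as an event.

```python
def mark_events(rows):
    """rows: (asset, time, state) tuples sorted by time. Returns a 0/1 flag per row."""
    last_state = {}              # most recent state seen for each asset
    events = []
    for asset, _time, state in rows:
        started = state == "Bad" and last_state.get(asset) == "Good"
        events.append(1 if started else 0)
        last_state[asset] = state
    return events


# Hypothetical sample data, not taken from the table above.
rows = [
    (1, "12:00:00", "Good"),
    (2, "12:00:00", "Good"),
    (1, "12:00:01", "Bad"),
    (2, "12:00:01", "Good"),
    (1, "12:00:02", "Bad"),
    (2, "12:00:02", "Bad"),
    (1, "12:00:03", "Good"),
    (1, "12:00:04", "Bad"),
]
print(mark_events(rows))
# [0, 0, 1, 0, 0, 1, 0, 1]
```

Because the previous state is tracked per asset in one dictionary, the number of assets does not matter, which matches the requirement of roughly 100 or more assets.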

Excel to return list of items with specific repetition

I am trying to create a list of names that repeat a specific number of times, based on another variable. Basically, if I have the following:
Column A Column B
Amy 5
John 2
Carl 3
the result would be:
Amy
Amy
Amy
Amy
Amy
Carl
Carl
Carl
John
John
I have built the initial list using the Index-Small-Countif method, to get an alphabetical and distinct list, and then another formula to determine how many times each item repeats. I know I need to use some sort of index/offset with reference to rows, but just can't quite get it to work out.
The list is dynamic and changes daily, so manually retyping the list each day would result in too much human error and time (list is about 50 distinct items, with total number of rows at the end being around 400). Ultimately, the list will be used for a number of sumproduct/vlookups.
I can do this fairly quickly in VBA, but the users of this document don't always trust VBA and trying to get them to Enable Macros each time is not something that is going to work.
Thank you very much for any help you can offer!
Based on your table:
+---+------+---+
| | A | B |
+---+------+---+
| 1 | Amy | 5 |
| 2 | John | 2 |
| 3 | Carl | 3 |
+---+------+---+
In column C, put a "0" in C4, enter the formula =B1+C2 in C1, and copy it down to C3 (just before the 0):
+---+------+---+----+
| | A | B | C |
+---+------+---+----+
| 1 | Amy | 5 | 10 |
| 2 | John | 2 | 5 |
| 3 | Carl | 3 | 3 |
| 4 | | | 0 |
+---+------+---+----+
Now we have an upper bound of the row that each value should be placed on which we can use in a Match() formula which will feed an Index() formula.
In a new column (I'm using E), in E1: =INDEX($A$1:$A$3,MATCH(ROW(),$C$1:$C$3,-1),1) and copy down
+----+------+---+----+--+------+
| | A | B | C | D | E |
+----+------+---+----+--+------+
| 1 | Amy | 5 | 10 | | Carl |
| 2 | John | 2 | 5 | | Carl |
| 3 | Carl | 3 | 3 | | Carl |
| 4 | | | 0 | | John |
| 5 | | | | | John |
| 6 | | | | | Amy |
| 7 | | | | | Amy |
| 8 | | | | | Amy |
| 9 | | | | | Amy |
| 10 | | | | | Amy |
+----+------+---+----+--+------+
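The list comes out in reverse order because column C accumulates upwards from the 0 at the bottom. That keeps column C in descending order, which Match() requires when its last parameter is -1 (find the smallest value that is greater than or equal to the lookup value).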
I would imagine with some tweaking this could be done a little cleaner, but this should get you in the ballpark.
Although I would still be a big proponent of finding users who are capable of enabling macros. Ugh.
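For anyone who wants to check the logic outside Excel, here is a small Python sketch of the same two steps: build the bottom-up running totals from column C, then look up each output row against them. The function name `expand_names` is made up for this example, and the lookup is a plain-Python stand-in for MATCH(...,-1), not Excel's actual implementation.

```python
def expand_names(names, counts):
    """Repeat each name by its count, in the same (reversed) order the formulas produce."""
    # Column C: running totals accumulated from the bottom up, e.g. [10, 5, 3]
    bounds = [0] * len(counts)
    running = 0
    for i in range(len(counts) - 1, -1, -1):
        running += counts[i]
        bounds[i] = running
    total = bounds[0] if bounds else 0
    result = []
    for row in range(1, total + 1):
        # Like MATCH(ROW(), C1:C3, -1) on a descending list:
        # the last position whose bound is still >= row.
        position = max(i for i, bound in enumerate(bounds) if bound >= row)
        result.append(names[position])
    return result


print(expand_names(["Amy", "John", "Carl"], [5, 2, 3]))
```

With the sample data this gives Carl three times, then John twice, then Amy five times, matching column E above.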

Adding Columns to Excel As List From Other Sheet Grows

Background
I'm creating a grade book in Excel for my wife. I have sheets for the overall grade, classwork, exams, and participation.
The three sections of work (classwork, exams, and participation) each have a variable number of items, and each item has a different number of points possible. Each section has a weight in the overall grade.
I have this up and running with a fixed number of items per section, but I'd like to create a template that can be updated from class to class and year to year.
Here's the problem:
On the classwork sheet, I'd like to be able to enter new assignments and their point value and have that automatically update the master grade sheet on my first sheet tab. Is there any way to add columns in a section of one worksheet (the master grade sheet) when new rows are added to another worksheet (the list of assignments)?
It is possible to achieve this without using VBA. The reason you will have difficulty achieving it, however, is that the table you've already built is not in normal form. It appears the pertinent data you're looking for is each student's score on each assignment. If this is correct, the level of granularity you want is the Assignment (one row per student-assignment combination), not the Student.
There are some fairly quick ways to modify your existing work to account for this. I've written out some sample data below. Take a look and see if it helps.
Sample Original Table
+---------+------+------------+------------+
| Student | Quiz | Thumbnails | Watercolor |
+---------+------+------------+------------+
| Paul | 3 | 10 | 90 |
| Frank | 4 | 10 | 95 |
| Mary | 5 | 10 | 70 |
| Ellen | | 10 | 85 |
| Sue | 6 | 10 | 92 |
| Anton | 5 | 10 | 87 |
+---------+------+------------+------------+
Note the blank value: Ellen has no Quiz score.
Sample Normal Table
+---------+-------------+-----------+-------+
| Student | Assignment | New_Score | Score |
+---------+-------------+-----------+-------+
| Paul | Quiz | | 3 |
| Frank | Quiz | | 4 |
| Mary | Quiz | | 5 |
| Ellen | Quiz | | 0 |
| Sue | Quiz | | 6 |
| Anton | Quiz | | 5 |
| Paul | Thumbnails | | 10 |
| Frank | Thumbnails | | 10 |
| Mary | Thumbnails | | 10 |
| Ellen | Thumbnails | | 10 |
| Sue | Thumbnails | | 10 |
| Anton | Thumbnails | | 10 |
| Paul | Watercolor | | 90 |
| Frank | Watercolor | | 95 |
| Mary | Watercolor | | 70 |
| Ellen | Watercolor | | 85 |
| Sue | Watercolor | | 92 |
| Anton | Watercolor | | 87 |
| Mary | ExtraCredit | 10 | 10 |
| Ellen | ExtraCredit | 8 | 8 |
| Sue | ExtraCredit | 9 | 9 |
| Anton | ExtraCredit | 10 | 10 |
+---------+-------------+-----------+-------+
The Score column reaches back to your old table and grabs the score you've already entered for the students, so you won't have to do this all manually. The formula for this is =INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0)).
This assumes you've formatted the old data as an Excel Table ( ctrl+t ) and named it non_normal ( alt+j+t+i ). Note the unsubmitted assignment for Ellen comes through with a score of zero using this method. I've added a column named New_Score so that you are able to add new student-assignment submission combinations to the table without having to modify your old non_normal table ( which was the trouble in the OP ). With this column added, the formula in the Score column can be changed to =IF(NOT(ISBLANK([@[New_Score]])),[@[New_Score]],INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0))) which will take the New_Score value if available and the original score if not.
The rows with a New_Score value (the ExtraCredit rows) are new student-assignment submission combinations. Note you do not need to add a row for every student, just add a row whenever a student submits an assignment.
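The fallback in the Score formula can be sketched in Python as follows. The function name `score` and the nested dictionary standing in for the non_normal table are assumptions for illustration; `None` plays the role of a blank cell, which Excel returns as zero when a formula references it.

```python
def score(student, assignment, new_score, non_normal):
    """Prefer New_Score when present, otherwise look the score up in the old wide table."""
    if new_score is not None:                  # IF(NOT(ISBLANK(New_Score)), New_Score, ...)
        return new_score
    value = non_normal[student][assignment]    # INDEX(..., MATCH(student), MATCH(assignment))
    return 0 if value is None else value       # a blank cell comes through as zero


# A slice of the original wide table; None marks Ellen's blank Quiz cell.
non_normal = {
    "Paul": {"Quiz": 3, "Thumbnails": 10, "Watercolor": 90},
    "Ellen": {"Quiz": None, "Thumbnails": 10, "Watercolor": 85},
}
print(score("Paul", "Watercolor", None, non_normal))   # 90
print(score("Ellen", "Quiz", None, non_normal))        # 0
print(score("Ellen", "ExtraCredit", 8, non_normal))    # 8
```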
Sample Assignments Table
+-------------+-----------------+
| Assignment | Points_Possible |
+-------------+-----------------+
| Quiz | 6 |
| Thumbnails | 10 |
| Watercolor | 100 |
| ExtraCredit | |
+-------------+-----------------+
I've added the ExtraCredit assignment with a possible max score of zero/blank ( since not completing extra credit shouldn't count against a student )
Payoff - Back to the Original Table
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Sum of Score | Column Labels | | | | | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Row Labels | Quiz | Thumbnails | Watercolor | ExtraCredit | Grand Total | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Anton | 5 | 10 | 87 | 10 | 112 | 96.6% |
| Ellen | 0 | 10 | 85 | 8 | 103 | 88.8% |
| Frank | 4 | 10 | 95 | | 109 | 94.0% |
| Mary | 5 | 10 | 70 | 10 | 95 | 81.9% |
| Paul | 3 | 10 | 90 | | 103 | 88.8% |
| Sue | 6 | 10 | 92 | 9 | 117 | 100.9% |
+--------------+---------------+------------+------------+-------------+-------------+--------+
Pivot your newly normalized data into a Pivot Table ( alt+n+v ), with Student as the row labels, Assignment as the column labels, and Sum of Score as the values, as in the table above. Now, simply adding a new assignment to the normal_assignment DataTable will cause that assignment to appear in a new column when you refresh the Pivot Table ( alt+a+r+a ).
The % score on the right of the Pivot Table is calculated using the following formula ( with the sample Pivot Table starting in cell $M$2 ): =GETPIVOTDATA("Score",$M$2,"Student",M4)/SUM(assignment[Points_Possible])
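To check the arithmetic behind the percentage column, here is a short Python sketch that sums each student's scores from the normalized rows and divides by the total points possible. The function name `grade_percentages` is made up for this example, and only three students from the sample data are included to keep it short.

```python
from collections import defaultdict


def grade_percentages(rows, points_possible):
    """rows: (student, assignment, score). Returns {student: percent of total points}."""
    totals = defaultdict(int)
    for student, _assignment, points in rows:
        totals[student] += points              # the pivot table's Grand Total per student
    # SUM(assignment[Points_Possible]); a blank (None) adds nothing, like ExtraCredit
    max_points = sum(p for p in points_possible.values() if p)
    return {student: round(100 * total / max_points, 1) for student, total in totals.items()}


points_possible = {"Quiz": 6, "Thumbnails": 10, "Watercolor": 100, "ExtraCredit": None}
rows = [
    ("Anton", "Quiz", 5), ("Anton", "Thumbnails", 10), ("Anton", "Watercolor", 87), ("Anton", "ExtraCredit", 10),
    ("Ellen", "Quiz", 0), ("Ellen", "Thumbnails", 10), ("Ellen", "Watercolor", 85), ("Ellen", "ExtraCredit", 8),
    ("Sue", "Quiz", 6), ("Sue", "Thumbnails", 10), ("Sue", "Watercolor", 92), ("Sue", "ExtraCredit", 9),
]
print(grade_percentages(rows, points_possible))
# {'Anton': 96.6, 'Ellen': 88.8, 'Sue': 100.9}
```

Sue ends up above 100% because extra credit adds to her score but not to the points possible.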
I've uploaded the raw sample file for this to my public repo if you'd like to pull it and take a peek at the source. Credit to sensefulsolutions for text-to-table conversion.
Hope this is what you need!

Count distinct occurrences and averages in a column based on identifiers in another column

In MS Excel, I want to count the number of distinct categories (ignoring a specific item) based on a different column. Also, I want to find the average and the max for the same selection. This is the data:
+--------+-----------+-------+
| Person | idea | score |
+--------+-----------+-------+
| George | vacuum | 9 |
| George | box | 6 |
| George | x | 1 |
| Joe | scoop | 4 |
| Joe | x | 1 |
| Joe | x | 1 |
| Joe | scoop | 4 |
| Joe | gear | 7 |
| Mike | harvester | 10 |
| Mike | gear | 7 |
| Mike | box | 6 |
+--------+-----------+-------+
The result should be the following:
+--------+----------------+------------+-----------+
| Person | distinct ideas | Avg. score | Max score |
+--------+----------------+------------+-----------+
| George | 2 | 5.3 | 9 |
| Joe | 2 | 3.4 | 7 |
| Mike | 3 | 7.7 | 10 |
+--------+----------------+------------+-----------+
Because Joe has two "scoop" and one "gear" idea, and I want to ignore the "x" items.
I reluctantly gave up and did it manually for each person, e.g., this is for the first person:
SUM(IF(FREQUENCY(MATCH(B2:B4,B2:B4,0),MATCH(B2:B4,B2:B4,0))>0,1))-IF(COUNTIF(B2:B4,"x")>0,1,0)
Doesn't Excel have functions to return a range instead of a value? If I could select the range based on the name of the person in the first columns, I could count distinct occurrences or find the average in another column.
Add a 4th column and label it Distinct Ideas
If your table starts in A1, then:
EDIT: Formula changed to exclude "x". Screen shot also changed
D2: =IF(TRIM($B2)="x",0,IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))>1,0,1))
and fill down.
Then construct a Pivot table
Person to Row Labels
Distinct Ideas to Values area (summarized by Sum)
Score to Values area and select Average
Score to Values area and select Max
Format as desired. For the sample data this reproduces the result table in the question.
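The same per-person summary can be cross-checked outside Excel. The Python sketch below counts distinct ideas while ignoring "x", and takes the average and max over all of a person's scores (including the "x" rows, which is what the expected results in the question imply). The function name `summarise` is made up for this example.

```python
def summarise(rows, ignore="x"):
    """rows: (person, idea, score). Returns {person: (distinct ideas, avg score, max score)}."""
    ideas, scores = {}, {}
    for person, idea, points in rows:
        ideas.setdefault(person, set())
        scores.setdefault(person, []).append(points)
        if idea.strip() != ignore:             # mirrors the TRIM($B2)="x" check in column D
            ideas[person].add(idea)
    return {
        person: (len(ideas[person]), round(sum(s) / len(s), 1), max(s))
        for person, s in scores.items()
    }


rows = [
    ("George", "vacuum", 9), ("George", "box", 6), ("George", "x", 1),
    ("Joe", "scoop", 4), ("Joe", "x", 1), ("Joe", "x", 1), ("Joe", "scoop", 4), ("Joe", "gear", 7),
    ("Mike", "harvester", 10), ("Mike", "gear", 7), ("Mike", "box", 6),
]
print(summarise(rows))
# {'George': (2, 5.3, 9), 'Joe': (2, 3.4, 7), 'Mike': (3, 7.7, 10)}
```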

Resources