I am trying to create a spreadsheet that finds the probability that each student scored a specific grade on a test.
Each grade is scored by exactly one student, and each student scores exactly one grade.
I have limited information about each student.
There are 5 students (1, 2, 3, 4, 5),
and the only possible grades are (100, 90, 80, 70, 60).
In the spreadsheet a 1 denotes that the student DIDN'T score that grade.
Does anyone know how to build a simulation that finds the most likely assignment of students to grades?
Link:
https://docs.google.com/spreadsheets/d/1a8uUIRzUKsY3DolTM1A0ISqMd-42WCUCiDsxmUT5TKI/edit?usp=sharing
Based on your response in comments, each student has an equal likelihood of getting each grade. No simulation is necessary.
If you want to simulate it anyway, don't use Excel*. Create a vector of students, and pair it with a shuffled vector of the grades. Lather, rinse, repeat as many times as you want to see that the student-to-grade matching is uniformly distributed.
* - To get an idea of how bad Excel can be for random variate generation, enable the Analysis Toolpak, go to "Data -> Data Analysis" on the ribbon, and select "Random Number Generation". Fill in the fields to request 10 variables, 2000 random numbers, and a "Normal" distribution; leave the mean and std dev at 0 and 1, and enter a "Random Seed" value of 123. You will find that the resulting table contains 3 instances of the value "-9.35764". Values that extreme should occur about once per twenty thousand years if you generate a billion a second. Getting three of them is so extreme that it should happen once per 10^30 times the current estimated age of the universe. Conclude that a) it's your lucky day, or b) Excel sucks at random numbers, and despite being informed about this as far back as 1998, Microsoft hasn't bothered to fix it.
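The shuffle-and-pair simulation described above can be sketched in Python roughly as follows. This is only an illustration under the answer's assumption that every grade is equally likely for every student; it ignores the "didn't score" constraints from the spreadsheet, and the function name `simulate_matchings`, the trial count, and the seed are made up for the example.

```python
import random
from collections import Counter

def simulate_matchings(students, grades, trials, seed=None):
    """Pair students with a freshly shuffled copy of the grades, many times.

    Returns a Counter mapping (student, grade) to how often that pair occurred.
    """
    rng = random.Random(seed)  # seeded generator, so runs are repeatable
    counts = Counter()
    for _ in range(trials):
        shuffled = grades[:]   # copy so the caller's list is left untouched
        rng.shuffle(shuffled)
        for student, grade in zip(students, shuffled):
            counts[(student, grade)] += 1
    return counts

students = [1, 2, 3, 4, 5]
grades = [100, 90, 80, 70, 60]
trials = 10_000  # arbitrary choice for the illustration
counts = simulate_matchings(students, grades, trials, seed=42)

# Print the observed frequency of each grade for each student.
for student in students:
    print(student, [round(counts[(student, g)] / trials, 3) for g in grades])
```

With enough trials, every frequency should settle near 1/5, which is the uniform matching the answer predicts.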
Related
(I use the term "teams" generically here because the entirety of this question rests on ranking, and it seemed to be the most intuitive language to describe my problem.)
In a league of 30 teams, each day only 8 teams play. The results for those teams are ranked ordinally from 1 to 8 for the day. This continues "forever", so that additional results must be recorded every day.
Example after 4 days:
I want to calculate a single number to describe the relationship between two teams. For instance, given the example, the value (in a 2d table) that describes the relationship of Ace to Get is 1. Ace beat Get twice and Get beat Ace once (2-1).
I have been messing with Sumproduct, Match, and Index to get values, which I could calculate using many extra tables, but I may need to add "teams" on the fly, and I do not know how large the pool of teams will become. Because of this, I was hoping to be able to use a single formula in the 2d relationship table. The results of that table, looking at just day 1 and day 2 given the previous example, are:
Is there a direct formula I can use to calculate the results to populate that table?
You can try following formula:
=IF($A11<>B$10;
SUMPRODUCT(
IF(MMULT(($B$1:$I$1)*($B$2:$I$3=$A11);ROW($1:$8)^0)
<MMULT(($B$1:$I$1)*($B$2:$I$3=B$10);ROW($1:$8)^0);
1;
-1)
*(((MMULT(--($B$2:$I$3<>$A11);ROW($1:$8)^0)=8)
+(MMULT(--($B$2:$I$3<>B$10);ROW($1:$8)^0)=8))
=0));
"")
Copy right and down.
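For readers who want to check the spreadsheet result outside Excel, the same head-to-head tally can be sketched in Python. This is not a translation of the formula above, just one way to compute the same net score; the function name `head_to_head` and the sample rankings are hypothetical (only the team names Ace and Get come from the question).

```python
from itertools import permutations

def head_to_head(days):
    """Net wins for every ordered pair of teams.

    days: list of dicts mapping team -> rank for that day (1 = best).
    Only teams that played on the same day are compared.
    """
    net = {}
    for ranks in days:
        for a, b in permutations(ranks, 2):
            # +1 if a finished ahead of b that day, otherwise -1
            result = 1 if ranks[a] < ranks[b] else -1
            net[(a, b)] = net.get((a, b), 0) + result
    return net

# Made-up rankings: Ace beats Get twice, Get beats Ace once.
days = [
    {"Ace": 1, "Get": 2, "Bat": 3},
    {"Get": 1, "Ace": 2},
    {"Ace": 1, "Bat": 2, "Get": 3},
    {"Bat": 1, "Cat": 2},
]
net = head_to_head(days)
print(net[("Ace", "Get")])
```

Pairs that never met on the same day simply do not appear in the result, which corresponds to a blank cell in the 2d table.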
I've created a game in which 5 users played and collected some points. I handed out the gifts manually this time, but for future games, how can I get Excel to calculate the number of gifts each player receives?
This works fine using a number format with 0 decimal places: 6+1+1+1 = 9
but in cases like this:
1+6+1+1+1 = 10, how can I make sure that only 9 gifts are awarded?
You should be comparing their percent (B2/SUM(B2:B6)) against each prize as it relates to the total prize (e.g. 1/9). Since you are comparing decimal numbers with another decimal number and expecting an integer (no. of prizes), you will be rounding either up or down depending on whether you are favoring a wider distribution of the prizes or favoring the top score.
Either way you are going to have to decide whether the lowest score should always receive a prize or if the highest score should benefit from the points awarded.
The three possible formulas to start with would be,
=MROUND(C2, 1/9)*9 ◄ closest to even distribution
=FLOOR(C2, 1/9)*9 ◄ favours wider prize distribution
=CEILING(C2, 1/9)*9 ◄ rewards highest awarded points
Fill down as necessary.
Now you have to either take the highest or lowest score and adjust that to compensate for rounding the division of decimal numbers to an integer. MROUND doesn't play well with SUMPRODUCT but these two may give you a solution that you can live with.
=FLOOR($C2, 1/9)*9-((SUMPRODUCT(FLOOR($C$2:$C$6, 1/9)*9)-9)*($C2=MAX($C$2:$C$6)))
=CEILING($C2, 1/9)*9-((SUMPRODUCT(CEILING($C$2:$C$6, 1/9)*9)-9)*($C2=MAX($C$2:$C$6)))
Fill down as necessary.
If the MROUND solution is best suited to your prize distribution model, use a helper column that can determine the MROUND returns and then adjust the high score according to the sum of the helper column without circular references.
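The FLOOR variant above (round every share down, then give the leftover prizes to the top scorer) can be sketched in Python as a sanity check. The function name `allocate_gifts` and the point values are made up for illustration, and, unlike the spreadsheet formula, this sketch adjusts only the first player holding the maximum score if there is a tie.

```python
def allocate_gifts(points, total_gifts):
    """Round each player's share of the gifts down, then hand the
    leftover gifts to the highest scorer so the total is exact."""
    total_points = sum(points)
    # Integer arithmetic avoids floating-point surprises when flooring.
    gifts = [p * total_gifts // total_points for p in points]
    shortfall = total_gifts - sum(gifts)
    top = points.index(max(points))  # first top scorer only (assumption)
    gifts[top] += shortfall
    return gifts

points = [12, 60, 13, 12, 11]  # hypothetical scores for 5 players
gifts = allocate_gifts(points, 9)
print(gifts, sum(gifts))
```

Because the shares are rounded down first, the sum can never exceed the number of gifts, so the adjustment only ever adds gifts to the top scorer.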
I'd like to use Excel to generate a randomized lab partner list, without using VB (due to security settings on the PCs).
Parameters are as follows:
Number of students: 10-30, one worksheet per total number desired
Number of partners: Three for first two labs, and two for the other four-five.
Number of lab stations: 10
Repeats: Ideally none, but it is permissible for a student to have a repeat partner from one of the first two labs.
Excel version: 2007
To clarify, each student will have two labs where they share a lab station with up to two other students, giving a maximum lab size of 30 students. After that, they will be strictly limited to two students per station, giving a maximum of 20 students. Each student will have four of these limited labs, with there being a total of five such labs presented, to allow for either odd-numbered classes, or a class size between 21-30.
Each student is simply numbered from 1-30, so a cell could, for instance, state "5, 24" as the two students for that lab station.
True RNG is not important, and in fact, only needs to be performed once to make these matrices.
I think this is a bit tricky without using VBA, but here is one approach that is OK for small groups. I have tried it using a group of just nine so that the screen shot should be readable.
The method is basic Fisher-Yates
A Start with a group of students size n represented by a list of numbers 1 to n.
B Generate a random number r in range 1 to n
C Pick the rth element from the list
D Remove the rth element from the list
E Reduce n by 1
F Repeat from B until n=1.
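Steps A to F can be sketched in Python as below, purely to show the logic before it is rebuilt with worksheet formulas. The helper names `draw_without_replacement` and `make_groups` are invented for this example, and the loop runs until the list is empty so that the last remaining student is also placed.

```python
import random

def draw_without_replacement(n, rng=random):
    """Pick-and-remove shuffle: returns the numbers 1..n in random order."""
    remaining = list(range(1, n + 1))        # step A
    order = []
    while remaining:                         # step F (runs until the list is empty)
        r = rng.randrange(len(remaining))    # step B (0-based index here)
        order.append(remaining.pop(r))       # steps C, D and E in one call
    return order

def make_groups(order, size):
    """Cut the shuffled order into consecutive groups of the given size."""
    return [order[i:i + size] for i in range(0, len(order), size)]

order = draw_without_replacement(9, random.Random(1))  # seed chosen arbitrarily
groups = make_groups(order, 3)
print(groups)
```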
In Excel:-
Fill A2:A10 and D2:L2 with numbers 1-9
Put the following in B2 and pull down:-
=RANDBETWEEN(1,10-A2)
Put this in C2 and pull down:-
=OFFSET(D2,0,B2-1)
Put this in D3 and pull down and across:-
=IF(D2>=$C2,E2,D2)
The IDs will be in column C, so the first three would be in group 1, the next three in group 2, etc.
By the way, your question is a special case of generating non-repeating random numbers - see
Generating unique random numbers without VBA
The array formula described here does it in one step - modified slightly for this problem it would look like
=SMALL(IF(COUNTIF(C$1:C1,ROW(INDIRECT("1:9")))=0,ROW(INDIRECT("1:9"))),RANDBETWEEN(1,(9-ROWS(C$2:C2)+1)))
I have a requirement in Excel to spread small (i.e. pennies) monetary rounding errors fairly across the members of my club.
The error arises when I deduct money from members; e.g. £30 divided between 21 members is £1.428571... requiring £1.43 to be deducted from each member, totalling £30.03, in order to hit the £30 target.
The approach that I want to take, continuing the above example, is to deduct £1.42 from each member, totalling £29.82, and then deduct the remaining £0.18 using an error spreading technique to randomly take an extra penny from 18 of the 21 members.
This immediately made me think of Reservoir Sampling, and I used the information here: Random selection,
to construct the test Excel spreadsheet here: https://www.dropbox.com/s/snbkldt6e8qkcco/ErrorSpreading.xls, on Dropbox, for you guys to play with...
The problem I have is that each row of this spreadsheet calculates the error distribution independently of every other row, and this causes some members to contribute more than their fair share of extra pennies.
What I am looking for is a modification to the Reservoir Sampling technique, or another balanced / 2 dimensional error spreading methodology that I'm not aware of, that will minimise the overall error between members across many 'error spreading' rows.
I think this is one of those challenging problems that has a huge number of other uses, so I'm hoping you geniuses have some good ideas!
Thanks for any insight you can share :)
Will
I found a solution, though not a very elegant one.
You need two matrices. The first holds completely random numbers generated with =RAND(), and the second picks the n largest values in each row.
Say that F30 holds the first
=RAND()
cell.
(I have experimented with your sheet.)
Just copy a column of n values (8 in your sheet) into column A.
In cell F52 you put:
=IF(RANK(F30,$F30:$Z30)<=$A52, 1, 0)
So far, if you drag the formulas right and down, you get the same situation as in your sheet (only less elegant and efficient).
But starting from the second row of random numbers, you can compensate for the pennies already disbursed.
In cell F31 put:
=RAND()-SUM(F$52:F52)*0.5
(Pay attention to the $: each random number should carry a correction based on the pennies that member has already spent.)
If the $ signs are right, you can drag the formulas right and down. You could also parametrize the 0.5 and experiment with other values. With 0.5 I get an error factor (the equivalent of your cell AB24) between 1 and 2.
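The idea behind these formulas (draw a random score per member, penalise members who have already paid extra pennies, then charge the top n) can be sketched in Python to experiment with the correction weight. This is an illustration rather than a port of the sheet: the function name `spread_pennies`, the ten identical £30 rows, and the seed are assumptions, and amounts are held in pence to avoid rounding issues.

```python
import random

def spread_pennies(amounts_pence, members, weight=0.5, seed=None):
    """Split each amount across members, biasing the leftover pennies
    away from members who have already paid extra in earlier rows."""
    rng = random.Random(seed)
    paid = [0] * members          # extra pennies each member has paid so far
    rows = []
    for amount in amounts_pence:
        base, extra = divmod(amount, members)   # e.g. 3000 -> 142 each, 18 left over
        # Random score minus a penalty, like =RAND()-SUM(...)*0.5 in the sheet.
        scores = [rng.random() - weight * paid[i] for i in range(members)]
        ranked = sorted(range(members), key=lambda i: scores[i], reverse=True)
        row = [base] * members
        for i in ranked[:extra]:  # the 'extra' highest scores pay one more penny
            row[i] += 1
            paid[i] += 1
        rows.append(row)
    return rows, paid

rows, paid = spread_pennies([3000] * 10, members=21, seed=7)
print(rows[0])
print(max(paid) - min(paid))  # gap between the most and least charged member
```

Raising `weight` makes the balancing stricter; setting it to 0 reproduces the independent per-row selection that the question complains about.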
I have a data matrix of the number of telephone calls from one telephone to another; all calls are unidirectional. Rows are days of the week and columns are one-hour blocks of a 24-hour clock. Each cell holds the number of calls from telephone A to telephone B in that hour. The data is not a sample; it is the full population.
I would like to have a repeatable measure that enables me to tell my audience that the likelihood of this distribution occurring randomly is <x.
I'd like the formula for Excel 2007 or, as a last resort, VBA code.
I've searched and found answers that tell me how to statistically determine the significance of differences between two different data sets but not how to measure for just one data set against a random outcome.
Thanks in advance.
If the total number of calls in a given hour is T and the total calling population is P, then the number of calls from A to B should be about T/P if calls are "random". To test whether this is really the case, use the chi-squared test. I'm afraid I don't have time to give you the full answer, but the statistic would be testvalue = sum((observed_i - T/P)^2/(T/P)), where observed_i is the observed A-to-B count for hour i. Check the test value against the chi-squared table to read off the probability. Excel can calculate these values. Refer to Chi-Squared Test for more details.
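As a rough illustration of the calculation, the Python sketch below computes the chi-squared statistic in its standard form, sum((observed - expected)^2 / expected), with the expected count for each hour taken as T/P. All numbers are invented for the example, and the function name `chi_squared_statistic` is hypothetical; in Excel 2007 the resulting statistic could then be turned into a probability with CHIDIST, given an appropriate number of degrees of freedom.

```python
def chi_squared_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up data for four hourly blocks.
hourly_totals = [120, 200, 80, 160]   # T: all calls placed in each hour
population = 40                       # P: size of the calling population
observed_a_to_b = [5, 4, 1, 9]        # calls from A to B in each hour

# Expected A-to-B calls per hour if calls were spread evenly: T / P.
expected = [t / population for t in hourly_totals]

stat = chi_squared_statistic(observed_a_to_b, expected)
print(stat)
```

A large statistic relative to the chi-squared table value suggests the observed pattern is unlikely under the "random" assumption.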