How to generate a random number in Google Sheets / Excel through a discrete list of percentage of influence in random outcome? - excel

Let's say I randomly pick one of the numbers 1, 2 and 3 ten times and note how many times each was picked. Having noted the percentage of picks each number received in those 10 draws, I want to pick a number at random again, but this time weighted by the percentages I just recorded.
For instance, if 3 was picked 20% of the time, the random generator should return 3 about 20% of the time instead of giving 1, 2 and 3 an equal ~33% chance each.
What I'm missing is a way (in either Excel or Google Sheets) to give a random picker these percentage "weights".

to generate 10 numbers from a fixed set (1, 2, 3) with equal chances for each you can use:
=INDEX(INT(RANDARRAY(10)*3)+1)
if this gives you a distribution like:
1
2
1
2
1
2
3
2
3
1
where number 3 is picked 20% of the time, you can find the distribution with:
=INDEX(QUERY({A2:A11, COUNTIFS(A2:A11, A2:A11)},
"select Col1,count(Col2)/10 group by Col1 label count(Col2)/10''"))
now to assign a weight we can reuse it like:
=INDEX(ROUND(RANDARRAY(10)*(MAX(A2:A11)-A2:A11))+MIN(A2:A11))
where you can see that the share of number 3 is always significantly lower, or zero.
for more precision, and to avoid ghost values, you can use:
=INDEX(SORTN(SORT(FLATTEN(SPLIT(QUERY(REPT(SORT(UNIQUE(A2:A11))&"×",
QUERY({A2:A11, COUNTIFS(A2:A11, A2:A11)},
"select count(Col2)*10 group by Col1 label count(Col2)*10''")),,9^9),
"×"))), 10, 1, RANDARRAY(100), 1))
if you wish to freeze the random generation follow the white fox into the forest of ice
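Outside a spreadsheet, the same idea (measure how often each value occurred, then draw new values with those shares as weights) can be sketched in Python. This is only an illustration under the assumption that the first round of picks is available as a list; the helper names `observed_weights` and `weighted_picks` are made up for this example.

```python
import random
from collections import Counter

def observed_weights(samples):
    """Return each distinct value with the fraction of times it occurred."""
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in sorted(counts.items())}

def weighted_picks(samples, k, rng=random):
    """Draw k values, each chosen with probability equal to its observed share."""
    weights = observed_weights(samples)
    return rng.choices(list(weights), weights=list(weights.values()), k=k)

first_round = [1, 2, 1, 2, 1, 2, 3, 2, 3, 1]  # the sample column from the answer
print(observed_weights(first_round))          # {1: 0.4, 2: 0.4, 3: 0.2}
print(weighted_picks(first_round, 10))        # 10 new picks, 3 is the rarest on average
```

Passing a seeded `random.Random` instance as `rng` makes the draws repeatable, which is the closest equivalent to freezing the random generation.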

Related

How to distribute n identical balls into k identical boxes having different capacities

Suppose there are 4 boxes with capacities 10, 5, 2 and 1. Please help me find the number of ways to distribute 16 identical balls into these four boxes. Each box can hold anywhere from 0 balls up to its capacity.
We can simplify the question to
"Find the number of ways to distribute 2 identical balls into 4 boxes having capacities 2, 2, 2 and 1",
because the total capacity is 18, so we can count the 2 empty spaces in the original instead of the 16 balls.
1) If we put one ball in the 4th box, there are 3 boxes left for the other one: 3 combinations.
2) If we put no ball in the 4th box, we have 3 boxes with 2 places each. Either both balls go into the same box (3 ways) or into two different boxes (3 ways), which gives 6 combinations.
In total, we have 3+6=9 combinations.
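The count can be cross-checked with a small dynamic-programming sketch in Python. It assumes the boxes are distinguishable (they are, since their capacities differ in the original problem); the function name `count_distributions` is made up for this example.

```python
def count_distributions(balls, capacities):
    """Count ways to place identical balls into distinguishable boxes with capacity limits."""
    # ways[b] = number of ways to place b balls into the boxes processed so far
    ways = [1] + [0] * balls
    for cap in capacities:
        new_ways = [0] * (balls + 1)
        for placed in range(balls + 1):
            # Try every allowed number of balls in the current box.
            for in_this_box in range(min(cap, placed) + 1):
                new_ways[placed] += ways[placed - in_this_box]
        ways = new_ways
    return ways[balls]

print(count_distributions(16, [10, 5, 2, 1]))  # 9 (original problem)
print(count_distributions(2, [2, 2, 2, 1]))    # 9 (simplified "empty spaces" problem)
```

Both calls agree, which is what the empty-spaces simplification predicts.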
Let the parts be a_1, ..., a_k with sum n. There are n! arrangements in total; the items within each part a_i are identical, so we divide by every a_i!, and then divide by k!. The final answer is:
n!/(a_1! * a_2! * ... * a_k! * k!).
This can be calculated in O(k) with O(n) preprocessing using Fermat's theorem.

Histogram bins size to equal 1 day - pyplot

I have a list of delivery times in days for cars that are 0 years old. The list contains nearly 20,000 delivery times, with many values repeated. My question is: how do I get the histogram to use bins that are 1 day wide? I have set the number of bins to the number of unique delivery days with:
len(set(list))
but when I generate the histogram, the frequency of 0 delivery days is over 5000, whereas list.count(0) returns 4500.
As you pointed out, len(set(list)) is the number of unique values for the "delivery days" variable. This is not the same thing as the bin size; it's the number of distinct bins. I would use "bin size" to describe the number of items in one bin; "bin count" would be a better name for the number of bins.
If you want to generate a histogram, supposing the original list of days is called days_list, a quick high-level approach is:
1) Make a new set unique_days = set(days_list).
2) Iterate over each value day in unique_days.
3) For the current day, set the height of the bar (or size of the bin) in the histogram to be equal to days_list.count(day). This will tell you the number of times the current "day" value for number of delivery days appeared in the days_list list of delivery times.
Does this make sense?
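The counting steps above can be sketched without any plotting library. This is a minimal illustration that assumes the delivery times are non-negative whole days; the sample data and the helper name `daily_counts` are made up. It uses `collections.Counter` instead of repeated `count` calls so the list is only scanned once.

```python
from collections import Counter

def daily_counts(days_list):
    """Return a list where index d holds how many times delivery time d occurs."""
    counts = Counter(days_list)
    # Include days that never occur so every bar stands for exactly one day.
    return [counts.get(day, 0) for day in range(max(days_list) + 1)]

days_list = [0, 0, 1, 3, 3, 3, 5]  # made-up sample data
print(daily_counts(days_list))      # [2, 1, 0, 3, 0, 1]
```

Here there are 4 unique values but 6 one-day slots, which is why the number of unique values is not the right number of bins when some days are missing.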
If the problem is not that you're manually calculating the histogram wrong but that pyplot is doing something wrong, it would help if you included some code for how you are using pyplot.
The number of bins is determined by the number of days up to the maximum delivery time.
Say daylist is the list you want to histogram (never call a list list, because that shadows the Python built-in of the same name). Take the maximum of that list and create a range of bin edges like
import matplotlib.pyplot as plt
maxi = max(daylist)
bins = range(0, maxi + 2)  # edges 0, 1, ..., maxi + 1, so every bin is exactly 1 day wide
plt.hist(daylist, bins=bins)
or, if you want to use numpy,
import numpy as np
bins = np.arange(0, np.max(daylist) + 2)  # same edges as above
plt.hist(daylist, bins=bins)

Excel: Count until, then repeat?

I have a list of numbers which are either 1's or 2's. What I'd like to do is count how many 1's there are before a 2 appears, and then keep repeating this down the list (I'm trying to find the average number of 1's between each 2).
What would be the best way of doing this, considering I've got over 10,000 rows (i.e. too many to do manually)?
The average number of 1's between each 2 is the same as the ratio of the number of 1's to the number of 2's.
Example:
1
1
2
1
1
1
1
2
1
1
2
1
1
2
This contains 10 ones and 4 twos.
Or: there are four groups of ones, with the following counts: 2, 4, 2, 2
Either way, it gives you an average of 2.5 (10/4 = 2.5).
Note: You have to make a design choice regarding how to handle beginnings and ends. If there were another 1 after the last 2, how should it be handled?
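As a quick check of that reasoning, here is a small Python sketch that counts the 1's before each 2 and compares the average with the plain ratio. The function name `ones_before_each_two` is made up, and the sketch makes one particular design choice: any 1's after the last 2 are ignored.

```python
def ones_before_each_two(values):
    """Return the number of 1's counted before each 2, in order."""
    runs, current = [], 0
    for v in values:
        if v == 1:
            current += 1
        elif v == 2:
            runs.append(current)  # close the current group of 1's
            current = 0
    return runs  # trailing 1's after the last 2 are dropped (a design choice)

data = [1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2]
runs = ones_before_each_two(data)
print(runs)                            # [2, 4, 2, 2]
print(sum(runs) / len(runs))           # 2.5
print(data.count(1) / data.count(2))   # 2.5
```

The two averages only agree when the list ends with a 2; with trailing 1's the simple ratio counts them and the run-based average does not.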
You can use the formulas shown below.
Note that the formula in the first row is different.
B                    C
=IF(A2=1,B1,B1+1)    =COUNTIF(B:B,B2)
=IF(A3=1,B2,B2+1)    =IFERROR(IF(A4=2,COUNTIF(B:B,B4),"")-1,"")
Then to get the average use:
=AVERAGEIF(C:C,"<>"&0)
Noceo's solution as a formula:
=COUNTIF(A:A,1)/COUNTIF(A:A,2)
The output of all the above:

Simulation on Excel with 6 possibilities with different probabilities

So I have a table that looks like this
Arrival Time    Probability
0               .09
1               .17
2               .27
3               .2
4               .15
5               .12
I want Excel to randomly produce one of the 6 arrival time values according to the given probabilities using RAND(). Is there any way to do this other than with nested IF statements?
Here's what I came up with.
I would add a column C that calculates the cumulative brackets between 0 and 1 that each arrival time represents. If you start with zero and use formulas to calculate your brackets, you can change the probabilities later if needed. (formulas in photo below)
For example, an arrival time of 0 would cover the range from 0 to .09.
Then you can use the RAND() function in column D to generate your random number between 0 and 1 and add a lookup function in column E, or wherever you like. Screenshots of the data and formulas:
Replace your probabilities with cumulative probabilities (with a preliminary line for 0) and use VLOOKUP, exploiting the fact that with an approximate match VLOOKUP returns the row with the largest value that is less than or equal to the lookup value.
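The cumulative-bracket lookup described in both answers can be sketched in Python, with `bisect` playing the role of the approximate-match VLOOKUP. This is an illustration only; the names `lower_edges` and `lookup_arrival` are made up, and the table values are taken from the question.

```python
import bisect
import random
from itertools import accumulate

arrival_times = [0, 1, 2, 3, 4, 5]
probabilities = [0.09, 0.17, 0.27, 0.20, 0.15, 0.12]

# Lower edge of each bracket, starting with the "preliminary line for 0":
# roughly 0, 0.09, 0.26, 0.53, 0.73, 0.88
lower_edges = [0.0] + list(accumulate(probabilities))[:-1]

def lookup_arrival(u):
    """Return the arrival time whose bracket contains u, for 0 <= u < 1."""
    # Find the last lower edge that is <= u, like an approximate-match lookup.
    return arrival_times[bisect.bisect_right(lower_edges, u) - 1]

print(lookup_arrival(0.05))             # 0
print(lookup_arrival(0.50))             # 2
print(lookup_arrival(random.random()))  # one random arrival time
```

Because the brackets are computed from the probability column, changing a probability automatically moves the bracket edges, which is the same benefit the first answer points out for the spreadsheet version.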

Excel IF OR Statement

I am having trouble determining the correct way to calculate a final rank order for four categories. Each of the four metrics makes up a higher group. A Top 10 of each category is applied to the respective product for risk analysis.
CURRENT LOGIC - Assignment of 25% max per category.
Columns - Y4
Parts
0.25
25
=IF(L9=1,$Y$4,IF(L9=2,$Y$4*0.9, IF(L9=3,$Y$4*0.8, IF(L9=4,$Y$4*0.7, IF(L9=5,$Y$4*0.6, IF(L9=6,$Y$4*0.5, IF(L9=7,$Y$4*0.4, IF(L9=8,$Y$4*0.3, IF(L9=9,$Y$4*0.2, IF(L9=10,$Y$4*0.1,0))))))))))
DESIRED...
I would like to use a statement to determine three criteria in order to apply a score (1=100, 2=90, 3=80, etc..).
SUM the rank positions of each of the four categories-apply product rank ascending (not including NULL since it's not in the Top 10)
IF a product is identified in more than one metric-apply a significant contribution weight of (*.75),
IF a product has the number 1 rank in any of the four metrics-apply a score of (100).
Data - UPDATED EXAMPLE
(Product)   Parts   Labor   Overhead   External   Final Score
"XYZ"       3       1       7          7          100
"ABC"       NULL    6       NULL       2          100
"LMN"       4       NULL    NULL       NULL       70
This is way beyond my capability. ANY assistance is appreciated greatly!!!
Jim
I figured this is a good start and I can alter the weight as needed to reflect the reality of the situation.
=AVERAGE(G28:I28)+SUM(G28:I28)*0.25
However, I couldn't figure out how to put a cap on the score of no more than 100 points.
I am still unclear on exactly what you are attempting and whether this will work, but how about this simple matrix using an array formula and some conditional formatting?
Array formula in F2 (make sure to press Ctrl+Shift+Enter when exiting formula edit mode):
=MIN(100,SUM(IF(B2:E2<>"NULL",CHOOSE(B2:E2,100,90,80,70,60,50,40,30,20,10))))
Conditional Formatting defined as shown below.
Red = 100 value where it comes from a 1
Yellow = 100 value where it comes from more than 1 factor, but without a 1.
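To make the logic of the array formula easier to follow, here is a rough Python equivalent. It is a sketch of the capped sum only, not of the conditional formatting; the function names are made up, and the sample rows are the ones from the question's example table.

```python
def rank_points(rank):
    """Map rank 1..10 to 100..10 points; anything else (e.g. "NULL") scores 0."""
    if isinstance(rank, int) and 1 <= rank <= 10:
        return 110 - 10 * rank
    return 0

def final_score(ranks, cap=100):
    """Sum the points of all ranked categories and cap the result."""
    return min(cap, sum(rank_points(r) for r in ranks))

products = {
    "XYZ": [3, 1, 7, 7],
    "ABC": ["NULL", 6, "NULL", 2],
    "LMN": [4, "NULL", "NULL", "NULL"],
}
for name, ranks in products.items():
    print(name, final_score(ranks))  # XYZ 100, ABC 100, LMN 70
```

The results match the Final Score column of the example: a rank of 1 reaches the cap on its own, and several lower ranks can also add up to it.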
