How to solve two balls select event without replacement - statistics

Q) A box contains 4 red balls, 3 green balls and 3 blue balls. Two balls are selected at
random without replacement. Let X represent the number of red balls in the sample and
Y the number of green balls in the sample.
a) Arrange the different pairs of values of (X, Y ) as the cells in a table, each cell being
filled with the probability of that pair of values occurring, i.e. provide the joint
probability distribution.
b) What does the random variable Z = 2 - X - Y represent?
c) Calculate Cov(X, Y ).
d) Calculate P(X = 1 | -2 < X - Y < 2).
I couldn't work out how to approach part a) of this question, or the parts that follow.

To solve this question, first draw a probability tree for the two draws. The first thing to take from the question is that the draws are not independent events, because the balls are selected without replacement. So you can create a tree like this:
For part a), you have to build the joint table of X and Y.
0, 1 and 2 are the only possible values that each variable can take.
The cases that need care are (X = 1, Y = 0) and (X = 0, Y = 1), because each one can occur along two branches of the tree: the ball of that colour can be drawn either first or second.
So this is the table you get from the tree.
For part b), Z = 2 - X - Y represents the number of blue balls in the selected sample.
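As a cross-check on the tree, here is a minimal Python sketch that builds the joint table by counting combinations instead of following branches. This is a swapped-in method, not the tree approach described above: it assumes each of the C(10, 2) = 45 unordered pairs of balls is equally likely. The names `joint_pmf`, `cov_xy` and `cond` are made up for this illustration. The same table also gives parts c) and d).

```python
from fractions import Fraction
from math import comb

RED, GREEN, BLUE, DRAWS = 4, 3, 3, 2
TOTAL = comb(RED + GREEN + BLUE, DRAWS)  # number of equally likely pairs

def joint_pmf():
    """Return {(x, y): P(X = x, Y = y)} for x red and y green balls."""
    pmf = {}
    for x in range(DRAWS + 1):
        for y in range(DRAWS + 1 - x):
            z = DRAWS - x - y  # blue balls in the sample (part b)
            ways = comb(RED, x) * comb(GREEN, y) * comb(BLUE, z)
            pmf[(x, y)] = Fraction(ways, TOTAL)
    return pmf

pmf = joint_pmf()

# Part c): Cov(X, Y) = E[XY] - E[X] E[Y]
ex = sum(x * p for (x, _), p in pmf.items())
ey = sum(y * p for (_, y), p in pmf.items())
exy = sum(x * y * p for (x, y), p in pmf.items())
cov_xy = exy - ex * ey

# Part d): P(X = 1 | -2 < X - Y < 2)
event = {xy: p for xy, p in pmf.items() if -2 < xy[0] - xy[1] < 2}
cond = sum(p for (x, _), p in event.items() if x == 1) / sum(event.values())

for xy in sorted(pmf):
    print(xy, pmf[xy])
print("Cov(X, Y) =", cov_xy)
print("P(X = 1 | -2 < X - Y < 2) =", cond)
```

Exact fractions are used so the cells can be compared directly with values worked out by hand from the tree.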

Related

How to distribute n identical balls into k identical boxes having different capacities

Suppose there are 4 boxes having capacities 10, 5, 2, 1. Please help me find the number of ways to distribute 16 identical balls into these four boxes. Each box can hold from 0 balls up to its capacity.
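We can simplify the question to
"Find the number of ways to distribute 2 identical balls into 4 boxes having capacities 2, 2, 2 and 1",
because we can count the empty spaces in the original instead: the total capacity is 18, so 16 balls leave exactly 2 empty spaces.
1) If we put a ball in the 4th box, then we have 3 boxes for the other one: 3 combinations.
2) Now we don't put a ball in the 4th box at all, so we have 3 boxes with 2 places each. Both balls in the same box gives 3 ways, and the balls in two different boxes gives 3 ways: 6 combinations.
In total, we have 3+6=9 combinations.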
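The total of 9 can be checked by brute force. The sketch below is only an illustration (the function name `count_distributions` is made up here), and it assumes the boxes are told apart by their capacities. It enumerates every possible fill level per box and keeps those that use exactly the required number of balls.

```python
from itertools import product

def count_distributions(balls, capacities):
    """Count ways to place `balls` identical balls into boxes with the given capacities."""
    fills = product(*(range(cap + 1) for cap in capacities))
    return sum(1 for fill in fills if sum(fill) == balls)

print(count_distributions(16, [10, 5, 2, 1]))  # original problem
print(count_distributions(2, [2, 2, 2, 1]))    # simplified "empty spaces" problem
```

Both calls give the same count, which is the point of the simplification. Enumeration is fine at this size but grows with the product of the capacities.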
Let the parts be a_1,...,a_k with sum n. There are n! arrangements in total; the items within each part a_i are interchangeable, so we divide by every a_i!, and then divide by k!. The final answer will be:
n!/(a_1! * a_2! * ... * a_k! * k!).
This can be calculated in O(k) with O(n) preprocessing using Fermat's little theorem.

Plotting Number of Events that Occur in an Interval Histogram

In Excel, I have two fields per row, a start date/time and an end date/time. I am looking to plot a histogram that shows how many of the rows' intervals contain the time on the x axis.
For example, some start and end times could be: [1,3], [3,4], [7,9], and [7,8]
And I want an output similar to:
    x       x x
x x x x     x x x
1.2.3.4.5.6.7.8.9
How can this be done?
One way is to split your tuples (for example with Text to Columns) into columns A:B, starting in Row 3 (here down to Row 6), and to series-fill D1:L1 (or to suit), starting with D1 = 1 and increasing by 1 per column.
Then in D3 copied across to L3 and D3:L3 copied down to Row6:
=IF(AND(D$1>=$A3,D$1<=$B3),"X","")
and in D2 copied across to L2:
=COUNTIF(D3:D6,"X")
Then make a column chart (INSERT > Charts) from D1:L2.
However, I may have misunderstood because there is no particular time significance to the above - the data is just treated like integers.
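The same counting logic, outside Excel, can be sketched in Python. This is an illustration only, with the made-up name `coverage_counts`. Like the formulas above, it treats both interval endpoints as inclusive and the times as plain integers.

```python
def coverage_counts(intervals, xs):
    """For each x, count how many (start, end) intervals contain it, endpoints included."""
    return {x: sum(1 for start, end in intervals if start <= x <= end) for x in xs}

intervals = [(1, 3), (3, 4), (7, 9), (7, 8)]
counts = coverage_counts(intervals, range(1, 10))

# Print a simple text histogram, one row per x value.
for x, n in counts.items():
    print(x, "x" * n)
```

The inner comparison plays the role of the `IF(AND(...))` cell, and the sum plays the role of `COUNTIF`.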

How to make PivotChart with line breaks

I have data like the following:
x y f
1 1 1.2
1 2 1.4
1 3 1.6
3 1 3.2
3 2 3.4
3 3 3.6
5 1 5.2
5 2 5.4
5 3 5.6
If you insert a pivot chart, you can plot f vs x and y using a line chart. The plot has two stacked x-axes: the lower x-axis has the values 1 3 5 corresponding to x, and the upper x-axis has the values 1 2 3 for each value on the lower x-axis, representing x = 1 and y = 1 2 3, then x = 3 and y = 1 2 3, and x = 5 and y = 1 2 3. The plot shows a single continuous line from left to right. What I would like is for the line to break when x changes value, so there are three short lines showing the influence of y for constant values of x.
This link makes a chart similar to what I'm describing in the answer. In terms of that figure, what I want is for the line to break every time the year changes. But the answer there, and its discussion, doesn't give what I'm looking for. The only approach that I can think of is to modify the PivotTable data by hand and add a row at the location where the data breaks. I tried to do something like that at work, but before modifying the table, I copied the table as values to a separate location. With the new data table, I was not able to create the plot with two x-axes. If I could create the plot, I could add a second row where y = 3, with NA() for f, which should create the break in the proper location.
For something that looks like:
Select each of the second and subsequent y 1 values (individually):
and Format Data Point..., Line, No line.
(BTW IMO better suited to Super User.)

Excel - Multiply Until Total Reached

I want to multiply x*y until x>=20, then multiply z by that value and have the results displayed as two values, the multiple and multiple*z
The question behind the formula is, how many boxes of x capacity do I need to have a total capacity of 20 liters and how much does that cost.
x = volume of bottle
y = number of bottles in a box
z = price per box
This could be done very easily by hand, but I've been playing (with little effect) in excel for a while and would like a solution.
I hope that makes sense
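For reference, the calculation behind the question can be sketched in a few lines of Python. This is an illustration rather than an Excel solution: the function name `boxes_needed` and the sample values are made up, and it assumes the target is 20 liters and that only whole boxes can be bought.

```python
import math

def boxes_needed(x, y, z, target=20):
    """Return (number of boxes, total cost) to reach at least `target` liters.

    x: volume of one bottle, y: bottles per box, z: price per box.
    """
    box_capacity = x * y
    boxes = math.ceil(target / box_capacity)  # round up to whole boxes
    return boxes, boxes * z

# Hypothetical example: 0.5 l bottles, 6 per box, 4.50 per box.
print(boxes_needed(0.5, 6, 4.5))
```

Rounding up with `math.ceil` replaces the "multiply until the total is reached" loop, since the smallest whole multiple that reaches the target is the target divided by the box capacity, rounded up.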
I rather think what you would like is the formula provided by @Jeeped but for:
I want to multiply x*y until x>=20, then multiply z by that value and have the results displayed as two values, the multiple and multiple*z
label two arrays from 1 to 20 for columns and rows as shown, populate V1 with the price per box and in B2:
=IF(AND($A2*B$1>20,A2>=20),"",$A2*B$1)
and in X2:
=IF(B2="","",$V$1*B2)
with both formulae copied across 19 columns and those two sets of 20 formulae then copied down 19 rows. The result should be similar to:

Rating the straightness of a line

I have a data set that defines a set of points on a 2-dimensional Cartesian plane. Theoretically, those points should form a line, but that line may be perfectly horizontal, perfectly vertical, and anything in between.
I would like to design an algorithm that rates the 'straightness' of that line.
For example, the following data sets would be perfectly straight:
Y = 2/3x + 4
X | Y
---------
-3 | 2
0 | 4
3 | 6
Y = 4
X | Y
---------
1 | 4
2 | 4
3 | 4
X = -1
X | Y
---------
-1 | 7
-1 | 8
-1 | 9
While this one would not:
X | Y
---------
-3 | 2
0 | 5
3 | 6
I think it would work to minimize the sum of the squares of the distances from each point to a line (usually called a regression line), then determine the average distance from each point to the line. Thus, a perfectly straight line would have an average distance of 0.
Because the data can represent a line that is vertical, as I understand it, the usual least-squares regression line won't work for this data set. A perpendicular least-squares regression line might work, but I've had little luck finding an implementation of one.
I am working in Excel 2010 VBA, but I should be able to translate any reasonable algorithm.
Thanks,
PaulH
Things like RSQ and LinEst won't work for this because I need a universal measurement that includes vertical lines. As a line's slope approaches infinity (vertical), its RSQ approaches 0 even if the line is perfectly straight or nearly so.
-PaulH
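A perpendicular least-squares fit of the kind described in the question can be sketched as follows. This is a minimal illustration in Python rather than VBA, with the made-up name `straightness`. It assumes the best-fit line passes through the centroid and lies along the direction of greatest spread, which is the line that minimizes the sum of squared perpendicular distances. It then returns the average perpendicular distance, so 0 means perfectly straight, whatever the orientation.

```python
import math

def straightness(points):
    """Mean perpendicular distance from the points to their orthogonal least-squares line."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Angle of the direction of greatest spread (the fitted line's direction).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    # Unit normal to the fitted line.
    nx, ny = -math.sin(theta), math.cos(theta)
    return sum(abs((x - mx) * nx + (y - my) * ny) for x, y in points) / n

print(straightness([(-3, 2), (0, 4), (3, 6)]))   # sloped, straight
print(straightness([(-1, 7), (-1, 8), (-1, 9)])) # vertical, straight
print(straightness([(-3, 2), (0, 5), (3, 6)]))   # not straight
```

The result is in the same units as the data, so it is not scale-free. Dividing by the spread along the line would be one way to normalize it, if that is needed.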
Sounds like you are looking for R^2, the coefficient of determination.
Basically, you take the residual sum of squares, divide by the total sum of squares and subtract the result from 1.
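As a rough sketch of that recipe (in Python for illustration, with the made-up name `r_squared`), using an ordinary least-squares fit of y on x. Returning `None` when all x values or all y values are equal is an assumption of this sketch, since the fit or the ratio is undefined there.

```python
def r_squared(points):
    """R^2 = 1 - SS_res / SS_tot for an ordinary least-squares fit of y on x."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    ss_tot = sum((y - my) ** 2 for _, y in points)
    if sxx == 0 or ss_tot == 0:
        return None  # vertical or horizontal data: R^2 is undefined
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    return 1 - ss_res / ss_tot

print(r_squared([(-3, 2), (0, 4), (3, 6)]))   # points on a sloped line
print(r_squared([(-3, 2), (0, 5), (3, 6)]))   # points off the line
print(r_squared([(-1, 7), (-1, 8), (-1, 9)])) # vertical line
```

The undefined cases are exactly the vertical and horizontal lines from the question, which is why R^2 alone does not cover them.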
Use a Linear Regression. The "straightness" of the line is the R^2 value.
A value of 1 for the R^2 value implies it is perfectly straight. Values decreasing toward 0 imply increasing error in the regression, and thus the line is less and less "straight".
Could you try to catch the case of the vertical line before running the least squares regression? If all x-values are the same, then the line is perfectly straight, and there is no need to calculate an r^2 value.
Rough idea:
1. translate all coordinates to absolute values
2. calculate tan of current x/y
3. calculate tan of difference in x/y between current x/y and next x/y
4. difference in tan can give running deviation
Yes, use ordinary least squares method. Just use the Slope and Intercept functions in a worksheet. I expect there is a simple way to call these from the VBA codebehind.
Here's the VBA info. for R-Squared: http://www.pcreview.co.uk/forums/thread-1009945.php

Resources