Independence of random variables - statistics

Let U = the number of trials needed to get the first head
and
let V = the number of trials needed to get two heads in repeated tosses of a fair coin.
Are U and V independent random variables?
I would say they are dependent if
u = number of trials before first head appears
v = number of trials to get 2nd head after the event u has occurred
Please help me understand it!
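One way to check this, offered as a minimal sketch rather than a full answer: U and V are independent only if P(U = u, V = v) = P(U = u) P(V = v) for every pair of values. The snippet below enumerates the four equally likely outcomes of the first two tosses and compares P(U = 1, V = 2) with P(U = 1) P(V = 2). The variable names are illustrative and not part of the question.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of the first two tosses of a fair coin.
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, len(outcomes))

# U = 1 means the first toss is a head; V = 2 means both tosses are heads.
p_u1 = sum(p_each for o in outcomes if o[0] == "H")
p_v2 = sum(p_each for o in outcomes if o == ("H", "H"))
p_joint = sum(p_each for o in outcomes if o[0] == "H" and o == ("H", "H"))

# Independence would require p_joint == p_u1 * p_v2.
print(p_u1, p_v2, p_joint, p_u1 * p_v2)
```

The joint probability (1/4) differs from the product (1/8), so U and V as defined at the top of the question are dependent. Intuitively, V can never be smaller than U + 1.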

Find global maximum of an equation using python

I am trying to write some code to find the global maximum of a function, e.g. f = -x**4.
Here is what I have got at the moment.
import sympy
x = sympy.symbols('x')
f = -x**4
df = sympy.diff(f,x)
ans = sympy.solve(df,x)
Then I am stuck. How should I substitute ans back into f, and how can I tell whether a solution is a maximum rather than a minimum or a saddle point?
If you are just looking for the global maximum and nothing else, then there is already a function for that. See the following:
from sympy import *
x = symbols('x')
f = -x**4
print(maximum(f, x)) # 0
If you want more information, such as the x value that gives that max or the local maxima, you'll have to do more manual work. In the following, I find the critical points as you have done above and then show the values of f at those critical points.
diff_f = diff(f, x)
critical_points = solve(diff_f, x)
print(critical_points) # x values
for point in critical_points:
    print(f.subs(x, point)) # f(x) values
This can be extended to include the second derivative test as follows:
d_f = diff(f, x)
dd_f = diff(f, x, 2)
critical_points = solve(d_f, x)
for point in critical_points:
    if dd_f.subs(x, point) < 0:
        print(f"Local maximum at x={point} with f({point})={f.subs(x, point)}")
    elif dd_f.subs(x, point) > 0:
        print(f"Local minimum at x={point} with f({point})={f.subs(x, point)}")
    else:
        print(f"Inconclusive at x={point} with f({point})={f.subs(x, point)}")
To find the global max, evaluate the function at all of your critical points and pick the largest value. This assumes a global maximum exists; on an unbounded domain, also check how f behaves as x goes to plus or minus infinity.
outputs = [f.subs(x, point) for point in critical_points]
optimal_x = [point for point in critical_points if f.subs(x, point) == max(outputs)]
print(f"The values x={optimal_x} all produce a global max at f(x)={max(outputs)}")
The above should work for most elementary functions. Apologies for the inconsistent naming of variables.
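The largest critical value is only a global maximum if the function does not grow beyond it far from the origin. A minimal sketch of that extra check, assuming f is defined and differentiable on the whole real line (the names candidates, tails and has_global_max are illustrative):

```python
from sympy import symbols, diff, solve, limit, oo

x = symbols('x', real=True)
f = -x**4

# Candidate values: f evaluated at every critical point.
critical_points = solve(diff(f, x), x)
candidates = [f.subs(x, p) for p in critical_points]
best = max(candidates)

# Behaviour of f far from the origin, in both directions.
tails = [limit(f, x, oo), limit(f, x, -oo)]

# The best critical value is a global max only if neither tail exceeds it.
has_global_max = all(t <= best for t in tails)
print(critical_points, best, tails, has_global_max)
```

For f = -x**4 both limits are -oo, so the value 0 at x = 0 is the global maximum even though the second derivative test is inconclusive there.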
If you are struggling with simple things like substitution, I suggest going through the docs for an hour or two.

Given an exponential probability density function, how to generate random values using the random generator in Excel?

Based on a set of experiments, a probability density function (PDF) for an exponentially distributed variable was generated. Now the goal is to use this function in a Monte Carlo simulation. I am vaguely familiar with PDFs and random number generators, especially for normal and log-normal distributions. However, I am not quite able to figure this out. It would be great if someone could help.
Here's the function:
f(l) = (γ/(2R)) * exp(-γl/(2R)) * (1 - exp(-γ))^(-1) * H(2R - l)
f is the probability density function,
1/γ is the mean of the distribution,
R is a known, fixed value,
H is the Heaviside step function,
l is the variable that is exponentially distributed
I don't know how to do it in Excel, but with the inverse transform method it is easy to get the answer (assuming there is a RANDOM() function that returns uniform numbers in the range [0, 1], and that LOG is the natural logarithm):
l = -(2R/γ)*LOG(1 - RANDOM()*(1-EXP(-γ)))
It is easy to check the boundary values:
if RANDOM()=0, then l = 0
if RANDOM()=1, then l = 2R
UPDATE
So there is a PDF
PDF(l|R,γ) = (γ/(2R)) * exp(-lγ/(2R)) / (1 - exp(-γ)), l in the range [0, 2R]
First, check that it is normalized:
∫ PDF(l|R,γ) dl from 0 to 2R = 1
OK, it is normalized.
Then compute CDF(l|R,γ):
CDF(l|R,γ) = ∫ PDF(l'|R,γ) dl' from 0 to l =
(1 - exp(-lγ/(2R))) / (1 - exp(-γ))
Check again: CDF(l=2R|R,γ) = 1, good.
Now set CDF(l|R,γ) = RANDOM(), solve it for l, and you get your sampling expression. Check it with RANDOM() returning 0 and with RANDOM() returning 1; you should get the end points of the l interval.
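The same inverse transform can be sketched in Python to sanity-check the sampling expression before porting it to a spreadsheet. This is only an illustration: the function name sample_l and the optional u argument are assumptions added here, and random.random() stands in for RANDOM().

```python
import math
import random

def sample_l(R, gamma, u=None):
    """Draw l from the truncated exponential PDF on [0, 2R] by inverting its CDF."""
    if u is None:
        u = random.random()  # uniform on [0, 1)
    # Solve (1 - exp(-l*gamma/(2R))) / (1 - exp(-gamma)) = u for l.
    return -(2.0 * R / gamma) * math.log(1.0 - u * (1.0 - math.exp(-gamma)))

# Boundary checks: u = 0 gives l = 0, u = 1 gives l = 2R.
print(sample_l(1.0, 2.0, u=0.0), sample_l(1.0, 2.0, u=1.0))

# A few random draws, all of which should fall inside [0, 2R].
draws = [sample_l(1.0, 2.0) for _ in range(1000)]
print(min(draws), max(draws))
```

Passing u explicitly makes the end points easy to test; leaving it out gives ordinary random draws.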

Can a random number be added to a set such that its mean and variance will not change?

I have a set of 4 values. I want to generate a random number that will be added to each element of the set, but after adding it, the mean and variance should not change.
That is, the mean and variance of the set before adding the number should be the same as after adding it. I was trying to approach this with a genetic algorithm. Can anyone please give me more insight on this?
Let us suppose your set is called x. Let us also suppose that you will add values to x to make it y. In R, this could be achieved by
x <- rnorm(4, mean = 5, sd = 2)
x
[1] 5.124843 3.070105 4.444706 6.657949
rand <- rnorm(length(x), mean = 0, sd = sd(x)) / 1000 # Divide by 1000 so rand has minimal
# impact on the mean and variance of x when added
y <- x + rand
y
[1] 5.124799 3.066977 4.444524 6.656452
mean(x); mean(y)
[1] 4.824401
[1] 4.823188
Now this will show some incremental change, but to minimize it you can scale rand by dividing it by a large number (as I did) or multiplying it by a small number. Another way you can go about this is by using the jitter function in R. This function samples from a small uniform distribution centered about 0 and adds that noise to the data.
x <- c(1, -.5, 2, -1.2)
jitter(x)
[1] 1.1117953 -0.5391391 2.0695948 -1.1145638
By default, jitter picks the noise scale itself from the spread of your entire x vector; its factor and amount arguments let you adjust that scale.
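If the mean and variance must match exactly rather than approximately, one option (a different technique from the scaling shown above, sketched here in Python under the assumption that a different noise value may be added to each element) is to add the noise first and then shift and rescale the result back to the original mean and standard deviation. The function name and the noise_sd and seed parameters are hypothetical.

```python
import random
import statistics

def perturb_preserving_moments(x, noise_sd=0.1, seed=0):
    """Add Gaussian noise to x, then restore the original mean and sample sd."""
    rng = random.Random(seed)
    z = [v + rng.gauss(0.0, noise_sd) for v in x]
    mx, sx = statistics.fmean(x), statistics.stdev(x)
    mz, sz = statistics.fmean(z), statistics.stdev(z)
    # Standardize the noisy values, then map them back to the original scale.
    return [mx + (v - mz) * sx / sz for v in z]

x = [5.124843, 3.070105, 4.444706, 6.657949]
y = perturb_preserving_moments(x)
print(statistics.fmean(x), statistics.fmean(y))
print(statistics.variance(x), statistics.variance(y))
```

The individual values change, while the mean and sample variance agree up to floating-point rounding. Adding one and the same nonzero number to every element cannot do this, because it shifts the mean by that number.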

Statistical analysis error? Python 3, proofread please

The code below generates two random integers within range specified by argv, tests if the integers match and starts again. At the end it prints some stats about the process.
I've noticed, though, that increasing the value of argv reduces the percentage of tested possibilities exponentially.
This seems counterintuitive to me, so my question is: is this an error in the code, or are the numbers real, and if so, what am I not thinking about?
#!/usr/bin/python3
import sys
import random
x = int(sys.argv[1])
a = random.randint(0,x)
b = random.randint(0,x)
steps = 1
combos = x**2
while a != b:
    a = random.randint(0,x)
    b = random.randint(0,x)
    steps += 1
percent = (steps / combos) * 100
print()
print()
print('[{} ! {}]'.format(a,b), end=' ')
print('equality!'.upper())
print('steps'.upper(), steps)
print('possble combinations = {}'.format(combos))
print('explored {}% possibilitys'.format(percent))
Thanks
EDIT
For example:
./runscrypt.py 100000
will return something like:
[65697 ! 65697] EQUALITY!
STEPS 115867
possble combinations = 10000000000
explored 0.00115867% possibilitys
"explored 0.00115867% possibilitys" <-- This number is too low?
This experiment follows a geometric distribution.
Let Y be the random variable for the number of iterations until a match is seen. Then Y is geometrically distributed with parameter p ≈ 1/x, the probability of generating two matching integers. (Strictly, randint(0, x) includes both endpoints, so there are x + 1 possible values and p = 1/(x + 1); for large x the difference is negligible.)
The expected value is E[Y] = 1/p, where p is that probability (the proof of this can be found in the link above). So in your case the expected number of iterations is 1/(1/x) = x.
The number of combinations is x^2.
So the expected fraction of explored possibilities is x/(x^2) = 1/x.
As x approaches infinity, this number approaches 0.
In the case of x = 100000, the expected fraction of explored possibilities is 1/100000 = 0.001%, which is very close to your numerical result.
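The expected number of iterations can also be checked empirically. The sketch below repeats the experiment from the question many times for a small x and averages the step counts; the helper names, the trial count and the seed are illustrative choices, not part of the original script.

```python
import random

def steps_until_match(x, rng):
    """Count draws of two integers in [0, x] until they are equal."""
    steps = 1
    while rng.randint(0, x) != rng.randint(0, x):
        steps += 1
    return steps

def mean_steps(x, trials=20000, seed=1):
    """Average number of steps over many independent runs."""
    rng = random.Random(seed)
    return sum(steps_until_match(x, rng) for _ in range(trials)) / trials

x = 9
# randint(0, x) has x + 1 possible values, so the match probability is 1/(x + 1).
print(mean_steps(x), x + 1)
```

The average should land close to x + 1, which is tiny compared with the (x + 1)**2 possible pairs once x is large. That is why the explored percentage shrinks as x grows.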

counting results from a defined matrix

So I am very new to programming and Haskell is the first language that I'm learning. The problem I'm having is probably a very simple one, but I simply cannot find an answer, no matter how much I search.
So basically what I have is a 3x3 matrix, and each of the elements has a number from 1 to 3. This matrix is predefined; all I need to do now is create a function which, when I input 1, 2 or 3, tells me how many elements in the matrix have this value.
I've been trying different things but none of them appear to be allowed. For example, I've defined 3 variables, one for each of the possible numbers, and tried to update them by
value w =
let a=0
b=0
c=0
in
if matrix 1 1==1 then a=a+1 else if matrix 1 1==2 then b=b+1
etc. etc. for every combination and field.
<- ignoring the wrong syntax which I'm really struggling with, the fact that I can't use a "=" with "if, then" is my biggest problem. Is there a way to bypass this or maybe a way to use "stored data" from previously defined functions?
I hope I made my question somewhat clear, as I said I've only been at programming for 2 days now and I just can't seem to find a way to make this work!
By default, Haskell doesn't use updateable variables. Instead, you typically make a new value, and pass it somewhere else (e.g., return it from a function, add it into a list, etc).
I would approach this in two steps: get a list of the elements from your matrix, then count the elements with each value.
-- get list of elements using list comprehension
elements = [matrix i j | i <- [1..3], j <- [1..3]]
-- define counting function
count (x,y,z) (1:rest) = count (x+1,y,z) rest
count (x,y,z) (2:rest) = count (x,y+1,z) rest
count (x,y,z) (3:rest) = count (x,y,z+1) rest
count scores [] = scores
-- use counting function
(a,b,c) = count (0,0,0) elements
There are better ways of accumulating scores, but this seems closest to what your question is looking for.
Per comments below, an example of a more idiomatic counting method, using foldl and an accumulation function addscore instead of the count function above:
-- define accumulation function
addscore (x,y,z) 1 = (x+1,y,z)
addscore (x,y,z) 2 = (x,y+1,z)
addscore (x,y,z) 3 = (x,y,z+1)
-- use accumulation function
(a,b,c) = foldl addscore (0,0,0) elements
