Is the sum of a random number of random variables Gaussian?

In random service (queueing) systems we encounter problems like this: ν customers arrive at a service system during a certain period, each customer requires a service time ζi, and the service times are independent of ν. The total service time provided by the system is Sν = ζ1 + ζ2 + … + ζν. If ζ1, ζ2, …, ζν are independent and identically distributed and each follows a Gaussian distribution, while ν follows a uniform distribution, is the total time Sν Gaussian?

Welcome to SO. Yes, for i.i.d. Gaussian variables the sum will be a normal distribution, with the resulting mean being the sum of the means and the variance being the sum of the variances. But there are some assumptions. There are more details in the answer here[1]:

Indeed, a random variable Z equal to the sum of n independent random variables Xi, each of which is drawn from a Gaussian distribution (not necessarily the same one), will have a Gaussian distribution:
if Xi ~ N(ui, si**2), where ui is the mean value for Gaussian distribution N(...), and si is its standard deviation, then Z has this distribution
Z ~ N(U, S**2)
where U = u1 + ... + un, S**2 = s1**2 + ... + sn**2
See the Central Limit Theorem for details. It states that even if your Xi are not Gaussian RVs but follow some other distribution, the total Z will still approach a Gaussian distribution if n is large (many Xi).
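For a quick sanity check, here is a minimal Monte Carlo sketch in Python (NumPy; the means and standard deviations are made-up illustration values) showing that the sum of a fixed number n of independent Gaussians has mean U = u1 + ... + un and variance S**2 = s1**2 + ... + sn**2:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative (made-up) parameters: n independent Gaussians with
# different means and standard deviations.
mus = np.array([1.0, -2.0, 0.5, 3.0])
sigmas = np.array([0.5, 1.0, 2.0, 0.3])

# Draw many realizations of Z = X1 + ... + Xn.
samples = rng.normal(mus, sigmas, size=(100_000, len(mus))).sum(axis=1)

print("empirical mean:", samples.mean(), " expected:", mus.sum())
print("empirical var: ", samples.var(),  " expected:", (sigmas**2).sum())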

Related

intensity of point process - weights with covariate - spatstat

I am trying spatstat for a specific case. In my shapefile of roads, I have attributes for the speed and the % of heavy vehicles on each road. It has been observed that severe accidents are more likely on roads with high speeds and more heavy vehicles (because the road is not properly access controlled and pedestrians cross it). We know that accidents occur at a certain rate (per 5 km stretch).
I would like to generate a random Poisson point pattern with that rate, but weighted so that points occur more often on roads with high speed (or a high % of trucks), and if possible also to include the second variable, the % of trucks.
What is the best way to model the two aspects to make a small proof of concept? I have read (portions of) the spatstat book and section on influence of covariates on intensity, but this is still unclear to me.
Thanks
The spatstat function rpoislpp generates a Poisson random point pattern on the network with a given intensity. In this case, you want a spatially-varying intensity, which can be specified by a function of spatial location. That is, you want something like rpoislpp(f, L) where L is the linear network and f is the intensity function.
I assume you have obtained values of the covariate (like speed limit and fraction of trucks) for each road. Then you need to build a function that looks up these values at any spatial location on the network. Once you have this, you can write the intensity function in terms of it.
To start, suppose you have a network L (object of class linnet). The segments of the network can be indexed in the original order given when you specified them: or you can extract these segments by S <- as.psp(L). We need a vector z giving the covariate values for each of these segments (so this will be a numeric vector of length n=nsegments(S)). Then z[i] is the covariate value along segment i. (Note: if you have covariate values for each road, where a road consists of multiple segments of L, then you first need to figure out which segments of L belong to each road, and construct z.)
Next do the following:
Zfun <- linfun(function(x,y,seg,tp) { z[seg] }, L)
This creates a function on the linear network (class linfun) that evaluates the covariate at any spatial location on L. To check it's built correctly, type plot(Zfun).
Now suppose you want the point process intensity to be lambda = exp(3*Z+2). Then do
lam <- function(x,y,seg,tp) { exp(3 * z[seg] + 2) }
lambda <- linfun(lam, L)
(Needless to say, you can write any mathematical expression in the braces; and you can have more than one covariate, etc.)
Finally generate the random points:
X <- rpoislpp(lambda, L)

Sampling a hemisphere using an arbitrary distribution

I am writing a ray tracer and I wish to fire rays from a point p into a hemisphere above that point according to some distribution.
1) I have derived a method to uniformly sample within a solid angle (defined by theta) above p:
phi = 2*pi*X_1
alpha = arccos (1-(1-cos(theta))*X_2)
x = sin(alpha)*cos(phi)
y = sin(alpha)*sin(phi)
z = -cos(alpha)
where X_1 and X_2 are independent uniform random numbers in [0, 1].
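For reference, a small Python sketch (NumPy) of those formulas: uniform sampling within a cone of half-angle theta around the -z axis, matching the question's convention z = -cos(alpha):

import numpy as np

def sample_cone_uniform(theta, rng=None):
    # Uniformly sample a direction within a solid angle of half-angle theta
    # around the -z axis, using the formulas above.
    rng = rng if rng is not None else np.random.default_rng()
    x1, x2 = rng.random(2)                                  # X_1, X_2 uniform in [0, 1)
    phi = 2.0 * np.pi * x1
    alpha = np.arccos(1.0 - (1.0 - np.cos(theta)) * x2)
    return np.array([np.sin(alpha) * np.cos(phi),
                     np.sin(alpha) * np.sin(phi),
                     -np.cos(alpha)])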
That works and I'm pretty happy with it. But my question is: what happens if I do not want a uniform distribution?
I have used the algorithm on page 27 from here and I can draw samples from a piecewise arbitrary distribution. However if I simply say:
alpha = arccos(1-(1-cos(theta))*B_1)
where B_1 is a random number generated from an arbitrary distribution.
It doesn't behave nicely...What am I doing wrong? Thanks in advance. I really really need help on this
Additional:
Perhaps I am asking a leading question. Taking a step back:
Is there a way to generate points on a hemisphere according to an arbitrary distribution? I have a method for uniformly sampling a hemisphere and one for cosine-weighted hemisphere sampling (pp. 663-669, pbrt.org).
With a uniform distribution, you can just average the sample results and obtain the correct result. This is equivalent to dividing each sample result by the sample's probability density function (PDF); in the case of a uniform distribution the weight is the same for every sample (i.e. the same as averaging the results).
With an arbitrary distribution, you still have to divide each sample result by its PDF, but the PDF now depends on the arbitrary distribution you are using and varies from sample to sample. I assume your error is here.
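To make that concrete, here is a hedged Python sketch of a Monte Carlo estimator over the hemisphere: each sample's contribution is divided by that sample's PDF value. For uniform hemisphere sampling the PDF is the constant 1/(2*pi); for an arbitrary sampler the same loop works as long as the sampler returns its own PDF value. The integrand f is a placeholder standing in for whatever your ray tracer actually evaluates:

import numpy as np

rng = np.random.default_rng(1)

def f(direction):
    # Placeholder integrand (here just a cosine term); replace with whatever
    # your ray tracer accumulates for a sampled direction.
    return max(direction[2], 0.0)

def sample_hemisphere_uniform(rng):
    # Uniform sampling of the hemisphere around +z; the PDF is the constant 1/(2*pi).
    u1, u2 = rng.random(2)
    z = u1
    r = np.sqrt(max(0.0, 1.0 - z * z))
    phi = 2.0 * np.pi * u2
    return np.array([r * np.cos(phi), r * np.sin(phi), z]), 1.0 / (2.0 * np.pi)

n = 10_000
total = 0.0
for _ in range(n):
    w, pdf = sample_hemisphere_uniform(rng)
    total += f(w) / pdf          # divide each sample's contribution by that sample's PDF
estimate = total / n             # estimates the integral of f over the hemisphere (= pi here)
print(estimate)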

Find the period in a random distribution

I have objects which are randomly distributed along the X axis. The objects have a periodic distribution (with period m), with a slight variation of their position around multiples of m.
The graph here shows a distribution for m=100.
Is there a way to estimate m using the statistics of the distribution?
Thanks!
Do you know the error distribution? If so (say, for example, it's a zero-mean Gaussian with variance \sigma^2), then you can calculate the likelihood of the data as a function of the unknown period m. Once you can do this, you can solve an optimization problem and find the period that has maximum likelihood.
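A minimal Python sketch of that idea, under one simple reading of the setup: assume the k-th object (in order) sits near k*m with zero-mean Gaussian error of known sigma, write the log-likelihood as a function of the candidate period m, and pick the m that maximizes it. The period, sigma, and candidate range below are made-up illustration values:

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data under the assumed model: the k-th object sits near k*m
# with zero-mean Gaussian error (m = 100 and sigma = 5 are made up).
true_m, sigma, n = 100.0, 5.0, 50
k = np.arange(1, n + 1)
x = k * true_m + rng.normal(0.0, sigma, size=n)

def log_likelihood(m):
    # Gaussian log-likelihood of the observed positions for a candidate period m.
    resid = x - k * m
    return -0.5 * np.sum(resid**2) / sigma**2 - n * np.log(sigma * np.sqrt(2.0 * np.pi))

# Grid search for the maximum-likelihood period (for this particular model a
# closed form also exists: m_hat = sum(k * x) / sum(k**2)).
candidates = np.linspace(90.0, 110.0, 2001)
m_hat = candidates[np.argmax([log_likelihood(m) for m in candidates])]
print("estimated period:", m_hat)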

How do I efficiently estimate a probability based on a small amount of evidence?

I've been trying to find an answer to this for months (it's for a machine learning application). It doesn't seem like it should be a terribly hard problem, but I'm a software engineer and math was never one of my strengths.
Here is the scenario:
I have a (possibly) unevenly weighted coin and I want to figure out the probability of it coming up heads. I know that coins from the same box that this one came from have an average probability of p, and I also know the standard deviation of these probabilities (call it s).
(If other summary properties of the probabilities of other coins aside from their mean and stddev would be useful, I can probably get them too.)
I toss the coin n times, and it comes up heads h times.
The naive approach is that the probability is just h/n - but if n is small this is unlikely to be accurate.
Is there a computationally efficient way (ie. doesn't involve very very large or very very small numbers) to take p and s into consideration to come up with a more accurate probability estimate, even when n is small?
I'd appreciate it if any answers could use pseudocode rather than mathematical notation since I find most mathematical notation to be impenetrable ;-)
Other answers:
There are some other answers on SO that are similar, but the answers provided are unsatisfactory. For example this is not computationally efficient because it quickly involves numbers way smaller than can be represented even in double-precision floats. And this one turned out to be incorrect.
Unfortunately you can't do machine learning without knowing some basic math---it's like asking somebody for help in programming but not wanting to know about "variables" , "subroutines" and all that if-then stuff.
The better way to do this is Bayesian integration, but there is a simpler approximation called "maximum a posteriori" (MAP). It's pretty much like the usual thinking except that you can put in the prior distribution.
Fancy words, but you may ask: where does the h/(h+t) formula come from? It may seem obvious, but it turns out that it is the answer you get when you have "no prior". The method below is the next level of sophistication up, where you add a prior. Going to full Bayesian integration would be the level after that, but it's harder and perhaps unnecessary.
As I understand it, the problem is twofold: first you draw a coin from the box of coins. This coin has a "headsiness" called theta, so that it gives heads on a fraction theta of the flips. The theta for this coin comes from the master distribution, which I will assume is Gaussian with mean P and standard deviation S.
What you do next is to write down the total unnormalized probability (called likelihood) of seeing the whole shebang, all the data: (h heads, t tails)
L = (theta)^h * (1-theta)^t * Gaussian(theta; P, S).
Gaussian(theta; P, S) = exp( -(theta-P)^2/(2*S^2) ) / sqrt(2*Pi*S^2)
This is the meaning of "first draw 1 value of theta from the Gaussian" and then draw h heads and t tails from a coin using that theta.
The MAP principle says, if you don't know theta, find the value which maximizes L given the data that you do know. You do that with calculus. The trick to make it easy is that you take logarithms first. Define LL = log(L). Wherever L is maximized, then LL will be too.
so
LL = h*log(theta) + t*log(1-theta) - (theta-P)^2/(2*S^2) - (1/2)*log(2*pi*S^2)
By calculus to look for extrema you find the value of theta such that dLL/dtheta = 0.
Since the last term with the log has no theta in it you can ignore it.
dLL/dtheta = h/theta - t/(1-theta) + (P-theta)/S^2 = 0.
If you can solve this equation for theta you will get an answer, the MAP estimate for theta given the number of heads h and the number of tails t.
If you want a fast approximation, try doing one step of Newton's method, where you start with your proposed theta at the obvious (called maximum likelihood) estimate of theta = h/(h+t).
And where does that 'obvious' estimate come from? If you do the stuff above but don't put in the Gaussian prior: h/theta - t/(1-theta) = 0 you'll come up with theta = h/(h+t).
If your prior probabilities are really small (as is often the case) rather than near 0.5, then a Gaussian prior on theta is probably inappropriate, as it puts some weight on negative probabilities, which is clearly wrong. More appropriate is a Gaussian prior on log theta (a "lognormal distribution"). Plug it in the same way and work through the calculus.
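Here is a rough Python sketch of the recipe above (the Gaussian-prior version): solve dLL/dtheta = 0 with a few Newton steps, starting from the maximum-likelihood estimate h/(h+t). The prior parameters in the example call are made up:

def map_estimate(h, t, P, S, steps=5):
    # MAP estimate of theta given h heads, t tails and a Gaussian prior N(P, S^2).
    # Solves dLL/dtheta = h/theta - t/(1-theta) + (P-theta)/S^2 = 0 by Newton's method,
    # starting from the maximum-likelihood value h/(h+t), clamped away from 0 and 1.
    theta = min(max(h / float(h + t), 1e-6), 1.0 - 1e-6)
    for _ in range(steps):
        g = h / theta - t / (1.0 - theta) + (P - theta) / S**2      # dLL/dtheta
        dg = -h / theta**2 - t / (1.0 - theta)**2 - 1.0 / S**2      # d2LL/dtheta2
        theta = min(max(theta - g / dg, 1e-6), 1.0 - 1e-6)
    return theta

# Made-up example: 3 heads out of 4 tosses, prior mean 0.5, prior stddev 0.1.
print(map_estimate(h=3, t=1, P=0.5, S=0.1))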
You can use p as a prior on your estimated probability. This is basically the same as doing pseudocount smoothing. I.e., use
(h + c * p) / (n + c)
as your estimate. When h and n are large, then this just becomes h / n. When h and n are small, this is just c * p / c = p. The choice of c is up to you. You can base it on s but in the end you have to decide how small is too small.
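A tiny Python sketch of that smoothing, with made-up numbers, just to show the shrinkage behaviour:

def smoothed_estimate(h, n, p, c):
    # Pseudocount smoothing: shrink the raw frequency h/n toward the prior mean p.
    # c controls the strength of the prior (roughly, how many "virtual" tosses it is worth).
    return (h + c * p) / (n + c)

# With few tosses the estimate stays close to p; with many it approaches h/n.
print(smoothed_estimate(h=2, n=3, p=0.5, c=10))      # ~0.54, dominated by the prior
print(smoothed_estimate(h=200, n=300, p=0.5, c=10))  # ~0.66, close to 200/300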
You don't have nearly enough info in this question.
How many coins are in the box? If it's two, then in some scenarios (for example one coin is always heads, the other always tails) knowing p and s would be useful. If it's more than a few, and especially if only some of the coins are only slightly weighted then it is not useful.
What is a small n? 2? 5? 10? 100? What is the probability of a weighted coin coming up heads/tail? 100/0, 60/40, 50.00001/49.99999? How is the weighting distributed? Is every coin one of 2 possible weightings? Do they follow a bell curve? etc.
It boils down to this: the differences between a weighted/unweighted coin, the distribution of weighted coins, and the number of coins in your box will all decide what n has to be for you to solve this with high confidence.
The name for what you're trying to do is a Bernoulli trial. Knowing the name should be helpful in finding better resources.
Response to comment:
If you have differences in p that small, you are going to have to do a lot of trials and there's no getting around it.
Assuming a uniform distribution of bias, p will still be 0.5 and all standard deviation will tell you is that at least some of the coins have a minor bias.
How many tosses, again, will be determined under these circumstances by the weighting of the coins. Even with 500 tosses, you won't get a strong confidence (about 2/3) detecting a .51/.49 split.
In general, what you are looking for is Maximum Likelihood Estimation. Wolfram Demonstration Project has an illustration of estimating the probability of a coin landing head, given a sample of tosses.
Well, I'm no math man, but I think the simple Bayesian approach is intuitive and broadly applicable enough to be worth putting a little thought into. Others above have already suggested this, but perhaps if you're like me you would prefer more verbosity.
In this lingo, you have a set of mutually-exclusive hypotheses, H, and some data D, and you want to find the (posterior) probabilities that each hypothesis Hi is correct given the data. Presumably you would choose the hypothesis that had the largest posterior probability (the MAP as noted above), if you had to choose one. As Matt notes above, what distinguishes the Bayesian approach from only maximum likelihood (finding the H that maximizes Pr(D|H)) is that you also have some PRIOR info regarding which hypotheses are most likely, and you want to incorporate these priors.
So you have from basic probability Pr(H|D) = Pr(D|H)*Pr(H)/Pr(D). You can estimate these Pr(H|D) numerically by creating a series of discrete probabilities Hi for each hypothesis you wish to test, e.g. [0.0, 0.05, 0.1, ..., 0.95, 1.0], and then determining your prior Pr(Hi) for each Hi -- above it is assumed you have a normal distribution of priors, and if that is acceptable you could use the mean and stdev to get each Pr(Hi) -- or use another distribution if you prefer. With coin tosses, Pr(D|H) is of course determined by the binomial distribution, using the observed number of successes in n trials and the particular Hi being tested. The denominator Pr(D) may seem daunting, but we assume that we have covered all the bases with our hypotheses, so Pr(D) is the summation of Pr(D|Hi)*Pr(Hi) over all Hi.
Very simple if you think about it a bit, and maybe not so if you think about it a bit more.
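A small Python sketch of that discrete computation, assuming (as above) a Gaussian prior over the hypotheses; the prior mean/stddev and the data below are made-up numbers:

import numpy as np
from scipy.stats import binom, norm

# Hypotheses: a grid of candidate values for the heads probability.
H = np.linspace(0.0, 1.0, 21)            # 0.0, 0.05, ..., 1.0

# Made-up prior (mean p, stddev s for the box of coins) and made-up data.
p, s = 0.5, 0.1
h, n = 3, 4                              # 3 heads in 4 tosses

prior = norm.pdf(H, loc=p, scale=s)      # Pr(Hi), up to normalization
likelihood = binom.pmf(h, n, H)          # Pr(D | Hi), binomial
posterior = prior * likelihood
posterior /= posterior.sum()             # Pr(Hi | D), normalized over the grid

print("MAP hypothesis:", H[np.argmax(posterior)])
print("posterior mean:", np.sum(H * posterior))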

Probability of selecting an element from a set

The expected probability of randomly selecting a given element from a set of n elements is P = 1.0/n.
Suppose I check P using an unbiased method sufficiently many times. What is the distribution type of P? It is clear that P is not normally distributed, since it cannot be negative. May I therefore assume that P is gamma distributed? And if yes, what are the parameters of this distribution?
A histogram of the probabilities of selecting an element from a 100-element set over 1000 trials is shown here.
Is there any way to convert this to a standard distribution?
Now suppose that the observed probability of selecting the given element was P* (P* != P). How can I estimate whether the bias is statistically significant?
EDIT: This is not a homework. I'm doing a hobby project and I need this piece of statistics for it. I've done my last homework ~10 years ago:-)
With repetitions, your distribution will be binomial. So let X be the number of times you select some fixed object, with M total selections:
P{ X = x } = ( M choose x ) * (1/N)^x * ((N-1)/N)^(M-x)
You may find this difficult to compute for large M. It turns out that for sufficiently large M, this is well approximated by a normal distribution (Central Limit Theorem).
In that case P{X = x} is approximately given by a normal distribution with mean M/N and variance M * (1/N) * (N-1)/N.
This is a clear binomial distribution with p=1/(number of elements) and n=(number of trials).
To test whether the observed result differs significantly from the expected result, you can do the binomial test.
The dice examples on the two Wikipedia pages should give you some good guidance on how to formulate your problem. In your 100-element, 1000 trial example, that would be like rolling a 100-sided die 1000 times.
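A hedged Python sketch of that test using scipy.stats.binomtest (SciPy >= 1.7; older versions have the deprecated binom_test), with the 100-element / 1000-trial numbers from the question and a made-up observed count:

from scipy.stats import binomtest

n_trials = 1000                 # number of selections
n_elements = 100                # size of the set, so the expected p is 1/100
observed = 17                   # made-up count of how often the element was selected

# Two-sided exact binomial test of H0: p = 1/n_elements.
result = binomtest(observed, n=n_trials, p=1.0 / n_elements)
print("p-value:", result.pvalue)  # a small p-value suggests the bias is statistically significant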
As others have noted, you want the Binomial distribution. Your question seems to imply an interest in a continuous approximation to it, though. It can actually be approximated by the normal distribution, and also by the Poisson distribution.
Is your distribution a discrete uniform distribution?