I have estimated the theta of an exponential distribution and the theta and tau of a Weibull distribution. I want to compare the two distributions to see which one fits my data best. How can I do that in Excel? Can I find the R-squared value in Excel?
You can simply use the function CORREL(x range, y range) and square the result, for example:
=CORREL(A1:A10,B1:B10)^2
For more information go to https://www.youtube.com/watch?v=RYcYyxoKq0U
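If you want to double-check the value outside Excel, the same squared correlation can be computed in Python; this is only a sketch, and the two arrays below are placeholders for your observed values and the values predicted by each fitted distribution.

import numpy as np

# Placeholder arrays: your observed data and the corresponding values
# predicted by a fitted distribution (exponential or Weibull).
observed = np.array([1.2, 2.3, 3.1, 4.8, 5.0, 6.7, 7.4, 8.1, 9.9, 11.2])
predicted = np.array([1.0, 2.1, 3.3, 4.5, 5.6, 6.6, 7.7, 8.8, 9.8, 10.9])

# Squared Pearson correlation -- the same quantity as =CORREL(A1:A10,B1:B10)^2.
r_squared = np.corrcoef(observed, predicted)[0, 1] ** 2
print(r_squared)  # compute this for each fitted distribution and compare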
I have an experimental dataset of the following values (y, x1, x2, w), where y is the measured quantity, x1 and x2 are the two independent variables, and w is the error of each measurement.
The function I've chosen to describe my data is
These are my tasks:
1) Estimate values of bi
2) Estimate their standard errors
3) Calculate predicted values of f(x1, x2) on a mesh grid and estimate their confidence intervals
4) Calculate predicted values of
and definite integral
and their confidence intervals on a mesh grid
I have several questions:
1) Can all of my tasks be solved by weighted least squares? I've solved tasks 1-3 using WLS in matrix form by linearising the chosen function, but I have no idea how to solve step 4.
2) I've performed Monte Carlo simulations to estimate the bi and their standard errors. I generated perturbed values y'i from a normal distribution with mean yi and standard deviation wi, and repeated this N = 5000 times. For each perturbed dataset I estimated b'i, and from the 5000 values of b'i I calculated their means and standard deviations. In the end, the bi estimated from the Monte Carlo simulation coincide with those found by WLS. Am I correct that the standard deviations of b'i must be divided by the number of degrees of freedom to obtain the standard error? (A minimal sketch of this procedure is given after these questions.)
3) How can I estimate confidence bands for the predicted values of y using the Monte Carlo approach? I generated a bunch of perturbed bi values from a normal distribution, using their BLUE as the mean and their standard deviations as the spread. Then I calculated lots of predicted values of f(x1, x2) and found their means and standard deviations. The values of f(x1, x2) found by WLS and MC coincide, but the standard deviations found from MC are 5-45 orders higher than those from WLS. What is the scaling factor that I'm missing here?
4) It seems that some of the parameters bi are not independent of each other, since there are only 2 independent variables. Should I take this into account in question 3 when I generate the bi values? If so, how can this be done? Should I use a Chi-squared test to decide whether generated values of bi are suitable for further calculations or should be rejected?
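As mentioned in question 2, here is a minimal sketch of the Monte Carlo procedure I used; a hypothetical linear-in-parameters model b0 + b1*x1 + b2*x2 stands in for my actual f(x1, x2), which I have not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data; a simple linear-in-parameters model b0 + b1*x1 + b2*x2
# stands in for the actual f(x1, x2).
x1 = np.linspace(0.0, 1.0, 30)
x2 = np.linspace(0.0, 2.0, 30)
w = np.full(30, 0.1)                                # measurement errors (s.d. of each y_i)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0.0, w)  # simulated measurements

X = np.column_stack([np.ones_like(x1), x1, x2])     # design matrix
Wsqrt = np.diag(1.0 / w)                            # square root of the weight matrix, W'W = diag(1/w^2)

def wls(y_obs):
    """Weighted least squares: regress W*y on W*X."""
    b, *_ = np.linalg.lstsq(Wsqrt @ X, Wsqrt @ y_obs, rcond=None)
    return b

b_wls = wls(y)                                      # estimates from the original data

# Monte Carlo: perturb each y_i with its own error w_i, refit, collect b'_i.
B = np.array([wls(y + rng.normal(0.0, w)) for _ in range(5000)])
b_mc_mean = B.mean(axis=0)                          # these coincide with b_wls
b_mc_sd = B.std(axis=0, ddof=1)                     # the spread of b'_i referred to in question 2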
In fact, I not only want to solve the tasks mentioned above, but also to compare the two methods of regression analysis. I would appreciate any help and suggestions!
I am trying to implement Aggregation Pheromone density based classification for a land use map problem. In the paper, the pheromone intensity deposited at x by ant aj (located at xj) is given as:
T(aj, x) = exp( -d(xj, x)^2 / (2 * del^2) )
where d(xj, x) is the Euclidean distance between the two points and del is the spread of the Gaussian function.
I want to know two things: first, what this quantity is, and second, how to calculate it.
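For reference, here is my current reading of the formula written out as code; this is only a sketch, and delta stands for the spread parameter written as del above.

import numpy as np

def pheromone_intensity(x_j, x, delta):
    """T(a_j, x) = exp(-d(x_j, x)^2 / (2 * delta^2)), i.e. a Gaussian
    kernel in the Euclidean distance d between x_j and x."""
    d = np.linalg.norm(np.asarray(x_j, dtype=float) - np.asarray(x, dtype=float))
    return np.exp(-d**2 / (2.0 * delta**2))

# Example: an ant located at (1.0, 2.0) depositing pheromone at (1.5, 2.5), spread 0.8
print(pheromone_intensity((1.0, 2.0), (1.5, 2.5), delta=0.8))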
I know how to calculate the corresponding z scores of all raw scores in my data using SPSS.
Now I want to calculate the proportion of the distribution (in probability format, e.g. 0.989, 0.003, etc.) that lies below and above the corresponding z score. I know how to do this manually with a z table, but I want to do it using only SPSS, without looking values up in a z table.
SPSS has built-in cumulative distribution function (CDF) functions. The one for the normal distribution takes the z-score, the distribution's mean, and the distribution's standard deviation as parameters. So for a standard normal distribution:
COMPUTE probability = CDF.normal(z_score, 0, 1).
EXECUTE.
Most other distributions you might use are also available; the parameters used by the functions differ depending on how they're defined. (For example, CDF.T takes the t-score and the number of degrees of freedom as parameters.)
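If you also want the area above the z-score, subtract the CDF value from 1. To check the numbers outside SPSS, the same calculation can be done with scipy in Python; this is only a sketch, and z_score below is a placeholder for whatever z value you computed.

from scipy.stats import norm

z_score = 2.29                    # example z value
below = norm.cdf(z_score)         # area below z, same quantity as CDF.normal(z_score, 0, 1)
above = norm.sf(z_score)          # area above z, identical to 1 - norm.cdf(z_score)
print(below, above)               # roughly 0.989 and 0.011 for z = 2.29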
I have calculated a set of binomial distributions giving me the probabilities of finding n objects in a sample of N objects.
I calculated this using the percent point function (PPF, a.k.a. the inverse cumulative distribution function) from the scipy.stats.distributions package.
Now that I want to plot a probability distribution, a question emerges: which package and function in Python should I use for this? I've found a few useful resources, such as http://goo.gl/Q2UjxX, but I am still no closer (most likely I am missing something).
Say n = 2 and N = 10. How would I go about creating the plot below?
The x-axis shows the range of values from 0 to n/N (where N > n always).
The y-axis shows the probability of n/N.
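To make the question concrete, here is the kind of code I have in mind, using scipy.stats.binom and matplotlib; the success probability p and the exact plot layout are placeholders, since I'm not sure this is the right approach.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom

N = 10                            # sample size
p = 0.2                           # per-object "success" probability -- placeholder value

k = np.arange(0, N + 1)           # possible counts 0..N
pmf = binom.pmf(k, N, p)          # P(exactly k objects found in a sample of N)

plt.bar(k / N, pmf, width=0.05)   # x-axis expressed as the proportion k/N
plt.xlabel("n/N")
plt.ylabel("probability")
plt.show()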
Thank you all for your time.
How do I specify a weighted polynomial fit formula with Excel's LINEST? Something like
LINEST(y*w^0.5,IF({1,0},1,x)*w^0.5,FALSE,TRUE) works for a linear fit; I'm looking for a similar formula for 2nd-order and 3rd-order polynomial regression fits.
In a reply to the other post, Weighted trendline, an approach for weighted polynomials was already suggested. For example, for a cubic fit, try the following entered with CTRL+SHIFT+ENTER in a four-cell range (one row by four columns):
=LINEST(y*w^0.5,(x-1E-99)^{0,1,2,3}*w^0.5,FALSE)
(Subtracting 1E-99 ensures that 0^0 evaluates to 1.) Similarly to the linear case, for R^2 try:
=INDEX(LINEST((y-SUMPRODUCT(y,w)/SUM(w))*w^0.5,(x-1E-99)^{0,1,2,3}*w^0.5,FALSE,TRUE),3,1)
Derivation
In standard least squares we find the vector b that minimises |y - Xb|² = (y - Xb)'(y - Xb).
In the weighted case b is instead chosen to minimise |W(y - Xb)|² = (y - Xb)'W'W(y - Xb).
So the weighted regression is of Wy on WX, where W'W = W² is the diagonal matrix of the weights.
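To check the Excel results, the same weighted cubic fit and weighted R^2 can be reproduced with a few lines of NumPy; this is only a sketch, the example data are placeholders, and w are the regression weights as above.

import numpy as np

# Placeholder data; w are the regression weights, so W'W = diag(w).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 20)
y = 1.0 + 0.5 * x - 0.2 * x**2 + 0.05 * x**3 + rng.normal(0.0, 0.1, 20)
w = np.ones_like(x)

X = np.vander(x, 4, increasing=True)   # columns x^0, x^1, x^2, x^3
sw = np.sqrt(w)

# Weighted regression: regress W*y on W*X with W = diag(sqrt(w)).
# Note: b comes out ordered b0..b3 (lowest power first), the reverse of LINEST's output order.
b, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

# Weighted R^2, intended to match the INDEX(LINEST(...),3,1) formula above.
y_hat = X @ b
y_bar = np.sum(w * y) / np.sum(w)
r2 = 1.0 - np.sum(w * (y - y_hat)**2) / np.sum(w * (y - y_bar)**2)
print(b, r2)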