How do I calculate the standard deviation between weighted measurements? - statistics

I have several weighted values for which I am taking a weighted average. I want to calculate a weighted standard deviation using the weighted values and weighted average. How would I modify the typical standard deviation to include weights on each measurement?
This is the standard deviation formula I am using.
When I simply use each weighted value for 'x' and the weighted average for '\bar{x}', the result seems smaller than it should be.

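I just found this Wikipedia page discussing data of equal significance vs. weighted data. The correct way to calculate the biased weighted estimator of variance is
$$\hat{\sigma}_w^2 = \frac{\sum_{i=1}^{N} w_i (x_i - \bar{x}_w)^2}{\sum_{i=1}^{N} w_i},$$
though the following on-the-fly implementation is more efficient computationally, as it does not require calculating the weighted average before looping over the sum of the weighted squared differences:
$$\hat{\sigma}_w^2 = \frac{\sum_{i=1}^{N} w_i x_i^2}{\sum_{i=1}^{N} w_i} - \bar{x}_w^2.$$
Despite my skepticism, I tried both and got the exact same results.
Note: be sure to use the weighted average
$$\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}.$$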
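For anyone who wants to check the equivalence themselves, here is a minimal NumPy sketch. It assumes the two-pass form is the weighted mean of squared deviations from the weighted average, and the on-the-fly form is the weighted mean of squares minus the squared weighted average. The function name `weighted_std` and the sample data are made up for illustration.

```python
import numpy as np

def weighted_std(x, w):
    """Biased weighted standard deviation, computed two equivalent ways."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w_sum = w.sum()
    mean_w = (w * x).sum() / w_sum  # weighted average
    # Two-pass form: weighted mean of squared deviations from the weighted average.
    var_two_pass = (w * (x - mean_w) ** 2).sum() / w_sum
    # On-the-fly form: weighted mean of squares minus the squared weighted average.
    var_one_pass = (w * x ** 2).sum() / w_sum - mean_w ** 2
    return np.sqrt(var_two_pass), np.sqrt(var_one_pass)

# Illustrative values and weights only.
x = [2.0, 4.0, 4.0, 5.0]
w = [1.0, 2.0, 2.0, 1.0]
two_pass, one_pass = weighted_std(x, w)
print(two_pass, one_pass)
```

The two results agree to rounding error here. The on-the-fly form subtracts two numbers of similar size, so it can lose precision when the spread is tiny compared with the mean; in that case the two-pass form is the safer choice.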

Related

Standard Deviation from MSE

Is there a formula to calculate the standard deviation from a given value of the mean squared error?
For a given data set, I have the mean squared error value(s) calculated, and Standard Deviation is calculated based on this value. But I am not sure of the formula my system is using. Is there any general statistical formula for this?
This calculation is for Safety Stock value. Demand Standard deviation is used as an input for Safety Stock calculation which in turn is calculated from MSE. I need to find the formula to derive demand standard deviation from MSE.
Thanks in Advance.
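One general relationship that may help: the mean squared error of a set of errors equals their (population) variance plus the squared bias. The sketch below illustrates that identity on made-up forecast errors. Whether the system in question takes sqrt(MSE) directly or subtracts a bias term first is not known here, so treat this as an illustration only.

```python
import numpy as np

# Hypothetical forecast errors (actual demand minus forecast), illustration only.
errors = np.array([3.0, -1.0, 2.0, 0.0, -2.0, 4.0])

mse = np.mean(errors ** 2)
bias = np.mean(errors)
variance = np.mean((errors - bias) ** 2)  # population variance of the errors

# Identity: MSE = variance + bias**2
rmse = np.sqrt(mse)                        # equals the std only when bias == 0
std_of_errors = np.sqrt(mse - bias ** 2)   # std recovered from MSE and bias
print(mse, bias, variance, rmse, std_of_errors)
```

If the forecast is unbiased, the standard deviation is simply the square root of the MSE; otherwise the bias has to be known to separate the two.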

Gaussian Mixture model log-likelihood to likelihood-Sklearn

I want to calculate the likelihoods instead of the log-likelihoods. I know that score gives the per-sample average log-likelihood, so I need to multiply the score by the sample size to get the total. However, the log-likelihoods are very large negative numbers such as -38567258.1157, and when I take np.exp(scores) I get zero. Any help is appreciated.
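from sklearn.mixture import GaussianMixture

# X_test: array of shape (n_samples, n_features)
gmm = GaussianMixture(covariance_type="diag", n_components=2)
y_pred = gmm.fit_predict(X_test)
scores = gmm.score(X_test)  # per-sample average log-likelihood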
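The zero is floating-point underflow: a likelihood of exp(-38567258) is far smaller than the smallest positive float64. A rough sketch of two common workarounds follows, using only the number quoted in the question. The second log-likelihood is a hypothetical value added for illustration.

```python
import math
import numpy as np

total_log_likelihood = -38567258.1157  # value quoted in the question

# Direct exponentiation underflows to exactly 0.0 in float64.
likelihood = np.exp(total_log_likelihood)

# Workaround 1: express the likelihood as mantissa * 10**exponent
# without ever forming the number itself.
log10_likelihood = total_log_likelihood / math.log(10)
exponent = math.floor(log10_likelihood)
mantissa = 10 ** (log10_likelihood - exponent)

# Workaround 2: compare models through differences of log-likelihoods,
# which stay representable even when each likelihood underflows.
other_log_likelihood = total_log_likelihood - 2.5  # hypothetical second model
ratio = math.exp(total_log_likelihood - other_log_likelihood)
print(likelihood, mantissa, exponent, ratio)
```

In practice it is usually easier to stay in log space throughout. If per-sample values are needed, `gmm.score_samples(X_test)` returns the log-likelihood of each sample, and those are far less extreme than the total.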

Weighted Least Squares vs Monte Carlo comparison

I have an experimental dataset of the following values (y, x1, x2, w), where y is the measured quantity, x1 and x2 are the two independent variables, and w is the error of each measurement.
The function I've chosen to describe my data is
These are my tasks:
1) Estimate values of bi
2) Estimate their standard errors
3) Calculate predicted values of f(x1, x2) on a mesh grid and estimate their confidence intervals
4) Calculate predicted values of
and definite integral
and their confidence intervals on a mesh grid
I have several questions:
1) Can all of my tasks be solved by weighted least squares? I've solved tasks 1-3 using WLS in matrix form by linearisation of the chosen function, but I have no idea how to solve task 4.
2) I've performed Monte Carlo simulations to estimate bi and their s.e. I generated perturbed values y'i from a normal distribution with mean yi and standard deviation wi, and repeated this N=5000 times. For each perturbed dataset I estimated b'i, and from the 5000 values of b'i I calculated the mean values and their standard deviations. In the end, the bi estimated from the Monte Carlo simulation coincide with those found by WLS. Am I correct that the standard deviations of b'i must be divided by the number of degrees of freedom to obtain the standard error?
3) How do I estimate confidence bands for predicted values of y using the Monte Carlo approach? I generated a set of perturbed bi values from a normal distribution, using their BLUE as the mean and their standard deviations. Then I calculated many predicted values of f(x1,x2) and found their means and standard deviations. The values of f(x1,x2) found by WLS and MC coincide, but the s.d. found from MC are 5-45 order higher than those from WLS. What is the scaling factor that I'm missing here?
4) It seems that some of the parameters b are not independent of each other, since there are only 2 independent variables. Should I take this into account in question 3 when I generate the bi values? If yes, how can this be done? Should I use a chi-squared test to decide whether the generated values of bi are suitable for further calculations or should be rejected?
In fact, I not only want to solve the tasks I've mentioned earlier, but also want to compare the two methods for regression analysis. I would appreciate any help and suggestions!
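A rough sketch of how the two approaches can be compared, under loud assumptions: the actual function is not shown in the question, so a hypothetical linear model y = b0 + b1*x1 + b2*x2 and synthetic data stand in for it, and w is taken to be the known standard deviation of each measurement. In this setup the spread of the Monte Carlo estimates b'i is already a standard-error estimate, with no further division. For the prediction bands, the parameters are drawn jointly from the full covariance matrix rather than one at a time, which keeps their correlations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for a hypothetical linear model y = b0 + b1*x1 + b2*x2.
n = 40
x1 = rng.uniform(0.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)
w = rng.uniform(0.2, 1.0, n)              # per-point measurement error (std)
b_true = np.array([1.0, 2.0, -0.5])       # made-up "true" parameters
X = np.column_stack([np.ones(n), x1, x2])
y = X @ b_true + rng.normal(0.0, w)

# Weighted least squares in matrix form, with weights 1 / w**2.
W = np.diag(1.0 / w ** 2)
cov_b = np.linalg.inv(X.T @ W @ X)        # parameter covariance matrix
b_wls = cov_b @ X.T @ W @ y
se_wls = np.sqrt(np.diag(cov_b))

# Monte Carlo: perturb y by its errors and refit every replicate.
n_rep = 5000
y_mc = y + rng.normal(0.0, w, size=(n_rep, n))
b_mc = y_mc @ (cov_b @ X.T @ W).T         # shape (n_rep, 3)
se_mc = b_mc.std(axis=0, ddof=1)          # spread of estimates = standard error

# Prediction bands: draw b jointly from the full covariance matrix,
# so correlations between parameters are preserved.
b_draws = rng.multivariate_normal(b_wls, cov_b, size=n_rep)
pred_sd_mc = (b_draws @ X.T).std(axis=0, ddof=1)
pred_sd_wls = np.sqrt(np.einsum("ij,jk,ik->i", X, cov_b, X))
print(se_wls, se_mc)
```

With this synthetic setup the Monte Carlo and WLS standard errors agree closely, as do the prediction standard deviations. Drawing each bi independently from its own standard deviation discards the off-diagonal covariance terms, and that alone can change the prediction spread considerably; whether it explains the discrepancy described above cannot be confirmed without the actual model and data.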

Is the loss in keras in percentage?

I am trying to implement VGGNet-16 for depth map prediction from single image. In the training the RMSE loss comes out to be 0.1599.
That loss value, is it in percentage or not?
No. If you want the percentage of correctly classified data, look at the accuracy value instead.
Definition of RMSE from Wikipedia:
The root-mean-square deviation (RMSD) or root-mean-square error (RMSE) is a frequently used measure of the differences between values (sample and population values) predicted by a model or an estimator and the values actually observed.
It's always non-negative, and values closer to zero are better.
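To make the units concrete, here is a small sketch that computes RMSE by hand. The depth values are invented and assumed to be in metres. The result is in the same units as the target, not a percentage; a relative figure only appears if you divide by some reference scale yourself.

```python
import numpy as np

# Hypothetical depth values in metres, illustration only.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.8, 3.2, 3.9])

# RMSE is expressed in the units of the target (metres here).
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))

# A percentage exists only relative to a chosen scale, e.g. the mean target.
relative_rmse_percent = 100.0 * rmse / np.mean(y_true)
print(rmse, relative_rmse_percent)
```

If the depth maps were normalized before training, the reported loss is in those normalized units.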

How do you calculate the standard deviation for data which is mainly discrete but has a probability of being continuous?

I’m having trouble calculating the standard deviation of a game. In the game you can get several different discrete scores, each with a fixed, given probability. There is also a 5% chance that your score is randomly generated. You do not know the distribution of this random variable; you are only given its mean and variance.
I’ve calculated the variance of the main game (ignoring the random variable) to be 5.2. The variance of the random variable is 137. From this I get a standard deviation of
sqrt(5.2 + 0.05 * 137) = 3.47
Is this the correct method?
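One way to sanity-check this is to treat the score as a two-component mixture and apply the law of total variance: the overall variance is the probability-weighted sum of (variance + mean squared) for each component, minus the square of the overall mean. The sketch below assumes 5.2 is the variance of a normal round (95% of the time) and 137 the variance of the random round (5%). The component means are not given in the question, so the values used here are hypothetical.

```python
import math

def mixture_std(probs, means, variances):
    """Standard deviation of a mixture, via the law of total variance."""
    mean = sum(p * m for p, m in zip(probs, means))
    second_moment = sum(p * (v + m ** 2)
                        for p, m, v in zip(probs, means, variances))
    return math.sqrt(second_moment - mean ** 2)

probs = [0.95, 0.05]
variances = [5.2, 137.0]

# Hypothetical means: the result depends on how far apart they are.
equal_means = mixture_std(probs, [10.0, 10.0], variances)
different_means = mixture_std(probs, [10.0, 20.0], variances)
print(equal_means, different_means)
```

Under these assumptions the main-game variance is weighted by 0.95 as well, and any difference between the two means adds a further term, so the standard deviation cannot be determined without the means.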
