How to convert the "root mean squared" to the "standard deviation"? - statistics

Is there any way of converting the "root mean square" of a set of N measures to the "Standard deviation" of these measures, without knowing the value of N?

If you mean you have the "root mean square" of a set of values, then you also need to know the mean of those values: the standard deviation is sd = sqrt(rms^2 - mean^2).
If you have the "root mean square" of a set of errors (i.e. the mean value is zero), then the RMS is the standard deviation.
In neither case do you need 'n'.
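A minimal numeric check of that relationship with numpy (the array here is just an illustrative example, not from the question):

import numpy as np

x = np.array([2.0, 3.0, 5.0, 7.0])    # example measurements
rms = np.sqrt(np.mean(x**2))          # root mean square
mean = np.mean(x)

# population standard deviation recovered from rms and mean alone
sd_from_rms = np.sqrt(rms**2 - mean**2)
sd_direct = np.std(x)                 # np.std defaults to ddof=0, the population SD

print(sd_from_rms, sd_direct)         # both ~1.9203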

Related

How to calculate the Standard error from a Variance-covariance matrix?

I am calculating a variance-covariance matrix and I see two different ways of calculating the standard errors:
sqrt(diagonal values / number of observations)
i.e. standard deviation / sqrt(number of observations)
(as given in the Wikipedia article on the standard error: https://en.wikipedia.org/wiki/Standard_error)
or some people say it is simply
sqrt(diagonal values)
I had previously thought that the diagonal values in the variance-covariance matrix were the variances, and hence that their square roots would be the standard deviations (not the SE). However, the more I read, the more I think I may be wrong and that it is the SE, but I am unsure why this is the case.
Can anyone help? Many thanks!!
Yes, the diagonal elements of the covariance matrix are the variances. The square roots of these variances are the standard deviations. If you need the standard error you have to clarify the question "the standard error of what?" (see also the Wikipedia entry in your post). If you mean the standard error of the mean then yes, "standard deviation / sqrt(number of observations)" is what you are looking for.
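As a quick sanity check (the data and variable names here are only illustrative), the diagonal of a sample covariance matrix gives variances, and dividing the resulting standard deviations by sqrt(n) gives the standard error of each mean:

import numpy as np

data = np.random.default_rng(0).normal(size=(100, 3))  # 100 observations of 3 variables
n = data.shape[0]

cov = np.cov(data, rowvar=False)     # 3x3 variance-covariance matrix
variances = np.diag(cov)             # diagonal entries = variances
std_devs = np.sqrt(variances)        # standard deviations
se_mean = std_devs / np.sqrt(n)      # standard error of each variable's mean

print(std_devs)
print(se_mean)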

How to properly clamp the Beckmann distribution

I am trying to implement a Microfacet BRDF shading model (similar to the Cook-Torrance model) and I am having some trouble with the Beckmann Distribution defined in this paper: https://www.cs.cornell.edu/~srm/publications/EGSR07-btdf.pdf
Where M is a microfacet normal, N is the macrofacet normal and ab is a "hardness" parameter between [0, 1].
My issue is that this distribution often returns obscenely large values, especially when ab is very small.
For instance, the Beckmann distribution is used to calculate the probability of generating a microfacet normal M per this equation:
A probability has to be between the range [0,1], so how is it possible to get a value within this range using the function above if the Beckmann distribution gives me values that are 1000000000+ in size?
So is there a proper way to clamp the distribution? Or am I misunderstanding it or the probability function? I tried simply clamping it to 1 if the value exceeded 1, but this didn't really give me the results I was looking for.
I was having the same question you did.
If you read
http://blog.selfshadow.com/publications/s2012-shading-course/hoffman/s2012_pbs_physics_math_notes.pdf
and
http://blog.selfshadow.com/publications/s2012-shading-course/hoffman/s2012_pbs_physics_math_notebook.pdf
You'll notice it's perfectly normal. To quote from the links:
"The Beckmann Αb parameter is equal to the RMS (root mean square) microfacet slope. Therefore its valid range is from 0 (non-inclusive –0 corresponds to a perfect mirror or Dirac delta and causes divide by 0 errors in the Beckmann formulation) and up to arbitrarily high values. There is no special significance to a value of 1 –this just means that the RMS slope is 1/1 or 45°.(...)"
Also another quote:
"The statistical distribution of microfacet orientations is defined via the microfacet normal distribution function D(m). Unlike F (), the value of D() is not restricted to lie between 0 and 1—although values must be non-negative, they can be arbitrarily large (indicating a very high concentration of microfacets with normals pointing in a particular direction). (...)"
You should google for Self Shadow's Physically Based Shading courses, which are full of useful material (there is one blog post for each year: 2010, 2011, 2012 & 2013).
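To make that concrete, here is a small sketch (not the asker's code) using the standard Beckmann normal distribution function from Walter et al., D(m) = exp(-tan^2(theta_m) / ab^2) / (pi * ab^2 * cos^4(theta_m)). For a small ab the value of D at theta_m = 0 is enormous, yet the projected density D(m) * cos(theta_m), integrated over the hemisphere, still comes out near 1; D is a density over solid angle, not a probability, so it never needs clamping:

import numpy as np

alpha_b = 0.02  # small roughness; equals the RMS microfacet slope

def beckmann_D(theta_m, alpha):
    # Standard Beckmann NDF as a function of the angle theta_m between the
    # microfacet normal m and the macrosurface normal n.
    c = np.cos(theta_m)
    t = np.tan(theta_m)
    return np.exp(-(t / alpha) ** 2) / (np.pi * alpha**2 * c**4)

print(beckmann_D(0.0, alpha_b))   # ~795.8, far larger than 1, and perfectly fine

# Numerically integrate D(m) * cos(theta_m) over the hemisphere; it should be ~1.
theta = np.linspace(0.0, np.pi / 2 - 1e-6, 400001)
dtheta = theta[1] - theta[0]
integrand = beckmann_D(theta, alpha_b) * np.cos(theta) * np.sin(theta)  # sin(theta) from the solid-angle measure
print(2 * np.pi * np.sum(integrand) * dtheta)   # ~1.0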

skew normal distribution

We have a skew normal distribution with location = 0, scale = 1 and shape = 0; then it is the same as a standard normal distribution with mean 0 and variance 1. But if we change the shape parameter, say to shape = 5, then the mean and variance also change. How can we keep the mean and variance fixed while varying the shape parameter?
Just look up how the mean and variance of a skew normal distribution are computed and you have the answer! The mean is

mean = xi + omega * delta * sqrt(2/pi), where delta = alpha / sqrt(1 + alpha^2)

and the variance is

variance = omega^2 * (1 - 2*delta^2/pi)

You can see that with xi = 0 (location), omega = 1 (scale) and alpha = 0 (shape) you really do get a standard normal distribution: delta = 0, so mean = 0 and variance = 1.
If you only change alpha (shape) to 5, you can expect the mean to differ a lot, and it will be positive (delta = 5/sqrt(26) ≈ 0.9806, so mean ≈ 0.7824). If you want to hold the mean around zero with a higher alpha (shape), you will have to decrease other parameters, e.g. omega (scale). The most obvious idea would be to set omega to zero instead of 1, which indeed gives

mean = 0 + 0 * 0.9806 * sqrt(2/pi) = 0

The mean is set, but we also need the variance to equal 1 with omega set to zero and shape set to 5. The formula is known:

variance = 0^2 * (1 - 2*0.9806^2/pi) = 0

Which is insane :) That cannot be done this way. You should instead go back and alter the value of xi, not omega, to get a mean equal to zero. But then you first have to compute the only possible value of omega from the variance formula:

1 = omega^2 * (1 - 2*0.9806^2/pi)

Then omega should be around 1.605681 (negative or positive).
Getting back to the mean:

0 = xi + 1.605681 * 0.9806 * sqrt(2/pi)  =>  xi ≈ -1.256269

So, with the following parameters you should get the distribution you intended:
location ≈ -1.256269 (with a positive scale; flip both signs together), scale ≈ 1.605681 and shape = 5.
Please, someone test it, as I might have miscalculated somewhere in the example.
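A quick check of those numbers with scipy's skewnorm (its a / loc / scale arguments correspond to the shape / location / scale above; the specific values are the ones from the answer):

from scipy.stats import skewnorm

# Skew normal with shape a = 5 and the location/scale derived above
dist = skewnorm(a=5, loc=-1.256269, scale=1.605681)

mean, var = dist.stats(moments='mv')
print(mean, var)   # both should come out very close to 0 and 1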

Percentage Error for 2 sets of 3D points

I have a reference set of n points, and another set which 'approximates' each of those points. How do I find out the absolute/percentage error between the approximation and my reference set.
Put in other words, I have a canned animation and a simulation. How do I know how much is the 'drift' between the 2 in terms of a single number? That is, how good is the simulation approximating the vertices as compared to those of the animation.
I actually do something like this for all vertices: |actual - reference| / |actual|, and then average the errors by dividing by the number of verts. Is this correct at all?
Does this measurement really have to be a percentage value? I'm guessing you have one reference set, and then several sets that approximate this set and you want to pick the one that is "the best" in some sense.
I'd add the squared distances between the actual and the reference:
avgSquareDrift = sum(1..n, |actual - reference|^2) / numvertices
The main advantage of this approach is that we don't need to apply the square root, which is a costly operation.
If you sum the formula you have over all vertices (and then divide by the number of verts) you will have calculated the average percentage error in position for all vertices.
However, this percentage error is probably not quite what you want, because vertices closer to the origin will have a greater "percentage error" for the same displacement because their magnitude is smaller.
If you don't divide by anything at all, you will have the average drift in world units, which may be exactly what you want:
average_drift = sum(1->numvertices, |actual - reference|) / numvertices
You may want to divide by something more appropriate to your particular situation to get a meaningful unitless number. If you divide average_drift by the height of your model, you will have the error as a percentage of the model size, which could be useful.
If individual vertices are likely to have more error if they are a long distance from a vertex 'parented' to them, as could be the case if they are vertices of a jointed model, you could divide each error by the length of their parent joint to get the average error normalised for joint orientation -- i.e. what the average drift would be if each joint were of unit length:
orientation_drift = sum(1->numvertices, |actual - reference| / jointlength) / numvertices
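For reference, a small sketch (the arrays and names are made up for the illustration) computing the measures discussed above from two (n, 3) arrays of points:

import numpy as np

rng = np.random.default_rng(1)
reference = rng.normal(size=(100, 3))                   # canned-animation vertices
actual = reference + 0.01 * rng.normal(size=(100, 3))   # simulated vertices

diff = actual - reference
sq_dist = np.sum(diff ** 2, axis=1)                     # squared per-vertex drift
avg_square_drift = np.mean(sq_dist)                     # first suggestion: no square roots needed

dist = np.sqrt(sq_dist)                                 # per-vertex drift |actual - reference|
average_drift = np.mean(dist)                           # average drift in world units
percent_error = np.mean(dist / np.linalg.norm(actual, axis=1))  # the asker's formula

print(avg_square_drift, average_drift, percent_error)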

What's the correct term for "number of std deviations" away from a mean

I've computed the mean & variance of a set of values, and I want to pass along, for each number in the set, the value that represents the number of std deviations it lies away from the mean. Is there a better term for this, or should I just call it num_of_std_devs_from_mean ...
Some suggestions here:
Standard score (z-value, z-score, normal score)
but "sigma" or "stdev_distance" would probably be clearer
The standard deviation is usually denoted with the letter σ (sigma). Personally, I think more people will understand what you mean if you do say number of standard deviations.
As for a variable name, as long as you comment the declaration you could shorten it to std_devs.
sigma is what you want, I think.
That is normalizing your values. You could just refer to it as the normalized value. Maybe norm_val would be more appropriate.
I've always heard it as number of standard deviations
Deviation may be what you're after. Deviation is the distance between a data point and the mean.
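Whatever the name, the computation is just (value - mean) / standard deviation; a one-liner with numpy (the array is only an example):

import numpy as np

values = np.array([4.0, 8.0, 6.0, 5.0, 3.0])

# z-score: how many standard deviations each value lies from the mean
z_scores = (values - values.mean()) / values.std()
print(z_scores)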
