Finding the underlying distribution/standard deviation based on percentiles - statistics

Is there a way to find the underlying distribution and the standard deviation given the data below? This is in order to run some random sampling in R for simulation purposes.
Data Here
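One possible starting point, sketched under the assumption that the underlying distribution is roughly normal: each percentile then satisfies x_p = mu + sigma * z_p, so a straight-line fit of the observed percentile values against the standard normal quantiles gives estimates of the mean and standard deviation. The function name and the example percentiles below are hypothetical (the example values are generated from a known normal, not taken from the question's data).

```python
from statistics import NormalDist


def fit_normal_from_percentiles(percentiles):
    """Least-squares fit of x_p = mu + sigma * z_p.

    `percentiles` maps a probability in (0, 1) to the value observed
    at that percentile. At least two entries are needed.
    """
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(p) for p in percentiles]
    x = list(percentiles.values())
    n = len(z)
    z_mean = sum(z) / n
    x_mean = sum(x) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    szx = sum((zi - z_mean) * (xi - x_mean) for zi, xi in zip(z, x))
    sigma = szx / szz              # slope of the fitted line
    mu = x_mean - sigma * z_mean   # intercept of the fitted line
    return mu, sigma


# Hypothetical percentile table built from a known normal distribution,
# used only to show that the fit recovers the parameters.
true_dist = NormalDist(mu=50.0, sigma=10.0)
example = {p: true_dist.inv_cdf(p) for p in (0.05, 0.25, 0.50, 0.75, 0.95)}

mu_hat, sigma_hat = fit_normal_from_percentiles(example)
print(round(mu_hat, 3), round(sigma_hat, 3))
```

If the real percentiles do not fall close to a straight line against the normal quantiles, the normal assumption is probably wrong and a skewed family would need to be tried instead. With estimates in hand, sampling in R would be a call to `rnorm(n, mean, sd)`.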

Related

Standardized tails of Pearson curves

I am working on process capability analysis for non-normal data, and I need the tables of standardized tails of Pearson curves, which Clements' method uses to approximate the upper and lower percentiles from the skewness and kurtosis values.
Currently I am struggling to find a library or module that contains or implements these tables. Any help or tips on this?
I initially thought of hardcoding the tables and looking up the upper and lower percentiles based on the skewness and kurtosis values, but I am also unable to find a digital copy of the tables.

How can I calculate the percentage of a distribution that is below and above z-scores using only SPSS, without manually looking in a z-table

I know how to calculate the corresponding z-scores of all raw scores in my data using SPSS.
Now I want to calculate the percentage of the distribution that lies below and above each corresponding z-score (preferably as a probability, like 0.989 or 0.003). I know how to do it manually with a z-table, but I want to do it using only SPSS, without looking anything up by hand.
SPSS has built-in cumulative distribution functions (CDFs). The one for the normal distribution takes the z-score, the distribution's mean, and the distribution's standard deviation as parameters, and returns the proportion of the distribution below that score. So for a standard normal distribution:
COMPUTE probability = CDF.normal(z_score, 0, 1).
EXECUTE.
Most other distributions you might use are also available; the parameters used by the functions differ depending on how they're defined. (For example, CDF.T takes the t-score and the number of degrees of freedom as parameters.)
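The CDF value is the proportion below the z-score, so the proportion above is simply 1 minus that value (in SPSS terms, `1 - CDF.NORMAL(z_score, 0, 1)`). If you want to cross-check the SPSS output outside SPSS, here is a minimal Python sketch using only the standard library; the function name `tail_probabilities` is made up for illustration.

```python
from statistics import NormalDist


def tail_probabilities(z_score):
    """Return (proportion below, proportion above) for a standard normal z-score."""
    below = NormalDist(0, 1).cdf(z_score)
    return below, 1.0 - below


for z in (-1.96, 0.0, 1.0, 2.33):
    below, above = tail_probabilities(z)
    print(f"z={z:+.2f}  below={below:.4f}  above={above:.4f}")
```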

Convert GMM-UBM scores to equivalent accuracy percent

I have constructed a GMM-UBM model for speaker recognition. The output of the models adapted for each speaker is a set of scores calculated as log-likelihood ratios. Now I want to convert these likelihood scores to an equivalent number between 0 and 100. Can anybody guide me, please?
There is no straightforward formula. You can do simple things like
prob = exp(logratio_score)
but that might not reflect the true distribution of your data. Note also that the exponential of a log-likelihood ratio is a likelihood ratio, which can exceed 1, so it would still need rescaling. The computed probability percentages of your samples will not be uniformly distributed.
Ideally you need to take a large dataset and collect statistics on the acceptance/rejection rate you get at each score. Once you have built a histogram, you can normalize the score difference by that histogram so that, for example, 30% of your subjects are accepted when you see a certain score difference. That normalization will let you create uniformly distributed probability percentages. See for example "How to calculate the confidence intervals for likelihood ratios from a 2x2 table in the presence of cells with zeroes".
This problem is rarely solved in speaker identification systems because confidence intervals are not what you actually want to display. You need a simple accept/reject decision, and for that you need to know the false reject and false accept rates. So it is enough to find just a threshold rather than build the whole distribution.
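To make the threshold idea concrete, here is a rough sketch, assuming you have two labeled sets of scores: trials where the claimed speaker was correct (genuine) and trials where it was not (impostor). The function names and the tiny score lists are hypothetical and only for illustration. It picks the threshold where the false reject and false accept rates are closest, and also shows one way to map a raw score to a 0 to 100 number via its rank among reference scores.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_reject_rate, false_accept_rate), accepting scores >= threshold."""
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return frr, far


def pick_threshold(genuine_scores, impostor_scores):
    """Pick the observed score at which FRR and FAR are closest to each other."""
    candidates = sorted(set(genuine_scores) | set(impostor_scores))

    def gap(t):
        frr, far = error_rates(genuine_scores, impostor_scores, t)
        return abs(frr - far)

    return min(candidates, key=gap)


def score_to_percent(score, reference_scores):
    """Percent of reference scores that fall strictly below `score`."""
    below = sum(s < score for s in reference_scores)
    return 100.0 * below / len(reference_scores)


# Hypothetical log-likelihood-ratio scores, not real data.
genuine = [2.1, 1.4, 0.9, 3.0, 0.2, 1.8]
impostor = [-1.5, -0.3, 0.4, -2.2, 1.0, -0.8]

threshold = pick_threshold(genuine, impostor)
frr, far = error_rates(genuine, impostor, threshold)
print(threshold, frr, far)
print(score_to_percent(0.9, impostor))
```

With a realistic amount of data you would normally scan a finer grid of thresholds and choose the operating point from the error rates your application can tolerate, rather than always using the equal-error point.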

How do you calculate the standard deviation for data which is mainly discrete but has a probability of being continuous?

I'm having trouble calculating the standard deviation of a game. In the game you can get several different discrete scores, each with a fixed, given probability. There is also a 5% chance that your score is randomly generated. You do not know the distribution of that random variable; you are only given its mean and variance.
I’ve calculated the variance of the main game (ignoring the random variable) to be 5.2. The variance of the random variable is 137. From this I get a standard deviation of
sqrt(5.2 + 5% *137) = 3.47
Is this the correct method?
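One way to check this, assuming the random score replaces the ordinary score 5% of the time (a two-component mixture): the overall variance is then the weighted sum of the component variances plus a term for the gap between the component means, i.e. 0.95 * 5.2 + 0.05 * 137 + 0.95 * 0.05 * (difference of means)^2. The expression in the question leaves out the 0.95 weight and the means term. A small sketch follows; the function name is made up, and the means of 10 and 20 are purely hypothetical since the question does not state them.

```python
import math


def mixture_variance(p_random, main_mean, main_var, random_mean, random_var):
    """Variance of a score that is 'random' with probability p_random,
    and drawn from the main game otherwise."""
    w_main = 1.0 - p_random
    overall_mean = w_main * main_mean + p_random * random_mean
    # E[X^2] of each component is its variance plus its squared mean.
    second_moment = (w_main * (main_var + main_mean ** 2)
                     + p_random * (random_var + random_mean ** 2))
    return second_moment - overall_mean ** 2


# If both components happened to share the same mean:
var_same_mean = mixture_variance(0.05, 0.0, 5.2, 0.0, 137.0)
print(var_same_mean, math.sqrt(var_same_mean))

# With hypothetical, different means the variance grows:
var_diff_mean = mixture_variance(0.05, 10.0, 5.2, 20.0, 137.0)
print(var_diff_mean, math.sqrt(var_diff_mean))
```

So under the mixture assumption the means are needed as well as the variances; if the random value is instead added on top of the ordinary score, the formula would be different again.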

Monte Carlo Simulation in Excel for Non-normal Distributions

I would like to simulate the performance of a baseball player. I know his expected performance for every future year and the standard deviations of those performances (based on regression analysis). At first, I was thinking of using the NORMINV(RAND(),REF,REF) function in Excel, but the underlying distribution of baseball players' performances is dramatically right skewed. Is there a way that I can perform this sort of analysis in Excel or some other free or low-cost software? The end goal here is for the simulation to use the right-skewed distribution. Thanks very much.
R has lots of tools for this sort of analysis, though you'd have to look through the docs to figure out how to use them. R is free and open source, including for commercial use.
If you have a cumulative distribution table (one that is evenly spaced and sufficiently detailed), then you can easily generate random values from this distribution in Excel by looking up a uniform random number generated by RAND() in your distribution table and taking the corresponding "x-axis" value.
=OFFSET($A$1,MATCH(RAND(),$B$2:$B$102),0)
A1 is the cell just above the table of "x-axis" values.
B2:B102 is the cumulative distribution table.
This is a simplified example. Some small modifications may be needed to handle edge-cases and adjust for biases.
If you have enough empirical data you should be able to create the cumulative distribution table.
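The same table-lookup idea can be sketched outside Excel. The following is a minimal Python version, assuming the cumulative table is built directly from a sorted sample; the function names and the small right-skewed sample are hypothetical. Unlike the MATCH formula, it picks the first row whose cumulative probability reaches the uniform draw, which is one way of handling the edge case where the draw is smaller than the first table entry.

```python
import bisect
import random


def build_cdf_table(sample):
    """Return (sorted x values, cumulative probabilities) from empirical data."""
    x_values = sorted(sample)
    n = len(x_values)
    cdf_values = [(i + 1) / n for i in range(n)]
    return x_values, cdf_values


def lookup_quantile(x_values, cdf_values, u):
    """Return the first x whose cumulative probability is >= u."""
    idx = bisect.bisect_left(cdf_values, u)
    idx = min(idx, len(x_values) - 1)  # guard against rounding at the top end
    return x_values[idx]


def simulate(x_values, cdf_values, n_draws, seed=None):
    """Draw n_draws values by inverse-transform sampling from the table."""
    rng = random.Random(seed)
    return [lookup_quantile(x_values, cdf_values, rng.random())
            for _ in range(n_draws)]


# Hypothetical right-skewed sample, for illustration only.
observed = [0.5, 0.7, 0.8, 1.0, 1.1, 1.3, 1.9, 2.6, 4.2, 7.5]
xs, cdf = build_cdf_table(observed)
draws = simulate(xs, cdf, n_draws=5, seed=1)
print(draws)
```

Because it only ever returns values that appear in the table, this reproduces the skew of the empirical data but cannot generate values beyond the observed range; interpolating between rows would smooth that out.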
