Gumbel distribution - statistics

I have a silly question:
If X has a Gumbel distribution with parameters mu (location) and beta>0 (scale)
Then what would be the distribution of A.X,
where A is constant.
thank you all.

Your question is probably better suited for Cross Validated, and a Google search would give you the answer. That being said, if X is Gumbel distributed, then a*X is Gumbel distributed as well for any constant a > 0 (lower case is used to avoid ambiguity; for a < 0 the result is the reflected, minimum-type Gumbel). To find its parameters, use E[a*X] = a*E[X] and Var[a*X] = a^2*Var[X], where E[] and Var[] are the mean and variance, respectively.
Then, using the formulas that connect the mean and variance to the distribution parameters (mu, beta), e.g. from here, one can show that a*X has the following parameters (denoted by a trailing underscore):
mu_ = a*mu and beta_ = a*beta
a*X ~ Gum(mu_, beta_)
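The same result can be checked directly on the CDF, since P(a*X <= y) = P(X <= y/a) for a > 0. Below is a minimal sketch of that check; the parameter values and the helper name gumbel_cdf are illustrative assumptions, not part of the question.

```python
import math

def gumbel_cdf(x, mu, beta):
    """CDF of a Gumbel (maximum-type) distribution with location mu and scale beta."""
    return math.exp(-math.exp(-(x - mu) / beta))

def scaled_cdf_pair(y, mu, beta, a):
    """Return (P(a*X <= y), CDF of Gum(a*mu, a*beta) at y) for a > 0."""
    lhs = gumbel_cdf(y / a, mu, beta)      # P(a*X <= y) = P(X <= y/a)
    rhs = gumbel_cdf(y, a * mu, a * beta)  # claimed distribution of a*X
    return lhs, rhs

# Illustrative values only; any mu, beta > 0 and a > 0 should behave the same way.
for y in (-4.0, 0.0, 2.5, 10.0, 40.0):
    print(y, scaled_cdf_pair(y, mu=1.5, beta=2.0, a=3.0))
```

The two numbers printed on each line should agree up to floating-point rounding.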


Why my fit for a logarithm function looks so wrong

I'm plotting this dataset and making a logarithmic fit, but for some reason the fit looks badly wrong. At some point I got a good enough fit, but then I replotted and got the bad fit again. At the very beginning of the data there was a point 0.0 0.0076, but I changed it to 0.001 0.0076 to avoid the asymptote.
For the fit I'm using the following (not exactly what produced the image above, but I'm testing with this one now and it gives the same bad fit):
f(x) = a*log(k*x + b)
fit = fit f(x) 'R_B/R_B.txt' via a, k, b
And the output is this
Also, sometimes it reports 7 iterations, as in the screenshot above, other times only 1, and when it produced the "correct" fit it ran something like 35 iterations and got a = 32, if I remember correctly.
Edit: here is the good one again; the plot I got is this one. And again, I replotted and got that weird fit. Curiously, if the 0.0 0.0076 point is present when the good fit is about to be shown, gnuplot says "Undefined value during function evaluation", but that message does not appear when I get the bad one.
Do you know why I keep getting this inconsistency? Thanks for your help.
As I already mentioned in the comments, fitting antiderivatives works much better than fitting derivatives, because numerically computed derivatives scatter strongly even when the data are only slightly scattered.
The principle of the method of fitting an integral equation (obtained from the original equation to be fitted) is explained in . The application to the case y = a.ln(c.x+b) is shown below.
Numerical calculation:
To get an even better result (according to some specified fitting criterion), one can use the above parameter values as initial values for an iterative nonlinear regression method implemented in some convenient software.
NOTE : The integral equation used in the present case is :
NOTE : On the above figure one can compare the result of the method of fitting an integral equation with the result of the method of fitting with derivatives.
Acknowledgements : Alex Sveshnikov did very good work in applying the method of regression with derivatives. This allows an interesting and enlightening comparison. If the goal is only to compute approximate parameter values to be used in nonlinear regression software, both methods are quite equivalent. Nevertheless, the method with the integral equation appears preferable for scattered data.
UPDATE (After Alex Sveshnikov updated his answer)
The figure below was drawn using Alex Sveshnikov's result followed by a further iterative fitting step.
The two curves are almost indistinguishable. This shows that (in the present case) the method of fitting the integral equation is almost sufficient without further treatment.
Of course, the result is not always this satisfying; here it is due to the low scatter of the data.
In ADDITION, an answer to a question raised in the comments by CosmeticMichu:
The problem here is that the fit algorithm starts with "wrong" approximations for the parameters a, k, and b, so during the minimization it finds a local minimum, not the global one. You can improve the result if you provide the algorithm with starting values that are close to the optimal ones. For example, let's start with the following parameters:
gnuplot> a=47.5087
gnuplot> k=0.226
gnuplot> b=1.0016
gnuplot> f(x)=a*log(k*x+b)
gnuplot> fit f(x) 'R_B.txt' via a,k,b
After 40 iterations the fit converged.
final sum of squares of residuals : 16.2185
rel. change during last iteration : -7.6943e-06
degrees of freedom (FIT_NDF) : 18
rms of residuals (FIT_STDFIT) = sqrt(WSSR/ndf) : 0.949225
variance of residuals (reduced chisquare) = WSSR/ndf : 0.901027
Final set of parameters            Asymptotic Standard Error
=======================            ==========================
a               = 35.0415          +/- 2.302        (6.57%)
k               = 0.372381         +/- 0.0461       (12.38%)
b               = 1.07012          +/- 0.02016      (1.884%)

correlation matrix of the fit parameters:
                a      k      b
a               1.000
k              -0.994  1.000
b               0.467 -0.531  1.000
The resulting plot is
Now the question is how you can find "good" initial approximations for your parameters? Well, you start with
If you differentiate this equation you get
The left-hand side of this equation is some constant 'C', so the expression in the right-hand side should be equal to this constant as well:
In other words, the reciprocal of the derivative of your data should be approximated by a linear function. So, from your data x[i], y[i] you can construct the reciprocal derivatives x[i], (x[i+1]-x[i])/(y[i+1]-y[i]) and the linear fit of these data:
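Written out, the steps above presumably read as follows. This is a reconstruction from the surrounding text, not the original figures; it assumes the constant is C = 1/(a k), so that the fitted slope and intercept are the quantities C*k and C*b used below.

```latex
\begin{align}
y &= a \log(kx + b) \\
\frac{dy}{dx} &= \frac{a k}{kx + b} \\
\frac{1}{a k} &= \frac{1}{kx + b}\,\frac{dx}{dy} = C \\
\frac{dx}{dy} &= (C k)\,x + (C b)
\end{align}
```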
The fit gives the following values:
C*k = 0.0236179
C*b = 0.106268
Now we need to find the values of a and C. Let's say that we want the resulting graph to pass close to the first and the last point of our dataset. That means that we want
a*log(k*x1 + b) = y1
a*log(k*xn + b) = yn
or, written in terms of the fitted quantities C*k and C*b,
a*log((C*k*x1 + C*b)/C) = a*log(C*k*x1 + C*b) - a*log(C) = y1
a*log((C*k*xn + C*b)/C) = a*log(C*k*xn + C*b) - a*log(C) = yn
By subtracting the first equation from the second we get the value of a:
a = (yn-y1)/log((C*k*xn + C*b)/(C*k*x1 + C*b)) = 47.51
Next, from the first equation:
log(k*x1+b) = y1/a
k*x1+b = exp(y1/a)
C*k*x1+C*b = C*exp(y1/a)
From this we can calculate C:
C = (C*k*x1+C*b)/exp(y1/a)
and finally find k and b by dividing the fitted values by C: k = (C*k)/C and b = (C*b)/C.
These are the values used above for finding the better fit.
You can automate the process described above with the following script:
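# Name of the file with the data
data = 'R_B.txt'
# The coordinates of the last data point
xn = 0; yn = 0
# The temporary coordinates of a data point used to calculate a derivative
x0 = 0; y0 = 0
linearFit(x) = Ck*x + Cb
fit linearFit(x) data using (xn=$1,dx=$1-x0,x0=$1,$1):(yn=$2,dy=$2-y0,y0=$2,dx/dy) via Ck, Cb
# The coordinates of the first data point
x1 = 0; y1 = 0
plot data using (x1=$1):(y1=$2) every ::0::0
a = (yn - y1)/log((Ck*xn + Cb)/(Ck*x1 + Cb))
C = (Ck*x1 + Cb)/exp(y1/a)
k = Ck/C; b = Cb/C
f(x) = a*log(k*x + b)
fit f(x) data via a,k,b
plot data, f(x)
pause -1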
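For readers who want to try the initial-guess procedure outside gnuplot, here is a rough Python sketch of the same steps. It is only an illustration under stated assumptions: the function name initial_guess is made up, the reciprocal derivative is attached to the interval midpoint (a choice of this sketch), and the data are synthetic and noise-free rather than the R_B.txt values.

```python
import math

def initial_guess(xs, ys):
    """Estimate (a, k, b) for y = a*log(k*x + b) from reciprocal finite differences."""
    # Reciprocal derivative dx/dy on each interval, attached to the interval midpoint.
    px = [(xs[i] + xs[i + 1]) / 2 for i in range(len(xs) - 1)]
    py = [(xs[i + 1] - xs[i]) / (ys[i + 1] - ys[i]) for i in range(len(xs) - 1)]
    # Ordinary least-squares line py ~ Ck*px + Cb.
    n = len(px)
    mx, my = sum(px) / n, sum(py) / n
    Ck = sum((u - mx) * (v - my) for u, v in zip(px, py)) / sum((u - mx) ** 2 for u in px)
    Cb = my - Ck * mx
    # Make the curve pass through the first and the last data point.
    x1, y1, xn, yn = xs[0], ys[0], xs[-1], ys[-1]
    a = (yn - y1) / math.log((Ck * xn + Cb) / (Ck * x1 + Cb))
    C = (Ck * x1 + Cb) / math.exp(y1 / a)
    return a, Ck / C, Cb / C

# Synthetic data generated from known parameters (hypothetical, for illustration).
a_true, k_true, b_true = 35.0, 0.37, 1.07
xs = [0.5 * i for i in range(41)]
ys = [a_true * math.log(k_true * x + b_true) for x in xs]
a0, k0, b0 = initial_guess(xs, ys)
print(a0, k0, b0)
```

On noise-free data the estimates should land close to the generating parameters; with scattered data they are only starting values for the nonlinear fit.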

Digital Watermarking: Blind detection with linear correlation

I'm trying to understand how blind detection (detection without the cover work) works when linear correlation is applied.
This is my understanding so far:
Embedding (one-bit):
We generate a reference pattern w_r using the watermarking key.
W_m: we multiply w_r by a strength factor a, and negate it if we want to embed a zero bit.
Then: C = C_0 + W_m + N, where N is noise.
Blind detection (found in literature):
We need to calculate the linear correlation between w_r and C to detect the presence of w_r in C. Linear correlation in general is the normalized scalar product: LC(C,w_r) = 1/(j*i) * C*w_r
C*w_r consists of C_0*w_r + W_m*w_r + N*w_r. It is said that the first and the last term are probably small, whereas W_m*w_r has a large magnitude, and therefore LC(C,w_r) = +-a * |w_r|^2/(ji)
This makes no sense to me. Why should we only consider +-a * |w_r|^2/(ji) for detecting watermarks, without using C? Isn't the term +-a * |w_r|^2/(ji) independent of C?
Or does this only explain why a low linear correlation corresponds to a zero bit and a high value to a one bit, and we still compute LC(C,w_r) as usual via the scalar product?
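To make the setup concrete, here is a small simulation sketch of the scheme described above. Everything in it is an assumption for illustration: the cover work is zero-mean Gaussian, w_r is a random +-1 pattern seeded by a stand-in "key", and the detector thresholds the correlation at zero. Note that the detector computes the scalar product from the received C and w_r only; the +-a*|w_r|^2/(ji) expression only describes the value that computation is expected to come out near.

```python
import random

def linear_correlation(c, w):
    """Normalized scalar product (1/N) * sum(c[i] * w[i])."""
    return sum(ci * wi for ci, wi in zip(c, w)) / len(c)

def embed(cover, w_r, bit, alpha, rng, noise_std=1.0):
    """Return C = C_0 + W_m + N, with W_m = +alpha*w_r for bit 1 and -alpha*w_r for bit 0."""
    sign = 1.0 if bit == 1 else -1.0
    return [c0 + sign * alpha * w + rng.gauss(0.0, noise_std)
            for c0, w in zip(cover, w_r)]

def detect(c, w_r):
    """Blind detection: uses only the received work C and the pattern w_r."""
    return 1 if linear_correlation(c, w_r) > 0 else 0

rng = random.Random(42)                            # stands in for the watermarking key
n = 10_000                                         # number of pixels (j*i)
w_r = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # reference pattern, |w_r|^2/n = 1
cover = [rng.gauss(0.0, 10.0) for _ in range(n)]   # hypothetical zero-mean cover work

for bit in (0, 1):
    c = embed(cover, w_r, bit, alpha=1.0, rng=rng)
    print(bit, round(linear_correlation(c, w_r), 3), detect(c, w_r))
```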

Numerical differentiation using Cauchy (CIF)

I am trying to create a module with a mathematical class for Taylor series, so that it is easily accessible from other projects. Hence I wish to optimize it as far as I can.
For those who are not too familiar with Taylor series: they require differentiating a function at a point many times. Given that the usual limit definition of the derivative requires immense precision for higher-order derivatives, I've decided to use Cauchy's integral formula instead. With a little bit of work I've managed to rearrange the formula, as you can see in this picture: Rearranged formula. This gave me much more accurate results for higher-order derivatives than the traditional definition of the derivative. Here is the function I am currently using to differentiate a function at a point:
import numpy as np
from math import factorial

def myDerivative(f, x, dTheta, degree):
    # Riemann sum of Cauchy's integral formula over the unit circle around x
    riemannSum = 0
    theta = 0
    while theta < 2*np.pi:
        functionArgument = np.complex128(x + np.exp(1j*theta))
        secondFactor = np.complex128(np.exp(-1j * degree * theta))
        riemannSum += f(functionArgument) * secondFactor * dTheta
        theta += dTheta
    return factorial(degree)/(2*np.pi) * riemannSum.real
I've tested this function in my main function with a carefully chosen mathematical function whose derivatives I know, namely f(x) = sin(x).
def main():
    print(myDerivative(f, 0, 2*np.pi/(4*4096), 16))
These derivatives seem to freak out at around degree 16. I've also tried playing around with dTheta, but with no luck. I would like to have higher orders as well, but I fear I've run into some kind of machine precision limit.
My question in its simplest form: what can I do to improve this function in order to get higher-order derivatives?
I seem to have come up with a solution to the problem. I did this by rearranging Cauchy's integral formula in a different way, by exploiting that the initial contour integral can be an arbitrarily large circle around the point of differentiation. Be aware that it is very important that the function is analytic in the complex plane for this to be valid.
New formula
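The linked formula is presumably the following form of Cauchy's integral formula on a circle of radius r around x. This is a reconstruction inferred from the function below, where r corresponds to contourRadius and n to degree:

```latex
f^{(n)}(x) = \frac{n!}{2\pi i}\oint_{|z-x|=r}\frac{f(z)}{(z-x)^{n+1}}\,dz
           = \frac{n!}{2\pi r^{n}}\int_{0}^{2\pi} f\!\left(x + r e^{i\theta}\right) e^{-i n\theta}\,d\theta
```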
Also this gives a new function for differentiation:
def myDerivative(f, x, dTheta, degree, contourRadius):
    riemannSum = 0
    theta = 0
    while theta < 2*np.pi:
        functionArgument = np.complex128(x + contourRadius*np.exp(1j*theta))
        secondFactor = (1/contourRadius)**degree*np.complex128(np.exp(-1j * degree * theta))
        riemannSum += f(functionArgument) * secondFactor * dTheta
        theta += dTheta
    return factorial(degree) * riemannSum.real / (2*np.pi)
This gives me a very accurate differentiation of high orders. For instance I am able to differentiate f(x)=e^x 50 times without a problem.
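As a self-contained way to try the radius idea, here is a variant sketch that uses only the standard library. It differs from the function above in one assumed detail: it takes a fixed number of equally spaced samples instead of a dTheta step, which avoids accumulating floating-point error in theta. The name cauchy_derivative and the default of 256 samples are illustrative choices.

```python
import cmath
import math

def cauchy_derivative(f, x, degree, radius, samples=256):
    """Approximate the degree-th derivative of an analytic function f at x."""
    total = 0j
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        z = x + radius * cmath.exp(1j * theta)           # point on the contour
        total += f(z) * cmath.exp(-1j * degree * theta)
    # The mean over the samples approximates (1/(2*pi)) * integral over theta.
    return math.factorial(degree) * (total / samples).real / radius ** degree

print(cauchy_derivative(cmath.exp, 0.0, 10, radius=5.0))  # exact value is 1
print(cauchy_derivative(cmath.sin, 0.0, 5, radius=5.0))   # exact value is cos(0) = 1
```

A radius of roughly the same size as the derivative order tends to keep the terms well scaled for functions like exp, but the best choice depends on the function.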
Well, since you are working with a discrete approximation of the derivative (via dTheta), sooner or later you must run into trouble. I'm surprised you were able to get at least 15 accurate derivatives -- good work! But to get derivatives of all orders, either you have to put a limit on what you're willing to accept and say it's good enough, or else compute the derivatives symbolically. Take a look at Sympy for that. Sympy probably has some functions for computing Taylor series too.

Is there a graph-drawing tool that will allow me to constrain x, and automatically lay out y?

I am looking for a tool similar to graphviz that can render graphs, but that will allow me to constrain just the x coordinate of each node. Then, the tool will automatically choose y coordinates to make the graph look neat.
Basically, I want to make a timeline.
Language / platform / rendering medium are not very important.
If you want a neat-looking graph, a force-directed algorithm is going to be your best bet. One of the best ones is SFDP (developed by AT&T, included in graphviz), though I can't seem to find pseudocode or an easy implementation. I don't think there are any algorithms this specialized. Thankfully, it's easy to code your own. I'll present some pseudocode mostly lifted from Wikipedia, but with suitably one-dimensional modifications. I'll assume you have n vertices and the vector of x-positions is x, subscripted by x.i.
set all vertex velocities to (0,0)
set all vertex positions to (x.i, random)
KE = infinity
while (KE > epsilon)
    KE = 0
    for each vertex v
        force = (0,0)
        for each vertex u != v
            force = force + (0, coulomb(u, v).y)
            if u is incident to v
                force = force + (0, hooke(u, v).y)
        v.velocity = (v.velocity + timestep * force) * damping
        v.position = v.position + timestep * v.velocity
        KE = KE + |v.velocity| ^ 2
here the .y denotes getting the y-component of the force. This ensures that the x-components of the positions of the vertices never change from what you set them to be. The epsilon parameter is to be set by you, and should be something small compared to what you expect KE (the kinetic energy) to be. Also, |v| denotes the magnitude of the vector v (all computations are of 2-vectors in the above, except the KE). Note I set the mass of all the nodes to be 1, but you can change that if you want.
The Hooke and Coulomb functions calculate the respective forces between nodes: the first grows linearly with the distance between the vertices and the second falls off with the square of the distance, so the attraction and repulsion balance at some finite separation. These functions look something like
def hooke(u, v)
    // attractive force on v, pulling it toward u; linear in the distance
    return -k * (v.position - u.position)
def coulomb(u, v)
    // repulsive force on v, pushing it away from u; inverse-square in the distance
    return C * (v.position - u.position) / |v.position - u.position| ^ 3
where again most computations are in vector form. C and k are real-valued constants; experiment with them to get the graph you want. In two dimensions this usually isn't necessary, because the scaling factors pretty much just expand or contract the whole graph, but here the x-distances are fixed, so to get a good-looking graph you will have to tune the values a bit.
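The pseudocode above could be sketched in Python roughly as follows. This is a minimal illustration, not a tuned layout engine: the function name layout_y, the constants, the iteration cap, and the small softening term in the distance are all assumptions of this sketch.

```python
import math
import random

def layout_y(xs, edges, k=0.05, c=1.0, damping=0.8, dt=0.1,
             eps=1e-6, max_iter=2000, seed=0):
    """Choose y coordinates for nodes with fixed x coordinates xs."""
    rng = random.Random(seed)
    n = len(xs)
    ys = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    vy = [0.0] * n
    nbrs = {i: set() for i in range(n)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    for _ in range(max_iter):
        forces = []
        for v in range(n):
            fy = 0.0
            for u in range(n):
                if u == v:
                    continue
                dx, dy = xs[v] - xs[u], ys[v] - ys[u]
                d2 = dx * dx + dy * dy + 1e-9      # softened squared distance
                fy += c * dy / (d2 * math.sqrt(d2))  # y-part of Coulomb repulsion
                if u in nbrs[v]:
                    fy -= k * dy                   # y-part of Hooke attraction
            forces.append(fy)
        ke = 0.0
        for v in range(n):
            vy[v] = (vy[v] + dt * forces[v]) * damping
            ys[v] += dt * vy[v]                    # only y moves; x stays as given
            ke += vy[v] ** 2
        if ke < eps:
            break
    return ys

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
print(layout_y(xs, edges))
```

Because only the y-component of each force is applied, the x coordinates never change, which is the timeline constraint asked for in the question.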

Given a set of points, how do I approximate the major axis of its shape?

Given a "shape" drawn by the user, I would like to "normalize" it so they all have similar size and orientation. What we have is a set of points. I can approximate the size using bounding box or circle, but the orientation is a bit more tricky.
The right way to do it, I think, is to calculate the major axis of its bounding ellipse. To do that you need to calculate the eigenvectors of the covariance matrix. Doing so is likely way too complicated for my needs, since I am only looking for a good-enough estimate. Picking the min, the max, and 20 random points could be a start. Is there an easy way to approximate this?
I found Power method to iteratively approximate eigenvector. Wikipedia article.
So far I am liking David's answer.
You'd be calculating the eigenvectors of a 2x2 matrix, which can be done with a few simple formulas, so it's not that complicated. In pseudocode:
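// sums are over all points, after subtracting the mean from x and y
b = (sum(x * x) - sum(y * y)) / (2 * sum(x * y))   // requires sum(x * y) != 0
evec1_x = b + sqrt(b ** 2 + 1)
evec1_y = 1
evec2_x = b - sqrt(b ** 2 + 1)
evec2_y = 1
// evec1 is the major axis if sum(x * y) > 0, otherwise evec2 is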
You could even do this by summing over only some of the points to get an estimate, if you expect that your chosen subset of points would be representative of the full set.
Edit: I think x and y must be translated to zero-mean, i.e. subtract mean from all x, y first (eed3si9n).
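An equivalent closed form that avoids the division by sum(x * y) uses the orientation angle of the 2x2 covariance matrix, theta = atan2(2*Sxy, Sxx - Syy) / 2. A minimal sketch, assuming the points are given as (x, y) pairs; the function name major_axis is illustrative:

```python
import math

def major_axis(points):
    """Unit vector along the major axis of a 2-D point set (closed-form 2x2 PCA)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Centered second moments (zero-mean, as noted in the edit above).
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    # Orientation of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)

print(major_axis([(t, 0.5 * t) for t in range(-5, 6)]))
```

The returned direction is only defined up to sign, and for a perfectly circular point cloud any direction is as good as another.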
Here's a thought... What if you performed a linear regression on the points and used the slope of the resulting line? If not all of the points, at least a sample of them.
The r^2 value would also give you information about the general shape. The closer to 0, the more circular/uniform the shape is (circle/square). The closer to 1, the more stretched out the shape is (oval/rectangle).
The ultimate solution to this problem is running PCA
I wish I could find a nice little implementation for you to refer to...
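Here you go! (assuming x is an n x 2 array)
import numpy as np
def majAxis(x):
    e, v = np.linalg.eig(np.cov(x.T))  # eigen-decomposition of the 2x2 covariance
    return v[:, np.argmax(e)]          # eigenvector with the largest eigenvalue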
