How to account for Gaussian noise in a rigid rotation?

Given a rigid object, roll, pitch and yaw are estimated by an algorithm (e.g. an EKF or the Madgwick filter using an IMU);
a quaternion parameterization is also fine.
However, the estimated states are not the true values, i.e. estimate = true value + error.
Assume that this error ~ N(0, sigma^2).
How, then, should a rotation matrix with Gaussian noise be handled?
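One common first-order model is to treat the estimate as the true rotation composed with a small random rotation. Below is a minimal sketch (my own illustration, not part of the question; the value of sigma and all names are assumptions) using scipy's Rotation class, with zero-mean Gaussian noise on the rotation vector:
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)
sigma = 0.01  # noise standard deviation in radians (assumed value)

# True attitude built from roll, pitch, yaw (radians).
R_true = Rotation.from_euler("xyz", [0.3, -0.1, 0.7])

# Model the estimate as R_noise * R_true, where R_noise is a small random
# rotation whose rotation vector is drawn from N(0, sigma^2 I).
eps = rng.normal(0.0, sigma, size=3)
R_est = Rotation.from_rotvec(eps) * R_true

# Monte Carlo check: the residual rotation R_est * R_true^-1 has an angle
# whose spread is governed by sigma.
angles = [
    (Rotation.from_rotvec(rng.normal(0.0, sigma, size=3)) * R_true * R_true.inv()).magnitude()
    for _ in range(10000)
]
print("mean residual angle [rad]:", np.mean(angles))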

Related

Custom smoothing kernel

I would like to use Smooth.ppp in spatstat to calculate a sort of "moving average" according to a specific function. The specific distance-dependent weights I would like to use are given by a function wt; for simplicity
wt=function(x,y) exp(-1e5*(x-y)^2)
In the extreme case where wt is used as the kernel, I'd expect no smoothing (i.e. input marks = smoothed estimates). I'm wondering what I am misunderstanding here about the kernel and how it is applied.
remotes::install_github("spatstat/spatstat.core")
n=4; PPP=ppp(rep(1:n,each=n), rep(1:n,n), c(1,n), c(1,n), marks=1:n^2)  # n x n grid of marked points
smo=Smooth.ppp(PPP, cutoff=2, kernel=wt, at="points")  # smooth the marks at the data points
rbind(marks(PPP), smo)  # compare input marks with smoothed values
(I'm using the latest spatstat build to allow estimates at points using a custom kernel)
This example may have been misinterpreted.
The kernel should be a function(x, y) in the R language which gives the value, at a spatial location (x,y), of the kernel centred at the origin (0,0). Generally the kernel takes its largest values when (x,y) is close to (0,0), and drops to zero when (x,y) is far from (0,0).
The function wt defined in your example has values close to 1 along the diagonal line x = y, and drops to zero rapidly away from the diagonal.
That is unusual. It means that a data point at location (a,b) will be 'smoothed' along the infinite line through the data point with unit slope, with equation y = x + b-a, rather than being smoothed over a region close to (a,b) as it normally would.
The example point pattern PPP consists of points along the diagonal y=x.
The smoothed value at a data point is the weighted average of the mark values at all data points, with weights proportional to the kernel value. In your example, the kernel value for each pair of data points, wt(x1-x2, y1-y2), is equal to 1 because all the data and query points lie on the same line with slope 1.
The kernel weights are all equal in this example, so with leaveoneout=FALSE the smoothed values should all equal the average mark value; with leaveoneout=TRUE the smoothed value at data point i is the average of the mark values at the data points excluding point i.
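To see the weighted-average behaviour concretely, here is a small numerical sketch (my own, in Python rather than spatstat's R, with a grid of marked points mimicking the example): the smoothed mark at each point is a kernel-weighted average over displacements, so a kernel peaked at (0,0) keeps the average local, while the wt from the question weights every point along a slope-1 line equally.
import numpy as np

# 4x4 grid of points with marks 1..16, mimicking the ppp example.
n = 4
xx, yy = np.meshgrid(np.arange(1, n + 1), np.arange(1, n + 1))
x, y = xx.ravel().astype(float), yy.ravel().astype(float)
marks = np.arange(1, n * n + 1, dtype=float)

def smooth(kernel):
    # Kernel-weighted average of marks at each data point
    # (the analogue of Smooth.ppp with leaveoneout=FALSE).
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    w = kernel(dx, dy)  # weight of point j when smoothing at point i
    return (w @ marks) / w.sum(axis=1)

# A conventional kernel: largest at displacement (0,0), decaying with distance.
iso = lambda dx, dy: np.exp(-2.0 * (dx**2 + dy**2))
# The kernel from the question: constant along the direction dx == dy.
diag = lambda dx, dy: np.exp(-1e5 * (dx - dy)**2)

print(smooth(iso))   # local averages, close to the original marks
print(smooth(diag))  # each value is an average along a slope-1 line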

PyTorch albumentations augmentation p value?

augmented_images(raw.image_id.unique()[1230], albumentations.HorizontalFlip(p=1))
For augmented_image, what does p=1 mean? Does changing this value change the angle?
If it doesn't, how should I produce horizontal-flip augmentations with various different angles?
As you can see in the docs of albumentations.HorizontalFlip:
Parameters: p (float) – probability of applying the transform. Default: 0.5.
If you want to rotate, you should consider using albumentations.augmentations.transforms.Rotate:
Rotate(limit=90, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=0.5)
Rotate the input by an angle selected randomly from the uniform distribution.
Parameters:
limit ((int, int) or int) – range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: (-90, 90)
[...]
p (float) – probability of applying the transform. Default: 0.5.
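As a hedged illustration (mine, not from the original answer; the random image is just a stand-in), a flip and a random rotation can be combined in one pipeline, with p controlling how often each transform is applied:
import albumentations
import numpy as np

# Always flip, then rotate by a random angle in [-30, 30] degrees half the time.
transform = albumentations.Compose([
    albumentations.HorizontalFlip(p=1.0),    # p=1: apply the flip to every image
    albumentations.Rotate(limit=30, p=0.5),  # p=0.5: rotate about half the images
])

image = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)  # dummy image
augmented = transform(image=image)["image"]
print(augmented.shape)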

What is the error of the value corresponding to the maximum of a function?

This is my problem:
The first input is the observed data of MUSE, an astronomical instrument that provides cubes, i.e. an image for each wavelength within a certain range. This means that, taking all the wavelengths corresponding to pixel i,j, I can extract the spectrum for this pixel. Since these images are observed, for each pixel I have an error.
The second input is a spectrum template, i.e. a model of a spectrum. This template is assumed to be without error. I map this spectrum to various redshifts (this means multiplying the wavelengths by a factor 1+z, where z belongs to a certain range).
The core of my code is the cross-correlation between the cube, i.e. the spectra extracted from each pixel, and the template mapped to the different redshifts. The result is a cross-correlation function for each pixel and each z; let's call this computed function f(z). Taking, for each pixel, the argmax of f(z), I get the best redshift.
This is a common and widely-used process, and indeed it works well.
My question:
Since my input, i.e. the MUSE cube, has an error, I have propagated this error through the cross-correlation, obtaining an error on f(z), i.e. each f_i has an error sigma_i. So, how can I compute the error on z_max, which is the value of z corresponding to the maximum of f?
Maybe a solution could be a bootstrap method: I could draw, within the error of f, a certain number of realizations of the function, compute the argmax of each, and so get an idea of the scatter of z_max.
By the way, I'm using Python (3.x), and TensorFlow has been used to compute the cross-correlation function.
Thanks!
EDIT
Following @TF_Support's suggestion, I'm adding some code and some figures to better explain the problem. But before that, a little math may help.
With this expression I computed the cross-correlation (for one pixel, dropping the pixel index):
f_k = ( sum_j S_j T_jk ) / N_k,   with N_k = sqrt( (sum_j S_j^2) (sum_j T_jk^2) )
where S is the spectrum, T is the template at the k-th redshift, and N is the normalization coefficient. Since S has an error, I propagated these errors through the previous relation, finding:
sigma_f_k^2 = sum_j sigma_j^2 T_jk^2 / N_k^2 + f_k^2 SST_k^2 sum_j (S_j sigma_j)^2 / N_k^4 - 2 f_k SST_k sum_j S_j sigma_j^2 T_jk / N_k^3
where SST_k = sum_j T_jk^2 is the sum of the squared template and sigma_j is the error on S_j (actually, I should have written sigma_S_j).
The following function (implemented with TensorFlow 2.1) computes the cross-correlation between one template (mapped to Nz redshifts) and the spectra of a batch of pixels, together with the error on the cross-correlation function:
import tensorflow as tf

@tf.function
def make_xcorr_err1(T, S, sigma_S):
    # S: spectra, shape (batch, Nlambda); T: templates, shape (Nlambda, Nz)
    sum_spectra_sq = tf.reduce_sum(tf.square(S), 1)   # shape (batch,)
    sum_template_sq = tf.reduce_sum(tf.square(T), 0)  # shape (Nz,)
    norm = tf.sqrt(tf.reshape(sum_spectra_sq, (-1, 1)) * tf.reshape(sum_template_sq, (1, -1)))  # shape (batch, Nz)
    xcorr = tf.matmul(S, T) / norm
    # The three terms of the propagated variance, matching the expression above.
    foo1 = tf.matmul(sigma_S**2, T**2) / norm**2
    foo2 = xcorr**2 * tf.reshape(sum_template_sq**2, (1, -1)) * tf.reshape(tf.reduce_sum((S * sigma_S)**2, 1), (-1, 1)) / norm**4
    foo3 = -2 * xcorr * tf.reshape(sum_template_sq, (1, -1)) * tf.matmul(S * sigma_S**2, T) / norm**3
    sigma_xcorr = tf.sqrt(tf.maximum(foo1 + foo2 + foo3, 0.))
    return xcorr, sigma_xcorr
Maybe, in order to understand my problem, more important than the code is an image representing the output. This is the cross-correlation function for a single pixel; the maximum value, in red, is the best cross-correlated value, let's call it z_best. The figure also shows the 3 sigma errors (the grey limits are +3 sigma and -3 sigma).
If I zoom in near the peak, I get this:
As you can see, the maximum (like any other value) oscillates within a certain range. I would like to find a way to map these fluctuations of the maximum (or the fluctuations around the maximum, or the fluctuations of the whole function) to an error on the value corresponding to the maximum, i.e. an error on z_best.
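One way to realize the bootstrap idea sketched above (my own hedged illustration, assuming the errors are independent and Gaussian per z-bin; all data here are made up): draw many realizations of f(z) within its error band, take the argmax of each, and use the scatter of the resulting z values as the error on z_best.
import numpy as np

def zbest_error(z, f, sigma_f, n_draws=10000, seed=0):
    # Monte Carlo estimate of the error on z_best = argmax f(z),
    # assuming each f(z_k) fluctuates independently as N(f_k, sigma_k^2).
    rng = np.random.default_rng(seed)
    draws = rng.normal(f, sigma_f, size=(n_draws, len(z)))  # realizations of f
    z_best = z[np.argmax(draws, axis=1)]                    # argmax per draw
    return z_best.mean(), z_best.std()

# Toy example: a peak near z = 0.8 with a flat error band.
z = np.linspace(0.0, 2.0, 401)
f = np.exp(-0.5 * ((z - 0.8) / 0.05)**2)
sigma_f = np.full_like(f, 0.05)
mean, err = zbest_error(z, f, sigma_f)
print(f"z_best = {mean:.4f} +/- {err:.4f}")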

How do I prevent minimize (via SCIPY) from outputting "optimized" parameters that I have input as guesses?

I am trying to use the minimize function from the scipy module. The full code is too lengthy to post, but the main idea is that there are multiple defined distributions that should be fittable against datasets. The observations per bin are easily calculated from the datasets, whereas the expectations per bin are calculated by a function that uses one argument to specify which distribution should be integrated over the bin bounds (where the bin bounds are identical to the histogram bins). There are three functions chisqI, where I = 1, 2, 3 (one for each distribution), each of which takes the observations and expectations per bin and outputs the chi square. Then there are three more functions, each of which takes a chisqI and args and outputs the minimized function result and the optimized parameters. Here, the args are the parameters mu and sigma that will be optimized to produce the smallest chi square. I was able to pass arguments through a chain of functions for one distribution, and am wondering whether I need to pass another arg that specifies which distribution is being dealt with further down the chain.
There are different methods that the minimize function can use, like Nelder-Mead or CG. I've been trying to compare results from the different methods to find the one that provides the best fit (where the best fit is defined as the one that produces the smallest chi square, or equivalently the largest p-value, when compared to an actual dataset). Interestingly, the Nelder-Mead and Powell methods produce the lowest chi square relative to the other methods, but the plotted fit against the histogram of the actual data looks better with the other methods. For the code outputs below, the function value is the negative of the p-value associated with the chi-square value; this is the result being minimized. CHISQ_RED is the reduced chi-square value computed from CHISQ_TOT and the degrees of freedom, and the first and second elements in the x: array are the optimized parameters mu and sigma for a distribution, respectively.
Running the Nelder-Mead minimization method produces the output below.
final_simplex: (array([[ 6.00002802, 0.60020636],
[ 5.99995429, 0.60018798],
[ 6.0000716 , 0.60011127]]), array([ -5.16845821e-21, -5.16838926e-21, -5.16815050e-21]))
fun: -5.1684582072826815e-21
message: 'Optimization terminated successfully.'
nfev: 47
nit: 24
status: 0
success: True
x: array([ 6.00002802, 0.60020636])
CHISQ_TOT = 259.042420419 CHISQ_RED = 3.36418727816
Running the CG minimization method produces the output below.
fun: -4.0964504680695594e-97
jac: array([ 8.72867710e-94, -3.96555507e-93])
message: 'Optimization terminated successfully.'
nfev: 4
nit: 0
njev: 1
status: 0
success: True
x: array([ 6.01921293, 0.54436257])
CHISQ_TOT = 683.781671477 CHISQ_RED = 8.88028144776
Yet, the fit with a higher chi square value looks like a better fit (same dataset in the histogram).
The problem is that every method of minimization outputs my guess parameters (mu and sigma) as the optimized parameters. The Nelder-Mead method (smaller chi square, worse-looking fit) has 47 function evaluations and 24 iterations, whereas the CG method (larger chi square, better-looking fit) has 4 function evaluations and 0 iterations. I tried to change this by adding extra options in the minimization call (where chisq3 is the pre-defined function of mu and sigma being minimized, and parameterguess is [mu_guess, sigma_guess]):
minimize( chisq3 , parameterguess , method = 'CG', options={'gtol':1e-50, 'maxiter': 100})
If I change my guess value of mu and sigma by adding 2 to each, then the fits become drastically worse (as the guess value for the optimized parameters is rather decent). I'm not sure if it's relevant, but the data shown in the plots are adapted from a lognormal distribution by taking the logarithm of each value in my dataset to create a "pseudo-" Gaussian shape/distribution (over logarithmic x axes).
I am guessing that the minimize function via scipy is supposed to do many iterations to be truly successful. So I think adding more iterations should decrease the sensitivity of the minimize function to my initial guess of parameters.
Most importantly, is this a common error using the minimize function via scipy? If so, what are some common fixes for this? Also, why would the minimize function do many iterations and function evaluations only to produce the same result as the input?
The problem was that the chi square is the sum, over bins, of the squared difference between the expectation value and the observed value, divided by the expectation value. Each term was a small number divided by a large number, squared, and these were summed thousands of times, contributing to zero-division problems and round-off errors. By minimizing a simpler function, such as the chi square without the denominator term, the source of the bug is gone, and one can calculate a chi square from the obtained parameter fit afterwards.
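As a hedged sketch of that fix (my own toy example with made-up data and a Gaussian model, not the original code): minimize the plain sum of squared residuals, then compute the Pearson chi square once at the fitted parameters.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.normal(6.0, 0.6, size=5000)  # toy "log of lognormal" dataset
counts, edges = np.histogram(data, bins=50)

def expected(params):
    # Expected counts per bin for a Gaussian with the given mu, sigma.
    mu, sigma = params
    probs = norm.cdf(edges[1:], mu, sigma) - norm.cdf(edges[:-1], mu, sigma)
    return len(data) * probs

def ssr(params):
    # Sum of squared residuals: no denominator, so no zero-division issues.
    return np.sum((counts - expected(params))**2)

res = minimize(ssr, x0=[5.0, 1.0], method="Nelder-Mead")

# Report the usual Pearson chi square once, at the fitted parameters.
e = expected(res.x)
chisq = np.sum((counts - e)**2 / np.maximum(e, 1e-12))
print(res.x, chisq)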

Rendering Fractals: The Mobius Transformation and The Newtonian Basin

I understand how to render (two dimensional) "Escape Time Group" fractals (Julia and Mandelbrot), but I can't seem to get a Mobius Transformation or a Newton Basin rendered.
I'm trying to render them using the same method (by recursively using the polynomial equation on each pixel 'n' times), but I have a feeling these fractals are rendered using totally different methods. Mobius 'Transformation' implies that an image must already exist, and then be transformed to produce the geometry, and the Newton Basin seems to plot each point, not just points that fall into a set.
How are these fractals graphed? Are they graphed using the same iterative methods as the Julia and Mandelbrot?
Equations I'm Using:
Julia: Zn+1 = Zn^2 + C
Where Z is a complex number representing a pixel, and C is a complex constant (Correct).
Mandelbrot: Cn+1 = Cn^2 + Z
Where Z is a complex number representing a pixel, and C is the complex number (0, 0), and is compounded each step (The reverse of the Julia, correct).
Newton Basin: Zn+1 = Zn - (Zn^x - a) / (Zn^y - a)
Where Z is a complex number representing a pixel, x and y are exponents of various degrees, and a is a complex constant (Incorrect - creating a centered, eight legged 'line star').
Mobius Transformation: Zn+1 = (aZn + b) / (cZn + d)
Where Z is a complex number representing a pixel, and a, b, c, and d are complex constants (Incorrect, everything falls into the set).
So how are the Newton Basin and Mobius Transformation plotted on the complex plane?
Update: Mobius Transformations are just that; transformations.
"Every Möbius transformation is
a composition of translations,
rotations, zooms (dilations) and
inversions."
To perform a Mobius Transformation, a shape, picture, smear, etc. must be present already in order to transform it.
Now how about those Newton Basins?
Update 2: My math was wrong for the Newton Basin. The denominator at the end of the equation is (supposed to be) the derivative of the original function. The method can be understood by studying 'NewtonRoot.m' from the MIT MATLAB source code; a search engine can find it quite easily. I'm still at a loss as to how to graph it on the complex plane, though...
Newton Basin:
Zn+1 = Zn - f(Zn) / f'(Zn)
In Mandelbrot and Julia sets you terminate the inner loop once |z| exceeds a certain threshold, as a measurement of how fast the orbit "reaches" infinity:
if (|z| > 4) { stop }
For Newton fractals it is the other way round: since the Newton method usually converges towards a certain value, we are interested in how fast it reaches its limit. This can be done by checking when the difference between two consecutive values drops below a certain threshold (usually 10^-9 is a good value):
if (|z[n] - z[n-1]| < epsilon) { stop }
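Putting the pieces together, here is a minimal sketch (my own, for f(z) = z^3 - 1; the grid size and epsilon are arbitrary choices): iterate Newton's method per pixel, stop when consecutive values agree to within epsilon, and color each pixel by the root it reaches, shaded by the iteration count.
import numpy as np

# Newton basin for f(z) = z^3 - 1, whose roots are the three cube roots of 1.
roots = np.array([1.0, -0.5 + 0.8660254j, -0.5 - 0.8660254j])
eps, max_iter = 1e-9, 50

# One complex number per pixel.
re, im = np.meshgrid(np.linspace(-2, 2, 400), np.linspace(-2, 2, 400))
z = re + 1j * im

speed = np.full(z.shape, max_iter)  # iteration at which each pixel converged

for n in range(max_iter):
    z_next = z - (z**3 - 1) / (3 * z**2)  # Newton step: z - f(z)/f'(z)
    newly = (np.abs(z_next - z) < eps) & (speed == max_iter)
    speed[newly] = n
    z = z_next

# Assign each pixel to its nearest root: three basins, shaded by speed.
basin = np.argmin(np.abs(z[..., None] - roots), axis=-1)
print(np.bincount(basin.ravel()), speed.min(), speed.max())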
