Why is Julia updating a different struct than the one I intend in my function (trying to implement a 4th-order Runge-Kutta method)?

In Julia I am integrating two fields of a struct: x position and x velocity. The function d(u, du) is meant to return only the du vector without altering any values in u; u is only read to calculate du. Instead, it changes the values of u rather than du. I have one vector of structs for du and one for u. On each time step, I update my du.x with u.xvelocity, and I update my u.xvelocity with my acceleration. For some reason it breaks when I calculate k2 for my Runge-Kutta. The MWE is appended below and should reproduce the error I am getting. If my Runge-Kutta looks incorrect, please let me know as well. Best.
module MyOde
mutable struct Particle
x::Float64
xvelocity::Float64
end # End struct
function d(u, du)
for i in 1:length(u)
testme = u[i].x
display(testme)
display(u[i].xvelocity)
du[i].x = 1.
println("now see the issue:")
display(u[i].x)
testme != u[i].x ? error("\n What the heck is going on here? ") : nothing
du[i].xvelocity = 1.
end # End loop
return du
end # End function
function f(u::Vector{Particle}, d, timeend, dt)
du = Vector{Particle}(undef, length(u))
k2 = Vector{Particle}(undef, length(u))
k3 = Vector{Particle}(undef, length(u))
k4 = Vector{Particle}(undef, length(u))
for i ∈ 1:length(u)
du[i] = Particle(0.0, 0.0)
k2[i] = Particle(0.0, 0.0)
k3[i] = Particle(0.0, 0.0)
k4[i] = Particle(0.0, 0.0)
end # End list push
for i in 0.0:dt:timeend
# Calculate the k values which will be going into the 4th order Runge-Kutta method.
k1 = d(u, du)
for i ∈ 1:length(u)
k2[i].x = u[i].x + k1[i].x *dt/2
k2[i].xvelocity = u[i].xvelocity + k1[i].xvelocity *dt/2
end # End k2 loop
k2 = d(k2, du)
for i ∈ 1:length(u)
k3[i].x = u[i].x + k2[i].x *dt/2
k3[i].xvelocity = u[i].xvelocity + k2[i].xvelocity *dt/2
end # End k3 loop
k3 = d(k3, du)
for i ∈ 1:length(u)
k4[i].x = u[i].x + k3[i].x *dt
k4[i].xvelocity = u[i].xvelocity + k3[i].xvelocity *dt
end # End k4 loop
k4 = d(k4, du)
for i ∈ 1:length(u)
u[i].x += 1/6 * dt * (k1[i].x + 2k2[i].x + 2k3[i].x + k4[i].x)
u[i].xvelocity += 1/6 * dt * (k1[i].xvelocity + 2k2[i].xvelocity + 2k3[i].xvelocity + k4[i].xvelocity)
end # End loop
end # End loop
end # End function
u = [Particle(0.0, 0.0); Particle(6.00, 0.0)]
timeend = .01
dt = 0.01
@time f(u, d, timeend, dt)
end # End module

As you said, function d(u, du) returns a mutated du. When you run k2 = d(k2, du), the variable k2 is reassigned to the object referenced by the function argument du, and the object it previously referenced is discarded, along with all the work done on it in the preceding loop while it was still named k2.
You make the same mistake for the rest of the ks, so by the time you reach the 2nd iteration of the for i in 0.0:dt:timeend loop, the variables du, k1, k2, k3, k4 all reference the same object. The error is thrown on the 2nd iteration's k2 = d(k2, du) because, inside d, the arguments u and du now reference the same object, which you mutate via du.
I don't know the math here, so I can't give any input on that. But I'm sure you want your variables to reference separate, independent objects. Look over your code and remove the assignments that cause this issue. For example, k2 = d(k2, du) should be just d(k2, du).
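The rebinding described above can be reproduced in a few lines. The sketch below uses Python rather than Julia, on the assumption that the relevant behavior is the same in both: assigning the return value of a mutating function rebinds the name and copies nothing. The Particle class and the constant slopes are stand-ins modeled on the MWE.

```python
class Particle:
    """Mutable stand-in for the Julia `mutable struct Particle`."""
    def __init__(self, x, xvelocity):
        self.x = x
        self.xvelocity = xvelocity


def d(u, du):
    """Fill du in place (constant slopes, as in the MWE) and return it."""
    for p in du:
        p.x = 1.0
        p.xvelocity = 1.0
    return du


du = [Particle(0.0, 0.0)]
k2 = [Particle(5.0, 0.0)]   # separate object holding the stage state
stage = k2                  # second name for the original object

k2 = d(k2, du)              # rebinding: k2 now names the same list as du

aliased = k2 is du          # one object, two names
lost = stage is not k2      # k2 no longer names the stage state

# Any later write "into du" is also visible through k2.
du[0].x = 42.0
seen_through_k2 = k2[0].x
```

Calling d(k2, du) without the assignment leaves k2 bound to its own object, which is the fix suggested above.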

If you want to use du = k1, k2, k3 (and k4) at the end in the update expression with their original values as slopes or derivatives, then you cannot overwrite them in between and use them as temporary points. Use a single temporary state utmp for that.
There are ways to minimize the number of arrays that need to be kept in an RK4 implementation, for instance by storing k2+k3 in k2 and using the k3 array in the role of k4, or by accumulating the step update during the stages, so that only the temporary state, the current slope and the accumulated slope need to be present.
For the general problem with the flow of the contents of the data arrays see the other answer.
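A minimal sketch of the layout suggested here, written in Python with NumPy rather than Julia: each slope k1..k4 keeps its own array, and a separate utmp holds the intermediate states. The right-hand side falling (constant acceleration of 1) is a hypothetical test problem, not the asker's system.

```python
import numpy as np


def rk4_step(f, t, u, dt):
    """One classical RK4 step; f(t, u) must return a new slope array."""
    k1 = f(t, u)
    utmp = u + 0.5 * dt * k1          # temporary state, never reused as a slope
    k2 = f(t + 0.5 * dt, utmp)
    utmp = u + 0.5 * dt * k2
    k3 = f(t + 0.5 * dt, utmp)
    utmp = u + dt * k3
    k4 = f(t + dt, utmp)
    # All four slopes still hold their original values here.
    return u + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def falling(t, u):
    # u = [x, xvelocity]; hypothetical right-hand side with acceleration 1
    return np.array([u[1], 1.0])


u = np.array([0.0, 0.0])
dt = 0.01
for step in range(100):
    u = rk4_step(falling, step * dt, u, dt)
# u now approximates the exact solution x = t**2 / 2, xvelocity = t at t = 1
```

Because the slopes are never overwritten, the final weighted sum uses the values that the method's formula expects.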

Related

Solving vector second order differential equation while indexing into an array

I'm attempting to solve the differential equation:
m(t) = M(x)x'' + C(x, x') + B x'
where x and x' are vectors with 2 entries representing the angles and angular velocity in a dynamical system. M(x) is a 2x2 matrix that is a function of the components of theta, C is a 2x1 vector that is a function of theta and theta' and B is a 2x2 matrix of constants. m(t) is a 2*1001 array containing the torques applied to each of the two joints at the 1001 time steps and I would like to calculate the evolution of the angles as a function of those 1001 time steps.
I've transformed it to standard form such that:
x'' = M(x)^-1 (m(t) - C(x, x') - B x')
Then substituting y_1 = x and y_2 = x' gives the first-order system of equations:
y_2 = y_1'
y_2' = M(y_1)^-1 (m(t) - C(y_1, y_2) - B y_2)
(I've used theta and phi in my code for x and y)
def joint_angles(theta_array, t, torques, B):
phi_1 = np.array([theta_array[0], theta_array[1]])
phi_2 = np.array([theta_array[2], theta_array[3]])
def M_func(phi):
M = np.array([[a_1+2.*a_2*np.cos(phi[1]), a_3+a_2*np.cos(phi[1])],[a_3+a_2*np.cos(phi[1]), a_3]])
return np.linalg.inv(M)
def C_func(phi, phi_dot):
return a_2 * np.sin(phi[1]) * np.array([-phi_dot[1] * (2. * phi_dot[0] + phi_dot[1]), phi_dot[0]**2])
dphi_2dt = M_func(phi_1) @ (torques[:, t] - C_func(phi_1, phi_2) - B @ phi_2)
return dphi_2dt, phi_2
t = np.linspace(0,1,1001)
initial = theta_init[0], theta_init[1], dtheta_init[0], dtheta_init[1]
x = odeint(joint_angles, initial, t, args = (torque_array, B))
I get the error that I cannot index into torques using the t array, which makes perfect sense, however I am not sure how to have it use the current value of the torques at each time step.
I also tried putting odeint command in a for loop and only evaluating it at one time step at a time, using the solution of the function as the initial conditions for the next loop, however the function simply returned the initial conditions, meaning every loop was identical. This leads me to suspect I've made a mistake in my implementation of the standard form but I can't work out what it is. It would be preferable however to not have to call the odeint solver in a for loop every time, and rather do it all as one.
If helpful, my initial conditions and constant values are:
theta_init = np.array([10*np.pi/180, 143.54*np.pi/180])
dtheta_init = np.array([0, 0])
L_1 = 0.3
L_2 = 0.33
I_1 = 0.025
I_2 = 0.045
M_1 = 1.4
M_2 = 1.0
D_2 = 0.16
a_1 = I_1+I_2+M_2*(L_1**2)
a_2 = M_2*L_1*D_2
a_3 = I_2
Thanks for helping!
The solver uses an internal stepping that is problem adapted. The given time list is a list of points where the internal solution gets interpolated for output samples. The internal and external time lists are in no way related, the internal list only depends on the given tolerances.
There is no actual natural relation between array indices and sample times.
The translation of a given time into an index and construction of a sample value from the surrounding table entries is called interpolation (by a piecewise polynomial function).
Torque as a physical phenomenon is at least continuous, so a piecewise linear interpolation is the easiest way to transform the given table of function values into an actual continuous function. Of course, one also needs the time array.
So use numpy.interp, or the more advanced routines of scipy.interpolate such as interp1d, to define a torque function that can be evaluated at arbitrary times as demanded by the solver and its integration method.
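As a rough illustration of this advice, the sketch below wraps a torque table in a function of continuous time using numpy.interp. The table contents (a sine and a cosine row) and the name torque_at are made up for the example; only the shape (2, 1001) and the time grid follow the question.

```python
import numpy as np

t_samples = np.linspace(0.0, 1.0, 1001)
# Hypothetical torque table with shape (2, 1001), one row per joint.
torque_table = np.vstack([np.sin(t_samples), np.cos(t_samples)])


def torque_at(t):
    """Piecewise-linear torque for both joints at an arbitrary time t."""
    return np.array([np.interp(t, t_samples, row) for row in torque_table])


# Halfway between the first two samples, where no table entry exists.
mid = torque_at(0.0005)
```

Inside the right-hand side passed to the solver, an expression like torques[:, t] would then become torque_at(t), with t being whatever time the solver requests.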

How to use np.where in another np.where (context: ray tracing)

The question is: how to use two np.where in the same statement, like this (oversimplified):
np.where((ndarr1==ndarr2),np.where((ndarr1+ndarr2==ndarr3),True,False),False)
The goal is to avoid computing the second conditional statement when the first one is not met.
My first objective is to find the intersection of a ray in a triangle, if there is one. This problem can be solved by this algorithm (found on stackoverflow):
def intersect_line_triangle(q1,q2,p1,p2,p3):
def signed_tetra_volume(a,b,c,d):
return np.sign(np.dot(np.cross(b-a,c-a),d-a)/6.0)
s1 = signed_tetra_volume(q1,p1,p2,p3)
s2 = signed_tetra_volume(q2,p1,p2,p3)
if s1 != s2:
s3 = signed_tetra_volume(q1,q2,p1,p2)
s4 = signed_tetra_volume(q1,q2,p2,p3)
s5 = signed_tetra_volume(q1,q2,p3,p1)
if s3 == s4 and s4 == s5:
n = np.cross(p2-p1,p3-p1)
t = np.dot(p1-q1,n) / np.dot(q2-q1,n)
return q1 + t * (q2-q1)
return None
Here are two conditional statements:
s1!=s2
s3==s4 & s4==s5
Now since I have >20k triangles to check, I want to apply this function on all triangles at the same time.
First solution is:
s1 = vol(r0,tri[:,0,:],tri[:,1,:],tri[:,2,:])
s2 = vol(r1,tri[:,0,:],tri[:,1,:],tri[:,2,:])
s3 = vol(r1,r2,tri[:,0,:],tri[:,1,:])
s4 = vol(r1,r2,tri[:,1,:],tri[:,2,:])
s5 = vol(r1,r2,tri[:,2,:],tri[:,0,:])
np.where((s1!=s2) & (s3+s4==s4+s5),intersect(),False)
where s1,s2,s3,s4,s5 are arrays containing the value S for each triangle. Problem is, it means I have to compute s3,s4,and s5 for all triangles.
Now the ideal would be to compute statement 2 (and s3,s4,s5) only when statement 1 is True, with something like this:
check= np.where((s1!=s2),np.where((compute(s3)==compute(s4)) & (compute(s4)==compute(s5)), compute(intersection),False),False)
(To simplify the explanation, I just wrote 'compute' instead of the whole computing process. Here, 'compute' is done only on the appropriate triangles.)
Now of course this option doesn't work (and computes s4 twice), but I'd gladly take recommendations on a similar process.
Here's how I used masked arrays to answer this problem:
loTrue= np.where((s1!=s2),False,True)
s3=ma.masked_array(np.sign(dot(np.cross(r0r1, r0t0), r0t1)),mask=loTrue)
s4=ma.masked_array(np.sign(dot(np.cross(r0r1, r0t1), r0t2)),mask=loTrue)
s5=ma.masked_array(np.sign(dot(np.cross(r0r1, r0t2), r0t0)),mask=loTrue)
loTrue= ma.masked_array(np.where((abs(s3-s4)<1e-4) & ( abs(s5-s4)<1e-4),True,False),mask=loTrue)
#also works when computing s3,s4 and s5 inside loTrue, like this:
loTrue= np.where((s1!=s2),False,True)
loTrue= ma.masked_array(np.where(
(abs(np.sign(dot(np.cross(r0r1, r0t0), r0t1))-np.sign(dot(np.cross(r0r1, r0t1), r0t2)))<1e-4) &
(abs(np.sign(dot(np.cross(r0r1, r0t2), r0t0))-np.sign(dot(np.cross(r0r1, r0t1), r0t2)))<1e-4),True,False)
,mask=loTrue)
Note that the same process, when not using such approach, is done like this:
s3= np.sign(dot(np.cross(r0r1, r0t0), r0t1) /6.0)
s4= np.sign(dot(np.cross(r0r1, r0t1), r0t2) /6.0)
s5= np.sign(dot(np.cross(r0r1, r0t2), r0t0) /6.0)
loTrue= np.where((s1!=s2) & (abs(s3-s4)<1e-4) & ( abs(s5-s4)<1e-4) ,True,False)
Both give the same results. However, when looping over this process for only 10k iterations, NOT using masked arrays is faster: 26 secs without masked arrays, 31 secs with masked arrays, and 33 secs when using masked arrays in a single statement (not computing s3, s4 and s5 separately, or computing s4 beforehand).
Conclusion: the nested condition is solved here with masked arrays (note that the mask indicates where values won't be computed, hence loTrue must first be set to False (0) where the condition is verified). However, in this scenario, it's not faster.
I can get a small speedup from short circuiting but I'm not convinced it is worth the additional admin.
full computation 4.463818839867599 ms per iteration (one ray, 20,000 triangles)
short ciruciting 3.0060838296776637 ms per iteration (one ray, 20,000 triangles)
Code:
import numpy as np
def ilt_cut(q1,q2,p1,p2,p3):
qm = (q1+q2)/2
qd = qm-q2
p12 = p1-p2
aux = np.cross(qd,q2-p2)
s3 = np.einsum("ij,ij->i",aux,p12)
s4 = np.einsum("ij,ij->i",aux,p2-p3)
ge = (s3>=0)&(s4>=0)
le = (s3<=0)&(s4<=0)
keep = np.flatnonzero(ge|le)
aux = p1[keep]
qpm1 = qm-aux
p31 = p3[keep]-aux
s5 = np.einsum("ij,ij->i",np.cross(qpm1,p31),qd)
ge = ge[keep]&(s5>=0)
le = le[keep]&(s5<=0)
flt = np.flatnonzero(ge|le)
keep = keep[flt]
n = np.cross(p31[flt], p12[keep])
s12 = np.einsum("ij,ij->i",n,qpm1[flt])
flt = np.abs(s12) <= np.abs(s3[keep]+s4[keep]+s5[flt])
return keep[flt],qm-(s12[flt]/np.einsum("ij,ij->i",qd,n[flt]))[:,None]*qd
def ilt_full(q1,q2,p1,p2,p3):
qm = (q1+q2)/2
qd = qm-q2
p12 = p1-p2
qpm1 = qm-p1
p31 = p3-p1
aux = np.cross(qd,q2-p2)
s3 = np.einsum("ij,ij->i",aux,p12)
s4 = np.einsum("ij,ij->i",aux,p2-p3)
s5 = np.einsum("ij,ij->i",np.cross(qpm1,p31),qd)
n = np.cross(p31, p12)
s12 = np.einsum("ij,ij->i",n,qpm1)
ge = (s3>=0)&(s4>=0)&(s5>=0)
le = (s3<=0)&(s4<=0)&(s5<=0)
keep = np.flatnonzero((np.abs(s12) <= np.abs(s3+s4+s5)) & (ge|le))
return keep,qm-(s12[keep]/np.einsum("ij,ij->i",qd,n[keep]))[:,None]*qd
tri = np.random.uniform(1, 10, (20_000, 3, 3))
p0, p1 = np.random.uniform(1, 10, (2, 3))
from timeit import timeit
A,B,C = tri.transpose(1,0,2)
print('full computation', timeit(lambda: ilt_full(p0[None], p1[None], A, B, C), number=100)*10, 'ms per iteration (one ray, 20,000 triangles)')
print('short ciruciting', timeit(lambda: ilt_cut(p0[None], p1[None], A, B, C), number=100)*10, 'ms per iteration (one ray, 20,000 triangles)')
Note that I played a bit with the algorithm, so this may not give the same result as yours in every edge case.
What I changed:
I inlined the tetra volume, which saves a few repeated subcomputations.
I replaced one of the ray ends with the midpoint M of the ray. This saves computing one tetra volume (s1 or s2) because one can check whether the ray crosses the triangle ABC plane by comparing the volume of tetra ABCM to the sum of s3, s4, s5 (if they have the same signs).
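Stripped of the geometry, the short-circuiting pattern used in ilt_cut looks roughly like the sketch below: run the cheap test on everything, keep the surviving indices with np.flatnonzero, and evaluate the second test only on that subset. The arrays a, b, c and the function expensive are hypothetical placeholders modeled on the simplified np.where example in the question.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=1000)
b = rng.integers(0, 3, size=1000)
c = rng.integers(0, 6, size=1000)


def expensive(x, y):
    # Placeholder for the costly second computation (hypothetical).
    return x + y


# Stage 1: cheap test on all elements.
keep = np.flatnonzero(a == b)
# Stage 2: costly test only on the survivors of stage 1.
second = expensive(a[keep], b[keep]) == c[keep]
hits = keep[second]

# Reference: both tests evaluated on every element.
full = np.flatnonzero((a == b) & (expensive(a, b) == c))
```

Whether this pays off depends on how many elements survive stage 1, since the fancy indexing a[keep] has its own cost. That matches the modest speedup reported above.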

Parallelizing initial-boundary value problem (finite difference)

I am running a simulation to solve the advection diffusion equation. I wish to parallelize the part of the code where I calculate the partial derivatives so as to speed up my computation. Here is what I am doing:
p1 = np.zeros((len(r), len(th)-1 )) #The solution of the matrix
def func(i):
pti = np.zeros(len(th)-1)
for j in range (len(pti)):
abc = f(p1) #Some function calculating the derivatives at each point
pti[j] = p1[i][j] + dt*( abc ) #dt is some small float number
return pti
#Setting the initial condition of the p1 matrix
for i in range(len(p1[:,0])):
for j in range(len(p1[0])):
p1[i][j] = 0.01
#Final loop calculating the integral by finite difference scheme
p = Pool(args.cores)
for k in range (0,args.iterations): #This is integration in time
p1=p.map(func,range(len(r)))
print (p1)
The problem that I am facing here is that my p1 matrix is not updating after each iteration in k. In the end when I print p1 I get the same matrix that I initialized.
Also, the linear version of this code is working (but it takes too long).
Okay I solved this myself. Apparently putting the line
p = Pool(args.cores)
inside the loop
for k in range (0,args.iterations):
does the trick.
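A likely explanation, assuming the default fork start method on Linux: each worker process receives a copy of the global p1 as it was when the Pool was created, so later reassignments of p1 in the parent are invisible to the workers. Recreating the pool every iteration refreshes that copy, at the cost of restarting the processes. An alternative sketch is to pass the current state to the workers explicitly. The names step_row and advance and the decay term are hypothetical stand-ins for the question's func and f.

```python
import numpy as np


def step_row(task):
    """Advance one row by one explicit Euler step; task = (p, i, dt)."""
    p, i, dt = task
    # Hypothetical stand-in for the derivative term in the question:
    # simple decay, so every entry is multiplied by (1 - dt).
    return p[i] + dt * (-p[i])


def advance(p, dt, iterations, map_fn=map):
    """Time loop; map_fn may be the builtin map or a Pool's map method."""
    for _ in range(iterations):
        # The current state travels with each task instead of living in a global.
        tasks = [(p, i, dt) for i in range(len(p))]
        p = np.array(list(map_fn(step_row, tasks)))
    return p


p0 = np.full((4, 3), 0.01)
result = advance(p0, dt=0.1, iterations=2)
```

With multiprocessing, the same loop would be called as advance(p0, 0.1, 2, map_fn=pool.map) inside a `with multiprocessing.Pool(cores) as pool:` block guarded by `if __name__ == "__main__":`. Sending the whole array with every task is simple but adds pickling overhead, so for large grids it may be worth sending only the rows each task needs.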

Distance to a straight line in standard form

For a 3D straight line expressed in the standard form
a1*x + b1*y + c1*z + d1 = 0
a2*x + b2*y + c2*z + d2 = 0
and a given point x0,y0,z0
what is the distance from the point to the straight line?
Distance from point P0 to parametric line L(t) = Base + t * Dir is
Dist = Length(CrossProduct(Dir, P0 - Base)) / Length(Dir)
To find direction vector:
Dir = CrossProduct((a1,b1,c1), (a2,b2,c2))
To get some arbitrary base point, solve equation system with 2 equations and three unknowns (find arbitrary solution):
a1*x + b1*y + c1*z + d1 = 0
a2*x + b2*y + c2*z + d2 = 0
Check minors consisting of a and b, a and c, b and c coefficients. When minor is non-zero, corresponding variable might be taken as free one. For example, if a1 * b2 - b1 * a2 <> 0, choose variable z as free - make it zero or another value and solve system for two unknowns x and y.
(I omitted extra cases of parallel or coinciding planes)
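The recipe above could be sketched in Python with NumPy as follows. One deliberate swap: instead of inspecting the minors to pick a free variable, the base point is taken as the minimum-norm solution from numpy.linalg.lstsq, which lies on both planes whenever they are not parallel. The function name and the tuple layout (a, b, c, d) are choices made for this example.

```python
import numpy as np


def distance_point_to_line(plane1, plane2, point):
    """Each plane is (a, b, c, d), meaning a*x + b*y + c*z + d = 0."""
    n1, d1 = np.asarray(plane1[:3], dtype=float), plane1[3]
    n2, d2 = np.asarray(plane2[:3], dtype=float), plane2[3]
    direction = np.cross(n1, n2)
    if np.allclose(direction, 0.0):
        raise ValueError("planes are parallel or coincide; no unique line")
    # Any point on both planes can serve as base; lstsq returns the
    # minimum-norm solution of the 2x3 system (swapped in for the minors).
    normals = np.vstack([n1, n2])
    rhs = -np.array([d1, d2], dtype=float)
    base, *_ = np.linalg.lstsq(normals, rhs, rcond=None)
    offset = np.asarray(point, dtype=float) - base
    return np.linalg.norm(np.cross(direction, offset)) / np.linalg.norm(direction)


# Line x = 1, y = 2 (parallel to the z axis), measured from the origin.
dist = distance_point_to_line((1, 0, 0, -1), (0, 1, 0, -2), (0, 0, 0))
```

For this test line the nearest point to the origin is (1, 2, 0), so the distance should equal the square root of 5.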

Color Histogram

I'm trying to calculate histogram for an image. I'm using the following formula to calculate the bin
%bin = red*(N^2) + green*(N^1) + blue;
I have to implement the following Matlab functions.
[row, col, noChannels] = size(rgbImage);
hsvImage = rgb2hsv(rgbImage); % Ranges from 0 to 1.
H = zeros(4,4,4);
for col = 1 : columns
for row = 1 : rows
hBin = floor(hsvImage(row, column, 1) * 15);
sBin = floor(hsvImage(row, column, 2) * 4);
vBin = floor(hsvImage(row, column, 3) * 4);
F(hBin, sBin, vBin) = hBin, sBin, vBin + 1;
end
end
When I run the code I get the following error message "Subscript indices must either be real positive integers or logical."
As I am new to Matlab and Image processing, I'm not sure if the problem is with implementing the algorithm or a syntax error.
There are 3 problems with your code. (Four if you count that you changed your accumulator from H to F, but I'll assume that's a typo.)
First, your variable bin can be zero whenever the values of a given pixel are low, and 0 is not a valid index into a vector or matrix, so F(0) fails. This is why you are getting that error.
You can solve this easily by using F(bin+1), keeping in mind that the values in your F vector will be shifted one position over.
Second, you are assigning the value bin + 1 to your accumulator vector F, which is not what you want. You want to add 1 every time a pixel in that range is found, so what you should do is F(bin+1) = F(bin+1) + 1;. This way the counts in F increase as pixels are found.
The third error is simpler: you forgot to implement your bin = red*(N^2) + green*(N^1) + blue; equation.
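For comparison, here is a rough Python/NumPy sketch of the three points combined: quantize each channel, combine the levels with bin = red*N^2 + green*N + blue, and add one per pixel to the matching bin. It assumes an 8-bit RGB image, and the function name color_histogram is made up for the example. NumPy indices start at 0, so the +1 shift needed in MATLAB does not appear here.

```python
import numpy as np


def color_histogram(rgb_image, n_bins):
    """rgb_image: uint8 array (rows, cols, 3); returns n_bins**3 counts."""
    # Quantize each 0..255 channel to an integer level 0..n_bins-1.
    levels = (rgb_image.astype(np.int64) * n_bins) // 256
    red, green, blue = levels[..., 0], levels[..., 1], levels[..., 2]
    # The bin formula from the question.
    bins = red * n_bins**2 + green * n_bins + blue
    # bincount adds one to a bin for every pixel that falls into it.
    return np.bincount(bins.ravel(), minlength=n_bins**3)


image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = (255, 255, 255)   # one white pixel, three black ones
hist = color_histogram(image, n_bins=4)
```

With N = 4, the three black pixels land in bin 0 and the white pixel in the last bin, index 63.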
