Distance between each point and the linear regression solution - scikit-learn

I have a set of data ([x[0], x[1]], y), i.e. many points in 3D space,
and use scikit-learn to fit a linear model.
How can I calculate the distance from every point to the fitted plane?
Does sklearn provide such a function? I mean the perpendicular distance.
My code works, but it is too manual.
I am looking for an existing, quick function in a package like sklearn.
Thanks.
import math
import sklearn.linear_model

def Linfit3D(x, y):
    # x is a 2D array holding the location of each bump (x_loc, y_loc)
    # y is the CTV or BTV to be fitted to the least-squares plane
    # four values are returned (a, b, c, d), describing the plane
    # a*x1 + b*x2 + c*y + d = 0 with c = -1, i.e. y = d + a*x1 + b*x2
    model = sklearn.linear_model.LinearRegression()
    model.fit(x, y)
    coefs = model.coef_
    intercept = model.intercept_
    print("Equation: y = {:.5f} + {:.5f}*x1 + {:.5f}*x2".format(intercept, coefs[0], coefs[1]))
    a = coefs[0]
    b = coefs[1]
    c = -1
    d = intercept
    return a, b, c, d

def point_to_plane_dist(x, y, a, b, c, d):
    # the plane equation is a*x + b*y + c*z + d = 0, and typically c = -1,
    # so the plane equation typically is z = a*x + b*y + d
    # the sign of the result tells whether the point is above (+) or below (-) the plane
    f = abs(a * x[0] + b * x[1] + c * y + d)
    e = math.sqrt(a * a + b * b + c * c)
    zp = a * x[0] + b * x[1] + d
    # print('y = %2f, zp = %2f' % (y, zp))
    if y >= zp:
        return f / e
    else:
        return -f / e
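I am not aware of a ready-made scikit-learn function for this, but the per-point loop can be replaced by a few NumPy operations. The sketch below assumes the plane has the form y = intercept + coef[0]*x1 + coef[1]*x2, as in the code above; the function name signed_plane_distances and the example numbers are made up for illustration.

```python
import numpy as np

def signed_plane_distances(X, y, coef, intercept):
    """Signed perpendicular distance of each point (X[i, 0], X[i, 1], y[i])
    to the plane y = intercept + coef[0]*x1 + coef[1]*x2.
    Positive values lie above the plane, negative values below."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coef = np.asarray(coef, dtype=float)
    # Vertical residual of every point with respect to the plane
    residuals = y - (X @ coef + intercept)
    # Length of the plane normal (coef[0], coef[1], -1)
    normal_length = np.sqrt(1.0 + np.sum(coef ** 2))
    return residuals / normal_length

# Hand-picked example plane y = 1 + 2*x1 + 3*x2 (illustrative values only)
X = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([1.0 + np.sqrt(14.0), 6.0 - np.sqrt(14.0)])
dist = signed_plane_distances(X, y, coef=[2.0, 3.0], intercept=1.0)
print(dist)
```

With a fitted LinearRegression model, the call would be signed_plane_distances(x, y, model.coef_, model.intercept_), which should give the same signed values as calling point_to_plane_dist once per point.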

Related

Solving equation of motion due to (Lorentz acceleration) using Forward Euler and Runge-Kutta 4th order using Python 3

I am trying to solve the equation of motion of a charged particle in a planetary magnetic field, to see the path of the particle, using the forward Euler and RK4 methods in Python (as an exercise in learning numerical methods). I encounter two problems:
The for loop in the RK4 method does not update the values. It gives the values of the first iteration for all iterations.
When I change the sign of β = charge/mass, the path of the particle does not change as expected. It seems the path is unaffected by the sign of the particle's charge. What does this mean physically or mathematically?
The code is adapted from:
python two coupled second order ODEs Runge Kutta 4th order
and
Applying Forward Euler Method to a Three-Box Model System of ODEs
I would be immensely grateful if anyone could explain to me what is wrong in the code.
Thank you.
The code is as follows:
import numpy as np
import matplotlib.pyplot as plt
from math import sin, cos
from scipy.integrate import odeint
scales = np.array([1e7, 0.1, 1, 1e-5, 10, 1e-5])
def LzForce(t,p):
    # assigning each ODE to a vector element
    r,x,θ,y,ϕ,z = p*scales
    # constants
    R = 60268e3 # metre
    g_20 = 1583e-9
    Ω = 9.74e-3 # degree/second
    B_θ = (R/r)**4*g_20*cos(θ)*sin(θ)
    B_r = 2*(R/r)**4*g_20*(0.5*(3*cos(θ)**2-1))
    β = +9.36e10
    # defining the ODEs
    drdt = x
    dxdt = r*(y**2 +(z+Ω)**2*sin(θ)**2-β*z*sin(θ)*B_θ)
    dθdt = y
    dydt = (-2*x*y +r*(z+Ω)**2*sin(θ)*cos(θ)+β*r*z*sin(θ)*B_r)/r
    dϕdt = z
    dzdt = (-2*x*(z+Ω)*sin(θ)-2*r*y*(z+Ω)*cos(θ)+β*(x*B_θ-r*y*B_r))/(r*sin(θ))
    return np.array([drdt,dxdt,dθdt,dydt,dϕdt,dzdt])/scales
def ForwardEuler(fun,t0,p0,tf,dt):
    r0 = 6.6e+07
    x0 = 0.
    θ0 = 88.
    y0 = 0.
    ϕ0 = 0.
    z0 = 22e-3
    p0 = np.array([r0,x0,θ0,y0,ϕ0,z0])
    t = np.arange(t0,tf+dt,dt)
    p = np.zeros([len(t), len(p0)])
    p[0] = p0
    for i in range(len(t)-1):
        p[i+1,:] = p[i,:] + fun(t[i],p[i,:]) * dt
    return t, p
def rk4(fun,t0,p0,tf,dt):
    # initial conditions
    r0 = 6.6e+07
    x0 = 0.
    θ0 = 88.
    y0 = 0.
    ϕ0 = 0.
    z0 = 22e-3
    p0 = np.array([r0,x0,θ0,y0,ϕ0,z0])
    t = np.arange(t0,tf+dt,dt)
    p = np.zeros([len(t), len(p0)])
    p[0] = p0
    for i in range(len(t)-1):
        k1 = dt * fun(t[i], p[i])
        k2 = dt * fun(t[i] + 0.5*dt, p[i] + 0.5 * k1)
        k3 = dt * fun(t[i] + 0.5*dt, p[i] + 0.5 * k2)
        k4 = dt * fun(t[i] + dt, p[i] + k3)
        p[i+1] = p[i] + (k1 + 2*(k2 + k3) + k4)/6
        return t,p
dt = 0.5
tf = 1000
p0 = [6.6e+07,0.0,88.0,0.0,0.0,22e-3]
t0 = 0
#Solution with Forward Euler
t,p_Euler = ForwardEuler(LzForce,t0,p0,tf,dt)
#Solution with RK4
t ,p_RK4 = rk4(LzForce,t0, p0 ,tf,dt)
print(t,p_Euler)
print(t,p_RK4)
# Plot Solutions
r,x,θ,y,ϕ,z = p_Euler.T
fig,ax=plt.subplots(2,3,figsize=(8,4))
plt.xlabel('time in sec')
plt.ylabel('parameters')
for a,s in zip(ax.flatten(),[r,x,θ,y,ϕ,z]):
    a.plot(t,s); a.grid()
plt.title("Forward Euler", loc='left')
plt.tight_layout(); plt.show()
r,x,θ,y,ϕ,z = p_RK4.T
fig,ax=plt.subplots(2,3,figsize=(8,4))
plt.xlabel('time in sec')
plt.ylabel('parameters')
for a,q in zip(ax.flatten(),[r,x,θ,y,ϕ,z]):
    a.plot(t,q); a.grid()
plt.title("RK4", loc='left')
plt.tight_layout(); plt.show()
[RK4 solution plot][1]
[Euler's solution methods][2]
RK4 does not give iterated values.
The path is unaffected by the change of sign, although a change is expected because the particle is subject to the Lorentz force.
[1]: https://i.stack.imgur.com/bZdIw.png
[2]: https://i.stack.imgur.com/tuNDp.png
The for loop in rk4 never runs more than once because the function returns during the first iteration. The return has to sit outside the loop:
    for i in range(len(t)-1):
        k1 = dt * fun(t[i], p[i])
        k2 = dt * fun(t[i] + 0.5*dt, p[i] + 0.5 * k1)
        k3 = dt * fun(t[i] + 0.5*dt, p[i] + 0.5 * k2)
        k4 = dt * fun(t[i] + dt, p[i] + k3)
        p[i+1] = p[i] + (k1 + 2*(k2 + k3) + k4)/6
    # This was the problem line: the return was indented into the for block, so the block executed once and returned.
    return t,p
For physics questions please try a different forum.
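To see the effect of the return placement in isolation, here is a minimal sketch of the same RK4 loop applied to a toy problem, dp/dt = -p, instead of the Lorentz system. The test function decay and the way the time grid is built from a step count are my own choices for the illustration, not part of the original code.

```python
import numpy as np

def rk4(fun, t0, p0, tf, dt):
    # Build the time grid from an integer step count to avoid
    # floating-point surprises at the end of the interval
    n_steps = int(round((tf - t0) / dt))
    t = t0 + dt * np.arange(n_steps + 1)
    p0 = np.atleast_1d(np.asarray(p0, dtype=float))
    p = np.zeros((n_steps + 1, p0.size))
    p[0] = p0
    for i in range(n_steps):
        k1 = dt * fun(t[i], p[i])
        k2 = dt * fun(t[i] + 0.5 * dt, p[i] + 0.5 * k1)
        k3 = dt * fun(t[i] + 0.5 * dt, p[i] + 0.5 * k2)
        k4 = dt * fun(t[i] + dt, p[i] + k3)
        p[i + 1] = p[i] + (k1 + 2 * (k2 + k3) + k4) / 6
    # The return sits after the loop, so every row of p gets filled
    return t, p

def decay(t, p):
    # Toy right-hand side with the known solution p(t) = p(0) * exp(-t)
    return -p

t, p = rk4(decay, 0.0, [1.0], 1.0, 0.1)
print(p[-1, 0], np.exp(-1.0))
```

If the return is moved back inside the loop, only p[1] is computed and all later rows stay at their initial zeros, which matches the symptom described in the question.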

Tangent lines to a circle

If I have a tangent point B on a circle with known radius r, and a point D outside the circle, how do I find the intersection of the tangent lines through B and D?
If the only known values are the blue ones as shown in the sketch, how do I find point E?
I guess I'm missing the math background to combine similar examples with other known values to come to a solution.
We can write two vector equations:
- The vector EB is perpendicular to the radius CB, so their dot product is zero:
EB . CB = 0, or
(ex - bx)*(bx - cx) + (ey - by)*(by - cy) = 0 (1)
- The squared distance from the center C to the line DE equals the squared radius (using the cross product):
(DC x ED)^2 / |ED|^2 = R^2, or
((dx-cx)*(ey-dy)-(dy-cy)*(ex-dx))^2 = R^2 * ((ex-dx)^2+(ey-dy)^2) (2)
Equations (1) and (2) form a system in the two unknowns ex, ey. Solving it gives 0, 1 or 2 solutions (because equation (2) is quadratic).
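One way to solve this system numerically is to satisfy equation (1) by construction, writing E = B + s*u with u the unit tangent direction at B, and then substituting into equation (2), which leaves a quadratic in the single unknown s. The sketch below assumes the center C is known (it is not among the blue values in the question); the names tangent_intersections and cross2 and the example numbers are made up for illustration.

```python
import numpy as np

def cross2(a, b):
    # z-component of the 2D cross product
    return a[0] * b[1] - a[1] * b[0]

def tangent_intersections(C, R, B, D):
    """Candidate points E on the tangent at B for which the line DE
    is also tangent to the circle with center C and radius R."""
    C, B, D = (np.asarray(v, dtype=float) for v in (C, B, D))
    CB = B - C
    # Unit vector along the tangent at B (perpendicular to CB), so eq. (1) holds
    u = np.array([-CB[1], CB[0]]) / np.linalg.norm(CB)
    w = D - C
    m = B - D
    p, q = cross2(w, m), cross2(w, u)
    # Eq. (2) with E - D = m + s*u:  (p + s*q)^2 = R^2 * |m + s*u|^2
    coeffs = [q * q - R * R,
              2 * p * q - 2 * R * R * np.dot(m, u),
              p * p - R * R * np.dot(m, m)]
    roots = np.roots(coeffs)
    real_roots = roots[np.abs(roots.imag) < 1e-9].real
    return [B + s * u for s in real_roots]

# Unit circle at the origin, tangent at B = (1, 0), external point D = (0, 2)
candidates = tangent_intersections(C=(0.0, 0.0), R=1.0, B=(1.0, 0.0), D=(0.0, 2.0))
for E in candidates:
    print(E)
```

The two real roots correspond to the two tangent lines through D; which of them is the wanted point E depends on the sketch, so the caller has to pick the candidate on the correct side of B.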
With a bit more synthetic geometry first, you can apply the law of cosines to triangle BCD to express CD, then use it in Pythagoras' theorem for triangle CDF to find the length d of DF. Then apply the law of cosines to triangle BDE to find the length e of EF, where e = DE - d. Since EB = EF = e, you just have to normalize the vector AB to unit length and multiply it by e to get the vector BE. After that, add the coordinates of B to BE.
The point H is the other point on the line AB such that the line DH is the other tangent to the circle.
import numpy as np
import math
'''
input A, B, D, r
'''
A = [ 0,-4]
B = [-1, 1]
D = [ 5, 0]
r = 2
A = [ 5.49, -8.12]
B = [ 1.24, 1.82]
D = [ 15.95, -1.12]
r = 3
A = np.array(A)
B = np.array(B)
D = np.array(D)
AB = B - A
l_AB = math.sqrt(AB[0]**2 + AB[1]**2)
AB = AB / l_AB
BD = D - B
l_BD = math.sqrt(BD[0]**2 + BD[1]**2)
cos_alpha = (-AB[0]*BD[0] - AB[1]*BD[1]) / l_BD
sin_alpha = math.sqrt(1 - cos_alpha**2)
d = math.sqrt( l_BD**2 - 2*r*l_BD*sin_alpha )
e = (l_BD**2 - d**2) / (2*d - 2*l_BD*cos_alpha)
E = B + e*AB
h = (l_BD**2 - d**2) / (2*d + 2*l_BD*cos_alpha)
H = B - h*AB
AB_perp = [AB[1], -AB[0]]
AB_perp = np.array(AB_perp)
C = B + r*AB_perp
CE = E - C
l2_CE = CE[0]**2 + CE[1]**2
G = C + (r**2 / l2_CE)*CE
F = B + 2*(G - B)
print('E =', E)
print('H =', H)
print('C =', C)
print('G =', G)
print('F =', F)

Best-Fit without point interpolation

I have two sets of data. One is the nominal form, the other is the actual form. The problem arises when I want to calculate the form error alone: when the two sets of data aren't "on top of each other", the computed errors also include positional error.
Both curves are read from a series of data. The nominal shape (black) is made up of many radii of different sizes that are tangent to each other. It's the leading edge of an airfoil profile.
I have tried various "best-fit" methods that I found both here and wherever Google took me. The problem is that they all smooth my "actual" data, so it gets modified and does not keep its actual form.
Is there any function in scipy or any other Python library that can "simply" fit my two curves together without altering the actual shape?
I want the green curve with red dots to lie as much as possible on top of the black one.
Might it be possible to calculate the center of gravity of both curves and then move the actual curve in x and y by the difference between the two center points? It might not be the ultimate solution, but would it get closer?
Here is a solution assuming that the nominal form can be described as a conic, i.e. as a solution of the equation ax^2 + by^2 + cxy + dx + ey = 1. A least-squares fit can then be applied to find the coefficients (a, b, c, d, e).
import numpy as np
import matplotlib.pylab as plt
# Generate example data
t = np.linspace(-2, 2.5, 25)
e, theta = 0.5, 0.3 # ratio minor axis/major & orientation angle major axis
c, s = np.cos(theta), np.sin(theta)
x = c*np.cos(t) - s*e*np.sin(t)
y = s*np.cos(t) + c*e*np.sin(t)
# add noise:
xy = 4*np.vstack((x, y))
xy += .08 *np.random.randn(*xy.shape) + np.random.randn(2, 1)
# Least square fit by a generic conic equation
# a*x^2 + b*y^2 + c*x*y + d*x + e*y = 1
x, y = xy
x = x - x.mean()
y = y - y.mean()
M = np.vstack([x**2, y**2, x*y, x, y]).T
b = np.ones_like(x)
# solve M*w = b
w, res, rank, s = np.linalg.lstsq(M, b, rcond=None)
a, b, c, d, e = w
# Get x, y coordinates for the fitted ellipse:
# using polar coordinates
# x = r*cos(theta), y = r*sin(theta)
# for a given theta, the radius is obtained with the 2nd order eq.:
# (a*ct^2 + b*st^2 + c*cs*st)*r^2 + (d*ct + e*st)*r - 1 = 0
# with ct = cos(theta) and st = sin(theta)
theta = np.linspace(-np.pi, np.pi, 97)
ct, st = np.cos(theta), np.sin(theta)
A = a*ct**2 + b*st**2 + c*ct*st
B = d*ct + e*st
D = B**2 + 4*A
radius = (-B + np.sqrt(D))/2/A
# Graph
plt.plot(radius*ct, radius*st, '-k', label='fitted ellipse');
plt.plot(x, y, 'or', label='measured points');
plt.axis('equal'); plt.legend();
plt.xlabel('x'); plt.ylabel('y');
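The centroid shift proposed in the question can also be sketched in a few lines. It only translates the measured points, so their shape is left untouched. This is a rough first alignment under the assumption that both curves are sampled over the same stretch of the profile with similar point density; the function name align_by_centroid and the sample coordinates are hypothetical.

```python
import numpy as np

def align_by_centroid(actual, nominal):
    """Translate the actual points so that their centroid coincides with
    the centroid of the nominal points. Returns the moved points and the
    applied shift; no rotation or smoothing is performed."""
    actual = np.asarray(actual, dtype=float)
    nominal = np.asarray(nominal, dtype=float)
    shift = nominal.mean(axis=0) - actual.mean(axis=0)
    return actual + shift, shift

# Made-up data: the "actual" curve is the nominal one displaced by (3, -1)
nominal = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
actual = nominal + np.array([3.0, -1.0])
moved, shift = align_by_centroid(actual, nominal)
print(shift)
```

Because the centroid depends on how the points are distributed along each curve, unevenly sampled data would bias the shift, and a residual rotation would still have to be handled separately.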

Decision Boundary Plot for Support Vector Classifier (distance from separating hyperplane)

I am working through the book "Hands-on Machine Learning with Scikit-Learn and TensorFlow" by Aurélien Géron. The code below is written in Python 3.
On the GitHub page for the Chap. 5 solutions to the Support Vector Machine problems there is the following code for plotting the SVC decision boundary (https://github.com/ageron/handson-ml/blob/master/05_support_vector_machines.ipynb):
def plot_svc_decision_boundary(svm_clf, xmin, xmax):
    w = svm_clf.coef_[0]
    b = svm_clf.intercept_[0]
    # At the decision boundary, w0*x0 + w1*x1 + b = 0
    # => x1 = -w0/w1 * x0 - b/w1
    x0 = np.linspace(xmin, xmax, 200)
    decision_boundary = -w[0]/w[1] * x0 - b/w[1]
    margin = 1/w[1]
    gutter_up = decision_boundary + margin
    gutter_down = decision_boundary - margin
    svs = svm_clf.support_vectors_
    plt.scatter(svs[:, 0], svs[:, 1], s=180, facecolors='#FFAAAA')
    plt.plot(x0, decision_boundary, "k-", linewidth=2)
    plt.plot(x0, gutter_up, "k--", linewidth=2)
    plt.plot(x0, gutter_down, "k--", linewidth=2)
My question is why is the margin defined as 1/w[1]? I believe the margin should be 1/sqrt(w[0]^2+w[1]^2). That is, the margin is half of 2/L_2_norm(weight_vector) which is 1/L_2_norm(weight_vector). See https://math.stackexchange.com/questions/1305925/why-does-the-svm-margin-is-frac2-mathbfw.
Is this an error in the code?
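Given:
decision boundary: w0*x0 + w1*x1 + b = 0
gutter_up: w0*x0 + w1*x1 + b = 1, i.e. w0*x0 + w1*(x1 - 1/w1) + b = 0
gutter_down: w0*x0 + w1*x1 + b = -1, i.e. w0*x0 + w1*(x1 + 1/w1) + b = 0
So for every point (x0, x1) on the decision boundary, (x0, x1 + 1/w1) and (x0, x1 - 1/w1) are points on the gutter_up and gutter_down lines. The value 1/w[1] in the code is therefore the vertical offset along the x1 axis used for plotting, not the perpendicular margin, which is still 1/sqrt(w0^2 + w1^2). The code is not in error.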
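A quick numeric check of this argument, using made-up weights w = (3, 4) and b = -2 rather than a trained classifier: shifting a boundary point vertically by 1/w1 lands on the line where the decision function equals 1, and the perpendicular distance of that point to the boundary is 1/||w||.

```python
import math

# Hypothetical weights and bias, chosen only for easy arithmetic
w0, w1, b = 3.0, 4.0, -2.0

def decision(x0, x1):
    # Value of the linear decision function w0*x0 + w1*x1 + b
    return w0 * x0 + w1 * x1 + b

def boundary_x1(x0):
    # x1 coordinate of the decision boundary at a given x0
    return -w0 / w1 * x0 - b / w1

x0 = 1.5
x1 = boundary_x1(x0)
x1_up = x1 + 1 / w1  # vertical shift used in the plotting code

# Perpendicular distance of the shifted point to the decision boundary
perp = abs(decision(x0, x1_up)) / math.hypot(w0, w1)

print(decision(x0, x1), decision(x0, x1_up), perp, 1 / math.hypot(w0, w1))
```

The vertical offset here is 1/w1 = 0.25 while the perpendicular margin is 1/5 = 0.2, so both descriptions refer to the same pair of gutter lines.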

Python 3.x Runge Kutta simple orbit

I am in the early stages of creating a program to plot orbits using the Runge-Kutta method, and would like to plot the orbit in 2D. However, no matter what the initial conditions are, I get a straight line. I have seen a similar question but it didn't solve my problem. Why is this happening?
import numpy as np
import matplotlib.pyplot as mpl
def derX(vx):
    return vx
def derY(vy):
    return vy
def derVx(x,y):
    return -(G*M*x)/((x**2 + y**2)**(3/2))
def timestep(x,k1,k2,k3,k4):
    return x + (step/6)*(k1 + 2*k2 +2*k3 + k4)
G=6.67408E-11 #m^3/kg s^2
M=5.972E24 #kg, mass of Earth
step=100 #seconds
x=4596194 #initial conditions in m and m/s
y=4596194
vx=-6646
vy=6646
t=0
T=3600
bodyx = 444 #stationary body position metres
bodyy = 444
tarray=[]
xarray=[]
yarray=[]
vxarray=[]
vyarray=[]
while t<T:
    k1 = np.zeros(4)
    k2 = np.zeros(4)
    k3 = np.zeros(4)
    k4 = np.zeros(4)
    tarray.append(t)
    xarray.append(x)
    yarray.append(y)
    vxarray.append(vx)
    vyarray.append(vy)
    x = bodyx - x
    y = bodyy - y
    k1[0]=derX(vx)
    k1[1]=derY(vy)
    k1[2]=derVx(x,y)
    k1[3]=derVx(y,x)
    k2[0]=derX(vx+(step/2)*k1[2])
    k2[1]=derY(vy+(step/2)*k1[3])
    k2[2]=derVx(x+(step/2)*k1[0],y+(step/2)*k1[1])
    k2[3]=derVx(y+(step/2)*k1[1],x+(step/2)*k1[0])
    k3[0]=derX(vx+(step/2)*k2[2])
    k3[1]=derY(vy+(step/2)*k2[3])
    k3[2]=derVx(x+(step/2)*k2[0],y+(step/2)*k2[1])
    k3[3]=derVx(y+(step/2)*k2[1],x+(step/2)*k2[0])
    k4[0]=derX(vx+step*k3[2])
    k4[1]=derY(vy+step*k3[3])
    k4[2]=derVx(x+step*k3[0],y+step*k3[1])
    k4[3]=derVx(y+step*k3[1],vx+step*k3[0])
    t=t+step
    x=timestep(x,k1[0],k2[0],k3[0],k4[0])
    y=timestep(x,k1[1],k2[1],k3[1],k4[1])
    vx=timestep(x,k1[2],k2[2],k3[2],k4[2])
    vy=timestep(x,k1[3],k2[3],k3[3],k4[3])
mpl.plot(xarray, yarray)
There is a spurious v in the computation of k4[3]: the second argument should be x+step*k3[0], not vx+step*k3[0].
The calls of timestep all pass x as the first argument, where the last three should pass y, vx and vy.
Another error is in the difference computation
x = bodyx - x
y = bodyy - y
which overwrites the absolute position. It also reverses the direction of the force.
Change that to something like
diffx = x - bodyx
diffy = y - bodyy
and use these relative positions in the force computation.
For comparison, the built-in integrator scipy.integrate.odeint with the data
import numpy as np
import matplotlib.pyplot as mpl
from scipy.integrate import odeint
G=6.67408E-11 #m^3/kg s^2
M=5.972E24 #kg, mass of Earth
bodyx = 444 #stationary body position metres
bodyy = 444
def system(u,t):
    x,y,vx,vy = u
    x -= bodyx
    y -= bodyy
    f = -(G*M)/((x**2 + y**2)**(1.5))
    return [ vx, vy, f*x, f*y ]
x0=4596194 #initial conditions in m and m/s
y0=4596194
vx0=-6646
vy0=6646
u0 = [ x0, y0, vx0, vy0 ]
T= np.linspace(0,3600,36+1)
sol = odeint(system, u0, T)
mpl.plot(sol[:,0], sol[:,1]); mpl.show()
gives a nicely curved bow, about 1/4 of a full orbit.
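For completeness, here is one possible hand-written RK4 version with the three fixes listed above applied. It treats the state as a single vector (x, y, vx, vy) so that the stage updates cannot mix up components. This is a sketch and not the asker's final code; the helper names derivs, rk4_step, integrate and energy are my own, and the energy check is only a sanity test that I added.

```python
import numpy as np

G = 6.67408e-11                  # m^3 / (kg s^2)
M = 5.972e24                     # kg, mass of Earth
BODY = np.array([444.0, 444.0])  # stationary body position in metres

def derivs(u):
    # Right-hand side for the state u = (x, y, vx, vy)
    x, y, vx, vy = u
    dx, dy = x - BODY[0], y - BODY[1]   # relative position, u itself is not modified
    f = -G * M / (dx**2 + dy**2) ** 1.5
    return np.array([vx, vy, f * dx, f * dy])

def rk4_step(u, h):
    # One classical fourth-order Runge-Kutta step of size h
    k1 = derivs(u)
    k2 = derivs(u + 0.5 * h * k1)
    k3 = derivs(u + 0.5 * h * k2)
    k4 = derivs(u + h * k3)
    return u + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(u0, h, n_steps):
    # Store the whole trajectory, one row per time level
    traj = np.empty((n_steps + 1, 4))
    traj[0] = u0
    for i in range(n_steps):
        traj[i + 1] = rk4_step(traj[i], h)
    return traj

def energy(u):
    # Specific orbital energy, which should stay nearly constant
    x, y, vx, vy = u
    r = np.hypot(x - BODY[0], y - BODY[1])
    return 0.5 * (vx**2 + vy**2) - G * M / r

u0 = np.array([4596194.0, 4596194.0, -6646.0, 6646.0])
traj = integrate(u0, 100.0, 36)      # 36 steps of 100 s = 3600 s
print(energy(traj[0]), energy(traj[-1]))
```

Plotting traj[:, 0] against traj[:, 1] should give a curved arc comparable to the odeint result rather than a straight line.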
