I have some coordinates of a 3D point curve through which I lay a spline like so:
from splipy import curve_factory
pts = [...] #3D coordinate points
curve = curve_factory.curve(pts)
I know that I can get a 3D point on the curve by evaluating it at a parameter value t:
point_on_curve = curve.evaluate(t)
print(point_on_curve) #outputs coordinates: (x y z)
Is it possible to go the other way round? Is there a function or method that tells me whether a given point is part of the curve, or almost part of it? Something like:
curve.func(point) #output: True
or
curve.func(point) #output: distance to curve 0.0001 --> also part of curve
Thanks!
I found a script by ventusff that runs an optimization to find the parameter value (your t, called u in the script) of the point on the spline closest to an external point.
The code is below, with a few changes to make it clearer. I set the tolerance to 0.001.
Choosing the optimization solver and its parameter values takes some study. I have not had time to do that, so you may want to experiment a little.
Here SciPy is used to generate and evaluate the spline, but you can easily replace that part with splipy. The interesting part is the optimization, which is also done with SciPy.
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import splprep, splev
from scipy.spatial.distance import euclidean
from scipy.optimize import fmin_bfgs
points_count = 40
phi = np.linspace(0, 2. * np.pi, points_count)
k = np.linspace(0, 2, points_count)
r = 0.5 + np.cos(phi)
x, y, z = r * np.cos(phi), r * np.sin(phi), k
tck, u = splprep([x, y, z], s=1)
points = splev(u, tck)
idx = np.random.randint(low=0, high=40)
noise = np.random.normal(scale=0.01)
external_point = np.array([points[0][idx], points[1][idx], points[2][idx]]) + noise
def distance_to_point(u_):
    s = splev(u_, tck)
    return euclidean(external_point, [s[0][0], s[1][0], s[2][0]])
closest_u = fmin_bfgs(distance_to_point, x0=np.array([0.0]), gtol=1e-8)
closest_point = splev(closest_u, tck)
tol = 1e-3
if euclidean(external_point, [closest_point[0][0], closest_point[1][0], closest_point[2][0]]) < tol:
    print("The point is very close to the spline.")
ax = plt.figure().add_subplot(projection='3d')
ax.plot(points[0], points[1], points[2], "r-", label="Spline")
ax.plot(external_point[0], external_point[1], external_point[2], "bo", label="External Point")
ax.plot(closest_point[0], closest_point[1], closest_point[2], "go", label="Closest Point")
plt.legend()
plt.show()
The script draws the plot below:
and prints the following output:
Current function value: 0.000941
Iterations: 5
Function evaluations: 75
Gradient evaluations: 32
The point is very close to the spline.
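A gradient-based solver started at u=0 can stop in a local minimum when the curve loops back on itself. One possible alternative, sketched below, scans the parameter range coarsely and then refines inside the best bracket with scipy.optimize.minimize_scalar. The helper names distance_to_spline and is_on_spline, the sample count, and the helix test curve are my own choices for illustration. The sketch also assumes the parameter runs over [0, 1], which is the splprep default.

```python
import numpy as np
from scipy.interpolate import splprep, splev
from scipy.optimize import minimize_scalar


def distance_to_spline(tck, point, samples=200):
    """Return (u, distance) for the spline point closest to `point`.

    Assumes the spline parameter runs over [0, 1], the splprep default.
    """
    point = np.asarray(point, dtype=float)

    def squared_distance(u):
        diff = np.asarray(splev(u, tck)) - point
        return float(diff @ diff)

    # Coarse scan: find the sampled parameter nearest to the point.
    grid = np.linspace(0.0, 1.0, samples)
    curve = np.asarray(splev(grid, tck)).T  # shape (samples, ndim)
    i = int(np.argmin(np.linalg.norm(curve - point, axis=1)))
    lo, hi = grid[max(i - 1, 0)], grid[min(i + 1, samples - 1)]

    # Refine inside the bracket around the best sample.
    res = minimize_scalar(squared_distance, bounds=(lo, hi),
                          method="bounded", options={"xatol": 1e-10})
    return float(res.x), float(np.sqrt(res.fun))


def is_on_spline(tck, point, tol=1e-3):
    """True if `point` lies within `tol` of the spline."""
    return distance_to_spline(tck, point)[1] < tol


# Example: a helix sampled at 40 points, interpolated exactly (s=0).
phi = np.linspace(0.0, 2.0 * np.pi, 40)
tck, _ = splprep([np.cos(phi), np.sin(phi), phi], s=0)
on_curve = np.asarray(splev(0.37, tck))
off_curve = on_curve + np.array([0.0, 0.0, 0.5])
print(is_on_spline(tck, on_curve))   # point taken from the curve itself
print(is_on_spline(tck, off_curve))  # same point shifted 0.5 along z
```

The coarse scan costs one vectorized spline evaluation, so raising samples is cheap if the curve has many tight turns.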
Do you know of a well-known Python library for interpolating at randomly located 2D points based on data points on a regular grid?
Note that the data points used to create the interpolator are on a regular grid, but the evaluation points are not.
context
Let me explain the context. In my application, the data points used to create the interpolator are on a regular grid. However, at evaluation time, the points to be evaluated are at random locations (say np.random.rand(100, 2)).
As far as I know, the most used library for 2D interpolation is scipy's interp2d. But at evaluation time interp2d takes grid coordinates X and Y instead of points, as its documentation describes.
Of course, it is possible to do something like
values = []
for p in np.random.rand(100, 2):
    value = itp([p[0]], [p[1]])
    values.append(value)
or, to avoid the for loop:
pts = np.random.rand(100, 2)
tmp = itp(pts[:, 0], pts[:, 1])
value = tmp.diagonal()
But both methods are too inefficient. The first is slow because of the Python for loop (as much of the work as possible should run on the C side), and the second is wasteful because it evaluates N^2 points to get results for only N points.
scipy.interpolate.RegularGridInterpolator does this. You create the interpolator from gridded data points, and at evaluation time it takes a 2D numpy array of shape (n_points, n_dim).
For example:
import numpy as np
from scipy.interpolate import RegularGridInterpolator
x = np.linspace(0, 1, 20)
y = np.linspace(0, 1, 20)
f = np.random.randn(20, 20)
itp = RegularGridInterpolator((x, y), f)
pts = np.random.rand(100, 2)
f_interped = itp(pts)
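RegularGridInterpolator interpolates linearly by default. If a smoother result is wanted, one option that also avoids the N^2 grid evaluation is RectBivariateSpline with its ev method, which pairs up the i-th x with the i-th y. The sketch below uses a known function on the grid (my own choice, not from the question) so the two results can be checked against exact values.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline, RegularGridInterpolator

x = np.linspace(0, 1, 20)
y = np.linspace(0, 1, 20)
# Gridded samples of a known function, f[i, j] = x[i] + 2 * y[j].
f = x[:, None] + 2.0 * y[None, :]

rng = np.random.default_rng(0)
pts = rng.random((100, 2))  # scattered evaluation points, shape (n_points, 2)

# Linear interpolation, one call for all points.
linear = RegularGridInterpolator((x, y), f)(pts)

# Cubic spline interpolation; ev() evaluates at (x[i], y[i]) pairs
# instead of building the full grid.
spline = RectBivariateSpline(x, y, f)
cubic = spline.ev(pts[:, 0], pts[:, 1])

expected = pts[:, 0] + 2.0 * pts[:, 1]
print(np.abs(linear - expected).max(), np.abs(cubic - expected).max())
```

Both calls return an array of shape (n_points,), one value per evaluation point.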
I have a set of points and would like to know if there is a function (for the sake of convenience and probably speed) that can calculate the area enclosed by a set of points.
for example:
x = np.arange(0,1,0.001)
y = np.sqrt(1-x**2)
points = zip(x,y)
Given these points, the area should be approximately equal to (pi-2)/4. Maybe there is something from scipy, matplotlib, numpy, shapely, etc. to do this? I won't be encountering any negative values for either the x or y coordinates, and they will be polygons without any defined function.
EDIT:
points will most likely not be in any specified order (clockwise or counterclockwise) and may be quite complex as they are a set of utm coordinates from a shapefile under a set of boundaries
The shoelace formula can be implemented in NumPy. Assuming these vertices:
import numpy as np
x = np.arange(0,1,0.001)
y = np.sqrt(1-x**2)
We can redefine the function in numpy to find the area:
def PolyArea(x,y):
    return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1)))
And getting results:
print(PolyArea(x,y))
# 0.26353377782163534
Avoiding the for loop makes this function ~50X faster than PolygonArea:
%timeit PolyArea(x,y)
# 10000 loops, best of 3: 42 µs per loop
%timeit PolygonArea(zip(x,y))
# 100 loops, best of 3: 2.09 ms per loop.
Timing is done in Jupyter notebook.
The most optimized solution that covers all possible cases, would be to use a geometry package, like shapely, scikit-geometry or pygeos. All of them use C++ geometry packages under the hood. The first one is easy to install via pip:
pip install shapely
and simple to use:
from shapely.geometry import Polygon
pgon = Polygon(zip(x, y)) # Assuming the OP's x,y coordinates
print(pgon.area)
To build it from scratch or understand how the underlying algorithm works, check the shoelace formula:
# e.g. corners = [(2.0, 1.0), (4.0, 5.0), (7.0, 8.0)]
def Area(corners):
    n = len(corners)  # number of corners
    area = 0.0
    for i in range(n):
        j = (i + 1) % n
        area += corners[i][0] * corners[j][1]
        area -= corners[j][0] * corners[i][1]
    area = abs(area) / 2.0
    return area
Note that this works only for simple polygons:
If you have a polygon with holes: calculate the area of the outer ring and subtract the areas of the inner rings.
If you have self-intersecting rings: you have to decompose them into simple sectors.
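The rule for holes can be sketched in a few lines. The helper names ring_area and area_with_holes are hypothetical, and the sketch assumes every hole lies fully inside the outer ring and that holes do not overlap each other.

```python
import numpy as np


def ring_area(ring):
    """Unsigned shoelace area of one simple ring given as (x, y) pairs."""
    pts = np.asarray(ring, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def area_with_holes(outer, holes=()):
    """Outer ring area minus the area of each inner ring.

    Assumes every hole lies inside the outer ring and holes do not overlap.
    """
    return ring_area(outer) - sum(ring_area(h) for h in holes)


outer = [(0, 0), (4, 0), (4, 4), (0, 4)]   # 4 x 4 square
hole = [(1, 1), (2, 1), (2, 2), (1, 2)]    # 1 x 1 square inside it
print(area_with_holes(outer, [hole]))      # 16 - 1 = 15.0
```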
By analysis of Mahdi's answer, I concluded that the majority of time was spent doing np.roll(). By removing the need of the roll, and still using numpy, I got the execution time down to 4-5µs per loop compared to Mahdi's 41µs (for comparison Mahdi's function took an average of 37µs on my machine).
def polygon_area(x,y):
    correction = x[-1] * y[0] - y[-1]* x[0]
    main_area = np.dot(x[:-1], y[1:]) - np.dot(y[:-1], x[1:])
    return 0.5*np.abs(main_area + correction)
By calculating the correctional term, and then slicing the arrays, there is no need to roll or create a new array.
Benchmarks:
10000 iterations
PolyArea(x,y): 37.075µs per loop
polygon_area(x,y): 4.665µs per loop
Timing was done using the time module and time.clock()
maxb's answer gives good performance but can easily lead to loss of precision when coordinate values or the number of points are large. This can be mitigated with a simple coordinate shift:
def polygon_area(x,y):
    # coordinate shift
    x_ = x - x.mean()
    y_ = y - y.mean()
    # everything else is the same as maxb's code
    correction = x_[-1] * y_[0] - y_[-1]* x_[0]
    main_area = np.dot(x_[:-1], y_[1:]) - np.dot(y_[:-1], x_[1:])
    return 0.5*np.abs(main_area + correction)
For example, a common geographic reference system is UTM, which might have (x,y) coordinates of (488685.984, 7133035.984). The product of those two values is 3485814708748.448. You can see that this single product is already at the edge of precision (it has the same number of decimal places as the inputs). Adding just a few of these products, let alone thousands, will result in loss of precision.
A simple way to mitigate this is to shift the polygon from large positive coordinates to something closer to (0,0), for example by subtracting the centroid as in the code above. This helps in two ways:
It eliminates a factor of x.mean() * y.mean() from each product
It produces a mix of positive and negative values within each dot product, which will largely cancel.
The coordinate shift does not alter the total area, it just makes the calculation more numerically stable.
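To see the effect on concrete numbers, the sketch below compares the plain and shifted shoelace sums against a reference computed in exact rational arithmetic with fractions.Fraction. The test polygon (a 1000-vertex circle of radius 10 placed at the UTM-sized coordinates mentioned above) and the function names are my own choices for illustration.

```python
from fractions import Fraction

import numpy as np


def shoelace(x, y):
    """Plain NumPy shoelace area, no coordinate shift."""
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def shoelace_shifted(x, y):
    """Same formula after moving the mean of the vertices to (0, 0)."""
    return shoelace(x - x.mean(), y - y.mean())


def shoelace_exact(x, y):
    """Reference value in exact rational arithmetic (slow)."""
    fx = [Fraction(float(v)) for v in x]
    fy = [Fraction(float(v)) for v in y]
    n = len(fx)
    total = sum(fx[i] * fy[(i + 1) % n] - fx[(i + 1) % n] * fy[i]
                for i in range(n))
    return abs(total) / 2


# 1000-vertex circle of radius 10 placed at UTM-sized coordinates.
theta = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
x = 488685.984 + 10.0 * np.cos(theta)
y = 7133035.984 + 10.0 * np.sin(theta)

exact = shoelace_exact(x, y)
err_plain = abs(Fraction(float(shoelace(x, y))) - exact)
err_shifted = abs(Fraction(float(shoelace_shifted(x, y))) - exact)
print("plain error:  ", float(err_plain))
print("shifted error:", float(err_shifted))
```

The exact reference is far too slow for production use; it is only there to measure the floating-point error of the two fast versions.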
It's faster to use shapely.geometry.Polygon rather than to calculate yourself.
from shapely.geometry import Polygon
import numpy as np
def PolyArea(x,y):
    return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1)))
coords = np.random.rand(6, 2)
x, y = coords[:, 0], coords[:, 1]
With that code in place, run %timeit:
%timeit PolyArea(x,y)
46.4 µs ± 2.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit Polygon(coords).area
20.2 µs ± 414 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
cv2.contourArea() in OpenCV gives an alternative method.
Example:
import cv2
import numpy as np
points = np.array([[0,0],[10,0],[10,10],[0,10]], dtype=np.int32)
area = cv2.contourArea(points)
print(area) # 100.0
The argument (points, in the above example) is a numpy array of dtype int32 (float32 also works) representing the vertices of a polygon: [[x1,y1],[x2,y2], ...]
There's an error in the code above as it doesn't take absolute values on each iteration. The above code will always return zero. (Mathematically, it's the difference between taking signed area or wedge product and the actual area; see http://en.wikipedia.org/wiki/Exterior_algebra.) Here's some alternate code.
def area(vertices):
    n = len(vertices)  # number of corners
    a = 0.0
    for i in range(n):
        j = (i + 1) % n
        a += abs(vertices[i][0] * vertices[j][1]-vertices[j][0] * vertices[i][1])
    result = a / 2.0
    return result
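It is worth checking both variants on a shape with a known area before relying on either. The sketch below, with function names of my own, runs the signed shoelace sum and the per-term absolute value version on a unit square that does not contain the origin. The two agree only when every cross term has the same sign, which is not the case here.

```python
def signed_shoelace_area(vertices):
    """Shoelace formula: sum the signed cross terms, take abs once at the end."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n
        total += vertices[i][0] * vertices[j][1] - vertices[j][0] * vertices[i][1]
    return abs(total) / 2.0


def per_term_abs_area(vertices):
    """Variant that takes abs of every cross term before summing."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n
        total += abs(vertices[i][0] * vertices[j][1] - vertices[j][0] * vertices[i][1])
    return total / 2.0


# Unit square away from the origin; the true area is 1.
square = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (1.0, 2.0)]
print(signed_shoelace_area(square))  # 1.0
print(per_term_abs_area(square))     # 3.0
```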
A bit late here, but have you considered simply using sympy?
A simple example:
from sympy import Polygon
a = Polygon((0, 0), (2, 0), (2, 2), (0, 2)).area
print(a)
I compared every solution offered here to the result of Shapely's area method. They had the right integer part, but the decimals differed. Only @Trenton's solution gave the correct result.
Improving on @Trenton's answer to process coordinates as a list of tuples, I came up with the following:
import numpy as np
def polygon_area(coords):
    # get x and y in vectors
    x = np.array([point[0] for point in coords])
    y = np.array([point[1] for point in coords])
    # shift coordinates
    x_ = x - np.mean(x)
    y_ = y - np.mean(y)
    # calculate area
    correction = x_[-1] * y_[0] - y_[-1] * x_[0]
    main_area = np.dot(x_[:-1], y_[1:]) - np.dot(y_[:-1], x_[1:])
    return 0.5 * np.abs(main_area + correction)
#### Example output
coords = [(385495.19520441635, 6466826.196947694), (385496.1951836388, 6466826.196947694), (385496.1951836388, 6466825.196929455), (385495.19520441635, 6466825.196929455), (385495.19520441635, 6466826.196947694)]
Shapely's area method: 0.9999974610685296
#Trenton's area method: 0.9999974610685296
This is much simpler, for regular polygons:
import math
def area_polygon(n, s):
    return 0.25 * n * s**2 / math.tan(math.pi/n)
since the formula is ¼ n s² / tan(π/n),
given the number of sides, n, and the length of each side, s.
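As a cross-check, the closed-form value can be compared with the shoelace sum over explicitly generated vertices. The helpers regular_polygon_vertices and shoelace_area are hypothetical names; the vertex construction uses the circumradius s / (2 sin(π/n)).

```python
import math


def area_polygon(n, s):
    """Area of a regular polygon with n sides of length s."""
    return 0.25 * n * s**2 / math.tan(math.pi / n)


def regular_polygon_vertices(n, s):
    """Vertices of a regular n-gon with side s, centred on the origin."""
    radius = s / (2.0 * math.sin(math.pi / n))  # circumradius
    return [(radius * math.cos(2.0 * math.pi * k / n),
             radius * math.sin(2.0 * math.pi * k / n)) for k in range(n)]


def shoelace_area(vertices):
    """Shoelace area of a simple polygon given as (x, y) pairs."""
    n = len(vertices)
    total = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(total) / 2.0


for n, s in [(3, 1.0), (4, 2.0), (6, 1.5)]:
    print(n, area_polygon(n, s), shoelace_area(regular_polygon_vertices(n, s)))
```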
Based on
https://www.mathsisfun.com/geometry/area-irregular-polygons.html
def _area_(coords):
    t=0
    for count in range(len(coords)-1):
        y = coords[count+1][1] + coords[count][1]
        x = coords[count+1][0] - coords[count][0]
        z = y * x
        t += z
    return abs(t/2.0)
a=[(5.09,5.8), (1.68,4.9), (1.48,1.38), (4.76,0.1), (7.0,2.83), (5.09,5.8)]
print(_area_(a))
The trick is that the first coordinate should also be last.
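If the input may or may not repeat its first vertex, a small guard can close the ring before summing. This is a sketch with hypothetical helper names (close_ring, trapezoid_area), using the same trapezoid sum as above.

```python
def close_ring(coords):
    """Return coords with the first vertex repeated at the end if needed."""
    coords = list(coords)
    if coords and coords[0] != coords[-1]:
        coords.append(coords[0])
    return coords


def trapezoid_area(coords):
    """Trapezoid-sum area; closes the ring first so the last edge counts."""
    ring = close_ring(coords)
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        total += (y1 + y0) * (x1 - x0)
    return abs(total / 2.0)


open_ring = [(0, 0), (3, 0), (3, 2), (0, 2)]  # 3 x 2 rectangle, not closed
print(trapezoid_area(open_ring))               # 6.0
print(trapezoid_area(close_ring(open_ring)))   # 6.0, closing twice is harmless
```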
def find_int_coordinates(n: int, coords: list[list[int]]) -> float:
    rez = 0
    x, y = coords[n - 1]
    for coord in coords:
        rez += (x + coord[0]) * (y - coord[1])
        x, y = coord
    return abs(rez / 2)
In this example, the column-wise sum of an array pr is computed in two different ways:
(a) take the sum over the first axis using np.sum's axis parameter
(b) slice the array along the second axis and take the sum of each slice
import matplotlib.pyplot as plt
import numpy as np
m = 100
n = 2000
x = np.random.random_sample((m, n))
X = np.abs(np.fft.rfft(x)).T
frq = np.fft.rfftfreq(n)
total = X.sum(axis=0)
c = frq @ X / total
df = frq[:, None] - c
pr = df * X
a = np.sum(pr, axis=0)
b = [np.sum(pr[:, i]) for i in range(m)]
fig, ax = plt.subplots(1)
ax.plot(a)
ax.plot(b)
plt.show()
Both methods should return the same result, but for whatever reason, in this example, they do not. As you can see in the plot below, a and b have totally different values. The difference is, however, so small that np.allclose(a, b) is True.
If you replace pr with some small random values, there is no difference between the two summation methods:
pr = np.random.randn(n, m) / 1e12
a = np.sum(pr, axis=0)
b = np.array([np.sum(pr[:, i]) for i in range(m)])
fig, ax = plt.subplots(1)
ax.plot(a)
ax.plot(b)
plt.show()
The second example indicates that the differences in the sums of the first example are not related to the summation methods. Is this then a problem related to floating-point summation? If so, why doesn't such an effect occur in the second example?
Why do the column-wise sums differ in the first example, and which one is correct?
For why the results are different, see https://stackoverflow.com/a/55469395/7207392. The slice case uses pairwise summation, the axis case doesn't.
Which one is correct? Well, probably neither, but pairwise summation is expected to be more accurate.
Indeed, we can see that it is fairly close to the exact (within machine precision) result obtained using math.fsum.
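A comparison along these lines can be sketched as follows. The random data is a hypothetical stand-in for pr (not the array from the question), and math.fsum serves as the correctly rounded reference. Which of the two NumPy sums lands closer to the reference may depend on the data and the NumPy build, so the sketch only prints both errors.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 2000
# Stand-in data: large values of mixed sign, so rounding errors are visible.
pr = rng.standard_normal((n, m)) * 1e6

a = pr.sum(axis=0)                                       # reduce over axis 0
b = np.array([pr[:, i].sum() for i in range(m)])         # one 1-D sum per column
ref = np.array([math.fsum(pr[:, i]) for i in range(m)])  # correctly rounded sums

print("max |axis  - fsum|:", np.abs(a - ref).max())
print("max |slice - fsum|:", np.abs(b - ref).max())
```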
How can I get an exact value on the y-axis from a plot in Python? I have two arrays, vertical_data and gradient(temperature_data), and I plotted them as:
plt.plot(gradient(temperature_data),vertical_data)
plt.show()
Plot shown here:
I need the zero value but it is not exactly zero, it's a float.
I did not find a good answer to the question of how to find the roots or zeros of a numpy array, so here is a solution, using simple linear interpolation.
import numpy as np
N = 750
x = .4+np.sort(np.random.rand(N))*3.5
y = (x-4)*np.cos(x*9.)*np.cos(x*6+0.05)+0.1
def find_roots(x,y):
    s = np.abs(np.diff(np.sign(y))).astype(bool)
    return x[:-1][s] + np.diff(x)[s]/(np.abs(y[1:][s]/y[:-1][s])+1)
z = find_roots(x,y)
import matplotlib.pyplot as plt
plt.plot(x,y)
plt.plot(z, np.zeros(len(z)), marker="o", ls="", ms=4)
plt.show()
Of course you can invert the roles of x and y to get
plt.plot(y,x)
plt.plot(np.zeros(len(z)),z, marker="o", ls="", ms=4)
Because people were asking how to get the intercepts at non-zero values y0, note that one may simply find the zeros of y-y0 then.
y0 = 1.4
z = find_roots(x,y-y0)
# ...
plt.plot(z, np.zeros(len(z))+y0)
People were also asking how to get the intersection between two curves. In that case it's again about finding the roots of the difference between the two, e.g.
x = .4 + np.sort(np.random.rand(N)) * 3.5
y1 = (x - 4) * np.cos(x * 9.) * np.cos(x * 6 + 0.05) + 0.1
y2 = (x - 2) * np.cos(x * 8.) * np.cos(x * 5 + 0.03) + 0.3
z = find_roots(x,y2-y1)
plt.plot(x,y1)
plt.plot(x,y2, color="C2")
plt.plot(z, np.interp(z, x, y1), marker="o", ls="", ms=4, color="C1")
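Applied to the situation in the question, the same function gives the vertical position at which the temperature gradient crosses zero. The arrays below are hypothetical stand-ins for vertical_data and temperature_data (a quadratic profile with its minimum at a height of 3), and np.gradient is assumed to be the gradient function used there.

```python
import numpy as np


def find_roots(x, y):
    """Zero crossings of y(x) by linear interpolation between samples."""
    s = np.abs(np.diff(np.sign(y))).astype(bool)
    return x[:-1][s] + np.diff(x)[s] / (np.abs(y[1:][s] / y[:-1][s]) + 1)


# Hypothetical stand-ins for the arrays in the question.
vertical_data = np.linspace(0.0, 10.0, 48)
temperature_data = (vertical_data - 3.0) ** 2 + 15.0

grad = np.gradient(temperature_data, vertical_data)
heights = find_roots(vertical_data, grad)  # heights where the gradient is 0
print(heights)
```

Since the plot in the question has the gradient on the horizontal axis and vertical_data on the vertical axis, the returned values are the y-axis positions of the zero crossing.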