How to divide the point cloud without duplicate?

Hi, I want to divide a point cloud into groups without duplicates.
For example, I have a point cloud of shape (B, N, C), where B is the batch size, N is the number of points, and C is the number of features (xyz, RGB, ...).
I want to split it into groups of K points each, grouping nearest points together, with no point appearing in more than one group.
For example, if K = 8, I want a tensor of shape (B, N//8, 8, C). How can I do this?
Can I just use reshape or view in torch, or do I need a different approach?
I've already considered K-nearest neighbors, but the resulting groups overlap.

Related

How to approximate coordinates basing on azimuths?

Suppose I have a series of (imperfect) azimuth readouts, giving me vague angles between a number of points. Lines projected from points A, B, C obviously never converge in a single point to define the location of point D. Hence, angles as viewed from A, B and C need to be adjusted.
To make it more fun, I might be more certain of the relative positions of specific points (suppose I locate them on a satellite image, or I know for a fact they are oriented perfectly north-south), so I might want to use that certainty in my calculations and NOT adjust certain angles at all.
By what technique should I average the resulting coordinates, to achieve a "mostly accurate" overall shape?
I considered treating the difference between non-adjusted and adjusted angles as "tension" and trying to "relieve" it in subsequent passes, but that approach gives priority to points calculated earlier.
Another approach could be to calculate the total "tension" in the set, then shake all angles by a random amount, see if that resulted in less tension, and repeat for possibly improved results, trying to evolve a possibly better solution.

As I understand it you have a bunch of unknown points (p[] say) and a number of measurements of azimuths, say Az[i,j] of p[j] from p[i]. You want to find the coordinates of the points.
You'll need to fix one point. This is because if p[] is a solution (i.e. it gives the measured azimuths), then so is q[], where for some fixed x,
q[i] = p[i] + x
I'll suppose you fix p[0].
You'll also need to fix a distance. This is because if p[] is a solution, so too is q[] where now for some fixed s,
q[i] = p[0] + s*(p[i] - p[0])
I'll suppose you fix dist(p[0], p[1]), and that there is an azimuth Az[0,1]. It is best to choose p[0] and p[1] so that there is a reliable azimuth between them. Then we can compute p[1].
The usual way to approach such problems is least squares. That is we seek p[] to minimise
Sum square( (Az[i,j] - Azimuth( p[i], p[j]))/S[i,j])
where Az[i,j] is your measurement data
Azimuth( r, s) is the function that gives the azimuth of the point s from the point r
S[i,j] is the 'sd' (standard deviation) of the measurement Az[i,j] -- the higher the sd of a particular observation is, relative to the others, the less it affects the final result.
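A minimal sketch of the weighted sum of squares above, in Python. It assumes planar coordinates, azimuths in radians measured clockwise from north (the +y axis), and observations stored in a dict keyed by (i, j). The names azimuth, wrap and cost are illustrative and not from the original answer. The angle difference is wrapped into [-pi, pi) so that readings of 359 degrees and 1 degree count as 2 degrees apart rather than 358.

```python
import math


def azimuth(r, s):
    """Azimuth of point s as seen from point r, in radians clockwise from north (+y)."""
    return math.atan2(s[0] - r[0], s[1] - r[1]) % (2 * math.pi)


def wrap(angle):
    """Wrap an angle difference into the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def cost(points, observations):
    """Weighted sum of squared azimuth residuals.

    points:       list of (x, y) tuples, the current estimate of p[]
    observations: dict mapping (i, j) -> (Az[i,j], S[i,j])
    """
    total = 0.0
    for (i, j), (az, sd) in observations.items():
        residual = wrap(az - azimuth(points[i], points[j]))
        total += (residual / sd) ** 2
    return total


# Three points and noise-free observations: the cost should be close to zero.
pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
obs = {(0, 1): (0.0, 0.01), (0, 2): (math.pi / 4, 0.01), (1, 2): (math.pi / 2, 0.01)}
print(cost(pts, obs))
```

A nonlinear least-squares solver would minimise this cost over the free coordinates while p[0] and p[1] stay fixed.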
The above is a non linear least squares problem. There are many solvers available for this, but generally speaking as well as providing the data -- the Az[] and the S[] -- and the observation model -- the Azimuth function -- you need to provide an initial estimate of the state -- the values sought, in your case p[2] ..
It is highly likely that if your initial estimate is wrong the solver will fail.
One way to find this estimate would be to start with a set K of known point indices and seek to expand it. You would start with K being {0,1}. Then look for points that have as many azimuths as possible to points in K, and for such points estimate geometrically their position from the known points and the azimuths, and add them to K. If at the end you have all the points in K, then you can go on to the least squares. If not, it's possible that a different pair of initial fixed points might do better, or maybe you are stuck.
The latter case is a real possibility. For example, suppose you had points p[0], p[1], p[2], p[3] and azimuths Az[0,1], Az[1,2], Az[1,3], Az[2,3].
As above we fix the positions of p[0] and p[1]. But we can't compute positions of p[2] and p[3] because we do not know the distances of 2 or 3 from 1. The 1,2,3 triangle could be scaled arbitrarily and still give the same azimuths.
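The geometric estimate used when expanding K can be sketched as a ray intersection: given two known points and the azimuth of the unknown point from each, solve for where the two lines of sight cross. This is only an illustration under assumptions (planar coordinates, azimuths in radians clockwise from north); the function name intersect_azimuths is hypothetical, and the sketch does not check that the crossing lies in front of both observers.

```python
import math


def intersect_azimuths(a, az_a, b, az_b, eps=1e-12):
    """Estimate an unknown point from two known points and azimuths to it.

    a, b:       known (x, y) positions
    az_a, az_b: azimuth of the unknown point from a and from b,
                in radians clockwise from north (+y)
    Returns (x, y), or None when the lines of sight are (nearly) parallel.
    """
    # Unit direction of a line of sight with azimuth theta is (sin theta, cos theta).
    da = (math.sin(az_a), math.cos(az_a))
    db = (math.sin(az_b), math.cos(az_b))
    denom = da[0] * db[1] - da[1] * db[0]  # 2D cross product da x db
    if abs(denom) < eps:
        return None
    bx, by = b[0] - a[0], b[1] - a[1]
    # Solve a + t*da = b + u*db for t by crossing both sides with db.
    t = (bx * db[1] - by * db[0]) / denom
    return (a[0] + t * da[0], a[1] + t * da[1])


# The point (1, 1) seen from (0, 0) at 45 degrees and from (2, 0) at 315 degrees.
print(intersect_azimuths((0, 0), math.radians(45), (2, 0), math.radians(315)))
```

With more than two azimuths to the same unknown point, the pairwise intersections could be averaged to get a starting value for the solver.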

Quickest way to find closest set of point

I have three arrays of points:
A=[[5,2],[1,0],[5,1]]
B=[[3,3],[5,3],[1,1]]
C=[[4,2],[9,0],[0,0]]
I need the most efficient way to find the three points (one for each array) that are closest to each other (within one pixel in each axis).
What I'm doing right now is taking one point as reference, let's say A[0], and cycling through all other B and C points looking for a solution. If A[0] gives me no result, I move the reference to A[1], and so on. This approach has a huge problem: if I increase the number of points in each array and/or the number of arrays, it sometimes takes too long to converge, especially if the solution is among the last members of the arrays. So I'm wondering if there is any way to do this without using a reference, or any quicker way than just looping over all the elements.
The rules that I must follow are the following:
the final solution has to be made by only one element from each array like: S=[A[n],B[m],C[j]]
each selected element has to be within 1 pixel in X and Y from ALL the other members of the solution (so |Xi-Xj|<=1 and |Yi-Yj|<=1 for each pair of members of the solution).
For example in this simplified case the solution would be: S=[A[1],B[2],C[2]]
To clarify the problem further: what I wrote above is just a simplified example to explain what I need. In my real case I don't know a priori the length of the lists or the number of lists I have to work with. It could be A,B,C, or A,B,C,D,E... (each with a different number of points), so I also need to make the solution as general as possible.

This requirement:
each selected element has to be within 1 pixel in X and Y from ALL the other members of the solution (so |Xi-Xj|<=1 and |Yi-Yj|<=1 for each pair of members of the solution).
massively simplifies the problem, because it means that for any given (xi, yi), there are only nine possible choices of (xj, yj).
So I think the best approach is as follows:
1. Copy B and C into sets of tuples.
2. Iterate over A. For each point (xi, yi):
   Iterate over the values of x from xi−1 to xi+1 and the values of y from yi−1 to yi+1. For each resulting point (xj, yj):
      Check if (xj, yj) is in B. If so:
         Iterate over the values of x from max(xi, xj)−1 to min(xi, xj)+1 and the values of y from max(yi, yj)−1 to min(yi, yj)+1. For each resulting point (xk, yk):
            Check if (xk, yk) is in C. If so, we're done!
3. If we get to the end without having a match, that means there isn't one.
This requires roughly O(len(A) + len(B) + len(C)) time and O(len(B) + len(C)) extra space.
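The steps above could be sketched in Python roughly as follows. This assumes integer pixel coordinates; the function name find_triple is illustrative and not from the original answer.

```python
def find_triple(A, B, C):
    """Return one point from each of A, B, C that are pairwise within
    1 pixel in both x and y, or None if no such triple exists."""
    b_set = {tuple(p) for p in B}
    c_set = {tuple(p) for p in C}
    for xi, yi in A:
        # The nine candidate positions for the point taken from B.
        for xj in range(xi - 1, xi + 2):
            for yj in range(yi - 1, yi + 2):
                if (xj, yj) not in b_set:
                    continue
                # The point from C must be within 1 pixel of both chosen points.
                for xk in range(max(xi, xj) - 1, min(xi, xj) + 2):
                    for yk in range(max(yi, yj) - 1, min(yi, yj) + 2):
                        if (xk, yk) in c_set:
                            return (xi, yi), (xj, yj), (xk, yk)
    return None


A = [[5, 2], [1, 0], [5, 1]]
B = [[3, 3], [5, 3], [1, 1]]
C = [[4, 2], [9, 0], [0, 0]]
print(find_triple(A, B, C))
```

On the arrays from the question this returns the first valid triple it encounters, (5, 2), (5, 3), (4, 2), which is A[0], B[1], C[0]. The example data admits more than one valid triple.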
Edited to add (due to a follow-up question in the comments): if you have N lists instead of just 3, then instead of nesting N loops deep (which gives time exponential in N), you'll want to do something more like this:
1. Copy B, C, etc., into sets of tuples, as above.
2. Iterate over A. For each point (xi, yi):
   Create a set containing (xi, yi) and its eight neighbors.
   For each of the lists B, C, etc.:
      For each element in the set of nine points, see if it's in the current list.
      Update the set to remove any points that aren't in the current list and don't have any neighbors in the current list.
   If the set still has at least one element, then great: each list contained a point that's within one pixel of that element (with all of those points also being within one pixel of each other). So, we're done!
3. If we get to the end without having a match, that means there isn't one.
which is much more complicated to implement, but is linear in N instead of exponential in N.
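A hedged sketch of a search that stays linear in the number of lists. Note that it swaps in a slightly different technique from the steps above: two points that are each within one pixel of a common center can still be two pixels apart from each other (for example (-1, 0) and (1, 0) around (0, 0)), so this version instead tests the four 2x2 pixel windows that contain each point of the first list. With integer coordinates, a set of points is pairwise within one pixel in x and y exactly when all of them fit in one such window. The function name find_group is illustrative.

```python
def find_group(lists):
    """Pick one point from every list so that all picks are pairwise
    within 1 pixel in x and y. Returns the list of picks, or None."""
    first, rest = lists[0], lists[1:]
    sets = [{tuple(p) for p in pts} for pts in rest]
    for x, y in first:
        # (x0, y0) is the lower-left corner of a 2x2 window containing (x, y).
        for x0 in (x - 1, x):
            for y0 in (y - 1, y):
                window = [(x0 + dx, y0 + dy) for dx in (0, 1) for dy in (0, 1)]
                picks = [(x, y)]
                for s in sets:
                    hit = next((p for p in window if p in s), None)
                    if hit is None:
                        break  # this list has no point inside the window
                    picks.append(hit)
                else:
                    return picks
    return None


A = [[5, 2], [1, 0], [5, 1]]
B = [[3, 3], [5, 3], [1, 1]]
C = [[4, 2], [9, 0], [0, 0]]
print(find_group([A, B, C]))
```

Each point of the first list costs four windows times four set lookups per remaining list, so the total work grows linearly with both the number of lists and the length of the first list.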

Currently, you are finding the solution with a brute-force algorithm which has O(n^2) complexity. If your lists contain 1000 items, your algorithm will need 1000000 iterations to run... (It's even O(n^3), as tobias_k pointed out.)
As described at https://en.wikipedia.org/wiki/Closest_pair_of_points_problem, you could improve it by using a divide and conquer algorithm, which would run in O(n log n) time.
You should search for Delaunay triangulation and/or Voronoi diagram implementations.
NB: if you can use external libs, you should also consider taking a look at the scipy lib: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.Delaunay.html

searching two identical submatrices in a bigger matrix

I need to find two identical submatrices in a larger matrix.
I don't see how to approach this, since I first need to pick a submatrix and then check whether an identical one is present.
e.g.
(N x M) 4x5 matrix
xy*yyx*
yx*xyy*
x*yyx*y
x*xyy*x
The identical submatrices are
yyx
xyy
marked with asterisks above (bold in the original). N, M <= 10.
I'm not sure where to start.

What are the limits on the submatrix? E.g., can you have a 1x1 submatrix? If there are no limits, then a dynamic programming approach would seem to be in order.
I.e., find the 1x1 matching submatrices, then use that as a starting point to find 2x2, 3x3, etc.
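Since N, M <= 10, a plain enumeration is also feasible. The sketch below is not the dynamic-programming approach suggested above; it is a simpler brute-force alternative that groups every h x w submatrix by its contents and reports the ones that occur at more than one position. It assumes the matrix is given as a list of equal-length strings with the asterisk markers removed, and that overlapping occurrences are acceptable. The function name find_identical is illustrative.

```python
from collections import defaultdict


def find_identical(matrix, h, w):
    """Map each h x w submatrix that occurs more than once to the list of
    (row, col) positions of its top-left corner."""
    rows, cols = len(matrix), len(matrix[0])
    seen = defaultdict(list)
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            # A tuple of row slices is hashable, so it can serve as a dict key.
            block = tuple(matrix[r + k][c:c + w] for k in range(h))
            seen[block].append((r, c))
    return {block: pos for block, pos in seen.items() if len(pos) > 1}


matrix = ["xyyyx",
          "yxxyy",
          "xyyxy",
          "xxyyx"]
print(find_identical(matrix, 2, 3))
```

For the example matrix and a 2x3 window this reports the block yyx / xyy at positions (0, 2) and (2, 1). Looping h and w over all sizes would cover every submatrix shape.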

KD Tree alternative/variant for weighted data

I'm using a static KD-Tree for nearest neighbor search in 3D space. However, the client's specifications have now changed so that I'll need a weighted nearest neighbor search instead. For example, in 1D space, I have a point A with weight 5 at 0, and a point B with weight 2 at 4; the search should return A if the query point is from -5 to 5, and should return B if the query point is from 5 to 6. In other words, the higher-weighted point takes precedence within its radius.
Google hasn't been any help - all I get is information on the K-nearest neighbors algorithm.
I can simply remove points that are completely subsumed by a higher-weighted point, but this generally isn't the case (usually a lower-weighted point is only partially subsumed, like in the 1D example above). I could use a range tree to query all points in an NxNxN cube centered on the query point and determine the one with the greatest weight, but the naive implementation of this is wasteful - I'll need to set N to the point with the maximum weight in the entire tree, even though there may not be a point with that weight within the cube, e.g. let's say the point with the maximum weight in the tree is 25, then I'll need to set N to 25 even though the point with the highest weight for any given cube probably has a much lower weight; in the 1D case, if I have a point located at 100 with weight 25 then my naive algorithm would need to set N to 25 even if I'm outside of the point's radius.
To sum up, I'm looking for a way that I can query the KD tree (or some alternative/variant) such that I can quickly determine the highest-weighted point whose radius covers the query point.
FWIW, I'm coding this in Java.
It would also be nice if I could dynamically change a point's weight without incurring too high of a cost - at present this isn't a requirement, but I'm expecting that it may be a requirement down the road.
Edit: I found a paper on a priority range tree, but this doesn't exactly address the same problem in that it doesn't account for higher-priority points having a greater radius.

Use an extra dimension for the weight. A point (x,y,z) with weight w is placed at (N-w,x,y,z), where N is the maximum weight.
Distances in 4D are defined by…
d((a, b, c, d), (e, f, g, h)) = |a - e| + d((b, c, d), (f, g, h))
…where the second d is whatever your 3D distance was.
To find all potential results for (x,y,z), query a ball of radius N about (0,x,y,z).
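To see why this works: a lifted point (N-w, p) lies in the ball of radius N about (0, q) exactly when (N - w) + d(p, q) <= N, i.e. when d(p, q) <= w, so the ball contains precisely the points whose radius covers the query. The sketch below checks this on the 1D example from the question. It is written in Python rather than Java, uses a linear scan in place of a real KD-tree, and the names lift, dist4 and weighted_query are illustrative.

```python
import math


def euclid(p, q):
    """Plain Euclidean distance in the original space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def lift(point, weight, max_weight):
    """Place a weighted point at (N - w, x, y, z)."""
    return (max_weight - weight,) + tuple(point)


def dist4(p, q):
    """Lifted distance: |a - e| plus the original distance of the rest."""
    return abs(p[0] - q[0]) + euclid(p[1:], q[1:])


def weighted_query(points, query):
    """points: list of (coords, weight). Return the highest-weighted point
    whose radius covers the query, or None if no radius covers it."""
    n = max(w for _, w in points)
    center = (0,) + tuple(query)
    # A spatial index would answer this ball query; here we scan linearly.
    covering = [(w, p) for p, w in points if dist4(lift(p, w, n), center) <= n]
    return max(covering)[1] if covering else None


pts = [((0,), 5), ((4,), 2)]  # A: weight 5 at 0, B: weight 2 at 4
print(weighted_query(pts, (3,)), weighted_query(pts, (5.5,)), weighted_query(pts, (7,)))
```

The query at 3 is covered by both points and resolves to A, the query at 5.5 is covered only by B, and the query at 7 is covered by neither.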

I think I've found a solution: the nested interval tree, which is an implementation of a 3D interval tree. Rather than storing points with an associated radius that I then need to query, I instead store and query the radii directly. This has the added benefit that each dimension does not need to have the same weight (so that the radius is a rectangular box instead of a cubic box), which is not presently a project requirement but may become one in the future (the client only recently added the "weighted points" requirement, who knows what else he'll come up with).

Calculating the distance between each pair of a set of points

So I'm working on simulating a large number of n-dimensional particles, and I need to know the distance between every pair of points. Allowing for some error, and given that the distance isn't relevant at all if it exceeds some threshold, are there any good ways to accomplish this? I'm pretty sure that if I want dist(A,C) and already know dist(A,B) and dist(B,C), I can bound it by [|dist(A,B)-dist(B,C)| , dist(A,B)+dist(B,C)], and then store the results in a sorted array, but I'd like to not reinvent the wheel if there's something better.
I don't think the number of dimensions should greatly affect the logic, but maybe for some solutions it will. Thanks in advance.

If the problem were simply about calculating the distances between all pairs, then it would be an O(n^2) problem without any chance for a better solution. However, you are saying that if the distance is greater than some threshold D, then you are not interested in it. This opens up opportunities for a better algorithm.
For example, in 2D case you can use the sweep-line technique. Sort your points lexicographically, first by y then by x. Then sweep the plane with a stripe of width D, bottom to top. As that stripe moves across the plane new points will enter the stripe through its top edge and exit it through its bottom edge. Active points (i.e. points currently inside the stripe) should be kept in some incrementally modifiable linear data structure sorted by their x coordinate.
Now, every time a new point enters the stripe, you have to check the currently active points to the left and to the right no farther than D (measured along the x axis). That's all.
The purpose of this algorithm (as it is typically the case with sweep-line approach) is to push the practical complexity away from O(n^2) and towards O(m), where m is the number of interactions we are actually interested in. Of course, the worst case performance will be O(n^2).
The above applies to 2-dimensional case. For n-dimensional case I'd say you'll be better off with a different technique. Some sort of space partitioning should work well here, i.e. to exploit the fact that if the distance between partitions is known to be greater than D, then there's no reason to consider the specific points in these partitions against each other.
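A minimal sketch of the 2D sweep described above, assuming points are (x, y) tuples and D > 0. The active set is kept as a plain sorted Python list maintained with bisect, which makes insertion and removal linear in the worst case; a balanced tree would be the choice for large inputs. The function name sweep_pairs is illustrative.

```python
import bisect
import math


def sweep_pairs(points, D):
    """Return all index pairs (i, j), i < j, whose points are at most D apart."""
    inf = float("inf")
    order = sorted(range(len(points)), key=lambda i: (points[i][1], points[i][0]))
    active = []   # (x, y, index) of points inside the stripe, sorted by x
    oldest = 0    # position in `order` of the lowest point still in the stripe
    pairs = []
    for i in order:
        x, y = points[i]
        # Points more than D below the current one leave through the bottom edge.
        while points[order[oldest]][1] < y - D:
            j = order[oldest]
            active.pop(bisect.bisect_left(active, (points[j][0], points[j][1], j)))
            oldest += 1
        # Only active points with x in [x - D, x + D] can be close enough.
        lo = bisect.bisect_left(active, (x - D, -inf, -1))
        hi = bisect.bisect_right(active, (x + D, inf, inf))
        for xj, yj, j in active[lo:hi]:
            if math.hypot(x - xj, y - yj) <= D:
                pairs.append((min(i, j), max(i, j)))
        bisect.insort(active, (x, y, i))
    return pairs


pts = [(0, 0), (0.5, 0), (3, 3), (3, 3.4)]
print(sorted(sweep_pairs(pts, 1.0)))
```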

If the distance beyond a certain threshold is not relevant, and this threshold is not too large, there are common techniques to make this more efficient: limit the search for neighbouring points using space-partitioning data structures. Possible options are:
Binning.
Trees: quadtrees(2d), kd-trees.
Binning with spatial hashing.
Also, since the distance from point A to point B is the same as the distance from point B to point A, this distance should only be computed once. Thus, you should use the following loop:
for point i from 0 to n-1:
    for point j from i+1 to n-1:
        distance(point i, point j)
Combining these two techniques is very common for n-body simulation for example, where you have particles affect each other if they are close enough. Here are some fun examples of that in 2d: http://forum.openframeworks.cc/index.php?topic=2860.0
Here's an explanation of binning (and hashing): http://www.cs.cornell.edu/~bindel/class/cs5220-f11/notes/spatial.pdf
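A rough sketch of binning with spatial hashing combined with the compute-each-pair-once rule, for points of any dimension. It assumes the cell size equals the threshold, so two points within the threshold always fall in the same or in adjacent cells. The number of neighboring cells is 3^dim, which is an assumption that only holds up for a modest number of dimensions. The function name close_pairs is illustrative, and math.dist requires Python 3.8 or later.

```python
import math
from collections import defaultdict
from itertools import product


def close_pairs(points, threshold):
    """Return {(i, j): distance} for all i < j with distance <= threshold."""
    dim = len(points[0])
    # Hash every point index into the grid cell that contains the point.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(c / threshold) for c in p)].append(idx)
    offsets = list(product((-1, 0, 1), repeat=dim))
    result = {}
    for cell, members in cells.items():
        for off in offsets:
            other = tuple(c + o for c, o in zip(cell, off))
            # .get avoids creating empty cells while iterating over the dict.
            for i in members:
                for j in cells.get(other, ()):
                    if i < j:  # each unordered pair is handled exactly once
                        d = math.dist(points[i], points[j])
                        if d <= threshold:
                            result[(i, j)] = d
    return result


pts = [(0, 0), (0.5, 0), (3, 3), (3, 3.4)]
print(sorted(close_pairs(pts, 1.0)))
```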
