How to delete the element in a list which is duplicated for linear representation in python?

How to delete the element in a list which is duplicated for linear representation in python? - python-3.x

https://imgur.com/a/0lFwssy
I want to draw an evolution diagram like this, [1,2,3,4] is an annotation to the point:
1 :(x=1,y=2)
2 :(x=2,y=3)
3 :(x=3,y=5)
4 :(x=4,y=6)
The connection is like:
a = [1,1,2,3] *Starting Point
b = [2,4,4,4] *Ending Point
And because point1 and point2 both connect to point4 and I don't want the connection of point to point4 because point1 evolved to point2 first.
So I want to get
https://imgur.com/a/asAUlHQ
c = [1,2,3]
d = [2,4,4]
I tried to use zip to write a for loop but it failed.
How to get c and d in python?

From what I understand it looks like you are looking for a minimum spanning tree for the graph where the edges are (a_i,b_i). You can do this as follows:
A = sp.sparse.csr_matrix((len(a),len(a)),dtype='bool')
A[a,b] = 1
c,d = sp.sparse.csgraph.minimum_spanning_tree(A).nonzero()
Note that the minimum spanning tree is not unique.

Related

How to match a geometric template of 2D boxes to fit another set of 2D boxes

I'm trying to find a match between a set of 2D boxes with coordinates (A) (from a template with known sizes and distances between boxes) to another set of 2D boxes with coordinates (B) (which may contain more boxes than A). They should match in terms of each box from A corresponds to a single Box in B. The boxes in A together form a "stamp" which is assymmetrical in atleast one dimension.
Illustration of problem
explanation: "Stanz" in the illustration is a box from set A.
One might even think of the Set A as only 2D points (the centerpoint of the box) to make it simpler.
The end result will be to know which A box corresponds to which B box.
I can only think of very specific ways of doing this, tailored to a specific layout of boxes, is there any known generic ways of dealing with this forms of matching/search problems and what are they called?
Edit: Possible solution
I have come up with one possible solution, looking for all the possible rotations at each possible B center position for a single box from set A. Here all of the points in A would be rotated and compared against the distance to B centers. Not sure if this is a good way.
Looking for the possible rotations at each B centerpoint- solution

In your example, the transformation between the template and its presence in B can be entirely defined (actually, over-defined) by two matching points.
So here's a simple approach which is kind of performant. First, put all the points in B into a kD-tree. Now, pick a canonical "first" point in A, and hypothesize matching it to each of the points in B. To check whether it matches a particular point in B, pick a canonical "second" point in A and measure its distance to the "first" point. Then, use a standard kD proximity-bounding query to find all the points in B which are roughly that distance from your hypothesized matched "first" point in B. For each of those, determine the transformation between A and B, and for each of the other points in A, determine whether there's a point in A at roughly the right place (again, using the kD-tree), early-outing with the first unmatched point.
The worst-case performance there can get quite bad with pathological cases (O(n^3 log n), I think) but in general I would expect roughly O(n log n) for well-behaved data with a low threshold. Note that the thresholding is a bit rough-and-ready, and the results can depend on your choice of "first" and "second" points.

This is more of an idea than an answer, but it's too long for a comment. I asked some additional questions in a comment above, but the answers may not be particular relevant, so I'll go ahead and offer some thoughts in the meantime.
As you may know, point matching is its own problem domain, and if you search for 'point matching algorithm', you'll find various articles, papers, and other resources. It seems though that an ad hoc solution might be appropriate here (one that's simpler than more generic algorithms that are available).
I'll assume that the input point set can only be rotated, and not also flipped. If this idea were to work though, it should also work with flipping - you'd just have to run the algorithm separately for each flipped configuration.
In your example image, you've matched a point from set A with a point from set B so that they're coincident. Call this shared point the 'anchor' point. You'd need to do this for every combination of a point from set A and a point from set B until you found a match or exhausted the possibilities. The problem then is to determine if a match can be made given one of these matched point pairs.
It seems that for a given anchor point, a necessary but not sufficient condition for a match is that a point from set A and a point from set B can be found that are approximately the same distance from the anchor point. (What 'approximately' means would depend on the input, and would need to be tuned appropriately given that you're using integers.) This condition is met in your example image in that the center point of each point set is (approximately) the same distance from the anchor point. (Note that there could be multiple pairs of points that meet this condition, in which case you'd have to examine each such pair in turn.)
Once you have such a pair - the center points in your example - you can use some simple trigonometry and linear algebra to rotate set A so that the points in the pair coincide, after which the two point sets are locked together at two points and not just one. In your image that would involve rotating set A about 135 degrees clockwise. Then you check to see if every point in set B has a point in set A with which it's coincident, to within some threshold. If so, you have a match.
In your example, this fails of course, because the rotation is not actually a match. Eventually though, if there's a match, you'll find the anchor point pair for which the test succeeds.
I realize this would be easier to explain with some diagrams, but I'm afraid this written explanation will have to suffice for the moment. I'm not positive this would work - it's just an idea. And maybe a more generic algorithm would be preferable. But, if this did work, it might have the advantage of being fairly straightforward to implement.
[Edit: Perhaps I should add that this is similar to your solution, except for the additional step to allow for only testing a subset of the possible rotations.]
[Edit: I think a further refinement may be possible here. If, after choosing an anchor point, matching is possible via rotation, it should be the case that for every point p in B there's a point in A that's (approximately) the same distance from the anchor point as p is. Again, it's a necessary but not sufficient condition, but it allows you to quickly eliminate cases where a match isn't possible via rotation.]

Below follows a finished solution in python without kD-tree and without early outing candidates. A better way is to do the implementation yourself according to Sneftel but if you need anything quick and with a plot this might be useful.
Plot shows the different steps, starts off with just the template as a collection of connected lines. Then it is translated to a point in B where the distances between A and B points fits the best. Finally it is rotated.
In this example it was important to also match up which of the template positions was matched to which boundingbox position, so its an extra step in the end. There might be some deviations in the code compared to the outline above.
import numpy as np
import random
import math
import matplotlib.pyplot as plt
def to_polar(pos_array):
x = pos_array[:, 0]
y = pos_array[:, 1]
length = np.sqrt(x ** 2 + y ** 2)
t = np.arctan2(y, x)
zip_list = list(zip(length, t))
array_polar = np.array(zip_list)
return array_polar
def to_cartesian(pos):
# first element radius
# second is angle(theta)
# Converting polar to cartesian coordinates
radius = pos[0]
theta = pos[1]
x = radius * math.cos(theta)
y = radius * math.sin(theta)
return x,y
def calculate_distance_points(p1,p2):
return np.sqrt((p1[0]-p2[0])**2+(p1[1]-p2[1])**2)
def find_closest_point_inx(point, neighbour_set):
shortest_dist = None
closest_index = -1
# Find the point in the secondary array that is the closest
for index,curr_neighbour in enumerate(neighbour_set):
distance = calculate_distance_points(point, curr_neighbour)
if shortest_dist is None or distance < shortest_dist:
shortest_dist = distance
closest_index = index
return closest_index
# Find the sum of distances between each point in primary to the closest one in secondary
def calculate_agg_distance_arrs(primary,secondary):
total_distance = 0
for point in primary:
closest_inx = find_closest_point_inx(point, secondary)
dist = calculate_distance_points(point, secondary[closest_inx])
total_distance += dist
return total_distance
# returns a set of <primary_index,neighbour_index>
def pair_neighbours_by_distance(primary_set, neighbour_set, distance_limit):
pairs = {}
for num, point in enumerate(primary_set):
closest_inx = find_closest_point_inx(point, neighbour_set)
if calculate_distance_points(neighbour_set[closest_inx], point) > distance_limit:
closest_inx = None
pairs[num]=closest_inx
return pairs
def rotate_array(array, angle,rot_origin=None):
if rot_origin is not None:
array = np.subtract(array,rot_origin)
# clockwise rotation
theta = np.radians(angle)
c, s = np.cos(theta), np.sin(theta)
R = np.array(((c, -s), (s, c)))
rotated = np.matmul(array, R)
if rot_origin is not None:
rotated = np.add(rotated,rot_origin)
return rotated
# Finds out a point in B_set and a rotation where the points in SetA have the best alignment towards SetB.
def find_stamp_rotation(A_set, B_set):
# Step 1
anchor_point_A = A_set[0]
# Step 2. Convert all points to polar coordinates with anchor as origin
A_anchor_origin = A_set - anchor_point_A
anchor_A_polar = to_polar(A_anchor_origin)
print(anchor_A_polar)
# Step 3 for each point in B
score_tuples = []
for num_anchor, B_anchor_point_try in enumerate(B_set):
# Step 3.1
B_origin_rel_point = B_set-B_anchor_point_try
B_polar_rp_origin = to_polar(B_origin_rel_point)
# Step 3.3 select arbitrary point q from Ap
point_Aq = anchor_A_polar[1]
# Step 3.4 test each rotation, where pointAq is rotated to each B-point (except the B anchor point)
for try_rot_point_B in [B_rot_point for num_rot, B_rot_point in enumerate(B_polar_rp_origin) if num_rot != num_anchor]:
# positive rotation is clockwise
# Step 4.1 Rotate Ap by the angle between q and n
angle_to_try = try_rot_point_B[1]-point_Aq[1]
rot_try_arr = np.copy(anchor_A_polar)
rot_try_arr[:,1]+=angle_to_try
cart_rot_try_arr = [to_cartesian(e) for e in rot_try_arr]
cart_B_rp_origin = [to_cartesian(e) for e in B_polar_rp_origin]
distance_score = calculate_agg_distance_arrs(cart_rot_try_arr, cart_B_rp_origin)
score_tuples.append((B_anchor_point_try,angle_to_try,distance_score))
# Step 4.3
lowest=None
for b_point,angle,distance in score_tuples:
print("point:{} angle(rad):{} distance(sum):{}".format(b_point,360*(angle/(2*math.pi)),distance))
if lowest is None or distance < lowest[2]:
lowest = b_point, 360*angle/(2*math.pi), distance
return lowest
def test_example():
ax = plt.subplot()
ax.grid(True)
plt.title('Fit Template to BBoxes by translation and rotation')
plt.xlim(-20, 20)
plt.ylim(-20, 20)
ax.set_xticks(range(-20,20), minor=True)
ax.set_yticks(range(-20,20), minor=True)
template = np.array([[-10,-10],[-10,10],[0,0],[10,-10],[10,10], [0,20]])
# Test Bboxes are Rotated 40 degree, translated 2,2
rotated = rotate_array(template,40)
rotated = np.subtract(rotated,[2,2])
# Adds some extra bounding boxes as noise
for i in range(8):
rotated = np.append(rotated,[[random.randrange(-20,20), random.randrange(-20,20)]],axis=0)
# Scramble entries in array and return the position change.
rnd_rotated = rotated.copy()
np.random.shuffle(rnd_rotated)
element_positions = []
# After shuffling, looks at which index the "A"-marks has ended up at. For later comparison to see that the algo found the correct answer.
# This is to represent the actual case, where I will get a bunch of unordered bboxes.
rnd_map = {}
indexes_translation = [num2 for num,point in enumerate(rnd_rotated) for num2,point2 in enumerate(rotated) if point[0]==point2[0] and point[1]==point2[1]]
for num,inx in enumerate(indexes_translation):
rnd_map[num]=inx
# algo part 1/3
b_point,angle,_ = find_stamp_rotation(template,rnd_rotated)
# Plot for visualization
legend_list = np.empty((0,2))
leg_template = plt.plot(template[:,0],template[:,1],c='r')
legend_list = np.append(legend_list,[[leg_template[0],'1. template-pattern']],axis=0)
leg_bboxes = plt.scatter(rnd_rotated[:,0],rnd_rotated[:,1],c='b',label="scatter")
legend_list = np.append(legend_list,[[leg_bboxes,'2. bounding boxes']],axis=0)
leg_anchor = plt.scatter(b_point[0],b_point[1],c='y')
legend_list = np.append(legend_list,[[leg_anchor,'3. Discovered bbox anchor point']],axis=0)
# algo part 2/3
# Superimpose A onto B by A[0] to b_point
offset = b_point - template[0]
super_imposed_A = template + offset
# Plot superimposed, but not yet rotated
leg_s_imposed = plt.plot(super_imposed_A[:,0],super_imposed_A[:,1],c='k')
#plt.legend(rubberduckz, "superimposed template on anchor")
legend_list = np.append(legend_list,[[leg_s_imposed[0],'4. Templ superimposed on Bbox']],axis=0)
print("Superimposed A on B by A[0] to {}".format(b_point))
print(super_imposed_A)
# Rotate, now the template should match pattern of bboxes
# algo part 3/4
super_imposed_rotated_A = rotate_array(super_imposed_A,-angle,rot_origin=super_imposed_A[0])
# Show the beautiful match in a last plot
leg_s_imp_rot = plt.plot(super_imposed_rotated_A[:,0],super_imposed_rotated_A[:,1],c='g')
legend_list = np.append(legend_list,[[leg_s_imp_rot[0],'5. final fit']],axis=0)
plt.legend(legend_list[:,0], legend_list[:,1],loc="upper left")
plt.show()
# algo part 4/4
pairs = pair_neighbours_by_distance(super_imposed_rotated_A, rnd_rotated, 10)
print(pairs)
for inx in range(len(pairs)):
bbox_num = pairs[inx]
print("template id:{}".format(inx))
print("bbox#id:{}".format(bbox_num))
#print("original_bbox:{}".format(rnd_map[bbox_num]))
if __name__ == "__main__":
test_example()
Result on actual image with bounding boxes. Here it can be seen that the scaling is incorrect which makes the template a bit off but it will still be able to pair up and thats the desired end-result in my case.

How to keep graph shape when read it by networkx

I have a file shows different points' coordinates(first 10 rows):
1 10.381090522139 55.39134945301
2 10.37928179195319 55.38858713256631
3 10.387152479898077 55.3923338690609
4 10.380048819655258 55.393938880906745
5 10.380679138517507 55.39459444742785
6 10.382474625286 55.392132993022
7 10.383736185130601 55.39454404088371
8 10.387334283235987 55.39433237195271
9 10.388468103023115 55.39536574771765
10 10.390814951258335 55.396308397998475
Now I want to calculate the MST(minimum spanning tree) of them so firstly I change my coordinates to weight graph(distance->weight):
n = 10
data = []
for i in range(0, n):
for j in range(i + 1, n):
temp = []
temp.append(i)
temp.append(j)
x = np.array(rawdata[i, 1:3])
y = np.array(rawdata[j, 1:3])
temp.append(np.linalg.norm(x - y))
data.append(temp)
Then, using networkx to load weight data:
G = nx.read_weighted_edgelist("data.txt")
T = nx.minimum_spanning_tree(G)
nx.draw(T)
plt.show()
but I cannot see the orignal shape from result:
how to solve this problem?

I'm just answering the question about the position of the nodes. I can't tell from what you've done whether the minimum spanning tree is what you're after or not.
When you plot a network, it will assign positions based on an algorithm that is in part stochastic. If you want the nodes to go at particular positions, you will have to include that information in the call in an optional argument. So define a dictionary (it's usually called pos) such that pos[node] is a tuple (x,y) where x is the x-coordinate of node and y is the y-coordinate of node.
Then the call is nx.draw(T, pos=pos).

Generate 2D list with given dimensions in Haskell

I am trying to generate a list of lists with a specified dimension.
the data type of this list looks something like this:
data A = X | Y | Z
so the list is of type [[A]]. (A is an instance of the Show type class so don't worry about that).
The user gives in a certain dimension (lets say width = 3 and height = 4), so the content could look like this:
[[X,Y,Z],
[Y,Y,X],
[Y,X,Z],
[X,Z,Z]]
How can I generate a width X height 'matrix', the values aren't all that important at the moment.
thanks in advance.
EDIT: (for clarity reasons)
I just want to know how to generate a 'matrix' of type [[A]] with the width and height as user input.
So width = number of elements in the inner list, height = number of lists in the outer list.

To generate a 3x4 nested list filled by a certain element, you can use:
data A = X | Y | Z deriving (Show)
generate width height = replicate height . replicate width
main = print $ generate 3 4 X
to get [[X,X,X],[X,X,X],[X,X,X],[X,X,X]].
Note that nested lists are not a great substitute for a 2D array in C/Java if the goal is to do frequent point updates. In those cases, use Data.Map or Data.Array.

How to find variability of a set of Cartesian Points (xyz) or fitting/distance to 3D line and/or plane?

So I was looking at this question:
Matlab - Standard Deviation of Cartesian Points
Which basically answers my question, except the problem is I have xyz, not xy. So I don't think Ax=b would work in this case.
I have, say, 10 Cartesian points, and I want to be able to find the standard deviation of these points. Now, I don't want standard deviation of each X, Y and Z (as a result of 3 sets) but I just want to get one number.
This can be done using MATLAB or excel.
To better understand what I'm doing, I have this desired point (1,2,3) and I recorded (1.1,2.1,2.9), (1.2,1.9,3.1) and so on. I wanted to be able to find the variability of all the recorded points.
I'm open for any other suggestions.

If you do the same thing as in the other answer you linked, it should work.
x_vals = xyz(:,1);
y_vals = xyz(:,2);
z_vals = xyz(:,3);
then make A with 3 columns,
A = [x_vals y_vals ones(size(x_vals))];
and
b = z_vals;
Then
sol=A\b;
m = sol(1);
n = sol(2);
c = sol(3);
and then
errs = (m*x_vals + n*y_vals + c) - z_vals;
After that you can use errs just as in the linked question.

Randomly clustered data
If your data is not expected to be near a line or a plane, just compute the distance of each point to the centroid:
xyz_bar = mean(xyz);
M = bsxfun(#minus,xyz,xyz_bar);
d = sqrt(sum(M.^2,2)); % distances to centroid
Then you can compute variability anyway you like. For example, standard deviation and RMS error:
std(d)
sqrt(mean(d.^2))
Data about a 3D line
If the data points are expected to be roughly along the path of a line, with some deviation from it, you might look at the distance to a best fit line. First, fit a 3D line to your points. One way is using the following parametric form of a 3D line:
x = a*t + x0
y = b*t + y0
z = c*t + z0
Generate some test data, with noise:
abc = [2 3 1]; xyz0 = [6 12 3];
t = 0:0.1:10;
xyz = bsxfun(#plus,bsxfun(#times,abc,t.'),xyz0) + 0.5*randn(numel(t),3)
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'*') % to visualize
Estimate the 3D line parameters:
xyz_bar = mean(xyz) % centroid is on the line
M = bsxfun(#minus,xyz,xyz_bar); % remove mean
[~,S,V] = svd(M,0)
abc_est = V(:,1).'
abc/norm(abc) % compare actual slope coefficients
Distance from points to a 3D line:
pointCentroidSeg = bsxfun(#minus,xyz_bar,xyz);
pointCross = cross(pointCentroidSeg, repmat(abc_est,size(xyz,1),1));
errs = sqrt(sum(pointCross.^2,2))
Now you have the distance from each point to the fit line ("error" of each point). You can compute the mean, RMS, standard deviation, etc.:
>> std(errs)
ans =
0.3232
>> sqrt(mean(errs.^2))
ans =
0.7017
Data about a 3D plane
See David's answer.

Best fit square to quadrilateral

I've got a shape consisting of four points, A, B, C and D, of which the only their position is known. The goal is to transform these points to have specific angles and offsets relative to each other.
For example: A(-1,-1) B(2,-1) C(1,1) D(-2,1), which should be transformed to a perfect square (all angles 90) with offsets between AB, BC, CD and AD all being 2. The result should be a square slightly rotated counter-clockwise.
What would be the most efficient way to do this?
I'm using this for a simple block simulation program.

As Mark alluded, we can use constrained optimization to find the side 2 square that minimizes the square of the distance to the corners of the original.
We need to minimize f = (a-A)^2 + (b-B)^2 + (c-C)^2 + (d-D)^2 (where the square is actually a dot product of the vector argument with itself) subject to some constraints.
Following the method of Lagrange multipliers, I chose the following distance constraints:
g1 = (a-b)^2 - 4
g2 = (c-b)^2 - 4
g3 = (d-c)^2 - 4
and the following angle constraints:
g4 = (b-a).(c-b)
g5 = (c-b).(d-c)
A quick napkin sketch should convince you that these constraints are sufficient.
We then want to minimize f subject to the g's all being zero.
The Lagrange function is:
L = f + Sum(i = 1 to 5, li gi)
where the lis are the Lagrange multipliers.
The gradient is non-linear, so we have to take a hessian and use multivariate Newton's method to iterate to a solution.
Here's the solution I got (red) for the data given (black):
This took 5 iterations, after which the L2 norm of the step was 6.5106e-9.

While Codie CodeMonkey's solution is a perfectly valid one (and a great use case for the Lagrangian Multipliers at that), I believe that it's worth mentioning that if the side length is not given this particular problem actually has a closed form solution.
We would like to minimise the distance between the corners of our fitted square and the ones of the given quadrilateral. This is equivalent to minimising the cost function:
f(x1,...,y4) = (x1-ax)^2+(y1-ay)^2 + (x2-bx)^2+(y2-by)^2 +
(x3-cx)^2+(y3-cy)^2 + (x4-dx)^2+(y4-dy)^2
Where Pi = (xi,yi) are the corners of the fitted square and A = (ax,ay) through D = (dx,dy) represent the given corners of the quadrilateral in clockwise order. Since we are fitting a square we have certain contraints regarding the positions of the four corners. Actually, if two opposite corners are given, they are enough to describe a unique square (save for the mirror image on the diagonal).
Parametrization of the points
This means that two opposite corners are enough to represent our target square. We can parametrise the two remaining corners using the components of the first two. In the above example we express P2 and P4 in terms of P1 = (x1,y1) and P3 = (x3,y3). If you need a visualisation of the geometrical intuition behind the parametrisation of a square you can play with the interactive version.
P2 = (x2,y2) = ( (x1+x3-y3+y1)/2 , (y1+y3-x1+x3)/2 )
P4 = (x4,y4) = ( (x1+x3+y3-y1)/2 , (y1+y3+x1-x3)/2 )
Substituting for x2,x4,y2,y4 means that f(x1,...,y4) can be rewritten to:
f(x1,x3,y1,y3) = (x1-ax)^2+(y1-ay)^2 + ((x1+x3-y3+y1)/2-bx)^2+((y1+y3-x1+x3)/2-by)^2 +
(x3-cx)^2+(y3-cy)^2 + ((x1+x3+y3-y1)/2-dx)^2+((y1+y3+x1-x3)/2-dy)^2
a function which only depends on x1,x3,y1,y3. To find the minimum of the resulting function we then set the partial derivatives of f(x1,x3,y1,y3) equal to zero. They are the following:
df/dx1 = 4x1-dy-dx+by-bx-2ax = 0 --> x1 = ( dy+dx-by+bx+2ax)/4
df/dx3 = 4x3+dy-dx-by-bx-2cx = 0 --> x3 = (-dy+dx+by+bx+2cx)/4
df/dy1 = 4y1-dy+dx-by-bx-2ay = 0 --> y1 = ( dy-dx+by+bx+2ay)/4
df/dy3 = 4y3-dy-dx-2cy-by+bx = 0 --> y3 = ( dy+dx+by-bx+2cy)/4
You may see where this is going, as simple rearrangment of the terms leads to the final solution.
Final solution

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string