Calculating the Held-Karp Lower Bound for the Traveling Salesman Problem (TSP)

I am currently researching the traveling salesman problem, and was wondering if anyone would be able to simply explain the Held-Karp lower bound. I have been looking at a lot of papers and I am struggling to understand it. If someone could explain it simply, that would be great.
I also know there is the method of calculating a minimum spanning tree of the vertices not including the starting vertex and then adding the two minimum edges from the starting vertex.

I'll try to explain this without going into too much detail. I'll avoid formal proofs and I'll try to avoid technical jargon. However, you might want to go over everything again once you have read it all through. And most importantly: try the algorithms out yourself.
Introduction
A 1-tree is a tree that consists of a spanning tree plus one extra vertex attached to it by 2 edges. You can check for yourself that every TSP tour is a 1-tree.
There is also such a thing as the minimum-1-tree of a graph. That is the tree you get from the following algorithm:
Exclude a vertex from your graph
Calculate the minimum spanning tree of the resulting graph
Attach the excluded vertex with its 2 smallest edges to the minimum spanning tree
*For now I'll assume that you know that a minimum-1-tree is a lower bound for the optimal TSP tour. There is an informal proof at the end.
You will find that the resulting tree is different when you exclude different vertices. However, all of the resulting trees can be considered lower bounds for the optimal tour in the TSP. Therefore the largest of the minimum-1-trees you have found this way is a better lower bound than the others found this way.
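The steps above can be sketched in code. A minimal pure-Python sketch (the 4-city distance matrix and the function names are my own, purely for illustration):

```python
# A minimal pure-Python sketch of the minimum-1-tree lower bound.
def mst_cost(dist, nodes):
    """Prim's algorithm: cost of a minimum spanning tree over `nodes`."""
    nodes = list(nodes)
    in_tree = {nodes[0]}
    cost = 0
    while len(in_tree) < len(nodes):
        # cheapest edge leaving the partial tree
        w, v = min((dist[u][x], x)
                   for u in in_tree for x in nodes if x not in in_tree)
        in_tree.add(v)
        cost += w
    return cost

def minimum_one_tree(dist, excluded):
    """Exclude a vertex, build the MST of the rest, then attach the
    excluded vertex with its 2 smallest edges."""
    rest = [v for v in range(len(dist)) if v != excluded]
    two_smallest = sorted(dist[excluded][v] for v in rest)[:2]
    return mst_cost(dist, rest) + sum(two_smallest)

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]

# the largest minimum-1-tree over all excluded vertices is the best bound
bound = max(minimum_one_tree(dist, v) for v in range(len(dist)))
```

On this particular instance the bound is 18, which happens to equal the length of the optimal tour 0-1-3-2-0.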
Held-Karp lower bound
The Held-Karp lower bound is an even tighter lower bound.
The idea is that you can alter the original graph in a special way. This modified graph will generate different minimum-1-trees than the original.
Furthermore (and this is important, so I'll repeat it throughout this paragraph in different words), the modification is such that the length of every valid TSP tour changes by the same (known) constant. In other words, the length of a valid TSP solution in this new graph = the length of a valid solution in the original graph plus a known constant. For example: say the weight of the TSP tour visiting vertices A, B, C and D in that order in the original graph is 10. Then the weight of the TSP tour visiting the same vertices in the same order in the modified graph is 10 + a known constant.
This, of course, is true for the optimal TSP tour as well. Therefore the optimal TSP tour in the modified graph is also an optimal tour in the original graph. And a minimum-1-tree of the modified graph is a lower bound for the optimal tour in the modified graph. Again, I'll just assume you understand that this generates a lower bound for your modified graph's optimal TSP tour. By subtracting a known constant from the lower bound found for your modified graph, you have a new lower bound for your original graph.
There are infinitely many such modifications to your graph. These different modifications result in different lower bounds. The tightest of these lower bounds is the Held-Karp lower bound.
How to modify your graph
Now that I have explained what the Held-Karp lower bound is, I will show you how to modify your graph to generate different minimum-1-trees.
Use the following algorithm:
Give every vertex in your graph an arbitrary weight
update the weight of every edge as follows: new edge weight = old edge weight + weight of one endpoint + weight of the other endpoint
For example, your original graph has the vertices A, B and C with edge AB = 3, edge AC = 5 and edge BC = 4. And for the algorithm you assign the (arbitrary) weights to the vertices A: 30, B: 40, C:50 then the resulting weights of the edges in your modified graph are AB = 3 + 30 + 40 = 73, AC = 5 + 30 + 50 = 85 and BC = 4 + 40 + 50 = 94.
The known constant for the modification is twice the sum of the weights given to the vertices. In this example the known constant is 2 * (30 + 40 + 50) = 240. Note: the tours in the modified graph are thus equal to the original tours + 240. In this example there is only one tour namely ABC. The tour in the original graph has a length of 3 + 4 + 5 = 12. The tour in the modified graph has a length of 73 + 85 + 94 = 252, which is indeed 240 + 12.
The reason why the constant equals twice the sum of the weights given to the vertices is because every vertex in a TSP tour has degree 2.
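The modification and its constant can be checked mechanically. A small Python sketch (the 4-city distance matrix is made up, and the weights extend the A/B/C example with a fourth vertex):

```python
# Sketch verifying that a vertex-weight modification shifts every tour
# by the same known constant. Distances and weights are illustrative.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
pi = [30, 40, 50, 60]              # arbitrary vertex weights
n = len(dist)

# modified edge weight = original weight + weight of each endpoint
mod = [[dist[i][j] + pi[i] + pi[j] if i != j else 0
        for j in range(n)] for i in range(n)]

def tour_length(d, tour):
    return sum(d[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))

tour = [0, 1, 3, 2]                # some tour visiting every vertex once
const = 2 * sum(pi)                # every vertex has degree 2 in a tour
assert tour_length(mod, tour) == tour_length(dist, tour) + const
```

Here const is 2 * (30 + 40 + 50 + 60) = 360, and the tour of length 18 in the original graph has length 378 in the modified one.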
You will need to know what to subtract from your minimum-1-tree to get a lower bound for the original graph. It is the same known constant: twice the sum of the weights given to the vertices. The minimum-1-tree of the modified graph is a lower bound for the optimal tour of the modified graph, and every tour of the modified graph is exactly that constant longer than in the original, so subtracting it yields a lower bound for the original optimal tour. The degrees of the vertices in the minimum-1-tree you found tell you how tight this bound is: multiply the weight of each vertex by (its degree in that tree minus 2) and add it all up, and you get the gap between the bound and the cost of that same tree under the original edge weights. For example, with weights A: 30, B: 40, C: 50, D: 60 and degrees 1, 2, 2, 3 in your minimum-1-tree, that quantity is (1 - 2) * 30 + (2 - 2) * 40 + (2 - 2) * 50 + (3 - 2) * 60 = 30. When every degree is 2, the tree is itself a tour and the bound is exact.
How to find the Held-Karp lower bound
Now I believe there is one more question unanswered: how do I find the best modification to my graph, so that I get the tightest lower bound (and thus the Held-Karp lower bound)?
Well, that's the hard part. Without delving too deep: there are ways to get closer and closer to the Held-Karp lower bound. Basically, one can keep modifying the graph so that the degrees of all vertices in the minimum-1-tree get closer and closer to 2, and thus the tree gets closer and closer to a real tour.
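That iterative idea can be sketched in pure Python. The instance (three mutually close cities plus one far city), the step sizes and the iteration count are arbitrary choices of mine, and I use the standard subgradient formulation that subtracts twice the sum of the vertex weights:

```python
# Sketch of subgradient ascent toward the Held-Karp bound: repeatedly
# nudge vertex weights so min-1-tree degrees move toward 2.
def one_tree(dist, excluded=0):
    """Cost and vertex degrees of a minimum-1-tree excluding `excluded`."""
    n = len(dist)
    rest = [v for v in range(n) if v != excluded]
    deg = [0] * n
    in_tree = {rest[0]}
    cost = 0.0
    while len(in_tree) < len(rest):
        w, u, v = min((dist[a][b], a, b)
                      for a in in_tree for b in rest if b not in in_tree)
        in_tree.add(v)
        cost += w
        deg[u] += 1
        deg[v] += 1
    for w, v in sorted((dist[excluded][v], v) for v in rest)[:2]:
        cost += w
        deg[excluded] += 1
        deg[v] += 1
    return cost, deg

def held_karp_bound(dist, iters=100, step=1.0):
    n = len(dist)
    pi = [0.0] * n
    best = float("-inf")
    for _ in range(iters):
        mod = [[dist[i][j] + pi[i] + pi[j] if i != j else 0.0
                for j in range(n)] for i in range(n)]
        cost, deg = one_tree(mod)
        best = max(best, cost - 2 * sum(pi))   # undo the known constant
        pi = [p + step * (d - 2) for p, d in zip(pi, deg)]  # push degrees to 2
        step *= 0.95
    return best

# three mutually close cities plus one far city; every tour has length 22
dist = [[0, 1, 1, 10],
        [1, 0, 1, 10],
        [1, 1, 0, 10],
        [10, 10, 10, 0]]
bound = held_karp_bound(dist)
```

On this instance the plain minimum-1-tree bound (the first iteration) is 13; the iterations push the bound up toward the optimum of 22.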
Minimum-1-tree is a lower bound
As promised I would give an informal proof that a minimum-1-tree is a lower bound for the optimal TSP solution. A minimum-1-Tree is made of two parts: a minimum-spanning-tree and a vertex attached to it with 2 edges. A TSP tour must go through the vertex attached to the minimum spanning tree. The shortest way to do so is through the attached edges. The tour must also visit all the vertices in the minimum spanning tree. That minimum spanning tree is a lower bound for the optimal TSP for the graph excluding the attached vertex. Combining these two facts one can conclude that the minimum-1-tree is a lower bound for the optimal TSP tour.
Conclusion
When you modify a graph in this way, the minimum-1-tree of the modified graph yields a lower bound for the original graph's optimal tour. The best possible lower bound obtainable through these means is the Held-Karp lower bound.
I hope this answers your question.
Links
For a more formal approach and additional information I recommend the following links:
ieor.berkeley.edu/~kaminsky/ieor251/notes/3-16-05.pdf
http://www.sciencedirect.com/science/article/pii/S0377221796002147


Converting intensities to probabilities in ppp

Apologies for the overlap with existing questions; mine is at a more basic skill level. I am working with very sparse occurrences spanning very large areas, so I would like to calculate probability at pixels using the density.ppp function (as opposed to relrisk.ppp, where specifying presences+absences would be computationally intractable). Is there a straightforward way to convert density (intensity) to probabilities at each point?
Maxdist=50
dtruncauchy=function(x,L=60) L/(diff(atan(c(-1,1)*Maxdist/L)) * (L^2 + x^2))
dispersfun=function(x,y) dtruncauchy(sqrt(x^2+y^2))
n=1e3; PPP=ppp(1:n,1:n, c(1,n),c(1,n), marks=rep(1,n));
density.ppp(PPP,cutoff=Maxdist,kernel=dispersfun,at="points",leaveoneout=FALSE) #convert to probabilities?
Thank you!!
I think there is a misunderstanding about fundamentals. The spatstat package is designed mainly for analysing "mapped point patterns", datasets which record the locations where events occurred or things were located. It is designed for "presence-only" data, not "presence/absence" data (with some exceptions).
The relrisk function expects input data about the presence of two different types of events, such as the mapped locations of trees belonging to two different species, and then estimates the spatially-varying probability that a tree will belong to each species.
If you have 'presence-only' data stored in a point pattern object X of class "ppp", then density(X, ...) will produce a pixel image of the spatially-varying intensity (expected number of points per unit area). For example, if the spatial coordinates are expressed in metres, then the intensity values are "points per square metre". If you want the probability of presence in each pixel (i.e. for each pixel, the probability that there is at least one presence point in the pixel), multiply the intensity value by the area of one pixel; this gives the expected number of points in the pixel. If pixels are small (the usual case), the presence probability is approximately equal to this value. For physically larger pixels the probability is 1 - exp(-m), where m is the expected number of points.
Example:
X <- redwood                       # example point pattern dataset
D <- density(X, 0.2)               # intensity image (bandwidth 0.2)
pixarea <- with(D, xstep * ystep)  # area of one pixel
M <- pixarea * D                   # expected number of points per pixel
p <- 1 - exp(-M)                   # probability of at least one point per pixel
then M and p are images which should be almost equal, and can both be interpreted as probability of presence.
For more information see Chapter 6 of the spatstat book.
If, instead, you had a pixel image of presence/absence data, with pixel values equal to 1 or 0 for presence or absence respectively, then you can just use the function blur in the spatstat package to perform kernel smoothing of the image, and the resulting pixel values are presence probabilities.
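For intuition, the effect of smoothing a presence/absence grid can be imitated outside R. A pure-Python toy (spatstat's blur uses a proper Gaussian kernel on an "im" object; a crude box kernel is used here purely for illustration):

```python
# Toy illustration: kernel-smooth a binary presence/absence grid so each
# cell holds a local presence probability (the mean of 0/1 values in a
# window always lies in [0, 1]).
def smooth(grid, radius=1):
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(window) / len(window)  # mean of 0/1 values
    return out

presence = [[0, 0, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 0]]
probs = smooth(presence)   # every value now lies in [0, 1]
```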

How to approximate coordinates based on azimuths?

Suppose I have a series of (imperfect) azimuth readouts, giving me vague angles between a number of points. Lines projected from points A, B, C never converge in a single point to define the location of point D. Hence, the angles as viewed from A, B and C need to be adjusted.
To make it more fun, I might be more certain of the relative positions of specific points (suppose I locate them on a satellite image, or I know for a fact they are oriented perfectly north-south), so I might want to use that certainty in my calculations and NOT adjust certain angles at all.
By what technique should I average the resulting coordinates, to achieve a "mostly accurate" overall shape?
I considered treating the difference between non-adjusted and adjusted angles as "tension" and trying to "relieve" it in subsequent passes, but that approach gives priority to points calculated earlier.
Another approach could be to calculate the total "tension" in the set, then shake all angles by a random amount, see if that resulted in less tension, and repeat for possibly improved results, trying to evolve a possibly better solution.
As I understand it you have a bunch of unknown points (p[] say) and a number of measurements of azimuths, say Az[i,j] of p[j] from p[i]. You want to find the coordinates of the points.
You'll need to fix one point. This is because if the values p[] are a solution -- i.e. they give the measured azimuths -- then so too is q[] where, for some fixed x,
q[i] = p[i] + x
I'll suppose you fix p[0].
You'll also need to fix a distance. This is because if p[] is a solution, so too is q[] where now for some fixed s,
q[i] = p[0] + s*(p[i] - p[0])
I'll suppose you fix dist(p[0], p[1]), and that there is an azimuth Az[0,1]. You'd be best to choose p[0] and p[1] so that there is a reliable azimuth between them. Then we can compute p[1].
The usual way to approach such problems is least squares. That is, we seek p[] to minimise
Sum square( (Az[i,j] - Azimuth( p[i], p[j]))/S[i,j])
where Az[i,j] is your measurement data
Azimuth( r, s) is the function that gives the azimuth of the point s from the point r
S[i,j] is the 'sd' of the measurement Az[i,j] -- the higher the sd of a particular observation is, relative to the others, the less it affects the final result.
The above is a nonlinear least squares problem. There are many solvers available for this, but generally speaking, as well as providing the data -- the Az[] and the S[] -- and the observation model -- the Azimuth function -- you need to provide an initial estimate of the state -- the values sought, in your case p[2] ..
It is highly likely that if your initial estimate is wrong the solver will fail.
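As a concrete (toy) illustration of the least-squares idea, here is a pure-Python sketch for the simplest case: one unknown point D sighted from three trusted stations. The azimuth convention (bearing clockwise from north), the coordinates and the crude fixed-step gradient descent are my own illustrative choices, not part of the answer:

```python
# Recover one unknown point by minimising squared azimuth residuals.
import math

def azimuth(p, q):
    """Bearing of q from p, measured clockwise from north, in radians."""
    return math.atan2(q[0] - p[0], q[1] - p[1])

def wrap(a):
    """Wrap an angle difference into (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]   # fixed, trusted points
true_d = (6.0, 7.0)
obs = [azimuth(s, true_d) for s in stations]        # noise-free "measurements"

def cost(p):
    return sum(wrap(o - azimuth(s, p)) ** 2 for s, o in zip(stations, obs))

# gradient descent with numerical derivatives from a rough initial guess
p = [5.0, 5.0]
h, step = 1e-6, 5.0
for _ in range(2000):
    gx = (cost((p[0] + h, p[1])) - cost((p[0] - h, p[1]))) / (2 * h)
    gy = (cost((p[0], p[1] + h)) - cost((p[0], p[1] - h))) / (2 * h)
    p = [p[0] - step * gx, p[1] - step * gy]
# p converges back to the true position (6.0, 7.0)
```

With noisy measurements and per-observation sd values the residuals would be divided by S[i,j] exactly as in the sum above; a real solver would replace the hand-rolled descent.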
One way to find this estimate would be to start with a set K of known point indices and seek to expand it. You would start with K being {0,1}. Then look for points that have as many azimuths as possible to points in K; for such points, estimate their position geometrically from the known points and the azimuths, and add them to K. If at the end you have all the points in K, then you can go on to the least squares. If you don't, it's possible that a different pair of initial fixed points might do better, or maybe you are stuck.
The latter case is a real possibility. For example, suppose you had points p[0], p[1], p[2], p[3] and azimuths Az[0,1], Az[1,2], Az[1,3], Az[2,3].
As above we fix the positions of p[0] and p[1]. But we can't compute positions of p[2] and p[3] because we do not know the distances of 2 or 3 from 1. The 1,2,3 triangle could be scaled arbitrarily and still give the same azimuths.
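The geometric estimation step (computing a new point from two known points and the azimuths to it) is a small linear solve. A Python sketch, assuming azimuths are bearings measured clockwise from north (names are illustrative):

```python
# Intersect two azimuth rays to estimate a new point from two known ones.
import math

def intersect_azimuths(s1, az1, s2, az2):
    """Point where the ray from s1 with bearing az1 meets the ray
    from s2 with bearing az2 (bearings clockwise from north)."""
    d1x, d1y = math.sin(az1), math.cos(az1)
    d2x, d2y = math.sin(az2), math.cos(az2)
    det = d2x * d1y - d1x * d2y
    if abs(det) < 1e-12:
        raise ValueError("rays are parallel")
    ex, ey = s2[0] - s1[0], s2[1] - s1[1]
    t1 = (d2x * ey - d2y * ex) / det        # distance along the first ray
    return (s1[0] + t1 * d1x, s1[1] + t1 * d1y)

# two known stations sight an unknown point at (6, 7)
a, b = (0.0, 0.0), (10.0, 0.0)
d_est = intersect_azimuths(a, math.atan2(6, 7), b, math.atan2(-4, 7))
```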

Isomap "nonlinear dimensionality reduction" number of points

I have a question, please, about 'Isomap' nonlinear dimensionality reduction. In normal cases, when I introduce a distance matrix of 100 * 100 and I apply Isomap (http://isomap.stanford.edu/), I get the coordinates of 100 points. In other cases I do not understand why, with a matrix of 150 * 150, I obtain just 35 or 50 points.
The first step of Isomap is usually to create a "nearest neighbor matrix" so that every point is connected to its 4 or 6 or 8 or something nearest neighbors.
So, you may start with a distance matrix that is 100 x 100 in which every point has a distance to 99 other points; after this first step, the distances to anything but the (4 or 6 or 8) closest points are set to infinity.
Then Isomap computes a shortest path distance, hopping between nearby points to get to farther away points.
In your case, when you create a matrix of 150 points, I think that once you only keep the nearby points in the first step, the points become disconnected, and there is no path between distant points. The default behavior of many Isomap codes is to return the Isomap embedding of the largest collection of connected points.
How can you fix this?
1. You can increase the number of nearest neighbors that you use until you get all the points included.
Caveat: in many natural cases, if you include most or all neighbors, the shortest-path part of the procedure does nothing and the method reduces to a problem called "multi-dimensional scaling", which gives a linear embedding.
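The disconnection effect is easy to reproduce. A pure-Python toy (1D points and hypothetical helper names; real Isomap implementations build this graph internally):

```python
# With too few neighbours the k-NN graph of two well-separated clusters
# splits, and an Isomap-style embedding would only cover the largest
# connected piece.
def knn_components(points, k):
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        # indices sorted by distance from points[i]; order[0] is i itself
        order = sorted(range(n), key=lambda j: abs(points[i] - points[j]))
        for j in order[1:k + 1]:
            adj[i].add(j)
            adj[j].add(i)
    seen, comps = set(), 0
    for s in range(n):               # count components by flood fill
        if s in seen:
            continue
        comps += 1
        stack = [s]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v] - seen)
    return comps

pts = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]  # two distant clusters
small_k = knn_components(pts, 2)    # graph falls into 2 pieces
large_k = knn_components(pts, 7)    # enough neighbours to connect everything
```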

KD Tree alternative/variant for weighted data

I'm using a static KD-Tree for nearest neighbor search in 3D space. However, the client's specifications have now changed so that I'll need a weighted nearest neighbor search instead. For example, in 1D space, I have a point A with weight 5 at 0, and a point B with weight 2 at 4; the search should return A if the query point is from -5 to 5, and should return B if the query point is from 5 to 6. In other words, the higher-weighted point takes precedence within its radius.
Google hasn't been any help - all I get is information on the K-nearest neighbors algorithm.
I can simply remove points that are completely subsumed by a higher-weighted point, but this generally isn't the case (usually a lower-weighted point is only partially subsumed, as in the 1D example above). I could use a range tree to query all points in an NxNxN cube centered on the query point and pick the one with the greatest weight, but the naive implementation of this is wasteful: I'd need to set N to the maximum weight in the entire tree, even though there may not be a point with that weight within the cube. For example, if the point with the maximum weight in the tree has weight 25, then I'd need to set N to 25 even though the highest-weighted point in any given cube probably has a much lower weight; in the 1D case, if I have a point located at 100 with weight 25, my naive algorithm would need to set N to 25 even when I'm outside that point's radius.
To sum up, I'm looking for a way that I can query the KD tree (or some alternative/variant) such that I can quickly determine the highest-weighted point whose radius covers the query point.
FWIW, I'm coding this in Java.
It would also be nice if I could dynamically change a point's weight without incurring too high of a cost - at present this isn't a requirement, but I'm expecting that it may be a requirement down the road.
Edit: I found a paper on a priority range tree, but this doesn't exactly address the same problem in that it doesn't account for higher-priority points having a greater radius.
Use an extra dimension for the weight. A point (x,y,z) with weight w is placed at (N-w,x,y,z), where N is the maximum weight.
Distances in 4D are defined by…
d((a, b, c, d), (e, f, g, h)) = |a - e| + d((b, c, d), (f, g, h))
…where the second d is whatever your 3D distance was.
To find all potential results for (x,y,z), query a ball of radius N about (0,x,y,z).
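The lifting can be checked on the question's own 1D example (weight 5 at 0, weight 2 at 4), where the lifted space is 2D. A small Python sketch with illustrative names:

```python
# A point at x with weight w goes to (N - w, x); a query q becomes (0, q)
# and we search a ball of radius N under
# d = |lifted coordinate difference| + |base distance|.
points = {"A": (0.0, 5.0), "B": (4.0, 2.0)}   # name -> (position, weight)
N = max(w for _, w in points.values())        # maximum weight

def lifted_dist(q, x, w):
    return abs(N - w) + abs(x - q)            # |(N-w) - 0| + 1D base distance

def covering(q):
    """Points whose radius (= weight) covers the query position q."""
    return {name for name, (x, w) in points.items() if lifted_dist(q, x, w) <= N}

def best(q):
    """Highest-weighted point covering q, or None."""
    cover = covering(q)
    return max(cover, key=lambda name: points[name][1]) if cover else None
```

Note that lifted_dist(q, x, w) <= N holds exactly when |x - q| <= w, so the ball query recovers precisely the points whose radius covers the query; picking the maximum weight among them gives the desired answer.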
I think I've found a solution: the nested interval tree, which is an implementation of a 3D interval tree. Rather than storing points with an associated radius that I then need to query, I instead store and query the radii directly. This has the added benefit that each dimension does not need to have the same weight (so that the radius is a rectangular box instead of a cubic box), which is not presently a project requirement but may become one in the future (the client only recently added the "weighted points" requirement, who knows what else he'll come up with).

Decomposition to Convex Polygons

This question is a little involved. I wrote an algorithm for breaking up a simple polygon into convex subpolygons, but now I'm having trouble proving that it's not optimal (i.e. minimal number of convex polygons using Steiner points (added vertices)). My prof is adamant that it can't be done with a greedy algorithm such as this one, but I can't think of a counterexample.
So, if anyone can prove my algorithm is suboptimal (or optimal), I would appreciate it.
The easiest way to explain my algorithm is with pictures (these are from an older, suboptimal version).
What my algorithm does is extend the line segments around the point i across until they hit a point on the opposite edge.
If there is no vertex within this range, it creates a new one (the red point) and connects to that:
If there is one or more vertices in the range, it connects to the closest one. This usually produces a decomposition with the fewest number of convex polygons:
However, in some cases it can fail -- in the following figure, if it happens to connect the middle green line first, this will create an extra, unneeded polygon. To fix this I propose double-checking all the edges (diagonals) we've added, and verifying that they are all still necessary. If one is not, remove it:
In some cases, however, this is not enough. See this figure:
Replacing a-b and c-d with a-c would yield a better solution. In this scenario though, there's no edges to remove so this poses a problem. In this case I suggest an order of preference: when deciding which vertex to connect a reflex vertex to, it should choose the vertex with the highest priority:
lowest) closest vertex
med) closest reflex vertex
highest) closest reflex that is also in range when working backwards (hard to explain) --
In this figure, we can see that the reflex vertex 9 chose to connect to 12 (because it was closest), when it would have been better to connect to 5. Both vertices 5 and 12 are in the range as defined by the extended line segments 10-9 and 8-9, but vertex 5 should be given preference because 9 is within the range given by 4-5 and 6-5, but NOT in the range given by 13-12 and 11-12. I.e., the edge 9-12 eliminates the reflex vertex at 9, but does NOT eliminate the reflex vertex at 12, whereas connecting to 5 CAN also eliminate the reflex vertex at 5, so 5 should be given preference.
It is possible that the edge 5-12 will still exist with this modified version, but it can be removed during post-processing.
Are there any cases I've missed?
Pseudo-code (requested by John Feminella) -- this is missing the bits under Figures 3 and 5
assume vertices in `poly` are given in CCW order
let 'good reflex' (better term??) mean that if poly[i] is being compared with poly[j], then poly[i] is in the range given by the rays poly[j-1], poly[j] and poly[j+1], poly[j]

for each vertex poly[i]
    if poly[i] is reflex
        find the closest point of intersection with the ray starting at poly[i-1] and extending in the direction of poly[i] (call this the lower bound)
        repeat for the ray given by poly[i+1], poly[i] (call this the upper bound)
        if there are no vertices along the boundary of the polygon in the range given by the upper and lower bounds
            create a new vertex exactly halfway between the lower and upper bound points (lower and upper will lie on the same edge)
            connect poly[i] to this new point
        else
            iterate along the vertices in the range given by the lower and upper bounds; for each vertex poly[j]
                if poly[j] is a 'good reflex'
                    if no other good reflexes have been found
                        save it (overwrite any other vertex found)
                    else if it is closer than the other good reflex vertices
                        save it
                else if no good reflexes have been found and it is closer than the other vertices found
                    save it
            connect poly[i] to the best candidate
        repeat the entire algorithm for both halves of the polygon that was just split
// if no reflex vertices were found, `poly` is convex
save poly
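One concrete piece of the pseudocode, for reference: the reflex test it relies on can be written out directly for CCW polygons (the L-shaped example polygon is my own):

```python
# A vertex of a CCW polygon is reflex when the turn there is clockwise,
# i.e. the z-component of the cross product of the incident edge vectors
# is negative.
def is_reflex(poly, i):
    ax, ay = poly[i - 1]
    bx, by = poly[i]
    cx, cy = poly[(i + 1) % len(poly)]
    # cross product of edge vectors (a->b) and (b->c)
    cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
    return cross < 0

# CCW "L" shape: only the inner corner at index 3 is reflex
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
reflex = [i for i in range(len(l_shape)) if is_reflex(l_shape, i)]
```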
Turns out there is one more case I didn't anticipate: [Figure 5]
My algorithm will attempt to connect vertex 1 to 4, unless I add another check to make sure it can. So I propose pushing everything "in the range" onto a priority queue using the priority scheme I mentioned above, then taking the highest-priority one and checking whether it can connect; if not, popping it off and trying the next. I think this makes my algorithm O(r n log n) if I optimize it right.
I've put together a website that loosely describes my findings. I tend to move stuff around, so get it while it's hot.
I believe the regular five pointed star (e.g. with alternating points having collinear segments) is the counterexample you seek.
Edit in response to comments
In light of my revised understanding, a revised answer: try an acute five pointed star (e.g. one with arms sufficiently narrow that only the three points comprising the arm opposite the reflex point you are working on are within the range considered "good reflex points"). At least working through it on paper it appears to give more than the optimal. However, a final reading of your code has me wondering: what do you mean by "closest" (i.e. closest to what)?
Note
Even though my answer was accepted, it isn't the counter example we initially thought. As #Mark points out in the comments, it goes from four to five at exactly the same time as the optimal does.
Flip-flop, flip flop
On further reflection, I think I was right after all. The optimal bound of four can be retained in an acute star by simply ensuring that one pair of arms has collinear edges. But the algorithm finds five, even with the patch-up.
I get this:
(image removed: dead ImageShack link)
When the optimal is this:
(image removed: dead ImageShack link)
I think your algorithm cannot be optimal because it makes no use of any measure of optimality. You use other metrics, like 'closest' vertices, and check for 'necessary' diagonals.
To drive a wedge between yours and an optimal algorithm, we need to exploit that gap by looking for shapes with close vertices which would decompose badly. For example (ignore the lines, I found this on the intertubenet):
concave polygon which forms a G or U shape http://avocado-cad.wiki.sourceforge.net/space/showimage/2007-03-19_-_convexize.png
You have no protection against the centre-most point being connected across the concave 'gap', which is external to the polygon.
Your algorithm is also quite complex, and may be overdoing it - just like complex code, you may find bugs in it because complex code makes complex assumptions.
Consider a more extensive initial stage to break the shape into more, simpler shapes - like triangles - and then an iterative or genetic algorithm to recombine them. You will need a stage like this to combine any unnecessary divisions between your convex polys anyway, and by then you may have limited your possible decompositions to only sub-optimal solutions.
At a guess something like:
1. decompose into triangles
2. non-deterministically generate a number of recombinations
3. calculate a quality metric (number of polys)
4. select the best x% of the recombinations
5. partially decompose each using triangles, and generate a new set of recombinations
6. repeat from step 4 until some measure of convergence is reached
"but vertex 5 should be given preference because 9 is within the range given by 4-5 and 6-5"
What would you do if 4-5 and 6-5 were even more convex so that 9 didn't lie within their range? Then by your rules the proper thing to do would be to connect 9 to 12 because 12 is the closest reflex vertex, which would be suboptimal.
Found it :( They're actually quite obvious.
*dead imageshack img*
A four leaf clover will not be optimal if Steiner points are allowed... the red vertices could have been connected.
*dead imageshack img*
It won't even be optimal without Steiner points... 5 could be connected to 14, removing the need for 3-14, 3-12 AND 5-12. This could have been two polygons better! Ouch!
