To detect whether a point is inside a polygon, you project a ray from the point to infinity and count how many of the polygon's edges it intersects... simple enough. My problem is that if the ray passes exactly through one of the polygon's vertices, it is counted as intersecting two segments, and the point is considered outside the polygon. I altered my function so that it counts only one of the segments when the ray passes through a vertex, but there are cases where a ray can pass through a vertex while the point is still outside. Take this image as an example:
If you assume the point in the top left is "infinity" and cast a ray to either of the other points, both rays pass through a vertex of the polygon and would count as intersecting the same number of edges, even though one point is inside and one is outside.
Is there a way to compensate for that, or do I just have to assume that those fringe cases won't pop up?

If the ray crosses a side exactly on a vertex, only count that side if the other vertex is above the ray. That will fix your corner case.
For example, in the picture you posted, the lower ray crosses two sides of the square at the top-left vertex, but one side is above the ray and the other below, so together they contribute 1 and the target point is found to be inside. The upper ray crosses two sides at the top-right vertex, but both sides are below the ray, so they contribute 0 to the count and the target point is found to be outside.
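Here is a minimal Python sketch of that rule, assuming the ray is cast in the +X direction and the polygon is a list of (x, y) tuples. The function name point_in_polygon is only illustrative. Treating a vertex that lies exactly on the ray as "below" means a side is counted only when its other vertex is strictly above the ray:

```python
def point_in_polygon(px, py, poly):
    """Even-odd test using a ray cast from (px, py) in the +X direction."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Half-open rule: a vertex exactly on the ray counts as "below",
        # so a side is counted only if its other vertex is strictly above.
        if (y1 > py) != (y2 > py):
            # x coordinate where the side crosses the horizontal line y = py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside
    return inside


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
diamond = [(0, -1), (1, 0), (0, 1), (-1, 0)]
print(point_in_polygon(1, 1, square))      # ordinary interior point
print(point_in_polygon(-1, 2, square))     # ray grazes the top side
print(point_in_polygon(-0.5, 0, diamond))  # ray passes through a vertex
```

Horizontal sides never satisfy the condition, so the division by (y2 - y1) is safe.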
I remembered reading an article which describes a technique for dealing with singular cases in general. Please read my other answer if interested.

While my first answer should do the trick for this simple problem, I can't help but mention that there exist general techniques for dealing with these kinds of special cases.
This article describes a technique for dealing with these kinds of issues in general. And one of the first examples they provide happens to be the algorithm you ask about!
The idea is to apply Automatic differentiation aka Dual numbers to compute symbolic perturbations.
By the way the same technique can also be used to avoid handling 0/0 as a special case in programs!
Here is the blog post I originally learned this from. It gives some great background on the technique, and the author blogs a lot about automatic differentiation (AD).
Despite appearances, AD is a very practical technique, especially in languages with good support for operator overloading (e.g. C++, Haskell, Python), and I have used it in "real life" (industrial applications in C++).
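To give a flavour of what dual numbers look like with operator overloading, here is a small Python sketch. It is not taken from the article or the blog post: the Dual class, its field names and the lexicographic comparison are my own illustrative assumptions. A value a + b*eps carries an infinitesimal part, so arithmetic propagates derivatives, and comparing the infinitesimal part as a tie-breaker is one way a symbolic perturbation could resolve an "exactly on the vertex" case:

```python
class Dual:
    """Hypothetical dual number real + eps*e, where e*e == 0."""

    def __init__(self, real, eps=0.0):
        self.real = real
        self.eps = eps

    @staticmethod
    def _lift(value):
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (a + b e)(c + d e) = ac + (ad + bc) e
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__

    def __lt__(self, other):
        other = Dual._lift(other)
        # Lexicographic order: the infinitesimal part only breaks ties.
        return (self.real, self.eps) < (other.real, other.eps)


def f(x):
    return 3 * x * x + 2 * x + 1


y = f(Dual(2.0, 1.0))
print(y.real, y.eps)                 # value and derivative of f at x = 2
print(Dual(1.0) < Dual(1.0, 1.0))    # a tie on the real part is broken by eps
```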

Send the ray in another direction.
If you try n+1 different directions (where n is the number of polygon vertices), at least one of them is guaranteed not to pass through any vertex.
This simplifies the code compared to handling the corner cases.
The worst case becomes O(n)*CheckComplexity(n), which is likely O(n^2). If that is not acceptable, you can sort all vertices by their direction from the point and pick the middle of some interval between two consecutive directions. This gives O(n*log n).
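The sorting variant can be sketched in Python as follows. The function name safe_ray_angle is made up for illustration, and picking the widest gap (rather than just any gap) is my own choice to keep the ray as far from the vertices as possible:

```python
import math


def safe_ray_angle(px, py, poly):
    """Return a ray angle from (px, py) that passes through no vertex."""
    # Direction from the point to every vertex, skipping a coincident vertex.
    angles = sorted(math.atan2(y - py, x - px)
                    for x, y in poly if (x, y) != (px, py))
    best_gap, best_mid = -1.0, 0.0
    for i, a in enumerate(angles):
        if i + 1 < len(angles):
            gap = angles[i + 1] - a
        else:
            # Wrap-around gap between the last and the first direction.
            gap = angles[0] + 2 * math.pi - a
        if gap > best_gap:
            best_gap, best_mid = gap, a + gap / 2
    return best_mid


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
theta = safe_ray_angle(1, 1, square)
print(theta)
```

The sort dominates the cost, so this stays within the O(n*log n) bound mentioned above.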


How to determine whether an edge of a nonzero-fill polygon is an outside edge?

Let's assume I have a polygon and I have computed all of its self-intersections. How do I determine whether a specific edge is inside or outside according to the nonzero fill rule? By "outside edge" I mean an edge which lies between a filled region and a non-filled region.
On the left is an example polygon, filled according to the nonzero fill rule. On the right is the same polygon with its outside edges highlighted in red. I'm looking for an algorithm that, given the edges of the polygon and their intersections with each other, can mark each of the edges as either outside or inside.
Preferably, the solution should generalize to paths that are composed of e.g. Bezier curves.
[EDIT] two more examples to consider:
I've noticed that the "outside edge" that is enclosed within the shape must cross an even number of intersections before they get to the outside. The "non-outside edges" that are enclosed must cross an odd number of intersections.
You might try an algorithm like this
isOutside = true
edge = find first outside edge*
edge.IsOutside = isOutside
while (not got back to start) {
    edge = next
    if (gone over intersection)
        isOutside = !isOutside
    edge.IsOutside = isOutside
}
For example:
*I think that you can always find an outside edge by trying each line in turn: try extending it infinitely - if it does not cross another line then it should be on the outside. This seems intuitively true but I wonder if there are some pathological cases where you cannot find a start line using this rule. Using this method of finding the first line will not work with curves.
I think your problem can be solved in two steps.
1. Triangulate the source polygon with an algorithm that supports self-intersecting polygons. A good starting point is Seidel's algorithm. Section 5.2 of the linked PDF document describes self-intersecting polygons.
2. Merge the triangles into a single polygon with an algorithm that supports holes, e.g. the Weiler-Atherton algorithm. This algorithm can be used for both clipping and merging, so you need its "merging" case. You may be able to simplify the algorithm, because the triangles from the first step do not intersect.
I realized this can be determined in a fairly simple way, using a slight modification of the standard routine that computes the winding number. It is conceptually similar to evaluating the winding both immediately to the left and immediately to the right of the target edge. Here is the algorithm for arbitrary curves, not just line segments:
1. Pick a point on the target segment. Ensure the Y derivative at that point is nonzero.
2. Subdivide the target segment at the Y roots of its derivative. In the next step, ignore the portion of the segment that contains the point you picked in step 1.
3. Determine the winding number at the point picked in step 1. This can be done by casting a ray in the +X direction and seeing what intersects it, and in what direction. Intersections at points where the Y component of the derivative is positive are counted as +1, and those where it is negative as -1. While doing this, ignore the Y-monotonic portion that contains the point you picked in step 1.
4. If the winding number is 0, we are done: this is definitely an outside edge. If it is anything other than -1, 0 or 1, we are also done: this is definitely an inside edge.
5. Inspect the derivative at the point picked in step 1. If the intersection of the ray with that point would be counted as -1 and the winding number obtained in step 3 is +1, this is an outside edge; similarly for the +1/-1 case. Otherwise this is an inside edge.
In essence, we are checking whether intersection of the ray with the target segment changes the winding number between zero and non-zero.
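For straight line segments, the underlying idea (compare the winding number just left and just right of the edge) can be sketched directly in Python. This is not the exact step-by-step procedure above: it probes two points offset by a small eps, which is an assumption that only works when no other edge lies within eps of the probe points. The names winding_number and is_outside_edge are illustrative:

```python
def winding_number(px, py, poly):
    """Winding number of the closed path `poly` around (px, py)."""
    wn = 0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # cross > 0 means the point is to the left of the directed edge.
        cross = (x2 - x1) * (py - y1) - (px - x1) * (y2 - y1)
        if y1 <= py < y2 and cross > 0:
            wn += 1  # upward edge crossing the +X ray
        elif y2 <= py < y1 and cross < 0:
            wn -= 1  # downward edge crossing the +X ray
    return wn


def is_outside_edge(a, b, poly, eps=1e-6):
    """True if segment a-b separates a filled and an unfilled region."""
    mx, my = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = (dx * dx + dy * dy) ** 0.5
    nx, ny = -dy / length, dx / length  # unit left normal
    left = winding_number(mx + eps * nx, my + eps * ny, poly)
    right = winding_number(mx - eps * nx, my - eps * ny, poly)
    # Nonzero rule: a region is filled when its winding number is nonzero.
    return (left == 0) != (right == 0)


# Outer and inner square traversed in the same direction, joined by a bridge.
path = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0),
        (1, 1), (3, 1), (3, 3), (1, 3), (1, 1)]
print(is_outside_edge((4, 0), (4, 4), path))  # outer boundary
print(is_outside_edge((3, 1), (3, 3), path))  # inner loop, filled on both sides
```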
I'd suggest what I feel is a simpler implementation of your solution that has worked for me:
1. Pick ANY point on the target segment. (I arbitrarily pick the midpoint.)
2. Construct a ray from that point normal to the segment. (I use a left normal ray for a CW polygon and a right normal ray for a CCW polygon.)
3. Count the intersections of the ray with the polygon, ignoring the target segment itself. Here you can choose a NonZero winding rule [decrement for polygon segments crossing to the left (CCW) and increment for a crossing to the right (CW); where an inside edge yields a zero count] or an EvenOdd rule [count all crossings where an inside edge yields an odd count]. For line segments, crossing direction is determined with a simple left-or-right test for its start and end points. For arcs and curves it can be done with tangents at the intersection, an exercise for the reader.
My purpose for this analysis is to divide a self-intersecting polygon into an equivalent set of not self-intersecting polygons. To that end, it's useful to likewise analyze the ray in the opposite direction and sense if the original polygon would be filled there or not. This results in an inside/outside determination for BOTH sides of the segment, yielding four possible states. I suspect an OUTSIDE-OUTSIDE state might be valid only for a non-closed polygon, but for this analysis it might be desirable to temporarily close it. Segments with the same state can be collected into non-intersecting polygons by tracing their shared intersections. In some cases, such as with a pure fill, you might even decide to eliminate INSIDE-INSIDE polygons as redundant since they fill an already-filled space.
And thanks for your original solution!!

How do I check if a set of plane polygons creates a watertight polyhedron

I am currently wondering whether there is a common algorithm to check whether a set of planar polygons, not necessarily triangles, forms a watertight polyhedron. Each polygon has an orientation (normal vector). A simple solution would just say yes or no. A more advanced version would point out the edges where the polyhedron is "open". I am not really interested in how to close the polyhedron.
I would like to point out that my "holes" are not necessarily small; e.g., one face of a cube might be missing. Thus, the "undersampling correction" algorithms don't seem to be the correct approach. Furthermore, I am talking about 100 - 1000 polygons, not 1000000, so computation time should not really be a problem.
Any hints or tips?
I believe you can use a simple topological test -- count the number of times each edge appears in the full list of polygons.
If the set of polygons defines the surface of a closed volume, each edge should have count>=2, indicating that each edge is shared by (at least) two adjacent polygons. If the surface is manifold, count==2 exactly.
Edges with count==1 indicate open regions of the surface.
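A small Python sketch of this edge count, assuming each polygon is given as a loop of vertex indices into a shared vertex list, so that coincident vertices have already been merged. The function name open_edges is illustrative:

```python
from collections import Counter


def open_edges(faces):
    """Return the edges that are used by exactly one face."""
    counts = Counter()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            # Store each edge undirected so both neighbours hit the same key.
            counts[(min(a, b), max(a, b))] += 1
    return sorted(edge for edge, c in counts.items() if c == 1)


# A cube: vertices 0-3 form the bottom, 4-7 the top (4 above 0, and so on).
cube = [[0, 3, 2, 1], [4, 5, 6, 7], [0, 1, 5, 4],
        [1, 2, 6, 5], [2, 3, 7, 6], [3, 0, 4, 7]]
print(open_edges(cube))                                   # closed cube
print(open_edges([f for f in cube if f != [4, 5, 6, 7]]))  # top face missing
```

An empty result means no edge is open under this test; the returned edges are the ones bordering a hole.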
The above answer does not cover many cases. A more correct (but not necessarily complete: I wouldn't know) algorithm is to ensure that every edge of every polygon (or of the mesh/polyhedron) has an even number of faces connected to it. Consider the following mesh:
The segment (line) between the closest vertex and the one below is attached to 3 faces (one of the outer triangle and two of the inner triangle), which is greater than two faces. However this is clearly not closed.

Given a polygon and a point in 2D, how can one find the feature (vertex or edge) of the polygon closest to the point?

A naive approach is to find, for each edge in the polygon, the point on that edge closest to the given point, and then take the one that's closest. Is there a faster algorithm? My goal is to implement a 2D Super Mario Galaxy-style platformer.
Apparently this can be done with Voronoi regions, as in this video: http://www.youtube.com/watch?v=Ldh2YKobuWo
However, I can't find any Voronoi algorithms that deal with edges as well as points. Ideas?
Calculate the point-line distance for each of the edges, then pick the shortest one. There is no shortcut. This site has a good explanation and even implementations in various languages.
However, finding "the point on that edge closest to the given point" is a computationally unnecessary intermediate result.
If the polygon is convex, then the overhead of the Voronoi calculation far exceeds that of the naive approach.
If this is run many times and the point changes only slightly each time, you only need to check 3 segments: the current closest edge and its two neighbours (think about it: as you move around in small steps, the closest edge will only change to an adjacent edge).
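For reference, the naive per-edge check can be sketched in Python like this. The name closest_feature and the ("vertex", i) / ("edge", i) return convention are my own assumptions. The projection parameter t tells you whether the nearest point falls on an endpoint (a vertex feature) or in the interior of the edge:

```python
def closest_feature(p, poly):
    """Return ((kind, index), distance) for the polygon feature nearest to p."""
    px, py = p
    best = None
    n = len(poly)
    for i in range(n):
        ax, ay = poly[i]
        bx, by = poly[(i + 1) % n]
        dx, dy = bx - ax, by - ay
        len2 = dx * dx + dy * dy
        # Parameter of the projection of p onto the line through a and b.
        t = ((px - ax) * dx + (py - ay) * dy) / len2 if len2 else 0.0
        if t <= 0:
            feature, cx, cy = ("vertex", i), ax, ay
        elif t >= 1:
            feature, cx, cy = ("vertex", (i + 1) % n), bx, by
        else:
            feature, cx, cy = ("edge", i), ax + t * dx, ay + t * dy
        d2 = (px - cx) ** 2 + (py - cy) ** 2
        if best is None or d2 < best[0]:
            best = (d2, feature)
    return best[1], best[0] ** 0.5


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(closest_feature((1, -1), square))  # nearest to the bottom edge
print(closest_feature((3, 3), square))   # nearest to the top-right vertex
```

Squared distances are compared so that only one square root is taken at the end.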

triangle points around a point

I am given the coordinates of 1000 triangles on a plane (triangle number (T0001-T1000) and its coordinates (x1,y1), (x2,y2), (x3,y3)). Now, for a given point P(x,y), I need to find a triangle which contains the point P.
One option might be to check all the triangles and find the triangle that contains P. But I am looking for an efficient solution to this problem.
You are going to have to check every triangle at some point during the execution of your program. That's obvious right? If you want to maximize the efficiency of this calculation then you are going to create some kind of cache data structure. The details of the data structure depend on your application. For example: How often do the triangles change? How often do you need to calculate where a point is?
One way to make the cache would be this: Divide your plane into a finite grid of boxes. For each box in the grid, store a list of the triangles that might intersect with the box.
Then when you need to find out which triangles your point is inside of, you would first figure out which box it is in (this would be O(1) time because you just look at the coordinates) and then look at the triangles in the triangle list for that box.
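A rough Python sketch of such a grid cache, assuming square boxes of side `cell` and triangles given as three (x, y) tuples. All function names are illustrative. Each triangle is registered in every box its bounding box touches, which over-approximates "might intersect" but keeps the build step simple:

```python
import math
from collections import defaultdict


def point_in_triangle(p, tri):
    """True if p is inside or on the boundary of tri (either orientation)."""
    (ax, ay), (bx, by), (cx, cy) = tri
    d1 = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
    d2 = (cx - bx) * (p[1] - by) - (cy - by) * (p[0] - bx)
    d3 = (ax - cx) * (p[1] - cy) - (ay - cy) * (p[0] - cx)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def build_grid(triangles, cell):
    """Map each grid box to the indices of triangles whose bbox touches it."""
    grid = defaultdict(list)
    for idx, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        for gx in range(math.floor(min(xs) / cell), math.floor(max(xs) / cell) + 1):
            for gy in range(math.floor(min(ys) / cell), math.floor(max(ys) / cell) + 1):
                grid[(gx, gy)].append(idx)
    return grid


def find_triangle(p, triangles, grid, cell):
    """Return the index of a triangle containing p, or None."""
    key = (math.floor(p[0] / cell), math.floor(p[1] / cell))
    for idx in grid.get(key, ()):
        if point_in_triangle(p, triangles[idx]):
            return idx
    return None


triangles = [((0, 0), (1, 0), (0, 1)), ((5, 5), (6, 5), (5, 6))]
grid = build_grid(triangles, 1.0)
print(find_triangle((0.2, 0.2), triangles, grid, 1.0))
print(find_triangle((3.0, 3.0), triangles, grid, 1.0))
```

The box size is a tuning choice: smaller boxes give shorter candidate lists but a larger table.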
There are several different ways you could search through your triangles. I would start by eliminating impossibilities.
Find the lowest left corner for each triangle and eliminate any that lie above and/or to the right of your point. Continue the search with the other triangles and you should eliminate the vast majority of the original triangles.
Take what you have left and use the polar coordinate system to gather the rest of the needed information based on angles between the corners and the point (Java does have some tools for this; I do not know about other languages).
Some things to look at would be convex hull (different but somewhat helpful), Bernoullies triangles, and some methods for sorting would probably be helpful.

Decomposition to Convex Polygons

This question is a little involved. I wrote an algorithm for breaking up a simple polygon into convex subpolygons, but now I'm having trouble proving that it's not optimal (i.e. minimal number of convex polygons using Steiner points (added vertices)). My prof is adamant that it can't be done with a greedy algorithm such as this one, but I can't think of a counterexample.
So, if anyone can prove my algorithm is suboptimal (or optimal), I would appreciate it.
The easiest way to explain my algorithm is with pictures (these are from an older suboptimal version).
What my algorithm does is extend the line segments around the point i across until they hit a point on the opposite edge.
If there is no vertex within this range, it creates a new one (the red point) and connects to that:
If there is one or more vertices in the range, it connects to the closest one. This usually produces a decomposition with the fewest number of convex polygons:
However, in some cases it can fail -- in the following figure, if it happens to connect the middle green line first, this will create an extra unneeded polygon. To handle this, I propose double-checking all the edges (diagonals) we've added to see whether they are all still necessary. If one is not, remove it:
In some cases, however, this is not enough. See this figure:
Replacing a-b and c-d with a-c would yield a better solution. In this scenario though, there are no edges to remove, so this poses a problem. In this case I suggest an order of preference: when deciding which vertex to connect a reflex vertex to, it should choose the vertex with the highest priority:
lowest) closest vertex
med) closest reflex vertex
highest) closest reflex that is also in range when working backwards (hard to explain) --
In this figure, we can see that the reflex vertex 9 chose to connect to 12 (because it was closest), when it would have been better to connect to 5. Both vertices 5 and 12 are in the range as defined by the extended line segments 10-9 and 8-9, but vertex 5 should be given preference because 9 is within the range given by 4-5 and 6-5, but NOT in the range given by 13-12 and 11-12. That is, the edge 9-12 eliminates the reflex vertex at 9 but does NOT eliminate the reflex vertex at 12, whereas an edge from 9 CAN eliminate the reflex vertex at 5, so 5 should be given preference.
It is possible that the edge 5-12 will still exist with this modified version, but it can be removed during post-processing.
Are there any cases I've missed?
Pseudo-code (requested by John Feminella) -- this is missing the bits under Figures 3 and 5
assume vertices in `poly` are given in CCW order
let 'good reflex' (better term??) mean that if poly[i] is being compared with poly[j], then poly[i] is in the range given by the rays poly[j-1], poly[j] and poly[j+1], poly[j]

for each vertex poly[i]
    if poly[i] is reflex
        find the closest point of intersection given by the ray starting at poly[i-1] and extending in the direction of poly[i] (call this lower bound)
        repeat for the ray given by poly[i+1], poly[i] (call this upper bound)
        if there are no vertices along boundary of the polygon in the range given by the upper and lower bounds
            create a new vertex exactly half way between the lower and upper bound points (lower and upper will lie on the same edge)
            connect poly[i] to this new point
        else
            iterate along the vertices in the range given by the lower and upper bounds, for each vertex poly[j]
                if poly[j] is a 'good reflex'
                    if no other good reflexes have been found
                        save it (overwrite any other vertex found)
                    else
                        if it is closer than the other good reflex vertices, save it
                else
                    if no good reflexes have been found and it is closer than the other vertices found, save it
            connect poly[i] to the best candidate
        repeat entire algorithm for both halves of the polygon that was just split
// no reflex vertices found, then `poly` is convex
save poly
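The "is reflex" test that the pseudo-code relies on can be sketched in Python with a cross product, assuming (as stated above) that the vertices are in CCW order. The helper names is_reflex and reflex_vertices are only for illustration and are not part of the pseudo-code:

```python
def is_reflex(poly, i):
    """True if vertex i of a CCW polygon has an interior angle above 180 degrees."""
    ax, ay = poly[i - 1]               # previous vertex (index -1 wraps around)
    bx, by = poly[i]
    cx, cy = poly[(i + 1) % len(poly)]
    # z component of (b - a) x (c - b); negative means a right turn.
    cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
    return cross < 0


def reflex_vertices(poly):
    """Indices of all reflex vertices; an empty list means poly is convex."""
    return [i for i in range(len(poly)) if is_reflex(poly, i)]


# An L-shaped polygon in CCW order; only the inner corner (1, 1) is reflex.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(reflex_vertices(l_shape))
```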
Turns out there is one more case I didn't anticipate: [Figure 5]
My algorithm will attempt to connect vertex 1 to 4, unless I add another check to make sure it can. So I propose stuffing everything "in the range" onto a priority queue using the priority scheme I mentioned above, then take the highest priority one, check if it can connect, if not, pop it off and use the next. I think this makes my algorithm O(r n log n) if I optimize it right.
I've put together a website that loosely describes my findings. I tend to move stuff around, so get it while it's hot.
I believe the regular five pointed star (e.g. with alternating points having collinear segments) is the counterexample you seek.
Edit in response to comments
In light of my revised understanding, a revised answer: try an acute five pointed star (e.g. one with arms sufficiently narrow that only the three points comprising the arm opposite the reflex point you are working on are within the range considered "good reflex points"). At least working through it on paper it appears to give more than the optimal. However, a final reading of your code has me wondering: what do you mean by "closest" (i.e. closest to what)?
Even though my answer was accepted, it isn't the counterexample we initially thought. As @Mark points out in the comments, it goes from four to five at exactly the same time as the optimal does.
Flip-flop, flip flop
On further reflection, I think I was right after all. The optimal bound of four can be retained in an acute star by simply ensuring that one pair of arms has collinear edges. But the algorithm finds five, even with the patch-up.
I get this:
When the optimal is this:
I think your algorithm cannot be optimal because it makes no use of any measure of optimality. You use other metrics like 'closest' vertices, and checking for 'necessary' diagonals.
To drive a wedge between yours and an optimal algorithm, we need to exploit that gap by looking for shapes with close vertices which would decompose badly. For example (ignore the lines, I found this on the intertubenet):
concave polygon which forms a G or U shape http://avocado-cad.wiki.sourceforge.net/space/showimage/2007-03-19_-_convexize.png
You have no protection against the centre-most point being connected across the concave 'gap', which is external to the polygon.
Your algorithm is also quite complex, and may be overdoing it - just like complex code, you may find bugs in it because complex code makes complex assumptions.
Consider a more extensive initial stage to break the shape into more, simpler shapes - like triangles - and then an iterative or genetic algorithm to recombine them. You will need a stage like this to combine any unnecessary divisions between your convex polys anyway, and by then you may have limited your possible decompositions to only sub-optimal solutions.
At a guess something like:
1. decompose into triangles
2. non-deterministically generate a number of recombinations
3. calculate a quality metric (number of polys)
4. select the best x% of the recombinations
5. partially decompose each using triangles, and generate a new set of recombinations
6. repeat from 4 until some measure of convergence is reached
"but vertex 5 should be given preference because 9 is within the range given by 4-5 and 6-5"
What would you do if 4-5 and 6-5 were even more convex so that 9 didn't lie within their range? Then by your rules the proper thing to do would be to connect 9 to 12 because 12 is the closest reflex vertex, which would be suboptimal.
Found it :( They're actually quite obvious.
A four leaf clover will not be optimal if Steiner points are allowed... the red vertices could have been connected.
It won't even be optimal without Steiner points... 5 could be connected to 14, removing the need for 3-14, 3-12 AND 5-12. This could have been two polygons better! Ouch!
