I'm trying to infer an object's direction of movement using dense optical flow in OpenCV. I'm using calcOpticalFlowFarneback() to get flow coordinates and cartToPolar() to acquire vector angles which would indicate direction.
To interpret the results I need to know the reference point for measuring the angle. I have found this blog post indicating that the range of angles is 360°. That tells me that the angle measurement would go along the lines of the unit circle. I couldn't make out much more than that.
The documentation for cartToPolar() doesn't cover this and my attempts at testing it have failed.
It seems that the angle produced by cartToPolar() is in reference to the unit circle rotated clockwise by 90° centered on the image coordinate starting point in the top left corner. It would look like this.
I came to this conclusion by using the dense optical flow example provided by OpenCV. I replaced the line hsv[...,0] = ang*180/np.pi/2 with hsv[...,0] = ang*180/np.pi to get correct angle conversion from radians. Then I tested a video with people moving from top right to bottom left and vice versa. I sampled the dominant color with GIMP and got RGB values which I converted to HSV values. Hue value corresponds to the angle in degrees.
People moving from top right to bottom left produced an angle of about 300° and people moving the other way round produced an angle of about 120°. This hinted at the way the unit circle is positioned.
Looking at the code, fastAtan32f is used to compute the angles, and that seems to be an atan2 implementation.
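If the angle really is an atan2 of the flow components, the convention can be checked without a video. The sketch below uses NumPy's arctan2 as a stand-in for cartToPolar(), which is an assumption on my part, and the helper name flow_angle_degrees is made up for illustration. It only shows what an atan2 convention would predict in image coordinates (x to the right, y downwards); it is worth comparing against a synthetic flow field before trusting it.

```python
import numpy as np

def flow_angle_degrees(dx, dy):
    """Angle of a flow vector in degrees, in [0, 360).

    Assumes an atan2-style convention: 0 degrees along +x,
    increasing towards +y. In image coordinates +y points down,
    so increasing angles turn clockwise on screen.
    """
    return float(np.degrees(np.arctan2(dy, dx)) % 360.0)

# Image coordinates: x grows to the right, y grows downwards.
for name, (dx, dy) in {
    "right": (1, 0),
    "down": (0, 1),
    "left": (-1, 0),
    "up": (0, -1),
}.items():
    print(name, flow_angle_degrees(dx, dy))
```

Under this assumption, motion to the right gives about 0°, motion down the image about 90°, motion to the left about 180°, and motion up the image about 270°.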
My app captures the shape of a room by having the user point a camera at floor corners, and then doing a bunch of math, eventually ending up with a polygon.
The assumption is that the walls are straight (not curved). The majority of the corners are formed by walls at right angles to each other, but in some cases might not be.
Depending on how accurately the user points the camera, the (x,y) coordinates I derive for the corner might be beyond the actual corner, or in front of the actual corner, or, less likely, to the left or right. Obviously, in this case, when I connect the dots, I get weird parallelogram or rhomboid shapes. See example.
I am looking for a program or algorithm to normalize or regularize these shapes, provided we know which corners are supposed to be right angles.
My initial attempt involved finding segments whose angles were "close" to each other, adjusting them all to the same angle, and then recalculating the vertices. However, this algorithm proved to be unstable.
My current thinking is to find the angles which are most obtuse (as would be caused by a point mistakenly placed beyond the actual corner), or most acute (as would be caused by a point mistakenly placed in front of the actual corner), and find the corner point which would make it a right angle. The problem, however, is that such an adjustment could have side effects on other corners, such as pushing them even further away from right angles. I sense I need some kind of algorithm which takes all the information and optimizes/solves it at once--is this a kind of linear programming problem?--but I am stuck.
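For the single-corner step described above ("find the corner point which would make it a right angle"), one possible construction is Thales' theorem: every point on the circle whose diameter joins the two neighbouring corners sees them at 90°. The sketch below projects the measured corner onto that circle. The function name snap_to_right_angle is hypothetical, and this moves one corner only, so it does nothing about the side effects on the adjacent corners.

```python
import numpy as np

def snap_to_right_angle(a, b, c):
    """Move corner b so that the angle a-b-c becomes 90 degrees.

    a and c are the neighbouring corners and stay fixed. b is moved
    to the nearest point on the circle with diameter a-c (Thales).
    """
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    mid = (a + c) / 2.0
    radius = np.linalg.norm(c - a) / 2.0
    offset = b - mid
    dist = np.linalg.norm(offset)
    if dist == 0.0:
        raise ValueError("b lies at the midpoint of a-c; direction is undefined")
    return mid + offset * (radius / dist)

# A corner measured slightly "beyond" the true corner at (0, 0).
new_b = snap_to_right_angle((4.0, 0.0), (0.4, -0.3), (0.0, 3.0))
print(new_b, np.dot(np.array([4.0, 0.0]) - new_b, np.array([0.0, 3.0]) - new_b))
```

The printed dot product of the two edge vectors should be zero up to rounding, which is what a right angle at the new corner means.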
There is not a unique solution.
For example, take the perpendicular from the middle point of an edge to the two neighboring edges. This will give you two new corners.
Or take the perpendicular from the end point of an edge to other edges.
Or compute the average of angles in the end points of an edge. Use this average and the middle point of the edge to compute new corners.
Or...
To get the most faithful compliance, capture (or calculate) distances from each corner to the other three. Build triangles with those distances. Then use the average of the coordinates you compute for a corner from 2 or 3 triangles.
Resulting angles will not be exactly 90 degrees, but the polygon will represent the room fairly.
So ... what exactly are the parameters of body.rotation and body.angularVelocity in Phaser arcade physics?
The documentation for body.rotation just says "the amount the Body is rotated", without specifying units (radians or degrees), the zero vector (X axis?), nor the direction that's positive.
Docs for body.angle says "angle in radians" ... but again doesn't say which axis is the 0 rotation vector, nor which direction is positive.
The documentation for angularVelocity says "angular velocity in pixels per second squared" which doesn't make ANY SENSE AT ALL. You can't measure rotation in pixels.
I'm trying to sync up a phaser front-end with a server-based physics model that has its own coordinate system, so some clarity on the documentation would really make my life easier!
As far as I know "body.rotation" is given in radians and if using degrees you should use "body.angle".
For the rotation direction, a higher value rotates the sprite clockwise. If the angle is 0 and the sprite is pointing up, it will point to the right after setting body.angle = 90.
angularVelocity is not for rotating your sprite. The name says "angularVELOCITY" so what it's used for is to set an angular velocity. It's mainly used when you want the sprite to move in the direction it's facing.
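For syncing with a server model, the conversion might look like the sketch below. It takes the convention described in this answer (degrees, positive is clockwise on screen) and assumes, purely for illustration, that the server uses radians with counter-clockwise positive and the y axis pointing up. The function name server_to_screen_degrees is made up; none of this is Phaser API.

```python
import math

def server_to_screen_degrees(theta_rad):
    """Convert a counter-clockwise angle in radians (y up) to a
    clockwise-positive angle in degrees (y down), in (-180, 180].

    Both conventions are assumptions; check them against the
    actual server model and the Phaser version in use.
    """
    deg = -math.degrees(theta_rad)           # flip the rotation sense
    wrapped = (deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    return 180.0 if wrapped == -180.0 else wrapped

print(server_to_screen_degrees(math.pi / 2))   # quarter turn counter-clockwise
print(server_to_screen_degrees(-math.pi / 2))  # quarter turn clockwise
```

If the server's zero direction differs from the sprite's zero direction, a constant offset would have to be added before wrapping.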
A bit of background
I am writing a simple ray tracer in C++. I have most of the core complete but don't understand how to retrieve the world coordinate of a pixel on the image plane. I need this location so that I can cast the ray into the world.
Currently I have a Camera with a position (aka my perspective reference point) and a direction (vector) which is not normalized. The direction's length gives the distance to the center of the image plane, and its orientation gives the way the camera is facing.
There are other values associated with the camera but they should not be relevant.
My image coordinates will range from -1 to 1, and the perspective (focal length) will change based on the length of the direction vector associated with the camera.
What I need help with
I need to go from pixel coordinates (say [0, 256] in an image 256 pixels on each side) to my world coordinates.
I also want to program this so that, no matter where the camera is placed and where it is directed, I can find the pixel in world coordinates. (Currently the camera will almost always be centered at the origin and will look down the negative z axis. I would like to program this with the future changes in mind.) It is also important to know if this code should be pushed down into my threaded code as well. Otherwise it will be calculated by the main thread and then the ray will be used in the threaded code.
(source: in.tum.de)
I did not make this image and it is only there to give an idea of what I need.
Please leave comments if you need any additional info. Otherwise I would like a simple theory/code example of what to do.
Basically you have to do the inverse process of V * MVP, which transforms the point to unit cube dimensions. Look at the following URLs for programming help:
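For the camera described in the question (a position, plus an unnormalized direction whose length is the distance to the image plane, and image coordinates from -1 to 1), the world position of a pixel can also be built directly from a camera basis instead of inverting a matrix. This is a minimal sketch under those assumptions. The up_hint parameter, the pixel-centre offset of 0.5, and the function name pixel_to_world are my own additions, not something from the question.

```python
import numpy as np

def pixel_to_world(px, py, width, height, position, direction,
                   up_hint=(0.0, 1.0, 0.0)):
    """World-space point on the image plane for pixel (px, py).

    position:  camera position (perspective reference point)
    direction: unnormalized view vector; its length is the
               distance from the camera to the image plane
    up_hint:   assumed rough "up", must not be parallel to direction
    """
    position = np.asarray(position, dtype=float)
    direction = np.asarray(direction, dtype=float)
    forward = direction / np.linalg.norm(direction)
    right = np.cross(forward, np.asarray(up_hint, dtype=float))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)

    # Map the pixel centre to image coordinates in [-1, 1];
    # pixel row 0 is taken to be the top of the image.
    u = 2.0 * (px + 0.5) / width - 1.0
    v = 1.0 - 2.0 * (py + 0.5) / height
    return position + direction + u * right + v * up

# Camera at the origin looking down the negative z axis.
point = pixel_to_world(0, 0, 2, 2, (0, 0, 0), (0, 0, -1))
ray_dir = point - np.zeros(3)  # ray from the camera through the pixel
print(point)
```

The function is a handful of multiplications per pixel and shares no state, so it could be computed either in the main thread or inside the worker threads.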
http://nehe.gamedev.net/article/using_gluunproject/16013/ https://sites.google.com/site/vamsikrishnav/gluunproject
I have two objects: a sphere and an object. It's an object that I created using surface reconstruction, so we do not know the equation of the object. I want to know the intersecting points on the sphere when the object and the sphere intersect. If we had a sphere and a cylinder, we could solve for the equation and figure out the area and all that, but the problem here is that the object is not uniform.
Is there a way to find out the intersecting points or area on the sphere?
I'd start by finding the intersection of triangles with the sphere. First find the intersection of each triangle's plane and the sphere, which gives a circle. Then find the circle's intersection/s with the triangle edges in 2D using line/circle tests. The result will be many arcs which I guess you could approximate with lines. I'm not really sure where to go from here without knowing the end goal.
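The first step (triangle plane versus sphere) can be sketched as below, assuming NumPy and a triangle given by three vertices. The function name plane_sphere_circle is made up for illustration. It covers only the plane/sphere part; clipping the circle against the triangle edges is not shown.

```python
import numpy as np

def plane_sphere_circle(p0, p1, p2, center, radius):
    """Circle where the plane through p0, p1, p2 cuts a sphere.

    Returns (circle_center, circle_radius), or None if the plane
    misses the sphere.
    """
    p0, p1, p2, center = (np.asarray(v, dtype=float)
                          for v in (p0, p1, p2, center))
    normal = np.cross(p1 - p0, p2 - p0)
    length = np.linalg.norm(normal)
    if length == 0.0:
        raise ValueError("degenerate triangle")
    normal /= length

    # Signed distance from the sphere centre to the plane.
    dist = float(np.dot(center - p0, normal))
    if abs(dist) > radius:
        return None

    circle_center = center - dist * normal
    circle_radius = float(np.sqrt(radius ** 2 - dist ** 2))
    return circle_center, circle_radius

# Unit sphere at the origin, triangle lying in the plane z = 0.6.
print(plane_sphere_circle((0, 0, 0.6), (1, 0, 0.6), (0, 1, 0.6),
                          (0, 0, 0), 1.0))
```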
If it's surface area you're after, maybe a numerical approach would be better. I'd cover the sphere in points and count the number inside the non-uniform object. To find if a point is inside, maybe trace outwards and count the intersections with the surface (if it's odd, the point is inside). You could use the stencil buffer for this if you wanted (similar to stencil shadows).
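A rough sketch of the point-counting idea follows. The inside test is passed in as a callable, is_inside, which is a placeholder for whatever ray-counting or stencil test you end up with. Spreading the points with a Fibonacci lattice is my choice for illustration, not something the approach requires.

```python
import math

def fibonacci_sphere(n, radius=1.0, center=(0.0, 0.0, 0.0)):
    """n roughly evenly spread points on a sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n           # evenly spaced heights
        r = math.sqrt(max(0.0, 1.0 - z * z))     # ring radius at that height
        phi = golden_angle * i
        points.append((center[0] + radius * r * math.cos(phi),
                       center[1] + radius * r * math.sin(phi),
                       center[2] + radius * z))
    return points

def covered_area(is_inside, n=10000, radius=1.0, center=(0.0, 0.0, 0.0)):
    """Estimate the sphere area lying inside an object.

    is_inside(point) -> bool is a hypothetical user-supplied test.
    """
    hits = sum(1 for p in fibonacci_sphere(n, radius, center) if is_inside(p))
    return 4.0 * math.pi * radius ** 2 * hits / n

# Stand-in object: the half-space z > 0, which covers half the sphere.
print(covered_area(lambda p: p[2] > 0.0, n=1000))
```

The estimate gets better with more points, at the cost of one inside test per point.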
If you want the volume of intersection a quick google search gives "carve", a mesh based CSG library.
Starting with triangles versus the sphere will give you the points of intersection.
You can take the arcs of intersection with each surface and combine them to make fences around the sphere. Ideally your reconstructed object will be in winged-edge format so you could just step from one fence segment to the next, but with reconstructed surfaces I guess you might need to apply some slightly fuzzy logic.
You can determine which side of each fence is inside the reconstructed object and which side is out by factoring in the surface normals along the fence.
You can then cut the sphere along the fences and add the internal bits to the display.
For the other side of things you could remove any triangle completely inside the sphere and cut those that intersect.
In my Direct3D application, the camera can be moved using the mouse or arrow keys. But if I hard code (0,1,0) as the up direction vector in LookAtLH, the frame goes blank at some orientations of the camera.
I just learned the hard way that when looking along the Y-axis, (0,1,0) no longer works as the Up direction (seems obvious?). I am thinking of switching my up direction to something else for each of these special cases. Is there a more graceful way to handle this?
Assuming you can calculate a vector pointing forward (what you are looking at - your position) and a vector pointing right (always on the XZ-plane unless you can roll). Normalize both these vectors, then up is forward x right (where x is cross product).
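As a sketch of that, here are the three vectors built from yaw and pitch, using NumPy for the cross product. The yaw/pitch parameterization (yaw about the Y axis, forward along +Z when both are zero) and the function name camera_basis are my assumptions, so adapt them to however your camera stores its orientation.

```python
import numpy as np

def camera_basis(yaw, pitch):
    """Forward, right and up vectors for a camera without roll.

    Angles are in radians. right always stays in the XZ plane,
    so up = forward x right stays valid even when looking
    straight along the Y axis.
    """
    forward = np.array([np.sin(yaw) * np.cos(pitch),
                        np.sin(pitch),
                        np.cos(yaw) * np.cos(pitch)])
    right = np.array([np.cos(yaw), 0.0, -np.sin(yaw)])
    up = np.cross(forward, right)
    return forward, right, up

# Looking straight up the Y axis: up is still well defined.
forward, right, up = camera_basis(0.0, np.pi / 2)
print(forward, right, up)
```

The resulting up vector can then be passed to LookAtLH in place of the hard-coded (0,1,0).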
In general, you can plug in your yaw, pitch and roll into a rotation matrix and rotate the axis vectors to get right, up and forward, but I guess that's what you are using LookAtLH to avoid.
See http://en.wikipedia.org/wiki/Rotation_matrix#The_3-dimensional_rotation_matricies
The graceful way to handle this is to use unit quaternions. A quaternion is a vector of 4 values that encodes an orientation in 3D space (not a rotation as some articles assert), and a unit quaternion is one where the vector length sqrt(x^2+y^2+z^2+w^2) is 1.0. There is a set of mathematical operations for working with quaternions that are analogous to using matrices to encode rotations, with the added bonus that quaternions can never represent a degenerate orientation. You can freely convert quaternions to a 3x3 or 4x4 matrix when you need to feed the result to a GPU.
Your problem is that, while you are moving your camera, you will introduce a little twist into the camera's up direction. By forcing the camera to re-center itself on the (0,1,0) vector every iteration, you are in effect rotating the camera and then clamping the camera's orientation to remain on the surface of a sphere, but when your camera hits the pole of this sphere there is no good direction to call "up", and your matrix goes singular and gives you zero-sized polygons (hence the black screen). Quaternions have the ability to interpolate through these poles and come out the other side just fine, leaving you with a valid matrix at all times. All you have to do is control the "twist".
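The quaternion-to-matrix conversion is short. Here is a minimal sketch with NumPy, assuming the quaternion is stored as (w, x, y, z) and the matrix is applied to column vectors. Both the storage order and the function name quat_to_matrix are assumptions, and Direct3D's row-vector convention would need the transpose.

```python
import numpy as np

def quat_to_matrix(q):
    """3x3 rotation matrix from a quaternion given as (w, x, y, z).

    The quaternion is normalized first, so small drift away from
    unit length does not skew the matrix.
    """
    q = np.asarray(q, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])

# Quarter turn about the Z axis: the X axis should map onto the Y axis.
half = np.pi / 4
R = quat_to_matrix((np.cos(half), 0.0, 0.0, np.sin(half)))
print(R @ np.array([1.0, 0.0, 0.0]))
```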
To measure this twist you should read Ken Shoemake's article "Fiber Bundle Twist Reduction" in the book Graphics Gems 4. He shows a good way to measure this accumulated twist and how to remove it when it is offensive.