Target Coordinates To Screen Coordinates - Vuforia - vuforia

I just want to get the 2dscreen coords of a target calculated from the info in the returned pose in iOS.
Would someone take pity and please provide me with an answer as I have virtually finished the app I was working on and that is the last piece I need. I looked at the documentation but I don't understand matrix math.

Related

How can I proceed with my defect detection algorithm?

I am an undergraduate student working with detecting defects on a surface of an object, in a given digital image using image processing technique. I am planning on using OpenCV library to get image processing functions. Currently I am trying to decide on which defect detection algorithm to use, in order to detect defects. This is one of my very first projects related to this field, so it will be appreciated if I can get some help related to this issue. The reference image with a defect (missing teeth in the gear), which I am currently working with is uploaded as a link below ("defective gear image").
defective gear image
Get the convex hull of a gear (which is a polygon) and shrink is slightly so that it crosses the teeth. Make sure that the centroid of the gear is the fixed point.
Then sample the pixels along the hull, preferably using equidistant points (divide the perimeter by a multiple of the number of teeth). The unwrapped profile will be a dashed line, with missing dashes corresponding to missing teeth, and the problem is reduced to 1D.
You can also try a polar unwarping, making the outline straight, but you will need an accurate location of the center.

Extracting part of a 2D image in OpenCV

enter image description here
My goal is to take the image above and "open" it along the center so that the 9 black doublets are in a straight line rather than in a circle. I have tried using the cv2.toPolar() function in OpenCV but the image is quite distorted, as can be seen below:
enter image description here
I am attempting to try a different approach now. From the center, I would like to access each of the doublet individually, like a pizza slice, and place them side by side
Initially I was thinking of slicing each doublet using two lines from the center of the image to the mid point between the doublets on either side.
My question is: how can I draw contours from the center of the image to the edge of the image, passing through the mid point between any two doublet. If I can draw one, I know that the angle between any two such consecutive contour is 40 degrees.
Any help is greatly appreciated!
I noted a few problems here:
The toPolar() conversion might have been around the center of the image file, but it is not the center of the object. This causes part of the distortion. If you share your code, I could try playing with the code and improving it.
2.The object is somewhat elliptical, not circular. This means you will still have a wave after correcting the above problem.
If you don't mind a semi-automatic solution, you could use OpenCV mouse events to specify the first line and let the program use the 40 degree angle to calculate the rest.

How do I find the world coordinates of a pixel on the image plane?

A bit of background
I am writing a simple ray tracer in C++. I have most of the core complete but don't understand how to retrieve the world coordinate of a pixel on the image plane. I need this location so that I can cast the ray into the world.
Currently I have a Camera with a position(aka my perspective reference point), a direction (vector) which is not normalized. The directions length signifies the center of the image plane and which way the camera is facing.
There are other values associated with the camera but they should not be relevant.
My image coordinates will range from -1 to 1 and the perspective(focal length), will change based on the distance of the direction associated with the camera.
What I need help with
I need to go from pixel coordinates (say [0, 256] in an image 256 pixels on each side) to my world coordinates.
I will also want to program this so that no matter where the camera is placed and where it is directed, that I can find the pixel in the world coordinates. (Currently the camera will almost always be centered at the origin and will look down the negative z axis. I would like to program this with the future changes in mind.) It is also important to know if this code should be pushed down into my threaded code as well. Otherwise it will be calculated by the main thread and then the ray will be used in the threaded code.
(source: in.tum.de)
I did not make this image and it is only there to give an idea of what I need.
Please leave comments if you need any additional info. Otherwise I would like a simple theory/code example of what to do.
Basically you have to do the inverse process of V * MVP which transforms the point to unit cube dimensions. Look at the following urls for programming help
http://nehe.gamedev.net/article/using_gluunproject/16013/ https://sites.google.com/site/vamsikrishnav/gluunproject

How to calculate a pixels world space position on an image plane formed by a virtual camera?

First, this Calculating camera ray direction to 3d world pixel helped me a bit in understanding what the virtual camera setup is like. I don't understand how the vectors work in this setup, and I thought normalized device coordinates had to be used which led me to this page http://www.scratchapixel.com/lessons/3d-basic-lessons/lesson-6-rays-cameras-and-images/building-primary-rays-and-rendering-an-image/. What I am trying to do is build a ray tracer, and as the question states, find out the pixels position in order to shoot out a ray. What I really, really really would like, is an actually example showing a virtual camera setup, screen resolution and how to calculate a pixels position, then transform to world space coordinates. Experts!, Thank you for your help! :D
Multiply a matrix by the coordinates. What matrix? There are lots of choices. For example XNA uses a projection matrix, view matrix and world matrix. Applying all of them transforms pixel coordinates into world coordinates or vice versa. Breaking it down this way helps to understand the different transformations going on so you can more easily construct the matrices.
Isn't this webpage providing you already with 4 pages of explanation on how these rays are built? It seems like you haven't made the effort to read the content of the link you are referring to. I would suggest you read it first, try to understand it, maybe look at the source code they provide and come back with a real question regarding what you potentially don't understand.
It's all there, and I am not going to re-write what these people seem to have put a lot of energy already to explain! (nor should anybody else really ...).

Distance between the camera and a recognized "object"

I would like to calculate the distance between my camera and a recognized "object".
The recognized "object" is a black rectangle sticker on a white board for example. I know the values of the rectangle (x,y).
Is there a method that I can use to calculate the distance with the values of my original rectangle, and the values of the picture of the rectangle I took with the camera?
I searched the forum for answeres, but none of the were specified to calculate the distance with these attributes.
I am working on a robot called Nao from Aldebaran Robotics, I am planing to use OpenCV to recognize the black rectangle.
If you could compute the angle taken up by the image of the target, then the distance to the target should be proportional to cot (i.e. 1/tan) of that angle. You should find that the number of pixels in the image corresponded roughly to the angles, but I doubt it is completely linear, especially up close.
The behaviour of your camera lens is likely to affect this measurement, so it will depend on your exact setup.
Why not measure the size of the target at several distances, and plot a scatter graph? You could then fit a curve to the data to get a size->distance function for your particular system. If your camera is close to an "ideal" camera, then you should find this graph looks like cot, and you should be able to find your values of a and b to match dist = a * cot (b * width).
If you try this experiment, why not post the answers here, for others to benefit from?
[Edit: a note about 'ideal' cameras]
For a camera image to look 'realistic' to us, the image should approximate projection onto a plane held infront of the eye (because camera images are viewed by us by holding a planar image in front of our eyes). Imagine holding a sheet of tracing paper up in front of your eye, and sketching the objects silhouette on that paper. The second diagram on this page shows sort of what I mean. You might describe a camera which achieves this as an "ideal" camera.
Of course, in real life, cameras don't work via tracing paper, but with lenses. Very complicated lenses. Have a look at the lens diagram on this page. For various reasons which you could spend a lifetime studying, it is very tricky to create a lens which works exactly like the tracing paper example would work under all conditions. Start with this wiki page and read on if you want to know more.
So you are unlikely to be able to compute an exact relationship between pixel length and distance: you should measure it and fit a curve.
It is a big topic. If you want to proceed from a single image, take a look at this old paper by A. Criminisi. For an in-depth view, read his Ph.D. thesis. Then start playing with the OpenCV routines in the "projective geometry" sectiop.
I have been working on Image/Object Recognition as well. I just released a python programmed android app (ported to android) that recognizes objects, people, cars, books, logos, trees, flowers... anything:) It also shows it's thought process as it "thinks" :)
I've put it out as a test for 99 cents on google play.
Here's the link if you're interested, there's also a video of it in action:
https://play.google.com/store/apps/details?id=com.davecote.androideyes
Enjoy!
:)

Resources