I need to write a program that will detect a red square in an image. I would like to do this on my GPU using OpenGl-ES. I have no experience with GPU programming, and haven't found the answer through Google so far.
Is it possible to do this using OpenGL? Does OpenGL-ES give access to the whole matrix of pixels as well as their location in the matrix, allowing a program to go through the pixels, and check the color value of each one as well as their locations in the matrix?
Thank you.
Above all, you are confused to call a few terms. There is 'no matrix of pixels'
If what you meant by that is Convolution, yes, you can put the convolution on Fragment shader to detect edges. However, there is no returning datas, and no way to access each pixel to get the color value. Convolution would work if you just want the shader to draw of square's edge. But if you want to know if a red square exist in the camera frame it must be calculated in CPU not in GPU.
Related
Image morphing is mostly a graphic design SFX to adapt one picture into another one using some points decided by the artist, who has to match the eyes some key zones on one portrait with another, and then some kinds of algorithms adapt the entire picture to change from one to another.
I would like to do something a bit similar with a shader, which can load any 2 graphics and automatically choose zones of the most similar colors in the same kinds of zone of the picture and automatically morph two pictures in real time processing. Perhaps a shader based version would be logically alot faster at the task? except I don't even understand how it works at all.
If you know, Please don't worry about a complete reply about the process, it would be great if you have save vague background concepts and keywords, for how to attempt a 2d texture morph in a graphics shader.
There are more morphing methods out there the one you are describing is based on geometry.
morph by interpolation
you have 2 data sets with similar properties (for example 2 images are both 2D) and interpolate between them by some parameter. In case of 2D images you can use linear interpolation if both images are the same resolution or trilinear interpolation if not.
So you just pick corresponding pixels from each images and interpolate the actual color for some parameter t=<0,1>. for the same resolution something like this:
for (y=0;y<img1.height;y++)
for (x=0;x<img1.width;x++)
img.pixel[x][y]=(1.0-t)*img1.pixel[x][y] + t*img2.pixel[x][y];
where img1,img2 are input images and img is the ouptput. Beware the t is float so you need to overtype to avoid integer rounding problems or use scale t=<0,256> and correct the result by bit shift right by 8 bits or by /256 For different sizes you need to bilinear-ly interpolate the corresponding (x,y) position in both of the source images first.
All This can be done very easily in fragment shader. Just bind the img1,img2 to texture units 0,1 pick the texel from them interpolate and output the final color. The bilinear coordinate interpolation is done automatically by GLSL because texture coordinates are normalized to <0,1> no matter the resolution. In Vertex you just pass the texture and vertex coordinates. And in main program side you just draw single Quad covering the final image output...
morph by geometry
You have 2 polygons (or matching points) and interpolate their positions between the 2. For example something like this: Morph a cube to coil. This is suited for vector graphics. you just need to have points corespondency and then the interpolation is similar to #1.
for (i=0;i<points;i++)
{
p(i).x=(1.0-t)*p1.x + t*p2.x
p(i).y=(1.0-t)*p1.y + t*p2.y
}
where p1(i),p2(i) is i-th point from each input geometry set and p(i) is point from the final result...
To enhance visual appearance the linear interpolation is exchanged with specific trajectory (like BEZIER curves) so the morph look more cool. For example see
Path generation for non-intersecting disc movement on a plane
To acomplish this you need to use geometry shader (or maybe even tesselation shader). you would need to pass both polygons as single primitive, then geometry shader should interpolate the actual polygon and pass it to vertex shader.
morph by particle swarms
In this case you find corresponding pixels in source images by matching colors. Then handle each pixel as particle and create its path from position in img1 to img2 with parameter t. It i s the same as #2 but instead polygon areas you got just points. The particle has its color,position you interpolate both ... because there is very slim chance you will get exact color matches and the count ... (histograms would be the same) which is in-probable.
hybrid morphing
It is any combination of #1,#2,#3
I am sure there is more methods for morphing these are just the ones I know of. Also the morphing can be done not only in spatial domain...
Imagine you have a chessboard textured triangle shown in front of you.
Then imagine you move the camera so that you can see the triangle from one side, when it nearly looks as a line.
You will provably see the line as grey, because this is the average color of the texels shown in a straight line from the camera to the end of the triangle. The GPU does this all the time.
Now, how is this implemented? Should I sample every texel in a straight line and average the result to get the same output? Or is there another more efficient way to do this? Maybe using mipmaps?
It does not matter if you look at the object from the side, front, or back; the implementation remains exactly the same.
The exact implementation depends on the required results. A typical graphics API such as Direct3D has many different texture sample techniques, which all have different properties. Have a look at the documentation for some common sampling techniques and an explanation.
If you start looking at objects from an oblique angle, the texture on the triangle might look distorted with most sampling techniques, and Anisotropic Filtering is often used in these scenario's.
A bit of background
I am writing a simple ray tracer in C++. I have most of the core complete but don't understand how to retrieve the world coordinate of a pixel on the image plane. I need this location so that I can cast the ray into the world.
Currently I have a Camera with a position(aka my perspective reference point), a direction (vector) which is not normalized. The directions length signifies the center of the image plane and which way the camera is facing.
There are other values associated with the camera but they should not be relevant.
My image coordinates will range from -1 to 1 and the perspective(focal length), will change based on the distance of the direction associated with the camera.
What I need help with
I need to go from pixel coordinates (say [0, 256] in an image 256 pixels on each side) to my world coordinates.
I will also want to program this so that no matter where the camera is placed and where it is directed, that I can find the pixel in the world coordinates. (Currently the camera will almost always be centered at the origin and will look down the negative z axis. I would like to program this with the future changes in mind.) It is also important to know if this code should be pushed down into my threaded code as well. Otherwise it will be calculated by the main thread and then the ray will be used in the threaded code.
(source: in.tum.de)
I did not make this image and it is only there to give an idea of what I need.
Please leave comments if you need any additional info. Otherwise I would like a simple theory/code example of what to do.
Basically you have to do the inverse process of V * MVP which transforms the point to unit cube dimensions. Look at the following urls for programming help
http://nehe.gamedev.net/article/using_gluunproject/16013/ https://sites.google.com/site/vamsikrishnav/gluunproject
First, this Calculating camera ray direction to 3d world pixel helped me a bit in understanding what the virtual camera setup is like. I don't understand how the vectors work in this setup, and I thought normalized device coordinates had to be used which led me to this page http://www.scratchapixel.com/lessons/3d-basic-lessons/lesson-6-rays-cameras-and-images/building-primary-rays-and-rendering-an-image/. What I am trying to do is build a ray tracer, and as the question states, find out the pixels position in order to shoot out a ray. What I really, really really would like, is an actually example showing a virtual camera setup, screen resolution and how to calculate a pixels position, then transform to world space coordinates. Experts!, Thank you for your help! :D
Multiply a matrix by the coordinates. What matrix? There are lots of choices. For example XNA uses a projection matrix, view matrix and world matrix. Applying all of them transforms pixel coordinates into world coordinates or vice versa. Breaking it down this way helps to understand the different transformations going on so you can more easily construct the matrices.
Isn't this webpage providing you already with 4 pages of explanation on how these rays are built? It seems like you haven't made the effort to read the content of the link you are referring to. I would suggest you read it first, try to understand it, maybe look at the source code they provide and come back with a real question regarding what you potentially don't understand.
It's all there, and I am not going to re-write what these people seem to have put a lot of energy already to explain! (nor should anybody else really ...).
I am currently working on a program to detect coordinates of pool balls in an image of a pool table taken from an arbitrary point.
I first calculated the table corners and warped the perspective of the image to obtain a bird's eye view. Unfortunately, this made the spherical balls appear to be slightly elliptical as shown below.
In an attempt to detect the ellipses, I extracted all but the green felt area and used a Hough transform algorithm (HoughCircles) on the resulting image shown below. Unfortunately, none of the ellipses were detected (I can only assume because they are not circles).
Is there any better method of detecting the balls in this image? I am technically using JavaCV, but OpenCV solutions should be suitable. Thank you so much for reading.
The extracted BW image is good but it needs some morphological filters to eliminate noises then you can extract external contours of each object (by cvFindContours) and fit best ellipse to them (by cvFitEllipse2).