I'm using the Kinect SDK in C++ to generate an image of points near a plane in space, with the goal of using them as touches. I've attached a 3x-scale image of the result of that process, so that part is all gravy.
My question is how best to use OpenCV to generate blobs, frame to frame, from this image (and images like it) to use as touches. Here's what I've tried in my ProcessDepth callback, where img is a monochrome cv::Mat of the touch image and out is an empty cv::Mat.
// img: monochrome cv::Mat of the touch image, out: empty cv::Mat (both class members).
// contours is a std::vector<std::vector<cv::Point>>, mu a std::vector<cv::Moments>,
// and mc a std::vector<cv::Point2f>, all also members of the application class.
cv::Canny(img, out, 100, 200, 3);
cv::findContours(out, contours, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE, cv::Point(0, 0));
mu.resize(contours.size());
mc.resize(contours.size());
for (int i = 0; i < contours.size(); i++) {
    mu[i] = cv::moments(contours[i], true);
}
for (int i = 0; i < contours.size(); i++) {
    mc[i] = cv::Point2f(mu[i].m10 / mu[i].m00, mu[i].m01 / mu[i].m00);
}
(I'd post more code, but VMware is being uncooperative about letting me copy and paste out of it; if you want more, just ask.)
At that point I think I should have centers of mass for the blobs in a frame; in practice, though, it's not there. I either get errors when contours.size() returns greater than 0, or, with a bit of tinkering, I get moments that seem really weird, containing large negative numbers, say. So my questions are as follows:
Does anyone have recommendations on how to turn the image below into blob data with a good result, as far as the flags in findContours are concerned?
Do I even need to bother with Canny or a threshold, since I already have a monochrome image? And if Canny, is a kernel size of 3 too large for the number of pixels I'm dealing with?
Will findContours work on images of this size? (Roughly 160 by 90, though that's fairly arbitrary; smallish, more generally.)
Are the OpenCV functions async? I get lots of invalid address errors if my images and the contour vector don't exist as members of the application class. (I'm the first to admit I'm not a particularly talented C++ programmer.)
Is there a simpler way to go from the image to a series of points corresponding to touches?
For reference, I'm cribbing from some examples in my OpenCV download, and this example.
Let me know if you need any other information to give a better answer, and I'll try to provide it. Thanks!
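In case it helps clarify what I'm after with the "simpler way" question, this is roughly the stripped-down pipeline I have in mind (a plain threshold instead of Canny, straight into findContours); I haven't verified it, so treat it as a sketch, and the threshold value of 128 is a guess:

// Sketch: binary threshold straight into findContours, skipping Canny entirely.
// img is the monochrome touch image (CV_8UC1); 128 is just a guessed threshold.
cv::Mat bin;
cv::threshold(img, bin, 128, 255, cv::THRESH_BINARY);

std::vector<std::vector<cv::Point>> contours;
cv::findContours(bin, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

std::vector<cv::Point2f> centers;
for (size_t i = 0; i < contours.size(); i++) {
    cv::Moments m = cv::moments(contours[i]);
    if (m.m00 > 0) {  // skip degenerate contours to avoid dividing by zero
        centers.push_back(cv::Point2f((float)(m.m10 / m.m00), (float)(m.m01 / m.m00)));
    }
}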
Related
I have been struggling to find a proper image comparison technique. I have a directory on my system with a couple of pictures, and I am trying to recreate those pictures with the same objects, the same lighting and the same camera position. I want to know whether the current camera frame is the same as a given reference image.
For example, assume we have a camera mounted in a fixed position. We take a picture with that camera and store it as 'reference.jpg'. Now, when I run the image comparison algorithm without changing the camera orientation or any of the surroundings, it should return the correlation between the reference image and the current frame; in this scenario it should return something like 1, as nothing has changed and everything is the same.
Until now I have been using SSIM, but its precision is very poor: for example, if I take a picture and then run SSIM in a loop, the correlation factor varies a lot, sometimes reporting around 0.72 or so, which is very bad for precision.
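To make the setup concrete, the comparison loop is roughly of this shape (shown here with the opencv_contrib quality module standing in for the SSIM routine, and cv::VideoCapture as a placeholder for however the frame is actually grabbed):

#include <opencv2/opencv.hpp>
#include <opencv2/quality.hpp>  // opencv_contrib quality module
#include <iostream>

int main() {
    cv::Mat ref = cv::imread("reference.jpg", cv::IMREAD_GRAYSCALE);
    cv::VideoCapture cap(0);  // placeholder for the fixed camera

    for (int i = 0; i < 10; i++) {
        cv::Mat frame, gray;
        cap >> frame;
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::resize(gray, gray, ref.size());

        // Mean SSIM over the whole image; with nothing changed I expect ~1.0,
        // but in practice the value fluctuates (down to ~0.72).
        cv::Scalar score = cv::quality::QualitySSIM::compute(ref, gray, cv::noArray());
        std::cout << "SSIM: " << score[0] << std::endl;
    }
    return 0;
}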
Image stitching with a reference image
I have multiple images of a subject (a bone); the images are of different sections of the subject, laid out like a 3x3 matrix. I would like to stitch them together, but the problem is that they don't have any common features, as the subject was cut into these sections using a saw. What I do have is an image of the subject before cutting, and I want to use it as a guide to stitch the section images together.
I have tried using Fiji (ImageJ) and searched the web for an alternative, but ImageJ can only do the job if it has common features between images to work with. Can someone point me to some code in Python or MATLAB that can do this, or to any software that could help?
[Reference image][1], plus images of section (11), section (12), section (13), section (21), section (23) and section (31).
  [1]: https://i.stack.imgur.com/wQr09.jpg
I'm not able to add more than 8 links due to SO's policy; there are two more remaining, and I'll add them soon. Also, "section (22)", i.e. the centre position in the 3x3 matrix, is empty.
Solutions for image processing needs like this vary wildly depending on whether you need a script to use just a few times, a software tool you'll use for a few weeks, or what could become lab automation software.
This seems to be a problem more of image matching rather than image stitching. By image matching I mean you need to find out how a subimage such as the bone section at (row 2, column 1) would match what is labeled as "4," the center left section, in the reference bone image.
The basic process (a rough OpenCV sketch follows the list):
Load your reference image as a 2D array (first converted to grayscale).
Load your first sample image of a subsection of bone.
Use an algorithm such as SIFT to determine the location, orientation, and scale to fit the bone subsection image onto the reference image.
Apply the fit criteria (x, y, rotation, scale) to the bone subsection image, transform it, and paste it into a black image the same size as the reference image.
Continue the process above to fit all subsections.
(Optional) With all bone subsections fitted in place, perform additional image processing operations to improve the fit, fill in gaps, etc.
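As a rough sketch of steps 1 through 4 in OpenCV (assuming OpenCV 4.4+, where SIFT lives in the main module; the file names here are placeholders, and the same calls exist in the Python bindings):

#include <opencv2/opencv.hpp>

int main() {
    // Placeholder file names -- substitute the reference and one section image.
    cv::Mat ref = cv::imread("reference.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat sec = cv::imread("section_21.jpg", cv::IMREAD_GRAYSCALE);

    // Steps 1-2: detect SIFT keypoints and descriptors in both images.
    cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kpRef, kpSec;
    cv::Mat descRef, descSec;
    sift->detectAndCompute(ref, cv::noArray(), kpRef, descRef);
    sift->detectAndCompute(sec, cv::noArray(), kpSec, descSec);

    // Step 3: match descriptors and collect the corresponding point pairs.
    cv::BFMatcher matcher(cv::NORM_L2, true);  // cross-check enabled
    std::vector<cv::DMatch> matches;
    matcher.match(descSec, descRef, matches);

    std::vector<cv::Point2f> ptsSec, ptsRef;
    for (size_t i = 0; i < matches.size(); i++) {
        ptsSec.push_back(kpSec[matches[i].queryIdx].pt);
        ptsRef.push_back(kpRef[matches[i].trainIdx].pt);
    }

    // Estimate a similarity transform (x, y, rotation, scale) with RANSAC.
    cv::Mat T = cv::estimateAffinePartial2D(ptsSec, ptsRef, cv::noArray(), cv::RANSAC);

    // Step 4: warp the section onto a black canvas the size of the reference.
    cv::Mat canvas = cv::Mat::zeros(ref.size(), ref.type());
    cv::warpAffine(sec, canvas, T, canvas.size());
    cv::imwrite("fitted_21.png", canvas);
    return 0;
}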
From your sample images it appears that the reference and the bone section images are taken using different lighting, sometimes with the flat portion of the bone slightly tilted relative to the camera's optical axis, etc., all of which makes the image match more difficult.
SIFT is an algorithm that could help here. Note that "scale invariant" is part of the algorithm name.
https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
Given all that, your reference image and bone subsection images appear to be taken under very different circumstances, and that makes solving the problem harder than it needs to be. You'll have an easier time overall if you can control the conditions under which images are captured.
Capture all images with the same camera, with the same lighting, at roughly the same distance
For lighting, use something like a high-frequency diffuse fluorescent light
Use the same background for every image (e.g. matte black)
Making this image matching a robust process means paying attention to the physical setup as well as to your image processing algorithm.
If you need a good reference for traditional image processing techniques, find a copy of Digital Image Processing by Gonzalez and Woods. Some time spent with that book will give you better answers faster than learning image processing piecemeal online.
For practical image processing that addresses real-world concerns for implementing even simple image processing algorithms, look for Machine Vision by Davies.
I would strongly urge that you NOT look into machine learning, or try to find an answer in a more advanced image processing textbook until you run into a roadblock with more traditional methods.
I'm researching the possibility of performing occlusion culling in voxel/cube-based games like Minecraft, and I've come across a challenging sub-problem. I'll give the 2D version of it.
I have a bitmap which infrequently has pixels added to or removed from it.
Image Link
What I want to do is maintain some arbitrarily small set of geometry primitives that cover an arbitrarily large area, such that the area covered by all the primitives is within the colored part of the bitmap.
Image Link
Is there a smart way to maintain these sets? Please note that this is different from typical image tracing, in that the primitives cannot go outside the lines. If it helps, I already have the bitmap organized into a quadtree.
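For reference, the naive baseline I can get from the quadtree is just to emit every fully-filled node as an axis-aligned square, roughly like this (QuadNode is only an illustration of my structure, not real code from the project):

#include <vector>

// Illustrative quadtree node: fully filled, fully empty, or split into 4 children.
struct QuadNode {
    int x, y, size;          // node covers the square [x, x+size) x [y, y+size)
    bool filled;             // true if every pixel under this node is set
    QuadNode* children[4];   // null pointers when the node is a leaf
};

struct Rect { int x, y, w, h; };

// Emit one square per fully-filled node; nothing emitted ever leaves the bitmap.
void collectCoverage(const QuadNode* node, std::vector<Rect>& out) {
    if (!node) return;
    if (node->filled) {
        out.push_back({node->x, node->y, node->size, node->size});
        return;
    }
    for (int i = 0; i < 4; i++) {
        collectCoverage(node->children[i], out);
    }
}

That obviously produces far more primitives than I'd like, which is why I'm asking whether there's a smarter way to maintain the set incrementally.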
I want to take a screenshot of an X11 window and find the locations of smaller images within it. I have no experience working with images; I've searched a lot, but I haven't found many helpful results.
The images come from files and can be loaded in whatever format is easiest to use.
Getting the screenshot is easy, using XGetImage. But then the question is which format to use: XYPixmap or ZPixmap? What's the difference? How is each pixel represented?
And then what about the images? Which file format is easiest to use? And how is each pixel represented in that format?
And which algorithm should I use to find the location of the images in the screenshot?
I'm really lost here. I need a push in the right direction and some example code that can help me understand what I'm dealing with. I couldn't find any similar work.
The language, frameworks or tools don't really matter to me, as long as I get it working on my Ubuntu machine. I can work in C, C++, Haskell, Python or JavaScript.
With XYPixmap, each image plane is a separate bitmap (one bit per pixel, with padding at the end of each scanline). If you have 24-bit color, you get 24 separate bitmaps. To retrieve the pixel value at some (x, y) coordinates, you need to fetch one bit from each of the bitmaps at those coordinates and pack the bits into a pixel.
With ZPixmap, pixels are represented as sequences of bits, with padding at the end of each scanline. If you have 24-bit color, every 3 bytes is a pixel.
In both cases, there may be padding at the end, and sometimes at the beginning, of each scanline. It is all described here.
I would not use either format directly. Convert your pixmap to a simple 1, 2, or 4 bytes-per-pixel 2D array, and do the same with the patterns you want to search for. If you want to find exact matches, you can use a slightly modified string-search algorithm like KMP. Fuzzy matches are tricky; I don't know of any methods that work well.
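A minimal sketch of that conversion plus a brute-force exact search (XGetPixel hides the XYPixmap/ZPixmap layout details for you; the function names here are just for illustration, and this is the naive scan rather than a KMP-style search):

#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <cstdint>
#include <vector>

// Flatten an XImage into one 32-bit value per pixel.
std::vector<uint32_t> toArray(XImage* img) {
    std::vector<uint32_t> px(img->width * img->height);
    for (int y = 0; y < img->height; y++)
        for (int x = 0; x < img->width; x++)
            px[y * img->width + x] = (uint32_t)XGetPixel(img, x, y);
    return px;
}

// Brute-force exact match: returns true and sets (outX, outY) to the first
// position where the pattern appears. Both arrays are row-major.
bool findExact(const std::vector<uint32_t>& screen, int sw, int sh,
               const std::vector<uint32_t>& pat, int pw, int ph,
               int& outX, int& outY) {
    for (int y = 0; y + ph <= sh; y++) {
        for (int x = 0; x + pw <= sw; x++) {
            bool match = true;
            for (int j = 0; j < ph && match; j++)
                for (int i = 0; i < pw && match; i++)
                    if (screen[(y + j) * sw + x + i] != pat[j * pw + i])
                        match = false;
            if (match) { outX = x; outY = y; return true; }
        }
    }
    return false;
}

The pattern array would come from whatever image library you load the files with, converted to the same per-pixel layout as the screenshot.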
I am trying to read an image with ITK and display it with VTK.
But there is a problem that has been haunting me for quite some time.
I read the images using the classes itkGDCMImageIO and itkImageSeriesReader.
After reading, I can do two different things:
1.
I can convert the ITK image to vtkImageData using itkImageToVTKImageFilter and then use vtkImageReslicer to get all three axes. Then I use the classes vtkImageMapper, vtkActor2D, vtkRenderer and QVTKWidget to display the image.
In this case, when I display the images, there are several problems with the colors: some of them are shown very bright, others are so dark you can barely see them.
2.
The second scenario is the registration pipeline. Here, I read the image as before, then use the classes shown in the ITK Software Guide chapter on registration. Then I resample the image and use itkImageSeriesWriter.
And that's when the problem appears. After writing the image to a file, I compare this new image with the image I used as input, using the XMedCon software. If the image I wrote was shown too bright in my software, there are no changes when I compare the two in XMedCon. Conversely, if the image was too dark in my software, it appears all messed up in XMedCon.
I noticed, when comparing both images (the original and the new one), that in both cases there are changes in modality, pixel dimensions and glmax.
I suppose the problem is with the glmax, as the major changes occur with the darker images.
I really don't know what to do. Does this have something to do with color level/window? The strangest thing is that all the images are very similar, with identical tags, and only some of them display errors when shown or written.
I'm not familiar with the particulars of VTK/ITK specifically, but it sounds to me like the problem is more general than that. Medical images have a high dynamic range and often the images will appear very dark or very bright if the window isn't set to some appropriate range. The DICOM tags Window Center (0028, 1050) and Window Width (0028, 1051) will include some default window settings that were selected by the modality. Usually these values are reasonable, but not always. See part 3 of the DICOM standard (11_03pu.pdf is the filename) section C.11.2.1.2 for details on how raw image pixels are scaled for display. The general idea is that you'll need to apply a linear scaling to the images to get appropriate pixel values for display.
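As a rough illustration of that linear scaling (this follows the window center/width formula from C.11.2.1.2, mapped onto an 8-bit display range; apply the modality Rescale Slope/Intercept first if the image has them, and note that the function and variable names here are mine):

// Map a stored pixel value to an 8-bit display value using DICOM window center/width.
unsigned char applyWindow(double value, double center, double width) {
    double lower = center - 0.5 - (width - 1.0) / 2.0;
    double upper = center - 0.5 + (width - 1.0) / 2.0;
    if (value <= lower) return 0;
    if (value >  upper) return 255;
    return (unsigned char)(((value - (center - 0.5)) / (width - 1.0) + 0.5) * 255.0);
}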
What pixel types do you use? In most cases it's simpler to use a floating-point type while working with ITK, but raw medical images are often stored as short, so that could be your problem.
You should also write the image to the disk after each step (in MHD format, for example), and inspect it with a viewer that's known to work properly, such as vv (http://www.creatis.insa-lyon.fr/rio/vv). You could also post them here as well as your code for further review.
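For the write-to-disk step, something along these lines should do (ImageType stands for whatever itk::Image instantiation you already use; the file name and input variable are just examples):

#include "itkImage.h"
#include "itkImageFileWriter.h"

// Example instantiation -- substitute the pixel type / dimension you actually use.
typedef itk::Image<short, 3> ImageType;

// Dump the intermediate result (e.g. the resampled image) to MHD for inspection in vv.
typedef itk::ImageFileWriter<ImageType> WriterType;
WriterType::Pointer writer = WriterType::New();
writer->SetFileName("after_resample.mhd");  // example file name
writer->SetInput(resampledImage);           // output of the previous pipeline step
writer->Update();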
Good luck!
For what you describe as your first issue:
I can convert the ITK image to vtkImageData using itkImageToVTKImageFilter and then use vtkImageReslicer to get all three axes. Then I use the classes vtkImageMapper, vtkActor2D, vtkRenderer and QVTKWidget to display the image.
In this case, when I display the images, there are several problems with the colors: some of them are shown very bright, others are so dark you can barely see them.
I suggest the following: check your window/level settings in VTK; they probably aren't adequate for your images. If these are abdominal tomographies, window = 350 and level = 50 should give a good display.
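If you are driving the display through vtkImageMapper as described in the question, that would look something like this (imageMapper being whichever mapper feeds your vtkActor2D):

// Window/level for abdominal CT: window (width) 350, level (center) 50.
imageMapper->SetColorWindow(350);
imageMapper->SetColorLevel(50);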