I have been struggling to find a proper image comparison technique. I have a directory on my system with a couple of pictures, and I am trying to recreate those pictures with the same objects, same lighting and same camera position. I want to know whether the current camera frame is the same as the given reference image.
For example, assume we have a camera mounted in a fixed position. We took a picture using that camera and stored it as 'reference.jpg'. Now when I run this image comparison algorithm, without changing the camera orientation or any of the surroundings, the algorithm should return the correlation between the reference image and the current frame. In this scenario it must return something close to 1, as nothing has changed.
Until now I have been using the SSIM technique, but its precision is very bad. For example, if I take a picture and then run SSIM in a loop, the correlation factor deviates a lot between runs; it reports values such as 0.72, which is very bad for precision.
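To make the "should return something close to 1" requirement concrete, here is a minimal sketch of one possible comparison, a plain normalized cross-correlation of pixel values. This is not SSIM and not necessarily the right metric for this setup; the function name frame_correlation is made up for illustration, and both images are assumed to be already loaded as arrays of the same shape.

```python
import numpy as np

def frame_correlation(reference, current):
    """Pearson correlation of two equally sized images (hypothetical helper)."""
    ref = np.asarray(reference, dtype=np.float64).ravel()
    cur = np.asarray(current, dtype=np.float64).ravel()
    if ref.shape != cur.shape:
        raise ValueError("images must have the same shape")
    # Remove the mean so a global brightness offset does not affect the score.
    ref = ref - ref.mean()
    cur = cur - cur.mean()
    denom = np.linalg.norm(ref) * np.linalg.norm(cur)
    if denom == 0:
        # A perfectly flat image has no contrast to correlate against.
        return 0.0
    return float(ref @ cur / denom)
```

Identical frames give 1.0 with this measure, and a uniform brightness or gain change does not lower the score. Sensor noise between two captures still will, so a threshold slightly below 1 would have to be chosen by experiment.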
Related
I am trying to load a .gif file and find the physical dimensions of the entities in the file.
That is, I want to find the volume occupied by each cell in the 3D volume.
gif source
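One could do the following to get the frames of the GIF (code ref.):
    import numpy as np
    from PIL import Image, ImageSequence

    def load_frames(filename):
        # load_frames is a placeholder name; the snippet only showed the body.
        # The referenced code wrapped each array in its own Picture class,
        # which is not defined here, so plain (height, width, 3) arrays are kept.
        img = Image.open(filename)
        frames = []
        for frame in ImageSequence.Iterator(img):
            a = np.array(frame.convert('RGB'), dtype=np.uint8)
            frames.append(a)
        return frames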
I am not sure what has to be done next.
Could someone please offer some suggestions?
To put it simply, you cannot find the exact volume occupied by each cell in the 3D volume. It is impossible without the raw 3D object data. (One example: there are multiple cells inside the object that you can't see clearly with the human eye, so you can't get their data from a picture.)
You may be able to write a complicated algorithm that gets a rough estimate of the volume of the whole object, but it will be very difficult and the accuracy will be low, because there are multiple unknown factors (for example, you can't predict whether the object is hollow or has holes inside it).
As Jabbar mentions, you won't be able to get an exact value, but with some computer vision processing, you should be able to get the voxel dimensions, and if you know the scale of the image, you should be able to scale that value to a physical volume.
First you need to run edge detection and some kind of blob detection to get the individual cells.
Generate a segmentation label for each slice. This is a 2D uint32 array which has a unique number for each entity (cell) you want the volume of.
With your per-layer segmentation labels, you need to correlate the label IDs of the same cell across multiple slices. This will probably be the hardest part, but it's probably ok if it isn't perfect.
Once you have a segmentation mask for each cell in each frame, you can generate a 3D segmentation mask - a 3D array for each cell, which is a boolean mask (True where the cell volume is, False elsewhere)
Sum that array to get the volume (in voxels) of the cell
Scale the voxel count to a physical volume by multiplying it by the physical size of one voxel (pixel width × pixel height × slice depth).
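The last steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the per-slice cell masks are already stacked into one 3D boolean array, and instead of matching label IDs slice by slice it swaps in a 3D connected-component labeling from scipy.ndimage, which only works if touching cells have been separated. The names cell_volumes, pixel_size and slice_depth are made up here.

```python
import numpy as np
from scipy import ndimage

def cell_volumes(mask_stack, pixel_size, slice_depth):
    """Physical volume of each connected cell in a (slices, height, width) mask.

    pixel_size and slice_depth are the assumed physical sizes of one pixel
    edge and of one slice step, in the same unit.
    """
    # Label connected foreground regions across all three axes at once.
    labels, n_cells = ndimage.label(mask_stack)
    # Count voxels per label; index 0 is the background, so drop it.
    voxel_counts = np.bincount(labels.ravel(), minlength=n_cells + 1)[1:]
    voxel_volume = pixel_size * pixel_size * slice_depth
    return voxel_counts * voxel_volume
```

If cells touch each other, the 3D labeling would merge them, and the per-slice segmentation plus label matching described above is still needed.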
I'm working on developing software for an electron microscope. This works by focusing a beam of electrons at a specific part of the sample and then recording the image using a sensor.
The scan data is saved as a 4D array where the first 2 dimensions are the location (x and y) at which the beam was focused and the other 2 dimensions are the raw sensor output, which is a 2D image.
While analyzing the data, I realized that there are some stuck pixels which I would ideally be able to repair automatically via software.
Here is an example:
As you can see, the data shape is 256,256,256,256 which means we scanned 256x256 points and the sensor data is a 256x256 image.
On the right data browser window (called nav), you can see the scan location, which is 0,0 (also marked on the left window by scanY and scanX). I drew circles around a few of the stuck pixels. Here is another scan location for reference:
I can automatically detect these pixels by unraveling the sensor data and checking for locations where the scan values are always the same, but I'm not sure how to repair these.
My first guess was reading the data from all the pixels that are next to this pixel and averaging them, then storing the average value instead of the stuck value, but I'm not sure if this is a good approach.
How does professional software such as Photoshop "repair" or "hide" a defect in a picture? Are there any known algorithms for this issue? I did a bit of searching but didn't manage to find much.
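For reference, the detection and the neighbour-averaging idea described above could look roughly like this. It is only a sketch of my first guess, not a claim about what professional software does. The function names are made up, and the data is assumed to be a NumPy array laid out as (scan_y, scan_x, sensor_y, sensor_x).

```python
import numpy as np

def find_stuck_pixels(data):
    """Boolean sensor-shaped mask of pixels whose value never changes."""
    # Flatten the two scan axes so axis 0 runs over every scan position.
    flat = data.reshape(-1, data.shape[2], data.shape[3])
    # Note: a pixel that is legitimately constant would be flagged as well.
    return flat.min(axis=0) == flat.max(axis=0)

def repair_frame(frame, stuck):
    """Replace each stuck pixel by the mean of its non-stuck 3x3 neighbours."""
    repaired = frame.astype(np.float64)
    height, width = frame.shape
    for y, x in zip(*np.nonzero(stuck)):
        y0, y1 = max(y - 1, 0), min(y + 2, height)
        x0, x1 = max(x - 1, 0), min(x + 2, width)
        patch = frame[y0:y1, x0:x1]
        good = ~stuck[y0:y1, x0:x1]  # excludes the pixel itself
        if good.any():
            repaired[y, x] = patch[good].mean()
    return repaired
```

The mask only has to be computed once per dataset, and repair_frame would then be applied to the sensor image at every scan position.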
Image stitching with a reference image
I have multiple images of a subject (a bone). The images show different sections of the subject, arranged as a 3x3 matrix. I would like to stitch them together, but the problem is that they have no common features, because the subject was cut into these sections with a saw. What I do have is an image of the subject before cutting, and I want to use it as a guide to stitch the section images together.
I have tried Fiji (ImageJ) and searched the web for an alternative. ImageJ can only do the job if there are common features between the images to work with. Can someone point me to some code in Python or MATLAB that can do this, or to any software that could help?
[Reference image][1] section (11) section (12) section (13) section (21) section (23) section (31)
[1]: https://i.stack.imgur.com/wQr09.jpg
I'm not able to add more than 8 links due to SO's policy. There are two more remaining, I'll add them soon. And the "section (22)" i.e centre position in the 3X3 matrix is empty.
Solutions for image processing needs like this vary wildly depending on whether you need a script to use just a few times, a software tool you'll use for a few weeks, or what could become lab automation software.
This seems to be a problem more of image matching rather than image stitching. By image matching I mean you need to find out how a subimage such as the bone section at (row 2, column 1) would match what is labeled as "4," the center left section, in the reference bone image.
The basic process:
Load your reference image as a 2D array (first converted to grayscale)
Load your first sample image of a subsection of bone.
Use an algorithm such as SIFT to determine the location, orientation, and scale to fit the bone subsection image onto the reference image.
Apply the fit criteria (x, y, rotation, scale) to the bone subsection image, transform it, and paste it into a black image the same size as the reference image.
Continue the process above to fit all subsections.
(Optional) With all bone subsections fitted in place, perform additional image processing operations to improve the fit, fill in gaps, etc.
From your sample images it appears that the reference and the bone section images are taken using different lighting, sometimes with the flat portion of the bone slightly tilted relative to the camera's optical axis, etc., all of which makes the image match more difficult.
SIFT is an algorithm that could help here. Note that "scale invariant" is part of the algorithm name.
https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
Given all that, your reference image and bone subsection images appear to be taken under very different circumstances, and that makes solving the problem harder than it needs to be. You'll have an easier time overall if you can control the conditions under which images are captured.
Capture all images with the same camera, with the same lighting, at roughly the same distance
For lighting, use something like a high-frequency diffuse fluorescent light
Use the same background for every image (e.g. matte black)
Making this image match a robust process means paying attention to the physical setup as well as creating your image processing algorithm.
If you need a good reference for traditional image processing techniques, find a copy of Digital Image Processing by Gonzalez and Woods. Some time spent with that book will give you better answers faster than learning image processing piecemeal online.
For practical image processing that addresses real-world concerns for implementing even simple image processing algorithms, look for Machine Vision by Davies.
I would strongly urge that you NOT look into machine learning, or try to find an answer in a more advanced image processing textbook until you run into a roadblock with more traditional methods.
I am doing some studies on eye vascularization. My project involves a machine which can detect the different blood vessels in the retinal membrane at the back of the eye. What I am looking for is a way to segment the picture and analyze each segment on its own. The segmentation consists of six squares, which I want to analyze separately for the density of white pixels.
I would be very thankful for any kind of input. I am pretty new to the programming world and I only have a bare concept of how it should work.
Thanks and Cheerio
Sam
Concept Draw
OCTA PICTURE
You could probably accomplish this by using numpy to load the image and split it into sections. You could then analyze the sections using scikit-image or opencv (though this could be difficult to get working). To view the image, you can either save it to a file using numpy, or use matplotlib to open it in a new window.
First of all, please note that in image processing "segmentation" describes the process of grouping neighbouring pixels by context.
https://en.wikipedia.org/wiki/Image_segmentation
What you want to do can be done in various ways.
The most common way is by using ROIs or AOIs (region/area of interest). That's basically some geometric shape like a rectangle, circle, polygon or similar defined in image coordinates.
The image processing is then restricted to only process pixels within that region. So you don't slice your image into pieces but you restrict your evaluation to specific areas.
Another way, like you suggested is to cut the image into pieces and process them one by one. Those sub-images are usually created using ROIs.
A third option which is rather limited but sufficient for simple tasks like yours is accessing pixels directly using coordinate offsets and several nested loops.
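As a rough illustration of the ROI idea with plain NumPy slicing, here is a small sketch. It assumes the image is already loaded as a 2D grayscale array and that the six squares form a regular 2 x 3 grid covering the whole image. The grid layout, the threshold of 128 and the function name are assumptions, so adjust them to the real picture.

```python
import numpy as np

def white_density_per_roi(image, rows=2, cols=3, threshold=128):
    """Fraction of pixels >= threshold in each cell of a rows x cols grid."""
    height, width = image.shape
    densities = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            # Rectangular ROI in image coordinates; no pixel data is copied.
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            roi = image[y0:y1, x0:x1]
            densities[r, c] = np.mean(roi >= threshold)
    return densities
```

Each entry of the result is between 0 and 1, so the six regions can be compared directly even if they differ slightly in size.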
Just google "python image processing" in combination with "library" "roi" "cropping" "sliding window" "subimage" "tiles" "slicing" and you'll get tons of information...
I would like to calculate the distance between my camera and a recognized "object".
The recognized "object" is a black rectangle sticker on a white board for example. I know the values of the rectangle (x,y).
Is there a method that I can use to calculate the distance with the values of my original rectangle, and the values of the picture of the rectangle I took with the camera?
I searched the forum for answers, but none of them dealt specifically with calculating the distance from these attributes.
I am working on a robot called Nao from Aldebaran Robotics, and I am planning to use OpenCV to recognize the black rectangle.
If you could compute the angle taken up by the image of the target, then the distance to the target should be proportional to cot (i.e. 1/tan) of that angle. You should find that the number of pixels in the image corresponds roughly to the angle, but I doubt it is completely linear, especially up close.
The behaviour of your camera lens is likely to affect this measurement, so it will depend on your exact setup.
Why not measure the size of the target at several distances, and plot a scatter graph? You could then fit a curve to the data to get a size->distance function for your particular system. If your camera is close to an "ideal" camera, then you should find this graph looks like cot, and you should be able to find your values of a and b to match dist = a * cot (b * width).
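A minimal sketch of that curve fit, assuming you have already collected pairs of measured target width (in pixels) and true distance. It uses scipy.optimize.curve_fit. The function names and the way the starting values are guessed are my own choices, not something dictated by the model.

```python
import numpy as np
from scipy.optimize import curve_fit

def cot_model(width, a, b):
    """dist = a * cot(b * width)"""
    return a / np.tan(b * width)

def fit_size_to_distance(widths_px, distances):
    """Fit a and b from calibration measurements (hypothetical helper)."""
    widths_px = np.asarray(widths_px, dtype=np.float64)
    distances = np.asarray(distances, dtype=np.float64)
    # Keep b * width below pi/2 so tan() never reaches its singularity.
    b_max = 0.99 * np.pi / (2.0 * widths_px.max())
    b0 = 0.5 * b_max
    # Starting guess for a that is consistent with the b0 guess.
    a0 = float(np.median(distances * np.tan(b0 * widths_px)))
    (a, b), _ = curve_fit(cot_model, widths_px, distances, p0=(a0, b0),
                          bounds=([0.0, 1e-12], [np.inf, b_max]))
    return a, b
```

Once a and b are known, cot_model(measured_width, a, b) gives the distance estimate for a new image. If the fit looks poor on a scatter plot, the lens is probably too far from "ideal" for this two-parameter model.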
If you try this experiment, why not post the answers here, for others to benefit from?
[Edit: a note about 'ideal' cameras]
For a camera image to look 'realistic' to us, the image should approximate projection onto a plane held in front of the eye (because camera images are viewed by us by holding a planar image in front of our eyes). Imagine holding a sheet of tracing paper up in front of your eye, and sketching the object's silhouette on that paper. The second diagram on this page shows sort of what I mean. You might describe a camera which achieves this as an "ideal" camera.
Of course, in real life, cameras don't work via tracing paper, but with lenses. Very complicated lenses. Have a look at the lens diagram on this page. For various reasons which you could spend a lifetime studying, it is very tricky to create a lens which works exactly like the tracing paper example would work under all conditions. Start with this wiki page and read on if you want to know more.
So you are unlikely to be able to compute an exact relationship between pixel length and distance: you should measure it and fit a curve.
It is a big topic. If you want to proceed from a single image, take a look at this old paper by A. Criminisi. For an in-depth view, read his Ph.D. thesis. Then start playing with the OpenCV routines in the "projective geometry" section.
I have been working on image/object recognition as well. I just released an Android app, written in Python and ported to Android, that recognizes objects, people, cars, books, logos, trees, flowers... anything :) It also shows its thought process as it "thinks" :)
I've put it out as a test for 99 cents on google play.
Here's the link if you're interested, there's also a video of it in action:
https://play.google.com/store/apps/details?id=com.davecote.androideyes
Enjoy!
:)