I am trying to segment the blue components from a set of images. In most images, where the blue components have a large spread, the Otsu-thresholded image works well. However, for images where the blue components are minimal, the results are not OK and seem to include non-relevant sections. Example below:
Are there ways to improve the Otsu thresholding so that only the relevant parts are segmented, without making the other images suffer?
I already tried global and adaptive thresholding, but Otsu in particular captured the regions better, although it also included unnecessary details.
Here's the code:
import cv2
import numpy as np

l_image = remove_background(image)  # remove_background is a user-defined helper
l_image = cv2.cvtColor(l_image, cv2.COLOR_BGR2GRAY)
ret1, th1 = cv2.threshold(l_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
mask = (th1 != 255)
sel = np.ones_like(image)
sel[mask] = image[mask]
sel = cv2.cvtColor(sel, cv2.COLOR_HSV2BGR)
# we simply set these channels to 0 to remove excess background
sel[:, :, 1] = 0
sel[:, :, 2] = 0
Here's the sample image.
The main issue with the logic in your code is that you are looking for something that is distinguished primarily by color, yet you throw away the color information first by converting the image to grayscale.
Instead, consider looking at the color properties of each pixel. One easy way to do so is to look at the HCV color space. This is similar to the more common HSV, with "C" for chroma instead of "S" for saturation, where S = C / V. I'm suggesting it because the "C" channel is very easy to compute, and it is the channel with most of the contrast in this image. Note that all the complexity is in computing "H", the hue. Hue would ideally be used to find a specific color independently of its brightness, but that requires a double threshold on the "H" channel plus a threshold on the "S" channel. For this simple case, a single threshold on the "C" channel is sufficient to find the colored regions: we have only blue, so we don't care what color it is, we just want to find the color.
To compute the "C" (chroma) channel, we find the difference between the largest and the smallest of the RGB values (for each pixel independently):
# Per-pixel chroma: the difference between the largest and smallest channel values.
# (Channel order does not matter here, so this works directly on a BGR image.)
rgbmax = np.amax(image, axis=2)
rgbmin = np.amin(image, axis=2)
c = rgbmax - rgbmin
As you can guess, a simple threshold of this image leads to finding the colored regions. The green background can easily be subtracted before processing, or after.
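As a sketch of that step (continuing from the snippet above, with image being the original BGR input; the mask will also pick up the green background, which would then be removed before or after):

import cv2

# Otsu picks the threshold on the chroma channel automatically;
# the colored regions come out white in the mask.
_, mask = cv2.threshold(c, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
sel = cv2.bitwise_and(image, image, mask=mask)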
Edit: following @Cris Luengo's comment, the green channel works better than the blue one.
You can apply Otsu's threshold on the green channel (of BGR).
Results are not perfect but much better.
img = img[:,:,1] #get the green channel
th, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
output:
I have a study which provides the length and width values of the objects in an image. What I need is to get exact measurements for length and width, but my results deviate slightly, and I need to reach the exact values.
I have a working program, but it needs to be improved to reach the best result.
from imutils import contours, perspective
import imutils
import cv2
import numpy as np

# Sort the detected contours left-to-right, then draw a rotated bounding box around each one.
(cnts, _) = contours.sort_contours(cnts)
for cnt in cnts:
    box = cv2.minAreaRect(cnt)
    box = cv2.cv.BoxPoints(box) if imutils.is_cv2() else cv2.boxPoints(box)
    box = np.array(box, dtype="float")
    box = perspective.order_points(box)
    cv2.drawContours(orig, [box.astype("int")], -1, (0, 255, 0), 1)
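For context, here is one way orig and cnts in the snippet above might be prepared. This is only a sketch of typical edge-based preprocessing, since the original preprocessing is not shown, and the filename is a placeholder:

import cv2
import imutils

image = cv2.imread("test_image.png")  # hypothetical filename for the test image
orig = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)
edged = cv2.Canny(gray, 50, 100)
edged = cv2.dilate(edged, None, iterations=1)  # close small gaps in the edges
edged = cv2.erode(edged, None, iterations=1)
found = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(found)  # handles the differing OpenCV 2/3/4 return values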
To show the dataset I have, I am sharing my test image:
It detects the contours inside the purple lines, but I would like it to follow the yellow lines instead.
What should I update in my code to reach this aim?
I came across this PDF file, and I wanted to write my own Fractal Flame generator. I'm trying to understand what the sections on Log-Density Display, Coloring, and Gamma Factor are trying to say though. As of now I think it says to use this algorithm to determine the [0-255] value of each color channel for an opaque image:
var log_log = log(pixel_counter)/log(max_counter),
alpha_gamma_factor = color_channel*log_log^(1/gamma),
color_gamma_factor = log_log*color_channel^(1/gamma),
vibrant_color = vibrancy*alpha_gamma_factor+(1-vibrancy)*color_gamma_factor,
corrected_color_channel = floor(256*vibrant_color);
Where vibrancy and color_channel are [0, 1), the counters are integers, and gamma is a value between sqrt(5) and sqrt(16) (or ~2.2 and 4).
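To be concrete, here is the same algorithm as a small Python function, transcribed directly from the pseudocode above (with ^ read as exponentiation; whether the formulas themselves are right is exactly what I'm asking):

from math import log, floor

def corrected_channel(color_channel, pixel_counter, max_counter, gamma, vibrancy):
    # Direct transcription of the pseudocode above; expects color_channel and
    # vibrancy in [0, 1), pixel_counter >= 1, and max_counter > 1.
    log_log = log(pixel_counter) / log(max_counter)
    alpha_gamma_factor = color_channel * log_log ** (1.0 / gamma)
    color_gamma_factor = log_log * color_channel ** (1.0 / gamma)
    vibrant_color = vibrancy * alpha_gamma_factor + (1 - vibrancy) * color_gamma_factor
    return floor(256 * vibrant_color)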
Please let me know if this is right, and if not, how I should change the algorithm. I'd also like to make a variation of the algorithm that supports semi-transparency, as the output will be PNG files. What would be the highest quality algorithm for the alpha channel for [0, 255] (transparent to opaque respectively) in that case?
One simple way is to say that when the RGB components are equal, they form a gray color.
However, this is not the whole story, because if they only have a slight difference, they will still look gray.
Assuming the viewer has a healthy vision of color, how can I decide if the given values would be perceived as gray (presumably with an adjustable threshold level for "grayness")?
A relatively straightforward method would be to convert the RGB value to the HSV color space and apply a threshold to the saturation component, e.g. "if saturation < 0.05 then 'almost grey', else not grey".
Saturation is actually the "grayness/colorfulness" by definition.
This method is much more accurate than using differences between the R, G and B channels (since the human eye perceives saturation differently on light and dark colors). On the other hand, converting RGB to HSV is computationally more expensive than comparing the channels directly. It is up to you to decide what is of more value: a precise answer (grey/not grey) or performance.
If you need an even more precise method, you may use L*a*b* color space and compute chroma as sqrt(a*a + b*b) (see here), and then apply thresholding to this value. However, this would be even more computationally intensive.
You can also combine multiple methods:
Calculate simple differences between R, G, B components. If the color can be identified as definitely desaturated (e.g. max(abs(R-G), abs(R-B), abs(G-B)) <= 5) or definitely saturated (e.g. max(abs(R-G), abs(R-B), abs(G-B)) > 100), then stop.
Otherwise, convert to L*a*b*, compute chroma as sqrt(a*a + b*b) and use thresholding on this value.
r = 160;
g = 179;
b = 151;
tolerance = 20;
if (Math.abs(r - g) < tolerance && Math.abs(r - b) < tolerance) {
    // then perceived as gray
}
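Putting the combined approach described above (quick RGB differences first, L*a*b* chroma only for borderline colors) into code, a minimal Python sketch; the thresholds are illustrative, not calibrated, and OpenCV is assumed to be available for the RGB to L*a*b* conversion:

import numpy as np
import cv2

def looks_gray(r, g, b, chroma_threshold=10.0):
    # Step 1: cheap check on the channel differences.
    diff = max(abs(r - g), abs(r - b), abs(g - b))
    if diff <= 5:
        return True          # definitely desaturated
    if diff > 100:
        return False         # definitely saturated
    # Step 2: borderline case -- convert the pixel to L*a*b* and measure chroma.
    pixel = np.array([[[r, g, b]]], dtype=np.float32) / 255.0
    _, a, bb = cv2.cvtColor(pixel, cv2.COLOR_RGB2Lab)[0, 0]
    return np.hypot(a, bb) < chroma_threshold   # chroma = sqrt(a*a + b*b)

print(looks_gray(160, 179, 151))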
After doing some research and reading about OpenCV object detection, I am still not sure how I can detect a stick in a video frame. What would be the best way to detect it even if the user moves it around? I'll be using the stick as a sword and making a lightsaber out of it. Any pointers on where I can start? Thanks!
The go-to answer for this would usually be the Hough line transform. The Hough transform is designed to find straight lines (or other contours) in the scene, and OpenCV can parameterize these lines so you get the endpoint coordinates. But, word to the wise, if you are doing lightsaber effects, you don't need to go that far - just paint the stick orange and do a chroma key. Standard feature of Adobe Premiere, Final Cut Pro, Sony Vegas, etc. The OpenCV version of this is to convert your frame to HSV color mode and isolate regions of the picture that lie in your desired hue and saturation range.
http://opencv.itseez.com/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html?highlight=hough
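For illustration, a minimal HoughLinesP sketch in Python; the filename is a placeholder and the Canny/Hough parameters would need tuning for a real video frame:

import cv2
import numpy as np

frame = cv2.imread("frame.png")  # hypothetical single video frame
edges = cv2.Canny(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 50, 150)
# The probabilistic Hough transform returns line segments as (x1, y1, x2, y2) endpoints.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)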
Here is an old routine I wrote as an example:
//Photoshop-style color range selection with hue and saturation parameters.
//Expects input image to be in Hue-Lightness-Saturation colorspace.
//Returns a binary mask image. Hue and saturation bounds expect values from 0 to 255.
IplImage* selectColorRange(IplImage *image, double lowerHueBound, double upperHueBound,
double lowerSaturationBound, double upperSaturationBound) {
cvSetImageCOI(image, 1); //select hue channel
IplImage* hue1 = cvCreateImage(cvSize(image->width, image->height), IPL_DEPTH_8U, 1);
cvCopy(image, hue1); //copy hue channel to hue1
cvFlip(hue1, hue1); //vertical-flip
IplImage* hue2 = cvCloneImage(hue1); //clone hue image
cvThreshold(hue1, hue1, lowerHueBound, 255, CV_THRESH_BINARY); //threshold lower bound
cvThreshold(hue2, hue2, upperHueBound, 255, CV_THRESH_BINARY_INV); //threshold inverse upper bound
cvAnd(hue1, hue2, hue1); //intersect the threshold pair, save into hue1
cvSetImageCOI(image, 3); //select saturation channel
IplImage* saturation1 = cvCreateImage(cvSize(image->width, image->height), IPL_DEPTH_8U, 1);
cvCopy(image, saturation1); //copy saturation channel to saturation1
cvFlip(saturation1, saturation1); //vertical-flip
IplImage* saturation2 = cvCloneImage(saturation1); //clone saturation image
cvThreshold(saturation1, saturation1, lowerSaturationBound, 255, CV_THRESH_BINARY); //threshold lower bound
cvThreshold(saturation2, saturation2, upperSaturationBound, 255, CV_THRESH_BINARY_INV); //threshold inverse upper bound
cvAnd(saturation1, saturation2, saturation1); //intersect the threshold pair, save into saturation1
cvAnd(saturation1, hue1, hue1); //intersect the matched hue and matched saturation regions
cvReleaseImage(&saturation1);
cvReleaseImage(&saturation2);
cvReleaseImage(&hue2);
return hue1;
}
A little verbose, but you get the idea!
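For what it's worth, in modern OpenCV (Python) the same hue-plus-saturation selection boils down to a single cv2.inRange call. A rough equivalent of the routine above might look like this (note that OpenCV's 8-bit hue channel runs 0-179, not 0-255):

import cv2
import numpy as np

def select_color_range(bgr_frame, lower_hue, upper_hue, lower_sat, upper_sat):
    # Returns a binary mask of pixels whose hue and saturation fall inside the bounds.
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    lower = np.array([lower_hue, lower_sat, 0], dtype=np.uint8)
    upper = np.array([upper_hue, upper_sat, 255], dtype=np.uint8)
    return cv2.inRange(hsv, lower, upper)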
My old professor always said that the first law of computer vision is to do whatever you can to the image to make your job easier.
If you have control over the stick's appearance, then you might have the best luck painting the stick a very specific color --- neon pink or something that isn't likely to appear in the background --- and then using color segmentation combined with connected component labeling. That would be very fast.
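A sketch of that idea, assuming the stick really is painted a color the background never contains (the filename and HSV bounds below are placeholders for something like neon pink):

import cv2
import numpy as np

frame = cv2.imread("frame.png")  # hypothetical single video frame
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (140, 100, 100), (170, 255, 255))  # placeholder pink-ish bounds

# Connected component labeling: keep the largest non-background blob as the stick.
num, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
if num > 1:
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    x, y, w, h = [int(v) for v in stats[largest, :4]]
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)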
You can start by following the face-recognition (training & detection) techniques written for OpenCV.
If you are looking for specific steps, let me know.
I have code that needs to render regions of my object differently depending on their location. I am trying to use a colour map to define these regions.
The problem is that when I sample from my colour map, I get collisions, i.e. two regions with different colours in the colour map return the same value from the sampler.
I've tried various formats for my colour map. I set the colours for each region to be 5% apart in each case:
Indexed colour
RGB, RGBA: region 1 will have RGB 5%,5%,5%. region 2 will have RGB 10%,10%,10% and so on.
HSV Greyscale: region 1 will have HSV 0,0,5%. region 2 will have HSV 0,0,10% and so on.
(Values selected in The Gimp)
The tex2D sampler returns a value [0..1].
[ I then intend to derive an int array index from region. Code to do with that is unrelated, so has been removed from the question ]
float region = tex2D(gColourmapSampler,In.UV).x;
Sampling the "5%" colour gave a "region" of 0.05098 in HLSL.
From this I assume the 5% represents 5/100*255, or 12.75, which is rounded to 13 when stored in the texture. (Reasoning: 0.05098 * 255 ~= 13)
By this logic, the 50% should be stored as 127.5.
Sampled, I get 0.50196 which implies it was stored as 128.
the 70% should be stored as 178.5.
Sampled, I get 0.698039, which implies it was stored as 178.
What rounding is going on here?
(127.5 becomes 128, 178.5 becomes 178 ?!)
Edit: OK,
http://en.wikipedia.org/wiki/Bankers_rounding#Round_half_to_even
Apparently this is "banker's rounding". I have no idea why it is being used, but it solves my problem; it seems to be a GIMP issue.
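As a quick sanity check, round-half-to-even reproduces exactly the stored values observed above; Python's built-in round() happens to use the same rule:

# 5%  -> 12.75 -> 13  (ordinary nearest rounding, not a half-way case)
# 50% -> 127.5 -> 128 (half-way, rounds to the even neighbour)
# 70% -> 178.5 -> 178 (half-way, rounds to the even neighbour)
for pct in (5, 50, 70):
    exact = pct * 255 / 100
    print(pct, exact, round(exact))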
I am using Shader Model 2 and FX Composer. This is my sampler declaration;
//Colour map
texture gColourmapTexture <
string ResourceName = "Globe_Colourmap_Regions_Greyscale.png";
string ResourceType = "2D";
>;
sampler2D gColourmapSampler : register(s1) = sampler_state {
Texture = <gColourmapTexture>;
#if DIRECT3D_VERSION >= 0xa00
Filter = MIN_MAG_MIP_LINEAR;
#else /* DIRECT3D_VERSION < 0xa00 */
MinFilter = Linear;
MipFilter = Linear;
MagFilter = Linear;
#endif /* DIRECT3D_VERSION */
AddressU = Clamp;
AddressV = Clamp;
};
I never used HLSL, but I did use GLSL a while back (and I must admit the details are now pretty hazy).
One issue I had with textures is that 0 is not the first pixel and 1 is not the second one: 0 is the left edge of the first texel, and 1 is its right edge. The values get interpolated automatically, and that can cause serious trouble if what you need is precision, as when applying a lookup table rather than a normal texture. You need to aim for the middle of each texel, so ask for [0.5, 0.5], [1.5, 0.5], and so on rather than [0, 0], [1, 0].
At least, that's the way it was in GLSL.
Beware: region in levels[region] is rounded down. When you see 5% in your image editor, the actual value in the texture's 8-bit representation is 5/100*255 = 12.75, which may be stored as either 12 or 13. If it is 12, the rounding down will hit you. If you want rounding to nearest, change this to levels[region + 0.5].
Another similar thing (already mentioned by Louis-Philippe) which might hit you is the texture coordinate rounding rules. You always need to hit a spot inside a texel, so that you are not between two texels, otherwise the result is ill-defined (you may get either of the two at random), and some of your source texels may disappear while others duplicate. The rules are different for bilinear and point sampling; you may need to add half a texel to the coordinates when sampling to compensate.
GIMP uses banker's rounding. Apparently.
This threw off my code for deriving region indices.