After doing some research and reading information about OpenCV object detection, I am still not sure on how can I detect a stick in a video frame. What would be the best way so i can detect even if the user moves it around. I'll be using the stick as a sword and make a lightsaber out of it. Any points on where I can start? Thanks!
The go-to answer for this would usually be the Hough line transform. The Hough transform is designed to find straight lines (or other contours) in the scene, and OpenCV can parameterize these lines so you get the endpoints coordinates. But, word to the wise, if you are doing lightsaber effects, you don't need to go that far - just paint the stick orange and do a chroma key. Standard feature of Adobe Premiere, Final Cut Pro, Sony Vegas, etc. The OpenCV version of this is to convert your frame to HSV color mode, and isolate regions of the picture that lie in your desired hue and saturation region.
http://opencv.itseez.com/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html?highlight=hough
Here is an old routine I wrote as an example:
//Photoshop-style color range selection with hue and saturation parameters.
//Expects input image to be in Hue-Lightness-Saturation colorspace.
//Returns a binary mask image. Hue and saturation bounds expect values from 0 to 255.
IplImage* selectColorRange(IplImage *image, double lowerHueBound, double upperHueBound,
double lowerSaturationBound, double upperSaturationBound) {
cvSetImageCOI(image, 1); //select hue channel
IplImage* hue1 = cvCreateImage(cvSize(image->width, image->height), IPL_DEPTH_8U, 1);
cvCopy(image, hue1); //copy hue channel to hue1
cvFlip(hue1, hue1); //vertical-flip
IplImage* hue2 = cvCloneImage(hue1); //clone hue image
cvThreshold(hue1, hue1, lowerHueBound, 255, CV_THRESH_BINARY); //threshold lower bound
cvThreshold(hue2, hue2, upperHueBound, 255, CV_THRESH_BINARY_INV); //threshold inverse upper bound
cvAnd(hue1, hue2, hue1); //intersect the threshold pair, save into hue1
cvSetImageCOI(image, 3); //select saturation channel
IplImage* saturation1 = cvCreateImage(cvSize(image->width, image->height), IPL_DEPTH_8U, 1);
cvCopy(image, saturation1); //copy saturation channel to saturation1
cvFlip(saturation1, saturation1); //vertical-flip
IplImage* saturation2 = cvCloneImage(saturation1); //clone saturation image
cvThreshold(saturation1, saturation1, lowerSaturationBound, 255, CV_THRESH_BINARY); //threshold lower bound
cvThreshold(saturation2, saturation2, upperSaturationBound, 255, CV_THRESH_BINARY_INV); //threshold inverse upper bound
cvAnd(saturation1, saturation2, saturation1); //intersect the threshold pair, save into saturation1
cvAnd(saturation1, hue1, hue1); //intersect the matched hue and matched saturation regions
cvReleaseImage(&saturation1);
cvReleaseImage(&saturation2);
cvReleaseImage(&hue2);
return hue1;
}
A little verbose, but you get the idea!
My old professor always said that the first law of computer vision is to do whatever you can to the image to make your job easier.
If you have control over the stick's appearance, then you might have the best luck painting the stick a very specific color --- neon pink or something that isn't likely to appear in the background --- and then using color segmentation combined with connected component labeling. That would be very fast.
You can start by following the face-recognition (training & detection) techniques written for OpenCV.
If you are looking for specific steps, let me know.
Related
I am trying to segment the blue components from a set of images. In most images where blue components have a large spread, otsu thresholded image works properly well. However, for images where blue components are minimal, the results are not ok and seems to include the non-relevant sections. Example below:
Are there ways to improve the otsu thresholding such that only relevant parts are segmented but not necessarily making the other images suffer?
I already tried global and adaptive thresholding but otsu particularly captured betters which however included unnecessary details.
Here's the code:
l_image = remove_background(image)
l_image = cv2.cvtColor(l_image, cv2.COLOR_BGR2GRAY)
ret1,th1 = cv2.threshold(l_image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
mask = (th1 != 255)
sel = np.ones_like(image)
sel[mask] = image[mask]
sel = cv2.cvtColor(sel, cv2.COLOR_HSV2BGR)
#we simply set these channels to 0 to remove excess background
sel[:,:,1] = 0
sel[:,:,2] = 0
Here's the sample image.
The main issue with the logic in your code is that you are looking for something that is distinguished primarily by color, but throw away the color information first by converting the image to grayscale.
Instead, consider looking at color properties of each pixel. One easy way to do so is to look at the HCV color space. This is a similar color space to the more common HSV, with "C" for chroma instead of "S" for saturation, where S = C / V. I'm suggesting this because it's so easy to compute the "C" channel, which is the one that would have most of the contrast in this image. Note that all the complexity is in computing "H", the hue, and that would be ideally used to find a specific color independently of its brightness, but that requires a double threshold on the "H" channel plus a threshold on the "S" channel. For this simple case, a single threshold on the "S" channel is sufficient to find the colored regions: we have only blue, we don't care about what color it is, we just want to find the color.
To compute the "C" (chroma) channel, we find the difference between the largest and the smallest of the RGB values (for each pixel independently):
rgbmax = np.amax(image, axis=2)
rgbmin = np.amin(image, axis=2)
c = rgbmax - rgbmin
As you can guess, a simple threshold of this image leads to finding the colored regions. The green background can easily be subtracted before processing, or after.
Edit: after #Cris Luengo comment, the green channel works better than the blue one.
You can apply Otsu's threshold on the green channel (of BGR).
Results are not perfect but much better.
img = img[:,:,1] #get the green channel
th, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
output:
This is the problem I am facing simplified:
Using directx I need to draw two(or more) exactly (in the same 2d plane) overlapping triangles. The triangles are semi transparent but the effect I want to release is that they clip to transparency of a single triangle. The picture below might depict the problem better.
Is there a way to do this?
I use this to get overlapping transparent triangles to not "accumulate". You need to create a blendstate and set it on output merge.
blendStateDescription.AlphaToCoverageEnable = false;
blendStateDescription.RenderTarget[0].IsBlendEnabled = true;
blendStateDescription.RenderTarget[0].SourceBlend = D3D11.BlendOption.SourceAlpha;
blendStateDescription.RenderTarget[0].DestinationBlend = D3D11.BlendOption.One; //
blendStateDescription.RenderTarget[0].BlendOperation = D3D11.BlendOperation.Maximum;
blendStateDescription.RenderTarget[0].SourceAlphaBlend = D3D11.BlendOption.SourceAlpha; //Zero
blendStateDescription.RenderTarget[0].DestinationAlphaBlend = D3D11.BlendOption.DestinationAlpha;
blendStateDescription.RenderTarget[0].AlphaBlendOperation = D3D11.BlendOperation.Maximum;
blendStateDescription.RenderTarget[0].RenderTargetWriteMask = D3D11.ColorWriteMaskFlags.All;
Hope this helps. Code is in C# but it works the same in C++ etc. Basically, takes the alpha of both source and destination, compares and takes the max. Which will always be the same (as long as you use the same alpha on both triangles) otherwise it will render the one with the most alpha.
edit: I've added a sample of what the blending does in my project. The roads here overlap. Overlap Sample
My pixel shader is as:
I pass the UV co-ords in a float4.
xy = uv coords.
w is the alpha value.
Pixel shader code
float4 pixelColourBlend;
pixelColourBlend = primaryTexture.Sample(textureSamplerStandard, input.uv.xy, 0);
pixelColourBlend.w = input.uv.w;
clip(pixelColourBlend.w - 0.05f);
return pixelColourBlend;
Ignore my responses, couldn't edit them...grrrr.
Enabling the depth stencil prevents this problem
I'm using the Phaser framework. Here is the jsfiddle:
http://jsfiddle.net/Dillybob/u3mGL/13/
Here is where the filter is getting populated:
background = game.add.sprite(0, 0);
background.width = 800;
background.height = 600;
filter = game.add.filter('Fire', 800, 600);
filter.alpha = 0.0;
background.filters = [filter];
My line object is assigned to the variable drawnObject
So I assign that object to receive the filter like so:
drawnObject.filters = [filter];
But my line is now a red fiery square instead of being a line with a fiery background, why?
Firstly, be aware that drawnObject is actually a bitmap, which is rectangular shaped. It consists of white pixels, which build your line, and transparent pixels, which are taking the rest of bitmap space.
The filter you use is a pixel shader. Pixel shader describes instructions that GPU invokes for each pixel of a provided bitmap. In case of this shader, it creates fire effect based on some noise functions, but it doesn't take original bitmap into account. The original color of pixels is not preserved, it doesn't add to final effect in any way.
To achieve your expected result, you have to amend fragmentSrc in Fire.js, so that shader uses and mixes/blends original color into final pixel color and/or doesn't change pixel transparency.
Is there any example out there of a HLSL written .fx file that splats a tiled texture with different tiles?Like this: http://messy-mind.net/blog/wp-content/uploads/2007/10/transitions.jpg you can see theres a different tile type in each square and there's a little blurring between them to make a smoother transition,but right now I just need to find a way to draw the tiles on a texture.I have a 2D array of integers,each integer equals a corresponding tile type(0 = grass,1 = stone,2 = sand).I opened up a few HLSL examples and they were really confusing.Everything is running fine on the C++ side,but HLSL is proving to be difficult.
You can use a technique called 'texture splatting'. It mixes several textures (color maps) using another texture which contains alpha values for each color map. The texture with alpha values is an equivalent of your 2D array. You can create a 3-channel RGB texture and use each channel for a different color map (in your case: R - grass, G - stone, B - sand). Every pixel of this texture tells us how to mix the color maps (for example R=0 means 'no grass', G=1 means 'full stone', B=0.5 means 'sand, half intensity').
Let's say you have four RGB textures: tex1 - grass, tex2 - stone, tex3 - sand, alpha - mixing texture. In your .fx file, you create a simple vertex shader which just calculates the position and passes the texture coordinate on. The whole thing is done in pixel shader, which should look like this:
float tiling_factor = 10; // number of texture's repetitions, you can also
// specify a seperate factor for each texture
float4 PS_TexSplatting(float2 tex_coord : TEXCOORD0)
{
float3 color = float3(0, 0, 0);
float3 mix = tex2D(alpha_sampler, tex_coord).rgb;
color += tex2D(tex1_sampler, tex_coord * tiling_factor).rgb * mix.r;
color += tex2D(tex2_sampler, tex_coord * tiling_factor).rgb * mix.g;
color += tex2D(tex3_sampler, tex_coord * tiling_factor).rgb * mix.b;
return float4(color, 1);
}
If your application supports multi-pass rendering you should use it.
You should use a multi-pass shader approach where you render the base object with the tiled stone texture in the first pass and on top render the decal passes with different shaders and different detail textures with seperate transparent alpha maps.
(Transparent map could also be stored in your detail texture, but keeping it seperate allows different tile-levels and more flexibility in reusing it.)
Additionally you can use different texture coordinate channels for each decal pass one so that you do not need to hardcode your tile level.
So for minimum you need two shaders, whereas Shader 2 is used as often as decals you need.
Shader to render tiled base texture
Shader to render one tiled detail texture using a seperate transparency map.
If you have multiple decals z-fighting can occur and you should offset your polygons a little. (Very similar to basic simple fur rendering.)
Else you need a single shader which takes multiple textures and lays them on top of the base tiled texture, this solution is less flexible, but you can use one texture for the mix between the textures (equals your 2D-array).
I have code that needs to render regions of my object differently depending on their location. I am trying to use a colour map to define these regions.
The problem is when I sample from my colour map, I get collisions. Ie, two regions with different colours in the colourmap get the same value returned from the sampler.
I've tried various formats of my colour map. I set the colours for each region to be "5" apart in each case;
Indexed colour
RGB, RGBA: region 1 will have RGB 5%,5%,5%. region 2 will have RGB 10%,10%,10% and so on.
HSV Greyscale: region 1 will have HSV 0,0,5%. region 2 will have HSV 0,0,10% and so on.
(Values selected in The Gimp)
The tex2D sampler returns a value [0..1].
[ I then intend to derive an int array index from region. Code to do with that is unrelated, so has been removed from the question ]
float region = tex2D(gColourmapSampler,In.UV).x;
Sampling the "5%" colour gave a "region" of 0.05098 in hlsl.
From this I assume the 5% represents 5/100*255, or 12.75, which is rounded to 13 when stored in the texture. (Reasoning: 0.05098 * 255 ~= 13)
By this logic, the 50% should be stored as 127.5.
Sampled, I get 0.50196 which implies it was stored as 128.
the 70% should be stored as 178.5.
Sampled, I get 0.698039, which implies it was stored as 178.
What rounding is going on here?
(127.5 becomes 128, 178.5 becomes 178 ?!)
Edit: OK,
http://en.wikipedia.org/wiki/Bankers_rounding#Round_half_to_even
Apparently this is "banker's rounding". I have no idea why this is being used, but it solves my problem. Apparently, it's a Gimp issue.
I am using Shader Model 2 and FX Composer. This is my sampler declaration;
//Colour map
texture gColourmapTexture <
string ResourceName = "Globe_Colourmap_Regions_Greyscale.png";
string ResourceType = "2D";
>;
sampler2D gColourmapSampler : register(s1) = sampler_state {
Texture = <gColourmapTexture>;
#if DIRECT3D_VERSION >= 0xa00
Filter = MIN_MAG_MIP_LINEAR;
#else /* DIRECT3D_VERSION < 0xa00 */
MinFilter = Linear;
MipFilter = Linear;
MagFilter = Linear;
#endif /* DIRECT3D_VERSION */
AddressU = Clamp;
AddressV = Clamp;
};
I never used HLSL, but I did use GLSL a while back (and I must admit it's terribly far in my head).
One issue I had with textures is that 0 is not the first pixel. 1 is not the second one. 0 is the edge of the texture and 1 is the right edge of the first pixel. The values get interpolated automatically and that can cause serious trouble if what you need is precision like when applying a lookup table rather than applying a normal texture. You need to aim for the middle of the pixel, so asking for [0.5,0.5], [1.5,0.5] rather than [0,0], [1, 0] and so on.
At least, that's the way it was in GLSL.
Beware: region in levels[region] is rounded down. When you see 5 % in your image editor, the actual value in the texture 8b representation is 5/100*255 = 12.75, which may be either 12 or 13. If it is 12, the rounding down will hit you. If you want rounding to nearest, you need to change this to levels[region+0.5].
Another similar thing (already written by Louis-Philippe) which might hit you is texture coordinates rounding rules. You always need to hit a spot in the texel so that you are not in between of two texels, otherwise the result is ill-defined (you may get any of two randomly) and some of your source texels may disapper while other duplicate. Those rules are different for bilinar and point sampling, you may need to add half of texel size when sampling to compensate for this.
GIMP uses banker's rounding. Apparently.
This threw out my code to derive region indicies.