I have a grey-scale image.
I extract SIFT keypoints from it.
I can define the density of each SIFT keypoint as the number of SIFT keypoints found in a neighbourhood of radius 'r' around it.
My dilemma is how to decide the density threshold automatically for any image.
I have considered choosing the top 50 densest keypoints, but that is very naive.
I am looking for some parameter of the distribution that does not change much if a few SIFT keypoints are removed.
Any suggestion on automatically choosing 'r' would also be welcome.
Thanks.
PS: My final aim is to treat the relative arrangement of nearby dense SIFT keypoints as an identifier of the object.
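A plain-NumPy sketch of the density definition above (the keypoint coordinates and radius are made up for illustration); one distribution-based alternative to a fixed top-50 cutoff would be a percentile of the density values:

```python
import numpy as np

def keypoint_density(points, r):
    """Density of each keypoint: number of other keypoints within radius r."""
    pts = np.asarray(points, dtype=float)
    # Pairwise squared distances via broadcasting (fine for a few thousand points).
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    # Count neighbours inside the radius, excluding the point itself.
    return (d2 <= r * r).sum(axis=1) - 1

# Hypothetical (x, y) keypoint coordinates.
kps = [(0, 0), (1, 0), (0, 1), (10, 10)]
density = keypoint_density(kps, r=2.0)
# A cutoff taken from the distribution itself is less arbitrary than "top 50":
threshold = np.percentile(density, 75)
```

Percentiles are fairly stable when a few keypoints are added or removed, which is the robustness property asked for.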
I have a small set of aerial images where the different terrains visible in each image have been labelled by human experts. For example, an image may contain vegetation, river, rocky mountains, farmland etc. Each image may have one or more of these labelled regions. Using this small labelled dataset, I would like to fit a Gaussian mixture model for each of the known terrain types. After this is complete, I would have N GMMs for the N types of terrain that I might encounter in an image.
Now, given a new image, I would like to determine for each pixel, which terrain it belongs to by assigning the pixel to the most probable GMM.
Is this the correct line of thought? And if so, how can I go about clustering an image using GMMs?
It's not clustering if you use labeled training data!
You can, however, use the labeling function of GMM clustering easily.
For this, compute the prior probabilities, the means and the covariance matrices, and invert the covariances. Then classify each pixel of the new image by the maximum probability density (weighted by the prior probabilities) using the multivariate Gaussians estimated from the training data.
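A minimal sketch of that procedure, assuming a single Gaussian per class for brevity (a full GMM would just sum several weighted component densities); the 2-D "pixel features" below are made up:

```python
import numpy as np

def fit_gaussian(X):
    """Mean and covariance of one terrain class from its labelled pixels (n x d)."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def log_density(x, mu, cov):
    """Log of the multivariate Gaussian density at x."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(cov)          # in practice, invert once per class up front
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ inv @ diff)

def classify(x, class_params, priors):
    """Pick the class with the highest prior-weighted density."""
    scores = [np.log(p) + log_density(x, mu, cov)
              for (mu, cov), p in zip(class_params, priors)]
    return int(np.argmax(scores))

# Toy 2-D "pixel features" for two terrain classes.
X0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X1 = X0 + 10.0
params = [fit_gaussian(X0), fit_gaussian(X1)]
label = classify(np.array([0.5, 0.5]), params, priors=[0.5, 0.5])
```

Working in log space avoids underflow when densities get very small.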
Intuitively, your thought process is correct. If you already have the labels, that makes this a lot easier.
For example, let's pick a very well known non-parametric algorithm, k-Nearest Neighbors: https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
In this algorithm, for each new pixel you would find the k closest training pixels to the one you are currently evaluating, where "closest" is determined by some distance function (usually Euclidean). From there, you would assign the new pixel the most frequently occurring classification label among those k neighbours.
I am not sure if you are looking for a specific algorithm recommendation, but KNN would be a very good algorithm to begin testing this type of exercise with. I saw you tagged sklearn; scikit-learn has a very good KNN implementation I suggest you read up on.
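A minimal sketch of the voting step, using plain NumPy rather than scikit-learn's KNeighborsClassifier; the RGB training pixels are made up:

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training pixels."""
    dist = np.linalg.norm(train_X - x, axis=1)   # Euclidean distance to every training pixel
    nearest = train_y[np.argsort(dist)[:k]]      # labels of the k closest ones
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]             # most frequent label wins

# Made-up RGB training pixels: class 0 = dark terrain, class 1 = bright terrain.
train_X = np.array([[0, 0, 0], [10, 10, 10], [5, 5, 5],
                    [250, 250, 250], [240, 240, 240], [255, 255, 255]], dtype=float)
train_y = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(train_X, train_y, np.array([8.0, 8.0, 8.0]))
```

In practice you would vectorise this over all pixels, or simply use the scikit-learn implementation mentioned above.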
My quest is to gather information on the edges a particular image has, for the purpose of content-based image retrieval.
What I have in mind is :
a. apply a Gaussian filter to soften/blur the image.
b. apply the Scharr operator to sharpen it.
c. apply Canny edge detection.
d. somehow extract information on edges at 0, 45, 90 and 135 degrees (Hough transform, maybe?).
Does anybody have a suggestion on what I have so far planned and how I can extract the information on the edges?
Thanks!
Why make it complicated? First of all: the Canny operator already includes blurring, so why do you want to pre-blur the image? Also, sharpening is not necessary for edge detection.
You can use the Sobel operator to calculate the direction of the detected edges. To do so, first apply the filter in the x and y directions, then calculate for each edge pixel the orientation angle θ = atan(Gy/Gx), where Gy is the pixel in the vertical edge map and Gx is the pixel in the horizontal edge map.
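A self-contained sketch of that computation, with a hand-rolled 3x3 Sobel pass instead of cv2.Sobel and a synthetic ramp image; atan2 is used instead of atan(Gy/Gx) so that Gx == 0 does not cause a division by zero:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3(img, k):
    """Apply a 3x3 filter ('valid' region only; real code would pad the borders)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (img[i:i + 3, j:j + 3] * k).sum()
    return out

# Synthetic image whose intensity increases left to right: purely vertical edges.
img = np.tile(np.arange(6, dtype=float), (6, 1))
gx = filter3(img, SOBEL_X)                 # horizontal gradient map
gy = filter3(img, SOBEL_Y)                 # vertical gradient map
theta = np.degrees(np.arctan2(gy, gx))     # 0 degrees everywhere for this ramp
```

Binning theta at the edge pixels into 0/45/90/135-degree bins then gives the orientation information asked about in the question.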
I'm a newbie in image processing. I recently tried to use filter2D to reduce noise in an RGB image, and it works well. But I don't understand how it works manually on the image matrix. Can anybody explain how it works by hand?
This is the input matrix and the output matrix I get.
Input Image Matrix
Output Image Matrix
Thanks for your help. :)
As a short answer, filtering an image means applying a filter (or kernel) to it, i.e. convolving the image with this kernel. For that, you take each pixel of your image and consider a neighbourhood around it. You apply the kernel to the neighbourhood by multiplying each pixel of the neighbourhood with the corresponding kernel coefficient and summing all these values.
For a pixel, this can be summarized by this figure (source) :
For example, by setting all the coefficients to 1/N (where N is the number of elements in your kernel), you compute the average intensity of your neighbourhood.
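As a concrete instance, here is that 1/9 averaging kernel applied to the centre pixel of a made-up 3x3 patch:

```python
import numpy as np

# 3x3 averaging kernel: every coefficient is 1/9 (N = 9 elements).
kernel = np.full((3, 3), 1.0 / 9.0)

# A made-up 3x3 neighbourhood of pixel intensities.
patch = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]], dtype=float)

# Multiply each pixel by the matching kernel coefficient and sum:
# this yields the mean intensity of the neighbourhood.
filtered = (patch * kernel).sum()
```

Sliding that operation over every pixel of the image is exactly what filter2D does for you.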
You can see https://en.wikipedia.org/wiki/Multidimensional_discrete_convolution for more information about image convolutions.
OpenCV's documentation gives some practical examples of image smoothing.
Hope it helps
Please, I would like to classify a set of images into 4 classes with SIFT descriptors and an SVM. Using the SIFT extractor I get a different number of keypoints per image, e.g. img1 has 100 keypoints, img2 has 55 keypoints, and so on. How do I build histograms that give fixed-size vectors in MATLAB?
In this case, dense SIFT is perhaps a good choice.
There are two main stages:
Stage 1: Creating a codebook.
Divide the input image into a set of sub-images.
Apply SIFT on each sub-image. Each keypoint will have a 128-dimensional feature vector.
Encode these vectors into a codebook by simply applying k-means clustering with a chosen k. Each image produces a matrix Vi (i <= n, where n is the number of images used to create the codebook) of size 128 * m, where m is the number of keypoints gathered from that image. The input to k-means is therefore a big matrix V created by the horizontal concatenation of all the Vi. The output of k-means is a matrix C of size 128 * k.
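A toy version of the codebook step, using plain-NumPy k-means on 2-D row vectors instead of the 128-dimensional column layout described above; the descriptor values are invented:

```python
import numpy as np

def kmeans(V, k, iters=20, seed=0):
    """Plain k-means on the rows of V; returns a k x d codebook matrix C."""
    rng = np.random.default_rng(seed)
    C = V[rng.choice(len(V), size=k, replace=False)]  # random initial codewords
    for _ in range(iters):
        # Assign every descriptor to its nearest codeword.
        idx = np.linalg.norm(V[:, None] - C[None], axis=2).argmin(axis=1)
        # Move each codeword to the mean of its assigned descriptors.
        for j in range(k):
            if (idx == j).any():
                C[j] = V[idx == j].mean(axis=0)
    return C

# Invented 2-D "descriptors" forming two obvious clusters.
V = np.array([[0, 0], [1, 0], [0, 1],
              [10, 10], [11, 10], [10, 11]], dtype=float)
C = kmeans(V, k=2)
```

In MATLAB the built-in kmeans function plays this role; the sketch only shows what it computes.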
Stage 2: Calculating Histograms.
For each image in the dataset, do the following:
Create a histogram vector h of size k and initialize it to zeros.
Apply dense SIFT as in step 2 of stage 1.
For each keypoint's vector, find the index of its "best match" vector in the codebook matrix C (e.g. the one at minimum Euclidean distance).
Increase the bin corresponding to this index in h by 1.
Normalize h by L1 or L2 norms.
Now h is ready for classification.
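Stage 2 can be put together in a few lines; this sketch uses 2-D toy descriptors instead of 128-D SIFT, stores codewords as rows rather than columns, and applies L1 normalisation:

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Map each descriptor to its nearest codeword and return a normalised histogram."""
    # Distance from every descriptor to every codeword.
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    idx = d.argmin(axis=1)                            # best-match codeword index
    h = np.bincount(idx, minlength=len(codebook))
    return h / h.sum()                                # L1 normalisation

# Toy codebook (k = 2 codewords) and toy descriptors for one image.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
desc = np.array([[0.5, 0.2], [9.5, 10.1], [0.1, 0.3], [10.2, 9.9]])
h = bow_histogram(desc, codebook)
```

Every image now yields a k-dimensional vector h regardless of how many keypoints it had, which is exactly the fixed-size input the SVM needs.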
Another possibility is to use Fisher vectors instead of a codebook: https://hal.inria.fr/file/index/docid/633013/filename/jegou_aggregate.pdf
You will always get a different number of keypoints for different images, but the size of the feature vector of each descriptor remains the same, i.e. 128. People prefer using vector quantization or k-means clustering to build a bag-of-words histogram. You can have a look at this thread.
Using the conventional SIFT approach you will never have the same number of keypoints in every image. One way of achieving that is to sample the descriptors densely, using dense SIFT, which places a regular grid on top of the image. If all images have the same size, then you will have the same number of keypoints per image.
I retrieve contours from images using the Canny algorithm. Is it enough to feed the resulting edge image into an SVM to find similarities? Or do I necessarily need other features like elongation, perimeter and area?
I ask because I was inspired by this example: http://scikit-learn.org/dev/auto_examples/plot_digits_classification.html . I fed in my image in greyscale first, then after Canny edge detection, and in both cases my confusion matrix was full of zeros for the precision, recall, f1-score and support measures.
My advice is:
unless you have a low number of images in your database and/or the recognition is going to be really specific (not a random thing, for example), I would highly recommend applying one or more feature extractors such as SIFT, Fourier descriptors, Haralick features or the Hough transform to extract more details, which can be summarised in a short vector.
Then you could apply an SVM to all of this in order to get more accuracy.
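If you do want the simple region features the question mentions (area, perimeter, elongation), they can be sketched from a binary mask like this; the 4-neighbour perimeter estimate and the rectangular test mask are just for illustration:

```python
import numpy as np

def shape_features(mask):
    """Area, a 4-neighbour perimeter estimate, and elongation of a binary region."""
    ys, xs = np.nonzero(mask)
    area = len(xs)
    # Boundary pixels: region pixels with at least one 4-neighbour outside the region.
    p = np.pad(mask, 1)
    interior = p[1:-1, :-2] & p[1:-1, 2:] & p[:-2, 1:-1] & p[2:, 1:-1]
    perimeter = int((mask & ~interior).sum())
    # Elongation: ratio of the principal axes of the pixel-coordinate covariance.
    cov = np.cov(np.vstack([xs, ys]).astype(float))
    eig = np.sort(np.linalg.eigvalsh(cov))
    elongation = float(np.sqrt(eig[1] / max(eig[0], 1e-12)))
    return area, perimeter, elongation

# Made-up example: a 4 x 6 rectangular region in a 6 x 8 mask.
mask = np.zeros((6, 8), dtype=bool)
mask[1:5, 1:7] = True
area, perimeter, elongation = shape_features(mask)
```

Concatenating a few such numbers with the other descriptors gives the short vector suggested above as the SVM input.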