Detect the color of a particular area of an image with Node.js OpenCV

I'm trying to write code to detect the color of a particular area of an image.
So far I have found that this can be done with OpenCV, but I haven't come across any tutorial that covers this particular task.
I want to do this with JavaScript, but I can also use Python OpenCV to get the results.
Can anyone share a useful link or explain how I can detect the color of a particular area in an image?
For example:
The box marked in red will show a different color; I need to figure out which color it is showing.
What I have tried:
I have tried OpenCV Canny edge detection; although I managed to separate the areas with Canny, detecting the color of a particular area is still a challenge.
I also tried the inRange method from OpenCV, which works well:
import cv2
import numpy as np

# lower and upper are the BGR boundary arrays defined earlier
# find the colors within the specified boundaries and apply
# the mask
mask = cv2.inRange(image, lower, upper)
output = cv2.bitwise_and(image, image, mask=mask)

# show the images
cv2.imshow("images", np.hstack([image, output]))
cv2.waitKey(0)
It works well and extracts the colored area from the image, but is there any callback that responds when the image contains a particular color, so that it can all be done automatically?

So I am assuming here that you already know the location of the rect, which may change dynamically, and that you need to find the single most dominant color in the desired ROI. There are a lot of ways to do this: one is to take the average of all the pixels in the ROI, another is to count the distinct pixel values in the given ROI with some tolerance.
Method 1:
import cv2
import numpy as np
img = cv2.imread("path/to/img.jpg")
region_of_interest = (356, 88, 495, 227) # left, top, right, bottom
cropped_img = img[region_of_interest[1]:region_of_interest[3], region_of_interest[0]:region_of_interest[2]]
print(cv2.mean(cropped_img))
>>> (53.430516018839604, 41.05708814243569, 244.54991977640907, 0.0)
Method 2:
To find the various dominant color clusters in the given image, you can use cv2.kmeans():
import cv2
import numpy as np
img = cv2.imread("path/to/img.jpg")
region_of_interest = (356, 88, 495, 227)
cropped_img = img[region_of_interest[1]:region_of_interest[3], region_of_interest[0]:region_of_interest[2]]
Z = cropped_img.reshape((-1, 3))
Z = np.float32(Z)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 4
ret, label, center = cv2.kmeans(Z, K, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# Sort the cluster centers by their frequencies and print the most common one:
print(center[sorted(range(K), key=lambda x: np.count_nonzero(label == [x]), reverse=True)[0]])
>>> [ 52.96525192 40.93861389 245.02325439]

@Prateek... nice to have the question narrowed down to the core. The code you provided does not address the issue at hand and remains just a question. I'll hint you towards a direction, but you have to code it yourself.
Steps that guide you towards a scripted result:
1) In your script, add two pixel lists (past and current) to store values (pixel type + occurrence).
2) Introduce a while loop with a continue/stop condition (linked to step 3), so the check becomes a dynamic process.
3) Write a GUI with a flashy warning banner.
4) Compare the past pixel list with the current pixel list and look for a serious state change (threshold).
5) If the delta at step 4 exceeds the threshold, throw the alert from step 3. (A very rough sketch of this loop is given below.)
Once you've written the code and enjoyed the trouble of tracking down the tracebacks, edit your question, update it with the code, and reshape the question (I can help with that if you want). Then we can pick it up from there. Does that sound like a plan?
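This is only a very rough sketch of steps 1-5, assuming the ROI colour is summarised with cv2.mean() as in the earlier answer; the video source, ROI coordinates and threshold are placeholder assumptions, and the "alert" is just a print() instead of a GUI banner:
import cv2
import numpy as np

# hypothetical placeholders: adapt the capture source, ROI and threshold
ROI = (356, 88, 495, 227)      # left, top, right, bottom
THRESHOLD = 30.0               # allowed per-channel change before raising the alert

def roi_color(frame):
    # summarise the ROI by its mean BGR colour (step 1: the "pixel list" value)
    crop = frame[ROI[1]:ROI[3], ROI[0]:ROI[2]]
    return np.array(cv2.mean(crop)[:3])

cap = cv2.VideoCapture(0)      # or read successive screenshots / frames
ok, frame = cap.read()
previous = roi_color(frame) if ok else None

while ok:                                      # step 2: the dynamic loop
    ok, frame = cap.read()
    if not ok:
        break
    current = roi_color(frame)
    delta = np.abs(current - previous)         # step 4: past vs. current
    if np.any(delta > THRESHOLD):              # step 5: threshold exceeded
        print("colour change detected:", previous, "->", current)  # step 3: the "alert"
    previous = current

cap.release()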

I am not sure why you need a callback in this situation, but maybe this is what you mean?
import cv2
import numpy as np

def test_color(image, lower, upper):
    mask = cv2.inRange(image, lower, upper)
    return np.any(mask == 255)
Explanations:
cv2.inRange() returns 255 for a pixel that is within the range (lower, upper) and 0 otherwise (see the docs)
Use np.any() to check if any element in the mask is actually 255
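For example, to ask whether the image contains any strongly red pixel (the BGR bounds here are arbitrary illustration values):
import cv2
import numpy as np

image = cv2.imread("path/to/img.jpg")
lower = np.array([0, 0, 200], dtype=np.uint8)    # arbitrary lower BGR bound
upper = np.array([80, 80, 255], dtype=np.uint8)  # arbitrary upper BGR bound

if test_color(image, lower, upper):
    print("the image contains a colour within the requested range")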

Related

How to detect an object in an image rather than screen with pyautogui?

I am using the pyautogui.locateOnScreen() function to locate elements in Chrome, get their x, y coordinates and click them. But at some point I need to take a screenshot of part of the screen and search for a given object in that screenshot, then get its coordinates. Is it possible to do this with pyautogui?
My example code:
coord_one = pyautogui.locateOnScreen("first_image.png",confidence=0.95)
scshoot = pyautogui.screenshot(region=coord_one)
coord_two = # search second image in scshoot and if it can be detected get coordinates of it.
If it is not possible with pyautogui, can you advise the easiest/smartest way?
Thanks in advance.
I don't believe there is a built-in, direct way to do what you need, but the opencv-python library does the job.
The following code sample assumes you have a screen capture you just took, "capture.png", and you want to find "logo.png" in that capture, which you know is a subsection of "capture.png".
Minimal example
"""Get bounding box of cropped image from original image."""
import cv2 as cv
import numpy as np
img_rgb = cv.imread(r'res/original.png')
# the cropped image, expected to be smaller
target_img = cv.imread(r'res/crop.png')
_, w, h = target_img.shape[::-1]
res = cv.matchTemplate(img_rgb,target_img,cv.TM_CCOEFF_NORMED)
# with this method, each value in res scores the match whose top-left corner is at that pixel
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
top_left = max_loc
# if we add to it the width and height of the target, then we get the bbox.
bottom_right = (top_left[0] + w, top_left[1] + h)
cv.rectangle(img_rgb,top_left, bottom_right, 255, 2)
cv.imshow('', img_rgb)
cv.waitKey(0)
MatchTemplate
From the docs, MatchTemplate "simply slides the template image over the input image (as in 2D convolution) and compares the template and patch of input image under the template image." Under the hood, this offers methods such as square difference to compare the images represented as arrays.
See more
For a more in-depth explanation, check the OpenCV docs, as the code is entirely based on their example.
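To tie this back to the original pyautogui workflow: pyautogui.screenshot() returns a PIL image, so one way to glue the two libraries together might look like the sketch below (the file names and the 0.95 acceptance threshold are assumptions):
import cv2 as cv
import numpy as np
import pyautogui

# locate the first element on screen and capture just that region
coord_one = pyautogui.locateOnScreen("first_image.png", confidence=0.95)
screenshot = pyautogui.screenshot(region=coord_one)

# PIL gives RGB, OpenCV expects BGR
capture = cv.cvtColor(np.array(screenshot), cv.COLOR_RGB2BGR)

# search for the second image inside that capture with template matching
template = cv.imread("second_image.png")
res = cv.matchTemplate(capture, template, cv.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv.minMaxLoc(res)
h, w = template.shape[:2]

if max_val > 0.95:  # arbitrary acceptance threshold
    # note: these coordinates are relative to the captured region, not the full screen
    coord_two = (max_loc[0], max_loc[1], w, h)
    print("second image found at", coord_two)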

Comparing the content of 2 images using Python [duplicate]

I'm trying to compare images to each other to find out whether they are different. First I tried a Pearson correlation of the RGB values, which works quite well unless the pictures are shifted a little bit. So if I have two 100% identical images, but one is moved slightly, I get a bad correlation value.
Any suggestions for a better algorithm?
BTW, I'm talking about comparing thousands of images...
Edit:
Here is an example of my pictures (microscopic):
im1:
im2:
im3:
im1 and im2 are the same but slightly shifted/cropped; im3 should be recognized as completely different...
Edit:
The problem is solved with the suggestions of Peter Hansen! It works very well! Thanks for all the answers! Some results can be found here:
http://labtools.ipk-gatersleben.de/image%20comparison/image%20comparision.pdf
A similar question was asked a year ago and has numerous responses, including one regarding pixelizing the images, which I was going to suggest as at least a pre-qualification step (as it would exclude very non-similar images quite quickly).
There are also links there to still-earlier questions which have even more references and good answers.
Here's an implementation using some of the ideas with Scipy, using your above three images (saved as im1.jpg, im2.jpg, im3.jpg, respectively). The final output shows im1 compared with itself, as a baseline, and then each image compared with the others.
>>> import scipy as sp
>>> from scipy.misc import imread
>>> from scipy.signal.signaltools import correlate2d as c2d
>>>
>>> def get(i):
... # get JPG image as Scipy array, RGB (3 layer)
... data = imread('im%s.jpg' % i)
... # convert to grey-scale using W3C luminance calc
... data = sp.inner(data, [299, 587, 114]) / 1000.0
... # normalize per http://en.wikipedia.org/wiki/Cross-correlation
... return (data - data.mean()) / data.std()
...
>>> im1 = get(1)
>>> im2 = get(2)
>>> im3 = get(3)
>>> im1.shape
(105, 401)
>>> im2.shape
(109, 373)
>>> im3.shape
(121, 457)
>>> c11 = c2d(im1, im1, mode='same') # baseline
>>> c12 = c2d(im1, im2, mode='same')
>>> c13 = c2d(im1, im3, mode='same')
>>> c23 = c2d(im2, im3, mode='same')
>>> c11.max(), c12.max(), c13.max(), c23.max()
(42105.00000000259, 39898.103896795357, 16482.883608327804, 15873.465425120798)
So note that im1 compared with itself gives a score of 42105, im2 compared with im1 is not far off that, but im3 compared with either of the others gives well under half that value. You'd have to experiment with other images to see how well this might perform and how you might improve it.
Run time is long... several minutes on my machine. I would try some pre-filtering to avoid wasting time comparing very dissimilar images, maybe with the "compare jpg file size" trick mentioned in responses to the other question, or with pixelization. The fact that you have images of different sizes complicates things, but you didn't give enough information about the extent of butchering one might expect, so it's hard to give a specific answer that takes that into account.
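As a hedged illustration of that pre-filtering idea, a cheap pre-qualification pass might compare the JPEG file sizes and tiny grey-scale thumbnails ("pixelization") before any correlation is attempted; the 0.5 size ratio and the 8x8 thumbnail are arbitrary choices you would tune for your data:
import os

import numpy as np
from PIL import Image

def probably_dissimilar(path_a, path_b, size_ratio=0.5, thumb_diff=0.25):
    # 1) wildly different JPEG file sizes usually mean very different content
    size_a, size_b = os.path.getsize(path_a), os.path.getsize(path_b)
    if min(size_a, size_b) / max(size_a, size_b) < size_ratio:
        return True
    # 2) compare tiny grey-scale thumbnails; cheap and fairly shift-tolerant at 8x8
    thumb_a = np.asarray(Image.open(path_a).convert('L').resize((8, 8)), dtype=float) / 255.0
    thumb_b = np.asarray(Image.open(path_b).convert('L').resize((8, 8)), dtype=float) / 255.0
    return np.abs(thumb_a - thumb_b).mean() > thumb_diff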
I have done this once with an image histogram comparison. My basic algorithm was this:
Split the image into red, green and blue channels
Create normalized histograms for the red, green and blue channels and concatenate them into a vector (r0...rn, g0...gn, b0...bn), where n is the number of "buckets"; 256 should be enough
Subtract this histogram from the histogram of another image and calculate the distance
Here is some code with numpy and PIL:
import numpy
from PIL import Image

im = Image.open("path/to/img.jpg")  # any PIL image
r = numpy.asarray(im.convert( "RGB", (1,0,0,0, 1,0,0,0, 1,0,0,0) ))
g = numpy.asarray(im.convert( "RGB", (0,1,0,0, 0,1,0,0, 0,1,0,0) ))
b = numpy.asarray(im.convert( "RGB", (0,0,1,0, 0,0,1,0, 0,0,1,0) ))
# note: newer numpy versions use density=True instead of the old new=/normed= flags
hr, h_bins = numpy.histogram(r, bins=256, density=True)
hg, h_bins = numpy.histogram(g, bins=256, density=True)
hb, h_bins = numpy.histogram(b, bins=256, density=True)
hist = numpy.array([hr, hg, hb]).ravel()
If you have two histograms, you can get the distance like this:
diff = hist1 - hist2
distance = numpy.sqrt(numpy.dot(diff, diff))
If the two images are identical, the distance is 0; the more they diverge, the greater the distance.
It worked quite well for photos but failed on graphics like text and logos.
You really need to specify the question better, but, looking at those images, the organisms all seem to be oriented the same way. If this is always the case, you can try doing a normalized cross-correlation between the two images and taking the peak value as your degree of similarity. I don't know of a normalized cross-correlation function in Python, but there is a similar fftconvolve() function, and you can do the circular cross-correlation yourself:
from numpy import asarray
from numpy.fft import rfftn, irfftn
from PIL import Image

a = asarray(Image.open('c603225337.jpg').convert('L'))
b = asarray(Image.open('9b78f22f42.jpg').convert('L'))
f1 = rfftn(a)
f2 = rfftn(b)
g = f1 * f2.conj()  # conjugate in the frequency domain gives correlation rather than convolution
c = irfftn(g)
This won't work as written since the images are different sizes, and the output isn't weighted or normalized at all.
The location of the peak value of the output indicates the offset between the two images, and the magnitude of the peak indicates the similarity. There should be a way to weight/normalize it so that you can tell the difference between a good match and a poor match.
This isn't as good of an answer as I want, since I haven't figured out how to normalize it yet, but I'll update it if I figure it out, and it will give you an idea to look into.
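For what it's worth, one rough way to get a comparable score, borrowing the zero-mean/unit-variance trick from the correlate2d answer above and assuming the images have first been resized to a common shape (an assumption, not part of the original sketch), might be:
import numpy as np
from numpy.fft import rfftn, irfftn
from PIL import Image

def normalized_peak(path_a, path_b, shape=(256, 256)):
    # resize to a common shape so the FFTs line up, then normalise each image
    # to zero mean and unit variance
    a = np.asarray(Image.open(path_a).convert('L').resize(shape), dtype=float)
    b = np.asarray(Image.open(path_b).convert('L').resize(shape), dtype=float)
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    # circular cross-correlation via the FFT (note the conjugate)
    c = irfftn(rfftn(a) * rfftn(b).conj())
    # identical images peak at roughly the pixel count, so divide by it
    return c.max() / a.size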
If your problem is about shifted pixels, maybe you should compare against a frequency transform.
The FFT should be OK (numpy has an implementation for 2D matrices), but I keep hearing that wavelets are better for this kind of task ^_^
About the performance: if all the images are of the same size, if I remember well, the FFTW package creates a specialised function for each FFT input size, so you can get a nice performance boost by reusing the same code... I don't know if numpy is based on FFTW, but if it's not, maybe you could investigate a little bit there.
Here you have a prototype... you can play around with it a bit to see which threshold fits your images.
import sys

import numpy
from PIL import Image

def main():
    img1 = Image.open(sys.argv[1])
    img2 = Image.open(sys.argv[2])

    if img1.size != img2.size or img1.getbands() != img2.getbands():
        return -1

    s = 0
    for band_index, band in enumerate(img1.getbands()):
        m1 = numpy.fft.fft2(numpy.array([p[band_index] for p in img1.getdata()]).reshape(*img1.size))
        m2 = numpy.fft.fft2(numpy.array([p[band_index] for p in img2.getdata()]).reshape(*img2.size))
        s += numpy.sum(numpy.abs(m1 - m2))
    print(s)

if __name__ == "__main__":
    sys.exit(main())
Another way to proceed might be to blur the images, then subtract the pixel values of the two images. If the difference is nonzero, you can shift one of the images 1 px in each direction and compare again; if the difference is lower than in the previous step, you can keep shifting in the direction of the gradient and subtracting until the difference drops below a certain threshold or starts increasing again. That should work if the radius of the blurring kernel is larger than the shift between the images.
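A minimal sketch of that blur-and-descend idea, assuming equally sized grey-scale numpy arrays; the blur radius and the iteration cap are arbitrary:
import numpy as np
from scipy.ndimage import gaussian_filter

def align_by_descent(a, b, sigma=5, max_steps=50):
    # blur both images so that small shifts change the difference smoothly
    a = gaussian_filter(a.astype(float), sigma)
    b = gaussian_filter(b.astype(float), sigma)
    shift = (0, 0)
    best = np.abs(a - b).mean()
    for _ in range(max_steps):
        improved = False
        for step in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            candidate = (shift[0] + step[0], shift[1] + step[1])
            # np.roll wraps around, which is fine while the shift stays small
            diff = np.abs(a - np.roll(b, candidate, axis=(0, 1))).mean()
            if diff < best:
                best, shift, improved = diff, candidate, True
        if not improved:
            break
    return shift, best  # estimated (row, col) shift and the residual difference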
Also, you can try some of the tools commonly used in photography workflows for blending multiple exposures or stitching panoramas, like the Pano Tools.
I did an image processing course long ago, and remember that when matching I normally started by making the images grayscale and then sharpening the edges so you only see edges. You (the software) can then shift and subtract the images until the difference is minimal.
If that difference is larger than the threshold you set, the images are not equal and you can move on to the next one. Images below the threshold can then be analyzed further.
I do think that, at best, you can radically thin out the possible matches, but you will need to compare the remaining candidates personally to determine whether they are really equal.
I can't really show code as it was a long time ago, and I used Khoros/Cantata for that course.
First off, correlation is a very CPU-intensive and rather inaccurate measure of similarity. Why not just go for the sum of squared differences between individual pixels?
A simple solution, if the maximum shift is limited: generate all possible shifted images and find the one that is the best match. Make sure you calculate your match variable (i.e. correlation) only over the subset of pixels that can be matched in all shifted images. Also, your maximum shift should be significantly smaller than the size of your images.
If you want to use more advanced image processing techniques, I suggest you look at SIFT; it is a very powerful method that (theoretically anyway) can properly match items in images independently of translation, rotation and scale.
I guess you could do something like this:
Estimate the vertical/horizontal displacement of the reference image vs. the comparison image. A simple SAD (sum of absolute differences) with motion vectors would do.
Shift the comparison image accordingly.
Compute the Pearson correlation you were trying to do.
Shift measurement is not difficult (a small code sketch is given below):
Take a region (say about 32x32) in the comparison image.
Shift it by x pixels horizontally and y pixels vertically.
Compute the SAD (sum of absolute differences) w.r.t. the original image.
Do this for several values of x and y in a small range (-10, +10).
Find the place where the difference is minimal.
Pick that value as the shift motion vector.
Note:
If the SAD comes out very high for all values of x and y, you can assume that the images are highly dissimilar and shift measurement is not necessary.
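A small sketch of that search, assuming grey-scale numpy arrays; the central 32x32 block and the (-10, +10) window follow the steps above, and it assumes the images are comfortably larger than the block plus the search range:
import numpy as np

def estimate_shift(reference, comparison, block=32, search=10):
    # take a block from the middle of the comparison image
    cy, cx = comparison.shape[0] // 2, comparison.shape[1] // 2
    patch = comparison[cy:cy + block, cx:cx + block].astype(float)
    best_vector, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            candidate = reference[cy + dy:cy + dy + block,
                                  cx + dx:cx + dx + block].astype(float)
            sad = np.abs(patch - candidate).sum()   # sum of absolute differences
            if sad < best_sad:
                best_sad, best_vector = sad, (dy, dx)
    return best_vector, best_sad   # motion vector and its SAD score
If best_sad stays very high for every (dy, dx), the images can be treated as dissimilar, as the note above says.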
To get the imports to work correctly on my Ubuntu 16.04 (as of April 2017), I installed Python 2.7 and these packages:
sudo apt-get install python-dev
sudo apt-get install libtiff5-dev libjpeg8-dev zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python-tk
sudo apt-get install python-scipy
sudo pip install pillow
Then I changed Snowflake's imports to these:
import scipy as sp
from scipy.ndimage import imread
from scipy.signal.signaltools import correlate2d as c2d
How awesome that Snowflake's script worked for me 8 years later!
I propose a solution based on the Jaccard index of similarity on the image histograms. See: https://en.wikipedia.org/wiki/Jaccard_index#Weighted_Jaccard_similarity_and_distance
You can compute the difference in the distribution of the pixel colors. This is indeed pretty invariant to translations.
from PIL.Image import Image
from typing import List
def jaccard_similarity(im1: Image, im2: Image) -> float:
    """Compute the similarity between two images.
    First, for each image a histogram of the pixel distribution is extracted.
    Then, the similarity between the histograms is compared using the weighted Jaccard index of similarity, defined as:
    Jsimilarity = sum(min(b1_i, b2_i)) / sum(max(b1_i, b2_i))
    where b1_i and b2_i are the ith histogram bins of images 1 and 2, respectively.
    The two images must have the same resolution and number of channels (depth).
    See: https://en.wikipedia.org/wiki/Jaccard_index
    Where it is also called Ruzicka similarity."""

    if im1.size != im2.size:
        raise Exception("Images must have the same size. Found {} and {}".format(im1.size, im2.size))

    n_channels_1 = len(im1.getbands())
    n_channels_2 = len(im2.getbands())
    if n_channels_1 != n_channels_2:
        raise Exception("Images must have the same number of channels. Found {} and {}".format(n_channels_1, n_channels_2))
    assert n_channels_1 == n_channels_2

    sum_mins = 0
    sum_maxs = 0

    hi1 = im1.histogram()  # type: List[int]
    hi2 = im2.histogram()  # type: List[int]

    # Since the two images have the same number of channels, they must have the same number of bins in the histogram.
    assert len(hi1) == len(hi2)

    for b1, b2 in zip(hi1, hi2):
        min_b = min(b1, b2)
        sum_mins += min_b
        max_b = max(b1, b2)
        sum_maxs += max_b

    jaccard_index = sum_mins / sum_maxs

    return jaccard_index
Unlike mean squared error, the Jaccard index always lies in the range [0, 1], thus allowing comparisons among different image sizes.
Then you can compare the two images, but only after rescaling them to the same size! Otherwise the pixel counts would have to be normalized somehow. I used this:
import sys
from skincare.common.utils import jaccard_similarity
import PIL.Image
from PIL.Image import Image
file1 = sys.argv[1]
file2 = sys.argv[2]
im1 = PIL.Image.open(file1) # type: Image
im2 = PIL.Image.open(file2) # type: Image
print("Image 1: mode={}, size={}".format(im1.mode, im1.size))
print("Image 2: mode={}, size={}".format(im2.mode, im2.size))
if im1.size != im2.size:
    print("Resizing image 2 to {}".format(im1.size))
    im2 = im2.resize(im1.size, resample=PIL.Image.BILINEAR)
j = jaccard_similarity(im1, im2)
print("Jaccard similarity index = {}".format(j))
Testing on your images:
$ python CompareTwoImages.py im1.jpg im2.jpg
Image 1: mode=RGB, size=(401, 105)
Image 2: mode=RGB, size=(373, 109)
Resizing image 2 to (401, 105)
Jaccard similarity index = 0.7238955686269157
$ python CompareTwoImages.py im1.jpg im3.jpg
Image 1: mode=RGB, size=(401, 105)
Image 2: mode=RGB, size=(457, 121)
Resizing image 2 to (401, 105)
Jaccard similarity index = 0.22785529941822316
$ python CompareTwoImages.py im2.jpg im3.jpg
Image 1: mode=RGB, size=(373, 109)
Image 2: mode=RGB, size=(457, 121)
Resizing image 2 to (373, 109)
Jaccard similarity index = 0.29066426814105445
You might also consider experimenting with different resampling filters (like NEAREST or LANCZOS), as they, of course, alter the color distribution when resizing.
Additionally, consider that swapping the images changes the results, as the second image might be downsampled instead of upsampled. (After all, cropping might suit your case better than rescaling.)

Change width of image in OpenCV using NumPy

I'm making a Python file that applies color to the output of the Canny filter in OpenCV. I do this conversion from grayscale to color using the code provided below. My problem is that when I apply the concatenate method (to add the color back, since the Canny filter output is grayscale), it cuts the width of the screen to a third, as shown in the two screenshots taken before and after the color is added. The code snippet shown is only the transformation from grayscale to colored images.
What I've tried:
Tried using NumPy.tile: this wasn't the wisest attempt as it just repeated the same 1/3 of the screen twice more and didn't expand it to take up the whole screen as I had hoped.
Tried changing the image to only be from the index of 1/3 of the screen to cover the entire screen.
Tried setting the column index that is blank to equal None.
Image without the color added
Image with the color added
My code:
def convert_pixels(image, color):
    rows, cols = image.shape
    concat = np.zeros(image.shape)
    image = np.concatenate((image, concat), axis=1)
    image = np.concatenate((image, concat), axis=1)
    image = image.reshape(rows, cols, 3)
    index = image.nonzero()
    #TODO: turn color into constantly changing color wheel or shifting colors
    for i in zip(index[0], index[1], index[2]):
        color.next_color()
        image[i[0]][i[1]] = color.color
    #TODO: fix this issue below:
    #image[:, int(cols/3):cols] = None # turns right side (glitched) into None type
    return image, color
In short, you're using concatenate on the wrong axis. axis=1 is the "columns" axis, so you're just putting two copies of zeros next to each other in the x direction. Since you want a three-channel image I would just initialize color_image with three channels and leave the original grayscale image alone:
def convert_pixels(image, color):
    rows, cols = image.shape
    color_image = np.zeros((rows, cols, 3), dtype=np.uint8)
    idx = image.nonzero()
    for i in zip(*idx):
        color_image[i] = color.color
    return color_image, color
I've changed the indexing to match. I can't check this exactly since I don't know what your color object is, but I can confirm this works in terms of correctly shaping and indexing the new image.
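For anyone who wants to run the snippet in isolation, a purely hypothetical stand-in for the asker's color object (not the actual class from the question) could be:
import numpy as np

class CyclingColor:
    """Hypothetical stand-in for the asker's color object."""
    def __init__(self):
        self.color = np.array([255, 0, 0], dtype=np.uint8)  # start with blue in BGR
    def next_color(self):
        # crude channel rotation, just to produce a changing colour
        self.color = np.roll(self.color, 1)

# usage sketch: colourise the edges of a Canny image
# edges = cv2.Canny(gray, 100, 200)
# colored, _ = convert_pixels(edges, CyclingColor())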

I can't generate a word cloud with some images

I just started using the wordcloud module in Python 3.7, and I'm using the code below to generate word clouds from a dictionary. I'm trying to use different masks, but it only works for some images: in two cases it works with images of 831x816 and 1000x808. Does this have to do with the size of the image? Or is it because the images are kind of blurry? Or what is it?
I paste my code:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from wordcloud import WordCloud

our_mask = np.array(Image.open('twitter.png'))
twitter_cloud = WordCloud(background_color = 'white', mask = our_mask)
twitter_cloud.generate_from_frequencies(frequencies)
twitter_cloud.to_file("twitter_cloud.jpg")
plt.imshow(twitter_cloud)
plt.axis('off')
plt.show()
How can I fix this?
I had a similar problem with a black-and-white image I used. What fixed it for me was cropping the image more closely around the black drawing, so there was no unnecessary bulk of white area at the edges.
Some images need to be adjusted before they can be used as a mask. Note that only pure white values in the mask array are masked out (all other values are masked in). The problem is that in some images the background is not actually stored as pure white in the np.array, so the mask mismatches. To solve this, the following can be done:
1. Creating the mask object (please try with your own image, as I couldn't upload one):
import numpy as np
from PIL import Image
from wordcloud import WordCloud

mask = np.array(Image.open("filepath/picture.png"))
print(mask)
If the output value for white in the np.array is 255, then it is okay. But if it is 0 or some other value, we have to change it to 255.
2. In the case of other values, the code for changing them:
2-1. Create a function for the transformation (here our value = 0):
def transform_zeros(val):
    if val == 0:
        return 255
    else:
        return val
2-2. Create an np.array of the same shape:
maskable_image = np.ndarray((mask.shape[0],mask.shape[1]), np.int32)
2-3. Transformation:
for i in range(len(mask)):
    maskable_image[i] = list(map(transform_zeros, mask[i]))
3.Checking:
print(maskable_image)
Then you can use this array for your mask.
mask = maskable_image
All of this is copied and interpreted from this link, so check it if you find my attempted explanation unclear; I just provided the solution but don't understand that much about image color arrays and their transformation.

How do I extract each road in terms of the pixel coordinates from Google Map Screenshot and place them into different lists?

I'm working on a project related to road recognition from a standard Google Map view. Some navigation features will be added to the project later on.
I already extracted all the white pixels (representing road on the map) according to the RGB criteria. Also, I stored all the white pixel (roads) coordinates (2D) in one list named "all_roads". Now I want to extract each road in terms of the pixel coordinates and place them into different lists (one road in one list), but I'm lacking ideas.
I'd like to use Dijkstra's algorithm to calculate the shortest path between two points, but I need to create "nodes" on each road intersection. That's why I'd like to store each road in the corresponding list for further processing.
I hope someone could provide some ideas and methods. Thank you!
Note: The RGB criteria (the "if" statements in the "threshold" method) seem unnecessary for the chosen map screenshot, but they become useful for other map screenshots with road colours other than white. (NOT the point of the question anyway, but I hope to avoid unnecessary confusion.)
# Import numpy to enable numpy array
import numpy as np
# Import time to handle time-related task
import time
# Import mean to calculate the averages of the pixels
from statistics import mean
# Import cv2 to display the image
import cv2 as cv2
def threshold(imageArray):
    """
    Purpose: Display a given image with roads in white according to pixel RGBs
    Argument(s): A matrix generated from a given image.
    Return: A matrix of the same size but only displaying white and black.
    """
    newAr = imageArray
    for eachRow in newAr:
        for eachPix in eachRow:
            if eachPix[0] == 253 and eachPix[1] == 242:
                eachPix[0] = 255
                eachPix[1] = 255
                eachPix[2] = 255
            else:
                pass
    return newAr
# Import the image
g1 = cv2.imread("1.png")
# resize the image to a resolution of 800 x 600
g1 = cv2.resize(g1,(800,600))
# Apply threshold method to the imported image
g2 = threshold(g1)
index = np.where(g2 == [(255,255,255)])
# x coordinate of the white pixels (roads)
print(index[1])
# y coordinate of the white pixels (roads)
print(index[0])
# Storing the 2D coordinates of white pixels (roads) in a list
all_roads = []
for i in range(len(index[0]))[0::3]:
all_roads.append([index[1][i], index[0][i]])
#Display the modified image
cv2.imshow('g2', g2)
cv2.waitKey(0)
cv2.destroyAllWindows()
