Mapping RGB/hex color codes to general color categories

Is there a dataset that maps each of the ~16M RGB or hex color values to a general color family/category - e.g. red, purple, orange, beige, brown, etc. - that I could access programmatically or load into a database or JSON document to cross-reference color codes against? The use case is to classify the results of PIL color detection on swatch files into a small set of color pickers for a shopping site. A somewhat more granular mapping would also work, say 100-200 categories, since it would be easy enough to map those down to my target 10-15 myself. I have some knowledge of kNN classification and will use that if I have to, but it would be much easier to use a static mapping if one already exists.

You can use a table such as the X11 color names table:
http://www.astrouw.edu.pl/~jskowron/colors-x11/rgb.html
To measure color proximity, it is best to transform the colors to the Lab color space first, so that Euclidean distances are perceptually meaningful; a nearest-neighbor lookup then gives good results.
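As a minimal sketch of that nearest-neighbor idea (the family names and anchor RGB values below are placeholders; in practice you would load the full X11 table from the link above):

```python
# Nearest color family in Lab space - a minimal sketch.
# FAMILIES is a made-up placeholder; load the real X11 table in practice.
FAMILIES = {
    "red":    (255, 0, 0),
    "orange": (255, 165, 0),
    "brown":  (139, 69, 19),
    "beige":  (245, 245, 220),
    "purple": (128, 0, 128),
}

def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIE Lab (D65 white point)."""
    def lin(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = lin(r), lin(g), lin(b)
    # linear RGB -> XYZ, normalized by the D65 white point
    x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883
    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(x), f(y), f(z)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def closest_family(rgb):
    """Return the family whose anchor color is nearest in Lab space."""
    target = srgb_to_lab(*rgb)
    def dist(name):
        lab = srgb_to_lab(*FAMILIES[name])
        return sum((a - b) ** 2 for a, b in zip(target, lab))
    return min(FAMILIES, key=dist)

print(closest_family((120, 150, 225)))  # prints the nearest family name
```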

You could convert from RGB to the CIE Lab color space, in which the Euclidean distance between two colors is perceptually more meaningful. Here is a link to the relevant color space transformation formulae used in OpenCV's color conversion method (cvtColor): http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html
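A quick sketch of that conversion with OpenCV's Python bindings; note that for 8-bit inputs OpenCV rescales L to [0, 255] and stores a and b offset by 128:

```python
import cv2
import numpy as np

# a single RGB pixel, shaped as a 1x1 image
pixel = np.uint8([[[120, 150, 225]]])

# for uint8 inputs, OpenCV scales L to [0, 255] and offsets a, b by 128
lab = cv2.cvtColor(pixel, cv2.COLOR_RGB2LAB)
L, a, b = lab[0, 0]
print(L, a, b)
```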
Since your use case is to compare two swatches, I would also advise using texture descriptors (http://www.robots.ox.ac.uk/~vgg/research/texclass/with.html) in addition to color information for better results.

Related

What is the relationship between color space RGB, XYZ and the color matching function?

What is the relationship between color spaces (RGB, XYZ) and the color matching function? Let's say we have some color matching function in the XYZ color space (a 3-row matrix). We also have the transformation matrix which translates XYZ coordinates into RGB coordinates.
My understanding is as follows. There is some visual input, made up of a color spectrum S(λ). The human eye does not see the world directly - it only sees its interpretation of the world. The eye has three cone types (L, M, S), each of which is roughly responsible for processing RED, GREEN, or BLUE. The eye perceives a spectral color because it sums over a RED, GREEN, BLUE vector, and this sum matches the color of the input. To produce the match, there is a color matching function, which takes the input spectrum and produces the weights by which to multiply the primary RED, GREEN and BLUE color vectors. These are then added, and their sum visually matches the spectral input, even though the spectrum contained a great many frequencies while the eye was only adding three. So we went from a huge space to a space where everything can be described with 3 vectors, summed as dictated by the color matching function.
The spectral input, color primaries, and color matching functions behave as described above, which can be summarized in this formula:
$$s(\lambda) \approx \sum_{i=1}^{3} \left( \int c_i(\lambda')\, s(\lambda')\, d\lambda' \right) p_i(\lambda)$$
where $p_i$ is the 3d vector of primary colors, $c$ - the color matching function - is also a vector of 3 components, and finally $s$ is the spectral input.
We have the XYZ color space and a corresponding color matching function which does what is described above. We are then given a matrix T which transforms XYZ coordinates to RGB coordinates. We already know T, and we need to use it to produce a new color matching function for the RGB color space.
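(If it helps: since tristimulus values are linear functionals of the spectrum, if $RGB = T \cdot XYZ$ holds for every input spectrum, then the matching functions must transform by the same matrix, $c_{RGB}(\lambda) = T\, c_{XYZ}(\lambda)$.)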
I do not understand how the color space relates to the choice of primaries $p_i(\lambda)$ and the choice of color matching functions $c_i(\lambda)$.
I have been trying to understand colours for months, and after some research I believe I have some insights that may help answer your question.
I do not understand how the color space relates to choice of primaries $p_i(\lambda)$
Primaries are nothing but the wavelengths of the colors that we choose to use for making all the other colors in the space, and they also define the gamut of the colour space. If you play with the applet provided in the link below, you can see that the whole gamut of the colour space changes when you change a primary.
Have a look at the "Alternative primaries and gamuts" section.
Now, I do not know how well you understand RGB and XYZ, or what you mean when you say RGB here (I am assuming you are referring to sRGB gamut values). XYZ are actually tristimulus values, also called rho, beta and gamma; for simplicity, XYZ values are converted to xy space, from which you get your standard sRGB gamut.
Please go through this if you are interested in understanding how colour sensors work and how sensor values are converted to XYZ.
Please comment if I have missed any information or the answer needs editing.
I think a lot of the issues with color selection are due to technical problems people had to solve. Usually you are not trying to reproduce colors as accurately as possible, but to make them pleasant looking, cheap, and fast to compute on a CPU. If someone watches the plains of New Zealand on TV, they are very unlikely to know what the plains really look like, but almost certainly want to enjoy the picture and pay little for it.
Several reasons why you might want to use different color matching functions include:
You are taking pictures under non-white light and you want your picture to look natural.
You are taking underwater pictures and want to compensate for the fact that water attenuates different frequencies at different rates.
Your sensor is not perfect and you want to compensate for that.
On the other hand, you might want to change your primaries for some reason. For example, you might be taking a picture of a scene with a limited range of colors; by nudging your primaries a little you might get a "fuller" picture.
Finally, sometimes you just have to compensate for limitations of your devices. The phosphors on a CRT TV impose some restrictions, and so does noise in the air when transmitting using PAL. If you go digital, you might be forced to use fewer than 36 bits per pixel. In that case you will have to make compromises, and this gives you the opportunity to lose as little as possible.
If you want a short tutorial, visit Cambridge in Colour.
Here is Szeliski's computer vision textbook; look at chapters 1, 2 and 10.
Poynton has a list of common transformations.

How to find the PMS (Pantone) color for a given CMYK value?

Is it possible to find the nearest PMS color of any CMYK/RGB/HSL color?
I understand that PMS is an arbitrary color system (not a color space), so it's not possible to do exact conversions using algorithms.
I have seen tools on the web (some, examples) whose developers managed to do the "conversion" from CMYK to PMS.
I'm wondering if there is a lookup technique/reference that I can use.
––
Update
I am not sure if this would help in finding PMS color...
To find matches, the total sum of the differences (delta, Δ) of the CMYK color components between the input CMYK color and each PANTONE color is calculated over the entire table of PANTONE colors. The list is then sorted by increasing delta, so that the matches with the smallest difference are at the top. Note that because the delta is not weighted, you may get matches that are optically off.
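A minimal sketch of that lookup, assuming you have a table of PANTONE names and their CMYK approximations (the entries below are made up for illustration; real tables are licensed from Pantone):

```python
# Nearest-PMS lookup by unweighted summed CMYK delta - a sketch.
# PANTONE_CMYK is a made-up placeholder, not real Pantone data.
PANTONE_CMYK = {
    "PMS Example A": (0, 91, 76, 0),
    "PMS Example B": (100, 72, 0, 32),
}

def nearest_pms(c, m, y, k, n=5):
    """Return the n entries with the smallest summed CMYK delta."""
    deltas = [
        (abs(c - pc) + abs(m - pm) + abs(y - py) + abs(k - pk), name)
        for name, (pc, pm, py, pk) in PANTONE_CMYK.items()
    ]
    return sorted(deltas)[:n]

print(nearest_pms(2, 90, 80, 1))
```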

How to choose a range for filtering points by RGB color?

I have an image and I am picking colors by RGB (data sampling). I select N points from a specific region of the image which has the "same" color. By "same" I mean that this part of the image belongs to an object (say, a yellow object). Each picked point in the RGB case has three values [R,G,B], for example [120,150,225]; the maximum and minimum for each field are 255 and 0 respectively.
Let's assume that I picked N points from the region of the object in the image. The points obviously have different RGB values but belong to the same family (a gradient of the specific color).
Question:
I want to find a range for each RGB field such that, when I apply a color filter to the image, the pixels belonging to that specific object remain (are considered inliers). Is it correct to take the maximum and minimum of the sampled points as the filter range? For example, if the min and max of the R field are 120 and 170 respectively, can that be used as the range to keep?
In my opinion this idea is not sound, because when you choose the max and min of a set of sampled points, some points on the object will fall outside that range.
What is a better solution to include more points as inliers?
If anybody needs to see collected data samples, please let me know.
I am not sure I fully grasp what you are asking for, but in my opinion filtering in RGB is not the way to go. You should use a different color space than RGB if you want to compare pixels of similar color. RGB is good for representing colors on a screen, but you actually want to look at the hue, saturation and intensity (lightness, or luminance) for analysing visible similarities in colors.
For example, you should convert your pixels to HSI or HSL color space first, then compare the different parameters you get. At that point, it is more natural to compare the resulting hue in a hue range, saturation in a saturation range, and so on.
Go here for further information on how to convert to and from RGB.
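As a small sketch of that idea using Python's built-in colorsys module (the samples and margin below are made up; tune them to your data):

```python
import colorsys

# sampled object points, 8-bit RGB (placeholder values)
samples = [(120, 150, 225), (118, 148, 230), (125, 155, 220)]

def rgb_to_hls(rgb):
    r, g, b = (c / 255.0 for c in rgb)
    return colorsys.rgb_to_hls(r, g, b)  # returns (hue, lightness, saturation)

hues = [rgb_to_hls(p)[0] for p in samples]
h_min, h_max = min(hues), max(hues)
margin = 0.02  # widen the band a little beyond the sampled extremes

def is_inlier(rgb):
    """Keep a pixel if its hue falls inside the (widened) sampled hue band.

    Caveat: this simple band does not handle hue wraparound near 0/1 (reds).
    """
    h = rgb_to_hls(rgb)[0]
    return h_min - margin <= h <= h_max + margin

print(is_inlier((121, 151, 226)))  # True for this near-sample color
```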
What happens here is that you are implicitly trying to reinvent either color indexing or histogram back-projection. You call it a color filter, but it is better to focus on probabilities than on colors and color spaces. Colors are of course not super reliable and change with lighting (though hue tends to stay the same under non-colored illumination), which is why some color spaces are better than others. You can handle this separately, but it seems that you are more interested in the principles of a "filtering operation" that will segment the foreground object from the background. Hopefully.
In short, histogram back-projection works by first creating a histogram of R, G, B within the object area and then back-projecting it onto the image as follows: for each pixel in the image, find its bin in the histogram, calculate its relative weight (probability) given the overall sum of the bins, and write this probability into an output image. This way, each pixel carries the probability that it belongs to the object. You can improve it by dividing by the probability of the background if you want to model the background too.
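A minimal numpy sketch of that back-projection, under the assumption that you already have a boolean mask for the sampled object region:

```python
import numpy as np

def backproject(image, obj_mask, bins=16):
    """Back-project an RGB histogram of the masked object onto the image.

    image: H x W x 3 uint8 array; obj_mask: H x W boolean array.
    Returns an H x W float array of per-pixel object probabilities.
    """
    # 3D histogram of the object pixels, normalized to probabilities
    obj_pixels = image[obj_mask].astype(float)
    hist, _ = np.histogramdd(obj_pixels, bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    hist /= hist.sum()
    # map every pixel to its bin and look up that bin's probability
    idx = (image.astype(int) * bins) // 256
    flat = idx.reshape(-1, 3)
    prob = hist[flat[:, 0], flat[:, 1], flat[:, 2]]
    return prob.reshape(image.shape[:2])
```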
The result will be messy but will somewhat resemble the object segment plus some background noise. It then has to be cleaned up and reconnected into an object using separate methods such as connected components, GrabCut, morphological operations, blurring, etc.

Given a color, how do I find which color it's closest to?

Let's say that I have a list of valid color values like [0x67FF82, 0x808080, 0xffffff, ...] and given an input color, in hex, I want to find which color in the list of acceptable colors that the input color is closest to.
My thought is that I'd find the color for which the sum of the absolute differences of the red, green, and blue values is smallest. Is this correct?
It sounds like you're looking for a way to quantify the "distance" between colors - in math, they'd call it a metric. Many people are intuitively pretty comfortable with the Euclidean metric for example - it's simply the distance between two points as measured with a ruler. In the case of colors, things are more complicated because of subjective perception of different colors.
There's a pretty mathy wikipedia article about color difference, which includes links to different implementations.
The difference or distance between two colors is a metric of interest in color science. It allows people to quantify a notion that would otherwise be described with adjectives, to the detriment of anyone whose work is color critical. Common definitions make use of the Euclidean distance in a device independent color space.
In particular, there's Python Colormath, a Python implementation that converts between different color encodings and also has a function for calculating the distance between two colors. If you happen to be coding in Python, that sounds helpful, although I unfortunately don't have any personal experience with the tool. There are also similar resources available for MATLAB and Excel provided by the authors of CIEDE2000, a leading color-difference formula.
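A small sketch with Python Colormath (this matches the library's documented interface as I recall it; double-check against the current docs):

```python
from colormath.color_objects import sRGBColor, LabColor
from colormath.color_conversions import convert_color
from colormath.color_diff import delta_e_cie2000

valid = [0x67FF82, 0x808080, 0xFFFFFF]

def to_lab(hex_int):
    """Convert a 0xRRGGBB integer to a Lab color object."""
    rgb = sRGBColor.new_from_rgb_hex('#{:06x}'.format(hex_int))
    return convert_color(rgb, LabColor)

def closest(input_hex, candidates):
    """Pick the candidate with the smallest CIEDE2000 distance."""
    target = to_lab(input_hex)
    return min(candidates, key=lambda h: delta_e_cie2000(target, to_lab(h)))

print('#{:06x}'.format(closest(0x70F080, valid)))
```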

How does dribbble's color search work?

How does dribbble's color search work? It's not like other search-by-color features. What I can't figure out is how they can offer search parameters for color variance and color minimum without storing a row for every individual color in an image (which I suppose is possible).
Colors are usually extracted from the image using a histogram, computing the density of each color. Once you have the top 5/10/15 colors from the image, performing a search means matching the given color against these extracted colors.
To match one color against another, various techniques are available, such as minimizing the Euclidean distance between the two colors. More on such techniques can be read at http://en.wikipedia.org/wiki/Color_quantization
A similar strategy is discussed in the blog entry http://mattmueller.me/blog/creating-piximilar-image-search-by-color
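A rough sketch of the extraction step with PIL (the filename and palette size are arbitrary placeholders):

```python
from PIL import Image

# reduce the image to a small adaptive palette - a crude form
# of the color quantization described above
img = Image.open('shot.png').convert('RGB')  # 'shot.png' is a placeholder
img.thumbnail((100, 100))                    # shrink for speed
quantized = img.quantize(colors=5)

palette = quantized.getpalette()             # flat [r, g, b, r, g, b, ...]
counts = sorted(quantized.getcolors(), reverse=True)  # (count, palette index)

top_colors = [
    (count, tuple(palette[i * 3 : i * 3 + 3]))
    for count, i in counts
]
print(top_colors)  # [(pixel_count, (r, g, b)), ...], most frequent first
```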
