Convert YUV into HSL or HSV bypassing the RGB step

Wikipedia and a plethora of online resources provide detailed and abundant help with various color space conversions to/from RGB. What I need is a straight YUV->HSL/HSV conversion.
In fact, all I need is the Hue (I don't care much about the Saturation or the Lightness/Value). In other words, I just need to calculate the "color angle" for a given YUV color.
Code in any language would suffice, though my preference is C-style syntax.
Note that by YUV I mean specifically Y′UV, a.k.a. YCbCr (if that makes any difference).

While the YUV->RGB colorspace conversion is linear (i.e., it can be expressed as a matrix operation), RGB->HSL is not. Thus it is not possible to combine the two into a single matrix operation.
Thank you Kel Solaar for confirming this for me.
For reference:
YUV(YCbCr)->RGB conversion
RGB->HSL conversion
Note that mathematically the calculation for Hue is written piecewise: the "base angle" depends on which sector the color is in, and the "major color" is determined by the max(R, G, B) expression.

I think they come from different worlds of interest. Here is a Google patent:
https://patents.google.com/patent/CN105847775A/en

Related

How does one properly scale an XYZ color gamut bounding volume after computing it from color matching functions?

After computing the XYZ gamut bounding mesh below from spectral samples/color matching functions, how does one scale the resulting volume for compatibility with popular color spaces such as sRGB? More specifically, the size and scale of the volume depends on the number of samples and the integral approximation method used to compute it. How, then, can one determine the right values to scale such volumes to match known color spaces like sRGB, P3-Display, NTSC, PAL, etc?
It seemed like fitting the whole volume so that Y ranges from [0, 1] would work, but it had several problems:
When compared to a sub-volume generated by converting the sRGB color cube to XYZ space, the result protruded outside of the 'full gamut'.
Converting random XYZ values from the full gamut volume to sRGB and back, the final XYZ doesn't match the initial one.
Most (all?) standardized color spaces derive from CIE XYZ, so each must have some kind of function or transformation to and from the full XYZ Gamut, or at least each must have some unique parameters for a general function.
How does one determine the correct function and its parameters?
Short answer
If I understand your question, what you are trying to accomplish is determining the sRGB gamut limits (boundary) relative to the XYZ space you have constructed.
Longer answer
I am assuming you are NOT trying to accomplish gamut mapping. That is non-trivial, and there are multiple methods (perceptual, absolute, relative, etc.). I'm going to set gamut mapping aside, and instead focus on determining how some arbitrary color space fits inside your XYZ volume.
First to answer your granular questions:
After computing the XYZ gamut bounding mesh below from spectral samples, how does one scale the volume for compatibility with popular color spaces such as sRGB?
What spectral samples? From a spectrophotometer reading a test print under a given standard illuminant? Or where did they come from? A color matching experiment?
The math is a matter of integrating the spectral data to form the XYZ space, which you apparently have done. What illuminant (white point)?
It seemed like fitting the whole volume so that Y ranges from [0, 1] would work, but it had several problems:
Whole volume of what? The sRGB space? How did you convert the sRGB data to XYZ? Or is this really the question you are asking?
What are the proper scaling constants?
They depend on the spectral data and the adapted white point for the spectral data. sRGB is D65. Most printing is done using D50.
Does each color space have its own ranges for x, y, and z values? How can I determine them?
YES.
Every color space has a different transformation matrix depending on the coordinates of the R G and B primaries. The primaries can be imaginary, such as in ProPhoto.
Some Things
The math you are looking for can be found at brucelindbloom.com. You might also want to check out Thomas Mansencal's Colour, a Python library that's the Swiss-army knife of color.
sRGB
XYZ is a linear light space, wherein Y = 0.2 to Y = 0.4 is a doubling of luminance.
sRGB is not a linear space; there is a gamma curve or tone response curve on sRGB data, such that going from rgb(20,20,20) to rgb(40,40,40) is NOT a doubling of luminance.
The first thing that needs to be done is to linearize the sRGB color data.
Then take the linear RGB and run it through the appropriate matrix. If the XYZ data is relative to a different adapting white point, then you need to do something like a Bradford transform to convert to the appropriate one for your XYZ space.
The Bruce Lindbloom site has some ready-to-go matrices for a couple of common situations.
The problem you are describing can be caused by either (or both) failing to linearize the sRGB data and/or not adapting the white point. And... possibly other factors.
If you can answer my questions regarding the source of the spectral data I can better assist.
Further research and experimentation implied that the XYZ volume should be scaled such that { max(X), max(Y), max(Z) } equals the illuminant of the working space. In the case of sRGB, that illuminant (also called the white point) is D65.
Results look convincing, but expert confirmation would still be appreciated.

CIE-L*u*v* color interpolation

I'm writing a vertex decimator that needs to interpolate vertex colors on a mesh. I'm reading Level of Detail for 3D Graphics for domain material. In the color interpolation section, the book suggests using the CIE L*u*v* color space to perform perceptually linear interpolation of colors.
The translation equations to and from the CIE XYZ color space are provided. I am able to implement the equations it provides, but Wikipedia leaves out numeric values of the following variables: u'n, v'n, and Yn.
The article says these values depend on a "specified white point" and its "luminance". It suggests u'n = 0.2009 and v'n = 0.4610 when using the 2° observer and standard illuminant C. If I am using these, what would Yn be? I do not know enough physics to figure this out, and I have been unable to search for an answer on Google.
In the end, my question boils down to: What are satisfactory/appropriate values I can use for u'n, v'n, and Yn?
Also, I'm assuming I simply interpolate each component of CIE L*u*v* (L*, u*, and v*) linearly when interpolating values in this color space. Is this correct?
These three values are left out because they depend on the colorspace of the specific device (e.g. display, printer, or camera). Since computer screens use an RGB colorspace where perceived grey is R=G=B, you can assume the values are not device-dependent. I can't remember the values off the top of my head, so I'll edit them in later.
The human eye perceives luminance/intensity logarithmically; however, a linear interpolation is close enough, especially since you don't know what the actual min and max screen levels are.
The human eye perceives the color angle linearly; however, you need to take into account that the angle is cyclic. Therefore, interpolating between the min and max angles should yield the min (or max) and not the halfway point, e.g. the average of purple and red should be purple.
I think that the perception of saturation is also logarithmic; however, it can be approximated by a linear interpolation.
Edit:
It seems like most sites use the sRGB to XYZ formulas.
http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html
http://www.easyrgb.com/index.php?X=MATH&H=02#text2
http://colormine.org/convert/rgb-to-xyz

How do I deal with color spaces and gamma in the PNG file format?

Here's my problem:
I'm doing some rendering using spectral samples, and want to save an image showing the results. I am weighting my spectral power function by the CIE XYZ color matching functions to obtain an XYZ color space result. I multiply this XYZ color tuple by the matrix given by this page for converting to sRGB, and clamp the results to (0,1).
To save the image, I scale the converted tuple by 255 and cast it to bytes, and pass the array to libpng's png_write_image(). When I view a uniform intensity, pure-color spectrum rendered this way, it looks wrong; there are dark bands in the transitions between the colors. This is perhaps not surprising, because to convert from XYZ to sRGB, the color components must be raised to 2.4 after the matrix multiply (or linearly scaled if they are small enough). But if I do this, it looks worse! Only after raising to 1/2.2 does it start to look right. It seems like, in the absence of me doing anything, the decoded images are having a gamma of ~2.2 applied twice.
Here's the way I would expect it to work: I apply the matrix to XYZ, and I have a roughly energy-linear RGB tuple. I raise this to 2.2, and now have a perceptually linear color tuple. I encode these numbers as they are (thus making efficient use of the file precision), and store a field in the file that says "these bytes have been encoded with gamma 2.2". Then at image load time, the decoding system un-applies the encoded gamma, then applies the system gamma before display. (And thus from an authoring perspective, I shouldn't have to care what the viewer's system gamma is). But the results I'm getting suggest it doesn't work this way.
Worse, I have tried calling png_set_gAMA() with both 2.2 and 1/2.2 and see no difference in the image. I get similar results with png_set_sRGB() (which I believe should force the gamma to 1/2.2).
There must be something I have backwards or don't understand with regards to either how I should be converting my color values, or how PNG handles gamma and color spaces. To break this down into a few clarifying questions:
What is the color space of the byte values I am expected to pass to write_png()?
What calls, if any, must I make to libpng in order to specify the color space and gamma of the passed bytes, to ensure proper display? Why might they fail?
How does the gamma field in the PNG file relate to the exponent I have applied to the passed color channel values, if any?
If I am expected to invert a gamma curve before sending my image data (which I doubt, but seems necessary now), should that inversion include the linear part of the sRGB curve?
Furthermore, I see hints that "white point" matters in conversion between XYZ and sRGB. It is unclear to me whether the matrices in the site given above include a renormalization to D65 (it does not match Wikipedia's matrix)-- or even when such a conversion is necessary. Most of the literature I've found glosses over the details. Is there yet another step in the conversion not mentioned in the wiki article, or will this be handled automatically?
It is pretty much the way you expected. png_set_gAMA() causes libpng to write a gAMA chunk in the output PNG file. It doesn't change the pixels themselves. A PNG-compliant viewer is supposed to use the gamma value from the chunk, combined with the gamma of the display, to write the pixel intensity values properly on the display. Most decoders won't actually do the two-step method you described (unapply the image gamma, then apply the system gamma), although the result is conceptually the same: they will combine the image gamma with the system gamma to create a lookup table, then use that table to convert the pixels in one step.
From what you observed (gamma=2.2 and gamma=1/2.2 behaving the same), it appears that you are using a viewer that doesn't do anything with the PNG gAMA chunk data.
You said:
because to convert from XYZ to sRGB, the color components must be raised to 2.4 after the matrix multiply...
No, this is incorrect. Going from linear (XYZ) to sRGB, you do NOT raise to 2.4 nor 2.2, that is for going FROM sRGB to linear.
Going from linear to sRGB you raise to ^(1/2.2) or if using the sRGB piecewise, you'll see 1/2.4 — the effective gamma you are applying is ^0.45455
On the wikipedia page you linked, this is the FORWARD transformation.
From XYZ to sRGB, that of course being after the correct matrix is applied, and assuming everything is in D65: for a linear value L, the encoded value is 12.92·L if L ≤ 0.0031308, and 1.055·L^(1/2.4) − 0.055 otherwise.
Straight Talk about Linear
Light in the real world is linear. If you triple 100 photons, you then have 300 photons. But the human eye does not see a tripling; we see only a modest increase by comparison.
This is in part why transfer curves or "gamma" are used: to make the most of the available code space in an 8-bit image (an oversimplification on my part, I know).
To do this, a linear light value is raised to the power of ~0.455, and to get that sRGB value back to a linear space we raise it to the inverse, i.e. ^(1/0.455), otherwise known as ^2.2.
The use of the matrices must be done in linear space, but after transiting the matrix you need to apply the TRC or "gamma" encoding. Based on your statements: no, things are not having 2.2 applied twice; you are simply going the wrong way.
You wrote: "It seems like, in the absence of me doing anything, the decoded images are having a gamma of ~2.2 applied twice."
I think your monitor (the hardware, or your system's ICC profile) already has a gamma setting of its own.

What are the practical differences when working with colors in a linear vs. a non-linear RGB space?

What is the basic property of a linear RGB space and what is the fundamental property of a non-linear one? When talking about the values inside each channel in those 8 (or more) bits, what changes?
In OpenGL, colors are 3+1 values; by this I mean RGB+alpha, with 8 bits reserved for each channel, and this is the part that I get clearly.
But when it comes to gamma correction, I don't get what the effect of working in a non-linear RGB space is.
Since I know how to use a curve in photo-editing software, my explanation is that in a linear RGB space you take the values as they are, with no manipulation and no math function attached, whereas when it's non-linear each channel usually evolves following a classic power-function behaviour.
Even if I take this explanation as the real one, I still don't get what a real linear space is, because after computation all non-linear RGB spaces become linear; and most important of all, I don't get the part where a non-linear color space is more suitable for the human eye, because in the end all RGB spaces are linear as far as I understand.
Let's say you're working with RGB colors: each color is represented with three intensities or brightnesses. You've got to choose between "linear RGB" and "sRGB". For now, we'll simplify things by ignoring the three different intensities, and assume you just have one intensity: that is, you're only dealing with shades of gray.
In a linear color-space, the relationship between the numbers you store and the intensities they represent is linear. Practically, this means that if you double the number, you double the intensity (the lightness of the gray). If you want to add two intensities together (because you're computing an intensity based on the contributions of two light sources, or because you're adding a transparent object on top of an opaque object), you can do this by just adding the two numbers together. If you're doing any kind of 2D blending or 3D shading, or almost any image processing, then you want your intensities in a linear color-space, so you can just add, subtract, multiply, and divide numbers to have the same effect on the intensities. Most color-processing and rendering algorithms only give correct results with linear RGB, unless you add extra weights to everything.
That sounds really easy, but there's a problem. The human eye's sensitivity to light is finer at low intensities than high intensities. That's to say, if you make a list of all the intensities you can distinguish, there are more dark ones than light ones. To put it another way, you can tell dark shades of gray apart better than you can with light shades of gray. In particular, if you're using 8 bits to represent your intensity, and you do this in a linear color-space, you'll end up with too many light shades, and not enough dark shades. You get banding in your dark areas, while in your light areas, you're wasting bits on different shades of near-white that the user can't tell apart.
To avoid this problem, and make the best use of those 8 bits, we tend to use sRGB. The sRGB standard tells you a curve to use, to make your colors non-linear. The curve is shallower at the bottom, so you can have more dark grays, and steeper at the top, so you have fewer light grays. If you double the number, you more than double the intensity. This means that if you add sRGB colors together, you end up with a result that is lighter than it should be. These days, most monitors interpret their input colors as sRGB. So, when you're putting a color on the screen, or storing it in an 8-bit-per-channel texture, store it as sRGB, so you make the best use of those 8 bits.
You'll notice we now have a problem: we want our colors processed in linear space, but stored in sRGB. This means you end up doing sRGB-to-linear conversion on read, and linear-to-sRGB conversion on write. As we've already said that linear 8-bit intensities don't have enough darks, this would cause problems, so there's one more practical rule: don't use 8-bit linear colors if you can avoid it. It's becoming conventional to follow the rule that 8-bit colors are always sRGB, so you do your sRGB-to-linear conversion at the same time as widening your intensity from 8 to 16 bits, or from integer to floating-point; similarly, when you've finished your floating-point processing, you narrow to 8 bits at the same time as converting to sRGB. If you follow these rules, you never have to worry about gamma correction.
When you're reading an sRGB image, and you want linear intensities, apply this formula to each intensity:
float s = read_channel();
float linear;
if (s <= 0.04045) linear = s / 12.92;
else linear = pow((s + 0.055) / 1.055, 2.4);
Going the other way, when you want to write an image as sRGB, apply this formula to each linear intensity:
float linear = do_processing();
float s;
if (linear <= 0.0031308) s = linear * 12.92;
else s = 1.055 * pow(linear, 1.0/2.4) - 0.055;
In both cases, the floating-point s value ranges from 0 to 1, so if you're reading 8-bit integers you want to divide by 255 first, and if you're writing 8-bit integers you want to multiply by 255 last, the same way you usually would. That's all you need to know to work with sRGB.
Up to now, I've dealt with one intensity only, but there are cleverer things to do with colors. The human eye can tell different brightnesses apart better than different tints (more technically, it has better luminance resolution than chrominance), so you can make even better use of your 24 bits by storing the brightness separately from the tint. This is what YUV, YCrCb, etc. representations try to do. The Y channel is the overall lightness of the color, and uses more bits (or has more spatial resolution) than the other two channels. This way, you don't (always) need to apply a curve like you do with RGB intensities. YUV is a linear color-space, so if you double the number in the Y channel, you double the lightness of the color, but you can't add or multiply YUV colors together like you can with RGB colors, so it's not used for image processing, only for storage and transmission.
I think that answers your question, so I'll end with a quick historical note. Before sRGB, old CRTs used to have a non-linearity built into them. If you doubled the voltage for a pixel, you would more than double the intensity. How much more differed for each monitor, and this parameter was called the gamma. This behavior was useful because it meant you could get more darks than lights, but it also meant you couldn't tell how bright your colors would be on the user's CRT unless you calibrated it first. Gamma correction means taking the colors you start with (probably linear) and transforming them for the gamma of the user's CRT. OpenGL comes from this era, which is why its sRGB behavior is sometimes a little confusing. But GPU vendors now tend to work with the convention I described above: when you're storing an 8-bit intensity in a texture or framebuffer, it's sRGB, and when you're processing colors, it's linear. For example, in OpenGL ES 3.0, each framebuffer and texture has an "sRGB flag" you can turn on to enable automatic conversion when reading and writing. You don't need to explicitly do sRGB conversion or gamma correction at all.
I am not a "human color perception expert", but I've run into a similar thing with YUV->RGB conversion. There are different weights for the R/G/B channels, so if you change the source color by x, the RGB values change by different amounts.
As said, I'm not an expert; anyway, I think that if you want to do a color-correct transformation, you should do it in YUV space and then convert it to RGB (or do the mathematically equivalent operation on RGB, but beware of data loss). Also, I'm not sure that YUV is the best native representation of colors, but video cameras provide that format, and that's where I ran into the issue.
Here is the magic YUV->RGB formula with secret numbers included: http://www.fourcc.org/fccyvrgb.php

Antialiasing and gamma compensation

The luminance of pixels on a computer screen is not usually linearly related to the digital RGB triplet values of a pixel. The nonlinear response of early CRTs required a compensating nonlinear encoding, and we continue to use such encodings today.
Usually we produce images on a computer screen and consume them there as well, so it all works fine. But when we antialias, the nonlinearity (called gamma) means that we can't just assign an alpha value of 0.5 to a 50%-covered pixel and expect it to look right. An alpha value of 0.5 is only 0.5^2.2 ≈ 22% as bright as an alpha of 1.0 with a typical gamma of 2.2.
Is there any widely established best practice for antialiasing gamma compensation? Do you have a pet method you use from day to day? Has anyone seen any studies of the results and human perceptions of the quality of the graphic output with different techniques?
I've thought of doing standard X^(1/2.2) compensation, but that is pretty computationally intense. Maybe I can make it faster with a 256-entry lookup table, though.
Lookup tables are used quite often for work like that. They're small and fast.
But whether look-up or some formula, if the end result is an image file, and the format permits, it's best to save a color profile or at least the gamma value in the file for later viewing, rather than try adjusting RGB values yourself.
The reason: for typical byte-valued R, G, B channels, you have 256 unique values in each channel at each pixel. That's almost good enough to look good to the human eye (I wish "byte" had been defined as nine bits!) Any kind of math, aside from trivial value inversion, would map many-to-one for some of those values. The output won't have 256 values to pick from for each pixel for R, G, or B, but far fewer. That can lead to contouring, jaggies, color noise and other badness.
Precision issues aside, if any kind of decent quality is desired, all compositing, mixing, blending, color correction, fake lens flare addition, chroma-keying and whatever should be done in linear RGB space, where the values of R, G and B are in proportion to physical light intensity. The image math then mimics physical light math. But where ultimate speed is vital, there are ways to cheat.
Jim Blinn's "Dirty Pixels" book outlines a fast, good compositing calculation using 16-bit math plus lookup tables to accurately go back and forth to linear color space. This guy worked on NASA's visualisations; he knows his stuff.
I'm trying to answer the actual questions, though mainly for reference now:
First, there are the recommendations from ITU (http://www.itu.int/rec/T-REC-H.272-200701-I/en) which can be applied to programming (but you have to know your stuff).
Jim Blinn's "Notation, Notation, Notation", Chapter 9, has a very detailed mathematical and perceptual error analysis, although he only covers compositing (many other graphics tasks are affected too).
The notation he establishes can also be used to derive a way of dealing with gamma, or to check if a given way of doing so is actually correct. Very handy, my pet method (mainly as I discovered it independently but later found his book).
When generating images, one typically works in a linear color space (like linear RGB or one of the CIE color spaces) and then converts to a non-linear RGB space at the end. That conversion can be accelerated in hardware or via lookup tables or even through tricky math. (See the other answers' references.)
When performing an alpha blend (e.g., render this icon onto this background), this kind of precision is often elided in favor of speed. The results are computed directly in the non-linear RGB-space by lerping with the alpha as the parameter. This is not "correct", but it's good enough in most cases. Especially for things like icons on desktops.
If you're trying to do more correct blending, you treat it like an original render. Work in linear space (which may require an initial conversion) and then convert to your non-linear display space at the end.
A lot of graphics nowadays use sRGB as the non-linear display color space. If I recall correctly, sRGB is very similar to a gamma of 2.2, but there are adjustments made to values at the low end.