Why prefer a non-sRGB format for the Vulkan swapchain?

In the Vulkan cube example, the method to pick the surface format is:
vk::SurfaceFormatKHR Demo::pick_surface_format(const std::vector<vk::SurfaceFormatKHR> &surface_formats) {
    // Prefer non-SRGB formats...
    for (const auto &surface_format : surface_formats) {
        const vk::Format format = surface_format.format;
        if (format == vk::Format::eR8G8B8A8Unorm || format == vk::Format::eB8G8R8A8Unorm ||
            format == vk::Format::eA2B10G10R10UnormPack32 || format == vk::Format::eA2R10G10B10UnormPack32 ||
            format == vk::Format::eR16G16B16A16Sfloat) {
            return surface_format;
        }
    }

    printf("Can't find our preferred formats... Falling back to first exposed format. Rendering may be incorrect.\n");
    assert(surface_formats.size() >= 1);
    return surface_formats[0];
}
Why would we prefer the non-sRGB formats? Aren't most screens expecting the sRGB format?

The answer to your question is: You generally do not prefer non-sRGB formats for the swapchain images.
Your doubts are totally justified: the code in question arguably does not show a best-practice approach. Consider Figure 1, which shows the fundamental conversion stages involved when capturing images, storing them in a low dynamic range (LDR) format, and displaying them through a computer monitor.
Figure 1: Clearly, physical radiance is in linear space (leftmost stage). When we capture an image, we often want to store it in an LDR format to save some space ("LDR image" stage). Whatever radiance is output by a monitor must again be in linear space, otherwise we would have distorted the color values (rightmost stage). Since most (all?) monitors only support LDR framebuffer formats, they apply gamma correction to the framebuffer color values before they emit the corresponding radiance (monitor stage).
So, yes, the proper approach would be to send an sRGB format to your monitor.
In fact, checking gpuinfo.org and sorting the list by COLOR_ATTACHMENT support, one can observe that sRGB formats are the most commonly supported formats. Here are the top four formats support-wise:
B8G8R8A8_SRGB: 99.88%
A8B8G8R8_SRGB_PACK32: 99.72%
R8G8B8A8_SRGB: 99.48%
B8G8R8A8_UNORM: 99.44%
So, I would say, default to an sRGB format!
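As a hedged illustration (not the official sample code), a variant of the helper that prefers sRGB swapchain formats instead might look like this, using the same vulkan.hpp types as the question:

#include <cassert>
#include <vector>
#include <vulkan/vulkan.hpp>

// Sketch: prefer sRGB surface formats, falling back to whatever the driver exposes first.
vk::SurfaceFormatKHR pick_srgb_surface_format(const std::vector<vk::SurfaceFormatKHR> &surface_formats) {
    for (const auto &surface_format : surface_formats) {
        const vk::Format format = surface_format.format;
        if (format == vk::Format::eB8G8R8A8Srgb || format == vk::Format::eR8G8B8A8Srgb ||
            format == vk::Format::eA8B8G8R8SrgbPack32) {
            return surface_format;
        }
    }
    assert(!surface_formats.empty());
    return surface_formats[0];
}

With an sRGB swapchain format, the hardware applies the sRGB encoding when your (linear) fragment shader output is written, so you do not apply a manual gamma curve on top.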
P.S.: In case you are wondering why it makes sense to use an LDR image format, have a look at Figure 2 for an explanation!
Figure 2: The top row shows evenly spaced bars w.r.t. physical light intensity. This means that the number of emitted photons increases by the same amount between each neighboring bar. The bottom row shows evenly spaced bars w.r.t. perceived light intensity; here the increase of physical light intensity follows the gamma curve shown in Figure 1. In other words, we only have to add a few photons in darker regions for a big change in perceived light intensity. For brighter regions, we have to add a lot of photons for the same relative difference in perceived light intensity. Therefore, it makes sense to store images so that relatively more bits are allocated for the darker colors and fewer bits for the brighter colors, in order to use the available bits optimally w.r.t. the human visual system.

Related

How does one properly scale an XYZ color gamut bounding volume after computing it from color matching functions?

After computing the XYZ gamut bounding mesh below from spectral samples/color matching functions, how does one scale the resulting volume for compatibility with popular color spaces such as sRGB? More specifically, the size and scale of the volume depend on the number of samples and the integral approximation method used to compute it. How, then, can one determine the right values to scale such volumes to match known color spaces like sRGB, P3-Display, NTSC, PAL, etc.?
It seemed like fitting the whole volume so that Y ranges from [0, 1] would work, but it had several problems:
When compared to a sub-volume generated by converting the sRGB color cube to XYZ space, the result protruded outside of the 'full gamut'.
Converting random XYZ values from the full gamut volume to sRGB and back, the final XYZ doesn't match the initial one.
Most (all?) standardized color spaces derive from CIE XYZ, so each must have some kind of function or transformation to and from the full XYZ Gamut, or at least each must have some unique parameters for a general function.
How does one determine the correct function and its parameters?
Short answer
If I understand your question, what you are trying to accomplish is determining the sRGB gamut limits (boundary) relative to the XYZ space you have constructed.
Longer answer
I am assuming you are NOT trying to accomplish gamut mapping. That is non-trivial, and there are multiple methods (perceptual, absolute, relative, etc.). I'm going to set gamut mapping aside, and instead focus on determining how some arbitrary color space fits inside your XYZ volume.
First to answer your granular questions:
After computing the XYZ gamut bounding mesh below from spectral samples, how does one scale the volume for compatibility with popular color spaces such as sRGB?
What spectral samples? From a spectrophotometer reading a test print under a given standard illuminant? Or where did they come from? A color matching experiment?
The math is a matter of integrating the spectral data to form the XYZ space, which you apparently have done. What illuminant (white point)?
It seemed like fitting the whole volume so that Y ranges from [0, 1] would work, but it had several problems:
Whole volume of what? The sRGB space? How did you convert the sRGB data to XYZ? Or is this really the question you are asking?
What are the proper scaling constants?
They depend on the spectral data and the adapted white point for the spectral data. sRGB is D65. Most printing is done using D50.
Does each color space have its own ranges for x, y, and z values? How can I determine them?
YES.
Every color space has a different transformation matrix depending on the coordinates of the R G and B primaries. The primaries can be imaginary, such as in ProPhoto.
Some Things
The math you are looking for can be found at brucelindbloom.com. You might also want to check out Thomas Mansencal's ColorScience, a Python library that's the Swiss-army-knife of color.
sRGB
XYZ is a linear light space, wherein Y = 0.2 to Y = 0.4 is a doubling of luminance.
sRGB is not a linear space, there is a gamma curve or tone response curve on sRGB data, such that rgb(20,20,20) to rgb(40,40,40) is NOT a doubling of luminance.
The first thing that needs to be done is to linearize the sRGB color data.
Then take the linear RGB and run it through the appropriate matrix. If the XYZ data is relative to a different adapting white point, then you need to do something like a Bradford transform to convert to the appropriate one for your XYZ space.
The Bruce Lindbloom site has some ready-to-go matrices for a couple of common situations.
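To make those two steps concrete, here is a minimal sketch of the sRGB-to-XYZ direction, assuming 8-bit sRGB input already scaled to [0, 1] and the D65-relative matrix published on Bruce Lindbloom's site:

#include <cmath>

// Undo the sRGB transfer curve to get linear light (input s in [0, 1]).
double srgb_to_linear(double s) {
    return (s <= 0.04045) ? s / 12.92 : std::pow((s + 0.055) / 1.055, 2.4);
}

// Linear sRGB -> XYZ, D65 white point (matrix values from brucelindbloom.com).
void linear_srgb_to_xyz(double r, double g, double b, double &X, double &Y, double &Z) {
    X = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b;
    Y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b;
    Z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b;
}

With both steps in place (and matching white points), the converted sRGB cube should sit inside your XYZ volume rather than protruding from it.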
The problem you are describing can be caused by either (or both) failing to linearize the sRGB data and/or not adapting the white point. And... possibly other factors.
If you can answer my questions regarding the source of the spectral data I can better assist.
Further research and experimentation suggested that the XYZ volume should be scaled such that { max(X), max(Y), max(Z) } equals the illuminant of the working space. In the case of sRGB, that illuminant (also called the white point) is D65.
Results look convincing, but expert confirmation would still be appreciated.
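For reference, the D65 white point is approximately X = 0.95047, Y = 1.00000, Z = 1.08883 (with Y normalized to 1), so one hedged way to express the per-axis scaling described above is (white_X/Y/Z being the white of your computed volume):

// D65 white point in XYZ, Y normalized to 1.0.
const double D65[3] = { 0.95047, 1.00000, 1.08883 };

// Hypothetical helper: rescale a gamut-volume vertex so the volume's white maps onto D65.
void scale_vertex_to_d65(double &X, double &Y, double &Z,
                         double white_X, double white_Y, double white_Z) {
    X *= D65[0] / white_X;
    Y *= D65[1] / white_Y;
    Z *= D65[2] / white_Z;
}

Whether per-axis scaling is exactly right depends on how the volume was integrated, so treat this as a starting point rather than a definitive answer.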

Color management - what exactly does the monitor ICC profile do, and where does it sit in the color conversion chain?

I'm reading/watching anything I can about color management/color science and something that's not making sense to me is the scene-referred and display-referred workflows. Isn't everything display-referred, because your monitor is converting everything you see into something it can display?
While reading this article, I came across this image:
So, if I understand this right, to follow a linear workflow I should apply an inverse power function to any imported jpg/png/etc. files that contain color data, to make their gamma linear. I then work on the image, and when I'm ready to export, say to sRGB saved as a PNG, the export bakes the original transfer function back in.
But even while it's linear and I'm working on it, isn't my monitor converting everything I see into what it can display? Isn't it basically applying its own LUT? Isn't there already a gamma curve that the monitor itself is applying?
Also, from input to output, how many color space conversions take place, say if I'm working in the ACEScg color space? If I import a jpg texture, I linearize it and bring it into the ACEScg color space. I work on it, and when I render it out, the renderer applies a view transform to convert it from ACEScg to sRGB, and then what I'm seeing is my monitor converting that from sRGB to my monitor's own ICC profile, right (which is always happening, since everything I see goes through my monitor's ICC profile)?
Finally, if I add a tone-mapping s curve, where does that conversion sit on that image?
I'm not sure your question is about programming, and it doesn't have much relevance to the title.
In any case:
Light (photons) behaves linearly: the intensity of two lights together is the sum of the intensity of each light. For this reason a lot of image manipulation is done in linear space. Note: camera sensors often have a near-linear response.
Eyes respond roughly like a gamma exponent of 2, so gamma encoding is useful for compression (less visible noise with fewer bits). By coincidence, CRT phosphors had a similar response (otherwise the engineers would have found some other method: in the past such things were worked out through a lot of experiments and user feedback, across many settings).
Screens expect images with a standardized gamma correction (nowadays it depends on the port, settings, and image format). Some may be able to cope with many different colour spaces. Note: now that CRTs are gone, the screen converts data from the expected gamma to the monitor's own gamma (possibly with a different value for each channel), so it applies a sort of LUT (it may be done purely electronically, so without the T for table). Screens are set up so that a standard signal produces the expected light. There are standards (test images and methods) to measure the expected behaviour, so there is some implicit gamma correction of the already gamma-corrected values. It was always like this: on old monitors/TVs, technicians had internal knobs to regulate individual colours, general settings, etc.
Note: professionals outside computer graphics often use the term opto-electronic transfer function (OETF) for the camera side (light to signal) and electro-optical transfer function (EOTF) for the inverse, converting an (electrical) signal to light, e.g. in the screen. I find these names show quickly what "gamma" really is: just a conversion between an analogue electrical signal and light intensity.
The input image has its own colour space. You assume a JPEG here, but often you have much more information (RAW or log, S-Log, ...). So you convert it to your working colour space (which may be linear, as in our example). If you display the working image directly, you will get distorted colours; you may not even be able to display it, because you will probably use more than 8 bits per channel (16 or 32 bits are common, often as half-float or single-precision float).
In short, you can calibrate the monitor in two ways. The best way (if you have a monitor that can be "hardware calibrated") is to modify the tables inside the monitor itself, so it is nearly transparent: the internal gamma function is simply adapted to give better colours, and you still get an ICC profile, but for other reasons. Otherwise you use the easier calibration, where the bytes of an image are transformed on your computer to get better colours (in a program, or nowadays often by the operating system, either directly or by telling the video card to do it). You should carefully check that only one component is doing colour correction.
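As a toy illustration of that "sort of LUT" idea: a per-channel table the OS or video card might apply in the easy (software) calibration case. The 2.2 and 2.4 used below are made-up example gammas, not real profile data.

#include <cmath>
#include <cstdint>
#include <vector>

// Build a 256-entry lookup table that re-encodes values expected at source_gamma
// into values suited to a panel whose response is panel_gamma.
std::vector<uint8_t> build_gamma_lut(double source_gamma, double panel_gamma) {
    std::vector<uint8_t> lut(256);
    for (int i = 0; i < 256; ++i) {
        double linear  = std::pow(i / 255.0, source_gamma);   // undo the expected encoding
        double encoded = std::pow(linear, 1.0 / panel_gamma); // re-encode for the panel
        lut[i] = static_cast<uint8_t>(std::lround(encoded * 255.0));
    }
    return lut;
}

// Example: content encoded for gamma 2.2, panel that behaves like gamma 2.4.
// std::vector<uint8_t> lut = build_gamma_lut(2.2, 2.4);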
Note: in your program you would save the image as sRGB (or AdobeRGB), i.e. with standard ICC profiles, and practically never with your screen's ICC profile, for consistency with other images. It is then the OS, soft-proofing, etc. that converts for your screen; if the image did carry your screen's ICC profile, the OS colour management would simply see that the image-profile-to-output-profile conversion is trivial (just copying the values).
So take into account that at every step there is an expected colour space and gamma. All programs rely on this, and a later step may change it. There may be some unnecessary calculation, but it makes things simpler: you do not have to track every expectation yourself.
And there are many more details. The ICC profile is also used to characterize your monitor (i.e. the gamut it is capable of), which can be used for some colour management tasks. Rendering intents are just the method by which colour correction is done when the image has out-of-gamut colours: either keep the nearest colour (you lose shades but gain accuracy) or scale all colours (and expect your eyes to adapt, which they do if you look at just one image at a time). The devil is in such details.

CIE-L*u*v* color interpolation

I'm writing a vertex decimator that needs to interpolate vertex colors on a mesh. I'm reading Level of Detail for 3D Graphics for domain material. In the color interpolation section, the book goes on to suggest using the CIE-L*u*v* color space to perform perceptually linear interpolation of colors.
The translation equations to and from the CIE XYZ color space are provided. I am able to implement the equations it provides, but Wikipedia leaves out numeric values of the following variables: u'n, v'n, and Yn.
The article says these values depend on a "specified white point" and its "luminance". It suggests u'n = 0.2009 and v'n = 0.4610 when using the 2° observer and standard illuminant C. If I am using these, what would Yn be? I do not know enough physics to figure this out, and I have been unable to find an answer on Google.
In the end, my question boils down to: What are satisfactory/appropriate values I can use for u'n, v'n, and Yn?
Also, I'm assuming I simply interpolate each component of CIE-L*u*v* (L*, u*, and v*) linearly when interpolating values in this color space. Is this correct?
These three values are left out because they depend on the colorspace of the specific device (e.g. display, printer or camera). Since computer screens use an RGB colorspace where perceived grey is R=G=B, you can assume the values are not device dependent. I can't remember the values off by heart, so I'll edit them in later.
The human eye perceives luminance/intensity logarithmically, however, a linear interpolation is close enough, especially since you don't know what the actual min and max screen levels are.
The human eye perceives the color angle linearly; however, you need to take into account that the angle is cyclic, therefore the interpolation of the min and max angles should equal the min (or max) and not the halfway point. E.g. the average of purple and red should be purple.
I think that the perception of saturation is also logarithmic, however, can be approximated by a linear interpolation.
Edit:
It seems like most sites use the sRGB to XYZ formulas.
http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html
http://www.easyrgb.com/index.php?X=MATH&H=02#text2
http://colormine.org/convert/rgb-to-xyz
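To put some numbers to it: a minimal sketch of componentwise interpolation in L*u*v*, assuming the D65/2° white point that sRGB displays use, for which u'n ≈ 0.1978 and v'n ≈ 0.4683; Yn is simply the luminance of that white, i.e. 1.0 if your Y is normalized to [0, 1] (or 100 on a 0 to 100 scale):

// Reference white, assumed here to be D65 with the 2° standard observer.
const float un_prime = 0.1978f;   // u'n
const float vn_prime = 0.4683f;   // v'n
const float Yn       = 1.0f;      // use 100.0f if your Y values run 0..100

struct Luv { float L, u, v; };

// Componentwise linear interpolation in CIE L*u*v*, as the question proposes.
Luv lerp_luv(const Luv &a, const Luv &b, float t) {
    return { a.L + t * (b.L - a.L),
             a.u + t * (b.u - a.u),
             a.v + t * (b.v - a.v) };
}

The cyclic-hue caveat above only matters if you convert to a polar form such as LCh(uv); the Cartesian u* and v* components can be interpolated directly.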

How do I deal with color spaces and gamma in the PNG file format?

Here's my problem:
I'm doing some rendering using spectral samples, and want to save an image showing the results. I am weighting my spectral power function by the CIE XYZ color matching functions to obtain an XYZ color space result. I multiply this XYZ color tuple by the matrix given by this page for converting to sRGB, and clamp the results to (0,1).
To save the image, I scale the converted tuple by 255 and cast it to bytes, and pass the array to libpng's png_write_image(). When I view a uniform intensity, pure-color spectrum rendered this way, it looks wrong; there are dark bands in the transitions between the colors. This is perhaps not surprising, because to convert from XYZ to sRGB, the color components must be raised to 2.4 after the matrix multiply (or linearly scaled if they are small enough). But if I do this, it looks worse! Only after raising to 1/2.2 does it start to look right. It seems like, in the absence of me doing anything, the decoded images are having a gamma of ~2.2 applied twice.
Here's the way I would expect it to work: I apply the matrix to XYZ, and I have a roughly energy-linear RGB tuple. I raise this to 2.2, and now have a perceptually linear color tuple. I encode these numbers as they are (thus making efficient use of the file precision), and store a field in the file that says "these bytes have been encoded with gamma 2.2". Then at image load time, the decoding system un-applies the encoded gamma, then applies the system gamma before display. (And thus from an authoring perspective, I shouldn't have to care what the viewer's system gamma is). But the results I'm getting suggest it doesn't work this way.
Worse, I have tried calling png_set_gAMA() with both 2.2 and 1/2.2 and see no difference in the image. I get similar results with png_set_sRGB() (which I believe should force the gamma to 1/2.2).
There must be something I have backwards or don't understand with regards to either how I should be converting my color values, or how PNG handles gamma and color spaces. To break this down into a few clarifying questions:
What is the color space of the byte values I am expected to pass to write_png()?
What calls, if any, must I make to libpng in order to specify the color space and gamma of the passed bytes, to ensure proper display? Why might they fail?
How does the gamma field in the PNG file relate to the exponent I have applied to the passed color channel values, if any?
If I am expected to invert a gamma curve before sending my image data (which I doubt, but seems necessary now), should that inversion include the linear part of the sRGB curve?
Furthermore, I see hints that the "white point" matters in the conversion between XYZ and sRGB. It is unclear to me whether the matrices on the site given above include a renormalization to D65 (it does not match Wikipedia's matrix), or even when such a conversion is necessary. Most of the literature I've found glosses over the details. Is there yet another step in the conversion not mentioned in the wiki article, or will this be handled automatically?
It is pretty much the way you expected. png_set_gAMA() causes libpng to write a gAMA chunk in the output PNG file. It doesn't change the pixels themselves. A png-compliant viewer is supposed to use the gamma value from the chunk, combined with the gamma of the display, to write the pixel intensity values properly on the display. Most decoders won't actually do the two-step (unapply the image gamma, then apply the system gamma) method you described, although the result is conceptually the same: It will combine the image gamma with the system gamma to create a lookup table, then use that table to convert the pixels in one step.
From what you observed (gamma=2.2 and gamma=1/2.2 behaving the same), it appears that you are using a viewer that doesn't do anything with the PNG gAMA chunk data.
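For reference, a minimal sketch of how those chunks are declared when writing (assuming png_ptr and info_ptr are the write and info structs you already created; these calls only add metadata and never touch your pixel bytes):

#include <png.h>

// Declare the encoding of the pixels you will pass to png_write_image().
// gAMA stores the *encoding* exponent, so sRGB-like data is ~1/2.2.
png_set_gAMA(png_ptr, info_ptr, 1.0 / 2.2);

// Or simply mark the data as sRGB (which implies the gamma above for compliant viewers).
png_set_sRGB(png_ptr, info_ptr, PNG_sRGB_INTENT_PERCEPTUAL);

libpng also provides png_set_sRGB_gAMA_and_cHRM(), which writes the sRGB, gAMA and cHRM chunks consistently in one call.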
You said:
because to convert from XYZ to sRGB, the color components must be raised to 2.4 after the matrix multiply...
No, this is incorrect. Going from linear (XYZ) to sRGB, you do NOT raise to 2.4 nor 2.2, that is for going FROM sRGB to linear.
Going from linear to sRGB you raise to ^(1/2.2), or, if using the sRGB piecewise curve, you'll see 1/2.4; the effective gamma you are applying is about ^0.45455.
On the wikipedia page you linked, this is the FORWARD transformation.
From XYZ to sRGB: apply the correct matrix first (assuming everything is in D65), then apply the piecewise sRGB encoding curve.
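A hedged sketch of that direction, using the D65 XYZ-to-linear-sRGB matrix published on Bruce Lindbloom's site; clamp and quantize to bytes afterwards as you already do:

#include <algorithm>
#include <cmath>

// XYZ (D65) -> linear sRGB (matrix values from brucelindbloom.com).
void xyz_to_linear_srgb(double X, double Y, double Z, double &r, double &g, double &b) {
    r =  3.2404542 * X - 1.5371385 * Y - 0.4985314 * Z;
    g = -0.9692660 * X + 1.8760108 * Y + 0.0415560 * Z;
    b =  0.0556434 * X - 0.2040259 * Y + 1.0572252 * Z;
}

// Linear -> sRGB-encoded: the forward (encoding) direction discussed above.
double linear_to_srgb(double c) {
    c = std::clamp(c, 0.0, 1.0);
    return (c <= 0.0031308) ? 12.92 * c : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
}

This piecewise form also includes the linear toe near black that the question asks about, so no separate handling is needed.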
Straight Talk about Linear
Light in the real world is linear. If you triple 100 photons, you then have 300 photons. But the human eye does not see a tripling; we see only a modest increase by comparison.
This is in part why transfer curves or "gamma" are used, to make the most of the available code space in an 8-bit image (an oversimplification on my part, I know).
To do this, a linear light value is raised to the power of ~0.455, and to get that sRGB value back to linear space we raise it to the inverse, i.e. ^(1/0.455), otherwise known as ^2.2.
The matrices must be applied in linear space, but after going through the matrix you need to apply the TRC or "gamma" encoding. Based on your statements: no, things are not having 2.2 applied twice, you are simply going the wrong way.
You wrote: " It seems like, in the absence of me doing anything, the decoded images are having a gamma of ~2.2 applied twice. "
I think your monitor (hardware, or your system's ICC profile) already has a gamma setting of its own.

What are the practical differences when working with colors in a linear vs. a non-linear RGB space?

What is the basic property of a linear RGB space and what is the fundamental property of a non-linear one? When talking about the values inside each channel in those 8 (or more) bits, what changes?
In OpenGL, colors are 3+1 values, and by this I mean RGB+alpha, with 8 bits reserved for each channel; this is the part that I get clearly.
But when it comes to gamma correction I don't get what the effect of working in a non-linear RGB space is.
Since I know how to use a curve in graphics software for photo editing, my explanation is that in a linear RGB space you take the values as they are, with no manipulation and no math function attached, whereas when it's non-linear each channel usually evolves following a classic power-function behaviour.
Even if I take this explanation as the real one, I still don't get what a real linear space is, because after computation all non-linear RGB spaces become linear, and most important of all, I don't get the part where a non-linear color space is more suitable for the human eye, because in the end all RGB spaces are linear as far as I understand.
Let's say you're working with RGB colors: each color is represented with three intensities or brightnesses. You've got to choose between "linear RGB" and "sRGB". For now, we'll simplify things by ignoring the three different intensities, and assume you just have one intensity: that is, you're only dealing with shades of gray.
In a linear color-space, the relationship between the numbers you store and the intensities they represent is linear. Practically, this means that if you double the number, you double the intensity (the lightness of the gray). If you want to add two intensities together (because you're computing an intensity based on the contributions of two light sources, or because you're adding a transparent object on top of an opaque object), you can do this by just adding the two numbers together. If you're doing any kind of 2D blending or 3D shading, or almost any image processing, then you want your intensities in a linear color-space, so you can just add, subtract, multiply, and divide numbers to have the same effect on the intensities. Most color-processing and rendering algorithms only give correct results with linear RGB, unless you add extra weights to everything.
That sounds really easy, but there's a problem. The human eye's sensitivity to light is finer at low intensities than high intensities. That's to say, if you make a list of all the intensities you can distinguish, there are more dark ones than light ones. To put it another way, you can tell dark shades of gray apart better than you can with light shades of gray. In particular, if you're using 8 bits to represent your intensity, and you do this in a linear color-space, you'll end up with too many light shades, and not enough dark shades. You get banding in your dark areas, while in your light areas, you're wasting bits on different shades of near-white that the user can't tell apart.
To avoid this problem, and make the best use of those 8 bits, we tend to use sRGB. The sRGB standard tells you a curve to use, to make your colors non-linear. The curve is shallower at the bottom, so you can have more dark grays, and steeper at the top, so you have fewer light grays. If you double the number, you more than double the intensity. This means that if you add sRGB colors together, you end up with a result that is lighter than it should be. These days, most monitors interpret their input colors as sRGB. So, when you're putting a color on the screen, or storing it in an 8-bit-per-channel texture, store it as sRGB, so you make the best use of those 8 bits.
You'll notice we now have a problem: we want our colors processed in linear space, but stored in sRGB. This means you end up doing sRGB-to-linear conversion on read, and linear-to-sRGB conversion on write. As we've already said that linear 8-bit intensities don't have enough darks, this would cause problems, so there's one more practical rule: don't use 8-bit linear colors if you can avoid it. It's becoming conventional to follow the rule that 8-bit colors are always sRGB, so you do your sRGB-to-linear conversion at the same time as widening your intensity from 8 to 16 bits, or from integer to floating-point; similarly, when you've finished your floating-point processing, you narrow to 8 bits at the same time as converting to sRGB. If you follow these rules, you never have to worry about gamma correction.
When you're reading an sRGB image, and you want linear intensities, apply this formula to each intensity:
float s = read_channel();
float linear;
if (s <= 0.04045) linear = s / 12.92;
else linear = pow((s + 0.055) / 1.055, 2.4);
Going the other way, when you want to write an image as sRGB, apply this formula to each linear intensity:
float linear = do_processing();
float s;
if (linear <= 0.0031308) s = linear * 12.92;
else s = 1.055 * pow(linear, 1.0/2.4) - 0.055;
In both cases, the floating-point s value ranges from 0 to 1, so if you're reading 8-bit integers you want to divide by 255 first, and if you're writing 8-bit integers you want to multiply by 255 last, the same way you usually would. That's all you need to know to work with sRGB.
Up to now, I've dealt with one intensity only, but there are cleverer things to do with colors. The human eye can tell different brightnesses apart better than different tints (more technically, it has better luminance resolution than chrominance), so you can make even better use of your 24 bits by storing the brightness separately from the tint. This is what YUV, YCrCb, etc. representations try to do. The Y channel is the overall lightness of the color, and uses more bits (or has more spatial resolution) than the other two channels. This way, you don't (always) need to apply a curve like you do with RGB intensities. YUV is a linear color-space, so if you double the number in the Y channel, you double the lightness of the color, but you can't add or multiply YUV colors together like you can with RGB colors, so it's not used for image processing, only for storage and transmission.
I think that answers your question, so I'll end with a quick historical note. Before sRGB, old CRTs used to have a non-linearity built into them. If you doubled the voltage for a pixel, you would more than double the intensity. How much more was different for each monitor, and this parameter was called the gamma. This behavior was useful because it meant you could get more darks than lights, but it also meant you couldn't tell how bright your colors would be on the user's CRT, unless you calibrated it first. Gamma correction means taking the colors you start with (probably linear) and transforming them for the gamma of the user's CRT. OpenGL comes from this era, which is why its sRGB behavior is sometimes a little confusing. But GPU vendors now tend to work with the convention I described above: when you're storing an 8-bit intensity in a texture or framebuffer, it's sRGB, and when you're processing colors, it's linear. For example, in OpenGL ES 3.0, each framebuffer and texture has an "sRGB flag" you can turn on to enable automatic conversion when reading and writing. You don't need to explicitly do sRGB conversion or gamma correction at all.
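As a small sketch of that convention (desktop GL shown; width, height, and pixels stand in for your own data, and the enums assume a GL 3.x-class context, or ES 3.0 for the texture format):

// Upload 8-bit sRGB-encoded pixels; the GPU linearizes them when the texture is sampled.
glTexImage2D(GL_TEXTURE_2D, 0, GL_SRGB8_ALPHA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, pixels);

// On desktop GL, linear shader outputs are re-encoded to sRGB on write only while this is enabled
// (OpenGL ES 3.0 instead converts automatically whenever the attachment has an sRGB format).
glEnable(GL_FRAMEBUFFER_SRGB);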
I am not a "human color detection expert", but I've run into something similar with YUV->RGB conversion. There are different weights for the R/G/B channels, so if you change the source color by x, the RGB values change by different amounts.
As said, I'm not an expert; anyway, I think that if you want to do a color-correct transformation you should do it in YUV space and then convert to RGB (or do the mathematically equivalent operation on RGB, but beware of data loss). Also, I'm not sure that YUV is the best native representation of colors, but video cameras provide that format, and that's where I ran into the issue.
Here is the magic YUV->RGB formula with secret numbers included: http://www.fourcc.org/fccyvrgb.php
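For reference, one common variant of that formula (the full-range BT.601/JFIF one; the fourcc page lists several others with different scalings) looks like this:

// Full-range YCbCr (BT.601 / JFIF) -> RGB, with Y, Cb, Cr in [0, 255].
void ycbcr_to_rgb(double Y, double Cb, double Cr, double &R, double &G, double &B) {
    R = Y + 1.402    * (Cr - 128.0);
    G = Y - 0.344136 * (Cb - 128.0) - 0.714136 * (Cr - 128.0);
    B = Y + 1.772    * (Cb - 128.0);
}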
