gnuplot scale plot function to same height

I am drawing distribution curves of three different datasets.
They have different means and standard deviations, and thus different curves. However, the plots appear different when in the same graph.
I use the normal curve function:
std_b=0.1674
mu_b=.6058
mu_j=0.8955
std_j=0.0373
mu_s=0.9330
std_s=0.0240
normal(x,mu,sd) = (1/(sd*sqrt(2*pi)))*exp(-(x-mu)**2/(2*sd**2))
plot normal(x,mu_b,std_b) w boxes title "Boolean",\
normal(x,mu_j,std_j) w boxes title "Jaccard",\
normal(x,mu_s,std_s) w boxes title "Sorensen"
However, the scale of the curves is off, as seen by the difference in the Y axis.
How can I scale each plot function, so that they are all at the same Y height?

In general, you can't.
These are probability density functions, which means that they must be positive and they must have an area of exactly 1 under the curve (the formal definition is a little more technical, but that is the statistics 101 definition). Because of that, when you make the curve less spread out (which is what the standard deviation is measuring), in order to preserve the area, you must make the peak in the middle higher.
If it helps to visualize it, think of a finite distribution in the shape of an isosceles triangle.
Both the purple and green triangles form perfectly valid probability distributions. In the case of the purple distribution, it has a base of length 10 (from 0 to 10) and a height of 1/5, giving an area of 1. If I want to make it cover a smaller range (which again is basically what the standard deviation is doing in your normal curves), I push the sides together (in this case a length of 6 - from 2 to 8), but in order to preserve the area of 1, I have to make the triangle taller (in this case a height of 1/3). If I kept the same height, I would have less than an area of 1.
In your normal distributions, the y height is controlled by the scale factor in front of your exponential functions. Getting rid of that factor, or setting it to the same value for all three curves, will make them have the same height, but they will no longer be probability distributions, as the area will not be 1. In general, for a normal distribution, the smaller the standard deviation, the taller the peak.
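To make the trade-off concrete, here is a minimal sketch in Python (not gnuplot) using the standard deviations from the question. The helper names `peak_height` and `scaled_normal` are made up for illustration. It shows that the peak is 1/(sd*sqrt(2*pi)), and that dividing each curve by its own peak forces a maximum of 1 at the cost of the unit area.

```python
import math

def normal(x, mu, sd):
    """Normal probability density, same formula as in the question."""
    return (1.0 / (sd * math.sqrt(2.0 * math.pi))) * math.exp(-(x - mu) ** 2 / (2.0 * sd ** 2))

def peak_height(sd):
    """Value of the density at x = mu; it depends only on sd."""
    return 1.0 / (sd * math.sqrt(2.0 * math.pi))

def scaled_normal(x, mu, sd):
    """Density divided by its own peak: maximum is 1, area is no longer 1."""
    return normal(x, mu, sd) / peak_height(sd)

# Standard deviations from the question.
std_b, std_j, std_s = 0.1674, 0.0373, 0.0240
for name, sd in (("Boolean", std_b), ("Jaccard", std_j), ("Sorensen", std_s)):
    print(name, "peak height:", round(peak_height(sd), 3))
```

The same idea should carry over to gnuplot by plotting `normal(x,mu,sd)/normal(mu,mu,sd)` for each curve, as long as the plot is not presented as a probability density.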

Related

Relation of luminance in RGB/XYZ color and physical luminance

Short version: When a color described in XYZ or xyY coordinates has a luminance Y=1, what are the physical units of that? Does that mean 1 candela, or 1 lumen? Is there any way to translate between this conceptual space and physical brightness?
Long version: I want to simulate how the sky looks in different directions, at different times of day, and (eventually) under different cloudiness and air pollution conditions. I've learned enough to figure out how to translate a given spectrum into a chrominance, for example xyz coordinates. But almost everything I've read on color theory in graphical display is focused on relative color, so the luminance is always 1. Non-programming color theory describes the units of luminance, so that I can translate from a spectrum in watts/square meter/steradian to candela or lumens, but nothing that describes the units of luminance in programming. What are the units of luminance in XYZ coordinates? I understand that the actual brightness of a patch would depend on monitor settings, but I'm really not finding any hints as to how to proceed.
Below is an example of what I'm coming across. The base color, at relative luminance of 1, was calculated from first principles. All the other colors are generated by increasing or decreasing the luminance. Most of them are plausible colors for mid-day sky. For the parameters I've chosen, I believe the total intensity in the visible range is 6.5 W/m2/sr = 4434 cd/m2, which seems to be in the right ballpark according to Wiki: Orders of Magnitude. Which color would I choose to represent that patch of sky?
Absent other context, luminance is usually expressed in candelas per square meter (cd/m2), and CIE XYZ's Y component is a luminance in cd/m2 if the convention used is "absolute XYZ", which is rare. (The link is to an article I wrote which contains more detailed information.) More commonly, XYZ colors are normalized such that the white point (such as the D65 or D50 white point) has Y = 1 (or Y = 100).
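As a rough illustration of the normalization, the sketch below divides an absolute luminance by the luminance chosen for the white point. The function name `relative_Y` and the idea of passing the white luminance as a parameter are assumptions for the example; what value to use for white depends on the display or scene and is not specified here.

```python
def relative_Y(luminance_cd_m2, white_cd_m2):
    """Scale an absolute luminance (cd/m2) so that the chosen white has Y = 1.

    white_cd_m2 is a hypothetical parameter: the absolute luminance that is
    taken to be "white" for the display or scene being modeled.
    """
    if white_cd_m2 <= 0:
        raise ValueError("white luminance must be positive")
    return luminance_cd_m2 / white_cd_m2
```

A relative Y above 1 simply means the patch is brighter than the chosen white and cannot be shown without rescaling.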

How to tell if an xyY color lies within the CIE 1931 gamut?

I'm trying to plot the CIE 1931 color gamut using math.
I take a xyY color with Y fixed to 1.0 then vary x and y from 0.0 to 1.0.
If I plot the resulting colors as an image (i.e., the pixel at (x,y) is my xyY color converted to RGB), I get a pretty picture with the CIE 1931 color gamut somewhere in the middle of it, like this:
xyY from 0.0 to 1.0:
Now I want the classic tongue-shaped image so my question is: How do I cull pixels outside the range of the CIE 1931 color gamut?
That is, how can I tell if my xyY color is inside/outside the CIE 1931 color range?
I happened upon this question while searching for a slightly different but related issue, and what immediately caught my eye is the rendering at the top. It's identical to the rendering I had produced a few hours earlier, and trying to figure out why it didn't make sense is, in part, what led me here.
For readers: the rendering is what results when you convert from {x ∈ [0, 1], y ∈ [0, 1], Y = 1} to XYZ, convert that color to sRGB, and then clamp the individual components to [0, 1].
At first glance, it looks OK. At second glance, it looks off... it seems less saturated than expected, and there are visible transition lines at odd angles. Upon closer inspection, it becomes clear that the primaries aren't smoothly transitioning into each other. Much of the range, for example, between red and blue is just magenta—both R and B are 100% for almost the entire distance between them. When you then add a check to skip drawing any colors that have an out-of-range component, instead of clamping, everything disappears. It's all out-of-gamut. So what's going on?
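The effect can be reproduced with a small sketch. It assumes the commonly quoted XYZ to linear sRGB (D65) matrix rounded to four decimals, skips gamma encoding, and uses made-up function names. A chromaticity that is out of range at Y = 1 becomes representable once Y is lowered.

```python
# XYZ -> linear sRGB (D65) matrix, rounded to four decimals (an assumption
# of this sketch; use your color library's matrix in real code).
XYZ_TO_LINEAR_SRGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)

def xyY_to_XYZ(x, y, Y):
    """Convert xyY to XYZ; y == 0 has no defined chromaticity, so return black."""
    if y == 0:
        return (0.0, 0.0, 0.0)
    return (x * Y / y, Y, (1.0 - x - y) * Y / y)

def XYZ_to_linear_srgb(XYZ):
    """Apply the 3x3 matrix; the result is linear light, not gamma encoded."""
    return tuple(sum(m * c for m, c in zip(row, XYZ)) for row in XYZ_TO_LINEAR_SRGB)

def in_srgb_gamut(x, y, Y):
    """True if every linear sRGB component lies in [0, 1] (no clamping)."""
    rgb = XYZ_to_linear_srgb(xyY_to_XYZ(x, y, Y))
    return all(0.0 <= c <= 1.0 for c in rgb)

print(in_srgb_gamut(0.4, 0.4, 1.0))  # red component exceeds 1 at full luminance
print(in_srgb_gamut(0.4, 0.4, 0.2))  # same chromaticity fits at lower luminance
```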
I think I've got this one small part of colorimetry at least 80% figured out, so I'm setting this out, greatly simplified, for the edification of anyone else who might find it interesting or useful. I also try to answer the question.
(⚠️ Before I begin, an important note: valid RGB display colors in the xyY space can be outside the boundary of the CIE 1931 2° Standard Observer. This isn't the case for sRGB, but it is the case for Display P3, Rec. 2020, CIE RGB, and other wide gamuts. This is because the three primaries need to add up to the white point all by themselves, and so even monochromatic primaries must be incredibly, unnaturally luminous compared to the same wavelength under equivalent illumination.)
Coloring the chromaticity diagram
The xy chromaticity diagram isn't just a slice through xyY space. It's intrinsically two dimensional. A point in the xy plane represents chromaticity apart from luminance, so to the extent that there is a color there it is to represent as best as possible only the chromaticity, not any specific color. Normally the colors seem to be the brightest, most saturated colors for that chromaticity, or whatever's closest in the display's color space, but that's an arbitrary design decision.
Which is to say: to the extent that there are illustrative colors drawn they're necessarily fictitious, in much the same way that coloring an electoral map is purely a matter of data visualization: a convenience to aid comprehension. It's just that, in this case, we're using colors to visualize one aspect of colorimetry, so it's super easy to conflate the two things.
(Image credit: Michael Horvath)
The falsity, and necessity thereof, of the colors becomes obvious when we consider the full 3D shape of the visible spectrum in the xyY space. The classic spectral locus ("horse shoe") can easily be seen to be the base of a quasi-Gibraltian volume, widest at the spectral locus and narrowing to a summit (the white point) at {Y = 1}. If viewed as a top-down projection, then colors located on and near the spectral locus would be very dark (although still the brightest possible color for that chromaticity), and would grow increasingly luminous towards the center. If viewed as a slice of the xyY volume, through a particular value of Y, the colors would be equally luminous but would grow brighter overall and the shape of the boundary would shrink, again unevenly, with increasing Y, until it disappeared entirely. So far as I can tell, neither of these possibilities see much, if any, practical use, interesting though they may be.
Instead, the diagram is colored inside out: the gamut being plotted is colored with maximum intensities (each primary at its brightest, and then linear mixtures in the interior) and out-of-gamut colors are projected from the inner gamut triangle to the spectral locus. This is annoying because you can't simply use a matrix transformation to turn a point on the xy plane into a sensible color, but in terms of actually communicating useful and somewhat accurate information it seems, unfortunately, to be unavoidable.
(To clarify: it is actually possible to move a single chromaticity point into the sRGB space, and color the chromaticity diagram pixel-by-pixel with the most brightly saturated sRGB colors possible—it's just more complicated than a simple matrix transformation. To do so, first move the three-coordinate xyz chromaticity into sRGB. Then clamp any negative values to 0. Finally, scale the components uniformly such that the maximum component value is 1. Be aware this can be much slower than plotting the whitepoint and the primaries and then interpolating between them, depending on your rendering method and the efficiency of your data representations and their operations.)
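The three steps in that note (convert, clamp negatives, rescale by the largest component) can be sketched as follows. The matrix is the commonly quoted rounded XYZ to linear sRGB matrix, the function name is hypothetical, and the result is linear RGB that would still need gamma encoding before display.

```python
# XYZ -> linear sRGB (D65) matrix, rounded to four decimals (assumed here).
XYZ_TO_LINEAR_SRGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)

def brightest_srgb_for_chromaticity(x, y):
    """Illustrative color for an (x, y) chromaticity, as linear sRGB in [0, 1]."""
    xyz = (x, y, 1.0 - x - y)  # three-coordinate chromaticity
    rgb = [sum(m * c for m, c in zip(row, xyz)) for row in XYZ_TO_LINEAR_SRGB]
    rgb = [max(0.0, c) for c in rgb]  # clamp negative components to 0
    largest = max(rgb)
    if largest <= 0.0:
        return (0.0, 0.0, 0.0)  # nothing representable; fall back to black
    return tuple(c / largest for c in rgb)  # scale so the maximum is 1

print(brightest_srgb_for_chromaticity(0.4, 0.4))
print(brightest_srgb_for_chromaticity(0.1, 0.8))  # outside sRGB: collapses to pure green
```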
Drawing the spectral locus
The most straightforward way to get the characteristic horseshoe shape is just to use a table of the empirical data.
(http://cvrl.ioo.ucl.ac.uk/index.htm, scroll down for the "historical" datasets that will most closely match other sources intended for the layperson. Their too-clever icon scheme for selecting data is that a dotted-line icon is for data sampled at 5nm, a solid line icon is for data sampled at 1nm.)
Construct a path with the points as vertices (you might want to trim some off the top; I cut it back to 700nm, the CIE RGB red primary), and use the resulting shape as a mask. With 1nm samples, a polyline should be smooth enough for nearly any resolution: there's no need for fitting Bézier curves or whatnot.
(Note: only every 5th point shown for illustrative purposes.)
If all we want to do is draw the standard horse shoe bounded by the triangle {x = 0, y = 0}, {0, 1}, and {1, 0} then that should suffice. Note that we can save rendering time by skipping any coordinates where x + y >= 1. If we want to do more complex things, like plot the changing boundary for different Y values, then we're talking about the color matching functions that define the XYZ space.
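If your drawing API has no path mask, the same cull can be done per pixel with a point-in-polygon test. This is a minimal ray-casting sketch; `vertices` would be the (x, y) spectral locus points computed from the table, but the demo below uses the bounding triangle from the text as a stand-in so that no made-up locus data is needed.

```python
def point_in_polygon(px, py, vertices):
    """Even-odd ray casting: count edges crossed by a ray going right from (px, py)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # closing edge wraps back to the first vertex
        if (y1 > py) != (y2 > py):  # edge straddles the horizontal line through py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

# Stand-in polygon: the triangle {0, 0}, {1, 0}, {0, 1} that bounds all chromaticities.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(point_in_polygon(0.2, 0.2, triangle))
print(point_in_polygon(0.8, 0.8, triangle))
```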
Color matching functions
(Image credit: User:Acdx - Own work, CC BY-SA 4.0)
The ground truth for the XYZ space is in the form of three functions that map spectral power distributions to {X, Y, Z} tristimulus values. A lot of data and calculations went into constructing the XYZ space, but it all gets baked into these three functions, which uniquely determine the {X, Y, Z} values for a given spectrum of light. In effect, what the functions do is define 3 imaginary primary colors, which can't be created with any actual light spectrum, but can be mixed together to create perceptible colors. Because they can be mixed, every non-negative point in the XYZ space is meaningful mathematically, but not every point corresponds to a real color.
The functions themselves are actually defined as lookup tables, not equations that can be calculated exactly. The Munsell Color Science Laboratory (https://www.rit.edu/science/munsell-color-lab) provides 1nm resolution samples: scroll down to "Useful Color Data" under "Educational Resources." Unfortunately, it's in Excel format. Other sources might provide 5nm data, and anything more precise than 1nm is probably a modern reconstruction which might not commute with the 1931 space.
(For interest: this paper—http://jcgt.org/published/0002/02/01/—provides analytic approximations with error within the variability of the original human subject data, but they're mostly intended for specific use cases. For our purposes, it's preferable, and simpler, to stick with the empirically sampled data.)
The functions are referred to as x̅, y̅, and z̅ (or x bar, y bar, and z bar.) Collectively, they're known as the CIE 1931 2 Degree Standard Observer. There's a separate 1964 standard observer constructed from a wider 10 degree field-of-view, with minor differences, which can be used instead of the 1931 standard observer, but which arguably creates a different color space. (The 1964 standard observer shouldn't be confused with the separate CIE 1964 color space.)
To calculate the tristimulus values, you take the inner product of (1) the spectrum of the color and (2) the color matching function. This just means that every point (or sample) in the spectrum is multiplied by the corresponding point (or sample) in the color matching function, which serves to reweight the data. Then, you take the integral (or summation, more accurately, since we're dealing with discrete samples) over the whole range of visible light ([360nm, 830nm].) The functions are normalized so that they have equal area under their curves, so an equal energy spectrum (the sampled value for every wavelength is the same) will have {X = Y = Z}. (FWIW, the Munsell Color Lab data are properly normalized, but they sum to 106 and change, for some reason.)
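In code, the weighted sum might look like the sketch below. The function name and the `step_nm` parameter are assumptions; the three tables are whatever color matching function samples you loaded, aligned sample-for-sample with the spectrum.

```python
def tristimulus(spectrum, xbar, ybar, zbar, step_nm=1.0):
    """Sum spectrum * color matching function over all samples.

    All four sequences must cover the same wavelengths at the same spacing.
    step_nm is the sample spacing, used as the width of each summation bin.
    """
    if not (len(spectrum) == len(xbar) == len(ybar) == len(zbar)):
        raise ValueError("spectrum and color matching tables must have equal length")
    X = sum(s * c for s, c in zip(spectrum, xbar)) * step_nm
    Y = sum(s * c for s, c in zip(spectrum, ybar)) * step_nm
    Z = sum(s * c for s, c in zip(spectrum, zbar)) * step_nm
    return (X, Y, Z)
```

With an equal energy spectrum and properly normalized tables, the three results should come out equal, which is a handy sanity check on the data you loaded.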
Taking another look at that 3D plot of the xyY space, we notice again that the familiar spectral locus shape seems to be the shape of the volume at {Y = 0}, i.e. where those colors are actually black. This now makes some sort of sense, since they are monochromatic colors, and their spectrums should consist of a single point, and thus when you take the integral over a single point you'll always get 0. However, that then raises the question: how do they have chromaticity at all, since the other two functions should also be 0?
The simplest explanation is that Y at the base of the shape is actually ever-so-slightly greater than zero. The use of sampling means that the spectrums for the monochromatic sources are not taken to be instantaneous values. Instead, they're narrow bands of the spectrum near their wavelengths. You can get arbitrarily close to instantaneous and still expect meaningful chromaticity, within the bounds of precision, so the limit as the sampling bandwidth goes to 0 is the ideal spectral locus, even if it disappears at exactly 0. However, the spectral locus as actually derived is just calculated from the single-sample values for the x̅, y̅, and z̅ color matching functions.
That means that you really just need one set of data—the lookup tables for x̅, y̅, and z̅. The spectral locus can be computed from each wavelength by just dividing x̅(wl) and y̅(wl) by x̅(wl) + y̅(wl) + z̅(wl).
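As a sketch, with the three tables passed in as plain lists (the function name is made up for illustration):

```python
def spectral_locus(xbar, ybar, zbar):
    """Return the (x, y) chromaticity of each wavelength sample.

    Samples where all three functions are zero have no defined chromaticity
    and are skipped.
    """
    points = []
    for xb, yb, zb in zip(xbar, ybar, zbar):
        total = xb + yb + zb
        if total > 0:
            points.append((xb / total, yb / total))
    return points
```

The resulting list can be used directly as the polygon vertices for the mask described earlier.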
(Image credit: Apple, screenshot from ColorSync Utility)
Sometimes you'll see a plot like this, with a dramatically arcing, rainbow-colored line swooping up and around the plot, and then back down to 0 at the far red end of the spectrum. This is just the y̅ function plotted along the spectral locus, scaled so that y̅ = Y. Note that this is not a contour of the 3D shape of the visible gamut. Such a contour would be well inside the spectral locus through the blue-green range, when plotted in 2 dimensions.
Delineating the visible spectrum in XYZ space
The final question becomes: given these three color matching functions, how do we use them to decide if a given {X, Y, Z} is within the gamut of human color perception?
Useful fact: you can't have luminosity by itself. Any real color will also have a non-zero value for one or both of the other functions. We also know Y by definition has a range of [0, 1], so we're really only talking about figuring whether {X, Z} is valid for a given Y.
Now the question becomes: what spectrums (simplified for our purposes: an array of 471 values, either 0 or 1, for the wavelengths [360nm, 830nm], band width 1nm), when weighted by y̅, will sum to Y?
The XYZ space is additive, like RGB, so any non-monochromatic light is equivalent to a linear combination of monochromatic colors at various intensities. In other words, any point inside of the spectral locus can be created by some combination of points situated exactly on the boundary. If you took the monochromatic CIE RGB primaries and just added up their tristimulus values, you'd get white, and the spectrum of that white would just be the spectrum of the three primaries superimposed, a thin band at the wavelength for each primary.
It follows, then, that every possible combination of monochromatic colors is within the gamut of human vision. However, there's a ton of overlap: different spectrums can produce the same perceived color. This is called metamerism. So, while it might be impractical to enumerate every possible individually perceptible color or spectrums that can produce them, it's actually relatively easy to calculate the overall shape of the space from a trivially enumerable set of spectrums.
What we do is step through the gamut wavelength-by-wavelength, and, for that given wavelength, we iteratively sum ever-larger slices of the spectrum starting from that point, until we either hit our Y target or run out of spectrum. You can picture this as going around a circle, drawing progressively larger arcs from one starting point and plotting the center of the resulting shape—when you get to an arc that is just the full circle, the centers coincide, and you get white, but until then the points you plot will spiral inward from the edge. Repeat that from every point on the circumference, and you'll have points spiraling in along every possible path, covering the gamut. You can actually see this spiraling in effect, sometimes, in 3D color space plots.
In practice, this takes the form of two loops, the outer loop going from 360 to 830, and the inner loop going from 1 to 470. In my implementation, what I did for the inner loop is save the current and last summed values, and once the sum exceeds the target I use the difference to calculate a fractional number of bands and push the outer loop's counter and that interpolated width onto an array, then break out of the inner loop. Interpolating the bands greatly smooths out the curves, especially in the prow.
Once we have the set of spectrums of the right luminance, we can calculate their X and Z values. For that, I have a higher order summation function that gets passed the function to sum and the interval. From there, the shape of the gamut on the chromaticity diagram for that Y is just the path formed by the derived {x, y} coordinates, as this method only enumerates the surface of the gamut, without interior points.
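Here is a rough reconstruction of that two-loop procedure, not my original implementation. It assumes the three tables are scaled by the same factor so that `sum(ybar) == 1`, and that bands wrap around the end of the table (following the circle picture above); all function names are invented for the sketch.

```python
def bands_for_luminance(ybar, target_Y):
    """For each start sample, find the fractional band width whose ybar sum reaches target_Y."""
    n = len(ybar)
    bands = []
    for start in range(n):
        total = 0.0
        for width in range(1, n + 1):
            sample = ybar[(start + width - 1) % n]  # assumption: bands wrap around
            previous = total
            total += sample
            if total >= target_Y:
                # Interpolate how much of the last sample is actually needed.
                fraction = (target_Y - previous) / sample if sample > 0 else 0.0
                bands.append((start, (width - 1) + fraction))
                break
    return bands

def band_sum(cmf, start, width):
    """Sum a color matching function over a fractional band, with the same wrapping."""
    n = len(cmf)
    whole = int(width)
    total = sum(cmf[(start + k) % n] for k in range(whole))
    total += (width - whole) * cmf[(start + whole) % n]
    return total

def contour_xy(xbar, ybar, zbar, target_Y):
    """Chromaticity points on the gamut boundary for one luminance level."""
    points = []
    for start, width in bands_for_luminance(ybar, target_Y):
        X = band_sum(xbar, start, width)
        Z = band_sum(zbar, start, width)
        total = X + target_Y + Z
        if total > 0:
            points.append((X / total, target_Y / total))
    return points
```

Calling `contour_xy` for a range of Y values gives the stack of contours; starts that never reach the target are simply left out.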
In effect, this is a simpler version of what libraries like the one mentioned in the accepted answer do: they create a 3D mesh via exhaustion of the continuous spectrum space and then interpolate between points to decide if an exact color is inside or outside the gamut. Yes, it's a pretty brute-force method, but it's simple, speedy, and effective enough for demonstrative and visualization purposes. Rendering a 20-step contour plot of the overall shape of the chromaticity space in a browser is effectively instantaneous, for instance, with nearly perfect curves.
There are a couple of places where a lack of precision can't be entirely smoothed over: in particular, two corners near orange are clipped. This is due to the shapes of the lines of partial sums in this region being a combination of (1) almost perfectly horizontal and (2) having a hard cusp at the corner. Since the points exactly at the cusp aren't at nice even values of Y, the flatness of the contours is more of a problem because they're perpendicular to the mostly-vertical line of the cusp, so interpolating points to fit any given Y will be at its worst in this region. Another problem is that the points aren't uniformly distributed, being concentrated very near to the cusp: the clipping of the corner corresponds to situations where an outlying point is interpolated. All these issues can clearly be seen in this plot (rendered with 20nm bins for clarity but, again, more precision doesn't eliminate the issue):
Conclusion
Of course, this is the sort of highly technical and pitfall-prone problem (PPP) that is often best outsourced to a quality 3rd party library. Knowing the basic techniques and science behind it, however, demystifies the entire process and helps us use those libraries effectively, and adapt our solutions as needs change.
You could use Colour and the colour.is_within_visible_spectrum definition:
>>> import numpy as np
>>> import colour
>>> colour.is_within_visible_spectrum(np.array([0.3205, 0.4131, 0.51]))
array(True, dtype=bool)
>>> a = np.array([[0.3205, 0.4131, 0.51],
...               [-0.0005, 0.0031, 0.001]])
>>> colour.is_within_visible_spectrum(a)
array([ True, False], dtype=bool)
Note that this definition expects CIE XYZ tristimulus values, so you would have to convert your CIE xyY colourspace values to XYZ by using colour.xyY_to_XYZ definition.

texture mapping (u,v) values

Here is an excerpt from Peter Shirley's Fundamentals of Computer Graphics:
11.1.2 Texture Arrays
We will assume the two dimensions to be mapped are called u and v.
We also assume we have an nx and ny image that we use as the texture.
Somehow we need every (u,v) to have an associated color found from the
image. A fairly standard way to make texturing work for (u,v) is to
first remove the integer portion of (u,v) so that it lies in the unit
square. This has the effect of "tiling" the entire uv plane with
copies of the now-square texture. We then use one of the three
interpolation strategies to compute the image color for the
coordinates.
My question is: What is the integer portion of (u,v)? I thought u,v are 0 <= u,v <= 1.0. If there is an integer portion, shouldn't we be dividing u,v by the texture image width and height to get the normalized u,v values?
UV values can be less than 0 or greater than 1. The reason for dropping the integer portion is that UV values use the fractional part when indexing textures, where (0,0), (0,1), (1,0) and (1,1) correspond to the texture's corners. Allowing UV values to go beyond 0 and 1 is what enables the "tiling" effect to work.
For example, if you have a rectangle whose corners are indexed with the UV points (0,0), (0,2), (2,0), (2,2), and assuming the texture is set to tile the rectangle, then four copies of the texture will be drawn on that rectangle.
The meaning of a UV value's integer part depends on the wrapping mode. In OpenGL, for example, there are at least three wrapping modes:
GL_REPEAT - The integer part is ignored and has no meaning. This is what allows textures to tile when UV values go beyond 0 and 1.
GL_MIRRORED_REPEAT - The fractional part is mirrored if the integer part is odd.
GL_CLAMP_TO_EDGE - Values greater than 1 are clamped to 1, and values less than 0 are clamped to 0.
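The three modes can be sketched for a single coordinate as below. This is a simplified illustration, not OpenGL's exact definition (for instance, the real GL_CLAMP_TO_EDGE clamps relative to texel centers), and the function names are made up.

```python
import math

def wrap_repeat(u):
    """GL_REPEAT-like: drop the integer part, keep the fractional part in [0, 1)."""
    return u - math.floor(u)

def wrap_mirrored_repeat(u):
    """GL_MIRRORED_REPEAT-like: flip the fractional part when the integer part is odd."""
    whole = math.floor(u)
    fraction = u - whole
    return 1.0 - fraction if whole % 2 else fraction

def wrap_clamp_to_edge(u):
    """GL_CLAMP_TO_EDGE-like: clamp to the [0, 1] range."""
    return min(1.0, max(0.0, u))

for u in (-0.25, 0.25, 1.25, 2.25):
    print(u, wrap_repeat(u), wrap_mirrored_repeat(u), wrap_clamp_to_edge(u))
```

Apply the same function to u and v independently, then scale by the image width and height to find the texel.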
Peter O's answer is excellent. I want to add a high-level point: the coordinate systems used in graphics are a convention that people stick to as a de facto standard. There's no law of nature here, and the choice is arbitrary (but a decent standard, thank goodness). I think one reason texture mapping is often confusing is that the arbitrariness of this standard isn't obvious. The convention is that the image has a de facto coordinate system on the unit square [0,1]^2. Give me a (u,v) on the unit square and I will tell you a point in the image (for example, (0.2,0.3) is 20% to the right and 30% up from the bottom-left corner of the image). But what if you give me a (u,v) that is outside [0,1]^2, like (22.7, -13.4)? Some rule is used to map it back onto [0,1]^2, and the GL modes described are just various useful hacks to deal with that case.

How to apply flat shading to RGB colors?

I am creating a small 3d rendering application. I decided to use simple flat shading for my triangles - just calculate the cosine of angle between face normal and light source and scale light intensity by it.
But I'm not sure about how exactly should I apply that shading coefficient to my RGB colors.
For example, imagine some surface at 60 degree angle to light source. cos(60 degree) = 0.5, so I should retain only half of the energy in emitted light.
I could simply scale RGB values by that coefficient, as in following pseudocode:
double shade = cos(angle(normal, lightDir))
Color out = new Color(in.r * shade, in.g * shade, in.b * shade)
But the resulting colors get too dark even at smaller angles. After some thought, that seems logical - our eyes perceive the logarithm of light energy (it's why we can see both in the bright day, and in the night). And RGB values already represent that log scale.
My next attempt was to use that linear/logarithmic insight. Theoretically:
output energy = lg(exp(input energy) * shade)
That can be simplified to:
output energy = lg(exp(input energy)) + lg(shade)
output energy = input energy + lg(shade)
So such shading will just amount to adding logarithm of shade coefficient (which is negative) to RGB values:
double shade = lg(cos(angle(normal, lightDir)))
Color out = new Color(in.r + shade, in.g + shade, in.b + shade)
That seems to work, but is it correct? How is it done in real rendering pipelines?
The RGB color vector is multiplied by the shade coefficient
That is, by the cosine value, as you initially assumed. The logarithmic scaling is done by the target imaging device and human eyes.
If your colors get too dark, then the probable cause is one of the following:
the cosine or angle value gets truncated to an integer
your pipeline does not have linear scale output (some gamma corrections can do that)
you have a bug somewhere
your angle and cosine use different units (radians/degrees)
you forgot to add the ambient light coefficient to the shade value
your vectors are opposite or wrong (check them visually; see the first link on how)
your vectors are not in the same coordinate system (the light is usually in the GCS and normal vectors are in the model LCS, so you need to convert at least one of them to the coordinate system of the other)
The cos(angle) itself is not usually computed with a cosine function
Since you have all the data as vectors, just use the dot product:
double shade = dot(normal, lightDir)/(|normal|.|lightDir|)
If the vectors are unit length, you can drop the division by their lengths. That is why normal and light vectors are normalized.
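Put together, a minimal sketch in Python might look like this. It assumes colors are linear values in [0, 1], that `light_dir` points from the surface toward the light, and that back-facing results are clamped to zero; the function name and the `ambient` parameter are choices made for the example.

```python
import math

def flat_shade(color, normal, light_dir, ambient=0.0):
    """Scale an RGB color by the cosine between the face normal and the light direction."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    lengths = math.sqrt(sum(n * n for n in normal)) * math.sqrt(sum(l * l for l in light_dir))
    diffuse = max(0.0, dot / lengths)  # faces turned away from the light get no diffuse light
    shade = min(1.0, ambient + diffuse)  # ambient term keeps unlit faces from going fully black
    return tuple(c * shade for c in color)

# Light straight along the normal: color is unchanged.
print(flat_shade((1.0, 0.5, 0.25), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)))
# Light behind the face, with a little ambient light.
print(flat_shade((1.0, 1.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0), ambient=0.2))
```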
Some related questions and notes
Normal shading: this may enlighten a thing or two (for beginners)
Normal/Bump mapping: see the fragment shader and search for the dot
Mirrored light: see this for a slightly more complex lighting scheme
GCS/LCS mean global/local coordinate system

How to optimally plot parametric continuous curve?

Let's say we have a parametric curve, for example a circle:
x = r * cos(t)
y = r * sin(t)
We want to plot the curve on the screen in a way that:
every pixel is painted just once (the optimal part)
there is a painted pixel for each (x, y) that lies on the curve (the continuous part)
If we just plot (x, y) for each t in [t1, t2], these conditions will not be met.
I am searching for a general solution for any parametric curve.
A general solution that 100% satisfies your criteria does not exist.
So we have to compromise.
Usually this is tackled by starting with a step size (usually a parameter to your routine). This step size can then be subdivided, triggered by a heuristic, e.g.:
subdivide when the distance covered by a segment is larger than a given distance (e.g. one pixel)
subdivide when the curve direction changes too much
Or a combination of these.
Usually some limit to subdivision is also given to avoid taking forever.
Many systems that offer parametric plotting start with some changeable default setting for the heuristic params and step size. The user can adapt these if the curve is not "nice" enough or if it takes too long.
The problem is that there are always pathological curves that will defeat your method of drawing, making it miss details or take overly long.
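A minimal sketch of the distance heuristic follows, with invented names and default values. It starts from a fixed number of steps, halves any segment whose endpoints are more than one pixel apart, stops at a depth limit, and records each pixel only once. It does not guarantee full coverage: a loop that returns close to its starting point within one segment would be missed.

```python
import math

def rasterize_parametric(f, t0, t1, initial_steps=16, max_pixel_gap=1.0, max_depth=16):
    """Return the list of distinct pixels hit by the curve f(t) -> (x, y) for t in [t0, t1]."""
    pixels = []
    seen = set()

    def emit(point):
        # Round half up so neighbouring samples never land two pixels apart.
        pixel = (math.floor(point[0] + 0.5), math.floor(point[1] + 0.5))
        if pixel not in seen:  # paint every pixel just once
            seen.add(pixel)
            pixels.append(pixel)

    def subdivide(ta, pa, tb, pb, depth):
        gap = math.hypot(pb[0] - pa[0], pb[1] - pa[1])
        if gap <= max_pixel_gap or depth >= max_depth:
            emit(pb)
            return
        tm = 0.5 * (ta + tb)  # split the parameter interval in half
        pm = f(tm)
        subdivide(ta, pa, tm, pm, depth + 1)
        subdivide(tm, pm, tb, pb, depth + 1)

    prev_t, prev_p = t0, f(t0)
    emit(prev_p)
    for i in range(1, initial_steps + 1):
        t = t0 + (t1 - t0) * i / initial_steps
        p = f(t)
        subdivide(prev_t, prev_p, t, p, 0)
        prev_t, prev_p = t, p
    return pixels

circle = lambda t: (10.0 * math.cos(t), 10.0 * math.sin(t))
print(len(rasterize_parametric(circle, 0.0, 2.0 * math.pi)), "pixels")
```

The initial step count matters for closed curves: with a single starting segment, the endpoints of a full circle coincide and the distance test alone would stop immediately.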
Check out Bézier splines.

Resources