Image convolution with pre-multiplied alpha - graphics

I'm trying to implement image convolution with a 3x3 matrix, where my colour components (each ranging from 0 to 255) are stored using pre-multiplied alpha. All the tutorials I can find (e.g. http://www.codeproject.com/KB/GDI-plus/csharpfilters.aspx) only describe performing the convolution calculations on the RGB components; nothing is mentioned about the alpha component.
My current code leaves the alpha component as it is. The filters I have tried look fine on images where every pixel already has full alpha. With partially transparent pixels, however, e.g. a box blur looks strange, because pixel colors do not propagate into the transparent areas as the blur is applied.
What calculations do I perform on the alpha component when running the convolution, and how do I deal with pre-multiplied alpha when setting the final pixel value? Also, do I add the filter offset to the alpha component?
I've tried calculating the new alpha component the same way I calculate the RGB components (i.e. summing the surrounding alpha values for that pixel according to the filter matrix), but I get colored fringes on the edges of transparent areas, and semi-transparent pixels darken too much. I think I need to adjust the new RGB components to take the new alpha value into account, but I'm not sure how.
Thanks.

I think the correct way is to first compute just the convolved alpha, using the standard formula:
alpha = a1*m1 + a2*m2 + a3*m3 +
        a4*m4 + a5*m5 + a6*m6 +
        a7*m7 + a8*m8 + a9*m9;
Then you must compute the convolution of the original (non-premultiplied) r/g/b values and post-multiply by the new alpha:
red = (r1/a1*m1 + r2/a2*m2 + r3/a3*m3 +
       r4/a4*m4 + r5/a5*m5 + r6/a6*m6 +
       r7/a7*m7 + r8/a8*m8 + r9/a9*m9) * alpha;
with similar formulas for green and blue. (This assumes alpha is normalized to the range [0, 1]; terms where a = 0 must be skipped, because a fully transparent pixel has no well-defined color to un-premultiply.)
A more efficient way is to first remove the premultiplication (i.e. replace r with r/a, g with g/a and b with b/a), do the convolution of all components using the standard formulas, and then re-premultiply (replace r with r*a, g with g*a and b with b*a), so the division happens once per pixel instead of once per kernel tap.
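A minimal sketch of that second approach in Python with NumPy/SciPy, assuming an (H, W, 4) uint8 pre-multiplied RGBA image and a normalized 3x3 kernel (the function name and the epsilon guard for fully transparent pixels are my additions):
import numpy as np
from scipy.ndimage import convolve

def convolve_premultiplied(rgba, kernel):
    img = rgba.astype(np.float64) / 255.0
    a = img[..., 3]
    # un-premultiply; fully transparent pixels are left at color 0
    safe_a = np.maximum(a, 1e-9)[..., None]
    straight = np.where(a[..., None] > 0, img[..., :3] / safe_a, 0.0)
    new_a = convolve(a, kernel, mode='nearest')
    out = np.empty_like(img)
    for c in range(3):
        # convolve the straight color, then re-premultiply by the new alpha
        out[..., c] = convolve(straight[..., c], kernel, mode='nearest') * new_a
    out[..., 3] = new_a
    return (np.clip(out, 0.0, 1.0) * 255.0 + 0.5).astype(np.uint8)
For a box blur, kernel = np.full((3, 3), 1.0 / 9.0).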

Related

Transformed colors when painting semi-transparent in p5.js

A transformation seems to be applied when painting colors in p5.js with an alpha value lower than 255:
for (const color of [[1,2,3,255],[1,2,3,4],[10,11,12,13],[10,20,30,40],[50,100,200,40],[50,100,200,0],[50,100,200,1]]) {
clear();
background(color);
loadPixels();
print(pixels.slice(0, 4).join(','));
}
Input/Expected Output    Actual Output (Firefox)
1,2,3,255                1,2,3,255 ✅
1,2,3,4                  0,0,0,4
10,11,12,13              0,0,0,13
10,20,30,40              6,19,25,40
50,100,200,40            51,102,204,40
50,100,200,0             0,0,0,0
50,100,200,1             0,0,255,1
The alpha value is preserved, but the RGB information is lost, especially at low alpha values.
This makes certain visualizations impossible, for example drawing 2D shapes first and then animating their visibility in certain areas by changing the alpha values.
Can these transformations be turned off, or are they at least predictable in some way?
Update: The behavior is not specific to p5.js:
const ctx = new OffscreenCanvas(1, 1).getContext('2d');
for (const [r,g,b,a] of [[1,2,3,255],[1,2,3,4],[10,11,12,13],[10,20,30,40],[50,100,200,40],[50,100,200,0],[50,100,200,1]]) {
ctx.clearRect(0, 0, 1, 1);
ctx.fillStyle = `rgba(${r},${g},${b},${a/255})`;
ctx.fillRect(0, 0, 1, 1);
console.log(ctx.getImageData(0, 0, 1, 1).data.join(','));
}
I could be way off here... but it looks like, internally, the background() method calls blendMode() when _isErasing is true. By default this applies a linear interpolation of colours.
See https://github.com/processing/p5.js/blob/9cd186349cdb55c5faf28befff9c0d4a390e02ed/src/core/p5.Renderer2D.js#L45
See https://p5js.org/reference/#/p5/blendMode
BLEND - linear interpolation of colours: C = A*factor + B. This is the
default blending mode.
So, if you set the blend mode to REPLACE I think it should work.
REPLACE - the pixels entirely replace the others and don't utilize
alpha (transparency) values.
i.e.
blendMode(REPLACE);
for (const color of [[1,2,3,255],[1,2,3,4],[10,11,12,13],[10,20,30,40],[50,100,200,40],[50,100,200,0],[50,100,200,1]]) {
clear();
background(color);
loadPixels();
print(pixels.slice(0, 4).join(','));
}
Internally, the HTML canvas stores colors in a different way that cannot preserve the RGB values of fully transparent pixels. When pixel data is written and read, conversions take place that are lossy due to the 8-bit representation.
Take for example this row from the test above:
Input/Expected Output    Actual Output
10,20,30,40              6,19,25,40

IN (conventional alpha):
  values:         R = 10, G = 20, B = 30, A = 40 (= 15.6%)
Interpretation: when painting, add 15.6% of (10,20,30) to the 15.6% darkened (r,g,b) background.

Canvas-internal (premultiplied alpha):
  calculation:    R = 10 * 0.156, G = 20 * 0.156, B = 30 * 0.156, A = 40 (= 15.6%)
  values:         R = 1.56, G = 3.12, B = 4.7, A = 40
  values (8-bit): R = 1, G = 3, B = 4, A = 40
Interpretation: when painting, add (1,3,4) to the 15.6% darkened (r,g,b) background.
Premultiplied alpha allows faster painting and supports additive colors, that is, adding color values without darkening the background.

OUT (conventional alpha):
  calculation:    R = 1 / 0.156, G = 3 / 0.156, B = 4 / 0.156, A = 40
  values:         R = 6.41, G = 19.23, B = 25.64, A = 40
  values (8-bit): R = 6, G = 19, B = 25, A = 40
So the results are predictable, but due to the different internal representation, the transformation cannot be turned off.
The HTML specification explicitly mentions this in section 4.12.5.1.15 Pixel manipulation:
Due to the lossy nature of converting between color spaces and converting to and from premultiplied alpha color values, pixels that have just been set using putImageData(), and are not completely opaque, might be returned to an equivalent getImageData() as different values.
see also 4.12.5.7 Premultiplied alpha and the 2D rendering context
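For what it's worth, the quantization above is easy to simulate. Here is a small Python sketch of the round trip, assuming truncation at each 8-bit step as in the worked example (actual browsers may round differently, so not every observed value is reproduced):
def canvas_roundtrip(r, g, b, a):
    # premultiply, quantize to 8 bits, then un-premultiply and quantize again
    if a == 0:
        return (0, 0, 0, 0)  # the RGB information is lost entirely
    s = a / 255.0
    premul = [int(c * s) for c in (r, g, b)]  # canvas-internal values
    return tuple(min(255, int(c / s)) for c in premul) + (a,)

print(canvas_roundtrip(10, 20, 30, 40))  # (6, 19, 25, 40)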

Reduce components included by otsu threshold python opencv

I am trying to segment the blue components from a set of images. In most images, where the blue components have a large spread, the Otsu-thresholded image works well. However, for images where the blue components are minimal, the results are not OK and seem to include non-relevant sections. Example below:
Are there ways to improve the Otsu thresholding so that only the relevant parts are segmented, without necessarily making the other images suffer?
I already tried global and adaptive thresholding, but Otsu in particular captured the components better, although it also included unnecessary details.
Here's the code:
import cv2
import numpy as np

l_image = remove_background(image)  # image and remove_background defined elsewhere
l_image = cv2.cvtColor(l_image, cv2.COLOR_BGR2GRAY)
ret1, th1 = cv2.threshold(l_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
mask = (th1 != 255)
sel = np.ones_like(image)
sel[mask] = image[mask]
sel = cv2.cvtColor(sel, cv2.COLOR_HSV2BGR)
# we simply set these channels to 0 to remove excess background
sel[:,:,1] = 0
sel[:,:,2] = 0
Here's the sample image.
The main issue with the logic in your code is that you are looking for something that is distinguished primarily by color, but throw away the color information first by converting the image to grayscale.
Instead, consider looking at the color properties of each pixel. One easy way to do so is to look at the HCV color space. This is similar to the more common HSV, with "C" for chroma instead of "S" for saturation, where S = C / V. I'm suggesting it because the "C" channel is very easy to compute, and it is the channel with most of the contrast in this image. Note that all the complexity is in computing "H", the hue; hue would ideally be used to find a specific color independently of its brightness, but that requires a double threshold on the "H" channel plus a threshold on the "C" channel. For this simple case, a single threshold on the "C" channel is sufficient to find the colored regions: there is only blue, so we don't care what color it is, we just want to find the color.
To compute the "C" (chroma) channel, we find the difference between the largest and the smallest of the RGB values (for each pixel independently):
rgbmax = np.amax(image, axis=2)
rgbmin = np.amin(image, axis=2)
c = rgbmax - rgbmin
As you can guess, a simple threshold of this image leads to finding the colored regions. The green background can easily be subtracted before processing, or after.
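Putting that together, a minimal sketch (the file name is a placeholder; OpenCV loads the image as BGR, but chroma does not depend on channel order):
import cv2
import numpy as np

image = cv2.imread('sample.jpg')  # placeholder path
# chroma: difference between the largest and smallest channel per pixel
c = np.amax(image, axis=2) - np.amin(image, axis=2)
th, mask = cv2.threshold(c, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)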
Edit: following Cris Luengo's comment, the green channel works better than the blue one.
You can apply Otsu's threshold on the green channel (of BGR).
Results are not perfect but much better.
img = img[:,:,1]  # get the green channel
th, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
output:

Given the RGB components of a color, how can I decide if it is perceived as gray by humans?

One simple way is to say that when the RGB components are equal, the color is gray.
However, this is not the whole story: if the components differ only slightly, the color will still look gray.
Assuming the viewer has healthy color vision, how can I decide whether given RGB values would be perceived as gray (presumably with an adjustable threshold level for "grayness")?
A relatively straightforward method would be to convert RGB value to HSV color space and use threshold on the saturation component, e.g. "if saturation < 0.05 then 'almost grey', else not grey".
Saturation is actually the "grayness/colorfulness" by definition.
This method is much more accurate than using differences between the R, G and B channels (since the human eye perceives saturation differently on light and dark colors). On the other hand, converting RGB to HSV is more computationally intensive. It is up to you to decide which is of more value: a precise answer (gray / not gray) or performance.
If you need an even more precise method, you may use the L*a*b* color space and compute the chroma as sqrt(a*a + b*b), and then apply a threshold to this value. However, this is even more computationally intensive.
You can also combine multiple methods:
Calculate simple differences between R, G, B components. If the color can be identified as definitely desaturated (e.g. max(abs(R-G), abs(R-B), abs(G-B)) <= 5) or definitely saturated (e.g. max(abs(R-G), abs(R-B), abs(G-B)) > 100), then stop.
Otherwise, convert to L*a*b*, compute chroma as sqrt(a*a + b*b) and use thresholding on this value.
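A minimal sketch of that combined check in Python, assuming OpenCV for the L*a*b* conversion (the function name and the threshold values are illustrative):
import cv2
import numpy as np

def looks_gray(r, g, b, chroma_threshold=10.0):
    diff = max(abs(r - g), abs(r - b), abs(g - b))
    if diff <= 5:
        return True   # definitely desaturated
    if diff > 100:
        return False  # definitely saturated
    # borderline case: fall back to L*a*b* chroma
    px = np.uint8([[[b, g, r]]])  # OpenCV expects BGR order
    lab = cv2.cvtColor(px, cv2.COLOR_BGR2LAB)[0, 0].astype(float)
    a_star, b_star = lab[1] - 128.0, lab[2] - 128.0  # 8-bit a*/b* are offset by 128
    return np.hypot(a_star, b_star) < chroma_threshold

print(looks_gray(160, 179, 151))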
r = 160;
g = 179;
b = 151;
tolerance = 20;
if (Math.abs(r - g) < tolerance && Math.abs(r - b) < tolerance) {
    // then perceived as gray
}

How can I translate an image with subpixel accuracy?

I have a system that requires moving an image on the screen. I am currently using a png and just placing it at the desired screen coordinates.
Because of a combination of the screen resolution and the required frame rate, some frames are identical because the image has not yet moved a full pixel. Unfortunately, the resolution of the screen is not negotiable.
I have a general understanding of how sub-pixel rendering works to smooth out edges but I have been unable to find a resource (if it exists) as to how I can use shading to translate an image by less than a single pixel.
Ideally, this would be usable with any image but if it was only possible with a simple shape like a circle or a ring, that would also be acceptable.
Sub-pixel interpolation is relatively simple. Typically you apply what amounts to an all-pass filter with a constant phase shift, where the phase shift corresponds to the required sub-pixel image shift. Depending on the required image quality you might use e.g. a 5 point Lanczos or other windowed sinc function and then apply this in one or both axes depending on whether you want an X shift or a Y shift or both.
E.g. for a 0.5 pixel shift the coefficients might be [ 0.06645, 0.18965, 0.27713, 0.27713, 0.18965 ]. (Note that the coefficients are normalised, i.e. their sum is equal to 1.0.)
To generate a horizontal shift you would convolve these coefficients with the pixels from x - 2 to x + 2, e.g.
const float kCoeffs[5] = { 0.06645f, 0.18965f, 0.27713f, 0.27713f, 0.18965f };

for (int y = 0; y < height; ++y)           // for each row
{
    for (int x = 2; x < width - 2; ++x)    // for each col (apart from 2 pixel border)
    {
        float p = 0.0f;                    // convolve pixel with Lanczos coeffs
        for (int dx = -2; dx <= 2; ++dx)
            p += in[y][x + dx] * kCoeffs[dx + 2];
        out[y][x] = p;                     // store interpolated pixel
    }
}
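If you need taps for arbitrary fractional shifts, you can sample a windowed sinc directly. A sketch in Python, assuming a Lanczos-2 window (different window choices give different numbers, so this will not reproduce the example coefficients above exactly):
import numpy as np

def shift_coeffs(shift, taps=5, a=2.0):
    # sample a Lanczos-windowed sinc at integer offsets minus the fractional shift
    x = np.arange(-(taps // 2), taps // 2 + 1) - shift
    w = np.sinc(x) * np.sinc(x / a)
    return w / w.sum()  # normalize so the taps sum to 1.0

print(shift_coeffs(0.5))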
Conceptually, the operation is very simple. First you scale up the image (using any interpolation method you like), then you translate the result, and finally you subsample back down to the original image size.
The scale factor depends on the precision of the sub-pixel translation you want. If you want to translate by 0.5 pixels, scale up the original image by a factor of 2 and translate the result by 1 pixel; if you want to translate by 0.25 pixels, scale up by a factor of 4, and so on.
Note that this implementation is not efficient, because when you scale up you calculate pixel values that are never actually used: they are simply dropped when you subsample back to the original size. The implementation in Paul's answer is more efficient.
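In practice, a library routine can do the interpolation in one step. A sketch using SciPy, assuming the image is a grayscale NumPy array (the array here is a stand-in):
import numpy as np
from scipy import ndimage

image = np.random.rand(64, 64)  # stand-in for the real image
# shift down by 0.25 px and right by 0.5 px using cubic spline interpolation
shifted = ndimage.shift(image, shift=(0.25, 0.5), order=3, mode='nearest')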

How to get colors with the same perceived brightness?

Is there a tool / program / color system that enables you to get colors of the same luminance (perceived brightness)?
Say I pick a color (determine RGB values) and the program gives me all the colors around the color wheel with the same luminance but different hues?
I haven't seen such a tool yet; all I came across were three different formulas for color luminance:
(0.2126*R) + (0.7152*G) + (0.0722*B)   (the Rec. 709 luma coefficients)
(0.299*R + 0.587*G + 0.114*B)   (the Rec. 601 luma coefficients)
sqrt( 0.241*R^2 + 0.691*G^2 + 0.068*B^2 )   (a "perceived brightness" approximation on squared components)
Just to be clear, I'm talking about color luminance / perceived brightness or whatever you want to call it: the attribute that accounts for the fact that we perceive, for example, a red hue as brighter than a blue one. (So 255,0,0 has a higher luminance value than 0,0,255.)
P.S.: Does anyone know which algorithm is used to determine the color luminance on this website: http://www.workwithcolor.com/hsl-color-picker-01.htm
It looks like they used none of the algorithms posted above.
In the HSL color picker you linked to, it looks like they are using the standard HSL lightness equation, L = (max + min) / 2, expressed as a percentage. So the equation is:
L = (100 * 0.5 * (max(r,g,b) + min(r,g,b))) / 255
Edit: Actually, I just realized that they have an L value and a Lum value shown on that color picker. The equation above applies to the L value, but I don't know how they are arriving at the Lum value. It doesn't seem to follow any of the standard equations.
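That equation is a one-liner; a quick Python check (the function name is mine):
def hsl_lightness_percent(r, g, b):
    # HSL lightness L = (max + min) / 2, scaled to a 0-100 percentage
    return 100.0 * 0.5 * (max(r, g, b) + min(r, g, b)) / 255.0

print(hsl_lightness_percent(255, 0, 0))  # 50.0 — pure red sits at 50% lightness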
