A transformation seems to be applied when painting colors in p5.js with an alpha value lower than 255:
for (const color of [[1,2,3,255],[1,2,3,4],[10,11,12,13],[10,20,30,40],[50,100,200,40],[50,100,200,0],[50,100,200,1]]) {
clear();
background(color);
loadPixels();
print(pixels.slice(0, 4).join(','));
}
Input/Expected Output    Actual Output (Firefox)
1,2,3,255                1,2,3,255 ✅
1,2,3,4                  0,0,0,4
10,11,12,13              0,0,0,13
10,20,30,40              6,19,25,40
50,100,200,40            51,102,204,40
50,100,200,0             0,0,0,0
50,100,200,1             0,0,255,1
The alpha value is preserved, but the RGB information is lost, especially at low alpha values.
This makes visualizations impossible in which, for example, 2D shapes are drawn first and the visibility of certain areas is then animated by changing the alpha values.
Can these transformations be turned off or are they predictable in any way?
Update: The behavior is not specific to p5.js:
const ctx = new OffscreenCanvas(1, 1).getContext('2d');
for (const [r,g,b,a] of [[1,2,3,255],[1,2,3,4],[10,11,12,13],[10,20,30,40],[50,100,200,40],[50,100,200,0],[50,100,200,1]]) {
ctx.clearRect(0, 0, 1, 1);
ctx.fillStyle = `rgba(${r},${g},${b},${a/255})`;
ctx.fillRect(0, 0, 1, 1);
console.log(ctx.getImageData(0, 0, 1, 1).data.join(','));
}
I could be way off here... but it looks like, internally, in the background method, if _isErasing is true then blendMode is applied. By default this applies a linear interpolation of colours.
See https://github.com/processing/p5.js/blob/9cd186349cdb55c5faf28befff9c0d4a390e02ed/src/core/p5.Renderer2D.js#L45
See https://p5js.org/reference/#/p5/blendMode
BLEND - linear interpolation of colours: C = A*factor + B. This is the
default blending mode.
So, if you set the blend mode to REPLACE I think it should work.
REPLACE - the pixels entirely replace the others and don't utilize
alpha (transparency) values.
i.e.
blendMode(REPLACE);
for (const color of [[1,2,3,255],[1,2,3,4],[10,11,12,13],[10,20,30,40],[50,100,200,40],[50,100,200,0],[50,100,200,1]]) {
clear();
background(color);
loadPixels();
print(pixels.slice(0, 4).join(','));
}
Internally, the HTML Canvas stores colors in a different way that cannot preserve RGB values for fully transparent pixels. When writing and reading pixel data, lossy conversions take place due to the representation as 8-bit numbers.
Take for example this row from the test above:
Input/Expected Output    Actual Output
10,20,30,40              6,19,25,40

IN (conventional alpha)

                  R           G           B           A
values            10          20          30          40 (= 15.6%)

Interpretation: When painting, add 15.6% of (10,20,30) to the 15.6%-darkened (r,g,b) background.

Canvas-internal (premultiplied alpha)

                  R           G           B           A
calculation       10 * 0.156  20 * 0.156  30 * 0.156  40 (= 15.6%)
values            1.56        3.12        4.7         40
values (8-bit)    1           3           4           40

Interpretation: When painting, add (1,3,4) to the 15.6%-darkened (r,g,b) background.

Premultiplied alpha allows faster painting and supports additive colors, that is, adding color values without darkening the background.

OUT (conventional alpha)

                  R           G           B           A
calculation       1 / 0.156   3 / 0.156   4 / 0.156   40
values            6.41        19.23       25.64       40
values (8-bit)    6           19          25          40
So the results are predictable, but due to the different internal representation, the transformation cannot be turned off.
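The round-trip above can be sketched in a few lines of Python. This is a model, not browser code: the HTML standard does not fix the rounding mode, and browsers differ, so the quantizer is a parameter (here defaulting to truncation, which reproduces the worked example):

```python
def canvas_roundtrip(r, g, b, a, q=int):
    """Model the canvas pixel round-trip: premultiply by alpha, quantize
    to 8 bits, then un-premultiply on read-back (getImageData).
    q is the quantizer: int = truncate, round = nearest; browsers differ."""
    if a == 0:
        return (0, 0, 0, 0)  # fully transparent: all RGB information is lost
    stored = [q(c * a / 255) for c in (r, g, b)]        # premultiplied, 8-bit
    back = [min(255, q(c * 255 / a)) for c in stored]   # un-premultiplied
    return (*back, a)

print(canvas_roundtrip(10, 20, 30, 40))  # → (6, 19, 25, 40), matching the table
print(canvas_roundtrip(1, 2, 3, 255))    # → (1, 2, 3, 255), opaque is lossless
```

Note that alpha survives unchanged while the RGB error grows as alpha shrinks, which is exactly the pattern in the test output above.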
The HTML specification explicitly mentions this in section 4.12.5.1.15 Pixel manipulation:
Due to the lossy nature of converting between color spaces and converting to and from premultiplied alpha color values, pixels that have just been set using putImageData(), and are not completely opaque, might be returned to an equivalent getImageData() as different values.
see also 4.12.5.7 Premultiplied alpha and the 2D rendering context
This article https://en.m.wikipedia.org/wiki/Indexed_color says:
Indexed color images with palette sizes beyond 256 entries are rare. The practical limit is around 12-bit per pixel, 4,096 different indices. To use indexed 16 bpp or more does not provide the benefits of the indexed color images' nature, due to the color palette size in bytes being greater than the raw image data itself. Also, useful direct RGB Highcolor modes can be used from 15 bpp and up.
I don't understand why indexed 16 bpp or more is inefficient in terms of memory, because the same article also says this:
Indexed color saves a lot of memory, storage space, and transmission time: using truecolor, each pixel needs 24 bits, or 3 bytes. A typical 640×480 VGA resolution truecolor uncompressed image needs 640×480×3 = 921,600 bytes (900 KiB). Limiting the image colors to 256, every pixel needs only 8 bits, or 1 byte each, so the example image now needs only 640×480×1 = 307,200 bytes (300 KiB), plus 256×3 = 768 additional bytes to store the palette map in itself (assuming RGB), approximately one third of the original size. Smaller palettes (4-bit 16 colors, 2-bit 4 colors) can pack the pixels even more (to one sixth or one twelfth), obviously at cost of color accuracy.
If I have 640x480 resolution and want to use a 16-bit palette:
640x480x2(16 bits == 2 bytes) + 65536(2^16)*3(rgb)
614400 + 196608 = 811008 bytes
Raw image memory size:
640x480x3(rgb)
921600 bytes
So 811008 < 921600
And if I have 1920x1080 resolution:
Raw image: 1920x1080x3 = 6 220 800
Indexed color:
1920x1080x2 + palette size(2**16 * 3)
4147200 + 196608
4343808 bytes
So again, indexed color is efficient in terms of memory. I don't get why the article says it is inefficient.
It really depends upon the size of the image. As you said, if b is the number of bytes per pixel and p is the number of pixels, then the image data size i is:
i = p * b
And the color table size t is:
t = 2^(b * 8) * 3
So the point where a raw image would take the same space as an indexed image is:
p * 3 = p * b + 2^(b * 8) * 3
Which I'll now solve for p:
p * 3 - p * b = 2^(b * 8) * 3
p * (3 - b) = 2^(b * 8) * 3
p = (2^(b * 8) * 3) / (3 - b)
So for various bytepp, the minimum image size that will make using indexed images break even:
1 bytepp (8 bit) - 384 pixels (like an image of 24 x 16)
1.5 bytepp (12 bit) - 8192 pixels (like an image of 128 x 64)
2 bytepp (16 bit) - 196,608 pixels (like an image of 512 x 384)
2.5 bytepp (20 bit) - 6,291,456 pixels (like an image of 3072 x 2048)
2.875 bytepp (23 bit) - 201,326,592 pixels (like an image of 16,384 x 12,288)
If you are using an image smaller than 512 x 384, 16 bit per pixel indexed color would take up more space than raw 24 bit image data.
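The break-even formula derived above can be checked with a few lines of Python (same symbols as in the answer: b is bytes per pixel, and the return value is the pixel count p at which indexed and raw storage are equal):

```python
def break_even_pixels(b):
    """Smallest pixel count p at which an indexed image (p*b bytes of
    indices plus a palette of 2^(8b) entries at 3 bytes each) takes the
    same space as raw 24-bit data (p*3 bytes): p = 2^(8b) * 3 / (3 - b)."""
    return 2 ** (b * 8) * 3 / (3 - b)

for b in (1, 1.5, 2, 2.5):
    print(b, break_even_pixels(b))
# 1   ->     384 pixels
# 1.5 ->    8192 pixels
# 2   ->  196608 pixels (= 512 * 384)
# 2.5 -> 6291456 pixels
```

Plugging in the asker's 640x480 example with b = 2: the image has 307,200 pixels, which is above the 196,608-pixel break-even point, so indexing does save space there; the article's claim is about images below that threshold.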
I have two PNG images: one was output to PNG by the Python library Pillow (converted from an SVG font image); the other is the same image read and re-saved to PNG by Windows 10's Paint program.
Strangely, when I read these images with OpenCV 3's cv2.imread function, one shows only a black window while the other is OK.
How can I read both of these PNGs correctly?
CODE:
import os
import cv2

image_file_path = r""
if not os.path.exists(image_file_path):
    print('NOT EXIST! = ' + image_file_path)
image = cv2.imread(image_file_path, cv2.IMREAD_ANYDEPTH)
cv2.namedWindow('image', cv2.WINDOW_NORMAL)
cv2.imshow("image", image)
cv2.waitKey()
IMAGES:
OK:
NOT OK:
The first image is in 4-channel RGBA format with a completely pointless, fully opaque, alpha channel which you can ignore.
The second image is in 2-channel Grey+Alpha format where all the pixels are pure solid black and the shapes are defined only in the alpha channel.
So, basically you want to:
discard the last channel of the first image, which you can do by using cv2.IMREAD_COLOR
discard all except the last channel of the second image, which you can do like this:
im = cv2.imread('2.png',cv2.IMREAD_UNCHANGED)[:,:,-1]
I obtained the information above by using ImageMagick which is included in most Linux distros and is available on macOS and Windows.
The command I used in Terminal is:
magick identify -verbose 2.png
Sample Output
Image: 2.png
Format: PNG (Portable Network Graphics)
Mime type: image/png
Class: DirectClass
Geometry: 1040x1533+0+0
Units: Undefined
Colorspace: Gray
Type: Bilevel
Base type: Undefined
Endianess: Undefined
Depth: 8-bit
Channel depth:
Gray: 1-bit <--- Note 1
Alpha: 8-bit <--- Note 1
Channel statistics:
Pixels: 1594320
Gray:
min: 0 (0) <--- Note 2
max: 0 (0) <--- Note 2
mean: 0 (0)
standard deviation: 0 (0)
kurtosis: -3
skewness: 0
entropy: 4.82164e-05
Alpha:
min: 0 (0) <--- Note 3
max: 255 (1) <--- Note 3
mean: 50.3212 (0.197338)
standard deviation: 101.351 (0.397456)
kurtosis: 0.316613
skewness: 1.52096
entropy: 0.0954769
...
...
I have annotated with arrows and notes on the right above.
Note 1: This tells me the image is greyscale + alpha
Note 2: This tells me all the greyscale pixels are black, since the max is zero and the min is zero
Note 3: This tells me that there are some fully transparent pixels, and some fully opaque pixels
Paint is transforming the image somehow, making its format incompatible with the 'typical' imread routine. I'm not sure exactly what's happening; it might be related to Paint already removing the alpha channel, which OpenCV also wants to remove (according to their docs; I didn't look at the code). Luckily you can circumvent it:
I_not_ok = cv2.imread(ImagePath, cv2.IMREAD_UNCHANGED)
I_ok = I_not_ok[:, :, -1]  # grey+alpha image: the shapes are in the last (alpha) channel
cv2.namedWindow('Image_ok', cv2.WINDOW_NORMAL)
cv2.imshow('Image_ok', I_ok)
cv2.waitKey(0)
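Putting both answers together, one could write a small helper that decides which channels carry the content. This is a sketch: flatten_png is a made-up name, and it assumes arrays shaped the way cv2.imread with cv2.IMREAD_UNCHANGED returns them (2-D grey, HxWx2 grey+alpha, HxWx3 colour, or HxWx4 colour+alpha):

```python
import numpy as np

def flatten_png(img):
    """Pick the channel(s) that actually carry the image content.
    Handles the two cases from this question: a pointless constant alpha
    channel (drop it) and an all-black colour plane (use the alpha)."""
    if img.ndim == 2 or img.shape[2] == 3:
        return img                        # nothing to strip
    color, alpha = img[..., :-1], img[..., -1]
    if (alpha == alpha.flat[0]).all():    # constant (e.g. fully opaque) alpha: drop it
        return color[..., 0] if color.shape[-1] == 1 else color
    if (color == 0).all():                # all-black colour plane: shapes live in alpha
        return alpha
    return img                            # both carry information: keep as-is
```

This avoids having to know in advance which of the two PNG layouts a given file uses.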
I have a question regarding a PNG file that I am trying to read (I have attached it in this question)
The file size 328750 bytes
Width 660
Height 330
Color type - truecolor
Bit depth - 24 bits
So here's my question. If it's truecolor, I assume it's RGB, which is 24 bits. But if you do the math, the numbers don't add up: 660 (width) * 330 (height) * 3 bytes (from 24 bits) = 653,400 bytes, roughly double the actual file size.
Why is that?
I tried to read the IDAT chunk as if each pixel were 3 raw bytes, but the colours I get don't match what is displayed.
PNG is a compressed image format, so the IDAT chunk(s) contain a zlib-compressed representation of the RGB pixels. Probably the easiest way for you to access the pixel data is to use a converter such as ImageMagick or GraphicsMagick to decompress the image into the Netpbm "PPM" format.
magick image.png image.ppm
or
gm convert image.png image.ppm
Then read "image.ppm" the same way you tried to read the PNG. Just skip over the short header, which for your image is the 15 bytes
P6\n660 330\n255\n
where "P6" is the magic number, 660 and 330 are the dimensions, and 255 is the maximum sample value (R, G, and B each range from 0 to 255, i.e. 0xff). The remainder of the file is just the raw R,G,B values you were expecting.
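For completeness, here is a minimal P6 reader in Python. It is a sketch under stated assumptions: read_ppm is a made-up name, and it assumes a plain whitespace-separated header like the one magick/gm produces (PPM # comments are not handled):

```python
def read_ppm(path):
    """Parse a binary PPM (P6): magic, width, height, maxval, then raw RGB bytes."""
    with open(path, 'rb') as f:
        def token():
            # skip leading whitespace, then collect one whitespace-delimited token;
            # the single delimiter byte after the token is consumed here
            c = f.read(1)
            while c.isspace():
                c = f.read(1)
            t = b''
            while c and not c.isspace():
                t += c
                c = f.read(1)
            return t
        assert token() == b'P6', 'not a binary PPM'
        width, height, maxval = int(token()), int(token()), int(token())
        pixels = f.read(width * height * 3)  # raw R,G,B triplets, row by row
    return width, height, maxval, pixels
```

For the image in the question, this would return 660, 330, 255 and a 653,400-byte pixel buffer.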