I have a basic problem with Python's PIL library. I have some .txt files containing only 0 and 1 values arranged in matrices. I have transformed the "binary" data into an image with the Image.fromarray() function included in PIL. The format of my data produces black-and-white images if I multiply it by 255, and that's fine for me. Now I want to add some text to the image, using the appropriate text function included in PIL, but I want that text to be coloured. Clearly, I can't do it because the image obtained from fromarray has a grayscale colormap. How can I change it?
You can get an RGB image from a monochromatic one like this:
from PIL import Image
from numpy import eye
arr = (eye(200)*255).astype('uint8') # sample array
im = Image.fromarray(arr) # monochromatic image
imrgb = im.convert('RGB') # color image
imrgb.show()
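Since the question also asks about coloured text: once the image is RGB, text can be drawn on it with PIL's ImageDraw. A minimal sketch, continuing from imrgb above; the text content, position and colour are arbitrary:
from PIL import ImageDraw
draw = ImageDraw.Draw(imrgb)  # draw directly on the RGB image
draw.text((10, 10), 'hello', fill=(255, 0, 0))  # red text with the default font
imrgb.show()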
I'm currently working on cloud removal from satellite data (I'm pretty new).
This is the image I'm working on (TIFF)
And this is the mask, where black pixels represent clouds (JPG)
I'm trying to remove the clouds from the TIFF, using the mask to identify the positions of the clouds, and the cloudless image itself, like this (the area is the same, but the period is different):
I kindly ask how I can achieve this. A Python solution, using libraries like Rasterio or skimage, would be particularly appreciated.
Thanks in advance.
You can read the images with rasterio, PIL, OpenCV or tifffile; here I use OpenCV:
import cv2
import numpy as np
# Load the 3 images
cloudy = cv2.imread('cloudy.png')
mask = cv2.imread('mask.jpg')
clear = cv2.imread('clear.png')
Then just use NumPy's where() to choose whether you want the clear or cloudy image at each location, according to the mask:
res = np.where(mask<128, clear, cloudy)
Note that if your mask was a single channel PNG rather than JPEG, or if it was read as greyscale like this:
mask = cv2.imread('mask.jpg', cv2.IMREAD_GRAYSCALE)
you would have to make it broadcastable to the 3 channels of the other two arrays by adding a new axis like this:
res = np.where(mask[...,np.newaxis]<128, clear, cloudy)
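Since the question asks for a Rasterio-based solution, here is a minimal sketch of the same masking done entirely with rasterio; the file names are placeholders, and the mask is assumed to be a single-band raster:
import rasterio
import numpy as np

with rasterio.open('cloudy.tif') as src:
    cloudy = src.read()   # shape: (bands, rows, cols)
    profile = src.profile
with rasterio.open('clear.tif') as src:
    clear = src.read()
with rasterio.open('mask.tif') as src:
    mask = src.read(1)    # single band: (rows, cols)

# Broadcast the mask over the bands; black pixels (<128) mark clouds
res = np.where(mask[np.newaxis, ...] < 128, clear, cloudy)

with rasterio.open('result.tif', 'w', **profile) as dst:
    dst.write(res)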
I was trying to use pytesseract to find the box positions of each letter in an image. I tried it on one image, cropping it with Pillow, and it worked, but when I tried an image with a smaller character size (example), the program still recognized the characters, but cropping the image with the box coordinates gave me images like this. I also tried doubling the size of the original image, but it changed nothing.
from PIL import Image
import pytesseract

img = Image.open('imgtest.png')
data = pytesseract.image_to_boxes(img)
dati = data.splitlines()
corde = []
for i in dati[0].split()[1:5]:  # just trying with the first character
    corde.append(int(i))
im = img.crop(tuple(corde))
im.save('cimg.png')
If we look at the source code of image_to_boxes, we see that the returned coordinates are in the following order:
left bottom right top
From the documentation on Image.crop, we see that the expected order of coordinates is:
left upper right lower
Now, Tesseract's box coordinates are measured from the bottom-left corner of the image, whereas Pillow's origin is the top-left corner. Therefore, we also need to convert the top/upper and bottom/lower coordinates.
That'd be the reworked code:
from PIL import Image
import pytesseract
img = Image.open('MJwQi9f.png')
data = pytesseract.image_to_boxes(img)
dati = data.splitlines()
corde = []
for i in dati[0].split()[1:5]:
    corde.append(int(i))
corde = tuple([corde[0], img.size[1]-corde[3], corde[2], img.size[1]-corde[1]])
im = img.crop(corde)
im.save('cimg.png')
You see, left and right are in the same place, but top/upper and bottom/lower switched places, and were also altered with respect to the image height.
And, that's the updated output:
The result isn't optimal, but I assume that's due to the font.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
Pillow: 8.1.0
pytesseract: 4.00.00alpha
----------------------------------------
I'm testing a simple script, shown below, to convert a JPG to PDF, but somehow the output PDF comes out inverted. The same behaviour is not seen when I convert the image to 'RGB' before saving it as PDF. The original image is in 'CMYK'. How can I avoid this?
Sample code:
from PIL import Image

image = Image.open('door.jpg')
image.save(
    'output.pdf',
    resolution=180.0,
    quality=100
)
Input and output images:
If the image is known to be CMYK, try converting it to RGB before saving.
cmyk = Image.open('door.jpg')
rgb = cmyk.convert("RGB")
rgb.save(...)
This was caused by a limitation in Pillow and is fixed in a later release; see https://github.com/python-pillow/Pillow/issues/4860 for details.
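Until you are on a fixed version, a defensive variant of the original script converts only when the mode is actually CMYK. A minimal sketch; 'door.jpg' and the save options are taken from the question:
from PIL import Image

image = Image.open('door.jpg')
if image.mode == 'CMYK':  # only convert when actually needed
    image = image.convert('RGB')
image.save('output.pdf', resolution=180.0, quality=100)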
I'm trying to open a SAR image from Sentinel-1. I can view the TIFF file in QGIS, so I know the data is there, but when I open and view/show it in Python, all of the modules I could use to open the data produce a NaN area, suggesting that there is no data in the image. Visualizing the image produces a completely black image, although the shape is correct.
Here is the code where I read in the image:
import skimage.io
import rasterio as rio

img = skimage.io.imread('NewData.tif', as_gray=True, plugin='tifffile')
with rio.open(r'NewData.tif') as src:
    img2 = src.read()
    imgMeta = src.profile
print(img)
skimage.io.imshow(img)
Any help would be appreciated.
Thank you.
The problem is not in the way rasterio or skimage imports the image, but in the way it is displayed. I am assuming you are working with calibrated SAR images that are NOT converted to the decibel (dB) scale. The problem is the dynamic range of your data.
The issue here is that, by default, the color ramp is not stretched according to the distribution of values in the raster histogram. In QGIS, SNAP and many other EO-related software packages, the color distribution is matched to the histogram to produce proper visualizations.
Solution: either do that stretching in your code, or simply convert your backscatter values to decibels (a very common procedure when working with SAR data, which produces an almost normal distribution of the data). The conversion can be done in EO software, or more directly on your imported array with:
srcdB = 10*np.log10(img2)  # apply to the array, not the rasterio dataset
Once done, you can properly display your image:
import rasterio as rio
from rasterio.plot import show
import numpy as np

with rio.open(r'/.../S1B_IW_GRDH_1SDV_20190319T161451_20190319T161520_015425_01CE3C_A401_Cal.tif') as src:
    img2 = src.read()
    imgMeta = src.profile

srcdB = 10*np.log10(img2)  # convert backscatter to decibels
show(srcdB, cmap='gray')   # display using rasterio
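If you would rather keep the linear backscatter values, the histogram-stretch route mentioned above can be approximated with a simple percentile stretch. A sketch assuming img2 from the code above; the 2-98 percentile range is a common default, not a requirement:
lo, hi = np.nanpercentile(img2, (2, 98))  # bounds covering the bulk of the histogram
stretched = np.clip((img2 - lo) / (hi - lo), 0, 1)
show(stretched, cmap='gray')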
I am using the image_to_string function in the pytesseract package to convert multiple parts of a single picture file to string. All parts are working except for this image:
Here is the script that I am using to convert it:
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
im = Image.open('image.png')
text = pytesseract.image_to_string(im)
print(text)
Which gives the output:
—\—\—\N—\—\—\—\—\N
I have tried breaking up the image into smaller parts, as well as processing the image as a JPG and as a PNG. What can I do to have it output the values in the image?
Using a different page segmentation mode instead of the default one seems to work.
text = pytesseract.image_to_string(im, config='--psm 6')
According to the tesseract wiki, option 6 assumes a single uniform block of text. I tried other options, but only this one worked.
To check other page segmentation modes, read the tesseract wiki on how to improve the quality of an image.
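If --psm 6 does not carry over to your other crops, a quick way to compare segmentation modes is a small loop like the sketch below; the mode numbers are just common candidates, not an exhaustive list:
from PIL import Image
import pytesseract

im = Image.open('image.png')
for psm in (3, 6, 7, 11):  # auto, uniform block, single line, sparse text
    text = pytesseract.image_to_string(im, config=f'--psm {psm}')
    print(psm, repr(text))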