Find all images that are over 50% black

I am trying to write code that will iterate over a directory of images and tell me which images are over 50% black - so I can get rid of those. This is what I have, but it isn't returning any results:
from PIL import Image
import glob

images = glob.glob('/content/drive/MyDrive/cropped_images/*.png')  # find all filenames specified by pattern
for image in images:
    with open(image, 'rb') as file:
        img = Image.open(file)
        pixels = list(img.getdata())  # get the pixels as a flattened sequence
        black_thresh = (50, 50, 50)
        nblack = 0
        for pixel in pixels:
            if pixel < black_thresh:
                nblack += 1
        n = len(pixels)
        if (nblack / float(n)) > 0.5:
            print(file)

Hannah, tuple comparison in Python might not do what you expect (see "How does tuple comparison work in Python?"): it compares element by element, moving to later elements only to break ties. Perhaps your definition of "over 50% black" is not what you have mapped to code?
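For example, a pixel with a very dark red channel is counted as "black" no matter how bright the other channels are:
print((10, 200, 200) < (50, 50, 50))   # True: 10 < 50 decides, the bright G and B never matter
print((50, 10, 10) < (50, 50, 50))     # True: the first elements tie, so the second (10 < 50) decides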
I ran the above code on a dark image and my file printed just fine. Example png: https://www.nasa.gov/mission_pages/chandra/news/black-hole-image-makes-history .
Recommendations: define "over 50% black" as something you can express in code, pull n = len(pixels) out of the for loop, and append files to a list when they satisfy your criterion.
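As a minimal sketch, here is one possible reading of "over 50% black" where a pixel counts as black only when every channel is below 50 (the helper name is_mostly_black and the per-channel threshold are assumptions you may want to adjust):
from PIL import Image
import glob

def is_mostly_black(path, channel_thresh=50, fraction=0.5):
    # a pixel counts as "black" only when every channel is below channel_thresh
    img = Image.open(path).convert('RGB')
    pixels = list(img.getdata())
    nblack = sum(1 for r, g, b in pixels
                 if r < channel_thresh and g < channel_thresh and b < channel_thresh)
    return nblack / len(pixels) > fraction

dark_images = [p for p in glob.glob('/content/drive/MyDrive/cropped_images/*.png') if is_mostly_black(p)]
print(dark_images)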

Related

Python add padding to images that need it

I have a bunch of images that aren't equal size, and where some fit entirely to the frame and some have blank padding.
I would like to know how I can resize each of them to be the same image size and to have roughly the same border size.
Currently I am doing
from PIL import Image
from glob import glob
images = glob('src/assets/emotes/medals/**/*.png', recursive=True)
for image_path in images:
    im = Image.open(image_path).convert('RGBA')
    im = im.resize((100, 100))
    im.save(image_path)
but this doesn't account for a possible border.
Image 1 - 101 x 101
Image 2 - 132 x 160
Desired result - 100 x 100
Images aren't always bigger than (100, 100), so I will need to use resize.
I can also maybe remove the PNG border for all images, and then resize which might be easier.
As described in "Crop a PNG image to its minimum size", im.getbbox() will give you the bounding box of the image content without the transparent background, which you can then pass to im.crop().
Documentation: Pillow (PIL Fork)
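A minimal sketch combining the two ideas (crop to the content with getbbox, then scale and centre onto a fixed canvas); the TARGET and BORDER values are assumptions to tune:
from PIL import Image
from glob import glob

TARGET = 100   # final canvas size (assumed)
BORDER = 10    # rough border to leave around the content (assumed)

for image_path in glob('src/assets/emotes/medals/**/*.png', recursive=True):
    im = Image.open(image_path).convert('RGBA')
    bbox = im.getbbox()                 # bounding box of the non-transparent content
    if bbox:
        im = im.crop(bbox)              # strip any existing transparent padding
    inner = TARGET - 2 * BORDER
    scale = min(inner / im.width, inner / im.height)   # works for smaller images too
    im = im.resize((max(1, round(im.width * scale)), max(1, round(im.height * scale))))
    canvas = Image.new('RGBA', (TARGET, TARGET), (0, 0, 0, 0))
    canvas.paste(im, ((TARGET - im.width) // 2, (TARGET - im.height) // 2), im)
    canvas.save(image_path)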

How can I make the text of a photo list clearer?

I have about a hundred photos that aren't very sharp and I'd like to make them sharper.
So I created a Python script that already tries it with one photo. I have tried PIL, OpenCV and OCR readers to read text from images.
# External libraries used for
# Image IO
from PIL import Image
# Morphological filtering
from skimage.morphology import opening
from skimage.morphology import disk
# Data handling
import numpy as np
# Connected component filtering
import cv2
black = 0
white = 255
threshold = 160
# Open input image in grayscale mode and get its pixels.
img = Image.open("image3.png").convert("LA")
pixels = np.array(img)[:,:,0]
# Remove pixels above threshold
pixels[pixels > threshold] = white
pixels[pixels < threshold] = black
# Morphological opening
blobSize = 1 # Select the maximum radius of the blobs you would like to remove
structureElement = disk(blobSize) # you can define different shapes, here we take a disk shape
# We need to invert the image such that black is background and white foreground to perform the opening
pixels = np.invert(opening(np.invert(pixels), structureElement))
# Create and save new image.
newImg = Image.fromarray(pixels).convert('RGB')
newImg.save("newImage1.PNG")
# Find the connected components (black objects in your image)
# Because the function searches for white connected components on a black background, we need to invert the image
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(np.invert(pixels), connectivity=8)
# For every connected component in your image, you can obtain the number of pixels from the stats variable in the last
# column. We remove the first entry from sizes, because this is the entry of the background connected component
sizes = stats[1:,-1]
nb_components -= 1
# Define the minimum size (number of pixels) a component should consist of
minimum_size = 100
# Create a new image
newPixels = np.ones(pixels.shape)*255
# Iterate over all components in the image, only keep the components larger than minimum size
for i in range(1, nb_components):
    if sizes[i] > minimum_size:
        newPixels[output == i+1] = 0
# Create and save new image.
newImg = Image.fromarray(newPixels).convert('RGB')
newImg.save("newImage2.PNG")
But it returns:
I would prefer the result not to be black and white; the best output would be one which upscales both the text and the image.
As mentioned in the comments, the quality is very bad. This is not an easy problem. However, there may be a couple of tricks you can try.
This looks like it is due to some anti-aliasing that has been applied to the image/scan. I would try reversing the anti-aliasing if possible. As described in the post, the steps would be similar to this:
Apply low pass filter
difference = original_image - low_pass_image
sharpened_image = original_image + alpha*difference
Code may look something like this:
from skimage.filters import gaussian

alpha = 1  # sharpening factor
# original_image: the input photo as a float array
low_pass_image = gaussian(original_image, sigma=1)
difference = original_image - low_pass_image
sharpened_image = original_image + alpha * difference
Also, scikit-image has an implementation of an unsharp mask as well as the Wiener filter.
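A minimal sketch of the unsharp-mask route, assuming the greyscale input from the question ("image3.png"); the radius and amount values are assumptions to experiment with:
import numpy as np
from skimage import io, img_as_float
from skimage.filters import unsharp_mask

original_image = img_as_float(io.imread("image3.png", as_gray=True))
sharpened = unsharp_mask(original_image, radius=2, amount=1.5)
io.imsave("sharpened.png", (np.clip(sharpened, 0, 1) * 255).astype(np.uint8))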

How to reduce an image portion with the numpy.compress method? (numpy + scikit-image)

Hi, using the sample image phantom.png I'm following some operations with the numpy + skimage libraries, and after some modifications the last exercise asks for:
Compress the size of center spots by 50% and plot the final image.
These are the steps I do before.
I read the image doing
img = imread(os.path.join(data_dir, 'phantom.png'))
Then apply following to make it black and white
img[np.less_equal(img[:,:,0],50)] = 0
img[np.greater_equal(img[:,:,0],51)] = 255
Took couple of slices of the image (the black spots) with given coordinates
img_slice=img.copy()
img_slice=img_slice[100:300, 100:200]
img_slice2=img.copy()
img_slice2=img_slice2[100:300, 200:300]
Now flip them
img_slice=np.fliplr(img_slice)
img_slice2=np.fliplr(img_slice2)
And put them back into an image copy
img2=img.copy()
img2[100:300, 200:300]=img_slice
img2[100:300, 100:200]=img_slice2
And this is the resulting image before the final ("compress") exercise:
Then I'm asked to "reduce" the black spots by using the numpy.compress method.
The expected result after using "compress" method is the following image (screenshot) where the black spots are reduced by 50%:
But I have no clue how to use the numpy.compress method on the image or image slices to get that result, not even close; all I get is chunks of the image that look like cropped or stretched portions of it.
I will appreciate any help/explanation about how the numpy.compress method works for this matter and even if is feasible to use it for this.
You seem ok with cropping and extracting, but just stuck on the compress aspect. So, crop out the middle and save that as im and we will compress that in the next step. Fill the area you cropped from with white.
Now, compress the part you cropped out. In order to reduce by 50%, you need to take alternate rows and alternate columns, so:
# Generate a vector alternating between True and False the same height as "im"
a = [(i%2)==0 for i in range(im.shape[0])]
# Likewise for the width
b = [(i%2)==0 for i in range(im.shape[1])]
# Now take alternate rows with numpy.compress()
r = np.compress(a,im,0)
# And now take alternate columns with numpy.compress()
res = np.compress(b,r,1)
Finally put res back in the original image, offset by half its width and height relative to where you cut it from.
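A minimal sketch of that last paste-back step, assuming the crop came from img[y0:y1, x0:x1] (placeholder coordinates) and res is the compressed result from above:
# y0, y1, x0, x1 are the placeholder coordinates the crop was taken from
h, w = res.shape[:2]
img[y0:y1, x0:x1] = 255                  # white out the area we cut from
off_y = y0 + (y1 - y0 - h) // 2          # centre the half-size patch in the old area
off_x = x0 + (x1 - x0 - w) // 2
img[off_y:off_y + h, off_x:off_x + w] = res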
I guess you can slice off the center spots first by:
center_spots = img2[100:300,100:300]
Then you can replace the center-spot values in the original image with 255 (white):
img2[100:300,100:300] = 255
Then compress center_spots by 50% along both axes and add the result back to img2.
The compressed image shape will be (100,100), so add it to img2[150:250,150:250], as sketched below.
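A minimal sketch of what that describes (the coordinates are the ones from the question):
center_spots = img2[100:300, 100:300].copy()   # copy, because the area is whitened next
img2[100:300, 100:300] = 255                   # white out the original area
# keep every other row and every other column -> 50% reduction along each axis
center_small = center_spots.compress([i % 2 == 0 for i in range(center_spots.shape[0])], axis=0)
center_small = center_small.compress([i % 2 == 0 for i in range(center_small.shape[1])], axis=1)
img2[150:250, 150:250] = center_small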
Check the below code for the output you want. Comment if you need explanation for the below code.
import os.path
from skimage.io import imread
from skimage import data_dir
import matplotlib.pyplot as plt
import numpy as np
img = imread(os.path.join(data_dir, 'phantom.png'))
img[np.less_equal(img[:,:,0],50)] = 0
img[np.greater_equal(img[:,:,0],51)] = 255
img_slice=img[100:300,100:200]
img_slice2=img[100:300,200:300]
img_slice=np.fliplr(img_slice)
img_slice2=np.fliplr(img_slice2)
img2=img.copy()
img2[100:300, 200:300]=img_slice
img2[100:300, 100:200]=img_slice2
#extract the left and right images
img_left = img2[100:300,100:200]
img_right = img2[100:300,200:300]
#reduce the size of the images extracted using compress
#numpy.compress([list of states as True,False,... or 1,0,1,...], axis=0 to select rows, axis=1 to select columns)
#In the state list, wherever the entry is False or 0, that particular row/column will be removed from the matrix or image
#note: len(A) -> number of rows and len(A[0]) number of columns
#reducing the height-> axis = 0
img_left = img_left.compress([not(i%2) for i in range(len(img_left))],axis = 0)
#reducing the width-> axis = 1
img_left = img_left.compress([not(i%2) for i in range(len(img_left[0]))],axis = 1)
#reducing the height-> axis = 0
img_right = img_right.compress([not(i%2) for i in range(len(img_right))],axis = 0)
#reducing the width-> axis = 1
img_right = img_right.compress([not(i%2) for i in range(len(img_right[0]))],axis = 1)
#clearing the area before pasting the left and right minimized images
img2[100:300,100:200] = 255 #255 is for whitening the pixel
img2[100:300,200:300] = 255
#paste the reduced size images back into the main picture(but notice the coordinates!)
img2[150:250,125:175] = img_left
img2[150:250,225:275] = img_right
plt.imshow(img2)
The numpy.compress documentation is here.
# Same idea in a compact form: extract the centre region, down-sample it with
# compress, white out the original area and paste the smaller version back.
eyes = copy[100:300, 100:300]
eyes1 = eyes
e = [(i % 2 == 0) for i in range(eyes.shape[0])]   # keep every other row
f = [(i % 2 == 0) for i in range(eyes.shape[1])]   # keep every other column
eyes1 = eyes1.compress(e, axis=0)
eyes1 = eyes1.compress(f, axis=1)
# plt.imshow(eyes1)
copy[100:300, 100:300] = 255
copy[150:250, 150:250] = eyes1
plt.imshow(copy)

Creating a greyscale image with a Matrix in python

I'm Marius, a first-year maths student.
We have received a team assignment where we have to implement a Fourier transform, and we chose to try to encode the transformation of an image to a JPEG image.
To simplify the problem for ourselves, we chose to do it only for greyscale pictures.
This is my code so far:
from PIL import Image
import numpy as np
import sympy as sp
#
# ALL OF THIS IS INFORMATION, NO CALCULATIONS YET
img = Image.open('mario.png')
img = img.convert('L')  # convert to monochrome picture
img.show()  # opens the picture
pixels = list(img.getdata())
print(pixels)  # to see if we got the pixel numeric values correct
grootte = list(img.size)
print(len(pixels))  # to check if the amount of pixels is correct
kolommen, rijen = img.size
print("the number of columns is", kolommen, "the number of rows is", rijen)
# up to here it is all information
pixelMatrix = []
while pixels != []:
    pixelMatrix.append(pixels[:kolommen])
    pixels = pixels[kolommen:]
print(pixelMatrix)
pixelMatrix = np.array(pixelMatrix)
print(pixelMatrix.shape)
Now the problem arises in the last 3 lines. I want to convert the matrix of values back into an image, with the matrix pixelMatrix as its content.
I've tried many things, but this seems to be the most obvious way:
im2 = Image.new('L',(kolommen,rijen))
im2.putdata(pixels)
im2.show()
When I use this, it just gives me a black image of the correct dimensions.
Any ideas on how to get back the original picture, starting from the values in my matrix pixelMatrix?
Post Scriptum: We still have to implement the transformation itself, but that would be useless unless we are sure we can convert a matrix back into a greyscaled image.
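One likely cause, given the code above, is that the while loop has emptied pixels, so im2.putdata(pixels) receives an empty list and the new image stays black. A minimal sketch that rebuilds the image straight from pixelMatrix instead (assuming it holds 0-255 grey values):
import numpy as np
from PIL import Image

# the while loop above consumed `pixels`, so build the image from the matrix
arr = np.array(pixelMatrix, dtype=np.uint8)   # rijen x kolommen array of grey values
im2 = Image.fromarray(arr, mode='L')
im2.show()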

Tesseract OCR is not working on images that have text of length 2 or less. It works fine for images with text length greater than 3

import pytesseract
from PIL import Image

def textFromTesseractOCR(croppedImage):
    for i in range(14):
        text = pytesseract.image_to_string(croppedImage, lang='eng', boxes=False, config='--psm ' + str(i) + ' --oem 3')
        print("PSM Mode", i)
        print("Text detected: ", text)

imgPath = "ImagePath"  # you can use the image I have uploaded
img = Image.open(imgPath)
textFromTesseractOCR(img)
I am working on extracting table data from PDFs. For this I convert the PDF to PNG, detect lines, reconstruct the table from the line intersections, and then crop the individual cells to get their text.
This all works fine, but tesseract is not working on cell images that have text of length 2 or less.
Works for this image:
Result from tesseract:
Does not work for this image:
Result from tesseract: returns an empty string.
It also returns empty for numbers of text length 2 or less.
I have tried resizing the image (which I knew wouldn't work), and also tried appending dummy text to the image, but the result was bad (it worked only for a few cases, and I didn't know the exact location to append the dummy text in the image).
It would be great if someone could help me with this.
So I finally came up with a workaround for this situation, the situation being tesseract-OCR returning an empty string when the image contains only a 1- or 2-character string (e.g. "1" or "25").
To get output in this situation, I appended the same image to the original image multiple times so as to make its text length greater than 2. For example, if the original image contained only "3", I appended the "3" image (the same image) 4 more times, thereby making an image which contains the text "33333". We then give this image to tesseract, which outputs "33333" (most of the time). Then we just remove the spaces from tesseract's output and keep the first fifth of the resulting string (its length divided by 5) as the final text.
Please see code for reference, hope this helps:
import pytesseract  ## pip3 install pytesseract
import numpy as np
import cv2
Method which calls tesseract for OCR or calls our workaround code if we get an empty string from tesseract output.
def textFromTesseractOCR(croppedImage):
    text = pytesseract.image_to_string(croppedImage)
    if text.strip() == '':  ### program that handles our problem
        if 0 not in croppedImage:
            return ""
        yDir = 3
        xDir = 3
        iterations = 4
        img = generate_blocks_dilation(croppedImage, yDir, xDir, iterations)
        ## we dilate to get only the text portion of the image and not the whole image
        kernelH = np.ones((1, 5), np.uint8)
        kernelV = np.ones((5, 1), np.uint8)
        img = cv2.dilate(img, kernelH, iterations=1)
        img = cv2.dilate(img, kernelV, iterations=1)
        image = cropOutMyImg(img, croppedImage)
        concateImg = np.concatenate((image, image), axis=1)
        concateImg = np.concatenate((concateImg, image), axis=1)
        concateImg = np.concatenate((concateImg, image), axis=1)
        concateImg = np.concatenate((concateImg, image), axis=1)
        textA = pytesseract.image_to_string(concateImg)
        textA = textA.strip()
        textA = textA.replace(" ", "")
        textA = textA[0:int(len(textA) / 5)]
        return textA
    return text
Method for dilation.This method is used to dilate only the text region of the image
def generate_blocks_dilation(img, yDir, xDir, iterations):
    kernel = np.ones((yDir, xDir), np.uint8)
    ret, img = cv2.threshold(img, 0, 1, cv2.THRESH_BINARY_INV)
    return cv2.dilate(img, kernel, iterations=iterations)
Method to crop the dilated part of the image
def cropOutMyImg(gray, OrigImg):
    mask = np.zeros(gray.shape, np.uint8)  # mask image: the final image without small pieces
    _, contours, hierarchy = cv2.findContours(gray, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours:
        if cv2.contourArea(cnt) != 0:
            cv2.drawContours(mask, [cnt], 0, 255, -1)  # the [] around cnt and 3rd argument 0 mean only the particular contour is drawn
            # Build a ROI to crop the QR
            x, y, w, h = cv2.boundingRect(cnt)
            roi = mask[y:y+h, x:x+w]
            # crop the original QR based on the ROI
            QR_crop = OrigImg[y:y+h, x:x+w]
            # use the cropped mask image (roi) to get rid of all small pieces
            QR_final = QR_crop * (roi / 255)
    return QR_final
I tried running tesseract on the 2 given images, but it does not return text for the shorter-text image.
Another thing you can try is "Train a machine learning model (probably neural net) to on alphabets, numbers and special character, then when you want to get text from image, feed that image to model and it will predict text/characters."
Training dataset would look like :
Pair of (Image of character, 'character').
First element of pair is independent variable for model.
Second element of pair is corresponding character present in that image. It will be dependent variable for model.
