I have set of 17 images and one of them has a highlighted pixel for my use. But, when I merge these 17 images, I get the color but it diffuses out of the pixel boundaries and I start seeing some colored pixel in black background.
I am using PIL library for the merging. I am attaching my code and the images for the reference. Any help would be appreciated.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Cretaing the Pixel array
from PIL import Image
from PIL import ImageColor
img_path = '/Volumes/MY_PASSPORT/JRF/cancer_genome/gopal_gen/png_files/'
image_list = []
for entry in os.listdir(img_path):
if entry.endswith('.png'):
entry = int(entry.rstrip('.csv.png'))
image_list.append(entry)
image_list.sort()
list_img = []
for j in range(len(image_list)):
stuff = str(image_list[j])+'.csv.png'
list_img.append(stuff)
#print(list_img[0])
images = [Image.open(img_path+x) for x in list_img]
widths, heights = zip(*(i.size for i in images))
total_width = sum(widths)
max_height = max(heights)
#print(total_width, max_height)
new_im = Image.new('RGB', (total_width, max_height))
x_offset = 0
for im in images:
new_im.paste(im, (x_offset,0))
#print(im.size)
x_offset += im.size[0]
#print(x_offset)
new_im.save(img_path+'final_result_image.jpg')
Here is the combined image:
The third column has a pixel highlighted.
Here is the zoomed in part with the problem.
The JPEG format is lossy - it is allowed to change your pixels to make the file smaller. If your image is a conventional photo of a real-life scene, this doesn't normally matter. If your data is a blocky, computer-generated image, or a set of classes from a classification process, it can go horribly wrong if you use JPEG.
So, the answer is to use PNG (or potentially TIFF) format for images that need to be lossless.
I am new to OpenCV and trying to see if I can find a way to detect vertical text for the image attached.
In this case on row 3 , I would like to get the bounding box around Original Cost and the amount below ($200,000.00).
Similarly I would like to get the bounding box around Amount Existing Liens and the associated amount below. I then would use this data to send to an OCR engine to read text. Traditional OCR engines go line by line and extract and loses the context.
Here is what I have tried so far -
import cv2
import numpy as np
img = cv2.imread('Test3.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,100,100,apertureSize = 3)
cv2.imshow('edges',edges)
cv2.waitKey(0)
minLineLength = 20
maxLineGap = 10
lines = cv2.HoughLinesP(edges,1,np.pi/180,15,minLineLength=minLineLength,maxLineGap=maxLineGap)
for x in range(0, len(lines)):
for x1,y1,x2,y2 in lines[x]:
cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2)
cv2.imshow('hough',img)
cv2.waitKey(0)
Here is my solution based on Kanan Vyas and Adrian Rosenbrock
It's probably not as "canonical" as you'd wish.
But it seems to work (more or less...) with the image you provided.
Just a word of CAUTION: The code looks within the directory from which it is running, for a folder named "Cropped" where cropped images will be stored. So, don't run it in a directory which already contains a folder named "Cropped" because it deletes everything in this folder at each run. Understood? If you're unsure run it in a separate folder.
The code:
# Import required packages
import cv2
import numpy as np
import pathlib
###################################################################################################################################
# https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/
###################################################################################################################################
def sort_contours(cnts, method="left-to-right"):
# initialize the reverse flag and sort index
reverse = False
i = 0
# handle if we need to sort in reverse
if method == "right-to-left" or method == "bottom-to-top":
reverse = True
# handle if we are sorting against the y-coordinate rather than
# the x-coordinate of the bounding box
if method == "top-to-bottom" or method == "bottom-to-top":
i = 1
# construct the list of bounding boxes and sort them from top to
# bottom
boundingBoxes = [cv2.boundingRect(c) for c in cnts]
(cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
key=lambda b:b[1][i], reverse=reverse))
# return the list of sorted contours and bounding boxes
return (cnts, boundingBoxes)
###################################################################################################################################
# https://medium.com/coinmonks/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26 (with a few modifications)
###################################################################################################################################
def box_extraction(img_for_box_extraction_path, cropped_dir_path):
img = cv2.imread(img_for_box_extraction_path, 0) # Read the image
(thresh, img_bin) = cv2.threshold(img, 128, 255,
cv2.THRESH_BINARY | cv2.THRESH_OTSU) # Thresholding the image
img_bin = 255-img_bin # Invert the imagecv2.imwrite("Image_bin.jpg",img_bin)
# Defining a kernel length
kernel_length = np.array(img).shape[1]//200
# A verticle kernel of (1 X kernel_length), which will detect all the verticle lines from the image.
verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))
# A horizontal kernel of (kernel_length X 1), which will help to detect all the horizontal line from the image.
hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))
# A kernel of (3 X 3) ones.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))# Morphological operation to detect verticle lines from an image
img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
#cv2.imwrite("verticle_lines.jpg",verticle_lines_img)# Morphological operation to detect horizontal lines from an image
img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
#cv2.imwrite("horizontal_lines.jpg",horizontal_lines_img)# Weighting parameters, this will decide the quantity of an image to be added to make a new image.
alpha = 0.5
beta = 1.0 - alpha
# This function helps to add two image with specific weight parameter to get a third image as summation of two image.
img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
(thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)# For Debugging
# Enable this line to see verticle and horizontal lines in the image which is used to find boxes
#cv2.imwrite("img_final_bin.jpg",img_final_bin)
# Find contours for image, which will detect all the boxes
contours, hierarchy = cv2.findContours(
img_final_bin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Sort all the contours by top to bottom.
(contours, boundingBoxes) = sort_contours(contours, method="top-to-bottom")
idx = 0
for c in contours:
# Returns the location and width,height for every contour
x, y, w, h = cv2.boundingRect(c)# If the box height is greater then 20, widht is >80, then only save it as a box in "cropped/" folder.
if (w > 50 and h > 20):# and w > 3*h:
idx += 1
new_img = img[y:y+h, x:x+w]
cv2.imwrite(cropped_dir_path+str(x)+'_'+str(y) + '.png', new_img)
###########################################################################################################################################################
def prepare_cropped_folder():
p=pathlib.Path('./Cropped')
if p.exists(): # Cropped folder non empty. Let's clean up
files = [x for x in p.glob('*.*') if x.is_file()]
for f in files:
f.unlink()
else:
p.mkdir()
###########################################################################################################################################################
# MAIN
###########################################################################################################################################################
prepare_cropped_folder()
# Read image from which text needs to be extracted
img = cv2.imread("dkesg.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Performing OTSU threshold
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
thresh1=255-thresh1
bin_y=np.zeros(thresh1.shape[0])
for x in range(0,len(bin_y)):
bin_y[x]=sum(thresh1[x,:])
bin_y=bin_y/max(bin_y)
ry=np.where(bin_y>0.995)[0]
for i in range(0,len(ry)):
cv2.line(img, (0, ry[i]), (thresh1.shape[1], ry[i]), (0, 0, 0), 1)
# We need to draw abox around the picture with a white border in order for box_detection to work
cv2.line(img,(0,0),(0,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(img.shape[1]-1,0),(img.shape[1]-1,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(0,0),(img.shape[1]-1,0),(255,255,255),2)
cv2.line(img,(0,img.shape[0]-1),(img.shape[1]-1,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(0,0),(0,img.shape[0]-1),(0,0,0),1)
cv2.line(img,(img.shape[1]-3,0),(img.shape[1]-3,img.shape[0]-1),(0,0,0),1)
cv2.line(img,(0,0),(img.shape[1]-1,0),(0,0,0),1)
cv2.line(img,(0,img.shape[0]-2),(img.shape[1]-1,img.shape[0]-2),(0,0,0),1)
cv2.imwrite('out.png',img)
box_extraction("out.png", "./Cropped/")
Now... It puts the cropped regions in the Cropped folder. They are named as x_y.png with (x,y) the position on the original image.
Here are two examples of the outputs
and
Now, in a terminal. I used pytesseract on these two images.
The results are the following:
1)
Original Cost
$200,000.00
2)
Amount Existing Liens
$494,215.00
As you can see, pytesseract got the amount wrong in the second case... So, be careful.
Best regards,
Stéphane
I assume the bounding box is fix (rectangle that able to fit in "Original Amount and the amount below). You can use text detection to detect the "Original Amount" and "Amount Existing Liens" using OCR and crop out the image based on the detected location for further OCR on the amount. You can refer this link for text detection
Try to divide the image into different cells using the lines in the image.
For example, first divide the input into rows by detecting the horizontal lines. This can be done by using cv.HoughLinesP and checking for each line if the difference between y-coordinate of the begin and end point is smaller than a certain threshold abs(y2 - y1) < 10. If you have a horizontal line, it's a separator for a new row. You can use the y-coordinates of this line to split the input horizontally.
Next, for the row you're interested in, divide the region into columns using the same technique, but now make sure the difference between the x-coordinates of the begin and end point are smaller than a certain threshold, since you're now looking for the vertical lines.
You can now crop the image to different cells using the y-coordinates of the horizontal lines and the x-coordinates of the vertical lines. Pass these cropped regions one by one to the OCR engine and you'll have for each cell the corresponding text.
I want to highlight specific words/sentences in a website screenshot.
Once the screenshot is taken, I extract the text using pytesseract and cv2. That works well and I can get text and data about it.
import pytesseract
import cv2
if __name__ == "__main__":
img = cv2.imread('test.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
result = pytesseract.image_to_data(img, lang='eng', nice=0, output_type=pytesseract.Output.DICT)
print(result)
Using the results object I can find needed words and sentences.
The question is how to go back to the image and highlight those word?
Should I be looking at other libraries or there is a way to get pixel values and then highlight the text?
Ideally, I would like to get start and end coordinates of each word, how can that be done?
You can use pytesseract.image_to_boxes method to get the bounding box position of each character identified in your image. You can also use the method to draw bounding box around some specific characters if you want. Below code draws rectangles around my identified image.
import cv2
import pytesseract
import matplotlib.pyplot as plt
filename = 'sf.png'
# read the image and get the dimensions
img = cv2.imread(filename)
h, w, _ = img.shape # assumes color image
# run tesseract, returning the bounding boxes
boxes = pytesseract.image_to_boxes(img)use
print(pytesseract.image_to_string(img)) #print identified text
# draw the bounding boxes on the image
for b in boxes.splitlines():
b = b.split()
cv2.rectangle(img, ((int(b[1]), h - int(b[2]))), ((int(b[3]), h - int(b[4]))), (0, 255, 0), 2)
plt.imshow(img)
So After some image preprocessing I have gotten an image which holds 5 contours
(The image was resized for posting here in stackoverflow):
I'd like to remove all "islands" except for the actual letter,
So at first I tried using cv2.erode and cv2.dilate with all kinds of kernels sizes and it didn't do the job, so I decided to remove by masking all contours except the largest one by this:
_, cnts, _ = cv2.findContours(original, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
I would expect according to the given image there would be 5 contours
areas = []
for contour in cnts:
area = cv2.contourArea(contour)
areas.append(area)
relevant_indexes = list(range(1, len(cnts)))
relevant_indexes.remove(areas.index(max(areas)))
mask = numpy.zeros(eroded.shape).astype(eroded.dtype)
color = 255
for i in relevant_indexes:
cv2.fillPoly(mask, cnts[i], color)
cv2.imwrite("mask.png", mask)
// Trying to mask out the noise
result = cv2.bitwise_xor(orifinal, mask)
cv2.imwrite("result.png", result)
But the mask I get is:
it's not what I would expect, and the left down contour is missing,
can someone PLEASE explain me what am I missing here? And what would be the correct approach for getting rid of those "isolated islands"?
Thank you all!
p.s
The original photo I'm working on:
Solution:
It sounds like you want to mask out the largest connected component (cv-speak for "island").
Here's an opencv/python script to do that:
#!/usr/bin/env python
import cv2
import numpy as np
import console
# load image in grayscale
img = cv2.imread("img.png", 0)
# get all connected components
_, output, stats, _ = cv2.connectedComponentsWithStats(img, connectivity=4)
# get a list of areas for each group label
group_areas = stats[cv2.CC_STAT_AREA]
# get the id of the group with the largest area (ignoring 0, which is the background id)
max_group_id = np.argmax(group_areas[1:]) + 1
# get max_group_id mask and save it as an image
max_group_id_mask = (output == max_group_id).astype(np.uint8) * 255
cv2.imwrite("output.png", max_group_id_mask)
Result:
Here's the result of the above script on your sample image:
I'm trying to run a python code to detect a car in an image and draw a box around it. But somehow, I'm not getting the box even though there is no error. The code I'm using is below:
The function to draw the box
def draw_boxes(img, bboxes, color=(0, 0, 255), thick=6):
# Make a copy of the image
imcopy = np.copy(img)
# Iterate through the bounding boxes
for bbox in bboxes:
# Draw a rectangle given bbox coordinates
cv2.rectangle(imcopy, bbox[0], bbox[1], color, thick)
# Return the image copy with boxes drawn
return imcopy
The function to draw the box
def search_windows(img, windows, model):
#1) Create an empty list to receive positive detection windows
#2) Iterate over all windows in the list
test_images = []
for window in windows:
#3) Extract the test window from original image
test_img = cv2.resize(img[window[0][1]:window[1][1], window[0][0]:window[1][0]], (64, 64))
# Normalize image
test_img = test_img/255
# Predict and round the result
test_images.append(test_img)
test_images = np.array(test_images)
prediction = np.around(model.predict(test_images))
on_windows = [windows[i] for i in np.where(prediction==1)[0]]
return on_windows
Read an image and use the functions to draw the box
img = mpimg.imread(test_images[0])
detected_windows = search_windows(img, windows, model)
window_img = draw_boxes(img, detected_windows, color=(0, 255, 0), thick=3)
plt.imshow(window_img)
Thanks in adanvce.
I found the answer. I need to set a better limit for my prediction in the code. If I set 0.4 or 0.5 as the value, then I get the boxes and then work my way for better predictions.