I am using the Python libraries PyPDF2 and reportlab to add text fields to an existing PDF.
I currently use the function
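import io
from PyPDF2 import PdfFileReader, PdfFileWriter
from reportlab.lib.colors import black, blue
from reportlab.lib.pagesizes import landscape, letter
from reportlab.pdfgen import canvas
def makeTextFields():
    # Draw the form field on an in-memory, single-page PDF
    packet = io.BytesIO()
    can = canvas.Canvas(packet, pagesize=landscape(letter))
    can.acroForm.textfield(name='fname', tooltip='First Name',
                           x=500, y=20, borderStyle='inset',
                           borderColor=blue, fillColor=blue,
                           width=79, height=24,
                           textColor=black, forceBorder=False, annotationFlags="")
    can.showPage()
    can.save()
    # Rewind the buffer so PyPDF2 can read it from the start
    packet.seek(0)
    text_fields = PdfFileReader(packet)
    return text_fields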
to create a PDF with the text fields, and then the following to load the base PDF, merge, and save:
main = PdfFileReader(open("master.pdf", 'rb'))
text_fields = makeTextFields()
output = PdfFileWriter()
text_field_page = text_fields.getPage(0)
page = main.getPage(0)
page.mergePage(text_field_page)
output.addPage(text_field_page)
stream = open("dest.pdf", "wb")
output.write(stream)
stream.close()
This solution would work fine; however, master.pdf has a rotation of 90 degrees. This means that when page.mergePage is called, the text-field PDF is automatically rotated 90 degrees to match the base PDF, which leaves the text fields turned 90 degrees with sideways text.
WHAT I'VE TRIED
I have tried replacing page.mergePage with page.mergeRotatedTranslatedPage, with no luck. I have also tried setting annotationFlags="norotate", which according to the reportlab docs should allow the text field to ignore canvas rotation, but that did not work. Lastly, I tried
can.saveState()
can.rotate(90)
can.acroForm.textfield(name='fname', tooltip='First Name',
x=500, y=20, borderStyle='inset',
borderColor=blue, fillColor=blue,
width=79, height=24,
textColor=black, forceBorder=False, annotationFlags ="")
can.restoreState()
in the hope of giving the text field a 90-degree offset relative to the page, so that it ends up at 0 degrees once the page is rotated by 90, but that seemed to have no effect.
I believe the solution will lie in either finding a way to nullify the rotation on the text field,
applying an initial rotation to the text field, or merging the two PDFs without matching the rotation. However, any other solutions or libraries are appreciated.
I am also open to creating the PDF in another program and just merging them using Python, or to using a different language if Python isn't the best language for the job.
Try this:
text_field_page.mergeRotatedTranslatedPage(page, -90, page.mediaBox.getWidth() / 2, page.mediaBox.getWidth() / 2)
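To see where content lands with that call, here is a small sketch of the geometry only. It assumes (this is my reading of PyPDF2 1.x, not something stated above) that mergeRotatedTranslatedPage rotates counterclockwise by the given angle around the point (tx, ty). The helper name rotate_about and the page size are hypothetical; the field corner (500, 20) is the one from the question.

```python
import math


def rotate_about(point, pivot, degrees):
    """Rotate point counterclockwise by degrees around pivot (y axis pointing up)."""
    angle = math.radians(degrees)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    x = pivot[0] + dx * math.cos(angle) - dy * math.sin(angle)
    y = pivot[1] + dx * math.sin(angle) + dy * math.cos(angle)
    # Round away floating-point noise from cos/sin of right angles
    return round(x, 6), round(y, 6)


# Assumed page: landscape letter, 792 x 612 points; pivot at half the width
half_width = 792 / 2
print(rotate_about((500, 20), (half_width, half_width), -90))
```

If the field ends up off the page with your real mediaBox values, adjusting tx and ty (the pivot) is the first thing to try.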
I would like to remove gridlines from a scanned document using Python to make them easier to read.
Here is a snippet of what we're working with:
As you can see, there are inconsistencies in the grid, and to make matters worse the scanning isn't always square. Five example documents can be found here.
I am open to whatever methods you may suggest for this, but using OpenCV and pypdf might be a good place to start before breaking out more involved machine learning techniques.
This post addresses a similar question, but does not have a solution. The user posted the following code snippet, which may be of interest (to be honest, I have not tested it; I am just putting it here for your convenience).
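import cv2
import numpy as np
def rmv_lines(Image_Path):
    img = cv2.imread(Image_Path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize=3)
    minLineLength, maxLineGap = 100, 15
    # Pass length and gap by keyword; given positionally they land in the wrong parameters
    lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100,
                            minLineLength=minLineLength, maxLineGap=maxLineGap)
    # HoughLinesP returns None when no line is found
    for line in (lines if lines is not None else []):
        for x1, y1, x2, y2 in line:
            #if x1 != x2 and y1 != y2:
            # Paint over each detected segment in white
            cv2.line(img, (x1, y1), (x2, y2), (255, 255, 255), 4)
    return cv2.imwrite('removed.jpg', img)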
I would prefer the final documents be in pdf format if possible.
(disclaimer: I am the author of pText, the library being used in this answer)
I can help you part of the way (extracting the images from the PDF).
Start by loading the Document.
You'll see that I'm passing an extra parameter in the PDF.loads method.
SimpleImageExtraction acts like an EventListener for PDF instructions. Whenever it encounters an instruction that would render an image, it intercepts the instruction and stores the image.
with open(file, "rb") as pdf_file_handle:
    l = SimpleImageExtraction()
    doc = PDF.loads(pdf_file_handle, [l])
Now that we have loaded the Document, and SimpleImageExtraction should have had a chance to work its magic, we can output the images. In this example I'm just going to store them.
for i, img in enumerate(l.get_images_per_page(0)):
    output_file = "image_" + str(i) + ".jpg"
    with open(output_file, "wb") as image_file_handle:
        img.save(image_file_handle)
You can obtain pText either on GitHub or from PyPI.
There are a ton more examples, check them out to find out more about working with images.
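For the remaining step (removing the grid once the page images are extracted), here is a minimal sketch of a different idea from the Hough-based snippet in the question. On a binarized page, grid lines are very long unbroken runs of dark pixels along a row or column, while text strokes are short. The sketch works on a plain list of lists (1 = dark pixel) so it runs without any library; the names erase_long_runs and remove_grid are hypothetical. On real scans the same idea is usually expressed as a morphological opening with a long, thin kernel in OpenCV, and skewed pages would need deskewing first.

```python
def erase_long_runs(rows, min_run):
    """Return a copy of a binary image with horizontal dark runs >= min_run cleared."""
    cleaned = [list(row) for row in rows]
    for row in cleaned:
        start = None
        # The trailing 0 is a sentinel that closes a run touching the right edge
        for x, value in enumerate(row + [0]):
            if value and start is None:
                start = x
            elif not value and start is not None:
                if x - start >= min_run:
                    row[start:x] = [0] * (x - start)
                start = None
    return cleaned


def remove_grid(rows, min_run):
    """Clear long horizontal and vertical runs; both are detected on the original image."""
    horizontal = erase_long_runs(rows, min_run)
    transposed = [list(col) for col in zip(*rows)]
    vertical = [list(col) for col in zip(*erase_long_runs(transposed, min_run))]
    # A pixel survives only if neither pass erased it
    return [[h & v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]


page = [
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1],
]
for row in remove_grid(page, min_run=5):
    print(row)
```

Note that text pixels lying on a grid line are erased together with the line, so min_run should be chosen well above the longest stroke you expect in the text.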
I have a bunch of medical images in dicom that I want to correct for bias field inhomogeneity using SimpleITK in Python. The workflow is straightforward: I want to (1) open the dicom image, (2) create a binary mask of the object in the image, (3) apply N4 bias field correction to the masked image, (4) write back the corrected image in dicom format. Note that no spatial transformation is applied to the image, but only intensity transformation, so that I could copy all spatial information and all meta data (except for date/hour of creation and instance number) from the original to the corrected image.
I have written this function to achieve my goal:
import time
from pathlib import PurePath
from random import randint
import SimpleITK as sitk
def n4_dcm_correction(dcm_in_file):
    # Tags regenerated for the output: creation date, creation time, instance number
    metadata_to_set = ["0008|0012", "0008|0013", "0020|0013"]
    filepath = PurePath(dcm_in_file)
    root_dir = str(filepath.parent)
    file_name = filepath.stem
    dcm_reader = sitk.ImageFileReader()
    dcm_reader.SetFileName(dcm_in_file)
    dcm_reader.LoadPrivateTagsOn()
    inputImage = dcm_reader.Execute()
    metadata_to_copy = [k for k in inputImage.GetMetaDataKeys() if k not in metadata_to_set]
    # Binary mask of the object, with holes filled
    maskImage = sitk.OtsuThreshold(inputImage, 0, 1, 200)
    filledImage = sitk.BinaryFillhole(maskImage)
    floatImage = sitk.Cast(inputImage, sitk.sitkFloat32)
    corrector = sitk.N4BiasFieldCorrectionImageFilter()
    output = corrector.Execute(floatImage, filledImage)
    output.CopyInformation(inputImage)
    for k in metadata_to_copy:
        print("key is: {}; value is {}".format(k, inputImage.GetMetaData(k)))
        output.SetMetaData(k, inputImage.GetMetaData(k))
    output.SetMetaData("0008|0012", time.strftime("%Y%m%d"))
    output.SetMetaData("0008|0013", time.strftime("%H%M%S"))
    # New instance number (0020|0013), stored as an integer string
    output.SetMetaData("0020|0013", str(int(inputImage.GetMetaData("0020|0013")) + randint(1, 999)))
    out_file = "{}/{}_biascorrected.dcm".format(root_dir, file_name)
    writer = sitk.ImageFileWriter()
    writer.KeepOriginalImageUIDOn()
    writer.SetFileName(out_file)
    writer.Execute(sitk.Cast(output, sitk.sitkUInt16))
    return
n4_dcm_correction("/path/to/my/dcm/image.dcm")
While the bias correction part works (the bias is removed), the writing part is a mess. I would expect my output DICOM to have exactly the same metadata as the original one; however, they are all missing, notably the patient name, the protocol name, and the manufacturer. Similarly, something is very wrong with the spatial information: if I try to convert the DICOM to the NIfTI format with dcm2niix, the directions are reversed. Superior is down and inferior is up, forward is back and backward is front. What step am I missing?
I suspect you are working with an MRI series, not a single file. Likely this example does what you want: read-modify-write a volume stored in a set of files.
If the example did not resolve your issue, please post to the ITK discourse which is the primary location for ITK/SimpleITK related discussions.
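One thing that may explain the missing tags in the function above: as far as I understand, SimpleITK filters (including the final sitk.Cast) return a new image that does not carry over the metadata dictionary, so tags set before the cast would be dropped at write time. A minimal sketch under that assumption sets the tags after the cast. The helper names build_output_tags and write_corrected are hypothetical, and the instance-number offset is an arbitrary choice for illustration.

```python
import time

# Tags regenerated for the output: creation date, creation time, instance number
REGENERATED_TAGS = ("0008|0012", "0008|0013", "0020|0013")


def build_output_tags(source_tags, instance_offset=1):
    """Copy every tag except the regenerated ones, then fill those in afresh."""
    tags = {k: v for k, v in source_tags.items() if k not in REGENERATED_TAGS}
    tags["0008|0012"] = time.strftime("%Y%m%d")
    tags["0008|0013"] = time.strftime("%H%M%S")
    original = source_tags.get("0020|0013", "0").strip() or "0"
    tags["0020|0013"] = str(int(original) + instance_offset)
    return tags


def write_corrected(corrected, source, out_file):
    """Cast first, then attach geometry and tags to the image that is written."""
    import SimpleITK as sitk  # imported here so the helper above stays dependency-free

    final = sitk.Cast(corrected, sitk.sitkUInt16)
    final.CopyInformation(source)
    source_tags = {k: source.GetMetaData(k) for k in source.GetMetaDataKeys()}
    for key, value in build_output_tags(source_tags).items():
        final.SetMetaData(key, value)
    writer = sitk.ImageFileWriter()
    writer.KeepOriginalImageUIDOn()
    writer.SetFileName(out_file)
    writer.Execute(final)
```

This only addresses the missing tags. The reversed directions reported by dcm2niix may have a different cause, for instance writing one slice of a series as if it were a standalone volume, which is what the linked example is about.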
Input-Sample
I am trying to pre-process my images in order to improve the ocr quality. However, I am stuck with a problem.
The images I am dealing with contain different text orientations within the same image (2 pages: the 1st is vertically oriented, the 2nd is horizontally oriented, and they are scanned into the same image).
The text direction is automatically detected for the first part. Nevertheless, the rest of the text from the other page is completely messed up.
I was thinking of creating a zonal template to detect the regions of interest, but I don't know how.
Alternatively, I could automatically detect the border, split the image adaptively, and then rotate the split part to achieve the required result.
I could split based on a fixed pixel height, but that height is not constant either.
from tesserocr import PyTessBaseAPI, RIL
import cv2
from PIL import Image
with PyTessBaseAPI() as api:
    filePath = r'sample.jpg'
    img = Image.open(filePath)
    api.SetImage(img)
    boxes = api.GetComponentImages(RIL.TEXTLINE, True)
    print('Found {} textline image components.'.format(len(boxes)))
    for i, (im, box, _, _) in enumerate(boxes):
        # im is a PIL image object
        # box is a dict with x, y, w and h keys
        api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
        ocrResult = api.GetUTF8Text()
        conf = api.MeanTextConf()
# Show each detected text line (the original loop always cropped boxes[0])
cimg = cv2.imread(filePath)
for _, box, _, _ in boxes:
    x = box.get('x')
    y = box.get('y')
    h = box.get('h')
    w = box.get('w')
    crop_img = cimg[y:y+h, x:x+w]
    cv2.imshow("cropped", crop_img)
    cv2.waitKey(0)
output image
As you can see, I can apply orientation detection, but I won't get any meaningful text out of such an image.
Try Tesseract API method GetComponentImages and then DetectOrientationScript on each component image.
I have written some Python code that creates a gif from a list of images. To do this, I used the Python library imageio. Here is my code:
from os.path import splitext
import imageio
def create_gif(files, gif_path):
    """Creates an animated gif from a list of figures
    Args:
        files (list of str) : list of the files that are to be used for the gif creation.
            All files should have the same extension which should be either png or jpg
        gif_path (str) : path where the created gif is to be saved
    Raise:
        ValueError: if the files given in argument don't have the proper
            file extension (".png" or ".jpeg" for the images in 'files',
            and ".gif" for 'gif_path')
    """
    images = []
    for image in files:
        # Make sure that the file is a ".png" or a ".jpeg" one
        if splitext(image)[-1] == ".png" or splitext(image)[-1] == ".jpeg":
            pass
        elif splitext(image)[-1] == "":
            image += ".png"
        else:
            raise ValueError("Wrong file extension ({})".format(image))
        # Reads the image with imageio and puts it into the images list
        images.append(imageio.imread(image))
    # Make sure that the file is a ".gif" one
    if splitext(gif_path)[-1] == ".gif":
        pass
    elif splitext(gif_path)[-1] == "":
        gif_path += ".gif"
    else:
        raise ValueError("Wrong file extension ({})".format(gif_path))
    # imageio writes all the images in a .gif file at the gif_path
    imageio.mimsave(gif_path, images)
When I try this code with a list of images, the gif is correctly created, but I have no idea how to change its parameters.
What I mean by that is that I would like to be able to control the delay between the gif's images, and also to control how long the gif runs.
I have tried to open my gif with the Image module from PIL and change its info, but when I save it, my gif turns into just its first image.
Could you please help me understand what I am doing wrong?
Here is the code that I ran to try to change the gif parameters:
# Try to change gif parameters
my_gif = Image.open(my_gif.name)
my_gif_info = my_gif.info
print(my_gif_info)
my_gif_info['loop'] = 65535
my_gif_info['duration'] = 100
print(my_gif.info)
my_gif.save('./generated_gif/my_third_gif.gif')
You can just pass both parameters, loop and duration, to the mimsave/mimwrite method.
imageio.mimsave(gif_name, fileList, loop=4, duration = 0.3)
Next time you want to check which parameters can be used for a format compatible with imageio you can just use imageio.help(format name).
imageio.help("gif")
GIF-PIL - Static and animated gif (Pillow)
A format for reading and writing static and animated GIF, based
on Pillow.
Images read with this format are always RGBA. Currently,
the alpha channel is ignored when saving RGB images with this
format.
Parameters for reading
----------------------
None
Parameters for saving
---------------------
loop : int
The number of iterations. Default 0 (meaning loop indefinitely).
duration : {float, list}
The duration (in seconds) of each frame. Either specify one value
that is used for all frames, or one value for each frame.
Note that in the GIF format the duration/delay is expressed in
hundredths of a second, which limits the precision of the duration.
fps : float
The number of frames per second. If duration is not given, the
duration for each frame is set to 1/fps. Default 10.
palettesize : int
The number of colors to quantize the image to. Is rounded to
the nearest power of two. Default 256.
subrectangles : bool
If True, will try and optimize the GIF by storing only the
rectangular parts of each frame that change with respect to the
previous. Default False.
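The two parameters interact: the total running time is roughly the number of frames times the per-frame duration times the loop count. Below is a minimal sketch of that arithmetic, assuming the seconds-based duration and the "number of iterations" meaning of loop described in the help text above (other imageio versions may use different units, so check imageio.help for yours). The function name gif_timing is hypothetical.

```python
import math


def gif_timing(n_frames, frame_duration_s, total_runtime_s):
    """Return (duration, loop) values that play for at least total_runtime_s."""
    # GIF stores delays in hundredths of a second, so snap the duration to that grid
    delay_cs = max(1, round(frame_duration_s * 100))
    one_pass_s = n_frames * delay_cs / 100
    loops = max(1, math.ceil(total_runtime_s / one_pass_s))
    return delay_cs / 100, loops


# 10 frames shown for 0.3 s each, animation should run for about 12 s in total
duration, loop = gif_timing(10, 0.3, 12)
print(duration, loop)
```

The two returned values can then be passed as imageio.mimsave(gif_path, images, duration=duration, loop=loop). Whether a viewer honors the loop count exactly is up to the viewer.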
I'm currently working on the final project for my coding class (my first coding class, so I'm kind of an amateur).
My idea is for the code to search every newspaper in the world for a specific word within the titles (using bs4) and then produce a dictionary with the average mentions by country, taking into account the number of newspapers in each country. Afterwards, and this is the part where I'm stuck, I want to put this on a map.
The whole program is already working properly, until the part where I have a CSV with the following form:
'Country','Average'
'Afghanistan',10
'Albania',5
'Algeria',0
'Andorra',2
'Antigua and Barbuda',7
'Argentina',0
'Armenia',4
Now, I want to create a world map where the higher the number, the redder (or any other color) the whole polygon of the country. So far I've found many code samples that work well for placing points in space, but I haven't found one that joins the CSV data presented above to the shapes and then fills each country accordingly. Below is the part of the code that currently creates the world map:
# Now we proceed with the creation of the map
fig, ax = plt.subplots(figsize=(15,10)) # We define the size of the map
m = Basemap(resolution='c', # c, l, i, h, f or None
projection='merc', # Mercator projection
lat_0=24.20, lon_0=-6.67, # The center of the map, so that the whole world is shown without splitting Asia
llcrnrlon=-180, llcrnrlat= -85,urcrnrlon=180, urcrnrlat=85) # The coordinates of the whole world
m.drawmapboundary(fill_color='#46bcec') # We choose a color for the boundary of the map
m.fillcontinents(color='#f2f2f2',lake_color='#46bcec') # We choose a color for the land and one for the lakes
m.drawcoastlines() # We choose to draw the lines of the map
m.readshapefile('Final project\\vincent_map_data-master\\ne_110m_admin_0_countries\\ne_110m_admin_0_countries', 'areas') # We import the shape file of the whole world
df_poly = pd.DataFrame({ # We define the polygon structure
'shapes': [Polygon(np.array(shape), True) for shape in m.areas],
'area': [area['name'] for area in m.areas_info]
})
cmap = plt.get_cmap('Oranges')
pc = PatchCollection(df_poly.shapes, zorder=2)
norm = Normalize()
mapper = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
# We show the map
plt.show()
I opened the shapefile of the countries, and the way to identify the countries is with the variable "sovereignty". There might be some nonsensical things in my code, since I've pieced it together from many places. Sorry about that.
If someone could help me out, I would really appreciate it.
Thanks
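The missing step described above is a join between the CSV and the shapes: look up each polygon's country name in the CSV data and turn its average into a fill color. Here is a minimal sketch of that step using only the standard library, with a simple white-to-red ramp instead of a matplotlib colormap. The function names (load_averages, shade, colors_for_shapes) and the fallback color for countries missing from the CSV are assumptions made for illustration.

```python
import csv
import io


def load_averages(csv_text):
    """Parse the single-quoted CSV from the question into {country: average}."""
    reader = csv.DictReader(io.StringIO(csv_text), quotechar="'")
    return {row["Country"]: float(row["Average"]) for row in reader}


def shade(value, vmax):
    """Map 0..vmax onto a white-to-red hex color (higher value means redder)."""
    t = 0.0 if vmax <= 0 else min(max(value / vmax, 0.0), 1.0)
    level = round(255 * (1 - t))
    return "#ff{0:02x}{0:02x}".format(level)


def colors_for_shapes(shape_names, averages, missing="#f2f2f2"):
    """One color per polygon, in the same order as the polygon list."""
    vmax = max(averages.values(), default=0)
    return [shade(averages[name], vmax) if name in averages else missing
            for name in shape_names]


sample = "'Country','Average'\n'Afghanistan',10\n'Albania',5\n'Algeria',0\n"
averages = load_averages(sample)
print(colors_for_shapes(["Afghanistan", "Algeria", "Atlantis"], averages))
```

With the code from the question, the list of names would be df_poly['area'], and the resulting colors could be applied with pc.set_facecolor(...) followed by ax.add_collection(pc). The names in the CSV have to match whichever shapefile attribute is used to label the polygons.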