How to read PDF images as OpenCV images using PyMuPDF?

I would like to read all images found in a PDF file with PyMuPDF as OpenCV images, as close as possible to the source (avoiding format conversions that would lead to precision loss). Basically, I would like the result to be exactly the same as if I were calling cv2.imread(filename), in terms of output type, color space, and so on:
# Libraries
import os
import cv2
import fitz  # PyMuPDF
import numpy as np
# Input file
filename = "myfile.pdf"
# Read all images in file as a list of opencv images
def read_images(filename):
    images = []
    _, extension = os.path.splitext(filename)
    # If it's a pdf process each image
    if extension == ".pdf":
        pdf = fitz.open(filename)
        for index in range(len(pdf)):
            page = pdf[index]
            # get_images() was called getImageList() in older PyMuPDF releases
            for im in page.get_images():
                xref = im[0]
                pix = fitz.Pixmap(pdf, xref)
                images.append(pix_to_opencv_image(pix))  # DO SOMETHING HERE
    # Otherwise just do an imread
    else:
        images.append(cv2.imread(filename))
    return images
Basically I would like to know what the function pix_to_opencv_image should be:
# Equivalent of doing a "cv2.imread" on a pdf pixmap:
def pix_to_opencv_image(pix):
    # DO SOMETHING HERE
    pass
I found examples explaining how to convert PDF pixmaps to NumPy arrays, but nothing that outputs an OpenCV image.
How can I achieve this?

I used the help() function, help(pix), to find the data descriptors associated with the pixmap.
pix.samples stores the image data as bytes. Using NumPy's frombuffer, the image array can be obtained from these bytes and then reshaped.
pix.height and pix.width give the height and width of the image array, respectively, and pix.n is the number of channels. These can be used to reshape the resulting array.
Your complete function would be:
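def pix_to_opencv_image(pix):
    # Interpret the raw sample bytes as unsigned 8-bit values
    data = np.frombuffer(pix.samples, dtype=np.uint8)
    # Reshape to (rows, columns, channels)
    img = data.reshape(pix.height, pix.width, pix.n)
    return img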
You can display the result using cv2.imshow().
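One caveat about the reshape above: pix.samples keeps the pixmap's own channel order (RGB for an RGB pixmap), whereas cv2.imread returns a 3-channel BGR array by default. The following is a minimal sketch of a converter that accounts for this, assuming the pixmap is grayscale or RGB, with or without alpha. The name pix_to_bgr and the channel handling are illustrative additions, not part of the original answer, and a CMYK pixmap would probably need converting to RGB first (for example with fitz.Pixmap(fitz.csRGB, pix)). The function only reads the samples, height, width, n and alpha attributes, so a stand-in object is used here instead of a real pixmap.

```python
import numpy as np
from types import SimpleNamespace


def pix_to_bgr(pix):
    """Return a 3-channel BGR uint8 array, the layout cv2.imread gives by default."""
    data = np.frombuffer(pix.samples, dtype=np.uint8)
    img = data.reshape(pix.height, pix.width, pix.n)
    colors = pix.n - (1 if pix.alpha else 0)  # number of non-alpha channels
    if colors == 1:
        # Grayscale: repeat the single channel three times.
        return np.repeat(img[:, :, :1], 3, axis=2)
    if colors == 3:
        # RGB(A): take the first three channels in reverse order, giving BGR.
        return img[:, :, 2::-1].copy()
    raise ValueError(f"unsupported pixmap with {pix.n} channels")


# Stand-in for a 1x2 RGB pixmap: one pure red pixel, one pure blue pixel.
fake_pix = SimpleNamespace(
    samples=bytes([255, 0, 0, 0, 0, 255]), height=1, width=2, n=3, alpha=0
)
bgr = pix_to_bgr(fake_pix)
print(bgr[0, 0].tolist(), bgr[0, 1].tolist())  # [0, 0, 255] [255, 0, 0]
```

The returned array is a copy, so it is writable and does not depend on the pixmap's buffer staying alive.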

Related

Keras conversion in Pytorch

I have the following code in keras:
# load all images in a directory into memory
def load_images(path, size=(256,512)):
    src_list, tar_list = list(), list()
    # enumerate filenames in directory, assume all are images
    for filename in listdir(path):
        # load and resize the image
        pixels = load_img(path + filename, target_size=size)
        # convert to numpy array
        pixels = img_to_array(pixels)
        # split into satellite and map
        sat_img, map_img = pixels[:, :256], pixels[:, 256:]
        src_list.append(sat_img)
        tar_list.append(map_img)
    return [asarray(src_list), asarray(tar_list)]
I would like to convert it to PyTorch, but I don't know much about it. Any suggestions?
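I don't think you have anything to change but the very last line. Note that torch.stack expects tensors rather than NumPy arrays, so the stacked arrays are converted with torch.from_numpy here (this needs import torch):
return [torch.from_numpy(asarray(src_list)), torch.from_numpy(asarray(tar_list))]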

Convert Python List Object to numpy array [duplicate]

I used to use scipy which would load an image from file straight into an ndarray.
from scipy import misc
img = misc.imread('./myimage.jpg')
type(img)
>>> numpy.ndarray
But now it gives me a DeprecationWarning, and the docs say it will be removed in 1.2.0 and that I should use imageio.imread instead. But:
import imageio
img = imageio.imread('./myimage.jpg')
type(img)
>>> imageio.core.util.Image
I could convert it by doing
img = numpy.array(img)
But this seems hacky. Is there any way to load an image straight into a numpy array as I was doing before with scipy's misc.imread (other than using OpenCV)?
The result of imageio.imread is already a NumPy array; imageio.core.util.Image is an ndarray subclass that exists primarily so the array can have a meta attribute holding image metadata.
If you want an object of type exactly numpy.ndarray, you can use asarray:
array = numpy.asarray(img)
Unlike numpy.array(img), this will not copy img's data.
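The difference between asarray and array can be checked directly. The sketch below uses a hypothetical MetaImage class as a stand-in for an ndarray subclass such as imageio's Image, so it runs with NumPy alone; the class name is an assumption made for illustration.

```python
import numpy as np


class MetaImage(np.ndarray):
    """Hypothetical ndarray subclass standing in for imageio's Image type."""


# View a plain array as the subclass, as an image loader might return it.
img = np.zeros((2, 3), dtype=np.uint8).view(MetaImage)

plain = np.asarray(img)   # base-class ndarray sharing img's memory
copied = np.array(img)    # base-class ndarray with its own copy of the data

print(type(plain) is np.ndarray)       # True
print(np.shares_memory(plain, img))    # True
print(np.shares_memory(copied, img))   # False
```

Because plain shares memory with img, writing to one is visible through the other; use numpy.array when an independent copy is wanted.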
If the file is a PNG, a bitmap, or even a JPEG, you can do:
import matplotlib.pyplot as plt
import numpy as np
# 'pip install pillow' but import PIL
from PIL import Image
png_filepath = 'somepng.png'
png_pil_img = Image.open(png_filepath)
# this will print info about the PIL object
print(png_pil_img.format, png_pil_img.size, png_pil_img.mode)
png_np_img = np.asarray(png_pil_img)
plt.imshow(png_np_img) # this will display it in a Jupyter notebook
# or if it's grayscale: plt.imshow(png_np_img, cmap='gray')
# FWIW, this will show the np characteristics
print("shape is ", png_np_img.shape)
print("dtype is ", png_np_img.dtype)
print("ndim is ", png_np_img.ndim)
print("itemsize is ", png_np_img.itemsize) # size in bytes of each array element
print("nbytes is ", png_np_img.nbytes) # total size in bytes of the array data
If you have a JPG, it works the same way. PIL.Image will decode the compressed JPEG, and np.asarray converts it to an array for you. You could perhaps load a raw bitmap with file I/O and skip the header yourself, but PIL is popular for a reason.
The output for a grayscale png will look like this:
PNG (3024, 4032) L
shape is (4032, 3024)
dtype is uint8
ndim is 2
itemsize is 1
nbytes is 12192768
The output for a color jpeg will look like this:
JPEG (704, 480) RGB
shape is (480, 704, 3)
dtype is uint8
ndim is 3
itemsize is 1
nbytes is 1013760
In either case, the pixel values are integers in the range 0-255, not floats. The color image has three channels corresponding to red, green, and blue. The grayscale image has a much higher resolution than the JPEG.
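The numbers in the two outputs above follow from the array shapes: PIL reports size as (width, height), NumPy reports shape as (height, width[, channels]), and nbytes is the number of elements times itemsize. A small sketch with zero-filled arrays of the same shapes (no image files are needed, and the variable names are illustrative):

```python
import numpy as np

# Arrays with the same shape and dtype as the two decoded images above.
gray = np.zeros((4032, 3024), dtype=np.uint8)       # PIL size (3024, 4032), mode L
color = np.zeros((480, 704, 3), dtype=np.uint8)     # PIL size (704, 480), mode RGB

for name, arr in (("gray", gray), ("color", color)):
    # nbytes equals element count times bytes per element
    assert arr.nbytes == arr.size * arr.itemsize
    print(name, arr.shape, arr.ndim, arr.itemsize, arr.nbytes)
```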

holoviews doesn't display PIL image format

I am trying to import the MNIST data set and just display it using Holoviews. When I run the following:
import holoviews as hv
from torchvision import datasets, transforms
hv.extension('bokeh')
mnist_images = datasets.MNIST('data', train=True, download=True)
image_list = []
for k, (image, label) in enumerate(mnist_images):
    if k >= 18:
        break
    image.show()
    bounds = (0,0,1,1)
    temp = hv.Image(image, bounds=bounds)
    image_list.append(temp)
layout = hv.Layout(image_list).cols(2)
layout
I get the following error at the line with 'temp = hv.Image(...)':
holoviews.core.data.interface.DataError: None of the available storage backends were able to support the supplied data format.
The 'image' variable is the following object: <PIL.Image.Image image mode=L size=28x28 at 0x7F7F28567910>
and image.show() renders the image correctly. Also if I use matplotlib's .imshow() I can get a correct render.
What I want is to see the image rendered in Holoviews and I expected the Holoviews.Image() would do that. Is that not a correct assumption? If it is, then what is wrong with the code/approach?
HoloViews works with numerical arrays rather than images, so hv.Image is for constructing an image out of a 2D array, not for showing things that are already images. But you can get numerical arrays out of PIL objects. For a color image, hv.RGB(np.array(image), bounds=bounds) displays it as an RGB image. For a grayscale image such as these MNIST digits (mode L), np.array(image) is already a 2D array of gray values, so it can be passed to hv.Image(np.array(image), bounds=bounds).

Converting .mat file extension image to .jpg via python

I'm currently trying to convert the images in .mat files, downloaded from this site (BrainTumorDataset), to .jpg files.
All the files in the directory are .mat files. I want to convert them all to .jpg format with Python for a project (Brain Tumor Classification using a Deep Neural Net) based on a CNN. I searched Google but only found topics on how to load a .mat file in Python, which didn't help. I also found an answer on StackOverflow, but it didn't work with this dataset, and it covers loading a .mat image in Python, whereas I want to convert .mat images to .jpg format.
I managed to convert one image; use a loop to convert them all.
Please read the comments.
import matplotlib.pyplot as plt
import numpy as np
import h5py
from PIL import Image
#reading v 7.3 mat file in python
#https://stackoverflow.com/questions/17316880/reading-v-7-3-mat-file-in-python
filepath = '1.mat'
f = h5py.File(filepath, 'r') #Open mat file for reading
#In MATLAB the data is arranged as follows:
#cjdata is a MATLAB struct
#cjdata.image is a matrix of type int16
#Before update: read only image data.
####################################################################
#Read cjdata struct, get image member and convert numpy ndarray of type float
#image = np.array(f['cjdata'].get('image')).astype(np.float64) #In MATLAB: image = cjdata.image
#f.close()
####################################################################
#Update: Read all elements of cjdata struct
####################################################################
#Read cjdata struct
cjdata = f['cjdata'] #<HDF5 group "/cjdata" (5 members)>
# In MATLAB cjdata =
# struct with fields:
# label: 1
# PID: '100360'
# image: [512×512 int16]
# tumorBorder: [38×1 double]
# tumorMask: [512×512 logical]
#get image member and convert to numpy ndarray of type float
image = np.array(cjdata.get('image')).astype(np.float64) #In MATLAB: image = cjdata.image
label = cjdata.get('label')[0,0] #Use [0,0] indexing in order to convert label to scalar
PID = cjdata.get('PID') # <HDF5 dataset "PID": shape (6, 1), type "<u2">
PID = ''.join(chr(c) for c in np.array(PID).flatten()) #Convert to string https://stackoverflow.com/questions/12036304/loading-hdf5-matlab-strings-into-python
tumorBorder = np.array(cjdata.get('tumorBorder'))[0] #Use [0] indexing - convert from 2D array to 1D array.
tumorMask = np.array(cjdata.get('tumorMask'))
f.close()
####################################################################
#Convert image to uint8 (before saving as jpeg - jpeg doesn't support int16 format).
#Use simple linear conversion: subtract minimum, and divide by range.
#Note: the conversion is not optimal - you should find a better way.
#Multiply by 255 to set values in uint8 range [0, 255], and convert to type uint8.
hi = np.max(image)
lo = np.min(image)
image = (((image - lo)/(hi-lo))*255).astype(np.uint8)
#Save as jpeg
#https://stackoverflow.com/questions/902761/saving-a-numpy-array-as-an-image
im = Image.fromarray(image)
im.save("1.jpg")
#Display image for testing
imgplot = plt.imshow(image)
plt.show()
Note:
Each mat file contains a struct named cjdata.
Fields of cjdata struct:
cjdata =
struct with fields:
label: 1
PID: '100360'
image: [512×512 int16]
tumorBorder: [38×1 double]
tumorMask: [512×512 logical]
When converting images to JPEG, you are losing information...
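The linear conversion used in the script (subtract the minimum, divide by the range, multiply by 255) can be wrapped in a small helper. This is a sketch rather than the answer's own code: the name to_uint8 is hypothetical, and the guard for a constant image is an added assumption to avoid dividing by zero.

```python
import numpy as np


def to_uint8(image):
    """Linearly rescale an array to the uint8 range [0, 255]."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: no range to stretch, so return all zeros.
        return np.zeros(image.shape, dtype=np.uint8)
    # astype truncates the fractional part, as in the original script.
    return ((image - lo) / (hi - lo) * 255).astype(np.uint8)


sample = np.array([[0, 50], [100, 200]], dtype=np.int16)
print(to_uint8(sample).tolist())  # [[0, 63], [127, 255]]
```

Going from int16 to 8 bits discards intensity resolution even before JPEG compression, which is one reason a lossless format such as PNG may be the safer choice for medical images.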
Here is how you can use a loop to convert all images.
from os import path
import os
from glob import glob
import numpy as np
import h5py
from PIL import Image
dir_path = path.dirname(path.abspath(__file__))
path_to_mat_files = path.join(dir_path, "*.mat")
found_files = glob(path_to_mat_files)
# Output folder for the converted images (created if it does not exist)
out_dir = path.join(dir_path, "png_images")
os.makedirs(out_dir, exist_ok=True)
total_files = 0
def convert_to_png(file: str, number: int):
    global total_files
    # Build the output name: same base name, .png extension, inside out_dir
    png = path.join(out_dir, path.basename(file)[:-3] + "png")
    if path.exists(png):
        print(png, "already exists\nSkipping...")
    else:
        h5_file = h5py.File(file, 'r')
        cjdata = h5_file['cjdata']
        image = np.array(cjdata.get('image')).astype(np.float64)
        label = cjdata.get('label')[0,0]
        PID = cjdata.get('PID')
        PID = ''.join(chr(c) for c in np.array(PID).flatten())
        tumorBorder = np.array(cjdata.get('tumorBorder'))[0]
        tumorMask = np.array(cjdata.get('tumorMask'))
        h5_file.close()
        # Linear rescale to the uint8 range before saving
        hi = np.max(image)
        lo = np.min(image)
        image = (((image - lo)/(hi-lo))*255).astype(np.uint8)
        im = Image.fromarray(image)
        im.save(png)
        total_files += 1
        print("saving", png, "File No: ", number)
for file in found_files:
    if "cvind.mat" in file:
        continue
    convert_to_png(file, total_files)
print("Finished converting all files: ", total_files)
Here is MATLAB code that can convert all images in a folder to a different format:
% Define the source and destination folders
src_folder = 'src';
dst_folder = 'dst';
% Get a list of all image files in the source folder
files = dir(fullfile(src_folder, '*.mat'));
% Loop through each file
for i = 1:length(files)
    % Load the .mat file
    load(fullfile(src_folder, files(i).name));
    % Convert the data to uint8
    example_matrix = im2uint8(example_matrix);
    % Construct the destination file name
    [~, name, ~] = fileparts(files(i).name);
    dst_file = fullfile(dst_folder, [name '.png']);
    % Try to save the image
    try
        imwrite(example_matrix, dst_file);
        disp(['Image ' name ' saved successfully']);
    catch
        disp(['Error saving image ' name]);
    end
end
In some cases it produces an error for example_matrix.
This error is resolved by the following code:
% Create a sample uint16 matrix
I = reshape(uint16(linspace(0,65535,25)),[5 5])
% Convert the data to uint8
example_matrix = im2uint8(I);
This code defines the source folder (src_folder) and the destination folder (dst_folder). Then, it uses the dir function to get a list of all .mat files in the source folder.
The code loops through each file, loads the .mat file, converts the data to uint8, and constructs the destination file name. Finally, it tries to save the image using the imwrite function. If the image is saved successfully, it displays a message indicating the image was saved successfully. If an error occurs, it displays an error message.
Note that you should replace "src" and "dst" with the actual names of your source and destination folders, respectively.

How to save a encoded image using Pickle

What I am doing here is encoding an image and then adding the encoding, together with the path of the original image, to a list in the database variable like this:
database.append([path, encoding])
I then want to save this database variable to a pickle file for use in other programs. How would I go about doing that? I have had no luck saving the files correctly so far.
Any help would be appreciated.
Here is the method that I am using to generate the variables I want to save
def embedDatabase(imagePath, model, metadata):
    '''
    Go through the database and get the embedding for each image.
    '''
    #Get the metadata
    #Perform embedding
    # calculated by feeding the aligned and scaled images into the pre-trained network.
    database = []
    embedded = np.zeros((metadata.shape[0], 128))
    print("Embedding")
    for i, m in enumerate(metadata):
        img = imgUtil.loadImage(m.image_path())
        _, img = imgUtil.alignImage(img)
        if img is not None:
            # scale RGB values to interval [0,1]
            img = (img / 255.).astype(np.float32)
            #Get the embedding vectors for the image
            embedded[i] = model.predict(np.expand_dims(img, axis=0))[0]
            database.append([m.image_path(), embedded[i]])
    #return the array of embedded images from the database
    return embedded, database
And this is the load image method
def loadImage(path):
    img = cv2.imread(path, 1)
    if img is not None:
        # OpenCV loads images with color channels
        # in BGR order. So we need to reverse them
        return img[...,::-1]
    else:
        # cv2.imread returns None when the file cannot be read
        print("There is no image available")
Figured it out.
import pickle
with open("database.pickle", "wb") as f:
    pickle.dump(database, f, pickle.HIGHEST_PROTOCOL)
For some reason I needed to pass pickle.HIGHEST_PROTOCOL.
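To use the saved database in another program, the file is read back with pickle.load, again in binary mode. A minimal round-trip sketch follows. The sample paths and numbers are made-up placeholders, plain lists stand in for the embedding arrays, and save_database and load_database are hypothetical helper names.

```python
import os
import pickle
import tempfile

# Placeholder data in the same [path, encoding] layout as the question.
database = [
    ["images/a.jpg", [0.1, 0.2, 0.3]],
    ["images/b.jpg", [0.4, 0.5, 0.6]],
]


def save_database(db, filename):
    # Pickle files must be opened in binary mode ("wb").
    with open(filename, "wb") as f:
        pickle.dump(db, f, pickle.HIGHEST_PROTOCOL)


def load_database(filename):
    # Read back in binary mode ("rb").
    with open(filename, "rb") as f:
        return pickle.load(f)


# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "database.pickle")
    save_database(database, target)
    restored = load_database(target)

print(restored == database)  # True
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.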
