Converting .TIF to .PDF gives PIL: Error reading image - python-3.x

I've been trying to batch process some .TIF files and convert them to PDFs. I did have it working, but then after trying to change img2pdf so it would accept larger files I was never able to get the same program running again, even after re-installing.
Currently this is throwing out the following error:
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A608255EB8>
Here is the code I've been using. Anyone got any suggestions? Thanks in advance.
import os
import img2pdf

image_directory = r"PATH"
image_files = []
# Walk the directory tree and collect every TIFF file.
for root, dirs, files in os.walk(image_directory):
    for file in files:
        if file.endswith(".tif") or file.endswith(".TIF"):
            print("Discovered this TIF: ", os.path.join(root, file))
            image_files.append(os.path.join(root, file))

# Convert each TIFF to a PDF with the same base name.
for image in image_files:
    output_file = image[:-4] + ".pdf"
    print("Putting all TIFs into ", output_file)
    pdf_bytes = img2pdf.convert(image)
    # The with block closes the PDF file once it has been written.
    with open(output_file, "wb") as pdf_file:
        pdf_file.write(pdf_bytes)
Here is the full traceback
Traceback (most recent call last):
File "<ipython-input-37-fe96d5eeb049>", line 1, in <module>
runfile('PATH', wdir='PATH')
File "PATH", line 704, in runfile
execfile(filename, namespace)
File "PATH", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "PATH", line 23, in <module>
pdf_bytes = img2pdf.convert(image_files)
File "PATH", line 1829, in convert
) in read_images(rawdata, kwargs["colorspace"], kwargs["first_frame_only"]):
File "PATH", line 1171, in read_images
"PIL: error reading image: %s" % e
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A6082BE3B8>

If, as I understand it, you want to recursively find all TIFF images and convert each one to a correspondingly named PDF file, you can do that simply and in parallel with GNU Parallel and ImageMagick like this in Terminal:
find . -iname "*tif" -print0 | parallel -0 --dry-run convert {} {.}.pdf
Sample Output
convert ./OpenCVTIFF64/result.tif ./OpenCVTIFF64/result.pdf
convert ./OpenCVTIFF64/a.tif ./OpenCVTIFF64/a.pdf
convert ./OpenCVBasics/a.tif ./OpenCVBasics/a.pdf
convert ./CImgDump/image.tif ./CImgDump/image.pdf
That command says: "Starting in the current directory, recursively find all TIFF files, whether upper or lowercase or some mixture, and pass their names, null-terminated, to GNU Parallel. It should then read each name and run ImageMagick convert to turn that TIFF into a file with the same name but the extension replaced with pdf." Note that convert is used rather than mogrify, because mogrify modifies files in place and does not take an output filename.
If it does what you want, remove the --dry-run and do it again for real.

This ended up working once I ran pip install 'Pillow>=6.0.0' --force-reinstall, even though the command itself did not complete cleanly. I still get a few warnings when I run the script, but the conversion now works. In short, it was an issue with the Pillow installation.
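Pillow's "cannot identify image file" message can come from a broken Pillow install, as it did here, or from a file that is not really a TIFF. One way to rule out the second case is to check the file signature with the standard library alone. This is only a sketch: looks_like_tiff is a hypothetical helper name, and the demo files are throwaway data, not real images.

```python
import os
import tempfile

# Signatures for classic TIFF and BigTIFF, little-endian ("II") and big-endian ("MM").
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*", b"II+\x00", b"MM\x00+")


def looks_like_tiff(path):
    """Return True if the first four bytes of the file match a TIFF signature."""
    with open(path, "rb") as fh:
        return fh.read(4) in TIFF_SIGNATURES


# Demo with two throwaway files: one starts with a TIFF signature, one does not.
demo_dir = tempfile.mkdtemp()
good_path = os.path.join(demo_dir, "good.tif")
bad_path = os.path.join(demo_dir, "bad.tif")
with open(good_path, "wb") as fh:
    fh.write(b"II*\x00" + b"\x00" * 16)
with open(bad_path, "wb") as fh:
    fh.write(b"this is not image data")

good_result = looks_like_tiff(good_path)
bad_result = looks_like_tiff(bad_path)
print(good_result, bad_result)
```

If every file passes this check and the error persists, the Pillow installation is the more likely suspect.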

Related

Can't load txt file with numpy in VSCode

import numpy as np
dat=np.loadtxt('sample.txt',skiprows=0,dtype=float)
print(dat.shape)
I'm trying to load a txt file with numpy in VS Code, but it displays the following error:
dat=np.loadtxt('sample.txt',skiprows=0,dtype=float)
File "C:\Users\asadb\anaconda3\lib\site-packages\numpy\lib\npyio.py", line 961, in loadtxt
fh = np.lib._datasource.open(fname, 'rt', encoding=encoding)
File "C:\Users\asadb\anaconda3\lib\site-packages\numpy\lib\_datasource.py", line 195, in open
return ds.open(path, mode, encoding=encoding, newline=newline)
File "C:\Users\asadb\anaconda3\lib\site-packages\numpy\lib\_datasource.py", line 535, in open
raise IOError("%s not found." % path)
OSError: sample.txt not found.
The two files, the .py file and the .txt file, are in the same folder.
I saved the .txt file to a different directory and then it worked! It seems that the .py file you want to run and the .txt file shouldn't be together in the same directory.
Assuming the answer you gave yourself is not the real answer:
VS Code offers several ways of opening a file. If you open a single file directly, VS Code does not necessarily take that file's directory with it. A relative path such as 'sample.txt' is then not looked up in the directory of your .py file (wherever that may be) but in the current working directory of the Python process, which may be a different folder. I recommend going to File --> Open Folder in VS Code and opening the folder where your .py file is located. This should fix your issue.
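Another option, independent of which folder the editor has open, is to build the path from the script's own location instead of relying on the working directory. The following is a minimal sketch; path_next_to is a hypothetical helper name, and the script path in the demo is made up so the snippet runs anywhere.

```python
from pathlib import Path


def path_next_to(script_file, name):
    """Return the absolute path of `name` in the same directory as `script_file`."""
    return Path(script_file).resolve().parent / name


# In a real script you would pass __file__, for example:
#     dat = np.loadtxt(path_next_to(__file__, "sample.txt"))
# A made-up script path is used here so the snippet runs anywhere.
example = path_next_to("project/load_data.py", "sample.txt")
print("Current working directory:", Path.cwd())
print("Path that would be loaded:", example)
```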

FileNotFoundError When Attempting to Open a File in the Same Directory

The txt file is saved in the exact same folder as my code, but when I run it I get the traceback below. I right-clicked and saved the file directly to that folder, but the error still appears when I run the code in Visual Studio. I am very new to coding, so sorry for the basic question.
file = open('regex_sum_1114202.txt', 'r')
Traceback:
Traceback (most recent call last):
File "c:\Users\EM2750\Desktop\py4e\ex_11\ex_11.py", line 2, in <module>
file = open('regex_sum_1114202.txt', 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'regex_sum_1114202.txt'
Try file = open('./regex_sum_1114202.txt', 'r') instead.
This explicitly tells Python to look for the file in the current directory by giving a relative path. Think of the dot as shorthand for the current working directory. So if the current working directory is the directory where the script and the file are, that should work.
Use forward slashes (/) instead of backslashes (\) in paths. Backslashes are the default directory separator on Windows, but in an ordinary Python string they cause problems because they can be interpreted as escape sequences. Alternatively, you can use a doubled backslash as the directory separator: \\.
You can also try to specify the full path before the filename: file = open('c:/Users/EM2750/Desktop/py4e/ex_11/regex_sum_1114202.txt', 'r'). The downside, of course, is that the path will no longer be correct if you move the file.
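To see why single backslashes are risky, here is a small illustration. The path is hypothetical and chosen so that its folder names start with "n" and "t". The raw-string form (the r prefix) is not mentioned above but is standard Python.

```python
# A hypothetical Windows path whose folder names happen to start with "n" and "t".
plain = "c:\new\table.txt"        # "\n" and "\t" are turned into newline and tab
doubled = "c:\\new\\table.txt"    # each "\\" is one literal backslash
raw = r"c:\new\table.txt"         # raw string: backslashes are kept as written
forward = "c:/new/table.txt"      # forward slashes need no escaping

print(repr(plain))                # shows the newline and tab hidden in the path
print(doubled == raw)             # the doubled and raw spellings are the same string
print(len(plain), len(raw))       # the plain version is shorter: two escapes collapsed
```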

Indexing audio to get timestamps for each word using python

I have an audio file and I want to get the timestamps for each word. That is, I want to know during which time period each of the words was spoken.
(For example if an audio file says "I am a doctor" I want to know during which instant "I" was said, "am" was said and so on)
I want to do this using python.
I have tried the following code.
from SimpleAudioIndexer import SimpleAudioIndexer as sai

indexer = sai(mode="ibm", src_dir="D:/Codes/Python/audio recognition",
              username_ibm="", password_ibm="")
indexer.index_audio(basename="target.wav")
indexer.save_indexed_audio("{}/indexed_audio".format(indexer.src_dir))
indexer.load_indexed_audio("{}/indexed_audio.txt".format(indexer.src_dir))
print(indexer.get_timestamps())
However, I am running into the following error.
Traceback (most recent call last):
  File "D:\Codes\Python\audio recognition\rec.py", line 5, in <module>
    indexer.index_audio(basename = "target.wav")
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 1108, in index_audio
    self._index_audio_ibm(*args, **kwargs)
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 928, in _index_audio_ibm
    replace_already_indexed=replace_already_indexed)
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 730, in _prepare_audio
    self._filtering_step(basename)
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 638, in _filtering_step
    universal_newlines=True).communicate()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
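From what I can tell from your code and the error, you're running Windows but using forward slashes "/" in your src_dir string. Windows uses the backslash "\" as its native separator when navigating folders, such as:
C:\Windows
while many Unix-based systems use the forward slash like this:
/home/Awais
Python itself usually accepts forward slashes on Windows, but external programs that receive the path may not, so in your code you should try changing the "/" to "\" and see if that makes a difference. If you do, write the path as a raw string (r"D:\Codes\Python\audio recognition") or double each backslash, because sequences such as "\a" are otherwise treated as escape characters. Note also that the traceback ends inside subprocess, where a FileNotFoundError can mean that the external program being launched was not found, rather than your audio file.
If you intend to use this program on both Windows and Unix systems, you should use Python 3's "pathlib" to ensure your directory paths work on either OS. Here is a link that you can use to learn more about handling paths in Python 3: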
https://medium.com/#ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
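As a rough illustration of what pathlib does with separators, the sketch below uses the "pure" path classes, which only manipulate strings and therefore behave the same on any OS. In real code you would normally use pathlib.Path, which selects the right flavour for the system it runs on. The directory names are taken from the question and the answer above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths never touch the file system, so this runs on any OS.
win_dir = PureWindowsPath("D:/Codes/Python/audio recognition")
win_file = win_dir / "target.wav"                  # "/" joins path components
posix_file = PurePosixPath("/home/Awais") / "target.wav"

print(win_file)                                    # rendered with backslashes
print(posix_file)                                  # rendered with forward slashes
print(win_file.with_suffix(".txt").name)           # swap the extension
```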

FileNotFoundError even though file exist

When trying to open a file using with open(...), I get an error that the file doesn't exist.
I am trying to parse some txt files. When working locally it works with no issue, but the problem started when I tried to read from a network folder. The strange thing is that it does see the file, but says it's not found.
The path I am referring to is '//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs/*' (this folder is full of txt files),
but I am still getting this error:
FileNotFoundError: [Errno 2] No such file or directory: 'Console_log_01-01-2019_08-17-56.txt'
As you can see, it does see the needed file.
In order to get to this file I am splitting the path in the following way:
readFile = name.split("/")[9].split("\\")[1]
because when I look at the list of my files I see them in the following way:
['//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs\Console_log_01-01-2019_08-17-56.txt',
After splitting I get:
Console_log_01-01-2019_08-17-56.txt
and still it says the file is not found.
import glob

def main():
    lines = 0
    path = '//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs/*'
    files = glob.glob(path)
    print("files")
    print('\n')
    print(files)
    for name in glob.glob(path):
        print(path)
        readFile = name.split("/")[9].split("\\")[1]
        print(readFile)
        with open(readFile, "r") as file:
            lines = file.readlines()
        print(lines)

main()
files
['//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs\\Console_log_01-01-2019_08-17-56.txt', '//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs\\Console_log_01-01-2019_08-18-29.txt']
Traceback (most recent call last):
//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs/*
Console_log_01-01-2019_08-17-56.txt
File "C:/Users/markp/.PyCharmEdu2018.3/config/scratches/scratch_3.py", line 19, in <module>
main()
File "C:/Users/markp/.PyCharmEdu2018.3/config/scratches/scratch_3.py", line 16, in main
with open(readFile,"r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'Console_log_01-01-2019_08-17-56.txt'
Process finished with exit code 1
When you search for the file you use the entire path, but when you open the file you reference only its name, as if it were in the current working directory. Either change the current working directory with
os.chdir(path)
before opening the file, or in the open statement use
open(os.path.join(path,filename))
In both cases path has to be the folder itself, without the trailing /* that is used for glob. I recommend the first approach if you have to open only one file in your program and the second if you open multiple files in multiple directories.
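Since glob.glob already returns each match with its directory included, another option is to pass that result straight to open and use os.path.basename only when the bare file name is needed. Below is a minimal sketch of that idea. It uses a temporary folder and made-up file names in place of the network share so that it runs anywhere.

```python
import glob
import os
import tempfile

# Build a throwaway folder with two log files so the sketch runs anywhere.
folder = tempfile.mkdtemp()
for log_name in ("Console_log_a.txt", "Console_log_b.txt"):
    with open(os.path.join(folder, log_name), "w") as fh:
        fh.write("line 1\nline 2\n")

line_counts = {}
for full_path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
    file_name = os.path.basename(full_path)   # bare name, for display only
    with open(full_path, "r") as fh:          # open using the full path
        line_counts[file_name] = len(fh.readlines())

print(line_counts)
```

os.path.basename also avoids the fixed index in name.split("/")[9], which breaks as soon as the folder depth changes.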
In future, please format your questions better. Stack Overflow has multiple formatting tools, so use them. You can also preview how your text looks, so make sure to have a look at it before posting. Use code formatting for your code; that will help whoever is trying to answer.

python error while reading large files from a folder to copy to another file

I'm trying to read the files in a folder and copy a specific part of each file to a new file using the Python code below, but I'm getting the error shown further down.
import glob
file=glob.glob("C:/Users/prasanth/Desktop/project/prgms/rank_free1/*.txt")
fp=[]
for b in file:
    fp.append(open(b,'r'))
s1=''
for f in fp:
    d=f.read().split('\t')
    rank=d[0]
    appname=d[1]
    appid=d[2]
    s1=appid+'\n'
    file=open('C:/Users/prasanth/Desktop/project/prgms/appids_file.txt','a',encoding="utf-8")
    file.write(s1)
    file.close()
I'm getting the following error message:
Traceback (most recent call last):
  File "appids.py", line 8, in <module>
    d=f.read().split('\t')
  File "C:\Users\prasanth\AppData\Local\Programs\Python\Python36-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 12307: character maps to <undefined>
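From what I can see, one of the files you are opening contains bytes that cannot be decoded with the encoding Python is using. The traceback shows the files being read with cp1252, the default here, and byte 0x8f has no character in that code page, so the file can't be read into a string variable without appropriate information about its encoding.
If you know the encoding of the files (UTF-8, for example), the simplest fix is to pass it when opening them for reading, as you already do for the output file. Otherwise, open the file for reading in binary mode and take care of the problem in your script.
You may put d=f.read().split('\t') in a try: except: construct and reopen the file in binary mode in the except: branch. Then handle the undecodable bytes it contains in your script.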
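The try/except idea could look roughly like the sketch below. It assumes the files are meant to be UTF-8, which the question does not confirm. read_text is a hypothetical helper name, and the sample record is made up. Undecodable bytes are replaced with the U+FFFD marker instead of stopping the script.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a file as text; fall back to a lossy decode if some bytes are invalid."""
    try:
        with open(path, "r", encoding=encoding) as fh:
            return fh.read()
    except UnicodeDecodeError:
        # Reopen in binary mode and replace undecodable bytes with U+FFFD.
        with open(path, "rb") as fh:
            return fh.read().decode(encoding, errors="replace")


# Demo: a made-up tab-separated record followed by a stray 0x8f byte.
demo_path = os.path.join(tempfile.mkdtemp(), "rank.txt")
with open(demo_path, "wb") as fh:
    fh.write(b"1\tExample App\tcom.example.app\x8f\n")

fields = read_text(demo_path).split("\t")
print(fields[2])
```

Whether replacing bad bytes is acceptable depends on the data. If the app IDs themselves may contain such bytes, it is safer to find out the real encoding of the files.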

Resources