How to retrieve original folder contents from Appium pull_folder output? - base64

Based on the Appium docs at http://appium.io/docs/en/commands/device/files/pull-folder/, a folder can be pulled in the following way:
folder_base64 = self.driver.pull_folder('/path/to/device/foo.bar')
As per the docs, the response folder_base64 is: "A string of Base64 encoded data, representing a zip archive of the contents of the requested folder."
So, based on my understanding of the above, I tried the following (A to D), none of which worked.
A)
base64 decoding of the folder_base64
Unzipping the decoded output
decoded_base64 = base64.b64decode(folder_base64)
folder_base64 = ZipFile.extractall(decoded_base64)
This fails with the following error:
zipfile.py", line 1493, in extractall
AttributeError: 'bytes' object has no attribute 'namelist'
B)
base64 decoding of the folder_base64
zipping the decoded output
unzipping
decoded_base64 = base64.b64decode(folder_base64)
zipfile.ZipFile('out.zip', mode='w').write(decoded_base64)
This fails with the following error at step 2:
zipfile.py", line 484, in from_file
st = os.stat(filename)
ValueError: stat: embedded null character in path
C)
unzipping the folder_base64
base64 decoding of the output
unzipped_base64 = ZipFile.extractall(folder_base64)
decoded_base64 = base64.b64decode(unzipped_base64)
This fails at step 1 with the following error:
zipfile.py", line 1493, in extractall
AttributeError: 'str' object has no attribute 'namelist'
D)
base64 decoding of the folder_base64
read the file as zip file
extract the files
decoded_base64 = base64.b64decode(folder_base64)
zip_folder = zipfile.ZipFile(decoded_base64, 'r')
ZipFile.extractall(zip_folder, "./mp3_files")
This fails with the following error:
zipfile.py", line 241, in _EndRecData
fpin.seek(0, 2)
AttributeError: 'bytes' object has no attribute 'seek'
E)
Finally, the following worked, but I am wondering why it had to be routed via a temp file to make it work. Also, is there a better or more direct way to handle Appium pull_folder output?
decoded_base64 = base64.b64decode(folder_base64)
with SpooledTemporaryFile() as tmp:
    tmp.write(decoded_base64)
    archive = ZipFile(tmp, 'r')
    ZipFile.extractall(archive, "./mp3_files")
Note: the following Python imports are used in the above code snippets:
import base64
import zipfile
from zipfile import ZipFile
from tempfile import SpooledTemporaryFile
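A possibly more direct route, sketched under the assumption that folder_base64 is the Base64 string returned by pull_folder: ZipFile expects a path or a seekable file-like object rather than raw bytes, which is consistent with the 'namelist' and 'seek' errors in attempts A and D. Wrapping the decoded bytes in io.BytesIO provides such an object in memory. The helper name extract_pulled_folder and the stand-in payload are illustrative only and are not part of Appium.

```python
import base64
import io
import os
import tempfile
import zipfile


def extract_pulled_folder(folder_base64, dest_dir):
    """Decode a Base64-encoded zip archive and extract it into dest_dir."""
    decoded = base64.b64decode(folder_base64)
    # ZipFile accepts a path or a seekable file-like object, so wrap the bytes.
    with zipfile.ZipFile(io.BytesIO(decoded)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Stand-in for the driver response (assumption): a tiny zip, Base64-encoded.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, mode="w") as zf:
    zf.writestr("song.txt", "hello")
fake_folder_base64 = base64.b64encode(_buf.getvalue()).decode("ascii")

out_dir = tempfile.mkdtemp()
names = extract_pulled_folder(fake_folder_base64, out_dir)
```

With a real driver, the call would presumably look like extract_pulled_folder(self.driver.pull_folder('/path/to/device/foo.bar'), "./mp3_files").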

Related

Python image save error - raise ValueError("unknown file extension: {}".format(ext)) from e ValueError: unknown file extension:

I have just four weeks of experience in Python. I am creating a tool using Tkinter to paste a new company logo on top of existing images.
The method below gets all images in the given directory and pastes the new logo at its initial position. The existing image, the edited image, the x-position, the y-position, a preview of the image and a few other values are stored in the instance attribute self.images_all_arr.
def get_img_copy(self):
    self.images_all_arr = []
    existing_img_fldr = self.input_frame.input_frame_data['existing_img_folder']
    for file in os.listdir(existing_img_fldr):
        img_old = Image.open(os.path.join(existing_img_fldr, file))
        img_new_copy = img_old.copy()
        self.pasteImage(img_new_copy, initpaste=True) #process to paste new logo.
        view_new_img = ImageTk.PhotoImage(img_new_copy)
        fname, fext = file.split('.')
        formObj = {
            "fname": fname,
            "fext": fext,
            "img_old": img_old,
            "img_new": img_new_copy,
            "img_new_view": view_new_img,
            "add_logo": 1,
            "is_default": 1,
            "is_opacityImg": 0,
            "pos_x": self.defult_x.get(),
            "pos_y": self.defult_y.get()
        }
        self.images_all_arr.append(formObj)
After previewing each image on the Tkinter screen, I adjust the x and y positions as needed (updating pos_x and pos_y in the list self.images_all_arr).
Once all that is done, the edited images need to be saved. The method below saves them by iterating over the list self.images_all_arr and calling img['img_new'].save(dir_output), since img['img_new'] holds the updated image.
def generate_imgae(self):
    if len(self.images_all_arr):
        dir_output = 'xxxxx'
        for img in self.images_all_arr:
            print(img['img_new'])
            img['img_new'].save(dir_output)
        print('completed..')
But it raises the error below:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Program Files (x86)\Python38-32\lib\site-packages\PIL\Image.py", line 2138, in save
format = EXTENSION[ext]
KeyError: ''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python38-32\lib\tkinter\__init__.py", line 1883, in __call__
return self.func(*args)
File "C:\Users\662828\WORKSPACE\AAA_python_projects\AMI_automation_poc2\position_and_images.py", line 241, in generate_imgae
img['img_new'].save(dir_output)
File "C:\Program Files (x86)\Python38-32\lib\site-packages\PIL\Image.py", line 2140, in save
raise ValueError("unknown file extension: {}".format(ext)) from e
ValueError: unknown file extension:
dir_output doesn't contain a file extension; it's just xxxxx. You need to specify which image file format you want. The error tells us this by saying "unknown file extension".
Basically, you either need to include the extension in the file name, or pass the format as the next parameter to image.save. See the Pillow documentation for Image.save for details.
eg.
image.save('image.png')
or
image.save('image', 'png')
The code below solved my issue by passing the exact directory, filename and extension of the image to image.save(). Here the resulting opfile is C:\Users\WORKSPACE\AAA_python_projects\test\valid.png.
def generate_imgae(self):
    if len(self.images_all_arr):
        dir_output = r"C:\Users\WORKSPACE\AAA_python_projects\test"
        if not os.path.isdir(dir_output):
            os.mkdir(dir_output)
        for img in self.images_all_arr:
            opfile = os.path.join(dir_output, '{}.{}'.format(img['fname'], img['fext'] ))
            img['img_new'].save(opfile)
        print('completed..')
Thanks #dantechguy
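As a small illustration of the point above (the traceback shows Pillow looking up the format via EXTENSION[ext], so the save path needs an extension), here is a standard-library-only sketch of building such a path. The helper name build_output_path and the ".png" fallback are assumptions for illustration; os.path.splitext is used instead of file.split('.') because the latter breaks on names containing more than one dot.

```python
import os


def build_output_path(dir_output, source_name, default_ext=".png"):
    """Return dir_output/<stem><ext>, keeping the source file's extension."""
    # splitext copes with names like "logo.v2.jpg", unlike str.split('.').
    stem, ext = os.path.splitext(source_name)
    if not ext:
        # Without an extension the save call cannot infer a format,
        # so fall back to an assumed default.
        ext = default_ext
    return os.path.join(dir_output, stem + ext)


path_a = build_output_path("out", "valid.png")
path_b = build_output_path("out", "logo.v2.jpg")
path_c = build_output_path("out", "noext")
```

The returned path could then be passed to img['img_new'].save(...) in place of the bare directory name.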

Python 3.7.3 inserting into bytearray = "object cannot be re-sized"

I'm working with a bytearray built from file data. I'm opening the file as 'r+b', so I can change it as binary.
In the Python 3.7 docs, it explains that a RegEx's finditer() can use m.start() and m.end() to identify the start and end of a match.
In the question Insert bytearray into bytearray Python, the answer says an insert can be made to a bytearray by using slicing. But when this is attempted, the following error is given: BufferError: Existing exports of data: object cannot be re-sized.
Here is an example:
pat = re.compile(rb'0.?\d* [nN]') # regex, binary "0[.*] n"
with open(file, mode='r+b') as f: # updateable, binary
    d = bytearray(f.read()) # read file data as d [as bytes]
    it = pat.finditer(d) # find pattern in data as iterable
    for match in it: # for each match,
        m = match.group() # bytes of the match string to binary m
        ...
        val = b'0123456789 n'
        ...
        d[match.start():match.end()] = bytearray(val)
In the file, the match is 0 n and I'm attempting to replace it with 0123456789 n so would be inserting 9 bytes. The file can be changed successfully with this code, just not increased in size. What am I doing wrong? Here is output showing all non-increasing-filesize operations working, but it failing on inserting digits:
*** Changing b'0.0032 n' to b'0.0640 n'
len(d): 10435, match.start(): 607, match.end(): 615, len(bytearray(val)): 8
*** Found: "0.0126 n"; set to [0.252] or custom:
*** Changing b'0.0126 n' to b'0.2520 n'
len(d): 10435, match.start(): 758, match.end(): 766, len(bytearray(val)): 8
*** Found: "0 n"; set to [0.1] or custom:
*** Changing b'0 n' to b'0.1 n'
len(d): 10435, match.start(): 806, match.end(): 809, len(bytearray(val)): 5
Traceback (most recent call last):
File "fixV1.py", line 190, in <module>
main(sys.argv)
File "fixV1.py", line 136, in main
nchanges += search(midfile) # perform search, returning count
File "fixV1.py", line 71, in search
d[match.start():match.end()] = bytearray(val)
BufferError: Existing exports of data: object cannot be re-sized
This is a simple case, much like modifying an iterable during iteration:
it = pat.finditer(d) creates a buffer from the bytearray object. This in turn "locks" the bytearray object from being changed in size.
d[match.start():match.end()] = bytearray(val) attempts to modify the size of the "locked" bytearray object.
Just as changing a list's size while iterating over it leads to trouble, an attempt to change a bytearray's size while iterating over its buffer will fail.
You can give a copy of the object to finditer().
For more information about buffers and how Python works under the hood, see the Python docs.
Also, do keep in mind that you're not actually modifying the file. You'll need to either write the data back to the file, or use memory-mapped files. I suggest the latter if you're looking for efficiency.
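A minimal sketch of the "give finditer() a copy" suggestion, using made-up sample data rather than the asker's file. Two details are assumptions beyond the answer: the matches are collected into a list first, and the replacements are applied from the end backwards so that offsets computed on the snapshot remain valid as d grows.

```python
import re

pat = re.compile(rb"0.?\d* [nN]")
d = bytearray(b"a 0.0032 n b 0 n c")  # sample data, not from the question

# Search an immutable snapshot so no buffer export is held on d
# while d is being resized.
matches = list(pat.finditer(bytes(d)))

# Apply replacements last-to-first: resizing a later span does not
# shift the start/end offsets of the earlier ones.
for match in reversed(matches):
    d[match.start():match.end()] = b"0123456789 n"
```

Replacing front-to-back would also avoid the BufferError, but each size change would shift every later offset, so the slices would land in the wrong place.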

pytest fails when using io.BytesIO stream instead of PDF file

I'm running pytest to check a function that uses pdfminer to convert PDF to text. The function works when doing $ python function.py and the result is what I expect it to be. I should also point out that I'm using a stream when parsing the file (io.BytesIO) and this stream is the reason my test fails.
Running pytest the function fails with a PDFSyntaxError.
# function.py
...
from pdfminer.pdfparser import PDFParser
from pdfminer.document import PDFDocument
req = requests.get(url_pointing_to_pdf_file)
pdf = io.BytesIO(req.content)
parser = PDFParser(pdf)
document = PDFDocument(parser, password=None) # this fails
...
pytest calls the __init__ method in pdfdocument.py (part of the pdfminer library) and stops here:
for xref in xrefs:
    trailer = xref.get_trailer()
    ...
    if 'Root' in trailer:
        self.catalog = dict_value(trailer['Root'])
        break
else:
    raise PDFSyntaxError('No /Root object! - Is this really a PDF?')
...
And this is what pytest shows when testing the function fails:
tests/test_function.py:11:
----------------------------------------------------
.../function.py:157: in function
document = PDFDocument(parser, password=None)
...
E pdfminer.pdfparser.PDFSyntaxError: No /Root object! - Is this really a PDF?
lib/python3.6/site-packages/pdfminer/pdfdocument.py:583:PDFSyntaxError
Running the test with a PDF file stored in the same directory as function.py is successful, so the culprit is the io.BytesIO format of the downloaded PDF file. Since I want to use a stream with function.py I would like to know if there is a better way to do this.

Adding timestamp to a file in PYTHON

I am able to rename a file without any problem or error using os.rename().
But the moment I try to rename a file with a timestamp added to its name, it throws a WinError 3 or WinError 123. I have tried all combinations with no luck. Could anyone help?
Code that ran successfully:
#!/usr/bin/python
import datetime
import os
import shutil
import json
import re
maindir = "F:/Protocols/"
os.chdir(maindir)
maindir = os.getcwd()
print("Working Directory : "+maindir)
path_4_all_iter = os.path.abspath("all_iteration.txt")
now = datetime.datetime.now()
timestamp = str(now.strftime("%Y%m%d_%H:%M:%S"))
print(type(timestamp))
archive_name = "all_iteration_"+timestamp+".txt"
print(archive_name)
print(os.getcwd())
if os.path.exists("all_iteration.txt"):
    print("File Exists")
    os.rename(path_4_all_iter, "F:/Protocols/archive/archive.txt")
    print(os.listdir("F:/Protocols/archive/"))
    print(os.path.abspath("all_iteration.txt"))
Log :
E:\python.exe C:/Users/SPAR/PycharmProjects/Sample/debug.py
Working Directory : F:\Protocols
<class 'str'>
all_iteration_20180409_20:25:51.txt
F:\Protocols
File Exists
['archive.txt']
F:\Protocols\all_iteration.txt
Process finished with exit code 0
Error Code :
print(os.getcwd())
if os.path.exists("all_iteration.txt"):
    print("File Exists")
    os.rename(path_4_all_iter, "F:/Protocols/archive/"+archive_name)
    print(os.listdir("F:/Protocols/archive/"))
    print(os.path.abspath("all_iteration.txt"))
Error LOG:
E:\python.exe C:/Users/SPAR/PycharmProjects/Sample/debug.py
Traceback (most recent call last):
Working Directory : F:\Protocols
<class 'str'>
File "C:/Users/SPAR/PycharmProjects/Sample/debug.py", line 22, in <module>
all_iteration_20180409_20:31:16.txt
F:\Protocols
os.rename(path_4_all_iter, "F:/Protocols/archive/"+archive_name)
File Exists
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'F:\\Protocols\\all_iteration.txt' -> 'F:/Protocols/archive/all_iteration_20180409_20:31:16.txt'
Process finished with exit code 1
Your timestamp format has colons in it, which are not allowed in Windows filenames. See this answer on that subject:
How to get a file in Windows with a colon in the filename?
If you change your timestamp format to something like:
timestamp = str(now.strftime("%Y%m%d_%H-%M-%S"))
it should work.
You can't have : characters as part of the filename, so change
timestamp = str(now.strftime("%Y%m%d_%H:%M:%S"))
to
timestamp = str(now.strftime("%Y%m%d_%H%M%S"))
and you'll be able to rename your file.
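Both suggestions above can be sketched as a small helper that builds the archive name from a timestamp free of characters Windows rejects in file names. The function name archive_name and the fixed datetime are illustrative assumptions; in the original script the value would come from datetime.datetime.now().

```python
import datetime

# Characters that Windows does not allow in file names.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def archive_name(prefix, now, suffix=".txt"):
    """Build a file name whose timestamp uses '-' instead of ':'."""
    timestamp = now.strftime("%Y%m%d_%H-%M-%S")
    return prefix + "_" + timestamp + suffix


# Fixed moment taken from the error log so the result is reproducible.
fixed = datetime.datetime(2018, 4, 9, 20, 31, 16)
name = archive_name("all_iteration", fixed)
```

The resulting name could then be appended to "F:/Protocols/archive/" and passed to os.rename as in the question.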

load .npy file from google cloud storage with tensorflow

I'm trying to load .npy files from my Google Cloud Storage bucket into my model. I followed the example in "Load numpy array in google-cloud-ml job",
but I get this error:
'utf-8' codec can't decode byte 0x93 in
position 0: invalid start byte
Can you help me, please?
Here is a sample from the code.
Here I read the file:
with file_io.FileIO(metadata_filename, 'r') as f:
    self._metadata = [line.strip().split('|') for line in f]
and here I start processing it:
if self._offset >= len(self._metadata):
    self._offset = 0
    random.shuffle(self._metadata)
meta = self._metadata[self._offset]
self._offset += 1
text = meta[3]
if self._cmudict and random.random() < _p_cmudict:
    text = ' '.join([self._maybe_get_arpabet(word) for word in text.split(' ')])
input_data = np.asarray(text_to_sequence(text, self._cleaner_names), dtype=np.int32)
f = StringIO(file_io.read_file_to_string(
    os.path.join('gs://path',meta[0])))
linear_target = tf.Variable(initial_value=np.load(f), name='linear_target')
s = StringIO(file_io.read_file_to_string(
    os.path.join('gs://path',meta[1])))
mel_target = tf.Variable(initial_value=np.load(s), name='mel_target')
return (input_data, mel_target, linear_target, len(linear_target))
and this is a sample from the data sample
This is likely because your file doesn't contain UTF-8 encoded text.
You may need to open the file_io.FileIO instance as a binary file using mode='rb', or set binary_mode=True in the call to read_file_to_string.
This causes the data to be returned as a sequence of bytes rather than a string.
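To illustrate why text decoding fails here, the sketch below uses only the standard library and a hand-written header rather than a real file from the bucket: .npy files start with the magic bytes \x93NUMPY, and 0x93 is not a valid first byte in UTF-8. Keeping the payload as bytes and wrapping it in io.BytesIO (instead of StringIO) yields a binary file-like object, which is presumably what np.load would need, though that call is not exercised here.

```python
import io

# Magic prefix of the .npy format followed by a version (sample header only).
npy_prefix = b"\x93NUMPY\x01\x00"

# Decoding as text reproduces the error message from the question.
try:
    npy_prefix.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Treating the same data as binary works: BytesIO is a seekable byte stream.
stream = io.BytesIO(npy_prefix)
first_byte = stream.read(1)
```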
