How to catch an exception and get exception type in Python? - python-3.x

[Question] I need the user to provide a GDrive link to their kaggle.json file so I can set it up. When the script is run for the first time without a kaggle.json file, it throws an error that I am trying to handle, so far without success. My first question is the same as the title of this post; my second is: does any of this make sense? Is there a better way to do this?
[Background] I am trying to write a script that acts as an interface providing limited access to the functionality of the Kaggle library, so that it runs in my projects while still being shareable on GitHub for others to use in similar projects. I will bundle it with a configuration management tool or a shell script.
This is the code:
#!/usr/bin/env python3
"""
A wrapper around the kaggle library that provides limited access to the kaggle library.
"""
import argparse
import os
import sys
import traceback

import gdown
import kaggle

#*hyperparameters
dataset = 'roopahegde/cryptocurrency-timeseries-2020'
raw_data_folder = './raw'
kaggle_api_file_link = None

#*argument parser
parser = argparse.ArgumentParser(description="download dataset")
parser.add_argument('--kaggle_api_file', type=str, default=kaggle_api_file_link,
                    help="download and set kaggle API file [Gdrive link]")
parser.add_argument("--kaggle_dataset", type=str, default=dataset,
                    help="download kaggle dataset using user/datasets_name")
parser.add_argument("--create_folder", type=str, default=raw_data_folder,
                    help="create folder to store raw datasets")
group = parser.add_mutually_exclusive_group()
group.add_argument('-preprocess_folder', action="store_true",
                   help="create folder to store preprocessed datasets")
group.add_argument('-v', '--verbose', action="store_true", help="print verbose output")
group.add_argument('-q', '--quiet', action="store_true", help="print quiet output")
args = parser.parse_args()

#*setting kaggle_api_file
if args.kaggle_api_file:
    gdown.download(args.kaggle_api_file, os.path.expanduser('~'), fuzzy=True)

#*creating directories if they do not exist
if not os.path.exists(args.create_folder):
    os.mkdir(args.create_folder)
if not os.path.exists('./preprocessed') and args.preprocess_folder:
    os.mkdir('./preprocessed')

def main():
    try:
        #*downloading datasets using kaggle.api
        kaggle.api.authenticate()
        kaggle.api.dataset_download_files(
            args.kaggle_dataset, path=args.create_folder, unzip=True)
        kaggle.api.competition_download_files
        #*output
        if args.verbose:
            print(f"Dataset downloaded from https://www.kaggle.com/{args.kaggle_dataset} in {args.create_folder}")
        elif args.quiet:
            pass
        else:
            print("Download Complete")
    except Exception as ex:
        print(f"Error occurred {type(ex)} {ex.args} use flag --kaggle_api_file to download and set kaggle api file")

if __name__ == '__main__':
    sys.exit(main())
I tried to catch IOError and OSError instead of catching the generic Exception, still with no success. I want to print a message telling the user to use the --kaggle_api_file flag to set up the kaggle.json file.
This is the error:
python get_data.py
Traceback (most recent call last):
  File "get_data.py", line 7, in <module>
    import kaggle
  File "/home/user/.local/lib/python3.8/site-packages/kaggle/__init__.py", line 23, in <module>
    api.authenticate()
  File "/home/user/.local/lib/python3.8/site-packages/kaggle/api/kaggle_api_extended.py", line 164, in authenticate
    raise IOError('Could not find {}. Make sure it\'s located in'
OSError: Could not find kaggle.json. Make sure it's located in /home/user/.kaggle. Or use the environment method.
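The traceback shows why the except block never fires: kaggle/__init__.py calls api.authenticate() at import time, so the OSError is raised by the import kaggle statement itself, before main() ever runs. A minimal sketch of one way around this (assuming the rest of the script above stays the same) is to defer the import into its own try block:
import sys

try:
    import kaggle  # kaggle/__init__.py authenticates on import (see traceback above)
except OSError as ex:  # IOError is an alias of OSError in Python 3
    print(f"Error occurred {type(ex).__name__} {ex.args}")
    print("Use flag --kaggle_api_file to download and set the kaggle api file")
    sys.exit(1)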

Related

Embedded Python: How to Compile Script from Memory (Not a File) That Requires "Pickle"

My app hosts embedded Python so that users can create and run scripts using the Python 3.9 framework for macOS. The scripts are not stored as files on disk, but in my app's memory.
In general, I am successfully compiling and running these scripts from memory using Py_CompileString/PyImport_ExecCodeModule.
However, I get a "pickle" error when attempting to run a script that uses the multiprocessing module.
I created a command line program to replicate the problem. Upon trying to execute the following code as a string (with proper newlines, etc.) using PyRun_SimpleString():
from multiprocessing import Process
import sys

def foo(name):
    print('executing foo', name)

def python_main():
    # set the executable to path to python, not this command line program
    sys.executable = '/Library/Frameworks/Python.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python'
    # set sys.argv[0] to '__main__' since it will be empty otherwise
    sys.argv[0] = '__main__'
    print('creating process')
    p = Process(target=foo, args=('Hi Bob!',))
    p.start()
    p.join()
    print('process complete - exit code', p.exitcode)

if __name__ == '__main__':
    python_main()
I receive the following output:
creating process
starting process
joining process
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'foo' on <module '__main__' (built-in)>
process complete - exit code 1
python_main complete
If I run the same code from an actual file called mp.py as follows, then it executes successfully.
PyRun_SimpleString("import mp; mp.python_main()");
Is there something I must do to allow my in-memory script to be pickled as it is when the file is on disk?
P.S. It is worth mentioning that when I compile the code using Py_CompileString/PyImport_ExecCodeModule, the code looks something like this:
codeObj = Py_CompileString(inScript.c_str(), "mp.py", Py_file_input);
moduleObj = PyImport_ExecCodeModule("mp", codeObj);
so the code does have an "imaginary" file name (imaginary because the script text is in memory, not on disk) and a real module name.
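Not an answer from the thread, but a minimal Python-level sketch of one possible workaround: the default spawn start method on macOS launches a fresh interpreter and re-imports __main__ to unpickle the Process target, which is exactly what an in-memory script cannot support. The fork start method inherits the parent's memory instead and never pickles the target (with the usual caveats about fork on macOS):
from multiprocessing import Process, set_start_method

def foo(name):
    print('executing foo', name)

def python_main():
    # fork copies the parent process image, so the child never needs to
    # re-import __main__ or unpickle foo by name
    set_start_method('fork')
    p = Process(target=foo, args=('Hi Bob!',))
    p.start()
    p.join()
    print('process complete - exit code', p.exitcode)

if __name__ == '__main__':
    python_main()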

FileNotFoundError: [WinError 2] The system cannot find the file specified while loading model from s3

I have recently saved a model to s3 using joblib (model_doc is the model object):
import subprocess
import joblib

def save_d2v_to_s3_current_doc2vec_model(model, fname):
    model_name = fname
    joblib.dump(model, model_name)
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(model_name, path).split()
    print('saving...' + model_name)
    subprocess.call(command)

save_d2v_to_s3_current_doc2vec_model(model_doc, "doc2vec_model")
It was successful, but when I try to load the model back from s3 afterwards, it gives me an error:
def load_d2v(fname):
    model_name = fname
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(path, model_name).split()
    print('loading...' + model_name)
    subprocess.call(command)
    model = joblib.load(model_name)
    return model

model = load_d2v("doc2vec_model")
This is the error I get:
loading...doc2vec_model
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in load_d2v
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
I don't even understand why it says the file is not found; this is the same path I used to save the model, but now I'm unable to get the model back from s3. Please help me!!
I suggest that rather than your generic print() lines, which only show your intent, you print the actual command you've composed, to verify that it makes sense on inspection.
If it does, then also try that exact same aws ... command directly, at the command prompt where you had been launching your Python code, to make sure it runs that way. If it doesn't, you may get a clearer error.
Note that the error you're getting doesn't particularly look like it's coming from the aws command, or from the S3 service, which might talk about 'paths' or 'objects'. Rather, it's from the Python subprocess system's Popen call. I think it's raised via your call to subprocess.call(), but for some reason that line of code isn't shown. (How are you running the block of code with load_d2v()?)
That suggests the file that's not found might be the aws command itself. Are you sure it's installed and runnable from the exact working directory/environment that your Python is running in and invoking via subprocess.call()?
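A minimal sketch of those two checks (the paths are the asker's; the shutil.which probe is an added suggestion): print the composed command before running it, and confirm that the aws executable is findable from the Python process at all:
import shutil
import subprocess

path = 's3://sd-flikku/datalake/current_doc2vec_model/doc2vec_model'
command = "aws s3 cp {} {}".format(path, 'doc2vec_model').split()

print('command to run:', command)            # eyeball the composed command
print('aws found at:', shutil.which('aws'))  # None means Popen won't find it either
print('return code:', subprocess.call(command))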
(BTW, if my previous answer got you over your sklearn.externals.joblib problem, it'd be good for you to mark that answer as accepted, to save other potential answerers from thinking it's still an unsolved question that's blocking you.)
Try adding your model file's extension to fname if you are confident the model file is there, e.g. doc2vec_model.h3.

Agora - unable to merge video .ts files into one single video file

I am using agora.io for video calling, running the script on my localhost.
I am able to record the video successfully, but the recording consists of multiple .ts files.
I downloaded the Python script from the Agora website and ran it. It runs without any error, but it does not generate a single merged video file; in short, the script runs successfully but nothing happens.
No errors, no new file generated.
The code I am using is:
#!/usr/bin/env python
import glob
import os
import re
import signal
import sys
import time
import traceback
from optparse import OptionParser

import parser_metadata_files
import video_convert

if '__main__' == __name__:
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    signal.signal(signal.SIGQUIT, signal.SIG_IGN)
    parser = OptionParser()
    parser.add_option("-f", "--folder", type="string", dest="folder",
                      help="Convert folder", default="")
    parser.add_option("-m", "--mode", type="int", dest="mode",
                      help="Convert merge mode, "
                           "[0: txt merge A/V(Default); 1: uid merge A/V; 2: uid merge audio; 3: uid merge video]",
                      default=0)
    parser.add_option("-p", "--fps", type="int", dest="fps",
                      help="Convert fps, default 15", default=15)
    parser.add_option("-s", "--saving", action="store_true", dest="saving",
                      help="Convert Do not time sync", default=False)
    parser.add_option("-r", "--resolution", type="int", dest="resolution", nargs=2,
                      help="Specific resolution to convert '-r width height' \nEg:'-r 640 360'",
                      default=(0, 0))
    (options, args) = parser.parse_args()
    if not options.folder:
        parser.print_help()
        parser.error("Not set folder")
    try:
        print('1')
        os.environ['FPSARG'] = "%s" % options.fps
        print('2')
        parser_metadata_files.cmds_parse(["dispose", options.folder])
        print('3')
        video_convert.do_work()
        print('4')
        parser_metadata_files.cmds_parse(["clean", options.folder])
        print('5')
    except Exception as e:
        traceback.print_exc()
The command I am running is:
/usr/local/bin/python3.7 convert.py -f /Users/msmexmac/Desktop/Cloud_Recording_tools/tiles/ -m 3 -p 30
I downloaded the script from this page.
The reason you see multiple .ts files is that, after the recording starts, the Agora server automatically splits the recorded content into multiple TS/WebM files and keeps uploading them to the third-party cloud storage until the recording stops.
Make sure to follow the steps in the link below for uploading the recorded video:
https://docs.agora.io/en/cloud-recording/cloud_recording_rest
It is crucial to get the "uploaded" callback before proceeding further.
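If you end up merging the segments by hand instead, here is a generic sketch (an added illustration, not the Agora convert script) using ffmpeg's concat demuxer; it assumes ffmpeg is on PATH and that sorting the file names gives the correct playback order:
import glob
import subprocess

# list the recorded segments in playback order (folder name from the question)
segments = sorted(glob.glob('tiles/*.ts'))
with open('files.txt', 'w') as f:
    for seg in segments:
        f.write("file '{}'\n".format(seg))

# remux the segments into one file without re-encoding
subprocess.run(['ffmpeg', '-f', 'concat', '-safe', '0',
                '-i', 'files.txt', '-c', 'copy', 'merged.mp4'], check=True)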

how to fix an error during uploading custom dataset in to colab?

I have followed all the steps described in most tutorials on how to upload a custom dataset into Google Colab, but I am getting an error that I have tried hard to fix without success.
I am trying to train a CNN model using my custom dataset, and I am trying to upload it to Colab using the code snippet given in most tutorials.
The following error is displayed when I run the code snippet:
Downloading zip file
---------------------------------------------------------------------------
HttpError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pydrive/files.py in FetchMetadata(self, fields, fetch_all)
236 fields=fields)\
--> 237 .execute(http=self.http)
238 except errors.HttpError as error:
(... 6 frames elided ...)
HttpError: <HttpError 404 when requesting https://www.googleapis.com/drive/v2/files/https%3A%2F%2Fdrive.google.com%2Fopen%3Fid%3D1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW?alt=json returned "File not found: https://drive.google.com/open?id=1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW">
During handling of the above exception, another exception occurred:
ApiRequestError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pydrive/files.py in FetchMetadata(self, fields, fetch_all)
237 .execute(http=self.http)
238 except errors.HttpError as error:
--> 239 raise ApiRequestError(error)
240 else:
241 self.uploaded = True
ApiRequestError: <HttpError 404 when requesting https://www.googleapis.com/drive/v2/files/https%3A%2F%2Fdrive.google.com%2Fopen%3Fid%3D1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW?alt=json returned "File not found: https://drive.google.com/open?id=1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW">
# This is the code snippet I have taken from tutorials to upload the dataset to Google Colab.
!pip install -U -q PyDrive
# Insert your file ID
# Get it by generating a share URL for the file
# An example : https://drive.google.com/file/d/1iz5JmTB4YcBvO7amj3Sy2_scSeAsN4gd/view?usp=sharing
zip_id = 'https://drive.google.com/open?id=1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW'
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import zipfile, os
# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
if not os.path.exists('MODEL'):
    os.makedirs('MODEL')
# 2. Download Zip
print ("Downloading zip file")
myzip = drive.CreateFile({'id': zip_id})
myzip.GetContentFile('model.zip')
# 3. Unzip
print ("Uncompressing zip file")
zip_ref = zipfile.ZipFile('model.zip', 'r')
zip_ref.extractall('MODEL/')
zip_ref.close()
OMG. After long hours (almost 8 hours) of researching on the internet and brainstorming, I found the answer. If anyone new to Colab faces a similar error, here is how I solved it. The problem in the above code is the way we assign the file id: zip_id = 'https://drive.google.com/open?id=1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW'. Most tutorials tell us to take the file id by right-clicking the file in Google Drive and copying the share link address, but the file id is not the whole thing we copied. The file id is only the part after id=, which in my case is 1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW. After giving this as the id, the error is gone. I hope this response helps other Colab starters.
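A small helper sketch of that fix (an added illustration; the function name is made up): extract the bare id from either style of Drive share link before passing it to CreateFile:
from urllib.parse import urlparse, parse_qs

def drive_file_id(share_link):
    parsed = urlparse(share_link)
    qs = parse_qs(parsed.query)
    if 'id' in qs:                      # https://drive.google.com/open?id=<ID>
        return qs['id'][0]
    parts = parsed.path.split('/')      # https://drive.google.com/file/d/<ID>/view
    return parts[parts.index('d') + 1]

zip_id = drive_file_id('https://drive.google.com/open?id=1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW')
print(zip_id)  # 1RqLx88tx2FCV0Z3CHsqVtx7S3_ffE-UW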

How to catch an ImportError non-recursively? (dynamic import)

Say we want to import a script dynamically, i.e. the name is constructed at runtime. I use this to detect plugin scripts for some program, so the script may not be there and the import may fail.
from importlib import import_module
# ...
subpackage_name = 'some' + dynamic() + 'string'
try:
    subpackage = import_module(subpackage_name)
except ImportError:
    print('No script found')
How can we make sure to only catch the possible import failure of the plugin script itself, and not of the imports that may be contained inside the plugin script?
Side note: this question is related, but is about static imports (using the import keyword), and the provided solutions don't work here.
Since Python 3.3, ImportError objects have had name and path attributes, so you can catch the error and inspect the name it failed to import.
try:
    import_module(subpackage_name)
except ImportError as e:
    if e.name == subpackage_name:
        print('subpackage not found')
    else:
        print('subpackage contains import errors')
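A quick self-contained demonstration of that distinction (file and module names are made up; assumes the current directory is importable): a plugin that itself imports a missing module raises an ImportError whose name attribute points at the inner module, not the plugin.
import importlib
import os
import sys
from importlib import import_module

sys.path.insert(0, '.')                # make sure the demo file is importable
with open('demo_plugin.py', 'w') as f:
    f.write('import definitely_missing_module\n')
importlib.invalidate_caches()          # pick up the freshly created file

try:
    import_module('demo_plugin')
except ImportError as e:
    print(e.name)                      # 'definitely_missing_module', not 'demo_plugin'
    print(e.name == 'demo_plugin')     # False: the plugin exists but is itself broken

os.remove('demo_plugin.py')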
ImportErrors in Python have messages you can read and name attributes you can use if you have the exception object:
# try to import the module
try:
    subpackage = import_module(subpackage_name)
# get the ImportError object
except ImportError as e:
    ## get the message
    ##message = e.message
    ## ImportError messages start with "No module named ", which is sixteen chars long.
    ##modulename = message[16:]
    # As Ben Darnell pointed out, that isn't the best way to do it in Python 3;
    # get the name attribute:
    modulename = e.name
    # now check if that's the module you just tried to import
    if modulename == subpackage_name:
        # handle the plugin not existing here
        pass
    else:
        # handle the plugin existing but raising an ImportError itself here
        pass
# check for other exceptions
except Exception as e:
    # handle the plugin raising other exceptions here
    pass
