Embedded Python: How to Compile Script from Memory (Not a File) That Requires "Pickle" - python-3.x

My app hosts embedded Python so that users can create and run scripts using the Python 3.9 framework for macOS. The scripts are not stored as files on disk, but in my app's memory.
In general, I am successfully compiling and running these scripts from memory using Py_CompileString/PyImport_ExecCodeModule.
However, I get a "pickle" error when attempting to run a script that uses the multiprocessing module.
I created a command-line program to replicate the problem. When I execute the following code as a string (with proper newlines, etc.) using PyRun_SimpleString():
from multiprocessing import Process
import sys

def foo(name):
    print('executing foo', name)

def python_main():
    # set the executable to the path to python, not this command line program
    sys.executable = '/Library/Frameworks/Python.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python'
    # set sys.argv[0] to '__main__' since it will be empty otherwise
    sys.argv[0] = '__main__'
    print('creating process')
    p = Process(target=foo, args=('Hi Bob!',))
    print('starting process')
    p.start()
    print('joining process')
    p.join()
    print('process complete - exit code', p.exitcode)

if __name__ == '__main__':
    python_main()
    print('python_main complete')
I receive the following output:
creating process
starting process
joining process
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'foo' on <module '__main__' (built-in)>
process complete - exit code 1
python_main complete
If I run the same code from an actual file called mp.py as follows, then it executes successfully.
PyRun_SimpleString("import mp; mp.python_main()");
Is there something I must do to allow my in-memory script to be pickled as it is when the file is on disk?
P.S. It is worth mentioning that when I compile the code using Py_CompileString/PyImport_ExecCodeModule, the code looks something like this:
codeObj = Py_CompileString(inScript.c_str(), "mp.py", Py_file_input);
moduleObj = PyImport_ExecCodeModule("mp", codeObj);
so the code does have an "imaginary" file name (imaginary because the script text is in memory, not on disk) and a real module name.
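A minimal pure-Python sketch may help explain the failure: with the spawn start method, multiprocessing pickles the target function by reference (module name plus attribute name), so the child process must be able to import a module that provides that function. The module name 'mp' and the one-line script below are illustrative only:

```python
import pickle
import sys
import types

# Build a module from an in-memory string, the way the embedded app does.
src = "def foo(name):\n    print('executing foo', name)\n"
mod = types.ModuleType('mp')
exec(compile(src, 'mp.py', 'exec'), mod.__dict__)
sys.modules['mp'] = mod       # make 'mp' importable in THIS process

data = pickle.dumps(mod.foo)  # pickled by reference: module 'mp', name 'foo'
f = pickle.loads(data)        # a spawned child fails at this step unless it,
f('Hi Bob!')                  # too, can import a module named 'mp'
```

The child process starts from a fresh interpreter, so unless it can recreate or import the in-memory module under the same name, the unpickling step raises exactly the AttributeError shown above.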

Related

from azure.eventhub.aio import EventHubConsumerClient ModuleNotFoundError: No module named 'azure'

I try to run the following code with python3 recv.py in Visual Studio Code, but I'm getting the following error:
Traceback (most recent call last):
File "recv.py", line 2, in <module>
from azure.eventhub.aio import EventHubConsumerClient
ModuleNotFoundError: No module named 'azure'
import asyncio
from azure.eventhub.aio import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblobaio import BlobCheckpointStore

async def on_event(partition_context, event):
    # Print the event data.
    print("Received the event: \"{}\" from the partition with ID: \"{}\"".format(event.body_as_str(encoding='UTF-8'), partition_context.partition_id))
    # Update the checkpoint so that the program doesn't read the events
    # that it has already read when you run it next time.
    await partition_context.update_checkpoint(event)

async def main():
    # Create an Azure blob checkpoint store to store the checkpoints.
    checkpoint_store = BlobCheckpointStore.from_connection_string("connection_string", "containername")
    # Create a consumer client for the event hub.
    client = EventHubConsumerClient.from_connection_string("connection_string", consumer_group="$Default", eventhub_name="eventhubinstance", checkpoint_store=checkpoint_store)
    async with client:
        # Call the receive method. Read from the beginning of the partition (starting_position: "-1")
        await client.receive(on_event=on_event, starting_position="-1")

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    # Run the main method.
    loop.run_until_complete(main())
When I execute the file in my iTerm terminal, it works fine. Can you tell me why it is not working in VS Code?
I'm using Python 3.7.9.
I have installed the package using pip3 install azure-eventhub (I have also tried plain pip), but the modules are still reported as missing even though they are not.
pip show azure-eventhub prints WARNING: Package(s) not found: azure-eventhub, yet I can see the package in /usr/local/lib/python3.10/site-packages.
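The symptoms (the package sitting under a python3.10 site-packages directory while the script reports Python 3.7.9) suggest that VS Code and pip are bound to different interpreters. A small diagnostic sketch to run from both iTerm and the VS Code integrated terminal and compare:

```python
import sys

# Show which interpreter is actually running and where it looks for
# packages. If azure-eventhub was installed into a different interpreter's
# site-packages, that directory will not appear in this list.
print(sys.executable)
print(sys.version)
for entry in sys.path:
    print(entry)
```

If the two runs print different executables, select the matching interpreter in VS Code, or install with python3 -m pip install azure-eventhub using the same python3 that runs the script.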

How to catch an exception and get exception type in Python?

[Question] I need the user to provide a Google Drive link to their kaggle.json file so I can set it up. When the script is run for the first time without a kaggle.json file, it throws an error that I am trying to handle, without any success. My first question is the same as the title of this post; the second is: does any of this make sense? Is there a better way to do it?
[Background] I am trying to write a script that acts as an interface providing limited access to the functionality of the Kaggle library, so that it runs in my projects while still being shareable on GitHub for others to use in similar projects. I will bundle it with a configuration management tool or a shell script.
This is the code:
#!/usr/bin/env python3
import os
import sys
import traceback
import gdown
import kaggle
import argparse

"""
A wrapper around the kaggle library that provides limited access to the kaggle library
"""

#*hyperparameters
dataset = 'roopahegde/cryptocurrency-timeseries-2020'
raw_data_folder = './raw'
kaggle_api_file_link = None

#*argument parser
parser = argparse.ArgumentParser(description="download dataset")
parser.add_argument('--kaggle_api_file', type=str, default=kaggle_api_file_link, help="download and set kaggle API file [Gdrive link]")
parser.add_argument("--kaggle_dataset", type=str, default=dataset, help="download kaggle dataset using user/datasets_name")
parser.add_argument("--create_folder", type=str, default=raw_data_folder, help="create folder to store raw datasets")
group = parser.add_mutually_exclusive_group()
group.add_argument('-preprocess_folder', action="store_true", help="create folder to store preprocessed datasets")
group.add_argument('-v', '--verbose', action="store_true", help="print verbose output")
group.add_argument('-q', '--quiet', action="store_true", help="print quiet output")
args = parser.parse_args()

#*setting kaggle_api_file
if args.kaggle_api_file:
    gdown.download(args.kaggle_api_file, os.path.expanduser('~'), fuzzy=True)

#*creating directories if not exists
if not os.path.exists(args.create_folder):
    os.mkdir(args.create_folder)
if not os.path.exists('./preprocessed') and args.preprocess_folder:
    os.mkdir('./preprocessed')

def main():
    try:
        #*downloading datasets using kaggle.api
        kaggle.api.authenticate()
        kaggle.api.dataset_download_files(args.kaggle_dataset, path=args.create_folder, unzip=True)
        kaggle.api.competition_download_files  # note: this bare attribute access does nothing
        #*output
        if args.verbose:
            print(f"Dataset downloaded from https://www.kaggle.com/{args.kaggle_dataset} in {args.create_folder}")
        elif args.quiet:
            pass
        else:
            print(f"Download Complete")
    except Exception as ex:
        print(f"Error occurred {type(ex)} {ex.args} use flag --kaggle_api_file to download and set kaggle api file")

if __name__ == '__main__':
    sys.exit(main())
I tried to catch IOError and OSError instead of the generic Exception, still without success. I want to print a message telling the user to use the --kaggle_api_file flag to set up the kaggle.json file.
This is the error:
python get_data.py
Traceback (most recent call last):
File "get_data.py", line 7, in <module>
import kaggle
File "/home/user/.local/lib/python3.8/site-packages/kaggle/__init__.py", line 23, in <module>
api.authenticate()
File "/home/user/.local/lib/python3.8/site-packages/kaggle/api/kaggle_api_extended.py", line 164, in authenticate
raise IOError('Could not find {}. Make sure it\'s located in'
OSError: Could not find kaggle.json. Make sure it's located in /home/user/.kaggle. Or use the environment method.
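The traceback shows the OSError is raised inside kaggle/__init__.py, i.e. by the import kaggle statement itself, before main() and its try block ever run. A hedged sketch of one fix is to move the import inside the handler (catching ImportError as well is an addition here, so the sketch also survives when the package is absent):

```python
#!/usr/bin/env python3

def main():
    # The kaggle package calls api.authenticate() while it is being imported,
    # so the missing-kaggle.json OSError comes from the import itself.
    # Moving the import inside the try block lets us catch it.
    try:
        import kaggle  # ImportError is caught too, in case kaggle is absent
    except (OSError, ImportError) as ex:
        print(f"{type(ex).__name__}: {ex}")
        print("Use the --kaggle_api_file flag to download and set the Kaggle API file")
        return
    # ...continue with kaggle.api.dataset_download_files(...) as before...

if __name__ == '__main__':
    main()
```

With the import deferred like this, the script can print its guidance message instead of dying with an unhandled traceback on first run.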

How can I open a file with its correct program (e.g. ".blend" with Blender and ".webloc" with Chrome) inside a Python program?

I'm working with IDLE on Mac and I'm trying to make a small program that opens a random file from a folder (it is actually a test for a bigger project). In the folder I have many types of files, like ".blend", ".m4a", ".py" and ".webloc", and I expect to have even more in the future. I would like my code to open a random one with its respective program (Blender, QuickTime Player, IDLE, Chrome...), but so far I have not found any way to do it. Is it possible? The most I've been able to do is open Google Chrome from my Windows computer. It doesn't work on my Mac (maybe because it is a .app instead of a .exe?), and I can only open programs, not files. Here's the code I used for that:
import subprocess
subprocess.Popen([r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe', '-new-tab'])
When I enter that on Mac (but with the correct file path for Mac):
import subprocess
subprocess.Popen(['/Applications/Google Chrome.app', '-new-tab'])
It gives me this error (could it be because the file path is written incorrectly? I copied it by right-clicking the Chrome file and choosing "copy as path"):
>>>
=============== RESTART: /Users/jaimewalter/Desktop/Test/Test3.py ==============
Traceback (most recent call last):
File "/Users/jaimewalter/Desktop/Test/Test3.py", line 3, in <module>
subprocess.Popen(['/Applications/Google Chrome.app', '-new-tab'])
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 854, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1702, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
PermissionError: [Errno 13] Permission denied: '/Applications/Google Chrome.app'
>>>
And here's my code for the random file selector:
import random

files = ["Test1.blend", "Test2.m4a", "Test3.py", "Test4.webloc"]
open_this = random.choice(files)
print(open_this)
if open_this == "Test1.blend":
    print("Opening Test1.blend")
    # now it should open Test1.blend in a new Blender window (/Users/jaimewalter/Desktop/Test/Test1.blend)
elif open_this == "Test2.m4a":
    print("Opening Test2.m4a")
    # now it should open Test2.m4a in a new QuickTime Player window (/Users/jaimewalter/Desktop/Test/Test2.m4a)
elif open_this == "Test3.py":
    print("Opening Test3.py")
    # now it should open Test3.py in a new IDLE window, or preferably run the code inside it directly if that's possible (/Users/jaimewalter/Desktop/Test/Test3.py)
elif open_this == "Test4.webloc":
    print("Opening Test4.webloc")
    # now it should open Test4.webloc in a new Chrome or Safari window (/Users/jaimewalter/Desktop/Test/Test4.webloc)
What should I use to open the files inside the code? Thanks in advance.
I already solved it. I used:
import subprocess
subprocess.call(["open", "Test1.blend"])
I think on Linux it's xdg-open.
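Building on that answer, a small cross-platform sketch: the macOS open and Linux xdg-open branches come from the answer above, while the Windows branch using cmd /c start is an assumption added here.

```python
import subprocess
import sys

def launcher_command(path, platform=sys.platform):
    # Build the platform-specific "open with default app" command.
    if platform == 'darwin':
        return ['open', path]                    # macOS
    if platform.startswith('linux'):
        return ['xdg-open', path]                # most Linux desktops
    if platform == 'win32':
        return ['cmd', '/c', 'start', '', path]  # assumption: Windows shell
    raise RuntimeError(f'unsupported platform: {platform}')

# e.g. subprocess.call(launcher_command('Test1.blend'))
```

Delegating to the OS launcher means each file type opens in whatever application the system already associates with it, so the chain of elif branches above is no longer needed.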

FileNotFoundError: [WinError 2] The system cannot find the file specified while loading model from s3

I have recently saved a model to S3 using joblib (model_doc is the model object):
import subprocess
import joblib

def save_d2v_to_s3_current_doc2vec_model(model, fname):
    model_name = fname
    joblib.dump(model, model_name)
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(model_name, path).split()
    print('saving...' + model_name)
    subprocess.call(command)

save_d2v_to_s3_current_doc2vec_model(model_doc, "doc2vec_model")
It was successful, but after that, when I try to load the model back from S3, it gives me an error:
def load_d2v(fname):
    model_name = fname
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(path, model_name).split()
    print('loading...' + model_name)
    subprocess.call(command)
    model = joblib.load(model_name)
    return model

model = load_d2v("doc2vec_model")
This is the error I get:
loading...doc2vec_model
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in load_d2v
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
restore_signals, start_new_session)
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
I don't even understand why it is saying the file was not found; this is the path I used to save the model, but now I'm unable to get the model back from S3. Please help me!
I suggest that rather than your generic print() lines showing your intent, you print the actual command you've composed, to verify that it makes sense upon observation.
If it does, then also try that exact same aws ... command directly, at the command prompt where you had been launching your Python code, to make sure it runs that way. If it doesn't, you may get a clearer error.
Note that the error you're getting doesn't particularly look like it's coming from the aws command, or from the S3 service - which might talk about 'paths' or 'objects'. Rather, it's from the Python subprocess machinery and its Popen call. I think those are triggered via your call to subprocess.call(), but for some reason that line of code isn't shown in the traceback. (How are you running the block of code with load_d2v()?)
That suggests the file that's not found might be the aws command itself. Are you sure it's installed and runnable from the exact working directory/environment that your Python is running in and invoking via subprocess.call()?
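A quick sketch of that diagnosis: shutil.which is the stdlib way to check whether subprocess would find the aws executable on PATH (the bucket path below mirrors the question's, for illustration).

```python
import shutil

# Compose the same command load_d2v() builds, and print it for inspection.
path = 's3://sd-flikku/datalake/current_doc2vec_model/doc2vec_model'
command = "aws s3 cp {} {}".format(path, 'doc2vec_model').split()
print(command)

# If this prints None, subprocess.call() cannot locate the aws executable,
# which on Windows surfaces as exactly this FileNotFoundError / WinError 2.
print(shutil.which('aws'))
```

If shutil.which('aws') returns None, the fix is to install the AWS CLI or call it by its full path, rather than anything to do with S3 itself.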
(BTW, if my previous answer got you past your sklearn.externals.joblib problem, it'd be good for you to mark that answer as accepted, to save other potential answerers from thinking it's still an unsolved question that's blocking you.)
Try adding your model file's extension to fname if you are confident the model file is there, e.g. doc2vec_model.h3.

AttributeError: 'Timer' object has no attribute '_seed'

This is the code I used. I found it at https://github.com/openai/universe#breaking-down-the-example . Because I was getting an error from the remote manager, I copied this code to run it locally, but it still gives me the error below.
import gym
import universe  # register the universe environments

env = gym.make('flashgames.DuskDrive-v0')
env.configure(remotes=1)  # automatically creates a local docker container
observation_n = env.reset()

while True:
    action_n = [[('KeyEvent', 'ArrowUp', True)] for ob in observation_n]  # your agent here
    observation_n, reward_n, done_n, info = env.step(action_n)
    env.render()
This is what I get when I try to run the above script. I have tried every way I could find to solve it, but it still causes the same error, and there isn't even one thread about it. I don't know what to do now; please tell me if any of you have solved it.
I'm using Ubuntu 18.04 LTS in VirtualBox, which is running on Windows 10.
WARN: Environment '<class 'universe.wrappers.timer.Timer'>' has deprecated methods '_step' and '_reset' rather than 'step' and 'reset'. Compatibility code invoked. Set _gym_disable_underscore_compat = True to disable this behavior.
Traceback (most recent call last):
File "gymtest1.py", line 4, in <module>
env = gym.make("flashgames.CoasterRacer-v0")
File "/home/mystery/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 167, in make
return registry.make(id)
File "/home/mystery/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 125, in make
patch_deprecated_methods(env)
File "/home/mystery/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 185, in patch_deprecated_methods
env.seed = env._seed
AttributeError: 'Timer' object has no attribute '_seed'
So I think what you need to do is add a few lines to the Timer module, because gym checks whether the environment implements certain deprecated methods (_step, _reset, _seed, etc.).
So all you need to do (I think) is add this at the end of the Timer class:
def _seed(self, seed_num=0):  # this is so that you can get consistent results
    pass  # optionally, you could add: random.seed(seed_num)
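Since editing the installed package is fragile, the same effect can be had by monkey-patching the class before gym.make() runs its compatibility shim. The Timer class below is a stand-in written for illustration (universe is no longer readily installable), so only the pattern carries over:

```python
# Stand-in for universe.wrappers.timer.Timer, for illustration only.
class Timer:
    def _step(self, action):
        return None, 0.0, False, {}

def _seed(self, seed=None):
    # gym's patch_deprecated_methods() does env.seed = env._seed,
    # so the attribute just has to exist; returning [] mimics gym's
    # usual seed-list convention.
    return []

# Attach the missing method before gym.make() would patch the class.
Timer._seed = _seed

env = Timer()
print(env._seed())  # the attribute now resolves, so patching succeeds
```

Placing the assignment before gym.make('flashgames.DuskDrive-v0') in the original script should avoid the AttributeError without touching the package source.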
