Import error: No module named 'boto', but I have it installed - python-3.x

I'm setting up new functionality in my gcloud buckets that lets me upload and download files using a Python library called "boto", but this error appears.
I am using Linux, Visual Studio Code, Python 3.7, and the latest versions of gsutil and boto.
import os
import boto
import gcs_oauth2_boto_plugin
import shutil
import io
import tempfile
import time
import sys
# Activate virtual environment (VENV, PROJECT_ID and BUCKET are assumed
# to be defined earlier in the script)
activate_this = os.path.join(VENV, 'bin/activate_this.py')
exec(open(activate_this).read(), dict(__file__=activate_this))
# Check arguments
if len(sys.argv) < 2:
    print("Usage: " + sys.argv[0] + ' FILENAME')
    quit()
filename = sys.argv[1]
# URI scheme for Cloud Storage.
GOOGLE_STORAGE = "gs"
# URI scheme for accessing local files.
LOCAL_FILE = "file"
header_values = {"x-goog-project-id": PROJECT_ID}
# Open local file
with open(filename, 'r') as localfile:
    dst_uri = boto.storage_uri(BUCKET + '/' + filename, GOOGLE_STORAGE)
    # The key-related functions are a consequence of boto's
    # interoperability with Amazon S3 (which employs the
    # concept of a key mapping to localfile).
    dst_uri.new_key().set_contents_from_file(localfile)
print('Successfully created "%s/%s"' % (dst_uri.bucket_name, dst_uri.object_name))
Traceback (most recent call last):
  File "./upload2gcs.py", line 10, in <module>
    import boto
ImportError: No module named boto

The directory containing the boto module probably isn't findable from any of the paths where Python looks for modules to be imported.
From within your script, check the sys.path list and see if the expected directory is present:
import pprint
import sys
pprint.pprint(sys.path)
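If the expected directory is missing, you can add it yourself before the import. A minimal sketch, where the path is a placeholder for wherever the boto package actually lives on your machine:
import sys
# Placeholder path -- replace with the directory that contains the boto package
sys.path.insert(0, '/path/to/site-packages')
import boto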
As an example, gsutil is packaged with its own fork of Boto; it performs some additional steps at runtime to make sure the Boto module's parent directory is added to sys.path, which allows subsequent import boto statements to work:
https://github.com/GoogleCloudPlatform/gsutil/blob/c74a5964980b4f49ab2c4cb4d5139b35fbafe8ac/gslib/__init__.py#L102

Related

Error Launching Blob Trigger function in Azure Functions expected str, bytes or os.PathLike object, not PosixPath

My problem is: when I execute a freshly uploaded Python function in an Azure Function App service and launch it (no matter whether I use a blob trigger or an HTTP trigger), I always get the same error:
Exception while executing function: Functions.TestBlobTrigger Result: Failure
Exception: TypeError: expected str, bytes or os.PathLike object, not PosixPath
Stack: File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/dispatcher.py", line 284, in _handle__function_load_request
    func = loader.load_function(
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/utils/wrappers.py", line 40, in call
    return func(*args, **kwargs)
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/loader.py", line 53, in load_function
    register_function_dir(dir_path.parent)
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/loader.py", line 26, in register_function_dir
    _submodule_dirs.append(fspath(path))
Why is this happening: once the function is successfully deployed, I upload a file to a blob in order to trigger the function, but I always get the same error, caused by the pathlib library (https://pypi.org/project/pathlib/). I have written a very simple function that works in my local VS Code and just prints a message.
import logging
import configparser
import azure.functions as func
from azure.storage.blob import BlockBlobService
import os
import datetime
import io
import json
import calendar
import aanalytics2 as api2
import time
import pandas as pd
import csv
from io import StringIO
def main(myblob: func.InputStream):
    logging.info("Blob Trigger function launched")
    blob_bytes = myblob.read()
    blobmessage = blob_bytes.decode()
    func1 = PythonAPP.callMain()
    func1.main(blobmessage)
The Pythonapp class is:
class PythonAPP:
    def __init__(self):
        logging.info('START extractor.')
        self.parameter = "product"

    def main(self, message1):
        var1 = "--"
        try:
            var1 = "---"
            logging.info('END: ->paramet ' + str(message1))
        except Exception as inst:
            logging.error("Error PythonAPP.main: " + str(inst))
        return var1
My requirements.txt file is:
azure-storage-blob==0.37.1
azure-functions-durable
azure-functions
pandas
xlrd
pysftp
openpyxl
configparser
PyJWT==1.7.1
pathlib
dicttoxml
requests
aanalytics2
I've created this simple function in order to check whether I can upload the simplest possible example to Azure Functions; are there any dependencies that I am forgetting?
------------UPDATE1--------------------
The function is failing because of the pathlib import: the function's requirements pull in this library, and it fails with Azure Functions. Please see the requirements.txt file at the following link: https://github.com/pitchmuc/adobe_analytics_api_2.0/blob/master/requirements.txt
Can I exclude it somehow?
Well, I can't provide an answer for that, but I made a workaround. I created a copy of the library in a GitHub repository and erased the references to pathlib in its requirements.txt and setup.py, because this library causes the failure in Azure Function Apps. Then, in the project's requirements file (the one I wrote above), change the aanalytics2 reference to point at that repository instead:
git+https://github.com/theURLtotherepository.git@master#egg=theprojectname
LINKS.
I've checked a lot of examples on Google, but none of them helped me:
Azure function failing after successful deployment with OSError: [Errno 107]
https://github.com/Azure/azure-functions-host/issues/6835
https://learn.microsoft.com/en-us/answers/questions/39865/azure-functions-python-httptrigger-with-multiple-s.html
https://learn.microsoft.com/en-us/answers/questions/147627/azure-functions-modulenotfounderror-for-python-scr.html
Missing Dependencies on Python Azure Function --> no bundler flag or --build remote
https://github.com/OpenMined/PySyft/issues/3400
This is a bug in the Azure codebase itself, specifically within azure-functions-python-worker/azure_functions_worker/dispatcher.py.
The problematic code within the dispatcher looks to be setting up the exposed functions with the metadata parameters found within function.json.
You will not encounter the issue if you're not using pathlib within your function app / web app code; if pathlib is available, the issue manifests: rather than simple os.path strings, pathlib.Path objects are exposed, and deeper down in the codebase there looks to be a conditional import and use of pathlib.
To resolve, simply remove pathlib from your requirements.txt and redeploy.
You'll need to refactor any of your own code that used pathlib to use the equivalent methods in the os module.
There looks to have been a ticket opened to resolve this around the time of the OP's post, but it is not resolved in the current release.
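If you do need that refactor, the common pathlib operations map over directly to os calls. A quick sketch (the file names here are made up):
import os

# Path('/data/file.csv').parent -> directory part of the path
parent = os.path.dirname('/data/file.csv')
# Path('/data') / 'file.csv' -> joining path segments
joined = os.path.join('/data', 'file.csv')
# Path('/data/file.csv').exists() -> existence check
exists = os.path.exists('/data/file.csv')
# Path('/data/file.csv').suffix -> extension, via splitext
_, suffix = os.path.splitext('/data/file.csv')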

JupyterLab 3: how to get the list of running servers

Since JupyterLab 3.x, jupyter-server is used instead of the classic notebook server, and the following code does not list servers run with jupyter_server:
from notebook import notebookapp
notebookapp.list_running_servers()
None
What still works for the file/notebook name is:
from time import sleep
from IPython.display import display, Javascript
import subprocess
import os
import uuid

def get_notebook_path_and_save():
    magic = str(uuid.uuid1()).replace('-', '')
    print(magic)
    # saves it (ctrl+S)
    # display(Javascript('IPython.notebook.save_checkpoint();'))  # Javascript Error: IPython is not defined
    nb_name = None
    while nb_name is None:
        try:
            sleep(0.1)
            nb_name = subprocess.check_output(f'grep -l {magic} *.ipynb', shell=True).decode().strip()
        except subprocess.CalledProcessError:
            # grep exits non-zero until the saved notebook contains the magic string
            pass
    return os.path.join(os.getcwd(), nb_name)
But it's neither pythonic nor fast.
How can I get the currently running server instances, and from those, e.g., the current notebook file?
Migration to jupyter_server should be as easy as changing notebook to jupyter_server, notebookapp to serverapp, and updating the appropriate configuration files - the server-related codebase is largely unchanged. In the case of listing servers, simply use:
from jupyter_server import serverapp
serverapp.list_running_servers()
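For example, a small sketch of iterating over the result; each entry is a dict read from the server's info file (keys such as 'url' and 'root_dir' are what current jupyter_server versions write, so treat them as assumptions):
from jupyter_server import serverapp

for srv in serverapp.list_running_servers():
    # each srv is a dict describing one running server
    print(srv.get('url'), srv.get('root_dir'))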

Importing all variables from a flexibly defined module

I have a main config.py file and then specific client config files, e.g. client1_config.py.
What I'd like to do is import all variables within my client1_config.py file into my config.py file. The catch is I want to do this flexibly at runtime according to an environment variable. It would look something like this:
import os
import importlib
client = os.environ['CLIENT']
client_config = importlib.import_module(
    '{client}_config'.format(client=client))
from client_config import *
This code snippet raises the following error: ModuleNotFoundError: No module named 'client_config'
Is it possible (and how) to achieve what I'm trying to do, or does Python not support this kind of importing at all?
The call to import_module already imports the client configuration. from client_config import * assumes that client_config is the name of the module you are trying to import, just as import os will import the module os even if you create a variable os beforehand:
os = "sys"
import os # still imports the os module, not the sys module
In the following, assume that we have a client1_config.py which just contains one variable:
dummy = True
To add its elements to the main namespace of config.py so that you can access them directly, you can do the following:
import importlib
client = "client1"
# Import the client's configuration
client_config = importlib.import_module(f"{client}_config")
print(client_config.dummy) # True
# Add all elements from client_config
# to the main namespace:
globals().update({v: getattr(client_config, v)
                  for v in client_config.__dict__
                  if not v.startswith("_")})
print(dummy) # True
However, I would suggest accessing the client's configuration as an attribute of config instead, for clarity and to avoid the client's configuration file overwriting values in the main configuration file.
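A minimal sketch of that suggestion, reusing the client1_config.py from above (the attribute name client_config is illustrative):
# config.py
import os
import importlib

client = os.environ['CLIENT']  # e.g. "client1"
# keep the client settings in their own namespace instead of
# merging them into this module's globals
client_config = importlib.import_module(f"{client}_config")

# elsewhere:
#   import config
#   print(config.client_config.dummy)  # True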

Loading python modules in Python 3

How do I load a Python module that is not built in? I'm trying to create a plugin system for a small project I'm working on. How do I load those "plugins" into Python? And, instead of calling "import module", how do I use a string to reference the module?
Have a look at importlib
Option 1: Import an arbitrary file at an arbitrary path
Assume there's a module at /path/to/my/custom/module.py containing the following contents:
# /path/to/my/custom/module.py
test_var = 'hello'
def test_func():
    print(test_var)
We can import this module using the following code:
import importlib.machinery
myfile = '/path/to/my/custom/module.py'
sfl = importlib.machinery.SourceFileLoader('mymod', myfile)
mymod = sfl.load_module()
The module is imported and assigned to the variable mymod. We can then access the module's contents as:
mymod.test_var
# prints 'hello' to the console
mymod.test_func()
# also prints 'hello' to the console
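Note that load_module() is deprecated on recent Python versions. A sketch of the equivalent using importlib.util, which is the currently recommended route:
import importlib.util

myfile = '/path/to/my/custom/module.py'
spec = importlib.util.spec_from_file_location('mymod', myfile)
mymod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mymod)  # executes the module's code

mymod.test_func()
# prints 'hello', just as above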
Option 2: Import a module from a package
Use importlib.import_module
For example, if you want to import settings from a settings.py file in your application root folder, you could use
_settings = importlib.import_module('settings')
The popular task queue package Celery uses this a lot; rather than giving you code examples here, please check out their Git repository.
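Since the question mentions a plugin system: a minimal discovery sketch, assuming plugins are modules inside a hypothetical plugins/ package:
import importlib
import pkgutil

import plugins  # hypothetical package: plugins/__init__.py

loaded = {}
# discover every module in the plugins package and import it by name
for info in pkgutil.iter_modules(plugins.__path__):
    loaded[info.name] = importlib.import_module(f'plugins.{info.name}')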

ipython notebook --script deprecated. How to replace with post save hook?

I have been using "ipython --script" to automatically save a .py file for each IPython notebook so I can use it to import classes into other notebooks. But this recently stopped working, and I get the following error message:
`--script` is deprecated. You can trigger nbconvert via pre- or post-save hooks:
ContentsManager.pre_save_hook
FileContentsManager.post_save_hook
A post-save hook has been registered that calls:
ipython nbconvert --to script [notebook]
which behaves similarly to `--script`.
As I understand it, I need to set up a post-save hook, but I do not understand how to do this. Can someone explain?
[UPDATED per comment by @mobius dumpling]
Find your config files:
Jupyter / ipython >= 4.0
jupyter --config-dir
ipython <4.0
ipython locate profile default
If you need a new config:
Jupyter / ipython >= 4.0
jupyter notebook --generate-config
ipython <4.0
ipython profile create
Within this directory, there will be a file called [jupyter | ipython]_notebook_config.py, put the following code from ipython's GitHub issues page in that file:
import os
from subprocess import check_call

c = get_config()

def post_save(model, os_path, contents_manager):
    """post-save hook for converting notebooks to .py scripts"""
    if model['type'] != 'notebook':
        return  # only do this for notebooks
    d, fname = os.path.split(os_path)
    check_call(['ipython', 'nbconvert', '--to', 'script', fname], cwd=d)

c.FileContentsManager.post_save_hook = post_save
For Jupyter, replace ipython with jupyter in check_call.
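That is, the check_call line becomes:
check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)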
Note that there's a corresponding 'pre-save' hook, and also that you can call any subprocess or run any arbitrary code there...if you want to do any thing fancy like checking some condition first, notifying API consumers, or adding a git commit for the saved script.
Cheers,
-t.
Here is another approach that doesn't spawn a new process (as check_call does). Add the following to jupyter_notebook_config.py, as in Tristan's answer:
import io
import os
from notebook.utils import to_api_path

_script_exporter = None

def script_post_save(model, os_path, contents_manager, **kwargs):
    """convert notebooks to Python script after save with nbconvert

    replaces `ipython notebook --script`
    """
    from nbconvert.exporters.script import ScriptExporter

    if model['type'] != 'notebook':
        return

    global _script_exporter
    if _script_exporter is None:
        _script_exporter = ScriptExporter(parent=contents_manager)

    log = contents_manager.log

    base, ext = os.path.splitext(os_path)
    script, resources = _script_exporter.from_filename(os_path)
    script_fname = base + resources.get('output_extension', '.txt')
    log.info("Saving script /%s", to_api_path(script_fname, contents_manager.root_dir))
    with io.open(script_fname, 'w', encoding='utf-8') as f:
        f.write(script)

c.FileContentsManager.post_save_hook = script_post_save
Disclaimer: I'm pretty sure I got this from SO somewhere, but can't find it now. Putting it here so it's easier to find in future (:
I just encountered a problem where I didn't have rights to restart my Jupyter instance, so the post-save hook I wanted couldn't be applied.
So I extracted the key parts and could run this with python manual_post_save_hook.py:
from io import open
from re import sub
from os.path import splitext
from nbconvert.exporters.script import ScriptExporter

for nb_path in ['notebook1.ipynb', 'notebook2.ipynb']:
    base, ext = splitext(nb_path)
    script, resources = ScriptExporter().from_filename(nb_path)
    # mine happen to all be in Python so I needn't bother with the full flexibility
    script_fname = base + '.py'
    with open(script_fname, 'w', encoding='utf-8') as f:
        # remove 'In [ ]' commented lines peppered about
        f.write(sub(r'[\n]{2}# In\[[0-9 ]+\]:\s+[\n]{2}', '\n', script))
You can add your own bells and whistles as you would with the standard post-save hook, and the config is the correct way to proceed; I'm sharing this for others who might end up in a similar pinch where they can't get the config edits to take effect.
