Cannot Upload C++ extension to Colab

Cannot Upload C++ extension to Colab - linux

I wrote a C++ extension and wrapped it using PyBind11 and compiled it on my Linux machine which yielded a .so file that works locally; however, I cannot upload that .so file to Colab so I tried it on Windows and got a .pyd file which does not upload either... am I doing something wrong?

You aren't doing anything wrong, but what method do you think that colab provides to upload system libraries? (hint: none).
You may have better luck trying to embed C code into python, i.e scipy.weave, but that still requires an environment that has access to a C compiler, which colab does not provide.
You can test if weave is provided as part of a jupyter environment as follows:
!pip install -q weave
import weave
weave.test()

Related

How can I create a Python executable program that doesn't force the user to actually download Python libraries and that stuff in order to be executed?

Let's say that I already built a program that essentially takes some images from an user's path and make new ones using the following libraries/modules:
os itertools pandas PIL
What I need now is to make that program become an executable file that can be actually executed without having to have installed the Python environment and the libraries used in the code, because the user would not know how to code, and would not likely matter how the code works.
The program would run on Windows OS (PC) only , and would use the cmd.exe as the interpreter and medium for user inputs and results display.

Solved here
You can use wheel, and ship whole environment.
#Vincent Caeles:
"#rh979 a wheel is not meant to have all dependencies. Subsequently you could do pip install path/to/wheel.whl --target /path/to/some/folder and zip the contents of the 'folder' to have all your dependencies in the zip archive and ship that to the environment where you want to run your code.
"
Another options is pyinstaller library, according to documentation it should do what you need.

How to embed FreeCAD in a Python virtual environment?

How do I setup a Python virtual environment with the FreeCAD library embedded as to enable import as a module into scripts?
I would like to avoid using the FreeCAD GUI as well as being dependent on having FreeCAD installed on the system, when working with Python scripts that use FreeCAD libraries to create and modify 3D geometry. I hope a Python virtual environment can make that possible.
I am working in PyCharm 2021.1.1 with Python 3.8 in virtualenv on Debian 10.
I started out with FreeCAD documentation for embedding in scripts as a basis:
https://wiki.freecadweb.org/Embedding_FreeCAD
In one attempt, I downloaded and unpacked .deb packages from Debian, taking care to get the correct versions required by each dependency. In another attempt, I copied the contents of a FreeCAD flatpak install, as it should contain all the libraries that FreeCAD depends on.
After placing the libraries to be imported in the virtual maching folder, I have pointed to them with sys.path.append() as well as PyCharm's Project Structure tool in various attempts. In both cases the virtual environment detects where FreeCAD.so is located, but fails to find any of its dependencies, even when located in the same folder. When importing these dependencies explicitly, each of them have the same issue. This leads to a dead end when an import fails because it does not define a module export function according to Python:
ImportError: dynamic module does not define module export function (PyInit_libnghttp2)
I seem to be looking at a very long chain of broken dependencies, even though I make the required libraries available and inform Python where they are located.
I would appreciate either straight up instructions for how to do this or pointers to documentation that describes importing FreeCAD libraries in Python virtual environments, as I have not come across anything that specific yet.
I came across a few prior questions which seemed to have similar intent, but no answers:
Embedding FreeCAD in python script
Is it possible to embed Blender/Freecad in a python program?
Similar questions for Conda focus on importing libraries from the host system rather than embedding them in the virtual environment:
Incude FreeCAD in system path for just one conda virtual environment
Other people's questions on FreeCAD forums went unanswered:
https://forum.freecadweb.org/viewtopic.php?t=27929
EDIT:
Figuring this out was a great learning experience. The problem with piecing dependencies together is that for that approach to work out, everything from the FreeCAD and its dependencies to the Python interpreter and its dependencies seems to need to be built on the same versions of the libraries that they depend on to avoid causing segmentation faults that brings everything to a crashing halt. This means that the idea of grabbing FreeCAD modules and libraries it depends on from a Flatpak installation is in theory not horrible, as all parts are built together using the same library versions. I just couldn't make it work out, presumably due to how the included libraries are located and difficulty identifying an executable for the included Python interpreter. In the end, I looked into the contents of the FreeCAD AppImage, and that turned out to have everything needed in a folder structure that appears to be very friendly to what PyCharm and Python expects from modules and libraries.

This is what I did to get FreeCAD to work with PyCharm and virtualenv:
Download FreeCAD AppImage
https://www.freecadweb.org/downloads.php
Make AppImage executable
chmod -v +x ~/Downloads/FreeCAD_*.AppImage
Create folder for extracting AppImage
mkdir -v ~/Documents/freecad_appimage
Extract AppImage from folder (note: this expands to close to 30000 files requiring in excess of 2 GB disk space)
cd ~/Documents/freecad_appimage
~/Downloads/./FreeCAD_*.AppImage --appimage-extract
Create folder for PyCharm project
mkdir -v ~/Documents/pycharm_freecad_project
Create pycharm project using Python interpreter from extracted AppImage
Location: ~/Documents/pycharm_freecad_project
New environment using: Virtualenv
Location: ~/Documents/pycharm_freecad_project/venv
Base interpreter: ~/Documents/freecad_appimage/squashfs-root/usr/bin/python
Inherit global site-packages: False
Make available to all projects: False
Add folder containing FreeCAD.so library as Content Root to PyCharm Project Structure and mark as Sources (by doing so, you shouldn't have to set PYTHONPATH or sys.path values, as PyCharm provides module location information to the interpreter)
File: Settings: Project: Project Structure: Add Content Root
~/Documents/freecad_appimage/squashfs-root/usr/lib
After this PyCharm is busy indexing files for a while.
Open Python Console in PyCharm and run command to check basic functioning
import FreeCAD
Create python script with example functionality
import FreeCAD
vec = FreeCAD.Base.Vector(0, 0, 0)
print(vec)
Run script
Debug script
All FreeCAD functionality I have used in my scripts so far has worked. However, one kink seems to be that the FreeCAD module needs to be imported before the Path module. Otherwise the Python interpreter exits with code 139 (interrupted by signal 11: SIGSEGV).
There are a couple of issues with PyCharm: It is showing a red squiggly line under the import name and claiming that an error has happened because "No module named 'FreeCAD'", even though the script is running perfectly. Also, PyCharm fails to provide code completion in the code view, even though it manages to do so in it's Python Console. I am creating new questions to address those issues and will update info here if I find a solution.

Converting .py to .exe with pandas imported for all users not having pandas

I'm new to Python, but I have set up an python script for searching some specific Values in 2 different excel sheets printing out matches (in excel).
Problem is, that our work machines are heavily locked down and without admin privileges, we can't really install anything (we can download though). Is there any version of Python that is Windows 7 compatible that will run standalone without requiring any sort of installer?
I have tried pyInstaller, but the problem is that in my script we need PANDAS.
And there is no possibility to pip install pandas to our local machines. All is blocked. ("pip install pandas" is not possible. I did the code with Anaconda)
So my question is: how can I set up a file for my coworkers, who have no permission to download pandas?
Can I set up an exe file (all use windows 7/10) in my private computer where pandas is already installed and forward it to the workers?
It should be very easy for them to use--> double click for executing the python script
Thanks in advance for any advice.

You can also use pyinstaller which personally I find the easiest to use. It can bundle executables for both Linux and Windows, but it must be run on that architecture that you wish to have executable for, i.e. if you want to have Linux executable the pyinstaller command with your code must be run on Linux OS of some kind.
More here: https://www.pyinstaller.org/

This is old so you may already have found a solution but this might help others.
Python by default is an interpreted language. This means that compiling it into an .exe file is impossible.
However, using some modules it is indeed possible to convert a .py script into a windows executable.
You can try py2exe.
py2exe is a Python Distutils extension which converts Python scripts
into executable Windows programs, able to run without requiring a
Python installation.
They have a tutorial here.

Allow my friends to run my python script? [duplicate]

This question already has answers here:
Create a single executable from a Python project [closed]
(3 answers)
Closed 1 year ago.
I'm building a Python application and don't want to force my clients to install Python and modules.
So, is there a way to compile a Python script to be a standalone executable?

You can use PyInstaller to package Python programs as standalone executables. It works on Windows, Linux, and Mac.
PyInstaller Quickstart
Install PyInstaller from PyPI:
pip install pyinstaller
Go to your program’s directory and run:
pyinstaller yourprogram.py
This will generate the bundle in a subdirectory called dist.
pyinstaller -F yourprogram.py
Adding -F (or --onefile) parameter will pack everything into single "exe".
pyinstaller -F --paths=<your_path>\Lib\site-packages yourprogram.py
running into "ImportError" you might consider side-packages.
pip install pynput==1.6.8
still runing in Import-Erorr - try to downgrade pyinstaller - see Getting error when using pynput with pyinstaller
For a more detailed walkthrough, see the manual.

You can use py2exe as already answered and use Cython to convert your key .py files in .pyc, C compiled files, like .dll in Windows and .so on Linux.
It is much harder to revert than common .pyo and .pyc files (and also gain in performance!).

You might wish to investigate Nuitka. It takes Python source code and converts it in to C++ API calls. Then it compiles into an executable binary (ELF on Linux). It has been around for a few years now and supports a wide range of Python versions.
You will probably also get a performance improvement if you use it. It is recommended.

Yes, it is possible to compile Python scripts into standalone executables.
PyInstaller can be used to convert Python programs into stand-alone executables, under Windows, Linux, Mac OS X, FreeBSD, Solaris, and AIX. It is one of the recommended converters.
py2exe converts Python scripts into only executable on the Windows platform.
Cython is a static compiler for both the Python programming language and the extended Cython programming language.

I would like to compile some useful information about creating standalone files on Windows using Python 2.7.
I have used py2exe and it works, but I had some problems.
It has shown some problems for creating single files in Windows 64 bits: Using bundle_files = 1 with py2exe is not working;
It is necessary to create a setup.py file for it to work. http://www.py2exe.org/index.cgi/Tutorial#Step2;
I have had problems with dependencies that you have to solve by importing packages in the setup file;
I was not able to make it work together with PyQt.
This last reason made me try PyInstaller http://www.pyinstaller.org/.
In my opinion, it is much better because:
It is easier to use.
I suggest creating a .bat file with the following lines for example (pyinstaller.exe must be in in the Windows path):
pyinstaller.exe --onefile MyCode.py
You can create a single file, among other options (https://pyinstaller.readthedocs.io/en/stable/usage.html#options).
I had only one problem using PyInstaller and multiprocessing package that was solved by using this recipe: https://github.com/pyinstaller/pyinstaller/wiki/Recipe-Multiprocessing.
So, I think that, at least for python 2.7, a better and simpler option is PyInstaller.

And a third option is cx_Freeze, which is cross-platform.

pyinstaller yourfile.py -F --onefile
This creates a standalone EXE file on Windows.
Important note 1: The EXE file will be generated in a folder named 'dist'.
Important note 2: Do not forget --onefile flag
You can install PyInstaller using pip install PyInstaller
NOTE: In rare cases there are hidden dependencies...so if you run the EXE file and get missing library error (win32timezone in the example below) then use something like this:
pyinstaller --hiddenimport win32timezone -F "Backup Program.py"

I like PyInstaller - especially the "windowed" variant:
pyinstaller --onefile --windowed myscript.py
It will create one single *.exe file in a distination/folder.

You may like py2exe. You'll also find information in there for doing it on Linux.

Use py2exe.... use the below set up files:
from distutils.core import setup
import py2exe
from distutils.filelist import findall
import matplotlib
setup(
console = ['PlotMemInfo.py'],
options = {
'py2exe': {
'packages': ['matplotlib'],
'dll_excludes': ['libgdk-win32-2.0-0.dll',
'libgobject-2.0-0.dll',
'libgdk_pixbuf-2.0-0.dll']
}
},
data_files = matplotlib.get_py2exe_datafiles()
)

I also recommend PyInstaller for better backward compatibility such as Python 2.3 - 2.7.
For py2exe, you have to have Python 2.6.

For Python 3.2 scripts, the only choice is cx_Freeze. Build it from sources; otherwise it won't work.
For Python 2.x I suggest PyInstaller as it can package a Python program in a single executable, unlike cx_Freeze which outputs also libraries.

Since it seems to be missing from the current list of answers, I think it is worth mentioning that the standard library includes a zipapp module that can be used for this purpose. Its basic usage is just compressing a bunch of Python files into a zip file with extension .pyz than can be directly executed as python myapp.pyz, but you can also make a self-contained package from a requirements.txt file:
$ python -m pip install -r requirements.txt --target myapp
$ python -m zipapp -p "interpreter" myapp
Where interpreter can be something like /usr/bin/env python (see Specifying the Interpreter).
Usually, the generated .pyz / .pyzw file should be executable, in Unix because it gets marked as such and in Windows because Python installation usually registers those extensions. However, it is relatively easy to make a Windows executable that should work as long as the user has python3.dll in the path.
If you don't want to require the end user to install Python, you can distribute the application along with the embeddable Python package.

py2exe will make the EXE file you want, but you need to have the same version of MSVCR90.dll on the machine you're going to use your new EXE file.
See Tutorial for more information.

You can find the list of distribution utilities listed at Distribution Utilities.
I use bbfreeze and it has been working very well (yet to have Python 3 support though).

Not exactly a packaging of the Python code, but there is now also Grumpy from Google, which transpiles the code to Go.
It doesn't support the Python C API, so it may not work for all projects.

Using PyInstaller, I found a better method using shortcut to the .exe rather than making --onefile. Anyway, there are probably some data files around and if you're running a site-based app then your program depends on HTML, JavaScript, and CSS files too. There isn't any point in moving all these files somewhere... Instead what if we move the working path up?
Make a shortcut to the EXE file, move it at top and set the target and start-in paths as specified, to have relative paths going to dist\folder:
Target: %windir%\system32\cmd.exe /c start dist\web_wrapper\web_wrapper.exe
Start in: "%windir%\system32\cmd.exe /c start dist\web_wrapper\"
We can rename the shortcut to anything, so renaming to "GTFS-Manager".
Now when I double-click the shortcut, it's as if I python-ran the file! I found this approach better than the --onefile one as:
In onefile's case, there's a problem with a .dll missing for the Windows 7 OS which needs some prior installation, etc. Yawn. With the usual build with multiple files, no such issues.
All the files that my Python script uses (it's deploying a tornado web server and needs a whole freakin' website worth of files to be there!) don't need to be moved anywhere: I simply create the shortcut at top.
I can actually use this exact same folder on Ubuntu (run python3 myfile.py) and Windows (double-click the shortcut).
I don't need to bother with the overly complicated hacking of .spec file to include data files, etc.
Oh, remember to delete off the build folder after building. It will save on size.

Use Cython to convert to C, compile, and link with GCC.
Another could be, make the core functions in C (the ones you want to make hard to reverse), compile them and use Boost.Python to import the compiled code (plus you get a much faster code execution). Then use any tool mentioned to distribute.

I'm told that PyRun is also an option. It currently supports Linux, FreeBSD and Mac OS X.

sqlite3 error on AWS lambda with Python 3

I am building a python 3.6 AWS Lambda deploy package and was facing an issue with SQLite.
In my code I am using nltk which has a import sqlite3 in one of the files.
Steps taken till now:
Deployment package has only python modules that I am using in the root. I get the error:
Unable to import module 'my_program': No module named '_sqlite3'
Added the _sqlite3.so from /home/my_username/anaconda2/envs/py3k/lib/python3.6/lib-dynload/_sqlite3.so into package root. Then my error changed to:
Unable to import module 'my_program': dynamic module does not define module export function (PyInit__sqlite3)
Added the SQLite precompiled binaries from sqlite.org to the root of my package but I still get the error as point #2.
My setup: Ubuntu 16.04, python3 virtual env
AWS lambda env: python3
How can I fix this problem?

Depending on what you're doing with NLTK, I may have found a solution.
The base nltk module imports a lot of dependencies, many of which are not used by substantial portions of its feature set. In my use case, I'm only using the nltk.sent_tokenize, which carries no functional dependency on sqlite3 even though sqlite3 gets imported as a dependency.
I was able to get my code working on AWS Lambda by changing
import nltk
to
import imp
import sys
sys.modules["sqlite"] = imp.new_module("sqlite")
sys.modules["sqlite3.dbapi2"] = imp.new_module("sqlite.dbapi2")
import nltk
This dynamically creates empty modules for sqlite and sqlite.dbapi2. When nltk.corpus.reader.panlex_lite tries to import sqlite, it will get our empty module instead of the standard library version. That means the import will succeed, but it also means that when nltk tries to use the sqlite module it will fail.
If you're using any functionality that actually depends on sqlite, I'm afraid I can't help. But if you're trying to use other nltk functionality and just need to get around the lack of sqlite, this technique might work.

This is a bit of a hack, but I've gotten this working by dropping the _sqlite3.so file from Python 3.6 on CentOS 7 directly into the root of the project being deployed with Zappa to AWS. This should mean that if you can include _sqlite3.so directly into the root of your ZIP, it should work, so it can be imported by this line in cpython:
https://github.com/python/cpython/blob/3.6/Lib/sqlite3/dbapi2.py#L27
Not pretty, but it works. You can find a copy of _sqlite.so here:
https://github.com/Miserlou/lambda-packages/files/1425358/_sqlite3.so.zip
Good luck!

This isn't a solution, but I have an explanation why.
Python 3 has support for sqlite in the standard library (stable to the point of pip knowing and not allowing installation of pysqlite). However, this library requires the sqlite developer tools (C libs) to be on the machine at runtime. Amazon's linux AMI does not have these installed by default, which is what AWS Lambda runs on (naked ami instances). I'm not sure if this means that sqlite support isn't installed or just won't work until the libraries are added, though, because I tested things in the wrong order.
Python 2 does not support sqlite in the standard library, you have to use a third party lib like pysqlite to get that support. This means that the binaries can be built more easily without depending on the machine state or path variables.
My suggestion, which you've already done I see, is to just run that function in python 2.7 if you can (and make your unit testing just that much harder :/).
Because of the limitations (it being something baked into python's base libs in 3) it is more difficult to create a lambda-friendly deployment package. The only thing I can suggest is to either petition AWS to add that support to lambda or (if you can get away without actually using the sqlite pieces in nltk) copying anaconda by putting blank libraries that have the proper methods and attributes but don't actually do anything.
If you're curious about the latter, check out any of the fake/_sqlite3 files in an anaconda install. The idea is only to avoid import errors.

As apathyman describes, there isn't a direct solution to this until Amazon bundle the C libraries required for sqlite3 into the AMI's used to run Python on lambda.
One workaround though, is using a pure Python implementation of SQLite, such as PyDbLite. This side-steps the problem, as a library like this doesn't require any particular C libraries to be installed, just Python.
Unfortunately, this doesn't help you if you are using a library which in turn uses the sqlite3 module.

My solution may or may not apply to you (as it depends on Python 3.5), but hopefully it may shed some light for similar issue.
sqlite3 comes with standard library, but is not built with the python3.6 that AWS use, with the reason explained by apathyman and other answers.
The quick hack is to include the share object .so into your lambda package:
find ~ -name _sqlite3.so
In my case:
/home/user/anaconda3/pkgs/python-3.5.2-0/lib/python3.5/lib-dynload/_sqlite3.so
However, that is not totally sufficient. You will get:
ImportError: libpython3.5m.so.1.0: cannot open shared object file: No such file or directory
Because the _sqlite3.so is built with python3.5, it also requires python3.5 share object. You will also need that in your package deployment:
find ~ -name libpython3.5m.so*
In my case:
/home/user/anaconda3/pkgs/python-3.5.2-0/lib/libpython3.5m.so.1.0
This solution is likely not work if you are using _sqlite3.so that is built with python3.6, because the libpython3.6 built by AWS will likely not support this. However, this is just my educational guess. If anyone has successfully done, please let me know.

TL;DR
Although not a very good approach but it works just fine. Use pickle module to dump whatever functionality you want in a separate file using you Local System. Export that file to AWS Project Directory, load functionality from file and use it.
In Long
So, here's what I tried, I had this same problem when I was trying to import stopwords from nltk.corpus but AWS doesn't let me import it, even installing it wasn't seems to be possible for me on Amazon AMI because of the same error _sqlite3 module not found. So to do that, what I tried was, using my local system I generated a file and dump stopwords into it. Here's the code
# Only for Local System
from nltk.corpus import stopwords
import pickle
stop = list(stopwords.words('English'))
with open('stopwords.pkl', 'wb') as f:
pickle.dump(stop, f)
Now, I exported this file to AWS Directory using winscp and then used that functionality by loading it from the file on my project.
import pickle
# loading trained model
with open('stopwords.pkl', 'rb') as f:
stop = pickle.load(f)
and it works just fine

From AusIV's answer, This version works for me in AWS Lambda and NLTK, I created a dummysqllite file to mock the required references.
spec = importlib.util.spec_from_file_location("_sqlite3","/dummysqllite.py")
sys.modules["_sqlite3"] = importlib.util.module_from_spec(spec)
sys.modules["sqlite3"] = importlib.util.module_from_spec(spec)
sys.modules["sqlite3.dbapi2"] = importlib.util.module_from_spec(spec)

You need the sqlite3.so file (as others have pointed out), but the most robust way to get it is to pull from the (semi-official?) AWS Lambda docker images available in lambci/lambda. For example, for Python 3.7, here's an easy way to do this:
First, let's grab the sqlite3.so (library file) from the docker image:
mkdir lib
docker run -v $PWD:$PWD lambci/lambda:build-python3.7 bash -c "cp sqlite3.cpython*.so $PWD/lib/"
Next, we'll make a zipped executable with our requirements and code:
pip install -t output requirements.txt
pip install . -t output
zip -r output.zip output
Finally, we add the library file to our image:
cd lib && zip -r ../output.zip sqlite3.cpython*.so
If you want to use AWS SAM build/packaging, instead copy it into the top-level of the lambda environment package (i.e., next to your other python files).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string