Py2exe: Embed static files in exe file itself and access them - resources

I found a solution to add files in library.zip via: Extend py2exe to copy files to the zipfile where pkg_resources can load them.
I can access to my file when library.zip is not include the exe.
I add a file : text.txt in directory: foo/media in library.zip.
And I use this code:
import pkg_resources
import zipfile
from cStringIO import StringIO
my_data = pkg_resources.resource_string(__name__,"library.zip")
filezip = StringIO(my_data)
zip = zipfile.ZipFile(filezip)
data = zip.read("foo/media/text.txt")
I try to use pkg_resources but I think that I don't understand something because I could open directly "library.zip".
My question is how can I do this when library.zip is embed in exe?
Best Regards
Jean-Michel

I cobbled together a reasonably neat solution to this, but it doesn't use pkg_resources.
I need to distribute productivity tools as standalone EXEs, that is, all bundled into the one .exe file. I also need to send out notifications when these tools are used, which I do via the Logging API, using file-based configuration. I emded the logging.cfg fileto make it harder to effectively switch-off these notifications i.e. by deleting the loose file... which would probably break the app anyway.
So the following is the interesting bits from my setup.py:
LOGGING_CFG = open('main/resources/logging.cfg').read()
setup(
name='productivity-tool',
...
# py2exe extras
console=[{'script': productivity_tool.__file__.replace('.pyc', '.py'),
'other_resources': [(u'LOGGINGCFG', 1, LOGGING_CFG)]}],
zipfile=None,
options={'py2exe': {'bundle_files': 1, 'dll_excludes': ['w9xpopen.exe']}},
)
Then in the startup code for productivity_tool.py:
from win32api import LoadResource
from StringIO import StringIO
from logging.config import fileConfig
...
if __name__ == '__main__':
if is_exe():
logging_cfg = StringIO(LoadResource(0, u'LOGGINGCFG', 1))
else:
logging_cfg = 'main/resources/logging.cfg'
fileConfig(logging_cfg)
...
Works a treat!!!

Thank you but I found the solution
my_data = pkg_resources.resource_stream("__main__",sys.executable) # get lib.zip file
zip = zipfile.ZipFile(my_data)
data = zip.read("foo/media/doc.pdf") # get my data on lib.zip
file = open(output_name, 'wb')
file.write(data) # write it on a file
file.close()
Best Regards

You shouldn't be using pkg_resources to retrieve the library.zip file. You should use it to retrieve the added resource.
Suppose you have the following project structure:
setup.py
foo/
__init__.py
bar.py
media/
image.jpg
You would use resource_string (or, preferably, resource_stream) to access image.jpg:
img = pkg_resources.resource_string(__name__, 'media/image.jpg')
That should "just work". At least it did when I bundled my media files in the EXE. (Sorry, I've since left the company where I was using py2exe, so don't have a working example to draw on.)
You could also try using pkg_resources.resource_filename(), but I don't think that works under py2exe.

Related

Python import files from 3 layers

I have the following file structure
home/user/app.py
home/user/content/resource.py
home/user/content/call1.py
home/user/content/call2.py
I have imported resources.py in app.py as below:
import content.resource
Also, I have imported call1 and call2 in resource.py
import call1
import call2
The requirement is to run two tests individually.
run app.py
run resource.py
When I run app.py, it says cannot find call1 and call2.
When run resource.py, the file is running without any issues. How to run app.py python file to call import functions in resource.py and also call1.py and call2.py files?
All the 4 files having __init__ main function.
In your __init__ files, just create a list like this for each init, so for your user __init__: __all__ = ["app", "content"]
And for your content __init__: __all__ = ["resource", "call1", "call2"]
First try: export PYTHONPATH=/home/user<-- Make sure this is the correct absolute path.
If that doesn't solve the issue, try adding content to the path as well.
try: export PYTHONPATH=/home/user/:/home/user/content/
This should definitely work.
You will then import like so:
import user.app
import user.content.resource
NOTE
Whatever you want to use, you must import in every file. Don't bother importing in __init__. Just mention whatever modules that __init__ includes by doing __all__ = []
You have to import call1 and call2 in app.py if you want to call them there.

Creating a Spark RDD from a file located in Google Drive using Python on Colab.Research.Google

I have been successful in running Python 3 / Spark 2.2.1 program in Google's Colab.Research platform :
!apt-get update
!apt-get install openjdk-8-jdk-headless -qq > /dev/null
!wget -q http://apache.osuosl.org/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
!tar xf spark-2.2.1-bin-hadoop2.7.tgz
!pip install -q findspark
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["SPARK_HOME"] = "/content/spark-2.2.1-bin-hadoop2.7"
import findspark
findspark.init()
from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local[*]").getOrCreate()
this works perfectly when I uploaded text files from my local computer to the Unix VM using
from google.colab import files
datafile = files.upload()
and read them as follows :
textRDD = spark.read.text('hobbit.txt').rdd
so far so good ..
My problem starts when I am trying to read a file that is lying in my Google drive colab directory.
Following instructions I have authenticated user and created a drive service
from google.colab import auth
auth.authenticate_user()
from googleapiclient.discovery import build
drive_service = build('drive', 'v3')
after which I have been able to access the file lying in the drive as follows :
file_id = '1RELUMtExjMTSfoWF765Hr8JwNCSL7AgH'
import io
from googleapiclient.http import MediaIoBaseDownload
request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
# _ is a placeholder for a progress object that we ignore.
# (Our file is small, so we skip reporting progress.)
_, done = downloader.next_chunk()
downloaded.seek(0)
print('Downloaded file contents are: {}'.format(downloaded.read()))
Downloaded file contents are: b'The king beneath the mountain\r\nThe king of ......
even this works perfectly ..
downloaded.seek(0)
print(downloaded.read().decode('utf-8'))
and gets the data
The king beneath the mountain
The king of carven stone
The lord of silver fountain ...
where things FINALLY GO WRONG is where I try to grab this data and put it into a spark RDD
downloaded.seek(0)
tRDD = spark.read.text(downloaded.read().decode('utf-8'))
and I get the error ..
AnalysisException: 'Path does not exist: file:/content/The king beneath the mountain\ ....
Evidently, I am not using the correct method / parameters to read the file into spark. I have tried quite a few of the methods described
I would be very grateful if someone can help me figure out how to read this file for subsequent processing.
A complete solution to this problem is available in another StackOverflow question that is available at this URL.
Here is the notebook where this solution is demonstrated.
I have tested it and it works!
It seems that spark.read.text expects a file name. But you give it the file content instead. You can try either of these:
save it to a file then give the name
use just downloaded instead of downloaded.read().decode('utf-8')
You can also simplify downloading from Google Drive with pydrive. I gave an example here.
https://gist.github.com/korakot/d56c925ff3eccb86ea5a16726a70b224
Downloading is just
fid = drive.ListFile({'q':"title='hobbit.txt'"}).GetList()[0]['id']
f = drive.CreateFile({'id': fid})
f.GetContentFile('hobbit.txt')

Python 3.6 Importing a class from a parallel folder

I have a file structure as shown below,
MainFolder
__init__.py
FirstFolder
__init__.py
firstFile.py
SecondFolder
__init__.py
secondFile.py
Inside firstFile.py, I have a class named Math and I want to import this class in secondFile.py.
Code for firstFile.py
class Math(object):
def __init__(self, first_value, second_value):
self.first_value = first_value
self.second_value = second_value
def addition(self):
self.total_add_value = self.first_value + self.second_value
print(self.total_add_value)
def subtraction(self):
self.total_sub_value = self.first_value - self.second_value
print(self.total_sub_value)
Code for secondFile.py
from FirstFolder.firstFile import Math
Math(10, 2).addition()
Math(10, 2).subtraction()
When I tried running secondFile.py I get this error: ModuleNotFoundError: No module named 'First'
I am using Windows and the MainFolder is located in my C drive, under C:\Users\Name\Documents\Python\MainFolder
Possible solutions that I have tried are, creating the empty __init__.py for all main and sub folders, adding the dir of MainFolder into path under System Properties environment variable and using import sys & sys.path.append('\Users\Name\Documents\Python\MainFolder').
Unfortunately, all these solutions that I have found are not working. If anyone can highlight my mistakes to me or suggest other solutions, that would be great. Any help will be greatly appreciated!
There are potentially two issues. The first is with your import statement. The import statement should be
from FirstFolder.firstFile import Math
The second is likely that your PYTHONPATH environment variable doesn't include your MainFolder.
On linux and unix based systems you can do this temporarily on the commandline with
export PYTHONPATH=$PYTHONPATH:/path/to/MainFolder
On windows
set PYTHONPATH="%path%;C:\path\to\MainFolder"
If you want to set it permanently, use setx instead of set

Loading python modules in Python 3

How do I load a python module, that is not built in. I'm trying to create a plugin system for a small project im working on. How do I load those "plugins" into python? And, instaed of calling "import module", use a string to reference the module.
Have a look at importlib
Option 1: Import an arbitrary file in an arbiatrary path
Assume there's a module at /path/to/my/custom/module.py containing the following contents:
# /path/to/my/custom/module.py
test_var = 'hello'
def test_func():
print(test_var)
We can import this module using the following code:
import importlib.machinery
myfile = '/path/to/my/custom/module.py'
sfl = importlib.machinery.SourceFileLoader('mymod', myfile)
mymod = sfl.load_module()
The module is imported and assigned to the variable mymod. We can then access the module's contents as:
mymod.test_var
# prints 'hello' to the console
mymod.test_func()
# also prints 'hello' to the console
Option 2: Import a module from a package
Use importlib.import_module
For example, if you want to import settings from a settings.py file in your application root folder, you could use
_settings = importlib.import_module('settings')
The popular task queue package Celery uses this a lot, rather than giving you code examples here, please check out their git repository

EnsureDispatch error when using cx_freeze for making exe

I am working with Python 3.4 on Windows 7. My setup file is as follows:
from cx_Freeze import setup, Executable, sys
exe=Executable(
script="XYZ.py",
base="Win32Gui",
)
includefiles=[]
includes=[]
excludes=[]
packages=[]
setup(
version = "1.0",
description = "XYZ",
author = "MAX",
name = "AT",
options = {'build_exe': {'excludes':excludes,'packages':packages,'include_files':includefiles}},
executables = [exe]
)
from distutils.core import setup
import py2exe, sys, os, difflib
sys.argv.append('py2exe')
setup(
options = {'py2exe': {'bundle_files': 1}},
console = [{'script': "XYZ.py"}],
zipfile = None,
)
When the obtained exe is run, an error pops up saying:
...
File "C:\Python34\Lib\site-packages\win32com\client\CLSIDToClass.py", line 46, in GetClass
return mapCLSIDToClass[clsid]
KeyError: '{00020970-0000-0000-C000-000000000046}'
I just can't figure out the problem here. Help, please.
Thanks.
I've just figured out that the problem with EnsureDispatch is within gencache module, it assumes to be in read-only mode when an executable is built with cx_freeze.
The following lines allow cache to be built inside AppData\Local\Temp\gen_py\#.#\ directory in Windows 7 x64:
from win32com.client import gencache
if gencache.is_readonly:
gencache.is_readonly = False
gencache.Rebuild() #create gen_py folder if needed
References:
py2exe/pyinstaller and DispatchWithEvents answer
py2exe.org: UsingEnsureDispatch
P. S. Performance is much better with static dispatch
You are using static proxy which is generated on your disk and which has the compiled executable trouble finding. If you do not know what the static proxy is, you are probably using win32com.client.gencache.EnsureDispatch which generates static proxy automatically.
The easiest way to fix the problem is to use dynamic proxy by using win32com.client.dynamic.Dispatch. Static proxy has some benefits, but there is a high possibility that you do not need it.
You can find more information about static and dynamic proxies to COM objects here: http://timgolden.me.uk/python/win32_how_do_i/generate-a-static-com-proxy.html

Resources