The __slots__ attribute for classes dates back to Python 2 or earlier. According to a comment on an answer to "Python __slots__", Python 3.3 improved things enough that the memory-size advantage may no longer be a reason to use __slots__ in, for example, Python 3.4 programs.
So should I clean up my code and remove the use of __slots__ in classes, or is there still some good reason for using __slots__ even in Python 3.4 programs?
By using __slots__ you are telling Python not to have a __dict__ attribute on your instances. This is to save memory in cases where you may have thousands upon thousands of instances. This is the reason to use them in Python 2, and it is the reason to use them in Python 3.
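To make the point above concrete, here is a minimal sketch comparing an ordinary class with a slotted one. The class names PointWithDict and PointWithSlots are made up for illustration; the only thing being demonstrated is that a slotted instance has no __dict__ and rejects undeclared attributes.

```python
class PointWithDict:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointWithSlots:
    """Slotted class: attributes live in fixed slots, no per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PointWithDict(1, 2)
slotted = PointWithSlots(1, 2)

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

# A slotted instance rejects attributes that are not declared in __slots__.
try:
    slotted.z = 3
except AttributeError as exc:
    print("AttributeError:", exc)
```

How much memory this saves depends on the Python version and on how many attributes each instance has, so it is worth measuring on your own data before deciding either way.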
I had only heard of Python, but search results show several different names and now I'm confused. I thought they might be used for some kind of code conversion, but I'm not sure. What are these, actually?
Python is a language. CPython, IronPython, and Jython are different implementations of this language. They also all happen to be implemented in different languages themselves: CPython is written in C, IronPython in C# (for .NET), and Jython in Java. Thus, it is very easy to integrate IronPython into a .NET program, and it is really easy to embed Jython into a Java program. But for the most part, when people execute Python, they are running their Python code through CPython (even if they don't know that's what it's officially called). It is the original Python implementation, it is generally the fastest of the three, and it is the reference implementation that defines what Python is. Unless you want to use Python within .NET or Java frameworks, you might never encounter the other two.
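If you are unsure which implementation you are actually running, the standard library can tell you. This is a small sketch; describe_interpreter is a helper name invented here, while platform.python_implementation() and platform.python_version() are standard library calls.

```python
import platform


def describe_interpreter():
    """Return (implementation name, language version) of the running interpreter."""
    return platform.python_implementation(), platform.python_version()


name, version = describe_interpreter()
# On the interpreter most people have installed, the name is "CPython".
print(f"Running {name}, implementing Python {version}")
```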
Intro
I have a fairly complex Python program (more than 5,000 lines) written for Python 3.6. It parses a huge dataset of more than 5,000 files, processes them into an internal representation of the dataset, and then computes statistics. Since I have to test the model, I need to save the dataset representation, and for now I do this by serializing it with dill (the representation contains objects that pickle does not support). The serialized dataset, uncompressed, takes about 1 GB.
The problem
Now I would like to speed up the computation by parallelizing it. The ideal approach would be multithreading, but the GIL prevents that. The multiprocessing module (and multiprocess, its dill-compatible counterpart) uses serialization to share complex objects between processes, so even in the best setup I managed to come up with, parallelization makes no difference to run time because of the huge size of the dataset.
The question
What is the best way to manage this situation?
I know about posh, but it seems to be compatible with x86 only; ray, but it uses serialization too; gilectomy (a version of Python without the GIL), but I have not been able to make it run threads in parallel; and Jython, which has no GIL but is not compatible with Python 3.x.
I am open to any alternative, any language, however complex it may be, but I can't rewrite the code from scratch.
The best solution I found is to replace dill with a custom pickling module based on the standard pickle. See here: Python 3.6 pickling custom procedure
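As a rough illustration of what "custom pickling on top of the standard pickle" can look like, here is a minimal sketch using __getstate__ and __setstate__. It is not the code from the linked question: the Record class and its scorer member are hypothetical stand-ins for "an object holding something plain pickle rejects".

```python
import pickle


class Record:
    """Hypothetical dataset record holding a member that plain pickle rejects."""

    def __init__(self, values):
        self.values = list(values)
        # A lambda cannot be pickled by the standard pickle module.
        self.scorer = lambda v: v * 2

    def __getstate__(self):
        # Copy the instance state and drop the unpicklable member.
        state = self.__dict__.copy()
        del state["scorer"]
        return state

    def __setstate__(self, state):
        # Restore the picklable state, then rebuild the dropped member.
        self.__dict__.update(state)
        self.scorer = lambda v: v * 2


record = Record([1, 2, 3])
payload = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

print(restored.values)      # [1, 2, 3]
print(restored.scorer(21))  # 42
```

Whether this ends up faster than dill for a given dataset is something to measure; the sketch only shows how to keep such objects compatible with the standard pickle machinery that multiprocessing uses.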
I want to find an alternative for weave in Python 2 since weave is not available anymore in Python 3.
More specifically I need to have an alternative way of writing:
from scipy import weave
from scipy.weave import converters
code = """ C-code1 """
support_code = """ C-code2 """
weave.inline(code, ['a', 'b', 'c'], support_code=support_code, type_converters=converters.blitz, compiler='gcc', verbose=0)
You can use the Cython library, as recommended by the weave developers here. It is a little more complex to use, but it also improves the performance of your code. You can find some examples here.
Another alternative is Numba. It's more user-friendly, but by default it doesn't cache the compiled code between runs (its jit decorators accept cache=True to enable on-disk caching).
Have a look at numba. Chances are you can migrate your whole codebase to plain Python and still keep roughly the speed you are used to from C code. You even gain some features, like raising clear Python errors right from your inner loops, which to my knowledge was not easily possible with weave. As an example of how fast numba can be, check the benchmarks of numpy_groupies, which offers implementations in both numba and weave. Once you have gotten rid of your C code, you'll never look back.
I really admire the functionality of Stackless Python, and I've been looking around for a way to emulate its syntax while still using the standard Python 3 interpreter. An article by Alex J. Champandard in a gamedev blog made it look as though the greenlet library could provide this functionality. I slightly modified his code, but the best makeshift tasklet wrapper I could come up with was a class holding a greenlet inside a variable, as such:
import greenlet

class tasklet():
    def __init__(self, function=None, *variables):
        global _scheduled  # module-level run queue, defined elsewhere
        self.greenlet = greenlet.greenlet(function, None)
        self.functioncall = function  # Redundant backup
        self.variables = variables  # passed to the greenlet on switch()
        _scheduled.append(self)
        self.blocked = False
The function then emulates Stackless' scheduling by passing the variables to the greenlet when calling its switch() method.
So far this appears to work, but I'd like to be able to call the tasklets in original Stackless syntax, e.g. tasklet(function)(*args), as opposed to the current syntax of tasklet(function,*args). I'm not sure where to look in the documentation to find out how to accomplish this. Is this even possible, or is it part of Stackless' changes to the interpreter?
According to this article from 2010-01-08 (with fixed links):
Stackless Python is an extended version of the Python language (and
its CPython reference implementation). New features include
lightweight coroutines (called tasklets), communication primitives
using message passing (called channels), manual and/or automatic
coroutine scheduling, not using the C stack for Python function calls, and
serialization of coroutines (for reloading in another process).
Stackless Python could not be implemented as a Python extension module
– the core of the CPython compiler and interpreter had to be patched.
greenlet is an extension module to CPython providing coroutines and
low-level (explicit) scheduling. The most important advantage of
greenlet over Stackless Python is that greenlet could be implemented
as a Python extension module, so the whole Python interpreter doesn't
have to be recompiled in order to use greenlet. Disadvantages of
greenlet include speed (Stackless Python can be 10%, 35% or 900%
faster, depending on the workflow); possible memory leaks if
coroutines have references to each other; and that the provided
functionality is low-level (i.e. only manual coroutine scheduling, no
message passing provided).
greenstackless, the Python module I've recently developed, provides
most of the (high-level) Stackless Python API using greenlet, so it
eliminates the disadvantage of greenlet that it is low-level. See the
source code and some tests (the latter with tricky corner cases).
Please note that although greenstackless is optimized a bit, it can be
much slower than Stackless Python, and it also doesn't fix the memory
leaks. Using greenstackless is thus not recommended in production
environments; but it can be used as a temporary, drop-in replacement
for Stackless Python if replacing the Python interpreter is not
feasible.
Some other software that emulates Stackless using greenlet:
Concurrence: doesn't support stackless.main, tasklet.next,
tasklet.prev, tasklet.insert, tasklet.remove,
stackless.schedule_remove, doesn't send exceptions properly. (Because
of these features missing, it doesn't pass the unit test above.)
PyPy: doesn't support stackless.main, tasklet.next, tasklet.prev,
doesn't pass the unit test above.
Is the combination of Python 3 and PyQt 4 recommended? Are there any alternatives?
I don't see why not. There is a version available for Python 3 that works normally, and the only alternative if you really need Qt would be PySide, which is far from being compatible with Python 3.
Other GUI alternatives would be wxPython (not in Python 3 yet AFAIK) and the "native" Tkinter (which is something else...).
If PyQt4 is the only non-native module you need, there should be no problem.
Check if all modules you need are available for Py3k!
PyQt4 for Py3k is not yet integrated into all distributions.
E.g. on Debian, PyQt4 currently only works with Python 2.
Have a look at 3to2! A tool to convert Py3 to Py2 code.
That is just better than coding in Py2 and using 2to3.