Alternative for Weave in Python 3

I want to find an alternative to scipy.weave from Python 2, since weave is no longer available in Python 3.
More specifically I need to have an alternative way of writing:
from scipy import weave
from scipy.weave import converters
code = """ C-code1 """
support_code = """ C-code2 """
weave.inline(code, ['a', 'b', 'c'], support_code=support_code, type_converters=converters.blitz, compiler='gcc', verbose=0)

You can use the Cython library, as recommended by the weave developers here. It's a little more complex to use, but it increases the performance of your code too. You can find some examples here.
Another alternative is Numba. It's more user-friendly, but it doesn't cache the compiled code across runs by default (newer versions support opt-in caching via @jit(cache=True)).

Have a look at numba. Chances are, you can migrate your whole codebase to plain Python and still preserve the speed you're used to from C code. You even gain some features, like raising clear Python errors right from your inner loops, which to my knowledge was not easily possible with weave. As an example of how fast you get with numba, check the benchmarks of numpy_groupies, which offers implementations in both numba and weave. Once you've gotten rid of your C code, you'll never look back.
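To make this concrete, here is a minimal sketch of the migration pattern: a loop that might previously have lived in a weave.inline C snippet, rewritten as plain Python and JIT-compiled. The function and its arguments are made up for illustration; the fallback decorator just lets the snippet run even where numba isn't installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without numba (just slower).
    def njit(func):
        return func

@njit
def weighted_sum(a, b, c):
    # a, b: 1-D float arrays; c: scalar -- stands in for the old C kernel.
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i] + c
    return total

a = np.arange(4, dtype=np.float64)   # [0, 1, 2, 3]
b = np.ones(4)
print(weighted_sum(a, b, 0.5))       # (0+1+2+3) + 4*0.5 = 8.0
```

The loop body stays readable Python, and numba compiles it to machine code on first call, which is exactly the niche weave.inline used to fill.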

Related

How can I include fast.ai functionality when primarily using PyTorch?

I am using PyTorch to carry out vision tasks, but would like to use some of what fast.ai provides since it has a lot of useful functionality. I'd prefer to work mostly in PyTorch since it's easier for me to understand what's going on, it's easier for me to find information on it online, and I want to maintain flexibility.
In https://docs.fast.ai/migrating_pytorch it's written that after I use the following imports: from fastai.vision.all import * and from migrating_pytorch import *, I should be able to start "Incrementally adding fastai goodness to your PyTorch models", which sounds great.
But when I run the second import I get ModuleNotFoundError: No module named 'migrating_pytorch'. Searching in https://github.com/fastai/fastai I also don't find any code mention of migrating_pytorch.py, nor did I manage to find something online.
(I'm using fast.ai version 2.3.1)
I'd like to know if this is indeed the way to go, and if so, how to get it working. Or, if there's a better way, how I should use that approach instead.
As an example, it would be nice if I could use the EarlyStoppingCallback, SaveModelCallback, and add some metrics from fast.ai instead of writing them myself, while still having everything in mostly "native" PyTorch.
Preferably the solution isn't specific to vision only, but that's my current need.
migrating_pytorch is an example script. It's in the fast.ai repo at: https://github.com/fastai/fastai/blob/master/nbs/examples/migrating_pytorch.py
The notebook that shows how to use it is at: https://github.com/fastai/fastai/blob/827e7cc0fad2db06c40df393c9569309377efac0/nbs/examples/migrating_pytorch.ipynb
For the callback example, your training code would end up looking something like:
cbs = [EarlyStoppingCallback(), SaveModelCallback()]
learner = Learner(dls, simple_cnn(), loss_func=F.cross_entropy, cbs=cbs)
learner.fit(1)
Those two callbacks probably need some arguments, e.g. a save path.

Is it possible to vectorize a function in NodeJS the same way it can be done in Python with Pandas?

To be more specific, I am talking about performing operations over whole rows or columns or matrices instead of scalars, in a (very) efficient way (no need to iterate over the items of the object).
I'm pretty new to NodeJS and I'm coming from Python, so sorry if this is something obvious. Are there any libraries equivalent to Pandas in NodeJS that allow doing this?
Thanks
Javascript doesn't give direct access to all SIMD instructions in your computer. Those are the instructions that allow parallel computation on multiple elements of an array.
It offers packages like math.js for clear expression of your algorithms, debugged code, and some optimization work. math.js expresses matrices as arrays-of-arrays, so it may or may not be the best way to go.
It has really good just-in-time compilation.
That compilation is friendly to loop unrolling.
If you absolutely, positively need screamingly fast performance in the Javascript world, there's always WebAssembly: it offers some SIMD instructions, but it takes a lot of tooling.
An attempt to add SIMD to the Javascript standard has been abandoned in favor of WebAssembly.

How to incorporate custom functions into tf.data pipe-lining process for maximum efficiency

So tf.image, for example, already has some elementary image-processing methods implemented, which I'd assume are optimized. The question is: as I'm iterating through a large dataset of images, what is the recommended way of implementing a more complex function on every image (in batches, of course), for example a patch-wise 2-D DCT, so that it works as well as possible with the whole tf.data framework?
Thanks in advance.
P.S. Of course I could use the map method, but I'm asking beyond that: if I pass a function written in pure numpy to map, it wouldn't help as much.
The current best approach (short of writing custom ops in C++/CUDA) is probably to use https://www.tensorflow.org/api_docs/python/tf/contrib/eager/py_func. This allows you to write any TF eager code and use Python control flow statements. With this you should be able to do most of the things you can do with numpy. The added benefit is that you can use your GPU and the tensors you produce in tfe.py_func will be immediately usable in your regular TF code - no copies are needed.
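As an illustration of the kind of per-image function you'd hand to that mechanism, here is the patch-wise 2-D DCT from the question, sketched in plain NumPy/SciPy (function name and the 8x8 patch size are my own choices); this is the piece you would then wrap inside a tf.data map step.

```python
import numpy as np
from scipy.fft import dct

def patch_dct2(image, patch=8):
    """2-D DCT-II applied independently to non-overlapping patch x patch blocks.

    Pure NumPy/SciPy: this is the function you would wrap for use
    inside a tf.data pipeline.
    """
    h, w = image.shape
    out = np.empty_like(image, dtype=np.float64)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            block = image[i:i + patch, j:j + patch]
            # Separable 2-D DCT: transform along rows, then along columns.
            out[i:i + patch, j:j + patch] = dct(
                dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')
    return out

img = np.ones((8, 8))
coeffs = patch_dct2(img, patch=8)
# For a constant block, all energy lands in the DC coefficient.
print(coeffs[0, 0])  # 8.0
```

Note that a plain Python loop over patches like this only runs at NumPy speed; the point of routing it through the eager py_func mechanism is that the result tensors plug straight back into the rest of the TF graph.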

Is it possible to have a "safe_eval" for primitive math expressions in Python?

While there are many questions on SE regarding sand-boxing CPython, most focus on providing a more or less complete Python environment.
In this case, I'm interested in using Python for basic math expressions, something of this complexity, eg:
(sin(a) * cos(b) / tan(c) ** sqrt(d)) - e
Now I could create my own expression evaluator in Python, however I don't want to sacrifice performance (or have to maintain it and have good Python compatibility for all the corner cases).
I looked into numba and numexpr, both very interesting projects, but neither have security/sand-boxing as a goal.
I considered writing a minimal version of ceval.c which runs a restricted set of CPython's bytecodes, but this is still quite some effort.
Instead, I did a quick test that restricts the namespace and checks the compiled expression's opcodes before executing. From my initial tests this works well, though I'm not totally confident it's secure either.
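A minimal sketch of that approach (restricted namespace plus an opcode whitelist) might look like the following. The names and the opcode set are illustrative, cover roughly CPython 3.8-3.13, and are emphatically not a vetted security boundary:

```python
import dis
import math

# Names the expression is allowed to touch; no builtins are exposed.
SAFE_NAMES = {'sin': math.sin, 'cos': math.cos, 'tan': math.tan,
              'sqrt': math.sqrt}

# Opcodes sufficient for arithmetic and calls; anything else
# (attribute access, subscripting, imports...) is rejected.
SAFE_OPS = {'RESUME', 'LOAD_CONST', 'LOAD_NAME', 'LOAD_GLOBAL',
            'BINARY_OP', 'BINARY_ADD', 'BINARY_SUBTRACT', 'BINARY_MULTIPLY',
            'BINARY_TRUE_DIVIDE', 'BINARY_POWER', 'UNARY_NEGATIVE',
            'CALL', 'CALL_FUNCTION', 'PRECALL', 'PUSH_NULL', 'RETURN_VALUE'}

def safe_eval(expr, variables):
    code = compile(expr, '<expr>', 'eval')
    for instr in dis.get_instructions(code):
        if instr.opname not in SAFE_OPS:
            raise ValueError('disallowed opcode: %s' % instr.opname)
    namespace = {'__builtins__': {}}
    namespace.update(SAFE_NAMES)
    namespace.update(variables)
    return eval(code, namespace)

print(safe_eval('sqrt(a) + b ** 2', {'a': 9.0, 'b': 2.0}))  # 7.0

try:
    safe_eval('().__class__.__bases__', {})      # attribute access
except ValueError as exc:
    print('rejected:', exc)                      # LOAD_ATTR not whitelisted
```

The two checks back each other up: the opcode filter blocks attribute/item access that could reach dangerous objects, while the empty __builtins__ keeps names like __import__ and open unresolvable at run time.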
While this isn't clear-cut, using simple expressions means I don't necessarily need:
multiple lines of code.
import statements.
defining functions & classes.
getattr or getitem access.
for & while loops.
So my question is:
Is there a reliably secure way to execute math expressions in CPython that can co-exist in the same process as complete CPython scripts?
Note: since security may be too vague a term, for the purpose of discussion insecure means being able to do things like:
Reading or removing files on the user's system.
Running open or any functions in os or shutil.

What are the used/unused features of Python 3?

I recently did some web design as a hobby, with a primary motivation to learn interesting things. It was certainly nice to learn Python, but I found out too late that there had just been a Great Python Rewrite, so I essentially had to learn both Python 3 and 2.6.
I'm a newbie, so I'd like people to share what they think the strengths/weaknesses of Python 3 are, from the perspective of those who do end-user programming rather than language design. My question is more about what people actually like to the point of using it, or shun as unproductive or unpythonic.
For me, the with statement is a definite plus, while breaking the print statement is a definite minus.
Clarification edit: there are many posts that ask whether one should learn Python 2 or 3, or whether there is any difference at all. I see my question as different: I want feedback from people who, for whatever reason, chose to use Python 3 and have an opinion about what works better and what doesn't.
Another clarification: it has been pointed out in the answers that with is backported to 2.x. Apologies.
I'm not using Python 3 "in production" yet, but in playing around with it I've found that print being a function is a superb idea -- for example, I can easily put it in a lambda now, where in 2.x I had to use sys.stdout.write("%s\n" % foo), a bit crufty. Plus, the syntax for such tweaks as using an output file other than sys.stdout, or removing the final \n, is so much more readable than Python 2.x's!
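For instance, a quick sketch of both tweaks (the StringIO buffer here just stands in for any file-like object you might redirect to):

```python
import io

# print as a function drops straight into a lambda:
show = lambda x: print(x)

# ...and redirecting output or suppressing the trailing newline is just
# keyword arguments instead of sys.stdout.write gymnastics:
buf = io.StringIO()
print('no newline', end='', file=buf)
print(' and more', file=buf)
print(buf.getvalue())  # 'no newline and more\n'
```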
BTW, with is also in recent Python 2.x versions; it's not Python 3-exclusive.
Well, a strong point is the clarification between bytes and strings. How many times in your short Python experience have you been confused by an unclear UnicodeDecodeError or UnicodeEncodeError? If you've never had trouble with unicode vs. bytestrings, chances are you're using an ASCII-only language (English? ;)), but this is usually the hardest concept for beginners to grasp. (By the way, if you're still confused, this link should help for Python 2.x.)
I really think that this distinction between str and bytes is one of the strong points of Python 3.0. Read PEP 358 for the formal description, and the diveintopython chapter for something more end-user oriented. This new feature forces developers to maintain a clear distinction between unicode objects and bytes objects encoded in a specific encoding. I believe this change will help newcomers understand the difference between the two structures more easily, and will help experienced developers use sane programming methods.
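A tiny illustration of the distinction (any non-ASCII string works; 'héllo' is just an example):

```python
text = 'héllo'                # str: a sequence of Unicode code points
data = text.encode('utf-8')   # bytes: one specific encoding of that text

print(type(text).__name__, len(text))   # str 5
print(type(data).__name__, len(data))   # bytes 6 ('é' is two bytes in UTF-8)
print(data.decode('utf-8') == text)     # True

# Mixing them is now a TypeError instead of a silent implicit conversion:
try:
    text + data
except TypeError:
    print('cannot concatenate str and bytes')
```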
But of course this change has its own inconveniences: porting 2.x applications is quite difficult, and the str+unicode to str+bytes change is the most annoying thing to deal with if you are not already clearly separating Unicode and byte strings in your 2.x code. Annoying, but long needed.
Those breaking changes look annoying to a lot of users, and... are annoying to implement for important libraries/solutions. The current strength of Python 2.x is its numerous third-party applications/modules, but because it is sometimes non-trivial to port to Python 3, those third-party apps will need some time to be ported (and because 2.x is still alive, those applications will need to maintain two versions: one aimed at 2.x clients, and one at 3.x... costly maintenance!). For the next year, the number of fully-fledged applications running on Python 3 will likely be quite low, because of the low number of Python 3-compatible third parties. But again, I strongly support these breaking changes: have you read the Monkey, banana, Python(3) and fire hose tale? ;)
I think everything they did was for the best, in the long run. They removed a lot of the deprecated ways to do things, thus enforcing "There's Only One Way to Do It" and increasing consistency. Also, the with statement is awesome.
The obvious problem with using Python 3 is its lack of support for a lot of [big] libraries out there (such as Django). If none of your libraries break with Python 3, there's no reason not to use it.
I really like dictionary comprehension:
{k: v for k, v in stuff}
And extended iterable unpacking:
(head, *rest) = range(5)
This is really subjective. Python 3.x is certainly an improvement over 2.x. It contains long-anticipated changes like dictionary comprehensions, an ordered dictionary, more powerful string formatting, etc. Not to mention a cleaner standard library.
