I have a constant flow of medium-sized ndarrays (each around 10-15 MB in memory) on which I call ndarray.tobytes() before sending them to the next part of the pipeline.
Currently it takes about 70-100ms per array serialization.
I was wondering, is this the fastest that this could be done or is there a faster (maybe not as pretty) way to accomplish that?
Clarification: the arrays are images, the next step in the pipeline is a C++ function, and I don't want to save them to a file.
There is no need to serialize them at all! You can let C++ read the memory directly. One way is to invoke a C++ function with the PyObject which is your NumPy array. Another is to let C++ allocate the NumPy array in the first place and populate the elements in Python before returning control to C++, for which I have some open source code built atop Boost Python that you can use: https://github.com/jzwinck/pccl/blob/master/NumPyArray.hpp
Your goal should be "zero copy," meaning you never copy the bytes of the array; you only copy references to the array (or to data within it) plus its dimensions.
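If your C++ side can also be exposed as a plain C-callable shared library, here is a minimal sketch of the same zero-copy idea using ctypes instead of Boost Python (the library name and the process_image signature below are made up for illustration):

import ctypes
import numpy as np

# Hypothetical shared library exposing:
#   void process_image(const uint8_t* data, int rows, int cols);
lib = ctypes.CDLL("./libimage_pipeline.so")
lib.process_image.argtypes = (ctypes.POINTER(ctypes.c_uint8),
                              ctypes.c_int, ctypes.c_int)
lib.process_image.restype = None

def send_to_cpp(img):
    # No copy happens here if the image is already a C-contiguous uint8 array.
    img = np.ascontiguousarray(img, dtype=np.uint8)
    ptr = img.ctypes.data_as(ctypes.POINTER(ctypes.c_uint8))
    # Only the pointer and the dimensions cross the boundary; the 10-15 MB
    # of pixel data are never copied or serialized.
    lib.process_image(ptr, img.shape[0], img.shape[1])

Whatever binding layer you use, the design choice is the same: only a pointer plus the shape (and strides, if needed) crosses the language boundary.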
I'm working on improvements to a Rust codebase that uses the ndarray crate to manipulate arrays. I have one question for which I could not find an explicit answer in the documentation.
Is it more efficient to pass an instance of ArrayView as an argument to a function, or should I use a reference to an Array instead? My intuition is that since ArrayView is a view of an array, passing it only hands the function a view of the data and does not grant ownership of (and hence does not copy) the underlying data.
In short, is there any speed gain to expect from switching from passing instances of ArrayView to passing references to Array?
My goal is to avoid useless memory allocation/duplication which can be very costly when dealing with large arrays.
ArrayBase is a generic struct that can act as both an ArrayView and an Array, so I assume you mean a reference to the owned data, i.e. an Array.
Neither version will clone the array, so they should be approximately equally efficient. You can always benchmark to verify this.
As I see it, the difference is mostly that ArrayView will make the function more flexible – you can pass in parts of larger arrays, or an ArrayView created from a slice, whereas the variant that takes a reference to Array can only be called when you really have an Array of the desired size.
I am using the Neo library for linear algebra in Nim, and I would like to extract arbitrary rows from a matrix.
I can explicitly select a contiguous range of rows as in the examples in the README, but I can't select a disjoint subset of rows.
import neo
let x = randomMatrix(10, 4)
let some_rows = @[1, 3, 5]
echo x[2..4, All]       # works fine
echo x[some_rows, All]  # error
The first echo works because you are creating a Slice object, which neo has defined a proc for. The second echo uses a sequence of integers, and that kind of access is not defined in the neo library. Unfortunately, Slices define contiguous closed ranges; you can't even specify a step to iterate in increments bigger than one, so there is no way to accomplish what you want directly.
Looking at the structure of a Matrix, it seems to be highly optimised to avoid copying data: matrix transformation operations reuse the data of the previous matrix and only change the access pattern/dimensions. As such, a transformation with arbitrary row indices would not be possible; the indexes in your example access non-contiguous data, and that would need to be encoded somehow in the new structure. Plus, if you wrote @[1, 5, 3], that would defeat any kind of normal iterative looping.
An alternative, of course, is to write a proc which accepts a sequence instead of a slice and builds a new matrix by copying data from the old one. This implies a performance penalty, but if you think it would be a good addition to the library, please request it in the project's issue tracker. If it is not accepted, you will need to write such a proc yourself for use in your own programs.
from numpy import *
arr1=array([1,2,3])
arr2=arr1 #aliasing
arr3=arr1.view() #shallow copy
arr4=arr1.copy() #deep copy
id(arr1) #120638624
id(arr2) #120638624
id(arr3) #120639004
id(arr4) #123894390
I know about shallow copy and deep copy from C/C++, but what exactly is happening here in Python?
Look at the C++ code below; is the same thing happening there?
int main()
{
    int arr[] = {1, 2, 3};
    int (&a)[3] = arr; // aliasing
    int* b = arr;      // shallow copy
    int c[3];          // deep copy
    int i;
    for (i = 0; i < 3; i++)
        c[i] = arr[i];
}
You have aliasing and deep copy right (though copying array values in a for-loop is not usually considered a good way to do it).
On the other hand, a Numpy view is not a pointer. It's a much heavier-duty thing, and a proper object instance in its own right. Conceptually, it's the closest thing to an actual pointer-to-array that exists in Python (though the semantics are of course different), and it can fulfill some of the same roles in your code. A view will never be as performant as a raw pointer, since the view needs to carry around its own metadata, such as shape and strides, which may differ from those of its "parent" array.
On the other-other hand, both Numpy arrays and views wrap the __array_interface__, which in turn wraps a pointer to the underlying buffer that holds the actual data. So when you make a new view of an array, you do end up making a proper shallow copy of the underlying data, since you make a copy of the pointer to that data (albeit through several layers of wrapping and indirection).
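To make this concrete, here is a small sketch you can run (np.shares_memory reports whether two arrays overlap in memory):

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = arr1          # aliasing: just another name for the same object
arr3 = arr1.view()   # new ndarray object, same underlying buffer
arr4 = arr1.copy()   # new ndarray object, new buffer

print(arr2 is arr1)                  # True  (the alias is the same object)
print(arr3 is arr1)                  # False (a distinct ndarray object)
print(arr3.base is arr1)             # True  (it points back to arr1)
print(np.shares_memory(arr1, arr3))  # True  (the "shallow copy": same buffer)
print(np.shares_memory(arr1, arr4))  # False (the "deep copy": its own buffer)

arr3[0] = 99
print(arr1[0])  # 99, writing through the view changed the original
print(arr4[0])  # 1, the copy is unaffected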
It doesn't appear that nested Vecs work with wasm-bindgen. Is that correct?
My goal is to have a Game of Life grid in Rust that I can return as rows, rather than a 1D Vec which requires the JavaScript to handle the indexing. Two workarounds I've thought of are:
Implement a sort of custom "iterator" in Rust: a method that returns the rows one by one.
Hand a 1D array to JavaScript but write a wrapper in JavaScript which handles the indexing and exposes some sort of an iterator to the consumer.
I hesitate to use either of these because I want this library to be usable by JavaScript and native Rust, and I don't think either would be very idiomatic in pure Rust land. Any other suggestions?
You're correct that wasm-bindgen today doesn't support returning types like Vec<Vec<u8>>.
A good rule of thumb for WebAssembly is that big chunks of data (like vectors) should stay in one place rather than being copied back and forth across the JS/WASM boundary, to avoid losing too much performance. This means you might want to explore an interface where a JS object wraps a pointer into WASM memory, and all of its methods work with row/column indices but read and modify WASM memory, keeping it the single source of truth.
If that doesn't work out, then the best way to implement this today is one of the two strategies you mentioned, although both require some JS glue code to be written.
I'm trying to implement a hashing function for numpy arrays so that I can quickly look them up in a big list of arrays, but almost every hashing function I find needs a reduction with more than one operation per element, for example:
import numpy as np

# Standard 64-bit FNV-1 parameters
FNV_offset_basis = 14695981039346656037
FNV_prime = 1099511628211

def fnv_hash(arr):
    result = FNV_offset_basis
    for v in arr.view(dtype=np.uint8):
        result *= FNV_prime
        result ^= v
    return result
It applies two operations to the result variable in each iteration, which (I think) cannot be expressed using only reduce calls on numpy ufuncs (i.e. numpy.ufunc.reduce).
I want to avoid plain Python loops, since they do not treat numpy arrays as contiguous memory regions (which makes them slow), and I don't want to use hashlib functions. Also, wrapping a function with numpy.vectorize and the like (which, as the documentation says, is essentially just a for loop) does not help performance.
Unfortunately I cannot use numba.jit because, as I'm working with large arrays, I need to run my code on a cluster which doesn't have numba installed. The same goes for xxhash.
My solution so far is to use a simple hashing function such as:
def my_hash(arr):
    indices = np.arange(arr.shape[0])
    return int((arr * ((1 << (indices * 5)) - indices)).sum())
This is reasonably fast (it isn't the actual function code; I made some optimizations in my script, but I can assure you the output is the same), but it produces some unwanted collisions.
In short: I want to implement a good hashing function using only numpy operations, as my arrays and my search space are enormous.
Thanks in advance.
Since your arrays are enormous, but you don't have many of them, did you try hashing just a part of each array? For example, even if arr is huge, hash(tuple(arr[:10**6])) takes ~60 ms on my machine and is probably unique enough to distinguish 10k different arrays, depending on how they were generated.
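A rough sketch of that idea (here I call tobytes() on the slice rather than building a tuple, which avoids creating a million Python objects, but the principle is the same; the sizes and names are just for illustration):

import numpy as np

def prefix_hash(arr, n=1_000_000):
    # Hash only the raw bytes of a fixed-size prefix of the array.
    return hash(arr[:n].tobytes())

# Hypothetical usage: index a list of large arrays by their prefix hash.
arrays = [np.random.rand(2_000_000) for _ in range(10)]
index = {prefix_hash(a): i for i, a in enumerate(arrays)}
print(index[prefix_hash(arrays[3])])  # 3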
If this won't solve your problem, any additional context you can give on your problem would be helpful. How and when are these arrays generated? Why are you trying to hash them?