Suppose we have two dictionaries:
d1 = {"a": 1, "b": 2}
d2 = {"x": 4, "y": 2}
I want to compare only the values of both dictionaries, regardless of which keys they have.
How can I do this? Any suggestions are appreciated.
You can convert the values to NumPy arrays and then broadcast the == operator as follows:
import numpy as np
d1={"a":1,"b":2}
d2={"x":4,"y":2}
np.array(list(d1.values())) == np.array(list(d2.values()))
output:
array([False, True])
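Note that this compares the values positionally, in each dict's insertion order. If you instead want to know whether the two dicts hold the same set of values regardless of position, one option is to sort the values first; a minimal sketch:

```python
import numpy as np

d1 = {"a": 1, "b": 2}
d2 = {"x": 4, "y": 2}

# Positional comparison (depends on each dict's insertion order)
positional = np.array(list(d1.values())) == np.array(list(d2.values()))

# Order-independent comparison of the sorted values
same_values = np.array_equal(np.sort(list(d1.values())), np.sort(list(d2.values())))
print(positional, same_values)
```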
Given a numpy array a, is there any alternative to
a[-1]
to get the last element?
The idea is to have some aggregating NumPy method, such as
np.last(a)
that could be passed to a function to operate on a numpy array:
import numpy as np
def operate_on_array(a: np.ndarray, np_method_name: str):
method = getattr(np, np_method_name)
return method(a)
This works for methods such as np.mean or np.sum, but I have not been able to find a NumPy method that returns the last or first element of an array.
What about a lambda?
lambda x: x[-1]
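For instance, operate_on_array from the question could be generalized to accept either a method name or any callable, so a lambda works alongside np.mean; a sketch:

```python
import numpy as np

def operate_on_array(a, np_method):
    # Accept either a NumPy method name (str) or any callable, e.g. a lambda
    method = getattr(np, np_method) if isinstance(np_method, str) else np_method
    return method(a)

a = np.array([1, 2, 3])
print(operate_on_array(a, "mean"))           # 2.0
print(operate_on_array(a, lambda x: x[-1]))  # 3
```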
I have a pandas dataframe of shape (18837349,2000) and a 3D Numpy Array of shape (18837349,6,601). I want to shuffle the rows of my dataframe and the first dimension of my Numpy Array in unison. I know how to shuffle a dataframe:
df_shuffle = df.sample(frac=1).reset_index(drop=True)
But I don't know how to do it together with a 3D Numpy Array. Insights will be appreciated.
You can shuffle an array of indices and use it to reorder both objects:
ix = np.arange(len(your_df))
np.random.shuffle(ix)
df_shuffle, array_shuffle = your_df.iloc[ix].reset_index(drop=True), your_array[ix]
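A small runnable sketch with toy data (shapes shrunk from the question) that checks the rows stay aligned after shuffling:

```python
import numpy as np
import pandas as pd

# Toy data: 10 rows instead of 18837349
df = pd.DataFrame({"id": np.arange(10), "val": np.random.random(10)})
arr = np.arange(10 * 6 * 601).reshape(10, 6, 601)  # arr[k, 0, 0] == k * 6 * 601

# Shuffle a shared index and apply it to both objects
ix = np.arange(len(df))
np.random.shuffle(ix)
df_shuffle = df.iloc[ix].reset_index(drop=True)
array_shuffle = arr[ix]

# Row i of df_shuffle still corresponds to slice i of array_shuffle
assert (df_shuffle["id"].to_numpy() == array_shuffle[:, 0, 0] // (6 * 601)).all()
```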
I am looking for a single vector with the values [(0:400) (-400:-1)].
Can anyone help me write this in Python?
Use np.arange to generate each range and np.concatenate to join them into a single vector:
import numpy as np
arr = np.concatenate([np.arange(401), np.arange(-400, 0)])
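np.r_ offers a more compact way to build the same concatenated vector from slice notation:

```python
import numpy as np

# 401 values 0..400 followed by 400 values -400..-1, in one flat vector
arr = np.r_[0:401, -400:0]
print(arr.shape)  # (801,)
```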
I have a numpy ndarray in this form:
inputs = np.array([[1],[2],[3]])
How can I convert this ndarray to a deque (collections.deque) so that the structure gets preserved (an array of arrays) and I can apply normal deque methods such as popleft() and append()? For example:
inputs.popleft()
->>> [[2],[3]]
inputs.append([4])
->>> [[2],[3], [4]]
I think you can pass inputs directly to deque:
from collections import deque
i = deque(inputs)
In [1050]: i
Out[1050]: deque([array([1]), array([2]), array([3])])
In [1051]: i.popleft()
Out[1051]: array([1])
In [1052]: i
Out[1052]: deque([array([2]), array([3])])
In [1053]: i.append([4])
In [1054]: i
Out[1054]: deque([array([2]), array([3]), [4]])
Later on, when you want a NumPy array back, just pass the deque to np.array:
np.array(i)
Out[1062]:
array([[2],
[3],
[4]])
Alternatively, you can convert each row to a plain list first:
import collections
import numpy as np
inputs = np.array([[1],[2],[3]])
inputs = collections.deque([list(i) for i in inputs])
inputs.append([4])
inputs.popleft()
I am looking for the best way to compute many dask delayed objects stored in a dataframe. I am unsure whether the pandas dataframe should be converted to a dask dataframe with delayed objects within, or whether compute should be called on all values of the pandas dataframe.
I would appreciate any suggestions in general, as I am having trouble with the logic of passing delayed object across nested for loops.
import numpy as np
import pandas as pd
from scipy.stats import hypergeom
from dask import delayed, compute
steps = 5
sample = [int(x) for x in np.linspace(5, 100, num=steps)]
enr_df = pd.DataFrame()
for N in sample:
    enr = []
    for i in range(20):
        k = np.random.randint(1, 200)
        enr.append(delayed(hypergeom.sf)(k=k, M=10000, n=20, N=N, loc=0))
    enr_df[N] = enr
I cannot call compute on this dataframe without applying the function across all cells like so: enr_df.applymap(compute) (which I believe calls compute on each value individually).
However if I convert to a dask dataframe the delayed objects I want to compute are layered in the dask dataframe structure:
enr_dd = dd.from_pandas(enr_df, npartitions=1)
enr_dd.compute()
And the computation does not produce the output I expect.
You can pass a list of delayed objects into dask.compute
results = dask.compute(*list_of_delayed_objects)
So you need to get a list from your Pandas dataframe. This is something you can do with normal Python code.
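For example, the delayed objects can be pulled out of the dataframe with .values.flatten() and computed in one call; a sketch rebuilding a small version of the question's dataframe:

```python
import numpy as np
import pandas as pd
import dask
from dask import delayed
from scipy.stats import hypergeom

# Rebuild a small version of the dataframe of delayed objects from the question
enr_df = pd.DataFrame()
for N in [5, 100]:
    enr_df[N] = [delayed(hypergeom.sf)(k=np.random.randint(1, 200), M=10000, n=20, N=N)
                 for _ in range(3)]

# Flatten the dataframe into a plain Python list of delayed objects,
# then compute them all together so dask can schedule them in one graph
delayed_objects = enr_df.values.flatten().tolist()
results = dask.compute(*delayed_objects)
print(len(results))  # 6
```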