Does anyone know why
x = df.select(["A"]).unwrap().to_ndarray::<Float64Type>().unwrap()
is a 2-D array when I want it to be a 1-D array? Is there a function to reshape it to 1-D? Here the shape of x is (100, 1).
The type of x is ArrayBase<OwnedRepr<f64>, Dim<[usize; 2]>> (i.e. Array2<f64>).
You select a list of columns, the list having length 1, so to_ndarray returns a 2-D array: if the row count is say 100, the resulting array has dimension (100, 1). To get a 1-D array, take the single column from the ndarray, e.g. x.column(0).to_owned(), or reshape it with into_shape.
I have a numpy array representing an RGB image. Its shape is (n, m, 3): n rows, m columns, and 3 channels. I want to convert it to a list of RGB values along with the corresponding indices.
I can convert it to a list of RGB values, but I am trying to get the row and column indices alongside as well.
We can do something like this for the RGB values only:
flat_image = np.reshape(image, [-1, 3])  # shape = [n*m, 3]
After also adding the row and column numbers, the shape should be [n*m, 3+2],
so the first three columns in the flat image represent RGB, the fourth column represents the row number from the original image array, and the fifth column represents the column number.
You can use numpy.indices to construct the row/column indices, reshape them to match flat_image, and then concatenate:
indices = np.indices(image.shape[:-1])       # shape (2, n, m)
index_cols = indices.reshape(2, -1).T        # shape (n*m, 2): [row, col] per pixel
result = np.concatenate([flat_image, index_cols], axis=-1)  # shape (n*m, 5)
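As a quick sanity check, here is a minimal, self-contained sketch on a hypothetical 2x2 image (the array values are made up for illustration):

```python
import numpy as np

# Hypothetical 2x2 RGB "image" with distinct values per pixel/channel
image = np.arange(12).reshape(2, 2, 3)                      # shape (n, m, 3)

flat_image = np.reshape(image, [-1, 3])                     # shape (n*m, 3)
indices = np.indices(image.shape[:-1])                      # shape (2, n, m)
index_cols = indices.reshape(2, -1).T                       # shape (n*m, 2)
result = np.concatenate([flat_image, index_cols], axis=-1)  # shape (n*m, 5)

print(result)
# [[ 0  1  2  0  0]
#  [ 3  4  5  0  1]
#  [ 6  7  8  1  0]
#  [ 9 10 11  1  1]]
```

Each row is [r, g, b, row_index, col_index], as requested.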
I have two tensors. One is a batch of N KxK matrices, i.e. an NxKxK tensor called A. The other is an MxNxK tensor called B. I want a new MxNxK tensor where the i'th transposed row of each NxK slice from B is multiplied by the i'th KxK matrix from A, forming the i'th transposed row of the output's NxK slice. This is done for all NxK slices of B.
Because the KxK matrices in A are lower-triangular, it may be easier to form A from upper-triangular matrices instead and avoid the transpose operations, multiplying rows from B directly by KxK upper-triangular matrices.
I attached a screenshot to be more precise.
It seems the solution is
torch.einsum('pts,tsk->ptk', B, A)
if A is upper-triangular.
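A small check of that contraction pattern, written with numpy.einsum so it runs without torch (the subscript string 'pts,tsk->ptk' is the same; the shapes M, N, K below are made up):

```python
import numpy as np

M, N, K = 2, 3, 4
rng = np.random.default_rng(0)
B = rng.standard_normal((M, N, K))           # batch of M slices, each N rows of length K
A = np.triu(rng.standard_normal((N, K, K)))  # N upper-triangular K-by-K matrices

# result[p, t, :] = B[p, t, :] @ A[t] -- each row of B times its matching matrix
result = np.einsum('pts,tsk->ptk', B, A)

# The same thing as an explicit loop, for verification
expected = np.stack([np.stack([B[p, t] @ A[t] for t in range(N)]) for p in range(M)])
print(result.shape, np.allclose(result, expected))  # (2, 3, 4) True
```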
We have several input arrays, like
in_1=np.array([0.4,0.7,0.8,0.3])
in_2=np.array([0.9,0.8,0.6,0.4])
I need to create two outputs like
out_1=np.array([0,0,1,0])
out_2=np.array([1,1,0,0])
So, a given element of an output array is 1 if the value at that position in the corresponding input array is greater than 0.5 AND greater than the values of all the other arrays at that position. What is an efficient way to do this?
You can aggregate all the input arrays into a single matrix, where each row represents one input array. That way all the output arrays can be computed together as a single matrix.
The code could look something like this:
import numpy as np

# input matrix corresponding to the example input arrays given in the question
in_matrix = np.array([[0.4, 0.7, 0.8, 0.3],
                      [0.9, 0.8, 0.6, 0.4]])
out_matrix = np.zeros(in_matrix.shape)

# each element is the maximum of the corresponding column of in_matrix
max_values = np.max(in_matrix, axis=0)

# compute the values in the output matrix row by row
for n, row in enumerate(in_matrix):
    out_matrix[n] = np.logical_and(row > 0.5, row == max_values)
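The loop can also be replaced by a single vectorized expression (a sketch; note that a tie for a column maximum would mark more than one row there):

```python
import numpy as np

in_matrix = np.array([[0.4, 0.7, 0.8, 0.3],
                      [0.9, 0.8, 0.6, 0.4]])

# 1 where the value exceeds 0.5 AND equals the column-wise maximum
out_matrix = ((in_matrix > 0.5) & (in_matrix == in_matrix.max(axis=0))).astype(int)
print(out_matrix)
# [[0 0 1 0]
#  [1 1 0 0]]
```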
The data from my files is stored in 4D arrays in python of shape (64,128,64,3). The code I run is in a grid code format, so the shape tells us that there are 64 cells in the x,128 in the y, and 64 in the z. The 3 is the x, y, and z components of velocity. What I want to do is compute the average x velocity in each direction for every cell in y.
Let's start in the corner of my grid. I want the first element of my average array to be the average of the x velocity of all the x cells and all the z cells in position y[0]. The next element should be the same, but for y[1]. The end result should be an array of shape (128).
I'm fairly new to Python, so I could be missing something simple, but I don't see a way to do this with one np.mean statement, because you need to average over two axes (in this case, 0 and 2, I think). I tried
velx_avg = np.mean(ds['u'][:,:,:,0],axis=1)
here, ds is the data set I've loaded in, and the module I've used to load it stores the velocity data under 'u'. This gave me an array of shape (64,64).
What is the most efficient way to produce the result that I want?
You can use the flatten method to make your life much easier here; it takes an np.ndarray and flattens it into one dimension.
The challenge is pinning down your definition of 'efficient', and you can experiment with that yourself. To do what you want, I simply iterate over the y axis, flatten the x and z components into one contiguous array, and take the mean of that; see below:
velx_avg = np.mean([ds['u'][:, i, :, 0].flatten() for i in range(128)], axis=1)
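For comparison, np.mean also accepts a tuple of axes, which averages over x and z in one call. The snippet below uses random data as a stand-in for ds['u'] (the variable names are hypothetical):

```python
import numpy as np

# Synthetic stand-in for ds['u']: (x, y, z, component) = (64, 128, 64, 3)
rng = np.random.default_rng(0)
u = rng.standard_normal((64, 128, 64, 3))

# Average the x component over the x and z axes at once
velx_avg = np.mean(u[:, :, :, 0], axis=(0, 2))
print(velx_avg.shape)  # (128,)

# Matches the flatten-per-y-slice approach
velx_avg_loop = np.mean([u[:, i, :, 0].flatten() for i in range(128)], axis=1)
print(np.allclose(velx_avg, velx_avg_loop))  # True
```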
I have a function that is supposed to chain-link a list of daily returns in a DataFrame, but when I pass the column, the function returns a Series rather than a float.
def my_aggfunc(x):
    y = np.exp(np.log1p(x).cumsum())
    return y
If, however, I change the second line to use
np.sum(x)
it returns a float.
Any ideas, please?
np.log1p(x) is an array.
np.log1p(x).cumsum() is another array of the same size.
np.exp(np.log1p(x).cumsum()) is yet another array.
I'm assuming you didn't want cumsum; you wanted sum:
np.exp(np.log1p(x).sum())
From the np.exp docs:
Calculate the exponential of all elements in the input array.
Returns: out : ndarray Output array, element-wise exponential of x.
So y is an array.
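A short illustration of the difference, using a few made-up daily returns:

```python
import numpy as np

x = np.array([0.01, -0.02, 0.03])  # hypothetical daily returns

# cumsum keeps one running value per element, so the result is an array
# (the cumulative growth factor after each day)
running = np.exp(np.log1p(x).cumsum())
print(running.shape)  # (3,)

# sum collapses to a single value: the total compounded growth factor
total = np.exp(np.log1p(x).sum())
print(np.isclose(total, 1.01 * 0.98 * 1.03))  # True
```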