Numpy append 3D vectors without flattening [duplicate] - python-3.x

I have the following array:
video_132.shape
Out[64]: (64, 3)
to which I would like to append a new vector of three values,
video_146[1][146][45]
such that
video_146[1][146][45].shape
Out[68]: (3,)
and
video_146[1][146][45]
Out[69]: array([217, 207, 198], dtype=uint8)
When I do the following
np.append(video_132,video_146[1][146][45])
I expect to get
video_132.shape
Out[64]: (65, 3) # originally (64,3)
However, I get:
Out[67]: (195,) # 64*3+3=195
It seems that np.append flattens the array.
How can I do the append while preserving the (N, 3) shape?

For visual simplicity let's rename video_132 --> a, and video_146[1][146][45] --> b. The particular values aren't important so let's say
In [82]: a = np.zeros((64, 3))
In [83]: b = np.ones((3,))
Then we can append b to a using:
In [84]: np.concatenate([a, b[None, :]]).shape
Out[84]: (65, 3)
Since np.concatenate returns a new array, reassign its return value to a to "append" b to a:
a = np.concatenate([a, b[None, :]])
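As a quick check of the approach above, here is a minimal sketch using the same a and b. The np.vstack call is an alternative not mentioned in the answer; it is included on the assumption that stacking rows is all that is needed.

```python
import numpy as np

a = np.zeros((64, 3))
b = np.ones((3,))

# b[None, :] has shape (1, 3), so it can be joined to a along axis 0.
appended = np.concatenate([a, b[None, :]])

# np.vstack promotes 1-D inputs to 2-D rows itself, so b can be passed as-is.
stacked = np.vstack([a, b])

print(appended.shape, stacked.shape)
```

Both calls return new arrays; neither modifies a in place.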

Code for append:
def append(arr, values, axis=None):
    arr = asanyarray(arr)
    if axis is None:
        if arr.ndim != 1:
            arr = arr.ravel()
        values = ravel(values)
        axis = arr.ndim-1
    return concatenate((arr, values), axis=axis)
Note how arr is raveled if no axis is provided:
In [57]: np.append(np.ones((2,3)),2)
Out[57]: array([1., 1., 1., 1., 1., 1., 2.])
append is really aimed at simple cases like adding a scalar to a 1d array:
In [58]: np.append(np.arange(3),6)
Out[58]: array([0, 1, 2, 6])
Otherwise the behavior is hard to predict.
concatenate is the base operation (a builtin) and takes a list of arrays, not just two. So we can collect many arrays (or lists) in one list and do a single concatenate at the end of a loop. And since it doesn't adjust the dimensions beforehand, it forces us to do that ourselves.
So to add a shape (3,) array to a (64,3) array we have to transform that (3,) into (1,3). append requires the same dimension adjustment as concatenate if we specify the axis:
In [68]: np.append(arr,b[None,:], axis=0).shape
Out[68]: (65, 3)
In [69]: np.concatenate([arr,b[None,:]], axis=0).shape
Out[69]: (65, 3)
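The "collect in a list, concatenate once" pattern mentioned above could look like the sketch below. The loop contents, the names rows, base and block, and the use of np.stack to build the (5, 3) block are illustrative assumptions, not part of the original answer.

```python
import numpy as np

base = np.zeros((64, 3), dtype=np.uint8)

rows = []  # a plain Python list; list.append is cheap and does not copy arrays
for i in range(5):
    # Each new vector has shape (3,), like video_146[1][146][45] in the question.
    rows.append(np.full((3,), i, dtype=np.uint8))

# Turn the list of (3,) vectors into one (5, 3) block, then concatenate once.
block = np.stack(rows)
result = np.concatenate([base, block], axis=0)

print(result.shape)
```

Concatenating once at the end avoids copying the growing array on every iteration, which is what repeated np.append or np.concatenate calls inside the loop would do.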

Related

Expand the tensor by several dimensions

In PyTorch, given a tensor of size=[3], how to expand it by several dimensions to the size=[3,2,5,5] such that the added dimensions have the corresponding values from the original tensor. For example, making size=[3] vector=[1,2,3] such that the first tensor of size [2,5,5] has values 1, the second one has all values 2, and the third one all values 3.
In addition, how to expand the vector of size [3,2] to [3,2,5,5]?
One way I can think of is to create a tensor of the same size with ones_like and then use einsum, but I think there should be an easier way.
You can first unsqueeze the appropriate number of singleton dimensions, then expand to a view at the target shape with torch.Tensor.expand:
>>> x = torch.rand(3)
>>> target = [3,2,5,5]
>>> x[:, None, None, None].expand(target)
A nice workaround is to use torch.Tensor.reshape or torch.Tensor.view to perform multiple unsqueezes at once:
>>> x.view(-1, 1, 1, 1).expand(target)
This allows for a more general approach to handle any arbitrary target shape:
>>> x.view(len(x), *(1,)*(len(target)-1)).expand(target)
For an even more general implementation, where x can be multi-dimensional:
>>> x = torch.rand(3, 2)
# just to make sure the target shape is valid w.r.t. x
>>> assert list(x.shape) == list(target[:x.ndim])
>>> x.view(*x.shape, *(1,)*(len(target)-x.ndim)).expand(target)
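The general recipe above can be wrapped in a small function. This is a sketch only: expand_trailing is a hypothetical name, and it assumes x is contiguous (so that view works) and that the leading dimensions of target match x.shape.

```python
import torch

def expand_trailing(x, target):
    """Hypothetical helper: append singleton dims to x, then expand to target."""
    target = list(target)
    if list(x.shape) != target[:x.ndim]:
        raise ValueError("leading dimensions of target must match x.shape")
    extra = len(target) - x.ndim
    # view adds the trailing singleton dims; expand broadcasts them without copying.
    return x.view(*x.shape, *([1] * extra)).expand(target)

v = torch.tensor([1., 2., 3.])
out = expand_trailing(v, [3, 2, 5, 5])       # out[i] is filled with v[i]

m = torch.arange(6.).reshape(3, 2)
out2 = expand_trailing(m, [3, 2, 5, 5])      # out2[i, j] is filled with m[i, j]
```

Because expand returns a view, writing into the result in place is not safe; call .clone() or .contiguous() first if an independent tensor is needed.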

Slicing a tensor with a dimension varying

I'm trying to slice a PyTorch tensor my_tensor of dimensions s x b x c so that the slicing along the first dimension varies according to a tensor indices of length b, to the effect of:
my_tensor[0:indices, torch.arange(0, b, dtype=torch.long), :] = something
The code above doesn't work and receives the error TypeError: tuple indices must be integers or slices, not tuple.
What I'm aiming for is, for example, if indices = torch.tensor([3, 5, 4]) then:
my_tensor[0:3, 0, :] = something
my_tensor[0:5, 1, :] = something
my_tensor[0:4, 2, :] = something
I'm hoping for a tensorized way to do this so I don't have to resort to a for loop. Also, the method needs to be compatible with TorchScript. Thanks very much.
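One possible loop-free approach, offered as a sketch rather than a confirmed answer, is to build a boolean mask of shape (s, b) and assign through it. The sizes below and the value 1.0 standing in for "something" are assumptions, and TorchScript compatibility has not been checked here.

```python
import torch

s, b, c = 6, 3, 2                      # example sizes (assumed)
my_tensor = torch.zeros(s, b, c)
indices = torch.tensor([3, 5, 4])      # one slice end per column of dim 1

# mask[i, j] is True when row i lies inside the slice 0:indices[j].
mask = torch.arange(s).unsqueeze(1) < indices.unsqueeze(0)   # shape (s, b)

# A (s, b) boolean mask indexes the first two dims; the value broadcasts over c.
my_tensor[mask] = 1.0
```

If "something" is a tensor rather than a scalar, it has to broadcast against the selected elements, which have shape (mask.sum(), c).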

slice Pytorch tensors which are saved in a list

I have the following code segment to generate random samples. The generated samples are stored in a list, where each entry of the list is a tensor. Each tensor has two elements. I would like to extract the first element from all tensors in the list, and extract the second element from all tensors in the list as well. How can I perform this kind of tensor slice operation?
import torch
import pyro  # needed for pyro.sample below
import pyro.distributions as dist
num_samples = 250
# note that both covariance matrices are diagonal
mu1 = torch.tensor([0., 5.])
sig1 = torch.tensor([[2., 0.], [0., 3.]])
dist1 = dist.MultivariateNormal(mu1, sig1)
samples1 = [pyro.sample('samples1', dist1) for _ in range(num_samples)]
samples1
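I'd recommend torch.cat with a list comprehension. torch.cat cannot concatenate zero-dimensional tensors such as t[0], so take length-1 slices instead:
col1 = torch.cat([t[0:1] for t in samples1])
col2 = torch.cat([t[1:2] for t in samples1])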
Docs for torch.cat: https://pytorch.org/docs/stable/generated/torch.cat.html
ALTERNATIVELY
You could turn your list of 1D tensors into a single big 2D tensor using torch.stack, then do a normal slice:
samples1_t = torch.stack(samples1)
col1 = samples1_t[:, 0] # : means all rows
col2 = samples1_t[:, 1]
Docs for torch.stack: https://pytorch.org/docs/stable/generated/torch.stack.html
I should mention that PyTorch tensors support unpacking out of the box: you can unpack the first axis into multiple variables without any extra work. Here torch.stack outputs a tensor of shape (rows, cols), so we just need to transpose it to (cols, rows) and unpack:
>>> c1, c2 = torch.stack(samples1).T
So you get c1 and c2 shaped (rows,):
>>> c1
tensor([0.6433, 0.4667, 0.6811, 0.2006, 0.6623, 0.7033])
>>> c2
tensor([0.2963, 0.2335, 0.6803, 0.1575, 0.9420, 0.6963])
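The stack-then-unpack idea can be tried without pyro. In the sketch below the list of samples is replaced by plain torch.randn tensors (an assumption, purely so the example is self-contained), and torch.Tensor.unbind is shown as an equivalent way to split the columns.

```python
import torch

torch.manual_seed(0)
# Stand-in for samples1: a list of 1-D tensors with two elements each.
samples = [torch.randn(2) for _ in range(6)]

stacked = torch.stack(samples)      # shape (6, 2): one row per sample

# Transpose to (2, 6) and unpack along the first axis.
c1, c2 = stacked.T

# unbind(dim=1) splits along the column axis directly, without the transpose.
d1, d2 = stacked.unbind(dim=1)
```

Both c1 and d1 hold the first element of every sample; c2 and d2 hold the second.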
Other answers that suggest .stack() or .cat() are perfectly fine from PyTorch perspective.
However, since the context of the question involves pyro, may I add the following:
Since you are doing IID samples
[pyro.sample('samples1', dist1) for _ in range(num_samples)]
A better way to do it with pyro is
dist1 = dist.MultivariateNormal(mu1, sig1).expand([num_samples])
This tells pyro that the distribution is batched with a batch size of num_samples. Sampling from this will produce
>>> dist1.sample()
tensor([[-0.8712, 6.6087],
[ 1.6076, -0.2939],
[ 1.4526, 6.1777],
...
[-0.0168, 7.5085],
[-1.6382, 2.1878]])
Now it's easy to solve your original question. Just slice it like
samples = dist1.sample()
samples[:, 0] # all first elements
samples[:, 1] # all second elements
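For readers without pyro installed, the same batching idea can be sketched with torch.distributions.MultivariateNormal, which also has an expand method. Whether it behaves identically to pyro's wrapper in every respect is an assumption here; the parameters are the ones from the question.

```python
import torch
from torch.distributions import MultivariateNormal

num_samples = 250
mu1 = torch.tensor([0., 5.])
sig1 = torch.tensor([[2., 0.], [0., 3.]])

# expand gives the distribution a batch shape of (num_samples,).
batched = MultivariateNormal(mu1, covariance_matrix=sig1).expand([num_samples])

samples = batched.sample()          # shape (num_samples, 2)
first = samples[:, 0]               # all first elements
second = samples[:, 1]              # all second elements
```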

Can we initialise a numpy array of numpy arrays with different shapes using some constructor?

I want an array that looks like this,
array([array([[1, 1], [2, 2]]), array([3, 3])], dtype=object)
I can make an empty array and then assign elements one by one like this,
z = [np.array([[1,1],[2,2]]), np.array([3,3])]
x = np.empty(shape=2, dtype=object)
x[0], x[1] = z
I thought that if this is possible, then so should this be: x = np.array(z, dtype=object), but that gives me the error: ValueError: could not broadcast input array from shape (2,2) into shape (2).
So is the way given above the only way to make a ragged numpy array? Or is there a nice one-line constructor/function we can call to make the array x from above?
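The pre-allocate-and-assign approach from the question can be wrapped in a small function so that the call site becomes a one-liner. This is only a sketch: make_object_array is a hypothetical helper, not a NumPy function.

```python
import numpy as np

def make_object_array(items):
    """Hypothetical helper: pack arbitrary arrays into a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        # Integer indexing stores the array object itself in slot i,
        # so NumPy never tries to broadcast the differing shapes.
        out[i] = item
    return out

z = [np.array([[1, 1], [2, 2]]), np.array([3, 3])]
x = make_object_array(z)
```

Assigning element by element sidesteps the shape inference that np.array(z, dtype=object) attempts on the nested input.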

Array entry used in function turns from nan to 0 numpy python

I made a simple function that produces a weighted average of several time series using supplied weights. It is designed to handle missing values (NaNs), which is why I am not using numpy's supplied average function.
However, when I feed it my array containing missing values, the array has its nan values replaced by 0s! I would have assumed that since I am changing the name of the array and it is not a global variable, this should not happen. I want my X array to retain its original form, including the nan value.
I am a relative novice using python (obviously).
Example:
X = np.array([[1, 2, 3], [1, 2, 3], [1, 2, np.nan]]) # 3 time series to be weighted together
weights = np.array([[1,1,1]]) # simple example with weights for each series as 1
def WeightedMeanNaN(Tseries, weights):
    ## calculates weighted mean
    N_Tseries = Tseries
    Weights = np.repeat(weights, len(N_Tseries), axis=0) # make a vector of weights matching size of time series
    loc = np.where(np.isnan(N_Tseries)) # get location of nans
    Weights[loc] = 0
    N_Tseries[loc] = 0
    Weights = Weights/Weights.sum(axis=1)[:,None] # normalize each row so that weights sum to 1
    WeightedAve = np.multiply(N_Tseries,Weights)
    WeightedAve = WeightedAve.sum(axis=1)
    return WeightedAve
WeightedMeanNaN(Tseries = X, weights = weights)
Out[161]: array([2. , 2. , 1.5])
In:X
Out:
array([[1., 2., 3.],
[1., 2., 3.],
[1., 2., 0.]]) # no longer nan!!
Where you call
loc = np.where(np.isnan(N_Tseries)) # get location of nans
Weights[loc] = 0
N_Tseries[loc] = 0
you replace all NaNs with zeros in place.
To reverse this you could iterate over the array and replace zeros with NaNs.
However, this would also turn regular zeros into NaNs.
So it turns out this is a mistake caused by me being used to working in Matlab. Python passes the arguments supplied to a function as references to the original objects, so N_Tseries = Tseries only gives the same array a second name. In contrast, Matlab creates copies that are discarded when the function ends.
I solved my problem by adding ".copy()" when assigning variables in the function, so that the first line in the function above becomes:
N_Tseries = Tseries.copy()
However, one thing that puzzles me is that some people have suggested that using Tseries[:] should also create a copy of Tseries rather than a reference to the original variable. This did not work for me, though.
I found this answer useful:
Python function not supposed to change a global variable
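The difference between Tseries[:] and Tseries.copy() comes down to views: slicing a Python list creates a new list, but basic slicing of a NumPy array returns a view onto the same data. The sketch below checks this and shows a variant of the function that leaves its input untouched. The name weighted_mean_nan and the use of keepdims are choices made for this example, not part of the original post.

```python
import numpy as np

def weighted_mean_nan(tseries, weights):
    """Row-wise weighted mean that ignores NaNs and does not modify its input."""
    values = np.array(tseries, dtype=float)   # np.array copies by default
    w = np.repeat(np.asarray(weights, dtype=float), len(values), axis=0)
    nan_mask = np.isnan(values)
    w[nan_mask] = 0.0                         # NaN entries get zero weight
    values[nan_mask] = 0.0                    # only the local copy is changed
    w = w / w.sum(axis=1, keepdims=True)      # normalise each row to sum to 1
    return (values * w).sum(axis=1)

X = np.array([[1, 2, 3], [1, 2, 3], [1, 2, np.nan]])
weights = np.array([[1, 1, 1]])
result = weighted_mean_nan(X, weights)

view = X[:]        # a view: shares memory with X
copy = X.copy()    # an independent copy
print(np.shares_memory(view, X), np.shares_memory(copy, X))
```

After the call, X still contains its NaN, because all in-place writes went to the copy made inside the function.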
