Convert elements of multi-dimensional numpy array to float32 - python-3.x

I have a complicated nested numpy array which contains list. I am trying to converted the elements to float32. However, it gives me following error:
ValueError Traceback (most recent call last)
<ipython-input-225-22d2824961c2> in <module>
----> 1 x_train_single.astype(np.float32)
ValueError: setting an array element with a sequence.
Here is the code and sample input:
x_train_single.astype(np.float32)
array([[ list([[[0, 0, 0, 0, 0, 0]], [-1.0], [0]]),
list([[[0, 0, 0, 0, 0, 0], [173, 8, 172, 0, 0, 0]], [-1.0], [0]])
]])

As your array contains lists of different sizes and nesting depths, I doubt that there is a simple or fast solution.
Here is a "get-the-job-done-no-matter-what" approach. It comes in two flavors. One creates arrays for leaves, the other one lists.
>>> a
array([[list([[[0, 0, 0, 0, 0, 0]], [-1.0], [0]]),
list([[[0, 0, 0, 0, 0, 0], [173, 8, 172, 0, 0, 0]], [-1.0], [0]])]],
dtype=object)
>>> def mkarr(a):
... try:
... return np.array(a,np.float32)
... except:
... return [*map(mkarr,a)]
...
>>> def mklst(a):
... try:
... return [*map(mklst,a)]
... except:
... return np.float32(a)
...
>>> np.frompyfunc(mkarr,1,1)(a)
array([[list([array([[0., 0., 0., 0., 0., 0.]], dtype=float32), array([-1.], dtype=float32), array([0.], dtype=float32)]),
list([array([[ 0., 0., 0., 0., 0., 0.],
[173., 8., 172., 0., 0., 0.]], dtype=float32), array([-1.], dtype=float32), array([0.], dtype=float32)])]],
dtype=object)
>>> np.frompyfunc(mklst,1,1)(a)
array([[list([[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0]], [-1.0], [0.0]]),
list([[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [173.0, 8.0, 172.0, 0.0, 0.0, 0.0]], [-1.0], [0.0]])]],
dtype=object)

if number of columns is fixed then
np.array([l.astype(np.float) for l in x_train_single.squeeze()])
But it will remove the redundant dimensions, convert everything into numpy array.
Before: (1, 1, 1, 11, 6)
After: (11,6)

Try this:
np.array(x_train_single.tolist())
Looks like you have a (1,1) shaped array, where the single element is a list. And the sublists a consistent in size.
I expect you will get an array with shape (1, 1, 1, 11, 6) and int dtype.
or:
np.array(x_train_single[0,0])
Again this extracts the list from the array, and then makes an array from that.
My answer so far was based on the display:
array([[list([[[173, 8, 172, 0, 0, 0], [512, 58, 57, 0, 0, 0],
...: [513, 514, 0, 0, 0, 0], [515, 189, 516, 0, 0, 0], [309, 266, 0, 0, 0,
...: 0],
...: [32, 310, 0, 0, 0, 0], [271, 58, 517, 0, 0, 0], [164, 40, 0, 0, 0, 0],
...: [38, 32, 60, 0, 0, 0], [38, 83, 60, 0, 0, 0], [149, 311, 0, 0, 0, 0]]
...: ])]])
The new display is more complicated
array([[ list([[[0, 0, 0, 0, 0, 0]], [-1.0], [0]]),
...: list([[[0, 0, 0, 0, 0, 0], [173, 8, 172, 0, 0, 0]], [-1.0], [0]])]])
because the inner lists differ in size. It can't be made into a numeric dtype array.
It can be turned into a (1,2,3) shape array, but still object dtype with 1d list elements.

Related

How to see which indices of an input effected an index of output

I want to test my neural network.
For example, given: an input tensor input, a nn.module with some submodules module, an output tensor output,
I want to find which indices of input effected the index (1,2) of output
More specifically, given:
Two input matrix of size (12, 12),
Operation is matmul
Queried index of the output matrix is: (0,0)
the expected output is:
InputMatrix1: (0,0), (0, 1), ..., (0, 11)
InputMatrix2: (0,0), (1, 0), ..., (11, 0)
Maybe visualization is okay.
Is there any method or libraries that can achieve this?
This is easy. You want to look at the non-zeros entries of the grad of InputMatrix1 and InputMatrix2 w.r.t the (0,0) element of the product:
x = torch.rand((12, 12), requires_grad=True) # explicitly asking for gradient for this tensor
y = torch.rand((12, 12), requires_grad=True) # explicitly asking for gradient for this tensor
# compute the product using # operator:
out = x # y
# use back propagation to compute the gradient w.r.t out[0, 0]:
out[0,0].backward()
Inspect the non-zero elements of the inputs' gradients yield, as expected:
In []: x.grad.nonzero()
tensor([[ 0, 0],
[ 0, 1],
[ 0, 2],
[ 0, 3],
[ 0, 4],
[ 0, 5],
[ 0, 6],
[ 0, 7],
[ 0, 8],
[ 0, 9],
[ 0, 10],
[ 0, 11]])
In []: y.grad.nonzero()
tensor([[ 0, 0],
[ 1, 0],
[ 2, 0],
[ 3, 0],
[ 4, 0],
[ 5, 0],
[ 6, 0],
[ 7, 0],
[ 8, 0],
[ 9, 0],
[10, 0],
[11, 0]])

Python assigning attribute changing length

I have a class that accepts a parameter X. This parameter X is a numpy array, containing lists that contain ints.
array([[ 101, 2002, 8542, ..., 0, 0, 0],
[ 101, 2002, 8974, ..., 0, 0, 0],
[ 101, 5076, 2743, ..., 0, 0, 0],
...,
[ 101, 4302, 2253, ..., 0, 0, 0],
[ 101, 13875, 2003, ..., 0, 0, 0],
[ 101, 1045, 2031, ..., 0, 0, 0]])
I have a class that takes this X and assigns it to an attribute.
class TaskADataset(Dataset):
def __init__(self, X, y):
self.X = X,
self.y = y
But the parameter X and the attribute X now have different lengths.
dataset = TaskADataset(X, y)
print(len(dataset.X), len(X))
1 10000
Why is this occurring?
Thank you for the help.
As shriakhilc pointed out, I included a "," and this turned it into a tuple with one element.

How to replace a torch.tensor in-place with a new padded tensor?

I am trying to figure out how I overwrite a torch.tensor object, located inside a dict, with a new torch.tensor that is a bit longer due to padding.
# pad the tensor
zeros = torch.zeros(55).long()
zeros[zeros == 0] = 100 # change to padding
temp_input = torch.cat([batch['input_ids'][0][0], zeros], dim=-1) # cat
temp_input.shape # [567]
batch['input_ids'][0][0].shape # [512]
batch['input_ids'][0][0] = temp_input
# The expanded size of the tensor (512) must match the existing size (567) at non-singleton dimension 0. Target sizes: [512]. Tensor sizes: [567]
I am struggling to find a way to extend the values of a tensor in-place or to overwrite them if the dimensions change.
The dict is emitted from torch's DataLoader and looks like this:
{'input_ids': tensor([[[ 101, 3720, 2011, ..., 25786, 2135, 102]],
[[ 101, 1017, 2233, ..., 0, 0, 0]],
[[ 101, 1996, 2899, ..., 14262, 20693, 102]],
[[ 101, 2197, 2305, ..., 2000, 1996, 102]]]),
'attn_mask': tensor([[[1, 1, 1, ..., 1, 1, 1]],
[[1, 1, 1, ..., 0, 0, 0]],
[[1, 1, 1, ..., 1, 1, 1]],
[[1, 1, 1, ..., 1, 1, 1]]]),
'cats': tensor([[-0.6410, 0.1481, -2.1568, -0.6976],
[-0.4725, 0.1481, -2.1568, 0.7869],
[-0.6410, -0.9842, -2.1568, -0.6976],
[-0.6410, -0.9842, -2.1568, -0.6976]], grad_fn=<StackBackward>),
'target': tensor([[1],
[0],
[1],
[1]]),
'idx': tensor([1391, 4000, 293, 830])}

How to mask a 3D tensor with 2D mask and keep the dimensions of original vector?

Suppose, I have a 3D tensor A
A = torch.arange(24).view(4, 3, 2)
print(A)
and require masking it using 2D tensor
mask = torch.zeros((4, 3), dtype=torch.int64) # or dtype=torch.ByteTensor
mask[0, 0] = 1
mask[1, 1] = 1
mask[3, 0] = 1
print('Mask: ', mask)
Using masked_select functionality from PyTorch leads to the following error.
torch.masked_select(X, (mask == 1))
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-72-fd6809d2c4cc> in <module>
12
13 # Select based on new mask
---> 14 Y = torch.masked_select(X, (mask == 1))
15 #Y = X * mask_
16 print(Y)
RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 2
How to mask a 3D tensor with a 2D mask and keep the dimensions of the original vector? Any hints will be appreciated.
Essentially, we need to match the dimension of the tensor mask with the tensor being masked.
There are two ways to do it.
Approach 1: Does not preserve original tensor dimensions.
X = torch.arange(24).view(4, 3, 2)
print(X)
mask = torch.zeros((4, 3), dtype=torch.int64) # or dtype=torch.ByteTensor
mask[0, 0] = 1
mask[1, 1] = 1
mask[3, 0] = 1
print('Mask: ', mask)
# Add a dimension to the mask tensor and expand it to the size of original tensor
mask_ = mask.unsqueeze(-1).expand(X.size())
print(mask_)
# Select based on the new expanded mask
Y = torch.masked_select(X, (mask_ == 1)) # does not preserve the dims
print(Y)
The output for approach 1:
tensor([ 0, 1, 8, 9, 18, 19])
Approach 2: Preserves the original tensor dimensions (by padding).
X = torch.arange(24).view(4, 3, 2)
print(X)
mask = torch.zeros((4, 3), dtype=torch.int64) # or dtype=torch.ByteTensor
mask[0, 0] = 1
mask[1, 1] = 1
mask[3, 0] = 1
print('Mask: ', mask)
# Add a dimension to the mask tensor and expand it to the size of original tensor
mask_ = mask.unsqueeze(-1).expand(X.size())
print(mask_)
# Select based on the new expanded mask
Y = X * mask_
print(Y)
The output for approach 2:
tensor([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15],
[16, 17]],
[[18, 19],
[20, 21],
[22, 23]]])
Mask: tensor([[1, 0, 0],
[0, 1, 0],
[0, 0, 0],
[1, 0, 0]])
tensor([[[1, 1],
[0, 0],
[0, 0]],
[[0, 0],
[1, 1],
[0, 0]],
[[0, 0],
[0, 0],
[0, 0]],
[[1, 1],
[0, 0],
[0, 0]]])
tensor([[[ 0, 1],
[ 0, 0],
[ 0, 0]],
[[ 0, 0],
[ 8, 9],
[ 0, 0]],
[[ 0, 0],
[ 0, 0],
[ 0, 0]],
[[18, 19],
[ 0, 0],
[ 0, 0]]]
There is a simple way to preserve dims as follows:
torch.mul(X, mask.unsqueeze(-1))
the results is also:
tensor([[[ 0, 1],
[ 0, 0],
[ 0, 0]],
[[ 0, 0],
[ 8, 9],
[ 0, 0]],
[[ 0, 0],
[ 0, 0],
[ 0, 0]],
[[18, 19],
[ 0, 0],
[ 0, 0]]])

np.add.at indexing with array

I'm working on cs231n and I'm having a difficult time understanding how this indexing works. Given that
x = [[0,4,1], [3,2,4]]
dW = np.zeros(5,6)
dout = [[[ 1.19034710e-01 -4.65005990e-01 8.93743168e-01 -9.78047129e-01
-8.88672957e-01 -4.66605091e-01]
[ -1.38617461e-03 -2.64569728e-01 -3.83712733e-01 -2.61360826e-01
8.07072009e-01 -5.47607277e-01]
[ -3.97087458e-01 -4.25187949e-02 2.57931759e-01 7.49565950e-01
1.37707667e+00 1.77392240e+00]]
[[ -1.20692745e+00 -8.28111550e-01 6.53041092e-01 -2.31247762e+00
-1.72370321e+00 2.44308033e+00]
[ -1.45191870e+00 -3.49328154e-01 6.15445782e-01 -2.84190582e-01
4.85997687e-02 4.81590106e-01]
[ -1.14828583e+00 -9.69055406e-01 -1.00773809e+00 3.63553835e-01
-1.28078363e+00 -2.54448436e+00]]]
The operation they do is
np.add.at(dW, x, dout)
x is a two dimensional array. How does indexing work here? I went through np.ufunc.at documentation but they have simple examples with 1d array and constant:
np.add.at(a, [0, 1, 2, 2], 1)
In [226]: x = [[0,4,1], [3,2,4]]
...: dW = np.zeros((5,6),int)
In [227]: np.add.at(dW,x,1)
In [228]: dW
Out[228]:
array([[0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0]])
With this x there aren't any duplicate entries, so add.at is the same as using += indexing. Equivalently we can read the changed values with:
In [229]: dW[x[0], x[1]]
Out[229]: array([1, 1, 1])
The indices work the same either way, including broadcasting:
In [234]: dW[...]=0
In [235]: np.add.at(dW,[[[1],[2]],[2,4,4]],1)
In [236]: dW
Out[236]:
array([[0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 2, 0],
[0, 0, 1, 0, 2, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
possible values
The values have to be broadcastable, with respect to the indexes:
In [112]: np.add.at(dW,[[[1],[2]],[2,4,4]],np.ones((2,3)))
...
In [114]: np.add.at(dW,[[[1],[2]],[2,4,4]],np.ones((2,3)).ravel())
...
ValueError: array is not broadcastable to correct shape
In [115]: np.add.at(dW,[[[1],[2]],[2,4,4]],[1,2,3])
In [117]: np.add.at(dW,[[[1],[2]],[2,4,4]],[[1],[2]])
In [118]: dW
Out[118]:
array([[ 0, 0, 0, 0, 0, 0],
[ 0, 0, 3, 0, 9, 0],
[ 0, 0, 4, 0, 11, 0],
[ 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0]])
In this case the indices define a (2,3) shape, so (2,3),(3,), (2,1), and scalar values work. (6,) does not.
In this case, add.at is mapping a (2,3) array onto a (2,2) subarray of dW.
recently I also have a hard time to understand this line of code. Hope what I got can help you, correct me if I am wrong.
The three arrays in this line of code is following:
x , whose shape is (N,T)
dW, ---(V,D)
dout ---(N,T,D)
Then we come to the line code we want to figure out what happens
np.add.at(dW, x, dout)
If you dont want to know the thinking procedure. The above code is equivalent to :
for row in range(N):
for col in range(T):
dW[ x[row,col] , :] += dout[row,col, :]
This is the thinking procedure:
Refering to this doc
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.ufunc.at.html
We know that the x is the index array. So the key is to understand dW[x].
This is the concept of indexing an array(dW) using another array(x). If you are not familiar with this concept, can check out this link
https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html
Generally speaking, what is returned when index arrays are used is an array with the same shape as the index array, but with the type and values of the array being indexed.
dW[x] will give us an array whose shape is (N,T,D), the (N,T) part comes from x, and the (D) comes from dW (V,D). Note here, every element of x is inside the range of [0, v).
Let's take some number as concrete example
x: np.array([[0,0],[0,0]]) ---- (2,2) N=2, T=2
dW: np.array([[0,0],[2,2]]) ---- (2,2) V=2, D=2
dout: np.arange(1,9).reshape(2,2,2) ----(2,2,2) N=2, T=2, D=2
dW[x] should be [ [[0 0] #this comes from the dW's firt row
[0 0]]
[[0 0]
[0 0]] ]
dW[x] add dout means that add the elemnet item(here, this some trick, later will explian)
np.add.at(dW, x, dout) gives
[ [16 20]
[ 2 2] ]
Why? The procedure is:
It add [1,2] to the first row of dW, which is [0,0].
Why first row? Because the x[0,0] = 0, indicating the first row of dW, dW[0] = dW[0,:] = the first row.
Then it add [3,4] to the first row of dW[0,0]. [3,4]=dout[0,1,:].
[0,0] again, comes from the dW, x[0,1] = 0, still the first row of dW[0].
Then it add [5,6] to the first row of dW.
Then it add [7,8] to the first row of dW.
So the result is [1+3+5+7, 2+4+6+8] = [16,20]. Because we do not touch the second row of dW. The dW's second row remains unchanged.
The trick is that we will only count the origin row once, can think that there is no buffer, and every step plays in the original place.
Let's consider an example based on this assignment from cs231n. If we are talking about multiple directions it's much easier to use a concrete settings.
np.random.seed(1)
N, T, V, D = 2, 3, 7, 6
x = np.random.randint(V, size=(N, T))
dW_man = np.zeros((V, D))
dW_man[x].shape, x.shape
((2, 3, 6), (2, 3))
x
array([[5, 3, 4],
[0, 1, 3]])
dout = np.arange(2*3*6).reshape(dW_man[x].shape)
dout
array([[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]],
[[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]])
What should be the rows of dW_man[x]? Well [0, 1, ...] should be added to the row 5, [ 6, 7, ..] - to the row 3. And also [30, 31, ...] should be added to the row 3. So let's compute it manually. See more examples and explanation in this GitHub gist: link.
dW_man[5] = dout[0, 0]
dW_man[3] = dout[0, 1]
dW_man[4] = dout[0, 2]
dW_man[0] = dout[1, 0]
dW_man[1] = dout[1, 1]
dW_man[3] = dout[1, 2]
dW_man
array([[18., 19., 20., 21., 22., 23.],
[24., 25., 26., 27., 28., 29.],
[ 0., 0., 0., 0., 0., 0.],
[30., 31., 32., 33., 34., 35.],
[12., 13., 14., 15., 16., 17.],
[ 0., 1., 2., 3., 4., 5.],
[ 0., 0., 0., 0., 0., 0.]])
Now let's use np.add.at.
np.random.seed(1)
N, T, V, D = 2, 3, 7, 6
x = np.random.randint(V, size=(N, T))
dW = np.zeros((V, D))
dout = np.arange(2*3*6).reshape(dW[x].shape)
np.add.at(dW, x, dout)
dW
array([[18., 19., 20., 21., 22., 23.],
[24., 25., 26., 27., 28., 29.],
[ 0., 0., 0., 0., 0., 0.],
[36., 38., 40., 42., 44., 46.],
[12., 13., 14., 15., 16., 17.],
[ 0., 1., 2., 3., 4., 5.],
[ 0., 0., 0., 0., 0., 0.]])

Resources