PyTorch: tensordot with a non-contracted position

Given a tensor x of shape (a, b, d) and a tensor y of shape (b, c, d), I want to perform a matrix-multiplication-like operation that contracts dimension 2 of x with dimension 2 of y (both of length d).
However, I do not want dimension 1 of x and dimension 0 of y (both of length b) to be contracted as well. Instead, I want to iterate slice-wise over that shared dimension, so that the result has shape (a, b, c).
I could implement this as follows, calling torch.tensordot b times and stacking the results:
import torch

# random initialization
a = 3
b = 2
c = 5
d = 8
x = torch.randn(a, b, d)
y = torch.randn(b, c, d)

slice_results = []
for idx in range(b):
    x_slice = x[:, idx, :]
    y_slice = y[idx, :, :]
    slice_result = torch.tensordot(x_slice, y_slice, dims=([1], [1]))
    slice_results.append(slice_result)
result = torch.stack(slice_results, dim=1)
print(result.shape)  # torch.Size([3, 2, 5]), i.e. (a, b, c)
However, I wonder whether there is a more efficient way to implement this, without explicitly constructing a Python list.

You could apply torch.tensordot to the whole tensors along dim=2, which yields a result of shape (a, b, b, c), and then select only its diagonal over the two b-dimensions using advanced indexing:
>>> torch.tensordot(x, y, dims=([2], [2]))[:, range(b), range(b), :]
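For reference, here is a minimal self-contained sketch of that approach (my addition, using the question's variable names) that checks it against the loop-and-stack version:
import torch

a, b, c, d = 3, 2, 5, 8
x = torch.randn(a, b, d)
y = torch.randn(b, c, d)

idx = torch.arange(b)
full = torch.tensordot(x, y, dims=([2], [2]))   # shape (a, b, b, c)
result_diag = full[:, idx, idx, :]              # diagonal over the two b-dims -> (a, b, c)

# reference: the loop-and-stack version from the question
reference = torch.stack(
    [torch.tensordot(x[:, i, :], y[i, :, :], dims=([1], [1])) for i in range(b)],
    dim=1,
)
print(torch.allclose(result_diag, reference))   # True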

Here it is:
r2 = torch.tensordot(x, y, dims=([2], [2]))   # shape (a, b, b, c)
result_ = torch.zeros((a, b, c))
result_[:, 1, :] = r2[:, :, 1, :][:, 1, :]    # written out explicitly for b = 2
result_[:, 0, :] = r2[:, :, 0, :][:, 0, :]
result_ - result                              # compare with the loop result from the question
output:
tensor([[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]])
The thing is, it is not faster than the loop version, and it gets much worse as a and b grow:
def no_loop_method(x, y):
    # written out explicitly for b = 2
    r2 = torch.tensordot(x, y, dims=([2], [2]))
    result_ = torch.zeros((a, b, c))
    result_[:, 1, :] = r2[:, :, 1, :][:, 1, :]
    result_[:, 0, :] = r2[:, :, 0, :][:, 0, :]
    return result_

def with_loop_method(x, y):
    slice_results = []
    for idx in range(b):
        x_slice = x[:, idx, :]
        y_slice = y[idx, :, :]
        slice_result = torch.tensordot(x_slice, y_slice, dims=([1], [1]))
        slice_results.append(slice_result)
    return torch.stack(slice_results, dim=1)
a = 3
b = 2
c = 5
d = 8
x = torch.randn(a, b, d)
y = torch.randn(b, c, d)
%timeit with_loop_method(x, y)
This will result in:
36.7 µs ± 177 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit no_loop_method(x, y)
while this will result in:
37.6 µs ± 111 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
but if you increase a and b:
a = 3000
b = 200
c = 5
d = 8
x = torch.randn(a, b, d)
y = torch.randn(b, c, d)
%timeit with_loop_method(x, y)
12.4 ms ± 176 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit no_loop_method(x, y)
525 ms ± 16.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
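For what it's worth, a further alternative (my addition, not part of either answer) is torch.einsum, which contracts over d while keeping the shared b dimension aligned, so it avoids both the Python loop and the (a, b, b, c) intermediate:
import torch

a, b, c, d = 3000, 200, 5, 8
x = torch.randn(a, b, d)
y = torch.randn(b, c, d)

# contract over d, keep the shared b dimension aligned -> shape (a, b, c)
result = torch.einsum('abd,bcd->abc', x, y)
print(result.shape)  # torch.Size([3000, 200, 5])
In a quick check this should agree with the loop-and-stack result up to floating-point rounding (torch.allclose).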

Related

numpy.histogram2d() returning a histogram of all zeros

I'm trying to reproduce a phenomenon I've encountered when constructing 2D histograms using numpy.histogram2d, specifically when using the "bins" parameter. When I use an integer for the bins parameter (e.g. bins=20), I see the expected 2D histogram. However, I want my histogram to have consistently-sized bins, so I want to create the histogram with set minimum and maximum x- and y-values. Currently, I'm creating the bin divisions using numpy.linspace to get arrays of evenly-spaced values.
x_bins = np.linspace(min_range, max_range, num=num_bins+1) #numpy is imported as np
y_bins = np.linspace(0, max_even, num=num_bins+1)
I use these arrays for the bins argument in numpy.histogram2d.
hist, xedges, yedges = np.histogram2d(x, y, bins=(x_bins, y_bins))
The arrays x and y are arrays of numbers between the values of min_range and max_range (for x), and between 0 and max_even (for y). When I define the bins with arrays, some of the histograms I generate have all zeros. All x and y arrays are the same length, and the only thing I can think of that changes is the number ranges fed into numpy.histogram2d.
Numbers in these x and y ranges yield histograms that are not all zeros:
x: min_range = 0.07, max_range = 142.095; y: 0, max_even = 471.64
x: min_range = 0.218, max_range = 195.178; y: 0, max_even = 1493.489
Numbers in these ranges yield histograms with all zeros:
x: min_range = 0.006, max_range = 6.916; y: 0, max_even = 1.101
x: min_range = 0, max_range = 5.58; y: 0, max_even = 1.205
The x and y arrays are both numpy arrays. Printing out the x and y bins and values shows that all the x and y values should fall into the defined bins. Trying to replicate the error with arrays of random values within the ranges of interest wasn't successful, so I apologize for the lack of examples; any suggestions for replication are welcome. What might cause the histogram2d function to return a histogram of all zeros?
EDIT
I tried using the range parameter of histogram2d to define the min and max x and y values, and using an integer for the bins parameter (code below). That had no effect on the histograms with all zeros.
hist, xedges, yedges = np.histogram2d(x, y, bins=10, range=[[min_range, max_range], [0, max_even]])
Below is a case that generates an all-zero array. If the range parameters are swapped between axes, or the same range is repeated for both axes, this can produce a 2D histogram of zeros for certain x and y arrays.
import numpy as np
np.random.seed(100)
x = 5*np.random.rand(40)+5.
y = 3*np.random.rand(40)+10.
x_min = x.min()
x_max = x.max()
y_min = y.min()
y_max = y.max()
np.histogram2d(x, y, bins=[5, 3], range=[[x_min, x_max], [x_min, x_max]])
# x_min & x_max both times!!
# (array([[0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.],
# [0., 0., 0.]]),
# array([5.02359428, 5.99979628, 6.97599828, 7.95220028, 8.92840228, 9.90460429]),
# array([5.02359428, 6.65059762, 8.27760095, 9.90460429]))
# Corrected version
np.histogram2d(x, y, bins=[5, 3], range=[[x_min, x_max], [y_min, y_max]])
# (array([[5., 5., 2.],
# [1., 4., 3.],
# [0., 3., 2.],
# [2., 0., 1.],
# [5., 5., 2.]]),
# array([5.02359428, 5.99979628, 6.97599828, 7.95220028, 8.92840228, 9.90460429]),
# array([10.0613174 , 11.01588476, 11.97045212, 12.92501948]))
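A quick sanity check (my suggestion, not part of the answer above): compare the histogram's total count with the number of samples. Points that fall outside the given range are silently dropped, which is what leaves the histogram all zeros.
import numpy as np

np.random.seed(100)
x = 5 * np.random.rand(40) + 5.0
y = 3 * np.random.rand(40) + 10.0

# wrong range for y (reuses the x limits), so every point is dropped
hist, xedges, yedges = np.histogram2d(
    x, y, bins=[5, 3], range=[[x.min(), x.max()], [x.min(), x.max()]]
)
print(int(hist.sum()), x.size)  # 0 40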

What is the logic behind this assignment: understanding in-place assignment operations in numpy

I have two fairly simple snippets that give different answers. I understand it is due to the shared reference, but I am not very clear on what exactly happens in the second case.
import numpy as np
import torch

a = np.ones(5)
b = torch.from_numpy(a)
a = np.add(a, 1, out=a)
print(a)
print(b)
[out]:
[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
a = np.ones(5)
b = torch.from_numpy(a)
a = a + 1
print(a)
print(b)
[out]:
[2. 2. 2. 2. 2.]
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
Why isn't b changed in the second case?
In the first case, a and b share the same memory (i.e. b is a view of a, or in other words b points to the same array value that a points to), and the out argument guarantees that this shared memory is updated in place once the np.add() operation completes. In the second case, a = a + 1 creates a new array and rebinds a to it, while b still points to the old value of a.
Try the second case with:
a += 1
and observe that both a and b are indeed updated.
In [7]: a = np.ones(5)
...: b = torch.from_numpy(a)
...: a += 1
In [8]: a
Out[8]: array([2., 2., 2., 2., 2.])
In [9]: b
Out[9]: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
As @hpaulj aptly pointed out in his comment, when we do a = a + 1, a new (array) object is created, and a now points to this new object instead of the old one, which is still pointed to by b. This is why the (array) value seen through b is not updated.
To understand this behavior a bit better, you might want to refer to Ned Batchelder's excellent article on how names are bound to values in Python.
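To make the rebinding visible, here is a small sketch (my addition) that inspects the identity of a before and after each kind of operation:
import numpy as np
import torch

a = np.ones(5)
b = torch.from_numpy(a)

before = id(a)
a += 1                       # in-place: mutates the shared buffer, identity unchanged
print(id(a) == before, b)    # True  tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

a = a + 1                    # rebinding: a now names a brand-new array
print(id(a) == before, b)    # False tensor([2., 2., 2., 2., 2.], dtype=torch.float64)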

Indexing values using a for loop and numpy for Python 3

I'm looking to transition from Matlab to Python with numpy. I am having some difficulties understanding array indexing using numpy.
As an example, I'm looking to use a 'for' loop to assign the inner elements of a row vector to be a specific value. Indexing in numpy seems to be unique to when it comes to indexing rank 1 column arrays versus row arrays.
# Python:
import numpy as np
from numpy import *
import scipy.linalg

N = 5
L = 1.0
dx = L / N
S = 10.0 ** -2.0
k = 500.0
aE = zeros((1, N))
for i in range(1, N - 1):
    aE[0, i] = k * S / dx
% Matlab:
N = 5;
L = 1.0;
dx = L/N;
S = 10^(-2.0);
k = 500.0;
aE = zeros(1, N);
for i = 2:N-1
    aE(i) = k*S/dx;
end
Is it necessary for me to specify the row index of 0, instead of just stating:
'aE[i] = k * S / dx'
Matlab doesn't seem to care about the dimensions of the matrix when assigning elements based on a single index.
I don't have an issue with stating the row index. It makes me more conscious of my variable dimensions. I just want verification that it's necessary. Perhaps I'm setting up the vector incorrectly. I'd appreciate the help.
You should try to avoid iterating with for loops; instead, try to solve the problem using broadcasting.
Read about numpy broadcasting
https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
For example, you can do this:
>>> import numpy as np
>>> np.full((2,3), 10)
array([[10, 10, 10],
[10, 10, 10]])
or
>>> a = np.ones((2,3))
>>> a
array([[1., 1., 1.],
[1., 1., 1.]])
>>> a * 10
array([[10., 10., 10.],
[10., 10., 10.]])
or
>>> b = np.zeros((2,3))
>>> b
array([[0., 0., 0.],
[0., 0., 0.]])
>>> b+10
array([[10., 10., 10.],
[10., 10., 10.]])
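To tie this back to the question (my addition, a minimal sketch): with a (1, N) array you do need both indices, but a 1-D array can be indexed with a single subscript, and the interior elements can then be filled with one slice assignment instead of a loop:
import numpy as np

N = 5
L = 1.0
dx = L / N
S = 10.0 ** -2.0
k = 500.0

aE = np.zeros(N)          # 1-D array: a single index aE[i] is enough
aE[1:N-1] = k * S / dx    # vectorized assignment for the interior elements
print(aE)                 # [ 0. 25. 25. 25.  0.]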

PyTorch: select values from the last tensor dimension with indices from another tensor with a smaller dimension

I have a tensor a with three dimensions. The first dimension corresponds to minibatch size, the second to the sequence length, and the third to the feature dimension. E.g.,
>>> a = torch.arange(1, 13, dtype=torch.float).view(2,2,3) # Consider the values of a to be random
>>> a
tensor([[[ 1., 2., 3.],
[ 4., 5., 6.]],
[[ 7., 8., 9.],
[10., 11., 12.]]])
I have a second, two-dimensional tensor. Its first dimension corresponds to the minibatch size and its second dimension to the sequence length. It contains values in the range of the indices of the third dimension of a. Since that third dimension has size 3, b can contain the values 0, 1 or 2. E.g.,
>>> b = torch.LongTensor([[0, 2],[1,0]])
>>> b
tensor([[0, 2],
[1, 0]])
I want to obtain a tensor c that has the shape of b and contains all the values of a that are referenced by b.
In the scenario above, I would like to have:
c = torch.empty(2,2)
c[0,0] = a[0, 0, b[0,0]]
c[1,0] = a[1, 0, b[1,0]]
c[0,1] = a[0, 1, b[0,1]]
c[1,1] = a[1, 1, b[1,1]]
>>> c
tensor([[ 1.,  6.],
        [ 8., 10.]])
How can I create the tensor c fast? Furthermore, I also want c to be differentiable (so that I can use .backward()). I am not too familiar with PyTorch, so I am not sure whether a differentiable version of this exists.
As an alternative, instead of c having the same shape as b, I could also use a c with the same shape as a, containing zeros everywhere and ones at the places referenced by b. Then I could multiply a and c to obtain a differentiable tensor.
Like follows:
c = torch.zeros(2,2,3, dtype=torch.float)
c[0,0,b[0,0]] = 1
c[1,0,b[1,0]] = 1
c[0,1,b[0,1]] = 1
c[1,1,b[1,1]] = 1
>>> a*c
tensor([[[ 1.,  0.,  0.],
         [ 0.,  0.,  6.]],

        [[ 0.,  8.,  0.],
         [10.,  0.,  0.]]])
Let's declare the necessary variables first (notice requires_grad in a's initialization; we will use it to check differentiability):
a = torch.arange(1,13,dtype=torch.float32,requires_grad=True).reshape(2,2,3)
b = torch.LongTensor([[0, 2],[1,0]])
Let's reshape a and squash the minibatch and sequence dimensions:
temp = a.reshape(-1,3)
so temp now looks like:
tensor([[ 1., 2., 3.],
[ 4., 5., 6.],
[ 7., 8., 9.],
[10., 11., 12.]], grad_fn=<AsStridedBackward>)
Notice that each value of b can now be used as a column index into the corresponding row of temp to get the desired output. Now we do:
c = temp[range(len(temp)), b.view(-1)].view(b.size())
Notice how we index temp: range(len(temp)) selects each row, and the flattened b, i.e. b.view(-1), picks the corresponding column in each row. Finally, .view(b.size()) reshapes the result to the same size as b.
If we print c now:
tensor([[ 1., 6.],
[ 8., 10.]], grad_fn=<ViewBackward>)
The presence of grad_fn=... shows that c requires gradients, i.e. it is differentiable.
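As a side note (my addition, not part of the original answer), the same selection can be written with torch.gather, which avoids the reshape entirely and is likewise differentiable:
import torch

a = torch.arange(1, 13, dtype=torch.float32, requires_grad=True).reshape(2, 2, 3)
b = torch.LongTensor([[0, 2], [1, 0]])

# gather along the last dimension; the index tensor needs a trailing dim of size 1
c = a.gather(2, b.unsqueeze(-1)).squeeze(-1)
print(c)  # tensor([[ 1.,  6.], [ 8., 10.]], grad_fn=...)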

Keras custom layer/constraint to implement equal weights

I would like to create a layer in Keras such that:
y = Wx + c
where W is a block matrix with the form (image omitted; per the answer below, W = [[A, B], [B, A]]),
A and B are square matrices with elements (image omitted),
and c is a bias vector with repeated elements (image omitted).
How can I implement these restrictions? I was thinking it could either be implemented in MyLayer.build() when initializing the weights, or as a constraint where I specify certain indices to be equal, but I am unsure how to do so.
You can define such a W using the Concatenate layer.
import keras.backend as K
from keras.layers import Concatenate

A = K.placeholder(shape=(2, 2))
B = K.placeholder(shape=(2, 2))
row1 = Concatenate()([A, B])           # [A B]
row2 = Concatenate()([B, A])           # [B A]
W = Concatenate(axis=0)([row1, row2])  # [[A B], [B A]]
Example evaluation:
import numpy as np
get_W = K.function(outputs=[W], inputs=[A, B])
get_W([np.eye(2), np.ones((2,2))])
Returns
[array([[1., 0., 1., 1.],
[0., 1., 1., 1.],
[1., 1., 1., 0.],
[1., 1., 0., 1.]], dtype=float32)]
To work out the exact solution you can use the placeholder's shape argument. Addition and multiplication are quite straightforward.
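If the goal is a single trainable layer enforcing that tied structure end to end, here is a minimal sketch of one way to do it (my addition; the layer name BlockSharedDense, the block size n, and the assumption W = [[A, B], [B, A]] with a two-valued repeated bias are illustrative, since the original images are not available):
import keras.backend as K
from keras.layers import Layer

class BlockSharedDense(Layer):
    # Computes y = x.W + c with W = [[A, B], [B, A]] and a repeated bias.

    def __init__(self, n, **kwargs):
        self.n = n  # size of the square blocks A and B
        super(BlockSharedDense, self).__init__(**kwargs)

    def build(self, input_shape):
        # Only A, B and the two bias values are trainable; the block layout
        # ties the remaining entries of W and c together.
        self.A = self.add_weight(name='A', shape=(self.n, self.n),
                                 initializer='glorot_uniform', trainable=True)
        self.B = self.add_weight(name='B', shape=(self.n, self.n),
                                 initializer='glorot_uniform', trainable=True)
        self.c = self.add_weight(name='c', shape=(2,),
                                 initializer='zeros', trainable=True)
        super(BlockSharedDense, self).build(input_shape)

    def call(self, x):
        # Assemble W = [[A, B], [B, A]] from the shared blocks.
        row1 = K.concatenate([self.A, self.B], axis=1)
        row2 = K.concatenate([self.B, self.A], axis=1)
        W = K.concatenate([row1, row2], axis=0)
        # Repeat each bias value n times so the bias has length 2n.
        bias = K.concatenate([
            K.repeat_elements(self.c[0:1], self.n, axis=0),
            K.repeat_elements(self.c[1:2], self.n, axis=0),
        ])
        # Note: this computes x @ W per sample; transpose W if you need W @ x.
        return K.dot(x, W) + bias

    def compute_output_shape(self, input_shape):
        return (input_shape[0], 2 * self.n)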
