PyTorch's torch.transpose function only swaps two dimensions at a time (see the documentation).
TensorFlow's tf.transpose function, on the other hand, lets you permute all N dimensions of a tensor in a single call.
Can someone please explain why PyTorch does not/cannot have N-dimensional transpose functionality? Is this due to the dynamic nature of computation graph construction in PyTorch, versus TensorFlow's define-then-run paradigm?
It simply has a different name in PyTorch: torch.Tensor.permute lets you reorder dimensions the way tf.transpose does in TensorFlow.
As an example of how you'd convert a 4D image tensor from NHWC to NCHW (not tested, so might contain bugs):
>>> img_nhwc = torch.randn(10, 480, 640, 3)
>>> img_nhwc.size()
torch.Size([10, 480, 640, 3])
>>> img_nchw = img_nhwc.permute(0, 3, 1, 2)
>>> img_nchw.size()
torch.Size([10, 3, 480, 640])
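For comparison, torch.transpose itself does work on N-dimensional tensors, but it only swaps the two dimensions you name, so a single call can't produce NCHW here:
>>> img_nhwc.transpose(1, 3).size()  # swaps only dims 1 and 3
torch.Size([10, 3, 640, 480])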
Einops supports verbose transpositions for an arbitrary number of dimensions:
from einops import rearrange
x = torch.zeros(10, 3, 100, 100)
y = rearrange(x, 'b c h w -> b h w c')
x2 = rearrange(y, 'b h w c -> b c h w') # inverse to the first
(and the same code works for TensorFlow as well)
I am reading the code for the spatial-temporal graph convolution operation here:
https://github.com/yysijie/st-gcn/blob/master/net/utils/tgcn.py and I'm having some difficulty understanding what is happening with an einsum operation. In particular
For x a tensor of shape (N, kernel_size, kc//kernel_size, t, v), where
kernel_size is typically 3; let's say kc = 64*kernel_size, t is the number of frames, say 64, and v is the number of vertices, say 25. N is the batch size.
Now for a tensor A of shape (3, 25, 25) where each dimension is a filtering op on the graph vertices, an einsum is computed as:
x = torch.einsum('nkctv,kvw->nctw', (x, A))
I'm not sure how to interpret this expression. What I think it's saying is: for each batch element and for each channel c_i out of the 64, sum the three matrices obtained by matrix-multiplying the (64, 25) feature map at that channel with A[i]. Do I have this correct? The expression is a bit of a mouthful, and notation-wise there seems to be a strange usage of kc as a single variable name, but then a decomposition of k as the kernel size and c as the number of channels (192//3 = 64) in the einsum expression. Any insights appreciated.
It helps when you look closely at the notation:
nkctv for left side
kvw on the right side
nctw being the result
Missing from the result are:
k
v
These dimensions are summed over (each reduced to a single value), leaving us the resulting shape.
Something along these lines (the expanded shapes, padded with 1s, are broadcast and summed element-wise):
left: (n, k, c, t, v, 1)
right: (1, k, 1, 1, v, w)
Now it goes (l, r for left and right):
torch.mul(l, r)
torch.sum(torch.mul(l, r), dim=(1, 4))
squeeze any singleton dimensions
It is pretty hard to grasp, hence it helps to think of Einstein summation in terms of the resulting shapes being "mixed" from the operands, at least for me.
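A minimal sketch that checks this equivalence numerically, using the shapes from the question (the batch size of 2 here is an arbitrary choice):
import torch

n, k, c, t, v, w = 2, 3, 64, 64, 25, 25
x = torch.randn(n, k, c, t, v)
A = torch.randn(k, v, w)

out_einsum = torch.einsum('nkctv,kvw->nctw', x, A)

l = x.unsqueeze(-1)                    # (n, k, c, t, v, 1)
r = A.view(1, k, 1, 1, v, w)           # (1, k, 1, 1, v, w)
out_manual = (l * r).sum(dim=(1, 4))   # sum over k and v -> (n, c, t, w)

print(torch.allclose(out_einsum, out_manual, atol=1e-5))  # True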
Y = torch.einsum('nkctv,kvw->nctw', (x, A)) means:
[figure: einsum interpretation on graph]
For better understanding, I have replaced the x on the left-hand side with Y in the figure.
Given the following tensors x and y with shapes [3,2,3] and [3,2], I want to multiply the tensors along the 2nd dimension; this is expected to be a kind of dot product and scaling along the axis, returning a [3,2,3] tensor.
import torch
a = [[[0.2,0.3,0.5],[-0.5,0.02,1.0]],[[0.01,0.13,0.06],[0.35,0.12,0.0]], [[1.0,-0.3,1.0],[1.0,0.02, 0.03]] ]
b = [[1,2],[1,3],[0,2]]
x = torch.FloatTensor(a) # shape [3,2,3]
y = torch.FloatTensor(b) # shape [3,2]
The expected output :
Expected output shape should be [3,2,3]
#output = [[[0.2,0.3,0.5],[-1.0,0.04,2.0]],[[0.01,0.13,0.06],[1.05,0.36,0.0]], [[0.0,0.0,0.0],[2.0,0.04, 0.06]] ]
I have tried the two approaches below, but neither gives the desired output and output shape.
torch.matmul(x,y)
torch.matmul(x, y.unsqueeze(1))
What is the best way to fix this?
This is just broadcasted multiply. So you can insert a unitary dimension on the end of y to make it a [3,2,1] tensor and then multiply by x. There are multiple ways to insert unitary dimensions.
# all equivalent
x * y.unsqueeze(2)
x * y[..., None]
x * y[:, :, None]
x * y.reshape(3, 2, 1)
You could also use torch.einsum.
torch.einsum('abc,ab->abc', x, y)
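Both forms give the same [3,2,3] result; a quick check with the tensors defined above:
out_mul = x * y.unsqueeze(2)
out_einsum = torch.einsum('abc,ab->abc', x, y)
print(out_mul.shape)                       # torch.Size([3, 2, 3])
print(torch.allclose(out_mul, out_einsum)) # True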
I have a tensor T with dimensions (d1 x d2 x ... x dk) and a tensor I with dimensions (p x q). Here, I contains coordinates into T, but q < k; each column of I corresponds to a dimension of T. I have another tensor V of dimensions (p x di x ... x dj), where len([di, ..., dj]) = k - q; (di, ..., dj) are the dimensions of T missing from I. I need to perform T[I] = V.
A specific example of such problem using numpy array posted here[1].
The solution[2] uses fancy indexing[3], which relies on numpy.index_exp. In PyTorch, such an option is not available. Is there an alternative way to mimic this in PyTorch without using loops or casting tensors to numpy arrays?
Below is a demo:
import torch
t = torch.randn((32, 16, 60, 64)) # tensor
i0 = torch.randint(0, 32, (10, 1)).to(dtype=torch.long) # indexes for dim=0
i2 = torch.randint(0, 60, (10, 1)).to(dtype=torch.long) # indexes for dim=2
i = torch.cat((i0, i2), 1) # indexes
v = torch.randn((10, 16, 64)) # to be assigned
# t[i0, :, i2, :] = v ?? Obviously this does not work
[1] Slice numpy array using list of coordinates
[2] https://stackoverflow.com/a/42538465/6422069
[3] https://numpy.org/doc/stable/reference/generated/numpy.s_.html
After some discussion in the comments, we arrived at the following solution:
import torch
t = torch.randn((32, 16, 60, 64)) # tensor
# indices
i0 = torch.randint(0, 32, (10,)).to(dtype=torch.long) # indexes for dim=0
i2 = torch.randint(0, 60, (10,)).to(dtype=torch.long) # indexes for dim=2
v = torch.randn((10, 16, 64)) # to be assigned
t[(i0, slice(None), i2, slice(None))] = v
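A quick sanity check that the assignment landed where intended (this read-back assumes the sampled (i0, i2) pairs happen to be distinct):
# advanced indexing moves the indexed dims to the front, giving a (10, 16, 64) result
print(torch.equal(t[i0, :, i2, :], v))  # True when the index pairs are distinct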
I don't understand why we need the tensor.reshape() function in Theano. It is said in the documentation:
Returns a view of this tensor that has been reshaped as in
numpy.reshape.
As far as I understand, theano.tensor.var.TensorVariable is an entity used for building computation graphs, and it is completely independent of shapes. For instance, when you create your function you can pass it a 2x2 matrix or a 100x200 matrix. I thought reshape would somehow restrict this flexibility, but it does not. Consider the following example:
X = tensor.matrix('X')
X_resh = X.reshape((3, 3))
Y = X_resh ** 2
f = theano.function([X_resh], Y)
print(f(numpy.array([[1, 2], [3, 4]])))
As I understood it, this should give an error since I passed a 2x2 matrix, not a 3x3 one, but it computes the element-wise squares perfectly.
So what is the shape of the theano tensor variable and where should we use it?
There is an error in the provided code, though Theano fails to point this out.
Instead of
f = theano.function([X_resh], Y)
you should really use
f = theano.function([X], Y)
Using the original code, you are actually providing the tensor after the reshape, so the reshape command never gets executed. This can be seen by adding
theano.printing.debugprint(f)
which prints
Elemwise{sqr,no_inplace} [id A] '' 0
|<TensorType(float64, matrix)> [id B]
Note that there is no reshape operation in this compiled execution graph.
If one changes the code so that X is used as the input instead of X_resh then Theano throws an error including the message
ValueError: total size of new array must be unchanged Apply node that
caused the error: Reshape{2}(X, TensorConstant{(2L,) of 3})
This is expected because one cannot reshape a tensor with shape (2, 2) (i.e. 4 elements) into a tensor with shape (3, 3) (i.e. 9 elements).
To address the broader question: we can use symbolic expressions in the target shape, and those expressions can be functions of the input tensor's symbolic shape. Here are some examples:
import numpy
import theano
import theano.tensor
X = theano.tensor.matrix('X')
X_vector = X.reshape((X.shape[0] * X.shape[1],))
X_row = X.reshape((1, X.shape[0] * X.shape[1]))
X_column = X.reshape((X.shape[0] * X.shape[1], 1))
X_3d = X.reshape((-1, X.shape[0], X.shape[1]))
f = theano.function([X], [X_vector, X_row, X_column, X_3d])
for output in f(numpy.array([[1, 2], [3, 4]])):
    print(output.shape, output)
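With the 2x2 input above, the four outputs have shapes (4,), (1, 4), (4, 1) and (1, 2, 2) respectively.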
I am really new to Theano, and I am just trying to figure out some basic functionality. I have a tensor variable x, and I would like the function to return a tensor variable y of the same shape, but filled with the value 0.2. I am not sure how to define y.
For example, if x = [1, 2, 3, 4, 5], then I would like y = [0.2, 0.2, 0.2, 0.2, 0.2].
from theano import tensor, function
y = tensor.dmatrix('y')
masked_array = function([x],y)
There are probably a dozen different ways to do this, and which is best will depend on the context: how this piece of code/functionality fits into the wider program.
Here's one approach:
import theano
import theano.tensor as tt
x = tt.vector()
y = tt.ones_like(x) * 0.2
f = theano.function([x], outputs=y)
print(f([1, 2, 3, 4, 5]))
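Two equivalent alternatives, as a sketch (to my knowledge tt.fill builds a tensor shaped like its first argument and filled with the second, but treat the exact op choice as an assumption):
y2 = tt.zeros_like(x) + 0.2   # broadcasted add onto a zeros tensor shaped like x
y3 = tt.fill(x, 0.2)          # fill a tensor shaped like x with the value 0.2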