torch.matmul gives RuntimeError - pytorch

I have two tensors with these sizes:
t1 = torch.Size([400, 32, 400])
t2 = torch.Size([400, 32, 32])
When I execute
torch.matmul(t1, t2)
I get this error:
RuntimeError: Expected tensor to have size 400 at dimension 1, but got size 32 for
argument #2 'batch2' (while checking arguments for bmm)
Any help will be much appreciated.

You get the error because the operands are in the wrong order: the inner dimensions of a 32 x 400 matrix and a 32 x 32 matrix do not match.
It should be:
import torch
a = torch.randn(400, 32, 400)
b = torch.randn(400, 32, 32)
out = torch.matmul(b, a)  # You performed torch.matmul(a, b)
# The @ operator is a shorter way to write the same multiplication
out = b @ a
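To see why the order matters: torch.matmul treats the leading dimension as a batch and multiplies the trailing two dimensions as ordinary matrices, so the inner sizes must agree for every batch element. A minimal sketch, using smaller sizes than the question's purely for illustration:

```python
import torch

# Scaled-down stand-ins for the question's tensors:
# (400, 32, 400) -> (4, 3, 5) and (400, 32, 32) -> (4, 3, 3).
a = torch.randn(4, 3, 5)   # batch of 4 matrices, each 3 x 5
b = torch.randn(4, 3, 3)   # batch of 4 matrices, each 3 x 3

# (3 x 3) @ (3 x 5) is valid for every batch element.
out = torch.matmul(b, a)
print(out.shape)

# (3 x 5) @ (3 x 3) is not valid: the inner sizes 5 and 3 differ.
try:
    torch.matmul(a, b)
    failed = False
except RuntimeError as err:
    failed = True
    print("RuntimeError:", err)
```

If the product in the original order is what you actually need, transposing the last two dimensions of the first operand (a.transpose(1, 2) in this sketch) is another way to make the inner sizes line up.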


RuntimeError: size mismatch, m1: [192 x 68], m2: [1024 x 68] at /opt/conda/conda-bld/pytorch_/work/aten/src/THC/generic/THCTensorMathBlas.cu:268

I'm getting a size mismatch error that I can't understand.
(Pdb) self.W_di
Linear(in_features=68, out_features=1024, bias=True)
(Pdb) indices.size()
torch.Size([32, 6, 68])
(Pdb) self.W_di(indices)
*** RuntimeError: size mismatch, m1: [192 x 68], m2: [1024 x 68] at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THC/generic/THCTensorMathBlas.cu:268
Why is there a mismatch?
Maybe because of the way I defined the weight in forward (instead of __init__)?
This is how I defined self.W_di:
def forward(self):
    if self.W_di is None:
        self.W_di_weight = nn.Parameter(torch.randn(mL_n * 2, 1024).to(device))
        self.W_di_bias = nn.Parameter(torch.ones(1024).to(device))
        self.W_di = nn.Linear(mL_n * 2, 1024)
        self.W_di.weight = self.W_di_weight
        self.W_di.bias = self.W_di_bias
    result = self.W_di(indices)
Any pointer would be highly appreciated!
See my answer here. In general, you may set
self.W_di = nn.Linear(mL_n * 2, 68)
or increase the in_features.
More generally, this error also shows up in CNNs when the input image is not resized to the size the model expects.
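One thing worth checking in the question's code, offered as a hedged reading rather than a confirmed diagnosis: nn.Linear(in_features, out_features) stores its weight with shape (out_features, in_features), whereas the hand-built parameter in the question has shape (mL_n * 2, 1024). The sketch below assumes mL_n * 2 == 68 (inferred from the printed Linear(in_features=68, out_features=1024, bias=True)) and uses random data in place of indices.

```python
import torch
import torch.nn as nn

in_features, out_features = 68, 1024   # assumption: mL_n * 2 == 68

layer = nn.Linear(in_features, out_features)
print(layer.weight.shape)   # the layer's own layout: (out_features, in_features)

indices = torch.randn(32, 6, in_features)
result = layer(indices)     # the layer is applied over the last dimension
print(result.shape)

# Replacing the weight with an (in_features, out_features) tensor, as the
# question's forward() does, makes the same call fail.
layer.weight = nn.Parameter(torch.randn(in_features, out_features))
try:
    layer(indices)
    mismatch = False
except RuntimeError as err:
    mismatch = True
    print("RuntimeError:", err)
```

Under that reading, creating the layer once in __init__ and leaving its weight alone, or building the custom weight as torch.randn(1024, mL_n * 2), would avoid the mismatch.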

Constructing a multivariate Normal distribution with probabilistic parameters in PyMC3

I want to construct a multivariate Normal model in PyMC3 in which the mean value and precision matrix involve probabilistic variables. h is meant to act as a latent variable in a larger project to which this code snippet belongs.
When I run the code provided below, I get the error message shown, and I'm not sure exactly how to interpret it. As far as I can see, the dimension of the mean value of the MvNormal (2-row column vector) matches the dimension of the precision matrix B (2 x 2 matrix), so I don't expect the dimensions of these objects to be causing the problem. I don't know what other variables could be causing a dimension-related error to be thrown, though. Can anyone shed some light on this, please?
Here is the code:
import pymc3 as pm
import theano.tensor as tt
with pm.Model() as model:
    # A matrix
    a1 = pm.Uniform('a1', 0., 1.)
    a2 = pm.Uniform('a2', 0., 1.)
    ix = ([0, 0, 1, 1], [0, 1, 0, 1])
    A = tt.eye(2)
    A = tt.set_subtensor(A[ix], [a1, a2, 1, 0])
    # B matrix
    b1 = pm.Uniform('b1', 0., 1.)
    b2 = pm.Uniform('b2', 0., 1.)
    ix = ([0, 1], [0, 1])
    B = tt.eye(2)
    B = tt.set_subtensor(B[ix], [b1 ** 2, b2 ** 2])
    # Model
    y0 = pm.Normal('y0', mu=0., sd=1., observed=0)
    y1 = pm.Normal('y1', mu=1., sd=1., observed=1)
    s_v = tt.stack([y1, y0]).T
    h = pm.MvNormal("h", mu=pm.math.dot(A, s_v), tau=B)
Error message:
h = pm.MvNormal("h", mu=pm.math.dot(A, s_v), tau=B)
File "/Users/Joel/PycharmProjects/AR(2)/venv/lib/python3.6/site-packages/pymc3/distributions/distribution.py", line 42, in __new__
return model.Var(name, dist, data, total_size)
File "/Users/Joel/PycharmProjects/AR(2)/venv/lib/python3.6/site-packages/pymc3/model.py", line 809, in Var
total_size=total_size, model=self)
File "/Users/Joel/PycharmProjects/AR(2)/venv/lib/python3.6/site-packages/pymc3/model.py", line 1209, in __init__
self.logp_elemwiset = distribution.logp(self)
File "/Users/Joel/PycharmProjects/AR(2)/venv/lib/python3.6/site-packages/pymc3/distributions/multivariate.py", line 274, in logp
quaddist, logdet, ok = self._quaddist(value)
File "/Users/Joel/PycharmProjects/AR(2)/venv/lib/python3.6/site-packages/pymc3/distributions/multivariate.py", line 85, in _quaddist
raise ValueError('Invalid dimension for value: %s' % value.ndim)
ValueError: Invalid dimension for value: 0
I believe that you are missing the "shape" argument in the pm.MvNormal call, which lets it handle the right size of values. For example, if you have 7 variables, set shape=7.

Tensorflow Check failed: work_element_count > 0

Does anybody know how to deal with Tensorflow 'work_element_count' errors?
F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)
Aborted (core dumped)
Here is part of my source code:
import sys
import numpy as np
import tensorflow as tf
class DiscriminatorModel:
    def __init__(self, session, some_parameters):
        self.sess = session
        self.parameters = some_parameters
    def build_feed_dict(self, input_frames, gt_output_frames, generator):
        feed_dict = {}
        batch_size = np.shape(gt_output_frames)[0]
        print(batch_size) # 1
        print(np.shape(generator.input_frames_train)) # (?,7,32,32,32,1)
        print(np.shape(input_frames)) # (1,7,32,32,32,1)
        print(np.shape(generator.gt_frames_train)) # (?,7,32,32,32,1)
        print(np.shape(gt_output_frames)) # (1,7,32,32,32,1)
        g_feed_dict = {generator.input_frames_train: input_frames,
                       generator.gt_frames_train: gt_output_frames}
        def getshape(d):
            if isinstance(d, dict):
                return {k: getshape(d[k]) for k in d}
            else:
                return None
        print("g_feed_dict shape :", getshape(g_feed_dict), "\n")
        # {<tf.Tensor 'generator/data/Placeholder:0' shape=(?, 32, 32, 32, 1) dtype=float32>: None, <tf.Tensor 'generator/data/Placeholder_1:0' shape=(?, 32, 32, 32, 1) dtype=float32>: None}
        print(sys.getsizeof(generator.scale_preds_train)) # 96
        print(sys.getsizeof(g_feed_dict)) # 288
        # error occurs here.
        g_scale_preds = self.sess.run(generator.scale_preds_train, feed_dict=g_feed_dict)
        # F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)
        # Aborted (core dumped)
    def train_step(self, batch, generator):
        print(np.shape(batch)) # [1, 7, 32, 32, 32, 2]
        input_frames = batch[:, :, :, :, :, :-1]
        gt_output_frames = batch[:, :, :, :, :, -1:]
        feed_dict = self.build_feed_dict(input_frames, gt_output_frames, generator)
class GeneratorModel:
    def __init__(self, session, some_parameters):
        self.sess = session
        self.parameters = some_parameters
        self.input_frames_train = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.gt_frames_train = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.input_frames_test = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.gt_frames_test = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.scale_preds_train = []
        for p in range(4):
            # scale size, 4 --> 8 --> 16 --> 32
            sc = 4*(2**p)
            # this passes tf.Tensor array of shape (1,7,sc,sc,sc,1)
            train_preds = calculate(self.width_train,
                                    self.height_train,
                                    self.depth_train,
                                    ...)
            self.scale_preds_train.append(train_preds)
        # [ <..Tensor shape=(1,7,4,4,4,1) ....>,
        #   <..Tensor shape=(1,7,8,8,8,1) ....>,
        #   <..Tensor shape=(1,7,16,16,16,1)..>,
        #   <..Tensor shape=(1,7,32,32,32,1)..> ]
        print(self.scale_preds_train)
sess = tf.Session()
d_model = DiscriminatorModel(sess, some_parameters)
g_model = GeneratorModel(sess, some_parameters)
sess.run(tf.global_variables_initializer())
# this returns numpy array of shape [1,7,32,32,32,2]
batch = get_batch()
# trouble here.
d_model.train_step(batch, g_model)
I've seen some recommendations:
use CUDA 9.0 / cuDNN 7.0 / tensorflow-gpu 1.7.0 (--> I'm already using these)
check that the batch has size greater than 0 (--> it seems it does)
do not use more GPUs than the number of samples in a batch (--> I do not)
I use a single 11GB GPU out of 5, selected with
~$ CUDA_VISIBLE_DEVICES=2 python3 foo.py
and the batch size is 1.
Can anyone tell me what I'm missing or what I've done wrong?
Edit 1.
I found a case that gets past this error. If I modify the input like this:
# ... previous code does not change
print(sys.getsizeof(g_feed_dict)) # 288
temp_index = 0
temp_input = [generator.scale_preds_train[temp_index],
generator.scale_preds_train[temp_index],
generator.scale_preds_train[temp_index],
generator.scale_preds_train[temp_index]]
# this <temp_input> does not raise error here.
# however temp_index > 0 don't work.
g_scale_preds = self.sess.run(temp_input, feed_dict=g_feed_dict)
This makes the list passed to sess.run have shapes like
[(1,7,4,4,4,1), (1,7,4,4,4,1), (1,7,4,4,4,1), (1,7,4,4,4,1)]
whereas originally it should be a list of scaled shapes like [(1,7,4,4,4,1), (1,7,8,8,8,1), (1,7,16,16,16,1), (1,7,32,32,32,1)].
Also, the arrays in the dictionary feed_dict have shape
(1,7,32,32,32,1).
It seems the error comes from tensorflow-gpu trying to access wrong array indices (where memory is not actually allocated), hence the "work element count is 0" (but I'm not sure yet).
I cannot understand why temp_index > 0 (e.g. 1, 2, 3) throws the same
Check failed error, while 0 is the only index that does not.
Edit 2.
After I changed my GPU from TITAN Xp to GeForce GTX, the error log said
Floating point exception (core dumped)
at the same line (sess.run).
In my case, one of the conv layers has 0 output feature maps, which causes this problem.
I've solved it now.
Just as the GTX error log suggested, a value had become zero, and it was being used as a denominator (so the cause was unrelated to all of the code above). The setup at the final debugging stage was:
CUDA 8.0 / Tensorflow 1.8.0
with the GeForce GTX, of course. I think the log was different (and slightly more detailed) because of the versions rather than the actual GPU, even though changing versions did not by itself solve the problem.
I was training the model on Colab and got the same problem. The issue was 'num_classes': in the config file it was set to 2, while my model had 36 classes.
Pay attention to num_classes in your config file.

No N-dimensional transpose in PyTorch

PyTorch's torch.transpose function only transposes 2D inputs. Documentation is here.
On the other hand, Tensorflow's tf.transpose function allows you to transpose a tensor of N arbitrary dimensions.
Can someone please explain why PyTorch does not/cannot have N-dimension transpose functionality? Is this due to the dynamic nature of the computation graph construction in PyTorch versus Tensorflow's Define-then-Run paradigm?
It's simply called differently in pytorch. torch.Tensor.permute will allow you to swap dimensions in pytorch like tf.transpose does in TensorFlow.
As an example of how you'd convert a 4D image tensor from NHWC to NCHW (not tested, so might contain bugs):
>>> img_nhwc = torch.randn(10, 480, 640, 3)
>>> img_nhwc.size()
torch.Size([10, 480, 640, 3])
>>> img_nchw = img_nhwc.permute(0, 3, 1, 2)
>>> img_nchw.size()
torch.Size([10, 3, 480, 640])
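For comparison, a short sketch with arbitrary small sizes: torch.transpose swaps exactly one pair of dimensions of a tensor of any rank, while permute applies a full reordering in one call. Both return views of the same data, so the result is generally not contiguous.

```python
import torch

x = torch.randn(2, 4, 5, 3)          # NHWC layout with small arbitrary sizes

# transpose swaps one pair of dimensions.
swapped = x.transpose(1, 3)          # shape (2, 3, 5, 4)

# permute gives the full reordering NHWC -> NCHW in one call.
nchw = x.permute(0, 3, 1, 2)         # shape (2, 3, 4, 5)

# The same reordering built from two successive transposes.
via_transpose = x.transpose(2, 3).transpose(1, 2)

print(swapped.shape, nchw.shape)
print(torch.equal(nchw, via_transpose))
print(nchw.is_contiguous())
```

Calling .contiguous() on the permuted tensor makes a copy with the new memory layout, which matters if the next step is .view().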
Einops supports verbose transpositions for an arbitrary number of dimensions:
import torch
from einops import rearrange
x = torch.zeros(10, 3, 100, 100)
y = rearrange(x, 'b c h w -> b h w c')
x2 = rearrange(y, 'b h w c -> b c h w') # inverse to the first
(and the same code works for TensorFlow as well)

Using tf.gather_nd() to select elements of a tensor

I have 2 tensors:
A of shape [146, 33, 559]
B of shape [146, 33]
B contains integers between 0 and 559, which serve as indices.
What I'm after is a tensor C of shape [146, 33] where each element corresponds to the index given by B.
I tried tf.gather_nd(A, B) which gives me the error
InvalidArgumentError (see above for traceback): index innermost dimension length must be <= params rank; saw: 33 vs. 3
[[Node: GatherNd = GatherNd[Tindices=DT_INT64, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_34, _recv_contexts_1_0)]]
I also tried tf.gather(A, B) which gives me the error
InvalidArgumentError (see above for traceback): indices[1,0] = 282 is not in [0, 146)
[[Node: Gather = Gather[Tindices=DT_INT64, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_34, _recv_contexts_1_0)]]
Any idea how to resolve this?
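The first error message points at how tf.gather_nd reads its indices: the innermost dimension of the index tensor is treated as a coordinate into A, so its length (33 here) cannot exceed A's rank (3). One possible way to express the intended lookup, C[i, j] = A[i, j, B[i, j]], is sketched below in NumPy with small made-up shapes. It illustrates the indexing only and is not tested TensorFlow code; it also assumes every entry of B is a valid index into the last axis of A (0 to 558 for the shapes in the question).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3, 7))            # stands in for shape [146, 33, 559]
B = rng.integers(0, 7, size=(4, 3))       # stands in for shape [146, 33]

# Row and column coordinates for every position of B.
i, j = np.meshgrid(np.arange(4), np.arange(3), indexing="ij")

# C[i, j] = A[i, j, B[i, j]]
C = A[i, j, B]

# Equivalent formulation: pick along the last axis, then drop that axis.
C_alt = np.take_along_axis(A, B[..., None], axis=-1)[..., 0]

# Full (i, j, B[i, j]) coordinates, one triple per output element.
triples = np.stack([i, j, B], axis=-1)

print(C.shape, triples.shape)
```

An index tensor laid out like triples, with a last dimension of length 3, is the form that tf.gather_nd expects for a rank-3 A.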
