Pytorch Parameterized Layer not Updated

Pytorch Parameterized Layer not Updated - pytorch

I have a problem here, so I want to make a layer where the weight value (and the bias) is based on the other frozen weight. So, let’s say I have a frozen weight (FW) as a base value, then my current model layer will have weight W = FW + D, where D is the trainable parameter. Later, when I train the model, I hope the only parameter that gets updated is D.
I made this simple code for illustration:
frozen = nn.Linear(100,10)
frozen.weight.requires_grad = False
frozen.bias.requires_grad = False
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.fc = nn.Linear(100,10)
self.dw = nn.Parameter(torch.tensor(1.0, requires_grad=True))
self.db = nn.Parameter(torch.tensor(1.0, requires_grad=True))
def forward(self, x):
# the weight (and the bias) of fc layer is from FW and D
self.fc.weight = nn.Parameter(torch.add(frozen.weight, self.dw))
self.fc.bias = nn.Parameter(torch.add(frozen.bias, self.db))
return torch.sigmoid(self.fc(x))
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.rand(100)
y = torch.tensor([0]*9+[1], dtype=torch.float32)
for _ in range(10):
out = model(x)
loss = criterion(out, y)
print(loss)
optimizer.zero_grad()
loss.backward()
optimizer.step()
But when I run that code, the model doesn’t train, and the self.dw and self.db doesn’t change. I am not sure whether my concept is wrong, so it’s not possible to train D, or I made a mistake in the implementation.
I also tried to implement using nn.utils.parameterize, but it still doesn’t work (I am new to using this, so I am not sure I implemented it correctly)
frozen = nn.Linear(100,10)
frozen.weight.requires_grad = False
frozen.bias.requires_grad = False
class Adder(nn.Module):
def __init__(self, delta, frozen):
super().__init__()
self.delta = nn.Parameter(torch.tensor(delta, requires_grad=True))
self.frozen=frozen
def forward(self, x):
return torch.add(self.frozen, self.delta)
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.fc = nn.Linear(100,10)
def forward(self, x):
nn.utils.parametrize.register_parametrization(self.fc, "weight", Adder(1.0, frozen.weight))
nn.utils.parametrize.register_parametrization(self.fc, "bias", Adder(1.0, frozen.bias))
return torch.sigmoid(self.fc(x))
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.rand(100)
y = torch.tensor([0]*9+[1], dtype=torch.float32)
for _ in range(10):
out = model(x)
loss = criterion(out, y)
print(loss)
optimizer.zero_grad()
loss.backward()
optimizer.step()
Thank you for any responses.

Instead of recreating new weight and bias by
self.fc.weight = nn.Parameter(torch.add(frozen.weight, self.dw))
self.fc.bias = nn.Parameter(torch.add(frozen.bias, self.db))
You can utilize nn.functional.linear and intermediate variables
weight = self.weight + frozen.weight
bias = self.bias + frozen.bias
F.linear(x, weight, bias)
Complete version:
import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
def __init__(self, frozen):
super(Net, self).__init__()
self.weight = nn.Parameter(torch.ones(10, 100, dtype=torch.float32))
self.bias = nn.Parameter(torch.zeros(10, dtype=torch.float32))
self.frozen = frozen
#property
def weight_bias(self):
weight = self.weight + self.frozen.weight
bias = self.bias + self.frozen.bias
return weight, bias
def forward(self, x):
# the weight (and the bias) of fc layer is from FW and D
weight, bias = self.weight_bias
return F.linear(x, weight, bias) # this should return raw logits as required by nn.CrossEntropyLoss
frozen = nn.Linear(100, 10)
frozen.weight.requires_grad = False
frozen.bias.requires_grad = False
model = Net(frozen)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.rand(100).unsqueeze(0)
y = torch.tensor([0]*9+[1], dtype=torch.float32).unsqueeze(0)
for _ in range(10):
out = model(x)
loss = criterion(out, y)
print(loss)
optimizer.zero_grad()
loss.backward()
optimizer.step()

Related

LSTM multi-sequence input to one sequence output

I am new with neural networks and am currently trying to make an LSTM model that predicts an output sequence based on multiple parameters. Excuse my ignorance and dummyness in advance.
I have obtained training and validation datasets, which look somewhat like the following:
For every ID four rows are recorded, which uses columns holding certain parameters and the corresponding Y output. Practically, there are thus ~122,000 / 4 = ~30,500 samples (I mistakenly put 122,000 as ID, it is in fact the number of rows). Since the parameter values and the corresponding Y values follow temporal patterns, I am interested if a model such as LSTM improves the prediction.
I want to predict the Y in my validation dataset (~73,000/4 = ~18,000 samples), based on the temporal patterns of the parameters. But is this possible? Most tutorials I followed use a single sequence, for which an LSTM is used to extend a similar input sequence. I thus want an LSTM with 'multi-sequence' input, which outputs one sequence. How do I go about this?
I'm using PyTorch as framework. A simple LSTM model I created using a tutorial, which would not incorporate the parameters:
training_y = traindf.reset_index()['Y']
validation_y = traindf.reset_index()['Y']
Then create a dataset for this:
class YDataset(Dataset):
def __init__(self, data, seq_len = 100):
self.data = data
self.data = torch.from_numpy(data).float().view(-1)
self.seq_len = seq_len
def __len__(self):
return len(self.data)-self.seq_len-1
def __getitem__(self,index):
return self.data[index : index+self.seq_len] , self.data[index+self.seq_len]
train_y = YDataset(training_y_df)
vali_y = YDataset(validation_y_df)
batch_size = 64
train_dataloader = DataLoader(train_y, batch_size, drop_last=True)
vali_dataloader = DataLoader(vali_y, batch_size, drop_last=True)
device = "cuda" if torch.cuda.is_available() else "cpu"
Then create the model:
class Lstm_model(nn.Module):
def __init__(self, input_dim, hidden_size, num_layers):
super(Lstm_model, self).__init__()
self.num_layers = num_layers
self.input_size = input_dim
self.hidden_size = hidden_size
self.lstm = nn.LSTM(input_size=input_dim, hidden_size = hidden_size, num_layers = num_layers)
self.fc = nn.Linear(hidden_size, 1)
def forward(self,x,hn,cn):
out , (hn,cn) = self.lstm(x, (hn, cn))
final_out = self.fc(out[-1])
return final_out, hn,cn
def predict(self,x):
hn, cn = self.init()
final_out = self.fc(out[-1])
return final_out
def init(self):
h0 = torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device)
c0 = torch.zeros(self.num_layers, batch_size, self.hidden_size).to(device)
return h0 , c0
input_dim = 1
hidden_size = 50
num_layers = 3
model = Lstm_model(input_dim , hidden_size , num_layers).to(device)
Loss function and training loop (more or less same as for validation):
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
def train(dataloader):
hn, cn = model.init()
model.train()
for batch , item in enumerate(dataloader):
x , y = item
x = x.to(device)
y = y.to(device)
out , hn , cn = model(x.reshape(100,batch_size,1),hn,cn)
loss = loss_fn(out.reshape(batch_size), y)
hn = hn.detach()
cn = cn.detach()
optimizer.zero_grad()
loss.backward()
optimizer.step()
if batch == len(dataloader)-1:
loss = loss.item
print(f"train loss: {loss:>7f} ")
Epochs and loss metrics:
epochs = 200 # Takes really long for me
for epoch in range(epochs):
print(f"epoch {epoch} ")
train(train_dataloader)
test(vali_dataloader)
Final metrics:
import math
from sklearn.metrics import mean_squared_error
import numpy as np
def calculate_metrics(data_loader):
pred_arr = []
y_arr = []
with torch.no_grad():
hn , cn = model.init()
for batch , item = in enumerate(data_loader):
x , y = item
x , y = x.to(device) , y.to(device)
x = x.view(100,64,1)
pred = model(x, hn, cn)[0]
pred = scalar.inverse_transform(pred.detach().cpu().numpy().reshape(-1))
y = scalar.inverse_transform(y.detach().cpu().numpy().reshape(1,-1)).reshape(-1)
pred_arr = pred_arr + list(pred)
y_arr = y_arr + list(y)
return math.sqrt(mean_squared_error(y_arr,pred_arr))
I used this code more as an example of how LSTM would work. Nevertheless, I don't know if this is the right track for me. Does someone know what I should do or a tutorial that does work for my example? Thanks in advance!

How to extract the encoded features after running a PyTorch LSTM autoencoder model?

I am very new to PyTorch and Python in general, and I am now struggling to get the encoded features from my pre-trained LSTM autoencoder which can be seen below:
import torch
import torch.nn as nn
# Bulding an LSTM autoencoder
class Encoder(nn.Module):
def __init__(self, seq_len, n_features, embedding_dim=32):
super(Encoder, self).__init__()
self.seq_len, self.n_features = seq_len, n_features
self.embedding_dim, self.hidden_dim1, self.hidden_dim2 = embedding_dim, 4 * embedding_dim, 2* embedding_dim
self.rnn1 = nn.LSTM(
input_size=n_features,
hidden_size=self.hidden_dim1, #128
num_layers=1,
batch_first=True
)
self.rnn2 = nn.LSTM(
input_size=self.hidden_dim1,
hidden_size=self.hidden_dim2, #64
num_layers=1,
batch_first=True
)
self.rnn3 = nn.LSTM(
input_size=self.hidden_dim2,
hidden_size=embedding_dim, #32
num_layers=1,
batch_first=True
)
def forward(self, x):
x = x.reshape((1, self.seq_len, self.n_features))
x, (_, _) = self.rnn1(x)
x, (_, _) = self.rnn2(x)
x, (hidden_n, _) = self.rnn3(x)
return hidden_n.reshape((self.n_features, self.embedding_dim))
class Decoder(nn.Module):
def __init__(self, seq_len, input_dim=32, n_features=1):
super(Decoder, self).__init__()
self.seq_len, self.input_dim = seq_len, input_dim
self.hidden_dim2, self.hidden_dim1, self.n_features = 4 * input_dim,2 * input_dim, n_features
self.rnn1 = nn.LSTM(
input_size=input_dim,
hidden_size=input_dim,
num_layers=1,
batch_first=True
)
self.rnn2 = nn.LSTM(
input_size=input_dim,
hidden_size=self.hidden_dim1,
num_layers=1,
batch_first=True
)
self.rnn3 = nn.LSTM(
input_size=self.hidden_dim1,
hidden_size=self.hidden_dim2,
num_layers=1,
batch_first=True
)
self.output_layer = nn.Linear(self.hidden_dim2, n_features)
def forward(self, x):
x = x.repeat(self.seq_len, self.n_features)
x = x.reshape((self.n_features, self.seq_len, self.input_dim))
x, (hidden_n, cell_n) = self.rnn1(x)
x, (hidden_n, cell_n) = self.rnn2(x)
x, (hidden_n, cell_n) = self.rnn3(x)
x = x.reshape((self.seq_len, self.hidden_dim2))
return self.output_layer(x)
class RAE(nn.Module):
def __init__(self,seq_len, n_features, embedding_dim=32):
super(RAE, self).__init__()
self.seq_len, self.n_features = seq_len, n_features
self.embedding_dim = embedding_dim
self.encoder = Encoder (seq_len, n_features, embedding_dim).to(device)
self.decoder = Decoder (seq_len, embedding_dim, n_features).to(device)
def forward(self,x):
x = self.encoder(x)
x = self.decoder(x)
return x
### TRAINING
def train_model(model,train_dataset,val_dataset, n_epochs):
optimizer = torch.optim.Adam(model.parameters(), lr = 1e-3)
criterion = nn.MSELoss(reduction='mean').to(device) # nn.L1Loss sum
history = dict(train = [], val = [])
for epoch in range(1, n_epochs + 1):
model = model.train()
train_losses = []
for seq_true in train_dataset:
optimizer.zero_grad()
seq_true = seq_true.to(device)
seq_pred = model(seq_true)
loss = criterion(seq_pred, seq_true)
loss.backward()
optimizer.step()
train_losses.append(loss.item())
val_losses = []
model = model.eval()
with torch.no_grad():
for seq_true in val_dataset:
seq_true = seq_true.to(device)
seq_pred =model(seq_true)
loss = criterion(seq_pred, seq_true)
val_losses.append(loss.item())
#add accuracy
train_loss = np.mean(train_losses)
val_loss = np.mean(val_losses)
history['train'].append(train_loss)
history['val'].append(val_loss)
print(f'Epoch {epoch}: train loss {train_loss} val loss {val_loss}')
return model.eval(),history
Once I trained my model I followed the advice given by ptrblck here and implemented it as follows:
activation = {}
def get_activation(name):
def hook(model, input, output):
activation[name] = output.detach()
return hook
model.encoder.register_forward_hook(get_activation('encoder'))
x = test_dataset_SR[1] # instead of using his random example I used one example from my training set
x = x.cuda()
output = model(x)
print(activation['encoder'])
but this gives me this error:
2 def get_activation(name):
3 def hook(model, input, output):
----> 4 activation[name] = output.detach()
5 return hook
AttributeError: 'tuple' object has no attribute 'detach'
Can you please help me solve this issue? I want to take these encoded features, store them and use them as input to another network. I know I could probably train the encoder separately(not sure), but I will need both encoder and decoder so I thought hooks will be my salvation.

Custom layer caused the batch dimension mismatch in pytorch. How to fix this problem?

I have tried to train a GCN model.I defined the custom layer I needed. However, It cause some dimension mismatch when I do some batch training.
the codes are as following :
import math
import numpy as np
import torch
import torch.nn as nn
from torch.nn.parameter import Parameter
from torch.nn.modules.module import Module
import torch.nn.functional as F
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from tqdm import tqdm
# =============================================================================
# model define
# =============================================================================
class GraphConvolution(Module):
"""
Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
"""
def __init__(self, in_features, out_features, bias=True):
super(GraphConvolution, self).__init__()
self.in_features = in_features
self.out_features = out_features
self.weight = Parameter(torch.FloatTensor(in_features, out_features))
if bias:
self.bias = Parameter(torch.FloatTensor(out_features))
else:
self.register_parameter('bias', None)
self.reset_parameters()
def reset_parameters(self):
stdv = 1. / math.sqrt(self.weight.size(1))
self.weight.data.uniform_(-stdv, stdv)
if self.bias is not None:
self.bias.data.uniform_(-stdv, stdv)
def forward(self, input, adj):
support = torch.mm(input, self.weight)
output = torch.spmm(adj, support)
if self.bias is not None:
return output + self.bias
else:
return output
def __repr__(self):
return self.__class__.__name__ + ' (' \
+ str(self.in_features) + ' -> ' \
+ str(self.out_features) + ')'
class GCN(nn.Module):
def __init__(self, nfeat, nhid, nclass):
super(GCN, self).__init__()
self.gc1 = GraphConvolution(nfeat, nhid)
self.gc2 = GraphConvolution(nhid, nclass)
self.linear = nn.Linear(nclass, 1)
# self.dropout = dropout
def forward(self, x, adj):
x = F.relu(self.gc1(x, adj))
# x = F.dropout(x, self.dropout, training=self.training)
x = F.relu(self.gc2(x, adj))
x = self.linear(x)
return x
def train(dataloader, model, loss_fn, optimizer,adj):
size = len(dataloader.dataset)
model.train()
for batch, (X, y) in enumerate(dataloader):
X, y = X.to(device), y.to(device)
# Compute prediction error
pred = model(X,adj)
loss = loss_fn(pred, y)
# Backpropagation
optimizer.zero_grad()
loss.backward()
optimizer.step()
if batch % 100 == 0:
loss, current = loss.item(), batch * len(X)
print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
def test(dataloader, model, loss_fn,adj):
size = len(dataloader.dataset)
num_batches = len(dataloader)
model.eval()
test_loss, correct = 0, 0
with torch.no_grad():
for X, y in dataloader:
X, y = X.to(device), y.to(device)
pred = model(X,adj)
test_loss += loss_fn(pred, y).item()
# correct += (pred.argmax(1) == y).type(torch.float).sum().item()
test_loss /= num_batches
# correct /= size
# Accuracy: {(100*correct):>0.1f}%,
print(f"Test Error: \n Avg loss: {test_loss:>8f} \n")
when I run the code :
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
model = GCN(1,1,1).to(device)
print(model)
# model(X).shape
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
epochs = 10
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train(train_dataloader, model, loss_fn, optimizer,Adjacency_matrix)
test(test_dataloader, model, loss_fn,Adjacency_matrix)
print("Done!")
I got the error :
when I looking inside this ,I find the model is working well when I drop the dimension of batch-size. How I need to do to tell the model that this dimension is the batch-size which don't need to compute?

the error you're seeing is due to you trying to matrix multiple a 3d tensor (your input) by your 2D weights.
To get around this you can simply reshape your data, as we only really care about the last dim when doing matmuls:
def forward(self, input, adj):
b_size = input.size(0)
input = input.view(-1, input.shape[-1])
support = torch.mm(input, self.weight)
output = torch.spmm(adj, support)
output = output.view(b_size,-1,output.shape[-1])
if self.bias is not None:
return output + self.bias
else:
return output

Training loss is not changing at all while training model

I’m trying to solve a VQA classification problem. my training loss is not changing at all while training the model.
I put in comment the CNN model and try to run it with the text only, but still, no change in loss value.
I pass through those models:
class question_lstm(nn.Module):
def __init__(self, input_dim, emb_dim, hid_dim, n_layers, dropout, output_dim, que_size):
super(question_lstm, self).__init__()
self.hid_dim = hid_dim
self.n_layers = n_layers
self.embedding = nn.Embedding(input_dim, emb_dim)
self.tanh = nn.Tanh()
self.lstm = nn.LSTM(emb_dim, hid_dim, n_layers, dropout = dropout)
self.dropout = nn.Dropout(dropout)
#self.fc1=nn.Linear(n_layers*hid_dim,que_size)
self.fc1=nn.Linear(n_layers*output_dim,que_size)
def forward(self, question):
emb_question=self.embedding(question) #(batchsize, input_dim, emb_dim=256)
emb_question=self.dropout(emb_question)
emb_question=self.tanh(emb_question)
emb_question = emb_question.transpose(0, 1) #(input_dim, batchsize, emb_dim)
output, (hidden, cell) = self.lstm(emb_question)
qu_feature = torch.cat((hidden, cell), dim=2)
qu_feature = qu_feature.transpose(0, 1) #(batchsize=100, num_layer=2, hid_dim=2048)
question_output =self.fc1(qu_feature)
return question_output
class vqamodel(nn.Module):
def __init__(self, output_dim,input_dim, emb_dim, hid_dim, n_layers, dropout, answer_len, que_size,):
super(vqamodel,self).__init__()
#self.image=img_CNN(img_size,image_feature)
self.question=question_lstm(input_dim, emb_dim, hid_dim, n_layers, dropout,output_dim,que_size)
self.tanh=nn.Tanh()
self.relu=nn.ReLU()
self.dropout=nn.Dropout(dropout)
self.fc1=nn.Linear(que_size,output_dim)
self.fc2=nn.Linear(output_dim,answer_len)
def forward(self, image, question):
question_emb=self.question(question)
combine =question_emb #*img_emb
out_feature=self.fc1(combine) #(batchsize=100, output_dim=2048)
out_feature=self.relu(out_feature)
out_feature=self.dropout(out_feature)
out_feature=self.fc2(out_feature) #(batchsize=100, answer_len=1000)
return (out_feature)
I’m using cross entropy loss and Adam:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(vqa_model.parameters(),lr=0.001)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
any idea what can cause this constant loss value?
the train loop:
def train(model,criterion,optimizer,scheduler):
start_time = time.time() #the time we start the train
for epoch in range(num_epochs):
train_loss = 0
#test_loss = 0
train_correct = 0
#test_correct = 0
vqa_model.train()
for i,sample in enumerate(train_VQAdataset_loader):
#image = sample['image'].to(device=device)
question = sample['question'].to(torch.int64).to(device=device)
label = sample['answer'].to(device=device)
output = vqa_model(image, question) # forward
loss = criterion(output, label)
optimizer.zero_grad() # Zero the gradients
loss.backward() # backprop
optimizer.step() # Update weights
scheduler.step()
# Statitcs
train_loss += loss.item() # save the loss for the entire epoch
_, predictions = torch.max(output, 1)
train_correct += (predictions == label).sum() #number of success - cumulative
train_losses.append(train_loss / len(train_VQAdataset_loader))

Building Autoencoder with Softmax classifier - Input mismatch error

I am trying to train an auto-encoder with a softmax classifier to replicate the results in this paper Intriguing properties of neural networks.
My implementation is the following:
n_embedded = 400
class AE400_10(nn.Module):
def __init__(self):
super(AE400_10, self).__init__()
self.encoder = nn.Sequential(nn.Linear(28*28, n_embedded), nn.Sigmoid())
self.decoder = nn.Sequential(nn.Linear(n_embedded, 28*28))
self.classifier = nn.Sequential(nn.Linear(28*28, 10))
def forward(self, x):
x = x.view(-1, 28*28)
encoded = self.encoder(x)
decoded = self.decoder(encoded)
out = self.classifier(decoded) ##NEW UPDATED
return decoded, F.log_softmax(out)
For the training I have the following:
model = AE400_10().to(device)
criterion1 = nn.MSELoss()
criterion2 = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
for epoch in range(epochs):
total_batch = len(train_set) // batch_size_train
for batch_idx, (data, target) in enumerate(MNSIT_train):
X = data.to(device)
Y = target.to(device)
optimizer.zero_grad()
decoded, out = model(X)
loss1 = criterion1(decoded, inputs)
loss2 = criterion2(out, labels)
loss = loss1 + loss2
loss.backward()
optimizer.step()
if (batch_idx+1) % 100 == 0:
print('Epoch [%d/%d], lter [%d/%d], Loss: %.4f'%(epoch+1, epochs, batch_idx+1, total_batch, cost.item()))
But I am getting the following error:
RuntimeError: size mismatch, m1: [128 x 400], m2: [784 x 10] at
/Users/soumith/mc3build/conda-bld/pytorch_1549593514549/work/aten/src/TH/generic/THTensorMath.cpp:940
I understand this is an error in the dimension but I am not sure why it is happening.
::UPDATE::
I fixed the input to the classifier based on the comments below and now I am getting the following error:
RuntimeError: The size of tensor a (784) must match the size of tensor
b (28) at non-singleton dimension 3

I don't use nn.Sequential so I'm not sure why exactly this happens but if you
replace
self.classifier = nn.Sequential(nn.Linear(28*28, 10))
with
self.classifier = nn.Linear(28*28, 10)
your code works
-->
import torch
import torch.nn as nn
import torch.nn.functional as F
n_embedded = 400
class AE400_10(nn.Module):
def __init__(self):
super(AE400_10, self).__init__()
self.encoder = nn.Sequential(nn.Linear(28*28, n_embedded), nn.Sigmoid())
self.decoder = nn.Sequential(nn.Linear(n_embedded, 28*28))
self.test = nn.Linear(28*28, 10)
self.classifier = nn.Sequential(nn.Linear(28*28, 10))
def forward(self, x):
x = x.view(-1,28*28)
encoded = self.encoder(x)
decoded = self.decoder(encoded)
out = self.classifier(decoded)
return decoded, F.log_softmax(out)
x = torch.ones(128,28,28)
model = AE400_10()
model(x)

instead of encoded out = self.classifier(encoded)
put decoded as input of classifier
out = self.classifier(decoded)
I think, here is why you are getting the mismatch, because the classifier is expecting a tensor of 28 *28 as input as defined in your code.
Then,when calling the criterions:
loss1 = criterion1(decoded, X)
loss2 = criterion2(out, Y)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Pytorch Parameterized Layer not Updated - pytorch

Related

LSTM multi-sequence input to one sequence output

How to extract the encoded features after running a PyTorch LSTM autoencoder model?

Custom layer caused the batch dimension mismatch in pytorch. How to fix this problem?

Training loss is not changing at all while training model

Building Autoencoder with Softmax classifier - Input mismatch error

Categories

Resources