I found in the official docs that CAddTable should now be done as
x = x1 + x2 # instead of CAddTable(x1, x2) in older version
and PyTorch will handle the rest, such as autograd.
But what if I have multiple tensors, i.e. changing the input above from two tensors to a list of tensors? Could PyTorch still do the same thing?
Just for a clean display of the code snippet from the comment:
x = torch.stack((x1, x2, x3, x4), dim=0)
y = torch.sum(x, dim=0, keepdim=False) # same shape as x1, x2...
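A minimal runnable sketch of the same idea for an arbitrary list of tensors (the tensors here are hypothetical placeholders of identical shape):
import torch

# Hypothetical inputs: any number of tensors with the same shape.
tensors = [torch.randn(3, 5, requires_grad=True) for _ in range(4)]

# Stack along a new leading dimension, then sum that dimension away.
y = torch.stack(tensors, dim=0).sum(dim=0)  # same shape as each input

# Autograd flows through stack/sum like any other op.
y.sum().backward()
print(tensors[0].grad.shape)  # torch.Size([3, 5])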
I created three Convolutional Autoencoders with the same architecture to extract features from some images related to different types of trees.
My code is something like:
model1 = myAutoencoder()
model2 = myAutoencoder()
model3 = myAutoencoder()
opt = keras.optimizers.Adam(learning_rate=0.001)
loss = keras.losses.MeanSquaredError()
model1.compile(optimizer=opt, loss=loss)
model2.compile(optimizer=opt, loss=loss)
model3.compile(optimizer=opt, loss=loss)
Then I train:
# X1, X2, X3 are tensors of 64x64 RGB images, e.g. of shape (100, 64, 64, 3)
model1.fit(X1, X1)
model2.fit(X2, X2)
model3.fit(X3, X3)
However, only the first model is learning, while the second and third are stuck at the same loss, as shown in the figure:
[figure: training loss for the three models]
Interestingly, if I swap the positions of let's say model1 and model2, like this:
model2.fit(X2, X2)
model1.fit(X1, X1)
model3.fit(X3, X3)
then only model 2 is learning and models 1 and 3 are stuck. I cannot figure out why...
edit: The actual training that I am doing is this:
def scheduler(epoch, lr):
    if epoch < 50:
        return lr
    else:
        return lr * np.exp(-0.1)
model2.fit(X2, X2, epochs=100, callbacks=[LearningRateScheduler(scheduler)])
model1.fit(X1, X1, epochs=100, callbacks=[LearningRateScheduler(scheduler)])
model3.fit(X3, X3, epochs=100, callbacks=[LearningRateScheduler(scheduler)])
I figured out that if I delete the callbacks, the learning process is "normal". Is there a reason why the callbacks interfere between models?
I have data of the form
y1, x1
y2, x2
...
I am trying to train a neural network function, say f, such that over a set of pairs (xi, xj), f(xi) - f(xj) is trained to be similar to (yi - yj), i.e. a pairwise loss.
You need to reconstruct the data: for each pair (Xi, Xj) the label will be Yi - Yj. Since the problem is a regression problem, use MSE as the loss function, and that will hopefully lead to what you want.
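As a minimal sketch of that idea (the data, shapes, and network below are hypothetical; it applies a shared Keras base network f to both inputs and trains f(Xi) - f(Xj) against Yi - Yj with MSE):
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical data: n samples with d features and a scalar target each.
n, d = 1000, 8
X = np.random.randn(n, d).astype("float32")
y = np.random.randn(n).astype("float32")

# Build random (i, j) pairs; the pair label is y_i - y_j.
idx_i = np.random.randint(0, n, size=5000)
idx_j = np.random.randint(0, n, size=5000)
pair_labels = y[idx_i] - y[idx_j]

# Shared scoring network f.
f = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(d,)),
    layers.Dense(1),
])

# Siamese wrapper: the model output is f(x_i) - f(x_j).
xi_in = keras.Input(shape=(d,))
xj_in = keras.Input(shape=(d,))
diff = layers.Subtract()([f(xi_in), f(xj_in)])
pair_model = keras.Model([xi_in, xj_in], diff)

pair_model.compile(optimizer="adam", loss="mse")
pair_model.fit([X[idx_i], X[idx_j]], pair_labels, epochs=5, batch_size=64)
After training, f alone can be used to score individual samples.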
I want to swap the features before I feed them to another layer.
I have 4 variables so my input array is of size (#samples, 4)
Let's say the features are: x1, x2, x3, x4
Expected output:
Swapping1: x4, x3, x2, x1
Swapping2: x2, x3, x2, x1
…. etc
Here is what I tried
def toy_model():
    _input = Input(shape=(4,))
    perm = Permute((4, 3, 2, 1))(_input)
    dense = Dense(1024)(perm)
    output = Dense(1)(dense)
    model = Model(inputs=_input, outputs=output)
    return model
toy_model().summary()
ValueError: Input 0 is incompatible with layer permute_58: expected ndim=5, found ndim=2
However, the Permute layer expects multi-dimensional arrays to permute, so it does not do the job here.
Is there any way to solve this in Keras?
I also tried to feed the following function as a Lambda layer, and I get an error:
def permutation(x):
    x = keras.backend.eval(x)
    permutation = [3, 2, 1, 0]
    idx = np.empty_like(x)
    idx[permutation] = np.arange(len(x))
    permutated = x[:, idx]
    return K.constant(permutated)
ValueError: Layer dense_93 was called with an input that isn't a symbolic tensor. Received type:
<class 'keras.layers.core.Lambda'>. Full input: [<keras.layers.core.Lambda object at
0x7f20a405f710>]. All inputs to the layer should be tensors.
Use a Lambda layer with some backend function or with slices + concat.
4, 3, 2, 1:
perm = Lambda(lambda x: tf.reverse(x, axis=[-1]))(_input)
2, 3, 2, 1:
def perm_2321(x):
    x1 = x[:, 0]
    x2 = x[:, 1]
    x3 = x[:, 2]
    return tf.stack([x2, x3, x2, x1], axis=-1)
perm = Lambda(perm_2321)(_input)
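For completeness, a rough sketch of the first option wired into the toy model (assuming a TensorFlow backend; the layer sizes just mirror the question):
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, Lambda
from tensorflow.keras.models import Model

def toy_model():
    _input = Input(shape=(4,))
    # Reverse the feature order: x4, x3, x2, x1.
    perm = Lambda(lambda x: tf.reverse(x, axis=[-1]))(_input)
    dense = Dense(1024)(perm)
    output = Dense(1)(dense)
    return Model(inputs=_input, outputs=output)

toy_model().summary()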
Using PyTorch, I would like to calculate the Hessian vector product, where the Hessian is the second-derivative matrix of the loss function of some neural net, and the vector will be the vector of gradients of that loss function.
I know how to calculate the Hessian vector product for a regular function thanks to this post. However, I am running into trouble when the function is the loss function of a neural network. This is because the parameters are packaged into a module, accessible via nn.Module.parameters(), and not a single torch tensor.
I want to do something like this (doesn't work):
### a simple neural network
linear = nn.Linear(10, 20)
x = torch.randn(1, 10)
y = linear(x).sum()
### compute the gradient and make a copy that is detached from the graph
grad = torch.autograd.grad(y, linear.parameters(),create_graph=True)
v = grad.clone().detach()
### compute the Hessian vector product
z = grad @ v
z.backward()
In analogy to this (which does work):
x = Variable(torch.Tensor([1, 1]), requires_grad=True)
f = 3*x[0]**2 + 4*x[0]*x[1] + x[1]**2
grad, = torch.autograd.grad(f, x, create_graph=True)
v = grad.clone().detach()
z = grad @ v
z.backward()
This post addresses a similar (possibly the same?) issue, but I don't understand the solution.
You say it doesn't work, but you do not show what error you get; this may be why you haven't gotten any answers.
torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False)
outputs and inputs are expected to be sequences of tensors. But you
use just a tensor as outputs.
What this is saying is that you should pass a sequence, so pass [y] instead of y
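Putting it together, a minimal sketch of the Hessian-vector product over a module's parameters might look like this (the loss is made nonlinear so the second derivative is not identically zero; reducing the grad-times-v products to a single scalar is one way to deal with the parameter tuple):
import torch
import torch.nn as nn

# A small network and a nonlinear scalar loss.
linear = nn.Linear(10, 20)
x = torch.randn(1, 10)
loss = (linear(x) ** 2).sum()

# First-order gradients w.r.t. all parameters, keeping the graph.
grads = torch.autograd.grad(loss, linear.parameters(), create_graph=True)

# The vector v: a detached copy of the gradients.
v = [g.clone().detach() for g in grads]

# Scalar dot product <grads, v>, summed over every parameter tensor.
z = sum((g * vi).sum() for g, vi in zip(grads, v))

# Backprop through z; the Hessian-vector product lands in each p.grad.
z.backward()
for p in linear.parameters():
    print(p.grad.shape)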
I have a dataset with features and their labels.
it looks like this:
X1, X2, X3, X4, X5 .. Xn L1, L2, L3
Y1, Y2, Y3, Y4, Y5 .. Yn L5, L2
..
I want to train a KNeighborsClassifier on this dataset. It seems like sklearn does not take multilabels. I have been trying this:
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(Y)
# parameters: n_neighbors=[5,15], weights = 'uniform', 'distance'
bagging = BaggingClassifier(KNeighborsClassifier(n_neighbors =5,weights ='uniform'), max_samples = 0.6, max_features= 0.7, verbose =1, oob_score =True)
scores = cross_val_score(bagging, X, Y, verbose =1, cv=3, n_jobs=3, scoring='f1_macro')
It is giving me ValueError: bad input shape
Is there a way I can run a multilabel classifier in sklearn?
According to the sklearn documentation, the classifiers that support multioutput-multiclass classification tasks are:
Decision Trees, Random Forests, Nearest Neighbors
Since you have a binary matrix for your labels, you can use OneVsRestClassifier to make your BaggingClassifier handle multilabel predictions. Code should now look like:
bagging = BaggingClassifier(KNeighborsClassifier(n_neighbors=5, weights='uniform'), max_samples=0.6, max_features=0.7, verbose=1, oob_score=True)
clf = OneVsRestClassifier(bagging)
scores = cross_val_score(clf, X, Y, verbose=1, cv=3, n_jobs=3, scoring='f1_macro')
You can use the OneVsRestClassifier with any of the sklearn models to do multilabel classification.
Here's an explanation:
http://scikit-learn.org/stable/modules/multiclass.html#one-vs-the-rest
And here are the docs:
http://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html
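For a self-contained check, a hypothetical end-to-end version with synthetic multilabel data (the dataset and parameters here are illustrative, not from the question) could look like:
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data: Y is already a binary indicator matrix, as after MultiLabelBinarizer.
X, Y = make_multilabel_classification(n_samples=300, n_features=20, n_classes=5, random_state=0)

bagging = BaggingClassifier(KNeighborsClassifier(n_neighbors=5, weights='uniform'),
                            max_samples=0.6, max_features=0.7)
clf = OneVsRestClassifier(bagging)

scores = cross_val_score(clf, X, Y, cv=3, scoring='f1_macro')
print(scores)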
For anybody who finds this while looking for multi-label KNN (MLkNN) options, I would recommend using skmultilearn, which is built on top of sklearn and so is easy to use if you are familiar with that package.
Documentation here. This example is from the documentation:
from skmultilearn.adapt import MLkNN
classifier = MLkNN(k=3)
# train
classifier.fit(X_train, y_train)
# predict
predictions = classifier.predict(X_test)