Pytorch - skip calculating features of pretrained models for every epoch - keras

I am used to working with TensorFlow/Keras, but I now have to switch to PyTorch for flexibility reasons. However, I can't seem to find PyTorch code that focuses on training only the classification layer of a model. Is that not common practice? As it stands, I have to wait for the features of the same data to be recomputed in every epoch. Is there a way to avoid that?
# in tensorflow - keras :
from tensorflow.keras.applications import vgg16, MobileNetV2, mobilenet_v2
# Load a pre-trained
pretrained_nn = MobileNetV2(weights='imagenet', include_top=False, input_shape=(Image_size, Image_size, 3))
# Extract features of the training data only once
X = mobilenet_v2.preprocess_input(X)
features_x = pretrained_nn.predict(X)
# Save features for later use
joblib.dump(features_x, "features_x.dat")
# Create a model and add layers
model = Sequential()
model.add(Flatten(input_shape=features_x.shape[1:]))
model.add(Dense(100, activation='relu', use_bias=True))
model.add(Dense(Y.shape[1], activation='softmax', use_bias=False))
# Compile & train only the fully connected model
model.compile( loss="categorical_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=0.001))
history = model.fit( features_x, Y_train, batch_size=16, epochs=Epochs)

Assuming you already have the features in features_x, you can do something like this to create and train the model:
import torch

# create a loader for the data (features_x and Y_train must be torch tensors)
dataset = torch.utils.data.TensorDataset(features_x, Y_train)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)
# define the classification model
in_features = features_x.flatten(1).size(1)
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=in_features, out_features=100, bias=True),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=100, out_features=Y.shape[1], bias=False)  # Softmax is handled by CrossEntropyLoss below
)
model.train()
# define the optimizer and loss function
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_function = torch.nn.CrossEntropyLoss()
# training loop
for e in range(Epochs):
    for batch_x, batch_y in loader:  # the loader yields (inputs, targets) pairs
        optimizer.zero_grad()  # clear gradients from previous batch
        out = model(batch_x)  # forward pass
        loss = loss_function(out, batch_y)  # compute loss
        loss.backward()  # backpropagate, get gradients
        optimizer.step()  # update model weights
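The answer above starts from an existing features_x. As a rough sketch of how those features might be computed once and cached in PyTorch, the snippet below runs a frozen network over the data under torch.no_grad() and saves the result. The backbone here is a small hypothetical stand-in, not a real pretrained model, and the data, the file name and the extract_features helper are all placeholders introduced for illustration.

```python
import os
import tempfile

import torch

# Hypothetical stand-in for a pretrained feature extractor; in practice this
# would be a pretrained network with its classification head removed.
backbone = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(4),
)

# Placeholder data: 32 RGB images of 16x16 pixels
X = torch.randn(32, 3, 16, 16)


def extract_features(model, inputs, batch_size=8):
    """Run the frozen model once over all inputs and return the stacked outputs."""
    model.eval()  # put dropout / batch-norm layers in inference mode
    chunks = []
    with torch.no_grad():  # no autograd graph is needed for extraction
        for start in range(0, inputs.size(0), batch_size):
            chunks.append(model(inputs[start:start + batch_size]))
    return torch.cat(chunks)


# Assumed cache location (any writable path would do)
cache_path = os.path.join(tempfile.gettempdir(), "features_x.pt")

features_x = extract_features(backbone, X)
torch.save(features_x, cache_path)   # cache to disk, like joblib.dump in the Keras version
features_x = torch.load(cache_path)  # later runs can start from here
```

The cached tensor can then be passed to TensorDataset, so each epoch only runs the small classifier.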

Related

How to make a prediction on a RNN without training it every time [duplicate]

This question already has answers here:
How to save/restore a model after training?
(29 answers)
Closed 6 months ago.
I am new to neural networks and have successfully trained an RNN, but training takes a while. It would not be feasible to retrain the model every time I want to make a prediction. So the question is: how do I make the trained model persistent, so that the RNN does not have to be trained again every time I make a prediction?
This is the code I am using...
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras.utils.data_utils import pad_sequences
class Predict:
    TrainingData = None
    XScale = None
    def __init__(self,PredData,TrainingData):
        PreparedData = self.PrepareData(PredData,TrainingData)
        #set the training data
        self.Train(PreparedData)
        #split all of the training data into their respective variables
        X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(self.XScale, self.Y, test_size = 0.6)
        X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size = 0.6)
        #define the model
        self.DefineModel(X_train, Y_train, X_val, Y_val)
        #do the prediction (__init__ cannot return a value; the result is stored in self.Prediction)
        self.predict(PredData)
    def PrepareData(self, PredData,TrainingData):
        LNumber = 0
        for I in TrainingData:
            if(len(I) > LNumber):
                LNumber = len(I)
        PadData = pad_sequences(TrainingData, maxlen = LNumber, padding = 'post', truncating = 'post')
        return PadData
    def Train(self,TrainData):
        min_max_scaler = preprocessing.MinMaxScaler()
        self.X = TrainData[0:10]
        self.Y = TrainData[-10:]
        self.XScale = min_max_scaler.fit_transform(self.X)
    def DefineModel(self,X_train, Y_train, X_val, Y_val):
        self.model = Sequential([
            Dense(32, activation = 'relu', input_shape = (10,)),
            Dense(32, activation = 'relu'),
            Dense(1, activation = 'sigmoid'),
        ])
        self.model.compile( optimizer = 'sgd',
                            loss = 'binary_crossentropy',
                            metrics = ['accuracy']
        )
        self.model.fit( X_train, Y_train,
                        batch_size = 32, epochs = 100,
                        validation_data = (X_val, Y_val))
    def predict(self,PredData):
        self.Prediction = self.model.predict(PredData)
As suggested in the comments, you can use SavedModel to save your entire architecture plus weights. I suggest having a look at the page on how to save and load Keras models.
Basically you just need to save like this:
self.model.save('path/to/location')
And to restore later on:
from tensorflow import keras
model = keras.models.load_model('path/to/location')
If training takes a long time, you can also save checkpoints of the best model so far using the tf.keras.callbacks.ModelCheckpoint callback. Create the callback and pass it to fit:
checkpoint_filepath = '/tmp/checkpoint'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)
# Model weights are saved at the end of every epoch, if it's the best seen so far.
model.fit(X_train, Y_train, batch_size=32, epochs=100,
          validation_data=(X_val, Y_val), callbacks=[model_checkpoint_callback])
If your training run crashes or is interrupted, you can reload the best weights like this:
# The model weights (that are considered the best) are loaded into the model.
model.load_weights(checkpoint_filepath)
Note that if you are restarting your script, you will have to re-create the model object first: load_weights does not rebuild the architecture for you the way load_model does. In your case, that just means running this first:
self.model = Sequential([
    Dense(32, activation = 'relu', input_shape = (10,)),
    Dense(32, activation = 'relu'),
    Dense(1, activation = 'sigmoid'),
])
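Putting the save and load calls together, a "train once, then reuse" flow could look roughly like the sketch below. It is only an illustration: MODEL_PATH, build_and_train and get_model are hypothetical names, the data is random placeholder data, and the ".keras" file extension is an assumption that fits recent TensorFlow/Keras releases (older versions may expect a different path format).

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Hypothetical location; the ".keras" extension is an assumption for recent Keras versions
MODEL_PATH = os.path.join(tempfile.gettempdir(), "predict_model.keras")


def build_and_train(X, Y):
    """Train a small dense network shaped like the one in the question."""
    model = keras.Sequential([
        keras.layers.Input(shape=(10,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X, Y, batch_size=32, epochs=2, verbose=0)
    return model


def get_model(X, Y):
    """Load the saved model if one exists, otherwise train once and save it."""
    if os.path.exists(MODEL_PATH):
        return keras.models.load_model(MODEL_PATH)
    model = build_and_train(X, Y)
    model.save(MODEL_PATH)
    return model


# Placeholder data standing in for the real training set
X = np.random.rand(64, 10).astype("float32")
Y = (X.sum(axis=1) > 5).astype("float32")

model = get_model(X, Y)        # trains and saves if no file exists yet
model_again = get_model(X, Y)  # loads from disk, no training
prediction = model_again.predict(X[:3], verbose=0)
```

With this layout, the prediction path never triggers training as long as the saved file is present.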

Evaluate Tensorflow Keras VS KerasRegressor Neural Network

I'm attempting to find variable importance on a Neural Network I've built. Using tensorflow, it seems you can use either the tensorflow.keras way, or the kerasRegressor way. Admittedly, I have been reading documentation / stack overflow for hours and am confused on the differences. They seem to perform similarly but have slightly different pros/cons.
One issue I'm running into is when I use tf.keras to build the model, I am able to clearly compare my training data to my validation/testing data, and get an 'accuracy score'. But, when using kerasRegressor, I am not.
The difference here is the .evaluate() function, which kerasRegressor doesn't seem to have.
Questions:
How to evaluate performance of kerasRegressor model w/ same output as tf.keras.evaluate()?
kerasRegressor Code:
K.clear_session()
def base_model():
    # 1- Instantiate Model
    modelNEW = keras.Sequential()
    # 2- Specify Shape of First Layer
    modelNEW.add(layers.Dense(512, activation = 'relu', input_shape = ourInputShape))
    # 3- Add the layers
    modelNEW.add(layers.Dense(3, activation= 'softmax')) #softmax returns array of probability scores (num prior), and in this case we have to predict either CSCANCEL, MEMBERCANCEL, ACTIVE)
    modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    return modelNEW
# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
EarlyStopping(patience=2)
]
yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)
currentModel = KerasRegressor(build_fn=base_model, epochs=100, batch_size=50, shuffle='True')
history = currentModel.fit(xTrain, yTrain)
Now if I want to test the accuracy, I have to use .predict()
prediction = currentModel.predict(xValidation)
# print(prediction)
# train_error = np.abs(yValidation - prediction)
# mean_error = np.mean(train_error)
# min_error = np.min(train_error)
# max_error = np.max(train_error)
# std_error = np.std(train_error)
tf.Keras neural Network:
modelNEW = keras.Sequential()
modelNEW.add(layers.Dense(512, activation = 'relu', input_shape = ourInputShape))
modelNEW.add(layers.Dense(3, activation= 'softmax')) #softmax returns array of probability scores (num prior), and in this case we have to predict either CSCANCEL, MEMBERCANCEL, ACTIVE)
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
EarlyStopping(patience=2)
]
yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)
history = modelNEW.fit(xTrain, yTrain, epochs=100, batch_size=50, shuffle="True")
This is the evaluation I need to see, and cannot with kerasRegressor:
# 6- Model evaluation with test data
test_loss, test_acc = modelNEW.evaluate(xValidation, yValidation)
print('test_acc:', test_acc)
Possible Workaround, still error:
# predictionTrain = currentModel.predict(xTrain)
predictionValidation = currentModel.predict(xValidation)
# print('Train Accuracy = ',accuracy_score(yTrain,np.argmax(pred_train, axis=1)))
print('Test Accuracy = ',accuracy_score(yValidation,np.argmax(predictionValidation, axis=1)))
ValueError: Classification metrics can't handle a mix of multilabel-indicator and binary targets
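A likely cause of this error is that yValidation is still one-hot encoded (after to_categorical) while the argmax of the predictions is a vector of class indices, so the two arguments have different formats. A minimal sketch of the comparison with both sides converted to class indices is shown below. The arrays are made-up placeholders, and the accuracy is computed with plain NumPy; accuracy_score should accept the same two index arrays.

```python
import numpy as np

# Placeholder one-hot labels for 3 classes, shaped like the output of to_categorical
y_validation = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])

# Placeholder predicted probabilities, shaped like a softmax output
prediction_validation = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.6, 0.3],
])

# Convert both sides to class indices before comparing
true_classes = np.argmax(y_validation, axis=1)                # [0, 1, 2, 1]
predicted_classes = np.argmax(prediction_validation, axis=1)  # [0, 1, 1, 1]

accuracy = np.mean(true_classes == predicted_classes)
print("Test Accuracy =", accuracy)  # 3 of 4 match -> 0.75
```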

How to overfit a model on a single batch in Keras?

I am trying to overfit my model on a single batch to check model integrity. I am using Keras and TensorFlow to implement the model in this project.
I know how to get a single batch and overfit the model in PyTorch, but I don't know how to do it in Keras.
To get a single batch in PyTorch I used:
data, target = next(iter(train_loader))  # grab one batch and reuse it in every epoch
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.0001)
for epoch in range(epochs):
    print(f"Epoch [{epoch}/{epochs}]")
    # for batch_idx, (data, target) in enumerate(train_loader):
    data, target = data.to(device), target.to(device)
    data = data.reshape(data.shape[0], -1)
    # forward
    score = model(data)
    loss = criterion(score, target)
    print(f"Loss: {loss.item()}")
    # backward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
How can I do this in Keras? Any helpful material would be appreciated.
Thank you everyone for coming here. I found a solution and here it is:
datagen = ImageDataGenerator(rescale=1 / 255.0,
rotation_range=20,
zoom_range=0.2,
width_shift_range=0.05,
height_shift_range=0.05,
shear_range=0.2,
horizontal_flip=True,
fill_mode="nearest"
)
# preprocessing_function=preprocess_input,
# Declare an image generator for validation & testing without generation
test_datagen = ImageDataGenerator(rescale = 1./255,)#preprocessing_function=preprocess_input
# Declare generators for training, validation, and testing from DataFrames
train_gen = datagen.flow_from_directory(directory_train,
target_size=(512, 512),
color_mode='rgb',
batch_size=BATCH_SIZE,
class_mode='binary',
shuffle=True)
val_gen = test_datagen.flow_from_directory(directory_val,
target_size=(512, 512),
color_mode='rgb',
batch_size=BATCH_SIZE,
class_mode='binary',
shuffle=False)
test_gen = test_datagen.flow_from_directory(directory_test,
target_size=(512, 512),
color_mode='rgb',
batch_size=BATCH_SIZE,
class_mode='binary',
shuffle=False)
train_images, train_labels = next(iter(train_gen))
val_images, val_labels = next(iter(val_gen))
test_images, test_labels = next(iter(test_gen))
#check shape for selected Batch
print("Length of Train images : {}".format(len(train_images)))
print("shape of Train images : {}".format(train_images.shape))
print("shape of Train labels : {}".format(train_labels.shape))
Length of Train images : 32
shape of Train images : (32, 512, 512, 3)
shape of Train labels : (32,)
history = model.fit(train_images, train_labels,
use_multiprocessing=True,
workers=16,
epochs=100,
class_weight=class_weights,
validation_data=(val_images, val_labels),
shuffle=True,
callbacks=call_backs)

Pytorch many-to-many time series LSTM always predicts the mean

I want to create an LSTM model using pytorch that takes multiple time series and creates predictions of all of them, a typical "many-to-many" LSTM network.
I am able to achieve what I want in keras. I create a set of data with three variables which are simply linearly spaced with some gaussian noise. Training the keras model I get a prediction 12 steps ahead that is reasonable.
When I try the same thing in pytorch, the model always predicts the mean of the input data. This is confirmed by the loss during training: the model never seems to perform better than just predicting the mean.
TL;DR; The question is: How can I achieve the same thing in pytorch as in the keras example in the gist below?
Full working examples are available here https://gist.github.com/jonlachmann/5cd68c9667a99e4f89edc0c307f94ddb
The keras network is defined as
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
and the pytorch network is
# Define the pytorch model
class torchLSTM(torch.nn.Module):
    def __init__(self, n_features, seq_length):
        super(torchLSTM, self).__init__()
        self.n_features = n_features
        self.seq_len = seq_length
        self.n_hidden = 100 # number of hidden states
        self.n_layers = 1 # number of LSTM layers (stacked)
        self.l_lstm = torch.nn.LSTM(input_size=n_features,
                                    hidden_size=self.n_hidden,
                                    num_layers=self.n_layers,
                                    batch_first=True)
        # according to pytorch docs LSTM output is
        # (batch_size,seq_len, num_directions * hidden_size)
        # when considering batch_first = True
        self.l_linear = torch.nn.Linear(self.n_hidden * self.seq_len, 3)
    def init_hidden(self, batch_size):
        # even with batch_first = True this remains same as docs
        hidden_state = torch.zeros(self.n_layers, batch_size, self.n_hidden)
        cell_state = torch.zeros(self.n_layers, batch_size, self.n_hidden)
        self.hidden = (hidden_state, cell_state)
    def forward(self, x):
        batch_size, seq_len, _ = x.size()
        lstm_out, self.hidden = self.l_lstm(x, self.hidden)
        # lstm_out(with batch_first = True) is
        # (batch_size,seq_len,num_directions * hidden_size)
        # for following linear layer we want to keep batch_size dimension and merge rest
        # .contiguous() -> solves tensor compatibility error
        x = lstm_out.contiguous().view(batch_size, -1)
        return self.l_linear(x)
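The two definitions differ in structure: the Keras model stacks two LSTM layers and feeds only the last time step of the second layer into the Dense layer, while the PyTorch model uses a single layer, flattens all time steps, and keeps self.hidden between calls. The sketch below shows one way a PyTorch module could mirror the Keras layout more closely. It is an illustration under assumptions, not a confirmed fix for the mean-prediction problem; the class name LastStepLSTM and the sizes are made up, and torch.nn.LSTM uses its built-in tanh activation because it has no option for relu.

```python
import torch


class LastStepLSTM(torch.nn.Module):
    """Two stacked LSTM layers followed by a linear layer on the last time step."""

    def __init__(self, n_features, n_hidden=100, n_layers=2):
        super().__init__()
        self.lstm = torch.nn.LSTM(input_size=n_features,
                                  hidden_size=n_hidden,
                                  num_layers=n_layers,
                                  batch_first=True)
        self.linear = torch.nn.Linear(n_hidden, n_features)

    def forward(self, x):
        # No initial state is passed, so the LSTM starts from zeros for every
        # batch and nothing is carried over between batches.
        lstm_out, _ = self.lstm(x)      # (batch, seq_len, n_hidden)
        last_step = lstm_out[:, -1, :]  # (batch, n_hidden)
        return self.linear(last_step)   # (batch, n_features)


n_steps, n_features = 12, 3                   # placeholder sizes
model = LastStepLSTM(n_features)
batch = torch.randn(8, n_steps, n_features)   # placeholder input
output = model(batch)
print(output.shape)                           # torch.Size([8, 3])
```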

ValueError when Fine-tuning Inception_v3 in Keras

I am trying to fine-tune pre-trained Inceptionv3 in Keras for a multi-label (17) prediction problem.
Here's the code:
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a new top layer
x = base_model.output
predictions = Dense(17, activation='sigmoid')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(loss='binary_crossentropy', # We NEED binary here, since categorical_crossentropy l1 norms the output before calculating loss.
optimizer=SGD(lr=0.0001, momentum=0.9))
# Fit the model (Add history so that the history may be saved)
history = model.fit(x_train, y_train,
batch_size=128,
epochs=1,
verbose=1,
callbacks=callbacks_list,
validation_data=(x_valid, y_valid))
But I got the following error message and had trouble deciphering what it is saying:
ValueError: Error when checking target: expected dense_1 to have 4
dimensions, but got array with shape (1024, 17)
It seems the model doesn't accept my one-hot encoded labels as the target. But how would I get a 4-dimensional target?
It turns out that the code copied from https://keras.io/applications/ would not run out-of-the-box.
The following post has helped me:
Keras VGG16 fine tuning
The changes I needed to make are the following:
Add the input shape to the model definition: base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(299,299,3)), and
Add a Flatten() layer to flatten the tensor output:
x = base_model.output
x = Flatten()(x)
predictions = Dense(17, activation='sigmoid')(x)
Then the model works for me!
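The shape arithmetic behind the error can be sketched with plain NumPy. With include_top=False, the base model outputs a 4-dimensional feature map, and a dense layer applied directly to it acts only on the last axis, so the output keeps four dimensions. Flattening first gives one vector per image. The arrays below are random placeholders; the 8 x 8 x 2048 shape corresponds to the InceptionV3 feature map for 299 x 299 inputs, and the matrix products only stand in for what a Dense layer does to the shape.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the base model output: (batch, height, width, channels)
features = rng.standard_normal((2, 8, 8, 2048)).astype("float32")

# A dense layer applied directly acts on the last axis only
w_direct = rng.standard_normal((2048, 17)).astype("float32")
direct = features @ w_direct
print(direct.shape)     # (2, 8, 8, 17): four dimensions, cannot match (batch, 17) labels

# Flattening first leaves one feature vector per image
flat = features.reshape(features.shape[0], -1)  # (2, 131072)
w_flat = rng.standard_normal((flat.shape[1], 17)).astype("float32")
flattened = flat @ w_flat
print(flattened.shape)  # (2, 17): matches the label shape
```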

Resources