I have the following NN architecture using Keras:
from keras import Sequential
from keras.layers import Dense
import keras
model = Sequential()
model.add(Dense(16, input_dim=32))
model.add(keras.layers.advanced_activations.PReLU())
model.add(Dense(8))
model.add(keras.layers.advanced_activations.PReLU())
model.add(Dense(4))
model.add(Dense(1, activation='sigmoid'))
I wonder whether it makes any difference to add model.add(Dropout(0.5)) before advanced_activations.PReLU() or after it. In other words, what is the correct place to add a Dropout layer when an advanced_activations layer is present?
Thank you.
It does not really matter whether you put the Dropout before or after the activation: for most activations f(0) = 0, so dropping a unit before or after the activation produces the same result.
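For example, because PReLU is piecewise linear with f(0) = 0, the two placements below behave the same; this is just a sketch of the two options (the plain PReLU import assumes a recent Keras version, otherwise use keras.layers.advanced_activations.PReLU):

from keras.layers import Dense, Dropout, PReLU

# Option 1: dropout before the activation
model.add(Dense(16, input_dim=32))
model.add(Dropout(0.5))
model.add(PReLU())

# Option 2: dropout after the activation (equivalent here, since PReLU(0) = 0
# and PReLU commutes with the rescaling that inverted dropout applies)
# model.add(Dense(16, input_dim=32))
# model.add(PReLU())
# model.add(Dropout(0.5))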
I have built a custom peephole LSTM, and I want to imitate the dropout part of the built-in nn.LSTM. So, how do I add dropout the way the initialization nn.LSTM(input_size, hidden_size, dropout=0.3) does? My idea is to simply apply a normal dropout just before returning the output, like this:
# init method
self.dropout = nn.Dropout(0.3)
# forward method
hidden_seq = self.dropout(hidden_seq)
return hidden_seq, (h_t, c_t)
I just want to make sure that this is the right way. If not, what should I do?
nn.LSTM(... dropout=0.3) applies dropout to the outputs of each LSTM layer except the last one. You can stack multiple layers by passing num_layers > 1. If you want dropout on the final layer (or if the LSTM has only one layer), you have to add it yourself, as you are doing now.
If you want to replicate what the built-in LSTM dropout does (which only matters when there are multiple layers), you can stack LSTM layers manually and add a dropout layer in between.
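If you go that route, a rough sketch could look like this (the class and sizes here are purely illustrative, not taken from your code):

import torch.nn as nn

class StackedLSTM(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm1 = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(0.3)
        self.lstm2 = nn.LSTM(hidden_size, hidden_size, batch_first=True)

    def forward(self, x):
        out, _ = self.lstm1(x)
        out = self.dropout(out)  # dropout only between the layers, not after the last one
        out, (h_t, c_t) = self.lstm2(out)
        return out, (h_t, c_t)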
I'm trying to build a network like this:
the network
My question is how to implement the beginning part with the shared weights, because it contains FC+BN+ReLU (3 layers), and I have multiple input vectors (M (~25) vectors of length F).
I tried the functional API model in Keras, and I had some difficulty with it.
thanks
You can try using TimeDistributed over each layer.
For example:
from keras.models import Sequential
from keras.layers import TimeDistributed, GlobalAveragePooling2D, Dense, Dropout, CuDNNLSTM
from keras.applications import MobileNetV2

model = Sequential()
# The same (shared-weight) CNN is applied to every frame of the sequence
model.add(TimeDistributed(MobileNetV2(weights='imagenet', include_top=False),
                          input_shape=(n_sequence, *dim, n_channels)))
model.add(TimeDistributed(GlobalAveragePooling2D()))
model.add(CuDNNLSTM(64, return_sequences=False))
model.add(Dense(64, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(24, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(n_output, activation='softmax'))
# n_sequence, dim, n_channels and n_output are defined elsewhere in the linked repo
Code has been taken from https://github.com/peachman05/action-recognition-tutorial/blob/master/model_ML.py
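Closer to your setup (M input vectors of length F going through a shared FC+BN+ReLU block), a minimal sketch with made-up sizes could be:

from keras.models import Sequential
from keras.layers import TimeDistributed, Dense, BatchNormalization, Activation

M, F = 25, 128  # assumptions: number of input vectors and their length

model = Sequential()
# The same Dense/BN/ReLU weights are applied to each of the M vectors
model.add(TimeDistributed(Dense(64), input_shape=(M, F)))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(Activation('relu')))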
I'm using two different sequential neural network models in my Python program.
One RNN model defined like this:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, CuDNNLSTM
ModelRNN = Sequential()
ModelRNN.add(CuDNNLSTM(150, return_sequences=True, batch_size=None, input_shape=(None,10)))
ModelRNN.add(CuDNNLSTM(150, return_sequences=True))
ModelRNN.add(Dense(100, activation='relu'))
ModelRNN.add(Dense(10, activation='relu'))
optimizer = tf.keras.optimizers.Adam(lr=0.001)
ModelRNN.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
One Dense model defined like this:
ModelDense = Sequential()
ModelDense.add(Dense(380, batch_size=None, input_shape=(1,380), activation='elu'))
ModelDense.add(Dense(380, activation='elu'))
ModelDense.add(Dense(380, activation='elu'))
ModelDense.add(Dense(380, activation='elu'))
optimizer = tf.keras.optimizers.Adam(lr=0.00025)
ModelDense.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
So my problem is that the two networks will work together, so I have to run both of them in the same TensorFlow session, BUT I want to save them in two different folders.
I don't really know if this is possible, because I have never really looked into how TensorFlow graphs work; I only know that when I use my TensorFlow saver, I only give it my session and a path as parameters.
So my question is: how can I keep the storage of my models separate, in two folders?
I want to do this so that I can easily change my RNN without having to retrain both of my networks, and without overwriting my trained RNN.
If I'm not clear, please ask me for more details.
So my question is: how can I keep the storage of my models separate, in two folders?
The save method of a Keras model accepts a file path, so a different folder can be given for each model.
ModelRNN.save('folder1/<filename.h5>')
ModelDense.save('folder2/<filename.h5>')
https://www.tensorflow.org/tutorials/keras/save_and_restore_models#save_the_entire_model
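To restore them later, each file can be loaded back independently, e.g.:

from tensorflow.keras.models import load_model

ModelRNN = load_model('folder1/<filename.h5>')
ModelDense = load_model('folder2/<filename.h5>')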
I'm going crazy with this project. This is multi-label text classification with an LSTM in Keras. My model is this:
import keras
from keras.models import Sequential
from keras.layers import Embedding, Dropout, LSTM, Dense, Activation

model = Sequential()
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len, mask_zero=True, weights=[embedding_weights]))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid', recurrent_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid', recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
adam = keras.optimizers.Adam(lr=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
The problem is that the accuracy is too low. With binary_crossentropy I get good accuracy, but the results are wrong! Changing to categorical_crossentropy, I get very low accuracy. Do you have any suggestions?
Here is my code: GitHubProject - Multi-Label-Text-Classification
In the last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. In case you want to use categorical_crossentropy, then use softmax as the activation function in the last layer.
Now, coming to the other part of your model: since you are working with text, I would suggest going for tanh as the activation function in the LSTM layers.
You can also try using the LSTM's own dropout arguments, dropout and recurrent_dropout:
LSTM(units, dropout=0.2, recurrent_dropout=0.2, activation='tanh')
You can set units to 64 or 128. Start with a small number and, after testing, increase it up to 1024.
You can also try adding a convolution layer for extracting features, or use a Bidirectional LSTM, but Bidirectional-based models take longer to train.
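For example, wrapping the recurrent layer in Bidirectional could look like this (the number of units is chosen arbitrarily):

from keras.layers import Bidirectional, LSTM

model.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2, activation='tanh')))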
Moreover, since you are working on text, pre-processing of the text and the size of the training data always play a much bigger role than expected.
Edit:
Add class weights in the fit parameters:
from sklearn.utils import class_weight
import numpy as np

# 'labels' are the raw class labels; 'le' is assumed to be the fitted LabelEncoder
class_weights = class_weight.compute_class_weight('balanced',
                                                  np.unique(labels),
                                                  labels)
class_weights_dict = dict(zip(le.transform(list(le.classes_)),
                              class_weights))

model.fit(x_train, y_train, validation_split=validation_split, class_weight=class_weights_dict)
change:
model.add(Activation('sigmoid'))
to:
model.add(Activation('softmax'))
The steps for fine-tuning a network are as follows:
1. Add your custom network on top of an already trained base network.
2. Freeze the base network.
3. Train the part you added.
4. Unfreeze some layers in the base network.
5. Jointly train both these layers and the part you added.
Now, if the network architecture is as simple as VGG16, we can simply unfreeze the base network from block5_conv1 (Conv2D) onwards and re-train it, as sketched below.
VGG16 Architecture
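For reference, that unfreezing step with VGG16 can be sketched like this (standard Keras application, shown only to illustrate the idea):

from keras.applications import VGG16

conv_base = VGG16(weights='imagenet', include_top=False)

# Freeze everything up to block5_conv1, unfreeze from there onwards
set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    layer.trainable = set_trainable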
But when the architecture is as highly complex as InceptionResNetV2, where should we start? Does anyone have any practical experience? Run the following code in Python to see the model:
from keras.applications import InceptionResNetV2

conv_base = InceptionResNetV2(weights='imagenet',
                              include_top=False,
                              input_shape=(299, 299, 3))
conv_base.summary()

from keras.utils import plot_model
plot_model(conv_base, to_file='model.png')
A very basic fine-tuning of model with InceptionResNetV2 will look like this:
from inception_resnet_v2 import InceptionResNetV2
from keras.layers import Dense
from keras.models import Model

# ImageNet classification
model = InceptionResNetV2()
model.predict(...)

# Fine-tuning on another 100-class dataset
base_model = InceptionResNetV2(include_top=False, pooling='avg')
# The 100 in the next line is the number of classes
outputs = Dense(100, activation='softmax')(base_model.output)
model = Model(base_model.inputs, outputs)
model.compile(...)
model.fit(...)
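Following the steps listed above, you would typically freeze the pre-trained base before the first training stage and only unfreeze selected layers later; a minimal sketch using the same base_model object:

# Stage 1: freeze the whole base and train only the new Dense head
for layer in base_model.layers:
    layer.trainable = False

# Stage 2: unfreeze the top-most layers for joint fine-tuning
for layer in base_model.layers[-20:]:  # the -20 cut-off is an arbitrary example
    layer.trainable = True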
This is a good place to start: github.com/yuyang-huang/keras-inception-resnet-v2