loss and accurancy stay the same after training - python-3.x

I have an lstm model for human activity recognition task using data from sensors.
when I train my model the loss and accuracy stays the same. Which is the problem in this case in general?
I try to change the learning rate but the results are the same,
below is the model that I use
model = Sequential()
model.add(LSTM(64, return_sequences=True, recurrent_regularizer=l2(0.0015), input_shape=(timestamps,
input_dim)))
model.add(Dropout(0.5))
model.add(LSTM(64, recurrent_regularizer=l2(0.0015), input_shape=(timesteps,input_dim)))
model.add(Dense(64, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(n_classes, activation='softmax'))
model.summary()
model.compile(optimizer=Adam(learning_rate = 0.0025), loss =
'sparse_categorical_crossentropy',
metrics = ['accuracy'])
history =model.fit(X_train, y_train, batch_size=32, epochs=100)
the dataset is balanced across classes and I have used the standard scaller
lying 68704
running 68704
walking 68704
climbingdown 68704
jumping 68704
climbingup 68704
standing 68704
sitting 68704

i find the solution after remove the l2 regularization. I do not know exactly how this work but I gonna investigate the issue a little more

Related

Batch Normalization when CNN with only 2 ConvLayer?

I wonder if it is a problem to use BatchNormalization when there are only 2 convolutional layers in a CNN.
Can this have adverse effects on classification performance? Now I don't mean the training time, but really the accuracy? Is my network overloaded with unneccessary layers? I want to train the network with a small data set.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), input_shape=(28,28,1), padding = 'same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(64, kernel_size=(3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
model.compilke(optimizer="Adam", loss='categorical_crossentropy, metrics =['accuracy'])
Many thanks.
Don’t Use With Dropout
Batch normalization offers some regularization effect, reducing generalization error, perhaps no longer requiring the use of dropout for regularization.
Removing Dropout from Modified BN-Inception speeds up training, without increasing overfitting.
— Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015.
Further, it may not be a good idea to use batch normalization and dropout in the same network.
The reason is that the statistics used to normalize the activations of the prior layer may become noisy given the random dropping out of nodes during the dropout procedure.
Batch normalization also sometimes reduces generalization error and allows dropout to be omitted, due to the noise in the estimate of the statistics used to normalize each variable.
— Page 425, Deep Learning, 2016.
Source - machinelearningmastery.com - batch normalization

Music as input to the LSTM model

Is there a possibility to generate more complex music using LSTM, instead of only piano notes (all the tutorials are about piano music, and I can't find more complex stuff)?
I'm thinking about training the model with different instrumentals, but I don't know how to convert a normal music file (.mp3 or .wav) to network input or how to extract instruments from music file. My model is adaptation from this post
Do you have any idea?
model = Sequential()
model.add(LSTM(128,input_shape=(network_input.shape[1],
network_input.shape[2]), return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(64, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(32))
model.add(Dense(32))
model.add(Dropout(0.3))
model.add(Dense(n_vocab))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

How do we compare the performance of different ConvNets?

I am currently training a net to play a game with a CNN having the following architecture:
model = Sequential()
model.add(Conv2D(100, kernel_size=(2, 2), strides=(2, 2), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(250, activation='relu'))
model.add(Dense(classifications, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
Now I wish to introduce some complexity in the architecture and make the net deep. How can I tabulate the performance of the CNNs of different complexities and ultimately conclude by giving the best choice for the particular task?
Am i going in the wrong direction? How to decide the depth of a CNN and how does it affect the performance on the same dataset?
Thanks in advance (I am new to this site, kindly excuse the immaturity of this post)
Edit: Information about the dataset I am using: dataset consists of images and each image has 3 possible lables (0, 1, 2) stored in a CSV file with each row corresponding to that particular image.
The simplest thing you can do is generate a few different model architectures, train them on a train set and evaluate them on the test set. Then compare their accuracies and the one with the highest accuracy should in theory be the best performing model.
To make the model deeper you can add extra dense or convolutional layers. For example:
changing this:
model.add(Dense(250, activation='relu'))
to this:
model.add(Dense(250, activation='relu'))
model.add(Dense(250, activation='relu'))
model.add(Dense(250, activation='relu'))
will add three extra dense layers. Hence making the network deeper.
You can do the same with duplicating the convolutional layers by duplicating the Conv2D and MaxPooling2D lines.
The alternative to 'trial and error' approach to finding the best architecture and hyperparameters is to use a search approach like explained in this tutorial that user grid search. It will, however, take significantly longer than just trying out a few versions you can up with yourself.

LSTM Text Classification Bad Accuracy Keras

I'm going crazy in this project. This is multi-label text-classification with lstm in keras. My model is this:
model = Sequential()
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len, mask_zero=True, weights=[embedding_weights] ))
model.add(Dropout(0.25))
model.add(LSTM(output_dim=embeddings_dim , activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(activation='sigmoid', units=embeddings_dim, recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
adam=keras.optimizers.Adam(lr=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
Only that I have too low an accuracy .. with the binary-crossentropy I get a good accuracy, but the results are wrong !!!!! changing to categorical-crossentropy, I get very low accuracy. Do you have any suggestions?
there is my code: GitHubProject - Multi-Label-Text-Classification
In last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. Incase you want to use categorical_crossentropy then use softmax as activation function in last layer.
Now, coming to the other part of your model, since you are working with text, i would tell you to go for tanh as activation function in LSTM layers.
And you can try using LSTM's dropouts as well like dropout and recurrent dropout
LSTM(units, dropout=0.2, recurrent_dropout=0.2,
activation='tanh')
You can define units as 64 or 128. Start from small number and after testing you take them till 1024.
You can try adding convolution layer as well for extracting features or use Bidirectional LSTM But models based Bidirectional takes time to train.
Moreover, since you are working on text, pre-processing of text and size of training data always play much bigger role than expected.
Edited
Add Class weights in fit parameter
class_weights = class_weight.compute_class_weight('balanced',
np.unique(labels),
labels)
class_weights_dict = dict(zip(le.transform(list(le.classes_)),
class_weights))
model.fit(x_train, y_train, validation_split, class_weight=class_weights_dict)
change:
model.add(Activation('sigmoid'))
to:
model.add(Activation('softmax'))

Training accuracy stuck at 6% and loss at 2.7 using LSTM

I am using LSTM for action recognition. My basic LSTM implemented on Keras is getting accuracy of 76% on test data on one dataset(CAD60) but when I use other dataset, my model gets stuck at a loss. Its predicting a single class always.
What can be the problem since I am using the exact framework, features on both the dataset. Even I tried to tune the learning rate change the optimizer but it didn't worked.
model = Sequential() # input has shape (samples, timesteps, locations)
model.add(LSTM(128, batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])

Resources