I just started to build my first CNN. I'm practicing with the MNIST dataset, this is the code I just wrote:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Dropout, Flatten, Dense
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.optimizers import Adam
from sklearn.preprocessing import RobustScaler
import os
import numpy as np
import matplotlib.pyplot as plt
# CONSTANTS
EPOCHS = 300
TIME_STEPS = 30000
NUM_CLASSES = 10
# Loading data
print('Loading data:')
(train_X, train_y), (test_X, test_y) = mnist.load_data()
print('X_train: ' + str(train_X.shape))
print('Y_train: ' + str(train_y.shape))
print('X_test: ' + str(test_X.shape))
print('Y_test: ' + str(test_y.shape))
print('------------------------------')
# Splitting train/val
print('Splitting training/validation set:')
X_train = train_X[0:TIME_STEPS, :]
X_val = train_X[TIME_STEPS:TIME_STEPS*2, :]
print('X_train: ' + str(X_train.shape))
print('X_val: ' + str(X_val.shape))
# Normalizing data
print('------------------------------')
print('Normalizing data:')
X_train = X_train/255
X_val = X_val/255
print('X_train: ' + str(X_train.shape))
print('X_val: ' + str(X_val.shape))
# Building model
model = Sequential()
model.add(Conv1D(filters=32, kernel_size=5, input_shape=(28, 28)))
model.add(Conv1D(filters=16, kernel_size=4, activation="relu"))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(NUM_CLASSES, activation='softmax'))
model.compile(optimizer=Adam(), loss=categorical_crossentropy, metrics=['accuracy'])
model.summary()
model.fit(x=X_train, y=X_train, batch_size=10, epochs=EPOCHS, shuffle=False)
I'm going to explain what I did, any correction would be helpful so I can learn more:
The first thing I did is splitting the training set in two parts: a training part and a validation part, on which I would like to do the training before testing it on the test set.
Then, I normalized the data (is this a standard when we work with images?)
I then built my CNN with a simple structure: the first layer is the one which gets the inputs (with dimension 28x28) and I've chosen 32 filters that should be enough to perform well on this dataset. The kernel size is the one I did not understood since I thought that the kernel was the equivalent of the filter. I selected a low number to avoid problems. The second layer is similar to the previous one, but now it has an activation function (relu, but I'm not convinced, I was thinking to use a softmax to pass a set of probabilities to the full connected layer).
The last 3 layers are the full connected layer to get the output.
In the fit function I used a batch size of 10 and I think that this could be one of the reason I get the error:
ValueError: Shapes (10, 28, 28) and (10, 10) are incompatible
Even removing it I still getting the following error:
ValueError: Shapes (None, 28, 28) and (None, 10) are incompatible
Am I missing something important?
You are passing in the X_train variable twice, once as the x argument and once as the y argument. Instead of passing in X_train as the y argument in .fit() you should pass in an array of values you are trying to predict. Given that you are using MNIST is assume that you are trying to predict the written digit, so your y array should be of shape (n_samples, 10) with the digit being one-hot encoded.
how to implement hamming loss as a custom metric in keras model
I have a multilabel classification with 6 classes
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy',hamming_loss])
I tried using
from sklearn.metrics import hamming_loss
def custom_hl(y_true, y_pred):
return hamming_loss(y_true, y_pred)
which doesn't work because i have the y_true , y_pred as follow
YTRUE
Tensor("Cast_10:0", shape=(None, 6), dtype=float32)
YPRED
Tensor("model_1/dense_1/Sigmoid:0", shape=(None, 6), dtype=float32)
also tried the function in this question and it doesn't work
Getting the accuracy for multi-label prediction in scikit-learn
is there any way I can get the hamming loss as metric in keras
thanks for any help
so i found a way and
def Custom_Hamming_Loss(y_true, y_pred):
return K.mean(y_true*(1-y_pred)+(1-y_true)*y_pred)
def Custom_Hamming_Loss1(y_true, y_pred):
tmp = K.abs(y_true-y_pred)
return K.mean(K.cast(K.greater(tmp,0.5),dtype=float))
source:https://groups.google.com/g/keras-users/c/_sjndHbejTY?pli=1
I'm new to Keras
my neural network structure is here:
neural network structure
my idea is :
import keras.backend as KBack
import tensorflow as tf
#...some code here
model = Sequential()
hidden_units = 4
layer1 = Dense(
hidden_units,
input_dim=len(InputIndex),
activation='sigmoid'
)
model.add(layer1)
# layer1_bias = layer1.get_weights()[1][0]
layer2 = Dense(
1, activation='sigmoid',
use_bias=False
)
model.add(layer2)
# KBack.bias_add(model.output, layer1_bias[0])
I know this is not working cause layer1_bias[0] is not tensor, but I have no idea how to fix it. Or somebody has other solution.
Thanks.
You get the error because bias_add expects a Tensor and you are passing it a float (the actual value of the bias). Also, be aware that your hidden layer actually has 3 biases (one for each node). If you want to add the bias of the first node to your output layer, this should work:
import keras.backend as K
from keras.layers import Dense, Activation
from keras.models import Sequential
model = Sequential()
layer1 = Dense(3, input_dim=2, activation='sigmoid')
layer2 = Dense(1, activation=None, use_bias=False)
activation = Activation('sigmoid')
model.add(layer1)
model.add(layer2)
K.bias_add(model.output, layer1.bias[0:1]) # slice like this to not lose a dimension
model.add(activation)
print(model.summary())
Note that, to be 'correct' (according to the definition of what a dense layer does), you should add the bias first, then the activation.
Also, your code is not really in line with the picture of your network. In the picture, one single shared bias is added to each of the nodes in the network. You can do this with the functional API. The idea is to disable the use of biases in the hidden layer and the output layers, and to manually add a bias variable that you define yourself and that will be shared by the layers. I'm using tensorflow for tf.add() since that supports broadcasting:
from keras.layers import Dense, Lambda, Input, Add
from keras.models import Model
import keras.backend as K
import tensorflow as tf
# Define the shared bias as a custom keras variable
shared_bias = K.variable(value=[0], name='shared_bias')
input_layer = Input(shape=(2,))
# Disable biases in the hidden layer
dense_1 = Dense(units=3, use_bias=False, activation=None)(input_layer)
# Manually add the shared bias
dense_1 = Lambda(lambda x: tf.add(x, shared_bias))(dense_1)
# Disable bias in output layer
output_layer = Dense(units=1, use_bias=False)(dense_1)
# Manually add the bias variable
output_layer = Lambda(lambda x: tf.add(x, shared_bias))(output_layer)
model = Model(inputs=input_layer, outputs=output_layer)
print(model.summary())
This assumes that your shared bias is not trainable though.
I was to use a simple BiLSTM model with my own custom loss function in Keras.
See below.
model = Sequential()
model.add(Bidirectional(LSTM(128, return_sequences=True), input_shape=(1,8)))
model.add(Bidirectional(LSTM(128)))
model.add(Dense(64, activation='relu'))
model.add(Dense(20, activation='softmax'))
def my_loss_np(y_true, y_pred):
labels = [np.argmax(y_pred[i]) for i in range(y_pred.shape[1])]
loss = np.mean(labels)
return loss
import keras.backend as K
def my_loss(y_true, y_pred):
loss = K.eval(my_loss_np(K.eval(y_true), K.eval(y_pred)))
return loss
When I compile this model, I get an error -
model.compile(loss=my_loss, optimizer='adam')
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'dense_95_target' with dtype float and shape [?,?]
[[Node: dense_95_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
There are several issues here with your loss function:
You are using NumPy on tensors, unfortunately though it is an intuitive this doesn't work. You need to use tensor operators from the Keras backend, they are very similar.
To that end you are calling K.eval but at this stage you are still constructing a symbolic computation graph which will be run in TensorFlow or Theano. So the tensors don't have a value to compute per say, you need to keep it symbolic, you can get any values like you do in NumPy.
Even if you fix the problems above, you are using a non-differentiable operation argmax which will not work with gradient descent algorithms.
Your model looks like a multi-label classification problem, 20 classes as your final layer is 20 with softmax. In this case, the literature uses categorical-crossentropy loss to train the classifier network.
I am running keras over tensorflow, trying to implement a multi-dimensional LSTM network to predict a linear continuous target variable , a single value for each example(return_sequences = False).
My sequence length is 10 and number of features (dim) is 11.
This is what I run:
import pprint, pickle
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
# Input sequence
wholeSequence = [[0,0,0,0,0,0,0,0,0,2,1],
[0,0,0,0,0,0,0,0,2,1,0],
[0,0,0,0,0,0,0,2,1,0,0],
[0,0,0,0,0,0,2,1,0,0,0],
[0,0,0,0,0,2,1,0,0,0,0],
[0,0,0,0,2,1,0,0,0,0,0],
[0,0,0,2,1,0,0,0,0,0,0],
[0,0,2,1,0,0,0,0,0,0,0],
[0,2,1,0,0,0,0,0,0,0,0],
[2,1,0,0,0,0,0,0,0,0,0]]
# Preprocess Data:
wholeSequence = np.array(wholeSequence, dtype=float) # Convert to NP array.
data = wholeSequence
target = np.array([20])
# Reshape training data for Keras LSTM model
data = data.reshape(1, 10, 11)
target = target.reshape(1, 1, 1)
# Build Model
model = Sequential()
model.add(LSTM(11, input_shape=(10, 11), unroll=True, return_sequences=False))
model.add(Dense(11))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(data, target, nb_epoch=1, batch_size=1, verbose=2)
and get the error ValueError: Error when checking target: expected activation_1 to have 2 dimensions, but got array with shape (1, 1, 1)
Not sure what should the activation layer should get (shape wise)
Any help appreciated
thanks
If you just want to have a single linear output neuron, you can simply use a dense layer with one hidden unit and supply the activation there. Your output then can be a single vector without the reshape- I adjusted your given example code to make it work:
wholeSequence = np.array(wholeSequence, dtype=float) # Convert to NP array.
data = wholeSequence
target = np.array([20])
# Reshape training data for Keras LSTM model
data = data.reshape(1, 10, 11)
# Build Model
model = Sequential()
model.add(LSTM(11, input_shape=(10, 11), unroll=True, return_sequences=False))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(data, target, nb_epoch=1, batch_size=1, verbose=2)