I have the following Keras (Python) code for using the Keras Tuner for a single hidden layer neural network.
def build_model(hp):
    # Initialize sequential model
    model = keras.Sequential()
    # Tune the number of units in the first Dense layer
    hp_units = hp.Int('units', min_value=25, max_value=250, step=25)
    model.add(keras.layers.Dense(units=hp_units, input_shape=(train_pca_scaled.shape[1],),
                                 kernel_regularizer=keras.regularizers.L1(l1=hp.Choice('L1_value', [1e-2, 1e-3, 1e-4]))))
    # Batch normalization (before activation)
    model.add(keras.layers.BatchNormalization())
    # Activation
    activation = hp.Choice("activation", ["relu", "elu"])
    model.add(keras.layers.Activation(activation))
    # Tune dropout rate
    hp_dropout = hp.Float('dropout_rate', min_value=0, max_value=0.5, step=0.1)
    model.add(keras.layers.Dropout(rate=hp_dropout))
    # Final output layer
    model.add(keras.layers.Dense(units=1))
    # Compile
    lr = hp.Choice("learning_rate", values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                  loss=keras.losses.MeanSquaredError(),
                  metrics=[keras.metrics.RootMeanSquaredError()])
    return model
My issue is that I would also like to offer the advanced activation layers LeakyReLU and PReLU as activation options (along with relu and elu), but they are called differently. An advanced activation is added as its own layer, e.g. keras.layers.LeakyReLU(), which is different from the keras.layers.Activation(activation) format I use above.
How can I change the code above so that all four of the mentioned activations are included in the hyperparameter search?
Thanks.
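One possible way to handle this (a sketch, not a definitive answer): let hp.Choice pick from all four names and branch on the selected string, adding either a plain Activation layer or the matching advanced activation layer. The helper name add_activation and the strings 'leaky_relu' / 'prelu' are illustrative assumptions, not fixed Keras identifiers.

def add_activation(model, hp):
    # Hypothetical helper: choose among plain and advanced activations
    activation = hp.Choice('activation', ['relu', 'elu', 'leaky_relu', 'prelu'])
    if activation == 'leaky_relu':
        model.add(keras.layers.LeakyReLU())        # advanced activation added as its own layer
    elif activation == 'prelu':
        model.add(keras.layers.PReLU())            # advanced activation added as its own layer
    else:
        model.add(keras.layers.Activation(activation))  # 'relu' or 'elu'

# inside build_model(hp), replace the two activation lines with:
# add_activation(model, hp)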
Related
I'm attempting to find variable importance on a neural network I've built. With TensorFlow, it seems you can build the model either the tensorflow.keras way or the KerasRegressor way. Admittedly, I have been reading documentation / Stack Overflow for hours and am confused about the differences. They seem to perform similarly but have slightly different pros and cons.
One issue I'm running into: when I use tf.keras to build the model, I can clearly compare my training data to my validation/testing data and get an 'accuracy score'. But when using KerasRegressor, I cannot.
The difference here is the .evaluate() function, which KerasRegressor doesn't seem to have.
Questions:
How can I evaluate the performance of a KerasRegressor model with the same output as tf.keras's .evaluate()?
KerasRegressor code:
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import backend as K
# the wrapper lives in keras.wrappers.scikit_learn in older versions; newer setups use scikeras.wrappers
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

K.clear_session()

def base_model():
    # 1- Instantiate model
    modelNEW = keras.Sequential()
    # 2- Specify shape of first layer
    modelNEW.add(layers.Dense(512, activation='relu', input_shape=ourInputShape))
    # 3- Add the layers
    # softmax returns an array of probability scores (one per class); here the classes are
    # CSCANCEL, MEMBERCANCEL, ACTIVE
    modelNEW.add(layers.Dense(3, activation='softmax'))
    modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    return modelNEW

# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
    EarlyStopping(patience=2)
]

yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)

currentModel = KerasRegressor(build_fn=base_model, epochs=100, batch_size=50, shuffle=True)
history = currentModel.fit(xTrain, yTrain)
Now if I want to test the accuracy, I have to use .predict()
prediction = currentModel.predict(xValidation)
# print(prediction)
# train_error = np.abs(yValidation - prediction)
# mean_error = np.mean(train_error)
# min_error = np.min(train_error)
# max_error = np.max(train_error)
# std_error = np.std(train_error)
tf.keras neural network:
modelNEW = keras.Sequential()
modelNEW.add(layers.Dense(512, activation='relu', input_shape=ourInputShape))
# softmax returns an array of probability scores (one per class); here the classes are
# CSCANCEL, MEMBERCANCEL, ACTIVE
modelNEW.add(layers.Dense(3, activation='softmax'))
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
    EarlyStopping(patience=2)
]

yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)

history = modelNEW.fit(xTrain, yTrain, epochs=100, batch_size=50, shuffle=True)
This is the evaluation I need to see, and cannot get with KerasRegressor:
# 6- Model evaluation with test data
test_loss, test_acc = modelNEW.evaluate(xValidation, yValidation)
print('test_acc:', test_acc)
Possible workaround (still errors):
# predictionTrain = currentModel.predict(xTrain)
predictionValidation = currentModel.predict(xValidation)
# print('Train Accuracy = ',accuracy_score(yTrain,np.argmax(pred_train, axis=1)))
print('Test Accuracy = ',accuracy_score(yValidation,np.argmax(predictionValidation, axis=1)))
This raises the error: Classification metrics can't handle a mix of multilabel-indicator and binary targets
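A hedged sketch of two ways to get evaluate()-style numbers from the KerasRegressor (assumptions: yValidation is still one-hot encoded, predict() returns the softmax probabilities, and the old keras.wrappers.scikit_learn wrapper exposes the fitted Keras model as the .model attribute):

import numpy as np
from sklearn.metrics import accuracy_score

# Option 1: call evaluate() on the underlying Keras model the wrapper built during fit()
test_loss, test_acc = currentModel.model.evaluate(xValidation, yValidation)
print('test_acc:', test_acc)

# Option 2: reproduce the accuracy by hand; argmax both the one-hot targets and the
# predicted probabilities so accuracy_score compares class indices with class indices
predictionValidation = currentModel.predict(xValidation)
pred_classes = np.argmax(predictionValidation, axis=1)
true_classes = np.argmax(yValidation, axis=1)
print('Test Accuracy =', accuracy_score(true_classes, pred_classes))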
I am trying to apply a softmax activation to the output of an Add() layer and make it the output of my model, but I am running into a few problems.
It seems the Add() layer doesn't accept an activation argument, and if I do something like this:
predictions = Add()([x,y])
predictions = softmax(predictions)
model = Model(inputs = model.input, outputs = predictions)
I get:
ValueError: Output tensors to a Model must be the output of a Keras `Layer` (thus holding past layer metadata). Found: Tensor("Softmax:0", shape=(?, 6), dtype=float32)
It has nothing to do with the Add layer: you are calling the softmax function directly on Keras tensors, and that won't work because the model output must come from an actual layer. You can use the Activation layer for this:
from keras.layers import Activation
predictions = Add()([x,y])
predictions = Activation("softmax")(predictions)
model = Model(inputs = model.input, outputs = predictions)
I want y_pred to contain only +1 or -1. It should not contain intermediate real values, not even zero.
classifier = Sequential()

# Adding the input layer and the first hidden layer
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu', input_shape = (22,)))
# Adding the second hidden layer
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu'))
# Adding the output layer
classifier.add(Dense(output_dim = 1, init = 'uniform', activation = 'tanh'))
# Compiling the neural network
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Fitting the model
classifier.fit(x_train, y_train, batch_size = 10, epochs = 100)
# Predicting the test set results
y_pred = classifier.predict(x_test)
The output values of y_pred lie in the range [-1, 1], but I expected them to be only +1 or -1.
To train properly, a neural network needs activations that produce continuous (non-integer) values. If you need strictly discrete outputs, you have to translate the predicted values yourself.
When you use binary_crossentropy loss, Keras's accuracy metric applies a threshold of 0.5 to the output, treating anything above 0.5 as 1 and anything below as 0. Unfortunately, there is no easy way to change that threshold in Keras; you would have to write your own loss/metric function.
Here is a Stackoverflow link that will guide you in doing that.
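As a minimal sketch of that translation step (mapping to {-1, +1} is an assumption about what the question wants, not part of the original answer), you can keep the tanh output layer and threshold the predictions at zero after calling predict:

import numpy as np

y_pred = classifier.predict(x_test)             # tanh outputs in [-1, 1]
y_pred_discrete = np.where(y_pred >= 0, 1, -1)  # map to exactly +1 or -1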
I am new to TensorFlow and I am trying to build a two-layer fully-connected neural network for regression in TensorFlow. The answer in the following Stack Overflow discussion shows how to make predictions with a single-layer neural network model. However, I feel this approach will become inefficient with more layers.
Making predictions with a TensorFlow model
Could anyone please let me know, how I can make predictions with my TensorFlow model?
I have defined my network as shown below:
x = tf.placeholder(tf.float32,[None,n_input])
y = tf.placeholder(tf.float32,[None])
weights_h1 = tf.Variable(tf.truncated_normal([n_input, n_hidden_1]))
weights_h2 = tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2]))
weights_out = tf.Variable(tf.truncated_normal([n_hidden_2, n_output]))
bias_b1 = tf.Variable(tf.truncated_normal([n_hidden_1]))
bias_b2 = tf.Variable(tf.truncated_normal([n_hidden_2]))
bias_out = tf.Variable(tf.truncated_normal([n_output]))
# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x, weights_h1), bias_b1)
layer_1 = tf.nn.relu(layer_1)
# Hidden layer with RELU activation
layer_2 = tf.add(tf.matmul(layer_1, weights_h2), bias_b2)
layer_2 = tf.nn.relu(layer_2)
# Output layer with linear activation
out_layer = tf.add(tf.matmul(layer_2, weights_out), bias_out)
Using the following command, I can train my neural network model.
_, training_error = sess.run([optimizer, loss], feed_dict={x: batch_x, y: batch_y})
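For completeness, a minimal sketch of the prediction step under the graph defined above (assuming sess is the same tf.Session used for training and new_x is a NumPy array of shape [num_samples, n_input]): since out_layer is the network's final tensor, running it once covers any number of hidden layers.

# inference only needs the output tensor and the input placeholder
predictions = sess.run(out_layer, feed_dict={x: new_x})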
I have a network in Keras with many outputs; however, my training data only provides information for a single output at a time.
At the moment my method for training has been to run a prediction on the input in question, change the value of the particular output that I am training, and then do a single batch update. If I'm right, this is the same as setting the loss to zero for every output except the one I'm trying to train.
Is there a better way? I've tried class weights, setting a zero weight for all but the output I'm training, but it doesn't give me the results I expect.
I'm using the Theano backend.
Outputting multiple results and optimizing only one of them
Let's say you want to return output from multiple layers, maybe from some intermediate layers, but you need to optimize only one target output. Here's how you can do it:
Let's start with this model:
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
# you want to extract these values
useful_info = Dense(32, activation='relu', name='useful_info')(x)
# final output. used for loss calculation and optimization
result = Dense(1, activation='softmax', name='result')(useful_info)
Compile with multiple outputs, set loss as None for extra outputs:
Give None for outputs that you don't want to use for loss calculation and optimization
model = Model(inputs=inputs, outputs=[result, useful_info])
model.compile(optimizer='rmsprop',
              loss=['categorical_crossentropy', None],
              metrics=['accuracy'])
Provide only the target outputs when training, skipping the extra outputs:
model.fit(my_inputs, {'result': train_labels}, epochs=.., batch_size=...)
# this also works:
#model.fit(my_inputs, [train_labels], epochs=.., batch_size=...)
One predict to get them all
Having one model, you can run predict only once to get all the outputs you need:
predicted_labels, useful_info = model.predict(new_x)
In order to achieve this I ended up using the Functional API. You basically create multiple models that share the same input and hidden layers but have different output layers.
For example:
https://keras.io/getting-started/functional-api-guide/
from keras.layers import Input, Dense
from keras.models import Model
# This returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions_A = Dense(1, activation='softmax')(x)
predictions_B = Dense(1, activation='softmax')(x)
# This creates a model that includes
# the Input layer and three Dense layers
modelA = Model(inputs=inputs, outputs=predictions_A)
modelA.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])

modelB = Model(inputs=inputs, outputs=predictions_B)
modelB.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])