Keras fit ValueError in extremely simple test code

I have encountered a very persistent problem in a more complex Keras program and have boiled it down to the test below. The answer must be very simple, but I can't find it.
When I run this code:
def __init__(self):
    self.model = Sequential()
    self.model.add(Dense(4, input_shape=(4,), activation='linear'))
    self.model.compile(optimizer='adam', loss='mse')

def run(self):
    x = [1., 1., 1., 1.]
    print('x:', x, 'x shape:', np.shape(x))
    y = [0., 0., 0., 0.]
    print('y:', y, 'y shape:', np.shape(y))
    self.model.fit(x, y, batch_size=1, epochs=1, verbose=2)
The print statements show both x and y to have shape (4,), but the fit line raises:
ValueError: Error when checking input: expected dense_1_input to have
shape (4,) but got array with shape (1,)
I've tried reshaping x to (1,4) but it didn't help. I'm stumped.

Data should be 2D.
Make your x and y data 2D, e.g. x = [[1., 1., 1., 1.]], so it becomes a 1x4 array: 1 is the number of samples and 4 is the dimension you defined as the input_shape.
Also convert them to NumPy arrays with x = np.array(x), since Keras's fit method requires NumPy arrays ("x: Numpy array of training data", https://keras.io/models/model/).
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

class A:
    def __init__(self):
        self.model = Sequential()
        self.model.add(Dense(4, input_shape=(4,), activation='linear'))
        self.model.compile(optimizer='adam', loss='mse')

    def run(self):
        x = [[1., 1., 1., 1.]]
        print('x:', x, 'x shape:', np.shape(x))
        y = [[0., 0., 0., 0.]]
        print('y:', y, 'y shape:', np.shape(y))
        x = np.array(x)
        y = np.array(y)
        self.model.fit(x, y, batch_size=1, epochs=1, verbose=2)

a = A()
a.run()

The x and y arrays you pass are not the right shape. If you want an input tensor of shape (4,) for your model, you have to prepare a tensor of shape (n, 4), where n is the number of examples you are providing.
import tensorflow as tf
import numpy as np
from keras.models import Model, Sequential
from keras.layers import Input, Dense

class Mymodel(tf.keras.Model):
    def __init__(self):
        super(Mymodel, self).__init__()
        self.model = Sequential()
        self.model.add(Dense(4, input_shape=(4,), activation='linear'))
        self.model.compile(optimizer='adam', loss='mse')

    def run(self):
        x = np.ones((1, 4))
        print('x:', x, 'x shape:', np.shape(x))
        y = np.zeros((1, 4))
        print('y:', y, 'y shape:', np.shape(y))
        self.model.fit(x, y, batch_size=1, epochs=1, verbose=2)

model = Mymodel()
model.run()

Related

How to use "LeakyRelu" and Parametric Leaky Relu "PReLU" in Keras Tuner

I am using Keras Tuner with RandomSearch() to hypertune my regression model. While I can tune with "relu" and "selu", I am unable to do the same for Leaky ReLU. I understand that "relu" and "selu" work because string aliases are available for them, but no string alias exists for Leaky ReLU. I tried passing a callable LeakyReLU object (see my example below), but it doesn't seem to work. Can you please advise me how to do that? I have the same issue with Parametric Leaky ReLU (PReLU).
Thank you in advance!
def build_model(hp):
    model = Sequential()
    model.add(
        Dense(
            units=18,
            kernel_initializer='normal',
            activation='relu',
            input_shape=(18,)
        )
    )
    for i in range(hp.Int(name="num_layers", min_value=1, max_value=5)):
        model.add(
            Dense(
                units=hp.Int(
                    name="units_" + str(i),
                    min_value=18,
                    max_value=180,
                    step=18),
                kernel_initializer='normal',
                activation=hp.Choice(
                    name='dense_activation',
                    values=['relu', 'selu', LeakyReLU(alpha=0.01)],
                    default='relu'
                )
            )
        )
    model.add(Dense(units=1))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Choice(
                name="learning_rate", values=[1e-2, 1e-3, 1e-4]
            )
        ),
        loss='mse'
    )
    return model
As a workaround, you can add another activation function to the tf.keras.activations.* module by modifying its source file (activations.py).
Here is the code for tf.keras.activations.relu as it appears in activations.py:
@keras_export('keras.activations.relu')
@dispatch.add_dispatch_support
def relu(x, alpha=0., max_value=None, threshold=0):
    """Applies the rectified linear unit activation function.

    With default values, this returns the standard ReLU activation:
    `max(x, 0)`, the element-wise maximum of 0 and the input tensor.

    Modifying default parameters allows you to use non-zero thresholds,
    change the max value of the activation,
    and to use a non-zero multiple of the input for values below the threshold.

    For example:

    >>> foo = tf.constant([-10, -5, 0.0, 5, 10], dtype=tf.float32)
    >>> tf.keras.activations.relu(foo).numpy()
    array([ 0.,  0.,  0.,  5., 10.], dtype=float32)
    >>> tf.keras.activations.relu(foo, alpha=0.5).numpy()
    array([-5. , -2.5,  0. ,  5. , 10. ], dtype=float32)
    >>> tf.keras.activations.relu(foo, max_value=5).numpy()
    array([0., 0., 0., 5., 5.], dtype=float32)
    >>> tf.keras.activations.relu(foo, threshold=5).numpy()
    array([-0., -0.,  0.,  0., 10.], dtype=float32)

    Arguments:
        x: Input `tensor` or `variable`.
        alpha: A `float` that governs the slope for values lower than the
            threshold.
        max_value: A `float` that sets the saturation threshold (the largest
            value the function will return).
        threshold: A `float` giving the threshold value of the activation
            function below which values will be damped or set to zero.

    Returns:
        A `Tensor` representing the input tensor,
        transformed by the relu activation function.
        Tensor will be of the same shape and dtype of input `x`.
    """
    return K.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
Copy this code and paste it just below. Change @keras_export('keras.activations.relu') to @keras_export('keras.activations.leaky_relu'), rename the function to leaky_relu so it doesn't shadow the built-in relu, and change the default alpha to 0.2, like:
@keras_export('keras.activations.leaky_relu')
@dispatch.add_dispatch_support
def leaky_relu(x, alpha=0.2, max_value=None, threshold=0):
    """Applies the leaky rectified linear unit activation function.

    With the default alpha=0.2, values below the threshold are scaled
    by 0.2 instead of being set to zero.
    """
    return K.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
You can then use the new string alias 'leaky_relu' (exposed as keras.activations.leaky_relu).
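As a minimal sketch (assuming the patched activations.py is the copy your installation actually imports), the alias then works anywhere an activation string is accepted:

import tensorflow as tf

model = tf.keras.Sequential()
# 'leaky_relu' resolves through the patched activations.py
model.add(tf.keras.layers.Dense(18, input_shape=(18,), activation='leaky_relu'))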
# Custom activation function
from keras.layers import Activation, LeakyReLU
from keras import backend as K
from keras.utils.generic_utils import get_custom_objects

# Register leaky-relu so we can use it as a string
get_custom_objects().update({'leaky-relu': Activation(LeakyReLU(alpha=0.2))})

# Main activation functions available to use
activation_functions = ['sigmoid', 'relu', 'elu', 'leaky-relu', 'selu', 'gelu', 'swish']
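With the alias registered, hp.Choice in the question's build_model can (as a sketch under that assumption) offer it alongside the built-in strings:

model.add(
    Dense(
        units=18,
        kernel_initializer='normal',
        input_shape=(18,),
        activation=hp.Choice(
            name='dense_activation',
            values=['relu', 'selu', 'leaky-relu'],  # 'leaky-relu' resolves via get_custom_objects()
            default='relu'
        )
    )
)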

1D CNN input shape and the training data shape

I am getting an error trying to feed the following data into my network; I have an issue with reshaping the training data and the input to the network. The error I am getting is:
Error when checking target: expected conv1d_92 to have shape (4, 1) but got array with shape (1, 784)
The code is as follows:
# -*- coding: utf-8 -*-
"""
Created on Wed Mar 17 20:57:51 2021
@author: morte
"""
import keras
from keras import layers
from keras.datasets import mnist
import numpy as np

# (x_train, _), (x_test, _) = mnist.load_data()
# x_train = x_train.astype('float32') / 255.
# x_test = x_test.astype('float32') / 255.
# x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
# x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 1, 28*28))
x_test = np.reshape(x_test, (len(x_test), 1, 28*28))
input_img = keras.Input(shape=(x_train.shape[1:]))
x = layers.Conv1D(16,(3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling1D(2, padding='same')(x)
x = layers.Conv1D(8,(3), activation='relu', padding='same')(x)
x = layers.MaxPooling1D(2, padding='same')(x)
x = layers.Conv1D(8,(3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling1D(2, padding='same')(x)
# at this point the representation is (1, 8) for the (1, 784) input above
x = layers.Conv1D(8,(3), activation='relu', padding='same')(encoded)
x = layers.UpSampling1D(2)(x)
x = layers.Conv1D(8,(3), activation='relu', padding='same')(x)
x = layers.UpSampling1D(2)(x)
x = layers.Conv1D(16,(3), activation='relu')(x)
x = layers.UpSampling1D(2)(x)
decoded = layers.Conv1D(1, (3), activation='sigmoid', padding='same')(x)
autoencoder = keras.Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
from keras.datasets import mnist
import numpy as np
from keras.callbacks import TensorBoard
autoencoder.fit(x_train, x_train,
                epochs=2,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test),
                )
decoded_imgs = autoencoder.predict(x_test)
import matplotlib.pyplot as plt

n = 10
plt.figure(figsize=(20, 4))
for i in range(1, n + 1):
    # Display original
    ax = plt.subplot(2, n, i)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Display reconstruction
    ax = plt.subplot(2, n, i + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
Since this is an autoencoder network, the decoder output has to match the encoder input.
You are feeding a (1, 784)-shaped input array, but the decoder outputs a (4, 1)-shaped array.
For more detail, add the line autoencoder.summary() (for example after the line autoencoder = keras.Model(input_img, decoded)).
This prints the output shape of every layer.
One approach would be, for example, to add a Dense layer at the end of your decoder.
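An alternative sketch (one possible fix, not the only one): treat each image as a length-784 sequence with one channel, so pooling and upsampling act along the pixel axis, and give the third decoder convolution padding='same' so the lengths halve and double back cleanly (784 -> 392 -> 196 -> 98 in the encoder, 98 -> 196 -> 392 -> 784 in the decoder):

x_train = np.reshape(x_train, (len(x_train), 28*28, 1))  # (n, 784, 1)
x_test = np.reshape(x_test, (len(x_test), 28*28, 1))

input_img = keras.Input(shape=(28*28, 1))
x = layers.Conv1D(16, 3, activation='relu', padding='same')(input_img)
x = layers.MaxPooling1D(2, padding='same')(x)
x = layers.Conv1D(8, 3, activation='relu', padding='same')(x)
x = layers.MaxPooling1D(2, padding='same')(x)
x = layers.Conv1D(8, 3, activation='relu', padding='same')(x)
encoded = layers.MaxPooling1D(2, padding='same')(x)  # (98, 8)

x = layers.Conv1D(8, 3, activation='relu', padding='same')(encoded)
x = layers.UpSampling1D(2)(x)
x = layers.Conv1D(8, 3, activation='relu', padding='same')(x)
x = layers.UpSampling1D(2)(x)
x = layers.Conv1D(16, 3, activation='relu', padding='same')(x)  # padding='same' keeps lengths aligned
x = layers.UpSampling1D(2)(x)
decoded = layers.Conv1D(1, 3, activation='sigmoid', padding='same')(x)  # (784, 1) matches the input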

Incremental learning in keras

I am looking for a Keras equivalent of scikit-learn's partial_fit (https://scikit-learn.org/0.15/modules/scaling_strategies.html#incremental-learning) for incremental/online learning.
I finally found the train_on_batch method, but I can't find an example that shows how to properly implement it in a for loop for a dataset that looks like this:
x = np.array([[0.5, 0.7, 0.8]]) # input data
y = np.array([[0.4, 0.6, 0.33, 0.77, 0.88, 0.71]]) # output data
Note: this is a multi-output regression.
My code so far:
import keras
import numpy as np
x = np.array([0.5, 0.7, 0.8])
y = np.array([0.4, 0.6, 0.33, 0.77, 0.88, 0.71])
in_dim = x.shape
out_dim = y.shape
model = Sequential()
model.add(Dense(100, input_shape=(1,3), activation="relu"))
model.add(Dense(32, activation="relu"))
model.add(Dense(6))
model.compile(loss="mse", optimizer="adam")
model.train_on_batch(x,y)
I get this Error:
ValueError: Input 0 of layer sequential_28 is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape [3, 1]
You should feed your data batch-wise. You are giving a single instance, but the model expects batched data, so you need to expand the input dimensions to include a batch axis.
import keras
import numpy as np
from keras.models import *
from keras.layers import *
from keras.optimizers import *

x = np.array([0.5, 0.7, 0.8])
y = np.array([0.4, 0.6, 0.33, 0.77, 0.88, 0.71])
x = np.expand_dims(x, axis=0)  # (3,) -> (1, 3): a batch of one sample
y = np.expand_dims(y, axis=0)  # (6,) -> (1, 6)
in_dim = x.shape
out_dim = y.shape

model = Sequential()
model.add(Dense(100, input_shape=(3,), activation="relu"))  # each sample is a flat 3-vector
model.add(Dense(32, activation="relu"))
model.add(Dense(6))
model.compile(loss="mse", optimizer="adam")
model.train_on_batch(x, y)
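The question also asked how to call train_on_batch in a for loop; here is a minimal sketch with the same model, where the multi-sample arrays X_data and Y_data, the batch size, and the epoch count are illustrative assumptions:

# Hypothetical dataset: 100 samples with 3 inputs and 6 outputs each
X_data = np.random.rand(100, 3)
Y_data = np.random.rand(100, 6)

batch_size = 10
for epoch in range(5):
    for i in range(0, len(X_data), batch_size):
        # Each call performs one gradient update on one batch
        loss = model.train_on_batch(X_data[i:i + batch_size],
                                    Y_data[i:i + batch_size])
    print('epoch:', epoch, 'loss:', loss)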

How to get classification probabilities in Keras?

I am trying to get classification probabilities out of my trained Keras model but when I use the model.predict (or model.predict_proba) method, all I get is an array of this form:
array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]], dtype=float32)
So basically I get a one-hot encoded float array. The "1" is mostly in the right place, so the training seems to have worked fine. But why can't I get the probabilities out? See the code below for the architecture used.
First I read in the data:
mnist_train = pd.read_csv('data/mnist_train.csv')
mnist_test = pd.read_csv('data/mnist_test.csv')
mnist_train_images = mnist_train.iloc[:, 1:].values
mnist_train_labels = mnist_train.iloc[:, :1].values
mnist_test_images = mnist_test.iloc[:, 1:].values
mnist_test_labels = mnist_test.iloc[:, :1].values
mnist_train_images = mnist_train_images.astype('float32')
mnist_test_images = mnist_test_images.astype('float32')
mnist_train_images /= 255
mnist_test_images /= 255
mnist_train_labels = keras.utils.to_categorical(mnist_train_labels, 10)
mnist_test_labels = keras.utils.to_categorical(mnist_test_labels, 10)
mnist_train_images = mnist_train_images.reshape(60000,28,28,1)
mnist_test_images = mnist_test_images.reshape(10000,28,28,1)
Then I build my model and train:
num_classes = mnist_test_labels.shape[1]
model = Sequential()
model.add(Conv2D(64, (5, 5), input_shape=(28, 28, 1), activation='relu', data_format="channels_last", padding="same"))
model.add(Conv2D(64, (5, 5), input_shape=(28, 28, 1), activation='relu', data_format="channels_last", padding="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', data_format="channels_last", padding="same"))
model.add(Conv2D(128, (3, 3), activation='relu', data_format="channels_last", padding="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(mnist_train_images, mnist_train_labels, validation_data=(mnist_test_images, mnist_test_labels), epochs=20, batch_size=256, verbose=2)
scores = model.evaluate(mnist_test_images, mnist_test_labels, verbose=0)
print("CNN Error: %.2f%%" % (100-scores[1]*100))
model.save('mnist-weights.model')
model.save_weights("mnist-model.h5")
model_json = model.to_json()
with open("mnist-model.json", "w") as json_file:
    json_file.write(model_json)
But when I load the model in another application and try to predict probabilities as shown below, the output described above occurs. What am I doing wrong?
json_file = open('alphabet_keras/mnist_model.json', 'r')
model_json = json_file.read()
model = model_from_json(model_json)
model.load_weights("alphabet_keras/mnist_model.h5")
letter = cv2.cvtColor(someImg, cv2.COLOR_BGR2GRAY)
letter = fitSquare(letter,28,2) # proprietary function, doesn't matter
letter_expanded = np.expand_dims(letter, axis=0)
letter_expanded = np.expand_dims(letter_expanded, axis=3)
model.predict_proba(letter_expanded)#[0]
The output is as follows:
array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]], dtype=float32)
I expect something like:
array([[0.1, 0.34, 0.2, 0.8, 0.1, 0.62, 0.67, 1.0, 0.31, 0.59]], dtype=float32)
There are no error messages of any kind. Please help :)
Your expected output is not correct. For classification, the output of a neural network is a probability distribution over the labels, which means the probabilities are between 0 and 1 and sum to 1.0; the values you show sum to more than 1.0.
As for your specific problem, the probabilities look saturated. This is caused by not normalizing the pixel values at prediction time: you divide by 255 for the training and test sets, but not for the new image, and this inconsistency saturates the output neurons.
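A minimal sketch of the fix at prediction time, assuming letter holds 8-bit grayscale pixel values like the raw training images did:

letter = letter.astype('float32') / 255.  # same scaling as the training/test sets
letter_expanded = np.expand_dims(letter, axis=0)
letter_expanded = np.expand_dims(letter_expanded, axis=3)
probs = model.predict(letter_expanded)[0]  # ten floats in [0, 1] summing to 1.0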

keras multi output softmax model input shape

I have the following model:
from keras.layers import Activation, Input, Dense
from keras.models import Model
from keras.layers.merge import Concatenate, concatenate
input_ = Input(batch_shape=(512, 36))
x = input_
x1 = Dense(4)(x)
x2 = Dense(4)(x)
x3 = Dense(4)(x)
x4 = Dense(4)(x)
model = Model(inputs=input_, outputs=[x1, x2, x3, x4])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, test, epochs=20, batch_size=512, verbose=2, shuffle=False, validation_data=[X, test])
My Y has the following format:
col1  col2  ...  col4
1     0     ...  2
0     0     ...  2
2     1     ...  1
and is reshaped via:
y = to_categorical(Y).reshape(4, -1, 3)
However, when running the fit command, I get the following error:
ValueError: Error when checking model target: the list of Numpy arrays that
you are passing to your model is not the size the model expected. Expected
to see 4 array(s), but instead got the following list of 1 arrays:
[array([[[1., 0., 0.],
[1., 0., 0.],
[1., 0., 0.],
Assuming Y is a NumPy matrix, try this:
y = [to_categorical(Y[:, col_numb]).reshape(-1, 3) for col_numb in range(Y.shape[1])]
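A hedged end-to-end sketch of how the targets then line up with the four outputs (note that with 3 classes per column, each output head needs 3 softmax units rather than the Dense(4) heads in the question; the sample labels below are hypothetical):

import numpy as np
from keras.utils import to_categorical
from keras.layers import Input, Dense
from keras.models import Model

Y = np.array([[1, 0, 1, 2],
              [0, 0, 2, 2],
              [2, 1, 0, 1]])  # hypothetical labels: n=3 rows, 4 columns, 3 classes

# One (n, 3) one-hot array per output column
y = [to_categorical(Y[:, col], num_classes=3) for col in range(Y.shape[1])]

input_ = Input(shape=(36,))
outputs = [Dense(3, activation='softmax')(input_) for _ in range(4)]
model = Model(inputs=input_, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# model.fit(X, y, ...) now receives a list of 4 arrays, as the model expects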
