Translate a Keras example to use the functional API

I have a working example in Keras which I want to translate to make use of the functional API. I am missing some detail: when I run the code I get ValueError: total size of new array must be unchanged when using embeddings.
Maybe someone sees what I am doing wrong.
The working code looks like this:
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, TimeDistributed, Dense, Activation

EMBEDDING_DIM = 100
embeddingMatrix = mu.loadPretrainedEmbedding('englishEmb100.txt', EMBEDDING_DIM, wordMap)
###############
# Build Model #
###############
model = Sequential()
model.add(Embedding(vocabSize+1, EMBEDDING_DIM, weights=[embeddingMatrix], trainable=False))
#model.add(Embedding(vocabSize+1, EMBEDDING_DIM))
model.add(Bidirectional(LSTM(EMBEDDING_DIM, return_sequences=True)))
model.add(TimeDistributed(Dense(maximal_value)))
model.add(Activation('relu'))
# try using different optimizers and different optimizer configs
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
My attempt to re-write it with the functional API:
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, TimeDistributed, Dense, merge

EMBEDDING_DIM = 100
embeddingMatrix = mu.loadPretrainedEmbedding('englishEmb100.txt', EMBEDDING_DIM, wordMap)
input_layer = Input(shape=(longest_sequence,), dtype='int32')
emb = Embedding(vocabSize+1, EMBEDDING_DIM, weights=[embeddingMatrix]) (input_layer)
forwards = LSTM(EMBEDDING_DIM, return_sequences=True) (emb)
backwards = LSTM(EMBEDDING_DIM, return_sequences=True, go_backwards=True) (emb)
common = merge([forwards, backwards], mode='concat', concat_axis=-1)
dense = TimeDistributed(Dense(EMBEDDING_DIM, activation='tanh')) (common)
out = TimeDistributed(Dense(len(labelMap), activation='softmax')) (dense)
model = Model(input=input_layer, output=out)
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Somewhere an operation changes the size, but I am not sure where or why this happens. I would be happy if someone could help me with this.
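For reference, a functional-API version that mirrors the Sequential model more closely would keep the Bidirectional wrapper instead of merging two LSTMs by hand. This is only a sketch, assuming the same variables (vocabSize, EMBEDDING_DIM, embeddingMatrix, longest_sequence, maximal_value) are already defined:
input_layer = Input(shape=(longest_sequence,), dtype='int32')
# frozen pretrained embeddings, exactly as in the Sequential version
emb = Embedding(vocabSize + 1, EMBEDDING_DIM, weights=[embeddingMatrix], trainable=False)(input_layer)
# Bidirectional wraps the forward and backward LSTMs and concatenates their outputs
rnn = Bidirectional(LSTM(EMBEDDING_DIM, return_sequences=True))(emb)
# one Dense per time step followed by the activation, as in the Sequential version
out = TimeDistributed(Dense(maximal_value, activation='relu'))(rnn)
model = Model(input=input_layer, output=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])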

Related

How to add the Count Vectorizer to Simple RNN model?

For my NLP project I used CountVectorizer to extract features from a dataset with vectorizer = CountVectorizer(stop_words='english') and all_features = vectorizer.fit_transform(data.Text). I also wrote a simple RNN model in Keras, but I am not sure how to do the padding and the tokenizer step and get the data trained on the model.
My code for the RNN (inside a model-building function) is:
# model = keras.Sequential() is assumed to have been created above
model.add(keras.layers.recurrent.SimpleRNN(units=1000, activation='relu',
                                           use_bias=True))
model.add(keras.layers.Dense(units=1000, input_dim=2000, activation='sigmoid'))
model.add(keras.layers.Dense(units=500, input_dim=1000, activation='relu'))
model.add(keras.layers.Dense(units=2, input_dim=500, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
Can someone please give me some advice on this?
Thank you.
Use an Embedding layer - you don't count-vectorize the text, you embed it. See this notebook for a full example:
https://github.com/dnishimoto/python-deep-learning/blob/master/UFO%20.ipynb
from keras.models import Sequential
from keras.layers import Dense, Flatten, Embedding
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences

docs = ufo_df["summary"]   # raw text
LABELS = ['Egg', 'Cross', 'Sphere', 'Triangle', 'Disk', 'Oval', 'Rectangle', 'Teardrop']
#LABELS = ['Triangle']
target = ufo_df[LABELS]
#print([len(d) for d in docs])
encoded_docs = [one_hot(d, vocab_size) for d in docs]   # integer-encode each document
#print([np.max(d) for d in encoded_docs])
padded_docs = pad_sequences(encoded_docs, maxlen=max_length, padding='post')   # pad to a fixed length
#print([d for d in padded_docs])

model = Sequential()
model.add(Embedding(vocab_size, 8, input_length=max_length))
model.add(Flatten())
model.add(Dense(8, activation='softmax'))   # one output per label
#model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(padded_docs, target, epochs=50, verbose=0)
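If you specifically want the tokenizer and padding step from the original question (instead of one_hot), here is a minimal sketch with Keras' Tokenizer and pad_sequences, assuming data.Text holds the raw documents and vocab_size / max_length are values you choose:
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer(num_words=vocab_size)            # keep only the vocab_size most frequent words
tokenizer.fit_on_texts(data.Text)
sequences = tokenizer.texts_to_sequences(data.Text)    # each document becomes a list of word indices
padded = pad_sequences(sequences, maxlen=max_length, padding='post')
# padded can now be fed to a model whose first layer is an Embedding layer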

Tensorflow structured data model.predict() returns incorrect probabilities

I'm trying to follow a TensorFlow tutorial (I'm a beginner) for structured data models, with some changes along the way.
My goal is to create a model to which I provide data (in CSV format) that looks something like this (the example has only 2 features, but I want to extend it after I figure it out):
power_0,power_1,result
0.2,0.3,draw
0.8,0.1,win
0.3,0.1,draw
0.7,0.2,win
0.0,0.4,lose
I created the model using the following code:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

def get_labels(df, label, mapping):
    raw_y_true = df.pop(label)
    y_true = np.zeros((len(raw_y_true)))
    for i, raw_label in enumerate(raw_y_true):
        y_true[i] = mapping[raw_label]
    return y_true

tf.compat.v1.enable_eager_execution()
mapping_to_numbers = {'win': 0, 'draw': 1, 'lose': 2}
data_frame = pd.read_csv('data.csv')
data_frame.head()

train, test = train_test_split(data_frame, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
train_labels = np.array(get_labels(train, label='result', mapping=mapping_to_numbers))
val_labels = np.array(get_labels(val, label='result', mapping=mapping_to_numbers))
test_labels = np.array(get_labels(test, label='result', mapping=mapping_to_numbers))
train_features = np.array(train)
val_features = np.array(val)
test_features = np.array(test)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(train_features.shape[-1],)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation='sigmoid'),
])
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)

epochs = 10
batch_size = 100
history = model.fit(
    train_features,
    train_labels,
    epochs=epochs,
    validation_data=(val_features, val_labels))

input_data_frame = pd.read_csv('input.csv')
input_data_frame.head()
input_data = np.array(input_data_frame)
print(model.predict(input_data))
input.csv looks like this:
power_0,power_1
0.8,0.1
0.7,0.2
And the actual result is:
[[0.00604381 0.00242573 0.00440606]
[0.01321151 0.00634229 0.01041476]]
I expected to get the probability for each label ('win', 'draw' and 'lose'); can anyone please help me with this?
Thanks in advance.
Use softmax activation instead of sigmoid in the line tf.keras.layers.Dense(3, activation='sigmoid').
This works well for me with your example:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(train_features.shape[-1],)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)
(This version uses a Flatten layer as the input layer.)
I have to write my suggestions here because I can't comment yet.
@zihaozhihao is right: you have to use softmax instead of sigmoid because this is not a binary problem. Another problem might be your loss function, which is:
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)
Try loss='categorical_crossentropy', because you are working with a multi-class classification problem. You can read more about multi-class classification here and here.
As for your probability question: you get the probability of each class for your two test inputs. For example:
        win        draw       lose
[[0.00604381 0.00242573 0.00440606]
 [0.01321151 0.00634229 0.01041476]]
The problem is your loss function and the activation function, which lead to these strange probability values. You might want to check this post here for more information.
Hope this helps a little and feel free to ask.
Since it is a multi-class classification problem, please use categorical_crossentropy instead of binary_crossentropy as the loss function, and use softmax instead of sigmoid as the activation function.
Also, you should increase the number of epochs to get better convergence.
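Once the model uses softmax with categorical (or sparse categorical) cross-entropy, each row returned by model.predict sums to 1 and can be mapped back to the label names. A small sketch, reusing the mapping_to_numbers dict from the question:
import numpy as np

probabilities = model.predict(input_data)             # shape (n_samples, 3), each row sums to 1
index_to_label = {v: k for k, v in mapping_to_numbers.items()}
predicted = [index_to_label[i] for i in np.argmax(probabilities, axis=1)]
print(predicted)                                      # predicted class name for each row of input.csv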

Find Most Important Input from a Neural Network

I trained a neural network with 37 inputs. It has around 85% accuracy. Is it possible for me to find out which input has the most effect? I tried this code, but I cannot figure out how to find the most important input:
weights = model.layers[0].get_weights()[0]
biases = model.layers[0].get_weights()[1]
One possible solution is to wrap your model with keras.wrappers.scikit_learn and then use Recursive Feature Elimination (RFE) in scikit-learn:
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.feature_selection import RFE

def create_model():
    # create model
    model = Sequential()
    model.add(Dense(512, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=128, verbose=0)

rfe = RFE(estimator=model, n_features_to_select=1, step=1)
rfe.fit(X, y)
ranking = rfe.ranking_.reshape(digits.images[0].shape)   # this example uses the sklearn digits dataset

# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
If you need to visualize weights see here.
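If you only need a ranked list of the 37 inputs rather than a pixel plot, here is a small sketch of reading the ranking back out, assuming feature_names is a list holding your 37 column names in the same order as the columns of X:
# feature_names: assumed list of your 37 input column names
# rank 1 = most important feature according to RFE
for rank, name in sorted(zip(rfe.ranking_, feature_names)):
    print(rank, name)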

Getting different results for identical FFNN models in Keras

I am building a model based on an FFNN (feed-forward neural network) using Keras.
I built a first version:
def mlp0(input_dim, loss):
    model = Sequential()
    model.add(Dropout(0.5, input_shape=(input_dim,)))
    model.add(Dense(512, activation='sigmoid'))
    model.add(Dense(1, activation='relu'))
    model.compile(loss=loss, optimizer=Adagrad())
    return model
This gives me very good results in k-fold cross-validation, but when I predict on the validation set, the performance is bad.
So I tried another version:
def mlp1(input_dim, loss):
    inputs = keras.Input(shape=(input_dim,))
    x = keras.layers.Dropout(0.5)(inputs)
    x = keras.layers.Dense(512, activation='sigmoid')(x)
    outputs = keras.layers.Dense(1, activation='relu')(x)
    model = keras.Model(inputs, outputs)
    model.compile(loss=loss, optimizer=Adagrad())
    return model
This second model gives worse results on cross-validation but the results are compatible with the results on the validation set.
To my eyes, they are identical models built in different ways, but for some reason they give me different answers. What am I doing wrong?
Edit:
These models behave the same way:
def mlp0(input_dim, loss):
    model = Sequential()
    model.add(Dense(512, activation='sigmoid', input_shape=(input_dim,), kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(1, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.compile(loss=loss, optimizer=Adam())
    return model

import keras
from keras import regularizers

def mlp1(input_dim, loss):
    inputs = keras.Input(shape=(input_dim,))
    x = keras.layers.Dense(512, activation='sigmoid', kernel_regularizer=regularizers.l2(0.01))(inputs)
    outputs = keras.layers.Dense(1, activation='relu', kernel_regularizer=regularizers.l2(0.01))(x)
    model = keras.Model(inputs, outputs)
    model.compile(loss=loss, optimizer=Adam())
    return model
This makes me think there is a catch in the prediction phase related to the dropout.
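One way to check whether dropout (or simply different random initialisations) explains the gap is to copy the weights from one model into the other and compare predictions on the same data. A rough sketch, assuming input_dim, loss and a validation array X_val are available:
import numpy as np

m0 = mlp0(input_dim, loss)
m1 = mlp1(input_dim, loss)
m1.set_weights(m0.get_weights())    # give both models identical parameters

p0 = m0.predict(X_val)
p1 = m1.predict(X_val)
print(np.max(np.abs(p0 - p1)))      # should be close to 0 if the architectures really are the same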

Keras: shape error when using validation data

Trying to add validation to model.fit, but whenever I do I get an error:
ValueError: Cannot feed value of shape (6, 4, 10) for Tensor 'lstm_input_1:0', which has shape '(32, 4, 10)'
Model:
data_dim = 10
timesteps = 4
batch_size = 32
model = Sequential()
model.add(LSTM(128, batch_input_shape=(batch_size, timesteps, data_dim), return_sequences=True, stateful=True))
model.add(LSTM(64, return_sequences=True, stateful=True))
model.add(LSTM(32, stateful=True))
model.add(Dense(2, activation='softmax'))
sgd = SGD(lr=0.001, momentum=0.0, decay=0.0, nesterov=False)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, nb_epoch=50, batch_size=batch_size, validation_split=0.5)
What could be the error? If I remove validation_split the training works just fine. I've also tried to manually split my training set into two and add it with validation_data=(x_val, y_val) but I got the exact same error.
The issue comes from the fact that you hard-code the batch_size of your inputs. You have fixed it to 32, and when you try to validate your model, the validation data is sent in a batch of 6 samples; this might be because you don't have enough validation data, or because the number of samples isn't a multiple of 32. However, I would leave the batch_size free if I were you, like this:
model.add(LSTM(128, input_shape=(timesteps, data_dim), return_sequences=True, stateful=True))
You specify input_shape instead of batch_input_shape. That way, your network will accept batches of any size; every layer further down in your model adapts to any batch size as long as it is not hardcoded.
I hope this helps :)
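If statefulness across batches is not actually required, a non-stateful variant avoids the fixed batch size altogether. A minimal sketch, reusing the variables from the question (note that stateful=True is dropped here, since stateful layers generally do need a fixed batch size):
model = Sequential()
model.add(LSTM(128, input_shape=(timesteps, data_dim), return_sequences=True))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])
# validation_split now works regardless of how many validation samples it produces
model.fit(x_train, y_train, nb_epoch=50, batch_size=batch_size, validation_split=0.5)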
