I am trying to stack Keras models and scikit-learn ML models for a regression problem. I am able to stack simple dense Keras layers with any of the ML models.
My input is similar to the Boston house price dataset (multivariate). This code snippet shows how I tackle the problem:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from xgboost import XGBRegressor

# Scikit-learn model
rnd_reg = RandomForestRegressor(n_estimators=100, random_state=42)

# Keras model (X is my training feature matrix)
def build_nn():
    model = Sequential(
        [Dense(512, activation='relu', input_shape=[X.shape[1]]),
         Dense(256, activation='relu'),
         Dropout(0.4),
         Dense(128, activation='relu'),
         Dense(64, activation='relu'),
         Dropout(0.2),
         Dense(1, activation='linear')
        ])
    model.compile(optimizer='adam',
                  loss='mean_squared_error',
                  metrics=['MeanSquaredError', 'MeanAbsolutePercentageError'])
    return model

# Wrap the Keras model so scikit-learn can treat it as an estimator
keras_reg = tf.keras.wrappers.scikit_learn.KerasRegressor(build_nn,
                                                          epochs=20,
                                                          batch_size=1,
                                                          verbose=True)
keras_reg._estimator_type = "regressor"

st_reg = StackingRegressor(
    estimators=[('rf', rnd_reg),
                ('Dense', keras_reg)],
    final_estimator=XGBRegressor(random_state=42)
)
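After that I fit and score the stacked model like this (X_train, y_train, X_test, and y_test are placeholder names for my own train/test split):

# Fit the stacked ensemble and score it on held-out data.
st_reg.fit(X_train, y_train)
print(st_reg.score(X_test, y_test))  # R^2 score from scikit-learn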
Everything works just fine with this part. The issue is when I want to use an RNN (an LSTM, for example) and stack it with the sklearn models. I am not sure whether this is due to a wrong input shape, or whether an LSTM simply cannot be stacked with sklearn's ML models.
Below is my LSTM structure:
from tensorflow.keras.layers import LSTM, BatchNormalization
from tensorflow.keras.optimizers import Adam

def build_nn1():
    model = Sequential(
        #[LSTM(50, activation='relu', batch_input_shape=(None, X.shape[1], 1)),
        [LSTM(50, activation='relu', input_shape=[X.shape[1], 1]),
         BatchNormalization(),
         #Dropout(0.2),
         Dense(20, activation='relu'),
         Dense(1, activation='relu')
        ])
    model.compile(optimizer=Adam(learning_rate=1e-5),
                  loss='mean_squared_error',
                  metrics=['MeanSquaredError', 'MeanAbsolutePercentageError'])
    return model
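One workaround I am considering (untested): StackingRegressor always passes a 2-D (samples, features) array to each base estimator, while the LSTM expects 3-D input (samples, timesteps, features), so I could let the network itself lift the input to 3-D with a Reshape layer:

from tensorflow.keras.layers import Reshape

# Untested sketch: reshape 2-D (samples, features) input to the 3-D shape
# the LSTM expects, treating each feature as one timestep of length 1.
def build_nn1_reshaped():
    model = Sequential(
        [Reshape((X.shape[1], 1), input_shape=[X.shape[1]]),  # 2-D -> 3-D
         LSTM(50, activation='relu'),
         BatchNormalization(),
         Dense(20, activation='relu'),
         Dense(1, activation='linear')
        ])
    model.compile(optimizer=Adam(learning_rate=1e-5),
                  loss='mean_squared_error')
    return model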
The specific requirements for base_estimator are not spelled out in the sklearn documentation. I want to use an LSTM as the base_estimator of AdaBoostRegressor, but the approach shown below doesn't work. How can I design an LSTM that works as a base_estimator? Thank you all.
# This is my model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(LSTM(128, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(8, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1))
model.compile(optimizer=Adam(learning_rate=0.0003),
              loss='mean_squared_error')

from sklearn.ensemble import AdaBoostRegressor
regr = AdaBoostRegressor(base_estimator=model, n_estimators=5, random_state=1)
I tried to assign model to base_estimator, but it raised the following error:
TypeError: Cannot clone object '<tensorflow.python.keras.engine.sequential.Sequential object at 0x7f972c2982e0>'
(type <class 'tensorflow.python.keras.engine.sequential.Sequential'>):
it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' method.
I don't know how to implement this LSTM as a base_estimator.
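From the error message it seems sklearn needs an object that implements get_params, so my next idea (untested) is to wrap the model-building function in KerasRegressor, the wrapper scikit-learn can clone. Note that AdaBoostRegressor also requires the estimator's fit to accept sample_weight, which may still be a problem. timesteps and n_features below are placeholder names for my actual input shape:

import tensorflow as tf

# Untested sketch: KerasRegressor implements get_params(), which is what
# the TypeError complains about.
def build_lstm():
    model = Sequential()
    model.add(LSTM(128, return_sequences=False,
                   input_shape=(timesteps, n_features)))  # placeholder shape
    model.add(Dense(1))
    model.compile(optimizer=Adam(learning_rate=0.0003),
                  loss='mean_squared_error')
    return model

keras_base = tf.keras.wrappers.scikit_learn.KerasRegressor(build_lstm,
                                                           epochs=10,
                                                           verbose=0)
regr = AdaBoostRegressor(base_estimator=keras_base,
                         n_estimators=5, random_state=1)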
My model is defined like this for a sentiment analysis problem:
import tensorflow as tf

def create_model(vocab_size, embedding_dim, maxlen, embeddings_matrix):
    model = tf.keras.Sequential([
        # This is how you need to set the Embedding layer when using pre-trained embeddings
        tf.keras.layers.Embedding(vocab_size + 1, embedding_dim, input_length=maxlen,
                                  weights=[embeddings_matrix], trainable=False),
        tf.keras.layers.Conv1D(128, 6, activation='relu'),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(6, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model
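For context, this is roughly how I build and train it (the numeric values and the array names here are placeholders for my actual tokenizer/embedding setup):

# Placeholder hyperparameters and arrays standing in for my real setup.
model = create_model(vocab_size=10000, embedding_dim=100, maxlen=120,
                     embeddings_matrix=embeddings_matrix)
history = model.fit(train_padded, train_labels,
                    validation_data=(val_padded, val_labels),
                    epochs=30)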
Then I get very jagged curves for validation accuracy and loss (shown in blue).
How can I smooth them out? Which parameters, or which layers in my model, should I change?
I would like my curves to be close to the smoother reference curves I am aiming for.
For my NLP project I used CountVectorizer to extract features from a dataset, via vectorizer = CountVectorizer(stop_words='english') and all_features = vectorizer.fit_transform(data.Text). I also wrote a simple RNN model using Keras, but I am not sure how to do the padding and the tokenizer step and get the data trained on the model.
My code for the RNN is:
from tensorflow import keras

def build_rnn():
    model = keras.Sequential()
    model.add(keras.layers.SimpleRNN(units=1000, activation='relu',
                                     use_bias=True))
    model.add(keras.layers.Dense(units=1000, input_dim=2000, activation='sigmoid'))
    model.add(keras.layers.Dense(units=500, input_dim=1000, activation='relu'))
    model.add(keras.layers.Dense(units=2, input_dim=500, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model
Can someone please give me some advice on this?
Thank you.
Use an embedding instead: you don't CountVectorize the text, you one-hot encode it, pad the sequences, and feed them to an Embedding layer.
https://github.com/dnishimoto/python-deep-learning/blob/master/UFO%20.ipynb
import numpy as np
from tensorflow.keras.preprocessing.text import one_hot
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

docs = ufo_df["summary"]  # text
LABELS = ['Egg', 'Cross', 'Sphere', 'Triangle', 'Disk', 'Oval', 'Rectangle', 'Teardrop']
#LABELS = ['Triangle']
target = ufo_df[LABELS]

#print([len(d) for d in docs])
encoded_docs = [one_hot(d, vocab_size) for d in docs]
#print([np.max(d) for d in encoded_docs])
padded_docs = pad_sequences(encoded_docs, maxlen=max_length, padding='post')
#print([d for d in padded_docs])

model = Sequential()
model.add(Embedding(vocab_size, 8, input_length=max_length))
model.add(Flatten())
model.add(Dense(8, activation='softmax'))
#model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(padded_docs, target, epochs=50, verbose=0)
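A sketch of how inference would then look (assuming vocab_size and max_length are the same values used for training; new_doc is a made-up example):

# Hypothetical inference sketch: encode and pad a new summary the same way,
# then read off per-label scores in the order of LABELS.
new_doc = ["bright egg shaped object hovering over the field"]
encoded = [one_hot(d, vocab_size) for d in new_doc]
padded = pad_sequences(encoded, maxlen=max_length, padding='post')
scores = model.predict(padded)
print(dict(zip(LABELS, scores[0])))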
I'm trying to follow a TensorFlow tutorial (I'm a beginner) for structured-data models, with some changes along the way.
My purpose is to create a model to which I provide data (in CSV format) that looks something like this (the example has only 2 features, but I want to extend it once I figure this out):
power_0,power_1,result
0.2,0.3,draw
0.8,0.1,win
0.3,0.1,draw
0.7,0.2,win
0.0,0.4,lose
I created the model using the following code:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

def get_labels(df, label, mapping):
    raw_y_true = df.pop(label)
    y_true = np.zeros((len(raw_y_true)))
    for i, raw_label in enumerate(raw_y_true):
        y_true[i] = mapping[raw_label]
    return y_true

tf.compat.v1.enable_eager_execution()
mapping_to_numbers = {'win': 0, 'draw': 1, 'lose': 2}
data_frame = pd.read_csv('data.csv')
data_frame.head()
train, test = train_test_split(data_frame, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
train_labels = np.array(get_labels(train, label='result', mapping=mapping_to_numbers))
val_labels = np.array(get_labels(val, label='result', mapping=mapping_to_numbers))
test_labels = np.array(get_labels(test, label='result', mapping=mapping_to_numbers))
train_features = np.array(train)
val_features = np.array(val)
test_features = np.array(test)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(train_features.shape[-1],)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation='sigmoid'),
])
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)

epochs = 10
batch_size = 100
history = model.fit(
    train_features,
    train_labels,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(val_features, val_labels))

input_data_frame = pd.read_csv('input.csv')
input_data_frame.head()
input_data = np.array(input_data_frame)
print(model.predict(input_data))
input.csv looks as follows:
power_0,power_1
0.8,0.1
0.7,0.2
And the actual result is:
[[0.00604381 0.00242573 0.00440606]
[0.01321151 0.00634229 0.01041476]]
I expected to get the probability for each label ('win', 'draw' and 'lose'). Can anyone please help me with this?
Thanks in advance.
Use softmax activation in this line: tf.keras.layers.Dense(3, activation='sigmoid').
This works well for me with your example:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(train_features.shape[-1],)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)
Note that this version uses a Flatten layer to define the input shape.
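With softmax plus sparse_categorical_crossentropy, each row of predict() sums to 1, so you can read the class probabilities directly. A small sketch of mapping them back to label names after fitting (inverse_mapping is my own helper built from your mapping_to_numbers):

import numpy as np

# Each prediction row sums to 1; argmax picks the most likely class.
probs = model.predict(input_data)
inverse_mapping = {v: k for k, v in mapping_to_numbers.items()}
predicted_labels = [inverse_mapping[i] for i in np.argmax(probs, axis=1)]
print(predicted_labels)  # e.g. ['win', 'win'] for the two rows in input.csv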
I have to write my suggestions here because I can't comment yet.
@zihaozhihao is right: you have to use softmax instead of sigmoid because you are not working with a binary problem. Another problem might be your loss function, which is:
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)
Try to use loss='categorical_crossentropy', because you are working with a multi-class classification problem. You can read more about multi-class classification here and here.
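One caveat to add: categorical_crossentropy expects one-hot encoded targets, while your get_labels function produces plain integer labels, so you would need to convert them first (or keep sparse_categorical_crossentropy as in the other answer). A sketch:

from tensorflow.keras.utils import to_categorical

# Convert the integer labels (0/1/2 from mapping_to_numbers) to one-hot
# vectors so they match loss='categorical_crossentropy'.
train_labels_onehot = to_categorical(train_labels, num_classes=3)
val_labels_onehot = to_categorical(val_labels, num_classes=3)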
As for your probability question: you get the probability of each class for your two test inputs. For example:

         win        draw       lose
[[0.00604381 0.00242573 0.00440606]
 [0.01321151 0.00634229 0.01041476]]

The problem is your loss function and the activation function, which lead to these strange probability values. You might want to check this post here for more information.
Hope this helps a little, and feel free to ask.
Since it is a multi-class classification problem, please use categorical_crossentropy instead of binary_crossentropy as the loss function, and use softmax instead of sigmoid as the activation function.
Also, you should increase the number of epochs to get better convergence.
I am building a model based on FFNN (Feed Forward Neural Network) using Keras.
I built a first version:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adagrad

def mlp0(input_dim, loss):
    model = Sequential()
    model.add(Dropout(0.5, input_shape=(input_dim,)))
    model.add(Dense(512, activation='sigmoid'))
    model.add(Dense(1, activation='relu'))
    model.compile(loss=loss, optimizer=Adagrad())
    return model
This gives me very good results in k-fold cross-validation, but when I predict on the validation set, the performance is bad.
So I tried another version.
import keras

def mlp1(input_dim, loss):
    inputs = keras.Input(shape=(input_dim,))
    x = keras.layers.Dropout(0.5)(inputs)
    x = keras.layers.Dense(512, activation='sigmoid')(x)
    outputs = keras.layers.Dense(1, activation='relu')(x)
    model = keras.Model(inputs, outputs)
    model.compile(loss=loss, optimizer=Adagrad())
    return model
This second model gives worse results in cross-validation, but those results are consistent with the results on the validation set.
To my eye, they are identical models built in different ways, yet for some reason they give me different answers. What am I doing wrong?
Edit:
These models behave the same way:
import keras
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

def mlp0(input_dim, loss):
    model = Sequential()
    model.add(Dense(512, activation='sigmoid', input_shape=(input_dim,),
                    kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(1, activation='relu',
                    kernel_regularizer=regularizers.l2(0.01)))
    model.compile(loss=loss, optimizer=Adam())
    return model

def mlp1(input_dim, loss):
    inputs = keras.Input(shape=(input_dim,))
    x = keras.layers.Dense(512, activation='sigmoid',
                           kernel_regularizer=regularizers.l2(0.01))(inputs)
    outputs = keras.layers.Dense(1, activation='relu',
                                 kernel_regularizer=regularizers.l2(0.01))(x)
    model = keras.Model(inputs, outputs)
    model.compile(loss=loss, optimizer=Adam())
    return model
This makes me think there is a catch in the prediction phase related to dropout.
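To verify, here is a sanity check I plan to run on the original dropout versions of mlp0/mlp1 (input_dim and x_batch are placeholders for my data):

import numpy as np

# Keras disables dropout at inference unless training=True is passed,
# so with identical weights both builds should predict identically.
m0 = mlp0(input_dim, loss='mse')
m1 = mlp1(input_dim, loss='mse')
m1.set_weights(m0.get_weights())   # copy weights so outputs are comparable

p0 = m0.predict(x_batch)           # dropout inactive
p1 = m1.predict(x_batch)           # dropout inactive
print(np.allclose(p0, p1))         # expect True

p_train_mode = m0(x_batch, training=True)  # dropout active (stochastic)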