I am doing text classification with an LSTM model. I get 98% accuracy on the validation data, but when I submit my predictions the score is 0. Please help me figure out what to do; I am a beginner in NLP.
I have data like this:
train.head()
    id  category  text
0  959         0  5573 1189 4017 1207 4768 8542 17 1189 5085 5773
1  994         0  6315 7507 6700 4742 1944 2692 3647 4413 6700
2  995         0  5015 8067 5335 1615 7957 5773
3  996         0  2925 7199 1994 4647 7455 5773 4518 2734 2807 8...
4  997         0  7136 1207 6781 237 4971 3669 6193
I apply the tokenizer here:
from keras.preprocessing.text import Tokenizer
max_features = 1000
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(list(X_train))
X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)
I apply sequence padding here:
from keras.preprocessing import sequence
max_words = 30
X_train = sequence.pad_sequences(X_train, maxlen=max_words)
X_test = sequence.pad_sequences(X_test, maxlen=max_words)
print(X_train.shape,X_test.shape)
Here is my model:
batch_size = 64
epochs = 5
max_features = 1000
embed_dim = 100
num_classes = train['category'].nunique()
model = Sequential()
model.add(Embedding(max_features, embed_dim, input_length=X_train.shape[1]))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100, dropout=0.2))
model.add(Dense(num_classes, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
Layer (type) Output Shape Param #
=================================================================
embedding_2 (Embedding) (None, 30, 100) 100000
_________________________________________________________________
conv1d_3 (Conv1D) (None, 30, 32) 9632
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 15, 32) 0
_________________________________________________________________
conv1d_4 (Conv1D) (None, 15, 32) 3104
_________________________________________________________________
max_pooling1d_4 (MaxPooling1 (None, 7, 32) 0
_________________________________________________________________
lstm_2 (LSTM) (None, 100) 53200
_________________________________________________________________
dense_2 (Dense) (None, 2) 202
=================================================================
Total params: 166,138
Trainable params: 166,138
Non-trainable params: 0
_________________________________________________________________
None
Here are my epochs:
model_history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=batch_size, verbose=1)
Train on 2771 samples, validate on 693 samples
Epoch 1/5
2771/2771 [==============================] - 2s 619us/step - loss: 0.2816 - acc: 0.9590 - val_loss: 0.1340 - val_acc: 0.9668
Epoch 2/5
2771/2771 [==============================] - 1s 238us/step - loss: 0.1194 - acc: 0.9664 - val_loss: 0.0809 - val_acc: 0.9668
Epoch 3/5
2771/2771 [==============================] - 1s 244us/step - loss: 0.0434 - acc: 0.9843 - val_loss: 0.0258 - val_acc: 0.9899
Epoch 4/5
2771/2771 [==============================] - 1s 236us/step - loss: 0.0150 - acc: 0.9958 - val_loss: 0.0423 - val_acc: 0.9899
Epoch 5/5
2771/2771 [==============================] - 1s 250us/step - loss: 0.0064 - acc: 0.9984 - val_loss: 0.0532 - val_acc: 0.9899
Afterwards I applied the predict function to the test data.
My submission file looks like this:
submission.head()
id category
0 3729 0.999434
1 3732 0.999128
2 3761 0.999358
3 5 0.996779
4 7 0.998702
My actual submission file looks like this:
submission.head()
id category
0 3729 1
1 3732 1
2 3761 1
3 5 1
4 7 1
Looks like you need to transform your results back into words! When you tokenized and padded, that turned words into numbers. You just need to change them back! For example:
transformed_category = []
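for cat in submission['category']:
    # word_index is a dict (word -> index) and cannot be called; index_word is the reverse mapping
    transformed_category.append(tokenizer.index_word.get(int(cat)))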
For education's sake: this is done because math can't really be performed on strings, at least not as readily as on numbers. So any time you feed text to a neural network, it needs to be turned into a numerical representation first. Vectorizers (which is what your tokenizer did) and 'one-hot' or 'categorical' encodings are the most common methods. In either case, once you get your results back out of the network, you can turn them back into words for humans. :)
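As a rough illustration of the round trip described above, here is a minimal sketch in plain Python. The helper names (build_vocab, encode, decode) are hypothetical and are not part of Keras, and reserving index 0 for padding is an assumption that mirrors a common convention.

```python
def build_vocab(texts):
    """Assign each distinct word an integer index, starting at 1 (0 is kept for padding)."""
    vocab = {}
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode(text, vocab):
    """Turn a string into a list of integer indices, skipping unknown words."""
    return [vocab[w] for w in text.split() if w in vocab]


def decode(indices, vocab):
    """Map indices back to words using the inverted vocabulary."""
    index_to_word = {i: w for w, i in vocab.items()}
    return " ".join(index_to_word[i] for i in indices if i in index_to_word)


texts = ["the cat sat", "the dog sat down"]
vocab = build_vocab(texts)
encoded = encode("the dog sat", vocab)   # numbers the network can work with
decoded = decode(encoded, vocab)         # back to words for humans
print(encoded, decoded)
```

The network only ever sees the integer lists; the vocabulary has to be kept around if the indices are to be turned back into words later.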
edit after comments
Hi! So yes, I was reading the columns wrong. You're getting values of 1 (or really close to it) because sigmoid can only output values between 0 and 1, but then it looks like you wanted that, since your loss is binary_crossentropy. With sigmoid activation, large inputs asymptotically approach 1. So I'd say you need to rethink your output layer. It looks like you're sending in arrays of numbers and want to get out a category from a range broader than 0 to 1, so consider turning your Y data into categoricals, using softmax as your final output activation, and changing your loss to categorical_crossentropy.
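A minimal NumPy sketch of that suggestion, under assumed inputs: the labels and logits below are made up, and one_hot and softmax are hypothetical helpers standing in for what the categorical encoding and the softmax activation do. Taking argmax to get an integer class per row is an assumption about what the submission file expects.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out


def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


y = np.array([0, 1, 0])                    # made-up integer categories
y_oh = one_hot(y, num_classes=2)           # targets for categorical_crossentropy

logits = np.array([[2.0, -1.0],            # made-up raw outputs of the last layer
                   [0.5, 3.0],
                   [1.5, 1.0]])
probs = softmax(logits)                    # each row sums to 1
pred = probs.argmax(axis=1)                # one integer class per sample
print(y_oh, probs, pred, sep="\n")
```

With softmax the two outputs compete, so each row is a distribution over the categories rather than two independent values that can both sit near 1.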
I am trying to design a model for binary image classification. This is my first classifier and I am following an online tutorial, but the model always predicts class 0.
My dataset contains 3620 and 3651 images of the two classes respectively. I don't think the problem is due to an imbalanced dataset, because the model predicts only the class with the smaller number of samples.
My code
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
img_hieght, img_width = 150,150
train_data_dir = 'dataset/train'
#validation_data_dir = 'dataset/validation'
nb_train_samples = 3000
#nb_validation_samples = 500
epochs = 10
batch_size = 16
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_hieght)
else:
    input_shape = (img_width, img_hieght, 3)
model = Sequential()
model.add(Conv2D(32,(3,3), input_shape = input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(32,(3,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64,(3,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = 'rmsprop', metrics = ['accuracy'])
train_datagen = ImageDataGenerator(
rescale = 1. /255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size = (img_width,img_hieght),
batch_size = batch_size,
class_mode = 'binary')
model.fit_generator(train_generator,
steps_per_epoch = nb_train_samples//batch_size,
epochs = epochs)
model.save('classifier.h5')
I have tried checking the model summary as well, but couldn't detect anything notable
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 148, 148, 32) 896
_________________________________________________________________
activation_1 (Activation) (None, 148, 148, 32) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 74, 74, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 72, 72, 32) 9248
_________________________________________________________________
activation_2 (Activation) (None, 72, 72, 32) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 36, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 34, 34, 64) 18496
_________________________________________________________________
activation_3 (Activation) (None, 34, 34, 64) 0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 17, 17, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 18496) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 1183808
_________________________________________________________________
activation_4 (Activation) (None, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 64) 0
_________________________________________________________________
dense_2 (Dense) (None, 1) 65
_________________________________________________________________
activation_5 (Activation) (None, 1) 0
=================================================================
Total params: 1,212,513
Trainable params: 1,212,513
Non-trainable params: 0
_________________________________________________________________
None
I have not used a validation dataset; I am using only the training data and testing the model manually with:
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
batch_size = 16
path = 'dataset/test'
imgen = ImageDataGenerator(rescale=1/255.)
testGene = imgen.flow_from_directory(directory=path,
target_size=(150, 150,),
shuffle=False,
class_mode='binary',
batch_size=batch_size,
save_to_dir=None
)
model = tf.keras.models.load_model("classifier.h5")
pred = model.predict_generator(testGene, steps=testGene.n/batch_size)
print(pred)
Here are the accuracy and loss values per epochs:
Epoch 1/10
187/187 [==============================] - 62s 330ms/step - loss: 0.5881 - accuracy: 0.7182
Epoch 2/10
187/187 [==============================] - 99s 529ms/step - loss: 0.4102 - accuracy: 0.8249
Epoch 3/10
187/187 [==============================] - 137s 733ms/step - loss: 0.3266 - accuracy: 0.8646
Epoch 4/10
187/187 [==============================] - 159s 851ms/step - loss: 0.3139 - accuracy: 0.8620
Epoch 5/10
187/187 [==============================] - 112s 597ms/step - loss: 0.2871 - accuracy: 0.8873
Epoch 6/10
187/187 [==============================] - 60s 323ms/step - loss: 0.2799 - accuracy: 0.8847
Epoch 7/10
187/187 [==============================] - 66s 352ms/step - loss: 0.2696 - accuracy: 0.8870
Epoch 8/10
187/187 [==============================] - 57s 303ms/step - loss: 0.2440 - accuracy: 0.8947
Epoch 9/10
187/187 [==============================] - 56s 299ms/step - loss: 0.2478 - accuracy: 0.8994
Epoch 10/10
187/187 [==============================] - 53s 285ms/step - loss: 0.2448 - accuracy: 0.9047
You use only 3000 samples per epoch (see the line nb_train_samples = 3000), while you have 3620 and 3651 images for the two classes. Given that the model reaches 90% accuracy and predicts only zeros, I suppose that you are passing only class-zero images to the network during training. Consider increasing nb_train_samples.
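To make the numbers concrete, here is a small sketch of the arithmetic, using the counts and batch size quoted in the question. Rounding up to cover every image once per epoch is an assumption about what "increasing nb_train_samples" should achieve; the variable names are only illustrative.

```python
import math

batch_size = 16
class_counts = [3620, 3651]            # images per class, as stated in the question
total_images = sum(class_counts)

# Steps per epoch with the current cap of 3000 samples.
steps_capped = 3000 // batch_size

# Steps per epoch needed to visit every image once.
steps_full = math.ceil(total_images / batch_size)

# Share of the dataset drawn in one epoch under the current cap.
seen_fraction = steps_capped * batch_size / total_images

print(steps_capped, steps_full, round(seen_fraction, 2))
```

The capped value matches the 187/187 shown in the training log, and it covers well under half of the available images in each epoch.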
I am building a hashtag recommendation model for Twitter media posts. It takes the tweet text as input, applies a 300-dimensional word embedding to it, and classifies it into one of 198 hashtag classes. When I run my model, the accuracy stays below 0.0011 and does not improve in later epochs. What is wrong with my model?
import pickle
import numpy as np
from keras import initializers, regularizers
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras.layers import LSTM, Activation, Dense, Dropout, Embedding
from keras.layers.normalization import BatchNormalization
from keras.models import Sequential, load_model
package = "2018_pickle"
with open(path1, "rb") as f:
    maxLen, l_h2i, l_w2i = pickle.load(f)
with open(path2, "rb") as f:
    X_train, X_test, X_train_indices, X_test_indices = pickle.load(f)
with open(path3, "rb") as f:
    Y_train, Y_test, Y_train_oh, Y_test_oh = pickle.load(f)
with open(path4, "rb") as f:
    emd_matrix = pickle.load(f)
if __name__ == "__main__":
    modelname = "model_1"
    train = False
    vocab_size = len(emd_matrix)
    emd_dim = emd_matrix.shape[1]
    if train:
        model = Sequential()
        model.add(
            Embedding(
                vocab_size,
                emd_dim,
                weights=[emd_matrix],
                input_length=maxLen,
                trainable=False,
            )
        )
        model.add(
            LSTM(
                256,
                return_sequences=True,
                activation="relu",
                kernel_regularizer=regularizers.l2(0.01),
                kernel_initializer=initializers.glorot_normal(seed=None),
            )
        )
        model.add(
            LSTM(
                256,
                return_sequences=True,
                activation="relu",
                kernel_regularizer=regularizers.l2(0.01),
                kernel_initializer=initializers.glorot_normal(seed=None),
            )
        )
        model.add(
            LSTM(
                256,
                return_sequences=False,
                activation="relu",
                kernel_regularizer=regularizers.l2(0.01),
                kernel_initializer=initializers.glorot_normal(seed=None),
            )
        )
        model.add(Dense(198, activation="softmax"))
        model.compile(
            loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"]
        )
        checkpoint = ModelCheckpoint(
            filepath, monitor="loss", verbose=1, save_best_only=True, mode="min"
        )
        reduce_lr = ReduceLROnPlateau(
            monitor="val_loss", factor=0.5, patience=2, min_lr=0.000001
        )
        history = model.fit(
            X_train_indices,
            Y_train_oh,
            batch_size=2048,
            epochs=5,
            validation_split=0.1,
            shuffle=True,
            callbacks=[checkpoint, reduce_lr],
        )
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_10 (Embedding) (None, 54, 300) 22592100
_________________________________________________________________
lstm_18 (LSTM) (None, 54, 256) 570368
_________________________________________________________________
lstm_19 (LSTM) (None, 54, 256) 525312
_________________________________________________________________
lstm_20 (LSTM) (None, 256) 525312
_________________________________________________________________
dense_7 (Dense) (None, 198) 50886
=================================================================
Total params: 24,263,978
Trainable params: 1,671,878
Non-trainable params: 22,592,100
_________________________________________________________________
None
Train on 177278 samples, validate on 19698 samples
Epoch 1/5
177278/177278 [==============================] - 70s 396us/step - loss: 3.3672 - acc: 8.7433e-04 - val_loss: 0.3103 - val_acc: 0.0000e+00
Epoch 00001: loss improved from inf to 3.36719, saving model to ./checkpoints/model_1/lstm-01-3.367-0.001-0.310-0.000.hdf5
Epoch 2/5
177278/177278 [==============================] - 66s 371us/step - loss: 0.1950 - acc: 2.4820e-04 - val_loss: 0.1616 - val_acc: 0.0016
Epoch 00002: loss improved from 3.36719 to 0.19496, saving model to ./checkpoints/model_1/lstm-02-0.195-0.000-0.162-0.002.hdf5
Epoch 3/5
177278/177278 [==============================] - 66s 370us/step - loss: 0.1583 - acc: 0.0011 - val_loss: 0.1570 - val_acc: 0.0016
Epoch 00003: loss improved from 0.19496 to 0.15826, saving model to ./checkpoints/model_1/lstm-03-0.158-0.001-0.157-0.002.hdf5
Epoch 4/5
177278/177278 [==============================] - 65s 369us/step - loss: 0.1566 - acc: 0.0011 - val_loss: 0.1573 - val_acc: 0.0016
Epoch 00004: loss improved from 0.15826 to 0.15660, saving model to ./checkpoints/model_1/lstm-04-0.157-0.001-0.157-0.002.hdf5
Epoch 5/5
177278/177278 [==============================] - 66s 374us/step - loss: 0.1561 - acc: 0.0011 - val_loss: 0.1607 - val_acc: 0.0016
Epoch 00005: loss improved from 0.15660 to 0.15610, saving model to ./checkpoints/model_1/lstm-05-0.156-0.001-0.161-0.002.hdf5
My training variable has shape (264, 120, 120, 3).
I am trying to pass a NumPy array of images as input:
model = Sequential()
model.add(Conv2D(8, (3, 3), activation='relu', strides=2,input_shape=(image_height,image_width,channels)))
model.add(Conv2D(16, (3, 3), activation='relu'))
model.summary()
model.compile(optimizer='rmsprop', loss='mse')
model.fit(x=X_train, y=y_train, batch_size=1, epochs=1, verbose=1)
Below is the error message:
________________________________________________________________
Layer (type) Output Shape Param
=================================================================
conv2d_36 (Conv2D) (None, 59, 59, 8) 224
_________________________________________________________________
conv2d_37 (Conv2D) (None, 57, 57, 16) 1168
=================================================================
Total params: 1,392
Trainable params: 1,392
Non-trainable params: 0
ValueError: Error when checking target: expected conv2d_37 to have shape (57, 57, 16) but got array with shape (120, 120, 3)
This error is caused by a shape mismatch between the model output, (57, 57, 16), and the target array passed as y_train, (120, 120, 3).
Please refer to the sample code below.
#Import Dependencies
import keras
from keras.models import Model, Sequential
from keras.layers import Conv2D, Flatten, Dense
# Model Building
model = Sequential()
model.add(Conv2D(8, (3, 3), activation='relu', strides=2, input_shape=(28,28,1)))
model.add(Conv2D(16, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['mse'])
# Generate dummy data
import numpy as np
data = np.random.random((100, 28, 28, 1))
labels = np.random.randint(2, size=(100, 10))
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=5, batch_size=32)
Output:
Epoch 1/5
100/100 [==============================] - 0s 1ms/step - loss: 1.2342 - mse: 0.4195
Epoch 2/5
100/100 [==============================] - 0s 234us/step - loss: 1.2183 - mse: 0.4167
Epoch 3/5
100/100 [==============================] - 0s 222us/step - loss: 1.2104 - mse: 0.4151
Epoch 4/5
100/100 [==============================] - 0s 255us/step - loss: 1.2019 - mse: 0.4131
Epoch 5/5
100/100 [==============================] - 0s 239us/step - loss: 1.1938 - mse: 0.4120
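The shapes in the error message from the question can be reproduced by hand. The sketch below assumes the default 'valid' (unpadded) convolution, which is what the question's Conv2D layers use since no padding argument is given; conv_output_size is a hypothetical helper, not a Keras function.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length along one axis for a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


# First Conv2D: 3x3 kernel, stride 2, on a 120x120 image.
after_first = conv_output_size(120, kernel=3, stride=2)

# Second Conv2D: 3x3 kernel, stride 1, 16 filters.
after_second = conv_output_size(after_first, kernel=3)

model_output_shape = (after_second, after_second, 16)
target_shape = (120, 120, 3)   # per-sample shape of the y_train that was passed in

print(after_first, model_output_shape, model_output_shape == target_shape)
```

Because the last layer's output is compared element by element with the targets, the two shapes have to agree; adding Flatten and Dense layers, as in the sample code, is one way to bring the output to the shape of the labels.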
How do you know when you've successfully frozen a layer in Keras? Below is a snippet of my model where I am trying to freeze the entire DenseNet121 layer; however, I'm unsure if that is actually occurring since the outputs to the console don't indicate what's happening.
I've tried two methods (1) densenet.trainable = False and (2) model.layers[0].trainable = False.
Furthermore, if I load the model again and add model.layers[0].trainable = True, will this unfreeze the layer?
densenet = DenseNet121(
weights='/{}'.format(WEIGHTS_FILE_NAME),
include_top=False,
input_shape=(IMG_SIZE, IMG_SIZE, 3)
)
model = Sequential()
model.add(densenet)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(NUM_CLASSES, activation='sigmoid'))
model.summary()
# This is how I freeze my layers, I decided to do it twice because I wasn't sure if it was working
densenet.trainable = False
model.layers[0].trainable = False
history = model.fit_generator(
datagen.flow(x_train, y_train, batch_size=BATCH_SIZE),
steps_per_epoch=len(x_train) / BATCH_SIZE,
epochs=NUM_EPOCHS,
validation_data=(x_test, y_test),
callbacks=callbacks_list,
max_queue_size=2
)
Below is the output of model.summary(), which I would expect to indicate if a layer has been successfully frozen or not.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
densenet121 (Model) (None, 8, 8, 1024) 7037504
_________________________________________________________________
global_average_pooling2d_3 ( (None, 1024) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 5) 5125
=================================================================
Total params: 7,042,629
Trainable params: 5,125
Non-trainable params: 7,037,504
_________________________________________________________________
Epoch 1/100
354/353 [==============================] - 203s 573ms/step - loss: 0.4374 - acc: 0.8098 - val_loss: 0.3785 - val_acc: 0.8290
val_kappa: 0.0440
Epoch 2/100
354/353 [==============================] - 199s 561ms/step - loss: 0.3738 - acc: 0.8457 - val_loss: 0.3575 - val_acc: 0.8310
val_kappa: 0.0463
Epoch 3/100
"however, I'm unsure if that is actually occurring since the outputs to the console don't indicate what's happening."
It does, as can be seen from the number of trainable parameters. As expected, only the parameters (5,125) of the last Dense layer are trainable.
Total params: 7,042,629
Trainable params: 5,125
Non-trainable params: 7,037,504
You can find out whether a layer is frozen by looking at its config:
>>> model.get_layer("dense_2").get_config()
{'name': 'dense_2',
'trainable': True,
...
If trainable is True, it is unfrozen.
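The trainable count in the summary can also be cross-checked by hand. This is a small sketch of the arithmetic, assuming a standard fully connected layer with one weight per input-unit pair plus one bias per unit; dense_param_count is a hypothetical helper, and the other numbers are copied from the summary in the question.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# The Dense layer receives 1024 pooled features and has 5 output units.
dense_params = dense_param_count(1024, 5)

densenet_params = 7037504                  # from the densenet121 row of the summary
total_params = densenet_params + dense_params

print(dense_params, total_params)
```

The Dense layer alone accounts for the whole "Trainable params" figure, so every DenseNet121 parameter is counted as non-trainable.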
I am new to Python, deep learning and Keras. I know many people have asked similar questions before, and I tried to read through them, but my issue is still not solved. Could someone please give me a hand?
I want to build a model with 6 inputs and 1 output. Below is my code. Any help or hint will be truly appreciated.
input and output shape:
print(x_train.shape, y_train.shape)
output:
(503, 6) (503, 1)
model codes:
inputList={}
lstmList={}
for i in range(x_train.shape[1]):
    inputList[varList[i]]=Input(shape=(x_train.shape[0], 1), name=varList[i])
    lstmList[varList[i]]=LSTM(64, activation='relu', return_sequences=None, dropout=0.2)(inputList[varList[i]])
z=concatenate([lstmList[i] for i in varList])
output=Dense(next_number_prediction, activation='softmax')(z)
model = Model(inputs=[inputList[i] for i in varList], outputs=[output])
model.compile(optimizer='rmsprop',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.summary()
the output is:
Layer (type) Output Shape Param # Connected to
==================================================================================================
open (InputLayer) (None, 503, 1) 0
__________________________________________________________________________________________________
high (InputLayer) (None, 503, 1) 0
__________________________________________________________________________________________________
low (InputLayer) (None, 503, 1) 0
__________________________________________________________________________________________________
close (InputLayer) (None, 503, 1) 0
__________________________________________________________________________________________________
change (InputLayer) (None, 503, 1) 0
__________________________________________________________________________________________________
pct (InputLayer) (None, 503, 1) 0
__________________________________________________________________________________________________
lstm_7 (LSTM) (None, 64) 16896 open[0][0]
__________________________________________________________________________________________________
lstm_8 (LSTM) (None, 64) 16896 high[0][0]
__________________________________________________________________________________________________
lstm_9 (LSTM) (None, 64) 16896 low[0][0]
__________________________________________________________________________________________________
lstm_10 (LSTM) (None, 64) 16896 close[0][0]
__________________________________________________________________________________________________
lstm_11 (LSTM) (None, 64) 16896 change[0][0]
__________________________________________________________________________________________________
lstm_12 (LSTM) (None, 64) 16896 pct[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 384) 0 lstm_7[0][0]
lstm_8[0][0]
lstm_9[0][0]
lstm_10[0][0]
lstm_11[0][0]
lstm_12[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 1) 385 concatenate_1[0][0]
==================================================================================================
Total params: 101,761
Trainable params: 101,761
Non-trainable params: 0
__________________________________________________________________________________________________
Data treatment and model.fit:
Data={}
for i in range(x_train.shape[1]):
    Data[varList[i]]=np.expand_dims(x_train[:, i], axis=0)
    Data[varList[i]]=np.reshape(Data[varList[i]], (1,x_train.shape[0],1))
model.fit(
[Data[i] for i in varList],
[y_train],
epochs=10)
and the error is
ValueError Traceback (most recent call last)
<ipython-input-21-392e0052f15a> in <module>()
1 model.fit(
2 [Data[i] for i in varList],
----> 3 [y_train],
epochs=10)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, max_queue_size, workers, use_multiprocessing, **kwargs)
1534 steps_name='steps_per_epoch',
1535 steps=steps_per_epoch,
-> 1536 validation_split=validation_split)
1537
1538 # Prepare validation data.
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split)
990 x, y, sample_weight = next_element
991 x, y, sample_weights = self._standardize_weights(x, y, sample_weight,
--> 992 class_weight, batch_size)
993 return x, y, sample_weights
994
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in _standardize_weights(self, x, y, sample_weight, class_weight, batch_size)
1167 # Check that all arrays have the same length.
1168 if not self._distribution_strategy:
-> 1169 training_utils.check_array_lengths(x, y, sample_weights)
1170 if self._is_graph_network and not context.executing_eagerly():
1171 # Additional checks to avoid users mistakenly using improper loss fns.
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_utils.py in check_array_lengths(inputs, targets, weights)
424 'the same number of samples as target arrays. '
425 'Found ' + str(list(set_x)[0]) + ' input samples '
--> 426 'and ' + str(list(set_y)[0]) + ' target samples.')
427 if len(set_w) > 1:
428 raise ValueError('All sample_weight arrays should have '
ValueError: Input arrays should have the same number of samples as target arrays. Found 1 input samples and 503 target samples.
The dimensions of the input and output that I feed in:
print (Data[varList[i]].shape)
print (np.array([Data[i] for i in varList]).shape)
print (y_train.shape)
output:
(1, 503, 1)
(6, 1, 503, 1)
(503, 1)
I also tried this new code:
input = Input(shape=(x_train.shape))
lstm = LSTM(64, activation='relu', return_sequences=True, dropout=0.2)(input)
output = Dense(1)(lstm)
model2 = Model(inputs=input, outputs=output)
model2.compile(optimizer='rmsprop',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model2.fit(x_train[np.newaxis,:,:], y_train[np.newaxis,:,:])
It gives an untrained model:
Epoch 1/10
1/1 [==============================] - 4s 4s/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 2/10
1/1 [==============================] - 0s 385ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 3/10
1/1 [==============================] - 0s 387ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 4/10
1/1 [==============================] - 0s 386ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 5/10
1/1 [==============================] - 0s 390ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 6/10
1/1 [==============================] - 0s 390ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 7/10
1/1 [==============================] - 0s 390ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 8/10
1/1 [==============================] - 0s 389ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 9/10
1/1 [==============================] - 0s 387ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
Epoch 10/10
1/1 [==============================] - 0s 391ms/step - loss: 0.0000e+00 - acc: 0.0000e+00
<tensorflow.python.keras.callbacks.History at 0x7f4c97583e80>
where the max and min of the data are:
print (max(y_train), x_train.max(axis=0))
print (min(y_train), x_train.min(axis=0))
output:
[0.79951533] [0.79930947 0.79750822 0.79934846 0.79951533 0.72939786 0.99697845]
[0.19443386] [1.94643871e-01 1.96481512e-01 1.94604099e-01 1.94433856e-01
2.52289062e-04 3.70721060e-01]
Your network expects only a single label for the whole sequence. If I adapt the code like this it runs:
model.fit(
[Data[i] for i in varList],
[y_train[0:1]],
epochs=10)
Of course you need to decide whether this reflects your intention or whether you need to restructure your network so that it accepts one label for every element in the sequence.
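The mismatch reported in the error can be seen from the array shapes alone. The sketch below uses random stand-in data with the shapes given in the question; only the shapes matter here, and the variable names other than x_train and y_train are illustrative.

```python
import numpy as np

# Stand-in data with the shapes from the question.
x_train = np.random.random((503, 6))
y_train = np.random.random((503, 1))

# Each feature column is reshaped into ONE sequence of 503 time steps,
# so the first axis (the sample axis) has length 1.
one_feature = x_train[:, 0].reshape(1, x_train.shape[0], 1)
n_input_samples = one_feature.shape[0]

# The targets still have 503 entries along the sample axis.
n_target_samples = y_train.shape[0]

# Slicing the targets down to a single row makes the sample counts agree.
single_label = y_train[0:1]

print(one_feature.shape, n_input_samples, n_target_samples, single_label.shape)
```

Keras compares the first axis of every input with the first axis of the targets, which is why it reports 1 input sample against 503 target samples.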
By the way: This is how I would have constructed the network. So if you are new to this, maybe this is the architecture that you actually want:
input = Input(shape=(x_train.shape))
lstm = LSTM(64, activation='relu', return_sequences=True, dropout=0.2)(input)
output = Dense(1)(lstm)
model2 = Model(inputs=input, outputs=output)
# y_train holds continuous values rather than class indices, so a regression loss is used
model2.compile(optimizer='rmsprop',
               loss='mse')
model2.fit(x_train[np.newaxis,:,:], y_train[np.newaxis,:,:])
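A likely explanation for the all-zero loss in the question's training log is that sparse_categorical_crossentropy treats the single output unit as a one-class problem. The NumPy sketch below re-implements the loss by hand for illustration (it is not the Keras function itself), and the logits, predictions and targets are made-up numbers. Mean squared error is shown for comparison as one possible regression loss, which is an assumption about the intended task.

```python
import numpy as np


def sparse_categorical_crossentropy(logits, labels):
    """Hand-written version: negative log-softmax of the true class, per sample."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]


# One output unit: softmax over a single value is always 1, whatever the value is.
logits = np.array([[-3.2], [0.0], [7.5]])
labels = np.zeros(3, dtype=int)            # class 0 is the only class available
losses = sparse_categorical_crossentropy(logits, labels)

# A regression loss does respond to the distance between prediction and target.
predictions = np.array([0.3, 0.5, 0.7])
targets = np.array([0.2, 0.5, 0.9])
mse = np.mean((predictions - targets) ** 2)

print(losses, mse)
```

With a loss that is zero for every possible output there is no gradient to follow, which is consistent with the model appearing untrained.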