I would like to predict multiple timesteps into the future. My current NN outputs a sparse classification of 0, 1 or 2.
The sparse classification is output via a softmax Dense layer with 3 neurons, corresponding to the three categories mentioned above.
How would I shape the output layer (softmaxed Dense) to give me the ability to predict two timesteps into the future, while keeping the number of sparse categorical classes at 3?
Right now, if I set that Dense layer to have 6 neurons (3 classes * 2 timesteps), I get an output that looks like a sparse categorical classification with 6 classes and 1 timestep.
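One common way to do this (a minimal sketch assuming a Keras Sequential setup; the LSTM and its input shape are placeholders, not the asker's actual network) is to emit 2 × 3 values, reshape them to (timesteps, classes), and apply the softmax over the class axis only:

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense, Reshape, Activation

model = Sequential()
model.add(LSTM(64, input_shape=(10, 8)))   # placeholder encoder: 10 past steps, 8 input features
model.add(Dense(2 * 3))                    # 2 future timesteps * 3 classes
model.add(Reshape((2, 3)))                 # -> (batch, 2 timesteps, 3 classes)
model.add(Activation('softmax'))           # softmax over the last axis (the 3 classes)

# Integer targets of shape (batch, 2): one label in {0, 1, 2} per future timestep.
# (Depending on the Keras version, targets may need to be shaped (batch, 2, 1).)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
```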
I want to train a Keras model on a dataset X with shape:
(N, 128), where N (in the millions) is the number of samples and 128 is the number of features (binary values, 0 or 1).
The output y is a normally distributed binary sequence of 32 bits (0 or 1).
What is the best model and/or output layer for this kind of prediction?
I tried MLP, CNN, and LSTM seq2seq models, but they learn my data very poorly (they always predict the same output, or the loss value doesn't change)...
I think part of my problem is the output layer (I chose
Dense(32, activation='sigmoid')),
and the loss function is the classical
'binary_crossentropy'.
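For reference, a minimal sketch of the setup described above (the hidden layer width is an assumption; only the 128 binary inputs, the 32 sigmoid outputs, and the loss come from the question):

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(128,)))  # hidden width is a guess
model.add(Dense(32, activation='sigmoid'))                    # one independent sigmoid per output bit

# binary_crossentropy treats each of the 32 bits as its own binary classification.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```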
I have a problem where I need to predict the number of clicks based on Date, CPC (cost per click), Market (US or UK), and keywords (average length 3-4 words, max 6). So the features that I decided to feed the model were:
Day (1-31), WeekDay (0-6), Month (1-12), US-Market (0 or 1), UK-Market (0 or 1), CPC (continuous numeric)
And the output is the continuous numeric Number of Clicks.
For the keywords I used the Keras tokenizer to convert them to sequences and then padded those sequences. Then I used the GloVe word embeddings, created an embedding matrix, and fed it to the neural network model as described here in the pretrained GloVe embeddings section.
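That pattern looks roughly like the sketch below; `keyword_texts` and `glove_index` are placeholder names for the raw keyword strings and the loaded GloVe vectors, not the asker's actual variables:

```python
import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

MAX_LEN = 6      # max keyword length mentioned above
EMBED_DIM = 100  # assumed GloVe vector size

tokenizer = Tokenizer()
tokenizer.fit_on_texts(keyword_texts)                    # keyword_texts: list of keyword strings
sequences = tokenizer.texts_to_sequences(keyword_texts)
padded_keywords = pad_sequences(sequences, maxlen=MAX_LEN)

# Embedding matrix filled from the GloVe vectors (glove_index: word -> vector).
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, EMBED_DIM))
for word, i in tokenizer.word_index.items():
    vector = glove_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector
```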
The model that I used is as follows. The last Dense layer has linear activation. The model has two inputs: nlp_input for the text data and meta_input for the numerical/categorical data. The two branches are concatenated after the Bidirectional LSTM that processes the nlp_input.
The loss is:
model.compile(loss="mean_absolute_percentage_error", optimizer=opt,metrics=['acc'])
where opt = Adam(lr=1e-3, decay=1e-3 / 200)
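Since the model code itself is not shown, here is a rough sketch of a two-input architecture matching that description (layer sizes and variable names are assumptions, reusing the names from the preprocessing sketch above; only the compile and optimizer lines are taken from the question):

```python
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense, concatenate
from keras.optimizers import Adam

# Text branch: padded keyword sequences through the (frozen) GloVe embedding.
nlp_input = Input(shape=(MAX_LEN,), name='nlp_input')
emb = Embedding(vocab_size, EMBED_DIM, weights=[embedding_matrix],
                trainable=False)(nlp_input)
nlp_out = Bidirectional(LSTM(32))(emb)

# Meta branch: the 6 numerical/categorical features listed above.
meta_input = Input(shape=(6,), name='meta_input')

# Concatenate the LSTM summary of the keywords with the meta features.
merged = concatenate([nlp_out, meta_input])
merged = Dense(32, activation='relu')(merged)
clicks = Dense(1, activation='linear')(merged)   # continuous click count

model = Model(inputs=[nlp_input, meta_input], outputs=clicks)
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="mean_absolute_percentage_error", optimizer=opt, metrics=['acc'])
```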
I trained the model for 100 epochs and the loss was close to 8000 at that point.
But when I apply the model to the test set, it predicts the same number for all test inputs, and that number is even negative (about -4.5e-5). Could someone guide me on how I should approach this problem and what improvements I could make to the model?
How is the Dense layer changing the output coming from the LSTM layer? How come that, from the 50-dimensional output of the previous layer, I get an output of size 1 from the Dense layer that is used for prediction?
Let's say I have this basic model:
from keras.models import Sequential
from keras.layers import LSTM, Dense
model = Sequential()
model.add(LSTM(50, input_shape=(60, 1)))
model.add(Dense(1, activation="softmax"))
Is the Dense layer taking the values coming from the previous layer, assigning a probability (using the softmax function) to each of the 50 inputs, and then producing that as the output?
No, Dense layers do not work like that: the input has 50 dimensions, and the output will have as many dimensions as the layer has neurons, one in this case. The output is a weighted linear combination of the inputs plus a bias.
Note that it makes no sense to use the softmax activation with a one-neuron layer: because the softmax is normalized, the only possible output is a constant 1.0. That's probably not what you want.
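A quick standalone illustration of both points (not the asker's trained model):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(50, input_shape=(60, 1)))
model.add(Dense(1, activation='softmax'))   # softmax over a single value always yields 1.0

x = np.random.rand(4, 60, 1)                # 4 dummy sequences of 60 steps
print(model.predict(x))                     # [[1.], [1.], [1.], [1.]]
# With activation='sigmoid' (or 'linear') the layer returns a genuine per-sample score:
# output = dot(lstm_output, W) + b, with W of shape (50, 1).
```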
I'm playing with Keras + TF. I have a model which is composed of
4 LSTM layers + 2 Dense layers.
I have 3 features, which are 3 sine sequences, and a target which is the product of the 3 sine sequences.
The LSTM layers are configured with 30 lookback time-steps.
I train the RNN on 80% of the data, and when I then ask it to predict
the learned data (that same 80% of the total), I obtain a very good prediction.
Next I proceed with the last 20% of the data, splitting it into 10 sub-parts and
looping over predict(part_x[0]), fit(part_x[0], part_y[0]), predict(part_x[1]), fit(part_x[1], part_y[1])... But the quality of the prediction drops dramatically.
Is it correct to expect that a predict(x[i]) / fit(x[i], y[i]) loop should produce a decent outcome for every x[i+1] block?
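For concreteness, the described rolling predict-then-fit loop would look roughly like this (`x_tail`, `y_tail`, and `model` are placeholders for the last 20% of the data and the already-trained network):

```python
import numpy as np

# x_tail, y_tail: the last 20% of the data, split into 10 consecutive chunks.
part_x = np.array_split(x_tail, 10)
part_y = np.array_split(y_tail, 10)

for i in range(len(part_x)):
    preds = model.predict(part_x[i])        # predict the next unseen block first
    # ...compare preds against part_y[i] here...
    model.fit(part_x[i], part_y[i],         # then fine-tune on that block
              epochs=1, batch_size=32, verbose=0)
```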
Yet another question: is it possible to train an RNN with 4 features and then run predictions with only 3 features? If yes, how can I "blind" the unavailable features during the prediction phase?
TIA
Roberto C.
Using RapidMiner I want to implement an LSTM to classify patterns in a time series. The input data is a flat table. My first layer in the Keras operator is a core Reshape, from exampleset_length x nr_of_attributes to batch x time-steps x features. In the reshape parameter I specifically enter three figures because I want a specific number of features and time-steps; the only way to achieve this is to also specify the batch size, so three figures in total. But when I add an RNN LSTM layer, an error is returned: Input is incompatible with layer lstm: expected ndim=n, found ndim=n+1. What's wrong?
When specifying 'input_shape' for the LSTM layer, you do not include the batch size.
So your 'input_shape' value should be (timesteps, input_dim).
Source: Keras RNN Layer, the parent layer for LSTM
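A minimal sketch of what that looks like in Keras (the layer sizes and the 30 x 4 shape are illustrative):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

TIMESTEPS, FEATURES = 30, 4   # illustrative values

model = Sequential()
# input_shape excludes the batch dimension; Keras infers the batch size at fit time.
model.add(LSTM(16, input_shape=(TIMESTEPS, FEATURES)))
model.add(Dense(1))

# The data itself is then reshaped to (batch, TIMESTEPS, FEATURES), e.g.:
# x_train = x_flat.reshape(-1, TIMESTEPS, FEATURES)
```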