Hi, I am new to Keras and I wanted to know: are ANNs good for polynomial regression tasks, or should we just use sklearn?
For example, I wrote this script:
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
x=np.arange(1, 100)
y=x**2
model = Sequential()
model.add(Dense(units=200, activation = 'relu',input_dim=1))
model.add(Dense(units=200, activation= 'relu'))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error',optimizer=keras.optimizers.SGD(learning_rate=0.001))
model.fit(x, y,epochs=2000)
But after testing it on some numbers I didn't get good results, for example:
model.predict([300])
array([[3360.9023]], dtype=float32)
Is there a problem in my code, or should I just not use ANNs for polynomial regression?
Thank you.
I'm not 100 percent sure, but I think the reason you are getting such bad predictions is that you did not scale your data. With raw targets as large as 9801, the squared-error gradients are huge and plain SGD has trouble converging, so scaling inputs and targets is close to a must for neural networks. Scale your data as shown below:
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
from sklearn.preprocessing import StandardScaler
# StandardScaler expects 2-D arrays of shape (n_samples, n_features)
x = np.arange(1, 100).reshape(-1, 1)
y = x**2
sc_x = StandardScaler()
x = sc_x.fit_transform(x)
sc_y = StandardScaler()
y = sc_y.fit_transform(y)
model = Sequential()
model.add(Dense(units=5, activation='relu', input_dim=1))
model.add(Dense(units=5, activation='relu'))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error', optimizer=keras.optimizers.SGD(learning_rate=0.001))
model.fit(x, y, epochs=75, batch_size=10)
# The new input must also be 2-D and scaled with the same scaler as the training data
prediction = sc_y.inverse_transform(model.predict(sc_x.transform([[300]])))
print(prediction)
Note that I changed the number of epochs from 2000 to 75. 2000 epochs is far more than a network this small should need, and it takes a long time to train. Your x dataset contains only 99 values, so the maximum number of epochs I would suggest is about 75.
Furthermore, I also changed the number of neurons in each hidden layer from 200 to 5. 200 neurons is far too many for a one-dimensional problem like this, let alone for a small dataset of 99 points.
These changes should help your neural network produce more accurate predictions.
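One caveat, as a side note: 300 lies far outside the training range of 1 to 99, and a ReLU network is piecewise linear, so it will tend to extrapolate along a straight line rather than a parabola even with proper scaling. When the target really is a polynomial, an explicit polynomial fit extrapolates much better. A minimal sketch using numpy's `polyfit` (my substitution for illustration, not part of the original script):

```python
import numpy as np

x = np.arange(1, 100)
y = x**2

# Standardize 300 with the training statistics to see how far out of range it is.
# The training inputs themselves only span roughly -1.7 to 1.7 after scaling.
z_300 = (300 - x.mean()) / x.std()
print(z_300)

# Least-squares fit of a degree-2 polynomial, then evaluate it at 300
coeffs = np.polyfit(x, y, deg=2)
pred_300 = np.polyval(coeffs, 300)
print(pred_300)  # close to 90000
```

sklearn's PolynomialFeatures combined with LinearRegression would be the equivalent on the sklearn side.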
Hope that helped.
I'm trying to develop a multitask deep neural network (MTDNN) to predict small-molecule bioactivity against kinase targets. Something is definitely wrong with my model structure, but I can't figure out what.
For my training data (highly imbalanced, with 0 as inactive and 1 as active), I have 423 unique kinase targets (tasks) and over 400k unique compounds. I first calculate the ECFP fingerprint from SMILES, and then randomly split the input data into train, test, and valid sets in an 8:1:1 ratio using RandomStratifiedSplitter from the deepchem package. After training my model on the train set, I want to make predictions on the test set to check model performance.
Here's what my data looks like (screenshot example):
(https://i.stack.imgur.com/8Hp36.png)
Here's my code:
# Import Packages
import numpy as np
import pandas as pd
import deepchem as dc
from sklearn.metrics import roc_auc_score, roc_curve, auc, confusion_matrix
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import initializers, regularizers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Input, Dropout, Reshape
from tensorflow.keras.optimizers import SGD
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors
# Build Model
inputs = keras.Input(shape = (1024, ))
x = keras.layers.Dense(2000, activation='relu', name="dense2000",
                       kernel_initializer=initializers.RandomNormal(stddev=0.02),
                       bias_initializer=initializers.Ones(),
                       kernel_regularizer=regularizers.L2(l2=.0001))(inputs)
x = keras.layers.Dropout(rate=0.25)(x)
x = keras.layers.Dense(500, activation='relu', name='dense500')(x)
x = keras.layers.Dropout(rate=0.25)(x)
x = keras.layers.Dense(846, activation='relu', name='output1')(x)
logits = Reshape([423, 2])(x)
outputs = keras.layers.Softmax(axis=2)(logits)
Model1 = keras.Model(inputs=inputs, outputs=outputs, name='MTDNN')
Model1.summary()
opt = keras.optimizers.SGD(learning_rate=.0003, momentum=0.9)
def loss_function (output, labels):
    loss = tf.nn.softmax_cross_entropy_with_logits(output,labels)
    return loss
loss_fn = loss_function
Model1.compile(loss=loss_fn, optimizer=opt,
               metrics=[keras.metrics.Accuracy(),
                        keras.metrics.AUC(),
                        keras.metrics.Precision(),
                        keras.metrics.Recall()])
for train, test, valid in split2:
    trainX = pd.DataFrame(train.X)
    trainy = pd.DataFrame(train.y)
    trainy2 = tf.one_hot(trainy,2)
    testX = pd.DataFrame(test.X)
    testy = pd.DataFrame(test.y)
    testy2 = tf.one_hot(testy,2)
    validX = pd.DataFrame(valid.X)
    validy = pd.DataFrame(valid.y)
    validy2 = tf.one_hot(validy,2)
history = Model1.fit(x=trainX, y=trainy2,
                     shuffle=True,
                     epochs=10,
                     verbose=1,
                     batch_size=100,
                     validation_data=(validX, validy2))
y_pred = Model1.predict(testX)
y_pred2 = y_pred[:, :, 1]
y_pred3 = np.round(y_pred2)
# Check the # of nonzero in assay
(y_pred3!=0).sum () #all 0s
My questions are:
1. The ROC AUC, precision and recall are all extremely high (>0.99), but the predictions on the test set are all 0s, with no actives at all. I also used a randomized dataset with the same active:inactive ratio for each task to test whether those values are too good to be true, and all values are still above 0.99, including ROC AUC, which would be expected to be 0.5. Can anyone help me identify what is wrong with my model and how I should fix it?
2. Can I use built-in functions in sklearn to calculate ROC/accuracy/precision-recall? Or should I calculate the metrics manually from the confusion matrix for multitask purposes? Why or why not?
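On the metrics question, one thing can be checked independently of the model: with heavily imbalanced labels, a predictor that outputs all zeros already scores very high accuracy, so per-task recall and precision are more telling than a single aggregate number. A rough numpy sketch on synthetic labels (the 1% active rate, the array sizes and the helper `per_task_metrics` are all made up for illustration, not taken from the dataset above):

```python
import numpy as np

rng = np.random.default_rng(0)
n_compounds, n_tasks = 1000, 5

# Synthetic imbalanced labels: roughly 1% actives per task
y_true = (rng.random((n_compounds, n_tasks)) < 0.01).astype(int)
y_true[0, :] = 1  # make sure every task has at least one active
y_pred = np.zeros_like(y_true)  # a model that predicts "inactive" everywhere

def per_task_metrics(y_true, y_pred):
    """Accuracy, precision and recall computed separately for each task (column)."""
    tp = ((y_true == 1) & (y_pred == 1)).sum(axis=0)
    fp = ((y_true == 0) & (y_pred == 1)).sum(axis=0)
    fn = ((y_true == 1) & (y_pred == 0)).sum(axis=0)
    tn = ((y_true == 0) & (y_pred == 0)).sum(axis=0)
    accuracy = (tp + tn) / y_true.shape[0]
    # Report 0 instead of dividing by zero when there are no positives to count
    precision = np.divide(tp, tp + fp, out=np.zeros(len(tp)), where=(tp + fp) > 0)
    recall = np.divide(tp, tp + fn, out=np.zeros(len(tp)), where=(tp + fn) > 0)
    return accuracy, precision, recall

accuracy, precision, recall = per_task_metrics(y_true, y_pred)
print("accuracy :", accuracy)   # high despite the useless predictions
print("precision:", precision)
print("recall   :", recall)     # zero for every task
```

sklearn's `roc_auc_score` could be applied column by column in the same way, using the predicted probabilities rather than the rounded predictions.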
I am using the Keras functional API to build a classifier, and I am using the training flag in the dropout layer to enable dropout when predicting new instances (in order to get an estimate of the uncertainty). To get the expected response one needs to repeat this prediction several times, with Keras randomly activating links in the dense layer each time, which is of course computationally expensive. Therefore, I would also like to have the option not to use dropout at the prediction phase, i.e., to use all the network links. Does anyone know how I can do this? Below is sample code of what I am doing. I checked whether predict has a relevant parameter, but it does not seem to (?). I could technically train the same model without the training flag in the dropout layer, but I would prefer a cleaner solution than keeping two different models.
from sklearn.datasets import make_circles
from keras.models import Sequential
from keras.utils import to_categorical
from keras.layers import Dense
from keras.layers import Dropout
import numpy as np
import keras
# generate a 2d classification sample dataset
X, y = make_circles(n_samples=100, noise=0.1, random_state=1)
n_train = 30
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
trainy = to_categorical(trainy)
testy = to_categorical(testy)
inputlayer = keras.layers.Input((2,))
d = keras.layers.Dense(500, activation = 'relu')(inputlayer)
d1 = keras.layers.Dropout(rate = .3)(d,training = True)
out = keras.layers.Dense(2, activation = 'softmax')(d1)
model = keras.Model(inputs = inputlayer, outputs = out)
model.compile(loss = 'categorical_crossentropy',metrics = ['accuracy'],optimizer='adam')
model.fit(x = trainX, y = trainy, validation_data=(testX, testy),epochs=1000, verbose=1)
# another prediction on a specific sample
print(model.predict(testX[0:1,:]))
# another prediction on the same sample
print(model.predict(testX[0:1,:]))
Running the above example I get the following output:
[[0.9230819 0.07691813]]
[[0.8222245 0.17777553]]
which is as expected, different class probabilities for the same input, since there is a random (de)activation of the links from the dropout layer.
Any suggestions on how I can enable/disable dropout at the prediction phase with the functional API?
Sure, you do not need to set the training flag when building the Dropout layer. After training your model, define this function (K is the Keras backend):
from keras import backend as K
mc_func = K.function([model.input, K.learning_phase()],
                     [model.output])
Then you call mc_func with your input and flag 1 to enable dropout, or 0 to disable it:
stochastic_pred = mc_func([some_input, 1])
deterministic_pred = mc_func([some_input, 0])
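Note that `K.learning_phase()` comes from older Keras versions and may not be available in recent releases; there, calling the model directly as `model(x, training=True)` or `model(x, training=False)` plays the same role, as far as I know. To make clear what the flag toggles, here is a framework-free numpy sketch of the idea (random weights and a hypothetical `forward` helper, not the trained model above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random weights standing in for a trained 2 -> 16 -> 2 network
W1 = rng.normal(size=(2, 16))
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 2))
b2 = np.zeros(2)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, training, rate=0.3):
    """One forward pass; dropout is applied only when training is True."""
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    if training:
        # Inverted dropout: zero units with probability `rate`, rescale the rest
        mask = rng.random(h.shape) >= rate
        h = h * mask / (1.0 - rate)
    return softmax(h @ W2 + b2)

x = np.array([[0.5, -0.2]])

# Dropout off: a single deterministic prediction
deterministic = forward(x, training=False)

# Dropout on: repeat the pass and summarize the spread (Monte Carlo dropout)
samples = np.stack([forward(x, training=True) for _ in range(100)])
mc_mean = samples.mean(axis=0)
mc_std = samples.std(axis=0)

print("deterministic:", deterministic)
print("MC mean      :", mc_mean)
print("MC std       :", mc_std)
```

The same model serves both purposes; only the flag passed at call time changes.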
I am trying to train a simple LSTM to predict the next number in a sequence (1,2,3,4,5 --> 6).
from keras.models import Sequential
from keras.layers import LSTM, Dense
from sklearn.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt
xs = [[[(j+i)/100] for j in range(5)] for i in range(100)]
ys = [(i+5)/100 for i in range(100)]
x_train, x_test, y_train, y_test = train_test_split(xs, ys)
model = Sequential()
model.add(LSTM(1, input_shape=(5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
training = model.fit(x_train, y_train, epochs=200)
new_xs = np.array(xs)*5
new_ys = np.array(ys)*5
pred = model.predict(new_xs)
plt.scatter(range(len(pred)), pred, c='r')
plt.scatter(range(len(new_ys)), new_ys, c='b')
In order for the net to learn anything, I had to normalize the training data (I divided it by 100). It did indeed work for data from the range it was trained on.
I want it to be able to predict numbers from outside the range it was trained on, but as soon as the input leaves that range, the prediction starts to diverge.
When I increased the number of units in both LSTM layers to 30, it looked a little better, but it still diverges.
Is LSTM capable of learning that task without adding an infinite number of units?
I was looking for an answer to the same question, and I came across the following paper from 2019 and the corresponding Git repo. In particular, see section 5.3 in the paper. It seems like ABBA-LSTM is the solution, though it depends on the time series problem you're trying to solve.
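One structural reason for the divergence, as far as I can tell: an LSTM's hidden state is o * tanh(c), so each unit stays within [-1, 1], and a final Dense(1) on top of a single unit can therefore only produce outputs in a bounded interval. The numpy sketch below uses random weights (not the trained model from the question) and the standard LSTM cell equations to show the hidden state staying bounded however large the inputs get:

```python
import numpy as np

rng = np.random.default_rng(0)
units = 1

# Random weights standing in for a trained layer; columns cover gates i, f, g, o
W = rng.normal(size=(1, 4 * units))      # input weights
U = rng.normal(size=(units, 4 * units))  # recurrent weights
b = np.zeros(4 * units)

def sigmoid(z):
    # tanh form of the logistic function, which avoids overflow for large |z|
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def lstm_last_hidden(seq):
    """Run a one-feature sequence through a single LSTM cell, return the final h."""
    h = np.zeros(units)
    c = np.zeros(units)
    for x_t in seq:
        z = np.array([x_t]) @ W + h @ U + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

# Same 5-step ramp as in the question, multiplied by ever larger factors
results = {}
for scale in (1, 5, 100, 10000):
    seq = [scale * (k / 100) for k in range(5)]
    results[scale] = lstm_last_hidden(seq)
    print(scale, results[scale])
```

Since the target keeps growing linearly while h cannot, more units only widen the interval that can be covered; they do not remove the bound.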
I'm new to Keras and have been experimenting with various things such as BatchNormalization, but it is not working at all. When the BatchNormalization line is commented out, the model converges to a loss of around 0.04 or better, but with it in place it converges to 0.71 and gets stuck around there. I'm not sure what's wrong.
from sklearn import preprocessing
from sklearn.datasets import load_boston
from keras.models import Model
from keras.layers import Input, Dense
from keras.layers.normalization import BatchNormalization
import keras.optimizers
boston = load_boston()
x = boston.data
y = boston.target
normx = preprocessing.scale(x)
normy = preprocessing.scale(y)
# doesnt construct output layer
def layer_looper(inputs, number_of_loops, neurons):
    inputs_copy = inputs
    for i in range(number_of_loops):
        inputs_copy = Dense(neurons, activation='relu')(inputs_copy)
        inputs_copy = BatchNormalization()(inputs_copy)
    return inputs_copy
inputs = Input(shape = (13,))
x = layer_looper(inputs, 40, 20)
predictions = Dense(1, activation='linear')(x)
model = Model(inputs=inputs, outputs=predictions)
opti = keras.optimizers.Adam(lr=0.0001)
model.compile(loss='mean_absolute_error', optimizer=opti, metrics=['acc'])
print(model.summary())
model.fit(normx, normy, epochs=5000, verbose=2, batch_size=128)
I have tried experimenting with batch sizes and the optimizer but it doesn't seem very effective. Am I doing something wrong?
I've increased the learning rate to 0.01, and the network seems able to learn something (I get Epoch 1000/5000 - 0s - loss: 0.2330).
I think it's worth noting the following from the abstract of the original Batch Normalization paper:
Batch Normalization allows us to use much higher learning rates and
be less careful about initialization. It also acts as a regularizer (...)
That hinted at a higher learning rate (something you might want to experiment with).
Be aware that, since it works like regularization, BatchNorm can make your training loss worse: it is meant to reduce overfitting and thus close the gap between the train and test/validation errors.
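To illustrate why a higher learning rate is tolerable, here is a minimal numpy sketch of the training-time batch normalization transform (the `batch_norm` helper is hypothetical and leaves out the running averages that Keras keeps for inference). It shows that the output is essentially unchanged when the incoming activations are rescaled, which is part of why the layers that follow are less sensitive to the size of the weights before it:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
# A batch of 128 samples with 20 features, deliberately off-center and wide
activations = rng.normal(loc=3.0, scale=2.0, size=(128, 20))

out = batch_norm(activations)
out_scaled = batch_norm(10.0 * activations)

# Per-feature mean is about 0 and standard deviation about 1
print(out.mean(axis=0)[:3], out.std(axis=0)[:3])
# Multiplying the inputs by 10 barely changes the normalized output
print(np.abs(out - out_scaled).max())
```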
I am trying to build a deep learning network for binary classification using an LSTM-based RNN.
Here is what I have tried in Python:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
import numpy as np
train = np.loadtxt("TrainDatasetFinal.txt", delimiter=",")
test = np.loadtxt("testDatasetFinal.txt", delimiter=",")
y_train = train[:,7]
y_test = test[:,7]
train_spec = train[:,6]
test_spec = test[:,6]
model = Sequential()
model.add(Embedding(8, 256, input_length=1))
model.add(LSTM(output_dim=128, activation='sigmoid',
               inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
model.fit(train_spec, y_train, batch_size=2000, nb_epoch=11)
score = model.evaluate(test_spec, y_test, batch_size=2000)
Here is a sample from the dataset
(Patient Number, time in millisecond, accelerometer x-axis,y-axis,
z-axis,magnitude, spectrogram,label (0 or 1))
1,15,70,39,-970,947321,596768455815000,0
1,31,70,39,-970,947321,612882670787000,0
1,46,60,49,-960,927601,602179976392000,0
1,62,60,49,-960,927601,808020878060000,0
1,78,50,39,-960,925621,726154800929000,0
I believe my problem is in these lines, but I cannot spot the error:
model.add(Embedding(8, 256, input_length=1))
model.add(LSTM(output_dim=128, activation='sigmoid',
               inner_activation='hard_sigmoid'))
and this is the error I have got
InvalidArgumentError (see above for traceback): indices[0,0] = -2147483648 is not in [0, 8)
Is the sample from your dataset provided above the data you are trying to feed into the model? If so, there is a problem: your data is 2-dimensional, but an RNN needs a 3-dimensional input tensor of shape (batch size, time steps, features). It looks like you are missing a proper time dimension. The column with 15, 31, 46, ... (time in milliseconds) should not be a feature; consecutive readings should instead be stacked along their own axis, so your input data looks like a "cube". Otherwise, you don't need a temporal model at all. Furthermore, you should standardize your input, since your features have vastly different orders of magnitude. Finally, a batch size of 2000 is probably too large. Are you trying to express that your whole training set has 2000 samples? In that case, you may not have enough training data for the model you are building.
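As a rough illustration of the reshaping and standardization described above, the numpy sketch below turns the five sample rows from the question into overlapping windows. The window length of 3 and the choice of the four accelerometer columns as features are assumptions made for the example. Separately, the error message itself appears to come from the Embedding layer: Embedding(8, 256) looks up integer indices in [0, 8), and the raw spectrogram values are nowhere near that range.

```python
import numpy as np

# The five sample rows from the question:
# patient, time_ms, x, y, z, magnitude, spectrogram, label
rows = np.array([
    [1, 15, 70, 39, -970, 947321, 596768455815000, 0],
    [1, 31, 70, 39, -970, 947321, 612882670787000, 0],
    [1, 46, 60, 49, -960, 927601, 602179976392000, 0],
    [1, 62, 60, 49, -960, 927601, 808020878060000, 0],
    [1, 78, 50, 39, -960, 925621, 726154800929000, 0],
], dtype=np.float64)

features = rows[:, 2:6]  # x, y, z, magnitude (assumed feature set)
labels = rows[:, 7]

# Standardize each feature column to zero mean and unit variance
mean = features.mean(axis=0)
std = features.std(axis=0)
std[std == 0] = 1.0  # avoid dividing by zero for constant columns
features = (features - mean) / std

# Stack consecutive readings into windows: (samples, timesteps, features)
timesteps = 3  # assumed window length
X = np.stack([features[i:i + timesteps]
              for i in range(len(features) - timesteps + 1)])
y = labels[timesteps - 1:]  # label of the last reading in each window

print(X.shape)  # (3, 3, 4)
print(y.shape)  # (3,)
```

With real data, the mean and standard deviation should come from the training set only, and windows should not span two different patients.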