Value Error problem from using kernel regularizer - python-3.x

I get a ValueError when fitting a TensorFlow model. The error message points to a conflict between the kernel regularizer applied to the Conv2D layer and the mean squared error loss. I am using the L1 regularizer from the TensorFlow Keras package. I have tried different values for the L1 regularization factor, including 0, but I get the same error.
Context: I am creating a model that predicts phenotype traits from genotype and phenotype datasets. The genotype input data has 4276 samples, and the model's input shape is (28220,1). The labels are the phenotype data: 4276 samples with 20 phenotype traits each. The model adds differential privacy (DP) to a CNN that uses the mean squared error (MSE) loss function, with DPKerasAdamOptimizer providing the DP. I'm also wondering whether MSE is a good choice of loss function here.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
!pip install tensorflow-privacy
import numpy as np
import tensorflow as tf
from tensorflow_privacy import *
import tensorflow_privacy
from matplotlib import pyplot as plt
import pylab as pl
import numpy as np
import pandas as pd
from tensorflow.keras.models import Model
from tensorflow.keras import datasets, layers, models, losses
from tensorflow.keras import backend as bke
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l1, l2, l1_l2 #meaning of norm
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
batch_size = 32
epochs = 4
microbatches = 8
inChannel = 1
kr = 0#1e-5
num_kernels=8
drop_perc=0.25
dim = 1
l2_norm_clip = 1.5
noise_multiplier = 1.3
learning_rate = 0.25
latent_dim = 0
def print_datashape():
    print('genotype data: ', genotype_data.shape)
    print('phenotype data: ', single_pheno.shape)
genotype_data = tf.random.uniform([4276, 28220],1,3, dtype=tf.dtypes.int32)
phenotype_data = tf.random.uniform([4276, 20],-4.359688,34,dtype=tf.dtypes.float32)
genotype_data = genotype_data.numpy()
phenotype_data = phenotype_data.numpy()
small_geno = genotype_data
single_pheno = phenotype_data[:, 1]
print_datashape()
df = small_geno
min_max_scaler = preprocessing.MinMaxScaler()
df = min_max_scaler.fit_transform(df)
scaled_pheno = min_max_scaler.fit_transform(single_pheno.reshape(-1,1)).reshape(-1)
feature_size= df.shape[1]
df = df.reshape(-1, feature_size, 1, 1)
print("df: ", df.shape)
print("scaled: ", scaled_pheno.shape)
# split train to train and valid
train_data,test_data,train_Y,test_Y = train_test_split(df, scaled_pheno, test_size=0.2, random_state=13)
train_X,valid_X,train_Y,valid_Y = train_test_split(train_data, train_Y, test_size=0.2, random_state=13)
def print_shapes():
    print('train_X: {}'.format(train_X.shape))
    print('train_Y: {}'.format(train_Y.shape))
    print('valid_X: {}'.format(valid_X.shape))
    print('valid_Y: {}'.format(valid_Y.shape))
input_shape= (feature_size, dim, inChannel)
predictor = tf.keras.Sequential()
predictor.add(layers.Conv2D(num_kernels, (5,1), padding='same', strides=(12, 1), activation='relu', kernel_regularizer=tf.keras.regularizers.L1(kr),input_shape= input_shape))
predictor.add(layers.AveragePooling2D(pool_size=(2,1)))
predictor.add(layers.Dropout(drop_perc))
predictor.add(layers.Flatten())
predictor.add(layers.Dense(int(feature_size / 4), activation='relu'))
predictor.add(layers.Dropout(drop_perc))
predictor.add(layers.Dense(int(feature_size / 10), activation='relu'))
predictor.add(layers.Dropout(drop_perc))
predictor.add(layers.Dense(1))
mse = tf.keras.losses.MeanSquaredError(reduction=tf.keras.losses.Reduction.NONE)
optimizer = DPKerasAdamOptimizer(learning_rate=learning_rate, l2_norm_clip=l2_norm_clip, noise_multiplier=noise_multiplier, num_microbatches=microbatches)
# compile
predictor.compile(loss=mse, optimizer=optimizer, metrics=['mse'])
#summary
predictor.summary()
print_shapes()
predictor.fit(train_X, train_Y,batch_size=batch_size,epochs=epochs,verbose=1, validation_data=(valid_X, valid_Y))
ValueError: Shapes must be equal rank, but are 1 and 0
From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](mean_squared_error/weighted_loss/Mul, conv2d_2/kernel/Regularizer/mul)' with input shapes: [?], [].
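One reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the loss built with `Reduction.NONE` is a per-example vector (shape `[?]`), the kernel regularizer contributes a single scalar (shape `[]`), and the `AddN` node needs all of its inputs to have the same shape. That would also fit the observation that a factor of 0 changes nothing, since the scalar term is still created. The NumPy snippet below only mimics that shape rule; `strict_add_n`, the toy values, and the kernel are hypothetical and are not TensorFlow code.

```python
import numpy as np

def strict_add_n(tensors):
    """Hypothetical stand-in for an AddN-style op: every input must share one shape."""
    arrays = [np.asarray(t, dtype=np.float32) for t in tensors]
    shapes = {a.shape for a in arrays}
    if len(shapes) != 1:
        raise ValueError(f"Shapes must match, got {sorted(shapes)}")
    return np.sum(arrays, axis=0)

# Per-example squared errors, as a loss with no reduction would return them (rank 1).
y_true = np.array([0.2, 0.5, 0.9, 0.1], dtype=np.float32)
y_pred = np.array([0.3, 0.4, 0.7, 0.1], dtype=np.float32)
per_example_loss = (y_true - y_pred) ** 2

# L1 penalty on a made-up kernel: factor * sum(|w|), a single number (rank 0).
kernel = np.array([[0.5, -0.25], [0.1, 0.0]], dtype=np.float32)
l1_factor = 0.0  # plays the role of `kr`; the term exists even when it is zero
penalty = l1_factor * np.sum(np.abs(kernel))

# Mixing the rank-1 loss with the rank-0 penalty violates the shape rule.
try:
    strict_add_n([per_example_loss, penalty])
    mismatch = False
except ValueError:
    mismatch = True

# Giving the penalty the same shape as the loss vector satisfies the rule.
penalty_vector = np.full_like(per_example_loss, penalty)
total = strict_add_n([per_example_loss, penalty_vector])
print(mismatch, total.shape)
```

Whether spreading the penalty across examples like this is appropriate under DP training is a separate question; dropping `kernel_regularizer` from the layer is the simpler way to avoid the mixed-rank sum.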

Related

Questions about Multitask deep neural network modeling using Keras

I'm trying to develop a multitask deep neural network (MTDNN) to predict small-molecule bioactivity against kinase targets. Something is definitely wrong with my model structure, but I can't figure out what.
My training data is highly imbalanced (0 is inactive, 1 is active), with 423 unique kinase targets (tasks) and over 400k unique compounds. I first calculate the ECFP fingerprint from SMILES, and then randomly split the input data into train, test, and valid sets in an 8:1:1 ratio using RandomStratifiedSplitter from the deepchem package. After training the model on the train set, I want to make predictions on the test set to check model performance.
Here's what my data looks like (screenshot example):
(https://i.stack.imgur.com/8Hp36.png)
Here's my code:
# Import Packages
import numpy as np
import pandas as pd
import deepchem as dc
from sklearn.metrics import roc_auc_score, roc_curve, auc, confusion_matrix
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import initializers, regularizers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Input, Dropout, Reshape
from tensorflow.keras.optimizers import SGD
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors
# Build Model
inputs = keras.Input(shape = (1024, ))
x = keras.layers.Dense(2000, activation='relu', name="dense2000",
                       kernel_initializer=initializers.RandomNormal(stddev=0.02),
                       bias_initializer=initializers.Ones(),
                       kernel_regularizer=regularizers.L2(l2=.0001))(inputs)
x = keras.layers.Dropout(rate=0.25)(x)
x = keras.layers.Dense(500, activation='relu', name='dense500')(x)
x = keras.layers.Dropout(rate=0.25)(x)
x = keras.layers.Dense(846, activation='relu', name='output1')(x)
logits = Reshape([423, 2])(x)
outputs = keras.layers.Softmax(axis=2)(logits)
Model1 = keras.Model(inputs=inputs, outputs=outputs, name='MTDNN')
Model1.summary()
opt = keras.optimizers.SGD(learning_rate=.0003, momentum=0.9)
def loss_function(output, labels):
    loss = tf.nn.softmax_cross_entropy_with_logits(output, labels)
    return loss
loss_fn = loss_function
Model1.compile(loss=loss_fn, optimizer=opt,
               metrics=[keras.metrics.Accuracy(),
                        keras.metrics.AUC(),
                        keras.metrics.Precision(),
                        keras.metrics.Recall()])
for train, test, valid in split2:
    trainX = pd.DataFrame(train.X)
    trainy = pd.DataFrame(train.y)
    trainy2 = tf.one_hot(trainy, 2)
    testX = pd.DataFrame(test.X)
    testy = pd.DataFrame(test.y)
    testy2 = tf.one_hot(testy, 2)
    validX = pd.DataFrame(valid.X)
    validy = pd.DataFrame(valid.y)
    validy2 = tf.one_hot(validy, 2)
history = Model1.fit(x=trainX, y=trainy2,
                     shuffle=True,
                     epochs=10,
                     verbose=1,
                     batch_size=100,
                     validation_data=(validX, validy2))
y_pred = Model1.predict(testX)
y_pred2 = y_pred[:, :, 1]
y_pred3 = np.round(y_pred2)
# Check the number of nonzero predictions
(y_pred3 != 0).sum()  # all 0s
My questions are:
The ROC and precision/recall values are all extremely high (>0.99), but the predictions on the test set are all 0s, with no actives at all. I also tested a randomized dataset with the same active:inactive ratio for each task to check whether those values are too good to be true, and all of them are still above 0.99, including the ROC value, which is expected to be 0.5.
Can anyone help me identify what is wrong with my model and how I should fix it?
Can I use the built-in functions in sklearn to calculate ROC/accuracy/precision-recall, or should I compute the metrics manually from the confusion matrix for each task? Why or why not?
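Because the data is described as highly imbalanced and the test predictions are all 0, one thing worth checking is how each metric behaves for a classifier that never predicts an active. The sketch below uses made-up labels for a single task (the 10:990 ratio is an assumption, not taken from the dataset) and computes the metrics by hand; `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall for 0/1 labels; undefined ratios are None."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }

# Made-up single task: 10 actives, 990 inactives, and a model that predicts all 0s.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy is high here even though no active is ever found, so recall on the active class, computed separately for each task, says more about this kind of model than accuracy does.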

Error occurs when executing the LSTM model with three classes

import pandas as pd
import matplotlib.pyplot as plt
import re
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM,Dense, Dropout, SpatialDropout1D
from tensorflow.keras.layers import Embedding
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
data = pd.read_csv("./emails.csv")
print(data.head())
data = data[['text','email_sentiment']]
data['text'] = data['text'].apply(lambda x: x.lower())
data['text'] = data['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))
print(data.head())
max_fatures = 50000
max_seq_length = 250
tokenizer = Tokenizer(num_words=max_fatures,filters='!"#$%&()*+,-./:;<=>?#[\]^_`{|}~', lower=True)
tokenizer.fit_on_texts(data['text'].values)
word_index = tokenizer.word_index
X = tokenizer.texts_to_sequences(data['text'].values)
X = pad_sequences(X,maxlen=max_seq_length)
Y= pd.get_dummies(data['email_sentiment']).values
X_train,Y_train = train_test_split(X,Y, test_size = 0.10, random_state = 42)
embedding_vector_length = 100
lstm_out= 196
model = Sequential()
model.add(Embedding(max_fatures, embedding_vector_length, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(3,activation='softmax'))
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
print(model.summary())
Error occurred:
X_train,Y_train = train_test_split(X,Y, test_size = 0.10, random_state = 42)
ValueError: too many values to unpack (expected 2)
I am unable to train on X_train because of this ValueError. X_train consists of emails that are categorized as positive, negative, or neutral sentiment, which is why the LSTM model has 3 output classes.
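The traceback is consistent with how `train_test_split` returns its results: for each array passed in, it returns a train part and a test part, so two inputs yield four outputs. The sketch below reproduces the unpacking error with a hypothetical stand-in, `split_each`, so that it runs without scikit-learn. It is not the library's implementation (there is no shuffling, for instance), and the toy `X` and `Y` are made up.

```python
def split_each(*arrays, test_size=0.10):
    """Hypothetical stand-in: return a train part and a test part for every input, in order."""
    n_test = max(1, int(round(len(arrays[0]) * test_size)))
    parts = []
    for arr in arrays:
        parts.append(arr[:-n_test])  # train part
        parts.append(arr[-n_test:])  # test part
    return parts

X = list(range(20))
Y = [i % 3 for i in X]

# Two names on the left, four values on the right: this raises ValueError.
try:
    X_train, Y_train = split_each(X, Y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# One name per returned part: train/test for X, then train/test for Y.
X_train, X_test, Y_train, Y_test = split_each(X, Y)
print(error_message, len(X_train), len(X_test))
```

With scikit-learn itself, the corresponding call would be `X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.10, random_state=42)`.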

TensorFlow custom loss ValueError: No gradients provided for any variable:

I am implementing a custom loss function as in the code below for a simple classification. However, when I run the code I get the error ValueError: No gradients provided for any variable:
import os
os.environ['KERAS_BACKEND'] = "tensorflow"
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
import statistics as st
import tensorflow as tf
from keras.utils import np_utils
# if the probability is greater than 0.75 then set the value to 1 for buy or sell else set it to None
# convert the y_pred to 0 and 1 using argmax function
# add the two matrices y_pred and y_true
# if value is 2 then set that to 0
# multiply by misclassification matrix
# add the losses to give a unique number
def custom_loss(y_true, y_pred):
    y_pred = y_pred.numpy()
    y_pred_dummy = np.zeros_like(y_pred)
    y_pred_dummy[np.arange(len(y_pred)), y_pred.argmax(1)] = 1
    y_pred = y_pred_dummy
    y_true = y_true.numpy()
    y_final = y_pred + y_true
    y_final[y_final == 2] = 0
    w_array = [[1,1,5],[1,1,1],[5,1,1]]
    return tf.convert_to_tensor(np.sum(np.dot(y_final, w_array)))
model = keras.Sequential()
model.add(layers.Dense(32, input_dim=4, activation='relu'))
model.add(layers.Dense(16, input_dim=4, activation='relu'))
model.add(layers.Dense(8, input_dim=4, activation='relu'))
model.add(layers.Dense(3, activation='softmax'))
model.compile(loss=custom_loss, optimizer='adam', run_eagerly=True)
I do not understand what I am doing incorrectly here. I read through the issues on TensorFlow, and one of the reasons given is that the link between the loss function and the input variables is broken. But I am using y_true in the loss function.
Thanks
You cannot use NumPy inside a custom loss function. The loss function is part of the graph and should operate on tensors, not arrays. NumPy doesn't support backpropagation of gradients.
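A tensor-only alternative might look like the sketch below. It is not the asker's exact rule: instead of thresholding and `argmax`, which provide no useful gradient, it takes the expected misclassification cost under the predicted probabilities. The cost matrix reuses `w_array` from the question; reading its rows as the true class and its columns as the predicted class is an assumption, as are the name `expected_cost_loss` and the toy inputs.

```python
import tensorflow as tf

# Cost matrix taken from the question's w_array.
# Assumption: COST[i][j] is the penalty for predicting class j when the true class is i.
COST = tf.constant([[1.0, 1.0, 5.0],
                    [1.0, 1.0, 1.0],
                    [5.0, 1.0, 1.0]])

def expected_cost_loss(y_true, y_pred):
    """Mean expected misclassification cost, built only from TensorFlow ops."""
    y_true = tf.cast(y_true, y_pred.dtype)
    # Select the cost row of each example's true class: shape [batch, 3].
    cost_rows = tf.matmul(y_true, COST)
    # Weight each possible prediction's cost by its predicted probability.
    per_example = tf.reduce_sum(cost_rows * y_pred, axis=-1)
    return tf.reduce_mean(per_example)

# Toy check that gradients flow from the loss back to the model outputs.
logits = tf.Variable([[2.0, 0.5, -1.0],
                      [0.1, 0.2, 3.0]])
y_true = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])
with tf.GradientTape() as tape:
    loss = expected_cost_loss(y_true, tf.nn.softmax(logits))
grad = tape.gradient(loss, logits)
print(float(loss), grad is not None)
```

A function of this shape could be passed as `loss=expected_cost_loss` to `model.compile`, since it takes `(y_true, y_pred)` and returns a tensor that stays connected to `y_pred`.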

How To Do Model Predict Using Distributed Dask With a Pre-Trained Keras Model?

I am loading my pre-trained Keras model and then trying to parallelize prediction over a large amount of input data using Dask. Unfortunately, I'm running into some issues relating to how I'm creating my Dask array. Any guidance would be greatly appreciated!
Setup:
First I cloned from this repo https://github.com/sanchit2843/dlworkshop.git
Reproducible Code Example:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.model_selection import train_test_split
from keras.models import load_model
import keras
from keras.models import Sequential
from keras.layers import Dense
from dask.distributed import Client
import warnings
import dask.array as da
import mlflow
import mlflow.keras
from keras import backend as K
warnings.filterwarnings('ignore')
dataset = pd.read_csv('data/train.csv')
X = dataset.drop(['price_range'], axis=1).values
y = dataset[['price_range']].values
# scale data
sc = StandardScaler()
X = sc.fit_transform(X)
ohe = OneHotEncoder()
y = ohe.fit_transform(y).toarray()
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2)
# Neural network
model = Sequential()
model.add(Dense(16, input_dim=20, activation="relu"))
model.add(Dense(12, activation="relu"))
model.add(Dense(4, activation="softmax"))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=64)
# Use dask
client = Client()
def load_and_predict(input_data_chunk):
    def contrastive_loss(y_true, y_pred):
        margin = 1
        square_pred = K.square(y_pred)
        margin_square = K.square(K.maximum(margin - y_pred, 0))
        return K.mean(y_true * square_pred + (1 - y_true) * margin_square)
    mlflow.set_tracking_uri('<uri>')
    mlflow.set_experiment('clean_parties_ml')
    runs = mlflow.search_runs()
    artifact_uri = runs.loc[runs['start_time'].idxmax()]['artifact_uri']
    model = mlflow.keras.load_model(artifact_uri + '/model', custom_objects={'contrastive_loss': contrastive_loss})
    y_pred = model.predict(input_data_chunk)
    return y_pred
da_input_data = da.from_array(X_test, chunks=(100, None))
prediction_results = da_input_data.map_blocks(load_and_predict, dtype=X_test.dtype).compute()
The Error I'm receiving:
AttributeError: '_thread._local' object has no attribute 'value'
Keras/Tensorflow don't play nicely with other threaded systems. There is an ongoing issue on this topic here: https://github.com/dask/dask-examples/issues/35

Keras Multiclass Classification (Dense model) - Confusion Matrix Incorrect

I have a labeled dataset. The last column (78) contains 4 types of attack. The confusion matrix produced by the following code is correct for two types of attack. Can anyone help me modify the code for Keras multiclass attack detection so that it produces a correct confusion matrix, and show correct code for precision, FPR, and TPR in the multiclass case? Thanks.
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns
from keras.utils.np_utils import to_categorical
dataset_original = pd.read_csv('./XYZ.csv')
# Drop NaN values from the DataFrame
dataset = dataset_original.dropna()
# data cleansing
X = dataset.iloc[:, 0:78]
print(X.info())
print(type(X))
y = dataset.iloc[:, 78] #78 is labeled column contains 4 anomaly type
print(y)
# encode the labels to 0, 1 respectively
print(y[100:110])
encoder = LabelEncoder()
y = encoder.fit_transform(y)
print([y[100:110]])
# Split the dataset now
XTrain, XTest, yTrain, yTest = train_test_split(X, y, test_size=0.2, random_state=0)
# feature scaling
scalar = StandardScaler()
XTrain = scalar.fit_transform(XTrain)
XTest = scalar.transform(XTest)
# modeling
model = Sequential()
model.add(Dense(units=16, kernel_initializer='uniform', activation='relu', input_dim=78))
model.add(Dense(units=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(XTrain, yTrain, batch_size=1000, epochs=10)
history = model.fit(XTrain, yTrain, batch_size=1000, epochs=10, verbose=1, validation_data=(XTest, yTest))
yPred = model.predict(XTest)
yPred = [1 if y > 0.5 else 0 for y in yPred]
matrix = confusion_matrix(yTest, yPred)
print(matrix)
accuracy = (matrix[0][0] + matrix[1][1]) / (matrix[0][0] + matrix[0][1] + matrix[1][0] + matrix[1][1])
print("Accuracy: " + str(accuracy * 100) + "%")
If I understand correctly, you are trying to solve a multiclass classification problem where the target label is one of 4 different attacks. Therefore, the output Dense layer should have 4 units instead of 1, with a 'softmax' activation function (not 'sigmoid'). Additionally, you should use the 'categorical_crossentropy' loss in place of 'binary_crossentropy' when compiling your model.
Furthermore, with this setting, applying argmax to the prediction result (which has 4 class probability values for each test sample) gives you the final label/class.
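A sketch of the evaluation step this implies, using NumPy only. The probabilities and labels are made-up stand-ins for `model.predict(XTest)` and `yTest`, and the helper names `confusion` and `per_class_rates` are hypothetical. The rates are computed one-vs-rest for each class, which is one common way to extend precision, TPR, and FPR to the multiclass case.

```python
import numpy as np

def confusion(y_true, y_pred, n_classes):
    """Confusion matrix with true classes as rows and predicted classes as columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_rates(cm):
    """One-vs-rest precision, TPR and FPR per class (NaN where a ratio is undefined)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    with np.errstate(divide="ignore", invalid="ignore"):
        precision = tp / (tp + fp)
        tpr = tp / (tp + fn)
        fpr = fp / (fp + tn)
    return precision, tpr, fpr

# Made-up softmax outputs for 6 samples and 4 classes.
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.5, 0.3, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
])
y_true = np.array([0, 1, 2, 3, 1, 2])

y_pred = probs.argmax(axis=1)  # most probable class for each sample
cm = confusion(y_true, y_pred, n_classes=4)
precision, tpr, fpr = per_class_rates(cm)
print(cm)
print(precision, tpr, fpr)
```

Overall accuracy is then the trace of the matrix divided by its sum, rather than the four-cell formula used for the binary case.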
[Edit]
Your confusion matrix and high accuracy indicate that you are working with an imbalanced dataset. Perhaps a very high number of samples are from class 0 and only a few are from the remaining 3 classes. To handle this, you may want to apply sample weighting or over-sampling/under-sampling approaches.
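One common weighting scheme, sketched here as an illustration rather than as a specific recommendation from this answer, gives each class a weight inversely proportional to its frequency: `n_samples / (n_classes * count)`. The label counts below are made up, and `balanced_class_weights` is a hypothetical helper.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}

# Made-up imbalanced labels: class 0 dominates, classes 1 to 3 are rare.
labels = [0] * 900 + [1] * 60 + [2] * 30 + [3] * 10

weights = balanced_class_weights(labels)
print(weights)
```

A dictionary of this form, mapping class index to weight, is what Keras `fit` accepts through its `class_weight` argument, so rare classes contribute more to the loss.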
