import torch
# Y_pred = ?
for xi in X_iter:
    y_pred = net(xi).argmax(dim=1)
    Y_pred = torch.cat([Y_pred, y_pred])
How do you initialize this tensor, or is there a better way to write it?
You could do this instead:
Y_pred = torch.cat([net(xi).argmax(dim=1) for xi in X_iter])
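If you prefer the explicit loop, you could also start from an empty tensor. A minimal sketch, assuming the predictions are integer class indices and everything lives on the same device (add a device=... argument otherwise):
import torch
# Start with an empty 1-D tensor of class indices and grow it each iteration.
Y_pred = torch.empty(0, dtype=torch.long)
for xi in X_iter:
    Y_pred = torch.cat([Y_pred, net(xi).argmax(dim=1)])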
I am looking for a way to penalize false positives and false negatives differently in the loss. I have not been able to find a solution that I can get working, most likely due to my lack of experience.
I found this post: Custom loss function in Keras to penalize false negatives, which has pretty much the same purpose, but I cannot get the answer to work.
At first I kept getting this error:
AttributeError: 'Tensor' object has no attribute '_numpy'
After some searching around I found that it could be solved with model.run_eagerly = True, but that instead produces this error:
No gradients provided for any variable
I would prefer not to use model.run_eagerly = True, since I am not entirely sure what it does, but it clearly means I need to do something about how the gradients are calculated. I have therefore altered the code, but I keep getting the "No gradients provided for any variable" error.
I have made an example of this:
import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
class Custom_loss_Class(tf.keras.losses.Loss):
    def __init__(self, recall_weight=0.5, spec_weight=0.5, name="custom"):
        super().__init__(name=name)
        self.recall_weight = recall_weight
        self.spec_weight = spec_weight

    def call(self, y_true, y_pred):
        y_pred = tf.math.round(y_pred)
        y_true = tf.math.round(y_true)
        TN = tf.dtypes.cast(tf.math.logical_and(tf.math.equal(y_true, 0), tf.math.equal(y_pred, 0)), tf.float32)
        TP = tf.dtypes.cast(tf.math.logical_and(tf.math.equal(y_true, 1), tf.math.equal(y_pred, 1)), tf.float32)
        FP = tf.dtypes.cast(tf.math.logical_and(tf.math.equal(y_true, 0), tf.math.equal(y_pred, 1)), tf.float32)
        FN = tf.dtypes.cast(tf.math.logical_and(tf.math.equal(y_true, 1), tf.math.equal(y_pred, 0)), tf.float32)
        TN = tf.reduce_sum(TN)
        TP = tf.reduce_sum(TP)
        FP = tf.reduce_sum(FP)
        FN = tf.reduce_sum(FN)
        specificity = TN / (TN + FP + K.epsilon())
        recall = TP / (TP + FN + K.epsilon())
        loss = tf.Variable(1 - (self.recall_weight * recall + self.spec_weight * specificity))
        return loss
data = load_breast_cancer()
X_train, X_test, Y_train, Y_test = train_test_split(data.data, data.target, test_size=0.3)
N, D = X_train.shape
scalar = StandardScaler()
X_train = scalar.fit_transform(X_train)
X_test = scalar.transform(X_test)
i = Input(shape=(D,))
x = Dense(64, activation="relu")(i)
x = Dense(1, activation="sigmoid")(x)
model = Model(i, x)
model.compile(optimizer="adam",
              loss=Custom_loss_Class(recall_weight=0.1, spec_weight=0.9),
              metrics="accuracy")
r = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=10)
I am aware that I can use class_weight to give one class more weight, but that is not really what I want. In this example I would like to heavily restrict false negatives, so that a patient with cancer does not get a negative prediction, while at the same time not producing too many false positives. It seems like a common use case, so I hope someone has written a good loss function for it.
Edit:
I am aware that this is most likely due to my loss function not being differentiable, but I do not know how to change it so that I get the described functionality.
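For reference, one common way to make such a loss differentiable is to drop the round() calls and build "soft" confusion-matrix counts directly from the predicted probabilities, so that gradients can flow through y_pred. A minimal sketch along those lines (the class name is just illustrative, not from the code above):
class SoftSpecRecallLoss(tf.keras.losses.Loss):
    def __init__(self, recall_weight=0.5, spec_weight=0.5, name="soft_spec_recall"):
        super().__init__(name=name)
        self.recall_weight = recall_weight
        self.spec_weight = spec_weight

    def call(self, y_true, y_pred):
        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.cast(y_pred, tf.float32)
        # "Soft" counts: use the probabilities themselves instead of hard 0/1 predictions.
        TP = tf.reduce_sum(y_true * y_pred)
        FN = tf.reduce_sum(y_true * (1.0 - y_pred))
        TN = tf.reduce_sum((1.0 - y_true) * (1.0 - y_pred))
        FP = tf.reduce_sum((1.0 - y_true) * y_pred)
        recall = TP / (TP + FN + K.epsilon())
        specificity = TN / (TN + FP + K.epsilon())
        # Plain tensor arithmetic (no tf.Variable), so the result stays differentiable.
        return 1.0 - (self.recall_weight * recall + self.spec_weight * specificity)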
I am running a neural network with Keras. Here is my code:
import numpy as np
from keras import Model
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as K
def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
np.random.seed(1)
Train_X = np.random.randint(low=0,high=100,size = (50,5))
Train_Y = np.matmul(Train_X,np.arange(10).reshape(5,2))+np.random.randint(low=0,high=10,size=(50,2))
Test_X = np.random.randint(low=0,high=100,size = (10,5))
Test_Y = np.matmul(Test_X,np.arange(10).reshape(5,2))+np.random.randint(low=0,high=10,size=(10,2))
model = Sequential()
model.add(Dense(4,activation = 'relu'))
model.add(Dense(2,activation='relu'))
model.add(Dense(2,activation='relu'))
model.add(Dense(2))
model.compile(loss=mean_squared_error, optimizer='adam', metrics=['mae'])
history = model.fit(Train_X, Train_Y, epochs=100, batch_size=5,validation_data = (Test_X, Test_Y))
loss1 = model.evaluate(Test_X,Test_Y)
loss2 = history.history['val_loss'][99]
y_pred = model.predict(Test_X)
y_true = Test_Y
loss3 = np.mean(np.square(y_pred-y_true))
I find that loss1 is the same as loss2 but different from loss3, which confuses me. Could someone tell me why?
This is possibly due to different dtypes for Test_Y and y_pred. Keras tries to automatically take care of dtype mismatches for you, so it is possible that Test_Y is a float64 and y_pred is a float32. If that is indeed the case, try converting one of their dtypes for the loss3 calculation and see if the values match.
y_pred = model.predict(Test_X)
y_true = Test_Y.astype(np.float32)
loss3 = np.mean(np.square(y_pred-y_true))
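If you want to confirm the mismatch before converting anything, a quick check (using the same variable names as above) is:
print(y_pred.dtype, Test_Y.dtype)  # a float32 / float64 mismatch would explain the gap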
I am trying to write simple ML code to classify the MNIST dataset in TensorFlow 2.0. I am not using Keras for now, since I want to use the lower-level API to understand how TensorFlow works. However, after defining the cross entropy, it seems impossible to continue. All of the TF 2.0 optimizers have been moved to Keras, and I don't know how to train a model without Keras in TF 2.0. Is there a way to bypass Keras in TF 2.0?
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()
print(train_images.shape)
print(len(train_labels))
print(train_images[1,:,:].shape)
# plt.figure()
# plt.imshow(train_images[0])
# plt.colorbar()
# plt.grid(False)
# plt.show()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
for i in range(1):
    # use the i-th image so it matches train_labels[i]
    x = tf.constant(train_images[i, :, :].reshape(784), dtype=tf.float32)
    x = tf.reshape(x, [1, 784])
    print(tf.shape(x), tf.shape(W))
    # define the model
    y = tf.nn.softmax(tf.matmul(x, W) + b)
    print(y)
    # correct labels
    y_ = np.zeros(10)
    y_[train_labels[i]] = 1.0
    y_ = tf.constant(y_, dtype=tf.float32)
    y_ = tf.reshape(y_, [1, 10])
    cross_entropy = -tf.reduce_sum(y_ * tf.math.log(y))
    print(cross_entropy)
I don't know how to continue from here.
Backpropagation-based training of models is entirely possible in TensorFlow 2.x without using the Keras API. Usage centers on the tf.GradientTape API and the optimizer objects under the tf.optimizers namespace.
Your example can be modified as follows. Note that this is simplistic code meant to illustrate basic usage in a short snippet; it is not meant to demonstrate machine-learning best practices in TF2.
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
@tf.function
def my_model(x):
    # This is a hand-rolled logistic regressor.
    y = tf.matmul(x, W) + b
    return tf.nn.softmax(y)

@tf.function
def loss(x, y):
    # This is a hand-rolled categorical cross-entropy loss.
    y_hat = my_model(x)
    diff = -(y * tf.math.log(y_hat))
    loss = tf.reduce_mean(diff)
    return loss
optimizer = tf.optimizers.Adam(learning_rate=1e-3)

for i in range(num_steps):
    # A single training step.
    with tf.GradientTape() as tape:
        # This is atypical, in that you would normally want to do this in
        # mini-batches, instead of using all examples in x_train and y_train
        # at once. But again, this is just a simple example.
        loss_value = loss(x_train, y_train)
    gradients = tape.gradient(loss_value, [W, b])
    optimizer.apply_gradients(zip(gradients, [W, b]))
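The loop above assumes x_train, y_train and num_steps already exist. One possible way to build them from the MNIST arrays loaded earlier (an assumption, not part of the snippet above):
# Hypothetical preparation: flatten the images and one-hot encode the labels.
x_train = tf.constant(train_images.reshape(-1, 784), dtype=tf.float32)
y_train = tf.one_hot(train_labels.astype("int32"), depth=10)
num_steps = 100  # placeholder; choose however many steps you want to run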
I have a very simple DNN for a given data set. However, the standard deviations of the error I get from evaluate and predict are different. The mean error looks fine, but the stdev from predict is always larger than the stdev from evaluate. Why does this difference happen and how can I fix it?
Raw data is here for download
from keras.models import Sequential
from keras.layers import Dense, Activation
import keras.backend as K
from keras import optimizers
import pickle
import numpy as np
with open('.\\dump', 'rb') as f:
    xTr = pickle.load(f)
    yTr = pickle.load(f)
    muX = pickle.load(f)
    stdX = pickle.load(f)
    muY = pickle.load(f)
    stdY = pickle.load(f)

def mean_pred(y_true, y_pred):
    y_true = y_true * stdY + muY
    y_pred = y_pred * stdY + muY
    return K.mean(y_pred - y_true)

def std_pred(y_true, y_pred):
    y_true = y_true * stdY + muY
    y_pred = y_pred * stdY + muY
    return K.std(y_pred - y_true)
model = Sequential()
model.add(Dense(256, input_shape=(100,)))
model.add(Activation('tanh'))
model.add(Dense(1))
adam = optimizers.adam(lr=0.0001)
model.compile(optimizer=adam,loss='mse', metrics=[mean_pred, std_pred])
model.fit(xTr, yTr.reshape(-1,1), epochs = 5, batch_size = 128, verbose=0, shuffle=True)
score = model.evaluate(xTr, yTr.reshape(-1,1), verbose=0)
pred = model.predict(xTr, verbose=0)
print(score) #mse, mean, stdev of error
errArr = []
for i, y in enumerate(yTr):
    errArr.append((pred[i][0] - y) * stdY)
e = np.asarray(errArr)
print(e.mean(), e.std()) #mean, stdev of error
Finally got the reason: by default, evaluate computes the metrics batch by batch and averages the per-batch results, so a statistic like the standard deviation does not match the value computed over all predictions at once. After setting batch_size = 1000 (the number of samples in my data set), I got the same mean and standard deviation of error.
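As an illustration (a sketch using the same variables as above), the workaround amounts to:
# With batch_size equal to the full data set, the per-batch metrics are computed
# over all samples at once, so they match the numbers derived from predict.
score = model.evaluate(xTr, yTr.reshape(-1, 1), batch_size=len(xTr), verbose=0)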
I have one question: I'm trying to use KFold and cross_val_score.
My goal is to calculate mean_squared_error, and for this purpose I used the following code:
from sklearn import linear_model
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score
x = np.random.random((10000,20))
y = np.random.random((10000,1))
x_train = x[7000:]
y_train = y[7000:]
x_test = x[:7000]
y_test = y[:7000]
Model = linear_model.LinearRegression()
Model.fit(x_train,y_train)
y_predicted = Model.predict(x_test)
MSE = mean_squared_error(y_test,y_predicted)
print(MSE)
kfold = KFold(n_splits = 100, random_state = None, shuffle = False)
results = cross_val_score(Model,x,y,cv=kfold, scoring='neg_mean_squared_error')
print(results.mean())
I think it's all right here; I got the following results:
Results: 0.0828856459279 and -0.083069435946
But when I try to do this on another example (data from the Kaggle House Prices competition), it does not work properly, or at least I think so.
import pandas as pd

train = pd.read_csv('train.csv')
# ... insert missing values ...
train = pd.get_dummies(train)
y = train['SalePrice']
train = train.drop(['SalePrice'], axis = 1)
x_train = train[:1000].values.reshape(-1,339)
y_train = y[:1000].values.reshape(-1,1)
y_train_normal = np.log(y_train)
x_test = train[1000:].values.reshape(-1,339)
y_test = y[1000:].values.reshape(-1,1)
Model = linear_model.LinearRegression()
Model.fit(x_train,y_train_normal)
y_predicted = Model.predict(x_test)
y_predicted_transform = np.exp(y_predicted)
MSE = mean_squared_error(y_test, y_predicted_transform)
print(MSE)
kfold = KFold(n_splits = 10, random_state = None, shuffle = False)
results = cross_val_score(Model,train,y, cv = kfold, scoring = "neg_mean_squared_error")
print(results.mean())
Here I get the following results: 0.912874946869 and -6.16986926564e+16
Apparently, the mean_squared_error calculated 'manually' is not the same as the mean_squared_error calculated with the help of KFold.
Where did I make a mistake?
The discrepancy arises because, in contrast to your first approach (train/test split), in your CV approach you fit the regression to the untransformed y data (no log applied), hence the huge MSE. To get comparable results, you should do the following:
y_normal = np.log(y)
y_test_normal = np.log(y_test)
MSE = mean_squared_error(y_test_normal, y_predicted) # NOT y_predicted_transform
results = cross_val_score(Model, train, y_normal, cv = kfold, scoring = "neg_mean_squared_error")
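One follow-up note (using the variables above): the "neg_mean_squared_error" scorer negates the MSE, so once both quantities are on the log scale they can be compared directly; they will still differ somewhat because the CV folds are not the same split as your single train/test split.
print(MSE, -results.mean())  # both are now mean squared errors of log(SalePrice)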