Minimizing and maximizing the loss - keras

I would like to train an autoencoder in such a way that the reconstruction error will be low on some observations, and high on the others.
from keras.models import Sequential
from keras.layers import Dense
import keras.backend as K

def l1Loss(y_true, y_pred):
    return K.mean(K.abs(y_true - y_pred))

model = Sequential()
model.add(Dense(5, input_dim=10, activation='relu'))
model.add(Dense(10, activation='sigmoid'))
model.compile(optimizer='adam', loss=l1Loss)

for i in range(1000):
    model.train_on_batch(x_good, x_good)    # minimize on low
    model.train_on_batch(x_bad, x_bad, ???) # need to maximize this part, so that mse(x_bad, x_bad_reconstructed) is high
I saw something about replacing ??? with sample_weight=-np.ones(batch_size), but I have no idea whether this fits my goal.

Yes: if you set the sample weights to negative numbers, then minimizing the weighted loss will in fact maximize the unweighted loss on those samples.
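A minimal sketch of that training loop with negative sample weights (assuming x_good and x_bad are NumPy arrays):
import numpy as np

for i in range(1000):
    # minimize reconstruction error on the "good" observations
    model.train_on_batch(x_good, x_good)
    # negative per-sample weights flip the sign of the loss, so gradient descent
    # pushes the reconstruction error on the "bad" observations up instead of down
    model.train_on_batch(x_bad, x_bad, sample_weight=-np.ones(len(x_bad)))
Note that the loss on x_bad is then unbounded below, so in practice you may want to cap or down-weight that term.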

Related

Polynomial Regression using keras

Hi, I am new to Keras and I just wanted to know: are ANNs good for polynomial regression tasks, or should we just use sklearn? For example, I wrote this script:
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
x=np.arange(1, 100)
y=x**2
model = Sequential()
model.add(Dense(units=200, activation = 'relu',input_dim=1))
model.add(Dense(units=200, activation= 'relu'))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error',optimizer=keras.optimizers.SGD(learning_rate=0.001))
model.fit(x, y,epochs=2000)
but after testing it on some numbers I didn't get good results, e.g.:
model.predict([300])
array([[3360.9023]], dtype=float32)
Is there any problem in my code, or should I just not use ANNs for polynomial regression?
Thank you.
I'm not 100 percent sure, but I think the reason you are getting such bad predictions is that you did not scale your data. Neural networks trained with gradient descent are very sensitive to the scale of the inputs and targets, so scaling is a must. Scale your data as shown below:
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
from sklearn.preprocessing import StandardScaler

x = np.arange(1, 100).reshape(-1, 1)  # StandardScaler expects 2-D input
y = x ** 2

sc_x = StandardScaler()
x = sc_x.fit_transform(x)
sc_y = StandardScaler()
y = sc_y.fit_transform(y)

model = Sequential()
model.add(Dense(units=5, activation='relu', input_dim=1))
model.add(Dense(units=5, activation='relu'))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error', optimizer=keras.optimizers.SGD(learning_rate=0.001))
model.fit(x, y, epochs=75, batch_size=10)

prediction = sc_y.inverse_transform(model.predict(sc_x.transform([[300]])))
print(prediction)
Note that I changed the number of epochs from 2000 to 75. This is because 2000 epochs is way too high for this problem, and it requires lots of time to train. Your X dataset contains only about 100 values, so the maximum number of epochs I would suggest is 75.
Furthermore, I also changed the number of neurons in each hidden layer from 200 to 5. This is because 200 neurons is far too many for most datasets, let alone a small dataset of length 100.
These changes should ensure that your neural network produces more accurate predictions.
Hope that helped.

what does [1] mean in model.evaluate(X, Y)[1]

The following code is from a textbook called 'Deeplearning for everybody', and it predicts diabetes based on the Pima Indians dataset. I wonder what the [1] at the end of the code means.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
import tensorflow as tf

np.random.seed(3)
tf.random.set_seed(3)

dataset = np.loadtxt('.\dataset\pima-indians-diabetes.csv', delimiter=',')
X = dataset[:, 0:8]
Y = dataset[:, 8]

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X, Y, epochs=200, batch_size=10)
print('\n Accuracy: %.4f' % (model.evaluate(X, Y)[1]))  # <---------
In Keras, model.evaluate() returns a list of aggregated metric values. Say you want to measure the loss, the accuracy and an F1 score on your test data; then you would compile your model something like this: model.compile(optimizer, loss, metrics=['accuracy', custom_f1_function], ...). These are calculated for every sample (or batch) in the dataset and then reduced, usually by taking the average. In the end you get a list with three elements: aggregated loss, aggregated accuracy, aggregated F1 score. In your code you are accessing the second element of this list, namely the accuracy.
(The order in `metrics=[...]` determines the order in the output list!)
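As a quick illustration, reusing the model above, the returned list can be unpacked directly:
# model.evaluate returns [loss, accuracy] because compile used metrics=['accuracy']
loss, accuracy = model.evaluate(X, Y)
print('Loss: %.4f  Accuracy: %.4f' % (loss, accuracy))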

How to pass weights to mean squared error in keras

I am trying to approach a regression problem which is multi-label with 8 labels, for which I am using mean squared error loss, but the dataset is imbalanced and I want to pass weights to the loss function. Currently I am compiling the model this way:
model.compile(loss='mse', optimizer=Adam(lr=0.0001), metrics=['mse', 'acc'])
Could someone please suggest whether it is possible to add weights to mean squared error and, if so, how I could do it?
Thanks in advance.
The labels look like so
model = Sequential()
model.add(effnet)
model.add(GlobalAveragePooling2D())
model.add(Dropout(0.5))
model.add(Dense(8, name='nelu', activation='elu'))
model.compile(loss=custom_mse(class_weights),
              optimizer=Adam(lr=0.0001), metrics=['mse', 'acc'])
import keras
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense, Conv1D, LSTM, TimeDistributed
import keras.backend as K

# custom loss function
def custom_mse(class_weights):
    def loss_fixed(y_true, y_pred):
        """
        :param y_true: A tensor of the same shape as `y_pred`
        :param y_pred: A tensor resulting from a sigmoid
        :return: Output tensor.
        """
        # print('y_pred:', K.int_shape(y_pred))
        # print('y_true:', K.int_shape(y_true))
        y_pred = K.reshape(y_pred, (8, 1))
        y_pred = K.dot(class_weights, y_pred)
        # calculating mean squared error
        mse = K.mean(K.square(y_pred - y_true), axis=-1)
        # print('mse:', K.int_shape(mse))
        return mse
    return loss_fixed  # return the inner function so it can be passed to compile

model = Sequential()
model.add(Conv1D(8, (1), input_shape=(28, 28)))
model.add(Flatten())
model.add(Dense(8))

# custom class weights
class_weights = K.variable([[0.25, 1., 2., 3., 2., 0.6, 0.5, 0.15]])
# print('class_weights:', K.int_shape(class_weights))

model.compile(optimizer='adam', loss=custom_mse(class_weights), metrics=['accuracy'])
Here is a small implementation of a custom loss function based on your problem statement (the code above).
You can find more information about Keras loss functions in losses.py, and also check out the official documentation here.
Keras does not handle low-level operations such as tensor products, convolutions and so on itself. Instead, it relies on a specialized, well-optimized tensor manipulation library to do so, serving as the "backend engine" of Keras. More information about the Keras backend can be found here, along with its official documentation.
Use K.int_shape(tensor_name) to find the dimensions of a tensor.
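Note that the K.reshape/K.dot step in the code above effectively assumes a batch size of 1. As a variation, a per-label weighted MSE can also be written elementwise so it works for any batch size; this is only a sketch, and weighted_mse and label_weights are illustrative names:
import keras.backend as K

def weighted_mse(label_weights):
    # label_weights: one weight per output label, e.g. [0.25, 1., 2., 3., 2., 0.6, 0.5, 0.15]
    w = K.constant(label_weights)
    def loss(y_true, y_pred):
        # weight each label's squared error, then average over the labels
        return K.mean(w * K.square(y_true - y_pred), axis=-1)
    return loss

model.compile(optimizer='adam', loss=weighted_mse([0.25, 1., 2., 3., 2., 0.6, 0.5, 0.15]))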
First create a dictionary of how much you want to weight each class, for example:
class_weights = {0: 1,
                 1: 1,
                 2: 1,
                 3: 9,
                 4: 1, ...}  # Do this for all eight classes
Then pass them into model.fit:
model.fit(X, y, class_weight=class_weights)

Why does sigmoid function outperform tanh and softmax in this case?

The sigmoid function gives better results than tanh or softmax for the below neural network.
If I change the activation function from sigmoid to tanh or softmax, the error increases and accuracy decreases, although I have learned that tanh and softmax are better than sigmoid. Could someone help me understand this?
The datasets I used are iris and Pima Indians Diabetes Database. I have used TensorFlow 1.5 and Keras 2.2.4
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
import numpy as np
dataset = np.genfromtxt('diabetes.csv', dtype=float, delimiter=',')
X = dataset[1:, 0:8]
Y = dataset[1:, 8]
xtrain, xtest, ytrain, ytest = train_test_split(X, Y, test_size=0.2, random_state=42)
model = Sequential()
model.add(Dense(10, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(xtrain, ytrain, epochs=50, batch_size=20)
print(model.metrics_names)
print(model.evaluate(xtest, ytest))
Tanh's output range is (-1, 1), but that's not necessarily a problem. By learning suitable weights and a bias, a Tanh output can be fit to the range [0, 1]. Therefore both sigmoid and tanh can be used here; only softmax is not possible, for the reasons explained in the other answer below. See the code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
X = np.hstack((np.linspace(0, 0.45, num=50), np.linspace(0.55, 1, num=50)))
Y = (X > 0.5).astype('float').T
model = Sequential()
model.add(Dense(1, input_dim=1, activation='tanh'))
model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
model.fit(X, Y, epochs=100)
print(model.evaluate(X, Y, verbose=False))
Whenever someone says you should always prefer foo over bar in machine learning, it's probably an oversimplification. There are anti-patterns one can point out, things that never work, like the softmax case discussed here. If the rest were that simple, AutoML would be a very boring field of research ;) . PS: I'm not exactly working on AutoML.
The softmax activation function is generally used for categorical outputs. This is because softmax squashes the outputs into the range (0, 1) so that they sum to 1. If your output layer has only one unit/neuron, it will therefore always output a constant 1.
Tanh, or the hyperbolic tangent, is a logistic-like function that maps outputs to the range (-1, 1). Tanh can be used for binary classification between two classes; when using tanh, remember to label the data accordingly with [-1, 1].
The sigmoid function is another logistic function, like tanh, but it maps any real input to the range (0, 1). This makes sigmoid a great function for predicting a probability.
So, all in all, the output activation function is usually not a performance knob; it depends on the task and the network architecture you are working with.
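To illustrate the softmax point numerically, here is a small standalone sketch:
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

# With a single output unit, softmax always returns 1 regardless of the logit,
# so the network can never distinguish between the two classes.
print(softmax(np.array([2.7])))   # [1.]
print(softmax(np.array([-5.0])))  # [1.]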

How do I calculate accuracy of my ANN in this case

I am running the following code. I want to calculate the accuracy of my ANN on test data. I am using the Windows platform, Python 3.5.
import numpy
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
dataset=pd.read_csv('main.csv')
dataset=dataset.fillna(0)
X=dataset.iloc[:, 0:6].values
#X = X[numpy.logical_not(numpy.isnan(X))]
y=dataset.iloc[:, 6:8].values
#y = y[numpy.logical_not(numpy.isnan(y))]
#regr = LinearRegression()
#regr.fit(numpy.transpose(numpy.matrix(X)), numpy.transpose(numpy.matrix(y)))
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test=train_test_split(X,y, test_size=0.24,random_state=0)
# create model
model = Sequential()
model.add(Dense(4, input_dim=6, kernel_initializer='normal', activation='relu'))
model.add(Dense(4, kernel_initializer='normal', activation='relu'))
model.add(Dense(2, kernel_initializer='normal'))
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, batch_size=5, epochs=5)
y_pred=model.predict(X_test)
Now, I want to calculate the accuracy of y_pred. Any help will be appreciated.
The above code is self-explanatory. I am currently using only 5 epochs just for experimenting.
Keras already implements metrics such as accuracy, so you just need to change the model.compile line to:
model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=["accuracy"])
Then the training and validation accuracy (in the [0, 1] range) will be shown on the progress bar during training, and you can compute accuracy with model.evaluate as well, which will return a tuple of loss and metrics (accuracy in this case).
Besides the suggestion of using Keras, you can compute the accuracy using scikit-learn as follows:
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
For more information, check the documentation : sklearn.metrics.accuracy_score
Although in a narrow technical sense both answers already provided are correct, there is a more general issue with your question which affects the essence of it: are you in a regression or a classification context?
If you are in a regression context (as implied by your loss='mean_squared_error' and the linear activation in your output layer), then the simple augmentation of model compilation
model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=["accuracy"])
will, as Matias says, provide the accuracy. Nevertheless, accuracy is meaningless in a regression setting; see the answer & discussion here for more details.
If you are in a classification context (as implied by your wish to calculate the accuracy, which is meaningful only in classification), then your loss function should not be MSE but cross-entropy instead, and the activation of your last layer should not be linear.
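For example, a classification-style setup for the two output columns might look something like this (a sketch, assuming the two target columns are independent binary labels):
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(4, input_dim=6, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(2, activation='sigmoid'))   # sigmoid instead of a linear output
model.compile(loss='binary_crossentropy',   # cross-entropy instead of MSE
              optimizer='adam',
              metrics=['accuracy'])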
To compute the accuracy, we can also use the model.evaluate function.
