What is the range for nll_loss? - pytorch

I thought the range was only within the positive domain, but I am getting both negative and positive numbers from F.nll_loss. Why is that? I am confused: softmax ranges from 0 to 1, and -log(x) for x in (0, 1] ranges from infinity down to 0. So why am I getting negative numbers?

Here is how the function is implemented:
import torch
import torch.nn.functional as F

f = F.nll_loss

def NLLLoss(logs, targets):
    # pick out the value at the target index of each row, negate, and average
    out = logs[range(len(targets)), targets]
    return -out.sum() / len(out)

i = torch.randn(3, 5)
print(i)
t = torch.empty(3).random_(0, 5).to(dtype=torch.long)
print(t)
o = f(i, t)
print(o)
f = NLLLoss
o = f(i, t)
print(o)
# tensor([[ 0.0684,  0.9493, -0.9932, -1.9325, -0.1642],
#         [ 1.7073,  0.8153, -0.6435, -1.0128,  0.9894],
#         [ 0.6948, -1.3770, -0.0932, -0.0951, -1.4226]])
# tensor([2, 0, 3])
# tensor(-0.2063)
# tensor(-0.2063)
As the reimplementation shows, nll_loss just picks out the target entries, negates them, and averages; it never applies a log itself. The input is expected to already contain log-probabilities. Fed arbitrary values (like torch.randn here), it is just an average of values, and its range is -inf to inf. The loss is only guaranteed non-negative when the input really is log-probabilities, e.g. the output of F.log_softmax.
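A small check of this (a sketch reusing i and t from above): once the raw scores pass through F.log_softmax, every selected entry is a genuine log-probability in (-inf, 0], so the loss is non-negative.

import torch
import torch.nn.functional as F

i = torch.randn(3, 5)
t = torch.empty(3).random_(0, 5).to(dtype=torch.long)

# raw scores: the result can land anywhere in (-inf, inf)
print(F.nll_loss(i, t))
# log-probabilities: the result is always >= 0
print(F.nll_loss(F.log_softmax(i, dim=1), t))
# equivalent one-step call
print(F.cross_entropy(i, t))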

Related

Pytorch mask like Gluon npx.sequence_mask

I am trying to follow this tutorial: https://d2l.ai/chapter_attention-mechanisms/attention.html but in PyTorch, and I'm stuck on this function:
npx.sequence_mask()
I tried using torch.masked_fill and masked_scatter but without success. Namely, I want:
a = torch.randn(2, 2, 4)
b = torch.randn(2, 3)
and to get a result like npx.sequence_mask() (see the sequence_mask documentation):
[[[0.488994  , 0.511006  , 0.        , 0.        ],
  [0.43654838, 0.56345165, 0.        , 0.        ]],
 [[0.28817102, 0.3519408 , 0.3598882 , 0.        ],
  [0.29034293, 0.25239873, 0.45725834, 0.        ]]]
Could anyone help me out with any ideas?
Maybe this works, but is there any better solution?

import torch
import torch.nn.functional as F

def mask_softmax(vec, mask):
    leafs = vec.shape[0]
    rows = vec.shape[1]
    cols = vec.shape[2]
    for k in range(leafs):
        stop = int(mask[k])
        for j in reversed(range(stop, cols)):
            vec[k, :, j] = torch.zeros(rows)  # all rows of col j <- 0
    vec = vec - torch.where(vec > 0,
                            torch.zeros_like(vec),
                            torch.ones_like(vec) * float('inf'))  # replace 0 with -inf
    # softmax(-inf) = nan
    for k in range(leafs):
        for i in range(rows):
            vec[k, i] = F.softmax(vec[k, i], dim=0)
    vec[vec != vec] = 0  # nan -> 0
    return vec

# testing
a = torch.rand((2, 2, 4))
mask = torch.Tensor((1, 3))
mask_softmax(a, mask)
>>> tensor([[[0.5027, 0.4973, 0.0000, 0.0000],
             [0.6494, 0.3506, 0.0000, 0.0000]],
            [[0.3412, 0.3614, 0.2975, 0.0000],
             [0.2699, 0.3978, 0.3323, 0.0000]]])
d2l now provides an official PyTorch version, in which the following function is defined as the equivalent of npx.sequence_mask:

def sequence_mask(X, valid_len, value=0):
    """Mask irrelevant entries in sequences."""
    maxlen = X.size(1)
    mask = torch.arange((maxlen), dtype=torch.float32,
                        device=X.device)[None, :] < valid_len[:, None]
    X[~mask] = value
    return X
Ref: https://d2l.ai/chapter_recurrent-modern/seq2seq.html#loss-function
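For the masked softmax the question is ultimately after, d2l combines sequence_mask with a large negative fill value before the softmax. A minimal sketch along those lines (the reshape handles the 3-D scores / per-batch lengths layout from the question; treat the exact shapes as an assumption):

import torch
from torch.nn import functional as F

def masked_softmax(X, valid_lens):
    # X: (batch, queries, keys); valid_lens: (batch,) or (batch, queries)
    shape = X.shape
    if valid_lens.dim() == 1:
        # one length per batch element: repeat it for every query row
        valid_lens = torch.repeat_interleave(valid_lens, shape[1])
    else:
        valid_lens = valid_lens.reshape(-1)
    # fill padded positions with a very large negative number,
    # so softmax drives them to (effectively) zero
    X = sequence_mask(X.reshape(-1, shape[-1]), valid_lens, value=-1e6)
    return F.softmax(X.reshape(shape), dim=-1)

a = torch.rand(2, 2, 4)
print(masked_softmax(a, torch.tensor([2, 3])))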

How to determine the maximum,minimum and average values of the matrix

I need to create a matrix where the user can input his values, and then determine the maximum value of the matrix, the minimum value, and the average of all the matrix's entries.
I created the matrix, where I can input my own values, and I tried to write code that determines the maximum and minimum values. However, after several checks I realized that the piece of code that determines the max and min values doesn't work.
line = int(input('Enter the amount of lines:'))
columns = int(input('Enter the amount of columns:'))
matrix = []
for i in range(0, columns):
    matrix.append([])
for i in range(0, line):
    for j in range(0, columns):
        matrix[i].append(j)
        matrix[i][j] = 0
for i in range(0, line):
    for j in range(0, columns):
        matrix[i][j] = int(input('Enter the value:'))
avg = matrix
for i in range(0, line):
    for j in range(0, columns):
        max_val = matrix[j]
        min_val = matrix[j]
for j in range(0, len(matrix[j]), 1):
    max_val = max(max_val, matrix[j])
    min_val = min(min_val, matrix[j])
maxVal = max_val[0]
minVal = min_val[0]
for i in range(0, len(max_val), 1):
    maxVal = max(maxVal, max_val[i])
    minVal = min(minVal, min_val[i])
print(matrix)
print('The maximum value is ' + str(maxVal))
print('The minimum value is ' + str(minVal))
I expected the result to print the matrix, the maximum value, the minimum value and the average value.

One way of doing it with Python lists is this (I'll just do find_min(), as find_max() and compute_mean() would be pretty much the same):
import random

def gen_random_matrix(rows, cols, min_val=1, max_val=100):
    return [
        [random.randint(min_val, max_val) for j in range(cols)]
        for i in range(rows)]

def find_min(matrix):
    rows = len(matrix)
    cols = len(matrix[0])
    min_i, min_j = 0, 0
    min_val = matrix[min_i][min_j]
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j] < min_val:
                min_i = i
                min_j = j
                min_val = matrix[i][j]
    return min_val, min_i, min_j

random.seed(0)
matrix = gen_random_matrix(3, 4)
print(matrix)
# [[50, 98, 54, 6], [34, 66, 63, 52], [39, 62, 46, 75]]
print(find_min(matrix))
# (6, 0, 3)
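Since the question also asks for the average, here is a matching compute_mean() in the same style (a straightforward sketch, not the only way to do it):

def compute_mean(matrix):
    total = 0
    count = 0
    for row in matrix:
        for val in row:
            total += val
            count += 1
    return total / count

print(compute_mean(matrix))
# 53.75 for the example matrix above (645 / 12)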

Python3 Modified Gram-Schmidt

I'm new to Python 3. I'm trying to write code that takes a matrix as its argument and computes and prints the QR factorization using the modified Gram-Schmidt algorithm. I'm trying to use nested for loops and not use NumPy at all. I have attached my code below; any help would be greatly appreciated. Thank you in advance.
def twoNorm(vector):
    '''
    twoNorm takes a vector as its argument. It then computes the sum of
    the squares of each element of the vector. It then returns the square
    root of this sum.
    '''
    # This variable will keep track of the validity of our input.
    inputStatus = True
    # This for loop will check each element of the vector to see if it's a number.
    for i in range(len(vector01)):
        if ((type(vector01[i]) != int) and (type(vector01[i]) != float) and (type(vector01[i]) != complex)):
            inputStatus = False
            print("Invalid Input")
    # If the input is valid the function continues to compute the 2-norm.
    if inputStatus == True:
        result = 0
        # This for loop will compute the sum of the squares of the elements of the vector.
        for i in range(len(vector01)):
            result = result + (vector01[i]**2)
        result = result**(1/2)
        return result

def QR(matrix):
    r[i][i] = twoNorm(vector01)
    return [vector01 * (1/twoNorm(vector01)) for i in matrix]
    for j in range(len(matrix)):
        r[i][j] = q[i] * vector02[i]
        vector02 = vector02[i] - (r[i][j] * q[i])

matrix = [[1, 2], [0, 1], [1, 0]]
vector01 = [1, 0, 1]
vector02 = [2, 1, 0]
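For reference, here is one shape the modified Gram-Schmidt loop could take without NumPy: a minimal sketch assuming the matrix is passed as a list of column vectors, so the 3x2 example above would be [[1, 0, 1], [2, 1, 0]], i.e. vector01 and vector02. This is not the poster's code completed, just one possible structure for it:

def mgs_qr(columns):
    # columns: list of n column vectors, each a list of m numbers
    n = len(columns)
    m = len(columns[0])
    V = [col[:] for col in columns]  # working copies, orthogonalized in place
    Q = [None] * n
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        R[i][i] = sum(v * v for v in V[i]) ** 0.5   # two-norm of column i
        Q[i] = [v / R[i][i] for v in V[i]]          # normalize it
        for j in range(i + 1, n):
            # project the remaining columns onto q_i and subtract (the
            # "modified" part: use the already-updated V[j], not the original)
            R[i][j] = sum(Q[i][k] * V[j][k] for k in range(m))
            V[j] = [V[j][k] - R[i][j] * Q[i][k] for k in range(m)]
    return Q, R

Q, R = mgs_qr([[1, 0, 1], [2, 1, 0]])
print(Q)  # orthonormal columns
print(R)  # upper triangular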

Wrong number of dimensions: expected 0, got 1 with shape (1,)

I am doing word-level language modelling with a vanilla RNN. I am able to train the model, but for some weird reason I am not able to get any samples/predictions from the model. Here is the relevant part of the code:
train_set_x, train_set_y, voc = load_data(dataset, vocab, vocab_enc)  # just load all data as shared variables
index = T.lscalar('index')
x = T.fmatrix('x')
y = T.ivector('y')
n_x = len(vocab)
n_h = 100
n_y = len(vocab)

rnn = Rnn(input=x, input_dim=n_x, hidden_dim=n_h, output_dim=n_y)
cost = rnn.negative_log_likelihood(y)
updates = get_optimizer(optimizer, cost, rnn.params, learning_rate)

train_model = theano.function(
    inputs=[index],
    outputs=cost,
    givens={
        x: train_set_x[index],
        y: train_set_y[index]
    },
    updates=updates
)

predict_model = theano.function(
    inputs=[index],
    outputs=rnn.y,
    givens={
        x: voc[index]
    }
)

sampling_freq = 2
sample_length = 10
n_train_examples = train_set_x.get_value(borrow=True).shape[0]

train_cost = 0.
for i in xrange(n_train_examples):
    train_cost += train_model(i)
train_cost /= n_train_examples

if i % sampling_freq == 0:
    # sample from the model
    seed = randint(0, len(vocab) - 1)
    idxes = []
    for j in xrange(sample_length):
        p = predict_model(seed)
        seed = p
        idxes.append(p)
    # sample = ''.join(ix_to_words[ix] for ix in idxes)
    # print(sample)
I get the error: TypeError: ('Bad input argument to theano function with name "train.py:94" at index 0(0-based)', 'Wrong number of dimensions: expected 0, got 1 with shape (1,).')
This corresponds to the following line (in predict_model):
givens={ x: voc[index] }
Even after spending hours on it, I cannot comprehend how there could be a dimension mismatch when:
train_set_x has shape: (42, 4, 109)
voc has shape: (109, 1, 109)
When I do train_set_x[index] I get (4, 109), which the 'x' tensor of type fmatrix can hold (this is what happens in train_model), but when I do voc[index] I get (1, 109), which is also a matrix, yet 'x' cannot hold it. Why?
Any help will be much appreciated.
Thanks!
The error message refers to the definition of the whole Theano function named predict_model, not the specific line where the substitution with givens occurs.
The issue seems to be that predict_model gets called with an argument that is a vector of length 1 instead of a scalar. The initial seed sampled from randint is actually a scalar, but I would guess that the output p of predict_model(seed) is a vector and not a scalar.
In that case, you could either return rnn.y[0] in predict_model, or replace seed = p with seed = p[0] in the loop over j.
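A sketch of the second fix, using the names from the question's sampling loop (the exact shape and dtype of p are assumptions based on the error message):

for j in xrange(sample_length):
    p = predict_model(seed)
    # p comes back as a length-1 array; unwrap it so the next call
    # receives the 0-d scalar that the 'index' lscalar expects
    seed = int(p[0])
    idxes.append(int(p[0]))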

finding optimum lambda and features for polynomial regression

I am new to data mining/ML. I've been trying to solve a polynomial regression problem of predicting the price from given input parameters (already normalized within the range [0, 1]).
I'm quite close: my output is proportional to the correct one, but it seems a bit suppressed. My algorithm is correct; I just don't know how to arrive at an appropriate lambda (the regularization parameter), or how to decide how many features to generate, given that the problem says: "The prices per square foot are (approximately) a polynomial function of the features. This polynomial always has an order less than 4."
Is there a way to visualize the data to find the optimum values for these parameters, the way we find the optimal alpha (step size) and number of iterations by visualizing the cost function in linear regression with gradient descent?
Here is my code : http://ideone.com/6ctDFh
from numpy import *

def mapFeature(X1, X2):
    degree = 2
    out = ones((shape(X1)[0], 1))
    for i in range(1, degree + 1):
        for j in range(0, i + 1):
            term1 = X1 ** (i - j)
            term2 = X2 ** (j)
            term = (term1 * term2).reshape(shape(term1)[0], 1)
            # out[i] holds the mapped features of X1[i], X2[i]:
            # the features of one sample are laid out horizontally in its row
            out = hstack((out, term))
    return out

def solve():
    n, m = input().split()
    m = int(m)
    n = int(n)
    data = zeros((m, n + 1))
    for i in range(0, m):
        ausi = input().split()
        for k in range(0, n + 1):
            data[i, k] = float(ausi[k])
    X = data[:, 0:n]
    y = data[:, n]
    theta = zeros((6, 1))
    X = mapFeature(X[:, 0], X[:, 1])
    ausi = computeCostVect(X, y, theta)
    # print(X)
    print("Results using BFGS:")
    lamda = 2
    theta, cost = findMinTheta(theta, X, y, lamda)
    test = [0.05, 0.54, 0.91, 0.91, 0.31, 0.76, 0.51, 0.31]
    print("prediction for 0.31, 0.76 (using BFGS):")
    for i in range(0, 7, 2):
        print(mapFeature(array([test[i]]), array([test[i + 1]])).dot(theta))
    # pyplot.plot(X[:, 1], y, 'rx', markersize=5)
    # fig = pyplot.figure()
    # ax = fig.add_subplot(1, 1, 1)
    # ax.scatter(X[:, 1], X[:, 2], s=y)  # third variable as bubble size
    # pyplot.show()
The current output is:
183.43478288
349.10716957
236.94627602
208.61071682
The correct output should be:
180.38
1312.07
440.13
343.72
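There is no single value to read off a plot here, but the standard approach is to hold out a validation set and sweep both the degree (1 to 3, since the order is less than 4) and a grid of lambda values, keeping the pair with the lowest validation error. A rough sketch of that loop, assuming a degree-parameterized mapFeature and the question's findMinTheta/computeCostVect helpers (their exact signatures are assumptions here):

import numpy as np

def select_model(X1, X2, y, val_frac=0.3):
    # split once into train/validation (k-fold would be sturdier)
    m = len(y)
    idx = np.random.permutation(m)
    val, tr = idx[:int(m * val_frac)], idx[int(m * val_frac):]
    best = (None, None, np.inf)
    for degree in (1, 2, 3):  # order < 4
        F = mapFeature(X1, X2, degree)  # assumes mapFeature takes a degree argument
        for lamda in (0.0, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0):
            theta0 = np.zeros((F.shape[1], 1))
            theta, _ = findMinTheta(theta0, F[tr], y[tr], lamda)
            err = computeCostVect(F[val], y[val], theta)  # unregularized validation cost
            if err < best[2]:
                best = (degree, lamda, err)
    return best

Plotting validation error against lambda for each degree gives the usual U-shaped curve. A suppressed (under-fit) output like the one above typically means lambda is too large or the degree too low, which may fit the degree = 2 hard-coded in mapFeature against a problem of order up to 3.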
