Un-normalizing PyTorch data - pytorch

The code below:
import numpy as np
import torch
import torch.nn.functional as F
ux = torch.tensor(np.array([[255,1,255],[255,1,255]])).float()
print(ux)
ux = F.normalize(ux, p=2, dim=1)
print(ux)
prints:
tensor([[ 255., 1., 255.],
[ 255., 1., 255.]])
tensor([[ 0.7071, 0.0028, 0.7071],
[ 0.7071, 0.0028, 0.7071]])
How can I un-normalize ux to get back the values
tensor([[ 255., 1., 255.],
[ 255., 1., 255.]])
from
tensor([[ 0.7071, 0.0028, 0.7071],
[ 0.7071, 0.0028, 0.7071]])
Various resources describe this process, such as https://discuss.pytorch.org/t/simple-way-to-inverse-normalize-a-batch-of-input-variable/12385/3, but none of them covers un-normalizing the result of F.normalize.

According to the documentation, F.normalize simply divides each vector by its norm, so to undo it you multiply the normalized vector by that norm (its original magnitude).
This means you still need access to the magnitude of the original vector ux. Without it the operation cannot be reversed, because the magnitude cannot be recovered from the normalized vector alone.
Here's how this can be done:
# I modified the input to make it more interesting, but you can use any other value
ux = torch.tensor(np.array([[255,1,255],[101,10,123]])).float()
magnitude = ux.norm(p=2, dim=1, keepdim=True) # NEW
ux = F.normalize(ux, p=2, dim=1)
ux_orig = ux * magnitude # NEW
print(ux_orig)
# Outputs:
# tensor([[255., 1., 255.],
# [101., 10., 123.]])
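The same round trip can be sketched as a small helper that always returns the norm together with the normalized tensor, so the magnitude is never lost. The helper name normalize_with_norm is hypothetical (it is not part of PyTorch), and the input values are the ones from the answer above.

```python
import torch
import torch.nn.functional as F

def normalize_with_norm(x, p=2, dim=1):
    """Hypothetical helper: return the normalized tensor and the norm needed to undo it."""
    # keepdim=True keeps the norm as a column so it broadcasts against x later
    norm = x.norm(p=p, dim=dim, keepdim=True)
    return F.normalize(x, p=p, dim=dim), norm

ux = torch.tensor([[255., 1., 255.], [101., 10., 123.]])
unit, norm = normalize_with_norm(ux)

# Multiplying by the stored norm reverses the division done by F.normalize
restored = unit * norm
print(torch.allclose(restored, ux))
```

Only the norm (one value per row here) has to be stored next to the normalized data, not the original tensor.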

Related

Broadcasting in PyTorch

To better understand how nn.BatchNorm2d works, I wanted to recreate the following lines of code:
input = torch.randint(1, 5, size=(2, 2, 3, 3)).float()
batch_norm = nn.BatchNorm2d(2)
output = (batch_norm(input))
print(input)
print(output)
tensor([[[[3., 3., 2.],
[3., 2., 2.],
[4., 2., 1.]],
[[1., 1., 2.],
[3., 2., 1.],
[1., 1., 1.]]],
[[[4., 3., 3.],
[4., 1., 4.],
[1., 3., 2.]],
[[2., 1., 4.],
[4., 2., 1.],
[4., 1., 3.]]]])
tensor([[[[ 0.3859, 0.3859, -0.6064],
[ 0.3859, -0.6064, -0.6064],
[ 1.3783, -0.6064, -1.5988]],
[[-0.8365, -0.8365, 0.0492],
[ 0.9349, 0.0492, -0.8365],
[-0.8365, -0.8365, -0.8365]]],
[[[ 1.3783, 0.3859, 0.3859],
[ 1.3783, -1.5988, 1.3783],
[-1.5988, 0.3859, -0.6064]],
[[ 0.0492, -0.8365, 1.8206],
[ 1.8206, 0.0492, -0.8365],
[ 1.8206, -0.8365, 0.9349]]]]
To achieve this, I first calculated the mean and variance for each channel:
my_mean = (torch.mean(input, dim=[0, 2, 3]))
my_var = (torch.var(input, dim=[0, 2, 3]))
print(my_mean, my_var)
tensor([2.8333, 2.5556])
tensor([1.3235, 1.3203])
This seems reasonable: I have the mean and variance for each channel across the whole batch. Then I wanted to simply subtract the mean from the input and divide by the variance. This is where problems arise, since I do not know how to properly set up the mean and variance. PyTorch does not seem to broadcast properly:
my_output = (input - my_mean) / my_var
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3
I then wanted to reshape the mean and variance into the appropriate shape, such that each value is repeated 25 times in a 5x5 shape.
First try:
my_mean.repeat(25).reshape(3, 5, 5)
But this also results in an error. What is the best way to achieve my goal?
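One possible way around the error, as a minimal sketch: broadcasting aligns shapes from the last dimension backwards, so statistics of shape (2,) are matched against the width dimension (size 3). Keeping the reduced dimensions with keepdim=True gives shape (1, 2, 1, 1), which broadcasts against (2, 2, 3, 3) without any repeat or reshape. The sketch assumes a freshly constructed nn.BatchNorm2d in training mode (default weight 1 and bias 0), which normalizes with the biased variance and divides by sqrt(var + eps) rather than by the variance itself. The variable names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randint(1, 5, size=(2, 2, 3, 3)).float()

batch_norm = nn.BatchNorm2d(2)  # training mode, default affine parameters
expected = batch_norm(x)

# Per-channel statistics with shape (1, 2, 1, 1) so they broadcast over N, H and W
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # biased variance

# Subtract the mean and divide by the standard deviation (with eps for stability)
manual = (x - mean) / torch.sqrt(var + batch_norm.eps)
print(torch.allclose(manual, expected, atol=1e-5))
```

If the statistics are already computed with shape (2,), indexing them as my_mean[None, :, None, None] produces the same (1, 2, 1, 1) shape.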

How to add to pytorch tensor at indices?

I have to admit, I'm a bit confused by the scatter* and index* operations - I'm not sure any of them do exactly what I'm looking for, which is very simple:
Given some 2-D tensor
z = tensor([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
And a list (or tensor?) of 2-d indexes:
inds = tensor([[0, 0],
[1, 1],
[1, 2]])
I want to add a scalar to z at those indexes (and do it efficiently):
znew = z.something_add(inds, 3)
->
znew = tensor([[4., 1., 1., 1.],
[1., 4., 4., 1.],
[1., 1., 1., 1.]])
If I have to I can make that scalar a tensor of whatever shape (where all elements = 3), but I'd rather not...
You must provide two lists to your indexing. The first having the row positions and the second the column positions. In your example, it would be:
z[[0, 1, 1], [0, 1, 2]] += 3
torch.Tensor indexing follows Numpy. See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details.
This code achieves what you want:
z_new = z.clone() # copy the tensor
z_new[inds[:, 0], inds[:, 1]] += 3 # modify selected indices of new tensor
In PyTorch, you can index each axis of a tensor with another tensor.
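The indexing approach above can be sketched end to end as follows. The second half is an addition that goes beyond the original answers: with repeated index pairs, += through indexing should not be relied on to add once per occurrence, whereas index_put with accumulate=True sums every contribution. The dup tensor is a made-up example for that case.

```python
import torch

z = torch.ones(3, 4)
inds = torch.tensor([[0, 0], [1, 1], [1, 2]])

# Split the (row, col) pairs into one index tensor per axis
rows, cols = inds[:, 0], inds[:, 1]

z_new = z.clone()          # leave z untouched
z_new[rows, cols] += 3     # add the scalar at each (row, col) pair
print(z_new)

# Made-up example with a repeated pair: accumulate=True adds 3 once per occurrence
dup = torch.tensor([[0, 0], [0, 0]])
values = torch.full((len(dup),), 3.0)
z_acc = z.index_put((dup[:, 0], dup[:, 1]), values, accumulate=True)
print(z_acc[0, 0])
```

index_put (without the trailing underscore) returns a new tensor, so no explicit clone is needed in the second case.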

PyTorch: select values from the last tensor dimension with indices from another tensor with a smaller dimension

I have a tensor a with three dimensions. The first dimension corresponds to minibatch size, the second to the sequence length, and the third to the feature dimension. E.g.,
>>> a = torch.arange(1, 13, dtype=torch.float).view(2,2,3) # Consider the values of a to be random
>>> a
tensor([[[ 1., 2., 3.],
[ 4., 5., 6.]],
[[ 7., 8., 9.],
[10., 11., 12.]]])
I have a second, two-dimensional tensor b. Its first dimension corresponds to the minibatch size and its second dimension to the sequence length. It contains indices into the third dimension of a. Since the third dimension has size 3, b can contain the values 0, 1 or 2. E.g.,
>>> b = torch.LongTensor([[0, 2],[1,0]])
>>> b
tensor([[0, 2],
[1, 0]])
I want to obtain a tensor c that has the shape of b and contains all the values of a that are referenced by b.
In the scenario above I would like to have:
c = torch.empty(2,2)
c[0,0] = a[0, 0, b[0,0]]
c[1,0] = a[1, 0, b[1,0]]
c[0,1] = a[0, 1, b[0,1]]
c[1,1] = a[1, 1, b[1,1]]
>>> c
tensor([[ 1., 6.],
[ 8., 10.]])
How can I create the tensor c quickly? I also want c to be differentiable (so that I can call .backward()). I am not too familiar with PyTorch, so I am not sure whether a differentiable version of this exists.
As an alternative, instead of c having the same shape as b I could also use a c with the same shape of a, having only zeros, but at the places referenced by b ones. Then I could multiply a and c to obtain a differentiable tensor.
Like follows:
c = torch.zeros(2,2,3, dtype=torch.float)
c[0,0,b[0,0]] = 1
c[1,0,b[1,0]] = 1
c[0,1,b[0,1]] = 1
c[1,1,b[1,1]] = 1
>>> a*c
tensor([[[ 1., 0., 0.],
[ 0., 0., 6.]],
[[ 0., 8., 0.],
[10., 0., 0.]]])
Let's declare the necessary variables first (notice requires_grad in a's initialization; we will use it to ensure differentiability):
a = torch.arange(1,13,dtype=torch.float32,requires_grad=True).reshape(2,2,3)
b = torch.LongTensor([[0, 2],[1,0]])
Let's reshape a to squash the minibatch and sequence dimensions into one:
temp = a.reshape(-1,3)
so temp now looks like:
tensor([[ 1., 2., 3.],
[ 4., 5., 6.],
[ 7., 8., 9.],
[10., 11., 12.]], grad_fn=<AsStridedBackward>)
Notice that each value of b can now be used to pick one column from the corresponding row of temp. Now we do:
c = temp[range(len(temp)), b.view(-1)].view(b.size())
Here range(len(temp)) selects every row, and the flattened b, i.e. b.view(-1), selects the corresponding column in each row. Finally, .view(b.size()) brings the result to the same shape as b.
If we print c now:
tensor([[ 1., 6.],
[ 8., 10.]], grad_fn=<ViewBackward>)
The presence of grad_fn=... shows that c requires gradient, i.e. it is differentiable.
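As an alternative to flattening, the same selection can be sketched with torch.gather, which picks values along one dimension using an index tensor of matching rank. This is a different technique from the reshape-and-index answer above, shown on the same a and b. The backward call at the end is only there to illustrate that gradients flow to the selected entries.

```python
import torch

a = torch.arange(1, 13, dtype=torch.float32).reshape(2, 2, 3).requires_grad_()
b = torch.tensor([[0, 2], [1, 0]])

# gather needs the index to have the same number of dimensions as a,
# so add a trailing dimension of size 1 and remove it again afterwards
c = a.gather(dim=2, index=b.unsqueeze(-1)).squeeze(-1)
print(c)

# Gradients reach exactly the entries of a that were selected
c.sum().backward()
print(a.grad)
```

The result has the shape of b, and a.grad is 1 at the positions referenced by b and 0 elsewhere, which corresponds to the one-hot mask described in the question.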

How to feed a model with "a list of outputs"?

Sorry for the title, but I couldn't come up with a better description here.
I am trying to apply batches for training on a model which should have 13 fully connected output layers. Each output layer has only two nodes (but are fully connected as stated).
Building the model's output looks like this:
outputs = list()
for i in range(num_labels):
    out_y = Dense(2, activation='softmax', name='out_{:d}'.format(i))(convolution_layer)
    outputs.append(out_y)
self.model = Model(input=inputs, output=outputs)
However, I can't manage to feed this model. I've tried to go with a [batch_size, 13, 1, 2] sized output array:
y = np.zeros((batch_size, 13, 1, 2))
But for a batch of size 2 I get:
ValueError: The model expects 13 input arrays, but only received one array. Found: array with shape (2, 13, 1, 2)
I've tried several other things, but it's simply not clear to me what the input for the model should look like.
How can I train this model?
I have also tried to pass a list of lists of numpy arrays:
where the first level of the batch represent the sample (here 2) and the second level is the sample with the list of 13 numpy arrays. Yet I am getting:
ValueError: Error when checking model target: you are passing a list as input to your model, but the model expects a list of 13 Numpy arrays instead. The list you passed was: [[array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 0., 1.]), array([ 1., 0.]), array([
As suggested, I also tried to return a list() of numpy arrays of size [13,2]:
Where the error becomes:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 13 arrays but instead got the following list of 2 arrays: [array([[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 0., 1.],
[ 1., 0.],
[ ...
The code
Below you can find the current code which generates one sample in sample_generator and a full batch in batch_generator (which uses sample_generator).
Please note: the code now shows how I generate a list() of [13,2] ndarrays, where the number of such ndarrays in that list is defined by batch_size.
def batch_generator(w2v, file_path, meta_info, batch_size, sample_generator_fn, embedding_size):
    try:
        x = np.zeros((batch_size, meta_info.max_sequence_length, embedding_size, 1))
        y = list() #np.zeros((batch_size, 13, 1, 2))
        file = open(file_path)
        while True:
            x[:] = 0.0
            #y[:] = 0.0
            for batch in range(batch_size):
                sentence_info_json = file.readline()
                if sentence_info_json == '':
                    file.seek(0)
                    sentence_info_json = file.readline()
                sample = sample_generator_fn(w2v, sentence_info_json, meta_info)
                if not sample:
                    continue
                sentence_embedding = sample[0]
                final_length = len(sentence_embedding)
                x[batch, :final_length, :, 0] = sentence_embedding
                y.append(sample[1])
            shuffled = np.asarray(range(batch_size))
            np.random.shuffle(shuffled)
            x = x[shuffled]
            #y = y[shuffled]
            y = [y[i] for i in shuffled]
            yield x, y
    except Exception as e:
        print('Error in generator.')
        print(e)
        raise e
def sample_generator(w2v, sentence_info_json, meta_info):
    if not sentence_info_json:
        print('???')
    sentence_info = json.loads(sentence_info_json)
    tokens = [token['word'] for token in sentence_info['corenlp']['tokens']]
    sentence = Sentence(tokens=tokens)
    sentence_embedding = w2v.get_word_vectors(sentence.tokens.tolist())
    sentence_embedding = np.asarray([word_vector for word_vector in sentence_embedding if word_vector is not None])
    final_length = len(sentence_embedding)
    if final_length == 0:
        return None
    y = np.zeros((2, len(meta_info.category_dict)))
    y[1, :] = 1.
    #y_list = []
    y_tar = np.zeros((len(meta_info.category_dict), 2))
    for i in range(len(meta_info.category_dict)):
        y_tar[i][1] = 1.0
        # y_list.append(np.asarray([0.0, 1.0]))
    for opinion in sentence_info['opinions']:
        index = meta_info.category_dict[opinion['category']]
        y_tar[index][0] = 1.0
        y_tar[index][1] = 0.0
        #y_list[index][0] = 1.0
        #y_list[index][1] = 0.0
    return sentence_embedding, y_tar
As requested, the call to fit_generator()
cnn.model.fit_generator(generator=batch_generator(word2vec,
train_file, train_meta_info,
num_batches, sample_generator,
embedding_size),
samples_per_epoch=2000,
nb_epoch=2,
# validation_data=batch_generator(test_file_path, train_meta_info),
# nb_val_samples=100,
verbose=True)
Your targets should be a list, as the error states. Each element of the list should be a NumPy array of shape [batch_size, nb_outputs], so in your case that is a list of 13 arrays, each of shape [batch_size, 2].
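A minimal NumPy sketch of that layout, assuming each sample's target is a (13, 2) array like y_tar in the question. The samples list here is placeholder data, and batch_size = 4 is an arbitrary choice for illustration. The idea is to stack the per-sample targets and then slice along the label axis, so the outer list runs over the 13 outputs rather than over the samples.

```python
import numpy as np

batch_size, num_labels = 4, 13

# Placeholder per-sample targets, each shaped (num_labels, 2) like y_tar
samples = [np.tile([0.0, 1.0], (num_labels, 1)) for _ in range(batch_size)]

# Stack to (batch_size, num_labels, 2), then slice out one array per output layer
stacked = np.stack(samples)
y = [stacked[:, i, :] for i in range(num_labels)]

print(len(y), y[0].shape)
```

In the generator from the question, this would mean yielding x together with such a list of 13 arrays instead of a list of batch_size arrays of shape [13, 2].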

One-vs-Rest algorithm and out-of-the-box multiclass algorithm gives different results

Can someone explain why OneVsRestClassifier gives different results than the out-of-the-box algorithm?
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
X = [[1,2],[1,3],[4,2],[2,3],[1,4]]
y = [1,2,3,2,1]
X_pred = [[2,4], [5,4], [3,7]]
dummy_clf = OneVsRestClassifier(SGDClassifier(verbose=0, class_weight="auto", loss='modified_huber', random_state=0)) # first case
#dummy_clf = SGDClassifier(verbose=0, class_weight="auto", loss='modified_huber', random_state=0) # second case
dummy_clf.fit(X, y)
dummy_clf.predict_proba(X_pred)
First case:
array([[ 0.5, 0.5, 0. ],
[ 0. , 1. , 0. ],
[ 0.5, 0.5, 0. ]])
Second case:
array([[ 0., 1., 0.],
[ 0., 1., 0.],
[ 0., 1., 0.]])
OneVsRest gives you the probability of X_pred for all of the classes, thus the first and last test cases have a value for multiple classes (that sum to 1). The classifier is trained on all classes.
OneVsOne trains a classifier on all class pairs. For all class pairs, the class predicted most is the winner, so you only get one prediction per instance.

Resources