I am working on a project that involves NDCG (normalized discounted cumulative gain), and I understand the method's underlying calculations.
So I imported ndcg_score from sklearn.metrics and passed a ground truth array and another array to ndcg_score to calculate their NDCG score. The ground truth array has the values [5, 4, 3, 2, 1] while the other array has the values [5, 4, 3, 2, 0], so only the last element differs between the two arrays.
from numpy import array
from sklearn.metrics import ndcg_score
user_ndcg = ndcg_score(array([[5, 4, 3, 2, 1]]), array([[5, 4, 3, 2, 0]]))
I was expecting the result to be around 0.96233 (9.88507/10.27192). However, user_ndcg was actually 1.0, which surprised me. Initially I thought this was due to rounding, but that is not the case: when I ran an experiment on another pair of arrays, ndcg_score(array([[5, 4, 3, 2, 1]]), array([[5, 4, 0, 2, 0]])), it correctly returned 0.98898.
Does anyone know whether this could be a bug with the sklearn ndcg_score function, or whether I was doing something wrong with my code?
I am assuming you are trying to predict six different classes for this problem (0, 1, 2, 3, 4 and 5). If you want to evaluate the ndcg for five different observations, you have to pass the function two arrays of shape (5, 6) each.
That is, you have to transform your ground truth and predictions into arrays of five rows and six columns each.
import numpy as np
from sklearn.metrics import ndcg_score

# Current form of ground truth and predictions
y_true = [5, 4, 3, 2, 1]
y_pred = [5, 4, 3, 2, 0]
# Transform ground truth to ndarray
y_true_nd = np.zeros(shape=(5, 6))
y_true_nd[np.arange(5), y_true] = 1
# Transform predictions to ndarray
y_pred_nd = np.zeros(shape=(5, 6))
y_pred_nd[np.arange(5), y_pred] = 1
# Calculate ndcg score
ndcg_score(y_true_nd, y_pred_nd)
> 0.8921866522394966
Here's what y_true_nd and y_pred_nd look like:
y_true_nd
array([[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0., 0.]])
y_pred_nd
array([[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.]])
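As a cross-check on the numbers in the question, here is a minimal pure-Python sketch of the NDCG definition, assuming linear gains, a log2 position discount, and no tied scores. The helper names dcg and ndcg are illustrative and are not sklearn's. The sketch ranks the items by the second argument and only then sums the true relevances; under that reading [5, 4, 3, 2, 0] induces the same ordering as [5, 4, 3, 2, 1], so the score is 1.0. Treating the second array as the gains themselves gives the roughly 0.962 figure the question expected.

```python
import math

def dcg(gains):
    """Discounted cumulative gain of gains listed in ranked order."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))

def ndcg(y_true, y_score):
    """NDCG where y_score is used only to rank the items (no tie handling)."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    ranked_gains = [y_true[i] for i in order]
    ideal_gains = sorted(y_true, reverse=True)
    return dcg(ranked_gains) / dcg(ideal_gains)

y_true = [5, 4, 3, 2, 1]
y_score = [5, 4, 3, 2, 0]

ranking_based = ndcg(y_true, y_score)     # scores only decide the order
gain_based = dcg(y_score) / dcg(y_true)   # scores treated as gains
print(ranking_based, gain_based)
```

The second example from the question, [5, 4, 0, 2, 0], contains tied scores, which this sketch deliberately does not handle.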
Given a tensor:
A = torch.tensor([2., 3., 4., 5., 6., 7.])
Then, give each element in A an id:
id = torch.arange(A.shape[0], dtype = torch.int) # tensor([0,1,2,3,4,5])
In other words, id of 2. in A is 0 and id of 3. in A is 1:
2. -> 0
3. -> 1
4. -> 2
5. -> 3
6. -> 4
7. -> 5
Then, I have a new tensor:
B = torch.tensor([3., 6., 6., 5., 4., 4., 4.])
In PyTorch, is there any way to map each element in B to its id?
In other words, I want to obtain tensor([1, 4, 4, 3, 2, 2, 2]), in which each element is the id of the corresponding element in B.
What you ask can be done by slowly iterating over the whole of B, checking each element against all elements of A, and retrieving the index of the match:
In [*]: for x in B:
...: print(torch.where(x==A)[0][0])
...:
...:
tensor(1)
tensor(4)
tensor(4)
tensor(3)
tensor(2)
tensor(2)
tensor(2)
Here I used torch.where to find all the True elements in the matrix x==A, where x takes the value of each element of B. This is really slow, but it allows you to add handling for cases where some elements of B do not appear in A.
The quick and dirty method to get what you want with linear algebra operations is:
In [*]: (B.view(-1,1) == A).int().argmax(dim=1)
Out[*]: tensor([1, 4, 4, 3, 2, 2, 2])
This trick takes advantage of the fact that argmax returns the first 'max' index of each vector in dim=1.
Big warning here: if an element does not exist in A, no error will be raised and the result will silently be 0 for every such element.
In [*]: C = torch.tensor([100, 1000, 1, 3, 9999])
In [*]: (C.view(-1,1) == A).int().argmax(dim=1)
Out[*]: tensor([0, 0, 0, 1, 0])
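One way to keep the vectorised comparison while guarding against that failure mode is to check which rows of the comparison matrix contain a match at all. The sketch below assumes missing elements should map to -1; that sentinel and the helper name map_to_ids are illustrative choices, not part of PyTorch.

```python
import torch

def map_to_ids(a: torch.Tensor, b: torch.Tensor, missing: int = -1) -> torch.Tensor:
    """Index in `a` of each element of `b`, or `missing` if it is absent."""
    matches = b.view(-1, 1) == a          # shape (len(b), len(a))
    ids = matches.int().argmax(dim=1)     # first match, or 0 if there is none
    found = matches.any(dim=1)            # True where a match exists
    return torch.where(found, ids, torch.full_like(ids, missing))

A = torch.tensor([2., 3., 4., 5., 6., 7.])
C = torch.tensor([100., 1000., 1., 3., 9999.])
print(map_to_ids(A, C))
```

The memory cost is unchanged: the comparison still builds a len(b) by len(a) boolean matrix.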
I don't think there is such a function in PyTorch to map a tensor.
It seems quite unreasonable to solve this by comparing each value from B to every value from A.
Here are two possible solutions.
Using a dictionary as a map
You can use a dictionary. It is not much of a pure-PyTorch solution, but it will most probably be the fastest and safest way...
Just create a dict to map each element to an id, then use it to map B:
>>> map = {x.item(): i for i, x in enumerate(A)}
>>> torch.tensor([map[x.item()] for x in B])
tensor([1, 4, 4, 3, 2, 2, 2])
Change of basis approach
An alternative only using torch.Tensors. This will require the values you want to map - the content of A - to be integers because they will be used to index a tensor.
Encode the content of A into one-hot encodings:
>>> A_enc = torch.zeros((int(A.max())+1,)*2)
>>> A_enc[A, torch.arange(A.shape[0])] = 1
>>> A_enc
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.]])
We'll use A_enc as our basis to map integers:
>>> v = torch.argmax(A_enc, dim=0)
>>> v
tensor([0, 0, 0, 1, 2, 3, 4, 5])
Now, given an integer, for instance x=3, we can encode it as a one-hot vector: x_enc = [0, 0, 0, 1, 0, 0, 0, 0]. Then, use v to map it. With a simple dot product you can get the mapping of x_enc: here <v, x_enc> gives 1, which is the desired result (the first element of the mapped B). But instead of encoding one x at a time, we will compute the matrix multiplication between v and the encoded B. First encode B, then compute the matrix multiplication v @ B_enc:
>>> B_enc = torch.zeros(A_enc.shape[0], B.shape[0])
>>> B_enc[B, torch.arange(B.shape[0])] = 1
>>> B_enc
tensor([[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 1.],
[0., 0., 0., 1., 0., 0., 0.],
[0., 1., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0.]])
>>> v @ B_enc.long()
tensor([1, 4, 4, 3, 2, 2, 2])
Note - you will have to define your tensors with Long type.
There is a similar issue for numpy so my answer is heavily inspired by their solution. I will compare some of the mentioned methods using perfplot. I will also generalize the problem to apply a mapping to a tensor (yours is just a specific case).
For the analysis, I will assume the mapping contains all the unique elements in the tensor and that the number of elements is small and constant.
import torch


def apply(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    return b.clone().apply_(lambda x: mapping.__getitem__(x))


def bucketize(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    # From `https://stackoverflow.com/questions/13572448`.
    palette, key = zip(*mapping.items())
    key = torch.tensor(key)
    palette = torch.tensor(palette)
    index = torch.bucketize(b.ravel(), palette)
    remapped = key[index].reshape(b.shape)
    return remapped


def iterate(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    mapping = {k.item(): v.item() for k, v in zip(a, ids)}
    return torch.tensor([mapping[x.item()] for x in b])


def argmax(a: torch.Tensor, ids: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    return (b.view(-1, 1) == a).int().argmax(dim=1)


if __name__ == "__main__":
    import perfplot

    a = torch.arange(2, 8)
    ids = torch.arange(0, 6)
    perfplot.show(
        setup=lambda n: torch.randint(2, 8, (n,)),
        kernels=[
            lambda x: apply(a, ids, x),
            lambda x: bucketize(a, ids, x),
            lambda x: iterate(a, ids, x),
            lambda x: argmax(a, ids, x),
        ],
        labels=["apply", "bucketize", "iterate", "argmax"],
        n_range=[2 ** k for k in range(25)],
        xlabel="len(a)",
    )
Running this yields the following plot:
Hence depending on the number of elements in your tensor you can pick either the argmax method (with the caveats mentioned and the restriction that you have to map the values from 0 to N), apply, or bucketize.
Now, if we increase the number of elements to be mapped to, let's say, tens of thousands, i.e. a = torch.arange(2, 10002) and ids = torch.arange(0, 10000), we get the following results:
This means the speed advantage of bucketize only becomes visible for larger arrays, but it still outperforms the other methods (the argmax method was killed, so I had to remove it).
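The bucketize variant relies on a binary search over sorted keys. The plain-Python sketch below mirrors that idea with the standard-library bisect module rather than torch, so it only illustrates the lookup logic; the function name remap_sorted is hypothetical. It sorts the keys first, because a binary search is only meaningful on sorted boundaries, and it raises KeyError for values that are not among the keys.

```python
from bisect import bisect_left

def remap_sorted(keys, ids, values):
    """Map each value to the id paired with the equal key via binary search."""
    # Sort keys together with their ids so the binary search is valid.
    pairs = sorted(zip(keys, ids))
    sorted_keys = [k for k, _ in pairs]
    sorted_ids = [i for _, i in pairs]
    out = []
    for v in values:
        pos = bisect_left(sorted_keys, v)   # leftmost slot where v would fit
        if pos == len(sorted_keys) or sorted_keys[pos] != v:
            raise KeyError(v)               # value is not among the keys
        out.append(sorted_ids[pos])
    return out

keys = [2, 3, 4, 5, 6, 7]
ids = [0, 1, 2, 3, 4, 5]
result = remap_sorted(keys, ids, [3, 6, 6, 5, 4, 4, 4])
print(result)
```

Each lookup costs O(log len(keys)), compared with O(len(keys)) per element for the all-pairs comparison used by the argmax method.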
Last, if we have a mapping that does not have all keys present in the tensor we can just update a dictionary with all unique keys:
mapping = {x.item(): x.item() for x in torch.unique(a)}
mapping.update({k.item(): v.item() for k, v in zip(a, ids)})
Now, if the number of unique elements you want to map is orders of magnitude larger than the array, computing this may shift the value of n at which bucketize becomes faster than apply (since for apply you can swap mapping.__getitem__(x) for mapping.get(x, x)).
I guess there is an easier way. Create an array as a mapper, convert your tensors to np.ndarray first, and then index into it.
import numpy as np
a_array = A.numpy().astype(int)
b_array = B.numpy().astype(int)
mapper = np.zeros(10)
for i, x in enumerate(a_array):
    mapper[x] = i
out = torch.Tensor(mapper[b_array])
I have a tensor that looks like
coords = torch.Tensor([[0, 0, 1, 2],
[0, 2, 2, 2]])
The first row is the x-coordinates of objects on a grid and the second row is the corresponding y-coordinates.
I need a differentiable way (i.e. gradients can flow) to go from this tensor to the corresponding "grid" tensor, where a 1 represents the presence of an object in that location (row index, column index) and 0 represents no object:
grid = torch.Tensor([[1, 0, 1],
[0, 0, 1],
[0, 0, 1]])
In general, coords can be large (the grid size is 300x300). If coords was a sparse tensor I could simply call to_dense on it, but for various reasons specific to my application I cannot store coords as sparse. Additionally, I cannot create a new sparse tensor from coords and call to_dense on it because creating a new tensor is not differentiable.
Any help is appreciated!
I'm not sure what you mean by 'differentiable', but here's a simple way to do it using advanced indexing.
coords = coords.long()
grid = torch.zeros(3, 3)  # empty grid to fill in
grid[coords[0], coords[1]] = 1
tensor([[1., 0., 1.],
[0., 0., 1.],
[0., 0., 1.]])
I don't think Torch has detailed documentation about this, but NumPy does (the behaviour is probably very similar for torch).
This is also possible:
coords = coords.long()
grid[coords[0],coords[1]] = torch.Tensor([1,2,3,4])
tensor([[1., 0., 2.],
[0., 0., 3.],
[0., 0., 4.]])
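On the differentiability point: integer indices themselves carry no gradient, so gradients can only flow through the values written into the grid. A minimal sketch of that, assuming the written values come from a tensor that requires grad (the name values is illustrative) and using the out-of-place index_put so that autograd can track the result:

```python
import torch

coords = torch.tensor([[0, 0, 1, 2],
                       [0, 2, 2, 2]])          # integer (long) indices
values = torch.ones(4, requires_grad=True)     # what gets written into the grid

# Out-of-place write: returns a new tensor that depends on `values`.
grid = torch.zeros(3, 3).index_put((coords[0], coords[1]), values)

grid.sum().backward()
print(grid)
print(values.grad)   # gradient with respect to the written values
```

If the gradient is needed with respect to the coordinates themselves, hard indexing of this kind will not provide it.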
Say
coords = [[0, 0, 1, 2],
[0, 2, 2, 2]]
Then:
torch.stack([torch.stack(x) for x in coords])
Keras offers a couple of helper functions to process text:
texts_to_sequences and texts_to_matrix
It seems that most people use texts_to_sequences, but it is unclear to me why one is picked over the other and under what conditions you might want to use texts_to_matrix.
texts_to_matrix is easy to understand. It converts texts to a matrix whose columns refer to words and whose cells carry the number of occurrences or the presence of each word. Such a design is useful for direct application of ML algorithms (logistic regression, decision tree, etc.)
texts_to_sequences creates lists of integers, where each integer represents a word. Certain layers, such as Keras embeddings, require this format as input.
Consider the example below.
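import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer  # assumes the legacy Keras text preprocessing API

txt = ['Python is great and useful', 'Python is easy to learn', 'Python is easy to implement']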
txt = pd.Series(txt)
tok = Tokenizer(num_words=10)
tok.fit_on_texts(txt)
mat_texts = tok.texts_to_matrix(txt, mode='count')
mat_texts
Output:
array([[0., 1., 1., 0., 0., 1., 1., 1., 0., 0.],
[0., 1., 1., 1., 1., 0., 0., 0., 1., 0.],
[0., 1., 1., 1., 1., 0., 0., 0., 0., 1.]])
tok.get_config()['word_index']
Output:
'{"python": 1, "is": 2, "easy": 3, "to": 4, "great": 5, "and": 6, "useful": 7, "learn": 8, "implement": 9}'
mat_texts_seq = tok.texts_to_sequences(txt)
mat_texts_seq
Output:
[[1, 2, 5, 6, 7], [1, 2, 3, 4, 8], [1, 2, 3, 4, 9]]
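To make the difference between the two outputs concrete, here is a plain-Python sketch that rebuilds both representations from the word index shown above. It does not call Keras: the word_index dictionary is copied from that output, and the tokenisation (lower-casing and splitting on spaces) is a simplifying assumption.

```python
word_index = {"python": 1, "is": 2, "easy": 3, "to": 4, "great": 5,
              "and": 6, "useful": 7, "learn": 8, "implement": 9}
texts = ["Python is great and useful",
         "Python is easy to learn",
         "Python is easy to implement"]
num_words = 10

# Sequence form: one variable-length list of word ids per text.
sequences = [[word_index[w] for w in t.lower().split()] for t in texts]

# Matrix form (count mode): one fixed-width row per text, where column j
# holds how often the word with id j occurs in that text.
matrix = [[0] * num_words for _ in texts]
for row, seq in zip(matrix, sequences):
    for idx in seq:
        row[idx] += 1

print(sequences)
print(matrix)
```

The sequence form keeps word order and varies in length, which suits embedding layers; the matrix form discards order but has a fixed width, which suits models that expect one feature vector per text.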
I have to admit, I'm a bit confused by the scatter* and index* operations - I'm not sure any of them do exactly what I'm looking for, which is very simple:
Given some 2-D tensor
z = tensor([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
And a list (or tensor?) of 2-d indexes:
inds = tensor([[0, 0],
[1, 1],
[1, 2]])
I want to add a scalar to z at those indexes (and do it efficiently):
znew = z.something_add(inds, 3)
->
znew = tensor([[4., 1., 1., 1.],
[1., 4., 4., 1.],
[1., 1., 1., 1.]])
If I have to I can make that scalar a tensor of whatever shape (where all elements = 3), but I'd rather not...
You must provide two lists when indexing: the first holds the row positions and the second the column positions. In your example, it would be:
z[[0, 1, 1], [0, 1, 2]] += 3
torch.Tensor indexing follows Numpy. See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details.
This code achieves what you want:
z_new = z.clone() # copy the tensor
z_new[inds[:, 0], inds[:, 1]] += 3 # modify selected indices of new tensor
In PyTorch, you can index each axis of a tensor with another tensor.
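One caveat with the indexed += form: if the same position appears more than once in inds, it is still incremented only once. If repeated indices should accumulate, index_put_ with accumulate=True is one option. The sketch below assumes that is the desired behaviour and uses a made-up inds containing a duplicate.

```python
import torch

z = torch.ones(3, 4)
inds = torch.tensor([[0, 0],
                     [0, 0],     # deliberate duplicate of the first index
                     [1, 2]])

rows, cols = inds[:, 0], inds[:, 1]
z_new = z.clone()
# accumulate=True adds the value once per occurrence of each index.
z_new.index_put_((rows, cols), torch.tensor(3.0), accumulate=True)
print(z_new)
```

With unique indices this gives the same result as the += version above.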