Calculate accuracy of a tensor compared to a target tensor - pytorch

I have my output tensor like this:
tensor([[0.1834, 0.8166],
[0.3031, 0.6969],
[0.3104, 0.6896],
[0.3065, 0.6935],
[0.3060, 0.6940],
[0.2963, 0.7037],
[0.2340, 0.7660],
[0.2302, 0.7698],
[0.2581, 0.7419],
[0.2081, 0.7919]], grad_fn=<PowBackward0>)
I would like to first convert my output tensor to something like this:
tensor([1., 1., 1......])
where the value indicates the index of the larger value (for example, 0.8166 > 0.1834, so the first element is 1).
Any suggestions would be appreciated!

That's literally just your_tensor.argmax(dim=1).
your_tensor.argmax(dim=1).float() if you truly need it to be float.
After that, the accuracy can be calculated as (my_tensor == target_tensor).sum().item() / len(target_tensor), for example. (See also this question.)
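Putting the two steps together, a minimal sketch (the output and target tensors here are made up for illustration):

```python
import torch

# Hypothetical model output (two class scores per sample) and targets
output = torch.tensor([[0.1834, 0.8166],
                       [0.3031, 0.6969],
                       [0.7660, 0.2340]])
target = torch.tensor([1., 1., 1.])

pred = output.argmax(dim=1).float()   # index of the larger value per row
accuracy = (pred == target).sum().item() / len(target)
print(pred)       # tensor([1., 1., 0.])
print(accuracy)   # 2 of 3 predictions match
```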

To get the index with the largest value you should use torch.argmax.
Related PyTorch commands that can be used here are also:
torch.max: when used with the dim argument, this command returns both the max value and the index of that value along the specified dimension.
torch.topk: returns the top k elements along a dimension; with k=1 it works like torch.max.
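A quick sketch of how the three calls relate, on a small made-up tensor:

```python
import torch

x = torch.tensor([[0.2, 0.8],
                  [0.9, 0.1]])

indices = x.argmax(dim=1)             # just the indices
values, max_idx = x.max(dim=1)        # max values and their indices
top_v, top_i = x.topk(k=1, dim=1)     # same info, but keeps the reduced dim

print(indices)            # tensor([1, 0])
print(max_idx)            # tensor([1, 0])
print(top_i.squeeze(1))   # tensor([1, 0])
```

Note that topk keeps the reduced dimension (shape (2, 1) here), so you may want squeeze to match the argmax/max output.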

Related

How does this function traverse the tokens through slicing? Can you please explain how it selects the elements in the list?

The code I'm referring to:
predicted_index = torch.argmax(predictions[0, -1, :]).item()
This is a tensor, not a list, the major differences being:
a tensor has a single specified dtype (usually float32 in PyTorch)
operations on it run faster
Your predictions are a 3D tensor, from which you are taking:
the 0th element along the first (batch) dimension
the last element (-1 index) along the second dimension
all of the elements along the third dimension (:)
Essentially you are left with a vector after the slicing.
torch.argmax returns the index under which the largest element resides, for example:
torch.argmax(torch.tensor([-1, 0, 1.5, 1, 0])) # would return 2
argmax is implemented in C++; it keeps the index of the largest value found so far and returns that index at the end (O(n) complexity).
.item() converts a one-element tensor to its Python counterpart (usually float for any floating-point type, int for the integer family, etc.).
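The whole pipeline can be seen on a tiny made-up logits tensor of shape (batch, sequence_length, vocab_size):

```python
import torch

# Hypothetical predictions: batch of 1, sequence of 2, vocabulary of 3
predictions = torch.tensor([[[0.1, 0.3, 0.2],
                             [0.5, 0.1, 0.9]]])

last_step = predictions[0, -1, :]                 # vector of length vocab_size
predicted_index = torch.argmax(last_step).item()  # plain Python int

print(last_step)        # tensor([0.5000, 0.1000, 0.9000])
print(predicted_index)  # 2
```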

How to create a tensor by accessing specific values at given indices of a 2 X 2 tensor in pytorch?

Suppose
mat = torch.rand((5,7)) and I want to get values from 1st dimension (here, 7) by passing the indices, say idxs=[0,4,2,3,6]. The way I am able to do it now is by doing mat[[0,1,2,3,4],idxs]. I expected mat[:,idxs] to work, but it didn't. Is the first option the only way or is there a better way?
torch.gather is what you are looking for:
torch.gather(mat, 1, torch.tensor(idxs)[:, None])
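A quick sketch showing that gather picks mat[i, idxs[i]] for each row i, and that it matches the fancy-indexing version from the question:

```python
import torch

torch.manual_seed(0)
mat = torch.rand((5, 7))
idxs = [0, 4, 2, 3, 6]

# gather along dim 1: row i takes the element at column idxs[i]
gathered = torch.gather(mat, 1, torch.tensor(idxs)[:, None]).squeeze(1)

# equivalent to the explicit row-index version from the question
fancy = mat[torch.arange(5), idxs]
print(torch.equal(gathered, fancy))  # True
```

(mat[:, idxs] does not work here because it selects all five idxs columns for every row, giving a 5x5 result rather than one value per row.)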

Sklearn Binning Process - It is possible to return a interval?

I'm trying to use KBinsDiscretizer from sklearn.preprocessing, but it returns integer values as 1, 2, ..., N (representing the interval). Is it possible to return the actual interval, such as (0.2, 0.5), or is this not implemented yet?
Based on the docs (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KBinsDiscretizer.html):
Attributes:
n_bins_ : int array, shape (n_features,)
Number of bins per feature. Bins whose width are too small (i.e., <= 1e-8) are removed with a warning.
bin_edges_ : array of arrays, shape (n_features,)
The edges of each bin. Contain arrays of varying shapes (n_bins_, ). Ignored features will have empty arrays.
This would mean a no in your case. There is also another hint:
The inverse_transform function converts the binned data into the original feature space. Each value will be equal to the mean of the two bin edges.
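That said, the bin_edges_ attribute quoted above can be used to map each ordinal bin code back to its interval yourself. A sketch with made-up data and a single feature:

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

X = np.array([[0.1], [0.4], [0.7], [0.95]])
est = KBinsDiscretizer(n_bins=2, encode="ordinal", strategy="uniform")
codes = est.fit_transform(X)[:, 0].astype(int)   # bin index per sample

edges = est.bin_edges_[0]                        # e.g. [0.1, 0.525, 0.95]
intervals = [(edges[c], edges[c + 1]) for c in codes]
print(intervals[0])                              # interval of the first sample
```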

Copying data from one tensor to another using bit masking

import numpy as np
import torch
a = torch.zeros(5)
b = torch.tensor(tuple((0,1,0,1,0)),dtype=torch.uint8)
c= torch.tensor([7.,9.])
print(a[b].size())
a[b]=c
print(a)
torch.Size([2])
tensor([0., 7., 0., 9., 0.])
I am struggling to understand how this works. I initially thought the above code was using fancy indexing, but I realized that values from the c tensor are getting copied to the positions marked 1. Also, if I don't specify the dtype of b as uint8, then the above code does not work. Can someone please explain the mechanism of the above code to me?
Indexing with arrays works the same as in numpy and most other vectorized math packages I am aware of. There are two cases:
When b is of type uint8 (think boolean; older PyTorch versions did not distinguish bool from uint8, while newer ones have a dedicated torch.bool dtype for masks), a[b] is a 1-d array containing the subset of values of a (a[i]) for which the corresponding entry in b (b[i]) was nonzero. These values are aliased to the original a, so if you modify them, their corresponding locations will change as well.
The alternative type you can use for indexing is an array of int64, in which case a[b] creates an array of shape (*b.shape, *a.shape[1:]). Its structure is as if each element of b (b[i]) was replaced by a[i]. In other words, you create a new array by specifying from which indexes of a should the data be fetched. Again, the values are aliased to the original a, so if you modify a[b] the values of a[b[i]], for each i, will change. An example usecase is shown in this question.
These two modes are explained for numpy in integer array indexing and boolean array indexing, where for the latter you have to keep in mind that pytorch uses uint8 in place of bool.
Also, if your goal is to copy data from one tensor to another, you have to keep in mind that an operation like a[ixs] = b[ixs] is an in-place operation (a is modified in place), which may not play well with autograd. If you want to do out-of-place masking, use torch.where. An example use case is shown in this answer.
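An out-of-place version of the question's snippet might look like the following. Note that, unlike masked assignment, torch.where expects its source tensor to be broadcastable to the full shape (so c here has five elements, not two):

```python
import torch

a = torch.zeros(5)
b = torch.tensor([0, 1, 0, 1, 0], dtype=torch.bool)   # boolean mask
c = torch.tensor([0., 7., 0., 9., 0.])

# out-of-place: returns a new tensor, a itself is left untouched
result = torch.where(b, c, a)
print(result)  # tensor([0., 7., 0., 9., 0.])
print(a)       # tensor([0., 0., 0., 0., 0.])
```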

Finding the maximum slope in an Lat\Lon array

I'm plotting sea surface height by Latitude for a 20 different Longitudes.
The result is a line plot with 20 lines. I need to find which line has the steepest slope and then pinpoint that lat/lon.
I've tried so far with np.gradient and then max(), but I keep getting an error (ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()).
I have a feeling there's a much better way to do it. Thanks to those willing to help.
example of plot
slice3lat= lat[20:40]
slice3lon= lon[20:40]
slice3ssh=ssh[:,0,20:40,20:40]
plt.plot(slice3lat,slice3ssh)
plt.xlabel("Latitude")
plt.ylabel("SSH (m)")
plt.legend()
When you say max(), I assume you mean Python's built-in max function. This works on numpy arrays only if they are one-dimensional (flat), since iterating over such an array yields scalars that can be compared. If you have a 2D array like in your case, iterating yields the rows of the array, and comparing those fails with the message you presented.
In this case, you should use np.max on the array or call the arr.max() method directly.
Here's some example code using np.gradient, vector adding the gradients in each direction and obtaining the max together with its coordinate position in the original data:
grad_y, grad_x = np.gradient(ssh)
grad_total = np.sqrt(grad_y**2 + grad_x**2) # or just grad_y ?
max_grad = grad_total.max()
max_grad_pos = np.unravel_index(grad_total.argmax(), grad_total.shape)
print("Gradient max is {} at pos {}.".format(max_grad, max_grad_pos))
You might of course still need to fiddle with it a little.
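To see the snippet above in action, here is the same recipe run on a tiny made-up SSH grid (the real ssh array comes from the question's data):

```python
import numpy as np

# Hypothetical SSH grid (lat x lon) with one steep bump
ssh = np.array([[0.0, 0.1, 0.2],
                [0.0, 0.5, 0.2],
                [0.0, 0.1, 0.2]])

grad_y, grad_x = np.gradient(ssh)
grad_total = np.sqrt(grad_y**2 + grad_x**2)

max_grad = grad_total.max()   # arr.max() works fine on 2D arrays
pos = np.unravel_index(grad_total.argmax(), grad_total.shape)
print("Gradient max is {} at pos {}.".format(max_grad, pos))
```

The (row, column) position returned by np.unravel_index can then be looked up in the corresponding lat/lon arrays.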
