I need to find the median of an image read as a tensor via TensorFlow.js in an Angular application. Can anyone suggest how to find the median in tfjs?
Like this:
let x = tf.tensor([1, 2, 3, 4, 5, 8, 9]);
console.log(x.median());
This should print 4.
I tried to find an equivalent of TensorFlow Probability's stats module in JavaScript, but no luck so far.
The median is the middle number after the series has been sorted, so the tensor needs to be sorted first. For an odd number of elements, the value at index floor(size / 2) of the sorted tensor is the median (for an even number you would average the two middle values).
t = tf.tensor([1, 2, 3, 4, 5, 8, 9]);
// topk sorts in descending order; for an odd-sized tensor, the value at
// index Math.floor(t.size / 2) of the sorted values is the median
tf.topk(t, t.size).values.slice(Math.floor(t.size / 2), 1).print(); // prints 4
For the above to work with higher-dimensional tensors (images, for example), we first need to reshape to a 1D tensor.
Given img, the tensor of an image:
t = img.reshape([-1])
// do the same as above
Suppose I have a constant tensor like this: [[0, 0, 1, 1, 2, 0], [0, 1, 0, 0, 0, 2]].
The index of the last occurrence of maximum values in [0,0,1,1,2,0] is 4, and the index of the last occurrence of maximum values in [0,1,0,0,0,2] is 5.
So what I want to get is [4, 5]. Any ideas how I can do that? Thanks.
I'm using tensorflow 1.9.
Just figured it out.
import tensorflow as tf

a = tf.constant([[0, 0, 1, 1, 2, 0], [0, 1, 0, 0, 0, 2]])
# tf.argmax returns the FIRST occurrence of the maximum, so reverse each
# row, find the first maximum there, and map the index back
max_idx = tf.cast(a.shape[1], tf.int64) - tf.argmax(tf.reverse(a, [1]), axis=1) - 1
sess = tf.Session()
sess.run(max_idx)
The output is:
array([4, 5], dtype=int64)
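If it is easier to experiment with, the same reverse-then-argmax trick can be sanity-checked in plain NumPy, since np.argmax also returns the first occurrence of the maximum:
import numpy as np

a = np.array([[0, 0, 1, 1, 2, 0], [0, 1, 0, 0, 0, 2]])
# flip each row, find the first maximum in the flipped row, then map
# that position back to the original ordering
last_idx = a.shape[1] - 1 - np.argmax(a[:, ::-1], axis=1)
print(last_idx)  # [4 5]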
Is there an equivalent of np.multiply.at in PyTorch? I have two 4D arrays and one 2D index array:
base = torch.ones((2, 3, 5, 5))
to_multiply = torch.arange(120).view(2, 3, 4, 5)
index = torch.tensor([[0, 2, 4, 2], [0, 3, 3, 2]])
As shown in this question I asked earlier (in NumPy), the row index of the index array corresponds to the 1st dimension of base and to_multiply, and the value of the index array corresponds to the 3rd dimension of base. I want to take the slice of base selected by index and multiply it with to_multiply; in NumPy this can be achieved as follows:
np.multiply.at(base, (np.arange(2)[:, None, None], np.arange(3)[:, None], index[:, None, :]), to_multiply)
However, when I try to translate this to PyTorch, I cannot find an equivalent of np.multiply.at; I can only find index_add_, and there is no index_multiply. I also want to avoid an explicit for loop.
So how can I achieve the above in PyTorch? Thanks!
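For what it's worth, one loop-free sketch of a workaround (an assumption of mine, not an official equivalent, and it materializes a temporary 5D tensor): expand the factors with a one-hot mask of the index and reduce with prod, so duplicated indices contribute multiple factors, just as np.multiply.at does.
import torch
import torch.nn.functional as F

base = torch.ones((2, 3, 5, 5))
to_multiply = torch.arange(120, dtype=torch.float).view(2, 3, 4, 5)
index = torch.tensor([[0, 2, 4, 2], [0, 3, 3, 2]])

# one-hot mask of shape (2, 1, 4, 5, 1): True exactly where index[i, k] == p
mask = F.one_hot(index, num_classes=5).bool()[:, None, :, :, None]
# where the mask is True use the factor, elsewhere 1, then reduce the
# k axis with prod so repeated indices multiply repeatedly
factors = torch.where(mask, to_multiply[:, :, :, None, :], torch.ones(()))
result = base * factors.prod(dim=2)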
I am trying to extract the unique values in each row of a matrix and return them in the same matrix, with repeated values set to, say, 0. For example, I would like to transform
torch.Tensor([[1, 2, 3, 4, 3, 3, 4],
              [1, 6, 3, 5, 3, 5, 4]])
to
torch.Tensor([[1, 2, 3, 4, 0, 0, 0],
              [1, 6, 3, 5, 0, 0, 4]])
or
torch.Tensor([[1, 2, 3, 4, 0, 0, 0],
              [1, 6, 3, 5, 4, 0, 0]])
I.e. the order within the rows does not matter. I have tried torch.unique(), and the documentation mentions that the dimension over which to take unique values can be specified with the parameter dim. However, it doesn't seem to work for this case.
I've tried:
output = torch.unique(torch.Tensor([[4, 2, 52, 2, 2], [5, 2, 6, 6, 5]]), dim=1)
output
Which gives
tensor([[ 2., 2., 2., 4., 52.],
[ 2., 5., 6., 5., 6.]])
Does anyone have a particular fix for this? If possible, I'm trying to avoid for loops.
One must admit the unique function can sometimes be very confusing without proper examples and explanations.
The dim parameter specifies which dimension of the tensor the operation runs over.
For torch.unique on a 2D matrix, dim=0 treats each row as one element to compare, while dim=1 treats each column as one element.
For example, let's consider a 4x4 matrix with dim=1. As you can see from my code below, unique compares whole columns, and a column is dropped only if it is identical to another column in every row.
That is why the number 11 still occurs twice in the first and last rows of the output: only entire duplicate slices are collapsed, which is also what preserves the rectangular shape of the final matrix.
However, if you do not specify any dimension, torch will flatten your matrix first and then apply unique to it, and you will get a 1D tensor containing the unique values.
import torch
m = torch.Tensor([
    [11, 11, 12, 11],
    [13, 11, 12, 11],
    [16, 11, 12, 11],
    [11, 11, 12, 11]
])
output, indices = torch.unique(m, sorted=True, return_inverse=True, dim=1)
print("Ori \n{}".format(m.numpy()))
print("Sorted \n{}".format(output.numpy()))
print("Indices \n{}".format(indices.numpy()))
# without specifying dimension
output, indices = torch.unique(m, sorted=True, return_inverse=True)
print("Sorted (no dim) \n{}".format(output.numpy()))
Result (dim=1)
Ori
[[11. 11. 12. 11.]
[13. 11. 12. 11.]
[16. 11. 12. 11.]
[11. 11. 12. 11.]]
Sorted
[[11. 11. 12.]
[11. 13. 12.]
[11. 16. 12.]
[11. 11. 12.]]
Indices
[1 0 2 0]
Result (no dimension)
Sorted (no dim)
[11. 12. 13. 16.]
I was confused when using torch.unique the first time. After doing some experiments I have finally figured out how the dim argument works.
The docs of torch.unique say:
counts (Tensor): (optional) if return_counts is True, there will be an additional returned tensor (same shape as output or output.size(dim), if dim was specified) representing the number of occurrences for each unique value or tensor.
For example, if your input tensor is a 3D tensor of size n x m x k and dim=2, unique treats each of the k slices of size n x m as a single element to compare and removes the duplicate slices. In other words, all dimensions other than dim 2 are flattened into one element for the comparison.
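Coming back to the original question: torch.unique alone cannot produce the row-wise "replace duplicates with 0" matrix, because it only collapses entire rows or columns. Below is one loop-free sketch of that result (assuming 0 is a safe filler value): sort each row so equal values become adjacent, blank out every element that equals its left neighbour, and scatter the values back to their original columns.
import torch

x = torch.tensor([[1, 2, 3, 4, 3, 3, 4],
                  [1, 6, 3, 5, 3, 5, 4]])

# sort each row, mark elements equal to their left neighbour as
# duplicates, then scatter the values back to their original columns
# with the duplicates replaced by 0
vals, idx = x.sort(dim=1)
dup = torch.zeros_like(vals, dtype=torch.bool)
dup[:, 1:] = vals[:, 1:] == vals[:, :-1]
out = torch.zeros_like(x).scatter_(1, idx, vals.masked_fill(dup, 0))
print(out)
# tensor([[1, 2, 3, 4, 0, 0, 0],
#         [1, 6, 3, 5, 0, 0, 4]])
Which of the duplicates keeps its value depends on the sort, but the question states the order within a row does not matter.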
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')
This is the code from the Deep MNIST for experts tutorial on Tensorflow website.
I have three questions:
1) The documentation says ksize is a list of ints with length >= 4 that gives the size of the max-pool window for each dimension of the input. Shouldn't that be just [2, 2], considering that it's a 2x2 window? I mean, why is it [1, 2, 2, 1] instead of [2, 2]?
2) If we are taking stride steps of size one, why do we need a vector of 4 values? Wouldn't a single value suffice?
strides = [1]
3) If padding = 'SAME', why does the image size decrease by half? (from 28x28 to 14x14 in the first convolutional block)
I'm not sure which documentation you're referring to in this question. The max-pool window is indeed 2x2; the extra 1s in [1, 2, 2, 1] are for the batch and channel dimensions, since ksize and strides follow the layout of the input tensor, [batch, height, width, channels].
The stride can be different along each of those dimensions, which is why a 4-vector is needed rather than one value. It covers the most general case: in principle you could skip images in the batch, stride differently over height and width, and even stride over channels. Striding over batch or channels is hardly ever used, but the option has been left in.
If you have a stride of 2 along each spatial direction, the pooling window moves two pixels at a time, so the output is half the size; that halving comes from the stride, not the padding. If you set the strides to [1, 1, 1, 1] with padding 'SAME', you would indeed get a result of the same size. Padding 'SAME' means zero-padding the borders of the image (by roughly half the kernel size on each side) so that the output size equals ceil(input_size / stride).
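As a quick shape check (a minimal sketch against the TF 1.x API used in the tutorial):
import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, [None, 28, 28, 1])  # [batch, height, width, channels]
pooled = tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
print(pooled.shape)  # (?, 14, 14, 1): 'SAME' gives ceil(28 / 2) = 14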
I'm trying to test my Scikit-learn machine learning algorithm with a simple R^2 score, but for some reason it always returns zero.
import numpy
from sklearn.metrics import r2_score
prediction = numpy.array([0.1567, 4.7528, 1.1260, 0.2294]).reshape(1, -1)
training = numpy.array([0, 3, 1, 0]).reshape(1, -1)
r2 = r2_score(training, prediction, multioutput="raw_values")
print(r2)
[ 0. 0. 0. 0.]
This is a single four-part value, not four separate values. How do I get proper R^2 scores?
If you are trying to calculate the R^2 value between two vectors, you should just pass two one-dimensional arrays; see the documentation.
In the example you provided, reshape(1, -1) turns each array into a single sample with four outputs, so r2_score compares the columns one by one: 0.1567 against 0, 4.7528 against 3, and so on. With only one sample per output the metric is not well-defined, and each score comes out as 0. It sounds like you want the R^2 between the two vectors, like the following:
prediction = numpy.array([0.1567, 4.7528, 1.1260, 0.2294])
training = numpy.array([0, 3, 1, 0])
print(r2_score(training, prediction))
0.472439485
If you have multi-dimensional arrays you can use the multioutput flag to determine what the output should look like:
#modified from the scikit-learn example
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(r2_score(y_true, y_pred, multioutput='raw_values'))
array([ 0.96543779, 0.90816327])
Here the first item of each list in y_true is compared with the first item of each list in y_pred, the second with the second, and so on, producing one R^2 score per output column.
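For completeness, a small sketch of the multioutput options on the same example data; 'uniform_average' is the default and simply averages the per-column scores:
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = np.array([[0, 2], [-1, 2], [8, -5]])

# one score per output column vs. their plain average (the default)
print(r2_score(y_true, y_pred, multioutput='raw_values'))       # [0.96543779 0.90816327]
print(r2_score(y_true, y_pred, multioutput='uniform_average'))  # 0.93680053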