My PyTorch code below keeps producing a JIT tracer warning (in a PyTorch 1.1.0 environment) complaining that "Pytorch 1.0 Tracer Warning: Converting a tensor to a Python index might ..."
Is there a way to implement the code line marked (A) below without using python indexing?
N,C,H,W = input.size()
Cout=4*C
Hout=H//2
Wout=W//2
downsampled=torch.zeros([N,Cout,Hout,Wout], dtype=torch.float32)
downsampled[:,1:Cout:4,:,:]=input[:,:,0::2,1::2]  # ---- (A)
I confirmed that the JIT tracer no longer complains about the Python indexing in PyTorch 1.2 (as Umang Gupta commented).
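For reference, if the intent is to eventually fill all four channel groups with the four 2x2 spatial offsets (which the zeros-then-assign pattern suggests), the whole operation can be written as a space-to-depth via reshape/permute, with no Python indexing at all. A minimal sketch (the helper name is mine):
import torch
def space_to_depth_2x2(x):
    N, C, H, W = x.shape
    x = x.reshape(N, C, H // 2, 2, W // 2, 2)    # split H and W into 2x2 blocks
    x = x.permute(0, 1, 3, 5, 2, 4)              # -> (N, C, 2, 2, H//2, W//2)
    return x.reshape(N, C * 4, H // 2, W // 2)   # output channel = 4*c + 2*i + j
Channels 1:4*C:4 of the result equal input[:, :, 0::2, 1::2], i.e. exactly line (A).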
BTW, I came up with an implementation with no slicing (but still using indexing) as follows:
import torch
input=torch.arange(100)
input=input.view(10,10)
input=input[None, None, ...].expand(2,3,10,10) #torch.Size([2,3,10,10])
N,C,H,W=input.size()
Cout=4*C
Hout=H//2
Wout=W//2
downsampled=torch.zeros([N,Cout,Hout,Wout],dtype=torch.int8) #torch.Size([2,12,5,5])
dim2_idx=torch.arange(0,H,2)
dim3_idx=torch.arange(1,W,2)
sliced_input=input.index_select(2,dim2_idx).index_select(3,dim3_idx) #torch.Size([2,3,5,5])
#downsampled.index_select(1,torch.tensor([k for k in range(1,Cout,4)]))=temp <---Error: Can't assign to function call
for idx in range(1,Cout,4):
    downsampled[:,idx,:,:]=sliced_input[:,idx//4,:,:]
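The assignment that fails at the commented line above can be written with index_copy_, which also replaces the loop. A sketch (index_copy_ requires the source to match the destination dtype, hence the cast):
cout_idx = torch.arange(1, Cout, 4)
# copies sliced_input[:, k] into downsampled[:, cout_idx[k]] along dim 1
downsampled.index_copy_(1, cout_idx, sliced_input.to(downsampled.dtype))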
I want to use python3-h5py to store a matrix in the HDF5 format.
My problem is that when I compare the initial data to the data extracted from the HDF5 file, I get surprising differences.
import numpy
import h5py
# Create a vector of float64 values between 0 and 1
A = numpy.array(range(16384+1))/(16384+1)
# Save the corresponding float16 array to a HDF5 file
Fid = h5py.File("Output/Test.hdf5","w")
Group01 = Fid.create_group("Group")
Group01.create_dataset("Data", data=A, dtype='f2')
# Group01.create_dataset("Data", data=A.astype(numpy.float16), dtype='f2')# Use that line to avoid the bug
Fid.flush()
Fid.close()
# Read the HDF5 file
Fid = h5py.File("Output/Test.hdf5",'r')
B = Fid["Group/Data"][:]
Fid.close()
# Compare float64 and float16 Values
print(A[8192])
print(B[8192])
print("")
print(A[8192+1])
print(B[8192+1])
print("")
print(A[16384])
print(B[16384])
This gives:
0.499969484284
0.25
0.500030515716
0.5
0.999938968569
0.5
Sometimes I get a difference of about 0.00003 and sometimes of about 0.4999.
Normally, I am supposed to always get about 0.00003, which corresponds to the float16 rounding of a value between 0 and 1.
But the 0.4999 difference is really unexpected; I have noticed that it happens for values which are close to a power of 2 (for example, ~1/2 is stored as ~1/4).
Is this a bug in the h5py package?
Thanks in advance,
Stéphane,
[Xubuntu 17.09 64bits + python3-h5py v2.7.1-2 + python3 v3.6.3-0ubuntu2]
I am not fully sure that this can be considered an answer, but I finally got rid of my problem with a small workaround.
To sum it up, it looks like there is a bug in h5py v2.7.1-2.
When using h5py to store arrays, don't use a command like this:
`Group01.create_dataset("Data", data=A, dtype='f2')# Buggy command`
But instead:
`Group01.create_dataset("Data", data=A.astype(numpy.float16), dtype='f2')`
Edit 18 Nov 2022: with h5py==3.7.0 the bug is now fixed.
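A minimal sketch to check the workaround (the file name is arbitrary; the maximum round-trip error should be on the order of the float16 rounding, i.e. ~1e-4, not ~0.5):
import numpy
import h5py
A = numpy.array(range(16384 + 1)) / (16384 + 1)
with h5py.File("Test_f16.hdf5", "w") as fid:
    fid.create_dataset("Data", data=A.astype(numpy.float16), dtype='f2')
with h5py.File("Test_f16.hdf5", "r") as fid:
    B = fid["Data"][:]
# With the explicit cast, only the expected float16 rounding error remains
print(numpy.max(numpy.abs(A - B)))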
import torch
from torch import autograd

a = [1, 2, 3]
context_var = autograd.Variable(torch.LongTensor(a))
This is giving an error:
RuntimeError: tried to construct a tensor from a int sequence, but found an item of type numpy.int32 at index
I am not able to figure out how to get over this.
Your code works perfectly fine in recent versions of PyTorch. But for older versions, you can convert the numpy array to a list using the .tolist() method, as follows, to get rid of the error.
import numpy as np
import torch
from torch import autograd

a = np.array([1, 2, 3])  # .tolist() turns the numpy ints into plain Python ints
context_var = autograd.Variable(torch.LongTensor(a.tolist()))
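Alternatively, if a really is a numpy array (as the numpy.int32 in the error message suggests), torch.from_numpy has been available since early versions; the int32 dtype below is an assumption for illustration:
import numpy as np
import torch
from torch import autograd

a = np.array([1, 2, 3], dtype=np.int32)
# from_numpy keeps the numpy dtype (an IntTensor here); .long() casts to int64
context_var = autograd.Variable(torch.from_numpy(a).long())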
Works fine for me:
import numpy as np
import torch

a = [1, 2, 3]
print(torch.autograd.Variable(torch.LongTensor(a)))
b = np.array(a)
print(torch.autograd.Variable(torch.LongTensor(b)))
outputs:
Variable containing:
1
2
3
[torch.LongTensor of size 3]
Variable containing:
1
2
3
[torch.LongTensor of size 3]
I'm using Python 3.6.2, torch 0.2.0.post3, and numpy 1.13.3.
I'd like to build a TensorFlow graph in a separate function get_graph(), and to print out a simple op a in the main function. It turns out that I can print out the value of a if I return a from get_graph(). However, if I use get_operation_by_name() to retrieve a, it prints out None. I wonder what I did wrong here? Any suggestion to fix it? Thank you!
import tensorflow as tf

def get_graph():
    graph = tf.Graph()
    with graph.as_default():
        a = tf.constant(5.0, name='a')
    return graph, a

if __name__ == '__main__':
    graph, a = get_graph()
    with tf.Session(graph=graph) as sess:
        print(sess.run(a))
        a = sess.graph.get_operation_by_name('a')
        print(sess.run(a))
it prints out
5.0
None
p.s. I'm using python 3.4 and tensorflow 1.2.
Naming conventions in TensorFlow are subtle and a bit off-putting at first.
The thing is, when you write
a = tf.constant(5.0, name='a')
a is not the constant op, but its output. Names of op outputs derive from the op name by appending a number corresponding to the output's index. Here, constant has only one output, so its name is
print(a.name)
# a:0
When you run sess.graph.get_operation_by_name('a') you do get the constant op, and sess.run() on an Operation executes it but returns None by design. What you actually wanted is 'a:0', the tensor that is the output of this operation, and whose evaluation returns a value.
a = sess.graph.get_tensor_by_name('a:0')
print(sess.run(a))
# 5.0
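Equivalently, once you have the Operation object you can reach the same tensor through its outputs attribute (running this inside the same session as above):
op = sess.graph.get_operation_by_name('a')
print(op.outputs)               # [<tf.Tensor 'a:0' shape=() dtype=float32>]
print(sess.run(op.outputs[0]))  # 5.0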
If I run a basic logistic regression with 4 classes, I can get the predict_proba array.
How can I manually calculate the probabilities using the coefficients and intercepts? What are the exact steps to get the same answers that predict_proba generates?
There seem to be multiple questions about this online and several suggestions which are either incomplete or don't match up anyway.
For example, I can't replicate this process from my sklearn model, so what is missing?
https://stats.idre.ucla.edu/stata/code/manually-generate-predicted-probabilities-from-a-multinomial-logistic-regression-in-stata/
Thanks,
Because I had the same question but could not find an answer that gave the same results, I had a look at the sklearn GitHub repository to find the answer. Using the functions from the package itself, I was able to reproduce the results I got from predict_proba().
It appears that, in its code, sklearn uses a special softmax() function that differs from the usual softmax formulation.
Let's assume you build a model like this:
from sklearn.linear_model import LogisticRegression
X = ...
Y = ...
model = LogisticRegression(multi_class="multinomial", solver="saga")
model.fit(X, Y)
Then you can calculate the probabilities either with model.predict_proba(X) or use the sklearn function mentioned above to calculate them manually, like this:
from sklearn.utils.extmath import softmax
import numpy as np
scores = np.dot(X, model.coef_.T) + model.intercept_
softmax(scores) # Sklearn implementation
In the documentation for their own softmax() function, they note that
The softmax function is calculated by
np.exp(X) / np.sum(np.exp(X), axis=1)
This will cause overflow when large values are exponentiated. Hence
the largest value in each row is subtracted from each data point to
prevent this.
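In other words, sklearn's softmax is the usual softmax made numerically stable. A minimal re-implementation of that docstring (the function name is mine):
import numpy as np
def stable_softmax(scores):
    # Subtract the row-wise max before exponentiating, as sklearn does;
    # the result is unchanged because softmax is shift-invariant per row.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)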
To replicate sklearn's calculations (I saw this in a different post):
V = X_train.values.dot(model.coef_.transpose())
U = V + model.intercept_
A = np.exp(U)
P = A / (1 + A)
P /= P.sum(axis=1).reshape((-1, 1))
This looks slightly different from the softmax calculation, or from the UCLA Stata example, but it works: it is the one-vs-rest recipe (a per-class sigmoid, then row normalization), which is what predict_proba uses for a model fitted with multi_class="ovr", while the softmax version above matches multi_class="multinomial".
According to the documentation and other SO questions, ElasticNetCV accepts multiple output regression. When I try it, though, it fails. Code:
from sklearn import linear_model
import numpy as np
import numpy.random as rnd
nsubj = 10
nfeat_train = 5
nfeat_predict = 20
x = rnd.random((nsubj, nfeat_train))
y = rnd.random((nsubj, nfeat_predict))
lm = linear_model.LinearRegression()
lm.fit(x,y) # works
el = linear_model.ElasticNetCV()
el.fit(x,y) # fails
Error message:
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
This is with scikit-learn version 0.14.1. Is this a mismatch between the documentation and implementation?
You may want to take a look at sklearn.linear_model.MultiTaskElasticNetCV. But beware, this object assumes that your multiple targets share features. Thus, a feature is either active for all tasks (with variable activation for each, which can be small), or active for none of them. Before using this object, make sure this is the functionality you need.
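For the toy data from the question, a minimal sketch of the swap:
from sklearn import linear_model
import numpy.random as rnd

x = rnd.random((10, 5))
y = rnd.random((10, 20))

mt = linear_model.MultiTaskElasticNetCV()
mt.fit(x, y)           # accepts a 2-D y, unlike ElasticNetCV
print(mt.coef_.shape)  # (20, 5): one coefficient row per target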