I'm using torch probability distributions and trying to make sure my results are reproducible. However, I can't get the sampling to produce the same predictions using the usual method of setting a fixed seed at the beginning of the script:
import torch

torch.manual_seed(42)
for _ in range(10):
    label_probs = torch.tensor([0.2, 0.2, 0.5, 0.1])
    labels_distributions = torch.distributions.Categorical(label_probs)
    labels_sampled = labels_distributions.sample().detach()
    print(labels_sampled)
>>>
tensor(2)
tensor(3)
tensor(2)
tensor(3)
tensor(2)
tensor(0)
tensor(3)
tensor(2)
tensor(1)
tensor(2)
However, if I do the same but set the manual seed inside the loop on every iteration, it works:
for _ in range(10):
    torch.manual_seed(42)
    label_probs = torch.tensor([0.2, 0.2, 0.5, 0.1])
    labels_distributions = torch.distributions.Categorical(label_probs)
    labels_sampled = labels_distributions.sample().detach()
    print(labels_sampled)
>>>
tensor(0)
tensor(0)
tensor(0)
tensor(0)
tensor(0)
tensor(0)
tensor(0)
tensor(0)
tensor(0)
tensor(0)
I could set it inside the loop, but I can't imagine that's especially efficient (unless the cost of seeding is small -- does anyone know?). What's the reason for the difference? I'm trying to figure out how this works and why it behaves this way (strangely, at least to me).
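For what it's worth, here is a minimal sketch of what seeding actually guarantees: torch.manual_seed resets the global RNG state (a cheap operation), and every sample() call then advances that state. So consecutive draws within a run differ, but the whole sequence of draws is reproducible from run to run. Re-seeding inside the loop resets the state before every draw, which is why you get the same value ten times; it collapses the randomness rather than making it reproducible.

import torch

# Seed once and record a sequence of draws.
torch.manual_seed(42)
first_run = [torch.distributions.Categorical(
    torch.tensor([0.2, 0.2, 0.5, 0.1])).sample().item() for _ in range(10)]

# Reset the RNG state and replay: the full sequence repeats exactly.
torch.manual_seed(42)
second_run = [torch.distributions.Categorical(
    torch.tensor([0.2, 0.2, 0.5, 0.1])).sample().item() for _ in range(10)]

assert first_run == second_run  # reproducible sequence, not identical draws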
Is it possible to use a similar method as "tensordot" with torch.sparse tensors?
I am trying to apply a 4-dimensional tensor to a 2-dimensional tensor. This is possible using torch or numpy. However, I have not found a way to do it with torch.sparse without first making the sparse tensor dense via ".to_dense()".
More precisely, here is what I want to do without using ".to_dense()":
import torch
import torch.sparse
nb_x = 4
nb_y = 3
coordinates = torch.LongTensor([[0,1,2],[0,1,2],[0,1,2],[0,1,2]])
values = torch.FloatTensor([1,2,3])
tensor4D = torch.sparse.FloatTensor(coordinates,values,torch.Size([nb_x,nb_y,nb_x,nb_y]))
inp = torch.rand((nb_x,nb_y))
#what I want to do
out = torch.tensordot(tensor4D.to_dense(),inp,dims=([2,3],[0,1]))
print(inp)
print(out)
(here is the output: torch_code)
Alternatively, here is a similar code using numpy:
import numpy as np
tensor4D = np.zeros((4,3,4,3))
tensor4D[0,0,0,0] = 1
tensor4D[1,1,1,1] = 2
tensor4D[2,2,2,2] = 3
inp = np.random.rand(4,3)
out = np.tensordot(tensor4D,inp)
print(inp)
print(out)
(here is the output: numpy_code)
Thanks for helping!
Your specific tensordot can be cast to a simple matrix multiplication by "squeezing" the first two and last two dimensions of tensor4D.
In short, what you want to do is
raw = tensor4D.view(nb_x*nb_y, nb_x*nb_y) @ inp.flatten()
out = raw.view(nb_x, nb_y)
However, since view and reshape are not implemented for sparse tensors, you'll have to do it manually:
sz = tensor4D.shape
idx = tensor4D._indices()  # the 4-row index matrix of the sparse tensor
coeff = torch.tensor([[1, sz[1], 0, 0], [0, 0, 1, sz[3]]])
reshaped = torch.sparse.FloatTensor(coeff @ idx, tensor4D._values(), torch.Size([nb_x*nb_y, nb_x*nb_y]))
# once we have reshaped tensor4D it's all downhill from here
raw = torch.sparse.mm(reshaped, inp.flatten()[:, None])
out = raw.reshape(nb_x, nb_y)
print(out)
And the output is
tensor([[0.4180, 0.0000, 0.0000],
[0.0000, 0.6025, 0.0000],
[0.0000, 0.0000, 0.5897],
[0.0000, 0.0000, 0.0000]])
Indeed, this works very well, thank you for your answer!
The weakness of this method, it seems to me, is that it is hard to generalize.
In fact, "inp" and "out" are supposed to be images. Here, they are black-and-white images, since there are only two dimensions: height and width.
If instead I take RGB images, then I will have to consider 6D tensors acting on 3D tensors. I can still apply the same trick by "squeezing" the first three dimensions together and the last three dimensions together. However, it seems to me that this will become more involved very quickly (maybe I am wrong), while using tensordot would be much simpler to generalize.
Therefore, I am going to use the solution you proposed, but I am still interested if someone finds another solution.
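For anyone who needs the generalization discussed above, here is a hedged sketch (not a confirmed implementation; it assumes a COO sparse tensor and uses row-major flattening so the result lines up with what .view() and tensordot would give on a dense tensor; the helper name sparse_flatten_to_matrix is made up for illustration):

import torch

def sparse_flatten_to_matrix(t, k):
    # Squeeze the first k dims into rows and the remaining dims into
    # columns, combining indices in C (row-major) order.
    t = t.coalesce()
    sz, idx = t.shape, t.indices()
    row = torch.zeros(idx.shape[1], dtype=torch.long)
    for d in range(k):
        row = row * sz[d] + idx[d]
    col = torch.zeros(idx.shape[1], dtype=torch.long)
    for d in range(k, len(sz)):
        col = col * sz[d] + idx[d]
    n_rows = int(torch.tensor(sz[:k]).prod())
    n_cols = int(torch.tensor(sz[k:]).prod())
    return torch.sparse_coo_tensor(torch.stack([row, col]), t.values(),
                                   (n_rows, n_cols))

# E.g. for the RGB case: a 6D kernel acting on a (C, H, W) image.
# out = torch.sparse.mm(sparse_flatten_to_matrix(kernel6D, 3),
#                       img.flatten()[:, None]).reshape(img.shape)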
I have trained a WGAN on the CelebA dataset in PyTorch, following this YouTube video. Since I do this on Google Cloud Platform, where TensorBoard is not available, I save one figure of images generated by the GAN every epoch to see how the GAN is actually doing.
Now, the saved pdf files look something like this: generated images. Unfortunately, this is not really readable, and I suspect this has to do with the preprocessing I do:
trafo = transforms.Compose(
[transforms.Resize(size = (64, 64)),
transforms.ToTensor(),
transforms.Normalize( mean = (0.5,), std = (0.5,))])
Is there any way to kind of undo this transformation when I save the image?
Currently, I save the image every epoch as follows:
visualization = torchvision.utils.make_grid(
    tensor = gen(fixed_noise),
    nrow = 8,
    normalize = False)
plt.imshow(visualization.detach().cpu().permute(1, 2, 0))  # (C, H, W) -> (H, W, C)
plt.savefig("generated_WGAN_" + datetime.now().strftime("%Y%m%d-%H%M%S") + ".pdf")
Also, I should probably mention that in the Jupyter notebook, I get the following warning:
"Clipping input data to the valid range for imshow with RGB data ([0..1]) for floats or [0..255] for integers)."
The torchvision.transforms.Normalize function is usually used to standardize data (make mean(data)=0 and std(data)=1), while the normalize option of torchvision.utils.make_grid is used to normalize the data to [0,1] given a range. So there is no need to implement a function to fix this. From the docs for normalize:
If True, shift the image to the range (0, 1), by the min and max values specified by range. Default: False.
Here you are looking to normalize between 0 and 1. Given a tensor x:
torchvision.utils.make_grid(x, nrow=8, normalize=True, range=(x.min(), x.max()))
Here are some examples of use provided by the PyTorch documentation.
Back to your original question: I should mention that torchvision.transforms.Normalize(mean=0.5, std=0.5) doesn't transform your data such that it has mean=0.5 and std=0.5... Nor will it standardize it to mean=0, std=1. You have to measure the mean and std of your dataset.
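If you do want proper standardization, here is a minimal sketch of measuring those statistics (assuming a DataLoader that yields (B, C, H, W) float batches; the helper name is made up for illustration):

import torch

def dataset_mean_std(loader):
    # Accumulate per-channel sums of pixels and squared pixels,
    # then derive mean and std over the whole dataset.
    total = total_sq = 0.0
    n_pixels = 0
    for batch, _ in loader:
        total = total + batch.sum(dim=(0, 2, 3))
        total_sq = total_sq + (batch ** 2).sum(dim=(0, 2, 3))
        n_pixels += batch.shape[0] * batch.shape[2] * batch.shape[3]
    mean = total / n_pixels
    std = (total_sq / n_pixels - mean ** 2).sqrt()
    return mean, std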
torchvision.transforms.Normalize simply performs a shift-scale operation. To undo it, just unscale-unshift with the same values:
>>> x = torch.rand(64, 3, 100, 100)*torch.rand(64, 1, 1, 1)
>>> x.mean(), x.std()
(tensor(0.2536), tensor(0.2175))
>>> import torchvision.transforms as T
>>> mean, std = 0.5, 0.5  # the values from the question's Normalize transform
>>> t = T.Normalize(mean, std)
>>> t_inv = lambda x: x*std + mean
>>> x_after = t(x)
>>> x_after.mean(), x_after.std()
(tensor(-0.4928), tensor(0.4350))
>>> x_before = t_inv(x_after)
>>> x_before.mean(), x_before.std()
(tensor(0.2536), tensor(0.2175))
It seems like your output pixel values are in range [-1, 1] (please verify this).
Therefore, when you save the images, the negative part is being clipped (as the error message you got suggests).
Try:
visualization = torchvision.utils.make_grid(
    tensor = torch.clamp(gen(fixed_noise), -1, 1) * 0.5 + 0.5,  # from [-1, 1] -> [0, 1]
    nrow = 8,
    normalize = False)
plt.savefig("generated_WGAN_" + datetime.now().strftime("%Y%m%d-%H%M%S") + ".pdf")
I am trying to plot the precision-recall curves for a multiclass problem in one figure. For this purpose I used the code below:
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, average_precision_score

def plot_prc(y_test, y_score, N_classes):
    precision = dict()
    recall = dict()
    average_precision = dict()
    for i in range(N_classes):
        precision[i], recall[i], _ = precision_recall_curve(y_test[:, i], y_score[:, i])
        average_precision[i] = average_precision_score(y_test[:, i], y_score[:, i])
    for i in range(N_classes):
        plt.plot(recall[i], precision[i], lw=2,
                 label='class {} (AP = {:0.2f})'.format(i, average_precision[i]))
        plt.plot([0, 1], [0, 1], 'k--')
        plt.xlabel("recall")
        plt.ylabel("precision")
        plt.legend(loc="best")
        plt.title("precision vs. recall curve")
        plt.show()
but I am getting multiple figures, one per class, and I cannot figure out what the error in my code is.
I got a single line per figure (e.g., the curve for class 1 only), but I want one figure with multiple lines, one curve per class.
I will appreciate any kind of help regarding this problem.
The second for loop iterates over the classes. If plt.show() is included in that loop, a plot is shown for a single class each time, instead of the curves of all the classes being drawn in the same figure. So the solution to this problem is to move plt.show() out of the loop.
You are plotting the lines inside the for loop, but your plt.show() call is also inside the for loop, so each iteration displays its own figure with a single curve.
Put your plt.show() call outside the for loop.
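Putting both answers together, a sketch of the corrected plotting loop (reusing the precision, recall and average_precision dicts computed in the question's first loop):

import matplotlib.pyplot as plt

for i in range(N_classes):
    plt.plot(recall[i], precision[i], lw=2,
             label='class {} (AP = {:0.2f})'.format(i, average_precision[i]))

# Figure-level calls run once, after all curves have been drawn.
plt.xlabel("recall")
plt.ylabel("precision")
plt.legend(loc="best")
plt.title("precision vs. recall curve")
plt.show()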
I am pre-processing a numpy array and want to enter it in as a tensorflow Variable. I've tried following other stack exchange advice, but so far without success. I would like to see if I'm doing something uniquely wrong here.
import numpy as np
import tensorflow as tf

npW = np.zeros((784, 10))
npW[0, 0] = 20
W = tf.Variable(tf.convert_to_tensor(npW, dtype=tf.float32))

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

print("npsum", np.sum(npW))
print(tf.reduce_sum(W))
And this is the result.
npsum 20.0
Tensor("Sum:0", shape=(), dtype=float32)
I don't know why the reduced sum of the W variable remains zero. Am I missing something here?
You need to understand that TensorFlow differs from traditional computing. First, you declare a computational graph. Then, you run operations through the graph.
Taking your example, you have your numpy variables :
npW = np.zeros((784,10))
npW[0,0] = 20
Next, these instructions define TensorFlow variables and ops, i.e. nodes in the computational graph:
W = tf.Variable(tf.convert_to_tensor(npW, dtype = tf.float32))
sum = tf.reduce_sum(W)
And to be able to compute the operation, you need to run the op through the graph, with a session, i.e.:
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
result = sess.run(sum)
print(result) # print 20
Another way is to call eval instead of sess.run()
print(sum.eval()) # print 20
So I tested it a bit differently and found out that the variable is getting assigned properly, but the reduce_sum function isn't working as expected. If anyone has an explanation for that, it would be much appreciated.
npW = np.zeros((2,2))
npW[0,0] = 20
W = tf.Variable(npW, dtype = tf.float32)
A= tf.constant([[20,0],[0,0]])
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
# Train
print("npsum", np.sum(npW))
x=tf.reduce_sum(W,0)
print(x)
print(tf.reduce_sum(A))
print(W.eval())
print(A.eval())
This had output
npsum 20.0
Tensor("Sum:0", shape=(2,), dtype=float32)
Tensor("Sum_1:0", shape=(), dtype=int32)
[[ 20. 0.],
[ 0. 0.]]
[[20 0],
[ 0 0]]
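To close the loop on this follow-up: reduce_sum is working; printing the tensor only shows the graph node. As the answer above explains, fetching it through the session returns the actual values:

print(sess.run(x))                 # [20.  0.]  (column sums of W)
print(sess.run(tf.reduce_sum(A)))  # 20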
I am trying to do a grid search using a SVM classifier.
Consider my data and target, which have been parsed from a file and loaded into numpy arrays.
I then preprocess them.
# Transform the data to have zero mean and unit variance.
zeroMeanUnitVarianceScaler = preprocessing.StandardScaler().fit(data)
scaledData = zeroMeanUnitVarianceScaler.transform(data)
# Transform the target to have range [-1, 1].
scaledTarget = np.empty([len(target)], dtype=int)
for i in range(len(target)):
    if target[i] == 'Malignant':
        scaledTarget[i] = 1
    if target[i] == 'Benign':
        scaledTarget[i] = -1
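(As an aside, that encoding loop can be written as one vectorized step; a small sketch, assuming target is an array-like of the two strings:

scaledTarget = np.where(np.asarray(target) == 'Malignant', 1, -1)
)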
I now try to set up my grid and fit the scaled data to targets.
# Generate parameters for parameter grid.
CValues = np.logspace(-3, 3, 7)
GammaValues = np.logspace(-3, 3, 7)
kernelValues = ('poly', 'sigmoid')
# kernelValues = ('linear', 'rbf', 'sigmoid')
degreeValues = np.array([0, 1, 2, 3, 4])
coef0Values = np.logspace(-3, 3, 7)
# Generate the parameter grid.
paramGrid = dict(C=CValues, gamma=GammaValues, kernel=kernelValues,
                 coef0=coef0Values)
# Create and train a SVM classifier using the parameter grid and with
# stratified shuffle split.
stratifiedShuffleSplit = StratifiedShuffleSplit(n_splits=10, test_size=0.25,
                                                train_size=None, random_state=0)
clf = GridSearchCV(estimator=svm.SVC(), param_grid=paramGrid,
                   cv=stratifiedShuffleSplit, n_jobs=1)
clf.fit(scaledData, scaledTarget)
If I uncomment the line kernelValues = ('linear', 'rbf', 'sigmoid'), then the code runs in approximately 50 seconds on my 16 GB i7-4950 3.6 GHz machine running Windows 10.
However, if I try to run the code as is with 'poly' as a possible kernel value, then the code hangs forever. For example, I ran it yesterday overnight and it did not return anything when I got back in the office today.
Interestingly enough, if I create an SVM classifier with a poly kernel directly, it returns a result immediately:
clf = svm.SVC(kernel='poly', degree=2)
clf.fit(data, target)
It only hangs when I run the grid search code above. I have not tried other cv methods to see if that changes anything.
Is this a bug in scikit-learn? Am I doing things properly? On a side note, is my method of doing grid search / cross-validation using GridSearchCV and StratifiedShuffleSplit sensible? It seems to me the most brute-force (i.e. time-consuming) but robust method.
Thank you!
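One hedged workaround, not a confirmed scikit-learn bug: with kernel='poly', some combinations of large C/gamma/coef0 values (and degree=0) can make individual fits run effectively forever because the underlying optimizer fails to converge. Capping max_iter on the SVC, and using a list-of-dicts grid so each kernel only sees the parameters it actually uses, keeps the search bounded. A sketch reusing the names from the question:

# max_iter caps any single fit that would otherwise spin indefinitely;
# a list of dicts gives each kernel its own sub-grid.
paramGrid = [
    {'kernel': ['sigmoid'], 'C': CValues, 'gamma': GammaValues,
     'coef0': coef0Values},
    {'kernel': ['poly'], 'C': CValues, 'gamma': GammaValues,
     'coef0': coef0Values, 'degree': [1, 2, 3]},
]
clf = GridSearchCV(estimator=svm.SVC(max_iter=1000000), param_grid=paramGrid,
                   cv=stratifiedShuffleSplit, n_jobs=1)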