Let me show the result I got first. This is one of the convolution layers of my model, and I'm only showing the weights of 11 of its filters (11 3x3 filters with channel=1).
The left side shows the original weights, the right side shows the pruned weights.
So I was wondering how torch.nn.utils.prune.l1_unstructured works. According to the PyTorch documentation it prunes the units with the lowest L1-norm, but as far as I know, L1-norm pruning is a filter pruning method that prunes whole filters, using an equation to find the filter with the lowest norm, rather than pruning single weights. So I'm a bit curious about how this function actually works.
The following is my pruning code:
parameters_to_prune = (
    (model.input_layer[0], 'weight'),
    (model.hidden_layer1[0], 'weight'),
    (model.hidden_layer2[0], 'weight'),
    (model.output_layer[0], 'weight')
)
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=pruned_percentage / 100,
)
The nn.utils.prune.l1_unstructured utility does not prune whole filters; it prunes individual parameter components, as you observed in your sheet. That is, the components with the lowest L1-norm (i.e. the smallest absolute values) get masked.
Here is a minimal example:
>>> import torch
>>> import torch.nn as nn
>>> import torch.nn.utils.prune as prune
>>> m = nn.Linear(10, 1, bias=False)
>>> m.weight = nn.Parameter(torch.arange(10).float())
>>> prune.l1_unstructured(m, 'weight', 0.3)
>>> m.weight
tensor([0., 0., 0., 3., 4., 5., 6., 7., 8., 9.], grad_fn=<MulBackward0>)
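If you want to inspect what the utility actually stored, here is a hedged sketch, assuming the m from the example above; weight_orig and weight_mask are the attributes the prune module registers:
# l1_unstructured reparametrizes the module: the original values live in
# weight_orig and the binary mask in weight_mask; weight is their elementwise product.
print(m.weight_orig)   # unmasked values, still a trainable Parameter
print(m.weight_mask)   # 0. where a component was pruned, 1. elsewhere

# If whole-filter (structured) pruning is what you expected, PyTorch also offers
# prune.ln_structured, e.g. (illustrative only, conv is a hypothetical Conv2d layer):
# prune.ln_structured(conv, 'weight', amount=0.3, n=1, dim=0)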
Using torch.round(), is it possible to round only specific entries of a tensor? Example:
tensor([ 8.5040e+00,  7.3818e+01,  5.2922e+00, -1.8912e-01,  5.4389e-01,
        -3.6032e-03,  4.5763e-01, -2.7471e-02])
Desired output:
tensor([ 9.,  74.,  5.,  0.,  5.4389e-01,
        -3.6032e-03,  4.5763e-01, -2.7471e-02])
(Only first 4 rounded)
You can do the following:
a[:4] = torch.round(a[:4])
Another (slightly shorter) option is
t = torch.tensor([ 8.5040e+00, 7.3818e+01, 5.2922e+00, -1.8912e-01, 5.4389e-01, -3.6032e-03, 4.5763e-01, -2.7471e-02])
t[:4] = t[:4].round()
or, in place:
t[:4].round_()
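If the entries you want to round are not contiguous, a hedged sketch using an index tensor (idx is just an illustrative name):
import torch

t = torch.tensor([8.5040, 73.818, 5.2922, -0.18912, 0.54389, -0.0036032, 0.45763, -0.027471])
idx = torch.tensor([0, 1, 2, 3])    # positions to round; any set of indices works
t[idx] = torch.round(t[idx])        # advanced indexing writes the result back into t
print(t)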
I don't understand how normalization in PyTorch works.
I want to set the mean to 0 and the standard deviation to 1 across all columns in a tensor x of shape (2, 2, 3).
A simple example:
>>> x = torch.tensor([[[ 1.,  2.,  3.],
                       [ 4.,  5.,  6.]],
                      [[ 7.,  8.,  9.],
                       [10., 11., 12.]]])
>>> norm = transforms.Normalize((0, 0), (1, 1))
>>> norm(x)
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],
        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
So nothing has changed when applying the normalization transform. Why is that?
To answer your question: you've now realized that torchvision.transforms.Normalize doesn't work as you anticipated. That's because it's not meant to
normalize: make your data range in [0, 1], nor
standardize: make your data's mean=0 and std=1 (which is what you're looking for).
The operation performed by T.Normalize is merely a shift-scale transform:
output[channel] = (input[channel] - mean[channel]) / std[channel]
The parameter names mean and std may seem rather misleading, since they do not refer to the desired output statistics but to arbitrary values supplied by the caller. That's right: if you pass mean=0 and std=1, you get output = (input - 0) / 1 = input. Hence the result you received, where norm had no effect on your tensor values, when you were expecting a tensor with mean 0 and standard deviation 1.
However, if you provide the correct mean and std parameters, i.e. mean=mean(data) and std=std(data), then you end up computing the z-score of your data channel by channel, which is what is usually called standardization. So in order to actually get mean=0 and std=1, you first need to compute the mean and standard deviation of your data.
If you do:
>>> mean, std = x.mean(), x.std()
>>> mean, std
(tensor(6.5000), tensor(3.6056))
This gives you the global average and the global standard deviation, respectively.
Instead, what you want is to measure the 1st and 2nd order statistics per channel. Therefore, we need to apply torch.mean and torch.std over all dimensions except dim=1. Both of those functions can receive a tuple of dimensions:
>>> mean, std = x.mean((0, 2)), x.std((0, 2))
>>> mean, std
(tensor([5., 8.]), tensor([3.4059, 3.4059]))
The above are the correct mean and standard deviation of x measured along each channel. From there you can use T.Normalize(mean, std) to transform your data x with the correct shift-scale parameters.
>>> norm(x)
tensor([[[-1.5254, -1.2481, -0.9707],
         [-0.6934, -0.4160, -0.1387]],
        [[ 0.1387,  0.4160,  0.6934],
         [ 0.9707,  1.2481,  1.5254]]])
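Putting the two steps together without torchvision, a minimal sketch, assuming (as in the reduction above) that dim=1 is the channel axis:
import torch

x = torch.tensor([[[ 1.,  2.,  3.],
                   [ 4.,  5.,  6.]],
                  [[ 7.,  8.,  9.],
                   [10., 11., 12.]]])

# Reduce over every dimension except the channel axis, keeping dims for broadcasting.
mean = x.mean((0, 2), keepdim=True)
std = x.std((0, 2), keepdim=True)
z = (x - mean) / std                     # per-channel z-score
print(z.mean((0, 2)), z.std((0, 2)))     # approximately 0 and 1 per channel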
Following the explanation in the documentation of torchvision.transforms.Normalize:
Normalize a tensor image with mean and standard deviation. Given mean:
(mean[1],...,mean[n]) and std: (std[1],..,std[n]) for n channels, this
transform will normalize each channel of the input torch.*Tensor i.e.,
output[channel] = (input[channel] - mean[channel]) / std[channel]
So if you have mean=0 and std=1, then output = (input - 0) / 1 will not change the input.
An example to illustrate the explanation above:
from torchvision import transforms
import torch
norm = transforms.Normalize((0, 0), (1, 2))
x = torch.tensor([[[1.0, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
out = norm(x)
print(x)
print(out)
Outputs:
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],
        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
tensor([[[1.0000, 2.0000, 3.0000],
         [4.0000, 5.0000, 6.0000]],
        [[3.5000, 4.0000, 4.5000],
         [5.0000, 5.5000, 6.0000]]])
As you can see, the first channel is not changed and the second channel is divided by 2.
I am trying to get a specific range of values from my PyTorch tensor.
tensor=torch.tensor([0,1,2,3,4,5,6,7,8,9])
new_tensor=tensor[tensor>2]
print(new_tensor)
This gives me a tensor containing the values 3-9.
new_tensor2=tensor[tensor<8]
print(new_tensor2)
This gives me a tensor containing the values 0-7.
new_tensor3=tensor[tensor>2 and tensor<8]
print(new_tensor3)
However, this raises an error. Would I be able to get a tensor with the values 3-7 using something like this? I am trying to edit the tensor directly, and I do not wish to change its order.
grad[x<-3]=0.1
grad[x>2]=1
grad[(x>=-3 and x<=2)]=siglrelu(grad[(x>=-3 and x<=2)])*(1.0-siglrelu(grad[(x>=-3 and x<=2)]))
This is what I am really going for, and I am not exactly sure of how to go about this. Any help is appreciated, thank you!
You can use the & operator:
t = torch.arange(0., 10)
print(t)
print(t[(t > 2) & (t < 8)])
The output is:
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
tensor([3., 4., 5., 6., 7.])
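Applied to the grad example from the question, a hedged sketch; siglrelu is the questioner's own function, so a sigmoid stands in here purely for illustration:
import torch

def siglrelu(v):                 # placeholder for the questioner's activation
    return torch.sigmoid(v)

x = torch.linspace(-5., 5., 11)
grad = torch.ones_like(x)

grad[x < -3] = 0.1
grad[x > 2] = 1.0
mask = (x >= -3) & (x <= 2)      # parentheses around each comparison are required
grad[mask] = siglrelu(grad[mask]) * (1.0 - siglrelu(grad[mask]))
print(grad)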
I am using Spark MLlib for k-means clustering. I have a set of vectors from which I want to determine the most likely cluster center. So I will run k-means clustering training on this set and select the cluster with the highest number of vectors assigned to it.
Therefore I need to know the number of vectors assigned to each cluster after training (i.e. KMeans.run(...)). But I cannot find a way to retrieve this information from the KMeansModel result. I would probably need to run predict on all training vectors and count the label which appears most often.
Is there another way to do this?
Thank you
You are right, this info is not provided by the model, and you have to run predict. Here is an example of doing so in a parallelized way (Spark v. 1.5.1):
from pyspark.mllib.clustering import KMeans
from numpy import array
data = array([0.0,0.0, 1.0,1.0, 9.0,8.0, 8.0,9.0, 10.0, 9.0]).reshape(5, 2)
data
# array([[ 0., 0.],
# [ 1., 1.],
# [ 9., 8.],
# [ 8., 9.],
# [ 10., 9.]])
k = 2 # no. of clusters
model = KMeans.train(
    sc.parallelize(data), k, maxIterations=10, runs=30, initializationMode="random",
    seed=50, initializationSteps=5, epsilon=1e-4)
cluster_ind = model.predict(sc.parallelize(data))
cluster_ind.collect()
# [1, 1, 0, 0, 0]
cluster_ind is an RDD of the same cardinality as our initial data, and it shows which cluster each datapoint belongs to. So, here we have two clusters, one with 3 datapoints (cluster 0) and one with 2 datapoints (cluster 1). Notice that we have run the prediction method in a parallel fashion (i.e. on an RDD) - collect() is used here only for demonstration purposes, and it is not needed in a 'real' situation.
Now, we can get the cluster sizes with
cluster_sizes = cluster_ind.countByValue().items()
cluster_sizes
# [(0, 3), (1, 2)]
From this, we can get the maximum cluster index & size as
from operator import itemgetter
max(cluster_sizes, key=itemgetter(1))
# (0, 3)
i.e. our biggest cluster is cluster 0, with a size of 3 datapoints, which can be easily verified by inspection of cluster_ind.collect() above.
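If you also need the coordinates of that biggest cluster's center, a hedged sketch continuing from the variables above (clusterCenters is the list of centers exposed by KMeansModel):
biggest_cluster, size = max(cluster_sizes, key=itemgetter(1))
center = model.clusterCenters[biggest_cluster]   # numpy array with the center's coordinates
print(biggest_cluster, size, center)             # here: cluster 0, size 3, center roughly [9., 8.67]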