Batched index_fill in PyTorch

I have an index tensor of size (2, 3):
>>> index = torch.empty(6).random_(0, 8).view(2, 3)
>>> index
tensor([[6., 3., 2.],
        [3., 4., 7.]])
And a value tensor of size (2, 8):
>>> value = torch.zeros(2, 8)
>>> value
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.]])
I want to set the elements of value to 1 at the positions given by index along dim=-1. The output should look like:
>>> output
tensor([[0., 0., 1., 1., 0., 0., 1., 0.],
        [0., 0., 0., 1., 1., 0., 0., 1.]])
I tried value[range(2), index] = 1, but it triggers an error. I also tried torch.index_fill, but it doesn't accept batched indices. torch.scatter with the src option requires creating an extra (2, 8) tensor full of ones, which wastes memory and time.

You can actually use torch.Tensor.scatter_ by setting the value (int) option instead of the src option (Tensor).
>>> value.scatter_(dim=-1, index=index.long(), value=1)
>>> value
tensor([[0., 0., 1., 1., 0., 0., 1., 0.],
        [0., 0., 0., 1., 1., 0., 0., 1.]])
Make sure the index is of type int64 though.
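For completeness, the advanced-indexing attempt from the question can also be made to work by broadcasting a column of row indices against the batched index tensor; a minimal sketch with the same data:
import torch

index = torch.tensor([[6, 3, 2],
                      [3, 4, 7]])   # int64 by default
value = torch.zeros(2, 8)

# A (2, 1) column of row indices broadcasts against the (2, 3) index tensor,
# so each (row, col) pair addresses one element of value.
value[torch.arange(2).unsqueeze(1), index] = 1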

Related

How padding works in PyTorch

If I have understood the PyTorch implementation of the Conv2d layer correctly, the padding parameter expands the shape of the convolved image with zeros on all four sides of the input. So, if we have an image of shape (6,6) with padding = 2, stride = 2 and kernel = (5,5), the output will be an image of shape (1,1). Then, padding = 2 will pad with zeros (2 up, 2 down, 2 left and 2 right), resulting in a convolved image of shape (5,5).
However, when running the following script:
import torch
from torch import nn

x = torch.ones(1, 1, 6, 6)
y = nn.Conv2d(in_channels=1, out_channels=1,
              kernel_size=5, stride=2,
              padding=2)(x)
I got the following outputs:
y.shape
==> torch.Size([1, 1, 3, 3]) ("So shape of convolved image = (3,3) instead of (5,5)")
y[0][0]
==> tensor([[0.1892, 0.1718, 0.2627, 0.2627, 0.4423, 0.2906],
            [0.4578, 0.6136, 0.7614, 0.7614, 0.9293, 0.6835],
            [0.2679, 0.5373, 0.6183, 0.6183, 0.7267, 0.5638],
            [0.2679, 0.5373, 0.6183, 0.6183, 0.7267, 0.5638],
            [0.2589, 0.5793, 0.5466, 0.5466, 0.4823, 0.4467],
            [0.0760, 0.2057, 0.1017, 0.1017, 0.0660, 0.0411]],
           grad_fn=<SelectBackward>)
Normally it should be filled with zeroes. I'm confused. Can anyone help please?
The input is padded, not the output. In your case, the conv2d layer will apply a two-pixel padding on all sides just before computing the convolution operation.
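Concretely, the output size follows the usual formula from the Conv2d docs (with dilation 1): out = floor((in + 2*padding - kernel_size) / stride) + 1. Here that gives floor((6 + 4 - 5) / 2) + 1 = 3, hence the (3,3) result.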
For illustration purposes,
>>> import torch.nn.functional as F
>>> weight = torch.rand(1, 1, 5, 5)
Here we apply a convolution with padding=2:
>>> x = torch.ones(1, 1, 6, 6)
>>> F.conv2d(x, weight, stride=2, padding=2)
tensor([[[[ 5.9152,  8.8923,  6.0984],
          [ 8.9397, 14.7627, 10.8613],
          [ 7.2708, 12.0152,  9.0840]]]])
And here we use no padding in the convolution but instead apply it ourselves to the input first:
>>> x_padded = F.pad(x, (2,)*4)
>>> x_padded
tensor([[[[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 0., 0.],
          [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]]])
>>> F.conv2d(x_padded, weight, stride=2)
tensor([[[[ 5.9152,  8.8923,  6.0984],
          [ 8.9397, 14.7627, 10.8613],
          [ 7.2708, 12.0152,  9.0840]]]])

pytorch loss function for regression model with a vector of values

I'm training a CNN architecture to solve a regression problem using PyTorch, where my output is a tensor of 25 values. The input/target tensor could be either all zeros or a gaussian distribution with a sigma value of 2. An example of a 4-sample batch is this one:
[[0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
 [0., 0., 0., 0., 0., 0., 0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109, 0., 0., 0., 0., 0., 0., 0., 0.],
 [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534],
 [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]
My question is how to design a loss function so that the model can effectively learn the regression output of 25 values.
I have tried two losses, torch.nn.MSELoss() and torch.nn.MSELoss() - torch.nn.CosineSimilarity(). They sort of work. However, the network sometimes has difficulty converging, especially when there are a lot of all-zero samples, which leads it to output a vector of 25 small values.
My question is, is there any other loss we could try?
Your values do not seem widely different in scale, so an MSELoss seems like it would work fine. Your model could be collapsing because of the many zeros in your target.
You can always try torch.nn.L1Loss() (but I do not expect it to be much better than torch.nn.MSELoss())
I suggest that you instead try to predict the gaussian mean/mu, and later try to re-create the gaussian for each sample if you really need it.
So you have two alternatives if you choose to try this method.
Alt 1
A good alternative is to encode your target to look like a classification target. Each 25-element vector becomes a single value: the index where the original target == 1 (the possible classes are 0, 1, 2, ..., 24). We can then assign samples that contain only zeros to an extra, last class, 25. So your target:
[[0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
 [0., 0., 0., 0., 0., 0., 0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109, 0., 0., 0., 0., 0., 0., 0., 0.],
 [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534],
 [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]
becomes
[4,
10,
20,
25]
If you do this, then you can try the common torch.nn.CrossEntropyLoss().
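With that encoding the model would output 26 logits (positions 0-24 plus the all-zeros class 25); a minimal sketch, where the logits are random stand-ins for your model output:
import torch

n_classes = 26                      # 25 gaussian positions + the "all zeros" class
logits = torch.randn(4, n_classes)  # stand-in for the network output on a 4-sample batch
targets = torch.tensor([4, 10, 20, 25])

loss = torch.nn.CrossEntropyLoss()(logits, targets)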
I do not know what your dataloader looks like, but given a single sample in your original format, you can convert it to my proposed format with:
def encode(tensor):
    # All-zero samples map to the extra class (index 25 == len(tensor))
    if tensor.sum() == 0:
        return len(tensor)
    return torch.argmax(tensor)
and back to a gaussian with:
def decode(value):
    n_values = 25
    zero = torch.zeros(n_values)
    # The extra class (25) decodes to the all-zero vector
    if value == n_values:
        return zero
    # Create gaussian around value
    std = 2
    n = torch.arange(n_values, dtype=torch.float) - value
    sig = 2 * std ** 2
    gauss = torch.exp(-n ** 2 / sig)
    # Only keep the values within +/- 6 positions of the center
    start_ix = max(value - 6, 0)
    end_ix = min(value + 7, n_values)
    zero[start_ix:end_ix] = gauss[start_ix:end_ix]
    return zero
(Note I have not tried them with batches, only samples)
Alt 2
The second option is to change your regression target (still only the argmax position, mu) to a nicer regression value in the range 0-1, and to add a separate neuron that outputs a "mask value" (also 0-1). Then your batch of:
[[0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
 [0., 0., 0., 0., 0., 0., 0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109, 0., 0., 0., 0., 0., 0., 0., 0.],
 [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.13534, 0.32465, 0.60653, 0.8825, 1.0000, 0.88250, 0.60653, 0.32465, 0.13534],
 [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]
becomes
# [Mask, mu]
[
    [1, 0.1666],  # True, 4/24
    [1, 0.4166],  # True, 10/24
    [1, 0.8333],  # True, 20/24
    [0, 0]        # False, undefined
]
If you are using this setup, then you should be able to use an MSELoss with a modification:
def custom_loss(input, target):
    # Assume target and input are of shape [batch, 2], laid out as [Mask, mu]
    mask = target[..., 0]
    mask_loss = torch.nn.functional.mse_loss(input[..., 0], target[..., 0])
    # The mu term only contributes where the target mask is 1
    mu_loss = torch.nn.functional.mse_loss(mask * input[..., 1], mask * target[..., 1])
    return (mask_loss + mu_loss) / 2
This loss only looks at the second value (mu) if the mask of the target is 1; otherwise it only tries to optimize for the correct mask.
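As a quick sanity check with made-up predictions (shape [batch, 2], matching the encoded batch above):
import torch

target = torch.tensor([[1., 0.1666], [1., 0.4166], [1., 0.8333], [0., 0.]])
pred   = torch.tensor([[0.9, 0.20],  [1.0, 0.40],  [0.8, 0.85],  [0.1, 0.7]])

# The stray mu of 0.7 in the last row is zeroed out by the target mask of 0
print(custom_loss(pred, target))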
To encode to this format you would use:
def encode(tensor):
    n_values = 25
    # All-zero samples: mask 0, undefined mu
    if tensor.sum() == 0:
        return torch.tensor([0., 0.])
    # Mask 1, with the argmax position rescaled to the range 0-1
    mu = float(torch.argmax(tensor)) / (n_values - 1)
    return torch.tensor([1., mu])
and to decode:
def decode(tensor):
    n_values = 25
    # Parse values
    mask, value = tensor
    mask = torch.round(mask)
    value = int(torch.round((n_values - 1) * value))
    zero = torch.zeros(n_values)
    if mask == 0:
        return zero
    # Create gaussian around value
    std = 2
    n = torch.arange(n_values, dtype=torch.float) - value
    sig = 2 * std ** 2
    gauss = torch.exp(-n ** 2 / sig)
    # Only keep the values within +/- 6 positions of the center
    start_ix = max(value - 6, 0)
    end_ix = min(value + 7, n_values)
    zero[start_ix:end_ix] = gauss[start_ix:end_ix]
    return zero
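A quick round trip on the first sample from the question should reproduce the encoding (a sketch; as above, untested on batches):
import torch

sample = torch.zeros(25)
sample[:11] = torch.tensor([0.13534, 0.32465, 0.60653, 0.8825, 1.0000,
                            0.88250, 0.60653, 0.32465, 0.13534, 0.043937, 0.011109])

code = encode(sample)    # tensor([1.0000, 0.1667]) -> mask 1, mu = 4/24
restored = decode(code)  # gaussian centered back at index 4
print(code, restored.argmax())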

concatenating multiple Numpy arrays in dictionaries

I have four dictionaries, and each value for each key in a dictionary is a 1D numpy array. I want to join all those numpy arrays into one. For example:
first_dictionary = {'feature1': array([0., 0., 1., 0.]),
                    'feature2': array([0., 1., 0., 0.]),
                    'feature3': array([1., 0., 0., 0., 0., 0.])}
second_dictionary = {'feature4': array([0.]),
                     'feature5': array([0., 0.]),
                     'feature6': array([0.023]),
                     'feature7': array([0.009]),
                     'feature8': array([0.])}
third_dictionary = {'feature9': array([0., 0., 0., 912., 0., 0., 0.]),
                    'feature10': array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])}
The resulting final numpy array should look like:
array([0., 0., 1., 0., 0., 1., 0., 0., 1., 0., 0., 0., 0., 0.,
       0., 0., 0., 0.023, 0.009, 0., 0.,
       0., 0., 912., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
As an example I've made these dictionaries smaller, but I have up to 50 keys in each of the dictionaries. So basically I want to join all of the numpy arrays in my dictionaries. How can I achieve this? Insights will be appreciated.
You could just do:
import numpy as np

output = []
for dictionary in [first_dictionary, second_dictionary, third_dictionary]:
    for key, value in dictionary.items():
        output += list(value)
output_array = np.array(output)
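A more vectorized alternative (dictionaries preserve insertion order on Python 3.7+, so the features stay in the order shown) is to concatenate the arrays directly:
import numpy as np

dicts = [first_dictionary, second_dictionary, third_dictionary]
output_array = np.concatenate([value for d in dicts for value in d.values()])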

Problem with updating running_mean and running_var in a custom Batchnorm built in Pytorch?

I have been trying to implement a custom batch normalization function such that it can be extended to the multi-GPU version, in particular, the DataParallel module in PyTorch. The custom batchnorm works fine when using 1 GPU, but, when extended to 2 or more, the running mean and variance are updated in the forward function; however, once execution returns from the network, the mean and variance are reinitialized to 0 and 1.
The torch.nn.DataParallel documentation mentions in its warning section that "In each forward, module is replicated on each device, so any updates to the running module in forward will be lost. For example, if module has a counter attribute that is incremented in each forward, it will always stay at the initial value because the update is done on the replicas which are destroyed after forward." But I am not really sure how to retain the mean and variance from the default device.
I have provided code along with the result obtained during multi-GPU training. The code uses the batchnorm implementation provided here.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.backends.cudnn as cudnn
import torchvision
import torchvision.transforms as transforms
from torch.nn.parameter import Parameter

class ptrblck_BatchNorm2d(nn.BatchNorm2d):
    def __init__(self, num_features, eps=1e-5, momentum=0.1,
                 affine=True, track_running_stats=True):
        super(ptrblck_BatchNorm2d, self).__init__(
            num_features, eps, momentum, affine, track_running_stats)

    def forward(self, input):
        self._check_input_dim(input)
        exponential_average_factor = 0.0
        if self.training and self.track_running_stats:
            if self.num_batches_tracked is not None:
                self.num_batches_tracked += 1
                if self.momentum is None:  # use cumulative moving average
                    exponential_average_factor = 1.0 / float(self.num_batches_tracked)
                else:  # use exponential moving average
                    exponential_average_factor = self.momentum
        # calculate running estimates
        if self.training:
            mean = input.mean([0, 2, 3])
            # use biased var in train
            var = input.var([0, 2, 3], unbiased=False)
            n = input.numel() / input.size(1)
            with torch.no_grad():
                self.running_mean = exponential_average_factor * mean \
                    + (1 - exponential_average_factor) * self.running_mean
                # update running_var with unbiased var
                self.running_var = exponential_average_factor * var * n / (n - 1) \
                    + (1 - exponential_average_factor) * self.running_var
        else:
            mean = self.running_mean
            var = self.running_var
        input = (input - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + self.eps)
        if self.affine:
            input = input * self.weight[None, :, None, None] + self.bias[None, :, None, None]
        return input
class net(nn.Module):
    def __init__(self):
        super(net, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        self.bn1 = ptrblck_BatchNorm2d(64)
        print("==> printing bn1 mean when init")
        print(self.bn1.running_mean)
        print("==> printing bn1 when init")
        print(self.bn1.running_mean)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.classifier = nn.Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu(x)
        x = self.pool(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        print("======================================================")
        print("==> printing bn1 running mean from NET during forward")
        print(net.module.bn1.running_mean)  # the global DataParallel-wrapped model
        print("==> printing bn1 running mean from SELF. during forward")
        print(self.bn1.running_mean)        # this replica's copy
        print("==> printing bn1 running var from NET during forward")
        print(net.module.bn1.running_var)
        print("==> printing bn1 running mean from SELF. during forward")
        print(self.bn1.running_var)
        return x
# Data
print('==> Preparing data..')
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False, num_workers=2)
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Model
print('==> Building model..')
net = net()
net = torch.nn.DataParallel(net).cuda()
print('Number of GPU {}'.format(torch.cuda.device_count()))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# Training
def train(epoch):
    print('\nEpoch: %d' % epoch)
    net.train()
    train_loss = 0
    correct = 0
    total = 0
    for batch_idx, (inputs, targets) in enumerate(trainloader):
        inputs, targets = inputs.cuda(), targets.cuda()
        outputs = net(inputs)
        loss = criterion(outputs, targets)
        print("====================================================")
        print("==> printing bn1 running mean FROM net after forward")
        print(net.module.bn1.running_mean)
        print("==> printing bn1 running var FROM net after forward")
        print(net.module.bn1.running_var)
        break
        # optimizer.zero_grad()
        # loss.backward()
        # optimizer.step()
        # train_loss += loss.item()
        # _, predicted = outputs.max(1)
        # total += targets.size(0)
        # correct += predicted.eq(targets).sum().item()
        # break

for epoch in range(0, 1):
    train(epoch)
Result:
==> Preparing data..
Files already downloaded and verified
Files already downloaded and verified
==> Building model..
==> printing bn1 mean when init
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
==> printing bn1 when init
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Number of GPU 2
Epoch: 0
======================================================
==> printing bn1 running mean from NET during forward
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
device='cuda:0')
==> printing bn1 running mean from SELF. during forward
tensor([ 0.0053, 0.0010, -0.0077, -0.0290, 0.0241, 0.0258, -0.0048, 0.0151,
-0.0133, 0.0080, 0.0197, -0.0042, -0.0188, 0.0233, 0.0310, -0.0230,
-0.0133, 0.0222, 0.0119, -0.0042, -0.0220, -0.0169, -0.0342, -0.0025,
0.0338, -0.0070, 0.0202, 0.0050, 0.0108, 0.0008, 0.0363, 0.0347,
-0.0106, 0.0082, 0.0128, 0.0074, 0.0111, -0.0030, -0.0089, 0.0070,
-0.0262, -0.0029, 0.0053, -0.0136, -0.0183, 0.0045, -0.0014, -0.0221,
0.0132, 0.0064, 0.0388, -0.0220, -0.0008, 0.0400, -0.0187, 0.0397,
-0.0131, -0.0176, 0.0035, 0.0055, -0.0270, 0.0066, -0.0149, 0.0135],
device='cuda:0')
==> printing bn1 running var from NET during forward
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:0')
==> printing bn1 running mean from SELF. during forward
tensor([0.9665, 0.9073, 0.9220, 1.0947, 1.0687, 0.9624, 0.9252, 0.9131, 0.9066,
0.9536, 0.9258, 0.9203, 1.0359, 0.9690, 1.1066, 1.0636, 0.9135, 0.9644,
0.9373, 0.9846, 0.9696, 0.9454, 1.0459, 0.9245, 0.9778, 0.9709, 0.9352,
0.9995, 0.9657, 0.9510, 1.0943, 1.0171, 0.9298, 1.0747, 0.9341, 0.9635,
0.9978, 0.9303, 0.9261, 0.9137, 0.9569, 1.0066, 1.0463, 0.9955, 0.9621,
0.9172, 0.9836, 0.9817, 0.9086, 0.9576, 1.0905, 0.9861, 0.9661, 1.1773,
0.9345, 1.0904, 0.9133, 1.0660, 0.9164, 0.9058, 0.9446, 0.9225, 1.0914,
0.9292], device='cuda:0')
======================================================
==> printing bn1 running mean from NET during forward
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
device='cuda:0')
==> printing bn1 running mean from SELF. during forward
tensor([-0.0020, 0.0002, -0.0103, -0.0426, 0.0386, 0.0311, -0.0059, 0.0151,
-0.0140, 0.0145, 0.0218, -0.0029, -0.0281, 0.0284, 0.0449, -0.0329,
-0.0107, 0.0278, 0.0135, -0.0123, -0.0260, -0.0214, -0.0423, -0.0035,
0.0410, -0.0097, 0.0276, 0.0102, 0.0197, -0.0001, 0.0483, 0.0451,
-0.0078, 0.0190, 0.0135, -0.0004, 0.0196, -0.0028, -0.0140, 0.0070,
-0.0332, -0.0110, 0.0151, -0.0210, -0.0226, 0.0074, -0.0088, -0.0314,
0.0125, -0.0003, 0.0505, -0.0312, 0.0086, 0.0544, -0.0245, 0.0528,
-0.0086, -0.0290, 0.0063, 0.0042, -0.0339, 0.0061, -0.0277, 0.0092],
device='cuda:1')
==> printing bn1 running var from NET during forward
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:0')
==> printing bn1 running mean from SELF. during forward
tensor([0.9665, 0.9072, 0.9211, 1.0999, 1.0714, 0.9610, 0.9209, 0.9125, 0.9063,
0.9553, 0.9260, 0.9189, 1.0386, 0.9706, 1.1139, 1.0610, 0.9121, 0.9660,
0.9366, 0.9886, 0.9683, 0.9454, 1.0511, 0.9227, 0.9792, 0.9704, 0.9330,
0.9989, 0.9657, 0.9476, 1.1008, 1.0191, 0.9294, 1.0814, 0.9320, 0.9642,
1.0006, 0.9287, 0.9254, 0.9128, 0.9559, 1.0100, 1.0521, 0.9972, 0.9621,
0.9168, 0.9849, 0.9803, 0.9083, 0.9556, 1.0946, 0.9865, 0.9651, 1.1880,
0.9330, 1.0959, 0.9116, 1.0706, 0.9149, 0.9057, 0.9450, 0.9215, 1.0972,
0.9261], device='cuda:1')
====================================================
==> printing bn1 running mean FROM net after forward
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
device='cuda:0')
==> printing bn1 running var FROM net after forward
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:0')
How can I make sure that the running estimates of the default device are used? Currently, I am not working towards synchronized batchnorm.
Replacing
self.running_mean = (...)
with
self.running_mean.copy_(...)
did the job. Presumably this works because the replica on the default device shares its buffers with the original module, so an in-place copy writes through to the original tensor, whereas a plain assignment only rebinds the attribute on the short-lived replica.
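Concretely, the running-stat update inside forward would become (a sketch of the same code with the in-place fix applied):
with torch.no_grad():
    # in-place copies mutate the existing buffers instead of rebinding
    # the attribute on the throw-away replica
    self.running_mean.copy_(exponential_average_factor * mean
                            + (1 - exponential_average_factor) * self.running_mean)
    # update running_var with unbiased var
    self.running_var.copy_(exponential_average_factor * var * n / (n - 1)
                           + (1 - exponential_average_factor) * self.running_var)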

How to keep using values from a list until the diagonal of a matrix is full using itertools

So I am trying to use a smaller list to populate the diagonal of a larger matrix. I thought using the cycle function in itertools would make this an easy task, but I can't seem to get it to work. Here is what I tried:
a = np.zeros((10,10))
b = [1, 2, 3, 4, 5]
for i in range(len(a.shape[0])):
    a[i, i] = list(itertools.cycle(b))
but this iterates endlessly. I am hoping that it will stop once the diagonal has been filled. Other, more pythonic options are greatly appreciated!
You mean to use itertools.cycle, not itertools.repeat. The latter repeats its element (the whole list); good luck assigning that to a single cell, especially if you force iteration (since it runs forever).
I'd create a cycle object once, outside the loop, and fill the diagonal by iterating over it manually with next() (the only proper way to consume a cycle). Also note that your loop range was wrong: a.shape[0] is already an integer dimension, so there is no need for len.
import itertools
import numpy as np

a = np.zeros((10, 10))
b = [1, 2, 3, 4, 5]
iterator = itertools.cycle(b)
for i in range(a.shape[0]):
    a[i, i] = next(iterator)
result:
>>> a
array([[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 2., 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 3., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 4., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 5., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 2., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 3., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0., 4., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 5.]])
Because they loop forever, cycle and repeat should not be used in a context of forced iteration (repeat does have an optional times parameter to limit the repeats, though).
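If I remember the numpy API correctly, np.fill_diagonal also accepts an array-like value and repeats it as needed to fill the whole diagonal, which removes the explicit loop entirely:
import numpy as np

a = np.zeros((10, 10))
b = [1, 2, 3, 4, 5]
# an array-like val is written along the diagonal, repeating if necessary
np.fill_diagonal(a, b)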
