RuntimeError: CUDA out of memory - pytorch

I got this Error:
RuntimeError: CUDA out of memory
GPU 0; 1.95 GiB total capacity; 1.23 GiB already allocated; 1.27 GiB reserved in total by PyTorch
But it does not seem to actually be out of memory; to me it looks like PyTorch is reserving the wrong amount of memory. I changed the batch size to 1, killed every app that was using GPU memory, and rebooted, but none of that worked.
This is how I run it. Please let me know what information is needed to fix this, or where I should check. Thank you.
python train.py --img 416 --batch 16 --epochs 1 \
    --data '../data.yaml' --cfg ./models/yolov4-csp.yaml \
    --weights '' --name yolov4-csp-results --cache
Using CUDA device0 _CudaDeviceProperties(name='Quadro P620', total_memory=2000MB)
Namespace(adam=False, batch_size=16, bucket='', cache_images=True, cfg='./models/yolov4-csp.yaml', data='../data.yaml', device='', epochs=1, evolve=False, global_rank=-1, hyp='data/hyp.scratch.yaml', img_size=[416, 416], local_rank=-1, logdir='runs/', multi_scale=False, name='yolov4-csp-results', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batch_size=16, weights='', world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
Hyperparameters {'lr0': 0.01, 'momentum': 0.937, 'weight_decay': 0.0005, 'giou': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.5, 'scale': 0.5, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mixup': 0.0}
Overriding ./models/yolov4-csp.yaml nc=80 with nc=1
from n params module arguments
0 -1 1 928 models.common.Conv [3, 32, 3, 1]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 20672 models.common.Bottleneck [64, 64]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 1 119936 models.common.BottleneckCSP [128, 128, 2]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 1 1463552 models.common.BottleneckCSP [256, 256, 8]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 5843456 models.common.BottleneckCSP [512, 512, 8]
9 -1 1 4720640 models.common.Conv [512, 1024, 3, 2]
10 -1 1 12858368 models.common.BottleneckCSP [1024, 1024, 4]
11 -1 1 7610368 models.common.SPPCSP [1024, 512, 1]
12 -1 1 131584 models.common.Conv [512, 256, 1, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 8 1 131584 models.common.Conv [512, 256, 1, 1]
15 [-1, -2] 1 0 models.common.Concat [1]
16 -1 1 1642496 models.common.BottleneckCSP2 [512, 256, 2]
17 -1 1 33024 models.common.Conv [256, 128, 1, 1]
18 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
19 6 1 33024 models.common.Conv [256, 128, 1, 1]
20 [-1, -2] 1 0 models.common.Concat [1]
21 -1 1 411648 models.common.BottleneckCSP2 [256, 128, 2]
22 -1 1 295424 models.common.Conv [128, 256, 3, 1]
23 -2 1 295424 models.common.Conv [128, 256, 3, 2]
24 [-1, 16] 1 0 models.common.Concat [1]
25 -1 1 1642496 models.common.BottleneckCSP2 [512, 256, 2]
26 -1 1 1180672 models.common.Conv [256, 512, 3, 1]
27 -2 1 1180672 models.common.Conv [256, 512, 3, 2]
28 [-1, 11] 1 0 models.common.Concat [1]
29 -1 1 6561792 models.common.BottleneckCSP2 [1024, 512, 2]
30 -1 1 4720640 models.common.Conv [512, 1024, 3, 1]
31 [22, 26, 30] 1 32310 models.yolo.Detect [1, [[12, 16, 19, 36, 40, 28], [36, 75, 76, 55, 72, 146], [142, 110, 192, 243, 459, 401]], [256, 512, 1024]]
Model Summary: 334 layers, 5.24994e+07 parameters, 5.24994e+07 gradients
Optimizer groups: 111 .bias, 115 conv.weight, 108 other
Scanning labels ../train/labels.cache (78 found, 0 missing, 0 empty, 0 duplicate, for 78 images): 100%|█| 78/78 [00:00<0
Caching images (0.0GB): 100%|██████████████████████████████████████████████| 78/78 [00:00<00:00, 305.27it/s]
Scanning labels ../valid/labels.cache (15 found, 0 missing, 0 empty, 0 duplicate, for 15 images): 100%|█| 15/15 [00:00<0
Caching images (0.0GB): 100%|██████████████████████████████████████████████| 15/15 [00:00<00:00, 333.01it/s]
Analyzing anchors... anchors/target = 4.64, Best Possible Recall (BPR) = 1.0000
Image sizes 416 train, 416 test
Using 8 dataloader workers
Starting training for 1 epochs...
Epoch gpu_mem GIoU obj cls total targets img_size
0%| | 0/5 [00:04<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 443, in <module>
train(hyp, opt, device, tb_writer)
File "train.py", line 256, in train
pred = model(imgs)
File "/home/ctdi/anaconda3/envs/scaled-yolov4.03/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ctdi/content/ScaledYOLOv4/models/yolo.py", line 109, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/home/ctdi/content/ScaledYOLOv4/models/yolo.py", line 129, in forward_once
x = m(x) # run
File "/home/ctdi/anaconda3/envs/scaled-yolov4.03/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ctdi/content/ScaledYOLOv4/models/common.py", line 47, in forward
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
File "/home/ctdi/anaconda3/envs/scaled-yolov4.03/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ctdi/content/ScaledYOLOv4/models/common.py", line 31, in forward
return self.act(self.bn(self.conv(x)))
File "/home/ctdi/anaconda3/envs/scaled-yolov4.03/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ctdi/anaconda3/envs/scaled-yolov4.03/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 136, in forward
self.weight, self.bias, bn_training, exponential_average_factor, self.eps)
File "/home/ctdi/anaconda3/envs/scaled-yolov4.03/lib/python3.6/site-packages/torch/nn/functional.py", line 2059, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 44.00 MiB (GPU 0; 1.95 GiB total capacity; 1.23 GiB already allocated; 26.94 MiB free; 1.27 GiB reserved in total by PyTorch)

I finally found it. The problem was that I was using the new CUDA 11.2, which this setup does not handle well. I removed it and installed CUDA 10.2 instead; that fixed the problem.
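For anyone hitting a similar mismatch, a quick sanity check is to print the CUDA version PyTorch was actually built against and the current memory counters. A minimal sketch using standard torch APIs (memory_reserved needs PyTorch >= 1.4):

import torch
print(torch.__version__)               # PyTorch build
print(torch.version.cuda)              # CUDA toolkit PyTorch was compiled against
print(torch.cuda.get_device_name(0))   # e.g. 'Quadro P620'
print(torch.cuda.memory_allocated(0))  # bytes currently allocated by tensors
print(torch.cuda.memory_reserved(0))   # bytes reserved by the caching allocator

If torch.version.cuda disagrees with the CUDA toolkit/driver you installed, that mismatch is worth ruling out before shrinking the model or batch size further.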

Related

Problem with pytorch hooks? Activation maps always positive

I was looking at the activation maps of vgg19 in pytorch.
I found that all the values of the maps are positive, even before I applied the ReLU.
This seems very strange to me. If this were correct (could it be that I did not use the register_forward_hook method correctly?), why would one apply ReLU at all?
This is my code to produce this:
import torch
import torchvision
import torchvision.models as models
import torchvision.transforms as transforms
from torchsummary import summary
import os, glob
import matplotlib.pyplot as plt
import numpy as np

# settings:
batch_size = 4

# load the model
model = models.vgg19(pretrained=True)
summary(model.cuda(), (3, 32, 32))
model.cpu()

# how to preprocess? See here:
# https://discuss.pytorch.org/t/how-to-preprocess-input-for-pre-trained-networks/683/2
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
transform = transforms.Compose([transforms.ToTensor(),
                                normalize])

# build data loader
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

# get one batch
dataiter = iter(trainloader)
images, labels = next(dataiter)  # dataiter.next() in older versions

# set a hook
activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

# hook at the first conv layer
hook = model.features[0].register_forward_hook(get_activation("firstConv"))
model(images)
hook.remove()

# show results:
flatted_feat_maps = activation["firstConv"].detach().numpy().flatten()
print("All positive??? --> ", np.all(flatted_feat_maps >= 0))
plt.hist(flatted_feat_maps)
plt.show()
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 32, 32] 1,792
ReLU-2 [-1, 64, 32, 32] 0
Conv2d-3 [-1, 64, 32, 32] 36,928
ReLU-4 [-1, 64, 32, 32] 0
MaxPool2d-5 [-1, 64, 16, 16] 0
Conv2d-6 [-1, 128, 16, 16] 73,856
ReLU-7 [-1, 128, 16, 16] 0
Conv2d-8 [-1, 128, 16, 16] 147,584
ReLU-9 [-1, 128, 16, 16] 0
MaxPool2d-10 [-1, 128, 8, 8] 0
Conv2d-11 [-1, 256, 8, 8] 295,168
ReLU-12 [-1, 256, 8, 8] 0
Conv2d-13 [-1, 256, 8, 8] 590,080
ReLU-14 [-1, 256, 8, 8] 0
Conv2d-15 [-1, 256, 8, 8] 590,080
ReLU-16 [-1, 256, 8, 8] 0
Conv2d-17 [-1, 256, 8, 8] 590,080
ReLU-18 [-1, 256, 8, 8] 0
MaxPool2d-19 [-1, 256, 4, 4] 0
Conv2d-20 [-1, 512, 4, 4] 1,180,160
ReLU-21 [-1, 512, 4, 4] 0
Conv2d-22 [-1, 512, 4, 4] 2,359,808
ReLU-23 [-1, 512, 4, 4] 0
Conv2d-24 [-1, 512, 4, 4] 2,359,808
ReLU-25 [-1, 512, 4, 4] 0
Conv2d-26 [-1, 512, 4, 4] 2,359,808
ReLU-27 [-1, 512, 4, 4] 0
MaxPool2d-28 [-1, 512, 2, 2] 0
Conv2d-29 [-1, 512, 2, 2] 2,359,808
ReLU-30 [-1, 512, 2, 2] 0
Conv2d-31 [-1, 512, 2, 2] 2,359,808
ReLU-32 [-1, 512, 2, 2] 0
Conv2d-33 [-1, 512, 2, 2] 2,359,808
ReLU-34 [-1, 512, 2, 2] 0
Conv2d-35 [-1, 512, 2, 2] 2,359,808
ReLU-36 [-1, 512, 2, 2] 0
MaxPool2d-37 [-1, 512, 1, 1] 0
AdaptiveAvgPool2d-38 [-1, 512, 7, 7] 0
Linear-39 [-1, 4096] 102,764,544
ReLU-40 [-1, 4096] 0
Dropout-41 [-1, 4096] 0
Linear-42 [-1, 4096] 16,781,312
ReLU-43 [-1, 4096] 0
Dropout-44 [-1, 4096] 0
Linear-45 [-1, 1000] 4,097,000
================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 5.25
Params size (MB): 548.05
Estimated Total Size (MB): 553.31
----------------------------------------------------------------
Could it be that I somehow did not use the register_forward_hook correctly?
You should clone the output in
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach().clone()  # clone, not just detach
    return hook
Note that Tensor.detach only detaches the tensor from the graph; both tensors still share the same underlying storage. In torchvision's VGG the ReLU layers are constructed with inplace=True, so the subsequent ReLU overwrites the conv output's storage in place, which is why the saved activation ends up all non-negative without the clone. From the documentation of detach:
Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks. IMPORTANT NOTE: Previously, in-place size / stride / storage changes (such as resize_ / resize_as_ / set_ / transpose_) to the returned tensor also update the original tensor. Now, these in-place changes will not update the original tensor anymore, and will instead trigger an error. For sparse tensors: In-place indices / values changes (such as zero_ / copy_ / add_) to the returned tensor will not update the original tensor anymore, and will instead trigger an error.
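A minimal self-contained check of this effect (a toy module of my own, not the original VGG code): with clone(), the saved pre-ReLU activation keeps its negative values even though an inplace ReLU runs afterwards.

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, 3)
relu = nn.ReLU(inplace=True)  # same setting torchvision's VGG uses

activation = {}
def hook(module, inp, out):
    activation["conv"] = out.detach().clone()  # drop .clone() and the negatives vanish

h = conv.register_forward_hook(hook)
_ = relu(conv(torch.randn(1, 3, 16, 16)))
h.remove()
print((activation["conv"] < 0).any())  # tensor(True) with clone()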

During triple-Gaussian fitting, large divergence from initial guess values and high chi-square value

I am using this code:
import matplotlib.pyplot as plt
from numpy import exp, loadtxt, pi, sqrt
from lmfit import Model

data = loadtxt('model1d_gauss.dat')
x = data[:, 0]
y = data[:, 1] + 0.25*x - 1.0

def gaussian(x, amp, cen, wid):
    """1-d gaussian: gaussian(x, amp, cen, wid)"""
    return (amp / (sqrt(2*pi) * wid)) * exp(-(x-cen)**2 / (2*wid**2))

def line(x, slope, intercept):
    """a line"""
    return slope*x + intercept

mod = Model(gaussian) + Model(line)
pars = mod.make_params(amp=5, cen=5, wid=1, slope=0, intercept=1)
result = mod.fit(y, pars, x=x)
print(result.fit_report())

plt.plot(x, y, 'bo')
plt.plot(x, result.init_fit, 'k--')
plt.plot(x, result.best_fit, 'r-')
plt.show()
Instead of "gaussian_plus_line" fitting, I tried to fitted my data with triple gaussian fit. And modify this code like this,
y=[9, 11, 7, 6, 21, 9, 36, 8, 22, 7, 25, 27, 18, 22, 22, 18, 21, 17, 16, 13, 30, 8, 10, 18, 12, 17, 24, 19, 18, 25, 6, 18, 20, 36, 22, 12, 25, 20, 22, 32, 30, 32, 51, 52, 46, 41, 49, 51, 56, 71, 56, 58, 73, 66, 71, 80, 76, 90, 71, 71, 87, 68, 74, 67, 71, 67, 75, 51, 51, 57, 38, 45, 39, 37, 23, 23, 21, 20, 13, 9, 10, 7, 5, 5, 9, 5, 6, 5, 0]
x=[-4.91, -3.29, -2.5700000000000003, -2.39, -2.21, -1.94, -1.67, -1.4900000000000002, -1.4000000000000004, -1.2200000000000002, -1.1300000000000003, -1.04, -0.8600000000000003, -0.6799999999999997, -0.5, -0.41000000000000014, -0.3200000000000003, -0.23000000000000043, -0.14000000000000057, -0.04999999999999982, 0.040000000000000036, 0.1299999999999999, 0.21999999999999975, 0.3099999999999996, 0.39999999999999947, 0.4900000000000002, 0.5800000000000001, 0.6699999999999999, 0.7599999999999998, 0.8499999999999996, 0.9399999999999995, 1.0299999999999994, 1.12, 1.21, 1.2999999999999998, 1.3899999999999997, 1.4799999999999995, 1.5699999999999994, 1.6600000000000001, 1.75, 1.8399999999999999, 1.9299999999999997, 2.0199999999999996, 2.1099999999999994, 2.1999999999999993, 2.29, 2.38, 2.4699999999999998, 2.5599999999999996, 2.6499999999999995, 2.7399999999999993, 2.83, 2.92, 3.01, 3.0999999999999996, 3.1899999999999995, 3.2799999999999994, 3.369999999999999, 3.459999999999999, 3.549999999999999, 3.6400000000000006, 3.7300000000000004, 3.8200000000000003, 3.91, 4.0, 4.09, 4.18, 4.27, 4.359999999999999, 4.449999999999999, 4.539999999999999, 4.629999999999999, 4.719999999999999, 4.8100000000000005, 4.9, 4.99, 5.08, 5.17, 5.26, 5.35, 5.4399999999999995, 5.529999999999999, 5.619999999999999, 5.709999999999999, 5.799999999999999, 5.98, 6.07, 6.25, 6.609999999999999]
def gaussian1(x, amp1, cen1, wid1):
    "1-d gaussian: gaussian(x, amp, cen, wid)"
    return (amp1/(sqrt(2*pi)*wid1)) * exp(-(x-cen1)**2 /(2*wid1**2))

def gaussian2(x, amp2, cen2, wid2):
    "1-d gaussian: gaussian(x, amp, cen, wid)"
    return (amp2/(sqrt(2*pi)*wid2)) * exp(-(x-cen2)**2 /(2*wid2**2))

def gaussian3(x, amp3, cen3, wid3):
    "1-d gaussian: gaussian(x, amp, cen, wid)"
    return (amp3/(sqrt(2*pi)*wid3)) * exp(-(x-cen3)**2 /(2*wid3**2))

mod = Model(gaussian1) + Model(gaussian2) + Model(gaussian3)
pars = mod.make_params(amp1=23, cen1=-1.5, wid1=3.5,
                       amp2=17, cen2=1.0, wid2=1.5,
                       amp3=80, cen3=3.5, wid3=3.0)
result = mod.fit(y, pars, x=x)
print(result.fit_report())

#plt.plot(x, y, 'bo')
plt.plot(x, result.init_fit, 'k--')
plt.plot(x, result.best_fit, 'r-')
plt.show()
But the problem is that the fitted parameters diverge hugely from the initial guesses, and the chi-square value is very high. I attached the output below.
[[Model]]
((Model(gaussian1) + Model(gaussian2)) + Model(gaussian3))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 751
# data points = 89
# variables = 9
chi-square = 3715.94994
reduced chi-square = 46.4493743
Akaike info crit = 350.126040
Bayesian info crit = 372.523767
## Warning: uncertainties could not be estimated:
[[Variables]]
amp1: -7174.13129 (init = 23)
cen1: -853.048883 (init = -1.5)
wid1: -84.6651961 (init = 3.5)
amp2: -189.857626 (init = 17)
cen2: 3.47343596 (init = 1)
wid2: -1.02072899 (init = 1.5)
amp3: 111.911023 (init = 80)
cen3: -0.65585443 (init = 3.5)
wid3: 2.37279022 (init = 3)
Please help me reduce the chi-square value and keep the parameter values near the initial guesses.
The fit is telling you what a basic plot of your data will suggest: there are two Gaussians in this data, one with an amplitude around 190 centered around 3.5, and one with an amplitude around 110 centered at -0.65. There is not a third Gaussian.
You might give better guesses, or use a better definition of a Gaussian (ahem, like the built-in version ;)) that prevents the width from going negative, as sketched below. You could also place bounds on the parameter values, but my guess is that the data will always show two reliable peaks, not a reliable third one.
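For illustration, a sketch using lmfit's built-in GaussianModel with two components and positive-width bounds; the g1_/g2_ prefixes and the starting values (read off the fit output above) are my own choices:

from lmfit.models import GaussianModel

# x, y as defined in the question
mod = GaussianModel(prefix='g1_') + GaussianModel(prefix='g2_')
pars = mod.make_params(g1_amplitude=110, g1_center=-0.65, g1_sigma=2.4,
                       g2_amplitude=190, g2_center=3.5, g2_sigma=1.0)
pars['g1_sigma'].set(min=1e-3)  # keep widths strictly positive
pars['g2_sigma'].set(min=1e-3)
result = mod.fit(y, pars, x=x)
print(result.fit_report())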

How to get information like the output of Model.get_config() in Keras?

I want to get information like this: each layer's input and output shapes.
You can try torchsummary's summary; here is an example:
import torchvision.models as models
from torchsummary import summary

vgg = models.vgg16()
summary(vgg, (3, 224, 224))
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 224, 224] 1792
ReLU-2 [-1, 64, 224, 224] 0
Conv2d-3 [-1, 64, 224, 224] 36928
ReLU-4 [-1, 64, 224, 224] 0
MaxPool2d-5 [-1, 64, 112, 112] 0
Conv2d-6 [-1, 128, 112, 112] 73856
ReLU-7 [-1, 128, 112, 112] 0
Conv2d-8 [-1, 128, 112, 112] 147584
ReLU-9 [-1, 128, 112, 112] 0
MaxPool2d-10 [-1, 128, 56, 56] 0
Conv2d-11 [-1, 256, 56, 56] 295168
ReLU-12 [-1, 256, 56, 56] 0
Conv2d-13 [-1, 256, 56, 56] 590080
ReLU-14 [-1, 256, 56, 56] 0
Conv2d-15 [-1, 256, 56, 56] 590080
ReLU-16 [-1, 256, 56, 56] 0
MaxPool2d-17 [-1, 256, 28, 28] 0
Conv2d-18 [-1, 512, 28, 28] 1180160
ReLU-19 [-1, 512, 28, 28] 0
Conv2d-20 [-1, 512, 28, 28] 2359808
ReLU-21 [-1, 512, 28, 28] 0
Conv2d-22 [-1, 512, 28, 28] 2359808
ReLU-23 [-1, 512, 28, 28] 0
MaxPool2d-24 [-1, 512, 14, 14] 0
Conv2d-25 [-1, 512, 14, 14] 2359808
ReLU-26 [-1, 512, 14, 14] 0
Conv2d-27 [-1, 512, 14, 14] 2359808
ReLU-28 [-1, 512, 14, 14] 0
Conv2d-29 [-1, 512, 14, 14] 2359808
ReLU-30 [-1, 512, 14, 14] 0
MaxPool2d-31 [-1, 512, 7, 7] 0
Linear-32 [-1, 4096] 102764544
ReLU-33 [-1, 4096] 0
Dropout-34 [-1, 4096] 0
Linear-35 [-1, 4096] 16781312
ReLU-36 [-1, 4096] 0
Dropout-37 [-1, 4096] 0
Linear-38 [-1, 1000] 4097000
================================================================
Total params: 138357544
Trainable params: 138357544
Non-trainable params: 0
----------------------------------------------------------------
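If you want the shapes programmatically rather than printed (closer to what get_config() returns in Keras), here is a plain-PyTorch sketch using forward hooks, with no extra dependency (variable names are my own):

import torch
import torchvision.models as models

model = models.vgg16()
shapes = {}
handles = []
for name, module in model.named_modules():
    if not list(module.children()):  # leaf modules only
        handles.append(module.register_forward_hook(
            lambda m, inp, out, name=name:
                shapes.__setitem__(name, (tuple(inp[0].shape), tuple(out.shape)))))
model(torch.randn(1, 3, 224, 224))
for h in handles:
    h.remove()
for name, (in_shape, out_shape) in shapes.items():
    print(f"{name}: {in_shape} -> {out_shape}")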

VOC2012: PIL Image.open converts PNG to 2d array

I am working with the VOC2012 dataset. The input image is in PNG format and has a shape of (375, 500, 4) when I use imageio to open it. When I use PIL to open the image, the shape suddenly becomes (500, 375). PNG images can have four channels on the last axis: R, G, B & alpha.
The image is obviously a colored image, so it should have 3 dimensions (height, width, depth). PIL seems to suggest that it only has two: width & height.
Can PNG images be represented by a 2d array? Please help! So lost at the moment. Thanks!
from PIL import Image
from keras.preprocessing.image import img_to_array
import os, imageio
import numpy as np

root_path = '/Users/johnson/Downloads/'

imageio_img = imageio.imread(
    os.path.join(root_path, '2009_003193.png')
)
# (375, 500, 4)
print(imageio_img.shape)
# [  0 128 192 224 255]
print(np.unique(imageio_img))

PIL_img = Image.open(
    os.path.join(root_path, '2009_003193.png')
)
# (500, 375)
print(PIL_img.size)

PIL_img_to_array = img_to_array(PIL_img)
# (375, 500, 1)
print(PIL_img_to_array.shape)
# [  0.   2. 255.]
print(np.unique(PIL_img_to_array))
It's also quite magical that PIL seems to know how VOC2012 labels the data. PIL_img_to_array has the unique values [0, 2, 255]. Conveniently, 2 denotes bicycle in VOC2012, 0 means background, and 255 probably means the yellowish boundary around the bicycle. But in the first code snippet I never passed the Pascal classes to PIL for conversion.
def pascal_classes():
    classes = {'aeroplane' : 1,  'bicycle'   : 2,  'bird'        : 3,  'boat'         : 4,
               'bottle'    : 5,  'bus'       : 6,  'car'         : 7,  'cat'          : 8,
               'chair'     : 9,  'cow'       : 10, 'diningtable' : 11, 'dog'          : 12,
               'horse'     : 13, 'motorbike' : 14, 'person'      : 15, 'potted-plant' : 16,
               'sheep'     : 17, 'sofa'      : 18, 'train'       : 19, 'tv/monitor'   : 20}
    return classes

def pascal_palette():
    palette = {(  0,   0,   0) : 0,
               (128,   0,   0) : 1,
               (  0, 128,   0) : 2,
               (128, 128,   0) : 3,
               (  0,   0, 128) : 4,
               (128,   0, 128) : 5,
               (  0, 128, 128) : 6,
               (128, 128, 128) : 7,
               ( 64,   0,   0) : 8,
               (192,   0,   0) : 9,
               ( 64, 128,   0) : 10,
               (192, 128,   0) : 11,
               ( 64,   0, 128) : 12,
               (192,   0, 128) : 13,
               ( 64, 128, 128) : 14,
               (192, 128, 128) : 15,
               (  0,  64,   0) : 16,
               (128,  64,   0) : 17,
               (  0, 192,   0) : 18,
               (128, 192,   0) : 19,
               (  0,  64, 128) : 20}
    return palette
Your image is palletised, not RGB. Each pixel is represented by an 8-bit index into a palette. You can see this by looking at image.mode which shows up as P.
If you want an RGB image, use:
rgb = Image.open('bike.png').convert('RGB')
If you want an RGBA image with transparency, use:
RGBA = Image.open('bike.png').convert('RGBA')
However, there is no useful information in the alpha channel, so that seems pointless.
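A quick way to confirm this on the file in question (a minimal sketch):

from PIL import Image

im = Image.open('bike.png')
print(im.mode)                             # 'P' -> palettised: one 8-bit palette index per pixel
print(im.convert('RGB').getpixel((0, 0)))  # palette lookup yields an RGB triple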
Regarding the pascal palette, you can get that via PIL like this:
im = Image.open('bike.png')
p = im.getpalette()
for i in range(256):
    print(p[3*i:3*i+3])
[0, 0, 0]
[128, 0, 0]
[0, 128, 0]
[128, 128, 0]
[0, 0, 128]
[128, 0, 128]
[0, 128, 128]
[128, 128, 128]
[64, 0, 0]
[192, 0, 0]
[64, 128, 0]
[192, 128, 0]
[64, 0, 128]
[192, 0, 128]
[64, 128, 128]
[192, 128, 128]
[0, 64, 0]
[128, 64, 0]
[0, 192, 0]
[128, 192, 0]
[0, 64, 128]
(... the loop prints all 256 entries; 234 intermediate ones omitted here; the last, index 255, is the "yellowish" boundary colour ...)
[224, 224, 192]
Then, if you want to make the bicycle red, you can do:
import numpy as np
from PIL import Image

# Load the image and make a Numpy version
im = Image.open('bike.png')
n = np.array(im)
# Make all pixels belonging to bike (2) into red (palette index 9)
n[n==2] = 9
# Make all pixels not red (9) into grey (palette index 7)
n[n!=9] = 7
# Convert back into a PIL palettised image and re-apply the original palette
r = Image.fromarray(n, mode='P')
r.putpalette(im.getpalette())
r.save('result.png')
Keywords: Python, PIL, Pillow, image processing, palette, palette operations, masked image, mask, extract palette, apply palette.

How to iterate over a Tensor in Tensorflow and change its values if necessary?

Assume I have a Tensor in TensorFlow of shape [600, 11], where all elements of the last (11th) column are zero. I want to iterate over the rows of the Tensor as follows: for each row, I check whether the maximum of its first 10 elements is greater than a value X. If true, the row is kept unchanged; if false, the first 10 elements of the row are set to zero and the 11th element is set to 1. How can I do that? The structure of my Tensor is shown below:
import tensorflow as tf
a = tf.zeros([600, 1], dtype=tf.float32)
b = tf.random.uniform([600,10], minval=0, maxval=1, dtype=tf.float32)
c = tf.concat([b, a], axis=1)
You cannot iterate through tensors, nor set the value of individual elements. Tensors are immutable, so you always have to build a new tensor from the previous one instead. This is how you can do something like what you describe:
import tensorflow as tf

def modify_matrix(matrix, X):
    all_but_last_column = matrix[:, :-1]
    max_per_row = tf.reduce_max(all_but_last_column, axis=1)
    replace = tf.concat([tf.zeros_like(all_but_last_column),
                         tf.ones_like(matrix[:, -1])[:, tf.newaxis]], axis=1)
    mask = max_per_row > X
    return tf.where(mask, matrix, replace)

nums = [list(range(i * 10, (i + 1) * 10)) + [0] for i in range(1, 5)]
print(*nums, sep='\n')
# [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 0]
# [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 0]
# [30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 0]
# [40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 0]
matrix = tf.constant(nums)
X = tf.constant(36, dtype=matrix.dtype)
result = modify_matrix(matrix, X)
with tf.Session() as sess:  # TF 1.x graph mode
    print(sess.run(result))
# [[ 0  0  0  0  0  0  0  0  0  0  1]
#  [ 0  0  0  0  0  0  0  0  0  0  1]
#  [30 31 32 33 34 35 36 37 38 39  0]
#  [40 41 42 43 44 45 46 47 48 49  0]]
I also found another solution that worked for me:
import tensorflow as tf
zeroes = tf.zeros([600, 1], dtype=tf.float32)
ones = tf.ones([600, 1], dtype=tf.float32)
b = tf.random.uniform([600,10], minval=0, maxval=1, dtype=tf.float32)
threshold = tf.constant(0.6, dtype=tf.float32)
check = tf.reduce_max(tf.cast(b > threshold, dtype=tf.float32), axis=1)
last_col = tf.where(check>0, zeroes, ones)
new_b = tf.where(check>0, b, tf.zeros([600, 10], dtype=tf.float32))
new_matrix = tf.concat([new_b, last_col], axis=1)
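For completeness, a sketch of the same idea in TF 2.x eager mode, where tf.where broadcasts a [N, 1] mask across the columns (the function name threshold_rows is my own):

import tensorflow as tf

def threshold_rows(matrix, x):
    vals, flag = matrix[:, :-1], matrix[:, -1:]
    keep = tf.reduce_max(vals, axis=1, keepdims=True) > x  # shape [N, 1]
    new_vals = tf.where(keep, vals, tf.zeros_like(vals))   # broadcasts over columns
    new_flag = tf.where(keep, flag, tf.ones_like(flag))
    return tf.concat([new_vals, new_flag], axis=1)

c = tf.concat([tf.random.uniform([600, 10]), tf.zeros([600, 1])], axis=1)
print(threshold_rows(c, 0.6)[:3])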
