Model parallelism not working? All GPUs not being used?

I have been stuck on this problem for a while now. I have an AWS setup with 500 GB of RAM and about 7 GPUs. Each time I try to run my Keras code with TensorFlow as the back end, it runs out of memory. I have found the reason as well: each GPU has only 12 GB of memory, whereas my model needs more than that. So how can I run the model so that it uses the combined memory of all the GPUs, instead of relying on the memory of a single GPU to load the entire model? I have tried model parallelism with Keras and it seems to be set up correctly: on printing the layers, each layer is assigned to the programmed GPU. But the model still tries to load into a single GPU's memory, i.e. just 11 GB, and soon runs out of memory.
Any idea what's going on?
with tf.device('/gpu:0'):
    x = conv2d_bn(img_input, 32, 3, 3, strides=(2, 2), padding='valid')
    x = conv2d_bn(x, 32, 3, 3, padding='valid')
    x = conv2d_bn(x, 64, 3, 3)
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)
    x = conv2d_bn(x, 80, 1, 1, padding='valid')
    x = conv2d_bn(x, 192, 3, 3, padding='valid')
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)
    # mixed 0, 1, 2: 35 x 35 x 256
    branch1x1 = conv2d_bn(x, 64, 1, 1)
    branch5x5 = conv2d_bn(x, 48, 1, 1)
    branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
    branch3x3dbl = conv2d_bn(x, 64, 1, 1)
    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
    branch_pool = conv2d_bn(branch_pool, 32, 1, 1)
    x = layers.concatenate(
        [branch1x1, branch5x5, branch3x3dbl, branch_pool],
        axis=channel_axis,
        name='mixed0')
    print(x)
with tf.device('/gpu:1'):
    # mixed 1: 35 x 35 x 256
    branch1x1 = conv2d_bn(x, 64, 1, 1)
    branch5x5 = conv2d_bn(x, 48, 1, 1)
    branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
    branch3x3dbl = conv2d_bn(x, 64, 1, 1)
    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
    branch_pool = conv2d_bn(branch_pool, 64, 1, 1)
    x = layers.concatenate(
        [branch1x1, branch5x5, branch3x3dbl, branch_pool],
        axis=channel_axis,
        name='mixed1')
    # mixed 2: 35 x 35 x 256
    branch1x1 = conv2d_bn(x, 64, 1, 1)
    branch5x5 = conv2d_bn(x, 48, 1, 1)
    branch5x5 = conv2d_bn(branch5x5, 64, 5, 5)
    branch3x3dbl = conv2d_bn(x, 64, 1, 1)
    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
    branch3x3dbl = conv2d_bn(branch3x3dbl, 96, 3, 3)
    branch_pool = AveragePooling2D((3, 3), strides=(1, 1), padding='same')(x)
    branch_pool = conv2d_bn(branch_pool, 64, 1, 1)
    x = layers.concatenate(
        [branch1x1, branch5x5, branch3x3dbl, branch_pool],
        axis=channel_axis,
        name='mixed2')
    print(x)
Edit:
Here's the link to the code. Note that I am currently feeding just one image to the model to test whether my GPUs can handle it, so reducing the batch size is not a possible solution.

Related

Obtaining a specific shape using nn.Conv2d

Starting with an input shape like (64, 1, 103, 8) how should I set the parameters of nn.Conv2d to arrive at a shape of (64, 32, 43, 8)?
Currently I'm using the following
nn.Conv2d(in_channels=1,out_channels=32, stride=(2,1),kernel_size=(3,3),padding=(0,1),dilation=(9,1))
But I'm afraid that the dilation parameter may hurt performance.

You can use padding=(13, 1), stride=(3, 1) and kernel_size=3, which gives the same output shape without dilation:
nn.Conv2d(1, 32, 3, stride=(3, 1), padding=(13, 1))
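Both parameter sets can be checked by hand with the output-size formula given in the nn.Conv2d documentation. The sketch below is plain Python, so it does not need PyTorch installed; the helper name conv_out_size is an assumption made for illustration and is not part of any library.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis of a Conv2d.

    Hypothetical helper; follows the formula
    floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1.
    """
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1

# Parameters from the question: dilation=(9, 1), stride=(2, 1), padding=(0, 1)
question_shape = (conv_out_size(103, 3, stride=2, padding=0, dilation=9),
                  conv_out_size(8, 3, stride=1, padding=1, dilation=1))

# Parameters from the answer: stride=(3, 1), padding=(13, 1), no dilation
answer_shape = (conv_out_size(103, 3, stride=3, padding=13),
                conv_out_size(8, 3, stride=1, padding=1))

print(question_shape, answer_shape)  # both are (43, 8)
```

Both configurations map a height of 103 to 43 and keep the width at 8; they differ only in which input pixels each output position sees.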

Python 3: IndexError: list index out of range while doing Knapsack Problem

I am currently teaching myself Python for a career change. While doing some exercises on lists, I ran into IndexError: list index out of range.
I am trying to build a function that determines which products should be placed on my store's shelves, subject to these constraints:
The shelf has a max capacity of 200
small-sized items should be placed first
if two or more items have the same size, the item with the highest price should be placed first
As input, the function takes a list of tuples "dairy_items", in the form [(id, size, price)].
This is my code:
capacity=200
dairy_items=[('p1', 10, 3), ('p2', 13, 5),
             ('p3', 15, 2), ('p4', 26, 2),
             ('p5', 18, 6), ('p6', 25, 3),
             ('p7', 20, 4), ('p8', 10, 5),
             ('p9', 15, 4), ('p10', 12, 7),
             ('p11', 19, 3), ('p12', 27, 6),
             ('p13', 16, 4), ('p14', 23, 5),
             ('p15', 14, 2), ('p16', 23, 5),
             ('p17', 12, 7), ('p18', 11, 3),
             ('p19', 16, 5), ('p20', 11, 4)]
def shelving(dairy_items):
    #first: sort the list of tuples based on size: low-to-big
    items = sorted(dairy_items, key=lambda x: x[1], reverse=False)
    #second: iterate the sorted list of tuples.
    #algorithm: retrieve the first 2 elements of the sorted list
    #then compare those two elements by applying rules/conditions as stated
    #the 'winning' element is placed to 'result' and this element is removed from 'items'. Also 'temp' list is reset
    #do again until shelves cannot be added anymore (capacity full and do not exceed limit)
    result = []
    total_price = []
    temp_capacity = []
    temp = items[:2]
    while sum(temp_capacity) < capacity:
        #add conditions: (low first) and (if size the same, highest price first)
        if (temp[0][1] == temp[1][1]) and (temp[0][2] > temp[1][2]):
            temp_capacity.append(temp[0][1])
            result.append(temp.pop(0))
            items.pop(0)
            temp.clear()
            temp = items[:2]
            total_price.append(temp[0][2])
        elif ((temp[0][1] == temp[1][1])) and (temp[0][2] < temp[1][2]):
            temp_capacity.append(temp[1][1])
            result.append(temp.pop())
            items.pop()
            temp.clear()
            temp = items[:2]
            total_price.append(temp[1][2])
        else:
            temp_capacity.append(temp[0][1])
            result.append(temp.pop(0))
            items.pop(0)
            temp.clear()
            temp = items[:2]
            total_price.append(temp[0][2])
    result = result.append(temp_capacity)
    #return a tuple with three elements: ([list of product ID to be placed in order], total occupied capacity of shelves, total prices)
    return result
c:\Users\abc\downloads\listexercise.py in <module>
----> 1 print(shelving(dairy_items))
c:\Users\abc\downloads\listexercise.py in shelving(dairy_items)
28 while sum(temp_capacity) < capacity:
29
---> 30 if (temp[0][1] == temp[1][1]) and (temp[0][2] > temp[1][2]):
31 temp_capacity.append(temp[0][1])
32 result.append(temp2.pop(0))
IndexError: list index out of range
EDIT:
This is the expected result:
#Result should be True
print(shelving(dairy_items) == (['p8', 'p1', 'p20', 'p18', 'p10', 'p17', 'p2', 'p15', 'p9', 'p3', 'p19', 'p13', 'p5', 'p11'], 192, 60))

The IndexError occurs because temp is refreshed with items[:2] before temp[1][2] is appended to total_price; once fewer than two items are left, temp holds a single element that can only be indexed with 0.
I also noticed a few more bugs that would keep your program from giving the correct output, and fixed them.
The following code produces the expected result:
from time import time
start = time()
capacity = 200
dairy_items = [('p1', 10, 3), ('p2', 13, 5),
               ('p3', 15, 2), ('p4', 26, 2),
               ('p5', 18, 6), ('p6', 25, 3),
               ('p7', 20, 4), ('p8', 10, 5),
               ('p9', 15, 4), ('p10', 12, 7),
               ('p11', 19, 3), ('p12', 27, 6),
               ('p13', 16, 4), ('p14', 23, 5),
               ('p15', 14, 2), ('p16', 23, 5),
               ('p17', 12, 7), ('p18', 11, 3),
               ('p19', 16, 5), ('p20', 11, 4)]
def shelving(dairy_items):
    items = sorted(dairy_items, key=lambda x: x[1])
    result = ([],)
    total_price, temp_capacity = 0, 0
    while (temp_capacity + items[0][1]) < capacity:
        temp = items[:2]
        if temp[0][1] == temp[1][1]:
            if temp[0][2] > temp[1][2]:
                temp_capacity += temp[0][1]
                result[0].append(temp[0][0])
                total_price += temp[0][2]
                items.pop(0)
            elif temp[0][2] < temp[1][2]:
                temp_capacity += temp[1][1]
                result[0].append(temp[1][0])
                total_price += temp[1][2]
                items.pop(items.index(temp[1]))
            else:
                temp_capacity += temp[0][1]
                result[0].append(temp[0][0])
                total_price += temp[0][2]
                items.pop(0)
        else:
            temp_capacity += temp[0][1]
            result[0].append(temp[0][0])
            total_price += temp[0][2]
            items.pop(0)
    result += (temp_capacity, total_price)
    return result
a = shelving(dairy_items)
end = time()
print(a)
print(f"\nTime Taken : {end-start} secs")
Output:-
(['p8', 'p1', 'p20', 'p18', 'p10', 'p17', 'p2', 'p15', 'p9', 'p3', 'p19', 'p13', 'p5', 'p11'], 192, 60)
Time Taken : 3.123283386230469e-05 secs
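The two placement rules (smaller size first, higher price first on ties) can also be folded into a single sort key, which removes the pairwise comparison altogether. What follows is a minimal sketch under that assumption, not the code from the answer above; the name shelve_items and the decision to stop at the first item that no longer fits are illustrative choices.

```python
def shelve_items(items, capacity=200):
    """Greedy placement: smallest size first, highest price first on ties."""
    # Sorting by (size, -price) encodes both rules. sorted() is stable, so
    # items that tie on size and price keep their input order.
    ordered = sorted(items, key=lambda item: (item[1], -item[2]))
    placed, used, total_price = [], 0, 0
    for item_id, size, price in ordered:
        if used + size > capacity:
            break  # assumption: stop at the first item that does not fit
        placed.append(item_id)
        used += size
        total_price += price
    return placed, used, total_price

dairy_items = [('p1', 10, 3), ('p2', 13, 5), ('p3', 15, 2), ('p4', 26, 2),
               ('p5', 18, 6), ('p6', 25, 3), ('p7', 20, 4), ('p8', 10, 5),
               ('p9', 15, 4), ('p10', 12, 7), ('p11', 19, 3), ('p12', 27, 6),
               ('p13', 16, 4), ('p14', 23, 5), ('p15', 14, 2), ('p16', 23, 5),
               ('p17', 12, 7), ('p18', 11, 3), ('p19', 16, 5), ('p20', 11, 4)]

print(shelve_items(dairy_items))
```

For this input the sketch returns the same tuple as the expected result in the question. Unlike the loop above, it lets an item fill the shelf exactly to capacity, which makes no difference for this data.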

Not sure what the question is, but the following information may be relevant:
IndexError occurs when a sequence subscript is out of range. What does this mean? Consider the following code:
l = [1, 2, 3]
a = l[0]
This code does two things:
Defines a list of 3 integers called l
Assigns the first element of l to a variable called a
Now, if I were to do the following:
l = [1, 2, 3]
a = l[3]
I would raise an IndexError, as I'm accessing the fourth element of a three-element list. Somewhere in your code, you're likely over-indexing your list. This is a good chance to learn about debugging using pdb. Throw a call to breakpoint() in your code and inspect the variables. Good luck!

OK, first you should debug your code. If you print temp before appending temp[1][2] to total_price, you will see that the last value of temp is what causes the error. Here is an example:
capacity=200
dairy_items=[('p1', 10, 3), ('p2', 13, 5),
             ('p3', 15, 2), ('p4', 26, 2),
             ('p5', 18, 6), ('p6', 25, 3),
             ('p7', 20, 4), ('p8', 10, 5),
             ('p9', 15, 4), ('p10', 12, 7),
             ('p11', 19, 3), ('p12', 27, 6),
             ('p13', 16, 4), ('p14', 23, 5),
             ('p15', 14, 2), ('p16', 23, 5),
             ('p17', 12, 7), ('p18', 11, 3),
             ('p19', 16, 5), ('p20', 11, 4)]
def shelving(dairy_items):
    #first: sort the list of tuples based on size: low-to-big
    items = sorted(dairy_items, key=lambda x: x[1], reverse=False)
    #second: iterate the sorted list of tuples.
    #algorithm: retrieve the first 2 elements of the sorted list
    #then compare those two elements by applying rules/conditions as stated
    #the 'winning' element is placed to 'result' and this element is removed from 'items'. Also 'temp' list is reset
    #do again until shelves cannot be added anymore (capacity full and do not exceed limit)
    result = []
    total_price = []
    temp_capacity = []
    temp = items[:2]
    while sum(temp_capacity) < capacity:
        #add conditions: (low first) and (if size the same, highest price first)
        if (temp[0][1] == temp[1][1]) and (temp[0][2] > temp[1][2]):
            temp_capacity.append(temp[0][1])
            result.append(temp.pop(0))
            items.pop(0)
            temp.clear()
            temp = items[:2]
            total_price.append(temp[0][2])
        elif ((temp[0][1] == temp[1][1])) and (temp[0][2] < temp[1][2]):
            temp_capacity.append(temp[1][1])
            result.append(temp.pop())
            items.pop()
            temp.clear()
            temp = items[:2]
            print(temp) # -----------NEW LINE ADDED TO DEBUG YOUR CODE
            total_price.append(temp[1][2])
        else:
            temp_capacity.append(temp[0][1])
            result.append(temp.pop(0))
            items.pop(0)
            temp.clear()
            temp = items[:2]
            total_price.append(temp[0][2])
    result = result.append(temp_capacity)
    #return a tuple with three elements: ([list of product ID to be placed in order], total occupied capacity of shelves, total prices)
    return result
shelving(dairy_items)
The result I am getting is:
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3), ('p8', 10, 5)]
[('p1', 10, 3)]
Traceback (most recent call last):
File "<string>", line 55, in <module>
File "<string>", line 44, in shelving
IndexError: list index out of range
As you can clearly see, the last value printed, [('p1', 10, 3)], contains only one tuple, hence the IndexError.
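The repeated [('p1', 10, 3), ('p8', 10, 5)] lines follow from how list.pop() behaves: called without an index it removes the last element, so in the elif branch the front of items never changes and items[:2] keeps returning the same pair until only one item is left. A small illustration with a made-up three-item list (the values are hypothetical, chosen only for this demonstration):

```python
items = [('a', 10, 3), ('b', 10, 5), ('c', 11, 4)]

removed = items.pop()        # no index: removes the LAST element, ('c', 11, 4)
front_after_pop = items[:2]  # unchanged: still ('a', ...) and ('b', ...)

items_fixed = [('a', 10, 3), ('b', 10, 5), ('c', 11, 4)]
removed_fixed = items_fixed.pop(1)  # index 1: removes the second element
front_after_fix = items_fixed[:2]   # now ('a', ...) and ('c', ...)

print(removed, front_after_pop)
print(removed_fixed, front_after_fix)
```

Removing the chosen element by position, as in the second half, is what lets the window of two items advance through the list.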

how to fit the dimension in the autoencoder of Keras

I am using a convolutional autoencoder for the MNIST image data (with dimension 28*28). Here is my code:
input_img = Input(shape=(28, 28, 1))
x = Convolution2D(16, (5, 5), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, (5, 5), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
I get an error message (with padding='same' at each layer):
ValueError: Error when checking target: expected conv2d_148 to have shape (32, 32, 1) but got array with shape (28, 28, 1)
Here is my model summary:
Layer (type)                     Output Shape          Param #
input_20 (InputLayer)            (None, 28, 28, 1)     0
conv2d_142 (Conv2D)              (None, 28, 28, 16)    416
max_pooling2d_64 (MaxPooling     (None, 14, 14, 16)    0
conv2d_143 (Conv2D)              (None, 14, 14, 8)     1160
max_pooling2d_65 (MaxPooling     (None, 7, 7, 8)       0
conv2d_144 (Conv2D)              (None, 7, 7, 8)       584
max_pooling2d_66 (MaxPooling     (None, 4, 4, 8)       0
conv2d_145 (Conv2D)              (None, 4, 4, 8)       584
up_sampling2d_64 (UpSampling     (None, 8, 8, 8)       0
conv2d_146 (Conv2D)              (None, 8, 8, 8)       584
up_sampling2d_65 (UpSampling     (None, 16, 16, 8)     0
conv2d_147 (Conv2D)              (None, 16, 16, 16)    3216
up_sampling2d_66 (UpSampling     (None, 32, 32, 16)    0
conv2d_148 (Conv2D)              (None, 32, 32, 1)     145
Total params: 6,689
Trainable params: 6,689
Non-trainable params: 0
I know that if I change the first layer to
x = Convolution2D(16, (3, 3), activation='relu', padding='same')(input_img)
it works, but I want to use a 5*5 convolution.
Why does this happen?

You can increase the kernel size of your last layer to (5, 5) and switch its padding to 'valid', which trims the 32x32 feature map down to 28x28:
from tensorflow.keras.layers import *
from tensorflow.keras import Model, Input
import numpy as np
input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (5, 5), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (5, 5), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (5, 5), activation='sigmoid', padding='valid')(x)
auto = Model(input_img, decoded)
auto.build(input_shape=(1, 28, 28, 1))
auto(np.random.rand(1, 28, 28, 1)).shape
TensorShape([1, 28, 28, 1])
Or, use tf.keras.layers.Conv2DTranspose.
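The 32x32 output comes from the third pooling layer: with padding='same', pooling 7 by 2 rounds up to 4, and three upsamplings then give 4 * 2 * 2 * 2 = 32 rather than 28. The arithmetic can be traced in plain Python; the helper names below are assumptions made for illustration, not Keras functions.

```python
import math

def same_pool(n, pool=2):
    """Spatial size after MaxPooling2D(pool, padding='same'): rounds up."""
    return math.ceil(n / pool)

def upsample(n, factor=2):
    """Spatial size after UpSampling2D(factor)."""
    return n * factor

def valid_conv(n, kernel):
    """Spatial size after a stride-1 convolution with padding='valid'."""
    return n - kernel + 1

size = 28
encoder_sizes = []
for _ in range(3):
    size = same_pool(size)
    encoder_sizes.append(size)   # 14, 7, 4

decoder_sizes = []
for _ in range(3):
    size = upsample(size)
    decoder_sizes.append(size)   # 8, 16, 32

final_valid = valid_conv(size, 5)  # 32 - 5 + 1 = 28

print(encoder_sizes, decoder_sizes, final_valid)
```

Convolutions with padding='same' and stride 1 leave the spatial size unchanged, which is why they are omitted from the trace.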

Problem with designing a Convolutional Autoencoder

This is my first question, so please forgive me if I've missed something.
I'm trying to create a convolutional autoencoder in PyTorch 1.7.0, but I'm having difficulty designing the model so that the output size equals the input size. I'm currently working on the MNIST dataset, with the input tensor size being 1*1*28*28, and currently the output is 1*1*29*29...
Can someone please help me identify the problem? *Please note that I'll incorporate the learnings afterwards.
class autoencoder(nn.Module):
    def __init__(self, hidden_node_count):
        super(autoencoder, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 5, stride=2, padding=2)
        self.conv2 = nn.Conv2d(32, 32, 5, stride=2)#, padding=2)
        self.pool = nn.MaxPool2d(hidden_node_count, hidden_node_count)
        self.t_conv1 = nn.ConvTranspose2d(32, 32, 5, stride=2)#, padding=2)
        self.t_conv2 = nn.ConvTranspose2d(32, 32, 5, stride=2)#, padding=2)
        self.t_conv3 = nn.ConvTranspose2d(32, 1, 5, stride=2)#, padding=2)
        self.relu = nn.ReLU(True)
        self.tanh = nn.Tanh()
    def forward(self, x):
        print(x.size(), "input")
        x = self.conv1(x)
        x = self.relu(x)
        print(x.size(), "conv1")
        x = self.conv2(x)
        print(x.size(), "conv2")
        x = self.pool(x)
        print(x.size(), "pool")
        x = self.t_conv1(x)
        x = self.relu(x)
        print(x.size(), "deconv1")
        x = self.t_conv2(x)
        x = self.relu(x)
        print(x.size(), "deconv2")
        x = self.t_conv3(x)
        x = self.tanh(x)
        print(x.size(), "deconv3")
        return x
With its STDOUT being ->
torch.Size([1, 1, 28, 28]) input
torch.Size([1, 32, 14, 14]) conv1
torch.Size([1, 32, 5, 5]) conv2
torch.Size([1, 32, 1, 1]) pool
torch.Size([1, 32, 5, 5]) deconv1
torch.Size([1, 32, 13, 13]) deconv2
torch.Size([1, 1, 29, 29]) deconv3
torch.Size([1, 1, 29, 29])
torch.Size([1, 1, 28, 28])

According to the documentation for ConvTranspose2d, here is the formula to compute the output size:
Hout = (Hin − 1) × stride[0] − 2 × padding[0] + dilation[0] × (kernel_size[0] − 1) + output_padding[0] + 1
In your case, Hin=13, stride=2, padding=0, dilation=1, kernel_size=5, output_padding=0, which gives Hout=29. Your output tensor is as it should be!
If you want an output of 28, add some padding. With padding=1 alone, you will get an output of size (1, 1, 27, 27). Because several input sizes of a strided convolution map to the same output size, the output size of a ConvTranspose2d is ambiguous (read the doc), and output_padding is what resolves it. Therefore, you need to add some output padding as well:
conv = nn.ConvTranspose2d(32, 1, 5, stride=2, padding=1, output_padding=1)
conv(torch.randn(1, 32, 13, 13)).size()
>>> torch.Size([1, 1, 28, 28])
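The formula above can be evaluated without PyTorch to reproduce the sizes in the question's output and the effect of the proposed fix. This is a sketch only; the function name conv_transpose_out_size is a hypothetical helper, not a library call.

```python
def conv_transpose_out_size(size, kernel, stride=1, padding=0,
                            dilation=1, output_padding=0):
    """Output length along one axis, per the ConvTranspose2d formula above."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)

# Decoder from the question: three layers with kernel 5, stride 2, no padding
sizes = [1]
for _ in range(3):
    sizes.append(conv_transpose_out_size(sizes[-1], 5, stride=2))

# Last layer with padding only, then with padding and output_padding
with_padding = conv_transpose_out_size(13, 5, stride=2, padding=1)
with_output_padding = conv_transpose_out_size(13, 5, stride=2, padding=1,
                                              output_padding=1)

print(sizes, with_padding, with_output_padding)
```

The chain 1, 5, 13, 29 matches the pool, deconv1, deconv2 and deconv3 lines printed in the question, and the last two values are the 27 and 28 discussed above.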

TensorFlow: Why does avg_pool ignore one stride dimension?

I am attempting to stride over the channel dimension, and the following code exhibits surprising behaviour. It is my expectation that tf.nn.max_pool and tf.nn.avg_pool should produce tensors of identical shape when fed the exact same arguments. This is not the case.
import tensorflow as tf
x = tf.get_variable('x', shape=(100, 32, 32, 64),
initializer=tf.constant_initializer(5), dtype=tf.float32)
ksize = (1, 2, 2, 2)
strides = (1, 2, 2, 2)
max_pool = tf.nn.max_pool(x, ksize, strides, padding='SAME')
avg_pool = tf.nn.avg_pool(x, ksize, strides, padding='SAME')
print(max_pool.shape)
print(avg_pool.shape)
This prints
$ python ex04/mini.py
(100, 16, 16, 32)
(100, 16, 16, 64)
Clearly, I am misunderstanding something.

The link https://github.com/Hvass-Labs/TensorFlow-Tutorials/issues/19 states:
The first and last stride must always be 1,
because the first is for the image-number and
the last is for the input-channel.

Turns out this is really a bug.
https://github.com/tensorflow/tensorflow/issues/14886#issuecomment-352934112
