In Keras, if I want to display the model's output for a given low-resolution input image, what code would I use?
I tried
predictions = model.predict_classes(c)
as well as
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[2].output])
layer_output = get_3rd_layer_output([c])[0]
However, neither of these works.
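For reference, here is a minimal sketch of what displaying the output could look like, assuming `c` is a single preprocessed low-resolution image as a numpy array of shape (H, W, channels) and the model maps images to images with pixel values in [0, 1]; model.predict is the general-purpose entry point (predict_classes is only meaningful for classifiers):
import numpy as np
import matplotlib.pyplot as plt

# Assumption: `c` is one preprocessed low-resolution image of shape (H, W, channels).
batch = np.expand_dims(c, axis=0)   # model.predict expects a batch dimension
sr = model.predict(batch)[0]        # drop the batch dimension again

plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(np.squeeze(c), cmap='gray')
plt.subplot(1, 2, 2)
plt.title('output')
plt.imshow(np.clip(np.squeeze(sr), 0, 1), cmap='gray')
plt.show()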
I have an existing model where I load some pre-trained weights and then do prediction (one image at a time) in PyTorch. I am trying to convert it to a PyTorch Lightning module and am confused about a few things.
So currently, my __init__ method for the model looks like this:
self._load_config_file(cfg_file)
# just creates the pytorch network
self.create_network()
self.load_weights(weights_file)
self.cuda(device=0) # assumes GPU and uses one. This is probably suboptimal
self.eval() # prediction mode
From what I can gather from the Lightning docs, I can do pretty much the same, except without the cuda() call. So something like:
self.create_network()
self.load_weights(weights_file)
self.freeze() # prediction mode
So, my first question is whether this is the correct way to use Lightning. How would Lightning know if it needs to use the GPU? I am guessing this needs to be specified somewhere.
Now, for the prediction, I have the following setup:
def infer(frame):
    img = transform(frame)  # apply some transformation to the input
    img = torch.from_numpy(img).float().unsqueeze(0).cuda(device=0)
    with torch.no_grad():
        output = self.__call__(Variable(img)).data.cpu().numpy()
    return output
This is the bit that has me confused. Which functions do I need to override to make a Lightning-compatible prediction?
Also, at the moment, the input comes as a numpy array. Is that something that would still be possible with the Lightning module, or do things always have to go through some sort of a dataloader?
At some point, I want to extend this model implementation to do training as well, so I want to make sure I do it right. While most examples focus on training models, a simple example of just doing prediction at production time on a single image/data point would be useful.
I am using pytorch-lightning 0.7.5 with PyTorch 1.4.0 on GPU with CUDA 10.1.
LightningModule is a subclass of torch.nn.Module so the same model class will work for both inference and training. For that reason, you should probably call the cuda() and eval() methods outside of __init__.
Since it's just an nn.Module under the hood, once you've loaded your weights you don't need to override any methods to perform inference; simply call the model instance. Here's a toy example you can use:
import torchvision.models as models
from pytorch_lightning.core import LightningModule

class MyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.resnet = models.resnet18(pretrained=True, progress=False)

    def forward(self, x):
        return self.resnet(x)

model = MyModel().eval().cuda(device=0)
And then to actually run inference you don't need a method, just do something like:
for frame in video:
    img = transform(frame)
    img = torch.from_numpy(img).float().unsqueeze(0).cuda(0)
    output = model(img).data.cpu().numpy()
    # Do something with the output
The main benefit of PyTorch Lightning is that you can also use the same class for training by implementing training_step(), configure_optimizers() and train_dataloader() on that class. You can find a simple example of that in the PyTorch Lightning docs.
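For completeness, a minimal sketch of what that could look like; the loss, the optimizer and the FakeData placeholder dataset are assumptions for illustration, not part of the model above:
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from pytorch_lightning import Trainer


class MyTrainableModel(MyModel):          # MyModel from the example above
    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        return {'loss': loss}

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

    def train_dataloader(self):
        # Placeholder dataset; swap in your real training data.
        dataset = datasets.FakeData(size=64, image_size=(3, 224, 224),
                                    transform=transforms.ToTensor())
        return DataLoader(dataset, batch_size=8)


# The gpus argument is also where Lightning is told whether to use the GPU.
trainer = Trainer(gpus=1, max_epochs=1)
trainer.fit(MyTrainableModel())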
Even though the above answer suffices, if one takes note of the following line
img = torch.from_numpy(img).float().unsqueeze(0).cuda(0)
one has to put both the model and the image on the right GPU. On a multi-GPU inference machine, this becomes a hassle.
To solve this, Trainer.predict was also added recently; see more at https://pytorch-lightning.readthedocs.io/en/stable/deploy/production_basic.html
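A rough sketch of what that looks like; note this targets a newer Lightning release than the 0.7.5 mentioned above (the API did not exist in 0.7.x), and the PredictModel subclass and random frames are placeholders:
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset


class PredictModel(MyModel):                 # MyModel from the answer above
    def predict_step(self, batch, batch_idx):
        (x,) = batch                         # TensorDataset yields 1-tuples
        return self(x)


frames = torch.randn(8, 3, 224, 224)         # placeholder batch of preprocessed frames
loader = DataLoader(TensorDataset(frames), batch_size=4)

# The Trainer moves both the model and the batches to the chosen device(s),
# so no manual .cuda() calls are needed.
trainer = pl.Trainer(accelerator="gpu", devices=1)
predictions = trainer.predict(PredictModel(), dataloaders=loader)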
I have adapted this retrain.py script to use with several pretrained models.
After training is done, this generates a 'retrained_graph.pb', which I then read and try to use to run predictions on an image using this code:
def get_top_labels(image_data):
    '''
    Returns a list of labels and their probabilities
    image_data: content of image as string
    '''
    with tf.compat.v1.Session() as sess:
        softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
        predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})
        return predictions
This works fine for the inception_v3 model because it has a tensor called 'DecodeJpeg'; other models I'm using, such as inception_v4, mobilenet and inception_resnet_v2, don't.
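As a side note, one way to check which input tensors a loaded graph actually exposes (assuming retrained_graph.pb has already been imported into the default graph, as in the get_top_labels snippet above) is to list its operations:
import tensorflow as tf

with tf.compat.v1.Session() as sess:
    for op in sess.graph.get_operations():
        # Print likely entry points: JPEG decoding ops and placeholders.
        if 'DecodeJpeg' in op.name or op.type == 'Placeholder':
            print(op.name, op.type)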
My question is: can I add an op to the graph, like the one used in add_jpeg_decoding in the retrain.py script, so that I can afterwards use it for prediction?
Would it be possible to do something like this:
predictions = sess.run(softmax_tensor, {image_data_tensor: image_data}) where image_data_tensor is a variable that depends on which model I'm using?
I looked through Stack Overflow and couldn't find a question that solves my problem; I'd really appreciate any help with this, thanks.
I need to at least know if it's possible.
Sorry for the repost; I got no views on my first one.
So after some research, I figured out a way; I'm leaving an answer here in case someone needs it. What you need to do is do the decoding yourself: get a tensor from the image using t = read_tensor_from_image_file (found here), then you can run your predictions using this piece of code:
start = time.time()
results = sess.run(output_layer_name, {input_layer_name: t})
end = time.time()
return results
Usually input_layer_name = 'input:0' and output_layer_name = 'final_result:0'.
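For reference, a sketch of what such a helper can look like, loosely adapted from the helper of the same name in TensorFlow's label_image example; the default sizes are assumptions, and it expects TF1-style graph execution (e.g. after tf.compat.v1.disable_eager_execution()), the same regime the snippets above use:
import tensorflow as tf

def read_tensor_from_image_file(file_name, input_height=299, input_width=299,
                                input_mean=0, input_std=255):
    # Build a small decode -> resize -> normalize graph, mirroring
    # add_jpeg_decoding in retrain.py, and evaluate it to a numpy array.
    file_reader = tf.io.read_file(file_name)
    image = tf.image.decode_jpeg(file_reader, channels=3)
    image = tf.cast(image, tf.float32)
    image = tf.expand_dims(image, 0)                       # add batch dimension
    resized = tf.compat.v1.image.resize_bilinear(image, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    with tf.compat.v1.Session() as sess:
        return sess.run(normalized)                        # shape (1, H, W, 3)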
I'm developing a machine learning model using Keras and I notice that the available loss functions are not giving the best results on my test set.
I am using a U-Net architecture, where I input a (16,16,3) image and the net also outputs a (16,16,3) picture (auto-encoder). I think one way to improve the model would be to use a loss function that compares the gradients (Laplacian) between the net output and the ground truth pixel by pixel. However, I did not find any tutorial that handles this kind of application, because it would need to apply the OpenCV Laplacian function to each output image from the net.
The loss function would be something like this:
def laplacian_loss(y_true, y_pred):
    # y_true already is the calculated gradients, only needs to compute on the y_pred
    # calculates the gradients for each predicted image
    y_pred_lap = []
    for img in y_pred:
        laplacian = cv2.Laplacian(np.float64(img), cv2.CV_64F)
        y_pred_lap.append(laplacian)
    y_pred_lap = np.array(y_pred_lap)
    # mean squared error, according to keras losses documentation
    return K.mean(K.square(y_pred_lap - y_true), axis=-1)
Has anyone done something like that for loss calculation?
Given the code above, it seems that it would be equivalent to using a Lambda() layer as the output layer that applies that transformation to the image, before computing the mean squared error.
Regardless of whether it is implemented as a Lambda() layer or in the loss function, the transformation needs to be such that TensorFlow understands how to calculate the gradients. The simplest way to do this would probably be to reimplement the cv2.Laplacian computation using TensorFlow math operations.
In order to use the cv2 library directly, you would need to define the gradients of whatever happens inside the cv2 library yourself; that seems significantly more error-prone.
Gradient descent optimisation relies on being able to compute gradients from the inputs to the loss and back. Any operation in the middle must be differentiable, and TensorFlow must understand the math operations for auto-differentiation to work; otherwise you need to add the gradients manually.
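A possible sketch of that reimplementation, expressing the Laplacian as a fixed depthwise convolution so the whole thing stays differentiable; it assumes 3-channel images and, as in the question, that y_true already holds the precomputed Laplacian of the ground truth:
import tensorflow as tf
from tensorflow.keras import backend as K

# 3x3 Laplacian kernel, replicated per channel for a depthwise convolution.
_lap = tf.constant([[1.,  1., 1.],
                    [1., -8., 1.],
                    [1.,  1., 1.]])
_lap_kernel = tf.tile(_lap[:, :, None, None], [1, 1, 3, 1])   # (3, 3, channels, 1)

def laplacian_loss(y_true, y_pred):
    # Filter each channel of the prediction with the Laplacian kernel.
    y_pred_lap = tf.nn.depthwise_conv2d(y_pred, _lap_kernel,
                                        strides=[1, 1, 1, 1], padding='SAME')
    return K.mean(K.square(y_pred_lap - y_true), axis=-1)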
I managed to reach an easy solution. The key point is that the gradient calculation is actually a 2D filter. For more information about it, please follow the link about the Laplacian kernel. So the output of my network needs to be filtered by the Laplacian kernel. For that, I created an extra convolutional layer with fixed weights set exactly to the Laplacian kernel. After that, the network has two outputs (one being the desired image, and the other being the gradient image). Therefore, it is also necessary to define a loss for each output.
To make it clearer, I'll give an example. At the end of the network you'll have something like:
channels = 3  # number of channels of the network output
# net_out is the network's image output; its layer is named 'enhanced' to match the loss dict below
lap = Conv2D(channels, (3,3), padding='same', name='laplacian')(net_out)
model = Model(inputs=[net_input], outputs=[net_out, lap])
Define how you want to calculate the losses for each output:
# losses for the image output and the laplacian output
losses = {
    "enhanced": "mse",
    "laplacian": "mse"
}
lossWeights = {"enhanced": 1.0, "laplacian": 0.6}
Compile the model:
model.compile(optimizer=Adam(), loss=losses, loss_weights=lossWeights)
Define the Laplacian kernel, apply its values to the weights of the above convolutional layer, and set trainable to False (so it won't be updated).
# 3x3 laplacian kernel
lap_kernel = np.asarray([
    [1,  1, 1],
    [1, -8, 1],
    [1,  1, 1]
], dtype=np.float32)

# Keras Conv2D weights have shape (rows, cols, in_channels, out_channels),
# so tile the kernel across the two channel dimensions
l = np.tile(lap_kernel[:, :, None, None], (1, 1, channels, channels))
bias = np.zeros(channels, dtype=np.float32)

wl = [l, bias]
model.get_layer('laplacian').set_weights(wl)
model.get_layer('laplacian').trainable = False
When training, remember that you need two values for the ground truth:
model.fit(x=X, y={"enhanced": y_out, "laplacian": y_lap})
Note: do not use a BatchNormalization layer! If you do, the weights in the laplacian layer will be updated!
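Not shown above: the second ground truth y_lap has to be computed beforehand. A small sketch of doing that offline with OpenCV, assuming y_out is a numpy array of target images:
import cv2
import numpy as np

# Precompute the Laplacian of every target image once, outside the graph,
# so it can be fed as the second training target.
y_lap = np.stack([cv2.Laplacian(img.astype(np.float64), cv2.CV_64F)
                  for img in y_out]).astype(np.float32)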
I am working on a CNN that deals with super-resolution. It is required that I extract patches from the image, then train on these small patches (i.e. 41x41).
However, when it comes to predicting the image, the image is of a larger size than the patches. But Keras doesn't allow me to predict an image of larger size than the training images.
I have read Can Keras deal with input images with different size?. I have tried the approach of putting None in my network input shape and then loading the weights. However, when it comes to this line: c1 = PReLU()(c1), I get the error: int() argument must be a string, a bytes-like object or a number, not 'NoneType'. The code is attached below.
How can I fix this problem? I am using Keras with the TensorFlow backend. I have no fully connected layers; all layers are Conv2D with ReLU, except for the snippet below, which uses PReLU for c1.
Thanks.
input_shape = (None,None,1)
x = Input(shape = input_shape)
c1 = Convolution2D(64, (3,3), init = 'he_normal', padding='same', name='Conv1')(x)
c1 = PReLU()(c1)
#............................
output_img = keras.layers.add([x, finalconv])
model = Model(x, output_img)
"Keras doesn't allow me to predict an image of larger size than the training images"
This is not true; Keras allows you to do so when your network is designed properly.
"However, when it comes to this line: c1 = PReLU()(c1), I get the error: int() argument must be a string, a bytes-like object or a number, not 'NoneType'."
This error is expected because your input shape contains None. Actually, if you had previously set shared_axes=[1,2] for PReLU (the default is shared_axes=None), you would not see this error.
Therefore, the real issue here is that PReLU's parameters were previously set up only for a 41x41 input, but are now being asked to work for an arbitrary input size.
The best solution is to train a new model with input shape = (None,None,1) directly.
If you don't care about the possible degradation, you can load all layer weights of your pretrained model except for the PReLU layer, then manually compute appropriate PReLU parameters that can be shared across shared_axes=[1,2] and use them as the new PReLU parameters.
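A sketch of that weight transfer; the two-layer build() function and the layer names here are placeholders standing in for the real 41x41-patch network, and the per-pixel PReLU parameters are simply averaged down to per-channel ones:
from keras.layers import Input, Conv2D, PReLU
from keras.models import Model

def build(input_shape, shared_axes):
    x = Input(shape=input_shape)
    c1 = Conv2D(64, (3, 3), padding='same', name='Conv1')(x)
    c1 = PReLU(shared_axes=shared_axes, name='PReLU1')(c1)
    # ... the remaining Conv2D/ReLU layers would go here ...
    return Model(x, c1)

old_model = build((41, 41, 1), shared_axes=None)      # stand-in for the pretrained model
new_model = build((None, None, 1), shared_axes=[1, 2])

# Copy every layer's weights; for PReLU1, average the per-pixel alphas
# of shape (41, 41, 64) down to per-channel alphas of shape (1, 1, 64).
for old_layer, new_layer in zip(old_model.layers, new_model.layers):
    if old_layer.name == 'PReLU1':
        alpha = old_layer.get_weights()[0]
        new_layer.set_weights([alpha.mean(axis=(0, 1), keepdims=True)])
    elif old_layer.get_weights():
        new_layer.set_weights(old_layer.get_weights())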
I have built a model and saved its weights as 'first_try.h5', but I am not able to load the model and run it on an arbitrary image. My problem is how to use the model.predict_generator, model.predict or model.predict_on_batch methods. I have read the documentation, but I need one small example to do so. Sorry, I am new to this. It is a binary classifier.
I am using CNTK as my backend. I am following this tutorial.
I need a little help.
EDIT:
I am trying the following way, but the output is just [[0.]]
from numpy import array
from PIL import Image

model.load_weights('first_try.h5')

img = Image.open("data/predict/T160305M-0222c_5.jpg")
a = array(img).reshape(1,512,512,3)  # add the batch dimension
score = model.predict(a, batch_size=32, verbose=0)
print(score)
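For what it's worth, here is a hedged sketch of the other two prediction entry points mentioned above; the 1/255 rescaling and the data/predict_dir folder are assumptions (the preprocessing must match whatever was used during training):
import numpy as np
from PIL import Image
from keras.preprocessing.image import ImageDataGenerator

# model.predict_on_batch: same as predict, but on exactly one batch.
img = Image.open("data/predict/T160305M-0222c_5.jpg")
batch = np.asarray(img, dtype=np.float32).reshape(1, 512, 512, 3) / 255.0  # assumes 1/255 rescaling at training time
print(model.predict_on_batch(batch))

# model.predict_generator: predictions for a whole directory of images.
# "data/predict_dir" is a hypothetical folder containing one subdirectory of images.
gen = ImageDataGenerator(rescale=1./255).flow_from_directory(
    "data/predict_dir", target_size=(512, 512), class_mode=None,
    shuffle=False, batch_size=32)
print(model.predict_generator(gen, steps=len(gen)))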