YOLOv3 SPP and YOLOv3 difference?

I couldn't find any good explanation of YOLOv3-SPP, which achieves a better mAP than YOLOv3. The author himself describes YOLOv3-SPP on his repo only as:
YOLOv3 with spatial pyramid pooling, or something
But I still don't really understand it. In yolov3-spp.cfg I noticed some additions:
### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
Can anybody give a further explanation of how YOLOv3-SPP works? Why are layers -2, -4 and -1, -3, -5, -6 chosen in the [route] layers? Thanks.

Finally, some researchers published a paper about applying SPP to YOLO: https://arxiv.org/abs/1903.08589.
Regarding the differences between yolov3-tiny, yolov3, and yolov3-spp:
yolov3-tiny.cfg uses downsampling (stride=2) in max-pooling layers
yolov3.cfg uses downsampling (stride=2) in convolutional layers
yolov3-spp.cfg uses downsampling (stride=2) in convolutional layers + extracts the strongest features with stride=1 max-pooling layers
However, they got only mAP = 79.6% on the Pascal VOC 2007 test set when using the YOLOv3-SPP model on the original framework.
But we can achieve higher accuracy, mAP = 82.1%, even with the yolov3.cfg model by using AlexeyAB's repository: https://github.com/AlexeyAB/darknet/issues/2557#issuecomment-474187706
And we can surely achieve an even higher mAP with yolov3-spp.cfg using Alexey's repo.
Original GitHub question: https://github.com/AlexeyAB/darknet/issues/2859

See Figure 3 (the SPP explanation).
In yolov3-spp.cfg, three max-pooling layers with different kernel sizes (5, 9, 13) and stride 1 are applied to the same feature map by using [route] layers: layers=-2 and layers=-4 route back to the feature map that precedes the pooling layers, so each [maxpool] pools the same input rather than the output of the previous pool.
The final [route] with layers=-1,-3,-5,-6 then concatenates the three pooled feature maps together with the original feature map, giving the "fixed-length representation" shown in Figure 3.
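For reference, here is a minimal PyTorch sketch of such an SPP block (an illustration only, not darknet's implementation; the kernel sizes and the 512-channel input follow yolov3-spp.cfg, and padding = k // 2 keeps the spatial size unchanged, mirroring the stride=1 pooling):
import torch
import torch.nn as nn

class SPPBlock(nn.Module):
    """Stride-1 max pooling at several kernel sizes, concatenated with the input."""
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        # padding = k // 2 keeps H and W unchanged, mirroring the stride=1 maxpools in the cfg
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, x):
        # concatenate the original feature map with its pooled versions along the channel axis
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

# Example: a 512-channel 13x13 feature map becomes 4 * 512 = 2048 channels
x = torch.randn(1, 512, 13, 13)
print(SPPBlock()(x).shape)  # torch.Size([1, 2048, 13, 13])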

Related

RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE on PyTorch Lightning

I am working on a tutorial of PyTorch Lightning.
https://pytorch-lightning.readthedocs.io/en/stable/starter/introduction.html
Because I wanted to try GPU training, I changed the definition of the trainer as below.
trainer = pl.Trainer(limit_train_batches=100, max_epochs=1, gpus=1)
Then I got the following error.
RuntimeError Traceback (most recent call last)
Cell In [3], line 4
1 # train the model (hint: here are some helpful Trainer arguments for rapid idea iteration)
2 # trainer = pl.Trainer(limit_train_batches=100, max_epochs=3)
3 trainer = pl.Trainer(limit_train_batches=100, max_epochs=3, accelerator='gpu', devices=1)
----> 4 trainer.fit(model=autoencoder, train_dataloaders=train_loader)
File ~/miniconda3/envs/py38-cu116/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:696, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
677 r"""
678 Runs the full optimization routine.
679
(...)
693 datamodule: An instance of :class:`~pytorch_lightning.core.datamodule.LightningDataModule`.
694 """
695 self.strategy.model = model
--> 696 self._call_and_handle_interrupt(
697 self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
698 )
File ~/miniconda3/envs/py38-cu116/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:650, in Trainer._call_and_handle_interrupt(self, trainer_fn, *args, **kwargs)
648 return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
649 else:
--> 650 return trainer_fn(*args, **kwargs)
651 # TODO(awaelchli): Unify both exceptions below, where `KeyboardError` doesn't re-raise
652 except KeyboardInterrupt as exception:
[...]
File ~/miniconda3/envs/py38-cu116/lib/python3.8/site-packages/pytorch_lightning/core/module.py:1450, in LightningModule.backward(self, loss, optimizer, optimizer_idx, *args, **kwargs)
1433 def backward(
1434 self, loss: Tensor, optimizer: Optional[Optimizer], optimizer_idx: Optional[int], *args, **kwargs
1435 ) -> None:
1436 """Called to perform backward on the loss returned in :meth:`training_step`. Override this hook with your
1437 own implementation if you need to.
1438
(...)
1448 loss.backward()
1449 """
-> 1450 loss.backward(*args, **kwargs)
File ~/miniconda3/envs/py38-cu116/lib/python3.8/site-packages/torch/_tensor.py:396, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
387 if has_torch_function_unary(self):
388 return handle_torch_function(
389 Tensor.backward,
390 (self,),
(...)
394 create_graph=create_graph,
395 inputs=inputs)
--> 396 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File ~/miniconda3/envs/py38-cu116/lib/python3.8/site-packages/torch/autograd/__init__.py:173, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
168 retain_graph = create_graph
170 # The reason we repeat same the comment below is that
171 # some Python versions print out the first line of a multi-line function
172 # calls in the traceback and some print out the last line
--> 173 Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
174 tensors, grad_tensors_, retain_graph, create_graph, inputs,
175 allow_unreachable=True, accumulate_grad=True)
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
The only thing I added to the tutorial code is gpus=1, so I cannot figure out what the problem is. How can I fix this?
FYI, I tried passing devices=1, accelerator='ddp' instead of gpus=1, and got the following error.
ValueError: You selected an invalid accelerator name: `accelerator='ddp'`. Available names are: cpu, cuda, hpu, ipu, mps, tpu.
My environments are:
CUDA 11.6
Python 3.8.13
PyTorch 1.12.1
PyTorch Lightning 1.7.7
I think you made a mistake in the trainer's arguments.
accelerator should be one of cpu, cuda, hpu, ipu, mps, tpu (or the shorthand gpu);
devices is the number of devices to use (or a list of device indices, e.g. which GPUs);
and the "ddp" value should be passed to strategy:
trainer = pl.Trainer(
    accelerator="gpu",
    devices=[0],
    strategy="ddp",
)
hope it helps!
Though I'm not sure about the reason, the issue disappeared when I used Python 3.10 instead of 3.8.

How to predict multi outputs using gradient boosting regression?

I have implemented simple code for gradient boosting regression (GBR) to predict one output and it works well, but when I try to predict two outputs I get the error shown below:
ValueError Traceback (most recent call last)
<ipython-input-5-bb1f191ee195> in <module>()
4 }
5 gradient_boosting_regressor = ensemble.GradientBoostingRegressor(**params)
----> 6 gradient_boosting_regressor.fit(train_data,train_targets)
7 # 'learning_rate': 0.01
D:\Anoconda\lib\site-packages\sklearn\ensemble\gradient_boosting.py in fit(self, X, y, sample_weight, monitor)
977
978 # Check input
--> 979 X, y = check_X_y(X, y, accept_sparse=['csr', 'csc', 'coo'], dtype=DTYPE)
980 n_samples, self.n_features_ = X.shape
981 if sample_weight is None:
D:\Anoconda\lib\site-packages\sklearn\utils\validation.py in check_X_y(X, y, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
576 dtype=None)
577 else:
--> 578 y = column_or_1d(y, warn=True)
579 _assert_all_finite(y)
580 if y_numeric and y.dtype.kind == 'O':
D:\Anoconda\lib\site-packages\sklearn\utils\validation.py in column_or_1d(y, warn)
612 return np.ravel(y)
613
--> 614 raise ValueError("bad input shape {0}".format(shape))
615
616
ValueError: bad input shape (22, 2)
Could I get any assistance or ideas on how to predict two outputs using GBR?
My attempt is below:
Data_ini = pd.read_excel('Data - 2output -Ra-in - angle.xlsx').iloc[:,:]
Data_ini_val = pd.read_excel('val - Ra-in -angle 12.xlsx').iloc[:,:]
train_data = Data_ini.iloc[:,:4]
train_targets = Data_ini.iloc[:,-2:]
val_data = Data_ini_val.iloc[:,:4]
val_targets = Data_ini_val.iloc[:,-2:]
params = {'n_estimators': 5000, 'max_depth': 4, 'min_samples_split': 2, 'min_samples_leaf': 2
}
gradient_boosting_regressor = ensemble.GradientBoostingRegressor(**params)
gradient_boosting_regressor.fit(train_data,train_targets)
Use MultiOutputRegressor for that.
Multi target regression
This strategy consists of fitting one regressor per target. This is a
simple strategy for extending regressors that do not natively support
multi-target regression.
Example:
from sklearn.multioutput import MultiOutputRegressor
...
params = {'n_estimators': 5000, 'max_depth': 4, 'min_samples_split': 2, 'min_samples_leaf': 2
}
estimator = MultiOutputRegressor(ensemble.GradientBoostingRegressor(**params))
estimator.fit(train_data,train_targets)
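For a complete, runnable illustration (a sketch on synthetic data; the shapes merely mirror the (22, 2) target shape from the error, and n_estimators is reduced to keep it fast):
import numpy as np
from sklearn import ensemble
from sklearn.multioutput import MultiOutputRegressor

# toy data: 22 samples, 4 features, 2 targets
rng = np.random.default_rng(0)
X = rng.normal(size=(22, 4))
y = rng.normal(size=(22, 2))

params = {'n_estimators': 100, 'max_depth': 4, 'min_samples_split': 2, 'min_samples_leaf': 2}
estimator = MultiOutputRegressor(ensemble.GradientBoostingRegressor(**params))
estimator.fit(X, y)

print(estimator.predict(X).shape)  # (22, 2) -- one column per target
print(len(estimator.estimators_))  # 2 -- one fitted GradientBoostingRegressor per target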

How does one use 3D convolutions on standard 3 channel images?

I am trying to use 3D convolutions on the CIFAR-10 data set (just for fun). I see in the docs that the input is usually a 5D tensor (N, C, D, H, W). Am I really forced to pass 5-dimensional data?
The reason I am skeptical is that a 3D convolution simply means my kernel moves across 3 dimensions/directions. So technically I could have 3D, 4D, 5D or even 100D tensors and it should all work as long as the input is at least 3D. Is that not right?
I tried it quickly and it did give an error:
import torch


def conv3d_example():
    N, C, H, W = 1, 3, 7, 7
    img = torch.randn(N, C, H, W)
    ##
    in_channels, out_channels = 1, 4
    kernel_size = (2, 3, 3)
    conv = torch.nn.Conv3d(in_channels, out_channels, kernel_size)
    ##
    out = conv(img)
    print(out)
    print(out.size())


##
conv3d_example()
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-3-29c73923cc64> in <module>
15
16 ##
---> 17 conv3d_example()
<ipython-input-3-29c73923cc64> in conv3d_example()
10 conv = torch.nn.Conv3d(in_channels, out_channels, kernel_size)
11 ##
---> 12 out = conv(img)
13 print(out)
14 print(out.size())
~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)
~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
474 self.dilation, self.groups)
475 return F.conv3d(input, self.weight, self.bias, self.stride,
--> 476 self.padding, self.dilation, self.groups)
477
478
RuntimeError: Expected 5-dimensional input for 5-dimensional weight 4 1 2 3, but got 4-dimensional input of size [1, 3, 7, 7] instead
cross posted:
https://discuss.pytorch.org/t/how-does-one-use-3d-convolutions-on-standard-3-channel-images/53330
Consider the following scenario. You have a 3-channel NxN image. This image will have size 3xNxN in PyTorch (ignoring the batch dimension for now).
Say you pass this image to a 2D convolution layer with no bias, kernel size 5x5, padding of 2, and input/output channels of 3 and 10 respectively.
What's actually happening when we apply this layer to the input image?
You can think of it like this...
For each of the 10 output channels there is a kernel of size 3x5x5. A 3D convolution is applied to the 3xNxN input image using this kernel, which can be thought of as unpadded in the first dimension. The result of this convolution is a 1xNxN feature map.
Since there are 10 output channels, there are 10 of these 3x5x5 kernels. After all kernels have been applied, the outputs are stacked into a single 10xNxN tensor.
So really, in the classical sense, a 2D convolution layer is already performing a 3D convolution.
Similarly, a 3D convolution layer is really doing a 4D convolution, which is why you need 5-dimensional input.
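A quick way to convince yourself of this (a sketch, not from the original answer): copy the weights of a bias-free Conv2d into a Conv3d whose kernel depth equals the number of input channels, and the two layers produce the same output:
import torch
import torch.nn as nn

N, C, H, W = 1, 3, 8, 8
x = torch.randn(N, C, H, W)

# 2D conv: weight shape [10, 3, 5, 5]
conv2d = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=5, padding=2, bias=False)

# 3D conv with a single input channel and a kernel spanning the 3 "depth" slices:
# weight shape [10, 1, 3, 5, 5] -- the same numbers viewed as a 3D kernel
conv3d = nn.Conv3d(in_channels=1, out_channels=10, kernel_size=(3, 5, 5),
                   padding=(0, 2, 2), bias=False)
with torch.no_grad():
    conv3d.weight.copy_(conv2d.weight.unsqueeze(1))

out2d = conv2d(x)                           # [1, 10, 8, 8]
out3d = conv3d(x.unsqueeze(1)).squeeze(2)   # [1, 1, 3, 8, 8] -> [1, 10, 1, 8, 8] -> [1, 10, 8, 8]
print(torch.allclose(out2d, out3d, atol=1e-6))  # True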
Let's review what we know. For a 3D convolution we will need to address these:
N For mini batch (or how many sequences do we want to feed at one go)
Cin For the number of channels in our input (if our image is rgb, this is 3)
D For depth, or in other words the number of images/frames in one input sequence (if we are dealing with videos, this is the number of frames)
H For the height of the image/frame
W For the width of the image/frame
So now that we know what's needed, it should be easy to get this going.
In your example, you are missing the depth in the input; since you have a single RGB image, the depth (or time) dimension of your input is 1.
You also have the wrong in_channels: it should be C (in your case 3, since you appear to have an RGB image).
You also need to fix your kernel dimensions, as the kernel has the wrong depth as well. Again, since we are dealing with a single image and not a sequence of images, the depth is 1. Were you to have a depth of k in your input, you could choose any value 1 <= n <= k for the kernel's depth.
Now you should be able to successfully run your snippet.
def conv3d_example():
    # for deterministic output only
    torch.random.manual_seed(0)
    N, C, D, H, W = 1, 3, 1, 7, 7
    img = torch.randn(N, C, D, H, W)
    ##
    in_channels = C
    out_channels = 4
    kernel_size = (1, 3, 3)
    conv = torch.nn.Conv3d(in_channels, out_channels, kernel_size)
    ##
    out = conv(img)
    print(out)
    print(out.size())
results in:
In [3]: conv3d_example()
tensor([[[[[ 0.9368, -0.6973, 0.1359, 0.2023, -0.3149],
[-0.4601, 0.2668, 0.3414, 0.6624, -0.6251],
[-1.0212, -0.0767, 0.2693, 0.9537, -0.4375],
[ 0.6981, -0.1586, -0.3076, 0.1973, -0.2972],
[-0.0747, -0.8704, 0.1757, -0.4161, -0.3464]]],
[[[-0.4710, -0.7841, -1.1406, -0.6413, 0.9183],
[-0.2473, 0.2532, -1.0443, -0.8634, -0.8797],
[ 0.5243, -0.4383, 0.1375, -0.7561, 0.7913],
[-1.1216, -0.4496, 0.5481, 0.1034, -1.0036],
[-0.0941, -0.1458, -0.1438, -1.0257, -0.4392]]],
[[[ 0.5196, 0.3102, 0.5299, -0.0126, 0.7945],
[ 0.3721, -1.3339, -0.5849, -0.2701, 0.4842],
[-0.2661, 0.9777, -0.3328, -0.1730, -0.6360],
[ 0.4960, 0.2348, 0.5183, -0.2935, 0.1777],
[-0.2672, 0.0233, -0.5573, 0.8366, 0.6082]]],
[[[-0.1565, -1.7331, -0.2015, -1.1708, 0.3099],
[-0.3667, 0.1985, -0.4940, 0.4044, -0.8000],
[ 0.2814, -0.6172, -0.4466, -0.6098, 0.0983],
[-0.5814, -0.2825, -0.1321, 0.5536, -0.4767],
[-0.3337, 0.3160, -0.4748, -0.7694, -0.0705]]]]],
grad_fn=<SlowConv3DBackward0>)
torch.Size([1, 4, 1, 5, 5])

The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(), in image classification problem

I am doing image classification using the predefined model VGG16 and got 89% accuracy on the validation data. To increase the model accuracy, I added image augmentation, but got some errors. Please help me with how to fit the model.
Here is my code:
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
train_datagen.fit(X_train)
The input images are 64x64x3.
I fit the model like this:
history = model.fit_generator(
    train_datagen.flow(X_train, y_train),
    steps_per_epoch=(X_train)/32,
    epochs=30,
    validation_data=(X_test, y_test),
    validation_steps=(X_test)/32,
    verbose=1)
Epoch 1/30
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-30-ff3a9aaa40da> in <module>()
5 validation_data=(X_test,y_test),
6 validation_steps=(X_test)/32,
----> 7 verbose=1)
/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name + '` call to the ' +
90 'Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1416 use_multiprocessing=use_multiprocessing,
1417 shuffle=shuffle,
-> 1418 initial_epoch=initial_epoch)
1419
1420 #interfaces.legacy_generator_methods_support
/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
178 steps_done = 0
179 batch_index = 0
--> 180 while steps_done < steps_per_epoch:
181 generator_output = next(output_generator)
182
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Referring to @jmetz's answer: @suri, you have the same issue with your validation_steps parameter, as you initialized it to (X_test)/32 (probably not a scalar).
Check your validation_steps.shape / len(validation_steps) and your steps_per_epoch.shape / len(steps_per_epoch) (depending on the input dimensions).
They have to be scalars.
It looks like steps_per_epoch should be a scalar (single value).
You set it to (X_train)/32.
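Putting the two answers together, a sketch of a corrected call (an assumption about the intent, not part of either answer; the step counts are derived from the number of samples rather than from the arrays themselves):
batch_size = 32

history = model.fit_generator(
    train_datagen.flow(X_train, y_train, batch_size=batch_size),
    # steps_per_epoch must be a scalar: the number of batches per epoch
    steps_per_epoch=len(X_train) // batch_size,
    epochs=30,
    # validation_steps is only needed when validation_data is itself a generator
    validation_data=(X_test, y_test),
    verbose=1)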

How can I get the train and test scores for each iteration of a MLPRegressor?

This answer seems to be exactly what I need, but for a regressor instead of a classifier:
https://stackoverflow.com/a/46913459/9726897
I made very minor modifications to the code provided by sascha in that link, as shown below. I thought it would be fairly straightforward to adapt for my MLPRegressor, but I'm getting an error message I don't know how to fix. Any help would be greatly appreciated:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPRegressor

estimator_reg = MLPRegressor(
    solver='adam',
    activation='relu',
    learning_rate='adaptive',
    learning_rate_init=.01,
    hidden_layer_sizes=[100],
    alpha=0.01,
    max_iter=1000,
    random_state=42,
    tol=0.0001,
    early_stopping=False,
    warm_start=True,
    beta_1=0.7,
    beta_2=0.98,
    epsilon=0.0000000001,
    verbose=10,
)

""" Home-made mini-batch learning
    -> not to be used in out-of-core setting!
"""
N_TRAIN_SAMPLES = train_data.shape[0]
N_EPOCHS = 25
N_BATCH = 128

scores_train = []
scores_test = []

# EPOCH
epoch = 0
while epoch < N_EPOCHS:
    print('epoch: ', epoch)
    # SHUFFLING
    random_perm = np.random.permutation(train_data.shape[0])
    mini_batch_index = 0
    while True:
        # MINI-BATCH
        indices = random_perm[mini_batch_index:mini_batch_index + N_BATCH]
        estimator_reg.partial_fit(train_data[indices], train_labels[indices])
        mini_batch_index += N_BATCH
        if mini_batch_index >= N_TRAIN_SAMPLES:
            break

    # SCORE TRAIN
    scores_train.append(estimator_reg.score(train_data, train_labels))
    # SCORE TEST
    scores_test.append(estimator_reg.score(test_data, test_labels))
    epoch += 1

""" Plot """
fig, ax = plt.subplots(2, sharex=True, sharey=True)
ax[0].plot(scores_train)
ax[0].set_title('Train')
ax[1].plot(scores_test)
ax[1].set_title('Test')
fig.suptitle("Accuracy over epochs", fontsize=14)
plt.show()
and I get this error:
KeyError Traceback (most recent call last)
in ()
---> 46 estimator_reg.partial_fit(train_data[indices], train_labels[indices])
.......
.......
KeyError: '[ 789 1493 353 33 1011 2029 1696 1649 653 1648 22 2477 2120 1000\n 2481 2448 1704 1962 2291 1995 2085 710 967 1839 461 504 1650 2166\n 584 513 676 1196 1621 2109 766 2012 1017 1636 1286 448 2049 1791\n 141 1168 1249 159 2061 2456 431 1799 2249 2379 1169 1044 1010 120\n 2503 316 1070 671 1005 2164 975 2371 811 1555 1193 1316 487 1867\n 1262 1395 135 2224 32 1509 2132 997 263 233 1614 2317 1432 49\n 1251 2227 2536 1955 359 650 2287 792 1900 606 763 1837 742 965\n 1190 53 910 2486 738 103 1965 99 1084 123 1061 806 384 2261\n 2284 2114 360 1075 1479 1446 455 2294 221 1856 979 1078 2106 189\n 2153 1183] not in index'
I guess that you have indexes that are not in the range (0, N_TRAIN_SAMPLES).
That may happen if you deleted or filtered some rows, or if the index contained some numbers outside that range from the beginning.
Try changing this line:
random_perm = np.random.permutation(train_data.shape[0])
into this:
random_perm = np.random.permutation(train_data.index.values)
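If train_data and train_labels are pandas objects, an alternative (my assumption, not part of the original answer) is to keep the positional permutation and index by position with .iloc, which sidesteps the index labels entirely:
# inside the mini-batch loop, index by position instead of by label
indices = random_perm[mini_batch_index:mini_batch_index + N_BATCH]
estimator_reg.partial_fit(train_data.iloc[indices], train_labels.iloc[indices])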
