Pyro: Simple inverse graphics example using SVI not working - pytorch

I'm new to Pyro and trying to implement a simple inverse graphics problem: estimating the coordinates of the vertices of a triangle rendered on a black & white 32x32 image.
So I defined a generative model that samples 3 uniformly random points, renders them into an image, and observes the result.
I then use SVI with an autoguide (AutoMultivariateNormal) to try to estimate the points for a fixed triangle image.
SVI seems to run well and the ELBO loss decreases; however, when I try to sample from the posterior, all I get is uniformly random points with no sign of learning.
Here is my code in a Jupyter notebook, with the results:
What am I missing here?
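Since the notebook itself isn't shown, here is a minimal sketch of the setup as described: uniform priors over three vertices, a Bernoulli pixel likelihood, and an AutoMultivariateNormal guide. The soft_render stand-in (Gaussian blobs at the vertices rather than a filled triangle) and all constants are assumptions, not the asker's actual code.

    import torch
    import pyro
    import pyro.distributions as dist
    from pyro.infer import SVI, Trace_ELBO
    from pyro.infer.autoguide import AutoMultivariateNormal
    from pyro.optim import Adam

    SIZE = 32
    ys, xs = torch.meshgrid(
        torch.linspace(0.0, 1.0, SIZE),
        torch.linspace(0.0, 1.0, SIZE),
        indexing="ij",
    )
    grid = torch.stack([xs, ys], dim=-1)  # (32, 32, 2) pixel centres

    def soft_render(points, sharpness=200.0):
        # Differentiable stand-in "renderer": per-pixel intensity from the
        # squared distance to the nearest vertex (not a real rasterizer).
        d2 = ((grid[:, :, None, :] - points) ** 2).sum(-1)  # (32, 32, 3)
        return torch.exp(-sharpness * d2.min(-1).values).clamp(1e-6, 1 - 1e-6)

    def model(observed=None):
        # Three vertices, each coordinate uniform in [0, 1].
        points = pyro.sample(
            "points", dist.Uniform(0.0, 1.0).expand([3, 2]).to_event(2)
        )
        img = soft_render(points)
        pyro.sample("obs", dist.Bernoulli(img).to_event(2), obs=observed)

    pyro.clear_param_store()
    guide = AutoMultivariateNormal(model)
    svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())

    target = soft_render(torch.rand(3, 2)).bernoulli()  # synthetic target image
    for step in range(2000):
        svi.step(target)

    posterior_points = guide()["points"]  # one sample from the fitted guide

One thing worth checking in a setup like this is whether the renderer is differentiable with respect to the points: with a hard rasterizer, the ELBO can still decrease somewhat (via the guide's entropy and prior terms) while the posterior over the points never moves away from the prior.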

Related

Is labelling images with polygons better than with squares?

I aim to build an object detection model, and I labelled my data with square (rectangular) boxes.
If I label the images with polygons instead, will that be better than squares?
(I'm labelling images of people wearing safety helmets or not.)
I did try labelling a few images with polygon shapes, but after exporting the txt file for YOLO,
why does it have only 4 points in the text file, the same as when labelled with a square shape?
How can those points accurately represent the area I labelled?
1 0.573748 0.018953 0.045332 0.036101
1 0.944520 0.098375 0.108931 0.167870
You have labelled your object in a polygon format, but when the labels were converted to YOLO format, information was lost. What I suppose has happened is this: you drew a polygon annotation, but the converter took the smallest and largest x and y values among the polygon's coordinate points and collapsed them into an axis-aligned bounding box. In each YOLO line, the first two values are the normalized centre of that box, and the last two are its normalized width and height.
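A minimal sketch of what such a converter effectively does (the function name and class id are hypothetical):

    def polygon_to_yolo(points, img_w, img_h, cls=1):
        # points: list of (x, y) pixel coordinates of the polygon vertices.
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        x_min, x_max = min(xs), max(xs)
        y_min, y_max = min(ys), max(ys)
        # YOLO stores the normalized box centre and size, not the vertices,
        # so the polygon collapses to its bounding box.
        cx = (x_min + x_max) / 2 / img_w
        cy = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        return f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"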
A good description of the idea behind the labelling and datasets is given in https://www.youtube.com/watch?v=h6s61a_pqfM.
In short: for your purpose, and for efficiency, I propose you do fast and convenient annotation using rectangles only; polygon annotation is time-consuming.
The YOLO version you are using very likely supports only rectangular annotations.
See this video showing the quality of square vs. polygon detection results, and the problem of the annotation time required to create custom datasets.
To use polygonal masks, may I suggest switching to YOLOv3-Polygon or YOLOv5-Polygon.

How to determine object orientation in an image?

I am supposed to determine the direction of a boat from drone imagery: whether it is docked front-first or back-first.
I tried thresholding the boat's bounding box to get a binary image,
then splitting the bounding box into two halves and comparing the sum of blue pixels in each half, since the front half of the boat shows more water in the image due to its triangular shape, but it didn't work.
My question is: how can I determine the correct direction of the boat using image processing techniques?
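For reference, here is a minimal sketch of the half-split blue-pixel heuristic described above (the HSV thresholds are illustrative guesses, not tuned values):

    import cv2

    def waterier_half(bbox_bgr):
        # Split the boat's bounding-box crop into left/right halves and
        # compare how much "water blue" each half contains.
        hsv = cv2.cvtColor(bbox_bgr, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, (90, 50, 50), (130, 255, 255))  # blue range
        mid = mask.shape[1] // 2
        left_blue = int(mask[:, :mid].sum())
        right_blue = int(mask[:, mid:].sum())
        return "left" if left_blue > right_blue else "right"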
I have used this TensorFlow library to detect object orientation in many projects. You need to train a neural network on your images; it will then predict both the boat's location and its direction.
Use semantic segmentation to detect the dock,
and a keypoint detection method to detect the boats. Keypoint methods are usually used for face recognition, but I think they would help in your case.

Gaussian Mixture Models for pixel clustering

I have a small set of aerial images in which the different terrains visible in the image have been labelled by human experts. For example, an image may contain vegetation, river, rocky mountains, farmland, etc. Each image may have one or more of these labelled regions. Using this small labelled dataset, I would like to fit a Gaussian mixture model for each of the known terrain types. After this is complete, I would have N GMMs for the N types of terrain that I might encounter in an image.
Now, given a new image, I would like to determine, for each pixel, which terrain it belongs to by assigning the pixel to the most probable GMM.
Is this the correct line of thought? And if yes, how can I go about clustering an image using GMMs?
It's not clustering if you use labelled training data!
You can, however, easily reuse the labelling machinery of GMM clustering.
For this, compute the prior probabilities, means, and covariance matrices from the training data (inverting the covariances for density evaluation). Then classify each pixel of the new image by the maximum probability density, weighted by the prior probabilities, under the multivariate Gaussians from the training data.
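A minimal scikit-learn sketch of this per-class approach (the dictionary layout and n_components are illustrative assumptions):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fit_terrain_gmms(pixels_by_class, n_components=3):
        # pixels_by_class maps terrain name -> (N, 3) array of RGB rows.
        gmms, log_priors = {}, {}
        total = sum(len(p) for p in pixels_by_class.values())
        for name, pixels in pixels_by_class.items():
            gmms[name] = GaussianMixture(n_components=n_components).fit(pixels)
            log_priors[name] = np.log(len(pixels) / total)
        return gmms, log_priors

    def classify_image(image, gmms, log_priors):
        # Assign every pixel of an (H, W, 3) image to the terrain whose
        # prior-weighted GMM log-density is highest.
        flat = image.reshape(-1, 3).astype(float)
        names = list(gmms)
        scores = np.stack(
            [gmms[n].score_samples(flat) + log_priors[n] for n in names]
        )
        return np.array(names)[scores.argmax(axis=0)].reshape(image.shape[:2])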
Intuitively, your thought process is correct. If you already have the labels, that makes this a lot easier.
For example, let's pick a very well known, non-parametric algorithm: k-Nearest Neighbors (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm).
In this algorithm, for each new pixel you find the k closest training pixels, where closest is determined by some distance function (usually Euclidean). You then assign the new pixel the most frequently occurring classification label among those neighbours.
I am not sure if you are looking for a specific algorithm recommendation, but KNN would be a very good algorithm to begin testing this type of exercise with. I saw you tagged sklearn; scikit-learn has a very good KNN implementation I suggest you read up on.
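A short sketch of that with scikit-learn (the arrays here are randomly generated just to make the shapes concrete):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # Hypothetical labelled data: (M, 3) RGB pixel rows with terrain labels.
    train_pixels = np.random.randint(0, 256, size=(500, 3))
    train_labels = np.random.choice(["river", "vegetation"], size=500)

    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(train_pixels, train_labels)

    # Classify every pixel of a new image by majority vote of its neighbours.
    new_image = np.random.randint(0, 256, size=(32, 32, 3))
    pred = knn.predict(new_image.reshape(-1, 3)).reshape(32, 32)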

Detect red square in image with OpenGL-ES

I need to write a program that will detect a red square in an image. I would like to do this on the GPU using OpenGL ES. I have no experience with GPU programming and haven't found an answer through Google so far.
Is it possible to do this using OpenGL? Does OpenGL ES give access to the whole matrix of pixels and their locations in the matrix, so that a program can iterate over the pixels and check each one's colour value and position?
Thank you.
First of all, you are confusing a few terms: there is no 'matrix of pixels'.
If what you mean by that is convolution, then yes, you can run a convolution in a fragment shader to detect edges. However, a shader does not return data, and there is no way to access each pixel to read its colour value. Convolution would work if you just want the shader to draw the square's edges, but if you want to know whether a red square exists in the camera frame, that must be computed on the CPU, not the GPU.
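For the CPU side, a minimal Python/NumPy sketch of such a check (the colour thresholds are illustrative, and this assumes the frame has already been read back from the GPU, e.g. with glReadPixels):

    import numpy as np

    def find_red_region(rgb):
        # rgb: (H, W, 3) uint8 frame. Threshold "red" pixels and return
        # their bounding box, or None if no red region is found.
        r = rgb[..., 0].astype(int)
        g = rgb[..., 1].astype(int)
        b = rgb[..., 2].astype(int)
        mask = (r > 150) & (g < 80) & (b < 80)
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            return None
        return xs.min(), ys.min(), xs.max(), ys.max()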

How Can I Detect Ellipses in OpenCV/JavaCV?

I am currently working on a program to detect coordinates of pool balls in an image of a pool table taken from an arbitrary point.
I first calculated the table corners and warped the perspective of the image to obtain a bird's-eye view. Unfortunately, this made the spherical balls appear slightly elliptical.
In an attempt to detect the ellipses, I extracted all but the green felt area and used a Hough transform algorithm (HoughCircles) on the resulting image. Unfortunately, none of the ellipses were detected (I can only assume because they are not circles).
Is there any better method of detecting the balls in this image? I am technically using JavaCV, but OpenCV solutions should be suitable. Thank you so much for reading.
The extracted BW image is good, but it needs some morphological filtering to eliminate noise. Then you can extract the external contour of each object (with cvFindContours) and fit the best ellipse to it (with cvFitEllipse2).
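A minimal OpenCV-Python sketch of that pipeline (cv2.findContours and cv2.fitEllipse are the modern equivalents of cvFindContours and cvFitEllipse2; the filename and kernel size are assumptions):

    import cv2

    # Load the extracted black & white image of the balls (hypothetical file).
    bw = cv2.imread("balls_bw.png", cv2.IMREAD_GRAYSCALE)

    # Morphological opening to remove small noise blobs.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    clean = cv2.morphologyEx(bw, cv2.MORPH_OPEN, kernel)

    # External contours only; fitEllipse needs at least 5 contour points.
    contours, _ = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if len(c) >= 5:
            (cx, cy), (w, h), angle = cv2.fitEllipse(c)
            print(f"ball at ({cx:.1f}, {cy:.1f}), axes {w:.1f} x {h:.1f}")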
