I'm using the TensorFlow models repo for object detection. For evaluation, I'm using TensorBoard, which displays the mAP and the result: the predicted bounding box for each detected object.
I would like to display the ground truth bounding box along with the predicted one. How can I accomplish this?
It seems like you want to display the ground truth bounding box next to the predicted box for object detection in TensorBoard. I'm assuming you are using the image dashboard. Here is one idea.
You can pass the bytes of the original image into a py_func, which lets you wrap a Python function and use it as a TensorFlow op.
Within the py_func, you can render boxes on top of the image, say using matplotlib (with patches.Rectangle):
matplotlib: how to draw a rectangle on image
You can then pass the resulting image into an image summary op. This GitHub project offers an example: https://github.com/vahidk/EffectiveTensorflow/blob/master/README.md#prototyping-kernels-and-advanced-visualization-with-python-ops
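Here's a minimal sketch of that pipeline, assuming the TF 1.x API and that `image`, `gt_box`, and `pred_box` are existing tensors holding a decoded uint8 RGB image and two `[ymin, xmin, ymax, xmax]` boxes in pixel coordinates (those names are placeholders, not anything from the models repo):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

def draw_boxes(image, gt_box, pred_box):
    """Draw the ground-truth (green) and predicted (red) boxes on the image."""
    fig, ax = plt.subplots(1)
    ax.imshow(image)
    for box, color in ((gt_box, "g"), (pred_box, "r")):
        ymin, xmin, ymax, xmax = box
        ax.add_patch(patches.Rectangle(
            (xmin, ymin), xmax - xmin, ymax - ymin,
            linewidth=2, edgecolor=color, facecolor="none"))
    fig.canvas.draw()
    # Convert the rendered figure back into a uint8 RGB array.
    w, h = fig.canvas.get_width_height()
    buf = np.frombuffer(fig.canvas.tostring_rgb(), dtype=np.uint8)
    plt.close(fig)
    return buf.reshape(h, w, 3)

# Wrap the Python function as a TensorFlow op and log the result as an image.
# `image`, `gt_box`, `pred_box` are assumed to be existing tensors (see above).
annotated = tf.py_func(draw_boxes, [image, gt_box, pred_box], tf.uint8)
summary_op = tf.summary.image("boxes", tf.expand_dims(annotated, 0))
```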
Apparently, there is already an API for that in the TF models repo; see this issue:
https://github.com/tensorflow/models/issues/2596
I want a reference point, or to know the coordinates of any point, on an image exported from Revit (from any view).
For example in the attached image exported from Revit, I'd like to know the bounding box of the picture or the middle point of the picture (in X,Y coordinates) or any other reference point.
[Plan image]
Is there a way to extract the bounding box coordinates of the picture?
I would suggest defining two diagonally opposite points in your image file that you can identify precisely in your Revit model. Determine their image pixel coordinates, export their Revit model coordinates, and use this information to determine the appropriate scaling and translation.
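A minimal sketch of that calculation, assuming pure axis-aligned scaling plus translation (no rotation) between the two coordinate systems; all point values below are hypothetical:

```python
def pixel_to_model(p, p1, p2, m1, m2):
    """Map pixel coordinate p = (x, y) to model coordinates, given two
    reference points with known pixel (p1, p2) and model (m1, m2) coordinates."""
    sx = (m2[0] - m1[0]) / (p2[0] - p1[0])  # X scale factor
    sy = (m2[1] - m1[1]) / (p2[1] - p1[1])  # Y scale factor (often negative: image Y points down)
    return (m1[0] + (p[0] - p1[0]) * sx,
            m1[1] + (p[1] - p1[1]) * sy)

# Hypothetical calibration: pixel (10, 20) maps to model (0.0, 30.0),
# pixel (410, 320) maps to model (40.0, 0.0).
print(pixel_to_model((210, 170), (10, 20), (410, 320), (0.0, 30.0), (40.0, 0.0)))
# -> (20.0, 15.0), the model-space coordinates of that pixel
```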
The RoomEditorApp Revit add-in and its corresponding roomedit CouchDb web interface demonstrate exporting an SVG image from Revit, scaling it for display in a web browser, and transformation and calculation of exact coordinates back and forth between two environments.
I am trying to extract text from an image, but only within a certain area of the image, not the entire image.
I have already been able to detect where the objects of interest are and get their coordinates, though I do not know where to start when it comes to extracting text from a specific area.
I'm using the code from this example:
https://www.codingame.com/playgrounds/38470/how-to-detect-circles-in-images
It is able to detect the circles, but I want to take it one step further and extract the numbers from the circles and tag them to their corresponding coordinate.
I'm using this example to learn how to do something similar myself, but I'm really more interested in restricting the search to a set area.
Most image processing libraries support the concept of ROIs (region of interest) or AOIs (area of interest).
The idea is to restrict processing to a subset of pixels, usually selected by defining geometric shapes such as rectangles, polygons, or circles within the image coordinate system.
You can fix this by first cropping the image using your coordinates and then extracting text from the cropped region.
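A minimal sketch, assuming OpenCV for the cropping and pytesseract for the OCR (the original answer names neither library), and a hypothetical circle center and radius as produced by HoughCircles:

```python
import cv2
import pytesseract

# Hypothetical inputs: the source image and one (x, y, r) circle from HoughCircles.
image = cv2.imread("circles.png")
x, y, r = 120, 85, 30

# Crop a square region around the circle: plain numpy slicing defines the ROI.
roi = image[y - r:y + r, x - r:x + r]

# Run OCR only on the cropped region, restricting the result to digits.
text = pytesseract.image_to_string(
    roi, config="--psm 7 -c tessedit_char_whitelist=0123456789")
print(f"circle at ({x}, {y}): {text.strip()}")
```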
I have a tensor named input with dimensions 64x21x21. It is a minibatch of 64 images, each 21x21 pixels. I'd like to crop each image down to 11x11 pixels. So the output tensor I want would have dimensions 64x11x11.
I'd like to crop each image around a different "center pixel." The center pixels are given by a 2-dimensional long tensor named center with dimensions 64x2. For image i, center[i][0] gives the row index and center[i][1] gives the column index for the pixel that should be at the center in the output. We can assume that the center pixel is always at least 5 pixels away from the border.
Is there an efficient way to do this in PyTorch (on the GPU)?
UPDATE: Let me clarify that the center tensor is formed by a deep neural network. It acts as a "hard attention mechanism," to use the reinforcement learning term for it. After I "crop" an image, that subimage becomes the input to another neural network. That's why I want to do the cropping in PyTorch: because the operations before and after the cropping are in PyTorch. I'd like to avoid having to transfer anything from the GPU back to the CPU.
I raised the question over on the PyTorch forums and got an answer there from smth. The grid_sample function should totally solve the problem.
https://discuss.pytorch.org/t/cropping-a-minibatch-of-images-each-image-a-bit-differently/12247
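A minimal sketch of that approach, using random stand-ins for the `input` and `center` tensors from the question:

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
N, H, W, crop = 64, 21, 21, 11

# Stand-ins for the real data: a 64x21x21 batch and 64x2 (row, col) centers
# that are at least 5 pixels from the border.
input = torch.rand(N, H, W, device=device)
center = torch.randint(5, 16, (N, 2), device=device)

# Pixel offsets -5..5 around each center.
offsets = torch.arange(crop, device=device).float() - crop // 2

# Per-image sampled row and column coordinates, shape (N, 11) each,
# normalized to [-1, 1] as grid_sample expects (with align_corners=True).
rows = (center[:, 0:1].float() + offsets) / (H - 1) * 2 - 1
cols = (center[:, 1:2].float() + offsets) / (W - 1) * 2 - 1

# Build the (N, 11, 11, 2) sampling grid; the last dim is (x, y) = (col, row).
grid = torch.stack(
    (cols.unsqueeze(1).expand(N, crop, crop),   # x varies along the width axis
     rows.unsqueeze(2).expand(N, crop, crop)),  # y varies along the height axis
    dim=-1)

out = F.grid_sample(input.unsqueeze(1), grid, align_corners=True).squeeze(1)
print(out.shape)  # torch.Size([64, 11, 11])
```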
torchvision contains transforms, including RandomCrop, but they don't seem to fit your use case if you want each image cropped in a specific way. I would reckon that PyTorch, a deep learning framework, is not the appropriate tool for cropping images.
Instead, have a look at this tutorial that uses Pillow. You should be able to implement your use case with it. Also have a look at pillow-simd, which does some operations faster.
I am doing some studies on eye vascularization; my project involves a machine which can detect the different blood vessels in the retinal membrane at the back of the eye. What I am looking for is a way to segment the picture and analyze each segment on its own. The segmentation consists of six squares which I want to analyze separately for the density of white pixels.
I would be very thankful for every kind of input; I am pretty new to the programming world and I actually just have a bare concept of how it should work.
Thanks and Cheerio
Sam
[Concept drawing: OCTA picture]
You could probably accomplish this by using numpy to load the image and split it into sections. You could then analyze the sections using scikit-image or OpenCV (though this could be difficult to get working). To view the image, you can either save it to a file using numpy, or use matplotlib to open it in a new window.
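A minimal sketch of the numpy approach, assuming a grayscale image split into a 2x3 grid of tiles (the filename and grid layout are placeholders):

```python
import numpy as np
from PIL import Image

img = np.array(Image.open("retina.png").convert("L"))  # placeholder filename

rows, cols = 2, 3  # six squares
h, w = img.shape[0] // rows, img.shape[1] // cols

for r in range(rows):
    for c in range(cols):
        tile = img[r * h:(r + 1) * h, c * w:(c + 1) * w]
        density = np.mean(tile > 127)  # fraction of white pixels in this tile
        print(f"tile ({r}, {c}): white-pixel density = {density:.3f}")
```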
First of all, please note that in image processing "segmentation" describes the process of grouping neighbouring pixels by context.
https://en.wikipedia.org/wiki/Image_segmentation
What you want to do can be done in various ways.
The most common way is by using ROIs or AOIs (region/area of interest). That's basically some geometric shape like a rectangle, circle, polygon or similar defined in image coordinates.
The image processing is then restricted to only process pixels within that region. So you don't slice your image into pieces but you restrict your evaluation to specific areas.
Another way, like you suggested, is to cut the image into pieces and process them one by one. Those sub-images are usually created using ROIs.
A third option, which is rather limited but sufficient for simple tasks like yours, is accessing pixels directly using coordinate offsets and several nested loops, as sketched below.
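A minimal sketch of that third option, counting white pixels in one rectangular region by direct pixel access (the filename and region coordinates are placeholders):

```python
import numpy as np
from PIL import Image

img = np.array(Image.open("retina.png").convert("L"))  # placeholder filename
x0, y0, x1, y1 = 50, 40, 150, 140  # hypothetical region in image coordinates

white = 0
for y in range(y0, y1):
    for x in range(x0, x1):
        if img[y, x] > 127:  # numpy indexing is (row, column)
            white += 1

print("white-pixel density:", white / ((y1 - y0) * (x1 - x0)))
```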
Just google "python image processing" in combination with "library" "roi" "cropping" "sliding window" "subimage" "tiles" "slicing" and you'll get tons of information...
I am currently working on a program to detect coordinates of pool balls in an image of a pool table taken from an arbitrary point.
I first calculated the table corners and warped the perspective of the image to obtain a bird's eye view. Unfortunately, this made the spherical balls appear to be slightly elliptical as shown below.
In an attempt to detect the ellipses, I extracted all but the green felt area and used a Hough transform algorithm (HoughCircles) on the resulting image shown below. Unfortunately, none of the ellipses were detected (I can only assume this is because they are not circles).
Is there any better method of detecting the balls in this image? I am technically using JavaCV, but OpenCV solutions should be suitable. Thank you so much for reading.
The extracted BW image is good, but it needs some morphological filtering to eliminate noise; then you can extract the external contour of each object (with cvFindContours) and fit the best ellipse to it (with cvFitEllipse2).
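A minimal sketch of that pipeline in Python OpenCV (the question uses JavaCV, but the calls map one-to-one; the filename is a placeholder and the two-value findContours return assumes OpenCV 4.x):

```python
import cv2

bw = cv2.imread("balls_bw.png", cv2.IMREAD_GRAYSCALE)  # the extracted BW image

# Morphological opening removes small noise specks; closing fills small holes.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
clean = cv2.morphologyEx(bw, cv2.MORPH_OPEN, kernel)
clean = cv2.morphologyEx(clean, cv2.MORPH_CLOSE, kernel)

# Extract external contours and fit an ellipse to each sufficiently large one.
contours, _ = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    if len(cnt) >= 5:  # fitEllipse needs at least 5 contour points
        (cx, cy), (major, minor), angle = cv2.fitEllipse(cnt)
        print(f"ball near ({cx:.0f}, {cy:.0f}), axes {major:.0f}x{minor:.0f}")
```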