Is labelling images with polygon better than square? - conv-neural-network

I aim to build an object detection model, and I labelled my data with square boxes.
If I label the images with polygons instead, will it be better than squares?
(The labelling is on images of people wearing a safety helmet or not.)
I tried labelling a few images with polygon shapes, and after exporting the txt file for YOLO
it still has only 4 values per object in the text file, the same as when labelling with a square shape. Why?
How can those values accurately represent the area that I labelled?
1 0.573748 0.018953 0.045332 0.036101
1 0.944520 0.098375 0.108931 0.167870

You labelled your objects as polygons, but the conversion to YOLO format discards most of that information. The picture below shows what I suppose has happened;
...where you have done the polygon annotation (black shape). The conversion "searched" for the smallest and largest x-values and the smallest and largest y-values among the polygon's coordinate points, which gives the tightest axis-aligned rectangle around the polygon. In the YOLO format, each line is the class id followed by the x and y of that rectangle's center and then its width and height, all divided by the image width or height so that they lie between 0 and 1.
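As a rough sketch of the conversion described above (my own illustration, not the code of any particular labelling tool), the polygon collapses to a box like this. The function names and the pixel-coordinate input are assumptions:

```python
def polygon_to_yolo(points, img_w, img_h):
    """Collapse polygon vertices (in pixels) to a normalized YOLO box.

    Returns (x_center, y_center, width, height), each in the range 0..1.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (
        (x_min + x_max) / 2 / img_w,   # box center x, normalized
        (y_min + y_max) / 2 / img_h,   # box center y, normalized
        (x_max - x_min) / img_w,       # box width, normalized
        (y_max - y_min) / img_h,       # box height, normalized
    )


def yolo_to_corners(xc, yc, w, h, img_w, img_h):
    """Turn a normalized YOLO box back into pixel corners."""
    left = (xc - w / 2) * img_w
    top = (yc - h / 2) * img_h
    right = (xc + w / 2) * img_w
    bottom = (yc + h / 2) * img_h
    return left, top, right, bottom
```

Whatever shape the polygon had inside that rectangle is lost, which is why the exported line looks the same as for a box annotation.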
A good description about the idea behind the labelling and dataset is shown in https://www.youtube.com/watch?v=h6s61a_pqfM.
In short: for your purpose (and for efficiency) I suggest you annotate quickly and conveniently using rectangles only, with no time-consuming polygon annotation.

The YOLO version you are using very likely supports only rectangular (bounding-box) annotations.
See this video showing square vs polygon quality of results for detection, and the problem of the annotation time required to create custom data sets.
If you want to use polygonal masks, I suggest switching to YOLOv3-Polygon or YOLOv5-Polygon.

Related

How to approximate low-res 3D density map to smooth models?

3D density maps can of course be plotted as heatmaps, but what if the data itself is homogeneous (near 0) except for a small part (a 2D cross section, for example):
This should give a letter 'E' shape as a 2D "model". The original data is not saved as a point cloud, however.
A naive approach would be to take the pixels that exceed a certain value and then smooth the border. However, this does not take into account that the border pixels have small values.
Another would be to use one of the point-cloud based algorithms that come with modeling software, but then the point cloud's probability function would still be discontinuous at the pixel borders, and it would not take into account that only one side has signal.
Is there any tested solution to this (the example is 2D; the actual case is many 2D slices that compose a low-res 3D density map)? I was thinking of giving border pixels an area proportional to their signal, with the border defined from the gradient. Any suggestions?
I was thinking of model visualization results similar to this (it seems to be based on an established point-cloud algorithm):

How can i create an image morpher inside a graphics shader?

Image morphing is mostly a graphic design SFX that adapts one picture into another using some points decided by the artist, who has to match the eyes and some other key zones on one portrait with those on another; some kind of algorithm then adapts the entire picture to change from one to the other.
I would like to do something a bit similar with a shader, which can load any 2 graphics, automatically choose zones of the most similar colors in the same kinds of zones of the pictures, and automatically morph the two pictures in real time. Perhaps a shader based version would logically be a lot faster at the task? Except I don't even understand how it works at all.
If you know, please don't worry about a complete reply about the process; it would be great if you have some vague background concepts and keywords for how to attempt a 2D texture morph in a graphics shader.
There are more morphing methods out there; the one you are describing is based on geometry.
1. morph by interpolation
You have 2 data sets with similar properties (for example, both images are 2D) and interpolate between them by some parameter. In the case of 2D images you can use linear interpolation if both images have the same resolution, or trilinear interpolation if not (bilinear resampling of the position plus the linear blend).
So you just pick the corresponding pixels from each image and interpolate the actual color for some parameter t=<0,1>. For the same resolution it looks something like this:
for (y=0;y<img1.height;y++)
 for (x=0;x<img1.width;x++)
  img.pixel[x][y]=(1.0-t)*img1.pixel[x][y] + t*img2.pixel[x][y];
where img1,img2 are the input images and img is the output. Beware that t is a float, so you need to cast the pixel values to avoid integer rounding problems, or use an integer scale t=<0,256> and correct the result by a bit shift right by 8 bits or by /256. For different sizes you need to bilinearly interpolate the corresponding (x,y) position in both of the source images first.
All this can be done very easily in a fragment shader. Just bind img1,img2 to texture units 0,1, pick the texel from each, interpolate, and output the final color. The bilinear coordinate interpolation is done automatically by GLSL (given linear texture filtering) because texture coordinates are normalized to <0,1> no matter the resolution. In the vertex shader you just pass the texture and vertex coordinates. And on the main program side you just draw a single quad covering the final image output...
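To make the blend concrete outside of a shader, here is a minimal CPU-side sketch in Python. It assumes both images are nested lists of (r, g, b) tuples with the same resolution; the function names are made up for illustration. The second function shows the integer variant with t scaled to <0,256> and a shift by 8 bits:

```python
def blend_images(img1, img2, t):
    """Per-pixel linear blend: (1 - t) * img1 + t * img2, with t in <0,1>."""
    if len(img1) != len(img2) or any(len(a) != len(b) for a, b in zip(img1, img2)):
        raise ValueError("images must have the same resolution")
    return [
        [
            # blend each color channel as a float, then round back to an integer
            tuple(round((1.0 - t) * c1 + t * c2) for c1, c2 in zip(p1, p2))
            for p1, p2 in zip(row1, row2)
        ]
        for row1, row2 in zip(img1, img2)
    ]


def blend_channel_fixed(c1, c2, t256):
    """Integer-only blend of one channel; t256 is t scaled to <0,256>."""
    return ((256 - t256) * c1 + t256 * c2) >> 8
```

In a fragment shader the same blend is a single mix of the two sampled texels, so this is only meant to show the arithmetic.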
2. morph by geometry
You have 2 polygons (or sets of matching points) and interpolate their positions between the 2. For example something like this: Morph a cube to coil. This is suited for vector graphics. You just need the point correspondence, and then the interpolation is similar to #1.
for (i=0;i<points;i++)
{
 p(i).x=(1.0-t)*p1(i).x + t*p2(i).x;
 p(i).y=(1.0-t)*p1(i).y + t*p2(i).y;
}
where p1(i),p2(i) is the i-th point from each input geometry set and p(i) is the point from the final result...
To enhance the visual appearance, the linear interpolation is exchanged for a specific trajectory (like BEZIER curves) so the morph looks more cool. For example see
Path generation for non-intersecting disc movement on a plane
To accomplish this you need to use a geometry shader (or maybe even a tessellation shader). You would need to pass both polygons as a single primitive; the geometry shader should then interpolate the actual polygon and emit it for rasterization (the geometry stage runs after the vertex shader, not before it).
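A small Python sketch of the point interpolation, plus the curved-trajectory variant, might look like this. The quadratic Bezier form and the per-point control points are my assumption of one possible trajectory; the text above does not prescribe a specific curve:

```python
def lerp_points(p1, p2, t):
    """Linear morph: the i-th output point lies between p1[i] and p2[i]."""
    return [
        ((1.0 - t) * x1 + t * x2, (1.0 - t) * y1 + t * y2)
        for (x1, y1), (x2, y2) in zip(p1, p2)
    ]


def bezier_points(p1, ctrl, p2, t):
    """Curved morph: each point follows a quadratic Bezier through its
    own (hypothetical) control point ctrl[i] instead of a straight line."""
    a, b, c = (1.0 - t) ** 2, 2.0 * (1.0 - t) * t, t ** 2
    return [
        (a * x1 + b * xc + c * x2, a * y1 + b * yc + c * y2)
        for (x1, y1), (xc, yc), (x2, y2) in zip(p1, ctrl, p2)
    ]
```

Both functions start at p1 for t=0 and end at p2 for t=1; only the path in between differs.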
3. morph by particle swarms
In this case you find corresponding pixels in the source images by matching colors. Then handle each pixel as a particle and create its path from its position in img1 to its position in img2 with parameter t. It is the same as #2, but instead of polygon areas you have just points. The particle has its color and position; you interpolate both ... because there is a very slim chance you will get exact color matches and the count ... (histograms would be the same), which is improbable.
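One very simple way to get such a pairing, shown here only as a sketch, is to sort the pixels of both images by brightness and pair them in that order. This is a stand-in for real color matching, it assumes both images have the same pixel count, and all names are hypothetical:

```python
def match_particles(pixels1, pixels2):
    """Pair pixels of two images by brightness rank.

    Each pixel is ((x, y), (r, g, b)). Sorting by summed channels is a
    crude substitute for proper nearest-color matching.
    """
    if len(pixels1) != len(pixels2):
        raise ValueError("this sketch needs the same number of pixels")

    def brightness(pixel):
        return sum(pixel[1])

    return list(zip(sorted(pixels1, key=brightness), sorted(pixels2, key=brightness)))


def particle_at(pair, t):
    """Interpolate both the position and the color of one particle."""
    (pos1, col1), (pos2, col2) = pair
    pos = tuple((1.0 - t) * a + t * b for a, b in zip(pos1, pos2))
    col = tuple((1.0 - t) * a + t * b for a, b in zip(col1, col2))
    return pos, col
```

Because the colors rarely match exactly, the color is blended along the path as well as the position.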
4. hybrid morphing
It is any combination of #1,#2,#3
I am sure there are more methods for morphing; these are just the ones I know of. Also, the morphing can be done not only in the spatial domain...

Generating density map for tree growth rings

I was just wondering if someone knows of any papers or resources on generating synthetic images of growth rings in trees. I'm thinking of 2D scalar fields or some other data representation which can then be used to render growth-ring-like images :)
Thanks!
never done or heard about this ...
If you need simulation then search for biology/botanist sites instead.
If you need just visually close results then I would:
make a polygon covering the cut (circle/oval like shape)
start with a circle and, when everything is working, try to add some random distortion or use an ellipse
create a 1D texture with the density
it will be used to fill the polygon via a triangle fan. So first find an image of the tree type you want to generate, for example this:
Analyze the color and intensity as a function of diameter, so extract a pie-like piece (or a thin rectangle)
and plot a graph of the R,G,B values to see how the rings are shaped
then create a function that approximates that (or use piecewise interpolation) and create your own texture as a function of tree age. You can interpolate both the color and the density of the rings this way.
My example shows that for this tree the color is the same, so only its intensity changes. In this case you do not need to approximate all 3 functions. The bumps are a bit noisy due to another texture layer (ignore this at the start). You can use:
intensity=A*|cos(pi*t)| as a start
A is brightness
t is age in years/cycles (and also the x coordinate (scaled) in your 1D texture)
so take the base color R,G,B, multiply it by the intensity for each t, and fill the texture pixel with this color. You can add some randomness to the ring period (pi*t), and the scale can also be matched more closely. This is linear growth ,... so you can use exponential instead or interpolate to match the bumps per length affected by age (distance from t=0)...
now just render the polygon
The mid point is the t=0 coordinate in the texture; each vertex of the polygon is the t=full_age coordinate in the texture. So render the triangle fan with these texture coordinates. If you need a closer match (rings are not the same thickness along the perimeter) then you can convert this to a 2D texture
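The texture and the lookup can be sketched in Python like this. It assumes a perfectly circular cut and replaces the triangle fan with a per-pixel radius lookup, which gives the same mapping for a circle; the function names and the samples_per_year parameter are my own:

```python
import math


def ring_texture(base_rgb, age, samples_per_year=16, brightness=1.0):
    """1D texture of ring intensity: brightness * |cos(pi * t)|, t in years."""
    texture = []
    for i in range(age * samples_per_year + 1):
        t = i / samples_per_year
        intensity = brightness * abs(math.cos(math.pi * t))
        texture.append(tuple(round(c * intensity) for c in base_rgb))
    return texture


def sample_disc(texture, x, y, cx, cy, radius):
    """Color of pixel (x, y) on a circular cut centered at (cx, cy).

    The center maps to t=0 and the rim to t=full_age; returns None
    outside the disc.
    """
    r = math.hypot(x - cx, y - cy)
    if r > radius:
        return None
    return texture[round(r / radius * (len(texture) - 1))]
```

With |cos(pi*t)| there is exactly one dark band per unit of t, so one ring per year in this sketch.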
[Notes]
You can also do this incrementally, doing just one ring per iteration. The next ring polygon is the last one enlarged by scale>1 with some randomness added, but this needs to be rendered as a QUAD STRIP. You can have a static texture for a single ring, so you interpolate just the density and overall brightness:
radius(i)=radius(i-1)+ring_width=radius(i-1)*scale
so:
scale=(radius(i-1)+ring_width)/radius(i-1)
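As a quick check of the two formulas above, this sketch builds the radii ring by ring and records the scale used at each step. The function name and the explicit list of ring widths are assumptions; in practice the widths would come from your growth model plus some randomness:

```python
def ring_scales(first_radius, ring_widths):
    """Grow rings outward: radius(i) = radius(i-1) * scale(i)."""
    radii = [first_radius]
    scales = []
    for width in ring_widths:
        prev = radii[-1]
        scale = (prev + width) / prev   # scale = (radius(i-1) + ring_width) / radius(i-1)
        scales.append(scale)
        radii.append(prev * scale)      # same as prev + width
    return radii, scales
```

Note that for a constant ring width the scale shrinks toward 1 as the radius grows.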

BoundingBox Shape

In my Android mapping activity, I have a parallelogram-shaped area, and I want to tell whether points (i.e. LatLng) are inside it. I've tried using:
bounds = new LatLngBounds.Builder()
    .include(latlngNW)
    .include(latlngNE)
    .include(latlngSW)
    .include(latlngSE)
    .build();
and later
if (bounds.contains(currentLatLng)) {
    .....
}
but it is not that accurate. Do I need to create equations for the lines connecting the four corners?
Thanks in advance.
LatLngBounds appears to create an axis-aligned box from the included points. Since the shape I'm trying to monitor is a parallelogram, you do need to create equations for each edge of the shape and use if statements to determine which side of each line a point lies on.
Not an easy solution!
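The side-of-line test does not have to be many separate equations. A minimal sketch in Python (not Android code) uses the sign of a cross product per edge. It treats longitude/latitude as flat x/y, which I assume is acceptable for a small area, and it needs the corners in order around the shape (NW, NE, SE, SW rather than NW, NE, SW, SE):

```python
def point_in_convex_quad(point, corners):
    """True if point is inside (or on the edge of) a convex polygon.

    corners must be listed in order around the shape, in either direction.
    The point is inside when it lies on the same side of every edge.
    """
    px, py = point
    sign = 0
    for i in range(len(corners)):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % len(corners)]
        # cross product of edge A->B with A->point: its sign gives the side
        cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False
    return True
```

For example, with corners (0,0), (4,0), (6,2), (2,2), the point (0.5, 1.5) is inside the bounding box but outside the parallelogram, which is the kind of inaccuracy a box-based check would show.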
If you wish to build a parallelogram-shaped bounding "box" from a collection of points, and you know the desired angles of the parallelogram's sides, your best bet is probably to define a 2D linear shear transform which will map one of those angles to horizontal and the other to vertical. One may then feed the transformed points into normal "bounding box" routines, and feed the corners of the resulting box through the inverse of the above transform to get a bounding parallelogram.
Note that this approach is generally only suitable for parallelograms, not trapezoids. There are a few special cases where it could be used to find bounding trapezoids [e.g. if the top and bottom were horizontal, and the sides were supposed to converge at a known point (x0, y0), one could map x' = (x-x0)/(y-y0)] but for many kinds of trapezoids, the trapezoid formed by inverse mapping the corners of a horizontal/vertical bounding rectangle may not properly bound the points that are supposed to be within it.
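A sketch of the parallelogram case in Python, under the assumption that the two side directions are given as vectors u and v (hypothetical parameter names) rather than angles. Expressing each point in the (u, v) basis plays the role of the transform, and mapping the box corners back is its inverse:

```python
def bounding_parallelogram(points, u, v):
    """Smallest parallelogram with sides parallel to u and v around points.

    Each point p is written as p = a*u + b*v; an ordinary min/max box in
    (a, b) is then mapped back to x/y. Returns the 4 corners in order.
    """
    ux, uy = u
    vx, vy = v
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("u and v must not be parallel")
    a_vals = [(px * vy - py * vx) / det for px, py in points]
    b_vals = [(ux * py - uy * px) / det for px, py in points]
    a0, a1 = min(a_vals), max(a_vals)
    b0, b1 = min(b_vals), max(b_vals)
    return [
        (a * ux + b * vx, a * uy + b * vy)
        for a, b in ((a0, b0), (a1, b0), (a1, b1), (a0, b1))
    ]
```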

Converting from Latitude/Longitude to Cartesian Coordinates with a World File and map image

I have a Java applet that allows users to import a JPEG and a world file from the local system. The user can then "click"-draw lines on the imported image. Each endpoint of each line contains a set of X/Y and Lat/Long values. The X/Y is standard Java coordinate space; the applet uses an affine transform calculation with the world file to determine the lat/long for every point on the canvas.
I have a requirement that allows a user to type a distance into a text field and use the arrow keys to draw a line in a certain direction (Up, Down, Left, Right) from a single selected point on the screen. I know how to determine the lat/long of a point given a source lat/long, distance, and bearing.
So if a user types "100" in the text field and presses the Right arrow key, a line should be drawn 100 feet to the right from the currently selected point.
My issue is that I don't know how to convert the distance (which is in feet) into a distance in pixels. This would then tell me where to plot the point.
tcarobruce,
You are correct. The inverse transform algorithm is what I needed. Since I use Java, I was able to replace my "home made" transform algorithm with the java.awt.geom.AffineTransform object, which has an inverse transform function.
This seems to have solved my issue.
Thanks.
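For anyone who wants to see what the inverse transform does, here is a small sketch in Python rather than Java. It assumes the six world file values are given in file order (A, D, B, E, C, F); the function names are made up. The idea is to compute the destination lat/long from the distance and bearing first, then run it through world_to_pixel to find where to plot it:

```python
def pixel_to_world(world_file, col, row):
    """Forward world file transform: pixel (col, row) -> map (x, y)."""
    a, d, b, e, c, f = world_file   # order of the six lines in the file
    return a * col + b * row + c, d * col + e * row + f


def world_to_pixel(world_file, x, y):
    """Inverse transform: map (x, y) -> pixel (col, row)."""
    a, d, b, e, c, f = world_file
    det = a * e - b * d
    if det == 0:
        raise ValueError("world file transform is not invertible")
    dx, dy = x - c, y - f
    return (e * dx - b * dy) / det, (a * dy - d * dx) / det
```

The pixel distance of the line is then just the distance between the source pixel and the pixel returned for the destination point.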
I guess you are certain your users are always uploading a raster image that is in the lat/lon wgs84 projection? Because in that case you can set a fixed coordinate transformation.
If you consider ever digitizing images from other sources with other projections, you might want to take a look at the open source geotools library: http://www.geotools.org/
