Handling incorrect non-binary labelling when labelling objects in a greyscale image by tile - python-3.x

I am working on a project where I have a model that does instance segmentation to segment nuclei in an image. The next step is to label these segmented nuclei. I am scaling the labelling by processing the image as tiles.
The issue I am facing now is coming up with a way to handle incorrect labelling: when an object gets split by the tiling, its pieces end up with different labels.
tile_size = 2048
# vec_arr is assumed to have shape (channels, height, width); walk over it in
# tile_size x tile_size windows.
for x in range(0, vec_arr.shape[2], tile_size):
    x_max = min(vec_arr.shape[2], x + tile_size)
    for y in range(0, vec_arr.shape[1], tile_size):
        y_max = min(vec_arr.shape[1], y + tile_size)
        tile = vec_arr[:, y:y_max, x:x_max]  # the tile that gets labelled
The above code shows how I am tiling an image. I am using this repo (https://github.com/MouseLand/cellpose/blob/master/cellpose/dynamics.py#L574) as the basis for labelling images, since I am using their network. I am looking for ideas on how I can identify objects that are connected across tiles and fill them with the same label.
Currently I maintain a running count of the number of objects labelled so far and start labelling each new tile from that value.
I am interested in knowing how I can identify the same object across tiles.
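For illustration, here is a minimal sketch of that running-counter approach, using skimage.measure.label as a stand-in for the actual per-tile labelling (the real project uses the cellpose code linked above, so treat this only as an outline):
import numpy as np
from skimage.measure import label

def label_tiles(mask, tile_size=2048):
    # mask: hypothetical 2-D binary array of nuclei; returns a labelled image in
    # which labels keep counting up from one tile to the next.
    out = np.zeros(mask.shape, dtype=np.int32)
    next_label = 0
    for y in range(0, mask.shape[0], tile_size):
        for x in range(0, mask.shape[1], tile_size):
            tile_lab = label(mask[y:y + tile_size, x:x + tile_size] > 0)
            tile_lab[tile_lab > 0] += next_label   # offset by objects seen so far
            next_label = max(next_label, int(tile_lab.max()))
            out[y:y + tile_size, x:x + tile_size] = tile_lab
    return out
An object that straddles a tile boundary still receives a different label in each tile, which is exactly the problem described above.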

This is not easy.
First of all you need an overlap in your tiling. Each tile should overlap the surrounding ones by some amount, which you then cut off when recomposing the larger image. The overlap amount should be at least the size of a nucleus, but preferably larger. The extra space is meant to guarantee that a nucleus that straddles the tile edge is detected identically in the two tiles where you can see it.
Next, when cutting off the overlap region and recomposing the larger image, a nucleus that straddles the tile edge (is partially in the overlap region) must be either preserved entirely or removed completely, depending on which tile it "belongs to". There are different ways to define this. For example, you can compute the centroid of the nucleus, determine which tile that centroid falls in, and remove the nucleus from the other tile.
Thus, each nucleus is detected in exactly one tile. However, if the overlap region is not large enough, then a detection for a nucleus might not have the same shape in the two overlapping tiles, leading to two different centroids for the same nucleus. In this case, the nucleus could be perceived as not part of either tile, or part of both tiles. It is important to understand the detection algorithm, so that you can find the right overlap size that will guarantee identical detection for the two tiles.
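A rough sketch of that overlap-and-centroid stitching, again using skimage as a stand-in for the real per-tile labelling (tile_size and overlap are illustrative; the overlap must be chosen large enough for the detector, as discussed above):
import numpy as np
from skimage.measure import label, regionprops

def stitch_labels(mask, tile_size=2048, overlap=256):
    # Label a large binary mask tile by tile; a nucleus is kept only by the tile
    # whose core region (the tile without its overlap border) contains its centroid.
    out = np.zeros(mask.shape, dtype=np.int32)
    next_label = 0
    for y0 in range(0, mask.shape[0], tile_size):
        for x0 in range(0, mask.shape[1], tile_size):
            # expand the tile by the overlap on every side, clipped to the image
            ys, ye = max(y0 - overlap, 0), min(y0 + tile_size + overlap, mask.shape[0])
            xs, xe = max(x0 - overlap, 0), min(x0 + tile_size + overlap, mask.shape[1])
            lab = label(mask[ys:ye, xs:xe] > 0)
            for region in regionprops(lab):
                cy, cx = region.centroid            # centroid in tile coordinates
                gy, gx = cy + ys, cx + xs           # centroid in image coordinates
                # keep the nucleus only if its centroid lies in this tile's core
                if y0 <= gy < y0 + tile_size and x0 <= gx < x0 + tile_size:
                    next_label += 1
                    coords = region.coords          # (row, col) pixels of the nucleus
                    out[coords[:, 0] + ys, coords[:, 1] + xs] = next_label
    return out
Because every centroid falls in exactly one core region, each nucleus ends up with exactly one label, provided the overlap is large enough that both tiles see the same shape.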

Related

Detecting center and area of shapes in an image

I am working with the GD library, and I'm looking for a way to detect the pixel nearest to the centre of each shape, as well as the total area used by each shape, in a monochrome black-and-white image.
I'm having difficulty coming up with an efficient algorithm to do this. If you have done something similar to this in the past, I'd be grateful for any solution that would help.
Check out the binary image library
Essentially, Otsu threshold to separate out foreground from background, then label connected components. That particular image looks very clean but you might need morph ops to clean it up a bit and get rid of small holes and other artifacts.
Then you have the area trivially (count the pixels in the component) or almost as trivially (use the weighted area function that penalises edge pixels). The centre is just the mean of the pixel coordinates.
http://malcolmmclean.github.io/binaryimagelibrary/
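The same pipeline sketched in Python with scikit-image (the linked library is C, so this is only an illustration of the steps: Otsu threshold, connected components, area as a pixel count, centroid as the mean pixel position; "dots.png" is a placeholder filename):
from skimage.io import imread
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

img = imread("dots.png", as_gray=True)
binary = img < threshold_otsu(img)        # assumes dark shapes on a light background
labels = label(binary)                    # connected component labelling
for region in regionprops(labels):
    print("area:", region.area, "centroid:", region.centroid)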
@MalcolmMcLean is right, but there are remaining difficulties (if you are after maximum accuracy).
If you threshold with Otsu, there are a few pairs of "kissing" dots which will form a single blob using connected component analysis.
In addition, Otsu thresholding will discard some of the partially filled edge pixels, so the weighted averages will be inaccurate. A cure would be to increase the threshold (up to 254 is possible), but that worsens the problem of the kissing dots.
A workaround is to keep a low threshold and dilate the blobs individually to obtain suitable masks that cover all edge pixels. Even so, slight inaccuracies will result in the vicinity of the kissing points.
Blob splitting by the watershed transform is also possible, but more care is required to handle the shared pixels. I doubt that a perfect solution is possible.
An alternative is the use of subpixel edge detection and least-squares circle fitting (after blob detection with a very low threshold to separate the dots). By avoiding the edge pixels common to two circles, you can probably achieve excellent results.
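For reference, a minimal sketch of the least-squares circle fit mentioned above (the algebraic Kasa fit), assuming the subpixel edge detection has already produced the (x, y) edge samples for one dot:
import numpy as np

def fit_circle(x, y):
    # Fit x^2 + y^2 + a*x + b*y + c = 0 in the least-squares sense, then recover
    # the centre and radius from the coefficients.
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    a, b, c = np.linalg.lstsq(A, rhs, rcond=None)[0]
    cx, cy = -a / 2.0, -b / 2.0
    r = np.sqrt(cx**2 + cy**2 - c)
    return cx, cy, r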

Three.js ParticleSystem flickering with large data

Back story: I'm creating a Three.js-based 3D graphing library, similar to sigma.js but in 3D. It's called graphosaurus and the source can be found here. I'm using Three.js, with a single particle representing each node in the graph.
This was the first task I had to deal with: given an arbitrary set of points (each with X, Y, Z coordinates), determine the optimal camera position (X, Y, Z) that can view all the points in the graph.
My initial solution (which we'll call Solution 1) involved calculating the bounding sphere of all the points and then scaling it to a sphere of radius 5 around the point (0, 0, 0). Since the points are then guaranteed to always fall in that region, I can set a static position for the camera (assuming the FOV is static) and the data will always be visible. This works well, but it either requires changing the point coordinates the user specified, or duplicating all the points, neither of which is great.
My new solution (which we'll call Solution 2) involves not touching the coordinates of the input data, but instead positioning the camera to match the data. I encountered a problem with this solution: for some reason, when dealing with really large data, the particles seem to flicker when positioned in front of or behind other particles.
Here are examples of both solutions. Make sure to move the graph around to see the effects:
Solution 1
Solution 2
You can see the diff for the code here
Let me know if you have any insight on how to get rid of the flickering. Thanks!
It turns out that my near value for the camera was too low and the far value was too high, resulting in "z-fighting". By narrowing these values to fit my dataset, the problem went away. Since the dataset is user dependent, I need to determine an algorithm to generate these values dynamically.
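One possible way to derive those values dynamically (sketched in Python purely to show the arithmetic; the few lines translate directly to the Three.js code) is to base the near and far planes on the camera's distance to the data's bounding sphere, which Solution 1 already computes:
import math

def near_far_from_bounding_sphere(camera_pos, center, radius, margin=1.1):
    # Distance from the camera to the centre of the data.
    d = math.dist(camera_pos, center)
    # Near plane just in front of the closest possible point, far plane just beyond
    # the farthest one; clamp near to a small positive value to keep depth precision.
    near = max(d - radius * margin, 0.1)
    far = d + radius * margin
    return near, far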
I noticed that in Solution 2 the flickering only occurs while the camera is moving. One possible reason is that, when the camera position is changing rapidly, different transforms get applied to different particles. So if the camera moves from X to X + DELTAX during a time step, one set of particles gets the camera transform for X while the others get the transform for X + DELTAX.
If you separate your rendering from the user interaction, that should fix the issue, assuming this is indeed the issue. That means applying the same transform to all the particles and the edges connecting them, by locking (not updating) the transform matrix until the rendering loop is done.

Preventing pixelshader overdraw for a single ERG

Background
Using gluTess to build a triangle list in Direct3D9 from a GDI+ DrawString(..) path:
A pixel shader (v3.0) is then used to fill in the shape. When painting with opaque values, everything looks fine:
The problem
At certain font sizes, if the color has an alpha component (i.e. ARGB #55FFFFFF), we begin to see these nasty tessellation artifacts where triangles may overlap ever so slightly:
At larger font sizes the problem is sometimes not present:
Using Intel's excellent GPA Frame Analyzer Pixel History tool, we can see that in areas where the artifacts occur, the pixel has been "touched" 3 times by the single Erg.
I'm trying to figure out how I can stop my pixel shader from touching the same pixel more than once.
Other solutions relating to overdraw prevention seem to be all about z-buffer strategies; however, this problem is more to do with the painting of a single 2D triangle list within a single pixel shader pass.
I'm at a bit of a loss trying to come up with a solution on this one. I was hoping that HLSL might have some sort of "touch each pixel only once" flag, but I've been unable to find anything like that. The closest I've found was to set the BLENDOP to MAX instead of ADD, but then the output is not correct when blending over other colors in the scene.
I also have SRCBLEND = ONE, DSTBLEND = INVSRCALPHA, which is the only combination of flags that produces correct output (albeit with the overdraw artifacts).
I have played with SEPARATEALPHABLENDENABLE in the GPA Frame Analyzer, which sounded like almost exactly what I need here (set blending to MAX, but only on the "alpha" channel); however, from what I can determine, that setting (and the corresponding BLENDOPALPHA) affects nothing at all.
One final thing I thought of was to bake the text as opaque onto a texture, and then repaint that texture into the scene with the appropriate alpha value applied. However, this doesn't actually work in this project, because I also support gradient brushes whose stop values may contain alpha, meaning either the artifacts would still be seen, or the final output would be just plain wrong if we stripped the alpha away from the stop values before baking to a texture. Also, the whole endeavor would be hideously expensive.
Any hints or pointers would be appreciated. Thanks for reading.
The problem you're seeing shouldn't happen.
If two of your triangles are overlapping it's because you've placed the vertices in such a way that when the adjacent triangles are drawn, they overlap. What's probably happening is that these two adjacent triangles share two vertices, but each triangle has its own copy of each vertex that's been calculated to be in a very, very slightly different position.
The solution to the problem isn't to try to make the pixel shader touch each pixel only once; it's to use an index buffer (if you aren't already) and have the shared vertices between adjacent triangles actually be the same vertex, rather than each triangle using a copy that's ever so slightly out of place.
If you aren't in control of the tessellation algorithm being used, you may have to run a pass over the vertex buffer after it's been generated to detect and merge vertices that are within some very small tolerance of one another. Even without an index buffer, a naive solution would be this:
For each vertex in the vertex buffer, compare its position to every other vertex in the rest of the vertex buffer.
If two vertices are within some small tolerance of one another, replace the second vertex's position with the position of the one you are comparing it against.
This should have the effect of pairing up the positions of two vertices if they are close enough that you deem them to be the same.
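A rough sketch of that naive welding pass (written in Python only to illustrate the algorithm; a real implementation would run over the actual vertex buffer, and a spatial hash would avoid the O(n^2) comparison):
def weld_vertices(vertices, tolerance=1e-5):
    # vertices: list of (x, y, z) tuples; near-duplicate positions are snapped
    # to the first occurrence so adjacent triangles share identical coordinates.
    welded = list(vertices)
    for i in range(len(welded)):
        for j in range(i + 1, len(welded)):
            dx = welded[j][0] - welded[i][0]
            dy = welded[j][1] - welded[i][1]
            dz = welded[j][2] - welded[i][2]
            if dx * dx + dy * dy + dz * dz <= tolerance * tolerance:
                welded[j] = welded[i]
    return welded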
You now shouldn't have any problem with overlapping triangles. In everyday rendering, two triangles share edges with each other all the time and you won't ever get the effect where they appear to ever so slightly overlap. The hardware guarantees that a sample point is either on one side of the line or the other, but never both at the same time, no matter how close the point is to the line (even if it's mathematically on the line, it still falls on one side or the other).

Efficient data structure for nearest neighbour search in a tiled context

I am looking for a data structure to store irregular elevation data {xi, yi, zi} that facilitates fast look-up of the points within an xy range.
From what I gather, a k-d tree should be suitable for this? And also fairly simple to implement?
However, the number of points in the elevation dataset may be enormous, so it may not be possible to process all the points in one go. Instead, I aim to divide the xy region into tiles and process each tile separately:
The points within the green rectangle are those needed for tile 1. When I move on to tile 2, I will need the points within a green rectangle centered around tile 2. The two rightmost points in the green rectangle around tile 1 will still be needed; the other points could be swapped out of memory if necessary. In addition, 4 more points will be needed for tile 2.
A k-d tree may therefore not be optimal, since it would require me to rebuild the complete tree for each new tile? Would an R-tree be a better choice?
The points themselves should be stored on disk in some clever format and read into memory just before they are needed. Before I start processing tile 1, I could tell the data structure maintaining the points that I will be needing tile 2 next, and it could then begin to read the necessary points from disk in a separate thread.
I was considering using smaller tiles for loading points into the data structure. For instance, the points in the figure could be divided into 16x16 sub-tiles.
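A sketch of that sub-tile bucketing idea (in Python for brevity, since only the structure matters): points are grouped by integer sub-tile coordinates, and a query for a tile plus its surrounding margin only touches the buckets it overlaps, so everything else can stay on disk.
from collections import defaultdict

def build_buckets(points, cell_size):
    # points: iterable of (x, y, z); group them by integer sub-tile coordinates.
    buckets = defaultdict(list)
    for x, y, z in points:
        buckets[(int(x // cell_size), int(y // cell_size))].append((x, y, z))
    return buckets

def query_range(buckets, cell_size, xmin, xmax, ymin, ymax):
    # Return all points whose sub-tile overlaps the given xy range
    # (the "green rectangle" around a tile).
    result = []
    for i in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for j in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            result.extend(buckets.get((i, j), []))
    return result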
Are there any libraries in C/C++ that implement this functionality?

Do I need to rectify if camera planes are aligned?

If I am taking images from a pair of cameras whose principal axes (in both cameras) are perpendicular to the baseline, do I need to rectify the images? A typical example would be the Bumblebee stereo cameras.
If you can also guarantee that:
the camera axes are parallel (maybe so, if bought as a single package like the Bumblebee)
you have no lens distortion (probably not)
all the other internal camera parameters are identical
your measurement axis is parallel to your baseline
then you might be able to skip image rectification. Personally I wouldn't.
Just think about lens distortion. Even assuming everything else is equal and aligned, this might mess things up. Suppose a feature appears on the edge of one image and at the centre of the other. At the edge it might be distorted a few pixels away, while at the centre it appears where it should. Without rectification, your stereoscopic calculation (which assumes straight lines from object to sensor) is going to give you bad results.
Depends what you mean by "rectify". In stereo vision, it is common to ensure that the epipolar lines are aligned too. That means the i-th row in image 1 corresponds to the i-th row in image 2. An optional step is to reduce distortion caused by the rectification process.
If you are taking images from a pair of cameras whose principal axes are perpendicular to the baseline, then the epipoles are mapped to infinity (parallel epipolar lines in each image). You still need another transform to align the epipolar lines in both images. You will find this transform in Loop & Zhang's paper, as well as the transform to reduce distortion.
And be careful about lens distortion (see wxffles' answer).
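For completeness, here is roughly what rectification looks like with OpenCV in Python, assuming you already have the intrinsics, distortion coefficients, and the relative pose from calibration (all argument names here are placeholders):
import cv2

def rectify_pair(img_left, img_right, K1, d1, K2, d2, R, T):
    # K1/K2 and d1/d2: camera matrices and distortion coefficients; R, T: rotation
    # and translation of the second camera relative to the first.
    size = (img_left.shape[1], img_left.shape[0])   # (width, height)
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, size, R, T)
    map1x, map1y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, size, cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, size, cv2.CV_32FC1)
    rect_left = cv2.remap(img_left, map1x, map1y, cv2.INTER_LINEAR)
    rect_right = cv2.remap(img_right, map2x, map2y, cv2.INTER_LINEAR)
    return rect_left, rect_right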
