I was reading the source of the random walker algorithm in the scikit-image library, and it's written there that:
Parameters
data : array_like
Image to be segmented in phases. Gray-level `data` can be two- or
three-dimensional
My question is: what do they mean by 3D gray-level image?
A 2D image is an image that is indexed by (x,y). A 3D image is an image that is indexed by (x,y,z).
A digital image samples the real world in some way. Photography produces a 2D projection of the 3D world; the digital photograph is a sampling of that projection. But other imaging modalities do not project, and can sample all three dimensions of the 3D world. For example:
Confocal microscopy
Computed tomography
Magnetic resonance imaging
Besides these, a 2D time series (a movie) is sometimes also treated as a 3D image, so that algorithms designed for 3D images can be applied to it.
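To make this concrete, here is a small sketch (my own, not from the scikit-image docs) showing that skimage.segmentation.random_walker accepts a 3D gray-level volume directly; the synthetic noisy ball and the seed positions are placeholder assumptions.

    import numpy as np
    from skimage.segmentation import random_walker

    # Synthetic 3D gray-level volume: a bright ball on a dark, noisy background.
    z, y, x = np.mgrid[:50, :50, :50]
    volume = ((x - 25) ** 2 + (y - 25) ** 2 + (z - 25) ** 2 < 12 ** 2).astype(float)
    volume += 0.2 * np.random.randn(*volume.shape)

    # Seed labels: 1 = object, 2 = background, 0 = unlabeled voxels to be filled in.
    labels = np.zeros(volume.shape, dtype=np.uint8)
    labels[25, 25, 25] = 1
    labels[2, 2, 2] = 2

    segmentation = random_walker(volume, labels, beta=10)
    print(segmentation.shape)  # (50, 50, 50): one label per voxel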
3D density maps can of course be plotted as heatmaps, but in this case the data itself is homogeneous (near 0) except for a small part (a 2D cross-section, for example):
This should give a letter 'E' shape as the 2D "model". The original data is not saved as a point cloud, however.
A naive approach would be to take the pixels above a certain threshold value and then smooth the border. However, this does not take into account that the border pixels have small values.
Another would be to use some point-cloud based algorithms that come with modeling software, but then the point cloud's probability function would still be discontinuous at pixel borders, and it would not take into account that only one side has signal.
Is there any tested solution to this (the example is 2D; the actual case is many 2D slices that compose a low-resolution 3D density map)? I was thinking of making border pixels have an area proportional to their signal value, with the border defined from the gradient. Any suggestions?
I am hoping for model/visualization results similar to this (which seems to be based on an established point-cloud algorithm):
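For what it's worth, here is a minimal sketch of the naive thresholding idea above, using skimage.measure.find_contours, which linearly interpolates the iso-level between pixels so that a border pixel with a small value pulls the contour inward rather than being cut at the pixel edge. The synthetic 'E'-shaped array and the threshold level are placeholder assumptions, not the actual data.

    import numpy as np
    from skimage import measure

    # Placeholder 2D slice of a density map with an "E"-like region of signal.
    density = np.zeros((100, 100))
    density[20:80, 20:35] = 1.0
    density[20:30, 35:70] = 1.0
    density[45:55, 35:60] = 1.0
    density[70:80, 35:70] = 1.0

    level = 0.5  # threshold; an assumption, tune for the real data
    contours = measure.find_contours(density, level)
    for c in contours:
        print(c.shape)  # (N, 2) sub-pixel (row, col) border coordinates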
I have a 3D video that I have broken down into single images in 7 different planes. I am wondering what tools I can use for object detection. I read that OpenCV might not be the right tool for that; what could I use instead?
Regards
Aleksej
OpenCV can be used for segmentation on 3D data as long as it can be represented as a depth map (normally the Z-axis information in camera coordinates).
If you have depth data as a cv::Mat, you can run segmentation (region-growing, watershed, etc) on the depth data to get segmented objects.
This assumes, of course, that the edges between objects are distinguishable and unique.
As a pre-processing step, you can also smooth the edges with some morphological operations to improve the segmentation.
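A minimal OpenCV-Python sketch of the idea above: smooth the depth map with morphological opening/closing, then run a simple segmentation (here Otsu thresholding plus connected components, a simpler stand-in for watershed or region growing). The file name and the single-channel depth format are assumptions.

    import cv2
    import numpy as np

    # Hypothetical single-channel depth map (e.g. a 16-bit PNG exported from the video).
    depth = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED)

    # Morphological opening/closing to smooth the edges before segmentation.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    smoothed = cv2.morphologyEx(depth, cv2.MORPH_OPEN, kernel)
    smoothed = cv2.morphologyEx(smoothed, cv2.MORPH_CLOSE, kernel)

    # Rescale to 8-bit, split foreground/background, and label connected regions.
    depth8 = cv2.normalize(smoothed, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, fg = cv2.threshold(depth8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    num_labels, labels = cv2.connectedComponents(fg)
    print(num_labels - 1, "segmented regions (label 0 is background)")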
I'm trying to create a 3D mask model from 3D coordinate points that are stored in a text file. I use the Marching Cubes algorithm. It looks like it's not able to link the individual points, and therefore holes are created in the model.
Steps (following https://lorensen.github.io/VTKExamples/site/Cxx/Modelling/MarchingCubes/):
First, load the 3D points from the file as vtkPolyData.
Then, use vtkVoxelModeller.
Put the vtkVoxelModeller output into the Marching Cubes algorithm and finally visualize the result.
(screenshot of the visualization)
Any ideas?
Thanks
The example takes a spherical mesh (that is, a set of triangles forming a sealed 3D shape), converts it to a voxel representation (a 3D image where the voxels outside the mesh are black and those inside are not), then converts it back to a mesh using the Marching Cubes algorithm. In practice, the input and output of the example are very similar meshes.
In your case, you load the points and try to create a voxel representation of them. The problem is that your set of points is not sufficient to define a volume; they are not a sealed mesh, just a list of points.
In order to replicate the example you should do the following:
1) Build a 3D mesh from your points (you gave no information about what the points are or represent, so I can't help you much with this task). In other words, you need to specify how these points are connected to each other to form a 3D shape (vtkPolyData). VTK can't guess how your points are connected; you have to tell it.
2) Once you have a mesh, if you need a voxel representation (vtkImageData) of it, you can use vtkVoxelModeller or vtkImplicitModeller. At this point you can use VTK filters that take a vtkImageData as input.
3) Finally, in order to convert the voxels back to a mesh (vtkPolyData), you can use vtkMarchingCubes (or better, vtkFlyingEdges3D, which is a very similar algorithm but much faster).
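As a reference, here is a VTK-Python sketch of steps 2) and 3), using a vtkSphereSource as a stand-in for the sealed mesh you would have to build in step 1); the sample dimensions, maximum distance and iso-value roughly follow the linked example and may need tuning for your data.

    import vtk

    # Stand-in for step 1): a closed surface mesh (replace with your own vtkPolyData).
    sphere = vtk.vtkSphereSource()
    sphere.Update()
    mesh = sphere.GetOutput()
    b = mesh.GetBounds()
    pad = 0.1
    bounds = (b[0] - pad, b[1] + pad, b[2] - pad, b[3] + pad, b[4] - pad, b[5] + pad)

    # Step 2): mesh -> voxel representation (vtkImageData).
    voxel_modeller = vtk.vtkVoxelModeller()
    voxel_modeller.SetSampleDimensions(50, 50, 50)
    voxel_modeller.SetModelBounds(*bounds)
    voxel_modeller.SetScalarTypeToFloat()
    voxel_modeller.SetMaximumDistance(0.1)
    voxel_modeller.SetInputData(mesh)

    # Step 3): voxels -> mesh again, with vtkFlyingEdges3D instead of vtkMarchingCubes.
    surface = vtk.vtkFlyingEdges3D()
    surface.SetInputConnection(voxel_modeller.GetOutputPort())
    surface.ComputeNormalsOn()
    surface.SetValue(0, 0.5)  # iso-value
    surface.Update()
    print(surface.GetOutput().GetNumberOfPoints(), "points in the reconstructed mesh")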
Edit:
It is not clear what shape you want, but you can try vtkImageOpenClose3D, so the steps become:
First, load the 3D points from the file as vtkPolyData.
Then, use vtkVoxelModeller.
Put the vtkVoxelModeller output into vtkImageOpenClose3D, then put the vtkImageOpenClose3D output into Marching Cubes (change it to vtkFlyingEdges3D), and finally visualize.
Example for vtkImageOpenClose3D:
https://www.vtk.org/Wiki/VTK/Examples/Cxx/Images/ImageOpenClose3D
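A self-contained VTK-Python sketch of the edited pipeline (points -> vtkVoxelModeller -> vtkImageOpenClose3D -> vtkFlyingEdges3D). The point list, sample dimensions, kernel size and iso-value are placeholder assumptions, and the vtkImageThreshold step, which binarizes the voxel image so that open/close has two values to work on, is an extra step not mentioned above.

    import vtk

    # Placeholder points; replace with the points loaded from your text file.
    points = vtk.vtkPoints()
    for x, y, z in [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]:
        points.InsertNextPoint(x, y, z)
    poly = vtk.vtkPolyData()
    poly.SetPoints(points)

    voxel_modeller = vtk.vtkVoxelModeller()
    voxel_modeller.SetSampleDimensions(64, 64, 64)
    voxel_modeller.SetModelBounds(-0.5, 1.5, -0.5, 1.5, -0.5, 1.5)
    voxel_modeller.SetScalarTypeToFloat()
    voxel_modeller.SetMaximumDistance(0.2)
    voxel_modeller.SetInputData(poly)

    # Binarize the voxel image so the morphological open/close has two values.
    threshold = vtk.vtkImageThreshold()
    threshold.SetInputConnection(voxel_modeller.GetOutputPort())
    threshold.ThresholdByUpper(0.5)
    threshold.SetInValue(255)
    threshold.SetOutValue(0)
    threshold.SetOutputScalarTypeToUnsignedChar()

    # Morphological closing/opening to bridge gaps between isolated points.
    open_close = vtk.vtkImageOpenClose3D()
    open_close.SetInputConnection(threshold.GetOutputPort())
    open_close.SetOpenValue(0)
    open_close.SetCloseValue(255)
    open_close.SetKernelSize(5, 5, 5)

    surface = vtk.vtkFlyingEdges3D()
    surface.SetInputConnection(open_close.GetOutputPort())
    surface.ComputeNormalsOn()
    surface.SetValue(0, 127.5)  # iso-value between 0 and 255
    surface.Update()
    print(surface.GetOutput().GetNumberOfPoints(), "points in the reconstructed mesh")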
I'm working on implementing Ankush Gupta's synthetic data generation pipeline (http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf). In his work, he used a convolutional neural network to extract a point cloud from a 2-dimensional scenery image, segmented the point cloud to isolate different planes, used RANSAC to fit a 3D plane to each point-cloud segment, and then warped the pixels of the segment, given the 3D plane, to a fronto-parallel view.
I'm stuck on this last part: warping my extracted 3D plane to a fronto-parallel view. I have X, Y, and Z vectors as well as a normal vector. I'm thinking that what I need to do is perform some type of perspective transform or rotation that would bring all the pixels on the plane to zero on the Z axis while X and Y remain the same. I could be wrong about this; it's been a long time since I've had any formal training in geometry or linear algebra.
It looks like skimage's ProjectiveTransform requires me to know the dimensions of the final segment coordinates in 2D space. It looks like AffineTransform requires me to know the rotation. All I have at this point is my X, Y, Z and normal vector, and the suspicion that I may know my destination plane by just setting the Z axis to all zeros. I'm not sure if my assumption is correct, but I need to be able to warp all the pixels in the segment of interest to fronto-parallel, fit a bounding box, place text inside of it, and then warp the final segment back to the original perspective in 3D space.
Any help with how to think about this or implement it would be massively useful.
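Not the paper's exact method, but one way to make the "rotate the plane so Z becomes constant" idea concrete: build the rotation that aligns the fitted plane normal with the Z axis (Rodrigues' formula) and apply it to the 3D points. The normal and the points below are placeholders for the RANSAC outputs.

    import numpy as np

    def rotation_aligning_normal_to_z(normal):
        """Return a 3x3 rotation matrix R such that R @ normal is parallel to +Z."""
        n = normal / np.linalg.norm(normal)
        z = np.array([0.0, 0.0, 1.0])
        axis = np.cross(n, z)
        s = np.linalg.norm(axis)   # sin(angle)
        c = np.dot(n, z)           # cos(angle)
        if s < 1e-8:               # normal already (anti-)parallel to Z
            return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
        k = axis / s
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        return np.eye(3) + s * K + (1.0 - c) * (K @ K)  # Rodrigues' formula

    # Placeholder plane: points spanned by two in-plane directions orthogonal to the normal.
    normal = np.array([0.2, -0.3, 0.93])
    u = np.cross(normal, [0.0, 0.0, 1.0])
    v = np.cross(normal, u)
    uv = np.random.rand(100, 2)
    points = uv[:, :1] * u + uv[:, 1:] * v   # 100 points lying on the plane through the origin

    R = rotation_aligning_normal_to_z(normal)
    fronto = points @ R.T                    # after rotation, the Z column is ~constant
    print(np.ptp(fronto[:, 2]))              # ~0: the plane is now fronto-parallel

From the fronto-parallel coordinates you could then fit the bounding box, and applying the inverse rotation (R.T) would take the result back to the original plane in 3D space.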
So correct me if I'm wrong, but I think all elements in 3D graphics are meshes.
So the question is really: how do you take mesh data and create a 2D projection of it, based on the mesh data, the camera location, the rotations of the camera and mesh, etc.?
I realize this is fairly complicated and I would be satisfied just knowing the technical term for this, so I can search and research it.
You can read about 3D projection on Wikipedia.
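As a rough illustration of what that article describes, here is a tiny NumPy sketch that projects mesh vertices to 2D: transform the vertices into camera coordinates, then apply the perspective divide. The cube vertices, camera pose and focal length are placeholder assumptions.

    import numpy as np

    # Stand-in mesh: the 8 vertices of a cube centred at the origin.
    vertices = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                        dtype=float)

    camera_pos = np.array([0.0, 0.0, 5.0])  # camera placed on +Z, looking toward the origin
    R = np.eye(3)                           # camera rotation (identity: no rotation)
    f = 2.0                                 # focal length of the pinhole model

    cam = (vertices - camera_pos) @ R.T        # world coordinates -> camera coordinates
    projected = f * cam[:, :2] / -cam[:, 2:3]  # perspective divide onto the image plane
    print(projected)                           # 2D image-plane coordinates of each vertex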