Object recognition in 3D images - object

I have a 3D video that I have broken down into single images in 7 different planes. I am wondering what tools can I use for object detection. I read that OpenCV might not be the right tool for that, what could I use instead?
Regards
Aleksej

OpenCV can be used for segmentation on 3D data as long as it can be represented as a depth map (normally the information of the Z-axis in camera coordinate).
If you have depth data as a cv::Mat, you can run segmentation (region-growing, watershed, etc) on the depth data to get segmented objects.
It is assumed that the edges are distinguishable and unique between objects ofcourse.
As a pre-processing step, you can also smoothen the edges with some morphological operations to make the segmentation better.

Related

Is there a way to define the camera parameters myself?

I am working with Shapenet dataset which contains 3D information. I want to create images out of that dataset by defining the camera intrinsics and extrinsics on my own (so its like I will be defining where my camera is with respect to object and what will be the focal length and optical center of camera). Is there a concrete way by which I can pick some values for these ?
PS : I can load the shapenet models in some 3D viewing software and if I could extract the camera parameters might be using (since at any particular time, it is showing me an image)

VTK - create 3D model

I'm trying to create a 3D mask model from the 3D coordinate points that are stored in the txt file. I use the Marching cubes algorithm. It looks like it´s not able to link individual points, and therefore holes are created in the model.
Steps: (by https://lorensen.github.io/VTKExamples/site/Cxx/Modelling/MarchingCubes/)
First, load 3D points from file as vtkPolyData.
Then, use vtkVoxelModeller
Put voxelModeller output to MC algorithm and finally visualize
visualization
Any ideas?
Thanks
The example takes a spherical mesh (a.k.a. a set of triangles forming a sealed 3D shape), converts it to a voxel representation (a 3D image where the voxels outside the mesh are black and those inside are not) then converts it back to a mesh using Marching Cubes algorithm. In practice the input and output of the example are very similar meshes.
In your case, you load the points and try to create a voxel representation of them. The problem is that your set of points is not sufficient to define a volume, they are not a sealed mesh, just a list of points.
In order to replicate the example you should do the following:
1) building a 3D mesh from your points (you gave no information of what the points are/represent so I can't help you much with this task). In other words you need to tell how these points are connected between then to form a 3D shape (vtkPolyData). VTK can't guess how your points are connected, you have to tell it.
2) once you have a mesh, if you need a voxel representation (vtkImageData) of it you can use vtkVoxelModeller or vtkImplicitModeller. At this point you can use vtk filters that need a vtkImageData as input.
3) finally in order to convert voxels back to a mesh (vtkPolyData) you can use vtkMarchingCubes (or better vtkFlyingEdges3D that is a very similar algorithm but much faster).
Edit:
It is not clear what the shape you want should be, but you can try to use vtkImageOpenClose3D so the steps are:
First, load 3D points from file as vtkPolyData.
Then, use vtkVoxelModeller
Put voxelModeller output to vtkImageOpenClose3D algorithm, then vtkImageOpenClose3D algorithm output to MC (change to vtkFlyingEdges3D) algorithm and finally visualize
Example for vtkImageOpenClose3D:
https://www.vtk.org/Wiki/VTK/Examples/Cxx/Images/ImageOpenClose3D

Why is a normal vector necessary for STL files?

STL is the most popular 3d model file format for 3d printing. It records triangular surfaces that makes up a 3d shape.
I read the specification the STL file format. It is a rather simple format. Each triangle is represented by 12 float point number. The first 3 define the normal vector, and the next 9 define three vertices. But here's one question. Three vertices are sufficient to define a triangle. The normal vector can be computed by taking the cross product of two vectors (each pointing from a vertex to another).
I know that a normal vector can be useful in rendering, and by including a normal vector, the program doesn't have to compute the normal vectors every time it loads the same model. But I wonder what would happen if the creation software include wrong normal vectors on purpose? Would it produce wrong results in the rendering software?
On the other hand, 3 vertices says everything about a triangle. Include normal vectors will allow logical conflicts in the information and increase the size of file by 33%. Normal vectors can be computed by the rendering software under reasonable amount of time if necessary. So why should the format include it? The format was created in 1987 for stereolithographic 3D printing. Was computing normal vectors to costly to computers back then?
I read in a thread that Autodesk Meshmixer would disregard the normal vector and graph triangles according to the vertices. Providing wrong normal vector doesn't seem to change the result.
Why do Stereolithography (.STL) files require each triangle to have a normal vector?
At least when using Cura to slice a model, the direction of the surface normal can make a difference. I have regularly run into STL files that look just find when rendered as solid objects in any viewer, but because some faces have the wrong direction of the surface normal, the slicer "thinks" that a region (typically concave) which should be empty is part of the interior, and the slicer creates a "top layer" covering up the details of the concave region. (And this was with an STL exported from a Meshmixer file that was imported from some SketchUp source).
FWIW, Meshmixer has a FlipSurfaceNormals tool to help deal with this.

How to compute a 3d miniature model from a large set of 3d geometric models

i want to import a set of 3d geometries in to current scene, the imported geometries contains tons of basic componant which may represent an
entire building. The Product Manager want the entire building to be displayed
as a 3d miniature(colors and textures must corrosponding to the original building).
The problem: Is there any algortithms which can handle these large amount of datasin a reasonable time and memory cost.
//worst case: there may be a billion triangle surfaces in the imported data
And, by the way, i am considering another solotion: using a type of textue mapping:
1 take enough snapshots by the software render of the imported objects.
2 apply the images to a surface .
3 use some shader tricks to perform effects like bump-mapping---when the view posisition changed, the texture will alter and makes the viewer feels as if he was looking at a 3d scene.
----my modeller and render are ACIS and hoops, any ideas?
An option is to generate side views of the building at a suitable resolution, using the rendering engine and map them as textures to a parallelipipoid.
The next level of refinement is to obtain a bump or elevation map that you can use for embossing. Not the easiest to do.
If the modeler allows it, you can slice the volume using a 2D grid of "voxels" (actually prisms). You can do that by repeatedly cutting the model in two with a plane. And in every prism, find the vertex closest to the observer. This will give you a 2D map of elevations, with the desired resolution.
Alternatively, intersect parallel "rays" (linear objects) with the solid and keep the first endpoint.
It can also be that your modeler includes a true voxel model, or that rendering can be zone with a Z-buffer that you can access.

Geometric/Shape Recognition ( Odd Shape )

I would like to do some odd geometric/odd shape recognition. But I'm not sure how to do it.
Here's what I have so far:
Convert RGB image to Monochrome.
Otsu Threshold
Hough Transform.
I'm not sure what to do next.
For geometric information, you could do a raster to vector conversion to convert your image into coordinated vectors (lines and points) and finite element analysis to look for known shapes. Not easy but libraries should be available for both.
Edit: Note that there are sometimes easier practical solutions, but they depend on the image and types of errors. For example, removing perspective, identifying a 3d object from a 2d image, significance of colour, etc... You often see registration markers added to the real world object to overcome
this and allow much easier identification. Looking up articles on feature extraction techniques might help.

Resources