I am supposed to determine the direction of a boat from drone imagery: whether it is docked from the front or from the back.
I tried thresholding the boat's bounding box to get a binary image, then splitting the bbox into two halves and counting the blue pixels in each half, since the half containing the bow should show more water in the image due to the boat's tapered (triangular) shape, but it didn't work.
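Roughly, the approach I tried looks like this (the file name and the HSV range used for "blue/water" below are just placeholders):

    import cv2

    # crop of the boat's bounding box from the drone frame (placeholder file name)
    img = cv2.imread("boat_bbox.png")
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    water = cv2.inRange(hsv, (90, 50, 50), (130, 255, 255))   # rough blue/water range

    # split into two halves and count the water pixels in each
    h, w = water.shape
    left, right = water[:, :w // 2], water[:, w // 2:]
    bow_is_left = cv2.countNonZero(left) > cv2.countNonZero(right)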
My question is: how can I determine the correct direction of the boat using image processing techniques?
I used this TensorFlow library to detect object orientation in many projects. You need to train a neural network on your images. Then it will predict both boat location and direction.
Use semantic segmentation to detect the dock, and a keypoint method to detect the boats. Keypoint detection is usually used for face recognition, but I think it would help in your case.
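A minimal sketch of how the two outputs might be combined, assuming the segmentation gives a binary dock mask and the keypoint model predicts a bow point and a stern point (all names here are hypothetical): whichever keypoint lies closer to the dock tells you from which end the boat is docked.

    import numpy as np

    def docked_end(dock_mask, bow, stern):
        # dock_mask: binary H x W array from the segmentation model
        # bow, stern: (x, y) keypoints predicted for the boat
        dock_xy = np.argwhere(dock_mask > 0)[:, ::-1]              # dock pixels as (x, y)
        d_bow = np.min(np.linalg.norm(dock_xy - np.asarray(bow), axis=1))
        d_stern = np.min(np.linalg.norm(dock_xy - np.asarray(stern), axis=1))
        return "front" if d_bow < d_stern else "back"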
I'm developing an application to identify cars crossing the white line in the parking lot using the CCTV camera system. I'm experiencing difficulty with some tilt-angle cameras.
My solution for the rear camera angles is to use the shapely library in Python to detect the intersection between the bounding box of the car recognized with the yolov7 model and the bounding box of the parking slot. The experimental results are pretty promising.
camera1
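For illustration, the intersection check with shapely looks roughly like this (the coordinates are made up):

    from shapely.geometry import box

    # axis-aligned car bbox from yolov7 and a parking-slot bbox from the JSON file
    car = box(612, 340, 890, 520)        # (xmin, ymin, xmax, ymax)
    slot = box(600, 300, 800, 700)

    overlaps = car.intersects(slot)
    overlap_ratio = car.intersection(slot).area / car.area   # share of the car inside the slot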
In the case of tilt-angle cameras, such as the one shown below, the car does not cross the line, but its bounding box does intersect the white line.
camera2
Does anyone have any suggestions on how to make this work? Thank you very much.
Note: my input is a 1920x1080 video; I use OpenCV to read frames and the yolov7-tiny.pt model to detect cars in each frame. The bounding boxes of the parking slots are predefined in a JSON file.
Note2: I used the yolov7-tiny.pt model in image 1. In image 2, I used the yolov7-mask.pt model.
If the yolov7-tiny.pt model gives you back not only the bounding box of the car but also the car's edges, as seen in the second image, you can approximate those edges with a polygon that follows the car's orientation:

    import cv2

    p = cv2.arcLength(cnt, True)                  # cnt holds the (x, y) points of the car's edges
    appr = cv2.approxPolyDP(cnt, 0.02 * p, True)  # simplified closed polygon around the car

appr then contains the bounding polygon of the car, aligned with the car's actual angle rather than with the image axes.
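A possible follow-up, using appr from the snippet above (the slot corners here are assumed, not taken from the question's JSON file): test this angle-aligned polygon against the parking slot instead of the axis-aligned bbox, so a tilted car only counts as crossing when its actual outline overlaps the line.

    from shapely.geometry import Polygon

    car_poly = Polygon(appr.reshape(-1, 2))                    # polygon from cv2.approxPolyDP above
    slot_poly = Polygon([(600, 300), (800, 300), (800, 700), (600, 700)])  # assumed slot corners

    crosses = car_poly.intersects(slot_poly)
    overlap_ratio = car_poly.intersection(slot_poly).area / car_poly.area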
I am an undergraduate student working on detecting defects on the surface of an object in a digital image, using image processing techniques. I am planning on using the OpenCV library for the image processing functions. Currently I am trying to decide which defect detection algorithm to use. This is one of my very first projects in this field, so I would appreciate any help with this issue. The reference image with a defect (missing teeth in the gear) that I am currently working with is linked below ("defective gear image").
defective gear image
Get the convex hull of the gear (which is a polygon) and shrink it slightly, so that it crosses the teeth. Make sure that the centroid of the gear is the fixed point of the shrinking.
Then sample the pixels along the hull, preferably using equidistant points (divide the perimeter by a multiple of the number of teeth). The unwrapped profile will be a dashed line, with missing dashes corresponding to missing teeth, and the problem is reduced to 1D.
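A rough Python/OpenCV sketch of that idea; the file name, the 0.9 shrink factor and the number of samples are assumptions:

    import cv2
    import numpy as np

    img = cv2.imread("gear.png", cv2.IMREAD_GRAYSCALE)          # assumed input file
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    gear = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(gear).reshape(-1, 2).astype(np.float32)

    # centroid of the gear is the fixed point of the shrinking
    M = cv2.moments(gear)
    c = np.array([M["m10"] / M["m00"], M["m01"] / M["m00"]], dtype=np.float32)
    shrunk = c + 0.9 * (hull - c)                                # hull shrunk so it cuts through the teeth

    # resample the shrunken hull at equidistant points along its perimeter
    closed = np.vstack([shrunk, shrunk[:1]])
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    dist = np.concatenate([[0.0], np.cumsum(seg)])
    t = np.linspace(0.0, dist[-1], 360, endpoint=False)          # ideally a multiple of the tooth count
    xs = np.interp(t, dist, closed[:, 0]).astype(int)
    ys = np.interp(t, dist, closed[:, 1]).astype(int)

    # 1D "dashed line": unusually long runs of zeros mark missing teeth
    profile = bw[ys, xs] > 0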
You can also try a polar unwrapping, making the outline straight, but you will need an accurate location of the center.
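For the polar route, cv2.warpPolar can do the unwrapping; again only a sketch with an assumed input image:

    import cv2

    bw = cv2.imread("gear_bw.png", cv2.IMREAD_GRAYSCALE)        # assumed binary gear image
    M = cv2.moments(bw, binaryImage=True)
    center = (M["m10"] / M["m00"], M["m01"] / M["m00"])          # this center must be accurate
    max_r = min(bw.shape) / 2

    # each output row is one angle and each column one radius, so the gear outline
    # becomes a near-vertical band with gaps where teeth are missing
    polar = cv2.warpPolar(bw, (256, 720), center, max_r, cv2.WARP_POLAR_LINEAR)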
I need to write a program that will detect a red square in an image. I would like to do this on my GPU using OpenGL-ES. I have no experience with GPU programming, and haven't found the answer through Google so far.
Is it possible to do this using OpenGL? Does OpenGL-ES give access to the whole matrix of pixels and their locations in that matrix, so that a program can go through the pixels and check the color value of each one?
Thank you.
First of all, you are mixing up a few terms: there is no "matrix of pixels" that a shader can iterate over.
If what you meant by that is a convolution, then yes, you can put the convolution in a fragment shader to detect edges. However, a fragment shader does not return data to your program, and there is no way to visit each pixel and read back its color value. Convolution would work if you just want the shader to draw the square's edges, but if you want to know whether a red square exists in the camera frame, that has to be calculated on the CPU, not on the GPU.
A bit of background
I am writing a simple ray tracer in C++. I have most of the core complete but don't understand how to retrieve the world coordinate of a pixel on the image plane. I need this location so that I can cast the ray into the world.
Currently I have a Camera with a position (aka my perspective reference point) and a direction vector, which is not normalized. The direction's length gives the distance to the center of the image plane, and the vector itself tells which way the camera is facing.
There are other values associated with the camera but they should not be relevant.
My image coordinates will range from -1 to 1, and the perspective (focal length) will change based on the length of the direction vector associated with the camera.
What I need help with
I need to go from pixel coordinates (say [0, 256] in an image 256 pixels on each side) to my world coordinates.
I also want to program this so that, no matter where the camera is placed and which way it is directed, I can find the pixel in world coordinates. (Currently the camera will almost always be centered at the origin and looking down the negative z axis. I would like to program this with those future changes in mind.) It is also important to know whether this code should be pushed down into my threaded code as well; otherwise it will be calculated by the main thread and the resulting ray will be used in the threaded code.
(source: in.tum.de)
I did not make this image and it is only there to give an idea of what I need.
Please leave comments if you need any additional info. Otherwise I would like a simple theory/code example of what to do.
Basically you have to apply the inverse of the view/projection (MVP) transform that maps a world-space point into the unit cube. Look at the following URLs for programming help:
http://nehe.gamedev.net/article/using_gluunproject/16013/
https://sites.google.com/site/vamsikrishnav/gluunproject
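A minimal numpy sketch of the same idea, written directly in terms of the camera basis rather than an explicit MVP matrix (the image size, camera placement and function name are assumptions for illustration):

    import numpy as np

    def pixel_to_world(px, py, width, height, cam_pos, cam_dir, world_up=(0.0, 1.0, 0.0)):
        d = np.asarray(cam_dir, dtype=float)
        focal = np.linalg.norm(d)                    # direction length = distance to image-plane centre
        forward = d / focal
        right = np.cross(forward, world_up)
        right /= np.linalg.norm(right)
        up = np.cross(right, forward)

        # pixel indices -> image-plane coordinates in [-1, 1]
        u = 2.0 * (px + 0.5) / width - 1.0
        v = 1.0 - 2.0 * (py + 0.5) / height

        # world-space point on the image plane
        return np.asarray(cam_pos, dtype=float) + focal * forward + u * right + v * up

    # camera at the origin looking down -Z, 256x256 image
    p = pixel_to_world(128, 200, 256, 256, cam_pos=(0, 0, 0), cam_dir=(0, 0, -2))

Subtracting the camera position from the returned point and normalizing gives the direction of the primary ray to trace.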
I am currently working on a program to detect coordinates of pool balls in an image of a pool table taken from an arbitrary point.
I first calculated the table corners and warped the perspective of the image to obtain a bird's eye view. Unfortunately, this made the spherical balls appear to be slightly elliptical as shown below.
In an attempt to detect the ellipses, I extracted all but the green felt area and used a Hough transform algorithm (HoughCircles) on the resulting image shown below. Unfortunately, none of the ellipses were detected (I can only assume because they are not circles).
Is there any better method of detecting the balls in this image? I am technically using JavaCV, but OpenCV solutions should be suitable. Thank you so much for reading.
The extracted BW image is good, but it needs some morphological filtering to eliminate the noise; then you can extract the external contour of each object (with cvFindContours) and fit the best ellipse to it (with cvFitEllipse2).
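A rough Python/OpenCV version of that pipeline (the file name, kernel size and area threshold are assumptions):

    import cv2

    bw = cv2.imread("table_bw.png", cv2.IMREAD_GRAYSCALE)       # the extracted BW image (assumed file)

    # morphological opening + closing to remove small noise and fill holes
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    clean = cv2.morphologyEx(bw, cv2.MORPH_OPEN, kernel)
    clean = cv2.morphologyEx(clean, cv2.MORPH_CLOSE, kernel)

    # external contour of each blob, then fit an ellipse to each one
    contours, _ = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    balls = []
    for cnt in contours:
        if len(cnt) >= 5 and cv2.contourArea(cnt) > 50:          # fitEllipse needs at least 5 points
            (cx, cy), axes, angle = cv2.fitEllipse(cnt)
            balls.append((cx, cy))                               # ellipse centre = ball position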