I have developed a script using dlib and cv2 to draw facial landmarks on images that contain a single face. Here is the script:
import cv2
import dlib

img_path = 'landmarks.png'
detector = dlib.get_frontal_face_detector()
shape_predictor = 'shape_predictor_68_face_landmarks.dat'
predictor = dlib.shape_predictor(shape_predictor)
count = 1
ready = True
while ready:
    frame = cv2.imread("demo.jpg")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    for face in faces:
        x1 = face.left()
        y1 = face.top()
        x2 = face.right()
        y2 = face.bottom()
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
        landmarks = predictor(gray, face)
        for n in range(0, 68):
            x = landmarks.part(n).x
            y = landmarks.part(n).y
            cv2.circle(frame, (x, y), 4, (255, 0, 0), -1)
    cv2.imshow("Frame", frame)
    ready = False
Now, here is what drives me crazy. When I download images (with or without a mask) from Google to test it, the script works fine. You can see results such as,
But when I try it on the following images, it does nothing.
I have searched the internet, but I haven't found anything that serves this purpose.
I have even tried the combination of
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')
m_cascade = cv2.CascadeClassifier('haarcascade_mcs_mouth.xml')
I have also looked at the following useful links:
Face Bounding Box
Detect Face Landmarks in Android (not even the same domain)
Landmarks detection
OpenCV2 Detect Facial Landmarks
but that is not working on these images either. The cv2 detector returns an empty list when I debug the script, such as:
I just want to draw fiducial landmarks on the above images. What would be the best possible solution? Maybe I am missing something in cv2 and dlib, but I am unable to get the required results.
I have also computed the confidence score for dlib, using the implementation recommended by a Stack Overflow user:
import dlib

detector = dlib.get_frontal_face_detector()
img = dlib.load_rgb_image('demo.jpg')
dets, scores, idx = detector.run(img, 1, -1)
for i, d in enumerate(dets):
    print("Detection {}, score: {}, face_type:{}".format(
        d, scores[i], idx[i]))
Here is the confidence score for the first image in the second row of the images above:
Looking forward to better suggestions from anyone out there. Thanks
First, I would check whether you can get confidence scores out of dlib. I'm not sure what the default confidence threshold is, but perhaps faces are being detected with scores below it. From the dlib Git repo, here is an example of how to get the confidence of each detection:
if (len(sys.argv[1:]) > 0):
    img = dlib.load_rgb_image(sys.argv[1])
    dets, scores, idx = detector.run(img, 1, -1)
    for i, d in enumerate(dets):
        print("Detection {}, score: {}, face_type:{}".format(
            d, scores[i], idx[i]))
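As a rough sketch of the thresholding idea: once `detector.run(img, 1, -1)` has returned the candidates and their scores, you can decide yourself which ones to keep. The helper name `filter_by_score`, the threshold values and the scores below are hypothetical, and plain tuples stand in for dlib rectangles so that the block runs without dlib.

```python
def filter_by_score(dets, scores, min_score=0.0):
    """Keep (detection, score) pairs whose score is at least min_score."""
    return [(d, s) for d, s in zip(dets, scores) if s >= min_score]


# Stand-in data: (left, top, right, bottom) boxes with made-up scores.
dets = [(10, 10, 60, 60), (80, 20, 120, 70), (5, 90, 40, 130)]
scores = [1.2, -0.4, -0.9]

# A strict threshold keeps only the confident detection ...
strict = filter_by_score(dets, scores, min_score=0.0)
# ... while a lower one also lets a weak candidate through.
lenient = filter_by_score(dets, scores, min_score=-0.5)
print(len(strict), len(lenient))
```

Lowering the threshold may recover faces that are partly covered, but it will probably also let through more false positives, so the value would need to be tuned on your own images.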
Alternatively, consider another face detector, for example a CNN-based detector like this MobileNet SSD face detector. I have not used this particular model, but I have used similar models, like the Google TPU-based face detector model here with very good results.
Download "shape_predictor_68_face_landmarks.dat", then try this working code:
import cv2
import dlib
import numpy as np

img = cv2.imread('Capture 8.PNG')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
p = "shape_predictor_68_face_landmarks.dat"
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(p)
faces = detector(gray)
for face in faces:
    # corners of the detected face rectangle
    x1, y1 = face.left(), face.top()
    x2, y2 = face.right(), face.bottom()
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 3)
    landmarks = predictor(gray, face)
    for n in range(0, 68):
        # pixel coordinates of the n-th landmark
        x = landmarks.part(n).x
        y = landmarks.part(n).y
        cv2.circle(img, (x, y), 4, (0, 0, 255), -1)
After long hours searching for a solution by myself, I am here looking for help to get unstuck. If there is a specialist or a kind "Python guru" with some time to give me a hand, here is the context:
I am working on a mesh manipulation script using the wonderful Trimesh library on Python 3.6. While applying a rotation matrix transformation, I would like to refresh the mesh visualisation so that I can watch the rotation of the mesh evolve in real time.
I tried, without success, to adapt the script below, found on the Trimesh GitHub, but I cannot stop it except by clicking the close button in the upper-right corner of the window. Here is the original code:
"""
Show how to pass a callback to the scene viewer for
easy visualizations.
"""
import time
import trimesh
import numpy as np


def sinwave(scene):
    """
    A callback passed to a scene viewer which will update
    transforms in the viewer periodically.

    scene : trimesh.Scene
      Scene containing geometry
    """
    # create an empty homogenous transformation
    matrix = np.eye(4)
    # set Y as cos of time
    matrix[1][3] = np.cos(time.time()) * 2
    # set Z as sin of time
    matrix[2][3] = np.sin(time.time()) * 3
    # take one of the two spheres arbitrarily
    node = s.graph.nodes_geometry[0]
    # apply the transform to the node
    scene.graph.update(node, matrix=matrix)


if __name__ == '__main__':
    # create some spheres
    a = trimesh.primitives.Sphere()
    b = trimesh.primitives.Sphere()
    # set some colors for the balls
    a.visual.face_colors = [255, 0, 0, 255]
    b.visual.face_colors = [0, 0, 100, 255]
    # create a scene with the two balls
    s = trimesh.Scene([a, b])
    # open the scene viewer and move a ball around
    s.show(callback=sinwave)
And here is my attempt to integrate a rotation matrix transformation (to rotate the imported mesh) and watch its evolution.
But the rotation is not smooth (the animation is jerky), and I am not able to stop it automatically, let's say after a 97° rotation about z. (Also, the code is driven by time, whereas I would like it to be driven by angular position.)
from pathlib import Path
import pandas as pd
import time
import xlsxwriter
import numpy as np
import trimesh
from trimesh import transformations as trf

# Actual directory loading and stl address saving
actual_dir = Path(__file__).resolve().parent
stl = Path(actual_dir/"Belt_Bearing_Gear.stl")
mesh = trimesh.load(f"{stl}")


def R_matrix(scene):
    u= 0
    o= 0
    t= time.time()
    rotation = (u, o, t)

    # Angle conversion from degrees to radians
    def trig(angle):
        r = np.deg2rad(angle)
        return r
    alpha = trig(rotation[0])
    beta = trig(rotation[1])
    gamma = trig(rotation[2])
    origin, xaxis, yaxis, zaxis = [0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]
    Rx = trf.rotation_matrix(alpha, xaxis)
    Ry = trf.rotation_matrix(beta, yaxis)
    Rz = trf.rotation_matrix(gamma, zaxis)
    R = trf.concatenate_matrices(Rx, Ry, Rz)
    # The rotation matrix is applied to the mesh
    mesh.vertices = np.matmul(mesh.vertices,R2)
    # apply the transform to the node
    s = trimesh.Scene([mesh])
    scene.graph.update(s, matrix=R)


if __name__ == '__main__':
    # set some colors for the mesh and the bounding box
    mesh.visual.face_colors = [102, 255, 255, 255]
    # create a scene with the mesh and the bounding box
    s = trimesh.Scene([mesh])
    # open the scene viewer and move a ball around
All your ideas and suggestions are welcome since I am a young Python beginner :)
Thanks in advance for your help,
Warm regards,
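One way to drive the animation by angular position instead of by clock time is to generate the rotation matrices from an angle that advances by a fixed step and stops at the target. The following is only a minimal sketch using NumPy; the names `z_rotation` and `rotation_steps` and the 2° step are hypothetical, and the viewer integration is left out. In a viewer callback, each call could take the next matrix and pass it to `scene.graph.update(node, matrix=...)` as in the Trimesh example, and do nothing once the steps are exhausted.

```python
import numpy as np


def z_rotation(angle_deg):
    """Homogeneous 4x4 rotation about the z axis by angle_deg degrees."""
    a = np.deg2rad(angle_deg)
    c, s = np.cos(a), np.sin(a)
    m = np.eye(4)
    m[:2, :2] = [[c, -s], [s, c]]
    return m


def rotation_steps(target_deg, step_deg):
    """Yield (angle, matrix) pairs from 0 up to exactly target_deg."""
    angle = 0.0
    while angle < target_deg:
        yield angle, z_rotation(angle)
        # clamp the last step so the final angle is the target itself
        angle = min(angle + step_deg, target_deg)
    yield target_deg, z_rotation(target_deg)


steps = list(rotation_steps(target_deg=97.0, step_deg=2.0))
final_angle, final_matrix = steps[-1]
print(len(steps), final_angle)
```

Each matrix here is the absolute rotation for its angle, so it is meant to be set as the node transform rather than multiplied into vertices that were already rotated; the latter would compound the rotation on every frame.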
I currently have a document that needs to be smart-scanned.
For that, I need to find the proper contours of the document against any background, so that I can apply a perspective warp and run detection on that image.
The main issue is that the document edge detection also picks up all kinds of background.
So far I have tried the HoughLinesP function, and finding contours on the grayscale, blurred image after passing it through Canny edge detection.
CANNY = 84
HOUGH = 25
IM_HEIGHT, IM_WIDTH, _ = rescaled_image.shape
# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(rescaled_image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7,7), 0)
#dilate helps to remove potential holes between edge segments
kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
dilated = cv2.dilate(gray, kernel)
# find edges and mark them in the output map using the Canny algorithm
edged = cv2.Canny(dilated, 0, CANNY)
test_corners = self.get_corners(edged)
approx_contours = []
(_, cnts, hierarchy) = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
# loop over the contours
for c in cnts:
    # approximate the contour
    approx = cv2.approxPolyDP(c, 80, True)
    if self.is_valid_contour(approx, IM_WIDTH, IM_HEIGHT):
How can I find a proper bounding box around the document with OpenCV?
Any help will be much appreciated.
(The document is photographed with a camera from any angle and against any coloured background.)
The following code might help you to detect/segment the page in the image:
import cv2
import matplotlib.pyplot as plt
import numpy as np
image = cv2.imread('test_p.jpg')
ori = image.copy()
image = cv2.resize(image, (image.shape[1]//10,image.shape[0]//10))
The image is resized to make the operations faster, so that this can run in real time.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (11,11), 0)
edged = cv2.Canny(gray, 75, 200)
print("STEP 1: Edge Detection")
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts[1], key = cv2.contourArea, reverse = True)[:5]
Here we consider only the 5 largest contours from the list sorted by area.
The size of the Gaussian blur kernel is a bit sensitive, so choose it according to the image size.
After the above operations the image may look like this:
for c in cnts:
    ### Approximating the contour
    # Calculates a contour perimeter or a curve length
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.01 * peri, True)
    # fallback: keep the latest approximation in case no quadrilateral is found
    screenCnt = approx
    # if our approximated contour has four points, then we
    # can assume that we have found our screen
    if len(approx) == 4:
        screenCnt = approx
        break
# show the contour (outline)
print("STEP 2: Finding Boundary")
cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)
image_e = cv2.resize(image,(image.shape[1],image.shape[0]))
The final image may look like this:
The rest can be handled after getting the final image.
Code Reference :- Git Repository
I guess this answer would be helpful...
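As a possible next step towards the perspective warp mentioned in the question, the four points of the detected quadrilateral usually need to be put into a consistent order and scaled back to the resolution of the original image (the code above shrinks the image by a factor of about 10). This is only a sketch: `order_corners` and `scale_to_original` are hypothetical helper names, and the sample points are made up to stand in for `screenCnt`.

```python
import numpy as np


def order_corners(pts):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)
    s = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]    # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype=np.float32)


def scale_to_original(pts, factor):
    """Map corner coordinates from the resized image back to the original."""
    return np.asarray(pts, dtype=np.float32) * factor


# Stand-in for screenCnt: approxPolyDP returns an array of shape (4, 1, 2).
screen_cnt = np.array([[[90, 12]], [[8, 10]], [[12, 118]], [[95, 120]]])
corners = scale_to_original(order_corners(screen_cnt), factor=10)
print(corners)
```

The ordered corners could then be passed as the source points to `cv2.getPerspectiveTransform`, and the resulting matrix to `cv2.warpPerspective` on the full-resolution copy `ori`. The sum/difference ordering assumes the page is not rotated by much more than about 45°.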
There is a similar problem which is called orthographic projection.
Orthographic approaches
Rather than applying a Gaussian blur plus morphological operations to get the edge of the document, try doing the orthographic projection first and then finding contours with your method.
For finding a proper bounding box, try some preset values or a reference letter, after which an orthographic projection will let you compute the height and hence the dimensions of the bounding box.
So, after some image preprocessing, I have an image which holds 5 contours
(the image was resized for posting here on Stack Overflow):
I'd like to remove all "islands" except for the actual letter.
At first I tried cv2.erode and cv2.dilate with all kinds of kernel sizes, but that didn't do the job, so I decided to mask out all contours except the largest one, like this:
_, cnts, _ = cv2.findContours(original, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
# Based on the given image, I would expect 5 contours here
areas = []
for contour in cnts:
    area = cv2.contourArea(contour)
relevant_indexes = list(range(1, len(cnts)))
mask = numpy.zeros(eroded.shape).astype(eroded.dtype)
color = 255
for i in relevant_indexes:
    cv2.fillPoly(mask, cnts[i], color)
cv2.imwrite("mask.png", mask)
# Trying to mask out the noise
result = cv2.bitwise_xor(original, mask)
cv2.imwrite("result.png", result)
But the mask I get is:
It's not what I would expect, and the bottom-left contour is missing.
Can someone please explain what I am missing here? And what would be the correct approach for getting rid of those "isolated islands"?
Thank you all!
The original photo I'm working on:
It sounds like you want to mask out the largest connected component (cv-speak for "island").
Here's an opencv/python script to do that:
#!/usr/bin/env python
import cv2
import numpy as np
# load image in grayscale
img = cv2.imread("img.png", 0)
# get all connected components
_, output, stats, _ = cv2.connectedComponentsWithStats(img, connectivity=4)
# get a list of areas for each group label (stats has one row per label)
group_areas = stats[:, cv2.CC_STAT_AREA]
# get the id of the group with the largest area (ignoring 0, which is the background id)
max_group_id = np.argmax(group_areas[1:]) + 1
# get max_group_id mask and save it as an image
max_group_id_mask = (output == max_group_id).astype(np.uint8) * 255
cv2.imwrite("output.png", max_group_id_mask)
Here's the result of the above script on your sample image:
I have been working on a project that requires finding defects in onions. The second attached image shows an abnormal onion. You can see that the onion is made up of two smaller onion twins. What's interesting is that the human eye can easily detect what's wrong with the structure.
One can do a structural analysis and observe that a normal onion has an almost smooth curvature while an abnormal one doesn't. So, quite simply, I want to build a classification algorithm based on the edges of the object.
However, there are times when the skin of the onion makes the curve irregular. See the image: there's a small part of skin that lies outside the actual curvature. I want to discriminate the bulge caused by the skin from the deformities produced at the point where the two subsections meet, and then reconstruct the contour of the object for further analysis.
Is there a mathematical tool that would help me here, given that I have the majority of the points that make up the outer edge of the onion, including the two irregularities?
See the code below:
import cv2
import numpy as np
import sys
cv2.namedWindow('test', cv2.WINDOW_NORMAL)
cv2.namedWindow('orig', cv2.WINDOW_NORMAL)
cv2.resizeWindow('test', 600,600)
cv2.resizeWindow('orig', 600,600)
image = cv2.imread('./buffer/crp'+str(sys.argv[1])+'.JPG')
tim = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
hsv_image = cv2.cvtColor(image,cv2.COLOR_BGR2HSV)
frame_threshed = cv2.inRange(hsv_image, np.array([70,0,0],np.uint8),
canvas = np.zeros(image.shape, np.uint8)
kernel = np.ones((3,3),np.uint8)
frame_threshed = cv2.erode(frame_threshed,kernel,iterations = 1)
kernel = np.ones((5,5),np.uint8)
frame_threshed = cv2.erode(frame_threshed,kernel,iterations = 1)
kernel = np.ones((7,7),np.uint8)
frame_threshed = cv2.erode(frame_threshed,kernel,iterations = 1)
_, cnts, hierarchy = cv2.findContours(frame_threshed.copy(),
cnts= sorted(cnts, key=cv2.contourArea, reverse=True)
big_contours = [c for c in cnts if cv2.contourArea(c) > 100000]
for cnt in big_contours:
    perimeter = cv2.arcLength(cnt,True)
    epsilon = 0.0015*cv2.arcLength(cnt,True)
    approx = cv2.approxPolyDP(cnt,epsilon,True)
    # print(len(approx))
    hull = cv2.convexHull(cnt,returnPoints = False)
    # try:
    defects = cv2.convexityDefects(cnt,hull)
    for i in range(defects.shape[0]):
        s,e,f,d = defects[i,0]
        start = tuple(cnt[s][0])
        end = tuple(cnt[e][0])
        far = tuple(cnt[f][0])
    cv2.drawContours(image, [approx], -1, (0, 0, 255), 5)
    cv2.drawContours(canvas, [approx], -1, (0, 0, 255), 5)
I would suggest trying Hu moments, since you have already extracted the shape of your objects. They would let you calculate a distance between two shapes, i.e. between your abnormal onion and a reference onion.
The Hu moments shape descriptor is available for Python through OpenCV. If the image is binary, you can use it like this:
# Reference image
shapeArray1 = cv2.HuMoments(cv2.moments(image1)).flatten()
# Abnormal image
shapeArray2 = cv2.HuMoments(cv2.moments(image2)).flatten()
# Calculation of distance between both arrays
# Threshold based on the distance
# Classification as abnormal or normal
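The three commented steps above could be filled in roughly as follows. This is only a sketch under assumptions: the log scaling and the Euclidean distance are one common choice rather than the only one, `hu_distance` and `is_abnormal` are hypothetical helper names, and the two vectors are made-up numbers standing in for `shapeArray1` and `shapeArray2`.

```python
import numpy as np


def log_scale(hu):
    """Compress the dynamic range of Hu moments while keeping their sign."""
    hu = np.asarray(hu, dtype=np.float64)
    out = np.zeros_like(hu)
    nz = hu != 0
    out[nz] = -np.sign(hu[nz]) * np.log10(np.abs(hu[nz]))
    return out


def hu_distance(hu_a, hu_b):
    """Euclidean distance between two log-scaled Hu moment vectors."""
    return float(np.linalg.norm(log_scale(hu_a) - log_scale(hu_b)))


def is_abnormal(hu_ref, hu_test, threshold):
    """Classify as abnormal when the shape is too far from the reference."""
    return hu_distance(hu_ref, hu_test) > threshold


# Made-up seven-element vectors standing in for shapeArray1 / shapeArray2.
reference = np.array([1.6e-1, 1.1e-4, 3.0e-7, 2.0e-8, 1.5e-15, 1.0e-10, -2.0e-16])
candidate = np.array([2.1e-1, 9.0e-3, 4.0e-6, 6.0e-7, 8.0e-13, 5.0e-8, 3.0e-14])

print(hu_distance(reference, candidate))
```

The threshold is not something that can be read off the formula; it would have to be chosen from a set of onions that were labelled by hand.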
MatchShapes could do the job too. It takes two binary images or contours and returns a float that measures the distance between them.
Python: cv.MatchShapes(object1, object2, method, parameter=0) → float
More details
So when an onion shape is detected as abnormal, you would fill the shape and apply some binary morphology to erase the imperfection and extract the shape without it:
1. Fill your shape.
2. Apply an opening (erosion followed by dilation) with a disk structuring element to get rid of the irregularities.
3. Extract the contours again.
4. You should now have a shape without the irregularities. If not, go back to step 2 and change the size of the structuring element.
OK, so if you look at the first two pictures of your onions, you can see that they have a circular shape (except for the peel peaks), while the "defective" one has more of an oval shape. What you could try is to find your contour (after applying an image transformation, of course) and determine its center point. Then you could measure the distance from the center of the contour to each point of the contour. You can do it using scipy (spatial.cKDTree() and tree.query()) or simply with the formula for the distance between two points, sqrt((x2-x1)^2 + (y2-y1)^2). Then you can say that if only a few points are out of bounds it is still an OK onion, but if a lot of points are out of bounds it is a defective onion. I drew two example images just for the sake of demonstration.
Example in code:
import cv2
import numpy as np
import scipy
from scipy import spatial

img = cv2.imread('oniond.png')
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray_image,180,255,cv2.THRESH_BINARY_INV)
im2, cnts, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cnt = max(cnts, key=cv2.contourArea)
list_distance = []
points_minmax = []
M = cv2.moments(cnt)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])
center = (cX, cY)
for i in cnt:
    tree = spatial.cKDTree(i)
    mindist, minid = tree.query(center)
    # record the distance of this contour point from the center
    list_distance.append(mindist)
    # remember points that fall outside the allowed band
    if float(mindist) < 100:
        points_minmax.append(i)
    elif float(mindist) > 140:
        points_minmax.append(i)
reshape = np.reshape(list_distance, (-1,1))
under_min = [i for i in list_distance if i < 100]
over_max = [i for i in list_distance if i > 140]
for i in points_minmax:
    # mark the out-of-bounds points in red
    cv2.circle(img, (int(i[0][0]), int(i[0][1])), 3, (0, 0, 255), -1)
if len(over_max) > 50:
    print('distances over maximum: ', len(over_max))
    print('distances over minimum: ', len(under_min ))
elif len(under_min ) > 50:
    print('distances over maximum: ', len(over_max))
    print('distances over minimum: ', len(under_min ))
else:
    print('distances over maximum: ', len(over_max))
    print('distances over minimum: ', len(under_min ))
cv2.imshow('img', img)
distances over maximum: 37
distance over minimum: 0
The output shows that there are 37 points out of bounds (red color) but the onion is still OK.
Result 2:
distances over maximum: 553
distances over minimum: 13
And here you can see that there are more points out of bounds (red color) and the onion is not OK.
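The same centroid-distance check can also be sketched without a KD-tree, using the plain distance formula on all contour points at once. This is a simplified illustration rather than the code above: the helper name `radial_outliers` is hypothetical, the contour is synthetic (a circle of radius 120 with a bump pushed out to radius 150), and the center is given directly instead of being computed from image moments.

```python
import numpy as np


def radial_outliers(contour_xy, center, r_min, r_max):
    """Count contour points whose distance from center lies outside [r_min, r_max]."""
    pts = np.asarray(contour_xy, dtype=np.float64).reshape(-1, 2)
    dist = np.hypot(pts[:, 0] - center[0], pts[:, 1] - center[1])
    return int(np.sum(dist < r_min)), int(np.sum(dist > r_max))


# Synthetic contour: one point per degree around the center (200, 200).
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
radius = np.full(theta.shape, 120.0)
radius[40:60] = 150.0  # a 20-point bump, e.g. a piece of peel
contour = np.column_stack([200 + radius * np.cos(theta),
                           200 + radius * np.sin(theta)])

under_min, over_max = radial_outliers(contour, center=(200, 200), r_min=100, r_max=140)
print(under_min, over_max)
```

A contour returned by `cv2.findContours` has shape (N, 1, 2), which the `reshape(-1, 2)` call accepts as well. The bounds 100 and 140 are the ones used in the example above and depend on the image scale.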
Hope this gives at least an idea on how to solve your problem. Cheers!
I'm using the face_recognition library for Python and I'm trying to change the KNN algorithm to run with OpenCV in real time. For that I "merged" two other algorithms provided by the library author (algorithm1, algorithm2).
(Edited: now it shows the frame until it detects a face, then it crashes.)
What I tried so far:
import numpy as np
import cv2
import face_recognition
import pickle
def predict(frame, knn_clf=None, model_path=None, distance_threshold=0.6):
    """
    Recognizes faces in given image using a trained KNN classifier

    :param knn_clf: (optional) a knn classifier object. if not specified, model_save_path must be specified.
    :param model_path: (optional) path to a pickled knn classifier. if not specified, model_save_path must be knn_clf.
    :param distance_threshold: (optional) distance threshold for face classification. the larger it is, the more chance
           of mis-classifying an unknown person as a known one.
    :return: a list of names and face locations for the recognized faces in the image: [(name, bounding box), ...].
        For faces of unrecognized persons, the name 'unknown' will be returned.
    """
    if knn_clf is None and model_path is None:
        raise Exception("Must supply knn classifier either thourgh knn_clf or model_path")
    # Load a trained KNN model (if one was passed in)
    if knn_clf is None:
        with open(model_path, 'rb') as f:
            knn_clf = pickle.load(f)
    # find face locations from frame
    X_face_locations = face_recognition.face_locations(frame)
    # If no faces are found in the image, return an empty result.
    if len(X_face_locations) == 0:
        return []
    # Find encodings for faces in the frame
    faces_encodings = face_recognition.face_encodings(frame, known_face_locations=X_face_locations)
    # Use the KNN model to find the best matches for the test face
    closest_distances = knn_clf.kneighbors(faces_encodings, n_neighbors=1)
    are_matches = [closest_distances[0][i][0] <= distance_threshold for i in range(len(X_face_locations))]
    # Predict classes and remove classifications that aren't within the threshold
    return [(pred, loc) if rec else ("unknown", loc) for pred, loc, rec in zip(knn_clf.predict(faces_encodings), X_face_locations, are_matches)]
def show_labels_on_webcam(RGBFrame, predictions):
    """
    Shows the face recognition results visually.

    :param img_path: path to image to be recognized
    :param predictions: results of the predict function
    """
    frame = RGBFrame
    # font for the name label (was undefined in the posted code)
    font = cv2.FONT_HERSHEY_DUPLEX
    for name, (top, right, bottom, left) in predictions:
        # Scale back up face locations since the frame we detected in was scaled to 1/4 size
        top *= 4
        right *= 4
        bottom *= 4
        left *= 4
        # Draw a box around the face
        print (frame.shape)
        print (frame.dtype)
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
        # Draw a label with a name below the face
        cv2.rectangle(frame, (left, bottom - 35), (right, bottom), (0, 0, 255), cv2.FILLED)
        cv2.putText(frame, name, (left + 6, bottom - 6), font, 1.0, (255, 255, 255), 1)
    # Display the resulting image
    cv2.imshow('Video', frame)
# Get a reference to webcam #0 (the default one)
video_capture = cv2.VideoCapture(0)
while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()
    # Resize frame of video to 1/4 size for faster face recognition processing
    small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
    # Convert the image from BGR color (which OpenCV uses) to RGB color (which face_recognition uses)
    rgb_small_frame = small_frame[:, :, ::-1]
    predictions = predict(rgb_small_frame, model_path="trained_knn_model_1.clf")
    # Display results overlaid on webcam video
    print (rgb_small_frame.shape)
    print (rgb_small_frame.dtype)
    show_labels_on_webcam(rgb_small_frame, predictions)
    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# Release handle to the webcam
video_capture.release()
The error I'm getting:
Traceback (most recent call last):
File "withOpenCV.py", line 91, in <module>
show_labels_on_webcam(rgb_small_frame, predictions)
File "withOpenCV.py", line 62, in show_labels_on_webcam
cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
TypeError: Layout of the output array img is incompatible with cv::Mat (step[ndims-1] != elemsize or step[1] != elemsize*nchannels)
If you have any suggestions or see what I'm missing, please let me know! Thanks in advance!
I solved the error by replacing show_labels_on_webcam(rgb_small_frame, predictions) with show_labels_on_webcam(frame, predictions). Thanks to #api55 for the hint!
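For anyone wondering why the fix works: the "layout ... incompatible with cv::Mat" message is consistent with `small_frame[:, :, ::-1]` being a NumPy view with a negative stride on the channel axis rather than a contiguous array, which OpenCV drawing functions cannot write into. The sketch below only uses NumPy to show the difference; the zero-filled array is a stand-in for a real webcam frame, and `np.ascontiguousarray` is shown as one possible alternative if you did want to draw on the RGB image.

```python
import numpy as np

# Stand-in for a BGR frame returned by video_capture.read()
frame = np.zeros((120, 160, 3), dtype=np.uint8)
frame[..., 0] = 255  # fill the blue channel

# Reversing the channel axis gives a view with a negative stride ...
rgb_view = frame[:, :, ::-1]
# ... while ascontiguousarray copies the same pixels into a fresh C-ordered buffer.
rgb_copy = np.ascontiguousarray(rgb_view)

print(rgb_view.flags['C_CONTIGUOUS'], rgb_view.strides)
print(rgb_copy.flags['C_CONTIGUOUS'], rgb_copy.strides)
```

Drawing on the original `frame` as in the fix above has the extra advantage that the face locations, which are scaled up by 4, match the full-size image, and that `cv2.imshow` receives the BGR channel order it expects.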