Kinect V2 vs PyKinect2: difference between depth images

I'm currently working with the Kinect V2 to access the depth image, using the Python library PyKinect2 to obtain these images.
My first problem is:
-> I run KinectStudio 2 and look at the depth image, then I look at the output of my PyKinect2 implementation, and I get a different image. How is that possible?
-> To access the depth of a specific point X(x, y), I use the method MapColorFrameToDepthSpace, and I manage to get coordinates that should give me the distance in the depth frame.
Is this assertion correct?
To get the depth image:
def draw_depth_frame(self, frame):
    depth_frame = frame.reshape((424, 512, -1)).astype(np.uint8)
    return depth_frame
The image from the Kinect V2:
The image from the Kinect V2 with a color ramp:
The image from PyKinect2:
Sincerely,

I found my mistake about this.
When I use this:
depth_frame = frame.reshape((424, 512, -1)).astype(np.uint8)
this line is in fact wrong.
The depth is not mapped onto a uint8, but onto a uint16.
By performing this conversion I lose information about the distance; I only get values like 255, which is not very useful. For example, a distance of 1600 was reported as 255, and because of this line I got the third depth image in my previous post. So the correction is simply:
depth_frame = frame.reshape((424, 512, -1)).astype(np.uint16)
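For display purposes the 16-bit depth values still need to be scaled down to 8 bits, otherwise most viewers show a mostly black image. A minimal sketch, assuming frame is the flat uint16 buffer returned by PyKinect2 (e.g. from get_last_depth_frame()) and that 4500 mm is a reasonable upper bound for the sensor range:

import numpy as np
import cv2

def depth_to_display(frame):
    # Reshape the flat Kinect V2 buffer to 424x512, keeping the full 16-bit range.
    depth = frame.reshape((424, 512)).astype(np.uint16)
    # Scale to 8 bits only for visualization; the Kinect V2 reports depth in mm,
    # and 4500 mm is used here as an assumed upper bound.
    display = np.clip(depth, 0, 4500).astype(np.float32) / 4500.0
    return (display * 255).astype(np.uint8)

# cv2.imshow('depth', depth_to_display(frame))
# cv2.waitKey(0)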

Related

OpenCV - ArUco: detectMarkers fails to identify some markers in photos

I have pictures containing ArUco markers, but I am unable to detect all of them with the detectMarkers function. Actually, I have many pictures: in some of them I can detect all the markers, in others I cannot, and I don't really understand why.
I thought it was because of the quality of the photo, but it seems it is not that simple. Here's an example of my code:
import cv2
import matplotlib.pyplot as plt
from cv2 import aruco

aruco_dict = aruco.Dictionary_get(aruco.DICT_4X4_1000)
inputfile = 'EOS-1D-X-Mark-II_1201-ConvertImage.jpg'
frame = cv2.imread(inputfile)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
parameters = aruco.DetectorParameters_create()
corners, ids, rejectedImgPoints = aruco.detectMarkers(frame, aruco_dict, parameters=parameters)
frame_markers = aruco.drawDetectedMarkers(frame.copy(), rejectedImgPoints)
plt.figure(figsize=(20, 10))
plt.imshow(frame_markers)
for i in range(len(ids)):
    c = corners[i][0]
    plt.plot([c[:, 0].mean()], [c[:, 1].mean()], "o", label="id={0}".format(ids[i]))
plt.legend()
plt.show()
In this picture, 1 marker is not detected and I don't understand why.
I tried to tune the parameters of the detectMarkers function manually, using an interactive method in a Jupyter notebook. There are many parameters, and I found nothing that really helped, except that on some photos reducing polygonalApproxAccuracyRate made a difference.
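For reference, a minimal sketch of lowering that parameter before detection (the value 0.02 is only an illustrative guess, and the filename is the one from the code above):

import cv2
from cv2 import aruco

aruco_dict = aruco.Dictionary_get(aruco.DICT_4X4_1000)
parameters = aruco.DetectorParameters_create()
# Lower the polygonal approximation tolerance (the default is 0.03);
# the right value depends on the picture, 0.02 is just an example.
parameters.polygonalApproxAccuracyRate = 0.02

frame = cv2.imread('EOS-1D-X-Mark-II_1201-ConvertImage.jpg')
corners, ids, rejectedImgPoints = aruco.detectMarkers(frame, aruco_dict, parameters=parameters)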
The photo is originally 5472 x 3648 pixels, but the one attached to this post is 2189 x 1459 pixels. Note that it doesn't work at the higher resolution either. Actually, on some photos I found that reducing the resolution helps to detect the markers... It seems contradictory, but I think it is because the default parameters of the function are not adapted to my pictures; still, I found no solution when tuning the parameters.
Another idea is to use the refineDetectedMarkers function after calling detectMarkers. It takes the candidates that were found by detectMarkers but failed to be identified and tries to refine their identification. However, as far as I understand, I need to know where my markers should be in the picture and pass that to refineDetectedMarkers (as a board). In my situation, I don't know where the markers should be, otherwise I wouldn't be taking photos; the photos are used precisely to observe the evolution of their positions.
I am interested in any ideas you may have, thanks for reading !

How to make tesseract work for different kinds of images?

I am working on a digit recognition task using Tesseract and OpenCV. I came up with a solution, but it is specific to one particular image: if I change the image I no longer get correct results, because I have to change the threshold value according to the image. The steps I took were:
Cropping the image to find an appropriate region
Converting the image to grayscale
Applying a Gaussian blur
Taking an appropriate threshold
Passing the image through Tesseract
So, my question is: how can I make my code generic, i.e. usable for different images without updating the code?
While working on this image I processed it as:
imggray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imggray, (5, 5), 0)
imgDil = cv2.dilate(imgBlur, np.ones((5, 5), np.uint8), iterations=1)
imgEro = cv2.erode(imgDil, np.ones((5, 5), np.uint8), iterations=2)
ret, imgthresh = cv2.threshold(imgEro, 28, 255, cv2.THRESH_BINARY)
And for this image as:
imggray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imggray, (5, 5), 0)
imgDil = cv2.dilate(imgBlur, np.ones((5, 5), np.uint8), iterations=0)
imgEro = cv2.erode(imgDil, np.ones((5, 5), np.uint8), iterations=0)
ret, imgthresh = cv2.threshold(imgEro, 37, 255, cv2.THRESH_BINARY)
I had to change the number of iterations and the threshold value to obtain proper results. What can I do so that I do not have to change these values?
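One common way to avoid hand-tuned thresholds (a suggestion, not something from the original post) is to let OpenCV pick the threshold per image with Otsu's method; the input filename here is only illustrative:

import cv2

img = cv2.imread('digits.jpg')  # illustrative input image
imggray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imggray, (5, 5), 0)
# THRESH_OTSU ignores the fixed threshold value (0 here) and computes one per image.
ret, imgthresh = cv2.threshold(imgBlur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)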

How to differentiate Passport and PAN card Scanned images in python

The goal is to identify whether an input scanned image is a passport or a PAN card using OpenCV.
I have used the structural_similarity (compare_ssim) method of skimage to compare the input scan with template images of a passport and a PAN card, but in both cases I got a low score.
Here is the code that I have tried:
from skimage.measure import compare_ssim as ssim
import matplotlib.pyplot as plt
import numpy as np
import cv2

img1 = cv2.imread('PAN_Template.jpg', 0)
img2 = cv2.imread('PAN_Sample1.jpg', 0)

def prepare_img(im):
    size = 300, 200
    im = cv2.resize(im, size)
    return im

img1 = prepare_img(img1)
img2 = prepare_img(img2)

def compare_images(imageA, imageB):
    s = ssim(imageA, imageB)
    return s

ssim = compare_images(img1, img2)
print(ssim)
Comparing the PAN card template with a passport I got an SSIM score of 0.12, and comparing the PAN card template with a PAN card the score was 0.20.
Since the two scores were very close, I was not able to distinguish between them in code.
If anyone has any other solution or approach, please help.
Here is a sample image:
PAN Scanned Image
You can also compare two images by their mean squared error (MSE):
import numpy as np

def mse(imageA, imageB):
    # the 'Mean Squared Error' between the two images is the
    # sum of the squared difference between the two images;
    # NOTE: the two images must have the same dimensions
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    err /= float(imageA.shape[0] * imageA.shape[1])
    # return the MSE; the lower the error, the more "similar"
    # the two images are
    return err
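A quick usage sketch of the function above (assuming the template and sample files from the question are available and are loaded as grayscale images):

import cv2

imageA = cv2.imread('PAN_Template.jpg', 0)
imageB = cv2.imread('PAN_Sample1.jpg', 0)
imageB = cv2.resize(imageB, (imageA.shape[1], imageA.shape[0]))  # enforce equal dimensions
print(mse(imageA, imageB))  # the lower the value, the more similar the images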
As per my understanding, PAN card and passport images contain different text, so I believe OCR can solve this problem.
All you need to do is extract the text from the images using an OCR library like Tesseract and look for a few predefined keywords in the text to differentiate the images (a sketch of that keyword check follows the OCR output below).
Here is a simple Python script showing the image pre-processing and OCR using the pytesseract module:
import cv2
import pytesseract
from PIL import Image

img = cv2.imread("D:/pan.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, th1 = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
cv2.imwrite('filterImg.png', th1)
pilImg = Image.open('filterImg.png')
text = pytesseract.image_to_string(pilImg)
print(text.encode("utf-8"))
Below is the binary image used for OCR:
I got the below string data after doing the OCR on the above image:
esraax fram EP aca ae
~ INCOME TAX DEPARTMENT Ld GOVT. OF INDIA
wrtterterad sg
Permanent Account Number. Card \xe2\x80\x98yf
KFWPS6061C
PEF vom ; ae
Reviavs /Father's Name. e.
SUDHIR SINGH : . ,
Though this text data contains noise, I believe it is more than enough to get the job done.
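A minimal sketch of the keyword check described above (the keyword lists are illustrative assumptions, not an exhaustive rule set; text is the OCR output obtained above):

def classify_document(text):
    text = text.upper()
    # Illustrative keywords; adjust them to the phrases that reliably
    # appear in your own scans.
    pan_keywords = ['INCOME TAX DEPARTMENT', 'PERMANENT ACCOUNT NUMBER']
    passport_keywords = ['REPUBLIC OF INDIA', 'PASSPORT']
    if any(k in text for k in pan_keywords):
        return 'PAN card'
    if any(k in text for k in passport_keywords):
        return 'Passport'
    return 'Unknown'

print(classify_document(text))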
Another OCR solution is to use the TextCleaner ImageMagick script from Fred's Scripts. A tutorial explaining how to install and use it (on Windows) is available here.
Script used:
C:/cygwin64/bin/textcleaner -g -e normalize -f 20 -o 20 -s 20 C:/Users/Link/Desktop/id.png C:/Users/Link/Desktop/out.png
Result:
I applied OCR on this with Tesseract (I am using version 4) and that's the result:
fart
INCOME TAX DEPARTMENT : GOVT. OF INDIA
wort cra teat ears -
Permanent Account Number Card
KFWPS6061C
TT aa
MAYANK SUDHIR SINGH el
far aT ary /Father's Name
SUDHIR SINGH
Wa RT /Date of Birth den. +
06/01/1997 genge / Signature
Code for OCR:
import cv2
from PIL import Image
import tesserocr as tr
number_ok = cv2.imread("C:\\Users\\Link\\Desktop\\id.png")
blur = cv2.medianBlur(number_ok, 1)
cv2.imshow('ocr', blur)
pil_img = Image.fromarray(cv2.cvtColor(blur, cv2.COLOR_BGR2RGB))
api = tr.PyTessBaseAPI()
try:
api.SetImage(pil_img)
boxes = api.GetComponentImages(tr.RIL.TEXTLINE, True)
text = api.GetUTF8Text()
finally:
api.End()
print(text)
cv2.waitKey(0)
Now, this doesn't answer your question (passport or PAN card), but it's a good starting point.
Doing OCR might be a solution for this type of image classification, but it might fail for blurry or poorly exposed images, and it might be slower than newer deep learning methods.
You can use object detection (TensorFlow or any other library) to train two separate image classes, i.e. PAN and passport. For fine-tuning pre-trained models you don't need much data either. And as per my understanding, PAN cards and passports have different background colors, so I guess it will be quite accurate.
Tensorflow Object Detection: Link
Nowadays OpenCV also supports object detection without installing any new libraries (i.e. TensorFlow, Caffe, etc.). You can refer to this article for YOLO-based object detection in OpenCV.
We can use:
Histogram comparison - the simplest and fastest method; it gives a similarity score between the two histograms (see the sketch after this list).
Template matching - searching for the location of a template image; with this we can find smaller image parts in a bigger one (like some common patterns on a PAN card).
Feature matching - features extracted from one image are recognised in another image even if that image is rotated or skewed.
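A minimal sketch of the histogram comparison idea, reusing the template and sample filenames from the question above (the bin sizes are illustrative choices):

import cv2

img1 = cv2.imread('PAN_Template.jpg')
img2 = cv2.imread('PAN_Sample1.jpg')

def hsv_hist(img):
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # 2D histogram over hue and saturation, normalized so images of
    # different sizes can be compared.
    hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

score = cv2.compareHist(hsv_hist(img1), hsv_hist(img2), cv2.HISTCMP_CORREL)
print(score)  # closer to 1.0 means more similar color distributions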

Google Earth Engine - RGB image export from ImageCollection Python API

I am running into some problems with the Google Earth Engine Python API when generating an RGB image from an ImageCollection.
Basically, to turn the ImageCollection into an Image, I apply a median reduction. After this reduction, I apply the visualize function, where I need to define variables such as min and max. The problem is that these two values are image dependent.
dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
    .filterBounds(ee.Geometry.Polygon([[39.05789266, 13.59051553],
                                       [39.11335033, 13.59051553],
                                       [39.11335033, 13.64477783],
                                       [39.05789266, 13.64477783],
                                       [39.05789266, 13.59051553]])) \
    .filterDate('2016-01-01', '2016-12-31') \
    .select(['B4', 'B3', 'B2'])

reduction = dataset.reduce('median') \
    .visualize(bands=['B4_median', 'B3_median', 'B2_median'],
               min=0,
               max=3000,
               gamma=1)
So for each different image I would need to work out these two values, which can slightly change. Since the number of images I need to generate is huge, it is impossible to do that manually. I do not know how to overcome this problem and I cannot find any answer to it. An idea would be to find the minimum and maximum values of the image, but I did not find any function in the JavaScript or Python API that allows doing that.
I hope that someone will be able to help me.
You can use img.reduceRegion() to get image statistics for the region you want, for each image you export. You then feed the results of the region reduction into the visualization function. Here is an example:
geom = ee.Geometry.Polygon([[39.05789266, 13.59051553],
                            [39.11335033, 13.59051553],
                            [39.11335033, 13.64477783],
                            [39.05789266, 13.64477783],
                            [39.05789266, 13.59051553]])

dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
    .filterBounds(geom) \
    .filterDate('2016-01-01', '2016-12-31') \
    .select(['B4', 'B3', 'B2'])

reduction = dataset.median()

stats = reduction.reduceRegion(reducer=ee.Reducer.minMax(), geometry=geom,
                               scale=100, bestEffort=True)
statDict = stats.getInfo()

prettyImg = reduction.visualize(bands=['B4', 'B3', 'B2'],
                                min=[statDict['B4_min'], statDict['B3_min'], statDict['B2_min']],
                                max=[statDict['B4_max'], statDict['B3_max'], statDict['B2_max']],
                                gamma=1)
Using this approach, I get an output image like this:
I hope this helps!
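The original question is ultimately about exporting many RGB images; a possible follow-up (a sketch, not part of the answer above) is to hand the visualized image to the batch export module:

task = ee.batch.Export.image.toDrive(
    image=prettyImg,
    description='rgb_median_2016',  # illustrative task name
    region=geom,                    # older API versions may need geom.getInfo()['coordinates']
    scale=30,
    maxPixels=1e9)
task.start()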

Pixel tracing not working properly

So I have an image with a curve, and a function that looks for one pixel and then connects the adjacent pixels (like forming a path), but somehow when the path is going south-west or west it does not find any pixels even if there is one.
The function goes like this:
pixels = img.getpixels()   # list of (x, y) coordinates of the curve pixels
window = ((1, 0), (0, -1), (-1, 0), (0, 1), (1, 1), (1, -1), (-1, -1), (-1, 1))
objs = []
obj = [pixels.pop(), ]
while pixels:
    p = obj[-1]
    # coordinates of the 8 neighbours of the last pixel added to the path
    neighbours = tuple((x + p[0], y + p[1]) for x, y in window)
    for n in neighbours:
        if n in pixels:
            obj.append(n)
            pixels.remove(n)
            break
    # no neighbour was found: close the current object and start a new one
    if p == obj[-1]:
        objs.append(obj)
        obj = [pixels.pop(), ]
And an example of the failure:
the original image
the objects image
Each different color represents a different "object".
Should it not be showing just one object?
I appreciate any thoughts on this, tyvm :D
