I am new to Kinect development. I am using the Kinect v2 to create a Windows Store application, following the Face Basics example found here. I want to be able to capture a face image when the face is engaged. However, I am having trouble capturing the image from the Win2D CanvasControl, and I am not sure how else I can capture the face image.
Can anyone assist me with how I might accomplish this?
In the Face Basics example, the author stores the image captured by the Kinect sensor in a CanvasBitmap (e.g. line 38 of the ColorFrameSourceRenderer code snippet).
I assume that by "capturing the image" you mean saving it to disk. The contents of this bitmap can then be saved using the SaveAsync method.
I'm not really too familiar with the programming/coding aspect of Computer Vision. What I can tell from a functional perspective is that it analyzes an image and then outputs tags based on what it sees. The issue is that the plugin I use in WordPress doesn't filter the response of that image analysis; it basically takes my API key and then echoes the response it receives from Computer Vision, displaying all of the image tags.
That being said, I have a fairly straightforward yes-or-no question: can Computer Vision be set up to only output specific image tags if they are present in the image? If so, where can I find information on how to do this?
I'm looking at the API reference here: https://westus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-2/operations/56f91f2e778daf14a499f21b
There does not appear to be any setting that lets you filter the image tags returned by the service, and the only format it returns a response in is JSON. So the answer is no: if you only want specific tags, you have to filter them yourself after the response comes back.
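If it helps as a starting point: since the service won't filter for you, the filtering has to happen in your own code once the JSON arrives. Your plugin is PHP, but here is a minimal sketch of the idea in TypeScript, calling the v3.2 Analyze endpoint and keeping only an allow-list of tag names. The ENDPOINT, API_KEY, and ALLOWED_TAGS values are placeholders, not anything from your setup.

```typescript
// Minimal sketch: request tags from the Computer Vision v3.2 Analyze endpoint
// and keep only the tags on an allow-list. ENDPOINT, API_KEY and ALLOWED_TAGS
// are placeholder values to replace with your own.
const ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com";
const API_KEY = "<your-subscription-key>";
const ALLOWED_TAGS = new Set(["person", "dog", "car"]); // hypothetical allow-list

interface Tag {
  name: string;
  confidence: number;
}

async function getFilteredTags(imageUrl: string): Promise<Tag[]> {
  const response = await fetch(
    `${ENDPOINT}/vision/v3.2/analyze?visualFeatures=Tags`,
    {
      method: "POST",
      headers: {
        "Ocp-Apim-Subscription-Key": API_KEY,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: imageUrl }),
    }
  );
  const result = await response.json();

  // The service always returns every tag it detects; drop the unwanted ones here.
  return (result.tags as Tag[]).filter((tag) => ALLOWED_TAGS.has(tag.name));
}
```

The same filtering loop over the tags array could be written in PHP inside the plugin, before it echoes the output.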
I am building an API that extracts text from an image using Tesseract.js and Node.js. I want to add a feature that tells the user the percentage of the image occupied by text. I'd be very grateful if anyone could guide me on how to do this.
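One rough way to approach this (a sketch, not a full answer): Tesseract.js reports a bounding box (bbox) for each recognized word, so you can sum the word-box areas and divide by the image area. Assumptions in the sketch below: the image's width and height are obtained separately (e.g. with the image-size package), and overlapping word boxes are not deduplicated, so the figure is approximate.

```typescript
import Tesseract from "tesseract.js";

// Rough estimate of how much of the image is covered by recognized text,
// based on the per-word bounding boxes Tesseract.js reports.
// imageWidth/imageHeight must be supplied by the caller (e.g. via image-size);
// overlapping word boxes are counted twice, so treat the result as approximate.
async function textCoveragePercent(
  imagePath: string,
  imageWidth: number,
  imageHeight: number
): Promise<number> {
  const { data } = await Tesseract.recognize(imagePath, "eng");

  let textArea = 0;
  for (const word of data.words) {
    const { x0, y0, x1, y1 } = word.bbox;
    textArea += (x1 - x0) * (y1 - y0);
  }

  return (textArea / (imageWidth * imageHeight)) * 100;
}
```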
I am using the Vimeo Depth Player (https://github.com/vimeo/vimeo-depth-player/) for volumetric videos - only as a hobby/out of curiosity - and I'd like to know more about the parameters used in the video description (such as in this video: https://vimeo.com/279527916). I searched for it, but I wasn't able to find a description of any of the supported parameters.
Does anyone here know where to find such a description?
Thanks!
Unfortunately, this JSON config is not publicly documented anywhere right now, except for the source code which parses it.
If you are using Depthkit to do a volumetric capture, they automatically generate this configuration for you so you don't have to worry about what it means.
https://docs.depthkit.tv/docs/the-depthkit-workflow
The point of this config is to mathematically describe how the subject was captured - e.g. how far is the subject from the camera? Without all of this, you won't be able to properly reconstruct the volumetric capture.
Google recently launched a new search-by-image feature: we can search for other images by uploading an image in the Google search box. How is this possible?
http://images.google.com
Look at the Wikipedia article on content-based image retrieval. An example of an open-source implementation whose internal workings you can study is the GNU Image Finding Tool.
If you click on the "Learn more" link on the page you are referring to, you'll find this explanation
How it works
Google uses computer vision techniques to match your image to other images in the Google Images index and additional image collections. From those matches, we try to generate an accurate "best guess" text description of your image, as well as find other images that have the same content as your search image. Your search results page can show results for that text description as well as related images.
Actually, the answer to this lies in image processing: over the last decade, image processing and computer vision have advanced a great deal.
Search by image works on pixels: it compares pixel-based features of your image against the image database Google maintains. It is quite similar to what an actual text search does, but with pixels (image features) in place of text.
There are various operators, such as the Sobel operator, that help focus on the important details of the picture being tested, so the search can be carried out on the basis of the image's important features.
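To make the pixel-comparison idea above a bit more concrete, here is a tiny sketch (my own illustration, not Google's actual algorithm) of a difference hash, one of the simplest content-based image retrieval fingerprints: the image is shrunk to a small grayscale grid, neighbouring pixels are compared, and the resulting bits form a fingerprint; two images whose fingerprints differ in only a few bits are likely to look similar. Decoding and resizing the image into the grayscale grid is assumed to happen elsewhere (e.g. with a library such as sharp or jimp).

```typescript
// Sketch of a difference hash (dHash), a very simple content-based image
// retrieval fingerprint. Input: an 8-row by 9-column grid of grayscale values
// (0-255) produced by resizing the image; output: 64 bits as a string.
function dHash(gray8x9: number[][]): string {
  let bits = "";
  for (let y = 0; y < 8; y++) {
    for (let x = 0; x < 8; x++) {
      // Compare each pixel with its right-hand neighbour.
      bits += gray8x9[y][x] < gray8x9[y][x + 1] ? "1" : "0";
    }
  }
  return bits;
}

// Hamming distance between two fingerprints: the fewer differing bits,
// the more similar the two images are likely to be.
function hammingDistance(a: string, b: string): number {
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) diff++;
  }
  return diff;
}
```

A real search-by-image system uses much richer features (edges, local descriptors, learned embeddings) and an index over billions of images, but the matching principle is the same: reduce each image to comparable features and look for close matches.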
I'm trying to use the code posted here: http://seanho.posterous.com/monotouch-first-attempt-arkit-c-version
However, when I try to overlay it on a camera view, it seems to behave really strangely.
I'm guessing that it's because the camera view only does portrait?
Has anyone successfully used this? Or maybe knows how to get this working?
Cheers
w://
The code on the blog entry you linked to is ported from the open source iPhone ARKit: http://www.iphonear.org/
iPhone ARKit was updated after this port was posted (to quote the author, "My version of code may be outdated"). You may want to examine the source on GitHub to pick up any changes/fixes.