Has anyone used CNTK to read hand-filled documents? I tried several OCR tools and they do next to no handwriting recognition, so I am thinking of using CNTK for this instead. I searched and found that not many people have tried it. Any advice on libraries, or any other pointers?
Here is a basic OCR example using CNTK:
https://github.com/Microsoft/CNTK/blob/master/Tutorials/CNTK_103B_MNIST_FeedForwardNetwork.ipynb
However, to use the model in a real application you will need a way to segment the handwriting.
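As a rough illustration of what "segmenting" can mean, here is a minimal sketch that splits a binarized line of writing into glyphs wherever a column contains no ink (a vertical projection profile). The function names and the tiny sample image are made up for the example, and this only works when characters do not touch; cursive writing needs something more robust. Each crop would still have to be resized to the 28x28 input that an MNIST-trained model expects.

```python
def segment_columns(image):
    """Split a binary image (list of rows, 1 = ink) into (start, end) column spans.

    A span covers consecutive columns that contain at least one ink pixel;
    `end` is exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Count ink pixels in every column (the vertical projection profile).
    profile = [sum(row[x] for row in image) for x in range(width)]

    spans = []
    start = None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x                    # entering a glyph
        elif not ink and start is not None:
            spans.append((start, x))     # leaving a glyph
            start = None
    if start is not None:
        spans.append((start, width))     # glyph touches the right edge
    return spans


def crop(image, span):
    """Return the sub-image for one (start, end) column span."""
    start, end = span
    return [row[start:end] for row in image]


# Two tiny hypothetical "glyphs" separated by a blank column.
page = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
]
glyph_spans = segment_columns(page)
glyphs = [crop(page, s) for s in glyph_spans]
```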
TL;DR: how can I detect the presence of handwriting in an image?
I'm using Google's Python Vision API to scan for text in images, with generally good results. Most of the time the images contain printed text, but sometimes there is handwriting.
As noted in the documentation, you sometimes get better results for handwritten text using document_text_detection rather than the standard text_detection API call. My own tests back this up, but also show that the standard text_detection call generally works best for printed text in JPEG images.
So I'd like to use the standard text_detection by default, and only run images through document_text_detection if there is handwriting. However, I can't find a reliable way to detect the presence of handwritten text in an image using the Vision APIs.
I tried label detection, but there does not appear to be a specific label for handwriting. Occasionally it will spit out "Calligraphy" but not reliably.
Does anyone know of a way to accomplish this?
I haven’t used the Google Cloud Vision API, but you can try object detection models. I would suggest creating a labeled dataset from the document images of your use case with a tool like LabelImg, then training an object detection model such as YOLOv3 [paper] [code]. I have worked on similar problems, and it should work.
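Once some detector (for example, the trained model suggested above) tells you whether an image contains handwriting, the routing described in the question could be sketched roughly like this. `detect_text` and `my_detector` are hypothetical names, not part of the Vision library; only `text_detection` and `document_text_detection` are the client's own methods.

```python
def detect_text(client, image, has_handwriting):
    """Pick the Vision API call based on a handwriting flag.

    `client` is expected to be a google.cloud.vision.ImageAnnotatorClient and
    `image` a vision.Image; `has_handwriting` comes from your own detector.
    """
    if has_handwriting:
        return client.document_text_detection(image=image)
    return client.text_detection(image=image)


# Usage with the real client (assumes a recent google-cloud-vision release
# and configured credentials; "scan.jpg" and my_detector are placeholders):
# from google.cloud import vision
# client = vision.ImageAnnotatorClient()
# with open("scan.jpg", "rb") as f:
#     image = vision.Image(content=f.read())
# response = detect_text(client, image, has_handwriting=my_detector("scan.jpg"))
# print(response.text_annotations)
```

Passing the client in as an argument keeps the routing logic separate from the API call, so it can be exercised with a stub before spending quota on real requests.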
I know that object recognition is currently not supported by Google's ARCore.
My simple goal: detect cups and show some coffee inside. (Ideally it would be displayed live on the phone.)
Is there really no way to detect objects?
Do you know of any additional computational approaches that can recognize objects via ARCore?
Train a CNN, but instead of training on image + annotation, use point cloud + annotation. Is this approach viable?
Is there any approach that records a video + point cloud and processes them on a backend?
Is Snapchat using ARCore?
Are they detecting the face and pose to put the virtual makeup on the mesh?
How is the mesh computed?
I don't expect answers to every question, just ideas.
Maybe someone knows of similar projects, interesting links, or something else to think about.
Thanks in advance.
My question is exactly the same as this one, except that... is it possible to achieve the goal with LibVLC? Thanks!
By the way, are there any full-fledged tutorials or books for LibVLC? There are plenty of modules mentioned on this page, but without a tutorial it's difficult or impossible for me to understand how they work. So far, the only tutorial I have found is https://wiki.videolan.org/LibVLC_Tutorial/ which is very basic and says nothing about demuxing, decoding, encoding or muxing. Any information or suggestions would be highly appreciated!
I am doing a project on hand sign recognition on a static image. Can I use just Haar training to accomplish this?
From what I understand, it is somewhat similar to the concept of neural networks.
Haar training may help you detect the hand, but not recognize the sign.
People use many different approaches, so I cannot point to a single one. You could do some research on Google Scholar with the keywords "hand sign", "recognition" and "detection".
Some tips: you need to segment the hand and then use template matching or another method to recognize its shape. There is also a project for hand gestures here.
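To make the "segment, then template-match" tip concrete, here is a minimal sketch that compares an already-segmented binary hand mask against a few stored templates and picks the closest one. The scoring rule (fraction of agreeing pixels), the labels, and the 3x3 masks are all assumptions made for the example; real masks would be much larger, and the patch and templates must have the same size.

```python
def match_score(patch, template):
    """Fraction of pixels where a binary patch agrees with a binary template."""
    total = 0
    same = 0
    for patch_row, template_row in zip(patch, template):
        for p, t in zip(patch_row, template_row):
            total += 1
            same += (p == t)
    return same / total


def classify(patch, templates):
    """Return the label of the best-matching template and its score."""
    best_label, best_score = None, -1.0
    for label, template in templates.items():
        score = match_score(patch, template)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical 3x3 templates; real ones would be segmented hand masks.
TEMPLATES = {
    "fist": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "flat": [[1, 1, 1], [0, 0, 0], [0, 0, 0]],
}
segmented_hand = [[0, 1, 0], [1, 1, 1], [0, 0, 0]]
label, score = classify(segmented_hand, TEMPLATES)
```

Pixel agreement is sensitive to scale and position, so in practice the segmented hand would be cropped and resized to the template size first.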
Is there any way to get the text from a scanned document in JPG, JPEG or any other format? I am using Ruby as my programming language, but if I can get the text with help from another programming language, I guess it will not be much of a problem to integrate.
Thanks.
Yes, you can use an OCR library. There are additional details at https://stackoverflow.com/questions/1085/free-ocr-library.
In brief, you may wish to consider using tessnet (http://www.pixel-technology.com/freeware/tessnet2/).
This technology is called optical character recognition (OCR).
For programming, check out this question, which recommends tesseract-ocr.
OCR for Ruby? Check out this question.
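If you go the tesseract-ocr route, one simple way to integrate it from another language is to call its command-line tool. Below is a minimal Python sketch under that assumption; the helper names and "scan.jpg" are placeholders, and it needs the tesseract binary installed and on PATH. The same command could be run from Ruby by shelling out.

```python
import subprocess


def tesseract_command(image_path, lang="eng"):
    """Build the argument list for the tesseract CLI.

    Passing "stdout" as the output base makes tesseract print the
    recognized text instead of writing a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on an image file and return the recognized text."""
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,   # raise CalledProcessError if tesseract fails
    )
    return result.stdout


# Example (requires the tesseract binary on PATH; "scan.jpg" is a placeholder):
# print(ocr_image("scan.jpg"))
```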
If it's just a couple images, here's a site that supposedly does it for free.
OCR Terminal http://www.ocrterminal.com has been the best (most accurate) free tool out of at least a dozen that I have used. It works especially well with formatted (table) data.