Can Tesseract work with languages such as Bengali? If so, how accurate is it, and what steps should I follow to implement it for Bengali? - nlp

I want to implement an offline program that can detect Bengali text in an image (black text on a white background). I need to know how to approach this work to begin with.

Yes. Tesseract is trained for Bengali; see the list of supported languages. You have to use the language code ben for that. The rest of the implementation details are given here; simply follow them.
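As a minimal sketch of what using the ben language code can look like, the Python snippet below builds and runs a call to the tesseract command-line tool. It assumes Tesseract 4 or later is installed and on PATH with the Bengali trained data available (tesseract --list-langs should show ben). The function names, the file name page.png and the page segmentation mode 6 (a single uniform block of text) are illustrative choices, not something from the answer above.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="ben", psm=6):
    """Build the argument list for a Tesseract run that prints text to stdout."""
    # Passing "stdout" as the output base makes the CLI print the text
    # instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_image(image_path, lang="ben"):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        check=True,
    )
    # Tesseract emits UTF-8, which covers the Bengali script.
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    # Hypothetical input file: black Bengali text on a white background.
    print(ocr_image("page.png"))
```

Accuracy depends heavily on image quality, so it is worth trying a few clean scans before building anything larger around this.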

Related

Animation Sprite

I want to create a 2D sprite that mimics the provided image:
http://a4.mzstatic.com/us/r1000/069/Purple2/v4/e6/0d/73/e60d73a8-6d78-64c2-dd59-9aabb54c7837/mzl.ujapwanw.320x480-75.jpg
I also want to create different facial expressions as sprites and import them into Unity3D, in order to build an Android application that shows multiple facial expressions with those sprites. So my question is: which software should I use throughout this process?
Please let me know the simplest step-by-step procedure, as I am taking my first steps in computer graphics.
Thanks a lot.
Image manipulation is what you are looking for. To modify the image you have and generate other facial expressions from it, you need to be very good at math. Image manipulation is not basic stuff, and I hope you are not new to programming.
Now that you understand that, you need OpenCV to be able to do this, and you need a wrapper for it in C#. You can get a ready-made wrapper here: https://www.assetstore.unity3d.com/en/#!/content/21088. It works on Windows, Mac, Android and iOS and will save you time. It is NOT free, but the price is worth it compared to the time you would spend building the wrappers for all platforms.
Once you have this, you can start learning OpenCV from the following links:
http://docs.opencv.org/doc/tutorials/tutorials.html
http://opencv-srf.blogspot.com/
http://shervinemami.info/openCV.html
http://www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/opencv-intro.html
If you use the Unity plugin I mentioned, you can ask the author of the plugin to help you out if you get stuck.

define pronunciation starting time for each word in script

I have a text script that is used to create podcasts. So the words in podcast audio are exactly the same as in my text. Now what I want to have is the following:
Word in text | Pronunciation started at
Hello | 0:0:0.000
my | 0:0:1.125
friends | 0:0:2.750
Is that possible to do at all?
Thanks in advance!
One keyword you could start with to approach the complexity of the problem is "forced alignment". This site also covers questions on this topic, e.g. here, which leads to questions and answers concerning HTK (the Hidden Markov Model Toolkit) via the related threads.
You can find a more hands-on style description of how to use forced alignment in automated audio segmentation here.
So the answer is: yes, it is possible, but it is algorithmically very complex and even in its best implementations it is not error-free.
PS.: I found you a really simple tool
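Whichever forced-alignment tool you end up with, its output is essentially a list of words with start times. The Python sketch below does not perform alignment; it only shows how such output could be turned into the table from the question. It assumes the aligner yields (word, start time in seconds) pairs, and the function names are made up for illustration.

```python
def format_timestamp(seconds):
    """Format a start time in seconds as h:m:s.mmm, e.g. 1.125 -> 0:0:1.125."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours}:{minutes}:{secs}.{ms:03d}"


def alignment_table(word_starts):
    """Turn (word, start_seconds) pairs into 'word timestamp' lines."""
    return [f"{word} {format_timestamp(start)}" for word, start in word_starts]


if __name__ == "__main__":
    # Assumed aligner output for the example in the question.
    aligned = [("Hello", 0.0), ("my", 1.125), ("friends", 2.75)]
    print("\n".join(alignment_table(aligned)))
```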

Search in a book with speech

I am trying to build a program that will find which page/sentence of a book is being read into a microphone. I have the book's text and its audio content. The user will start reading from a random page, and the program is supposed to sync to the user and show the section of the book that is being read. It might seem like a useless program, but please bear with me.
Would an approach similar to Shazam-like programs work? I am not sure how effective those algorithms are for speech. Also, the speaker will be different and might have an accent and read at a different speed.
Another approach would be converting the speech to text and searching for the text in the book. The problem is that the language of the book is a rare one for which there is no language model available. In addition, the script does not use Latin characters, which makes programming difficult (for me at least).
Are there any solutions that anyone can recommend? Would extracting features from the audio file and comparing them with features extracted in real time from the microphone work? Which features?
Is there any implementation/code that I can start with? Any language is OK, but I prefer C.
You need to use a speech recognizer.
Create a language model directly from the book text. That will make the recognition of the book reading very accurate, both for the original reading and for the reading by the user.
Use this language model to recognize the book audio and assign timestamps to the words, or use a more advanced algorithm to perform text-to-audio alignment.
Recognize the user's speech with the book-specific language model and use the recognized text to display a position in the book.
You can use CMUSphinx for the mentioned tasks.
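The last step, mapping recognized text to a position in the book, can be sketched without any speech library. The Python example below assumes the recognizer has already produced a short, possibly imperfect, string of words; it slides a window over the book and keeps the best fuzzy match using difflib from the standard library. The function names and the sample sentence are illustrative, and whitespace tokenization is a simplification that may not suit every script.

```python
import difflib

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Split text on whitespace and strip common punctuation from each word."""
    words = (word.strip(PUNCTUATION) for word in text.lower().split())
    return [word for word in words if word]


def locate_in_book(book_words, heard_words):
    """Return (start_index, score) of the book window most similar to heard_words."""
    window = len(heard_words)
    best_index, best_score = 0, 0.0
    for start in range(max(1, len(book_words) - window + 1)):
        candidate = book_words[start:start + window]
        # ratio() is 1.0 for identical word sequences and tolerates a few
        # misrecognized words in the middle.
        score = difflib.SequenceMatcher(None, candidate, heard_words).ratio()
        if score > best_score:
            best_index, best_score = start, score
    return best_index, best_score


if __name__ == "__main__":
    book = tokenize("The quick brown fox jumps over the lazy dog and runs away.")
    heard = tokenize("fox jumped over the lazy")  # one misrecognized word
    print(locate_in_book(book, heard))
```

Scanning every window is slow for a whole book; a real program would probably index the text first (for example by word n-grams) and only score the candidate windows.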

Is it possible for a system to identify hand signs using just the Haar training in OpenCV?

I am doing a project on hand sign recognition on a static image. Can I use just Haar training to accomplish this?
From what I understand, it is somewhat similar to the concept of neural networks.
Haar training may help to detect the hand, but not to recognize the sign.
People use many approaches, so I cannot point to a single one. You could do some research on Google Scholar using the keywords "hand sign", "recognition" and "detection".
Some tips: you need to segment the hand and then use template matching or some other method to recognize its shape. There is also a project for hand gestures here.
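To make the template-matching tip concrete, here is a toy Python sketch that works on tiny binary images stored as lists of lists. It assumes the hand has already been segmented and that every template fits inside the image; the labels "fist" and "point" and the 2x2 templates are invented for illustration. In practice you would use a library routine such as OpenCV's matchTemplate on real images rather than this pure-Python loop.

```python
def sad_at(image, template, top, left):
    """Sum of absolute differences between the template and one image window."""
    return sum(
        abs(image[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def best_match(image, template):
    """Return (score, top, left) of the window with the lowest difference."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = None
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            score = sad_at(image, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best


def classify(image, templates):
    """Pick the label whose template matches the image best (lowest score)."""
    scores = {label: best_match(image, t)[0] for label, t in templates.items()}
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical segmented hand mask: 1 = hand pixel, 0 = background.
    image = [
        [0, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    templates = {"fist": [[1, 1], [1, 1]], "point": [[0, 1], [0, 1]]}
    print(classify(image, templates))
```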

read text document from scanned image

Is there any way to get the text from a scanned document in JPG, JPEG or any other format? I am using Ruby as my programming language, but I guess that if I can get the text with help from another programming language, it will not be much of a problem to integrate.
Thanks.
Yes, you can use an OCR library. There are additional details at https://stackoverflow.com/questions/1085/free-ocr-library.
In brief, you may wish to consider using tessnet (http://www.pixel-technology.com/freeware/tessnet2/).
This technology is called optical character recognition (OCR).
For programming, check out this question, which recommends tesseract-ocr.
OCR for Ruby? Check out this question.
If it's just a couple of images, here's a site that supposedly does it for free.
OCR Terminal http://www.ocrterminal.com has been the best (most accurate) free tool out of at least a dozen that I have used. It works especially well with formatted (table) data.

Resources