How to do isolated word recognition using pocketsphinx - cmusphinx

I tried to follow this link for isolated word recognition:
Speech to text for single word
But when I provide -keyphrase, it returns the keyphrase word as the result even if I give a wrong keyphrase.
Is this expected?
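One thing that may explain this: with -keyphrase the decoder runs in keyword-spotting mode, so the keyphrase is the only result it can report, and a detection threshold decides how readily it fires. A minimal sketch, assuming the older pocketsphinx_continuous command-line tool and its -infile, -keyphrase and -kws_threshold options. The helper name, the file name, the keyphrase and the threshold value are illustrative only, and the threshold would need tuning on your own recordings:

```python
import shlex


def build_kws_command(wav_path, keyphrase, kws_threshold=1e-20):
    """Build an argv list for keyword spotting with pocketsphinx_continuous.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "pocketsphinx_continuous",
        "-infile", wav_path,
        "-keyphrase", keyphrase,
        "-kws_threshold", str(kws_threshold),  # assumed starting value, tune it
    ]


cmd = build_kws_command("word.wav", "forward")
print(shlex.join(cmd))  # pass `cmd` to subprocess.run to actually execute it
```

Running the same recording with a few different thresholds should show whether the false detections come from a threshold that is too permissive.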

Related

A way to translate images?

In Python, I am attempting to translate Arabic characters within an image. I can provide the source language (Arabic) and the destination language (English). Is there a free Python library or API that I can use for this, i.e. one that provides a service like https://translate.google.com and allows cloud image translation (uploading images containing untranslated characters and downloading images with the translated characters in their place)?
Or is there a library to do this locally on my system, i.e. detect the Arabic characters in an image, extract them, send them to a cloud translation service (e.g. Google Translate), and then modify the image by replacing the Arabic characters with the translated English characters?
So my goal is to replace the Arabic characters in an image with their English translation. I know Yandex (https://translate.yandex.com/ocr) allows this, but you must pay for their translation API. How could I do this?
While I'm not sure whether Arabic is supported, there are libraries such as OpenCV (the cv2 module) for Python and pytesseract that can extract text from an image. You can then use another library, such as translate, to finish the process from there: https://pypi.org/project/translate/
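The extract-then-translate idea could be sketched roughly like this. It assumes pytesseract, Pillow, Tesseract's Arabic language data and the translate package are installed; the function names are made up for illustration, and the OCR and translation backends are passed in as callables so they can be swapped. Writing the translated text back into the image is not covered.

```python
def translate_image_text(image_path, ocr, translate):
    """Extract text from an image and translate it line by line.

    `ocr` takes an image path and returns the recognized text;
    `translate` takes a string and returns its translation.
    """
    text = ocr(image_path)
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return [(line, translate(line)) for line in lines]


def tesseract_ocr(image_path, lang="ara"):
    # Assumption: pytesseract, Pillow and the "ara" language data are installed.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)


def package_translate(text, from_lang="ar", to_lang="en"):
    # Assumption: the `translate` package from PyPI is installed.
    from translate import Translator
    return Translator(from_lang=from_lang, to_lang=to_lang).translate(text)


# Stand-in backends so the sketch runs without OCR or network access.
pairs = translate_image_text(
    "sign.png",
    ocr=lambda path: "line one\n\n line two \n",
    translate=str.upper,
)
print(pairs)
```

With the real backends, the call would be translate_image_text(path, tesseract_ocr, package_translate).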

How to detect language from audio in python?

I have tried using FFmpeg to extract the audio from the video.
How can I transcribe the speech to text and detect the language?
I have tried using:
Myspokenlangauge
Google Cloud Speech-to-Text API
You can use speech-to-text to recognize the words.
Send the words to Google Translate, which detects the language automatically.
Then you can scrape the auto-detected language tag from the page.
Alternatively, run the speech-to-text engine using one of the most common languages for your use case.
Create a labelled dataset and train an NLP model for text classification.
Use this model to classify the language of the text coming out of the STT engine.
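As a rough illustration of the text-classification step, here is a toy sketch in which a character-trigram frequency profile stands in for the trained NLP model. The sample sentences and labels are invented, and a real dataset would need far more text per language:

```python
from collections import Counter


def trigrams(text):
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(labelled_texts):
    """Build one trigram frequency profile per language label."""
    profiles = {}
    for label, text in labelled_texts:
        profiles.setdefault(label, Counter()).update(trigrams(text))
    return profiles


def detect(profiles, text):
    """Return the label whose profile overlaps most with the text's trigrams."""
    grams = Counter(trigrams(text))

    def score(profile):
        total = sum(profile.values())
        return sum(count * profile[g] / total for g, count in grams.items())

    return max(profiles, key=lambda label: score(profiles[label]))


# Invented training data; in practice this is the labelled STT output.
samples = [
    ("en", "the quick brown fox jumps over the lazy dog and the cat"),
    ("en", "this is the text that came out of the speech engine"),
    ("de", "der schnelle braune fuchs springt über den faulen hund"),
    ("de", "das ist der text aus der spracherkennung und nicht mehr"),
]
profiles = train(samples)
print(detect(profiles, "where is the train station"))
print(detect(profiles, "wo ist der bahnhof"))
```

Keep in mind that the STT engine was run with one fixed language, so its output for other languages will be distorted, and the classifier has to be trained on text of that kind.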

Translating text in Microsoft translator removes //

I am trying to translate a code file, and I see the following issues with Microsoft Translator.
Translating the text below
// 行番号削除正常終
gives the output
Line number deletion successful end
where the // (double slash) has been removed.
Translating the text below
System.out.println("【○】COBOL
gives the output
System.out.println_○○ COBOL
where the special characters have been removed or replaced with others.
I am reading my file using CP932 encoding and writing it using UTF-8.
Please let me know if you have any ideas on how to resolve this issue.
I have tried Google Translate, and it works well with the same encodings.
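One possible workaround, not verified against Microsoft Translator, is to keep the comment marker out of the text that is sent for translation and reattach it afterwards. In this sketch `translate` is a hypothetical callable wrapping whichever translation service is used, and the function names are illustrative. It handles only whole-line // comments, not string literals inside code such as the println example:

```python
def translate_comment_line(line, translate, marker="//"):
    """Translate only the text after a leading comment marker, keeping the marker."""
    stripped = line.lstrip()
    if not stripped.startswith(marker):
        return line  # not a comment line: leave the code untouched
    indent = line[: len(line) - len(stripped)]
    body = stripped[len(marker):].strip()
    return f"{indent}{marker} {translate(body)}"


def translate_file(src_path, dst_path, translate):
    # Read as CP932 and write as UTF-8, as described in the question.
    with open(src_path, encoding="cp932") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(translate_comment_line(line.rstrip("\n"), translate) + "\n")


# Stand-in translator so the sketch runs offline.
known = {"行番号削除正常終": "Line number deletion successful end"}
fake_translate = lambda text: known.get(text, text)

result = translate_comment_line("// 行番号削除正常終", fake_translate)
print(result)
```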

Azure OCR unable to detect Roman numerals "I" and "II"

I have this image.
I am using Azure Computer Vision API v2.0, specifically the combination of the Recognize Text API (POST) and Get Recognize Text Operation Result (GET), as described at https://westcentralus.dev.cognitive.microsoft.com/docs/services/5adf991815e1060e6355ad44/operations/587f2c6a154055056008f200, to detect text characters in the image.
Currently it is able to detect all the characters except the Roman numerals I and II.
Can someone help?

text to phonemes converter

I'm searching for a tool that converts text to phonemes (like text-to-speech software does).
I could program one, but it would not be error-free and would take a lot of time!
So my question is:
Is there a simple tool for converting e.g.
"hello" to "HH AH0 L OW1"?
Maybe a command-line tool, so I can capture the stdout?
I'm looking for phonemes in 'Arpabet' style (see the 'hello' example).
espeak does something like that, but the output is not in Arpabet style and the phonemes are not separated by a delimiter.
If you had searched for Arpabet on Wikipedia, you would have found your answer. The CMU team has prepared scripts that convert most English words to their Arpabet phonetic breakdown.
If you want the phone sequence for a couple of words, you can use their interface here. But if you want it for a big file, you might have to run their scripts on your own. They used to have a working page here, but it does not seem to be working now.
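For a big file, a dictionary lookup may be enough. Below is a minimal sketch, assuming the CMU Pronouncing Dictionary text format (one "WORD  PH1 PH2 ..." entry per line, ";;;" comment lines, alternate pronunciations marked as "WORD(1)"). The function names are illustrative, and the two-entry sample stands in for the real dictionary file:

```python
def parse_cmudict(lines):
    """Parse CMUdict-style lines into a word -> phoneme-list lookup table."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phones = line.split()
        word = word.split("(")[0].lower()  # "HELLO(1)" -> "hello"
        table.setdefault(word, phones)     # keep the first pronunciation only
    return table


def to_arpabet(text, table):
    """Return one Arpabet string per word; unknown words map to None."""
    return [" ".join(table[w]) if w in table else None
            for w in text.lower().split()]


SAMPLE = """\
;;; tiny excerpt in CMUdict format, for illustration only
HELLO  HH AH0 L OW1
WORLD  W ER1 L D
"""
table = parse_cmudict(SAMPLE.splitlines())
print(to_arpabet("hello world", table))
```

Words missing from the dictionary come back as None, so they can be collected and handled separately.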
