I am building a music retrieval system with the HTK toolkit.
In particular, I would like to do singer recognition.
I think what I want to do is close to speaker recognition,
so I looked at some samples and read the HTK Book,
but I still can't find a good reference.
Are there any good references for singer recognition or speaker recognition?
You can also try Microsoft Speaker Recognition APIs: https://www.microsoft.com/cognitive-services/en-us/speaker-recognition-api#verification.
You can use the APIs for both verification & identification. Here are their C# & Python SDKs: https://github.com/Microsoft/ProjectOxford-ClientSDK/tree/master/SpeakerRecognition
From the HTK website:
http://www.bas.uni-muenchen.de/Bas/SV/
It uses HTK for speaker identification.
From Google:
http://code.google.com/p/sibo-htk/
Luis
ASR Labs
www.asrlabs.com.br
I am making an AI assistant using Python's TensorFlow module, and now I am trying to give it a voice. Google Assistant, Cortana, and Siri all have their own voices, but I don't know how to make an artificial voice. I searched the web but didn't find any helpful answers.
Can someone please tell me how to make an artificial voice, or at least which methods I should look into? I don't know what this process is called, which is probably why I can't find anything on the web. Any help would be appreciated!
The easiest way to add a voice to your AI assistant is to use a text-to-speech library like:
pyttsx3
gTTS
Google's text-to-speech
If you want to add your own voice, you could use deep learning for that, like in:
Real-Time-Voice-Cloning
more approaches in this article
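As a minimal sketch of the first option, the snippet below speaks a sentence with pyttsx3, which runs offline using the speech engine installed on your system. The `speak` helper and its default rate of 150 are illustrative assumptions, not part of pyttsx3; only `init`, `setProperty`, `say`, and `runAndWait` come from the library.

```python
def speak(text, engine=None, rate=150):
    """Speak `text` aloud and return the engine that was used.

    `speak` is a hypothetical helper for this example. If no engine is
    passed in, a pyttsx3 engine is created on demand.
    """
    if engine is None:
        import pyttsx3  # third-party: pip install pyttsx3
        engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(text)                  # queue the sentence
    engine.runAndWait()               # block until the queue has been spoken
    return engine
```

Calling `speak("Hello, I am your assistant")` should play the sentence through the default audio output, assuming pyttsx3 and a system speech engine are installed.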
Hi everyone, I am learning speech recognition in Python and I am curious whether it can be used offline. I mean, we use:
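import speech_recognition as sr

rec = sr.Recognizer()
with sr.Microphone() as source:
    audio = rec.listen(source)  # record from the default microphone
said = rec.recognize_google(audio)  # requires an internet connection
print(said)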
to recognize our speech; however, recognize_google() doesn't work without an internet connection. Is there any other way that works offline? I'd be grateful for any help.
I am assuming that you're using the Python SpeechRecognition library. This library can be used with CMU Sphinx, which works offline.
The pocketsphinx library is set up to work offline by default, so it might be a good choice if you're just getting started.
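A rough sketch of the switch, assuming both SpeechRecognition and pocketsphinx are installed: the recognizer's `recognize_sphinx` method takes the place of `recognize_google`. The helper names `transcribe` and `listen_and_transcribe` are made up for this example.

```python
def transcribe(recognizer, audio, offline=True):
    """Return the recognized text, using CMU Sphinx when offline is True."""
    if offline:
        return recognizer.recognize_sphinx(audio)  # runs locally via pocketsphinx
    return recognizer.recognize_google(audio)      # needs an internet connection


def listen_and_transcribe(offline=True):
    """Record one phrase from the microphone and transcribe it."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition pocketsphinx

    rec = sr.Recognizer()
    with sr.Microphone() as source:
        audio = rec.listen(source)
    try:
        return transcribe(rec, audio, offline=offline)
    except sr.UnknownValueError:
        # The engine could not make sense of the audio.
        return None
```

`print(listen_and_transcribe())` would then record from the default microphone and print the offline result, or `None` if nothing was understood. Expect lower accuracy from Sphinx than from the online Google recognizer.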
I want to develop a speech recognition system for the Punjabi language, but I am not able to find any library that supports Punjabi and works efficiently.
You can build and use your own acoustic models with CMUSphinx:
http://cmusphinx.sourceforge.net
For details, see https://stackoverflow.com/a/8215967/432021
The Jarvis application currently available is in English. I want to customize it to use a local language. How do I develop this kind of app for local languages? What programming languages must I know to proceed with the development? I have tested the English version of Jarvis and it works well for me. How do I connect C# with HTK for this development?
How to develop this kind of app for local languages?
You don't need to develop from scratch; take existing software and build on it. For example, consider https://github.com/jasperproject/jasper-client, which is pretty actively developed.
What kind of programming languages must I know to proceed with the development?
Most NLP libraries are in Python or Java. You also need scripting experience (shell, awk/perl) because models are often built with Linux tools.
For speech recognition, it's easiest to use CMUSphinx; the tutorial for adding your language to CMUSphinx is at http://cmusphinx.sourceforge.net/wiki/tutorialam.
I have tested the English version of Jarvis and it works well for me. How do I connect C# with HTK for this development?
There are several ways to achieve interoperability:
1) C# can invoke the HTK tools as binaries through Process.Start http://msdn.microsoft.com/en-us/library/system.diagnostics.process.start(v=vs.110).aspx
2) You can build a library from HTK and call it via P/Invoke through the interop framework
3) You can build a TCP or HTTP server around the HTK tools and connect to it from the C# application to get speech recognition results.
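To illustrate option 1, here is a sketch of invoking an HTK tool as an external process. It is written in Python with `subprocess` rather than C#, but the pattern is the same one `Process.Start` would follow: build the argument list, run the binary, read its output. It assumes the `HVite` recognizer is on your PATH; the helper names are made up for this example, and every file name in the usage note is a placeholder for your own trained models.

```python
import subprocess


def build_hvite_command(hmm_defs, script_file, output_mlf, word_net, dictionary, hmm_list):
    """Build an HVite command line (all arguments are paths to your own files)."""
    return [
        "HVite",
        "-H", hmm_defs,     # trained HMM definitions
        "-S", script_file,  # script file listing the feature files to recognize
        "-i", output_mlf,   # master label file that receives the transcriptions
        "-w", word_net,     # word network (recognition grammar)
        dictionary,         # pronunciation dictionary
        hmm_list,           # list of HMM names
    ]


def run_hvite(**paths):
    """Run HVite and return the completed process; raises if HVite exits with an error."""
    cmd = build_hvite_command(**paths)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

For example, `run_hvite(hmm_defs="hmmdefs", script_file="test.scp", output_mlf="recout.mlf", word_net="wdnet", dictionary="dict", hmm_list="hmmlist")` would write the recognition results to `recout.mlf`, which the calling application can then parse.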
Overall, you could probably use the existing solutions mentioned above; they have all the hard parts implemented, so you only need to configure them for your local language.
I would suggest going with HTK or, if you have lots of training data, Kaldi, one of the best toolkits for speech recognition in local languages, which uses deep learning.
I want to make multiplayer games with J2ME, but I couldn't find any game source code.
Where can I find sample game sources?
Thanks.
Check out the following SVN repositories:
http://code.google.com/p/oppositelock/
http://code.google.com/p/oware-midlet/
You might want to look at third-party J2ME gaming SDK sample code as well. Check out the Tic-Tac-Toe sample from Skiller, a free, powerful, and easy-to-use SDK. The advantage of such third-party SDKs is that a number of features are already implemented, so you don't need to reinvent the wheel. I hope that helps.