I am interested in developing a domain-specific language that could be driven by speech input alone. Are there examples of programming languages designed specifically for speech-only input? We can assume some form of feedback and storage (for humans), probably a screen, although I'd also be interested in languages that have no feedback. I'd be interested in both formal grammars and natural language input.
LOLCODE :)
HAI
CAN HAS STDIO?
I HAS A VAR
IM IN YR LOOP
    UP VAR!!1
    IZ VAR BIGGER THAN 10? KTHX
    VISIBLE VAR
IM OUTTA YR LOOP
KTHXBYE
Might be related: Is There a Human Readable Programming Language?
Perligata is highly pronounceable (http://www.csse.monash.edu.au/~damian/papers/HTML/Perligata.html)
use Lingua::Romana::Perligata;
maximum inquementum tum biguttam egresso scribe.
meo maximo vestibulo perlegamentum da.
da duo tum maximum conscribementa meis listis.
dum listis decapitamentum damentum nexto
fac sic
    nextum tum novumversum scribe egresso.
    lista sic hoc recidementum nextum cis vannementa da listis.
cis.
I am trying to build a program that will find which page/sentence in a book is being read into a microphone. I have the book's text and its audio recording. The user starts reading from a random page, and the program is supposed to sync to the user and show the section of the book that is being read. It might seem like a useless program, but please bear with me.
Would an approach similar to Shazam-like programs work? I am not sure how effective those algorithms are for speech. Also, the speaker will be different and might have an accent or read at a different speed.
Another approach would be converting the speech to text and searching for the text in the book. The problem is that the book's language is a rare one for which no language model is available. In addition, the script does not use Latin characters, which makes the programming difficult (for me at least).
Are there any solutions anyone can recommend? Would extracting features from the audio file and comparing them with features extracted in real time from the microphone work? If so, which features?
Is there any implementation/code I can start with? Any language is OK, but C is preferred.
You need to use a speech recognizer.
1. Create a language model directly from the book text. That will make recognition of the book being read very accurate, both for the original narration and for the user's reading.
2. Use this language model to recognize the book audio and assign timestamps to the words, or use a more advanced algorithm to perform text-to-audio alignment.
3. Recognize the user's speech with the book-specific language model and use the recognized text to display the position in the book.
You can use CMUSphinx for the mentioned tasks.
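For reference, here is a minimal sketch of the recognition step using the pocketsphinx C API, assuming a book-specific language model (book.lm) and dictionary (book.dict) have already been built from the book text. The model paths are placeholders, and the two-argument ps_get_hyp matches recent pocketsphinx releases (older ones take a third utterance-id parameter).

#include <pocketsphinx.h>
#include <cstdio>

int main() {
    // Placeholder model paths: an acoustic model for the book's language,
    // plus the LM and dictionary generated from the book text.
    cmd_ln_t* config = cmd_ln_init(nullptr, ps_args(), TRUE,
                                   "-hmm", "model/acoustic",
                                   "-lm", "book.lm",
                                   "-dict", "book.dict",
                                   nullptr);
    ps_decoder_t* ps = ps_init(config);

    // Decode a raw 16 kHz / 16-bit mono capture of the user reading.
    FILE* fh = std::fopen("reading.raw", "rb");
    int16 buf[512];
    size_t n;
    ps_start_utt(ps);
    while ((n = std::fread(buf, sizeof(int16), 512, fh)) > 0)
        ps_process_raw(ps, buf, n, FALSE, FALSE);
    ps_end_utt(ps);

    int32 score;
    std::printf("heard: %s\n", ps_get_hyp(ps, &score));  // search this in the book text

    std::fclose(fh);
    ps_free(ps);
    cmd_ln_free_r(config);
    return 0;
}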
For a school project I am looking into natural language programming and thinking about how the concept might be applied to Arduino.
Think along the lines of a piece of software that would translate sentences like "If the analog sensor on pin 9 reads more than 2 volts, set the duty cycle of the servo on pin 10 to 70%" or "If the digital sensor on pin 4 reads high, light the onboard LED for 5 seconds" into Arduino code. I suspect that doing this for basic Arduino use cases should be straightforward compared to more general applications.
Does such a thing exist for Arduino? Does it exist for any other popular high-level language, like Python or MATLAB? Could anyone recommend resources on natural language processing for an absolute beginner (more specifically, a graduate student without a CS background who knows his way around Python, C#, MATLAB and, obviously, Arduino)?
You can check out NLTK, a natural language processing library for Python.
Translate the sentence into commands using that library, then use the commands to generate the Arduino code.
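As a toy illustration of that pipeline (in C++ to keep it self-contained; the one-pattern regex "grammar" below is hypothetical and stands in for real parsing with NLTK), the command-to-code step might look like this:

#include <cstdio>
#include <regex>
#include <string>

int main() {
    std::string sentence =
        "If the digital sensor on pin 4 reads high, "
        "light the onboard LED for 5 seconds.";

    // One hard-coded sentence shape; a real tool would need a much
    // richer grammar and proper error reporting.
    std::regex pattern(
        "If the digital sensor on pin (\\d+) reads (high|low), "
        "light the onboard LED for (\\d+) seconds\\.");
    std::smatch m;
    if (std::regex_match(sentence, m, pattern)) {
        const char* level = (m[2].str() == "high") ? "HIGH" : "LOW";
        std::printf("void loop() {\n"
                    "  if (digitalRead(%s) == %s) {\n"
                    "    digitalWrite(LED_BUILTIN, HIGH);\n"
                    "    delay(%d);\n"
                    "    digitalWrite(LED_BUILTIN, LOW);\n"
                    "  }\n"
                    "}\n",
                    m[1].str().c_str(), level, std::stoi(m[3].str()) * 1000);
    }
    return 0;
}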
Another approach is to use Firmata, a protocol that allows you to send commands to the Arduino over a serial port.
I want to learn DirectShow and Media Foundation programming from the basics. I need help finding training resources (website links, etc.).
What prerequisites should one have to start with DirectShow and MF programming?
I think I need COM programming basics for this. Since I need to ramp up quickly on DirectShow and MF, it would be very helpful if someone could tell me which parts of COM I should know for DShow and MF programming. (As I don't have much time, I need to get through COM quickly so that I can spend more time ramping up on DirectShow and MF.)
Training links on COM would also be very helpful.
I am new to COM, MFC, DirectShow, Media Foundation, etc. (Training links chosen with this in mind would be very helpful for ramping up from the basics.)
Thanks in advance.
Here's a good book for getting started with Media Foundation: http://www.docstoc.com/docs/109589628/Developing-Microsoft-Media-Foundation-Applications#.
Multimedia APIs have not drawn enough attention to result in many books. I am not aware of any good resources for MF, due to the limited interest in Media Foundation, and a good book for DirectShow is a question brought up many times over the years; you will find answers in DirectShow introduction material and in other topics and sites. Additionally, you will perhaps want some basic introduction to digital video/audio, such as a book mentioned in Video Editing Books.
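To make the "which parts of COM" question concrete: DirectShow mostly exercises four COM idioms, namely initializing COM, creating an object by CLSID, asking it for interfaces with QueryInterface, and reference counting with Release. A minimal sketch, with error handling trimmed:

#include <dshow.h>  // link against strmiids.lib and ole32.lib

int main() {
    // 1. Initialize COM on this thread.
    HRESULT hr = CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED);
    if (FAILED(hr)) return 1;

    // 2. Create an object by CLSID: every DirectShow app starts with the filter graph.
    IGraphBuilder* graph = nullptr;
    hr = CoCreateInstance(CLSID_FilterGraph, nullptr, CLSCTX_INPROC_SERVER,
                          IID_IGraphBuilder, reinterpret_cast<void**>(&graph));
    if (SUCCEEDED(hr)) {
        // 3. QueryInterface to reach other interfaces on the same object.
        IMediaControl* control = nullptr;
        hr = graph->QueryInterface(IID_IMediaControl,
                                   reinterpret_cast<void**>(&control));
        // 4. Reference counting: Release everything you obtained.
        if (SUCCEEDED(hr)) control->Release();
        graph->Release();
    }
    CoUninitialize();
    return 0;
}

Once those four idioms feel comfortable, you know enough COM to start on the DirectShow introductory material.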
I would like to synchronize a spoken recording against a known text. Is there a speech-to-text / natural language processing library that would facilitate this? I imagine I'd want to detect word boundaries and compute candidate matches from a dictionary. Most of the questions I've found on SO concern written language.
Desired, but not required:
Open Source
Compatible with American English out-of-the-box
Cross-platform
Thoroughly documented
Edit: I realize this is a very broad, even naive, question, so thanks in advance for your guidance.
What I've found so far:
OpenEars (iOS Sphinx/Flite wrapper)
Forced Alignment
It sounds like you want to do forced alignment between your audio and the known text.
Pretty much all research/industry grade speech recognition systems will be able to do this, since forced alignment is an important part of training a recognition system on data that doesn't have phone level alignments between the audio and the transcript.
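The acoustic side of forced alignment happens at the phone level inside the recognizer, but the text side, mapping recognizer output back onto positions in the known transcript, reduces to an ordinary edit-distance alignment over words. A rough sketch of that step (my own illustration, not code from any of the toolkits mentioned here):

#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// For each hypothesis word, return the index of the transcript word it
// aligns to, or -1 if the word is an insertion.
std::vector<int> alignWords(const std::vector<std::string>& hyp,
                            const std::vector<std::string>& ref) {
    size_t n = hyp.size(), m = ref.size();
    // Standard Levenshtein DP table over words instead of characters.
    std::vector<std::vector<int>> d(n + 1, std::vector<int>(m + 1));
    for (size_t i = 0; i <= n; ++i) d[i][0] = (int)i;
    for (size_t j = 0; j <= m; ++j) d[0][j] = (int)j;
    for (size_t i = 1; i <= n; ++i)
        for (size_t j = 1; j <= m; ++j) {
            int sub = d[i - 1][j - 1] + (hyp[i - 1] != ref[j - 1]);
            d[i][j] = std::min({sub, d[i - 1][j] + 1, d[i][j - 1] + 1});
        }
    // Trace back, preferring matches/substitutions, to recover the mapping.
    std::vector<int> map(n, -1);
    size_t i = n, j = m;
    while (i > 0 && j > 0) {
        if (d[i][j] == d[i - 1][j - 1] + (hyp[i - 1] != ref[j - 1])) {
            map[i - 1] = (int)j - 1; --i; --j;
        } else if (d[i][j] == d[i - 1][j] + 1) {
            --i;  // insertion in the hypothesis
        } else {
            --j;  // transcript word that was skipped
        }
    }
    return map;
}

int main() {
    std::vector<std::string> hyp = {"the", "cat", "sat"};
    std::vector<std::string> ref = {"the", "black", "cat", "sat"};
    for (int idx : alignWords(hyp, ref)) std::printf("%d ", idx);  // prints: 0 2 3
    std::printf("\n");
    return 0;
}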
Alignment with CMUSphinx
The Sphinx4-1.0 beta 5 release of CMU's open source speech recognition system now includes a demo on how to do alignment between a transcript and long speech recordings.
Can you please provide a reference to help me understand how scanline-based rendering engines work?
I want to implement a 2D rendering engine that supports region-based clipping, basic shape drawing and filling with anti-aliasing, and basic transformations (perspective, rotation, scaling). I need algorithms that prioritize performance over quality, because I am targeting embedded systems with no FPU.
I'm probably showing my age, but I still love my copy of Foley, van Dam, Feiner, and Hughes (the White Book).
Jim Blinn had a great column that's available as a book called Jim Blinn's Corner: A Trip Down the Graphics Pipeline.
Both of these are quite dated now, and aside from the principles of 3D geometry, they're not very useful for programming today's powerful pixel pushers.
OTOH, they're probably just perfect for an embedded environment with no GPU or FPU!
Here is a good series of articles by Chris Hecker that covers software rasterization:
http://chrishecker.com/Miscellaneous_Technical_Articles
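For a flavor of what those articles build toward, here is a bare-bones scanline triangle fill, the inner core of a software rasterizer: it steps the left and right edges one row at a time in 16.16 fixed point (no FPU needed) and fills the span between them. Clipping, fill rules, and anti-aliasing are omitted; the articles cover those properly.

#include <algorithm>
#include <cstdint>

struct Pt { int x, y; };

// Fill one horizontal run of pixels in a 32-bit framebuffer.
static void drawSpan(int y, int x0, int x1, uint32_t color,
                     uint32_t* fb, int pitch) {
    if (x0 > x1) std::swap(x0, x1);
    for (int x = x0; x <= x1; ++x) fb[y * pitch + x] = color;
}

void fillTriangle(Pt a, Pt b, Pt c, uint32_t color, uint32_t* fb, int pitch) {
    // Sort vertices so a.y <= b.y <= c.y.
    if (a.y > b.y) std::swap(a, b);
    if (b.y > c.y) std::swap(b, c);
    if (a.y > b.y) std::swap(a, b);
    if (a.y == c.y) return;  // zero-height triangle

    // Per-scanline x step of an edge, in 16.16 fixed point.
    auto step = [](Pt p, Pt q) { return (q.x - p.x) * 65536 / (q.y - p.y); };
    int32_t xac = a.x * 65536;   // the long edge a->c spans both halves
    int32_t dac = step(a, c);

    if (b.y > a.y) {             // upper half: edges a->b and a->c
        int32_t xab = a.x * 65536, dab = step(a, b);
        for (int y = a.y; y < b.y; ++y, xab += dab, xac += dac)
            drawSpan(y, xab >> 16, xac >> 16, color, fb, pitch);
    }
    if (c.y > b.y) {             // lower half: edges b->c and a->c
        int32_t xbc = b.x * 65536, dbc = step(b, c);
        for (int y = b.y; y < c.y; ++y, xbc += dbc, xac += dac)
            drawSpan(y, xbc >> 16, xac >> 16, color, fb, pitch);
    }
}

int main() {
    static uint32_t fb[64 * 64] = {};
    fillTriangle({5, 5}, {60, 20}, {20, 60}, 0xFFFFFFFFu, fb, 64);
    return 0;
}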
And here is a site that discusses, and includes code for, a software rasterizer. It was written for a system without an FPU (the GP2X) and includes source for a fixed-point math library.
http://www.trenki.net
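The fixed-point arithmetic such a library rests on comes down to a handful of one-line operations. A minimal Q16.16 sketch (the names and exact format are my own, not necessarily trenki's):

#include <cstdint>
#include <cstdio>

typedef int32_t fixed;                  // Q16.16: 16 integer bits, 16 fraction bits
static const fixed FIX_ONE = 1 << 16;

inline fixed fix_from_int(int v)     { return v * FIX_ONE; }
inline fixed fix_from_float(float f) { return (fixed)(f * FIX_ONE); }
inline float fix_to_float(fixed v)   { return (float)v / FIX_ONE; }

// Multiply and divide need a 64-bit intermediate to keep the radix point.
inline fixed fix_mul(fixed a, fixed b) { return (fixed)(((int64_t)a * b) >> 16); }
inline fixed fix_div(fixed a, fixed b) { return (fixed)((((int64_t)a) << 16) / b); }

int main() {
    fixed x = fix_from_float(3.25f);
    fixed y = fix_from_float(1.5f);
    std::printf("%f\n", fix_to_float(fix_mul(x, y)));  // prints 4.875000
    return 0;
}

On a CPU without an FPU every float operation is emulated in software, so keeping coordinates and interpolants in this format replaces each emulated call with one or two integer instructions.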
I'm not sure about the rest, but I can help you with fast scaling and 2D rotation for ARM (written in assembly language). Check out a demo:
http://www.modaco.com/content/smartphone-software-games/291993/bbgfx-2d-graphics-library-beta/
L.B.