I need to test voice recognition programs, some whose code I have access to and others whose code I don't.
Sadly my (beautiful) voice is not perfect, so when I read a text aloud it sounds slightly different each time. That makes the testing difficult and time-consuming, given that I can tweak a lot of parameters.
So I was wondering if there is a way to record my own voice (already done) and then play it back as normal microphone input, so that the voice recognition program I am testing sees it as microphone input.
It would also help greatly if this could be done programmatically in C#, so that I can specify in my own code when to play what.
Playing it from speakers and having the voice recognition programs listen to the microphone is not an option, because the sound is not the same across different computers, speakers, and microphones.
Thanks.
Edit:
What I have found so far is to use a software sound card simulator, but I haven't been able to find a suitable one.
Just as there are printer drivers that do not connect to a printer at all but rather write to a PDF file, there are virtual audio drivers that do not connect to a physical microphone at all but can pipe input from other sources such as files or other programs.
I hope I'm not breaking any rules by recommending free/donation software, but VB-Audio Virtual Cable should let you create a pair of virtual input and output audio devices. Then you could play an MP3 into the virtual output device and then set the virtual input device as your "microphone". In theory I think that should work.
If all else fails, you could always roll your own virtual audio driver. Microsoft provides some sample code but unfortunately it is not applicable to the older Windows XP audio model. There is probably sample code available for XP too.
I am new to Arduino development and just started trying some of the provided examples for the MXChip devkit. What I'm trying to do now is access the analog readout from the microphone to get a rough estimate of sound levels. I tried to find information on how to do this and found some articles that use an Arduino board and an external microphone wired to the analog inputs. Since the devkit has a built-in microphone, I want to use that, but I don't know how to access it, and I can't find any information on the pin layout. Any help would be appreciated!
The microphone is not connected to the analog pins. It is connected to dedicated Audio codec hardware.
See https://microsoft.github.io/azure-iot-developer-kit/docs/apis/audio-v2/
The hardware does not seem to give you direct access to incoming values. It looks like you will need to record and then read the buffer to get audio input levels.
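To illustrate what "read the buffer to get audio input levels" could mean in practice, here is a minimal sketch of the arithmetic in Python (on the devkit itself the same calculation would be written in the Arduino sketch). It assumes the recorded buffer holds raw signed 16-bit little-endian PCM samples with any file header already stripped; that format is an assumption, not something confirmed above. The function names `rms_level` and `level_dbfs` are made up for this example.

```python
import math
import struct


def rms_level(pcm_bytes: bytes) -> float:
    """Return the RMS amplitude of signed 16-bit little-endian PCM samples."""
    count = len(pcm_bytes) // 2  # two bytes per sample (assumed format)
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def level_dbfs(pcm_bytes: bytes) -> float:
    """Return the RMS level in dB relative to full scale; -inf for silence."""
    rms = rms_level(pcm_bytes)
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / 32768.0)
```

Calling `rms_level` on each recorded chunk gives a rough loudness value; `level_dbfs` just puts it on a logarithmic scale, which tends to match perceived loudness better.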
I'm trying to create an interactive voice-tree for an art project. Think of something like a choose-your-own-adventure, but on the phone and with voice commands. I already have a fair amount of experience working with Construct 2 (game-making software), and can easily build a branching, voice-controlled interaction loadable through a modern browser with it. For reasons relevant to the overall story, I need players to connect to the interaction through a Google Voice number they will call.
I already have a GV number and have written an AutoHotKey script to auto-answer the Hangouts call, but I'm stuck trying to route the audio from the caller in Hangouts to the browser AND the audio response output of the browser back to the caller.
I know of an extremely primitive way to accomplish this, which I've illustrated with this diagram: [diagram]
Unfortunately, this is rather cumbersome and I suspect I can achieve my goal through virtualization or at the VERY least some sort of attenuation cables between two physical machines (I tried running a generic AUX cable between two laptops, but couldn't get speaker audio to go into microphone audio from one to the other).
I've been experimenting on Parallels running Windows 8.1 with Virtual Audio Cable (no luck), JACK (too robust), CheVolume (too limited), and IndieVolume (too limited).
I suspect VAC would be the best bet, but I can't seem to find a way to route Firefox audio output to a microphone input which directs to Chrome and vice versa. If I try accomplishing it all through just one virtual machine I have to use two different browsers for the voice-tree webpage and Hangouts call since Hangouts pushes its audio through Chrome (even the stand-alone application).
Is there any way to route microphone input and speaker output separately between two virtual machines? If not, could I still try to accomplish this with a specific type of cable between two laptops running Windows 7/8 that have generic audio jacks?
What I would like to do involves a small bit of hardware: 1) a phone headset, 2) a PCI modem, and 3) a phone wire. I would like to read audio from the modem and then digitize it for processing. I'm sure the best way to do this is with Linux, but if it can be done in Windows as well that would be awesome. As an extension of this, I would also like to convert digital audio to analog audio and send it to the modem so it can be heard from the headset.
Any advice would be greatly appreciated. (Also, if anybody has a general "pointer" to what I should investigate to replicate the audio stream to a TCP server so it can be accessed over the LAN, that would be even cooler. I know how to handle TCP well enough, but I haven't a clue about audio encoding/decoding.)
If anybody's curious, I'm wanting to create a home-wide audio-stream with ears and mouths. Since the phone cables can do that with normal headsets, I thought "why not".
Not just any modem will do. You need a "voice modem", which includes audio capability as well as general modem functionality. These devices usually expose themselves as a regular sound card on the system, once the drivers are installed. From there, you can use any mechanism you want to read/write from those audio streams.
Be warned though that your plan of a whole-house speakerphone isn't simple at all. There are significant feedback issues when using regular POTS lines. There are entire companies that work to solve this problem. The best of them use microphone arrays that are steerable in software. You would be better off using one of these off-the-shelf systems.
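On the side question about getting the audio onto a TCP server: on a LAN you may not need an encoder at all, since uncompressed telephone-quality PCM (8 kHz, 16-bit, mono) is only about 16 KB per second. One common approach is to send raw PCM chunks with a length prefix so the receiver knows where each chunk ends. The following is a minimal sketch of that framing in Python; the function names and the 4-byte big-endian prefix are choices made for this example, not part of any standard, and reading the chunks from the voice modem's sound device is left out.

```python
import socket
import struct

# 4-byte big-endian length prefix in front of every chunk (illustrative choice).
HEADER = struct.Struct("!I")


def send_chunk(sock: socket.socket, pcm: bytes) -> None:
    """Send one chunk of raw PCM bytes, prefixed with its length."""
    sock.sendall(HEADER.pack(len(pcm)) + pcm)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since TCP may deliver data in smaller pieces."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed the connection mid-frame")
        buf.extend(part)
    return bytes(buf)


def recv_chunk(sock: socket.socket) -> bytes:
    """Receive one length-prefixed chunk of raw PCM bytes."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)
```

The sender would call `send_chunk` for every block it reads from the audio device, and each listener would loop on `recv_chunk` and hand the bytes to its own audio output. Both sides have to agree on sample rate, sample width, and channel count out of band.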
I have developed a fairly complex piece of audio software for my client, with plugins for Winamp, Windows Media Player and VST. Now the client is interested in some method to avoid maintaining the multitude of plugins, since we have no way to support all the media players out there.
The client does not care about Unix/Mac yet, so I only need to look at Windows XP and Vista/7.
Basically, what we need is a way to reliably intercept as many audio stream protocols as possible (well, except maybe ASIO, that's another story, I guess), then pass this audio through our custom effects engine and route it back to the default audio device, whatever it is.
Now I am thinking, what options do I have (theoretically).
I could use hooks. I would need to globally hook the older waveOut API and also DirectSound.
But will this still work on Vista/7?
I could use a virtual driver, like the author of the Virtual Audio Cable did:
http://software.muzychenko.net/eng/vac.htm
Seems a pretty daunting task. Anyway, the client will contact the author of VAC to see if he agrees to sell his source code for a reasonable price.
This driver could install itself as a default audio output device, intercept the audio stream from Windows, and pass it back to default device. Hmm, but what about various DirectSound audio buffers, do I have to mix them myself or is there any way I could tell Windows mixer to mix all for me and pass a single mixed audio stream?
It seems this custom driver will of course kill all the hardware audio acceleration, but we can live with that if we warn our customers about the issue.
As I understand, the most current Windows driver standard is WDF.
But maybe it does not work for audio on Windows Vista/7?
I know Vista/7 has a different audio stack from XP.
If I can do it using WDF, which kind of driver should I write: kernel mode or user mode?
Maybe I am missing more elegant and simple options to intercept, process and route audio on Windows?
Try the Virtual Audio Streaming SDK. It is also a virtual sound card and lets you read and process audio data in real time.
http://www.virtualaudiostreaming.net/sdk-license.html
I have written an application that receives media files from a central server and plays those files according to a playlist. All works well.
A client has contacted us and wants to use our application to play some audio files as presentations in a kiosk-style application. So far, so good, our application can handle this no problems.
He has requested as a potential feature that we would have a number of headphone sockets at the front of the kiosk. Each headphone socket would play the same audio presentation in a different language.
I have come up with the idea of encoding a single audio file with the presentation in multiple languages, and each language in a different channel. We would then require a sound card that could decode each channel and output it on a different headphone socket.
Thing is, while I think the theory is sound, I have absolutely no idea whether this is feasible and what would be required to pull it off.
Any ideas?!
As a side-note: the application uses Media Player as the underlying component to handle the playback of audio and video. I'd appreciate any help as to the software we could use to generate the multi-channel audio stream and the hardware (USB sound card would be fine) that we could use to decode the stream.
Thanks!
You need to use multiple files, not channels; it's going to be way easier that way.
Instead of using Media Player, use DirectShow (on .NET you have DirectShow.NET). In DirectShow you have the notion of multiple files on the same graph.
You will be able to control which audio device plays which file, and your Play, Pause and Stop commands will be performed on all files without you needing to worry about syncing.
There are many samples showing how to build a media-player-like application with DirectShow; extending them to use multiple files should be really easy.
For HW take a look at this (USB with 8 output channels)
I think with Shay's hardware you've got a complete solution:
Encode a 7.1 file with a different mono voice track on each channel.
Use the 8 channel output device in 7.1 mode, with a different headset in each port, and you've got it. Or, if you only have 6 languages, a 5.1 file would work. Many PCs have 5.1 outputs built in, so you'd only need 3 splitters to break out the left and right channels from each jack.
You can do the encoding with Windows Media Encoder or another pro audio tool.
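If you would rather script the "one mono voice track per channel" step than use an encoder GUI, the interleaving itself is simple. Below is a minimal sketch using only Python's standard `wave` module; the function name `interleave_mono_tracks` is made up for this example, and it assumes every language track is already available as 16-bit mono samples at the same sample rate. It writes a plain PCM WAV header, and some players or drivers may expect an extended header for more than two channels, so treat it as a starting point rather than a finished encoder.

```python
import io
import struct
import wave


def interleave_mono_tracks(tracks, sample_rate, out_file):
    """Write one N-channel 16-bit WAV with tracks[i] on channel i.

    tracks: list of sequences of int samples in the range -32768..32767.
    out_file: a filename or a writable, seekable binary file object.
    """
    n_frames = max(len(t) for t in tracks)
    # Pad shorter tracks with silence so every channel has the same length.
    padded = [list(t) + [0] * (n_frames - len(t)) for t in tracks]
    frames = bytearray()
    for frame in zip(*padded):  # one sample from each track per frame
        frames += struct.pack("<%dh" % len(frame), *frame)
    with wave.open(out_file, "wb") as w:
        w.setnchannels(len(tracks))
        w.setsampwidth(2)  # 2 bytes = 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(bytes(frames))
```

With eight tracks this produces the sample layout of a 7.1 file, with each language on its own channel; which physical jack each channel ends up on depends on the sound card's channel mapping.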