Low audio quality with Microsoft Translator

I'm working on a desktop application built with XNA. It has a text-to-speech feature, and I'm using the Microsoft Translator V2 API to do the job. More specifically, I'm using the Speak method (http://msdn.microsoft.com/en-us/library/ff512420.aspx), and I play the audio with the SoundEffect and SoundEffectInstance classes.
The service works fine, but I'm having some issues with the audio. The quality is not very good and the volume is not loud enough.
I need a way to improve the volume programmatically (I've already tried some basic solutions from CodeProject, but the algorithms are not very good and the resulting audio is very low quality), or maybe use another API.
Are there any good algorithms to improve the audio programmatically? Are there other good text-to-speech APIs out there with better audio quality and WAV support?
Thanks in advance

If you are doing offline processing of audio, you can try Audacity; it has very good tools for that. If you are processing real-time streaming audio, you can try SoliCall Pro, which creates a virtual audio device and filters all the audio it captures.
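If you would rather boost the volume in your own code, the usual approach is to multiply each PCM sample by a gain factor and clip the result to the legal range. Below is a minimal sketch, written in C++ purely for illustration and assuming the WAV data is 16-bit signed PCM; the function name and the suggested gain values are my own, not part of any API mentioned above.

#include <algorithm>
#include <cstddef>
#include <cstdint>

// Multiply each 16-bit PCM sample by 'gain' and clip to the valid range.
// Hard clipping distorts once samples saturate, so keep the gain modest
// (for example 1.5-2.0) or normalize to the measured peak instead.
void ApplyGain(int16_t* samples, std::size_t count, float gain)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        float boosted = samples[i] * gain;
        boosted = std::max(-32768.0f, std::min(32767.0f, boosted));
        samples[i] = static_cast<int16_t>(boosted);
    }
}

Note that in XNA you can also raise SoundEffectInstance.Volume, but it is capped at 1.0, so amplifying the raw samples before creating the SoundEffect is what helps when the source itself is quiet.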

Related

What is the simplest way to implement a small-group, low-latency, one-to-many audio broadcast?

I have a Linode server and need to broadcast one-to-many audio (they can hear but cannot talk back) to a group of three to five people. I looked at WebRTC and the Janus server, but it seems like complete overkill. Using commercial applications like Skype, Discord, etc. results in low audio quality, and it is mono only. The best possible audio quality and low latency (on a par with that of Skype, Discord, etc.) are essential.
Any pointers would be greatly appreciated.
I can recommend building such a system based on Icecast streaming. It's an old, proven technology with latency close to real-time.
You could use any set of Icecast-enabled tools for that.
As an example, here's what you can do with our company's tools:
Larix Broadcaster mobile app allows streaming in audio-only mode.
Nimble Streamer software media server can take Larix's input and produce an Icecast stream. You can use any Icecast-enabled server here instead.
SLDP Player can play the Icecast stream produced by Nimble Streamer or any other Icecast-enabled server.
The same can also be built with other companies' products, so you can pick the right tools yourself.
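For instance, a similar pipeline can be assembled with stock open-source tools. As a sketch only: the server name, mount point, and password below are placeholders, an Icecast server is assumed to be already running on port 8000, and ffmpeg is assumed to be built with libmp3lame.
ffmpeg -re -i source.wav -c:a libmp3lame -b:a 128k -f mp3 icecast://source:hackme@yourserver:8000/live
Listeners can then point ffplay, VLC, or a plain browser at http://yourserver:8000/live.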
A super simple setup would be to just use the command-line tool ffmpeg (it also has an API); see the docs at https://trac.ffmpeg.org/wiki/ffserver
On the machine where your source audio lives, just launch either ffmpeg or ffserver:
ffserver -f /etc/ffserver.conf
In that config, put the location of the source audio and the output URL it will publish to ... then your client receivers can use ffplay:
ffplay <stream URL>
ffmpeg is a free, open-source industry workhorse for audio/video manipulation ... it's the underlying technology that more visible tools like VLC use under the covers.
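For orientation, a stripped-down ffserver.conf might look roughly like the sketch below. This is recalled from the sample config shipped with older ffmpeg releases, so treat every directive name as an assumption and verify it against the wiki page above; feed1.ffm, test.mp3, and the paths are placeholders.

HTTPPort 8090
HTTPBindAddress 0.0.0.0
MaxClients 100
MaxBandwidth 1000

# Source side: push audio into the feed with something like
#   ffmpeg -re -i source.wav http://localhost:8090/feed1.ffm
<Feed feed1.ffm>
File /tmp/feed1.ffm
FileMaxSize 1M
ACL allow 127.0.0.1
</Feed>

# Listener side: clients play http://<server>:8090/test.mp3 (e.g. with ffplay)
<Stream test.mp3>
Feed feed1.ffm
Format mp2
AudioCodec libmp3lame
AudioBitRate 128
AudioChannels 2
AudioSampleRate 44100
NoVideo
</Stream>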

Using Core Audio and wave audio together on Windows

I am planning to use the C++ Core Audio APIs to perform various audio-related operations in my application, like detecting device changes, detecting volume levels, etc. But there is also audio capture code in my solution that uses the old wave APIs (waveInXxx) which I don't want to touch right now.
Can I use the Core Audio APIs safely, and can the two (Core Audio and wave) coexist given that both would operate on the same audio endpoint? Will this lead to a crash or hang in my application?
Thanks in advance.
Yes, you can use the old wave APIs safely. They are now implemented in terms of Core Audio APIs.
This MSDN page describes how the old APIs are implemented in terms of Core Audio:
Interoperability with Legacy Audio APIs
And this page has a nice diagram showing how things are plugged together.
User-Mode Audio Components
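As a concrete illustration of the Core Audio side sitting alongside existing waveIn code, here is a minimal sketch that reads the master volume of the default capture endpoint through IAudioEndpointVolume. Error handling is omitted, and the choice of eCapture/eConsole is just an example; device-change notifications would be handled separately with IMMNotificationClient.

#include <windows.h>
#include <mmdeviceapi.h>
#include <endpointvolume.h>
#include <cstdio>

// Read the master volume of the default capture endpoint via Core Audio.
// This only observes the endpoint, so it does not disturb a waveIn capture
// stream that is already running elsewhere in the process.
int main()
{
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);

    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator),
                     reinterpret_cast<void**>(&enumerator));

    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eCapture, eConsole, &device);

    IAudioEndpointVolume* volume = nullptr;
    device->Activate(__uuidof(IAudioEndpointVolume), CLSCTX_ALL, nullptr,
                     reinterpret_cast<void**>(&volume));

    float level = 0.0f;
    volume->GetMasterVolumeLevelScalar(&level);
    std::printf("Capture endpoint volume: %.0f%%\n", level * 100.0f);

    volume->Release();
    device->Release();
    enumerator->Release();
    CoUninitialize();
    return 0;
}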

Audio enhancement on Windows

I am from a signal processing background. When I listen to audio from YouTube, I sometimes feel there is room to improve the audio quality at run time, with basic enhancements that adjust automatically to the audio content. I mean, there is a possibility of writing a generalized or feedback-based algorithm.
I know how to test my algorithm on offline audio (a recorded audio file), but I do not know how to interface my algorithm with YouTube audio. Is it possible to write something against a Windows API that activates whenever any audio plays on the machine?
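One route, offered only as a sketch and not as the definitive answer, is WASAPI loopback capture (Vista and later): opening the default render endpoint in loopback mode hands your code whatever the system is currently playing, including browser audio. Buffer and error handling are stripped down here, and the polling loop is just illustrative.

#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>

// Minimal WASAPI loopback capture: open the default render device in loopback
// mode so the capture buffers contain whatever the system is playing.
int main()
{
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);

    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator),
                     reinterpret_cast<void**>(&enumerator));

    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

    IAudioClient* client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr,
                     reinterpret_cast<void**>(&client));

    WAVEFORMATEX* format = nullptr;
    client->GetMixFormat(&format);  // usually 32-bit float at the mix rate

    // AUDCLNT_STREAMFLAGS_LOOPBACK is the key: capture from a render endpoint.
    client->Initialize(AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_LOOPBACK,
                       10000000 /* 1-second buffer */, 0, format, nullptr);

    IAudioCaptureClient* capture = nullptr;
    client->GetService(__uuidof(IAudioCaptureClient),
                       reinterpret_cast<void**>(&capture));
    client->Start();

    for (;;)
    {
        UINT32 packetFrames = 0;
        capture->GetNextPacketSize(&packetFrames);
        while (packetFrames != 0)
        {
            BYTE* data = nullptr;
            UINT32 frames = 0;
            DWORD flags = 0;
            capture->GetBuffer(&data, &frames, &flags, nullptr, nullptr);

            // ... run the enhancement/analysis algorithm on 'frames' frames ...

            capture->ReleaseBuffer(frames);
            capture->GetNextPacketSize(&packetFrames);
        }
        Sleep(10);  // simple polling; an event-driven client is nicer
    }
}

Note that loopback capture only lets you observe and analyze the stream; to actually change what comes out of the speakers you would need something like a system-wide APO or a virtual audio device.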

How to add live video streaming to a website?

Hi everyone! I am looking to create a website with a live video streaming feature.
I have done some research and read about some applications including Flash Media Live Encoder.
Can anyone please guide me on how to start with this? Thanks!
It really depends on your requirements.
Do you need live streaming for a big event or a small event (what is your bandwidth)?
Do you need to stream to different devices (desktop + mobile)?
Do you have to stream your desktop/webcam, or high-quality camera feeds through capture cards?
Are you flexible about the operating system?
Your question is too general. FMLE + FMS is a good solution, but FMS can be expensive.
Also have a look at Wowza.
If you just need a few live videos on your website, the solution is quite simple: Flash Media Live Encoder plus Flash Media Server are suitable.

javame: is it possible to disable AGC / VAD when recording using the microphone?

We are developing an application which takes audio from the microphone and does some analysis. During the analysis we found that AGC is implemented in the microphone subsystem. I have also heard that VAD is used.
Is there any other post-processing done on the audio (PCM) before it is delivered to the application?
Is it possible for the application to disable the AGC and VAD post-processing? Is it possible in JavaME, or with some proprietary API, such as Nokia's or Samsung's?
See my answers to my own questions:
Unknown.
Impossible in JavaME. If you are working on Symbian/S60 devices, you could check whether Qt or Symbian C++ has such a capability. For example, I found the following info on the web, but did not check it: "There is an API called SetGain/GetMaxGain in CMdaAudioInputStream, but in S60 phones the range is between 1-1, so not very useful using this API. But you can use CVoIPAudioUplinkStream which allows you to dynamically control the audio gain and other codec properties". Try it if you are interested.
