Using Core Audio and wave audio together on Windows

I am planning to use the C++ Core Audio APIs to perform various audio-related operations in my application, like detecting device changes, detecting volume levels, etc. But there is also audio capture code in my solution that uses the old wave APIs (waveIn*) which I don't want to touch right now.
Can I use the Core Audio APIs safely, and can the two (Core Audio and wave) coexist, given that both would operate on the same audio endpoint? Will this lead to a crash or hang in my application?
Thanks in advance.

Yes, you can use the old wave APIs safely. They are now implemented in terms of Core Audio APIs.
This MSDN page describes how the old APIs are implemented in terms of Core Audio:
Interoperability with Legacy Audio APIs
And this page has a nice diagram showing how things are plugged together.
User-Mode Audio Components
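
As a concrete illustration, here is a minimal sketch (error handling trimmed for brevity) of reading the default capture endpoint's volume through Core Audio. Nothing in it conflicts with a legacy waveIn capture session that may be open elsewhere in the same process, since the wave APIs are themselves layered on these interfaces:

```cpp
#include <windows.h>
#include <mmdeviceapi.h>
#include <endpointvolume.h>
#include <cstdio>

int main() {
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);

    // Enumerate devices and grab the default capture endpoint.
    IMMDeviceEnumerator* enumerator = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&enumerator);

    IMMDevice* device = nullptr;
    enumerator->GetDefaultAudioEndpoint(eCapture, eConsole, &device);

    // Activate the endpoint-volume interface on that device.
    IAudioEndpointVolume* volume = nullptr;
    device->Activate(__uuidof(IAudioEndpointVolume), CLSCTX_ALL,
                     nullptr, (void**)&volume);

    float level = 0.0f;
    volume->GetMasterVolumeLevelScalar(&level);
    printf("Capture volume: %.0f%%\n", level * 100.0f);

    volume->Release();
    device->Release();
    enumerator->Release();
    CoUninitialize();
    return 0;
}
```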

Related

What benefits does howler.js bring for a basic audio player in comparison to the `audio` element?

Preconditions:
Developing an audio player for a web application.
All target browsers fully support the audio tag.
No need for sprites, multiple simultaneous sounds, etc.; just one audio track to be played back at a time.
The audio file has to be streamed from the server, not downloaded all at once; therefore the Web Audio API is out.
Why would I want to utilize howler.js or similar library instead of relying on the built-in audio tag in this scenario?
The only howler.js feature that is intriguing is “Handles edge cases and bugs across environments”.

What libraries/APIs allow me to access real-time audio waveforms of a phone call?

I am looking to build an app that needs to process incoming audio on a phone call in real time.
WebRTC allows for this, but I think it only works in their browser-based P2P audio communications functionality, not for phone calls/VoIP.
Twilio and Plivo allow you to record the audio for batch/later processing.
Is there a library that will give me access to the audio streams in real time? If not, what would I need to build such a service from scratch?
Thanks
If you are open to using a media server (so that the call is no longer P2P but is mediated by the media server in a B2B model), then perhaps the Kurento Media Server may solve your problem. Kurento Media Server makes it possible to create processing capabilities that are applied in real time to the media streams. There are many examples in the documentation of computer vision and augmented reality algorithms applied in real time to video streams. I've never seen an audio-only processing module, but it should be simple to implement by creating an additional module, which is not too complex if you have some knowledge of C/C++ and media processing concepts.
Disclaimer: I'm part of the Kurento development team.

manipulating audio input buffer on Ubuntu Linux

Suppose that I want to code an audio filter in C++ that is applied to all audio or to a specific microphone/source. Where should I start with this on Ubuntu?
Edit: to be clear, I don't understand how to do this, or what the roles of PulseAudio, ALSA and GStreamer are.
ALSA provides an API for accessing and controlling audio and MIDI hardware. One portion of ALSA is a series of kernel-mode device drivers, whilst the other is a user-space library that applications link against. ALSA is single-client.
PulseAudio is a framework that facilitates multiple client applications accessing a single audio interface (ALSA being single-client). It provides a daemon process which 'owns' the audio interface and provides an IPC transport for audio between the daemon and the applications using it. This is used heavily in open source desktop environments. Use of Pulse is largely transparent to applications: they continue to access audio input and output using the ALSA API, with Pulse handling the transport and mixing behind the scenes. There is also JACK, which is targeted more towards 'professional' audio applications - perhaps a bit of a misnomer, although what is meant here is low-latency music production tools.
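As a concrete example of what "transparent" means here, below is a minimal ALSA capture sketch that opens the "default" device, which on a Pulse-enabled desktop is silently routed through the PulseAudio daemon. The format, rate and durations are illustrative choices, not requirements:

```cpp
#include <alsa/asoundlib.h>

int main() {
    // Open the default capture device; under PulseAudio this goes
    // through the Pulse ALSA plug-in rather than straight to hardware.
    snd_pcm_t* pcm = nullptr;
    if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_CAPTURE, 0) < 0)
        return 1;

    // Interleaved signed 16-bit mono at 44.1 kHz, 0.5 s allowed latency.
    snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                       SND_PCM_ACCESS_RW_INTERLEAVED,
                       1, 44100, 1 /* allow resampling */, 500000);

    short buffer[4410];                       // 100 ms of mono audio
    for (int i = 0; i < 50; ++i) {            // capture roughly 5 seconds
        snd_pcm_sframes_t n = snd_pcm_readi(pcm, buffer, 4410);
        if (n < 0)
            snd_pcm_recover(pcm, (int)n, 0);  // recover from overruns
        // ... apply your filter to the n captured frames here ...
    }

    snd_pcm_close(pcm);
    return 0;
}
```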
GStreamer is a general-purpose multimedia framework based on the signal-graph pattern, in which components have a number of input and output pins and provide a transformation function. A graph of these components is built to implement operations such as media decoding, with special nodes for audio and video input or output. It is similar in concept to CoreAudio and DirectShow. VLC and libAV are both open source alternatives that operate along similar lines. Your choice between them is a matter of API style and implementation language. GStreamer, in particular, is an OO API implemented in C; VLC is C++.
The obvious way of implementing the problem you describe is to implement a GStreamer/libAV/VLC component. If you want to process the audio and then route it to another application, this can be achieved by looping it back through Pulse or JACK.
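For instance, here is a hedged sketch of such a loop using GStreamer's C API (it compiles as C++): capture from the default Pulse source, run it through a stock filter element (audiocheblimit, a Chebyshev low-pass standing in for your custom filter), and play the result back. The element names are standard GStreamer plugins; the cutoff value is an arbitrary illustration:

```cpp
#include <gst/gst.h>

int main(int argc, char* argv[]) {
    gst_init(&argc, &argv);

    // Build the graph from a textual description; a custom filter
    // would replace audiocheblimit with your own element.
    GstElement* pipeline = gst_parse_launch(
        "pulsesrc ! audioconvert ! "
        "audiocheblimit mode=low-pass cutoff=2000 ! "
        "audioconvert ! pulsesink",
        nullptr);

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    // Run until an error or end-of-stream message arrives on the bus.
    GstBus* bus = gst_element_get_bus(pipeline);
    GstMessage* msg = gst_bus_timed_pop_filtered(
        bus, GST_CLOCK_TIME_NONE,
        (GstMessageType)(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));

    if (msg) gst_message_unref(msg);
    gst_object_unref(bus);
    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}
```

Prototyping with the stock elements first, then swapping in your own, is usually the faster path.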
ALSA provides a plug-in mechanism, but I suspect that implementing this from the ALSA documentation will be tough going.
The de-facto architecture for building effects plug-ins of the type you describe is Steinberg's VST. There are plenty of open source hosts and examples of plug-ins that can be used on Linux, and crucially, there is decent documentation. As with GStreamer/libAV/VLC, you would be able to route audio in and out of this.
Out of these, VST is probably the easiest to pick up.
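Whichever host technology you choose, the heart of the plug-in ends up being the same kind of per-buffer callback. A sketch, not tied to any particular SDK, with a one-pole low-pass as the stand-in effect:

```cpp
#include <cstddef>

// The shape of the inner loop a plug-in's process callback typically runs.
// Buffer layout, channel count and parameter plumbing vary per host.
struct OnePoleLowPass {
    float a = 0.1f;  // smoothing coefficient derived from cutoff / sample rate
    float z = 0.0f;  // filter state (previous output sample)

    void process(const float* in, float* out, std::size_t frames) {
        for (std::size_t i = 0; i < frames; ++i) {
            z += a * (in[i] - z);  // y[n] = y[n-1] + a * (x[n] - y[n-1])
            out[i] = z;
        }
    }
};
```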

Java ME: is it possible to disable AGC/VAD when recording using the microphone?

We are developing an application which takes audio from the microphone and does some analysis. We found during the analysis that AGC is implemented in the microphone subsystem. I have also heard that VAD is used.
Is any other post-processing done on the audio (PCM) before it is delivered to the application?
Is it possible for the application to disable the AGC and VAD post-processing? Is this possible in Java ME, or through some proprietary API, such as Nokia's or Samsung's?
See my answers to my own questions:
1. Unknown.
2. Impossible in Java ME. If you are working on Symbian/S60 devices, you could check whether Qt or Symbian C++ has such a capability. For example, I found the following info on the web, but did not verify it: "There is an API called SetGain/GetMaxGain in CMdaAudioInputStream, but in S60 phones the range is between 1-1, so not very useful using this API. But you can use CVoIPAudioUplinkStream which allows you to dynamically control the audio gain and other codec properties". Try it if you are interested.

Low-level audio programming

I wonder: does audio software like Cubase and Audacity use PlaySound calls?
Where can I learn about low-level audio programming? From the information I've found on the web, MCI seems to be the lowest-level audio API on Windows...
Thanks
Edit: I'm not asking for Windows-only information.
There are several audio APIs to choose from. The oldest and most widely supported is the waveOut API - look for functions starting with waveOut in MSDN.

A slightly newer one is DirectSound, which is geared more towards games. Its main feature over waveOut is positional 3D sound, which professional audio software doesn't use (it was also supposed to have lower latency than waveOut, but that never really materialized).

For low-latency audio, there is ASIO. Professional audio apps support this API, but not all drivers do (it's a standard feature in professional sound cards, but not gaming or on-board hardware). ASIO can provide much lower latency than waveOut or DirectSound.

Finally, there's the kernel streaming interface, which is the lowest-level audio interface still accessible from user-mode code. This is a direct pipe into Windows' internal mixer, which combines output from all apps that are currently playing sound into the signal that gets sent to the sound card. It's scarcely documented, though. There's a driver called ASIO4ALL (just google it) that provides ASIO support on sound cards without ASIO drivers by implementing the ASIO API on top of the kernel streaming interface.
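For a feel of the waveOut API mentioned above, here is a minimal playback sketch (synchronous, error checks trimmed) that writes one second of a 440 Hz sine to the default output device:

```cpp
#include <windows.h>
#include <mmsystem.h>
#include <cmath>
#include <vector>
#pragma comment(lib, "winmm.lib")

int main() {
    const int sampleRate = 44100;
    std::vector<short> samples(sampleRate);  // one second, mono
    for (size_t i = 0; i < samples.size(); ++i)
        samples[i] = (short)(30000 * sin(2.0 * 3.14159265 * 440.0 * i / sampleRate));

    WAVEFORMATEX fmt = {};
    fmt.wFormatTag = WAVE_FORMAT_PCM;
    fmt.nChannels = 1;
    fmt.nSamplesPerSec = sampleRate;
    fmt.wBitsPerSample = 16;
    fmt.nBlockAlign = fmt.nChannels * fmt.wBitsPerSample / 8;
    fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

    HWAVEOUT hwo = nullptr;
    waveOutOpen(&hwo, WAVE_MAPPER, &fmt, 0, 0, CALLBACK_NULL);

    WAVEHDR hdr = {};
    hdr.lpData = (LPSTR)samples.data();
    hdr.dwBufferLength = (DWORD)(samples.size() * sizeof(short));
    waveOutPrepareHeader(hwo, &hdr, sizeof(hdr));
    waveOutWrite(hwo, &hdr, sizeof(hdr));

    // Poll until the driver marks the buffer done.
    while (!(hdr.dwFlags & WHDR_DONE)) Sleep(10);

    waveOutUnprepareHeader(hwo, &hdr, sizeof(hdr));
    waveOutClose(hwo);
    return 0;
}
```

The WAVEHDR prepare/write/unprepare dance is the part all waveOut code shares; real applications queue several buffers and use a callback instead of polling.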
I'm a little late to the game here, but I posted a Windows API history last week that might add a little more context. The choice of API really depends on your needs. If you want to avoid 3rd party libraries, it really only comes down to MME, XAudio2, and Core Audio (WASAPI).
A Brief History of Windows Audio APIs
Hope this helps!
Actually, if you are looking for more than Windows-only output support, then the best way to start is to review Phil Burk's PortAudio, available as of this writing at http://www.portaudio.com/.
ASIO is a good quality interface, but it's proprietary and owned by Steinberg.
There are many interfaces to audio output in modern Windows that are lower-level than MCI. These include, at least, DirectSound, XAudio2 and WASAPI.
I recommend avoiding the Windows APIs as much as possible, and learning PortAudio instead.
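To give a flavour of why, here is a minimal PortAudio pass-through sketch, copying the default input device to the default output through a single callback. The mono float format, sample rate and buffer size are illustrative defaults:

```cpp
#include <portaudio.h>
#include <cstring>

// Called by PortAudio on its audio thread for every buffer.
static int passThrough(const void* input, void* output,
                       unsigned long frames,
                       const PaStreamCallbackTimeInfo*,
                       PaStreamCallbackFlags, void*) {
    // Mono float32 frames; input may be null if the device delivers nothing.
    if (input)
        std::memcpy(output, input, frames * sizeof(float));
    else
        std::memset(output, 0, frames * sizeof(float));
    return paContinue;
}

int main() {
    Pa_Initialize();
    PaStream* stream = nullptr;
    Pa_OpenDefaultStream(&stream,
                         1,          // one input channel
                         1,          // one output channel
                         paFloat32,  // sample format
                         44100,      // sample rate
                         256,        // frames per buffer
                         passThrough,
                         nullptr);
    Pa_StartStream(stream);
    Pa_Sleep(5000);  // run for five seconds
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;
}
```

The same source builds on Windows, Linux and macOS, with PortAudio selecting an appropriate native backend (WASAPI, ALSA, Core Audio, ...) underneath.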
