HOW-TO: The Most Simple Audio Engine? - audio

I am curious. How would one implement the simplest possible audio engine? I have something like a stream of audio data in mind, played through the default audio device. Having played a lot with RtAudio, I think this would be possible if one dropped some of its features. Does anyone have an idea where to start?

I would do it (did do it) like this:
http://ccan.ozlabs.org/info/wwviaudio.html

Well there is no reason why you can't create an audio engine that has a trivially simple interface:
audioEngine.PlayStream(myStream)
The audio engine would then periodically read data from that stream and send it to the soundcard. The reason audio engines tend to be more complicated than this is that there are all kinds of parameters you might want to control, including playback latency, sample rate and bit depth, and there is often a need to convert audio between formats. Add in the problems of repositioning streams, synchronizing multiple streams, and supporting multiple audio driver APIs, and soon you have an audio engine as complicated as any other.
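The idea of "periodically read from the stream and send it to the soundcard" can be sketched in a few lines. This is only an illustration under stated assumptions: `SimpleAudioEngine`, `play_stream` and the `sink` callable are hypothetical names, and the sink merely stands in for a real device. A real engine would usually be driven by the audio driver's callback rather than by a plain loop.

```python
import io
from typing import BinaryIO, Callable


class SimpleAudioEngine:
    """Hypothetical minimal engine: pulls fixed-size blocks from a stream
    and hands each block to a sink that stands in for the sound card."""

    def __init__(self, sink: Callable[[bytes], None], block_bytes: int = 1024):
        self.sink = sink                # assumption: any callable taking raw bytes
        self.block_bytes = block_bytes  # block size is what determines latency

    def play_stream(self, stream: BinaryIO) -> int:
        """Read until the stream is exhausted; return the number of bytes sent."""
        sent = 0
        while True:
            block = stream.read(self.block_bytes)
            if not block:               # empty read means end of stream
                break
            self.sink(block)
            sent += len(block)
        return sent


# Usage: a list collects the blocks instead of a real audio device.
received = []
engine = SimpleAudioEngine(sink=received.append, block_bytes=4)
total = engine.play_stream(io.BytesIO(b"\x00\x01" * 5))  # 10 bytes of fake PCM
```

Everything that makes real engines complicated (format conversion, seeking, mixing several streams) is deliberately left out here.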

Thank you for your answers.
To Mark Heath:
Yes, of course I know that there might be a lot of parameters to tweak, be it the filter cutoff, resonance, delay timing, and so on.
I was just curious how to build an audio engine that is as simple and as modular as possible. The main intention I had in mind was to rebuild the Game Boy sound chip (again, there are a lot of implementations of this, e.g. JavaBoy).
To smcameron:
It seems that ccan/wwviaudio has a dependency on libvorbis and PortAudio (version >= 19), which would yield the same effect as using RtAudio (which is rather small compared to other realtime audio interfaces with built-in ASIO support). However, I will give it a try.
regards,
audax

Related

Getting multiple audio clips to same level

I am working on a project that involves using a lot of found audio clips (some new, some very old archival and poor quality etc).
I am trying to figure out a way to have all audio clips to be of a similar quality (if this is possible) and play at a similar volume?
I have access to both Audacity and Ableton. Any suggestions would be great.
What you are asking for is commonly called normalization. There are several tools that can do it, including command-line tools and Audacity.
You'll find the tool in Audacity under Effect > Normalize...
You can select multiple audio tracks.
You could also consider using a limiter and/or a compressor on your track. Have a look in the Live effect reference for more info on these: https://www.ableton.com/en/manual/live-audio-effect-reference/
The results will not be as good as applying normalization by hand, but it will be a lot quicker.
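To make the term concrete, here is a minimal sketch of peak normalization, the simplest form of the technique: every sample is scaled by the same gain so that the loudest one reaches a chosen target. The function name `peak_normalize` and the float sample representation (values between -1.0 and 1.0) are assumptions for illustration; this is not meant to reproduce what any particular tool does internally.

```python
def peak_normalize(samples, target_peak=1.0):
    """Scale all samples by one gain so the largest absolute value
    equals target_peak. Silence is returned unchanged."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)        # nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


# Usage: a quiet clip is brought up to a peak of 0.5.
quiet = [0.1, -0.25, 0.05]
louder = peak_normalize(quiet, target_peak=0.5)
```

Note that matching peaks does not guarantee that clips sound equally loud, which is one reason a compressor or limiter is often used as well.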

Realtime audio manipulation

Here is what I would like to achieve:
I like to play around with creating "new" software/hardware instruments.
Sound processing and creation is always handled by software, but one could play the instrument via an ultrasonic distance sensor, for example. Another idea is to start playback when someone interrupts the light of a photoelectric barrier, and so on.
So the instrument would play common sounds, but it has to be used in an unusual way. For example, the ultrasonic instrument would play a sound if it detects something within a certain distance. The sound could then be manipulated, for example in pitch, as the distance gets smaller.
Basically I would like to play back a sound sample and manipulate it in realtime.
I guess I have to use WAV samples for this, right? And which programming language do you think fits best for this task?
Edited after Kevin's hint: please point me in the right direction and give me a hint where to start.
Thanks in advance
Since you're using the Processing tag, you can try Processing.
It comes with a sound library such as Minim, or you can install Beads, which is great. There's actually a nice book on it: Sonifying Processing.
You might find SuperCollider fun as well.
The main thing is: what are you comfortable with at the moment?
If Processing syntax looks intimidating, you can try a different programming paradigm such as data flow. In that case you can use Pure Data (free, open source) or Max/MSP (very similar, but commercial). The idea is that rather than typing instructions, you connect boxes with wires, which is fun, and the examples are great too.
If you're into C++ there are plenty of libraries. On the creative side, there's a nice set of libraries called openFrameworks that's easy and fun to use. If this is your cup of tea, have a peek at Maximilian.
Bottom line: there are multiple options to achieve the same task. Choose the best tool for you (based on your background) or try each and see which you like best.
You asked "And which programming language do you think fits best for this task?" - I would also suggest using Processing. I have used Processing to work with sounds before, and in all cases I used Minim. It has many UGens to generate sounds programmatically.
Also, you want to integrate some sensors. I'm not sure what types of sensors you will use, but Processing works well with different Arduino modules and sensors. Check this link for more direction.
Furthermore, you can export your project as an .exe or an executable .jar file. And the JS version (p5.js) works almost the same as the Java version.
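Whichever tool is chosen, the core of the distance-to-pitch idea from the question is small. The sketch below, written in Python purely for illustration, maps a sensor distance to a playback rate and then resamples a buffer with linear interpolation (so pitch and duration change together, like speeding up a tape). All names, the distance range and the rate range are assumptions, not part of any library mentioned above.

```python
def distance_to_rate(distance_cm, near_cm=10.0, far_cm=100.0,
                     min_rate=0.5, max_rate=2.0):
    """Map a distance to a playback rate: closer means higher pitch.
    Distances outside [near_cm, far_cm] are clamped."""
    d = min(max(distance_cm, near_cm), far_cm)
    t = (far_cm - d) / (far_cm - near_cm)   # 0.0 at far end, 1.0 at near end
    return min_rate + t * (max_rate - min_rate)


def resample(samples, rate):
    """Read through samples at `rate` times normal speed,
    interpolating linearly between neighbouring samples."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    out = []
    last = len(samples) - 1
    pos = 0.0
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]     # clamp at the final sample
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += rate
    return out


# Usage: an object at 10 cm doubles the playback speed of a short ramp.
rate = distance_to_rate(10.0)
shifted = resample([0, 1, 2, 3, 4], rate)
```

In a realtime setting the rate would be updated from the sensor between audio blocks rather than applied to a whole buffer at once.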

Detecting ads in audio streams?

I have never tried, but I'm curious whether there is any way to detect ads in audio streams, other than machine learning or something similar. Is there anything specific about the byte stream during adverts? Maybe a different loudness level?
From a purely audio standpoint, this isn't possible. There is nothing distinguishable between an advertisement and other audio content. Sure, you could argue that a station playing music will have different spectral characteristics than when talking comes on for an advertisement, but what about ads that also play music? How do you distinguish between an announcer and someone reading an ad? What if the ad is embedded in normal content?
Now, some stations do provide metadata which occasionally contains ad information. If you look at the length of a particular content item, your ads are usually going to be under a minute or 30 seconds. How you get this metadata and deal with it depends on the kind of stream you're working with.
There are techniques emerging to do this and they tend to leverage databases of known adverts to get around the theoretical problems that Brad correctly highlights in his answer.
One of the references below, however, uses a technique based on detecting slight differences in the audio when an ad starts as the initial detection trigger.
Some techniques also use both audio and visual streams to aid detection - for example the Google paper below uses first audio matching and then the video to validate/verify.
Some sources that might be worth looking at for anyone interested in this area (I realise it is an old question but it is still topical):
http://www.xavieranguera.com/papers/cimca_2008.pdf
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/55.pdf
https://www.audiblemagic.com/wp-content/uploads/2014/02/ad_detection_datasheet_150406.pdf

How to Serve/Stream Multiple Audio Files

I'm working on a project where we have many small audio files of around 500-600k. Then there are audio files of around 15M.
The 15M files are full narrated articles. The smaller ones are individual sentences within the article.
There are going to be many users and many articles in the future.
I want to be able to load the audio files relatively fast -- either through pre-loading or streaming or something of that nature. Basically if a user clicks on a button -- I want the audio to start more or less immediately.
What are my options here? Red5? Icecast?
EDIT:
I'd like to avoid Flash if at all possible, but I'm not opposed to it -- I definitely can't use HTML5 audio as much as I'd like to.
I've already tried doing document onload to issue get requests for the files -- there are usually 15-20 per page. (19 small files, one big one). That doesn't seem to work as well as I thought it might.
In terms of latency -- I'm looking for push-button instant play -- right now I can count to 2 or 3 for the small files and 6-7 for the big one. Would Flash be able to do this?
Streaming solutions such as Icecast are not appropriate here. All you need is simple HTTP.
You don't mention what you are using to play these files on the client side. If you are doing this in Flash, it is relatively simple to preload or to play while the download is still running.
For audio compression, you should be using MP3. For speech, you can easily get away with a lower bitrate. 48 kbit/s, 44.1 kHz mono is generally acceptable. This will load fine, even on decent mobile connections.
In any case, HTTP is the way to go. That way you can request the separate files easily. Icecast is for a single stream that runs for a while, such as internet radio.
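A quick back-of-the-envelope calculation shows why a lower bitrate helps here. The sketch below estimates the size of a constant-bitrate file and how long it takes to fetch in full; the function names and the 1 Mbit/s connection speed are assumptions for illustration, and container overhead such as tags is ignored.

```python
def mp3_size_bytes(duration_s, bitrate_kbps):
    """Approximate size of a constant-bitrate file: bits per second
    times duration, divided by 8 to get bytes."""
    return duration_s * bitrate_kbps * 1000 // 8


def download_seconds(size_bytes, link_mbps):
    """Time to fetch the whole file over a link of the given speed."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


# Usage: one minute of speech at 48 kbit/s over an assumed 1 Mbit/s link.
size = mp3_size_bytes(60, 48)          # 360000 bytes
wait = download_seconds(size, 1.0)     # about 2.88 seconds for the full file
```

Since playback can start before the download finishes, the wait a user actually perceives can be well below the full download time.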
OK -- so I did some investigation and figured out what the competition was using.
It was this:
http://www.schillmania.com/projects/soundmanager2/
Basically, what it does is try to use HTML5 audio tags with the ever so helpful preload attribute set, and if it can't do that it falls back on Flash to preload the MP3.

Converting audio to code and vice-versa

Having just witnessed the Sound Load technology in the Nintendo DS game Bangai-O Spirits, I was curious as to how this technology works. Does anyone have any links, documentation or sample code on implementing such a feature, which would allow the state of an application to be saved and loaded via audio?
It's the same old thing used in the ZX Spectrum era, when you loaded programs and games from tape. Only the sound quality and the filters are probably better.
In my opinion something like Bluetooth or WiFi is better. You can also send files that can be put on some storage and then load them. I find these methods much easier than sound because if there is a lot of noise around you cannot do much.
It is just a conversion of data to audio and then back from audio to data.
Search for Zotyocopy and Copy86M on Google - these are the utilities used for saving a game to tape after loading it into memory on the ZX Spectrum.
If you want to pass data as audio through the air there are a few things you need to be aware of though, such as how the speaker and microphone interact for example. It is important that they don't distort or alter the sound too much as what you are sending are in fact the raw bytes.
Some audio software will let you open any file as audio so that you can listen to it. If you record data as audio, do not use lossy compression such as MP3 on the audio file!
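The "data to audio and back" conversion can be illustrated with a very small frequency-shift keying scheme: each bit becomes a short burst of one of two tones, and the decoder checks which tone is stronger in each burst. This is only a sketch under stated assumptions (the constants and function names are made up, timing is assumed to be perfectly aligned, and there is no noise handling or error correction); it is not claimed to be the scheme used by the game or by tape loaders.

```python
import math

SAMPLE_RATE = 8000       # samples per second (assumed)
SAMPLES_PER_BIT = 80     # 10 ms per bit, a whole number of cycles of both tones
FREQ_ZERO = 1000.0       # tone for a 0 bit (assumed)
FREQ_ONE = 2000.0        # tone for a 1 bit (assumed)


def encode(data):
    """Turn bytes into audio samples, most significant bit first."""
    samples = []
    for byte in data:
        for bit_index in range(7, -1, -1):
            freq = FREQ_ONE if (byte >> bit_index) & 1 else FREQ_ZERO
            for n in range(SAMPLES_PER_BIT):
                samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def tone_energy(chunk, freq):
    """Correlate a chunk with a sine and cosine at freq (one DFT bin)."""
    re = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(chunk))
    im = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(chunk))
    return re * re + im * im


def decode(samples):
    """Recover bytes by picking the stronger tone in each bit-length chunk."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        is_one = tone_energy(chunk, FREQ_ONE) > tone_energy(chunk, FREQ_ZERO)
        bits.append(1 if is_one else 0)
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


# Usage: round-trip two bytes through the "audio" representation.
audio = encode(b"Hi")
recovered = decode(audio)
```

Sending this through a real speaker and microphone would additionally need synchronization and tolerance for noise, which is where most of the practical difficulty lies.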

Resources