Looping parts of audio with gstreamer - audio

I'm developing a sort of a "advanced playback" audio application to aid music transcription. The idea is to allow the user to change the audio tempo/pitch, as well as select and possibly loop parts of the audio track. I've opted to use gstreamer for the time being. I have the scaletempo plugin in the pipeline to aid with changing the tempo. I am unsure as to what's the best way to do the looping.
From reading the docs it seems that I could get it done by performing a gst_element_seek on the scaletempo element and setting the *stop_type* and stop parameters, waiting for an EOS on the message bus, and then performing yet another seek etc.
Is there a better way to do it? Ideally I'd like to get smooth looping, though it's not a dealbreaker if I don't. The gstreamer docs contain mentions of a concept of "segments", but from glancing at the docs I still don't have any idea what they are or whether they're useful in my scenario.
Pointers to code in C/Python/Haskell/whatever are very much welcome.

Related

Overlapping one sound multiple times in processing

I'm working on the sound part of an interactive installation that would need an event to be triggered by osc an undefined number of times, making the sound linked to it overlaps instead of being rewinded and started again.
Would it be possible to do that without needing to make an array of loadings of the same sound?
I'm actually trying to do it with processing and minim library.
Do you think it would be easier to achieve it with another programming software? I've found myself in the same difficulties trying to do it with puredata. Any tip or clue would be extremely welcome.
Thanks a lot.
You will need multiple readers ([tabread~] resp [tabplay~] in Pd; i don't know about Processing/minim, but the same principle applies) to read the table multiple times (in parallel), where each one can be started separately.
However, you only need a single instance of your data array (e.g. [table]), as the various readers can access the same array independently.
Can you use Java libraries in Processing? Processing is built on Java, yes?
If you can, I have a library you can use, supporting a class I call AudioCue available via github. This is modeled on a Java Clip but with additional capabilities. It allows multiple, concurrent playback. AudioCue also has real time controls for volume, panning and playback speed, in case you want to play around with adding some more interactivity to your installation.
I would love to know if it can be used with Processing. Please follow up with me if you try this route. I'd like to see it done, and can possibly assist.
If Processing allows you to send PCM directly out for playback, then the basic algorithm is the store the audio data in an array, and create pointers or cursors (depending on your preferred terminology) that independently iterate through that array. This is the main basis of the algorithm I use for AudioCue, with the PCM being routed out via a Java SourceDataLine.

Realtime audio manipulation

Here is what i like to achieve:
I like to play around in creating "new" software / hardware instruments.
Sound processing and creation is always managed by software. But one could play the instrument via ultrasonic distance sensor for example. Another idea is to start playback when someone interrupts the light of a photoelectric barrier and so on....
So the instrument would play common sounds, but has to be used in an unusal way. For example, the ultrasonic instrument would play a sound if it detects something in a certain distance. The sound could be manipiulated in pitch for example if the distance gets smaller.
Basically i like to playback a sound sample and manipualte this in realtime.
I guess i have to use WAV samples for this, right? And which programming language do you think fits best for this task?
Edited after kevins hint: please kick me into the right direction - give me a hint where to start.
Thanks in advance
Since you're using the the Processing tag, you can try Processing.
It comes with a sound library like Minim or you can install beads which is great. There's actually a nice book on it: Sonifying Processing
You might find SuperColider fun as well.
The main thing is what are you comfortable with at the moment ?
If Processing syntax looks intimidating, you can actually try a different programming paradigm like data flow. In which case you can use PureData(free, opensource) or MaxMSP(very similar, but commercial). The idea is rather than typing instructions, you connect boxes with wires which is fun and the examples are great too.
If you're into c++ there are plenty of libraries. On the creative side, there's a nice set of libraries called OpenFrameworks that's easy and fun to use. If this is your cup of tea, have a peek at Maximilian.
Bottomline is: there are multiple options to achieve the same task. Choose the best tool for your (based on your background) or try each and see what you like best.
You asked "And which programming language do you think fits best for this task?" - I would also suggest using Processing. I have been used Processing to work with sounds previously. And in all cases I used Minim. It has many UgenS to generate sounds programmatically.
Also, you wants to integrate with some sensors. I'm not sure what types of sensors you will use, but Processing goes pretty well with different Arduino modules and sensors. Check this link for more direction.
Furthermore, you can export your project as .exe or executable .jar files. And their JS version (P5.js) works almost the same as the Java version.

Realtime Sound Routing...Trigger a Sound with Another Sound

I'm looking for a program that is able to recognize individual audio samples from my computer and reroute them to trigger WAV files from a library. In my project, it would need to be realtime as the latency would not be a desired result. I tried using dictation software that would recognize words to trigger opening a file and that's the direction where I want to go, but instead of words I want it to be sounds and it would happen in realtime. I'm not sure where to go and am just looking for some guidance. Does anyone have any suggestions of what I should do?
That's a fairly broad question, but I can tell you how I would do it. (Hardly the only way, but where I would start.)
If you're looking for real time input, the Java Sound library (excellent tutorial here) allows for that. (Just note that microphone input from a web page is difficult on anything, due to major security concerns, so this would be a desktop application.)
If it needs to be real time, the first thing I would suggest is stream and multithread the hell out of it. I would suggest the Java 8 Stream API, but since you're looking for subsamples that match a specific pattern, then each data point will have to be aware of the state of its neighbors, and that isn't easy with streams.
You will probably want to know if a sound roughly resembles an audio profile, so for that, I would pick a tolerance on just how close you want it to be for a match (remembering that samples may not line up 100% anyway, so "exact" is not an option), and then look up Hidden Markov Models. I suggest these because they're what voice recognition software typically uses, and while your sounds may not be voices, it will give you an idea of what has already been done.
You'll also want to maintain a limited list of audio samples in memory. Specifically, you will likely need the most recent data, because an audio signal is a time-variant signal, and you can't get a match from just one point. I wouldn't make it much longer than the longest sample you're looking to recognize, as audio takes up a boatload of memory.
Lastly (for audio), I would recommend picking a standard format for comparison. Make it as good as gets you decent results, and start high. You will want to convert everything to that format before you compare it.
Once you recognize a specific sound, it's basically a Command Pattern. Specific sounds can be mapped, even with a java.util.HashMap, to specific files, which (if there are few enough) you might even have pre-loaded.
Lastly, it's worth looking at the Java Speech API. It's not part of the JDK and it's quite dated, but you might get some good advice from its implementation.
This is of course the advice of a Java-preferring programmer, but I imagine that there might be some decent libraries in Python and Ruby to help you as well; and of course there's something in C somewhere. This may sound like a lot, but most of the material is already implemented and ready-to-go.
Hopefully this helps, let's look forward to other answers.

Detecting ads in audio streams?

I have never tried, but just curious if there is any possibility to detect ads in audio streams? I mean except machine learning or something. Some specifics about byte stream during adverts. Maybe kind of different loud value?
From a purely audio standpoint, this isn't possible. There is nothing distinguishable between an advertisement and other audio content. Sure, you could argue that a station playing music will have different spectral characteristics than when talking comes on for an advertisement, but what about ads that also play music? How do you distinguish between an announcer and someone reading an ad? What if the ad is embedded in normal content?
Now, some stations do provide metadata which occasionally contain ad information. If you look at the length of a particular content item, your ads are usually going to be under a minute or 30 seconds. How you get this metadata and deal with it depend on the kind of stream you're working with.
There are techniques emerging to do this and they tend to leverage databases of known adverts to get around the theoretical problems that Brad correctly highlights in his answer.
One of the references below however, uses a techniques based on detecting slight differences in the audio when an ad starts as the initial detection trigger.
Some techniques also use both audio and visual streams to aid detection - for example the Google paper below uses first audio matching and then the video to validate/verify.
Some sources that might be worth looking at for anyone interested in this area (I realise it is an old question but it is still topical):
http://www.xavieranguera.com/papers/cimca_2008.pdf
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/55.pdf
https://www.audiblemagic.com/wp-content/uploads/2014/02/ad_detection_datasheet_150406.pdf

HOW-TO: The Most Simple Audio Engine?

I am curious. How would one implement the most simple audio engine ever? I have something like a stream for audio data in mind using your default audio device. Playing a lot with RtAudio, I think if one could drop some of the features, this would be possible. Someone any idea where to start?
I would do it (did do it) like this:
http://ccan.ozlabs.org/info/wwviaudio.html
Well there is no reason why you can't create an audio engine that has a trivially simple interface:
audioEngine.PlayStream(myStream)
The audio engine would then periodically read data from that stream and send it to the soundcard. The reason audio engines tend to be more complicated than this, is that there are all kinds of parameters you might want to control, including latency of playback, sample rate, bit depth, as well as often the need to convert audio between formats. Add in the problems of repositioning streams, and synchronizing multiple streams, supporting multiple audio driver APIs etc, and soon you have an audio engine as complicated as any other.
Thank you for your answers.
to .Mark Heath:
yes of course I know that there might be a lot of parameters to tweak be it the filter cutoff, resonance, delay timing etc etc ..
I was just curious how to build an audio engine as simple as possible and modular as possible. The major intention I had in mind was to rebuild the gameboy soundchip ( again here, there a lot of implementations ie. JavaBoy).
to.smcameron
It seems that ccan/wwviaaudio has a dependency to libvorbis / portaudio (version >=19), that would yield the same effect as using rtaudio ( which is, compared to other realtime audio interface having build in asio support, rather small). However, I will give it a try.
regards,
audax

Resources