sound synchronization in C or Python - audio

I'd like to play a sound and have some way of reliably telling how much of it has thus far been played.
I've looked at several sound libraries but they are all horribly underdocumented and only seem to export a "PlaySound, no questions asked" routine.
I.e., I want this:

a = Sound(filename)
PlaySound(a)
while True:
    print a.milliseconds_elapsed, a.length
    sleep(1)
C, C++ or Python solutions preferred.
Thank you.

I use BASS Audio Library (http://www.un4seen.com/)
BASS is an audio library for use in Windows and Mac OS X software. Its purpose is to provide developers with powerful and efficient sample, stream (MP3, MP2, MP1, OGG, WAV, AIFF, custom generated, and more via add-ons), MOD music (XM, IT, S3M, MOD, MTM, UMX), MO3 music (MP3/OGG compressed MODs), and recording functions. All in a tiny DLL, under 100KB in size.
A C program using BASS is as simple as
#include "bass.h"

HSTREAM str;
QWORD pos;

BASS_Init(-1, 44100, 0, 0, NULL);
BASS_Start();
str = BASS_StreamCreateFile(FALSE, filename, 0, 0, 0);
BASS_ChannelPlay(str, FALSE);
while (BASS_ChannelIsActive(str) == BASS_ACTIVE_PLAYING) {
    pos = BASS_ChannelGetPosition(str, BASS_POS_BYTE); /* position in bytes */
}
BASS_Stop();
BASS_Free();

This is most likely going to be both hardware-dependent (sound card etc) and OS-dependent (size of buffers used by OS etc).
Maybe it would help if you said a little more about what you're really trying to achieve, and also whether we can make any assumptions about what hardware and OS this will run on?
One possible solution: assume that the sound starts playing more or less immediately and then use a reasonably accurate timer to determine how much of the sound has played (since it will have a known, fixed sample rate).
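That timer idea can be sketched in Python. Everything here is hypothetical scaffolding (the TimedSound class and its methods are made up for illustration), and it assumes playback really does start almost immediately:

```python
import time

# Hypothetical sketch: estimate the playback position with a wall-clock timer,
# assuming the sound starts playing (almost) immediately after play() is called.
class TimedSound:
    def __init__(self, n_frames, sample_rate):
        self.length_ms = 1000.0 * n_frames / sample_rate
        self._start = None

    def play(self):
        # In a real program you would hand the buffer to the audio API here.
        self._start = time.monotonic()

    def milliseconds_elapsed(self):
        if self._start is None:
            return 0.0
        # Clamp to the clip length so we never report past the end.
        return min((time.monotonic() - self._start) * 1000.0, self.length_ms)

s = TimedSound(n_frames=44100 * 2, sample_rate=44100)  # a 2-second clip
s.play()
time.sleep(0.1)
print(s.milliseconds_elapsed(), s.length_ms)
```

The accuracy is bounded by the output latency of the OS/driver buffers, so treat the result as an estimate, not a sample-accurate position.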

I'm also looking for a nice audio library where I can write directly to the sound card's buffer. I haven't had time to look into it myself yet, but PyAudio looks pretty nice. If you scroll down the page you'll see an example similar to yours.
With the help of the buffer size, number of channels and sample rate, you can easily calculate how long each loop step lasts and print it out.
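That calculation is just arithmetic on the buffer geometry; the numbers below are illustrative, not tied to any particular sound card or to PyAudio:

```python
# Convert buffer geometry into playing time per loop step.
buffer_bytes = 4096       # size of one buffer handed to the card (illustrative)
channels = 2              # stereo
sample_width = 2          # bytes per sample (16-bit audio)
sample_rate = 44100       # frames per second

frames_per_buffer = buffer_bytes // (channels * sample_width)
step_seconds = frames_per_buffer / sample_rate
print(f"each loop step plays {step_seconds * 1000:.2f} ms of audio")

# Summing completed steps gives how much of the sound has been played:
steps_done = 43
print(f"after {steps_done} steps: {steps_done * step_seconds:.3f} s played")
```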

Related

Better sound in winsound?

I am trying to automate playing sounds using Python's winsound module. I've come to the point where I can give the function any number of notes (whose names I've stored as frequencies) and a tempo in BPM. However, the default beep sound in winsound is very crude, and it spaces out at higher tempos (anything above 60 BPM).
Is there any way to use another sound instead of the default winsound.Beep function? I want to stick with sound generated through code (not MP3s or any downloaded files), because I have already stored various notes as frequencies and have something to work with; all it needs is a better sound.
If winsound isn't the best option, is there any other way I could generate sound in Python? I also have some experience with C++ and Java (or at least enough for the purposes of automating the sounds), so are there any libraries that produce a better, more consistent sound?
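One stdlib-only direction, sketched with made-up helper names and illustrative numbers: render each note's frequency as a sine wave into an in-memory WAV using the wave module, then hand those bytes to winsound.PlaySound with SND_MEMORY. A sine tone sounds much smoother than winsound.Beep:

```python
import io
import math
import struct
import sys
import wave

def tone_wav(freq, duration=0.25, rate=22050, volume=0.8):
    """Render a sine tone as in-memory WAV bytes (16-bit mono).

    tone_wav is a hypothetical helper, not part of any library.
    """
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        n = int(rate * duration)
        frames = b''.join(
            struct.pack('<h', int(volume * 32767 *
                                  math.sin(2 * math.pi * freq * t / rate)))
            for t in range(n))
        w.writeframes(frames)
    return buf.getvalue()

data = tone_wav(440)  # A4
if sys.platform == 'win32':
    import winsound
    winsound.PlaySound(data, winsound.SND_MEMORY)  # Windows only
```

Since you already have your notes stored as frequencies, you could pre-render one buffer per note and play them in sequence at your chosen BPM.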

Realtime audio manipulation

Here is what I'd like to achieve:
I like to play around with creating "new" software/hardware instruments.
Sound processing and creation is always managed by software, but one could play the instrument via an ultrasonic distance sensor, for example. Another idea is to start playback when someone interrupts the light of a photoelectric barrier, and so on...
So the instrument would play common sounds, but would be used in an unusual way. For example, the ultrasonic instrument would play a sound if it detects something within a certain distance. The sound could be manipulated in pitch, for example, as the distance gets smaller.
Basically, I'd like to play back a sound sample and manipulate it in real time.
I guess I have to use WAV samples for this, right? And which programming language do you think fits best for this task?
Edited after kevin's hint: please kick me in the right direction - give me a hint where to start.
Thanks in advance
Since you're using the Processing tag, you can try Processing.
It comes with a sound library like Minim, or you can install Beads, which is great. There's actually a nice book on it: Sonifying Processing.
You might find SuperCollider fun as well.
The main thing is: what are you comfortable with at the moment?
If Processing syntax looks intimidating, you can actually try a different programming paradigm like dataflow. In which case you can use Pure Data (free, open source) or Max/MSP (very similar, but commercial). The idea is that rather than typing instructions, you connect boxes with wires, which is fun, and the examples are great too.
If you're into C++ there are plenty of libraries. On the creative side, there's a nice set of libraries called openFrameworks that's easy and fun to use. If this is your cup of tea, have a peek at Maximilian.
The bottom line is: there are multiple options to achieve the same task. Choose the best tool for you (based on your background), or try each and see what you like best.
You asked "And which programming language do you think fits best for this task?" - I would also suggest using Processing. I have used Processing to work with sound before, and in all cases I used Minim. It has many UGens for generating sounds programmatically.
Also, you want to integrate with some sensors. I'm not sure what types of sensors you will use, but Processing goes pretty well with different Arduino modules and sensors. Check this link for more direction.
Furthermore, you can export your project as .exe or executable .jar files. And their JS version (p5.js) works almost the same as the Java version.

Creating .wav files of varying pitches but still having the same fundamental frequency

I am using pygame to play .wav files and want to change the pitch of a particular .wav file as each level in my game progresses. To explain, my game is a near copy of the old Oric1 computer OricMunch Pacman game, where there are a few hundred pills to be munched on each level, and for every pill that is munched a short sound is played, with the pitch of the sound increasing slightly for each pill eaten/munched.
Now here is what I have tried:
1) I have used Python's wave module to create multiple copies of the sound file, each newly created file having a slight increase in pitch (by changing the third parameter returned by getparams(), the framerate, sometimes referred to as the sample frequency) for each cycle of a for loop. Having achieved this, I could then within the loop create multiple sound objects to add to a list, and then index through the list to play the sounds as each pill is eaten.
The problem is that even though I can create hundreds of files (using the wave module) that play perfectly with their own unique pitches when played using Windows Media Player, or even Python's winsound module, pygame does not seem to interpret the difference in pitch.
Now interestingly, I have downloaded the free trial version of Power Sound Editor which has the option to change the pitch, and so I’ve created just a few .wav files to test, and they clearly play with different pitches when played in pygame.
Observations:
From printing the params in my for loop, I can see that the framerate/frequency is changing as intended, and so obviously this is why the sounds play as intended through windows media player and winsound.
Within pygame I suspect the reason they don’t play with different pitches is because the frequency parameter is fixed, either to the default settings or via the use of pygame.mixer.pre_init, which I have indeed experimented with.
I then checked the params for each .wav file created by the Power Sound Editor, and noticed that even though the pitch was changing, the frequency stayed the same, which is not totally surprising since you have to select one of three options to save the files: either 22050, 44100 or 96000 Hz.
So now I thought it was time to check out the difference between pitch and frequency specifically in relation to sound, since I thought they were the same. What I found is that there seem to be two principal aspects of sound waves: 1) the framerate/frequency, and 2) the varying amplitude of multiple waves based on that frequency. Now I'm far from clearly understanding this, but I realise the Power Sound Editor must be altering the shape/pitch of the sound by manipulating the varying amplitude of multiple waves, point 2) above, and not by changing the fundamental frequency, point 1) above.
I am a beginner to Python, pygame and programming in general, and have tried hard to find a simple way to change sound files to have gradually increasing pitches without changing the framerate/fundamental frequency. If there's a module that I can import to help me change the pitch by manipulating the varying amplitude of multiple waves (instead of changing the framerate/sample frequency, which is typically either 22050 or 44100 Hz), then it needs to take relatively no time at all if done on the fly, in order not to slow the game down. If the potential module opens, changes and then saves sound files, as opposed to altering them on the fly, then I guess it does not matter if it's slow, because I will just be creating the sound files so I can create sound objects from them in pygame to play.
Now if the only way to achieve no slow down in pygame is to create sound objects from sound files as I have already done, and then play them, then I need a way to manipulate the sound files like the Power Sound Editor (again I stress not by changing the framerate/sample frequency of typically 22050 or 44100) and then save the changed file.
I suppose in a nut shell, if I could magically automate Power Sound Editor to produce 3 to 4 hundred sound files without me having to click on the change pitch option and then save each time, this would be like having my own python way of doing it.
Conclusion:
Assuming creating sound objects from sound files is the only way not to slow my game down (as I suspect it might be) then I need the following:
An equivalent to the python wave module, but which changes the pitch like Power Sound Editor does, and not by changing the fundamental frequency like the wave module does.
Please can someone help me and let me know if there’s a way.
I am using python 3.2.3 and pygame 1.9.2
Also, I'm just using Python's IDLE and I'm not familiar with using other editors.
Also, I'm aware of NumPy and of various sound modules, but definitely don't know how to use them. Any potential modules would need to work with the above versions of Python and pygame.
Thank you in advance.
Gary Townsend.
My Reply To The First Answer From Andbdrew Is Below:
Thank you for your assistance.
It does sound like changing the wave file data rather than the wave file parameters is what I need to do. For reference here is the code I have used to create the multiple files:
import wave

framerate = 44100  # original .wav file framerate/sample frequency

for x in range(0, 25):
    file = wave.open('MunchEatPill3Amp.wav')
    nFrames = file.getnframes()
    wdata = file.readframes(nFrames)
    params = file.getparams()
    file.close()

    n = list(params)
    n[0] = 2            # nchannels
    n[2] = framerate    # framerate/sample frequency
    framerate += 500    # raise the rate a little for the next file
    params = tuple(n)

    name = 'PillSound' + str(x) + '.wav'
    file = wave.open(name, 'wb')
    file.setparams(params)
    print(params)
    file.writeframes(wdata)  # same data, different header
    file.close()
It sounds like writing different data would be equivalent or similar to how the Power Sound Editor is changing the pitch.
So please can you tell me if you know a way to modify/manipulate wdata to effectively change the pitch, rather than altering the sample rate in the params. Would this mean some relatively simple operation applied to wdata after it's read from my .wav file? (I really hope so.) I've heard of using NumPy arrays, but I have no clue how to use these.
Please note that any .wav files modified in the above code, do indeed play in Python using winsound, or in windows media player, with the pitch increase sounding as intended. It’s only in Pygame that they don’t.
As I’ve mentioned, it seems because Pygame has a set frequency (I guess this frequency is also sample rate), that this might be the reason the pitch sounds the same, as if it wasn’t increased at all. Whereas when played with e.g. windows media player, the change in sample rate does result in a higher sounding pitch.
I suppose I just need to achieve the same increase in pitch sound by changing the file data, and not the file parameters, and so please can you tell me if you know a way.
Thank you again for helping with this.
To Summarise My Initial Question Overall, Here It Is Again:
How do you change the pitch of a .wav file without changing the framerate/sample frequency, by using the python programming language, and not some kind of separate software program such as Power Sound Editor?
Thank You Again.
You should change the frequency of the wave in your sample instead of changing the sample rate. It seems like python is playing back all of your wave files at the same sample rate (which is good), so your changes are not reflected.
Sample rate is sort of like meta information for a sound file. Read about it at http://en.m.wikipedia.org/wiki/Sampling_rate#mw-mf-search .
It tells you the amount of time between samples when you convert a continuous waveform into a discrete one. Although your (ab)use of it is cool, you would be better served by encoding different frequencies of sound in your different files all at the same sample rate.
I took a look at the docs for the wave module ( http://docs.python.org/3.3/library/wave.html ) and it looks like you should just write different data to your audio files when you call
Wave_write.writeframes(data)
That is the method that actually writes your audio data to your audio file.
The method you described is responsible for writing information about the audio file itself, not the content of the audio data.
Wave_write.setparams(tuple)
"... Where the tuple should be (nchannels, sampwidth, framerate, nframes, comptype, compname), with values valid for the set*() methods. Sets all parameters... " ( also from the docs )
If you post your code, maybe we can fix it.
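One way to change the data rather than the header is to resample the frames: read the samples back out at a different step, then write the result at the original sample rate. This is a minimal, dependency-free sketch (nearest-sample resampling, which is crude and will alias; shift_pitch is a made-up name, and it assumes 16-bit mono data for simplicity). It demonstrates the idea on a generated tone instead of a file:

```python
import array
import math

def shift_pitch(samples, factor):
    """Resample 16-bit samples by `factor` (>1 raises pitch, <1 lowers it).

    The output is shorter or longer, but plays at the SAME sample rate,
    so the pitch changes without touching the wave header.
    """
    out = array.array('h')
    n = len(samples)
    i = 0.0
    while i < n:
        out.append(samples[int(i)])  # nearest-sample pick (crude)
        i += factor
    return out

# Demo on a generated sine wave instead of a file:
rate = 44100
tone = array.array('h', (int(32767 * math.sin(2 * math.pi * 440 * t / rate))
                         for t in range(rate // 10)))  # 0.1 s of 440 Hz
higher = shift_pitch(tone, 2 ** (1 / 12))  # one semitone up
print(len(tone), len(higher))
```

To apply it to your files: read the frames with wave, convert them to array('h'), run them through something like the above, and writeframes() the result with the original, unchanged framerate in the params.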
If you just want to create multiple files and you are using linux, try SoX.
#!/bin/bash
for i in `seq -20 10 20`; do
    sox 'input.wav' 'output_'$i'.wav' pitch $i;
done

Driving the sound card in Linux

On a basic embedded systems speaker with a single line of output, wiggling the output between 0 and 1 for given periods produces sound.
I'd like to do something similar on a modern Linux desktop. A brief look-see at PortAudio, OpenAL, and ALSA suggests to me that most people do things at a considerably higher level. That's ok, but not what I'm looking for.
(I've never worked with sounds on Linux before, so if a tutorial exists, I'd love to see it).
Actually, it... kinda is. While you can generate the waveform yourself, you still need to use an API to queue it and send it to the audio hardware; there no longer even exists a sane way to twiddle the audio line directly. Plus you get cross-platform compatibility for free.
[...] embedded systems speaker with a single line of output, wiggling the output as 0 or 1 in a for given periods produces sound.
Sounds a lot like the old PC speaker. You might still find code for it in the Linux kernel.
I'd like to do something similar on a modern Linux desktop.
Then AFAIK you need a driver for ALSA. There you can find information on how to write an ALSA driver. Use PWM to produce the sound.
Since there are many different sound cards and audio interfaces produced by different companies, there is no uniform way to have a low level access to them. With most sound I/O APIs what you need to do is to generate the PCM data and send that to the driver. That's pretty much the lowest level you can go.
But PCM data is very similar to the 0-1 approach you describe. It's just that you have the in-between options too. 0-1 is 1-bit audio. 8-, 16-, 24-bit audio is what you'll find on a modern sound card. There are also 32- and 64-bit float formats. But they're still similar.
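The parallel can be made concrete: a 1-bit square wave and its 16-bit PCM equivalent are the same waveform, the PCM version just has more amplitude resolution. A small sketch (pure Python, no audio output; the numbers are illustrative):

```python
import math

sample_rate = 8000
freq = 440               # Hz
n = sample_rate // 100   # 10 ms worth of samples

# The embedded-speaker approach: wiggle a single line between 0 and 1.
one_bit = [1 if math.sin(2 * math.pi * freq * t / sample_rate) >= 0 else 0
           for t in range(n)]

# The sound-card approach: the same square wave as signed 16-bit PCM,
# which is the kind of buffer you'd hand to ALSA/PortaAudio-style APIs.
pcm16 = [32767 if b else -32768 for b in one_bit]

print(one_bit[:8])
print(pcm16[:8])
```

With 16 bits you also get every level in between, which is what lets a sound card play arbitrary waveforms instead of just clicks and square waves.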

Sound Synthesis Framework in C/C++/Objective-C?

I've searched the net but didn't find anything interesting. Maybe I'm doing something wrong.
I'm looking for a sound synthesis API written in C, C++ or even Objective-C, which can synthesize different types of waves; effects are optional.
Here's a complete library/toolkit for FM (Frequency Modulation) synthesis:
link1
link2
If you have time to spare... creating simple sound synthesis from scratch is actually a fun endeavor. If you create a small buffer of 256 16-bit samples which represents either a sine, a sawtooth, a square or a pulse wave, you can copy these into a live audio buffer (e.g. a small buffer, say 16 kB, which constantly loops). By staying ahead of the play position and constantly filling up the buffer with new values, you can create the sound output.
You can combine the small buffers in interesting ways (the simplest is just to add them together: additive synthesis).
The frequency of the tone can be manipulated by using a bigger or smaller sampling step through the small buffers. Amplitude can be manipulated by scaling the samples before putting them into the output buffer.
Great fun to experiment with!
Once you have this step nailed, you can add more sophisticated processing like filters (low-pass, high-pass, etc.) and effects (reverbs, echoes, etc.).
R
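That wavetable idea can be sketched briefly (in Python rather than C, purely for compactness; the principle carries over directly): a single-cycle table, a fractional read position, a step size that sets the frequency, and a scale factor for amplitude. All names and numbers here are illustrative:

```python
import math

TABLE_SIZE = 256
# One cycle of a sine wave as 16-bit samples: the "small buffer".
sine_table = [int(32767 * math.sin(2 * math.pi * i / TABLE_SIZE))
              for i in range(TABLE_SIZE)]

def render(table, freq, amplitude, n_samples, sample_rate=44100):
    """Fill an output buffer by stepping through the wavetable.

    Bigger steps -> the table repeats more often per second -> higher pitch.
    Scaling each sample before writing it sets the amplitude.
    """
    step = len(table) * freq / sample_rate
    pos = 0.0
    out = []
    for _ in range(n_samples):
        out.append(int(amplitude * table[int(pos) % len(table)]))
        pos += step
    return out

a440 = render(sine_table, freq=440, amplitude=0.5, n_samples=1024)
a880 = render(sine_table, freq=880, amplitude=0.5, n_samples=1024)
# Additive synthesis: just sum the buffers sample by sample.
mixed = [(x + y) // 2 for x, y in zip(a440, a880)]
print(len(mixed), max(mixed))
```

In a real program, buffers like `mixed` would be written just ahead of the play position in the looping output buffer the answer describes.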
Have you looked at the Synthesis ToolKit (STK)? It's in C++ (I don't think ObjC is the right language for audio synthesis; in fact Audio Units, Apple's own way of doing audio stuff, including generators/filters/effects, are in C++).
STK will run on Mac OS X and iOS no problem (CoreAudio is supported), but will also run on Linux and Windows (DirectSound and ASIO), using RtAudio. It's really nicely done and lightweight; these guys have spent a lot of time thinking about it and it will definitely give you a big head start. It can handle loads of different audio file formats plus MIDI (and hopefully OSC soon...).
There are also Create and CLAM, which are huge; these include GUI components and many other things which you might or might not want. If you're only interested in doing sound synthesis, I really recommend STK.
PortAudio is also a great C API that we used last semester in an audio programming course. It provides an audio callback...what more could you need!?
I haven't tried incorporating it with anything in Objective-C yet, but will report back when I do.
Writing audio synthesis algorithms in C/obj-C is quite difficult in my opinion. I would recommend writing your signal processing algorithms using PureData and then use ZenGarden or libpd to embed and interpret the pd patches in your app.
Another C++ library is nsound:
http://nsound.sourceforge.net
One can generate any kind of modulated signal using the Generator class or the provided Sine class. Each time step can have its own instantaneous frequency and phase offset.
You can also experiment with the Python module to prototype your algorithm quickly, then implement in C++. It can produce pretty matplotlib plots from Python and even from C++!
Have you looked at Csound? It's an incredibly flexible audio generation platform, and can handle everything from simple waveform generation to FM synthesis and all kinds of filters. It also provides MIDI support, and you can extend it by writing custom opcodes. There's a full C API and several C++ APIs as well.
