Clicking Sounds When Playing Clips in Rapid Succession - audio

I have a very simple program that plays 4 different tones, depending on what button is pressed. I have found that if I play multiple tones or the same tone in rapid succession, there are unpleasant clicking noises produced. I have made sure that these clicks are not present in my audio samples; it is definitely caused by playing the clips quickly one after another.
After googling around, I'm fairly sure that the clicks are due to the rapid change in pitch between clips. Looking at the waveform of the playback from the offending audio, it looks like a clip is first cancelled for a fraction of a second before starting the next clip. I have highlighted the section where this seems particularly obvious.
The clip that showcases these audio clicks can also be downloaded here.
My code is very simple. I am using XInput to read input from a connected controller, which determines the tone to play, and I am using WinMM to output sound from wav files. It is written in the D programming language, but I have modified it to use no D-specific features to make it as C-like as possible and to avoid confusion.
SHORT keyPressed(int vkey)
{
enum highBit { val = 0x8000 }
return cast(SHORT)(GetKeyState(vkey) & highBit.val);
}
enum Button
{
DPAD_UP = 0x0001,
DPAD_DOWN = 0x0002,
DPAD_LEFT = 0x0004,
DPAD_RIGHT = 0x0008,
START = 0x0010,
BACK = 0x0020,
LEFT_THUMB = 0x0040,
RIGHT_THUMB = 0x0080,
LEFT_SHOULDER = 0x0100,
RIGHT_SHOULDER = 0x0200,
A = 0x1000,
B = 0x2000,
X = 0x4000,
Y = 0x8000,
}
struct XINPUT_GAMEPAD
{
WORD wButtons;
BYTE bLeftTrigger;
BYTE bRightTrigger;
SHORT sThumbLX;
SHORT sThumbLY;
SHORT sThumbRX;
SHORT sThumbRY;
}
struct XINPUT_STATE
{
DWORD dwPacketNumber;
XINPUT_GAMEPAD Gamepad;
bool isPressed(int button)
{
return cast(bool)(Gamepad.wButtons & button);
}
}
int main()
{
HANDLE xinputDLL = initXinput();
XINPUT_STATE oldState;
XINPUT_STATE newState;
while (!keyPressed(VK_ESCAPE))
{
oldState = newState;
XInputGetState(0, &newState);
enum flags { val = SND_ASYNC | SND_FILENAME | SND_NODEFAULT }
if (newState.isPressed(Button.A) && !oldState.isPressed(Button.A))
{
PlaySoundA(toStringz("Piano.ff.A4.wav"), null, flags.val);
}
if (newState.isPressed(Button.B) && !oldState.isPressed(Button.B))
{
PlaySoundA(toStringz("Piano.ff.B4.wav"), null, flags.val);
}
if (newState.isPressed(Button.X) && !oldState.isPressed(Button.X))
{
PlaySoundA(toStringz("Piano.ff.C5.wav"), null, flags.val);
}
if (newState.isPressed(Button.Y) && !oldState.isPressed(Button.Y))
{
PlaySoundA(toStringz("Piano.ff.F4.wav"), null, flags.val);
}
}
denitXinput(xinputDLL);
return 0;
}
Assuming that I'm correct in regards to the source of the clicking sounds, I think the solution is to have each sample fade into the next one. However, I am not sure how to do this as the WinMM documentation seems relatively sparse, and I am inexperienced with it.
Is the solution to my problem of clicks when playing audio samples to have each sample fade into the next one? If so, how can I accomplish this using WinMM? If not, is there another solution that I can try?

I know how we can solve this in theory, but I don't have actual working code yet for all cases. (When I do, I'll edit this.)
First, the simple case which kinda works: instead of using PlaySound, try mciSendStringA:
if(auto err = mciSendStringA("play test.wav", null, 0, null))
writeln(err);
I am not making that up, Windows actually has that function, and it actually works with a lot of little command strings and file formats (though if your program terminates, all sound stops, so make sure the program keeps running e.g. stay in your controller loop or call Sleep(something)).
I've used a lot of Win32 and sometimes I'm amazed by how much stuff it has. Prototype:
extern(Windows) uint mciSendStringA(in char*,char*,uint,void*);
found in winmm.lib.
That basically works, but in my test, playing the same file twice at the same time has no effect. Playing different files together mixes them though. So it is a partial solution.
Next step from that would be to use the mciSendCommand function - a bit lower level than send string, so you can open multiple devices and try to get more overlap that way:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd743675%28v=vs.85%29.aspx
I haven't tried this yet, but it looks fairly simple and I suspect it might be good enough for you. Open up a few devices for each button so you can hit them a few times fast and it cycles through them, hopefully mixing the same sound more than once when needed.
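The cycling idea can be sketched independently of MCI itself. This is only an illustration in Python, not something I have run against WinMM: VoicePool and the alias naming scheme are made-up names, and it assumes each alias has already been opened as its own device on the same wav file.

```python
from itertools import cycle


class VoicePool:
    """Hands out a fixed set of device aliases in round-robin order.

    Hypothetical helper: each alias stands for one already-opened
    device that plays the same wav file.
    """

    def __init__(self, name, count):
        # e.g. name="a4", count=3 -> a4_0, a4_1, a4_2
        self._aliases = cycle([f"{name}_{i}" for i in range(count)])

    def next_alias(self):
        # Each button press takes the next device, so a press that
        # arrives quickly does not have to restart the one still playing.
        return next(self._aliases)


pool = VoicePool("a4", 3)
presses = [pool.next_alias() for _ in range(4)]
print(presses)
```

With three devices per button, the fourth press wraps around to the first device again, so the pool size decides how many overlapping copies of one note you can get.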
The prototype to that is:
extern(Windows) uint /*MCIERROR*/ mciSendCommandA(MCIDEVICEID, UINT, DWORD_PTR, DWORD_PTR); // DWORD_PTR is pointer-sized, same as DWORD on 32-bit
Yes, it casts to void* then to DWORD in the msdn example. Blargh. Relevant structs:
struct MCI_OPEN_PARMSA {
DWORD dwCallback;
MCIDEVICEID wDeviceID; // aka uint
LPCSTR lpstrDeviceType;
LPCSTR lpstrElementName;
LPCSTR lpstrAlias;
}
struct MCI_PLAY_PARMS {
DWORD dwCallback;
DWORD dwFrom;
DWORD dwTo;
}
and you can borrow some constants from here too:
https://github.com/AndrejMitrovic/DWinProgramming/blob/master/WindowsAPI/win32/mmsystem.d#L693
(if you are already using the win32 bindings, great! But I think they are kinda a pain for little things so I try to avoid them, preferring to copy/paste prototypes+structs+constants off MSDN as I need them.)
You should be able to get the MSDN example working with those definitions and core.sys.windows.windows. Don't forget pragma(lib, "winmm"); too.
I think a full solution that will certainly work, but is also quite a bit harder, will be using the low level interface to mix the sounds yourself as they happen and send that result to the device. I don't have this working yet and I'm out of time today, but hopefully I can get something to you tomorrow.
The basic steps are:
1) call waveOutOpen to get a device. Set up a callback function which it calls when it needs more data.
2) prepare a buffer - or perhaps more than one - with waveOutPrepareHeader
3) feed data with waveOutWrite when requested by your callback (might want this in a separate thread) with the current notes. Mixing two samples is simply a case of adding the values together (and clipping if they overflow - sounds awful btw but hopefully that won't actually happen) so if you are doing more than one sound, just add them as you go.
Don't forget extern(Windows) on any callback function!
4) Loading your samples probably means reading the .wav file. That's not super hard, Windows has helper functions or you can do it yourself. I'll show code for this too.
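The mixing part of step 3 is small enough to sketch on its own. This is a Python illustration rather than the D/WinMM code, assuming signed 16-bit PCM samples already decoded into lists of integers; the function name mix is just for this example.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(a, b):
    """Mix two sequences of signed 16-bit samples.

    Samples are added pairwise; the shorter input is treated as
    silence once it runs out. Sums outside the 16-bit range are
    clamped, which is the clipping mentioned in step 3.
    """
    length = max(len(a), len(b))
    out = []
    for i in range(length):
        sa = a[i] if i < len(a) else 0
        sb = b[i] if i < len(b) else 0
        total = sa + sb
        out.append(max(INT16_MIN, min(INT16_MAX, total)))
    return out
```

For more than two sounds you would keep adding into the same running total before clamping once at the end, rather than clamping after every pair.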
What I have so far is in my simpleaudio.d https://github.com/adamdruppe/arsd/blob/master/simpleaudio.d find struct AudioOutput and the WinMM version. It has a horrible API right now that must be radically changed - it was acceptable on Linux but sucks on Windows. A callback feeder instead of write(data) should work better on both platforms, so that's what I'll do.
Problem I'm having with the demo right now is gaps between buffers... leading to clicky sounds. Yeah. But I'm sure it is just latency that should be solved with the proper callback approach and buffer sizing.
That MCI function might work for you as a next step though, maybe even a final step if the multiple devices works.
BTW: you could also prolly make it do MIDI commands instead of playing wavs and get all kinds of cool stuff. Simpleaudio.d's low level midi is already functioning - the demo main even shows a piano scale. Rigging it into the xbox controller shouldn't be too hard... note on when the button is pressed, note off when released, and not even think about timing.. Not really an answer to the question but a cool thing to play with in the same vein!
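To give an idea of how little data the MIDI route needs, here is a sketch in Python of packing note on/off messages into the single integer form that midiOutShortMsg takes (status byte in the low byte, then the two data bytes). The function names and the button mapping are only for illustration; the note numbers follow the usual MIDI numbering where A4 is 69.

```python
NOTE_ON = 0x90   # status byte for "note on", channel in the low nibble
NOTE_OFF = 0x80  # status byte for "note off"


def midi_short_msg(status, data1, data2):
    """Pack a three-byte MIDI message into one integer, low byte first."""
    return status | (data1 << 8) | (data2 << 16)


def note_on(note, velocity=127, channel=0):
    return midi_short_msg(NOTE_ON | channel, note, velocity)


def note_off(note, channel=0):
    return midi_short_msg(NOTE_OFF | channel, note, 0)


# Illustrative mapping from the controller buttons in the question
# to the notes of its four wav files.
BUTTON_NOTES = {"A": 69, "B": 71, "X": 72, "Y": 65}  # A4, B4, C5, F4
```

On a button press you would send note_on(BUTTON_NOTES[button]) and on release the matching note_off, with no sample timing to manage.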

Related

Grab an image from willOutputSampleBuffer or related?

So, Brad Larson is awesome. I'm using his GPUImage library since he optimized the CGContextCreateImage for video output to instead render straight into OpenGL. Then he rewrote it to be even more amazing, and half the questions are outdated. The other half have the new callbacks, like this question, but for the life of me, I can't get the video frames callback (the CMSampleBuffer to CIImage functions) to not be nil.
I know I have to "tag the next frame" to be kept in memory, thanks to his blog. I also know I process it (but GPUImageVideo also does that), then I grab from the framebuffer. Still nil.
The capture command that's supposed to auto-filter it into a CGImage, from the CGImagePicture's processImageUpToFilter function seems to be what I want, and I've seen it mentioned, but I am lost as to how to hook up the output to its frameBuffer.
Or should I use GPUImageRawDataOutput, and how to hook up? I've been copying and pasting, editing, experimenting, but unsure if it's just the fact I don't know openGL enough to hook up the right stuff or?
Any help is appreciated. I wouldn't ask, since so many related questions are up here, but I use them and still get nil on the output.
Here is my current try:
func willOutputSampleBuffer(sampleBuffer: CMSampleBuffer!) {
gpuImageVideoCamera.useNextFrameForImageCapture()
//Next line seems like a waste, as this func is called in GPUImageVideoCamera already.
gpuImageVideoCamera.processVideoSampleBuffer(sampleBuffer);
if let image = gpuImageVideoCamera.imageFromCurrentFramebuffer()
{
//it's nil
}
}
It seems to use the filter instead, and useNextFrame should be AFTER processing to not go super-slow.
Inside of willOutputSampleBuffer, this is it.
if let image = transformFilter.imageFromCurrentFramebuffer()
{
// image not nil now.
}
transformFilter.useNextFrameForImageCapture(); // ensure this comes after
This has given us stunning speeds that beat Google's p2p library. Brad, thanks, everyone should support your efforts.

AS3 load new URLRequest through a string

I want to load a new URL (which is in an array) every time I press the button. I have the following code to do so:
public function selectRadio(radio:Radio):void {
var soundR:Sound = new Sound();
if(!playing) {
soundR.load(new URLRequest(radio.getURL()));
soundChannel = soundR.play();
playing = true;
}
else{
soundChannel.stop();
playing = false;
}
trace("You are now listening to " + radio.getTitle());
}
But it gives me this error:
"implicit coercion of a value of type flash.net:URLRequest to an unrelated type string"
It works if I just leave it like this:
soundR.load(radio.getURL());
But if I do so, I can only press play and stop 4 times. After the fourth there is no sound, like it can't load the URL.
Is it possible to fix this?
Radio.getURL() should return a string instead of a URLRequest?
Ah nevermind I see you tried avoiding that by removing URLRequest, but you are having problems with only 4 connections.
In all but the simplest cases, your application should pay attention to the sound’s loading progress and watch for errors during loading. For example, if the click sound is fairly large, it might not be completely loaded by the time the user clicks the button that triggers the sound. Trying to play an unloaded sound could cause a run-time error. It’s safer to wait for the sound to load completely before letting users take actions that might start sounds playing.
http://help.adobe.com/en_US/as3/dev/WS5b3ccc516d4fbf351e63e3d118a9b90204-7d25.html
If you can't wait for the sound to completely load you may need NetStream:
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/net/NetStream.html

Best approach for playing a single tone audio file?

Can anyone tell me the best approach to playing single-tone, audio (.mp3) files in a Windows Phone 8 app? Think of a piano app, where each key would represent a button, and each button would play a different tone.
I'm looking for the most efficient way to go about this - I've got 8 different buttons that need to play a different tone when tapped.
I tried using the MediaElement:
MediaElement me;
public MainPage()
{
InitializeComponent();
me = new MediaElement();
me.AutoPlay = false;
me.Source = new Uri("/Sounds/Sound1.mp3", UriKind.Relative);
btnPlay.Click += btnPlay_Click;
}
private void btnPlay_Click(object sender, EventArgs e)
{
me.Play();
}
But nothing happens, either in the emulator or on a device (testing w/ a Lumia 822). Am I doing something wrong here? It seems like it should be pretty simple. Or would using MediaElement even be the best thing to use for my scenario?
Would this fall under the Background Audio category? I've read through this example but it seems overkill for what I want to do.
I've also read about using XNA's SoundEffect to do the job, but then I'd have to convert my .mp3 files to .wav (which isn't necessarily a problem, but I'd rather not go through that if I don't need to).
Can anyone tell me either what I'm doing wrong in my example above or guide me to a better solution for playing quick <1s audio tones?
I had this problem before with MediaElement not playing audio files. After many attempts I found out that it only plays if it is defined in the XAML and AutoPlay is set to true.
Try defining it in the xaml or you can just add it to your LayoutRoot.
var me = new MediaElement();
LayoutRoot.Children.Add(me);
me.AutoPlay = true;
me.Source = new Uri("Sound/1.mp3", UriKind.Relative);
I have had good luck just doing this piece of code in my app. But it may not work as well in your context, give it a whirl though.
mediaElement.Source = new Uri("/Audio/" + songID.ToString() + ".mp3", UriKind.Relative);
mediaElement.Play();

FMOD surround sound openframeworks

Ok, I hope I don't mess this up, I have had a look for some answers but can't find anything. I am trying to make a simple sampler in openframeworks using the FMOD sound player in 3D mode. I can make a single instance work fine (recording a new file using libsndfilerecorder and then playing it back and moving it in surround).
However I want to have 8 layers of looping audio that I can record and replace one layer at a time in a live show. I get a lot of problems as soon as I have more than 1 layer.
The first part of my question relates to the FMOD 3D modes. It is listener relative, so I have to define the position of my listener for every sound (I would prefer to have head relative mode but I cannot make this work at all). Again this works fine when I am using a single player but with multiple players only the last listener I update actually works.
The main problem I have is that when I use multiple players I get distortion, and often a mix of other currently playing sounds (even when the microphone cannot hear them) in my new recordings. Is there an incompatibility between libsndfilerecorder and FMOD?
Here I initialise the players
for (int i=0; i<CHANNEL_COUNT; i++) {
lvelocity[i].set(1, 1, 1);
lup[i].set(0, 1, 0);
lforward[i].set(0, 0, 1);
lposition[i].set(0, 0, 0);
sposition[i].set(3, 3, 2);
svelocity[i].set(1, 1, 1);
//player[1].initializeFmod();
//player[i].loadSound( "1.wav" );
player[i].setVolume(0.75);
player[i].setMultiPlay(true);
player[i].play();
setupHold[i]=false; // assignment; the original used == which compares and discards the result
recording[i]=false;
channelHasFile[i]=false;
settingOsc[i]=false;
}
When I am recording I unload the file and make sure the positions of the player that is not loaded are not updating.
void fmodApp::recordingStart( int recordingId ){
if (recording[recordingId]==false) {
setupHold[recordingId]=true; //this stops the position updating
cout<<"Start recording Channel " + ofToString(recordingId+1)+" setup hold is true \n";
pt=getDateName() +".wav";
player[recordingId].stop();
player[recordingId].unloadSound();
audioRecorder.setup(pt);
audioRecorder.setFormat(SF_FORMAT_WAV | SF_FORMAT_PCM_16);
recording[recordingId]=true; //this starts the libSndFIleRecorder
}
else {
cout<<"Channel" + ofToString(recordingId+1)+" is already recording \n";
}
}
And I stop the recording like this.
void fmodApp::recordingEnd( int recordingId ){
if (recording[recordingId]==true) { // compare; a single = would assign and always be true
recording[recordingId]=false;
cout<<"Stop recording" + ofToString(recordingId+1)+" \n";
audioRecorder.finalize();
audioRecorder.close();
player[recordingId].loadSound(pt);
setupHold[recordingId]=false;
channelHasFile[recordingId]=true;
cout<< "File recorded channel " + ofToString(recordingId+1) + " file is called " + pt + "\n";
}
else {
cout << "Sorry track" + ofToString(recordingId+1) + "is not recording";
}
}
I am careful not to interrupt the updating process but I cannot see where I am going wrong.
Many Thanks
to deal with the distortion, i think you will need to lower the volume of each channel on playback; try setting the volume to 1/8 of the max volume. there isn't any limiting going on, so if the sum of the sounds is > 1.0f the output will clip and it will sound bad.
to deal with crosstalk when recording: i guess you have some sort of feedback going on with the output, ie the output sound is being fed back into the input channel, probably by the operating system. if you run another app that makes sound do you also get that in your recording as well? if so then that is probably your problem.
if it works with one channel, try it with just 2, instead of jumping straight up to 8 channels.
in general i would try to abstract out the playback/record logic and soundPlayer/recorder into a separate class. you have a couple of booleans there and it's really easy to make mistakes with >1 boolean. is there any way you can replace the booleans with an enum or an integer state variable?
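a rough sketch of the single-state-variable idea, written in Python just to keep it short (the ChannelState and Channel names are made up for the example and are not from openframeworks or FMOD):

```python
from enum import Enum, auto


class ChannelState(Enum):
    EMPTY = auto()      # nothing recorded on this layer yet
    RECORDING = auto()  # recorder is writing, player is unloaded
    PLAYING = auto()    # a recorded file is loaded and looping


class Channel:
    """One layer whose state is a single value instead of several booleans."""

    def __init__(self):
        self.state = ChannelState.EMPTY

    def start_recording(self):
        # Refuse to start twice; a layer is in exactly one state.
        if self.state is ChannelState.RECORDING:
            return False
        self.state = ChannelState.RECORDING
        return True

    def stop_recording(self):
        # Only a recording layer can be stopped and switched to playback.
        if self.state is not ChannelState.RECORDING:
            return False
        self.state = ChannelState.PLAYING
        return True
```

because a channel can only be in one state at a time, combinations like "recording and has a file and is on hold" cannot be represented by accident.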
EDIT: I didn't see the date on your question :D Suppose you managed to do it by now. Maybe it helps somebody else..
I'm not sure if I can answer everything of your question, but I can share how I've worked with 3D sound in FMOD. I haven't worked with recording though.
For my own application a user can place sounds in 3D space around himself. For this I only have one Listener and multiple Sounds. In your code you're making a listener for every sound, are you sure that is necessary? I would imagine that this causes the multiple listeners to pick up multiple sounds and output that to your soundcard. So from the second sound+listener, both listeners pick up both sounds? I'm not 100% sure but it sounds plausible to me.
I made a class to create sound objects (and one listener). Then I use a vector to store the objects and move through them to render them.
My class SoundBox basically holds all the necessary things for FMOD
Making a "SoundBox" object and adding it to my soundboxes vector:
SoundBox * box = new SoundBox(box_loc, box_rotation, box_color);
box->loadVideo(ofToDataPath(video_files[soundboxes.size()]));
box->loadSound(ofToDataPath(sound_files[soundboxes.size()]));
box->setVolume(1);
box->setMultiPlay(true);
box->updateSound(box_loc, box_vel);
box->play();
soundboxes.push_back(box);
Constructor for the SoundBox. I use a similar constructor in the same class for the listener, but since the listener will always be at the origin for me, it doesn't take any arguments and just sets all the listener locations to 0. The constructor for the listener only gets called once, while the one for the Sound gets called whenever I want to make a new one. (don't mind the box_color. I'm drawing physical boxes in this case..):
SoundBox::SoundBox(ofVec3f box_location, ofVec3f box_rotation, ofColor box_color) {
_box_location = box_location;
_box_rotation = box_rotation;
_box_color = box_color;
sound_position.x = _box_location.x;
sound_position.y = _box_location.y;
sound_position.z = _box_location.z;
sound_velocity.x = 0;
sound_velocity.y = 0;
sound_velocity.z = 0;
Then I just use a for loop to loop through them and play them if they're not playing. I also have some similar code to select them and move them around.
for(auto box = soundboxes.begin(); box != soundboxes.end(); box++){
if(!(*box)->getIsPlaying())
(*box)->play();
}
I really hope this helps. I'm not a very experienced programmer but this is how I got FMOD with multiple sounds to work in OpenFrameworks and hope you can use some of it. I just dumped as much of my code as I could :D
My main suggestion is to make one listener instead of more. Also having a class for making the sounds is useful if you, for instance, want to relocate the sounds after the initial placement.
Hope it helps and good luck :)

Linux/X11 input library without creating a window

Is there a good library to use for gathering user input in Linux from the mouse/keyboard/joystick that doesn't force you to create a visible window to do so? SDL lets you get user input in a reasonable way, but seems to force you to create a window, which is troublesome if you have abstracted control so the control machine doesn't have to be the same as the render machine. However, if the control and render machines are the same, this results in an ugly little SDL window on top of your display.
Edit To Clarify:
The renderer has an output window. In its normal use case that window is full screen, except when the controller and renderer are running on the same computer, just so it is possible to give the controller focus. There can actually be multiple renderers, each displaying a different view of the same data on different computers, all controlled by the same controller, hence the total decoupling of the input from the output (which makes taking advantage of the built-in X11 client/server stuff for display less usable). Multiple controller applications for one renderer are also possible. Communication between the controllers and renderers is via sockets.
OK, if you're under X11 and you want to get the kbd, you need to do a grab.
If you're not, my only good answer is ncurses from a terminal.
Here's how you grab everything from the keyboard and release again:
/* Demo code, needs more error checking, compile
* with "gcc nameofthisfile.c -lX11".
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <X11/Xlib.h>
int main(int argc, char **argv)
{
Display *dpy;
XEvent ev;
char *s;
unsigned int kc;
int quit = 0;
if (NULL==(dpy=XOpenDisplay(NULL))) {
perror(argv[0]);
exit(1);
}
/*
* You might want to warp the pointer to somewhere that you know
* is not associated with anything that will drain events.
* (void)XWarpPointer(dpy, None, DefaultRootWindow(dpy), 0, 0, 0, 0, x, y);
*/
XGrabKeyboard(dpy, DefaultRootWindow(dpy),
True, GrabModeAsync, GrabModeAsync, CurrentTime);
printf("KEYBOARD GRABBED! Hit 'q' to quit!\n"
"If this job is killed or you get stuck, use Ctrl-Alt-F1\n"
"to switch to a console (if possible) and run something that\n"
"ungrabs the keyboard.\n");
/* A very simple event loop: start at "man XEvent" for more info. */
/* Also see "apropos XGrab" for various ways to lock down access to
* certain types of info. coming out of or going into the server */
for (;!quit;) {
XNextEvent(dpy, &ev);
switch (ev.type) {
case KeyPress:
kc = ((XKeyPressedEvent*)&ev)->keycode;
s = XKeysymToString(XKeycodeToKeysym(dpy, kc, 0));
/* s is NULL or a static no-touchy return string. */
if (s) printf("KEY:%s\n", s);
/* s may be NULL, so check it before comparing */
if (s && !strcmp(s, "q")) quit=~0;
break;
case Expose:
/* Often, it's a good idea to drain residual exposes to
* avoid visiting Blinky's Fun Club. */
while (XCheckTypedEvent(dpy, Expose, &ev)) /* empty body */ ;
break;
case ButtonPress:
case ButtonRelease:
case KeyRelease:
case MotionNotify:
case ConfigureNotify:
default:
break;
}
}
XUngrabKeyboard(dpy, CurrentTime);
if (XCloseDisplay(dpy)) {
perror(argv[0]);
exit(1);
}
return 0;
}
Run this from a terminal and all kbd events should hit it. I'm testing it under Xorg
but it uses venerable, stable Xlib mechanisms.
Hope this helps.
BE CAREFUL with grabs under X. When you're new to them, sometimes it's a good
idea to start a time delay process that will ungrab the server when you're
testing code and let it sit and run and ungrab every couple of minutes.
It saves having to kill or switch away from the server to externally reset state.
From here, I'll leave it to you to decide how to multiplex renderers. Read
the XGrabKeyboard docs and XEvent docs to get started.
If you have small windows exposed at the screen corners, you could jam
the pointer into one corner to select a controller. XWarpPointer can
shove the pointer to one of them as well from code.
One more point: you can grab the pointer as well, and other resources. If you had one controller running on the box in front of which you sit, you could use keyboard and mouse input to switch it between open sockets with different renderers. You shouldn't need to resize the output window to less than full screen anymore with this approach, ever. With more work, you could actually drop alpha-blended overlays on top using the SHAPE and COMPOSITE extensions to get a nice overlay feature in response to user input (which might count as gilding the lily).
For the mouse you can use GPM.
I'm not sure off the top of my head for keyboard or joystick.
It probably wouldn't be too bad to read directly off the /dev files if need be.
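For the joystick, reading the device file really is short. Below is a sketch in Python that decodes one event as laid out by the Linux joystick API's struct js_event (a 32-bit timestamp, a signed 16-bit value, then type and number bytes). The function name and the returned dictionary are just for illustration, and /dev/input/js0 in the usage note is only the typical device path, so treat it as an assumption about your system.

```python
import struct

# struct js_event: __u32 time; __s16 value; __u8 type; __u8 number;
JS_EVENT_FORMAT = "IhBB"
JS_EVENT_SIZE = struct.calcsize(JS_EVENT_FORMAT)

JS_EVENT_BUTTON = 0x01  # button pressed or released
JS_EVENT_AXIS = 0x02    # axis moved
JS_EVENT_INIT = 0x80    # flag set on the synthetic initial-state events


def parse_js_event(data):
    """Decode one raw joystick event into a small dictionary."""
    time_ms, value, ev_type, number = struct.unpack(JS_EVENT_FORMAT, data)
    kinds = {JS_EVENT_BUTTON: "button", JS_EVENT_AXIS: "axis"}
    return {
        "time_ms": time_ms,
        "kind": kinds.get(ev_type & ~JS_EVENT_INIT, "unknown"),
        "initial": bool(ev_type & JS_EVENT_INIT),
        "number": number,
        "value": value,
    }
```

To use it, open the device in binary mode (for example open("/dev/input/js0", "rb")) and pass each JS_EVENT_SIZE-byte read to parse_js_event; no window is involved at any point.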
Hope it helps

Resources