In the jukebox.c example of libspotify I count all frames of the current track in the music_delivery callback. When end_of_track is called the frames count is different each time I played the same track. So end_of_track is called several seconds after the song is over. And this timespan differs for each playback.
How can I determine if the song is really over? Do I have to take the duration of the song in seconds and multiply it with the sample rate to take care when the song is over?
Why are more frames delivered than necessary for the track? And why is end_of_track not called on the real end of it? Or I am missing something?
end_of_track is called when libspotify has finished delivering audio frames for that track. This is not information about playback - every playback implementation I've seen keeps an internal buffer between libspotify and the sound driver.
Depending on where you're counting, this will account for the difference you're seeing. Since the audio code is outside of libspotify, you need to keep track of what's actually going to the sound driver yourself and stop playback, skip to the next track or whatever you need to do accordingly. end_of_track is basically there to let you know that you can close any output streams you may have from the delivery callback to your audio code or something along those lines.
Related
I want to create a speech jammer. It is essentially something that repeats back to you what you just said, but it is continuous. I was trying to use the sounddevice library and record what I am saying while also playing it back. Then I changed it to originally record what I was saying, then play it back while also recording something new. However it is not functioning as I would like it. Any suggestions for other libraries? Or if someone sees a suggestion for the code I already have.
Instead of constantly playing back to me, it is starting and stopping. It does this at intervals of the duration specified. So it will record for 500 ms, then play that back for 500 ms and then start recording again. Wanted behavior would be - recording for 500ms while playing back the audio it is recording at some ms delay.
import sounddevice as sd
import numpy as np
fs = 44100
sd.default.samplerate = fs
sd.default.channels = 2
#the above is to avoid having to specify arguments in every function call
duration = .5
myarray = sd.rec(int(duration*fs))
while(True):
sd.wait()
myarray = sd.playrec(myarray)
sd.wait()
Paraphrasing my own answer from https://stackoverflow.com/a/54569667:
The functions sd.play(), sd.rec() and sd.playrec() are not meant to be used repeatedly in rapid succession. Internally, they each time create an sd.OutputStream, sd.InputStream or sd.Stream (respectively), play/record the audio data and close the stream again. Because of opening and closing the stream, gaps will occur. This is expected.
For continuous playback you can use the so-called "blocking mode" by creating a single stream and calling the read() and/or write() methods on it.
Or, what I normally prefer, you can use the so-called "non-blocking mode" by creating a custom "callback" function and passing it to the stream on creation.
In this callback function, you can e.g. write the input data to a queue.Queue and read the output data from the same queue. By filling the queue by a certain amount of zeros beforehand, you can specify how long the delay between input and output shall be.
You can have a look at the examples to see how callback functions and queues are used.
Let me know if you need more help, then I can try to come up with a concrete code example.
I'm seeing a potential problem here of you trying to use myarray as both the input and the output of the .playrec() function. I would recommend having two arrays, one for recording the live audio, and one for playing back the recorded audio.
Instead of using the .playrec() command, you could just rapidly alternate between the use of .record() and .play() with a small delay between within your while-loop.
For example, the following code should record for one millisecond, wait a millisecond, and then playback the one millisecond of audio:
duration = 0.001
while(True):
myarray= sd.rec(int(duration*fs))
sd.wait()
sd.play(myarray, (int(duration*fs)))
There is no millisecond delay after the playback because you want to go right back to recording the next millisecond straight away. It should be noted, however, that this does not keep a recording of your audio for more than one millisecond! You would have to add your own code that adds to the array of a specified size and fills it up over time.
I am building a ambient audio skill for sleep for Alexa! I am trying to loop the audio so I don't have to download 10 hour versions of the audio. How do I get the audio to work? I have it build to where it will play the audio, but not loop.
I've solved this problem in my skill Rainmaker: https://www.amazon.com/Arif-Gebhardt-Rainmaker/dp/B079V11ZDM
The trick is to handle the PlaybackNearlyFinished event.
https://developer.amazon.com/de/docs/alexa-voice-service/audioplayer.html#playbacknearlyfinished
This event is fired shortly before the currently playing audio stream is ending.
Respond to the event with another audioPlayerPlay directive with behavior ENQUEUE. This will infinitely loop your audio until it gets interrupted by e.g. the AMAZON.StopIntent.
Advanced: if you want a finite loop, say ten times your audio, use the token of the audioPlayerPlay directive to count down from ten. Once the counter hits zero, just don't enqueue another audio. But be sure to respond something in this case, even if it's just an empty response. Otherwise you will get a timeout error or the like.
I am trying to cast just a snippet of a file (say, only from 00:00:30 to 00:00:40) from a Chrome sender to the default receiver. Reading the API reference documentation documentation for LoadRequest, MediaInfo, and QueueItem, it seemed like I should be able to do this with some combination of these. In particular, the first queued item (loaded with CastSession#loadMedia) would need LoadRequest#currentTime set to the offset (30 seconds in my example above) and MediaInfo#duration set to the duration (10 seconds in my example), while subsequently queued items would set QueueItem#startTime and QueueItem#playbackDuration to the offset and duration (respectively).
However, this isn't happening in practice. I can confirm that the queue on the receiver has these fields set, but the no matter how I go about this, I can't get the right snippet to play. When I add the first media item as described above, the receiver just plays the track from beginning to end, neither respecting the offset nor the duration. Since the combination of LoadRequest#currentTime and MediaInfo#duration is a bit odd, I tried using only the QueueItem method (add the first media item with autoplay = false, add another queue item, remove the first, and then start playing the queue). In this case, the offset was still not respected, and the duration ended up being (very strangely) the sum of startTime and playbackDuration (in addition, any subsequently queued items would load, and then "finish" playing without starting, which I also can't figure out).
Does anyone else have experience with this part of the API? Am I reading the documentation incorrectly and what I'm doing just isn't supported, or am I just piecing things together incorrectly?
I am not sure I understand why you are attempting to use a queue with multiple items. First, the duration field is not what you think it is; it is not the duration of play back that you want, it is the total duration of the media that is being loaded, regardless of where you start or stop the playback. In fact, in most cases, you don't even need to set that; the receiver gets the total duration of the media when it loads he item, at least in the majority of the cases. The currentTime should work (if it is not, please file a bug on our SDK issue tracker) and alternatively, you can load a media (with autoplay off) and "seek" to the time you want and then play. To stop at a certain point, you need to monitor the the playback location and when it reaches that point, pause the playback.
I have a playlist, and I want to sequentially play through the tracks, but every time a new track is loaded, I want to call a function. How would I go about listening for this event?
SPPlaybackManager, the playback class in CocoaLibSpotify, doesn't automatically play tracks sequentially, so you have to manually tell it to play each time. Since you're managing that, you already know when a new track is starting playback.
Additionally, SPPlaybackManagerDelegate has a method -playbackManagerWillStartPlayingAudio:, which will let you know when audio starts hitting the speakers.
I have built a standalone app version of a project that until now was just a VST/audiounit. I am providing audio support via rtaudio.
I would like to add MIDI support using rtmidi but it's not clear to me how to synchronise the audio and MIDI parts.
In VST/audiounit land, I am used to MIDI events that have a timestamp indicating their offset in samples from the start of the audio block.
rtmidi provides a delta time in seconds since the previous event, but I am not sure how I should grab those events and how I can work out their time in relation to the current sample in the audio thread.
How do plugin hosts do this?
I can understand how events can be sample accurate on playback, but it's not clear how they could be sample accurate when using realtime input.
rtaudio gives me a callback function. I will run at a low block size (32 samples). I guess I will pass a pointer to an rtmidi instance as the userdata part of the callback and then call midiin->getMessage( &message ); inside the audio callback, but I am not sure if this is thread-sensible.
Many thanks for any tips you can give me
In your case, you don't need to worry about it. Your program should send the MIDI events to the plugin with a timestamp of zero as soon as they arrive. I think you have perhaps misunderstood the idea behind what it means to be "sample accurate".
As #Brad noted in his comment to your question, MIDI is indeed very slow. But that's only part of the problem... when you are working in a block-based environment, incoming MIDI events cannot be processed by the plugin until the start of a block. When computers were slower and block sizes of 512 (or god forbid, >1024) were common, this introduced a non-trivial amount of latency which results in the arrangement not sounding as "tight". Therefore sequencers came up with a clever way to get around this problem. Since the MIDI events are already known ahead of time, these events can be sent to the instrument one block early with an offset in sample frames. The plugin then receives these events at the start of the block, and knows not to start actually processing them until N samples have passed. This is what "sample accurate" means in sequencers.
However, if you are dealing with live input from a keyboard or some sort of other MIDI device, there is no way to "schedule" these events. In fact, by the time you receive them, the clock is already ticking! Therefore these events should just be sent to the plugin at the start of the very next block with an offset of 0. Sequencers such as Ableton Live, which allow a plugin to simultaneously receive both pre-sequenced and live events, simply send any live events with an offset of 0 frames.
Since you are using a very small block size, the worst-case scenario is a latency of .7ms, which isn't too bad at all. In the case of rtmidi, the timestamp does not represent an offset which you need to schedule around, but rather the time which the event was captured. But since you only intend to receive live events (you aren't writing a sequencer, are you?), you can simply pass any incoming MIDI to the plugin right away.