I have an audio streaming application that uses requests to download the audio file and then played using Gstreamer.
I want to trim the first few seconds of all the audio files that i have. I could use ffmpeg to trim but that would waste cpu resources on my embedded platform and also waste network bandwidth
(The number of songs are around 1000, and they get downloaded continously, so it does make a difference)
I have tried downloading partial file using the range header in requests but that doesn't work. I can't play the file.
Can someone please tell me how i can make this work?
The audio files are generally .m4a / .webm but they are extracted from youtube so can't say for sure.
This is an uneasy task.. there is no clean way how to do it..
you can probably use the valve element set it to drop by default..
and then put some timer which sets the drop to false..
not sure how this will work, you need to try.
Here are some hints:
Related
The common situation when the integrity of an MP3 file is not correct, is when the file has been partially uploaded to the server. In this case, the indicated audio duration doesn't correspond to what is really in the MP3 file: we can hear the beginning, but at some point the playing stops and the indicated duration of the audio player is broken.
I tried with libraries like node-ffprobe, but it seems they just read metadata, without making comparison with real audio data in the file. Is there a way to detect efficiently a corrupted or incomplete MP3 file from node.js?
Note: the client uploading MP3 files is a hardware (an audio recorder), uploading files on a FTP server. Not a browser. So I'm not able to upload potentially more useful data from the client.
MP3 files don't normally have a duration. They're just a series of MPEG frames. Sometimes, there is an ID3 tag indicating duration, but not always.
Players can determine duration by choosing one of a few methods:
Decode the entire audio file.This is the slowest method, but if you're going to decode the file anyway, you might as well go this route as it gives you an exact duration.
Read the whole file, skimming through frame headers.You'll have to read the whole file from disk, but you won't have to decode it. Can be slow if I/O is slow, but gives you an exact duration.
Read the first frame's bitrate and estimate duration by file size.Definitely the fastest method, and the one most commonly used by players. Duration is an estimate only, and is reasonably accurate for CBR, but can be wildly inaccurate for VBR.
What I'm getting at is that these files might not actually be broken. They might just be VBR files that your player doesn't know the duration of.
If you're convinced they are broken (such as stopping in the middle of content), then you'll have to figure out how you want to handle it. There are probably only a couple ways to determine this:
Ideally, there's an ID3 tag indicating duration, and you can decode the whole file and determine its real duration to compare.
Usually, that ID3 tag won't exist, so you'll have to check to see if the last frame is complete or not.
Beyond that, you don't really have a good way of knowing if the stream is incomplete, since there is no outer container that actually specifies number of frames to expect.
The expression for calculating the filesize of an mp3 based on duration and encoding (from this answer) is quite simple:
x = length of song in seconds
y = bitrate in kilobits per second
(x * y) / 1024 = filesize (MB)
There is also a javascript implementation for the Web Audio API in another answer on that same question. Perhaps that would be useful in your Node implementation.
mp3diags is some older open source software for fixing mp3s and which was great for batch processing stuff like this. The source is c++ and still available if you're feeling nosy and want to see how some of these features are implemented.
Worth a look since it has some features that might be be useful in your context:
What is MP3 Diags and what does it do?
low quality audio
missing VBR header
missing normalization data
Correcting files that show incorrect song duration
Correcting files in which the player cannot seek correctly
I'm converting an ESP32 project to a Raspberry Pi zero. One of the project behaviors is to play back sound effects based on specific events or triggers. I prefer to use MP3 format so I can store information about the contents of the file in the ID3TAGs to make the files themselves easier to manage. (there are a lot of them!)
I can find examples of using any number of libraries to play mp3s in python, and I found an example of selecting a device using 'sounddevice' but it seems to want numpy arrays to play sound data.
I'm wondering what the easiest and quickest way is to play mp3 files (or should I go to some other file format with a data stub file for each to do my file management?).
Since these behaviors are played as responses, they need to at least start playback quickly (i.e. not wait for a format conversion to take place). And in some cases, other behaviors (such as voice recognition triggers) are already going to add to potential latency on the device in it's total response time.
EDIT: additional info
quickest means processor speed (pi zeros slow down quick under heavy load)
These are real time responses so any 'lag' converting defeats the purpose of the playback.
Also, the device from seeed is configured as an alsa (asound) device
I have 2 mp3 files that are nearly identical. The first file refused to stream down and play in my Appcelerator iPhone app that I am developing:
http://www.zerogravpro.com/temp/bad.mp3 (you'll find you can play this just fine in your browser, or download it and it plays fine)
This is 100% replicatable; it's not sporadic at all. The actual behavior is that the file begins to play in the iphone mediaPlayer for just a split second, then stops with some kind of "unknown" error. So then I took that file, opened it in audacity, removed the first split-second of silence from the beginning of the clip, and re-generated the mp3:
http://www.zerogravpro.com/temp/good.mp3
And this one works perfectly in the iphone app! 100% success each and every time. I have many mp3 files that are similar to bad.mp3 in that they play fine in any audio device, but error out when streaming/playing in iphone's media player. Audacity fixed it somehow and I need to know how/why, so that I can automate the fix in my hundreds of other mp3 files. I'd love to not have to open hundreds of files in Audacity and re-save. There must be some way to automate these fixes. How did Audacity fix the file? What did it do? I can only think of 2 possibilities:
The existence of a split second of silence at the beginning of the clip chokes iphone
Audacity fixes something non-obvious in the mp3
Experts: Any idea what the difference is between these 2 files, and how I could automatically turn "bad" mp3s into good ones, from some command-line tool or something? Thanks all.
I discovered that the only difference between the files that actually matter is the size. iPhone apps (at least in the simulator), when using the audio streaming library, choke on any file under 40Kb. So you have to use the standard sound library for the smaller files.
Is it possible to download only the last 30 seconds of an mp3? Or is it necessary to download the whole thing and crop it after the fact? I would be downloading via http, i.e. I have the URL of the file but that's it.
No, it is not possible... at least not without knowing some more information first.
The real problem here is determining at what byte offset the last 30 seconds is. This is a product of knowing:
Sample Rate
Bit Depth (per sample)
# of Channels
CBR or VBR
Bit Rate
Even then, you're not going to get that with a VBR MP3 file, and even with CBR, who knows how big the ID3 and other crap at the beginning of the file is. Even if you know all of that, there is still some variability, as you have the problem of the bit reservoir.
The only way to know would be to download the whole file and use a tool such as FFMPEG to find out the right offset. Then if you want to play it, you'll want to add the appropriate headers, and make sure you are trimming on an eligible frame, or fix the bit reservoir yourself.
Now, if this could all be figured out server-side ahead of time, then yes, you could request the offset from the server, and then download from there. As for how to download it, your question is very incomplete and didn't mention what protocol you were using, so I cannot help you there.
Basically I'm trying to replicate YouTube's ability to begin video playback from any part of hosted movie. So if you have a 60 minute video, a user could skip straight to the 30 minute mark without streaming the first 30 minutes of video. Does anyone have an idea how YouTube accomplishes this?
Well the player opens the HTTP resource like normal. When you hit the seek bar, the player requests a different portion of the file.
It passes a header like this:
RANGE: bytes-unit = 10001\n\n
and the server serves the resource from that byte range. Depending on the codec it will need to read until it gets to a sync frame to begin playback
Video is a series of frames, played at a frame rate. That said, there are some rules about the order of what frames can be decoded.
Essentially, you have reference frames (called I-Frames) and you have modification frames (class P-Frames and B-Frames)... It is generally true that a properly configured decoder will be able to join a stream on any I-Frame (that is, start decoding), but not on P and B frames... So, when the user drags the slider, you're going to need to find the closest I frame and decode that...
This may of course be hidden under the hood of Flash for you, but that is what it will be doing...
I don't know how YouTube does it, but if you're looking to replicate the functionality, check out Annodex. It's an open standard that is based on Ogg Theora, but with an extra XML metadata stream.
Annodex allows you to have links to named sections within the video or temporal URIs to specific times in the video. Using libannodex, the server can seek to the relevant part of the video and start serving it from there.
If I were to guess, it would be some sort of selective data retrieval, like the Range header in HTTP. that might even be what they use. You can find more about it here.