I want a way to split an audio file into multiple files at the silent periods. I searched for something that can detect the audio level (for example, in dB), but I found either nothing or tools that no longer exist.
Edit:
I will use MP3 as the audio format.
I'm not sending anything from a server to a client. I just process a local file into multiple files and then do something else with them.
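A minimal sketch of one way to do this in Python with pydub (which shells out to ffmpeg); the -40 dBFS threshold and 500 ms minimum silence length below are assumptions to tune for your material:

```python
# Split an mp3 wherever the level stays below a threshold for long enough.
# Assumes pydub is installed and ffmpeg is on the PATH; the threshold and
# minimum silence length are guesses to adjust for your audio.
from pydub import AudioSegment
from pydub.silence import split_on_silence

audio = AudioSegment.from_mp3("input.mp3")
chunks = split_on_silence(
    audio,
    min_silence_len=500,   # silence must last at least 500 ms
    silence_thresh=-40,    # anything quieter than -40 dBFS counts as silence
    keep_silence=200,      # keep 200 ms of padding on each chunk
)
for i, chunk in enumerate(chunks):
    chunk.export(f"part_{i:03d}.mp3", format="mp3")
```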
I'm creating an app that needs to take the user's voice and convert it to text, but the audio file seems to have something wrong with its length after its creation.
Here is how I'm gathering the audio and converting the data to a Blob. I'm just using the getUserMedia method to record the audio and convert it to a Blob when it's stopped.
This is the beginning of the function I pass the Blob object to. I convert the Blob to a buffer and write that buffer to a file in the root directory of my project. But once the file has been written and I go to listen to it, the length of the audio shows as 435:13:24 no matter how long the original audio was. Even though the length is that long, the file sounds exactly like it should and ends at the correct time.
Here's a picture of what the file looks like when viewed. [image omitted]
Although this may not seem like a big deal since listening to the file provides the correct audio, I'm passing the file to an API that converts it to text, and it almost always returns either the wrong transcription or an error saying the file contains nothing. I've tried different ways of writing the Blob data to the file, and tried cutting off the excess length, but nothing has worked.
This is my first post on Stack Overflow since I ran out of options to fix this, so I'm sorry if the question is kind of vague or I formatted it incorrectly in some way.
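One workaround often suggested for this symptom (a MediaRecorder recording whose container has no duration header) is to re-mux the file without re-encoding, for example with ffmpeg. A minimal sketch in Python, assuming the recording is WebM and ffmpeg is installed; the file names are placeholders:

```python
# Sketch of a re-mux step: "-c copy" rewrites the container (and therefore
# its duration metadata) without touching the audio itself.
import subprocess

def fix_duration(src: str, dst: str) -> None:
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c", "copy", dst], check=True)

fix_duration("recording.webm", "recording_fixed.webm")
```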
I'm converting an ESP32 project to a Raspberry Pi Zero. One of the project's behaviors is to play back sound effects based on specific events or triggers. I prefer to use the MP3 format so I can store information about the contents of each file in its ID3 tags, which makes the files themselves easier to manage. (There are a lot of them!)
I can find examples of using any number of libraries to play MP3s in Python, and I found an example of selecting a device using 'sounddevice', but it seems to want NumPy arrays as sound data.
I'm wondering what the easiest and quickest way is to play MP3 files (or should I move to some other file format with a data stub file alongside each one for file management?).
Since these behaviors are played as responses, they need to at least start playback quickly (i.e., not wait for a format conversion to take place). And in some cases, other behaviors (such as voice-recognition triggers) already add potential latency to the device's total response time.
EDIT: additional info
'Quickest' refers to processor load (Pi Zeros slow down quickly under heavy load).
These are real-time responses, so any lag from conversion defeats the purpose of the playback.
Also, the sound device from Seeed is configured as an ALSA (asound) device.
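One low-overhead option is to hand playback to mpg123 (a small native MP3 decoder with ALSA output) from Python. A sketch; the hw:1,0 device name for the Seeed card is an assumption, so check `aplay -l` first:

```python
# Start playback without blocking: Popen returns as soon as mpg123 launches,
# so the trigger handler can move on immediately. -q suppresses output and
# -a selects the ALSA device (the "hw:1,0" name here is a guess).
import subprocess

def play(path: str) -> subprocess.Popen:
    return subprocess.Popen(["mpg123", "-q", "-a", "hw:1,0", path])

proc = play("effects/alert.mp3")
# ...do other work; call proc.wait() only if you need to block until it ends.
```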
I have an audio streaming application that uses requests to download the audio file, which is then played using GStreamer.
I want to trim the first few seconds of all the audio files that I have. I could use ffmpeg to trim them, but that would waste CPU resources on my embedded platform and also waste network bandwidth.
(There are around 1000 songs, and they get downloaded continuously, so it does make a difference.)
I have tried downloading a partial file using the Range header in requests, but that doesn't work; I can't play the file.
Can someone please tell me how I can make this work?
The audio files are generally .m4a / .webm, but they are extracted from YouTube, so I can't say for sure.
This is a tricky task; there is no clean way to do it.
You can probably use the valve element and set it to drop by default,
then add a timer which sets drop to false.
I'm not sure how this will work; you'll need to try.
Here are some hints:
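A rough, untested sketch of that valve idea in Python with GStreamer; the 3-second timer and the file URI are placeholders:

```python
# Untested sketch: start with the valve dropping everything, then open it
# after a timer fires, so roughly the first few seconds never reach the sink.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)
pipeline = Gst.parse_launch(
    "uridecodebin uri=file:///tmp/song.m4a ! audioconvert ! "
    "valve name=v drop=true ! autoaudiosink"
)
valve = pipeline.get_by_name("v")

def open_valve():
    valve.set_property("drop", False)  # start letting buffers through
    return False                       # run this timeout only once

pipeline.set_state(Gst.State.PLAYING)
GLib.timeout_add(3000, open_valve)     # placeholder: drop ~3 seconds

GLib.MainLoop().run()
```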
I have a system that creates audio files of automated fire dispatches as WAV files, and occasionally a file will record multiple calls separated by a multi-tone sequence. Here is a sample.
I have been searching for a way to have ffmpeg, or some other Linux CLI tool, recognise the tones and then cut the file into separate WAV files named [unixtimestamp of original]+duration seconds.wav
Does anyone have any ideas?
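Not a full answer, but as a starting point, a rough Python sketch of the detection side: scan the WAV in short windows and flag the times where the spectral peak sits near one of the tone frequencies. The 850/1250 Hz values are made-up placeholders for the real dispatch tones; the flagged times could then be handed to ffmpeg's -ss/-t options to do the actual cutting.

```python
# Rough idea: FFT each 100 ms window and record the times where the spectrum
# peaks near one of the known tone frequencies. TONES is a placeholder pair.
import numpy as np
from scipy.io import wavfile

TONES = (850.0, 1250.0)   # placeholder tone frequencies in Hz
TOLERANCE = 20.0          # Hz

rate, samples = wavfile.read("dispatch.wav")
samples = samples.astype(np.float64)
if samples.ndim > 1:                 # mix stereo down to mono
    samples = samples.mean(axis=1)

win = int(0.1 * rate)                # 100 ms analysis windows
freqs = np.fft.rfftfreq(win, 1.0 / rate)

hits = []
for start in range(0, len(samples) - win, win):
    spectrum = np.abs(np.fft.rfft(samples[start:start + win] * np.hanning(win)))
    peak = freqs[np.argmax(spectrum)]
    if any(abs(peak - t) < TOLERANCE for t in TONES):
        hits.append(start / rate)

print(hits)  # candidate cut points in seconds, to hand to ffmpeg -ss / -t
```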
I want to make a live stream. And, I want to use HLS.
I understand that an HLS live stream is just a main playlist file with the '.m3u8' extension that lists all the files to be played.
But, for a live stream, since all the files are not readily available, they are added as they come in.
I want to use S3 for now to host these files and the playlist file.
Now, I want to update the playlist file in S3. But it's actually going to replace the existing playlist file instead of just updating it (according to this answer).
So, I'm assuming that there will be no dead-time during the file replacement. If there is dead-time, how do I overcome it? Is this the way to do it, or is there some better way to do it?
I'm using a NodeJS server, just FYI.
*dead-time: time when there is no file.
I want to make a live stream. And, I want to use HLS.
Why HLS? Why not DASH? DASH is also segmented and implemented almost exactly like HLS, but has more flexibility as far as codec choice and whatnot. Either is fine, but if you're starting from scratch today, I recommend DASH and the DASH.js reference player, which uses Media Source Extensions.
I understand that an HLS live stream is just a main playlist file with the '.m3u8' extension that lists all the files to be played.
Correct.
But, for a live stream, since all the files are not readily available, they are added as they come in.
Correct.
Now, I want to update the playlist file in S3. But it's actually going to replace the existing playlist file instead of just updating it
Yes, and as the other answer noted, there's no difference. The playlist file will be overwritten with the new full copy. The S3 API doesn't allow appending to a file, unless you use a multipart upload, which really isn't the same thing. In any case, your playlist file for a live stream isn't going to contain each and every segment anyway. Usually you only keep the last handful of segments in the playlist, but it's up to you to decide how far back to go.
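For illustration, a live playlist with a three-segment window might look like this (segment names and durations are made up); each update re-uploads the whole file with EXT-X-MEDIA-SEQUENCE advanced as old segments fall off:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:6.0,
segment120.ts
#EXTINF:6.0,
segment121.ts
#EXTINF:6.0,
segment122.ts
```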
So, I'm assuming that there will be no dead-time during the file replacement.
S3 doesn't replace that object until the full new object is uploaded and stored. There will never be a case where a partial file is there. S3 isn't like a regular file system. Additionally, if a subsequent upload fails, the old object is still going to remain.
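To make that concrete, a sketch of the upload step, shown here with Python's boto3 for brevity (the AWS SDK for Node.js makes the same PutObject call; the bucket and key names are assumptions):

```python
# Each playlist update is a full PUT; S3 swaps in the new object atomically
# once the upload completes, so readers never see a partial file.
import boto3

s3 = boto3.client("s3")

def publish_playlist(body: str) -> None:
    s3.put_object(
        Bucket="my-stream-bucket",               # assumed bucket name
        Key="live/playlist.m3u8",                # assumed key
        Body=body.encode("utf-8"),
        ContentType="application/vnd.apple.mpegurl",
        CacheControl="max-age=1",                # keep players re-fetching
    )
```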
HLS and DASH players read the playlist and buffer a ton of data before beginning playback. (This is why they notoriously have high latency.) It will be several seconds before the new segment is uploaded and added to the playlist, so it's important that they already have data in the buffer to play. This is why you don't have to worry about any drop-out, unless there is a failure to upload in time.
I'm using a NodeJS server, just FYI.
Is that so? Sounds like you're using S3 to me... not sure what Node.js has to do with any of this.