I've been asked to find the actual runtime of a batch of files. Each of these files contains voice and silences (guided meditation type), and I need to find a way to measure the runtime of just the voice.
The manual way of doing this is opening a file, looking at the wave, identifying the silences and removing them so the final duration of the file is the "just voice" runtime. This can take me 3-4 minutes per file, and that's just too much for a batch of 1800 files.
So my question is: is there a way to automatically delete the silent parts? And if so, can it be scripted or automated in any way?
In my studio we work with Sound Forge and ProTools.
ProTools has this built in: select the region and choose Edit > Strip Silence.
SoX can also do this if you would rather set up some scripts without using ProTools (nice blog post).
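As a rough sketch of the SoX route, the Python below builds a `sox ... silence` command that drops leading silence and silent stretches throughout the file, then asks `soxi -D` for the duration of what is left. The function names, the `_voice.wav` suffix, and the default threshold (1%) and minimum duration (0.1 s) are assumptions for illustration; they would need tuning by ear on a few of the real recordings.

```python
import subprocess
from pathlib import Path


def strip_silence_cmd(src, dst, threshold="1%", min_duration="0.1"):
    """Build a SoX command that removes silence from the start and from mid-file."""
    return [
        "sox", str(src), str(dst),
        "silence",
        # above-periods = 1: trim leading silence until audio rises above threshold
        "1", min_duration, threshold,
        # below-periods = -1: negative value restarts the effect, so silent
        # stretches throughout the file are removed, not just the first one
        "-1", min_duration, threshold,
    ]


def voice_runtime_seconds(src, workdir):
    """Strip silence into workdir, then return the remaining duration in seconds."""
    dst = Path(workdir) / (Path(src).stem + "_voice.wav")  # hypothetical naming
    subprocess.run(strip_silence_cmd(src, dst), check=True)
    # soxi -D prints the duration in seconds as a single number
    result = subprocess.run(["soxi", "-D", str(dst)],
                            check=True, capture_output=True, text=True)
    return float(result.stdout.strip())
```

Looping `voice_runtime_seconds` over the 1800 files and writing the results to a spreadsheet would replace the manual pass; spot-checking a handful against the manual method is a sensible sanity check first.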
Related
I have a flow where iOS app users will record a large video file and upload it to our server. After the fact, the user might want to extract certain portions of that larger video based on specific time stamps and generate a highlight reel that can be viewed and shared locally back on the iOS device.
As a FE developer I don't really have much experience with where to even start here. Our BE will be built in NodeJS. It seems to me that this should be a relatively straightforward problem to solve, but I don't know.
Are there APIs that make movie manipulation easy? Can I easily extract a clip based on a start and stop time and save that as a separate file? Are those costly tasks? Or not too bad?
I'm guessing that the response to this call would be a list of a series of file names that have been generated as a result of these clips being generated, that the iOS app could then pull down and load.
It's not quite as straightforward as it might seem as video files are quite structured with header information and indexing into the individual video and audio tracks and frames. Any splitting up or cropping needs to allow for this and also create new files with the correct headers and indexing etc.
Fortunately, there are indeed libraries that you can use to do this type of thing, one of the most powerful being ffmpeg.
There are projects which allow the ffmpeg command line tool to be used programmatically. The advantage of this approach is that you get to leverage the vast community knowledge base for the ffmpeg command line.
One of the popular ones for nodejs is:
https://github.com/damianociarla/node-ffmpeg
You can then look at the ffmpeg documentation or community answers to find the particular functionality you need, for example to cut a video at a start and end time as you asked:
https://stackoverflow.com/a/42827058/334402
https://superuser.com/a/704118
The general idea is quite simple and will be of the format:
ffmpeg -i yourInputVideo.mp4 -ss 01:30:00 -to 02:30:00 -c copy yourNewOutputVideo.mp4
It's worth taking a look at the seeking info in the ffmpeg online documentation (https://ffmpeg.org/ffmpeg.html) to help understand the examples, especially the second one above:
-ss position (input/output)
When used as an input option (before -i), seeks in this input file to position. Note that in most formats it is not possible to seek exactly, so ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved.
When used as an output option (before an output url), decodes but discards input until the timestamps reach position.
position must be a time duration specification, see the Time duration section in the ffmpeg-utils(1) manual.
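To make the highlight-reel flow concrete, here is a minimal sketch (in Python rather than Node, purely for illustration) that turns a list of start/end timestamps into one ffmpeg stream-copy command per clip and returns the generated file names. The function names and the `highlight_NN.mp4` naming scheme are assumptions, not part of any library. Because it uses `-c copy`, cuts land on the nearest seek points, as the documentation quoted above explains.

```python
import subprocess


def clip_command(source, start, end, output):
    """Build the ffmpeg stream-copy command for one clip (times as HH:MM:SS)."""
    return ["ffmpeg", "-i", source, "-ss", start, "-to", end, "-c", "copy", output]


def plan_highlights(source, ranges, prefix="highlight"):
    """Return (output_name, command) pairs, one per (start, end) range."""
    plan = []
    for index, (start, end) in enumerate(ranges, start=1):
        output = f"{prefix}_{index:02d}.mp4"  # hypothetical naming scheme
        plan.append((output, clip_command(source, start, end, output)))
    return plan


def render_highlights(source, ranges):
    """Run every planned command and return the generated file names."""
    names = []
    for output, command in plan_highlights(source, ranges):
        subprocess.run(command, check=True)  # raises if ffmpeg fails
        names.append(output)
    return names
```

The list returned by `render_highlights` is the kind of response the question anticipates: file names the app can then download. Stream copy avoids re-encoding, so each clip is cheap compared with a full transcode.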
I have a folder full of audio samples in MP3 format and I want to convert all of them to OGG. To save time, I want to convert them all at once. What method do you suggest?
Using an audio file converter utility may be the best option for you. One very good option is NCH Switch. It's widely reviewed, and you get 30 days free use with no limitations. They allow you to batch convert, so you can add the whole folder, specify rename options (if you wish) and many other functions. It is quite simple and intuitive to use.
Another free and open source option is fre:ac.
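If a scripted route is preferred over a GUI converter, a batch conversion can also be sketched around the ffmpeg command line. This is a different technique from the tools named above, and it assumes an ffmpeg build that includes the `libvorbis` encoder; the function names and the default quality value of 4 are illustrative choices.

```python
import subprocess
from pathlib import Path


def ogg_command(mp3_path, quality="4"):
    """Build an ffmpeg command that encodes one MP3 to Ogg Vorbis next to it."""
    mp3 = Path(mp3_path)
    return [
        "ffmpeg", "-i", str(mp3),
        "-c:a", "libvorbis",   # assumes ffmpeg was built with libvorbis
        "-q:a", quality,       # variable-bitrate quality level
        str(mp3.with_suffix(".ogg")),
    ]


def convert_folder(folder):
    """Convert every .mp3 in folder, stopping at the first failure."""
    for mp3 in sorted(Path(folder).glob("*.mp3")):
        subprocess.run(ogg_command(mp3), check=True)
```

Calling `convert_folder("samples")` would leave each `.ogg` beside its source `.mp3`.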
I have a batch of audio files that record people's voices. But some of these files contain only noise or microphone bursts. I want to detect those files and skip them while my program processes the batch.
I'm not sure whether ffmpeg can do this. If it can, could you point me to a description of the method? If not, do you know of other software that can? Or do you have any other solution or suggestion for this problem?
Thank you.
I would approach this by looking at peak values and duration. SoX is a command-line program that can be driven from shell scripts to batch-analyze this. There is a large user base and forum as well.
Here is a link to a forum topic discussing its use for batch-discovering peak values and writing the information to a .csv file.
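A minimal sketch of that idea: run `sox FILE -n stat`, parse the report it prints, and write peak, RMS and duration to a CSV with a keep/skip flag. The `looks_like_speech` filter and its thresholds (at least 1 second long, RMS amplitude of at least 0.01) are hypothetical starting points, not established values, and would need calibrating on known good and bad files.

```python
import csv
import subprocess
import sys


def parse_sox_stat(text):
    """Turn the 'Name:  value' lines of a `sox FILE -n stat` report into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            # collapse runs of spaces, e.g. "RMS     amplitude" -> "RMS amplitude"
            stats[" ".join(name.split())] = float(value)
        except ValueError:
            pass  # ignore lines whose value is not a plain number
    return stats


def looks_like_speech(stats, min_seconds=1.0, min_rms=0.01):
    """Hypothetical filter: long enough and loud enough on average."""
    return (stats.get("Length (seconds)", 0.0) >= min_seconds
            and stats.get("RMS amplitude", 0.0) >= min_rms)


def write_report(paths, out=sys.stdout):
    """Analyze each file with SoX and write one CSV row per file."""
    writer = csv.writer(out)
    writer.writerow(["file", "seconds", "peak", "rms", "keep"])
    for path in paths:
        # sox prints the stat report on stderr, not stdout
        result = subprocess.run(["sox", path, "-n", "stat"],
                                check=True, capture_output=True, text=True)
        stats = parse_sox_stat(result.stderr)
        writer.writerow([path, stats.get("Length (seconds)"),
                         stats.get("Maximum amplitude"),
                         stats.get("RMS amplitude"), looks_like_speech(stats)])
```

A microphone burst tends to show a high peak but a low RMS or a very short duration, which is why the filter looks at average level and length rather than at peak alone.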
I want to take a classical music piece as an .mp3 file (or another audio format if necessary) and the same piece as a .midi file. Then I want to synchronize them so that only the MIDI file changes: the timing of its beats should match the .mp3. So if I played both at the same time, they would play the same notes in sync.
How can I do so?
(I have cubase if the answer might be there...)
It's a tough task because general beat-tracking (follow tempo changes) hasn't yet been figured out.
There is at least one tool that does work for matching an audio file to a MIDI file, assuming the audio follows almost exactly the same score as the MIDI file. I can't remember its name, though, and I have never used it. The place to ask is the Music Information Retrieval community of scientists:
http://listes.ircam.fr/wws/info/music-ir
For manual matching, you can use modern DAWs like Logic, Pro Tools, etc. They provide reasonably nice tools for building a detailed tempo map of the audio file, after which the MIDI file lines right up with it, but it's a tedious task. You'll likely need tempo changes more often than every measure to get a nice alignment; how often will depend on the style.
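To show what a tempo map amounts to numerically, here is a small sketch that converts a list of beat times (for example, tapped or marked by hand against the audio) into one tempo entry per beat. It assumes each beat is one quarter note, and it uses the standard MIDI convention of expressing tempo as microseconds per quarter note; the function name and tuple layout are illustrative.

```python
def tempo_map(beat_times):
    """Return (start_seconds, bpm, midi_tempo) for each interval between beats.

    midi_tempo is microseconds per quarter note, assuming one beat per quarter.
    """
    entries = []
    for start, end in zip(beat_times, beat_times[1:]):
        interval = end - start
        if interval <= 0:
            raise ValueError("beat times must be strictly increasing")
        entries.append((start, 60.0 / interval, round(interval * 1_000_000)))
    return entries
```

For instance, beats at 0.0, 0.5 and 1.0 seconds give two entries at 120 BPM, each with a MIDI tempo of 500000. Writing those values into the MIDI file as tempo events is what the DAW does when it builds the map.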
You could use tools that already exist. For example, if you know the tempo of the mp3, then you could use this page to change the tempo on the midi file.
I'm working on a tool to edit .srt (subtitle) files within the browser (the tool is to be used for linguistic annotation). In desktop tools that are used for similar purposes, the user has access to the waveform, and can "see" where silences are in the signal, and thus select a particular phrase for transcription.
Such a tool might be buildable in-browser down the road (using Web Workers and Canvas, say), but for now, it's not feasible to do the sort of signal processing it would take to find those silences.
So, I'm thinking about the next-best approach: what free tool could I use to produce a list of timestamps of where silences (below some given threshold) start and stop? If I produce such a list offline and upload it with the audio file, then I can at least make it possible to navigate through the "phrases" (defined as periods of non-silence). I think that would still be a win in productivity for doing the transcription.
Audacity can sort of do this, but AFAICT only if you install Nyquist, which seems to have some patent issues.
Are there any alternatives?
It would be nice if the tool could handle as many as possible of ogg, mp3, and wav files.
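One possible approach, sketched under assumptions: ffmpeg's `silencedetect` audio filter logs `silence_start` and `silence_end` values, and ffmpeg reads ogg, mp3 and wav alike. The Python below runs the filter and pairs those values into a list of (start, end) timestamps. The function names and the defaults (-30 dB noise floor, 0.5 s minimum silence) are illustrative and would need tuning per recording.

```python
import re
import subprocess

_START = re.compile(r"silence_start:\s*(\S+)")
_END = re.compile(r"silence_end:\s*(\S+)")


def parse_silences(log):
    """Pair silence_start / silence_end values from a silencedetect log."""
    silences, start = [], None
    for line in log.splitlines():
        match = _START.search(line)
        if match:
            start = float(match.group(1))
            continue
        match = _END.search(line)
        if match and start is not None:
            silences.append((start, float(match.group(1))))
            start = None
    return silences  # a start with no matching end is dropped


def detect_silences(path, noise="-30dB", min_duration=0.5):
    """Run ffmpeg's silencedetect filter on path and return (start, end) pairs."""
    cmd = ["ffmpeg", "-hide_banner", "-i", path,
           "-af", f"silencedetect=noise={noise}:d={min_duration}",
           "-f", "null", "-"]  # decode only, write no output file
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_silences(result.stderr)  # ffmpeg logs to stderr
```

The gaps between consecutive silences are then the "phrases"; dumping the list as JSON next to the audio file would give the browser tool something to navigate by.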