I have a flow where iOS app users will record a large video file and upload it to our server. After the fact, the user might want to extract certain portions of that larger video based on specific time stamps and generate a highlight reel that can be viewed and shared locally back on the iOS device.
As a FE developer I don't really have much experience with where to even start here. Our BE will be built in NodeJS. It seems to me that this should be a relatively straightforward problem to solve, but I don't know.
Are there APIs that make movie manipulation easy? Can I easily extract a clip based on a start and stop time and save that as a separate file? Are those costly tasks? Or not too bad?
I'm guessing that the response to this call would be a list of file names for the generated clips, which the iOS app could then pull down and load.

It's not quite as straightforward as it might seem as video files are quite structured with header information and indexing into the individual video and audio tracks and frames. Any splitting up or cropping needs to allow for this and also create new files with the correct headers and indexing etc.
Fortunately, there are indeed libraries that you can use to do this type of thing, one of the most powerful being ffmpeg.
There are projects which allow the ffmpeg command line tool to be used programmatically - the advantage of this approach is that you get to leverage the vast community knowledge base for the ffmpeg command line.
One of the popular ones for nodejs is:
You can then look at the ffmpeg documentation or community answers to find the particular functionality you need - for example, to trim a video at a start and end time as you asked:
The general idea is quite simple and will be of the format:
ffmpeg -i yourInputVideo.mp4 -ss 01:30:00 -to 02:30:00 -c copy yourNewOutputVideo.mp4
It's worth taking a look at the seeking info in the ffmpeg online documentation to help understand the examples, especially the second one above:
-ss position (input/output)
When used as an input option (before -i), seeks in this input file to position. Note that in most formats it is not possible to seek exactly, so ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved.
When used as an output option (before an output url), decodes but discards input until the timestamps reach position.
position must be a time duration specification, see (ffmpeg-utils)the Time duration section in the ffmpeg-utils(1) manual.
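For the highlight-reel case in the question, here is a minimal sketch of how a server could turn a list of timestamps into one ffmpeg command per clip. It is written in Python purely for illustration (the question's backend is NodeJS, where the same argument lists would apply), and the function names, the clip_001.mp4 naming scheme and the example timestamps are all made up for this example. It only builds the argument lists; running them is left to the caller.

```python
import shlex


def build_trim_command(source, start, end, output):
    """Return an ffmpeg argument list that stream-copies [start, end] of source."""
    return [
        "ffmpeg",
        "-i", source,
        "-ss", start,    # used as an output option here (it comes after -i)
        "-to", end,
        "-c", "copy",    # copy the streams instead of re-encoding
        output,
    ]


def build_highlight_commands(source, clips, prefix="clip"):
    """clips is a list of (start, end) timestamp strings; names are hypothetical."""
    commands = []
    for index, (start, end) in enumerate(clips, start=1):
        output = f"{prefix}_{index:03d}.mp4"
        commands.append(build_trim_command(source, start, end, output))
    return commands


if __name__ == "__main__":
    clips = [("00:01:30", "00:02:00"), ("00:10:00", "00:10:45")]
    for command in build_highlight_commands("yourInputVideo.mp4", clips):
        print(shlex.join(command))
```

Each list can be passed to subprocess.run(command, check=True), and the output names are what the API would hand back to the iOS app. Because of the stream copy, the cut points will probably land on nearby keyframes rather than on the exact timestamps; re-encoding instead of using -c copy trades speed for accuracy.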


I have a bunch of video clips from a webcam (duration is 5, 10, 60 seconds), and I'm looking for a way to detect "does this video clip have movement", to decide whether the file should be saved or discarded in a future processing phase.
I've looked into motion and OpenCV, but motion seems to only want to work on the raw video stream, and OpenCV seems to be way too advanced for my use.
My ideal solution would be a linux command-line tool that I can feed video files into, and get a simple "does/doesn't contain movement" answer back, so I can discard the irrelevant files. False positives (in a reasonable quantity) are perfectly acceptable for my use.
Does such a tool exist? Or any simple examples of doing this with other tools?
You can check dvr-scan, which is a simple cross-platform command line tool based on OpenCV.
To just list motion events in csv format (scan only):
dvr-scan -i some_video.mp4 -so
To extract motion in single video:
dvr-scan -i some_video.mp4 -o some_video_motion_only.avi
For more examples and various other parameters see:
I had the same problem and wrote the solution:
Should be fairly easy to use from command-line.
If you would like to post-process already-captured video then motion can be useful.
VLC allows you to stream or convert your media for use locally, on your private network, or on the Internet. So an already-captured video can be streamed over HTTP, RTSP, etc., and motion can handle it as a network camera.
How to Stream using VLC Media Player
If OpenCV is too advanced for you, maybe you should consider something easier, which is... SimpleCV (a wrapper for OpenCV): "This is computer vision made easy". There is even an example of motion detection using SimpleCV. Unfortunately I can't test it (because my OpenCV version isn't compatible with SimpleCV), but generally it looks fine (and isn't complicated): it just subtracts the previous frame from the current one and calculates the mean of the result. If this value is bigger than some threshold (which you will most likely have to adjust), then we can assume that there was some motion between those 2 frames. Note that setting the threshold to 0 is a really bad idea, because there is always some difference between 2 consecutive frames (changes in lighting, noise, etc.).
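The frame-differencing idea described above can be sketched without any vision library. This is not the SimpleCV example itself, just a plain-Python illustration: it assumes each frame has already been decoded into a flat list of grayscale pixel values (0-255), it takes the mean of the absolute difference, and the function names and the default threshold of 5.0 are arbitrary choices for the example.

```python
def mean_abs_difference(previous, current):
    """Mean absolute per-pixel difference between two equally sized frames."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of pixels")
    if not current:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(previous, current))
    return total / len(current)


def has_motion(frames, threshold=5.0):
    """Return True if any two consecutive frames differ by more than threshold.

    frames is an iterable of flat grayscale pixel lists. The threshold is a
    guess and needs tuning per camera; 0 would flag every clip.
    """
    previous = None
    for frame in frames:
        if previous is not None and mean_abs_difference(previous, frame) > threshold:
            return True
        previous = frame
    return False


if __name__ == "__main__":
    still = [[10] * 16, [11] * 16, [10] * 16]       # only sensor noise
    moving = [[10] * 16, [10] * 8 + [200] * 8]      # half the frame changes
    print(has_motion(still), has_motion(moving))
```

Comparing every frame is the simplest form; sampling one frame per second or so would likely be enough for a keep/discard decision and much cheaper.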

Apple gives an example of support for byte-range segments in m3u8 files for HLS
But I cannot figure out how to create such playlist for given .ts file.
Are there any tools for that?
There is -hls_flags as an ffmpeg option.
The following command generates a single ts file that is segmented using the byte-range feature (supported from HLS version 4) in the m3u8 index file.
$ ffmpeg -i sample.mp3 -hls_time 20 -hls_flags single_file out.m3u8
Looks like
ffprobe -show_frames media.ts -print_format json
gives enough information about frames to build such playlist, although some scripting will be required to construct it.
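As a rough idea of the scripting involved, the sketch below picks the video keyframes out of that JSON report. It assumes the report has a top-level "frames" list whose entries carry media_type, key_frame, pkt_pos and pkt_size fields; that matches the output I have seen, but field names can differ between ffprobe versions, so treat them as assumptions. The function name is made up for the example.

```python
import json


def keyframes_from_ffprobe(report):
    """Return offset/size pairs for video keyframes in a parsed ffprobe report.

    report is the dict obtained from json.loads() on the ffprobe output.
    The field names used here are assumed, not guaranteed for every version.
    """
    result = []
    for frame in report.get("frames", []):
        if frame.get("media_type") != "video":
            continue
        if int(frame.get("key_frame", 0)) != 1:
            continue
        result.append({
            "offset": int(frame["pkt_pos"]),   # byte position in the file
            "size": int(frame["pkt_size"]),    # size as reported by ffprobe
        })
    return result


if __name__ == "__main__":
    sample = '{"frames": [{"media_type": "video", "key_frame": 1, "pkt_pos": "564", "pkt_size": "1200"}]}'
    print(keyframes_from_ffprobe(json.loads(sample)))
```

Turning these offsets and sizes into playlist entries is a separate step, and the reported size is not necessarily the right byte-range length for a playlist.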
I'll update this answer with script if I succeed with that approach.
Here are a couple of useful links I've found so far:
Bash scripts for generating iframe playlists - needs a bit of optimization, as it calls ffprobe multiple times
iframe-playlist-generator - project on python that can be used to generate iframe playlists from usual ones
It is not exactly what I've searched initially, but I-Frame playlists are similar to byte-range ones and fit for my task even better, so I'm going to use these two projects as a reference/starting point to create something a bit more suitable for me.
The projects actually use different methods to find the size of an I-Frame: the bash script just uses what ffprobe shows in pkt_size, while the python project adds a bit of voodoo by calculating the size as the difference between packet positions and adding 188 to match the example playlists from Apple. 188 bytes is the size of an MPEG-TS packet, so that is probably related somehow, but I have not managed to understand how. This difference in size calculation causes different playlists to be generated, and one of them is probably incorrect in some way, but VLC actually plays both without any problems, so I'm going to stick to the simpler method until it is proven incorrect.
Update 2:
I've created a ruby module that can extract I-Frame information of given .ts file with ffprobe and build both I-Frame and usual byterange m3u8 playlist (as it was requested in question) based on that information.
I've found the simple method of creating I-Frame playlist I mentioned before to be incorrect, so I used the method from iframe-playlist-generator. The output is almost similar to the I-Frame playlist generated by mediafilesegmenter -output-single-file -file-base output-dir/ input.ts, mentioned by Duvrai, but sometimes there are some 188-byte size misses for random frames, I could not understand the pattern, so it is currently ignored.
You can use a standard segmenter such as Apple's mediafilesegmenter, check the lengths of the files, and then concatenate (with the cat program) them into a single file. From the file sizes you have all the information needed to specify the byte ranges in a playlist file.
Not as nice as just downloading a tool from the net, but it's not a very complicated algorithm.
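A minimal sketch of that algorithm, assuming you already have the duration and file size of each segment in playback order and have concatenated the segments into one file. The function name is made up; the tags are the standard ones for a version 4 byte-range playlist.

```python
import math


def build_byterange_playlist(media_name, segments):
    """Build an HLS byte-range playlist for one concatenated media file.

    segments is a list of (duration_seconds, size_bytes) tuples in playback
    order, where each size is the length of the original segment file.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    target = math.ceil(max(duration for duration, _ in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:4",                  # byte ranges need version 4
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    offset = 0
    for duration, size in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"#EXT-X-BYTERANGE:{size}@{offset}")  # length@start
        lines.append(media_name)
        offset += size                       # next segment starts where this ends
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_byterange_playlist("all.ts", [(10.0, 1880), (9.5, 3760)]))
```

The durations still have to come from somewhere, for example from the playlist the segmenter wrote alongside the individual files.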
Unified Streaming also offers a tool that can do this for you:
mp4split --package-hls output-single-file -o prog_index.m3u8 input.mp4
This is part of their commercial streaming package (they offer a free trial upon request). They also provide an Amazon AWS instance with hourly fees.

I would like to create a utility in either PHP or Perl to convert an audio file created by Nortel's Callpilot voice mail system into a wave file. The problem is that the format, which has the .vbk file extension, is unknown to virtually any audio player. To date, I have not found one that will play a .vbk file. I've looked at audio file conversion libraries in CPAN and tried many of them, but they don't recognize the file. I was not successful with PHP's audio format manipulation either. Nortel does provide a converter; however, it does not suit my needs. I would like to have this run via cron on a CentOS system. I don't know how to reverse engineer this format. There seem to be just scraps of info on this format on the web. This page indicates that it is "based on the H.232 format":
I know this is a very old thread, but I've recently been looking into converting Nortel's vbk format as well. Importing the vbk files into Audacity with the raw data option worked for me, using these settings: Encoding: U-Law, Byte order: little-endian, Channels: 1 Channel (Mono), Sample rate: 8000 Hz. I'm not sure whether they have multiple formats for their vbk files, but mine were from a BCM50 phone system.
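If those Audacity settings apply to your files too, the conversion can be scripted for cron. Here is a minimal sketch in Python (the question asked for PHP or Perl, so take it as an outline of the steps rather than a drop-in). It assumes the whole file is headerless 8-bit mu-law audio, mono, at 8000 Hz; if the format has a header, those bytes would be decoded as a short burst of noise at the start. The function names are made up, and the decoder is the standard G.711 mu-law expansion written out by hand.

```python
import io
import wave


def ulaw_byte_to_pcm16(value):
    """Expand one G.711 mu-law byte to a signed 16-bit PCM sample."""
    value = ~value & 0xFF                 # mu-law bytes are stored inverted
    sign = value & 0x80
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_wav_bytes(raw, sample_rate=8000):
    """Return a complete mono 16-bit WAV file for raw mu-law input bytes."""
    pcm = bytearray()
    for byte in raw:
        pcm += ulaw_byte_to_pcm16(byte).to_bytes(2, "little", signed=True)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)               # mono
        wav.setsampwidth(2)               # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))
    return buffer.getvalue()


if __name__ == "__main__":
    # 0xFF is mu-law silence; a real run would read the bytes of a .vbk file.
    print(len(ulaw_to_wav_bytes(bytes([0xFF] * 8000))))
```

For a real file, read it with open(path, "rb").read() and write the returned bytes to a .wav path; both paths are up to you.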
Well, this is the joy of closed proprietary systems. But there is a chance they could play nice. Try to contact Callpilot and see if they'll give you the format specs. It's worth a shot.
As for reverse engineering, you need to be able to generate known content. Like a constant tone at 60Hz for exactly 1 second. Then at 50Hz. Then at 10 seconds. Compare them. Isolate the data from the metadata. There is going to be compression involved, so try a handful of common compression schemes; some research into Nortel's practices will probably tell you more. If you can feed that into a player and get a tone back out, you're on your way.
There's probably more informed and structured ways to go about reverse engineering, but from my experience it's a lot of trial and error.

I'm working on a tool to edit .srt (subtitle) files within the browser (the tool is to be used for linguistic annotation). In desktop tools that are used for similar purposes, the user has access to the waveform, and can "see" where silences are in the signal, and thus select a particular phrase for transcription.
Such a tool might be buildable in-browser down the road (using Web Workers and Canvas, say), but for now, it's not feasible to do the sort of signal processing it would take to find those silences.
So, I'm thinking about the next-best approach: what free tool could I use to produce a list of timestamps of where silences (below some given threshold) start and stop? If I produce such a list offline and upload it with the audio file, then I can at least make it possible to navigate through the "phrases" (defined as periods of non-silence). I think that would still be a win in productivity for doing the transcription.
Audacity can sort of do this, but AFAICT, only if you install Nyquist, which seems to have some patent issues.
Are there any alternatives?
It would be nice if the tool could handle as many as possible of ogg, mp3, and wav files.
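For reference, the kind of offline list described above could come from a fairly small script. This is only a sketch under assumptions: the audio has already been decoded to a list of signed integer samples (decoding ogg or mp3 is not covered here), silence means the peak amplitude of a short window stays below the threshold, and the function name, the 20 ms window and the 0.3 s minimum length are arbitrary choices.

```python
def find_silences(samples, sample_rate, threshold, min_duration=0.3, window=0.02):
    """Return (start, end) pairs in seconds where the signal stays quiet.

    samples is a sequence of signed integer samples (mono). A window counts
    as quiet when its peak absolute amplitude is below threshold.
    """
    window_size = max(1, int(sample_rate * window))
    runs = []
    start = None
    for index in range(0, len(samples), window_size):
        chunk = samples[index:index + window_size]
        quiet = max(abs(s) for s in chunk) < threshold
        if quiet and start is None:
            start = index                      # a quiet run begins
        elif not quiet and start is not None:
            runs.append((start, index))        # the quiet run ends
            start = None
    if start is not None:
        runs.append((start, len(samples)))     # quiet until the end
    return [
        (begin / sample_rate, end / sample_rate)
        for begin, end in runs
        if (end - begin) / sample_rate >= min_duration
    ]


if __name__ == "__main__":
    signal = [1000] * 500 + [0] * 1000 + [1000] * 500
    print(find_silences(signal, sample_rate=1000, threshold=50))
```

The resulting pairs could be uploaded alongside the audio file as JSON, and the non-silent "phrases" are simply the gaps between them.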

I've got many, many mp3 files that I would like to merge into a single file. I've used the command line method
copy /b 1.mp3+2.mp3 3.mp3
but it's a pain when there's a lot of them and their namings are inconsistent. The time never seems to come out right either.
David's answer is correct that just concatenating the files will leave ID3 tags scattered inside (although this doesn't normally affect playback, so you can do "copy /b" or on UNIX "cat a.mp3 b.mp3 > combined.mp3" in a pinch).
However, mp3wrap isn't exactly the right tool to just combine multiple MP3s into one "clean" file. Rather than using ID3, it actually inserts its own custom data format in amongst the MP3 frames (the "wrap" part), which causes issues with playback, particularly on iTunes and iPods. Although the file will play back fine if you just let it run from start to finish (because players will skip these arbitrary non-MPEG bytes), the file duration and bitrate will be reported incorrectly, which breaks seeking. Also, mp3wrap will wipe out all your ID3 metadata, including cover art, and fail to update the VBR header with the correct file length.
mp3cat on its own will produce a good concatenated data file (so, better than mp3wrap), but it also strips ID3 tags and fails to update the VBR header with the correct length of the joined file.
Here's a good explanation of these issues and method (two actually) to combine MP3 files and produce a "clean" final result with original metadata intact -- it's command-line so works on Mac/Linux/BSD etc. It uses:
mp3cat to combine the MPEG data frames only into a continuous file, then
id3cp to copy all metadata over to the combined file, and finally
VBRFix to update the VBR header.
For a Windows GUI tool, take a look at Merge MP3 -- it takes care of everything. (VBRFix also comes in GUI form, but it doesn't do the joining.)
As Thomas Owens pointed out, simply concatenating the files will leave multiple ID3 headers scattered throughout the resulting concatenated file - so the time/bitrate info will be wildly wrong.
You're going to need to use a tool which can combine the audio data for you.
mp3wrap would be ideal for this - it's designed to join together MP3 files, without needing to decode + re-encode the data (which would result in a loss of audio quality) and will also deal with the ID3 tags intelligently.
The resulting file can also be split back into its component parts using the mp3splt tool - mp3wrap adds information to the ID3 comment to allow this.
Use ffmpeg or a similar tool to convert all of your MP3s into a consistent format, e.g.
ffmpeg -i originalA.mp3 -f mp3 -ab 128kb -ar 44100 -ac 2 intermediateA.mp3
ffmpeg -i originalB.mp3 -f mp3 -ab 128kb -ar 44100 -ac 2 intermediateB.mp3
Then, at runtime, concat your files together:
cat intermediateA.mp3 intermediateB.mp3 > output.mp3
Finally, run them through the tool MP3Val to fix any stream errors without forcing a full re-encode:
mp3val output.mp3 -f -nb
The time problem has to do with the ID3 headers of the MP3 files, which is something your method isn't taking into account as the entire file is copied.
Do you have a language of choice that you want to use or doesn't it matter? That will affect what libraries are available that support the operations you want.
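To make the ID3 point concrete, here is a minimal Python sketch of a join that drops the tags first. It relies on the standard tag layouts: an ID3v2 tag at the start with a 10-byte header whose last four bytes are a synchsafe size, and an ID3v1 tag as the final 128 bytes starting with "TAG". The function names are made up, and it does not touch any VBR header, so the reported duration of a VBR file can still be wrong afterwards.

```python
def strip_id3(data):
    """Return the MP3 bytes without a leading ID3v2 tag or trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4 synchsafe size bytes.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)   # 7 useful bits per byte
        start = 10 + size
        if data[5] & 0x10:                       # flag bit: 10-byte footer present
            start += 10
    # ID3v1 tag: the last 128 bytes, beginning with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def join_mp3_bytes(files):
    """Concatenate the audio portions of several MP3 byte strings."""
    return b"".join(strip_id3(data) for data in files)


if __name__ == "__main__":
    audio = b"\xff\xfb" + b"\x00" * 200
    tagged = b"ID3\x03\x00\x00\x00\x00\x00\x05" + b"x" * 5 + audio
    print(len(join_mp3_bytes([tagged, audio])))
```

This covers only the scattered-tags part of the problem; metadata for the combined file would have to be written back separately.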
MP3 files have headers you need to respect.
You could either use a library like Open Source Audio Library Project and write a tool around it.
Or you can use a tool that understands mp3 files like Audacity.
What I really wanted was a GUI to reorder them and output them as one file
Playlist Producer does exactly that, decoding and reencoding them into a combined MP3. It's designed for creating mix tapes or simple podcasts, but you might find it useful.
(Disclosure: I wrote the software, and I profit if you buy the Pro Edition. The Lite edition is a free version with a few limitations).
As David says, mp3wrap is the way to go. However, I found that it didn't fix the audio length header, so iTunes refused to play the whole file even though all the data was there. (I merged three 7-minute files, but it only saw up to the first 7 minutes.)
I dug up this blog post, which explains how to fix this and also how to copy the ID3 tags over from the original files (on its own, mp3wrap deletes your ID3 tags). Or to just copy the tags (using id3cp from id3lib), do:
id3cp original.mp3 new.mp3
I would use Winamp to do this. Create a playlist of the files you want to merge into one, select the Disk Writer output plugin, choose a filename and you're done. The file you get will be a correct MP3 file, and you can set the bitrate etc.
I'd not heard of mp3wrap before. Looks great. I'm guessing someone's made it into a gui as well somewhere. But, just to respond to the original post, I've written a gui that does the COPY /b method. So, under the covers, nothing new under the sun, but the program is all about making the process less painful if you have a lot of files to merge...AND you don't want to re-encode AND each set of files to merge are the same bitrate. If you have that (and you're on Windows), check out Mp3Merge at: and see if that's what you're looking for.
If you want something free with a simple user interface that makes a completely clean mp3 I recommend MP3 Joiner.
Strips ID3 data (both ID3v1 and ID3v2.x) and doesn't add its own (unlike mp3wrap)
Lossless joining (doesn't decode and re-encode the .mp3s). No codecs required.
Simple UI (see below)
Low memory usage (uses streams)
Very fast (compared to mp3wrap)
I wrote it :) - so you can request features and I'll add them.
Personally I would use something like mplayer with the audio pass-through option, e.g. -oac copy
Instead of using the command line to do
copy /b 1.mp3+2.mp3 3.mp3
you could instead use "The Rename" to rename all the MP3 fragments into a series of names that are in order based on some kind of counter. Then you could just use the same command line format but change it a little to:
copy /b *.mp3 output_name.mp3
That is assuming you ripped all of these fragment MP3's at the same time and they have the same audio settings. Worked great for me when I was converting an Audio book I had in .aa to a single .mp3. I had to burn all the .aa files to 9 CD's then rip all 9 CD's and then I was left with about 90 mp3's. Really a pain in the a55.
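If renaming everything first is too much hassle, the ordering can be handled in a script instead. Below is a minimal Python sketch of the same raw-concatenation approach that sorts names so that track2 comes before track10; the function names are made up, and like copy /b it does nothing about ID3 tags or length headers, so the same caveats about matching audio settings apply.

```python
import re
from pathlib import Path


def natural_key(name):
    """Sort key that compares digit runs as numbers (track2 before track10)."""
    parts = re.split(r"(\d+)", name.lower())
    # re.split with a capturing group alternates text, digits, text, ...
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


def concatenate_mp3s(folder, output_name):
    """Append every .mp3 in folder, in natural order, into output_name.

    Returns the source file names in the order they were written.
    """
    folder = Path(folder)
    sources = sorted(
        (path for path in folder.glob("*.mp3") if path.name != output_name),
        key=lambda path: natural_key(path.name),
    )
    with open(folder / output_name, "wb") as out:
        for source in sources:
            out.write(source.read_bytes())   # raw byte copy, no re-encoding
    return [path.name for path in sources]


if __name__ == "__main__":
    print(concatenate_mp3s(".", "output_name.mp3"))
```

The returned list is worth checking before trusting the result, since it shows exactly which order the fragments were joined in.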
