Converting Audio From Unknown Format - linux

I would like to create a utility in either PHP or Perl to convert an audio file created by Nortel's Callpilot voice mail system into a wave file. The problem is that the format, which has the .vbk file extension, is unknown to virtually any audio player. To date, I have not found one that will play a .vbk file. I've looked at audio file conversion libraries in CPAN and tried many of them, but they don't recognize the file. I was not successful with PHP's audio format manipulation either. Nortel does provide a converter; however, it does not suit my needs. I would like to have this run via cron on a CentOS system. I don't know how to reverse engineer this format. There seem to be just scraps of info on this format on the web. This page indicates that it is "based on the H.232 format":
https://www.odesk.com/o/jobs/job/Reverse-Engineer-Nortel-VBK-Audio-Format_~~f501f11679f3f6bb/

I know this is a very old thread, but I've recently been looking into converting Nortel's vbk format as well. I was able to import the vbk files into Audacity using the raw data import option with these settings: Encoding: U-Law, Byte order: little-endian, Channels: 1 Channel (Mono), Sample rate: 8000 Hz. I'm not sure whether they have multiple formats for their vbk files, but mine were from a BCM50 phone system.
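Since the question asks for something that can run from cron, here is a minimal Python sketch of the same idea as the Audacity settings above. It assumes the whole .vbk file is headerless 8000 Hz mono U-Law data, as it apparently was on the BCM50; if a file starts with a header, those bytes would simply come out as a short burst of noise. The function names are made up for illustration, and the decoder is the standard G.711 U-Law expansion written by hand.

```python
import io
import struct
import wave


def decode_ulaw(data: bytes) -> bytes:
    """Expand G.711 U-Law bytes to 16-bit little-endian signed PCM."""
    samples = []
    for byte in data:
        u = ~byte & 0xFF                      # U-Law bytes are stored bit-inverted
        exponent = (u >> 4) & 0x07            # 3-bit segment number
        mantissa = u & 0x0F                   # 4-bit step within the segment
        magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
        samples.append(-magnitude if u & 0x80 else magnitude)
    return struct.pack("<%dh" % len(samples), *samples)


def ulaw_to_wav_bytes(data: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap the decoded samples in a mono 16-bit WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)                     # assumption: mono
        w.setsampwidth(2)                     # 16-bit output samples
        w.setframerate(sample_rate)           # assumption: 8000 Hz
        w.writeframes(decode_ulaw(data))
    return buf.getvalue()
```

To convert a file, read the .vbk in binary mode, pass the bytes to ulaw_to_wav_bytes, and write the result to a .wav file.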

Well, this is the joy of closed proprietary systems. But there is a chance they could play nice. Try to contact Callpilot and see if they'll give you the format specs. It's worth a shot.
As for reverse engineering, you need to be able to generate known content, such as a constant 60 Hz tone lasting exactly 1 second, then a 50 Hz tone of the same length, then a tone lasting 10 seconds. Compare the resulting files and isolate the data from the metadata. There is going to be compression involved, so try a handful of common compression schemes; researching Nortel's practices will probably tell you more. If you can feed the result into a player and get a tone back out, you're on your way.
There are probably more informed and structured ways to go about reverse engineering, but in my experience it's a lot of trial and error.
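The "compare them" step can be started with a simple byte-level diff. The sketch below is only one way to do it, with function names invented for illustration: a long common prefix between two recordings of different tones hints at a fixed header, and the byte ranges that change show where the audio data (or length fields) live.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that two recordings share (a likely header)."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return i
    return n


def differing_ranges(a: bytes, b: bytes):
    """Half-open (start, end) byte ranges where the two files differ."""
    ranges, start = [], None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i                 # a run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))     # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, n))
    return ranges
```

Comparing a 1 second and a 10 second recording of the same tone is also useful: fields that change in the shared prefix are likely to encode length or duration.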

Related

Node.js: Is it possible to extract sub-clips from a broader video file based on start/stop time stamps?

I have a flow where iOS app users will record a large video file and upload it to our server. After the fact, the user might want to extract certain portions of that larger video based on specific time stamps and generate a highlight reel that can be viewed and shared locally back on the iOS device.
As a FE developer I don't really have much experience with where to even start here. Our BE will be built in NodeJS. It seems to me that this should be a relatively straightforward problem to solve, but I don't know.
Are there APIs that make movie manipulation easy? Can I easily extract a clip based on a start and stop time and save that as a separate file? Are those costly tasks? Or not too bad?
I'm guessing that the response to this call would be a list of a series of file names that have been generated as a result of these clips being generated, that the iOS app could then pull down and load.
It's not quite as straightforward as it might seem as video files are quite structured with header information and indexing into the individual video and audio tracks and frames. Any splitting up or cropping needs to allow for this and also create new files with the correct headers and indexing etc.
Fortunately, there are indeed libraries that you can use to do this type of thing, one of the most powerful being ffmpeg.
There are projects which allow the ffmpeg command line tool to be used programmatically. The advantage of this approach is that you get to leverage the vast community knowledge base for the ffmpeg command line.
One of the popular ones for nodejs is:
https://github.com/damianociarla/node-ffmpeg
You can then look at the ffmpeg documentation or community answers to find the particular functionality you need, for example to crop video at a start and end time as you asked:
https://stackoverflow.com/a/42827058/334402
https://superuser.com/a/704118
The general idea is quite simple and will be of the format:
ffmpeg -i yourInputVideo.mp4 -ss 01:30:00 -to 02:30:00 -c copy yourNewOutputVideo.mp4
It's worth taking a look at the seeking info in the ffmpeg online documentation (https://ffmpeg.org/ffmpeg.html) to help understand the examples, especially the second one above:
-ss position (input/output)
When used as an input option (before -i), seeks in this input file to position. Note that in most formats it is not possible to seek exactly, so ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved.
When used as an output option (before an output url), decodes but discards input until the timestamps reach position.
position must be a time duration specification, see the Time duration section in the ffmpeg-utils(1) manual.
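Whatever wrapper you use, it ends up building an argument list like the command above and running ffmpeg as a child process. The sketch below shows that idea in Python purely for illustration (the function names are hypothetical); the same argument list could be passed to a process-spawning call in Node.js. It assumes ffmpeg is installed and on the PATH.

```python
import subprocess


def build_clip_command(src: str, dst: str, start: str, end: str) -> list:
    """Argument list for a stream-copy cut between two timestamps."""
    return [
        "ffmpeg",
        "-i", src,        # input file
        "-ss", start,     # start position, e.g. "01:30:00"
        "-to", end,       # stop position, e.g. "02:30:00"
        "-c", "copy",     # copy the streams instead of re-encoding
        dst,
    ]


def extract_clip(src: str, dst: str, start: str, end: str) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_clip_command(src, dst, start, end), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with user-supplied file names. Because the streams are copied rather than re-encoded, a cut like this should be relatively cheap.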

Is there a way to set the details of a file in Windows using python?

I want to be able to set the "Title" and "Comments" (listed in properties->details) of some mp3 files in Windows using python. Is this possible, perhaps with a library like PyWin32? Also, would these details be visible in other operating systems or are they Windows-specific? Thanks.
Simple Answer:
Yes, you can set 'Title' and 'Comments' (and many other fields) of an mp3 file in Windows using Python.
Also, the details are visible on all operating systems and are not Windows-specific.
First you have to understand what an mp3 file is and how data is organized within it.
Detailed Answer:
Raw audio takes up a lot of space. For example, 10 seconds of mono audio sampled at 48 kHz with a bit depth of 16 bits per sample occupies 10*48000*16 bits, which is close to 1 MB. A 5 minute song would therefore take almost 30 MB. Yet most 5 minute mp3 songs are around 5 MB (this depends, of course, on the sampling frequency, bit depth and amount of compression used). This is possible because the data is compressed using signal processing techniques, a big topic in itself that we will not discuss here. To create an mp3 file we need an encoder, which converts the raw audio data to compressed data; every time you play an mp3 song, a decoder converts the compressed data back to raw audio, which is the only form you can actually listen to. Compression is done to save storage and transmission bandwidth (basically reducing the amount of data to be transmitted over the internet).
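The size estimate above is easy to check. This small helper (the name is made up for illustration) assumes uncompressed PCM and a single channel unless told otherwise:

```python
def raw_pcm_bytes(seconds: int, sample_rate: int, bits_per_sample: int,
                  channels: int = 1) -> int:
    """Size in bytes of uncompressed PCM audio."""
    return seconds * sample_rate * bits_per_sample * channels // 8


ten_seconds = raw_pcm_bytes(10, 48000, 16)        # 960000 bytes, close to 1 MB
five_minutes = raw_pcm_bytes(5 * 60, 48000, 16)   # 28800000 bytes, almost 30 MB
```

A stereo recording would double both figures.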
Now, coming to how data is organized inside an mp3 file: it obviously contains the compressed data. In addition, many mp3 files contain some metadata (like the Title and Comments you mentioned in your question). There are several formats for storing this metadata, so software that decodes an mp3 file must also support decoding the metadata format used; only then can you see the information. The metadata is operating system independent and can be seen on any operating system, provided you have a suitable decoder.
Finally, yes, you can edit the metadata on Windows (or on any OS, for that matter) using Python. To do this with Python alone, without any library, you need to understand how data is organized inside an mp3 file, find the metadata inside it, edit it and store it back. However, there are Python libraries and packages that support editing the metadata of mp3 files, and you can use them directly. Because the metadata is independent of the OS, once you edit the properties you should be able to see them on any OS, provided the decoder you use supports them.
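To give a feel for the "without any library" route, here is a minimal sketch for one of the simplest metadata formats, ID3v1, which is a fixed 128-byte block at the end of the file. This is an illustration under assumptions, not a complete tagger: the function names are invented, text is limited to 30 Latin-1 characters per field, any ID3v1.1 track number stored in the last comment bytes is overwritten, and many programs (possibly including the Windows details pane) prefer the richer ID3v2 tags at the start of the file, which a tagging library handles for you.

```python
def _field(text: str, width: int) -> bytes:
    """Encode text as a fixed-width, NUL-padded Latin-1 field."""
    return text.encode("latin-1", "replace")[:width].ljust(width, b"\x00")


def set_id3v1(data: bytes, title: str, comment: str) -> bytes:
    """Return the mp3 bytes with an ID3v1 trailer carrying title and comment."""
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        # An ID3v1 tag already exists: keep its other fields.
        old, body = data[-128:], data[:-128]
        artist, album, year, genre = old[33:63], old[63:93], old[93:97], old[127:]
    else:
        # No tag yet: append one with empty fields and genre 255 (unset).
        body = data
        artist, album, year, genre = bytes(30), bytes(30), bytes(4), b"\xff"
    return (body + b"TAG" + _field(title, 30) + artist + album + year
            + _field(comment, 30) + genre)
```

To apply it, read the mp3 in binary mode, pass the bytes through set_id3v1, and write the result back out (ideally to a copy first).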
Some links which will help you:
mp3 tag tool
Another stack overflow question which gives details about libraries that support viewing and editing of meta data using Python

Reverse-engineer Cubase .cpr format

I don't have an opportunity to buy Cubase, but my partner uses it a lot. I wanted to simplify his life and provide him with cpr projects instead of plain wav files, but no other software can open/save this format.
I looked at a sample cpr he sent me and it seems like the file does not contain audio data itself, it rather contains the mark-up and effects.
I wanted to know the following things:
Is it legal to try to reverse-engineer cpr files?
Is it difficult, and has anyone tried?
Does anyone know other ways to transfer project files between Audacity/Rosegarden and Cubase? The main thing is support for several tracks and their timing in one project, nothing fancy.
Cpr files use a proprietary format. You can have a look at this question.
I suppose it is pretty hard, and I haven't tried!
To my knowledge, there is no way to export/import a project between Cubase and Audacity or Rosegarden. The OMF format, which could be a good candidate, is not supported by Audacity or Rosegarden for now. You can still import/export the audio mix, the separate tracks, and the MIDI files individually. This method is really tedious, but it probably has the advantage of letting you play and edit your projects for decades to come, which is not guaranteed with project files.

Linux Audio record and quality comparison

I am starting a project to test the audio performance on linux.
What I need to do is to play the audio on our websystem and check the audio quality (or just check it has audio output) on linux.
I am going to record the audio on linux with ffmpeg. Is there a better choice?
I don't know how to automatically check that what I recorded is what I played, or how to assess the quality of the recorded audio.
I think what you need is PESQ (Perceptual Evaluation of Speech Quality). However, I have not found anything that is open source/free and works out of the box.
You can download the recommendation from here:
http://www.itu.int/rec/T-REC-P.862-200511-I!Amd2/en
Basically this is the reference implementation of PESQ.
Sevana has an audio quality analyser called AQuA, which is not an ITU standard:
http://www.sevana.fi/aqua_wiki.php
It is available for linux but I think you have to pay for it.
You can also check the similarities for two audio files with cross-correlation, please refer to here:
https://dsp.stackexchange.com/questions/736/how-do-i-implement-cross-correlation-to-prove-two-audio-files-are-similar
I just learned that a lot of people are using Matlab or Octave to generate the necessary data, for example:
http://bagustris.blogspot.ie/2011/11/calculate-time-lag-from-cross.html
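As a rough illustration of the cross-correlation idea, the sketch below slides the recording against the reference and reports the delay with the highest normalized correlation. It is a simplified, assumption-laden version: the function names are invented, both signals are plain lists of samples at the same sample rate, only non-negative delays are tried (the recording starts at or before playback), and the pure-Python loops are far too slow for long files, where Octave or a numerical library would be used as in the links above.

```python
import math


def correlation_at_lag(reference, recorded, lag):
    """Normalized correlation of reference against recorded shifted by lag."""
    n = min(len(reference), len(recorded) - lag)
    if n <= 0:
        return 0.0
    xs = reference[:n]
    ys = recorded[lag:lag + n]
    dot = sum(x * y for x, y in zip(xs, ys))
    energy = math.sqrt(sum(x * x for x in xs) * sum(y * y for y in ys))
    return dot / energy if energy else 0.0   # silence correlates with nothing


def best_lag(reference, recorded, max_lag):
    """Return (lag, score) for the delay in 0..max_lag with the best match."""
    scores = [(correlation_at_lag(reference, recorded, lag), lag)
              for lag in range(max_lag + 1)]
    score, lag = max(scores)
    return lag, score
```

A score near 1.0 at the best lag suggests the recording contains what was played; a low score everywhere suggests silence or the wrong audio. It does not measure perceived quality the way PESQ does.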

Searching for audio

I am looking for a toolkit or library to search the contents of audio files for an audio sample.
For example, I have 5 seconds of speech that I know exists somewhere in hundreds of hours of audio, and I want to find the exact file and position of this sub-sample.
The sample is 99% similar, but it may have been converted to a different audio format, so it may have minor differences in waveform.
I prefer .NET library if there is such an option.
Thank you.
What you are trying to do is not an easy DSP problem to solve, and there is no one foolproof method. There is however an excellent recent article on audio fingerprinting on codeproject which goes into some depth on an algorithm that searches for duplicate MP3s, with code in C#. You may be able to adapt the algorithm to your needs.
