I want to take a classical music piece as an .mp3 file (or another audio format if necessary) and the same piece as a MIDI file, then synchronize the two so that only the MIDI file changes: the timing of its beats should match the .mp3. In other words, if I played both at the same time, they would play the same notes in sync.
How can I do so?
(I have cubase if the answer might be there...)
It's a tough task because general beat-tracking (following tempo changes) hasn't yet been figured out.
There is at least one tool that does work for matching an audio file to a MIDI file, assuming the audio follows almost exactly the same score as the MIDI file. I can't remember its name, though, and I have never used it. The place to ask is the Music Information Retrieval community of scientists:
http://listes.ircam.fr/wws/info/music-ir
For manual matching, you can use a modern DAW such as Logic or Pro Tools. They provide reasonably nice tools for building a detailed tempo map of the audio file, and the MIDI file will then line right up with it, but it's a tedious task. You'll likely need tempo changes more often than once per measure to get a nice alignment; how often will depend on the style.
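To illustrate what a tempo map boils down to, here is a minimal Python sketch. It assumes you already have the time (in seconds) of every beat in the audio, for example from tapping along or placing markers in the DAW. The function name and the sample beat times are made up for illustration.

```python
def tempo_map(beat_times):
    """Turn beat timestamps (seconds) into (beat_index, bpm) pairs.

    Each entry gives the tempo that holds from that beat until the next one.
    """
    entries = []
    for i in range(len(beat_times) - 1):
        interval = beat_times[i + 1] - beat_times[i]
        if interval <= 0:
            raise ValueError("beat times must be strictly increasing")
        entries.append((i, 60.0 / interval))
    return entries


# Hypothetical beat times: two beats at 120 BPM, then a slowdown to 60 BPM.
for beat, bpm in tempo_map([0.0, 0.5, 1.0, 2.0]):
    print(beat, bpm)
```

One tempo entry per beat is the finest grid this gives you; for rubato passages you may want to mark subdivisions as well.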
You could use tools that already exist. For example, if you know the tempo of the mp3, then you could use this page to change the tempo on the midi file.
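If you'd rather do the arithmetic yourself: a MIDI file stores tempo as microseconds per quarter note in a Set Tempo meta event (the bytes FF 51 03 followed by a three-byte big-endian value). A small Python sketch of that conversion follows. The function name is only for illustration, and writing the event into an actual file is left out.

```python
def set_tempo_event(bpm):
    """Return the bytes of a MIDI Set Tempo meta event for the given BPM."""
    microseconds_per_quarter = round(60_000_000 / bpm)
    if not 0 < microseconds_per_quarter <= 0xFFFFFF:
        raise ValueError("tempo does not fit in three bytes")
    return b"\xff\x51\x03" + microseconds_per_quarter.to_bytes(3, "big")


print(set_tempo_event(120).hex())  # prints ff510307a120
```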
I am starting a project to test the audio performance on linux.
What I need to do is play audio on our web system and check the audio quality (or just check that there is audio output) on Linux.
I am going to record the audio on Linux with ffmpeg. Is there a better choice?
I don't know how to check automatically that what I recorded is what I played, or how to assess the quality of the recorded audio.
I think what you need is PESQ (Perceptual Evaluation of Speech Quality, ITU-T P.862). However, I have not found anything open source or free that works out of the box.
You can download the recommendation from here:
http://www.itu.int/rec/T-REC-P.862-200511-I!Amd2/en
Basically this is the reference implementation of PESQ.
Sevana has an audio quality analyser called AQuA, which is not an ITU standard:
http://www.sevana.fi/aqua_wiki.php
It is available for linux but I think you have to pay for it.
You can also check how similar two audio files are with cross-correlation; see:
https://dsp.stackexchange.com/questions/736/how-do-i-implement-cross-correlation-to-prove-two-audio-files-are-similar
I just learned that a lot of people use Matlab or Octave to generate the necessary data, for example:
http://bagustris.blogspot.ie/2011/11/calculate-time-lag-from-cross.html
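For a rough idea of what the cross-correlation check does, here is a minimal pure-Python sketch (not the Matlab/Octave code from the links). It assumes both signals are already decoded to lists of samples at the same sample rate. The function names are made up, and the brute-force loop is only practical for short clips.

```python
import math


def correlation_at(reference, recorded, lag):
    """Sum of products of reference[i] and recorded[i + lag] where both exist."""
    return sum(
        r * recorded[i + lag]
        for i, r in enumerate(reference)
        if 0 <= i + lag < len(recorded)
    )


def estimate_lag(reference, recorded, max_lag):
    """Return (lag, similarity): the delay in samples of `recorded` relative to
    `reference`, and the correlation peak normalised by the signal energies."""
    best = max(
        range(-max_lag, max_lag + 1),
        key=lambda lag: correlation_at(reference, recorded, lag),
    )
    energy = math.sqrt(
        sum(x * x for x in reference) * sum(y * y for y in recorded)
    )
    peak = correlation_at(reference, recorded, best)
    return best, (peak / energy if energy else 0.0)


played = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
captured = [0, 0, 0, 0, 0, 1, 2, 3, 2, 1]  # same shape, 3 samples later
print(estimate_lag(played, captured, 5))  # prints (3, 1.0)
```

A similarity near 1 suggests the recording contains what was played. What threshold counts as "good enough" is something you would have to calibrate on your own recordings.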
I would like to create a utility in either PHP or Perl to convert an audio file created by Nortel's CallPilot voice mail system into a wave file. The problem is that the format, which has the .vbk file extension, is unknown to virtually any audio player. To date, I have not found one that will play a .vbk file. I've looked at the audio file conversion libraries on CPAN and tried many of them, but they don't recognize the file. I was not successful with PHP's audio format manipulation either. Nortel does provide a converter, but it does not suit my needs: I would like to have this run via cron on a CentOS system. I don't know how to reverse engineer this format, and there seem to be only scraps of info about it on the web. This page indicates that it is "based on the H.232 format":
https://www.odesk.com/o/jobs/job/Reverse-Engineer-Nortel-VBK-Audio-Format_~~f501f11679f3f6bb/
I know this is a very old thread, but I've recently been looking into converting Nortel's vbk format as well. Importing the vbk files into Audacity with the raw data option worked for me with these settings: Encoding: U-Law, Byte order: little-endian, Channels: 1 Channel (Mono), Sample rate: 8000 Hz. I'm not sure whether they have multiple formats for their vbk files, but mine were from a BCM50 phone system.
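If those Audacity settings work for your files too, the conversion can be scripted. Below is a minimal sketch in Python rather than PHP or Perl, using only the standard library. It assumes the file is plain 8000 Hz mono G.711 mu-law data. The `header_bytes` parameter is hypothetical, in case your files turn out to have a header that needs skipping, and the function names are made up.

```python
import struct
import wave


def ulaw_byte_to_pcm16(value):
    """Decode one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~value & 0xFF
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


def ulaw_to_pcm_bytes(raw, header_bytes=0):
    """Decode raw mu-law bytes to little-endian 16-bit PCM bytes."""
    payload = raw[header_bytes:]
    samples = [ulaw_byte_to_pcm16(b) for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)


def vbk_to_wav(vbk_path, wav_path, sample_rate=8000, header_bytes=0):
    """Write a mono 16-bit WAV file from a raw mu-law input file."""
    with open(vbk_path, "rb") as src:
        pcm = ulaw_to_pcm_bytes(src.read(), header_bytes)
    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(sample_rate)
        dst.writeframes(pcm)
```

Calling `vbk_to_wav("message.vbk", "message.wav")` from a small script would then be easy to schedule with cron (the file names here are placeholders).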
Well, this is the joy of closed proprietary systems. But there is a chance they could play nice. Try to contact Callpilot and see if they'll give you the format specs. It's worth a shot.
As for reverse engineering, you need to be able to generate known content. Like a constant tone at 60Hz for exactly 1 second. Then at 50Hz. Then at 10 seconds. Compare them. Isolate the data from the metadata. There is going to be compression involved, so try a handful of common compression schemes; some research into Nortel's practices will probably tell you more. If you can feed that into a player and get a tone back out, you're on your way.
There are probably more informed and structured ways to go about reverse engineering, but in my experience it's a lot of trial and error.
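As a starting point for that trial and error, here is a rough Python sketch of two helpers: one generates a known test tone to play into the system, and one measures how many leading bytes two captured files share, which hints at where a fixed header might end. Both function names and the 8000 Hz default are assumptions for illustration, and a shared prefix is only a hint, since real headers often contain length fields that differ between files.

```python
import math


def sine_tone(frequency_hz, seconds, sample_rate=8000, amplitude=16000):
    """Return 16-bit sample values for a constant sine tone."""
    count = int(round(seconds * sample_rate))
    return [
        int(round(amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)))
        for n in range(count)
    ]


def common_prefix_length(first, second):
    """Number of leading bytes that are identical in both byte strings."""
    length = 0
    for a, b in zip(first, second):
        if a != b:
            break
        length += 1
    return length


tone = sine_tone(60, 1.0)
print(len(tone))  # prints 8000
print(common_prefix_length(b"HDR\x00\x01abc", b"HDR\x00\x01xyz"))  # prints 5
```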
I was wondering if there was a tool similar to jCrop, with the exception that instead of an image I'd allow the user to crop an audio file? Google didn't give me any useful results sadly :(
The reason why I'm asking is that I'm making a tool to convert audio files to popular ringtone formats, and only letting the user specify the offsets in numbers is somewhat inconvenient. Obviously the tool doesn't have to be in javascript - anything that fits into a website is ok.
Here's a browser-based audio editor written in Flash that you could probably adapt (it supports cropping):
http://www.hisschemoller.com/2010/audio-editor-1-0/
One thing I found a bit confusing is that you have to hold down the play button on the editor to play the full sound.
I am looking to record voice in as compact a file format as possible for an iPad app, and I am not concerned about sound quality. I chose the ima4 format but don't really know much about audio, so I am having trouble figuring out how to play back the produced file to test how it sounds. Is this a compressed format that I have to uncompress with some tool just to listen to it? Is this the right format if I want something compact and reasonably coherent but am not worried about great quality?
Apparently, I had to save it as an .aif, .aiff, or .aifc file which then was playable by common players like iTunes.
I'm working on a tool to edit .srt (subtitle) files within the browser (the tool is to be used for linguistic annotation). In desktop tools that are used for similar purposes, the user has access to the waveform, and can "see" where silences are in the signal, and thus select a particular phrase for transcription.
Such a tool might be buildable in-browser down the road (using Web Workers and Canvas, say), but for now, it's not feasible to do the sort of signal processing it would take to find those silences.
So, I'm thinking about the next-best approach: what free tool could I use to produce a list of timestamps of where silences (below some given threshold) start and stop? If I produce such a list offline and upload it with the audio file, then I can at least make it possible to navigate through the "phrases" (defined as periods of non-silence). I think that would still be a win in productivity for doing the transcription.
Audacity can sort of do this, but AFAICT only if you install Nyquist, which seems to have some patent issues.
Are there any alternatives?
It would be nice if the tool could handle as many as possible of ogg, mp3, and wav files.
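To make the idea concrete, here is a minimal Python sketch of the kind of offline silence detection described above. It assumes the audio has already been decoded to a list of 16-bit mono samples, for example by converting ogg/mp3 to wav first and reading it with Python's `wave` module. The function name, the threshold and the minimum duration are all made-up defaults that would need tuning.

```python
def find_silences(samples, sample_rate, threshold=500, min_silence=0.3, frame_ms=10):
    """Return (start_seconds, end_seconds) pairs for stretches where the peak
    amplitude of every frame stays below `threshold` for at least `min_silence`."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    silences = []
    start = None
    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        quiet = max(abs(s) for s in frame) < threshold
        now = offset / sample_rate
        if quiet and start is None:
            start = now  # a quiet stretch begins
        elif not quiet and start is not None:
            if now - start >= min_silence:
                silences.append((start, now))
            start = None
    # Close a quiet stretch that runs to the end of the audio.
    if start is not None:
        end = len(samples) / sample_rate
        if end - start >= min_silence:
            silences.append((start, end))
    return silences


# Synthetic example: half a second of signal, half a second of silence, repeat.
signal = [1000] * 500 + [0] * 500 + [1000] * 500
print(find_silences(signal, sample_rate=1000))  # prints [(0.5, 1.0)]
```

The resulting list could be saved as JSON and uploaded next to the audio file, so the browser only has to read timestamps rather than process the signal.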