How to export audio media from a MOV file with the QuickTime API?

I want to export the audio media of a MOV file with the QuickTime API and save it to a WAV file (or something equivalent). How can I do that? I am using Windows XP.

Sorry if I am stating the obvious, but you should be able to achieve this the same way as described here:
Export every frame as image from a Movie-File (QuickTime-API)
Or do you need to do this in a non-interactive way?
Edit
To export the audio media of a Movie file to WAVE non-interactively using a Movie Exporter, use the following code:
#include "QuickTimeComponents.h"
...
// acquire Movie
...
ComponentDescription desc;
MovieExportComponent exporter;
char outfilename[255];
FSSpec fsspec;
int flags;

// first we have to find a Movie Exporter capable of exporting the format we want
desc.componentType = MovieExportType;
desc.componentSubType = kQTFileTypeWave; // 'WAVE'
desc.componentManufacturer = SoundMediaType;
desc.componentFlags = 0;
desc.componentFlagsMask = 0;
// and create an instance of it
exporter = OpenComponent( FindNextComponent( 0, &desc ) );

// then set up an FSSpec for our output file
sprintf( outfilename, "C:/test.wav" );
c2pstr( outfilename );
FSMakeFSSpec( 0, 0L, (ConstStr255Param)outfilename, &fsspec );
// if you do error handling, take care to ignore fnfErr being returned
// by FSMakeFSSpec - we're about to create a new file, after all

// then finally initiate the conversion
flags = createMovieFileDeleteCurFile | movieToFileOnlyExport;
ConvertMovieToFile( movie, 0, &fsspec, kQTFileTypeWave, 'TVOD', 0, 0, flags, exporter );
CloseComponent( exporter );
...
// Clean up
...

ffmpeg is easily capable of this, and it can be compiled under the LGPL if needed for commercial software. If you need more customizable hooks, you can use libavcodec and libavformat from the same project.
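As a sketch (assuming ffmpeg is on the PATH and your source file is named input.mov; adjust the names to taste), the extraction boils down to one command:

```shell
# -vn drops the video stream; pcm_s16le produces a standard 16-bit WAV
ffmpeg -i input.mov -vn -acodec pcm_s16le output.wav
```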

With mplayer:
mplayer.exe -ao pcm:file=output.wav -vo null -vc dummy input.mov

Related

ALSA Hooks -- Modifying audio as it's being played

I'm at a loss with this — I have several personal projects in mind that essentially require that I "tap" into the audio stream: read the audio data, do some processing and modify the audio data before it is finally sent to the audio device.
One example of these personal projects is a software-based active crossover. If I have an audio device with 6 channels (i.e., 3 left + 3 right), then I can read the data, apply a LP filter (×2 – left + right), a BP filter, and a HP filter and output the streams through each of the six channels.
Notice that I know how to write a player application that does this — instead, I would want to do this so that any audio from any source (audio players, video players, youtube or any other source of audio being played by the web browser, etc.) is subject to this processing.
I've seen some of the examples (e.g., pcm_min.c from the alsa-project web site, play and record examples in the Linux Journal article by Jeff Tranter from Sep 2004) but I don't seem to have enough information to do something like what I describe above.
Any help or pointers will be appreciated.
You can implement your project as a LADSPA plugin, test it with Audacity or any other program supporting LADSPA plugins, and when you like it, insert it into the ALSA/PulseAudio/JACK playback chain.
LADSPA is a single header file defining a simple interface for writing audio processing plugins. Each plugin has its input/output/control ports and a run() function. The run() function is executed for each block of samples to do the actual audio processing: apply the "control" arguments to the "input" buffers and write the result to the "output" buffers.
Example LADSPA stereo amplifier plugin (single control argument: "Amplification factor", two input ports, two output ports):
///gcc -shared -o /full/path/to/plugindir/amp_example.so amp_example.c
#include <stdlib.h>
#include "ladspa.h"

enum PORTS {
    PORT_CAMP,
    PORT_INPUT1,
    PORT_INPUT2,
    PORT_OUTPUT1,
    PORT_OUTPUT2
};

typedef struct {
    LADSPA_Data *c_amp;
    LADSPA_Data *i_audio1;
    LADSPA_Data *i_audio2;
    LADSPA_Data *o_audio1;
    LADSPA_Data *o_audio2;
} MyAmpData;

static LADSPA_Handle myamp_instantiate(const LADSPA_Descriptor *Descriptor, unsigned long SampleRate)
{
    MyAmpData *data = (MyAmpData*)malloc(sizeof(MyAmpData));
    data->c_amp = NULL;
    data->i_audio1 = NULL;
    data->i_audio2 = NULL;
    data->o_audio1 = NULL;
    data->o_audio2 = NULL;
    return data;
}

static void myamp_connect_port(LADSPA_Handle Instance, unsigned long Port, LADSPA_Data *DataLocation)
{
    MyAmpData *data = (MyAmpData*)Instance;
    switch (Port)
    {
        case PORT_CAMP:    data->c_amp    = DataLocation; break;
        case PORT_INPUT1:  data->i_audio1 = DataLocation; break;
        case PORT_INPUT2:  data->i_audio2 = DataLocation; break;
        case PORT_OUTPUT1: data->o_audio1 = DataLocation; break;
        case PORT_OUTPUT2: data->o_audio2 = DataLocation; break;
    }
}

static void myamp_run(LADSPA_Handle Instance, unsigned long SampleCount)
{
    MyAmpData *data = (MyAmpData*)Instance;
    double amp = *data->c_amp;
    size_t i;
    for (i = 0; i < SampleCount; i++)
    {
        data->o_audio1[i] = data->i_audio1[i]*amp;
        data->o_audio2[i] = data->i_audio2[i]*amp;
    }
}

static void myamp_cleanup(LADSPA_Handle Instance)
{
    MyAmpData *data = (MyAmpData*)Instance;
    free(data);
}

static LADSPA_Descriptor myampDescriptor = {
    .UniqueID = 123, // for public release see http://ladspa.org/ladspa_sdk/unique_ids.html
    .Label = "amp_example",
    .Name = "My Amplify Plugin",
    .Maker = "alsauser",
    .Copyright = "WTFPL",
    .PortCount = 5,
    .PortDescriptors = (LADSPA_PortDescriptor[]){
        LADSPA_PORT_INPUT  | LADSPA_PORT_CONTROL,
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO,
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO
    },
    .PortNames = (const char * const[]){
        "Amplification factor",
        "Input left",
        "Input right",
        "Output left",
        "Output right"
    },
    .PortRangeHints = (LADSPA_PortRangeHint[]){
        { /* PORT_CAMP */
            LADSPA_HINT_BOUNDED_BELOW | LADSPA_HINT_BOUNDED_ABOVE | LADSPA_HINT_DEFAULT_1,
            0,  /* LowerBound */
            10  /* UpperBound */
        },
        {0, 0, 0}, /* PORT_INPUT1 */
        {0, 0, 0}, /* PORT_INPUT2 */
        {0, 0, 0}, /* PORT_OUTPUT1 */
        {0, 0, 0}  /* PORT_OUTPUT2 */
    },
    .instantiate = myamp_instantiate,
    //.activate = myamp_activate,
    .connect_port = myamp_connect_port,
    .run = myamp_run,
    //.deactivate = myamp_deactivate,
    .cleanup = myamp_cleanup
};

// NULL-terminated list of plugins in this library
const LADSPA_Descriptor *ladspa_descriptor(unsigned long Index)
{
    if (Index == 0)
        return &myampDescriptor;
    else
        return NULL;
}
(if you prefer a "short" 40-line version, see https://pastebin.com/unCnjYfD)
Add as many input/output channels as you need and implement your processing in the myamp_run() function. Build the plugin and set the LADSPA_PATH environment variable to the directory where you've built it, so that other apps can find it:
export LADSPA_PATH=/usr/lib/ladspa:/full/path/to/plugindir
Test it in Audacity or any other program supporting LADSPA plugins. To test it in a terminal you can use the applyplugin tool from the "ladspa-sdk" package:
applyplugin input.wav output.wav /full/path/to/plugindir/amp_example.so amp_example 2
And if you like the result, insert it into your default playback chain. For plain ALSA you can use a config like the following (it won't work for pulse/jack):
# ~/.asoundrc
pcm.myamp {
    type plug
    slave.pcm {
        type ladspa
        path "/usr/lib/ladspa"  # required but ignored as `filename` is set
        slave.pcm "sysdefault"
        playback_plugins [{
            filename "/full/path/to/plugindir/amp_example.so"
            label "amp_example"
            input.controls [ 2.0 ]  # Amplification=2
        }]
    }
}
# to test it: aplay -Dmyamp input.wav
# to point the "default" pcm to it, uncomment the next line:
#pcm.!default "myamp"
See also:
ladspa.h - answers to most technical questions are in its comments
LADSPA SDK overview
listplugins and analyseplugin tools from "ladspa-sdk" package
alsa plugins : "type ladspa" syntax, and alsa configuration file syntax
ladspa plugins usage examples
If you want to get your hands dirty with some code, you could check out some of these articles by Paul Davis (a Linux audio guru). You'll have to combine the playback and capture examples to get live audio. Give it a shot, and if you have problems you can post a code-specific question on SO.
Once you get the live audio working, you can implement an LP filter and go from there.
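To make the filtering step concrete, here is a minimal sketch of a one-pole low-pass filter of the kind you would put inside a plugin's run() callback. The function name, parameters, and coefficient formula are illustrative choices, not part of any plugin API:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One-pole low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1]).
// The smoothing coefficient 'a' is derived from the cutoff frequency;
// for a constant (DC) input, the output converges to the input level.
std::vector<float> lowpass(const std::vector<float> &x,
                           float cutoff_hz, float sample_rate)
{
    const float a =
        1.0f - std::exp(-2.0f * 3.14159265f * cutoff_hz / sample_rate);
    std::vector<float> y(x.size());
    float state = 0.0f; // one sample of filter memory per channel
    for (std::size_t n = 0; n < x.size(); ++n) {
        state += a * (x[n] - state);
        y[n] = state;
    }
    return y;
}
```

A band-pass can then be built by subtracting a low-passed copy from a high-passed one, or by cascading the two; one filter state per channel gives you the six-channel crossover described above.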
There are plenty of LADSPA and LV2 audio plugins that implement LP, HP and BP filters but I'm not sure if any are available for your particular channel configuration. It sounds like you want to roll your own anyway.

Strange noise and abnormalities when writing audio data from libspotify into a file

Currently we're implementing libspotify on a Windows 7 64-bit system. Everything seems to work fine except the playback. We get data from the callback, but even when opening the saved audio in Audacity, it is filled with abnormalities. To research further, we took the Win32 sample (spshell) and modified it to save the music data to a file. Same problem: it's definitely the music, but with these ticks in it. I'm sure there's something simple I'm missing here, but I'm at a loss as to what the problem could be. Any help would be great since, as it stands, our project is at a standstill until we can resolve this.
The saved audio can be heard here:
http://uploader.crestron.com/download.php?file=8001d80992480280dba365752aeaca81
Below are the code changes I made to save the file ( for testing only )
static FILE *pFile;
int numBytesToWrite = 0;
CRITICAL_SECTION m_cs;

int SP_CALLCONV music_delivery(sp_session *s, const sp_audioformat *fmt, const void *frames, int num_frames)
{
    if ( num_frames == 0 )
        return 0;
    EnterCriticalSection(&m_cs);
    numBytesToWrite = num_frames * fmt->channels * sizeof(short);
    if (numBytesToWrite > 0)
        fwrite(frames, sizeof(short), numBytesToWrite, pFile);
    LeaveCriticalSection(&m_cs);
    return num_frames;
}

static void playtrack_test(void)
{
    sp_error err;
    InitializeCriticalSection(&m_cs);
    pFile = fopen("C:\\zzzspotify.pcm", "wb");
    test_start(&playtrack);
    if((err = sp_session_player_load(g_session, stream_track)) != SP_ERROR_OK) {
        test_report(&playtrack, "Unable to load track: %s", sp_error_message(err));
        return;
    }
    info_report("Streaming '%s' by '%s' this will take a while", sp_track_name(stream_track),
                sp_artist_name(sp_track_artist(stream_track, 0)));
    sp_session_player_play(g_session, 1);
}

void SP_CALLCONV play_token_lost(sp_session *s)
{
    fclose(pFile);
    DeleteCriticalSection(&m_cs);
    stream_track_end = 2;
    notify_main_thread(g_session);
    info_report("Playtoken lost");
}

static int check_streaming_done(void)
{
    if(stream_track_end == 2)
        test_report(&playtrack, "Playtoken lost");
    else if(stream_track_end == 1)
        test_ok(&playtrack);
    else
        return 0;
    fclose(pFile);
    stream_track_end = 0;
    return 1;
}
It looks like this is the problem:
fwrite(frames, sizeof(short), numBytesToWrite, pFile);
The fwrite documentation states that the second argument is the "size in bytes of each element to be written", and the third is this "number of elements, each one with a size of size bytes".
The way you're calling fwrite will tell it to write numBytesToWrite * sizeof(short) bytes, which will run right off the end of the given buffer. I'm actually surprised it doesn't crash!
I'd suggest changing your fwrite call to something like:
fwrite(frames, sizeof(char), numBytesToWrite, pFile);
or:
int numSamplesToWrite = num_frames * fmt->channels;
fwrite(frames, sizeof(short), numSamplesToWrite, pFile);
Edit:
After looking at your audio in detail, I'm more convinced that this is the case. The song seems to be playing at half speed (i.e., twice as much data is being written), and the artefacts look like a buffer overrun into random memory.
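The element-count semantics are easy to verify in isolation. This small sketch (a standalone illustration, not libspotify code; the helper name is made up) writes data to a temporary file and reports how many bytes actually land in it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// fwrite(ptr, size, count, f) writes size * count bytes, so 'count'
// must be the number of elements, not the number of bytes.
long bytes_written(std::size_t elem_size, std::size_t count)
{
    short samples[8] = {0};         // room for 8 shorts = 16 bytes
    std::FILE *f = std::tmpfile();  // scratch file, removed on close
    if (!f)
        return -1;
    std::fwrite(samples, elem_size, count, f);
    long pos = std::ftell(f);       // file position = bytes written
    std::fclose(f);
    return pos;
}
```

Passing a byte count as `count` together with `size = sizeof(short)` doubles the amount written, which matches the half-speed playback heard in the uploaded file.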

How to set number of PPL threads to one?

I have a number-crunching function that I have parallelized using PPL. However, another developer requires this function to run serially for some reason. I need to provide a parameter so that he can call my function in serial mode. I don't want to duplicate the code, so I need a way to limit the number of PPL threads. Although I have set
Concurrency::SchedulerPolicy sp( 1, Concurrency::MaxConcurrency, 1 );
CurrentScheduler::Create(sp);
PPL creates two threads and runs my code in parallel. Any suggestions on how to serialize PPL-enhanced code?
For this problem it is better not to set scheduler policies, but to control the task-group execution manually, for example:
using namespace Concurrency;

std::vector< task_handle< std::function< void() > > > aTask;
aTask.push_back( make_task([](){ /* task 1 */ }) );
aTask.push_back( make_task([](){ /* task 2 */ }) );
aTask.push_back( make_task([](){ /* task 3 */ }) );

task_group tGroup;
bool bSerialMode = true; /* or false */
if (!bSerialMode)
{
    std::for_each(aTask.begin(), aTask.end(), [&](task_handle< std::function< void() > >& handle){
        tGroup.run( handle );
    });
}
else
{
    tGroup.run( [&](){
        std::for_each(aTask.begin(), aTask.end(), [&](task_handle< std::function< void() > >& handle){
            tGroup.run_and_wait( handle );
        });
    });
}
If you do decide to limit all tasks to one virtual processor, then set MinConcurrency too:
CurrentScheduler::Create( SchedulerPolicy( 2, Concurrency::MinConcurrency, 1, Concurrency::MaxConcurrency, 1 ) );
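For the serial-or-parallel switch itself, the same pattern can be sketched portably; here std::async stands in for PPL's task_group, and run_tasks and its arguments are illustrative, not PPL API:

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <vector>

// Run one set of tasks either strictly in order or concurrently,
// selected by a single flag, so callers never duplicate the task list.
void run_tasks(const std::vector<std::function<void()>> &tasks, bool serial)
{
    if (serial) {
        for (const auto &t : tasks)
            t();                       // one after another, same thread
    } else {
        std::vector<std::future<void>> pending;
        for (const auto &t : tasks)
            pending.push_back(std::async(std::launch::async, t));
        for (auto &f : pending)
            f.get();                   // join all tasks
    }
}
```

The caller-facing parameter the other developer asked for then becomes just the `serial` flag, with no duplicated number-crunching code.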

How to change the stream index in libavformat

I'm a newbie with ffmpeg. I have a problem when some media has multiple audio streams.
Suppose an MKV file has three audio streams (MP3, WMA and WMAPro).
How do I change the stream index when demuxing using:
AVPacket inputPacket;
ret = av_read_frame(avInputFmtCtx, &inputPacket)
So I'm searching for something like change_stream_index(int streamindex), such that when I call it (say, change_stream_index(2)), the next call to av_read_frame will demux a WMAPro frame instead of MP3.
Thanks guys!
Well, first you check the number of streams within the input. Then you save their indices (in my case I only have 2 streams, but you can easily expand that):
ptrFormatContext = avformat_alloc_context();
if(avformat_open_input(&ptrFormatContext, filename, NULL, NULL) != 0)
{
    qDebug("Error opening the input");
    exit(-1);
}
if(av_find_stream_info(ptrFormatContext) < 0)
{
    qDebug("Could not find any stream info");
    exit(-2);
}
dump_format(ptrFormatContext, 0, filename, 0);
for(i = 0; i < ptrFormatContext->nb_streams; i++)
{
    switch(ptrFormatContext->streams[i]->codec->codec_type)
    {
        case AVMEDIA_TYPE_VIDEO:
        {
            if(videoStream < 0) videoStream = i;
            break;
        }
        case AVMEDIA_TYPE_AUDIO:
        {
            if(audioStream < 0) audioStream = i;
            break;
        }
    }
}
if(audioStream == -1)
{
    qDebug("Could not find any audio stream");
    exit(-3);
}
if(videoStream == -1)
{
    qDebug("Could not find any video stream");
    exit(-4);
}
Since you don't know in which order the streams come, you'll also have to check the codec name, ptrFormatContext->streams[i]->codec->codec_name, and then save the index for the desired target format.
Then you can just access the stream through the given index:
while(av_read_frame(ptrFormatContext, &ptrPacket) >= 0)
{
    if(ptrPacket.stream_index == videoStream)
    {
        // decode the video stream to raw format
        if(avcodec_decode_video2(ptrCodecCtxt, ptrFrame, &frameFinished, &ptrPacket) < 0)
        {
            qDebug("Error decoding the Videostream");
            exit(-13);
        }
        if(frameFinished)
        {
            printf("%s\n", (char*) ptrPacket.data);
            // encode the video stream to target format
            // av_free_packet(&ptrPacket);
        }
    }
    else if (ptrPacket.stream_index == audioStream)
    {
        // decode the audio stream to raw format
        // if(avcodec_decode_audio3(aCodecCtx, , , &ptrPacket) < 0)
        // {
        //     qDebug("Error decoding the Audiostream");
        //     exit(-14);
        // }
        // encode the audio stream to target format
    }
}
I just copied some extracts from a program of mine, but this will hopefully help you understand how to select streams from the input.
I did not post complete code, only excerpts, so you will have to do some initialization etc. on your own, but if you have any questions I'll gladly help you!
I was facing the same problem today. In my case I'm dealing with an MP4 file with 80 tracks, and obviously if you just need to demux a single track you don't want to skip up to 79 packets each time you want to process a single packet from the selected stream.
The solution for me was setting the discard attribute of all streams which I'm not interested in to AVDISCARD_ALL. For example in order to select only a single stream with index 71 you can do this:
int32_t stream_index = 71;
for(int32_t i = 0; i < pFormatContext->nb_streams; i++)
{
    if(stream_index != i)
        pFormatContext->streams[i]->discard = AVDISCARD_ALL;
}
After this you can call av_seek_frame or av_read_frame, and only track 71 is processed.
Just for the reference, here is the list of all available discard types:
AVDISCARD_NONE =-16, ///< discard nothing
AVDISCARD_DEFAULT = 0, ///< discard useless packets like 0 size packets in avi
AVDISCARD_NONREF = 8, ///< discard all non reference
AVDISCARD_BIDIR = 16, ///< discard all bidirectional frames
AVDISCARD_NONINTRA= 24, ///< discard all non intra frames
AVDISCARD_NONKEY = 32, ///< discard all frames except keyframes
AVDISCARD_ALL = 48, ///< discard all
The answer by Dimitri Podborski is good! But there's a small issue with that approach. If you inspect the code of the av_read_frame function, you'll find that there are two cases:
format_context->flags & AVFMT_FLAG_GENPTS == true - the approach works
format_context->flags & AVFMT_FLAG_GENPTS == false - the discard field of a stream will be ignored, and av_read_frame will read all the packets.
So, obviously, you should go with the AVDISCARD_ALL approach, but if the GENPTS flag is absent, fall back to classic stream-index inspection.

Is there a demo for add effects and export to wav files?

Is there a demo for adding effects and exporting to WAV files?
I have searched, but have not found a way to do it.
I want to add effects to an input.wav file, play it, and then export a new WAV file with the effects applied. Please help me.
My code is:
result = FMOD::System_Create(&system);
ERRCHECK(result);

result = system->getVersion(&version);
if (FMOD_OK != result) {
    printf("FMOD lib version %08x doesn't match header version %08x", version, FMOD_VERSION);
}

// result = system->setOutput(FMOD_OUTPUTTYPE_WAVWRITER);
// ERRCHECK(result);

char cDest[200] = {0};
NSString *fileName = [NSString stringWithFormat:@"%@/addeffects_sound.wav",
    [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) objectAtIndex:0]];
[fileName getCString:cDest maxLength:200 encoding:NSASCIIStringEncoding];

result = system->init(32, FMOD_INIT_NORMAL | FMOD_INIT_PROFILE_ENABLE, cDest);
//result = system->init(32, FMOD_INIT_NORMAL, extradriverdata);
ERRCHECK(result);

result = system->getMasterChannelGroup(&mastergroup);
ERRCHECK(result);

[self createAllDSP];

-(void)createSound:(NSString *)filename
{
    //printf("really path = %s", getPath(filename));
    result = system->createSound(getPath(filename), FMOD_DEFAULT, 0, &sound);
    ERRCHECK(result);
    [self playSound];
}

-(void)playSound
{
    result = system->playSound(sound, 0, false, &channel);
    ERRCHECK(result);
    //result = channel->setLoopCount(1);
    //ERRCHECK(result);
}
Your question is quite broad; I encourage you to refocus on the areas you are having trouble with.
To answer your question generally though, there are several APIs you will need to achieve your goal; you already have some of them in your code.
To get the FMOD system ready to output a .wav:
System_Create
System::setOutput
System::init
To create and prepare the desired effect:
System::createDSPByType
System::addDSP
To create and play the desired sound:
System::createSound
System::playSound
To check when the sound is done:
System::update
Channel::isPlaying
To shut down and finalize the .wav:
Sound::release
System::release
This is a basic outline of one way you can achieve your goal with FMOD.
