How to change stream index in libavformat

I'm a newbie with ffmpeg, and I have a problem when media has multiple audio streams.
Suppose an MKV file has three audio streams (MP3, WMA and WMAPro).
How do I change the stream index when demuxing using:
AVPacket inputPacket;
ret = av_read_frame(avInputFmtCtx, &inputPacket);
I'm looking for something like change_stream_index(int streamindex), so that after I call, say, change_stream_index(2), the next call to av_read_frame demuxes a WMAPro frame instead of an MP3 frame.
Thanks guys!

Well, first you check the number of streams within the input. Then you store their indices (in my case I only have 2 streams, but you can easily expand that):
ptrFormatContext = avformat_alloc_context();
if(avformat_open_input(&ptrFormatContext, filename, NULL, NULL) != 0)
{
    qDebug("Error opening the input");
    exit(-1);
}
if(av_find_stream_info(ptrFormatContext) < 0) // deprecated; newer FFmpeg: avformat_find_stream_info(ptrFormatContext, NULL)
{
    qDebug("Could not find any stream info");
    exit(-2);
}
dump_format(ptrFormatContext, 0, filename, 0); // deprecated; newer FFmpeg: av_dump_format()
for(i = 0; i < ptrFormatContext->nb_streams; i++)
{
    switch(ptrFormatContext->streams[i]->codec->codec_type)
    {
        case AVMEDIA_TYPE_VIDEO:
        {
            if(videoStream < 0) videoStream = i;
            break;
        }
        case AVMEDIA_TYPE_AUDIO:
        {
            if(audioStream < 0) audioStream = i;
            break;
        }
    }
}
if(audioStream == -1)
{
    qDebug("Could not find any audio stream");
    exit(-3);
}
if(videoStream == -1)
{
    qDebug("Could not find any video stream");
    exit(-4);
}
Since you don't know in which order the streams come, you'll also have to check the codec name (ptrFormatContext->streams[i]->codec->codec_name) and save the index for the corresponding target format.
Then you can just access the stream through the given index:
while(av_read_frame(ptrFormatContext, &ptrPacket) >= 0)
{
    if(ptrPacket.stream_index == videoStream)
    {
        //decode the video stream to raw format
        if(avcodec_decode_video2(ptrCodecCtxt, ptrFrame, &frameFinished, &ptrPacket) < 0)
        {
            qDebug("Error decoding the Videostream");
            exit(-13);
        }
        if(frameFinished)
        {
            printf("%s\n", (char*) ptrPacket.data);
            //encode the video stream to target format
            // av_free_packet(&ptrPacket);
        }
    }
    else if (ptrPacket.stream_index == audioStream)
    {
        //decode the audio stream to raw format
        // if(avcodec_decode_audio3(aCodecCtx, , , &ptrPacket) < 0)
        // {
        //     qDebug("Error decoding the Audiostream");
        //     exit(-14);
        // }
        //encode the audio stream to target format
    }
}
I just copied some extracts from a program of mine, but hopefully this helps you understand how to select streams from the input.
I did not post complete code, only excerpts, so you will have to do some initialization etc. on your own, but if you have any questions I'll gladly help!

I was facing the same problem today. In my case I deal with an MP4 file with 80 tracks, and if you only need to demux a single track you obviously don't want to skip up to 79 packets each time you want to process one packet from the selected stream.
The solution for me was to set the discard attribute of every stream I'm not interested in to AVDISCARD_ALL. For example, to select only the single stream with index 71:
int32_t stream_index = 71;
for(int32_t i = 0; i < pFormatContext->nb_streams; i++)
{
    if(stream_index != i) pFormatContext->streams[i]->discard = AVDISCARD_ALL;
}
After this you can call av_seek_frame or av_read_frame and only track 71 is processed.
Just for reference, here is the list of all available discard types:
AVDISCARD_NONE =-16, ///< discard nothing
AVDISCARD_DEFAULT = 0, ///< discard useless packets like 0 size packets in avi
AVDISCARD_NONREF = 8, ///< discard all non reference
AVDISCARD_BIDIR = 16, ///< discard all bidirectional frames
AVDISCARD_NONINTRA= 24, ///< discard all non intra frames
AVDISCARD_NONKEY = 32, ///< discard all frames except keyframes
AVDISCARD_ALL = 48, ///< discard all

The answer by Dimitri Podborski is good! But there's a small issue with that approach. If you inspect the code of the av_read_frame function, you'll find there are two cases:
format_context->flags & AVFMT_FLAG_GENPTS is true - then OK, the approach works;
format_context->flags & AVFMT_FLAG_GENPTS is false - then the discard field of a stream is ignored, and av_read_frame reads all the packets.
So, obviously, you should go with the AVDISCARD_ALL approach, but in the absence of the GENPTS flag, fall back to classic stream-index inspection.
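Combining the discard trick with a stream-index check, a minimal selection loop might look like the sketch below (untested; fmt_ctx and target_index are assumed names for an already opened AVFormatContext and the chosen stream index):

```c
/* Sketch: prefer AVDISCARD_ALL, but still check stream_index as a fallback,
   since without AVFMT_FLAG_GENPTS the discard field may be ignored. */
int target_index = 2; /* hypothetical: the WMAPro stream from the question */
for (unsigned int i = 0; i < fmt_ctx->nb_streams; i++) {
    if ((int)i != target_index)
        fmt_ctx->streams[i]->discard = AVDISCARD_ALL;
}

AVPacket pkt;
while (av_read_frame(fmt_ctx, &pkt) >= 0) {
    if (pkt.stream_index == target_index) {
        /* process the packet */
    }
    av_packet_unref(&pkt); /* av_free_packet() in older FFmpeg */
}
```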

Related

Why does this FTDI function return zero for the fthandle?

I have an FTDI USB3 development board and some FTDI-provided code for accessing it. The code works fine for things like the device number, VID/PID, etc., but always returns zero for the 'ftHandle'. As the handle is required for driving the board, this is not helpful! Can anyone see why this should happen?
static FT_STATUS displayDevicesMethod2(void)
{
    FT_STATUS ftStatus;
    FT_HANDLE ftHandle = NULL;
    // Get and display the list of devices connected.
    // First call FT_CreateDeviceInfoList to get the number of connected devices.
    // Then either call FT_GetDeviceInfoList or FT_GetDeviceInfoDetail to display device info.
    // Device info: Flags (usb speed), device type (600 e.g.), device ID (vendor, product),
    // handle for subsequent data access.
    DWORD numDevs = 0;
    ftStatus = FT_CreateDeviceInfoList(&numDevs); // Build a list and return number connected.
    if (FT_FAILED(ftStatus))
    {
        printf("Failed to create a device list, status = %d\n", ftStatus);
    }
    printf("Successfully created a device list.\n\tNumber of connected devices: %d\n", numDevs);
    // Method 2: using FT_GetDeviceInfoDetail
    if (!FT_FAILED(ftStatus) && numDevs > 0)
    {
        ftHandle = NULL;
        DWORD Flags = 0;
        DWORD Type = 0;
        DWORD ID = 0;
        char SerialNumber[16] = { 0 };
        char Description[32] = { 0 };
        for (DWORD i = 0; i < numDevs; i++)
        {
            ftStatus = FT_GetDeviceInfoDetail(i, &Flags, &Type, &ID, NULL, SerialNumber,
                                              Description, &ftHandle);
            if (!FT_FAILED(ftStatus))
            {
                printf("Device[%d] (using FT_GetDeviceInfoDetail)\n", i);
                printf("\tFlags: 0x%x %s | Type: %d | ID: 0x%08X | ftHandle=0x%p\n",
                       Flags,
                       Flags & FT_FLAGS_SUPERSPEED ? "[USB 3]" :
                       Flags & FT_FLAGS_HISPEED ? "[USB 2]" :
                       Flags & FT_FLAGS_OPENED ? "[OPENED]" : "",
                       Type,
                       ID,
                       ftHandle);
                printf("\tSerialNumber=%s\n", SerialNumber);
                printf("\tDescription=%s\n", Description);
            }
        }
    }
    return ftStatus;
}
This is indeed not entirely straightforward, but a short peek in the FTDI Knowledge Base yields:
This function builds a device information list and returns the number of D2XX devices connected to the system. The list contains information about both unopen and open devices.
A handle only exists for an opened device, so I assume your code does not include that step yet. If so, you need to open the device first, e.g. using FT_Open. There are plenty of examples available; you can check FTDI's page or Stack Overflow for a working example.
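For illustration, a minimal (untested) sketch of obtaining a usable handle; FT_Open and FT_Close are the relevant D2XX calls, and the device index 0 is an assumption for a single attached device:

```c
FT_HANDLE ftHandle = NULL;
FT_STATUS ftStatus = FT_Open(0, &ftHandle); /* open the device at index 0 */
if (ftStatus != FT_OK)
{
    printf("FT_Open failed, status = %d\n", (int)ftStatus);
}
else
{
    /* ftHandle is now non-zero and usable for FT_Read/FT_Write etc. */
    FT_Close(ftHandle);
}
```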

Ogg opus granule position to timestamp

With the ultimate aim of cropping/cutting/trimming an Ogg file containing a single Opus stream,
I'm trying to retrieve and filter the Ogg pages from the file; those which sit inside the crop window of startTimeMs and endTimeMs I'll append to the two Ogg header pages, resulting in a trimmed Opus file without transcoding.
I have reached a stage where I have access to the Ogg pages, but I'm confused about how to determine whether a page lies in the crop window or not.
OggOpusStream oggOpusStream = OggOpusStream.from("audio/technology.opus");
// Get ID Header
IdHeader idHeader = oggOpusStream.getIdHeader();
// Get Comment Header
CommentHeader commentHeader = oggOpusStream.getCommentHeader();
while (true) {
    AudioDataPacket audioDataPacket = oggOpusStream.readAudioPacket();
    if (audioDataPacket == null) break;
    for (OpusPacket opusPacket : audioDataPacket.getOpusPackets()) {
        if (packetLiesWithinTrimRange(opusPacket)) { writeToOutput(opusPacket); }
    }
}
// Create an output stream
OutputStream outputStream = ...;
// Create a new Ogg page
OggPage oggPage = OggPage.empty();
// Set header fields by calling setX() method
oggPage.setSerialNum(100);
// Add a data packet to this page
oggPage.addDataPacket(audioDataPacket.dump());
// Call dump() method to dump the OggPage object to byte array binary
byte[] binary = oggPage.dump();
// Write the binary to stream
outputStream.write(binary);
It should work if I could complete this method:
private boolean packetLiesWithinTrimRange(OpusPacket packet){
    if(????????){ return true; }
    return false;
}
or maybe
private boolean pageLiesWithinTrimRange(OggPage page){
    if(????????){ return true; }
    return false;
}
Any ogg/opus help is appreciated
https://github.com/leonfancy/oggus/issues/2
OggPage.java with private long granulePosition;
https://github.com/leonfancy/oggus/blob/master/src/main/java/org/chenliang/oggus/ogg/OggPage.java
Ogg Encapsulation for the Opus Audio Codec
https://datatracker.ietf.org/doc/html/rfc7845
An audio page's end time can be calculated using the stream's pre-skip and the first/initial granule position. The page's start time can be obtained from the previous page's end time. See the pseudocode below; note that the granule position counts 48 kHz samples, so converting to milliseconds requires multiplying by 1000 before dividing by the sample rate:
sampleRate = 48_000
streamPreskip = ...
streamGranulePosFirst = ...

isPageWithinTimeRange(page, prevPage, msStart, msEnd) {
    pageMsStart = getPageMsEnd(prevPage)
    pageMsEnd = getPageMsEnd(page)
    return (pageMsStart >= msStart && pageMsEnd <= msEnd)
}

getPageMsEnd(page) {
    return (page.granulePos - streamGranulePosFirst - streamPreskip) * 1000 / sampleRate
}

ALSA Hooks -- Modifying audio as it's being played

I'm at a loss with this: I have several personal projects in mind that essentially require that I "tap" into the audio stream, read the audio data, do some processing, and modify the audio data before it is finally sent to the audio device.
One example of these personal projects is a software-based active crossover. If I have an audio device with 6 channels (i.e., 3 left + 3 right), then I can read the data, apply an LP filter (×2, left + right), a BP filter, and an HP filter, and output the streams through each of the six channels.
Note that I know how to write a player application that does this. Instead, I want to do this so that any audio from any source (audio players, video players, YouTube or any other audio played by the web browser, etc.) is subject to this processing.
I've seen some of the examples (e.g., pcm_min.c from the alsa-project web site, play and record examples in the Linux Journal article by Jeff Tranter from Sep 2004) but I don't seem to have enough information to do something like what I describe above.
Any help or pointers will be appreciated.
You can implement your project as a LADSPA plugin, test it with audacity or any other program supporting LADSPA plugins, and when you like it, insert it into alsa/pulseaudio/jack playback chain.
"LADSPA" is a single header file defining a simple interface to write audio processing plugins. Each plugin has its input/output/control ports and run() function. The run() function is executed for each block of samples to do actual audio processing — apply "control" arguments to "input" buffers and write result to "output" buffers.
Example LADSPA stereo amplifier plugin (single control argument: "Amplification factor", two input ports, two output ports):
// gcc -shared -o /full/path/to/plugindir/amp_example.so amp_example.c
#include <stdlib.h>
#include "ladspa.h"

enum PORTS {
    PORT_CAMP,
    PORT_INPUT1,
    PORT_INPUT2,
    PORT_OUTPUT1,
    PORT_OUTPUT2
};

typedef struct {
    LADSPA_Data *c_amp;
    LADSPA_Data *i_audio1;
    LADSPA_Data *i_audio2;
    LADSPA_Data *o_audio1;
    LADSPA_Data *o_audio2;
} MyAmpData;

static LADSPA_Handle myamp_instantiate(const LADSPA_Descriptor *Descriptor, unsigned long SampleRate)
{
    MyAmpData *data = (MyAmpData*)malloc(sizeof(MyAmpData));
    data->c_amp = NULL;
    data->i_audio1 = NULL;
    data->i_audio2 = NULL;
    data->o_audio1 = NULL;
    data->o_audio2 = NULL;
    return data;
}

static void myamp_connect_port(LADSPA_Handle Instance, unsigned long Port, LADSPA_Data *DataLocation)
{
    MyAmpData *data = (MyAmpData*)Instance;
    switch (Port)
    {
        case PORT_CAMP:    data->c_amp    = DataLocation; break;
        case PORT_INPUT1:  data->i_audio1 = DataLocation; break;
        case PORT_INPUT2:  data->i_audio2 = DataLocation; break;
        case PORT_OUTPUT1: data->o_audio1 = DataLocation; break;
        case PORT_OUTPUT2: data->o_audio2 = DataLocation; break;
    }
}

static void myamp_run(LADSPA_Handle Instance, unsigned long SampleCount)
{
    MyAmpData *data = (MyAmpData*)Instance;
    double amp = *data->c_amp;
    size_t i;
    for (i = 0; i < SampleCount; i++)
    {
        data->o_audio1[i] = data->i_audio1[i]*amp;
        data->o_audio2[i] = data->i_audio2[i]*amp;
    }
}

static void myamp_cleanup(LADSPA_Handle Instance)
{
    MyAmpData *data = (MyAmpData*)Instance;
    free(data);
}

static LADSPA_Descriptor myampDescriptor = {
    .UniqueID = 123, // for public release see http://ladspa.org/ladspa_sdk/unique_ids.html
    .Label = "amp_example",
    .Name = "My Amplify Plugin",
    .Maker = "alsauser",
    .Copyright = "WTFPL",
    .PortCount = 5,
    .PortDescriptors = (LADSPA_PortDescriptor[]){
        LADSPA_PORT_INPUT  | LADSPA_PORT_CONTROL,
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,
        LADSPA_PORT_INPUT  | LADSPA_PORT_AUDIO,
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO,
        LADSPA_PORT_OUTPUT | LADSPA_PORT_AUDIO
    },
    .PortNames = (const char * const[]){
        "Amplification factor",
        "Input left",
        "Input right",
        "Output left",
        "Output right"
    },
    .PortRangeHints = (LADSPA_PortRangeHint[]){
        { /* PORT_CAMP */
            LADSPA_HINT_BOUNDED_BELOW | LADSPA_HINT_BOUNDED_ABOVE | LADSPA_HINT_DEFAULT_1,
            0,  /* LowerBound */
            10  /* UpperBound */
        },
        {0, 0, 0}, /* PORT_INPUT1 */
        {0, 0, 0}, /* PORT_INPUT2 */
        {0, 0, 0}, /* PORT_OUTPUT1 */
        {0, 0, 0}  /* PORT_OUTPUT2 */
    },
    .instantiate = myamp_instantiate,
    //.activate = myamp_activate,
    .connect_port = myamp_connect_port,
    .run = myamp_run,
    //.deactivate = myamp_deactivate,
    .cleanup = myamp_cleanup
};

// NULL-terminated list of plugins in this library
const LADSPA_Descriptor *ladspa_descriptor(unsigned long Index)
{
    if (Index == 0)
        return &myampDescriptor;
    else
        return NULL;
}
(if you prefer "short" 40-lines version see https://pastebin.com/unCnjYfD)
Add as many input/output channels as you need, implement your code in myamp_run() function. Build the plugin and set LADSPA_PATH environment variable to the directory where you've built it, so that other apps could find it:
export LADSPA_PATH=/usr/lib/ladspa:/full/path/to/plugindir
Test it in audacity or any other program supporting LADSPA plugins. To test it in terminal you can use applyplugin tool from "ladspa-sdk" package:
applyplugin input.wav output.wav /full/path/to/plugindir/amp_example.so amp_example 2
And if you like the result insert it into your default playback chain. For plain alsa you can use a config like (won't work for pulse/jack):
# ~/.asoundrc
pcm.myamp {
    type plug
    slave.pcm {
        type ladspa
        path "/usr/lib/ladspa"   # required but ignored as `filename` is set
        slave.pcm "sysdefault"
        playback_plugins [{
            filename "/full/path/to/plugindir/amp_example.so"
            label "amp_example"
            input.controls [ 2.0 ]   # Amplification=2
        }]
    }
}
# to test it: aplay -Dmyamp input.wav
# to point "default" pcm to it uncomment next line:
#pcm.!default "myamp"
See also:
ladspa.h - answers to most technical questions are there in comments
LADSPA SDK overview
listplugins and analyseplugin tools from "ladspa-sdk" package
alsa plugins : "type ladspa" syntax, and alsa configuration file syntax
ladspa plugins usage examples
If you want to get your hands dirty with some code, you could check out some of these articles by Paul Davis (a Linux audio guru). You'll have to combine the playback and capture examples to get live audio. Give it a shot, and if you have problems you can post a code-specific question on SO.
Once you get the live audio working, you can implement an LP filter and go from there.
There are plenty of LADSPA and LV2 audio plugins that implement LP, HP and BP filters but I'm not sure if any are available for your particular channel configuration. It sounds like you want to roll your own anyway.

Strange noise and abnormalities when writing audio data from libspotify into a file

Currently we're implementing libspotify on a Windows 7 64-bit system. Everything seems to work fine except the playback. We get data from the callback, but even the saved audio, inspected in Audacity, is filled with abnormalities. To research further we took the Win32 sample (spshell) and modified it to save the music data to a file. Same problem: it's definitely the music, but with these ticks in it. I'm sure there's something simple I'm missing here, but I'm at a loss as to what the problem could be. Any help would be great since, as it stands, our project is at a standstill until we can resolve this.
The audio saved can be viewed here
http://uploader.crestron.com/download.php?file=8001d80992480280dba365752aeaca81
Below are the code changes I made to save the file ( for testing only )
static FILE *pFile;
int numBytesToWrite = 0;
CRITICAL_SECTION m_cs;

int SP_CALLCONV music_delivery(sp_session *s, const sp_audioformat *fmt, const void *frames, int num_frames)
{
    if (num_frames == 0)
        return 0;
    EnterCriticalSection(&m_cs);
    numBytesToWrite = num_frames * fmt->channels * sizeof(short);
    if (numBytesToWrite > 0)
        fwrite(frames, sizeof(short), numBytesToWrite, pFile);
    LeaveCriticalSection(&m_cs);
    return num_frames;
}

static void playtrack_test(void)
{
    sp_error err;
    InitializeCriticalSection(&m_cs);
    pFile = fopen("C:\\zzzspotify.pcm", "wb");
    test_start(&playtrack);
    if((err = sp_session_player_load(g_session, stream_track)) != SP_ERROR_OK) {
        test_report(&playtrack, "Unable to load track: %s", sp_error_message(err));
        return;
    }
    info_report("Streaming '%s' by '%s' this will take a while", sp_track_name(stream_track),
                sp_artist_name(sp_track_artist(stream_track, 0)));
    sp_session_player_play(g_session, 1);
}

void SP_CALLCONV play_token_lost(sp_session *s)
{
    fclose(pFile);
    DeleteCriticalSection(&m_cs);
    stream_track_end = 2;
    notify_main_thread(g_session);
    info_report("Playtoken lost");
}

static int check_streaming_done(void)
{
    if(stream_track_end == 2)
        test_report(&playtrack, "Playtoken lost");
    else if(stream_track_end == 1)
        test_ok(&playtrack);
    else
        return 0;
    fclose(pFile);
    stream_track_end = 0;
    return 1;
}
It looks like this is the problem:
fwrite(frames, sizeof(short), numBytesToWrite, pFile);
The fwrite documentation states that the second argument is the "size in bytes of each element to be written", and the third is this "number of elements, each one with a size of size bytes".
The way you're calling fwrite will tell it to write numBytesToWrite * sizeof(short) bytes, which will run right off the end of the given buffer. I'm actually surprised it doesn't crash!
I'd suggest changing your fwrite call to something like:
fwrite(frames, sizeof(char), numBytesToWrite, pFile);
or:
int numSamplesToWrite = num_frames * fmt->channels;
fwrite(frames, sizeof(short), numSamplesToWrite, pFile);
Edit: after looking at your audio in detail, I'm more convinced that this is the case. The song seems to be playing at half speed (i.e., 2x as much data is being written) and the artefacts look like buffer overrun into random memory.

Is there a demo for add effects and export to wav files?

Is there a demo for adding effects and exporting to WAV files?
I have searched, but have not found a way to solve it.
I want to add effects to an input.wav file, play it, and then export a new WAV file with the effects applied. Please help me.
my code is :
result = FMOD::System_Create(&system);
ERRCHECK(result);
result = system->getVersion(&version);
if (FMOD_OK != result) {
    printf("FMOD lib version %08x doesn't match header version %08x", version, FMOD_VERSION);
}
// result = system->setOutput(FMOD_OUTPUTTYPE_WAVWRITER);
// ERRCHECK(result);
char cDest[200] = {0};
NSString *fileName = [NSString stringWithFormat:@"%@/addeffects_sound.wav", [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) objectAtIndex:0]];
[fileName getCString:cDest maxLength:200 encoding:NSASCIIStringEncoding];
result = system->init(32, FMOD_INIT_NORMAL | FMOD_INIT_PROFILE_ENABLE, cDest);
//result = system->init(32, FMOD_INIT_NORMAL, extradriverdata);
ERRCHECK(result);
result = system->getMasterChannelGroup(&mastergroup);
ERRCHECK(result);
[self createAllDSP];

-(void)createSound:(NSString *)filename
{
    //printf("really path = %s", getPath(filename));
    result = system->createSound(getPath(filename), FMOD_DEFAULT, 0, &sound);
    ERRCHECK(result);
    [self playSound];
}

-(void)playSound
{
    result = system->playSound(sound, 0, false, &channel);
    ERRCHECK(result);
    //result = channel->setLoopCount(1);
    //ERRCHECK(result);
}
Your question is quite broad; I encourage you to refocus it on the specific areas you are having trouble with.
To answer your question generally though, there are several APIs you will need to achieve your goal, you have some of them in your code.
To get the FMOD system ready to output a .wav:
System_Create
System::setOutput
System::init
To create and prepare the desired effect:
System::createDSPByType
System::addDSP
To create and play the desired sound:
System::createSound
System::playSound
To check when the sound is done:
System::update
Channel::isPlaying
To shut down and finalize the .wav:
Sound::release
System::release
This is a basic outline of one way you can achieve your goal with FMOD.