I'm writing code that merges multiple audio files (with different formats) into a single audio file. When I set the encoder's sample_rate and sample_fmt to match the input files, I have no problem merging the audio. However, not all of the input audio formats match the output format, so I have to do format conversion. I tried to use "avresample" for this purpose, but I could not manage to encode the output frames when the sample_rate and sample_fmt differ between input and output.
It could be done manually (by sample dropping, interpolation, etc.), but since libav provides a conversion API, I think this can (and probably should, for tidiness) be done automatically.
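For reference, my understanding of the intended libavresample flow (from the API documentation) is: allocate a context, set the in/out options, open it with avresample_open(), convert with avresample_convert(), and drain with avresample_read()/avresample_available(). Below is a minimal, generic sketch of that flow, not my actual code: convertFrame is just an illustrative helper name, error checks are omitted, and in real code the context would be created once rather than per frame.

// Generic libavresample sketch: convert one planar S16 frame from 44100 Hz to 48000 Hz stereo.
extern "C" {
#include <libavresample/avresample.h>
#include <libavutil/channel_layout.h>
#include <libavutil/frame.h>
#include <libavutil/opt.h>
#include <libavutil/samplefmt.h>
}

static int convertFrame(AVFrame* inFrame, uint8_t** outData, int outPlaneSize, int maxOutSamples)
{
    AVAudioResampleContext* avr = avresample_alloc_context();
    av_opt_set_int(avr, "in_channel_layout",  AV_CH_LAYOUT_STEREO, 0);  // a layout constant, not a channel count
    av_opt_set_int(avr, "in_sample_rate",     44100,               0);
    av_opt_set_int(avr, "in_sample_fmt",      AV_SAMPLE_FMT_S16P,  0);
    av_opt_set_int(avr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
    av_opt_set_int(avr, "out_sample_rate",    48000,               0);
    av_opt_set_int(avr, "out_sample_fmt",     AV_SAMPLE_FMT_S16P,  0);
    avresample_open(avr);                     // the context must be opened before converting

    // The converter buffers samples internally, so the number of output samples
    // usually differs from the number of input samples. Plane sizes are in bytes.
    int outSamples = avresample_convert(avr,
                                        outData, outPlaneSize, maxOutSamples,
                                        inFrame->data, inFrame->linesize[0], inFrame->nb_samples);

    avresample_close(avr);
    avresample_free(&avr);
    return outSamples;
}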
Here is how I set the encoder and resampling context parameters:
AVCodecContext* avAudioEncoder = outputAudioStream->codec;
AVCodec * audioEncoder = avcodec_find_encoder(AV_CODEC_ID_MP3);
avcodec_get_context_defaults3(avAudioEncoder, audioEncoder);
avAudioEncoder->sample_fmt = AV_SAMPLE_FMT_S16P;
avAudioEncoder->sample_rate = 48000;
avAudioEncoder->channels = 2;
avAudioEncoder->time_base.num = 1;
avAudioEncoder->time_base.den = 48000;
avAudioEncoder->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;
if (outputAVFormat->oformat->flags & AVFMT_GLOBALHEADER)
{
avAudioEncoder->flags |= CODEC_FLAG_GLOBAL_HEADER;
}
avcodec_open2(avAudioEncoder, audioEncoder, nullptr);
std::shared_ptr<AVAudioResampleContext> avAudioResampleContext(avresample_alloc_context(), [](AVAudioResampleContext * avARC){avresample_close(avARC), avresample_free(&avARC); });
av_opt_set_int(avAudioResampleContext.get(), "in_channel_layout", 2, 0);
av_opt_set_int(avAudioResampleContext.get(), "in_sample_rate", 44100, 0);
av_opt_set_int(avAudioResampleContext.get(), "in_sample_fmt", AV_SAMPLE_FMT_S16P, 0);
av_opt_set_int(avAudioResampleContext.get(), "out_channel_layout", avAudioEncoder->channels, 0);
av_opt_set_int(avAudioResampleContext.get(), "out_sample_rate", avAudioEncoder->sample_rate, 0);
av_opt_set_int(avAudioResampleContext.get(), "out_sample_fmt", avAudioEncoder->sample_fmt, 0);
And here is how I read and encode the frames:
...
int result = avcodec_decode_audio4(avAudioDecoder.get(), audioFrame.get(), &isFrameAvailable, &decodingPacket);
...
if (isFrameAvailable)
{
decodingPacket.size -= result;
decodingPacket.data += result;
encodeAudioFrame->format = outputAudioStream->codec->sample_fmt;
encodeAudioFrame->channel_layout = outputAudioStream->codec->channel_layout;
auto available = avresample_available(avAudioResampleContext.get());
auto delay = avresample_get_delay(avAudioResampleContext.get());
encodeAudioFrame->nb_samples = available + av_rescale_rnd( delay + audioFrame->nb_samples, avAudioEncoder->sample_rate, audioStream->codec->sample_rate, AV_ROUND_ZERO);
int linesize;
av_samples_alloc(encodeAudioFrame->data, &linesize, avAudioEncoder->channels, encodeAudioFrame->nb_samples, avAudioEncoder->sample_fmt, 1);
encodeAudioFrame->linesize[0] = linesize;
avresample_convert(avAudioResampleContext.get(), nullptr, encodeAudioFrame->linesize[0], encodeAudioFrame->nb_samples, &audioFrame->data[0], audioFrame->linesize[0], audioFrame->nb_samples*outputAudioStream->codec->channels);
std::shared_ptr<AVPacket> outPacket(new AVPacket, [](AVPacket* p){ av_free_packet(p); delete p; });
av_init_packet(outPacket.get());
outPacket->data = nullptr;
outPacket->size = 0;
while (avresample_available(avAudioResampleContext.get()) >= encodeAudioFrame->nb_samples)
{
avresample_read(avAudioResampleContext.get(), &encodeAudioFrame->data[0], encodeAudioFrame->nb_samples*outputAudioStream->codec->channels);
encodeAudioFrame->pts = av_rescale_q(++encodedAudioPts, outputAudioStream->codec->time_base, outputAudioStream->time_base);
encodeAudioFrame->pts *= avAudioEncoder->frame_size;
...
auto ret = avcodec_encode_audio2(avAudioEncoder, outPacketPtr, encodeAudioFramePtr, &got_output);
...
}
It seems I can't use avresample properly, but I could not figure out how to solve this problem. Any help will be appreciated.
I'm trying to write an app that will listen to my computer's audio and transcribe it using Google Speech Recognition.
I've been able to record the system sound using WasapiLoopbackCapture, and I've been able to use the Google streaming recognition API with test files, but I was not able to merge the two together.
When I stream the audio from WasapiLoopbackCapture to Google, it doesn't return any results.
I've based my code on the Google code sample at:
https://github.com/GoogleCloudPlatform/dotnet-docs-samples/blob/9588cee6d96bfe484c8e189e9ac2f6eaa3c3b002/speech/api/Recognize/InfiniteStreaming.cs#L225
private WaveInEvent StartListening()
{
var waveIn = new WaveInEvent
{
DeviceNumber = 0,
WaveFormat = new WaveFormat(SampleRate, ChannelCount)
};
waveIn.DataAvailable += (sender, args) =>
_microphoneBuffer.Add(ByteString.CopyFrom(args.Buffer, 0, args.BytesRecorded));
waveIn.StartRecording();
return waveIn;
}
And adjusted it to use the WasapiLoopbackCapture:
private IDisposable StartListening()
{
var waveIn = new WasapiLoopbackCapture();
//var waveIn = new WaveInEvent
//{
// DeviceNumber = 0,
// WaveFormat = new WaveFormat(SampleRate, ChannelCount)
//};
SampleRate = waveIn.WaveFormat.SampleRate;
ChannelCount = waveIn.WaveFormat.Channels;
BytesPerSecond = SampleRate * ChannelCount * BytesPerSample;
Console.WriteLine(SampleRate);
Console.WriteLine(BytesPerSecond);
waveIn.DataAvailable += (sender, args) =>
_microphoneBuffer.Add(ByteString.CopyFrom(args.Buffer, 0, args.BytesRecorded));
waveIn.StartRecording();
return waveIn;
}
But it doesn't return any transcribed text.
I've saved the input stream to a file, and it played OK, so the sound is getting there. My guess is that the WaveFormat received from WasapiLoopbackCapture is not compatible with what Google expects; I tried some conversions but couldn't get them to work.
I've reviewed the following topics on Stack Overflow, but still couldn't get it to work:
Resampling WasapiLoopbackCapture
Naudio - Convert 32 bit wav to 16 bit wav
And tried combining them both:
private IDisposable StartListening()
{
var waveIn = new WasapiLoopbackCapture();
//var waveIn = new WaveInEvent
//{
//DeviceNumber = 0,
//WaveFormat = new WaveFormat(SampleRate, ChannelCount)
//};
// SampleRate = waveIn.WaveFormat.SampleRate;
// ChannelCount = waveIn.WaveFormat.Channels;
// BytesPerSecond = waveIn.WaveFormat.AverageBytesPerSecond;// SampleRate * ChannelCount * BytesPerSample;
var target = new WaveFormat(SampleRate, 16, 1);
var writer = new WaveFileWriter(@"c:\temp\xx.wav", waveIn.WaveFormat);
Console.WriteLine(SampleRate);
Console.WriteLine(BytesPerSecond);
var stop = false;
waveIn.DataAvailable += (sender, args) =>
{
var a = args;
byte[] newArray16Bit = new byte[args.BytesRecorded / 2];
short two;
float value;
for (int i = 0, j = 0; i < args.BytesRecorded; i += 4, j += 2)
{
value = (BitConverter.ToSingle(args.Buffer, i));
two = (short)(value * short.MaxValue);
newArray16Bit[j] = (byte)(two & 0xFF);
newArray16Bit[j + 1] = (byte)((two >> 8) & 0xFF);
}
var resampleStream = new NAudio.Wave.Compression.AcmStream(
    new WaveFormat(waveIn.WaveFormat.SampleRate, 16, waveIn.WaveFormat.Channels), target);
Buffer.BlockCopy(newArray16Bit, 0, resampleStream.SourceBuffer, 0, a.BytesRecorded/2);
int sourceBytesConverted = 0;
var bytes = resampleStream.Convert(a.BytesRecorded/2, out sourceBytesConverted);
var converted = new byte[bytes];
Buffer.BlockCopy(resampleStream.DestBuffer, 9, converted,0, bytes);
a = new WaveInEventArgs(converted,bytes);
_microphoneBuffer.Add(ByteString.CopyFrom(a.Buffer, 0, a.BytesRecorded));
if (writer != null)
{
writer.Write(a.Buffer, 0, a.BytesRecorded);
if (writer.Position > waveIn.WaveFormat.AverageBytesPerSecond * 5)
{
stop = true;
writer.Dispose();
writer = null;
Console.WriteLine("Saved file");
}
}
};
waveIn.StartRecording();
return waveIn;
}
But it doesn't work.
I'm not sure if this is the right path.
A code sample of a fix would be highly appreciated.
I tried converting the bit rate, etc., but couldn't get this to work.
I am new to Direct3D 11 and I am currently trying to create a texture programmatically using this code I found online:
// Some Constants
int w = 256;
int h = 256;
int bpp = 4;
int *buf = new int[w*h];
//declarations
ID3D11Texture2D* tex;
D3D11_TEXTURE2D_DESC sTexDesc;
D3D11_SUBRESOURCE_DATA tbsd;
// filling the image
for (int i = 0; i<h; i++)
for (int j = 0; j<w; j++)
{
if ((i & 32) == (j & 32))
buf[i*w + j] = 0x00000000;
else
buf[i*w + j] = 0xffffffff;
}
// setting up D3D11_SUBRESOURCE_DATA
tbsd.pSysMem = (void *)buf;
tbsd.SysMemPitch = w*bpp;
tbsd.SysMemSlicePitch = w*h*bpp; // Not needed since this is a 2d texture
// initializing sTexDesc
sTexDesc.Width = w;
sTexDesc.Height = h;
sTexDesc.MipLevels = 1;
sTexDesc.ArraySize = 1;
sTexDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
sTexDesc.SampleDesc.Count = 1;
sTexDesc.SampleDesc.Quality = 0;
sTexDesc.Usage = D3D11_USAGE_DEFAULT;
sTexDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
sTexDesc.CPUAccessFlags = 0;
sTexDesc.MiscFlags = 0;
hr = m_pd3dDevice->CreateTexture2D(&sTexDesc, &tbsd, &tex);
and that's all fine and dandy, but I am a bit confused about how to actually load this into the shader. Below I initialized this ID3D11ShaderResourceView:
ID3D11ShaderResourceView* m_pTextureRV = nullptr;
I found in the Microsoft tutorials that I need to use CreateShaderResourceView. But how exactly do I use it? I tried this:
hr = m_pd3dDevice->CreateShaderResourceView(tex, NULL , m_pTextureRV);
but it gives me an error, telling me that m_pTextureRV is not a valid argument for the function. What am I doing wrong here?
The correct way to call that function is:
hr = m_pd3dDevice->CreateShaderResourceView(tex, nullptr, &m_pTextureRV);
Remember that ID3D11ShaderResourceView* is a pointer to an interface. You need a pointer-to-a-pointer to get a new instance of one back.
You should really consider using a COM smart-pointer like Microsoft::WRL::ComPtr instead of raw pointers for these interfaces.
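For instance, here is a minimal sketch of those same two calls rewritten with ComPtr (reusing the m_pd3dDevice, sTexDesc, and tbsd variables from the question above; this is illustrative, not a drop-in replacement):

#include <d3d11.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// ComPtr releases the underlying COM interface automatically when it goes out of scope.
ComPtr<ID3D11Texture2D> tex;
ComPtr<ID3D11ShaderResourceView> textureRV;

HRESULT hr = m_pd3dDevice->CreateTexture2D(&sTexDesc, &tbsd, tex.GetAddressOf());
if (SUCCEEDED(hr))
{
    // GetAddressOf() hands the API the ID3D11ShaderResourceView** it expects.
    hr = m_pd3dDevice->CreateShaderResourceView(tex.Get(), nullptr, textureRV.GetAddressOf());
}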
Once you have created the shader resource view for your texture object, then you need to associate it with whatever slot the HLSL expects to find it in. So, for example, if you were to write an HLSL source file as:
Texture2D texture : register( t0 );
SamplerState sampler: register( s0 );
float4 PS(float2 tex : TEXCOORD0) : SV_Target
{
return texture.Sample( sampler, tex );
}
Then compile it as a Pixel Shader, and bind it to the render pipeline via PSSetShader. Then you'd need to call:
ID3D11ShaderResourceView* srv[1] = { m_pTextureRV };
m_pImmediateContext->PSSetShaderResources( 0, 1, srv );
Of course you also need an ID3D11SamplerState* sampler bound as well:
ID3D11SamplerState* m_pSamplerLinear = nullptr;
D3D11_SAMPLER_DESC sampDesc = {};
sampDesc.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR;
sampDesc.AddressU = D3D11_TEXTURE_ADDRESS_WRAP;
sampDesc.AddressV = D3D11_TEXTURE_ADDRESS_WRAP;
sampDesc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP;
sampDesc.ComparisonFunc = D3D11_COMPARISON_NEVER;
sampDesc.MinLOD = 0;
sampDesc.MaxLOD = D3D11_FLOAT32_MAX;
hr = m_pd3dDevice->CreateSamplerState( &sampDesc, &m_pSamplerLinear );
Then when you are about to draw:
m_pImmediateContext->PSSetSamplers( 0, 1, &m_pSamplerLinear );
I strongly recommend you check out the DirectX Tool Kit and the tutorials there.
I am currently saving an audio stream as a .wav file in my Windows Phone 8 application. What changes do I need to make to save the stream as an .aac file instead?
public void saveAudioBuffer()
{
IsolatedStorageFile myStore = IsolatedStorageFile.GetUserStoreForApplication();
//create a random file name and then assign
Random rand = new Random(DateTime.Now.Millisecond);
int rvalue = rand.Next(1000000000, 2000000000);
newAudioFileName = "NEW_AUDIO_" + rvalue + "_FILE_NAME.wav";
string fileName = ConfigurationInfo.savedFilesDirectory + newAudioFileName;
if(myStore.FileExists(fileName))
myStore.DeleteFile(fileName);
try
{
using (var isoFileStream = new IsolatedStorageFileStream(
fileName,
FileMode.OpenOrCreate,
myStore))
{
// Write a header before the actual pcm data
int sampleBits = 16;
int sampleBytes = sampleBits / 8;
int byteRate = microphone.SampleRate * sampleBytes * 1;
int blockAlign = sampleBytes * 1;
Encoding encoding = Encoding.UTF8;
isoFileStream.Write(encoding.GetBytes("RIFF"), 0, 4); // "RIFF"
isoFileStream.Write(BitConverter.GetBytes(0), 0, 4); // Chunk Size
isoFileStream.Write(encoding.GetBytes("WAVE"), 0, 4); // Format - "Wave"
isoFileStream.Write(encoding.GetBytes("fmt "), 0, 4); // sub chunk - "fmt"
isoFileStream.Write(BitConverter.GetBytes(16), 0, 4); // sub chunk size
isoFileStream.Write(BitConverter.GetBytes((short)1), 0, 2); // audio format
isoFileStream.Write(BitConverter.GetBytes((short)1), 0, 2); // num of channels
isoFileStream.Write(BitConverter.GetBytes(microphone.SampleRate), 0, 4); // sample rate
isoFileStream.Write(BitConverter.GetBytes(byteRate), 0, 4); // byte rate
isoFileStream.Write(BitConverter.GetBytes((short)(blockAlign)), 0, 2); // block align
isoFileStream.Write(BitConverter.GetBytes((short)(sampleBits)), 0, 2); // bits per sample
isoFileStream.Write(encoding.GetBytes("data"), 0, 4); // sub chunk - "data"
isoFileStream.Write(BitConverter.GetBytes(0), 0, 4); // sub chunk size
// write the actual pcm data
stream.Position = 0;
stream.CopyTo(isoFileStream);
// and fill in the blanks
long previousPos = isoFileStream.Position;
isoFileStream.Seek(4, SeekOrigin.Begin);
isoFileStream.Write(BitConverter.GetBytes((int)isoFileStream.Length - 8), 0, 4);
isoFileStream.Seek(40, SeekOrigin.Begin);
isoFileStream.Write(BitConverter.GetBytes((int)isoFileStream.Length - 44), 0, 4);
isoFileStream.Seek(previousPos, SeekOrigin.Begin);
isoFileStream.Flush();
}
}
catch
{
MessageBox.Show("Error while trying to store audio stream.");
}
}
In my application, I have used a Microphone object from Microsoft.Xna.Framework.Audio to capture the sound.
I am running into some problems trying to create a ProRes-encoded .mov file using the AVFoundation framework and AVAssetWriter.
This is on OS X 10.10.5, using Xcode 7, linking against the 10.9 libraries.
So far I have managed to create valid ProRes files that contain both video and multiple channels of audio.
(I am creating multiple tracks of uncompressed 48 kHz, 16-bit PCM audio.)
Adding the video frames works well, and adding the audio frames works well, or at least succeeds in the code.
However, when I play the file back, it appears as though the audio frames are repeated in 12-, 13-, 14-, or 15-frame sequences.
Looking at the waveform from the *.mov, it is easy to see the repeated audio...
That is to say, the first 13 (or X) video frames all contain exactly the same audio; this is then repeated for the next X frames, and so on.
The video is fine; it is just the audio that appears to be looping/repeating.
The issue appears no matter how many audio channels/tracks I use as the source; I have tested using just 1 track and also using 4 and 8 tracks.
It is independent of the format and number of samples I feed to the system; 720p60, 1080p23, and 1080i59 all exhibit the same incorrect behavior.
Well, actually the 720p captures appear to repeat the audio frames 30 or 31 times, while the 1080 formats only repeat them 12 or 13 times.
But I am definitely submitting different audio data to the audio encode / sample-buffer creation process, as I have logged this in great detail (though it is not shown in the code below).
I have tried a number of different things to modify the code and expose the issue, but had no success, hence I am asking here; hopefully someone can either spot an issue with my code or give me some information about this problem.
The code I am using is as follows:
int main(int argc, const char * argv[])
{
@autoreleasepool
{
NSLog(@"Hello, World! - Welcome to the ProResCapture With Audio sample app.");
OSStatus status;
AudioStreamBasicDescription audioFormat;
CMAudioFormatDescriptionRef audioFormatDesc;
// OK so lets include the hardware stuff first and then we can see about doing some actual capture and compress stuff
HARDWARE_HANDLE pHardware = sdiFactory();
if (pHardware)
{
unsigned long ulUpdateType = UPD_FMT_FRAME;
unsigned long ulFieldCount = 0;
unsigned int numAudioChannels = 4; //8; //4;
int numFramesToCapture = 300;
gBFHancBuffer = (unsigned int*)myAlloc(gHANC_SIZE);
int audioSize = 2002 * 4 * 16;
short* pAudioSamples = (short*)new char[audioSize];
std::vector<short*> vecOfNonInterleavedAudioSamplesPtrs;
for (int i = 0; i < 16; i++)
{
vecOfNonInterleavedAudioSamplesPtrs.push_back((short*)myAlloc(2002 * sizeof(short)));
}
bool bVideoModeIsValid = SetupAndConfigureHardwareToCaptureIncomingVideo();
if (bVideoModeIsValid)
{
gBFBytes = (BLUE_UINT32*)myAlloc(gGoldenSize);
bool canAddVideoWriter = false;
bool canAddAudioWriter = false;
int nAudioSamplesWritten = 0;
// declare the vars for our various AVAsset elements
AVAssetWriter* assetWriter = nil;
AVAssetWriterInput* assetWriterInputVideo = nil;
AVAssetWriterInput* assetWriterAudioInput[16];
AVAssetWriterInputPixelBufferAdaptor* adaptor = nil;
NSURL* localOutputURL = nil;
NSError* localError = nil;
// create the file we are going to be writing to
localOutputURL = [NSURL URLWithString:@"file:///Volumes/Media/ProResAVCaptureAnyFormat.mov"];
assetWriter = [[AVAssetWriter alloc] initWithURL: localOutputURL fileType:AVFileTypeQuickTimeMovie error:&localError];
if (assetWriter)
{
assetWriter.shouldOptimizeForNetworkUse = NO;
// Lets configure the Audio and Video settings for this writer...
{
// Video First.
// Add a video input
// create a dictionary with the settings we want ie. Prores capture and width and height.
NSMutableDictionary* videoSettings = [NSMutableDictionary dictionaryWithObjectsAndKeys:
AVVideoCodecAppleProRes422, AVVideoCodecKey,
[NSNumber numberWithInt:width], AVVideoWidthKey,
[NSNumber numberWithInt:height], AVVideoHeightKey,
nil];
assetWriterInputVideo = [AVAssetWriterInput assetWriterInputWithMediaType: AVMediaTypeVideo outputSettings:videoSettings];
adaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:assetWriterInputVideo
sourcePixelBufferAttributes:nil];
canAddVideoWriter = [assetWriter canAddInput:assetWriterInputVideo];
}
{ // Add a Audio AssetWriterInput
// Create a dictionary with the settings we want ie. Uncompressed PCM audio 16 bit little endian.
NSMutableDictionary* audioSettings = [NSMutableDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:48000.0], AVSampleRateKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
[NSNumber numberWithUnsignedInteger:1], AVNumberOfChannelsKey,
nil];
// OR use... FillOutASBDForLPCM(AudioStreamBasicDescription& outASBD, Float64 inSampleRate, UInt32 inChannelsPerFrame, UInt32 inValidBitsPerChannel, UInt32 inTotalBitsPerChannel, bool inIsFloat, bool inIsBigEndian, bool inIsNonInterleaved = false)
UInt32 inValidBitsPerChannel = 16;
UInt32 inTotalBitsPerChannel = 16;
bool inIsFloat = false;
bool inIsBigEndian = false;
UInt32 inChannelsPerTrack = 1;
FillOutASBDForLPCM(audioFormat, 48000.00, inChannelsPerTrack, inValidBitsPerChannel, inTotalBitsPerChannel, inIsFloat, inIsBigEndian);
status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
&audioFormat,
0,
NULL,
0,
NULL,
NULL,
&audioFormatDesc
);
for (int t = 0; t < numAudioChannels; t++)
{
assetWriterAudioInput[t] = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:audioSettings];
canAddAudioWriter = [assetWriter canAddInput:assetWriterAudioInput[t] ];
if (canAddAudioWriter)
{
assetWriterAudioInput[t].expectsMediaDataInRealTime = YES; //true;
[assetWriter addInput:assetWriterAudioInput[t] ];
}
}
CMFormatDescriptionRef myFormatDesc = assetWriterAudioInput[0].sourceFormatHint;
NSString* medType = [assetWriterAudioInput[0] mediaType];
}
if(canAddVideoWriter)
{
// tell the asset writer to expect media in real time.
assetWriterInputVideo.expectsMediaDataInRealTime = YES; //true;
// add the Input(s)
[assetWriter addInput:assetWriterInputVideo];
// Start writing the frames..
BOOL success = true;
success = [assetWriter startWriting];
CMTime startTime = CMTimeMake(0, fpsRate);
[assetWriter startSessionAtSourceTime:kCMTimeZero];
// [assetWriter startSessionAtSourceTime:startTime];
if (success)
{
startOurVideoCaptureProcess();
// **** possible enhancement is to use a pixelBufferPool to manage multiple buffers at once...
CVPixelBufferRef buffer = NULL;
int kRecordingFPS = fpsRate;
bool frameAdded = false;
unsigned int bufferID;
for( int i = 0; i < numFramesToCapture; i++)
{
printf("\n");
buffer = pixelBufferFromCard(bufferID, width, height, memFmt); // This function to get a CVBufferREf From our device, as well as getting the Audio data
while(!adaptor.assetWriterInput.readyForMoreMediaData)
{
printf(" readyForMoreMediaData FAILED \n");
}
if (buffer)
{
// Add video
printf("appending Frame %d ", i);
CMTime frameTime = CMTimeMake(i, kRecordingFPS);
frameAdded = [adaptor appendPixelBuffer:buffer withPresentationTime:frameTime];
if (frameAdded)
printf("VideoAdded.....\n ");
// Add Audio
{
// Do some Processing on the captured data to extract the interleaved Audio Samples for each channel
struct hanc_decode_struct decode;
DecodeHancFrameEx(gBFHancBuffer, decode);
int nAudioSamplesCaptured = 0;
if(decode.no_audio_samples > 0)
{
printf("completed deCodeHancEX, found %d samples \n", ( decode.no_audio_samples / numAudioChannels) );
nAudioSamplesCaptured = decode.no_audio_samples / numAudioChannels;
}
CMTime audioTimeStamp = CMTimeMake(nAudioSamplesWritten, 480000); // (Samples Written) / sampleRate for audio
// This function repacks the audio from interleaved PCM data into a vector of individual arrays of audio data, one per channel
RepackDecodedHancAudio((void*)pAudioSamples, numAudioChannels, nAudioSamplesCaptured, vecOfNonInterleavedAudioSamplesPtrs);
for (int t = 0; t < numAudioChannels; t++)
{
CMBlockBufferRef blockBuf = NULL; // *********** MUST release these AFTER adding the samples to the assetWriter...
CMSampleBufferRef cmBuf = NULL;
int sizeOfSamplesInBytes = nAudioSamplesCaptured * 2; // always 16bit memory samples...
// Create sample Block buffer for adding to the audio input.
status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
(void*)vecOfNonInterleavedAudioSamplesPtrs[t],
sizeOfSamplesInBytes,
kCFAllocatorNull,
NULL,
0,
sizeOfSamplesInBytes,
0,
&blockBuf);
if (status != noErr)
NSLog(@"CMBlockBufferCreateWithMemoryBlock error");
status = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault,
blockBuf,
TRUE,
0,
NULL,
audioFormatDesc,
nAudioSamplesCaptured,
audioTimeStamp,
NULL,
&cmBuf);
if (status != noErr)
NSLog(@"CMSampleBufferCreate error");
// lets check if the CMSampleBuffer is valid
bool bValid = CMSampleBufferIsValid(cmBuf);
// examine these values for debugging info....
CMTime cmTimeSampleDuration = CMSampleBufferGetDuration(cmBuf);
CMTime cmTimePresentationTime = CMSampleBufferGetPresentationTimeStamp(cmBuf);
if (status != noErr)
NSLog(@"Invalid Buffer found!!! possible CMSampleBufferCreate error?");
if(!assetWriterAudioInput[t].readyForMoreMediaData)
printf(" readyForMoreMediaData FAILED - Had to Drop a frame\n");
else
{
if(assetWriter.status == AVAssetWriterStatusWriting)
{
BOOL r = YES;
r = [assetWriterAudioInput[t] appendSampleBuffer:cmBuf];
if (!r)
{
NSLog(@"appendSampleBuffer error");
}
else
success = true;
}
else
printf("AssetWriter Not ready???!? \n");
}
if (cmBuf)
{
CFRelease(cmBuf);
cmBuf = 0;
}
if(blockBuf)
{
CFRelease(blockBuf);
blockBuf = 0;
}
}
nAudioSamplesWritten = nAudioSamplesWritten + nAudioSamplesCaptured;
}
if(success)
{
printf("Audio tracks Added..");
}
else
{
NSError* nsERR = [assetWriter error];
printf("Problem Adding Audio tracks / samples");
}
printf("Success \n");
}
if (buffer)
{
CVBufferRelease(buffer);
}
}
}
AVAssetWriterStatus sta = [assetWriter status];
CMTime endTime = CMTimeMake((numFramesToCapture-1), fpsRate);
if (audioFormatDesc)
{
CFRelease(audioFormatDesc);
audioFormatDesc = 0;
}
// Finish the session
StopVideoCaptureProcess();
[assetWriterInputVideo markAsFinished];
for (int t = 0; t < numAudioChannels; t++)
{
[assetWriterAudioInput[t] markAsFinished];
}
[assetWriter endSessionAtSourceTime:endTime];
bool finishedSuccessfully = [assetWriter finishWriting];
if (finishedSuccessfully)
NSLog(@"Writing file ended successfully \n");
else
{
NSLog(@"Writing file ended WITH ERRORS...");
sta = [assetWriter status];
if (sta != AVAssetWriterStatusCompleted)
{
NSError* nsERR = [assetWriter error];
printf("investigating the error \n");
}
}
}
else
{
NSLog(@"Unable to Add the InputVideo Asset Writer to the AssetWriter, file will not be written - Exiting");
}
if (audioFormatDesc)
CFRelease(audioFormatDesc);
}
for (int i = 0; i < 16; i++)
{
if (vecOfNonInterleavedAudioSamplesPtrs[i])
{
bfFree(2002 * sizeof(unsigned short), vecOfNonInterleavedAudioSamplesPtrs[i]);
vecOfNonInterleavedAudioSamplesPtrs[i] = nullptr;
}
}
}
else
{
NSLog(@"Unable to find a valid input signal - Exiting");
}
if (pAudioSamples)
delete pAudioSamples;
}
}
return 0;
}
It's a very basic sample that connects to some special hardware (the code for that is left out).
It grabs frames of video and audio, then processes the audio from interleaved PCM into individual arrays of PCM data for each track,
and then each buffer is added to the appropriate track, be it video or audio.
Lastly, the AVAssetWriter work is finished up and closed, and I exit and clean up.
Any help will be most appreciated,
Cheers,
James
Well, I finally found a working solution to this problem.
The solution comes in two parts:
I moved from using CMAudioSampleBufferCreateWithPacketDescriptions
to using CMSampleBufferCreate(..) with the appropriate arguments to that function call.
Initially, when experimenting with CMSampleBufferCreate, I was misusing some of the arguments and it was giving me the same results as I originally outlined here, but after carefully examining the values I was passing for the CMSampleTimingInfo struct - specifically the duration part - I eventually got everything working correctly!
So it appears that I was creating the CMBlockBufferRef correctly, but I needed to take more care when using it to create the CMSampleBufferRef that I was passing to the AVAssetWriterInput!
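Roughly, the shape of the replacement call was as follows. This is a sketch with illustrative values rather than my exact code; blockBuf, audioFormatDesc, nAudioSamplesCaptured, and nAudioSamplesWritten are the same variables used in the code above.

#include <CoreMedia/CoreMedia.h>

// The key point for LPCM: the timing info's duration is one sample at the audio sample rate,
// and the presentation timestamp advances by the number of samples already written.
CMSampleTimingInfo timing;
timing.duration              = CMTimeMake(1, 48000);                     // one sample at 48 kHz
timing.presentationTimeStamp = CMTimeMake(nAudioSamplesWritten, 48000);  // running sample count
timing.decodeTimeStamp       = kCMTimeInvalid;

size_t sampleSize = 2;   // one mono 16-bit sample per audio frame, in bytes

CMSampleBufferRef cmBuf = NULL;
OSStatus status = CMSampleBufferCreate(kCFAllocatorDefault,
                                       blockBuf,               // the CMBlockBufferRef built earlier
                                       TRUE,                   // the data is already ready
                                       NULL, NULL,             // no make-data-ready callback
                                       audioFormatDesc,        // the LPCM format description
                                       nAudioSamplesCaptured,  // number of samples in the buffer
                                       1, &timing,             // a single timing entry applies to all samples
                                       1, &sampleSize,         // a single constant sample size
                                       &cmBuf);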
Hope this helps someone else, as it was a nasty one for me to solve!
James
I want to find the duration of an audio file of type "amr" without converting it to other audio formats.
Is there any way to do this?
AK
I have coded the following in Objective-C to get the duration of a movie. It can similarly be used to get the duration of audio:
-(double)durationOfMovieAtPath:(NSString*)inMoviePath
{
double durationToReturn = -1;
NSFileManager *fm = [NSFileManager defaultManager];
if ([fm fileExistsAtPath:inMoviePath])
{
av_register_all();
AVFormatContext *inMovieFormat = NULL;
inMovieFormat = avformat_alloc_context();
int errorCode = av_open_input_file(&inMovieFormat, [inMoviePath UTF8String], NULL, 0, NULL);
//double durationToReturn = (double)inMovieFormat->duration / AV_TIME_BASE;
if (0==errorCode)
{
// only on success
int numberOfStreams = inMovieFormat->nb_streams;
AVStream *videoStream = NULL;
for (int i=0; i<numberOfStreams; i++)
{
AVStream *st = inMovieFormat->streams[i];
if (st->codec->codec_type == CODEC_TYPE_VIDEO)
{
videoStream = st;
break;
}
}
double divideFactor;
// The duration in AVStream is expressed in units of that stream's time_base, so we need to convert it back using this factor
if (NULL != videoStream)
{
divideFactor = (double)1/rationalToDouble(videoStream->time_base);
durationToReturn = (double)videoStream->duration / divideFactor;
}
//DEBUGLOG (@"Duration of movie at path: %@ = %0.3f", inMoviePath, durationToReturn);
}
else
{
DEBUGLOG (@"av_open_input_file error code = %d", errorCode);
}
if (nil!=inMovieFormat)
{
av_close_input_file(inMovieFormat);
//av_free(inMovieFormat);
}
}
return durationToReturn;
}
Change the CODEC_TYPE_VIDEO to CODEC_TYPE_AUDIO and I think it should work for you.
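For reference, the FFmpeg/libav APIs used above have since been renamed: av_open_input_file is now avformat_open_input, av_close_input_file is avformat_close_input, and CODEC_TYPE_AUDIO is AVMEDIA_TYPE_AUDIO (av_register_all is no longer needed on recent versions). Here is a minimal sketch of the same idea against a current libavformat, assuming a version new enough to have AVCodecParameters; durationOfAudioAtPath is just an illustrative helper name.

extern "C" {
#include <libavformat/avformat.h>
}

// Returns the duration of the first audio stream in seconds, or -1 on failure.
static double durationOfAudioAtPath(const char* path)
{
    double duration = -1;
    AVFormatContext* fmt = NULL;
    if (avformat_open_input(&fmt, path, NULL, NULL) == 0)
    {
        if (avformat_find_stream_info(fmt, NULL) >= 0)
        {
            for (unsigned int i = 0; i < fmt->nb_streams; i++)
            {
                AVStream* st = fmt->streams[i];
                if (st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO)
                {
                    // st->duration is expressed in st->time_base units; it can be AV_NOPTS_VALUE for some containers.
                    if (st->duration != AV_NOPTS_VALUE)
                        duration = st->duration * av_q2d(st->time_base);
                    break;
                }
            }
        }
        avformat_close_input(&fmt);
    }
    return duration;
}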