I'm building an application that allows for ALSA configuration, and in the GUI there is a peak meter that lets the client see playback levels in real time. I'm having a hard time determining which device to connect to, because I don't know whether ALSA has a default "loopback" device and, if so, what it's called. I'm also having trouble converting the read data into a sample and then finding that sample's amplitude. Here is what I have built so far:
Grab device and set hardware params
if (0 == snd_pcm_open(&pPcm, "default", SND_PCM_STREAM_CAPTURE, SND_PCM_NONBLOCK))
{
if (0 == snd_pcm_set_params(pPcm, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED, 1, 96000, 1, 1)) // This last argument confuses me because I'm not given a unit of measurement (second, millisecond, microsecond, etc.)
{
return snd_pcm_start(pPcm);
}
}
pPcm = nullptr;
return -1;
Read from device and return the peak of the audio signal
int iRtn = -1;
if (nullptr == pPcm)
{
if (-1 == SetupListener())
{
return iRtn;
}
}
// Check to make sure the state is sane for reading.
if (SND_PCM_STATE_PREPARED == snd_pcm_state(pPcm) ||
SND_PCM_STATE_RUNNING == snd_pcm_state(pPcm))
{
snd_pcm_resume(pPcm); // This might be superfluous.
// The state is sane, read from the stream.
signed short iBuffer = 0;
int iNumRead = snd_pcm_readi(pPcm, &iBuffer, 1);
if (0 < iNumRead)
{
// This calculates an approximation.
// We have some audio data; acquire its peak in dB (decibels).
float nSample = fabsf(static_cast<float>(iBuffer)); // take the absolute value so negative samples register
float nAmplitude = nSample / MAX_AMPLITUDE_S16; // MAX_AMPLITUDE_S16 is defined as "32767"
float nDecibels = (0 < nAmplitude) ? 20 * log10(nAmplitude) : 0;
iRtn = static_cast<int>(nDecibels); // Cast to integer for GUI element.
}
}
return iRtn;
The ALSA documentation seems very barren and so I apologize if I'm misusing the API.
I'm using pjsua to create a video call from a monitor to a phone. I'm able to establish an audio call without problem, but if I try to establish a video call (vid_cnt=1), I'm getting an error.
My purpose is to get and save the audio and video of the phone.
This is my configuration:
void hard_account_config(pjsua_acc_config& acc_cfg, pjsua_transport_id transport_tcp) {
pjsua_acc_config_default(&acc_cfg);
acc_cfg.ka_interval = 15;
// VIDEO
acc_cfg.vid_in_auto_show = PJ_TRUE;
acc_cfg.vid_out_auto_transmit = PJ_TRUE;
acc_cfg.vid_cap_dev = VideoCaptureDeviceId();
acc_cfg.vid_wnd_flags = PJMEDIA_VID_DEV_WND_BORDER | PJMEDIA_VID_DEV_WND_RESIZABLE;
acc_cfg.reg_timeout = 300;
acc_cfg.use_srtp = PJMEDIA_SRTP_DISABLED;
pjsua_srtp_opt_default(&acc_cfg.srtp_opt);
acc_cfg.ice_cfg_use = PJSUA_ICE_CONFIG_USE_CUSTOM;
acc_cfg.ice_cfg.enable_ice = PJ_FALSE;
acc_cfg.allow_via_rewrite = PJ_FALSE;
acc_cfg.allow_sdp_nat_rewrite = acc_cfg.allow_via_rewrite;
acc_cfg.allow_contact_rewrite = acc_cfg.allow_via_rewrite ? 2 : PJ_FALSE;
acc_cfg.publish_enabled = PJ_TRUE;
acc_cfg.transport_id = transport_tcp;
acc_cfg.cred_count = 1;
acc_cfg.cred_info[0].username = pj_string(USER);
acc_cfg.cred_info[0].realm = pj_string("*");
acc_cfg.cred_info[0].scheme = pj_string("Digest");
acc_cfg.cred_info[0].data_type = PJSIP_CRED_DATA_PLAIN_PASSWD;
acc_cfg.cred_info[0].data = pj_string(PASS);
}
Once registration is completed, I run the following code:
prn("=== Test Call ===");
pj_str_t uri = pj_string("sip:" + call_target + "#" + SERVER);
pjsua_call_id call_id;
pjsua_call_setting call_setting;
pjsua_call_setting_default(&call_setting);
call_setting.flag = 0;
call_setting.vid_cnt = PJMEDIA_HAS_VIDEO ? 1 : 0;
pjsua_msg_data msg_data;
pjsua_msg_data_init(&msg_data);
pj_status_t status = pjsua_call_make_call(acc_id, &uri, &call_setting, NULL, &msg_data, &call_id);
if (status != PJ_SUCCESS) {
prn("Error trying: pjsua_call_make_call");
return;
}
I know that PJMEDIA_HAS_VIDEO is set to 1 in conf_site.h, and pjsua_call_make_call returns PJ_SUCCESS.
I've seen that if I have headphones connected, there is no problem. But if I disconnect them, the following error is shown:
#pjsua_aud.c ..Error retrieving default audio device parameters: Unable to find default audio device (PJMEDIA_EAUD_NODEFDEV) [status=420006]
If I connect the headphones, enable the video, and run my code, the following error is shown:
#pjsua_media.c ......pjsua_vid_channel_update() failed for call_id 0 media 1: Unable to find default video device (PJMEDIA_EVID_NODEFDEV)
So, when using PJSUA, is it necessary to have audio and video devices on both the monitor and the phone? Should I create virtual ports if I don't have the devices?
You can use the following code to get a list of audio/video devices in PJSUA, which will most likely provide you with a loopback device (among others).
pjmedia_aud_dev_info audio_device[64];
unsigned int audio_device_cnt = 64;
status = pjsua_enum_aud_devs(audio_device, &audio_device_cnt);
printf("There are %d audio devices\n", audio_device_cnt);
for (int i = 0; i < audio_device_cnt; i++) {
printf("%d: %s\n", i, audio_device[i].name);
}
pjmedia_vid_dev_info video_device[64];
unsigned int video_device_cnt = 64;
status = pjsua_vid_enum_devs(video_device, &video_device_cnt);
printf("There are %d video devices\n", video_device_cnt);
for (int i = 0; i < video_device_cnt; i++) {
printf("%d: %s\n", i, video_device[i].name);
}
I have not personally tried capturing a loopback audio device but for video, PJSUA provides an internal colorbar generator (Colorbar generator in this list), which you can use.
Once you find the indices of loopback or dummy audio/video devices you want to use, you can set them by using
pjsua_set_snd_dev(<YOUR DUMMY CAPTURE DEVICE>, <YOUR DUMMY PLAYBACK DEVICE>);
acc_cfg.vid_cap_dev = <YOUR VIDEO CAPTURE DEVICE>;
I'm trying to convert some of my OpenAL code to OpenSL ES for Android (KitKat 4.4.4) on Genymotion, and I've encountered an issue with .wav files sampled at 44.1 kHz. My application is a native one (native glue).
I've followed the native-audio sample from the Android NDK samples and fragments from the excellent book Android NDK Beginner's Guide, and my code behaves correctly on most WAV/PCM data, except data sampled at 44.1 kHz. My specific code is this:
Engine init
// create OpenSL ES engine
SLEngineOption EngineOption[] = {(SLuint32) SL_ENGINEOPTION_THREADSAFE, (SLuint32) SL_BOOLEAN_TRUE};
const SLInterfaceID lEngineMixIIDs[] = {SL_IID_ENGINE};
const SLboolean lEngineMixReqs[] = {SL_BOOLEAN_TRUE};
SLresult res = slCreateEngine(&mEngineObj, 1, EngineOption, 1, lEngineMixIIDs, lEngineMixReqs);
res = (*mEngineObj)->Realize(mEngineObj, SL_BOOLEAN_FALSE);
res = (*mEngineObj)->GetInterface(mEngineObj, SL_IID_ENGINE, &mEngine); // get 'engine' interface
// create output mix (AKA playback; this represents speakers, headset etc.)
res = (*mEngine)->CreateOutputMix(mEngine, &mOutputMixObj, 0,NULL, NULL);
res = (*mOutputMixObj)->Realize(mOutputMixObj, SL_BOOLEAN_FALSE);
Player init
SLresult lRes;
// Set-up sound audio source.
SLDataLocator_AndroidSimpleBufferQueue lDataLocatorIn;
lDataLocatorIn.locatorType = SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE;
lDataLocatorIn.numBuffers = 1; // 1 buffer for a one-time load
// analyze and set correct PCM format
SLDataFormat_PCM lDataFormat;
lDataFormat.formatType = SL_DATAFORMAT_PCM;
lDataFormat.numChannels = audio->wav.channels; // e.g. 1, 2
lDataFormat.samplesPerSec = audio->wav.sampleRate * 1000; // e.g. 44100 * 1000 (OpenSL ES expects milliHertz)
lDataFormat.bitsPerSample = audio->wav.bitsPerSample; // e.g. 16
lDataFormat.containerSize = audio->wav.bitsPerSample;
lDataFormat.channelMask = SL_SPEAKER_FRONT_CENTER;
lDataFormat.endianness = SL_BYTEORDER_LITTLEENDIAN;
SLDataSource lDataSource;
lDataSource.pLocator = &lDataLocatorIn;
lDataSource.pFormat = &lDataFormat;
SLDataLocator_OutputMix lDataLocatorOut;
lDataLocatorOut.locatorType = SL_DATALOCATOR_OUTPUTMIX;
lDataLocatorOut.outputMix = mOutputMixObj;
SLDataSink lDataSink;
lDataSink.pLocator = &lDataLocatorOut;
lDataSink.pFormat = NULL;
const SLInterfaceID lSoundPlayerIIDs[] = { SL_IID_PLAY, SL_IID_ANDROIDSIMPLEBUFFERQUEUE };
const SLboolean lSoundPlayerReqs[] = { SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE };
lRes = (*mEngine)->CreateAudioPlayer(mEngine, &mPlayerObj, &lDataSource, &lDataSink, 2, lSoundPlayerIIDs, lSoundPlayerReqs);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayerObj)->Realize(mPlayerObj, SL_BOOLEAN_FALSE);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayerObj)->GetInterface(mPlayerObj, SL_IID_PLAY, &mPlayer);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayerObj)->GetInterface(mPlayerObj, SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &mPlayerQueue);
if (lRes != SL_RESULT_SUCCESS) { return; }
// register callback on the buffer queue
lRes = (*mPlayerQueue)->RegisterCallback(mPlayerQueue, bqPlayerQueueCallback, NULL);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayer)->SetCallbackEventsMask(mPlayer, SL_PLAYEVENT_HEADATEND);
if (lRes != SL_RESULT_SUCCESS) { return; }
// ..fetch the data in 'audio->data' from opened FILE* stream and set 'datasize'
// feed the buffer with data
lRes = (*mPlayerQueue)->Clear(mPlayerQueue); // remove any sound from buffer
lRes = (*mPlayerQueue)->Enqueue(mPlayerQueue, audio->data, datasize);
The above works well for 8000, 22050 and 32000 samples/sec, but with 44100 samples/sec, 4 out of 5 times it will repeat itself many times on the first play. It's like a door-knocking sound effect that loops about 50 times, and at increased speed, from a single ->SetPlayState(..SL_PLAYSTATE_PLAYING). Is there any obvious error in my code? A multi-threading issue at this sample rate? Has anyone else had this kind of problem? Should I downsample in the 44.1 kHz case? Could it be a Genymotion problem? Thanks.
I solved this by downsampling from 44.1 kHz to 22.05 kHz. Interestingly, this only happens on sounds with 1 channel and a 44,100 Hz sample rate; in all other cases there's no problem.
I am having trouble with WASAPI. It does not output any sound, even though I have checked the data being written to the buffer.
Because there is no sound at all, I have no idea how to track down the problem.
The issue may be somewhere in the following code.
SoundStream::SoundStream() : writtenCursor(0), writeCursor(0), distroy(false)
{
IMMDeviceEnumerator * pEnumerator = nullptr;
HResult(CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, IID_PPV_ARGS(&pEnumerator)));
IMMDevice * pDevice = nullptr;
HResult(pEnumerator->GetDefaultAudioEndpoint(eRender, eMultimedia, &pDevice));
SafeRelease(&pEnumerator);
HResult(pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pAudioClient));
SafeRelease(&pDevice);
WAVEFORMATEXTENSIBLE * pwfx = nullptr;
hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
REFERENCE_TIME hnsRequestedDuration = REFTIMES_PER_SEC * 2;
HResult(pAudioClient->GetMixFormat((WAVEFORMATEX**)&pwfx));
HResult(pAudioClient->Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
hnsRequestedDuration,
0,
(WAVEFORMATEX*)pwfx,
NULL));
pAudioClient->SetEventHandle(hEvent);
channel = (size_t)pwfx->Format.nChannels;
bits = (size_t)pwfx->Format.wBitsPerSample;
validBits = (size_t)pwfx->Samples.wValidBitsPerSample;
frequency = (size_t)pwfx->Format.nSamplesPerSec;
buffer.reshape({ 0, channel, bits >> 3 });
CoTaskMemFree(pwfx);
HResult(pAudioClient->GetBufferSize(&bufferFrameCount));
HResult(pAudioClient->Start());
if (pAudioClient)
{
thread = std::thread([&]()
{
this->Sync();
});
}
}
You could look at my WASAPI.cpp code at http://jdmcox.com (which works fine).
You should also check if the expected wave format is float:
//SubFormat 00000003-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_IEEE_FLOAT
//SubFormat 00000001-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_PCM
GUID G;
WORD V;
WAVEFORMATEX *pwfx = NULL;
bool itsfloat;
pAudioClient->GetMixFormat(&pwfx);
// Do we received a WAVEFORMATEXTENSIBLE?
if(pwfx->cbSize >= 22) {
G = ((WAVEFORMATEXTENSIBLE*)pwfx)->SubFormat;
V = ((WAVEFORMATEXTENSIBLE*)pwfx)->Samples.wValidBitsPerSample;
if (G.Data1 == 3) itsfloat = true;
else if (G.Data1 == 1) itsfloat = false;
}
You know you received a WAVEFORMATEXTENSIBLE and not a simple WAVEFORMATEX because pwfx->cbSize >= 22.
See more at:
IAudioClient::GetMixFormat
https://learn.microsoft.com/en-us/windows/win32/api/audioclient/nf-audioclient-iaudioclient-getmixformat
WAVEFORMATEXTENSIBLE
https://learn.microsoft.com/en-us/windows/win32/api/mmreg/ns-mmreg-waveformatextensible
You could look at my WASAPI.cpp code at http://jdmcox.com AGAIN.
Now it works in shared mode as well as exclusive mode.
I should note that no conversion of the wave format or wave data is necessary in shared mode -- Windows takes care of converting both to and from the format it uses to mix waves.
I'm a newbie to native development on Windows, but I've been tasked with creating a small app that will list out all the transformers for various video+audio codecs.
Looking at the MSDN documentation, there doesn't seem to be much direct documentation on doing this. Docs that I've found indicate that this information is stored in the registry (not sure where) so that could be a vector.
Is this possible?
Generally how should I do it?
Thanks
EDIT:
It does seem that a call to MFTEnumEx with the MFT_REGISTER_TYPE_INFO parameters set to NULL returns a count of 8:
MFTEnumEx(MFT_CATEGORY_VIDEO_DECODER,MFT_ENUM_FLAG_ALL,NULL, NULL, &ppActivate, &count);
assert(count > 0);
I still have to get the actual values, though. But the passed ppActivate param should contain an enumeration of them.
EDIT:
It's surprising, but while the count above == 8, there are no video or audio attributes (the video/audio IMFAttributes object is NULL):
IMFAttributes* videoAttributes = NULL;
if(SUCCEEDED(hr)){
hr = pProfile->GetVideoAttributes(&videoAttributes);
//If there are no container attributes set in the transcode profile, the GetVideoAttributes method succeeds and videoAttributes receives NULL.
}
assert(videoAttributes != NULL); //FAILS!
EDIT:
This is a method that pulls all the IMFMediaTypes available on the machine (modified from the book Developing Microsoft® Media Foundation Applications); I then enumerate over them in the caller:
HRESULT CTranscoder::GetVideoOutputAvailableTypes(
DWORD flags,
CComPtr<IMFCollection>& pTypeCollection)
{
HRESULT hr = S_OK;
IMFActivate** pActivateArray = NULL;
MFT_REGISTER_TYPE_INFO outputType;
UINT32 nMftsFound = 0;
do
{
// create the collection in which we will return the types found
hr = MFCreateCollection(&pTypeCollection);
BREAK_ON_FAIL(hr);
// initialize the structure that describes the output streams that the encoders must
// be able to produce. In this case we want video encoders - so major type is video,
// and we want the specified subtype
outputType.guidMajorType = MFMediaType_Video;
outputType.guidSubtype = MFVideoFormat_WMV3;
// get a collection of MFTs that fit the requested pattern - video encoders,
// with the specified subtype, and using the specified search flags
hr = MFTEnumEx(
MFT_CATEGORY_VIDEO_ENCODER, // type of object to find - video encoders
flags, // search flags
NULL, // match all input types for an encoder
&outputType, // get encoders with specified output type
&pActivateArray,
&nMftsFound);
BREAK_ON_FAIL(hr);
// now that we have an array of activation objects for matching MFTs, loop through
// each of those MFTs, extracting all possible and available formats from each of them
for(UINT32 x = 0; x < nMftsFound; x++)
{
CComPtr<IMFTransform> pEncoder;
UINT32 typeIndex = 0;
// activate the encoder that corresponds to the activation object
hr = pActivateArray[x]->ActivateObject(IID_IMFTransform,
(void**)&pEncoder);
// while we don't have a failure, get each available output type for the
// MFT encoder. We keep looping until there are no more available types;
// when there are no more types for the encoder,
// IMFTransform::GetOutputAvailableType() will return MF_E_NO_MORE_TYPES
while(SUCCEEDED(hr))
{
IMFMediaType* pType = NULL;
// get the available type for the type index, and increment the typeIndex
// counter
hr = pEncoder->GetOutputAvailableType(0, typeIndex++, &pType);
if(SUCCEEDED(hr))
{
// store the type in the IMFCollection; AddElement takes its own
// reference, so release ours to avoid leaking the media type
hr = pTypeCollection->AddElement(pType);
pType->Release();
}
}
}
} while(false);
// possible valid errors that may be returned after the previous for loop is done
if(hr == MF_E_NO_MORE_TYPES || hr == MF_E_TRANSFORM_TYPE_NOT_SET)
hr = S_OK;
// if we successfully used MFTEnumEx() to allocate an array of the MFT activation
// objects, then it is our responsibility to release each one and free up the memory
// used by the array
if(pActivateArray != NULL)
{
// release the individual activation objects
for(UINT32 x = 0; x < nMftsFound; x++)
{
if(pActivateArray[x] != NULL)
pActivateArray[x]->Release();
}
// free the memory used by the array
CoTaskMemFree(pActivateArray);
pActivateArray = NULL;
}
return hr;
}
Caller:
hr=transcoder.GetVideoOutputAvailableTypes( MFT_ENUM_FLAG_ALL, availableTypes);
if (FAILED(hr)){
wprintf_s(L"didn't like the printVideoProfiles method");
}
DWORD availableInputTypeCount =0;
if(SUCCEEDED(hr)){
hr= availableTypes->GetElementCount(&availableInputTypeCount);
}
for(DWORD i = 0; i< availableInputTypeCount && SUCCEEDED(hr); i++)
{
//really a IMFMediaType*
IMFAttributes* mediaInterface = NULL;
if(SUCCEEDED(hr)){
hr = availableTypes->GetElement(i, (IUnknown**)&mediaInterface) ;}
if(SUCCEEDED(hr)){
//see http://msdn.microsoft.com/en-us/library/aa376629(v=VS.85).aspx for a list of attributes to pull off the media interface.
GUID majorType;
hr = mediaInterface->GetGUID(MF_MT_MAJOR_TYPE, &majorType);
LPOLESTR majorGuidString = NULL;
hr = StringFromCLSID(majorType,&majorGuidString);
wprintf_s(L"major type: %s \n", majorGuidString);
wprintf_s(L"is a video? %i \n", IsEqualGUID(MFMediaType_Video,majorType));
GUID subType;
if(SUCCEEDED(mediaInterface->GetGUID(MF_MT_SUBTYPE, &subType))){
LPOLESTR minorGuidString = NULL;
if(SUCCEEDED(StringFromCLSID(subType,&minorGuidString)))
wprintf_s(L"subtype: %s \n", minorGuidString);
}
//Contains a DirectShow format GUID for a media type: http://msdn.microsoft.com/en-us/library/dd373477(v=VS.85).aspx
GUID formatType;
if(SUCCEEDED(mediaInterface->GetGUID(MF_MT_AM_FORMAT_TYPE, &formatType))){
LPOLESTR formatTypeString = NULL;
if(SUCCEEDED(StringFromCLSID(formatType,&formatTypeString)))
wprintf_s(L"format type: %s \n", formatTypeString);
}
UINT32 numeratorFrameRate = 0;
UINT32 denominatorFrameRate = 0;
if(SUCCEEDED(MFGetAttributeRatio(mediaInterface, MF_MT_FRAME_RATE, &numeratorFrameRate, &denominatorFrameRate)))
wprintf_s(L"framerate: %i/%i \n", numeratorFrameRate, denominatorFrameRate);
UINT32 widthOfFrame = 0;
UINT32 heightOfFrame = 0;
if(SUCCEEDED(MFGetAttributeSize(mediaInterface, MF_MT_FRAME_SIZE, &widthOfFrame, &heightOfFrame)))
wprintf_s(L"height of frame: %i width of frame: %i \n", heightOfFrame, widthOfFrame);
UINT32 isCompressedP = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_COMPRESSED, &isCompressedP)))
wprintf_s(L"is media compressed? %iu \n", (BOOL)isCompressedP);
BOOL isCompressedP2 = 0;
if(SUCCEEDED((((IMFMediaType*)mediaInterface)->IsCompressedFormat(&isCompressedP2))))
wprintf_s(L"is media compressed2? %i \n", isCompressedP2);
UINT32 fixedSampleSizeP = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_FIXED_SIZE_SAMPLES, &fixedSampleSizeP)))
wprintf_s(L"is fixed sample size? %iu \n", fixedSampleSizeP);
UINT32 sampleSize = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_SAMPLE_SIZE, &sampleSize)))
wprintf_s(L"sample size: %iu \n", sampleSize);
UINT32 averageBitrate = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_AVG_BITRATE, &averageBitrate)))
wprintf_s(L"average bitrate: %iu \n", averageBitrate);
UINT32 aspectRatio = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_PAD_CONTROL_FLAGS, &aspectRatio)))
wprintf_s(L"4 by 3? %i 16 by 9? %i None? %i \n", aspectRatio == MFVideoPadFlag_PAD_TO_4x3, MFVideoPadFlag_PAD_TO_16x9 == aspectRatio, MFVideoPadFlag_PAD_TO_None == aspectRatio);
UINT32 drmFlag = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_DRM_FLAGS, &drmFlag)))
wprintf_s(L"requires digital drm: %i requires analog drm: %i requires no drm: %i", drmFlag == MFVideoDRMFlag_DigitallyProtected, drmFlag == MFVideoDRMFlag_AnalogProtected, MFVideoDRMFlag_None == drmFlag);
UINT32 panScanEnabled = 0;
if(SUCCEEDED(mediaInterface->GetUINT32(MF_MT_PAN_SCAN_ENABLED, &panScanEnabled)))
wprintf_s(L"pan/scan enabled? %i", panScanEnabled);
UINT32 maxFrameRateNumerator = 0;
UINT32 maxFrameRateDenominator = 0;
if(SUCCEEDED(MFGetAttributeRatio(mediaInterface, MF_MT_FRAME_RATE_RANGE_MAX, &maxFrameRateNumerator, &maxFrameRateDenominator)))
wprintf_s(L"max framerate range: %i/%i \n", maxFrameRateNumerator, maxFrameRateDenominator);
}
}
It's getting some attributes from the IMFMediaType interface, but not many attributes are set, and the call to mediaInterface->GetUINT32(MF_MT_COMPRESSED, &isCompressedP) isn't successful, while the call to ((IMFMediaType*)mediaInterface)->IsCompressedFormat(&isCompressedP2) is, which makes me wonder if I'm doing it wrong.
This is an old question, but no one should go away unanswered.
As you discovered, MFTEnumEx can give you the list of MFTs, either as a bulk list or filtered by criteria. Once you have the collection of transforms, you have an IMFActivate for every transform available.
Having IMFActivate on hand, see this code snippet for how to obtain information about the transform: you can list its attributes or access an attribute of interest by its key, and you can obtain the category and the input and output media types (MFT_INPUT_TYPES_Attributes, MFT_OUTPUT_TYPES_Attributes).
Here is sample code and MFT dump samples:
How to enumerate Media Foundation transforms on your system
Enumerating Media Foundation Transforms (MFTs)
What is the simplest way to capture audio from the built-in audio input and be able to read the raw sample values (as in a .wav) in real time as they come in, when requested, like reading from a socket?
Hopefully code that uses one of Apple's frameworks (Audio Queues). The documentation is not very clear, and what I need is very basic.
Try the AudioQueue framework for this. You mainly have to perform 3 steps:
set up an audio format describing how to sample the incoming analog audio
start a new recording AudioQueue with AudioQueueNewInput()
register a callback routine which handles the incoming audio data packets
In step 3 you have a chance to analyze the incoming audio data with AudioQueueGetProperty().
It's roughly like this:
static void HandleAudioCallback (void *aqData,
AudioQueueRef inAQ,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc) {
// Here you examine your audio data
}
static void StartRecording() {
// now let's start the recording
AudioQueueNewInput (&aqData.mDataFormat, // the sampling format to record with (an AudioStreamBasicDescription)
HandleAudioCallback, // your callback routine
&aqData, // user data pointer, handed back to the callback
NULL,
kCFRunLoopCommonModes,
0,
&aqData.mQueue); // your freshly created AudioQueue
AudioQueueStart(aqData.mQueue,
NULL);
}
I suggest Apple's AudioQueue Services Programming Guide for detailed information on how to start and stop the AudioQueue and how to set up all the required objects correctly.
You may also have a closer look at Apple's demo program SpeakHere. But this is IMHO a bit confusing to start with.
It depends how 'real-time' you need it.
If you need it very crisp, go right down to the bottom level and use Audio Units. That means setting up an INPUT callback. Remember, when this fires you need to allocate your own buffers and then request the audio from the microphone.
I.e. don't get fooled by the presence of a buffer pointer in the parameters... it is only there because Apple is using the same function declaration for the input and render callbacks.
Here is a paste from one of my projects:
OSStatus dataArrivedFromMic(
void * inRefCon,
AudioUnitRenderActionFlags * ioActionFlags,
const AudioTimeStamp * inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * dummy_notused )
{
OSStatus status;
RemoteIOAudioUnit* unitClass = (RemoteIOAudioUnit *)inRefCon;
AudioComponentInstance myUnit = unitClass.myAudioUnit;
AudioBufferList ioData;
{
int kNumChannels = 1; // one channel...
enum {
kMono = 1,
kStereo = 2
};
ioData.mNumberBuffers = kNumChannels;
for (int i = 0; i < kNumChannels; i++)
{
int bytesNeeded = inNumberFrames * sizeof( Float32 );
ioData.mBuffers[i].mNumberChannels = kMono;
ioData.mBuffers[i].mDataByteSize = bytesNeeded;
ioData.mBuffers[i].mData = malloc( bytesNeeded ); // NB: this snippet never frees these buffers; real code should free or reuse them
}
}
// actually GET the data that arrived
status = AudioUnitRender( myUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
& ioData );
// take MONO from mic
const int channel = 0;
Float32 * outBuffer = (Float32 *) ioData.mBuffers[channel].mData;
// get a handle to our game object
static KPRing* kpRing = nil;
if ( ! kpRing )
{
//AppDelegate * appDelegate = [UIApplication sharedApplication].delegate;
kpRing = [Game singleton].kpRing;
assert( kpRing );
}
// ... and send it the data we just got from the mic
[ kpRing floatsArrivedFromMic: outBuffer
count: inNumberFrames ];
return status;
}