Android OpenSL ES - issue with .wav file sampled at 44.1 kHz - android-ndk

I'm trying to convert some of my OpenAL code to OpenSL ES for Android (KitKat 4.4.4, running on Genymotion) and have hit an issue with .wav files sampled at 44.1 kHz. My application is a native one (native activity glue).
I've followed the native-audio sample from the Android NDK samples and fragments from the excellent book Android NDK Beginner's Guide, and my code behaves correctly on most WAV/PCM data, except for files sampled at 44.1 kHz. The relevant code is this:
Engine init
// create OpenSL ES engine
SLEngineOption EngineOption[] = {(SLuint32) SL_ENGINEOPTION_THREADSAFE, (SLuint32) SL_BOOLEAN_TRUE};
const SLInterfaceID lEngineMixIIDs[] = {SL_IID_ENGINE};
const SLboolean lEngineMixReqs[] = {SL_BOOLEAN_TRUE};
SLresult res = slCreateEngine(&mEngineObj, 1, EngineOption, 1, lEngineMixIIDs, lEngineMixReqs);
res = (*mEngineObj)->Realize(mEngineObj, SL_BOOLEAN_FALSE);
res = (*mEngineObj)->GetInterface(mEngineObj, SL_IID_ENGINE, &mEngine); // get 'engine' interface
// create output mix (AKA playback; this represents speakers, headset etc.)
res = (*mEngine)->CreateOutputMix(mEngine, &mOutputMixObj, 0,NULL, NULL);
res = (*mOutputMixObj)->Realize(mOutputMixObj, SL_BOOLEAN_FALSE);
Player init
SLresult lRes;
// Set-up sound audio source.
SLDataLocator_AndroidSimpleBufferQueue lDataLocatorIn;
lDataLocatorIn.locatorType = SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE;
lDataLocatorIn.numBuffers = 1; // 1 buffer for a one-time load
// analyze and set correct PCM format
SLDataFormat_PCM lDataFormat;
lDataFormat.formatType = SL_DATAFORMAT_PCM;
lDataFormat.numChannels = audio->wav.channels; // e.g. 1, 2
lDataFormat.samplesPerSec = audio->wav.sampleRate * 1000; // e.g. 44100 * 1000
lDataFormat.bitsPerSample = audio->wav.bitsPerSample; // e.g. 16
lDataFormat.containerSize = audio->wav.bitsPerSample;
lDataFormat.channelMask = SL_SPEAKER_FRONT_CENTER;
lDataFormat.endianness = SL_BYTEORDER_LITTLEENDIAN;
SLDataSource lDataSource;
lDataSource.pLocator = &lDataLocatorIn;
lDataSource.pFormat = &lDataFormat;
SLDataLocator_OutputMix lDataLocatorOut;
lDataLocatorOut.locatorType = SL_DATALOCATOR_OUTPUTMIX;
lDataLocatorOut.outputMix = mOutputMixObj;
SLDataSink lDataSink;
lDataSink.pLocator = &lDataLocatorOut;
lDataSink.pFormat = NULL;
const SLInterfaceID lSoundPlayerIIDs[] = { SL_IID_PLAY, SL_IID_ANDROIDSIMPLEBUFFERQUEUE };
const SLboolean lSoundPlayerReqs[] = { SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE };
lRes = (*mEngine)->CreateAudioPlayer(mEngine, &mPlayerObj, &lDataSource, &lDataSink, 2, lSoundPlayerIIDs, lSoundPlayerReqs);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayerObj)->Realize(mPlayerObj, SL_BOOLEAN_FALSE);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayerObj)->GetInterface(mPlayerObj, SL_IID_PLAY, &mPlayer);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayerObj)->GetInterface(mPlayerObj, SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &mPlayerQueue);
if (lRes != SL_RESULT_SUCCESS) { return; }
// register callback on the buffer queue
lRes = (*mPlayerQueue)->RegisterCallback(mPlayerQueue, bqPlayerQueueCallback, NULL);
if (lRes != SL_RESULT_SUCCESS) { return; }
lRes = (*mPlayer)->SetCallbackEventsMask(mPlayer, SL_PLAYEVENT_HEADATEND);
if (lRes != SL_RESULT_SUCCESS) { return; }
// ..fetch the data in 'audio->data' from opened FILE* stream and set 'datasize'
// feed the buffer with data
lRes = (*mPlayerQueue)->Clear(mPlayerQueue); // remove any sound from buffer
lRes = (*mPlayerQueue)->Enqueue(mPlayerQueue, audio->data, datasize);
The above works well for 8000, 22050 and 32000 samples/sec, but with 44100 samples/sec, 4 out of 5 times the sound repeats itself many times on the first play. It's like having a door-knocking sound effect that actually loops many times (about 50 times), and at high speed, from a single ->SetPlayState(..SL_PLAYSTATE_PLAYING). Is there any obvious error in my code? A multi-threading issue with this sampling rate? Has anyone else had this kind of problem? Should I downsample in the 44.1 kHz cases? Could it be a Genymotion problem? Thanks.

I solved this by downsampling from 44.1 kHz to 22.05 kHz. Interestingly, this only happens with sounds that have 1 channel and a 44,100 Hz sample rate; in all other cases there's no problem.
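For reference, a minimal sketch of that kind of 2:1 decimation for 16-bit mono PCM (simple pair averaging with no low-pass filtering; the function and buffer names are illustrative, not from the code above):
#include <cstdint>
#include <cstddef>
// Crude 2:1 decimation for 16-bit mono PCM: average each pair of samples.
// No anti-aliasing filter is applied; adequate for short sound effects.
static size_t downsample_2to1_s16(const int16_t *in, size_t inSamples, int16_t *out)
{
    size_t outSamples = inSamples / 2;
    for (size_t i = 0; i < outSamples; ++i)
    {
        int32_t avg = ((int32_t)in[2 * i] + (int32_t)in[2 * i + 1]) / 2;
        out[i] = (int16_t)avg;
    }
    return outSamples; // remember to halve lDataFormat.samplesPerSec to match
}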

Related

Video call using PJSUA

I'm using pjsua to create a video call from a monitor to a phone. I'm able to establish an audio call without problems, but if I try to establish a video call (vid_cnt=1), I get an error.
My goal is to capture and save the phone's audio and video.
This is my configuration:
void hard_account_config(pjsua_acc_config& acc_cfg, pjsua_transport_id transport_tcp) {
pjsua_acc_config_default(&acc_cfg);
acc_cfg.ka_interval = 15;
// VIDEO
acc_cfg.vid_in_auto_show = PJ_TRUE;
acc_cfg.vid_out_auto_transmit = PJ_TRUE;
acc_cfg.vid_cap_dev = VideoCaptureDeviceId();
acc_cfg.vid_wnd_flags = PJMEDIA_VID_DEV_WND_BORDER | PJMEDIA_VID_DEV_WND_RESIZABLE;
acc_cfg.reg_timeout = 300;
acc_cfg.use_srtp = PJMEDIA_SRTP_DISABLED;
pjsua_srtp_opt_default(&acc_cfg.srtp_opt);
acc_cfg.ice_cfg_use = PJSUA_ICE_CONFIG_USE_CUSTOM;
acc_cfg.ice_cfg.enable_ice = PJ_FALSE;
acc_cfg.allow_via_rewrite = PJ_FALSE;
acc_cfg.allow_sdp_nat_rewrite = acc_cfg.allow_via_rewrite;
acc_cfg.allow_contact_rewrite = acc_cfg.allow_via_rewrite ? 2 : PJ_FALSE;
acc_cfg.publish_enabled = PJ_TRUE;
acc_cfg.transport_id = transport_tcp;
acc_cfg.cred_count = 1;
acc_cfg.cred_info[0].username = pj_string(USER);
acc_cfg.cred_info[0].realm = pj_string("*");
acc_cfg.cred_info[0].scheme = pj_string("Digest");
acc_cfg.cred_info[0].data_type = PJSIP_CRED_DATA_PLAIN_PASSWD;
acc_cfg.cred_info[0].data = pj_string(PASS);
}
Once registration is completed, I run the following code:
prn("=== Test Call ===");
pj_str_t uri = pj_string("sip:" + call_target + "#" + SERVER);
pjsua_call_id call_id;
pjsua_call_setting call_setting;
pjsua_call_setting_default(&call_setting);
call_setting.flag = 0;
call_setting.vid_cnt = PJMEDIA_HAS_VIDEO ? 1 : 0;
pjsua_msg_data msg_data;
pjsua_msg_data_init(&msg_data);
pj_status_t status = pjsua_call_make_call(acc_id, &uri, &call_setting, NULL, &msg_data, &call_id);
if (status != PJ_SUCCESS) {
prn("Error trying: pjsua_call_make_call");
return;
}
I know that PJMEDIA_HAS_VIDEO is set to 1 in conf_site.h, and pjsua_call_make_call returns PJ_SUCCESS.
I've seen that if I have headphones connected, there is no problem. But if I disconnect them, the following error is shown:
#pjsua_aud.c ..Error retrieving default audio device parameters: Unable to find default audio device (PJMEDIA_EAUD_NODEFDEV) [status=420006]
If I connect the headphones, enable the video and run my code, the following error is shown:
#pjsua_media.c ......pjsua_vid_channel_update() failed for call_id 0 media 1: Unable to find default video device (PJMEDIA_EVID_NODEFDEV)
So, when using PJSUA, is it necessary to have audio and video devices on both the monitor and the phone? Should I create virtual ports if I don't have the devices?
You can use the following code to get a list of audio/video devices in PJSUA, which will most likely provide you with a loopback device (among others).
pjmedia_aud_dev_info audio_device[64];
unsigned int audio_device_cnt = 64;
status = pjsua_enum_aud_devs(audio_device, &audio_device_cnt);
printf("There are %d audio devices\n", audio_device_cnt);
for (int i = 0; i < audio_device_cnt; i++) {
printf("%d: %s\n", i, audio_device[i].name);
}
pjmedia_vid_dev_info video_device[64];
unsigned int video_device_cnt = 64;
status = pjsua_vid_enum_devs(video_device, &video_device_cnt);
printf("There are %d video devices\n", video_device_cnt);
for (int i = 0; i < video_device_cnt; i++) {
printf("%d: %s\n", i, video_device[i].name);
}
I have not personally tried capturing from a loopback audio device, but for video, PJSUA provides an internal colorbar generator (it shows up as "Colorbar generator" in the list above), which you can use.
Once you find the indices of the loopback or dummy audio/video devices you want to use, you can set them with:
pjsua_set_snd_dev(<YOUR DUMMY CAPTURE DEVICE>, <YOUR DUMMY PLAYBACK DEVICE>);
acc_cfg.vid_cap_dev = <YOUR VIDEO CAPTURE DEVICE>;
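For example, a minimal sketch of wiring the colorbar generator in as the capture device and falling back to pjsua's null sound device when no real audio hardware is present (this assumes pjsua is already initialized; matching the device by the "Colorbar" substring is an assumption):
// Pick the built-in colorbar generator as the video capture device.
// strstr() is from <string.h>; selecting by name substring is an assumption.
pjmedia_vid_dev_info vid_devs[64];
unsigned vid_dev_cnt = 64;
if (pjsua_vid_enum_devs(vid_devs, &vid_dev_cnt) == PJ_SUCCESS) {
    for (unsigned i = 0; i < vid_dev_cnt; i++) {
        if ((vid_devs[i].dir & PJMEDIA_DIR_CAPTURE) &&
            strstr(vid_devs[i].name, "Colorbar")) {
            acc_cfg.vid_cap_dev = vid_devs[i].id; // use the generator as the "camera"
            break;
        }
    }
}
// With no usable microphone/speaker, keep the media engine alive with the
// null sound device instead of a real capture/playback pair.
pjsua_set_null_snd_dev();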

Distorted microphone audio when the loudspeaker is enabled (Xamarin.iOS)

I am maintaining a push-to-talk (PTT) VoIP app.
When a PTT call is running, the app creates an audio session:
m_AudioSession = AVAudioSession.SharedInstance();
NSError error;
if (!m_AudioSession.SetCategory(AVAudioSession.CategoryPlayAndRecord, AVAudioSessionCategoryOptions.DefaultToSpeaker | AVAudioSessionCategoryOptions.AllowBluetooth, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the category");
}
if (!m_AudioSession.SetMode(AVAudioSession.ModeVoiceChat, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the mode");
}
if (!m_AudioSession.OverrideOutputAudioPort(AVAudioSessionPortOverride.Speaker, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error redirecting the audio to the loudspeaker");
}
if (!m_AudioSession.SetPreferredIOBufferDuration(0.06, out error)) // 60 milliseconds
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the preferred buffer duration");
}
if (!m_AudioSession.SetPreferredSampleRate(8000, out error)) // 8 kHz
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the preferred sample rate");
}
if (!m_AudioSession.SetActive(true, out error))
{
IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error activating the audio session");
}
The received audio is played using the OutputAudioQueue and the microphone audio is captured (as mentioned in the Apple Doc: https://developer.apple.com/documentation/avfaudio/avaudiosession/mode/1616455-voicechat) using a Voice-Processing I/O Unit.
The initialization code for Voice-Processing I/O Unit is:
AudioStreamBasicDescription audioFormat = new AudioStreamBasicDescription()
{
SampleRate = SAMPLERATE_8000,
Format = AudioFormatType.LinearPCM,
FormatFlags = AudioFormatFlags.LinearPCMIsSignedInteger | AudioFormatFlags.LinearPCMIsPacked,
FramesPerPacket = 1,
ChannelsPerFrame = CHANNELS,
BitsPerChannel = BITS_X_SAMPLE,
BytesPerPacket = BYTES_X_SAMPLE,
BytesPerFrame = BYTES_X_FRAME,
Reserved = 0
};
AudioComponent audioComp = AudioComponent.FindComponent(AudioTypeOutput.VoiceProcessingIO);
AudioUnit.AudioUnit voiceProcessing = new AudioUnit.AudioUnit(audioComp);
AudioUnitStatus unitStatus = AudioUnitStatus.NoError;
unitStatus = voiceProcessing.SetEnableIO(true, AudioUnitScopeType.Input, ELEM_Mic);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetEnableIO(true, AudioUnitScopeType.Input, ELEM_Mic) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetEnableIO(true, AudioUnitScopeType.Output, ELEM_Speaker);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetEnableIO(true, AudioUnitScopeType.Output, ELEM_Speaker) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetFormat(audioFormat, AudioUnitScopeType.Output, ELEM_Mic);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetFormat (MIC-OUTPUT) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetFormat(audioFormat, AudioUnitScopeType.Input, ELEM_Speaker);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetFormat (ELEM 0-INPUT) returned: {0}", unitStatus);
}
unitStatus = voiceProcessing.SetRenderCallback(AudioUnit_RenderCallback, AudioUnitScopeType.Input, ELEM_Speaker);
if (unitStatus != AudioUnitStatus.NoError)
{
DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetRenderCallback returned: {0}", unitStatus);
}
...
voiceProcessing.Initialize();
voiceProcessing.Start();
And the RenderCallback function is:
private AudioUnitStatus AudioUnit_RenderCallback(AudioUnitRenderActionFlags actionFlags, AudioTimeStamp timeStamp, uint busNumber, uint numberFrames, AudioBuffers data)
{
AudioUnit.AudioUnit voiceProcessing = m_VoiceProcessing;
if (voiceProcessing != null)
{
// getting microphone input signal
var status = voiceProcessing.Render(ref actionFlags, timeStamp, ELEM_Mic, numberFrames, data);
if (status != AudioUnitStatus.OK)
{
return status;
}
if (data.Count > 0)
{
unsafe
{
short* samples = (short*)data[0].Data.ToPointer();
for (uint idxSrcFrame = 0; idxSrcFrame < numberFrames; idxSrcFrame++)
{
... send the collected microphone audio (samples[idxSrcFrame])
}
}
}
}
return AudioUnitStatus.NoError;
}
I am facing the problem that if the loudspeaker is enabled via m_AudioSession.OverrideOutputAudioPort(AVAudioSessionPortOverride.Speaker, out error),
then the microphone audio is corrupted (sometimes it is impossible to understand the speech).
If the loudspeaker is NOT enabled (AVAudioSessionPortOverride.Speaker is not set), then the audio is very clean.
I have already verified that the NumberChannels in the AudioBuffer returned by the Render function is 1 (mono audio).
Any hint that helps solve the problem is much appreciated. Thanks.
Update:
The AudioUnit_RenderCallback method is called every 32 ms. When the loudspeaker is disabled, the number of frames received is 256, which is exactly what 32 ms at a sample rate of 8000 Hz should give. When the loudspeaker is enabled, the number of frames received is 85.
In both cases GetAudioFormat returns the expected values: BitsPerChannel=16, BytesPerFrame=2, FramesPerPacket=1, ChannelsPerFrame=1, SampleRate=8000.
Update:
I ended up using the hardware sample rate and performing the downsampling myself. My understanding is that the Audio Unit should be able to perform the downsampling itself (see https://developer.apple.com/library/archive/documentation/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/AudioUnitHostingFundamentals/AudioUnitHostingFundamentals.html#//apple_ref/doc/uid/TP40009492-CH3-SW11), but I was not able to make that work when the loudspeaker was enabled.
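For reference, a minimal sketch of that kind of decimation (illustrative C++ rather than the app's C#; it assumes the hardware delivers 16-bit mono samples at an integer multiple of 8000 Hz, e.g. 48 kHz, and simply averages each group of samples):
#include <cstdint>
#include <cstddef>
// Naive integer-factor downsampling: average each group of 'factor' input
// samples into one output sample. No proper low-pass filter is applied;
// the names and the 48 kHz assumption are illustrative only.
static size_t DownsampleByAveraging(const int16_t* in, size_t inFrames,
                                    int16_t* out, int factor /* e.g. 48000 / 8000 = 6 */)
{
    size_t outFrames = inFrames / factor;
    for (size_t i = 0; i < outFrames; ++i)
    {
        int32_t sum = 0;
        for (int k = 0; k < factor; ++k)
            sum += in[i * factor + k];
        out[i] = static_cast<int16_t>(sum / factor);
    }
    return outFrames;
}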
I hope you are testing this on an actual device and not a simulator.
In the code, have you tried using this:
sampleRate = AudioSession.CurrentHardwareSampleRate;
Rather than forcing the sample rate, it's best to check the sample rate reported by the hardware. It could be that when the loudspeaker is in use, the hardware sample rate changes, which creates the issue.
I would suggest recording with the above change, checking whether the audio improves, and then experimenting with other flags.
Standard recording pattern:
https://learn.microsoft.com/en-us/dotnet/api/audiotoolbox.audiostreambasicdescription?view=xamarin-ios-sdk-12#remarks

Desktop duplication (DirectX) screen capture fails to deliver screen updates

I'm working on an application that captures the screen through the Desktop Duplication API (using DirectX 11), taking only the diff from the previous screen update, and renders it in another window (the viewer might be running on another machine connected via LAN). The code is an improved version of the sample provided on MSDN. Everything works fine, except that the device sometimes delivers no screen update even though there is one; this happens in the middle of a session around 10% of the time on some machines (mostly on Windows 8/8.1 machines and rarely on Windows 10 machines). I have tried every approach I could think of to sort out this problem. Reducing the number of device resets gave me somewhat reliable output, but it still doesn't work 100% of the time.
The device also sometimes fails to provide an initial screen (a full screen); this happens around 60% of the time on all Windows versions where Desktop Duplication is supported. I came up with a workaround that keeps retrying until the device provides an initial update, but that too resulted in multiple issues: the device might never deliver the initial screen at all.
I have already invested weeks of effort in fixing this problem but have not found a proper solution, and I don't know of any forums that discuss this kind of issue. Any help would be appreciated.
Below is my code to get the screen diff from the previous frame, initialize the device, and populate the adapters and monitors.
Please bear with me, it's a very long code snippet. Thanks in advance.
To get the screen update:
INT getChangedRegions(int timeout, rectangles &dirtyRects, std::vector <MOVE_RECT> &moveRects, UINT &rect_count, RECT ScreenRect)
{
UINT diffArea = 0;
FRAME_DATA currentFrameData;
bool isTimeOut = false;
TRY
{
m_LastErrorCode = m_DuplicationManager.GetFrame(&currentFrameData, timeout, &isTimeOut);
if(SUCCEEDED(m_LastErrorCode) && (!isTimeOut))
{
if(currentFrameData.FrameInfo.TotalMetadataBufferSize)
{
m_CurrentFrameTexture = currentFrameData.Frame;
if(currentFrameData.MoveCount)
{
DXGI_OUTDUPL_MOVE_RECT* moveRectArray = reinterpret_cast<DXGI_OUTDUPL_MOVE_RECT*> (currentFrameData.MetaData);
if (moveRectArray)
{
for(UINT index = 0; index < currentFrameData.MoveCount; index++)
{
//WebRTC
// DirectX capturer API may randomly return unmoved move_rects, which should
// be skipped to avoid unnecessary wasting of differing and encoding
// resources.
// By using testing application it2me_standalone_host_main, this check
// reduces average capture time by 0.375% (4.07 -> 4.055), and average
// encode time by 0.313% (8.042 -> 8.016) without other impacts.
if (moveRectArray[index].SourcePoint.x != moveRectArray[index].DestinationRect.left || moveRectArray[index].SourcePoint.y != moveRectArray[index].DestinationRect.top)
{
if(m_UseD3D11BitmapConversion)
{
MOVE_RECT moveRect;
moveRect.SourcePoint.x = moveRectArray[index].SourcePoint.x * m_ImageScalingFactor;
moveRect.SourcePoint.y = moveRectArray[index].SourcePoint.y * m_ImageScalingFactor;
moveRect.DestinationRect.left = moveRectArray[index].DestinationRect.left * m_ImageScalingFactor;
moveRect.DestinationRect.top = moveRectArray[index].DestinationRect.top * m_ImageScalingFactor;
moveRect.DestinationRect.bottom = moveRectArray[index].DestinationRect.bottom * m_ImageScalingFactor;
moveRect.DestinationRect.right = moveRectArray[index].DestinationRect.right * m_ImageScalingFactor;
moveRects.push_back(moveRect);
diffArea += abs((moveRect.DestinationRect.right - moveRect.DestinationRect.left) *
(moveRect.DestinationRect.bottom - moveRect.DestinationRect.top));
}
else
{
moveRects.push_back(moveRectArray[index]);
diffArea += abs((moveRectArray[index].DestinationRect.right - moveRectArray[index].DestinationRect.left) *
(moveRectArray[index].DestinationRect.bottom - moveRectArray[index].DestinationRect.top));
}
}
}
}
else
{
return -1;
}
}
if(currentFrameData.DirtyCount)
{
RECT* dirtyRectArray = reinterpret_cast<RECT*> (currentFrameData.MetaData + (currentFrameData.MoveCount * sizeof(DXGI_OUTDUPL_MOVE_RECT)));
if (!dirtyRectArray)
{
return -1;
}
rect_count = currentFrameData.DirtyCount;
for(UINT index = 0; index < rect_count; index ++)
{
if(m_UseD3D11BitmapConversion)
{
RECT dirtyRect;
dirtyRect.bottom = dirtyRectArray[index].bottom * m_ImageScalingFactor;
dirtyRect.top = dirtyRectArray[index].top * m_ImageScalingFactor;
dirtyRect.left = dirtyRectArray[index].left * m_ImageScalingFactor;
dirtyRect.right = dirtyRectArray[index].right * m_ImageScalingFactor;
diffArea += abs((dirtyRect.right - dirtyRect.left) *
(dirtyRect.bottom - dirtyRect.top));
dirtyRects.push_back(dirtyRect);
}
else
{
diffArea += abs((dirtyRectArray[index].right - dirtyRectArray[index].left) *
(dirtyRectArray[index].bottom - dirtyRectArray[index].top));
dirtyRects.push_back(dirtyRectArray[index]);
}
}
}
}
return diffArea;
}
CATCH_ALL(e)
{
LOG(CRITICAL) << _T("Exception in getChangedRegions");
}
END_CATCH_ALL
return -1;
}
Here is the code to initialize the device:
//
// Initialize duplication interfaces
//
HRESULT cDuplicationManager::InitDupl(_In_ ID3D11Device* Device, _In_ IDXGIAdapter *_pAdapter, _In_ IDXGIOutput *_pOutput, _In_ UINT Output)
{
HRESULT hr = E_FAIL;
if(!_pOutput || !_pAdapter || !Device)
{
return hr;
}
m_OutputNumber = Output;
// Take a reference on the device
m_Device = Device;
m_Device->AddRef();
/*
// Get DXGI device
IDXGIDevice* DxgiDevice = nullptr;
HRESULT hr = m_Device->QueryInterface(__uuidof(IDXGIDevice), reinterpret_cast<void**>(&DxgiDevice));
if (FAILED(hr))
{
return ProcessFailure(nullptr, _T("Failed to QI for DXGI Device"), _T("Error"), hr);
}
// Get DXGI adapter
IDXGIAdapter* DxgiAdapter = nullptr;
hr = DxgiDevice->GetParent(__uuidof(IDXGIAdapter), reinterpret_cast<void**>(&DxgiAdapter));
DxgiDevice->Release();
DxgiDevice = nullptr;
if (FAILED(hr))
{
return ProcessFailure(m_Device, _T("Failed to get parent DXGI Adapter"), _T("Error"), hr);//, SystemTransitionsExpectedErrors);
}
// Get output
IDXGIOutput* DxgiOutput = nullptr;
hr = DxgiAdapter->EnumOutputs(Output, &DxgiOutput);
DxgiAdapter->Release();
DxgiAdapter = nullptr;
if (FAILED(hr))
{
return ProcessFailure(m_Device, _T("Failed to get specified output in DUPLICATIONMANAGER"), _T("Error"), hr);//, EnumOutputsExpectedErrors);
}
DxgiOutput->GetDesc(&m_OutputDesc);
IDXGIOutput1* DxgiOutput1 = nullptr;
hr = DxgiOutput->QueryInterface(__uuidof(DxgiOutput1), reinterpret_cast<void**>(&DxgiOutput1));
*/
_pOutput->GetDesc(&m_OutputDesc);
// QI for Output 1
IDXGIOutput1* DxgiOutput1 = nullptr;
hr = _pOutput->QueryInterface(__uuidof(DxgiOutput1), reinterpret_cast<void**>(&DxgiOutput1));
if (FAILED(hr))
{
return ProcessFailure(nullptr, _T("Failed to QI for DxgiOutput1 in DUPLICATIONMANAGER"), _T("Error"), hr);
}
// Create desktop duplication
hr = DxgiOutput1->DuplicateOutput(m_Device, &m_DeskDupl);
DxgiOutput1->Release();
DxgiOutput1 = nullptr;
if (FAILED(hr) || !m_DeskDupl)
{
if (hr == DXGI_ERROR_NOT_CURRENTLY_AVAILABLE)
{
return ProcessFailure(nullptr, _T("Maximum number of applications using Desktop Duplication API"), _T("Error"), hr);
}
return ProcessFailure(m_Device, _T("Failed to get duplicate output in DUPLICATIONMANAGER"), _T("Error"), hr);//, CreateDuplicationExpectedErrors);
}
return S_OK;
}
Finally, to get the current frame and its difference from the previous one:
//
// Get next frame and write it into Data
//
_Success_(*Timeout == false && return == DUPL_RETURN_SUCCESS)
HRESULT cDuplicationManager::GetFrame(_Out_ FRAME_DATA* Data, int timeout, _Out_ bool* Timeout)
{
IDXGIResource* DesktopResource = nullptr;
DXGI_OUTDUPL_FRAME_INFO FrameInfo;
try
{
// Get new frame
HRESULT hr = m_DeskDupl->AcquireNextFrame(timeout, &FrameInfo, &DesktopResource);
if (hr == DXGI_ERROR_WAIT_TIMEOUT)
{
*Timeout = true;
return S_OK;
}
*Timeout = false;
if (FAILED(hr))
{
return ProcessFailure(m_Device, _T("Failed to acquire next frame in DUPLICATIONMANAGER"), _T("Error"), hr);//, FrameInfoExpectedErrors);
}
// If still holding old frame, destroy it
if (m_AcquiredDesktopImage)
{
m_AcquiredDesktopImage->Release();
m_AcquiredDesktopImage = nullptr;
}
if (DesktopResource)
{
// QI for IDXGIResource
hr = DesktopResource->QueryInterface(__uuidof(ID3D11Texture2D), reinterpret_cast<void **>(&m_AcquiredDesktopImage));
DesktopResource->Release();
DesktopResource = nullptr;
}
if (FAILED(hr))
{
return ProcessFailure(nullptr, _T("Failed to QI for ID3D11Texture2D from acquired IDXGIResource in DUPLICATIONMANAGER"), _T("Error"), hr);
}
// Get metadata
if (FrameInfo.TotalMetadataBufferSize)
{
// Old buffer too small
if (FrameInfo.TotalMetadataBufferSize > m_MetaDataSize)
{
if (m_MetaDataBuffer)
{
delete [] m_MetaDataBuffer;
m_MetaDataBuffer = nullptr;
}
m_MetaDataBuffer = new (std::nothrow) BYTE[FrameInfo.TotalMetadataBufferSize];
if (!m_MetaDataBuffer)
{
m_MetaDataSize = 0;
Data->MoveCount = 0;
Data->DirtyCount = 0;
return ProcessFailure(nullptr, _T("Failed to allocate memory for metadata in DUPLICATIONMANAGER"), _T("Error"), E_OUTOFMEMORY);
}
m_MetaDataSize = FrameInfo.TotalMetadataBufferSize;
}
UINT BufSize = FrameInfo.TotalMetadataBufferSize;
// Get move rectangles
hr = m_DeskDupl->GetFrameMoveRects(BufSize, reinterpret_cast<DXGI_OUTDUPL_MOVE_RECT*>(m_MetaDataBuffer), &BufSize);
if (FAILED(hr))
{
Data->MoveCount = 0;
Data->DirtyCount = 0;
return ProcessFailure(nullptr, L"Failed to get frame move rects in DUPLICATIONMANAGER", L"Error", hr);//, FrameInfoExpectedErrors);
}
Data->MoveCount = BufSize / sizeof(DXGI_OUTDUPL_MOVE_RECT);
BYTE* DirtyRects = m_MetaDataBuffer + BufSize;
BufSize = FrameInfo.TotalMetadataBufferSize - BufSize;
// Get dirty rectangles
hr = m_DeskDupl->GetFrameDirtyRects(BufSize, reinterpret_cast<RECT*>(DirtyRects), &BufSize);
if (FAILED(hr))
{
Data->MoveCount = 0;
Data->DirtyCount = 0;
return ProcessFailure(nullptr, _T("Failed to get frame dirty rects in DUPLICATIONMANAGER"), _T("Error"), hr);//, FrameInfoExpectedErrors);
}
Data->DirtyCount = BufSize / sizeof(RECT);
Data->MetaData = m_MetaDataBuffer;
}
Data->Frame = m_AcquiredDesktopImage;
Data->FrameInfo = FrameInfo;
}
catch (...)
{
return S_FALSE;
}
return S_OK;
}
Update:
"Failed to acquire next frame in DUPLICATIONMANAGER" is printed whenever the device hangs (that is, in the middle of streaming the screen, e.g. while continuously capturing a video and sending it to the other end):
// Get new frame
HRESULT hr = m_DeskDupl->AcquireNextFrame(timeout, &FrameInfo, &DesktopResource);
if (hr == DXGI_ERROR_WAIT_TIMEOUT)
{
*Timeout = true;
return S_OK;
}
*Timeout = false;
if (FAILED(hr))
{
return ProcessFailure(m_Device, _T("Failed to acquire next frame in DUPLICATIONMANAGER"), _T("Error"), hr);//, FrameInfoExpectedErrors);
}
Here is the detailed error info:
Id3d11DuplicationManager::ProcessFailure - Error: Failed to acquire next frame in DUPLICATIONMANAGER, Detail: The keyed mutex was abandoned.
Update 2:
I have now got the error code for the case where the device fails to deliver screen updates forever. Here it is:
Id3d11DuplicationManager::ProcessFailure - Error: Failed to get duplicate output in DUPLICATIONMANAGER, Detail: Access is denied.
The error code is E_ACCESSDENIED.
I do not understand why I am getting this error, as I am already running as SYSTEM and SetThreadDesktop has been executed twice (once during init and again after detecting a failure).
This is the explanation of the error on MSDN: E_ACCESSDENIED if the application does not have access privilege to the current desktop image. For example, only an application that runs at LOCAL_SYSTEM can access the secure desktop.
Is there anything else that would result in this kind of issue?
It's always good to check the return codes and immediately fall back to GDI or any other available screen-capturing approach in case of non-recoverable errors. Retrying doesn't work most of the time for certain hardware errors such as the maximum application limit being reached, out of memory, or device removed; I learned that the hard way. Furthermore, on rare occasions the DirectX device takes a few iterations before producing an initial frame. It isn't useful to retry more than about 10 times; you can safely fall back, or try re-initializing the device to check one more time before falling back.
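For illustration, a bounded retry policy along those lines might look like this (getChangedRegions is the function posted above; reinitializeDuplication and fallBackToGdiCapture are hypothetical helpers, and the surrounding variables are assumed to be in scope):
// Poll for a frame a handful of times, re-initialize the duplication once,
// then give up on DXGI and switch to a fallback capturer.
const int kMaxRetries = 10;   // retrying beyond this rarely helps in practice
INT diffArea = -1;
for (int attempt = 0; attempt < kMaxRetries && diffArea < 0; ++attempt)
{
    diffArea = getChangedRegions(timeout, dirtyRects, moveRects, rect_count, screenRect);
}
if (diffArea < 0)
{
    // One re-initialization attempt (hypothetical helper), then fall back.
    if (!reinitializeDuplication() ||
        getChangedRegions(timeout, dirtyRects, moveRects, rect_count, screenRect) < 0)
    {
        fallBackToGdiCapture();   // hypothetical GDI/BitBlt-based fallback
    }
}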
Here are some basic checks to do:
Handle DXGI_ERROR_NOT_CURRENTLY_AVAILABLE error:
_pOutput->GetDesc(&m_OutputDesc);
// QI for Output 1
IDXGIOutput1* DxgiOutput1 = nullptr;
hr = _pOutput->QueryInterface(__uuidof(DxgiOutput1), reinterpret_cast<void**>(&DxgiOutput1));
if (FAILED(hr))
{
return ProcessFailure(nullptr, _T("Failed to QI for DxgiOutput1 in DUPLICATIONMANAGER"), _T("Error"), hr);
}
// Create desktop duplication
hr = DxgiOutput1->DuplicateOutput(m_Device, &m_DeskDupl);
DxgiOutput1->Release();
DxgiOutput1 = nullptr;
if (FAILED(hr) || !m_DeskDupl)
{
if (hr == DXGI_ERROR_NOT_CURRENTLY_AVAILABLE)
{
return ProcessFailure(nullptr, _T("Maximum number of applications using Desktop Duplication API"), _T("Error"), hr);
}
return ProcessFailure(m_Device, _T("Failed to get duplicate output in DUPLICATIONMANAGER"), _T("Error"), hr);//, CreateDuplicationExpectedErrors);
}
Check for device removed (DXGI_ERROR_DEVICE_REMOVED), device reset (DXGI_ERROR_DEVICE_RESET) and out-of-memory (E_OUTOFMEMORY) error codes (I have received E_OUTOFMEMORY sometimes, though it's uncommon):
HRESULT ProcessFailure(_In_opt_ ID3D11Device* Device, _In_ LPCWSTR Str, _In_ LPCWSTR Title, HRESULT hr)//, _In_opt_z_ HRESULT* ExpectedErrors = NULL)
{
HRESULT TranslatedHr;
// On an error check if the DX device is lost
if (Device)
{
HRESULT DeviceRemovedReason = Device->GetDeviceRemovedReason();
switch (DeviceRemovedReason)
{
case DXGI_ERROR_DEVICE_REMOVED:
case DXGI_ERROR_DEVICE_RESET:
case static_cast<HRESULT>(E_OUTOFMEMORY) :
{
// Our device has been stopped due to an external event on the GPU so map them all to
// device removed and continue processing the condition
TranslatedHr = DXGI_ERROR_DEVICE_REMOVED;
break;
}
case S_OK:
{
// Device is not removed so use original error
TranslatedHr = hr;
break;
}
default:
{
// Device is removed but not an error we want to remap
TranslatedHr = DeviceRemovedReason;
}
}
}
else
{
TranslatedHr = hr;
}
_com_error err(TranslatedHr);
LPCTSTR errMsg = err.ErrorMessage();
return TranslatedHr;
}
Furthermore, Desktop duplication requires a real graphics device to be active in order to work. You may get E_ACCESSDENIED otherwise.
There are also other scenarios where you may get this error, such as desktop switch cases or an abandoned keyed mutex. You can try reinitializing the device in such cases.
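For example, a sketch of such a recovery path (TryRecoverDuplication is a hypothetical helper, not part of the posted code; it mirrors the SetThreadDesktop handling in the MSDN sample):
// Re-attach the capture thread to the desktop that currently receives input
// (lock screen, UAC secure desktop, another session) and rebuild the
// duplication interface before giving up on DXGI capture.
bool TryRecoverDuplication(cDuplicationManager& dupMgr,
                           ID3D11Device* pDevice,
                           IDXGIAdapter* pAdapter,
                           IDXGIOutput* pOutput,
                           UINT outputIndex)
{
    HDESK hDesk = OpenInputDesktop(0, FALSE, GENERIC_ALL);
    if (hDesk)
    {
        // SetThreadDesktop fails if this thread owns windows or hooks on the
        // current desktop, so the capture thread must stay window-free.
        SetThreadDesktop(hDesk);
        CloseDesktop(hDesk);
    }
    // Assumes the manager releases any previous IDXGIOutputDuplication before
    // creating a new one (the posted InitDupl does not show that teardown).
    return SUCCEEDED(dupMgr.InitDupl(pDevice, pAdapter, pOutput, outputIndex));
}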
I have also uploaded my sample project here.

No sound output with WASAPI

I am having trouble with WASAPI. It does not output any sound, even though I have checked the data being written to the buffer.
Because it does not output any sound, I have no idea how to track down the problem.
There may be a problem somewhere in the following code.
SoundStream::SoundStream() : writtenCursor(0), writeCursor(0), distroy(false)
{
IMMDeviceEnumerator * pEnumerator = nullptr;
HResult(CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL, IID_PPV_ARGS(&pEnumerator)));
IMMDevice * pDevice = nullptr;
HResult(pEnumerator->GetDefaultAudioEndpoint(eRender, eMultimedia, &pDevice));
SafeRelease(&pEnumerator);
HResult(pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pAudioClient));
SafeRelease(&pDevice);
WAVEFORMATEXTENSIBLE * pwfx = nullptr;
hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
REFERENCE_TIME hnsRequestedDuration = REFTIMES_PER_SEC * 2;
HResult(pAudioClient->GetMixFormat((WAVEFORMATEX**)&pwfx));
HResult(pAudioClient->Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
hnsRequestedDuration,
0,
(WAVEFORMATEX*)pwfx,
NULL));
pAudioClient->SetEventHandle(hEvent);
channel = (size_t)pwfx->Format.nChannels;
bits = (size_t)pwfx->Format.wBitsPerSample;
validBits = (size_t)pwfx->Samples.wValidBitsPerSample;
frequency = (size_t)pwfx->Format.nSamplesPerSec;
buffer.reshape({ 0, channel, bits >> 3 });
CoTaskMemFree(pwfx);
HResult(pAudioClient->GetBufferSize(&bufferFrameCount));
HResult(pAudioClient->Start());
if (pAudioClient)
{
thread = std::thread([&]()
{
this->Sync();
});
}
}
You could look at my WASAPI.cpp code at http://jdmcox.com (which works fine).
You should also check if the expected wave format is float:
//SubFormat 00000003-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_IEEE_FLOAT
//SubFormat 00000001-0000-0010-8000-00aa00389b71 defines KSDATAFORMAT_SUBTYPE_PCM
GUID G;
WORD V;
WAVEFORMATEX *pwfx = NULL;
bool itsfloat;
pAudioClient->GetMixFormat(&pwfx);
// Did we receive a WAVEFORMATEXTENSIBLE?
if(pwfx->cbSize >= 22) {
G = ((WAVEFORMATEXTENSIBLE*)pwfx)->SubFormat;
V = ((WAVEFORMATEXTENSIBLE*)pwfx)->Samples.wValidBitsPerSample;
if (G.Data1 == 3) itsfloat = true;
else if (G.Data1 == 1) itsfloat = false;
}
You know you received a WAVEFORMATEXTENSIBLE and not a simple WAVEFORMATEX because pwfx->cbSize >= 22.
See more at:
IAudioClient::GetMixFormat
https://learn.microsoft.com/en-us/windows/win32/api/audioclient/nf-audioclient-iaudioclient-getmixformat
WAVEFORMATEXTENSIBLE
https://learn.microsoft.com/en-us/windows/win32/api/mmreg/ns-mmreg-waveformatextensible
You could look at my WASAPI.cpp code at http://jdmcox.com again; it now works in shared mode as well as exclusive mode.
I should note that no conversion of the wave format or the wave data is necessary in shared mode -- Windows takes care of converting both to and from the format it uses to mix waves.
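The event-driven render loop that the constructor in the question hands off to (this->Sync()) is not shown; a minimal sketch of what such a shared-mode loop usually looks like is below (FillWithAudio is a placeholder, and the mix format is assumed to be 32-bit float as discussed above):
// Event-driven shared-mode render loop (sketch). Assumes pAudioClient was
// initialized with AUDCLNT_STREAMFLAGS_EVENTCALLBACK and SetEventHandle(hEvent)
// was called before Start(), as in the constructor above.
IAudioRenderClient* pRenderClient = nullptr;
HResult(pAudioClient->GetService(__uuidof(IAudioRenderClient), (void**)&pRenderClient));
UINT32 bufferFrames = 0;
HResult(pAudioClient->GetBufferSize(&bufferFrames));
while (!distroy)                        // member flag from the constructor above
{
    WaitForSingleObject(hEvent, 2000);  // wake up when the device wants more data
    UINT32 padding = 0;
    HResult(pAudioClient->GetCurrentPadding(&padding));
    UINT32 framesToWrite = bufferFrames - padding;
    if (framesToWrite == 0)
        continue;
    BYTE* pData = nullptr;
    HResult(pRenderClient->GetBuffer(framesToWrite, &pData));
    // FillWithAudio is a placeholder: write framesToWrite interleaved float
    // frames ('channel' samples per frame) into pData here.
    FillWithAudio(reinterpret_cast<float*>(pData), framesToWrite, channel);
    HResult(pRenderClient->ReleaseBuffer(framesToWrite, 0));
}
SafeRelease(&pRenderClient);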

How to monitor playback on ALSA via asoundlib?

I'm building an application that allows for ALSA configuration, and in the GUI there is a peak meter that lets the client see playback levels in real time. I'm having a hard time determining which device to connect to, because I don't know whether ALSA has a default "loopback" device or not and, if so, what it's called. I am also having trouble converting the read data into a sample and then finding that sample's amplitude. Here is what I have built so far:
Grab device and set hardware params
if (0 == snd_pcm_open(&pPcm, "default", SND_PCM_STREAM_CAPTURE, SND_PCM_NONBLOCK))
{
if (0 == snd_pcm_set_params(pPcm, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED, 1, 96000, 1, 1)) // This last argument confuses me because I'm not given a unit of measurement (second, millisecond, microsecond, etc.)
{
return snd_pcm_start(pPcm);
}
}
pPcm = nullptr;
return -1;
Read from the device and return the peak of the audio signal
int iRtn = -1;
if (nullptr == pPcm)
{
if (-1 == SetupListener())
{
return iRtn;
}
}
// Check to make sure the state is sane for reading.
if (SND_PCM_STATE_PREPARED == snd_pcm_state(pPcm) ||
SND_PCM_STATE_RUNNING == snd_pcm_state(pPcm))
{
snd_pcm_resume(pPcm); // This might be superfluous.
// The state is sane, read from the stream.
signed short iBuffer = 0;
int iNumRead = snd_pcm_readi(pPcm, &iBuffer, 1);
if (0 < iNumRead)
{
// This calculates an approximation.
// We have some audio data; compute its peak in dB (decibels).
float nSample = static_cast<float>(iBuffer);
float nAmplitude = nSample / MAX_AMPLITUDE_S16; // MAX_AMPLITUDE_S16 is defined as "32767"
float nDecibels = (0 < nAmplitude) ? 20 * log10(nAmplitude) : 0;
iRtn = static_cast<int>(nDecibels); // Cast to integer for GUI element.
}
}
return iRtn;
The ALSA documentation seems very barren, so I apologize if I'm misusing the API.
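For reference, a sketch of the direction I'm heading in: read a whole block of frames instead of a single one and report the block's peak in dBFS (this assumes the S16_LE mono setup above; whether "default" actually exposes a monitor of what is being played is unclear to me, and it may require a loopback PCM such as the snd-aloop device):
#include <alsa/asoundlib.h>
#include <cmath>
#include <cstdlib>
// Read one block of frames and return its peak level in dB relative to full
// scale (0 dB = full scale, negative below). Assumes S16_LE, 1 channel.
static int ReadPeakDb(snd_pcm_t* pPcm)
{
    short buffer[512] = {0};
    snd_pcm_sframes_t frames = snd_pcm_readi(pPcm, buffer, 512);
    if (frames <= 0)
        return -96; // nothing read (e.g. -EAGAIN in non-blocking mode) or an error
    // Find the largest absolute sample in the block.
    int peak = 0;
    for (snd_pcm_sframes_t i = 0; i < frames; ++i)
    {
        int sample = std::abs(static_cast<int>(buffer[i]));
        if (sample > peak)
            peak = sample;
    }
    float amplitude = static_cast<float>(peak) / 32767.0f; // MAX_AMPLITUDE_S16
    float decibels = (amplitude > 0.0f) ? 20.0f * std::log10(amplitude) : -96.0f;
    return static_cast<int>(decibels);
}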

Resources