Retrieving D3D12 Video Decode Status - direct3d

I am implementing a H264 video decoder through Direct3D 12 APIs - while I am very new to Direct3D I do have experience with other graphics APIs and H264. I have been struggling to find decent examples of D3D12 video decoding but have got as far as seemingly successfully submitting my work to the decoder.
However, I am at a loss with how to query the decode status. From piecing together the documentation and some other code I found I think it's something like this - mapping the result into the status struct - but I get an invalid argument error. Could anyone point me in the right direction, and any good examples of D3D12 video decoding would be a great resource.
// decode_commands is a ID3D12VideoDecodeCommandList
// query_heap is a ID3D12QueryHeap
// device is a ID3D12Device
// Make query for decode stats.
decode_commands->EndQuery(query_heap.Get(), D3D12_QUERY_TYPE::D3D12_QUERY_TYPE_VIDEO_DECODE_STATISTICS, 0);
// Create buffer for query result.
ComPtr<ID3D12Resource> query_result;
D3D12_RESOURCE_DESC query_result_description = CD3DX12_RESOURCE_DESC::Buffer(sizeof(D3D12_QUERY_DATA_VIDEO_DECODE_STATISTICS));
HRESULT create_query_result = device->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE::D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAGS::D3D12_HEAP_FLAG_NONE,
&query_result_description,
D3D12_RESOURCE_STATES::D3D12_RESOURCE_STATE_COPY_DEST,
nullptr,
IID_PPV_ARGS(&query_result));
if (FAILED(create_query_result)) {
log("Failed to create query result");
return false;
}
// Resolve query.
decode_commands->ResolveQueryData(query_heap.Get(), D3D12_QUERY_TYPE::D3D12_QUERY_TYPE_VIDEO_DECODE_STATISTICS, 0, 1, query_result.Get(), 0);
// Get stats from query result.
D3D12_RANGE range;
range.Begin = 0;
range.End = sizeof(D3D12_QUERY_DATA_VIDEO_DECODE_STATISTICS);
D3D12_QUERY_DATA_VIDEO_DECODE_STATISTICS stats;
HRESULT map = query_result->Map(0, &range, reinterpret_cast<void**>(&stats));
if (FAILED(map)) {
log("Failed to map query result");
return false;
}

Perhaps you should begin with a working sample from D3D9 API :
H264Dxva2Decoder
Than, you write a D3D12 decoder, that's what i would do.

Related

Decode streaming audio with gstreamer 1.0 and access the waveform data?

The actual gst version is 1.8.1.
Currently I have code that receives a gstreamer encoded stream and plays it through my soundcard. I want to modify it to instead give my application access to the raw un-compressed audio data. This should result in an array of integer sound samples, and if I were to plot them I would see the audio wave form (e.g. a perfect tone would be a nice sine wave), and if I were to append the most recent array to the last one received by a callback I wouldn't see any discontinuity.
This is the current playback code:
https://github.com/lucasw/audio_common/blob/master/audio_play/src/audio_play.cpp
I think I need to change the alsasink to an appsink, and setting up a callback that will get the latest chunk of audio after it has passed through the decoder. This is adapted from https://github.com/jojva/gst-plugins-base/blob/master/tests/examples/app/appsink-src.c :
_sink = gst_element_factory_make("appsink", "sink");
g_object_set (G_OBJECT (_sink), "emit-signals", TRUE,
"sync", FALSE, NULL);
g_signal_connect (_sink, "new-sample",
G_CALLBACK (on_new_sample_from_sink), this);
And then there is the callback:
static GstFlowReturn
on_new_sample_from_sink (GstElement * elt, gpointer data)
{
RosGstProcess *client = reinterpret_cast<RosGstProcess*>(data);
GstSample *sample;
GstBuffer *app_buffer, *buffer;
GstElement *source;
/* get the sample from appsink */
sample = gst_app_sink_pull_sample (GST_APP_SINK (elt));
buffer = gst_sample_get_buffer (sample);
/* make a copy */
app_buffer = gst_buffer_copy (buffer);
/* we don't need the appsink sample anymore */
gst_sample_unref (sample);
/* get source and push new buffer */
source = gst_bin_get_by_name (GST_BIN (client->_sink), "app_source");
return gst_app_src_push_buffer (GST_APP_SRC (source), app_buffer);
}
Can I get at the data in that callback? What am I supposed to do with the GstFlowReturn? If that is passing data to another pipeline element I don't want to do that, I'd rather get it there and be done.
https://github.com/lucasw/audio_common/blob/appsink/audio_process/src/audio_process.cpp
Is the gpointer data passed to that callback exactly what I want (cast to a gint16 array?), or otherwise how do I convert and access it?
The GstFlowReturn is merely a return value for the underlying base classes. If you would return an error there the pipeline probably stops because.. well there was a critical error.
The cb_need_data events are triggered by your appsrc element. This can be used as a throttling mechanism if needed. Since you probably use the appsrc in a pure push mode (as soon something arrives at the appsink you push it to the appsrc) you can ignore these. You also explicitly disable these events on the appsrc element. (Or do you still use the one?)
The data format in the buffer depends on the caps that the decoder and appsink agreed on. That is usually the decoder preferred format. You may have some control over this format depending on the decoder or convert it to your preferred format. May be worthwhile to check the format, Float32 is not that uncommon..
I kind of forgot what your actual question was, I'm afraid..
I can interpret the data out of the modified callback below (there is a script that plots it to the screen), it looks like it is signed 16-bit samples in the uint8 array.
I'm not clear about the proper return value for the callback, there is a cb_need_data callback setup elsewhere in the code that is getting triggered all the time with this code.
static void // GstFlowReturn
on_new_sample_from_sink (GstElement * elt, gpointer data)
{
RosGstProcess *client = reinterpret_cast<RosGstProcess*>(data);
GstSample *sample;
GstBuffer *buffer;
GstElement *source;
/* get the sample from appsink */
sample = gst_app_sink_pull_sample (GST_APP_SINK (elt));
buffer = gst_sample_get_buffer (sample);
GstMapInfo map;
if (gst_buffer_map (buffer, &map, GST_MAP_READ))
{
audio_common_msgs::AudioData msg;
msg.data.resize(map.size);
// TODO(lucasw) copy this more efficiently
for (size_t i = 0; i < map.size; ++i)
{
msg.data[i] = map.data[i];
}
gst_buffer_unmap (buffer, &map);
client->_pub.publish(msg);
}
}
https://github.com/lucasw/audio_common/tree/appsink

Xively and Node-Red

I'm fairly new at all this but have muddled my way to getting my Arduino to post values to a Xively stream I've named "Lux and Temp." Three values; count, lux, and temperature.
Now what I want to do is take those values and do something with them using Node-Red.
http://nodered.org
I have Node-Red up and running, but I'll be danged if I can figure out how to parse the data from the Xively API feed. https://api.xively.com/v2/feeds/1823833592
Sadly, I don't have enough reputation points to be able to actually post the data it returns to here, since it has more than 3 URLs embedded in the data. It's a LONG string of data too. ;)
I'm just stumped though as to how to write the function to extract the parts I want.
My initial want is to make a simple Twitter feed out of it. Something like;
"Count 40, Lux 30, Temp 78.3"
I'll eventually want to recycle the code for other things like making my RasPi do something; maybe a display or some LEDs. In either case I need to parse the data and build various messages with it.
Anybody have any experience with the Node-Red functions that can walk me through a solution? The Node-Red site is pretty awesome, but I think it assumes I'm a MUCH more experienced user than I really am. It gives hints, but frankly about all I know is fairly basic Arduino and trivial level Python.
OK, it shouldn't be too tricky, but try putting this in a function block:
var newPayload = "";
var count, lux, temp;
var data = msg.payload.datastreams;
for (var i = 0 ; i< data.length; i++) {
if (data[i].id === 'Count') {
count = data[i].current_value;
}else if (data[i].id === 'Lux') {
lux = data[i].current_value;
} else if (data[i].id === 'Temp') {
temp = data[i].current_value;
}
}
newPayload = util.format('Count: %s, Lux: %s, Temp: %s', count, lux, temp);
msg.payload = newPayload;
return msg;
You may need to add a msg.payload = JSON.parse(msg.payload); to the start if however your getting the feed from xively is not already considered to be json.
[edit]
You could also just run the flow through a JSON parse node. (I always forget the converter nodes)
You should be able to wire that to a twitter output node.

Libav and xaudio2 - audio not playing

I am trying to get audio playing with libav using xaudio2. The xaudio2 code I am using works with an older ffmpeg using avcodec_decode_audio2, but that has been deprecated for avcodec_decode_audio4. I have tried following various libav examples, but can't seem to get the audio to play. Video plays fine (or rather it just plays right fast now, as I haven't coded any sync code yet).
Firstly audio gets init, no errors, video gets init, then packet:
while (1) {
//is this packet from the video or audio stream?
if (packet.stream_index == player.v_id) {
add_video_to_queue(&packet);
} else if (packet.stream_index == player.a_id) {
add_sound_to_queue(&packet);
} else {
av_free_packet(&packet);
}
}
Then in add_sound_to_queue:
int add_sound_to_queue(AVPacket * packet) {
AVFrame *decoded_frame = NULL;
int done = AVCODEC_MAX_AUDIO_FRAME_SIZE;
int got_frame = 0;
if (!decoded_frame) {
if (!(decoded_frame = avcodec_alloc_frame())) {
printf("[ADD_SOUND_TO_QUEUE] Out of memory\n");
return -1;
}
} else {
avcodec_get_frame_defaults(decoded_frame);
}
if (avcodec_decode_audio4(player.av_acodecctx, decoded_frame, &got_frame, packet) < 0) {
printf("[ADD_SOUND_TO_QUEUE] Error in decoding audio\n");
av_free_packet(packet);
//continue;
return -1;
}
if (got_frame) {
int data_size;
if (packet->size > done) {
data_size = done;
} else {
data_size = packet->size;
}
BYTE * snd = (BYTE *)malloc( data_size * sizeof(BYTE));
XMemCpy(snd,
AudioBytes,
data_size * sizeof(BYTE)
);
XMemSet(&g_SoundBuffer,0,sizeof(XAUDIO2_BUFFER));
g_SoundBuffer.AudioBytes = data_size;
g_SoundBuffer.pAudioData = snd;
g_SoundBuffer.pContext = (VOID*)snd;
XAUDIO2_VOICE_STATE state;
while( g_pSourceVoice->GetState( &state ), state.BuffersQueued > 60 ) {
WaitForSingleObject( XAudio2_Notifier.hBufferEndEvent, INFINITE );
}
g_pSourceVoice->SubmitSourceBuffer( &g_SoundBuffer );
}
return 0;
}
I can't seem to figure out the problem, I have added error messages in init, opening video, codec handling etc. As mentioned before the xaudio2 code is working with an older ffmpeg, so maybe I have missed something with the avcodec_decode_audio4?
If this snappet of code isn't enough, I can post the whole code, these are just the places in the code I think the problem would be :(
I don't see you accessing decoded_frame anywhere after decoding. How do you expect to get the data out otherwise?
BYTE * snd = (BYTE *)malloc( data_size * sizeof(BYTE));
This also looks very fishy, given that data_size is derived from the packet size. The packet size is the size of the compressed data, it has very little to do with the size of the decoded PCM frame.
The decoded data is located in decoded_frame->extended_data, which is an array of pointers to data planes, see here for details. The size of the decoded data is determined by decoded_frame->nb_samples. Note that with recent Libav versions, many decoders return planar audio, so different channels live in different data buffers. For many use cases you need to convert that to interleaved format, where there's just one buffer with all the channels. Use libavresample for that.

Default Audio Output - Getting Device Changed Notification? (CoreAudio, Mac OS X, AudioHardwareAddPropertyListener)

I am trying to write a listener using the CoreAudio API for when the default audio output is changed (e.g.: a headphone jack is plugged in). I found sample code, although a bit old and using deprecated functions (http://developer.apple.com/mac/library/samplecode/AudioDeviceNotify/Introduction/Intro.html, but it didn't work. Re-wrote the code in the 'correct' way using AudioHardwareAddPropertyListener method, but still it doesn't seem to work. When I plug in a headphone the function that I registered is not triggered. I'm a bit of a loss here... I suspect the problem may lay some where else, but I can't figure out where...
The Listener Registration Code:
OSStatus err = noErr;
AudioObjectPropertyAddress audioDevicesAddress = { kAudioHardwarePropertyDefaultOutputDevice, KAudioObjectPropertyScopeGlobal, KAudioObjectPropertyElementMaster };
err = AudioObjectAddPropertyListener ( KAudioObjectAudioSystemObject, &AudioDevicesAddress, coreaudio_property_listener, NULL);
if (err) trace ("error on AudioObjectAddPropertyListener");
After a search in sourceforge for projects that used the CoreAudio API, I found the rtaudio project, and more importantly these lines:
// This is a largely undocumented but absolutely necessary
// requirement starting with OS-X 10.6. If not called, queries and
// updates to various audio device properties are not handled
// correctly.
CFRunLoopRef theRunLoop = NULL;
AudioObjectPropertyAddress property = { kAudioHardwarePropertyRunLoop,
kAudioObjectPropertyScopeGlobal,
kAudioObjectPropertyElementMaster };
OSStatus result = AudioObjectSetPropertyData( kAudioObjectSystemObject, &property, 0, NULL, sizeof(CFRunLoopRef), &theRunLoop);
if ( result != noErr ) {
errorText_ = "RtApiCore::RtApiCore: error setting run loop property!";
error( RtError::WARNING );
}
After adding this code I didn't even need to register a listener myself.
Try CFRunLoopRun() - it has the same effect. i.e. making sure the event loop that is calling your listener is running.

C++ Microsoft SAPI: How to set Windows text-to-speech output to a memory buffer?

I have been trying to figure out how to "speak" a text into a memory buffer using Windows SAPI 5.1 but so far no success, even though it seems it should be quite simple.
There is an example of streaming the synthesized speech into a .wav file, but no examples of how to stream it to a memory buffer.
In the end I need to have the synthesized speech in a char* array in 16 kHz 16-bit little-endian PCM format. Currently I create a temp .wav file, redirect speech output there, then read it, but it seems to be a rather stupid solution.
Anyone knows how to do that?
Thanks!
Look at ISpStream::SetBaseStream. Here's a little helper:
inline HRESULT SPCreateStreamOnHGlobal(
HGLOBAL hGlobal, //Memory handle for the stream object
BOOL fDeleteOnRelease, //Whether to free memory when the object is released
const WAVEFORMATEX * pwfex, //WaveFormatEx for stream
ISpStream ** ppStream) //Address of variable to receive ISpStream pointer
{
HRESULT hr;
IStream * pMemStream;
*ppStream = NULL;
hr = ::CreateStreamOnHGlobal(hGlobal, fDeleteOnRelease, &pMemStream);
if (SUCCEEDED(hr))
{
hr = ::CoCreateInstance(CLSID_SpStream, NULL, CLSCTX_ALL, __uuidof(*ppStream), (void **)ppStream);
if (SUCCEEDED(hr))
{
hr = (*ppStream)->SetBaseStream(pMemStream, SPDFID_WaveFormatEx, pwfex);
if (FAILED(hr))
{
(*ppStream)->Release();
*ppStream = NULL;
}
}
pMemStream->Release();
}
return hr;
}
I accomplished it using the ISpStream. Use Setbasestream function of the ispstream to bind it to an istream and then set the output of ispvoice to that ispstream.
Here is my working solution if anybody wants it :
https://github.com/itsyash/MS-SAPI-demo
Do you know how to create a memory-mapped file? You could see if the ISpStream will bind to it.

Resources