Efficiently rendering video obtained from Media Foundation in Direct3D - multithreading

I want to efficiently use live video that I am decoding from Media Foundation.
Originally, I ran the render functions synchronously after decoding each frame. The incoming frame rate is around 25-30 fps, but I would like to render the graphics (game) content at 60 fps.
If I do it asynchronously, I either get corrupted output, black screens, or both, or a very low frame rate due to aggressive locking. Since the GPU operations are asynchronous, I haven't been able to find a reasonable critical section. How is this normally done? I can use one of my temporary surfaces (source, dest, or g_pDecodedTexture) as a synchronization point and surround writes to them with a CRITICAL_SECTION, but I don't know where the critical section should go on the render (reading) thread. If I surround the whole render function, my frame rate is very low, and if I don't, I get incorrect output. Maybe there is another, more appropriate method of synchronization.
At render setup time
hr = g_d3dDevice->CreateShaderResourceView(g_pDecodedTexture, &shaderResourceViewDesc, &g_pTextureRV);
In the decode thread
void Decode()
{
    MFT_OUTPUT_DATA_BUFFER output = { 0 };
    //...
    decoder->ProcessOutput(0, 1, &output, &status);
    //
    CComPtr<IMFMediaBuffer> spMediaBuffer;
    CComPtr<IMFDXGIBuffer> spDXGIBuffer;
    CComPtr<IDXGIResource> spDecodedTexture;
    output.pSample->GetBufferByIndex(0, &spMediaBuffer);
    spMediaBuffer->QueryInterface(IID_PPV_ARGS(&spDXGIBuffer));
    spDXGIBuffer->GetResource(IID_PPV_ARGS(&spDecodedTexture));
    //....
    CComPtr<ID3D11Texture2D> source;
    spDecodedTexture->QueryInterface<ID3D11Texture2D>(&source); // get the texture interface from the DXGI resource
    //
    CComPtr<ID3D11Resource> dest;
    swapChain->GetBuffer(0, __uuidof(ID3D11Resource), (void**)&dest);
    deviceContext->CopyResource(dest, source);
    deviceContext->CopyResource(g_pDecodedTexture, source);
}
In the render thread
void Render()
{
    //...
    deviceContext->PSSetShaderResources(0, 1, &g_pTextureRV);
    //..
    deviceContext->VSSetShaderResources(0, 1, &g_pTextureRV);
    //..
    deviceContext->DrawIndexed(...);
    //..
    deviceContext->DrawIndexed(...);
    //..
    deviceContext->DrawIndexed(...);
    //..
    deviceContext->DrawIndexed(...);
    //
    Present();
}

You can try this: insert the Frame Rate Converter DSP after the decoder. Be sure your input format is compatible with the DSP. Set the frame rate to 60 fps.
Doing this, I think you can keep the synchronous approach.
If you want to manually display at 60 fps, we need more code to see where the problem comes from.
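If you do try the manual route, one pattern that avoids both the corruption and the whole-render locking is to keep the immediate context on a single thread; ID3D11DeviceContext is not thread-safe, so the decode thread should not call CopyResource on it at all. Instead, the decode thread only publishes its latest IMFSample under a short lock, and the render thread performs the copy itself at the top of Render(). A minimal sketch under assumptions (the globals and helper names are invented, not the poster's code):
CRITICAL_SECTION g_frameLock;          // InitializeCriticalSection at startup
CComPtr<IMFSample> g_pLatestSample;    // most recent decoded frame

// Decode thread: after ProcessOutput succeeds, publish the sample and return.
// Overwriting an unconsumed sample simply drops it, which is what you want
// with 25-30 fps in and 60 fps out.
void PublishSample(IMFSample* pSample)
{
    EnterCriticalSection(&g_frameLock);
    g_pLatestSample = pSample;         // CComPtr handles AddRef/Release
    LeaveCriticalSection(&g_frameLock);
}

// Render thread, at the top of Render(): take the sample, copy its texture
// into g_pDecodedTexture, and release it promptly so the decoder can recycle
// its output buffers.
void AcquireLatestFrame(ID3D11DeviceContext* deviceContext)
{
    CComPtr<IMFSample> sample;
    EnterCriticalSection(&g_frameLock);
    sample.Attach(g_pLatestSample.Detach()); // take ownership, leave null
    LeaveCriticalSection(&g_frameLock);
    if (!sample)
        return; // no new frame; keep rendering the previous contents

    CComPtr<IMFMediaBuffer> buffer;
    CComPtr<IMFDXGIBuffer> dxgiBuffer;
    CComPtr<ID3D11Texture2D> texture;
    sample->GetBufferByIndex(0, &buffer);
    buffer->QueryInterface(IID_PPV_ARGS(&dxgiBuffer));
    dxgiBuffer->GetResource(IID_PPV_ARGS(&texture));
    // Note: some decoders output a texture array; in that case use
    // IMFDXGIBuffer::GetSubresourceIndex with CopySubresourceRegion instead.
    deviceContext->CopyResource(g_pDecodedTexture, texture);
}
Because the lock only ever guards a pointer swap, neither thread can stall the other for long, and g_pDecodedTexture and its shader resource view are only ever touched from the render thread.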

Related

Should the variable value be checked before assigning?

I know this might sound like a silly question, but I'm curious: should I check my variable's value before assigning?
For example, if I'm flipping my skin (a Node2D composed of a sprite and a raycast) based on direction (a Vector2):
func _process(delta):
    ...
    if direction.x > 0:
        skin.scale.x = 1
    elif direction.x < 0:
        skin.scale.x = -1

    # OR

    if direction.x > 0:
        if skin.scale.x != 1:
            skin.scale.x = 1
    elif direction.x < 0:
        if skin.scale.x != -1:
            skin.scale.x = -1
Would the skin's scale be altered every _process call, consuming more CPU, or will the assignment be ignored if the value is the same?
First of all, given that this is GDScript, the number of lines executed will itself be a performance factor.
We will look at the C++ side…
But before that… Be aware that GDScript does some trickery with properties.
When you say skin.scale, Godot will call get_scale on the skin object, which returns a Vector2. And Vector2 is a value type. That Vector2 is not the scale that the object has, but a copy, a snapshot of the value. So, in virtually any other language, skin.scale.x = 1 would be modifying the copy and would have no effect on the scale of the object. Meaning that you would have to do this:
skin.scale = Vector2(skin.scale.x + 1, skin.scale.y)
Or this:
var skin_scale = skin.scale
skin_scale.x += 1
skin.scale = skin_scale
Which I bet people using C# would find familiar.
But you don't need to do that in GDScript. Godot will call set_scale, which is what most people expect. It is a feature!
So, you set scale, and Godot will call set_scale:
void Node2D::set_scale(const Size2 &p_scale) {
    if (_xform_dirty) {
        ((Node2D *)this)->_update_xform_values();
    }
    _scale = p_scale;
    // Avoid having 0 scale values, can lead to errors in physics and rendering.
    if (Math::is_zero_approx(_scale.x)) {
        _scale.x = CMP_EPSILON;
    }
    if (Math::is_zero_approx(_scale.y)) {
        _scale.y = CMP_EPSILON;
    }
    _update_transform();
    _change_notify("scale");
}
The method _change_notify only does something in the editor. It is the Godot 3.x instrumentation for undo/redo et al.
And set_scale will call _update_transform:
void Node2D::_update_transform() {
    _mat.set_rotation_and_scale(angle, _scale);
    _mat.elements[2] = pos;
    VisualServer::get_singleton()->canvas_item_set_transform(get_canvas_item(), _mat);
    if (!is_inside_tree()) {
        return;
    }
    _notify_transform();
}
Which, as you can see, will update the Transform2D of the Node2D (_mat). Then it is off to the VisualServer.
And then to _notify_transform, which is what propagates the change in the scene tree. It is also what calls notification(NOTIFICATION_LOCAL_TRANSFORM_CHANGED) if you have enabled it with set_notify_local_transform. It looks like this (this is from "canvas_item.h"):
_FORCE_INLINE_ void _notify_transform() {
    if (!is_inside_tree()) {
        return;
    }
    _notify_transform(this);
    if (!block_transform_notify && notify_local_transform) {
        notification(NOTIFICATION_LOCAL_TRANSFORM_CHANGED);
    }
}
And you can see it delegates to another _notify_transform that looks like this (this is from "canvas_item.cpp"):
void CanvasItem::_notify_transform(CanvasItem *p_node) {
    /* This check exists to avoid re-propagating the transform
     * notification down the tree on dirty nodes. It provides
     * optimization by avoiding redundancy (nodes are dirty, will get the
     * notification anyway).
     */
    if (/*p_node->xform_change.in_list() &&*/ p_node->global_invalid) {
        return; //nothing to do
    }
    p_node->global_invalid = true;
    if (p_node->notify_transform && !p_node->xform_change.in_list()) {
        if (!p_node->block_transform_notify) {
            if (p_node->is_inside_tree()) {
                get_tree()->xform_change_list.add(&p_node->xform_change);
            }
        }
    }
    for (CanvasItem *ci : p_node->children_items) {
        if (ci->top_level) {
            continue;
        }
        _notify_transform(ci);
    }
}
So, no. There is no check to ignore the change if the value is the same.
However, it is worth noting that Godot invalidates the global transform instead of computing it right away (global_invalid). This does not make multiple updates to the transform in the same frame free, but it makes them cheaper than they would otherwise be.
I also remind you that looking at the source code is no replacement for using a profiler.
Should you check? Perhaps. If there are many children that would need to be updated, the extra lines are likely cheap by comparison. If in doubt: measure with a profiler.
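For this particular case, a further variant worth considering (my sketch, not from the engine code above): cache the last applied facing in a plain script variable, so the comparison does not even go through get_scale, which performs the same _xform_dirty check as set_scale.
var _facing := 0.0  # last value we applied to skin.scale.x

func _process(delta):
    var facing := sign(direction.x)
    if facing != 0.0 and facing != _facing:
        _facing = facing
        skin.scale.x = facing  # relies on the set_scale property trickery described above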

Speaking with an ISpVoice from a ISpTTSEngine

I'm implementing an ISpTTSEngine for the Microsoft Speech API (SAPI). I'd like
this voice to enunciate just like a typical TTS voice. Rather than write my
own speech synthesizer, I'd like to delegate to a built-in ISpVoice.
I've written enough code to hear text vocalized, but it has a major deficiency
that I haven't been able to explain: the speech does not begin until after my
implementation of ISpTTSEngine::Speak has returned. For the duration of the
audible output, my implementation of ISpTTSEngine::Speak is not invoked, even
when the software using the TTS voice is sending requests.
(For context: my goal for this project is to programmatically observe the speech data that other pieces
of software are attempting to vocalize. That part appears to be working as
intended.)
The full source is available
here. I'll try to
summarize the most relevant parts.
My implementation of ISpTTSEngine has a private member named
m_cpVoice:
class ATL_NO_VTABLE CTTSEngObj :
    public CComObjectRootEx<CComMultiThreadModel>,
    public CComCoClass<CTTSEngObj, &CLSID_SampleTTSEngine>,
    public ISpTTSEngine,
    public ISpObjectWithToken
{
    // ...
private:
    CComPtr<ISpVoice> m_cpVoice;
And it is initialized in the FinalConstruct
method:
HRESULT CTTSEngObj::FinalConstruct()
{
    HRESULT hr = S_OK;
    // ...
    hr = m_cpVoice.CoCreateInstance(CLSID_SpVoice);
My implementation of ISpTTSEngine::Speak iterates over the text fragments it
receives
and passes the text data to the ISpVoice::Speak
method:
STDMETHODIMP CTTSEngObj::Speak(DWORD dwSpeakFlags,
                               REFGUID rguidFormatId,
                               const WAVEFORMATEX* pWaveFormatEx,
                               const SPVTEXTFRAG* pTextFragList,
                               ISpTTSEngineSite* pOutputSite)
{
    // ...
    for (const SPVTEXTFRAG* textFrag = pTextFragList; textFrag != NULL; textFrag = textFrag->pNext)
    {
        // ...
        const std::wstring& text = textFrag->pTextStart;
        hr = m_cpVoice->Speak(text.substr(0, textFrag->ulTextLen).c_str(),
                              dwSpeakFlags | SPF_ASYNC | SPF_PURGEBEFORESPEAK, 0);
As mentioned above, no audio is emitted until after ISpTTSEngine::Speak
returns. An arbitrary sleep statement demonstrates this most clearly. Polling
the ISpVoice's SpeakCompleteEvent handle inevitably times out. Removing the
SPF_ASYNC flag from the invocation of ISpVoice::Speak causes the caller to
crash.
Can anyone explain this behavior? Or suggest a change that would allow me to
observe subsequent speech requests?
SAPI isn't expecting to be entered recursively. Consider using a different TTS engine (e.g., the WinRT Windows.Media.SpeechSynthesis APIs) to do the actual synthesis. The text fragments won't have any embedded markup, so that won't be a big deal.
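To make that concrete, here is a minimal sketch assuming C++/WinRT; the helper name SynthesizeFragment is invented, and error handling plus format negotiation (GetOutputFormat) are omitted. The engine synthesizes each fragment with Windows.Media.SpeechSynthesis and pushes the rendered bytes to SAPI through ISpTTSEngineSite::Write, so it never re-enters SAPI to produce audio:
#include <winrt/Windows.Media.SpeechSynthesis.h>
#include <winrt/Windows.Storage.Streams.h>
#include <sapiddk.h> // ISpTTSEngineSite
#include <string>
#include <vector>

// Hypothetical helper, called from ISpTTSEngine::Speak for each text fragment.
HRESULT SynthesizeFragment(const std::wstring& text, ISpTTSEngineSite* pOutputSite)
{
    using namespace winrt;
    using namespace winrt::Windows::Media::SpeechSynthesis;
    using namespace winrt::Windows::Storage::Streams;

    // Synthesize synchronously; blocking on .get() is fine on an engine thread.
    SpeechSynthesizer synth;
    SpeechSynthesisStream stream =
        synth.SynthesizeTextToStreamAsync(hstring(text)).get();

    // Pull the rendered audio into memory. The stream is a complete WAV file,
    // so a real engine must skip the RIFF header and convert to the format it
    // negotiated with SAPI.
    DataReader reader{ stream.GetInputStreamAt(0) };
    auto size = static_cast<uint32_t>(stream.Size());
    reader.LoadAsync(size).get();
    std::vector<uint8_t> bytes(size);
    reader.ReadBytes(bytes);

    // Writing to the site is what makes the audio play while Speak runs,
    // instead of after it returns: SAPI keeps ownership of the audio device.
    ULONG written = 0;
    return pOutputSite->Write(bytes.data(),
                              static_cast<ULONG>(bytes.size()), &written);
}
Compared with delegating to a second ISpVoice, the engine stays a pure producer of samples, which is the contract ISpTTSEngine is expected to fulfill.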

Codename One: Background threads needing access to the UI

My app has background threads that need to access the UI.
Imagine a chess program (AI) that "thinks" for a number of seconds before it plays a move on the board.
While the thread runs, the UI is blocked for input, but there is still output.
There are 3 threads involved:
the CN1 EDT
the think thread, using invokeAndBlock, that outputs information about the search process (in a TextField), such as the current move, search depth and search value
a clock thread, started with Thread.start(), that updates once per second the time used by White or Black (TextFields)
During the search (invokeAndBlock) the stopButton is accessible to force the search to stop (not shown).
Below is my current implementation. It works, and my question is: is this the right way to implement it?
(I read https://www.codenameone.com/blog/callserially-the-edt-invokeandblock-part-1.html and part-2.)
Form mainForm;
TextField whiteTime, blackTime; // updated by clock thread
TextField searchInfo; // updated by think thread
Clock clock;
Move move;

public void start() {
    ...
    mainForm = new Form(...);
    ...
    thinkButton.addActionListener((ActionListener) (ActionEvent evt) -> {
        think();
    });
    mainForm.show();
}

void think() {
    blockUI(); // disable buttons except stopButton
    clock.start(board.player); // this thread calls showWhiteTime or showBlackTime every second
    invokeAndBlock(() -> { // off the EDT
        move = search(board, time); // e.g. for 10 seconds
    });
    clock.stop();
    animateMove(board, move);
    clock.start(board.player);
    freeUI();
}

// search for a move to play
Move search(Board board, int time) {
    ...
    while (time > 0) {
        ...
        showSearchInfo(info); // called say a few times per second
    }
    return move;
}

void showSearchInfo(String s) { // access UI off the EDT
    callSerially(() -> { // callSerially is necessary here
        searchInfo.setText(s);
    });
}

void showWhiteTime(String s) {
    whiteTime.setText(s); // no callSerially needed, although off the EDT (?)
}

void showBlackTime(String s) {
    blackTime.setText(s); // no callSerially needed, although off the EDT (?)
}
Edit: new versions of think, showWhiteTime and showBlackTime.
// version 2, replaced invokeAndBlock by Thread.start() and callSerially
void think() {
    blockUI(); // disable buttons except stopButton
    new Thread(() -> { // off the EDT
        clock.start(board.player); // this thread calls showWhiteTime or showBlackTime every second
        move = search(board, time); // e.g. for 10 seconds
        clock.stop();
        callSerially(() -> {
            animateMove(board, move);
            clock.start(board.player);
            freeUI();
        });
    }).start();
}

// version 2, added callSerially
void showWhiteTime(String s) { // access UI off the EDT
    callSerially(() -> {
        whiteTime.setText(s);
    });
}

// version 2, added callSerially
void showBlackTime(String s) { // access UI off the EDT
    callSerially(() -> {
        blackTime.setText(s);
    });
}
Most of the code is fine, though I would avoid the EDT violations you have in showWhiteTime and showBlackTime. EDT violations can fail in odd ways, seemingly out of nowhere, since they trigger async operations, and things can turn nasty quickly. I suggest turning on the EDT violation detection tool in the simulator.
Two things to keep in mind when using invokeAndBlock:
It's slower than a regular thread
It blocks pending events in some cases so it's problematic to have it as a part of a pending event chain
The second point is a difficult one to grasp and a source of many mistakes so it's worth explaining a bit.
Consider this code:
buttonA.addActionListener(e -> {
    doStuff();
    invokeAndBlock(...);
    doOtherStuff();
});
buttonA.addActionListener(e -> doSomethingImportant());
That might not seem realistic, as you usually don't add two separate listeners one after the other, but it happens often enough, e.g. when one change triggers another.
The current event processing will be blocked for buttonA during invokeAndBlock. That means doOtherStuff() will wait for the invokeAndBlock, and doSomethingImportant() will wait as well.
If doSomethingImportant() shows another form, you can end up with weird behavior: ages after you pressed the button and did a lot of other things, your form suddenly changes.
So you need to be very conscious of your usage of invokeAndBlock.

ARKit SceneKit ARSCNView with Positional Audio SCNAudioPlayer WILL NOT STOP playing

Running ARKit 2.0 with an ARSCNView on iOS 12.
The application uses multithreading, which is why these functions are performed on the main thread (just to be sure). I also tried without explicitly performing the functions on the main thread, to no avail.
I'm using an .aiff sound file but have also tried a .wav. No joy.
I even tried removing audioNode_alarm from the node hierarchy, and the sound still plays. I even removed the ARSCNView from the view hierarchy, and the sound STILL plays. FFS.
From what I can see, I'm doing things EXACTLY as I'm supposed to in order to stop the audio from playing. The audio simply will not stop, no matter what I try. Can anyone think why?!
weak var audioNode_alarm: SCNNode!
weak var audioPlayer_alarm: SCNAudioPlayer?

func setupAudioNode() {
    let audioNode_alarm = SCNNode()
    addChildNode(audioNode_alarm)
    self.audioNode_alarm = audioNode_alarm
}

func playAlarm() {
    DispatchQueue.main.async { [unowned self] in
        self.audioNode_alarm.removeAllAudioPlayers()
        if let audioSource_alarm = SCNAudioSource(fileNamed: "PATH_TO_MY_ALARM_SOUND.aiff") {
            audioSource_alarm.loops = true
            audioSource_alarm.load()
            audioSource_alarm.isPositional = true
            let audioPlayer_alarm = SCNAudioPlayer(source: audioSource_alarm)
            self.audioNode_alarm.addAudioPlayer(audioPlayer_alarm)
            self.audioPlayer_alarm = audioPlayer_alarm
        }
    }
}

func stopAlarm() {
    DispatchQueue.main.async { [unowned self] in
        self.audioNode_alarm?.removeAudioPlayer(self.audioPlayer_alarm!)
        self.audioNode_alarm?.removeAllAudioPlayers()
    }
}
What I ended up doing is stopping the sound and removing the player with:
yourNode.audioPlayers.forEach { audioLocalPlayer in
    audioLocalPlayer.audioNode?.engine?.stop()
    yourNode.removeAudioPlayer(audioLocalPlayer)
}
According to the documentation, SCNAudioPlayer has an audioNode property, which is supposed to be used "to vary parameters such as volume and reverb in real time during playback".
audioNode is of type AVAudioNode, so if we jump to its engine property and that property's type definition, we'll find all the controls we need.

How to implement output cache for a content part (such as a widget)?

I have a widget with a list of the latest news; how can I cache only the widget's output?
The OutputCache module caches the whole page, and only for anonymous users, but I need to cache the output of just one shape.
What would a solution be here?
It's not a good idea to cache the Shape object itself, but you can capture the HTML output from a Shape and cache that.
Every Orchard Shape has a corresponding object called the Metadata. This object contains, among other things, some event handlers that can run when the Shape is displaying or after it has been displayed. By using these event handlers, it is possible to cache the output of the Shape on the first call to a driver. Then for future calls to the driver, we can display the cached copy of the output instead of running through the expensive parts of the driver or template rendering.
Example:
using System.Web;
using DemoModule.Models;
using Orchard.Caching;
using Orchard.ContentManagement.Drivers;
using Orchard.DisplayManagement.Shapes;

namespace DemoModule.Drivers {
    public class MyWidgetPartDriver : ContentPartDriver<MyWidgetPart> {
        private readonly ICacheManager _cacheManager;
        private readonly ISignals _signals;

        public MyWidgetPartDriver(
            ICacheManager cacheManager,
            ISignals signals
        ) {
            _cacheManager = cacheManager;
            _signals = signals;
        }

        public class CachedOutput {
            public IHtmlString Output { get; set; }
        }

        protected override DriverResult Display(MyWidgetPart part, string displayType, dynamic shapeHelper) {
            return ContentShape("Parts_MyWidget", () => {
                // The cache key. Build it using whatever is needed to differentiate the output.
                var cacheKey = /* e.g. */ string.Format("MyWidget-{0}", part.Id);

                // Standard Orchard cache manager. Notice we get this object by reference,
                // so we can write to its field to save our cached HTML output.
                var cachedOutput = _cacheManager.Get(cacheKey, ctx => {
                    // Use whatever signals are needed to invalidate the cache.
                    _signals.When(/* e.g. */ "ExpireCache");
                    return new CachedOutput();
                });

                dynamic shape;
                if (cachedOutput.Output == null) {
                    // Output has not yet been cached, so we are going to build the shape normally
                    // and then cache the output.
                    /*
                        ... Do normal (potentially expensive) things (call DBs, call services, etc.)
                        to prep shape ...
                    */

                    // Create shape object.
                    shape = shapeHelper.Parts_MyWidget(/*...*/);

                    // Hook up an event handler such that after rendering the (potentially expensive)
                    // shape template, we capture the output to the cached output object.
                    ((ShapeMetadata)shape.Metadata).OnDisplayed(displayed => cachedOutput.Output = displayed.ChildContent);
                } else {
                    // Found cached output, so simply output it instead of building
                    // the shape normally.

                    // This is a dummy shape; the name doesn't matter.
                    shape = shapeHelper.CachedShape();

                    // Hook up an event handler to fill the output of this shape with the cached output.
                    ((ShapeMetadata)shape.Metadata).OnDisplaying(displaying => displaying.ChildContent = cachedOutput.Output);

                    // Replacing the ChildContent of the displaying context will cause the display manager
                    // to simply use that HTML output and skip template rendering.
                }
                return shape;
            });
        }
    }
}
EDIT:
Note that this only caches the HTML generated from your shape output. Things like Script.Require(), Capture(), and other side effects that you perform in your shape templates will not be played back. This actually bit me: I tried to cache a template that required its own stylesheet, and the stylesheet was only brought in the first time.
Orchard supplies a service called the CacheManager, which is awesome and makes caching super easy. It is mentioned in the docs, but the description of how to use it isn't particularly helpful (http://docs.orchardproject.net/Documentation/Caching). The best place to see examples is the Orchard core code and third-party modules such as Favicon and the Twitter widgets (all of them, one would hope).
Luckily, other nice people have gone to the effort of searching Orchard's code for you and writing nice little blog posts about it. The developer of the LatestTwitter widget wrote a neat post: http://blog.maartenballiauw.be/post/2011/01/21/Writing-an-Orchard-widget-LatestTwitter.aspx . So did Richard of NogginBox: http://www.nogginbox.co.uk/blog/orchard-caching-by-time . And of course Bertrand has a helpful post on the subject as well: http://weblogs.asp.net/bleroy/archive/2011/02/16/caching-items-in-orchard.aspx
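One piece the example above leaves open is who actually triggers the invalidation. A minimal sketch of the other side, assuming a hypothetical NewsPart whose publish event should expire the widget (the part name and signal are illustrative):
using DemoModule.Models;
using Orchard.Caching;
using Orchard.ContentManagement.Handlers;

namespace DemoModule.Handlers {
    public class NewsPartHandler : ContentHandler {
        public NewsPartHandler(ISignals signals) {
            // Fires the same signal the driver subscribed to with
            // _signals.When("ExpireCache"), so the cached HTML is
            // rebuilt on the next display of the widget.
            OnPublished<NewsPart>((context, part) => signals.Trigger("ExpireCache"));
        }
    }
}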
