Where else is the stream number used other than in these two places: GetStreamSource and SetStreamSource?
Using multiple streams allows you to combine vertex component data from different sources. This can be useful when you have different rendering methods, each of which requires a different set of vertex components. Instead of always sending the entire set of data, you can separate it into streams and only use the ones you need. See this chapter from GPU Gems 2 for an example and sample code. It can also be useful for effects such as morphing.
When calling CreateVertexDeclaration, you specify the stream number in the D3DVERTEXELEMENT9 elements to determine which stream each vertex component comes from.
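For illustration, a sketch of a two-stream declaration: positions come from stream 0, normals and texture coordinates from stream 1. The device and the two vertex buffers are assumed to exist already; the layout is just an example.

```cpp
#include <d3d9.h>

void SetupTwoStreams(IDirect3DDevice9* device,
                     IDirect3DVertexBuffer9* positionVB,
                     IDirect3DVertexBuffer9* normalUvVB)
{
    const D3DVERTEXELEMENT9 elements[] =
    {
        // Stream, Offset, Type,            Method,                Usage,                 UsageIndex
        { 0,  0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
        { 1,  0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   0 },
        { 1, 12, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
        D3DDECL_END()
    };

    IDirect3DVertexDeclaration9* decl = nullptr;
    device->CreateVertexDeclaration(elements, &decl);
    device->SetVertexDeclaration(decl);

    // The stream numbers above match the first argument of SetStreamSource.
    device->SetStreamSource(0, positionVB, 0, 3 * sizeof(float));        // stream 0
    device->SetStreamSource(1, normalUvVB, 0, (3 + 2) * sizeof(float));  // stream 1
}
```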
Related
After some searching I've learned it is possible to create multiple vertex buffers, each for a specific 3D model, and set them in the Input Assembler to be read by my shaders, or at least that is what I could understand. But reading Microsoft's documentation left me very confused about how to do this the right way. This is what I was reading: it says I can pass an array of vertex buffers to the IA stage, but also that the maximum number of vertex buffers the Input Assembler can take in D3D11 is 32. What would I do if I needed 50 different models rendered at the same time?

Also, could someone clarify how pOffset works in this situation with multiple models? As far as I understood it should always be 0, since the beginning of my buffers is always the vertex data, but I may have understood that wrong.

Lastly, I want to add that I've already rendered buffers that contain multiple models together, but I don't know exactly how to deal with many individual models.
The short answer is: You don't try to draw all your models in one Draw call.
You are free to organize rendering in many ways, but here is one approach:
A 'model' consists of one or more 'meshes'. Each mesh is a collection of vertices (in a VB), indices (in an IB), and some material information associated with each 'subset' of indices.
To draw:
```
foreach M in models
    foreach mesh in M
        foreach part in mesh
            Set shaders based on material
            Set VB/IB based on mesh
            DrawIndexed
```
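Translated into D3D11 calls, a minimal sketch might look like the following; the Model / Mesh / MeshPart structs are hypothetical, and only the buffer-binding and draw calls are shown.

```cpp
#include <d3d11.h>
#include <vector>

struct MeshPart
{
    ID3D11VertexShader* vs;   // chosen by the part's material
    ID3D11PixelShader*  ps;
    UINT indexCount;
    UINT startIndex;          // offset into the mesh's index buffer
    INT  baseVertex;          // offset into the mesh's vertex buffer
};

struct Mesh
{
    ID3D11Buffer* vb;
    ID3D11Buffer* ib;
    UINT stride;              // size of one vertex in bytes
    std::vector<MeshPart> parts;
};

struct Model
{
    std::vector<Mesh> meshes;
};

void DrawModels(ID3D11DeviceContext* ctx, const std::vector<Model>& models)
{
    ctx->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

    for (const Model& model : models)
    {
        for (const Mesh& mesh : model.meshes)
        {
            // pOffset can stay 0 here because each part addresses its data
            // through StartIndexLocation / BaseVertexLocation instead.
            UINT offset = 0;
            ctx->IASetVertexBuffers(0, 1, &mesh.vb, &mesh.stride, &offset);
            ctx->IASetIndexBuffer(mesh.ib, DXGI_FORMAT_R16_UINT, 0);

            for (const MeshPart& part : mesh.parts)
            {
                ctx->VSSetShader(part.vs, nullptr, 0);
                ctx->PSSetShader(part.ps, nullptr, 0);
                ctx->DrawIndexed(part.indexCount, part.startIndex, part.baseVertex);
            }
        }
    }
}
```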
Since this is a number of nested loops, there are several ways to improve the performance. For example, you might just queue up the information instead of actually calling DrawIndexed, then sort by material. Then call DrawIndexed from the sorted queue.
For alpha-blending to appear correct, you have to do at least two rendering passes: first render the opaque things, then render the alpha-blended things.
You may also want to combine all the content in a given model into one VB and one IB with offsets rather than use individual resources.
You may have the same model in multiple locations in the world, so you may have many model instances sharing the same mesh data. In this case, sorting by VB/IB as well as material could be useful. If you are drawing the same model in many locations (100s or 1000s), then you should look into hardware instancing.
An example implementation of this can be found in DirectX Tool Kit as Model, ModelMesh, and ModelMeshPart.
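As for the hardware instancing mentioned above, the core of it looks roughly like this; Vertex and InstanceData are placeholder types, and the input layout is assumed to mark slot 1 as D3D11_INPUT_PER_INSTANCE_DATA.

```cpp
#include <d3d11.h>

struct Vertex       { float position[3]; float uv[2]; };
struct InstanceData { float world[16]; };   // per-instance transform

void DrawInstanced(ID3D11DeviceContext* ctx,
                   ID3D11Buffer* meshVB, ID3D11Buffer* instanceVB,
                   ID3D11Buffer* meshIB,
                   UINT indexCount, UINT instanceCount)
{
    UINT strides[2] = { sizeof(Vertex), sizeof(InstanceData) };
    UINT offsets[2] = { 0, 0 };
    ID3D11Buffer* buffers[2] = { meshVB, instanceVB };

    ctx->IASetVertexBuffers(0, 2, buffers, strides, offsets);
    ctx->IASetIndexBuffer(meshIB, DXGI_FORMAT_R16_UINT, 0);

    // One call draws every copy of the mesh; the vertex shader reads its
    // world transform from the per-instance stream in slot 1.
    ctx->DrawIndexedInstanced(indexCount, instanceCount, 0, 0, 0);
}
```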
I am trying to extract some features from an audio sample using OpenSMILE, but I'm realizing how difficult it is to set up a config file.
The documentation is not very helpful. The best I could do was run some of the sample config files that are provided, see what came out, and then go into the config file and try to determine where the feature was specified. Here's what I did:
1. I used the default feature set from the INTERSPEECH 2010 Paralinguistic Challenge (IS10_paraling.conf).
2. I ran it over a sample audio file.
3. I looked at what came out. Then I read the config file in depth, trying to find out where each feature was specified.
Here's a little markdown table showing the results of my exploration:
| Feature generated | Instruction in the conf file |
|-------------------|------------------------------|
| pcm_loudness | I see `loudness=1` |
| mfcc | I see a section `[mfcc:cMfcc]` |
| lspFreq | no matches for the text 'lspFreq' anywhere |
| F0finEnv | I see `F0finalEnv = 1` under `[pitchSmooth:cPitchSmoother]` |
What I see is four different features, each generated by a different instruction in the config file. Well, for one of them there was no discernible instruction in the config file that I could find. With no pattern, intuitive syntax, or apparent system, I have no idea how I can eventually figure out how to specify the features I want to generate.
There are no tutorials, no YouTube videos, no StackOverflow questions, and no blog posts out there explaining how this could be done, which is really surprising since this is obviously a huge part of using openSMILE.
If anyone finds this, please, can you advise me on how to create custom config files for openSMILE? Thanks!
Thanks for your interest in openSMILE and your eagerness to build your own configuration files.
Most users in the scientific community actually use openSMILE for its pre-defined config files for the baseline feature sets, which in version 2.3 are even more flexible to use (more commandline options to output to different file formats etc.).
I admit that the documentation provided is not as good as it could be. However, openSMILE is a very complex piece of software with a lot of functionality, of which only the most important parts are currently well documented.
The best starting point would be to read the openSMILE book and the SIG'MM tutorials, all referenced at http://opensmile.audeering.com/. The book contains a section on how to write configuration files. The next important element is the online help of the binary:
- `SMILExtract -L` lists the available components
- `SMILExtract -H cComponentName` lists all options which a given component supports (and thus also the features it can extract), with a short description for each
- `SMILExtract -configDflt cComponentName` gives you a template configuration section for the component with all options listed and defaults set
Due to the architecture of openSMILE, which is centered on incremental processing of all audio features, there is no easy syntax to define the features you want (at least not yet). Rather, you define the processing chain by adding components:
- data sources read in data (from audio files, CSV files, or the microphone, for example),
- data processors do signal processing and feature extraction in individual steps; to extract MFCCs, for example, the steps are windowing, window function, FFT, magnitudes, mel-spectrum, and cepstral coefficients, and there is a data processor for each step,
- data sinks write data to output files, send results to a server, etc.
You connect the components via the "reader.dmLevel" and "writer.dmLevel" options. These define a name of a data memory level that the components use to exchange data. Only one component may write to one level, i.e. writer.dmLevel=levelName defines the level and may appear only once. Multiple components can read from this level by setting reader.dmLevel=levelName.
In each component you then set the options to enable computation of features and set the parameters for this. To answer your question about lspFreq: this is probably enabled by default in the cLsp component, so you don't see an explicit option for it. In future versions of openSMILE the practice of setting all options explicitly will (and should) be followed more strictly.
The names of the features in the output will be automatically defined by the components. Often each component adds a part to the name, so you can infer the full chain of processing from the name. The options nameAppend and copyInputName (available in most data processors) control this behaviour, although some components might internally override them or change the behaviour a bit.
To see the names (and other info) for each data memory level, including e.g. which features a component in the configuration produces, you can set the option `printLevelStats=5` in the `componentInstances:cComponentManager` section.
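To make the wiring concrete, a stripped-down configuration might look like the sketch below. The instance names and the cWaveSource / cFramer / cEnergy / cCsvSink component types are only illustrative examples, not taken from IS10_paraling.conf.

```
[componentInstances:cComponentManager]
instance[dataMemory].type = cDataMemory
instance[waveIn].type = cWaveSource
instance[framer].type = cFramer
instance[energy].type = cEnergy
instance[csvSink].type = cCsvSink
// print names and other info for every data memory level
printLevelStats = 5

[waveIn:cWaveSource]
// this component defines the 'wave' level; no other component may write to it
writer.dmLevel = wave
filename = input.wav

[framer:cFramer]
// reads what cWaveSource wrote, writes 25 ms frames every 10 ms
reader.dmLevel = wave
writer.dmLevel = frames
frameSize = 0.025
frameStep = 0.010

[energy:cEnergy]
reader.dmLevel = frames
writer.dmLevel = energy
// appended to the feature name in the output
nameAppend = energy

[csvSink:cCsvSink]
reader.dmLevel = energy
filename = output.csv
```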
As everything in openSMILE is built for real-time incremental processing, each data memory level has a buffer, which by default is a ring buffer to keep the memory footprint constant when the application runs for a long time.
Sometimes you might want to summarise features over a window of a given length (e.g. with the cFunctionals component). In this case you must ensure that the buffer size of the input level to this component is large enough to hold the full window. You do this via the following options:
- `writer.levelconf.isRb = 1/0`: sets the type of buffer to ring buffer (1) or fixed-size buffer (0)
- `writer.levelconf.growDyn = 1/0`: lets the buffer grow dynamically if more data is written to it (1)
- `writer.levelconf.nT = <N>`: sets the size of the buffer in frames; alternatively you can use `bufferSizeSec = x` to set the size in seconds and have it converted to frames automatically
In most cases the sizes will be set correctly automatically, and subsequent levels inherit the configuration from the previous levels. Exceptions are when you set a cFunctionals component to read the full input (e.g. to produce only one feature vector at the end of the file), in which case you must use growDyn=1 on the level that the functionals component reads from, or when you use a variable framing mode (see below).
The cFunctionals component provides the frameMode, frameSize, and frameStep options, where frameMode can be `full` (one vector produced at the end of the input/file), `list` (specify a list of frames), `var` (receive messages, e.g. from a cTurnDetector component, that define frames on the fly), or `fix` (fixed-length window). Only in the case of `fix` do frameSize (the size of the window) and frameStep (the rate at which the window is shifted forward) apply. In the case of `fix` the buffer size of the input level is set correctly automatically; in the other cases you have to set it manually.
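For example, to summarise low-level features over the whole file, the input level of the functionals component needs a dynamically growing buffer. The component and level names below are again only illustrative, and the actual functionals to compute are omitted; only the framing and buffer options discussed above are shown.

```
[lld:cEnergy]
reader.dmLevel = frames
writer.dmLevel = lld
// the functionals component below reads the complete input,
// so this level must grow instead of wrapping around
writer.levelconf.isRb = 0
writer.levelconf.growDyn = 1

[func:cFunctionals]
reader.dmLevel = lld
writer.dmLevel = func
// one summary vector at the end of the input
frameMode = full
// for a sliding window use instead:
//   frameMode = fix
//   frameSize = 2.0
//   frameStep = 0.5
```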
I hope this helps you to get started! With every new openSMILE release we at audEERING are trying to document things a bit better and unify things through various components.
We also welcome contributions from the community (e.g. anybody willing to write a graphical configuration file editor where you drag/drop components and connect them graphically? ;)) - although we know that more documentation would make this easier. Until then, you always have the source code to read ;)
Cheers,
Florian
I am new to DirectShow. What I am trying to do is use one IGraphBuilder object to play both a silent AVI video and a separate WAV file at the same time. I can't merge the two into one video file, but I would like it to appear as if they were one.
Is there any way to use one set of filters to run an AVI and a WAV file concurrently?
Thanks!
You can achieve this both ways: by adding two separate chains into the same graph, or by using two separate filter graphs.
The first method gives you a single graph containing both chains: the video chain built from the AVI file and the audio chain built from the WAV file. The second method gives you two separate graphs, one rendering the AVI and one rendering the WAV.
The first approach has the advantage that you get lip sync between video and audio and you control the graph as a single playback object. The other method's advantage is the ability to control video and audio separately, including stopping and changing files independently of one another.
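A minimal sketch of the first approach (one graph, two chains); the file names are placeholders and error handling is reduced to bare HRESULT checks.

```cpp
#include <dshow.h>
#pragma comment(lib, "strmiids.lib")

HRESULT PlayAviWithSeparateWav()
{
    IGraphBuilder* pGraph = nullptr;
    IMediaControl* pControl = nullptr;

    HRESULT hr = CoInitialize(nullptr);
    if (FAILED(hr)) return hr;

    hr = CoCreateInstance(CLSID_FilterGraph, nullptr, CLSCTX_INPROC_SERVER,
                          IID_IGraphBuilder, reinterpret_cast<void**>(&pGraph));
    if (SUCCEEDED(hr))
    {
        // Rendering both files into the same graph builds two chains that
        // share one clock, so they start and stop together.
        hr = pGraph->RenderFile(L"video.avi", nullptr);
        if (SUCCEEDED(hr))
            hr = pGraph->RenderFile(L"audio.wav", nullptr);

        if (SUCCEEDED(hr) &&
            SUCCEEDED(pGraph->QueryInterface(IID_IMediaControl,
                                             reinterpret_cast<void**>(&pControl))))
        {
            hr = pControl->Run();   // both chains begin playback together
            pControl->Release();
        }
        pGraph->Release();
    }
    CoUninitialize();
    return hr;
}
```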
My goal is to show multiple (small) panes of video on-screen simultaneously.
I would prefer to use the hardware scaler. This is currently working well for a single video on a single surface. For multiple streams it appears multiple SurfaceViews are needed - I don't see a way to use the hardware scaler to blit multiple images into different parts of the same Surface. What's the best way to lock/blit image pixels to these surfaces?
ANativeWindow_unlockAndPost causes a wait-for-vsync + swap (I think?), so I can't call this per-SurfaceView in the same update cycle (well I can, but I get horrible jittering).
One alternative is to use a separate render thread per SurfaceView. Does this seem like a sane avenue to pursue? Are there any other ways to update multiple SurfaceViews with a single wait-for-vsync+swap?
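For context, the per-SurfaceView lock/copy/post cycle I have in mind looks roughly like this; the frame source and the pixel copy are placeholders, and each SurfaceView's ANativeWindow would be driven from its own thread.

```cpp
#include <android/native_window.h>
#include <cstring>

// One render thread per SurfaceView would run a loop calling this on the
// ANativeWindow obtained from that view's Surface.
void BlitFrame(ANativeWindow* window, const uint8_t* rgba, int width, int height)
{
    ANativeWindow_setBuffersGeometry(window, width, height, WINDOW_FORMAT_RGBA_8888);

    ANativeWindow_Buffer buffer;
    if (ANativeWindow_lock(window, &buffer, nullptr) != 0)
        return;

    // buffer.stride is in pixels, so copy row by row.
    uint8_t* dst = static_cast<uint8_t*>(buffer.bits);
    for (int y = 0; y < height; ++y)
        std::memcpy(dst + y * buffer.stride * 4, rgba + y * width * 4, width * 4);

    // This is the call that queues the buffer and (I believe) waits for vsync.
    ANativeWindow_unlockAndPost(window);
}
```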
I am writing an iOS/Android game and am looking for the most performant way to render my vertex data with OpenGL ES 2.0. I have two different kinds of data: dynamic data that changes its attributes every frame, for example the player or animated background objects, and static data such as the static background or the terrain. I have googled a lot since yesterday, but I could not find a clear and unique answer to the question of what the best way is to render such data.
There are basically three options for rendering such data (if I am not missing one; if so, feel free to correct me):
Vertex Arrays Only:
Just fill your vertex arrays on the CPU every frame (static and dynamic data alike).
Vertex Buffer Objects Only:
Allocate a VBO on the GPU with GL_DYNAMIC_DRAW in which both the dynamic and the static data are stored. The dynamic data is then updated every frame via glBufferSubData.
Use both:
Static data is stored and rendered with a VBO, and the dynamic data is rendered with a vertex array. With this option, we need two rendering passes, one for rendering the VBO and one for rendering the vertex array.
Since the first option does not exploit the immutability of the static data and since the third option requires two rendering passes, my guess is that I should go with the second option. However, I am absolutely not sure about this and I hope you can clarify my confusion.
Allocate two Vertex Buffer Objects. One with hint GL_DYNAMIC_DRAW that will be updated frequently. Allocate a second VBO for immutable data and use the hint GL_STATIC_DRAW. According to the API documentation, GL_STATIC_DRAW should be used for data that "will be modified once and used many times"; just what you need.
Speaking of two rendering passes here is probably a misuse of the term: what you do is render your scene with two separate drawing commands. Since drawing commands run asynchronously, you should not experience any performance hit by doing so.
A second rendering pass, on the other hand, is when you render the entire scene twice (see for example here) with different settings, or when you do some image processing effects on outputs of previous rendering passes.
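A minimal GL ES 2.0 sketch of the two-VBO setup recommended above; the vertex layout, buffer sizes, attribute location, and data pointers are placeholders, and shader/program setup is assumed to have happened elsewhere.

```cpp
#include <GLES2/gl2.h>

GLuint staticVbo = 0, dynamicVbo = 0;

void CreateBuffers(const void* staticData, GLsizeiptr staticBytes,
                   GLsizeiptr dynamicBytes)
{
    glGenBuffers(1, &staticVbo);
    glBindBuffer(GL_ARRAY_BUFFER, staticVbo);
    // Uploaded once, never touched again.
    glBufferData(GL_ARRAY_BUFFER, staticBytes, staticData, GL_STATIC_DRAW);

    glGenBuffers(1, &dynamicVbo);
    glBindBuffer(GL_ARRAY_BUFFER, dynamicVbo);
    // Allocate storage now, fill it every frame.
    glBufferData(GL_ARRAY_BUFFER, dynamicBytes, nullptr, GL_DYNAMIC_DRAW);
}

void DrawFrame(const void* dynamicData, GLsizeiptr dynamicBytes,
               GLint positionAttrib, GLsizei staticCount, GLsizei dynamicCount)
{
    // Update only the dynamic buffer.
    glBindBuffer(GL_ARRAY_BUFFER, dynamicVbo);
    glBufferSubData(GL_ARRAY_BUFFER, 0, dynamicBytes, dynamicData);

    // Two drawing commands, not two rendering passes.
    glEnableVertexAttribArray(positionAttrib);

    glBindBuffer(GL_ARRAY_BUFFER, staticVbo);
    glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, GL_FALSE, 0, nullptr);
    glDrawArrays(GL_TRIANGLES, 0, staticCount);

    glBindBuffer(GL_ARRAY_BUFFER, dynamicVbo);
    glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, GL_FALSE, 0, nullptr);
    glDrawArrays(GL_TRIANGLES, 0, dynamicCount);
}
```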