Using UWP to monitor live audio and detect gunfire/clap sounds

I am developing a new UWP app that should monitor sound and fire an event for each sudden, sharp sound (something like a gunshot or a clap).
It needs to enable the default audio input and monitor live audio.
It should let me set an audio sensitivity for identifying environmental noise and recognizing a clap or gunshot.
When there is a high-frequency sound such as a clap or gunshot (ideally within a configured frequency tolerance, e.g. +/-40), it should raise an event.
There is no need to save the audio.
I tried to implement this:
public sealed partial class MyPage : Page
{
    private async void Page_Loaded(object sender, RoutedEventArgs e)
    {
        string deviceId = Windows.Media.Devices.MediaDevice.GetDefaultAudioCaptureId(Windows.Media.Devices.AudioDeviceRole.Communications);
        gameChatAudioStateMonitor = AudioStateMonitor.CreateForCaptureMonitoringWithCategoryAndDeviceId(MediaCategory.GameChat, deviceId);
        gameChatAudioStateMonitor.SoundLevelChanged += GameChatSoundLevelChanged;
        //other logic
    }
}
Sound Level Change:
private void GameChatSoundLevelChanged(AudioStateMonitor sender, object args)
switch (sender.SoundLevel)
case SoundLevel.Full:
case SoundLevel.Muted:
case SoundLevel.Low:
// Audio capture should never be "ducked", only muted or full volume.
Debug.WriteLine("Unexpected audio state change.");
Environment: Windows 10 (v1809); IDE: Visual Studio 2017
Not sure if this is the right approach. This is not enabling audio and not hitting the level change event.
I see other options in the WinForms & NAudio tutorial here. With the sampling frequency I can probably check for events, but there are not many tutorials on using NAudio with UWP to plot the graph and identify the frequency.
Following the suggestion from @Rob Caplan - MSFT, here is what I ended up with:
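// We are initializing a COM interface for use within the namespace
// This interface allows access to memory at the byte level which we need to populate audio data that is generated
// Requires: using System.Runtime.InteropServices;
[ComImport, Guid("5B0D3235-4DBA-4D44-865E-8F1D0E4FD04D"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
unsafe interface IMemoryBufferByteAccess
{
    void GetBuffer(out byte* buffer, out uint capacity);
}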
public sealed partial class GunFireMonitorPage : Page
private MainPage _rootPage;
public static GunFireMonitorPage Current;
private AudioGraph _graph;
private AudioDeviceOutputNode _deviceOutputNode;
private AudioFrameInputNode _frameInputNode;
public double Theta;
public GunFireMonitorPage()
Current = this;
protected override async void OnNavigatedTo(NavigationEventArgs e)
_rootPage = MainPage.Current;
await CreateAudioGraph();
protected override void OnNavigatedFrom(NavigationEventArgs e)
private void Page_Loaded(object sender, RoutedEventArgs e)
private unsafe AudioFrame GenerateAudioData(uint samples)
// Buffer size is (number of samples) * (size of each sample)
// We choose to generate single channel (mono) audio. For multi-channel, multiply by number of channels
uint bufferSize = samples * sizeof(float);
AudioFrame audioFrame = new AudioFrame(bufferSize);
using (AudioBuffer buffer = audioFrame.LockBuffer(AudioBufferAccessMode.Write))
using (IMemoryBufferReference reference = buffer.CreateReference())
// Get the buffer from the AudioFrame
// ReSharper disable once SuspiciousTypeConversion.Global
// ReSharper disable once UnusedVariable
((IMemoryBufferByteAccess) reference).GetBuffer(out var dataInBytes, out var capacityInBytes);
// Cast to float since the data we are generating is float
var dataInFloat = (float*)dataInBytes;
float freq = 1000; // choosing to generate frequency of 1kHz
float amplitude = 0.3f;
int sampleRate = (int)_graph.EncodingProperties.SampleRate;
double sampleIncrement = (freq * (Math.PI * 2)) / sampleRate;
// Generate a 1kHz sine wave and populate the values in the memory buffer
for (int i = 0; i < samples; i++)
double sinValue = amplitude * Math.Sin(Theta);
dataInFloat[i] = (float)sinValue;
Theta += sampleIncrement;
return audioFrame;
private void node_QuantumStarted(AudioFrameInputNode sender, FrameInputNodeQuantumStartedEventArgs args)
// GenerateAudioData can provide PCM audio data by directly synthesizing it or reading from a file.
// Need to know how many samples are required. In this case, the node is running at the same rate as the rest of the graph
// For minimum latency, only provide the required amount of samples. Extra samples will introduce additional latency.
uint numSamplesNeeded = (uint)args.RequiredSamples;
if (numSamplesNeeded != 0)
AudioFrame audioData = GenerateAudioData(numSamplesNeeded);
private void Button_Click(object sender, RoutedEventArgs e)
if (generateButton.Content != null && generateButton.Content.Equals("Generate Audio"))
generateButton.Content = "Stop";
audioPipe.Fill = new SolidColorBrush(Colors.Blue);
else if (generateButton.Content != null && generateButton.Content.Equals("Stop"))
generateButton.Content = "Generate Audio";
audioPipe.Fill = new SolidColorBrush(Color.FromArgb(255, 49, 49, 49));
private async Task CreateAudioGraph()
// Create an AudioGraph with default settings
AudioGraphSettings settings = new AudioGraphSettings(AudioRenderCategory.Media);
CreateAudioGraphResult result = await AudioGraph.CreateAsync(settings);
if (result.Status != AudioGraphCreationStatus.Success)
// Cannot create graph
_rootPage.NotifyUser($"AudioGraph Creation Error because {result.Status.ToString()}", NotifyType.ErrorMessage);
_graph = result.Graph;
// Create a device output node
CreateAudioDeviceOutputNodeResult deviceOutputNodeResult = await _graph.CreateDeviceOutputNodeAsync();
if (deviceOutputNodeResult.Status != AudioDeviceNodeCreationStatus.Success)
// Cannot create device output node
_rootPage.NotifyUser($"Audio Device Output unavailable because {deviceOutputNodeResult.Status.ToString()}", NotifyType.ErrorMessage);
speakerContainer.Background = new SolidColorBrush(Colors.Red);
_deviceOutputNode = deviceOutputNodeResult.DeviceOutputNode;
_rootPage.NotifyUser("Device Output Node successfully created", NotifyType.StatusMessage);
speakerContainer.Background = new SolidColorBrush(Colors.Green);
// Create the FrameInputNode at the same format as the graph, except explicitly set mono.
AudioEncodingProperties nodeEncodingProperties = _graph.EncodingProperties;
nodeEncodingProperties.ChannelCount = 1;
_frameInputNode = _graph.CreateFrameInputNode(nodeEncodingProperties);
frameContainer.Background = new SolidColorBrush(Colors.Green);
// Initialize the Frame Input Node in the stopped state
// Hook up an event handler so we can start generating samples when needed
// This event is triggered when the node is required to provide data
_frameInputNode.QuantumStarted += node_QuantumStarted;
// Start the graph since we will only start/stop the frame input node
mc:Ignorable="d" Loaded="Page_Loaded"
Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
<ScrollViewer HorizontalAlignment="Center">
<StackPanel HorizontalAlignment="Center">
<!-- more page content -->
<Grid HorizontalAlignment="Center">
<ColumnDefinition Width="*"/>
<ColumnDefinition Width="*"/>
<RowDefinition Height="55"></RowDefinition>
<AppBarButton x:Name="generateButton" Content="Generate Audio" Click="Button_Click" MinWidth="120" MinHeight="45" Margin="0,50,0,0"/>
<Border x:Name="frameContainer" BorderThickness="0" Background="#4A4A4A" MinWidth="120" MinHeight="45" Margin="0,20,0,0">
<TextBlock x:Name="frame" Text="Frame Input" VerticalAlignment="Center" HorizontalAlignment="Center" />
<Rectangle x:Name="audioPipe" Margin="0,20,0,0" Height="10" MinWidth="160" Fill="#313131" HorizontalAlignment="Stretch"/>
<Border x:Name="speakerContainer" BorderThickness="0" Background="#4A4A4A" MinWidth="120" MinHeight="45" Margin="0,20,0,0">
<TextBlock x:Name="speaker" Text="Output Device" VerticalAlignment="Center" HorizontalAlignment="Center" />
No graph is generated, and there is a continuous beep with a blue line.
Any help is greatly appreciated
Update: Implemented AudioVisualizer
With the help of AudioVisualizer, I was able to plot the live audio graph.
AudioGraph _graph;
AudioDeviceInputNode _inputNode;
PlaybackSource _source;
SourceConverter _converter;
protected override void OnNavigatedTo(NavigationEventArgs e)
_rootPage = MainPage.Current;
_rootPage.SetDimensions(700, 600);
protected override void OnNavigatedFrom(NavigationEventArgs e)
_graph = null;
async void CreateAudioGraphAsync()
var graphResult = await AudioGraph.CreateAsync(new AudioGraphSettings(Windows.Media.Render.AudioRenderCategory.Media));
if (graphResult.Status != AudioGraphCreationStatus.Success)
throw new InvalidOperationException($"Graph creation failed {graphResult.Status}");
_graph = graphResult.Graph;
var inputNodeResult = await _graph.CreateDeviceInputNodeAsync(MediaCategory.Media);
if (inputNodeResult.Status == AudioDeviceNodeCreationStatus.Success)
_inputNode = inputNodeResult.DeviceInputNode;
_source = PlaybackSource.CreateFromAudioNode(_inputNode);
_converter = new SourceConverter
{
    Source = _source.Source,
    MinFrequency = 110.0f,                              // Note A2
    MaxFrequency = 3520.0f,                             // Note A7
    FrequencyCount = 12 * 5 * 5,                        // 5 octaves, 5 bars per note
    FrequencyScale = ScaleType.Linear,
    SpectrumRiseTime = TimeSpan.FromMilliseconds(20),
    SpectrumFallTime = TimeSpan.FromMilliseconds(200),
    RmsRiseTime = TimeSpan.FromMilliseconds(20),        // Use RMS to gate noise, fast rise slow fall
    RmsFallTime = TimeSpan.FromMilliseconds(500),
    ChannelCount = 1
};
NotesSpectrum.Source = _converter;
_rootPage.NotifyUser("Cannot access microphone", NotifyType.ErrorMessage);
Now the challenge is: how do I raise an event when the wave frequency is above a threshold? In that event I would like to count the number of shots and record each shot's timestamp and intensity.
Example Sound
Here is my recording of live sound. As you can hear, there is a big hammer strike every second or less, and I would like to raise an event for each one.

You can find the decibel level of a frame by averaging the amplitude of all the PCM data in that frame. I believe you want to create a graph that handles the input, so that it looks like this:
private static event EventHandler<double> LoudNoise;
private static int quantum = 0;
static AudioGraph ingraph;
private static AudioDeviceInputNode deviceInputNode;
private static AudioFrameOutputNode frameOutputNode;
public static async Task<bool> CreateInputDeviceNode(string deviceId)
Console.WriteLine("Creating AudioGraphs");
// Create an AudioGraph with default settings
AudioGraphSettings graphsettings = new AudioGraphSettings(AudioRenderCategory.Media);
graphsettings.EncodingProperties = new AudioEncodingProperties();
graphsettings.EncodingProperties.Subtype = "Float";
graphsettings.EncodingProperties.SampleRate = 48000;
graphsettings.EncodingProperties.ChannelCount = 2;
graphsettings.EncodingProperties.BitsPerSample = 32;
graphsettings.EncodingProperties.Bitrate = 3072000;
//settings.DesiredSamplesPerQuantum = 960;
//settings.QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired;
CreateAudioGraphResult graphresult = await AudioGraph.CreateAsync(graphsettings);
if (graphresult.Status != AudioGraphCreationStatus.Success)
// Cannot create graph
return false;
ingraph = graphresult.Graph;
AudioGraphSettings nodesettings = new AudioGraphSettings(AudioRenderCategory.GameChat);
nodesettings.EncodingProperties = AudioEncodingProperties.CreatePcm(48000, 2, 32);
nodesettings.DesiredSamplesPerQuantum = 960;
nodesettings.QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired;
frameOutputNode = ingraph.CreateFrameOutputNode(ingraph.EncodingProperties);
quantum = 0;
ingraph.QuantumStarted += Graph_QuantumStarted;
DeviceInformation selectedDevice;
string device = Windows.Media.Devices.MediaDevice.GetDefaultAudioCaptureId(Windows.Media.Devices.AudioDeviceRole.Default);
if (!string.IsNullOrEmpty(device))
selectedDevice = await DeviceInformation.CreateFromIdAsync(device);
else
return false;
CreateAudioDeviceInputNodeResult result =
await ingraph.CreateDeviceInputNodeAsync(MediaCategory.Media, nodesettings.EncodingProperties, selectedDevice);
if (result.Status != AudioDeviceNodeCreationStatus.Success)
// Cannot create device input node
return false;
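deviceInputNode = result.DeviceInputNode;
// Route the microphone into the frame output node and start the graph; otherwise GetFrame() never sees captured audio
deviceInputNode.AddOutgoingConnection(frameOutputNode);
ingraph.Start();
return true;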
private static unsafe void Graph_QuantumStarted(AudioGraph sender, object args)
if (++quantum % 2 == 0)
AudioFrame frame = frameOutputNode.GetFrame();
float[] dataInFloats;
using (AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.Read)) // the captured data is only read here
using (IMemoryBufferReference reference = buffer.CreateReference())
// Get the buffer from the AudioFrame
((IMemoryBufferByteAccess)reference).GetBuffer(out byte* dataInBytes, out uint capacityInBytes);
float* dataInFloat = (float*)dataInBytes;
dataInFloats = new float[capacityInBytes / sizeof(float)];
for (int i = 0; i < capacityInBytes / sizeof(float); i++)
dataInFloats[i] = dataInFloat[i];
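double decibels = 0.0;
foreach (var sample in dataInFloats)
decibels += Math.Abs(sample);
decibels = 20 * Math.Log10(decibels / dataInFloats.Length);
// You can pass the decibel value wherever you'd like from here.
// Float samples lie in [-1, 1], so this level is at most 0 dB (full scale) and the threshold has to be negative.
if (decibels > -10) // -10 is only an illustrative value; tune it for your microphone and environment
LoudNoise?.Invoke(null, decibels); // static context, so there is no "this" to pass as the sender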
P.S. I did all of this static but naturally it'll work if it's all in the same instance
I also copied this partially from my own project so it may have some parts I forgot to trim. Hope it helps
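The per-frame level check described in this answer can also be prototyped away from the device. The sketch below is written in Java only so that it runs without the UWP APIs. `TransientDetector`, `Hit`, `rmsDb`, and `detect` are hypothetical names, and it assumes float samples normalised to [-1, 1]. It uses RMS rather than the mean absolute value (a swapped-in but closely related level measure) and reports an event only on the quiet-to-loud transition, so a clap that spans several frames is counted once.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of any UWP or AudioGraph API. */
final class TransientDetector {

    /** One detected event: the frame where the level rose, and its level in dB full scale. */
    static final class Hit {
        final int frameIndex;
        final double levelDb;

        Hit(int frameIndex, double levelDb) {
            this.frameIndex = frameIndex;
            this.levelDb = levelDb;
        }
    }

    /** RMS level of one frame in dB relative to full scale (assumes samples in [-1, 1]). */
    static double rmsDb(float[] frame) {
        double sumSquares = 0.0;
        for (float s : frame) {
            sumSquares += (double) s * s;
        }
        double rms = Math.sqrt(sumSquares / frame.length);
        // The floor keeps a silent frame from producing log10(0).
        return 20.0 * Math.log10(Math.max(rms, 1e-9));
    }

    /** Records a hit whenever a frame exceeds thresholdDb and the previous frame did not. */
    static List<Hit> detect(List<float[]> frames, double thresholdDb) {
        List<Hit> hits = new ArrayList<>();
        boolean wasLoud = false;
        for (int i = 0; i < frames.size(); i++) {
            double db = rmsDb(frames.get(i));
            boolean isLoud = db > thresholdDb;
            if (isLoud && !wasLoud) {
                hits.add(new Hit(i, db));
            }
            wasLoud = isLoud;
        }
        return hits;
    }
}
```

Under these assumptions the number of hits is the shot count, `levelDb` is the intensity, and a timestamp follows from `frameIndex * samplesPerFrame / sampleRate`. In the C# handler the same logic would run on `dataInFloats` inside `Graph_QuantumStarted`.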

Answering the "is this the right approach" question: no, the AudioStateMonitor will not help with the problem.
AudioStateMonitor.SoundLevelChanged tells you if the system is ducking your sound so it doesn't interfere with something else. For example, it may mute music in favour of the telephone ringer. SoundLevelChanged doesn't tell you anything about the volume or frequency of recorded sound, which is what you'll need to detect your handclap.
The right approach will be along the lines of using an AudioGraph (or WASAPI, but not from C#) to capture the raw audio into an AudioFrameOutputNode to process the signal and then run that through an FFT to detect sounds in your target frequencies and volumes. The AudioCreation sample demonstrates using an AudioGraph, but not specifically AudioFrameOutputNode.
Clapping is reported to fall in a frequency range of 2200 Hz to 2800 Hz.
Recognizing gunshots looks like it's significantly more complicated, with different guns having very different signatures. A quick search found several research papers on this rather than trivial algorithms. I suspect you'll want some sort of machine learning to classify these. Here's a previous thread discussing using ML to distinguish between gunshots and non-gunshots: SVM for one Vs all acoustic signal classification


Emgu CV camera capture: frame rate less than 1 FPS

I am really new to programming and Emgu CV, and I am struggling with the basics, so I hope you can help me out. I am writing a small program for my school to demonstrate different filters and image-conversion methods.
I am trying to grab images from a USB webcam and display them in ImageBoxes, e.g. converting them to grayscale or binary, changing the resolution and frame rate, and so on. I think I am doing something wrong, because I am getting frame rates below 1 FPS from the camera, even at low resolution (640/300 px). My question is whether you can help me increase the frame rate. Also, I could not grab an image with the QueryFrame() method, so I used a Mat object and converted it into the images.
Here is my code:
private void Capture_ImageGrabbed(object sender, EventArgs e) // capture function
capture.SetCaptureProperty(CapProp.FrameHeight, resolution_X);
capture.SetCaptureProperty(CapProp.FrameWidth, resolution_Y);
ImgInput = m.ToImage<Bgr, byte>();
ImgGrayInput = m.ToImage<Gray, byte>();
iB_colour_Image.Image = ImgInput;
iB_Grey.Image = ImgGrayInput;
if (imgchange)
ImgReference = ImgInput;
ImgGrayReference = ImgReference.Convert<Gray , byte>();
imgchange = false;
ImgBinarizedInput = new Image<Gray, byte>(ImgGrayInput.Width, ImgGrayInput.Height);
double thresholdInput = CvInvoke.Threshold(ImgGrayInput, ImgBinarizedInput, tb_value, 255, Emgu.CV.CvEnum.ThresholdType.Binary);
ImgBinarizedReference = new Image<Gray, byte>(ImgGrayReference.Width, ImgGrayReference.Height);
double thresholdReference = CvInvoke.Threshold(ImgGrayReference, ImgBinarizedReference, tb_value, 255, Emgu.CV.CvEnum.ThresholdType.Binary);
iB_Binary.Image = ImgBinarizedInput;
private void bt_play_stop_Click(object sender, EventArgs e) // button to start/stop the capture
if (buttonstate == false)
if (capture == null)
capture = new VideoCapture(selectedcamera, VideoCapture.API.DShow);
capture.ImageGrabbed += Capture_ImageGrabbed;
buttonstate = true;
bt_play_stop.Text = "Stop";
buttonstate = false;
bt_play_stop.Text = "Play";

How to filter on Apache Edgent and also show the values which were filtered?

I am using Apache Edgent (Java framework) to poll values from a HCSR04 ultrasonic sensor on a Raspberry Pi every 3 seconds. I use a filter to not get values from 50cm to 80cm.
UltrasonicStream sensor = new UltrasonicStream();
DirectProvider dp = new DirectProvider();
Topology topology = dp.newTopology();
TStream<Double> tempReadings = topology.poll(sensor, 3, TimeUnit.SECONDS);
TStream<Double> filteredReadings = tempReadings.filter(reading -> reading < 50 || reading > 80);
System.out.println("filter added: tempReadings.filter(reading -> reading < 50 || reading > 80);");
I want to show a message whenever a value is filtered out. Values outside the 50-80 cm range pass the filter and are polled normally, and values inside it are dropped, which is fine. However, I would like to report that a value was filtered, using the Apache Edgent libraries. I know that I could do something in the public Double get() method, but I wonder whether Edgent itself offers a method for this.
public class UltrasonicStream implements Supplier<Double> {
private static final long serialVersionUID = -6511218542753341056L;
private static GpioPinDigitalOutput sensorTriggerPin;
private static GpioPinDigitalInput sensorEchoPin;
private static final GpioController gpio = GpioFactory.getInstance();
private double currentDistance = -1.0;
/**
 * The HCSR04 Ultrasonic sensor is connected on the physical pin 16 and 18 which
 * correspond to the GPIO 04 and 05 of the WiringPi library.
 */
public UltrasonicStream() {
// Trigger pin as OUTPUT
sensorTriggerPin = gpio.provisionDigitalOutputPin(RaspiPin.GPIO_04);
// Echo pin as INPUT
sensorEchoPin = gpio.provisionDigitalInputPin(RaspiPin.GPIO_05, PinPullResistance.PULL_DOWN);
/** This is the override method of the Supplier interface from Apache Edgent */
public Double get() {
try {
System.out.print("Distance in centimeters: ");
currentDistance = getDistance();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
return currentDistance;
/**
 * Retrieve the distance measured by the HCSR04 Ultrasonic sensor connected on a
 * Raspberry Pi 3+B
 *
 * @return the distance in centimeters
 * @throws InterruptedException
 */
public double getDistance() throws InterruptedException {
double distanceCM = -1;
try {
// Thread.sleep(2000);
sensorTriggerPin.high(); // Make trigger pin HIGH
Thread.sleep(0, 10_000); // Delay for 10 microseconds (10,000 ns); (long) 0.01 would truncate to 0 ms
sensorTriggerPin.low(); // Make trigger pin LOW
// Wait until the ECHO pin gets HIGH
while (sensorEchoPin.isLow()) {
// Store the current time to calculate ECHO pin HIGH time.
long startTime = System.nanoTime();
// Wait until the ECHO pin gets LOW
while (sensorEchoPin.isHigh()) {
// Store the echo pin HIGH end time to calculate ECHO pin HIGH time.
long endTime = System.nanoTime();
distanceCM = ((((endTime - startTime) / 1e3) / 2) / 29.1);
// Printing out the distance in centimeters
// System.out.println("Distance: " + distanceCM + " centimeters");
return distanceCM;
} catch (InterruptedException e) {
return distanceCM;
You can use TStream.split() to create two streams: one whose tuples match your filter predicate and one for those that don't. You can then do whatever you want with either stream. e.g. TStream.peek(t -> System.out.println("excluded: " + t)), or TStream.print(...)
I implemented it like this:
UltrasonicStream sensor = new UltrasonicStream();
DirectProvider dp = new DirectProvider();
Topology topology = dp.newTopology();
TStream<Double> tempReadings = topology.poll(sensor, 3, TimeUnit.SECONDS);
TStream<Double> filteredReadings = tempReadings.filter(reading -> {
    boolean threshold = reading < 20 || reading > 80;
    if (!threshold) {
        System.out.println(String.format("Threshold reached: %s cm", reading));
    }
    return threshold;
});
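The inline logging in the filter lambda can be factored into a small reusable wrapper. The sketch below uses `java.util.function.Predicate` so that it runs without Edgent on the classpath; `ReportingFilter` and `reporting` are hypothetical names. Edgent's `filter` takes its own `Predicate` type from `org.apache.edgent.function`, so inside a topology the wrapper would need to return that type instead (an assumption to check against the Edgent Javadoc).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical helper; not part of Apache Edgent. */
final class ReportingFilter {

    /** Wraps keep so that every value it rejects is handed to onRejected. */
    static <T> Predicate<T> reporting(Predicate<T> keep, Consumer<T> onRejected) {
        return value -> {
            boolean kept = keep.test(value);
            if (!kept) {
                onRejected.accept(value);
            }
            return kept;
        };
    }

    public static void main(String[] args) {
        List<Double> rejected = new ArrayList<>();
        // Same predicate as in the question: drop readings from 50 cm to 80 cm.
        Predicate<Double> filter = reporting(r -> r < 50 || r > 80, rejected::add);
        for (double reading : new double[] {12.0, 65.5, 90.0}) {
            if (filter.test(reading)) {
                System.out.println("kept: " + reading);
            }
        }
        System.out.println("filtered out: " + rejected);
    }
}
```

Keeping the side effect in a wrapper leaves the filter condition itself readable, and the `onRejected` callback can print, count, or forward the dropped readings as needed.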

ListView with Groove-like quick-return header

When scrolling down, Groove moves the header up, outside of the viewable area just like a regular ListView header. When scrolling back up it moves the header back down into the viewable area right away, regardless of the current vertical scroll offset. The header seems to be part of the ListView content because the scrollbar includes the header.
How can this be implemented in a Windows 10 UWP app?
You can do this by utilizing the ListView's internal ScrollViewer's ViewChanged event.
First you have to obtain the internal ScrollViewer. This is the simplest version, but you might want to use one of the many VisualTreeHelper extensions around to do it more safely and easily:
private void MainPage_Loaded(object sender, RoutedEventArgs e)
{
    var border = VisualTreeHelper.GetChild(MyListView, 0);
    var scrollviewer = VisualTreeHelper.GetChild(border, 0) as ScrollViewer;
    scrollviewer.ViewChanged += Scrollviewer_ViewChanged;
}
In the EventHandler, you can then change the visibility of your header depending on the scroll direction.
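private void Scrollviewer_ViewChanged(object sender, ScrollViewerViewChangedEventArgs e)
{
    var sv = sender as ScrollViewer;
    // Hide the header when scrolling down, show it again when scrolling up
    if (sv.VerticalOffset > _lastVerticalOffset)
        MyHeader.Visibility = Visibility.Collapsed;
    else
        MyHeader.Visibility = Visibility.Visible;
    _lastVerticalOffset = sv.VerticalOffset; // remember the offset for the next comparison
}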
This is the basic idea. You might want to add some smooth animations instead of just changing the visibility.
After looking around a bit and experimentation I can now answer my own question.
One can use an expression-based composition animation to adjust the Y offset of the header in relation to scrolling. The idea is based on this answer. I prepared a complete working example on GitHub.
The animation is prepared in the SizeChanged event of the ListView:
ScrollViewer scrollViewer = null;
private double previousVerticalScrollOffset = 0.0;
private CompositionPropertySet scrollProperties;
private CompositionPropertySet animationProperties;
SizeChanged += (sender, args) =>
if (scrollProperties == null)
scrollProperties = ElementCompositionPreview.GetScrollViewerManipulationPropertySet(scrollViewer);
var compositor = scrollProperties.Compositor;
if (animationProperties == null)
animationProperties = compositor.CreatePropertySet();
animationProperties.InsertScalar("OffsetY", 0.0f);
var expressionAnimation = compositor.CreateExpressionAnimation("animationProperties.OffsetY - ScrollingProperties.Translation.Y");
expressionAnimation.SetReferenceParameter("ScrollingProperties", scrollProperties);
expressionAnimation.SetReferenceParameter("animationProperties", animationProperties);
var headerVisual = ElementCompositionPreview.GetElementVisual((UIElement)Header);
headerVisual.StartAnimation("Offset.Y", expressionAnimation);
The OffsetY variable in the animationProperties will drive the animation of the OffsetY property of the header. The OffsetY variable is updated in the ViewChanged event of the ScrollViewer:
scrollViewer.ViewChanged += (sender, args) =>
float oldOffsetY = 0.0f;
animationProperties.TryGetScalar("OffsetY", out oldOffsetY);
var delta = scrollViewer.VerticalOffset - previousVerticalScrollOffset;
previousVerticalScrollOffset = scrollViewer.VerticalOffset;
var newOffsetY = oldOffsetY - (float)delta;
// Keep values between the negative header height and 0
FrameworkElement header = (FrameworkElement)Header;
newOffsetY = Math.Max((float)-header.ActualHeight, newOffsetY);
newOffsetY = Math.Min(0, newOffsetY);
if (oldOffsetY != newOffsetY)
animationProperties.InsertScalar("OffsetY", newOffsetY);
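The clamping arithmetic in the ViewChanged handler can be checked in isolation. The following Java sketch reproduces only that calculation; `QuickReturnHeader` and `nextOffset` are hypothetical names, and no composition or XAML API is involved.

```java
/** Hypothetical helper mirroring the offset arithmetic of the ViewChanged handler. */
final class QuickReturnHeader {

    /**
     * Moves the header by the scroll delta and clamps the result to
     * [-headerHeight, 0]: 0 means fully visible, -headerHeight fully hidden.
     */
    static float nextOffset(float oldOffsetY, double scrollDelta, double headerHeight) {
        float candidate = oldOffsetY - (float) scrollDelta;
        candidate = Math.max((float) -headerHeight, candidate);
        return Math.min(0f, candidate);
    }

    public static void main(String[] args) {
        float offset = 0f;
        // Scroll down 30, down 40, then back up 20, with a header 50 units tall.
        for (double delta : new double[] {30, 40, -20}) {
            offset = nextOffset(offset, delta, 50);
            System.out.println("delta " + delta + " -> offset " + offset);
        }
    }
}
```

Because the offset depends only on the scroll delta and not on the absolute scroll position, the header starts to reappear as soon as the user scrolls up, wherever they are in the list.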
While this does animate correctly, the header is not stacked on top of the ListView items. Therefore the final piece to the puzzle is to decrease the ZIndex of the ItemsPanelTemplate of the ListView:
<ItemsStackPanel Canvas.ZIndex="-1" />
Which gives this as a result:

Speech recognition using SetInputToWaveFile ends prematurely

I want to do speech recognition of an audio file.
My code is pretty basic and derived from here. The problem is that it stops with every wave file prematurely after a few seconds even though some wave files are hours long.
How to make it scan the whole file?
namespace Stimmenerkennung
public partial class Form1 : Form
Thread erkennung;
bool completed;
private void Form1_Load(object sender, EventArgs e)
erkennung = new Thread(erkennen);
void erkennen()
using (SpeechRecognitionEngine recognizer =
new SpeechRecognitionEngine())
// Create and load a grammar.
Grammar dictation = new DictationGrammar();
dictation.Name = "Dictation Grammar";
// Configure the input to the recognizer.
// Attach event handlers for the results of recognition.
recognizer.SpeechRecognized +=
new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);
recognizer.RecognizeCompleted +=
new EventHandler<RecognizeCompletedEventArgs>(recognizer_RecognizeCompleted);
// Perform recognition on the entire file.
db("Starting asynchronous recognition...");
while (!completed)
//fs((int)(100 / recognizer.AudioPosition.TotalSeconds * recognizer.AudioPosition.Seconds));
// Handle the SpeechRecognized event.
void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
if (e.Result != null && e.Result.Text != null)
db(" Recognized text not available.");
// Handle the RecognizeCompleted event.
void recognizer_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
if (e.Cancelled)
db(" Operation cancelled.");
if (e.InputStreamEnded)
db(" End of stream encountered.");
completed = true;
void db(string t)
textBox1.Text = textBox1.Text + Environment.NewLine + t;
//textBox1.Text = t;
You can split the file at the silences into chunks of a few seconds each and feed each chunk to the recognizer separately. Then you can combine the results into a single string.
You can use any voice activity detection (VAD) implementation to perform the split; a simple energy-based VAD that computes the energy of each frame will be sufficient.
You can find some existing VAD implementations in the CMUSphinx project.
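A minimal sketch of such an energy-based split, written in Java and assuming the audio is already decoded to float samples in [-1, 1]. `EnergyVad` and `speechSegments` are hypothetical names, and the frame size and threshold are illustrative values that would need tuning for real recordings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not taken from CMUSphinx or System.Speech. */
final class EnergyVad {

    /**
     * Returns {startSample, endSample} pairs (end exclusive) covering runs of
     * consecutive frames whose mean squared amplitude exceeds energyThreshold.
     */
    static List<int[]> speechSegments(float[] samples, int frameSize, double energyThreshold) {
        List<int[]> segments = new ArrayList<>();
        int segmentStart = -1; // -1 means we are currently in silence
        for (int start = 0; start < samples.length; start += frameSize) {
            int end = Math.min(start + frameSize, samples.length);
            double energy = 0.0;
            for (int i = start; i < end; i++) {
                energy += (double) samples[i] * samples[i];
            }
            energy /= (end - start);
            boolean voiced = energy > energyThreshold;
            if (voiced && segmentStart < 0) {
                segmentStart = start;                          // silence -> speech
            } else if (!voiced && segmentStart >= 0) {
                segments.add(new int[] {segmentStart, start}); // speech -> silence
                segmentStart = -1;
            }
        }
        if (segmentStart >= 0) {
            segments.add(new int[] {segmentStart, samples.length});
        }
        return segments;
    }
}
```

Each returned range could then be written out as its own wave chunk and recognized separately. A practical splitter would usually also require a minimum silence length before closing a segment, so that short pauses between words do not cut a sentence apart.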

SlimDX antialiasing

I am trying to get high-quality antialiasing by following a tutorial I found on the internet, but I did not achieve a very good result.
I already set the multisampling to the maximum:
m_swapChainDesc.SampleDescription = new DXGI.SampleDescription(8,0);
To me it appears as the pixel size of the rendered image is larger than the actual pixel size of my screen.
Thank you very much in advance for your valuable inputs
here is the complete code:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using SlimDX;
using DX10 = SlimDX.Direct3D10;
using DXGI = SlimDX.DXGI;
namespace TutorialSeries.DirectX10.Chapter3
public partial class MainWindow : Form
private DX10.Device m_device;
private DXGI.SwapChainDescription m_swapChainDesc;
private DXGI.SwapChain m_swapChain;
private DXGI.Factory m_factory;
private DX10.RenderTargetView m_renderTarget;
private bool m_initialized;
private SimpleBox m_simpleBox;
private Matrix m_viewMatrix;
private Matrix m_projMatrix;
private Matrix m_worldMatrix;
private Matrix m_viewProjMatrix;
public MainWindow()
this.SetStyle(ControlStyles.ResizeRedraw, true);
this.SetStyle(ControlStyles.AllPaintingInWmPaint, true);
this.SetStyle(ControlStyles.Opaque, true);
/// <summary>
/// Initializes device and other resources needed for rendering. Returns true, if successful.
/// </summary>
private bool Initialize3D()
m_device = new DX10.Device(DX10.DriverType.Warp, DX10.DeviceCreationFlags.SingleThreaded);
m_factory = new DXGI.Factory();
m_swapChainDesc = new DXGI.SwapChainDescription();
m_swapChainDesc.OutputHandle = this.Handle;
m_swapChainDesc.IsWindowed = true;
m_swapChainDesc.BufferCount = 1;
m_swapChainDesc.Flags = DXGI.SwapChainFlags.AllowModeSwitch;
m_swapChainDesc.ModeDescription = new DXGI.ModeDescription(
new Rational(60, 1),
m_swapChainDesc.SampleDescription = new DXGI.SampleDescription(8,0);
m_swapChainDesc.SwapEffect = DXGI.SwapEffect.Discard;
m_swapChainDesc.Usage = DXGI.Usage.RenderTargetOutput;
m_swapChain = new DXGI.SwapChain(m_factory, m_device, m_swapChainDesc);
DX10.Viewport viewPort = new DX10.Viewport();
viewPort.X = 0;
viewPort.Y = 0;
viewPort.Width = this.Width;
viewPort.Height = this.Height;
viewPort.MinZ = 0f;
viewPort.MaxZ = 1f;
//DX10.Texture2D backBuffer = m_swapChain.GetBuffer<DX10.Texture2D>(0);
DX10.Texture2D Texture = DX10.Texture2D.FromSwapChain<DX10.Texture2D>(m_swapChain,0);
//m_renderTarget = new DX10.RenderTargetView(m_device, backBuffer);
//DX10.RenderTargetViewDescription renderDesc = new DX10.RenderTargetViewDescription();
//renderDesc.FirstArraySlice = 0;
//renderDesc.MipSlice = 0;
m_renderTarget = new DX10.RenderTargetView(m_device, Texture);
DX10.RasterizerStateDescription rsd = new DX10.RasterizerStateDescription();
rsd.CullMode = DX10.CullMode.Back;
rsd.FillMode = DX10.FillMode.Wireframe;
rsd.IsMultisampleEnabled = true;
rsd.IsAntialiasedLineEnabled = false;
rsd.IsDepthClipEnabled = false;
rsd.IsScissorEnabled = false;
DX10.RasterizerState RasterStateWireFrame = DX10.RasterizerState.FromDescription(m_device,rsd);
DX10.BlendStateDescription blendDesc = new DX10.BlendStateDescription();
blendDesc.BlendOperation = DX10.BlendOperation.Add;
blendDesc.AlphaBlendOperation = DX10.BlendOperation.Add;
blendDesc.SourceAlphaBlend = DX10.BlendOption.Zero;
blendDesc.DestinationAlphaBlend = DX10.BlendOption.Zero;
blendDesc.SourceBlend = DX10.BlendOption.SourceColor;
blendDesc.DestinationBlend = DX10.BlendOption.Zero;
blendDesc.IsAlphaToCoverageEnabled = false;
blendDesc.SetWriteMask(0, DX10.ColorWriteMaskFlags.All);
blendDesc.SetBlendEnable(0, true);
DX10.BlendState m_blendState = DX10.BlendState.FromDescription(m_device, blendDesc);
m_device.Rasterizer.State = RasterStateWireFrame;
m_device.OutputMerger.BlendState = m_blendState;
m_viewMatrix = Matrix.LookAtLH(
new Vector3(0f, 0f, -4f),
new Vector3(0f, 0f, 1f),
new Vector3(0f, 1f, 0f));
m_projMatrix = Matrix.PerspectiveFovLH(
(float)Math.PI * 0.5f,
this.Width / (float)this.Height,
0.1f, 100f);
m_viewProjMatrix = m_viewMatrix * m_projMatrix;
m_worldMatrix = Matrix.RotationYawPitchRoll(0.85f, 0.85f, 0f);
m_simpleBox = new SimpleBox();
m_initialized = true;
catch (Exception ex)
MessageBox.Show("Error while initializing Direct3D10: \n" + ex.Message);
m_initialized = false;
return m_initialized;
/// <summary>
/// Rendering is done during the standard OnPaint event
/// </summary>
protected override void OnPaint(PaintEventArgs e)
if (m_initialized)
m_device.ClearRenderTargetView(m_renderTarget, new Color4(Color.CornflowerBlue));
m_simpleBox.Render(m_device, m_worldMatrix, m_viewProjMatrix);
m_swapChain.Present(0, DXGI.PresentFlags.None);
/// <summary>
/// Initialize 3D-Graphics within OnLoad event
/// </summary>
protected override void OnLoad(EventArgs e)
This is an old question but it's a shame it never got answered; I stumbled across it on Google so I figure answering it may help someone else in the future...
First of all, Pascal, you did NOT set MSAA to the maximum... You're using 8:0, which means 8 samples at a quality of 0 (zero)... definitely not the maximum. What the "max" is depends on the GPU installed on the local machine. So it varies from PC to PC. That's why a DirectX application needs to use DXGI to properly enumerate hardware devices and determine what settings are valid. This is no trivial topic and will require you to do some research and practice of your own. The DirectX SDK Documentation and samples/tutorials are a great starting place, and there's a lot of other materials to be found online. But on my machine, for instance, my GTX-580 GPU can support 8:16 MSAA (possibly higher, but haven't checked).
So you need to learn to use DXGI to enumerate your graphics cards and monitors and figure out what MSAA levels (and other graphics features/settings) it can support. That's the only way you'll ever figure out the "max" MSAA settings or the correct refresh rate of your monitor, for example. If you're clever you will write yourself a small library or component for your game engine that will enumerate hardware devices for you and figure out the optimal graphics settings so you won't have to re-do this over and over for future projects.
