SpeechRecognizer WinRT unwanted audio output

I am writing a WinRT app, testing it in VS 2015, x86 debug. I am getting
unwanted audio output that is not created by my application. Here is my code:
using Windows.Media.SpeechRecognition;
SpeechRecognitionResult result = await recEngine.RecognizeAsync();
// recEngine is of type SpeechRecognizer; using dictation grammar.
if (result.Confidence == SpeechRecognitionConfidence.Rejected)
    return;
if (result.RawConfidence < .60)
    return;
// proceed to process the speech-to-text in the result...
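For reference, recEngine was set up for dictation roughly as follows (simplified sketch; the full initialization isn't shown here):
// Simplified sketch of the dictation setup (details omitted).
var recEngine = new SpeechRecognizer();
recEngine.Constraints.Add(
    new SpeechRecognitionTopicConstraint(SpeechRecognitionScenario.Dictation, "dictation"));
await recEngine.CompileConstraintsAsync();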
If I just return (i.e., the speech is unrecognized), I do not get the audio output, which is fine. If I successfully recognize the speech, I process it in my app, but I also get the following audio output (from my desktop speakers):
"I do not recognize the command. Please try again".
Where is this coming from and how do I suppress it?

Sorry, my mistake. I found the source of the audio output (it was created by my own code).

Related

Java plays audio on Windows but not on Ubuntu

I've been writing a Java program for fun to play some tunes in .wav format.
While developing, the audio played fine on my Windows machine, but when trying the same code on Ubuntu the audio does not play. There are no errors logged to the console; I press the 'Play' button and nothing happens.
Here's the code I've been using, also contains some logging code:
try {
    System.out.println("All mixers:");
    for (Mixer.Info m : AudioSystem.getMixerInfo()) {
        System.out.println(m.getName());
    }
    System.out.println("Default mixer: " + AudioSystem.getMixer(null).getMixerInfo());
    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File("bsc.wav"));
    Clip clip = AudioSystem.getClip();
    clip.open(audioInputStream);
    clip.start();
} catch (UnsupportedAudioFileException ex) {
    ex.printStackTrace();
} catch (IOException ex) {
    ex.printStackTrace();
} catch (LineUnavailableException ex) {
    ex.printStackTrace();
}
This outputs:
All mixers:
Port HDMI [hw:0]
Port PCH [hw:1]
Port Snowball [hw:2]
HDMI [plughw:0,3]
HDMI [plughw:0,7]
HDMI [plughw:0,8]
PCH [plughw:1,0]
PCH [plughw:1,1]
PCH [plughw:1,2]
Snowball [plughw:2,0]
Default mixer: HDMI [plughw:0,3], version 5.3.0-51-generic
I thought this was more likely an issue with my setup than with the code, so it seemed like a fit for Super User. I also wrote some code to try the other available mixers, but none of those worked either.
Looking around, I saw something about needing to install JMF, but couldn't confirm whether that's really the solution.
Running Ubuntu 19.10 and Java 11.0.7 OpenJDK.
Edit 1: I found this page on Stack Overflow about Java Sound. I tried running the example code, but also to no avail...
Here are a couple of things to investigate and (probably) rule out.
Is your button JavaFX? If so, have you verified JavaFX is working?
Have you verified the path of the File object being created?
Try using a URL as your AudioSystem.getAudioInputStream argument, obtaining it via Class.getResource().
Use the ais to get a DataLine.Info object as part of the process.
The code for using DataLine.Info follows:
// Load the file from the classpath (the class name and resource path here are placeholders).
URL url = YourApp.class.getResource("/bsc.wav");
AudioInputStream ais = AudioSystem.getAudioInputStream(url);
// Request a Clip line that matches the stream's format.
DataLine.Info info = new DataLine.Info(Clip.class, ais.getFormat());
Clip clip = (Clip) AudioSystem.getLine(info);
clip.open(ais);
You might inspect the Clip in the debugger before playing it. Check the .audioData property, to verify the data was loaded into memory. You can also verify the audio format here.
I just went through a rigamarole trying to get a jar file made in Windows to execute on Ubuntu 18.04. It turned out to be JavaFX related actually. Once I fixed that, the program ran (including its audio). I always use URL when working with audio assets.

Play audio obtained as byte[] from Azure Speech translation

I am following the samples for Microsoft Cognitive Services Speech SDK, namely the Speech Translation.
The sample for dotnet core uses the microphone as audio input and translates what you speak. Translated results are also available as synthesized speech. I would like to play this audio but could not find the appropriate code for that.
I tried using NAudio as suggested in this answer but I get garbled audio. I guess there is more to the format of the audio.
Any pointers?
On .NET Core, many audio packages might not work. For example, with NAudio I can't play sound on my Mac.
I got it working using the NetCoreAudio package (NuGet), with the following implementation in the translation Synthesizing event:
recognizer.Synthesizing += (s, e) =>
{
    var audio = e.Result.GetAudio();
    Console.WriteLine(audio.Length != 0
        ? $"AudioSize: {audio.Length}"
        : $"AudioSize: {audio.Length} (end of synthesis data)");
    if (audio.Length > 0)
    {
        var fileName = Path.Combine(Directory.GetCurrentDirectory(), $"{DateTime.Now.ToString("yyyy-MM-dd_HH-mm-ss.wav")}");
        File.WriteAllBytes(fileName, audio);
        var player = new Player();
        player.Play(fileName).Wait();
    }
};

Actions on Google simulator does not separate screen output devices

I am trying to test my action in the Google Actions Simulator. Unfortunately, the simulator does not seem to recognize the difference between the phone surface and the smart speaker surface.
I tried logging the screentest variable to the console. In the logs, both the phone & speaker surfaces show 'true', which clearly is not correct. I also checked the 'conversation' data log. Both the phone & speaker output contain SCREEN_OUTPUT.
app.intent('Default Welcome Intent', (conv) => {
    let screentest = conv.available.surfaces.capabilities.has('actions.capability.SCREEN_OUTPUT')
    console.log(screentest)
    if (screentest === true) {
        conv.add('Text with screen')
    } else if (screentest === false) {
        conv.add('Text without screen')
    } else {
        conv.add('impossible')
    }
})
Expected results: when using the speaker surface inside the simulator, the output of the assistant should be 'Text without Screen'.
Actual results: Both phone & speaker surface inside the simulator generate the answer: 'Text with screen'.
The issue is that you're not quite checking for surfaces correctly.
There are two sets of capabilities reported:
The capabilities available on the surface the user is currently using. If you're using the actions-on-google library, these are available using conv.surface.capabilities.has().
The capabilities available on any surface that the user has connected to their account. These are available using conv.available.surfaces.capabilities.has() if you're using the actions-on-google library.
You're currently using the second one when you should be checking the first one to see what the user is currently using.
You'd use the second when there is something you want to display and you need to make sure the user has a surface that can handle it before suggesting they switch to it.
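A sketch of the corrected handler, adapted from the code in the question (only the capability check changes):
app.intent('Default Welcome Intent', (conv) => {
    // Check the surface the user is on right now, not every surface on the account.
    const hasScreen = conv.surface.capabilities.has('actions.capability.SCREEN_OUTPUT')
    if (hasScreen) {
        conv.add('Text with screen')
    } else {
        conv.add('Text without screen')
    }
})
With that check, the speaker surface in the simulator should answer with 'Text without screen'.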

IMFMediaPlayer hangs during SetSourceFromByteStream

Background: I'm coding a Metro-style app for Win8. I need to be able to play music files. Because of quality and space requirements we're using encoded audio (mp3/ogg).
I'm using XAudio2 to play sound effects (.wav files), but since I couldn't figure out a way to play encoded audio with it, I decided to play the music files with Media Foundation (IMFMediaPlayer interface).
I downloaded the Metro app samples and found that the Media engine native C++ video playback sample was closest to what I needed.
Now that my app has the MediaPlayer playing music, I ran into a problem. If the device running the app is slow enough, the MediaPlayer hangs. When I'm running the release version of the app on my device, I can hear the music just fine. But when I attach the debugger or run it on a slower device, it hangs when I'm setting the byte stream for the MediaPlayer to play.
Here's some code; you'll find it pretty similar to the sample:
StorageFolder^ installedLocation = Windows::ApplicationModel::Package::Current->InstalledLocation;
m_pickFileTask = Concurrency::task<StorageFile^>(installedLocation->GetFileAsync(filename), m_tcs.get_token());
auto player = this;
m_pickFileTask.then([player](StorageFile^ fileHandle)
{
    player->SetURL(fileHandle->Path);
    Concurrency::task<IRandomAccessStream^> fOpenStreamTask = Concurrency::task<IRandomAccessStream^>(fileHandle->OpenAsync(Windows::Storage::FileAccessMode::Read));
    fOpenStreamTask.then([player](IRandomAccessStream^ streamHandle)
    {
        MEDIA::ThrowIfFailed(
            player->m_spMediaEngine->Pause()
            );
        MEDIA::GetMediaError(player->m_spMediaEngine);
        player->SetBytestream(streamHandle);
        if (player->m_spMediaEngine)
        {
            MEDIA::ThrowIfFailed(
                player->m_spEngineEx->Play()
                );
            MEDIA::GetMediaError(player->m_spMediaEngine);
        }
    });
});
And here's the SetBytestream method:
void SetBytestream(IRandomAccessStream^ streamHandle)
{
    if (m_spMFByteStream != nullptr)
    {
        m_spMFByteStream->Close();
        m_spMFByteStream = nullptr;
    }
    MEDIA::ThrowIfFailed(
        MFCreateMFByteStreamOnStreamEx((IUnknown*)streamHandle, &m_spMFByteStream)
        );
    MEDIA::ThrowIfFailed(
        m_spEngineEx->SetSourceFromByteStream(m_spMFByteStream.Get(), m_bstrURL)
        );
    MEDIA::GetMediaError(m_spEngineEx);
    return;
}
The line where it hangs is:
m_spEngineEx->SetSourceFromByteStream(m_spMFByteStream.Get(), m_bstrURL)
When I'm debugging the app, I can press pause and see the stack. Well, not much of it, but at least I can see that it's stuck indefinitely at
ntdll.dll!77b7f4dc()
Any ideas why my app would hang in such a way?
(OPTIONAL: If you know a better way to play mp3/ogg in a c++ metro-styled app, let me know)
I could not figure out why this is happening, but I managed to code a workaround:
IMFSourceReader can be used to decode MP3s and feed the bytes into an XAudio2 source voice.
The XAudio2 audio stream effect sample contains a good example of how to do this.
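In outline, the decode loop looks something like this (a rough sketch, not the sample's exact code; error handling, MFStartup, and creation of the IXAudio2SourceVoice are omitted):
#include <wrl/client.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
using Microsoft::WRL::ComPtr;

ComPtr<IMFSourceReader> reader;
// Path is illustrative; in a Metro app you would more likely use
// MFCreateSourceReaderFromByteStream on a stream opened from a StorageFile.
MFCreateSourceReaderFromURL(L"music.mp3", nullptr, &reader);

// Ask the reader to decode the compressed stream to PCM.
ComPtr<IMFMediaType> pcmType;
MFCreateMediaType(&pcmType);
pcmType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
pcmType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM);
reader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, nullptr, pcmType.Get());

// Pull decoded samples and hand the PCM bytes to the source voice.
for (;;)
{
    ComPtr<IMFSample> sample;
    DWORD streamFlags = 0;
    reader->ReadSample(MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, nullptr, &streamFlags, nullptr, &sample);
    if ((streamFlags & MF_SOURCE_READERF_ENDOFSTREAM) || sample == nullptr)
        break;

    ComPtr<IMFMediaBuffer> buffer;
    sample->ConvertToContiguousBuffer(&buffer);

    BYTE* data = nullptr;
    DWORD length = 0;
    buffer->Lock(&data, nullptr, &length);
    // Copy 'data' into a buffer the app owns, then submit it, e.g.:
    //   XAUDIO2_BUFFER xbuf = {};
    //   xbuf.AudioBytes = length;
    //   xbuf.pAudioData = ownedCopy;
    //   sourceVoice->SubmitSourceBuffer(&xbuf);
    buffer->Unlock();
}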

Blackberry Audio Recording Sample Code

Does anyone know of a good repository to get sample code for the BlackBerry? Specifically, samples that will help me learn the mechanics of recording audio, possibly even sampling it and doing some on the fly signal processing on it?
I'd like to read incoming audio, sample by sample if need be, then process it to produce a desired result, in this case a visualizer.
The RIM API contains JSR 135 (Java Mobile Media API) for handling audio & video content.
You're correct about the mess on the BB Knowledge Base. The only way is to browse it, hoping they're not going to change the site map again.
It's Developers->Resources->Knowledge Base->Java API's&Samples->Audio&Video
Audio Recording
Basically it's simple to record audio:
create a Player with the correct audio encoding
get the RecordControl
start recording
stop recording
Links:
RIM 4.6.0 API ref: Package javax.microedition.media
How To - Record Audio on a BlackBerry smartphone
How To - Play audio in an application
How To - Support streaming audio to the media application
How To - Specify Audio Path Routing
How To - Obtain the media playback time from a media application
What Is - Supported audio formats
What Is - Media application error codes
Audio Record Sample
A Thread with the Player, RecordControl and resources is declared:
final class VoiceNotesRecorderThread extends Thread {
    private Player _player;
    private RecordControl _rcontrol;
    private ByteArrayOutputStream _output;
    private byte _data[];

    VoiceNotesRecorderThread() {}

    private int getSize() {
        return (_output != null ? _output.size() : 0);
    }

    private byte[] getVoiceNote() {
        return _data;
    }
}
In Thread.run(), audio recording is started:
public void run() {
    try {
        // Create a Player that captures live audio.
        _player = Manager.createPlayer("capture://audio");
        _player.realize();

        // Get the RecordControl and set the record stream.
        _rcontrol = (RecordControl) _player.getControl("RecordControl");

        // Create a ByteArrayOutputStream to capture the audio stream.
        _output = new ByteArrayOutputStream();
        _rcontrol.setRecordStream(_output);
        _rcontrol.startRecord();
        _player.start();
    } catch (final Exception e) {
        UiApplication.getUiApplication().invokeAndWait(new Runnable() {
            public void run() {
                Dialog.inform(e.toString());
            }
        });
    }
}
And in the thread's stop() method, recording is stopped:
public void stop() {
    try {
        // Stop recording, capture data from the OutputStream,
        // close the OutputStream and the player.
        _rcontrol.commit();
        _data = _output.toByteArray();
        _output.close();
        _player.close();
    } catch (Exception e) {
        synchronized (UiApplication.getEventLock()) {
            Dialog.inform(e.toString());
        }
    }
}
Processing and sampling audio stream
At the end of recording you will have an output stream filled with data in a specific audio format. So to process or sample it, you will have to decode this audio stream.
As for on-the-fly processing, that will be more complex. You will have to read the output stream during recording, without committing the record. So there are several problems to solve (see the sketch after this list):
synchronized access to the output stream for the Recorder and the Sampler - a threading issue
reading the correct amount of audio data - you'll have to dig into the audio format's framing to work out the markup rules
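One way to handle the threading part is to record into a stream that the Sampler can poll safely; a rough sketch (illustrative only, the names are not from the RIM samples):
// Illustrative only: a record stream the Sampler thread can read from safely.
final class SampledOutputStream extends ByteArrayOutputStream {
    private int _readMark = 0;

    // RecordControl writes here; keep it synchronized with the Sampler.
    public synchronized void write(byte[] b, int off, int len) {
        super.write(b, off, len);
    }

    // Sampler thread: returns only the bytes appended since the last call.
    public synchronized byte[] takeNewBytes() {
        byte[] all = toByteArray();
        byte[] fresh = new byte[all.length - _readMark];
        System.arraycopy(all, _readMark, fresh, 0, fresh.length);
        _readMark = all.length;
        return fresh;
    }
}
The recorder would pass an instance of this to _rcontrol.setRecordStream(...) instead of the plain ByteArrayOutputStream, and the Sampler would call takeNewBytes() periodically; the bytes are still in the recorded container format and must be decoded as described above.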
Also possibly useful:
java.net: Experiments in Streaming Content in Java ME by Vikram Goyal
While not audio specific, this question does have some good "getting started" references.
Writing Blackberry Applications
I spent ages trying to figure this out too. Once you've installed the BlackBerry Component Packs (available from their website), you can find the sample code inside the component pack.
In my case, once I had installed the Component Packs into Eclipse, I found the extracted sample code in this location:
C:\Program Files\Eclipse\eclipse3.4\plugins\net.rim.eide.componentpack4.5.0_4.5.0.16\components\samples
Unfortunately, when I imported all that sample code I had a bunch of compile errors. To work around that, I just deleted the 20% of packages with compile errors.
My next problem was that launching the Simulator always launched the first sample code package (in my case activetextfieldsdemo); I couldn't get it to run just the package I was interested in. The workaround for that was to delete all the packages listed alphabetically before the one I wanted.
Other gotchas:
-Right click on the project in Eclipse and select Activate for BlackBerry
-Choose BlackBerry -> Build Configurations... -> Edit... and select your new project so it builds.
-Make sure you put your BlackBerry source code under a "src" folder in the Eclipse project, otherwise you might hit build issues.
