I am following the samples for Microsoft Cognitive Services Speech SDK, namely the Speech Translation.
The sample for .NET Core uses the microphone as audio input and translates what you speak. Translated results are also available as synthesized speech. I would like to play this audio but could not find the appropriate code for that.
I tried using NAudio as suggested in this answer, but I get garbled audio. I guess there is more to the format of the audio.
Any pointers?
On .NET Core, many audio packages might not work. For example, with NAudio I can't play sound on my Mac.
I got it working using NetCoreAudio package (Nuget), with the following implementation in the translation Synthesizing event:
// Requires: using System.IO; using NetCoreAudio;
recognizer.Synthesizing += (s, e) =>
{
    var audio = e.Result.GetAudio();
    Console.WriteLine(audio.Length != 0
        ? $"AudioSize: {audio.Length}"
        : $"AudioSize: {audio.Length} (end of synthesis data)");

    if (audio.Length > 0)
    {
        // Save the synthesized chunk to a timestamped .wav file, then play it.
        var fileName = Path.Combine(Directory.GetCurrentDirectory(), $"{DateTime.Now:yyyy-MM-dd_HH-mm-ss}.wav");
        File.WriteAllBytes(fileName, audio);
        var player = new Player();
        player.Play(fileName).Wait();
    }
};
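One way to check whether "there is more to the format" is to look at the header of the bytes before handing them to a player. The sketch below is in Java purely for illustration (the thread itself is C#). It assumes the audio starts with a canonical 44-byte RIFF/WAVE header with the fmt chunk first; that the synthesized audio has this layout is only suggested by the answer above writing the bytes straight to a .wav file, it is not confirmed here. WavInfo and parse are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reads the basic format fields of a canonical WAV header.
final class WavInfo {
    final int channels;
    final int sampleRate;
    final int bitsPerSample;

    private WavInfo(int channels, int sampleRate, int bitsPerSample) {
        this.channels = channels;
        this.sampleRate = sampleRate;
        this.bitsPerSample = bitsPerSample;
    }

    // Assumes "RIFF" at 0, "WAVE" at 8 and the "fmt " chunk at offset 12.
    static WavInfo parse(byte[] audio) {
        if (audio.length < 44) {
            throw new IllegalArgumentException("too short for a WAV header");
        }
        String riff = new String(audio, 0, 4, StandardCharsets.US_ASCII);
        String wave = new String(audio, 8, 4, StandardCharsets.US_ASCII);
        String fmt = new String(audio, 12, 4, StandardCharsets.US_ASCII);
        if (!riff.equals("RIFF") || !wave.equals("WAVE") || !fmt.equals("fmt ")) {
            throw new IllegalArgumentException("not a canonical RIFF/WAVE header");
        }
        // All multi-byte header fields are little-endian.
        ByteBuffer buf = ByteBuffer.wrap(audio).order(ByteOrder.LITTLE_ENDIAN);
        int channels = buf.getShort(22);
        int sampleRate = buf.getInt(24);
        int bitsPerSample = buf.getShort(34);
        return new WavInfo(channels, sampleRate, bitsPerSample);
    }

    // Usage: java WavInfo some-file.wav
    public static void main(String[] args) throws Exception {
        WavInfo info = parse(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(info.sampleRate + " Hz, " + info.channels
                + " channel(s), " + info.bitsPerSample + " bits");
    }
}
```

If the values printed for a saved chunk differ from what the playback library was configured with (sample rate, channel count, bit depth), that mismatch is a likely cause of garbled output.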
Related
I have a C# project where the stream from an IP camera is recorded to a file using libvlc.
This is the part of the code with the vlc parameters:
string VlcArguments = @":sout=#transcode{acodec=mpga,deinterlace}:standard{access=file,mux=mp4,dst=""C:\Users\I\Desktop\Output.mp4""}";
var media = factory.CreateMedia<IMedia>("rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov", VlcArguments);
var player = factory.CreatePlayer<IPlayer>();
player.Open(media);
filename is the path of the result file.
It works fine, but I also need to record sound from a microphone, Microphone (High Definition Audio Device).
What do I need to change to achieve that?
UPD
It should look something like this
var media = factory.CreateMedia<IMedia>("dshow:// dshow-vdev=rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov dshow-adev=Microphone (High Definition Audio Device)", VlcArguments);
But it doesn't work.
UPD2
So, I think I found the answer
https://forum.videolan.org/viewtopic.php?f=14&t=124229&p=425550&hilit=camera+microphone+dshow#p425550
Unfortunately this will not work
I receive a PCM audio data stream over the network. This part works fine, so I end up with
DataReader incoming = args.GetDataReader();
byte[] RcvBuffer = new byte[incoming.UnconsumedBufferLength];
incoming.ReadBytes(RcvBuffer);
Now I have all the audio data in the buffer.
How can I play this through the telephone speaker? Can you point me in some direction?
Thanks
There're many ways to do that.
You can prepend the WAVE header to your data, and use MediaElement for playback, see the documentation for SetSource method.
If however by “telephone speaker” you mean the earphone, then it is only possible if you are creating a VoIP app.
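The "prepend the WAVE header" suggestion above can be sketched as follows. This is a minimal illustration in Java (the byte layout is the same in any language), assuming the incoming data is raw, uncompressed PCM and that you know its sample rate, channel count and bit depth. WavWriter and wrapPcm are hypothetical names, and the 8 kHz, mono, 16-bit values in main are only example parameters.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: wraps raw PCM bytes in a canonical 44-byte WAV header.
final class WavWriter {
    static byte[] wrapPcm(byte[] pcm, int sampleRate, int channels, int bitsPerSample) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second

        // All multi-byte header fields are little-endian.
        ByteBuffer buf = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + pcm.length);                   // size of everything after this field
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                // fmt chunk size for plain PCM
        buf.putShort((short) 1);                       // audio format 1 = uncompressed PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(pcm.length);                        // size of the sample data
        buf.put(pcm);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] pcm = new byte[16000]; // one second of silence at 8 kHz, mono, 16-bit
        byte[] wav = wrapPcm(pcm, 8000, 1, 16);
        System.out.println("WAV size: " + wav.length + " bytes");
    }
}
```

The resulting byte array can then be exposed as a stream and handed to whatever player accepts WAV input.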
It took a while but I sorted it, maybe someone else will need help in the future.
First problem: since I had just started app development for Windows Phone, I chose Blank App (Windows Phone) instead of Blank App (Windows Phone Silverlight), so I did not have access to many features that are available in Silverlight projects. My suggestion for beginners: understand what each project type is for.
Like Soonts said, there are many ways to do this; this is the one I used.
I simplified and retyped this code, so there may be some typos.
using Microsoft.Xna.Framework.Audio;
using System.IO;
1) Create Stream to load your incoming data:
MemoryStream stream = new MemoryStream();
2) Load data from buffer to stream:
stream.Write(RcvBuffer, 0, RcvBuffer.Length);
3) I am using SoundEffect to play this through the loudspeaker. The sample rate that I use is 8 kHz:
SoundEffect sound;
sound = new SoundEffect(stream.ToArray(), 8000, AudioChannels.Mono);
sound.Play();
Background: I'm coding a Metro-style app for Win8. I need to be able to play music files. Because of quality and space requirements we're using encoded audio (mp3/ogg).
I'm using XAudio2 to play sound effects (.wav files), but since I couldn't figure out a way to play encoded audio with it, I decided to play the music files with Media Foundation (IMFMediaPlayer interface).
I downloaded metro apps sample, and found out that the Media Engine Native C++ video playback sample was closest to what I needed.
Now that my app has MediaPlayer playing music, I ran into a problem. If the device running the app is slow enough, MediaPlayer hangs. When I'm running the release version of the app on my device, it's fine and I can hear the music. But when I attach the debugger or run it on a slower device, it hangs when I'm setting the bytestream for the MediaPlayer to play.
Here's some code; you'll find it pretty similar to the sample:
StorageFolder^ installedLocation = Windows::ApplicationModel::Package::Current->InstalledLocation;
m_pickFileTask = Concurrency::task<StorageFile^>(installedLocation->GetFileAsync(filename), m_tcs.get_token());
auto player = this;
m_pickFileTask.then([player](StorageFile^ fileHandle)
{
    player->SetURL(fileHandle->Path);
    Concurrency::task<IRandomAccessStream^> fOpenStreamTask = Concurrency::task<IRandomAccessStream^>(fileHandle->OpenAsync(Windows::Storage::FileAccessMode::Read));
    fOpenStreamTask.then([player](IRandomAccessStream^ streamHandle)
    {
        MEDIA::ThrowIfFailed(
            player->m_spMediaEngine->Pause()
        );
        MEDIA::GetMediaError(player->m_spMediaEngine);
        player->SetBytestream(streamHandle);
        if (player->m_spMediaEngine)
        {
            MEDIA::ThrowIfFailed(
                player->m_spEngineEx->Play()
            );
            MEDIA::GetMediaError(player->m_spMediaEngine);
        }
    }
    );
}
);
And here's the SetBytestream method:
SetBytestream(IRandomAccessStream^ streamHandle)
{
    if (m_spMFByteStream != nullptr)
    {
        m_spMFByteStream->Close();
        m_spMFByteStream = nullptr;
    }
    MEDIA::ThrowIfFailed(
        MFCreateMFByteStreamOnStreamEx((IUnknown*)streamHandle, &m_spMFByteStream)
    );
    MEDIA::ThrowIfFailed(
        m_spEngineEx->SetSourceFromByteStream(m_spMFByteStream.Get(), m_bstrURL)
    );
    MEDIA::GetMediaError(m_spEngineEx);
    return;
}
The line where it hangs is:
m_spEngineEx->SetSourceFromByteStream(m_spMFByteStream.Get(), m_bstrURL)
When I'm debugging the app, I can press pause and see the stack. Well, not much of it, but at least I can see that it's sitting indefinitely at
ntdll.dll!77b7f4dc()
Any ideas why my app would hang in such a way?
(OPTIONAL: If you know a better way to play mp3/ogg in a c++ metro-styled app, let me know)
I could not figure out why this is happening, but I managed to code a workaround:
IMFSourceReader can be used to decode MP3s and feed the bytes into an XAudio2 source voice (IXAudio2SourceVoice).
The XAudio2 audio stream effect sample contains a good example of how to do this.
Is Adobe Media Encoder (AME) Scriptable? I've heard people mention it was "officially scriptable" but I can't find any reference to its scriptable object set.
Has anyone had any experience scripting AME?
Adobe Media Encoder is 'officially' not scriptable, but we can use the ExtendScript API to script AME.
The functions below are available through ExtendScript:
1. Adding a file to batch
Encode progress
eHost = app.getEncoderHost();
enc = eHost.createEncoderForFormat("QuickTime");
flag = enc.loadPreset("HD 1080i 29.97, H.264, AAC 48 kHz");
if (flag) {
    f = enc.onEncodeProgress = function (progress) {
        $.writeln(progress);
    }
    enc.encode("/Users/test/Desktop/00000.MTS", "/Users/test/Desktop/0.mov");
} else {
    alert("The preset could not be loaded");
}
Encode end
eHost = app.getEncoderHost();
enc = eHost.createEncoderForFormat("QuickTime");
flag = enc.loadPreset("HD 1080i 29.97, H.264, AAC 48 kHz");
if (flag) {
    f = enc.onEncodeFinished = function (success) {
        if (success) {
            alert("Encoding finished successfully");
        } else {
            alert("Failed to encode");
        }
    }
    eHost.runBatch();
} else {
    alert("The preset could not be loaded");
}
2. Start batch
eHost = app.getEncoderHost();
eHost.runBatch();
3. Stop batch
eHost = app.getEncoderHost();
eHost.stopBatch();
4. Pause batch
eHost = app.getEncoderHost();
eHost.pauseBatch();
5. Getting preset formats
eHost = app.getEncoderHost();
list = eHost.getFormatList();
6. Getting presets
eHost = app.getEncoderHost();
enc = eHost.createEncoderForFormat("QuickTime");
list = enc.getPresetList();
and many more...
The closest bits of info I've found are:
http://www.openspc2.org/book/MediaEncoderCC/
The latter resource is actually good. If you can read Japanese, or at least use Chrome's built-in translate function, you can see it has resources such as this one:
http://www.openspc2.org/book/MediaEncoderCC/easy/encodeHost/009/index.html
We can perform almost all basic functionalities through script.
I had a similar question about Soundbooth. I haven't tried scripting Adobe Media Encoder though; it doesn't show up in the list of applications I could potentially connect to and script with the ExtendScript Toolkit.
I did find this article that might come in handy if you're on Windows. I guess something similar written in AppleScript could do the job on OS X. I haven't tried it, but this Sikuli thing looks nice; maybe it could help with the job.
Adobe Media Encoder doesn't seem to be scriptable. I was wondering, for batch converting, could you use ffmpeg? There seem to be a few scripts out there for that, if you google for ffmpeg batch flv.
HTH,
George
Year 2021
Yes, AME is scriptable in ExtendScript. AME API doc can be found at
https://ame-scripting.docsforadobe.dev/index.html.
The API methods can be invoked locally inside AME or remotely through BridgeTalk.
addCompToBatch and other alternatives in the API doc seem to be safe to use. This is working:
app.getFrontend().addCompToBatch(project, preset, destination);
The method requires the project to be structured so that one and only one comp is at the root of the project.
encoder.encode, references to which can be found on the web and which is supposed to support encode progress callbacks, is not available in AME 2020 and 2021. As a result, this is not working:
var encoder = app.getEncoderHost().createEncoderForFormat(encoderFormat);
var res = encoder.loadPreset(encoderPreset);
if(res){
encoder.encode(project, destination); // error: encode is not a function
}
The method seems to have been removed in AME 2017.1, according to the post reporting the issue https://community.adobe.com/t5/adobe-media-encoder-discussions/media-encoder-automation-system-with-using-extendscript/td-p/9344018
The official stance at the moment is "no", but if you open the Adobe ExtendScript Toolkit and set the target app to Media Encoder, you will see in the Data Browser that a few objects and methods are already exposed in the app object, like app.getFrontend(), app.getEncoderHost() etc. There is no official documentation though, and no support, so you are free to experiment with them at your own risk.
You can use the ExtendScript reflection interface like this:
a = app.getFrontend()
a.reflect.properties
a.reflect.methods
a.reflect.find("addItemToBatch").description
But as far as I can see, no meaningful information can be found this way beyond a list of methods and properties.
More about the ExtendScript reflect interface can be found in the JavaScript Tools Guide CC document.
Doesn't seem to be. There are some references to it being somewhat scriptable via FCP XML, yet it's not "scriptable" in the accepted sense.
Edit, it looks like they finally got their finger out and made ME scriptable: https://stackoverflow.com/a/69203537/432987
I got here after it came second in the duckduckgo results for "extendscript adobe media encoder". First was a post on the Adobe forums where an adobe staffer wrote:
Scripting in Adobe Media Encoder is not a supported feature.
and, just to give the finger to anyone seeking to develop solutions for adobe users using adobe's platform:
Also, this is a user-to-user forum, not an official channel for support from Adobe personnel.
I think the answer is "Adobe says no"
Does anyone know of a good repository to get sample code for the BlackBerry? Specifically, samples that will help me learn the mechanics of recording audio, possibly even sampling it and doing some on the fly signal processing on it?
I'd like to read incoming audio, sample by sample if need be, then process it to produce a desired result, in this case a visualizer.
The RIM API includes the JSR 135 Java Mobile Media API for handling audio and video content.
You are correct about the mess in the BB Knowledge Base. The only way is to browse it, hoping they are not going to change the site map again.
It's Developers->Resources->Knowledge Base->Java API's&Samples->Audio&Video
Audio Recording
Basically it's simple to record audio:
create Player with correct audio encoding
get RecordControl
start recording
stop recording
Links:
RIM 4.6.0 API ref: Package javax.microedition.media
How To - Record Audio on a BlackBerry smartphone
How To - Play audio in an application
How To - Support streaming audio to the media application
How To - Specify Audio Path Routing
How To - Obtain the media playback time from a media application
What Is - Supported audio formats
What Is - Media application error codes
Audio Record Sample
Thread with Player, RecordControl and resources is declared:
final class VoiceNotesRecorderThread extends Thread {
    private Player _player;
    private RecordControl _rcontrol;
    private ByteArrayOutputStream _output;
    private byte _data[];

    VoiceNotesRecorderThread() {}

    private int getSize() {
        return (_output != null ? _output.size() : 0);
    }

    private byte[] getVoiceNote() {
        return _data;
    }
}
On Thread.run() audio recording is started:
public void run() {
    try {
        // Create a Player that captures live audio.
        _player = Manager.createPlayer("capture://audio");
        _player.realize();
        // Get the RecordControl and set the record stream.
        _rcontrol = (RecordControl) _player.getControl("RecordControl");
        // Create a ByteArrayOutputStream to capture the audio stream.
        _output = new ByteArrayOutputStream();
        _rcontrol.setRecordStream(_output);
        _rcontrol.startRecord();
        _player.start();
    } catch (final Exception e) {
        UiApplication.getUiApplication().invokeAndWait(new Runnable() {
            public void run() {
                Dialog.inform(e.toString());
            }
        });
    }
}
And on thread.stop() recording is stopped:
public void stop() {
    try {
        // Stop recording, capture data from the OutputStream,
        // then close the OutputStream and the player.
        _rcontrol.commit();
        _data = _output.toByteArray();
        _output.close();
        _player.close();
    } catch (Exception e) {
        synchronized (UiApplication.getEventLock()) {
            Dialog.inform(e.toString());
        }
    }
}
Processing and sampling audio stream
At the end of recording you will have an output stream filled with data in a specific audio format. So to process or sample it you will have to decode this audio stream.
On-the-fly processing is more complex. You will have to read the output stream during recording, without committing the recording. So there will be several problems to solve:
synchronized access to the output stream for the Recorder and the Sampler (a threading issue)
reading the correct amount of audio data (study how the audio format is decoded to find out its framing rules)
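For the decoding step, here is a minimal sketch of turning a chunk of recorded bytes into sample values and a peak level that a visualizer could draw. It assumes the chunk is raw 16-bit signed little-endian PCM with no container header, which depends on the encoding the Player was created with and is not guaranteed by the code above. PcmLevels, decode16le and peak are hypothetical names.

```java
// Hypothetical helper: decodes raw 16-bit little-endian PCM and measures its peak.
final class PcmLevels {
    // Converts length bytes starting at offset into signed 16-bit samples.
    // A trailing odd byte, if any, is ignored.
    static short[] decode16le(byte[] data, int offset, int length) {
        int count = length / 2;
        short[] samples = new short[count];
        for (int i = 0; i < count; i++) {
            int lo = data[offset + 2 * i] & 0xFF; // low byte, treated as unsigned
            int hi = data[offset + 2 * i + 1];    // high byte, keeps its sign
            samples[i] = (short) ((hi << 8) | lo);
        }
        return samples;
    }

    // Returns the largest absolute sample value (0..32768).
    static int peak(short[] samples) {
        int max = 0;
        for (int i = 0; i < samples.length; i++) {
            int magnitude = Math.abs((int) samples[i]);
            if (magnitude > max) {
                max = magnitude;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        byte[] chunk = {0x00, 0x01, (byte) 0xFF, (byte) 0xFF, 0x00, (byte) 0x80};
        short[] samples = decode16le(chunk, 0, chunk.length);
        System.out.println("peak = " + peak(samples));
    }
}
```

The sampler thread would call something like this on each newly available slice of the output stream, keeping track of how many bytes it has already consumed so that every call starts on a sample boundary.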
Also may be useful:
java.net: Experiments in Streaming Content in Java ME by Vikram Goyal
While not audio specific, this question does have some good "getting started" references.
Writing Blackberry Applications
I spent ages trying to figure this out too. Once you've installed the BlackBerry Component Packs (available from their website), you can find the sample code inside the component pack.
In my case, once I had installed the Component Packs into Eclipse, I found the extracted sample code in this location:
C:\Program Files\Eclipse\eclipse3.4\plugins\net.rim.eide.componentpack4.5.0_4.5.0.16\components\samples
Unfortunately, when I imported all that sample code I had a bunch of compile errors. To work around that I just deleted the 20% of packages with compile errors.
My next problem was that launching the Simulator always launched the first sample code package (in my case activetextfieldsdemo); I couldn't get it to run just the package I was interested in. The workaround for that was to delete all the packages listed alphabetically before the one I wanted.
Other gotchas:
-Right click on the project in Eclipse and select Activate for BlackBerry
-Choose BlackBerry -> Build Configurations... -> Edit... and select your new project so it builds.
-Make sure you put your BlackBerry source code under a "src" folder in the Eclipse project, otherwise you might hit build issues.