How can a bot receive a voice file from Facebook Messenger (MP4) and convert it to a format that is recognized by speech engines like Bing or Google?

I'm trying to make a bot for Facebook Messenger using Microsoft's Bot Framework that will do this:
Get a user's voice message sent via Facebook Messenger
Convert speech to text
Do something with it
There's no problem with getting the voice message from Messenger (the URL can be extracted from the message the bot receives), and there's also no problem with converting an audio file to text (using the Bing Speech API or Google's similar API).
However, these APIs require PCM (WAV) files, while Facebook Messenger gives you an MP4 file.
Is there a popular or standard way of converting one format into the other when writing bots?
So far my best idea is to run vlc.exe as a console job on my server and convert the file, but that doesn't sound like the best solution.

I developed a solution that works as follows:
Receive the voice message from Facebook
Download the MP4 file to local disk using the link inside Activity.Attachments
Use MediaToolkit (a wrapper for FFmpeg) to convert MP4/AAC to WAV on the local server
Send the WAV to the Bing Speech API
So the answer to my question is: use MediaToolkit + FFmpeg to convert the file format.
Sample implementation and code here: https://github.com/J3QQ4/Facebook-Messenger-Voice-Message-Converter
// Requires: using MediaToolkit; using MediaToolkit.Model;
public string ConvertMP4ToWAV()
{
    var inputFile = new MediaFile { Filename = SourceFileNameAndPath };
    var outputFile = new MediaFile { Filename = ConvertedFileNameAndPath };

    // The engine wraps the ffmpeg binary; the output format follows the target file extension
    using (var engine = new Engine(GetFFMPEGBinaryPath()))
    {
        engine.Convert(inputFile, outputFile);
    }

    return ConvertedFileNameAndPath;
}
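For bots that are not written in .NET, the same conversion can be done by calling ffmpeg directly. The following is a minimal Python sketch, assuming an ffmpeg binary is available on the PATH. The function names, file names, and the 16 kHz mono 16-bit settings are illustrative assumptions and are not taken from the linked repository, so check which sample rate your speech API expects.

```python
import subprocess


def build_ffmpeg_command(source_path, target_path, sample_rate=16000):
    """Build an ffmpeg command line that extracts the audio track as PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", source_path,        # input MP4/AAC file downloaded from Messenger
        "-vn",                    # ignore any video stream
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed value)
        "-ac", "1",               # mono
        target_path,
    ]


def convert_mp4_to_wav(source_path, target_path, sample_rate=16000):
    """Run ffmpeg and return the path of the converted file."""
    command = build_ffmpeg_command(source_path, target_path, sample_rate)
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(command, check=True, capture_output=True)
    return target_path
```

Calling convert_mp4_to_wav("voice.mp4", "voice.wav") would then produce the WAV file to send to the speech API.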

Related

Say "hello" when Twilio call connected

I have a WebSocket in Python Flask that listens to a Twilio call. When the call starts I want to say "hello". Here is the code:
if data['event'] == "start":
    speakBytes = speaker.speak("Hello")  # using Microsoft Cognitive Services to convert the text to bytes
    convertedBytes = ap.lin2ulaw(speakBytes.audio_data, 1)
    ws.send(responseString.format(base64.b64encode(convertedBytes), str(data['streamSid'])))
But the above is not working. I checked, and the Microsoft Cognitive Services speech synthesizer returns the bytes in WAV format, so I have used lin2ulaw from the Python audioop module.
Need help. Thanks in advance.
Twilio developer evangelist here.
It looks like you are correctly creating the audio to send to the Twilio Media Stream, however I don't think you are sending the correct format.
Twilio Media Streams expect a media message to be a JSON object with the following properties:
event: the value "media"
streamSid: the SID of the stream
media: an object with a "payload" property that then contains the base64 encoded mulaw/8000 audio
Something like this might work:
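import base64
import json

message = {
    "streamSid": data['streamSid'],
    "event": "media",
    "media": {
        # b64encode returns bytes; decode to str so json.dumps can serialize it
        "payload": base64.b64encode(convertedBytes).decode("ascii")
    }
}
# Serialize to JSON and send it over the websocket
json_object = json.dumps(message)
ws.send(json_object)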
If you're using Twilio to connect the number then you'll need to reply with TwiML to the call:
from twilio.twiml.voice_response import VoiceResponse
response = VoiceResponse()
response.say('Hello')
return str(response)
See the docs for <Say>.
If you want to use the .wav you created, you would need to save it somewhere accessible (e.g. an Amazon S3 bucket) and then use TwiML <Play>.
Thanks for the answers everyone. The solution turned out to be a small change.
I had to change ap.lin2ulaw(speakBytes.audio_data,1) to ap.lin2ulaw(speakBytes.audio_data,4) and it worked fine. It seems to be a compatibility issue between the Microsoft text-to-speech output and the format Twilio expects.
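The second argument of lin2ulaw is the sample width in bytes, so it has to match the PCM data being converted. One way to check what a synthesizer actually returned is to read the WAV header. Below is a minimal sketch using Python's standard wave module. The helper names describe_wav and make_silent_wav are hypothetical, and the sample WAV is generated locally rather than taken from the speech service.

```python
import io
import wave


def describe_wav(wav_bytes):
    """Return (channels, sample_width_bytes, frame_rate, raw_pcm_frames) for WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample, e.g. 2 for 16-bit
        frame_rate = wav.getframerate()    # samples per second
        frames = wav.readframes(wav.getnframes())  # PCM data without the header
    return channels, sample_width, frame_rate, frames


def make_silent_wav(seconds=1, frame_rate=16000, sample_width=2):
    """Create an in-memory mono WAV of silence, only to exercise describe_wav."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(frame_rate)
        wav.writeframes(b"\x00" * (seconds * frame_rate * sample_width))
    return buffer.getvalue()


channels, width, rate, pcm = describe_wav(make_silent_wav())
print(channels, width, rate, len(pcm))
```

If the reported frame rate is not 8000, the audio probably also needs resampling before it matches the mulaw/8000 format that Twilio Media Streams expect.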

Google Action - playing youtube video on google hub

How can I play a specific YouTube video on my Google Hub via Google Actions? I know I can use a Basic Card to display images, text, and even a link (although that link does not show up on the Hub).
I specifically want to be able to trigger or play a YouTube video on my Google Hub.
Actions are not able to start playing video content. Media responses are only for audio.
I have a similar need. After a chat with an Action on Google, I want to play user-requested YouTube videos (or chains of them) on a local "big screen" (a TV or PC).
A workaround could be:
Build an action that selects one or more videos.
The action also acts as a server for the client described below.
The action communicates (via SSE, WebSocket, HTTP...) with a browser page containing a small JavaScript program that dynamically displays the video (the id is sent over the client-server connection).
Here is a rough JS script (I'm not a web developer) that just gives you the idea:
<script>
// Write an autoplaying YouTube embed for the given video id into the page.
// Alternative TV-style path (unused here): "tv#/watch?v="
function loadVideoWithId(id) {
  const url = `https://www.youtube.com/embed/${id}?fs=1&autoplay=1&loop=1`
  // iframe is not a void element, so it needs an explicit closing tag
  const iframe = `<iframe src="${url}" width="1600" height="900" allowfullscreen frameborder="0"></iframe>`
  document.write(iframe)
}
loadVideoWithId('hHW1oY26kxQ')
</script>
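The server side of this workaround mainly has to push a video id to the browser page. If Server-Sent Events are used, each message is a small block of text lines terminated by a blank line. Below is a minimal Python sketch of formatting such a message. The event name play-video and the function names are hypothetical, and wiring this into a web framework is left out.

```python
import json


def youtube_embed_url(video_id):
    """Build the embed URL used by the browser page for a given video id."""
    return f"https://www.youtube.com/embed/{video_id}?fs=1&autoplay=1&loop=1"


def format_sse(data, event=None):
    """Format one Server-Sent Events message (text/event-stream wire format)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload gets its own "data:" field
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line marks the end of the message
    return "\n".join(lines) + "\n\n"


payload = json.dumps({"id": "hHW1oY26kxQ", "url": youtube_embed_url("hHW1oY26kxQ")})
message = format_sse(payload, event="play-video")
print(message, end="")
```

On the page, an EventSource listener for that event could parse the JSON and pass the id to loadVideoWithId.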

Unable to send pdf file from bot to user in ms teams

NodeJS BotBuilder SDK version: 3.15.0
My code:
var pdf = {
    name: '<file_name>.pdf',
    contentType: 'application/pdf',
    contentUrl: '<https url to public pdf file>'
};
var reply = new builder.Message(session).addAttachment(pdf);
session.send(reply);
The same code appears in several online examples. The issue is that I always get this error:
Error: POST to 'https://smba.trafficmanager.net/emea/v3/conversations/a%3A1TwHmhoGuZP2Mf9P0TTnjv8HkcaXzEHryv0sYCvDDUI-qrMitJtHRlAnIcedcDH_v3IfMBXtg_zo5MDVcS0-8hDCQ4sJzpJhrewBPK8uWJXYeShgmd-s7uh5o8kW4ebAP/activities/1543588440246' failed: [400] Bad Request
For image/png this code works fine.
What I want to achieve is this (image taken from the Bot Framework Emulator):
File from the web sent from bot to user
The file is sent from the bot without uploading it to the user's OneDrive.
This also works when I test the feature in the test section of https://dev.botframework.com/bots. It fails only in MS Teams.
The behaviour for sending files can differ per channel. Microsoft Teams doesn't support the direct upload method the way Web Chat and the Emulator do. This is due to compliance reasons, as Bill Bliss stated.
You can post messages with card attachments referencing existing SharePoint files using the Microsoft Graph APIs for OneDrive and SharePoint. Using the Graph APIs requires obtaining access to a user's OneDrive folder (for personal and groupchat files) or the files in a team's channels (for channel files) through the standard OAuth2 authorization flow. This method works in all Teams contexts.
Have a look at "Send and receive files through your bot" for the full documentation and how to implement it.
An alternative would be to use an Adaptive Card that combines an image thumbnail of the document with a button to download the PDF file directly from your publicly accessible URL.
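The Adaptive Card option could look roughly like this. The sketch builds the attachment as a plain Python dictionary so that the JSON shape is visible. The URLs, the title, and the function name are placeholders, and the same structure should be usable as an attachment object in the Node.js SDK.

```python
import json


def build_pdf_download_card(title, thumbnail_url, pdf_url):
    """Return a bot attachment holding an Adaptive Card with a download button."""
    card = {
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "type": "AdaptiveCard",
        "version": "1.0",
        "body": [
            {"type": "TextBlock", "text": title, "weight": "bolder"},
            {"type": "Image", "url": thumbnail_url, "size": "medium"},
        ],
        "actions": [
            # Action.OpenUrl opens the link in the user's browser
            {"type": "Action.OpenUrl", "title": "Download PDF", "url": pdf_url},
        ],
    }
    return {
        "contentType": "application/vnd.microsoft.card.adaptive",
        "content": card,
    }


attachment = build_pdf_download_card(
    "Report",
    "https://example.com/report-thumbnail.png",  # placeholder URL
    "https://example.com/report.pdf",            # placeholder URL
)
print(json.dumps(attachment, indent=2))
```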

MS Bot Framework - Messages with audio attachments lost

I'm writing a bot in Node.js using the MS Bot Framework. To send attachments, I'm actually using the filestream buffer as the contentUrl, e.g.
...
// Buffer.from replaces the deprecated new Buffer() constructor
var base64 = Buffer.from(filedata).toString('base64');
var msg = new builder.Message()
    .setText(session, text)
    .addAttachment({
        contentUrl: util.format('data:%s;base64,%s', contentType, base64),
        contentType: contentType
    });
session.send(msg);
...
where contentType is the proper mimetype for the file in question.
When I test this locally (using the Bot Framework Emulator), this works perfectly for both image and audio files - messages with image attachments display the image, and messages with audio attachments show the audiocard allowing for playback, etc.
However, when I test this through FB Messenger, the images work fine, but the audio messages just never appear in FB. Not even the text of the message comes through; it's like the entire message is lost. The dialogue simply skips over the message containing the audio attachment. I'm not even seeing any errors received server-side.
This is happening with both mp3 and wav test audio files, that are each under 1MB (smaller than many of the image files I've successfully tested).
Is there some trick to sending playable audio files to the FB Messenger channel specifically?
Thanks!
I wasn't (yet) able to get a response from FB support, but after further testing, it looks like there is a filesize limit on audio files FB Messenger will accept.
Specifically, I was able to get a sample file of ~45KB to send and display in Messenger successfully, but a larger file of ~400KB got dropped (i.e. it seemed to send successfully from the server-side perspective, but did not show up in Messenger).
Strangely, some of my much larger image files went through, so it seems like this same limit does not exist for image attachments.
Will do some further testing, but it seems like the ultimate solution will be either to majorly compress my audio files, or to host them somewhere else instead of sending as a filestream.
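The two options in the last paragraph can be combined: inline small files as a data URI and fall back to a hosted URL for larger ones. Below is a rough Python sketch of that decision. The 45 KB threshold is only the size that happened to work in the test above, not a documented Messenger limit, and hosted_url stands for wherever the file would be uploaded.

```python
import base64

# Assumption: largest size observed to work in the test above, not a documented limit
MAX_INLINE_BYTES = 45 * 1024


def to_data_uri(file_bytes, content_type):
    """Encode raw file bytes as a data: URI, as done for contentUrl above."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return f"data:{content_type};base64,{encoded}"


def build_attachment(file_bytes, content_type, hosted_url):
    """Inline small files; point to an externally hosted copy for larger ones."""
    if len(file_bytes) <= MAX_INLINE_BYTES:
        content_url = to_data_uri(file_bytes, content_type)
    else:
        content_url = hosted_url
    return {"contentUrl": content_url, "contentType": content_type}
```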

Chat bot: can I display HTML content using Node.js in Microsoft Bot Framework & Bot Builder?

I'm developing a chatbot on Azure using Node.js. It's a data visualization bot that generates charts in HTML format using the d3 library and displays them to the user.
It seems that Microsoft bot builder doesn't support html format. But I have looked through this link:
https://blog.botframework.com/2017/09/07/html-not-supported-web-chat/
It says that there is a way to enable html content:
"If HTML rendering in Web Chat is a critical feature for your applications, you can clone or fork a copy of the Web Chat source code from GitHub, and enable it (on your own custom Web Chat client)."
I tried to clone the file and changed ‘html : false’ to ‘html : true’. But it's not working.
Can anyone tell me what I can do? Really appreciate it!!!
Depending on what data you are attempting to visualize, you might be able to use a service like Google Image Charts: https://developers.google.com/chart/image/docs/chart_playground
Using this service, with the following code:
// attach the card to the reply message
var msg = new builder.Message(session).addAttachment(createHeroCard(session));
session.send(msg);

function createHeroCard(session) {
    return new builder.HeroCard(session)
        .title('Months with Numbers Bar Chart')
        .subtitle('Using a Chart as Image service...')
        .text('Build and connect intelligent bots that have charts rendered as images.')
        .images([
            builder.CardImage.create(session, 'http://chart.googleapis.com/chart?cht=bvg&chs=250x150&chd=s:Monkeys&chxt=x,y&chxl=0:|Jan|Feb|Mar|Apr|May|Jun|Jul')
        ])
        .buttons([
            builder.CardAction.openUrl(session, 'https://learn.microsoft.com/bot-framework/', 'Get Started')
        ]);
}
Produces a hero card that shows the bar chart as its image.
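The chd=s:Monkeys part of the image URL is the chart data in the service's "simple encoding", where each character stands for a value from 0 to 61. Below is a small Python sketch of building such a URL from numbers. The function names and sample values are made up for illustration.

```python
from urllib.parse import urlencode

# Simple encoding alphabet: A-Z = 0-25, a-z = 26-51, 0-9 = 52-61
SIMPLE_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def simple_encode(values, max_value):
    """Scale values into 0..61 and map each one to a single character."""
    top = len(SIMPLE_ALPHABET) - 1
    chars = []
    for value in values:
        scaled = round(top * value / max_value)
        chars.append(SIMPLE_ALPHABET[max(0, min(scaled, top))])
    return "s:" + "".join(chars)


def bar_chart_url(values, labels, size="250x150"):
    """Build a vertical bar chart URL in the style of the hero card image."""
    params = {
        "cht": "bvg",
        "chs": size,
        "chd": simple_encode(values, max(values)),
        "chxt": "x,y",
        "chxl": "0:|" + "|".join(labels),
    }
    # Keep ':', '|' and ',' readable instead of percent-encoding them
    return "http://chart.googleapis.com/chart?" + urlencode(params, safe=":|,")


print(bar_chart_url([0, 30, 61], ["Jan", "Feb", "Mar"]))
```

The resulting string can be passed to builder.CardImage.create in place of the hard-coded URL.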

Resources