Microsoft Cognitive Services - Speaker Recognition API - Verification - SpeakerInvalid - azure-web-app-service

I am trying to consume the Microsoft Cognitive Services Speaker Recognition API. I have attached enrollment and verification audio samples, and I get the response below:
{
  "error": {
    "code": "BadRequest",
    "message": "SpeakerInvalid"
  }
}
Enrollment audio: https://www.dropbox.com/s/i5qjjxvi16wdvs6/Enrollment.rar?dl=0
Verification audio: https://www.dropbox.com/s/w5q8prn2o0sqd2f/Verification.rar?dl=0

That's a generic "I couldn't handle that audio" error. Your links are dead, so I can't check them. Make sure your audio meets the requirements:
Container: WAV
Encoding: PCM
Sample rate: 16 kHz
Sample format: 16-bit
Channels: mono
There's a demo web page here that encodes the audio in a valid format: https://rposbo.github.io/speaker-recognition-api/
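As a quick local check, the format requirements above can be verified with Python's standard wave module. A minimal sketch (the in-memory file below is just a stand-in for your real enrollment or verification WAV):

```python
import io
import wave

def meets_requirements(wav_file) -> bool:
    """Return True if the WAV is mono, 16-bit PCM at 16 kHz."""
    with wave.open(wav_file, "rb") as w:
        return (w.getnchannels() == 1           # mono
                and w.getsampwidth() == 2       # 16-bit samples
                and w.getframerate() == 16000)  # 16 kHz

# Self-check: build a compliant one-second silent WAV in memory and verify it.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)
buf.seek(0)
print(meets_requirements(buf))  # True
```

For a file on disk, pass the path instead, e.g. meets_requirements("Enrollment.wav").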

One of the issues I have had is that the enrollment was not completed, so first check the status of the enrollment profile; it should be Enrolled:
Profile p = speakerIdClient.getProfile(uuid);


Bot sending an mp3 (or wav) to a user in MS Teams

I am building an MS Teams bot. According to this sample, the Bot Builder framework supports audio, but when I actually try it in MS Teams it doesn't work; it shows the following:
[image: audio card]
Here is my code:
from botbuilder.core import CardFactory, MessageFactory, TurnContext
from botbuilder.schema import AudioCard, MediaUrl

async def send_audio(self, turn_context: TurnContext, audio: str):
    logger.debug(f"Sending {audio}")
    card = AudioCard(
        title="",
        media=[MediaUrl(url=audio)],
    )
    # Wrap the AudioCard in an attachment and send it
    await turn_context.send_activity(
        MessageFactory.attachment(CardFactory.audio_card(card))
    )
inspired by:
https://github.com/microsoft/BotBuilder-Samples/blob/main/samples/python/06.using-cards/dialogs/main_dialog.py
How can I send audio from my bot to a user? A working sample would be best, but please don't point me to MS Documentation (It is total crap!)
The following cards are implemented by the Bot Framework, but are not supported by Teams:
Animation cards
Audio cards
Video cards
Ref: https://learn.microsoft.com/en-us/microsoftteams/platform/task-modules-and-cards/cards/cards-reference#cards-not-supported-in-teams

How to use RTSP live stream's link from tuya API?

I got a link from the Tuya API Explorer using the "IoT Video Live Stream" service. I want to know where I can use this link to stream my camera's video. I can see the video in my Tuya app, but I want to use this link.
Here's an example of the API return.
{
  "result": {
    "url": "rtsps://eb3ba0.....aa418b.....:4UBYKMX9T.....T0#aws-tractor1.tuyaus.com:443/v1/proxy/echo_show/b6721582601.....b46ac1b71.....bbb3d*****bf7....."
  },
  "success": true,
  "t": 1642462403630
}

Google Nest Hub can't play HLS

My question is:
I push an HLS stream to the GNH (Google Nest Hub) in the action.devices.commands.GetCameraStream response format. The GNH does nothing but show a loading UI for a few seconds.
Is something wrong with my HLS file?
How can I get logs from the GNH to help me debug?
What I know so far:
I tried pushing an mp4 (1080p, under 60 fps) URL to the GNH, and that works well.
I tried converting the mp4 to HLS with several libraries, including ffmpeg and Bento4.
Here is the JSON I send to the GNH:
{
  "payload": {
    "commands": [{
      "status": "SUCCESS",
      "states": {
        "cameraStreamAccessUrl": "http://path/of/stream.m3u8"
      },
      "ids": ["....."]
    }]
  },
  "requestId": "My_Request_Id"
}
It seems that you are missing the required property cameraStreamSupportedProtocols. Try adding the protocol and see if you are able to get the stream to work. This will load the default cast camera receiver since you are trying to play HLS content. If you are still seeing an issue with playback, it could be that your stream is malformed and needs to be revised.
Playback logs are only available to you if you create your own basic receiver app and specify it in your response using the cameraStreamReceiverAppId property. For more about creating a Cast receiver app, see the overview page (https://developers.google.com/cast/docs/web_receiver) and the basic receiver guide (https://developers.google.com/cast/docs/web_receiver/basic). There is also a default camera receiver sample in the Cast samples GitHub (https://github.com/googlecast/CastCameraReceiver).
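To make that concrete, here is a sketch of an EXECUTE response carrying the receiver-app property mentioned above (field names follow the smart home CameraStream trait; the IDs and URLs are placeholders taken from the question, and YOUR_RECEIVER_APP_ID is a hypothetical value you would replace with your own):

```json
{
  "requestId": "My_Request_Id",
  "payload": {
    "commands": [{
      "ids": ["....."],
      "status": "SUCCESS",
      "states": {
        "cameraStreamAccessUrl": "http://path/of/stream.m3u8",
        "cameraStreamProtocol": "hls",
        "cameraStreamReceiverAppId": "YOUR_RECEIVER_APP_ID"
      }
    }]
  }
}
```

Note that cameraStreamSupportedProtocols itself is declared in the device's SYNC response attributes (e.g. "cameraStreamSupportedProtocols": ["hls"]), not in the EXECUTE states.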

Voice recognition failed miserably: Wrong status code 401 in Bing Speech API / token

When trying to translate a sample audio file from English to another language using the Azure Bing Speech to Text API, I get: Error: Voice recognition failed miserably: Wrong status code 401 in Bing Speech API / token
I have tried increasing open_timeout to a higher value like 50000 (which was suggested for slow internet connections), hard-coded in bingspeech-api-client at line 110, but the error persists.
const fs = require('fs');
const { BingSpeechClient } = require('bingspeech-api-client');

let audioStream = fs.createReadStream('hello.wav');
// Bing Speech Key (https://www.microsoft.com/cognitive-services/en-us/subscriptions)
let subscriptionKey = '******';
let client = new BingSpeechClient(subscriptionKey);
client.recognizeStream(audioStream).then(function (response) {
    console.log("response is ", response);
    console.log("-------------------------------------------------");
    console.log("response is ", response.results[0]);
}).catch(function (error) {
    console.log("error occurred is ", error);
});
This code should generate the text from that sample audio file.
Code 401 means unauthorized - wrong key in your case. I suspect you followed an outdated version of some tutorial since by now the service is not called Bing Speech API anymore. See here for a current tutorial using the microsoft-cognitiveservices-speech-sdk SDK for node.js.
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/quickstart-js-node#use-the-speech-sdk

Google Speech to text: The video model is currently not supported for language : nl-NL

Url used: https://cloud.google.com/speech-to-text/
I uploaded a WAV audio file (exported as mp3/wav/flac) via Audacity.
I selected "Nederlands" (Dutch), with punctuation either on or off, and uploaded the export.
First it uploads, gives me the 'transcribing' message, and after that:
The video model is currently not supported for language : nl-NL
I see in the console of my browser window:
Failed to load resource: the server responded with a status of 400 ()
speech.min.js:1132 {
"error": {
"code": 400,
"message": "Invalid recognition 'config': The video model is currently not supported for language : nl-NL.",
"status": "INVALID_ARGUMENT"
}
}
cxl-services.appspot.com/proxy?url=https%3A%2F%2Fspeech.googleapis.com%2Fv1p1beta1%2Fspeech%3Arecognize Failed to load resource: the server responded with a status of 500 ()
speech.min.js:1132 A server error occurred!
If I use the microphone to record a message it works properly.
What am I doing wrong?
The speech-to-text API provides four different models to choose from:
video
phone_call
command_and_search
default
(https://cloud.google.com/speech-to-text/docs/basics#select-model)
Not all models are available for all languages. Try using the default model for fr-FR or nl-NL.
I had the same issue with de-DE. It wouldn't work with the video model, but using the default model worked.
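To make the fix concrete, here is a minimal sketch of a recognize request body that pins the model to default for Dutch (field names follow the Speech-to-Text REST API; the gs:// URI and bucket name are placeholders):

```python
def build_recognize_request(language_code: str, audio_uri: str) -> dict:
    """Build a Speech-to-Text recognize request body using the
    broadly available 'default' model instead of 'video'."""
    return {
        "config": {
            "languageCode": language_code,
            "model": "default",  # 'video' is not available for nl-NL
            "enableAutomaticPunctuation": True,
        },
        "audio": {"uri": audio_uri},  # placeholder Cloud Storage path
    }

request = build_recognize_request("nl-NL", "gs://my-bucket/sample.flac")
print(request["config"]["model"])  # default
```

POSTing this body to the speech:recognize endpoint should avoid the INVALID_ARGUMENT error, since the model restriction only applies to the video model.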
