Streaming microphone input to Google Speech API - node.js

I have looked into the Google Cloud Speech API and got microphone streaming working on a Node server.
I was then wondering what the best practice would be for streaming my microphone from a web frontend. Is it sending an audio stream from getUserMedia to the Node server and piping it to the API with the Node API client? Or is it simply saving the voice input to a file that I then transmit to the API?
The intent is to "transcribe" instructions (one or two sentences long) and send the result to another API.

I'm aware this question is over a year old and the OP has probably either found an answer or given up, but I spent long enough trying in vain to google this before I figured it out that I wanted to help anyone following in my footsteps: I wrote up a tutorial for basically this exact situation here.

Related

How to do screen recording and save in some cloud storage like s3 in real time

I am trying to solve a problem where I need to record the screen in real time and keep sending the data to the backend, which will store the video as an S3 object (or in any other cloud store).
I did research it, but everywhere I look people record the video and send it as a single file after the recording is completed. The problem is that the file may be too big to send as a single file, so I want it saved to S3 in real time.
I have also seen WebRTC, which helps with peer-to-peer communication.
Any suggestions on how to implement this in Go or Node.js would be helpful.
Thanks
What you can do is use an SFU: send the screen data to it over WebRTC and have it save the data to a file server-side.
You can use mediasoup for this.
Here is a working example: https://github.com/ethand91/mediasoup3-record-demo
You should check the S3 Multipart upload overview.
No matter how large the video is, you only need to upload each 5 MB of data to S3 as a separate part. Although it doesn't work exactly like a stream, it's almost a stream.
For the Go SDK, please check the S3 Golang SDK.
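The part-by-part idea can be sketched independently of any SDK. This is a minimal sketch, assuming the recording arrives as a sequence of small byte chunks; `iter_parts` and `MIN_PART_SIZE` are hypothetical names, and the actual upload of each part (S3's UploadPart operation) is left as a comment rather than implemented.

```python
from typing import Iterable, Iterator, Tuple

# S3 requires every part except the last one to be at least 5 MiB.
MIN_PART_SIZE = 5 * 1024 * 1024


def iter_parts(
    chunks: Iterable[bytes], part_size: int = MIN_PART_SIZE
) -> Iterator[Tuple[int, bytes]]:
    """Regroup incoming chunks into (part_number, data) pairs.

    Part numbers start at 1, as they do in S3. Every part has exactly
    part_size bytes except the final one, which holds the remainder.
    """
    buffer = bytearray()
    part_number = 1
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full parts as the buffer currently holds.
        while len(buffer) >= part_size:
            yield part_number, bytes(buffer[:part_size])
            del buffer[:part_size]
            part_number += 1
    if buffer:
        # Flush whatever is left once the recording ends.
        yield part_number, bytes(buffer)


if __name__ == "__main__":
    # Tiny sizes so the demo runs instantly; real parts would use MIN_PART_SIZE.
    incoming = [b"a" * 3, b"b" * 4, b"c" * 2]
    for number, data in iter_parts(incoming, part_size=4):
        # A real uploader would send `data` as part `number` here.
        print(number, data)
```

Because parts are emitted as soon as enough data has accumulated, the backend never has to hold the whole recording in memory.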

Google Action should play radio stream

I need to develop a Google Action which streams an audio/radio stream.
I thought about using a media response.
But the documentation says: "Audio for playback must be in a correctly formatted .mp3 file. Live streaming is not supported."
Documentation
Can someone give me a hint as to what I have to do to stream an audio stream? I found a German Google Action, "baden fm", which streams their radio, but I'm not sure how they do it.
Kind Regards
Stefan
The only ways to do this currently:
Stream it in chunks of MP3 files, using the callback at the end of streaming to stream the next chunk
Getting listed on TuneIn, Radio.com or iHeartRadio. From observation, Baden FM seems to be using TuneIn
Through an App Action
Use a Web site link that starts streaming via BrowseCarousel or Button
The last two options are not helpful if you're targeting devices without a browser.
Also saw this thread which has some insight on MP3 size/duration: How can I tell Actions on Google to stream audio?
Google Actions do not currently support live audio streaming. I'm in contact with them but it seems they have no ETA to support this.
I was successful doing so with an mp3 live stream:
NPR: https://npr-ice.streamguys1.com/live.mp3?ck=1597372625378
but not with mpd
BBC test stream: https://rdmedia.bbc.co.uk/dash/ondemand/testcard/1/client_manifest-audio.mpd
or with the HLS that my company uses ( .m3u8, can't publish the link publicly )
Note: I added the links as text/code since I'm not sure whether their companies' policies are cool with them being indexed.

How to store and handle big data string (around 2mb) in node js

I have a frontend in Angular and the API is in Node.js. The frontend sends me the encrypted file, and right now I am storing it in MongoDB. But when I send this file back to the frontend, the call sometimes breaks. Please suggest how I can solve this issue.
Your question isn't particularly clear. As I understand it, you want to send a large file from the Node backend to the client. If this is correct, then read on.
I had a similar issue whereby a long-running API call took several minutes to compile the data and send the single large response back to the client. The issue I had was a timeout, and I couldn't extend it.
With Node you can use 'streams' to stream the data as it is available to the client. This approach worked really well for me as the server was streaming the data, the client was reading it. This got round the timeout issue as there is frequent 'chatter' between the server and client.
Using Streams did take a bit of time to understand and I spent a while reading various articles and examples. That said, once I understood, it was pretty straightforward to implement.
This article on Node Streams on the freeCodeCamp site is excellent. It contains a really useful example where it creates a very large text file, which is then 'piped' to the client using streams. It shows how you can read the text file in 'chunks', make transformations to those chunks and send them to the client. What this article doesn't explain is how to read this data in the client...
For the client-side, the approach is different from the typical fetch().then() approach. I found another article that shows a working example of reading such streams from the back-end. See Using Readable Streams on the Mozilla site. In particular, look at the large code example that uses the 'pump' function. This is exactly what I used to read the data streamed from the back-end.
Sorry I can't give a specific answer to your question, but I hope the links get you started.
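The read-in-chunks, transform, and consume pattern described above is not specific to Node. The following is a language-neutral sketch in Python, using generators in place of Node streams; it is not the code from either article, and `read_chunks`, `transform` and `pump` are hypothetical names chosen to echo the steps in the text.

```python
import io
from typing import BinaryIO, Callable, Iterator


def read_chunks(source: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Read the source one chunk at a time instead of loading it whole."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return
        yield chunk


def transform(
    chunks: Iterator[bytes], fn: Callable[[bytes], bytes]
) -> Iterator[bytes]:
    """Apply fn to each chunk as it passes through."""
    for chunk in chunks:
        yield fn(chunk)


def pump(chunks: Iterator[bytes], sink: BinaryIO) -> int:
    """Consume chunks as they arrive and return the total bytes written."""
    total = 0
    for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"hello world")  # stands in for the large file
    sink = io.BytesIO()                  # stands in for the client
    written = pump(transform(read_chunks(source, 4), bytes.upper), sink)
    print(written, sink.getvalue())
```

Nothing is read until the consumer asks for the next chunk, so only one chunk is in memory at a time, and the receiving side sees data continuously rather than waiting for one large response.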

Can't seem to find a way to play mp3

I've spent hours on this problem and I still can't solve it. Basically I get data from a database and use Google Text-to-Speech to turn it into an mp3; after that I upload it to Google Cloud Storage. From there I use the Twilio API to play the mp3 file when making an outbound call. I know I need a URL for this file, but I am very inexperienced with this, and when I create a VoiceResponse() I can't work out how to pass it in. I am doing all of this in Python. Is it possible for me to play the mp3 in the outbound call?
Best
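For reference, Twilio plays audio during a call when the TwiML it receives contains a `<Play>` element wrapping the file's URL. Below is a minimal sketch that builds that document with only the standard library; `build_play_twiml` and the example URL are hypothetical, and it assumes the mp3 in Cloud Storage is reachable by Twilio over HTTPS (for example via a public or signed URL). Twilio's Python helper library offers a `play()` method on `VoiceResponse` that should produce equivalent TwiML.

```python
import xml.etree.ElementTree as ET


def build_play_twiml(mp3_url: str) -> str:
    """Return a TwiML document telling Twilio to play the audio at mp3_url."""
    response = ET.Element("Response")
    play = ET.SubElement(response, "Play")
    play.text = mp3_url  # ElementTree escapes characters such as '&' for us
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URL standing in for the uploaded file.
    print(build_play_twiml("https://example.com/audio.mp3"))
```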

How to setup action on Google API on local server?

I am new to Actions on Google and would like some help. I want to create a chatbot using the Actions on Google API. I came across some blogs, but I can't understand how to set this up locally and use the Actions on Google APIs productively. I have read the documentation, but nothing seems to work as expected. Please help me with the initial steps so I know where to begin.
I think you should read the documentation carefully. Here's the correct address for developing actions: https://developers.google.com/actions/apiai/
This is the right document if you are going to use API.ai. If you want to learn faster, watch the videos.
This is the basic video you need to watch: https://youtu.be/5Al0bfCF-xA
I think this video will help you a lot. If you still need more help building apps, reply and we will help!
You can test it on your local server by opening up a port on your Wi-Fi router and forwarding that port to your server's IP and port. However, the call from api.ai to your local server still has to be made over HTTPS.

Resources