Bad config error when executing Google Speech API sample code - Node.js

https://cloud.google.com/speech-to-text/docs/streaming-recognize
I've been trying to run the sample Google Speech API code under "Performing Streaming Speech Recognition on an Audio Stream" at the link above.
Here is the code I have been trying to execute:
'use strict';

const record = require('node-record-lpcm16');
const speech = require('@google-cloud/speech');
const exec = require('child_process').exec;
//const speech = Speech();
const client = new speech.SpeechClient();

const encoding = 'LINEAR16';
const sampleRateHertz = 16000;
const languageCode = 'en-US';

const request = {
  config: {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode
  },
  interimResults: true // If you want interim results, set this to true
};

const recognizeStream = client.streamingRecognize(request)
  .on('error', console.error)
  .on('data', (data) =>
    process.stdout.write(
      (data.results[0] && data.results[0].alternatives[0])
        ? `Transcription: ${data.results[0].alternatives[0].transcript}\n`
        : `\n\nReached transcription time limit, press Ctrl+C\n`)
  );

record.start({
  sampleRateHertz: sampleRateHertz,
  threshold: 0.5,
  verbose: true,
  recordProgram: 'arecord', // Try also "arecord" or "sox"
  silence: '10.0'
}).on('error', console.error)
  .pipe(recognizeStream);

console.log('Listening, press Ctrl+C to stop.');
The output in the terminal shows a bad config error.
I realise there's a problem with the encoding of the output stream from arecord, i.e. it isn't in line with the configuration specified in the program, but I'm not sure what to do to correct this.
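One thing worth trying (a minimal sketch, not a verified fix: it assumes arecord is on the PATH and reuses the recognizeStream from the snippet above) is to spawn arecord directly with flags that force raw 16-bit signed little-endian mono PCM at 16 kHz, so the stream is guaranteed to match the LINEAR16 / 16000 Hz request config:

const { spawn } = require('child_process');

// Force arecord to emit exactly what the request config declares:
// raw (headerless) 16-bit signed little-endian mono PCM at 16 kHz.
const arecord = spawn('arecord', [
  '-f', 'S16_LE', // sample format: 16-bit signed LE, i.e. LINEAR16
  '-r', '16000',  // sample rate, must equal sampleRateHertz
  '-c', '1',      // a single (mono) channel
  '-t', 'raw'     // no WAV header, just PCM samples
]);

arecord.stdout
  .on('error', console.error)
  .pipe(recognizeStream);

If this transcribes correctly, the original problem was the recorder's output format rather than the API call.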

Related

Electron Desktop Application using GCP Speech To Text - Cannot keep the connection open

I am trying to make a speech-to-text desktop app using GCP and Electron. I have it set up so that when you click the record button it starts recording and shows the speech-to-text in the box. My problem is that when you speak, it transcribes at a pause and sends the data to the front end, and then errors (Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client) if you keep speaking. I have been advised that using a websocket to keep the connection open and continually send the data will fix this. However, I am unfamiliar with websockets and all of the tech I am using, so my question is: what sort of websocket would I use? Will this work using ipcMain, or do I need a different websocket? And would anyone be able to advise how I would implement this? My code from my index.js file is below:
router.get("/speechToText", function (req, res, next) {
  speech.speechToTextFunction(recorder).on("data", data => {
    console.log(data.results[0].alternatives[0].transcript);
    res.render('index', { answerInput: data.results[0].alternatives[0].transcript });
  });
});
which connects to my index.ejs file with the button:
<form action="/speechToText" method="GET" id="captureSpeech">
  <button type="submit" class="btn btn-primary" id="capSpeech" onclick="">Capture Speech</button>
</form>
and my speech.js file has this:
const speech = require("@google-cloud/speech");
const client = new speech.SpeechClient();

const encoding = "LINEAR16";
const sampleRateHertz = 16000;
const languageCode = "en-US";

const request = {
  config: {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode,
  },
  interimResults: false, // If you want interim results, set this to true
  // singleUtterance: true
};

// Create a recognize stream
const recognizeStream = client
  .streamingRecognize(request)
  .on("error", console.error);

module.exports = {
  speechToTextFunction: (recorder) => {
    return recorder
      .record({
        sampleRateHertz: sampleRateHertz,
        threshold: 0,
        // Other options, see https://www.npmjs.com/package/node-record-lpcm16#options
        verbose: false,
        recordProgram: "sox", // Try also "arecord" or "sox"
        silence: "10.0",
      })
      .stream()
      .on("error", console.error)
      .pipe(recognizeStream);
  },
};

console.log("Listening, press Ctrl+C to stop.");
Any help and advice would be greatly appreciated
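Not the poster's code, just a minimal sketch of the usual pattern (assuming the ws package, and a renderer that captures audio and sends raw chunks as binary messages): replace the one-shot Express route with a WebSocket server, so each transcript is pushed over the open socket instead of res.render() being called more than once per request:

const WebSocket = require('ws');
const speech = require('@google-cloud/speech');

const client = new speech.SpeechClient();
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws) => {
  // One recognize stream per connection; every result is pushed
  // over the socket instead of ending an HTTP response.
  const recognizeStream = client
    .streamingRecognize({
      config: {
        encoding: 'LINEAR16',
        sampleRateHertz: 16000,
        languageCode: 'en-US',
      },
      interimResults: true,
    })
    .on('error', (err) => ws.send(JSON.stringify({ error: err.message })))
    .on('data', (data) => {
      if (data.results[0] && data.results[0].alternatives[0]) {
        ws.send(JSON.stringify({
          transcript: data.results[0].alternatives[0].transcript,
        }));
      }
    });

  // Binary messages from the renderer are raw audio chunks.
  ws.on('message', (chunk) => recognizeStream.write(chunk));
  ws.on('close', () => recognizeStream.end());
});

Electron's own ipcMain/ipcRenderer channel would work as well, since it already stays open; the underlying issue is that res.render() can answer a GET request only once, so any push-style channel removes the ERR_HTTP_HEADERS_SENT error.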

ReadFileSync error when attempting to use Google Speech Cloud in POST request

I am rather new to Node.js and have been trying to get my application working with Google Cloud Speech, but ran into some trouble. First, I created a file upload form on my website using a POST request, which I then immediately submit to the Google cloud through the following code:
app.post('/submit-form', multer({ storage: storage }).single('audiofile'), function (req, res) {
  console.log(req.file);

  async function main() {
    // Creates a client
    var client = new speech.SpeechClient();

    // The name of the audio file to transcribe
    var fileName = path.join(__dirname, './uploads/' + req.file.filename);
    console.log(fileName);

    // Reads a local audio file and converts it to base64
    var file = fs.readFileSync(fileName);
    var audioBytes = file.toString('base64');

    // The audio file's encoding, sample rate in hertz, and BCP-47 language code
    var audio = {
      content: audioBytes,
    };
    var config = {
      encoding: 'LINEAR16',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
    };
    var request = {
      audio: audio,
      config: config,
    };

    // Detects speech in the audio file
    var [response] = await client.recognize(request);
    var transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    console.log(`Transcription: ${transcription}`);
    var fs = writeFile('./resources/results/result.txt', transcription);
  }
});
Trouble is, the program always gets stuck at readFileSync for some reason, saying
TypeError: Cannot read property 'readFileSync' of undefined
Even when I submit filenames that are already in the uploads folder, the code does not work, so I am not sure what is wrong.
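A minimal sketch of the likely culprit (assuming fs was required at the top of the file): var fs = writeFile(...) on the last line of main() declares a local fs, and var hoisting makes that local, still-undefined binding the one readFileSync is looked up on. Renaming the variable, or using fs.promises.writeFile, avoids the shadowing:

const fs = require('fs');

async function main(fileName) {
  // With no local `var fs = ...` in this scope, this resolves to the
  // required module instead of a hoisted, still-undefined variable.
  const file = fs.readFileSync(fileName);
  const audioBytes = file.toString('base64');

  // ... build the request and call client.recognize() as above ...
  const transcription = 'example'; // placeholder for the API result

  // Write the result without redeclaring fs.
  await fs.promises.writeFile('./resources/results/result.txt', transcription);
}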

Google speech to text live stream, single_utterance is not working

I'm trying to live-stream speech to text using Google. I have installed Node on my server.
I have successfully implemented it, but I want Google to recognize when the user stops speaking. Google explains how to do that using single_utterance=true, but it is not taking effect. Can you please tell me what the issue is in the code below? Thank you!
var request = {
  config: {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode,
    //profanityFilter: false,
    enableWordTimeOffsets: true,
    //single_utterance: true
    // speechContexts: [{
    //   phrases: ["hoful","shwazil"]
    // }] // add your own speech context for better recognition
  },
  interimResults: true, // If you want interim results, set this to true
  singleUtterance: true
};
function startRecognitionStream(client, data) {
  console.log(request);
  recognizeStream = speechClient.streamingRecognize(request)
    .on('error', console.error)
    .on('data', (data) => {
      process.stdout.write(
        (data.results[0] && data.results[0].alternatives[0])
          ? `Transcription: ${data.results[0].alternatives[0].transcript}\n`
          : `\n\nReached transcription time limit, press Ctrl+C\n`);
      client.emit('speechData', data);

      // if end of utterance, let's restart stream
      // this is a small hack. After 65 seconds of silence, the stream will still throw an error for speech length limit
      if (data.results[0] && data.results[0].isFinal) {
        stopRecognitionStream();
        startRecognitionStream(client);
        // console.log('restarted stream serverside');
      }
    })
    .on('end_of_single_utterance', (data) => {
      process.stdout.write('data ended');
      console.log('data ended');
    });
}
Thank you in advance!
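One possible explanation, sketched below with the request and speechClient from the question (assumption: a current @google-cloud/speech client, where enum values arrive as strings): END_OF_SINGLE_UTTERANCE is not a separate stream event, so a custom .on('end_of_single_utterance', ...) listener never fires; it arrives inside an ordinary 'data' message as speechEventType:

recognizeStream = speechClient.streamingRecognize(request)
  .on('error', console.error)
  .on('data', (data) => {
    // singleUtterance: true makes the API send this event once the
    // user appears to have stopped speaking.
    if (data.speechEventType === 'END_OF_SINGLE_UTTERANCE') {
      console.log('End of utterance detected');
      stopRecognitionStream();
      return;
    }
    if (data.results[0] && data.results[0].alternatives[0]) {
      process.stdout.write(
        `Transcription: ${data.results[0].alternatives[0].transcript}\n`);
    }
  });

Note that singleUtterance (camelCase, at the top level of the streaming request next to interimResults, as in the question's code) is the right placement in the Node.js client, so the commented-out single_utterance inside config can stay removed.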

NodeJs Google Speech API Followed Streaming recognition

I am working on a project that will need to use streaming recognition. I want to make it work with Node.js, like here. It works fine, but as the documentation says, it stops after roughly 60 seconds.
What I want is to restart the recording and the recognition right after this error happens, as my text might take more than one minute.
I tried closing the recording when I receive an error from the recognition:
const record = require('node-record-lpcm16');
const unirest = require('unirest');
const speech = require('@google-cloud/speech');
const client = new speech.SpeechClient();

const request = {
  config: {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode
  },
  interimResults: false
};

const recognizeStream = client
  .streamingRecognize(request)
  .on('error', error => {
    console.log("Timeout...");
    console.error(error);
    console.log("Closing record");
    return record.stop();
  })
  .on('data', (data) => {
    //Process data
  });
And restarting the recording when it is ended:
record
  .start({
    sampleRateHertz: sampleRateHertz,
    threshold: 0,
    verbose: false,
    recordProgram: 'rec',
    silence: '1.0'
  })
  .on('error', console.error)
  .on('end', data => {
    record.start({
      sampleRateHertz: sampleRateHertz,
      threshold: 0,
      verbose: false,
      recordProgram: 'rec',
      silence: '1.0'
    }).pipe(recognizeStream);
  })
  .pipe(recognizeStream);
As a newbie in Node.js this is the only solution I thought of, but it doesn't work. The recording starts again correctly, but the recognition doesn't: it doesn't transcribe anything after it has been closed.
Any idea how to perform recognition for more than one minute with streaming recognition? I'd like something similar to what can be done in Python here.
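A gRPC streaming session cannot be reused once it has ended, which matches the symptom described: the same recognizeStream object is piped into again after it errored out. A minimal sketch of the usual workaround (reusing client, request, and the recording options from the snippets above) is to build a brand-new recognize stream on every restart:

function startStream() {
  const recognizeStream = client
    .streamingRecognize(request)
    .on('error', (err) => {
      console.error(err);
      recording.unpipe(recognizeStream); // detach the dead stream
      startStream();                     // pipe into a fresh one
    })
    .on('data', (data) => {
      // Process data
    });
  recording.pipe(recognizeStream);
}

const recording = record.start({
  sampleRateHertz: sampleRateHertz,
  threshold: 0,
  verbose: false,
  recordProgram: 'rec',
  silence: '1.0'
});
startStream();

The recording itself never stops; only the Speech API stream is torn down and recreated, which is roughly what Google's own "infinite streaming" samples do as well.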

How to properly run the streamingRecognize() function provided by the Google Cloud Speech API

I have an issue with the streamingRecognize() function when streaming speech to text.
When I run it, I get the error:
Uncaught TypeError:
speechClient.streamingRecognize is not a function
When I try accessing it through the api object of my speechClient instance, I get this error as the response:
google.cloud.speech.v1.StreamingRecognizeRequest#0 is not a field:
undefined
This is my code:
console.log('Listening started');
document.getElementById("speak-btn").value = "Stop";

// retrieve settings
console.log("Retrieving audio and language settings...");
database.existSettingsRecord({}, (settingsRecord) => {
  // The BCP-47 language code to use, e.g. 'en-US'
  const languageCode = settingsRecord.language; //'en-US';

  // Your Google Cloud Platform project ID
  const nathProjectId = 'protocol-recorder-201707';

  // Instantiates a client
  const speechClient = Speech({
    projectId: nathProjectId,
    keyFilename: './rsc/credentials/spr-426ec2968cf6.json'
  });

  // The encoding of the audio file, e.g. 'LINEAR16'
  const encoding = 'LINEAR16';
  // The sample rate of the audio file in hertz, e.g. 16000
  const sampleRateHertz = 16000;

  const request = {
    config: {
      encoding: encoding,
      sampleRateHertz: sampleRateHertz,
      languageCode: languageCode
    },
    interimResults: false // If you want interim results, set this to true
  };

  // Create a recognize stream
  var notes = '';
  console.log('Create the recognize stream object to be piped...');
  //const recognizeStream = speechClient.createRecognizeStream(request)
  console.log("speechClient : ", speechClient);
  console.log("grpc : ", grpc);
  const recognizeStream = speechClient.streamingRecognize(request)
    .on('error', console.error)
    .on('data', (response) => {
      //process.stdout.write(response.results)
      process.stdout.write(
        (response.results[0] && response.results[0].alternatives[0])
          ? `Transcription: ${response.results[0].alternatives[0].transcript}\n`
          : `\n\nReached transcription time limit, press Ctrl+C\n`);
      notes = document.getElementById("notes").value;
      notes = notes + response.results;
      document.getElementById("notes").value = notes;
    });

  // Start recording and send the microphone input to the Speech API
  console.log('Start recording and send the microphone input to the Speech API...');
  record.start({
    sampleRateHertz: sampleRateHertz,
    threshold: 0,
    // Other options, see https://www.npmjs.com/package/node-record-lpcm16#options
    verbose: true,
    recordProgram: 'sox', // Try also "arecord" or "sox"
    silence: '1.0',
    device: settingsRecord.audio_input
  })
    .on('error', console.error)
    .pipe(recognizeStream);
});
I am using:
Win 10
Node.js 7.10.0
sox 14.4.2
Thanks for any help on this issue!
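One likely cause, sketched under the assumption that an older release of @google-cloud/speech is installed: early versions exposed the streaming method as createRecognizeStream() (as in the commented-out line above) together with the Speech(...) factory, while newer versions replace both with the SpeechClient class and streamingRecognize(). After upgrading the package, instantiating the client explicitly should make the method available:

const speech = require('@google-cloud/speech');

// SpeechClient replaces the old Speech({...}) factory; credential
// options are passed the same way.
const speechClient = new speech.SpeechClient({
  projectId: nathProjectId,
  keyFilename: './rsc/credentials/spr-426ec2968cf6.json'
});

const recognizeStream = speechClient.streamingRecognize(request)
  .on('error', console.error)
  .on('data', (response) => {
    // handle response.results as before
  });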
