I am creating a video/mp4 from a canvas in the front with react:
const canvasResult = document.getElementById(
"canvasResult"
) as HTMLCanvasElement;
const createVideo = () => {
// create video:
const chunks: any[] = []; // here we will store our recorded media chunks (Blobs)
const stream = canvasResult.captureStream(); // grab our canvas MediaStream
const rec = new MediaRecorder(stream); // init the recorder
// every time the recorder has new data, we will store it in our array
rec.ondataavailable = (e) => chunks.push(e.data);
// only when the recorder stops, we construct a complete Blob from all the chunks
rec.onstop = (e) => {
setLoadingCanvaGif(100);
resolve(new Blob(chunks, { type: "video/mp4" }));
};
rec.start();
setTimeout(() => {
clearInterval(canvaInterval);
setShowCanva(false);
rec.stop();
}, 6000); // stop recording in 6s
}
const blobvideo = await createVideo();
const fileVideo = new File([blobvideo], "video.mp4" , {
type: blobvideo.type,
});
let formData = new FormData();
formData.append("file", file);
await axios.post(
`/uploadFile`,
formData,
{
headers: {
"Content-Type": "multipart/form-data",
},
}
);
and receiving it in the backend with nodejs, I save it like this:
// using express-fileupload
const file: UploadedFile = req?.files?.file;
const targetPath = path.join(
__dirname,
`../../uploads`
);
const fileName = path.join(targetPath, `/design_${ms}.mp4`); // ms is a ramdon id
await new Promise<void>((resolve, reject) => {
file.mv(fileName, function (err: any) {
if (err) {
throw {
code: 400,
message: err,
};
}
resolve();
});
});
the problem I have is that when it is saved with the .mp4 extension, the file is not saved correctly, windows shows it like this (case 3):
If i save it as webm
If the .mp4 file (of the case 3) is passed through a video to mp4 converter (https://video-converter.com/es/)
if i save it as mp4
The problem is that when I want to use case 3 (I need it in mp4 and not in webm and I can't manually upload each video to a converter) I can't use it correctly, it generates errors.
note: the three files are played correctly by opening it with vlc or any video player
I believe MediaRecorder in Chrome only supports video/webm mimeType. Your node service will need to convert the file to mp4.
You can use MediaRecorder.isTypeSupported('video/mp4') to check if it is supported.
const getMediaRecorder = (stream) => {
// If video/mp4 is supported
if (MediaRecorder.isTypeSupported('video/mp4')) {
return new MediaRecorder(stream, { mimeType: 'video/mp4' });
}
// Let the browser pick default
return new MediaRecorder(stream);
};
Related
I would like to save audio recording to S3. I am using the functions below to load direct to awsS3 direct from the browser. It works for short audio recordings of up to around 25 seconds but fails for larger files.
Currently the functions is as follows: I speak into the microphone using recorder.js. Once the recording is complete I press stop which then saves the file to AWS
From the browser:
getSignedRequest(file,fileLoc);
function getFetchSignedRequest(file,fileLoc){
const fetchUrl = `/xxxxxxxxx?file-name=${file.name}&file-type=${file.type}&fileLoc=${fileLoc}`;
fetch(fetchUrl )
.then((response) => {
console.log('response',response)
if(!response.ok){
console.log('Network response was not OK',response.ok)
} else {
putAudioFetchFile(file, response.signedRequest, response.url)
}
})
.catch((error) => {
console.error('Could not get signed URL:', error);
})
}
This send a get request to the NodeJs server which calls this :
const aws = require('aws-sdk');
const fs = require('fs');
aws.config.region = 'xxxxxx';
const S3_BUCKET = process.env.AWS_S3_BUCKET
this.uploadToAWSDrive =
async function uploadToAWSDrive(req,res){
const s3 = new aws.S3();
const URL_EXPIRATION_SECONDS = 3000;
const subFolderName = req.query['fileLoc'];
const fileName = req.query['file-name'];
const fileType = req.query['file-type'];
const fileLocName = subFolderName + fileName;
const s3Params = {
Bucket: S3_BUCKET,
Key: fileLocName,
Expires: URL_EXPIRATION_SECONDS,
ContentType: fileType,
ACL: 'public-read'
};
await s3.getSignedUrl('putObject', s3Params, (err, data) => {
if(err){
console.log(err);
return res.end();
}
const returnData = {
signedRequest: data,
url: `https://${S3_BUCKET}.s3.amazonaws.com/${fileLocName}`
};
console.log('audio uploaded',returnData)
res.write(JSON.stringify(returnData));
res.end();
});
}
Which then calls this:
function uploadFile(file, signedRequest, url){
const xhr = new XMLHttpRequest();
xhr.open('PUT', signedRequest);
xhr.onreadystatechange = () => {
if(xhr.readyState === 4){
if(xhr.status === 200){
console.log('destination url= ', url,xhr.readyState,xhr.status)
}
else{
alert('Could not upload file.');
}
}
};
xhr.send(file);
}
This then sends the file to the awsS3 server. Ok for audio less than 30secs, but fails for longer audio files.
What do I need to do to enable this to work with audio files of greater than 20secs and upto 3 mins?
Any help most appreciated
Not very elegant but the issue was resolved by adding a timer to the origanal function call. A function that followed also needed to be delayed to I think allow processor time. I am sure there will be better ways to do this.
setTimeout( getSignedRequest( myAudioFile,fileLoc), proccessTime) ;
I have created app for speech to text converter. react frontend and nodejs API.i record audio from react and post it to nodejs.but google API result is empty.how can I fix it?
why getting always empty results?
that's my code.
ReactMic Recorder
<ReactMic
record={record}
className="sound-wave"
onStop={onStop}
onData={onData}
strokeColor="#000000"
backgroundColor="#FF4081"
mimeType="audio/wav"/>
<button onClick={startRecording} type="button">Start</button>
<button onClick={stopRecording} type="button">Stop</button>
NodeJs API
app.post('/SpeechConvert', (req, res) => {
const client = new speech.SpeechClient();
console.log(req.files.file);
req.files.file.mv('./input.wav',function (err) {
if (err) {
console.log(err);
}
})
async function speechToText() {
// The name of the audio file to transcribe
const fileData = req.files.file.data;
// Reads a local audio file and converts it to base64
const file = fs.readFileSync('input.wav');
const audioBytes = fileData.toString('base64');
// console.log(audioBytes);
// The audio file's encoding, sample rate in hertz, and BCP-47 language code
const audio = {
content: audioBytes,
};
const config = {
enableAutomaticPunctuation: true,
encoding: 'LINEAR16',
sampleRateHertz: 44100,
languageCode: 'en-US',
};
const request = {
audio: audio,
config: config,
};
// Detects speech in the audio file
const [response] = await client.recognize(request);
console.log(response);
const transcription = response.results
.map(result => result.alternatives[0].transcript)
.join('\n');
console.log(`Transcription: ${transcription}`);
res.send({ 'transcription': transcription, 'msg': 'The Audio successfully converted to the text' });
}
speechToText().catch(console.error);
});
can anyone help me to fix this?
I want to convert a large list of PDF files to JPEG and compress the images files on the server before sending the archive file for the client.
I use:
PDFKit to generate my PDF file
ImageMagic to convert each PDF file to JPEG with the command convert -density 300 file.pdf -trim -quality 100 file.jpeg
archiver for zipping image files
Problems:
the convert command is too slow and I have a timeout in the client side if I want to convert a large list of PDF files or a PDF file with many pages.
my archive don't stream directly in the client side
Questions:
It is possible to generate a JPEG file with PDFKit?
What I need to do to stream directly the archive to my response?
My code:
const archiver = require("archiver");
const { exec } = require("child_process");
const PDFDocument = require("pdfkit");
const fs = require("fs");
const contentDisposition = require("content-dispisition");
async function convertFile(pdfPath, ouputPath) {
return new Promise((resolve, reject) => {
exec(
`convert -density 300 ${pdfPath} -trim -quality 80 ${ouputPath}`,
error => (error !== null ? reject(error) : resolve("ok"))
);
});
}
const cleanUp = () => {
// used to clean the generation files and directory
};
function createPdf(path, obj, size = [210, 297]) {
return new Promise(async (resolve, reject) => {
const doc = new PDFDocument({
size,
margin: 0,
layout: "landscape"
});
/** here is a complex code for drawing PDF images and text */
const stream = fs.createWriteStream(path);
stream.on("error", reject);
stream.on("finish", resolve);
doc.pipe(stream);
doc.on("error", reject);
});
}
function getItems(id) {
return []; // Return a list of object for PDF files
}
// Express router
Router.get("/api/card/:id", async function route(req, res) {
res.header("Content-Type", `application/zip`);
res.header("Content-Disposition", contentDisposition(`file.zip`));
const archiveStream = archiver("zip", { zlib: { level: 9 } });
archiveStream.pipe(res);
archiveStream.on("end", cleanUp);
archiveStream.on("error", err => {});
let i = 0;
for (const obj of getItems(req.params.id)) {
const filename = `myfile-${i++}`;
// Create my pdf file
await createPdf(`${filename}.pdf`, obj);
// Convert PDF to JPEG
await convertFile(`${filename}.pdf`, `${filename}.jpeg`);
// Add JPEG file to archive
archiveStream.append(fs.createReadStream(`${filename}.jpeg`), {
name: `${filename}.jpeg`
});
}
archiveStream.finalize();
});
I am making an app where the user browser records the user speaking and sends it to the server which then passes it on to the Google speech to the text interface. I am using mediaRecorder to get 1-second blobs which are sent to a server. On the server-side, I send these blobs over to the Google speech to the text interface. However, I am getting an empty transcriptions.
I know what the issue is. Mediarecorder's default Mime Type id audio/WebM codec=opus, which is not accepted by google's speech to text API. After doing some research, I realize I need to use ffmpeg to convert blobs to LInear16. However, ffmpeg only accepts audio FILES and I want to be able to convert BLOBS. Then I can send the resulting converted blobs over to the API interface.
server.js
wsserver.on('connection', socket => {
console.log("Listening on port 3002")
audio = {
content: null
}
socket.on('message',function(message){
// const buffer = new Int16Array(message, 0, Math.floor(data.byteLength / 2));
// console.log(`received from a client: ${new Uint8Array(message)}`);
// console.log(message);
audio.content = message.toString('base64')
console.log(audio.content);
livetranscriber.createRequest(audio).then(request => {
livetranscriber.recognizeStream(request);
});
});
});
livetranscriber
module.exports = {
createRequest: function(audio){
const encoding = 'LINEAR16';
const sampleRateHertz = 16000;
const languageCode = 'en-US';
return new Promise((resolve, reject, err) =>{
if (err){
reject(err)
}
else{
const request = {
audio: audio,
config: {
encoding: encoding,
sampleRateHertz: sampleRateHertz,
languageCode: languageCode,
},
interimResults: false, // If you want interim results, set this to true
};
resolve(request);
}
});
},
recognizeStream: async function(request){
const [response] = await client.recognize(request)
const transcription = response.results
.map(result => result.alternatives[0].transcript)
.join('\n');
console.log(`Transcription: ${transcription}`);
// console.log(message);
// message.pipe(recognizeStream);
},
}
client
recorder.ondataavailable = function(e) {
console.log('Data', e.data);
var ws = new WebSocket('ws://localhost:3002/websocket');
ws.onopen = function() {
console.log("opening connection");
// const stream = websocketStream(ws)
// const duplex = WebSocket.createWebSocketStream(ws, { encoding: 'utf8' });
var blob = new Blob(e, { 'type' : 'audio/wav; base64' });
ws.send(blob.data);
// e.data).pipe(stream);
// console.log(e.data);
console.log("Sent the message")
};
// chunks.push(e.data);
// socket.emit('data', e.data);
}
I wrote a similar script several years ago. However, I used a JS frontend and a Python backend instead of NodeJS. I remember using a sox transformer to transform the audio input into to an output that the Google Speech API could use.
Perhaps this might be useful for you.
https://github.com/bitnahian/speech-transcriptor/blob/9f186e5416566aa8a6959fc1363d2e398b902822/app.py#L27
TLDR:
Converted from a .wav format to .raw format using ffmpeg and sox.
I'm making a demo of speech to text using Azure speech api on browser by node.js. According to API document here, it does specify that it need .wav or .ogg files. But the example down there does a api call through sending byte data to api.
So I've already get my data from microphone in byte array form. Is it the right path to convert it to byte and send it to api? Or is it better for me to save it as a .wav file then send to the api?
So below is my code.
This is stream from microphone part.
navigator.mediaDevices.getUserMedia({ audio: true })
.then(stream => { handlerFunction(stream) })
function handlerFunction(stream) {
rec = new MediaRecorder(stream);
rec.ondataavailable = e => {
audioChunks.push(e.data);
if (rec.state == "inactive") {
let blob = new Blob(audioChunks, { type: 'audio/wav; codec=audio/pcm; samplerate=16000' });
recordedAudio.src = URL.createObjectURL(blob);
recordedAudio.controls = true;
recordedAudio.autoplay = true;
console.log(blob);
let fileReader = new FileReader();
var arrayBuffer = new Uint8Array(1024);
var reader = new FileReader();
reader.readAsArrayBuffer(blob);
reader.onloadend = function () {
var byteArray = new Uint8Array(reader.result);
console.log("reader result" + reader.result)
etTimeout(() => getText(byteArray), 1000);
}
}
}
}
This is api call part
function getText(audio, callback) {
console.log("in function audio " + audio);
console.log("how many byte?: " + audio.byteLength)
const sendTime = Date.now();
fetch('https://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US', {
method: "POST",
headers: {
'Accept': 'application/json',
'Ocp-Apim-Subscription-Key': YOUR_API_KEY,
// 'Transfer-Encoding': 'chunked',
// 'Expect': '100-continue',
'Content-type': 'audio/wav; codec=audio/pcm; samplerate=16000'
},
body: audio
})
.then(function (r) {
return r.json();
})
.then(function (response) {
if (sendTime < time) {
return
}
time = sendTime
//callback(response)
}).catch(e => {
console.log("Error", e)
})
}
It returns with 400 (Bad Request) and says :
{Message: "Unsupported audio format"}
Reason:
Note you're not creating a MediaRecorder with a audio/wav mimeType by
new Blob(audioChunks,{type:'audio/wav; codec=audio/pcm; samplerate=16000'})
This statement is only a description for blob. I test my Chrome(v71) with isTypeSupported:
MediaRecorder.isTypeSupported("audio/wav") // return false
MediaRecorder.isTypeSupported("audio/ogg") // return false
MediaRecorder.isTypeSupported("audio/webm") // return true
It seems that the MediaRecorder will only record the audio in audio/webm. Also, when I run the following code on Chrome , the default rec.mimeType is audio/webm;codecs=opus
rec = new MediaRecorder(stream);
According to the Audio formats Requiremnts, the audio/webm is not supported yet.
Approach:
Before calling getText() we need convert the webm to wav firstly. There're quite a lot of libraries that can help us do that. I just copy Jam3's script before your code to convert webm to wav :
// add Jam3's script between Line 2 and Line 94 or import that module as you like
// create a audioContext that helps us decode the webm audio
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
rec = new MediaRecorder(stream,{
mimeType : 'audio/webm',
codecs : "opus",
});
// ...
rec.ondataavailable = e => {
audioChunks.push(e.data);
if (rec.state == "inactive") {
var blob = new Blob(audioChunks, { 'type': 'audio/webm; codecs=opus' });
var arrayBuffer;
var fileReader = new FileReader();
fileReader.onload = function(event) {
arrayBuffer = event.target.result;
};
fileReader.readAsArrayBuffer(blob);
fileReader.onloadend=function(d){
audioCtx.decodeAudioData(
fileReader.result,
function(buffer) {
var wav = audioBufferToWav(buffer);
setTimeout(() => getText(wav), 1000);
},
function(e){ console.log( e); }
);
};
}
}
And it works fine for me :
As a side note, I suggest you should use your backend to invoke the speech-to-text services. Never invoke azure stt service in a browser. That's because exposing your subscription key to front end is really dangerous. Anyone could inspect the network and steal your key.