How to send audio saved as a Buffer, from my api, to my React client and play it? - node.js

I've been chasing my tail for two days figuring out how best to send the <Buffer ... > object generated by Google's Text-To-Speech service from my express-api to my React app. I've come across tons of different opinionated resources that point me in different directions and only potentially "solve" isolated parts of the bigger process. At the end of all of this, I've learned a lot more about ArrayBuffer, Buffer, binary arrays, etc., yet I still feel just as lost as before when it comes to implementation.
At its simplest, all I aim to do is provide one or more strings of text to tts, generate the audio files, send the audio files from my express-api to my react client, and then automatically play the audio in the background on the browser when appropriate.
I am successfully triggering Google's TTS to generate the audio files. It responds with a <Buffer ...> representing the binary data of the file. That arrives in my express-api endpoint, but from there I'm not sure if I should
...
convert the Buffer to a string and send it to the browser?
send it as a Buffer object to the browser?
set up a websocket using socket.io and stream it?
then once it's on the browser,
do I use an <audio /> tag?
should I convert it to something else?
I suppose the problem I'm having is trying to find answers for this results in an information overload consisting of various different answers that have been written over the past 10 years using different approaches and technologies. I really don't know where one starts and the next ends, what's a bad practice, what's a best practice, and moreover what is actually suitable for my case. I could really use some guidance here.
Synthesise function from Google
// returns: <Buffer ff f3 44 c4 ... />
const synthesizeSentence = async (sentence) => {
  const request = {
    input: { text: sentence },
    voice: { languageCode: "en-US", ssmlGender: "NEUTRAL" },
    audioConfig: { audioEncoding: "MP3" },
  };
  const response = await client.synthesizeSpeech(request);
  return response[0].audioContent;
};
(current shape) of express-api POST endpoint
app.post("/generate-story-support", async (req, res) => {
  try {
    // ? generating the post here for simplicity, eventually the client
    // ? would dictate the sentences to send ...
    const ttsResponse: any = await axios.post("http://localhost:8060/", {
      sentences: SAMPLE_SENTENCES,
    });
    // a resource said to send the response as a string and then convert
    // it on the client to an Array buffer? -- no idea if this is a good practice
    return res.status(201).send(ttsResponse.data[0].data.toString());
  } catch (error) {
    console.log("error", error);
    return res.status(400).send(`Error: ${error}`);
  }
});
React client
useEffect(() => {
  const fetchData = async () => {
    const data = await axios.post(
      "http://localhost:8000/generate-story-support"
    );
    // converting it to an ArrayBuffer per another SO post
    const encoder = new TextEncoder();
    const encodedData = encoder.encode(data.data);
    setAudio(encodedData);
    return data.data;
  };
  fetchData();
}, []);
// no idea what to do from here, if this is even the right path :/
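One possible way through, offered as a sketch under assumptions rather than something the post confirms: keep the audio as bytes from end to end. The shape ttsResponse.data[0].data suggests the Buffer arrived through JSON as { type: "Buffer", data: [...] }, in which case toString() on data produces comma-separated numbers, not audio. The helper names bufferFromJson and toObjectUrl are hypothetical; Blob and URL.createObjectURL are standard in browsers and in recent Node versions.

```javascript
// Rebuild real bytes from the JSON form of a Node Buffer,
// i.e. { type: "Buffer", data: [255, 243, ...] }.
function bufferFromJson(serialized) {
  return Buffer.from(serialized.data);
}

// Wrap raw bytes in a Blob and create a URL that an <audio> element can load.
function toObjectUrl(bytes, mimeType = "audio/mpeg") {
  const blob = new Blob([bytes], { type: mimeType });
  return { blob, url: URL.createObjectURL(blob) };
}

// Stand-in for ttsResponse.data[0]: the first four bytes shown in the question.
const sample = JSON.parse(JSON.stringify(Buffer.from([0xff, 0xf3, 0x44, 0xc4])));
const audioBytes = bufferFromJson(sample);

// What the endpoint in the question currently sends: text, not audio.
const asText = sample.data.toString();
console.log(asText); // "255,243,68,196"

// Express side (assumption): res.set("Content-Type", "audio/mpeg").send(audioBytes);
// Client side (assumption): axios.post(url, body, { responseType: "arraybuffer" })
const { blob, url } = toObjectUrl(audioBytes);
console.log(blob.size, blob.type);

// Only browsers provide Audio; autoplay may be blocked until the user interacts.
if (typeof Audio !== "undefined") {
  new Audio(url).play().catch((err) => console.warn("Playback blocked:", err));
}
```

In the React component, the ArrayBuffer returned by axios could be passed straight to toObjectUrl, and the resulting URL either played with new Audio(url) or assigned to the src of an <audio /> element.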

Related

How to save state of audio blob in my chrome extension?

I have the following popup.js file in which I capture audio from a tab and store it in a blob and then set my audio tag's src URL to it. I would like to store this blob in chrome's local storage so that the audio data remains even after closing the extension's popup.html window. How do I go about doing this? I tried to serialize the blob to JSON using ArrayBuffer but it seemed to not be the correct method. If there is a better way please let me know. Thanks.
function captureTabAudio() {
  chrome.tabCapture.capture({ audio: true, video: false }, (stream) => {
    context = new AudioContext();
    const chunks = [];
    if (context.state === 'suspended') {
      context.resume();
    }
    var newStream = context.createMediaStreamSource(stream);
    newStream.connect(context.destination);
    const recorder = new MediaRecorder(stream);
    recorder.start();
    setTimeout(() => recorder.stop(), 10000);
    recorder.ondataavailable = (e) => {
      chunks.push(e.data);
    };
    recorder.onstop = (e) => {
      const blob = new Blob(chunks, { type: "audio/ogg; codecs=opus" });
      document.querySelector("audio").src = URL.createObjectURL(blob);
    };
  });
}
I assume that when you say "Chrome's local storage" you mean the web's localStorage API. If so, this won't work because, "The keys and the values stored with localStorage are always in the UTF-16 string format, which uses two bytes per character" (MDN). If you meant the extension platform's Storage API, you'll have a similar problem because it only supports JSON-serializable values. While you could potentially convert the ArrayBuffer into a Base64 string, I wouldn't recommend it as you're more likely to run into storage limits.
For storing binary data locally, currently your best bet is to use IndexedDB. Since this API supports the structured clone algorithm, it supports a much broader set of types including ArrayBuffer. A similar question was asked here: Saving ArrayBuffer in IndexedDB.
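As a rough illustration of both points, here is a minimal sketch. The database name "audio-db", the store name "recordings", and the helper names are placeholders of mine, not anything from the question. base64Length shows why a Base64 string is larger than the raw bytes, and saveRecording / loadRecording show one way the Blob could be kept in IndexedDB.

```javascript
// Base64 turns every 3 bytes into 4 characters, which is why string
// storage fills up faster than storing the Blob itself.
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Open (or create) a database with a single object store for recordings.
function openDb() {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open("audio-db", 1);
    request.onupgradeneeded = () => request.result.createObjectStore("recordings");
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Store the Blob under a fixed key; structured clone handles the binary data.
async function saveRecording(blob, key = "latest") {
  const db = await openDb();
  return new Promise((resolve, reject) => {
    const tx = db.transaction("recordings", "readwrite");
    tx.objectStore("recordings").put(blob, key);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

// Read the Blob back, e.g. when the popup is opened again.
async function loadRecording(key = "latest") {
  const db = await openDb();
  return new Promise((resolve, reject) => {
    const request = db.transaction("recordings").objectStore("recordings").get(key);
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

console.log(base64Length(1_000_000)); // 1333336 characters for 1 MB of audio
```

In the popup, saveRecording(blob) could be called inside recorder.onstop, and loadRecording() on startup, setting the audio element's src to URL.createObjectURL(blob) when a stored Blob is found.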

How do I create an expressjs endpoint that uses azure tts to send audio to a web app?

I am trying to figure out how to expose an express route (i.e., GET api/word/:some_word) which uses the azure tts sdk (microsoft-cognitiveservices-speech-sdk) to generate an audio version of some_word (in any format playable by a browser) and res.send()s the resulting audio, so that a front-end JavaScript web app could consume the API to play the audio pronunciation of the word.
I have the azure sdk 'working' - it is creating an 'ArrayBuffer' inside my expressjs code. However, I do not know how to send the data in this ArrayBuffer to the front end. I have been following the instructions here: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=import%2Cwindowsinstall&pivots=programming-language-javascript#get-result-as-an-in-memory-stream
Another way to phrase my question would be: in express, I have an ArrayBuffer whose contents are an .mp3/.ogg/.wav file. How do I send that file via express? Do I need to convert it into some other data type (like a Base64-encoded string? A Buffer?) Do I need to set some particular response headers?
I finally figured it out seconds after asking this question 😂
I am pretty new to this area, so any pointers on how this could be improved would be appreciated.
app.get('/api/tts/word/:word', async (req, res) => {
  const word = req.params.word;
  const subscriptionKey = azureKey;
  const serviceRegion = 'australiaeast';
  const speechConfig = sdk.SpeechConfig.fromSubscription(
    subscriptionKey as string,
    serviceRegion
  );
  speechConfig.speechSynthesisOutputFormat =
    SpeechSynthesisOutputFormat.Ogg24Khz16BitMonoOpus;
  const synthesizer = new sdk.SpeechSynthesizer(speechConfig);
  synthesizer.speakSsmlAsync(
    `
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
      xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="zh-CN">
      <voice name="zh-CN-XiaoxiaoNeural">
        ${word}
      </voice>
    </speak>
    `,
    (resp) => {
      // audioData is an ArrayBuffer; wrap it in a Buffer so Express sends raw bytes.
      const audio = resp.audioData;
      synthesizer.close();
      const buffer = Buffer.from(audio);
      res.set('Content-Type', 'audio/ogg; codecs=opus; rate=24000');
      res.send(buffer);
    },
    (err) => {
      // Without an error callback, a failed synthesis would leave the request hanging.
      synthesizer.close();
      res.status(500).send(`Speech synthesis failed: ${err}`);
    }
  );
});
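One pointer on the route above: the word comes straight from the URL and is interpolated into SSML, which is XML, so characters such as & or < could break the document. A minimal sketch of escaping it first follows; escapeXml and buildSsml are hypothetical helper names, and the SSML is a trimmed version of the markup used in the route.

```javascript
// Escape the five characters that are special in XML so user input
// cannot break out of the surrounding SSML markup.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds a trimmed version of the SSML used in the route.
function buildSsml(word) {
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="zh-CN">` +
    `<voice name="zh-CN-XiaoxiaoNeural">${escapeXml(word)}</voice></speak>`
  );
}

console.log(buildSsml("你好 & <再见>"));
```

The route could then pass buildSsml(word) to speakSsmlAsync instead of the inline template string.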

What is the best approach to stream JSON from a REST API to an Express app?

I have a moleculer-based microservice that has an endpoint which outputs a large JSON array (tens of thousands of objects).
This is a structured JSON object and I know beforehand what it is going to look like.
[ // ... tens of thousands of these
  {
    "fileSize": 1155624,
    "name": "Gyo v1-001.jpg",
    "path": "./userdata/expanded/Gyo v01 (2003)"
  },
  {
    "fileSize": 308145,
    "name": "Gyo v1-002.jpg",
    "path": "./userdata/expanded/Gyo v01 (2003) (Digital)"
  }
  // ... tens of thousands of these
]
I went about researching on JSON streaming, and made some headway there, in that I know how to consume a NodeJS ReadableStream client-side. I know I can use oboe to parse the JSON stream.
To that end, this is code in my Express-based app.
router.route("/getComicCovers").post(async (req: Request, res: Response) => {
  typeof req.body.extractionOptions === "object"
    ? req.body.extractionOptions
    : {};
  oboe({
    url: "http://localhost:3000/api/import/getComicCovers",
    method: "POST",
    body: {
      extractionOptions: req.body.extractionOptions,
      walkedFolders: req.body.walkedFolders,
    },
  }).on("node", ".*", (data) => {
    console.log(data);
    res.write(JSON.stringify(data));
  });
});
This is the endpoint in moleculer
getComicCovers: {
  rest: "POST /getComicCovers",
  params: {
    extractionOptions: "object",
    walkedFolders: "array",
  },
  async handler(
    ctx: Context<{
      extractionOptions: IExtractionOptions;
      walkedFolders: IFolderData[];
    }>
  ) {
    const comicBooksForImport = await getCovers(
      ctx.params.extractionOptions,
      ctx.params.walkedFolders
    );
    // comicBooksForImport is the aforementioned array of objects.
    // How do I stream it from here to the Express app object-by-object?
  },
},
My question is: How do I stream this gigantic JSON from the REST endpoint to the Express app so I can parse it on the client end?
UPDATE
I went with a socket.io implementation per @JuanCaicedo's suggestion. I have it set up on both the server and the client end.
However, I do have trouble with this piece of code:
map(
  walkedFolders,
  async (folder, idx) => {
    let foo = await extractArchive(extractionOptions, folder);
    let fo = new JsonStreamStringify({
      foo,
    });
    fo.pipe(res);
    if (+idx === walkedFolders.length - 1) {
      res.end();
    }
  }
);
I get the error Error [ERR_STREAM_WRITE_AFTER_END]: write after end. I understand that this happens because the response is terminated before the next iteration attempts to pipe the updated value of foo (which is a stream) into the response.
How do I get around this?
Are you asking for a general approach recommendation, or for support with the particular solution you have?
If it's for the first, then I think your best bet for communicating between the server and the client is through websockets, perhaps with something like Socket.io. A long lived connection will serve you well here, since it will take a long time to transmit all your data across.
Then you can send data from the server to the client any time you like. At that point you can read your data on the server as a node.js stream and emit the data one at a time.
The problem with using Oboe and writing to the response on every node is that it requires a long running response, and there's a high likelihood the connection could get interrupted before you've sent all the data across.
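On the write-after-end error from the update, two things seem likely to contribute: the async callbacks passed to map all start at once, so the last index can finish before earlier ones, and pipe() ends its destination by default when the source finishes. One way around both, sketched here with newline-delimited JSON in place of JsonStreamStringify (my substitution, not the asker's setup), is to process the folders sequentially and call end() exactly once. The fake response and fakeExtract are stand-ins so the sketch runs on its own.

```javascript
// Process folders one at a time, write each result as one JSON line,
// and end the response exactly once after the loop has finished.
async function streamResults(res, folders, extract) {
  for (const folder of folders) {
    const result = await extract(folder);
    res.write(JSON.stringify(result) + "\n");
  }
  res.end();
}

// Minimal stand-in for an Express response that records what was written.
function createFakeResponse() {
  const log = [];
  return {
    log,
    ended: false,
    write(chunk) {
      if (this.ended) throw new Error("write after end");
      log.push(chunk);
    },
    end() {
      this.ended = true;
    },
  };
}

// Hypothetical extractor: the first folder is the slowest, mimicking uneven archives.
const fakeExtract = (folder) =>
  new Promise((resolve) =>
    setTimeout(() => resolve({ name: folder }), folder === "a" ? 20 : 1)
  );

const res = createFakeResponse();
const finished = streamResults(res, ["a", "b", "c"], fakeExtract).then(() => {
  console.log(res.log.join(""));
  return res.log;
});
```

Because each line is a complete JSON document, the receiving side can split on newlines and parse one object at a time. The trade-off is that extraction is no longer concurrent.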

How to return a generated image with Bull.js queue?

My use case is this: I want to create screenshots of parts of a page. For technical reasons, it cannot be done on the client-side (see related question below) but needs puppeteer on the server.
As I'm running this on Heroku, I have the additional restriction of a quite small timeout window. Heroku therefore recommends implementing a queueing system based on bull.js and using worker processes for longer-running tasks, as explained here.
I have two endpoints (implemented with Express), one that receives a POST request with some configuration JSON, and another one that responds to GET when provided with a job identifier (slightly modified for brevity):
This adds the job to the queue:
router.post('/', async function(req, res, next) {
  let job = await workQueue.add(req.body.chartConfig)
  res.json({ id: job.id })
})
This returns info about the job
router.get('/:id', async(req, res) => {
  let id = req.params.id;
  let job = await workQueue.getJob(id);
  let state = await job.getState();
  let progress = job._progress;
  let reason = job.failedReason;
  res.json({ id, state, progress, reason });
})
In a different file:
const start = () => {
  let workQueue = new queue('work', REDIS_URL);
  workQueue.process(maxJobsPerWorker, getPNG)
}
const getPNG = async(job) => {
  const { url, width, height, chart: chartConfig, chartUrl } = job.data
  // ... snipped for brevity
  const png = await page.screenshot({
    type: 'png',
    fullPage: true
  })
  await page.close()
  job.progress(100)
  return Promise.resolve({ png })
}
// ...
throng({ count: workers, worker: start })
module.exports.getPNG = getPNG
The throng invocation at the end specifies the start function as the worker function to be called when picking a job from the queue. start itself specifies getPNG to be called when treating a job.
My question now is: how do I get the generated image (png)? I guess ideally I'd like to be able to call the GET endpoint above which would return the image, but I don't know how to pass the image object.
As a more complex fall-back solution I could imagine posting the image to an image hosting service like imgur, and then returning the URL upon request of the GET endpoint. But I'd prefer, if possible, to keep things simple.
This question is a follow-up from this one:
Issue with browser-side conversion SVG -> PNG
I've opened a ticket on the GitHub repository of the bull project. The developers said that the preferred practice is to store the binary object somewhere else, and to add only the link metadata to the job's data store.
However, they also said that the storage limit of a job object appears to be 512 MB, so it is also quite possible to store an image of a reasonable size as a base64-encoded string.
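A minimal sketch of the base64 route, assuming the worker's resolved value is what gets stored with the job. toJobResult and fromJobResult are hypothetical helper names, and the JSON round trip below only simulates the job data passing through Redis.

```javascript
// Worker side: turn the screenshot Buffer into a JSON-safe string.
function toJobResult(pngBuffer) {
  return { png: pngBuffer.toString("base64") };
}

// API side: turn the stored string back into bytes for the HTTP response.
function fromJobResult(result) {
  return Buffer.from(result.png, "base64");
}

// The 8-byte PNG signature stands in for a real screenshot.
const fakePng = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Simulate the trip through the job store, which only holds JSON.
const stored = JSON.parse(JSON.stringify(toJobResult(fakePng)));
const restored = fromJobResult(stored);
console.log(restored.equals(fakePng)); // true

// In the GET route (assumption: the resolved value is available as job.returnvalue):
//   res.set("Content-Type", "image/png").send(fromJobResult(job.returnvalue));
```

With this shape, getPNG would return toJobResult(png) instead of { png }, and the status endpoint could serve the image once the job state is completed.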

How to send File through Websocket along with additional info?

I'm developing a Web application to send images, videos, etc. to two monitors from an admin interface. I'm using ws in Node.js for the server side. I've implemented selecting images available on the server and external URLs and sending them to the clients, but I also wanted to be able to directly send images selected from the device with a file input. I managed to do it using base64 but I think it's pretty inefficient.
Currently I send a stringified JSON object containing the client to which the resource has to be sent, the kind of resource and the resource itself, parse it in the server and send it to the appropriate client. I know I can set the Websocket binaryType to blob and just send the File object, but then I'd have no way to tell the server which client it has to send it to. I tried using typeson and BSON to accomplish this, but it didn't work.
Are there any other ways to do it?
You can send raw binary data through the WebSocket.
It's quite easy to manage.
One option is to prepend a "magic byte" (an identifier that marks the message as non-JSON). For example, prepend binary messages with the B character.
All the server has to do is test the first character before collecting the binary data (if the magic byte isn't there, it's probably the normal JSON message).
A more serious implementation will attach a header after the magic byte (e.g., file name, total length, position of the data being sent, etc.).
This allows the upload to be resumed after a disconnection (send just the parts that weren't acknowledged as received).
Your server will need to split the data into magic byte, header and binary_data before processing, but that's easy enough to accomplish.
Hope this helps someone.
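The framing described above could be sketched roughly as follows. The layout (one magic byte, a 4-byte header length, a JSON header, then the payload) and the header fields target, name and offset are my own assumptions for illustration; the answer does not prescribe a format.

```javascript
const MAGIC = 0x42; // ASCII "B", marks a binary frame

// Layout: [magic byte][4-byte big-endian header length][JSON header][payload]
function packFrame(header, payload) {
  const headerBytes = Buffer.from(JSON.stringify(header), "utf8");
  const prefix = Buffer.alloc(5);
  prefix.writeUInt8(MAGIC, 0);
  prefix.writeUInt32BE(headerBytes.length, 1);
  return Buffer.concat([prefix, headerBytes, payload]);
}

// Returns null when the message does not start with the magic byte,
// in which case the caller can treat it as a regular JSON message.
function unpackFrame(frame) {
  if (frame.length < 5 || frame.readUInt8(0) !== MAGIC) return null;
  const headerLength = frame.readUInt32BE(1);
  const headerEnd = 5 + headerLength;
  const header = JSON.parse(frame.subarray(5, headerEnd).toString("utf8"));
  return { header, payload: frame.subarray(headerEnd) };
}

const frame = packFrame(
  { target: "monitor-1", name: "photo.png", offset: 0 },
  Buffer.from([1, 2, 3])
);
const parsed = unpackFrame(frame);
console.log(parsed.header.target, parsed.payload.length);
console.log(unpackFrame(Buffer.from('{"type":"text"}')));
```

Since a JSON object or array never starts with the letter B, the first byte is enough to tell the two kinds of message apart, and the target field in the header tells the server which client should receive the file.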
According to the socket.io documentation, you can send a string, a Buffer, or a mix of both.
On client side:
function uploadFile(e, socket, to) {
  let file = e.target.files[0];
  if (!file) {
    return
  }
  // 1 MB limit, matching the alert text below
  if (file.size > 1000000) {
    alert('File should be smaller than 1MB')
    return
  }
  var reader = new FileReader();
  var rawData = new ArrayBuffer();
  reader.onload = function (e) {
    rawData = e.target.result;
    socket.emit("send_message", {
      type: 'attachment',
      data: rawData
    }, (result) => {
      alert("Server has received file!")
    });
    alert("the File has been transferred.")
  }
  reader.readAsArrayBuffer(file);
}
On server side:
socket.on('send_message', async (data, cb) => {
  if (data.type === 'attachment') {
    console.log('Found binary data')
    cb("Received file successfully.")
    return
  }
  // Process other business...
});
I am using pure WebSocket without io, where you cannot mix content - either String or Binary. Then my working solution is like this:
CLIENT:
import { serialize } from 'bson';
import { Buffer } from 'buffer';
const reader = new FileReader();
let rawData = new ArrayBuffer();
ws = new WebSocket(...)
reader.onload = (e) => {
  rawData = e.target.result;
  const bufferData = Buffer.from(rawData);
  const bsonData = serialize({ // whatever js Object you need
    file: bufferData,
    route: 'TRANSFER',
    action: 'FILE_UPLOAD',
  });
  ws.send(bsonData);
}
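Then on the Node server side, the message is caught and parsed like this:
import { deserialize } from 'bson';
import fs from 'fs';
import path from 'path';

// promoteBuffers makes deserialize return Node Buffers instead of BSON Binary objects.
const dataFromClient = deserialize(wsMessage, { promoteBuffers: true }); // edited
fs.writeFile(
  path.join('../server', 'yourfiles', 'yourfile.txt'),
  dataFromClient.file, // edited
  'binary',
  (err) => {
    // The callback also runs on success, so only log when err is set.
    if (err) console.log('ERROR!!!!', err);
  }
);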
The killer is the promoteBuffers option in the deserialize function.
