I am working on creating a zip of multiple files on the server and streaming it to the client while it is being created. Initially I was using ArchiverJS. It worked fine when I appended buffers to it, but it failed when I needed to add streams. After a discussion on GitHub I switched to Node zip-stream, which started working fine, thanks to jntesteves. But when I deploy the code on GKE (Kubernetes), I start getting "Network Failed" errors for huge files.
Here is my sample code:
const ZipStream = require("zip-stream");
const https = require("https");
const http = require("http");
const request = require("request");

/**
 * @summary Adds a readable stream provided by the https module into zipStreamer using the entry method
 */
const handleEntryCB = ({ readableStream, zipStreamer, fileName, resolve }) => {
  readableStream.on("error", error => {
    console.error("Error while listening readableStream : ", error);
    resolve("done");
  });
  zipStreamer.entry(readableStream, { name: fileName }, error => {
    if (!error) {
      resolve("done");
    } else {
      console.error("Error while listening zipStream readableStream : ", error);
      resolve("done");
    }
  });
};

/**
 * @summary Handles downloading of files using the native https, http and request modules
 */
const handleUrl = ({ elem, zipStreamer }) => {
  return new Promise((resolve, reject) => {
    let fileName = elem.fileName;
    const url = elem.url;
    //Used in most of the cases
    if (url.startsWith("https")) {
      https.get(url, readableStream => {
        handleEntryCB({ readableStream, zipStreamer, url, fileName, resolve, reject });
      });
    } else if (url.startsWith("http")) {
      http.get(url, readableStream => {
        handleEntryCB({ readableStream, zipStreamer, url, fileName, resolve, reject });
      });
    } else {
      const readableStream = request(url);
      handleEntryCB({ readableStream, zipStreamer, url, fileName, resolve, reject });
    }
  });
};

const downloadZipFile = async (data, resp) => {
  let { urls = [] } = data || {};
  if (!urls.length) {
    throw new Error("URLs are mandatory.");
  }
  //Output zip name
  const outputFileName = `Test items.zip`;
  console.log("Downloading using streams.");
  //Initialize zip-stream instance
  const zipStreamer = new ZipStream();
  //Set headers on the response
  resp.writeHead(200, {
    "Content-Type": "application/zip",
    "Content-Disposition": `attachment; filename="${outputFileName}"`,
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS"
  });
  //Pipe zipStreamer to the resp so that the client starts getting the response
  //as soon as the first chunk is added to the zipStreamer
  zipStreamer.pipe(resp);
  for (const elem of urls) {
    await handleUrl({ elem, zipStreamer });
  }
  zipStreamer.finish();
};

app.post(restPrefix + "/downloadFIle", async (req, resp) => {
  try {
    const { data } = req.body || {};
    //Await the async work so the catch below can actually see its errors
    await downloadZipFile(data, resp);
  } catch (error) {
    console.error("[FileBundler] unknown error : ", error);
    if (resp.headersSent) {
      resp.end("Unknown error while archiving.");
    } else {
      resp.status(500).end("Unknown error while archiving.");
    }
  }
});
I tested with 7-8 files of ~4.5 GB each locally and it works fine, but when I tried the same on Google Kubernetes Engine I got a "Network Failed" error.
After some more research, I increased the server timeout on k8s to 3000 seconds; then it started working fine, but I suppose increasing the timeout is not a good solution.
Is there anything I am missing at the code level, or can you suggest a good GKE deployment configuration for a server that serves large file downloads to many concurrent users?
I have been stuck on this for the past 1.5+ months. Please help!
Edit 1: I edited the timeout in the ingress, i.e. Network services -> Load Balancing -> edit the timeout in the backend service.
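If raising the load-balancer timeout alone feels wrong, the other knobs I can think of are on the Node server itself. This is only a sketch under the assumption that your Express app is started with app.listen; PORT and the exact values are placeholders, not recommendations:
// Hedged sketch: relax Node's own HTTP timeouts so very large downloads are
// not cut off by the server itself.
const server = app.listen(PORT);
server.setTimeout(0);                 // disable the per-socket inactivity timeout
server.requestTimeout = 0;            // Node 14.11+: do not abort long-running requests
server.keepAliveTimeout = 620 * 1000; // keep this above the load balancer's backend timeout
It is also worth attaching "error" listeners to zipStreamer and resp inside downloadZipFile, so a failed upstream download shows up in the logs instead of looking like a plain network failure on the client.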
I have the following Vite configuration:
import { defineConfig } from "vite";
const zlib = require("zlib");

export default defineConfig(() => {
  return {
    server: {
      proxy: {
        "/start": {
          target: "https://someremoteurl.com",
          secure: false,
          changeOrigin: true,
          configure: (proxy) => {
            proxy.on("proxyRes", (proxyRes, req, res) => {
              const chunks = [];
              proxyRes.on("data", (chunk) => chunks.push(chunk));
              proxyRes.on("end", () => {
                const buffer = Buffer.concat(chunks);
                const encoding = proxyRes.headers["content-encoding"];
                if (encoding === "gzip" || encoding === "deflate") {
                  zlib.unzip(buffer, (err, buffer) => {
                    if (!err) {
                      let remoteBody = buffer.toString();
                      const modifiedBody = remoteBody.replace() // do some string manipulation on remoteBody
                      res.write(modifiedBody);
                      res.end();
                    } else {
                      console.error(err);
                    }
                  });
                }
              });
            });
          },
        },
      },
    },
  };
});
Everything works as expected and modifiedBody has the needed shape.
However, the dev server doesn't return the modified response; it returns the initial HTML that the "https://someremoteurl.com" URL served.
With the following code the response is "correctly" changed:
proxyRes.on("end", () => {
res.end('<h1>Some Test HTML</h1>')
});
But this wouldn't work for me, as I need to read the response first, unzip it, modify it, and only then send it back.
To me it looks like the proxied response is streamed, but the dev server doesn't wait for the stream to finish, run the transformations, and only then serve the desired document.
Any idea how I can achieve the desired result?
As Vite uses the node-http-proxy lib under the hood, I had to look for the answer in their documentation. I found that the selfHandleResponse option needs to be true in order to serve your modified response.
Setting that option solved my problem.
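For completeness, here is a rough sketch of where that option goes in the config from the question (Vite passes these proxy options through to node-http-proxy). With selfHandleResponse enabled, the proxyRes handler is responsible for writing and ending the response itself, which the code above already does:
import { defineConfig } from "vite";

// Sketch: same proxy entry as above, with selfHandleResponse added.
export default defineConfig(() => ({
  server: {
    proxy: {
      "/start": {
        target: "https://someremoteurl.com",
        secure: false,
        changeOrigin: true,
        selfHandleResponse: true, // we write/end the response in the proxyRes listener
        configure: (proxy) => {
          proxy.on("proxyRes", (proxyRes, req, res) => {
            // ...same buffering / unzip / modify logic as in the question
          });
        },
      },
    },
  },
}));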
I created an adapter-node SvelteKit API endpoint which streams quotes using a readable stream. When I quit the client route, the streaming has to stop. This works fine in development using SvelteKit "npm run dev" (vite dev) or using a Windows desktop container (node build).
onDestroy(async () => {
  await reader.cancel(); // stop streaming
  controller.abort(); // signal fetch abort
});
But when I build and deploy the node container on Google Cloud Run, the streaming works fine except when I quit the client route: the API endpoint keeps on streaming. The log shows enqueues for 5 more minutes, followed by a delayed ReadableStream cancel() on the API server.
Why is there this 5-minute gap between the client cancel/abort and the cancel on the server?
The API +server.js
import { YahooFinanceTicker } from "yahoo-finance-ticker";

/** @type {import('./$types').RequestHandler} */
export async function POST({ request }) {
  const { logging, symbols } = await request.json();
  const controller = new AbortController();
  const ticker = new YahooFinanceTicker();
  ticker.setLogging(logging);
  if (logging) console.log("api ticker", symbols);
  const stream = new ReadableStream({
    start(controller) {
      (async () => {
        const tickerListener = await ticker.subscribe(symbols);
        tickerListener.on("ticker", (quote) => {
          if (logging) console.log("api", JSON.stringify(quote, ["id", "price", "changePercent"]));
          controller.enqueue(JSON.stringify(quote, ["id", "price", "changePercent"]));
        });
      })().catch(err => console.error(`api listen exception: ${err}`));
    },
    cancel() { // arrives after 5 minutes !!!
      console.log("api", "cancel: unsubscribe ticker and abort");
      ticker.unsubscribe();
      controller.abort();
    },
  });
  return new Response(stream, {
    headers: {
      'content-type': 'text/event-stream',
    }
  });
}
Route +page.svelte
const controller = new AbortController();
let reader = null;
const signal = controller.signal;

async function streaming(params) {
  try {
    const response = await fetch("/api/yahoo-finance-ticker", {
      method: "POST",
      body: JSON.stringify(params),
      headers: {
        "content-type": "application/json",
      },
      signal: signal,
    });
    const stream = response.body.pipeThrough(new TextDecoderStream("utf-8"));
    reader = stream.getReader();
    while (true) {
      const { value, done } = await reader.read();
      if (logging) console.log("resp", done, value);
      if (done) break;
      // ... and more to get the quotes
    }
  } catch (err) {
    if (!["AbortError"].includes(err.name)) throw err;
  }
}
...
The behavior you are observing is expected; Cloud Run does not support client-side disconnects yet.
It is mentioned in this article that:
Cloud Run (fully managed) currently only supports server-side streaming. Having only "server-side streaming" basically means when the "client" disconnects, "server" will not know about it and will carry on with the request. This happens because "server" is not connected directly to the "client" and the request from the "client" is buffered (in its entirety) and then sent to the "server".
You can also check this similar thread.
This is a known issue; there is already a public issue tracking it. You can follow that issue for future updates and also add your concerns there.
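Since the disconnect never reaches the container, one defensive workaround (my own suggestion, not something Cloud Run or SvelteKit documents) is to cap how long a single stream may run on the server, so the ticker gets unsubscribed even when cancel() arrives late or not at all. A minimal sketch of the start/cancel part, reusing ticker and symbols from the handler above; MAX_STREAM_MS is a made-up knob:
// Hedged sketch: server-side safety cap because the client abort is not propagated.
const MAX_STREAM_MS = 5 * 60 * 1000; // assumption: pick a limit that fits your use case

const stream = new ReadableStream({
  async start(controller) {
    const tickerListener = await ticker.subscribe(symbols);
    const timer = setTimeout(() => {
      ticker.unsubscribe();
      try { controller.close(); } catch { /* stream already closed */ }
    }, MAX_STREAM_MS);

    tickerListener.on("ticker", (quote) => {
      try {
        controller.enqueue(JSON.stringify(quote, ["id", "price", "changePercent"]));
      } catch {
        // enqueue throws once the stream is closed, so stop the ticker here too
        clearTimeout(timer);
        ticker.unsubscribe();
      }
    });
  },
  cancel() {
    ticker.unsubscribe();
  },
});
The client would then have to re-open the stream if it still needs quotes after the cap expires.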
When using @opentelemetry/plugin-https and the aws-sdk together in a Node.js application, the OpenTelemetry plugin adds the traceparent header to each AWS request. This works fine if there is no need for retries in the aws-sdk. When the aws-sdk retries a request, the following errors can occur:
InvalidSignatureException: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.
SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.
The first AWS request contains the following headers:
traceparent: '00-32c9b7adee1da37fad593ee38e9e479b-875169606368a166-01'
Authorization: 'AWS4-HMAC-SHA256 Credential=<credential>, SignedHeaders=host;x-amz-content-sha256;x-amz-date;x-amz-security-token;x-amz-target, Signature=<signature>'
Note that the SignedHeaders doesn't include traceparent.
The retried request contains the following headers:
traceparent: '00-c573e391a455a207469ffa4fb75b3cab-6f20c315628cfcc0-01'
Authorization: AWS4-HMAC-SHA256 Credential=<credential>, SignedHeaders=host;traceparent;x-amz-content-sha256;x-amz-date;x-amz-security-token;x-amz-target, Signature=<signature>
Note that the SignedHeaders does include traceparent.
Before the retried request is sent, @opentelemetry/plugin-https sets a new traceparent header, and this makes the signature of the AWS request invalid.
Here is code which reproduces the issue (you may need to run the script a few times before hitting the rate limit which causes the retries):
const opentelemetry = require("@opentelemetry/api");
const { NodeTracerProvider } = require("@opentelemetry/node");
const { SimpleSpanProcessor } = require("@opentelemetry/tracing");
const { JaegerExporter } = require("@opentelemetry/exporter-jaeger");

const provider = new NodeTracerProvider({
  plugins: {
    https: {
      enabled: true,
      path: "@opentelemetry/plugin-https"
    }
  }
});
const exporter = new JaegerExporter({ serviceName: "test" });
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register();

const AWS = require("aws-sdk");

const main = async () => {
  const cwl = new AWS.CloudWatchLogs({ region: "us-east-1" });
  const promises = new Array(100).fill(true).map(() => new Promise((resolve, reject) => {
    cwl.describeLogGroups(function (err, data) {
      if (err) {
        console.log(err.name);
        console.log("Got error:", err.message);
        console.log("ERROR Request Authorization:");
        console.log(this.request.httpRequest.headers.Authorization);
        console.log("ERROR Request traceparent:");
        console.log(this.request.httpRequest.headers.traceparent);
        console.log("Retry count:", this.retryCount);
        reject(err);
        return;
      }
      resolve(data);
    });
  }));
  const result = await Promise.all(promises);
  console.log(result.length);
};

main().catch(console.error);
Possible solutions:
1. Ignore all calls to AWS in @opentelemetry/plugin-https. Ignoring the calls to AWS will lead to losing all spans for AWS requests.
2. Add the traceparent header to the unsignableHeaders in the aws-sdk: AWS.Signers.V4.prototype.unsignableHeaders.push("traceparent"); (a full sketch of this follows below). Changing the prototype seems like a hack and also doesn't handle the case where another node module uses a different version of the aws-sdk.
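For reference, option 2 as a complete snippet would look roughly like this; it has to run once at startup, before any AWS request is made, and it reaches into aws-sdk v2 internals (AWS.Signers.V4 is not a public API), so treat it as a sketch of the workaround described above rather than a supported fix:
const AWS = require("aws-sdk");

// Workaround sketch: mark traceparent as unsignable so a retried request is
// re-signed without it and the signature stays valid. Internal API, may break
// between aws-sdk versions.
AWS.Signers.V4.prototype.unsignableHeaders.push("traceparent");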
Is there another solution which could allow me to keep the spans for AWS requests and guarantee that the signature of all AWS requests will be correct?
Update (16.12.2020):
The issue seems to be fixed in AWS SDK v3.
The following code throws the correct error (ThrottlingException):
const opentelemetry = require("@opentelemetry/api");
const { NodeTracerProvider } = require("@opentelemetry/node");
const { SimpleSpanProcessor } = require("@opentelemetry/tracing");
const { JaegerExporter } = require("@opentelemetry/exporter-jaeger");
const { CloudWatchLogs } = require("@aws-sdk/client-cloudwatch-logs");

const provider = new NodeTracerProvider({
  plugins: {
    https: {
      enabled: true,
      path: "@opentelemetry/plugin-https"
    }
  }
});
const exporter = new JaegerExporter({ serviceName: "test" });
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register();

const main = async () => {
  const cwl = new CloudWatchLogs({ region: "us-east-1" });
  const promises = new Array(100).fill(true).map(() => new Promise((resolve, reject) => {
    cwl.describeLogGroups({ limit: 50 })
      .then(resolve)
      .catch((err) => {
        console.log(err.name);
        console.log("Got error:", err.message);
        reject(err);
      });
  }));
  const result = await Promise.all(promises);
  console.log(result.length);
};

main().catch(console.error);
I am trying to call a REST API from a Firebase function which serves as fulfillment for Actions on Google.
I tried the following approach:
const { dialogflow } = require('actions-on-google');
const functions = require('firebase-functions');
const http = require('https');
const host = 'wwws.example.com';
const app = dialogflow({debug: true});

app.intent('my_intent_1', (conv, {param1}) => {
  // Call the rate API
  callApi(param1).then((output) => {
    console.log(output);
    conv.close(`I found ${output.length} items!`);
  }).catch(() => {
    conv.close('Error occurred while trying to get vehicles. Please try again later.');
  });
});

function callApi (param1) {
  return new Promise((resolve, reject) => {
    // Create the path for the HTTP request to get the vehicle
    let path = '/api/' + encodeURIComponent(param1);
    console.log('API Request: ' + host + path);
    // Make the HTTP request to get the vehicle
    http.get({host: host, path: path}, (res) => {
      let body = ''; // var to store the response chunks
      res.on('data', (d) => { body += d; }); // store each response chunk
      res.on('end', () => {
        // After all the data has been received parse the JSON for desired data
        let response = JSON.parse(body);
        let output = {};
        //copy required response attributes to output here
        console.log(response.length.toString());
        resolve(output);
      });
      res.on('error', (error) => {
        console.log(`Error calling the API: ${error}`)
        reject();
      });
    }); //http.get
  }); //promise
}

exports.myFunction = functions.https.onRequest(app);
This is almost working. The API is called and I get the data back. The problem is that without async/await, the function does not wait for "callApi" to complete, and I get an error from Actions on Google that there was no response. After the error, I can see the console.log outputs in the Firebase log, so everything is working; it is just out of sync.
I tried using async/await but got an error, which I think is because Firebase uses an old version of Node.js which does not support async.
How can I get around this?
Your function callApi returns a promise, but you don't return a promise in your intent handler. You should make sure you add the return so that the handler knows to wait for the response.
app.intent('my_intent_1', (conv, {param1}) => {
  // Call the rate API
  return callApi(param1).then((output) => {
    console.log(output);
    conv.close(`I found ${output.length} items!`);
  }).catch(() => {
    conv.close('Error occurred while trying to get vehicles. Please try again later.');
  });
});
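For what it's worth, if the Functions runtime is on Node 8 or later (where async/await is supported), the same handler can also be written with await; this is just an equivalent rewrite of the snippet above, not a different fix:
app.intent('my_intent_1', async (conv, {param1}) => {
  try {
    // Awaiting here has the same effect as returning the promise
    const output = await callApi(param1);
    console.log(output);
    conv.close(`I found ${output.length} items!`);
  } catch (err) {
    conv.close('Error occurred while trying to get vehicles. Please try again later.');
  }
});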
I'm trying to send a POST request to my API but it gives me the following error:
I'm using the express-http-proxy package for Node to proxy the request. Here is my code:
app.use('/api', proxy(targetUrl, {
  forwardPath: (req, res) => {
    return require('url').parse(req.url).path;
  }
}));
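As a side note (an observation on my part, unrelated to the error itself): in recent versions of express-http-proxy the forwardPath option was replaced by proxyReqPathResolver, so on a 1.x release the equivalent mount would look roughly like this:
const proxy = require('express-http-proxy');

// Sketch: same path mapping as above, using the newer option name.
app.use('/api', proxy(targetUrl, {
  proxyReqPathResolver: (req) => require('url').parse(req.url).path
}));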
In my ReactJS application, I'm using superagent to send my requests, and here is the request-creation code:
methods.forEach((method) =>
  this[method] = (path, data = {}, params = {}) => new Promise((resolve, reject) => {
    const request = superagent[method](formatUrl(path));
    request.set('Token', 'cb460084804cd40');
    if (params) {
      request.query(params);
    }
    if (__SERVER__ && req.get('cookie')) {
      request.set('cookie', req.get('cookie'));
    }
    if (data) {
      request.send(data);
    }
    console.log('sending: ', request);
    // request.end((err, { text } = {}) => {console.log('ended: ', text, err);});
    // reject();
    request.end((err, { text } = {}) => err ? reject(text || err) : resolve(text));
  }));
The formatUrl function:
function formatUrl(path) {
  const adjustedPath = path[0] !== '/' ? '/' + path : path;
  if (__SERVER__) {
    return 'https://' + config.apiHost + adjustedPath;
  }
  return '/api' + adjustedPath;
}
In addition to this, when I send a GET request, it gives me the following error:
Also, when I send a POST request to my live API server it gives a 404 Not Found error, but if I send the same POST request to my localhost API server it gives a 504 Gateway Timeout error.
I don't understand this behaviour, so I need your help to find the problem. Thanks for your time in advance.