Spikes in execution time for cloud functions? - node.js

I have a pretty dead simple cloud function that writes a single value to my real-time database. The code is at the bottom of this post.
Watching the logs, I'm finding that the execution time is highly inconsistent. Here's a screenshot:
You can see that it's as low as 3ms (great!) and as high as 579ms (very bad; I've seen it reach 1000ms). The result is very noticeable delays in my chatroom implementation, with messages sometimes being appended out of order from how they were sent (i.e. "1" "2" "3" is received as "2" "3" "1").
Why might execution time vary this wildly? Cold start vs warm start doesn't seem to apply since you can see these calls happened directly one after the other. I also can't find any documented limits on writes/sec for real-time db, unlike the 1 write/sec limit on firestore documents.
Here's the code:
import * as functions from 'firebase-functions';
import * as admin from 'firebase-admin';

admin.initializeApp();
const messagesRef = admin.database().ref('/messages/general');

export const sendMessageToChannel = functions.https.onCall(async (data, context) => {
  if (!context.auth) {
    throw new functions.https.HttpsError(
      'failed-precondition',
      'User must be logged-in.'
    );
  }
  try {
    await messagesRef.push({
      uid: context.auth.uid,
      displayName: data.displayName,
      body: data.body
    });
  } catch (error) {
    throw new functions.https.HttpsError('aborted', error);
  }
});
Edit: I've seen this similar question from two years ago, where the responder indicates that the tasks themselves have variable execution time.
Is that the case here? Does the real-time database have wildly variable write times (varying by ~330x, from 3ms to 1000ms!)?

That's quite hard to control from the code alone.
You have a lot of steps going on there:
verifying the user's authentication
sending the message to the collection
trying to catch any possible errors
So you can't rely on response time alone to establish the message order.
Instead, you should set a server-side timestamp from the client side and use it to trace the ordering.
You can achieve this with a piece of code like the following:
try {
  message.createdAt = firebase.firestore.FieldValue.serverTimestamp(); // server-side timestamp
  ... // calls to functions
} catch (err) {
  console.log("Couldn't set timestamp or send to functions");
}
This way you set a server-side timestamp on each message before it is sent to be saved, so your users can see when a message was registered (the timestamp), saved (the function call) and confirmed (the 200 response).
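Since the question's code writes to the Realtime Database rather than Firestore, the equivalent idea there is admin.database.ServerValue.TIMESTAMP, which the server resolves at write time. A minimal sketch under that assumption (appendMessage is a placeholder for your own rendering code):

// Cloud Function side: attach a server-side timestamp to each pushed message.
await messagesRef.push({
  uid: context.auth.uid,
  displayName: data.displayName,
  body: data.body,
  createdAt: admin.database.ServerValue.TIMESTAMP // resolved by the server, not the client
});

// Client side: render messages ordered by that timestamp rather than by arrival order.
firebase.database().ref('/messages/general')
  .orderByChild('createdAt')
  .limitToLast(50)
  .on('child_added', snapshot => appendMessage(snapshot.val()));

Ordering by the server timestamp makes the display order independent of how long each function invocation happens to take.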

Related

Node.js/NodeMailer/Express/Outlook smtp host - Concurrent connections limit exceeded

Hope you are well. I am in the middle of working on an application that uses Express and Nodemailer. My application sends emails successfully, but the issue is that I cannot send the emails off one at a time in the manner that I'd like. I do not want to put an array of addresses into the 'to' field; I'd like to send each e-mail out individually.
I have succeeded in this, however there is an issue. It seems Microsoft has written some kind of limit that prevents applications from having more than a certain number of connections at a time (link with an explanation at the end of this post).
I have tried to get around this by a number of expedients, not all of which I'll trouble you with. The majority of them involve setInterval() and either map or forEach. I do not intend to send all that many e-mails, certainly not enough to flirt with any kind of standard limit. I do not even want any HTML in my emails, just plain text. When my application sends out 2-3 e-mails, however, I encounter their error message (response code 432).
Below you will see my code.
As you can see, I'm at the point where I've even been willing to try adding my incrementer into setInterval, as if changing the interval the e-mails fire at will actually help.
Right now, this is sending out some e-mails, but I eventually hit that block, usually around 2-3 e-mails in. It is strangely inconsistent, however.
This is the first relevant section of my code.
db.query(sqlEmailGetQuery, param)
  .then(result => {
    handleEmail(result, response);
  }).catch(error => {
    console.error(error);
    response.status(500).json({ error: 'an unexpected error occured.' });
  });
});
This is the second section of it.
function handleEmail(result, response) {
  const email = result.rows[0];
  let i = 0;
  email.json_agg.map(contact => {
    const msg = {
      from: process.env.EMAIL_USER,
      to: email.json_agg[i].email,
      subject: email.subject,
      text: email.emailBody + ' ' + i
    };
    i++;
    return new Promise((resolve, reject) => {
      setInterval(() => {
        transporter.sendMail(msg, function (error, info) {
          if (error) {
            return console.log(error);
          } else {
            response.status(200).json(msg);
            transporter.close();
          }
        });
      }, 5000 + i);
    });
  });
}
I originally tried a simple for loop over the contacts iterable (email.json_agg[i].email), but obviously as soon as I hit the connection limit this stopped working.
I have come onto Stack Overflow and reviewed questions that are similar in nature. For example, this question was close to being similar, but that asker has over 8000 connections, and if you read the rule I posted by Microsoft below, they implemented the connection rule after he made that post.
I have tried setInterval with forEach and an await as it connects with each promise, but as this was not the source of the issue, this did not work either.
I have tried code similar to what you see above, except with the interval set to as long as 20 seconds.
As my understanding of the issue has grown, I can see that I either have to figure out a way to wait long enough between e-mails without the connection timing out, or break off the connection every time I send an e-mail so that the next e-mail gets a fresh connection. It seems to me that if the latter were possible, though, everyone would be doing it and violating Microsoft's policy.
Is there a way for me to get around this issue and send, say, 3 emails every 3 seconds, then wait and send another three? The volume of e-mails is such that I can wait ten seconds if necessary. Is there a different SMTP host that is less restrictive?
Please let me know your thoughts. My transport config is below if that helps.
const transporter = nodemailer.createTransport({
  pool: true,
  host: 'smtp-mail.outlook.com',
  secureConnection: false,
  maxConnections: 1,
  port: 587,
  secure: false,
  tls: { ciphers: 'SSLv3' },
  auth: {
    user: process.env.EMAIL_USER,
    pass: process.env.EMAIL_PASS
  }
});
https://learn.microsoft.com/en-us/exchange/troubleshoot/send-emails/smtp-submission-improvements#new-throttling-limit-for-concurrent-connections-that-submitmessages
First off, the most efficient way to send the same email to lots of users is to send it to yourself and BCC all the recipients. This lets you send a single email to the SMTP server, which then distributes it to all the recipients without any recipient being able to see the address of any other recipient.
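If that approach fits your use case, the Nodemailer message object would look roughly like the sketch below; the recipient list fields are taken from the question's json_agg rows and are otherwise assumptions:

const msg = {
  from: process.env.EMAIL_USER,
  to: process.env.EMAIL_USER,                        // send the message to yourself
  bcc: email.json_agg.map(contact => contact.email), // all real recipients go in BCC
  subject: email.subject,
  text: email.emailBody
};

await transporter.sendMail(msg); // one SMTP submission, so only one connection is used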
Second, you cannot use timers to reliably control how many requests are running at once, because timers are not connected to how long a given request takes to complete. A timer is just a guess at the average time for a request; it may work in some conditions and fail when things are responding more slowly. Instead, you have to actually use the completion of one request to know it's OK to send the next.
If you still want to send separate emails and send your emails serially, one after the other to avoid having too many in process at a time, you can do something like this:
async function handleEmail(result) {
  const email = result.rows[0];
  for (let [i, contact] of email.json_agg.entries()) {
    const msg = {
      from: process.env.EMAIL_USER,
      to: contact.email,
      subject: email.subject,
      text: email.emailBody + ' ' + i
    };
    await transporter.sendMail(msg);
  }
}
If you don't pass transporter.sendMail() the callback, then it will return a promise that you can use directly, with no need to wrap it in your own promise.
Note that this code does not send a response to your HTTP request, as that should be the responsibility of the calling code. Your previous code was trying to send a response for each of the emails when you can only send one response, and it was not sending any response at all if there was an error.
This code relies on the returned promise to communicate back to the caller whether it was successful or encountered an error, and the caller can then decide what to do in each case.
You also probably shouldn't pass result to this function; just pass email instead, since there's no reason for this code to know it has to reach into a database query result to get the value it needs. That should be the responsibility of the caller. Then this function is much more generic.
If, instead of sending one email at a time, you want to send N emails at a time, you can use something like mapConcurrent() to do that. It iterates an array and keeps a maximum of N requests in flight at the same time; a sketch of such a helper follows.
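mapConcurrent() is not a built-in; it refers to a small helper you would write yourself. The implementation below is a minimal sketch of that idea, not code from the original answer:

// Run fn(item, index) over items, keeping at most maxInFlight calls pending at once.
function mapConcurrent(items, maxInFlight, fn) {
  return new Promise((resolve, reject) => {
    const results = new Array(items.length);
    let index = 0;      // next item to start
    let inFlight = 0;   // how many calls are currently running
    let completed = 0;  // how many calls have finished

    function runNext() {
      while (inFlight < maxInFlight && index < items.length) {
        const i = index++;
        inFlight++;
        Promise.resolve(fn(items[i], i)).then(result => {
          results[i] = result;
          inFlight--;
          completed++;
          if (completed === items.length) {
            resolve(results);
          } else {
            runNext();
          }
        }, reject);
      }
    }

    if (items.length === 0) {
      resolve(results);
    } else {
      runNext();
    }
  });
}

Calling mapConcurrent(email.json_agg, 3, contact => transporter.sendMail(...)) would then keep at most three sendMail calls in flight at any time.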

Slow promise chain

I'm fairly new to Node.js and Promises in general, although I think I get the gist of how they are supposed to work (I've been forced to use ES5 for a looooong time). I also have little in-depth knowledge of Cloud Functions (GCF), though again I understand at a high level how they work.
I'm using GCF for part of my app, which is meant to receive an HTTP request, translate it and send it onward to another endpoint. I need to use Promises, as there are occasions when the originating HTTP request has multiple 'messages' sent at once.
So, my function works in regards to making sure messages are sent in the correct order, but the subsequent messages are sent onward very slowly (the logs suggest it's around a 20 second difference in terms of actually being sent onward).
I'm not entirely sure why that is happening; I would've expected it to be less than a couple of seconds' difference. Maybe it's something to do with GCF and not my code? Or maybe it is my code? Either way, I'd like to know if there's something I can do to speed it up, especially since it's supposed to send the message onward to a user in Google Chat.
(Before anyone comments on why it's request.body.body, I don't have control over the format of the incoming request.)
exports.onward = (request, response) => {
  response.status(200).send();
  let bodyArr = request.body.body;
  //Chain Promises over multiple messages sent at once stored in bodyArr
  bodyArr.reduce(async (previous, next) => {
    await previous;
    return process(next);
  }, Promise.resolve());
};
function process(body) {
  return new Promise((resolve, reject) => {
    //Obtain JWT from Google
    let jwtClient = new google.auth.JWT(
      privatekey.client_email,
      null,
      privatekey.private_key,
      ['https://www.googleapis.com/auth/chat.bot']
    );
    //Authorise JWT, reject promise or continue as appropriate
    jwtClient.authorize((err, tokens) => {
      if (err) {
        console.error('Google OAuth failure ' + err);
        reject();
      } else {
        let payload = copyPayload();
        setValues(payload, body); //Other function which sets payload values
        axios.post(url, payload, {
          headers: {
            'Content-Type': 'application/json',
            'Accept': 'application/json',
            'Authorization': 'Bearer ' + tokens.access_token
          },
        })
        .then(response => {
          //HTTP 2xx response received
          resolve();
        })
        .catch(error => {
          //Something bad happened
          reject();
        });
      }
    });
  });
}
EDIT: After testing the same thing again, it's gone down a bit, to around a 3-6 second delay between promises. Given that the code didn't change, I suspect it's something to do with GCF?
By doing
exports.onward = (request, response) => {
  response.status(200).send();
  let bodyArr = request.body.body;
  // Any work
};
you are incorrectly managing the life cycle of your Cloud Function: as a matter of fact, by doing response.status(200).send(); you are indicating to the Cloud Functions platform that your function has successfully reached its terminating condition or state and that, consequently, the platform can shut it down. See here in the docs for more explanation.
Since you send this signal at the beginning of your Cloud Function, the platform may shut the function down before the asynchronous work is finished.
In addition, you are potentially generating some "erratic" behavior of the Cloud Function that makes it difficult to debug: sometimes the function is terminated before the asynchronous work is completed, for the reason explained above, but at other times the platform does not terminate it immediately and the asynchronous work has the chance to complete before the function is terminated.
So, you should send the response after all the work is completed.
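For the simplest case (forward every message, then respond once), the handler could be reworked roughly as follows; this is a sketch built on the question's own reduce chain, not code taken from the original answer:

exports.onward = async (request, response) => {
  let bodyArr = request.body.body;
  try {
    // Process each message in order, waiting for the previous one to finish.
    await bodyArr.reduce(async (previous, next) => {
      await previous;
      return process(next);
    }, Promise.resolve());
    // Only signal completion once all the onward calls have finished.
    response.status(200).send();
  } catch (err) {
    console.error(err);
    response.status(500).send();
  }
};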
If you want to immediately acknowledge to the user that the work has been started, without waiting for this work to be completed, you should use Pub/Sub: in your main Cloud Function, delegate the work to a Pub/Sub-triggered Cloud Function and then return the response.
If you want to acknowledge the user when the work is completed (i.e. when the Pub/Sub triggered Cloud Function is completed), there are several options: Send a notification, an email or write to a Firestore document that you watch from your app.

Do I have a local Firestore database?

I want to understand what kind of Firestore database is installed on my box.
The code is running with Node.js 9.
If I remove the internet for X minutes and put it back, I can see all the cached transactions going to Firestore (adds, updates, deletes).
If I add the firebase.firestore().enablePersistence() line after firebase.initializeApp(fbconfig), I get this error:
Error enabling offline persistence. Falling back to persistence
disabled: FirebaseError: [code=unimplemented]: This platform is either
missing IndexedDB or is known to have an incomplete implementation.
Offline persistence has been disabled.
Now, my question is: if I don't have persistence enabled, or can't have it, how come when I disconnect my device from the internet I still see internal transactions going on? Am I really seeing it the right way?
To me, the fact that I don't see the console.log() inside the then() of batch.commit or transaction.update right away (only once the internet comes back) suggests that I have some kind of internal database persistence, don't you think?
Thanks in advance for your help.
UPDATE
When sendUpdate is called, it looks like the batch.commit is executed, because I can see something going on in listenMyDocs(), but the console.log "Commit successfully!" is not shown until the internet comes back.
function sendUpdate(response) {
  const db = firebase.firestore();
  let batch = db.batch();
  let ref = db.collection('my-collection')
    .doc('my-doc')
    .collection('my-doc-collection')
    .doc('my-new-doc');
  batch.update(ref, { "variable": response.state });
  batch.commit().then(() => {
    console.log("Commit successfully!");
  }).catch((error) => {
    console.error("Commit error: ", error);
  });
}
function listenMyDocs() {
  const firebase = connector.getFirebase();
  const db = firebase.firestore()
    .collection('my-collection')
    .doc('my-doc')
    .collection('my-doc-collection');
  const query = db.where('var1', '==', "true")
    .where('var2', '==', false);
  query.onSnapshot(snapshot => {
    snapshot.docChanges().forEach(change => {
      if (change.type === 'added') {
        console.log('ADDED');
      }
      if (change.type === 'modified') {
        console.log('MODIFIED');
      }
      if (change.type === 'removed') {
        console.log('DELETED');
      }
    });
  });
}
the console.log "Commit successfully!" is not shown until the internet comes back
This is the expected behavior. Completion listeners fire once the data is committed on the server.
Local events may fire before completion, in an effort to allow your UI to update optimistically. If the server changes the behavior that the client raised events for (for example, if the server rejects a write), the client will fire reconciliatory events (so if an add was rejected, it will fire a change.type === 'removed' event once that is detected).
I am not entirely sure if this applies to batch updates though, and it might be tricky to test that from a Node.js script as those usually bypass the security rules.
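If you want to see explicitly whether an event came from the local cache or from the server, the snapshot metadata exposes this. Here is a minimal sketch built on the question's listener; includeMetadataChanges and hasPendingWrites are part of the Firestore client SDK:

query.onSnapshot({ includeMetadataChanges: true }, snapshot => {
  snapshot.docChanges().forEach(change => {
    // true while the write only exists in the local cache, false once the server has acknowledged it
    const pending = change.doc.metadata.hasPendingWrites;
    console.log(change.type, pending ? '(local, not yet committed)' : '(confirmed by the server)');
  });
});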

Firebase Cloud Function, getting a 304 error

I have a Firebase Cloud Function that resets a number under every user's UID back to 0 every day. I have about 600 users and so far it's been working perfectly fine.
But today it's giving me a 304 error and not resetting the value. Here is a screenshot:
And here is the function code:
export const resetDailyQuestsCount = functions.https.onRequest((req, res) => {
  const ref = db.ref('users');
  ref.once('value').then(snap => {
    snap.forEach(item => {
      const uid = item.child('uid').val();
      ref.child(uid).update({ dailyQuestsCount: 0 }).catch(err => {
        res.status(500).send(err);
      });
    });
  }).catch(err => {
    res.status(500).send(err);
  });
  res.status(200).send('daily quest count reset');
});
Could this be my userbase growing too large? I doubt it; 600 is not that big.
Any help would be really appreciated! This is really affecting my users.
An HTTP function must send only a single response to the client, which means a single call to send(). Your function can possibly attempt to send multiple responses to the client in the event that multiple updates fail. Your logging isn't complete enough to demonstrate this, but it's a very real possibility with what you've shown.
Also bear in mind that this function is very much not scalable, since it reads all of your users prior to processing them. For a large number of users, this presents memory problems. You should look into ways to limit the number of nodes read by your query in order to prevent future problems.
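One way to address the single-response issue is to wait for all the updates and then send exactly one response. A sketch along those lines (my rework of the question's code, not the answerer's):

export const resetDailyQuestsCount = functions.https.onRequest(async (req, res) => {
  try {
    const snap = await db.ref('users').once('value');
    const updates = [];
    snap.forEach(item => {
      const uid = item.child('uid').val();
      // Collect the update promises instead of responding inside the loop.
      updates.push(db.ref('users').child(uid).update({ dailyQuestsCount: 0 }));
    });
    await Promise.all(updates);
    // Exactly one response, sent only after every update has completed.
    res.status(200).send('daily quest count reset');
  } catch (err) {
    res.status(500).send(err.toString());
  }
});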

Can't publish options with RabbitMQ message?

I'm using amqp.node for my RabbitMQ access in my Node code. I'm trying to use either the publish or sendToQueue methods to include some metadata with my published message (namely timestamp and content type), using the options parameter.
But whatever I pass to options is completely ignored. I think I'm missing some formatting, or a field name, but I cannot find any reliable documentation (beyond the one provided here, which does not seem to do the job).
Below is my publish function code:
var publish = function(queueName, message) {
  let content;
  let options = {
    persistent: true,
    noAck: false,
    timestamp: Date.now(),
    contentEncoding: 'utf-8'
  };
  if (typeof message === 'object') {
    content = new Buffer(JSON.stringify(message));
    options.contentType = 'application/json';
  }
  else if (typeof message === 'string') {
    content = new Buffer(message);
    options.contentType = 'text/plain';
  }
  else { //message is already a buffer?
    content = message;
  }
  return Channel.sendToQueue(queueName, content, options); //Channel defined and opened elsewhere
};
What am I missing?
Update:
Turns out that if you choose to use a ConfirmChannel, you must provide the callback function as the last parameter, or else the options object is ignored. So once I changed the code to the following, I started seeing the options correctly:
Channel.sendToQueue(queueName, content, options, (err, result) => {...});
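For context, a confirm channel comes from createConfirmChannel() rather than createChannel(), and its sendToQueue accepts that extra callback, which fires when the broker acknowledges or rejects the message. A minimal sketch of the pattern the update describes, assuming the connection, queueName, content and options from the code above:

connection.createConfirmChannel().then(channel => {
  channel.sendToQueue(queueName, content, options, (err, ok) => {
    if (err) {
      console.error('Message was nacked by the broker', err);
    } else {
      console.log('Message confirmed by the broker');
    }
  });
});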
Somehow, I can't seem to get your example publish to work, though I don't see anything particularly wrong with it; I'm not sure why.
But I was able to modify a version of my own amqplib intro code and got it working with your options just fine.
Here is the complete code for my example:
// test.js file
var amqplib = require("amqplib");
var server = "amqp://test:password#localhost/test-app";
var connection, channel;
function reportError(err){
console.log("Error happened!! OH NOES!!!!");
console.log(err.stack);
process.exit(1);
}
function createChannel(conn){
console.log("creating channel");
connection = conn;
return connection.createChannel();
}
function sendMessage(ch){
channel = ch;
console.log("sending message");
var msg = process.argv[2];
var message = new Buffer(msg);
var options = {
persistent: true,
noAck: false,
timestamp: Date.now(),
contentEncoding: "utf-8",
contentType: "text/plain"
};
channel.sendToQueue("test.q", message, options);
return channel.close();
}
console.log("connecting");
amqplib.connect(server)
.then(createChannel)
.then(sendMessage)
.then(process.exit, reportError);
To run this, open a command line and do:
node test.js "example text message"
After running that, you'll see the message show up in your "test.q" queue (assuming you have that queue created) in your "test-app" vhost.
Here's a screenshot of the resulting message from the RMQ Management plugin:
side notes:
I recommend not using sendToQueue. As I say in my RabbitMQ Patterns email course / ebook:
It took a while for me to realize this, but I now see the "send to queue" feature of RabbitMQ as an anti-pattern.
Sure, it's built in to the library and protocol. And it's convenient, right? But that doesn't mean you should use it. It's one of those features that exists to make demos simple and to handle some specific scenarios. But generally speaking, "send to queue" is an anti-pattern.
When you're a message producer, you only care about sending the message to the right exchange with the right routing key. When you're a message consumer, you care about the message destination - the queue to which you are subscribed. A message may be sent to the same exchange, with the same routing key, every day, thousands of times per day. But, that doesn't mean it will arrive in the same queue every time.
As message consumers come online and go offline, they can create new queues and bindings and remove old queues and bindings. This perspective of message producers and consumers informs the nature of queues: postal boxes that can change when they need to.
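In amqplib terms, publishing through an exchange instead of directly to a queue looks roughly like the sketch below; the exchange name and routing key are made up for illustration:

// Declare an exchange and publish to it; consumers bind their own queues to it.
channel.assertExchange("app.messages", "topic", { durable: true })
  .then(() => {
    var options = {
      persistent: true,
      timestamp: Date.now(),
      contentType: "text/plain",
      contentEncoding: "utf-8"
    };
    channel.publish("app.messages", "message.sent", new Buffer("example text message"), options);
  });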
I also recommend not using amqplib directly. It's a great library, but it lacks a lot of usability. Instead, look for a good library on top of amqplib.
I prefer wascally, by LeanKit. It's a much easier abstraction on top of amqplib and provides a lot of great features and functionality.
Lastly, if you're struggling with other details in getting RMQ up and running with Node.js, designing your app to work with it, etc., check out my RabbitMQ For Devs course - it goes from zero to hero, fast. :)
This may help others: the key name to use for the content type is contentType in the JavaScript code, whereas the web GUI for RabbitMQ uses content_type as the key name. The two contexts use different key names for the same option, so make sure to use the right one in the right context.
