Firebase cloud pubsub subscription stops listening for messages - node.js

I'm trying to connect my Firebase Cloud Functions project to a third-party Pub/Sub (a separate project). According to this thread this is not possible, so there's no traditional way to make it work. However, I've tried to manually subscribe to certain topics using the @google-cloud/pubsub client inside my Firebase Cloud Functions. I need to react to Pub/Sub messages and write/update certain documents.
Example (minimal):
I have a Pub/Sub subscription in sub.ts:
import { PubSub } from '@google-cloud/pubsub';

const pubSubClient = new PubSub({
  projectId: config.project_id,
  credentials: {
    client_email: config.client_email,
    private_key: config.private_key
  }
});
I then listen on a subscription to run some business logic:
const subscription = pubSubClient.subscription('my-subscription');
subscription.on('message', async (message) => {
  try {
    message.ack();
    const event = parseData(message.data);
    await admin.firestore().collection('my-collection').add(event);
  } catch (e) {
    console.error(e);
  }
});
This file is then imported in index.js, where I declare most of my Cloud Functions:
import * as admin from 'firebase-admin';
admin.initializeApp();
import './sub';
export { myFunction } from './modules/my-module';
export { myOtherFunction } from './modules/other-module';
It appears that my subscriptions die out after a while and messages stop coming through. If I redeploy my functions it works again for a time, but then it stops listening for messages. I've read that Firebase Cloud Functions are stateless, so in this case I would need a "stateful" module within my Firebase project. Is this possible? Or should I manage this on another server?
Thanks!

What you're trying to do here (subscribe to a pubsub topic from code running in Cloud Functions) won't work for two reasons:
Cloud Functions server instances scale up and down automatically. There could be 0 instances or 1000 instances concurrently running your triggers, depending on the current load.
Cloud Functions shuts down running code when the function has terminated, and there is a maximum timeout of 9 minutes for any function invocation.
So, even if you manage to subscribe to a topic, that subscription doesn't have a guaranteed duration. It will eventually be destroyed, and you will lose messages.
If you want to handle messages using Cloud Functions in "project A", but the messages come from "project B", you should consider sending them from B to A, perhaps by using a Pub/Sub-triggered function in B that does nothing other than publish each message to a topic in A. You can then write another function in A to handle them.
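A minimal sketch of that relay function; the topic names 'incoming-events' (in B) and 'relayed-events' (in A) are hypothetical, and B's service account is assumed to have publish rights on A's topic:

// Runs in project B: republish each incoming message to a topic in project A.
const functions = require('firebase-functions');
const { PubSub } = require('@google-cloud/pubsub');

// Client pointed at project A.
const pubSubClient = new PubSub({ projectId: 'project-a' });

exports.relayToProjectA = functions.pubsub
  .topic('incoming-events')
  .onPublish(async (message) => {
    // message.data is base64-encoded; forward the raw payload unchanged.
    await pubSubClient
      .topic('relayed-events')
      .publishMessage({ data: Buffer.from(message.data, 'base64') });
  });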

Related

how does users.watch (in gmail google api) listen for notifications?

I am confused as to how the watch feature in the Gmail API should be implemented to receive push notifications inside a node.js script. Should I call the method inside an infinite loop or something, so that it doesn't stop listening for email notifications after the first call is made?
Here's the sample code that I've written in node.js:
// Assumes `google` from the googleapis package and an `authenticate`
// helper that returns an OAuth2 client.
const getEmailNotification = () => {
  return new Promise(async (resolve, reject) => {
    try {
      let auth = await authenticate();
      const gmail = google.gmail({version: 'v1', auth});
      await gmail.users.stop({
        userId: '<email id>'
      });
      let watchResponse = await gmail.users.watch({
        userId: '<email id>',
        labelIds: ['INBOX'],
        topicName: 'projects/<projectName>/topics/<topicName>'
      });
      return resolve(watchResponse);
    } catch (err) {
      return reject(`Some error occurred`);
    }
  });
};
Thank you!
Summary
To receive push notifications through Pub/Sub you need to create a webhook. What does this mean? You need a web application, or any kind of service, that exposes a URL where notifications can be received.
As stated in the Push subscription documentation:
The Pub/Sub server sends each message as an HTTPS request to the subscriber application at a pre-configured endpoint.
The endpoint acknowledges the message by returning an HTTP success status code. A non-success response indicates that the message should be resent.
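In Node.js terms, such an endpoint is just an HTTP route that returns a 2xx status. A minimal Express sketch (the route path and port are assumptions):

const express = require('express');
const app = express();
app.use(express.json());

// Pub/Sub delivers each push as an HTTPS POST with a JSON envelope;
// the actual payload is base64-encoded in message.data.
app.post('/pubsub/push', (req, res) => {
  const data = Buffer.from(req.body.message.data, 'base64').toString('utf8');
  console.log('Notification received:', data);
  // Any 2xx status acknowledges the message; anything else causes a resend.
  res.status(204).send();
});

app.listen(8080);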
Setting up a channel to watch for notifications can be summarized in the following steps (the documentation you refer to lists them):
Select/Create a project within the Google Cloud Console.
Create a new Pub/Sub topic.
Create a subscription (PUSH) for that topic.
Add the necessary permissions; in this case, add gmail-api-push@system.gserviceaccount.com as Pub/Sub Publisher.
Indicate what types of mail you want it to listen for via the Users.watch() method (which is what you are doing in your script).
Example
I give you an example using Apps Script (it is an easy way to visualize it, but this could be achieved from any kind of web application; as you are using Node.js, I suppose you are familiar with Express.js or related frameworks).
First I created a new Google Apps Script project; this will be my webhook. Basically I want it to log every HTTP POST request into a Google Doc that I previously created. For this I use doPost(), the equivalent of app.post() in Express. If you want to know more about how Apps Script works you can visit this link, but that is not the main topic.
Code.gs
const doPost = (e) => {
  const doc = DocumentApp.openById(<DOC_ID>)
  doc.getBody().appendParagraph(JSON.stringify(e, null, 2))
}
Then I made a new deployment as a Web App, set it to be accessible by anyone, and wrote down the URL for later. This is similar to deploying your Node.js application to the internet.
I select a project in the Cloud Console, as indicated in the Prerequisites of Cloud Pub/Sub.
Inside this project, I create a new topic that I call GmailAPIPush. Afterwards, I click Add Principal (in the right bar of the Topics section) and add gmail-api-push@system.gserviceaccount.com with the Pub/Sub Publisher role. This is a requirement that grants Gmail the privilege to publish notifications.
In the same project, I create a Subscription. I tell it to be of the Push type and add the URL of the Web App that I have previously created.
This is the most critical part, and it determines how your application will work. If you want to know which type of subscription best suits your needs (PUSH or PULL), there is detailed documentation to help you choose between the two.
Finally we are left with the simplest part, configuring the Gmail account to send updates on the mailbox. I am going to do this from Apps Script, but it is exactly the same as with Node.
const watchUserGmail = () => {
  const request = {
    'labelIds': ['INBOX'],
    'topicName': 'projects/my_project_name/topics/GmailAPIPush'
  }
  Gmail.Users.watch(request, 'me')
}
Once the function is executed, I send a test message, and voila, the notification appears in my document.
Returning to the case you describe, I'll try to explain it with a metaphor. Imagine you have a mailbox and you are waiting for a very important letter. Because you are nervous, you go to check whether the letter has arrived every 5 minutes (similar to what you propose with setInterval); most of the time you check, there is nothing new. However, if you train your dog to bark (push notification) every time the mailman comes, you only go to check your mailbox when you know there are new letters.

Firebase Functions: How to maintain 'app-global' API client?

How can I achieve an 'app-wide' global variable that is shared across Cloud Function instances and function invocations? I want to create a truly 'global' object that is initialized only once for the lifetime of all my functions.
Context:
My app's entire backend is Firestore + Firebase Cloud Functions. That is, I use a mix of background (Firestore) triggers and HTTP functions to implement backend logic. Additionally, I rely on a 3rd-party location service to continually listen to location updates from sensors. I want just a single instance of the client on which to subscribe to these updates.
The problem is that Firebase/Google Cloud Functions are stateless, meaning that function instances don't share memory/objects/state. If I call functionA, functionB, and functionC, at least 3 instances of the locationService client are created, each listening separately to the 3rd-party service, so we end up with duplicate invocations of the location API callback.
Sample code:
// index.js
const functions = require("firebase-functions");
exports.locationService = require('./location_service');
this.locationService.initClient();
// define callable/HTTP functions & Firestore triggers
...
and
// location_service.js
var tracker = require("third-party-tracker-js");
const self = (module.exports = {
  initClient: function () {
    tracker.initialize('apiKey')
      .then((client) => {
        client.setCallback(async function (payload) {
          console.log("received location update: ", payload)
          // process the payload ...
          // with multiple function instances running at once, we receive as many callbacks for each location update
        })
        client.subscribeProject()
          .then((subscription) => {
            subscription.subscribe()
              .then((subscribeMsg) => {
                console.log("subscribed to project with message: ", subscribeMsg); // success
              });
            // subscription.unsubscribe(); // ??? at what point should we unsubscribe?
          })
          .catch((err) => {
            throw(err)
          })
      })
      .catch((err) => {
        throw(err)
      })
  },
});
I realize what I'm trying to do is roughly equivalent to implementing a daemon in a single-process environment, and it appears that serverless environments like Firebase/Google Cloud Functions aren't designed to support this need because each instance runs as its own process. But I'd love to hear any contrary ideas and possible workarounds.
Another idea...
Inspired by this related SO post and the official GCF docs on stateless functions, I thought about using Firestore to persist a tracker value that allows us to conditionally initialize the API client. Roughly like this:
// read value from db; only initialize the client if there's no valid subscription
let locSubscriberActive = await getSubscribeStatusFromDb();
if (!locSubscriberActive) {
  this.locationService.initClient();
}
// in `location_service.js`, call setSubscribeStatusToDb() to set the flag to true when we call subscribe(); reset it when we get terminated
The problem: at what point do I unset/reset that value? Intuitively, I would do so the moment the function instance that initialized the client gets recycled/killed. However, it appears that it is not possible to know when a Firebase Cloud Function instance is terminated. I searched everywhere but couldn't find docs on how to detect such an event...
What you're trying to do is not at all supported in Cloud Functions. It's important to realize that there may be any number of server instances allocated for each deployed function. That's how Cloud Functions scales up and down to match the load on the function in a cost-effective way. These instances might be terminated at any time for any reason. You have no indication when an instance terminates.
Also, instances are not capable of performing any computation when they are idle. CPU resources are clamped down after a function terminates, and are spun up again when the next function is invoked on that instance. You can't have any "daemon" code running when a function is not actively being invoked. I don't know what your locationService does, but it is certainly doing nothing at all after a function terminates, regardless of how it terminated.
For any sort of long-running or daemon-like code, Cloud Functions is not a suitable product. You should instead consider using another product that lets you run code 24/7 without disruption. App Engine and Compute Engine are viable alternatives, and you will have to think carefully about whether and how you want their server instances to scale with load.
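For illustration, a minimal daemon sketch for such an always-on host, reusing the tracker API shapes from the question (the exact signatures are assumptions based on that snippet):

// daemon.js - runs 24/7 on App Engine flex / Compute Engine, not in Cloud Functions.
const tracker = require('third-party-tracker-js');

async function main() {
  const client = await tracker.initialize('apiKey');
  client.setCallback((payload) => {
    console.log('received location update:', payload);
    // process the payload; only this one process is subscribed,
    // so each update is handled exactly once
  });
  const subscription = await client.subscribeProject();
  await subscription.subscribe();
  // The process stays alive and keeps receiving callbacks.
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});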

Why am I receiving this error on Azure when using eventhubs?

I started using Azure recently and it has been an overwhelming experience. I started experimenting with Event Hubs, basically following the official tutorials on how to send and receive messages from an event hub using Node.js.
Everything worked perfectly so I built a small web app (static frontend app) and I connected it with a node backend, where the communication with eventhubs occurs. So basically my app is built like this:
frontend <----> node server <-----> eventhubs
As you can see it is very simple. The node server fetches data from Event Hubs and forwards it to the frontend, where the values are shown. It was a cool experience and I was enjoying MS Azure until this error occurred:
azure.eventhub.common.EventHubError: ErrorCodes.ResourceLimitExceeded: Exceeded the maximum number of allowed receivers per partition in a consumer group which is 5. List of connected receivers - nil, nil, nil, nil, nil.
This error is really confusing. I'm using the default consumer group and only one app. I never tried to access this consumer group from another app. It says the limit is 5, and I'm using only one app, so it should be fine, or am I missing something? I don't understand what is happening here.
I wasted too much time googling and researching this but didn't get it. In the end, I thought that maybe every time I deploy the app (my frontend and my node server) to Azure it is counted as one consumer, and since I deployed the app more than 5 times this error shows up. Am I right, or is this nonsense?
Edit
I'm using websockets as the communication protocol between my app (frontend) and my node server (backend). The node server is using the default consumer group (I didn't change anything); I just followed this official example from Microsoft. I'm basically using the code from the MS docs, which is why I didn't post any code snippet from my node server, and since the error happens in the backend and not the frontend, posting frontend code wouldn't help.
So to wrap up: I'm using websockets to connect front and backend. It works perfectly for a day or two, and then this error starts to happen. Sometimes I open more than one client (for example, a client in the browser and a client on my smartphone).
I think I don't understand the concept of a consumer group. Is every client a consumer? So if I open my app (the same app) in 5 different browser tabs, do I have 5 consumers?
I didn't quite understand the answer below and what is meant by "pooling client", therefore I will post code examples here to show what I'm trying to do.
Code snippets
Here is the function I'm using on the server side to communicate with Event Hubs and receive/consume messages:
// Assumes EventHubConsumerClient from @azure/event-hubs, plus
// `consumerGroup` and the socket.io `io` object defined elsewhere.
const { EventHubConsumerClient } = require("@azure/event-hubs");

async function receiveEventhubMessage(socket, eventHubName, connectionString) {
  const consumerClient = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName);
  const subscription = consumerClient.subscribe({
    processEvents: async (events, context) => {
      for (const event of events) {
        console.log("[ consumer ] Message received : " + event.body);
        io.emit('msg-received', event.body);
      }
    },
    processError: async (err, context) => {
      console.log(`Error : ${err}`);
    }
  });
}
If you notice, I'm passing the event hub name and connection string as arguments in order to be able to change them. In the frontend I have a list of topics, and each topic has its own event hub name, but they share the same Event Hubs namespace.
Here is an example of two event hub names that I have:
{ "EventHubName": "eh-test-command" }
{ "EventHubName": "eh-test-telemetry" }
If the user chooses to send a command (in the frontend I just have a list of buttons the user can click to fire an event over websockets), then the command event hub name is sent from the frontend to the node server. The server receives that event hub name and switches the consumer client in the function I posted above.
Here is the code where I'm calling that:
// io is a socket.io object
io.on('connection', socket => {
  socket.on('onUserChoice', choice => {
    // choice is an object sent from the frontend based on what the user chose,
    // e.g. for a command: choice = {"EventHubName": "eh-test-command", "payload": "whatever"}
    receiveEventhubMessage(socket, choice.EventHubName, choice.EventHubNameSpace)
      .catch(err => console.log(`[ consumerClient ] Error while receiving eventhub messages: ${err}`));
  });
});
The app I'm building will be extended in the future to a real use case in the automotive field, which is why this is important to me. Therefore, I'm trying to figure out how I can switch between event hubs without creating a new consumerClient each time the event hub name changes.
I must say that I didn't understand the example with the "pooling client". I would appreciate more elaboration or, ideally, a minimal example just to put me on the right track.
Based on the conversation in the issue, it would seem that the root cause of this is that your backend is creating a new EventHubConsumerClient for each request coming from your frontend. Because each client will open a dedicated connection to the service, if you have more than 5 requests for the same Event Hub instance using the same consumer group, you'll exceed the quota.
To get around this, you'll want to consider pooling your EventHubConsumerClient instances so that you're starting with one per Event Hub instance. You can safely use the pooled client to handle a request for your frontend by calling subscribe. This will allow you to share the connection amongst multiple frontend requests.
The key idea is that your consumerClient is not created for every request but is shared across requests. Using your snippet to illustrate the simplest approach, you'd hoist the client creation outside the receive function. It may look something like:
const consumerClient = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName);

async function receiveEventhubMessage(socket, eventHubName, connectionString) {
  const subscription = consumerClient.subscribe({
    processEvents: async (events, context) => {
      for (const event of events) {
        console.log("[ consumer ] Message received : " + event.body);
        io.emit('msg-received', event.body);
      }
    },
    processError: async (err, context) => {
      console.log(`Error : ${err}`);
    }
  });
}
That said, the above may not be adequate for your environment, depending on the architecture of the application. If whatever hosts receiveEventHubMessage is created dynamically for each request, nothing changes. In that case, you'd want to consider something like a singleton or dependency injection to extend the client's lifespan.
If you end up having issues scaling to meet your requests, you can consider increasing the number of clients for each Event Hub and/or spreading requests out to different consumer groups.
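For the multi-hub case in the question, one way to pool is a small map keyed by event hub name, so each hub gets exactly one client no matter how many sockets connect. A sketch, assuming the same consumerGroup and connectionString variables as above:

const { EventHubConsumerClient } = require("@azure/event-hubs");

// One client per event hub name, created lazily on first use.
const clients = new Map();

function getConsumerClient(eventHubName) {
  if (!clients.has(eventHubName)) {
    clients.set(
      eventHubName,
      new EventHubConsumerClient(consumerGroup, connectionString, eventHubName)
    );
  }
  return clients.get(eventHubName);
}

Inside receiveEventhubMessage, calling getConsumerClient(eventHubName) instead of new EventHubConsumerClient(...) lets repeated socket events reuse the same connection rather than opening a new receiver each time.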

How to get Google cloud function execution event in node.js using firestore

Below is a Google Cloud Function, deployed properly and working fine.
path to function - functions/index.js
const functions = require('firebase-functions');
const admin = require("firebase-admin");
admin.initializeApp();

exports.createUser = functions.firestore
  .document('users/{userId}')
  .onCreate((snap, context) => {
    const newValue = snap.data();
    console.log(newValue);
  });
How can I access this function's event on successful invocation in a Node.js app? Something like:
const myFunctions = require("./functions/index");

myFunctions.createUser().then((data) => {
  console.log(data);
})
.catch((err) => {
  console.log(err);
});
As of now I am getting the error below.
Your createUser Cloud Function is triggered by a Firestore onCreate() event type and therefore will be "triggered when a document is written for the first time", as per the documentation.
The doc also adds the following:
In a typical lifecycle, a Cloud Firestore function does the following:
Waits for changes to a particular document. (In this case when the document is written for the first time)
Triggers when an event occurs and performs its tasks
Receives a data object that contains a snapshot of the data stored in the specified document.
Therefore, if you want to trigger this Cloud Function from "the outside world", e.g. from a Node.js app, you need to create a new Firestore document at the corresponding location, i.e. under the users collection. To this end you would use the Node.js Server SDK, see https://cloud.google.com/nodejs/docs/reference/firestore/0.14.x/
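A minimal sketch with the Server SDK (the field values are illustrative):

const { Firestore } = require('@google-cloud/firestore');
const firestore = new Firestore();

// Writing a new document under 'users' fires the createUser trigger.
firestore.collection('users')
  .add({ name: 'Jane Doe' })
  .then((ref) => console.log('Created user doc:', ref.id))
  .catch((err) => console.error(err));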
Note that you could also trigger it from a client application (web, android, iOS) by creating a new user doc with the corresponding client SDK.
Update following your comments:
You cannot directly "port" and run your code written for Cloud Functions to a Node.js app. You will have to re-develop your solution for Node.js.
In your case you should use the Node.js Server SDK (as mentioned in my comment), and you could use the onSnapshot method of a CollectionReference. See https://cloud.google.com/nodejs/docs/reference/firestore/0.14.x/CollectionReference#onSnapshot
I will try to answer your question, but it's a bit unclear. You asked:
How to get Google cloud function execution event
Well, the event has started when the function triggers and your code is running, i.e. at your line const newValue = snap.data().
Maybe you are looking for a way to run certain tasks when the trigger has fired? You simply do that from inside the function and return a promise. If, for example, you had multiple async tasks to run, you could use Promise.all().
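A sketch of that pattern, where sendWelcomeEmail and updateUserCount are hypothetical async helpers:

exports.createUser = functions.firestore
  .document('users/{userId}')
  .onCreate((snap, context) => {
    const newValue = snap.data();
    // Return one promise covering all tasks so the function only
    // terminates once every task has settled.
    return Promise.all([
      sendWelcomeEmail(newValue), // hypothetical helper
      updateUserCount()           // hypothetical helper
    ]);
  });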

Realtime messaging with NodeJS across multiple processes

I'm trying to implement an API that interacts with a NodeJS server for realtime messaging. Now when that NodeJS app is deployed to a scalable environment like Heroku, multiple instances of this app may be running.
Is it possible to design the node app so that all clients subscribed to a "message channel" will receive this message, although multiple node instances are running - and therefore multiple copies of this channel?
Check out zeromq; it should provide some simple, high-performance IPC abstractions to do what you want. In particular, the pub/sub example will be useful.
The main challenge as I imagine it, without knowing anything about how Heroku spawns multiple server instances, will be the logic to determine who is the publisher (the rest of the instances will be subscribers). So let's say, for argument's sake, that your hosting provider gives you an environment variable called INSTANCE_NUM, an integer in [0, 1024] indicating the instance number of the process; we'll say that instance zero is the message publisher.
var zmq = require('zeromq'); // legacy (v5) API

if (process.env['INSTANCE_NUM'] === '0') { // I'm the publisher.
  var emitter = getEventEmitter(); // e.g. an HttpServer.
  var pub = zmq.socket('pub');
  pub.bindSync('tcp://*:5555');
  emitter.on('someEvent', function(data) {
    pub.send(data);
  });
} else { // I'm a subscriber.
  var sub = zmq.socket('sub');
  sub.subscribe(''); // empty prefix = receive all messages
  sub.on('message', function(data) {
    // Handle the event data...
  });
  sub.connect('tcp://localhost:5555');
}
Note that I'm new to zeromq and the above code is totally untested; it's just for demonstration.
