How to cancel a task enqueued on Firebase Functions? - node.js

I'm talking about this: https://firebase.google.com/docs/functions/task-functions
I want to enqueue tasks with the scheduleTime parameter to run in the future, but I must be able to cancel those tasks.
I expected it would be possible to do something like this pseudo code:
const task = await queue.enqueue({ foo: true })
// Then...
await queue.cancel(task.id)
I'm using Node.js. In case it's not possible to cancel a scheduled task with firebase-admin, can I somehow work around it by using @google-cloud/tasks directly?
PS: I've also created a feature request: https://github.com/firebase/firebase-admin-node/issues/1753

The Firebase SDK doesn't currently return the task name/ID, as you can see in the code.
If you need this functionality, I'd recommend filing a feature request and using Cloud Tasks directly in the meantime.
You can create an HTTP function and then use the Cloud Tasks SDK to create HTTP target tasks that call this Cloud Function, instead of using onDispatch.
// Instead of onDispatch()
export const handleQueueEvent = functions.https.onRequest((req, res) => {
// ...
});
Adding a Cloud Task:
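const { CloudTasksClient } = require('@google-cloud/tasks');
const client = new CloudTasksClient();
// project, location, queue and url are placeholders you need to define
async function createHttpTask() {
  const parent = client.queuePath(project, location, queue);
  const task = {
    httpRequest: {
      httpMethod: 'POST', // change method if required
      url, // Use URL of handleQueueEvent function here
    },
  };
  const request = {
    parent: parent,
    task: task
  };
  const [response] = await client.createTask(request);
  // response.name is the full task name; keep it if you want to delete the task later
  return response.name;
}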
Check out the documentation for more information.
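Since the question is about cancelling, here is a minimal sketch of how that could look once you talk to Cloud Tasks directly. It assumes you keep the task name that createTask returns and delete the task by that name before it runs. The helper names scheduleTask and cancelTask are made up for illustration, and client is assumed to be a CloudTasksClient from @google-cloud/tasks, passed in as a parameter so the helpers can be tried without network access.

```javascript
// Hypothetical helpers; `client` is assumed to be a CloudTasksClient
// instance from @google-cloud/tasks (new CloudTasksClient()).

// Creates an HTTP target task that should fire at `runAt` (a Date) and
// resolves to the full task name, which is what you need to cancel it.
async function scheduleTask(client, { parent, url, payload, runAt }) {
  const task = {
    httpRequest: {
      httpMethod: 'POST',
      url,
      headers: { 'Content-Type': 'application/json' },
      // The request body is sent as bytes, encoded here as base64
      body: Buffer.from(JSON.stringify(payload)).toString('base64'),
    },
    // Timestamp in whole seconds since the Unix epoch
    scheduleTime: { seconds: Math.floor(runAt.getTime() / 1000) },
  };
  const [response] = await client.createTask({ parent, task });
  return response.name;
}

// Cancels a task that has not run yet by deleting it from the queue.
async function cancelTask(client, taskName) {
  await client.deleteTask({ name: taskName });
}
```

You would store the returned name (for example in your database, next to whatever the task belongs to) and call cancelTask with it later. Here parent is the value returned by client.queuePath(project, location, queue).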

Related

How to implement Heroku background processes in Node

I'm very new to Heroku and node so have a basic question just about how to implement background processes in a graphql server app I have hosted on Heroku.
I have a working graphql server written in Keystone CMS and hosted on Heroku.
In the database I have a schema called `Item` which basically just takes a URL from the user and then tries to scrape a Hero Image from that URL.
As the URL can be anything, I'm trying to use a headless browser via Playwright in order to get images.
This is a memory intensive process though and Heroku is OOM'ing with R14 errors. For this they recommend migrating intensive work like this to a Background Job via Redis, implemented in Bull and Throng
I've never used redis before nor these other libraries so I'm out of my element. I've looked at the Heroku implementation examples "server" and "worker" but haven't been able to translate those into a working implementation. To be honest I just don't understand the flow and design pattern I'm supposed to use with those even after reading the docs and examples.
Here is my code:
Relevant CMS schema where I call the getImageFromURL() function which is memory intensive
# Item.ts
import getImageFromURL from '../lib/imageFromURL'
export const Item = list({
...
fields: {
url: text({
validation: { isRequired: false },
}),
imageURL: text({
validation: { isRequired: false },
}),
....
},
hooks: {
resolveInput: async ({ resolvedData }) => {
if (resolvedData.url) {
const imageURL: string | undefined = await getImageFromURL(
// pass the user-provided url to the image scraper
resolvedData.url
)
if (imageURL) {
return {
...resolvedData,
// if we scraped successfully, return URL to image asset
imageURL,
}
}
return resolvedData
}
return resolvedData
},
}
Image scraping function getImageFromURL() (where I believe the background job needs to go?)
filtered to relevant parts
# imageFromURL.ts
// set up redis for processing
const Queue = require('bull')
const throng = require('throng')
const REDIS_URL = process.env.REDIS_URL || 'redis://127.0.0.1:6379'
let workers = 2
async function scrapeURL(urlString) {
  // ...
  // scrape images with playwright here
  // ...
  // return url to image asset here
}
// HERE IS WHERE I'M STUCK
// How do I do `scrapeURL` in a background process?
export default async function getImageFromURL(
urlString: string
): Promise<string | undefined> {
let workQueue = new Queue('scrape_and_uppload', REDIS_URL)
// Something like this?
// const imageURL = await scrapeURL(urlString) ??
// Or this?
// This fails with:
// "TypeError: handler.bind is not a function"
// but I'm just lost as to how this should even work
// workQueue.process(2, scrapeURL(urlString))
return Promise.resolve(imageURL)
}
Then when testing I call this with throng((url) => getImageFromURL(url), { workers }).
I have my local Redis DB running, but I'm not seeing any log output when I run this, so I don't think I'm even connecting to Redis successfully.
Thanks in advance. Let me know where I'm unclear or where I can add code examples.
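For what it's worth, the "handler.bind is not a function" error most likely comes from passing the result of scrapeURL(urlString) (a Promise) to process(), which expects a function that Bull calls once per job. A minimal sketch of the usual split is below: the web process only adds a job, and a separate worker process registers the handler. The helper names enqueueScrape and startWorker are made up for illustration, and queue is assumed to be a Bull queue created with new Queue(name, REDIS_URL).

```javascript
// Hypothetical helpers; `queue` is assumed to be a Bull queue instance.

// Web process side: only enqueue the URL and return quickly.
async function enqueueScrape(queue, urlString) {
  const job = await queue.add({ url: urlString });
  return job.id;
}

// Worker process side: register a handler *function*. Bull invokes it
// for every job; the job's payload is available as job.data.
function startWorker(queue, scrapeURL, concurrency = 2) {
  queue.process(concurrency, async (job) => {
    const imageURL = await scrapeURL(job.data.url);
    return { imageURL }; // kept by Bull as the job's return value
  });
}
```

With this split, the hook in Item.ts would no longer get the image URL back immediately; the worker (or a completion listener) would have to write imageURL to the item once the job has finished.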

Proper way of tracing distributed requests through Azure Function Apps

I am experimenting with Node.js and the Application Insights SDK in two separate function apps. Node.js is just what I am comfortable with for a quick proof of concept; it might not be the final language, so I'm not looking for language-specific solutions, only for how Application Insights behaves in the context of function apps and what it expects in order to draw a proper application map.
My goal is to be able to write simple queries in log analytics to get the full chain of a single request through multiple function apps, no matter how these are connected. I also want an accurate (as possible) view of the system in the application map in application insights.
My assumption is that a properly set operation_id and operation_parentId would yield both a queryable trace using kusto and a proper application map.
I've set up the following flow:
Function1 only exposes a HTTP trigger, whereas Function2 exposes both a HTTP and Service Bus trigger.
The full flow looks like this:
1. I call Function1 using GET http://function1.com?input=test
2. Function1 calls Function2 using REST at GET http://function2.com?input=test
3. Function1 uses the response from Function2 to add a message to a service bus queue
4. Function2 has a trigger on that same queue
I am mixing patterns here just to see what the application map does and understand how to use this correctly.
For step 1 through 3, I can see the entire chain in my logs on a single operation_Id. In this screenshot the same operationId spans two different function apps:
What I would expect to find in this log is also the trigger of the service bus where the trigger is called ServiceBusTrigger. The service bus does trigger on the message, it just gets a different operationId.
To get the REST correlation to work, I followed the guidelines from applicationinsights npm package in the section called Setting up Auto-Correlation for Azure Functions.
This is what Function1 looks like (the entrypoint and start of the chain)
let appInsights = require('applicationinsights')
appInsights.setup()
.setAutoCollectConsole(true, true)
.setDistributedTracingMode(appInsights.DistributedTracingModes.AI_AND_W3C)
.start()
const https = require('https')
const httpTrigger = async function (context, req) {
context.log('JavaScript HTTP trigger function processed a request.')
const response = await callOtherFunction(req)
context.res = {
body: response
}
context.log("Sending response on service bus")
context.bindings.outputSbQueue = response;
}
async function callOtherFunction(req) {
return new Promise((resolve, reject) => {
https.get(`https://function2.azurewebsites.net/api/HttpTrigger1?code=${process.env.FUNCTION_2_CODE}&input=${req.query.input}`, (resp) => {
let data = ''
resp.on('data', (chunk) => {
data += chunk
})
resp.on('end', () => {
resolve(data)
})
}).on("error", (err) => {
reject("Error: " + err.message)
})
})
}
module.exports = async function contextPropagatingHttpTrigger(context, req) {
// Start an AI Correlation Context using the provided Function context
const correlationContext = appInsights.startOperation(context, req);
// Wrap the Function runtime with correlationContext
return appInsights.wrapWithCorrelationContext(async () => {
const startTime = Date.now(); // Start trackRequest timer
// Run the Function
const result = await httpTrigger(context, req);
// Track Request on completion
appInsights.defaultClient.trackRequest({
name: context.req.method + " " + context.req.url,
resultCode: context.res.status,
success: true,
url: req.url,
time: new Date(startTime),
duration: Date.now() - startTime,
id: correlationContext.operation.parentId,
});
appInsights.defaultClient.flush();
return result;
}, correlationContext)();
};
And this is what the HTTP trigger in Function2 looks like:
let appInsights = require('applicationinsights')
appInsights.setup()
.setAutoCollectConsole(true, true)
.setDistributedTracingMode(appInsights.DistributedTracingModes.AI_AND_W3C)
.start()
const httpTrigger = async function (context, req) {
context.log('JavaScript HTTP trigger function processed a request.')
context.res = {
body: `Function 2 received ${req.query.input}`
}
}
module.exports = async function contextPropagatingHttpTrigger(context, req) {
// Start an AI Correlation Context using the provided Function context
const correlationContext = appInsights.startOperation(context, req);
// Wrap the Function runtime with correlationContext
return appInsights.wrapWithCorrelationContext(async () => {
const startTime = Date.now(); // Start trackRequest timer
// Run the Function
const result = await httpTrigger(context, req);
// Track Request on completion
appInsights.defaultClient.trackRequest({
name: context.req.method + " " + context.req.url,
resultCode: context.res.status,
success: true,
url: req.url,
time: new Date(startTime),
duration: Date.now() - startTime,
id: correlationContext.operation.parentId,
});
appInsights.defaultClient.flush();
return result;
}, correlationContext)();
};
The Node.js application insights documentation says:
The Node.js client library can automatically monitor incoming and outgoing HTTP requests, exceptions, and some system metrics.
So this seems to work for HTTP, but what is the proper way to do this over (for instance) a service bus queue to get a nice message trace and correct application map? The above solution for the applicationinsights SDK seems to only be for HTTP requests where you use the req object on the context. How is the operationId persisted in cross-app communication in these instances?
What is the proper way of doing this across other messaging channels? What do I get for free from application insights, and what do I need to stitch myself?
UPDATE
I found this piece of information in the application map documentation which seems to support the working theory that only REST/HTTP calls will be able to get traced. But then the question remains, how does the output binding work if it is not a HTTP call?
The app map finds components by following HTTP dependency calls made between servers with the Application Insights SDK installed.
UPDATE 2
In the end I gave up on this. In conclusion, Application Insights traces some things, but it is very unclear when and how that works, and it also depends on the language. The Node.js docs say:
The Node.js client library can automatically monitor incoming and outgoing HTTP requests, exceptions, and some system metrics. Beginning in version 0.20, the client library also can monitor some common third-party packages, like MongoDB, MySQL, and Redis. All events related to an incoming HTTP request are correlated for faster troubleshooting.
I solved this by taking inspiration from OpenTracing. Our entire stack runs in Azure Functions, so I've implemented logic to use correlationId that passes through all processes. Each process is a span. Each function/process is responsible for logging according to a structured logging framework.
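To give an idea of what that looks like, here is a minimal sketch of the approach, not the actual implementation. It assumes every message put on a queue is wrapped in an envelope that carries the correlationId, and that every function logs structured entries containing that id and its own span name. The names wrapMessage and logEntry and the envelope shape are made up for illustration.

```javascript
const { randomUUID } = require('crypto');

// Wraps a payload with a correlation id before it is put on a queue.
// Reuses the incoming id if there is one, otherwise starts a new trace.
function wrapMessage(payload, incomingCorrelationId) {
  return {
    correlationId: incomingCorrelationId || randomUUID(),
    payload,
  };
}

// Builds a structured log entry for one span (one function invocation).
function logEntry(message, spanName, text) {
  return {
    correlationId: message.correlationId,
    span: spanName,
    text,
    timestamp: new Date().toISOString(),
  };
}
```

In Function1 the output binding would then receive wrapMessage(response, correlationId) instead of the bare response, and the Service Bus trigger in Function2 would read correlationId from the message and include it in everything it logs, so a single Kusto query on that field returns the whole chain.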

How to return a generated image with Bull.js queue?

My use case is this: I want to create screenshots of parts of a page. For technical reasons, it cannot be done on the client-side (see related question below) but needs puppeteer on the server.
As I'm running this on Heroku, I have the additional restriction of a fairly short request timeout. Heroku therefore recommends implementing a queueing system based on bull.js and using worker processes for longer-running tasks, as explained here.
I have two endpoints (implemented with Express), one that receives a POST request with some configuration JSON, and another one that responds to GET when provided with a job identifier (slightly modified for brevity):
This adds the job to the queue:
router.post('/', async function(req, res, next) {
let job = await workQueue.add(req.body.chartConfig)
res.json({ id: job.id })
})
This returns info about the job
router.get('/:id', async(req, res) => {
let id = req.params.id;
let job = await workQueue.getJob(id);
let state = await job.getState();
let progress = job._progress;
let reason = job.failedReason;
res.json({ id, state, progress, reason });
})
In a different file:
const start = () => {
let workQueue = new queue('work', REDIS_URL);
workQueue.process(maxJobsPerWorker, getPNG)
}
const getPNG = async(job) => {
const { url, width, height, chart: chartConfig, chartUrl } = job.data
// ... snipped for brevity
const png = await page.screenshot({
type: 'png',
fullPage: true
})
await page.close()
job.progress(100)
return Promise.resolve({ png })
}
// ...
throng({ count: workers, worker: start })
module.exports.getPNG = getPNG
The throng invocation at the end specifies the start function as the worker function to be called when picking a job from the queue. start itself specifies getPNG to be called when treating a job.
My question now is: how do I get the generated image (png)? I guess ideally I'd like to be able to call the GET endpoint above which would return the image, but I don't know how to pass the image object.
As a more complex fall-back solution I could imagine posting the image to an image hosting service like imgur, and then returning the URL upon request of the GET endpoint. But I'd prefer, if possible, to keep things simple.
This question is a follow-up from this one:
Issue with browser-side conversion SVG -> PNG
I've opened a ticket on the GitHub repository of the bull project. The developers said that the preferred practice is to store the binary object somewhere else, and to add only the link metadata to the job's data store.
However, they also said that the storage limit of a job object appears to be 512 MB. So it is also quite possible to store an image of a reasonable size as a base64-encoded string.
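A minimal sketch of the base64 variant, assuming the screenshot is a Node.js Buffer (which is what page.screenshot resolves to in puppeteer). The helper names toJobResult and fromJobResult are made up for illustration.

```javascript
// Worker side: turn the screenshot Buffer into a JSON-safe value
// that can be returned from the job handler.
function toJobResult(pngBuffer) {
  return { png: pngBuffer.toString('base64') };
}

// Web side: turn the stored result back into a Buffer that can be
// sent to the client as image/png.
function fromJobResult(result) {
  return Buffer.from(result.png, 'base64');
}
```

In getPNG you would return toJobResult(png) instead of { png }. In the GET endpoint, once the job state is completed, the returned object should be available as job.returnvalue, so something like res.type('png').send(fromJobResult(job.returnvalue)) would deliver the image.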

Is there a way to instantiate a new client on server side by firebase cloud function?

I am developing an app and trying to implement a news feed with getstream.io, using React Native and Firebase.
Is there a way to generate a user token using a Firebase Cloud Function? If there is, could you give me a pointer on how to do so? (Code snippets for both the Cloud Function side and the client side would be super helpful.)
I have seen similar questions, but found no specific tutorial. Any help is appreciated!
For the cloud function side you need to create an https.onRequest endpoint that calls createUserToken like so:
const functions = require('firebase-functions');
const stream = require('getstream');
const client = stream.connect('YOUR_STREAM_KEY', 'YOUR_STREAM_SECRET', 'YOUR_STREAM_ID');
exports.getStreamToken = functions.https.onRequest((req, res) => {
  const token = client.createUserToken(req.body.userId);
  // An HTTP function has to send a response; a returned value is ignored
  res.json({ token });
});
After that, deploy with firebase deploy --only functions in the terminal & get the url for the function from your firebase dashboard.
Then you can use the url in a POST request with axios or fetch or whatever like this:
// axios returns a Promise, so await it (inside an async function)
const { data } = await axios({
  data: {
    userId: 'lukesmetham', // Pass the user id for the user you want to generate the token for here.
  },
  method: 'POST',
  url: 'CLOUD_FUNC_URL_HERE',
});
Now, data.token will be the returned stream token and you can save it to AsyncStorage or wherever you want to store it. Are you keeping your user data in firebase/firestore or stream itself? With a bit more background I can add to the above code for you depending on your setup! 😊 Hopefully this helps!
UPDATE:
const functions = require('firebase-functions');
const stream = require('getstream');
const client = stream.connect('YOUR_STREAM_KEY', 'YOUR_STREAM_SECRET', 'YOUR_STREAM_ID');
// The onCreate listener will listen to any NEW documents created
// in the user collection and will only run when it is created for the first time.
// We then use the {userId} wildcard (you can call this whatever you like.) Which will
// be filled with the document's key at runtime through the context object below.
exports.onCreateUser = functions.firestore.document('user/{userId}').onCreate((snapshot, context) => {
// Snapshot is the newly created user data.
// Firestore snapshots expose their fields via data() (val() is the Realtime Database API).
const { avatar, email, name } = snapshot.data();
const { userId } = context.params; // this is the wildcard from the document param above.
// you can then pass this to the createUserToken function
// and do whatever you like with it from here
const streamToken = client.createUserToken(userId);
});
Let me know if that needs clearing up, these docs are super helpful for this topic too 😊
https://firebase.google.com/docs/functions/firestore-events

How to mock external service when testing a NodeJS API

I have JSON API built with koa which I am trying to cover with integration tests.
A simple test would look like this:
describe("GET: /users", function() {
it ("should respond", function (done) {
request(server)
.get('/api/users')
.expect(200, done);
});
});
Now the issue comes when the actions behind a controller (let's say saveUser at POST /users) use external resources. For instance, I need to validate the user's phone number.
My controller looks like this:
save: async function(ctx, next) {
const userFromRequest = await parse(ctx);
try {
// validate data
await ctx.repo.validate(userFromRequest);
// validate mobile code
await ctx.repo.validateSMSCode(
userFromRequest.mobile_number_verification_token,
userFromRequest.mobile_number.prefix + userFromRequest.mobile_number.number
);
const user = await ctx.repo.create(userFromRequest);
return ctx.data(201, { user });
} catch (e) {
return ctx.error(422, e.message, e.meta);
}
}
I was hoping to be able to mock ctx.repo on the request object, but I can't seem to get hold of it from the test, which means that my tests are actually hitting the phone number verification service.
Is there any way to avoid hitting that verification service?
Have you considered using a mockup library like https://github.com/mfncooper/mockery?
Typically, when writing tests requiring external services, I mock the service client library module. For example, using mocha:
const mockery = require('mockery');
const repo = require('your-repo-module');
before(function() {
  mockery.enable();
  repo.validateSMSCode = function() {...};
  mockery.registerMock('your-repo-module', repo);
});
This way, every time you require your-repo-module, the mocked module will be loaded rather than the original one. Until you disable the mock, obviously...
app.context is the prototype from which ctx is created from. You may
add additional properties to ctx by editing app.context. This is
useful for adding properties or methods to ctx to be used across your
entire app, which may be more performant (no middleware) and/or easier
(fewer require()s) at the expense of relying more on ctx, which could
be considered an anti-pattern.
app.context.someProp = "Some Value";
app.use(async (ctx) => {
console.log(ctx.someProp);
});
For your sample, you can redefine app.context.repo.validateSMSCode like this, assuming that you have the following setup lines in your test:
import app from '../app'
import supertest from 'supertest'
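// The controller calls validateSMSCode(token, mobileNumber), so the mock takes the same arguments
app.context.repo.validateSMSCode = async function(token, mobileNumber) {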
// Your logic here.
};
const request = supertest.agent(app.listen())
After this, the app.context.repo.validateSMSCode method defined in your test will be used instead of the original method.
https://github.com/koajs/koa/blob/v2.x/docs/api/index.md#appcontext
https://github.com/koajs/koa/issues/652
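If you override a method like this, it is usually worth restoring the original afterwards so other tests see the real implementation. A minimal sketch of such a helper follows; the name stub is made up for illustration and is not part of Koa or mocha.

```javascript
// Replaces obj[name] with impl and returns a function that puts the
// original implementation back.
function stub(obj, name, impl) {
  const original = obj[name];
  obj[name] = impl;
  return function restore() {
    obj[name] = original;
  };
}
```

In a mocha suite you could call const restore = stub(app.context.repo, 'validateSMSCode', async () => {}) in before() and restore() in after(), assuming app.context.repo is already set up when the test file is loaded.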
