According to the docs on distributed tracing for Service Bus, the Diagnostic-Id property should be used to store the W3C traceparent.
When an Azure Function is triggered by a Service Bus message, I would expect this value to be used to initialize the trace context (so Application Insights logs would get tagged with the appropriate operation_Id).
I've even tried copying it over to the trace context, and it doesn't seem to work.
Is it not possible to maintain distributed tracing with Service Bus in Azure Functions?
Update
You can kind of get this to work by wrapping the handler in an App Insights correlation context; however, calls to context.log will still carry the original traceparent reference :(
export default async function contextPropagatingHttpTrigger(context, req) {
    // overwrite with the proper traceparent from the incoming SB message.
    const sbTraceParent = context.bindingData.applicationProperties['diagnostic-Id'];
    if (sbTraceParent) {
        context.traceContext.traceparent = sbTraceParent;
    }

    const correlationContext = appInsights.startOperation(context, req) as CorrelationContext;

    // Wrap the Function runtime with correlationContext
    return appInsights.wrapWithCorrelationContext(async () => {
        const startTime = Date.now(); // Start trackRequest timer
        try {
            // operation_Id matches SB 'diagnostic-Id'
            appInsights.defaultClient.trackTrace({
                message: 'Correct Trace Context',
            });

            // operation_Id is the original function traceparent
            context.log('Incorrect Trace Context');
            return await trigger(context, req);
        } catch (e) {
            context.log.error(e);
            throw e;
        } finally {
            // Track Request on completion
            appInsights.defaultClient.flush();
        }
    }, correlationContext)();
}
Background
I am working on creating an autonomous Google AutoML end-to-end system. I created a Cloud Function that receives a Cloud Pub/Sub message when training starts. The Cloud Function uses the operation ID to get the operation status of the training. If the training of the model is complete (operation metadata = true), the function sends the model ID to a deployment function and publishes a Pub/Sub message with the model ID so the model can be called for predictions. I found a solution in this SO post: How to programmatically get model id from google-cloud-automl with node.js client library
Problem
The issue I am running into is the Cloud Function timeout of 10 minutes. I wrote a question on Reddit about potential solutions: https://www.reddit.com/r/googlecloud/comments/jqr213/cloud_function_to_compute_engine/ The Compute Engine solution does not seem practical for a system written mainly in a Cloud Function environment. While trying to implement the cron job solution, I thought of the retry feature for Cloud Functions. It keeps the same event and will retry the function for up to a week. The documentation for retries is https://cloud.google.com/functions/docs/bestpractices/retries How could I cancel the function so that it keeps retrying until the status becomes true and then completes the deployment and the Pub/Sub message? My thought is to include the ending of the system in the if/else statement; I am just struggling to find documentation for this / whether it would actually work.
Code
// Imports the Google Cloud AutoML and Pub/Sub client libraries
const {AutoMlClient} = require('@google-cloud/automl').v1;
const {PubSub} = require('@google-cloud/pubsub');

// Instantiates the clients once, outside the handler
const client = new AutoMlClient();
const pubSubClient = new PubSub();

exports.helloPubSub = (event, context) => {
  // Decode the Pub/Sub message containing the operation name
  const message = event.data
    ? Buffer.from(event.data, 'base64').toString()
    : 'Hello, World';
  const model = message;
  console.log(model);
  const modelpath = message.replace('"', '');
  const modelID = modelpath.replace('"', '');
  const message1 = model.replace('projects/170974376642/locations/us-central1/operations/', '');
  const message2 = message1.replace('"', '');
  const message3 = message2.replace('"', '');
  console.log(`Operation ID is: ${message3}`);
  getOperationStatus(message3, modelID);
};
// [START automl_vision_classification_deploy_model_node_count]
async function getOperationStatus(opId, message) {
  console.log('Starting operation status');
  const opped = opId;
  const data = message;
  const projectId = '170974376642';
  const location = 'us-central1';
  const operationId = opId;

  // Construct request
  const request = {
    name: `${message}`,
  };

  console.log('Made it to the response');
  const [response] = await client.operationsClient.getOperation(request);
  console.log(`Name: ${response.name}`);
  console.log(`Operation details:`);
  var apple = JSON.stringify(response);
  console.log(apple);

  console.log('Loop until the model is ready to deploy');
  if (apple.includes('True')) {
    const appleF = apple.replace(/projects\/[a-zA-Z0-9-]*\/locations\/[a-zA-Z0-9-]*\/models\//, '');
    deployModelWithNodeCount(appleF);
    pubSub(appleF);
  } else {
    getOperationStatus(opped, data);
  }
}
async function pubSub(id) {
  const topicName = 'modelID';
  const data = JSON.stringify({foo: `${id}`});

  async function publishMessage() {
    // Publishes the message as a string, e.g. "Hello, world!" or JSON.stringify(someObject)
    const dataBuffer = Buffer.from(data);
    try {
      const messageId = await pubSubClient.topic(topicName).publish(dataBuffer);
      console.log(`Message ${messageId} published.`);
    } catch (error) {
      console.error(`Received error while publishing: ${error.message}`);
      process.exitCode = 1;
    }
  }

  publishMessage();
  // [END pubsub_publish_with_error_handler]
  // [END pubsub_quickstart_publisher]

  process.on('unhandledRejection', err => {
    console.error(err.message);
    process.exitCode = 1;
  });
}
async function deployModelWithNodeCount(message) {
  const projectId = 'ireda1';
  const location = 'us-central1';
  const modelId = message;

  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
    imageClassificationModelDeploymentMetadata: {
      nodeCount: 1,
    },
  };

  const [operation] = await client.deployModel(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Model deployment finished. ${response}`);
}
// [END automl_vision_classification_deploy_model_node_count]
There are several improvements you can consider for your code. First of all, it is important to understand that Cloud Functions are short-lived: 9 minutes is the maximum time your function will be active. Cloud Functions are not meant for background operations; if you are looking for a solution that can run in the background and requires minimal infrastructure, I would recommend having a look at Cloud Run.
Now let's have a look at some parts of the code and how they can be improved with a different architecture, keeping Cloud Functions and Pub/Sub as the backbone.
Waiting on model deployment
The code you use is:
if (apple.includes('True')) {
  const appleF = apple.replace(/projects\/[a-zA-Z0-9-]*\/locations\/[a-zA-Z0-9-]*\/models\//, '');
  deployModelWithNodeCount(appleF);
  pubSub(appleF);
} else {
  getOperationStatus(opped, data);
}
First of all, I would strongly suggest not using recursion here, because a) this can be handled with a simple loop, and b) you are bombarding the service without any timeout or back-off policy. The latter might result in either your service crashing or the endpoint starting to reject your requests.
To improve your code, you should at the very least add a delay between retries, like this:
setTimeout(() => getOperationStatus(opped, data), 1000)
For readability, I would also suggest using a loop instead, since you are using async patterns anyway:
let status = await getOperationStatus(opped, data);
while (!status) {
  await new Promise(t => setTimeout(t, 1000));
  status = await getOperationStatus(opped, data);
}
In this case, you need to separate it into two functions: 1) getOperationStatus, which just fetches and returns the status, and 2) waitForDeployment, which polls for the status, compares it with the expected result, and decides whether to a) wait and retry or b) abandon and return. A minimal sketch of that split is shown below.
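For illustration, here is a hedged sketch of that split. It reuses the client instance from your code; the waitForDeployment name, the attempt limit, and the 5-second delay are placeholders, not part of the original code.

async function getOperationStatus(operationName) {
  // Fetch the long-running operation once; `done` is true when it has finished.
  const [operation] = await client.operationsClient.getOperation({ name: operationName });
  return operation;
}

async function waitForDeployment(operationName, maxAttempts = 20) {
  // Poll with a delay and an attempt limit instead of unbounded recursion.
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const operation = await getOperationStatus(operationName);
    if (operation.done) {
      return operation; // finished, either successfully or with an error
    }
    await new Promise(resolve => setTimeout(resolve, 5000)); // simple fixed back-off
  }
  throw new Error(`Operation ${operationName} did not finish within the allotted attempts`);
}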
This might make your code better, but it does not solve the fundamental problem of the system design. To understand this, let's have a look at splitting responsibilities and structuring the system differently. As a side note, the guide here is not meant for a Cloud Function application.
A few explanations:
The Activation Function initializes the entire process: it calls Vision AutoML to start the deployment, only gets the ID of the operation, and pushes it to the queue.
Cloud Scheduler pushes a trigger to PubSub (alternatively, it can also call the function as an endpoint) every X minutes/seconds, saying that it is time to check on the progress.
The Polling Function, once triggered, asks for the next ID to check, queries Cloud AutoML and, if finished, acknowledges the message and writes the results; otherwise it exits. You need to be careful with the configuration of acknowledgments here. Useful information is here.
Polling of the status
The minor thing I have noticed is how you are polling the status. Why don't you just query the URL GET https://automl.googleapis.com/v1/projects/project-id/locations/us-central1/operations/operation-id and check the done field of the response (check here for details)?
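A hedged sketch of what that check could look like from Node.js; google-auth-library and the isOperationDone name are assumptions here, and the project, location and operation IDs are placeholders:

const {GoogleAuth} = require('google-auth-library');

// Returns true once the long-running operation reports done: true.
async function isOperationDone(projectId, location, operationId) {
  const auth = new GoogleAuth({ scopes: 'https://www.googleapis.com/auth/cloud-platform' });
  const authClient = await auth.getClient();
  const url = `https://automl.googleapis.com/v1/projects/${projectId}/locations/${location}/operations/${operationId}`;
  const res = await authClient.request({ url });
  return res.data.done === true;
}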
Conclusion: Cloud Functions are short-lived and should handle only one operation at a time, without waiting. If you want a simple loop that waits for results, use Cloud Run instead.
I have a Lambda function that is designed to take a message from an SQS queue and then write a value called perf_value, which is just an integer, to InfluxDB. The CloudWatch logs show it firing each time and logging Done, as seen in the .then() block of my write point. Even though it fires each time, I am still only seeing a single data point in InfluxDB Cloud. I can't figure out why it only writes a single value and then nothing after that. I don't see a backlog in SQS and no error messages in CloudWatch either. I'm guessing it is a code issue or my InfluxDB Cloud setup, though I used defaults, which you would expect to work for multiple data points.
'use strict';
const {InfluxDB, Point, HttpError} = require('@influxdata/influxdb-client')

const InfluxURL = 'https://us-west-2-1.aws.cloud2.influxdata.com'
const token = '<my token>=='
const org = '<my org>'
const bucket = '<bucket name>'
const writeApi = new InfluxDB({url: InfluxURL, token}).getWriteApi(org, bucket, 'ms')

module.exports.perf = function (event, context, callback) {
  context.callbackWaitsForEmptyEventLoop = false;
  let input = JSON.parse(event.Records[0].body);
  console.log(input)

  const point = new Point('elapsedTime')
    .tag(input.monitorID, 'monitorID')
    .floatField('elapsedTime', input.perf_value)
    // .timestamp(input.time)

  writeApi.writePoint(point)
  writeApi
    .close()
    .then(() => {
      console.log('Done')
    })
    .catch(e => {
      console.error(e)
      if (e instanceof HttpError && e.statusCode === 401) {
        console.log('Unauthorized request')
      }
      console.log('\nFinished ERROR')
    })
  return true
};
EDIT
I have still been unable to resolve the issue. I can get one data point into InfluxDB, and then nothing else shows up.
@Joshk132 -
I believe the problem is here:
writeApi
  .close() // <-- here
  .then(() => {
    console.log('Done')
  })
You are closing the API client object after the first write, so you are only able to write once. You can use flush() instead if you want to force sending the Point immediately.
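A minimal sketch of that change, reusing the InfluxDB client setup from the question (only the handler changes; flush() is part of the same WriteApi):

module.exports.perf = async function (event) {
  const input = JSON.parse(event.Records[0].body);

  // Note: Point.tag takes the tag name first, then the value.
  const point = new Point('elapsedTime')
    .tag('monitorID', input.monitorID)
    .floatField('elapsedTime', input.perf_value);

  writeApi.writePoint(point);

  try {
    // flush() sends the buffered points but keeps the WriteApi open,
    // so later invocations of a reused container can still write.
    await writeApi.flush();
    console.log('Done');
  } catch (e) {
    console.error(e);
    if (e instanceof HttpError && e.statusCode === 401) {
      console.log('Unauthorized request');
    }
  }
};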
I would like to know if it's possible to return data before my onCall function finishes.
Here is what I'm trying to do; the result data in my app is always equal to null:
exports.myFunction = functions.https.onCall((data, context) => {
  if (!context.auth || !context.auth.uid) {
    return {
      statut: "NOK",
    };
  }

  return Promise.all([promise1(), promise2()])
    .then(results => {
      const result1 = results[0].data();
      return FIRESTORE.collection('MyCollection')
        .add(result1)
        .then(results2 => {
          let queries = [];
          queries.push(
            function1234()
          );
          queries.push(
            FIRESTORE.collection('MyCollection2')
              .doc("123")
              .set({a: 123})
          );
          return Promise.all(queries)
            .then(r => {
              return { myReturn: "AAAA" };
            });
        })
        .then(r2 => {
          console.log('0000');
          return function9898();
        });
    });
});
This is my client-side function:
myFunction = () => {
  var myFunctionCloud = firebase.functions().httpsCallable('myFunction');
  myFunctionCloud().then(function (result) {
    console.log(JSON.stringify(result));
  })
}
My logs:
{"data":null}
I would like to know if it's possible to return data before my onCall function finishes.
No, it is not possible since returning data in a Callable Cloud Function indicates to the Cloud Functions platform that the Callable Cloud Function is finished and that it can clean up everything from that invocation. So in many cases, your Cloud Function will be terminated before your business logic is fully executed.
In some cases, the Cloud Functions platform may not terminate your Cloud Function immediately, giving enough time for your business logic to be fully executed, but this is not at all guaranteed. Relying on this specific case is not recommended and actually leads to some "erratic" behaviour.
One approach to immediately send back a result to the front-end and continue working in the background is to trigger a Pub/Sub Cloud Function (with all the relevant input in the message) and then return the value. In other words, you delegate the remaining work to another Cloud Function and stop the current Callable one by returning a response. A minimal sketch of this pattern is shown below.
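A hedged sketch of that delegation; the topic name and the worker function are hypothetical, and the returned fields reuse the ones from the question:

const functions = require('firebase-functions');
const {PubSub} = require('@google-cloud/pubsub');

const pubsub = new PubSub();

exports.myFunction = functions.https.onCall(async (data, context) => {
  if (!context.auth || !context.auth.uid) {
    return { statut: 'NOK' };
  }

  // Hand the long-running work off to a Pub/Sub-triggered function
  // and return to the client immediately.
  await pubsub.topic('my-background-work').publishMessage({
    json: { uid: context.auth.uid, payload: data },
  });

  return { myReturn: 'AAAA' };
});

// Hypothetical worker that does the heavy lifting in the background.
exports.myFunctionWorker = functions.pubsub
  .topic('my-background-work')
  .onPublish(async (message) => {
    const { uid, payload } = message.json;
    // ...perform the Firestore writes and other long-running work here...
  });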
I have a Cloud Firestore trigger that takes care of adjusting the balance of a user's wallet in my app.
exports.onCreateTransaction = functions.firestore
  .document('accounts/{accountId}/transactions/{transactionId}')
  .onCreate(async (snap, context) => {
    const { accountId, transactionId } = context.params;
    const transaction = snap.data();

    // See the implementation of alreadyTriggered in the next code block
    const alreadyTriggered = await firestoreHelpers.triggers.alreadyTriggered(context);
    if (alreadyTriggered) {
      return null;
    }

    if (transaction.status === 'confirmed') {
      const accountRef = firestore
        .collection('accounts')
        .doc(accountId);

      const account = (await accountRef.get()).data();
      const balance = transaction.type === 'deposit' ?
        account.balance + transaction.amount :
        account.balance - transaction.amount;

      await accountRef.update({ balance });
    }

    return snap.ref.update({ id: transactionId });
  });
As a trigger may actually be called more than once, I added this alreadyTriggered helper function:
const alreadyTriggered = (event) => {
  return firestore.runTransaction(async transaction => {
    const { eventId } = event;
    const metaEventRef = firestore.doc(`metaEvents/${eventId}`);
    const metaEvent = await transaction.get(metaEventRef);

    if (metaEvent.exists) {
      console.error(`Already triggered function for event: ${eventId}`);
      return true;
    } else {
      await transaction.set(metaEventRef, event);
      return false;
    }
  })
};
Most of the time everything works as expected. However, today I got a timeout error which caused data inconsistency in the database.
Function execution took 60005 ms, finished with status: 'timeout'
What was the reason behind this timeout? And how do I make sure that it never happens again, so that my transaction amounts are successfully reflected in the account balance?
That statement about more-than-once execution was a beta limitation, as stated. Cloud Functions is out of beta now. The current guarantee is at-most-once execution by default; you only get multiple possible events if you enable retries in the Cloud console. This is something you should do if you want to make sure your events are processed reliably.
The reason for the timeout may never be known for certain. There could be any number of reasons: perhaps there was a hiccup in the network, or a brief amount of downtime somewhere in the system. Retries are supposed to help you recover from these temporary situations by delivering the event potentially many times, so your function can eventually succeed.
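If you prefer to opt in from code rather than the console, firebase-functions also exposes a failurePolicy runtime option; a hedged sketch reusing the trigger path from the question (check that your firebase-functions version supports this option):

const functions = require('firebase-functions');

exports.onCreateTransaction = functions
  .runWith({ failurePolicy: true }) // enable retries for this background function
  .firestore
  .document('accounts/{accountId}/transactions/{transactionId}')
  .onCreate(async (snap, context) => {
    // Same body as in the question; with retries enabled, the alreadyTriggered
    // idempotency check becomes essential, since the same event can now be
    // delivered more than once.
  });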
I am a beginner with the Serverless Framework.
While studying Best Practices in Serverless (here), I have a question about "Initialize external services outside of your Lambda code".
How do I implement it?
For example, here is the code in handler.js:
const getOneUser = (event, callback) => {
  let response = null;

  // validate parameters
  if (event.accountid && process.env.SERVERLESS_SURVEYTABLE) {
    let docClient = new aws.DynamoDB.DocumentClient();

    let params = {
      TableName: process.env.SERVERLESS_USERTABLE,
      Key: {
        accountid: event.accountid,
      }
    };

    docClient.get(params, function(err, data) {
      if (err) {
        // console.error("Unable to get an item with the request: ", JSON.stringify(params), " along with error: ", JSON.stringify(err));
        return callback(getDynamoDBError(err), null);
      } else {
        if (data.Item) { // got response
          // compose response
          response = {
            accountid: data.Item.accountid,
            username: data.Item.username,
            email: data.Item.email,
            role: data.Item.role,
          };
          return callback(null, response);
        } else {
          // console.error("Unable to get an item with the request: ", JSON.stringify(params));
          return callback(new Error("404 Not Found: Unable to get an item with the request: " + JSON.stringify(params)), null);
        }
      }
    });
  }
  // incomplete parameters
  else {
    return callback(new Error("400 Bad Request: Missing parameters: " + JSON.stringify(event)), null);
  }
};
The question is: how do I initialize DynamoDB outside of my Lambda code?
Update 2:
Is the code below optimized?
Handler.js
let survey = require('./survey');

module.exports.handler = (event, context, callback) => {
  return survey.getOneSurvey({
    accountid: event.accountid,
    surveyid: event.surveyid
  }, callback);
};
survey.js
let aws = require('aws-sdk');
let docClient = new aws.DynamoDB.DocumentClient();

module.exports = (() => {
  const getOneSurvey = (event, callback) => {
    // ....
    docClient.get(params, function(err, data) {
      // ...
    });
    // ....
  };

  return {
    getOneSurvey: getOneSurvey,
  };
})();
Here's the quote in question:
Initialize external services outside of your Lambda code
When using services (like DynamoDB) make sure to initialize outside of your lambda code. Ex: module initializer (for Node), or to a static constructor (for Java). If you initiate a connection to DDB inside the Lambda function, that code will run on every invoke.
In other words, in the same file, but outside of -- before -- the actual handler code.
const aws = require('aws-sdk');
// Initialized once per container, outside the handler:
let docClient = new aws.DynamoDB.DocumentClient();

const getOneUser = (event, callback) => {
  // ... build params ...
  docClient.get(params, function(err, data) {
    // ... handle the response ...
  });
};
When the container starts, the code outside the handler runs. When subsequent function invocations reuse the same container, you save resources and time by not instantiating the external services again. Containers are often reused, but each container only handles one concurrent request at a time, and how often they are reused and for how long is outside your control... Unless you update the function, in which case any existing containers will no longer be reused, because they'd have the old version of the function.
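A small, hypothetical illustration of that reuse (the counter is not from the original code; it resets only on a cold start because it lives outside the handler):

const aws = require('aws-sdk');
const docClient = new aws.DynamoDB.DocumentClient(); // created once per container
let invocationCount = 0; // survives across invocations that share a container

exports.handler = (event, context, callback) => {
  invocationCount += 1;
  // Logs 1 on a cold start, then 2, 3, ... while the same container is reused.
  console.log(`Invocation ${invocationCount} in this container`);
  callback(null, { invocationCount });
};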
Your code will work as written, but isn't optimized.
The caveat with this approach in current-generation Node.js Lambda functions (Node 4.x/6.x) is that some objects -- notably, those that hold literal persistent connections to external services -- will prevent the event loop from becoming empty (a common example is a MySQL database connection, which holds a live TCP connection to the server; by contrast, a DynamoDB "connection" is actually connectionless, since its transport protocol is HTTPS). In this case you need to either take a different approach or allow Lambda not to wait for an empty event loop before freezing the container, by setting context.callbackWaitsForEmptyEventLoop to false before calling the callback... but only do this if needed and only if you fully understand what it means. Setting it by default because some guy on the Internet said it was a good idea will potentially bring you mysterious bugs later.
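For completeness, a hedged sketch of the persistent-connection case described above (the mysql config is a placeholder; only set the flag if you understand the trade-off):

const mysql = require('mysql');
// A real TCP connection held outside the handler keeps the event loop non-empty.
const connection = mysql.createConnection({ /* host, user, password, database */ });

exports.handler = (event, context, callback) => {
  // Without this, Lambda waits for the event loop to drain, and the open
  // connection makes the invocation hang until it times out.
  context.callbackWaitsForEmptyEventLoop = false;

  connection.query('SELECT 1', (err, results) => {
    callback(err, results && results[0]);
  });
};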