How to modify error logs in google cloud functions? - python-3.x

I am using Google Cloud Functions with Python. I want to format all logs with some additional data, e.g. a customer id. I achieved this without any problem using the Stackdriver Logging library with a CloudLoggingHandler. In the same manner, I would also like to add this information to uncaught error logs and tracebacks.
I tried to modify sys.excepthook and sys.stderr, but it did not work; they are probably handled exclusively by Cloud Functions.
Is there any way I can modify uncaught exceptions or handled errors, e.g. by using Stackdriver Error Reporting? Or do you have an alternative solution for this (without catching all exceptions)?

Cloud Functions provides the (currently) highest level of abstraction for code execution. The philosophy is that you bring the code that implements your desired logic, and Cloud Functions provides the environment in which it executes. This has pluses and minuses.
The biggest plus is that you have the very least to concern yourself with in order to get the execution you desire.
On the other hand, you have very little in the way of operational control; by design, Cloud Functions takes care of the operational side for you.
As a consequence, if you want more control over the environment at the cost of doing more "work" yourself, I suggest you look at Cloud Run. In Cloud Run, you package your application logic as a Docker container and then ask Cloud Run to take care of executing that logic. In your container, you can do anything you want, including using technology such as Stackdriver Logging and defining a CloudLoggingHandler. Cloud Run then takes care of your scaling and execution environment from there.
To sum up, the answer is "No", you don't have control over error logs in Cloud Functions, but you can achieve your desired outcome by leveraging Cloud Run instead.
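For illustration, a rough sketch of what that can look like inside a Cloud Run container, assuming the google-cloud-logging package (the log name and the extra context are only placeholders):

import logging
import sys

from google.cloud import logging as cloud_logging
from google.cloud.logging.handlers import CloudLoggingHandler

client = cloud_logging.Client()
handler = CloudLoggingHandler(client, name="my-service")  # log name is a placeholder

logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.addHandler(handler)

def log_uncaught(exc_type, exc_value, exc_tb):
    # Attach whatever extra context you need (e.g. a customer id) before logging.
    logger.error("Uncaught exception", exc_info=(exc_type, exc_value, exc_tb))

sys.excepthook = log_uncaught  # honoured here because you own the process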

Although you cannot override sys.excepthook, you might do the following:
To provide a little context, I organize my code structure similar to what is presented here: https://code.luasoftware.com/tutorials/google-cloud-functions/structure-for-google-cloud-functions-development-and-split-multiple-file/
Google stores the name of your function's entry point in an env var called X_GOOGLE_ENTRY_POINT. You can use Python's dynamic nature to override that function and wrap it much like a decorator: wrap it in a try/except block and then run whatever code you want in the handler. I tried using sys.modules[__name__] but it didn't work, so I went for locals().
I have the function code defined in test_logging.py:

import os
from app.functions.test_logging import *

and after importing I do the following:

fn = os.getenv('X_GOOGLE_ENTRY_POINT')
lcl = locals()

def decorate(func):
    def run(*args, **kwargs):
        try:
            # return the wrapped function's result so HTTP responses still work
            return func(*args, **kwargs)
        except Exception as e:
            '''Do whatever you want to do here'''
    return run

lcl[fn] = decorate(lcl[fn])
It's hackish, but it's working for me, and it's particularly useful because I keep my code base inside the functions folder and basically don't need to touch main.py, which makes this very flexible. You can even re-raise the error after handling the exception if you want GCP to know the invocation failed and possibly retry it.

Use Google Cloud Secrets when initializing code

I have this code to retrieve the secrets:

import {SecretManagerServiceClient} from "@google-cloud/secret-manager";

const client = new SecretManagerServiceClient();

async function getSecret(secret: string, version = "latest") {
    const projectID = process.env.GOOGLE_CLOUD_PROJECT;
    const [vs] = await client.accessSecretVersion({
        name: `projects/${projectID}/secrets/${secret}/versions/${version}`
    });
    const secretValue = JSON.parse(vs.payload.data.toString());
    return secretValue;
}

export {getSecret};
I would like to replace process.env.SENTRY_DNS with await getSecret("SENTRY_DNS"), but I can't use await outside an async function.
Sentry.init({
    dsn: process.env.SENTRY_DNS,
    environment: Config.isBeta ? "Beta" : "Main"
});

function sentryCreateError(message, contexts) {
    Sentry.captureMessage(message, {
        level: "error", // one of 'info', 'warning', or 'error'
        contexts
    });
}
What are the best practices with Google Secrets? Should I be loading the secrets once in a "config" file and then calling the values from there? If so, I'm not sure how to do that; do you have an example?
Leaving aside your code example (I don't work with JS anyway), I would think about a few different questions, the answers to which may affect the design. For example:
1. Where is this code executed? Compute Engine, App Engine, Cloud Run, Kubernetes, a cloud function, and so on. Depending on the answer, the approach to storing secrets might be different.
Suppose, for example, that it is going to be a cloud function. The next questions:
2. Would you prefer to store the secret values in environment variables, or in the Secret Manager? The first option is faster, but less secure, as, for instance, everybody who has access to the cloud function details in the console can see those environment variable values.
3. Load the secret values into memory on initialization, or on every invocation? The first option is faster, but might cause some issues if the secret values are modified (a gradual replacement of old values with new ones, as some instances are terminated and new instances are initialized).
The second option may need some additional discussion. It might be possible to get the values asynchronously. In what circumstances might that be useful? I think only when your code has something else to do while waiting for the secret values, which are required to do (probably) the main job of the cloud function. How much can we shave off that way? Probably a few milliseconds spent on the Secret Manager API call. Any drawbacks? Code complexity, as somebody has to maintain the code in the future. Does that performance gain still outweigh the cost? We can probably return to item 2 in the list above and think about storing secrets in environment variables in that case.
What about the first option? Again, if performance is the priority, return to item 2 above; otherwise, are code simplicity and maintainability the priority, so that we don't need any asynchronous work here? Maybe the answer to that question depends on the skills, knowledge and budget of your company/team rather than on technical preferences.
About the "config" file for storing the secret values: while it is possible to store data in the pseudo "/tmp" directory (actually in the memory of a cloud function) during the cloud function's execution, we should not expect that data to be preserved between cloud function invocations. Thus, we come back to either environment variables (see item 2 above) or some other remote place with API access. I don't know of many services with better latency than the Secret Manager that could be used as a cache for storing secrets. Suppose we find such a service; then we get the performance vs. complexity/maintainability dilemma again.
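To make the "load once on initialization" option concrete, here is a rough sketch of the caching pattern. It is shown in Python (I don't work with JS), it assumes the google-cloud-secret-manager package, and the names are only illustrative; the same idea translates directly to the getSecret function from the question.

import os
from google.cloud import secretmanager

_client = secretmanager.SecretManagerServiceClient()
_cache = {}

def get_secret(name, version="latest"):
    # Fetched once per instance on first use, then served from memory.
    key = (name, version)
    if key not in _cache:
        project = os.environ["GOOGLE_CLOUD_PROJECT"]
        path = f"projects/{project}/secrets/{name}/versions/{version}"
        response = _client.access_secret_version(name=path)
        _cache[key] = response.payload.data.decode("utf-8")
    return _cache[key]

The trade-off is the one described in item 3: faster invocations, at the cost of possibly stale values until an instance is recycled.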
Some concluding notes: my context, experience, budget and requirements may be completely different from yours, and my assumptions (i.e. that the code is for a cloud function) can be completely wrong as well. Thus, I would suggest reading the above with some criticism and using only the ideas that are relevant to your specific situation.

Overriding a built-in class method that another class I want to inherit from relies on

So I'm writing a script/application that uses Python's multiprocessing BaseManager class. For the most part it works great. The only issue is that I use serve_forever() as a blocking statement and then continue onwards; however, when I want to terminate or exit out of serve_forever(), it automatically exits and terminates the application, but as I mentioned I have some more things I want to take care of before I completely exit.
I can exit out of serve_forever() by setting a stop event with stop_event.set(). Now this is all well and dandy; however, according to the source (https://github.com/python/cpython/blob/3.6/Lib/multiprocessing/managers.py#L147), serve_forever explicitly calls sys.exit(0) and is part of the Server class that BaseManager uses within its definition. Essentially I would like to remove that line (sys.exit(0)). How would I accomplish this?
When I search, I'm coming up with results such as monkey patching. Can I just subclass the Server class, explicitly define serve_forever to be the exact same code but without the sys.exit(0) line, and call it a day? Something tells me that is not going to work. Do I subclass Server AND BaseManager?
Thanks!
Attempting to monkey-patch or inherit from internal classes will result in code that will not be compatible across Python releases, not even patch releases.
On top of that, these solutions will be unnecessarily complex and complicated, and are generally frowned upon.
I highly suggest re-implementing serve_forever() by using the start() method together with an event. Waiting for the event to be set or, if that is impossible, looping to check whether the manager is still alive, will be much easier and a better solution in almost all the aspects I can think of.
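A minimal sketch of that idea (the address, authkey and registered name are only placeholders):

import threading
from multiprocessing.managers import BaseManager

# The Event lives in the manager's server process; everything else,
# including the main process, talks to it through a proxy.
_stop_event = threading.Event()

def _get_stop_event():
    return _stop_event

class MyManager(BaseManager):
    pass

MyManager.register("get_stop_event", callable=_get_stop_event)

if __name__ == "__main__":
    manager = MyManager(address=("127.0.0.1", 50000), authkey=b"secret")
    manager.start()                   # serves in a background process, does not block
    manager.get_stop_event().wait()   # block here instead of serve_forever()
    # ... do whatever cleanup you still need ...
    manager.shutdown()                # stop the manager's server process

A client can connect with the same registration and call manager.get_stop_event().set() when it wants the main process to move on to its cleanup.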
After discussing in chat, we realised the easiest approach is just to suppress the SystemExit being thrown from sys.exit(); I'm opening a report on the CPython bug tracker about that sys.exit() accordingly. Do keep in mind the server will not actually shut down, as it runs on a different thread. The whole recommendation of using .get_server().serve_forever() in the stdlib looks dubious at best.
If you wish to shut the server down immediately, call Server.listener.close() after catching the exception.
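A rough sketch of that, assuming the stop event from the question is what makes serve_forever() return:

server = manager.get_server()
try:
    server.serve_forever()    # raises SystemExit once the stop event is set
except SystemExit:
    pass
# ... run the remaining shutdown work here ...
server.listener.close()       # optional: close the listening socket right away (internal attribute)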

Azure durable entity or static variables?

Question: Is it thread-safe to use static variables (as a shared storage between orchestrations) or better to save/retrieve data to durable-entity?
There are a couple of Azure functions in the same namespace: a hub trigger, a durable entity, 2 orchestrations (the main process and one that monitors the whole process) and an activity.
They all need some shared variables. In my case I need to know the number of main orchestration instances (to start a new one or hold on). That is done in another orchestration (the monitor).
I've tried both options and ask because I see different results.
Static variables: in my case there is a generic List<SomeMyType>, where SomeMyType holds the id of the task, its state, the number of attempts, the records it processed and other info.
When I need to start a new orchestration I call List.Add(); when I need to retrieve and modify an item I use a simple List.First(id_of_the_task). With First() I know for sure the needed task is there.
With static variables I sometimes see that tasks become duplicated for some reason. I retrieve the task with List.First(id_of_the_task), change something on the result variable and that is it. Not a lot of code.
Durable entity: the major difference is that I keep the List on a durable entity, and each time I need to retrieve or store it I call .CallEntityAsync("getTask") and .CallEntityAsync("saveTask"), which might slow down the app.
With this approach more code and more calls are required; however, it looks more stable, and I don't see any duplicates.
Please advise.
I can't answer why you would see duplicates with the static variables approach without seeing the code; maybe because List is not thread safe and you need a ConcurrentBag, but I'm not sure. One issue with static variables is whether the function app is always on and whether it can have multiple instances: when the function unloads (or crashes) the state is lost, and static variables are not shared across instances either, so under high load (when there can be many instances) it won't work.
Durable entities seem better here. They can be shared across many concurrent function instances, and each entity can only execute one operation at a time, so they are for sure the better option. The performance cost is a bit higher, but they should not be much slower than orchestrators, since they perform largely the same operations: writing to Table Storage, checking for events, etc.
I can't say if it's right for you, but instead of List.First(id_of_the_task) you should just be able to access the orchestrations' properties through the client, which can hold custom data. Another idea, depending on the usage, is that you may be able to query Table Storage directly with the CloudTable class for information about the running orchestrations.
Although not entirely related, you can also look at some settings for parallelism for durable functions: Azure (Durable) Functions - Managing parallelism.
Please ask if I should clarify anything or if I misunderstood your question.

node.js express custom format debug logging

A seemingly simple question, but I am unsure of the node.js equivalent to what I'm used to (say from Python, or LAMP), and I actually think there may not be one.
Problem statement: I want to use basic, simple logging in my express app. Maybe I want to output DEBUG messages, or INFO messages, or just some stats to the log for consumption by other back-end systems later.
1) I want all log messages, however, to contain some fields: the remote IP and request URL, for example.
2) On the other hand, code that logs is everywhere in my app, including deep inside the call tree.
3) I don't want to pass (req,res) down into every node in the call tree (this just creates a lot of parameter passing where they are mostly not needed, and complicates my code, as I need to pass these into async callbacks and timeouts etc.)
In other systems, where there is a thread per request, I will store the (req,res) pair (where all the data I need is) in a thread-local-storage, and the logger will read this and format the message.
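For concreteness, the pattern I mean looks roughly like this in Python (the names are only illustrative):

import logging
import threading

# Per-thread storage for the current request's context.
_request_context = threading.local()

class RequestContextFilter(logging.Filter):
    def filter(self, record):
        # The logger pulls the fields out of thread-local storage.
        record.remote_ip = getattr(_request_context, "remote_ip", "-")
        record.url = getattr(_request_context, "url", "-")
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(remote_ip)s %(url)s %(levelname)s %(message)s"))
handler.addFilter(RequestContextFilter())
logging.getLogger().addHandler(handler)
logging.getLogger().setLevel(logging.DEBUG)

def handle_request(remote_ip, url):
    # Set once at the top of the request; any log call deeper in the
    # call tree picks these fields up without (req, res) being passed down.
    _request_context.remote_ip = remote_ip
    _request_context.url = url
    logging.debug("doing some work deep in the call tree")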
In node, there is only one thread. What is my alternative here? What's "the request context in which a specific piece of code is running under"?
The only way I can think of to achieve something like this is by looking at a trace and using reflection to inspect local variables up the call tree. I hate that, plus I would need to implement it for all callbacks, setTimeouts, setIntervals, new Function()'s, eval's, ... and the list goes on.
What are other people doing?

How to call custom async code to initialize a Node.io Job once (before successive calls to input())?

I just discovered Node.io and have gone through the docs, API, etc., and it looks great. However, building my first job (exports.job = new nodeio.Job(..), with methods like input, run, output, reduce, complete), I'm in need of some kind of initialize() method which is called once, before successive calls to input() are made. (Similar to how complete is called once just before the job is finished.)
Any such method around?
For completeness:
This code, imho, has to be part of the node.io flow (through some dedicated method), since initializing my async code outside of the node.io scope doesn't guarantee the data is already there before the node.io job is executed.
I don't know if there is such a method; have you browsed through the source? It looks like there is an 'init' method that is called by the processor if it exists on the job. If you try that and it isn't what you're looking for, you could suggest this as a feature on the node.io GitHub site.
Otherwise, this would be a very simple thing to add yourself. Just add an 'initialize' method to your object, and then put the following lines at the top of your 'input' or 'run' method (which I think would probably work better if you need the data to be ready already):
if (!this.initialized) {
    this.initialized = true;
    this.initialize();
}
Note that there is a tiny performance hit here, of course. But in most cases it's only the cost of checking the value of one variable, which is probably minimal compared to the amount of processing you actually need to do.
