aws-xray captureAWS annotations - node.js

I've begun using AWS X-Ray to get more insight into why performance is not ideal in my Lambda function. This Lambda function runs a GraphQL service, meaning it makes lots of outbound requests to other Lambda functions as well as to DynamoDB for caching.
I've added tracing to all aws-sdk client calls by using the following in my handler. It mutates the imported AWS module so that all subsequent usage of AWS clients successfully includes aws-xray tracing, regardless of which module imports it. Awesome!
import AWS from 'aws-sdk';
import AWSXRay from 'aws-xray-sdk';
AWSXRay.captureAWS(AWS);
Here's an example of the output:
The Problem
The problem is that none of the traces have any annotations regarding the parameters of the requests. Both the annotations and the metadata of each trace are empty:
The Hope
The hope is that there is a way to configure the AWSXRay.captureAWS modifications so that they include the arguments of each AWS client request in the annotations or metadata.
The Question
Is it possible to make AWSXRay.captureAWS(AWS); include the parameters passed to the AWS SDK client invocations in either the annotations or the metadata of the traces it produces?

The resources section contains high-level arguments for some clients, e.g. the DynamoDB table name. Not all arguments are captured by default, because they may contain information that users do not wish to track in their traces, and they may also be verbose.
Opt-in capture of arbitrary API parameters is not currently available in the X-Ray SDK. As a workaround, I would suggest that you wrap your SDK calls in a local subsegment and record the parameters you want to capture as annotations or metadata on that subsegment. Let me know if you need any help locating the docs for creating your own subsegments.
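The suggested workaround can be sketched as a small wrapper. This is a hedged sketch, not behavior the X-Ray SDK provides: the helper name, its argument names, and the subsegment name are my own; the `addNewSubsegment`, `addAnnotation`, `addMetadata`, and `close` calls mirror the aws-xray-sdk segment/subsegment API.

```javascript
// Sketch: wrap an SDK call in a manual subsegment that records the request
// parameters, since captureAWS does not capture them. The parent segment is
// injected so the helper is easy to test; in a Lambda handler you would pass
// AWSXRay.getSegment().
async function callWithParams(segment, name, params, call) {
  const subsegment = segment.addNewSubsegment(name);
  subsegment.addAnnotation('operation', name); // annotations are indexed and searchable
  subsegment.addMetadata('params', params);    // metadata is free-form, not indexed
  try {
    const result = await call(params);
    subsegment.close();
    return result;
  } catch (err) {
    subsegment.close(err); // records the error on the subsegment
    throw err;
  }
}
```

It would be called like `callWithParams(AWSXRay.getSegment(), 'DynamoDB.query', queryParams, p => docClient.query(p).promise())`.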

Related

What do I put in the "handler" prop of aws-cdk's gateway.LambdaRestApi?

I'm building out one of my first gateway APIs and have been reading the code and documentation here.
For an API gateway that I made using the LambdaRestApi construct, my understanding was that I define the endpoints and the Lambda attached to those endpoints.
If that's the case, what do I put as the function's handler? I don't have any plans for there to be a base route, so do I just have to put a blank Lambda here? Or am I going in the wrong direction with my thinking?
LambdaRestApi is just a utility construct based on RestApi with a defaultIntegration to a Lambda function.
You can use a plain RestApi instead if you don't need the default proxy integration to a Lambda handler.
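For illustration, here is a hedged Python CDK sketch of both options; `my_handler` and `other_fn` are assumed to be `aws_lambda.Function` objects defined elsewhere, and `self` is assumed to be the enclosing Stack:

```python
from aws_cdk import aws_apigateway as apigateway

# LambdaRestApi with proxy=False: the handler backs only the routes you
# define, and there is no catch-all proxy route.
api = apigateway.LambdaRestApi(
    self, "MyApi",
    handler=my_handler,  # assumed aws_lambda.Function
    proxy=False,
)
user = api.root.add_resource("user")
user.add_method("GET")  # GET /user -> my_handler

# Plain RestApi: wire each integration explicitly instead.
plain_api = apigateway.RestApi(self, "MyPlainApi")
plain_user = plain_api.root.add_resource("user")
plain_user.add_method("GET", apigateway.LambdaIntegration(other_fn))
```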

Building a jump-table for boto3 clients/methods

I'm trying to build a jump table of API methods for a variety of boto3 clients, so I can pass an AWS service name and an authenticated/authorized low-level boto3 client to my utility code and execute the appropriate method to get a list of resources from that AWS service.
I'm not willing to hand-code and maintain a massive if..elif..else statement with >100 clauses.
I have a dictionary of service names (keys) and API method names (values), like this:
jumpTable = { 'lambda' : 'list_functions' }
I'm passed the service name ('lambda') and a boto3 client object ('client') already connected to the right service in the region and account I need.
I use the dict's get() to find the method name for the service, and then use a standard getattr() on the boto3 client object to get a method reference for the desired API call (which of course varies from service to service):
apimethod = jumpTable.get(service)
methodptr = getattr(client, apimethod)
Sanity-checking says I've got a "botocore.client.Lambda object" for 'client' (which looks OK to me) and a "bound method ClientCreator._create_api_method.<locals>._api_call of <botocore.client.Lambda object ...>" for methodptr, which reports itself as of type 'method'.
None of the API methods I'm using require arguments. When I invoke the method reference directly:
response = methodptr()
it raises a boto3 ClientError, while invoking it through the client:
response = client.methodptr()
raises a boto3 AttributeError.
Where am I going wrong here?
I'm locked into boto3, Python 3, and AWS, and I have to talk to hundreds of AWS services, each of which has a different API method that provides the data I need to gather. To an old C coder, a jump table seems obvious; a more Pythonic approach would be welcome...
The following works for me:
client = boto3.Session().client("lambda")
methodptr = getattr(client, apimethod)
methodptr()
Note that the boto3.Session() part is required. When calling boto3.client(..) directly, I get an 'UnrecognizedClientException' exception.
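The dict-plus-getattr dispatch above can be wrapped in one helper. A minimal sketch, with the lookup failure made explicit; the service/method pairs shown are just examples to extend per service:

```python
# Jump table mapping service names to the method that lists that
# service's resources; extend this dict as services are added.
JUMP_TABLE = {
    "lambda": "list_functions",
    "s3": "list_buckets",
    "ec2": "describe_instances",
}

def list_resources(service, client):
    """Resolve and invoke the list method for `service` on `client`."""
    method_name = JUMP_TABLE.get(service)
    if method_name is None:
        raise KeyError(f"no list method registered for service {service!r}")
    # botocore generates client methods at runtime, so getattr is the
    # idiomatic way to look them up by name
    return getattr(client, method_name)()
```

It would be called as `list_resources("lambda", boto3.Session().client("lambda"))`.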

Is there AWS CDK code available to enable WAF logging to a Kinesis Firehose delivery stream?

Does anyone have Python CDK code to enable Amazon Kinesis Data Firehose delivery stream logging in WAF? CDK code in any language is fine for my reference, as I didn't find any proper syntax or examples in the official Python CDK/API documentation nor in any blog.
From the existing documentation (as of CDK version 1.101 and, by extension, CloudFormation) there seems to be no way of doing this out of the box.
But there is an API call which can be utilized with boto3, for example: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/wafv2.html#WAFV2.Client.put_logging_configuration
What you need in order to invoke the call:
the ResourceArn of the web ACL
a list of Kinesis Data Firehose ARN(s) which should receive the logs
This means that you can use a custom resource to implement this behavior. Given you have created the Firehose and web ACL earlier in the stack, use this to create the custom resource:
https://docs.aws.amazon.com/cdk/api/latest/python/aws_cdk.custom_resources/README.html
crd_provider = custom_resources.Provider(
    self,
    "Custom resource provider",
    on_event_handler=on_event_handler,
    log_retention=aws_logs.RetentionDays.ONE_DAY
)
custom_resource = core.CustomResource(
    self,
    "WAF logging configurator",
    service_token=crd_provider.service_token,
    properties={
        "ResourceArn": waf_rule.attr_arn,
        "FirehoseARN": firehose.attr_arn
    }
)
on_event_handler in this case is a Lambda function which you need to implement.
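A minimal sketch of what that handler could look like, using boto3's wafv2 client (untested assumption, not the answerer's code); the property names match the `properties` passed to the CustomResource above, and the client is injectable so the logic can be unit-tested:

```python
def on_event(event, context, wafv2_client=None):
    """Custom-resource handler sketch: attach/remove a Firehose log
    destination on a web ACL."""
    if wafv2_client is None:
        import boto3  # deferred import so the handler can be tested with a stub
        wafv2_client = boto3.client("wafv2")
    props = event["ResourceProperties"]
    if event["RequestType"] in ("Create", "Update"):
        wafv2_client.put_logging_configuration(
            LoggingConfiguration={
                "ResourceArn": props["ResourceArn"],
                "LogDestinationConfigs": [props["FirehoseARN"]],
            }
        )
    elif event["RequestType"] == "Delete":
        wafv2_client.delete_logging_configuration(
            ResourceArn=props["ResourceArn"]
        )
    return {"PhysicalResourceId": props["ResourceArn"] + "-waf-logging"}
```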
It should be possible to simplify this further by using AwsSdkCall:
on_event_handler = AwsSdkCall(
    action='PutLoggingConfiguration',
    service='WAFV2',
    parameters={
        'LoggingConfiguration': {
            'ResourceArn': waf_rule.attr_arn,
            'LogDestinationConfigs': [
                firehose.attr_arn,
            ],
        },
    },
)
This way you don't need to write your own Lambda. But your use case might change and you might want to add some extra functionality to this logging configurator, so I'm showing both approaches.
Disclaimer: I haven't tested this exact code; rather, it is an excerpt of similar code I wrote to solve a similar problem of circumventing a gap in CloudFormation coverage.
I don't have a Python CDK example, but I had it working in TypeScript using CfnDeliveryStream and CfnLoggingConfiguration. I would imagine you can find the matching classes in the Python CDK.

Logging for Azure Function in python with SEQ

I'm working on an Azure Function (durable function) that implements an HTTP trigger. All it does is wait for an HTTP call from the backend that shares a link to a blob storage object (an image) so it can be processed by the function. I need to implement a reliable logging solution using SEQ, which is being used for other projects in our company (mostly .NET).
Using the official documentation from here, all I'm receiving in the SEQ console is a stream of unstructured events, and it's hard to tell where and when the processing starts, how long it took, etc. That makes it impossible to troubleshoot.
With .NET projects we were using Serilog, which allows you to write so-called enrichers and filters so you can structure the logs and capture the information that is really needed, including call performance (e.g. elapsed time). I don't see anything even close to that available for Python 3. Can anyone suggest where to start? What's the best approach to capture the events I'm looking for?
Thanks.
OK, here's the answer:
Install the lib called seqlog via requirements.txt for this purpose.
In the Python script where you plan to use the logger, import the respective namespace, i.e. import logging.
Define the SEQ configuration in a JSON file (something like this):
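The original post did not include the config file itself; a hypothetical shape matching the loading code below (the key names are my own assumptions, since the file is read manually with json.load):

```json
{
  "SEQ_SERVER_URL": "http://my-seq-server:5341",
  "SEQ_API_KEY": "<api-key>",
  "LOG_LEVEL": "INFO"
}
```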
In __init__.py, load the SEQ config:
with open('./seq.config.json', 'r') as f:
    seq_config = json.load(f)
Use the logging object to stream the logs to SEQ:
logging.info("[" + obj.status + "] >> Data has been processed!")
Enjoy the logs posted to the SEQ console.
P.S. If you're debugging locally, set http://localhost:<port> in seq.config.json instead of the remote console address.
Hope this info will help someone.
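To get structured rather than plain-text events into SEQ (the original complaint), seqlog can also be configured programmatically and supports named holes in the message template, which arrive in SEQ as searchable properties. A sketch, with the server URL and API key as placeholders:

```python
import logging
import seqlog

# Placeholders: point these at your SEQ server
seqlog.log_to_seq(
    server_url="http://localhost:5341",
    api_key="<your-api-key>",
    batch_size=10,
    auto_flush_timeout=2,  # seconds
    override_root_logger=True,
)

# Named holes become structured properties on the SEQ event,
# so you can filter on blob_url or elapsed_ms in the console.
logging.info(
    "Processed blob {blob_url} in {elapsed_ms} ms",
    blob_url="https://example.blob.core.windows.net/images/img.png",
    elapsed_ms=123,
)
```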

Node typescript library environment specific configuration

I am new to Node and TypeScript. I am working on developing a Node library that reaches out to another REST API to get and post data. This library is consumed by any UI application to send and receive data from the API service. Now my question is: how do I maintain environment-specific configuration within the library? For example:
Consumer calls GET /user
The user endpoint on the consumer side calls a method in the library to get data
But if the consumer is calling the user endpoint in the test environment, I want the library to hit the following API URL:
for test: http://api.test.userinformation.company.com/user
for beta: http://api.beta.userinformation.company.com/user
As far as I understand, the library is just a reference and runs within the consumer application. The library can certainly get the environment from the consumer, but I do not want the consumer to have to specify the full URL to hit, since figuring that out should be the library's responsibility.
Note: the URL is not the only problem; I could solve that with an environment switch within the library. I also have some client secrets that vary by environment, which I can neither store in the code nor check in to source control.
Additional Information
(as per jfriend00's request in comments)
My library has a LibExecutionEngine class and one method in it, which is the entry point of the library:
export class LibExecutionEngine implements ExecutionEngine {
  constructor(
    private environment: Environments,
    private userLoader: UserLoader
  ) {}

  async GetUserInfo(
    userId: string,
    userGroupVersion: string
  ): Promise<UserInfo> {
    return this.userLoader.loadUserInfo(userId, userGroupVersion)
  }
}

export interface ExecutionEngine {
  GetUserInfo(userId: string, userGroupVersion: string): Promise<UserInfo>
}
The consumer starts using the library by creating an instance of LibExecutionEngine and then calling GetUserInfo, for example. As you can see, the constructor for the class accepts an environment. Once I have the environment in the library, I need to somehow load the values for the keys API URL, APIClientId, and APIClientSecret from within the constructor. I know of a few ways to do this:
Option 1
I could do something like this._configLoader.SetConfigVariables(environment), where configLoader.ts is a class that loads the specific configuration values from files ({environment}.json). But this would mean maintaining the above-mentioned URL variables and the respective clientId and clientSecret in a JSON file, which I should not be checking in to source control.
Option 2
I could use the dotenv npm package and create one .env file where I define the three keys, with the values stored in the deployment configuration. That works perfectly for an independently deployable application, but this is a library and doesn't run by itself in any environment.
Option 3
Accept a configuration object from the consumer, which means the consumer of the library provides the URL, clientId, and clientSecret based on the environment. But why should the responsibility of maintaining the variables the library needs be put on the consumer?
Please suggest on how best to implement this.
So, I think I got some clarity. Let's call my library L, the consuming app C1, and the API that the library calls out to for user info A. All are internal applications in our org and have an OAuth setup to communicate; our infosec team provides the client IDs and secrets to individual applications. So my clarity here is: C1 would request its own clientId and clientSecret to hit A's URL, and C1 would then pass the three config values to the library, which the library uses to communicate with A. The same applies for some C2 in the future.
That would mean that L somehow needs to accept a full configuration object, with all required config values, from its consumers C1, C2, etc.
Yes, that sounds like the proper approach. The library is just some code doing what it's told. It's the client in this case that has to fetch the clientId and clientSecret from the infosec team, maintain them, and keep them safe, and the client also has the URL that goes with them. So the client passes all of this into your library, ideally just once per instance, and you then keep it in your instance data for the duration of that instance.
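Option 3 can be sketched in TypeScript. The `LibConfig` shape and `endpointFor` are illustrative names of mine, not part of the original library:

```typescript
// Illustrative config shape: the consumer supplies all three values,
// sourced from its own deployment configuration or secret store.
export interface LibConfig {
  apiUrl: string;
  clientId: string;
  clientSecret: string;
}

export class LibExecutionEngine {
  constructor(private readonly config: LibConfig) {}

  // The library owns the path layout; the consumer owns the base URL.
  endpointFor(path: string): string {
    return `${this.config.apiUrl}${path}`;
  }
}
```

C1 would construct it once, e.g. `new LibExecutionEngine({ apiUrl: "http://api.test.userinformation.company.com", clientId, clientSecret })`, and keep that instance for the life of the app.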
