In a hobby side project I am creating an online game where users play a card game by implementing strategies. A user can submit their code, and it will play against other users' strategies. Once the code has been submitted, it needs to run on the server side.
I decided that I want to isolate code execution of user submitted code into an AWS lambda function. I want to prevent the code from stealing my AWS credentials, mining cryptocurrency and doing other harmful activity.
My plan is to do the following:
Limit code execution time
Prevent any communication to the internet & internal services (except through the return value).
Have a review process in place that prevents execution of user-submitted code until it is considered harmless.
Now I need your advice on how to achieve the best isolation:
How do I configure my function, so that it has no internet access?
How do I configure my function, so that it has no access to my internal services?
Do you see any other possible attack vector?
How do I configure my function, so that it has no internet access?
Launch the function into an isolated private subnet within a VPC.
How do I configure my function, so that it has no access to my internal services?
By launching the function inside the isolated private subnet, you can control which services it has access to via its security groups, the route table attached to the subnet, and AWS network ACLs.
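For illustration, here is a minimal boto3 sketch of that setup. The VPC, subnet and function names are hypothetical placeholders; the key points are a security group whose default allow-all egress rule has been revoked, and a subnet with no route to a NAT or internet gateway.

    import boto3

    ec2 = boto3.client("ec2")
    lam = boto3.client("lambda")

    # Hypothetical IDs -- replace with your own.
    VPC_ID = "vpc-0123456789abcdef0"
    PRIVATE_SUBNET_ID = "subnet-0123456789abcdef0"  # no NAT/IGW route

    # Fresh security group; revoke the default allow-all egress rule so the
    # function cannot open any outbound connection at all.
    sg = ec2.create_security_group(
        GroupName="sandbox-lambda-no-egress",
        Description="No outbound traffic for sandboxed Lambda",
        VpcId=VPC_ID,
    )["GroupId"]
    ec2.revoke_security_group_egress(
        GroupId=sg,
        IpPermissions=[{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
    )

    # Attach the function to the isolated subnet with the locked-down group.
    lam.update_function_configuration(
        FunctionName="user-strategy-sandbox",
        VpcConfig={"SubnetIds": [PRIVATE_SUBNET_ID], "SecurityGroupIds": [sg]},
    )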
Do you see any other possible attack vector?
There could be multiple attack vectors here:
I'll try to answer from the perspective of securing the AWS services involved. The most important step is to set up AWS billing alerts, so that if there is trouble you are at least notified and can take the necessary action; I am also assuming you already have MFA set up for your logins.
Make sure you configure your Lambda function with a least-privilege IAM role (a sketch follows this list).
Create a completely separate subnet dedicated to launching the Lambda function.
Create a security group for the Lambda function and use it to control the function's access to the other services in your solution.
Have a separate route table for the subnet in which you allow only the selected services, and be very specific with the corresponding IP addresses.
Use network ACLs on the subnet to restrict all outgoing traffic as an additional layer of defence.
Enable VPC Flow Logs, have the necessary Athena queries ready for analysis, and add alerts using Amazon CloudWatch.
The list can be very long when you want to secure this deployment fully in AWS; I have added just a few items.
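As a sketch of the least-privilege role mentioned above (the role and function names are hypothetical), the role trusts only the Lambda service and carries nothing beyond the managed policy a VPC-attached function needs for ENI setup and log writing:

    import json
    import boto3

    iam = boto3.client("iam")

    # Trust policy: only the Lambda service may assume this role.
    trust = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }

    iam.create_role(
        RoleName="sandbox-lambda-minimal",  # hypothetical name
        AssumeRolePolicyDocument=json.dumps(trust),
    )

    # Only what a VPC-attached function needs to start and write logs;
    # deliberately nothing else (no S3, no DynamoDB, no other APIs).
    iam.attach_role_policy(
        RoleName="sandbox-lambda-minimal",
        PolicyArn="arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole",
    )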
I'd start by saying this is very risky, and allowing people to run their own code on your infrastructure can be very dangerous. That said, here are a few things:
Limiting Code Execution Time
This is already built into Lambda. Functions have a configurable timeout, which you can set easily through IaC, the AWS Console or the CLI.
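For example, with boto3 (the function name is a placeholder):

    import boto3

    # Cap execution at a few seconds; Lambda kills the invocation itself.
    boto3.client("lambda").update_function_configuration(
        FunctionName="user-strategy-sandbox",  # hypothetical
        Timeout=5,        # seconds
        MemorySize=128,   # small memory also limits the CPU share
    )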
Restricting Internet Access
By default, Lambda functions can be thought of as existing outside the constraints of a VPC for most applications, and they therefore have internet access. You could put your Lambda function inside a private subnet in a VPC and then configure the networking so that no outbound connections are allowed except to locations you choose.
Restricting Access to Other Services
Assuming that you are referring to AWS services here: Lambda functions are bound by IAM roles with respect to the other AWS services they can access. As long as you don't grant the Lambda function access to something in its IAM role, it won't be able to access those services, unless a potentially malicious user provides credentials by some other means, such as putting them in plain text in the code, where an AWS SDK implementation could pick them up.
If you are referring to other internal services such as EC2 instances or ECS services then you can restrict access using the correct network configuration and putting your function in a VPC.
Do you see any other possible attack vector?
It's hard to say for sure. I'd really advise against this completely without taking some professional (and likely paid and insured) advice. There are new attack vectors that can open up or be discovered daily and therefore any advice now may completely change tomorrow if a new vulnerability is discovered.
I think your best bets are:
Restrict the function timeout to be as low as physically possible (allowing for cold starts).
Minimise the IAM policy for the function as far as humanly possible. Be careful with logging: I assume you'll want some logs, but you don't want someone pumping GBs of data into your CloudWatch Logs.
Restrict the language used so you are using one language that you're very confident in and that you can audit easily.
Run the Lambda in a private subnet in a VPC. You'll likely want a separate routing table, and you will need to audit your security groups and network ACLs closely.
Add alerts and VPC Flow Logs so that a) if something does happen that shouldn't, it's logged and traceable, and b) you are automatically alerted to the problem and can rectify it as soon as possible (a billing-alarm sketch follows this list).
Consider who will be reviewing the code. Are they experienced and trained to spot attack vectors?
Seek paid, professional advice so you don't end up with security problems or very large bills from AWS.
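On the alerting point above, a minimal sketch of a billing alarm. The threshold and SNS topic are hypothetical; billing metrics require enabling billing alerts in the account's billing preferences and always live in us-east-1:

    import boto3

    cw = boto3.client("cloudwatch", region_name="us-east-1")

    cw.put_metric_alarm(
        AlarmName="estimated-charges-over-20-usd",
        Namespace="AWS/Billing",
        MetricName="EstimatedCharges",
        Dimensions=[{"Name": "Currency", "Value": "USD"}],
        Statistic="Maximum",
        Period=21600,              # the metric updates a few times a day
        EvaluationPeriods=1,
        Threshold=20.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],
    )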
This question may sound a little odd, but here it goes: a customer of ours would like to get access to certain metrics of their environment of our product, which we host on Azure for them. It's a pretty complicated deployment, but in the end it consists of an Application Gateway, some virtual machines and a dedicated Azure SQL database.
The customer now wants select metrics from this deployment forwarded to their own DataDog subscription, e.g. VM CPU metrics, database statistics and the like. DataDog obviously supports all this information (which is good), but by default it would slurp in information from all resources within our subscription (which is not OK).
Is there a way to define, in a fine-grained manner, which data is forwarded to DataDog, e.g. which resources and which types of metrics to forward for each resource? What are my options here? Is it enough to create a service principal with limited read rights, or can I configure this somewhere else? I am unfortunately not familiar with DataDog.
The main thing which must be prevented is that the customer due to the metrics forwarding could get access to other metrics in our subscription - we need to control the exact scope of the metrics.
The pretty straightforward solution to this issue is to create a service principal via the command line, and then to assign the monitoring role to this service principal for exactly the resources you need. This works even down to the level of specific databases, for example.
Kicker: this is not possible at such granularity from the UI, but the az command line accepts assigning the Monitoring Reader permission at a deep resource-ID level, even though the UI for this does not exist. By finding the resource ID in the UI and then using that resource ID from the command line, it's possible to achieve exactly this behaviour.
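A hedged sketch of the commands (all IDs are placeholders; on recent CLI versions new service principals get no role assignments by default, so --skip-assignment may be unnecessary):

    # Copy the real resource ID from the portal (resource -> Properties).
    DB_ID="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Sql/servers/<server>/databases/<db>"

    # Create a service principal with no role assignments...
    az ad sp create-for-rbac --name datadog-metrics-reader --skip-assignment

    # ...then grant Monitoring Reader only on the exact resources to expose.
    az role assignment create \
        --assignee <appId-from-previous-step> \
        --role "Monitoring Reader" \
        --scope "$DB_ID"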
I keep reading about how Azure Managed Identities are the way to go to secure access to Azure resources, and I totally get the convenience and level of security they offer. But I often worry that they leave open the possibility that a vulnerability in any application running within my resource can leverage that identity to do things I may not want it to do, not just the one application I want to give access to that resource.
This method of securing things, while convenient, has always felt awkward. It's as if I need to give a friend access to my apartment to watch my dog while I'm on vacation, but instead of handing my friend my keys, I just slip the keys through the mail slot, and they have four other roommates. (Let's pretend that these keys are soul-bound to everyone that lives there and cannot be stolen.)
Is it possible to combine both Managed Identities with traditional credentials to consume resources?
A specific example: I have a Java Spring-based application, deployed into a Kubernetes environment, that consumes Azure Database for MySQL. I am using a sidecar container with NGINX to provide external HTTP access to that application. Using a pod-managed identity here implies to me that both the Java application and NGINX will have access to the database, when I only want my application to have access. Certainly there are other architectural approaches I could take in this example; I am mostly trying to outline my concerns with managed identities alone.
I am in a situation where I need to do some calculations based upon data provided by the customer, store the result somewhere, and make the result available through APIs. For this, I have produced a NodeJS app that can store the data in a NoSQL DB.
My issue is that I want the customer to pay for everything without seeing the source code. I don't need to make any money for myself, just allow the customer to pay the bills without seeing my calculations.
For this, I am considering AWS. I can spin up an EC2 instance, run my NodeJS code on it and store all the data in RDS, S3, etc. I have two possibilities from this point:
I pay for the AWS account (ie, put my Credit Card Details), and recover the bill from the customer; or
I ask the customer to create an AWS account, give me some sort of access so that I can download my code on EC2 etc
For option 1, the question is
Is there a way in AWS (IAM user etc.) such that a customer can log in to the AWS console and view the billing and usage information, but cannot log on to EC2 and see the Node source code?
For option 2,
Is it even possible that the owner of the account doesn't get to see the source code on their EC2 instance?
Please advise.
Yes, Amazon has a tutorial on exactly that:
https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_billing.html
I would suggest starting there and then asking additional questions here, and we can try and help you.
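The gist of that tutorial, sketched with boto3 (the user name and password are placeholders; as the tutorial describes, the root user must also activate IAM access to billing information in the account settings):

    import boto3

    iam = boto3.client("iam")

    # Console user for the customer.
    iam.create_user(UserName="customer-billing")
    iam.create_login_profile(
        UserName="customer-billing",
        Password="<temporary-password>",
        PasswordResetRequired=True,
    )

    # AWS-managed job-function policy: billing console only, no EC2 access.
    iam.attach_user_policy(
        UserName="customer-billing",
        PolicyArn="arn:aws:iam::aws:policy/job-function/Billing",
    )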
Is there a way in AWS (IAM user etc.) such that a customer can log in to the AWS console and view the billing and usage information, but cannot log on to EC2 and see the Node source code?
This seems like a fit for an AWS Marketplace AMI-based product offering. You can offer your AMI on the Marketplace to your customers. Your customers would need an AWS account to buy your AMI, and they will be charged by AWS on your behalf. You will be paid by AWS.
Is it even possible that the owner of the account doesn't get to see the source code on their EC2 instance?
I believe one of the ways to do this is to disable the sshd process on the AMI that you create. There may be better alternatives; this is just my best guess.
It is difficult to totally protect something from the owner of an AWS Account.
To give an example, think of the pre-cloud world. If somebody has physical access to a computer, they can open the case and remove the hard disk. By attaching the disk to another computer, they can read its contents. But, you might say, what if the contents are password-protected? Yes, that would help, but then how does the app access the contents without knowing the password?
Exactly the same thing applies in the cloud. If you are using an Amazon EC2 instance, then the disk can be copied via a Snapshot and attached to another computer. The contents can be read.
You might be super-smart and know how to make an EC2 instance with an encrypted volume that can boot and serve traffic, without giving people permission to login. If they attach the disk to another EC2 instance, it should be unreadable because the boot volume is encrypted. (But then how does it work when booted?)
If you want to be totally sure about protecting your code, I would recommend:
Use your own AWS Account (so they don't have access)
Set up Consolidated Billing (so they get the bill, but they don't have access to the account)
Provide an API that they can call (either via API Gateway + Lambda, or an Amazon EC2 instance). You can even get fancy and use AWS PrivateLink - Amazon Virtual Private Cloud Connectivity Options, which exposes a service in another AWS Account without traversing the Internet.
The result is that only you have total access to the resources, but they pay for the account. This would only work if you have a single customer using the service, since the whole cost of that account will go to them. (It wouldn't be appropriate if you are servicing multiple customers from the same account.)
AWS does not provide a way to cap usage costs. It is often pointed out that it would not be useful to shut down a commercial website in case of charges exceeding a budget, without information about the appropriate response that's only possessed by the business itself. However, for those who want to experiment at home for learning purposes, this situation does not apply.
Prevention is a good thing, but it is impossible to prevent all accidents and attacks. This question is about response and not prevention.
One standard suggestion is to have some means of rapidly shutting down all AWS resources in an account.
Another piece of standard advice is to make use of features like budget alerts. As an individual citizen, it's plausible that the time to react to such an alert could be one day, or perhaps a week or more in case of illness, which could cause a very high bill. So automation might be useful here.
How can I solve these problems in a manner suitable for an individual developer experimenting in their own time and at their own cost? In particular, how can I:
1. Prepare for a rapid, well-tested, reliable response to shut down all resource usage in an AWS account
2. Trigger that response automatically (triggered by, for example, an AWS budget alert, or some other form of cost monitoring)
Some potential complications:
A. In the case of deliberate attack rather than pure user error, 1. may be complicated by the attacker making use of such features as EC2 termination protection.
B. An attacker might also make use of many different AWS services. So, given the large and expanding AWS product range, attempting to maintain a library that deletes every type of resource (EC2 instances, RDS instances, etc.), using code that is specific to particular resource types, may be impractical.
C. This rather old forum post suggests that AWS accounts can't be closed without first cancelling all opt-in services.
Note I can't use the free tier because I want to make use of features not available in that tier.
First off, proper security and management of root account credentials is critical. Enable MFA on all accounts, including root. Do not use the root account except where absolutely necessary. Limit the number of accounts with broad permissions. Enable CloudTrail and, if desired, alert on the use of elevated permissions. These sorts of actions will protect against nearly all attackers, and since this is a personal account, the kinds of attackers who could evade these controls would likely have no interest in harming an individual; they are more interested in large organizations.
As for accidents, what types of accidents do you think might happen? Do you have large compute jobs that auto-scale based on factors such as queue depth? Your best action here is likely to set ASG maximum sizes, use CloudWatch Events to monitor and remediate resource-usage issues, or even use third-party tools that deal with this type of thing.
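For example, capping a hypothetical Auto Scaling group with boto3:

    import boto3

    # Hard-cap the group so a runaway scaling policy (e.g. driven by
    # queue depth) cannot launch more than a couple of instances.
    boto3.client("autoscaling").update_auto_scaling_group(
        AutoScalingGroupName="compute-workers",  # hypothetical
        MinSize=0,
        MaxSize=2,
    )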
Something to keep in mind is that AWS imposes account limits that will constrain you somewhat, but for a personal account even these limits are likely too permissive. I only have experience requesting limit increases, but it might be worth asking AWS whether they perform limit decreases as well.
You have raised concerns about excessive costs being generated due to:
Normal usage: If you require the computing resources, then they are most probably of sufficient benefit to the company to warrant the cost. Therefore, excessive use should generate a warning, but you do not want to turn things off.
Accidental usage: This is where an authorized person uses too many resources, such as turning on a service and forgetting to turn it off. Again, monitoring can give you a hint that this is happening. Many AWS customers create a Sandbox Account where they can experiment, and then use an automated script to turn off resources in this account (which is not used for real business purposes); a minimal sketch of such a script follows this list.
An attacker: This is an external party sending excessive usage to your services (eg making many requests to your website) but without access to your actual AWS account. This could also be caused by a Denial of Service attack. There is plenty of documentation around handling DDoS-style attacks, but a safe method is to limit the maximum number of instances permitted in an Auto Scaling group.
Someone accessing your AWS account: You mention an attacker making use of EC2 Termination Protection. This is an option you can apply to your own EC2 instances to prevent accidental termination. It is not something that someone outside your company would be able to control unless they have gained credentials to access your AWS account. If you are worried about this, then activate Multi-Factor Authentication (MFA) on your logins.
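As promised above, a minimal sketch of a sandbox "turn everything off" script. It only stops running EC2 instances in one region; a real teardown script would iterate over regions and cover more services, but the pattern is the same:

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")  # hypothetical region

    reservations = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )["Reservations"]

    instance_ids = [i["InstanceId"]
                    for r in reservations
                    for i in r["Instances"]]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
        print("Stopped:", instance_ids)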
If you are worried about excessive costs, it's worth considering what generates costs:
Amazon EC2 instances are charged per hour. If you are worried about their cost, then you can Stop them, but this means they are no longer providing services to your company and customers.
Storage services (eg Amazon EBS and Amazon S3) are charged based upon the amount of data stored: You most probably do not want to automatically delete data due to excessive costs, since the data is presumably of value to your company.
Database services (eg Amazon RDS) are charged per hour. You probably don't want to turn them off because they, too, contain data of value to your company.
Some services are charged based upon throughput (eg AWS API Gateway, Amazon Kinesis), but turning off such services would also impact your ability to continue providing services to your customers.
If you are talking about personal usage of AWS that is not supplying a service to customers, then best practice is to always turn off services that aren't required, such as compute and database services. AWS is an on-demand service, so you only have to pay for services that you have requested.
Also, create billing alarms to alert you when you cross a certain cost threshold. These can be complemented with AWS Budgets, which notify you as you approach given spending thresholds.
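A sketch of such a budget via boto3 (the account ID, limit and email address are placeholders):

    import boto3

    # Monthly cost budget with an email notification at 80% of the limit.
    boto3.client("budgets").create_budget(
        AccountId="123456789012",
        Budget={
            "BudgetName": "personal-monthly-cap",
            "BudgetLimit": {"Amount": "25", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        NotificationsWithSubscribers=[{
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL",
                             "Address": "me@example.com"}],
        }],
    )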
Bottom line: rather than focusing on systems that automatically react to high expenditure and delete things, you should only run the services that you currently want. Set alarms and budgets to be aware of costs that exceed desired thresholds.
I would like to add Cassandra to CloudFoundry. How can that be achieved? I was looking at the information posted here: CouchDB in CloudFoundry? but that is using the included CouchDB.
I also have been combing through this wiki https://github.com/cloudfoundry/oss-docs/tree/master/vcap/adding_a_system_service, but that doesn't give me enough information on how to point to my externally hosted Cassandra service.
Any help would be appreciated.
Although there's not much information on it, the Service Broker tool will let you expose an external service to a VCAP deployment (so that the service is displayed when running vmc services).
https://github.com/cloudfoundry/vcap-services/tree/master/service_broker
There isn't a how-to or other docs to speak of, so your best bet is to read the source and post questions on the vcap-dev google group. Here's an existing thread on Service Broker:
https://groups.google.com/a/cloudfoundry.org/d/topic/vcap-dev/sXF9rWzMMHc/discussion
If you want to connect your existing services directly from your private cloud, then I see two solutions:
Do nothing special and have your code connect to those services, assuming they are visible from the network and no firewall sits between them. Of course, you'll want to make their address configurable (a minimal sketch follows this answer), but other than that it is as if you were hitting a third party.
Create some kind of "gateway" service whose role would be to proxy the connection to your private service.
Of course, a third solution would be to have a real CloudFoundry-oriented Cassandra service and migrate your existing data to it (but then it would not be accessible from the rest of your IS, unless you create a bridge the other way around).
I would start with option 1 and, depending on your processes and usage, research solution 2 afterwards.
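As mentioned in option 1, a small sketch of keeping the Cassandra address configurable (the environment variable names and default host are made up; the commented call assumes the DataStax cassandra-driver package):

    import os

    # Read contact points from the environment instead of hard-coding them,
    # so the same build runs inside or outside CloudFoundry.
    CASSANDRA_HOSTS = os.environ.get("CASSANDRA_HOSTS", "10.0.0.12").split(",")
    CASSANDRA_PORT = int(os.environ.get("CASSANDRA_PORT", "9042"))

    # e.g. with the DataStax driver:
    # from cassandra.cluster import Cluster
    # cluster = Cluster(CASSANDRA_HOSTS, port=CASSANDRA_PORT)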