I cannot authenticate to AWS Simple Queue Service (SQS) from an EC2 instance using its associated IAM role with the boto 2.38 library (and Python 3).
I couldn't find anything specific in the documentation about it, but as far as I could tell from examples and other questions around, it was supposed to work by just opening a connection like this:
conn = boto.sqs.connect_to_region('us-east-1')
queue = conn.get_queue('my_queue')
Instead, I get a null object back from the connect method, unless I provide credentials in my environment or pass them explicitly to the method.
I'm pretty sure my role is OK, because it works transparently for other services: S3, describing EC2 tags, sending metrics to CloudWatch, etc. My SQS policy looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SQSFullAccess",
      "Effect": "Allow",
      "Action": [
        "sqs:*"
      ],
      "Resource": [
        "arn:aws:sqs:us-east-1:<account_id>:<queue_name1>",
        "arn:aws:sqs:us-east-1:<account_id>:<queue_name2>"
      ]
    }
  ]
}
To rule out any problem with my policy, I even temporarily attached a FullAdmin policy to my role, without success.
I also verified that it doesn't work with the AWS CLI either (which, as far as I know, also uses boto). So the only conclusion I could come up with is that this is a boto issue with the SQS client.
Would anyone have a different experience with it? I know that switching to boto3 would probably solve it, but I'm not considering that right now, and if it really is a bug, it should be reported on GitHub anyway.
Thanks.
Answering myself.
Boto 2.38's SQS client does work with IAM roles. I had a bug in my application.
As for the AWS CLI, a credentials file (~/.aws/credentials) was present in my local account and was being used instead of the instance's role, because the instance role is the last place the CLI looks for credentials.
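The lookup order that bit me can be sketched as a simple precedence list. This is only an illustration of the behaviour described above; the source names are mine, not boto's internal identifiers:

```python
# Credential resolution sketch: explicit parameters win, then environment
# variables, then the shared credentials file, and the instance-profile
# role (instance metadata) is consulted last.
PRECEDENCE = [
    "explicit_parameters",      # passed to connect_to_region(...)
    "environment_variables",    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
    "shared_credentials_file",  # ~/.aws/credentials
    "instance_profile",         # EC2 instance metadata (IAM role)
]

def resolve_credential_source(available):
    """Return the first credential source, in precedence order, that exists."""
    for source in PRECEDENCE:
        if source in available:
            return source
    return None
```

So a stale `~/.aws/credentials` on the instance silently masks the IAM role, which is exactly what happened in my case.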
Related
I have been using Terraform for some months now, and I have reached the point where my infrastructure is all based on Terraform files, giving me better control of the resources in our multiple accounts.
But I have a big problem. If someone makes a "manual" alteration to any Terraformed resource, it is easy to detect the change.
But what happens if the resource was not created using Terraform? I just don't know how to track new resources, or changes to them, if they were not created with Terraform.
A key design tradeoff for Terraform is that it will only attempt to manage objects that it created or that you explicitly imported into it, because Terraform is often used in mixed environments where either some objects are managed by other software (like an application deployment tool) or the Terraform descriptions are decomposed into multiple separate configurations designed to work together.
For this reason, Terraform itself cannot help with the problem of objects created outside of Terraform. You will need to solve this using other techniques, such as access policies that prevent creating objects directly, or separate software (possibly created in-house) that periodically scans your cloud vendor accounts for objects that are not present in the expected Terraform state snapshot(s).
Access policies are typically the more straightforward path to implement, because preventing objects from being created in the first place is easier than recognizing objects that already exist, particularly if you are working with cloud services that create downstream objects as a side-effect of their work, as we see with (for example) autoscaling controllers.
Martin's answer is excellent and explains that Terraform can't be the arbiter of this, as it is designed to play nicely both with other tooling and with itself (i.e. across different state files).
He also mentioned that access policies (although these have to be cloud/provider specific) are a good alternative, so this answer will instead provide some options for handling this with AWS if you do want to enforce it.
The AWS SDKs and other clients, including Terraform, all provide a user agent header in all requests. This is recorded by CloudTrail and thus you can search through CloudTrail logs with your favourite log searching tools to look for API actions that should be done via Terraform but don't use Terraform's user agent.
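As a sketch of that approach, here is a small filter over CloudTrail event records. The boto3 wiring is left as a hypothetical comment, since pagination and event volume vary per account; the event names are just examples:

```python
def non_terraform_events(cloudtrail_events, actions):
    """Return CloudTrail events for the given actions whose userAgent
    does not mention Terraform.

    `cloudtrail_events` is a list of dicts shaped like CloudTrail log
    records, each with "eventName" and "userAgent" keys.
    """
    flagged = []
    for event in cloudtrail_events:
        if event.get("eventName") in actions:
            agent = event.get("userAgent", "")
            if "terraform" not in agent.lower():
                flagged.append(event)
    return flagged

# Hypothetical wiring against the real CloudTrail API (untested):
#   import boto3, json
#   ct = boto3.client("cloudtrail")
#   for page in ct.get_paginator("lookup_events").paginate():
#       events = [json.loads(e["CloudTrailEvent"]) for e in page["Events"]]
#       print(non_terraform_events(events, {"CreateLifecyclePolicy"}))
```

In practice you would feed this from your log search tooling rather than paging the API directly, but the filtering logic is the same.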
The other option that uses the user agent request header is to use IAM's aws:UserAgent global condition key which will block any requests that don't match the user agent header that's defined. An example IAM policy may look like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1598919227338",
      "Action": [
        "dlm:GetLifecyclePolicies",
        "dlm:GetLifecyclePolicy",
        "dlm:ListTagsForResource"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "Stmt1598919387700",
      "Action": [
        "dlm:CreateLifecyclePolicy",
        "dlm:DeleteLifecyclePolicy",
        "dlm:TagResource",
        "dlm:UntagResource",
        "dlm:UpdateLifecyclePolicy"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "aws:UserAgent": "*terraform*"
        }
      }
    }
  ]
}
The above policy allows the user, group or role it is attached to to perform read-only actions on any DLM resource in the AWS account. It then allows any client with a user agent header that includes the string terraform to perform actions that create, update or delete DLM resources. If a client doesn't have terraform in its user agent header, any request to modify a DLM resource will be denied.
Caution: It's worth noting that clients can override the user agent string and so this shouldn't be relied on as a foolproof way of preventing access to things outside of this. The above mentioned techniques are mostly useful to get an idea about the usage of other tools (eg the AWS Console) in your account where you would prefer changes to be made by Terraform only.
The AWS documentation to the IAM global condition keys has this to say:
Warning
This key should be used carefully. Since the aws:UserAgent
value is provided by the caller in an HTTP header, unauthorized
parties can use modified or custom browsers to provide any
aws:UserAgent value that they choose. As a result, aws:UserAgent
should not be used to prevent unauthorized parties from making direct
AWS requests. You can use it to allow only specific client
applications, and only after testing your policy.
The Python SDK, boto, covers how the user agent string can be modified in the configuration documentation.
I haven't tried it, but my idea has always been that this should be possible with consistent use of tags. A first, naive configuration like
provider "aws" {
  default_tags {
    tags = {
      Terraform = "true"
    }
  }
}
should be sufficient in many cases.
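A watching process built on this idea could use the Resource Groups Tagging API to list resources and flag those missing the tag. A minimal sketch; the boto3 wiring is left as an untested comment, and the `Terraform` tag key matches the provider block above:

```python
def untagged_resources(resources, tag_key="Terraform"):
    """Return ARNs of resources that lack the given tag.

    `resources` is a list of dicts shaped like the entries returned by
    the Resource Groups Tagging API (a ResourceARN plus a Tags list).
    """
    missing = []
    for res in resources:
        tags = {t["Key"]: t["Value"] for t in res.get("Tags", [])}
        if tag_key not in tags:
            missing.append(res["ResourceARN"])
    return missing

# Hypothetical wiring against the real API (untested):
#   import boto3
#   api = boto3.client("resourcegroupstaggingapi")
#   for page in api.get_paginator("get_resources").paginate():
#       print(untagged_resources(page["ResourceTagMappingList"]))
```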
If you fear rogue developers will add this tag manually to hide their hacks, you could complicate your Terraform modules to rotate the tag value over time to unpredictable values, so you could still search for inappropriately tagged resources. Hopefully the burden of defeating such a mechanism would outweigh the effort of simply Terraforming the project. (Not for you.)
On the downside, many resources will legitimately not be Terraformed, e.g. DynamoDB tables or S3 objects. A watching process would somehow have to whitelist what is allowed to exist; not computational resources, that's for sure.
Tuning access policies and using CloudTrail, as @ydaetskcoR suggests, might be unsuitable for assessing the extent of un-Terraformed legacy infrastructure, but they are definitely worth the effort anyway.
This Reddit thread https://old.reddit.com/r/devops/comments/9rev5f/how_do_i_diff_whats_in_terraform_vs_whats_in_aws/ discusses this very topic, with some attention gathered around the sadly archived https://github.com/dtan4/terraforming, although it feels like too much, IMHO.
I have built my own Docker container that provides inference code to be deployed as an endpoint on Amazon SageMaker. However, this container needs to have access to some files from S3. The IAM role used has access to all the S3 buckets I am trying to reach.
Code to download files using a boto3 client:
import boto3

model_bucket = 'my-bucket'

def download_file_from_s3(s3_path, local_path):
    client = boto3.client('s3')
    client.download_file(model_bucket, s3_path, local_path)
The IAM role's policies:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}
Starting the Docker container locally, I can download files from S3 just as expected.
Deployed as an endpoint on SageMaker, however, the request times out:
botocore.vendored.requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='my-bucket.s3.eu-central-1.amazonaws.com', port=443): Max retries exceeded with url: /path/to/my-file (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPSConnection object at 0x7f66244e69b0>, 'Connection to my-bucket.s3.eu-central-1.amazonaws.com timed out. (connect timeout=60)'))
Any help is appreciated!
For security reasons, SageMaker doesn't let the container access S3 natively; you need to hook it up to a VPC:
https://docs.aws.amazon.com/sagemaker/latest/dg/host-vpc.html
For anyone coming across this question, when creating a model, the 'Enable Network Isolation' property defaults to True.
From AWS docs:
If you enable network isolation, the containers are not able to make any outbound network calls, even to other AWS services such as Amazon S3. Additionally, no AWS credentials are made available to the container runtime environment.
So this property needs to be set to False in order to connect to any other AWS service.
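For illustration, a minimal sketch of a `create_model` request with isolation disabled; the model name, image URI and role ARN below are placeholders, and the actual call is left commented since it needs AWS credentials:

```python
def model_request(name, image_uri, role_arn):
    """Build the arguments for `sagemaker.create_model` with network
    isolation switched off, so the container can reach S3 again."""
    return {
        "ModelName": name,
        "PrimaryContainer": {"Image": image_uri},
        "ExecutionRoleArn": role_arn,
        "EnableNetworkIsolation": False,  # allow outbound calls, e.g. to S3
    }

# Hypothetical usage (untested):
#   import boto3
#   boto3.client("sagemaker").create_model(**model_request(
#       "my-model",
#       "<account>.dkr.ecr.eu-central-1.amazonaws.com/my-image",
#       "arn:aws:iam::<account>:role/my-sagemaker-role"))
```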
There are many GitHub issues open on the Terraform repo about this, with lots of interesting comments, but as of now I still see no solution.
Terraform stores plain-text values, including passwords, in tfstate files.
Most users need to store state remotely so the team can work concurrently on the same infrastructure, and most of them keep the state files in S3.
So how do you hide your passwords?
Is there anyone here using Terraform for production? Do you keep your passwords in plain text?
Do you have a special workflow to remove or hide them? What happens when you run a terraform apply then?
I've considered the following options:
store them in Consul - I don't use Consul
remove them from the state file - this requires another process to be executed each time and I don't know how Terraform will handle the resource with an empty/unreadable/not working password
store a default password that is then changed (so Terraform will have a not working password in the tfstate file) - same as above
use the Vault resource - it sounds like it's not a complete workflow yet
store them in Git with git-repo-crypt - Git is not an option either
globally encrypt the S3 bucket - this will not prevent people from seeing plain-text passwords if they have "manager"-level access to AWS, but it seems to be the best option so far
From my point of view, this is what I would like to see:
state file does not include passwords
state file is encrypted
passwords in the state file are "pointers" to other resources, like "vault:backend-type:/path/to/password"
each Terraform run would gather the needed passwords from the specified provider
This is just a wish.
But to get back to the question - how do you use Terraform in production?
I would also like to know the best practice, but let me share my case, although it is limited to AWS. Basically, I do not manage credentials with Terraform.
I set an initial password for RDS, ignore the difference with a lifecycle block, and change it later. The way to ignore the difference is as follows:
resource "aws_db_instance" "db_instance" {
  ...
  password = "hoge"

  lifecycle {
    ignore_changes = ["password"]
  }
}
IAM users are managed by Terraform, but IAM login profiles including passwords are not. I believe that IAM password should be managed by individuals and not by the administrator.
API keys used by applications are also not managed by Terraform. They are encrypted with AWS KMS (Key Management Service), and the encrypted data is saved in the application's Git repository or an S3 bucket. The advantage of KMS encryption is that decryption permissions can be controlled with IAM roles; there is no need to manage keys for decryption.
Although I have not tried yet, recently I noticed that aws ssm put-parameter --key-id can be used as a simple key value store supporting KMS encryption, so this might be a good alternative as well.
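A minimal sketch of that idea with boto3; the parameter name and key alias are hypothetical, and the actual calls are left commented since they need AWS credentials:

```python
def secure_parameter_request(name, value, kms_key_id=None):
    """Build the arguments for `ssm.put_parameter` storing a
    KMS-encrypted secret in Parameter Store."""
    req = {"Name": name, "Value": value, "Type": "SecureString"}
    if kms_key_id is not None:
        req["KeyId"] = kms_key_id  # otherwise the account's default SSM key is used
    return req

# Hypothetical usage (untested):
#   import boto3
#   ssm = boto3.client("ssm")
#   ssm.put_parameter(**secure_parameter_request("/myapp/db_password",
#                                                "s3cret",
#                                                kms_key_id="alias/my-key"))
#   ssm.get_parameter(Name="/myapp/db_password", WithDecryption=True)
```

Decryption permissions are then, as with raw KMS, governed by the IAM role of whatever reads the parameter.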
I hope this helps you.
The whole remote state stuff is being reworked for 0.9 which should open things up for locking of remote state and potentially encrypting of the whole state file/just secrets.
Until then, we simply use multiple AWS accounts and write the state for the stuff that goes into each account into an S3 bucket in that account. In our case we don't care too much about the secrets that end up in there, because if you can read the bucket then you normally have a fair amount of access in that account anyway. Plus, our only real secrets kept in state files are RDS database passwords, and we restrict access at the security group level to just the application instances and the Jenkins instances that build everything, so there is no direct access from the command line on people's workstations anyway.
I'd also suggest adding encryption at rest on the S3 bucket (just because it's basically free) and versioning so you can retrieve older state files if necessary.
To take it further, if you are worried about people with read access to your S3 buckets containing state you could add a bucket policy that explicitly denies access from anyone other than some whitelisted roles/users which would then be taken into account above and beyond any IAM access. Extending the example from a related AWS blog post we might have a bucket policy that looks something like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::MyTFStateFileBucket",
        "arn:aws:s3:::MyTFStateFileBucket/*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:userId": [
            "AROAEXAMPLEID:*",
            "AIDAEXAMPLEID"
          ]
        }
      }
    }
  ]
}
Where AROAEXAMPLEID represents an example role ID and AIDAEXAMPLEID represents an example user ID. These can be found by running:
aws iam get-role --role-name ROLE-NAME
and
aws iam get-user --user-name USER-NAME
respectively.
If you really want to go down the route of encrypting the state file fully, then you'd need to write a wrapper script that makes Terraform interact with the state file locally (rather than remotely) and have the wrapper manage the remote state, encrypting it before it is uploaded to S3 and decrypting it as it's pulled down.
I have an EMR Spark Job that needs to read data from S3 on one account and write to another.
I split my job into two steps.
1. Read data from S3 (no credentials required because my EMR cluster is in the same account).
2. Read the data in the local HDFS created by step 1 and write it to an S3 bucket in another account.
I've attempted setting the hadoopConfiguration:
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "<your access key>")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey","<your secretkey>")
And exporting the keys on the cluster:
$ export AWS_SECRET_ACCESS_KEY=
$ export AWS_ACCESS_KEY_ID=
I've tried both cluster and client mode as well as spark-shell with no luck.
Each of them returns an error:
ERROR ApplicationMaster: User class threw exception: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception:
Access Denied
The solution is actually quite simple.
Firstly, EMR clusters have two roles:
A service role (EMR_DefaultRole) that grants permissions to the EMR service (eg for launching Amazon EC2 instances)
An EC2 role (EMR_EC2_DefaultRole) that is attached to EC2 instances launched in the cluster, giving them access to AWS credentials (see Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances)
These roles are explained in: Default IAM Roles for Amazon EMR
Therefore, each EC2 instance launched in the cluster is assigned the EMR_EC2_DefaultRole role, which makes temporary credentials available via the Instance Metadata service. (For an explanation of how this works, see: IAM Roles for Amazon EC2.) Amazon EMR nodes use these credentials to access AWS services such as S3, SNS, SQS, CloudWatch and DynamoDB.
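For reference, those temporary credentials are served from a well-known instance metadata path. A tiny sketch; the URL is only resolvable from an EC2 instance itself, so the fetch is left as a hypothetical comment:

```python
METADATA_BASE = "http://169.254.169.254/latest/meta-data"

def role_credentials_url(role_name):
    """URL the SDKs poll for the temporary credentials of an instance role."""
    return f"{METADATA_BASE}/iam/security-credentials/{role_name}"

# On an EMR node you could fetch it directly (hypothetical, EC2-only):
#   import urllib.request, json
#   with urllib.request.urlopen(role_credentials_url("EMR_EC2_DefaultRole")) as r:
#       creds = json.load(r)  # AccessKeyId, SecretAccessKey, Token, Expiration
```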
Secondly, you will need to add permissions to the Amazon S3 bucket in the other account to permit access via the EMR_EC2_DefaultRole role. This can be done by adding a bucket policy to the S3 bucket (here named other-account-bucket) like this:
{
  "Id": "Policy1",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::other-account-bucket",
        "arn:aws:s3:::other-account-bucket/*"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::ACCOUNT-NUMBER:role/EMR_EC2_DefaultRole"
        ]
      }
    }
  ]
}
This policy grants all S3 permissions (s3:*) to the EMR_EC2_DefaultRole role that belongs to the account matching the ACCOUNT-NUMBER in the policy, which should be the account in which the EMR cluster was launched. Be careful when granting such permissions -- you might want to grant permissions only to GetObject rather than granting all S3 permissions.
That's all! The bucket in the other account will now accept requests from the EMR nodes because they are using the EMR_EC2_DefaultRole role.
Disclaimer: I tested the above by creating a bucket in Account-A and assigning permissions (as shown above) to a role in Account-B. An EC2 instance was launched in Account-B with that role. I was able to access the bucket from the EC2 instance via the AWS Command-Line Interface (CLI). I did not test it within EMR, however it should work the same way.
Using Spark you can also use assume-role to access an S3 bucket in another account, via an IAM role in that account. This makes it easier for the other account's owner to manage the permissions granted to the Spark job. Managing access via S3 bucket policies can be a pain, as access rights end up distributed across multiple locations rather than all contained within a single IAM role.
Here is the hadoopConfiguration:
"fs.s3a.credentialsType" -> "AssumeRole",
"fs.s3a.stsAssumeRole.arn" -> "arn:aws:iam::<<AWSAccount>>:role/<<crossaccount-role>>",
"fs.s3a.impl" -> "com.databricks.s3a.S3AFileSystem",
"spark.hadoop.fs.s3a.server-side-encryption-algorithm" -> "aws:kms",
"spark.hadoop.fs.s3a.server-side-encryption-kms-master-key-id" -> "arn:aws:kms:ap-southeast-2:<<AWSAccount>>:key/<<KMS Key ID>>"
External IDs can also be used as a passphrase:
"spark.hadoop.fs.s3a.stsAssumeRole.externalId" -> "GUID created by other account owner"
We were using Databricks for the above; we have not tried it on EMR yet.
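Where the S3A assume-role support above isn't available, the same cross-account role can be assumed directly with STS and the returned temporary credentials fed to the S3 connector. A hedged sketch; the account ID, role name and session name are placeholders:

```python
def assume_role_request(account_id, role_name, external_id=None):
    """Build the arguments for `sts.assume_role` against the other account."""
    req = {
        "RoleArn": f"arn:aws:iam::{account_id}:role/{role_name}",
        "RoleSessionName": "spark-cross-account",
    }
    if external_id is not None:
        req["ExternalId"] = external_id  # the passphrase-style GUID mentioned above
    return req

# Hypothetical usage (untested): the Credentials returned (AccessKeyId,
# SecretAccessKey, SessionToken) can then be passed to the S3 connector's
# access key / secret key / session token settings.
#   import boto3
#   resp = boto3.client("sts").assume_role(
#       **assume_role_request("123456789012", "crossaccount-role"))
#   creds = resp["Credentials"]
```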
I believe you need to assign an IAM role to your compute nodes (you probably already have done this), then grant cross-account access to that role via IAM on the "Remote" account. See http://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html for the details.
For controlling access to resources, IAM roles are generally managed as a standard practice. Assume-role is used when you want to access resources in a different account. If you or your organisation follow that pattern, then you should follow https://aws.amazon.com/blogs/big-data/securely-analyze-data-from-another-aws-account-with-emrfs/.
The basic idea is to use a custom credentials provider, with which EMRFS obtains access to objects in the S3 buckets.
You can go one step further and make the ARN for STS and buckets parameterized for the JAR created in this blog.
I have created everything needed for a successful deployment.
I tried to run the deployment without configuring the CodeDeploy agent on the Amazon instance, and the deployment [obviously] failed.
After setting it up, though, it succeeded.
So, my question is, should I configure every instance that I use manually?
What if I have 100 instances in the deployment group?
Should I create an AMI with the CodeDeploy agent tool already configured?
EDIT
I have watched this:
https://www.youtube.com/watch?v=qZa5JXmsWZs
with this:
https://github.com/andrewpuch/code_deploy_example
and read this:
http://blogs.aws.amazon.com/application-management/post/Tx33XKAKURCCW83/Automatically-Deploy-from-GitHub-Using-AWS-CodeDeploy
I just cannot understand why I must configure the instance with the IAM creds. Isn't it supposed to take the creds from the role I launched it with?
I am not an expert in AWS roles and policies, but this is what I understood from the CD documentation.
Is there a way to give the IAM user access to the instance so I won't have to set up the CD agent?
EDIT 2
I think that this post kind of answers: http://adndevblog.typepad.com/cloud_and_mobile/2015/04/practice-of-devops-with-aws-codedeploy-part-1.html
But as you can see, I launched multiple instances but only installed the CodeDeploy agent on one instance. What about the others? Do I have to repeat myself and log in to each one to install it separately? It is OK since I just have 2 or 3, but what if I have hundreds or even thousands of instances? Actually, there are different solutions for this. One of them is: I set up the whole environment on one instance and create an AMI from it. When I launch my working instances, I create them from the one I've already configured instead of the AWS default ones. Some other solutions are available.
Each instance only requires the CodeDeploy agent installed on it. It does not require the AWS CLI to be installed. See AWS CodeDeploy Agent Operations for installation and operation details.
You should create an instance profile/role in IAM that will grant any instance the correct permissions to accept a code deployment through CodeDeploy service.
Create a role called ApplicationServer. To this role, add the following policy. This assumes you are using S3 for your revisions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": [
        "arn:aws:s3:::codedeploy-example-com/*"
      ]
    },
    {
      "Sid": "Stmt1414002531000",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricData"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "Stmt1414002720000",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:PutLogEvents"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
To your specific questions:
So, my question is, should I configure every instance that I use
manually?
What if I have 100 instances in the deployment group? Should I create
an AMI with the aws-cli tool already configured?
Configure AMI with your base tools, or use CloudFormation or puppet to manage software installation on a given instance as needed. Again the AWS CLI is not required for CodeDeploy. Only the most current version of the CodeDeploy agent is required.
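If you bake the agent into an AMI or install it via user data rather than logging in to each instance, the install boils down to fetching the region-specific install script. A sketch that renders such a user-data script for an Amazon Linux instance; verify the bucket URL pattern for your region against the CodeDeploy documentation before relying on it:

```python
def codedeploy_user_data(region):
    """Render an EC2 user-data script that installs the CodeDeploy agent.

    The install-script URL follows the documented pattern for the
    region-specific CodeDeploy buckets; double-check it for your region.
    """
    install_url = (
        f"https://aws-codedeploy-{region}.s3.{region}.amazonaws.com/latest/install"
    )
    return "\n".join([
        "#!/bin/bash",
        "yum install -y ruby wget",
        f"wget {install_url} -O /tmp/install-codedeploy",
        "chmod +x /tmp/install-codedeploy",
        "/tmp/install-codedeploy auto",
    ])
```

Passing this as user data when launching instances (or baking it into the AMI build) means every instance in a 100-instance deployment group comes up with the agent already running, with no manual configuration.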