AWS-CDK: Using InitFile to create a file in EC2 instance - python-3.x

I'm trying to create a file on my EC2 instance using the InitFile construct in CDK. Below is the code I'm using to create the instance, in which I want to create a file textfile.txt containing the text 'welcome', following the https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_ec2/InitFile.html reference.
During CDK initialisation:
init_data = ec2.CloudFormationInit.from_elements(
    ec2.InitFile.from_string("/home/ubuntu/textfile.txt", "welcome")
)
self.ec2_instance = ec2.Instance(
    self,
    id='pytenv-instance',
    vpc=self.vpc,
    instance_type=ec2.InstanceType.of(ec2.InstanceClass.BURSTABLE2,
                                      ec2.InstanceSize.NANO),
    machine_image=ec2.MachineImage.generic_linux(
        {'us-east-1': 'ami-083654bd07b5da81d'}
    ),
    key_name="demokeyyt18",
    security_group=self.sg,
    vpc_subnets=ec2.SubnetSelection(
        subnet_type=ec2.SubnetType.PUBLIC
    ),
    init=init_data,
)
From the EC2 configuration it is evident that the machine image here is Ubuntu. I'm getting this error: Failed to receive 1 resource signal(s) within the specified duration.
Am I missing something? Any inputs?
UPDATE: The same code works when the machine image is Amazon Linux, but not with Ubuntu. Am I doing something wrong?

CloudFormation init requires the cfn-init helper script to be present on the instance. Ubuntu does not ship with it, so you have to install it yourself.
Here's the AWS guide that contains links to the installation scripts for Ubuntu 16.04/18.04/20.04. You need to add these to the user_data prop of your instance; then CloudFormation init will work.
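For example, a minimal sketch of wiring this up (assuming Ubuntu 20.04; the aws-cfn-bootstrap URL is the one AWS documents for installing the helper scripts on non-Amazon-Linux distributions, so verify it for your release):

helper_user_data = ec2.UserData.for_linux()
helper_user_data.add_commands(
    "apt-get update -y",
    "apt-get install -y python3-pip",
    # install cfn-init, cfn-signal and the other helper scripts
    "pip3 install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-py3-latest.tar.gz",
)
# pass user_data=helper_user_data to ec2.Instance alongside init=init_data so
# the helper scripts should be installed before the cfn-init call CDK appends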
If you just want to create a file when the instance starts, though, you don't have to use cfn-init at all - you could just supply the command that creates your file to the user_data prop directly:
self.ec2_instance.user_data.add_commands("echo welcome > /home/ubuntu/textfile.txt")

Related

Node red instance in Kubernetes with custom settings.js and other files

I am building a service which creates on-demand Node-RED instances on Kubernetes. This service needs custom authentication and some other service-specific data in a JSON file.
Every Node-RED instance will have a Persistent Volume associated with it, so one way I thought of doing this was to attach the PVC to a pod, copy the files into the PV, and then start the Node-RED deployment on top of the modified PVC.
I use the following script to accomplish this:
import tarfile
import tempfile
from os import path

from kubernetes.stream import stream

def paste_file_into_pod(self, src_path, dest_path):
    dir_name = path.dirname(src_path)
    bname = path.basename(src_path)
    # run tar inside the pod and stream the archive back over the exec channel
    exec_command = ['/bin/sh', '-c',
                    'cd {src}; tar cf - {base}'.format(src=dir_name, base=bname)]
    with tempfile.TemporaryFile() as tar_buffer:
        resp = stream(self.k8_client.connect_get_namespaced_pod_exec,
                      self.kube_methods.component_name,
                      self.kube_methods.namespace,
                      command=exec_command,
                      stderr=True, stdin=True,
                      stdout=True, tty=False,
                      _preload_content=False)
        print(resp)
        while resp.is_open():
            resp.update(timeout=1)
            if resp.peek_stdout():
                out = resp.read_stdout()
                tar_buffer.write(out.encode('utf-8'))
            if resp.peek_stderr():
                print('STDERR: {0}'.format(resp.read_stderr()))
        resp.close()
        tar_buffer.flush()
        tar_buffer.seek(0)
        # unpack the streamed archive at the destination path
        with tarfile.open(fileobj=tar_buffer, mode='r:') as tar:
            subdir_and_files = [tarinfo for tarinfo in tar.getmembers()]
            tar.extractall(path=dest_path, members=subdir_and_files)
This seems like a very messy way to do it. Can someone suggest a quick and easy way to start Node-RED on Kubernetes with a custom settings.js and some additional config files?
The better approach is not to use a PV for flow storage, but to use a Storage Plugin to save flows in a central database. There are several already in existence that use databases such as MongoDB.
You can extend the existing Node-RED container to include a modified settings.js in /data that includes the details for the storage and authentication plugins and uses environment variables to set the instance-specific values at start-up.
Examples here: https://www.hardill.me.uk/wordpress/tag/multi-tenant/

How to start an ec2 instance using sqs and trigger a python script inside the instance

I have a Python script which takes a video and converts it to a series of small panoramas. There's an S3 bucket where a video (mp4) will be uploaded. I need this file to be sent to the EC2 instance whenever it is uploaded.
This is the flow:
Upload video file to S3.
This should trigger the EC2 instance to start.
Once it is running, I want the file to be copied to a particular directory inside the instance.
After this, I want the Python file (panorama.py) to start running, read the video file from the directory, process it, and generate output images.
These output images need to be uploaded to a new bucket or the same bucket which was initially used.
The instance should terminate after this.
What I have done so far: I have created a Lambda function that is triggered whenever an object is added to that bucket, and it stores the name and path of the file. I read that I now need to use an SQS queue, pass this name-and-path metadata to the queue, and use SQS to trigger the instance. Then I need to run a script on the instance which pulls the metadata from the SQS queue and uses it to copy the file (mp4) from the bucket to the instance.
How do I do this?
I am new to AWS and so do not know much about SQS, how to transfer metadata, how to automatically trigger an instance, etc.
Your wording is a bit confusing. You say that you want to "start" an instance (which suggests that the instance already exists), but then that you want to "terminate" it (which would permanently remove it). I am going to assume that you actually intend to "stop" the instance so that it can be used again.
You can put a shell script in the /var/lib/cloud/scripts/per-boot/ directory. This script will then be executed every time the instance starts.
When the instance has finished processing, it can call sudo shutdown -h now to turn off the instance. (Alternatively, it can tell EC2 to stop the instance, but using shutdown is easier.)
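If you do want the API route instead, a minimal sketch run from inside the instance (assuming its instance profile allows ec2:StopInstances; the metadata URL is the same one used further down this thread, IMDSv1-style for brevity):

import urllib.request

import boto3

# look up this instance's own id from the instance metadata service
instance_id = urllib.request.urlopen(
    "http://169.254.169.254/latest/meta-data/instance-id"
).read().decode()

# ask EC2 to stop (not terminate) the instance
boto3.client("ec2").stop_instances(InstanceIds=[instance_id])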
For details, see: Auto-Stop EC2 instances when they finish a task - DEV Community
I have tried to answer in the most minimalist way; there are many points below that can be improved further. I think it is still quite a lot, since you mentioned you are new to AWS.
Using AWS Lambda with Amazon S3
Amazon S3 can send an event to a Lambda function when an object is created or deleted. You configure notification settings on a bucket, and grant Amazon S3 permission to invoke a function on the function's resource-based permissions policy.
When the object is uploaded, it will trigger the Lambda function, which creates the instance with EC2 user data (see Run commands on your Linux instance at launch).
For the EC2 instance, make sure you provide the necessary permissions for downloading and uploading the objects via Using instance profiles.
The user data has a script that does the rest of the work you need for your workflow:
Download the S3 object; you can pass the object name and bucket name in the same script.
Once #1 has finished, start panorama.py, which processes the video.
Next, upload the output images to the S3 bucket.
Finally, terminating the instance is a bit tricky; you can achieve it via Change the instance initiated shutdown behavior.
OR
you can use the method below to terminate the instance, but in that case your EC2 instance profile must have permission to terminate instances.
aws ec2 terminate-instances --instance-ids $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
You can wrap the above steps into a shell script inside the userdata.
Lambda ec2 start instance:
def launch_instance(EC2, config, user_data):
    ec2_response = EC2.run_instances(
        ImageId=config['ami'],  # ami-0123b531fc646552f
        InstanceType=config['instance_type'],
        KeyName=config['ssh_key_name'],
        MinCount=1,
        MaxCount=1,
        SecurityGroupIds=config['security_group_ids'],
        TagSpecifications=tag_specs,
        # UserData=base64.b64encode(user_data).decode("ascii")
        UserData=user_data
    )
    new_instance_resp = ec2_response['Instances'][0]
    instance_id = new_instance_resp['InstanceId']
    print(f"[DEBUG] Full ec2 instance response data for '{instance_id}': {new_instance_resp}")
    return (instance_id, new_instance_resp)
Upload file to S3 -> Launch EC2 instance
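Putting the pieces together, a sketch of the Lambda handler that reads the S3 event and feeds launch_instance above (the config values, paths, and bucket layout are placeholders, not a tested setup):

import boto3

EC2 = boto3.client('ec2')

def handler(event, context):
    # the S3 notification carries the bucket and key of the uploaded video
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']

    # user data: download the video, process it, upload the results
    user_data = '\n'.join([
        '#!/bin/bash',
        f'aws s3 cp s3://{bucket}/{key} /home/ubuntu/input.mp4',
        'python3 /home/ubuntu/panorama.py /home/ubuntu/input.mp4',
        f'aws s3 cp /home/ubuntu/output/ s3://{bucket}/output/ --recursive',
    ])

    config = {
        'ami': 'ami-0123b531fc646552f',        # placeholder values
        'instance_type': 't3.micro',
        'ssh_key_name': 'my-key',
        'security_group_ids': ['sg-xxxxxxxx'],
    }
    return launch_instance(EC2, config, user_data)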

AWS - Neptune restore from snapshot using SDK

I'm trying to test restoring Neptune instances from a snapshot using python (boto3). Long story short, we want to spin up and delete the Dev instance daily using automation.
When restoring, my restore seems to create only the cluster, without creating the attached instance. I have also tried creating an instance once the cluster is up and adding it to the cluster, but that doesn't work either (ref: client.create_db_instance).
My code does the following: get the most recent snapshot, then use that to restore the cluster so the most recent data is there.
import time

import boto3

client = boto3.client('neptune')

response = client.describe_db_cluster_snapshots(
    DBClusterIdentifier='neptune',
    MaxRecords=100,
    IncludeShared=False,
    IncludePublic=False
)
snaps = response['DBClusterSnapshots']
snaps.sort(key=lambda c: c['SnapshotCreateTime'], reverse=True)
latest_snapshot = snaps[0]
snapshot_ID = latest_snapshot['DBClusterSnapshotIdentifier']
print("Latest snapshot: " + snapshot_ID)

db_response = client.restore_db_cluster_from_snapshot(
    AvailabilityZones=['us-east-1c'],
    DBClusterIdentifier='neptune-test',
    SnapshotIdentifier=snapshot_ID,
    Engine='neptune',
    Port=8182,
    VpcSecurityGroupIds=['sg-randomString'],
    DBSubnetGroupName='default-vpc-groupID'
)

time.sleep(60)

db_instance_response = client.create_db_instance(
    DBName='neptune',
    DBInstanceIdentifier='brillium-neptune',
    DBInstanceClass='db.r4.large',
    Engine='neptune',
    DBSecurityGroups=[
        'sg-string',
    ],
    AvailabilityZone='us-east-1c',
    DBSubnetGroupName='default-vpc-string',
    BackupRetentionPeriod=7,
    Port=8182,
    MultiAZ=False,
    AutoMinorVersionUpgrade=True,
    PubliclyAccessible=False,
    DBClusterIdentifier='neptune-test',
    StorageEncrypted=True
)
The documentation doesn't help much at all. It's very good at listing the parameters needed for basic creation, but not at explaining how to get the actual instance. If I attempt to create an instance using the same cluster name, it either errors out or creates a new cluster with the same name appended with '-1'.
If you want to programmatically do a restore from snapshot, then you need to:
Create the cluster snapshot using create-db-cluster-snapshot
Restore cluster from snapshot using restore-db-cluster-from-snapshot
Create an instance in the new cluster using create-db-instance
You mentioned that you did do a create-db-instance call in the end, but your example snippet does not have it. If that call did succeed, then you should see an instance provisioned inside that cluster.
When you do a restore from Snapshot using the Neptune Console, it does steps #2 and #3 for you.
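As a side note, the time.sleep(60) between the restore and create_db_instance calls is fragile; polling the cluster status is more reliable. A minimal sketch, assuming the same boto3 Neptune client and the 'neptune-test' identifier from the question:

import time

def wait_for_cluster(client, cluster_id, delay=30, max_attempts=40):
    # poll until the restored cluster reports itself as available
    for _ in range(max_attempts):
        cluster = client.describe_db_clusters(
            DBClusterIdentifier=cluster_id
        )['DBClusters'][0]
        if cluster['Status'] == 'available':
            return cluster
        time.sleep(delay)
    raise TimeoutError(f"cluster {cluster_id} did not become available")

wait_for_cluster(client, 'neptune-test')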
It seems like you did the following:
Create the snapshot via CLI
Create the cluster via CLI
Create an instance in the cluster, via Console
Today, we recommend restoring the snapshot entirely via the Console or entirely using the CLI.

Pulling image from ECR via docker-py

I have a script that retrieves a login for ECR, authenticates a DockerClient instance with the login credentials (reauth set to True), and then attempts to pull a nominated container image.
The code seems to work perfectly when running on my local machine and interacting with the Docker daemon on an EC2 instance, but when running from the EC2 instance itself I constantly get:
404 Client Error: Not Found ("repository XXXXXXXX.dkr.ecr.eu-west-2.amazonaws.com/autohld-runner not found: does not exist or no pull access")
The same repo is being used for both executing the code locally and remotely on the EC2 instance. I have tried setting the access to the image within ECR to allow pull for both everyone and my AWS ID. I have granted the role assigned to the EC2 instance Full Admin access also. All with no joy.
If I perform the same tasks on the EC2 instance via command line with the exact same repo URI (copied from the error), it works with no issue.
Is there something I am missing within docker-py ?
url = "tcp://127.0.0.1:2375"
dockerd = docker.DockerClient(base_url=url, version='auto')
dockerd.login(username=ecr.username, password=ecr.password, email='none', registry=ecr.registry, reauth=True)
dockerd.images.pull(ecr.get_repo(instance.tags['Container']), tag='latest')
get_repo returns the full URI as reported in the error message; the Container tag holds the name 'autohld-runner'.
Thanks
It seems that if the registry has been accessed via the CLI, an auth token is cached and Docker remembers it, allowing subsequent calls to work. However, in this case the instance is starting up completely fresh and using the login method within docker-py.
This doesn't seem to pass the credentials on to the pull. I have found that using the auth_config named argument and passing in a dictionary of auth parameters works:
auth_creds = {'username': ecr.username, 'password': ecr.password}
dockerd.images.pull(ecr.get_repo(instance.tags['Container']), tag='latest', auth_config=auth_creds)
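For completeness, a sketch of fetching fresh ECR credentials with boto3 and passing them straight into the pull (the repository URI is the placeholder from the error message above, and the instance needs ecr:GetAuthorizationToken plus pull permissions):

import base64

import boto3
import docker

# the authorization token is base64-encoded "AWS:<password>"
auth_data = boto3.client('ecr').get_authorization_token()['authorizationData'][0]
username, password = base64.b64decode(auth_data['authorizationToken']).decode().split(':')

client = docker.DockerClient(base_url='tcp://127.0.0.1:2375', version='auto')
client.images.pull(
    'XXXXXXXX.dkr.ecr.eu-west-2.amazonaws.com/autohld-runner',
    tag='latest',
    auth_config={'username': username, 'password': password},
)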
HTH

How to programmatically download data to AWS EC2 instance?

There are 3 machines involved in my task
A: my desktop
B: EC2 instance spun up by A
C: a remote linux server where data sits and I only have read privilege
The task has basically 3 steps
spin up B from A
download data from C to B to a specific location
change some of the downloaded data on B
I know how to do step 1 using the AWS CLI or boto3. Steps 2 and 3 are easy if I SSH to the EC2 instance manually. The problem is: if this task needs to be automated, how do I deal with the login credentials?
Specifically, I am thinking of using user_data to run shell scripts after the EC2 instance launches, but the data download uses scp, which needs a password. I could upload an SSH credential file to the EC2 instance, but then I cannot use user_data to run the script for steps 2 and 3.
So my current solution is done entirely from a shell script:
spin up B from A
upload the SSH credential from A to B
SSH from A to B with the shell commands attached, where steps 2 and 3 of the task are performed
This solution seems really ugly to me. Is there a better practice for this case?
3 Options
Pass the encrypted/encoded password as part of the user data. The user-data script will first decrypt/decode the password and use it to scp the file from C. Then delete the user data, or find some other way to remove the encrypted/encoded password.
Use an SSH key instead of an SSH password. But the risk is that you have to pass the private key in the user data, which is not secure.
Use Ansible and an SSH key. But that is too much work for a simple task.
Give Ansible a try; it can help you automate this task by creating a playbook.
For creating an instance you could use the ec2 module; from the doc examples:
# Basic provisioning example
- ec2:
    key_name: mykey
    instance_type: t2.micro
    image: ami-123456
    wait: yes
    group: webserver
    count: 3
    vpc_subnet_id: subnet-29e63245
    assign_public_ip: yes
To download data, use the get_url module, for example:
- name: Download file with check (md5)
  get_url:
    url: http://example.com/path/file.conf
    dest: /etc/foo.conf
    checksum: md5:66dffb5228a211e61d6d7ef4a86f5758
For modifying files there are multiple modules, which can be found at http://docs.ansible.com/.
Overall, it is a tool that can help automate many things, but some time is required to learn the basics; check the Getting Started guide. Hope it helps.
There are many ways to solve your task. I will not cover step 1 (spin up B from A) because you have already done it.
Option 1: Use EC2 Run Command to push commands to server B. Flow: A -> EC2 Run Command service -> B -> C. No need to push credentials (SSH key/password) to server B. See the sketch after this list.
Option 2: Define all your commands in a bash shell file and push this file to S3. Use the user data of server B to download that file from S3. Flow: A -> S3; B gets the file from S3; B -> C.
With the above two options, you do not need to push any credentials to server B. Server C can be anywhere, as long as there is a connection between B and C for the download task.
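For Option 1, a minimal sketch of pushing the commands from A to B with Run Command via boto3 (it assumes server B runs the SSM agent and has an instance profile that allows SSM; the instance id and commands are placeholders):

import boto3

ssm = boto3.client('ssm')

response = ssm.send_command(
    InstanceIds=['i-0123456789abcdef0'],   # server B (placeholder)
    DocumentName='AWS-RunShellScript',
    Parameters={'commands': [
        'echo "step 2: download the data from server C here"',
        'echo "step 3: modify the downloaded data here"',
    ]},
)
print(response['Command']['CommandId'])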
