How to stream uploaded video with AWS? - amazon-cloudfront

The main task is to protect the video from being downloaded.
To achieve this, we decided to set up video streaming from S3.
The project has a PHP API and a client. The API generates a pre-signed URL for the location in the S3 bucket where the video should be uploaded. The client can then request the video through a CDN URL. But with signed URLs, the video can still be downloaded from the client.
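For context, a minimal sketch of generating such a pre-signed upload URL. The question's API is PHP; this is only an illustrative boto3 equivalent, and the bucket/key names are placeholders:

import boto3

s3 = boto3.client("s3")

# Placeholder bucket and key; the real API in the question is PHP.
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-upload-bucket", "Key": "inputs/video.mp4"},
    ExpiresIn=3600,  # URL is valid for one hour
)
print(upload_url)  # the client PUTs the video file to this URL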
We found an approach where the video is converted to MPEG-DASH with AWS Elemental MediaConvert. The MediaConvert job can be created via the API. The output is then streamed via AWS Elemental MediaPackage and CloudFront.
The problems are:
How do we know when the video upload has finished, so that the MediaConvert job can be started?
An MPEG-DASH output has a .mpd manifest, but MediaPackage requires a .smil manifest. How can this file be auto-generated from the .mpd?
P.S. If I'm wrong somewhere, please correct me.

How do we know when the video upload has finished, so that the MediaConvert job can be started?
It can be achieved with the following workflow:
The ingest user uploads a video to the WatchFolder bucket in S3.
The s3:ObjectCreated:Put event triggers a Lambda function that calls MediaConvert to convert the video.
Converted videos are stored in S3 by MediaConvert.
High-level instructions follow.
Create an Amazon S3 bucket to use for uploading videos to be converted. Bucket name example: vod-watchfolder-firstname-lastname.
Create an Amazon S3 bucket to use for storing converted video outputs from MediaConvert (enable public read, static website hosting, and CORS). A sample CORS configuration is shown below, followed by a boto3 sketch for applying it.
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
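If you prefer to set this up from code instead of the S3 console, a minimal boto3 sketch that applies the same CORS rules (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")

# Placeholder destination bucket name; mirrors the XML CORS configuration above.
s3.put_bucket_cors(
    Bucket="vod-destination-firstname-lastname",
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedOrigins": ["*"],
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    },
)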
Create an IAM role to pass to MediaConvert. Use the IAM console to create a new role. Name it MediaConvertRole and select AWS Elemental MediaConvert as the trusted service (MediaConvert assumes this role when running jobs). Use inline policies to grant permissions to the resources the jobs need, such as the input and output S3 buckets.
Create an IAM role for your Lambda function. Use the IAM console to create a role. Name it VODLambdaRole and select AWS Lambda for the role type. Attach the managed policy called AWSLambdaBasicExecutionRole to this role to grant the necessary CloudWatch Logs permissions. Use inline policies to grant permissions to the other resources the Lambda needs to execute; an example inline policy follows, and a boto3 sketch of the whole step comes after it.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*",
      "Effect": "Allow",
      "Sid": "Logging"
    },
    {
      "Action": [
        "iam:PassRole"
      ],
      "Resource": [
        "ARNforMediaConvertRole"
      ],
      "Effect": "Allow",
      "Sid": "PassRole"
    },
    {
      "Action": [
        "mediaconvert:*"
      ],
      "Resource": [
        "*"
      ],
      "Effect": "Allow",
      "Sid": "MediaConvertService"
    },
    {
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "*"
      ],
      "Effect": "Allow",
      "Sid": "S3Service"
    }
  ]
}
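The same role can also be created programmatically. A hedged boto3 sketch, assuming the inline policy above has been saved next to the script as lambda-policy.json and that ARNforMediaConvertRole has been replaced with the real role ARN:

import json
import boto3

iam = boto3.client("iam")

# Trust policy so that Lambda can assume VODLambdaRole
lambda_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

iam.create_role(
    RoleName="VODLambdaRole",
    AssumeRolePolicyDocument=json.dumps(lambda_trust_policy),
)

# Managed policy that grants the CloudWatch Logs permissions
iam.attach_role_policy(
    RoleName="VODLambdaRole",
    PolicyArn="arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
)

# Inline policy shown above (PassRole for MediaConvertRole, mediaconvert:*, s3:*);
# "lambda-policy.json" is an assumed local filename for that JSON document.
with open("lambda-policy.json") as f:
    iam.put_role_policy(
        RoleName="VODLambdaRole",
        PolicyName="VODLambdaRolePolicy",
        PolicyDocument=f.read(),
    )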
Create a Lambda function for converting videos. Use the AWS Lambda console to create a new Lambda function called VODLambdaConvert that will process the S3 upload events. Use the provided convert.py example implementation for your function code.
#!/usr/bin/env python

import glob
import json
import os
import uuid
import boto3
import datetime
import random
from urllib.parse import urlparse
import logging

from botocore.client import ClientError

logger = logging.getLogger()
logger.setLevel(logging.INFO)

S3 = boto3.resource('s3')

def handler(event, context):
    '''
    Watchfolder handler - this lambda is triggered when video objects are uploaded to the
    SourceS3Bucket/inputs folder.

    It will look for two sets of file inputs:

        SourceS3Bucket/inputs/SourceS3Key:
            the input video to be converted

        SourceS3Bucket/jobs/*.json:
            job settings for MediaConvert jobs to be run against the input video. If
            there are no settings files in the jobs folder, then the default job will be run
            from the job.json file in the lambda environment.

    Output paths stored in outputGroup['OutputGroupSettings']['DashIsoGroupSettings']['Destination']
    are constructed from the name of the job settings files as follows:

        s3://<MediaBucket>/<basename(job settings filename)>/<basename(input)>/<Destination value from job settings file>
    '''
    assetID = str(uuid.uuid4())
    sourceS3Bucket = event['Records'][0]['s3']['bucket']['name']
    sourceS3Key = event['Records'][0]['s3']['object']['key']
    sourceS3 = 's3://' + sourceS3Bucket + '/' + sourceS3Key
    destinationS3 = 's3://' + os.environ['DestinationBucket']
    mediaConvertRole = os.environ['MediaConvertRole']
    application = os.environ['Application']
    region = os.environ['AWS_DEFAULT_REGION']
    statusCode = 200
    jobs = []
    job = {}

    # Use MediaConvert SDK UserMetadata to tag jobs with the assetID
    # Events from MediaConvert will have the assetID in UserMetadata
    jobMetadata = {}
    jobMetadata['assetID'] = assetID
    jobMetadata['application'] = application
    jobMetadata['input'] = sourceS3

    try:
        # Build a list of jobs to run against the input. Use the settings files in WatchFolder/jobs
        # if any exist. Otherwise, use the default job.
        jobInput = {}

        # Iterate through all the objects in the jobs folder of the WatchFolder bucket
        # (pagination is handled by the resource API). Each object contains a jobSettings JSON.
        bucket = S3.Bucket(sourceS3Bucket)
        for obj in bucket.objects.filter(Prefix='jobs/'):
            if obj.key != "jobs/":
                jobInput = {}
                jobInput['filename'] = obj.key
                logger.info('jobInput: %s', jobInput['filename'])

                jobInput['settings'] = json.loads(obj.get()['Body'].read())
                logger.info(json.dumps(jobInput['settings']))

                jobs.append(jobInput)

        # Use the default job settings from job.json in the lambda zip file (current working directory)
        if not jobs:
            with open('job.json') as json_data:
                jobInput['filename'] = 'Default'
                logger.info('jobInput: %s', jobInput['filename'])

                jobInput['settings'] = json.load(json_data)
                logger.info(json.dumps(jobInput['settings']))

                jobs.append(jobInput)

        # Get the account-specific MediaConvert endpoint for this region
        mediaconvert_client = boto3.client('mediaconvert', region_name=region)
        endpoints = mediaconvert_client.describe_endpoints()

        # Add the account-specific endpoint to the client session
        client = boto3.client('mediaconvert', region_name=region, endpoint_url=endpoints['Endpoints'][0]['Url'], verify=False)

        for j in jobs:
            jobSettings = j['settings']
            jobFilename = j['filename']

            # Save the name of the settings file in the job userMetadata
            jobMetadata['settings'] = jobFilename

            # Update the job settings with the source video from the S3 event
            jobSettings['Inputs'][0]['FileInput'] = sourceS3

            # Update the job settings with the destination paths for converted videos. We want to
            # replace the destination bucket of the output paths in the job settings, but keep the
            # rest of the path.
            destinationS3 = 's3://' + os.environ['DestinationBucket'] + '/' \
                + os.path.splitext(os.path.basename(sourceS3Key))[0] + '/' \
                + os.path.splitext(os.path.basename(jobFilename))[0]

            for outputGroup in jobSettings['OutputGroups']:
                logger.info("outputGroup['OutputGroupSettings']['Type'] == %s", outputGroup['OutputGroupSettings']['Type'])

                if outputGroup['OutputGroupSettings']['Type'] == 'FILE_GROUP_SETTINGS':
                    templateDestination = outputGroup['OutputGroupSettings']['FileGroupSettings']['Destination']
                    templateDestinationKey = urlparse(templateDestination).path
                    logger.info("templateDestinationKey == %s", templateDestinationKey)
                    outputGroup['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + templateDestinationKey

                elif outputGroup['OutputGroupSettings']['Type'] == 'HLS_GROUP_SETTINGS':
                    templateDestination = outputGroup['OutputGroupSettings']['HlsGroupSettings']['Destination']
                    templateDestinationKey = urlparse(templateDestination).path
                    logger.info("templateDestinationKey == %s", templateDestinationKey)
                    outputGroup['OutputGroupSettings']['HlsGroupSettings']['Destination'] = destinationS3 + templateDestinationKey

                elif outputGroup['OutputGroupSettings']['Type'] == 'DASH_ISO_GROUP_SETTINGS':
                    templateDestination = outputGroup['OutputGroupSettings']['DashIsoGroupSettings']['Destination']
                    templateDestinationKey = urlparse(templateDestination).path
                    logger.info("templateDestinationKey == %s", templateDestinationKey)
                    outputGroup['OutputGroupSettings']['DashIsoGroupSettings']['Destination'] = destinationS3 + templateDestinationKey

                elif outputGroup['OutputGroupSettings']['Type'] == 'MS_SMOOTH_GROUP_SETTINGS':
                    templateDestination = outputGroup['OutputGroupSettings']['MsSmoothGroupSettings']['Destination']
                    templateDestinationKey = urlparse(templateDestination).path
                    logger.info("templateDestinationKey == %s", templateDestinationKey)
                    outputGroup['OutputGroupSettings']['MsSmoothGroupSettings']['Destination'] = destinationS3 + templateDestinationKey

                elif outputGroup['OutputGroupSettings']['Type'] == 'CMAF_GROUP_SETTINGS':
                    templateDestination = outputGroup['OutputGroupSettings']['CmafGroupSettings']['Destination']
                    templateDestinationKey = urlparse(templateDestination).path
                    logger.info("templateDestinationKey == %s", templateDestinationKey)
                    outputGroup['OutputGroupSettings']['CmafGroupSettings']['Destination'] = destinationS3 + templateDestinationKey

                else:
                    logger.error("Exception: Unknown Output Group Type %s", outputGroup['OutputGroupSettings']['Type'])
                    statusCode = 500

            logger.info(json.dumps(jobSettings))

            # Convert the video using AWS Elemental MediaConvert
            job = client.create_job(Role=mediaConvertRole, UserMetadata=jobMetadata, Settings=jobSettings)

    except Exception as e:
        logger.error('Exception: %s', e)
        statusCode = 500
        raise

    finally:
        return {
            'statusCode': statusCode,
            'body': json.dumps(job, indent=4, sort_keys=True, default=str),
            'headers': {'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*'}
        }
Make sure to configure your function to use the VODLambdaRole IAM role you created in the previous section, and to set the environment variables the code expects (DestinationBucket, MediaConvertRole, Application).
Create an S3 event trigger for your convert Lambda. Use the AWS Lambda console to add an ObjectCreated (PUT) trigger from the vod-watchfolder-firstname-lastname S3 bucket to the VODLambdaConvert Lambda; a boto3 sketch of the same wiring follows.
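If you script this step instead of using the console, a hedged boto3 sketch. The account ID and Lambda ARN are placeholders; the inputs/ prefix matches the watchfolder layout used by convert.py:

import boto3

lambda_client = boto3.client("lambda")
s3 = boto3.client("s3")

# Allow S3 to invoke the function (placeholder account ID)
lambda_client.add_permission(
    FunctionName="VODLambdaConvert",
    StatementId="s3-watchfolder-invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn="arn:aws:s3:::vod-watchfolder-firstname-lastname",
)

# Fire the Lambda when objects are PUT under the inputs/ prefix
s3.put_bucket_notification_configuration(
    Bucket="vod-watchfolder-firstname-lastname",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:VODLambdaConvert",
                "Events": ["s3:ObjectCreated:Put"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": "inputs/"}]}
                },
            }
        ]
    },
)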
Test the watchfolder automation. You can use your own video or the test.mp4 video included in the tutorial repository to test the workflow.
For details, please refer to this document: https://github.com/aws-samples/aws-media-services-vod-automation/blob/master/MediaConvert-WorkflowWatchFolderAndNotification/README-tutorial.md
An MPEG-DASH output has a .mpd manifest, but MediaPackage requires a .smil manifest. How can this file be auto-generated from the .mpd?
As of today, MediaConvert has no function to auto-generate a .smil file. You can therefore either change the output to HLS and ingest it into MediaPackage, or create the .smil file manually. Reference documents are below; a boto3 sketch of the HLS-ingest option follows them.
HLS VOD ingest to MediaPackage: https://github.com/aws-samples/aws-media-services-simple-vod-workflow/blob/master/13-VODMediaPackage/README-tutorial.md
Creating the .smil file: https://docs.aws.amazon.com/mediapackage/latest/ug/supported-inputs-vod-smil.html
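For the HLS-ingest alternative, a hedged boto3 sketch of registering the converted HLS output as a MediaPackage VOD asset. All IDs, bucket paths, and ARNs here are placeholders, and the role must allow MediaPackage to read the output bucket:

import boto3

mediapackage_vod = boto3.client("mediapackage-vod")

# Placeholder packaging group and asset IDs
mediapackage_vod.create_packaging_group(Id="vod-packaging-group")

mediapackage_vod.create_asset(
    Id="my-video-asset",
    PackagingGroupId="vod-packaging-group",
    # Points at the .m3u8 manifest produced by MediaConvert (placeholder path)
    SourceArn="arn:aws:s3:::vod-destination-firstname-lastname/assets/my-video/index.m3u8",
    SourceRoleArn="arn:aws:iam::111122223333:role/MediaPackageVodS3ReadRole",
)

You would still need at least one packaging configuration in the packaging group before MediaPackage exposes playback endpoints that CloudFront can sit in front of.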

Related

Terraform Data Source: aws_s3_object can't get object from S3 bucket in another account

Hi Stack Overflow community,
I have some Terraform code that needs access to an object in a bucket that is located in a different AWS account than the one I'm deploying the Terraform to.
The AWS S3 bucket is in us-west-2 and I'm deploying the Terraform in us-east-1 (I don't think this should matter).
I set up the following bucket level policy in the S3 bucket:
{
  "Version": "2012-10-17",
  "Id": "Policy1",
  "Statement": [
    {
      "Sid": "Stmt1",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<aws-account-number-where-terraform-will-be-deployed>:user/<user-deploying-terraform>"
      },
      "Action": [
        "s3:GetObject*",
        "s3:List*"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket-name>/*",
        "arn:aws:s3:::<bucket-name>"
      ]
    }
  ]
}
When I run the following AWS CLI command I'm able to get the bucket object using the user that will be deploying the Terraform:
aws s3api get-object --bucket "<bucket-name>" --key "<path-to-file>" "test.txt"
But when I run the following Terraform code:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 4.6.0"
    }
  }
}

data "aws_s3_object" "this" {
  bucket = "<bucket-name>"
  key    = "<path-to-file>"
}

output "test" {
  value = data.aws_s3_object.this.body
}
I get the following error:
Error: failed getting S3 Bucket (<bucket-name>) Object (<path-to-file>): BadRequest: Bad Request
status code: 400, request id: <id>, host id: <host-id>

  with data.aws_s3_object.challenge_file,
  on main.tf line 10, in data "aws_s3_object" "this":
  10: data "aws_s3_object" "this" {
The provider configuration, as specified by AWS and HashiCorp, uses a single set of credentials, region, etc. You need a second provider configuration with an alias for the other region:
provider "aws" {
alias = "us-west-2"
region = "us-west-2"
}
data "aws_s3_object" "this" {
provider = aws.us-west-2
bucket = "<bucket-name>"
key = "<path-to-file>"
}
If your supplied credentials do not have sufficient permissions to retrieve information about the bucket in the other account, then that provider configuration block will also need separate credentials.

How do I use the S3 bucket arn from a terraform module output with a count index?

I have a Terraform module that creates an S3 bucket based on whether a variable creates3bucket is true or false.
The resource block looks like this.
# Codepipeline s3 bucket artifact store
resource "aws_s3_bucket" "LambdaCodePipelineBucket" {
  count  = var.creates3bucket ? 1 : 0
  bucket = var.lambdacodepipelinebucketname
}
I output the bucket arn in the outputs.tf file like this.
output "codepipelines3bucketarn"{
description = "CodePipeline S3 Bucket arn"
value = aws_s3_bucket.LambdaCodePipelineBucket[*].arn
}
From the calling module I want to pass this ARN value into the bucket policy. This works fine when the bucket is not an indexed resource, but terraform plan complains when there is a count associated with the bucket.
From the calling module I pass the bucket policy like this:
cps3bucketpolicy = jsonencode({
  Version = "2012-10-17"
  Id      = "LambdaCodePipelineBucketPolicy"
  Statement = [
    {
      Sid    = "AllowPipelineRoles"
      Effect = "Allow"
      Principal = {
        AWS = ["${module.lambdapipeline.codepipelinerolearn}"]
      }
      Action = "s3:*"
      Resource = [
        "${module.lambdapipeline.codepipelines3bucketarn}",
        "${module.lambdapipeline.codepipelines3bucketarn}/*",
      ]
    },
    {
      Sid : "AllowSSLRequestsOnly",
      Effect : "Deny",
      Principal : "*",
      Action : "*",
      Resource : [
        "${module.lambdapipeline.codepipelines3bucketarn}",
        "${module.lambdapipeline.codepipelines3bucketarn}/*",
      ],
      Condition : {
        Bool : {
          "aws:SecureTransport" : "false"
        }
      }
    }
  ]
})
Terraform plan error: for some reason, once I added the count to the S3 bucket resource, Terraform does not like the "${module.lambdapipeline.codepipelines3bucketarn}/*" in the policy.
How do I pass the bucket ARN into the policy from the calling module?
As Marko E. wrote, you need to reference the indexed resource. In your case, you should use this:
output "codepipelines3bucketarn"{
description = "CodePipeline S3 Bucket arn"
value = aws_s3_bucket.LambdaCodePipelineBucket[0].arn
}
But in your case, the output would be empty if the variable var.creates3bucket is false.
So I conclude that either the bucket already exists or you will create it. If that is the case, use a data source for your policy:
data "aws_s3_bucket" "LambdaCodePipelineBucket" {
bucket = var.lambdacodepipelinebucketname
}
and change, in your policy,
"${module.lambdapipeline.codepipelines3bucketarn}"
to
"${data.aws_s3_bucket.LambdaCodePipelineBucket.arn}"
Now the only "error" will occur if the bucket does not exist (in that case, just set your variable to true and the data source will find the newly created bucket).

connect 2 terraform remote states

I'm working on a Terraform task where I need to connect two Terraform S3 backends. We have two repos for our Terraform scripts. The main one is for creating dev/qa/prod envs, and the other one is for managing the users/policies required by the first script.
We use S3 as the backend, and I want to connect the two backends together so they can take IDs/names from each other without hardcoding them.
Say you have a backend A / Terraform project A with your IDs/names:
terraform {
  backend "s3" {
    bucket = "mybucket"
    key    = "path/to/my/key"
    region = "us-east-1"
  }
}

output "names" {
  value = ["bob", "jim"]
}
In your other terraform project B you can refer to the above backend A as a data source:
data "terraform_remote_state" "remote_state" {
backend = "s3"
config = {
bucket = "mybucket"
key = "path/to/my/key"
region = "us-east-1"
}
}
Then in Terraform project B you can fetch the outputs of the remote state with the names/IDs:
data.terraform_remote_state.remote_state.outputs.names

boto3 python to start EC2

I am using the Python boto3 code below to start EC2:
import boto3

region = 'us-east-1'
instance_id = 'i-06ce851edfXXXXXX'
ec2 = boto3.client('ec2', region_name=region)

def lambda_handler(event, context):
    resp = ec2.describe_instance_status(InstanceIds=[str(instance_id)],
                                        IncludeAllInstances=True)
    print("Response = ", resp)
    instance_status = resp['InstanceStatuses'][0]['InstanceState']['Code']
    print("Instance status =", instance_status)
    if instance_status == 80:
        ec2.start_instances(InstanceIds=[instance_id])
        print("Started instance with Instance_id", instance_id)
    elif instance_status == 16:
        ec2.stop_instances(InstanceIds=[instance_id])
        print("Stopped EC2 with Instance-ID", instance_id)
    else:
        print("No desired state found")
When the instance is in running status, I am able to stop it by running this Lambda.
But when the instance is in the stopped state and I run the Lambda, I get the message below with no error. When I check the console, the instance is still in the stopped state, and I cannot find out why it is not moving to the running state.
Instance status = 80
Started instance with Instance_id i-06ce851edfXXXXXX
Below is the IAM role used:
{
  "Action": [
    "ec2:StopInstances",
    "ec2:StartInstances",
    "ec2:RebootInstances"
  ],
  "Resource": [
    "arn:aws:ec2:us0east-1:2x83xxxxxxxxxx:instance/i-06ce851edfXXXXXX"
  ],
  "Effect": "Allow"
}
Your code is working. I verified it on my test instance with my Lambda.
I reformatted it a bit to be easier to read, but it worked without any changes (except the instance ID). I can stop a running instance. Then I can start the stopped instance.
One thing to note is that stopping and starting take time. If you execute your function too fast, it won't be able to start an instance that is still in the stopping state. Maybe that's why you thought it did not work.
Also make sure you increase your Lambda's default timeout from 3 seconds to 10 or more.
import boto3

region = 'us-east-1'
instance_id = 'i-08a1e399b3d299c2d'
ec2 = boto3.client('ec2', region_name=region)

def lambda_handler(event, context):
    resp = ec2.describe_instance_status(
        InstanceIds=[str(instance_id)],
        IncludeAllInstances=True)
    print("Response = ", resp)
    instance_status = resp['InstanceStatuses'][0]['InstanceState']['Code']
    print("Instance status =", instance_status)
    if instance_status == 80:
        ec2.start_instances(InstanceIds=[instance_id])
        print("Started instance with Instance_id", instance_id)
    elif instance_status == 16:
        ec2.stop_instances(InstanceIds=[instance_id])
        print("Stopped EC2 with Instance-ID", instance_id)
    else:
        print("No desired state found")
I found out the issue. The root volume of the EC2 instance was encrypted, so I added KMS permissions to the role and it worked.
Indeed, encryption of the root volume is the issue here.
You can add an inline policy to the role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:*"
      ],
      "Resource": "*"
    }
  ]
}
Note that this grants full access to KMS for all resources. If you want, you can restrict the scope of this policy to specific resources.
More info about this problem here: https://aws.amazon.com/premiumsupport/knowledge-center/encrypted-volumes-stops-immediately/

Update bucket created in Terraform file results in BucketAlreadyOwnedByYou error

I need to add a policy to a bucket I create earlier in my Terraform file.
However, this errors with:
Error creating S3 bucket: BucketAlreadyOwnedByYou: Your previous
request to create the named bucket succeeded and you already own it.
How can I amend my .tf file to create the bucket, then update it?
resource "aws_s3_bucket" "bucket" {
bucket = "my-new-bucket-123"
acl = "public-read"
region = "eu-west-1"
website {
index_document = "index.html"
}
}
data "aws_iam_policy_document" "s3_bucket_policy_document" {
statement {
actions = ["s3:GetObject"]
resources = ["${aws_s3_bucket.bucket.arn}/*"]
principals {
type = "AWS"
identifiers = ["*"]
}
}
}
resource "aws_s3_bucket" "s3_bucket_policy" {
bucket = "${aws_s3_bucket.bucket.bucket}"
policy = "${data.aws_iam_policy_document.s3_bucket_policy_document.json}"
}
You should use the aws_s3_bucket_policy resource to add a bucket policy to an existing S3 bucket:
resource "aws_s3_bucket" "b" {
bucket = "my_tf_test_bucket"
}
resource "aws_s3_bucket_policy" "b" {
bucket = "${aws_s3_bucket.b.id}"
policy = <<POLICY
{
"Version": "2012-10-17",
"Id": "MYBUCKETPOLICY",
"Statement": [
{
"Sid": "IPAllow",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:*",
"Resource": "arn:aws:s3:::my_tf_test_bucket/*",
"Condition": {
"IpAddress": {"aws:SourceIp": "8.8.8.8/32"}
}
}
]
}
POLICY
}
But if you are doing this at the same time then it's probably worth just inlining this into the original aws_s3_bucket resource like this:
locals {
  bucket_name = "my-new-bucket-123"
}

resource "aws_s3_bucket" "bucket" {
  bucket = "${local.bucket_name}"
  acl    = "public-read"
  policy = "${data.aws_iam_policy_document.s3_bucket_policy_document.json}"
  region = "eu-west-1"

  website {
    index_document = "index.html"
  }
}

data "aws_iam_policy_document" "s3_bucket_policy_document" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::${local.bucket_name}/*"]

    principals {
      type        = "AWS"
      identifiers = ["*"]
    }
  }
}
This builds the S3 ARN in the bucket policy by hand to avoid a potential cycle error from trying to reference the output arn from the aws_s3_bucket resource.
If you had created the bucket without the policy (by applying the Terraform without the policy resource) then adding the policy argument to the aws_s3_bucket resource will then cause Terraform to detect the drift and the plan will show an update to the bucket, adding the policy.
It's probably worth noting that the canned ACL used in the acl argument of the aws_s3_bucket resource overlaps with your policy and is unnecessary. You could use either the policy or the canned ACL to allow your S3 bucket to be read by all, but the public-read ACL also allows your bucket contents to be listed anonymously, like old-school Apache directory listings, which isn't what most people want.
When setting up Terraform to use S3 as a backend for the first time with a config similar to the one below:
# backend.tf
terraform {
  backend "s3" {
    bucket            = "<bucket_name>"
    region            = "eu-west-2"
    key               = "state"
    dynamodb_endpoint = "https://dynamodb.eu-west-2.amazonaws.com"
    dynamodb_table    = "<table_name>"
  }
}
resource "aws_s3_bucket" "<bucket_label>" {
bucket = "<bucket_name>"
lifecycle {
prevent_destroy = true
}
}
After creating the S3 bucket manually in the AWS console, run the following command to update the Terraform state to inform it that the S3 bucket already exists:
terraform import aws_s3_bucket.<bucket_label> <bucket_name>
The s3 bucket will now be in your Terraform state and will henceforth be managed by Terraform.
