We have a setup that converts raw videos into HLS format (.m3u8 playlist and .ts segment files) and organises them into a directory inside an S3 bucket. Each directory inside the bucket represents one video. Since S3 doesn't really have the concept of a directory in its implementation, it does not allow us to get a signed URL covering the contents of a directory to feed into the video player.
I tried signing the URL for the .m3u8 file alone with getObject, but since the player then tries to fetch the individual segments of the video, those requests are rejected with a 403 by S3. Using CloudFront is not an option for us at this stage.
Is there a better, secure way to handle the streaming from S3 without making the entire bucket public?
For anybody still looking for a similar solution: you can't get a signed URL for a directory or wildcard using S3 alone. The better way to do it is to put CloudFront in front of S3 and use CloudFront signed URLs/cookies with a custom policy, which allows wildcards when signing.
Example from AWS Docs:
{
  "Statement": [
    {
      "Resource": "http://d111111abcdef8.cloudfront.net/training/*",
      "Condition": {
        "DateLessThan": { "AWS:EpochTime": 1357034400 }
      }
    }
  ]
}
More on that is explained here: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-creating-signed-url-custom-policy.html
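For reference, here is a minimal sketch of signing such a wildcard policy from Node.js with the aws-sdk CloudFront.Signer class (the key pair ID, private key path and distribution domain are placeholders):

const AWS = require('aws-sdk');
const fs = require('fs');

// The key pair ID and private key come from the CloudFront key pair / trusted key group you configure.
const signer = new AWS.CloudFront.Signer('KEY_PAIR_ID', fs.readFileSync('private_key.pem', 'utf8'));

// Custom policy with a wildcard resource, so one signature covers the playlist and all its segments.
const policy = JSON.stringify({
  Statement: [{
    Resource: 'https://d111111abcdef8.cloudfront.net/videos/video-123/*',
    Condition: {
      DateLessThan: { 'AWS:EpochTime': Math.floor(Date.now() / 1000) + 3600 } // valid for one hour
    }
  }]
});

// Signed URL for the playlist itself.
const playlistUrl = signer.getSignedUrl({
  url: 'https://d111111abcdef8.cloudfront.net/videos/video-123/index.m3u8',
  policy
});

For HLS, the signed-cookie variant of the same wildcard policy (signer.getSignedCookie({ policy })) is usually more convenient, because the player's .ts segment requests then carry the cookies automatically instead of needing the query string appended to every URL.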
Even though we did not want to use CloudFront initially, we ended up using it, since that seemed like the only feasible option at the time and developers from AWS also recommended the same.
If you are okay with building a custom solution, you can build a Lambda that acts like an authorizer and validates the wildcards on top of S3.
Related
I'm trying to better understand how the overall flow should work with AWS Lambda and my Web App.
I would like to have the client upload a file to a public bucket (completely bypassing my API resources), with the client UI putting it into a folder for their account based on a GUID. From there, I've got a Lambda that runs when it detects a change to the public bucket, resizes the file, and places it into the processed bucket.
However, I need to update a row in my RDS Database.
Issue
I'm struggling to understand the best practice for identifying the row to update. Should I be uploading another file with the necessary details (so that every image upload really consists of two files - an image and a JSON config)? Or should the image be processed, and then the client receives some data and makes an API request to update the row in the database? What is the right flow for this step?
Thanks.
You should use a pre-signed URL for the upload. This allows your application to put restrictions on the upload, such as file type, directory and size. It means that, when the file is uploaded, you already know who did the upload. It also prevents people from uploading randomly to the bucket, since the bucket does not need to be public.
The upload can then use an Amazon S3 Event to trigger the Lambda function. The filename/location can be used to identify the user, so the database can be updated at the time that the file is processed.
See: Uploading Objects Using Presigned URLs - Amazon Simple Storage Service
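As a rough sketch of that flow, assuming the Node.js aws-sdk (the bucket name, key layout and processing details are placeholders, not prescriptions):

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// API endpoint logic: the authenticated user asks for an upload URL.
function getUploadUrl(userId, fileName) {
  // Embed the user's ID in the key so the processing Lambda can identify the owner later.
  const key = `uploads/${userId}/${Date.now()}-${fileName}`;
  return s3.getSignedUrl('putObject', {
    Bucket: 'my-upload-bucket', // placeholder bucket name
    Key: key,
    Expires: 300,               // URL is valid for 5 minutes
    ContentType: 'image/jpeg'   // restrict the upload to the expected type
  });
}

// Lambda triggered by the S3 event: recover the user from the key, then update RDS.
exports.handler = async (event) => {
  for (const record of event.Records) {
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
    const userId = key.split('/')[1]; // uploads/<userId>/<timestamp>-<file>
    // ...resize the image, write it to the processed bucket, then UPDATE that user's row in RDS...
  }
};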
I'd avoid uploading a file directly to S3, bypassing the API. Uploading the file through your API allows you to control the file type, size etc., and you will know exactly who is uploading the file (API auth ID or user ID in the API body). Opening a bucket to the public for writes is also a security risk.
Your API clients can then upload the file via the API, which can store the file on S3 (and trigger another Lambda for processing) and then update your RDS with the appropriate metadata for that user.
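A minimal sketch of that variant, assuming an Express API with multer handling the multipart upload (bucket name, table and auth middleware are assumptions):

const express = require('express');
const multer = require('multer');
const AWS = require('aws-sdk');

const app = express();
const upload = multer({ limits: { fileSize: 5 * 1024 * 1024 } }); // the API enforces the size limit
const s3 = new AWS.S3();

app.post('/images', upload.single('image'), async (req, res) => {
  // req.user is assumed to be populated by your authentication middleware.
  const key = `uploads/${req.user.id}/${req.file.originalname}`;
  await s3.putObject({
    Bucket: 'my-app-images', // placeholder bucket name
    Key: key,
    Body: req.file.buffer,
    ContentType: req.file.mimetype
  }).promise();
  // Update RDS with the metadata for this user, e.g.:
  // await db.query('UPDATE users SET image_key = ? WHERE id = ?', [key, req.user.id]);
  res.json({ key });
});

app.listen(3000);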
Context
I am building Stateless REST APIs for a browser-based platform that needs to store some user-generated files. These files could potentially be in the GBs.
I am using AWS S3 for storage. I have used AWS SDK in the past for this to route the file uploads through the NodeJS server (Basically - Upload to Server, Server uploads to S3).
I am trying to figure out how to improve this using pre-signed URLs. I understand the dynamics and the flow of how to get the pre-signed URLs and how to upload the file to S3 directly.
I cannot use SQS or a Lambda triggered by an object-created event.
The architecture needs to be AWS independent.
Question
The simplest of flows I need to achieve is pretty common -
User --> Opens Profile
Clicks Upload Photo
Client Sends Request to /getSignedUrl
Server Returns the signedURL for the file name/type
The client executes the PUT/POST request to upload the file to the signedUrl
Upload Successful
After this - my understanding is -
Client tells the server - File Uploaded Successfully
Server associates the S3 Url for the Photo to the User.
...and that's my problem. How do I associate the successfully uploaded file back to the user on the server in a secure way?
Not sure what I've been missing. It seems like a trivial use case but I haven't been able to find anything regarding it.
1/ I think for the avatar, you should set it as public-read.
When creating the signed upload URL in
GET: /signed-upload-url
you need to set the image as public-read. After that you are free to interact with the image through its direct URL. Since this is an avatar, you can also compress it and reduce the image size with an AWS Lambda function.
2/ If you don't want the image to be public-read, the client needs to go through the server to get a signed download URL to interact with the image:
GET: /signed-download-url
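A minimal sketch of both endpoints with the Node.js aws-sdk (the bucket name, key layout and the auth middleware that sets req.user are assumptions):

const express = require('express');
const AWS = require('aws-sdk');

const app = express();
const s3 = new AWS.S3();
const BUCKET = 'my-avatar-bucket'; // placeholder

// GET /signed-upload-url - the key embeds the authenticated user's ID,
// so the server already knows which object belongs to which user.
app.get('/signed-upload-url', (req, res) => {
  const key = `avatars/${req.user.id}.jpg`;
  const url = s3.getSignedUrl('putObject', {
    Bucket: BUCKET,
    Key: key,
    Expires: 300,
    ContentType: 'image/jpeg',
    ACL: 'public-read' // option 1/: the client must also send the x-amz-acl: public-read header
  });
  res.json({ url, key });
});

// GET /signed-download-url - option 2/: keep the object private and hand out short-lived read URLs.
app.get('/signed-download-url', (req, res) => {
  const url = s3.getSignedUrl('getObject', {
    Bucket: BUCKET,
    Key: `avatars/${req.user.id}.jpg`,
    Expires: 300
  });
  res.json({ url });
});

app.listen(3000);

Because the key is derived from the authenticated session rather than from anything the client sends, the association to the user is already fixed when the URL is issued; after the upload the client only has to tell the API it is done (or the API can call s3.headObject on that key to confirm).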
I have gone through all the existing questions, but none of them seems to fulfil my requirements.
I have a private S3 bucket with 10,000 files, accessed privately via a Node.js server and displayed in my Angular application, at least 25 per page.
I found multiple solutions, but they seem inefficient to me:
Generate pre-signed URLs for the files.
Pull the images via the Node.js API from S3.
To display 10 or more images I need to generate signed URLs each time, which is a time-consuming process. And pulling an image via the API using the s3.getObject method gives me Buffer data; converting it to Base64 is hard to handle on the client side, and fetching each image takes time as well.
Are there any solutions out there which I'm not aware of, and how can this be implemented without affecting the user experience?
PS: My bucket is private, not public.
Have you tried signed cookies?
I think this may help you: put AWS CloudFront in front of the bucket and sign the cookie once, so that the client can access any of the files directly after that.
There is some reference.
Also, CloudFront will give you more benefits, such as optimised access speed, SSL certificates in front of your S3 buckets, and more.
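For illustration, a rough sketch of setting CloudFront signed cookies from a Node.js/Express endpoint (key pair ID, private key, domain and path are placeholders):

const AWS = require('aws-sdk');
const fs = require('fs');
const express = require('express');

const app = express();
const signer = new AWS.CloudFront.Signer('KEY_PAIR_ID', fs.readFileSync('private_key.pem', 'utf8'));

app.get('/grant-access', (req, res) => {
  // One custom policy with a wildcard covers every image the page will request.
  const policy = JSON.stringify({
    Statement: [{
      Resource: 'https://d111111abcdef8.cloudfront.net/images/*',
      Condition: {
        DateLessThan: { 'AWS:EpochTime': Math.floor(Date.now() / 1000) + 3600 }
      }
    }]
  });

  const cookies = signer.getSignedCookie({ policy });
  // cookies contains CloudFront-Policy, CloudFront-Signature and CloudFront-Key-Pair-Id.
  for (const [name, value] of Object.entries(cookies)) {
    res.cookie(name, value, { domain: '.example.com', secure: true, httpOnly: true });
  }
  res.sendStatus(204);
});

app.listen(3000);

// The Angular app can then use plain <img> tags pointing at the CloudFront URLs;
// the browser sends the cookies with every request, so no per-image signing is needed.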
"Sorry for my English"
I'm trying to understand the implementation differences between a standard AWS S3 upload scheme (e.g. using aws-sdk) and browser-based uploads, particularly in Node.js.
I understand that in any case, there needs to be a server that will store my AWS credentials and sign the requests to S3.
But there's a bunch of things I don't seem to understand:
If I use a browser-based upload, I'll have an HTML form on the client side with the signature and the policy values in hidden fields that I get from my server. But if I use the standard scheme for uploading files, i.e. completely through my server, how exactly is it implemented? There are a lot of code examples of server-side implementations, but what should happen on the client side? So there will be an HTML form with an action attribute pointing to my server's URL designated for file uploads, right? But what will actually happen? Will the file first get uploaded to my server's storage and then to S3? Or will it somehow use streaming? It really confuses me, and I'd really appreciate a code example with both server- and client-side code (a rough sketch of this server-proxied variant is included below).
What are the pros and cons of the two uploading schemes? When should I favour one approach over the other (my personal use case: video uploads in a multi-account system)?
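For what it's worth, a rough sketch of the "completely through my server" variant, where the incoming request is streamed straight to S3 without being stored on the server first (endpoint, bucket and field names are placeholders; busboy 1.x API assumed):

// Client side: a plain form posting to your own server, e.g.
// <form action="/upload" method="post" enctype="multipart/form-data">
//   <input type="file" name="video"><button>Upload</button>
// </form>

const express = require('express');
const busboy = require('busboy');
const AWS = require('aws-sdk');

const app = express();
const s3 = new AWS.S3();

app.post('/upload', (req, res) => {
  const bb = busboy({ headers: req.headers });
  bb.on('file', (name, fileStream, info) => {
    // The file is piped from the browser through the server into S3;
    // it never lands on the server's local disk.
    s3.upload({
      Bucket: 'my-video-bucket', // placeholder
      Key: `videos/${Date.now()}-${info.filename}`,
      Body: fileStream
    }, (err, data) => {
      if (err) return res.status(500).send(err.message);
      res.json({ location: data.Location });
    });
  });
  req.pipe(bb);
});

app.listen(3000);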
We are using Amazon S3 for images on our website, and users upload the images/files directly to S3 through our website. In our policy file we ensure it "begins-with" "upload/". Anyone is able to see the full URLs of these images, since they are publicly readable once uploaded. Could a hacker come in and use the policy data in the JavaScript and the URL of the image to overwrite these images with their own data? I see no way to prevent overwrites after uploading once. The only solution I've seen is to copy/rename the file to a folder that is not publicly writeable, but that requires downloading the image and then uploading it again to S3 (since Amazon can't really rename in place).
If I understood you correctly, the images are uploaded to Amazon S3 storage via your server application.
So only your application has Amazon S3 write permission. Clients can upload images only through your application (which will store them on S3). A hacker can only force your application to upload an image with the same name and overwrite the original one.
How do you handle the situation when a user uploads an image with a name that already exists in your S3 storage?
Consider the following sequence of actions:
The first user uploads an image some-name.jpg.
Your app stores that image in S3 under the name upload-some-name.jpg.
A second user uploads another image some-name.jpg.
Will your application overwrite the original one stored in S3?
I think the question implies the content goes directly through to S3 from the browser, using a policy file supplied by the server. If that policy file has set an expiration, for example, one day in the future, then the policy becomes invalid after that. Additionally, you can set a starts-with condition on the writeable path.
So the only way a hacker could use your policy files to maliciously overwrite files is to get a new policy file, and then overwrite files only in the path specified. But by that point, you will have had the chance to refuse to provide the policy file, since I assume that is something that happens after authenticating your users.
So in short, I don't see a danger here if you are handing out properly constructed policy files and authenticating users before doing so. No need for making copies of stuff.
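As an illustration, a sketch of generating such a policy server-side with the Node.js aws-sdk (createPresignedPost), using an expiration and a starts-with condition on the key (bucket name and prefix are placeholders):

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Generates the policy document and signature the browser upload form needs.
s3.createPresignedPost({
  Bucket: 'my-public-images', // placeholder bucket name
  Expires: 300,               // the policy is only valid for 5 minutes
  Conditions: [
    ['starts-with', '$key', 'upload/'],           // uploads are confined to the upload/ prefix
    ['content-length-range', 0, 5 * 1024 * 1024]  // and limited to 5 MB
  ]
}, (err, data) => {
  if (err) throw err;
  // data.url is the form action; data.fields become the hidden inputs of the upload form.
  console.log(data.url, data.fields);
});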
Actually, S3 does have a copy feature that works well:
Copying Amazon S3 Objects
But as amra stated above, doubling your storage by copying sounds inefficient.
Maybe it'd be better to give the object some kind of unique ID, like a GUID, and set additional user metadata (headers beginning with "x-amz-meta-") for more information about the object, such as the user that uploaded it, a display name, etc.
On the other hand, you could always check if the key already exists and raise an error.
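A small sketch of that idea, generating a unique key, optionally checking for an existing object with headObject, and storing user metadata on the object (bucket and field names are placeholders):

const AWS = require('aws-sdk');
const { randomUUID } = require('crypto'); // Node 14.17+
const s3 = new AWS.S3();
const BUCKET = 'my-image-bucket'; // placeholder

async function storeImage(buffer, originalName, userId) {
  const key = `upload/${randomUUID()}.jpg`; // unique key, so existing objects are never overwritten

  // Optional explicit collision check instead of silently overwriting.
  let exists = true;
  try {
    await s3.headObject({ Bucket: BUCKET, Key: key }).promise();
  } catch (err) {
    if (err.code !== 'NotFound') throw err;
    exists = false;
  }
  if (exists) throw new Error(`Key ${key} already exists`);

  await s3.putObject({
    Bucket: BUCKET,
    Key: key,
    Body: buffer,
    // Stored as x-amz-meta-uploaded-by / x-amz-meta-display-name headers on the object.
    Metadata: { 'uploaded-by': userId, 'display-name': originalName }
  }).promise();

  return key;
}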