I have an S3 bucket containing objects that I want to share with users of a website. I know I can use something like Query String Authentication to provide secure access to the objects, but what if I instead make each object publicly-readable yet "hidden" behind a complex key (i.e. URL) containing a cryptographically-strong random number? If the containing bucket disallows listing of objects, there wouldn't be a way to guess or discover the URLs, correct? Or is there some security hole I'm overlooking?
Side note: my first thought was to use UUIDs in the keys, but I read that they can apparently be predicted, given a few previous instances. That said, I don't have an understanding of how easily that can be done. If it's non-trivial, I probably wouldn't worry too much about using them instead of a strong random number...
The problem arises if a shared URL ends up in someone else's hands (say, because a user passes it on). As long as the URL is kept sufficiently secret, this approach is fine (say, you return the URL to a user over HTTPS and that user doesn't share it).
Any leak here becomes a security hole, and this is where the query-string signature scheme helps: the signatures expire after a fixed time, so re-sharing a URL does only limited harm.
You can use UUIDs (guard against duplicates by generating a new one if it collides with an existing key). A random (version 4) UUID carries 122 random bits, so it is far harder to guess than a typical 8-letter password, provided the generator draws from a cryptographically secure source.
The standard way to do what you want is to generate pre-signed URLs for each of the objects you want to share. If you make them with a short lifetime, then they cannot be shared outside that time period. All of the AWS-provided SDKs have support for this feature.
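To illustrate why an expiring signature limits re-sharing, here is a minimal sketch of the general idea using only Python's standard library. It is not the algorithm S3 uses; in practice you would call the SDK's pre-signing function (for example `generate_presigned_url` in boto3). The names `SECRET_KEY`, `sign_url` and `verify_url` are made up for this example.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load it from secure configuration.
SECRET_KEY = b"server-side-secret-for-illustration"

def sign_url(path: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """Append an expiry timestamp and an HMAC computed over the path and the expiry."""
    message = f"{path}\n{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires_at}&signature={signature}"

def verify_url(path: str, expires_at: int, signature: str, now: float,
               key: bytes = SECRET_KEY) -> bool:
    """Accept the request only if the link has not expired and the signature matches."""
    if now > expires_at:
        return False
    message = f"{path}\n{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

Because the expiry is part of the signed message, a recipient cannot extend the lifetime of a link without invalidating the signature.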
I'm creating a Meteor chat app. I want users to be able to send images to each other, but users not in the group/conversation shouldn't be able to see the image. My first thought was to give each image a unique ID and store the image under this ID in a public folder on my own or a third-party server.
For example, if the user uploads an image called "name.jpg", it could be stored in Amazon S3 as A3eedAcRCqCa32451.jpg. That way, anyone with the ID can access the image, but the only people with access to the ID would be those in the group chat, since I can ensure secure access using Meteor's publish and subscribe rules. However, this doesn't feel safe to me. Is my intuition right?
If yes, how else would I do it? I searched online and on StackOverflow and couldn't find another simple way to achieve this.
You usually have two things you can do when considering granting access to resources:
1. Authentication and access control
2. Indirect object references with sufficient entropy to make brute force attacks hopeless.
Point 1 is more or less obvious.
Point 2 is actually what you already have in mind. To reason about the security of the second approach, let's consider the following:
The entropy of A3eedAcRCqCa32451.jpg is about 80 bits.
The count of the domain is 2^80 = 1208925819614629174706176.
An attacker can try to guess the secret.
Let's say the attacker can make 10 guesses a second; on average, a guess succeeds after |domain|/2 tries.
It would take the attacker about 1.9 × 10^15 years (roughly 2 million billion years) to guess.
Still, an 80-bit domain is on the small side from a security perspective. Use version 4 UUIDs instead, which carry 122 random bits. I believe you see where this is going.
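The brute-force estimate can be checked with a few lines of arithmetic. This sketch assumes the figures from the list (80 bits of entropy, 10 guesses per second, success after half the domain on average) and, for comparison, the 122 random bits of a version 4 UUID. The function name is chosen for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def expected_years_to_guess(bits: int, guesses_per_second: float) -> float:
    """Average brute-force time: half the domain divided by the guess rate."""
    expected_guesses = 2 ** bits / 2
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR

years_80 = expected_years_to_guess(80, 10)
years_122 = expected_years_to_guess(122, 10)
print(f"80 bits:  {years_80:.2e} years")
print(f"122 bits: {years_122:.2e} years")
```

With these inputs the 80-bit case comes out at roughly 1.9 × 10^15 years, and every additional bit doubles that figure.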
It depends on the security level you want...
A more secure solution would be to store in a collection the imageId and userId of people that can access the image. When someone wants to access it, you just have to check if he's in the list of allowed users.
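As a rough sketch of that check, the snippet below keeps the mapping from image ID to allowed user IDs in an in-memory dictionary. This is a stand-in for the collection described above, not Meteor code, and the names `image_access`, `grant_access` and `can_view` are invented for the example.

```python
# Hypothetical in-memory stand-in for the collection: image ID -> set of allowed user IDs.
image_access: dict[str, set[str]] = {}

def grant_access(image_id: str, user_ids: set[str]) -> None:
    """Record which users may see an image (e.g. the members of the conversation)."""
    image_access.setdefault(image_id, set()).update(user_ids)

def can_view(image_id: str, user_id: str) -> bool:
    """Allow the download only if the requesting user is on the image's list."""
    return user_id in image_access.get(image_id, set())

grant_access("A3eedAcRCqCa32451", {"alice", "bob"})
print(can_view("A3eedAcRCqCa32451", "alice"))    # member of the conversation
print(can_view("A3eedAcRCqCa32451", "mallory"))  # knows the ID but is not on the list
```

Under this design, knowing the image ID alone is no longer enough; the server decides on every request.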
Then as you said you can use a 3rd party storage (personally I'm using ostrio:files with Dropbox integration, the docs about it aren't up to date but I made a pull request which was accepted on the dev branch with a working example, you can take a look at it here.)
The nice thing with ostrio:files is that it offers built-in functions like onAfterUpload or interceptDownload in which you can store data about access for the first and check if access is allowed for the second.
I run a landscaping company and have multiple crews. I want to provide each one with a custom URL (like mysite.com/xxxx-xxxx-xxxx) that shows their daily schedule. Going to the page will list the name, address and phone number of 5-10 customers for the day.
Is it safe/wise to use a UUID in a URL for semi-private data?
Depends on how safe you want it to be.
Are the UUIDs used for anything else? If not, they are fine for creating random URLs.
But, browser history would allow anyone using the same machine to find the URLs. Also, unless using https, a network sniffer could easily see the requested URLs and go to the same page.
Another concern is spider bots. Make sure nothing links to those pages, use a robots.txt to prevent indexing the site, but you still might find that some of the pages show up on search engines. It might be better to have the UUID set in a cookie and check that for determining which employee it is, lest your semi-private pages start showing up on google.
Whether that scheme works for you depends on your threat model (as well as some implementation details). Without a concrete threat model, it is not possible to give a definitive answer to your question.
I can, however, give you some ideas about potential issues with the solution, so you can determine if they are relevant for your application. This is not a complete list.
On the implementation side of things:
Not all UUID generators are created equal. Ideally, you want to use a generator based on a cryptographically secure RNG, providing a UUID where every byte is chosen at random.
Using the UUID for a database lookup or similar operation is not necessarily a constant-time operation (and thus there might be side-channel attacks unless you implement the lookup by yourself)
Make sure your URI does not leak via referrer
Some tools attempt to detect 'secret' URLs to protect them from history synchronization or other automatic features. Your schema will most likely not be detected as 'secret'. It might be better to artificially lengthen your URI and to move your UUID into a query parameter.
You can further reduce attack surface with the usual methods (rate limiting, server hardening, etc.)
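Two of the implementation points above, a cryptographically secure generator and a comparison that does not leak timing, can be sketched in Python as follows. The function names are made up for this example, and the constant-time check only covers comparing one presented token against one stored token; it does not make a database index lookup constant-time.

```python
import hmac
import secrets
import uuid

def new_secret_token() -> str:
    """Return a URL-safe token built from 32 bytes of the OS's secure random source."""
    return secrets.token_urlsafe(32)

def new_random_uuid() -> str:
    """Return a version 4 UUID; CPython's uuid.uuid4() draws its bytes from os.urandom."""
    return str(uuid.uuid4())

def tokens_match(presented: str, stored: str) -> bool:
    """Compare two tokens without short-circuiting on the first differing byte."""
    return hmac.compare_digest(presented.encode(), stored.encode())

token = new_secret_token()
print(tokens_match(token, token))
print(tokens_match(token, new_secret_token()))
```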
On the conceptual side of things:
A single identifier for both identification and authentication is not necessarily a bad thing. However, in most cases there is a need for an identification-only identifier – you must not use the 'secret' UUID in those scenarios
If a 'crew' consists of multiple people: you cannot revoke access for a single crew member
Some software (antivirus, browser, etc.) treats information in URLs as public information, and might upload them without user interaction
We are considering using carrierwave_direct for uploading files directly to s3 from the user's browser. The form generated by carrierwave_direct includes our aws_access_key, and a "signature", which is generated by the following code:
def signature
  Base64.encode64(
    OpenSSL::HMAC.digest(
      OpenSSL::Digest.new('sha1'),
      aws_secret_access_key, policy
    )
  ).gsub("\n","")
end
The policy argument is a method, and is generated using Time.now, so presumably that makes it very hard for an attacker to figure out our aws_secret_access_key. However, if just the aws_access_key and this signature are enough to authenticate as this s3 user (even if it's time-limited), why would an attacker need our aws_secret_access_key? Can't they just reload the page to get a signature that will work for a period of time? What am I missing here?
The reason I'm concerned is because we're using the same credentials in other parts of our app to do things that we definitely don't want arbitrary users to be able to do, and fog/carrierwave don't seem to provide a way to use one set of credentials for one operation, and another elsewhere.
The signature only authenticates the user with permission to perform the action allowed by the policy document that was signed to produce it. Change a single byte in the policy, and the signature is invalid. (You can prove this to yourself by tweaking it manually.)
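The "change a single byte" claim is easy to try out. The sketch below mirrors the Ruby method from the question in Python (HMAC-SHA1 over the policy, Base64-encoded); the secret and the policy string are placeholder values invented for the example, not real credentials or a real S3 policy.

```python
import base64
import hashlib
import hmac

def sign_policy(secret_access_key: str, policy: str) -> str:
    """HMAC-SHA1 over the policy, Base64-encoded on a single line."""
    digest = hmac.new(secret_access_key.encode(), policy.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

secret = "placeholder-secret"                     # illustrative only
policy = '{"expiration": "2030-01-01T00:00:00Z"}' # illustrative only
tampered = policy.replace("2030", "2031")         # one character changed

print(sign_policy(secret, policy))
print(sign_policy(secret, tampered))
```

The two printed signatures differ, so a client that edits the policy cannot reuse the signature it was handed.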
The AWS access key is intended to be safe to expose. The AWS secret is what you should never expose, and the signature does not contain enough information to reverse-engineer your secret from it in any practical way other than brute force... the keyspace is considered too large for this to be practical.
Still, it would be best (as always, not specifically here) to use a different key/secret pair that only has the minimum permissions required to accomplish the purpose, and to periodically rotate them.
Introduction
I want to create a Java web application for storing and backing up user files, similar to Dropbox. One interesting Dropbox feature is that it can detect whether a certain file already exists on the server. For example, if one user uploads a file to the server, another user who tries to upload the same file will not need to upload the content again. The server only needs to mark that he has the same file. This saves bandwidth and space and speeds things up in many ways.
The most basic solution to this problem is to use a file hash string, e.g. sha1, md5, etc., to identify the file. The client software checks whether a certain hash exists on the server. If it exists, the client can skip the upload and the server marks that the user has the same file.
Problem
The web application is built on a REST architecture so that users can easily write their own client software to upload their files. For security reasons, SSL is enabled for all transactions. My biggest security concern, though, is users claiming to have a file without actually owning it if I use sha1 or any other standard hash algorithm. This cannot be prevented by SSL or encryption. If a user manages to get the hash string (e.g. the md5 and sha1 of many files can be found by googling), he can mark that he has the file using the REST service of the web application.
So one possible solution is that the server requests a set of random bytes from the file as well as the hash of the whole file. Here are example steps:
Client checks whether a certain hash exists on the server. If the file already exists, the server returns the positions of the random bytes it requires.
Client sends the requested bytes if the server has the file. Client software will not be able to respond without having the actual file.
In this way, it can save the bandwidth as well as ensure that user owns the file they want to upload.
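The two steps above could look roughly like this. It is a sketch of the proposed exchange with invented function names, operating on in-memory bytes rather than real uploads, and it assumes the stored file is not empty.

```python
import secrets

def make_challenge(file_size: int, count: int = 16) -> list[int]:
    """Server: pick random byte offsets inside the file it already stores."""
    return [secrets.randbelow(file_size) for _ in range(count)]

def answer_challenge(data: bytes, positions: list[int]) -> bytes:
    """Client: read the bytes at the requested offsets from its local copy."""
    return bytes(data[p] for p in positions)

def verify_answer(stored: bytes, positions: list[int], answer: bytes) -> bool:
    """Server: compare the client's bytes with its own copy at the same offsets."""
    return answer == answer_challenge(stored, positions)

stored_file = b"hello world"
positions = make_challenge(len(stored_file), count=4)
print(verify_answer(stored_file, positions, answer_challenge(stored_file, positions)))
```

Fresh positions should be drawn for every attempt; otherwise a client could replay an answer it has observed before.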
Question
I am no expert in web security, so I have no idea whether this is a good idea or not. I have read articles saying that implementing your own fancy process might reduce security, because the scheme cannot be properly tested and the extra information may give an attacker a way in.
Does anyone have any comments on the process?
Will it reduce the security?
Does anyone have an idea to solve this problem differently?
I understand that there might not be an exact answer to this question, but I would like to hear if anyone has encountered the same problem and has a good solution to it.
Rather than asking the client to upload some random bytes of the file's contents, it may be better to ask the client to upload the hash of a random region of the file. That way you can use a wider range of sizes that you ask the client to verify.
Better yet, though, may be to send the client a random number and require the client to compute an HMAC of the entire file's contents using that number as the key. This is more computationally-expensive since the server must compute the HMAC too, but it verifies that the client has the entire file, not just a small portion of it.
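A minimal sketch of that HMAC challenge, assuming SHA-256 as the underlying hash and a 32-byte random key; the function names are invented for the example, and whole files are held in memory for brevity.

```python
import hashlib
import hmac
import secrets

def new_nonce() -> bytes:
    """Server: generate a fresh random key for this one challenge."""
    return secrets.token_bytes(32)

def prove_possession(file_bytes: bytes, nonce: bytes) -> str:
    """Client: HMAC of the entire file contents, keyed with the server's nonce."""
    return hmac.new(nonce, file_bytes, hashlib.sha256).hexdigest()

def verify_possession(stored: bytes, nonce: bytes, proof: str) -> bool:
    """Server: recompute the HMAC over its own copy and compare in constant time."""
    return hmac.compare_digest(prove_possession(stored, nonce), proof)

nonce = new_nonce()
original = b"contents of the file"
print(verify_possession(original, nonce, prove_possession(original, nonce)))
```

Because the key changes with every challenge, knowing a published md5 or sha1 of the file does not help produce a valid response.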
One unavoidable side effect of this hash feature, even with a verification scheme, is that it reveals that a copy of the file already exists somewhere on the server. That by itself may be sensitive information.
For the most stringent privacy protection, you should forego this feature and make each user upload their own copy of the file. You can use hash comparison on the server to avoid storing multiple copies of the file, transparently to the clients.
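Server-side deduplication of this kind can be sketched as a content-addressed store: every user uploads the full file, and the server keeps one blob per distinct hash. The class and method names below are hypothetical, and a dictionary stands in for real storage.

```python
import hashlib

class DedupStore:
    """Keeps one copy per distinct content while every user still uploads in full."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}             # sha256 hex digest -> contents
        self._files: dict[tuple[str, str], str] = {}   # (user ID, file name) -> digest

    def upload(self, user_id: str, name: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)   # store the bytes only the first time
        self._files[(user_id, name)] = digest  # the upload looks the same to every user

    def download(self, user_id: str, name: str) -> bytes:
        return self._blobs[self._files[(user_id, name)]]

    def blob_count(self) -> int:
        return len(self._blobs)

store = DedupStore()
store.upload("alice", "song.mp3", b"same bytes")
store.upload("bob", "copy.mp3", b"same bytes")
print(store.blob_count())
```

This saves storage but not bandwidth, which is the trade-off the answer describes.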
I need to generate UUIDs to eventually store in a database. Can I generate these UUIDs from JavaScript in the client's browser (there are some examples here)?
Is there any security risk in doing it this way? I understand that anyone can modify the UUID before it's passed to the server for storing, so I'll need to check that they are truly unique before storing them in the database. Other than that, is there anything else to check?
(Sorry for my english, feel free to correct any grammar errors)
edit: To answer questions about why I would want to do this: I can create a new object and its identifier in JavaScript, add it to my view, and then make an AJAX call to the server to add it to the database. This way, I don't need to load it back from the database to know what its primary identifier is.
Not really. As long as it's a simple identifier and nothing more, and you are indeed checking it for validity and uniqueness, it's no different than user accounts having an id in the url, for example.
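A server-side check for validity and uniqueness might look like the sketch below. It assumes the client sends version 4 UUIDs in the canonical hyphenated form and uses a set in place of the database's unique index; the function names are made up for the example.

```python
import uuid
from typing import Optional

def parse_client_uuid(value: str) -> Optional[uuid.UUID]:
    """Return the UUID if the string is a canonical version 4 UUID, otherwise None."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return None
    # Reject other versions and non-canonical spellings (braces, missing hyphens, ...).
    if parsed.version != 4 or str(parsed) != value.lower():
        return None
    return parsed

def accept_new_id(value: str, existing_ids: set) -> bool:
    """Accept an ID only if it is well-formed and not already stored."""
    parsed = parse_client_uuid(value)
    if parsed is None or parsed in existing_ids:
        return False
    existing_ids.add(parsed)
    return True

seen: set = set()
candidate = str(uuid.uuid4())
print(accept_new_id(candidate, seen))  # first use of a valid ID
print(accept_new_id(candidate, seen))  # duplicate submission
```

In a real database the uniqueness check should be enforced by a unique constraint, since a check-then-insert in application code can race.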
Look at your URL bar. I bet 1296234 is the primary key of this question, but I can't really do anything with that information. Same deal with your script.
What benefit do you see in generating these client-side? In all honesty, the best option is to generate it server-side, out of the user's reach. It may not save you from any serious security issues, but it will cut down on redundant validation.
Is there some reason you can't have the database generate (increment) an ID?
If, as you say, you'll have to check the uniqueness of the value before submitting it anyway, why not just have whatever backend language you are using generate it? That would make it much more opaque.
Yes. The risk is not specific to UUIDs; any client-generated ID carries some risk, depending on what you do with it. The problem is that it's very hard to authenticate the JavaScript. If you accept IDs generated by the client, you accept any ID an attacker chooses to send.
The risks may include,
Session stealing. If you use the ID to identify a session, someone may submit an existing ID as a newly generated one, and the server may treat it as an existing session if proper care is not taken.
Duplicate keys. A true UUID is random, but someone can deliberately submit duplicate keys, which will mess up your database.
You might find ways to defend against each of these attacks but that's passive protection. It might defeat the original purpose of generating IDs on the client, which is simple.