I'm developing a filesharing website and I have a couple of questions regarding Windows Azure Shared Access Signatures.
About my website: Registered users can upload, share and store their files using blob storage. The files can be up to 2GB in size so I want the upload and download to be as fast as possible. It's
also important that the administration cost for me as a host is kept to a minimum. User-stored files must be private.
I'm OK with using SAS URIs for uploads, but for downloads I'm a bit spooked.
Questions:
1. Users can store files on their account and these files should only be accessed by that user. If I were to use SAS URI downloads here, the files will always be available with a URI as long as
the URI lives (it doesn't require you to be logged in; if you know the URI, you can just download the file). This is quite scary if you want the file to be private. I know the signature in the SAS URI
is an "HMAC computed over a string-to-sign and key using the SHA256 algorithm, and then encoded using Base64 encoding". Is this safe? Is it acceptable to use SAS URIs for downloads even if
the files are private? Should I instead stream the file between the server and website? (This would be much safer, but the speed will suffer and the administration cost will rise.)
2. How much slower and how much more will it cost if I stream the downloads between (server, website, user) instead of using SAS, (server directly to user)?
3. If I set the SAS URI expiry time to 1 hour and the download takes longer than 1 hour, will the download cancel if the download started before the expiry time?
4. If my website is registered at x.azurewebsites.net and I'm using a purchased domain so I can access my website at www.x.com, is it possible to make the SAS URIs look something like this:
https://x.com/blobpath instead of https://x.blob.core.windows.net/blobpath? (My guess is no.)
Sorry for the wall of text!
Nothing stops someone from sharing a URI, with or without a SAS. So from a safety perspective, if you set the expiry date far into the future, the blob will remain accessible through the SAS-encoded URI for that whole time. From an overall security perspective: since your blob is private, nobody will have access to it without a SAS-encoded URI. To limit SAS use: if, instead of being issued a long-standing SAS URI, the user visited a web page (or API) to request file access, you could generate a new SAS URI for a smaller time window. The end user would still be able to access the blob directly, without streaming the content through the VM (this just adds an extra network hop for obtaining the URI, along with whatever is needed to host the web / API server). Also related to security: if you use a stored access policy, you can modify access after issuing the SAS, rather than embedding the start and end time directly into the SAS URI itself (see here for info about access policies).
You'll incur the cost of the VM(s) used for fronting the URI requests. Outbound bandwidth costs are the same as using blob access directly: you pay for outbound bandwidth only. Performance will be affected by many things if going through a VM: VM size, VM resource use (e.g. if your VM is running at 100% CPU, you might see performance degradation), number of concurrent accesses, etc.
Yes, if the user hits expiry time, the link is no longer valid.
Yes, you can use a SAS combined with custom domain names used with storage. See here for more information about setting up custom domain names for storage.
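To make the signature discussion from question 1 more concrete, here is a minimal Python sketch of the general idea: an HMAC-SHA256 over a string-to-sign, Base64-encoded, with an expiry check. It is only an illustration. The string-to-sign below is a simplified stand-in (the real service signs more fields), and `ACCOUNT_KEY`, `sign_blob_access` and `is_valid` are hypothetical names, not part of any Azure SDK.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

# Hypothetical Base64-encoded account key; a real key comes from the storage account.
ACCOUNT_KEY = base64.b64encode(b"example-key-not-a-real-secret")


def sign_blob_access(blob_path: str, expiry: datetime, key_b64: bytes = ACCOUNT_KEY) -> str:
    """Return a Base64-encoded HMAC-SHA256 over a simplified string-to-sign."""
    # Assumption: only the path and expiry are signed here; the real
    # string-to-sign contains more fields (permissions, version, ...).
    string_to_sign = f"{blob_path}\n{expiry.strftime('%Y-%m-%dT%H:%M:%SZ')}"
    digest = hmac.new(
        base64.b64decode(key_b64), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid(blob_path: str, expiry: datetime, signature: str, now: datetime,
             key_b64: bytes = ACCOUNT_KEY) -> bool:
    """Accept the request only before expiry and only if the signature matches."""
    if now >= expiry:
        return False
    expected = sign_blob_access(blob_path, expiry, key_b64)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    expiry = now + timedelta(minutes=10)  # short window, as suggested above
    sig = sign_blob_access("/files/user42/report.pdf", expiry)
    print(is_valid("/files/user42/report.pdf", expiry, sig, now))
    print(is_valid("/files/user43/report.pdf", expiry, sig, now))
```

Because the path and expiry are part of the signed string, changing either one in the URI invalidates the signature; without the key a caller cannot produce a matching one.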
I have a few questions regarding Firebase Storage.
I am generating download URLs for Firebase Storage objects using an admin account (which has custom claims) and storing the URLs in Firestore.
Users can read the Firestore document to get the URL instead of having to call getDownloadUrl on the client side code.
Q1) I noticed there is a token at the end of the storage URLs. Is this specific to my admin account and is it safe that non-admin users can now read this token?
Q2) Furthermore if a non admin user called getDownloadUrl on the same storage path would they receive the same URL as the admin account or a different one?
Q3) If I switch to using getDownloadUrl on the client side would this increase my cost when using firebase storage?
Q4) If I am caching the content by URL and the URL changes, it will redownload and not use the cache. Are these download links unique, or can getDownloadURL return different URLs on subsequent calls?
Thanks a lot
Edit ---
Sorry I have an additional question
Q5) To move files on Firebase Storage I currently download them to my local PC and reupload them to another location -- seems very inefficient.
I have seen people using file.move() (as can be seen here).
Would this be possible to call in a Firebase function (they talk about storage rules being an issue in the comments, although that's from 2016), and if so, how would this be cheaper than my manual download and upload?
Sorry for many questions :)
Q1) I noticed there is a token at the end of the storage URLs. Is this specific to my admin account and is it safe that non-admin users can now read this token?
This token is a random ID generated for this specific file. It won't change unless you change it intentionally (you can "revoke" the token from the Firebase Console, which will replace it with a new token). Everyone who possesses the URL can view the file whether they are authenticated or not. However, the URL is "hard to guess", so unless you share it with someone, it will stay secret, practically speaking.
Q2) Furthermore if a non admin user called getDownloadUrl on the same storage path would they receive the same URL as the admin account or a different one?
The returned URL will always be the same, unless you invalidate it in the Firebase Console. If you don't want clients to call getDownloadURL on the files, add a Storage Security Rule that denies reads:
match /path/to/{file} {
  allow read: if false;
  // Or, if only authed users should be able to call getDownloadURL,
  // use this rule instead of the one above:
  // allow read: if request.auth != null;
}
Q3) If I switch to using getDownloadUrl on the client side would this increase my cost when using firebase storage?
A call to getDownloadUrl() does utilize some Google Cloud resources that you will have to pay for, whether you do it server-side or client-side. It's a "Class B" operation (check Google Cloud pricing), and a bit of data transfer.
Q4) If I am caching the content by URL and the URL changes, it will redownload and not use the cache. Are these download links unique, or can getDownloadURL return different URLs on subsequent calls?
The same URL is returned each time, unless you manually invalidate the token. (By the way, the caching policy that sets the Cache-Control header is set on the object as metadata when you upload it.)
Q5) To move files on firebase storage I currently download them to my local pc and reupload them to another location -- seems very inefficient. [..] Would this be possible to call in a firebase function
Yes, you can move files in a Firebase Cloud Function. The Firebase Admin SDKs bypass security rules.
1) I noticed there is a token at the end of the storage URLs. Is this specific to my admin account and is it safe that non-admin users can now read this token?
Depends on what you have at the moment since you can integrate Custom Authentication with Firebase which will allow you to create custom tokens that can be used to sign into the Firebase Authentication service on a client application and assume the identity described by the token’s claim. This can be used when accessing other Firebase services, such as Cloud Storage, etc.
In general your server should create a custom token with a unique identifier.
2) Furthermore if a non admin user called getDownloadUrl on the same storage path would they receive the same URL as the admin account or a different one?
Depends on how you are setting the permissions for the getDownloadUrl. If you have a customized one they can receive a different one but usually it returns a new instance that points to the current reference.
3) If I switch to using getDownloadUrl on the client side would this increase my cost when using firebase storage?
I am not sure about this. I have checked the documentation and found nothing that indicates a quota or pricing for this specific method, so I would assume it does not increase your cost, but I might be wrong on this one.
4) If I am caching the content by URL and the URL changes, it will redownload and not use the cache. Are these download links unique, or can getDownloadURL return different URLs on subsequent calls?
As specified before, it returns a new instance that points to the current reference so these download links are unique.
5) To move files on firebase storage I currently download them to my local pc and reupload them to another location -- seems very inefficient.
For this question and the last part of your initial post, I would suggest that you create a support ticket and ask the Firebase Support Team for more details, since it is better suited to them than to StackOverflow. (https://firebase.google.com/support)
I have an Azure Blob Storage account with PDF blobs that are categorized by client number. Each client has multiple PDF reports. I only want each client to be able to access the blobs for their own client number. (There are hundreds of clients.)
I've researched this, but I only see shared access signatures, and these don't look like what I need.
There are no user-level blob permissions, other than Shared Access Signatures (and Policies).
It's going to be up to you to manage access to specific user content (and how you manage that is really up to you and your app, and how you manage a user's content metadata).
When providing a link to a user's content: if you assume all content is always private, then simply create an on-demand SAS link when requested. There's no way for the user to modify a SAS link to guess sequential numbers or neighboring blobs, since the SAS is for a specific URL.
As Andrés suggested, you could also use your app to stream blob content, and never worry about SAS. However, you will now be consuming resources of your web app (network, CPU, memory), and this will have an impact on your app's scale requirements. You will no longer be able to offload this to the storage service.
Sounds like you already have the users authenticate, and you know which pdfs belong to them. My suggestion is to add to your current application a simple proxy (for instance if you have an MVC application, you could add a new controller and action method that will retrieve the pdfs on behalf of the user).
This way you don't need to use shared access signature and can keep the blob container private. Your controller/action method would simply use the storage SDK to retrieve the blob. An added bonus is that you could check to make sure that they are requesting their own PDF file and reject the request if they guess the ID of someone else's file.
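The ownership check described above could look roughly like the following Python sketch. It assumes a hypothetical naming scheme in which each blob is stored as `<client number>/<file name>`, and that the client number is taken from the authenticated session rather than from the request; `blob_name_for` is an illustrative helper, not a storage SDK function.

```python
def blob_name_for(session_client_number: str, requested_file: str) -> str:
    """Build the blob name for the logged-in client, rejecting path tricks.

    Assumption: blobs are named "<client number>/<file name>". The client
    number comes from the authenticated session, never from the URL, so a
    user cannot ask for another client's prefix.
    """
    if not requested_file or requested_file in (".", ".."):
        raise PermissionError("invalid file name")
    if "/" in requested_file or "\\" in requested_file:
        # A separator would let the caller escape their own prefix.
        raise PermissionError("file name must not contain a path")
    return f"{session_client_number}/{requested_file}"


if __name__ == "__main__":
    print(blob_name_for("1001", "q1-report.pdf"))
    try:
        blob_name_for("1001", "../1002/q1-report.pdf")
    except PermissionError as exc:
        print("rejected:", exc)
```

The controller would then fetch only the returned blob name with the storage SDK, or issue a SAS for exactly that name if you prefer to offload the download to storage.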
Hello, I have a web page that loads a blob resource using a SAS policy every time a hyperlink is clicked. This means that if I click the link twice or more, I generate two or more different signed URLs for the same blob resource. My question is: is there a way to overwrite or cancel the previously generated SAS policies and keep only the URL generated by the user's last click?
Thank you in advance.
Technically it is possible to do so; however, it is not a recommended approach. The reason is that there can only be 5 access policies on a blob container at any point in time, and changing access policies requires a round trip to storage (i.e. a network call). Suppose there are hundreds of users on your website, all of them accessing the same resource. Changing the access policy on the fly would result in errors for some of those users, and because it involves a network call, the overall experience may be degraded.
One thing you could do is keep the SAS expiry time short, so that the SAS URL is only valid for a short amount of time and there is less chance of it being misused.
To change the access policy, you would first need to fetch the existing access policies on the container. Then you could either update the access policy identifier, or remove that access policy and create a new one, and then save the access policies.
Here are three questions for you!
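As a rough illustration of the fetch, modify, save sequence, the Python sketch below models a container's stored access policies as a plain dictionary. It is an in-memory model only: in a real application the fetch and save steps are network calls made through the storage SDK, and `replace_policy` and the policy IDs are hypothetical names.

```python
MAX_STORED_POLICIES = 5  # per-container limit mentioned above


def replace_policy(policies: dict, old_id: str, new_id: str, new_policy: dict) -> dict:
    """Return a copy of `policies` with `old_id` removed and `new_id` added.

    `policies` stands in for what was fetched from the container; the
    caller is expected to save the returned mapping back to storage.
    """
    updated = dict(policies)        # work on a copy of the fetched policies
    updated.pop(old_id, None)       # SAS URLs tied to old_id stop working once saved
    updated[new_id] = new_policy    # new SAS URLs reference new_id
    if len(updated) > MAX_STORED_POLICIES:
        raise ValueError("a container can hold at most 5 stored access policies")
    return updated


if __name__ == "__main__":
    fetched = {"read-v1": {"permissions": "r"}}
    print(replace_policy(fetched, "read-v1", "read-v2", {"permissions": "r"}))
```

Note that every SAS issued under the removed policy is invalidated at once, which is why doing this per click would break other users sharing the same policy.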
Is it possible to revoke an active SAS URI without refreshing storage key or using Stored Access Policy?
In my application, all users share the same blob container. Because of this, using a stored access policy (max 5 per container) or refreshing the storage key (which would invalidate ALL SAS URIs) is not an option for me.
Is it possible to show custom errors if the SAS URI is incorrect or expired?
This is the default page:
If I let users create their own SAS URI for uploading/downloading, do I need to think about setting restrictions? Can this be abused?
Currently, in my application, there are restrictions on how much you are allowed to upload, but no restrictions on how many SAS URIs you are allowed to create. Users can acquire as many SAS URIs as they like, as long as they don't complete their upload or exceed the allowed stored bytes.
How do real filesharing websites deal with this?
How much does a SAS URI cost to create?
Edit - Clarification of question 3.
Before you can upload or download a blob you must first get the SAS URI. I was wondering if it's "expensive" to create a SAS URI. Imagine a user exploiting this, creating a SAS URI over and over again without finishing the upload/download.
I was also wondering how real filesharing websites deal with this. It's easy to store information about how much storage the user is using and to apply restrictions based on that, but if a user keeps uploading files to 99%, then cancels, restarts and does the same thing again, I imagine it would cost the host a lot.
To answer your questions:
No, ad-hoc SAS tokens (i.e. tokens without a Stored Access Policy) can't be revoked other than by changing the storage key.
No, at this time it is not possible to customize the error message. The standard error returned by the storage service will be shown.
You need to provide more details regarding 3. As it stands, I don't think we have enough information to comment.
UPDATE
Regarding your question about how expensive creating a SAS URI is: creating a SAS URI does not involve making a REST API call to the storage service, so there's no storage transaction involved. From the storage side, there's no cost in creating a SAS URI. Assuming your service is a web application, the only cost I can think of is the user's call to your service to create the SAS URI.
Regarding your comment about how real file sharing websites deal with it, I think unless someone with a file sharing website answers it, it would be purely speculative.
(My Speculative response :)) If I were running a file sharing website, I would not worry too much about this kind of thing simply because folks don't have time to "mess around" with your site/application. It's not that the users would come to your website with an intention of "let's just upload files till the upload is 99%, cancel the upload and do that again" :). But again, it is purely a speculative response :).
I'm creating an application that will be hosted in Azure. In this application, users will be able to upload their own content. They will also be able to configure a list of other trusted app users who will be able to read their files. I'm trying to figure out how to architect the storage.
I think that I'll create a storage container named after each user's application ID, and they will be able to upload files there. My question relates to how to grant read access to all files to which a user should have access. I've been reading about shared access signatures and they seem like they could be a great fit for what I'm trying to achieve. But, I'm evaluating the most efficient way to grant access to users. I think that Stored access policies might be useful. But specifically:
Can I use one shared access signature (or stored access policy) to grant a user access to multiple containers? I've found one piece of information which I think is very relevant:
http://msdn.microsoft.com/en-us/library/windowsazure/ee393341.aspx
"A container, queue, or table can include up to 5 stored access policies. Each policy can be used by any number of shared access signatures."
But I'm not sure if I'm understanding that correctly. If a user is connected to 20 other people, can I grant him or her access to twenty specific containers? Of course, I could generate twenty individual stored access policies, but that doesn't seem very efficient, and when they first log in, I plan to show a summary of content from all of their other trusted app users, which would equate to demanding 20 signatures at once (if I understand correctly).
Thanks for any suggestions...
-Ben
Since you are going to have a container per user (for now I'll equate a user with what you called a user application ID), you'll have a storage account that can contain many different containers for many users. If you want the application to be able to upload to only one specific container while reading from many, two options come to mind.
First: Create an API that lives somewhere and handles all the requests. Behind the API your code will have full access to the entire storage account, so your business logic will determine what users do and do not have access to. The upside of this is that you don't have to create Shared Access Signatures (SAS) at all. Your app only knows how to talk to the API. You can even combine the data they see in that summary of content by making parallel calls to the various containers from a single call from the application. The downside is that you are now hosting this API service, which has to broker ALL of these calls. If you go the SAS route you'd still need the API service, but only to generate the SAS; the client applications would then make their calls directly to the Windows Azure storage service, which bears the load and reduces the resources you actually need.
Second: Go the SAS route and generate SAS as needed, but this will get a bit tricky.
You can only create up to five Stored Access Policies on each container. For one of these five you create one policy for the "owner" of the container which gives them Read and write permissions. Now, since you are allowing folks to give read permissions to other folks you'll run into the policy count limit unless you reuse the same policy for Read, but then you won't be able to revoke it if the user removes someone from their "trusted" list of readers. For example, if I gave permissions to both Bob and James to my container and they are both handed a copy of the Read SAS, if I needed to remove Bob I'd have to cancel the Read Policy they shared and reissue a new Read SAS to James. That's not really that bad of an issue though as the app can detect when it no longer has permissions and ask for the renewed SAS.
In any case you still kind of want the policies to be short lived. If I removed Bob from my trusted readers I'd pretty much want him cut off immediately. This means you'll be going back to get a renewed SAS quite a bit and recreating the shared access signature, which reduces the usefulness of the stored access policies. This really depends on how long you were planning on allowing the policy to live and how quickly you'd want someone cut off if they were "untrusted".
Now, a better option could be to create ad-hoc signatures. You can have as many ad-hoc signatures as you want, but they can't be revoked and, in older versions of the storage service, can last at most one hour. Since you'd make them short lived, the length limit and the lack of revocation shouldn't be an issue. Going that route means the application will come back to get them as needed, but given what I mentioned above about wanting the SAS to run out when someone is removed, this may not be a big deal. As you pointed out, this does increase the complexity of things because you're generating a lot of SASs; however, with these being ad-hoc you don't really need to track them.
If you were going to go the SAS route, I'd suggest that your API generate the ad-hoc ones as needed. They shouldn't last more than a few minutes, as people can have their permissions to a container removed and all you are trying to do is reduce the load on the hosted service for actually doing the upload and download. Again, all the logic for handling what containers someone can see is still in your API service, and the applications just get signatures they can use for small periods of time.
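The decision the API makes before handing out a signature can be sketched in a few lines of Python. This is only an outline under assumed names: `TRUSTED_READERS`, `SAS_LIFETIME` and `grant_for` are hypothetical, the trust data would normally live in your own database, and the actual signature would be generated by the storage SDK using the returned permissions and expiry.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

SAS_LIFETIME = timedelta(minutes=5)  # assumed "few minutes" lifetime

# Hypothetical trust data: container owner -> users allowed to read.
TRUSTED_READERS = {"alice": {"bob", "james"}}


def grant_for(user: str, container_owner: str,
              now: datetime) -> Optional[Tuple[str, datetime]]:
    """Return (permissions, expiry) for a short-lived ad-hoc SAS, or None."""
    if user == container_owner:
        return ("rw", now + SAS_LIFETIME)   # owners may read and write
    if user in TRUSTED_READERS.get(container_owner, set()):
        return ("r", now + SAS_LIFETIME)    # trusted users may only read
    return None                             # everyone else gets nothing


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(grant_for("bob", "alice", now))
    TRUSTED_READERS["alice"].discard("bob")   # Alice removes Bob
    print(grant_for("bob", "alice", now))     # Bob is refused from now on
```

Once Bob is removed from the list, no new signature is issued for him, and any signature he already holds stops working within the configured lifetime.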