Securing Files over the Web: Fine-Grained, Authorization-Based File Access

I have a system where employees can upload files. There are three ways:
upload to my own account in public, private, or protected mode;
upload to a department account in public, private, or protected mode;
upload to the organization account in public, private, or protected mode;
where "public" means visible to anyone, "private" means visible only to the owning person or group, and "protected" means visible to anyone in the organization.
All the files for an organization are stored in a directory, say /files/<organizationId>/, on a file server, like:
files
+-- 234809
|   +-- img1.jpg
|   +-- doc1.pdf
+-- 808234
    +-- doc2.pdf
I am storing the file path and privacy level in the DB, so I can control whether to show a link to a file to a given user on a given page.
The problem is that I do not have any control over a file's URL, so if someone types the URL of img1.jpg into the browser's address bar, there is no way to know whether the logged-in user is eligible to see img1.jpg.
Any suggestions?
It's a Java application. However, there's a separate GlassFish instance working as the file server. Since the app is not released yet, we are open to adopting a better file-access strategy.
The users accessing the files may or may not be logged in, but we can always authenticate a user by redirecting to the login page if we know that the file being accessed is private or shared.
Thanks
Nishant

You pose an interesting question, and your understanding of the problem is correct.
Depending on the version of IIS that is serving the content, you may not have access control even if the content is within your vdir.
A typical solution to this type of scenario is to store the files in a directory that is NOT accessible from the internet and use an HttpHandler that IS protected to stream the files out.
There are several ways to go about this, the simplest being an HttpHandler mapped to a nonexistent directory, say /downloads, that parses the filename out of the RequestUri, sets the proper content type, and writes the file to the Response.
In this case, your HttpHandler IS protected, enabling you to determine access.
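For concreteness, here is the same pattern as a minimal sketch in TypeScript/Express (the question runs Java on GlassFish and the answer above speaks in IIS terms, so treat this purely as an illustration of the pattern; isAuthorizedFor() and the /var/app-data/files path are assumptions standing in for your own DB lookup and storage layout):

import express from "express";
import path from "path";
import fs from "fs";

const app = express();
// Files live here, outside the web root; nothing serves this directory statically.
const STORAGE_ROOT = "/var/app-data/files";

// Placeholder: look up the file's privacy level and the current user's rights in the DB.
function isAuthorizedFor(req: express.Request, fileName: string): boolean {
    return false; // assumption: replace with the real check
}

// "/downloads" does not exist on disk; this handler owns the whole path space.
app.get("/downloads/:name", (req, res) => {
    const safeName = path.basename(req.params.name); // strips any "../" traversal
    const filePath = path.join(STORAGE_ROOT, safeName);
    if (!isAuthorizedFor(req, safeName)) {
        res.sendStatus(403); // or redirect to the login page
    } else if (!fs.existsSync(filePath)) {
        res.sendStatus(404);
    } else {
        res.sendFile(filePath); // streams the file; Content-Type inferred from the extension
    }
});

Because every request for a file now passes through the handler, the privacy level stored in the DB can be enforced no matter who types the URL into the address bar.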

You could store the files outside of the public folders and have some sort of route that catches any URL requesting a file from an organization. Then you can serve the file programmatically, rather than letting your web server do it without any control.

Related

How to save and use files "safely" in Express/Node.js?

I would like to save user files in my project, such as a user's profile image (JPG) or grade information (XML).
Right now, I know:
how to implement the upload process in Express
how to serve static files in Express (from the "public" dir) for CSS, JS, and page images
But if I upload a user's file to the public directory, a client can access it just with a URL like "my_website/public/.../....xml". I think that is bad for security, because everyone can access other users' profile images, grade information, and so on.
So, my questions are:
Is it okay to save users' sensitive information in the public (static) directory?
If not, is there any way to save files safely other than in the database?
On real websites, where are users' sensitive files saved? In the same directory as server files like main.js, or on a completely separate server?
Is it okay to save users' sensitive information in the public (static) directory?
No, it is not OK. That allows anyone access to private information.
If not, is there any way to save files safely other than in the database?
Yes, there are other ways. For example, you could store each user's information in a directory just for that user and only allow users access to the data in their own directory. This obviously requires authentication with some sort of username and credential/password. You won't be using express.static() to access private information. Instead, you will create a route for each type of resource, verify who the logged-in user is, and provide access to only the resources that belong to that logged-in user, as sketched below.
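A minimal sketch of such a route, assuming an Express app where some session mechanism has already established the user's ID (requireLogin, the session shape, and /srv/userdata are all illustrative, not a fixed API):

import express from "express";
import path from "path";

const app = express();
// Per-user files live here, outside public/; this path is never handed to express.static().
const USER_DATA_ROOT = "/srv/userdata";

// Placeholder auth middleware: however you authenticate, it must yield a user id.
function requireLogin(req: express.Request, res: express.Response, next: express.NextFunction) {
    const userId = (req as any).session?.userId; // assumes express-session or similar
    if (!userId) {
        return res.redirect("/login");
    }
    res.locals.userId = userId;
    next();
}

app.get("/my/files/:name", requireLogin, (req, res) => {
    const safeName = path.basename(req.params.name); // blocks "../" traversal
    // The route only ever reads from the logged-in user's own directory.
    const filePath = path.join(USER_DATA_ROOT, res.locals.userId, safeName);
    res.sendFile(filePath, (err) => {
        if (err) res.sendStatus(404);
    });
});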
On real websites, where are users' sensitive files saved? In the same directory as server files like main.js, or on a completely separate server?
There are lots of ways to implement user-specific storage, but it is highly unlikely to be in the same directory as other server resources (like code files). User-specific data would be stored somewhere else.

How to hide the contents of a txt file from direct URL access

I'm working on a Windows app that reads an "authorized domains" list from a txt file, fetched with a web request from "domain.com/sub/txtfile".
I don't want people to see the content of the file when they enter the URL directly in a browser. Is it possible to achieve this with some .htaccess hack or something else?
As your app is a client-side native Windows application, it's not possible to store any secret in the app itself that could be used for authentication. Since the user has everything the Windows app has, it is impossible to authenticate the client, as discussed many times here.
It also doesn't make much sense. Imagine it were somehow possible and the file contents were only visible to your app. What would be the purpose? What if an attacker changed the hosts file on Windows to download the file from a rogue server? What if he used an intermediate proxy to inspect, change, or replace the contents? The latter is also possible with HTTPS, because the user has full control of the client and can trust whatever certificate he wants.
You could authenticate the user, though. An attacker could still see and modify downloaded file contents, but at least not just anybody could download the file, only your authenticated users. But this means having a user database on the server the file is downloaded from and implementing proper authentication. And it still doesn't solve the other problems.
In short, you can't protect a client-side application from a user who controls the whole client.

What is the most elegant way to restrict users from accessing other users' content?

I have to build a project where users log in and upload images and videos, and the uploaded files should be visible only to the uploader or the site admin.
I mainly use Node.js, so I tried using Express middleware to restrict the media files by user, but it came to my notice that this isn't the best way to handle it, as Express isn't good at serving static content.
Here are some options I can think of after some Google sessions:
An Amazon S3 bucket where each user gets their own folder/permissions and files go into it (but are the files truly private when we have a URL?)
Generate temporary pre-signed URLs from the S3 bucket (the file would be public for 20 minutes; I don't want this)
Restrict access in Nginx (again, I don't know whether Nginx can query the database and authenticate the request it receives)
Use GridFS with MongoDB? (I will probably not use this, but want to see whether it could be a solution)
Is there any other way to do this?
Each user is given a unique ID, and content (files, videos, etc.) is referenced in part by this ID, such that clients are given access only to content under their own ID.
The user logs in, and Node.js pairs that user with their unique ID as retrieved from MongoDB. No need to offload this to Nginx. Where you stow the content is independent of this logic: it could be a local filesystem, Mongo, S3, etc.
Avoid putting the username or ID into any URL; it's redundant, as the server maintains this knowledge internally. No need to muddy up the URL. For example, Amazon AWS shows this URL when I interact with my private content:
https://console.aws.amazon.com/route53/home?#resource-record-sets:ZAHKXPA7IKZ8W
See, it just shows the content ID, which is unique to my username's resources. The goal is to minimize clutter in the URL.
Alternatively, in the world of shared resources, like on Flickr, the URL does give the username and resource ID: https://www.flickr.com/photos/okinawa-soba/31419099190/in/photostream/ In your case, even if someone picks up the URL of another user's content, the server must reject the request, since the requester is not authorized to touch another user's content.
The most elegant way would be to politely ask them to refrain from accessing other people's content.
I'm not sure it would be the most effective way, though.
Seriously, though, I would suggest using the express.static middleware together with Passport or something similar to implement authentication and permissions. You're not "rendering" any static content here; you're just streaming files straight from the disk.
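A sketch of that combination, assuming Passport (or any auth layer that adds req.isAuthenticated()) is already wired up; the /media prefix and directory are illustrative:

import express from "express";

const app = express();

// Gate: let the request through only if the auth layer says the user is logged in.
function ensureAuthenticated(req: express.Request, res: express.Response, next: express.NextFunction) {
    if ((req as any).isAuthenticated?.()) { // method added by Passport
        return next();
    }
    res.redirect("/login");
}

// Everything under /media now requires a login; express.static streams from disk as usual.
app.use("/media", ensureAuthenticated, express.static("/srv/private-media"));

Note that this only gates the subtree behind any login; checking that a user owns a particular file still needs a per-route lookup like the one sketched in the earlier Express question.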

Preventing rogue spiders from indexing a directory

We have a secure website (developed in .NET 2.0/C#, running on Windows Server and IIS 5) to which members have to log in before they can view some PDF files stored in a virtual directory. To prevent spiders from crawling this website, we have a robots.txt that disallows all user agents. However, this will NOT prevent rogue spiders from indexing the PDF files, since they disregard robots.txt directives. Since the documents are meant to be secure, I do not want ANY spiders getting into this virtual directory (not even the good ones).
I have read a few articles on the web and wonder how programmers (rather than webmasters) have solved this problem in their applications, since it seems like a very common one. There are many options on the web, but I am looking for something easy and elegant.
Some options I have seen, which seem weak, listed with their cons:
Creating a honeypot/tarpit that lets rogue spiders get in and then blacklists their IP address. Cons: this can also block valid users coming from the same IP; the list has to be maintained manually, or members need some way to remove themselves from it. We don't have a range of IPs that valid members will use, since the website is on the internet.
Request-header analysis: however, rogue spiders use real agent names, so this is pointless.
Meta robots tag. Cons: only obeyed by Google and other well-behaved spiders.
There was some talk about using .htaccess, which is supposed to be good, but that only works with Apache, not IIS.
Any suggestions are very much appreciated.
EDIT: as 9000 pointed out below, rogue spiders should not be able to get into a page that requires a login. I guess the question is really 'how to prevent someone who knows the link from requesting the PDF file without logging into the website'.
I see a contradiction between
members have to log in and then they can view some PDF files stored in a virtual directory
and
this will NOT prevent rogue spiders from indexing the PDF files
How come any unauthorized HTTP request to this directory ever gets served with anything other than code 401? The rogue spiders certainly can't provide an authorization cookie. And if the directory is accessible to them, what is 'member login' for?
You probably need to serve the PDF files via a script that checks authorization. I think IIS is capable of requiring authorization just for directory access, too (but I don't know for sure).
I assume that your links to the PDFs come from a known location. You can check Request.UrlReferrer to make sure users are coming from this internal/known page to access the PDFs.
I would definitely force downloads to go through a script where you can check that the user is in fact logged in to the site before allowing the download.
protected void getFile(string fileName)
{
    // CHECK AUTH / REFERER HERE

    // Path.GetFileName drops any directory part, blocking "../" path traversal.
    string safeName = System.IO.Path.GetFileName(fileName);
    string filePath = Request.PhysicalApplicationPath + "hidden_PDF_directory/" + safeName;
    System.IO.FileInfo fileInfo = new System.IO.FileInfo(filePath);

    if (fileInfo.Exists)
    {
        Response.Clear();
        Response.AddHeader("Content-Disposition", "attachment; filename=" + fileInfo.Name);
        Response.AddHeader("Content-Length", fileInfo.Length.ToString());
        Response.ContentType = "application/pdf";
        Response.WriteFile(fileInfo.FullName);
        Response.End();
    }
    else
    {
        // ERROR: file not found -- return a 404 or an error page here.
    }
}
Untested, but this should give you an idea at least.
I'd also stay away from robots.txt since people will often use this to actually look for things you think you're hiding.
Here is what I did (expanding on Leigh's code).
I created an HttpHandler for PDF files, created a web.config in the secure directory, and configured the handler to handle PDFs.
In the handler, I check whether the user is logged in using a session variable set by the application.
If the user has the session variable, I create a FileInfo object and send it in the response. Note: don't call 'context.Response.End()'; also, the 'Content-Disposition' header turned out to be unnecessary in my case.
So now, whenever there is a request for a PDF in the secure directory, the HTTP handler gets the request and checks whether the user is logged in. If not, it displays an error message; otherwise it serves the file.
I'm not sure whether there is a performance hit, since I am creating FileInfo objects and sending those rather than sending the file directly. The thing is, you can't Server.Transfer or Response.Redirect to the *.pdf file, since that creates an infinite loop and the response will never get returned to the user.

Displaying a PDF to the user

We're providing a web form through which users fill in their personal information, some of it sensitive (SSN, birthday, etc.). Upon submission, the data is prefilled into a PDF, which is then made available via a link.
We are creating the PDF in a folder on the website that we have write access to.
How can we safely create and add PDFs to this folder, with whatever naming scheme (use a GUID?), such that another user cannot guess/spoof a PDF file's location, type it into the URL, and access another person's PDF?
Maybe the PDF folder should have rights specific to each user, but how that is accomplished may be a different question. (The number of users is unknown, as this will be open to the public.)
Any thoughts on this? In a nutshell, we need to allow the user to view a PDF of the data they just entered while preventing more savvy users from figuring out the locations of other PDF files and accessing them.
Thanks!
Trying to obfuscate the path to a file isn't really making it secure. I would find a way to email it, or some other way to fetch it for the user, instead of allowing access to an open directory.
Make the web app fetch the file for the user instead of relying on the web server's open-folder permissions.
Just keep in mind that obfuscation isn't really security.
If it's really just for the moment, create a file with a completely random name (20384058532045850.pdf) in a temporary directory, serve it to the user immediately, and remove it after a certain period of time.
Whether your web app has write rights on that directory or not (I assume you are talking about chmod-style user rights) is not important; that can't be breached through the web server, and I don't see a problem in revealing the directory path per se: you have to reveal something when giving the user a URL to download. If your PDF names are random enough, there is practically no risk of somebody guessing the name of another PDF file in the same directory.
As the PDF contains sensitive data: don't forget to turn off caching, to prevent a local copy of the PDF being saved in the client's browser cache.
I don't know for sure whether turning off caching through the appropriate headers is enough to prevent local caching in all browsers. You might have to look into that.
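A sketch of both points, name generation and cache suppression, in TypeScript/Express (the thread is ASP.NET, but the headers are plain HTTP; the /srv/tmp-pdfs directory and route are assumptions):

import crypto from "crypto";
import express from "express";
import path from "path";

const app = express();
const TMP_DIR = "/srv/tmp-pdfs"; // assumption: scratch directory outside the web root

// 16 random bytes -> 32 hex characters: practically unguessable, unlike a timestamp or counter.
function randomPdfName(): string {
    return crypto.randomBytes(16).toString("hex") + ".pdf";
}

app.get("/pdf/:name", (req, res) => {
    const filePath = path.join(TMP_DIR, path.basename(req.params.name));
    // Sensitive content: tell browsers and proxies not to keep a copy.
    res.set("Cache-Control", "no-store, no-cache, must-revalidate");
    res.set("Pragma", "no-cache"); // for legacy HTTP/1.0 clients
    res.sendFile(filePath, (err) => {
        if (err) res.sendStatus(404);
    });
});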
For the purpose of PDFs, would it not be better (I know I will get flamed for this) to store the actual PDF in the database as a BLOB, on the back end of the website in question?
There would be no reference to a URL anywhere, nor would a specific path be exposed in any links on that form.
Hope this helps,
Best regards,
Tom.
The simplest way is to proxy the file through your application (fpassthru() in PHP, for example); this allows you to use whatever access-control/identification system you already use for the dynamic content.
If you don't have any means of identifying your users and restricting access, and assuming your platform has a secure session mechanism, you can protect the file by storing the filename in the user's session and then returning that file (and only that file) to the user when requested. An attacker would then have to spoof a session to access the file, so this is as secure as your session mechanism.
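The answer mentions PHP's fpassthru(); here is the session-bound variant sketched in TypeScript/Express instead (express-session and the pdfFile key are assumptions, not a fixed API):

import express from "express";
import session from "express-session";
import path from "path";

const app = express();
app.use(session({ secret: "change-me", resave: false, saveUninitialized: false }));

const TMP_DIR = "/srv/tmp-pdfs"; // assumption: wherever the generated files are written

// When the PDF is generated for this visitor, record its name in the session:
//     (req.session as any).pdfFile = generatedName;

app.get("/my-document.pdf", (req, res) => {
    const name = (req.session as any).pdfFile; // the one file recorded for this session
    if (!name) {
        return res.sendStatus(403); // no file belongs to this session
    }
    res.sendFile(path.join(TMP_DIR, path.basename(name)));
});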
