How to prevent external users from viewing document files - security

I've built an online system that allows users to download PDF files using ColdFusion. Users have to log in before they can download the files (PDF & Microsoft Office documents). (This application is only for our company staff.)
However, I found out today that anyone with internet access can view the files. With just a few keywords such as 'Medical Form myCompanyName' in a Google search, they can view the PDF files in the browser.
How can I prevent this?
UPDATE
Here is my problem in more detail. I've created a folder for all of the PDF files. Each file is called using an ID from the database. If, say, a user wants to view the Medical Form, the link is: http://myApplication.myCompanyName/forms.cfm?Department=Account&filesID=001.
If the user copies this URL and logs out of the system, he/she will not be able to view the file (the login page is displayed instead).
However, other internet users can still view the PDF files just by searching for them on the net: the search engine gives a link that points directly at the folder itself, without requiring a login.
Example:
The Medical Form's PDF file is stored in a folder named Document. When an internet user searches for Medical Form, the search engine links straight to: http://myApplication.myCompanyName/Document/Medical%20Form.pdf
We have lots of PDF files in this folder, and most of them are confidential, for internal viewing only. In PHP we can block this using .htaccess. I'd like to know if there's anything like this for ColdFusion?

You can send files through code with a single line, like this:
<cfif isAuthorized>
<cfcontent file="/path/to/files/outside/of/web/root/Form.pdf" type="application/pdf" reset="true" />
</cfif>
ColdFusion FTW, right?
Please note that handling large files (say, 100MB+) may cause problems, because files used to be pushed into RAM before sending. It looks like this is no longer the case, as Mike's answer below explains.
Another option is to use a content type like x-application if you want to force a download.
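For example, here is a minimal sketch reusing the placeholder path above. A Content-Disposition header is the standard way to force the save dialog, paired with a generic binary type so the browser doesn't try to render the file (isAuthorized stands in for your own session check):
<cfif isAuthorized>
    <!--- Suggest a filename and force a download dialog instead of inline rendering --->
    <cfheader name="Content-Disposition" value="attachment; filename=Form.pdf" />
    <cfcontent file="/path/to/files/outside/of/web/root/Form.pdf" type="application/octet-stream" reset="true" />
</cfif>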
UPD
You want to put this code into a file (let's say file.cfm) and use that for your PDF links. Something like this:
<a href="file.cfm?filename=Xyz.pdf">Download file Xyz.pdf</a>
file.cfm:
<!--- with trailing slash --->
<cfset basePath = "/path/to/files/outside/of/web/root/" />
<cfif isAuthorized AND StructKeyExists(url, "filename")
      AND FileExists(basePath & url.filename)
      AND GetDirectoryFromPath(basePath & url.filename) EQ basePath>
    <cfcontent file="#basePath##url.filename#" type="application/pdf" reset="true" />
<cfelse>
    <cfoutput>File not found, or you are not authorized to see it</cfoutput>
</cfif>
UPD2
Added GetDirectoryFromPath(basePath & url.filename) EQ basePath as an easy and quick protection against the security issue mentioned: without it, a crafted filename containing ../ could pull files from outside the base directory.
Personally, I usually use the ID/database approach, though this answer was initially intended as simple guidance, not a really comprehensive solution.

You need to store your PDFs outside of your web root.
So let's say the base of your web app is
/website/www
All http (web) requests are served from there.
/website/pdf
could be the path where all PDFs are stored. This path isn't accessible via a URL, as it's not served by your web server.
Then in www
you have something like
downloadpdf.cfm?file=NameOfPDF.pdf
which does your checks to ensure it's an appropriate user and, if so, serves the document:
<cfcontent type="application/pdf" file="/website/pdf/#url.file#" />
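One caveat: url.file goes straight into the filesystem path here, so a crafted value (e.g. one containing ../) could escape the PDF folder. A minimal guard, borrowing the GetDirectoryFromPath() check from the answer above (isAuthorized again stands in for your own session check):
<cfset pdfPath = "/website/pdf/" />
<cfif isAuthorized AND StructKeyExists(url, "file")
      AND FileExists(pdfPath & url.file)
      AND GetDirectoryFromPath(pdfPath & url.file) EQ pdfPath>
    <!--- The resolved directory still equals the PDF folder, so no traversal occurred --->
    <cfcontent type="application/pdf" file="#pdfPath##url.file#" />
<cfelse>
    <cfoutput>File not found, or you are not authorized to see it</cfoutput>
</cfif>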

Using cfcontent pre-CF8 is a really bad idea, as it loads the entire file into memory before transmission. CF8 and later will actually stream from disk, which resolves the memory issue. However, if you have large files, users on slow connections, and/or heavy download volume, you still have to worry about thread starvation: each download with cfcontent ties up a thread for the duration of the download.
Depending on your web server, you might be able to route around this by using an x-sendfile extension. This lets you send an HTTP header with the path to a file outside of your web root and have your web server handle sending the file, freeing up CF to do further work.
Here's an article by Ben Nadel about using mod_xsendfile on Apache: http://www.bennadel.com/blog/2170-Streaming-Secure-Files-Efficiently-With-ColdFusion-And-MOD-XSendFile.htm and here's an equivalent IIS7 X-Sendfile plugin: https://github.com/stakach/IIS-X-Sendfile-plugin
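With one of those modules installed and configured for your site, the ColdFusion side reduces to emitting a header. A rough sketch (the path and the isAuthorized check are placeholders):
<cfif isAuthorized>
    <!--- The web server intercepts this header, discards the CF response body,
          and streams the file itself, freeing the ColdFusion thread immediately --->
    <cfheader name="X-Sendfile" value="/website/pdf/Form.pdf" />
    <cfcontent type="application/pdf" reset="true" />
</cfif>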

You might check out the snippet of code for CFWheels' sendFile() helper: http://cfwheels.org/docs/1-1/function/sendfile and https://gist.github.com/1528113

Related

Prevent Cross Site Scripting but still support HTML file upload

I have a web application where users can upload and view files. The user has a link next to each file he or she has uploaded. Clicking the link will open the file in the browser (if possible) or show the browser's download dialog. That is, if the user uploads an HTML/PDF/TXT file it will be rendered in the browser, but if it is a Word document it will be downloaded.
It has been identified that rendering the HTML file in the browser could be a vulnerability: Cross-Site Scripting (XSS).
What is the right solution to this problem? The two options I am currently looking at are:
to put a Content-Disposition header in the response so HTML files are downloaded instead of viewed in the browser, or
to find some HTML scrubbing/sanitizing library to remove any JavaScript from the file before I serve it.
Looking at Gmail, they take the second approach (scrubbing), with a separate domain for file downloads, maybe to minimize the attack surface. However, with this approach the receiver gets a different file than what was sent, which is not 'right' in my opinion; maybe I am biased. In my case the first option is easy to implement, but I wonder whether it is enough, or whether there is anything I have overlooked.
What are your thoughts on these approaches? Or do you have any other suggestions?
Based on your description, I can see 3 possible attack types (maybe there are more):
Client side code execution
As you said, your web server may serve a file as HTML and run JavaScript code on the client. This can be avoided with Content-Disposition, but I would go with MIME-type control through Content-Type. I would define my known file types (e.g. pdf, jpeg, etc.) and serve them with their respective MIME types (application/pdf, image/jpeg, etc.). Anything else I would serve as application/octet-stream.
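Sticking with CFML as elsewhere on this page, here is a sketch of that whitelist idea. The extension map and basePath are illustrative, and url.filename would still need the path-traversal guard shown in the first question:
<cfset basePath = "/path/to/uploads/outside/web/root/" />
<cfset mimeTypes = { pdf = "application/pdf", jpg = "image/jpeg", jpeg = "image/jpeg" } />
<cfset ext = LCase(ListLast(url.filename, ".")) />
<cfif StructKeyExists(mimeTypes, ext)>
    <!--- Known-safe type: serve with its real MIME type; the browser may render it inline --->
    <cfcontent file="#basePath##url.filename#" type="#mimeTypes[ext]#" reset="true" />
<cfelse>
    <!--- Anything else (including HTML): force a download so no markup or script executes --->
    <cfheader name="Content-Disposition" value="attachment; filename=#url.filename#" />
    <cfcontent file="#basePath##url.filename#" type="application/octet-stream" reset="true" />
</cfif>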
Server side code execution
Although I see this as a somewhat off-topic attack (since it involves other parts of your application and your server), be sure to avoid executing files on the server (e.g. PHP code through LFI). Your web server should not access the files directly (again, e.g. PHP); better to store them somewhere not accessible through a URL and retrieve them on request.
Consider whether you are able to reject some files (e.g. reject .exe uploads) and ask the user to zip them first.
Trust issues
Since the files are under the same domain, they will be accessible from JavaScript (via ajax or loaded as a script), and other programs (or people) may trust your links. This is also related to the previous point: if you don't need unzipped .exe files, don't allow them. Using another domain may mitigate some trust problems.
Other ideas:
Zip all files uploaded
Scan each file with antivirus software
PS: For me sanitization would not work in your case. The risk of missing something is too high.

Is it possible to have a link to raw content of file in Azure DevOps

It's possible to generate a link to the raw content of a file in GitHub; is it possible with VSTS/DevOps?
Even after reading the existing answers, I still struggled with this, so I wanted to leave a more thorough response.
As others have said, the pattern is (query string split onto separate lines for ease of reading):
https://dev.azure.com/{{organization}}/{{project}}/_apis/sourceProviders/{{providerName}}/filecontents
?repository={{repository}}
&path={{path}}
&commitOrBranch={{commitOrBranch}}
&api-version=5.0-preview.1
But how do you find the values for these variables? If you go into your Azure DevOps, choose Repos > Files from the left navigation, and select a particular file, your current URL should look something like this:
https://dev.azure.com/{{organization}}/{{project}}/_git/{{repository}}?path=%2Fpackage.json
You should use those values for organization, project, and repository. For path, you'll see a URL-encoded version of the Unix file path: %2F is the encoding for /, so that path is actually just /package.json (a tool like Postman will do the encoding for you).
Commit or branch is pretty self-explanatory; you either know what you want for this value, or you should use master. I have hard-coded the API version in the above URL because that's what the documentation currently points to.
For the last variable, you need providerName. In short, you should probably use TfsGit. I got this value from looking through the list of source providers and looking for one with a value of true for supportedCapabilities.queryFileContents.
However, if you just request this URL you'll get a "203 Non-Authoritative Information" response back because you still need to authenticate yourself. Referring again to the same documentation, it says to use Basic auth with any value for the username and a personal access token for the password. You can create a personal access token at https://dev.azure.com/{{organization}}/_usersSettings/tokens; ensure that it has the Token Administration - Read & Manage permission.
If you're unfamiliar with this sort of thing, again Postman is super helpful with getting these requests working before you get into the code.
So if you have a repository with a src directory at the root, and you're trying to get the file contents of src/package.json, your URL should look something like:
https://dev.azure.com/{{organization}}/{{project}}/_apis/sourceProviders/TfsGit/filecontents?repository={{repository}}&commitOrBranch=master&api-version={{api-version}}&path=src%2Fpackage.json
And don't forget the basic auth!
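And since the rest of this page is about ColdFusion, here is a sketch of the same request using cfhttp. The organization, project, repository, path, and token are all placeholders; cfhttp's username/password attributes produce the Basic auth header described above:
<cfset myPersonalAccessToken = "..." /><!--- your PAT goes here --->
<cfhttp url="https://dev.azure.com/myOrg/myProject/_apis/sourceProviders/TfsGit/filecontents"
        method="get"
        username="anything"
        password="#myPersonalAccessToken#"
        result="resp">
    <!--- Query-string parameters; cfhttpparam handles URL-encoding the path --->
    <cfhttpparam type="url" name="repository" value="myRepo" />
    <cfhttpparam type="url" name="path" value="/src/package.json" />
    <cfhttpparam type="url" name="commitOrBranch" value="master" />
    <cfhttpparam type="url" name="api-version" value="5.0-preview.1" />
</cfhttp>
<!--- resp.fileContent now holds the raw file body --->
<cfoutput>#resp.fileContent#</cfoutput>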
Sure, here's the REST call needed:
GET https://feeds.dev.azure.com/{organization}/_apis/packaging/Feeds/{feedId}/packages/{packageId}?includeAllVersions={includeAllVersions}&includeUrls={includeUrls}&isListed={isListed}&isRelease={isRelease}&includeDeleted={includeDeleted}&includeDescription={includeDescription}&api-version=5.0-preview.1
https://learn.microsoft.com/en-us/rest/api/azure/devops/artifacts/artifact%20%20details/get%20package?view=azure-devops-rest-5.0#package
I was able to get the raw contents of a file using this URL.
GET https://dev.azure.com/{organization}/{project}/_apis/sourceProviders/{providerName}/filecontents?serviceEndpointId={serviceEndpointId}&repository={repository}&commitOrBranch={commitOrBranch}&path={path}&api-version=5.0-preview.1
I got this from here: https://learn.microsoft.com/en-us/rest/api/azure/devops/build/source%20providers/get%20file%20contents?view=azure-devops-rest-5.0
You can obtain the raw URL using Chrome:
Turn on Developer Tools and view the Network tab.
Navigate to the required file in the DevOps portal (Content panel). Once the content view is visible, check the Network tab again and find the request whose URL starts with "Items?Path"; its JSON response contains the required "url" element.
Alternatively, drag the filename from the attachments window and drop it into any other MS application to get the raw URL or linked filename.
Most answers address this well, but in the context of a public repo with anonymous access the API is different. Here is the one that works in such a scenario:
https://dev.azure.com/{{your_user_name}}/{{project_name}}/_apis/git/repositories/{{repo_name_encoded}}/items?scopePath={{path_to_your_file}}&api-version=6.0
This is the exact equivalent of the "raw" URL provided by GitHub.
Another way that may be helpful if you want to quickly get the raw URL for a specific file that you are browsing:
install the browser extension named "Undisposition"
from the dot menu (top right) choose "Download": the file will open in a new browser tab from which you can copy the URL
(edit: unfortunately this will only work for file types that the browser knows how to open, otherwise it will still offer to download it...)
I am fairly new to this and had an issue accessing a raw file in an Azure DevOps repo; it's straightforward in GitHub.
I wanted to download a file in CMD and BASH using Curl.
First I browsed to the file contents in the browser and made a note of the bold sections:
https://dev.azure.com/**myOrg**/_git/**myProjectName**?path=%2F**MyFileName.ps1**
I then constructed the URL similar to what @Zach posted above:
https://dev.azure.com/**myOrg**/**myProjectName**/_apis/sourceProviders/TfsGit/filecontents?repository=**myProjectName**&commitOrBranch=**master**&api-version=5.0-preview.1&path=%2F**MyFileName.ps1**
Now when I paste that URL into the browser, it displays the content in raw form, similar to GitHub.
The difference was that I had to set up a PAT (Personal Access Token) in my Azure DevOps account and then authenticate with it in CMD/Bash; example below:
curl -u "<username>:<password>" "https://dev.azure.com/myOrg/myProjectName/_apis/sourceProviders/TfsGit/filecontents?repository=myProjectName&commitOrBranch=master&api-version=5.0-preview.1&path=%2FMyFileName.ps1" -# -L -o MyFileName.ps1

Export report to Excel

I want to export a table to an Excel file. I need to export a report.
DECLARE
    myblob BLOB;
BEGIN
    ORA_EXCEL.new_document;
    ORA_EXCEL.add_sheet('Sheet name');
    ORA_EXCEL.query_to_sheet('select * from mytable');
    ORA_EXCEL.save_to_blob(myblob);
END;
I saved my table to a BLOB.
How do I export it / send it to the user (client)?
I need something that is simple to allow a user to be able to download an Excel file to their own computer. I tried doing this procedure in an Oracle workflow:
ORA_EXCEL.save_to_file('EXPORT_DIR', 'example.xlsx');
But this did not help, because it saves the file to a directory on the server, and I need it delivered to the user's own computer.
The way I have handled similar issues in the past was to work with the systems people to mount a directory from either a web server or file server on the Database server.
Then create a directory object so that the procedure can save to a location that is accessible to the user.
If the files are not sensitive and there is a limited number of users, then a file server makes sense, as it is then just a matter of giving the users access to the file share.
If the files are sensitive, or there is a large or unknown number of users, we used the web server and sent an email with a link enabling the user to download their file. Naturally, there needs to be security built into this to stop people from downloading other users' files.
We didn't just email the files as attachments because:
1) Emails with attachments tend to get blocked.
2) We always advise people not to open attachments in emails. (Yes, I know we advise not to click on links as well, but nothing is perfect.)
Who or what is invoking the production of the document?
If it's done by a desktop application the user is working with, that application can fetch the BLOB, store it in e.g. a temp directory, and call System.Diagnostics.Process.Start("...") to open it with the associated application (see Open file with associated application).
If it's a website, it could stream the BLOB back with the Excel MIME type (see Setting mime type for excel document).
You could also store it in an Oracle DIRECTORY, but that has to be on the server and would need to be a network share to be accessible to clients (which is rarely accepted in a production environment!).
If mail isn't the solution, then maybe FTP is a way to store files on a common share. See the UTL_TCP package; with it, an FTP transfer can be achieved (a bit hard to code, but there are solutions to be found on the web), and I guess professional tools that generate Office documents out of an Oracle DB and distribute them do it like this.
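To make the website option concrete: if the page serving the download happens to be ColdFusion (the topic elsewhere on this page), streaming the BLOB produced by ORA_EXCEL could look roughly like this. The datasource, table, and column names are hypothetical:
<cfquery name="rpt" datasource="myOracleDsn">
    SELECT report_blob
    FROM my_reports
    WHERE id = <cfqueryparam value="#url.id#" cfsqltype="cf_sql_integer" />
</cfquery>
<!--- Suggest a filename and send the binary with the .xlsx MIME type --->
<cfheader name="Content-Disposition" value="attachment; filename=report.xlsx" />
<cfcontent type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" variable="#rpt.report_blob#" />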

CF and PDF in secure environment

I'm using CF9. My problem pertains to an admin application that sets session variables at login to identify the user and user permissions. Depending on the user level, certain pages are allowed for viewing and other pages are not allowed. (I'll refer to this as my 'security framework'. This is wrapped around everything in the root.)
This security framework consists of a cfif statement at the top of the CFM page and a closing cfelse and </cfif> at the bottom of the page. Everything between the opening and closing cfif displays if the user has the required permission level - standard stuff.
Certain users can upload PDF files, no problem here. PDF files are uploaded to a folder outside of the root and then moved and renamed to folders inside the root.
When uploading, the user chooses categories and subcategories etc. and these variables are inserted in a SQL database during the upload process. Therefore, I have filePaths and fileNames, etc. to set up dynamic links on a page for a user to click and load the PDF (password protected) in the browser.
I have the dynamic link pointing to ShowThisPDF.cfm with URL variables filePath=#filePath# and fileName=#fileName#. I've set up ShowThisPDF.cfm with the security framework at the top and bottom of the page, and I am trying to pull the uploaded PDF into this page so that it will display in the browser.
I've tried many ways to do this with cfdocument, cfpdf, cfcontent, etc. When I read the error that this throws, it does look like it is reaching the uploaded file, but I get an "access denied" every time, due to the security framework I suppose.
On a side note, elsewhere in this application I can create a PDF from my CF pages with cfdocument, with the security framework wrapped around the page, and this works perfectly, displaying the PDF in the browser. My problem is in loading an existing PDF into a CFM page that has the security framework, which should allow the PDF to load.
Does anyone have an idea how I can accomplish the above? I hate to bypass my security, and it seems logical to "copy" the uploaded PDF into a CFM page that wraps it in the security framework and then displays it in the browser.
Agree with Dan - I had a similar issue. So I ended up doing HTTPS with a Windows login and also a ColdFusion login to the web application. At the end of the day they need two logins to get into the system; then they can see the PDF files, or whatever else they need.
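For what it's worth, ShowThisPDF.cfm itself can stay very small: inside the security framework's cfif, a cfcontent call (as in the first question on this page) pushes the PDF to the browser. A sketch, where session.isLoggedIn stands in for whatever the framework actually checks, and the URL variables still need validation against path traversal:
<!--- ShowThisPDF.cfm (sketch) --->
<cfif session.isLoggedIn>
    <!--- "inline" (rather than "attachment") asks the browser to display the PDF --->
    <cfheader name="Content-Disposition" value="inline; filename=#url.fileName#" />
    <cfcontent file="#url.filePath##url.fileName#" type="application/pdf" reset="true" />
<cfelse>
    <cfoutput>Access denied</cfoutput>
</cfif>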

Possible for Word to edit documents directly off a web server without SharePoint?

I have a use case that seems pretty simple, but after Googling around I can't find a solution. I have some Word documents on an FTP server and I'd like to be able to create a link that would download them into Word and then allow the saved changes to be sent back to the FTP server.
The problem is that I can only get Word either to open the file from the FTP server as read-only, in which case I can't save the changes back to the server, or to download the file to a temporary location, which isn't automatically saved back to the server. I'm creating my link like this:
<a href="ftp://ftp.example.com/www/uploads/Image/test.doc">Test</a>
Frustratingly, if I go into Word via File | Open and paste the link "ftp://ftp.example.com/www/uploads/Image/test.doc", I can save back to the server. What gives? Is there a solution? From Googling around it seems that SharePoint offers this ability, but that's not practical for us. We're using IE7 and Office 2003.
I believe Microsoft Word can read / write WebDAV - see this question:
Editable Word Document from JSP
Can you set up some kind of proxy that can connect via FTP?
Read this link: http://www.webdavsystem.com/server/documentation/ms_office_read_only (it's actually about WebDAV, but I'd guess it is the same issue for FTP); there is a section on opening web-linked documents in non-read-only mode, which needs some changes on the client side...
HTH
Tim
Solution for IE:
Put a file on ajaxbrowser.com (a WebDAV server for testing) and replace the file's full path in the following code:
var openDocumentsObject = new ActiveXObject("SharePoint.OpenDocuments");
openDocumentsObject.EditDocument('http://ajaxbrowser.com/mydoc.docx');
Another example:
<a href='http://ajaxbrowser.com/mydoc.docx' id='urltarget' target='_blank'>Edit through URI</a>
