Provide and Protect a Read Only PDF on Website

Provide and Protect a Read Only PDF on Website - security

I have a client who wants to provide a PDF document on their website but they want to ensure that the document is protected from being downloaded. I've looked at some js plugins that provide readers, but it seems like a quick look at the source code will reveal the path to the PDF in most cases. Anyone have experience with this or know how it can be done?

If you want to allow people to view the documents but make it hard for them to copy them, you'll need to do the document rendering server-side. Keep track of the user's viewport (size, zoom, coordinates in the document, etc.), render the portion of the document within the viewport as a static image, and send that to the user. Keep in mind that no matter what you do, users can reassemble the document if they put enough effort into it.

The "normal" approach is to not bother about not allowing to download, but to make the documents usable only under certain conditions. This essentially means some kind of DRM (Digital Rights Management) system.
You can for example have the document call home and ask for permission to be opened, or printed. You can also tie the file to the user's hard disk (via signature), etc.
With document security, you (your client) has to decide how much it is worth to you/him. That will also set the budget for a DRM system.

Related

"Sandbox" Google Analytics for security

By including Google Analytics in a website (specifically the Javascript version) isn't it true that you are giving Google complete access to all your cookies and site information? (ie. it could be a security hole).
Can this be mitigated by putting Google in an iFrame that is sandboxed? Or maybe only passing Google the necessary information (ie. browser type, screen resolution, etc)?
How can someone get the most out of Google Analytics without leaving the entire site open?
Or perhaps passing the data through my own server and then uploading it to Google?

You can create a scriptless implementation via the measurement protocol (for Universal Analytics enabled properties). This not only avoids any security issues with the script (although I'd rather trust Google on that), it also means you have more control what data is submitted to the Google Server.

A script run on your site can read cookies on your site, yes. And that data can be sent back to google, yes. That is why you shouldn't store sensitive information in cookies. You shouldn't do this even if you don't use google analytics. Even if you don't use ANY other code except your own. Browsers and browser addons can also read that stuff and you definitely cannot control that. Again, never store sensitive information in cookies.
As far as access to "site information".. javascript can be used to read the content on your pages, know urls of pages, etc.. IOW anything you serve up on a web page. Anything that is not behind a wall (e.g. login barrier) is surely up for grabs. But crawlers will look at that stuff anyway. Stuff behind walls can still be grabbed automatically, depending on what they have to actually do to get past those walls (e.g. simple registration/login barriers are pretty easy to get past).
This is also why you should never display sensitive information even in content of your site. E.g. credit card numbers, passwords, etc.. that's why virtually every site you go to that has even remotely sensitive information always shows a mask (e.g. ** ) instead of actual values.
Google Analytics does not actively do these things, but you're right: there's nothing stopping them from doing it, and you've already given them the right to do it by using their script.
And you are right: the safest way to control what Google can actually see is to send server-side requests to them. And also put all your content behind barriers that cannot be easily crawled or scraped. The strongest barrier being one that involves having to pay for access. People are ingenious about making bots about making crawlers and bots to get past all sorts of forms and "human" checks etc.. and you're fighting a losing battle on that count, but nothing stops a bot faster than requiring someone to give you money to access your stuff. Of course, this also means you'd have to make everybody pay for access...
Anyways.. if you're that paranoid about this stuff, why use GA at all? Use something you host yourself (e.g. Piwik). This won't solve for crawlers/bots, obviously, but it will solve for worries about GA grabbing more than you want it to.

Is DRM right way?

I need to protect content(e.g. different files) which I load from server when user buy it. And protect it from copy. I need to do that on different platforms/devices.
I thought about Digital Right Management implementation. Is it right way?
What can you recommend me?
Thanks

DRM can never be a bulletproof answer.
The reason? They user has to decrypt content to play it, that means that at some point he HAS the decryption key and thus HAS the cleartext content. He just needs to get the content from this moment, and therefore your content will be shareable and not protected anymore by DRM.
That's why DRM are usually a failure and snake oil, it just means slowing down the attacker. So to me, that's not a good way.
Now you can do watermarking:
that is marking the content with the end user identity in a non-removable way (cryptographic, redundant, sneaky) and let know the user about that (look for steganography programs and attribute user a unique id). This will give him incentive not to share the content. He will be able to copy, but then bear the responsibility in case of disclosure that is easily traced back to him.
Add clause in the EULA saying that the user bear costs & responsibility in case of disclosure.

One way to do this is to provide your content and a reader, such that only your reader application can display/use your content. You can then issue licenses for your reader that control how and by whom that reader can be used and what content it can access using a software license management system.

Security issues in accepting image uploads

What are the major security issues to consider when accepting image uploads, beyond the normal stuff for all HTTP uploads?
I'm accepting image uploads, and then showing those images to other users.
How should I verify, for example, that the uploaded image is actually a valid image file? Are there any known vulnerabilities in viewers that are exploitable by malformed image files for which I should be concerned about accidentally passing along exploits? (Quickly googling seems to show that there once was in IE5/6.)
Should I strip all image metadata to help users prevent unintentional information disclosures? Or are there some things that are safe and necessary or useful to allow?
Are there any arcane features of common image formats that could be security vulnerabilities?
Are there any libraries that deal with these issues? (And/or with other issues like converting progressive JPEGs to normal JPEGs, downsampling to standardize sizes, optimizing PNGs, etc.)

Some things I learned recently from a web security video:
The nuclear option is to serve all uploaded content from a separate domain which only serves static content - all features are disabled and nothing important is stored there.
Considering processing images through imagemagick etc. to strip out funny business.
For an example of what you are up against, look up GIFAR, a technique that puts a GIF and Java JAR in the same file.

The risk of propogation of bugs inside image formatters isn't "exactly" your problem, but you can help anyway, by following the general practice of mapping ".jpg" to your executable language, and processing each image manually (in this way you can do refer checks as well).
You need to be careful of:
People uploading code as images (.jpg with actual c# code inside)
any invalid extensions (you check for this)
People trying to do path-related attacks on you
The last one is what you'll need to be wary of, if you're dynamically reading in images (as you will be, if you follow my first bit of advice).
So ensure you only open code in the relevant folder, and, probably more importantly, lock down the user that does this work. I mean the webserver user. Make sure it only has permissions to read from the folder you are working in, and other such logical things.
Stripping metadata? Sure why not, it's quite polite of you, but I wouldn't be nuts about it.

Your biggest risk is that an attacker tries to upload some type of executable code to your server. If the uploaded file is then browsable on the web, the attacker may be able to cause the code to run on your server. Your best protection is to first save the uploaded file to a non-publicly browsable location, try to load it as an image in your programming language and allow it if it can be successfully parsed as an image. A lot of the time people will want to resize the image anyway so really doing this is no extra work. Once the image is validated, you can move it into the publicly browsable area for your web server.
Also, make sure you have a limit on file upload size. Most platforms will have some kind of limit in place by default. You don't want a malicious user filling up your disk with an endless file upload.

One of the vulnerabilities I know of is a "WMF backdoor". WMF is "Windows Metafile"--a graphical format rendered by Windows GDI library. Here's wikipedia article.
The attacker is capable to execute arbitrary code on user's machine. This can happen when user just views the file via the browser, including, but not limited to Internet Explorer. The issue is said to be fixed in 2006.

Offline view of dynamic content?

I want to view dynamic contents (flash games, online transaction...etc) offline.
For example, I finish level 1 of this cool flash RPG game.
I go offline and play the level again.
Or, I make a purchase.
And make the purchase again offline.
Of course this won't do anything. It will be strictly for demonstration purpose.
Or, I watch a video online. Go offline and watch it again.
Is this feasible? Whatever I do through browser, it has to download things.
When it downloads, it stores on disk. Then, when it is in offline mode, it routes all traffic out to local disk.
Sounds simple, but is this really possible?
Or am I missing something?
Let's say someone patched a browser to make offline mode much more powerful.
As a web developer, how can I secure my application from this
patched browser?
Let's say I charging my contents (video, game...etc)
per view/use. With this patched browser, people can pay once
and view/use it over and over again.
They might even make a tarball out of their browser cache
and share with other people online.
So, my questions are:
is this patched browser possible?
if it is possible, how can I defend my content against it?

I'm trying to find the original author of the quote: "Trying to make digital content not copyable is like trying to make water not wet."
In your question you describe several different scenarios as if they were similar. They are not. If you have a specific question, then please ask it so that people can focus on addressing the specific case that concerns you.
Let's talk about video (and audio). Essentially, without controlling the client, you can NOT stop the downloaded video from being cached and re-watched. "Patched" browsers exist. In fact, they're not patched. They don't even need to be. FireFox has any number of plug-ins such as "DownloadHelper" which make all of this possible. YouTube goes to some effort to change their system regularly to break DownloadHelper. But they know they can only slow things down.
The only way to control a video download being re-watched is insist on the user using your completely custom plugin or application. The problem is that (a) that costs you much more money, (b) it's more painful for the user.
The other cases you mention - RPG and online transaction... these are different. Often with an RPG or other game, the client portion includes only a part of the code. Some of the code resides on your server. Without a connection to the server, the game cannot be played. You don't have to write it that way, you could make it 100% client... in which case (e.g. for Flash) the SWF file can be downloaded and played again and again, without your control.
But usually those online flash games are part-server in order to do what you say, and make them playable only online and only via the game-writers site.
An online transaction ALWAYS involves a server component, usually encrypted and non-repeatable. They can be secured.

What security issues appear when users can upload their own files?

I was wondering what security issues appear when the end user of a website can upload files to the server.
For instance if my website allows the users to upload a profile picture, and one user uploads something harmful instead, what could happen? What kind of security should I set up to prevent attacks like this? I'm talking here about images, but what about the case where a user can upload anything into a file-vault kind of application?
It's more a general question than a question about a specific situation, so what are the best practices in that situation? What do you usually do?
I suppose: type validation on upload, different permissions for uploaded files... what else?
EDIT: To clear up the context, I am thinking about a web application where a user can upload any kind of file and then display it in the browser. The file would be stored on the server. The users are whoever uses the website, so there is no trust involved.
I am looking for general answers that could apply for different languages/framework and production environments.

Your first line of defense will be to limit the size of uploaded files, and kill any transfer that is larger than that amount.
File extension validation is probably a good second line of defense. Type validation can be done later... as long as you aren't relying on the (user-supplied) mime-type for said validation.
Why file extension validation? Because that's what most web servers use to identify which files are executable. If your executables aren't locked down to a specific directory (and most likely, they aren't), files with certain extensions will execute anywhere under the site's document root.
File extension checking is best done with a whitelist of the file types you want to accept.
Once you validate the file extension, you can then check to verify that said file is the type its extension claims, either by checking for magic bytes or using the unix file command.
I'm sure there are other concerns that I missed, but hopefully this helps.

Assuming you're dealing with only images, one thing you can do is use an image library to generate thumbnails/consistent image sizes, and throw the original away when you're done. Then you effectively have a single point of vulnerability: your image library. Assuming you keep it up-to-date, you should be fine.
Users won't be able to upload zip files or really any non-image file, because the image library will barf if it tries to resize non-image data, and you can just catch the exception. You'll probably want to do a preliminary check on the filename extension though. No point sending a file through the image library if the filename is "foo.zip".
As for permissions, well... don't set the execute bit. But realistically, permissions won't help protect you much against malicious user input.
If your programming environment allows it, you're going to want to run some of these checks while the upload is in progress. A malicious HTTP client can potentially send a file with an infinite size. IE, it just never stops transmitting random bytes, resulting in a denial of service attack. Or maybe they just upload a gig of video as their profile picture. Most image file formats have a header at the beginning as well. If a client begins to send a file that doesn't match any known image header, you can abort the transfer. But that's starting to move into the realm of overkill. Unless you're Facebook, that kind of thing is probably unnecessary.
Edit
If you allow users to upload scripts and executables, you should make sure that anything uploaded via that form is never served back as anything other than application/octet-stream. Don't try to mix the Content-Type when you're dealing with potentially dangerous uploads. If you're going to tell users they have to worry about their own security (that's effectively what you do when you accept scripts or executables), then everything should be served as application/octet-stream so that the browser doesn't attempt to render it. You should also probably set the Content-Disposition header. It's probably also wise to involve a virus scanner in the pipeline if you want to deal with executables. ClamAV is scriptable and open source, for example.

size validation would be useful too, wouldn't want someone to intentionally upload a 100gb fake image just out of spite now would you :)
Also, you may want to consider something to prevent people from using your bandwidth just for a easy way to host images (I would mostly be concerned with hosting of illegal stuff). Most people would use imageshack for temp image hosting anyway.

For further reading, there's a great article by Acunetix on Why File Upload Forms are a Major Security Threat

With more context, it would be easier to know where the vulberabilities may lie.
If the data could be stored in a database (sounds like it won't be), then you should guard against SQL Injection attacks.
If the data could be displayed in a browser (sounds like it would be), then you may need to guard against HTML/CSS Injection attacks.
If you're using scripting languages (e.g., PHP) on the server, then you may need to guard against injection attacks against those specific languages. With compiled server code (or a poor scripting implementation), there's the chance of buffer overrun attacks.
Don't overlook user data security, too: Can your users trust you to prevent their data from being compromised?
EDIT: If you really want to cover all bases, consider the risks of JPEG and WMF security holes. These could be exploited if a malicious user can upload the files from one system, and then views the files -- or persuades another user to view the files -- from another system.

Size of the content
Restricting certain file types (.jpeg, .png etc., white-listed file types should only be allowed)
file tampering (for ex: a site supporting foreign languages, certain encoding is allowed. the hacker may take advantage of this and adds any script/malicious code encoded and appends to the original file and tries to upload)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string