Why does the docutils/Sphinx "include" directive represent a potential security hole?

The source of the claim is in the Docutils documentation, at https://docutils.sourceforge.io/docs/ref/rst/directives.html#include:
Warning
The "include" directive represents a potential security hole. It can be disabled with the "file_insertion_enabled" runtime setting.
What exactly is it that I should be concerned about, and if it's a potential security hole why hasn't it been removed?

"If it's a potential security hole why hasn't it been removed?" By the same logic, any programming language can be used to write malware, so why do programming languages still exist? Security discussions need the right attitude: often you have to accept some risks and build fences against them.
To understand the security risks, you need to build up a specific context, such as generating a web site from Sphinx files.
If you choose another context such as generating PDF files, the risks can be different.
First, in .rst files people can write bad JavaScript code in a raw directive:

.. raw:: html

   <script>
   function badCode() {
     alert("I am bad code");
   }
   badCode();
   </script>

Second, the include directive allows you to include files you might not fully control (for example, a file from external storage, or simply one outside your working directory).
So, if you didn't pay enough attention to what you include and someone really hacks the contents included in your Sphinx files, then the final generated sites can contain malicious code and harm the end users who view those pages.
To minimize such risks, you have a few options:
Docutils (and therefore Sphinx) allows you to disable include completely with the file_insertion_enabled setting, as that documentation page says, but you lose this useful feature.
You can include only content that you trust (from the same Git repository, for example), because a good enough code review can reduce the risks there.
If you still want to include external content, set up procedures to actively verify the actual content before generating the web site.
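
One way to support the "include only what you trust" option is a small pre-build check. The following is a minimal sketch, not part of Docutils or Sphinx: the function name untrusted_includes and the idea of a single trusted root directory are assumptions made for illustration. It only looks at plain ".. include:: path" lines and resolves paths relative to the including file, which is a simplification of how Docutils and Sphinx actually resolve them.

```python
import re
from pathlib import Path

# Matches a line of the form ".. include:: some/path" at any indentation.
INCLUDE_RE = re.compile(
    r"^[ \t]*\.\.[ \t]+include::[ \t]*(\S.*?)[ \t]*$", re.MULTILINE
)


def untrusted_includes(rst_text, source_file, trusted_root):
    """Return include targets that resolve outside trusted_root.

    Hypothetical helper: paths are resolved relative to the directory of
    source_file, and absolute paths are flagged unless they lie inside
    trusted_root.
    """
    root = Path(trusted_root).resolve()
    base = Path(source_file).resolve().parent
    flagged = []
    for target in INCLUDE_RE.findall(rst_text):
        resolved = (base / target).resolve()
        if not resolved.is_relative_to(root):  # Python 3.9+
            flagged.append(target)
    return flagged


sample = (
    "Intro\n\n"
    ".. include:: ../../etc/secrets.txt\n\n"
    ".. include:: parts/intro.rst\n"
)
print(untrusted_includes(sample, "/project/docs/index.rst", "/project/docs"))
```

A check like this could run in CI before the site is generated. It complements code review rather than replacing it, and it does nothing about the raw directive.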

Related

Could there be malware in Python/Javascript files that I purchased?

I recently purchased a django-react code off fiverr. I don't know too much about web development.
This may be just paranoia, but is it possible that there could be malicious code in these kinds of files, and that if I run the server something could happen to the PC I run it on?

Short answer: is it possible for HTML, PY, JS etc. files to contain malicious content? Yes. If you run this server on a PC, can any malicious content do bad things to the PC it is being run on? Yes.
Ok, so that is the scary side of this done. Let's consider the question a little more objectively. Let's think about how these files can contain malicious content, and more importantly what can you do about it.
The author deliberately wrote malicious code into the files
Of course this is possible, but in my opinion unlikely. People producing malware are looking for a return on the investment of their time. Writing a solution to your request on fiverr and including malicious content is a huge investment for minimal return.
Also, please bear in mind that any contractor / freelancer is building their career on trust. If they get caught writing malicious code for customers then their reputation will be impacted. There is a great book, Who Can You Trust?, which goes into the details of trust on platforms for sharing goods and services.
If you do want to check for issues in the code, then I would use a static code analyser (e.g. Fortify) and a penetration test.
The author has included an open source module that has malicious content
This in my opinion is more likely. There have been cases in the past where modules published via sharing mechanisms such as npm contained malicious content. Here are a couple of examples:
Malicious NPM package
Bitcoin Stealer
The good news here is that it is quite easy for you to check for known issues. For example, as you have JS files I assume there is a dependency on npm, so you can use npm audit to check the dependencies for security advisories.
In summary, in your position I would start by ensuring that the dependencies used by the code don't have any significant security advisories. Then, if the system is critical enough, I would use a static code analysis tool (e.g. Fortify) to check the custom code. Finally, for a public-facing system it is always good to get a proper penetration test done.
The key point here is to think about the risk profile of the system and then decide what investment you need to make to ensure its security. Is this a customer-facing system taking sensitive information (e.g. bank / credit card details), or an internal intranet system? The first will require more stringent security checks to ensure that you keep your customers safe.

Mitigating security risks of javascript objects using require.js

I'm a little paranoid about storing sensitive information in global variables on the browser; who wouldn't be. Enter AMD! My question is, can we confidently use require.js to completely isolate variables, to help mitigate unwanted manipulation of variables from the console? Has anyone found a backdoor, or maybe a better way to put it is, has anyone witnessed any security issues with the require.js library?
Thanks!

No, you can't. Even if you don't have any global variables, the user can still go through your source code and add breakpoints; when the code reaches a breakpoint, they can manipulate all the variables that are accessible in the current scope.
Take a look at this gamedev question, which has some advice on how to make it harder (but not impossible) for users to cheat your code.

Yeah, the attacker can always view the source.
But if you size and shape the payload, minifying and modularizing parts/regions of the client and serving them on demand in accordance with use-case narratives, you effectively add a layer of security that rests on the assumption of human play.
A bot cannot simply traverse directories on a server, but instead must (via JavaScript) navigate the application intelligently, only getting code at a uniquely specified point in the app. It must know when certain payloads are essential to the use-case (say offering up credit card info N screens into a process).
Moreover, client code can be obfuscated w/r/t IP address or along continuous, periodic release cycles.

What are the common website vulnerabilities, and the programming languages related to them?

As far as I know, I must be careful with PHP, and I think JavaScript. What else?

Security vulnerabilities are (mostly) independent of the language involved (except for memory issues).
Instead, you should focus on tasks with potential vulnerabilities, such as processing user input or handling sensitive data.
Some things to watch out for:
Always use parameters in SQL
Always escape correctly (when generating HTML, JSON, Javascript strings, or anything else)
Be extremely careful when executing code dynamically (eg, eval, automatic updates, etc)
Always validate user input on the server
You should also read articles about security, such as the Top 25 Most Dangerous Programming Errors.
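
The first two items in the list above can be illustrated with Python's standard library. This is a minimal sketch under stated assumptions: the users table, the find_user and render_greeting helpers, and the in-memory SQLite database are all made up for the example and are not from the original answer.

```python
import html
import sqlite3


def find_user(conn, name):
    """Look up a user with a bound parameter instead of string formatting."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


def render_greeting(name):
    """Escape user-supplied text before placing it inside HTML."""
    return "<p>Hello, {}!</p>".format(html.escape(name))


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# The injection attempt is treated as a literal name and matches nothing.
print(find_user(conn, "alice' OR '1'='1"))
# The markup is rendered as inert text rather than as a script element.
print(render_greeting("<script>alert(1)</script>"))
```

Note that html.escape is only appropriate for HTML text and attribute contexts; JSON or JavaScript string contexts need their own encoders, as the list says.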

OWASP publishes a periodically updated report describing the top ten web application security flaws (see the link below for a description of the project and the most recent report). As SLaks wrote, many vulnerabilities are independent of the language. Web applications need to be designed with security in mind.
http://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project

Security issues in accepting image uploads

What are the major security issues to consider when accepting image uploads, beyond the normal stuff for all HTTP uploads?
I'm accepting image uploads, and then showing those images to other users.
How should I verify, for example, that the uploaded image is actually a valid image file? Are there any known vulnerabilities in viewers that are exploitable by malformed image files, such that I should be concerned about accidentally passing along exploits? (Quickly googling seems to show that there once was one in IE5/6.)
Should I strip all image metadata to help users prevent unintentional information disclosures? Or are there some things that are safe and necessary or useful to allow?
Are there any arcane features of common image formats that could be security vulnerabilities?
Are there any libraries that deal with these issues? (And/or with other issues like converting progressive JPEGs to normal JPEGs, downsampling to standardize sizes, optimizing PNGs, etc.)

Some things I learned recently from a web security video:
The nuclear option is to serve all uploaded content from a separate domain which only serves static content - all features are disabled and nothing important is stored there.
Consider processing images through ImageMagick etc. to strip out funny business.
For an example of what you are up against, look up GIFAR, a technique that puts a GIF and Java JAR in the same file.

The risk of propagating bugs in image decoders isn't exactly your problem, but you can help anyway by following the general practice of mapping ".jpg" to your server-side language and processing each image request manually (this way you can do referer checks as well).
You need to be careful of:
People uploading code as images (a .jpg with actual C# code inside)
any invalid extensions (you check for this)
People trying to do path-related attacks on you
The last one is what you'll need to be wary of, if you're dynamically reading in images (as you will be, if you follow my first bit of advice).
So ensure you only open files in the relevant folder and, probably more importantly, lock down the user that does this work, meaning the web server user. Make sure it only has permission to read from the folder you are working in, and other such logical things.
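
To make the path-related point concrete, here is a minimal sketch of confining a requested file name to one folder. The function name safe_image_path and the /srv/uploads directory are hypothetical. It assumes Python 3.9+ for Path.is_relative_to.

```python
from pathlib import Path


def safe_image_path(upload_dir, requested_name):
    """Resolve requested_name inside upload_dir, or raise ValueError.

    Resolving first collapses ".." segments and symlinks, so the containment
    check is made on the real location rather than on the raw string.
    """
    root = Path(upload_dir).resolve()
    candidate = (root / requested_name).resolve()
    if candidate == root or not candidate.is_relative_to(root):
        raise ValueError("path escapes the upload directory")
    return candidate


print(safe_image_path("/srv/uploads", "cat.jpg").name)
try:
    safe_image_path("/srv/uploads", "../../etc/passwd")
except ValueError as exc:
    print("rejected:", exc)
```

This check complements restricted file system permissions for the web server user and does not replace them.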
Stripping metadata? Sure why not, it's quite polite of you, but I wouldn't be nuts about it.

Your biggest risk is that an attacker tries to upload some type of executable code to your server. If the uploaded file is then browsable on the web, the attacker may be able to cause the code to run on your server. Your best protection is to first save the uploaded file to a non-publicly browsable location, try to load it as an image in your programming language and allow it if it can be successfully parsed as an image. A lot of the time people will want to resize the image anyway so really doing this is no extra work. Once the image is validated, you can move it into the publicly browsable area for your web server.
Also, make sure you have a limit on file upload size. Most platforms will have some kind of limit in place by default. You don't want a malicious user filling up your disk with an endless file upload.
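
If your platform does not already enforce a limit, the idea can be sketched as reading the upload in chunks and stopping as soon as a cap is exceeded. The names read_limited and MAX_UPLOAD_BYTES and the 5 MiB value are assumptions for illustration. A real framework usually exposes its own setting for this.

```python
import io

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical 5 MiB limit


def read_limited(stream, limit=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Read a binary stream, raising ValueError once it exceeds limit bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            # Stop reading instead of buffering an endless upload.
            raise ValueError("upload exceeds size limit")
        chunks.append(chunk)
    return b"".join(chunks)


small = io.BytesIO(b"tiny upload")
print(len(read_limited(small, limit=1024)))
```

Because the check happens per chunk, an oversized upload is rejected after at most one extra chunk has been read rather than after the whole body has arrived.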

One of the vulnerabilities I know of is a "WMF backdoor". WMF is "Windows Metafile", a graphics format rendered by the Windows GDI library. See the Wikipedia article on it.
An attacker can execute arbitrary code on the user's machine. This can happen when the user merely views the file in a browser, including, but not limited to, Internet Explorer. The issue was reportedly fixed in 2006.

What security issues appear when users can upload their own files?

I was wondering what security issues appear when the end user of a website can upload files to the server.
For instance if my website allows the users to upload a profile picture, and one user uploads something harmful instead, what could happen? What kind of security should I set up to prevent attacks like this? I'm talking here about images, but what about the case where a user can upload anything into a file-vault kind of application?
It's more a general question than a question about a specific situation, so what are the best practices in that situation? What do you usually do?
I suppose: type validation on upload, different permissions for uploaded files... what else?
EDIT: To clear up the context, I am thinking about a web application where a user can upload any kind of file and then display it in the browser. The file would be stored on the server. The users are whoever uses the website, so there is no trust involved.
I am looking for general answers that could apply for different languages/framework and production environments.

Your first line of defense will be to limit the size of uploaded files, and kill any transfer that is larger than that amount.
File extension validation is probably a good second line of defense. Type validation can be done later... as long as you aren't relying on the (user-supplied) mime-type for said validation.
Why file extension validation? Because that's what most web servers use to identify which files are executable. If your executables aren't locked down to a specific directory (and most likely, they aren't), files with certain extensions will execute anywhere under the site's document root.
File extension checking is best done with a whitelist of the file types you want to accept.
Once you validate the file extension, you can then check to verify that said file is the type its extension claims, either by checking for magic bytes or using the unix file command.
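
The extension whitelist and the magic-byte check can be combined in a few lines. This is a minimal sketch: the MAGIC_BYTES table and the looks_like_allowed_image helper are hypothetical, and the table covers only three formats chosen for illustration.

```python
from pathlib import Path

# Hypothetical whitelist: extension -> accepted leading bytes.
MAGIC_BYTES = {
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".gif": (b"GIF87a", b"GIF89a"),
}


def looks_like_allowed_image(filename, head):
    """Check the extension whitelist, then the file's leading bytes.

    head is the first few bytes of the uploaded file.
    """
    ext = Path(filename).suffix.lower()
    signatures = MAGIC_BYTES.get(ext)
    if signatures is None:
        return False  # extension is not on the whitelist
    return head.startswith(signatures)


print(looks_like_allowed_image("photo.PNG", b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(looks_like_allowed_image("shell.php", b"\x89PNG\r\n\x1a\n"))
print(looks_like_allowed_image("fake.jpg", b"<?php echo 1; ?>"))
```

Matching magic bytes only shows that the file starts like an image. A file can begin with a valid header and still carry other content, so this is a cheap early filter rather than full validation.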
I'm sure there are other concerns that I missed, but hopefully this helps.

Assuming you're dealing with only images, one thing you can do is use an image library to generate thumbnails/consistent image sizes, and throw the original away when you're done. Then you effectively have a single point of vulnerability: your image library. Assuming you keep it up-to-date, you should be fine.
Users won't be able to upload zip files or really any non-image file, because the image library will barf if it tries to resize non-image data, and you can just catch the exception. You'll probably want to do a preliminary check on the filename extension though. No point sending a file through the image library if the filename is "foo.zip".
As for permissions, well... don't set the execute bit. But realistically, permissions won't help protect you much against malicious user input.
If your programming environment allows it, you're going to want to run some of these checks while the upload is in progress. A malicious HTTP client can potentially send a file with an infinite size, i.e., it just never stops transmitting random bytes, resulting in a denial of service attack. Or maybe they just upload a gig of video as their profile picture. Most image file formats have a header at the beginning as well. If a client begins to send a file that doesn't match any known image header, you can abort the transfer. But that's starting to move into the realm of overkill. Unless you're Facebook, that kind of thing is probably unnecessary.
Edit
If you allow users to upload scripts and executables, you should make sure that anything uploaded via that form is never served back as anything other than application/octet-stream. Don't try to mix the Content-Type when you're dealing with potentially dangerous uploads. If you're going to tell users they have to worry about their own security (that's effectively what you do when you accept scripts or executables), then everything should be served as application/octet-stream so that the browser doesn't attempt to render it. You should also probably set the Content-Disposition header. It's probably also wise to involve a virus scanner in the pipeline if you want to deal with executables. ClamAV is scriptable and open source, for example.
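
The header advice above can be sketched as a small helper that a download view would call. The function name download_headers is hypothetical. The X-Content-Type-Options header is an addition not mentioned in the answer, included on the assumption that you also want to discourage browsers from content sniffing.

```python
from urllib.parse import quote


def download_headers(filename):
    """Build headers that make a browser download a file, not render it."""
    # Percent-encode the name for the RFC 6266 "filename*" parameter so that
    # quotes, spaces, or newlines in a user-chosen name cannot break the header.
    safe_name = quote(filename, safe="")
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": "attachment; filename*=UTF-8''" + safe_name,
        # Assumption beyond the original answer: disable MIME sniffing too.
        "X-Content-Type-Options": "nosniff",
    }


for key, value in download_headers("evil page.html").items():
    print(key + ": " + value)
```

Whatever framework you use, the point is that the response headers are decided by your code and never derived from the type the uploader claimed.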

Size validation would be useful too; you wouldn't want someone to intentionally upload a 100 GB fake image just out of spite, now would you :)
Also, you may want to consider something to prevent people from using your bandwidth just as an easy way to host images (I would mostly be concerned with hosting of illegal stuff). Most people would use imageshack for temp image hosting anyway.

For further reading, there's a great article by Acunetix on Why File Upload Forms are a Major Security Threat

With more context, it would be easier to know where the vulnerabilities may lie.
If the data could be stored in a database (sounds like it won't be), then you should guard against SQL Injection attacks.
If the data could be displayed in a browser (sounds like it would be), then you may need to guard against HTML/CSS Injection attacks.
If you're using scripting languages (e.g., PHP) on the server, then you may need to guard against injection attacks against those specific languages. With compiled server code (or a poor scripting implementation), there's the chance of buffer overrun attacks.
Don't overlook user data security, too: Can your users trust you to prevent their data from being compromised?
EDIT: If you really want to cover all bases, consider the risks of JPEG and WMF security holes. These could be exploited if a malicious user can upload the files from one system, and then view the files, or persuade another user to view the files, from another system.

Size of the content
Restricting file types (only whitelisted types such as .jpeg and .png should be allowed)
File tampering (for example, on a site that supports foreign languages and therefore allows certain encodings, an attacker may take advantage of this by encoding a script or other malicious code, appending it to an otherwise legitimate file, and uploading the result)
