Can an Apache-served pure-HTML website be hacked?

Can an Apache-served pure-HTML website be hacked? - security

Assume you are running a pure-HTML website on Apache. Just serving static files, nothing dynamic, nothing fancy.
Also assume all passwords are safe, and no social-hacking (i.e. phishing attacks, etc...)
Can a website of this nature basically be hacked? Can the server become compromised? Are there any examples for this?

Yes, such a server can become compromised. A very common vector, sadly, is FTPing to the server over an insecure wifi connection. Anyone listening closely can pick your password out of the air. (It's fun to be at a tech conference and have your password displayed on a screen for all to see, along with the other fools that sent their credentials in the clear over wifi.)
Another common vector is using a simple password and having it fall to a dictionary attack.

Sure it can be compromised, through security flaws in Apache itself. While it's true that adding more layers (like php, sql, etc) onto the server itself increases the potential for vulnerability, nothing is infallible.
Apache, however, is a very well-known open-source program, and the community does a good job of flushing out bugs like this.
Short answer: any internet-connected device has the potential to be "hacked"

You might be curious to read an article or two about "Securing Apache 2".
It is reasonable to say it is secure, but you may want to take note that Apache does come shipped with some modules already enabled.
Also, the goal of securing Apache should not only to secure the server instance itself, but to sandbox the server in such a way that you would limit the damage any such intrusion could do.
All of this information is of course contingent on the web server being the only exposed component on the box.

Anything "hackable" about the website itself would have to be a hack in Apache itself. If so, you've got bigger problems than just one website.
So, on a practical level, nope, given the "all password files are safe" conditions.

In theory it could be hacked but in theory anything can be hacked. In practice no. Because Apache didn't had any important vulnerabilities for several years.

Related

Client Server Security Architecture

I would like go get my head around how is best to set up a client server architecture where security is of up most importance.
So far I have the following which I hope someone can tell me if its good enough, or it there are other things I need to think about. Or if I have the wrong end of the stick and need to rethink things.
Use SSL certificate on the server to ensure the traffic is secure.
Have a firewall set up between the server and client.
Have a separate sql db server.
Have a separate db for my security model data.
Store my passwords in the database using a secure hashing function such as PBKDF2.
Passwords generated using a salt which is stored in a different db to the passwords.
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
I would really like to know is there any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
I have tried searching for some diagrams which could help me understand but I cannot find any which seem to be appropriate.
Thanks in advance

Hardening your architecture can be a challenging task and sharding your services across multiple servers and over-engineering your architecture for semblance security could prove to be your largest security weakness.
However, a number of questions arise when you come to design your IT infrastructure which can't be answered in a single SO answer (will try to find some good white papers and append them).
There are a few things I would advise which is somewhat opinionated backed up with my own thought around it.
Your Questions
I would really like to know is there any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
Settle for the cloud. You do not need to store things on physical servers anymore unless you have current business processes running core business functions that are already working on local physical machines.
Running physical servers increases your system administration requirements for things such as HDD encryption and physical security requirements which can be misconfigured or completely ignored.
Use SSL certificate on the server to ensure the traffic is secure.
This is normally a no-brainer and I would go with a straight, "Yes"; however you must take into consideration the context. If you are running something such as a blog site or documentation-related website that does not transfer any sensitive information at any point in time through HTTP then why use HTTPS? HTTPS has it's own overhead, it's minimal, but it's still there. That said, if in doubt, enable HTTPS.
Have a firewall set up between the server and client.
That is suggested, you may also want to opt for a service such as CloudFlare WAF, I haven't personally used it though.
Have a separate sql db server.
Yes, however not necessarily for security purposes. Database servers and Web Application servers have different hardware requirements and optimizing both simultaneously is not very feasible. Additionally, having them on separate boxes increases your scalability quite a bit which will be beneficial in the long run.
From a security perspective; it's mostly another illusion of, "If I have two boxes and the attacker compromises one [Web Application Server], he won't have access to the Database server".
At foresight, this might seem to be the case but is rarely so. Compromising the Web Application server is still almost a guaranteed Game Over. I will not go into much detail into this (unless you specifically ask me to) however it's still a good idea to keep both services separate from eachother in their own boxes.
Have a separate db for my security model data.
I'm not sure I understood this, what security model are you referring to exactly? Care to share a diagram or two (maybe an ERD) so we can get a better understanding.
Store my passwords in the database using a secure hashing function such as PBKDF2.
Obvious yes; what I am about to say however is controversial and may be flagged by some people (it's a bit of a hot debate)—I recommend using BCrypt instead of PKBDF2 due to BCrypt being slower to compute (resulting in slower to crack).
See - https://security.stackexchange.com/questions/4781/do-any-security-experts-recommend-bcrypt-for-password-storage
Passwords generated using a salt which is stored in a different db to the passwords.
If you use BCrypt I would not see why this is required (I may be wrong). I go into more detail regarding the whole username and password hashing into more detail in the following StackOverflow answer which I would recommend you to read - Back end password encryption vs hashing
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
This purely depends on your goals, budget and requirements. I would personally go for AWS, however you should read some more on alternative platforms such as Google Cloud Platform before making your decision.
Last Remarks
All of the things you mentioned are important and it's good that you are even considering them (most people just ignore such questions or go with the most popular answer) however there are a few additional things I want to point:
Internal Services - Make sure that no unrequired services and processes are running on server especially in productions. These services will normally be running old versions of their software (since you won't be administering them) that could be used as an entrypoint for your server to be compromised.
Code Securely - This may seem like another no-brainer yet it is still overlooked or not done properly. Investigate what frameworks you are using, how they handle security and whether they are actually secure. As a developer (and not a pen-tester) you should at least use an automated web application scanner (such as Acunetix) to run security tests after each build that is pushed to make sure you haven't introduced any obvious, critical vulnerabilities.
Limit Exposure - Goes somewhat hand-in-hand with my first point. Make sure that services are only exposed to other services that depend on them and nothing else. As a rule of thumb, keep everything entirely closed and open up gradually when strictly required.
My last few points may come off as broad. The intention is to keep a certain philosophy when developing your software and infrastructure rather than a permanent rule to tick on a check-box.
There are probably a few things I have missed out. I will update the answer accordingly over time if need be. :-)

What security risks are posed by using a local server to provide a browser-based gui for a program?

I am building a relatively simple program to gather and sort data input by the user. I would like to use a local server running through a web browser for two reasons:
HTML forms are a simple and effective means for gathering the input I'll need.
I want to be able to run the program off-line and without having to manage the security risks involved with accessing a remote server.
Edit: To clarify, I mean that the application should be accessible only from the local network and not from the Internet.
As I've been seeking out information on the issue, I've encountered one or two remarks suggesting that local servers have their own security risks, but I'm not clear on the nature or severity of those risks.
(In case it is relevant, I will be using SWI-Prolog for handling the data manipulation. I also plan on using the SWI-Prolog HTTP package for the server, but I am willing to reconsider this choice if it turns out to be a bad idea.)
I have two questions:
What security risks does one need to be aware of when using a local server for this purpose? (Note: In my case, the program will likely deal with some very sensitive information, so I don't have room for any laxity on this issue).
How does one go about mitigating these risks? (Or, where I should look to learn how to address this issue?)
I'm very grateful for any and all help!

There are security risks with any solution. You can use tools proven by years and one day be hacked (from my own experience). And you can pay a lot for security solution and never be hacked. So, you need always compare efforts with impact.
Basically, you need protect 4 "doors" in your case:
1. Authorization (password interception or, for example improper, usage of cookies)
2. http protocol
3. Application input
4. Other ways to access your database (not using http, for example, by ssh port with weak password, taking your computer or hard disk etc. In some cases you need properly encrypt the volume)
1 and 4 are not specific for Prolog but 4 is only one which has some specific in a case of local servers.
Protect http protocol level means do not allow requests which can take control over your swi-prolog server. For this purpose I recommend install some reverse-proxy like nginx which can prevent attacks on this level including some type of DoS. So, browser will contact nginx and nginx will redirect request to your server if it is a correct http request. You can use any other server instead of nginx if it has similar features.
You need install proper ssl key and allow ssl (https) in your reverse proxy server. It should be not in your swi-prolog server. Https will encrypt all information and will communicate with swi-prolog by http.
Think about authorization. There are methods which can be broken very easily. You need study this topic, there are lot of information. I think it is most important part.
Application input problem - the famose example is "sql injection". Study examples. All good web frameworks have "entry" procedures to clean all possible injections. Take an existing code and rewrite it with prolog.
Also, test all input fields with very long string, different charsets etc.
You can see, the security is not so easy, but you can select appropriate efforts considering with the impact of hacking.
Also, think about possible attacker. If somebody is very interested particulary to get your information all mentioned methods are good. But it can be a rare case. Most often hackers just scan internet and try apply known hacks to all found servers. In this case your best friend should be Honey-Pots and prolog itself, because the probability of hacker interest to swi-prolog internals is extremely low. (Hacker need to study well the server code to find a door).
So I think you will found adequate methods to protect all sensitive data.
But please, never use passwords with combinations of dictionary words and the same password more then for one purpose, it is the most important rule of security. For the same reason you shouldn't give access for your users to all information, but protection should be on the app level design.
The cases specific to a local server are a good firewall, proper network setup and encription of hard drive partition if your local server can be stolen by "hacker".
But if you mean the application should be accessible only from your local network and not from Internet you need much less efforts, mainly you need check your router/firewall setup and the 4th door in my list.
In a case you have a very limited number of known users you can just propose them to use VPN and not protect your server as in the case of "global" access.

I'd point out that my post was about a security issue with using port forwarding in apache
to access a prolog server.
And I do know of a successful prolog injection DOS attack on a SWI-Prolog http framework based website. I don't believe the website's author wants the details made public, but the possibility is certainly real.
Obviously this attack vector is only possible if the site evaluates Turing complete code (or code which it can't prove will terminate).
A simple security precaution is to check the Request object and reject requests from anything but localhost.
I'd point out that the pldoc server only responds by default on localhost.
- Anne Ogborn

I think SWI_Prolog http package is an excellent choice. Jan Wielemaker put much effort in making it secure and scalable.
I don't think you need to worry about SQL injection, indeed would be strange to rely on SQL when you have Prolog power at your fingers...
Of course, you need to properly manage the http access in your server...
Just this morning there has been an interesting post in SWI-Prolog mailing list, about this topic: Anne Ogborn shares her experience...

Obfuscating server headers

I have a WSGI application running in PythonPaste. I've noticed that the default 'Server' header leaks a fair amount of information ("Server: PasteWSGIServer/0.5 Python/2.6").
My knee jerk reaction is to change it...but I'm curious what others think.
Is there any utility in the server header, or benefit in removing it? Should I feel uncomfortable about giving away information on my infrastructure?
Thanks

Well "Security through Obscurity" is never a best practice; your equipment should be able to maintain integrity against an attacker that has extensive knowledge of your setup (barring passwords, console access, etc). Can't really stop a DDOS or something similar, but you shouldn't have to worry about people finding out you OS version, etc.
Still, no need to give away information for free. Fudging the headers may discourage some attackers, and, in cases like this where you're running an application that may have a known exploit crop up, there are significant benefits in not advertising that you're running it.
I say change it. Internally, you shouldn't see much benefit in leaving it alone, and externally you have a chance of seeing benefits if you change it.

Given the requests I find in my log files (like requests for IIS-specific bugs in Apache logs, and I'm sure IIS server logs will show Apache-specific requests as well), there's many bots out there that don't care about any such header at all. I guess almost everything is brute force nowadays.
(And actually, as for example I've set up quite a few instances of Tomcat sitting behind IIS, I guess I would not take the headers into account either, if I were to try to hack my way into some server.)
And above all: when using free software I kind of find it appropriate to give the makers some credits in statistics.

Masking your version number is a very important security measure. You do not want to give the attacker any information about what software you are running. This security feature is available in the mod_security, the Open Source Web Application Firewall for Apache:
http://www.modsecurity.org/
Add this line to your mod_security configuration file:
SecServerSignature "IIS/6.0"

Website hacking - Why it is always possible to do?

we know that each executable file can be reverse engineered (disassembled, decompiled). No mater how strong security you will implement, anyway if crackers want to, they do crack!!! Just that is a question of time.
What about websites? May we say that website can be completely safe from attacks of hackers (we assume that hosting is not vulnerable)? If no, than what is the reason?

Yes it is always possible to do. There is always a way in.
It's like my grandfather always said:
Locks are meant to keep the honest
people out

May we say that website can be completely safe from attacks of hackers?
No. Even the most secure technology in the world is vulnerable to social engineering attacks, for one thing.

You can easily write a webapp that is mathematically proven to be secure... But that proof will only hold as long as the underlying operating system, interpreter|compiler, and hardware are secure, which is never the case.

The key thing to remember is that websites are usually part of a huge and complex system and it doesn't really matter if the hacker enters the system through the web application itself or some other part of the entire infrastructure. If someone can get access to your servers, routers, DNS or whatever, they can bring down even the best web application. In my experience a lot of systems are vulnerable in some way or another. So "completely secure" means either "we're trying really hard to secure the platform" or "we have no clue whatsoever, but we hope everything is okay". I have seen both.

To sum up and add to the posts that precede:
Web as a shared resource - websites are useful so long as they are accessible. Render the web site unaccessible, and you've broken it. Denial of service attacks add up to flooding the server so that it can no longer respond to legitimate requests will always be a factor. It's a game of keep away - big server sites find ways to distribute, hackers find ways to deluge.
Dynamic data = dynamic risk - if the user can input data, there's a chance for a hacker to be a menance. Today the big concepts are cross-site scripting and SQL injection, but once one avenue for cracking is figured out, chances are high that another mechanism will rise. You could, conceivably, argue that a totally static site can be secure from this, but then how many useful sites fit that bill?
Complexity = the more complex, the harder to secure - given the rapid change of technology, I doubt that any web developer could say with 100% confidence that a modern website was secure - there's too much unknown code. Taking the host aside (the server, network protocols, OS, and maybe database), there's still all the great new libraries in Java EE and .Net. And even a less enterprise-y architecture will have some serious complexity that makes knowing all potential inputs and outputs of the code prohibitively difficult.
The authentication problem = by definition, the web site lets a remote user do something useful on a server that is far away. Knowing and trusting the other end of the communication is an old challenge. These days server side authenitication is relatively well implemented an understood and (so far as I know!) no one's managed to hack PKI. But getting user authentication ironed out is still quite tricky. It's doable, but it's a tradeoff between difficulty for the user and for configuration, and a system with a higher risk of vulnerability. And even a strong system can be broken when users don't follow the rules or when accidents happen. All this doesn't apply if you want to make a public site for all users, but that severely limits the features you'll be able to implement.
I'd say that web sites simply change the nature of the security challenge from the challenges of client side code. The developer does not need to be as worried about code replication, but the developer does need to be aware of the risks that come from centralizing data and access to a server (or collection of servers). It's just a different sort of problem.

Websites suffer greatly from injection and cross site scripting attacks
Cross-site scripting carried out on
websites were roughly 80% of all
documented security vulnerabilities as
of 2007
Also part of a website (in some web sites a great deal) is sent to the client in the form of CSS, HTML and javascript, which is the open for inspection by anyone.

Not to nitpick, but your definition of "good hosting" does not assume the HTTP service running on the host is completely free from exploits.
Popular web servers such as IIS and Apache are often patched in order to protect against such exploits, which are often discovered the same way exploits in local executables are discovered.
For example, a malformed HTTP request could cause a buffer overrun on the server, leading to part of its data being executed.

It's not possible to make anything 100% secure.
All that can be done is to make something hard enough to break into, that the time and effort spent doing so makes it not worth doing.

Can I crack your site? Sure, I'll just hire a few suicide bombers to blow up your servers. Or... I'll blow up those power plants that power up your site, or I do some sort of social engineering, and DDOS attacks would quite likely be effective in a large scale not to mention atom bombs...
Short answer: yes.

This might be the wrong website to discuss that. However, it is widely known that security and usability are inversely related. See this post by Bruce Schneier for example (which refers to another website, but on Schneier's blog there's a lot of interesting readings on the issue).

Assuming the server itself isn't comprimised, and has no other clients sharing it, static code should be fine. Things usually only start to get funky when there's some sort of scripting language involved. After all, I've never seen a comprimised "It Works!" page

Saying 'completely secure' is a bad thing as it will state two things:
there has not been a proper threat analysis, because secure enough would be the 'correct' term
since security is always a tradeoff it means that the a system that is completely secure will have abysmal usability and the site will be a huge resource hog as security has been taken to insane levels.
So instead of trying to achieve "complete security" you should;
Do a proper threat analysis
Test your application (or have someone professional test it) against common attacks
Apply best practices, not extreme measures

The short of it is that you have to strike a balance between ease of use and security, much of the time, and decide what provides the optimal level of both for your purposes.
An excellent case in point is passwords. The easy way to go about it is to just have one, use it everywhere, and make it something easy to remember. The secure way to go about it is to have a randomly generated variable-length sequence of characters across the encoding spectrum that only the user himself knows.
Naturally, if you go too far on the easy side, the user's data is easy to pick off. If you go too far on the side of security, however, practical application could end up leading to situations that compromise the added value of the security measures (e.g. people can't remember their whole keychain of passwords and corresponding user names, and therefore write them all down somewhere. If the list is compromised, the security measures that had been put into place are for naught. Hence, most of the time a balance gets struck and places ask that you put a number in your password and tell you not to do anything stupid like tell it to other people.
Even if you remove the possibility of a malicious person with the keys to everything leaking data from the equation, human stupidity is infinite. There is no such thing as 100% security.

May we say that website can be completely safe from attacks of hackers (we assume that hosting is not vulnerable)?
Well if we're going to start putting constraints on the attacker, then of course we can design a completely secure system: we just have to bar all of the attacker's attacks from the scenario.
If we assume the attacker actually wants to get in (and isn't bound by the rules of your engagement), then the answer is simply no, you can't be completely safe from attacks.

Yes, it's possible for a website to be completely secure, for a reasonable definition of 'complete' that includes your original premise that the hosting is not vulnerable. The problem is the same as with any software that contains defects; people create software of a complexity that is slightly beyond their capability to manage and thus flaws remain undetected until it's too late.
You could start smaller and prove all your work correct and safe as you construct it, remaking any off-the-shelf components that haven't been designed to that stringent degree of quality, but unfortunately that leaves you at a massive commercial disadvantage compared to the people who can write 99% safe software in 1% of the time. Therefore there's rarely a good business reason for going down this path.

The answer to this question lies close to the ideas about computational theory that arise from considering the halting problem. http://en.wikipedia.org/wiki/Halting_problem To wit, if you could with clarity say you'd devised a way to programmatically determine if any particular program was secure, you might be close to disproving the undecidability of the halting problem on the class of machines you were working with. Since the undecidability of the halting problem has been proven, we can know that over turing machines you would be unable to prove securability since the problem of security reduces to the halting problem. Even for finite machines you might be able to decide all of the states of the program, but Minsk would tell us that the time required for a complete state tree for even simplistic modern day machines and web servers would be huge. You probably know a lot about a specific piece of code, but as soon as you changed the code, or updated it, a complete retest would be required. Fundamentally this is interesting because it all boils back to the concept of information and meaning. Read about Automated theory proving to understand more about the limits of computational systems. http://en.wikipedia.org/wiki/Automated_theorem_proving

The fact is hackers are always one step ahead of developers, you can never ever consider a site to be bullet proof and 100% safe. You just avoid malicious stuff as much as you can !!
In fact, you should follow whitelist approach rather than blacklist approach when it comes to security.

Is it good practice to hide web server information in HTTP headers?

This question is more security related than programming related, sorry if it shouldn't be here.
I'm currently developing a web application and I'm curious as to why most websites don't mind displaying their exact server configuration in HTTP headers, like versions of Apache and PHP, with complete "mod_perl, mod_python, ..." listing and so on.
From a security point of view, I'd prefer that it would be impossible to find out if I'm running PHP on Apache, ASP.NET on IIS or even Rails on Lighttpd.
Obviously "obscurity is not security" but should I be worried at all that visitors know what version of Apache and PHP my server is running ? Is it good practice or totally unnecessary to hide this information ?

Prevailing wisdom is to remove the server ID and the version; better yet, change them to another legitimate server ID and version - that way the attacker goes off trying IIS vulnerabilities against Apache or something like that. Might as well mislead the attacker.
But honestly, there are so many other clues to go by, I wonder about whether this is worth it. I suppose it could stop attackers using a search engine to find servers with known vulnerabilities.
(Personally, I don't bother on my HTTP server, but it's written in Java and much less vulnerable to the typical kinds of attack.)

I think you usually see those headers because the systems send them by default.
I routinely remove them as they provide no real value and could, as you suggested reveal information about the server.

Hiding the information in the headers usually just slows down the lazy and ignorant villains. There are many ways to fingerprint a system.

Running nmap -O -sV against an IP will give you the OS and service versions with a fairly high degree of accuracy. The only extra info you're giving away by having your server advertise that information is which modules you have loaded.

It seems that some of the answers are missing an obvious advantage of turning off the headers.
Yes, you all are right; turning of the headers (and the statusline present e.g. at directory listings) does not stop an attacker from finding out what software you use.
However, turning this information off prevents malware which uses google to look for vulnerable systems from finding you.
tldr: Don't use it as a (or even as THE) security-measure, but as a measure to drive away unwanted traffic.

I normally turn off Apache's long header version information with ServerTokens; it adds nothing useful.
One point which nobody has picked up on, is it looks like better security to a prospective client, pen testing company etc, if you're giving out less information from your web server.
So giving less information out boosts the perceived security (i.e. it shows you have actually thought about it and done something)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string