What if I develop a desktop application which million people will use, and behind the scene, the application is surveilling users' files on their hard drives, streaming the data time to time?
Can one be assured no such things happen, with any popular software applications, be it MS Office or Google Chrome?
Or this is just a stupid question?
Is it technically possible? Yes, it is.
Could it be happening in an application used by a million users for a relatively long time without being noticed? Very unlikely. Somebody would notice the strange network traffic eventually.
Also #Mjh mentioned open source in a comment. While open source can help by allowing people to audit the source code, how many times have you checked that the binary you are using is actually the compiled source that you were looking at? Of course, there are signatures on binary packages and all, but the signature is made by the package maintainer. There is an inherent trust not only in the developer of the application, but also in the tool chain that creates a binary package from the source code. And then we haven't talked about strange "bugs", or the fact that even in open source, some security issues are very hard to find (otherwise all open source software would be security bug-free, which they are not).
So back to your question, sure, you could use all kinds of techniques to monitor the behavior of an application, you could monitor memory access, network traffic, whatever else. You can also analyse the code itself, look for suspicious things. It will take a huge amount of effort and still there will be no 100% guarantee, only some level of assurance.
Automated version upgrades could make detection even harder by the way. Even if you put lots of resources into analysis of one version, what if only a short-lived version had malicious code? Sure, that too can be analysed, but would anyone bother, unless there was a good reason (like indications of something malicious)?
Yet I think you can be pretty sure that major vendors don't do this. It's just not worth it for them, why would they? Their risk would be huge, with a relatively low benefit.
I am building a client-side program that connects to a server. This client-side program needs to have the source code available to the users as part of the licencing (not an option). However, I need to ensure that when a user connects to the server with that client-side program, it's running with the original code and hasn't been altered and re-compiled.
Is there any way to check during connection to the server that they're using an unaltered version of the program?
No, there's really no way to do that.
You're basically encountering the "Trusted Client" problem. The client code runs on the user's PC, and the user has full control over that PC. He can change the bytes of the program on disk, or even in memory. If you were to try to perform a hash or checksum against the code, he could simply change the code that did that verification and make it return "unmodified".
You could try to make things a little harder on a malicious user but there's no practical way to achieve what you're hoping.
What you have described is a issue that the video game industry has been fighting for the last decade and a half. In short, how to prevent the user from modifying the client (in their case, generally to prevent cheating, though also for copyright reasons). If that effort has taught us anything, it's that preventing modifications to the client is a constant arms race that you will never decisively win. In light of that, don't even try.
Follow the standard client-server assumption that the client is in the hands of the enemy and cannot be trusted. Build your server side defensively based on that assumption and you'll be alright.
It's very very difficult and probably not worth it. But if you are interested in pursuing it you'd have to develop something that has been code signed and monitored by the Windows kernal.
A couple topics that will orient you to the scope of the problem:
Protected media path
Driver signing
Both media devices and device drivers are digitally signed by the manufacturer and continuously monitored by Windows. If anything goes out of whack, it gets shut down (that'ts the technical term). Seems very daunting. And I don't know if the technology is available for desktop software that isn't a device driver and isn't related to DRM.
Good luck!
Target hardware is a rather low-powered MCU (ARM Cortex-M3 #72MHz, with just about 64KB SRAM and 256KB flash), so walking the thin line here. My board does have ethernet, and I will eventually get lwIP (lightweight TCP/IP FOSS suite) running on it (currently struggling). However, I also need some kind of super light-weight alternative to SSL/TLS. I am aware of the multiple GPL'd SSL/TLS implementations for such MCU's, but their footprint is still fairly significant. While they do fit-in, given everything else, don't leave much room for others.
My traffic is not HTTP, so I don't have to worry about HTTPS, and my client/server communication can be completely proprietary, so non-standard solution is okay. Looking for suggestions on what might be the minimalistic yet robust (well a weak security is worthless), alternative that helps me --
Encrypt my communication (C->S & S->C)
Do 2-way authentication (C->S & S->C)
Avoid man-in-middle attacks
I won't be able to optimize library at ARMv7 assembly level, and thus bank entirely on my programming skills and the GNU-ARM compiler's optimization. Given above, any pointers of what might be the best options ?
If any of those small TLS implementations allow you to disable all X.509 and ASN.1 functionality and just use TLS with preshared-keys you'd have quite a small footprint. That's because only symmetric ciphers and hashes are used.
There's CurveCP. It's meant to completely replace SSL.
It's fairly new, and still undergoing development, but its author is a well-known expert in the field, and has been carefully working toward it during the past decade. A lot of careful research and design has been put into it.
My immediate reaction would be to consider Kerberos. It's been heavily studied, so the chances of major holes at this point are fairly remote. At the same time, it's fairly minimalist, so unless you restrict what you need to do, you're probably not going to be able to use anything much more lightweight without compromising security.
If that's still too heavy for your purposes, you're probably going to need to impose some restrictions on what you want done.
You might have a look at MST (Minimal Secure Transport) https://github.com/DiplIngFrankGerlach/MST.
It provides the same security assurances as TLS, but requires a pre-shared key. Also, it is very small (less than 1000 LoC, without AES) and can therefore easily be reviewed by an expert.
I know it is to come with variant of answer about two years later, yet... "PolarSSL's memory footprint can get as small as 30k and averages below 110k." https://polarssl.org/features
Is there a site, or is there a simple way of setting up one, which demonstrates what can happen with a buffer overrun? This is in the context of a web app.
The first question is... what webserver or library? IIS-6 running the ASP script interpreter? ASP.NET behind IIS-7? Apache Tomcat? Some library like the HTTPServer in Poco?
To exploit a buffer overrun you must first find a specific place in an application where the overrun can occur, and then figure out what exact bytes must be sent to overwrite memory with the right bytes to run out to the stack and then execute machine code you've sent in the overrun (because you've rewritten the stack).
http://en.wikipedia.org/wiki/Buffer_overflow
http://seclists.org/fulldisclosure/2005/Apr/412
http://www.microsoft.com/technet/security/bulletin/MS06-034.mspx
http://www.owasp.org/index.php/Buffer_Overflow
http://www.windowsecurity.com/articles/Analysis_of_Buffer_Overflow_Attacks.html
Bring on the flames, but I would say that any codebase of sufficient size contains potentially explotable buffer overflows regardless of the execution environment or source language. Sure, some languages and platforms allow the app developers to "shoot themselves in the foot" easier than others, but it all comes down to individual dilligence, project complexity, use of 3rd-party code and then there are always exploits for base implementations.
Buffer overruns are here to stay, it's just a question of mitigating exposure through safe programming practices and diligently monitoring for new exploits as they are discovered to manage risk.
A grand example is that for a web app, there's the chance for overflows in the networking implementations of all the equipment and applications handling delivery of your bytes across the Internet.
Consider the distinction of a client trying to break a server, versus one client trying to conquer other clients accessing a specific server?
Besides the simple "C" style overruns where the developer failed to follow secure programming practices, there are hidden gems like integer overflows.
Even if a certain server host environment is known to be highly stable, it no doubt makes use of a database and other operating system facilities which themselves might be exploitable through the webserver.
When you get right down to it, nothing is secure and it's really just a matter of complexity.
The only way a specific application or server can be fully secure is through diligent monitoring so issues can be corrected when questionable behavior is discovered. I highly doubt Walmart or Fedex or any other major corporation is happy to publicize breaches. Instead they monitor, fix and move on.
One could even say that SQL Injection is an abstract form of a buffer overrun, but you're simply picking where you want the buffer to end using specific characters and then providing the code you wish to execute in the right place.
They tend not to occur in web apps. Google's Web Application Exploits and Defenses codelab has a short section on it but doesn't demonstrate it:
This codelab doesn't cover overflow vulnerabilities because Jarlsberg is written in Python, and therefore not vulnerable to typical buffer and integer overflow problems. Python won't allow you to read or write outside the bounds of an array and integers can't overflow. While C and C++ programs are most commonly known to expose these vulnerabilities, other languages are not immune. For example, while Java was designed to prevent buffer overflows, it silently ignores integer overflow.
It really depends on the language you're using as to whether buffer overflow is even possible.
we know that each executable file can be reverse engineered (disassembled, decompiled). No mater how strong security you will implement, anyway if crackers want to, they do crack!!! Just that is a question of time.
What about websites? May we say that website can be completely safe from attacks of hackers (we assume that hosting is not vulnerable)? If no, than what is the reason?
Yes it is always possible to do. There is always a way in.
It's like my grandfather always said:
Locks are meant to keep the honest
people out
May we say that website can be completely safe from attacks of hackers?
No. Even the most secure technology in the world is vulnerable to social engineering attacks, for one thing.
You can easily write a webapp that is mathematically proven to be secure... But that proof will only hold as long as the underlying operating system, interpreter|compiler, and hardware are secure, which is never the case.
The key thing to remember is that websites are usually part of a huge and complex system and it doesn't really matter if the hacker enters the system through the web application itself or some other part of the entire infrastructure. If someone can get access to your servers, routers, DNS or whatever, they can bring down even the best web application. In my experience a lot of systems are vulnerable in some way or another. So "completely secure" means either "we're trying really hard to secure the platform" or "we have no clue whatsoever, but we hope everything is okay". I have seen both.
To sum up and add to the posts that precede:
Web as a shared resource - websites are useful so long as they are accessible. Render the web site unaccessible, and you've broken it. Denial of service attacks add up to flooding the server so that it can no longer respond to legitimate requests will always be a factor. It's a game of keep away - big server sites find ways to distribute, hackers find ways to deluge.
Dynamic data = dynamic risk - if the user can input data, there's a chance for a hacker to be a menance. Today the big concepts are cross-site scripting and SQL injection, but once one avenue for cracking is figured out, chances are high that another mechanism will rise. You could, conceivably, argue that a totally static site can be secure from this, but then how many useful sites fit that bill?
Complexity = the more complex, the harder to secure - given the rapid change of technology, I doubt that any web developer could say with 100% confidence that a modern website was secure - there's too much unknown code. Taking the host aside (the server, network protocols, OS, and maybe database), there's still all the great new libraries in Java EE and .Net. And even a less enterprise-y architecture will have some serious complexity that makes knowing all potential inputs and outputs of the code prohibitively difficult.
The authentication problem = by definition, the web site lets a remote user do something useful on a server that is far away. Knowing and trusting the other end of the communication is an old challenge. These days server side authenitication is relatively well implemented an understood and (so far as I know!) no one's managed to hack PKI. But getting user authentication ironed out is still quite tricky. It's doable, but it's a tradeoff between difficulty for the user and for configuration, and a system with a higher risk of vulnerability. And even a strong system can be broken when users don't follow the rules or when accidents happen. All this doesn't apply if you want to make a public site for all users, but that severely limits the features you'll be able to implement.
I'd say that web sites simply change the nature of the security challenge from the challenges of client side code. The developer does not need to be as worried about code replication, but the developer does need to be aware of the risks that come from centralizing data and access to a server (or collection of servers). It's just a different sort of problem.
Websites suffer greatly from injection and cross site scripting attacks
Cross-site scripting carried out on
websites were roughly 80% of all
documented security vulnerabilities as
of 2007
Also part of a website (in some web sites a great deal) is sent to the client in the form of CSS, HTML and javascript, which is the open for inspection by anyone.
Not to nitpick, but your definition of "good hosting" does not assume the HTTP service running on the host is completely free from exploits.
Popular web servers such as IIS and Apache are often patched in order to protect against such exploits, which are often discovered the same way exploits in local executables are discovered.
For example, a malformed HTTP request could cause a buffer overrun on the server, leading to part of its data being executed.
It's not possible to make anything 100% secure.
All that can be done is to make something hard enough to break into, that the time and effort spent doing so makes it not worth doing.
Can I crack your site? Sure, I'll just hire a few suicide bombers to blow up your servers. Or... I'll blow up those power plants that power up your site, or I do some sort of social engineering, and DDOS attacks would quite likely be effective in a large scale not to mention atom bombs...
Short answer: yes.
This might be the wrong website to discuss that. However, it is widely known that security and usability are inversely related. See this post by Bruce Schneier for example (which refers to another website, but on Schneier's blog there's a lot of interesting readings on the issue).
Assuming the server itself isn't comprimised, and has no other clients sharing it, static code should be fine. Things usually only start to get funky when there's some sort of scripting language involved. After all, I've never seen a comprimised "It Works!" page
Saying 'completely secure' is a bad thing as it will state two things:
there has not been a proper threat analysis, because secure enough would be the 'correct' term
since security is always a tradeoff it means that the a system that is completely secure will have abysmal usability and the site will be a huge resource hog as security has been taken to insane levels.
So instead of trying to achieve "complete security" you should;
Do a proper threat analysis
Test your application (or have someone professional test it) against common attacks
Apply best practices, not extreme measures
The short of it is that you have to strike a balance between ease of use and security, much of the time, and decide what provides the optimal level of both for your purposes.
An excellent case in point is passwords. The easy way to go about it is to just have one, use it everywhere, and make it something easy to remember. The secure way to go about it is to have a randomly generated variable-length sequence of characters across the encoding spectrum that only the user himself knows.
Naturally, if you go too far on the easy side, the user's data is easy to pick off. If you go too far on the side of security, however, practical application could end up leading to situations that compromise the added value of the security measures (e.g. people can't remember their whole keychain of passwords and corresponding user names, and therefore write them all down somewhere. If the list is compromised, the security measures that had been put into place are for naught. Hence, most of the time a balance gets struck and places ask that you put a number in your password and tell you not to do anything stupid like tell it to other people.
Even if you remove the possibility of a malicious person with the keys to everything leaking data from the equation, human stupidity is infinite. There is no such thing as 100% security.
May we say that website can be completely safe from attacks of hackers (we assume that hosting is not vulnerable)?
Well if we're going to start putting constraints on the attacker, then of course we can design a completely secure system: we just have to bar all of the attacker's attacks from the scenario.
If we assume the attacker actually wants to get in (and isn't bound by the rules of your engagement), then the answer is simply no, you can't be completely safe from attacks.
Yes, it's possible for a website to be completely secure, for a reasonable definition of 'complete' that includes your original premise that the hosting is not vulnerable. The problem is the same as with any software that contains defects; people create software of a complexity that is slightly beyond their capability to manage and thus flaws remain undetected until it's too late.
You could start smaller and prove all your work correct and safe as you construct it, remaking any off-the-shelf components that haven't been designed to that stringent degree of quality, but unfortunately that leaves you at a massive commercial disadvantage compared to the people who can write 99% safe software in 1% of the time. Therefore there's rarely a good business reason for going down this path.
The answer to this question lies close to the ideas about computational theory that arise from considering the halting problem. http://en.wikipedia.org/wiki/Halting_problem To wit, if you could with clarity say you'd devised a way to programmatically determine if any particular program was secure, you might be close to disproving the undecidability of the halting problem on the class of machines you were working with. Since the undecidability of the halting problem has been proven, we can know that over turing machines you would be unable to prove securability since the problem of security reduces to the halting problem. Even for finite machines you might be able to decide all of the states of the program, but Minsk would tell us that the time required for a complete state tree for even simplistic modern day machines and web servers would be huge. You probably know a lot about a specific piece of code, but as soon as you changed the code, or updated it, a complete retest would be required. Fundamentally this is interesting because it all boils back to the concept of information and meaning. Read about Automated theory proving to understand more about the limits of computational systems. http://en.wikipedia.org/wiki/Automated_theorem_proving
The fact is hackers are always one step ahead of developers, you can never ever consider a site to be bullet proof and 100% safe. You just avoid malicious stuff as much as you can !!
In fact, you should follow whitelist approach rather than blacklist approach when it comes to security.