gnuplot 5.x Buffer overflow vulnerability - security

I would like to use gnuplot 5.x on my employer's hardware platforms (Dell workstations & laptop running Windows 7 and 10 respectively). Our system administration has identified "buffer overflow vulnerabilities" associated with this software.
A simple search for "gnuplot buffer overflow" on Google yields some information related to this, e.g.:
https://sourceforge.net/p/gnuplot/bugs/2093/
https://sourceforge.net/p/gnuplot/bugs/1413/
https://research.loginsoft.com/bugs/buffer-overflow-vulnerability-in-ps_options-gnuplot-5-2-5/
I am not a C++ savvy software programmer nor a cyber-security specialist. What I see in some of these posts is comments like:
Link #1: "an attacker might use this flaw to overflow important data to hijack the control flow".
Link #3: "This allows an attacker to cause Denial of Service (Segmentation fault and Memory Corruption) or possibly have unspecified other impacts when a victim opens a specially crafted file".
And I am thinking to myself: "Really?". Are these assessments credible, and if so, how can such a long-established and widely-used tool have such a serious flaw?
As you can imagine, I am having trouble getting the installation and usage of gnuplot approved by my employer. Any evidence-based information you can provide to shed light on this matter would be very much appreciated.
Thank you very much.
Maziar.

All programs have bugs. You point to two specific bugs that were identified in 2018, reported on the project bug tracker, and fixed in the first subsequent release of the program. In both cases the bugs were of the sort "you could crash this program if you give it certain improperly formed commands". Those particular bugs no longer exist in current versions of the program, since they were reported and fixed. But other bugs of the sort probably exist in gnuplot and in every single program running on your computer. That's life in an imperfect world.
In other words, you or your employer's security policy must decide whether it is really a problem that "this program might crash if you feed it garbage".
A more realistic concern is that gnuplot is essentially a scripting language. It can read and write any file that the user has permission to access. If you run a gnuplot script that came from a malicious source, it might overwrite your files or issue harmful commands to your system. This is the same concern as if you downloaded and executed a *.com or *.exe file or a python script any other series of commands provided by a malicious source. It's not really an indictment of the program, only of lax security practices by the user.

Related

Why is return-to-libc much earlier than return-to-user?

I'm really new to this topic and only know some basic concepts. Nevertheless, there is a question that quite confuses me.
Solar Designer proposed the idea of return-to-libc in 1997 (http://seclists.org/bugtraq/1997/Aug/63). Return-to-user, from my understanding, didn't become popular until 2007 (http://seclists.org/dailydave/2007/q1/224).
However, return-to-user seems much easier than return-to-libc. So my question is, why did hackers spend so much effort in building a gadget chain by using libc, rather than simply using their own code in the user space when exploiting a kernel vulnerability?
I don't believe that they did not realize there are NULL pointer dereference vulnerabilities in the kernel.
Great question and thank you for taking some time to research the topic before making a post that causes it's readers to lose hope in the future of mankind (you might be able to tell I've read a few [exploit] tags tonight)
Anyway, there are two reasons
return-to-libc is generic, provided you have the offsets. There is no need to programmatically or manually build either a return chain or scrape existing user functionality.
Partially because of linker-enabled relocations and partly because of history, the executable runtime of most programs executing on a Linux system essentially demand the libc runtime, or at least a runtime that can correctly handle _start and cons. This formula still stands on Windows, just under the slightly different paradigm of return-to-kernel32/oleaut/etc but can actually be more immediately powerful, particularly for shellcode with length requirements, for reasons relating to the way in which system calls are invoked indirectly by kernel32-SSDT functions
As a side-note, If you are discussing NULL pointer dereferences in the kernel, you may be confusing return to vDSO space, which is actually subject to a different set of constraints that standard "mprotect and roll" userland does not.
Ret2libc or ROP is a technique you deploy when you cannot return to your shellcode (jmp esp/whatever) because of memory protections (DEP/NX).
They perform different goals, and you may engage with both.
Return to libc
This is a buffer overrun exploit where you have control over some input, and you know there exists a badly written program which will copy your input onto the stack and return through it. This allows you to start executing code on a machine you have not got a login to, or a very restrictive access.
Return to libc gives you the ability to control what code is executed, and would generally result in you running within the space, a more natural piece of code.
return from kernel
This is a privilege escalation attack. It takes code which is running on the machine, and exploits bugs in the kernel causing the kernel privileges to be applied to the user process.
Thus you may use return-to-libc to get your code running in a web-browser, and then return-to-user to perform restricted tasks.
Earlier shellcode
Before return-to-libc, there was straight shellcode, where the buffer overrun included the exploit code, with knowledge about where the stack would be, this was possible to run directly. This became somewhat obsolete with the NX bit in windows. (x86 hardware was capable of having execute on a segment, but not a page).
Why is return-to-libc much earlier than return-to-user?
The goal in these attacks is to own the machine, frequently, it was easy to find a vulnerability in a user-mode process which gave you access to what you needed, it is only with the hardening of the system by reducing the privilege at the boundary, and fixing the bugs in important programs, that the new return-from-kernel became necessary.

How does towelroot (futex exploit) works

There is a security issue in linux kernel, which affects most of android devices and basically allows any user to become root.
Since I am linux user for quite some time, I am very curious how this exploit works, especially how can I check whether my kernel in my PC (custom built) or on any of my servers, is vulnerable to this or not. Is there any source code (preferably documented) or details of the exploit so that I could see how it works? I could only find the generic information or closed source binaries that do exploit the bug and give you root if executed by any user, but no background information or details of which part of kernel has the flaw and how is it even possible to do this.
So far I found this interesting article http://tinyhack.com/2014/07/07/exploiting-the-futex-bug-and-uncovering-towelroot/ which explains that it uses stack hack, by calling certain syscalls in order to get something into a stack of futex_queue. While I understand how that works, I have no idea how changing anything in that stack can actually elevate privileges of current process. What I found interesting is, that this guy say that since kernel 3.13 something has changed and now different technique is needed to exploit this. Does it mean that this was not even fixed and is still exploitable in recent kernel that can be downloaded from kernel.org?
As SilverlightFox said, the Security portion of stackexchange (http://security.stackexchange.com/) probably would be better for this, but here goes nothing.
From the sound of it, this hack appears to be a way to elevate any users' terminal/kernel for a given amount of time, which is, not to say the least, bad. My idea of how this sort of issue would work is a program that overloads the futex_queue by calling those said syscalls and then provides, temporarily, the user with superuser access.
I looked around at the link you provided and found that it does require remote login from SSH or similar procedures. In a console screenshot, it uses the line gcc -o xpl xpl.c -lpthread, which shows that this exploit is done in C. And a quote directly from the article:
This is actually where the bug is: there is a case where the waiter is
still linked in the waiter list and the function returns. Please note that a kernel stack is completely separate from user stack. You can not
influence kernel stack just by calling your own function in the userspace. You can manipulate kernel stack value by doing syscall.
In the image at http://www.clevcode.org/cve-2014-3153-exploit/, it shows the output of the towelroot exploit, testing the address limits and getting into the task structure to spawn a superuser shell. Also, in the tinyhack article, it gives a simple recreation of this exploit's base, so I'd recommend taking a look at that and working from it.
I don't know any clear form of testing if your system is vulnerable, so the best I can tell you is to try and harden your systems and do all you can to keep it protected. Anyway, I don't think that someone would easily get hold of server ports and logins to run this exploit on your system.
Cheers!

Need advice to design 'crack-proof' software [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
I am currently working on a project where i need to create some architecture, framework or any standards by which i can "at least" increase the cracking method for a software, i.e, to add to software security. There are already different ways to activate a software which includes online activation, keys etc. I am currently studying few research papers as well. But there are still lot of things that i want to discuss.
Could someone guide me to some decent forum, mailing list or something like that? or any other help would be appreciated.
I'll tell you the closest thing to "crackproof": a web application.
Desktop applications are doomed, for many other reasons, but making your application run "in the cloud", in a browser, gives you a lot more control about security.
A desktop software runs on the client's computer, so the client has full access to it. A web app runs on your server, so the client only sees a tiny bit of it.
You need to begin by infiltrating the local hacking gang, posing as an 11 year old who wants to "hack it up". Once you've earned their trust you can learn what features they find hardest to crack. As you secretly release "uncrackable" software to the local message boards, you can see what they do with it. Build upon your inner knowledge until they can no longer crack your software. When that is done, let your identity be known. Ideally, this will be seen as a sign of betrayal, that you're working against them. Hopefully this will lead them to contact other hackers outside the local community to attack your software.
Continue until you've reached the top of the hacker mafia. Write your thesis as a book, sell to HBO.
Isn't it a sign of success when your product gets cracked? :)
Seriously though - one approach is to use License objects that are serialized to XML and then encrypted using public/private key pairs. They are then read back in at runtime, de-serialized and processed to ensure they are valid.
But there is still the ubiquitous "IsValid()" method which can be cracked to always return true.
You could even put that method into a signed assembly to prevent tampering, but all you've done then is create another layer of "IsValid()" which too can be cracked.
We use licenses to turn on or off various features in our software, and to validate support/upgrade periods. But this is only for our legitimate customers. Anyone who wants to bypass it probably could.
We trust our legitimate customers to not try to bypass the licensing, and we accept that our illegitimate customers will find a way.
We would waste more money attempting to imporve the 'tamper proof' nature of our solution that we loose to people who pirate software.
Plus you've got to consider the pain to our legitimate customers, and asking them to paste a license string from their online account page is as much pain as I'd want to put them through. Why create additional barriers to entry for potential customers?
Anyway, depending on which solution you've got in place already, my description above might give you some ideas that might decrease the likelyhood someone will crack your product.
As nute said, any code you release to a customer's machine is crackable.
Don't try for "uncrackable." Try for "there's enough deterrent to reasonably protect my assets."
There are a lot of ways you can try and increase the cost of cracking. Most of them cost you but there is one thing you can do that actually reduces your costs while increasing the cost of cracking: deliver often.
There is a finite cost to cracking any given binary. That cost is increased by the number of binaries being cracked. If you release new functionality every week, you essentially bifurcate your users into two groups:
Those who don't need the latest features and can wait for a crack.
Those who do need the latest features and will pay for your software.
By engaging in the traditional anti-cracking techniques, you can multiply the cost of cracking one binary an, consequently, widen the gap between when a new feature is released and when it is available on the black market. To top it all off, your costs will go down and the amount of value you deliver in a period of time will go up - that's what makes it free.
The more often you release, the more you will find that quality and value go up, cost goes down, and the less likely people will be to steal your software.
As others have mentioned, once you release the bits to users you have given up control of them. A dedicated hacker can change the code to do whatever they want. If you want something that is closer to crack-proof, don't release the bits to users. Keep it on the server. Provide access to the application through the Internet or, if the user needs a desktop client, keep critical bits on the server and provide access to them via web services.
Like others have said, there is no way of creating a complete crack-proof software, but there are ways to make cracking the software more difficult; most of these techniques are actually used by bad guys to hide the malware inside binaries and by game companies to make cracking and copying the games more difficult.
If you are really serious about doing this, you could check e.g., what executable packers like UPX do. But then you need to implement the unpacker also. I do not actually recommend doing this, but studying game protectors and binary obfuscation might help you in your quest.
First of all, in what language are you writing this?
It's true that a crack-proof program is impossible to achieve, but you can always make it harder. A naive approach to application security means that a program can be cracked in minutes. Some tips:
If you're deploying to a virtual machine, that's too bad. There aren't many alternatives there. All popular vms (java, clr, etc.) are very simple to decompile, and no obfuscator nor signature is enough.
Try to decouple as much as possible the UI programming with the underlying program. This is also a great design principle, and will make the cracker's job harder from the GUI (e.g. enter your serial window) to track the code where you actually perform the check
If you're compiling to actual native machine code, you can always set the build as a release (not to include any debug information is crucial), with optimization as high as possible. Also in the critical parts of your application (e.g. when you validate the software), be sure to make it an inline function call, so you don't end up with a single point of failure. And call this function from many different places in your app.
As it was said before, packers always add up another layer of protection. And while there are many reliable choices now, you can end up being identified as a false positive virus by some anti-virus programs, and all the famous choices (e.g. UPX) have already pretty straight-forward unpackers.
there are some anti-debugging tricks you can also look for. But this is a hassle for you, because at some time you might also need to debug the release application!
Keep in mind that your priority is to make the critical part of your code as untraceable as possible. Clear-text strings, library calls, gui elements, etc... They are all points where an attacker may use to trace the critical parts of your code.

Any good strategies for dealing with 'not reproducible' bugs? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
Very often you will get or submit bug reports for defects that are 'not reproducible'. They may be reproducible on your computer or software project, but not on a vendor's system. Or the user supplies steps to reproduce, but you can't see the defect locally. Many variations on this scenario of course, so to simplify I guess what I'm trying to learn is:
What is your company's policy towards 'not reproducible' bugs? Shelve them, close them, ignore? I occasionally see intermittent, non reproducible bugs in 3-rd party frameworks, and these are pretty much always closed instantly by the vendor... but they are real bugs.
Have you found any techniques that help in fixing these types of bugs? Usually what I do is get a system info report from the user, and steps to reproduce, then search on keywords, and try to see any sort of pattern.
Verify the steps used to produce the error
Oftentimes the people reporting the error, or the people reproducing the error, will do something wrong and not end up in the same state, even if they think they are. Try to walk it through with the reporting party. I've had a user INSIST that the admin privileges were not appearing correctly. I tried reproducing the error and was unable to. When we walked it through together, it turned out he was logging in as a regular user in that case.
Verify the system/environment used to produce the error
I've found many 'irreproducible' bugs and only later discovered that they ARE reproducible on Mac OS (10.4) Running X version of Safari. And this doesn't apply only to browsers and rendering, it can apply to anything; the other applications that are currently being run, whether or not the user is RDP or local, admin or user, etc... Make certain you get your environment as close to theirs as possible before calling it irreproducible.
Gather Screenshots and Logs
Once you have verified that the user is doing everything correctly and still getting a bug, and that you're doing exactly what they do, and you are NOT getting the bug, then it's time to see what you can actually do about it. Screenshots and logs are critical. You want to know exactly what it looks like, and exactly what was going on at the time.
It is possible that the logs could contain some information that you can reproduce on your system, and once you can reproduce the exact scenario, you might be able to coax the error out of hiding.
Screenshots also help with this, because you might discover that "X piece has loaded correctly, but it shouldn't have because it is dependent on Y" and that might give you a hint. Even if the user can describe what were doing, a screen shot could help even more.
Gather step-by-step description from the user
It's very common to blame the users, and not trust anything that they say (because they call a 'usercontrol' a 'thingy') but even though they might not know the names of what they're seeing, they will still be able to describe some of the behaviour they are seeing. This includes some minor errors that may have occured a few minutes BEFORE the real error occurred, or possibly slowness in certain things that are usually fast. All these things can be clues to help you narrow down which aspect is causing the error on their machine and not yours.
Try Alternate Approachs to produce the error
If all else fails, try looking at the section of code that is causing problems, and possibly refactor or use a workaround. If it is possible for you to create a scenario where you start with half the information already there (hopefully in UAT) ask the user to try that approach, and see if the error still occurs. Do you best to create alternate but similar approaches that get the error into a different light so that you can examine it better.
Short answer: Conduct a detailed code review on the suspected faulty code, with the aim of fixing any theoretical bugs, and adding code to monitor and log any future faults.
Long answer:
To give a real-world example from the embedded systems world: we make industrial equipment, containing custom electronics, and embedded software running on it.
A customer reported that a number of devices on a single site were experiencing the same fault at random intervals. Their symptoms were the same in each case, but they couldn't identify an obvious cause.
Obviously our first step was to try and reproduce the fault in the same device in our lab, but we were unable to do this.
So, instead, we circulated the suspected faulty code within the department, to try and get as many ideas and suggestions as possible. We then held a number of code review meetings to discuss these ideas, and determine a theory which: (a) explained the most likely cause of the faults observed in the field; (b) explained why we were unable to reproduce it; and (c) led to improvements we could make to the code to prevent the fault happening in the future.
In addition to the (theoretical) bug fixes, we also added monitoring and logging code, so if the fault were to occur again, we could extract useful data from the device in question.
To the best of my knowledge, this improved software was subsequently deployed on site, and appears to have been successful.
resolved "sterile" and "spooky"
We have two closed bug categories for this situation.
sterile - cannot reproduce.
spooky - it's acknowledged there is a problem, but it just appears intermittently, isn't quite understandable, and gives everyone a faint case of the creeps.
Error-reporting, log files, and stern demands to "Contact me immediately if this happens again."
If it happens in one context, and not in another, we try to enumerate the difference between both, and eliminate them.
Sometimes this works (e.g. other hardware, dual core vs. hyperthreading, laptop-disk vs. workstation disk, ...).
Sometimes it doesn't. If it's possible, we may start remote-debugging. If that doesn't help, we may try get our hands on the customer's system.
But of course, we don't write too many bugs in the first place :)
Well, you try your best to reproduce it, and if you can't, you take a long think and consider how such a problem might arise. If you still have no idea, then there's not much you can do about it.
Some of the new features in Visual Studio 2010 will help. See:
Historical Debugger and Test Impact Analysis in Visual Studio Team System 2010
Better Software Quality with Visual Studio Team System 2010
Manual Testing with Visual Studio Team System 2010
Sometimes the bug is not reproducible even in a pre-production environment that is the exact duplicate of the production environment. Concurrency issues are notorious for this.
Random Failures Are Often Concurrency Issues
Link: https://pragprog.com/tips/
The reason can be simply because of the Heisenberg effect, i.e. observation changes behaviour. Another reason can be because the chances are very small of hitting the combination of events that triggers the bug.
Sometimes you are lucky and you have audit logs that you can playback, greatly increasing the chances of recreating the issue. You can also stress the environment with high volumes of transactions. This effectively compresses time so that if the bug occurs say once a week, you may be able to reliably reproduce it in 1 day if you stress the system to 7 X the production load.
The last resort is whitebox testing where you go through the code line by line writing unit tests as you go.
I add logging to the exception handling code throughout the program. You need a method to collect the logs (users can email it, etc.)
Preemptive checks for code versions and sane environments are a good thing too. With the ease of software updates these days the code and environment the user is running has almost certainly not been tested. It didn't exist when you released your code.
With a web project I'm developing at the moment I'm doing something very similar to your technique. I'm building a page that I can direct users to in order to collect information such as their browser version and operating system. I'll also be collecting the apps registry info so i can have a look at what they've been doing.
This is a very real problem. I can only speak for web development, but I find users are rarely able to give me the basic information I would need to look into the issue. I suspect it's entirely possible to do something similar with other kinds of development. My plan is to keep working on this system to make it more and more useful.
But my policy is never to close a bug simply because I can't reproduce it, no matter how annoying it may be. And then there's the cases when it's not a bug, but the user has simply gotten confused. Which is a different type of bug I guess, but just as important.
You talk about problems that are reproducible but only on some systems. These are easy to handle:
First step: By using some sort of remote software, you let the customer tell you what to do to reproduce the problem on the system that has it. If this fails, then close it.
Second step: Try to reproduce the problem on another system. If this fails, make an exact copy of the customers system.
Third step: If it still fails, you have no option than to try to debug it on the customer system.
Once you can reproduce it, you can fix it. Doesn't matter on what system.
The tricky issue are truly non-reproducible issues, that is things that happen only intermittently. For that I'll have to chime in with the reports, logs and stern demands attitude. :)
It is important to categorize such bugs (rarely reproducible) and act on them differently than bugs that are frequently reproducible based on specific user actions.
Clear issue description along with steps to reproduce and observed behavior: Unambiguous reporting helps in understanding of the issue by entire team eliminating incorrect conclusions. For example, user reporting blank screen is different than HMI freeze on user action. Sequence of steps and approx timing of user action is also important.Did the user immediately select the option after screen transition or waited for a few minutes? An interesting bug concerning timing is a car allergic to vanilla ice-cream that baffled automotive engineers.
System config and startup parameters: Sometimes even hardware configuration and application software version (including drivers and firmware version) may do a trick or two. Mismatch of version or configuration can result in issues that are difficult to reproduce in other setups. Hence these are essential details to be captured. Most bug reporting tools have these details as mandatory parameters to report while logging an issue.
Extensive Logging: This is dependent on the logging facilities followed in concerned projects. While working with embedded Linux systems, we not only provide general diagnostic logs, but also system level logs like dmesg or top command logs. You may never know that wrong part is not the code flow but the abnormal memory usage/CPU usage. Identify the type of the issue and report the relevant logs for investigation.
Code Reviews and Walk-through: Dev teams cannot wait forever to reproduce these issues at their end and then take action. Bug report and available logs should be investigated and various possibilities be identified on this basis from design and code. If required, they should prepare hotfix on possible root causes and circulate the hotfix among teams including the tester who identified it to see if bug is reproducible with it.
Don't close these issues based on observation by a single tester/team after a fix is identified and checked in: Perhaps the most important part is approach followed to close these issues. Once fix of these issues has been checked in, all testing/validation teams at different locations should be informed on it for running intensive tests and identifying regression errors if any. Only all (practically most of them) of them reports as non-reproducible, a closure assessment has to be done by senior management.
If it is not reproduce able get logs, screen shots of exact steps to reproduce.
There's a nice new feature in Windows 7 that allows the user to record what they're doing and then send a report - it comes through as a doc with screen-shots of every stage. Hopefully it'll help in the cases where it's the user interacting with the application in an order that the developer wouldn't think of. I've seen plenty of bugs where it's just a case that the developer's logical way of using the app doesn't fit with how end users actually do it... resulting in lots of subtle errors.
Logging is your friend!
Generally what happens when we discover a bug that we can't reproduce is we either ask the customer to turn on more logging (if its available), or we release a version with extra logging added around the area we are interested in. Generally speaking the logging we have is excellent and has the ability to be very verbose, and so releasing versions with extra logging doesn't happen often.
You should also consider the use of memory dumps (which IMO also falls under the umbrella of logging). Producing a minidump is so quick that it can usually be done on production servers, even under load (as long as the number of dumps being produced is low).
The way I see it: Being able to reproduce a problem is nice because it gives you an environment where you can debug, experiement and play around in more freely, but - reproducing a bug is by no means essential to debug it! If the bug is only happening on someone else system then you still need to diagnose and debug the problem in the same way, its just that this time you need to be cleverer about how you do it.
The accepted answer is the best general approach. At a high level, it's worth weighing the importance of fixing the bug against what you could add as a feature or enhance that would benefit the user. Could a 'non-reproducible' bug take two days to fix? Could a feature be added in that time that gives users more benefit than that bug fix? Maybe the users would prefer the feature. I've been fixated at times as a developer on imperfections I can see, and then users are asked for feedback and none of them actually mention the bug(s) that I can see, but the software is missing a feature that they really want!
Sometimes, simple persistence in attempting to reproduce the bug whilst debugging can be the most effective approach. For this strategy to work, the bug needs to be 'intermittent' rather than completely 'non-reproducible'. If you can repeat a bug even one time in 10, and you have ideas about the most likely place it's occurring, you can place breakpoints at those points then doggedly attempt to repeat the bug and see exactly what's going on. I've experienced this to be more effective than logging in one or two cases (although logging would be my first go-to in general).

What are some advanced and modern resources on exploit writing?

I've read and finished both Reversing: Secrets of Reverse Engineering and Hacking: The Art of Exploitation. They both were illuminating in their own way but I still feel like a lot of the techniques and information presented within them is outdated to some degree.
When the infamous Phrack Article, Smashing the Stack for Fun and Profit, was written 1996 it was just before what I sort of consider the Computer Security "golden age".
Writing exploits in the years that followed was relatively easy. Some basic knowledge in C and Assembly was all that was required to perform buffer overflows and execute some arbitrary shell code on a victims machine.
To put it lightly, things have gotten a lot more complicated. Now security engineers have to contend with things like Address Space Layout Randomization (ASLR), Data Execution Prevention (DEP), Stack Cookies, Heap Cookies, and much more. The complexity of writing exploits went up at least an order of magnitude.
You can't event run most of the buffer overrun exploits in the tutorials you'll find today without compiling with a bunch of flags to turn off modern protections.
Now if you want to write an exploit you have to devise ways to turn off DEP, spray the heap with your shell-code hundreds of times and attempt to guess a random memory location near your shellcode. Not to mention the pervasiveness of managed languages in use today that are much more secure when it comes to these vulnerabilities.
I'm looking to extend my security knowledge beyond writing toy-exploits for a decade old system. I'm having trouble locating resources that help address the issues of writing exploits in the face of all the protections I outlined above.
What are the more advanced and prevalent papers, books or other resources devoted to contending with the challenges of writing exploits for modern systems?
You mentioned 'Smashing the stack'. Research-wise this article was out-dated before it was even published. The late 80s Morris worm used it (to exploit fingerd IIRC). At the time it caused a huge stir because back then every server was written in optimistic C.
It took a few (10 or so) years, but gradually everyone became more conscious of security concerns related to public-facing servers.
The servers written in C were subjected to lots of security analysis and at the same time server-side processing branched out into other languages and runtimes.
Today things look a bit different. Servers are not considered a big target. These days it's clients that are the big fish. Hijack a client and the server will allow you to operate under that client's credentials.
The landscape has changed.
Personally I'm a sporadic fan of playing assembly games. I have no practical use for them, but if you want to get in on this I'd recommend checking out the Metasploit source and reading their mailing lists. They do a lot of crazy stuff and it's all out there in the open.
I'm impressed, you are a leet hacker Like me. You need to move to web applications. The majority of CVE numbers issued in the past few years have been in web applications.
Read these two papers:
http://www.securereality.com.au/studyinscarlet.txt
http://www.ngssoftware.com/papers/HackproofingMySQL.pdf
Get a LAMP stack and install these three applications:
http://sourceforge.net/projects/dvwa/ (php)
http://sourceforge.net/projects/gsblogger/ (php)
http://www.owasp.org/index.php/Category:OWASP_WebGoat_Project (j2ee)
You should download w3af and master it. Write plugins for it. w3af is an awesome attack platform, but it is buggy and has problems with DVWA, it will rip up greyscale. Acunetix is a good commercial scanner, but it is expensive.
I highly recommend "The Shellcoder's Handbook". It's easily the best reference I've ever read when it comes to writing exploits.
If you're interested writing exploits, you're likely going to have to learn how to reverse engineer. For 99% of the world, this means IDA Pro. In my experience, there's no better IDA Pro book than Chris Eagle's "The IDA Pro Book". He details pretty much everything you'll ever need to do in IDA Pro.
There's a pretty great reverse engineering community at OpenRCE.org. Tons of papers and various helpful apps are available there. I learned about this website at an excellent bi-annual reverse engineering conference called RECon. The next event will be in 2010.
Most research these days will be "low-hanging fruit". The majority of talks at recent security conferences I've been to have been about vulnerabilities on mobile platforms (iPhone, Android, etc) where there are few to none of the protections available on modern OSes.
In general, there won't be a single reference out there that will explain how to write a modern exploit, because there's a whole host of protections built into OSes. For example, say you've found a heap vulnerability, but that pesky new Safe Unlinking feature in Windows is keeping you from gaining execution. You'd have to know that two geniuses researched this feature and found a flaw.
Good luck in your studies. Exploit writing is extremely frustrating, and EXTREMELY rewarding!
Bah! The spam thingy is keeping me from posting all of my links. Sorry!
DEP (Data Execution Prevention), NX (No-Execute) and other security enhancements that specifically disallow execution are easily by-passed by using another exploit techniques such as Ret2Lib or Ret2Esp. When an application is compiled it usually is done so with other libraries (Linux) or DLLs (Windows). These Ret2* techniques simply call an existing function() that resides in memory.
For example, in a normal exploit you may overflow the stack and then take control of the return address (EIP) with the address of a NOP Sled, your Shellcode or an Environmental Variable that contains your shellcode. When attempting this exploit on a system that does not allow the stack to be executable your code will not run. Instead, when you overflow the return address (EIP) you can point it to an existing function within memory such as system() or execv(). You pre populate the required registers with the parameters this function expects and now you can call /bin/sh without having to execute anything from the stack.
For more information look here:
http://web.textfiles.com/hacking/smackthestack.txt

Resources