Finding Vulnerabilities in Software - security

I'm insterested to know the techniques that where used to discover vulnerabilities. I know the theory about buffer overflows, format string exploits, ecc, I also wrote some of them. But I still don't realize how to find a vulnerability in an efficient way.
I don't looking for a magic wand, I'm only looking for the most common techniques about it, I think that looking the whole source is an epic work for some project admitting that you have access to the source. Trying to fuzz on the input manually isn't so comfortable too. So I'm wondering about some tool that helps.
E.g.
I'm not realizing how the dev team can find vulnerabilities to jailbreak iPhones so fast.
They don't have source code, they can't execute programs and since there is a small number of default
programs, I don't expect a large numbers of security holes. So how to find this kind of vulnerability
so quickly?
Thank you in advance.

On the lower layers, manually examining memory can be very revealing. You can certainly view memory with a tool like Visual Studio, and I would imagine that someone has even written a tool to crudely reconstruct an application based on the instructions it executes and the data structures it places into memory.
On the web, I have found many sequence-related exploits by simply reversing the order in which an operation occurs (for example, an online transaction). Because the server is stateful but the client is stateless, you can rapidly exploit a poorly-designed process by emulating a different sequence.
As to the speed of discovery: I think quantity often trumps brilliance...put a piece of software, even a good one, in the hands of a million bored/curious/motivated people, and vulnerabilities are bound to be discovered. There is a tremendous rush to get products out the door.

There is no efficient way to do this, as firms spend a good deal of money to produce and maintain secure software. Ideally, their work in securing software does not start with a looking for vulnerabilities in the finished product; so many vulns have already been eradicated when the software is out.
Back to your question: it will depend on what you have (working binaries, complete/partial source code, etc). On the other hand, it is not finding ANY vulnerability but those that count (e.g., those that the client of the audit, or the software owner). Right?
This will help you understand the inputs and functions you need to worry about. Once you localized these, you may already have a feeling of the software's quality: if it isn't very good, then probably fuzzing will find you some bugs. Else, you need to start understanding these functions and how the input is used within the code to understand whether the code can be subverted in any way.
Some experience will help you weight how much effort to put at each task and when to push further. For example, if you see some bad practices being used, then delve deeper. If you see crypto being implemented from scratch, delve deeper. Etc

Aside from buffer overflow and format string exploits, you may want to read a bit on code injection. (a lot of what you'll come across will be web/DB related, but dig deeper) AFAIK this was a huge force in jailbreaking the iThingies. Saurik's mobile substrate allow(s) (-ed?) you to load 3rd party .dylibs, and call any code contained in those.

Related

How to Protect an Exe File from Decompilation

What are the methods for protecting an Exe file from Reverse Engineering.Many Packers are available to pack an exe file.Such an approach is mentioned in http://c-madeeasy.blogspot.com/2011/07/protecting-your-c-programexe-files-from.html
Is this method efficient?
The only good way to prevent a program from being reverse-engineered ("understood") is to revise its structure to essentially force the opponent into understanding Turing Machines. Essentially what you do is:
take some problem which generally proven to be computationally difficult
synthesize a version of that whose outcome you know; this is generally pretty easy compared to solving a version
make the correct program execution dependent on the correct answer
make the program compute nonsense if the answer is not correct
Now an opponent staring at your code has to figure what the "correct" computation is, by solving algorithmically hard problems. There's tons of NP-hard problems that nobody has solved efficiently in the literature in 40 years; its a pretty good bet if your program depends on one of these, that J. Random Reverse-Engineer won't suddenly be able to solve them.
One generally does this by transforming the original program to obscure its control flow, and/or its dataflow. Some techniques scramble the control flow by converting some control flow into essentially data flow ("jump indirect through this pointer array"), and then implementing data flow algorithms that require precise points-to analysis, which is both provably hard and has proven difficult in practice.
Here's a paper that describes a variety of techniques rather shallowly but its an easy read:
http://www.cs.sjsu.edu/faculty/stamp/students/kundu_deepti.pdf
Here's another that focuses on how to ensure that the obfuscating transformations lead to results that are gauranteed to be computationally hard:
http://www.springerlink.com/content/41135jkqxv9l3xme/
Here's one that surveys a wide variety of control flow transformation methods,
including those that provide levels of gaurantees about security:
http://www.springerlink.com/content/g157gxr14m149l13/
This paper obfuscates control flows in binary programs with low overhead:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.3773&rank=2
Now, one could go through a lot of trouble to prevent a program from being decompiled. But if the decompiled one was impossible to understand, you simply might not bother; that's the approach I'd take.
If you insist on preventing decompilation, you can attack that by considering what decompilation is intended to accomplish. Decompilation essentially proposes that you can convert each byte of the target program into some piece of code. One way to make that fail, is to ensure that the application can apparently use each byte
as both computer instructions, and as data, even if if does not actually do so, and that the decision to do so is obfuscated by the above kinds of methods. One variation on this is to have lots of conditional branches in the code that are in fact unconditional (using control flow obfuscation methods); the other side of the branch falls into nonsense code that looks valid but branches to crazy places in the existing code. Another variant on this idea is to implement your program as an obfuscated interpreter, and implement the actual functionality as a set of interpreted data.
A fun way to make this fail is to generate code at run time and execute it on the fly; most conventional languages such as C have pretty much no way to represent this.
A program built like this would be difficult to decompile, let alone understand after the fact.
Tools that are claimed to a good job at protecting binary code are listed at:
https://security.stackexchange.com/questions/1069/any-comprehensive-solutions-for-binary-code-protection-and-anti-reverse-engineeri
Packing, compressing and any other methods of binary protection will only every serve to hinder or slow reversal of your code, they have never been and never will be 100% secure solutions (though the marketing of some would have you believe that). You basically need to evaluate what sort of level of hacker you are up against, if they are script kids, then any packer that require real effort and skill (ie:those that lack unpacking scripts/programs/tutorials) will deter them. If your facing people with skills and resources, then you can forget about keeping your code safe (as many of the comments say: if the OS can read it to execute it, so can you, it'll just take a while longer). If your concern is not so much your IP but rather the security of something your program does, then you might be better served in redesigning in a manner where it cannot be attack even with the original source (chrome takes this approach).
Decompilation is always possible. The statement
This threat can be eliminated to extend by packing/compressing the
executable(.exe).
on your linked site is a plain lie.
Currently many solutions can be used to protect your application from being anti-compiled. Such as compressing, Obfuscation, Code snippet, etc.
You can looking for a company to help you achieve this.
Such as Nelpeiron, the website is:https://www.nalpeiron.com/
Which can cover many platforms, Windows, Linux, ARM-Linux, Android.
What is more Virbox is also can be taken into consideration:
The website is: https://lm-global.virbox.com/index.html
I recommend is because they have more options to protect your source code, such as import table protection, memory check.

Using theorem provers to find attacks

I've heard a bit about using automated theorem provers in attempts to show that security vulnerabilities don't exist in a software system. In general this is fiendishly hard to do.
My question is has anyone done work on using similar tools to find vulnerabilities in existing or proposed systems?
Eidt: I'm NOT asking about proving that a software system is secure. I'm asking about finding (ideally previously unknown) vulnerabilities (or even classes of them). I'm thinking like (but an not) a black hat here: describe the formal semantics of the system, describe what I want to attack and then let the computer figure out what chain of actions I need to use to take over your system.
Yes, a lot of work has been done in this area. Satisfiability (SAT and SMT) solvers are regularly used to find security vulnerabilities.
For example, in Microsoft, a tool called SAGE is used to eradicate buffer overruns bugs from windows.
SAGE uses the Z3 theorem prover as its satisfiability checker.
If you search the internet using keywords such as “smart fuzzing” or “white-box fuzzing”, you will find several other projects using satisfiability checkers for finding security vulnerabilities.
The high-level idea is the following: collect execution paths in your program (that you didn't manage to exercise, that is, you didn't find an input that made the program execute it), convert these paths into mathematical formulas, and feed these formulas to a satisfiability solver.
The idea is to create a formula that is satisfiable/feasible only if there is an input that will make the program execute the given path.
If the produced formula is satisfiable (i.e., feasible), then the satisfiability solver will produce an assignment and the desired input values. White-box fuzzers use different strategies for selecting execution paths.
The main goal is to find an input that will make the program execute a path that leads to a crash.
So, at least in some meaningful sense, the opposite of proving something is secure is finding code paths for which it isn't.
Try Byron Cook's TERMINATOR project.
And at least two videos on Channel9. Here's one of them
His research is likely to be a good starting point for you to learn about this extremely interesting area of research.
Projects such as Spec# and Typed-Assembly-Language are related too. In their quest to move the possibility of safety checks from runtime back to compile-time, they allow the compiler to detect many bad code paths as compilation errors. Strictly, they don't help your stated intent, but the theory they exploit might be useful to you.
I'm currently writing a PDF parser in Coq together with someone else. While the goal in this case is to produce a secure piece of code, doing something like this can definitely help with finding fatal logic bugs.
Once you've familiarized yourself with the tool, most proof become easy. The harder proofs yield interesting test cases, that can sometimes trigger bugs in real, existing programs. (And for finding bugs, you can simply assume theorems as axioms once you're sure that there's no bug to find, no serious proving necessary.)
About a moth ago, we hit a problem parsing PDFs with multiple / older XREF tables. We could not prove that the parsing terminates. Thinking about this, I constructed a PDF with looping /Prev Pointers in the Trailer (who'd think of that? :-P), which naturally made some viewers loop forever. (Most notably, pretty much any poppler-based viewer on Ubuntu. Made me laugh and curse Gnome/evince-thumbnailer for eating all my CPU. I think they fixed it now, tho.)
Using Coq to find lower-level bugs will be difficult. In order to prove anything, you need a model of the program's behavior. For stack / heap problems, you'll probably have to model the CPU-level or at least C-level execution. While technically possible, I'd say this is not worth the effort.
Using SPLint for C or writing a custom checker in your language of choice should be more efficient.
STACK and KINT used constraint solvers to find vulnerabilities in many OSS projects, like the linux kernel and ffmpeg. The project pages point to papers and code.
It's not really related to theorem-proving, but fuzz testing is a common technique for finding vulnerabilities in an automated way.
There is the L4 verified kernel which is trying to do just that. However, if you look at the history of exploitation, completely new attack patterns are found and then a lot of software written up to that point is very vulnerable to attacks. For instance, format string vulnerabilities weren't discovered until 1999. About a month ago H.D. Moore released DLL Hijacking and literally everything under windows is vulnerable.
I don't think its possible to prove that a piece of software is secure against an unknown attack. At least not until a theorem is able to discover such an attack, and as far as I know this hasn't happened.
Disclaimer: I have little to no experience with automated theorem provers
A few observations
Things like cryptography are rarely ever "proven", just believed to be secure. If your program uses anything like that, it will only be as strong as the crypto.
Theorem provers can't analyze everything (or they would be able to solve the halting problem)
You would have to define very clearly what insecure means for the prover. This in itself is a huge challenge
Yes. Many theorem proving projects show the quality of their software by demonstrating holes or defects in software. To make it security related, just imagine finding a hole in a security protocol. Carlos Olarte's Ph.D. thesis under Ugo Montanari has one such example.
It is in the application. Not really the theorem prover itself that has anything to do with security or special knowledge thereof.

Successful Non-programmer, 5GL, Visual, 0 Source Code or Similar Tools?

Can anyone give me an example of successful non-programmer, 5GL (not that I am sure what they are!), visual, 0 source code or similar tools that business users or analysts can use to create applications?
I don’t believe there are and I would like to be proven wrong.
At the company that I work at, we have developed in-house MVC that we use to develop web applications. It is basically a reduced state-machine written in XML (à la Spring WebFlow) for controller and a simple template based engine for presentation. Some of the benefits:
dynamic nature: no need to recompile to see the changes
reduced “semantic load”: basically, actions in controller know only “If”. Therefore, it is easy to train someone to develop apps.
The current trend in the company (or at least at management level) is to try to produce tools for the platform that require 0 source code, are visual etc. It has a good effect on clients (or at least at management level) since:
they can be convinced that this way they will need no programmers or at least will be able to hire end-of-the-lather programmers that cost much less than typical programmers.
It appears that there is a reduced risk involved, since the tool limits the implementer or user (just don’t use the word programmer!) in what he can do, so there is a less chance that he can introduce error
It appears to simplify the whole problem since there seems to be no programming involved (notoriously complex). Since applications load dynamically, there is less complexity then typically associated with J2EE lifecycle: compile, package, deploy etc.
I am personally skeptic that something like this can be achieved. Solution we have today has a number of problems:
Implementers write JavaScript code to enrich pages (could be solved by developing widgets). Albeit client-side, still a code that can become very complex and result in some difficult bugs.
There is already a visual tool, but implementers prefer editing XML since it is quicker and easier. For comparison, I guess not many use Eclipse Spring WebFlow plug-in to edit flow XML.
There is a very poor reuse in the solution (based on copy-paste of XML). This hampers productivity and some other aspects, like fostering business knowledge.
There have been numerous performance and other issues based on incorrect use of the tools. No matter how reduced the playfield, there is always space for error.
While the platform is probably more productive than Struts, I doubt it is more productive than today’s RAD web frameworks like RoR or Grails.
Verbosity
Historically, there have been numerous failures in this direction. The idea of programs written by non-programmers is old but AFAIK never successful. At certain level, anything but the power of source code becomes irreplaceable.
Today, there is a lot of talk about DSLs, but not as something that non-programmers should write, more like something they could read.
It seems to me that the direction company is taking in this respect is a dead-end. What do you think?
EDIT: It is worth noting (and that's where some of insipiration is coming from) that many big players are experimenting in that direction. See Microsoft Popfly, Google Sites, iRise, many Mashup solutions etc.
Yes, it's a dead end. The problem is simple: no matter how simple you make the expression of a solution, you still have to analyze and understand the problem to be solved. That's about 80-90% of how (most good) programmers spend their time, and it's the part that takes the real skill and thinking. Yes, once you've decided what to do, there's some skill involved in figuring out how to do that (in a programming language of your choice). In most cases, that's a small part of the problem, and the least open to things like schedule slippage, cost overrun or outright failure.
Most serious problems in software projects occur at a much earlier stage, in the part where you're simply trying to figure out what the system should do, what users must/should/may do which things, what problems the system will (and won't) attempt to solve, and so on. Those are the hard problems, and changing the environment to expressing the solution in some way other that source code will do precisely nothing to help any of those difficult problems.
For a more complete treatise on the subject, you might want to read No Silver Bullet - Essence and Accident in Software Engineering, by Frederick Brooks (Included in the 20th Anniversary Edition of The Mythical Man-Month). The entire paper is about essentially this question: how much of the effort involved in software engineering is essential, and how much is an accidental result of the tools, environments, programming languages, etc., that we use. His conclusion was that no technology was available that gave any reasonable hope of improving productivity by as much as one order of magnitude.
Not to question the decision to use 5GLs, etc, but programming is hard.
John Skeet - Programming is Hard
Coding Horror - Programming is Hard
5GLs have been considered a dead-end for a while now.
I'm thinking of the family of products that include Ms Access, Excel, Clarion for DOS, etc. Where you can make applications with 0 source code and no programmers. Not that they are capable of AI quality operations, but they can make very usable applications.
There will always be "real" languages to do the work, but we can drag and drop the workflow.
I'm using Apple's Automator which allows users to chain together "Actions" exposed by the various applications on their systems.
Actions have inputs and/or outputs, some have UI elements and basic logic can be applied to the chain.
The key difference between automator
and other visual environments is that
the actions use existing application
code and don't require any special
installation.
More Info > http://www.macosxautomation.com/automator/
I've used it to "automate" many batch processes and had really great results (surprises me every time). I've got it running builds and backups and whenever i need to process a mess of text files it comes through.
I would love to know whether iHook or Platypus (osx wrapper builders for shell scripts) could let me develop plugins in python ....
There is definitely room for more applications like this and for more support from OSX application developers but the idea is sound.
Until there's major support there aren't many "actions" available, but a quick check on my system just showed me an extra 30 that i didn't know i had.
PS. There was another app for OS-preX called "Filter Tops" which had a much more limited set of plugins.
How about Dabble DB?
Of course, just like MS Access and other non-programmer programming platforms, it has some necessary limitations in order that the user won't get him or herself stuck... as John pointed out programming is hard. But it does give the user a lot of power, and it seems that most applications that non-programmers want to build are database-type applications anyway.

How to protect your software code? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
How do you protect your software from illegal distribution?
Best practice to prevent software copy
Hypothetical situation:
Lets say I have built a software product from the scratch and it does wonderful things. The only problem is that, once someone takes a look at the code, they will understand it very easily and they can easily build it up themselves.
Now, the thing is that I built the code from the scratch 100% and uses a mixture of API calls.
Nobody else is involved in the development of the code.
If I want to sell this product, what is the guarantee that someone much smarter than me will reverse engineer the whole thing and come up with better product?
Right now I am thinking of fragmenting the whole code. Adding lots of redundant code and tonnes of comments.
Is there any software which encrypts the software code, that will make debugging, troubleshooting, and understanding how the code works virtually impossible? and yet runs as usual? so that the developer can have peace of mind?
Very few things in a program are truly novel. Almost everything that you are likely to put into your code, someone else could invent on their own. Generally more easily than they could learn it by reading your code. Reading code is harder than writing it, and most programmers don't really like doing it anyway.
So it's much more likely that they will look at your app and think "I could do that", then "That's cool, I'm gonna read that code and then copy it!". Even if they understand it, you will still own the copyright, you still get to market first.
I recommend that you just forget about it.
once someone takes a look at the
code, they will understand it very
easily and they can easily build it up
themselves.
So don't give anybody the source code.
If I want to sell this product, what
is the guarantee that someone much
smarter than me will reverse engineer
the whole thing and come up with
better product?
(a) So start selling it now and capture the market. Reverse engineering takes time, during which you are capturing market and 'mind-share'. (b) Put a provision in your licence agreement that prohibits reverse-engineering. (c) Make sure everybody who gets the product signs the agreement.
Right now I am thinking of fragmenting
the whole code. Adding lots of
redundant code and tonnes of comments.
That only has a point if you're going to distribute the source code. In which case nobody even needs to reverse-engineer. They have your source code. Don't give it to them.
Is there any software ...
There's lots of software that purports to do this job. However it is a technical solution to a business problem. All software can be reverse-engineered, because at some point or other it all has to be decrypted and de-obfuscated to the point where the CPU will understand it. At that point it is essentially plaintext. So no technical solution is formally speaking possible (short of something like code that executes in a tamper-proof HSM).
I will add that there is another business mechanism you can use to defend against business loss, which is what this is all about: price. Make the price so high that the licensees will value their copy and not permit it to be inspected, or make it so low that reverse-engineering is cost-infeasible; or make it free and make your money on the support contract.
Once you actually have the knowledge and experience to write such a codebase, it will be clear to you that obfuscation is meant to deter casual IP infringement.
Someone who wants to know your code is going to know your code.
If it becomes an issue of monetary loss, the courts are your protection.
That's how it works.
Someone will always be able to understand and work out your code. Heck, if you had 0 way getting to the code, even just using the system is enough for someone to be able to replicate the process.
Example: I take a jug of water and pour it into the cup, while my back is facing to another person. This other person knows that water and gravity are awesome at making things fall into other containers, so they can then work out a process of lifting a jug to let gravity (API call) work in their favour. They mightn't know exact what angle you used in your forearm and any super-sneaky cup-holding techniques you used, but they can replicate the same process and improve on it over time.
tl;dr: You can't protect code.
The thing to do is invent even more wonderful things while the competition is reverse-engineering your current stuff. It's called competing through innovation.
I am not a lawyer
if you are really worried about it, to the point you are willing to invest money in it, dont protect your code (beyond something reasonable like obfuscation or encryption) but rather patent your idea and your art. Then if someone does take it, reverse engineer it and make a better process based of yours, you have legal grounds to get your money.
There are tons of things you will have to do, include proving they took your idea (which isnt easy), but if this is the solution to world hunger and all of humanities problems its the thing to do.
Now for the downside, I will guess, and probably be 90% right that your method is:
Not patentable, for various reasons (I was amazed at the number of already patented ideas, and how difficult it was to identify original art)
Not new, or unique (i.e. there is already established art for it)
Not worth patenting because the expense far outways the benefits
An IP lawyer can tell you for sure, and the expense of a consult is not that much. Overall it will be cheaper to consult with them then to invest a lot of time in hiding code.
Good luck.
Don't even bother. If your code really "does wonderful things" be assured that it'll get hacked. And be it just for curiosity.
There is no 100% way to protect your code from reverse engineering. What language are we talking about? If this is C/C++ then it is pretty hard to reverse engineer, more you could strip it from debugging information etc. But if this is for example Java then even if you obfuscate the code, there are some pretty cool tools (like JAD) that will reveal much of your work anyway.
Despite all of this I think you should try to change your attitude. Big companies pay a lot of money for simple solutions and it seems that nowadays service is the most important thing, not the software (hence the success of open-software based companies). So, if you have a great software don't be scared that someone might steal it, rather think how to sell it good.
Is there any software which encrypts the software code, that will make debugging, troubleshooting, and understanding how the code works virtually impossible? and yet runs as usual? so that the developer can have peace of mind?
This is the totally wrong mindset IMO. What happens if you get hit by a bus? Your company goes bankrupt? All your data gets destroyed in a fire? For every single one of your customers, the value of their investment in your software will drop, and eventually reach zero, because the software can't be developed, or troubleshot, any further without you. I have seen so much money wasted that way, I think it's a horrible business model.
I earn my bread with making software myself so I know the hardships of making a living with it. Still, obfuscation can't be the way to go nowadays. Impose strict license agreements on your customers, scare the hell out of them so they don't even think about redistributing the software, but leave it open.
This is futile. There is always someone smarter than you and therefore they will be able to reverse engineer your obfuscation.
Usually someone smart enough to hack your code and use it in a meaningful way is smart enough to do it on their own, and probably thinks they can do it better than you did, so they won't bother stealing your stuff.
Don't worry about the people who can hack your code but not make meaningful use of it. If you've done a good job, this can only reinforce the quality of the job you've done (think of all the crappy touchscreen phone imitators).
They are going to reverse-engineer your code. Nothing can stop them.. The only thing you can do is make it harder. This ranges from obfuscating code that is inheritely "open" such as PHP and Javascript, all the way down to littering your code with a crap load of self-modification.
In a lot of ways, I think, the thing that makes a piece of software valuable, is not the crazy technological advancement that it provides, but rather the things that we think might think of as being tertiary to the piece of software itself. Like the fact that you'll be there to support it. Or that it's provided as a web service and you'll be there to make sure the server is running. Or that it's a community, and you'll be there to moderate and build the community.
While you may be actually selling code, the value you that your code has isn't intrinsic to the code itself, but rather derives from the features and ecosystem that surrounds your code.

Have we given up on the idea of code reuse? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
A couple of years ago the media was rife with all sorts of articles on
how the idea of code reuse was a simple way to improve productivity
and code quality.
From the blogs and sites I check on a regular basis it seems as though
the idea of "code reuse" has gone out of fashion. Perhaps the 'code
reuse' advocates have all joined the SOA crowd instead? :-)
Interestingly enough, when you search for 'code reuse' in Google the
second result is titled:
"Internal Code Reuse Considered Dangerous"!
To me the idea of code reuse is just common sense, after all look at
the success of the apache commons project!
What I want to know is:
Do you or your company try and reuse code?
If so how and at what level, i.e. low level api, components or
shared business logic? How do you or your company reuse code?
Does it work?
Discuss?
I am fully aware that there are many open source libs available and that anyone who has used .NET or the Java has reused code in some form. That is common sense!
I was referring more to code reuse within an organizations rather than across a community via a shared lib etc.
I originally asked;
Do you or your company try and reuse code?
If so how and at what level, i.e. low level api, components or shared business logic? How do you or your company reuse code?
From where I sit I see very few example of companies trying to reuse code internally?
If you have a piece of code which could potentially be shared across a medium size organization how would you go about informing other members of the company that this lib/api/etc existed and could be of benefit?
The title of the article you are referring to is misleading, and is actually a very good read. Code reuse is very beneficial, but there are downsides with everything. Basically, if I remember correctly, the gist of the article is that you are sealing the code in a black box and not revisiting it, so as the original developers leave you lose the knowledge. While I see the point, I don't necessarily agree with it - at least not to a "sky is falling" regard.
We actually group code reuse into more than just reusable classes, we look at the entire enterprise. Things that are more like framework enhancement or address cross-cutting concerns are put into a development framework that all of our applications use (think things like pre- and post-validation, logging, etc.). We also have business logic that is applicable to more than one application, so those sort of things get moved to a BAL core that is accessible anywhere.
I think that the important thing is not to promote things for reuse if they are not going to really be reused. They should be well documented, so that new developers can have a resource to help them come up to speed, as well. Chances are, if the knowledge isn't shared, the code will eventually be reinvented somewhere else and will lead to duplication if you are not rigorous in documentation and knowledge sharing.
We reuse code - in fact, our developers specifically write code that can be reused in other projects. This has paid off quite nicely - we're able to start new projects quickly, and we iteratively harden our core libraries.
But one can't just write code and expect it to be re-used; code reuse requires communication among team members and other users so people know what code is available, and how to use it.
The following things are needed for code reuse to work effectively:
The code or library itself
Demand for the code across multiple projects or efforts
Communication of the code's features/capabilities
Instructions on how to use the code
A commitment to maintaining and improving the code over time
Code reuse is essential. I find that it also forces me to generalize as much as possible, also making code more adaptable to varying situations. Ideally, almost every lower level library you write should be able to adapt to a new set of requirements for a different application.
I think code reuse is being done through open source projects for the most part. Anything that can be reused or extended is being done via libraries. Java has an amazing number of open source libraries available for doing a large number of things. Compare that to C++, and how early on everything would have to be implemented from scratch using MFC or the Win32 API.
We reuse code.
On a small scale we try to avoid code duplication as much as posible. And we have a complete library with a lot of frequently used code.
Normally code is developed for one application. And if it is generic enough, it is promoted to the library. This works excelent.
The idea of code reuse is no longer a novel idea...hence the apparent lack of interest. But it is still very much a good idea. The entire .NET framework and the Java API are good examples of code reuse in action.
We have grown accustomed to developing OO libraries of code for our projects and reusing them in other projects. Its a part of the natural life cycle of an idea. It is hotly debated for a while and then everyone accepts and there is no reason for further discussion.
Of course we reuse code.
There are a near infinite amount of packages, libraries and shared objects available for all languages, with whole communities of developers behing them supporting and updating.
I think the lack of "media attention" is due to the fact that everyone is doing it, so it's no longer worth writing about. I don't hear as many people raising awareness of Object-Oriented Programming and Unit Testing as I used to either. Everyone is already aware of these concepts (whether they use them or not).
Level of media attention to an issue has little to do with its importance, whether we're talking software development or politics! It's important to avoid wasting development effort by reinventing (or re-maintaining!) the wheel, but this is so well-known by now that an editor probably isn't going to get excited by another article on the subject.
Rather than looking at the number of current articles and blog posts as a measure of importance (or urgency) look at the concepts and buzz-phrases that have become classics or entered the jargon (another form of reuse!) For example, Google for uses of the DRY acronym for good discussion on the many forms of redundancy that can be eliminated in software and development processes.
There's also a role for mature judgment regarding costs of reuse vs. where the benefits are achieved. Some writers advocate waiting to worry about reuse until a second or third use actually emerges, rather than spending effort to generalize bit of code the first time it is written.
My personal view, based on the practise in my company:
Do you or your company try and reuse code?
Obviously, if we have another piece of code that already fits our needs we will reuse it. We don't go out of our way to use square pegs in round holes though.
If so how and at what level, i.e. low level api, components or shared business logic? How do you or your company reuse code?
At every level. It is written into our coding standards that developers should always assume their code will be reused - even if in reality that is highly unlikely. See below
If your OO model is good, your API probably reflects your business domain, so reusable classes probably equates to reusable business logic without additional effort.
For actual reuse, one key point is knowing what code is already available. We resolve this by having everything documented in a central location. We just need a little discipline to ensure that the documentation is up-to-date and searchable in a meaningful way.
Does it work?
Yes, but not because of the potential or actual reuse! In reality, beyond a few core libraries and UI components, there isn't a large amount of reuse.
In my personal opinion, the real value is in making the code reusable. In doing so, aside from a hopefully cleaner API, the code will (a) be documented sufficiently for another developer to use it without trawling the source code, and (b) it will also be replaceable. These points are a great benefit to on-going software maintenance.
Do you or your company try and reuse code? If so how and at what
level, i.e. low level api, components or shared business logic? How do
you or your company reuse code?
I used to work in a codebase with uber code reuse, but it was difficult to maintain because the reused code was unstable. It was prone to design changes and deprecation in ways that cascaded to everything using it. Before that I worked in a codebase with no code reuse where the seniors actually encouraged copying and pasting as a way to reuse even application-specific code, so I got to see the two extremities and I have to say that one isn't necessarily much better than the other when taken to the extremes.
And I used to be an uber bottom-up kind of programmer. You ask me to build something specific and I end up building generalized tools. Then using those tools, I build more complex generalized tools, then start building DIP abstractions to express the design requirements for the lower-level tools, then I build even more complex tools and repeat, and at some point I start writing code that actually does what you want me to do. And as counter-productive as that sounded, I was pretty fast at it and could ship complex products in ways that really surprised people.
Problem was the maintenance over the months, years! After I built layers and layers of these generalized libraries and reused the hell out of them, each one wanted to serve a much greater purpose than what you asked me to do. Each layer wanted to solve the world's hunger needs. So each one was very ambitious: a math library that wants to be amazing and solve the world's hunger needs. Then something built on top of the math library like a geometry library that wants to be amazing and solve the world's hunger needs. You know something's wrong when you're trying to ship a product but your mind is mulling over how well your uber-generalized geometry library works for rendering and modeling when you're supposed to be working on animation because the animation code you're working on needs a few new geometry functions.
Balancing Everyone's Needs
I found in designing these uber-generalized libraries that I had to become obsessed with the needs of every single team member, and I had to learn how raytracing worked, how fluids dynamics worked, how the mesh engine worked, how inverse kinematics worked, how character animation worked, etc. etc. etc. I had to learn how to do pretty much everyone's job on the team because I was balancing all of their specific needs in the design of these uber generalized libraries I left behind while walking a tightrope balancing act of design compromises from all the code reuse (trying to make things better for Bob working on raytracing who is using one of the libraries but without hurting John too much who is working on physics who is also using it but without complicating the design of the library too much to make them both happy).
It got to a point where I was trying to parametrize bounding boxes with policy classes so that they could be stored either as center and half-size as one person wanted or min/max extents as someone else wanted, and the implementation was getting convoluted really fast trying to frantically keep up with everyone's needs.
Design By Committee
And because each layer was trying to serve such a wide range of needs (much wider than we actually needed), they found many reasons to require design changes, sometimes by committee-requested designs (which are usually kind of gross). And then those design changes would cascade upwards and affect all the higher-level code using it, and maintenance of such code started to become a real PITA.
I think you can potentially share more code in a like-minded team. Ours wasn't like-minded at all. These are not real names but I'd have Bill here who is a high-level GUI programmer and scripter who creates nice user-end designs but questionable code with lots of hacks, but it tends to be okay for that type of code. I got Bob here who is an old timer who has been programming since the punch card era who likes to write 10,000 line functions with gotos in them and still doesn't get the point of object-oriented programming. I got Joe here who is like a mathematical wizard but writes code no one else can understand and always make suggestions which are mathematically aligned but not necessarily so efficient from a computational standpoint. Then I got Mike here who is in outer space who wants us to port the software to iPhones and thinks we should all follow Apple's conventions and engineering standards.
Trying to satisfy everyone's needs here while coming up with a decent design was, probably in retrospect, impossible. And in everyone trying to share each other's code, I think we became counter-productive. Each person was competent in an area but trying to come up with designs and standards which everyone is happy with just lead to all kinds of instability and slowed everyone down.
Trade-Offs
So these days I've found the balance is to avoid code reuse for the lowest-level things. I use a top-down approach from the mid-level, perhaps (something not too far divorced from what you asked me to do), and build some independent library there which I can still do in a short amount of time, but the library doesn't intend to produce mini-libs that try to solve the world's hunger needs. Usually such libraries are a little more narrow in purpose than the lower-level ones (ex: a physics library as opposed to a generalized geometry-intersection library).
YMMV, but if there's anything I've learned over the years in the hardest ways possible, it's that there might be a balancing act and a point where we might want to deliberately avoid code reuse in a team setting at some granular level, abandoning some generality for the lowest-level code in favor of decoupling, having malleable code we can better shape to serve more specific rather than generalized needs, and so forth -- maybe even just letting everyone have a little more freedom to do things their own way. But of course all of this is with the aim of still producing a very reusable, generalized library, but the difference is that the library might not decompose into the teeniest generalized libraries, because I found that crossing a certain threshold and trying to make too many teeny, generalized libraries starts to actually become an extremely counter-productive endeavor in the long term -- not in the short term, but in the long run and broad scheme of things.
If you have a piece of code which could potentially be shared across a
medium size organization how would you go about informing other
members of the company that this lib/api/etc existed and could be of
benefit?
I actually am more reluctant these days and find it more forgivable if colleagues do some redundant work because I would want to make sure that code does something fairly useful and non-trivial and is also really well-tested and designed before I try to share it with people and accumulate a bunch of dependencies to it. The design should have very, very few reasons to require any changes from that point onwards if I share it with the rest of the team.
Otherwise it could cause more grief than it actually saves.
I used to be so intolerant of redundancy (in code or efforts) because it appeared to translate to a product that was very buggy and explosive in memory use. But I zoomed in too much on redundancy as the key problem, when really the real problem was poor quality, hastily-written code, and a lack of solid testing. Well-tested, reliable, efficient code wouldn't suffer that problem to nearly as great of a degree even if some people duplicate, say, some math functions here and there.
One of the common sense things to look at and remember that I didn't at the time is how we don't mind some redundancy when we use a very solid third party library. Chances are that you guys use a third party library or two that has some redundant work with what your team is doing. But we don't mind in those cases because the third party library is great and well-tested. I recommend applying that same mindset to your own internal code. The goal should be to create something awesome and well-tested, not to fuss over a little bit of redundancy here and there as I mistakenly did long ago.
So these days I've shifted my intolerance towards a lack of testing instead. Instead of getting upset over redundant efforts, I find it much more productive to get upset over other people's lack of unit and integration testing! :-D
While I think code reuse is valuable, I can see where this sentiment is rooted. I've worked on a lot of projects where much extra care was taken to create re-usable code that was then never reused. Of course reuse is much preferable to duplicate code, but I have seen a lot of very extenisve object models created with the goal of using the objects across the enterprise in multiple projects (kind of the way the same service in SOA can be used in different apps) but have never seen the objects actually used more than once. Maybe I just haven't been part of organizations taking good advantage of the principle of reuse.
The two software projects I've worked on have both been long term development. One is about 10 years old, the other has been around for over 30 years, rewritten in a couple versions of Fortran along the way. Both make extensive reuse of code, but both rely very little on external tools or code libraries. DRY is a big mantra on the newer project, which is in C++ and lends itself more easily to doing that in practice.
Maybe the better question is when do we NOT reuse code these days? We are either in a state on building using someone elses observed "best practices" or prediscovered "design patterns" or just actually building on legacy code, libraries, or copying.
It seems the degree to which code A is reused to make code B is often based around how much the ideas in code A taken to code B are abstracted into design patterns/idioms/books/fleeting thoughts/actual code/libraries. The hard part is in applying all those good ideas to your actual code.
Non-technical types get overzealous about the reuse thing. They don't understand why everything can't be copy-pasted. They don't understand why the greemelfarm needs a special adapter to communicate the same information that it used to to the old system to the new system, and that, unfortunately we can't change either due to a bazillion other reasons.
I think techies have been reusing from day 1 in the same way musicians have been reusing from day 1. Its an ongoing organic evolution and sythesis that will keep ongoing.
Code reuse is an extremely important issue - where code is not reused, projects take longer and are harder for new team members to get into.
However, writing reusable code takes longer.
Personally, I try to write all my code in a reusable way, this takes longer, but it results in the fact that most of my code has become official infrastructures in my organization and that new projects based on these infrastructures take significantly less time.
The danger in reusing code, is if the reused code is not written as an infrastructure - in a general and encapsulated manner with as few as possible assumptions and as much as possible documentation and unit testing, that the code can end up doing unexpected things.
Also, if bugs are found and fixed, or features added, these changes are rarely returned to the source code, resulting in different versions of the reused code, that no one knows of or understands.
The solution is:
1. To design and write the code with not only one project in mind, but to think of future requirements and try to make the design flexible enough to cover them with minimal code change.
2. To enclose the code within libraries that are to be used as-is and not modified within using projects.
3. To allow users to view and modify the code of of the library withing its solution (not within the using project's solution).
4. To design future projects to be based on the existing infrastructures, making changes to the infrastructures as necessary.
5. To charge maintaining the infrastructure to all projects, thus keeping the infrastructure funded.
Maven has solved code reuse. I'm completely serious.

Resources