Centralized vs. Distributed version control security

As my company explores moving from centralized version control tools (CVS, SVN, Perforce and a host of others) to offering teams distributed version control tools (Mercurial in our case), I've run into a problem:
The Problem
A manager has raised the concern that distributed version control may not be as secure as our CVCS options because the repo history is stored locally on the developer's machine.
It's been difficult to nail down his exact security concern, but I've gathered that it centers on the fact that a malicious employee could steal not only the latest intellectual property but our whole history of changes just by copying a single folder.
The Question(s)
Do distributed version control systems really introduce new security concerns for projects?
Is it easier to maliciously steal code?
Does the complete history represent an additional threat that the latest version of the code does not?
My Thoughts
My take is that the belief that the centralized model is more secure may be mistaken: the history only seems safer because it is off on its own box. Given that users with even read access to a centralized repo could selectively extract snapshots of the project at any key revision, I'm not sure the DVCS model makes theft all that much easier. Also, most CVCS tools allow you to extract the whole repo's history with a single command so that you can import it into other tools.
I think the other issue is just how important the history is compared to the latest version. Granted someone could have checked in a top secret file, then deleted it and the history would pretty quickly be significant. But even in that scenario a CVCS user could checkout that top secret version with a single command.
I'm sure I could be missing something or downplaying risks as I'm eager to see DVCS become a fully supported tool option. Please contribute any ideas you have on security concerns.

If you have read access to a CVCS, you have enough permissions to convert the repo to a DVCS, which people do all the time. No software tool is going to protect you from a disgruntled employee stealing your code, but a DVCS has many more options for dealing with untrusted contributors, such as a gatekeeper workflow. Hence its widespread use in open source projects.

You are right that distributed version control does not really introduce any new security concerns, since the developer already has access to the code in both cases. I can only think that since it is easier to work offline and offsite with Git, developers might be more tempted to do so than with a centralized system. I would push to force encryption on all corporate laptops with code.
Not really easier, just the same. If you enable logs, then you will have the same information when the code is accessed.
I personally do not think so. It might represent the thought process leading to certain decisions but not necessarily more.
It comes down to knowledge on how to implement security measures in both cases. If you have more experience in one system vs another then you are more likely to implement more to prevent such loss but at the end of the day, you are trusting your developers with code the minute you allow them access to it. No way around that.

DVCS provides various protections against unauthorized writing. This is why it is popular with open source teams. It has several frustrating limitations for controlling reading. Open source teams do not care about this.
The first problem is that most DVCS encourage many copies of the full source. The typical granularity is the full repo. This can include many unneeded branches and even entire other projects, besides the concern of history (along with searchable commit comments that can make the code even more useful to the attacker). CVCS encourages developers to copy as little as possible to their desktop, since the less they copy, the faster it works. The less you put on mobile devices, the easier it is to secure.
When DVCS is implemented with many devices acting as servers, it is much more difficult to implement effective network security. Attacking a local CVCS workspace requires the attacker to gain access to the filesystem. Attacking a DVCS node generally requires attacking the DVCS itself on any device hosting the information (and remember: the folks who maintain most DVCSs are open source guys; they don't care nearly as much about read controls). The more devices that host repositories, the more likely that users will set up anonymous read access (which again, DVCS encourages because of its open source roots). This greatly simplifies the job of an attacker who is doing random sweeps.
CVCS that are based on URLs (like Subversion) open the opportunity for quite fine-grained access control, such as per-branch access. DVCS tends to fight this kind of access control.
I know developers like DVCS, but there's no way it can be secured as effectively as CVCS. Most environments do a terrible job of securing their CVCS, and if that's the case then it doesn't matter which you use. But if you take access control seriously, you can have much greater control with CVCS as part of a broader least-privilege infrastructure.
Many may argue that there's no reason to protect source code. That's fine and people can argue about it. But if you are going to protect your source code, the best implementation is to not copy the source to random laptops (which are very hard to secure well), and rather have developers mount it from a central server. CVCS works well this way. DVCS makes no sense if you are going to keep it on a single server this way. If you are going to copy files to mobile devices, make sure you copy as little as possible. That's the opposite of DVCS.

There are a bunch of "security" issues; whether they are an issue depends on your setup:
There's more data floating around, which means the notional "attack surface" might be bigger (it depends on how you count).
But how much data does the "typical" developer check out? You might want to use a sparse checkout in svn, but lazy people won't bother and some GUI tools don't support it, so they'll have all your code checked out anyway. Git users might be more likely to use multiple repos. This depends on you.
Authentication/access control might be better (and it might be worse!). This is largely a function of the VCS, not whether it is "D" or "C". svn:// is plaintext.
Is deleting files a priority, and how easy is this to do? An accidental commit of a confidential file is more painful to undo in Git if it happened in the distant past (but people might be more likely to notice).
Are you really going to notice a malicious user pulling the entire history instead of merely doing a checkout? It depends on how big your repository is and what your branches are like. It's easy for a full SVN checkout to take up more space than the repository itself due to branches.
Change history is generally not something you want to give away for free (even to people with a source code license), but how valuable is it? Maybe you have top-secret design methodologies or confidential information in your commit messages, but this seems unlikely.
And finally, security economics:
How much is the extra security worth?
How much is increased productivity worth?
How much is caring about the concerns of your developers worth?
(IIRC it turns out that users should ignore security advice, because the expected cost is more than the expected benefit; this is especially true for things like certificates that expired yesterday. How much does it cost you to check the address bar every time you type in a password? How often do you catch a phishing attempt? What is the cost to you per thwarted phishing attempt? What is the cost per successful phish?)


Is it possible that the popular applications in my laptop are surveilling my files on hard drive?

What if I develop a desktop application which a million people will use, and behind the scenes, the application is surveilling users' files on their hard drives, streaming the data from time to time?
Can one be assured no such things happen with any popular software application, be it MS Office or Google Chrome?
Or is this just a stupid question?
Is it technically possible? Yes, it is.
Could it be happening in an application used by a million users for a relatively long time without being noticed? Very unlikely. Somebody would notice the strange network traffic eventually.
Also, @Mjh mentioned open source in a comment. While open source can help by allowing people to audit the source code, how many times have you checked that the binary you are using is actually compiled from the source that you were looking at? Of course, there are signatures on binary packages and all, but the signature is made by the package maintainer. There is an inherent trust not only in the developer of the application, but also in the tool chain that creates a binary package from the source code. And then we haven't talked about strange "bugs", or the fact that even in open source, some security issues are very hard to find (otherwise all open source software would be free of security bugs, which it is not).
So back to your question, sure, you could use all kinds of techniques to monitor the behavior of an application, you could monitor memory access, network traffic, whatever else. You can also analyse the code itself, look for suspicious things. It will take a huge amount of effort and still there will be no 100% guarantee, only some level of assurance.
Automated version upgrades could make detection even harder by the way. Even if you put lots of resources into analysis of one version, what if only a short-lived version had malicious code? Sure, that too can be analysed, but would anyone bother, unless there was a good reason (like indications of something malicious)?
Yet I think you can be pretty sure that major vendors don't do this. It's just not worth it for them, why would they? Their risk would be huge, with a relatively low benefit.

Website hacking - Why it is always possible to do?

We know that every executable file can be reverse engineered (disassembled, decompiled). No matter how strong the security you implement, if crackers want to crack it, they will. It is just a question of time.
What about websites? May we say that a website can be completely safe from attacks by hackers (we assume that the hosting is not vulnerable)? If not, then what is the reason?
Yes it is always possible to do. There is always a way in.
It's like my grandfather always said:
Locks are meant to keep the honest people out
May we say that website can be completely safe from attacks of hackers?
No. Even the most secure technology in the world is vulnerable to social engineering attacks, for one thing.
You can easily write a webapp that is mathematically proven to be secure... But that proof will only hold as long as the underlying operating system, interpreter|compiler, and hardware are secure, which is never the case.
The key thing to remember is that websites are usually part of a huge and complex system and it doesn't really matter if the hacker enters the system through the web application itself or some other part of the entire infrastructure. If someone can get access to your servers, routers, DNS or whatever, they can bring down even the best web application. In my experience a lot of systems are vulnerable in some way or another. So "completely secure" means either "we're trying really hard to secure the platform" or "we have no clue whatsoever, but we hope everything is okay". I have seen both.
To sum up and add to the posts that precede:
Web as a shared resource - websites are useful so long as they are accessible. Render the web site inaccessible, and you've broken it. Denial of service attacks, which amount to flooding the server so that it can no longer respond to legitimate requests, will always be a factor. It's a game of keep away - big server sites find ways to distribute, hackers find ways to deluge.
Dynamic data = dynamic risk - if the user can input data, there's a chance for a hacker to be a menace. Today the big concepts are cross-site scripting and SQL injection, but once one avenue for cracking is closed off, chances are high that another mechanism will arise. You could, conceivably, argue that a totally static site can be secure from this, but then how many useful sites fit that bill?
Complexity = the more complex, the harder to secure - given the rapid change of technology, I doubt that any web developer could say with 100% confidence that a modern website was secure - there's too much unknown code. Taking the host aside (the server, network protocols, OS, and maybe database), there's still all the great new libraries in Java EE and .Net. And even a less enterprise-y architecture will have some serious complexity that makes knowing all potential inputs and outputs of the code prohibitively difficult.
The authentication problem = by definition, the web site lets a remote user do something useful on a server that is far away. Knowing and trusting the other end of the communication is an old challenge. These days server side authentication is relatively well implemented and understood and (so far as I know!) no one's managed to hack PKI. But getting user authentication ironed out is still quite tricky. It's doable, but it's a tradeoff between difficulty for the user and for configuration, and a system with a higher risk of vulnerability. And even a strong system can be broken when users don't follow the rules or when accidents happen. All this doesn't apply if you want to make a public site for all users, but that severely limits the features you'll be able to implement.
I'd say that web sites simply change the nature of the security challenge from the challenges of client side code. The developer does not need to be as worried about code replication, but the developer does need to be aware of the risks that come from centralizing data and access to a server (or collection of servers). It's just a different sort of problem.
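To make the "dynamic data = dynamic risk" point concrete, here is a minimal sketch of SQL injection and the usual mitigation (parameterized queries), using Python's standard sqlite3 module. The users table and the function names are made up for illustration; they don't come from any real site.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database with a tiny "users" table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the value is passed separately from the SQL text,
    # so it is always treated as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With an input like `x' OR '1'='1`, the unsafe version returns every row while the parameterized version returns nothing. This only closes one avenue, of course; it says nothing about the rest of the stack.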
Websites suffer greatly from injection and cross site scripting attacks
Cross-site scripting carried out on websites were roughly 80% of all documented security vulnerabilities as of 2007
Also, part of a website (in some web sites a great deal) is sent to the client in the form of CSS, HTML and JavaScript, which is then open for inspection by anyone.
Not to nitpick, but your definition of "good hosting" does not assume the HTTP service running on the host is completely free from exploits.
Popular web servers such as IIS and Apache are often patched in order to protect against such exploits, which are often discovered the same way exploits in local executables are discovered.
For example, a malformed HTTP request could cause a buffer overrun on the server, leading to part of its data being executed.
It's not possible to make anything 100% secure.
All that can be done is to make something hard enough to break into, that the time and effort spent doing so makes it not worth doing.
Can I crack your site? Sure, I'll just hire a few suicide bombers to blow up your servers. Or... I'll blow up those power plants that power up your site, or I do some sort of social engineering, and DDOS attacks would quite likely be effective in a large scale not to mention atom bombs...
Short answer: yes.
This might be the wrong website to discuss that. However, it is widely known that security and usability are inversely related. See this post by Bruce Schneier for example (which refers to another website, but on Schneier's blog there's a lot of interesting readings on the issue).
Assuming the server itself isn't compromised, and has no other clients sharing it, static code should be fine. Things usually only start to get funky when there's some sort of scripting language involved. After all, I've never seen a compromised "It Works!" page.
Saying 'completely secure' is a bad thing as it will state two things:
there has not been a proper threat analysis, because secure enough would be the 'correct' term
since security is always a tradeoff, it means that a system that is completely secure will have abysmal usability and the site will be a huge resource hog, as security has been taken to insane levels.
So instead of trying to achieve "complete security" you should:
Do a proper threat analysis
Test your application (or have someone professional test it) against common attacks
Apply best practices, not extreme measures
The short of it is that you have to strike a balance between ease of use and security, much of the time, and decide what provides the optimal level of both for your purposes.
An excellent case in point is passwords. The easy way to go about it is to just have one, use it everywhere, and make it something easy to remember. The secure way to go about it is to have a randomly generated variable-length sequence of characters across the encoding spectrum that only the user himself knows.
Naturally, if you go too far on the easy side, the user's data is easy to pick off. If you go too far on the side of security, however, practical application could end up leading to situations that compromise the added value of the security measures (e.g. people can't remember their whole keychain of passwords and corresponding user names, and therefore write them all down somewhere). If the list is compromised, the security measures that had been put into place are for naught. Hence, most of the time a balance gets struck and places ask that you put a number in your password and tell you not to do anything stupid like tell it to other people.
Even if you remove the possibility of a malicious person with the keys to everything leaking data from the equation, human stupidity is infinite. There is no such thing as 100% security.
May we say that website can be completely safe from attacks of hackers (we assume that hosting is not vulnerable)?
Well if we're going to start putting constraints on the attacker, then of course we can design a completely secure system: we just have to bar all of the attacker's attacks from the scenario.
If we assume the attacker actually wants to get in (and isn't bound by the rules of your engagement), then the answer is simply no, you can't be completely safe from attacks.
Yes, it's possible for a website to be completely secure, for a reasonable definition of 'complete' that includes your original premise that the hosting is not vulnerable. The problem is the same as with any software that contains defects; people create software of a complexity that is slightly beyond their capability to manage and thus flaws remain undetected until it's too late.
You could start smaller and prove all your work correct and safe as you construct it, remaking any off-the-shelf components that haven't been designed to that stringent degree of quality, but unfortunately that leaves you at a massive commercial disadvantage compared to the people who can write 99% safe software in 1% of the time. Therefore there's rarely a good business reason for going down this path.
The answer to this question lies close to the ideas about computational theory that arise from considering the halting problem. http://en.wikipedia.org/wiki/Halting_problem To wit, if you could with clarity say you'd devised a way to programmatically determine if any particular program was secure, you might be close to disproving the undecidability of the halting problem on the class of machines you were working with. Since the undecidability of the halting problem has been proven, we know that over Turing machines you would be unable to prove securability, since deciding security in general would also decide the halting problem. Even for finite machines, where you might be able to decide all of the states of the program, Minsky would tell us that the time required to build a complete state tree for even simplistic modern day machines and web servers would be huge. You might come to know a lot about a specific piece of code, but as soon as you changed or updated the code, a complete retest would be required. Fundamentally this is interesting because it all boils back to the concept of information and meaning. Read about automated theorem proving to understand more about the limits of computational systems. http://en.wikipedia.org/wiki/Automated_theorem_proving
The fact is hackers are always one step ahead of developers; you can never consider a site to be bulletproof and 100% safe. You just avoid malicious stuff as much as you can!
In fact, you should follow whitelist approach rather than blacklist approach when it comes to security.
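As a rough illustration of the whitelist approach mentioned above: instead of trying to enumerate bad input, you describe what good input looks like and reject everything else. The field names, the username rule and the function names below are invented for the example.

```python
import re

# Hypothetical allowlists: anything not matched here is rejected.
ALLOWED_SORT_FIELDS = {"name", "created", "price"}
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_valid_username(value: str) -> bool:
    # fullmatch requires the whole string to fit the pattern,
    # so stray characters anywhere cause a rejection.
    return USERNAME_PATTERN.fullmatch(value) is not None


def choose_sort_field(requested: str, default: str = "name") -> str:
    # Only known-good column names are ever used; unknown values
    # fall back to a safe default instead of being "cleaned up".
    return requested if requested in ALLOWED_SORT_FIELDS else default
```

A blacklist would have to anticipate every dangerous character or keyword; the whitelist only has to describe the handful of values you actually intend to accept.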

Designing a Linux-based system for transferability of ownership/admin rights without total trust

Inspired by a much more specific question on ServerFault.
We all have to trust a huge number of people for the security and integrity of the systems we use every day. Here I'm thinking of all the authors of all the code running on your server or PC, and everyone involved in designing and building the hardware. This is mitigated by reputation and, where source is available, peer review.
Someone else you might have to trust, who is mentioned far less often, is the person who previously had root on a system. Your predecessor as system administrator at work. Or for home users, that nice Linux-savvy friend who configured your system for you. The previous owner of your phone (can you really trust the Factory Reset button?)
You have to trust them because there are so many ways to retain root despite the incoming admin's best efforts, and those are only the ones I could think of in a few minutes. Anyone who has ever had root on a system could have left all kinds of crazy backdoors, and your only real recourse under any Linux-based system I've seen is to reinstall your OS and all code that could ever run with any kind of privilege. Say, mount /home with noexec and reinstall everything else. Even that's not sufficient if any user whose data remains may ever gain privilege or influence a privileged user in sufficient detail (think shell aliases and other malicious configuration). Persistence of privilege is not a new problem.
How would you design a Linux-based system on which the highest level of privileged access can provably be revoked without a total reinstall? Alternatively, what system like that already exists? Alternatively, why is the creation of such a system logically impossible?
When I say Linux-based, I mean something that can run as much software that runs on Linux today as possible, with as few modifications to that software as possible. Physical access has traditionally meant game over because of things like keyloggers which can transmit, but suppose the hardware is sufficiently inspectable / tamper-evident to make ongoing access by that route sufficiently difficult, just because I (and the users of SO?) find the software aspects of this problem more interesting. :-) You might also assume the existence of a BIOS that can be provably reflashed known-good, or which can't be flashed at all.
I'm aware of the very basics of SELinux, and I don't think it's much help here, but I've never actually used it: feel free to explain how I'm wrong.
First and foremost, you did say design :) My answer will contain references to stuff that you can use right now, but some of it is not yet stable enough for production. My answer will also contain allusions to stuff that would need to be written.
You can not accomplish this unless you (as user9876 pointed out) fully and completely trust the individual or company that did the initial installation. If you can't trust this, your problem is infinitely recursive.
Several years ago I was very active in a new file system called ext3cow, a copy-on-write version of ext3. Snapshots were cheap and 100% immutable; the port from Linux 2.4 to 2.6 broke and abandoned the ability to modify or delete files in the past.
Pound for pound, it was as efficient as ext3. Sure, that's nothing to write home about, but ext3 was (and for the most part still is) the production standard FS.
Using that type of file system, assuming a snapshot was made of the pristine installation after all services had been installed and configured, it would be quite easy to diff an entire volume to see what changed and when.
At this point, after going through the diff, you can decide that nothing is interesting and just change the root password, or you can go inspect things that seem a little odd.
Now, for the stuff that has to be written if something interesting is found:
Something that you can pipe the diff through that investigates each file. What you're going to see is a list of revisions per file, which would then have to be recursively compared, i.e., present against former-present, former-present against past1, past1 against past2, etc., until you reach the original file or the point at which it no longer exists. Doing this by hand would seriously suck. Also, you need to identify files that were never versioned to begin with.
Something to inspect your currently running kernel. If someone has tainted VFS, none of this is going to work, since CoW file systems use temporal inodes to access files in the past. I know a lot of enterprise customers who modify the kernel quite a bit, up to and including modules, VMM and VFS. This may not be such an easy task - comparing against 'pristine' may not be tenable since the old admin may have made good modifications to the kernel since it was installed.
Databases are a special headache, since they change typically each second or more, including the user table. That's going to need to be checked manually, unless you come up with something that can check to be sure that nothing is strange, such a tool would be very specific to your setup. Classic UNIX 'root' is not your only concern here.
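For the first item on that list (walking a file's revisions pairwise: present against former-present, former-present against past1, and so on), a minimal sketch could look like the following. It assumes the revisions have already been pulled out of the snapshots as text; how you extract them from ext3cow or any other CoW file system is not shown, and the function name is made up.

```python
import difflib


def changed_steps(revisions):
    """Compare each revision with the one immediately before it.

    revisions: list of (label, text) pairs ordered newest to oldest,
    e.g. [("present", ...), ("past1", ...), ("past2", ...)].
    Returns (newer_label, older_label, diff_lines) for every adjacent
    pair whose content differs.
    """
    results = []
    for (new_label, new_text), (old_label, old_text) in zip(revisions, revisions[1:]):
        if new_text == old_text:
            continue  # nothing changed in this step, skip it
        diff = list(
            difflib.unified_diff(
                old_text.splitlines(),
                new_text.splitlines(),
                fromfile=old_label,
                tofile=new_label,
                lineterm="",
            )
        )
        results.append((new_label, old_label, diff))
    return results
```

This only narrows down where a human needs to look; deciding whether a given change is legitimate is still manual work.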
Now, consider the other computers on the network. How many of them are running an OS that is known to be easily exploited and bot infested? Even if your server is clean, what if this guy joins #foo on IRC and starts an attack on your servers via your own LAN? Most people will click links that a co-worker sends, especially if it's a juicy blog entry about the company. Social engineering is very easy if you're doing it from the inside.
In short, what you suggest is tenable; however, I'm dubious that most companies could enforce the best practices needed for it to work when needed. If the end result is that you find a BOFH in your work force and need to can him, you had better have contained him throughout his employment.
I'll update this answer more as I continue to think about it. It's a very interesting topic. What I've posted so far are my own collected thoughts on the same.
Yes, I know about virtual machines and checkpointing, a solution assuming that brings on a whole new level of recursion. Did the (now departed) admin have direct root access to the privileged domain or storage server? Probably, yes, which is why I'm not considering it for the purposes of this question.
Look at Trusted Computing. The general idea is that the BIOS loads the bootloader, then hashes it and sends that hash to a special chip. The bootloader then hashes the OS kernel, which in turn hashes all the kernel-mode drivers. You can then ask the chip whether all the hashes were as expected.
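A simplified model of that hash chain, for illustration only: each stage is hashed and folded into a running register, so the final value depends on every component and on the order they were loaded in. Real TPMs do this in hardware with platform configuration registers; the function names and the byte strings standing in for boot components below are invented.

```python
import hashlib

REGISTER_SIZE = 32  # size of a SHA-256 digest in bytes


def extend(register: bytes, component: bytes) -> bytes:
    # New register value = H(old register || H(component)).
    # The register can only be extended, never set directly.
    measurement = hashlib.sha256(component).digest()
    return hashlib.sha256(register + measurement).digest()


def measure_boot_chain(components) -> bytes:
    # Start from an all-zero register and fold in each boot stage in order.
    register = bytes(REGISTER_SIZE)
    for component in components:
        register = extend(register, component)
    return register
```

If any stage is altered (say, a rootkit-patched kernel), the final register value no longer matches the expected one, and nothing loaded later can put it back.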
Assuming you trust the person who originally installed and configured the system, this would enable you to prove that your OS hasn't had a rootkit installed by any of the later sysadmins. You could then manually run a hash over all the files on the system (since there is no rootkit the values will be accurate) and compare these against a list provided by the original installer. Any changed files will have to be checked carefully (e.g. /etc/passwd will have changed due to new users being legitimately added).
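The "hash all the files and compare against the installer's list" step could be sketched roughly as below. This assumes the trusted list is available as a simple mapping of relative path to SHA-256 digest, which is my own choice of format, and the function names are hypothetical.

```python
import hashlib
import os


def hash_file(path, chunk_size=65536):
    # Read in chunks so large files don't have to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    # Map each regular file under root (relative path) to its digest.
    manifest = {}
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(directory, filename)
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue  # skip symlinks, sockets, devices, etc.
            manifest[os.path.relpath(full_path, root)] = hash_file(full_path)
    return manifest


def compare_manifests(trusted, current):
    # Report files that appeared, disappeared, or changed content.
    added = sorted(set(current) - set(trusted))
    removed = sorted(set(trusted) - set(current))
    changed = sorted(
        path for path in set(trusted) & set(current)
        if trusted[path] != current[path]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

As the answer says, the results are only meaningful if the kernel doing the reading is itself trustworthy, and every reported change (such as /etc/passwd) still has to be reviewed by hand.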
I have no idea how you'd handle patching such a system without breaking the chain of trust.
Also, note that your old sysadmin should be assumed to know any password typed into that system by any user, and to have unencrypted copies of any private key used on that system by any user. So it's time to change all your passwords.

What are the best programmatic security controls and design patterns?

There's a lot of security advice out there to tell programmers what not to do. What in your opinion are the best practices that should be followed when coding for good security?
Please add your suggested security control / design pattern below. Suggested format is a bold headline summarising the idea, followed by a description and examples e.g.:
Deny by default
Deny everything that is not explicitly permitted...
Please vote up or comment with improvements rather than duplicating an existing answer. Please also put different patterns and controls in their own answer rather than adding an answer with your 3 or 4 preferred controls.
edit: I am making this a community wiki to encourage voting.
Principle of Least Privilege -- a process should only hold those privileges it actually needs, and should only hold those privileges for the shortest time necessary. So, for example, it's better to use sudo make install than to su to open a shell and then work as superuser.
All these ideas that people are listing (isolation, least privilege, white-listing) are tools.
But you first have to know what "security" means for your application. Often it means something like
Availability: The program will not fail to serve one client because another client submitted bad data.
Privacy: The program will not leak one user's data to another user
Isolation: The program will not interact with data the user did not intend it to.
Reviewability: The program obviously functions correctly -- a desirable property of a vote counter.
Trusted Path: The user knows which entity they are interacting with.
Once you know what security means for your application, then you can start designing around that.
One design practice that doesn't get mentioned as often as it should is Object Capabilities.
Many secure systems need to make authorizing decisions -- should this piece of code be able to access this file or open a socket to that machine.
Access Control Lists are one way to do that -- specify the files that can be accessed. Such systems though require a lot of maintenance overhead. They work for security agencies where people have clearances, and they work for databases where the company deploying the database hires a DB admin. But they work poorly for secure end-user software since the user often has neither the skills nor the inclination to keep lists up to date.
Object Capabilities solve this problem by piggy-backing access decisions on object references -- by using all the work that programmers already do in well-designed object-oriented systems to minimize the amount of authority any individual piece of code has. See CapDesk for an example of how this works in practice.
DARPA ran a secure systems design experiment called the DARPA Browser project which found that a system designed this way -- although it had the same rate of bugs as other Object Oriented systems -- had a far lower rate of exploitable vulnerabilities. Since the designers followed POLA using object capabilities, it was much harder for attackers to find a way to use a bug to compromise the system.
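A very rough illustration of the idea in Python. Python does not enforce capability discipline (any code can still call open() or poke at private attributes), so treat this purely as a sketch of the design style, not of a secure system; all names are invented.

```python
import io


def write_report_ambient(path, lines):
    # Ambient authority: this function can open ANY path the process
    # can reach, so a bug or malicious argument can overwrite anything.
    with open(path, "w") as handle:
        handle.write("\n".join(lines))


def write_report_capability(sink, lines):
    # Capability style: the caller hands over an already-opened object.
    # The function can only affect what it was explicitly given.
    sink.write("\n".join(lines))


class WriteOnly:
    """Attenuated capability: wraps a stream and exposes only write()."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        return self._stream.write(text)
```

Usage would be something like `write_report_capability(WriteOnly(buffer), ["a", "b"])` with `buffer = io.StringIO()`: the authority to write travels with the object reference, and there is no separate access list to maintain.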
White listing
Opt in what you know you accept
(Yeah, I know, it's very similar to "deny by default", but I like to use positive thinking.)
Model threats before making security design decisions -- think about what possible threats there might be, and how likely they are. For example, someone stealing your computer is more likely with a laptop than with a desktop. Then worry about these more probable threats first.
Limit the "attack surface". Expose your system to the fewest attacks possible, via firewalls, limited access, etc.
Remember physical security. If someone can take your hard drive, that may be the most effective attack of all.
(I recall an intrusion red team exercise in which we showed up with a clipboard and an official-looking form, and walked away with the entire "secure" system.)
Encryption ≠ security.
Hire security professionals
Security is a specialized skill. Don't try to do it yourself. If you can't afford to contract out your security, then at least hire a professional to test your implementation.
Reuse proven code
Use proven encryption algorithms, cryptographic random number generators, hash functions, authentication schemes, access control systems, rather than rolling your own.
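For example, password storage can be built entirely from vetted pieces in the Python standard library: `secrets` for random salts, `hashlib.pbkdf2_hmac` for key derivation and `hmac.compare_digest` for comparison. This is only a sketch; the function names and the iteration count are assumptions for illustration, and the work factor should be tuned for your own hardware and requirements.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # illustrative work factor (an assumption), tune as needed


def hash_password(password):
    # Salt comes from the operating system's cryptographic RNG.
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Nothing here is home-made: the random source, the derivation function and the comparison are all library code that has already been reviewed by many people.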
Design security in from the start
It's a lot easier to get security wrong when you're adding it to an existing system.
Isolation. Code should have strong isolation between components (e.g., separate processes) so that failures in one component can't easily compromise others.
Express risk and hazard in terms of cost. Money. It concentrates the mind wonderfully.
A good understanding of the underlying assumptions of crypto building blocks can be important. For example, stream ciphers such as RC4 are very useful but can easily be used to build an insecure system (WEP, for instance).
If you encrypt your data for security, the highest risk data in your enterprise becomes your keys. Lose the keys, and data is lost; compromise the keys and all your data is compromised.
Use risk to make security decisions. Once you determine the probability of different threats, then consider the harm that each could do. Risk is, by definition:
R = Pe × H
where Pe is the probability of the undesired event, and H is the hazard, or the amount of harm that could come from the undesired event.
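A quick worked example of the formula, sketched in Python. The threats, probabilities and harm figures are invented purely to show the arithmetic; real numbers would have to come from your own threat model.

```python
# Hypothetical threats: name -> (probability per year, harm in dollars).
threats = {
    "laptop stolen": (0.05, 20_000),
    "desktop stolen": (0.005, 20_000),
    "server room flood": (0.001, 500_000),
}


def risk(probability, harm):
    # R = Pe x H
    return probability * harm


# Sort threats from highest to lowest risk.
ranked = sorted(threats, key=lambda name: risk(*threats[name]), reverse=True)

for name in ranked:
    print(f"{name}: {risk(*threats[name]):.0f}")
```

With these made-up numbers the stolen laptop ranks first, and the flood ranks ahead of the stolen desktop: the flood is the less likely of the two, but its much larger harm outweighs that.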
Separate concerns. Architect your system and design your code so that security-critical components can be kept together.
KISS (Keep It Simple, Stupid)
If you need to make a very convoluted and difficult to follow argument as to why your system is secure, then it probably isn't secure.
Formal security designs sometimes refer to a thing called the TCB (Trusted Computing Base). But even an informal design has something like this - the security enforcing part of your code, the part you can't avoid relying on. This needs to be well encapsulated and as simple and small as possible.

What is the best way to stop an application being copied and used without the owner’s permission?

What is the best way to prevent an application from being copied and used without the owner's knowledge?
Is there any way to trace usage? That is, the application would periodically communicate back with enough information for us to know where it is and whether it's legal. The next step, of course, would be to shut it down if it's not legit.
Software that "phones home" will be quickly shunned by the vast majority of your users. Just license it appropriately and sell it.
People who use your software professionally will either pay for it or they won't use it. Corporations tend to frown on potential lawsuits.
People who want to use your software without paying for it will continue to do so despite your best efforts to counteract them. Once the software is in their hands, it is out of yours. Without pissing off your users, your only recourse is a legal one.
If your product is priced reasonably, some people will pay for it and some won't. That is just something you need to deal with upfront and it should be factored into your business plan.
Don't do this, don't attempt it, don't even think about it.
This is a battle you can't win. If people want to pirate your software they will. You'll be shamed by the fact that a smart reverse engineer can write a one-byte binary patch to subvert all your protection schemes.
The people who are going to pirate your software will do so and all these "security features" you build in will likely end up only inconveniencing your true supporters: the people who have legitimately purchased your software. These draconian DRM / anti-piracy schemes only build resentment among software users.
Hardware dongles are the best way if you are really concerned about piracy IMO. Check out the big industrial CAD/CAM packages worth thousands or tens-of-thousands, or the AV/Music production software, they virtually all have dongle protection. Dongles can be emulated or reversed but not without a significant investment in time, a lot more than just changing a few JEs to JNEs in your assembly.
Phoning home is not the way to go unless you are providing a service that requires a subscription and constant updates (like antivirus products, for example) as part of your business model. You need to have a bit of respect for your users and their privacy. You might have perfectly innocent intentions, but what if a court ordered your company to hand over that information (like the US government is doing with Google and its search terms) - would/could you fight it? What if, some time in the future, you sold your company and the new owners decided to sell all that historic information to a marketing company? Privacy is not just about trusting a company not to abuse your data; it is about trusting that company to go out of its way to protect your data, which is pretty far down the list of priorities for most companies. So basically, monitoring users is not really a good path to go down.
The best (and pretty much only) way to reliably prevent piracy is to have a client/server application instead of a standalone one, where a non-trivial part of the work is done by the server and users need to register. Then you can at least detect and block simultaneous use of the same account.
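The "detect and block simultaneous use" part could look roughly like the following server-side sketch in Python. `SessionRegistry`, `check_in` and the timeout value are hypothetical names and numbers for illustration; a real service would also need authentication, persistent storage and a way to handle clock and network issues.

```python
class SessionRegistry:
    """Tracks one active session per account (illustrative sketch only)."""

    def __init__(self, timeout_seconds=300):
        self._timeout = timeout_seconds
        self._active = {}  # account -> (session_id, last_seen)

    def check_in(self, account, session_id, now):
        """Return True if this session may keep using the account."""
        current = self._active.get(account)
        if current is not None:
            active_id, last_seen = current
            still_alive = now - last_seen < self._timeout
            if active_id != session_id and still_alive:
                # Another session checked in recently: refuse this one.
                return False
        # Same session, or the previous one has gone quiet: record it.
        self._active[account] = (session_id, now)
        return True


registry = SessionRegistry(timeout_seconds=60)
print(registry.check_in("alice", "session-1", now=0))
print(registry.check_in("alice", "session-2", now=20))
```

Because the server does a non-trivial part of the work, a client that is refused here simply cannot function, which is what makes this harder to patch out than a check inside a standalone binary.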
There are several approaches you could take, but there are three that will be vastly more effective than any of the others.
A. Don't create it.
Software that doesn't exist never suffers from unauthorized use.
B. Don't release it.
If you have the only copy, and you keep it that way, then the chances are exceedingly good that there will be no unauthorized use.
C. Give everyone permission to use it.
If you don't want anyone to use it without permission, then you can give everyone permission and there will be no unauthorized users.
It is possible to trace the usage. You can accomplish this by having your tool phone home and send the information you need. The problem with this is that, first, nobody likes software that phones home for this purpose, and second, a simple application-level gateway can block the application from phoning home! What you describe in your question is a common problem for software distributors, and it's not an easy one to solve!
There's another thing I haven't seen mentioned yet: you could add loads of settings to the application's configuration file, and start with ridiculous defaults. Then do the installation and configuration personally, so no one but you is able to figure out how everything should be set. This can be a major put-off for people who are just trying to see whether a copy is enough. (Be sure to add settings that depend on all sorts of system settings, like OS-version-related DLL versions that should be loaded, etc.) Not very user-friendly though ;-)
