How safe is to use an online SVN repository?

How safe is to use an online SVN repository? - security

How safe is to use an online SVN repository?
I want to develop collaboratively with some friends. I know you can create non-public accounts in some of those services, but I can't fell confortable to send all of our intelectual products to another company manage. After all, if your idea works, those companies can easily find your source code!
Do you think this care is important? If so, what is the best solution?
My question isn't "how good it is" or "which is better", I just want know if you trust them and why (or why not).
Below I give you SVN repositories examples:
XP-Dev
Unfuddle
Assembla
Thank you all!

If you have something valuable enough to be stolen, it's time to get a lawyer anyway. Get him involved from the start, have him review whatever agreements the various hosting sites have to offer, and make sure they can be held accountable for breaches of security, including the value of your source code in the hands of competitors.

It is definitely important to be concerned about your source code in the cloud. At the end of the day you have to weigh up the cost of installing, securing, maintaining, backing up yourself vs a $10/month plan with a hosted SVN service. There are always going to be a certain sector that will never upload code into a hosted repo, i.e. banks, military, etc, but for the majority of us the risk is low and minor compared to the benefits of not doing it yourself. Make sure the provider you choose enforces SSL, has regular backups (at least hourly granularity), their datacenter provider is SAS70, and a policy allowing you to download your full SVN repo dump if you choose to leave, or go elsewhere, and how long the provider has been in business, do they have a good track record, and does the provider enforce a password policy.

Related

Centralized vs. Distributed version control security

As my company begins to further explore moving from centralized version control tools (CVS, SVN, Perforce and a host of others) to offering teams distributed version control tools (mercurial in our case) I've run into a problem:
The Problem
A manager has raised the concern that distributed version control may not be as secure as our CVCS options because the repo history is stored locally on the developer's machine.
It's been difficult to nail down his exact security concern but I've gathered that it centers on the fact that a malicious employee could steal not only the latest intellectual properly but our whole history of changes just by copying a single folder.
The Question(s)
Do distributed version control system really introduce new security concerns for projects?
Is it easier to maliciously steal code?
Does the complete history represent an additional threat that the latest version of the code does not?
My Thoughts
My take is that this may be a mistaken thought that the centralized model is more secure because the history seems to be safer as it is off on its own box. Given that users with even read access to a centralized repo could selectively extract snapshots of the project at any key revision I'm not sure the DVCS model makes it all that easier. Also, most CVCS tools allow you to extract the whole repo's history with a single command so that you can import them into other tools.
I think the other issue is just how important the history is compared to the latest version. Granted someone could have checked in a top secret file, then deleted it and the history would pretty quickly be significant. But even in that scenario a CVCS user could checkout that top secret version with a single command.
I'm sure I could be missing something or downplaying risks as I'm eager to see DVCS become a fully supported tool option. Please contribute any ideas you have on security concerns.

If you have read access to a CVCS, you have enough permissions to convert the repo to a DVCS, which people do all the time. No software tool is going to protect you from a disgruntled employee stealing your code, but a DVCS has many more options for dealing with untrusted contributors, such as a gatekeeper workflow. Hence its widespread use in open source projects.

You are right in that distributed version control does not really introduce any new security concerns since the developer has already access to the code in both cases. I can only think that since it is easier to work offline and offsite with GIT, developers might become more tempted to do it than in centralized. I would push to force encryption on all corporate laptops with code
not really easier, just the same. If you enable logs, then you will have the same information when the code is accessed.
I personally do not think so. It might represent the thought process leading to certain decisions but not necessarily more.
It comes down to knowledge on how to implement security measures in both cases. If you have more experience in one system vs another then you are more likely to implement more to prevent such loss but at the end of the day, you are trusting your developers with code the minute you allow them access to it. No way around that.

DVCS provides various protections against unauthorized writing. This is why it is popular with opensource teams. It has several frustrating limitations for controlling reading. Opensource teams do not care about this.
The first problem is that most DVCS encourage many copies of the full source. The typical granularity is the full repo. This can include many unneeded branches and even entire other projects, besides the concern of history (along with searchable commit comments that can make the code even more useful to the attacker). CVCS encourages developers to copy as little as possible to their desktop, since the less they copy, the faster it works. The less you put on mobile devices, the easier it is to secure.
When DVCS is implemented with many devices acting as servers, it is much more difficult to implement effective network security. Attacking a local CVCS workspace requires the attacker to gain access to the filesystem. Attacking a DVCS node generally requires attacking the DVCS itself on any device hosting the information (and remember: the folks who maintain most DVCS's are opensource guys; they don't care nearly as much about read controls). The more devices that host repositories, the more likely that users will set up anonymous read access (which again, DVCS encourages because of its opensource roots). This greatly simplifies the job of an attacker who is doing random sweeps.
CVCS that are based on URLs (like subversion) open the opportunity for quite fine-grain access control, such as per-branch access. DVCS tends to fight this kind of access control.
I know developers like DVCS, but there's no way it can be secured as effectively as CVCS. Most environments do a terrible job of securing their CVCS, and if that's the case then it doesn't matter which you use. But if you take access control seriously, you can have much greater control with CVCS as part of a broader least-privilege infrastructure.
Many may argue that there's no reason to protect source code. That's fine and people can argue about it. But if you are going to protect your source code, the best implementation is to not copy the source to random laptops (which are very hard to secure well), and rather have developers mount it from a central server. CVCS works well this way. DVCS makes no sense if you are going to keep it on a single server this way. If you are going to copy files to mobile devices, make sure you copy as little as possible. That's the opposite of DVCS.

There are a bunch of "security" issues; whether they are an issue depends on your setup:
There's more data floating around, which means the notional "attack surface" might be bigger (it depends on how you count).
But how much data does the "typical" developer check out? You might want to use a sparse checkout in svn, but lazy people and some GUI tools don't support that, so they'll have all your code checked out anyway. Git users might be more likely to use multiple repos. This depends on you.
Authentication/access control might be better (and it might be worse!). This is largely a function of the VCS, not whether it is "D" or "C". svn:// is plaintext.
Is deleting files a priority, and how easy is this to do? An accidental commit of a confidential file is more painful to do in git if it happened in the distant past (but people might be more likely to notice).
Are you really going to notice a malicious user pulling the entire history instead of merely doing a checkout? It depends on how big your repository is and what your branches are like. It's easy for a full SVN checkout to take up more space than the repository itself due to branches.
Change history is generally not something you want to give away for free (even to people with a source code license), but how valuable is it? Maybe you have top-secret design methodologies or confidential information in your commit messages, but this seems unlikely.
And finally, security economics:
How much is the extra security worth?
How much is increased productivity worth?
How much is caring about the concerns about your developers worth?
(IIRC it turns out that users should ignore security advice, because the expected cost is more than the expected benefit — this is especially true for things like certificates that expired yesterday. How much does it cost you to check the address bar every time you type in password? How often do you catch a phishing attempt? What is the cost to you per thwarted phishing attempt? What is the cost per successful phish?)

Are services like AWS secure enough for an organization that is highly responsible for it's clients privacy?

Okay, so we have to store our clients` private medical records online and also the web site will have a lot of requests, so we have to use some scaling solutions.
We can have our own share of a datacenter and run something like Zend Server Cluster Manager on it, but services like Amazon EC2 look a lot easier to manage, and they are incredibly cheaper too. We just don't know if they are secure enough!
Are they?
Any better solutions?
More info: I know that there is a reference server and it's highly secured and without it, even the decrypted data on the cloud server would be useless. It would be a bunch of meaningless numbers that aren't even linked to each other.
Making the question more clear: Are there any secure storage and process service providers that guarantee there won't be leaks from their side?

First off, you should contact AWS and explain what you're trying to build and the kind of data you deal with. As far as I remember, they have regulations in place to accommodate most if not all the privacy concerns.
E.g., in Germany such thing is a called a "Auftragsdatenvereinbarung". I have no idea how this relates and translates to other countries. AWS offers this.
But no matter if you go with AWS or another cloud computing service, the issue stays the same. And therefor, whatever is possible is probably best answered by a lawyer and based on the hopefully well educated (and expensive) recommendation, I'd go cloud shopping, or maybe not. If you're in the EU, there are a ton of regulations especially in regards to medical records -- some countries add more to it.
From what I remember it's basically required to have end to end encryption when you deal with these things.
Last but not least security also depends on the setup and the application, etc..
For complete and full security, I'd recommend a system that is not connected to the Internet. All others can fail.

You should never outsource highly sensitive data. Your company and only your company should have access to it - in both software and hardware terms. Even if your hoster is generally trusted someone there might just steal hardware.
Depending on the size of your company you should have your custom servers - preferable even unaccessible for the technicans in your datacenter (supposing you don't own the datacenter ;).
So the more important the data is, the less foreign people should have access to it in any means. In the best case you can name all people that have access to them in any way.
(Update: This might not apply to anonymous data, but as you're speaking of customers I don't think that applies here?)
(On a third thought: There're are probably laws to take into consideration of how you have to handle that kind of information ;)

How safe is code hosted elsewhere

I was at a meeting recently for our startup. For half an hour, I was listening to one of the key people on the team talk about timelines, the market, confidentiality, being there first and so on. But I couldn't help ask myself the question: all that talk about confidentiality is nice, but there isn't much talk about physical security. This thing we're working on is web-hosted. What if after uploading it to the webhost, someone walks into the server room (don't even know where that is) and grabs a copy of the code and the database. The database is encrypted, but with access to the machine, you'd have the key.
What do the big boys do to guard the code from being stolen off? Is it common for startups to host it themselves in some private data center or what? Does anyone have facts about what known startups have done, like digg, etc.? Anyone has firsthand experience on this issue?

Very few people are interested in seeing your source code. The sysadmins working at your host are most likely in this group. It's probably not the case that they can copy your code, paste it on another host and be up and running, stealing your customers in 42 minutes.
People might be interested in seeing the contents of your DB if you're storing things like user contact information (or even more extreme, financial information). How do you protect against this? Do the easy, host independent things (like storing passwords as hashes, offloading financial data to financial service providers, HTTPS/SSL, etc.) and make sure you use a host with a good reputation. Places like Amazon (with AWS) and RackSpace would fail quickly if it got out that they regularly let employees walk off with customer (your) data.
How do the big boys do it? They have their own infrastructure (places like Google, Yahoo, etc.) or they use one of the major players (Amazon AWS, Rackspace, etc.).
How do other startups do it? I remember hearing that Stack Overflow hosts their own infrastructure (details, anyone?). This old piece on Digg indicates that they run themselves too. These two instances do not mean that all (or even most) startups have an internal infrastructure.

Most big players in the hosting biz have a solid security policy on their servers. Some very advanced technology goes into securing most high end data centers.
Check out the security at the host that I use
http://www.liquidweb.com/datacenter/

What if after uploading it to the webhost, someone walks into the server room (don't even know where that is) and grabs a copy of the code and the database. The database is encrypted, but with access to the machine, you'd have the key.
Then you're screwed :-) Even colo or rented servers should be under an authorized-access only policy, that is physically enforced at the site. Of course that doesn't prevent anyone from obtaining the "super secret" code otherwise. For that, hire expensive lawyers and get insurance.

By sharing user accounts on the same system you have more to worry about. It can be done without ever having a problem, but you are less secure than if you controlled the entire system.
Make sure you code is chmod 500, or even chmod 700, as long as the last 2 are zeros then your better off. If you do a chmod 777, then everyone on the system will be able to access your files.
However there are still problems. A vulnerability in the Linux kernel would give the attacker access to all accounts. A vulnerability in MySQL would give the attacker access to all databases. By having your own system, then you don't have to worry about these attacks.

What is the best way to stop an application being copied and used without the owner’s permission?

What is the best way to avoid that an application is copied and used without the owner’s knowing?
Is there any way to trace the usage? Meaning periodically the application communicates back, with enough information so that we can know where it is, and if it’s legal. Next thing, of course, shut it down, if it’s not legit.

Software that "phones home" will be quickly shunned by the vast majority of your users. Just license it appropriately and sell it.
People who use your software professionally will either pay for it or they won't use it. Corporations tend to frown on potential lawsuits.
People who want to use your software without paying for it will continue to do so despite your best efforts to counteract them. Once the software is in their hands, it is out of yours. Without pissing off your users, your only recourse is a legal one.
If your product is priced reasonably, some people will pay for it and some won't. That is just something you need to deal with upfront and it should be factored into your business plan.

Don't do this, don't attempt it, don't even think about it.
This is a battle you can't win. If people want to pirate your software they will. You'll be shamed by the fact that a smart reverse engineer can write a one byte binary patch to subvert all your protection schemes.
The people who are going to pirate your software will do so and all these "security features" you build in will likely end up only inconveniencing your true supporters: the people who have legitimately purchased your software. These draconian DRM / anti-piracy schemes only build resentment among software users.

Hardware dongles are the best way if you are really concerned about piracy IMO. Check out the big industrial CAD/CAM packages worth thousands or tens-of-thousands, or the AV/Music production software, they virtually all have dongle protection. Dongles can be emulated or reversed but not without a significant investment in time, a lot more than just changing a few JEs to JNEs in your assembly.
Phoning home is not the way to go unless you are providing a service that requires a subscription and constant updates (like antivirus products, for example) as part of your business model. You need to have a bit of respect for your users and their privacy. You might have perfectly innocent intentions but what if a court ordered your company to hand over that information (like the US government is doing with Google and its search terms) - would/could you fight it? What if you some time in the future sold your company and the new owners decided to sell all that historic information to a marketing company? Privacy is not just about trusting a company not to abuse your data, it is trusting that company to go out of their way to protect your data. Which is pretty far down the list of priorities for most companies. So basically, the monitoring users thing is not really a good path to go down.

The best (and pretty much only) way to reliably prevent piracy is to have a client/server application instead of a standalone one, where a non-trivial part of the work is done by the server and users need to register. Then you can at least detect and block simultaneous use of the same account.

There are several approaches you could take, but there are three that will be vastly more effective that any of the others.
A. Don't create it.
Software that doesn't exist never suffers from unauthorized use.
B. Don't release it.
If you have the only copy, and you keep it that way, then the chances are exceedingly good that there will be no unauthorized use.
C. Give everyone permission to use it.
If you don't want anyone to use it without permission, then you can give everyone permission and there will be no unauthorized users.

There is a possibility to trace the usage. You can accomplish this by letting phone your tool home and send the information you need. The problem with this is, that first nobody likes software that phones home for this purpose and second with a simple application-level gateway you can block the application to phone home! What you describe in your question is a common problem of software-distributors and it's not an easy one to solve!

There's another thing I haven't seen mentioned yet : You could add loads of settings to the applications' configuration file, and start with ridiculous defaults. Then do the installation & configuration personally, so no-one but you is able to figure out how everything should be set. This can be a mayor put-down for people that are just trying out if a copy is enough. (Be sure to add settings that depend on all sorts of system-settings, like OS-version related DLL-versions that should be loaded, etc). Not very user-friendly tho ;-)

Company seeking my personal projects during non-work at home?

Ok, so I'm building "Web 2.0/3.0" sites to make extra money. I currently run my own personal project sites with some advanced technology in the backend (AI stuff, recommendation system) that I've developed over the years. It's a subscription site for me to make money on the side.
Now, my company (they do web application/software technology, ad network) somehow found out I run several websites. They were like, "Hey Joe, you run so and so websites! Why not put them on our ad network?? The stuff you're doing is a threat to our technology -- we don't want you competing with us on the side. Let us have your websites and put it on our portfolio/ad network."
Ok, basically it seems they want the rights to my technology and personal project. Somehow they must've googled my name and linked it to some projects I'm working on on the side. Is this ethical for a company to do? Trying to own my personal project since it's got some cool technology and trying to own the rights to it? Just because I work for the company doesn't mean I'm gonna make an offer to them, right?

You probably need to consult a lawyer. What were the terms of your employment that you agreed to when you were hired? Was there a non-compete clause? Was there a required disclosure clause?

Depends on your employment contract. Your contract might say something like "anything you do, while in the employ of company XYZ, be it during work or non work hours belongs to us". It's time to talk to a lawyer, not ask StackOverflow, this isn't a technology/programming question.

Ethical? Yes, why not. If you're putting stuff out on the web and they can find it via Google, then why shouldn't they? If you don't want people to find stuff you've done on the web then don't put it on the web or use a robots.txt to hide it from Google. It's not completely unreasonable for them to at least wonder if you may be using technology that you developed while you were working for them.
Legal--maybe so, maybe not. Depends on the employment agreement that you signed when you joined the company. I'd consult an employment lawyer for real advice rather than asking here.
They may have web logs that demonstrate that you were working on your private web sites during work time--if you did so. I'd be very careful in how I proceed if I were you.

check your contract, and/or your state laws and case precedents. Talk to a lawyer.
IMHO it is unethical for them to attempt to take your intellectual property without compensation, even if you have a 'all your codez are belong to us' kind of work-for-hire agreement. But talk to a lawyer, and be prepared to walk, get sued, and countersue, if necessary. Someone trying to steal your lunch money is a bully and a thief, but they may just have a legal claim.
Unfrotunately, this is not a joke. Talk to a lawyer right away.

If what you do in any way competes with what your company does or uses technology, intellectual property, information or contacts that you gained because of your employment with your company, then you may have issues and should check your contract and see a lawyer.
The other side is: did you ever work on your sites (and this can include sending emails and the like) your personal projects at work? If so, you may be in trouble there too.
IANAL so that's all I'll say on the legalities.

You need to consult a lawyer to get a definitive answer to this question. The answer might depend on your employment contract, and the laws in your locale. Don't rely on anything people say on the internet regarding legal matters.
Regardless of whether or not it's within their rights to do so, I think it's unethical and foolish of them to pressure you like this. I imagine they have just lost any employee loyalty you might have had.
I think a proper response could be, "if you think there's ad revenue potential in my websites, make me an offer that reflects their value, and I'll consider it." After all, you started those sites to make money, right?
But first talk to a lawyer, to be sure you're in a position to negotiate.

Well a friendly way to go about it, and that they should probably be willing to accept if they are a reasonable lot, is to buy/lease your technology. This way you can get a nice sum of money for your work (since you mentioned the purpose of this site was to make extra money in your question).
Otherwise (if its a pet project first and foremost) you might as well tell them in a friendly manner that you keep that site as a hobby, and you'd prefer to not share it if thats ok, unless they let you work full time on your and a cut in the earnings, etc... (something most people would love to do, work on their pet projects and get paid a stable salary for it).
As always first try to reason with the other party in a civilized and friendly matter, it'll likely make both parties happier, and it'll be better than taking the legal route most of the time.

I am Not a Lawyer, and the laws almost certainly vary by country/state/province. But if you are working on a side project on your own time, on your own equipment, using only your own network resources, etc., then in my opinion, they have no right to your work.
If you signed some sort of vague non-compete contract, or something that says all the stuff you do on your own time is theirs, then you have less of a leg to stand on.
Your best bet is to ask a lawyer, if there's enough revenue from your subscription base to justify it.

Consult a lawyer! Regardless of your contractual obligations, any company has a right to be concerned if one of their employees is running a direct competitor on the side, especially if they can demonstrate that you have access to privileged information which you are using to compete (knowledge of their technologies, marketing strategy, customers etc).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string