I am looking for an encrypted version control system. Basically, I would like to have all files encrypted locally before they are sent to the server. The server should never receive any file or data unencrypted.
Every other feature should work pretty much the same way as SVN or CVS does today.
Can anyone recommend something like this? I did a lot of searching but I can't find anything.
You should encrypt the data pipe (SSL/SSH) instead, and secure access to the server. Encrypting the data would force SVN to treat everything as a binary file: it couldn't do any diffs, so it couldn't store deltas, which defeats the purpose of a delta-based approach.
Your repository would get huge, very quickly. If you upload a file that's 100 KB, then change one byte and check in again, and do that 8 more times (10 revisions total), the repository would be storing 10 * 100 KB, instead of 100 KB + 9 little deltas (let's call it 101 KB).
Update: #TheRook explains that it is possible to do deltas with an encrypted repository, so it may be possible to do this. However, my initial advice stands: it's not worth the hassle, and you're better off encrypting the SSL/SSH pipe and securing the server, i.e. "best practices".
It is possible to build a version control system over ciphertext. It would be ideal to use a stream cipher such as RC4-drop1024, or AES in OFB mode, as long as the same key+IV is used for each revision. This means the same PRNG keystream is generated each time and XOR'ed with the code, so if any individual byte differs, you get a mismatch at that position, and the ciphertext itself can be diffed and merged normally.
A block cipher in ECB mode could also be used, where the smallest mismatch would be one block in size, so it would be ideal to use small blocks. CBC mode, on the other hand, propagates any change into all subsequent blocks, producing wildly different ciphertext for each revision, and is thus undesirable.
I recognize that this isn't very secure. OFB and ECB modes shouldn't normally be used, as they are weaker than CBC mode, and sacrificing a unique IV is also undesirable. Furthermore, it isn't clear what attack is being defended against, whereas using something like SVN+HTTPS is very common and also secure. My post is merely stating that it is possible to do this efficiently.
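To make the stream-cipher idea concrete, here is a minimal Python sketch (using the pyca/cryptography package; the fixed key and IV are illustrative, and reusing them across revisions is exactly the security sacrifice described above):

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY = b"0" * 32  # fixed 256-bit key (illustrative only)
IV = b"1" * 16   # fixed IV, deliberately reused for every revision

def encrypt(plaintext):
    # The OFB keystream depends only on key+IV, so equal plaintext bytes at
    # equal offsets always yield equal ciphertext bytes.
    enc = Cipher(algorithms.AES(KEY), modes.OFB(IV)).encryptor()
    return enc.update(plaintext) + enc.finalize()

rev1 = encrypt(b"int main() { return 0; }")
rev2 = encrypt(b"int main() { return 1; }")
print([i for i in range(len(rev1)) if rev1[i] != rev2[i]])  # [20]: only the changed byte differs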
Why not set up your repo (subversion, mercurial, whatever) on an encrypted filesystem, and use ssh only to connect?
Use rsyncrypto to encrypt files from your plaintext directory to your encrypted directory, and to decrypt files from your encrypted directory back to your plaintext directory, using keys that you keep locally.
Use your favorite version control system (or any other version control system -- svn, git, mercurial, whatever) to synchronize between your encrypted directory and the remote host.
The rsyncrypto implementation you can download now from Sourceforge not only handles changes in bytes, but also insertions and deletions.
It implements an approach very similar to the one "The Rook" mentions.
Single-byte insertions, deletions, and changes in a plaintext file typically cause rsyncrypto to make a few K of completely different encrypted text around the corresponding point in the encrypted version of that file.
Chris Thornton points out that ssh is one good solution; rsyncrypto is a very different solution. The tradeoff is:
using rsyncrypto requires transferring a few KB for each "trivial" change, rather than the half a KB it would take using ssh (or an unencrypted system). So ssh is slightly faster and requires slightly less "diff" storage than rsyncrypto.
using ssh and a standard VCS requires the server to (at least temporarily) have the keys and decrypt the files. With rsyncrypto, the encryption keys never leave the local computer. So rsyncrypto is slightly more secure.
SVN has built-in support for transferring data securely. If you use svnserve, you can access it securely over ssh. Alternatively, you can access it through the Apache server using https. This is detailed in the svn documentation.
You could use a Tahoe-LAFS grid to store your files. Since Tahoe is designed as a distributed file system, not a versioning system, you'd probably need to use another versioning scheme on top of the file system.
Edit: Here's a prototype extension to use Tahoe-LAFS as the backend storage for Mercurial.
What specifically are you trying to guard against?
Use Subversion over ssh or https for repo access. Use an encrypted filesystem on the clients.
Have a look at Git. It supports various hooks that might do the job. See: git encrypt/decrypt remote repository files while push/pull.
Have you thought of using Duplicity? It's like rdiff-backup (delta backups), but encrypted. Not really version control, but maybe you want all the cool diff stuff and don't want to work with anyone else.
Or, just push/pull from a local Truecrypt archive and rsync it to a remote location.
rsync.net has a nice description of both:
http://www.rsync.net/resources/howto/duplicity.html
http://www.rsync.net/products/encrypted.html
Apparently, TrueCrypt containers still rsync well.
Variant A
Use a distributed VCS and transport changes directly between different clients over encrypted links. In this scenario there is no attackable central server.
For example, with Mercurial you can use the internal web server and connect the clients via VPN. Or you can bundle the change sets and distribute them via encrypted mail.
Variant B
You can export an encrypted hard drive partition over the network, mount it on the client side, and run the VCS on the client side. But this approach has lots of problems, like:
possible data loss when two different clients try to access the VCS at the same time
the link itself must be secured against fraudulent write access (when the partition is shared via NFS, it is very likely to end up with a configuration where anyone can write to the shared partition; so even if there is no way for others to read the content, there is an easy hole through which to destroy it)
There might be also other problems with variant B, so forget variant B.
Variant C
Like #Commodore Jaeger wrote, use a VCS on top of an encryption-aware network file system like AFS.
Similar to some comments above, a useful workaround may be to encrypt and zip the whole repository locally and synchronize it with the remote box. E.g., when using git, the whole repository is stored in the directory '.git' within the project dir.
zip/encrypt the whole project dir
upload it to a server (unsure if handling .git alone is sufficient)
before you continue working on the project, download this archive
decrypt/unpack and sync with git (locally)
This can be done more comfortably with a simple script; see the sketch below.
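For instance, a minimal Python sketch of the pack step (the Fernet recipe from the pyca/cryptography package stands in for whatever cipher or gpg invocation you prefer; the paths and key handling are illustrative):

import io
import tarfile
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # keep this key local; never upload it

def pack(project_dir):
    # tar the whole project dir (including .git), then encrypt the archive
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(project_dir)
    return Fernet(key).encrypt(buf.getvalue())

def unpack(blob, dest):
    # decrypt, then extract; afterwards sync with git locally as usual
    buf = io.BytesIO(Fernet(key).decrypt(blob))
    with tarfile.open(fileobj=buf, mode="r:gz") as tar:
        tar.extractall(dest)

with open("project.tar.gz.enc", "wb") as f:  # this is the file you upload
    f.write(pack("myproject"))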
pro: You can use whatever encryption is appropriate as well as every VCS which supports local repositories.
cons: it obviously lacks some VCS features when several people want to upload their code base (the merge situation). Perhaps this could be fixed by avoiding remote overwrites and merging locally, but that introduces locking, and this is where the nightmare starts.
Nevertheless, this solution is a hack, not a proper solution.
Based on my understanding this cannot be done, because in SVN and other versioning systems, the server needs access to the plaintext in order to perform versioning.
SourceSafe stores data in encrypted files. Wait, I take that back: they're obfuscated. And there's no security other than a front door to the obfuscation.
C'mon, it's Monday.
Related
Hi security-aware people,
I have recently scanned my application with a tool for static code analysis and one of the high severity findings is a hardcoded username and password for creating a connection:
dm.getConnection(databaseUrl,"server","revres");
Why does the scanner think this is a risk for the application? I can see some downsides such as not being able to change the password easily if it's compromised. Theoretically someone could reverse-engineer the binaries to learn the credentials. But I don't see the advantage of storing the credentials in a config file, where they are easy to locate and read, unless they are encrypted. And if I encrypt them, I will be solving the same problem with the encryption key...
Are there any more risks that I cannot see? Or should I use a completely different approach?
Thank you very much.
A fixed password embedded in the code will be the same for every installation, and accessible by anyone with access to the source code or binary (including the installation media).
A password read from a file can be different for each installation, and known only to those who can read the password file.
Typically, your installer will generate a unique password per site, and write that securely to the file to be read by your application. (By "securely", I mean using O_CREAT|O_EXCL to prevent symlink attacks, and with a correct selection of file location and permissions before anyone else can open it).
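A Python sketch of that installer step (the path is hypothetical; O_CREAT|O_EXCL makes the install fail rather than follow a pre-planted symlink or overwrite an existing file):

import os
import secrets

PASSWORD_FILE = "/etc/myapp/db_password"  # hypothetical location

password = secrets.token_urlsafe(32)  # unique per installation
# 0o600: readable only by the owner; O_EXCL: fail if the path already exists
fd = os.open(PASSWORD_FILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
with os.fdopen(fd, "w") as f:
    f.write(password)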
This is an interesting one. I can give you examples for a .NET application (as you haven't specified the running environment / technologies used), although my guess is Java? I hope this is still relevant and helps you.
My main advice would be to read this article and go from there: Protecting Connection information - MSDN
Here is a page that describes working with encrypted configuration files.
I've seen this solved using both encrypted configuration files and Windows authentication. I think that running your application as a user that is granted access only to the relevant stored procedures etc. (as little as possible, i.e. the Principle of Least Privilege), along with appropriate folder access, is a good route.
I would recommend using both techniques because then you can give relevant local folder access to the pool for IIS and split out your user access in SQL etc. This also makes for better auditing!
This depends on your application needs though. The main reason to make this configurable via a config file or an environment-specific user account, I would say, is that when you come to publish your application to production, your developers do not need access to the production account information and can just work with local / system-test / UAT credentials instead.
And of course the credentials are then not stored in plain text in your source control check-ins either; if you host in a private distributed network like Git, a compromised repository would otherwise hand a hacker the credentials.
I think it depends on how accessible / secure your source code or compiled code is. Developers usually have copies of the code on their dev boxes, which are usually not nearly as secure as production servers, and so are much more easily hacked. Generally, a test user / pw is configured on the dev box, and in production, the "real" pw is stored in much more secure config files. Yes, if someone hacked into the server they could easily get the credentials, but that is much more difficult than getting into a dev box in most cases. But like I said it depends. If there is only one dev, and they have a super secure machine they work with, and the repo for their code is also super secure, then there is no effective difference.
What I do is ask the end user for the credentials initially, then encrypt and store them in a file. This way, as a dev, I don't know their connection details and passwords. The key is a hashed binary, and I store it with extra bytes poked in between. Someone who wants to crack it would have to find out the algorithm used, the key and vector lengths, their locations, and the start and end positions of the byte sequence holding the values. A genius who also reverse-engineered my code to get all this information could break in (but it might be easier to directly crack the end user's credentials).
We have this computer code which requires anyone who has access to it pay a license fee. We will pay the fee for our developers but they want our sysadmins to be licensed too as they can see the code archives. But if the code is stored encrypted in the archives then the sysadmins can see the files but not see the contents.
So, does any version control system allow encryption such that only the people checking out the code need the key and are able to see the files decrypted?
I was thinking it wouldn't be hard to add this to pserver and cvs, but if it has already been done elsewhere, why reinvent the wheel?
Any insight would be helpful.
There is no way to set up a source control system that can perform server-side diffs in a way that would prevent a sysadmin from at least theoretically accessing the contents. (i.e.: The source control system would not be able to store the decryption key in a place that the sysadmin couldn't access.) Unless your sysadmins habitually browse the source control database contents, such a system should have no practical difference from an unencrypted system from the perspective of your vendor.
The only way to make the source control database illegible to a server admin is to encrypt files on the client before submitting them to the server. For this to meet the desired goal, the decryption keys would need to be inaccessible to the admins, which is unlikely to be practical in most organizations, since server admins typically have admin access on all client machines as well. Ignoring this picky detail, it would also mean that all your source control system would ever see is encrypted binaries, which means no server-side diff or blame. It also means potentially horrible bloat of your database size, since every file will require complete replacement on each commit. Are you really willing to sacrifice usability of your source control system in order to save licensing fees and/or placate this vendor?
Basically, you want to give all your developers some secret key that they plug into the encryption/decryption routines of git's smudge and clean filters. And you want an encryption scheme that is capable of performing deltas.
First, see Encrypted version control for some examples in git. As written, this can dramatically increase disk usage. However, there are ways to make more "diff-friendly" encryption at the cost of some security. See diph for an example of how you might attack that. Also, any system that uses AES-ECB mode would diff quite well. (You generally shouldn't use AES-ECB mode because of its security flaws... one of those security flaws is that it can diff quite well... hey, that's what you wanted, so this seems a reasonable exception.)
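As a sketch of what such a filter pair might look like, here is a hypothetical clean/smudge script in Python (using the pyca/cryptography package; the environment variable name and the AES-ECB choice are illustrative, and ECB's equal-blocks-encrypt-equally leak is precisely what keeps the ciphertext diff-friendly):

import os
import sys
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY = bytes.fromhex(os.environ["VCS_CRYPT_KEY"])  # 32-byte key shared by the devs

def clean(data):
    # encrypt working-tree contents before they reach the repository
    padder = padding.PKCS7(128).padder()
    padded = padder.update(data) + padder.finalize()
    enc = Cipher(algorithms.AES(KEY), modes.ECB()).encryptor()
    return enc.update(padded) + enc.finalize()

def smudge(data):
    # decrypt repository contents on checkout
    dec = Cipher(algorithms.AES(KEY), modes.ECB()).decryptor()
    padded = dec.update(data) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()

if __name__ == "__main__":
    blob = sys.stdin.buffer.read()
    op = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.buffer.write(op(blob))

You would wire this in with a .gitattributes pattern (e.g. "*.java filter=crypt") plus filter.crypt.clean and filter.crypt.smudge entries in git config pointing at the script.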
I am setting up a server where some important code will reside. I want to make sure the code is unreachable in case the HD is stolen. Well, I know you can never be sure, but reasonably secure.
Which method could I use?
How would I, for example, mount an encrypted filesystem at bootup without human interaction?
Thank you very much for your help.
I do not know if any of the encrypted filesystem solutions support this, but one solution would be to have the server contact another server to get the key. You could even imagine splitting the key between several servers, so the server would have to contact n out of m servers to get the key.
If you place the servers in different locations, that would keep you safe even if (n-1) of the servers are stolen.
An attacker would, of course, be able to get at the encryption key if he performs the attack while the server is still connected to the network, but this implementation would make you secure against simple theft.
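A minimal Python sketch of the key-splitting idea (XOR shares, so all n shares are required to reconstruct; a true "n out of m" threshold would need something like Shamir's Secret Sharing; the names are illustrative):

import secrets
from functools import reduce

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key, n):
    # n-1 random shares, plus one share that XORs everything back to the key
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    return shares + [reduce(xor, shares, key)]

def join_key(shares):
    return reduce(xor, shares)

key = secrets.token_bytes(32)   # the data-partition key
shares = split_key(key, 3)      # store one share on each key server
assert join_key(shares) == key  # the boot script fetches all shares, then mounts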
Mounting an encrypted file-system without human intervention will ultimately weaken your security; the thief would just need to steal your server as well. Encrypted filesystems are perfectly doable on any Linux-based system using dm-crypt, and there are many online tutorials showing you how to set one up.
If this is for a file-server, you may want to consider using FreeNAS. It is a BSD based NAS operating system and it includes the ability to encrypt the disks, amongst other things. You will need to enter a password through the web-interface to mount the disks.
The open source TrueCrypt creates a virtual disk within a file and mounts it like a real drive, or it can encrypt an entire drive. Encryption is transparent and fast. I have used it; it works in real time. It might make things easier.
What you want is called Full Disk Encryption. A complete partition/filesystem is encrypted; it is decrypted by the OS (or 3rd-party software) when it's mounted.
There are many implementations, and at least MS Windows & Linux have it as part of the OS.
See the Wikipedia article for details.
Being able to mount it w/o human intervention could be problematic; after all the whole point is that you cannot read the HD without human (i.e. your) intervention :-). You might be able to do this with some hardware token, but then that could also be stolen. So that requirement might not be doable.
Mounting without human interaction is possible using a hardware token, but you need to guard against someone stealing the token along with your server.
You could accomplish some safety with built-in GPS and a 10-minute backup battery or something (forget the key if power is lost for more than 10 minutes or the server is moved). You can make it work somehow, but it will be insanely expensive.
You probably want a less involved solution like this:
Boot from a regular partition
Set up encrypted swap with a randomized key on startup (important!)
Set up /tmp and similar locations on an encrypted partition or in RAM (important!); see the crypttab sketch below
Mount the encrypted data partition by logging in over ssh
Still human intervention required, but you can stay at home while doing it.
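A hypothetical /etc/crypttab sketch for the swap and /tmp steps (see crypttab(5) for your distribution; the device names and cipher options are illustrative, and /dev/urandom as the key source gives a fresh random key on every boot):

# <target>  <source>    <key file>     <options>
swap        /dev/sda2   /dev/urandom   swap,cipher=aes-xts-plain64,size=256
tmp         /dev/sda3   /dev/urandom   tmp,cipher=aes-xts-plain64,size=256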
Thank you very much for your helping answers.
I'll try a TrueCrypt container which uses several distributed keyfiles (and no password).
A script will retrieve the keyfiles, then mount the volume, then delete the keyfiles.
Since we are only a small bunch, another option could be to programmatically encrypt/decrypt the data on the client side just before writing/reading. But this seems somewhat tiresome to me.
Then, what about having a keyfile on a terminal server?
So many questions!
Thank you once more for your help.
Now... I just remembered cold boot attacks. Do we really need guns? Are we that doomed?
Subversion has multiple server types:
svnserve daemon
svnserve via xinetd
svn over ssh
http-based server
direct access via file:/// URLs
Which one of these is best for a small Linux system (one to two users)?
http:
(+) very flexible and easy to administer
(+) no network problems (port 80)
(+) 3rd-party authentication (e.g. LDAP, Active Directory)
(+) Unix + Windows native support
(+) WebDAV support for editing without an svn client
(-) slow, as each action triggers a new HTTP action; approx. 5-8 times slower than svn://
(-) especially slow on history
(-) no encryption of transferred data

https:
(+) same as http
(+) encryption of transferred data

svn:
(+) fastest transfer
(-) no password encryption in the standard setup: passwords are readable by the admin
(-) firewall problems, as no standard port is used
(-) the daemon service has to be started
(-) no encryption of transferred data

svn+ssh:
(+) nearly as fast as svn://
(-) no Windows OS comes with built-in ssh components, so 3rd-party tools are essential
(+) no daemon service needed
(+) encryption of passwords
(+) encryption of transfer
One of those options is definitely the worst one: file:// access. Don't use it; use one of the server-based methods instead.
However, whether to use HTTP or svnserve is entirely a matter of preference. In fact, you can use both simultaneously; the write-lock on the repo ensures that you won't corrupt anything if you use one and then the other.
My preference is simply to use Apache though: http is more firewall- and internet-friendly, it is easier to hook into LDAP or other authentication mechanisms, and you get features like WebDAV too. The performance may be lower than svnserve's, but it's not particularly noticeable (the transfer of data across the network makes up the bulk of any performance cost).
If you need security for file transfers, then svnserve+ssh, or apache over https is your choice.
Check out FLOSS Weekly Episode 28. Greg Stein is one of the inventors of the WebDAV protocol for SVN and discusses the tradeoffs. My takeaway is that svn:// is faster, but the http/WebDAV implementation is just fine for almost all purposes.
I've always used XInetD and HTTP.
HTTP also had WebDAV going on, so I could browse the source online if I wanted (or you can require a VPN if you want encryption and a dark-net type thing).
It really depends on what restrictions (if any) you're under.
Is it only going to be on a LAN? Will you need access outside of your LAN?
If so, will you have a VPN?
Do you have a static IP address and are you allowed to forward ports?
If you aren't under any restrictions, I would suggest going with xinetd (if you have xinetd installed; a standalone daemon if you don't) and then, if you need remote access, the http-based server (you can also use HTTPS if you don't want a plaintext username/password sent across).
Most other options are more effort with less benefit.
It's an SVN Repo -- you can always pack your bags and change things if you don't like it.
For ease of administration and security, we use svn+ssh for anything that requires commit access. We have set up HTTP based access for anonymous (read only) access to some open-source code, and it is much faster; the problem with svn+ssh is that it has to start up an ssh connection and a whole new svnserve for each user for every operation, which can get to be pretty slow after a while.
So, I'd recommend:
http for anonymous connections
svn+ssh if you need something secure and relatively quick and easy (assuming your users already have ssh set up and have access to the server)
https if you need something faster and secure, and you don't mind the extra overhead of administering it (or if you don't already have ssh set up or don't want to deal with Unix permissions)
I like SlikSVN: it runs as a service on Windows, takes two minutes to set up, and then you can forget about it.
It also comes with the client tools, but download TortoiseSVN as well.
If you are going to be using the server only on the local machine and understand Unix permissions, using file:// URLs will be fast, simple and secure. Likewise, if you understand Unix permissions and ssh and need to access it remotely, ssh will work great. While I see somebody else mentions it as "worst", I'm pretty sure that's simply due to the need to understand Unix permissions.
If you do not like or understand unix permissions, you need to go with svnserve or http. I would probably choose to run it in xinetd, personally.
Finally, if you have firewall or proxy issues, you may need to consider using http. It's much more complicated, and I don't think you're going to see the benefits, so I'd put it last on your list.
I would recommend the http option, since I'm currently using svn+ssh and it appears to be the red-headed stepchild of the available protocols: 3rd-party tool support is consistently worse for svn+ssh than it is for http.
I've been responsible for administering both svnserve and Apache+SVN for my development teams, and I prefer the http-based solution for its flexibility. I know next to nothing about system administration (I'm a software guy, after all), and I liked being able to hand authentication and authorization over to Apache.
Both times the teams were about 10-15 people, and both methods worked equally well. If you're planning for any expansion in the future, you might consider the http-based solution. It's just as easy to configure as svnserve, and if you're not going to expose the server to the Internet, then you don't have to worry too much about securing and administering Apache either.
As a user of SVN, I prefer the http-based integration with Apache. I like being able to browse the repository with my web browser.
I am curious: why NOT FSFS? Important information: I am managing Windows systems.
I have done many projects with SVN, and almost all of them were running from FSFS. The biggest repository was around 70 GB (extreme); the biggest number of repositories was around 700.
We never had any issues, even though we hosted it on Windows, NetApp and many other storage systems. Most of the time when I asked "why NOT use FSFS?", the only problem was that people simply didn't trust it.
Advantages:
No backend required (or dedicated server)
Fast and reliable
Hook scripts are supported
NTFS permissions are used
Easy to understand, easy to support, easy to manage
Disadvantages:
Not so easy access from outside your network (VPN)
Permissions only on repository-level (Read, Read/Write)
Hook scripts are running under current user credentials (which is sometimes advantage, sometimes disadvantage)
Martin
So I have a web application that integrates with several other APIs and services which require authentication. My question is, is it safe to store my authentication credentials in plain text in my source code?
What can I do to store these credentials securely?
I think this is a common problem, so I'd like to see a solution which secures credentials in the answers.
In response to comment: I frequently use PHP, Java, and RoR
I'd like to see some more votes for an answer on this question.
Here's what we do with our passwords.
$db['hostname'] = 'somehost.com';
$db['port'] = 1234;

// /etc/webapp/db/config.php defines $config['db']['username'] and
// $config['db']['password']; it lives outside the document root.
$config = array();
include '/etc/webapp/db/config.php';

$db['username'] = $config['db']['username'];
$db['password'] = $config['db']['password'];
No one but the webserver user has access to /etc/webapp/db/config.php; this way you are protecting the username and password from developers.
The only reason NOT to store the password in the code is the configuration issue (i.e. needing to change the password without rebuilding/recompiling the application).
But is the source a "safe" place for "security-sensitive" content (like passwords, keys, algorithms)? Of course it is.
Obviously security sensitive information needs to be properly secured, but that's a basic truth regardless of the file used. Whether it's a config file, a registry setting, or a .java file or .class file.
From an architecture point of view, it's a bad idea for the reason mentioned above, just like you shouldn't "hard code" any "external" dependencies in your code if you can avoid it.
But sensitive data is sensitive data. Embedding a PW in to a source code file makes that file more sensitive than other source code files, and if that's your practice, I'd consider all source code as sensitive as the password.
It is not to be recommended.
An encrypted web.config would be a more suitable place (but note that it can't be used with a web farm).
It appears the answer is the following:
Don't put credentials in source code but...
Put credentials in a configuration file
Sanitize log files
Set proper permissions/ownership on configs
Probably more depending on platform...
No, it is not.
Plus, you might want to change your password one day, and having to change the source code for that is probably not the best option.
No. Sometimes it is unavoidable, though. A better approach is to set up an architecture where the service implicitly trusts your running code based on another trust (such as trusting the machine the code is running on, or trusting the application server that is running the software).
If neither of these is available, it would be perfectly acceptable to write your own trust mechanism, though I would keep it completely separate from the application code. Also, I would recommend researching ways to keep passwords out of the hands of attackers even when they are stored on the local machine, remembering that you can't protect anything if someone has control of the physical machine it is on.
If you control the Web server, and maintain it for security updates, then in the source (preferably in a configuration module) or in a configuration file that the source uses is probably best.
If you do not control the Web server (say, you are on a shared or even dedicated server provided by a hosting company), then encryption won't help you very much; if the application can decrypt the credentials on a given host, than the host can be used to decrypt the credentials without your intervention (think root or Administrator looking at the source code, and adapting the decryption routine so that it can be used to read the configuration). This is even more of a possibility if you are using unobfuscated managed code (e.g., JVM or .NET) or a Web scripting language that resides in plaintext on the server (like PHP).
As is usually the case, there is a tradeoff between security and accessibility. I'd think about what threats are the ones you are trying to guard against and come up with a means to protect against the situations that you need. If you're working with data that needs to be secure, you should probably be redacting the database fairly regularly and moving data offline to a firewalled and well-protected database server as soon as it becomes stale on the site. This would include data like social security numbers, billing information, etc., which can be referenced. This would also mean that you'd ideally want to control the servers on your own network which provide billing services or secure data storage.
I prefer to keep them in a separate config file, located somewhere outside the web server's document root.
While this doesn't protect against an attacker subverting my code in such a way that it can be coerced into telling them the password, it does still have an advantage over putting the passwords directly into the code (or any other web-accessible file) in that it eliminates concern over a web server misconfiguration (or bug/exploit) allowing an attacker to download the password-containing file directly.
One approach is to encrypt the passwords before placing them in web.config.
I'm writing this for a web service app that receives a password, not for a client:
If you store a hashed password in the source code, someone who views the source won't be able to do anything useful with that hash.
Your program receives the plain password, hashes it, and compares the two hashes.
That's why we save hashed passwords in databases rather than plain text: they can't be reversed. If someone, for example, steals the DB or views it for malicious purposes, he won't get the users' passwords, only the hashes, which are pretty useless to him.
Hashing is a one-way process: it produces the same value from the same source, but you can't compute the source value from the hash.
Storing on the client: when the user enters a password, you save it to a db/file in plaintext. You can maybe obfuscate it a little, but there's not much you can do to prevent someone who gets hold of that computer from getting the password.
Nobody seems to have mentioned hashing yet. With a strong hash algorithm (i.e. SHA-2, not MD5), it should be much safer.
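For instance, a minimal sketch of the salted hash-and-verify flow in Python (standard library only; for production use, a deliberately slow KDF such as hashlib.pbkdf2_hmac is preferable to a single SHA-256 pass):

import hashlib
import hmac
import os

def hash_password(password, salt=None):
    # store the salt alongside the digest; never store the password itself
    salt = salt or os.urandom(16)
    return salt, hashlib.sha256(salt + password.encode()).digest()

def verify(password, salt, stored_digest):
    # constant-time comparison avoids timing side channels
    return hmac.compare_digest(hash_password(password, salt)[1], stored_digest)

salt, stored = hash_password("hunter2")
assert verify("hunter2", salt, stored)
assert not verify("wrong", salt, stored)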