Encryption and code archive files - security

We have computer code that requires anyone who has access to it to pay a license fee. We will pay the fee for our developers, but the vendor wants our sysadmins to be licensed too, since they can see the code archives. If the code were stored encrypted in the archives, the sysadmins could still see the files but not their contents.
So does any software version control system allow encryption such that only the people checking out the code need the key and can see the files decrypted?
I was thinking it wouldn't be hard to add this to pserver and CVS, but if it has already been done elsewhere, why reinvent the wheel?
Any insight would be helpful.

There is no way to set up a source control system that can perform server-side diffs and still prevent a sysadmin from at least theoretically accessing the contents (i.e., the source control system has nowhere to store the decryption key that the sysadmin couldn't access). Unless your sysadmins habitually browse the source control database contents, such a system is no different in practice from an unencrypted one as far as your vendor is concerned.
The only way to make the source control database illegible to a server admin is to encrypt files on the client before submitting them to the server. For this to meet the desired goal, the decryption keys would need to be inaccessible to the admins, which is unlikely to be practical in most organizations since server admins typically have admin access on all client machines as well. Ignoring that detail, it also means that all your source control system would ever see is encrypted binaries, which means no server-side diff or blame. It also means potentially horrible bloat of your database, since every file requires complete replacement on each commit. Are you really willing to sacrifice the usability of your source control system in order to save licensing fees and/or placate this vendor?

Basically, you want to give all your developers some secret key that they plug into the encryption/decryption routines of git's smudge and clean filters. And you want an encryption scheme that still produces useful deltas.
First, see Encrypted version control for some examples in git. As written, this can dramatically increase disk usage. However, there are ways to make more "diff-friendly" encryption at the cost of some security. See diph for an example of how you might attack that. Also, any system that uses AES-ECB mode would diff quite well. (You generally shouldn't use AES-ECB mode because of its security flaws... one of those security flaws is that it can diff quite well... hey, that's what you wanted, so this seems a reasonable exception.)
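To make the smudge/clean idea concrete, here is a minimal sketch of a filter driver, assuming the third-party Python cryptography package, a shared key distributed to developers out of band, and made-up names for the script, key path, and filter. AES-ECB is used only because it is the deterministic, block-aligned mode discussed above; that determinism is also exactly its weakness.

```python
#!/usr/bin/env python3
# Toy clean/smudge filter: encrypts file content on the way into git ("clean")
# and decrypts it on checkout ("smudge"). Wire it up with (illustrative names):
#
#   git config filter.encrypt.clean  "python3 crypt_filter.py clean"
#   git config filter.encrypt.smudge "python3 crypt_filter.py smudge"
#
# and in .gitattributes:
#
#   *.java filter=encrypt
import sys

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

KEY_FILE = "/etc/secure/repo.key"  # hypothetical path, readable by licensed developers only


def load_key() -> bytes:
    with open(KEY_FILE, "rb") as f:
        return f.read(32)  # 256-bit AES key


def clean(data: bytes, key: bytes) -> bytes:
    # Pad to the AES block size and encrypt. ECB maps identical plaintext blocks
    # to identical ciphertext blocks, which keeps the stored blobs diff-friendly
    # (and is also the mode's well-known security flaw).
    padder = padding.PKCS7(128).padder()
    padded = padder.update(data) + padder.finalize()
    enc = Cipher(algorithms.AES(key), modes.ECB(), backend=default_backend()).encryptor()
    return enc.update(padded) + enc.finalize()


def smudge(data: bytes, key: bytes) -> bytes:
    dec = Cipher(algorithms.AES(key), modes.ECB(), backend=default_backend()).decryptor()
    padded = dec.update(data) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()


if __name__ == "__main__":
    key = load_key()
    raw = sys.stdin.buffer.read()
    out = clean(raw, key) if sys.argv[1] == "clean" else smudge(raw, key)
    sys.stdout.buffer.write(out)
```

As the previous answer points out, the server still only ever sees opaque blobs with a scheme like this, so server-side diff and blame are gone either way.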

Related

Securing symmetric key

In my project (a Windows desktop application) I use a symmetric key to encrypt/decrypt some configuration data that needs to be protected. The key is hardcoded in my code (C++).
What are the risks that my key will be exposed by reverse engineering? (The customers will receive only the compiled DLL.)
Is there a more secure way to manage the key?
Are there open source or commercial products I can use?
Windows provides a key storage mechanism as part of the Crypto API. This would only be useful for you if you have your code generate a unique random key for each user. If you are using a single key for installations for all users, it will obviously have to be in your code (or be derived from constants that are in your code), and thus couldn't really be secure.
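As a hedged sketch of what per-user key storage can look like: DPAPI (part of the Windows Crypto API) encrypts a blob under the current user's credentials, so a random key generated at install time never has to appear in your code. This assumes the pywin32 bindings (win32crypt); the application name and file path are made up for the example.

```python
import os

import win32crypt  # pywin32 wrapper around the Win32 Crypto API (DPAPI)

KEY_PATH = os.path.expandvars(r"%APPDATA%\MyApp\config.key")  # hypothetical location


def create_user_key() -> bytes:
    # Generate a random per-user key and store it protected under the
    # current user's Windows credentials via DPAPI.
    key = os.urandom(32)
    blob = win32crypt.CryptProtectData(key, "MyApp config key")
    os.makedirs(os.path.dirname(KEY_PATH), exist_ok=True)
    with open(KEY_PATH, "wb") as f:
        f.write(blob)
    return key


def load_user_key() -> bytes:
    # Only the same Windows user can unprotect the blob.
    with open(KEY_PATH, "rb") as f:
        blob = f.read()
    _description, key = win32crypt.CryptUnprotectData(blob)
    return key
```

As the answer says, this only helps when each installation can generate its own random key; a single key shared by every customer still has to live in the code in some form.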
What are the risks that my key will be exposed by reverse engineering? (The customers will receive only the compiled DLL.)
100%. Assuming of course that the key protects something useful and interesting. If it doesn't, then lower.
Is there a more secure way to manage the key?
There's no security tool you could use, but there are obfuscation and DRM tools (which address a different problem than security). Any approach you use will need to be updated regularly to deal with new attacks that defeat your old approach. But fundamentally this is the same problem as DRM for music or video or games or whatever. I would shop around. Anything worthwhile will be regularly updated, and likely somewhat pricey.
Are there open source or commercial products I can use?
Open source solutions for this particular problem are... probably unhelpful. The whole point of DRM is obfuscation (making things confusing and hidden rather than secure). If you share "the secret sauce", you lose the protection. This is how DRM differs from security: in security, I can tell you everything but the secret and it's still secure; with DRM, I have to hide everything. That said, I'm sure there are some open source tools that try. There are open source obfuscation tools that try to make a binary hard to debug by scrambling identifiers and the like, but if there's just one small piece of information that's needed (the configuration), it's hard to obfuscate that sufficiently.
If you need this, you'll likely want a commercial solution, which will be imperfect and will likely require patching as it's broken (again, assuming it protects something that anyone really cares about). Recommending specific solutions is off-topic for Stack Overflow, but Google can help you. There are some Windows-specific options that may help, but it depends on your exact requirements.
Keep in mind that the "attacker" (it's hard to consider an authorized user an "attacker") doesn't have to actually get your keys. They just have to wait until your program decrypts the configurations, and then read the configurations out of memory. So you'll need obfuscation around that as well. It's a never-ending battle that you'll have to decide how hard you want to fight.

How to encrypt data while allowing r/w for a given user in Linux

I am currently working on a Java search project that will be distributed to the clients' local servers. The project contains some valuable data that we hope cannot be accessed directly on the machine, but only through the project's services/APIs. The data will be updated daily and needs to be available for query 24/7.
I am thinking of eCryptfs, but after some testing it seems that once the encrypted data is mounted under the service user, say 'root1' (and I have to keep it mounted to support queries), all the other login users can access the decrypted data without a password. Is there any way to support my scenario? Thanks.
If your users don't have root access, you can simply store the encryption key in a file and deny read access to other users.
If your users do have root access, there is nothing you can do.
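A minimal sketch of the first case (the path and key size are made up): create the key file as the service user with mode 0600, so only that account, and root, can read it.

```python
import os

KEY_PATH = "/etc/myservice/data.key"  # hypothetical path, owned by the service account


def write_key(key: bytes) -> None:
    # Create the file with mode 0600 so only the owning user (and root) can read it.
    fd = os.open(KEY_PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)


def read_key() -> bytes:
    with open(KEY_PATH, "rb") as f:
        return f.read()


if __name__ == "__main__":
    write_key(os.urandom(32))  # run once as the service user
    print(len(read_key()), "bytes of key material")
```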
EDIT:
Under most circumstances, someone with a root account can do anything that other users can do. So even if you did restrict read/write permissions on a file to one particular user (which is entirely possible), it would be rather pointless: someone with sudo/root access could just run sudo su USER, where USER is the account with the r/w permissions. I think a better way to go about this is to look at options that users do not have control over.
The first thing that came to mind was compiled programs. While they are not really meant for holding secure information, you could compile a simple program to output a little bit of the information after a time delay (to prevent them from just running it continuously and compiling all of the data they get from it). Actually, modifying your Java program might be easier; just have it store the information as an enormous string or something. :D
These open source Java obfuscators will make it harder (but certainly not impossible) to reverse engineer your program and, along with it, the data inside.
A more secure option would be to write a C program, compile it, and have it output information (after a time delay) that the Java program can then manage. To make it harder to decompile, you could add some encryption to the string, so that if the decompiler messes up any part of it, it's still worthless information to them.
Final verdict: Nothing is really 100% secure when it is stored on someone else's computer(s) but, then again, neither is it 100% secure on your own server. I would suggest looking into other options, but if you have no other option and you have legal protection on the information, this might work for you.

Centralized vs. Distributed version control security

As my company begins to further explore moving from centralized version control tools (CVS, SVN, Perforce and a host of others) to offering teams distributed version control tools (Mercurial in our case), I've run into a problem:
The Problem
A manager has raised the concern that distributed version control may not be as secure as our CVCS options because the repo history is stored locally on the developer's machine.
It's been difficult to nail down his exact security concern, but I've gathered that it centers on the fact that a malicious employee could steal not only the latest intellectual property but our whole history of changes just by copying a single folder.
The Question(s)
Do distributed version control systems really introduce new security concerns for projects?
Is it easier to maliciously steal code?
Does the complete history represent an additional threat that the latest version of the code does not?
My Thoughts
My take is that the belief that the centralized model is more secure may be mistaken; the history only seems safer because it sits off on its own box. Given that users with even read access to a centralized repo can selectively extract snapshots of the project at any key revision, I'm not sure the DVCS model makes theft all that much easier. Also, most CVCS tools allow you to extract the whole repo's history with a single command so that you can import it into other tools.
I think the other issue is just how important the history is compared to the latest version. Granted, someone could have checked in a top-secret file and then deleted it, and then the history would quickly become significant. But even in that scenario a CVCS user could check out that top-secret revision with a single command.
I'm sure I could be missing something or downplaying risks as I'm eager to see DVCS become a fully supported tool option. Please contribute any ideas you have on security concerns.
If you have read access to a CVCS, you have enough permissions to convert the repo to a DVCS, which people do all the time. No software tool is going to protect you from a disgruntled employee stealing your code, but a DVCS has many more options for dealing with untrusted contributors, such as a gatekeeper workflow. Hence its widespread use in open source projects.
You are right that distributed version control does not really introduce any new security concerns, since the developer already has access to the code in both cases. I can only think that since it is easier to work offline and offsite with Git, developers might become more tempted to do so than with a centralized system. I would push to force encryption on all corporate laptops that hold code.
Not really easier, just the same. If you enable logs, you will have the same information about when the code is accessed.
I personally do not think so. It might represent the thought process leading to certain decisions but not necessarily more.
It comes down to knowing how to implement security measures in each case. If you have more experience with one system than another, you are more likely to put more safeguards in place against such loss, but at the end of the day you are trusting your developers with the code the minute you allow them access to it. There is no way around that.
DVCS provides various protections against unauthorized writing, which is why it is popular with open source teams. It has several frustrating limitations for controlling reading, which open source teams do not care about.
The first problem is that most DVCSs encourage many copies of the full source. The typical granularity is the full repo. This can include many unneeded branches and even entire other projects, besides the concern of history (along with searchable commit comments that can make the code even more useful to the attacker). CVCS encourages developers to copy as little as possible to their desktops, since the less they copy, the faster it works. The less you put on mobile devices, the easier it is to secure.
When DVCS is implemented with many devices acting as servers, it is much more difficult to implement effective network security. Attacking a local CVCS workspace requires the attacker to gain access to the filesystem. Attacking a DVCS node generally requires attacking the DVCS itself on any device hosting the information (and remember: the folks who maintain most DVCSs are open source people; they don't care nearly as much about read controls). The more devices that host repositories, the more likely that users will set up anonymous read access (which, again, DVCS encourages because of its open source roots). This greatly simplifies the job of an attacker who is doing random sweeps.
CVCS that are based on URLs (like subversion) open the opportunity for quite fine-grain access control, such as per-branch access. DVCS tends to fight this kind of access control.
I know developers like DVCS, but there's no way it can be secured as effectively as CVCS. Most environments do a terrible job of securing their CVCS, and if that's the case then it doesn't matter which you use. But if you take access control seriously, you can have much greater control with CVCS as part of a broader least-privilege infrastructure.
Many may argue that there's no reason to protect source code. That's fine and people can argue about it. But if you are going to protect your source code, the best implementation is to not copy the source to random laptops (which are very hard to secure well), and rather have developers mount it from a central server. CVCS works well this way. DVCS makes no sense if you are going to keep it on a single server this way. If you are going to copy files to mobile devices, make sure you copy as little as possible. That's the opposite of DVCS.
There are a bunch of "security" issues; whether they are an issue depends on your setup:
There's more data floating around, which means the notional "attack surface" might be bigger (it depends on how you count).
But how much data does the "typical" developer check out? You might want to use a sparse checkout in svn, but lazy people won't bother and some GUI tools don't support it, so they'll have all your code checked out anyway. Git users might be more likely to use multiple repos. This depends on you.
Authentication/access control might be better (and it might be worse!). This is largely a function of the VCS, not whether it is "D" or "C". svn:// is plaintext.
Is deleting files a priority, and how easy is it to do? An accidental commit of a confidential file is more painful to remove in git if it happened in the distant past (but people might be more likely to notice).
Are you really going to notice a malicious user pulling the entire history instead of merely doing a checkout? It depends on how big your repository is and what your branches are like. It's easy for a full SVN checkout to take up more space than the repository itself due to branches.
Change history is generally not something you want to give away for free (even to people with a source code license), but how valuable is it? Maybe you have top-secret design methodologies or confidential information in your commit messages, but this seems unlikely.
And finally, security economics:
How much is the extra security worth?
How much is increased productivity worth?
How much is caring about the concerns about your developers worth?
(IIRC it turns out that users should ignore security advice, because the expected cost is more than the expected benefit; this is especially true for things like certificates that expired yesterday. How much does it cost you to check the address bar every time you type in a password? How often do you catch a phishing attempt? What is the cost to you per thwarted phishing attempt? What is the cost per successful phish?)

Securing SQL queries, ensuring that no one person knows the password

What are some effective and secure methods of securing SQL queries?
In short, I would like to ensure that programmers do not see the passwords used by the application to perform queries. Something like RSA or PGP comes to mind, but I don't know how one could implement a changing password without it being encoded in the application somewhere.
Our environment is a typical Linux/MySQL.
This might be more of a process issue and less of a coding issue.
You need to strictly separate the implementation process from the roll-out process during software development. The configuration files containing the passwords must be filled with the real passwords during roll-out, not before. The programmers can work with the password for the development environment, and the roll-out team changes those passwords once the application is complete. That way the real passwords are never disclosed to the people coding the application.
If you cannot ensure that programmers do not get access to the live system, you need to encrypt the configuration files. The best way to do this depends on the programming language. I am currently working on a Java application that encrypts the .properties files with the appropriate functions from the ESAPI project and I can recommend that. If you are using other languages, you have to find equivalent mechanisms.
Any time you want to change passwords, an administrator generates a new file and encrypts it, before copying the file to the server.
In case you want maximum security and do not want to store the key to decrypt the configuration on your system, an administrator can supply it whenever the system reboots. But this might take things too far, depending on your needs.
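The answer above describes ESAPI's encrypted .properties support for Java; as a language-neutral sketch of the same roll-out flow (an administrator encrypts the file, the application decrypts it in memory at startup), here is roughly what an equivalent might look like with Python's cryptography.fernet. The file names and property keys are made up, and this illustrates the process, not ESAPI itself.

```python
from cryptography.fernet import Fernet


def encrypt_config(plaintext_path: str, encrypted_path: str, key: bytes) -> None:
    # Roll-out side: an administrator encrypts the real credentials and copies
    # only the encrypted file to the server; the plaintext never leaves their machine.
    with open(plaintext_path, "rb") as f:
        token = Fernet(key).encrypt(f.read())
    with open(encrypted_path, "wb") as f:
        f.write(token)


def load_config(encrypted_path: str, key: bytes) -> dict:
    # Application side: decrypt in memory at startup and parse simple key=value lines.
    with open(encrypted_path, "rb") as f:
        text = Fernet(key).decrypt(f.read()).decode()
    props = {}
    for line in text.splitlines():
        if line.strip() and not line.lstrip().startswith("#"):
            k, _, v = line.partition("=")
            props[k.strip()] = v.strip()
    return props


if __name__ == "__main__":
    with open("db.properties", "w") as f:        # stand-in for the real config
        f.write("db.user=app\ndb.password=s3cret\n")
    key = Fernet.generate_key()                  # held by the roll-out team, not the developers
    encrypt_config("db.properties", "db.properties.enc", key)
    print(load_config("db.properties.enc", key)["db.password"])
```

Whether the decryption key lives on the server or is supplied by an administrator at boot is the trade-off described in the last paragraph above.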
If programmers don't have access to the configuration files that contain the login credentials and can't get to them through the debug or JMX interfaces then that should work. Of course that introduces other problems but that would potentially satisfy your requirement. (I am not a Qualified Security Assessor - so check with yours to be sure for PCI compliance.)

Designing a Linux-based system for transferability of ownership/admin rights without total trust

Inspired by a much more specific question on ServerFault.
We all have to trust a huge number of people for the security and integrity of the systems we use every day. Here I'm thinking of all the authors of all the code running on your server or PC, and everyone involved in designing and building the hardware. This is mitigated by reputation and, where source is available, peer review.
Someone else you might have to trust, who is mentioned far less often, is the person who previously had root on a system. Your predecessor as system administrator at work. Or for home users, that nice Linux-savvy friend who configured your system for you. The previous owner of your phone (can you really trust the Factory Reset button?)
You have to trust them because there are so many ways to retain root despite the incoming admin's best efforts, and those are only the ones I could think of in a few minutes. Anyone who has ever had root on a system could have left all kinds of crazy backdoors, and your only real recourse under any Linux-based system I've seen is to reinstall your OS and all code that could ever run with any kind of privilege. Say, mount /home with noexec and reinstall everything else. Even that's not sufficient if any user whose data remains may ever gain privilege or influence a privileged user in sufficient detail (think shell aliases and other malicious configuration). Persistence of privilege is not a new problem.
How would you design a Linux-based system on which the highest level of privileged access can provably be revoked without a total reinstall? Alternatively, what system like that already exists? Alternatively, why is the creation of such a system logically impossible?
When I say Linux-based, I mean something that can run as much software that runs on Linux today as possible, with as few modifications to that software as possible. Physical access has traditionally meant game over because of things like keyloggers which can transmit, but suppose the hardware is sufficiently inspectable / tamper-evident to make ongoing access by that route sufficiently difficult, just because I (and the users of SO?) find the software aspects of this problem more interesting. :-) You might also assume the existence of a BIOS that can be provably reflashed known-good, or which can't be flashed at all.
I'm aware of the very basics of SELinux, and I don't think it's much help here, but I've never actually used it: feel free to explain how I'm wrong.
First and foremost, you did say design :) My answer will contain references to stuff that you can use right now, but some of it is not yet stable enough for production. My answer will also contain allusions to stuff that would need to be written.
You cannot accomplish this unless you (as user9876 pointed out) fully and completely trust the individual or company that did the initial installation. If you can't trust them, your problem is infinitely recursive.
Several years ago I was very active in a new file system called ext3cow, a copy-on-write version of ext3. Snapshots were cheap and 100% immutable; the port from Linux 2.4 to 2.6 broke, and then abandoned, the ability to modify or delete files in the past.
Pound for pound, it was as efficient as ext3. Sure, that's nothing to write home about, but ext3 was, and to a large extent still is, the production-standard FS.
Using that type of file system, assuming a snapshot was made of the pristine installation after all services had been installed and configured, it would be quite easy to diff an entire volume to see what changed and when.
At this point, after going through the diff, you can decide that nothing is interesting and just change the root password, or you can go inspect things that seem a little odd.
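The volume-diff step itself needs nothing filesystem-specific. As a rough sketch using only the standard library (the mount points are made up, and dircmp's file comparison is shallow by default, so a real audit would hash contents too):

```python
import filecmp
import os

SNAPSHOT = "/mnt/snapshots/pristine"  # hypothetical read-only mount of the pristine snapshot
LIVE = "/srv"                         # the live tree (exclude /proc, /sys, etc. in practice)


def report(cmp: filecmp.dircmp, prefix: str = "") -> None:
    # Recursively list files present on only one side or whose contents differ.
    for name in cmp.left_only:
        print("removed since snapshot:", os.path.join(prefix, name))
    for name in cmp.right_only:
        print("added since snapshot:  ", os.path.join(prefix, name))
    for name in cmp.diff_files:
        print("changed:               ", os.path.join(prefix, name))
    for name, sub in cmp.subdirs.items():
        report(sub, os.path.join(prefix, name))


if __name__ == "__main__":
    report(filecmp.dircmp(SNAPSHOT, LIVE))
```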
Now, for the stuff that has to be written if something interesting is found:
Something that you can pipe the diff through that investigates each file. What you're going to see is a list of revisions per file, which would then have to be compared recursively: present against former-present, former-present against past1, past1 against past2, etc., until you reach the original file or the point at which it no longer exists. Doing this by hand would seriously suck. Also, you need to identify files that were never versioned to begin with.
Something to inspect your currently running kernel. If someone has tainted VFS, none of this is going to work, CoW file systems use temporal inodes to access files in the past. I know a lot of enterprise customers who modify the kernel quite a bit, up to and including modules, VMM and VFS. This may not be such an easy task - comparing against 'pristine' may not be tenable since the old admin may have made good modifications to the kernel since it was installed.
Databases are a special headache, since they typically change every second or more often, including the user table. That will need to be checked manually, unless you come up with something that can verify that nothing is strange; such a tool would be very specific to your setup. Classic UNIX 'root' is not your only concern here.
Now, consider the other computers on the network. How many of them are running an OS that is known to be easily exploited and bot-infested? Even if your server is clean, what if this guy joins #foo on IRC and starts an attack on your servers via your own LAN? Most people will click links that a co-worker sends, especially if it's a juicy blog entry about the company... social engineering is very easy if you're doing it from the inside.
In short, what you suggest is tenable; however, I'm dubious that most companies could enforce the best practices needed for it to work when needed. If the end result is that you find a BOFH in your workforce and need to can him, you had better have contained him throughout his employment.
I'll update this answer as I continue to think about it. It's a very interesting topic. What I've posted so far are my own collected thoughts on it.
Edit:
Yes, I know about virtual machines and checkpointing; a solution that assumes those brings on a whole new level of recursion. Did the (now departed) admin have direct root access to the privileged domain or storage server? Probably yes, which is why I'm not considering it for the purposes of this question.
Look at Trusted Computing. The general idea is that the BIOS loads the bootloader, then hashes it and sends that hash to a special chip. The bootloader then hashes the OS kernel, which in turn hashes all the kernel-mode drivers. You can then ask the chip whether all the hashes were as expected.
Assuming you trust the person who originally installed and configured the system, this would enable you to prove that your OS hasn't had a rootkit installed by any of the later sysadmins. You could then manually run a hash over all the files on the system (since there is no rootkit the values will be accurate) and compare these against a list provided by the original installer. Any changed files will have to be checked carefully (e.g. /etc/passwd will have changed due to new users being legitimately added).
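A hedged sketch of that manual hashing step, assuming the original installer handed you a manifest of tab-separated path/SHA-256 lines (the manifest format and location are made up for the example):

```python
import hashlib
import os

MANIFEST = "/root/pristine.sha256"  # hypothetical "path<TAB>hexdigest" list from the original installer


def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(manifest_path: str) -> None:
    # Compare every file listed in the manifest against its recorded hash.
    with open(manifest_path) as m:
        for line in m:
            path, expected = line.rstrip("\n").split("\t")
            if not os.path.exists(path):
                print("MISSING ", path)
            elif sha256_of(path) != expected:
                print("CHANGED ", path)


if __name__ == "__main__":
    verify(MANIFEST)
```

Anything reported as changed (e.g. /etc/passwd) then gets the manual scrutiny described above; the trusted-boot chain is what lets you believe the hashes you compute on the machine in the first place.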
I have no idea how you'd handle patching such a system without breaking the chain of trust.
Also, note that your old sysadmin should be assumed to know any password typed into that system by any user, and to have unencrypted copies of any private key used on that system by any user. So it's time to change all your passwords.
