How to Generate a per-Host UUID in a Shell Script?

I am writing a shell script that will be deployed to multiple machines and connect to a central server. When connecting, the script should identify the machine it is running on. (This is used to implement some rudimentary locking but that's not important for the question.)
I know that I could use the host-name, as reported by hostname -f, to identify the machine. But many personal devices have far from unique host-names, such as my-laptop or workstation so I'm wary of using this.
I might be able to add entropy by hashing the host-name together with some other host-specific information that will not change.
echo $(hostname -f) $(uname -snmpio) | md5sum
But the added entropy is very low. I'm having a hard time thinking of other system properties that can be hashed and are guaranteed not to change. (For example, I don't want to add any properties of the file system or other system configuration [1] because they might legitimately change at any time.)
Finally, I thought about generating a random string the first time the script is run and storing it in some configuration file. This would be extremely likely to be unique and guaranteed not to change. But if possible, I'd prefer not having to manage persistent state.
Ideally, there would exist a utility to obtain a deterministic non-volatile UUID for the local system (like blkid for block-devices). It is not required that the UUIDs be hard to forge. This is not an authentication mechanism and I'm trusting all parties that run the script.
Are there any superior options I have overlooked?
[1] Technically, the host-name is a system configuration, too. But if we change it, we expect the system to no longer be identified as the one it was before.

How to Generate Version 4 UUIDs
The easiest way to generate a unique identifier is to use a UUID. The most common type of UUID for this purpose is UUID v4, which is generally the correct choice unless you have some specific circumstances (e.g. namespacing requirements or poor sources of entropy) that would lead you to using one of the other UUID types.
You can use uuidgen on Linux, which can be found in the "uuid-runtime" package on Debian-based systems. The uuidgen tool is also installed by default on OS X, should you need it. On Linux, the tool relies on libuuid and the kernel's random number generator (/dev/urandom) to generate Version 4 UUIDs. If a high-quality random number generator isn't available, uuidgen will fall back on Version 1 time-based UUIDs.
UUIDGEN(1) says:
There are two types of UUIDs which uuidgen can generate: time-based UUIDs and random-based UUIDs. By default uuidgen will generate a random-based UUID if a high-quality random number generator is present. Otherwise, it will choose a time-based UUID. It is possible to force the generation of one of these two UUID types by using the -r or -t options.
As a general rule, if you're able to do so you should definitely stick with Version 4 as Version 1 has known security risks and limitations to its uniqueness properties. However, specific use cases may vary.
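To tie this back to the question: a random UUID only identifies a host if it is generated once and then reused. Here is a minimal sketch of that, assuming a small state file is acceptable after all. The function name host_id and the default path are made up for illustration, and the /proc/sys/kernel/random/uuid fallback is Linux-only.

```shell
#!/bin/sh
# Hypothetical helper: print a per-host ID, creating it on first use.
# The default state-file path is an assumption; any location that
# survives reboots will do.
host_id() {
    id_file=${1:-"$HOME/.config/myscript/host-id"}
    if [ ! -s "$id_file" ]; then
        mkdir -p "$(dirname "$id_file")" || return 1
        # Prefer uuidgen; fall back to the kernel's UUID source on Linux.
        uuidgen > "$id_file" 2>/dev/null ||
            cat /proc/sys/kernel/random/uuid > "$id_file" || return 1
    fi
    cat "$id_file"
}
```

Calling host_id twice on the same machine should print the same value, since the second call only reads the file back.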

In the interest of providing options, you could use the fingerprint of the local machine's SSH host key (the RSA one, for example). Your machines most likely already have all the necessary components configured.
hostkey=$(ssh-keygen -l -f /path/to/host_key.pub)
The output will contain spaces and whatnot, but you could parse those out if it is a problem. The host keys are usually at /etc/ssh/ssh_host_*, but the location may depend on the Linux distribution.
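If the extra fields are a problem, something along these lines should strip them. It is only a sketch: the function names are made up, and the default key path is an assumption (a common location, but check your distribution). The output of ssh-keygen -l has the form "bits fingerprint comment (type)", so the fingerprint is the second field.

```shell
#!/bin/sh
# Keep only the second whitespace-separated field (the fingerprint)
# of `ssh-keygen -l` output read from stdin.
fingerprint_only() {
    awk '{ print $2 }'
}

# Hypothetical wrapper; the default path is an assumption.
host_fingerprint() {
    ssh-keygen -l -f "${1:-/etc/ssh/ssh_host_rsa_key.pub}" | fingerprint_only
}
```

Keep in mind that the ID changes whenever the host keys are regenerated, for example when the system is reinstalled.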

Related

Does the Linux cksum command's value vary between systems?

If I run "cksum filename" on two different Linux systems with different hardware specs, I get different checksum values for the same file.
Can anyone tell me the reason behind this?
The "filename" is a binary file generated in one system and copied to other system.
The algorithm employed by cksum is specified by POSIX. All POSIX-compliant systems (including GNU/Linux) should compute the same value for the same file -- that's the whole point. If you get different values on different systems, then either the program is buggy or the files (at least cksum's view of them) are not, in fact, the same. I wouldn't bet on the program being buggy.
Do note, however, that there are likely to be other hashing and checksumming programs on both systems (e.g. md5sum or sum). The sums computed by each of these programs are likely to differ, but each should be consistent from system to system. They could be a useful alternative for you, and/or they could give you a second opinion of whether the files really are the same.
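One way to get that second opinion is to run something like the following on both machines and compare the output line by line. This is a rough sketch; the function name file_sums is made up, and it assumes sha256sum (GNU coreutils) is installed alongside cksum.

```shell
#!/bin/sh
# Print the POSIX cksum CRC and byte count, then a SHA-256 digest.
# Reading from stdin keeps the file name out of the output, so the
# two lines can be compared verbatim between machines.
file_sums() {
    cksum < "$1"
    sha256sum < "$1" | awk '{ print $1 }'
}
```

If the byte count on the first line already differs, the copy itself altered the file, and no checksum tool will agree across the two systems.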

Signature/Hash Choice for File Integrity Verification

For a file repository, I need to select a hashing algorithm that will reasonably ensure the integrity of files.
I need an algorithm that anyone (with a bit of effort) would be able to easily use to verify the integrity given the hash. In short, the file may be transferred to the user, along with a hash, and they must be able to verify that the hash comes from the file.
My first choice would be MD5 because there seems to be widely available utilities to verify MD5 hashes, but I'm concerned with the MD5 algorithm being cryptographically broken (ref Wikipedia/US-CERT: http://en.wikipedia.org/wiki/MD5)
My second choice would be a SHA-2 algorithm, but I'm concerned about availability of utilities that could easily verify the hash. Most examples I've found show program code to evaluate a hash, but I've found few, if any, utilities that are pre-built (asking users to build their own utility is beyond the 'easily' scope)
What other options are available for generating and evaluating a file hash, or are these two the options that are best?
Provide both/multiple, and let the user decide which they verify against. Or if they are really cautious, they can verify against both/all.
I have seen download sites use this approach. One site recommended the most secure hash but offered others, like MD5, as a fallback. It also provided links to tools. I can't remember the specific site, I'm afraid.
Since you've been able to find a few file-checkers, why not link to them as a recommendation? That way your users have at least one tool they can use. They don't need several dozen different filechecking utilities, they need just one which works for the algo you chose to use.
Tools you could link to:
Windows: http://securityxploded.com/download-hash-verifier.php
Mac OS X: http://www.macupdate.com/app/mac/31781/checksum
sha256sum, a program that is part of the coreutils package on Linux, will generate checksums for the listed files. The format of the checksum output is the same as that of the md5sum program (but using SHA-256 hashing instead of MD5, of course), which has been widely used for years. You didn't list any target platforms, but a quick search shows there are Windows ports of the command-line program.
If you need to generate large numbers of checksums you can use md5deep, which includes support for other hashes as well, including SHA-256.
http://md5deep.sourceforge.net/
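For what it's worth, the sha256sum round trip looks roughly like this. It is a sketch assuming GNU coreutils; the function names and the ".sha256" suffix are just conventions picked for the example.

```shell
#!/bin/sh
# Publisher side: write "<digest>  <file>" next to the file.
publish_checksum() {
    sha256sum -- "$1" > "$1.sha256"
}

# User side: recompute the digest and compare it to the published one.
# Exits non-zero if the file no longer matches.
verify_checksum() {
    sha256sum -c --quiet -- "$1.sha256"
}
```

The user only needs the file, the .sha256 file, and a single command. The md5sum tool accepts the same -c option if you decide to publish MD5 sums as a fallback.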
I haven't tried this but from the screenshots it looks pretty neat integrating into OSX and Windows Explorer: http://implbits.com/HashTab.aspx

new encryption algorithm for ssh

I have been asked to add a new algorithm to ssh so that data is encrypted with it. Any idea how to add a new algorithm to ssh?
thanks
It is possible to add a new algorithm to SSH communication, and this is done from time to time (e.g. AES was added later). The catch is that you need to modify both the client and the server so that they both support the algorithm; otherwise it makes no sense.
I assume that you were asked to add some custom algorithm, either home-made or non-standard. So the first thing I'd like to do is warn you that the added algorithm can be weak. You need to do at least a basic search for information about the algorithm, because if it's broken, the work will be useless and even dangerous.
As for the software modifications themselves: it's a rare job, so most likely you won't find anybody here with this experience. However, the code that handles the various algorithms is typical, and adding a new algorithm is trivial: you add one source file with the algorithm implementation and then modify a bunch of places by adding one more case to a switch statement.
In my career I've worked on a private fork of ssh that was sold as closed-source commercial software. Even they in all their crazy stupidity (private fork? who in their right mind uses non-Open Source encryption software? I thought our customers were completely off their rockers.) didn't add a new encryption algorithm.
It can be done though. Adding the hooks to the ssh protocol to support it isn't hard. The protocol is designed to be extensible in that way. At the beginning the client and server exchange lists of encryption algorithms they're willing to use.
This means, of course, that only a modified client and a modified server will talk to each other.
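To illustrate the list exchange: as far as I recall from the transport-layer spec (RFC 4253), the chosen cipher is the first one on the client's list that also appears on the server's list. Below is a toy sketch of that rule, not OpenSSH code. The function name and the mycipher@example.com entry are made up (private algorithm names use the name@domain form).

```shell
#!/bin/sh
# Toy model of SSH cipher negotiation: print the first entry of the
# client's comma-separated list that the server also offers.
# Returns non-zero when the two lists have nothing in common.
negotiate() {
    client=$1
    server=$2
    old_ifs=$IFS
    IFS=,
    for c in $client; do
        for s in $server; do
            if [ "$c" = "$s" ]; then
                IFS=$old_ifs
                printf '%s\n' "$c"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}
```

With a client list of mycipher@example.com,aes256-ctr and an unmodified server, the custom cipher is simply skipped and aes256-ctr is chosen. If the client offers only the custom cipher, negotiation fails.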
The real difficulty is OpenSSL. ssh does not use TLS/SSL, but it does use the OpenSSL encryption library. You would have to add the new algorithm to that library, and that library is a terrible beast.
Though, I suppose you could add the algorithm without adding it to OpenSSL. That might be tricky though since I think openssh may rely heavily on the way the OpenSSL APIs work. And part of how they work allows you to pass around a constant representing which algorithm you want to use and then a standard set of calls for encryption and decryption that use the constant to decide on the algorithm.
Again though, if I recall correctly, OpenSSL has an API specifically for adding new algorithms to its suite. So that may not be so hard. You will have to make sure this happens when the OpenSSL library is being initialized.
Anyway, this is a fairly vague answer, but maybe it will point you in the right direction. You should make whoever is doing this pay enormous sums of money. Stupidity that requires this level of knowledge to pull off should never come cheaply.

Are passwords on modern Unix/Linux systems still limited to 8 characters?

Years ago it used to be the case that Unix passwords were limited to 8 characters, or that if you made the password longer than 8 characters the extra wouldn't make any difference.
Is that still the case on most modern Unix/Linux systems?
If so, around when did longer passwords become possible on most systems?
Is there an easy way to tell if a given system supports longer passwords and if so, what the effective maximum (if any) would be?
I've done some web searching on this topic and couldn't really find anything definitive; much of what came up was from the early 2000s when I think the 8 character limit was still common (or common enough to warrant sticking to that limit).
Although the original DES-based algorithm only used the first 8 characters of the password, Linux, Solaris, and other newer systems now additionally support other password hash algorithms such as MD5 which do not have this limit. Sometimes it is necessary to continue using the old algorithm if your network contains older systems and if NIS is used. You can tell that the old DES-based algorithm is still being used if the system will log you in when you enter only the first 8 characters of your >8-character password.
Because it is a hash algorithm, MD5 does not have an intrinsic limit. However various interfaces do generally impose some limit of at least 72 characters.
Although originally the encrypted password was stored in a world-readable file (/etc/passwd), it is now usually stored in a separate shadow database (e.g. /etc/shadow) which is only readable by root. Therefore, the strength of the algorithm is no longer as important as it once was. However if MD5 is inadequate, Blowfish or SHA can be used instead on some systems. And Solaris supports pluggable password encryption modules, allowing you to use any crazy scheme. Of course if you are using LDAP or some other shared user database then you will need to select an algorithm that is supported on all of your systems.
In glibc2 (any modern Linux distribution) the password encryption function can use MD5/SHA-xxx (selected by a magic salt prefix), which then treats all the input characters as significant (see man 3 crypt). For a simple test on your system, you could try something like:
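#!/usr/bin/perl -w
# Compare traditional DES crypt (2-character salt) with MD5 crypt ($1$ prefix).
my $oldsalt = '##';
my $md5salt = '$1$##$';
# DES: both lines print the same hash if only the first 8 characters count.
print crypt("12345678", $oldsalt) . "\n";
print crypt("123456789", $oldsalt) . "\n";
# MD5: every character is significant, so all three hashes differ.
print crypt("12345678", $md5salt) . "\n";
print crypt("12345678extend-this-as-long-as-you-like-0", $md5salt) . "\n";
print crypt("12345678extend-this-as-long-as-you-like-1", $md5salt) . "\n";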
(which on my system gives)
##nDzfhV1wWVg
##nDzfhV1wWVg
$1$##$PrkF53HP.ZP4NXNyBr/kF.
$1$##$4fnlt5pOxTblqQm3M1HK10
$1$##$D3J3hluAY8pf2.AssyXzn0
Other *ix variants support similar - e.g. crypt(3) since at least Solaris 10.
However, it's a non-standard extension - POSIX does not define it.
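Going the other way, you can usually tell which scheme an existing account uses from the prefix of its hash field. Here is a rough sketch based on the prefixes listed in man 3 crypt; the function name is made up, and newer schemes it doesn't know are reported as "unknown".

```shell
#!/bin/sh
# Classify a password-hash field (second field of /etc/shadow) by its
# magic prefix. A field with no "$" prefix is assumed to be
# traditional DES, which is the scheme with the 8-character limit.
hash_scheme() {
    case $1 in
        ''|'*'|'!'*) echo locked ;;
        '$1$'*)      echo md5 ;;
        '$2'*)       echo blowfish ;;
        '$5$'*)      echo sha256 ;;
        '$6$'*)      echo sha512 ;;
        '$'*)        echo unknown ;;
        *)           echo des ;;
    esac
}
```

For the sample output above, hash_scheme '##nDzfhV1wWVg' prints des and hash_scheme '$1$##$PrkF53HP.ZP4NXNyBr/kF.' prints md5. On a live system you would feed it the hash field from the shadow database, which needs root.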
Not for Linux. It's only 8 if you disable MD5 Hashing.
http://www.redhat.com/docs/manuals/linux/RHL-8.0-Manual/security-guide/s1-wstation-pass.html
You can administer policies enforcing longer and more complex passwords as well.
The full lengths are discussed here:
http://www.ratliff.net/blog/2007/09/20/password-length/
Are you asking about the crypt algorithm?
http://linux.die.net/man/3/crypt
"By taking the lowest 7 bits of each of the first eight characters of the key..."
"The glibc2 version of this function has the following additional features. ... The entire key is significant here (instead of only the first 8 bytes)."
Here's a hint as to how long ago this change happened.
Glibc 2 HOWTO
Eric Green, ejg3#cornell.edu
v1.6, 22 June 1998
You will find this article of interest. There is something called PAM (Pluggable Authentication Modules), which runs your password through a series of modules (configured in /etc/pam.d/passwd or /etc/pam.conf) to determine whether the password is valid or not.
I think it was around the time when actual passwords were moved from /etc/passwd to shadow on Linux. I am guessing around 2000; Red Hat 6.x had long passwords, IIRC. Around 2000 there were still a lot of old Sun systems, and they had password and username limits.

How can I write a program that can detect by itself that it has been changed?

I need to write a small program that can detect that it has been changed. Please give me a suggestion!
Thank you.
The short answer is to create a hash or key of the program and have the program encrypt and store that key within itself. From time to time the program would make a checksum of itself and compare it against that hash/key. If there is a difference then handle it accordingly.
There are lots and lots of ways to go about this. There are lots of very smart engineers out there that know how to work around it if that is what you are trying to avoid.
The simplest way would be to use a hash function to generate a short code which is a digest of the whole program and then check this.
It would be fairly easy to debug the code and replace the hash value to subvert this.
A better way would be to generate a digital signature using your private key and with the public key in the program to check it.
This would then require changing the public key and the hash as well as understanding the program, or changing the program code itself to subvert the check.
All you can do in the case described so far is make it more difficult to subvert but it will be possible with a certain amount of effort. I'd suggest looking into cryptographic techniques and copy protection for more information to suit your specific case.
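As a rough illustration of the signature approach using the openssl command-line tool: the function names and the ".sig" suffix are made up, and it assumes you already have an RSA key pair in PEM files. A sketch, not a hardened scheme.

```shell
#!/bin/sh
# Release side: sign FILE with the private key, writing FILE.sig.
# Usage: sign_file private.pem FILE
sign_file() {
    openssl dgst -sha256 -sign "$1" -out "$2.sig" "$2"
}

# Check side: verify FILE against FILE.sig with the public key.
# Exits non-zero if the file or the signature has been altered.
# Usage: verify_file public.pem FILE
verify_file() {
    openssl dgst -sha256 -verify "$1" -signature "$2.sig" "$2" >/dev/null 2>&1
}
```

Only the public key has to ship with the program, so an attacker cannot produce a matching signature for a modified binary. Patching out the check itself remains possible, as noted above.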
Do you mean that program 'foo' should be able to tell if some part of it was modified prior to or during run time? That's not the responsibility of the program; it's the responsibility of the security hooks in the target OS.
For instance, if the installed and trusted 'foo' has signature "xyz1234", the kernel should refuse to run a modified (or completely new) 'foo'. The same goes for 'foo' while it's running in memory. Look up 'Trusted Path Of Execution', aka TPE, to start.
A better question to ask would be how to sign your released version of 'foo', which depends upon your target platform.
Try searching for "code signing".
The easiest way would be for the program to detect its own md5 and store that in a separate file, but this isn't totally secure. An MD5 + CRC might work slightly better.
Or as others here have suggested, a sha1, sha2 or sha3 which are much more secure than md5 currently.
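For a script, the separate-file variant could look something like this. It is a minimal sketch using SHA-256 instead of MD5; the function name and the ".sha256" sidecar file are assumptions, and anyone who can edit both files can defeat it.

```shell
#!/bin/sh
# Compare the SHA-256 of a program with the digest stored in a
# separate file (by default "<program>.sha256").
# Exits zero only if a digest is recorded and still matches.
self_check() {
    script=$1
    expected_file=${2:-"$1.sha256"}
    actual=$(sha256sum < "$script" | awk '{ print $1 }')
    expected=$(awk '{ print $1; exit }' "$expected_file" 2>/dev/null)
    [ -n "$expected" ] && [ "$actual" = "$expected" ]
}
```

A script would call self_check "$0" near the top and bail out if it fails. The digest file is written once at release time, for example with sha256sum < script > script.sha256.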
I'd ask an external tool to do the check. This problem reminds me of the challenge to write a program that prints itself. In Bash you could do something like this:
#!/bin/bash
cat "$0"
which really asks for an external tool to do the job. It's kind of solving the problem by getting away from solving the problem...
The best option is going to be code signing, either using a tool supplied by your local friendly OS (for example, if you're targeting Windows, you probably want to take a look at Authenticode, where the operating system handles the tampering), or by rolling your own option that stores MD5 hashes and compares them.
It is important to remember that bets are off if someone injects a thread into your process (to potentially kill your ongoing checks, etc.), or if they tamper with your compiled application to bypass said checks.
An alternative way which wasn't mentioned is to use a binary packer such as UPX.
If the binary gets changed on the disk then the unpacking code is likely to fail.
This however doesn't protect you if someone changes the binary while it is in memory.
