Working with Linux 3.2, I would like to implement a UID algorithm using /dev/urandom.
There may be a chance of reading 16 random bytes twice, and getting the same result. But is the chance small enough to be negligible?
/dev/urandom is supposed to be a random device that should look uniformly random, and in a uniformly random sequence you would expect to find repeated patterns. However, since there are 2^128 possible 16-byte sequences, this should happen with probability 2^-128, which is vanishingly small.
That said, /dev/urandom is not known to be cryptographically safe and there may be attacks that aren't in the open literature to force the behavior to degenerate (perhaps some government agency knows how to do this, for example). From the man pages:
A read from the /dev/urandom device will not block waiting for more
entropy. As a result, if there is not sufficient entropy in the
entropy pool, the returned values are theoretically vulnerable to a
cryptographic attack on the algorithms used by the driver. Knowledge
of how to do this is not available in the current unclassified
literature, but it is theoretically possible that such an attack may
exist. If this is a concern in your application, use /dev/random
instead.
(My emphasis) Therefore, I wouldn't rely on this if you are trying to go for cryptographic security.
In short, if you just need random values, this is probably fine. If you want to go for cryptographic security, I would not recommend doing this.
Hope this helps!
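For illustration, here is a minimal Python sketch of the UID scheme from the question: read 16 bytes and hex-encode them. The function names are made up for this example, and the second function assumes a readable /dev/urandom device node.

```python
import os


def new_uid() -> str:
    """Return a 128-bit identifier as 32 lowercase hex characters."""
    # os.urandom asks the operating system for random bytes; on Linux this is
    # backed by the same kernel generator that feeds /dev/urandom.
    return os.urandom(16).hex()


def new_uid_from_device(path: str = "/dev/urandom") -> str:
    """Same idea, but reading the device node directly (assumes it exists)."""
    with open(path, "rb") as dev:
        raw = dev.read(16)
    if len(raw) != 16:
        raise OSError(f"short read from {path}: got {len(raw)} bytes")
    return raw.hex()


if __name__ == "__main__":
    print(new_uid())
```

Whether this is good enough depends on the collision and security considerations discussed in the answers, not on the code itself.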
You have a 1/2^128 chance of reading the same data, so yes, the probability is negligible. It is roughly the same as the probability of breaking the AES-128 encryption scheme.
Assuming the values are perfectly random, due to the Birthday Paradox the probability is approximately 2^-64 (the square root of the probability of getting any particular value). That is, at about 2^64 UIDs, the probability of finding a pair becomes greater than 50%.
For most applications that should be fine.
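To put numbers on the birthday bound, here is a rough Python sketch using the standard approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n values drawn uniformly from N possibilities. It assumes perfectly uniform, independent 128-bit values; the function name is just for this example.

```python
import math


def collision_probability(n: int, bits: int = 128) -> float:
    """Approximate chance that n uniform `bits`-bit values contain a duplicate."""
    space = 2 ** bits
    # Birthday approximation: p ~ 1 - exp(-n(n-1) / (2N)).
    exponent = -(n * (n - 1)) / (2 * space)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for n in (10 ** 9, 2 ** 32, 2 ** 64):
        print(f"n = {n}: p ~ {collision_probability(n):.3e}")
```

With this approximation the probability is about 39% at exactly 2^64 values and crosses 50% near 1.18 * 2^64, which is consistent with the "about 2^64" figure. For a billion UIDs it is on the order of 10^-21.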
How do I disable entropy sources?
Here's a little background on what I'm trying to do. I'm building a little RNG device that talks to my PC via USB. I want it to be the only source of entropy used. I'll use rngd to add my device as a source of entropy.
Quick answer is "you don't".
Don't ever remove sources of entropy. The designers of the random number generator rigged it so any new random bits just get mixed in with the current state.
Having multiple sources of entropy never weakens the random number generator's output; it only strengthens it.
The only reason I can think to remove a source of entropy is that it sucks CPU or wall-clock time that you cannot afford. I find this highly unlikely but if this is the case, then your only option is kernel hacking. As far as hacking the kernel goes, this should be fairly simple. Just comment out all the calls to the add_*_randomness() functions throughout the kernel source code (the functions themselves are found in drivers/char/random.c). You could just comment out the contents of the functions but you are trying to save time in this case and the minuscule time the extra function call takes could be too much.
One solution is to run a separate Linux instance in a virtual machine.
Additional note, too big for comment:
Depending on its settings, rngd can dominate the kernel's entropy pool,
by feeding it so much data, so often, that other sources of entropy are
mostly ignored or lost. Do not do that unless you trust rngd's source
of random data ultimately.
http://man.he.net/man8/rngd
I suspect you might want a fast random generator.
Edit I should have read the question better
Anyways, frandom comes with a complete tarball for the kernel module so you might be able to learn how to build your own module around your USB device. Perhaps, you can even have it replace/displace /dev/urandom so arbitrary applications would work with it instead of /dev/urandom (of course, given enough permissions, you could just rename the device nodes and 'fool' most applications).
You could look at http://billauer.co.il/frandom.html, which implements that.
Isn't /dev/urandom enough?
Discussions about the necessity of a faster kernel random number generator rise and fall since 1996 (that I know of). My opinion is that /dev/frandom is as necessary as /dev/zero, which merely creates a stream of zeroes. The common opposite opinion usually says: Do it in user space.
What's the difference between /dev/frandom and /dev/erandom?
In the beginning I wrote /dev/frandom. Then it turned out that one of the advantages of this suite is that it saves kernel entropy. But /dev/frandom consumes 256 bytes of kernel random data (which may, in turn, eat some entropy) every time a device file is opened, in order to seed the random generator. So I made /dev/erandom, which uses an internal random generator for seeding. The "F" in frandom stands for "fast", and "E" for "economic": /dev/erandom uses no kernel entropy at all.
How fast is it?
Depends on your computer and kernel version. Tests consistently show 10-50 times faster than /dev/urandom.
Will it work on my kernel?
It most probably will, if it's > 2.6
Is it stable?
Since releasing the initial version in fall 2003, at least 100 people have tried it (probably many more) on i686 and x86_64 systems alike. Successful test reports have arrived, and zero complaints. So yes, it's very stable. As for randomness, there haven't been any complaints either.
How is random data generated?
frandom is based on the RC4 encryption algorithm, which is considered secure, and is used by several applications, including SSL. Let's start with how RC4 works: It takes a key, and generates a stream of pseudo-random bytes. The actual encryption is a XOR operation between this stream of bytes and the cleartext data stream.
Now to frandom: Every time /dev/frandom is opened, a distinct pseudo-random stream is initialized by using a 2048-bit key, which is picked by doing something equivalent to reading the key from /dev/urandom. The pseudo-random stream is what you read from /dev/frandom.
frandom is merely RC4 with a random key, just without the XOR in the end.
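To make "RC4 with a random key, without the XOR" concrete, here is a user-space Python sketch of the idea. It is not the frandom module's code: the function names are made up for this example, and os.urandom(256) stands in for the 2048-bit key the FAQ says is picked per open. RC4 has since been deprecated for cryptographic use (it is prohibited in TLS), so treat this purely as an illustration of the mechanism.

```python
import os
from typing import Iterator


def rc4_keystream(key: bytes) -> Iterator[int]:
    """Yield the RC4 keystream bytes for `key` (1 to 256 bytes)."""
    if not 1 <= len(key) <= 256:
        raise ValueError("RC4 key must be 1..256 bytes long")
    # Key-scheduling algorithm: permute 0..255 under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm: emit one byte per step.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def read_random(n: int) -> bytes:
    """Mimic one open-and-read: fresh 2048-bit key, then n raw keystream bytes."""
    stream = rc4_keystream(os.urandom(256))
    return bytes(next(stream) for _ in range(n))


if __name__ == "__main__":
    print(read_random(16).hex())
```

An encryptor would XOR these bytes with the cleartext; here they are returned as-is, which is the whole trick.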
Does frandom generate good random numbers?
Due to its origins, the random numbers can't be too bad. If they were, RC4 wouldn't be worth anything.
As for testing: Data directly "copied" from /dev/frandom was tested with the "Diehard" battery of tests, developed by George Marsaglia. All tests passed, which is considered to be a good indication.
Can frandom be used to create one-time pads (cryptology)?
frandom was never intended for crypto purposes, nor was any special thought given to attacks. But there is very little room for attacking the module, and since the module is based upon RC4, we have the following fact: Using data from /dev/frandom as a one-time pad is equivalent to using the RC4 algorithm with a 2048-bit key, read from /dev/urandom.
Bottom line: It's probably OK to use frandom for crypto purposes. But don't. It wasn't the intention.
Is 1024-bit RSA secure, or is it crackable now? Is it safe for my program to use 1024-bit RSA? I read at http://pcworld.about.com/od/privacysecurity1/Researcher-RSA-1024-bit-encry.htm that 1024-bit encryption is insecure, but I find 2048-bit slower, and I also see that various https sites (even PayPal) use 1024-bit encryption. Is 1024-bit encryption secure enough?
Last time I checked, NIST recommends 2048-bit RSA and predicts that it will remain secure until 2030. Page 67 of this PDF has the table.
Edit: They actually predict 1024-bit is OK until 2010, then 2048-bit until 2030, then 3072-bit after that. And it's NIST, not the NSA. Been too long since I did my thesis, LOL.
What are you trying to protect? If you are encrypting something that is not terribly vital, then 1024 may be fine, but, if you are protecting something that is very vital, such as someone's medical or financial info then 4096 bits would be better.
The size of the key really depends on what you are protecting, and how long you expect the encryption to hold. If your timeframe is that the info is only valid for 10 mins then 1024 works fine, for 10 years of protection it isn't.
So, what are you protecting?
There is no easy answer to the question "is size n secure ?" because it depends on the resources of an expected attacker. This has two parts:
Resources that the attacker is willing to invest heavily depend on the situation: defeating your grandmother, a bored computer-science student, or the full secret service of some big, rich country, does not involve the same attack power. It also depends on the perceived value of the protected data.
When designing the system, you want some margin of security, which means that you will make some prophecies on how computing power will evolve in the future, and this raises the difficult question of the notion of cost.
So there are several estimates which have been proposed by various researchers and government institutes. This site offers a survey of such methods, with online calculators so that you may play a bit with some of the input parameters.
Short answer is that if you want short-term security (i.e. security is not relevant beyond, say, year 2015) and 1024 bits are not enough for you, then your enemies must be very powerful indeed. Scarily so. To the point that you should have other, more urgent trouble on your hands.
It is necessary to define the meaning of secure to get a useful answer.
Is your house secure? Mostly we make it "good enough." For example, making it harder to break in than the neighbors is often adequate. That way the thieves spend time trying to break into next door rather than your place.
It might be secure if it requires X hours to break in and the valuable content is worth Y. Converting time to money is tricky, but if it takes a cracker 100 hours of his time to break in, and the contents of your information is worth, say $100, then your data is probably secure enough.
Nothing is going to be totally secure forever. If you're that worried about it, just use 2048-bit and sacrifice speed for better security.
Besides, as the article states:
But determining the prime numbers that make up a huge integer is nearly impossible without lots of computers and lots of time.
It all depends on whether or not you think people will actually try that hard to get at whatever information you're trying to protect.
Found a recent paper addressing exactly this question:
On the Security of 1024-bit RSA and
160-bit Elliptic Curve Cryptography
version 2.1, September 1, 2009
http://eprint.iacr.org/2009/389.pdf
It is said that 1024-bit numbers cannot currently be factored, but 1024-bit RSA (a modulus of about 310 decimal digits) is not considered secure enough. It is advisable to use RSA with 2048 bits or more if one needs long-term security. There are many well-funded research organizations working on this, and there is a chance that they do not share everything they find. So I think we can say it is not secure at all. If I had to encrypt important data one day, I would prefer 2048 bits or more, considering long-term security and the unknown developments in that field.
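As a quick check of the "about 310 decimal digits" figure, this small Python sketch counts the decimal digits of the largest number that fits in a given bit length. The function name is just for this example.

```python
def decimal_digits(bits: int) -> int:
    """Number of decimal digits in the largest integer that fits in `bits` bits."""
    return len(str(2 ** bits - 1))


if __name__ == "__main__":
    for bits in (1024, 2048, 3072):
        print(bits, "bits ->", decimal_digits(bits), "decimal digits")
```

A 1024-bit modulus has at most 309 decimal digits and a 2048-bit one at most 617, so doubling the bit length roughly doubles the number of digits an attacker has to factor.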
For decades, in the field of computing (except disk manufacturers), a KB (kilobyte) was understood to mean 1024 bytes. In the past few years, there has been a movement to use KiB ("kibibyte") to mean 1024 bytes, and change the meaning of kilobyte to be 1000 bytes, dooming us to many more years of confusion. On the other hand, the movement seems to be confined to Gnome, and some overzealous wikipedia editing.
Will you be converting your programs to use KiB? If you have ever displayed a filesize in KB, did you divide by 1000 or 1024?
KB is 1024 bytes, damnit.
I did this once before in an app. While internally it used kibbi's and mebbi's (KiB, MiB, etc), it would still display in what users (in this case IT folks) were used to. The underlying field was just a long that was in bytes IIRC.
It was forward compatible, and would at least allow you to enter 4 GB as well as 4GiB. It also understood shorthand entry like 4.5G and properly rounded back to the real number of bytes - not forcing poor user to have to enter it that way and prevent their mistakes. Updating to use IEC notation is 1 line of code.
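Here is a rough Python sketch of that kind of forgiving input parsing. It is not the original app's code: the function name and the regular expression are made up for this example, and it assumes that "G", "GB" and "GiB" all mean 1024^3 bytes, as in the internal representation described above.

```python
import re

# Exponent of 1024 for each accepted prefix letter.
_POWERS = {"": 0, "K": 1, "M": 2, "G": 3, "T": 4}

# A number, an optional prefix (K, M, G, T, optionally followed by "i"),
# and an optional trailing "B". Case is ignored.
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGT]I?)?B?\s*$", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Turn input such as '4 GB', '4GiB' or '4.5G' into a whole number of bytes."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    prefix = (unit or "")[:1].upper()
    # Round so that fractional input still yields an integer byte count.
    return round(float(number) * 1024 ** _POWERS[prefix])


if __name__ == "__main__":
    for sample in ("4 GB", "4GiB", "4.5G", "512"):
        print(sample, "->", parse_size(sample))
```

Storing the result as a plain byte count keeps the unit question a display concern only.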
kilo's are 1000 and 98% of the world uses metric. We need to get over it already.
I see a lot of anger in many of these responses, which baffles me. SI prefixes are SI prefixes, and programmers have no right to alter them for no better reason than convenience and custom. It's odd that those in Computer Science, a highly technical field, are the ones clamoring to go back to the days of cubits, furlongs and rods. wtf?
We all know what we mean, but sticking to custom alienates and confuses users. Just because in the early pioneer days some guys, when talking about computer memory, decided to reuse SI notation doesn't mean they were correct to do so.
In my opinion, 1 Kilobyte equals to 1000 bytes is something drivemakers want you to believe, so that your drive looks more spacious than it really is. ;)
Since I spent a few years learning to be a mechanical engineer before switching majors, I have to admit that "kilo" always means 10^3 to me. From that standpoint, KiB makes sense. However, try saying "kibibyte" out loud a few times, and think about how dumb you sound.
Therefore, kilogram is 1000 grams, kilobyte is 1024 bytes.
Addendum: In addition, I agree with those who have been saying that we shouldn't change what is already established if it works. 1024 is simply a nicer number in binary. Also, "kibibyte" still sounds like something a dog eats.
It's not changing the meaning of "kilobyte". Kilo means 1000. Some people were using it incorrectly to refer to units of 1024 bytes.
I never display file sizes in kibibytes, because users don't care about 1000 vs 1024. Instead, I always use "XXX KB/MB/GB", where XXX is the number of bytes divided by 1 thousand / 1 million / etc.
There are 2 ways to think about this:
Use what the operating system you're running on uses. That way users have a consistent experience.
Use what is correct.
If you use KiB always though there will be no confusion. If you use KB there will be confusion. So if you chose option #2 then you're better off actually using 1024 and using the KiB suffix. Working with powers of 2 is more efficient anyway.
It's up to you but my rule of thumb would be that if you have a technical audience, then use KiB and avoid all confusion. If you have a large user base of non technical users, then use what your operating system uses. By the way Windows uses KB to mean 1024 bytes.
Areas of speciality have always used terms in ways that are understood by that specialisation. For example, a mechanical engineer building a bridge uses the term "stress" to mean something completely different from, say, a lawyer who finds out his star witness has been lying on the first day in court. Should we mandate that the engineer use the same definition for "stress" as the lawyer just because that definition is more widely used? If we do, I'm not driving across that bridge!
Kilobytes = 1024 bytes. It's an industry-accepted specialisation of the term.
I use KiB.
Do you really want to hurt everyone by refusing to use well-established standards just like IE?
I've always displayed file size in 1000-byte Kilobytes. It hardly ever matters to the people who can't tell the difference, and often relieves confusion when they see the actual number. 65323 bytes = 65Kb when rounded, and the "normal" people like that.
I probably won't ever display "KiB", since that's never what my customers want.
The arrogance of deciding not to follow the standard created by more than just the computer community (see... it isn't "new" that Kilo actually means 1000) is staggering.
Only if the situation called for it. In almost all cases, 1,000-based units are more appropriate.
The only exceptions I know of are memory, since it naturally comes in multiples of a power of two, and CD size, since it's measured in multiples of 2^20 bytes by the manufacturers. Everything else, including hard drives, DVDs, flash drives, bandwidths, processor speeds, memory buses, etc. is currently measured in 1000s, and file sizes should be, too. (Or, at least, me and Steve Jobs think so. Windows will probably continue measuring file sizes in 1024s for years...)
To avoid confusing the user, use k- = 1,000, and Ki- = 1,024.
The sloppy usage of "k" to mean 1024 is an unholy abomination that should be killed with fire.
Mac OS X doesn't use KiB, MiB, GiB. On the other hand, when it uses the metric ones, it at least does the maths correctly:
Personally I prefer to get this stuff right so that users who are currently in the dark would learn from it. Waiting for users to change first is just foolish. Users didn't suddenly wake up some day and think that a kilobyte is 1024 bytes - it was software which made them think that, so shouldn't it be software's job to correct the mistake?
I've worked in the storage industry for a decade. Arguments over the size of a TB can vary the size of a solution by 10%. In short: programmers and the storage industry use different measurements. Neither are right all the time.
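The 10% figure comes straight from the arithmetic: 1024^4 / 1000^4 is about 1.0995. A small Python sketch (the names are just for this example) shows how the gap between the two conventions grows with each prefix:

```python
PREFIXES = ["K", "M", "G", "T", "P"]


def binary_vs_decimal_gap(power: int) -> float:
    """Fraction by which 1024**power exceeds 1000**power."""
    return 1024 ** power / 1000 ** power - 1


if __name__ == "__main__":
    for power, prefix in enumerate(PREFIXES, start=1):
        print(f"{prefix}B: binary unit is {binary_vs_decimal_gap(power):.1%} larger")
```

The gap is about 2.4% at the kilobyte level and about 10% at the terabyte level, so the disagreement matters more as capacities grow.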
The Storage Networking Industry Association (SNIA) dictionary defines kilobyte as:
Kilobyte (KB)
[General] 1,000 (10^3) bytes.
The SNIA uses the 10^3 convention commonly found in storage and data transfer-related literature rather than the 1,024 (2^10) convention common in computer system random access memory and software literature.
My rule of thumb is:
Measure memory, files, file systems, and data on a network in 1024^n byte blocks.
Measure raw disk space — and only raw disk space — in 1000^n byte blocks.
Tell the customer which unit you're using. Repeat yourself often.
By and large, that keeps me out of trouble.
One program I'm working on uses "KiB" by default, but has a user preference as to which unit of measurement to use (1024 B in a KiB, 1024 B in a KB, or 1000 B in a KB).
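A preference like that could look roughly like the following Python sketch. This is not that program's code: the UnitStyle enum, the labels and format_size are made up for this example.

```python
from enum import Enum


class UnitStyle(Enum):
    """The three display conventions: (divisor, unit labels)."""
    IEC = (1024, "KiB MiB GiB TiB")        # 1024 B in a KiB
    BINARY_KB = (1024, "KB MB GB TB")      # 1024 B in a KB
    DECIMAL_KB = (1000, "KB MB GB TB")     # 1000 B in a KB


def format_size(num_bytes: int, style: UnitStyle = UnitStyle.IEC) -> str:
    """Format a byte count using the chosen convention, one decimal place."""
    base, labels = style.value
    value = float(num_bytes)
    unit = "B"
    for label in labels.split():
        if value < base:
            break
        value /= base
        unit = label
    # Plain byte counts are shown exactly, without a fractional part.
    return f"{num_bytes} B" if unit == "B" else f"{value:.1f} {unit}"


if __name__ == "__main__":
    for style in UnitStyle:
        print(style.name, format_size(65323, style))
```

Note that the second and third styles print the same label for different quantities, which is exactly the ambiguity this thread is arguing about.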
No. 1024 bytes is a kilobyte, regardless of whether that makes sense.
The usage of the "kilo-" prefix for units of 1024 bytes back in the day was probably a mistake. But it's now the standard. Trying to change it now only adds to the confusion.
We don't deal with the world as it should be; we deal with the world as it is.
Technically KiB is correct, but I have seen it only in a few applications (mainly Linux console apps). Users are either used to working with 1024 for both KB and KiB (IT people) or they don't really care and will think that "KiB" is misspelled (non-IT people).
However: I have been used to working with "Kilobytes = 1024 bytes" for over 20 years now, and even though I know that it is scientifically wrong, I will go on using it.
If you need to provide KiB to allow your soul to rest, make it available as an option, but don't confuse poor users with yet another definition - especially if they work with an OS, that uses the non-scientific approach and defines KB as 1024.
(BTW: Kibibytes always reminds me of Tinky Winky and his friends... ;) )
I tried to start using these terms when teaching my students, but I've sort of given up now.
I've taught an introductory computer course ("and this is a disk drive") a few times, and it can be confusing for the students that the prefixes mean different things in different contexts. Kilo means 1024 when you have a kilobyte or a kilobit of data, except if you store it on disk when it is 1000, and if you send a kilobit per second over a network then it is 1000, and a kilohertz is of course 1000 too. And one kilometer of fiber cable is 1000 meters! But it turns out that it really isn't that much of a problem. The engineering and computer science students need to know the difference, and they will get used to it anyway. When I meet them again in database courses or in the compiler course, there is never any confusion about the different kinds of kilos, megas and teras. And students from other areas (media design and so on) don't really care.
And after I did an informal poll among the other computer science people in my corridor at the university, and found out that most of them had never heard of these new prefixes, I definitely gave up.
A KB is 1024 bytes
A kB is 1000 bytes
Unfortunately, the spelled-out form is ambiguous. I always use 1024.
Knuth refers to MB as KKBytes or kkBytes to differentiate between 1024*1024 and 1000*1000
I have honestly never heard of this & I doubt it's going to gain much traction in mainstream usage. I can't imagine why I would want to start doing this. The current definition of kilobyte is accurate & sufficient. I would much rather see hard drive manufacturers start using accurate terminology rather than further dumb-down technical terminology. Why can't manufacturers either build drives that are exactly xGB in size or simply say what they really are?
Other than rants about how the terminology needs to change, I have never heard those expressions used. It is not going to catch on.
I'm still going by measurements of 2^(10*n) until computers are based on decimal...
Kilo means 10^3 when you're working in the decimal number system.
Kilo means 2^10 when you're working in the binary number system.
I mean, just look at it... they're both quite arbitrary. It seems to me that anything else is equally arbitrary - so we have 40-year entrenched arbitrary versus brand-new arbitrary. Which should win? For now, I vote for the entrenched method, simply because it will cause less total confusion.
At some point our technology is bound to change - think quantum/genetic computers - that point will be a good opportunity to sanitize our measuring system.
Also, some users will always be confused - should we remove confusion for them at the risk of confusing the community that makes it all happen (us and the hardware guys)? I think not.
For me, this is a bit like the 'hacker' arguments we had, back in the day.
Depending on how old and stubborn you are, 'hacker' may mean a different thing to you. For a while in the media (and probably still today, partly) people consider hacking to be the act of breaking into machines illegally. However, in the industry now, the feeling people get is that it is someone who enjoys tinkering with things.
For a while the security community wasn't sure if this would take off, and we actually tried to use 'cracker' to refer to the bad guys. I don't think cracker has really taken off like we'd like, but we have reclaimed 'hacker' as a legitimate term, to quite a reasonable degree of success.
So to me this is the same: just because the media has tried to consider a KB as 1,000, I will never back down, and always stand up for the rights of the remaining 24 bits.
24bFL
Drivemaker/denary Kilobytes can burn in hell. Binary units for binary machines.