I am reading BEP 5 and trying to understand how a token value is generated. As I understand it, the token is a randomly generated value that is handed out in response to a get_peers query for safety. The same token would then be sent in an announce_peer query so the queried node can check that the same IP previously requested the same infohash.
My question is how is this value generated exactly? It says something about an unspecified implementation - does this mean I can implement it myself (for example by using the SHA-1 value)?
I tried looking at other BEPs but couldn't find any specific rules for generating a token value.
The token represents a write permission so that the other node may follow up with an announce request carrying that write permission.
Since the write permission is specific to an individual node providing the token it is not necessary to specify how it keeps track of valid write permissions, as there needs to be no agreement between nodes how the implementation works. For everyone else the token is just an opaque sequence of bytes, essentially a key.
Possible implementations are
(stateful) keep a hashmap mapping from supplied tokens to their expiration time and which remote IP it is valid for.
(stateless) hash a secret, the remote IP, the remote ID and a validity-time-window counter, then truncate the hash. Bump the counter on a timer. When verifying, check against both the current and the previous counter.
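The stateless variant can be sketched roughly as follows. This is a minimal illustration, not something BEP 5 prescribes: the 5-minute window, the 8-byte truncation, the use of SHA-1 and the names make_token and verify_token are all assumptions made for the example.

```python
import hashlib
import hmac
import os
import time

SECRET = os.urandom(20)   # per-node secret, never sent over the wire
WINDOW_SECONDS = 300      # assumed validity window (5 minutes)
TOKEN_BYTES = 8           # truncated token length


def window_counter(now=None):
    """Return the index of the time window that `now` falls into."""
    now = time.time() if now is None else now
    return int(now // WINDOW_SECONDS)


def make_token(remote_ip, remote_id, counter):
    """Hash secret + remote IP + remote ID + counter, then truncate."""
    h = hashlib.sha1()
    h.update(SECRET)
    h.update(remote_ip.encode("ascii"))
    h.update(remote_id)
    h.update(counter.to_bytes(8, "big"))
    return h.digest()[:TOKEN_BYTES]


def verify_token(token, remote_ip, remote_id, now=None):
    """Accept tokens issued in the current or the previous window."""
    current = window_counter(now)
    return any(
        hmac.compare_digest(token, make_token(remote_ip, remote_id, c))
        for c in (current, current - 1)
    )
```

Nothing has to be stored per token: the node recomputes the hash when the announce arrives, so memory use stays constant no matter how many tokens were handed out.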
Since a token is only valid for a few minutes, and a node should also have a spam throttle, it doesn't need to be high strength, just enough bits to make brute-forcing impractical within that time. 6-8 bytes is generally enough for that purpose.
The underlying goal is to hand out a space-efficient, time-limited write permission to individual nodes in a way that other nodes can't forge.
There are a few Time To mechanisms that deserve to have some light shone upon them, as they can be quite useful to a developer. Below I explain what they can be used for, and why, on the #platform.
Time To Mechanisms (Attributes of Metadata)
Any data that is shared between #signs can go through several mechanisms. Some of these mechanisms include TTR (Time To Refresh), TTL (Time To Live), and TTB (Time To Birth).
Time To Refresh
TTR, which is an attribute of the metadata of a shared key, accepts an integer value which represents seconds. The subsequent refresh happens based on the given value: for example, if the set TTR value is 86400, then the refresh happens once in a day (there are 86,400 seconds in a day). Another very important attribute of the metadata is CCD (Cascade Delete), which is a boolean variable (a variable that accepts true or false values). For those who are well versed in SQL and database management, you will already have some understanding of what CCD does and how it functions.
If the CCD value is set as true when the sender deletes their original key, the cached key gets deleted on both the sender’s server and the recipient’s server. Correspondingly, if the CCD value is false when the sender deletes their original key, the cached key gets deleted on only the sender’s server and remains cached on the recipient’s server. But why is this useful? CCD is used to avoid unnecessary network calls. As an example: if #alice is in need of #bob’s phone number, she does not need to make a request from her server to #bob’s server to find it, but rather needs only to search locally on her device to find the phone number.
Let’s consider a similar example: #alice shares her phone number with her friends #bob and #john. A few months later, however, #alice purchases a new phone plan, resulting in a new phone number. If #alice has a TTR value set on the shared key, once she updates her old phone number to match her new one, this updated value will also be reflected on #bob and #john’s devices. The TTR value is the specific time, in seconds, after which the new phone number will be refreshed on shared servers. This can be 10 minutes, a day, or whatever specific amount of time she defines.
This function can be quite handy, especially if someone is constantly updating values on their server. This prevents a high density of calls and requests whenever someone wishes to see what new values exist on a shared server.
Time To Live
TTL (Time To Live) is quite self-explanatory: it defines how long data will live on a server. Anyone with an #sign has the ability to upload information on their server and define how long it stays on the server before it is automatically deleted. If #alice wishes to share her summer vacation getaway location as her current location, she has the option to share that summer vacation location for as long as she plans on being there!
To really take advantage of a mechanism like this, developers can combine it with other Time To commands to make life for themselves and those they share their information with easier. Say for instance Alice lives in sunny San Francisco, and owns a vacation home in Spain. With mechanisms such as Time To Refresh and Time To Live, Alice has the ability of travelling to her vacation home for several weeks, uploading her current location as Spain, and setting that information to live on her server for the several weeks that she will be staying at that location.
Time To Birth
Another Time To mechanism that is utilized within the #protocol is the Time To Birth mechanism. This mechanism allows individuals to upload information to their secondary server and have it become activated after a specified amount of time, in seconds. During the time that the data is not ‘active’, any recipients of this information will see the ‘null’ value in place until the activation has occurred.
For example, if #alice wishes to upload the URL of her personal website only after she has completed it, she can simply specify that the URL value becomes active on her secondary server in exactly one day’s time. Until the value is activated a day later, #bob can only see that her website URL is ‘null’.
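To make the interaction between TTB and TTL concrete, here is a small sketch. It is purely illustrative and does not use the #platform's real API: the SharedKey class, its field names and the choice to count TTL from the moment of birth are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedKey:
    """Hypothetical model of a shared key and its Time To metadata."""
    value: str
    created_at: float          # seconds, when the key was uploaded
    ttb: int = 0               # Time To Birth, in seconds
    ttl: Optional[int] = None  # Time To Live, in seconds (None = forever)

    def visible_value(self, now):
        """Return what a recipient sees at time `now` (None stands for 'null')."""
        born_at = self.created_at + self.ttb
        if now < born_at:
            return None  # not yet active
        if self.ttl is not None and now >= born_at + self.ttl:
            return None  # expired and deleted
        return self.value


# #alice uploads her URL with a one-day TTB and a one-hour TTL.
url = SharedKey("https://alice.example", created_at=0, ttb=86400, ttl=3600)
print(url.visible_value(100))    # before birth: None
print(url.visible_value(86400))  # active: the URL
```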
I am implementing a magic link/passwordless authentication.
I am sending an email with a token generated via crypto.randomBytes. When the user clicks on the link, they are redirected to the app and the token is validated to make sure it is unique.
Does the number of bytes matter, and if yes what would be a good number?
token is validated to make sure it is unique
You could also validate that it has not yet expired (give the token a limited validity period).
Does the number of bytes matter, and if yes what would be a good number?
In security, size does matter. A random value is considered infeasible to guess if it is 128 bits long (= 16 bytes), or 256 bits (= 32 bytes) for a safe margin.
You may also add an integrity/authentication check, such as a signature or HMAC, if you use a simple random number generator (not from a serious crypto library) or a counter.
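The two suggestions above (enough random bytes, plus an expiry) can be sketched as follows. This is a Python illustration rather than the Node.js code from the question; the 15-minute lifetime, the in-memory dictionary and the function names are assumptions, and a real application would keep pending tokens in its database.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 15 * 60   # assumed validity period
_pending = {}                 # token -> expiry timestamp (in-memory for the sketch)


def issue_token(now=None):
    """Create a 256-bit random token and remember when it expires."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    _pending[token] = now + TOKEN_TTL_SECONDS
    return token


def redeem_token(token, now=None):
    """Return True once for a known, unexpired token; False otherwise."""
    now = time.time() if now is None else now
    expires_at = _pending.pop(token, None)  # pop makes the token single-use
    return expires_at is not None and now < expires_at
```

Removing the token on first use means a link that leaks later (for example through browser history) can no longer be replayed.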
I am generating a session key to be stored in a cookie using the following function:
function getRandomKey($length=32) {
$string = '';
$characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
for ($i = 0; $i < $length; $i++) {
$string .= $characters[mt_rand(0, strlen($characters)-1)];
}
return $string;
}
If I were to generate a 1-character key it would have:
26 lowercase + 26 uppercase + 10 digits (0-9) = 62 options.
Therefore an 8-character key would have 62^8 or 218,340,105,584,896 possible combinations.
1) Is there any rule of thumb on how many characters out I should go? The more the better, I know, but is 8 enough or should it be more like 32 characters, 64 etc.?
2) Are there any security concerns when using localStorage?
Thanks in advance!
These are two very different questions.
1) TL;DR: about 16 characters (case-sensitive) is ok for most purposes.
First, please if you can, avoid implementing session management. It is already done in many frameworks, including session id generation and more - use an existing, well-known implementation if you can, because it is not straightforward to get it right.
Now, it's all about entropy. You started out right by calculating the number of possible combinations. If you take log2 of that, you get how many bits of entropy that session id has. (Well, let's not go into entropy here...)
So one case-sensitive alphanumeric character ([a-zA-Z0-9]) has log2(62)=5.9542 bits of entropy, two characters two times more, and so on.
The time required for an attacker to guess a valid session id is:
(2^b + 1) / (2 * n * s)
Where 'b' is the available bits of entropy in the session id, 'n' is the number of guesses the attacker can make every second, and 's' is the number of valid session ids in the system.
In a large, distributed web application, potentially using a botnet, an attacker may be able to make n=100000 guesses a second, and there may be s=1 million valid session ids. You want the result to be several hundred years at the very least, say 300 (9460800000 seconds). (These are totally arbitrary values.)
This gives about b=70, so you need 70 bits of entropy. If each character has 5.9542 bits of entropy as discussed above, it gives about 12 for the required session id length, but you can just round it up to 16 to make sure. :)
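The calculation can be reproduced with a few lines of Python. The function name is made up for the example, and the inputs are the same arbitrary values as above, with the time target taken as 300 years.

```python
import math


def required_entropy_bits(seconds, guesses_per_second, valid_ids):
    """Solve (2^b + 1) / (2 * n * s) = t for b."""
    return math.log2(2 * guesses_per_second * valid_ids * seconds - 1)


BITS_PER_CHAR = math.log2(62)        # one character from [a-zA-Z0-9]
seconds = 300 * 365 * 24 * 3600      # 300 years

bits = required_entropy_bits(seconds, 100_000, 1_000_000)
chars = math.ceil(bits / BITS_PER_CHAR)
print(round(bits, 1), chars)         # roughly 70.7 bits, 12 characters
```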
As a rule of thumb, it is sometimes assumed that the bits of entropy in a session id is half the length (in bits) of that session id. It is mostly a reasonable approximation without any calculation. :) Even more so because session ids are sometimes actual random numbers that are base64 or otherwise encoded. Different encodings usually give different results though.
Also make sure to use a cryptographic random number generator, otherwise entropy is much less. Note that mt_rand() is not cryptographically random, so the code in your question is vulnerable!
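For comparison, here is what the same key generator looks like when it draws from a cryptographic source. It is shown in Python as a sketch of the idea rather than as a drop-in replacement; in PHP 7 and later the equivalent change would be to call random_int() instead of mt_rand().

```python
import secrets
import string

# Same 62-character alphabet as in the question.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def get_random_key(length=32):
    """Build a key from characters chosen by the OS's CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(get_random_key(16))
```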
2) TL;DR Yes. (I suppose you mean using local storage for storing the session id.)
The best possible place to store a session id is an HttpOnly, Secure cookie without an expiration (non-persistent), because JavaScript cannot access it there (so cross-site scripting at least doesn't expose a victim user's session id), and being non-persistent, it will be removed when the user closes the browser and will not be persisted to disk (well, mostly... but that's a long story).
If you use localStorage, any XSS will directly affect the session id, which is very valuable for an attacker. Also sessions will survive closing the browser, which is slightly unexpected: user sessions might easily be hijacked on shared computers.
Note though that this depends on the use-case and the risk you want to take. While it would definitely not be ok for a financial application where you can access and manage very sensitive data, it can be ok for less risky applications. You can also let the user decide ("remember me", in which case you put it into localStorage), but most users are not aware of the associated risk, so they can't make an informed decision.
Also note that sessionStorage is a little better, because the session id will be removed from the browser when it is closed, but it is still available to Javascript (XSS).
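As an illustration of the cookie attributes recommended above, this sketch builds the value of a Set-Cookie header with Python's standard http.cookies module. The cookie name session_id is an assumption; most frameworks expose the same flags through their own session configuration.

```python
import secrets
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = secrets.token_urlsafe(16)  # 128 random bits
cookie["session_id"]["httponly"] = True           # hidden from JavaScript
cookie["session_id"]["secure"] = True             # only sent over HTTPS
cookie["session_id"]["path"] = "/"
# No "expires" or "max-age" is set, so the cookie is non-persistent.

header_value = cookie["session_id"].OutputString()
print("Set-Cookie:", header_value)
```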
This question has always troubled me.
On Linux, when asked for a password, if your input is the correct one, it checks right away, with almost no delay. But, on the other hand, if you type the wrong password, it takes longer to check. Why is that?
I observed this in all Linux distributions I've ever tried.
It's actually to prevent brute force attacks from trying millions of passwords per second. The idea is to limit how fast passwords can be checked and there are a number of rules that should be followed.
A successful user/password pair should succeed immediately.
There should be no discernible difference in reasons for failure that can be detected.
That last one is particularly important. It means no helpful messages like:
Your user name is correct but your password is wrong, please try again
or:
Sorry, password wasn't long enough
Not even a time difference in response between the "invalid user and password" and "valid user but invalid password" failure reasons.
Every failure should deliver exactly the same information, textual and otherwise.
Some systems take it even further, increasing the delay with each failure, or only allowing three failures then having a massive delay before allowing a retry.
This makes it take longer to guess passwords.
I am not sure, but it is quite common to add a delay after a wrong password is entered to make attacks harder. This makes an attack practically infeasible, because it takes a long time to check even a few passwords.
Even trying a few passwords - birthdates, the name of the cat, and things like that - is turned into no fun.
Basically to mitigate against brute force and dictionary attacks.
From The Linux-PAM Application Developer's Guide:
Planning for delays
extern int pam_fail_delay(pam_handle_t *pamh, unsigned int micro_sec);
This function is offered by Linux-PAM to facilitate time delays following a failed call to pam_authenticate() and before control is returned to the application. When using this function the application programmer should check if it is available with,
#ifdef PAM_FAIL_DELAY
....
#endif /* PAM_FAIL_DELAY */
Generally, an application requests that a user is authenticated by Linux-PAM through a call to pam_authenticate() or pam_chauthtok(). These functions call each of the stacked authentication modules listed in the relevant Linux-PAM configuration file. As directed by this file, one or more of the modules may fail causing the pam_...() call to return an error. It is desirable for there to also be a pause before the application continues. The principal reason for such a delay is security: a delay acts to discourage brute force dictionary attacks primarily, but also helps hinder timed (covert channel) attacks.
It's a very simple, virtually effortless way to greatly increase security. Consider:
System A has no delay. An attacker has a program that creates username/password combinations. At a rate of thousands of attempts per minute, it takes only a few hours to try every combination and record all successful logins.
System B generates a 5-second delay after each incorrect guess. The attacker's efficiency has been reduced to 12 attempts per minute, effectively crippling the brute-force attack. Instead of hours, it can take months to find a valid login. If hackers were that patient, they'd go legit. :-)
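The arithmetic behind System B is easy to check. In the sketch below the function names and the one-million-entry dictionary are made up for illustration, and a single sequential attacker is assumed.

```python
def attempts_per_minute(delay_seconds):
    """Upper bound on guesses per minute when each failure costs a delay."""
    return 60 / delay_seconds


def days_to_try(candidates, delay_seconds):
    """Days needed to work through a list of candidate passwords."""
    return candidates * delay_seconds / 86_400  # 86,400 seconds per day


print(attempts_per_minute(5))              # 12.0
print(round(days_to_try(1_000_000, 5)))    # about 58 days
```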
Failed authentication delays are there to reduce the rate of login attempts. The idea is that if somebody is trying a dictionary or brute-force attack against one or many user accounts, the attacker has to wait out the fail delay on every attempt, which forces them to take more time and gives you more chance to detect the attack.
You might also be interested in knowing that, depending on what you are using as a login shell there is usually a way to configure this delay.
In GDM, the delay is set in the gdm.conf file (usually in /etc/gdm/gdm.conf). You need to set RetryDelay=x where x is a value in seconds.
Most Linux distributions these days also support having FAIL_DELAY defined in /etc/login.defs, allowing you to set a wait time after a failed login attempt.
Finally, PAM also allows you to set a nodelay attribute on your auth line to bypass the fail delay. (Here's an article on PAM and linux)
I don't see that it can be as simple as the responses suggest.
If response to a correct password is (some value of) immediate, don't you only have to wait until longer than that value to know the password is wrong? (at least know probabilistically, which is fine for cracking purposes) And anyway you'd be running this attack in parallel... is this all one big DoS welcome mat?
What I tried before appeared to work, but actually did not; if you care you must review the wiki edit history...
What does work (for me) is to both lower the value of pam_faildelay.so delay=X in /etc/pam.d/login (I lowered it to 500000 microseconds, half a second), and also add nodelay (preceded by a space) to the end of the line in common-auth, as described by Gabriel in his answer.
auth [success=1 default=ignore] pam_unix.so nullok_secure nodelay
At least for me (debian sid), only making one of these changes will not shorten the delay appreciably below the default 3 seconds, although it is possible to lengthen the delay by only changing the value in /etc/pam.d/login.
This kind of crap is enough to make a grown man cry!
On Ubuntu 9.10, and I think newer versions too, the file you're looking for is located at
/etc/pam.d/login
edit the line:
auth optional pam_faildelay.so delay=3000000
replacing the 3 with whatever number you want (the value is in microseconds, so 3000000 is 3 seconds).
Note that to have a 'nodelay' authentication, I THINK you should edit the file
/etc/pam.d/common-auth
too. On the line:
auth [success=1 default=ignore] pam_unix.so nullok_secure
add 'nodelay' to the end (without quotes).
But this final explanation about the 'nodelay' is what I think.
I would like to add a note from a developer's perspective. Though this wouldn't be obvious to the naked eye, a smart developer would break out of a match query as soon as the match is found. As a result, a successful match completes faster than a failed one, because the matching function compares the credentials against all known accounts until it finds the correct match. In other words, let's say there are 1,000,000 user accounts ordered by ID: 001, 002, 003 and so on. Your ID is 43,001. So, when you put in a correct username and password, the scan stops at 43,001 and logs you in. If your credentials are incorrect, it scans all 1,000,000 records. The difference in processing time on a dual core server might be in the milliseconds. On Windows Vista with 5 user accounts it would be in the nanoseconds.
I agree. This is an arbitrary programming decision. Putting the delay to one second instead of three doesn't really hurt the crackability of the password, but makes it more user-friendly.
Technically, this deliberate delay is to prevent attacks like the "Linearization attack" (there are other attacks and reasons as well).
To illustrate the attack, consider a program (without this deliberate delay) which checks an entered serial to see whether it matches the correct serial, which in this case happens to be "xyba". For efficiency, the programmer decided to check one character at a time and to exit as soon as an incorrect character is found. Before beginning, the lengths are also checked.
A serial of the correct length will take longer to process than one of an incorrect length. Even better (for the attacker), a serial number that has the first character correct will take longer than any that has an incorrect first character. The waiting time grows in successive steps because each additional correct character means one more loop iteration and comparison to go through.
So, the attacker can try four-character strings and find (by guesswork) that the string beginning with x takes the most time.
The attacker can then fix the first character as x and vary the second character, in which case they will find that y takes the longest.
The attacker can then fix the first two characters as xy and vary the third character, in which case they will find that b takes the longest.
The attacker can then fix the first three characters as xyb and vary the fourth character, in which case they will find that a takes the longest.
Hence, the attacker can recover the serial one character at a time.
Linearization.java.
Linearization.docx, sample output
The serial number is four characters long and each character has 128 possible values. Then there are 128^4 = 2^28 = 268,435,456 possible serials. If the attacker must randomly guess complete serial numbers, she would guess the serial number in about 2^27 = 134,217,728 tries, which is an enormous amount of work. On the other hand, by using the linearization attack above, an average of only 128/2 = 64 guesses are required for each letter, for a total expected work of about 4 * 64 = 2^8 = 256 guesses, which is a trivial amount of work.
Much of the written material is adapted from this (taken from Mark Stamp's "Information Security: Principles and Practice"). Also, the calculations above do not take into account the amount of guesswork needed to figure out the correct serial length.
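The attack can be simulated without a stopwatch. In the sketch below, which is not the Linearization.java program mentioned above, the number of character comparisons stands in for elapsed time, so the result is deterministic. The function names are made up, and the attacker is assumed to already know the serial length. For simplicity it scans all 128 values at each position rather than stopping early, so it needs at most 4 * 128 = 512 guesses instead of the average of 256.

```python
SECRET = "xyba"
ALPHABET = [chr(i) for i in range(128)]   # 128 possible values per character


def check(guess):
    """Early-exit comparison. Returns (matched, comparisons made)."""
    if len(guess) != len(SECRET):
        return False, 0
    steps = 0
    for g, s in zip(guess, SECRET):
        steps += 1
        if g != s:
            return False, steps   # leaks how many characters were correct
    return True, steps


def recover(length):
    """Recover the serial one position at a time from the comparison counts."""
    known, guesses = "", 0
    for position in range(length):
        best_char, best_steps = None, -1
        for c in ALPHABET:
            guess = known + c + "\0" * (length - position - 1)
            guesses += 1
            matched, steps = check(guess)
            if matched:
                return guess, guesses
            if steps > best_steps:   # the "slowest" candidate is the right one
                best_char, best_steps = c, steps
        known += best_char
    return known, guesses


serial, tries = recover(4)
print(serial, tries)
```

A real attack has to average many noisy timing measurements per guess, which is exactly where a deliberate delay (or a comparison that always takes the same time) gets in the way.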