Why are passwords with repeating substrings weak?

Many websites have a password strength checking tool that tells you how strong your password is.
Let's say I have
st4cK0v3rFl0W
which is always considered super strong, but when I do
st4cK0v3rFl0Wst4cK0v3rFl0W
it is suddenly super weak. I've also heard that when a password has even a small repeating sequence, it is much weaker.
But how can the second one possibly be weaker than the first, when it is twice as long?

Sounds like the password strength checker is flawed. It's not a big issue, I suppose, but a repeated strong password is not weaker than the original password.

My guess is that it's simply trivial for someone attacking your password to check for. Trying each password doubled and tripled too is only double or triple the work. However, including more possible characters in a password, such as punctuation marks, raises the complexity of brute-forcing your password much more.
However, in practice, nearly every non-obvious (read: impervious to dictionary attacks [yes, that includes 1337ifying a dictionary word]) password with 8 or more characters can be considered reasonably secure. It's usually much less work to social engineer it from you in some way or just use a keylogger.
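
To put rough numbers on the two costs compared above, here is a back-of-the-envelope sketch. The alphabet sizes (62 for letters and digits, 94 for printable ASCII) and the length of 8 are illustrative assumptions, not figures from the answer.

```python
# Rough search-space arithmetic; all sizes are illustrative assumptions.
ALNUM = 62       # a-z, A-Z, 0-9
PRINTABLE = 94   # printable ASCII, excluding space
LENGTH = 8

base = ALNUM ** LENGTH

# An attacker who also tries every candidate doubled and tripled
# makes three guesses per candidate: only 3x the work.
with_repeats = 3 * base

# Allowing punctuation grows the space by (94/62)**8 instead.
with_punctuation = PRINTABLE ** LENGTH

print(f"base guesses:         {base:.3e}")
print(f"plus doubled/tripled: {with_repeats:.3e} ({with_repeats / base:.0f}x)")
print(f"with punctuation:     {with_punctuation:.3e} ({with_punctuation / base:.1f}x)")
```

Under these assumptions the repetition trick costs the attacker a constant factor, while the larger alphabet costs a factor that grows with every added character.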

My guess is that because you have to type the password twice on the keyboard, someone standing in front of you has a better chance of noticing it.

The algorithm is broken.
It either uses doublet detection and immediately writes the password off as bad, or it calculates a strength that is in some way relative to the string length, so the repeated string scores lower than a comparable totally random string of equal length.

It might be a flaw in the password strength checker: it recognises a pattern. A pattern is not good for a password, but in this case it is a pattern on a complex string. Another reason can be the one pointed out in the answer from Wael Dalloul: someone can see the repeated text when you type it. Any spies have two chances of seeing what you type.

The best reason that I could think of comes from the Electronic Authentication Guide, published by NIST. It gives a general rule of thumb on how to estimate entropy in a password.
Length is just one criterion for entropy. The password character set is also involved, but these are not the only criteria. If you read Shannon's research on user selected passwords closely, you'll notice that higher entropy is assigned to the initial bits and lesser entropy to the later ones, since it is quite possible to infer the next bits of the password from the previous ones.
This is not to say that longer passwords are bad, just that long passwords with a poor selection of characters are just as likely to be weak as shorter passwords.

My guess is it can generate a more "obvious" hash.
For example abba -> a737y4gs, but abbaabba -> 1y3k1y3k. Granted, this is a silly example, but the idea is that repeating patterns in the key would make the hash appear less "random".

Whilst in practice the longer one is probably stronger, I think there may be potential weaknesses when you get into the nitty gritty of how encryption and ciphers work... possibly...
Other than that, I'd echo the other responses that the strength-checker you're using isn't taking all aspects into account very accurately.
Just a thought...

Related

Password Case Sensitive Check

I was thinking of adding a check to remind users that their password is case sensitive, in case they got their password right but forgot to use the uppercase characters in it.
The first idea was to simply add a field with a lowercase password hash and check (when the password check failed) if the lowercase version of the entered password matched the lowercase hash.
PseudoCode
if (getpasswordHash(password.toLower()) == database_lower_hash) {
    writeMessage("Remember that your password is Case Sensitive")
}
Will this lower my password security?
Should I use a different salt for the lowercase password and the normal password?
Should I just not bother and simply show the message about case sensitivity every time someone gets the password wrong?

Yes, it would lower your security.
You are trying to help the user by changing the usual two-state (pass/fail) scenario to a three-state (pass/fail-but-you-are-near/fail) scenario. This intermediate state that you are introducing will certainly lower your security.
Here is how:
Say a hacker gets some hint that my password is 4 characters long. If he goes the sheer brute-force way, he has to try (26*2)^4 combinations. But once you implement this, in just 26^4 combinations he would get a "Remember that your password is Case Sensitive" message. From that point on, he has to try a maximum of 2^4 combinations of lower/upper case for each character. Thus the brute-force barrier is significantly reduced.
You would now have to store two hashes in your database. Even if you use different salts, which you must anyway, you are again reducing the brute-force barrier by half. To crack a password, the hacker can now deploy two separate computers devoted to cracking one hash each, effectively cutting the time in half.
Of course these are extreme scenarios. Passwords are never allowed to be 4 characters long. There would be special chars as well. There is a slim chance of a military-class hacker targeting your application. You don't see how on earth someone could ever steal your password table. But all of these can be debated against. There is social engineering, there are system vulnerabilities, and yes, even your application can attract serious hackers. All it takes is for your application to attract a crowd. And with the crowd come the bad guys.
So the rule of thumb is:
With security, always follow the established norms. There are landmines everywhere else.
Respect everyone's password with the utmost care, as if those were the passwords to the bank vault itself.
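
The 4-character arithmetic from the answer above can be checked directly. This is only a sketch of that worst-case count, assuming letters-only passwords as the answer does.

```python
# Worst-case guess counts for a 4-letter password (letters only).
LENGTH = 4

# Without the hint: each position is one of 52 upper/lowercase letters.
without_hint = (26 * 2) ** LENGTH

# With the hint: find the lowercase form first (at most 26**4 tries),
# then try every upper/lower combination of those letters (at most 2**4).
with_hint = 26 ** LENGTH + 2 ** LENGTH

print(without_hint)                        # 7311616
print(with_hint)                           # 456992
print(f"{without_hint / with_hint:.1f}x")  # 16.0x
```

The hint turns a product (26^4 times 2^4) into a sum (26^4 plus 2^4), which is where the roughly sixteen-fold saving comes from.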

1.) Yes, of course. But only to the same level as allowing nothing but lowercase in the first place.
3.) Would be a cleaner solution.

Generating a pseudo-natural phrase from a big integer in a reversible way

I have a large and "unique" integer (actually a SHA1 hash).
Note: While I'm talking here about SHA1 hashes, this is not a cryptography / security question! I'm not trying to break SHA1. Imagine a random 160-bit integer instead of SHA1 if that will help.
I want (for no other reason than to have fun) to find an algorithm to map that SHA1 hash to a computer-generated (pseudo-)English phrase. The mapping should be bidirectional (i.e., knowing the algorithm, one must be able to calculate the original SHA1 hash from that phrase.)
The phrase need not make sense. I would even settle for a whole paragraph of nonsense. (Though quality — englishness — of a paragraph should probably be better than for a mere phrase.)
A better algorithm would produce shorter, more natural-looking, more unique phrases.
A variation: it is OK if I will be able to work only with a part of hash. Say, first six hex digits is fine.
The possible usage of the generated phrase: the human readable version of Git commit ID, to use as a motto for a given program version, which is built from that commit. (As I said, this is "for fun". I don't claim that this is very practical — or be much more readable than the SHA1 itself.)
Possible approach: In the past I've attempted to build a probability table (of words), and generate phrases as Markov chains, seeding the generator (picking branches from probability tree), according to the bits I read from the SHA. This was not very successful, the resulting phrases were too long and ugly. I'm not sure if this was a bug, or the general flaw in the algorithm, since I had to abandon it early enough.
Now I'm thinking about attempting to solve the problem once again. Any advice on how to approach this? Do you think Markov chain approach can work here? Something else?

A very simple approach would be:
Take list of say 1024 nouns, 1024 verbs and 1024 adjectives each. Your phrase could then be sentence of the form
noun[bits_01-10] verb[bits11-20] adjective[bits21-30] verb[bits31-40],
noun[bits_41-50] verb[bits51-60] adjective[bits61-70] verb[bits71-80],
noun[bits_81-90] verb[bits91-100] adjective[bits101-110] verb[bits111-120] and
noun[bits_121-130] verb[bits131-140] adjective[bits141-150] verb[bits151-160].
With a bit more linguistic thought you can probably construct slightly more complicated and thus less repetitive-looking sentences (say, a bit for singular / plural, a bit or two for different tenses, ...). Longer word lists use up a few more bits, but my guess is that you reach rather exotic words quite fast.
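
A minimal sketch of the scheme above, assuming 16 slots of 10 bits each in the noun/verb/adjective/verb order. The word lists here are generated placeholders (hypothetical); a real version would need 1024 actual words per class.

```python
import hashlib

BITS_PER_WORD = 10
MASK = (1 << BITS_PER_WORD) - 1
PATTERN = ["noun", "verb", "adjective", "verb"] * 4  # 16 slots x 10 bits = 160

# Placeholder word lists (hypothetical): "noun0000" ... "noun1023", etc.
WORDS = {kind: [f"{kind}{i:04d}" for i in range(1 << BITS_PER_WORD)]
         for kind in set(PATTERN)}
INDEX = {kind: {w: i for i, w in enumerate(ws)} for kind, ws in WORDS.items()}

def encode(value: int) -> list[str]:
    """Map a 160-bit integer to 16 words, most significant bits first."""
    words = []
    for slot, kind in enumerate(PATTERN):
        shift = BITS_PER_WORD * (len(PATTERN) - 1 - slot)
        words.append(WORDS[kind][(value >> shift) & MASK])
    return words

def decode(words: list[str]) -> int:
    """Inverse of encode: look each word up in its class and rebuild the bits."""
    value = 0
    for kind, word in zip(PATTERN, words):
        value = (value << BITS_PER_WORD) | INDEX[kind][word]
    return value

digest = int.from_bytes(hashlib.sha1(b"example commit").digest(), "big")
phrase = encode(digest)
print(" ".join(phrase))
assert decode(phrase) == digest
```

Because each slot knows its word class, the same word could appear in several lists without breaking reversibility.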

Well, let's see... The English language has about 1,000,000 words. That's about 20 bits per word. SHA1 is 160 bits, so you'll need 8 words. Theoretically, all you'll have to do is take the n'th word of the Oxford English Dictionary, where n is a group of 20 bits at a time.
Now, to make it more natural, you can try to add "in/at/on/and/the..." between words, according to their type (nouns, verbs...) using some simple algorithm. (You should remove all these words from your base dictionary, of course).
The algorithm is reversible: Just remove all the words you've added, and convert each word to its 20-bit index.
Also, try googling "insult generator". Some of those generators are pretty nice. I'm not sure about the number of combinations, though.
You can buy the Oxford English Dictionary on CD-ROM with more than 500,000 words (19-bit). I'm not sure if it would be easy to extract the words and their types, however. I'm not sure if it is legal, but I think you can't claim a patent on dictionary entries...

This is an old question but entropoetry is a JavaScript (Node/frontend) library that also solves this problem. It combines Markov poetry with Huffman coding, so given the same dictionary (i.e., the same version of the library), converting poetry↔︎numbers will be bidirectional.
Example, from the Node command line:
> var Poet = require('entropoetry'); var p = new Poet();
> p.stringify(Buffer.from('deadbeef', 'hex'))
'old trick of loving you\nif you but'
> console.log(p.parse(`old trick of loving you
... if you but`))
<Buffer de ad be ef>
And as technology marches on, what seemed like a “fun only” idea in 2011 has some real uses in 2017: memorizing cryptocurrency private keys (brain wallet), Dat/IPFS links, etc.

A hash function means it is not possible (within reasonable limits) to get the data back from the hash, unless it is broken (insecure).
Question should be about breaking SHA-1 hash algorithm - look at Google, it's not that broken. So no, you cannot create English phrase from SHA-1 hash code, if you can, please make a huge paper about that, lot of them are useless, this would be breakthrough :-)
Edit: if only part of hash is enough, I suggest just brute force (+ simple map of hash<->phrase, possibly in a file or db), breaking hash algorithm is very "strong soup" (difficult problem).
Edit2: guys be more specific when asking question, not my fault... I will not delete this so that it scares off any other crypto guys around :-)

What is the difference between a "nonce" and a "GUID"?

This question here is about creating an authentication scheme. The accepted answer given by AviD states
Your use of a cryptographic nonce is
also important, that many tend to skip
over - e.g. "lets just use a GUID"...
Which leads me to my question. Why wouldn't you just use a GUID?

Whenever you generate a random number intended to be used in cryptography, you should be really sure that the number is really random. GUIDs tend to be generated based on values that can be discovered, guessed or inferred, such as the current system time or a network card MAC address, and thus the nonce could potentially be guessed.

Nonces should be random (or at least non-guessable). GUIDs have quite a bit of non-randomness to them (I'm not sure how many bits of entropy are in a GUID).
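
As an illustration of the difference, here is a short sketch using Python's standard `uuid` and `secrets` modules. The choice of Python and the 16-byte nonce length are assumptions for the example, not something the answers specify.

```python
import secrets
import uuid

# A version-1 UUID embeds a timestamp and a node id (often derived from
# the MAC address), so much of its value can be inferred by an observer.
guid_v1 = uuid.uuid1()
print(guid_v1, hex(guid_v1.node), guid_v1.time)

# A version-4 UUID is random, but 6 of its 128 bits are fixed
# version/variant markers, leaving 122 random bits.
guid_v4 = uuid.uuid4()
print(guid_v4, guid_v4.version)

# A nonce drawn from the OS cryptographic random source: 128 random bits.
nonce = secrets.token_bytes(16)
print(nonce.hex())
```

Whether a given platform's GUID generator is unpredictable depends on the UUID version and the implementation, which is why a purpose-built cryptographic source is the safer default.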

Password complexity strategies - any evidence for them?

On more than one occasion I've been asked to implement rules for password selection for software I'm developing. Typical suggestions include things like:
Passwords must be at least N characters long;
Passwords must include lowercase, uppercase and numbers;
No reuse of the last M passwords (or passwords used within P days).
And so on.
Something has always bugged me about putting any restrictions on passwords though - by restricting the available passwords, you reduce the size of the space of all allowable passwords. Doesn't this make passwords easier to guess?
Equally, by making users create complex, frequently-changing passwords, the temptation to write them down increases, also reducing security.
Is there any quantitative evidence that password restriction rules make systems more secure?
If there is, what are the 'most secure' password restriction strategies to use?
Edit: Ólafur Waage has kindly pointed out a Coding Horror article on dictionary attacks which has a lot of useful analysis in it, but it strikes me that dictionary attacks can be massively reduced (as Jeff suggests) by simply adding a delay following a failed authentication attempt.
With this in mind, what evidence is there that forced-complex passwords are more secure?

Something has always bugged me about
putting any restrictions on passwords
though - by restricting the available
passwords, you reduce the size of the
space of all allowable passwords.
Doesn't this make passwords easier to
guess?
In theory, yes. In practice, the "weak" passwords you disallow represent a tiny subset of all possible passwords that is disproportionately often chosen when there are no restrictions, and which attackers know to attack first.
Equally, by making users create
complex, frequently-changing
passwords, the temptation to write
them down increases, also reducing
security.
Correct. Forcing users to change passwords every month is a very, very bad idea, except perhaps in extreme high-security environments where everyone really understands the need for security.

Those kinds of rules definitely help because they stop stupid users from using passwords like "mypassword", which unfortunately happens quite often.
So actually, you are forcing the users into an extremely large set of potential passwords. It doesn't matter that you are excluding the set of all passwords with only lowercase letters, because the remaining set is still orders of magnitude larger.
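
A rough check of that claim, assuming 8-character passwords drawn from letters and digits (both numbers are assumptions chosen for illustration):

```python
# How much of the password space does banning all-lowercase passwords remove?
LENGTH = 8

all_alnum = 62 ** LENGTH        # a-z, A-Z, 0-9
only_lowercase = 26 ** LENGTH   # the excluded subset

excluded_fraction = only_lowercase / all_alnum
remaining = all_alnum - only_lowercase

print(f"excluded:  {excluded_fraction:.4%} of the space")
print(f"remaining: {remaining:.3e} passwords")
```

Under these assumptions the banned subset is under a tenth of a percent of the space, so the loss in theoretical size is negligible next to the gain from ruling out the passwords attackers try first.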
BUT my big pet peeves are password restrictions I've encountered on major sites, like
No special characters
Maximum length
Why would anyone do this? W.H.Y.????

A nice read up on this is Jeff's article on Dictionary Attacks.

1. Never prevent the user from doing what they really want, unless there is a technical limitation from doing so.
2. You may nag the hell out of the user for doing stupid things like using a dictionary word or a 3-character password, or only using numbers, but see #1 above.
3. There is no good technical reason to require only alphanumerics, or at least one capital letter, or at least one number; see #1 above.
I forget which website had this advice regarding passwords: "Pick a password that is very easy for you to remember, but very hard for someone else to guess." But then they proceeded to require at least one capital letter and one number.
The problem with passwords is that they are so ubiquitous that it is essentially impossible for any person without a photographic memory to actually remember them without writing them down, and therefore leaving a serious security hole should someone gain access to this list of written-down passwords.
The only way I am able to manage this for myself is to split most of my passwords -- and I just checked my list, I'm up to 130 so far! -- into two parts, one which is the same in all cases, and the other which is unique but simple. (I break this rule for sites requiring high-security like bank accounts.)
The problem with requiring "complexity", defined as multiple types of characters all being present, is that it forces people into a disparate set of conventions for different sites, which makes it harder to remember the password in question.
The only reason I will acknowledge for sites limiting the set of allowable password characters is that the password needs to be typeable on a keyboard. If you have to assume the account needs to be accessed from multiple countries, then keyboards may not always support the same characters as the user's home keyboard.
One of these days I'll have to make a blog posting on the subject. :(

My old limit theorem:
As the security of the password approaches adequate, the probability that it will be on a sticky note attached to the computer or monitor approaches one.

One also might point out the recent fiasco over at Twitter, where one of their admins' passwords turned out to be "happiness", which fell to a dictionary attack.

For questions like this, I ask myself what Bruce Schneier would do - the linked article is about how to choose passwords which are hard to guess with typical attacks.
Also note that if you add a delay after a failed attempt, you might also want to add a delay after a successful attempt; otherwise the delay is simply a signal that the attack has failed and another attempt should be launched.

Whilst this does not directly answer your question, I personally find the most aggravating rule I have encountered to be one whereby you could not reuse any password previously used. After working at the same place for a number of years, and having to change your password every 2-3 months, the ability to use a password I chose over a year ago would not seem to be particularly unsafe or insecure. If I have used "safe" passwords in the past (alphanumeric with changes in case), surely reusing them after a period of, say, a year or two (depending on how regularly you have to change your password) would seem to be acceptable to me. It also means I am less likely to use "easier" passwords, which might happen if I can't think of anything easy to remember and difficult to guess!

First let me say that details such as minimum length, case sensitivity and required special characters should depend on who has access and what the password allows them to do. If it's a code to launch a nuclear missile, it should be more strict than a password to log in to play your paid online edition of Angry Birds.
But I've got a SPECIFIC beef with case sensitivity.
For starters, users hate it. The human brain thinks "A=a". Of course, developers' brains aren't usually typical. ;-) But developers are also inconvenienced by case sensitivity.
Second, the CapsLock key is too easy to hit by mistake. It's right between Tab and Shift keys, but it SHOULD be up above the Esc key. Its location was established long ago in the days of typewriters, which had no alternate font available. In those days it was useful to have it there.
All passwords have risk... You're balancing risk with ease-of-use, and yes, usability matters.
MY ARGUMENT:
Yes, case sensitivity is more secure for a given password length. But unless someone is making me do otherwise, I opt for a longer minimum password length. Even if we assume only letters and digits are allowed, each added character multiplies the number of possible passwords by 36.
Someone who's less lazy than me with math could tell you the difference in number of combinations between, say a minimum 8-character case-sensitive password, and a 12-character case-insensitive password. I think most users would prefer the latter.
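
For the record, the comparison is quick to compute. This sketch assumes letters and digits only, as in the paragraph above, and counts passwords of exactly the minimum length.

```python
import math

# Letters and digits only, exact lengths.
case_sensitive_8 = 62 ** 8      # a-z, A-Z, 0-9, length 8
case_insensitive_12 = 36 ** 12  # a-z, 0-9, length 12

print(f"8 chars, case-sensitive:    {case_sensitive_8:.3e} "
      f"(~{8 * math.log2(62):.1f} bits)")
print(f"12 chars, case-insensitive: {case_insensitive_12:.3e} "
      f"(~{12 * math.log2(36):.1f} bits)")
print(f"ratio: {case_insensitive_12 / case_sensitive_8:,.0f}x")
```

Under these assumptions the 12-character case-insensitive space is larger by a factor in the tens of thousands.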
Also, not all apps expose usernames to others, so there are potentially two fields the hacker may have to find.
I also prefer to allow spaces in passwords as long as the majority of the password isn't spaces.
In the project I'm developing now, my management screen allows the administrator to change password requirements, which apply to all future passwords. He can also force all users to update passwords (to new requirements) at any time after next logon. I do this because I feel my stuff doesn't need case-sensitivity, but the administrator (who probably paid me for the software) may disagree so I let that person decide.
The PIN for my bank card is only four digits. Since it's only numbers it's not case sensitive. And heck, it's my MONEY! If you consider nothing else, this sounds pretty insecure, were it not for the fact that the hacker has to steal my card to get my money. (And have his photo taken.)
One other beef: Developers who come onto StackOverflow and regurgitate hard-and-fast rules that they read in an article somewhere. "Never hard code anything." (As if that's possible.) "All queries must be parameterized" (not if the user doesn't contribute to the query), etc.
Please excuse the rant. ;-) I promise I respect disagreement.

Personally, for this particular problem I tend to give passwords a 'score' based on characteristics of the entered text, and refuse passwords that don't meet the score.
For example:
Contains Lower Case Letter: +1
Contains different Lower Case Letter: +1
Contains Upper Case Letter: +1
Contains different Upper Case Letter: +1
Contains Non-Alphanumeric character: +1
Contains different Non-Alphanumeric character: +1
Contains Number: +1
Contains Non Consecutive or repeated Second Number: +1
Length less than 8: -10
Length Greater than 12: +1
Contains Dictionary word: -4
I then only allow passwords with a score greater than 4 (and give the user feedback via JavaScript as they create their password).
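
One possible reading of that scoring table, sketched in Python rather than JavaScript. The word list is a tiny hypothetical stand-in, character classes are ASCII-only, and the "second number" rule is ambiguous in the original: here it is assumed to mean a later digit that is neither a repeat of the first digit nor adjacent to it in value.

```python
import string

# Tiny stand-in word list (hypothetical); a real check would load a dictionary.
DICTIONARY = {"password", "happiness", "letmein", "dragon"}
MIN_SCORE = 5  # "a score greater than 4"

def _class_points(chars: list[str]) -> int:
    """+1 if the class is present, +1 more for a second distinct member."""
    return min(len(set(chars)), 2)

def _digit_points(digits: list[str]) -> int:
    if not digits:
        return 0
    first = int(digits[0])
    # Assumed reading: the second point needs a later digit that is neither
    # equal to the first digit nor next to it in value (e.g. 1 then 2).
    has_second = any(abs(int(d) - first) > 1 for d in digits[1:])
    return 2 if has_second else 1

def score(password: str) -> int:
    lowers = [c for c in password if c in string.ascii_lowercase]
    uppers = [c for c in password if c in string.ascii_uppercase]
    digits = [c for c in password if c in string.digits]
    others = [c for c in password if not c.isalnum()]

    total = _class_points(lowers) + _class_points(uppers) + _class_points(others)
    total += _digit_points(digits)
    if len(password) < 8:
        total -= 10
    if len(password) > 12:
        total += 1
    if any(word in password.lower() for word in DICTIONARY):
        total -= 4
    return total

def is_acceptable(password: str) -> bool:
    return score(password) >= MIN_SCORE

print(score("st4cK0v3rFl0W"), is_acceptable("st4cK0v3rFl0W"))
print(score("mypassword"), is_acceptable("mypassword"))
```

The large penalty for short passwords means no combination of character classes can rescue a password under 8 characters, which matches the intent of the table.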

How do you enforce strong passwords?

There are many techniques to enforce strong passwords on a website:
Requesting that passwords pass a regex of varying complexity
Setting the password autonomously, so that casual users have a strong password
Letting passwords expire
etc.
On the other hand, there are drawbacks, because all of them make life less easy for the user, meaning fewer registrations.
So, what techniques do you use? Which provide the best protection vs. inconvenience ratio?
To clear things up, I am not referring to banking sites or sites that store credit cards. Think more in terms of popular (or not-so-popular) sites that still require registration.

I don't think it's possible to enforce strong passwords, but there are lots of things you can do to encourage them as much as possible.
Rate each password and give the user feedback in the form of a score or a graphical bar, etc.
Set a minimum password score to weed out the awful ones
Have a list of common words that are either banned, or tank the password score
One excellent trick I like to use is to have the password's expiry date tied to the password score. So stronger passwords don't need to be changed so often. This works particularly well if you can give users direct feedback about how long the password they've chosen will live for (and dynamically update it so they can see how adding characters affects the date).
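
The expiry trick could be sketched along these lines. Every number here (minimum score, days per point, the cap) is a made-up policy value for illustration, and `lifetime_days` and `expiry_date` are hypothetical helper names; the answer does not say how its scores map to lifetimes.

```python
from datetime import date, timedelta

# Assumed policy numbers, purely illustrative.
MIN_SCORE = 5         # passwords scoring below this are rejected outright
DAYS_PER_POINT = 60   # each point of score buys this many days
MAX_DAYS = 730        # cap so that no password lives forever

def lifetime_days(score: int) -> int:
    """Days a password may live, growing linearly with its score."""
    if score < MIN_SCORE:
        raise ValueError("password too weak to accept")
    return min(score * DAYS_PER_POINT, MAX_DAYS)

def expiry_date(score: int, set_on: date) -> date:
    """Date on which a password set on `set_on` must be changed."""
    return set_on + timedelta(days=lifetime_days(score))

today = date(2024, 1, 1)
for s in (5, 8, 20):
    print(s, lifetime_days(s), expiry_date(s, today))
```

Recomputing `expiry_date` as the user types gives the live feedback described above: each extra point visibly pushes the date further out.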

Don't enforce anything ... if you are not protecting financial information or something equally important, then don't make the user choose a strong password.
I have the same weak password on a whole load of sites that require registration for forums, etc. I don't really care if someone guesses it and can post messages as me (and don't think there is much motivation for someone to do so). What I can't do is remember different strong passwords for a dozen sites and don't really want to use another piece of software to manage them for me.
The best compromise would be to show some kind of feedback to the user on how strong the password is (based on whether it is a dictionary word, number of different character types, length, etc).

Why enforce it?
I found that a "password strength meter" (a bar indicating password strength as you type) is usually a good non-intrusive measure. It gives those who care about security a guilty conscience about password weakness, yet does not frustrate those who do not care as much.
Also, there is an insightful essay on why periodic password change policy is a bad idea with today's threat model.

It's been my experience that it depends really on the type of site, as you said.
If you are creating a bank or financial website then users typically understand if you have a more secure password, since their personal data may be at risk.
However for sites that typically don't contain a lot of personal information a simpler password will be fine. They may be less prone to hack attempts, and wouldn't get anything worthwhile anyway.
I've also found that most people also seem to have a couple passwords they use often. One being complex, and another being simple. So requesting they use a complex password usually won't keep people from registering.
I've never found expiring passwords to work successfully. As I said before, many people already have a set couple of passwords they use often, so asking them to go outside of this just for your site may make them not want to come back.

The best way really depends on your site and what you are using. But the ideal way is to do as much on the client side as you can before they submit it. Using RegEx is a good way. If you can make them not have to submit the form again, that is ideal.

On letting passwords expire, there are two notable problems with the practice:
Users find it more difficult to remember their current passwords, and so they are more likely to do silly things like write them on a post-it stuck to their monitor.
Users don't generate a new, strong, unrelated password on each attempt. Most of the time they use some scheme to generate a password similar to their old one. Therefore, if an attacker gets an old password, it's still pretty easy for them to deduce a newer one.
EDIT: Which isn't to say I'm against the whole idea, but just that this needs to be considered along with other factors.

There's an Ajax tool, PasswordStrength, that will give the user an idea if their password is any good. I like it because it doesn't have to prohibit the creation of a password.
http://www.asp.net/AJAX/AjaxControlToolkit/Samples/PasswordStrength/PasswordStrength.aspx

I've never seen this done, but it seems like it would work wonderfully: the password creation page could have an expandable list of, say, the 50 most common passwords, forcing the user to scroll down a bit before typing in their password. This, combined with Checkers' suggestion, would do much to prevent careless choices.
However, solving the problem of preventing password reuse... no clue.
