Does encryption guarantee integrity? - security

To build a secure system, can we assume that encryption guarantees integrity is true before starting a secure programming?
Both in symmetric and public-key
encryption, is my question
well-proofed ?
If no, what are the
vulnerabilities, can you give an
example?

No. This is easy to see if you consider the one-time pad, a simple (theoretically) perfectly secure system.
If you change any bit of the output, a bit of the clear text will change, and the recipient has no way to detect this.
This is an obvious case, but the same conclusion applies to most encryption systems. They only provide for confidentiality, not integrity.
Thus, you may want to add a digital signature. Interestingly, when using public key cryptography, it is not sufficient to sign then encrypt (SE), or to encrypt then sign (ES). Both of these are vulnerable to replay attacks. You have to either sign-encrypt-sign or encrypt-sign-encrypt to have a generally secure solution. This paper explains why in detail.
If you use SE, the recipient can decrypt the message, then re-encrypt it to a different recipient. This then deceives the new recipient about the sender's intended recipient.
If you use ES, an eavesdropper can remove the signature and add their own. Thus, even though they can't read the message, they can take credit for it, pretending to be the original sender.

In short the answer is no. Message Integrity and Secrecy are different, and require different tools.
Lets take a simple coin flip into consideration, and in this case we are betting on the results. The result is a simple bool and I encrypt it using a stream cipher like RC4 which yields 1 encrypted bit and I email it to you. You don't have the key, and I ask you to email me back the answer.
A few attacks can happen in this scenario.
1)An attacker could modify the bit in transit, if it was a 0 there is a 50% chance it will become a 1 and the contrary is true. This is because RC4 produces a prng stream that is XOR'ed with the plain text produce the cipher text, similar to a one time pad.
2)Another possibility is that I could provide you with a different key to make sure your answer is wrong. This is easy to brute force, I just just keep trying keys until I get the proper bit flip.
A solution is to use a block cipher is CMAC Mode. A CMAC is a message authentication code similar to an hmac but it uses a block cipher instead of a message digest function. The secret key (K) is the same key that you use to encrypt the message. This adds n+1 blocks to the cipher text. In my scenario this prevents both attacks 1 and 2. An attacker cannot flip a simple bit because the plain text is padded, even if the message only takes up 1 bit i must transmit a minimum of 1 block using a block cipher. The additional authentication block prevents me from chaining the key, and it also provides integrity from anyone attempting to modify the cipher text in transit (although this would be very difficult to do in practice, the additional layer of security is useful).
WPA2 uses AES-CMAC for these reasons.

If data integrity is a specific concern to you, you should use a cryptographic hash function, combined with an an encryption algorithm.
But it really does come down to using the correct tool for the job. Some encryption algorithms may provide some level of checksum validation built-in, others may not.

Related

How Do You Ensure Data Security of Small Data?

My Question:
What is the Best Approach to Ensure Data Security of Small Data? Below I present a concern around symmetric and asymmetric encryption. I'm curious if there is a way to do asymmetric encryption on small data with an equivalent of some sort of "salting" to actually make it secure? If so, how do you pick a "salt" and implement it properly? Or is there a better way to handle this?
Explanation of My Concern:
When encrypting something that has "bulk" it seems to me that asymmetric encryption approaches are pretty secure. My concern is around if I have a small field of data, say a credit card number, password, or social security number in a database. Then the data being encrypted is of fixed length and presentation. That being said, a hacker could attempt to encrypt every possible social security numbers (10^9 permutations) with the public key and compare it to values stored in the db. Once they find a match, they know the real number. Similar attacks can be done for the other data types. Because of this, I decided to avoid symmetric methods like mysql's AES_ENCRYPT() built in function, however now I'm questioning asymmetric as well.
How do we properly protect small data?
Salting is normally used for hash algorithms, but I need to be able to get the data back after. I thought about maybe having some "base bulk text", then append the sensitive data to the end. Do the encrypt on that concatenation. Decryption would reverse the process, by decrypting then stripping off the "base bulk text". If the hacker can figure out the base bulk text then I don't see how this would add any additional security.
Picking other data to include as part of encryption, to help act like a salt value derived from other fields in the database(or hash values of those fields, or combination there of yields the same issue) also seems like it is vulnerable. As hackers could be run through combinations similar to the attack mentioned above to try to perform a more intelligent form of "brute force". That being said, I'm unsure of how to properly secure the small data and my googles have not helped me.
What is the best approach to ensure data security of small data?
If you are encrypting with an RSA public key, there is no need to salt the small data. Use OAEP padding. The padding introduces the equivalent of random salt. Try it: encrypt the credit card number twice with the same RSA public key, using OAEP padding, and look at the result. You will see two different values, indistinguishable from random data.
If you are encrypting with an AES symmetric key, then you can use a random IV per data, and store the IV in the clear, publicly, next to the ciphertext. Try encrypting the credit number twice with AES CBC mode, for example, with a unique, 16 byte (cryptographically strong) IV each time. You will see two different ciphertexts. Now, assuming a 16-byte AES key, try to brute force those two outputs, without using any knowledge of the key. Use just the ciphertext, and the 16 byte IVs, and try to discover the credit card number.
EDIT: It's beyond the scope of the question, but since I mention it in the comment, if a client can send you arbitrary ciphertext to decrypt ("decrypt this credit card info"), you must not let the client see any difference between a padding error on decryption, vs. any other error on decryption. Look up "padding oracle".
If you need to encrypt data use a symmetric key algorithm, AES is a good choice. Use a mode such as CBC and a random IV, this will ensure that encryption the same data will produce different output.
Add PKCS#7 née PKCS#5 for padding.
If there is real value in the data hire a cryptographic domain expert to help with the design and later validation.
Asymmetric encryption is most useful for communicating encrypted data between two parties. For example, you have a mobile application that accepts credit card numbers and needs to transmit them to the server for processing. You want the public application (which is inherently insecure) to be able to encrypt the data and only you should be able to decrypt it in your secure environment.
Storage is a completely different matter. You're not communicating anything to or from an insecure party, you are the only one dealing with the data. You don't want to give everyone a way to decrypt things if they breach your storage, you want to make things as difficult as possible. Use a symmetric algorithm for storage and include a unique Initialization Vector with each encrypted value as a hurdle to decryption if the storage is compromised.
PCI-DSS requires that you use Strong Cryptography, which they define as follows.
At the time of publication, examples of industry-tested and accepted standards and algorithms for minimum encryption strength include AES (128 bits and higher), TDES (minimum triple-lengthkeys), RSA (2048 bits and higher), ECC (160 bits and higher), and ElGamal (2048 bits and higher). See NIST Special Publication 800-57 Part 1 (http://csrc.nist.gov/publications/) for more guidance on cryptographic key strengths and algorithms.
Beyond that, they are primarily concerned with key management, and with good reason. Breaching your storage won't help as much as actually having the means to decrypt your data, so ensure that your symmetric key is managed correctly and in accordance with their requirements.
There is also a field of study called Format-preserving encryption which seeks to help legacy systems maintain column-width and data types (a social security number is a 9-digit number even after encryption, etc), while allowing values to be securely encrypted. In this way the encryption can be created at a low level of the legacy system without breaking all of the layers above it which depend on a particular data format.
It is sometimes called "small-space encryption" and the idea is also explained in the paper How to Encipher Messages on a Small Domain
Deterministic Encryption and the Thorp Shuffle which gives an introduction to the topic and presents a specific algorithm devised by the authors. The Wikipedia article mentions many other algorithms with similar purpose.
If you'd prefer a video explanation of the topic, see The Mix-and-Cut Shuffle: Small Domain Encryption Secure Against N Queries talk from Crypto 2013. It includes graphics detailing how several algorithms work and some early research into the security of such designs.
When I encrypt short messages, I add a relatively long random salt to them before encryption. Edit others suggest prepending the salt to the payload.
So, for example, if I encrypt the fake credit card number 4242 4242 4242 4242. what I actually encrypt is
tOH_AN2oi4MkLC3lmxxRWaNqh6--m42424242424242424
the first time, and
iQe5xOZPIMjVWfrDDip244ZGhCy2U142424242424242424
the second time, and so forth.
This random salting significantly discourages the lookup table approach you describe. Many operating systems furnish sources of high-quality random numbers like *nix /dev/rand and Windows' RNGCryptoServiceProvider module.
It's still not OK to hold payment card data in that way without defense in depth and PCI data security certification.
Edit: Some encryption schemes handle this salting as part of their normal functioning.

Using asymmetric encryption to secure passwords

Due to our customer's demands, user passwords must be kept in some "readable" form in order to allow accounts to be converted at a later date. Unfortunately, just saving hash values and comparing them on authentication is not an option here. Storing plain passwords in the database is not an option either of course, but using an encryption scheme like AES might be one. But in that case, the key to decrypt passwords would have to be stored on the system handling authentication and I'm not quite comfortable with that.
Hoping to get "best of both worlds", my implementation is now using RSA asymmetric encryption to secure the passwords. Passwords are salted and encrypted using the public key. I disabled any additional, internal salting or padding mechanisms. The encrypted password will be the same every time, just like a MD5 or SHA1 hashed password would be. This way, the authentication system needs the public key, only. The private key is not required.
The private key is printed out, sealed and stored offline in the company's safe right after it is created. But when the accounts need to be converted later, it will allow access to the passwords.
Before we deploy this solution, I'd like to hear your opinion on this scheme. Any flaws in design? Any serious drawbacks compared to the symmetric encryption? Anything else we are missing?
Thank you very much in advance!
--
Update:
In response to Jack's arguments below, I'd like to add the relevant implementation details for our RSA-based "hashing" function:
Security.addProvider(new org.bouncycastle.jce.provider.BouncyCastleProvider());
Cipher rsa = Cipher.getInstance("RSA/None/NoPadding");
rsa.init(Cipher.ENCRYPT_MODE, publicKey);
byte[] cryptRaw = rsa.doFinal(saltedPassword.getBytes());
Having quickly skimmed over the paper mentioned by Jack, I think I somewhat understand the importance of preprocessing such as OAEP. Would it be alright to extend my original question and ask if there is a way to apply the needed preprocessing and still have the function return the same output every time for each input, just as a regular hashing function would? I would accept an answer to that "bonus question" here. (Or should I make that a seperate question on SOF?)
--
Update 2:
I'm having a hard time accepting one of the present answers because I feel that none really does answer my question. But I no longer expect any more answers to come, so I'll accept the one that I feel is most constructive.
I'm adding this as another answer because instead of answering the question asked (as I did in the first response) this is a workaround / alternative suggestion.
Simply put:
Use hashes BUT, whenever a user changes their password, also use your public key as follows:
Generate a random symmetric key and use it to encrypt the timestamp, user identifier, and new password.
The timestamp is to ensure you don't mess up later when trying to find the current / most up-to-date password.
Username so that you know which account you're dealing with.
Password because it is a requirement.
Store the encrypted text.
Encrypt the symmetric key using your public key.
Store the public key encrypted symmetric key with the encrypted text.
Destroy the in-memory plaintext symmetric key, leaving only the public key encrypted key.
When you need to 'convert' the accounts using the current password, you use the private key and go through the password change records. For each one:
Using the private key, decrypt the symmetric key.
Using the symmetric key, decrypt the record.
If you have a record for this user already, compare timestamps, and keep the password that is most recent (discarding the older).
Lather, rinse, repeat.
(Frankly I'm probably overdoing things by encrypting the timestamp and not leaving it plaintext, but I'm paranoid and I have a thing for timestamps. Don't get me started.)
Since you only use the public key when changing passwords, speed isn't critical. Also, you don't have to keep the records / files / data where the plaintext password is encrypted on the server the user uses for authentication. This data can be archived or otherwise moved off regularly, as they aren't required for normal operations (that's what the hash is for).
There is not enough information in the question to give any reasonable answer. Anyway since you disable padding there is a good chance that one of the attacks described in the paper
"Why Textbook ElGamal and RSA Encryption are Insecure" by
D. Boneh, A. Joux, and P. Nguyen is applicable.
That is just a wild guess of course. Your proposal could be susceptible to a number of other attacks.
In terms of answering your specific question, my main concern would have been management of the private key but given it's well and truly not accessible via any computer system breach, you're pretty well covered on that front.
I'd still question the logic of not using hashes though - this sounds like a classic YAGNI. A hashing process is deterministic so even if you decided to migrate systems in the future, so long as you can still use the same algorithm, you'll get the same result. Personally, I'd pick a strong hash algorithm, use a cryptographically strong, unique salt on each account and be done with it.
It seems safe enough in terms of what is online but have you given full consideration to the offline storage. How easy will it be for people within your company to get access to the private key? How would you know if someone within your company had accessed the private key? How easy would it be for the private key to be destroyed (e.g. is the safe fireproof/waterproof, will the printed key become illegible over time etc).
You need to look at things such as split knowledge, dual control, tamper evident envelopes etc. As a minimum I think you need to print out two strings of data which when or'd together create the private key and then have one in your office and one in your customers office,
One serious drawback I've not seen mentioned is the speed.
Symmetric encryption is generally much much faster than asymmetric. That's normally fine because most people account for that in their designs (SSL, for example, only uses asymmetric encryption to share the symmetric key and checking certificates). You're going to be doing asymmetric (slow) for every login, instead of cryptographic hashing (quite fast) or symmetric encryption (pretty snappy). I don't know that it will impact performance, but it could.
As a point of comparison: on my machine an AES symmetric stream cipher encryption (aes-128 cbc) yields up to 188255kB/s. That's a lot of passwords. On the same machine, the peak performance for signatures per second (probably the closest approximation to your intended operation) using DSA with a 512 bit key (no longer used to sign SSL keys) is 8916.2 operations per second. That difference is (roughly) a factor of a thousand assuming the signatures were using MD5 sized checksums. Three orders of magnitude.
This direct comparison is probably not applicable directly to your situation, but my intention was to give you an idea of the comparative algorithmic complexity.
If you have cryptographic algorithms you would prefer to use or compare and you'd like to benchmark them on your system, I suggest the 'openssl speed' command for systems that have openssl builds.
You can also probably mitigate this concern with dedicated hardware designed to accelerate public key cryptographic operations.

RSA: Encrypting message using multiple keys

Is it possible to get additional security by encrypting a message using 2 or more RSA keys?
EDIT: A few clarifications:
The context I am most interested in doing this for is encrypting a randomly generated symmetric key.
I don't want to limit the question to encrypting twice in a row; the purpose is to avoid the high computational cost of large RSA keys. Using less straightforward tactics such as breaking the message into parts and encrypting them separately should be considered as an option.
It should be assumed that getting only part of the message is acceptable.
If you know of any publications where this is discussed specifically by an expert, or algorithms that use multiple RSA keys, then please contribute.
No.
It is not safe to do thought experiments regarding cryptography. You are advised to keep narrowly to the path trodden by the experts.
And when the experts want to protect something better, they use a bigger key-size (at least 2048 bits is required, smaller certificates are insufficient for any peace of mind) or use elliptic curve certificates in preference to RSA.
Incidentally, you're remember that your message body is typically encrypted with a symmetric cipher and a random key, and that just this random key is encrypted with the public key of the recipient. Double-encrypting this secret key won't make this secret key longer, and won't impact an attacker's ability to brute-force that.
Quantum cryptography - I mention it only as an exciting aside, you need not factor this into your choice - promises interesting things for the keysizes: the RSA keys will be wiped out by Shor's algorithm, but the symmetric keys (Grover's) will be only half-lengthed (128-bits will be equiv to 64-bits, so will be crackable). There is of course debate about whether such quantum machines can be implemented etc etc :)
No.
If Key A is compromised than encrypted with A+B will protect against the compromise, but outside that special case, you get no additional benefit.
Composing ciphers
Say you have an encryption function E(M, K), where M is the plaintext message and K is the key. Say no known vulnerabilities exist in E.
You generate two completely unrelated keys K1 and K2.
It is guaranteed that if you compose them in the form E(E(M, K1), K2), it is impossible to actually lose security this way. If it was possible to lose security from encrypting E(M, K1), be it with K2 or any other key, the is cipher broken, because an attacker could just do E(E(M, K1), KF) where KF is any key the attacker wishes to choose.
For more info see here.
Encrypting every second block with a different key
The implications here are obvious. Assuming you are using properly composed cryptographic primitives with both encryption function:key combinations, if you encrypt every second block with a different key out of the set of two keys, the attacker can only decrypt the blocks he has the key for.
Yes!
But do not use raw encryption. Use RSA encryption schema. Instead of reencrypting the encrypted message with the second key, which might have weakening effet (I don't know), use the shared secret algorithm to split your secret in two. The shared secret algorithm make it possible to split a secret in n pieces and ensures that if an attacker manages to get n-1 pieces he knows nothing of the secret. So don't simply split the secret in two.
You can then have more then 2 RSA keys. Another powerful property of the shared secret algorithm is that it is possible to spread the secret over n pieces and require only m pieces, with m smaller than n, to recover the secret. This makes the secret recovery more robust to loss of pieces.
Look here for more information on shared secret: http://en.wikipedia.org/wiki/Shared_secret
In additional to the answers given, it also simply doesn't work unless you do some patching. Very simply, one of the moduli must be larger than the other. If you perform RSA mod the larger modulus first and mod the smaller last you lose information and cannot guarantee successful decryption. The obvious patch is to always encrypt with the smaller modulus first. Of course, you have to perform decryption in the opposite order. Another simple patch is choose moduli that a very close together in size, so that the probability that you encounter a ciphertext that cannot be uniquely decrypted is vanishingly small.

Is it possible to reverse a SHA-1 hash?

Is it possible to reverse a SHA-1?
I'm thinking about using a SHA-1 to create a simple lightweight system to authenticate a small embedded system that communicates over an unencrypted connection.
Let's say that I create a sha1 like this with input from a "secret key" and spice it with a timestamp so that the SHA-1 will change all the time.
sha1("My Secret Key"+"a timestamp")
Then I include this SHA-1 in the communication and the server, which can do the same calculation. And hopefully, nobody would be able to figure out the "secret key".
But is this really true?
If you know that this is how I did it, you would know that I did put a timestamp in there and you would see the SHA-1.
Can you then use those two and figure out the "secret key"?
secret_key = bruteforce_sha1(sha1, timestamp)
Note1:
I guess you could brute force in some way, but how much work would that actually be?
Note2:
I don't plan to encrypt any data, I just would like to know who sent it.
No, you cannot reverse SHA-1, that is exactly why it is called a Secure Hash Algorithm.
What you should definitely be doing though, is include the message that is being transmitted into the hash calculation. Otherwise a man-in-the-middle could intercept the message, and use the signature (which only contains the sender's key and the timestamp) to attach it to a fake message (where it would still be valid).
And you should probably be using SHA-256 for new systems now.
sha("My Secret Key"+"a timestamp" + the whole message to be signed)
You also need to additionally transmit the timestamp in the clear, because otherwise you have no way to verify the digest (other than trying a lot of plausible timestamps).
If a brute force attack is feasible depends on the length of your secret key.
The security of your whole system would rely on this shared secret (because both sender and receiver need to know, but no one else). An attacker would try to go after the key (either but brute-force guessing or by trying to get it from your device) rather than trying to break SHA-1.
SHA-1 is a hash function that was designed to make it impractically difficult to reverse the operation. Such hash functions are often called one-way functions or cryptographic hash functions for this reason.
However, SHA-1's collision resistance was theoretically broken in 2005. This allows finding two different input that has the same hash value faster than the generic birthday attack that has 280 cost with 50% probability. In 2017, the collision attack become practicable as known as shattered.
As of 2015, NIST dropped SHA-1 for signatures. You should consider using something stronger like SHA-256 for new applications.
Jon Callas on SHA-1:
It's time to walk, but not run, to the fire exits. You don't see smoke, but the fire alarms have gone off.
The question is actually how to authenticate over an insecure session.
The standard why to do this is to use a message digest, e.g. HMAC.
You send the message plaintext as well as an accompanying hash of that message where your secret has been mixed in.
So instead of your:
sha1("My Secret Key"+"a timestamp")
You have:
msg,hmac("My Secret Key",sha(msg+msg_sequence_id))
The message sequence id is a simple counter to keep track by both parties to the number of messages they have exchanged in this 'session' - this prevents an attacker from simply replaying previous-seen messages.
This the industry standard and secure way of authenticating messages, whether they are encrypted or not.
(this is why you can't brute the hash:)
A hash is a one-way function, meaning that many inputs all produce the same output.
As you know the secret, and you can make a sensible guess as to the range of the timestamp, then you could iterate over all those timestamps, compute the hash and compare it.
Of course two or more timestamps within the range you examine might 'collide' i.e. although the timestamps are different, they generate the same hash.
So there is, fundamentally, no way to reverse the hash with any certainty.
In mathematical terms, only bijective functions have an inverse function. But hash functions are not injective as there are multiple input values that result in the same output value (collision).
So, no, hash functions can not be reversed. But you can look for such collisions.
Edit
As you want to authenticate the communication between your systems, I would suggest to use HMAC. This construct to calculate message authenticate codes can use different hash functions. You can use SHA-1, SHA-256 or whatever hash function you want.
And to authenticate the response to a specific request, I would send a nonce along with the request that needs to be used as salt to authenticate the response.
It is not entirely true that you cannot reverse SHA-1 encrypted string.
You cannot directly reverse one, but it can be done with rainbow tables.
Wikipedia:
A rainbow table is a precomputed table for reversing cryptographic hash functions, usually for cracking password hashes. Tables are usually used in recovering a plaintext password up to a certain length consisting of a limited set of characters.
Essentially, SHA-1 is only as safe as the strength of the password used. If users have long passwords with obscure combinations of characters, it is very unlikely that existing rainbow tables will have a key for the encrypted string.
You can test your encrypted SHA-1 strings here:
http://sha1.gromweb.com/
There are other rainbow tables on the internet that you can use so Google reverse SHA1.
Note that the best attacks against MD5 and SHA-1 have been about finding any two arbitrary messages m1 and m2 where h(m1) = h(m2) or finding m2 such that h(m1) = h(m2) and m1 != m2. Finding m1, given h(m1) is still computationally infeasible.
Also, you are using a MAC (message authentication code), so an attacker can't forget a message without knowing secret with one caveat - the general MAC construction that you used is susceptible to length extension attack - an attacker can in some circumstances forge a message m2|m3, h(secret, m2|m3) given m2, h(secret, m2). This is not an issue with just timestamp but it is an issue when you compute MAC over messages of arbitrary length. You could append the secret to timestamp instead of pre-pending but in general you are better off using HMAC with SHA1 digest (HMAC is just construction and can use MD5 or SHA as digest algorithms).
Finally, you are signing just the timestamp and the not the full request. An active attacker can easily attack the system especially if you have no replay protection (although even with replay protection, this flaw exists). For example, I can capture timestamp, HMAC(timestamp with secret) from one message and then use it in my own message and the server will accept it.
Best to send message, HMAC(message) with sufficiently long secret. The server can be assured of the integrity of the message and authenticity of the client.
You can depending on your threat scenario either add replay protection or note that it is not necessary since a message when replayed in entirety does not cause any problems.
Hashes are dependent on the input, and for the same input will give the same output.
So, in addition to the other answers, please keep the following in mind:
If you start the hash with the password, it is possible to pre-compute rainbow tables, and quickly add plausible timestamp values, which is much harder if you start with the timestamp.
So, rather than use
sha1("My Secret Key"+"a timestamp")
go for
sha1("a timestamp"+"My Secret Key")
I believe the accepted answer is technically right but wrong as it applies to the use case: to create & transmit tamper evident data over public/non-trusted mediums.
Because although it is technically highly-difficult to brute-force or reverse a SHA hash, when you are sending plain text "data & a hash of the data + secret" over the internet, as noted above, it is possible to intelligently get the secret after capturing enough samples of your data. Think about it - your data may be changing, but the secret key remains the same. So every time you send a new data blob out, it's a new sample to run basic cracking algorithms on. With 2 or more samples that contain different data & a hash of the data+secret, you can verify that the secret you determine is correct and not a false positive.
This scenario is similar to how Wifi crackers can crack wifi passwords after they capture enough data packets. After you gather enough data it's trivial to generate the secret key, even though you aren't technically reversing SHA1 or even SHA256. The ONLY way to ensure that your data has not been tampered with, or to verify who you are talking to on the other end, is to encrypt the entire data blob using GPG or the like (public & private keys). Hashing is, by nature, ALWAYS insecure when the data you are hashing is visible.
Practically speaking it really depends on the application and purpose of why you are hashing in the first place. If the level of security required is trivial or say you are inside of a 100% completely trusted network, then perhaps hashing would be a viable option. Hope no one on the network, or any intruder, is interested in your data. Otherwise, as far as I can determine at this time, the only other reliably viable option is key-based encryption. You can either encrypt the entire data blob or just sign it.
Note: This was one of the ways the British were able to crack the Enigma code during WW2, leading to favor the Allies.
Any thoughts on this?
SHA1 was designed to prevent recovery of the original text from the hash. However, SHA1 databases exists, that allow to lookup the common passwords by their SHA hash.
Is it possible to reverse a SHA-1?
SHA-1 was meant to be a collision-resistant hash, whose purpose is to make it hard to find distinct messages that have the same hash. It is also designed to have preimage-resistant, that is it should be hard to find a message having a prescribed hash, and second-preimage-resistant, so that it is hard to find a second message having the same hash as a prescribed message.
SHA-1's collision resistance is broken practically in 2017 by Google's team and NIST already removed the SHA-1 for signature purposes in 2015.
SHA-1 pre-image resistance, on the other hand, still exists. One should be careful about the pre-image resistance, if the input space is short, then finding the pre-image is easy. So, your secret should be at least 128-bit.
SHA-1("My Secret Key"+"a timestamp")
This is the pre-fix secret construction has an attack case known as the length extension attack on the Merkle-Damgard based hash function like SHA-1. Applied to the Flicker. One should not use this with SHA-1 or SHA-2. One can use
HMAC-SHA-256 (HMAC doesn't require the collision resistance of the hash function therefore SHA-1 and MD5 are still fine for HMAC, however, forgot about them) to achieve a better security system. HMAC has a cost of double call of the hash function. That is a weakness for time demanded systems. A note; HMAC is a beast in cryptography.
KMAC is the pre-fix secret construction from SHA-3, since SHA-3 has resistance to length extension attack, this is secure.
Use BLAKE2 with pre-fix construction and this is also secure since it has also resistance to length extension attacks. BLAKE is a really fast hash function, and now it has a parallel version BLAKE3, too (need some time for security analysis). Wireguard uses BLAKE2 as MAC.
Then I include this SHA-1 in the communication and the server, which can do the same calculation. And hopefully, nobody would be able to figure out the "secret key".
But is this really true?
If you know that this is how I did it, you would know that I did put a timestamp in there and you would see the SHA-1. Can you then use those two and figure out the "secret key"?
secret_key = bruteforce_sha1(sha1, timestamp)
You did not define the size of your secret. If your attacker knows the timestamp, then they try to look for it by searching. If we consider the collective power of the Bitcoin miners, as of 2022, they reach around ~293 double SHA-256 in a year. Therefore, you must adjust your security according to your risk. As of 2022, NIST's minimum security is 112-bit. One should consider the above 128-bit for the secret size.
Note1: I guess you could brute force in some way, but how much work would that actually be?
Given the answer above. As a special case, against the possible implementation of Grover's algorithm ( a Quantum algorithm for finding pre-images), one should use hash functions larger than 256 output size.
Note2: I don't plan to encrypt any data, I just would like to know who sent it.
This is not the way. Your construction can only work if the secret is mutually shared like a DHKE. That is the secret only known to party the sender and you. Instead of managing this, a better way is to use digital signatures to solve this issue. Besides, one will get non-repudiation, too.
Any hashing algorithm is reversible, if applied to strings of max length L. The only matter is the value of L. To assess it exactly, you could run the state of art dehashing utility, hashcat. It is optimized to get best performance of your hardware.
That's why you need long passwords, like 12 characters. Here they say for length 8 the password is dehashed (using brute force) in 24 hours (1 GPU involved). For each extra character multiply it by alphabet length (say 50). So for 9 characters you have 50 days, for 10 you have 6 years, and so on. It's definitely inaccurate, but can give us an idea, what the numbers could be.

authentication token is encrypted but not signed - weakness?

Through the years I've come across this scenario more than once. You have a bunch of user-related data that you want to send from one application to another. The second application is expected to "trust" this "token" and use the data within it. A timestamp is included in the token to prevent a theft/re-use attack. For whatever reason (let's not worry about it here) a custom solution has been chosen rather than an industry standard like SAML.
To me it seems like digitally signing the data is what you want here. If the data needs to be secret, then you can also encrypt it.
But what I see a lot is that developers will use symmetric encryption, e.g. AES. They are assuming that in addition to making the data "secret", the encryption also provides 1) message integrity and 2) trust (authentication of source).
Am I right to suspect that there is an inherent weakness here? At face value it does seem to work, if the symmetric key is managed properly. Lacking that key, I certainly wouldn't know how to modify an encrypted token, or launch some kind of cryptographic attack after intercepting several tokens. But would a more sophisticated attacker be able to exploit something here?
Part of it depends on the Encryption Mode. If you use ECB (shame on you!) I could swap blocks around, altering the message. Stackoverflow got hit by this very bug.
Less threatening - without any integrity checking, I could perform a man-in-the-middle attack, and swap all sorts of bits around, and you would receive it and attempt to decrypt it. You'd fail of course, but the attempt may be revealing. There are side-channel attacks by "Bernstein (exploiting a combination of cache and microarchitectural characteristics) and Osvik, Shamir, and Tromer (exploiting cache collisions) rely on gaining statistical data based on a large number of random tests." 1 The footnoted article is by a cryptographer of greater note than I, and he advises reducing the attack surface with a MAC:
if you can make sure that an attacker
who doesn't have access to your MAC
key can't ever feed evil input to a
block of code, however, you
dramatically reduce the chance that he
will be able to exploit any bugs
Yup. Encryption alone does not provide authentication. If you want authentication then you should use an message authentication code such as HMAC or digital signatures (depending on your requirements).
There are quite a large number of attacks that are possible if messages are just encrypted, but not authenticated. Here is just a very simple example. Assume that messages are encrypted using CBC. This mode uses an IV to randomize the ciphertext so that encrypting the same message twice does not result in the same ciphertext. Now look what happens during decryption if the attacker just modifies the IV but leaves the remainder of the ciphertext as is. Only the first block of the decrypted message will change. Furthermore exactly those bits changed in the IV change in the message. Hence the attacker knows exactly what will change when the receiver decrypts the message. If that first block
was for example a timestamp an the attacker knows when the original message was sent, then he can easily fix the timestamp to any other time, just by flipping the right bits.
Other blocks of the message can also be manipulated, though this is a little trickier. Note also, that this is not just a weakness of CBC. Other modes like OFB, CFB have similiar weaknesses. So expecting that encryption alone provides authentication is just a very dangerous assumption
A symmetric encryption approach is as secure as the key. If both systems can be given the key and hold the key securely, this is fine. Public key cryptography is certainly a more natural fit.

Resources