Decrypting an image [closed] - security

Closed 2 years ago.
I need to decrypt a png file. I can't open/view the image because it is encrypted. When I run the file command on the image in the command line, it says that it's a 'data' type.
I know the image is encrypted using XOR (as with a one-time pad), with a secret key that I do not know.
I only have the image file and no other information. How should I go about finding out the secret key?

You would also have to know whether the pad is as long as the original image. If the pad is shorter, it is repeated until the end of the plaintext (strictly speaking, that is a repeating-key XOR rather than a one-time pad). If it is between 1 and 7 bytes long, recovery is really easy, because the first 8 bytes of every PNG file are known: \x89PNG\r\n\x1a\n.
Calculate key[0] = ciphertext[0] ^ 0x89. If key[0] ^ ciphertext[1] == 'P', the key is most likely a single byte and you have it. Otherwise, check key[0] ^ ciphertext[i] == knownHeader[i] for i from 2 to 7 (counting from 0) to see where the key starts over. The first i that matches is the likely length of the pad. The key bytes are then ciphertext[j] ^ knownHeader[j] for j from 0 to i - 1.
The only remaining thing is to use the whole key to decrypt the whole file and check whether it is something sensible.
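The header check described above can be sketched in Python. This is only a minimal sketch, assuming a repeating key of 1 to 7 bytes; the names xor_with_key and candidate_keys are made up for illustration.

```python
from itertools import cycle

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as often as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def candidate_keys(ciphertext: bytes):
    """Yield each key of 1 to 7 bytes that decrypts the header to the PNG signature."""
    head = ciphertext[:len(PNG_SIGNATURE)]
    # Keystream bytes implied by the known header: ciphertext XOR plaintext.
    stream = xor_with_key(head, PNG_SIGNATURE)
    for length in range(1, len(PNG_SIGNATURE)):
        key = stream[:length]
        # A candidate is consistent if repeating it reproduces the whole header.
        if xor_with_key(head, key) == PNG_SIGNATURE:
            yield key
```

The shortest candidate is the most plausible one (multiples of the true key length match as well). Decrypt the whole file with xor_with_key and run the file command on the result to check it.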

XORing with a random bitstream results in practically random output, because each bit flips with a probability of 0.5 (see Shannon, 1949). If you have neither the key nor the image, you have no way to recover the image or the key itself.
If this is some kind of a challenge, you can try to use the properties of XOR to your advantage. If, for example, there is another image XORed with the same key, and you have that image both in plaintext and encrypted, you can find the key by XORing the unencrypted image with the encrypted one. Or you can try guessing keys that are for some reason obvious in your context.
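As a rough illustration of the known-plaintext case, here is a small Python sketch. It assumes both files were XORed with the same key starting at the same position; the function names are hypothetical.

```python
def recover_keystream(plaintext: bytes, ciphertext: bytes) -> bytes:
    """Return the key byte used at each position: plaintext XOR ciphertext."""
    return bytes(p ^ c for p, c in zip(plaintext, ciphertext))


def decrypt(ciphertext: bytes, keystream: bytes) -> bytes:
    """Decrypt as many bytes as the recovered keystream covers."""
    return bytes(c ^ k for c, k in zip(ciphertext, keystream))
```

The recovered keystream is only as long as the known image, so a longer target file can only be decrypted up to that point unless the key repeats.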
Also, a one-time pad only has its great security properties if the key is truly random and has the same length as the input. Your key may, for example, be shorter and repeated.
One thing you can exploit is the fact that an image file has a known format, so you can work out quite a few bits of the key if you know what format the file should be. At the beginning there is the magic number that identifies the file format, and then, depending on the actual image type, some fields have only a few or just one possible value. Fill those in to the key you are constructing and see if there is a pattern. But again, if the key is truly random, never reused, and as long as the image, it is not possible to recover the image file.

Related

How to create Processing challenges

I have an idea for my login and verification system, but I'm wondering if it's the correct idea or if it's already been implemented somewhere else.
To try and mitigate some attacks, before login I'd like to pose a challenge to the client. So I'm thinking of giving the user a computational challenge. I need to somehow generate a problem or some kind of code that would take an average machine X seconds to process, with the answer being something that I could quickly calculate and check.
For instance, I know that hashing/encryption algorithms take time. Perhaps I could use a simple encryption algorithm and generate a random short password and a longer keyword. I then encrypt the keyword with the password and send that to the client.
Then the client has to randomly guess and check the password, and when it gets it, it sends me the keyword.
EXAMPLE:
Password:Foo
Length: 3
Answer: Bar
Keyword:123456
Output: encrypted(answer+keyword)->o3jubo3ibf32ib3o
SEND-> o3jubo3ibf32ib3o,Answer:Bar.
Then the client processes this and tries to come up with the password, guessing and checking if the first 3 letters = Bar.
When it finds an answer, it sends me back the keyword and the password to check. I check if the keyword matches; if it does, I continue. If it does not, I try to decrypt the challenge with the password they used to see whether the first 3 letters equal the answer. If they ever give me a password that does not produce the answer in the first 3 letters, I blacklist the IP for 5 or so minutes and sever the connection.
Does this seem like it would work?
Has a library for this already been created?
Are there examples in C++ I can look at?
What would be the best encryption method for this approach?
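For comparison, a closely related idea is a hash-based challenge, often called proof of work. The Python sketch below is not the encrypt-and-guess scheme described above but a swapped-in variant: the server sends a random challenge, and the client searches for a nonce whose SHA-256 hash has a given number of leading zero bits. The function names and the difficulty parameter are assumptions made for illustration.

```python
import hashlib
import os
from itertools import count


def make_challenge(size: int = 16) -> bytes:
    """Server side: create a fresh random challenge."""
    return os.urandom(size)


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force nonces until one passes verification."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

Each extra bit of difficulty doubles the expected number of hashes the client needs, while the server still checks an answer with one hash. The time this takes varies a lot between machines, so any "X seconds" target is only approximate.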

What does the term 'security value' mean in the AES encryption algorithm?

I studied a lot trying to answer these three questions, but I still can't get it.
In Question 1: I don't know what the definition of 'security value' is.
In Questions 2 & 3: I can't see any security weaknesses in the scheme.
The terminology is confusing, but normally we call that value the initialization vector (IV). The point of using a random IV is that if you encrypt the same message twice, you do not get the same ciphertext twice, which is necessary to prevent leaking information (for example, see what happens in ECB mode -- look at the penguin images).
To the second question, a general security requirement is that even if the attacker supplies some number of inputs (m_i), he should not be able to figure out what other inputs are that he has no control over. To understand why, look at padding oracle attacks such as POODLE. If you understand POODLE, you should be able to come up with something similar on this proposed scheme.
Here's a hint on some problems that may happen. Think about the case of m_1 = m_2. Then what is c_2? I'm not going to solve the homework for you, but that should point you in the right direction.

What is the encrypted text shown in address bar?

First, I don't know whether this is the right place to ask this question. When we open some specific sites, submit a form, or log in to a site, some encrypted-looking text is appended to the address bar as a query string, but I have no idea whether it is a session ID or something else.
And if it is a session ID, is it a good approach to disclose it?
For example: https://www.google.co.in/?gfe_rd=cr&ei=q1HiWI2kLO3s8Ae3raXwCQ
https://my.naukri.com/Inbox/viewRecruiterMails?id=d786bc1c09837cc9ca692d042c01186294584fccc83209d4fe409a9be01b6ec61edd7a843282321a
I mean the string after ei= in the first example and after id= in the second one.
What you are seeing is just encoded binary values (byte values). The first instance seems to use base 64 for the encoding (probably the URL-safe variant of it), and the second one uses hexadecimal encoding of the bytes.
What the data means (possibly after decoding) depends on the protocol defined for the site. There aren't any specific rules. IDs generally contain about 128 bits of randomness though.
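To look at the raw bytes yourself, something like the following Python sketch should work. It assumes the first value really is URL-safe base 64 with its padding stripped and the second is plain hexadecimal; decode_base64url is a made-up helper name.

```python
import base64


def decode_base64url(text: str) -> bytes:
    """Decode URL-safe Base64, restoring any '=' padding that was stripped."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


ei_text = "q1HiWI2kLO3s8Ae3raXwCQ"
id_text = "d786bc1c09837cc9ca692d042c01186294584fccc83209d4fe409a9be01b6ec61edd7a843282321a"

ei_bytes = decode_base64url(ei_text)
id_bytes = bytes.fromhex(id_text)

print(len(ei_bytes), ei_bytes.hex())
print(len(id_bytes))
```

The first value decodes to 16 bytes and the second to 40 bytes. Decoding only reveals the bytes; what they mean is still up to the site.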

Using "seed" based math to recreate application instances

Okay, so I was thinking today about Minecraft, a game I'm sure many of you are familiar with. My question isn't directly related to the game, but I find it much simpler to describe my question using the game as an example.
My question is: is there any way a type of "seed" or string of characters can be used to recreate an instance of a program (not in the literal programming sense)? That is, by storing a code which, when re-entered into the program as a string at run-time, could recreate the data it once held (in fields, text boxes, canvases, for example) exactly as it was.
As I understand it, Minecraft takes the string of ASCII characters you enter, all of which are really numbers, and performs a series of operations on it which evaluate to some type of hash or finite number. This number (again, as I understand it) is the representation of the string you entered. A string parsed by this algorithm will always evaluate to the same hash: 1 + 1 will always = 2, so a seed's value must always equal that seed's value in the end. Because of this, you can replicate worlds exactly by entering this sort of key, which is evaluated the same on every machine.
Now, if we can exactly replicate worlds like this, is it possible to bring it into a more abstract concept like the following?
Say you have an application, like Microsoft Word. Word saves the data you have entered as a file on your hard drive. It holds formatting data, the strings you've entered, the format of the file, all in a physical file. Now imagine that when you have entered your essay into Word, instead of saving it and bringing your laptop to school, you click on "parse" and, instead of a file being created, you are given a hash code. Now you go to school and you know you have to print the essay, so you log onto the computer and open Word. Instead of "open" there is now an option called "evaluate". You click it, enter the hash your other computer produced, and it creates the exact essay you have written.
Is this possible, and if so, are there obvious implementations of this that I am simply not thinking of, or that are so much a part of everyday life that I don't recognize them? And finally, if possible, what methods and algorithms would go into such a thing?
[EDIT]
I had to do some research on the anatomy of a seed and I think this explains it well
"The limit is 32 characters or, for a numeric seed, 19 digits plus the minus sign. Numeric seeds can range from -9223372036854775808 to 9223372036854775807, which is a total of 18446744073709551616. Text strings entered will be "hashed" to one of the numeric seeds in the above range. The "Seed for the World Generator" window only allows 32 characters to be entered and will not show or use any more than that."
BUT looking back on it, lossless compression IS EXACTLY what I was describing. After re-reading the wiki page I remembered that (you are very correct) the seed only takes part in the generation; the final data is stored as a "physical" file on the HDD, which (again, you are correct) is raw uncompressed data in a file.
So in retrospect, I believe I was describing lossless compression, trying in my mind to figure out how the seed was able to replicate the exact same world, forgetting the seed was only responsible for generating the code, not the saving or compression of it.
So thank you for your help guys! It's really appreciated. I believe we can call this one solved!
There are several possibilities to achieve this "string" that recovers your data. However, which of them are applicable depends on the context.
1. An actual seed, which initializes for example a pseudo-random number generator and then allows you to recreate the same sequence of pseudo-random numbers (see this question).
This is possibly similar to what Minecraft relies on, because the whole process of how to create a world based on some choices (possibly pseudo-random choices) is known in advance. Even if we pretend that we have random numbers, computers are actually deterministic, which makes this possible.
If your document were generated randomly then this would be applicable: with the same seed, the same gibberish comes out.
2. Some key-value dictionary, or hash map. The values then have to be accessible by both sides, and the string is the key that allows you to retrieve the value.
Think for example of storing your Word file on an online server; your key is then the URL linking to your file.
3. Compressing all the information that is in your data into the string. This is much harder, and there are strong limits due to the entropy of the data. See Shannon's source coding theorem for example.
You would be better off (as in, it would be easier) just compressing your file with a usual algorithm (zip or 7z or something else) rather than reimplementing it yourself, especially as soon as your document starts having fancy things (different styles, tables, pictures, unusual characters...)
With the simple hypothesis of 27 possible characters (26 letters and the space), Shannon himself shows in Prediction and Entropy of Printed English (Bell System Technical Journal, 30: 1. January 1951 pp 50-64, online version) that there is about 2.14 bits of entropy per letter in English. At 8 bits per character, your 32 character string holds 256 bits, so that's about 256 / 2.14, or roughly 120 characters of English encoded with your 32 character string.
While this is significantly better than the 8 bits we use for each ASCII character, it also shows it is very likely to be impossible to encode a document in English in less than a fourth of its size. Then you'd still have to add punctuation, and all the rest of the fuss.
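Two of the points above can be checked with a few lines of Python. This is only a sketch: generate shows that a seeded pseudo-random generator reproduces the same output, and english_capacity redoes the entropy estimate using the 2.14 bits per letter quoted above and an assumed 8 bits per character of the code string. Both function names are made up for illustration.

```python
import random


def generate(seed: str, count: int = 5) -> list:
    """Produce the same pseudo-random numbers every time for a given seed."""
    rng = random.Random(seed)  # private generator, seeded with the string
    return [rng.randint(0, 99) for _ in range(count)]


def english_capacity(code_chars: int,
                     bits_per_code_char: float = 8.0,
                     entropy_per_letter: float = 2.14) -> float:
    """Estimate how many letters of English fit into a code of the given size."""
    return code_chars * bits_per_code_char / entropy_per_letter


print(generate("my world") == generate("my world"))
print(round(english_capacity(32)))
```

The seed only reproduces data that was generated from it in the first place. It cannot stand in for an arbitrary document, which is where the capacity estimate comes in.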

How to filesearch for a substring of a base64'd string

I have a client with a website that looks as if it has been hacked. Random pages throughout the site will (seemingly at random) automatically forward to a youtube video. This happens for a while (not sure how long yet... still trying to figure that out) and then the redirect disappears. May have something to do with our site caching though. Regardless, the client isn't happy about it.
I'm searching the code base (this is a WordPress site, but this question was generic enough that I put it here instead of in the WordPress groups...) for "base64_decode" but not having any luck.
So, since I know the specific url that the site is getting forwarded to every time, I thought I'd search for the video id that is in the youtube url. This method could also be pertinent when the hack-inserted base64'd string is defined to a variable and then that variable is decoded (so a grep for "base64_decode" wouldn't necessarily come up with any answers that looked suspicious).
So, what I'm wondering is if there's a way to search for a substring of a string that has been base64'd and then inserted into the code. Like, take the substring I'm searching for, base64 it, and then search the code base for the resultant string. (Maybe after manipulating it slightly?)
Is there a way to do that? Is that method even valid? I don't really have any idea how the whole base64 algorithm works, or if this is possible, so I thought I'd quickly throw the question out here to see if anyone else did.
Nothing to it (for somebody with the chutzpah to call himself "Programmer Dan").
Well, maybe a little. You have to know the encoding for the values 0 to 63.
In general, encoding to Base64 is done by taking three 8-bit characters of plain text at a time, breaking those bits into four sets of 6-bit numbers, and creating four characters of encoded text by converting the numbers (0 to 63) to arbitrary characters. Actually, the encoded characters aren't completely arbitrary, as they must be acceptable to pretty much ANY method of transmission, since that's the original reason for using Base64 encoding. The system I usually work with uses {A..Z,a..z,0..9,+,/} in that order.
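The three-bytes-to-four-characters step described above can be sketched in Python like this. The alphabet is the {A..Z,a..z,0..9,+,/} ordering mentioned in the text; encode_group is a hypothetical helper, and real code would normally just call the base64 module.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    value = int.from_bytes(three_bytes, "big")  # 24 bits in one integer
    # Split the 24 bits into four 6-bit numbers, most significant first.
    sextets = [(value >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[s] for s in sextets)


# The hand-rolled result agrees with the standard library.
print(encode_group(b"Man"), base64.b64encode(b"Man").decode())
```

Because each encoded character can straddle two input bytes, the encoding of a substring depends on where it sits within a group of three, which is what the search method has to account for.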
If one wanted to be nasty (which one might expect in the case you're dealing with), one might change the order, or even the characters, during the process. Of course, if you have examples of the encoded Base64, you can see what the character set is (unless the encoding uses more than 64 characters). But you still have the possibility of things like changing the order as you encode or decode (simple rotation, for example). But, I digress. The question is about searching for encoded text, not deciphering deliberate obfuscation. I could tell you a lot about that, too.
Simple methodology:
1. Encode the plain text you're looking for. If the encoding results in one or two equal signs (padding) at the end, eliminate them and the last encoded character that precedes them. Search for the result.
2. Same as (1) except stick a blank on the front of your plain text. Eliminate the first two encoded characters. Search for the result.
3. Same as (2) except with two blanks on the front. This time, eliminate the first three encoded characters, because the third one still contains bits of the second blank. Search for the result.
These three searches will find all files containing the encoding of the plain text you're looking for.
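The three searches can be sketched in Python as follows. This is a rough sketch assuming the standard alphabet and no deliberate obfuscation; base64_search_patterns is a made-up name. Note that it drops three leading characters in the two-blank case, since the third character still mixes in bits of the byte before the text.

```python
import base64


def base64_search_patterns(needle: bytes) -> list:
    """Return one search string per possible alignment of needle in a 3-byte group."""
    leading_to_drop = (0, 2, 3)  # characters tainted by 0, 1 or 2 preceding bytes
    patterns = []
    for offset in range(3):
        encoded = base64.b64encode(b" " * offset + needle)
        stripped = encoded.rstrip(b"=")
        if len(stripped) != len(encoded):
            # The character before the padding depends on whatever follows the needle.
            stripped = stripped[:-1]
        patterns.append(stripped[leading_to_drop[offset]:])
    return patterns


for pattern in base64_search_patterns(b"youtube.com/watch?v="):
    print(pattern.decode())
```

Each printed string can then be fed to a plain text search over the code base. Very short search terms leave little or nothing after trimming, so the text to look for should be at least several characters long.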
This is all “air code”, meaning off the top of my head, at best. Some might suggest I pulled it out of somewhere else. I can think of three possible problems with this algorithm, excluding any issues of efficiency. But, that’s what you get at this price.
Let me know if you want the working version. Or send me yours. Good luck.
Cplusman
