Bittorrent tracker request, format of info_hash - bittorrent

When I want to send an initial request to a tracker, all the references I've seen say the info_hash needs to be URL-encoded. If I transform the SHA-1 hash I have of the info key into a hex string, why would I need to URL-encode the hash? It only contains allowed characters.

The info_hash parameter is not a hex string. It's a pure binary string, so yes, you will have to URL-encode many of the bytes in it. (This tends to make it longer in the end than just using a hex-encoded string, but that's the BitTorrent protocol for you, too late to do anything about it now!)

The binary form of the info-hash (a 20-byte SHA-1 digest) should be URL-encoded. AFAIK some trackers also accept a plain hexadecimal info-hash (a 40-character string).
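To make the difference concrete, here is a minimal Python sketch of percent-encoding the raw 20 bytes. The digest value and the tracker URL are made up for illustration; in a real client the digest would come from hashing the bencoded info dictionary.

```python
from urllib.parse import quote, unquote_to_bytes

# Example 20-byte digest, written in hex only for readability.
# In a real client: hashlib.sha1(bencoded_info_dict).digest()
digest = bytes.fromhex("123456789abcdef123456789abcdef123456789a")

# Percent-encode the raw bytes. safe="" so that "/" is escaped as well.
info_hash_param = quote(digest, safe="")

# Hypothetical announce URL, for illustration only.
announce_url = "http://tracker.example/announce?info_hash=" + info_hash_param

print(info_hash_param)   # %124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
print(len(digest.hex()), len(info_hash_param))  # hex form vs. URL-encoded form

# Decoding the parameter gives back the original 20 bytes.
assert unquote_to_bytes(info_hash_param) == digest
```

Bytes that happen to be unreserved ASCII characters (such as 0x34, "4") pass through as-is; everything else becomes %XX, which is why the encoded form is usually longer than the 40-character hex string.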

Related

SHA512 to UTF8 to byte encryption

a question:
A parcel service provider requests that the password is encoded in a specific way:
KEY -> UTF8 Encoding -> SHA512
The KEY should be in byte form, not string
currently I have this in Node.js with CryptoJS:
password = CryptoJS.SHA512(CryptoJS.enc.Utf8.parse(key))
or
password = CryptoJS.SHA512(CryptoJS.enc.Utf8.stringify(key))
Don't know which one is the right one.
I need to convert the key to bytes, how do I do that?
Keys are arbitrary sequences of bytes, and SHA-512 works on arbitrary sequences of bytes. However, UTF-8 can't encode arbitrary sequences of bytes. It can only encode Unicode code points. What you're asking for isn't possible. (I suggest posting precisely what the requirement is. It's possible you're misreading it.)
You need another encoding, such as Base64 or Hex. The output of either of those is compatible with UTF-8 (they both output subsets of UTF-8).
That said, this is a very strange request, since you already have exactly the correct input for SHA-512. Converting it to a string and then converting that string back to (likely different) bytes seems a pointless step, but if you need it, you'll need a byte encoding like Base64 or Hex.
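For illustration, here is a rough Python sketch of the two readings (Python rather than CryptoJS, and the function names are invented here). If the key is really text, encoding it to UTF-8 and hashing is straightforward. If the key is arbitrary bytes, it cannot always be treated as UTF-8, so a byte encoding such as hex would have to come first.

```python
import hashlib

def sha512_of_text_key(key: str) -> str:
    """KEY (text) -> UTF-8 bytes -> SHA-512, returned as hex."""
    return hashlib.sha512(key.encode("utf-8")).hexdigest()

def sha512_of_binary_key_via_hex(key: bytes) -> str:
    """KEY (raw bytes) -> hex string -> UTF-8 bytes -> SHA-512.
    Only needed if a string step is really required."""
    return hashlib.sha512(key.hex().encode("utf-8")).hexdigest()

# Arbitrary bytes are not necessarily valid UTF-8:
try:
    b"\xff\xfe".decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

print(sha512_of_text_key("my-secret-key"))
print(sha512_of_binary_key_via_hex(b"\xff\xfe"))
print(decodable)  # False
```

Which of the two the provider actually wants is something only their specification can answer.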

How to get pieces value from .torrent file

I am trying to build a .torrent file interpreter. The problem is that I can't seem to understand how to go about interpreting the pieces value. I am aware that the pieces key contains a concatenation of the SHA-1 hashes for each piece and that each SHA-1 hash is 20 bytes long. As a result, the length of the pieces value should be a multiple of 20 bytes. However, when I count the bytes of the pieces value, either as a string or in hexadecimal form, the length is not a multiple of 20. How should I interpret the pieces key?
Here we use bencode and bdecode, and the pieces value can be obtained easily. I think you should first read the BEP for more details. What's more, you can see this and use it as an example.
From looking at a real torrent file, I found that the SHA-1 hashes had to be taken from its hexadecimal string format, but I previously thought that it was wrong because the byte length of the hash was not a multiple of 20. It turns out I forgot to add a leading 0 to hexadecimal values that were only 1 character long (e.g. a had to be changed to 0a).
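A small Python sketch of the idea, assuming the pieces value has already been bdecoded into a bytes object (the sample data is fabricated from two made-up pieces): split it into 20-byte chunks and format every byte as exactly two hex digits.

```python
import hashlib
from typing import List

SHA1_LEN = 20  # a SHA-1 digest is always 20 bytes

def split_pieces(pieces: bytes) -> List[str]:
    """Split the raw 'pieces' value into per-piece hex digests."""
    if len(pieces) % SHA1_LEN != 0:
        raise ValueError("pieces length is not a multiple of 20 bytes")
    return [pieces[i:i + SHA1_LEN].hex()
            for i in range(0, len(pieces), SHA1_LEN)]

# Fabricated example: the digests of two made-up pieces, concatenated.
pieces = hashlib.sha1(b"piece one").digest() + hashlib.sha1(b"piece two").digest()
hashes = split_pieces(pieces)
print(len(pieces), hashes)

# The padding pitfall: a byte below 0x10 needs two hex digits, not one.
print(format(0x0A, "x"), format(0x0A, "02x"))  # a 0a
```

Counting bytes on the raw value, rather than on a text or hex rendering of it, avoids the problem entirely.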

Why encode a JSON payload to base64?

On the Code Jam site, they return a JSON string as a Base64-encoded string.
The actual json payload's size is less than the base64 encoded string.
What's the reason behind returning the payload as base64 encoded string?
7-bit schemes are, or tend to be, transport neutral. There is a natural corruption check at least as good as CRC, and the JSON is less likely to be mangled by well-meaning library functions (CRLF, anti-injection, SQL parsing). Yes, it's going to be longer. 7 goes into anything more times than 8 (or more).
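To put a number on the size cost, here is a short Python sketch with a made-up payload. Base64 turns every 3 input bytes into 4 ASCII characters, so the encoded form is about a third larger, but it contains only characters that survive most transports untouched.

```python
import base64
import json

# Made-up payload, for illustration only.
payload = {"status": "ok", "text": "line one\r\nline two", "items": [1, 2, 3]}

raw = json.dumps(payload).encode("utf-8")
encoded = base64.b64encode(raw)

# Every 3 input bytes become 4 output characters (with padding).
print(len(raw), len(encoded))

# The receiver reverses both steps and gets the same object back.
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
print(decoded == payload)  # True
```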

Can ALL string be decoded as valid binary data?

As known, Base-64 encodes binary data into transferable ASCII strings, and we decode these strings back to data.
Now my question is inverted: Can every random string be decoded as binary data, and correctly encoded back to the exact original string?
It depends upon your coding method. Some methods use only a limited range of characters, so a string containing other characters would not be legal. This is the case in Base64, so the answer is no. With other methods I'm sure it's possible, but I cannot think of an example other than simply treating the string as binary bytes.
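Both halves of that can be seen in a few lines of Python (a sketch; the helper name is invented here). Strict Base64 decoding rejects a string with characters outside its alphabet, while treating the string as UTF-8 bytes round-trips exactly.

```python
import base64

def try_base64(text: str):
    """Return the decoded bytes, or None if text is not valid Base64."""
    try:
        return base64.b64decode(text, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None

arbitrary = "hello, world!"

print(try_base64("aGVsbG8="))  # b'hello'
print(try_base64(arbitrary))   # None: "," " " and "!" are not in the alphabet

# Treating the string as bytes works for any ordinary string.
as_bytes = arbitrary.encode("utf-8")
print(as_bytes.decode("utf-8") == arbitrary)  # True
```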

SHA-1 Digest Reduction

I'm using QR Code barcodes to store UUIDs in my system and I need to check that the barcodes generated are mine and not someone else's. I also need to keep the encoded data short so that the QR Codes remain in the lower version range and remain easy to scan.
My approach is to take the UUID raw value (a 128-bit number) and a 16-bit checksum and then Base64 encode that data before converting it to a QR code. So far so good, this works perfectly.
To generate the checksum I take the string version of the UUID, combine it with a long secret string, and produce a SHA-1 hash of the result. But this hash is too long, so I XOR all the odd bytes together to produce half the checksum, and likewise with the even bytes to produce the other half.
What worries me is that I have compromised the SHA-1 system needlessly by XORing it down. Would it be better to just take two unmanipulated bytes from somewhere within the result? I accept that a 16-bit checksum won't be as secure as a 160-bit checksum, but that is a price I have to pay for usability with the barcodes. What I really don't want to find is that I've now provided a checksum that is easy to crack as the UUID is transmitted in the clear.
If there is a better way of generating the checksum, that would also be a suitable answer to the question. As always, many thanks for your time or just for reading this, double plus good thanks if you post an answer.
There's no reason to do any XORing. Simply taking the first two bytes will be as (in)secure.
To keep the code version as small as possible, you might want to convert the 144-bit value to a decimal string and encode that. QR Codes have different character sets and encode numbers efficiently. Base64 can only be encoded as 8-bit values in QR codes, so each character carries just 6 bits of data and you add about a third right there.
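Here is a rough Python sketch of both suggestions, assuming the checksum is SHA-1 over the UUID string plus the secret as described in the question. The secret value and the function names are made up. The 144-bit value is at most 44 decimal digits, which QR numeric mode packs at 10 bits per 3 digits, against 24 Base64 characters at 8 bits each in byte mode.

```python
import base64
import hashlib
import uuid

SECRET = "hypothetical-long-secret-string"  # placeholder, not a real secret

def checksum16(u: uuid.UUID) -> bytes:
    """First two bytes of SHA-1(uuid string + secret), no XOR folding."""
    return hashlib.sha1((str(u) + SECRET).encode("utf-8")).digest()[:2]

def encode_decimal(u: uuid.UUID) -> str:
    """16 UUID bytes + 2 checksum bytes as one decimal number."""
    return str(int.from_bytes(u.bytes + checksum16(u), "big"))

def decode_decimal(text: str) -> uuid.UUID:
    """Inverse of encode_decimal; raises ValueError on a bad checksum."""
    raw = int(text).to_bytes(18, "big")  # restores any leading zero bytes
    u = uuid.UUID(bytes=raw[:16])
    if raw[16:] != checksum16(u):
        raise ValueError("checksum mismatch")
    return u

u = uuid.UUID("12345678-1234-5678-1234-567812345678")
as_decimal = encode_decimal(u)
as_base64 = base64.b64encode(u.bytes + checksum16(u)).decode("ascii")

print(len(as_decimal), len(as_base64))  # digits vs. Base64 characters
print(decode_decimal(as_decimal) == u)  # True
```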
