Concept behind converting Base16 to Base64 - base64

I understand how to read decimal, binary, hex and base64; that is, I can manually convert a number expressed in any one of those bases into any of the others.
I'm doing the Matasano crypto challenges and the very first assignment got me thinking (https://cryptopals.com/sets/1/challenges/1).
The approaches to this problem that I found convert the hex string to bytes (binary) and then the bytes to base64, which I understand. Or so I thought. Could I simply concatenate these bytes and say I have the binary-string expression of the same number?
I noticed they basically read the hex string two hex characters at a time (because two hex characters encode exactly one byte). This results in a binary string where each binary character (bit) is "aligned" with the hex character it came from.
Does this mean I can just convert this binary string to decimal and it will be the same "number" that the hex string represents?
Could a similar character-by-character scheme be used to convert to base64? How many hex characters are there per base64 character?
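One way to check the alignment described above is a small Python sketch (the helper name hex_to_bitstring is made up for illustration). It expands each hex character into four bits and compares the resulting binary string, read as a number, with the hex string read as a number:

```python
def hex_to_bitstring(hex_string: str) -> str:
    """Expand each hex character into its 4-bit binary form."""
    return "".join(format(int(ch, 16), "04b") for ch in hex_string)

hex_string = "4d616e"  # the ASCII bytes of "Man"
bits = hex_to_bitstring(hex_string)

print(bits)  # 010011010110000101101110
# Concatenating the bits yields the same number the hex string represents.
print(int(bits, 2) == int(hex_string, 16))  # True
# Going through bytes first gives the same value again.
print(int.from_bytes(bytes.fromhex(hex_string), "big") == int(bits, 2))  # True
```

Padding every hex character to exactly four bits (the "04b" format) is what keeps the bits aligned; dropping leading zeros would shift everything that follows.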

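@Flimzy shared this link, and it answered my question by making me realize two things:
base16 is an octet-based encoding: two hex characters (4 bits each) make up one 8-bit byte
base64 is a sextet-based encoding: each base64 character carries 6 bits
So three hex characters (12 bits) correspond to exactly two base64 characters.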
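The octet/sextet distinction can be sketched in Python. This is only an illustration: the function names are hypothetical, and the character-level version assumes the input is a whole number of 3-byte groups so that no "=" padding is needed. It shows the usual bytes route next to a direct route that maps 3 hex characters (12 bits) to 2 base64 characters:

```python
import base64

B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def hex_to_base64(hex_string: str) -> str:
    """Usual route: hex string -> raw bytes -> base64 text."""
    return base64.b64encode(bytes.fromhex(hex_string)).decode("ascii")

def hex_to_base64_by_triplets(hex_string: str) -> str:
    """Character-level route: 3 hex chars (12 bits) -> 2 base64 chars.

    Only handles lengths that are a multiple of 6 hex characters
    (whole 3-byte groups), so no '=' padding is needed.
    """
    if len(hex_string) % 6 != 0:
        raise ValueError("this sketch only handles whole 3-byte groups")
    out = []
    for i in range(0, len(hex_string), 3):
        twelve_bits = int(hex_string[i:i + 3], 16)
        out.append(B64_ALPHABET[twelve_bits >> 6])    # top 6 bits
        out.append(B64_ALPHABET[twelve_bits & 0x3F])  # bottom 6 bits
    return "".join(out)

print(hex_to_base64("4d616e"))              # TWFu
print(hex_to_base64_by_triplets("4d616e"))  # TWFu
```

Both routes agree because 12 bits is the smallest chunk that is a whole number of hex characters and a whole number of base64 characters.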

Related

SHA512 to UTF8 to byte encryption

a question:
A parcel service provider requests that the password be encoded in a specific way:
KEY -> UTF8 Encoding -> SHA512
The KEY should be in byte form, not a string.
Currently I have this in Node.js with CryptoJS:
password = CryptoJS.SHA512(CryptoJS.enc.Utf8.parse(key))
or
password = CryptoJS.SHA512(CryptoJS.enc.Utf8.stringify(key))
I don't know which one is the right one.
I need to convert the key to bytes; how do I do that?
Keys are arbitrary sequences of bytes, and SHA-512 works on arbitrary sequences of bytes. However, UTF-8 can't encode arbitrary sequences of bytes. It can only encode Unicode code points. What you're asking for isn't possible. (I suggest posting precisely what the requirement is. It's possible you're misreading it.)
You need another encoding, such as Base64 or Hex. The output of either of those is compatible with UTF-8 (they both output subsets of UTF-8).
That said, this is a very strange request, since you already have exactly the correct input for SHA-512. Converting it to a string and then converting that string back to (likely different) bytes seems a pointless step, but if you need it, you'll need a byte encoding like Base64 or Hex.
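For comparison, here is a rough Python sketch of the two readings of the requirement (Python rather than the asker's CryptoJS; the function names and the sample key are placeholders). The first assumes the key is ordinary text that is UTF-8 encoded into bytes before hashing; in CryptoJS terms that appears to correspond to hashing the result of CryptoJS.enc.Utf8.parse(key), which turns a string into bytes. The second assumes the key is arbitrary bytes that are first turned into Base64 text, as suggested above:

```python
import base64
import hashlib

def sha512_of_text_key(key: str) -> bytes:
    """KEY -> UTF-8 bytes -> SHA-512 digest (64 raw bytes)."""
    return hashlib.sha512(key.encode("utf-8")).digest()

def sha512_of_binary_key_via_base64(key_bytes: bytes) -> bytes:
    """Arbitrary bytes -> Base64 text (always valid UTF-8) -> SHA-512."""
    as_text = base64.b64encode(key_bytes).decode("ascii")
    return hashlib.sha512(as_text.encode("utf-8")).digest()

digest = sha512_of_text_key("my-secret-key")  # placeholder key
print(len(digest))   # 64
print(digest.hex())  # hex form, if the provider expects text
```

Which reading the provider actually intends is not clear from the question, so the exact wording of their specification should decide.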

How to get pieces value from .torrent file

I am trying to build a .torrent file interpreter. The problem is that I can't work out how to interpret the pieces value. I am aware that the pieces key contains a concatenation of the SHA-1 hashes of all pieces and that each SHA-1 hash is 20 bytes long. As a result, the length of the pieces value should be a multiple of 20 bytes. However, when I count the bytes of the pieces value, either as a string or in hexadecimal form, it is not a multiple of 20. How should I interpret the pieces key?
Here we use bencode and bdecode, and with them the pieces value is easy to get. I think you should first read the BEP for more details. You can also look at this and use it as an example.
From looking at a real torrent file, I found that the SHA-1 hashes had to be taken from the hexadecimal string format. I had previously thought this was wrong because the byte length of the hashes was not a multiple of 20. It turns out I forgot to add a leading 0 to hexadecimal values that were only 1 character long (e.g. a had to be changed to 0a).
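A minimal Python sketch of that interpretation, assuming the pieces value has already been bdecoded into a raw byte string (the split_pieces helper and the sample data are made up for illustration; no real torrent is parsed here):

```python
import hashlib
from typing import List

SHA1_LEN = 20  # bytes per SHA-1 digest

def split_pieces(pieces: bytes) -> List[str]:
    """Split the raw pieces value into 40-character hex SHA-1 hashes."""
    if len(pieces) % SHA1_LEN != 0:
        raise ValueError("pieces length is not a multiple of 20 bytes")
    return [pieces[i:i + SHA1_LEN].hex() for i in range(0, len(pieces), SHA1_LEN)]

# Stand-in for a decoded pieces value: two concatenated SHA-1 digests.
pieces = hashlib.sha1(b"piece one").digest() + hashlib.sha1(b"piece two").digest()

for piece_hash in split_pieces(pieces):
    print(piece_hash)  # 40 hex characters each

# The padding pitfall: byte 0x0a must be written as "0a", not "a".
print(format(0x0A, "x"), format(0x0A, "02x"))  # a 0a
```

Counting bytes on the raw value, before any conversion to text or hex, avoids the length mismatch entirely.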

How to know encoding of this file?

I thought this was base64 encoding, so I tried to decode it that way, but it seems this is not base64. I want to decode this.
O7hrHYO5UUFHFPVILQPc6A==:hEnb3PVrxgHbEL1VT+cu8ic4ocIOfoaWkJ2b2MCrVy4=:jXB0R2OctZ6i1K3s2DlLNS5D/PSdhzKM7GX7gVh6AvXbWrA5i/4j3maFlgk1X2BpmOXYoZab2hAJS4lCBtWi6WnE3zDLhBvWJWFyAN93fIvS66PXJiINmaEhKi8mBIjc
I am learning about reverse engineering and I got this file. It is a simple quiz app (Android). In its database file the questions are stored as encoded strings like the one above. I put the first one here; there are many more questions like this.
The colon character : cannot appear in base64 output, and also = can only appear at the end of base64 output, so this string seems to be composed of 3 parts, each individually encoded in base64:
O7hrHYO5UUFHFPVILQPc6A==
hEnb3PVrxgHbEL1VT+cu8ic4ocIOfoaWkJ2b2MCrVy4=
jXB0R2OctZ6i1K3s2DlLNS5D/PSdhzKM7GX7gVh6AvXbWrA5i/4j3maFlgk1X2BpmOXYoZab2hAJS4lCBtWi6WnE3zDLhBvWJWFyAN93fIvS66PXJiINmaEhKi8mBIjc
After base64 decoding, these don't yield anything meaningful, so my guess is that some encryption scheme has been applied. The decoded lengths (16, 32 and 96 bytes) are all multiples of 16 bytes, which hints at a block cipher with blocks of 16 bytes (128 bits).
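The length check can be reproduced with a few lines of Python. This is just a sketch of the reasoning above, using the string from the question; it says nothing about which cipher or key was used:

```python
import base64

encoded = (
    "O7hrHYO5UUFHFPVILQPc6A==:"
    "hEnb3PVrxgHbEL1VT+cu8ic4ocIOfoaWkJ2b2MCrVy4=:"
    "jXB0R2OctZ6i1K3s2DlLNS5D/PSdhzKM7GX7gVh6AvXbWrA5i/4j3maFlgk1X2Bp"
    "mOXYoZab2hAJS4lCBtWi6WnE3zDLhBvWJWFyAN93fIvS66PXJiINmaEhKi8mBIjc"
)

# Split on ':' and decode each part on its own.
parts = [base64.b64decode(part) for part in encoded.split(":")]
lengths = [len(part) for part in parts]

print(lengths)                            # [16, 32, 96]
print(all(n % 16 == 0 for n in lengths))  # True
```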

Can ALL strings be decoded as valid binary data?

As is well known, Base64 encodes binary data into transferable ASCII strings, and we decode these strings back into data.
Now my question is the inverse: can every random string be decoded as binary data and then correctly encoded back to the exact original string?
It depends on your encoding method. Some methods use only a limited range of characters, so a string containing other characters would not be legal. This is the case for Base64, so the answer is no. With other methods I'm sure it's possible, but I cannot think of an example other than simply treating the string as binary bytes.
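To make the Base64 case concrete, here is a small Python sketch (the function name is hypothetical). It tries to decode a string strictly and then checks whether re-encoding gives back exactly the same text:

```python
import base64

def survives_base64_round_trip(text: str) -> bool:
    """True if text decodes as Base64 and re-encodes to the same string."""
    try:
        raw = base64.b64decode(text, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return base64.b64encode(raw).decode("ascii") == text

print(survives_base64_round_trip("TWFu"))          # True
print(survives_base64_round_trip("hello world!"))  # False: space and '!' are not in the alphabet
print(survives_base64_round_trip("TWF="))          # False: the canonical form is "TWE="
```

The last case shows that even a string made only of legal characters can fail the round trip, because the unused bits before the padding are not preserved.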

Compress bytes into a readable string (no null or endofline)

I'm searching for the most appropriate encoding or method to compress bytes into characters that can be read with a ReadLine-like command, which only recognizes readable characters and terminates on an end-of-line character. There is probably a common practice for this, but I don't know a lot about encoding.
Currently, I'm outputting bytes as a string of hex, so I need 2 characters to represent 1 byte. It works well, but it is slow. Ex: a byte with the value 255 is represented as 'FF'.
I'm sure it could be 3 or 4 times smaller, though there's a limit since I'm outputting MP3 data, but I don't know how. Should I just ZIP my string, or would there be too much overhead?
Will ASCII85 output contain random null bytes and end-of-line characters, or am I safe with it?
Don't zip MP3 files; that will not gain much (or anything at all), because MP3 data is already compressed.
I'm a bit disappointed that you did not read up on Ascii85 before asking, as I think the Wikipedia article explains fairly clearly that it uses only printable ASCII characters; so, no line endings or null bytes. It is efficient, and the conversion is also fairly simple and quick: split your data into 4-byte integers, then convert each one to just five Ascii85 digits by repeatedly dividing the integer value by 85 and taking the ASCII character whose code is the remainder + 33. The remainders come out least significant digit first, so the five characters are written in reverse order.
You can also consider using Base64 or UUEncode. These are fairly popular (e.g. used in email attachments), so you will find many libraries that support them. But they are less efficient: they need about 4 characters for every 3 bytes, versus 5 characters for every 4 bytes with Ascii85.
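The divide-by-85 step can be sketched in Python as follows. The ascii85_group helper is written only for illustration and handles a single full 4-byte group; it skips the short final group and the "z" shortcut for all-zero groups that real encoders such as Python's base64.a85encode apply:

```python
import base64

def ascii85_group(four_bytes: bytes) -> str:
    """Encode one 4-byte group as five Ascii85 characters (no 'z' shortcut)."""
    if len(four_bytes) != 4:
        raise ValueError("expected exactly 4 bytes")
    value = int.from_bytes(four_bytes, "big")
    digits = []
    for _ in range(5):
        value, remainder = divmod(value, 85)
        digits.append(chr(remainder + 33))  # printable range '!' .. 'u'
    return "".join(reversed(digits))        # most significant digit first

print(ascii85_group(b"Man "))              # 9jqo^
print(base64.a85encode(b"Man ").decode())  # 9jqo^

# Size comparison for 12 sample bytes: hex, Base64, Ascii85.
data = bytes(range(1, 13))
print(len(data.hex()), len(base64.b64encode(data)), len(base64.a85encode(data)))  # 24 16 15
```

So compared with hex, Ascii85 output is a bit over a third smaller rather than 3 or 4 times smaller; any text-safe encoding of already compressed data will stay larger than the raw bytes.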
