Problems with SHA 2 Hashing and Java - security

I am working on following the SHA-2 cryptographically functions as stated in https://en.wikipedia.org/wiki/SHA-2.
I am examining the lines that say:
begin with the original message of length L bits append a single '1' bit;
append K '0' bits, where K is the minimum number >= 0 such that L + 1 + K + 64 is a multiple of 512
append L as a 64-bit big-endian integer, making the total post-processed length a multiple of 512 bits.
I do not understand the last two lines. If my string is short can its length after adding K '0' bits be 512. How should I implement this in Java code?

First of all, it should be made clear that the "string" that is talked about is not a Java String but a bit string. These algorithms are binary/bit based. The implementation will generally not handle bits but bytes. So there is a translation phase where you should see bytes instead of bits.
SHA-512 is operated on in blocks of 512 bits (SHA-224/256) or 1024 bits (SHA-384/512). So basically you have a 64 or 128 byte buffer that you are filling before operating on it. You could also directly cache the data in 32 bit int fields (SHA-224/256) or 64 bit long fields, as that is the word size that is operated on.
Now the padding is relatively simple procedure. The padding is called bit-padding. As it is used in big-endian mode (SHA-2 fortunately uses this instead of the braindead little endian mode in SHA-3) the padding consists of a single bit set on the highest order bit in a byte, with the rest filled by zero's. That makes for a value of (byte) 0x80 which must be put in the buffer.
If you cannot create this padding because the buffer is full then you will have to process the previous block, and then set the first bit of the now available buffer to (byte) 0x80. In the newer Java you can also use (byte) 0b1_0000000 byte the way, which is more explicit.
Now you simply add zero's until you have 8 to 16 bytes left, again depending on the hash output size used. If there aren't enough bytes then fill till the end, process the block, and re-start filling with zero bytes until you have 8 or 16 bytes left again.
Now finally you have to encode the number of bits in those 8 or 16 bytes you've left. So multiply your input by eight, and make sure you encode those bytes in the same way as you'd expect in Java with the least significant bits as much to the right as possible. You might want to use https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html#putLong-long- for this if you don't want to program it yourself. You may probably forget about anything over 2^56 bytes anyway, so if you have SHA-384/SHA-512 then simply set the first eight bytes to zero.
And that's it, except that you still need to process that last block and then use as many bytes from the left as required for your particular output size.

Related

RGB565 color space bits order

I am confused on RGB565 and BGR565 color space in bits order. e.g, the hex value 0xF81E, I don't know at RGB565, R is high 5 bits in (0xF81E & 0xF800), or R is low 5 bits (0xF81E & 0x001F)
If you have a 16-bit value, then in RGB565 the R component will be the most significant 5 bits (ie high bits), whereas in BGR565 it'll be the least-significant 5 bits (low bits).
That said, if you are reading such 16-bit values from serialized bytes (eg a raw dump to file), then also consider the byte order of the serialization. For example, if the serialization isn't big-endian (network byte order), then 0xF81E appearing in adjacent raw bytes might indicate RGB565 value 0x1EF8.

How to determine the addressable memory capacity in bytes knowing the bits for the operand address?

If I have say a 64-bit instruction, which has 2 bytes (16 bits) for opcode and the rest for operand address, I can determine that I have 48bits for the address (64-16). The maximum value that can be displayed with 48 bits plus 1 to account for address 0 is my go to number. This would be 2^48. However, I have the problem with the understanding of this in terms of the iB units.
2^48 is 2^40 (TiB) x 2^8 = 256TiB. But since TiB = 2^40 BYTES, when did the 2^48 become a BYTE? I generally believed that to get the number of bytes I'd have to divide by 8, but this doesn't seem to be the case.
Could someone explain why this works?
A byte is by definition the smallest chunk of memory which has an address. Whatever number of address bits, the resulting address is the address of a byte, by definition. In all (or at least, most) computer architectures existing today, a byte is the same as an octet, that is, eight bits; but historically there were popular computer architectures with 6 bit bytes, or 12 bit bytes, or even other more exotic number of bits per byte.

Quickjob MOVZON X'FF' to OFA1

what does MOVZON X'FF' do in quickjob. I believe it just moves input to output. Please let me know, if I am wrong.
The smallest unit of information is the bit. Processors usually don‘t work on single bits when accessing memory; they work on bytes. A byte consists of 8 consecutive bits (for most architectures).
To describe how different processor instructions work with bytes, bytes are sometimes subdivided into two 4-bit groups, called nibbles. Counting left to right, bits 0-3 are called „left nibble“, „high order nibble“, or „zone nibble“. Bits 4-7, the right half, are called „right nibble“, „low order nibble“, or „number nibble“.
There are instructions that work on the whole byte, e.g. MOVE. And there are instructions that work on nibbles. MOVEZONE (MOVZON) works on zone nibbles and leaves the number nibbles alone; MOVENUM (MOVNUM) works on number nibbles, and leaves the zone nibbles alone.
This kind of instructions are usually used with bytes that contain numeric values, coded as either zoned decimal, or packed decimal. They are rather exotic when working on text data.
This reference is used.
Given the instruction:
MOVZON X'FF' to OFA1
The receiving field OFA1 refers to the first record position (the 1) of the output file ( the OF) designated as A. The instruction will set the high-order bits (0-3 or "zone bits") of the first position to ones, matching bits 0-3 of the X'FF'.
However, it appears, as a matter of style, the instruction should have been written as MOVZON X'F0' TO OAF1 since the low-order bits (4-7) are not used.

How to determine the byte length of a long

Is there a fast way to determine the number of bytes used in a long? I'm looking for something like this:
len((1000**1000).to_bytes())
(The problem of course is that to_bytes wants the number of bytes as input.)
(x.bit_length() + 7) // 8 will do what you want. Number of bits, converted to bytes and rounded up.

bitshift large strings for encoding QR Codes

As an example, suppose a QR Code data stream contains 55 data words (each one byte in length) and 15 error correction words (again one byte). The data stream begins with a 12 bit header and ends with four 0 bits. So, 12 + 4 bits of header/footer and 15 bytes of error correction, leaves me 53 bytes to hold 53 alphanumeric characters. The 53 bytes of data and 15 bytes of ec are supplied in a string of length 68 (str68). The problem seems simple enough - concatenate 2 bytes of (right-shifted) header data with str68 and then left shift the entire 70 bytes by 4 bits.
This is the first time in many years of programming that I have ever needed to do something like this, I am a c and bit shifting noob, so please be gentle... I have done a little investigation and so far have not been able to figure out how to bitshift 70 bytes of data; any help would be greatly appreciated.
Larger QR codes can hold 2000 bytes of data...
You need to look at this 4 bits at a time.
The first 4 bits you need to worry about are the lower bits of the first byte. Fortunately this is an easy case because they need to end up in the upper bits of the first byte.
The next 4 bits you need to worry about are the upper bits of the second byte. These need to end up as the lower bits of the first byte.
The next 4 bits you need to worry about are the lower bits of the second byte. But fortunately you already know how to do this because you already did it for the first byte.
You continue in this vein until you have dealt with the lower bytes of the 70th byte.

Resources