Do the blocks in DEFLATE always start from a byte boundary?

The DEFLATE specification (RFC 1951) does explain how the blocks are structured, but because of my limited English, I can't find how consecutive blocks are joined.
When an uncompressed block (BTYPE 00 in RFC 1951) appears and it is not the final block, it's obvious that the next block starts at a byte boundary, since an uncompressed block always ends at a byte boundary.
However, for the other (compressed) block types, there is no guarantee that the end-of-block symbol (256) lands on a byte boundary. In that case, should we pad with zeros up to the byte boundary, or can the next block follow immediately after the end-of-block symbol, regardless of the byte boundary?

The next block starts at the next bit, regardless of the byte boundary.
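To illustrate (a minimal sketch I am adding here, not part of the original answer): a DEFLATE decoder reads the stream through a bit reader that never realigns between blocks, so the 3-bit header (BFINAL + BTYPE) of the next block comes from the very next bit after the end-of-block code. In Java, such a reader might look like this:

    class BitReader {
        private final byte[] data;
        private int bitPos; // absolute bit position in the stream

        BitReader(byte[] data) { this.data = data; }

        // Read n bits, least significant bit first, per RFC 1951 section 3.1.1.
        int readBits(int n) {
            int result = 0;
            for (int i = 0; i < n; i++) {
                result |= ((data[bitPos >> 3] >> (bitPos & 7)) & 1) << i;
                bitPos++;
            }
            return result;
        }

        // Only stored blocks (BTYPE 00) call this: RFC 1951 says to skip any
        // remaining bits of the current byte before the LEN/NLEN fields.
        void alignToByte() { bitPos = (bitPos + 7) & ~7; }
    }

The only place alignToByte() belongs is at the start of a stored block's LEN field; between compressed blocks nothing is padded or skipped.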


I have attempted to supply hand written shellcode, but it is being read as a string and not as bytes, what next?

How do I get "\x90" to be read as the byte value corresponding to the x86 NOP instruction when it is supplied as a field within the standard argument list in Linux? I have a buffer being stuffed all the way to 10, and then the next 8 bytes being overwritten with the new return address, at least that is the intent. Because the byte sequence being supplied is read as characters rather than as a byte sequence, I do not know how to fix this. What next?

JPEG SOS specification

I am parsing a JPG in java byte by byte. I am then writing the same image byte by byte, and I have come across an oddity. I have tried looking at the spec but I see no reference.
At the end of the SOS section there are three bytes that most sources say to 'skip'. But if I write 0x00,0x00,0x00 then java(fx) complains about an invalid value. If I write 0x00,0x3f,0x00 then there is no complaint. (The three-byte sequence is what was produced by GIMP in the original file.)
I came across an indirect reference to this in the GoLang repo
// - the bytes "\x00\x3f\x00". Section B.2.3 of the spec says that for
// sequential DCTs, those bytes (8-bit Ss, 8-bit Se, 4-bit Ah, 4-bit Al)
// should be 0x00, 0x3f, 0x00<<4 | 0x00.
My question is should I just write 0x3f at this position, or does the value depend upon something else?
In a sequential JPEG scan this value has no meaning. The standard says to set it to 63, but that tells the decoder nothing, since a sequential scan has to process all 64 DCT coefficients anyway.
In a progressive scan this value means A LOT.
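As a sketch (my illustration, assuming a baseline sequential scan; not the asker's code), the three bytes that end the SOS header would be written like this in Java:

    import java.io.IOException;
    import java.io.OutputStream;

    // Write the three bytes that end a baseline sequential SOS header, per
    // section B.2.3 of the JPEG spec: Ss (spectral selection start) = 0,
    // Se (spectral selection end) = 63, and Ah/Al (successive approximation)
    // packed as two nibbles into one byte.
    static void writeSosTrailer(OutputStream out) throws IOException {
        out.write(0x00);             // Ss: first DCT coefficient index
        out.write(0x3f);             // Se: last DCT coefficient index (63)
        out.write((0x0 << 4) | 0x0); // Ah in the high nibble, Al in the low nibble
    }

So for a sequential scan the middle byte is always 0x3f; only progressive scans vary these fields from scan to scan.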

Understanding the spec of the ogg header format

For writing my own ogg container class (not using libogg), I am trying to understand the required header format. According to the spec, at byte 27 of the stream (counting from 0) starts the "segment_table (containing packet lacing values)". This is the red-marked byte 13. Concerning the Opus data that I want to include, the Opus data must start with OpusHead (4F 70 75 73) at its beginning. Why doesn't it start at position 27, where the red 13 is placed? A 13 is a "device control 3" symbol, which occurs in neither the Ogg spec nor the Opus spec.
EDIT: I found this link that describes the spec a little. There it becomes clear (which it is not from the first link, imho) that the 13 (byte 27) is the size of the following segment.
That is a single byte holding a lacing value, i.e. the length of the following segment. So there are 0x13 (19 decimal) bytes of segment data.
RFC 3533 is a more verbose description of the format header.
Byte 26 says how many bytes the segment table occupies, so you read that, add 27, and that tells you where the first packet starts (or continues).
The segment table tells you the length(s) of the encapsulated packet(s). Basically you read through the table, adding together the values in each successive byte. If the value you just added is < 255 then that marks a packet boundary, so record the current value of the accumulator, reset it to zero, then continue until you reach the end of the table.
In your example, the segment table size in byte 26 is 1, so the data starts at 27+1 or byte 28, which is the start of the 'OpusHead' string. The value in the 1 byte segment table is 0x13, so the packet is 19 bytes long. 28+19 is 47 (or 0x2f) which is the start of the 'OggS' capture pattern at the start of the next header.
This slightly complicated algorithm is designed to store framing data for many small packets with bounded overhead while still allowing arbitrarily large packets. Note also that packets can be continued between pages, spanning 2 or more segment tables.
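As an illustration (my sketch following RFC 3533, not code from the original answer), recovering packet lengths from one page's segment table looks roughly like this in Java, where page holds the raw page starting at the 'OggS' capture pattern:

    import java.util.ArrayList;
    import java.util.List;

    static List<Integer> packetLengths(byte[] page) {
        int numSegments = page[26] & 0xFF;    // byte 26: size of the segment table
        List<Integer> lengths = new ArrayList<>();
        int acc = 0;                          // accumulator for the current packet
        for (int i = 0; i < numSegments; i++) {
            int lacing = page[27 + i] & 0xFF; // lacing values start at byte 27
            acc += lacing;
            if (lacing < 255) {               // a value < 255 ends a packet
                lengths.add(acc);
                acc = 0;
            }
        }
        // If acc is non-zero here, the last lacing value was 255 and the packet
        // continues on the next page; real code must carry acc over.
        return lengths;
    }

The packet data itself then begins at byte 27 + numSegments, immediately after the segment table.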

Dealing with Padding / Stuff Bits in Entropy-Coded JPEG

When decoding entropy-coded DC values in JPEG (or the entropy-coded prediction differences in lossless JPEG), how do I distinguish between 1-bits that have been stuffed to pad out a byte before a marker and a Huffman-coded value?
For example if I see:
0xAF 0xFF 0xD9
and I have already consumed the bits in [0xA], how can I tell if the next 0xF is padded or should be decoded?
This is from the JPEG Spec:
F.1.2.3 Byte stuffing
In order to provide code space for marker codes which can be located in the compressed image data without decoding, byte stuffing is used.
Whenever, in the course of normal encoding, the byte value X’FF’ is created in the code string, a X’00’ byte is stuffed into the code string. If a X’00’ byte is detected after a X’FF’ byte, the decoder must discard it. If the byte is not zero, a marker has been detected, and shall be interpreted to the extent needed to complete the decoding of the scan.
Byte alignment of markers is achieved by padding incomplete bytes with 1-bits. If padding with 1-bits creates a X’FF’ value, a zero byte is stuffed before adding the marker.
There are only two possibilities for an FF value in the compressed data stream:
1. a restart marker; or
2. FF 00, representing a literal FF.
If you are decoding a stream, you will know from the restart interval when to expect a restart marker. When you hit the point in decoding where you should find a restart marker, you discard the remaining bits in the current byte.
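As a sketch (my code, not the answerer's), a reader for the entropy-coded segment that honors byte stuffing can look like this in Java; when it sees FF followed by a non-zero byte it reports a marker, at which point the bit-level decoder discards whatever padding bits are left in the current byte:

    import java.io.IOException;
    import java.io.PushbackInputStream;

    // Return the next entropy-coded data byte, or -1 when a marker begins.
    // The stream must be wrapped as new PushbackInputStream(in, 2) so both
    // marker bytes can be pushed back for the caller to interpret.
    static int nextScanByte(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b != 0xFF) return b;        // ordinary data byte, or -1 at end of stream
        int next = in.read();
        if (next == 0x00) return 0xFF;  // stuffed zero: FF 00 encodes a literal FF
        in.unread(next);                // non-zero after FF: a marker (RSTn, EOI, ...)
        in.unread(0xFF);
        return -1;
    }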

Preon encode() does not fill up remaining bits until the byte boundary is reached

I have a message in which a variable-length run of 7-bit characters is encoded. Unfortunately those 7-bit characters are stored in the message as raw 7-bit values, which means the last byte of the message is not necessarily aligned to a byte boundary.
Decoding a message with Preon works fine, but when encoding the previously decoded message with Preon and comparing the byte arrays, the arrays do not match in length.
The encoded byte array is one byte smaller than the original one.
I debugged Preon because I assumed a bug, but it works as designed. Before a byte boundary is reached, Preon holds the remaining bits and waits for the next write() call to the BitChannel. But for the last byte there is no further call, so those bits are never written.
The question is: is there a way to tell Preon to flush the remaining buffer?
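To illustrate the underlying problem (a generic sketch of a bit writer, not Preon's actual BitChannel API): without an explicit flush, bits buffered after the last write are simply dropped, which produces exactly the one-byte-short result described above.

    import java.io.IOException;
    import java.io.OutputStream;

    class BitWriter {
        private final OutputStream out;
        private int buffer; // bits accumulated so far, MSB first
        private int count;  // number of valid bits in buffer

        BitWriter(OutputStream out) { this.out = out; }

        // Append the low n bits of value, emitting whole bytes as they fill up.
        void writeBits(int value, int n) throws IOException {
            for (int i = n - 1; i >= 0; i--) {
                buffer = (buffer << 1) | ((value >> i) & 1);
                if (++count == 8) {
                    out.write(buffer);
                    buffer = 0;
                    count = 0;
                }
            }
        }

        // Pad the incomplete final byte with zero bits and emit it.
        void flush() throws IOException {
            if (count > 0) {
                out.write(buffer << (8 - count));
                buffer = 0;
                count = 0;
            }
        }
    }

Whether Preon exposes an equivalent of flush() on its BitChannel is exactly the open question here.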
