ITU T.87 JPEG LS Standard and sample .jls SOS encoded streams have no escape sequence 0xFF 0x00 - jpeg

ITU T.81 states the following:
B.1.1.2 Markers
Markers serve to identify the various structural
parts of the compressed data formats. Most markers start marker
segments containing a related group of parameters; some markers stand
alone. All markers are assigned two-byte codes: an X’FF’ byte followed
by a byte which is not equal to 0 or X’FF’ (see Table B.1). Any marker
may optionally be preceded by any number of fill bytes, which are
bytes assigned code X’FF’. NOTE – Because of this special
code-assignment structure, markers make it possible for a decoder to
parse the compressed data and locate its various parts without having
to decode other segments of image data.
B.1.1.5 Entropy-coded data segments
An entropy-coded data segment
contains the output of an entropy-coding procedure. It consists of an
integer number of bytes, whether the entropy-coding procedure used is
Huffman or arithmetic.
NOTES
(1) Making entropy-coded segments an
integer number of bytes is performed as follows: for Huffman coding,
1-bits are used, if necessary, to pad the end of the compressed data
to complete the final byte of a segment. For arithmetic coding, byte
alignment is performed in the procedure which terminates the
entropy-coded segment (see D.1.8).
(2) In order to ensure that a marker
does not occur within an entropy-coded segment, any X’FF’ byte
generated by either a Huffman or arithmetic encoder, or an X’FF’ byte
that was generated by the padding of 1-bits described in NOTE 1 above,
is followed by a “stuffed” zero byte (see D.1.6 and F.1.2.3).
The same rule appears in many other places in T.81, wherever the well-known Stuff_0() procedure is referenced.
I am not sure where ITU T.87 stands with regard to the 0xFF 0x00 escape sequence specified by ITU T.81:
Either ITU T.87 itself does not specify the escape sequence but expects it;
Or the standard test samples are malformed, since their encoded streams clearly contain no 0xFF 0x00 escape sequence. For example, 0xFF 0x7F, 0xFF 0x2F and other sequences can be found in the encoded streams of the .jls test samples, namely "T8C0E3.JLS". And no one has noticed in all these years;
Or ITU T.87 actually overrides ITU T.81 on this rule for encoded streams and does not allow the escape sequence to be encoded.
In the decoder we could add logic for the case where 0xFF is followed by a byte other than 0x00: use that byte rather than skip it if the component is not yet fully decoded. But if a .jls file has no escape sequences and we encounter a 0xFF 0x00 sequence, should we skip the 0x00 byte or not?
I would like some clarification on ITU T.87 JPEG-LS encoding and on the correct procedure. Should we, or should we not, encode the escape sequence 0xFF 0x00 in encoded streams?

The answer:
ITU T.87 - ANNEX A - point A1 Coding parameters and compressed image data - pass 3
Marker segments are inserted in the data stream as specified in Annex
D. In order to provide for easy detection of marker segments, a single
byte with the value X'FF' in a coded image data segment shall be
followed with the insertion of a single bit '0'. This inserted bit
shall occupy the most significant bit of the next byte. If the X'FF'
byte is followed by a single bit '1', then the decoder shall treat the
byte which follows as the second byte of a marker, and process it in
accordance with Annex C. If a '0' bit was inserted by the encoder, the
decoder shall discard the inserted bit, which does not form part of
the data stream to be decoded.
NOTE 2 – This marker segment detection
procedure differs from the one specified in CCITT Rec. T.81 | ISO/IEC
10918-1.
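So JPEG-LS (T.87) overrides the T.81 rule for the encoded data stream: instead of stuffing a 0x00 byte, the encoder stuffs a single 0 bit, so a 0xFF data byte is always followed by a byte with a value between 0x00 and 0x7F (inclusive). A 0xFF followed by a byte of 0x80 or above is a marker.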
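To make the rule concrete, here is a minimal Python sketch of the decoder side as I read the quoted paragraph. The function name jpegls_unstuff_bits and the choice to return a plain list of bits are my own for illustration, not anything from T.87, and real decoders would feed the bits straight into the Golomb decoder instead.

```python
def jpegls_unstuff_bits(segment: bytes):
    """Return (bits, marker) for a JPEG-LS coded data segment.

    bits   -- payload bits (0/1), with the stuffed '0' bits removed
    marker -- second byte of the marker that ended the segment, or None
    """
    bits = []
    n = len(segment)
    for i in range(n):
        byte = segment[i]
        # 0xFF followed by a byte whose top bit is 1 starts a marker.
        if byte == 0xFF and i + 1 < n and segment[i + 1] & 0x80:
            return bits, segment[i + 1]
        # The byte after a 0xFF data byte carries only 7 payload bits:
        # its most significant bit is the stuffed '0' and is discarded.
        width = 7 if i > 0 and segment[i - 1] == 0xFF else 8
        for shift in range(width - 1, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits, None


# 0xFF 0x7F is one of the pairs seen in the test sample; 0xFF 0xD9 is EOI.
bits, marker = jpegls_unstuff_bits(bytes([0xFF, 0x7F, 0xFF, 0xD9]))
print(len(bits), hex(marker))
```

Here 0xFF 0x7F yields 8 + 7 payload bits, not 16, which is why such pairs are legal in a .jls stream while 0xFF 0x00 has no special meaning beyond "0xFF followed by seven 0 bits".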

Related

How the Diagnostic Trouble Code(DTC) data is defined in the ECU?

When a diagnostic tool is connected to the server, it reads the DTCs.
I want to know how the DTC data is defined and stored in the ECU.
DTCs are usually defined as 2-byte or 3-byte values.
A common representation following ISO 15031-6/SAE J2012 is a five-character alphanumeric code (e.g. P0001) with the optional low byte appended as a hexadecimal value (e.g. P0001-00). The first letter is one of: P for Powertrain (00b in the two highest bits of the high byte), C for Chassis (01b), B for Body (10b) or U for Network-related DTCs (11b). For example:
P0001 (Fuel Volume Regulator Control Circuit/Open) would be represented as bytes: 0x00 0x01
P0A01 (Range/Performance) would be represented as bytes: 0x0A 0x01
C0001 (TCS Control Channel A Valve 1) would be represented as bytes: 0x40 0x01
The DTCs are stored as their respective byte representation in the non-volatile memory (NvM) of the ECU, so that they can be retrieved even after the ECU has been power cycled. Additional information is stored along with each DTC, e.g. freeze frame/environmental data, the DTC status mask (pendingDTC/confirmedDTC/...), counters (aging/debouncing), time of first occurrence, etc.
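The mapping between the five-character code and the two bytes can be sketched in Python as follows. The helper names dtc_to_bytes and bytes_to_dtc are made up for this example, and it only covers the 2-byte form described above (no failure-type low byte, no status mask).

```python
_SYSTEMS = "PCBU"  # index = value of the two highest bits: P=00b, C=01b, B=10b, U=11b


def dtc_to_bytes(code: str) -> bytes:
    """Convert a code such as 'P0A01' to its 2-byte representation."""
    code = code.upper()
    if len(code) != 5 or code[0] not in _SYSTEMS:
        raise ValueError(f"not a five-character DTC: {code!r}")
    system = _SYSTEMS.index(code[0])   # 2 bits
    second = int(code[1], 16)          # 2 bits, so only 0..3 is valid
    if second > 3:
        raise ValueError(f"second character must be 0..3: {code!r}")
    rest = int(code[2:], 16)           # three hex digits, 12 bits
    value = (system << 14) | (second << 12) | rest
    return value.to_bytes(2, "big")


def bytes_to_dtc(raw: bytes) -> str:
    """Convert the first two bytes of raw back to the five-character code."""
    value = int.from_bytes(raw[:2], "big")
    return f"{_SYSTEMS[value >> 14]}{(value >> 12) & 0x3:X}{value & 0x0FFF:03X}"


for code in ("P0001", "P0A01", "C0001"):
    print(code, dtc_to_bytes(code).hex())
```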

JPEG SOS specification

I am parsing a JPG in Java byte by byte. I am then writing the same image back byte by byte, and I have come across an oddity. I have tried looking at the spec but I see no reference to it.
At the end of the SOS section there are three bytes that most sources say to 'skip'. But if I write 0x00, 0x00, 0x00 then Java(FX) complains about an invalid value. If I write 0x00, 0x3f, 0x00 then there is no complaint. (That three-byte sequence is what GIMP produced in the original file.)
I came across an indirect reference to this in the GoLang repo
// - the bytes "\x00\x3f\x00". Section B.2.3 of the spec says that for
// sequential DCTs, those bytes (8-bit Ss, 8-bit Se, 4-bit Ah, 4-bit Al)
// should be 0x00, 0x3f, 0x00<<4 | 0x00.
My question is: should I just write 0x3f at this position, or does the value depend upon something else?
In a sequential JPEG scan this value has no meaning. The standard says to set it to 63, but that tells the decoder nothing: you have to process all 64 DCT coefficients in a sequential scan anyway.
In a progressive scan these bytes mean A LOT: Ss and Se select which band of coefficients the scan carries, and Ah/Al control successive approximation.
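For reference, the three bytes split into four fields as described in the Go comment quoted in the question. A small Python sketch (the function names are mine, purely illustrative):

```python
def parse_sos_tail(tail: bytes) -> dict:
    """Split the last three bytes of an SOS header into Ss, Se, Ah, Al."""
    if len(tail) != 3:
        raise ValueError("expected exactly three bytes")
    ss, se, ah_al = tail
    return {"Ss": ss, "Se": se, "Ah": ah_al >> 4, "Al": ah_al & 0x0F}


def sequential_sos_tail() -> bytes:
    """The values quoted for sequential DCT scans: Ss=0, Se=63, Ah=Al=0."""
    return bytes([0x00, 0x3F, (0x00 << 4) | 0x00])


print(parse_sos_tail(sequential_sos_tail()))
```

So when rewriting a sequential file, writing 0x00, 0x3f, 0x00 is the safe choice; for a progressive file the bytes have to be copied from each scan as they are.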

problems sending bytes greater 0x7F python3 serial port

I'm working with Python 3 and cannot find an answer to my little problem.
My problem is sending a byte greater than 0x7F over the serial port with my Raspberry Pi.
example:
import serial
ser=serial.Serial("/dev/ttyAMA0")
a=0x7F
ser.write(bytes(chr(a), 'UTF-8'))
This works fine! The receiver gets 0x7F.
If a equals 0x80:
a=0x80
ser.write(bytes(chr(a), 'UTF-8'))
the receiver gets two bytes: 0xC2 0x80
If I change the encoding to UTF-16, the receiver reads
0xFF 0xFE 0x80 0x00
The receiver should get only 0x80!
What's wrong? Thanks for your answers.
The UTF-8 specification says that characters encoded as a single byte/octet start with a 0 bit. Because 0x80 is "10000000" in binary, it cannot be a single-byte character and is encoded as two bytes/octets, C2 80 ("11000010 10000000"). 0x7F is 01111111, so a reader knows it is only 1 byte/octet long.
UTF-16 represents characters as 2-byte/octet units and starts with a Byte Order Mark, which essentially tells the reader which octet is the most significant one (the endianness).
Check the UTF-8 specification for full details, but essentially you are moving from the end of the 1-byte range to the start of the 2-byte range.
I don't understand why you want to send your own custom 1-byte words, but what you are really looking for is an SBCS (Single Byte Character Set) which has a character for the bytes you specify. UTF-8/UTF-16 are MBCS, which means that when you encode a character, you may get more than a single byte.
Before UTF-? came along, everything was SBCS, which meant that any code page you selected was coded using 8 bits. The problem arose when 256 characters were not enough, and they had to make code pages like IBM273 (IBM EBCDIC Germany) and ISO-8859-1 (ANSI Latin 1; Western European) to interpret what "0x2C" meant. Both the sender and the receiver needed to set the same code page identifier, or they wouldn't understand each other. There is further confusion because these SBCS code pages don't always use the full 256 characters, so "0x7F" may not even exist or have a meaning.
What you could do is encode with something like code page 737/IBM 00737 and send the "Α" (Greek Alpha) character, which should be encoded as 0x80.
If that doesn't work, you can skip text encoding altogether: pyserial's write() takes a bytes object, so bytes([a]) sends the raw byte. Alternatively, ISO-8859-1 maps code points 0 to 255 onto the same byte values:
a=0x80
ser.write(bytes(chr(a), 'ISO-8859-1'))
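To check what each approach actually puts on the wire without needing the serial port, the encodings can be compared directly. A small sketch; the helper name wire_bytes is made up for illustration, and the ser.write line is left commented out because it needs real hardware.

```python
def wire_bytes(value: int) -> dict:
    """Return the bytes each approach would hand to ser.write() for a value 0..255."""
    ch = chr(value)
    return {
        "utf-8": ch.encode("utf-8"),            # multi-byte above 0x7F
        "iso-8859-1": ch.encode("iso-8859-1"),  # always one byte for 0..255
        "raw": bytes([value]),                  # no text encoding involved
    }


for name, data in wire_bytes(0x80).items():
    print(name, data.hex())

# With a port open, the raw form can be written as-is:
# ser.write(bytes([0x80]))
```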

Dealing with Padding / Stuff Bits Entropy Encoded JPEG

When decoding entropy encoded DC values in JPEG (or the entropy encoded prediction differences in lossless JPEG), how do I distinguish between 1 bits that have been stuffed to pad a byte before a marker and a Huffman coded value?
For example if I see:
0xAF 0xFF 0xD9
and I have already consumed the bits in [0xA], how can I tell if the next 0xF is padded or should be decoded?
This is from the JPEG Spec:
F.1.2.3 Byte stuffing
In order to provide code space for marker codes
which can be located in the compressed image data without decoding,
byte stuffing is used.
Whenever, in the course of normal encoding, the
byte value X’FF’ is created in the code string, a X’00’ byte is
stuffed into the code string. If a X’00’ byte is detected after a
X’FF’ byte, the decoder must discard it. If the byte is not zero, a
marker has been detected, and shall be interpreted to the extent
needed to complete the decoding of the scan.
Byte alignment of markers
is achieved by padding incomplete bytes with 1-bits. If padding with
1-bits creates a X’FF’ value, a zero byte is stuffed before adding the
marker.
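There are only two possibilities for an FF value in the compressed data stream:
a restart marker; or
FF00 representing FF.
If you are decoding a stream, you will know from the restart interval when to expect a restart marker. When you hit the point in decoding where you should find a restart marker, you discard the remaining bits in the current byte. The same applies at the end of the scan: once you have decoded all the MCUs, any bits left in the current byte (the low 0xF in your example, just before FF D9) are padding.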
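The byte-level part of F.1.2.3 can be sketched in Python like this. The function name unstuff_t81 is mine, and the sketch only removes stuffed zero bytes and finds the terminating marker. It does not decide which trailing bits are padding, since that comes from counting decoded MCUs as described above.

```python
def unstuff_t81(data: bytes):
    """Return (payload, marker) for a T.81 entropy-coded segment.

    payload -- the code string with stuffed 0x00 bytes removed
    marker  -- second byte of the marker that ended the segment, or None
    """
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if byte != 0xFF:
            out.append(byte)
            i += 1
            continue
        if i + 1 >= n:
            break                 # truncated input: lone 0xFF at the end
        nxt = data[i + 1]
        if nxt == 0x00:
            out.append(0xFF)      # FF 00 stands for a data byte FF
            i += 2
        elif nxt == 0xFF:
            i += 1                # fill byte before a marker, skip it
        else:
            return bytes(out), nxt
    return bytes(out), None


print(unstuff_t81(bytes([0xAF, 0xFF, 0xD9])))
```

For the bytes in the question this gives the single payload byte 0xAF and the marker 0xD9 (EOI).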

Preon encode() does not fill up remaining bits until the byte boundary is reached

I have a message in which a variable number of 7-bit characters is encoded. Unfortunately those characters are stored in the message packed as 7 bits each. That means the end of the message is not necessarily aligned to a byte boundary.
Decoding a message with Preon works fine, but when I encode the previously decoded message with Preon and compare the byte arrays, the arrays do not match in length.
The encoded byte array is one byte shorter than the original one.
I debugged Preon because I assumed a bug, but it works as designed. When a byte boundary is reached, Preon holds the remaining bits until the next write() call to the BitChannel occurs. But for the last byte there is no further call.
The question is, is there a way to tell Preon to flush the remaining buffer?
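To illustrate the effect outside of Preon, here is a minimal Python bit writer with an explicit flush. This is not Preon's API: the class BitWriter, the MSB-first bit order and the zero padding are all assumptions made for the sketch.

```python
class BitWriter:
    """Collects bits MSB-first and emits whole bytes as they fill up."""

    def __init__(self):
        self._out = bytearray()
        self._acc = 0      # bits not yet emitted
        self._nbits = 0    # number of valid bits in _acc

    def write(self, value: int, width: int) -> None:
        self._acc = (self._acc << width) | (value & ((1 << width) - 1))
        self._nbits += width
        while self._nbits >= 8:
            self._nbits -= 8
            self._out.append((self._acc >> self._nbits) & 0xFF)
        self._acc &= (1 << self._nbits) - 1

    def flush(self) -> None:
        """Pad the pending bits with zeros up to the next byte boundary."""
        if self._nbits:
            self._out.append((self._acc << (8 - self._nbits)) & 0xFF)
            self._acc = 0
            self._nbits = 0

    def getvalue(self) -> bytes:
        return bytes(self._out)


w = BitWriter()
for ch in b"AB":          # two 7-bit characters = 14 bits
    w.write(ch, 7)
before = w.getvalue()     # only the first full byte has been emitted
w.flush()
after = w.getvalue()      # the last 6 bits are now padded out to a byte
print(len(before), len(after))
```

Without the flush() call the output is one byte short, which matches the length mismatch described above.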

Resources