Understanding the ZMODEM protocol

Understanding the ZMODEM protocol - protocols

I need to include basic file-sending and file-receiving routines in my program, and it needs to be through the ZMODEM protocol. The problem is that I'm having trouble understanding the spec.
For reference, here is the specification.
The spec doesn't define the various constants, so here's a header file from Google.
It seems to me like there are a lot of important things left undefined in that document:
It constantly refers to ZDLE-encoding, but what is it? When exactly do I use it, and when don't I use it?
After a ZFILE data frame, the file's metadata (filename, modify date, size, etc.) are transferred. This is followed by a ZCRCW block and then a block whose type is undefined according to the spec. The ZCRCW block allegedly contains a 16-bit CRC, but the spec doesn't define on what data the CRC is computed.
It doesn't define the CRC polynomial it uses. I found out by chance that the CRC32 poly is the standard CRC32, but I've had no such luck with the CRC16 poly. Nevermind, I found it through trial and error. The CRC16 poly is 0x1021.
I've looked around for reference code, but all I can find are unreadable, undocumented C files from the early 90s. I've also found this set of documents from the MSDN, but it's painfully vague and contradictory to tests that I've run: http://msdn.microsoft.com/en-us/library/ms817878.aspx (you may need to view that through Google's cache)
To illustrate my difficulties, here is a simple example. I've created a plaintext file on the server containing "Hello world!", and it's called helloworld.txt.
I initiate the transfer from the server with the following command:
sx --zmodem helloworld.txt
This prompts the server to send the following ZRQINIT frame:
2A 2A 18 42 30 30 30 30 30 30 30 30 30 30 30 30 **.B000000000000
30 30 0D 8A 11 00.Š.
Three issues with this:
Are the padding bytes (0x2A) arbitrary? Why are there two here, but in other instances there's only one, and sometimes none?
The spec doesn't mention the [CR] [LF] [XON] at the end, but the MSDN article does. Why is it there?
Why does the [LF] have bit 0x80 set?
After this, the client needs to send a ZRINIT frame. I got this from the MSDN article:
2A 2A 18 42 30 31 30 30 30 30 30 30 32 33 62 65 **.B0100000023be
35 30 0D 8A 50.Š
In addition to the [LF] 0x80 flag issue, I have two more issues:
Why isn't [XON] included this time?
Is the CRC calculated on the binary data or the ASCII hex data? If it's on the binary data I get 0x197C, and if it's on the ASCII hex data I get 0xF775; neither of these are what's actually in the frame (0xBE50). (Solved; it follows whichever mode you're using. If you're in BIN or BIN32 mode, it's the CRC of the binary data. If you're in ASCII hex mode, it's the CRC of what's represented by the ASCII hex characters.)
The server responds with a ZFILE frame:
2A 18 43 04 00 00 00 00 DD 51 A2 33 *.C.....ÝQ¢3
OK. This one makes sense. If I calculate the CRC32 of [04 00 00 00 00], I indeed get 0x33A251DD. But now we don't have ANY [CR] [LF] [XON] at the end. Why is this?
Immediately after this frame, the server also sends the file's metadata:
68 65 6C 6C 6F 77 6F 72 6C 64 2E 74 78 74 00 31 helloworld.txt.1
33 20 32 34 30 20 31 30 30 36 34 34 20 30 20 31 3 240 100644 0 1
20 31 33 00 18 6B 18 50 D3 0F F1 11 13..k.PÓ.ñ.
This doesn't even have a header, it just jumps straight to the data. OK, I can live with that. However:
We have our first mysterious ZCRCW frame: [18 6B]. How long is this frame? Where is the CRC data, and is it CRC16 or CRC32? It's not defined anywhere in the spec.
The MSDN article specifies that the [18 6B] should be followed by [00], but it isn't.
Then we have a frame with an undefined type: [18 50 D3 0F F1 11]. Is this a separate frame or is it part of ZCRCW?
The client needs to respond with a ZRPOS frame, again taken from the MSDN article:
2A 2A 18 42 30 39 30 30 30 30 30 30 30 30 61 38 **.B0900000000a8
37 63 0D 8A 7c.Š
Same issues as with the ZRINIT frame: the CRC is wrong, the [LF] has bit 0x80 set, and there's no [XON].
The server responds with a ZDATA frame:
2A 18 43 0A 00 00 00 00 BC EF 92 8C *.C.....¼ï’Œ
Same issues as ZFILE: the CRC is all fine, but where's the [CR] [LF] [XON]?
After this, the server sends the file's payload. Since this is a short example, it fits in one block (max size is 1024):
48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0A Hello world!.
From what the article seems to mention, payloads are escaped with [ZDLE]. So how do I transmit a payload byte that happens to match the value of [ZDLE]? Are there any other values like this?
The server ends with these frames:
18 68 05 DE 02 18 D0 .h.Þ..Ð
2A 18 43 0B 0D 00 00 00 D1 1E 98 43 *.C.....Ñ.˜C
I'm completely lost on the first one. The second makes as much sense as the ZRINIT and ZDATA frames.

My buddy wonders if you are implementing a time
machine.
I don't know that I can answer all of your questions -- I've never
actually had to implement zmodem myself -- but here are few answers:
From what the article seems to mention, payloads are escaped with
[ZDLE]. So how do I transmit a payload byte that happens to match the
value of [ZDLE]? Are there any other values like this?
This is explicitly addressed in the document you linked to at the
beginning of your questions, which says:
The ZDLE character is special. ZDLE represents a control sequence
of some sort. If a ZDLE character appears in binary data, it is
prefixed with ZDLE, then sent as ZDLEE.
It constantly refers to ZDLE-encoding, but what is it? When exactly
do I use it, and when don't I use it?
In the Old Days, certain "control characters" were used to control the
communication channel (hence the name). For example, sending XON/XOFF
characters might pause the transmission. ZDLE is used to escape
characters that may be problematic. According to the spec, these are
the characters that are escaped by default:
ZMODEM software escapes ZDLE, 020, 0220, 021, 0221, 023, and 0223.
If preceded by 0100 or 0300 (#), 015 and 0215 are also escaped to
protect the Telenet command escape CR-#-CR. The receiver ignores
021, 0221, 023, and 0223 characters in the data stream.
I've looked around for reference code, but all I can find are
unreadable, undocumented C files from the early 90s.
Does this include the code for the lrzsz package? This is still
widely available on most Linux distributions (and surprisingly handy
for transmitting files over an established ssh connection).
There are a number of other implementations out there, including
several in software listed on freecode, including qodem,
syncterm, MBSE, and others. I believe the syncterm
implementation is written as library that may be reasonable easy
to use from your own code (but I'm not certain).
You may find additional code if you poke around older collections of
MS-DOS software.

I can't blame you. The user manual is not organized in a user friendly way
Are the padding bytes (0x2A) arbitrary?
No, from page 14,15:
A binary header begins with the sequence ZPAD, ZDLE, ZBIN.
A hex header begins with the sequence ZPAD, ZPAD, ZDLE, ZHEX.
.
The spec doesn't mention the [CR] [LF] [XON] at the end, but the MSDN article does. Why is it there?
Page 15
* * ZDLE B TYPE F3/P0 F2/P1 F1/P2 F0/P3 CRC-1 CRC-2 CR LF XON
.
Why does the [LF] have bit 0x80 set?
Not sure. From Tera term I got both control characters XORed with 0x80 (8D 8A 11)
We have our first mysterious ZCRCW frame: [18 6B]. How long is this frame? Where is the CRC data, and is it CRC16 or CRC32? It's not defined anywhere in the spec.
The ZCRCW is not a header or a frame type, it's more like a footer that tells the receiver what to expect next. In this case it's the footer of the data subpacket containing the file name. It's going to be a 32 bit checksum because you're using a "C" type binary header.
ZDLE C TYPE F3/P0 F2/P1 F1/P2 F0/P3 CRC-1 CRC-2 CRC-3 CRC-4
.
Then we have a frame with an undefined type: [18 50 D3 0F F1 11]. Is this a separate frame or is it part of ZCRCW?
That's the CRC for the ZCRCW data subpacket. It's 5 bytes because the first one is 0x10, a control character that needs to be ZDLE escaped. I'm not sure what 0x11 is.
and there's no [XON].
XON is just for Hex headers. You don't use it for a binary header.
ZDLE A TYPE F3/P0 F2/P1 F1/P2 F0/P3 CRC-1 CRC-2
.
So how do I transmit a payload byte that happens to match the value of [ZDLE]?
18 58 (AKA ZDLEE)
18 68 05 DE 02 18 D0
That's the footer of the data subframe. The next 5 bytes are the CRC (last byte is ZDLE encoded)

The ZDLE + ZBIN (0x18 0x41) means the frame is CRC-CCITT(XMODEM 16) with Binary Data.
ZDLE + ZHEX (0x18 0x42) means CRC-CCITT(XMODEM 16) with HEX data.
The HEX data is tricky, since at first some people don't understand it. Every two bytes, the ASCII chars represent one byte in binary. For example, 0x30 0x45 0x41 0x32 means 0x30 0x45, 0x41 0x32, or in ASCII 0 E A 2, then 0x0E, 0xA2. They expand the binary two nibbles to a ASCII representation. I found in several dataloggers that some devices use lower case to represent A~F (a~f) in HEX, it doesn't matter, but on those, you will not find 0x30 0x45 0x41 0x32 (0EA2) but 0x30 0x65 0x61 0x32 (0ea2), it doesn't change a thing, just make it a little bit confuse at first.
And yes, the CRC16 for ZBIN or ZHEX is CRC-CCITT(XMODEM).
The ZDLE ZBIN32 (0x18 0x43) or ZDLE ZBINR32 (0x18 0x44) use CRC-32 calculation.
Noticing that the ZDLE and the following byte are excluded in the CRC calculation.
I am digging into the ZMODEM since I need to "talk" with some Elevators Door Boards, to program a new set of parameters at once, instead using their software to change attribute by attribute. This "talk" could be on the bench instead sitting over the elevator car roof with a notebook. Those boards talk ZMODEM, but as I don't have the packets format they expect, the board still rejecting my file transfer. The boards send 0x2a 0x2A 0x18 0x42 0x30 0x31 0x30 (x6) + crc, the Tera Terminal transfering the file in ZMODEM send to the board 0x2A 0x2A 0x18 0x42 0x30 0x30 ... + CRC, I don't know why this 00 or 01 after the 0x4B means. The PC send this and the filename and some file attributes. The board after few seconds answer with "No File received"...

Related

Interpret AVRCP packets

After some mucking about, I have got a pybluez script to connect to an AVRCP profile on various devices, and read the responses.
Code snippet:
addr="e2:8b:8e:89:6c:07" #S530 white
port=23
if (port>0):
print("Attempting to connect to L2CAP port ",port)
socket=bluetooth.BluetoothSocket(bluetooth.L2CAP);
socket.connect((addr,port))
print("Connected.")
while True:
print("Waiting on read:")
data=socket.recv(1024)
for b in data:
print("%02x"%b,end=" ")
print()
socket.close()
The results I'm getting when I press the button on the earpiece are as follows:
Attempting to connect to L2CAP port 23
Connected.
Waiting on read:
10 11 0e 01 48 00 00 19 58 10 00 00 01 03
Waiting on read:
20 11 0e 00 48 7c 44 00
Waiting on read:
30 11 0e 00 48 7c 46 00
Waiting on read:
40 11 0e 00 48 7c 44 00
After careful reading of the spec, it looks like I'm seeing PASSTHROUGH commands, with 44 being the "PLAY" operation command, and 46 being "PAUSE" (I think)
I don't know what the 10 11 0e means, apart from the fact that the first byte appears to be some sort of sequence number.
My issue is threefold:
I don't know where to find a list of valid operation_ids. It's
mentioned in the spec but not defined apart from a few random
examples.
The spec makes reference to subunit type and Id, (which would be the
48 in the above example) again without defining them AFAICT.
There is no mention of what the leading three bytes are. They may
even be part of L2CAP and nothing directly to do with AVRCP, I'm not
familiar enough with pybluez to tell.
Any assistance in any of the above points would be helpful.
Edit: For reference, the details of the AVRCP spect appears to be here: https://www.bluetooth.org/docman/handlers/DownloadDoc.ashx?doc_id=119996

The real answer is that the specification document assumes you have read other specification documents.
The three header bytes are part of the AVCTP transport layer:
http://www.cs.bilkent.edu.tr/~korpe/lab/resources/AVCTP%20Spec%20v1_0.pdf
In short:
0: 7..4: Incrementing transaction id. 0x01 to 0x0f
3..2: Packet type 00 = self contained packet
1 : 0=request 1=response
0 : 0=PID recognized 1: PID error
1-2: 2 byte bigendian profile id (in this case 110e, AVRCP)
The rest is described in the AVRCP profile doc, https://www.bluetooth.org/docman/handlers/DownloadDoc.ashx?doc_id=119996
I don't find the documentation to be amazingly clear.
I have provided a sample application which seems to work for most of the AVRCP devices I have been able to test:
https://github.com/rjmatthews62/BtAVRCP

Why doesn't the QNAME for this DNS q. end with a NULL character?

I was monitoring port 5353 (mDNS) with WireShark and came across the following DNS question:
According to section 4.1.2 of RFC 1035 QNAME is:
a domain name represented as a sequence of labels, where each label consists of a length octet followed by that number of octets. The domain name terminates with the zero length octet for the null label of the root...
This seems to contradict what I'm seeing over-the-wire in the capture above. The last label ends with c0 12 instead of 00. Why is this and how come it isn't documented in the RFC?

Apparently, when a label sequence ends with c0 12, this indicates an indirect pointer. It is roughly equivalent to stating "go to this offset in the DNS query and continue reading from there".
The first two bits are a constant (c0) and the remaining 14 bits are the offset from the start of the query. In my question, for example, c0 12 indicates that the next part of the QNAME should come from 47 bytes into the query.
05 6c 6f 63 61 6c 00 .local.

How to tell difference between JPEG and JPEG2000?

The question is: How can i tell two files apart? One coded with JPEG, another with JPEG2000.
I need format-specific file read/write functions, i can't find file encoding without reading it.
JPEG works fine right now, but JPEG func fails to open JPEG2000.
So i need to determine whether my file is JPG or JPEG2000.

According to the Digital Formats at Library of Congress, all JPEG 2000 files start with the following signature (also known as magic bytes or magic number):
00 00 00 0C 6A 50 20 20 0D 0A 87 0A
(The IANA record only lists the first 12, so I left the remainder out).
Normal JPEG files on the other hand, starts with:
FF D8 FF E0
Comparing these bytes, you should easily be able to tell them apart.

Could I get memory access information through X86_64 machine code?

I want to do a statistic of memory bytes access on programs running on Linux (X86_64 architecture). I use perf tool to dump the file like this:
: ffffffff81484700 <load2+0x484700>:
2.86 : ffffffff8148473b: 41 8b 57 04 mov 0x4(%r15),%edx
5.71 : ffffffff81484800: 65 8b 3c 25 1c b0 00 mov %gs:0xb01c,%edi
22.86 : ffffffff814848a0: 42 8b b4 39 80 00 00 mov 0x80(%rcx,%r15,1),%esi
25.71 : ffffffff814848d8: 42 8b b4 39 80 00 00 mov 0x80(%rcx,%r15,1),%esi
2.86 : ffffffff81484947: 80 bb b0 00 00 00 00 cmpb $0x0,0xb0(%rbx)
2.86 : ffffffff81484954: 83 bb 88 03 00 00 01 cmpl $0x1,0x388(%rbx)
5.71 : ffffffff81484978: 80 79 40 00 cmpb $0x0,0x40(%rcx)
2.86 : ffffffff8148497e: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
5.71 : ffffffff8148499b: 8b 71 34 mov 0x34(%rcx),%esi
5.71 : ffffffff814849a4: 0f af 34 24 imul (%rsp),%esi
My current method is to analyze file and get all memory access instructions, such as move, cmp, etc. Then calculate every access bytes of every instruction, such as mov 0x4(%r15),%edx will add 4 bytes.
I want to know whether there is possible way to calculate through machine code , such as by analyzing "41 8b 57 04", I can also add 4 bytes. Because I am not familiar with X86_64 machine code, could anyone give any clues? Or is there any better way to do statistics? Thanks in advance!

See https://stackoverflow.com/a/20319753/120163 for information about decoding Intel instructions; in fact, you really need to refer to Intel reference manuals: http://download.intel.com/design/intarch/manuals/24319101.pdf If you only want to do this manually for a few instructions, you can just look up the data in these manuals.
If you want to automate the computation of instruction total-memory-access, you will need a function that maps instructions to the amount of data accessed. Since the instruction set is complex, the corresponding function will be complex and take you a long time to write from scratch.
My SO answer https://stackoverflow.com/a/23843450/120163 provides C code that maps x86-32 instructions to their length, given a buffer that contains a block of binary code. Such code is necessary if one is to start at some point in the object code buffer and simply enumerate the instructions that are being used. (This code has been used in production; it is pretty solid). This routine was built basically by reading the Intel reference manual very carefully. For OP, this would have to be extended to x86-64, which shouldn't be very hard, mostly you have account for the extended-register prefix opcode byte and some differences from x86-32.
To solve OP's problem, one would also modify this routine to also return the number of byte reads by each individual instruction. This latter data also has to be extracted by careful inspection from the Intel reference manuals.
OP also has to worry about where he gets the object code from; if he doesn't run this routine in the address space of the object code itself, he will need to somehow get this object code from the .exe file. For that, he needs to build or run the equivalent of the Windows loader, and I'll bet that
has a bunch of dark corners. Check out the format of object code files.

Decoding sniffed packets

I understand that each packet has some header that seems like a random mix of chars. On the other hand, the content itself can be in pure ascii and therefore it might be human friendly. Some of the packets I sniffed were readable (raw html headers for sure). But some packets looked like this:
0000 00 15 af 51 68 b2 00 e0 98 be cf d6 08 00 45 00 ...Qh... ......E.
0010 05 dc 90 39 40 00 2e 06 99 72 08 13 f0 49 c0 a8 ...9#... .r...I..
0020 64 6b 00 50 c1 32 02 7a 60 4f 4c b6 45 62 50 10 dk.P.2.z `OL.EbP.
That was just a part, these packets were usually longer. My question is, how can I decode the packet content/data? Do I need the whole stream? Is the decoding simple, or every application can encode it slightly else, to ensure these packets are secured?
Edit:
I don't care about the header, Wireshark shows that. However, that's totally worthless info. I want to decode the data/content.

The content of a packet is defined by the process sending it. Think of it like a telephone call. What's said is dependent on who is calling and who they are talking to. You have to study the programs that construct it to determine how to "decode" it. There are some sniffers that will parse some commonly used methods of encoding and try to do this already.

Why not just use something like wireshark?

Packet headers will depend on the application sending the packet in question, as mentioned in an earlier post. You can also use Wiresharks protocol reference for understanding some of the common protocols.
What you have listed here is the Packet Byte, what you need to see is the Packet Detail view to understand what does the seemingly random data correspond to. In Packet Detail view, when you select various parts of the packet, it will highlight corresponding byte in the Packet Byte view.

If you're using C#, grab SharpPcap and look at the examples in code to get a feel for how it works.
Set the filter to only capture UDP, capture a packet, parse it to udp, and extract the payload. The payload's format is based on the application sending it.
There's a lot of extra gibberish because every udp packet contains a stack of:
Ethernet header
IP header
UDP header
of information before your data and all incoming data is in binary format until you parse it to something meaningful.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string