Python 3: Unable to convert ASCII String to Hexadecimal - python-3.x

Python Version: 3.7.2
I need to convert a string in ASCII like Øâþ  ÿþ !Zk2ìm "Ï"À>q úÞ to hexadecimal, which in this case would be d8 e2 02 12 02 fe 01 20 9b 10 20 20 03 ff 07 fe 20 20 21 5a 6b 32 ec 17 6d 20 0e 22 cf 22 c0 3e 71 20 02 20 03 fa de. I found several solutions for doing this on Python 2; however, I can't find any way of doing this on Python 3.
To summarise: The intended behaviour is ASCII to HEX as follows:
Øâþ  ÿþ !Zk2ìm "Ï"À>q úÞ TO d8 e2 02 12 02 fe 01 20 9b 10 20 20 03 ff 07 fe 20 20 21 5a 6b 32 ec 17 6d 20 0e 22 cf 22 c0 3e 71 20 02 20 03 fa de.
I've even checked on https://www.rapidtables.com/convert/number/ascii-to-hex.html and found it works, but I'm unable to implement it in Python 3.

You may use the code:
print(*[hex(ord(letter))[2:] for letter in 'Øâþ ÿþ !Zk2ìm "Ï"À>q úÞ'])
which gives the following output:
d8 e2 fe 20 10 20 20 ff fe 20 20 21 5a 6b 32 ec 6d 20 e 22 cf 22 c0 3e 71 20 20 fa de
ord() - get the character's code point (the same as its ASCII/Latin-1 value for code points below 256),
hex() - get hex from int,
[2:] - to omit 0x in every number.
EDIT
Slightly modified version (to get 0e instead of e):
string = 'Øâþ ÿþ !Zk2ìm "Ï"À>q úÞ'
print(*['{:02x}'.format(ord(letter)) for letter in string])
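Since every character in the example has a code point below 256, another way to get the same result is to encode the string as Latin-1 and format the resulting bytes. A minimal sketch, assuming the input really is Latin-1 text (the helper name to_hex is made up for illustration):

```python
def to_hex(text: str) -> str:
    """Return space-separated, zero-padded hex for each character of text."""
    # latin-1 maps code points 0..255 one-to-one onto single bytes;
    # a character above 255 raises UnicodeEncodeError instead of giving wrong output.
    data = text.encode("latin-1")
    return " ".join(f"{byte:02x}" for byte in data)


print(to_hex("Zk2\x0e"))  # 5a 6b 32 0e
```

Encoding first makes the choice of character set explicit, which matters here because the text is not pure ASCII.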

Use ord():
s = 'Øâþ  ÿþ !Zk2ìm "Ï"À>q úÞ'
data = bytearray(ord(char) for char in s)
print(data)
Output:
bytearray(b'\xd8\xe2\xfe \x10 \xff\xfe !Zk2\xecm \x0e"\xcf"\xc0>q \xfa\xde')
That being said, I can't match your output exactly because you copied and pasted a garbage char:
print(''.join(chr(char) for char in data)) # Øâþ ÿþ !Zk2ìm "Ï"À>q úÞ

Related

How to change specific byte in packet using scapy?

I want to modify icmp.unused value in scapy. But no matter what value I set for it, the value of icmp.unused is still 0. I know which byte in my packet is responsible for its value. So I want to modify the byte directly. hexstr and hexdump don't work. The end of the packet is messed up. How to do this?
hex_packet = scapy.hexstr(packet)
print(type(hex_packet))
list_packet = list(hex_packet)
list_packet[38] = '\x05'
list_packet[39] = '\x14'
hex_packet = ''.join(list_packet)
packet_hex = scapy.Ether(scapy.import_hexcap())
08 00 27 78 FE 4B 52 54 00 12 35 00 08 00 45 00 00 38 00 01 00 00 40 01 31 6D C0 A8 64 01 C0 A8 64 05 03 04 41 5E 00 00 05 14 45 00 00 1C 00 01 00 00 40 11 31 74 C0 A8 64 05 C0 A8 64 06 FC F1 00 35 00 08 B9 5A ..'x.KRT..5...E..8....@.1m..d...d...A^....E.......@.1t..d...d....5...Z
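One thing that may explain the messed-up result: as far as I can tell, hexstr returns printable hex text, so indexing into it changes characters of that text rather than bytes of the packet. A minimal sketch of patching the raw bytes instead, using only the standard library. FRAME_HEX is the first 42 bytes of the dump, read as Ethernet + IPv4 + ICMP (an assumption), and patch_bytes is a hypothetical helper name:

```python
# First 42 bytes of the dump, assumed to be Ethernet (14) + IPv4 (20) + ICMP (8).
FRAME_HEX = (
    "08 00 27 78 FE 4B 52 54 00 12 35 00 08 00 "                    # Ethernet header
    "45 00 00 38 00 01 00 00 40 01 31 6D C0 A8 64 01 C0 A8 64 05 "  # IPv4 header
    "03 04 41 5E 00 00 05 14"                                       # start of ICMP
)


def patch_bytes(raw: bytes, offset: int, new: bytes) -> bytes:
    """Return a copy of raw with len(new) bytes at offset replaced by new."""
    buf = bytearray(raw)  # bytes objects are immutable, bytearray is not
    if offset < 0 or offset + len(new) > len(buf):
        raise IndexError("patch does not fit inside the packet")
    buf[offset:offset + len(new)] = new
    return bytes(buf)


frame = bytes.fromhex(FRAME_HEX)
patched = patch_bytes(frame, 38, b"\x05\x14")
print(patched[34:42].hex())  # 0304415e05140514
```

With scapy, the raw bytes would presumably come from bytes(packet), and the patched result could be re-dissected with Ether(patched). Note that this does not recompute the ICMP checksum (41 5E above), so a receiver may reject the modified packet.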

How would I "decode" the packet data from a pcap in NodeJS?

I want to write a pcap analyzer script that can detect what kind of traffic a pcap file contains.
The general idea is output like: HTTP(x10), DNS(x5), HTTPS(x20)
As you can see, the majority of the traffic is HTTPS. I want to be able to pull that from the pcap packet data and pass it to another section of my analyzer script.
I don't know which npm packages I could use. I have looked into pcap-parser, a 9+ year old npm package, which only provides packet.data and packet.header.
I'm losing all hope of making this script work: I've tried every potential resource and even researched an API I could upload the pcap to and get the info back from, to no avail.
Example of packet.header
{
timestampSeconds: 1606145597,
timestampMicroseconds: 444357,
capturedLength: 60,
originalLength: 60
}
Example of packet.data (Buffer)
<Buffer 01 00 5e 7f ff fa 34 29 8f 99 09 70 08 00 45 00 00 a5 a4 76 00 00 04 11 10 f3 0a c8 06 1d ef ff ff fa ed 0c 07 6c 00 91 17 56 4d 2d 53 45 41 52 43 48 ... 129 more bytes>
<Buffer ff ff ff ff ff ff 34 29 8f 99 09 6e 08 06 00 01 08 00 06 04 00 01 34 29 8f 99 09 6e 0a c8 06 e6 00 00 00 00 00 00 0a c8 06 de 00 00 00 00 00 00 00 00 ... 10 more bytes>
<Buffer e0 55 3d 5e 95 a0 40 ec 99 d3 06 fd 08 00 45 00 05 6b a7 ed 40 00 80 06 00 00 0a 91 a6 ce 34 ef cf 64 e9 9f 01 bb a2 30 72 ed d9 06 6d cc 80 18 02 00 ... 1351 more bytes>
<Buffer 40 ec 99 d3 06 fd e0 55 3d 5e 95 a0 08 00 45 00 00 34 72 2d 40 00 70 06 e2 e3 34 ef cf 64 0a 91 a6 ce 01 bb e9 9f d9 06 6d cc a2 30 14 19 80 10 1b 25 ... 16 more bytes>
<Buffer e0 55 3d 5e 95 a0 40 ec 99 d3 06 fd 08 00 45 00 00 34 05 b4 40 00 80 06 00 00 0a 91 a6 ce 17 d9 8a 6c e9 a8 01 bb f0 0d cc ed 00 00 00 00 80 02 fa f0 ... 16 more bytes>
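The packet.data buffers above look like plain Ethernet frames, so a rough classification needs only the EtherType, the IPv4 protocol field, and the port numbers. A minimal sketch under those assumptions, written in Python for brevity (the same byte offsets apply to a Node Buffer). The names classify and WELL_KNOWN are hypothetical, and matching on port numbers is only a heuristic:

```python
import struct

# Port-to-label table; a heuristic, since any service can run on any port.
WELL_KNOWN = {80: "HTTP", 443: "HTTPS", 53: "DNS"}


def classify(frame: bytes) -> str:
    """Label an untagged Ethernet frame by EtherType, IPv4 protocol and port."""
    if len(frame) < 14:
        return "truncated"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == 0x0806:
        return "ARP"
    if ethertype != 0x0800:          # not IPv4 (IPv6 and VLAN tags are not handled)
        return "other"
    if len(frame) < 34:
        return "truncated"
    ihl = (frame[14] & 0x0F) * 4     # IPv4 header length in bytes
    protocol = frame[23]             # 6 = TCP, 17 = UDP
    if protocol not in (6, 17):
        return "other"
    l4 = 14 + ihl                    # start of the TCP/UDP header
    if len(frame) < l4 + 4:
        return "truncated"
    src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
    for port in (dst_port, src_port):
        if port in WELL_KNOWN:
            return WELL_KNOWN[port]
    return "other"


# First 38 bytes of the third example buffer (TCP to port 0x01bb = 443).
sample = bytes.fromhex(
    "e0 55 3d 5e 95 a0 40 ec 99 d3 06 fd 08 00 45 00 05 6b a7 ed 40 00 80 06 "
    "00 00 0a 91 a6 ce 34 ef cf 64 e9 9f 01 bb"
)
print(classify(sample))  # HTTPS
```

Counting the labels over all packets would give the HTTP/DNS/HTTPS tallies described in the question.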

Is a JPEG with a bogus Huffman table recoverable?

I have a JPEG that is un-openable in any program:
Opening in Ubuntu Image Viewer yields:
Passing the photo through convert yields similar results:
$ convert corrupt.jpg out.jpg
convert.im6: Bogus Huffman table definition `corrupt.jpg' @ error/jpeg.c/JPEGErrorHandler/316.
convert.im6: no images defined `out.jpg' @ error/convert.c/ConvertImageCommand/3044.
Running the photo through exiftool yields:
ExifTool Version Number : 9.46
File Name : corrupt.jpg
Directory : .
File Size : 47 kB
File Modification Date/Time : 2015:04:11 01:31:14-07:00
File Access Date/Time : 2018:05:04 10:26:04-07:00
File Inode Change Date/Time : 2018:05:04 10:26:03-07:00
File Permissions : r--------
File Type : JPEG
MIME Type : image/jpeg
Comment : Y�.�.�..2..Q.Q.
Image Width : 640
Image Height : 480
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:2:2 (2 1)
Image Size : 640x480
Un-corrupted photos containing similar image contents average 45-48k, so I reckon the photo data itself is inside this JPEG somewhere.
I hosted the photo on S3. You can download it w/ wget:
wget https://s3.amazonaws.com/jordanarseno.com/corrupt.jpg
I opened the file with hexedit and found the following:
The photo contents outside of the first few hundred bytes are randomly distributed enough to suggest they contain an image, i.e. I'm not seeing consecutive streams of 0's or F's.
It does in fact start with the FF D8 file signature, as JPEGs ought to.
The next two bytes are not FF E0 or FF E1 like the list of file signatures says should correspond to JPEGs or JFIFs. Instead it is FF FE, which is in the table, but is listed as:
Byte-order mark for text file encoded in little-endian 16-bit Unicode Transfer Format
Not long after the FF FE, I see bytes whose ASCII representation is: &'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz. Seems rather strange for a JPEG. What is this?
Likewise, the ASCII string &'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz appears about 100 bytes later.
FF D9 (the JPEG terminator string) is in the file, but characters do appear after this terminator:
FF D9 5C 72 78 E0 7C 94 CD B2 9C FF 00 C4 BF 53 C0 E7 FE 41 D3 9C FF 00 E3 95 7C F1 B6 92 5F 7A 2B EB 54 AF BF E6 30 FD A0 7F CC 3B 53 E9 FF 00 40 F9 FF 00 F8 8A 4D F7 08 30
Switching over to Windows and using JPEGsnoop yields:
JPEGsnoop 1.8.0 by Calvin Hass
http://www.impulseadventure.com/photo/
-------------------------------------
Filename: [C:\corrupt.jpg]
Filesize: [47760] Bytes
Start Offset: 0x00000000
*** Marker: SOI (xFFD8) ***
OFFSET: 0x00000000
*** Marker: COM (Comment) (xFFFE) ***
OFFSET: 0x00000002
Comment length = 36
Comment=Y.Ò................à.....2..Q.Q...
*** Marker: DQT (xFFDB) ***
Define a Quantization Table.
OFFSET: 0x00000028
Table length = 132
----
Precision=8 bits
Destination ID=0 (Luminance)
DQT, Row #0: 3 2 2 3 4 7 9 10
DQT, Row #1: 2 2 2 3 4 10 10 9
DQT, Row #2: 2 2 3 4 7 10 12 10
DQT, Row #3: 2 3 4 5 9 15 14 11
DQT, Row #4: 3 4 6 10 12 19 18 13
DQT, Row #5: 4 6 9 11 14 18 19 16
DQT, Row #6: 8 11 13 15 18 21 21 17
DQT, Row #7: 12 16 16 17 19 17 18 17
Approx quality factor = 91.45 (scaling=17.09 variance=0.95)
----
Precision=8 bits
Destination ID=1 (Chrominance)
DQT, Row #0: 3 3 4 8 17 17 17 17
DQT, Row #1: 3 4 4 11 17 17 17 17
DQT, Row #2: 4 4 10 17 17 17 17 17
DQT, Row #3: 8 11 17 17 17 17 17 17
DQT, Row #4: 17 17 17 17 17 17 17 17
DQT, Row #5: 17 17 17 17 17 17 17 17
DQT, Row #6: 17 17 17 17 17 17 17 17
DQT, Row #7: 17 17 17 17 17 17 17 17
Approx quality factor = 91.44 (scaling=17.11 variance=0.19)
*** Marker: COM (Comment) (xFFFE) ***
OFFSET: 0x000000AE
Comment length = 5
Comment=...
*** Marker: SOF0 (Baseline DCT) (xFFC0) ***
OFFSET: 0x000000B5
Frame header length = 17
Precision = 8
Number of Lines = 480
Samples per Line = 640
Image Size = 640 x 480
Raw Image Orientation = Landscape
Number of Img components = 3
Component[1]: ID=0x01, Samp Fac=0x21 (Subsamp 1 x 1), Quant Tbl Sel=0x00 (Lum: Y)
Component[2]: ID=0x02, Samp Fac=0x11 (Subsamp 2 x 1), Quant Tbl Sel=0x01 (Chrom: Cb)
Component[3]: ID=0x03, Samp Fac=0x11 (Subsamp 2 x 1), Quant Tbl Sel=0x01 (Chrom: Cr)
*** Marker: DHT (Define Huffman Table) (xFFC4) ***
OFFSET: 0x000000C8
Huffman table length = 418
----
Destination ID = 0
Class = 0 (DC / Lossless Table)
Codes of length 01 bits (000 total):
Codes of length 02 bits (001 total): 00
Codes of length 03 bits (005 total): 01 02 03 04 05
Codes of length 04 bits (001 total): 06
Codes of length 05 bits (001 total): 07
Codes of length 06 bits (001 total): 08
Codes of length 07 bits (001 total): 09
Codes of length 08 bits (001 total): 0A
Codes of length 09 bits (001 total): 0B
Codes of length 10 bits (000 total):
Codes of length 11 bits (000 total):
Codes of length 12 bits (000 total):
Codes of length 13 bits (000 total):
Codes of length 14 bits (000 total):
Codes of length 15 bits (000 total):
Codes of length 16 bits (000 total):
Total number of codes: 012
----
Destination ID = 1
Class = 0 (DC / Lossless Table)
Codes of length 01 bits (000 total):
Codes of length 02 bits (003 total): 13 0E 0F
Codes of length 03 bits (001 total): 10
Codes of length 04 bits (001 total): 11
Codes of length 05 bits (001 total): 12
Codes of length 06 bits (001 total): 12
Codes of length 07 bits (012 total): 12 0B 0D 13 15 13 11 15 10 11 12 11
Codes of length 08 bits (016 total): 01 03 03 03 04 04 04 08 04 04 08 11 0B 0A 0B 11
Codes of length 09 bits (013 total): 11 11 11 11 11 11 11 11 11 11 11 11 11
Codes of length 10 bits (011 total): 11 11 11 11 11 11 11 11 11 11 11
Codes of length 11 bits (012 total): 11 11 11 11 11 11 11 11 11 11 11 01
Codes of length 12 bits (015 total): 01 01 01 01 00 00 00 00 00 00 01 02 03 04 05
Codes of length 13 bits (012 total): 06 07 08 09 0A 0B 10 00 02 01 03 03
Codes of length 14 bits (009 total): 02 04 03 05 05 04 04 00 00
Codes of length 15 bits (010 total): 01 7D 01 02 03 00 04 11 05 12
Codes of length 16 bits (014 total): 21 31 41 06 13 51 61 07 22 71 14 32 81 91
Total number of codes: 131
----
Destination ID = 1
Class = 10 (AC Table)
ERROR: Invalid DHT Class (10). Aborting DHT Load.
ERROR: Expected marker 0xFF, got 0x73 @ offset 0x0000026C. Consider using [Tools->Img Search Fwd/Rev].
*** Searching Compression Signatures ***
Signature: 01FF5BA518B453CC8F224A4C85505196
Signature (Rotated): 01D13AFD01FF0B6EC46EA4081D25BB4D
File Offset: 0 bytes
Chroma subsampling: 2x1
EXIF Make/Model: NONE
EXIF Makernotes: NONE
EXIF Software: NONE
Searching Compression Signatures: (3347 built-in, 0 user(*) )
EXIF.Make / Software EXIF.Model Quality Subsamp Match?
------------------------- ----------------------------------- ---------------- --------------
CAM:[NIKON ] [NIKON D40 ] [FINE ] Yes
Based on the analysis of compression characteristics and EXIF metadata:
ASSESSMENT: Class 1 - Image is processed/edited
This may be a new software editor for the database.
If this file is processed, and editor doesn't appear in list above,
PLEASE ADD TO DATABASE with [Tools->Add Camera to DB]
*** Additional Info ***
NOTE: Data exists after EOF, range: 0x00000000-0x0000BA90 (47760 bytes)
As a last note, the EXIF.Model identified by JPEGSnoop is incorrect. This photo would have been taken with a VC0706 UART Model: LCF - 23T 0V528
In summary: Is this JPEG recoverable?
The approach used to get this back was more luck than judgement. I think I can explain, though be aware it involves a hex editor...
The Wikipedia page for the syntax of a JPEG file explains that it is made up of a series of segments each started by a two byte marker - 0xFF and another byte to indicate the type of segment.
The hope was that it was just the Huffman table segment of the file that was wrong - as suggested by the error message. Without needing to understand what a Huffman table is, it was enough to see that the same section on Wikipedia explains it is a 0xFF 0xC4 marker for a Huffman table segment.
Further down the page, it mentions:
The JPEG standard provides general-purpose Huffman tables; encoders
may also choose to generate Huffman tables...
Opening up a few other JPEG files found what looks like a standard set of 4 consecutive Huffman table segments - each starting with that 0xFF 0xC4 marker. The sample corrupt.jpg however just had one Huffman table - from position 0x00c8 to 0x02bc below.
(Both contain that &'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz sequence you mentioned in their Huffman tables. In the corrupt file it appears twice in that single Huffman table, in the 'more conventional' JPEGs it appears in the second and fourth Huffman tables.)
From there, the fixed image is a copy and paste of the standard 4 Huffman tables, in place of that range of bytes in corrupt.jpg - now from 0x00c8 to 0x0278 in the fixed file.
Because the JPEG format is based around scanning for segments between those 0xff markers, you can just swap out the Huffman segments - there are no other pointers in the file to worry about. As you said, the rest of the file looked like a plausible JPEG.
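To see the segment layout without a hex editor, the header segments can be listed by following each segment's 2-byte big-endian length field. This is a minimal sketch of that length-based walk, not what the fix itself relied on. The function name walk_segments is made up, and it deliberately ignores fill bytes and markers that carry no length:

```python
import struct


def walk_segments(data: bytes):
    """Yield (offset, marker, length) for each header segment up to SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker (FF D8)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(
                f"expected 0xFF at offset {pos:#06x}, got {data[pos]:#04x}"
            )
        marker = data[pos + 1]
        # The length counts its own two bytes but not the FF xx marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield pos, marker, length
        if marker == 0xDA:  # SOS: entropy-coded image data follows
            return
        pos += 2 + length
```

Going by the lengths in the JPEGsnoop log, running this on corrupt.jpg should report segments at 0x02, 0x28, 0xAE, 0xB5 and 0xC8 and then raise at 0x26C, because the Huffman segment's declared length (418) does not reach the next real marker.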
Summary of the steps taken:
Hex search the corrupt.jpg for FF C4 and note the offset
Hex search for the next FF. If it's another FF C4 (so a second Huffman table) keep going
Delete the content from the first FF C4 (included) up to but not including the next FF
Instead replace it with the 'standard 4 Huffman tables'. These are the bytes in the last sample below, or can be copied from 0x00c8 to 0x0278 in the fixed file
Corrupt Huffman table:
0000-00d0: xx xx xx xx xx xx xx xx-ff c4 01 a2-00 00 01 05 !....... ........
0000-00e0: 01 01 01 01-01 01 00 00-00 00 00 00-00 00 01 02 ........ ........
0000-00f0: 03 04 05 06-07 08 09 0a-0b 01 00 03-01 01 01 01 ........ ........
0000-0100: 0c 10 0d 0b-0c 0f 0c 09-0a 0e 13 0e-0f 10 11 12 ........ ........
0000-0110: 12 12 0b 0d-13 15 13 11-15 10 11 12-11 01 03 03 ........ ........
0000-0120: 03 04 04 04-08 04 04 08-11 0b 0a 0b-11 11 11 11 ........ ........
0000-0130: 11 11 11 11-11 11 11 11-11 11 11 11-11 11 11 11 ........ ........
0000-0140: 11 11 11 11-11 11 11 11-11 11 11 11-11 11 11 11 ........ ........
0000-0150: 01 01 01 01-01 00 00 00-00 00 00 01-02 03 04 05 ........ ........
0000-0160: 06 07 08 09-0a 0b 10 00-02 01 03 03-02 04 03 05 ........ ........
0000-0170: 05 04 04 00-00 01 7d 01-02 03 00 04-11 05 12 21 ......}. .......!
0000-0180: 31 41 06 13-51 61 07 22-71 14 32 81-91 a1 08 23 1A..Qa." q.2....#
0000-0190: 42 b1 c1 15-52 d1 f0 24-33 62 72 82-09 0a 16 17 B...R..$ 3br.....
0000-01a0: 18 19 1a 25-26 27 28 29-2a 34 35 36-37 38 39 3a ...%&'() *456789:
0000-01b0: 43 44 45 46-47 48 49 4a-53 54 55 56-57 58 59 5a CDEFGHIJ STUVWXYZ
0000-01c0: 63 64 65 66-67 68 69 6a-73 74 75 76-77 78 79 7a cdefghij stuvwxyz
0000-01d0: 83 84 85 86-87 88 89 8a-92 93 94 95-96 97 98 99 ........ ........
0000-01e0: 9a a2 a3 a4-a5 a6 a7 a8-a9 aa b2 b3-b4 b5 b6 b7 ........ ........
0000-01f0: b8 b9 ba c2-c3 c4 c5 c6-c7 c8 c9 ca-d2 d3 d4 d5 ........ ........
0000-0200: d6 d7 d8 d9-da e1 e2 e3-e4 e5 e6 e7-e8 e9 ea f1 ........ ........
0000-0210: f2 f3 f4 f5-f6 f7 f8 f9-fa 11 00 02-01 02 04 04 ........ ........
0000-0220: 03 04 07 05-04 04 00 01-02 77 00 01-02 03 11 04 ........ .w......
0000-0230: 05 21 31 06-12 41 51 07-61 71 13 22-32 81 08 14 .!1..AQ. aq."2...
0000-0240: 42 91 a1 b1-c1 09 23 33-52 f0 15 62-72 d1 0a 16 B.....#3 R..br...
0000-0250: 24 34 e1 25-f1 17 18 19-1a 26 27 28-29 2a 35 36 $4.%.... .&'()*56
0000-0260: 37 38 39 3a-43 44 45 46-47 48 49 4a-53 54 55 56 789:CDEF GHIJSTUV
0000-0270: 57 58 59 5a-63 64 65 66-67 68 69 6a-73 74 75 76 WXYZcdef ghijstuv
0000-0280: 77 78 79 7a-82 83 84 85-86 87 88 89-8a 92 93 94 wxyz.... ........
0000-0290: 95 96 97 98-99 9a a2 a3-a4 a5 a6 a7-a8 a9 aa b2 ........ ........
0000-02a0: b3 b4 b5 b6-b7 b8 b9 ba-c2 c3 c4 c5-c6 c7 c8 c9 ........ ........
0000-02b0: ca d2 d3 d4-d5 d6 d7 d8-d9 da e2 e3-e4 e5 e6 e7 ........ ........
0000-02c0: e8 e9 ea f2-f3 f4 f5 f6-f7 f8 f9 fa-xx xx xx xx ........ ........
Then the next two bytes are ff dd for the start of the next segment:
0000-02c0: xx xx xx xx-xx xx xx xx-xx xx xx xx-ff dd 00 04 ........ ........
This was replaced with the standard 4 general-purpose Huffman tables instead - look for the ff c4 markers:
0000-00d0: xx xx xx xx xx xx xx xx-ff c4 00 1f-00 00 01 05 !....... ........
0000-00e0: 01 01 01 01-01 01 00 00-00 00 00 00-00 00 01 02 ........ ........
0000-00f0: 03 04 05 06-07 08 09 0a-0b ff c4 00-b5 10 00 02 ........ ........
0000-0100: 01 03 03 02-04 03 05 05-04 04 00 00-01 7d 01 02 ........ .....}..
0000-0110: 03 00 04 11-05 12 21 31-41 06 13 51-61 07 22 71 ......!1 A..Qa."q
0000-0120: 14 32 81 91-a1 08 23 42-b1 c1 15 52-d1 f0 24 33 .2....#B ...R..$3
0000-0130: 62 72 82 09-0a 16 17 18-19 1a 25 26-27 28 29 2a br...... ..%&'()*
0000-0140: 34 35 36 37-38 39 3a 43-44 45 46 47-48 49 4a 53 456789:C DEFGHIJS
0000-0150: 54 55 56 57-58 59 5a 63-64 65 66 67-68 69 6a 73 TUVWXYZc defghijs
0000-0160: 74 75 76 77-78 79 7a 83-84 85 86 87-88 89 8a 92 tuvwxyz. ........
0000-0170: 93 94 95 96-97 98 99 9a-a2 a3 a4 a5-a6 a7 a8 a9 ........ ........
0000-0180: aa b2 b3 b4-b5 b6 b7 b8-b9 ba c2 c3-c4 c5 c6 c7 ........ ........
0000-0190: c8 c9 ca d2-d3 d4 d5 d6-d7 d8 d9 da-e1 e2 e3 e4 ........ ........
0000-01a0: e5 e6 e7 e8-e9 ea f1 f2-f3 f4 f5 f6-f7 f8 f9 fa ........ ........
0000-01b0: ff c4 00 1f-01 00 03 01-01 01 01 01-01 01 01 01 ........ ........
0000-01c0: 00 00 00 00-00 00 01 02-03 04 05 06-07 08 09 0a ........ ........
0000-01d0: 0b ff c4 00-b5 11 00 02-01 02 04 04-03 04 07 05 ........ ........
0000-01e0: 04 04 00 01-02 77 00 01-02 03 11 04-05 21 31 06 .....w.. .....!1.
0000-01f0: 12 41 51 07-61 71 13 22-32 81 08 14-42 91 a1 b1 .AQ.aq." 2...B...
0000-0200: c1 09 23 33-52 f0 15 62-72 d1 0a 16-24 34 e1 25 ..#3R..b r...$4.%
0000-0210: f1 17 18 19-1a 26 27 28-29 2a 35 36-37 38 39 3a .....&'( )*56789:
0000-0220: 43 44 45 46-47 48 49 4a-53 54 55 56-57 58 59 5a CDEFGHIJ STUVWXYZ
0000-0230: 63 64 65 66-67 68 69 6a-73 74 75 76-77 78 79 7a cdefghij stuvwxyz
0000-0240: 82 83 84 85-86 87 88 89-8a 92 93 94-95 96 97 98 ........ ........
0000-0250: 99 9a a2 a3-a4 a5 a6 a7-a8 a9 aa b2-b3 b4 b5 b6 ........ ........
0000-0260: b7 b8 b9 ba-c2 c3 c4 c5-c6 c7 c8 c9-ca d2 d3 d4 ........ ........
0000-0270: d5 d6 d7 d8-d9 da e2 e3-e4 e5 e6 e7-e8 e9 ea f2 ........ ........
0000-0280: f3 f4 f5 f6-f7 f8 f9 fa-xx xx xx xx xx xx xx xx ........ .....(..
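The steps in the summary above can be sketched in a few lines. This assumes standard_tables holds the bytes of the four general-purpose Huffman segments (for instance copied out of a healthy JPEG), and it shares the manual method's weakness: the first FF C4 found is taken to be a real marker rather than data inside an earlier segment. The function name is hypothetical:

```python
def replace_huffman_tables(data: bytes, standard_tables: bytes) -> bytes:
    """Swap the run of FF C4 segments in data for standard_tables."""
    start = data.find(b"\xff\xc4")
    if start < 0:
        raise ValueError("no FF C4 marker found")
    pos = start + 2
    while True:
        # Look for the next FF byte; skip over it if it starts another FF C4.
        end = data.find(b"\xff", pos)
        if end < 0:
            raise ValueError("no marker found after the Huffman tables")
        if data[end + 1:end + 2] != b"\xc4":
            break
        pos = end + 2
    # Keep everything before the first FF C4 and from the next marker onward.
    return data[:start] + standard_tables + data[end:]
```

Searching for the next FF rather than trusting the segment length is what makes this work on corrupt.jpg, where the declared length is itself wrong.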

How do you decode an Ethernet Frame without things like Wireshark?

For example: How would one decode the following ethernet frame?
00 26 b9 e8 7e f1 00 12 f2 21 da 00 08 00 45 00 05 dc e3 cd 20 10 35 06 25 eb 0a 0a 0a 02 c0 a8 01 03 c3 9e 0f 40 00 00 10 00 00 00 14 00 70 10 00 5c 59 99 00 00 02 04 05 b4 01 03 03 06 00 00 01 98 64 34 e8 90 84 98 20 12 18 19 04 85 80 00
I know that the first 6 bytes are the MAC destination address : 00 26 b9 e8 7e f1 The next 6 bytes are the source MAC address : 00 12 f2 21 da 00 The next 2 bytes show the ethernet type : 08 00 The next 4 bytes are : 45 00... Ipv4... "5" the number of bytes in the header.. and "00" means there are no differentiated services.
What I don't know is what anything after that is or how to read it.
Anyone help?
Rearranging your packet a bit, we have:
00 26 b9 e8 7e f1 00 12 f2 21 da 00 08 00 45 00
05 dc e3 cd 20 10 35 06 25 eb 0a 0a 0a 02 c0 a8
01 03 c3 9e 0f 40 00 00 10 00 00 00 14 00 70 10
00 5c 59 99 00 00 02 04 05 b4 01 03 03 06 00 00
01 98 64 34 e8 90 84 98 20 12 18 19 04 85 80 00
If you know that the first 6 octets form the destination MAC address, that means it is an Ethernet layer 2 frame.
According to IEEE 802.3, §3.1.1:
First 6 octets are the destination MAC address (00 26 b9 e8 7e f1)
Next 6 octets are the source MAC address (00 12 f2 21 da 00)
Next 4 octets are, optionally, the 802.1Q tag. It is present only when its first 2 octets are 81 00; here they are 08 00, so the tag is absent.
Next 2 octets are either:
The payload length (if <= 1500)
The EtherType of an Ethernet II frame (if >= 1536, which is the case here: 08 00 is 0x0800 = 2048, the EtherType for IPv4)
Next is the payload, ranging from 46 octets (42 if the 802.1Q tag is present) up to 1500 octets. Here it starts at 45 00 05 dc ..., which is the IPv4 header: version 4, header length 5 x 4 = 20 octets, and total length 05 dc = 1500.
Last 4 octets of the frame form the CRC32 code. The dump shows only 80 octets while the IPv4 total length is 1500, so the frame is truncated here and its CRC is not among the bytes shown.
Padding octets are added only when the payload is shorter than the minimum, so none is needed for this frame.
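As a cross-check, the same fields can be pulled out with Python's struct module. This is a minimal sketch that reads the two octets after the source MAC as the EtherType (0x0800 is IPv4; an 802.1Q tag would start with 0x8100 instead) and then decodes the fixed 20-octet IPv4 header. The names parse_frame and mac are hypothetical:

```python
import socket
import struct

# First 34 octets of the frame from the question: Ethernet + IPv4 headers.
FRAME_HEX = (
    "00 26 b9 e8 7e f1 00 12 f2 21 da 00 08 00 "
    "45 00 05 dc e3 cd 20 10 35 06 25 eb 0a 0a 0a 02 c0 a8 01 03"
)


def mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def parse_frame(frame: bytes) -> dict:
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    offset = 14
    if ethertype == 0x8100:  # 802.1Q tag: the real EtherType follows it
        (ethertype,) = struct.unpack("!H", frame[16:18])
        offset = 18
    info = {"dst_mac": mac(dst), "src_mac": mac(src), "ethertype": ethertype}
    if ethertype == 0x0800:  # IPv4
        (ver_ihl, _tos, total_length, _ident, flags_frag, ttl, protocol,
         _checksum, src_ip, dst_ip) = struct.unpack(
            "!BBHHHBBH4s4s", frame[offset:offset + 20])
        info.update(
            version=ver_ihl >> 4,
            header_length=(ver_ihl & 0x0F) * 4,          # IHL is in 32-bit words
            total_length=total_length,
            more_fragments=bool(flags_frag & 0x2000),
            fragment_offset=(flags_frag & 0x1FFF) * 8,   # stored in 8-octet units
            ttl=ttl,
            protocol=protocol,                           # 6 = TCP, 17 = UDP
            src_ip=socket.inet_ntoa(src_ip),
            dst_ip=socket.inet_ntoa(dst_ip),
        )
    return info


info = parse_frame(bytes.fromhex(FRAME_HEX))
print(info["src_ip"], "->", info["dst_ip"])  # 10.10.10.2 -> 192.168.1.3
```

The flags/offset field 20 10 decodes to "more fragments" with an offset of 128 octets, so this looks like a middle fragment of a larger datagram, and the octets after the IPv4 header would then not begin with a TCP header.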

How to decode the TCP buffer data

I am trying to write a tcp server to get the data from Heacent 908 GPS tracker. After establishing the connection from the tracker I am getting the following buffer output.
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a>
I am not sure how to decode this data into proper readable format.
Note: Of course I have tried to reach the manufacturer, but they are not responding at all.
What type of possible encoding formats are there for TCP protocol?
The next day I got data like this:
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a>
<Buffer 78 78 1f 12 0e 02 14 13 01 14 c8 03 5f a6 50 07 f7 f8 c1 32 35 39 01 9a 04 0f a2 00 b0 5a 00 1a 9b 7a 0d 0a>
<Buffer 78 78 1f 12 0e 02 14 13 01 1e c8 03 5f ad bc 07 f7 f0 76 41 35 40 01 9a 04 0f a2 00 b0 5a 00 1b b6 31 0d 0a>
Something is changing, but I'm not sure what it is...
You ask what possible encoding formats there are for TCP. That's a bit of an odd question: there are an unbounded number of encoding formats using TCP as the underlying protocol. But no matter, we can try to figure out this one!
You've posted some sample messages. Let's see if we can translate them:
byte 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
rev 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
----------------------------------------------------------
hex 78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a
text x x \r -- -- -- -- 1 -- H B -- -- d -- \r \n
dec 13 1 3 17 0 6 100 13 10
be32 [218170247] [288432262] [ 419006]
----------------------------------------------------------
hex 78 78 0d 01 03 87 11 31 20 86 48 42 00 07 75 37 0d 0a
text -- u 7
dec 7 117 55
be32 [ 488759]
----------------------------------------------------------
hex 78 78 0d 01 03 87 11 31 20 86 48 42 00 08 8d c0 0d 0a
text -- -- --
dec 8 141
be32 [ 560576]
----------------------------------------------------------
byte 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
hex 78 78 1f 12 0e 02 14 13 01 14 c8 03 5f a6 50 07 f7 f8 c1 32 35 39 01 9a 04 0f a2 00 b0 5a 00 1a 9b 7a 0d 0a
text -- -- -- -- -- -- -- -- -- -- _ -- P -- -- -- -- 2 5 9 -- -- -- -- -- -- -- -- -- xx -- z \r \n
----------------------------------------------------------
hex 78 78 1f 12 0e 02 14 13 01 1e c8 03 5f ad bc 07 f7 f0 76 41 35 40 01 9a 04 0f a2 00 b0 5a 00 1b b6 31 0d 0a
text -- -- -- A 5 # -- xx -- 1
Some potentially interesting facts:
Starts with "xx\r\01" which more or less seems like a possible header. But later messages start with "xx" and something else. Anyway, given that NMEA has a prefix of "GP" I wouldn't be shocked if these devices used "xx" for "something that's not NMEA."
Has "HB" in the middle, which could mean "heartbeat" since this is repeating, perhaps waiting for a reply from the server.
Ends with "\r\n" which is a common line ending (on Windows in particular), though the rest doesn't appear to be entirely textual.
The earlier messages are 18 bytes long and the later ones 36 bytes. A guess would be the short ones are status updates or heartbeats and the long ones are actual location information. 36 bytes is enough if we figure:
4 byte latitude: 24 bits if you pinch, 25-32 bits more likely
4 byte longitude: same as latitude
6 byte timestamp: 39 bits if using epoch time with centiseconds, 32/48/64 bits more likely
2 byte altitude: I suspect this device doesn't publish altitude at all, given some of the docs
So I think what is going on is that these messages you see are just the device "pinging" the server and waiting for a response. What sort of response? Well, you could try to brute force it, but far, far easier would be to set up a bridge in your program that takes whatever it receives from the device, sends it to the manufacturer's server, and does the same thing in reverse for the responses to the device. This way you will quickly be able to gather a corpus of valid messages which will be very helpful if we really do need to reverse engineer this thing. Or if you're lucky it will turn out to use some standard protocol like NMEA after negotiating the initial session.
Edit: now that you've given us more messages from the device, we can see that it does seem to send something else with variable content. Maybe that's the location data, but I don't have time to try to reverse engineer it right now. One idea is to physically move the unit from west to east or north to south and capture the messages it sends during that time, to try to isolate which parts of the messages are the longitude and which are the latitude (and perhaps timestamp too).
I think it's fairly clear that the first two bytes are "xx" as a header, and the last two are "\r\n" as a terminator. That leaves 32 bytes of payload in the longer messages, all of which appears to be binary data.
It's the GT06 protocol, and you can find its specs here:
http://www.traccar.org/devices/
http://www.traccar.org/docs/protocol.jsp
https://dl.dropboxusercontent.com/s/sqtkulcj51zkria/GT06_GPS_Tracker_Communication_Protocol_v1.8.1.pdf
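To illustrate, here is a minimal sketch that splits one of these frames into fields. The field names follow my reading of the GT06 document and should be treated as assumptions. The only thing checked against the samples above is that the third byte equals the frame size minus 5 (0x0d = 18 - 5 and 0x1f = 36 - 5). It is written in Python, but the same slicing works on a Node Buffer, and split_frame is a made-up name:

```python
import struct


def split_frame(frame: bytes) -> dict:
    """Split a 78 78 ... 0d 0a frame into its assumed GT06 fields."""
    if frame[:2] != b"\x78\x78" or frame[-2:] != b"\r\n":
        raise ValueError("unexpected start or stop bytes")
    length = frame[2]
    # Assumed rule: length covers protocol number through checksum.
    if length != len(frame) - 5:
        raise ValueError("length byte does not match the frame size")
    serial, checksum = struct.unpack("!HH", frame[-6:-2])
    return {
        "protocol": frame[3],    # assumed: 0x01 = login, 0x12 = location
        "payload": frame[4:-6],  # content between protocol number and serial
        "serial": serial,        # counts up between messages (6, 7, 8 above)
        "checksum": checksum,    # not verified in this sketch
    }


login = split_frame(bytes.fromhex("78 78 0d 01 03 87 11 31 20 86 48 42 00 06 64 be 0d 0a"))
print(login["payload"].hex())  # 0387113120864842
```

If the 0x01 message is the login packet, that payload would be the device ID (IMEI) in packed decimal digits. A server is then expected to acknowledge it before the tracker moves on to location packets, but check the linked spec for the exact response format.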
You can do it this way:
client.on('data', (buffer) => {
const decodedData = buffer.toString('utf8')
console.log(decodedData)
})
