I am trying to write a parser to extract information from the following FLAC file:
$ hd audio.flac | head -n 6
00000000 66 4c 61 43 00 00 00 22 12 00 12 00 00 00 00 00 |fLaC..."........|
00000010 00 00 0a c4 42 f0 00 78 9f 30 00 00 00 00 00 00 |....B..x.0......|
00000020 00 00 00 00 00 00 00 00 00 00 84 00 02 64 1f 00 |.............d..|
00000030 00 00 47 53 74 72 65 61 6d 65 72 20 65 6e 63 6f |..GStreamer enco|
00000040 64 65 64 20 76 6f 72 62 69 73 63 6f 6d 6d 65 6e |ded vorbiscommen|
00000050 74 10 00 00 00 12 00 00 00 54 49 54 4c 45 3d 52 |t........TITLE=R|
Now, according to the specification, the format should be as follows (numbers are in bits):
<32> "fLaC", the FLAC stream marker in ASCII
<16> The minimum block size (in samples) used in the stream.
<16> The maximum block size (in samples) used in the stream.
<24> The minimum frame size (in bytes) used in the stream.
<24> The maximum frame size (in bytes) used in the stream.
<20> Sample rate in Hz.
<3> (number of channels)-1. FLAC supports from 1 to 8 channels
<5> (bits per sample)-1. FLAC supports from 4 to 32 bits per sample.
<36> Total samples in stream.
<128> MD5 signature of the unencoded audio data.
So I started writing my parser and, while testing, got very strange results. So I tested with a "real" metadata extractor:
$ metaflac --list audio.flac
METADATA block #0
type: 0 (STREAMINFO)
is last: false
length: 34
minimum blocksize: 4608 samples
maximum blocksize: 4608 samples
minimum framesize: 0 bytes
maximum framesize: 0 bytes
sample_rate: 44100 Hz
channels: 2
bits-per-sample: 16
total samples: 7905072
MD5 signature: 00000000000000000000000000000000
From the numbers, I can deduce the following:
66 4c 61 43 00 00 00 22 12 00 12 00 00 00 00 00
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~ ~~~~~ ~~~~~~~~ ~~
^ ^ ^ ^ ^ ^
| | | | | |
| | | | | + Etc.
| | | | + Minimum frame size
| | | + Maximum block size
| | + Minimum block size
| + What is that ?!?
+ FLAC stream marker
Where do those 32 bits come from? I see they represent the length of the header, but isn't it against the standard to put it there (taking into account that we already know the length: (32+16+16+24+20+3+5+36+128)/8)?
The 0x22 (34) is indeed the block size in bytes. It is part of the METADATA_BLOCK_HEADER, which follows the fLaC marker in the stream. Of the first 8 bits (00), the most significant bit is the last-metadata-block flag (0 here, so more metadata blocks follow) and the remaining 7 bits give the block type (0 = STREAMINFO). The following 3 bytes (00 00 22) are the length of the contents of the block:
16 + 16 + 24 + 24 + 20 + 3 + 5 + 36 + 128 = 272 bits
272 bits / 8 = 34 (0x22) bytes.
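The layout described above can be sketched in Python. This is only a minimal sketch, not a full FLAC parser: the function name parse_flac_start is made up for illustration, and it assumes STREAMINFO is the first metadata block, as in the dump from the question.

```python
import struct

def parse_flac_start(buf: bytes) -> dict:
    """Parse the fLaC marker, the first METADATA_BLOCK_HEADER and STREAMINFO."""
    if buf[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    # METADATA_BLOCK_HEADER: 1 flag bit, 7 type bits, 24-bit big-endian length.
    is_last = bool(buf[4] & 0x80)
    block_type = buf[4] & 0x7F
    length = int.from_bytes(buf[5:8], "big")
    if block_type != 0 or length != 34:
        raise ValueError("expected a 34-byte STREAMINFO block")
    body = buf[8:8 + length]
    min_block, max_block = struct.unpack(">HH", body[0:4])
    # 20 + 3 + 5 + 36 bits are packed into 8 bytes after the two frame sizes.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "is_last": is_last,
        "min_block": min_block,
        "max_block": max_block,
        "min_frame": int.from_bytes(body[4:7], "big"),
        "max_frame": int.from_bytes(body[7:10], "big"),
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x07) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
        "md5": body[18:34].hex(),
    }

# First 42 bytes of audio.flac, copied from the hex dump in the question.
dump = bytes.fromhex(
    "66 4c 61 43 00 00 00 22 12 00 12 00 00 00 00 00 "
    "00 00 0a c4 42 f0 00 78 9f 30" + " 00" * 16
)
info = parse_flac_start(dump)
print(info)
```

The values it extracts can be compared field by field with the metaflac listing in the question.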
Using tcpdump, I'm trying to sniff some packets. The result is this:
reading from file /tmp/prueba.pcap, link-type LINUX_SLL (Linux cooked v1)
13:35:51.767194 IP6 fdc1:41d:9c3:dbef:a6e9:69f0:59aa:b70a.47193 > fdc1:41d:9c3:dbef:0:ff:fe00:8c00.47193: UDP, length 63
0x0000: 6000 0000 0047 1140 fdc1 041d 09c3 dbef `....G.@........
0x0010: a6e9 69f0 59aa b70a fdc1 041d 09c3 dbef ..i.Y...........
0x0020: 0000 00ff fe00 8c00 b859 b859 0047 d42e .........Y.Y.G..
0x0030: 3f0c 0000 0dc2 50f1 0d7b 2254 696d 6522 ?.....P..{"Time"
0x0040: 3a5b 3136 3632 3033 3933 3531 2c22 225d :[1662039351,""]
0x0050: 2c22 4d6f 6417 0012 320f 00f0 0352 6f6c ,"Mod...2....Rol
0x0060: 6c22 3a5b 3533 302c 2264 c2ba 225d 7d l":[530,"d.."]}
The point is that in the line at address 0x0050 we can read "Mod...2". That "Mod" should be "Mode", but I don't understand why the whole word is not there. Where is the "e"? I need to read that message exactly in order to automate a program that reads values from it.
I have ruled out a one-off transmission problem, because every time I sniff a packet that contains this information the format is exactly the same.
Regards,
There are indications that the packet is not correct in other ways than a missing e. For example, the ether type is 0x09c3 and not 0x86dd (IPv6).
Maybe this code to create a PCAP file can help. Using the raw packet you provided as input, it writes bad.pcap, which you can open in a tool like Wireshark to examine the packet in more detail.
from scapy.all import wrpcap, Ether, IPv6, UDP, Raw
# Raw bytes exactly as dumped by tcpdump, written without any link-layer header
data = (
    '60 00 00 00 00 47 11 40 fd c1 04 1d 09 c3 db ef '
    'a6 e9 69 f0 59 aa b7 0a fd c1 04 1d 09 c3 db ef '
    '00 00 00 ff fe 00 8c 00 b8 59 b8 59 00 47 d4 2e '
    '3f 0c 00 00 0d c2 50 f1 0d 7b 22 54 69 6d 65 22 '
    '3a 5b 31 36 36 32 30 33 39 33 35 31 2c 22 22 5d '
    '2c 22 4d 6f 64 17 00 12 32 0f 00 f0 03 52 6f 6c '
    '6c 22 3a 5b 35 33 30 2c 22 64 c2 ba 22 5d 7d' )
data_s = bytes.fromhex(data)
packet = Raw(load=data_s)
wrpcap('bad.pcap', [packet])
# Same UDP payload, this time wrapped in explicit Ethernet, IPv6 and UDP layers
data = (
    '3f 0c 00 00 0d c2 50 f1 0d 7b 22 54 69 6d 65 22 '
    '3a 5b 31 36 36 32 30 33 39 33 35 31 2c 22 22 5d '
    '2c 22 4d 6f 64 17 00 12 32 0f 00 f0 03 52 6f 6c '
    '6c 22 3a 5b 35 33 30 2c 22 64 c2 ba 22 5d 7d' )
data_s = bytes.fromhex(data)
packet = (
    Ether(dst="60:00:00:00:00:47", src="11:40:fd:c1:04:1d")
    / IPv6(dst="fdc1:41d:9c3:dbef:0:ff:fe00:8c00", src="fdc1:41d:9c3:dbef:a6e9:69f0:59aa:b70a")
    / UDP(sport=47193, dport=47193, len=0x0047)
    / Raw(load=data_s)
)
wrpcap('better.pcap', [packet])
The hex dump begins with the IPv6 header; the link-layer header is not being dumped, so the Ethertype that would appear in the LINKTYPE_SLL_LINUX link-layer header isn't shown.
So the header is:
6000 0000: version (6), traffic class (0), flow label (0)
0047: payload length (0x47 = 71)
1140: next header (0x11 = 17 = UDP), hop limit (0x40 = 64)
fdc1 041d 09c3 dbef a6e9 69f0 59aa b70a: source IPv6 address (fdc1:41d:9c3:dbef:a6e9:69f0:59aa:b70a)
fdc1 041d 09c3 dbef 0000 00ff fe00 8c00: destination IPv6 address (fdc1:41d:9c3:dbef:0:ff:fe00:8c00)
The next header field is 17, so what follows that is a UDP header:
b859: source port (0xb859 = 47193)
b859: destination port (0xb859 = 47193)
0047: length (0x47 = 71)
d42e: checksum
Those 71 bytes are the UDP header (8 bytes) plus the UDP payload (71 - 8 = 63 bytes).
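The header walk-through above can be reproduced with a few lines of Python. This is a sketch under the same assumption as the answer, namely that the dump starts directly at the IPv6 header; the function name parse_ipv6_udp is made up for illustration.

```python
import ipaddress
import struct

def parse_ipv6_udp(pkt: bytes) -> dict:
    """Decode the 40-byte IPv6 fixed header and the 8-byte UDP header after it."""
    # 4 bytes version/traffic class/flow label, 2 bytes payload length,
    # 1 byte next header, 1 byte hop limit, then two 16-byte addresses.
    first_word, payload_len, next_header, hop_limit = struct.unpack(">IHBB", pkt[:8])
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    if next_header != 17:
        raise ValueError("next header is not UDP")
    sport, dport, udp_len, checksum = struct.unpack(">HHHH", pkt[40:48])
    return {
        "payload_len": payload_len,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(pkt[8:24])),
        "dst": str(ipaddress.IPv6Address(pkt[24:40])),
        "sport": sport,
        "dport": dport,
        "udp_len": udp_len,
        "udp_payload_len": udp_len - 8,  # the UDP length includes its own header
        "checksum": checksum,
    }

# First 48 bytes of the tcpdump output in the question.
header = bytes.fromhex(
    "6000 0000 0047 1140 fdc1 041d 09c3 dbef "
    "a6e9 69f0 59aa b70a fdc1 041d 09c3 dbef "
    "0000 00ff fe00 8c00 b859 b859 0047 d42e"
)
fields = parse_ipv6_udp(header)
print(fields)
```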
Port 47193 is in the "registered" port range; however, it does not appear in the current list of well-known and registered ports.
It does, however, appear, from some web searches, to be the default gateway port for MQTT-SN. MQTT "is a lightweight, publish-subscribe, machine to machine network protocol", and MQTT-SN, according to that page, "is a variation of the main protocol aimed at battery-powered embedded devices on non-TCP/IP networks".
If this is MQTT-SN, then, according to the protocol specification for MQTT-SN, the payload would be:
3f0c: length (0x3f = 63), message type (0c = PUBLISH)
00: flags (0x00)
000d: TopicId
c250: MsgId
f1 0d7b 2254 696d 6522 3a5b 3136 3632 3033 3933 3531 2c22 225d 2c22 4d6f 6417 0012 320f 00f0 0352 6f6c 6c22 3a5b 3533 302c 2264 c2ba 225d 7d: Data
So that data is:
0xf1 0x0d {"Time":[1662039351,""],"Mod 0x17 0x00 0x12 2 0x0f 0x00 0xf0 0x03 Roll":[530,"d 0xc2 0xba "]}
If the published data is intended to be ASCII text, it appears to have been damaged; if this is from a wireless low-power network, perhaps there was radio interference.
The answer is easy... The content of the pcap packet was compressed with lz4...
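To see what that means for the bytes in the question, here is a small pure-Python decoder for a raw LZ4 block. It is a sketch resting on two assumptions that the thread does not confirm: that the first 7 bytes of the UDP payload are the MQTT-SN PUBLISH header described in the other answer, and that the rest is a single raw LZ4 block with no frame header. The function name lz4_block_decompress is made up for illustration.

```python
def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header, no checksum)."""
    out = bytearray()
    pos = 0
    while pos < len(src):
        token = src[pos]
        pos += 1
        # High nibble: literal count; 15 means "add the following bytes".
        literal_len = token >> 4
        if literal_len == 15:
            while True:
                extra = src[pos]
                pos += 1
                literal_len += extra
                if extra != 255:
                    break
        out += src[pos:pos + literal_len]
        pos += literal_len
        if pos >= len(src):
            break  # the last sequence carries literals only
        # Two-byte little-endian offset back into the output produced so far.
        offset = src[pos] | (src[pos + 1] << 8)
        pos += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")
        # Low nibble: match length minus 4, extended the same way as literals.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[pos]
                pos += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4
        # Copy byte by byte, because a match may overlap its own output.
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)

# UDP payload from the question (63 bytes).
udp_payload = bytes.fromhex(
    "3f 0c 00 00 0d c2 50 f1 0d 7b 22 54 69 6d 65 22 "
    "3a 5b 31 36 36 32 30 33 39 33 35 31 2c 22 22 5d "
    "2c 22 4d 6f 64 17 00 12 32 0f 00 f0 03 52 6f 6c "
    "6c 22 3a 5b 35 33 30 2c 22 64 c2 ba 22 5d 7d"
)
MQTT_SN_HEADER_LEN = 7  # assumed: length, type, flags, TopicId (2), MsgId (2)
text = lz4_block_decompress(udp_payload[MQTT_SN_HEADER_LEN:]).decode("utf-8")
print(text)
```

Under those assumptions the "missing" e is not missing at all: the bytes 17 00 after "Mod" are a back-reference that copies e":[1 from the earlier "Time":[1 part of the message.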
Suppose I create a simple PNG with:
convert -size 1x1 canvas:red red.png
Here is a similar image (bigger size) for reference:
Then I run the identify command on it. It tells me the ColorSpace of the image is sRGB, but there seems to be NO indication of this inside the file. In fact, running
$ hexdump -C red.png
00000000 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 |.PNG........IHDR|
00000010 00 00 00 01 00 00 00 01 01 03 00 00 00 25 db 56 |.............%.V|
00000020 ca 00 00 00 04 67 41 4d 41 00 00 b1 8f 0b fc 61 |.....gAMA......a|
00000030 05 00 00 00 20 63 48 52 4d 00 00 7a 26 00 00 80 |.... cHRM..z&...|
00000040 84 00 00 fa 00 00 00 80 e8 00 00 75 30 00 00 ea |...........u0...|
00000050 60 00 00 3a 98 00 00 17 70 9c ba 51 3c 00 00 00 |`..:....p..Q<...|
00000060 06 50 4c 54 45 ff 00 00 ff ff ff 41 1d 34 11 00 |.PLTE......A.4..|
00000070 00 00 01 62 4b 47 44 01 ff 02 2d de 00 00 00 07 |...bKGD...-.....|
00000080 74 49 4d 45 07 e5 01 0d 17 04 37 80 ef 04 02 00 |tIME......7.....|
00000090 00 00 0a 49 44 41 54 08 d7 63 60 00 00 00 02 00 |...IDAT..c`.....|
000000a0 01 e2 21 bc 33 00 00 00 25 74 45 58 74 64 61 74 |..!.3...%tEXtdat|
000000b0 65 3a 63 72 65 61 74 65 00 32 30 32 31 2d 30 31 |e:create.2021-01|
000000c0 2d 31 33 54 32 33 3a 30 34 3a 35 35 2b 30 30 3a |-13T23:04:55+00:|
000000d0 30 30 2d af d4 01 00 00 00 25 74 45 58 74 64 61 |00-......%tEXtda|
000000e0 74 65 3a 6d 6f 64 69 66 79 00 32 30 32 31 2d 30 |te:modify.2021-0|
000000f0 31 2d 31 33 54 32 33 3a 30 34 3a 35 35 2b 30 30 |1-13T23:04:55+00|
00000100 3a 30 30 5c f2 6c bd 00 00 00 00 49 45 4e 44 ae |:00\.l.....IEND.|
00000110 42 60 82 |B`.|
00000113
does not provide a clue, as far as I know.
I understand that identifying the ColorSpace of an image that does not contain that information is a very hard problem -- see one proposed solution looking at the histogram of colors here.
So how does identify, from the ImageMagick suite, determine the ColorSpace of this image?
It is common, but not standardized, to assume that an image without an embedded or sidecar ICC profile, or without an explicit encoding description, is encoded according to IEC 61966-2-1:1999, i.e. the sRGB specification.
This is just a bug in ImageMagick. You can use exiftool to check whether an sRGB chunk (which carries the rendering intent) is present. In this case, it is not.
Gamma 2.2 is not sRGB, so ImageMagick is wrong here. This is a common problem on Wikipedia: all SVG images converted to PNG have this, and it destroys the colours. See: https://phabricator.wikimedia.org/T26768
We will have to re-encode all images on Wikipedia, since we use ImageMagick. Sigh.
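The same check can be done without exiftool by walking the PNG chunk list. The sketch below uses hypothetical helper names (png_chunk_types, make_chunk) and runs on a tiny synthetic chunk stream rather than on red.png itself; it does not verify CRCs. Applied to the hex dump in the question, the chunk sequence reads IHDR, gAMA, cHRM, PLTE, bKGD, tIME, IDAT, tEXt, tEXt, IEND, so there is neither an sRGB nor an iCCP chunk, and the gAMA value 0x0000b18f (45455, i.e. 0.45455) corresponds to a gamma of about 1/2.2.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_chunk_types(data: bytes) -> list:
    """Return the chunk type names of a PNG stream, in file order."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    types = []
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        types.append(ctype.decode("ascii"))
        pos += 12 + length
        if ctype == b"IEND":
            break
    return types

def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one chunk for the synthetic example below."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)

# Synthetic stream with the same kind of colour metadata as red.png:
# a gAMA chunk but no sRGB or iCCP chunk (it has no image data).
sample = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 1, 3, 0, 0, 0))
    + make_chunk(b"gAMA", struct.pack(">I", 45455))
    + make_chunk(b"IEND", b"")
)
types = png_chunk_types(sample)
print(types, "sRGB" in types, "iCCP" in types)
```

To inspect a real file, pass open("red.png", "rb").read() instead of the synthetic sample.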
I am currently looking at online examples, and here are the contents of a WAV file in bytes:
52 49 46 46 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00
22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 00 08 00 00 00 00 00 00
24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d
and here is the visual representation:
So according to Subchunk2Size there are 2048 bytes in the data. The formula to calculate the number of samples in a WAV is given as:
Subchunk2Size /(NumChannels * BitsPerSample/8 ) = NumSamples
If I plug in the numbers from the information given, I get NumSamples = 512. But in the diagram the sample rate is 22050. How can the total number of samples be less than a single second of samples?
For those wondering, here is a link to the source.
I suspect they are just using a bad example, where the duration of the WAV file is less than a second. Their formula makes sense, and we can use it to verify the data size of a one-second WAV file.
If our sample rate is 22050 samples/sec and our WAV file is one second long, then numSamples = 22050. Subchunk2Size is the number of bytes in the data and can be calculated as Subchunk2Size = numSamples * numChannels * bitsPerSample / 8. Assuming numChannels = 2 and bitsPerSample = 16, a one-second WAV file should hold 22050 * 2 * 16 / 8 = 88200 bytes of data. Since Subchunk2Size is only 2048 bytes in the website's example, the file lasts less than a second, and numSamples is therefore less than 22050.
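The arithmetic can be checked directly against the 44 header bytes quoted in the question. This is a minimal sketch that assumes the canonical header layout shown there (a 16-byte PCM fmt chunk immediately followed by the data chunk); the variable names are illustrative.

```python
import struct

# First 44 bytes of the example file from the question.
header = bytes.fromhex(
    "52 49 46 46 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00 "
    "22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 00 08 00 00"
)

# All multi-byte fields in a RIFF/WAVE header are little-endian.
(riff, chunk_size, wave, fmt_id, fmt_size, audio_format, num_channels,
 sample_rate, byte_rate, block_align, bits_per_sample,
 data_id, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", header)

# Samples per channel, as given by the formula in the question.
num_samples = data_size // (num_channels * bits_per_sample // 8)
duration = num_samples / sample_rate

print(num_channels, sample_rate, bits_per_sample, data_size)
print(num_samples, "samples per channel,", round(duration, 4), "seconds")
```

With these bytes the file holds 512 samples per channel at 22050 Hz, which is roughly 23 milliseconds of audio.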
I have been receiving an odd/unknown message while attempting to communicate with some BitTorrent peers. In this particular case I am in the middle of downloading pieces, and all of a sudden this odd new message pops up in front of a piece response. The message is odd because it doesn't appear to follow the protocol; all messages are supposed to look like this:
'<length prefix><message ID><payload>'
The length prefix is 4 bytes and the message ID is 1 byte, followed by the payload. I am including a capture to show what I mean: on line 509 of the capture you will see a request for a piece, and on line 510 you will see the beginning of the response.
The first 4 bytes of the response are 00 00 00 00, ie 0 length message (Which is causing me issues); the next 4 bytes are the actual length of the message, which is 30. The actual response to the piece request starts on line 513, so I get the piece I was requesting, but this odd new message is messing me up. I'm certain I can find a workaround, but I would really like to understand what this means.
Also, I have no idea what the actual message means, and cannot find any information about it anywhere.
Here is the Wireshark capture.
https://1drv.ms/u/s!Agj06pa-wu0tnFqsYn_KnHmVz3x2
Data from packet 510:
0000 00 00 00 00 00 00 00 1e 14 01 64 35 3a 61 64 64 ..........d5:add
0010 65 64 36 3a 63 f2 7a 48 17 f4 37 3a 64 72 6f 70 ed6:c.zH..7:drop
0020 70 65 64 30 3a 65 ped0:e
00 00 00 00 4 bytes keep-alive message
00 00 00 1e message length 30 bytes
14 message type extended message (BEP10)
01 extended message ID = 1 as specified by the previous extension handshake: ut_pex
64 35 3a 61 64 64 65 64 36 3a 63 f2 7a 48 17 f4 37 3a 64 72 6f 70 70 65 64 30 3a 65
d5:added6:c.zH..7:dropped0:e
ut_pex message data (bencoded)
d
5:added
6:c.zH..
7:dropped
0:
e
ut_pex message data (bencoded with added white space)
The first 4 bytes of the response are 00 00 00 00, ie 0 length message (Which is causing me issues)
The bittorrent spec says
Messages of length zero are keepalives, and ignored.
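In a parser, that rule means a zero length prefix is a complete message in its own right, with no ID and no payload. A minimal sketch of a message splitter that handles this, run on the bytes of packet 510 (the function name split_messages is made up for illustration):

```python
def split_messages(stream: bytes):
    """Yield (message_id, payload) pairs; keep-alives come out as (None, b"")."""
    pos = 0
    while pos + 4 <= len(stream):
        length = int.from_bytes(stream[pos:pos + 4], "big")
        pos += 4
        if length == 0:
            # Keep-alive: only the 4-byte length prefix, nothing else follows.
            yield (None, b"")
            continue
        if pos + length > len(stream):
            break  # incomplete message; a real client would wait for more bytes
        yield (stream[pos], stream[pos + 1:pos + length])
        pos += length

# Data from packet 510 in the question.
packet = bytes.fromhex(
    "00 00 00 00 00 00 00 1e 14 01 64 35 3a 61 64 64 "
    "65 64 36 3a 63 f2 7a 48 17 f4 37 3a 64 72 6f 70 "
    "70 65 64 30 3a 65"
)
messages = list(split_messages(packet))
for message_id, payload in messages:
    print(message_id, payload)
```

The second message has ID 20 (0x14, extended message); the first byte of its payload is the extended message ID 1, followed by the bencoded ut_pex dictionary.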
During a fresh installation, I accidentally formatted a disk containing data. I tried some tools (testdisk, foremost), but did not get good results (see my unsuccessful post on Super User).
So I decided to read some docs about the ext2 filesystem structure, and I got some results:
The deleted partition has a directory tree like this:
dev
|-scripts
|-projects
|-services
|-...
Medias
|-downloads
|-Musique
|-...
backup
...
So, based on the ext2 directory entry format:
Directory Entry
Starting_Byte Ending_Byte Size_in_Bytes Field_Description
0 3 4 Inode
4 5 2 Total size of this entry (Including all subfields)
6 6 1 Name Length least-significant 8 bits
7 7 1 Type indicator (only if the feature bit for "directory entries have file type byte" is set, else this is the most-significant 8 bits of the Name Length)
8 8+N-1 N Name characters
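As an illustration of the table above, a single entry can be decoded with struct. This is a sketch that assumes the file type byte is in use (as the script further down does); the function name parse_dir_entry and the sample entry are made up, and the inode number 12 is not taken from the disk in question.

```python
import struct

def parse_dir_entry(block: bytes, pos: int = 0):
    """Decode one ext2 directory entry starting at byte offset pos."""
    # ext2 on-disk fields are little-endian:
    # inode (4 bytes), entry size (2), name length (1), file type (1).
    inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", block, pos)
    name = block[pos + 8:pos + 8 + name_len]
    return inode, rec_len, file_type, name

# Made-up entry: inode 12, 16-byte record, type 2 (directory), name "scripts",
# padded with one zero byte so that the record is 16 bytes long.
entry = struct.pack("<IHBB", 12, 16, 7, 2) + b"scripts" + b"\x00"
print(parse_dir_entry(entry))
```

The entry size field is what lets a reader hop from one entry to the next inside a directory block.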
I tried to find some data matching this structure.
I used this script:
var fs = require('fs');
var bindexOf = require('buffer-indexof');
var dirs = ["dev","scripts","services","projects","Medias","downloads","Musique","backup"];
var currentOffset = 0;
var deviceReadStream = fs.createReadStream("/dev/sdb");
deviceReadStream.on('error', function(err){
    console.log(err);
});
deviceReadStream.on('data', function(data){
    dirs.forEach(function(dir){
        // Pattern to search for: name length (1 byte), file type (2 = directory), then the name
        var dirOctetFormat = Buffer.concat([Buffer.from([dir.length, 2]), Buffer.from(dir)]);
        var offset = bindexOf(data, dirOctetFormat);
        if (offset >= 0) {
            console.log(dir + " entry found at offset " + (currentOffset + offset));
        }
    });
    // Note: an entry that straddles two chunks is not detected
    currentOffset += data.length;
});
I found data which seems to be the directory entry of the dev directory:
===== Current offset: 233590226944 - 217.5478515625Gio ======
scripts entry found at offset 233590227030
services entry found at offset 233590227014
projects entry found at offset 233590228106
If that is the case, I have the inode numbers of its child directories: scripts, projects, services, ...
But I do not know what to do with them!
I tried to deduce the location of these inodes based on this guide,
but as I was unable to find a superblock of the deleted filesystem, I can only guess at the block size, the number of blocks, ...
and that seems too fuzzy to hope for a result.
So could you give plausible ranges for all the values needed to compute the offset of an inode, and a more formal formula for this offset?
If you have only erased (or modified) the partition table, you can still get your data back, as long as the space has not been reused for something else.
ext2 filesystems have a magic number in the superblock, so to recover your partition you only have to search for it. I did this on one machine and was able to recover not one, but seven partitions on one disk. You may hit some false positives, but just search for that magic. The magic number is defined in include/uapi/linux/magic.h as #define EXT2_SUPER_MAGIC 0xEF53 (it is found at offset #define EXT2_SB_MAGIC_OFFSET 0x38, from include/linux/ext2_fs.h).
To search for the superblock, just try to find 0xef53 at offset 0x38 in a sector of the disk; it marks the superblock near the start of the partition. Be careful: the superblock is replicated several times within one partition, so you will find all the copies of it.
Good luck! (I had some when it happened to me.)
Edit (To illustrate with an example)
Just see the magic number in one of my own partitions:
# hd /dev/sda3 | head -20
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000400 40 62 08 00 00 87 21 00 26 ad 01 00 f6 30 15 00 |@b....!.&....0..|
00000410 1d 31 08 00 00 00 00 00 02 00 00 00 02 00 00 00 |.1..............|
00000420 00 80 00 00 00 80 00 00 90 1f 00 00 cf 60 af 55 |.............`.U|
00000430 fc 8a af 55 2d 00 ff ff 53 ef 01 00 01 00 00 00 |...U-...S.......|<- HERE!!!
00000440 36 38 9d 55 00 00 00 00 00 00 00 00 01 00 00 00 |68.U............|
00000450 00 00 00 00 0b 00 00 00 00 01 00 00 3c 00 00 00 |............<...|
00000460 46 02 00 00 7b 00 00 00 5a bf 87 15 12 8f 44 3b |F...{...Z.....D;|
00000470 97 e7 f3 74 4d 75 69 12 72 6f 6f 74 00 00 00 00 |...tMui.root....|
00000480 00 00 00 00 00 00 00 00 2f 00 61 72 67 65 74 00 |......../.arget.|
00000490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000004c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 18 02 |................|
000004d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000004e0 08 00 00 00 00 00 00 00 00 00 00 00 93 54 99 ab |.............T..|
000004f0 aa 64 46 b3 a6 73 94 34 a3 79 46 28 01 01 00 00 |.dF..s.4.yF(....|
00000500 0c 00 00 00 00 00 00 00 e5 61 92 55 0a f3 02 00 |.........a.U....|
00000510 04 00 00 00 00 00 00 00 00 00 00 00 ff 7f 00 00 |................|
00000520 00 80 10 00 ff 7f 00 00 01 00 00 00 ff ff 10 00 |................|
Remember that the magic is at offset 0x38 counted from the start of the superblock. Assuming the superblock is the second block of the partition (block 0 is reserved for boot code, so the superblock is block 1, with two sectors per block to make a 1k block size), you have to rewind 0x438 bytes from the beginning of the magic number to get the partition origin.
I have run the command on my whole disk, getting the following result:
# hd /dev/sda | grep " [0-9a-f][0-9a-f] 53 ef" | sed -e 's/^/ /' | head
006f05f0 ee 00 00 11 66 0a 00 00 53 ef 00 00 11 66 2d 00 |....f...S....f-.|
007c21d0 55 2a aa 7d f4 aa 89 55 53 ef a4 91 70 40 c1 00 |U*.}...US...p@..|
20100430 fc 8a af 55 2d 00 ff ff 53 ef 01 00 01 00 00 00 |...U-...S.......|
2289a910 0f 8f 4f 03 00 00 81 fe 53 ef 00 00 0f 84 ce 04 |..O.....S.......|
230d4c70 0a 00 00 00 1c 00 00 00 53 ef 01 00 00 00 00 00 |........S.......|
231b7e50 a0 73 07 00 00 00 00 00 53 ef 0d 00 00 00 00 00 |.s......S.......|
23dbd230 d5 08 ad 2b ee 71 07 8a 53 ef c2 89 d4 bb 09 1f |...+.q..S.......|
25c0c9e0 06 00 00 00 00 4f 59 c0 53 ef 32 c0 0e 00 00 00 |.....OY.S.2.....|
25d72ca0 b0 b4 7b 3d a4 f7 84 3b 53 ef ba 3c 1f 32 b9 3c |..{=...;S..<.2.<|
25f0eab0 f1 fd 02 be 28 59 67 3c 53 ef 9c bd 04 30 72 bd |....(Yg<S....0r.|
Clearly, there are many more uninteresting lines in this listing than the ones we need. To locate the interesting one, we have to do some arithmetic. Sectors are 512 bytes long (0x200 in hex) and the superblock magic sits 0x438 bytes into the partition, so valid magic addresses have the form 0xXXXXXX[02468ace]38, which means hd line offsets of the form 0xXXXXXX[02468ace]30. Just select the lines whose offsets end that way, and you get the first valid superblock (in the third line) at offset 0x20100430.
Subtract 0x430 to get the byte offset of the partition (0x20100000), then divide the result by 0x200, giving sector 0x100800, or 1050624:
# fdisk -l /dev/sda | sed -e 's/^/ /'
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: DF97DAD4-727D-4BB3-BD7B-3C5A584A2747
Device Start End Sectors Size Type
/dev/sda1 2048 526335 524288 256M EFI System
/dev/sda2 526336 1050623 524288 256M BIOS boot
/dev/sda3 1050624 18628607 17577984 8.4G Linux filesystem <-- HERE!!!
/dev/sda4 18628608 77221887 58593280 28G Linux filesystem
/dev/sda5 77221888 85035007 7813120 3.7G Linux filesystem
/dev/sda6 85035008 104566783 19531776 9.3G Linux filesystem
/dev/sda7 104566784 135817215 31250432 14.9G Linux swap
/dev/sda8 135817216 155348991 19531776 9.3G Linux filesystem
/dev/sda9 155348992 1953523711 1798174720 857.4G Linux filesystem
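The search described above can also be scripted instead of piping hd through grep. The following Python sketch makes the same assumptions as the answer (512-byte sectors, superblock 0x400 bytes into the partition, magic 0x38 bytes into the superblock); the function name find_ext2_starts is made up, and the demo runs on an in-memory image rather than a real device. It does one seek per sector, so it is slow on a large disk, and like the grep it also reports false positives and, in general, backup copies of the superblock.

```python
import io

SECTOR = 0x200
MAGIC_OFFSET = 0x438       # 0x400 (superblock start) + 0x38 (magic field)
MAGIC = b"\x53\xef"        # 0xEF53 as stored on disk (little-endian)

def find_ext2_starts(dev, limit=None):
    """Yield sector-aligned byte offsets that have the ext2 magic 0x438 bytes later."""
    start = 0
    while limit is None or start < limit:
        dev.seek(start + MAGIC_OFFSET)
        got = dev.read(2)
        if len(got) < 2:
            break  # ran past the end of the device or image
        if got == MAGIC:
            yield start
        start += SECTOR

# Demo on a small fake image with one "partition" starting at byte 0x1000.
image = bytearray(0x4000)
image[0x1000 + MAGIC_OFFSET:0x1000 + MAGIC_OFFSET + 2] = MAGIC
candidates = list(find_ext2_starts(io.BytesIO(bytes(image))))
print([hex(c) for c in candidates])

# Same arithmetic as in the answer: hd line offset -> partition start sector.
start_sector = (0x20100430 - 0x430) // SECTOR
print(start_sector)
```

On a real disk the device would be opened with open("/dev/sdX", "rb") (placeholder name) and each candidate offset divided by 512 to get a start sector to try in the partition table.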