How to manually change duration field in matroska/mkv - mkv

I have one mkv file that doesn't have valid duration.
I want to change this duration parameter manually.
I gone through this matroska specification defined at
http://www.matroska.org/technical/specs/index.html
Looking at specification for matroska this contains only identification magic numbers, but this doesn't specify length for data.
How to parse this matroska header so that i get duration field and change this field?

The type of the Duration field is float. According to the documentation it can be either 4 or 8 octets.
To know which size it is, you have to look at the data size part of the field. The data size part uses an UTF-8 like system. It's explained here.

You can use ffprobe to get duration of .mkv file :
$ ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 input.mkv
5012.640000
See : https://trac.ffmpeg.org/wiki/FFprobeTips#Duration

Matroska uses EBML (https://en.wikipedia.org/wiki/Extensible_Binary_Meta_Language) as a base. EBML elements use variable sized integers, so it's not trivial to just change numbers. You can use FFmpeg to remux an MKV, which will rewrite the header with the correct duration.

I'm sorry for long introduction, but your case is described at the end of the note
EBML format in matroska:
https://github.com/ietf-wg-cellar/ebml-specification/blob/master/specification.markdown
matroska specification:
https://www.matroska.org/technical/elements.html
mkv file consist from blocks,
every block has a following structure - consists from these 3 sections -
ID, DATA_LENGTH, DATA
ID - field identification section
count zero bits till first 1, to estimate length of this section in bytes, the next part till end of this byte(s) section is matroska element ID:
example
1f 43 b6 75
0001 1111 -> first nibble determines a 4 bytes block for ID (the first 1 in binary form is on fourth position from left), and the ID is 0xF43b675 - and it means cluster, may in matroska specification look for the full id 0x1f43b675
44 89
0100 0100 -> first nibble estimates a 2 byte section for ID (the first 1 in binary form is on second position from left), and the ID is 0x489 - and it means Duration, in matroska specification match for 0x4489
DATA LENGTH - after the ID section follows size specification, also BML
count zero bits till first 1, to estimate length in bytes of this section, the next part till end of this (byte) section is data length in bytes
01 00 00 00 00 17 ff 0d
0000 0001 - for the first byte the section length is 8 bytes, related data length is 0x17ff0d
88
1000 1000 - the section length is 1 byte, related data length is 8 byte
complexe sample
00000000 4d 80 a5 47 53 74 72 65 M..GStre
00000008 61 6d 65 72 20 6d 61 74 amer mat
00000010 72 6f 73 6b 61 6d 75 78 roskamux
00000018 20 76 65 72 73 69 6f 6e version
00000020 20 31 2e 31 30 2e 34 00 1.10.4.
4d 80
0100 1100 - define 2 bytes long ID 0xD80 - it means a Muxing application or library
a5
1010 0101 - define a 1 byte length block, data area has a 0x25 bytes
DATA - data block
00000000 47 53 74 72 65 M..GStre
00000008 61 6d 65 72 20 6d 61 74 amer mat
00000010 72 6f 73 6b 61 6d 75 78 roskamux
00000018 20 76 65 72 73 69 6f 6e version
00000020 20 31 2e 31 30 2e 34 00 1.10.4.
Note:
strings are zero terminated, terminating zero is included in the size
numbers are in big endian, no swaping is necessary, so 1000000 you see as 0f 42 40
see example
Timestamp scale:
00000000 2a d7 b1 83 0f 42 40
2a
0010 1010 - three bytes for ID, ID is 0xAD7B1
83
1000 0011 - one byte for length, data length is 3
0f 42 40 - data as BE integer - 1000000
Duration:
00000000 44 89 88 41 22 85 dd 26 D..A"..&
00000008 35 c5 b5 5..
44
0100 0100 - two bytes for ID section (1 is in second position from left), ID is 0x489, in matroska specification look for 0x4489
88
1000 1000 - one byte for length section (1 in the first position from left), data length is 8 bytes (the remaining 7 bites)
41 22 85 dd 26 35 c5 b5
eight bytes of data, these data are double type, and gives 606958.57462900004 - 10 minutes and 6.958574629 seconds (and the VLC player shows this time).
In C++:
__int64 durationRaw = 0x412285dd2635c5b5;
double durationMiliSeconds = (double*)&durationRaw;
So for your case get desired Duration value, convert it to miliseconds as double (8 bytes float) and store its binary represenation as is in desired value in the Duration block.

Related

How to read perfectly a .pcap file

Using tcpdump im trying to sniff some packets. The result is this:
reading from file /tmp/prueba.pcap, link-type LINUX_SLL (Linux cooked v1)
13:35:51.767194 IP6 fdc1:41d:9c3:dbef:a6e9:69f0:59aa:b70a.47193 > fdc1:41d:9c3:dbef:0:ff:fe00:8c00.47193: UDP, length 63
0x0000: 6000 0000 0047 1140 fdc1 041d 09c3 dbef `....G.#........
0x0010: a6e9 69f0 59aa b70a fdc1 041d 09c3 dbef ..i.Y...........
0x0020: 0000 00ff fe00 8c00 b859 b859 0047 d42e .........Y.Y.G..
0x0030: 3f0c 0000 0dc2 50f1 0d7b 2254 696d 6522 ?.....P..{"Time"
0x0040: 3a5b 3136 3632 3033 3933 3531 2c22 225d :[1662039351,""]
0x0050: 2c22 4d6f 6417 0012 320f 00f0 0352 6f6c ,"Mod...2....Rol
0x0060: 6c22 3a5b 3533 302c 2264 c2ba 225d 7d l":[530,"d.."]}
The point is in the line with address 0x0050 we can read "Mod...2". That "Mod" means "Mode" but I don't understand why is not the whole word "Mode". ¿Where is the "e"? I need to read that message perfectly for automate a program reading values from there.
I discarded a puntual problem transmiting the message because every time I sniff a packet that contain that info, the format is exactly the same.
Regards,
There are indications that the packet is not correct in other ways than a missing e. For example, the ether type is 0x09c3 and not 0x86dd (IPv6).
Maybe this code to create a PCAP file can help. Using the raw packet you provided as input the output file is bad.pcap and you could use a tool like Wireshark to examine the packet in more detail, see here
import codecs
from scapy.all import wrpcap, Ether, IP, IPv6, UDP, Raw
data = (
'60 00 00 00 00 47 11 40 fd c1 04 1d 09 c3 db ef '
'a6 e9 69 f0 59 aa b7 0a fd c1 04 1d 09 c3 db ef '
'00 00 00 ff fe 00 8c 00 b8 59 b8 59 00 47 d4 2e '
'3f 0c 00 00 0d c2 50 f1 0d 7b 22 54 69 6d 65 22 '
'3a 5b 31 36 36 32 30 33 39 33 35 31 2c 22 22 5d '
'2c 22 4d 6f 64 17 00 12 32 0f 00 f0 03 52 6f 6c '
'6c 22 3a 5b 35 33 30 2c 22 64 c2 ba 22 5d 7d' )
data_list = data.split( " " )
data_s = codecs.decode(''.join(data_list), 'hex')
packet = Raw(load=data_s)
wrpcap('bad.pcap', [packet])
data = (
'3f 0c 00 00 0d c2 50 f1 0d 7b 22 54 69 6d 65 22 '
'3a 5b 31 36 36 32 30 33 39 33 35 31 2c 22 22 5d '
'2c 22 4d 6f 64 17 00 12 32 0f 00 f0 03 52 6f 6c '
'6c 22 3a 5b 35 33 30 2c 22 64 c2 ba 22 5d 7d' )
data_list = data.split( " " )
data_s = codecs.decode(''.join(data_list), 'hex')
packet = Ether(dst="60:00:00:00:00:47", src="11:40:fd:c1:04:1d") / IPv6(dst="fdc1:41d:9c3:dbef:0:ff:fe00:8c00", src="fdc1:41d:9c3:dbef:a6e9:69f0:59aa:b70a" ) / UDP(sport=47193, dport=47193, len=0x0047 ) / Raw(load=data_s)
wrpcap('better.pcap', [packet])
The hex dump begins with the IPv6 header; the link-layer header is not being dumped, so the Ethertype that would appear in the LINKTYPE_SLL_LINUX link-layer header isn't shown.
So the header is:
6000 0000: version (6), traffic class (0), flow label (0)
0047: payload length (0x47 = 71)
1140: next header (0x11 = 17 = UDP), hop limit (0x40 = 64)
fdc1 041d 09c3 dbef a6e9 69f0 59aa b70a: source IPv6 address (fdc1:41d:9c3:dbef:a6e9:69f0:59aa:b70a)
fdc1 041d 09c3 dbef 0000 00ff fe00 8c00: destination IPv6 address (fdc1:41d:9c3:dbef:0:ff:fe00:8c00)
The next header field is 17, so what follows that is a UDP header:
b859: source port (0x5859 = 47193)
b859: destination port (0x5859 = 47193)
0047: length (0x47 = 71)
d42e: checksum
Those 71 bytes are the UDP header (8 bytes) plus the UDP payload (71 - 8 = 63 bytes).
Port 47193 is a in the "registered" port range; however, it does not appear in the current list of well-known and registered ports.
It does, however, appear, from some web searches, to be the default gateway port for MQTT-SN. MQTT "is a lightweight, publish-subscribe, machine to machine network protocol", and MQTT-SN, according to that page, "is a variation of the main protocol aimed at battery-powered embedded devices on non-TCP/IP networks".
If this is MQTT-SN, then, according to the protocol specification for MQTT-SN, the payload would be:
3f0c: length (0x3f = 63), message type (0c = PUBLISH)
00: flags (0x0000)
000d: TopicId
c250: MsgId
f1 0d7b 2254 696d 6522 3a5b 3136 3632 3033 3933 3531 2c22 225d 2c22 4d6f 6417 0012 320f 00f0 0352 6f6c 6c22 3a5b 3533 302c 2264 c2ba 225d 7d: Data
So that data is:
0xf1 0x0d {"Time":[1662039351,""],"Mod 0x17 0x00 0x12 2 0x0f 0x00 0xf0 0x03 Roll":[530,"d 0xc2 0xba "]}
If the published data is intended to be ASCII text, it appears to have been damaged; if this is from a wireless low-power network, perhaps there was radio interference.
The answer is easy... The content of the pcap packet was compressed with lz4...

Parsing linux color control sequences

I'm trying to render the output of a linux shell command in HTML. For example, systemctl status mysql looks like this in my terminal:
As I understand from Floz'z Misc I was expecting that the underlying character stream would contain control codes. But looking at it in say hexyl (systemctl status mysql | hexyl) I can't see any codes:
Looking near the bottom on lines 080 and 090 where the text "Active: failed" is displayed, I was hoping to find some control sequences to change the color to red. While not necessarily ascii, I used some ascii tables to help me:
looking at the second lot of 8 characters on line 090 where the letters ive: fa are displayed, I find:
69 = i
76 = v
65 = e
3a = :
20 = space
66 = f
61 = a
69 = i
There are no bytes for control sequences.
I wondered if hexyl is choosing not to display them so I wrote a Java program which outputs the raw bytes after executing the process as a bash script and the results are the same - no control sequences.
The Java is roughly:
p = Runtime.getRuntime().exec(new String[]{"/bin/sh", "-c", "systemctl status mysql"}); // runs in the shell
p.waitFor();
byte[] bytes = p.getInputStream().readAllBytes();
for(byte b : bytes) {
System.out.println(b + "\t" + ((char)b));
}
That outputs:
...
32
32
32
32
32
65 A
99 c
116 t
105 i
118 v
101 e
58 :
32
102 f
97 a
105 i
108 l
101 e
100 d
...
So the question is: How does bash know that it has to display the word "failed" red?
systemctl detects that the output is not a terminal, and it removes colors codes from the output.
Related: Detect if stdin is a terminal or pipe? , https://unix.stackexchange.com/questions/249723/how-to-trick-a-command-into-thinking-its-output-is-going-to-a-terminal , https://superuser.com/questions/1042175/how-do-i-get-systemctl-to-print-in-color-when-being-interacted-with-from-a-non-t
Tools sometimes (sometimes not) come with options to enable color codes always, like ls --color=always, grep --color=always on in case of systemd with SYSTEMD_COLORS environment variable.
What tool can I use to see them?
You can use hexyl to see them.
how does bash know that it has to mark the word "failed" red?
Bash is the shell, it is completely unrelated.
Your terminal, the graphical window that you are viewing the output with, knows to mark it red because of ANSI escape sequences in the output. There is no interaction with Bash.
$ SYSTEMD_COLORS=1 systemctl status dbus.service | grep runn | hexdump -C
00000000 20 20 20 20 20 41 63 74 69 76 65 3a 20 1b 5b 30 | Active: .[0|
00000010 3b 31 3b 33 32 6d 61 63 74 69 76 65 20 28 72 75 |;1;32mactive (ru|
00000020 6e 6e 69 6e 67 29 1b 5b 30 6d 20 73 69 6e 63 65 |nning).[0m since|
00000030 20 53 61 74 20 32 30 32 32 2d 30 31 2d 30 38 20 | Sat 2022-01-08 |
00000040 31 39 3a 35 37 3a 32 35 20 43 45 54 3b 20 35 20 |19:57:25 CET; 5 |
00000050 64 61 79 73 20 61 67 6f 0a |days ago.|
00000059

Calculating the number of samples in a WAV file

I am currently looking at online examples and here is a WAV file contents in bytes
52 49 46 46 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00
22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 00 08 00 00 00 00 00 00
24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d
and here is the visual; representation:
So according to the Subchunk2Size there is 2048 bytes in the data. The formula to calculate the number of samples in a WAV is given as:
Subchunk2Size /(NumChannels * BitsPerSample/8 ) = NumSamples
If I plugin numbers and according to the information given I get NumSamples = 512. But in the diagram the sample rate is 22050. How can the total number fo samples be less than a single second of samples?
For those wondering, here is a link to the source.
I suspect they are just using a bad example where the duration of the wav file would be less than a second. Their formula makes sense and we can use it to verify the data size of a one second wav file.
If our sample rate is 22050 samples/sec and our wav file is one second, then numSamples = 22050. We know that Subchunk2Size is the number of bytes in the data and can be calculated using this formula: Subchunk2Size = numSamples * numChannels * bitsPerSample / 8 , so, assuming numChannels = 2 and bitsPerSample = 16, we know that a one second wav file should be (22050 * 2 * 16 / 8) bytes which is 88200 bytes, so it would make sense that if Subchunk2Size is 2048 bytes, as per the website's example, then the duration of the wav file would be less than a second and thus, numSamples would be less than 22050.

Unknown Bittorrent Message

I have been receiving an odd/unknown message while attempting to communicate with some bittorrent peers. In this particular case I am in the middle of downloading pieces and all of a sudden this new/odd message pops up in front of a piece response.The message is odd because it doesn't appear to follow the protocol, all messages are supposed to look like this
'<length prefix><message ID><payload>'
length prefix is 4 bytes, message id is 1 byte and the payload. I am including a capture to show what I mean, on line 509 of the capture you will
see a request for a piece, on line 510 you will see the beginning of the response.
The first 4 bytes of the response are 00 00 00 00, ie 0 length message (Which is causing me issues), the next 4 bytes are the actual length of the message which is 30. The actual response to the piece request starts on line 513, so I get the piece I was requesting but this new/odd message is messing me up. I'm certain I can find a workaround but I would really like to understand what this means.
Also, I have no idea what the actual message means, and cannot find any information about it anywhere.
Here is the Wireshark capture.
https://1drv.ms/u/s!Agj06pa-wu0tnFqsYn_KnHmVz3x2
Data from packet 510:
0000 00 00 00 00 00 00 00 1e 14 01 64 35 3a 61 64 64 ..........d5:add
0010 65 64 36 3a 63 f2 7a 48 17 f4 37 3a 64 72 6f 70 ed6:c.zH..7:drop
0020 70 65 64 30 3a 65 ped0:e
00 00 00 00 4 bytes keep-alive message
00 00 00 1e message length 30 bytes
14 message type extended message (BEP10)
01 extended message ID = 1 as specified by the previous extension handshake: ut_pex
64 35 3a 61 64 64 65 64 36 3a 63 f2 7a 48 17 f4 37 3a 64 72 6f 70 70 65 64 30 3a 65
d5:added6:c.zH..7:dropped0:e
ut_pex message data (bencoded)
d
5:added
6:c.zH..
7:dropped
0:
e
ut_pex message data (bencoded with added white space)
The first 4 bytes of the response are 00 00 00 00, ie 0 length message (Which is causing me issues)
The bittorrent spec says
Messages of length zero are keepalives, and ignored.

What are those 32 bits near the start of a FLAC file?

I am trying to write a parser to extract information from the following FLAC file:
$ hd audio.flac | head -n 6
00000000 66 4c 61 43 00 00 00 22 12 00 12 00 00 00 00 00 |fLaC..."........|
00000010 00 00 0a c4 42 f0 00 78 9f 30 00 00 00 00 00 00 |....B..x.0......|
00000020 00 00 00 00 00 00 00 00 00 00 84 00 02 64 1f 00 |.............d..|
00000030 00 00 47 53 74 72 65 61 6d 65 72 20 65 6e 63 6f |..GStreamer enco|
00000040 64 65 64 20 76 6f 72 62 69 73 63 6f 6d 6d 65 6e |ded vorbiscommen|
00000050 74 10 00 00 00 12 00 00 00 54 49 54 4c 45 3d 52 |t........TITLE=R|
Now, according to the specification, the format should be as follow (numbers are in bits):
<32> "fLaC", the FLAC stream marker in ASCII
<16> The minimum block size (in samples) used in the stream.
<16> The maximum block size (in samples) used in the stream.
<24> The minimum frame size (in bytes) used in the stream.
<24> The maximum frame size (in bytes) used in the stream.
<20> Sample rate in Hz.
<3> (number of channels)-1. FLAC supports from 1 to 8 channels
<5> (bits per sample)-1. FLAC supports from 4 to 32 bits per sample.
<36> Total samples in stream.
<128> MD5 signature of the unencoded audio data.
So, I start to write my parser and, while testing, get very strange results. So I test with a "real" metadata extractor:
$ metaflac --list audio.flac
METADATA block #0
type: 0 (STREAMINFO)
is last: false
length: 34
minimum blocksize: 4608 samples
maximum blocksize: 4608 samples
minimum framesize: 0 bytes
maximum framesize: 0 bytes
sample_rate: 44100 Hz
channels: 2
bits-per-sample: 16
total samples: 7905072
MD5 signature: 00000000000000000000000000000000
From the numbers, I can deduce the following:
66 4c 61 43 00 00 00 22 12 00 12 00 00 00 00 00
~~~~~~~~~~~ ~~~~~~~~~~~ ~~~~~ ~~~~~ ~~~~~~~~ ~~
^ ^ ^ ^ ^ ^
| | | | | |
| | | | | + Etc.
| | | | + Minimum frame size
| | | + Maximum block size
| | + Minimum block size
| + What is that ?!?
+ FLAC stream marker
Where does those 32 bits come from? I see they represent the length of the header, but isn't it against the standard to put it there (Taking into account that we already know the length: (32+16+16+24+20+3+5+36+128)/8)?
The 0x22 (34) is indeed the header block size in bytes as part of the METADATA_BLOCK_HEADER which follows the fLaC marker in the stream. Of the first 8 bits (00), bit 7 indicates that there are more metadatablocks to follow, the next 7 bits indicate that it's a STREAMINFO block. The following 3 bytes (00 00 22) is the length of the contents of the block;
16 + 16 + 24 + 24 + 20 + 3 + 5 + 36 + 128 = 272 bits
272 bits / 8 = 34 (0x22) bytes.

Resources