What is the maximum size of RTP Header? - voip

I am working on bandwidth estimations for VoIP calls. I want to know the maximum size of RTP header. I looked on wiki but only the minimum size is available. I tried to calculate manually the number of bits used in the header but the field:
"Header extension: (optional) The first 32-bit word contains a profile-specific identifier (16 bits) and a length specifier (16 bits) that indicates the length of the extension (EHL = extension header length) in 32-bit units, excluding the 32 bits of the extension header"
is confusing me. Please help.

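The header structure on the wiki page shows that the header size depends on the value of the CC (CSRC count) field (bits 4-7). The fixed part of the header is 12 bytes (96 bits), and these four bits can hold at most 15, so the header size will be 96 + 32 x CC = 96 + 15 x 32 = 576 bits = 72 bytes. This figure does not include the optional header extension you quoted: when the X bit is set, the extension adds its own 32-bit header plus EHL further 32-bit words (EHL is a 16-bit field, so in theory up to 65535 words).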
For more info, see RFC 3550.
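As a quick check of the arithmetic, here is a minimal Python sketch of the size calculation. The function and parameter names are made up for illustration; the only facts assumed are the RFC 3550 layout: a 12-byte fixed header, 4 bytes per CSRC entry, and an optional extension of 4 bytes plus a 16-bit count of 32-bit words.

```python
FIXED_HEADER_BYTES = 12  # V/P/X/CC, M/PT, sequence number, timestamp, SSRC


def rtp_header_bytes(csrc_count, extension_words=None):
    """Return the RTP header size in bytes.

    csrc_count      -- value of the 4-bit CC field (0..15)
    extension_words -- value of the 16-bit extension length field, or None
                       when the X bit is clear (hypothetical parameter name)
    """
    if not 0 <= csrc_count <= 15:
        raise ValueError("CC is a 4-bit field")
    size = FIXED_HEADER_BYTES + 4 * csrc_count
    if extension_words is not None:
        if not 0 <= extension_words <= 0xFFFF:
            raise ValueError("extension length is a 16-bit field")
        # 4-byte extension header followed by the extension words
        size += 4 + 4 * extension_words
    return size


print(rtp_header_bytes(0))   # minimum header: 12
print(rtp_header_bytes(15))  # all 15 CSRC entries, no extension: 72
```

For bandwidth estimation of ordinary VoIP calls the 12-byte case is usually the relevant one, since CSRC entries only appear when a mixer is involved.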

Related

choosing the galois field for reed-solomon encoding

What determines the size of the Galois field when using the Reed-Solomon algorithm to encode an arbitrary message of any size? Is it the symbol size, or the size of the message?
For example, if I am to encode ASCII characters, and I use GF(2^8) because ASCII characters are 8 bits, I would end up with a maximum codeword length of 2^8 - 1 = 255 ASCII characters. Then I would have to split the message into sub-messages of length 255.
Or, if I use GF(2^s) such that 2^s - 1 >= the length of the message, then there's no need to split the message, but in this case, even though I am encoding ASCII characters which are 8 bits, each symbol in the codeword would be treated as s bits.
Which is preferred? Or are there other things that determine the selection of the Galois field?
The fixed or maximum size of the message determines the symbol size: GF(2^4) for up to 15 nibbles (7.5 bytes), GF(2^8) for up to 255 bytes, GF(2^10) for up to 1023 10-bit symbols or 1278.75 bytes (often used for HDD 512 data byte sectors), GF(2^12) for up to 4095 12-bit symbols or 6142.5 bytes (often used for HDD 4096 data byte sectors).
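The limits follow from the fact that a Reed-Solomon code over GF(2^m) has at most 2^m - 1 symbols of m bits each per codeword. A small Python sketch (the function name is hypothetical) that reproduces the figures:

```python
def rs_codeword_limit(m):
    """Return (max symbols, max bytes) for a Reed-Solomon codeword over GF(2^m)."""
    max_symbols = 2 ** m - 1          # longest codeword, in m-bit symbols
    max_bytes = max_symbols * m / 8   # same length expressed in 8-bit bytes
    return max_symbols, max_bytes


for m in (4, 8, 10, 12):
    symbols, size = rs_codeword_limit(m)
    print(f"GF(2^{m}): up to {symbols} symbols = {size} bytes")
```

Note that the limit covers the whole codeword, so the parity symbols have to fit in it alongside the data.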

UTF-16 Encoding - Why using complex surrogate pairs?

I have been working on string encoding schemes, and while examining how UTF-16 works, I have a question: why use complex surrogate pairs to represent a 21-bit code point? Why not simply store some of the bits in the first code unit and the remaining bits in the second code unit? Am I missing something? Is there a problem with storing the bits directly, as is done in UTF-8?
Example of what I am thinking of:
The character '🙃'
Corresponding code point: 128579 (Decimal)
The binary form: 1 1111 0110 0100 0011 (17 bits)
It's 17-bit code point.
Based on UTF-8 schemes, it will be represented as:
240 : 11110 000
159 : 10 011111
153 : 10 011001
131 : 10 000011
In UTF-16, why not do something like this rather than using surrogate pairs:
49159 : 110 0 0000 0000 0111
30275 : 01 11 0110 0100 0011
Proposed alternative to UTF-16
I think you're proposing an alternative format using 16-bit code units analogous to the UTF-8 code scheme; let's designate it UTF-EMF-16.
In your UTF-EMF-16 scheme, code points from U+0000 to U+7FFF would be encoded as a single 16-bit unit with the MSB (most significant bit) always zero. Then, you'd reserve 16-bit units with the 2 most significant bits set to 10 as 'continuation units', with 14 bits of payload data. And then you'd encode code points from U+8000 to U+10FFFF (the current maximum Unicode code point) in 16-bit units with the three most significant bits set to 110 and up to 13 bits of payload data. With Unicode as currently defined (U+0000 .. U+10FFFF), you'd never need more than 7 of the 13 bits set.
U+0000 .. U+7FFF: one 16-bit unit, values 0x0000 .. 0x7FFF
U+8000 .. U+10FFFF: two 16-bit units:
1. First unit 0xC000 .. 0xC043
2. Second unit 0x8000 .. 0xBFFF
For your example code point, U+1F643 (binary: 1 1111 0110 0100 0011):
First unit: 1100 0000 0000 0111 = 0xC007
Second unit: 1011 0110 0100 0011 = 0xB643
The second unit differs from your example in reversing the two most significant bits, from 01 in your example to 10 in mine.
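To make the bit layout concrete, here is a minimal Python sketch of the hypothetical UTF-EMF-16 scheme as described above. It is not a real encoding; the function names are invented for illustration, and the only rules assumed are the 110 lead prefix with 13 payload bits and the 10 continuation prefix with 14 payload bits.

```python
def emf16_encode(code_point):
    """Encode one code point as a list of 16-bit units (hypothetical UTF-EMF-16)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode range")
    if code_point <= 0x7FFF:
        return [code_point]                    # single unit, MSB is 0
    lead = 0xC000 | (code_point >> 14)         # 110 prefix + high bits
    trail = 0x8000 | (code_point & 0x3FFF)     # 10 prefix + low 14 bits
    return [lead, trail]


def emf16_decode(units):
    """Decode the unit list produced by emf16_encode back to a code point."""
    if len(units) == 1:
        return units[0]
    lead, trail = units
    return ((lead & 0x1FFF) << 14) | (trail & 0x3FFF)


units = emf16_encode(0x1F643)
print([hex(u) for u in units])  # ['0xc007', '0xb643']
```

The largest code point, U+10FFFF, comes out as 0xC043 0xBFFF, which matches the upper ends of the two unit ranges listed above.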
Why wasn't such a scheme used in UTF-16?
Such a scheme could be made to work. It is unambiguous. It could accommodate many more characters than Unicode currently allows. UTF-8 could be modified to become UTF-EMF-8 so that it could handle the same extended range, with some characters needing 5 bytes instead of the current maximum of 4 bytes. UTF-EMF-8 with 5 bytes would encode up to 26 bits; UTF-EMF-16 could encode 27 bits, but should be limited to 26 bits (roughly 64 million code points, instead of just over 1 million). So, why wasn't it, or something very similar, adopted?
The answer is the very common one – history (plus backwards compatibility).
When Unicode was first defined, it was hoped or believed that a 16-bit code set would be sufficient. The UCS2 encoding was developed using 16-bit values, and many values in the range 0x8000 .. 0xFFFF were given meanings. For example, U+FEFF is the byte order mark.
When the Unicode scheme had to be extended to make Unicode into a bigger code set, there were many defined characters with the 10 and 110 bit patterns in the most significant bits, so backwards compatibility meant that the UTF-EMF-16 scheme outlined above could not be used for UTF-16 without breaking compatibility with UCS2, which would have been a serious problem.
Consequently, the standardizers chose an alternative scheme, where there are high surrogates and low surrogates.
0xD800 .. 0xDBFF High surrogates (most significant bits of 21-bit value)
0xDC00 .. 0xDFFF Low surrogates (less significant bits of 21-bit value)
The low surrogates range provides storage for 10 bits of data: the prefix 1101 11 uses 6 of 16 bits. The high surrogates range also provides storage for 10 bits of data: the prefix 1101 10 also uses 6 of 16 bits. But because the BMP (Basic Multilingual Plane, U+0000 .. U+FFFF) doesn't need to be encoded with two 16-bit units, the UTF-16 encoding subtracts 1 from the high order data, and can therefore be used to encode U+10000 .. U+10FFFF. (Note that although Unicode is a 21-bit encoding, not all 21-bit (unsigned) numbers are valid Unicode code points. Values from 0x110000 .. 0x1FFFFF are 21-bit numbers but are not a part of Unicode.)
From the Unicode FAQ (UTF-8, UTF-16, UTF-32 & BOM):
Q: What's the algorithm to convert from UTF-16 to character codes?
A: The Unicode Standard used to contain a short algorithm, now there is just a bit distribution table. Here are three short code snippets that translate the information from the bit distribution table into C code that will convert to and from UTF-16.
Using the following type definitions
#include <stdint.h>
typedef uint16_t UTF16;
typedef uint32_t UTF32;
the first snippet calculates the high (or leading) surrogate from a character code C.
const UTF16 HI_SURROGATE_START = 0xD800;
UTF16 X = (UTF16) C;
UTF32 U = (C >> 16) & ((1 << 5) - 1);
UTF16 W = (UTF16) U - 1;
UTF16 HiSurrogate = HI_SURROGATE_START | (W << 6) | X >> 10;
where X, U and W correspond to the labels used in Table 3-5 UTF-16 Bit Distribution. The next snippet does the same for the low surrogate.
const UTF16 LO_SURROGATE_START = 0xDC00;
UTF16 X = (UTF16) C;
UTF16 LoSurrogate = (UTF16) (LO_SURROGATE_START | X & ((1 << 10) - 1));
Finally, the reverse, where hi and lo are the high and low surrogate, and C the resulting character
UTF32 X = (hi & ((1 << 6) -1)) << 10 | lo & ((1 << 10) -1);
UTF32 W = (hi >> 6) & ((1 << 5) - 1);
UTF32 U = W + 1;
UTF32 C = U << 16 | X;
A caller would need to ensure that C, hi, and lo are in the appropriate ranges.
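For comparison, here is the same surrogate arithmetic as a short Python sketch, with the range checks included. The function names are invented for illustration; subtracting 0x10000 before splitting is equivalent to the "subtract 1 from the high order data" step described above.

```python
def to_surrogates(code_point):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need surrogate pairs")
    value = code_point - 0x10000           # 20 bits remain
    high = 0xD800 | (value >> 10)          # top 10 bits
    low = 0xDC00 | (value & 0x3FF)         # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + (((high & 0x3FF) << 10) | (low & 0x3FF))


high, low = to_surrogates(0x1F643)
print(hex(high), hex(low))  # 0xd83d 0xde43
```

The result for U+1F643 agrees with what Python's own codec produces for `"\U0001F643".encode("utf-16-be")`.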

ECDH private key size

I know that key sizes in ECDH depend on size of Elliptic Curve.
If it is a 256-bit curve (secp256k1), keys will be:
Public: 32 bytes * 2 + 1 = 65 (uncompressed)
Private: 32 bytes
384-bit curve (secp384r1):
Public: 48 bytes * 2 + 1= 97 (uncompressed)
Private: 48 bytes
But with 521-bit curve (secp521r1) situation is very strange:
Public: 66 bytes * 2 + 1 = 133 (uncompressed)
Private: 66 bytes or 65 bytes.
I used the node.js crypto module to generate these keys.
Why is the private key size of the 521-bit curve variable?
The private keys of the other curves are variable in size as well, but they are less likely to exhibit this variance when encoded to bytes.
The public key is encoded as two statically sized integers, prefixed with the uncompressed point indicator 04. Each integer has the same size as the key size in bytes.
The private key doesn't really have a pre-established encoding. It is a single random value (or vector) within the range 1..N-1 where N is the order of the curve. Now if you encode this value as a variable sized unsigned number then usually it will be the same size as the key in bytes. However, it may by chance be one byte smaller, or two, or three or more. Of course, the chance that it is much smaller is pretty low.
Now the 521-bit key is a bit strange in that the first, most significant byte of the order doesn't start with a bit set to 1; it only has the least significant bit set to 1. This means that there is a much higher chance that the encoded private value (usually called s) is a byte shorter.
The exact chance of course depends on the full value of the order:
01FF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFA
51868783 BF2F966B 7FCC0148 F709A5D0
3BB5C9B8 899C47AE BB6FB71E 91386409
but as you may guess it is pretty close to 1 out of 2 because there are many bits set to 1 afterwards. The chance that two bytes are missing is of course 1 out of 512, and three bytes 1 out of 131072 (etc.).
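These odds can be checked directly from the order. The Python sketch below copies the order from the hex dump above and assumes the private value is drawn uniformly from 1..N-1 and encoded as a minimal unsigned big-endian integer; the helper names are invented for illustration.

```python
from fractions import Fraction

# Order of secp521r1, copied from the hex dump above.
N = int(
    "01FF" "FFFFFFFF" "FFFFFFFF" "FFFFFFFF" "FFFFFFFF"
    "FFFFFFFF" "FFFFFFFF" "FFFFFFFF" "FFFFFFFA"
    "51868783" "BF2F966B" "7FCC0148" "F709A5D0"
    "3BB5C9B8" "899C47AE" "BB6FB71E" "91386409",
    16,
)


def encoded_length(k):
    """Bytes needed to store k as a minimal unsigned big-endian integer."""
    return max(1, (k.bit_length() + 7) // 8)


def prob_at_most(num_bytes):
    """Probability that a uniform k in 1..N-1 fits in num_bytes bytes."""
    fitting = min(2 ** (8 * num_bytes) - 1, N - 1)
    return Fraction(fitting, N - 1)


for size in (66, 65, 64, 63):
    print(size, float(prob_at_most(size)))
```

The printed probabilities for 65, 64 and 63 bytes come out very close to 1/2, 1/512 and 1/131072 respectively.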
Note that ECDSA signature sizes may fluctuate as well. The X9.62 signature scheme uses two DER encoded signed integers. The fact that they are signed may introduce a byte set to all zeros if the most significant bit of the most significant byte is set to 1, as otherwise the value would be interpreted as being negative. The fact that it consists of two numbers, r and s, and that the size of the DER encoding also depends on the size of the encoded integers makes the size of the full encoding rather hard to predict.
Another less common (flat) encoding of an ECDSA signature uses the same statically sized integers as the public key, in which case it is just twice the size of the order N in bytes.
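To illustrate how the two signature encodings differ in size, here is a rough Python sketch. It only computes lengths, assuming the DER form is a SEQUENCE of two INTEGERs with standard DER length octets; the function names are invented for illustration.

```python
def der_length_octets(length):
    """Number of octets DER uses to encode a content length."""
    if length < 128:
        return 1                                  # short form
    return 1 + (length.bit_length() + 7) // 8     # long form: 0x8n + n octets


def der_integer_size(value):
    """Total size of a DER INTEGER holding a non-negative value."""
    # Minimal two's complement: an extra 0x00 octet appears when the top bit is set.
    content = value.bit_length() // 8 + 1
    return 1 + der_length_octets(content) + content


def ecdsa_der_size(r, s):
    """Size of SEQUENCE { INTEGER r, INTEGER s }."""
    body = der_integer_size(r) + der_integer_size(s)
    return 1 + der_length_octets(body) + body


def ecdsa_flat_size(order_bytes):
    """Size of the flat encoding: r and s as fixed-size integers."""
    return 2 * order_bytes


print(ecdsa_der_size(2 ** 255, 2 ** 255))  # both top bits set on a 256-bit curve: 72
print(ecdsa_der_size(2 ** 254, 2 ** 254))  # no padding octets needed: 70
print(ecdsa_flat_size(32))                 # flat encoding on a 256-bit curve: 64
```

With 521-bit values the SEQUENCE body exceeds 127 bytes, so the long-form length adds one more octet, for example 139 bytes when r and s each need 66 bytes.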
ECDH doesn't have this issue. Commonly the shared secret is the statically encoded X coordinate of the point that is the result of the ECDH calculation - or at least a value that is derived from it using a Key Derivation Function (KDF).

Finding number of samples in .wav file and Hex Editor

I need help with a hex editor and audio files. I am having trouble figuring out the formula to get the number of samples in my .wav files.
I downloaded StripWav, which tells me the number of samples in the .wav files, but I still cannot figure out the formula.
Can you please download these two .wav files, open them in a hex editor, and tell me the formula to get the number of samples?
If you would kindly do this for me, please tell me the number of samples for each .wav so I can make sure the formula is correct.
http://sinewavemultimedia.com/1wav.wav
http://sinewavemultimedia.com/2wav.wav
Here is the problem: I have two programs.
One reads the wav data and the other shows the number of samples.
Here is the data:
RIFF 'WAVE' (wave file)
<fmt > (format description)
PCM format
2 channel
44100 frames per sec
176400 bytes per sec
4 bytes per frame
16 bits per sample
<data> (waveform data - 92252 bytes)
But the other program says NumSamples is
23,063 samples
/*******UPDATE*********/
One more thing: I did the calculation with 2 files.
This one is correct:
92,296 bytes and the number of samples is 23,063
But the other one is not coming out correctly. It is over 2 MB, and I just subtracted 44 bytes. Am I doing it wrong here? Here is the file size:
2,473,696 bytes
But the correct numsamples is
617,400
WAVE format
You must read the fmt header to determine the number of channels and bits per sample, then read the size of the data chunk to determine how many bytes of data are in the audio. Then:
NumSamples = NumBytes / (NumChannels * BitsPerSample / 8)
There is no simple formula for determining the number of samples in a WAV file. A so-called "canonical" WAV file consists of a 44-byte header followed by the actual sample data. So, if you know that the file uses 2 bytes per sample, then the number of samples is equal to the size of the file in bytes, minus 44 (for the header), and then divided by 2 (since there are 2 bytes per sample).
Unfortunately, not all WAV files are "canonical" like this. A WAV file uses the RIFF format, so the proper way to parse a WAV file is to search through the file and locate the various chunks.
Here is a sample (not sure what language you need to do this in):
http://msdn.microsoft.com/en-us/library/ms712835
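As a language-neutral illustration of the chunk walk, here is a minimal Python sketch. It assumes a little-endian RIFF/WAVE stream with a standard 16-byte PCM fmt chunk; the function names and the in-memory demo file (which mimics the 92,252-byte data chunk from the question, with an extra LIST chunk so the header is no longer 44 bytes) are invented for illustration.

```python
import io
import struct


def count_wav_frames(stream):
    """Walk the RIFF chunks and return (frames, samples) for a PCM WAV stream."""
    riff, _size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    channels = block_align = data_size = None
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break                                  # end of file
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"fmt ":
            fmt = stream.read(chunk_size)
            _tag, channels, _rate, _byte_rate, block_align, _bits = struct.unpack(
                "<HHIIHH", fmt[:16])
        else:
            if chunk_id == b"data":
                data_size = chunk_size
            stream.seek(chunk_size, io.SEEK_CUR)   # skip the chunk body
        if chunk_size % 2:
            stream.seek(1, io.SEEK_CUR)            # chunks are padded to even size
    if block_align is None or data_size is None:
        raise ValueError("missing fmt or data chunk")
    frames = data_size // block_align              # one frame = one sample per channel
    return frames, frames * channels


def build_demo_wav():
    """Build a stereo 16-bit WAV in memory with an extra chunk before the data."""
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 2, 44100, 176400, 4, 16)
    extra = struct.pack("<4sI", b"LIST", 10) + bytes(10)
    data = struct.pack("<4sI", b"data", 92252) + bytes(92252)
    body = b"WAVE" + fmt + extra + data
    return b"RIFF" + struct.pack("<I", len(body)) + body


print(count_wav_frames(io.BytesIO(build_demo_wav())))  # (23063, 46126)
```

The first number is the per-channel count (what the question's tool reports as NumSamples); subtracting a fixed 44 bytes from the file size would give the wrong answer for this demo file because of the extra chunk.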
A WAVE's format chunk (fmt) has the 'bytes per sample frame' specified as wBlockAlign.
So: framesTotal = data.ck_size / fmt.wBlockAlign;
and samplesTotal = framesTotal * wChannels;
Thus, samplesTotal === framesTotal IFF wChannels === 1!!
Note how the above answer elegantly avoided explaining that the key equations in the spec (and answers based on them) are WRONG:
Consider, for example, a 2-channel wave with 12 bits per sample (bps).
The spec explains that we put each 12 bps sample in a word:
note: t=point in time, chan = channel
+---------------------------+---------------------------+-----
|          frame 1          |          frame 2          | etc
+-------------+-------------+-------------+-------------+-----
| chan 1 # t1 | chan 2 # t1 | chan 1 # t2 | chan 2 # t2 | etc
+------+------+------+------+------+------+------+------+-----
| byte | byte | byte | byte | byte | byte | byte | byte | etc
+------+------+------+------+------+------+------+------+-----
So.. how many bytes does the sample-frame (BlockAlign) for a 2ch 12bps wave have according to spec?
<sarcasm> CEIL(wChannels * bps / 8) = 3 bytes.. </sarcasm>
Obviously the correct equation is: wBlockAlign=wChannels*CEIL(bps/8)

Base64: What is the worst possible increase in space usage?

If a server received a base64 string and wanted to check its length before converting, say because it wanted to always permit the final byte array to be 16KB, how big could a 16KB byte array possibly become when converted to a Base64 string (assuming one byte per character)?
Base64 encodes each set of three bytes into four bytes. In addition the output is padded to always be a multiple of four.
This means that the size of the base-64 representation of a string of size n is:
ceil(n / 3) * 4
So, for a 16kB array, the base-64 representation will be ceil(16*1024/3)*4 = 21848 bytes long, about 21.3kB (taking 1kB = 1024 bytes).
A rough approximation would be that the size of the data is increased to 4/3 of the original.
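The formula is easy to confirm against a real encoder. A short Python sketch using the standard `base64` module (the helper name is made up for illustration):

```python
import base64


def base64_length(n):
    """Length of the padded Base64 encoding of n bytes: ceil(n / 3) * 4."""
    return 4 * ((n + 2) // 3)


size = 16 * 1024
print(base64_length(size))                  # 21848
print(len(base64.b64encode(bytes(size))))   # 21848
```

Because of the padding, the result depends only on the input length, not on the byte values, so there is no worse case for a given size as long as no line breaks are inserted.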
From Wikipedia
Note that given an input of n bytes, the output will be (n + 2 - ((n + 2) % 3)) / 3 * 4 bytes long, so that the number of output bytes per input byte converges to 4 / 3 or 1.33333 for large n.
So 16kb * 4 / 3 gives a little over 21.3 kb (21.333...), or 21848 bytes after padding, to be exact.
Hope this helps
16kb is 131,072 bits. Base64 packs 24-bit buffers into four 6-bit characters apiece, so you would have 5,462 * 4 = 21,848 bytes.
Since the question was about the worst possible increase, I must add that there are usually line breaks at around every 80 characters. This means that if you are saving base64 encoded data into a text file, each line will add 2 bytes on Windows and 1 byte on Linux.
The increase from the actual encoding has been described above.
This is a future reference for myself. Since the question is about the worst case, we should take line breaks into account. While RFC 1421 defines the maximum line length as 64 characters, RFC 2045 (MIME) states that there should be at most 76 characters per line.
The latter is what the C# library has implemented. So in a Windows environment, where a line break is 2 chars (\r\n), we get this: Length = Floor(Ceiling(N/3) * 4 * 78 / 76)
Note: Flooring is because during my test with C#, if the last line ends at exactly 76 chars, no line-break follows.
I can prove it by running the following code:
byte[] bytes = new byte[16 * 1024];
Console.WriteLine(Convert.ToBase64String(bytes, Base64FormattingOptions.InsertLineBreaks).Length);
The answer for 16 kBytes encoded to base64 with 76-char lines: 22422 chars
I assume on Linux it would be Length = Floor(Ceiling(N/3) * 4 * 77 / 76), but I haven't gotten around to testing it on .NET Core yet.
It would also depend on the actual character encoding, i.e. if we encode to a UTF-32 string, each base64 character would consume 3 additional bytes (4 bytes per char).
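The 76-character line-break arithmetic above can be checked with a short Python sketch. It assumes, as observed in the C# test, that a break is inserted only between lines and never after the last one; the function names are invented for illustration, and the wrapping is done by hand rather than by a library call.

```python
import base64


def base64_chars(n):
    """Unwrapped Base64 length for n input bytes."""
    return 4 * ((n + 2) // 3)


def wrapped_length(n, line_len=76, eol_len=2):
    """Exact length when a break is inserted between lines only."""
    chars = base64_chars(n)
    breaks = (chars - 1) // line_len if chars else 0
    return chars + breaks * eol_len


def answer_formula(n, eol_len=2):
    """Closed form quoted above: Floor(Ceiling(N/3) * 4 * (76 + eol) / 76)."""
    return base64_chars(n) * (76 + eol_len) // 76


def measured_length(n, eol="\r\n"):
    """Encode n zero bytes, wrap at 76 characters by hand, and measure."""
    text = base64.b64encode(bytes(n)).decode("ascii")
    lines = [text[i:i + 76] for i in range(0, len(text), 76)]
    return len(eol.join(lines))


size = 16 * 1024
print(wrapped_length(size), measured_length(size), answer_formula(size))
print(wrapped_length(size, eol_len=1))   # single-character line breaks
print(wrapped_length(57), answer_formula(57))
```

For 16 KiB all three agree on 22422, and the single-character variant gives 22135. For some other sizes, such as 57 bytes (exactly one full line), the closed form appears to overshoot the exact count by one or two characters, so it is best read as an upper bound.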
