How to use FEC feature for Opus codec - audio

I'm trying to use the opus Forward Error Correction (FEC) feature.
I have a service which does the encoding with OPUS_SET_INBAND_FEC(1)
and OPUS_SET_PACKET_LOSS_PERC(20) with 10ms packets and sends them over UDP.
I'm not clear on the decoding process though.
When a packet is lost, do I need to call decode with fec=1 ONLY or do I need to call decode with fec=0 after as well on the next packet?
How do I know up front the size of the pcm that I send to decode with fec enabled?

I managed to make it work.
The encoding part stated in the question was correct:
Use the encoder OPUS_SET_INBAND_FEC(1) and OPUS_SET_PACKET_LOSS_PERC(X) where x>0 ans x<100
Send packets with a duration of at least 10ms (for example: 480 samples at 48 kHz)
For the decoding part, when a packet is lost, call the decode function on the next packet first with fec=1 and again with fec=0.
When calling decode with fec=1, the pcm sent will be entirely filled.
If you don't know the length that the pcm should be use on the decoder OPUS_GET_LAST_PACKET_DURATION(x) where x will get the the duration of the last packet.

Related

how to deal with processing time delay of audio codec while streaming over RTP

In section 2.1 of the Speex codec manual it says:
Every speech codec introduces a delay in the transmission. For Speex, this delay is equal to the frame size, plus some amount
of “look-ahead” required to process each frame. In narrowband operation (8 kHz), the delay is 30 ms, while for wideband (16
kHz), the delay is 34 ms. These values don’t account for the CPU time it takes to encode or decode the frames.
In RTP Payload Format for the Speex Codec, RFC5574 it says:
ptime: SHOULD be a multiple of 20 msec
I have a 20mS frame time of encoded data. so I assume my ptime should be 20.
The delay for the encoding is 30mS or more. The time between RTP packets are 20mS. How is this supposed to work? Every other RTP payload is an empty packet? How do I resolve this?
Seemingly this is an issue with every codec. I must be missing some fundamental understanding of how streaming works.
I have validated I can stream a pre-encoded buffer and it sounds as intended.
I have tried:
Creating a large queue in the beginning to compensate, however this quickly becomes zero length.
Sending zero data as the payload
Ideas I haven't yet tried:
Send a packet of all padding and mark the RTP header as padding
Increase the sequence but not the timestamp until the next actual payload is ready (this sounds like it is against the spec?)
Note: I'm now wondering if the delay mentioned by speex is within the encoded output and the delay I am seeing while streaming is due to my limited CPU (embedded)
My note was correct. This question is flawed.
The Speex manual is referring to a delay in the audio output, not an inherent delay of processing time. Therefore the issue in question is not an issue.
I'm glad I asked the question, it helped me come to the solution.

MJPEG data doesn't have SOI marker. What kind of mjpeg I'm dealing with?

I'm writing a simple MJPEG streamer in Java.
Since it has very specific application, I had to reverse engineer the rtp packets sent by the GStreamer using the Wireshark sniffer. The GStreamer, in turn, just restreams the mjpeg avi file taken from specific camera.
The first and the most straightforward way is to prepare packets as an array of data and send them to the consumer, particularly VLC player, that uses live555 library. I'm populating the rtp headers well, so the internal payload is hardcoded; in my experiment VLC successfully displays the encoded screen, i.e. it decodes the hardcoded rtp payload well.
So, that is the value of first two packets payload (from total 25) in hex view:
String[] hexArray =
{
"0000000000ffa05a00000080100b0c0e0c0a100e0d0e1211101318281a181616183123251d283a333d3c3933383740485c4e404457453738506d51575f626768673e4d71797064785c6567631112121815182f1a1a2f634238426363636363636363636363636363636363636363636363636363636363636363636363636363636363636363636363636363ad8a5a00296801692800a5ed40052d001450014b40052f5a004dbce694e68000c7d29435002e452e68017a8a2800c51400518c500141a004c5348a0031498a005a2800a2800a2800a28014514001a2800a51400b450014500251400941a004a2800a5a004c668a004a5a00292800a5a0029734006697340052d0019a5cd004d15c4917dd6e3d0d5d8751523f7831ee2802d32c3728090187622aacba79eb1367d8d00547478db0ea54fbd20a0008a4e94005140052e6801411de9719a00694c8e29bb4d001d29680145385002629306800e476a379e38a00a9711bc97d0b80762839fd2926b677ba128e9b81340178e49a4a0028a004a2800a2801ebcd0450018a422801299707103ffba6802b45129d1d1d87ce4819a6c56ed2c45d0f4ed4010c91329c329151b479a008da32298548a004a334012a4acbdea64ba1d186280275955ba1a75002e714e0fc608cd002ec46e876d4b1d94afca1523d734012fd825c7381504700695e3923f987dd6cf7a009134fee5003f5a71b50a083b4714019de63472ba30fba719f5a956e79a0099270dc77a833feb1cf6a0087c3d8315c49d8c86b54d001d39a963b9963e8d91e86802c2df82312c60fd283159ce3e5f9188fa50042fa5b6331386fad5592d668cfcc87ea28021a280108a4c500262931400845215a00694cd31d38a00a8edf3800f5ab7167cb258f4a004a4ed4007bd283400b450014b40051400b4500140a005cd1400b45002d2e2801368349b481807a50028c8a5dd8eb400bb8529a000506800a2800a280034dc50018a4a0028a0028a0028a00296800a2800a05002d1400b4500145002628c5001494009de96800a280128a00292800a2800a5a0028068017345002e696800a5cd00392578db2848abb0ea2540120dc3d6802ec734370080437b1a8a5b147c98ced3e9da8029c904917df5e3d474a8e801297140094b400519a007034bd6801a541a69522800cfad2e680141a703400b46280136d2e280168c0228010ad26d34006293140062931400e14fc50021146df949c8a006e3bd57bdff8f57f718a0046f9747b65f5c1a92c86db66f76a0091803503db46dd06d3ed4015ded9d738f9876aaee9ce08c1a0089a3a8ca500260d1400a1b078a952775ef9a009d2e54f0dc54c1c1e86801c0d3d6464395623e940121bb988ff00586985d8924b1c9a008e57900cef6fceab34ae5c61cf1d6802c8903f2c050608dcf1c6680226b778db767a75fa5131db67330e323028022d29bc8d3e3ec5f923deafa5c83de80255954f7a76e1400b45003d26923395722ac25fb8e1d430a007f99693ff00ac4009f6a6369b138cc52fe7401564b09d3f8770f6aaecaca70ca41f7a0069146280022931400dc5262802bc96b193bb6f23de95b88cfd2800a3b5001411400500e68014519a00052d002f4a2800a2800a05002d2d0019a5cd002e68a005a050018e68c500041ed4738a003340606801720d2d001450036908c50025140051400514001a5a0028a002945002d250014b4005140052d002518a004c5140084514005140098a08a0028a00292800a280168a005a33400b9a5a00296801c188390715662be963c03f30f7a00bd15e4528c13b4fa1a596d62979c6d6f51401525b2953ee8de3d4557f62306800a4c50014a680014b400669d9a000804629bb3d2801304500f3400b9a706a00506945001462800a28016930280108a4c50018a70340013450004554d40edb5ff810fe74005e7cb6b6d1f60b9a9edc62", //1
"000004ec00ffa05ad23f53c9a0075260500211cd35a35718619a00aef66adf74e2abc96cebd5723da802131d4663a00615c52722800cd2ab9071934013a5cb0ebcd4cb72a7af140130604706968003cd44f0ab1c8e0d0048221b786e7de8daebd680012718a64f189ad9a20719e45003e2b74fb3c48c7e64183ef4c7b47ce55b1400d2658f191d0734b1dd6680275ba1532cc08eb400f0c294e2800a504af20e0d004e97b3271907eb538bc8a5189a3feb40086d6d67e51b6fb0a824d324519460d40151e0963fbc845474005262801920e2abce76c4dec2801d4940051d4d0021eb4a38a00297140074a5068003d68cd0014a38140051400b466800a5a0029680168cd0028a075a005ef4668014e29314009b79cd1839ce6800e7346ef6a000114b40084669b834008296800068a005a4a00296800a5a0028a0031450014a0d002d25002d140098a280131477a004a2800a280131462800a2800a2800a2800a2800a5cd002d2d0014b9a00506a68aea48beeb71e86802f43a8231c48369f5ed561922b850480c3b11401525b060731b6e1e86aaba321c3020fbd00251400519a005a41400a0d381a005e0d26d1400d2b8a43c50000d3b3400a1a9c0d0014500145001d697140098a5071400871da9a3ad002e6a9ea4730aafabaff3a005d48fef2251d92acc38fb3463da801d8a154b1c0ea68002b8241ea28c500376d211400c7851faad5692cffb87f034015e48193ef2d4263a00618e98460d001466801eb2329e0d4eb72470c280265995bbd48083400b9a5dc68011f0c3a60fad33045003be6039a72c8477a0078914f0c3229ad04321ce3140113da1504a1a882cabfc3400e5b82ad86e2a64ba05b19e940132ce0f7a915c3500381a5a004a952e254fbae7f1a00b09a81e049183f4a5c594e390149fc280227d3323314991ef55a4b39e3ce53207714015241818354af8e2dcfb9a009e92800029318a004ef4b400519cd002d038a005cd140052d00140a0028a00052e79a005a280169450014b40051400b4b400521a000d2e280108a4d9ce6800c11df349c8eb40099e69280168a005a4a005a3b500252d0014b4005140051400b40a005a28016931400946280128c500149400514001a4a0028a0028a004a280168a005a2801734b9a00296800cd491c8f19ca311401761d408e2519f7156d648a71c10d9ed4010cb60a7988edf6354e58258befaf1ea3a50047450019a5cd0019a280141a50d400b9a0806801a57d29bcd00283466801c1a9dba801783498a0028068017349400628228012a95f73244bfed668013516cdd7d302ae8e2341ed40052e6800dc73934a2800146280108cd3769a0031eb51496d1b83c60fb500577b361f77e6aaaf110704608a008da3a618e801b82282680141c53d6565e8680274b9fef0a996556e86801f9a5a0050d8a3e461c8e7d6800f2f8e0e693e65eb400a1cf4cd49bc746140018e293d326a07b21bb2a714010b452a1c0e68499875c8a009d6e7d4d4cb7008e680241203de9fb85002f1498e6801caee872ac454e97d2a8c361beb4014f559567f28c636b03f37b8ac7d41c2f94a7a33806802d5140094b400d22968010d274a005cd2d0014a0d0014500145002d1400b45001d296800ef477a005a51400a0f34b40051400b40e7ad0018e694f4a004a05002d21140098a42b91400d2b4878a00514b4001a4a005a280168a0028a0028a00296800cd19a005a33400b450025262801283400521a0028a0028a004a2800a280128a002945002d19a005a01a005a5a0001a72b10410706802e417ee9c3fcc3f5abb0dd4530c6707d0d0012da452f206d3ea2a9cb672c63206f1ea2802be292800a2800a5a0039a5cd0019a5a00695f4a6f22800cd2eea0050d4e0d400bba8cd0019a5cd002e68a004c550ba39d42dd3d89fa7228021ba7dd76deed5a8dc1c7b500369680128cd002834b", //2
According to the spec, the first bytes of the first packet are:
0000000000ffa05a
this is Type-specific, Fragment Offset, Type, Q, Width, Height, MBZ, Precision, Length
the next data starting with 0000008010 and ending with 636363 is Quantization Table 128 bytes long.
So, the question is what is the data starting right after it?
ad8a5a0029
According to specs, no one of it can be recognized as possible JPEG marker, so the question is what is the encoding used and how can I encode the image I like with this particular encoding?
Attempt to encode on the manner of http://www.media.mit.edu/pia/Research/deepview/src/JpegEncoder.java works well for the jpeg files, but is not decoded with VLC's live555.

How to prevent data throttling with audio codec streaming

I am sampling a incoming audio stream at 8Ksps. I have a codec that takes ~1.6ms to encode a packet of data (80 samples) into an encoded packet (5 samples). At this rate I get 8000*1.662e-3 ~= 13 samples every encoding cycle. But I need 80 samples every cycle. How do I keep the stream continuous? My only guess is slow down the bitrate of the outgoing encode stream but I'm not sure how to calculate this in general such that buffers on the incoming side don't fill up and the receiving side's buffers don't get starved.
This seems like a basic tenet of streaming but I can't find any info on methods. Thanks for any help!

Extract the audio data from RTP packets

I have RTP packets and I use the pcap.net Packet Object.
I need to get the actual audio data from the packet, without the RTP header - just the payload.
Language: C#
thnaks,
Ofek
RTP header length is 12 bytes usually. Using pcap.net you can create an object of udp.
UdpDatagram udp = null;
Then use
udp.Payload.ToHexadecimal();
So you will get full RTP . In that above hex string remove first 12 byte, that will be the rtp data(ex :voice).

Is GSM6.10 audio format block or stream based?

I might be asking the wrong question, but my knowledge in this area is very limited.
I'm using acmStreamConvert to convert PCM to GSM (6.10).
Audio Format: 8khz, 16-bit, mono
For the PCM buffer size I'm using 640 bytes (320 samples). For GSM buffer I'm using 65 bytes. My understanding is that GSM "always" converts 320 samples to 65 bytes.
The reason I ask "block or stream" is I'm wondering if I can safely convert multiple audio streams (real-time) using the same acmStreamConvert handle? I see the function has some flags for ACM_STREAMCONVERTF_START and ACM_STREAMCONVERTF_END and ACM_STREAMCONVERTF_BLOCKALIGN, but is it required I use this start/end sequence for GSM? I understand that might be required for some formats that use head/tails, but I'm hoping this isn't required for GSM format?
I'm working on a group VOIP client, and each client sends GSM format, and then needs to convert to PCM before playing. I'm hoping I don't need one ACM handle per client.
Stream based, or at least the ACM API usage of it is. Trying to use the same ACM objects/handles for multiple streams will produce undesired results. I suspect this also means it doesn't handle lost packets as well as other codecs might (haven't confirmed that part yet).

Resources