Incorporate HEv2 AAC into an MPEG-TS for HLS content - http-live-streaming

I am trying to find information about HE-AAC v2 (PS) in an MPEG transport stream (TS) for HLS.
According to the HLS Authoring Specification for Apple Devices, HE-AAC v2 is a supported format. HE-AAC v2 is part of MPEG-4, but I cannot understand how it fits into a transport stream.
SBR (HE-AAC v1) can be carried in a TS with implicit signaling. In MP4 we have an AudioSpecificConfig. But how can I multiplex AAC Parametric Stereo into the TS?
Is this possible or not?
I cannot find any information about it on the Apple site or elsewhere.

There are two ways to put AAC into a transport stream.
1. Using ADTS syntax (MPEG-2 style).
In that case the PMT's stream_type should be 0x0F (ISO/IEC 13818-7 Audio with ADTS transport syntax).
So you are limited to the "old" (MPEG-2) AAC versions only, without SBR and PS.
2. Using LATM+LOAS/AudioSyncStream syntax (MPEG-4 style).
In that case the PMT's stream_type should be 0x11 (ISO/IEC 14496-3 Audio with the LATM transport syntax).
Here you can use the full set of "new" (MPEG-4) AAC features, including SBR and PS.
Furthermore, the DVB standard ETSI TS 101 154 requires that HE-AAC v1/v2 be transmitted using LATM syntax.
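One way to see the ADTS limitation described above is to look at the fixed ADTS header: its profile field is only 2 bits wide, so it has no value that names SBR or PS explicitly. The following is a minimal Python sketch of a header parser, not a full demuxer. The function name and the sample bytes are made up for illustration; the two stream_type constants are the values quoted in the answer.

```python
# PMT stream_type values quoted in the answer above.
STREAM_TYPE_AAC_ADTS = 0x0F  # ISO/IEC 13818-7 Audio with ADTS transport syntax
STREAM_TYPE_AAC_LATM = 0x11  # ISO/IEC 14496-3 Audio with the LATM transport syntax


def parse_adts_header(data):
    """Parse the first 7 bytes of an ADTS frame (hypothetical helper)."""
    if len(data) < 7:
        raise ValueError("need at least 7 bytes for an ADTS header")
    # 12-bit syncword: all ones.
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword not found")
    return {
        # ID bit: 0 = MPEG-4, 1 = MPEG-2.
        "mpeg_id": (data[1] >> 3) & 0x1,
        "protection_absent": data[1] & 0x1,
        # 2-bit profile = audio object type minus 1 (so only types 1..4).
        "object_type": ((data[2] >> 6) & 0x3) + 1,
        "sampling_frequency_index": (data[2] >> 2) & 0xF,
        "channel_configuration": ((data[2] & 0x1) << 2) | ((data[3] >> 6) & 0x3),
        # 13-bit frame length, header included.
        "frame_length": ((data[3] & 0x3) << 11) | (data[4] << 3) | (data[5] >> 5),
    }


# Illustrative header bytes: AAC-LC, sampling frequency index 4, 2 channels.
header = bytes([0xFF, 0xF1, 0x50, 0x80, 0x02, 0x7F, 0xFC])
info = parse_adts_header(header)
print(info)
```

Because the profile field can only hold object types 1 to 4, an ADTS header by itself can describe Main, LC, SSR or LTP, which matches the "old AAC only" remark for stream_type 0x0F.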

Related

How is the AAC encoder priming delay handled in HLS?

According to Apple, AAC encoding adds 2112 priming samples at the beginning of the audio. When creating an HLS stream with AAC audio, are these priming samples added to the beginning of each HLS segment or only to the first one? And how does this AAC encoder delay affect HLS DISCONTINUITY tags later in the stream?
https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/QTFFAppenG/QTFFAppenG.html
It depends on which AAC variant you use.
For "old-style" AAC-LC you only have priming samples at the beginning of the stream, not at the beginning of each segment.
But the delay is carried through the entire stream.
Typically a new piece of media, for example an advertisement, is played after a DISCONTINUITY tag, so you will receive another set of priming samples.
Your AAC audio decoder needs to discard the priming samples (the first 2112 PCM output samples) after startup and after each DISCONTINUITY.
If you use the more modern xHE-AAC, you don't have to worry about priming samples anymore.
Another wrinkle: in the early days it was simply assumed that AAC-LC has 2112 priming samples.
Now the number can be different, and it can be signaled in the MP4 container as an edit list.
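The discard rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names are hypothetical, PCM is modeled as a flat interleaved list, and 2112 is used as the default priming count, which, as noted, may differ for a given encoder.

```python
def priming_delay_ms(sample_rate_hz, priming_samples=2112):
    """Duration of the priming samples in milliseconds."""
    return 1000.0 * priming_samples / sample_rate_hz


class PrimingTrimmer:
    """Drops priming samples at startup and after each discontinuity."""

    def __init__(self, channels, priming_samples=2112):
        self.channels = channels
        self.priming_samples = priming_samples
        self._remaining = priming_samples  # sample frames still to drop

    def on_discontinuity(self):
        # New media follows (e.g. an advertisement): expect fresh priming.
        self._remaining = self.priming_samples

    def process(self, pcm):
        """pcm: interleaved decoder output; returns it without priming."""
        frames = len(pcm) // self.channels
        drop = min(self._remaining, frames)
        self._remaining -= drop
        return pcm[drop * self.channels:]


# Small mono example with an artificially short priming count of 3.
trimmer = PrimingTrimmer(channels=1, priming_samples=3)
first = trimmer.process([1, 2])        # entirely priming
second = trimmer.process([3, 4, 5])    # one priming sample left to drop
trimmer.on_discontinuity()
third = trimmer.process([7, 8, 9, 10])
print(priming_delay_ms(48000), first, second, third)
```

At 48 kHz the default 2112 samples correspond to 44 ms of audio, which is the offset that is carried through the stream if the priming is not removed.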

How do you encode raw pcm_f32le audio to AAC encoded audio with FFmpeg (C/C++)?

I am trying to encode raw audio (pcm_f32le) to AAC encoded audio. One thing I've noticed is that I can accomplish this via the CLI tool:
ffmpeg -f f32le -ar 48000 -ac 2 -c:a pcm_f32le -i out.raw out.m4a -y
This plays just fine and decodes fine.
The steps I've taken:
I am using the C example code at https://ffmpeg.org/doxygen/3.4/encode_audio_8c-example.html and have switched the encoder to codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
When I print the sample formats supported by the AAC encoder, it only lists FLTP, which is a planar (non-interleaved) float format.
This page seems to provide the various supported input formats per codec.
This is confusing because I don't think my raw captured audio is interleaved. I've certainly tried passing it through and it doesn't work as intended.
It will stay stuck here with this ret code indefinitely after calling avcodec_receive_packet:
AVERROR(EAGAIN): output is not available in the current state - user must try to send input
Questions:
How can I modify the example code from FFmpeg to convert pcm_f32le raw audio to AAC encoded audio?
Why is the CLI tool able to?
I am using libsoundio to capture raw audio from Linux's Dummy Output. I wonder how I could get a planar format to pass through to get AAC encoded audio.
If AAC is not possible, could I do this with MP3 instead?
Find here a working example of how to encode raw pcm_f32le to aac with ffmpeg
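As background to the FLTP confusion in the question: pcm_f32le data is interleaved (L R L R ...), while FLTP expects one separate buffer per channel. The Python sketch below only illustrates that layout difference; it is not FFmpeg code, and the function name is hypothetical. Within FFmpeg itself, libswresample is the usual tool for this conversion. Note also that AVERROR(EAGAIN) from avcodec_receive_packet means the encoder wants more input, which is a normal part of the send/receive loop.

```python
import struct


def deinterleave_f32le(raw, channels):
    """Split interleaved little-endian float32 PCM into per-channel planes."""
    bytes_per_frame = 4 * channels
    if len(raw) % bytes_per_frame != 0:
        raise ValueError("input does not contain whole sample frames")
    sample_count = len(raw) // 4
    samples = struct.unpack("<%df" % sample_count, raw)
    # Every channels-th sample, starting at the channel index, is one plane.
    return [list(samples[ch::channels]) for ch in range(channels)]


# Three stereo sample frames: (L, R) pairs, interleaved.
raw = struct.pack("<6f", 0.0, 1.0, 0.5, -0.5, 0.25, -0.25)
planes = deinterleave_f32le(raw, channels=2)
print(planes)
```

Each resulting plane corresponds to what would go into one data pointer of a planar audio frame; the AAC encoder then consumes such planes in blocks of its frame size.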

RTP AAC Packet Depacketizer

I asked earlier about H264 at RTP H.264 Packet Depacketizer
My question now is about the audio packets.
I noticed via the RTP packets that audio frames like AAC, G.711, G.726 and others all have the Marker Bit set.
I think the frames are independent. Am I right?
My question is: audio frames are small, and I know that I can have more than one frame per RTP packet. Regardless of how many frames a packet contains, are they always complete? Or can a frame be fragmented across RTP packets?
The difference between audio and video is that audio is typically encoded either as individual samples or as small frames that do not reference previous data. Additionally, the amount of data is small. So audio does not typically need complicated fragmentation to be transmitted over RTP. However, for any payload type you should again refer to the RFC that describes the details:
AAC - RTP Payload Format for MPEG-4 Audio/Visual Streams
G.711 - RTP Payload Format for ITU-T Recommendation G.711.1
G.726 - RTP Profile for Audio and Video Conferences with Minimal Control
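To make the "several complete frames in one packet" case concrete, here is a hedged Python sketch for one common AAC packetization, the AAC-hbr mode of RFC 3640. Note that this RFC is not among the titles listed above, so treat it as an assumption about what the sender negotiated (sizeLength=13, indexLength=3, indexDeltaLength=3 in the SDP). The function name is hypothetical.

```python
def split_aac_hbr_payload(payload):
    """Split an RTP payload (AAC-hbr mode assumed) into access units."""
    if len(payload) < 2:
        raise ValueError("payload too short for AU-headers-length")
    # First 16 bits: total length of the AU-header section, in bits.
    headers_bits = int.from_bytes(payload[0:2], "big")
    if headers_bits % 16 != 0:
        raise ValueError("unexpected AU-header size for AAC-hbr")
    count = headers_bits // 16
    data_start = 2 + 2 * count
    if len(payload) < data_start:
        raise ValueError("payload shorter than its AU-header section")
    frames = []
    offset = data_start
    for i in range(count):
        au_header = int.from_bytes(payload[2 + 2 * i:4 + 2 * i], "big")
        size = au_header >> 3  # upper 13 bits: AU size in bytes
        end = offset + size
        if end > len(payload):
            # Declared size exceeds this packet: fragment or truncation.
            raise ValueError("access unit is fragmented or truncated")
        frames.append(payload[offset:end])
        offset = end
    return frames


# Two made-up access units of 3 and 2 bytes in a single payload.
payload = bytes([0x00, 0x20, 0x00, 0x18, 0x00, 0x10]) + b"abc" + b"de"
frames = split_aac_hbr_payload(payload)
print(frames)
```

In this mode a packet carries either one or more complete access units or a single fragment of one, and the marker bit is set on packets that hold complete units or the final fragment, which is consistent with the marker bit observation in the question.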
Other

Which are the FLV supported audio types?

I'm having issues playing back some QuickTime files using ActionScript 3.0 (NetStream class).
I have no control over how the QuickTime files are produced, but so far it seems that files with uncompressed audio do not play any audio in Flash Player.
I'm trying to compile a list of the audio formats supported for video files (mov, flv, etc.) in Flash Player, but I'm confused by the resources.
I've looked through the FLV format specification (PDF) on DevNet, and the media types listed there are:
MP3: A media type of .mp3 (0x2E6D7033) indicates that the track contains MP3 audio data. The dot character, hex 0x2E, is included to make a complete four-character code.
AAC: A media type of mp4a (0x6D703461) indicates that the track is encoded with AAC audio. Flash Player supports the following AAC profiles, denoted by their object types:
- 1 = main profile
- 2 = low complexity, a.k.a. LC
- 5 = high efficiency/scale band replication, a.k.a. HE/SBR
When the audio codec is AAC, an esds box occurs inside the stsd box of a sample table. This box contains initialization data that an AAC decoder requires to decode the stream. See ISO/IEC 14496-3 for more information about the structure of this box.
The Wikipedia entry also mentions uncompressed audio:
FLV files also support uncompressed audio or ADPCM format audio.
but there is no reference for that statement.
Is there a page that lists all the supported audio formats for playing back video in Flash Player?
Be careful not to confuse the F4V and FLV container formats.
The official specification you mentioned describes both of these formats.
Your quote specifically refers to the F4V format which only supports MP3 and AAC in the flash player.
The list of audio codecs supported by the FLV container is shown on page 70 in the same file:
SoundFormat (see notes following table, for special encodings), UB[4]: Format of SoundData. The following values are defined:
0 = Linear PCM, platform endian
1 = ADPCM
2 = MP3
3 = Linear PCM, little endian
4 = Nellymoser 16 kHz mono
5 = Nellymoser 8 kHz mono
6 = Nellymoser
7 = G.711 A-law logarithmic PCM
8 = G.711 mu-law logarithmic PCM
9 = reserved
10 = AAC
11 = Speex
14 = MP3 8 kHz
15 = Device-specific sound
Formats 7, 8, 14, and 15 are reserved.
AAC is supported in Flash Player 9,0,115,0 and higher.
Speex is supported in Flash Player 10 and higher.
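As a rough illustration of how the SoundFormat value is read in practice, the Python sketch below decodes the first byte of an FLV audio tag body. The format names come from the table above; the meaning of the remaining three fields (SoundRate, SoundSize, SoundType) follows the same FLV specification rather than the quoted excerpt, and the function name is hypothetical.

```python
# SoundFormat values as listed in the table above.
SOUND_FORMATS = {
    0: "Linear PCM, platform endian",
    1: "ADPCM",
    2: "MP3",
    3: "Linear PCM, little endian",
    4: "Nellymoser 16 kHz mono",
    5: "Nellymoser 8 kHz mono",
    6: "Nellymoser",
    7: "G.711 A-law logarithmic PCM",
    8: "G.711 mu-law logarithmic PCM",
    9: "reserved",
    10: "AAC",
    11: "Speex",
    14: "MP3 8 kHz",
    15: "Device-specific sound",
}

# SoundRate field values per the FLV specification (not in the excerpt).
SOUND_RATES = ("5.5 kHz", "11 kHz", "22 kHz", "44 kHz")


def decode_flv_audio_flags(first_byte):
    """Decode the first byte of an FLV audio tag body."""
    sound_format = (first_byte >> 4) & 0x0F  # UB[4]
    return {
        "format_id": sound_format,
        "format": SOUND_FORMATS.get(sound_format, "unknown"),
        "rate": SOUND_RATES[(first_byte >> 2) & 0x03],    # UB[2]
        "bits": 16 if (first_byte >> 1) & 0x01 else 8,    # UB[1]
        "channels": 2 if first_byte & 0x01 else 1,        # UB[1]
    }


flags = decode_flv_audio_flags(0xAF)
print(flags)
```

A first byte of 0xAF decodes to SoundFormat 10, which is the AAC entry in the table.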

How to implement flv -> mp4/ogg live stream transcoding with FMS?

FLV is not directly supported by most mobile browsers,
so I want to convert it to MP4 or Ogg.
Is there any way to do this with FMS, which generates the .flv file from a live webcam stream?
UPDATE
I found a similar question here which partly does the job:
ffmpeg -i input.flv output.mp4
But I need streaming
I assume you mean Ogg Vorbis audio with AVC/H.264 video in an FLV container? If so, the only problem is that Flash Player does not support Vorbis playback, and there is no codec ID for it in the FLV specification. There is, however, an Alchemy plugin that decodes Ogg, but it is not meant for streaming from FMS and certainly not within FLV. Info on the Flash/Ogg decoder:
http://www.hydrogenaudio.org/forums/lofiversion/index.php/t66269.html
Media types for FLV may be found here, as well as other useful information:
http://en.wikipedia.org/wiki/Flash_Video
Summary:
Supported media types in FLV file format
Video: On2 VP6, Sorenson Spark (Sorenson H.263), Screen video, H.264
Audio: MP3, ADPCM, Linear PCM, Nellymoser, Speex, AAC, G.711 (reserved for internal use)
Supported media types in F4V file format
Video: H.264
Images (still frame of video data): GIF, PNG, JPEG
Audio: AAC, HE-AAC, MP3
By the way, I found your question because I am implementing Ogg/Ogv streaming in Red5 (http://code.google.com/p/red5) for HTML5 and Unity.

Resources