HLS FLAC Audio Stream - http-live-streaming

I have a FLAC audio file (24-bit/192 kHz) from which I want to create an HLS-packaged adaptive bitrate stream, with the highest-quality rendition being the input format, FLAC (24-bit/192 kHz), and the lower renditions being AAC at different bit rates.
I can do this with AWS MediaConvert or AWS Elastic Transcoder as far as the AAC outputs are concerned, but as far as I can see neither supports creating the FLAC output.
Is there a reason I shouldn't be trying to do this? Assuming it is a perfectly valid objective, is there another tool/service to do the job, or do I need to code something up myself around ffmpeg?
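For reference, here is a rough sketch of what "coding something up around ffmpeg" could look like, assuming an ffmpeg build whose HLS muxer supports fMP4 segments (FLAC cannot be carried in MPEG-TS segments). The bitrates, segment length and output names are placeholders, and FLAC-in-MP4 may still require the -strict experimental flag depending on the ffmpeg version:

# Untested sketch: one FLAC rendition plus two AAC renditions in a single HLS package.
ffmpeg -i input.flac \
  -map 0:a -map 0:a -map 0:a \
  -c:a:0 flac -c:a:1 aac -b:a:1 256k -c:a:2 aac -b:a:2 128k \
  -strict -2 \
  -f hls -hls_segment_type fmp4 -hls_time 6 -hls_playlist_type vod \
  -var_stream_map "a:0 a:1 a:2" \
  -master_pl_name master.m3u8 \
  stream_%v.m3u8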

We are currently reviewing FLAC as a roadmap feature and intend to include it in an upcoming MediaConvert release. Please check out the what's new post for released features in MediaConvert.
https://aws.amazon.com/about-aws/whats-new/media-services/?whats-new-content.sort-by=item.additionalFields.postDateTime&whats-new-content.sort-order=desc&awsf.whats-new-products=general-products%23aws-elemental-mediaconvert
Regards

Related

How can I apply audio compression to an MP4 file?

I am using moviepy to generate MP4 files from sets of shorter clips, each with its own audio. The problem is that the resulting MP4 often has a very wide dynamic range from one clip to the next, and I would like to apply audio compression to make it easier on the ears. On Google I can only find results about audio data compression (reducing file size), not about dynamic range compression in the audio-engineering sense.
I would like to know if there is some way of doing this with moviepy, or with some other library. I have no issue with invoking (non-interactive) command-line utilities either.
Thank you.
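One option along those lines (not moviepy itself, just a sketch that runs the finished file through ffmpeg, with placeholder filenames): apply ffmpeg's acompressor or loudnorm audio filter while copying the video stream untouched.

# Dynamic range compression with ffmpeg's acompressor filter (default settings); only the audio is re-encoded.
ffmpeg -i moviepy_output.mp4 -c:v copy -af acompressor -c:a aac compressed.mp4
# Alternative: EBU R128 loudness normalization, which evens out loudness between clips.
ffmpeg -i moviepy_output.mp4 -c:v copy -af loudnorm=I=-16:TP=-1.5:LRA=11 -c:a aac normalized.mp4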

How to combine JPEG frames with uncompressed mono audio into an H.264 stream, or any other format web browsers can play out of the box?

So I have an ESP32 which captures images and sound. The esp32-camera library already returns a JPEG-encoded buffer. The audio, however, is uncompressed: just raw samples of the signal level at a high sample rate.
I use the ESP32 to host a webpage which contains an <img> element and a JavaScript snippet that constantly sends GET requests for image data and updates the element. This approach is not very good, especially now that I've added audio capabilities to the circuit.
I'm curious whether it would be possible to combine the JPEG-encoded frames and the audio data into a chunk of H.264 and then send it directly as the response to a GET request, making it a stream.
This would not only simplify serving the webpage, but also remove the issue of syncing the audio and video when they are sent separately.
In particular, I'm curious how easy it would be to do on the ESP32, since it doesn't have much RAM or computational power. It would also be challenging to find or port large libraries that could help, so I guess I would have to code it myself.
I'm also not sure whether H.264 is the best option. I know it's supported in most browsers out of the box and uses JPEG compression behind the scenes for the frames, but perhaps a simpler format exists that is also widely supported.
So, to sum it up: is H.264 the best bet in this context? Is combining JPEG and uncompressed mono audio into H.264 possible in this context? If the answer to either question is no, what alternatives do I have, if any?
I'm curious whether it would be possible to combine the JPEG-encoded frames and the audio data into a chunk of H.264 and then send it directly as the response to a GET request, making it a stream.
H.264 is a video codec. It doesn't have anything to do with audio.
I know it's supported in most browsers out of the box and uses JPEG compression behind the scenes for the frames
No, this isn't true. H.264 is its own thing. It's far more powerful than JPEG and is specifically designed for motion, whereas JPEG was not.
You need a few things:
A video codec, to efficiently handle your frames. Most of these embedded camera libraries can give you an MJPEG stream. I'd use that if possible. I don't think your ESP32 has other video encoding capability, does it? H.264 is a good choice, but only if you can actually encode it.
A container format, to aid in streaming your audio and video streams together. ISOBMFF/MP4 is common, as is WebM/Matroska.
If you're only streaming to a single client (which seems likely given the limited horsepower of the board), and if you have enough capability to do the audio/video encoding, you can generate a WebM stream on the fly that is directly playable in a <video> element. This seems to be exactly what you are asking for; a rough sketch follows.
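As an illustration of the container/codec side only (this is ffmpeg running on a PC, not on the ESP32, and the filenames, frame rate and sample format below are assumptions), JPEG frames plus raw 16-bit mono PCM can be turned into a browser-playable WebM stream like this:

# Sketch: re-encode the JPEG frames to VP8 and the raw PCM to Opus, writing a streamable WebM container to stdout.
ffmpeg -f mjpeg -framerate 10 -i frames.mjpeg \
       -f s16le -ar 16000 -ac 1 -i mic.pcm \
       -c:v libvpx -b:v 500k -deadline realtime -c:a libopus -f webm pipe:1

Doing the equivalent on the ESP32 itself would mean implementing (or porting) both the video encoding and the muxing, which is the hard part.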

How to merge several audio files using Libav API?

Currently, I am implementing a new feature in my software using the Libav API. This is the requirement: merge a list of audio files (MP3 and WAV) and create a single audio file (MP3) as output. Note: the challenge is not about concatenating files, but merging (mixing) them. When the output is played, all the input audio content must be heard at the same time, as when you mix several tracks in a video editor.
I have been researching Libav audio streams, and I am only guessing that my requirement is related to the "channels" concept, i.e. that it is possible to include several audio sources in the stream, using one channel per source or something like that. I was hoping to find more information about this topic, but the FFmpeg/Libav documentation is rather scarce.
Right now, I am able to add several audio streams to a video stream successfully and create a playable MP4 file. My problem is that players like MPlayer/VLC only play the first audio stream along with the video; the other two audio streams are ignored.
I was looking at the set of examples included in the FFmpeg source code, but there is nothing specifically related to this requirement, so I would appreciate any source code reference or explanation of the algorithm for merging several audio files into one using libav. Thanks.
Update:
The ffmpeg command to merge several audio files requires the "amix" filter, as in this example:
ffmpeg -i 1.mp3 -i 2.mp3 -i 3.mp3 -filter_complex amix=inputs=3:duration=first result.mp3
All the syntax related to this filter is described in the FFmpeg documentation.
Checking the FFmpeg source code, it seems the amix implementation is in the file af_amix.c.
I am not 100% sure, but it seems the general algorithm is described in the function:
static int activate(AVFilterContext *ctx)
Do you know how to merge several audio files using command-line ffmpeg? It would help to first understand how to do it with the ffmpeg command and then reverse-engineer how it achieves that. It's all about how to construct a filtergraph and pass data through it.
As for examples, check out examples/filter_audio.c and examples/filtering_audio.c
This C example takes two WAV audio files and merges them to generate a new WAV file using the ffmpeg 4.4 API. Tip: the key to the process is to use these filters: abuffer, amix and abuffersink.
https://github.com/xtingray/audio_mixer/
Although it doesn't support MP3 as the output format, it gives you the basics you need to implement your own requirements. I hope it is handy for anyone looking for references on this specific topic.
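For anyone hitting the same "players only play the first audio stream" issue mentioned above, the difference can be shown on the command line (filenames are placeholders): mapping copies each input as a separate track, while amix decodes and mixes everything into one track.

# Three separate audio tracks in one MP4: players play only one of them (the default).
ffmpeg -i video.mp4 -i a1.mp3 -i a2.mp3 -map 0:v -map 1:a -map 2:a -c:v copy -c:a aac multitrack.mp4
# One mixed track: all inputs are audible at the same time.
ffmpeg -i a1.mp3 -i a2.mp3 -filter_complex "[0:a][1:a]amix=inputs=2:duration=longest[m]" -map "[m]" mixed.mp3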

Why should a video uploaded to Azure Media Service be encoded?

I have recorded a video on my phone, and I don't get why it needs to be encoded at all. Doesn't the format persist? Maybe I'm missing the point of encoding here. After recording, is it not already in a format that is viewable to users?
It's a valid question: you could just upload the existing MP4 file that was encoded on your phone and stream it as a single-bitrate HLS- or DASH-packaged file.
However, most users of our service prefer that the uploaded MP4 file is first encoded to multiple bitrates and resolutions to allow for Adaptive Bitrate Streaming.
If you are not familiar with what Adaptive Streaming is or how it works, I recommend watching a few of these - https://www.youtube.com/results?search_query=Adaptive+bitrate+streaming+overview
Or read through this article
https://en.wikipedia.org/wiki/Adaptive_bitrate_streaming
We have two types of encoding presets to enable this: one called Adaptive Streaming, which generates a fixed "ladder" of bitrates and qualities, and one called Content Aware Encoding, which analyzes your video and generates the best set of tracks and bitrates for that content.
https://learn.microsoft.com/en-us/azure/media-services/latest/content-aware-encoding
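For reference, a sketch of creating a Transform with either preset using the Azure CLI (the account, resource group and transform names are placeholders):

# Fixed ladder of bitrates/resolutions
az ams transform create -a myAmsAccount -g myResourceGroup -n myAbrTransform --preset AdaptiveStreaming
# Content-aware encoding: analyzes the source and chooses the ladder for it
az ams transform create -a myAmsAccount -g myResourceGroup -n myCaeTransform --preset ContentAwareEncoding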
Thanks,
John D.

How do I create an mp4 file from a collection of H.264 frames and audio frames?

I have a program that captures H.264-encoded video as well as audio and stores them in a proprietary file format. I need to be able to export that video and audio to an MP4 file. I prefer C# but will use C++ if necessary. Any suggestions?
To produce an MPEG-4 Part 14 (.mp4) file you need a multiplexer. There is a choice of multiplexers out there:
FFmpeg (libavformat)
DirectShow filters (free and open source from GDCL, commercial)
Windows 7+ Media Foundation file sink
API and complexity vary because some of the multiplexers are expected to be part of a pipeline rather than being completely standalone classes. You might want to check the respective samples (and, perhaps, the license agreements too) to see what is best for you.
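If invoking an external tool from C# or C++ is acceptable, the remux itself can be a single FFmpeg command. A sketch, assuming the proprietary file can be unpacked into a raw H.264 elementary stream and an AAC (ADTS) track, with placeholder filenames and frame rate:

# Remux without re-encoding: wrap the raw H.264 stream and the audio into an MP4 container.
# Older ffmpeg versions may need -bsf:a aac_adtstoasc for the ADTS audio.
ffmpeg -framerate 30 -i captured.h264 -i captured.aac -c copy -movflags +faststart output.mp4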
Take a look at libmp4v2. It is fairly straightforward to use.
http://code.google.com/p/mp4v2/
