How do you encode raw pcm_f32le audio to AAC encoded audio with FFmpeg (C/C++)?

I am trying to encode raw audio (pcm_f32le) to AAC encoded audio. One thing I've noticed is that I can accomplish this via the CLI tool:
ffmpeg -f f32le -ar 48000 -ac 2 -c:a pcm_f32le -i out.raw out.m4a -y
This plays just fine and decodes fine.
The steps I've taken:
I took the C example code (https://ffmpeg.org/doxygen/3.4/encode_audio_8c-example.html) and switched the encoder to codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
Printing the sample formats the AAC encoder supports shows only FLTP, which is a planar (non-interleaved) format (see the sketch below).
This page seems to list the supported input formats per codec.
This is confusing because I don't think my raw captured audio is in that planar layout. I've certainly tried passing it through unchanged and it doesn't work as intended.
It stays stuck indefinitely, returning this code on every call to avcodec_receive_packet:
AVERROR(EAGAIN): output is not available in the current state - user must try to send input
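For reference, the FLTP-only listing mentioned above comes from the encoder's sample_fmts array; here is a minimal sketch of how to print it, against the FFmpeg 3.x/4.x C API that the linked example uses:

/* Sketch: list the sample formats an encoder accepts (FFmpeg 3.x/4.x API). */
#include <libavcodec/avcodec.h>
#include <libavutil/samplefmt.h>
#include <stdio.h>

int main(void)
{
#if LIBAVCODEC_VERSION_MAJOR < 58
    avcodec_register_all();                      /* required before FFmpeg 4.0 */
#endif
    const AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
    /* sample_fmts is an AV_SAMPLE_FMT_NONE-terminated array of supported formats */
    for (const enum AVSampleFormat *p = codec->sample_fmts;
         p && *p != AV_SAMPLE_FMT_NONE; p++)
        printf("%s\n", av_get_sample_fmt_name(*p));  /* prints "fltp" for the native AAC encoder */
    return 0;
}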
Questions:
How can I modify the example code from FFmpeg to convert pcm_f32le raw audio to AAC encoded audio?
Why is the CLI tool able to do this?
I am using libsoundio to capture raw audio from Linux's Dummy Output. I wonder how I could get a planar format to pass through in order to get AAC encoded audio.
If AAC is not a possibility, is it possible with MP3?

Find here a working example of how to encode raw pcm_f32le to AAC with FFmpeg.
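The link originally posted here isn't preserved in this page, so below is a minimal sketch of the approach rather than the original code. It assumes FFmpeg 3.x/4.x, 48 kHz stereo input, the native AAC encoder, and placeholder file names (out.raw, out.aac). The key step is libswresample: the captured buffer is interleaved floats (AV_SAMPLE_FMT_FLT, which is what pcm_f32le is), and swr_convert() repacks each 1024-sample chunk into the planar FLTP layout the encoder requires; the CLI tool succeeds because it auto-inserts exactly this kind of conversion between decoder and encoder. AVERROR(EAGAIN) from avcodec_receive_packet() is not an error, it just means the encoder wants more frames, which the send/receive loop below handles. Note the packets written here are raw AAC without ADTS framing; a playable .m4a additionally needs libavformat muxing, omitted for brevity. The same conversion works for MP3 via libmp3lame, which also accepts only planar input.

/* Sketch: interleaved pcm_f32le -> AAC with libavcodec + libswresample.
 * FFmpeg 3.x/4.x API; most error handling omitted for brevity. */
#include <libavcodec/avcodec.h>
#include <libswresample/swresample.h>
#include <libavutil/channel_layout.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
#if LIBAVCODEC_VERSION_MAJOR < 58
    avcodec_register_all();                        /* required before FFmpeg 4.0 */
#endif
    const AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
    AVCodecContext *ctx  = avcodec_alloc_context3(codec);
    ctx->sample_fmt      = AV_SAMPLE_FMT_FLTP;     /* the only format the native encoder accepts */
    ctx->sample_rate     = 48000;
    ctx->channel_layout  = AV_CH_LAYOUT_STEREO;
    ctx->channels        = 2;
    ctx->bit_rate        = 128000;
    ctx->time_base       = (AVRational){1, 48000};
    if (avcodec_open2(ctx, codec, NULL) < 0)
        return 1;

    /* Convert FLT (interleaved, as captured) to FLTP (planar, as required). */
    SwrContext *swr = swr_alloc_set_opts(NULL,
        AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_FLTP, 48000,   /* output */
        AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_FLT,  48000,   /* input  */
        0, NULL);
    swr_init(swr);

    AVFrame *frame        = av_frame_alloc();
    frame->nb_samples     = ctx->frame_size;       /* 1024 samples per AAC frame */
    frame->format         = ctx->sample_fmt;
    frame->channel_layout = ctx->channel_layout;
    av_frame_get_buffer(frame, 0);

    AVPacket *pkt = av_packet_alloc();
    FILE *fin  = fopen("out.raw", "rb");           /* placeholder: raw pcm_f32le capture */
    FILE *fout = fopen("out.aac", "wb");           /* raw AAC packets, no container */
    float *buf = malloc(sizeof(float) * 2 * ctx->frame_size);
    int64_t pts = 0;

    while (fread(buf, sizeof(float) * 2, ctx->frame_size, fin) == (size_t)ctx->frame_size) {
        const uint8_t *in[] = { (const uint8_t *)buf };
        av_frame_make_writable(frame);
        swr_convert(swr, frame->extended_data, frame->nb_samples, in, frame->nb_samples);
        frame->pts = pts;
        pts += frame->nb_samples;
        if (avcodec_send_frame(ctx, frame) < 0)
            break;
        /* EAGAIN from avcodec_receive_packet() just ends this inner loop;
         * we go back and send more input, exactly as the API expects. */
        while (avcodec_receive_packet(ctx, pkt) == 0) {
            fwrite(pkt->data, 1, pkt->size, fout);
            av_packet_unref(pkt);
        }
    }

    avcodec_send_frame(ctx, NULL);                 /* flush delayed packets */
    while (avcodec_receive_packet(ctx, pkt) == 0) {
        fwrite(pkt->data, 1, pkt->size, fout);
        av_packet_unref(pkt);
    }

    fclose(fin); fclose(fout); free(buf);
    swr_free(&swr);
    av_frame_free(&frame);
    av_packet_free(&pkt);
    avcodec_free_context(&ctx);
    return 0;
}

Compile with something like gcc enc.c $(pkg-config --cflags --libs libavcodec libswresample libavutil).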

Related

ffmpeg audio encoding based on codec and not on stream identifier

I have an RTSP stream with one video stream and three audio streams as the source. Two of the audio streams are encoded with .mp2 and one is encoded with .ac-3. I want to convert the .mp2 streams to AAC. This would be easy if the .mp2 streams had the same stream identifier every time I start ffmpeg, but unfortunately the stream identifiers change. This means sometimes the two .mp2 streams are 0:a:0 and 0:a:1 and the next time they are 0:a:1 and 0:a:2.
Is there an option to re-encode only the .mp2 streams and keep the .ac-3 stream untouched?
I should probably also mention that this encoding is used for live TV so it is not an option to produce intermediate files or have several ffmpeg commands.
Try
ffprobe -show_entries stream_tags -select_streams a INPUT_URL
and see if there are any stream tags (metadata) that distinguishes mp2 streams. Then you can use the metadata stream specifier to selectively set re-encoding:
ffmpeg ... -c copy -c:a:m:{name}:{value} aac ...
where {name} and {value} are the name and value of the tag, respectively.
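For example, assuming (hypothetically) that ffprobe shows a language tag with value eng only on the mp2 streams, the command would be:
ffmpeg -i INPUT_URL -map 0 -c copy -c:a:m:language:eng aac OUTPUT_URL
Here -map 0 keeps all streams, -c copy leaves everything untouched, and only the streams matching the tag are re-encoded to AAC.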
Reference on stream specifier: https://ffmpeg.org/ffmpeg.html#Stream-specifiers-1
If there isn't any usable tag, your only solution is likely to run ffprobe first to identify the stream numbers before running ffmpeg.

Dealing with problems in FLAC audio files with ffmpeg

I have gotten a set of FLAC (audio) files from a friend. I copied them to my Sonos music library, and got set to enjoy a nice album. Unfortunately, Sonos would not play the files. As a result I have been getting to know ffmpeg.
Sonos' complaint with the FLAC files was that it was "encoded at an unsupported sample rate". With rolling eyes and shaking head, I note that the free VLC media player happily plays these files, but the product I've paid for (Sonos) - does not. But I digress...
ffprobe revealed that the FLAC files contain both an Audio channel and a Video channel:
$ ffprobe -hide_banner -show_streams "/path/to/Myaudio.flac"
Duration: 00:02:23.17, start: 0.000000, bitrate: 6176 kb/s
Stream #0:0: Audio: flac, 176400 Hz, stereo, s32 (24 bit)
Stream #0:1: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 450x446 [SAR 72:72 DAR 225:223], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Cover (front)
Cool! I guess this is how some audio players are able to display the 'album artwork' when they play a song? Note also that the Audio stream is reported at 176400 Hz! Apparently I'm out of touch; I thought that 44.1khz sampling rate effectively removed all of the 'sampling artifacts' we could hear. Anyway, I learned that Sonos would support a max of 48kHz sampling rate, and this (the 176.4kHz rate) is what Sonos was unhappy about. I used ffmpeg to 'dumb it down' for them:
$ ffmpeg -i "/path/to/Myaudio.flac" -sample_fmt s32 -ar 48000 "/path/to/Myaudio48K.flac"
This seemed to work - at least I got a FLAC file that Sonos would play. However, I also got what looks like a warning of some sort:
[swscaler @ 0x108e0d000] deprecated pixel format used, make sure you did set range correctly
[flac @ 0x7feefd812a00] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
A bit more research turned up this answer, which I don't quite understand, and a comment on it that says "not to worry" - at least with respect to the swscaler part of the warning.
And that (finally) brings me to my questions:
1.a. What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
1.b. How should I use ffmpeg to modify the Video channel to set these specifications (ref. Q 1.a.)?
2.a. How do I remove the Video channel from the .flac audio file?
2.b. How do I add a Video channel into a .flac file?
EDIT:
I asked the above (4) questions after failing to accomplish a 'direct' conversion (a single ffmpeg command) from FLAC at 176.4 kHz to ALAC (.m4a) at 48 kHz (max supported by Sonos). I reasoned that an 'incremental' approach through a series of conversions might get me there. With the advantage of hindsight, I now see I should have posted my original failed direct conversion incantation... we live and learn.
That said, the accepted answer below meets my final objective to convert a FLAC file encoded at 176.4kHz to an ALAC (.m4a) at 48kHz, and preserve the cover art/video channel.
What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
Cover art is just a single frame, so framerate has no relevance in this case. However, you don't want a video stream; it has to remain a single image, so -vsync 0 should be added. Muxer is simply the specific term for the packager used in media file processing; it is decided by the choice of format, e.g. FLAC, WAV, etc. What's important is the codec for the cover art; usually it's PNG or JPEG. For FLAC, PNG is the default codec.
How do I remove the Video channel from the .flac audio file?
ffmpeg -i "/path/to/Myaudio.flac" -vn -c copy "/path/to/Myaudio48K.flac"
(All this does is skip any video in the input and copy everything else)
How do I add a Video channel into a .flac file?
To add cover art to audio-only formats like MP3 and FLAC, the video stream has to have a disposition of attached picture. So,
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -sample_fmt s32 -ar 48000 -disposition:v attached_pic -vsync 0 "/path/to/Myaudio48K.flac"
For direct conversion to ALAC, use
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -ar 48000 -c:a alac -disposition:v attached_pic -vsync 0 -c:v png "/path/to/Myaudio48K.m4a"
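To confirm the result, ffprobe on the output should now show an alac audio stream at 48000 Hz plus a png video stream with the attached_pic disposition:
ffprobe -hide_banner -show_streams "/path/to/Myaudio48K.m4a"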

MPEG-DASH streaming live using encoded stream buffer

I have been trying to implement HTTP live streaming using mpeg-dash but need guidance on some issues.
Provided:
I have encoded audio and video streams in a buffered input.
A direct MPEG-2 transport stream for the above is also available in a buffer.
Current approach:
Save the transport stream into chunks of fixed length.
Use ffmpeg to extract the video stream:
ffmpeg -i latest_chunk.ts -s 720x480 -c:v libx264 -b:v 600k -y -an output_video_stream.mp4
Use ffmpeg to extract the audio stream:
ffmpeg -i latest_chunk.ts -c:a aac -b:a 128k -y -vn output_audio_stream.mp4
Use mp4box to create DASH segments and the mpd:
mp4box -dash 7000 -profile live output_video_stream.mp4 output_audio_stream.mp4 -out manifest.mpd
A server running continuously in another thread serves the generated mpd and segments.
Issues :
The above approach introduces a considerable amount of latency. Can this be done more efficiently? (One possibility is sketched after this question.)
I want to know if there is a method that takes the encoded stream buffers directly as input and produces mpeg-dash segments and the mpd, with the HTTP server doing the rest. If there is, please provide an example.
Also, I provided the length of the transport stream chunks (in seconds) to mp4box via the argument -mpd-refresh 12, but the player only requests the mpd once, plays the segments, and stops. The generated mpd file also lacks the minimumUpdatePeriod attribute:
mp4box -dash 7000 -profile live -mpd-refresh 12 output_video_stream.mp4 output_audio_stream.mp4 -out manifest.mpd
Does MPEG-DASH support MPEG-2 encoded media streams?
Any advice/solution/reference for the same is appreciated.
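Not from the original thread, but on the latency point: one way to cut out the intermediate files is to let a recent ffmpeg (4.1 or newer, which added these dash muxer options) segment the live input directly with its built-in dash muxer, for example:
ffmpeg -i udp://address -map 0:v -map 0:a -c:v libx264 -b:v 600k -s 720x480 -c:a aac -b:a 128k -f dash -seg_duration 7 -window_size 10 -streaming 1 manifest.mpd
This reads the transport stream as it arrives and writes segments plus a continuously updated manifest.mpd in one process, so no fixed-length .ts chunks or separate mp4box step are needed; the HTTP server still just serves the output directory.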

Convert audio to 8-bit signed PCM

I have a .mp4 audio file that I want to convert to an 8-bit unsigned PCM format for an Arduino Uno using the TMRpcm library.
It also could be a .wav file. Anyways, I have tried many things to no avail. The closest I got was with Audacity using the NIST Sphere codec. I tried to do this with FFmpeg, but it only supports demuxing NIST Sphere files. How do I convert audio to this format on Mac OS X (10.10.2)?
avconv is a fork of ffmpeg, so you can use ffmpeg with the same syntax if you wish:
avconv -i input.mp4 -ar 8000 -acodec pcm_u8 -ac 1 output.wav
WAV is a container format around the PCM codec, so if you MUST have bare PCM, open the WAV file in a binary editor (wxHexEditor is a nice one) and delete its first 44 bytes (the header); see also the header-less alternative below.
The above gives you 8000 samples per second, a bit depth of 8 bits, and mono.
verify this using
avprobe some_video_audio_file.wav
see bit depth listing available using avconv here
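An alternative to hex-editing, if the goal is headerless PCM: ffmpeg's u8 raw muxer writes bare unsigned 8-bit samples with no header at all, and the same should work with avconv:
ffmpeg -i input.mp4 -ar 8000 -ac 1 -f u8 output.raw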
I realized that I was trying to convert a corrupt audio file. Audacity converted a valid file correctly.

AAC bitstream not in ADTS format and extradata missing

With FFMPEG, I'm sending a stream from Computer A over to Computer B, via UDP.
This is done over a MPEGTS stream, encoded with libx264 and aac.
Computer B takes this stream with FFMPEG and puts it into an m3u8 playlist.
After a random time (2-35 minutes), the message
[mpegts @ 0533f000] AAC bitstream not in ADTS format and extradata missing
av_interleaved_write_frame(): Invalid data found when processing input
appears.
What I figure is that the receiving FFmpeg can't read the header of the audio part for this particular packet, and since it can't put video and audio together anymore, it stops creating the .ts files and just stops running.
Here's the cmdline of the receiving stream:
ffmpeg -i udp://address -vcodec copy -acodec copy -map 0 -f segment -segment_list playlist.m3u8 -analyzeduration 100000 -probesize 100000 -segment_list_flags +live-cache -segment_time 8 -segment_wrap 10 out%03d.ts
Now I need to know the answer to either one of these 2 questions:
1) Can I put something in my command line to avoid this particular problem, or
2) Can I tell FFmpeg to just ignore this particular message, quite possibly creating weird audio or none at all, and simply move on to the next one?
