Add running text to video stream with vdpau - linux

I have a task: add running text to a video stream (or file) as it is received. The video needs to play on a Cubieboard running Armbian. I tested with mpv and the --hwdec=vdpau flag, and the video runs smoother than without it. To add the running text I tried the lavfi drawtext filter, but when I use it, mpv falls back to software decoding and visible lag appears. Here is one of the commands I tried:
mpv --hwdec=vdpau Videos/VID* -vf lavfi=[drawtext=fontsize=40:fontcolor=yellow:x=w-50*t:y=h/2:textfile=livetext.txt:reload=1]
Here is the output of that command with --msg-level=vd=v. It was captured on my desktop PC; on the Cubieboard it additionally warns about audio/video desync:
Playing: Videos/VID_20180129_120726.mp4
(+) Video --vid=1 (*) (h264 1080x1920 30.000fps)
(+) Audio --aid=1 --alang=eng (*) (aac 2ch 44100Hz)
[vd] Container reported FPS: 30.000000
[vd] Codec list:
[vd] h264 - H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
[vd] h264_crystalhd (h264) - CrystalHD H264 decoder
[vd] h264_cuvid (h264) - Nvidia CUVID H264 decoder
[vd] Opening video decoder h264
[vd] Probing 'vdpau'...
[vd] Trying hardware decoding.
[vd] Selected video codec: h264 (H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10)
Opening video filter: [lavfi graph=drawtext=fontsize=40:fontcolor=yellow:x=w-50*t:y=h/2:textfile=/opt/mpv-text/livetext.txt:reload=1]
[vd] Pixel formats supported by decoder: vdpau vaapi_vld yuv420p
[vd] Codec profile: High (0x64)
[vd] Requesting pixfmt 'vdpau' from decoder.
Using hardware decoding (vdpau).
[vd] Decoder format: 1080x1920 vdpau[yuv420p] bt.709/bt.709/bt.1886/limited CL=mpeg2/4/h264
[ffmpeg] Impossible to convert between the formats supported by the filter 'src' and the filter 'auto_scaler_0'
[lavfi] Can't configure libavfilter graph.
Video filter chain:
[in] 1080x1920 vdpau[yuv420p] bt.709/bt.709/bt.1886/limited SP=1.000000 CL=mpeg2/4/h264
[lavfi] "lavfi.00" 1080x1920 vdpau[yuv420p] bt.709/bt.709/bt.1886/limited SP=1.000000 CL=mpeg2/4/h264 <---
[out] ???
Falling back to software decoding.
[vd] Detected 8 logical cores.
[vd] Requesting 9 threads for decoding.
AO: [pulse] 44100Hz stereo 2ch float
[vd] Decoder format: 1080x1920 yuv420p bt.709/bt.709/bt.1886/limited CL=mpeg2/4/h264
VO: [opengl] 1080x1920 yuv420p
AV: 00:00:03 / 00:00:36 (9%) A-V: 0.000
[vd] Uninit video.
After a long search I doubt that hardware decoding is possible together with this filter in mpv. If so, could you suggest other tools to achieve this? I am new to this area, and maybe there is a more efficient way to add running text to a video. Thanks.

I found a subtitle format that can move text (ASS), so no video filters are needed. The \move(x1,y1,x2,y2) tag does exactly that; the tag reference is here:
http://docs.aegisub.org/3.2/ASS_Tags/
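For illustration, here is a minimal ticker.ass sketch (the file name, style values and timings are my own example, not from the question) that scrolls yellow text from the right edge to past the left edge of the portrait 1080x1920 video over 30 seconds:

[Script Info]
ScriptType: v4.00+
PlayResX: 1080
PlayResY: 1920

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Ticker,Arial,40,&H0000FFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,5,0,0,0,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:30.00,Ticker,,0,0,0,,{\move(1080,960,-600,960)}Text from livetext.txt goes here

mpv renders such subtitles with libass as an overlay on top of the decoded frames, so unlike a lavfi video filter this should not force a fallback to software decoding:
mpv --hwdec=vdpau --sub-file=ticker.ass Videos/VID_20180129_120726.mp4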

Related

Dealing with problems in FLAC audio files with ffmpeg

I have gotten a set of FLAC (audio) files from a friend. I copied them to my Sonos music library, and got set to enjoy a nice album. Unfortunately, Sonos would not play the files. As a result I have been getting to know ffmpeg.
Sonos' complaint about the FLAC files was that they were "encoded at an unsupported sample rate". With rolling eyes and a shaking head, I note that the free VLC media player happily plays these files, but the product I've paid for (Sonos) does not. But I digress...
ffprobe revealed that the FLAC files contain both an Audio channel and a Video channel:
$ ffprobe -hide_banner -show_streams "/path/to/Myaudio.flac"
Duration: 00:02:23.17, start: 0.000000, bitrate: 6176 kb/s
Stream #0:0: Audio: flac, 176400 Hz, stereo, s32 (24 bit)
Stream #0:1: Video: mjpeg (Progressive), yuvj444p(pc, bt470bg/unknown/unknown), 450x446 [SAR 72:72 DAR 225:223], 90k tbr, 90k tbn, 90k tbc (attached pic)
Metadata:
comment : Cover (front)
Cool! I guess this is how some audio players are able to display the 'album artwork' when they play a song? Note also that the Audio stream is reported at 176400 Hz! Apparently I'm out of touch; I thought that 44.1khz sampling rate effectively removed all of the 'sampling artifacts' we could hear. Anyway, I learned that Sonos would support a max of 48kHz sampling rate, and this (the 176.4kHz rate) is what Sonos was unhappy about. I used ffmpeg to 'dumb it down' for them:
$ ffmpeg -i "/path/to/Myaudio.flac" -sample_fmt s32 -ar 48000 "/path/to/Myaudio48K.flac"
This seemed to work - at least I got a FLAC file that Sonos would play. However, I also got what looks like a warning of some sort:
[swscaler @ 0x108e0d000] deprecated pixel format used, make sure you did set range correctly
[flac @ 0x7feefd812a00] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
A bit more research turned up this answer, which I don't quite understand, and which then says in a comment "not to worry" - at least wrt the swscaler part of the warning.
And that (finally) brings me to my questions:
1.a. What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
1.b. How should I use ffmpeg to modify the Video channel to set these specifications (ref. Q 1.a.)?
2.a. How do I remove the Video channel from the .flac audio file?
2.b. How do I add a Video channel into a .flac file?
EDIT:
I asked the above (4) questions after failing to accomplish a 'direct' conversion (a single ffmpeg command) from FLAC at 176.4 kHz to ALAC (.m4a) at 48 kHz (max supported by Sonos). I reasoned that an 'incremental' approach through a series of conversions might get me there. With the advantage of hindsight, I now see I should have posted my original failed direct conversion incantation... we live and learn.
That said, the accepted answer below meets my final objective to convert a FLAC file encoded at 176.4kHz to an ALAC (.m4a) at 48kHz, and preserve the cover art/video channel.
What framerate, muxer & other specifications make a graphic compatible with a majority of programs that use the graphic?
Cover art is just a single frame, so framerate has no relevance in this case. However, you don't want a video stream; it has to remain a single image, so -vsync 0 should be added. "Muxer" is simply the media-processing term for the packager; it is decided by the choice of container format, e.g. FLAC, WAV, etc. What's important is the codec for the cover art; usually it's PNG or JPEG. For FLAC, PNG is the default codec.
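For example, one way to check which codec the embedded cover art actually uses is ffprobe's stream selector (standard ffprobe options; adjust the path as needed):
ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of default=noprint_wrappers=1 "/path/to/Myaudio.flac"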
How do I remove the Video channel from the .flac audio file?
ffmpeg -i "/path/to/Myaudio.flac" -vn -c copy "/path/to/Myaudio48K.flac"
(All this does is skip any video in the input and copy everything else)
How do I add a Video channel into a .flac file?
To add cover art to audio-only formats, like MP3, FLAC, etc., the video stream has to have a disposition of attached picture. So,
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -sample_fmt s32 -ar 48000 -disposition:v attached_pic -vsync 0 "/path/to/Myaudio48K.flac"
For direct conversion to ALAC, use
ffmpeg -i "/path/to/Myaudio.flac" -i coverimage -ar 48000 -c:a alac -disposition:v attached_pic -vsync 0 -c:v png "/path/to/Myaudio48K.m4a"

Incorporate HEv2 AAC into an MPEG-TS for HLS content

I am trying to find any info about AAC HEv2 (PS) in an MPEG Transport Stream (TS) for HLS.
According to the HLS Authoring Specification for Apple Devices, AAC HEv2 is a supported format. AAC HEv2 is part of MPEG-4, but I cannot understand how HEv2 could fit into a transport stream.
SBR (HEv1) can be carried in a TS with implicit signaling. In the case of MP4 we have an AudioSpecificConfig. But how can I multiplex AAC Parametric Stereo into the TS?
Is it available or not?
I cannot find any info on the Apple site or elsewhere.
There are two ways to put AAC into a transport stream.
1. Using ADTS syntax (MPEG-2 style). In that case the PMT's stream_type should be specified as 0x0F (ISO/IEC 13818-7 Audio with ADTS transport syntax). You are then limited to the "old" (MPEG-2) AAC versions only, without SBR and PS.
2. Using LATM+LOAS/AudioSyncStream syntax (MPEG-4 style). In that case the PMT's stream_type should be specified as 0x11 (ISO/IEC 14496-3 Audio with the LATM transport syntax). With it you can use the full force of the "new" (MPEG-4) AAC features, including SBR and PS.
Furthermore, the DVB standard ETSI TS 101 154 demands that HEv1/HEv2 AAC shall be transmitted using the LATM syntax.
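If FFmpeg happens to be the muxer, here is a quick sketch of the two variants (file names are placeholders; -mpegts_flags latm is the documented option for LATM packetization in recent FFmpeg builds):
ffmpeg -i hev2_audio.m4a -c:a copy -f mpegts adts_default.ts
ffmpeg -i hev2_audio.m4a -c:a copy -f mpegts -mpegts_flags latm latm_signaled.ts
The first command uses the default ADTS packetization (stream_type 0x0F); the second switches to LATM/LOAS (stream_type 0x11), which is what HEv2 signaling needs.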

audio sample format s16p, ffmpeg or audio codec bug?

I have a video file, and I dumped its stream info to a txt file with ffmpeg nearly 3 years ago.
...
Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16, 256 kb/s
Stream #0:2[0x1c1]: Audio: mp2, 48000 Hz, stereo, s16, 256 kb/s
But I found that the format changed when I used an updated ffprobe (ffprobe version N-78046-g46f67f4 Copyright (c) 2007-2016 the FFmpeg developers).
...
Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16p, 256 kb/s
Stream #0:2[0x1c1]: Audio: mp2, 48000 Hz, stereo, s16p, 256 kb/s
For the same video, the sample format now shows as s16p.
I implemented a simple video player that uses ffmpeg. It could play the video 3 years ago, but it failed to output a correct PCM stream after switching to the updated ffmpeg. I spent a lot of time on it and finally found that the audio should have been s16 instead of s16p. The decoded audio stream works after I added this line before calling avcodec_decode_audio4,
audio_codec_ctx->sample_fmt = AV_SAMPLE_FMT_S16;
but it is just a hack. Has anyone else encountered this issue? How do I make ffmpeg work correctly? Any hint is appreciated. Thanks!
The output format changed. The reason for this is fairly convoluted and technical, but let me try explaining it anyway.
Most audio codecs are structured such that the output of each channel is best reconstructed individually, and the merging of channels (interleaving of a "left" and "right" buffer into an array of samples ordered left0 right0 left1 right1 [etc]) happens at the very end. You can probably imagine that if the encoder wants to deinterleave again, then transcoding of audio involves two redundant operations (interleaving/deinterleaving). Therefore, all decoders where it makes sense were switched to output planar audio (so s16 changed to s16p, where p means planar), where each channel is its own buffer.
So: nowadays, interleaving is done using a resampling library (libswresample) after decoding instead of as an integral part of decoding, and only if the user explicitly wants to do so, rather than automatically/always.
You can indeed set the request sample format to S16 to force decoding to s16 instead of s16p. Consider this a compatibility hack that will at some point be removed for the few decoders for which it does work, and also one that will not work for new decoders. Instead, consider adding libswresample support to your application to convert between whatever is the native output format of the decoder, and the format you want to use for further data processing (e.g. playback using sound card).
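A minimal sketch of that approach, against the libswresample API of the same vintage as avcodec_decode_audio4 (the helper name frame_to_s16 is mine; error handling is trimmed to early returns):

#include <libavcodec/avcodec.h>
#include <libswresample/swresample.h>
#include <libavutil/samplefmt.h>

/* Interleave one decoded (possibly planar) frame into packed s16.
 * Assumes frame->channel_layout is set (the mp2 decoder sets it).
 * The caller frees *out with av_freep(out). Returns samples per channel, <0 on error. */
static int frame_to_s16(SwrContext **swr, const AVFrame *frame, uint8_t **out)
{
    if (!*swr) { /* lazily create the converter from the decoder's native format */
        *swr = swr_alloc_set_opts(NULL,
                frame->channel_layout, AV_SAMPLE_FMT_S16,              /* output: packed */
                frame->sample_rate,
                frame->channel_layout, (enum AVSampleFormat)frame->format, /* input: as decoded */
                frame->sample_rate,
                0, NULL);
        if (!*swr || swr_init(*swr) < 0)
            return -1;
    }
    int linesize;
    if (av_samples_alloc(out, &linesize, frame->channels, frame->nb_samples,
                         AV_SAMPLE_FMT_S16, 0) < 0)
        return -1;
    /* swr_convert reads the per-channel planes and writes one interleaved buffer */
    return swr_convert(*swr, out, frame->nb_samples,
                       (const uint8_t **)frame->extended_data, frame->nb_samples);
}

Call it on every AVFrame you get back from avcodec_decode_audio4 and feed the returned buffer to the sound card; the same code keeps working for decoders that still output packed s16, since libswresample then just copies the samples through.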

Maximum kbps for AAC audio stream (MPEG-2/4)

What is the absolute maximum bitrate that the standardized MPEG-2 Part 7 AAC (ISO/IEC 13818-7:1997) and MPEG-4 Audio AAC (ISO/IEC 14496-3:1999) can output according to the specifications?

How to implement flv -> mp4/ogg live stream transcoding with FMS?

FLV is not directly supported by most mobile browsers, so I want to convert to the mp4/ogg format.
Is there any way I can achieve this with FMS, which generates the .flv file from a live webcam stream?
UPDATE
I found a similar question here which partly does the job:
ffmpeg -i input.flv output.mp4
But I need streaming.
I assume you mean Ogg Vorbis audio with AVC/H.264 video in an FLV container? If so, the only problem is that the Flash Player does not support Vorbis playback, nor is there a codec ID for it in the FLV specification. There is, however, an Alchemy plugin that does decode Ogg, but it is not for streaming from FMS and certainly not within FLV. Info on the Flash/Ogg decoder:
http://www.hydrogenaudio.org/forums/lofiversion/index.php/t66269.html
Media types for FLV may be found here, as well as other useful information:
http://en.wikipedia.org/wiki/Flash_Video
Summary:
Supported media types in FLV file format
Video: On2 VP6, Sorenson Spark (Sorenson H.263), Screen video, H.264
Audio: MP3, ADPCM, Linear PCM, Nellymoser, Speex, AAC, G.711 (reserved for internal use)
Supported media types in F4V file format
Video: H.264
Images (still frame of video data): GIF, PNG, JPEG
Audio: AAC, HE-AAC, MP3
By the way, I found your question because I am implementing Ogg/Ogv streaming in Red5 (http://code.google.com/p/red5) for HTML5 and Unity.
