I have created a GStreamer pipeline like the one below:
gst-launch-1.0 rtpbin name=rtpbin latency=200 rtp-profile=avpf rtspsrc location="rtsp://url" protocols=2 timeout=50000000 ! decodebin name=decodebin ! audioconvert ! audioresample ! opusenc ! rtpopuspay pt=111 ssrc=11111111 ! rtprtxqueue max-size-time=1000 max-size-packets=0 ! rtpbin.send_rtp_sink_0 rtpbin.send_rtp_src_0 ! udpsink host=10.7.50.43 port=12785 rtpbin.send_rtcp_src_0 ! udpsink host=10.7.50.43 port=12900 sync=false async=false funnel name=rtp_funnell ! udpsink host=10.7.50.43 port=14905 funnel name=rtcp_funnell ! udpsink host=10.7.50.43 port=13285 sync=false async=false decodebin. ! videoconvert ! tee name=video_tee video_tee. ! queue ! videoconvert ! videoscale ! videorate ! video/x-raw , format=I420 , width=320 , height=180 , framerate=24/1 ! x264enc tune=zerolatency speed-preset=9 dct8x8=false bitrate=512 insert-vui=true key-int-max=10 b-adapt=true qp-max=40 qp-min=21 pass=17 ! h264parse ! rtph264pay ssrc=33333333 pt=101 ! rtprtxqueue max-size-time=2000 max-size-packets=100 ! rtpbin.send_rtp_sink_1 rtpbin.send_rtp_src_1 ! rtp_funnell.sink_0 rtpbin.send_rtcp_src_1 ! rtcp_funnell.sink_0 video_tee. ! queue ! videoconvert ! videoscale ! videorate ! video/x-raw , format=I420 , width=640 , height=360 , framerate=24/1 ! x264enc tune=zerolatency speed-preset=9 dct8x8=false bitrate=1024 insert-vui=true key-int-max=10 b-adapt=true qp-max=40 qp-min=21 pass=17 ! h264parse ! rtph264pay ssrc=33333334 pt=101 ! rtprtxqueue max-size-time=2000 max-size-packets=100 ! rtpbin.send_rtp_sink_2 rtpbin.send_rtp_src_2 ! rtp_funnell.sink_1 rtpbin.send_rtcp_src_2 ! rtcp_funnell.sink_1 video_tee. ! queue ! videoconvert ! videoscale ! videorate ! video/x-raw , format=I420 , width=960 , height=540 , framerate=24/1 ! x264enc tune=zerolatency speed-preset=9 dct8x8=false bitrate=2048 insert-vui=true key-int-max=10 b-adapt=true qp-max=40 qp-min=21 pass=17 ! h264parse ! rtph264pay ssrc=33333335 pt=101 ! rtprtxqueue max-size-time=2000 max-size-packets=100 ! 
rtpbin.send_rtp_sink_3 rtpbin.send_rtp_src_3 ! rtp_funnell.sink_2 rtpbin.send_rtcp_src_3 ! rtcp_funnell.sink_2 video_tee. ! queue ! videoconvert ! videoscale ! videorate ! video/x-raw , format=I420 , width=1280 , height=720 , framerate=24/1 ! x264enc tune=zerolatency speed-preset=9 dct8x8=false bitrate=4096 insert-vui=true key-int-max=10 b-adapt=true qp-max=40 qp-min=21 pass=17 ! h264parse ! rtph264pay ssrc=33333336 pt=101 ! rtprtxqueue max-size-time=2000 max-size-packets=100 ! rtpbin.send_rtp_sink_4 rtpbin.send_rtp_src_4 ! rtp_funnell.sink_3 rtpbin.send_rtcp_src_4 ! rtcp_funnell.sink_3
It returns the following warning, and the audio is not transcoded properly:
WARNING: from element /GstPipeline:pipeline0/GstDecodeBin:decodebin: Delayed linking failed.
Additional debug info:
gst/parse/grammar.y(544): gst_parse_no_more_pads (): /GstPipeline:pipeline0/GstDecodeBin:decodebin:
failed delayed linking some pad of GstDecodeBin named decodebin to some pad of GstAudioConvert named audioconvert0
However, if we replace the rtspsrc element with a static source such as
filesrc location="BigBuckBunny.mp4"
it works fine.
The differences between the two sources are as follows:
Working Source
> gst-discoverer-1.0 "BigBuckBunny.mp4"
Analyzing file:BigBuckBunny.mp4
Done discovering file:BigBuckBunny.mp4
Properties:
Duration: 0:09:56.473333333
Seekable: yes
Live: no
container #0: Quicktime
video #1: H.264 (High Profile)
Stream ID: 786017c5b5a8102940e7912e1130363236dc5ce24cb9a0f981d989da87e36cbe/002
Width: 1280
Height: 720
Depth: 24
Frame rate: 24/1
Pixel aspect ratio: 1/1
Interlaced: false
Bitrate: 1991280
Max bitrate: 5372792
audio #2: MPEG-4 AAC
Stream ID: 786017c5b5a8102940e7912e1130363236dc5ce24cb9a0f981d989da87e36cbe/001
Language: <unknown>
Channels: 2 (front-left, front-right)
Sample rate: 44100
Depth: 32
Bitrate: 125488
Max bitrate: 169368
Not Working Source
>gst-discoverer-1.0 rtsp://ip:port
Analyzing rtsp://ip:port
Done discovering rtsp://ip:port
Properties:
Duration: 99:99:99.999999999
Seekable: no
Live: yes
container #0: application/rtsp
unknown #2: application/x-rtp
audio #1: MPEG-4 AAC
Stream ID: 9053093890e06258c9ebd10a484943f40698af07428b21e6d4e07cc150314b0b/audio:0:0:RTP:AVP:97
Language: <unknown>
Channels: 2 (front-left, front-right)
Sample rate: 48000
Depth: 32
Bitrate: 0
Max bitrate: 0
unknown #4: application/x-rtp
video #3: H.264 (Main Profile)
Stream ID: 9053093890e06258c9ebd10a484943f40698af07428b21e6d4e07cc150314b0b/video:0:0:RTP:AVP:96
Width: 1920
Height: 1080
Depth: 24
Frame rate: 60/1
Pixel aspect ratio: 1/1
Interlaced: false
Bitrate: 0
Max bitrate: 0
I have a USB audio-video capture device, something used to digitize video cassettes. I want to record both the video and audio from the device to a video file that has dimensions 720x576 and video codec H.264 and good audio quality.
I am able to record video from the device using ffmpeg, and I am able to see video from the device using MPlayer. I can also see that audio is being delivered from the device to the computer, either by looking at the Input tab of the Sound Preferences window or by recording the audio using Audacity. However, the audio apparently gets delivered only while the video is being accessed using ffmpeg or MPlayer.
I have tried to get ffmpeg to record the audio and MPlayer to play it, but my efforts have not been successful.
The device is "Pinnacle Dazzle DVC 90/100/101" (as returned by v4l2-ctl --list-devices). The sound cards listing shows it as "DVC100":
$ cat /proc/asound/cards
0 [PCH ]: HDA-Intel - HDA Intel PCH
HDA Intel PCH at 0x601d118000 irq 171
1 [DVC100 ]: USB-Audio - DVC100
Pinnacle Systems GmbH DVC100 at usb-0000:00:14.0-4, high speed
29 [ThinkPadEC ]: ThinkPad EC - ThinkPad Console Audio Control
ThinkPad Console Audio Control at EC reg 0x30, fw N2LHT33W
The PulseAudio listing for the device is as follows:
$ pactl list cards short
0 alsa_card.pci-0000_00_1f.3 module-alsa-card.c
14 alsa_card.usb-Pinnacle_Systems_GmbH_DVC100-01 module-alsa-card.c
The following ffmpeg command successfully records video, but records severely distorted, broken and out-of-sync audio:
ffmpeg -y -f rawvideo -f alsa -thread_queue_size 2048 -ar 48000 -i hw:0 \
-c:a aac -video_size 720x576 -pixel_format uyvy422 -i /dev/video2 out.mp4
The following MPlayer command successfully displays the video but does not play the audio:
mplayer -tv driver=v4l2:norm=PAL:device=/dev/video2:width=720:height=576 \
-ao alsa:device=hw=1.0 -vf pp=lb tv://
Now, when the above MPlayer command is running (not the ffmpeg command) and displaying the input video in a window, Audacity can be opened and set to record, and it records the audio from the device clearly and in good quality. While Audacity is doing this, the input device is listed in pavucontrol as "Dazzle DVC Audio Device Analogue Stereo". Equivalently, arecord can also be used to record the audio using the following command (with output shown):
$ arecord -vv -D plughw:DVC100 -fdat out.wav
Recording WAVE 'out.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Plug PCM: Hardware PCM card 1 'DVC100' device 0 subdevice 0
Its setup is:
stream : CAPTURE
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 2
rate : 48000
exact rate : 48000 (48000/1)
msbits : 16
buffer_size : 24000
period_size : 6000
period_time : 125000
tstamp_mode : NONE
tstamp_type : MONOTONIC
period_step : 1
avail_min : 6000
period_event : 0
start_threshold : 1
stop_threshold : 24000
silence_threshold: 0
silence_size : 0
boundary : 6755399441055744000
appl_ptr : 0
hw_ptr : 0
Looking at the output of arecord -L, I tried a variety of audio device input names with ffmpeg and none of them seemed to work. So, for example, I tried commands like the following:
ffmpeg -y -f rawvideo -f alsa -i plughw:DVC100 \
-video_size 720x576 -pixel_format uyvy422 -i /dev/video2 out.mp4
And tried the following audio device names:
plughw:DVC100
plughw:CARD=DVC100,DEV=0
hw:CARD=DVC100,DEV=0
plughw:CARD=DVC100
sysdefault:CARD=DVC100
iec958:CARD=DVC100,DEV=0
dsnoop:CARD=DVC100,DEV=0
So, how might I get ffmpeg to record the audio successfully to the video file? Is there some alternative approach to this problem?
EDIT: The relevant output from the command pactl list sources is as follows:
Source #20
State: SUSPENDED
Name: alsa_input.usb-Pinnacle_Systems_GmbH_DVC100-01.analog-stereo
Description: Dazzle DVC100 Audio Device Analogue Stereo
Driver: module-alsa-card.c
Sample Specification: s16le 2ch 48000Hz
Channel Map: front-left,front-right
Owner Module: 45
Mute: no
Volume: front-left: 99957 / 153% / 11.00 dB, front-right: 99957 / 153% / 11.00 dB
balance 0.00
Base Volume: 35466 / 54% / -16.00 dB
Monitor of Sink: n/a
Latency: 0 usec, configured 0 usec
Flags: HARDWARE HW_MUTE_CTRL HW_VOLUME_CTRL DECIBEL_VOLUME LATENCY
Properties:
alsa.resolution_bits = "16"
device.api = "alsa"
device.class = "sound"
alsa.class = "generic"
alsa.subclass = "generic-mix"
alsa.name = "USB Audio"
alsa.id = "USB Audio"
alsa.subdevice = "0"
alsa.subdevice_name = "subdevice #0"
alsa.device = "0"
alsa.card = "1"
alsa.card_name = "DVC100"
alsa.long_card_name = "Pinnacle Systems GmbH DVC100 at usb-0000:00:14.0-4, high speed"
alsa.driver_name = "snd_usb_audio"
device.bus_path = "pci-0000:00:14.0-usb-0:4:1.1"
sysfs.path = "/devices/pci0000:00/0000:00:14.0/usb1/1-4/1-4:1.1/sound/card1"
udev.id = "usb-Pinnacle_Systems_GmbH_DVC100-01"
device.bus = "usb"
device.vendor.id = "2304"
device.vendor.name = "Pinnacle Systems, Inc."
device.product.id = "021a"
device.product.name = "Dazzle DVC100 Audio Device"
device.serial = "Pinnacle_Systems_GmbH_DVC100"
device.string = "front:1"
device.buffering.buffer_size = "352800"
device.buffering.fragment_size = "176400"
device.access_mode = "mmap+timer"
device.profile.name = "analog-stereo"
device.profile.description = "Analogue Stereo"
device.description = "Dazzle DVC100 Audio Device Analogue Stereo"
alsa.mixer_name = "USB Mixer"
alsa.components = "USB2304:021a"
module-udev-detect.discovered = "1"
device.icon_name = "audio-card-usb"
Ports:
analog-input-linein: Line In (priority: 8100)
Active Port: analog-input-linein
Formats:
pcm
I tested the name from this with ffmpeg (version 4.3.1, compiled with --enable-libpulse) in the following way:
ffmpeg -y -f video4linux2 -f pulse \
-i alsa_input.usb-Pinnacle_Systems_GmbH_DVC100-01.analog-stereo \
-video_size 720x576 -pixel_format uyvy422 -i /dev/video2 out.mp4
Unfortunately this hasn't worked.
I also use a Dazzle DVC100 to capture video, and -f alsa -i hw:1 works well for me. For instance:
ffmpeg -f alsa -i hw:1 -i /dev/video2 \
-codec:v ffv1 -codec:a pcm_s16le raw.mkv
The number of the device can be found using:
cat /proc/asound/cards
Use the number in the first column after the hw: prefix. In your case it is hw:1.
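As a rough sketch of that lookup, the function below pulls the index for a given card ID out of the cards listing. The name alsa_card_index and its optional file argument are made up for illustration; the only thing assumed is the layout of /proc/asound/cards shown in the question.

```shell
# Print the ALSA card index for a card ID such as DVC100.
# Hypothetical helper; the second argument exists only so the parser
# can be tried against a saved copy of /proc/asound/cards.
alsa_card_index() {
    awk -v id="$1" '
        $1 ~ /^[0-9]+$/ {
            name = $2
            sub(/^\[/, "", name)     # drop the leading bracket
            sub(/\]:?$/, "", name)   # drop a trailing "]:" when the ID is unpadded
            if (name == id) { print $1; exit }
        }
    ' "${2:-/proc/asound/cards}"
}

# Usage (assumes the card is plugged in):
#   ffmpeg -f alsa -i "hw:$(alsa_card_index DVC100)" -i /dev/video2 ...
```

Looking the index up by ID avoids hard-coding a number that can change when other sound devices are added or removed.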
Keep in mind that FFmpeg fails to open the device while PulseAudio has it open. This happens to me when I am running pavucontrol at the same time, for example. In practice I need to wait about half a minute after closing pavucontrol before FFmpeg can run successfully.
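Instead of guessing how long to wait, one option is to poll the capture node until nothing holds it open. This is only a sketch: the function names are hypothetical, it assumes the usual /dev/snd/pcmC<card>D<device>c naming for ALSA capture nodes, and it relies on fuser from psmisc being installed.

```shell
# Path of the ALSA capture node for a card/device pair, e.g. 1 0.
alsa_capture_node() {
    printf '/dev/snd/pcmC%sD%sc\n' "$1" "${2:-0}"
}

# Poll once per second until no process holds the node open.
# Gives up (returns 1) after a timeout in seconds, default 60.
# Assumption: fuser can see the process that holds the device.
wait_for_capture_node() {
    node=$(alsa_capture_node "$1" "${2:-0}")
    remaining=${3:-60}
    while fuser -s "$node" 2>/dev/null; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Usage:
#   wait_for_capture_node 1 0 && ffmpeg -f alsa -i hw:1 ...
```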
You can check the output of FFmpeg in real time using:
ffmpeg -f alsa -i hw:1 -i /dev/video2 \
-codec:v ffv1 -codec:a pcm_s16le -f matroska - | ffplay -
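The question asks for H.264 at 720x576 rather than a lossless FFV1 capture. A possible variant is sketched below; I have not tested it against the device. The wrapper name record_dvc100 and the DRY_RUN variable are hypothetical, and libx264 and aac are assumed to be available in the ffmpeg build. The main point is the ordering: ffmpeg applies each option to the next file named on the command line, so each input's options sit directly before its own -i.

```shell
# Hypothetical wrapper: record_dvc100 CARD VIDEO_DEVICE OUTPUT
# Input options come directly before the -i they belong to;
# the codec options come before the output file.
record_dvc100() {
    set -- ffmpeg \
        -f alsa -i "hw:$1" \
        -f v4l2 -video_size 720x576 -i "$2" \
        -c:v libx264 -c:a aac "$3"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"   # show the command without running it
    else
        "$@"
    fi
}

# Usage:
#   DRY_RUN=1 record_dvc100 1 /dev/video2 out.mp4   # print only
#   record_dvc100 1 /dev/video2 out.mp4             # actually record
```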
You can find more information on capturing video using Dazzle DVC100 in my post.
I want to fade a track in and out at specific time codes. For example, I would like to take an audio file, and:
Start it at 100% Volume
Fade it to 20% at 2 seconds
Fade it to 100% at 4 seconds
Fade it to 20% at 6 seconds
Fade it to 100% at 8 seconds
Fade it to 20% at 10 seconds
Fade it to 100% at 12 seconds
Fade it to 0 at 14 seconds
I've been testing this with a constant tone generated by ecasound so that I can open the resulting file in Audacity and see the results visually. As far as I can tell, increasing the amplitude is relative, while decreasing it is not. It seems that if I fade the amplitude up, it affects the relative volume of the whole track and not just at the specific time I set the fade, which is where I'm getting lost.
Example commands
# generate the tone
ecasound -i tone,sine,880,20 -o:tone.wav
# Just a test to see that I can start it at 100 and fade it to 20.
ecasound -a:1 -i tone.wav -ea:100 -kl2:1,100,20,2,1 -a:all -o:test_1.mp3
# Fade it out and in
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,100,20,2,1 \
-ea:100 -kl2:1,20,100,4,1 \
-a:all -o:test_2.mp3
# Fade it out and in with a peak of 500
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,100,20,2,1 \
-ea:100 -kl2:1,20,500,4,1 \
-a:all -o:test_3.mp3
# Fade it out from 500, and then back to 500
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,500,20,2,1 \
-ea:100 -kl2:1,20,500,4,1 \
-a:all -o:test_4.mp3
# Fade it out from 500, out to a low of 10, and then back to 500
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,500,10,2,1 \
-ea:100 -kl2:1,10,500,4,1 \
-a:all -o:test_5.mp3
# Fade it out from 1000, out to a low of 10, and then back to 1000
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,1000,10,2,1 \
-ea:100 -kl2:1,10,1000,4,1 \
-a:all -o:test_6.mp3
# The eventual result I'm looking for
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,500,20,2,1 \
-ea:100 -kl2:1,20,500,4,1 \
-ea:100 -kl2:1,500,20,6,1 \
-ea:100 -kl2:1,20,500,8,1 \
-ea:100 -kl2:1,500,20,10,1 \
-ea:100 -kl2:1,20,500,12,1 \
-ea:100 -kl2:1,500,0,14,4 \
-a:all -o:test_7.mp3
The Results
The best I can tell from these results is that the amplitude of the whole track is relative to the difference between the low and the peak of all the fading effects. I'm not sure if this result is expected, but it's very confusing.
Also, in the last result (second to last in the image), the fades are no longer taking a full second each. In order to figure out why that may be, I took the final fade-to-zero off and the durations were back to normal. This does not seem like expected behavior.
# "Fixing" the fade durations
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,500,20,2,1 \
-ea:100 -kl2:1,20,500,4,1 \
-ea:100 -kl2:1,500,20,6,1 \
-ea:100 -kl2:1,20,500,8,1 \
-ea:100 -kl2:1,500,20,10,1 \
-ea:100 -kl2:1,20,500,12,1 \
-a:all -o:test_8.mp3
As a side note, I've also tried changing the -ea values to the "current" amplitude on every line. It didn't make any difference, no matter what I set -ea to.
I have the very latest installed from git (2.8.1+dev). I had these same issues with 2.7.0, which is why I upgraded and eventually found myself here.
Am I doing this wrong?
-kl2
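After a few hours of head scratching, I finally think I have it figured out. The "From" amplitude on every fade needs to be 100. If you are increasing the amplitude, the "To" amplitude is 100 / current * target, where current is the level the track is at before the fade and target is the level you want to reach.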
So if you're trying to go from 20 to 100, it's 100 / 20 * 100 or 500. If you're trying to get to 120: 100 / 20 * 120 or 600. I assume this all makes perfect sense to someone, but I was perfectly stumped.
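The arithmetic can be wrapped in a small helper. This is a sketch with a made-up name (kl2_to), not part of ecasound. The rule is consistent with each -ea being a separate amplifier in the chain, so that the gains multiply, but that is my reading of the results rather than something confirmed here.

```shell
# Hypothetical helper: kl2_to CURRENT TARGET
# Prints the "To" value for a -kl2 fade whose "From" is 100, given
# the level the track is at now and the level wanted, both as a
# percentage of the original amplitude.
kl2_to() {
    awk -v cur="$1" -v target="$2" 'BEGIN { printf "%.2f\n", 100 / cur * target }'
}

kl2_to 20 100    # prints 500.00
kl2_to 20 120    # prints 600.00
kl2_to 100 20    # prints 20.00 (a plain fade down from full level)
```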
The working example (with a slightly higher bottom range in the middle to demonstrate):
ecasound -a:1 -i tone.wav \
-ea:100 -kl2:1,100,20,2,1 \
-ea:100 -kl2:1,100,500,4,1 \
-ea:100 -kl2:1,100,40,6,1 \
-ea:100 -kl2:1,100,250,8,1 \
-ea:100 -kl2:1,100,20,10,1 \
-ea:100 -kl2:1,100,500,12,1 \
-ea:100 -kl2:1,100,0,14,1 \
-a:all -o:test_7.mp3
And the output:
Keep in mind that these amplitudes are still relative. If you're going from 45% to 90%, the value is 100 / 45 * 90 = 200. If you then drop to 20% of the current amplitude, you are actually at 18% (0.20 * 90), so going back to 100% would be 100 / 18 * 100 = 555.56.
-klg
Just as I figured this out, and came here to post, I received a response from the ecasound mailing list. It's not a direct answer to the kl2 issue, but offers an alternative, easier-on-the-brain answer, which is the klg parameter.
-klg:fx-param,low-value,high-value,point_count,pos1,value1,...,posN,valueN
Generic linear envelope. This controller source can be used to map
custom envelopes to chain operator parameters. Number of envelope
points is specified in 'point_count'. Each envelope point consists of
a position and a matching value. Number of pairs must match
'point_count' (i.e. 'N==point_count'). The 'posX' parameters are given
as seconds (from start of the stream). The envelope points are
specified as float values in range '[0,1]'. Before envelope values are
mapped to operator parameters, they are mapped to the target range of
'[low-value,high-value]'. E.g. a value of '0' will set operator
parameter to 'low-value' and a value of '1' will set it to
'high-value'. For the initial segment '[0,pos1]', the envelope will
output value of 'value1' (e.g. 'low-value').
Here's the command to do what I need using klg instead of kl2:
ecasound -a:1 -i:tone.wav -ea:100 \
-klg:1,0,100,14,2,1,3,0.20,4,0.20,5,1,6,1,7,0.40,8,0.40,9,1,10,1,11,0.20,12,0.20,13,1,14,1,15,0 \
-o:test.mp3
The output is exactly the same as the 2nd track on the image.
This resulting command line is definitely a bit harder to read and hence debug, but may actually be easier to generate dynamically. Regardless, I now have 2 working options to resolve this problem.
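To illustrate the "generate dynamically" point, here is a sketch of a helper that assembles the option from position/value pairs and fills in point_count itself, which is the part that is easy to get wrong by hand. The name klg_arg and its argument order are my own invention; only the -klg syntax quoted above is assumed.

```shell
# Hypothetical helper: klg_arg PARAM LOW HIGH POS1 VALUE1 [POS2 VALUE2 ...]
# Counts the position/value pairs and prints a complete -klg option.
klg_arg() {
    param=$1 low=$2 high=$3
    shift 3
    if [ "$#" -eq 0 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "klg_arg: need one or more position/value pairs" >&2
        return 1
    fi
    count=$(( $# / 2 ))
    points=$(IFS=,; printf '%s' "$*")   # join the remaining arguments with commas
    printf '%s\n' "-klg:$param,$low,$high,$count,$points"
}

# Usage:
#   ecasound -a:1 -i:tone.wav -ea:100 "$(klg_arg 1 0 100 2 1 3 0.20 4 0.20 5 1)" -o:test.mp3
```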
And finally, here are my notes for how I figured out the coordinates of the klg command. The asterisks are the "points" which are listed in the klg parameter, the numbers at the top are seconds:
      0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2
      1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
1.0   --*     *-*     *-*     *-*
 ~       \   /   \._./   \   /   \
0.2       *-*             *-*     \
0.0                                *----------
I hope this helps someone save at least the amount of hair that I've lost scratching my head.