Making a "Babbelbox" you can talk to at parties - audio

I've got a project to build for a party; in Holland it's called a "Babbelbox".
It's a computer with a webcam and microphone that can be used to make a kind of video log of everyone who wants to say something about the party.
The problem is that I don't know where to start. I've made a kind of video display system in C, but I can't save the data in a decent format, so it fills up my hard disk within an hour.
Requirements:
Record video + audio
Recording has to start after pressing a button
Good compression of the recorded videos (it would be even better if they could be read by Final Cut Pro or Premiere Pro)
A lightweight program would be nice, but I could scale up the computer power

We built one from soup to nuts: software, hardware, a full booth, a touch screen, and we even themed it as a cultish confessional in honor of our boss. See http://www.cultoftom.com for the gory details.

A solution for Linux using GStreamer:
On Ubuntu, install the gstreamer-tools package.
Then you can record with a command similar to:
gst-launch v4l2src ! 'video/x-raw-yuv,width=640,height=480,framerate=30/1' ! tee name=t_vid ! queue ! videoflip method=horizontal-flip ! xvimagesink sync=false t_vid. ! queue ! ffmpegcolorspace ! theoraenc ! queue ! mux. autoaudiosrc ! queue ! audioconvert ! vorbisenc ! queue ! mux. oggmux name=mux ! filesink location=filename.ogv
You can adjust the resolution, framerate, filename, etc. as you prefer.
From there it would be fairly straightforward to write it out in Python and knock up a simple GTK GUI for starting/stopping. You could use a multifilesink to handle the filenames for successive recordings.
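Even without Python, here is a rough shell sketch of the one-clip-per-button-press idea (untested; it assumes the pipeline above, drops the xvimagesink preview branch for brevity, and relies on gst-launch's -e / --eos-on-shutdown flag so the Ogg file is finalized when the recording stops):
# start one recording per invocation, writing to the next numbered file
N=$(ls clip-*.ogv 2>/dev/null | wc -l)
gst-launch -e v4l2src ! 'video/x-raw-yuv,width=640,height=480,framerate=30/1' \
  ! ffmpegcolorspace ! theoraenc ! queue ! mux. \
  autoaudiosrc ! queue ! audioconvert ! vorbisenc ! queue ! mux. \
  oggmux name=mux ! filesink location="clip-$N.ogv"
# stop the clip with Ctrl-C; -e makes gst-launch send EOS first so the file is written out cleanly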
References:
http://noraisin.net/~jan/diary/?p=40
http://www.twm-kd.com/computers/software/webcam-and-linux-gstreamer-tutorial/
http://pygstdocs.berlios.de/pygst-tutorial/index.html
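If the Final Cut Pro / Premiere Pro requirement is important, a similar pipeline could target H.264 video and AAC audio in an MP4 container instead of Ogg/Theora. This is an untested sketch; it assumes the x264enc, faac and mp4mux elements (from the gst-plugins-ugly and gst-plugins-bad packages) are installed, and again uses -e so the MP4 file is finalized properly on shutdown:
gst-launch -e v4l2src ! 'video/x-raw-yuv,width=640,height=480,framerate=30/1' \
  ! ffmpegcolorspace ! x264enc ! queue ! mux. \
  autoaudiosrc ! queue ! audioconvert ! faac ! queue ! mux. \
  mp4mux name=mux ! filesink location=clip.mp4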

Related

How can you blend two videos (with both their audio tracks) into a single video using GStreamer?

I currently know how to blend two videos into one; it was very hard to learn how to do this (more than 30 continuous hours of research). I've used the following pipeline:
gst-launch-1.0 filesrc location=candidate.webm ! decodebin ! videoscale ! video/x-raw,width=680,height=480 ! compositor name=comp sink_1::xpos=453 sink_1::ypos=340 ! vp9enc ! webmmux ! filesink location=out.webm filesrc location=interviewer.webm ! decodebin ! videoscale ! video/x-raw,width=200,height=140 ! comp.
In this case I'm blending the two videos so that the second one is in the bottom right corner and the first one is the "background". Does somebody know how I can get both audio tracks into the same file too? I hope someone finds my pipeline useful.
The audiomixer element takes multiple audio streams and mixes them into a single audio stream.
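A sketch of how that could fit into the pipeline above (untested; it assumes both input files actually contain an audio track, that the compositor pads end up numbered as in the original pipeline, and that Vorbis audio is acceptable to webmmux):
gst-launch-1.0 \
  compositor name=comp sink_1::xpos=453 sink_1::ypos=340 ! vp9enc ! webmmux name=mux ! filesink location=out.webm \
  audiomixer name=mix ! audioconvert ! vorbisenc ! mux. \
  filesrc location=candidate.webm ! decodebin name=dec1 \
  dec1. ! queue ! videoscale ! video/x-raw,width=680,height=480 ! comp. \
  dec1. ! queue ! audioconvert ! audioresample ! mix. \
  filesrc location=interviewer.webm ! decodebin name=dec2 \
  dec2. ! queue ! videoscale ! video/x-raw,width=200,height=140 ! comp. \
  dec2. ! queue ! audioconvert ! audioresample ! mix.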

Sync audio and video when playing mp4 file with GStreamer

I need to sync video and audio when I play an mp4 file. How can I do that?
Here's my pipeline:
gst-launch-0.10 filesrc location=./big_buck_bunny.mp4 ! \
qtdemux name=demux demux.video_00 ! queue ! TIViddec2 engineName=codecServer codecName=h264dec ! ffmpegcolorspace ! tidisplaysink2 video-standard=pal display-output=composite \
demux.audio_00 ! queue max-size-buffers=500 max-size-time=0 max-size-bytes=0 ! TIAuddec1 ! audioconvert ! audioresample ! autoaudiosink
Have you tried playing the video on a regular desktop without using TI's elements? GStreamer should take care of synchronization for playback cases (and many others).
If the video is perfectly synchronized on a desktop then you have a bug on the elements specific to your target platform (TIViddec2 and tidisplaysink2). qtdemux should already put the expected timestamps on the buffers, so it is possible that TIViddec2 isn't copying those to its decoded buffers or tidisplaysink2 isn't respecting them. (The same might apply to the audio part)
I'd first check TIViddec2 by replacing the rest of the pipeline after it with a fakesink and running gst-launch in verbose mode. The output from fakesink should show you the output timestamps; check whether those are consistent. You can also put a fakesink right after qtdemux to check the timestamps it produces and see whether the decoders are respecting them.
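For example, a rough sketch of that first check on this pipeline (fakesink with silent=false plus gst-launch's -v flag prints the timestamps of the buffers it receives):
gst-launch-0.10 -v filesrc location=./big_buck_bunny.mp4 ! \
qtdemux name=demux demux.video_00 ! queue ! TIViddec2 engineName=codecServer codecName=h264dec ! fakesink silent=false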
It turned out I was using the wrong video framerate.

Syncing audio and video when mp4muxing in gst-launch-1.0

I have a Logitech C920 webcam that provides properly formatted h264 video, and a mic hooked up to an ASUS Xonar external USB sound card. I can read both and mux their data into a single file like this:
gst-launch-1.0 -e \
mp4mux name=muxy ! filesink location=/tmp/out.mp4 \
alsasrc device='hw:Device,0' do-timestamp=true ! audio/x-raw,rate=48000 ! audioconvert ! queue ! lamemp3enc ! muxy.audio_0 \
v4l2src do-timestamp=true ! video/x-h264,framerate=30/1,height=720 ! h264parse ! queue ! muxy.video_0
...but then I get poorly synchronized audio and video. The audio stream consistently starts with 250 ms of garbage noise, and the resulting mp4 video is 250 ms (7 or 8 frames at 30 fps) out of sync.
It seems like the sources start simultaneously, but the sound card inserts 250 ms of initialization junk every time. Or perhaps the camera takes 250 ms longer to start up but reports an incorrect start-of-stream flag. Or maybe the clocks in my devices are out of sync for some reason. I don't know how to tell the difference between these (and other) potential root causes.
Whatever the cause, I'd like to at least patch over the symptoms. I've been trying to do one of the following in the GStreamer pipeline, any of which would satisfy my requirements:
Cut out the first 250ms of audio
Delay the video by 250ms or 7 frames
Synchronize the audio and video timestamps properly with attributes like alsasrc slave-method or v4l2src io-mode
And I'm apparently doing it wrong. Nothing works. No matter what, I always end up with the video running 250 ms / 7 frames ahead of the audio. Adding the queue elements reportedly fixed the sync issue, as mediainfo now reports Duration values for audio and video within 20 ms of each other, which would be acceptable. But that's not how the resulting videos actually behave: when I clap my hands, the noise arrives late.
This can be fixed in post-processing, but why not avoid the hassle and get it right straight from the GStreamer pipeline? I'm all out of tricks and just about ready to fall back to fixing every single video's sync by hand instead. Any ideas out there?
Thanks for any help, tips, ideas.

How can I use gstreamer & smpte to concatenate 2 video files with gst-launch?

I have 2 video files (vid1.mov and vid2.mov), both with the same frame size and frame rate. I want one final video that shows vid1.mov and then vid2.mov, one after the other. I also want a transition from one video to the other (rather than an abrupt change), and I have discovered the smpte plugin for GStreamer, which does what I want.
Using gst-launch on the Ubuntu Linux command line, how can I merge the 2 videos together with a transition?
(Assume I want to use the same transition as in the smpte example: 2 seconds long and type=234.)
I tried modifying the smpte example like this:
gst-launch filesrc location=vid1.mov ! decodebin ! ffmpegcolorspace ! smpte name=s border=20000 type=234 duration=2000000000 ! ffmpegcolorspace ! ximagesink filesrc location=vid2.MOV ! decodebin ! ffmpegcolorspace ! s.
It starts playing both videos at the same time and transitions from one to the other, so it only shows 2 seconds of vid1.mov and then plays all of vid2.mov. How can I get it to play all of vid1.mov, start playing vid2.mov and the transition 2 seconds before vid1.mov ends, so that the transition finishes just as vid1.mov ends, and then continue playing all of vid2.mov as normal?
Someone else has pointed me to GNonLin, GStreamer's non-linear editing support, which would be used for this. However, I have other problems with it; cf. "Video Transitions with GStreamer & GNonLin not working".

Gstreamer: RTP jitter buffer not working properly with packet loss?

For a VoIP speech quality monitoring application I need to compare an incoming RTP audio stream to a reference signal. For the signal comparison itself I use pre-existing, special-purpose tools. For the other parts (except packet capture) the GStreamer library seemed to be a good choice. I use the following pipeline to simulate a bare-bones VoIP client:
filesrc location=foobar.pcap ! pcapparse ! "application/x-rtp, payload=0, clock-rate=8000"
! gstrtpjitterbuffer ! rtppcmudepay ! mulawdec ! audioconvert
! audioresample ! wavenc ! filesink location=foobar.wav
The pcap file contains a single RTP media stream. I crafted a capture file that's missing 50 of the original 400 UDP datagrams. For the given audio sample (8s long for my example):
[XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
with a certain amount of consecutive packet loss I'd expect an audio signal like this to be output ('-' denotes silence):
[XXXXXXXXXXXXXXXXXXXXXXXX-----XXXXXXXXXXX]
however what is actually saved in the audio file is this (1s shorter for my example):
[XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
It seems that the jitter buffer, a crucial part for this application, is not working properly. Could this be an incompatibility with / a shortcoming of the pcapparse element? Am I missing a key part in the pipeline to ensure time synchronization? What else could be causing this?
The issue could be solved by slightly altering the pipeline:
An audiorate element, which "produces a perfect stream by inserting or dropping samples as needed", needs to be added before wavenc.
However, this only works if audiorate receives the packet-loss events. For this, the do-lost property of gstrtpjitterbuffer needs to be set to true.
Here's the corrected pipeline:
filesrc location=foobar.pcap ! pcapparse
! "application/x-rtp, payload=0, clock-rate=8000"
! gstrtpjitterbuffer do-lost=true ! rtppcmudepay ! mulawdec
! audioconvert ! audioresample ! audiorate ! wavenc
! filesink location=foobar.wav
GStreamer may just use the dejitter buffer to smooth out the packets on the way to the (audio) output. This wouldn't be unusual; it's the bare-minimum definition of dejittering.
It may go so far as reordering out-of-order packets or deleting duplicates, but packet loss concealment (your scenario) can be quite complex.
Basic implementations just duplicate the last received packet, whilst more advanced implementations analyse and reconstruct the tone of the last received packets to smooth out the audio.
It sounds like your application performance will depend on the exact implementation of loss concealment, so even if GStreamer does do "something", you may have a hard time quantifying its impact on your results unless you understand it in great detail.
Perhaps you could try a pcap with a couple of out-of-order and duplicate packets and check whether GStreamer at least reorders/deletes them; that would go some way toward clarifying what is happening.
