Linux: launch window, capture screen

I need to write a Red Hat Linux command line tool that launches a window and captures its appearance to disk as a JPEG.
Typically the target machines don't have graphics cards, but we can install any software components (e.g., X).
Question or two:
What libraries or tools might you suggest for this?
If I were to use something like GTK+ to create this tool, would lacking a video card hamper its execution?
I saw scrot, but it doesn't appear to support capturing a specific window without user interaction.

It sounds like you'll need to use the "virtual framebuffer" driver for the X.org server, combined with the xwd, NetPBM, and cjpeg utilities.
I'm not sure about the particular configuration you'll need for the X server, but you will likely have to make sure the server you're using has the virtual framebuffer driver built in. The virtual framebuffer driver is a display driver just like one you'd use with an NVidia or ATI video card, except its "output" is a chunk of memory that contains the pixels, not an LCD screen.
xwd is one of the standard X tools; it creates an X Window Dump. xwd can be told on the command line which window to dump, and it writes a funky "xwd"-formatted stream to standard out.
The NetPBM utilities are a collection of command line tools that convert one image format to another. They include one, xwdtopnm, that converts XWD to PPM. PPM is a very basic, uncompressed format that is the intermediate format understood by most of the NetPBM tools.
cjpeg is part of the standard JPEG tools collection, and is probably installed if you also have NetPBM. cjpeg can take a stream of PPM bytes and emit a stream of JPEG bytes.
Through the magic of Unix scripting and pipes, you can string these utilities together: fire up the app with the window, then pipe xwd through xwdtopnm and cjpeg to dump the image to a file.
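A minimal sketch of that pipeline, assuming Xvfb is installed, that the NetPBM converter is named xwdtopnm on your system, and that myapp and the window title are placeholders for your own program:
Xvfb :1 -screen 0 1024x768x24 &      # virtual framebuffer X server on display :1
export DISPLAY=:1
myapp &                              # launch the application whose window we want
sleep 2                              # give the window time to map and draw
xwd -name "My Window Title" | xwdtopnm | cjpeg > capture.jpg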

You might try running vncserver to create a virtual X display - no graphics card needed. Be sure to set your DISPLAY variable to the display number that gets printed when vncserver starts. Next, start your app on the created display (in the background) and use xwd with data formatters or a gimp command to capture the screen image to JPEG.
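A rough sketch of that vncserver variant, assuming ImageMagick is installed so convert can read the XWD format; the display number, geometry and window title are placeholders:
vncserver :1 -geometry 1024x768 -depth 24
export DISPLAY=:1
myapp &
sleep 2
xwd -root | convert xwd:- screen.jpg                       # whole virtual screen
xwd -name "My Window Title" | convert xwd:- window.jpg     # one specific window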
By the way, check the similar answers for Command line program to create website screenshots (on Linux).

Related

Sending a webcam input to zoom using a recorded clip

I have an idea that I have been working on, but there are some technical details that I would love to understand before I proceed.
From what I understand, Linux communicates with the underlying hardware through device files under /dev/. I was messing around with feeding my video cam input to Zoom and I found someone explaining that I need to create a virtual device and hook it up to the output of another program called v4l2loopback.
My questions are
1- How does Zoom detect the webcams available for input? My /dev directory has two "files" called video (/dev/video0 and /dev/video1), yet Zoom only detects one webcam. Is the webcam communication done through these video files or not? If yes, why does simply creating one not affect Zoom's input choices? If not, how does Zoom detect the input and read the webcam feed?
2- Can I create a virtual device and write a kernel module for it that feeds the input from a local file? I have written a lot of kernel modules, and I know they have read, write, and release methods. I want to parse the video whenever a read request from Zoom is issued. How should the video be encoded? Is it MP4, a raw format, or something else? How fast should I be sending input (in terms of kilobytes)? I think it is a function of my webcam recording specs: if it is 1920x1080, each pixel is 3 bytes (RGB), and it is recording at 20 fps, I can simply calculate how many bytes are generated per second - but how does Zoom expect the input to be fed into it? Assuming that it is sending the stream in real time, it should be reading input every few milliseconds. How do I get access to such information?
Thank you in advance. This is a learning experiment, I am just trying to do something fun that I am motivated to do, while learning more about Linux-hardware communication. I am still a beginner, so please go easy on me.
Apparently, there are two types of /dev/video* files: one for the metadata and the other for the actual stream from the webcam. Creating a virtual device of the same type as the stream in the /dev directory did result in Zoom recognizing it as an independent webcam, even without creating its metadata file. I did finally achieve what I wanted, but I used the OBS Studio virtual camera feature that was added in update 26.0.1, and it is working perfectly so far.
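For anyone who still wants the /dev route rather than OBS, here is a hedged sketch using the v4l2loopback kernel module plus ffmpeg to feed a recorded clip as a virtual webcam; the resulting device node (/dev/video2 here) and the file name clip.mp4 are assumptions that depend on your machine:
sudo modprobe v4l2loopback devices=1 exclusive_caps=1 card_label="Virtual Cam"   # creates one loopback video device
ffmpeg -re -stream_loop -1 -i clip.mp4 -f v4l2 -pix_fmt yuv420p /dev/video2      # decode the clip and loop it into the device at real-time speed
This also answers the encoding question in passing: what a client like Zoom reads from the device is not an MP4 container but raw frames (e.g. YUV 4:2:0) at a rate negotiated through the V4L2 API, which is why ffmpeg does the decoding and pacing (-re) for you.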

Which version of Linux has support for Dolby Advanced Audio v2?

I am using a Lenovo G500 laptop and my sound card has support for Dolby Advanced Audio v2, which works nicely in Windows (i.e. Windows 7, 8 and 8.1). However, I have failed to enable the Dolby sound effect in Linux (I have tried Linux Mint 17 and Fedora 20).
Does anyone have an idea which Linux version has support for this feature, or how I can enable it in a Linux OS?
I would appreciate it if you could point me in the right direction.
Thanks.
I've googled up some very good advice on forums that helped me achieve Dolby-like sound on my Kubuntu 19.04 with a Lenovo G780.
Install PulseEffects https://github.com/wwmm/pulseeffects
(repos with deb files are here: https://github.com/wwmm/pulseeffects/wiki/Package-Repositories#debian--ubuntu)
Restart the user session or reboot after this, because PulseAudio will be upgraded, and it may cause problems if you don't restart.
Run PulseEffects and close it. It'll create all its settings dirs on first launch; they are required for the next step.
Install PulseEffects-Presets from here: https://github.com/JackHack96/PulseEffects-Presets
(I used the suggested script that automatically downloads them to the PulseEffects import dirs; it requires flatpak, which can be installed from the repos with sudo apt install flatpak)
Launch PulseEffects again. Select Convolver. Enable it. Click on the wave button. You'll see a list of presets. Enable:
Dolby ATMOS ((128K MP3)) 1.Default.irs
Close the dialog and that's it. You can toggle the Convolver in PulseEffects on and off while playing music to compare results. You may play with other presets as well.
For improving sound on a notebook or a tablet, the PulseEffects help pages come with a tutorial on how to achieve this.
The app can be minimized to the tray on GTK-enabled desktops with an additional application: https://github.com/boomshop/pulseffectstray
It's better to enable autostart in the app settings (it will copy its desktop file to ~/.config/autostart with the --gapplication-service command line option, so next time it starts without the GUI).
It is possible to get reasonably close to the Dolby Advanced Audio output on Linux.
TLDR:
Record the result of playing a -0dBFS impulse in Windows with all effects enabled. Save that as a wav file and use it as input to the PulseEffects Convolver.
Step-by-step:
Install Audacity in Windows.
Configure Audacity to use WASAPI
Select the loopback device as input
Select your laptop speakers as output, making sure all Dolby Advanced Audio effects are enabled.
Start recording
Play an impulse audio file (you might need to do this twice; Audacity often doesn't pick up the first impulse).
Zoom in and select the area around the recorded impulse (see screenshot below)
Export the selection as a WAV file and change the extension to irs
Import this irs file into the Convolver.
Some notes:
Audacity isn't required, presumably any software capable of recording from the output device will be fine.
To avoid any changes introduced by sample rate conversion, set the sample rate of the output and input devices in Windows to be the same.
When recording, in Step 5, Audacity would not record unless audio was playing. This is probably due to using WASAPI. Just start recording, play the impulse and if you don't see it in the recording output as a single spike, play it again.
The screenshot is quite heavily zoomed in so that you can see the area where there is data. When selecting what to export, try to make sure the selection is roughly centered around the central peak. It doesn't have to be perfect.
As a useful check to make sure what you are recording has been processed by Dolby Advanced Audio, you can disable all effects on the output device in Windows and record the impulse a second time. This should show up as a single peak sample and not the symmetric pattern.
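If you don't have an impulse audio file handy for step 6, one possible way to generate a single-sample (Dirac) impulse is with ffmpeg's aevalsrc source; this is only a sketch, and the 48000 Hz sample rate is an assumption you should match to your Windows output device:
ffmpeg -f lavfi -i "aevalsrc=if(eq(n\,0)\,1\,0):s=48000:d=1" -c:a pcm_s16le impulse.wav   # first sample 1.0, the rest silence
The expression makes the very first sample 1.0 (0 dBFS) and every following sample zero, which is exactly the kind of single spike the recording steps above are looking for.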
After a bit of research I found this explanation that seems to have satisfied my query. It generally says ...
There isn't going to be an easy fix for this, unless Dolby releases a Linux driver or publishes more information on what exactly their software is doing (which is unlikely).
Haswell-ThinkPad-problems, linux-low-audio-quality
Beware that PulseEffects has recently changed its name to EasyEffects, but PulseEffects-Presets hasn't updated its config files to cover this change; therefore this answer might not be applicable anymore.

How do I "dump" the contents of an X terminal programmatically a la /dev/vcs{,a} in the Linux console?

Linux's kernel-level console/non-X terminal emulator contains a very cool feature (if compiled in): each /dev/ttyN device corresponds with /dev/vcsaN and /dev/vcsN devices which represent the in-memory (displayed) state of that tty, with and without attributes (color, flashing, etc) respectively. This allows you to very easily cat /dev/vcs7 and see a dump of /dev/tty7 wherever cat was launched. I used this incredibly practical capability the other day to log in to a system via SSH and remotely watch a dd process I'd forgotten to put inside a screen (or similar) session - it was running off a text console, so I took a few moments to fine-tune the character ranges that I wanted to grab, and presently I was watching dd's transfer status over SSH (once every second, incidentally).
To reiterate and clarify, /dev/vcs{,a}* are character devices that retrieve the current in-memory representation of the kernel console VT100 emulator, represented as a single "line" of text (there are no "newlines" at the end of each "line" of the screen). Just to remove confusion, I want to note that I can't tail -f this device: it's not a character stream like the TTY itself is. (But I've never needed this kind of behavior, for what it's worth.)
I've kept my ears perked for many years for a method to dump the character-cell memory state of X terminal emulators - or indeed any arbitrary process that needs to work with ttys, in some similar manner as I can with the Linux console. And... I am rather surprised that there is no practical solution to this problem - since it has, arguably, existed for approximately 30 years - X was introduced in 1984 - or, to be pedantic, at least 19 years - /dev/vcs{,a}* was introduced in kernel 1.1.94; the newest file in that release is dated 22 Feb 1995. (The oldest is from 1st Dec 1993 :P)
I would like to say that I do understand and realize that the tty itself is not a "screen buffer" as such but a character stream, and that the nonstandard feature I essentially exploited above is a quirky capability specific to the Linux VT102 emulator. However, this feature is cool enough (why else would it be in the mainline tree? :D) that, in my opinion, there should be a counterpart to it for things that work with /dev/pts*.
This afternoon, I needed to screen-scrape the output of an interactive ncurses application so I could extract metadata from the information it presented in my terminal. (There was no other practical way to achieve the goal I was aiming for.) Linux's kernel VT100 driver would permit such a task to be completed very easily, and I made the mistake of thinking that, in light of this, it couldn't truly be that hard to do the same under X11.
By 9AM, I'd decided that the easiest way to experimentally request a dump of a remote screen would be to run it in dtach (think "screen -x" without any other options) and hack the dtach code to request a screen update and quit.
Around 11AM-12PM, I was requesting screen updates and dumping them to stdout.
Around 3:30PM, I accepted that using dtach would be impossible:
First of all, it relies on the application itself to send the screen redraws on request, by design, to keep the code simple. This is great, but, as luck would have it, the application I was using didn't support whole-screen repaints - it would only redraw on screen-size change (and only if the screen size was truly different!).
Running the program inside a screen session (because screen is a true terminal emulator and has an internal 2D character-cell buffer), then running screen -x inside dtach, also mysteriously failed to produce character cell updates.
I have previously examined screen and found the code sufficiently insane to remove any inclination I might otherwise have to hack on it; all I can say is that said insanity may be one of the reasons screen does not already have the capabilities I have presented here (which would arguably be very easy to implement).
Other questions similar to this one frequently get answers to use typescript, or script; I just want to clarify that script saves the stream of the tty itself to a file, which I would need to push through a VT100 emulator to obtain a screen image of the current state of the tty in question. In other words, script would be a very insane solution to my problem.
I'm not marking this as accepted since it doesn't solve the actual core issue (which is many years old), but I was able to achieve the specific goal I set out to do.
My specific requirements were that I wanted to screen-scrape the output of the ncdu interactive disk usage browser, so I could simply press Enter in another terminal (or perform some similar, easy sequence) to add the directory currently highlighted/selected in ncdu to a list of files I wanted to work with. My goal was not to have to distract myself with endless copy+paste and/or retyping of directory names (probably with not a few inaccuracies to boot), so I could focus on the directories I wanted to select.
screen has a refresh feature, accessed by pressing (by default) CTRL+A, CTRL+L. I extended my copy of dtach to be capable of sending keystrokes in addition to dumping remote screens to stdout, and wrapped dtach in a script that transmitted the refresh sequence (\001\014) to screen -x running inside dtach. This worked perfectly, retrieving complete screen updates without any flicker.
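As an aside (and not part of the workflow above): if the program is already running inside screen anyway, screen's own hardcopy command can dump the currently displayed contents of a window to a plain-text file without the dtach detour; a minimal sketch, with the session name and window number as placeholders:
screen -S mysession -p 0 -X hardcopy /tmp/dump.txt   # write the visible contents of window 0 to a file
Note that hardcopy gives you the bare character cells without attributes - closer in spirit to /dev/vcsN than to /dev/vcsaN.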
I will warn anyone interested in trying this technique, however, that you will need to perfect the art of dodging VT100 escape sequences. I used regular expressions for this so I wasn't writing thousands of lines of code; here's the specific part of the script that extracted out the two pieces of information I needed:
sh -c "(sleep 0.1; dtach -k qq $'\001\014') &"; path="$(dtach -d qq -t 130000 | sed -n $'/^\033\[7m.*\/\.\./q;/---.*$/{s/.*--- //;s/ -\+.*//;h};/^\033\[7m/{s/.\033.*//g;s/\r.*//g;s/ *$//g;s/^\033\[7m *[^ ]\+ \[[# ]*\] *\(\/*\)\(.*\)$/\/\\2\\1/;p;g;p;q}' | sed 'N;s/\(.*\)\n\(.*\)/\2\1/')"
Since screenshots are cool and help people visualize things, here's a look at how it works when it's running:
The file shown inverted at the bottom of the ncdu-scrape window is being screen-scraped from the ncdu window itself; the four files in the list are there because I selected them using the arrow keys in ncdu, moved my mouse over to the ncdu-scrape window (I use focus-follows-mouse), and pressed Enter. That added the file to the list (a simple text file itself).
Having said this, I would like to clarify that the regular expression above is not a code sample to run with; it is, rather, a warning: for anything beyond incredibly trivial (!!) content extractions such as the one presented here, you're basically getting into the same territory as large corporations/interests who want to convert from VT100-based systems to something more modern, and who have to spend tens of thousands commissioning large translation frameworks that perform the kind of conversion outlined above on an especially large scale.
Saner solutions appreciated.

Linux: Screen desktop video capture over network, and VNC framerate

Sorry for the wall of text - TL;DR:
What is the framerate of a VNC connection (in frames/sec) - or rather, who determines it: client or server?
Any other suggestions for desktop screen capture - but "correctly timecoded" / with an unjittered framerate (with a stable period), and with the possibility of obtaining it as an uncompressed (or lossless) image sequence?
Briefly - I have a typical problem that I am faced with: I sometimes develop hardware, and want to record a video that shows both commands entered on the PC ('desktop capture'), and responses of the hardware ('live video'). A chunk of an intro follows, before I get to the specific detail(s).
Intro/Context
My strategy, for now, is to use a video camera to record the process of hardware testing (as 'live' video) - and do a desktop capture at the same time. The video camera produces a 29.97 (30) FPS MPEG-2 .AVI video; and I want to get the desktop capture as an image sequence of PNGs at the same frame rate as the video. The idea, then, would be: if the frame rate of the two videos is the same; then I could simply
align the time of start of the desktop capture, with the matching point in the 'live' video
Set up a picture-in-picture, where a scaled down version of the desktop capture is put - as overlay - on top of the 'live' video
(where a portion of the screen on the 'live' video, serves as a visual sync source with the 'desktop capture' overlay)
Export a 'final' combined video, compressed appropriately for the Internet
In principle, I guess one could use a command line tool like ffmpeg for this process; however I would prefer to use a GUI for finding the alignment start point for the two videos.
Eventually, what I also want to achieve, is to preserve maximum quality when exporting the 'final' video: the 'live' video is already compressed when out of the camera, which means additional degradation when it passes through the Theora .ogv codec - which is why I'd like to keep the original videos, and use something like a command line to generate a 'final' video anew, if a different compression/resolution is required. This is also why I like to have the 'desktop capture' video as a PNG sequence (although I guess any uncompressed format would do): I take measures to 'adjust' the desktop, so there aren't many gradients, and lossless encoding (i.e. PNG) would be appropriate.
Desktop capture options
Well, there are many troubles in this process under Ubuntu Lucid, which I currently use (and you can read about some of my ordeals in 10.04: Video overlay/composite editing with Theora ogv - Ubuntu Forums). However, one of the crucial problems is the assumption that the frame rate of the two incoming videos is equal - in reality, the desktop capture is usually of a lower framerate; and even worse, very often the frames are out of sync.
This, then, requires the hassle of sitting in front of a video editor, and manually cutting and editing less-than-a-second clips on frame level - requiring hours of work for what will be in the end a 5 minute video. On the other hand, if the two videos ('live' and 'capture') did have the same framerate and sync: in principle, you wouldn't need more than a couple of minutes for finding the start sync point in a video editor - and the rest of the 'merged' video processing could be handled by a single command line. Which is why, in this post, I would like to focus on the desktop capture part.
As far as I can see, there are only a few viable (as opposed to 5 Ways to Screencast Your Linux Desktop) alternatives for desktop capture in Linux / Ubuntu (note, I typically use a laptop as the target for desktop capturing):
Have your target PC (laptop) clone the desktop on its VGA output; use a VGA-to-composite or VGA-to-S-video hardware to obtain a video signal from VGA; use video capture card on a different PC to grab video
Use recordMyDesktop on the target PC
Set up a VNC server (vino on Ubuntu; or vncserver) on the target PC to be captured; use VNC capture software (such as vncrec) on a different PC to grab/record the VNC stream (which can, subsequently, be converted to video).
Use ffmpeg with x11grab option
*(use some tool on the target PC, that would do a DMA transfer of a desktop image frame directly - from the graphics card frame buffer memory, to the network adapter memory)
Please note that the usefulness of the above approaches is limited by my context of use: the target PC that I want to capture typically runs software (utilizing the tested hardware) that moves around massive amounts of data; the best you could say about such a system is that it is "barely stable" :) I'd guess this is similar to the problems gamers face when wanting to obtain a video capture of a demanding game. And as soon as I start using something like recordMyDesktop, which also uses quite a bit of resources and wants to capture to the local hard disk - I immediately get severe kernel crashes (often with no vmcore generated).
So, in my context, I typically do assume involvement of a second computer - to run the capture and recording of the 'target' PC desktop. Other than that, the pros and cons I can see so far with the above options, are included below.
(Desktop preparation)
For all of the methods discussed below, I tend to "prepare" the desktop beforehand:
Remove desktop backgrounds and icons
Set the resolution down to 800x600 via System/Preferences/Monitors (gnome-desktop-properties)
Change color depth down to 16 bpp (using xdpyinfo | grep "of root" to check)
... in order to minimize the load on desktop capture software. Note that changing color depth on Ubuntu requires changes to xorg.conf; however, "No xorg.conf (is) found in /etc/X11 (Ubuntu 10.04)" - so you may need to run sudo Xorg -configure first.
In order to keep graphics resource use low, I also usually had compiz disabled - or rather, I'd have 'System/Preferences/Appearance/Visual Effects' set to "None". However, after I tried enabling compiz by setting 'Visual Effects' to "Normal" (which doesn't get saved), I noticed that windows on the LCD screen are redrawn much faster; so I keep it like this, also for desktop capture. I find this a bit strange: how could more effects cause a faster screen refresh? It doesn't look like it's due to a proprietary driver (the card is an "Intel Corporation N10 Family Integrated Graphics Controller", and no proprietary driver option is given by Ubuntu upon switching to compiz) - although it could be that all the blurring and effects just cheat my eyes :).
Cloning VGA
Well, this is the most expensive option (as it requires the additional purchase of not just one, but two pieces of hardware: a VGA converter, and a video capture card); and it is applicable mostly to laptops (which have both a screen + an additional VGA output - for desktops one may also have to invest in an additional graphics card, or VGA cloning hardware).
However, it is also the only option that requires no additional software of the target PC whatsoever (and thus uses 0% processing power of the target CPU) - AND also the only one that will give a video with a true, unjittered framerate of 30 fps (as it is performed by separate hardware - although, with the assumption that clock domains misalignment, present between individual hardware pieces, is negligible).
Actually, as I already own something like a capture card, I have already invested in a VGA converter - in the expectation that it will eventually allow me to produce final "merged" videos with only 5 mins of looking for the alignment point, and a single command line; but I am yet to see whether this process will work as intended. I'm also wondering how feasible it will be to capture the desktop as uncompressed video at 800x600, 30 fps.
recordMyDesktop
Well, if you run recordMyDesktop without any arguments - it starts first with capturing (what looks like) raw image data, in a folder like /tmp/rMD-session-7247; and after you press Ctrl-C to interrupt it, it will encode this raw image data into an .ogv. Obviously, grabbing large image data to the same hard disk as my test software (which also moves large amounts of data) is usually a cause for an instacrash :)
Hence, what I tried doing is to setup Samba to share a drive on the network; then on the target PC, I'd connect to this drive - and instruct recordMyDesktop to use this network drive (via gvfs) as its temporary files location:
recordmydesktop --workdir /home/user/.gvfs/test\ on\ 192.168.1.100/capture/ --no-sound --quick-subsampling --fps 30 --overwrite -o capture.ogv
Note that, while this command will use the network location for temporary files (and thus makes it possible for recordMyDesktop to run in parallel with my software) - as soon as you hit Ctrl-C, it will start encoding and saving capture.ogv directly on the local hard drive of the target (though, at that point, I don't really care :) )
The first of my nags with recordMyDesktop is that you cannot instruct it to keep the temporary files and avoid encoding them at the end: you can use Ctrl+Alt+p for pause - or you can hit Ctrl-C quickly after the first one, to cause it to crash, which will then leave the temporary files (if you don't hit Ctrl-C quickly enough the second time, the program will start "Cleanning up cache..."). You can then run, say:
recordmydesktop --rescue /home/user/.gvfs/test\ on\ 192.168.1.100/capture/rMD-session-7247/
... in order to convert the raw temporary data. However, more often than not, recordMyDesktop will itself segfault in the midst of performing this "rescue". Although, the reason why I want to keep the temp files is to have the uncompressed source for the picture-in-picture montage. Note that "--on-the-fly-encoding" will avoid using temp files altogether - at the expense of using more CPU processing power (which, for me, again is a cause for crashes.)
Then, there is the framerate - obviously, you can set requested framerate using the '--fps N' option; however, that is no guarantee that you will actually obtain that framerate; for instance, I'd get:
recordmydesktop --fps 25
...
Saved 2983 frames in a total of 6023 requests
...
... for a capture with my test software running; which means that the actually achieved rate is more like 25*2983/6023 ≈ 12.38 fps!
Obviously, frames are dropped - and mostly that shows up as video playback that is too fast. However, if I lower the requested fps to 12 - then according to the saved/total reports, I achieve something like 11 fps; and in this case, video playback doesn't look 'sped up'. And I still haven't tried aligning such a capture with a live video - so I have no idea if those frames that actually have been saved also have an accurate timestamp.
VNC capture
The VNC capture, for me, consists of running a VNC server on the 'target' PC, and running vncrec (twibright edition) on the 'recorder' PC. As the VNC server, I use vino, which is "System/Preferences/Remote Desktop (Preferences)". And apparently, even if vino configuration may not be the easiest thing to manage, vino as a server seems not too taxing on the 'target' PC, as I haven't experienced crashes when it runs in parallel with my test software.
On the other hand, when vncrec is capturing on the 'recorder' PC, it also raises a window showing you the 'target' desktop as it is seen in 'realtime'; when there are large updates (i.e. whole windows moving) on the 'target' - one can, quite visibly, see problems with the update/refresh rate on the 'recorder'. But, for only small updates (i.e. just a cursor moving on a static background), things seem OK.
This makes me wonder about one of my primary questions with this post - what is it, that sets the framerate in a VNC connection?
I haven't found a clear answer to this, but from bits and pieces of info (see refs below), I gather that:
The VNC server simply sends changes (screen changes + clicks etc) as fast as it can, when it receives them ; limited by the max network bandwidth that is available to the server
The VNC client receives those change events delayed and jittered by the network connection, and attempts to reconstruct the desktop "video" stream, again as fast as it can
... which means, one cannot state anything in terms of a stable, periodic frame rate (as in video).
As far as vncrec as a client goes, the end videos I get usually are declared as 10 fps, although frames can be rather displaced/jittered (which then requires the cutting in video editors). Note that the vncrec-twibright/README states: "The sample rate of the movie is 10 by default or overriden by VNCREC_MOVIE_FRAMERATE environment variable, or 10 if not specified."; however, the manpage also states "VNCREC_MOVIE_FRAMERATE - Specifies frame rate of the output movie. Has an effect only in -movie mode. Defaults to 10. Try 24 when your transcoder vomits from 10.". And if one looks into "vncrec/sockets.c" source, one can see:
void print_movie_frames_up_to_time(struct timeval tv)
{
static double framerate;
....
memcpy(out, bufoutptr, buffered);
if (appData.record)
{
writeLogHeader (); /* Writes the timestamp */
fwrite (bufoutptr, 1, buffered, vncLog);
}
... which shows that some timestamps are written - but whether those timestamps originate from the "original" 'target' PC, or the 'recorder' one, I cannot tell.
EDIT: thanks to the answer by @kanaka, I checked through vncrec/sockets.c again, and can see that it is the writeLogHeader function itself calling gettimeofday; so the timestamps it writes are local - that is, they originate from the 'recorder' PC (and hence, these timestamps do not accurately describe when the frames originated on the 'target' PC).
In any case, it still seems to me, that the server sends - and vncrec as client receives - whenever; and it is only in the process of encoding a video file from the raw capture afterwards, that some form of a frame rate is set/interpolated.
I'd also like to state that on my 'target' laptop, the wired network connection is broken; so the wireless is my only option to get access to the router and the local network - at far lower speed than the 100MB/s that the router could handle from wired connections. However, if the jitter in captured frames is caused by wrong timestamps due to load on the 'target' PC, I don't think good network bandwidth will help too much.
Finally, as far as VNC goes, there could be other alternatives to try - such as VNCast server (promising, but requires some time to build from source, and is in "early experimental version"); or MultiVNC (although, it just seems like a client/viewer, without options for recording).
ffmpeg with x11grab
Haven't played with this much, but, I've tried it in connection with netcat; this:
# 'target'
ffmpeg -f x11grab -b 8000k -r 30 -s 800x600 -i :0.0 -f rawvideo - | nc 192.168.1.100 5678
# 'recorder'
nc -l 0.0.0.0 5678 > raw.video
... does capture a file, but ffplay cannot read the captured file properly; while:
# 'target'
ffmpeg -f x11grab -b 500k -r 30 -s 800x600 -i :0.0 -f yuv4mpegpipe -pix_fmt yuv444p - | nc 192.168.1.100 5678
# 'recorder'
nc -l 0.0.0.0 5678 | ffmpeg -i - /path/to/samplimg%03d.png
does produce .png images - but with compression artifacts (result of the compression involved with yuv4mpegpipe, I guess).
Thus, I'm not liking ffmpeg+x11grab too much currently - but maybe I simply don't know how to set it up for my needs.
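For what it's worth, the likely reason ffplay cannot read the first capture is that rawvideo carries no header at all, so the reader must be told the pixel format, frame size and frame rate explicitly. A sketch under that assumption (option names per current ffmpeg/ffplay; older builds spell them -pix_fmt, -s and -r):
# 'target'
ffmpeg -f x11grab -r 30 -s 800x600 -i :0.0 -f rawvideo -pix_fmt rgb24 - | nc 192.168.1.100 5678
# 'recorder'
nc -l 0.0.0.0 5678 > raw.rgb
ffplay -f rawvideo -pixel_format rgb24 -video_size 800x600 -framerate 30 raw.rgb
# ... or turn it into the desired lossless PNG sequence:
ffmpeg -f rawvideo -pixel_format rgb24 -video_size 800x600 -framerate 30 -i raw.rgb sampleimg%03d.png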
*( graphics card -> DMA -> network )
I am, admittedly, not sure something like this exists - in fact, I would wager it doesn't :) And I'm no expert here, but I speculate:
if DMA memory transfer can be initiated from the graphics card (or its buffer that keeps the current desktop bitmap) as source, and the network adapter as destination - then in principle, it should be possible to obtain an uncompressed desktop capture with a correct (and decent) framerate. The point in using DMA transfer would be, of course, to relieve the processor from the task of copying the desktop image to the network interface (and thus, reduce the influence the capturing software can have on the processes running on the 'target' PC - especially those dealing with RAM or hard-disk).
A suggestion like this, of course, assumes that: there are massive amounts of network bandwidth (for 800x600 at 30 fps, at least 800*600*3*30 = 43,200,000 bytes/s ≈ 41 MiB/s, which should be OK for local 100 MB/s networks); plenty of hard disk on the other PC that does the 'recording' - and finally, software that can afterwards read that raw data, and generate image sequences or videos based on it :)
The bandwidth and hard disk demands I could live with - as long as there is guarantee both for a stable framerate and uncompressed data; which is why I'd love to hear if something like this already exists.
-- -- -- -- --
Well, I guess that was it - as brief as I could put it :) Any suggestions for tools - or process(es), that can result with a desktop capture
in uncompressed format (ultimately convertible to uncompressed/lossless PNG image sequence), and
with a "correctly timecoded", stable framerate
..., that will ultimately lend itself to 'easy', single command-line processing for generating 'picture-in-picture' overlay videos - will be greatly appreciated!
Thanks in advance for any comments,
Cheers!
References
Experiences Producing a Screencast on Linux for CryptoTE - idlebox.net
The VideoLAN Forums • View topic - VNC Client input support (like screen://)
VNCServer throttles user inpt for slow client - Kyprianou, Mark - com.realvnc.vnc-list - MarkMail
Linux FAQ - X Windows: How do I Display and Control a Remote Desktop using VNC
How much bandwidth does VNC require? RealVNC - Frequently asked questions
x11vnc: a VNC server for real X displays
HowtoRecordVNC (an X11 session) - Debian Wiki
Alternative To gtk-RecordMyDesktop in Ubuntu
(Ffmpeg-user) How do I use pipes in ffmpeg
(ffmpeg-devel) (PATCH) Fix segfault in x11grab when drawing Cursor on Xservers that don't support the XFixes extension
You should get a badge for such a long, well thought out question. ;-)
In answer to your primary question, VNC uses the RFB protocol which is a remote frame buffer protocol (thus the acronym) not a streaming video protocol. The VNC client sends a FrameBufferUpdateRequest message to the server which contains a viewport region that the client is interested in and an incremental flag. If the incremental flag is not set then the server will respond with a FrameBufferUpdate message that contains the content of the region requested. If the incremental flag is set then the server may respond with a FrameBufferUpdate message that contains whatever parts of the region requested that have changed since the last time the client was sent that region.
How requests and updates interact is not crisply defined. The server won't necessarily respond to every request with an update if nothing has changed. If the server has multiple requests queued from the client, it is also allowed to send a single update in response. In addition, the client really needs to be able to respond to an asynchronous update message from the server (not in response to a request), otherwise the client will fall out of sync (because RFB is not a framed protocol).
Often clients are simply implemented to send incremental update requests for the entire frame buffer viewport at a periodic interval and handle any server update messages as they arrive (i.e. no attempt is made to tie requests and updates together).
Here is a description of FrameBufferUpdateRequest messages.

How to screen capture screenshots or movies on the Linux framebuffer

How can the Linux frame buffer, on Cell Linux, be captured to obtain either screenshots or movies?
Is there a tool to do this for a running program, or must the program writing to, and presumably controlling, the frame buffer also handle capture and recording? If so, how would the program do so?
There are many tools for doing so, for example FBGrab and fbdump; look at the sources for those two - it would be pretty easy to extend either one, or write your own that captures video instead of just snapshots.
However, I would recommend that the program writing to the framebuffer be the one recording as well, in order to synchronize capturing frames with writing them (and not capture partway through a write, or skip frames, or ...).
You could use ffmpeg or avconv (e.g. avconv -f fbdev -i /dev/fb0 mymovie.flv).
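A couple of hedged one-liners along the same lines, assuming the framebuffer device is /dev/fb0 and your ffmpeg build includes the fbdev input device:
ffmpeg -f fbdev -i /dev/fb0 -frames:v 1 screenshot.png        # single screenshot
ffmpeg -f fbdev -framerate 25 -i /dev/fb0 -t 30 capture.mkv   # 30-second movie at 25 fps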
