How to Manufacture a "Transport endpoint is not connected" Problem in S3FS (for Testing Workarounds)

Do you know of a reliable way to induce a "Transport endpoint is not connected" failure state in S3FS?
Yes, I know that S3FS is dodgy and that S3 is not meant for mounting as a normal file system. I realize there are other, better solutions than S3FS. I have read the other threads on SO and I'm not interested in re-hashing recommended alternatives at the moment. Some day, I may consider other alternatives, but I have a deadline and I want to stick to the topic.
I plan to try out things like autofs and cron-triggered remounting scripts and I want to be fairly sure that I'm testing potential solutions against as faithful a reproduction case as I can muster.

"Transport endpoint is not connected" means that the s3fs process exited without unmounting cleanly. Usually this is due to s3fs crashing, e.g., segmentation fault, memory corruption, etc. It should not occur under normal operation but you can simulate it by sending a signal to s3fs: kill -s SEGV $(pidof s3fs).
Newer versions of s3fs (1.89 as of this writing) address many of the previously reported crashes. If you encounter one with the latest version, please re-run s3fs under gdb and report the backtrace to the s3fs GitHub issue tracker so we can fix the root cause.

Related

How to minimize process crashes with a FUSE file system

I'm working on a libfuse-based file system, and the process might periodically stop or hang because of internal bugs or exceptional situations. Disk-intensive read/write software critically depends on the disk mounted through this FUSE implementation, so any stop/restart might lead to data loss, even if I run the FUSE process under a supervisor. Do any techniques exist to minimize this?
The FUSE backend architecture is a process running an event loop that starts a worker thread to handle each request (open, read, etc.). That is not something you can change through the FUSE API.
The only way to handle crashes and hangs is to have an external watchdog program that monitors your file system and remounts it whenever it crashes or hangs. Inside your FUSE implementation, you should add a recovery module that periodically persists the file-system state to disk and knows how to import it back after a remount.
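As an illustration, a minimal watchdog might look like the following sketch; the mount point, the mount command, and the polling interval are assumptions, not part of any particular FUSE project:

    #!/bin/sh
    # Watchdog sketch: remount a FUSE fs if it crashes or stops responding.
    # /mnt/myfuse and myfuse_mount are hypothetical placeholders.
    MP=/mnt/myfuse

    while sleep 10; do
        # stat with a timeout catches both a dead mount (ENOTCONN) and a hang.
        if ! timeout 5 stat "$MP" >/dev/null 2>&1; then
            fusermount -u "$MP" 2>/dev/null || umount -l "$MP" 2>/dev/null
            myfuse_mount "$MP"   # replace with your actual mount command
        fi
    done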
Update:
Adding a link to a system that does just that, as referenced in a comment above (thanks!):
https://www.semanticscholar.org/paper/Refuse-to-crash-with-Re-FUSE-Sundararaman-Visampalli/022fc284362d04569a1561c3d04dfe0f377d6112

What would cause an abort signal to be sent to a Docker container?

My web service is running in a Docker container.
Recently, I've seen many SystemExit errors which are raised because the server I use (gunicorn) receives an abort signal.
I've checked the CPU and memory utilization monitors, but both are normal (less than 50% utilization), which doesn't seem likely to be the reason.
Since I may do some downloading on request in my service, I wonder whether it's caused by running out of file descriptors, but I've never seen a related exception raised in my logs.
What other reasons may result in an ABORT signal?
Try checking the soft and hard memory limits on the PaaS solution, and run a utility like strace or sysdig to figure out the reason for the exit.
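For example, attaching strace to the running process and tracing only signals will show which signal arrives and, through si_pid, which process sent it (the pidof lookup is just an illustration; the output format varies slightly between strace versions):

    # Attach to the running gunicorn master and log only signal activity.
    strace -f -e trace=signal -p "$(pidof -s gunicorn)"
    # A delivered signal appears roughly like this:
    #   --- SIGABRT {si_signo=SIGABRT, si_code=SI_USER, si_pid=1234, ...} ---
    # si_pid identifies the sending process.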
How are you starting the application inside the container? When you use CMD or ENTRYPOINT in a Dockerfile, you can choose either the exec form or the shell form. The exec form allows Docker to forward signals to the running process so that you can handle them there, which would help you understand the specific reasons for your aborts.
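As a sketch, an entrypoint script that uses exec keeps your server as PID 1, so signals delivered by Docker reach it directly; the script name and gunicorn arguments here are illustrative:

    #!/bin/sh
    # entrypoint.sh (illustrative; substitute your own gunicorn arguments).
    # Without "exec", /bin/sh stays PID 1 and does not forward signals;
    # with "exec", gunicorn replaces the shell and receives them itself.
    exec gunicorn --bind 0.0.0.0:8000 myapp.wsgi:application

In the Dockerfile you would then reference the script in exec form, e.g. ENTRYPOINT ["/entrypoint.sh"].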

flock and NFS -- what happens upon unexpected shutdown?

I am using flock within an HPC application on a file system shared among many machines via NFS. Locking works fine as long as all machines behave as expected (Quote from http://en.wikipedia.org/wiki/File_locking: "Kernel 2.6.12 and above implement flock calls on NFS files using POSIX byte-range locks. These locks will be visible to other NFS clients that implement fcntl-style POSIX locks").
I would like to know what is expected to happen if one of the machines that has acquired a certain lock unexpectedly shuts down, e.g. due to a power outage. I am not sure where to look this up. My guess is that this is entirely up to NFS and its way of dealing with the NFS handles of non-responsive machines. I could imagine that the other clients will still see the lock until a timeout occurs and the NFS server declares all NFS handles of the timed-out machine invalid. Is that correct? What would that timeout be? What happens if the machine comes up again within the timeout? Can you recommend a definitive reference to look all of this up?
Thanks!
When you use NFS v4 (!) the file will be unlocked when the server hasn't heard from the client for a certain amount of time. This lease period defaults to 90s.
There is a good explanation in the O'Reilly book about NFS and NIS, chapter 11.2. To sum up quickly: As NFS is stateless, the server has no way of knowing the client has crashed. The client is responsible for clearing the lock after it reboots.
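If you want to observe this behavior yourself, you can hold a lock from one NFS client and probe it from another using the flock(1) utility; the shared path below is a placeholder:

    # On client A: take an exclusive lock and hold it (then power the
    # machine off to simulate the unexpected shutdown).
    flock -x /shared/nfs/lockfile -c 'echo locked; sleep 3600'

    # On client B: try to take the same lock, giving up after 5 seconds.
    # While A holds the lock (or, after A crashes, until the NFSv4 lease
    # expires, about 90s by default), this fails; afterwards it succeeds.
    flock -x -w 5 /shared/nfs/lockfile -c 'echo got lock' || echo 'still locked'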

Should restarting a Linux host from within a cfengine policy be avoided?

Specifically, if cfengine is used to install the most recent version of an onboard device's firmware and do some tests to see if a reboot is required, and the results indicate that the machine needs a restart, is this something that can be done from within cfengine or should that practice be avoided? If so, why? My experience with Puppet tells me that stopping a run to reboot could be a Very Bad Thing in certain cases, so I'm wondering if the same limitations apply to cfengine as well.
Stopping a CFEngine run is not that bad; it's designed to be convergent and modifications are always atomic. If it stops, the next runs will behave correctly.
However, writing promises that restart a device could lead to bad surprises (for example, a flaw in the logic of the promise that results in never-ending restarts), so I suggest avoiding it if possible; if it is necessary (e.g., when handling thousands of devices), it should be thoroughly tested.
Like Nicolas said, there is no harm in stopping a CFEngine run. A CFEngine policy will continue converging the next time it runs. If you want to ensure that everything is properly finished before the reboot, you could just set a class that indicates a reboot is needed, and do the actual reboot in a separate bundle that is called near the end of your bundlesequence (I'm assuming CFEngine 3).
And indeed, be VERY mindful and test VERY carefully the conditions under which the reboot will take place!
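One common safety net, regardless of the configuration-management tool, is to guard the reboot command so that a flawed condition cannot cause a reboot loop. A minimal sketch, assuming the stamp file lives on persistent storage and a 24-hour minimum interval (both arbitrary choices):

    #!/bin/sh
    # Guarded reboot: refuse to reboot more than once per 24 hours.
    # /var/lib/last-reboot-request is an arbitrary stamp-file path; it must
    # be on persistent storage so the timestamp survives the reboot.
    STAMP=/var/lib/last-reboot-request

    if [ -f "$STAMP" ] && [ -n "$(find "$STAMP" -mmin -1440)" ]; then
        echo "rebooted recently; refusing to reboot again" >&2
        exit 1
    fi
    touch "$STAMP"
    /sbin/reboot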

Finding latency issues (stalls) in embedded Linux systems

I have an embedded Linux system running on an Atmel AT91SAM9260EK board on which I have two processes running at real-time priority. A manager process periodically "pings" a worker process using POSIX message queues to check the health of the worker process. Usually the round-trip ping takes about 1ms, but very occasionally it takes much longer - about 800ms. There are no other processes that run at a higher priority.
It appears the stall may be related to logging (syslog). If I stop logging the problem seems to go away. However it makes no difference if the log file is on JFFS2 or NFS. No other processes are writing to the "disk" - just syslog.
What tools are available to me to help me track down why these stalls are occurring? I am aware of latencytop and will be using that. Are there some other tools that may be more useful?
Some details:
Kernel version: 2.6.32.8
libc (syslog functions): uClibc 0.9.30.1
syslog: busybox 1.15.2
No swap space configured [added in edit]
root filesystem is on tmpfs (loaded from initramfs) [added in edit]
The problem is (as you said) syslogd. While your process is running at a real-time priority, syslogd is not. Additionally, syslogd does not lock its heap, so it can (and will) be paged out by the kernel, especially when it has very few "customers".
What you could try is:
Start another thread to manage a priority queue, and have that thread talk to syslog. Logging would then just be a matter of acquiring a lock and inserting something into a list. With only two subscribers, you should not spend much time acquiring the mutex.
Implement your own logging instead of using syslog (basically the first suggestion, minus talking to syslog).
I had a similar problem, and my first attempt to fix it was to modify syslogd itself to lock its heap. That was a disaster. I then tried rsyslogd, which improved things somewhat, but I still got random latency spikes. I ended up implementing my own logging using a priority queue to help ensure that the more critical messages were actually written first.
Note, if you are not using swap (at all), the shortest path to fixing this is probably implementing your own logging.
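A shell-level analogue of the same decoupling idea, offered as a sketch rather than what the answer above implemented, is to have the real-time process write log lines to a named pipe that a lower-priority reader drains into syslog:

    #!/bin/sh
    # Decouple logging from the real-time path: the RT process writes lines
    # to a FIFO; a lower-priority reader forwards them to syslog.
    # The FIFO path, tag, and niceness are illustrative.
    FIFO=/tmp/rt-log.fifo
    mkfifo "$FIFO"

    # Reader: reopens the FIFO after each EOF and logs every line it reads.
    nice -n 10 sh -c "while true; do logger -t rtapp < '$FIFO'; done" &

    # The RT process then logs with a plain write, e.g.:
    echo "worker ping ok" > "$FIFO"

Note that if the reader falls behind and the pipe buffer fills, writes from the real-time process will block, so keep messages short or open the FIFO non-blocking in the real-time code.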
