FUSE directory wrapper skeleton

FUSE can create virtual directories. Is there a public basic-skeleton implementation for FUSE?
For example, FUSE would mount directory a/ and reflect changes to b/: when the virtual file a/x.txt is created, the real file b/x.txt is created, and so on.
That may seem useless by itself, but I could then use it as base code and make my modifications.

You have exactly that as one of the libfuse included examples:
https://github.com/libfuse/libfuse/blob/master/example/passthrough.c
A good practice is, as you say, to copy the code of the example most relevant to you and expand from there.
See also the related question about mirroring a file system: "FUSE lib passthrough.c example. Where is it mirrors my / exactly?"
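
For a feel of what such a skeleton looks like, here is a heavily trimmed sketch in the spirit of passthrough.c, written against the libfuse 3 high-level API. Only getattr and readdir are shown, and the backing directory /tmp/b is a made-up placeholder; the real example implements many more operations.

/* Minimal passthrough-style FUSE skeleton (a sketch, not the real example).
 * Build: gcc skel.c `pkg-config fuse3 --cflags --libs` -o skel
 * Run:   ./skel /path/to/a   (mirrors the hardcoded backing dir /tmp/b) */
#define FUSE_USE_VERSION 31
#include <fuse.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <dirent.h>
#include <sys/stat.h>

static const char *backing = "/tmp/b";   /* the real directory b/ */

/* Map a path inside the mount to the corresponding path under b/. */
static void real_path(char *out, size_t n, const char *path)
{
    snprintf(out, n, "%s%s", backing, path);
}

static int skel_getattr(const char *path, struct stat *st,
                        struct fuse_file_info *fi)
{
    char rp[4096];
    (void)fi;
    real_path(rp, sizeof(rp), path);
    return lstat(rp, st) == -1 ? -errno : 0;
}

static int skel_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
                        off_t off, struct fuse_file_info *fi,
                        enum fuse_readdir_flags flags)
{
    char rp[4096];
    DIR *dp;
    struct dirent *de;
    (void)off; (void)fi; (void)flags;
    real_path(rp, sizeof(rp), path);
    dp = opendir(rp);
    if (!dp)
        return -errno;
    while ((de = readdir(dp)) != NULL)
        fill(buf, de->d_name, NULL, 0, 0);
    closedir(dp);
    return 0;
}

static const struct fuse_operations skel_ops = {
    .getattr = skel_getattr,
    .readdir = skel_readdir,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &skel_ops, NULL);
}

Creating files (your b/x.txt example) would additionally need create, write, and friends, which is exactly what the full passthrough.c provides.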

Related

How to create a "fake filesystem" that forwards system calls to my program?

I would like to write a tool that can be used to mount archives such as tar, tgz, zip, 7z, etc. to some directory for as long as it's running, such that I can then open it with whatever file manager I want.
To do this, I would somehow need to make a fake filesystem that forwards system calls such as opening and reading files to my program. How would I do this? Would I have to write my own filesystem driver, or does a library for this already exist?
FUSE is what you're looking for, in principle. One implementation of the archive mounting you describe is/was archivemount, but I am unsure how well it is maintained.
You want to write a FUSE filesystem. It is explained in this article and there is an example in python: https://thepythoncorner.com/dev/writing-a-fuse-filesystem-in-python/
No need to call this a "fake filesystem": on Linux everything can be a file, so you are just writing a "filesystem".
Your question is very general. I will assume you want to write your program in Python, because that is what I would do.
Having chosen the language, we now need to look for examples of somebody doing something similar. GitHub will do the trick:
To mount wikipedia: wikifs
To mount Google Music: GmusicFS
N.B. wikifs is less than experimental, but the script is short and serves perfectly as an example.
For Python the most popular library is fuse-python.
Happy hacking!

Creating a file under /sys/devices in Linux

I want to create a file under /sys/devices directory in Linux.
What is the best way to do this?
This answer and explanation came up after a quick Google search:
Why can't I create a directory in /sys (link removed because of posting limits)
Wikipedia: Sysfs (link removed because of posting limits)
If you absolutely have to modify/create anything there, you should first understand how /sys works. And why you want to change it.
EDIT: Petesh pointed out that you were indeed referring to drivers.
As I understand it, /sys/devices is simply a place for devices to expose information about themselves. You don't insert drivers here.
Drivers, or modules, are either built into the kernel when it is compiled,
or you can add the module to /usr/lib/modules/$(uname -r)/extramodules/, or overwrite one in /usr/lib/modules/$(uname -r)/kernel/fs/btrfs/.
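To illustrate that point: files under /sys come into existence when kernel code registers kobjects and attributes, not by anyone creating them by hand. Here is a minimal, untested sketch of a module that creates /sys/kernel/demo/demo (all names made up; sysfs_emit() requires kernel 5.10 or newer):

/* sysfs demo: loading this module creates /sys/kernel/demo/demo (read-only). */
#include <linux/module.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>

static struct kobject *demo_kobj;

/* Called when userspace reads /sys/kernel/demo/demo. */
static ssize_t demo_show(struct kobject *kobj, struct kobj_attribute *attr,
                         char *buf)
{
        return sysfs_emit(buf, "hello from sysfs\n");
}

static struct kobj_attribute demo_attr = __ATTR_RO(demo);

static int __init demo_init(void)
{
        int ret;

        demo_kobj = kobject_create_and_add("demo", kernel_kobj);
        if (!demo_kobj)
                return -ENOMEM;
        ret = sysfs_create_file(demo_kobj, &demo_attr.attr);
        if (ret)
                kobject_put(demo_kobj);
        return ret;
}

static void __exit demo_exit(void)
{
        kobject_put(demo_kobj);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");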
You may also want to look at these:
Arch: Manual module handling
The Linux Kernel Module Programming Guide

Linux Kernel Module that can list files and folders inside a given path

I would like to know whether it is possible to list the files and folders inside a given folder from within the Linux kernel. I bet there is a way.
I have searched online and gave it a few shots, but I still could not do it.
Thank you!
Reacting to your comment: your question isn't about reading files, but about getting the entries of a directory. About your last sentence: yes, every filesystem implements the readdir() operation, so this would be filesystem-independent.
In my opinion, you need the following steps:
Research how to write kernel modules. There are many tutorials on the net, including step-by-step tutorials with well-commented examples.
Write a simple module, which printk()-s some simple text in its initialization function.
Research how you can call system calls from a kernel module. It is probably not as simple as from user space, but almost surely possible.
The simplest way is to pass the path of the directory in a module parameter. Linux kernel modules can have multiple parameters, whose processing is very well automated (essentially, you can bind a parameter name directly to a static variable in the module).
Once your module can call system calls and has its input, you can open the directory in its init function and read its content. Note that opendir() and readdir() are C library functions and don't exist inside the kernel; the in-kernel equivalents are filp_open() and iterate_dir() (see the sketch after this list). Finally, output the result with printk().
Probably there will be some obstacles, for example maybe you can't use syscalls from the module init function, or similar, but none of them will be really hard.
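
As a rough sketch of steps 2-5 combined, using filp_open() and iterate_dir() rather than libc calls: note that the dir_context actor callback signature has changed across kernel versions (the bool-returning form below matches kernels 6.1 and newer), and the whole thing is untested.

/* dirlist: print the entries of a directory (module parameter) at load time. */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/file.h>

static char *dirpath = "/tmp";
module_param(dirpath, charp, 0444);
MODULE_PARM_DESC(dirpath, "Directory whose entries are printed at load time");

/* Called once per directory entry by iterate_dir(). */
static bool dirlist_actor(struct dir_context *ctx, const char *name,
                          int namelen, loff_t offset, u64 ino,
                          unsigned int d_type)
{
        pr_info("dirlist: %.*s\n", namelen, name);
        return true;    /* keep iterating */
}

static int __init dirlist_init(void)
{
        struct dir_context ctx = { .actor = dirlist_actor };
        struct file *dir = filp_open(dirpath, O_RDONLY | O_DIRECTORY, 0);

        if (IS_ERR(dir))
                return PTR_ERR(dir);
        iterate_dir(dir, &ctx);
        filp_close(dir, NULL);
        return 0;
}

static void __exit dirlist_exit(void) { }

module_init(dirlist_init);
module_exit(dirlist_exit);
MODULE_LICENSE("GPL");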

How to get a list of paths in /etc/ld.so.conf on Linux

What is the most portable and robust way to get the list of paths configured by /etc/ld.so.conf and the files it includes? Parsing the file manually does not seem to be a good idea: the format is likely to change in future revisions.
To allow better understanding of the question, I will give you specific details below. Note that, despite these details, this is a general programming question, applicable to other situations.
There is a program, called LuaRocks. It is a package manager for Lua programming language (somewhat like Ruby gems or Python eggs). LuaRocks packages are called "rocks".
As a convenience feature, LuaRocks allows a rock author to specify a list of external dependencies for a rock, formulated as a list of C header files and / or dynamic library files. (.so on Linux.) If the specified file does not exist, the rock can't be installed.
Currently, on Linux, LuaRocks by default checks for the existence of a .so file by searching for it in two hardcoded paths, /usr/lib and /usr/local/lib.
I believe that this is incorrect behaviour, and it is broken by recent changes in Ubuntu and other Debian-derived distributions.
Update: the paths are not hardcoded per se, but are user-configurable in the config file. Still, IMO, not the best solution.
Instead (as I understand it), LuaRocks should look the file up in the paths specified by /etc/ld.so.conf and the files it includes.
(Now please re-read the question above ;-) )
You shouldn't need to parse /etc/ld.so.conf or any of the config files it includes: if you run ldconfig, it scans the configured directories and generates a cache file, /etc/ld.so.cache.
Subsequently, when you attempt a dlopen(), the loader automatically finds the files by going through the cached library directories. The same goes for compiling with -lSomeLib: you shouldn't need to specify -L/my/other/path if it's configured in ld.so.conf(.d).
autoconf accomplishes this by attempting to compile a test program that links to the shared library, but that's just a functional wrapper around the dlopen() call.
So, while other methods may not necessarily be 'wrong', at the root of it attempting to link to the library or doing a dlopen() are the 'most right' ways of doing it.
Consider this: if you attempt to link against a library in a directory that ISN'T cached in /etc/ld.so.cache, then when you try to run the program it will fail, because it won't be able to dlopen() the library!
Hence, any 'good' shared library will be in /etc/ld.so.cache and be linkable/dlopen()able, this means that gcc can use it to link and that the user-generated library or executable will be able to open it when it executes.
You can circumvent this by expressly setting the environment variables LD_LIBRARY_PATH or LD_PRELOAD, but each of these has its own caveats and should be avoided if possible for 'standard' use.
A good write-up on writing shared libraries covers some of these issues and is a good read for anyone programmatically consuming other shared libraries: Ulrich Drepper's "How to Write Shared Libraries".
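
In code, the probe this answer describes is just a dlopen() attempt. A minimal sketch (the library name libfoo.so is a placeholder; link with -ldl on older glibc):

/* check.c: report whether the dynamic loader can resolve a library
 * through its normal search order (ld.so cache, rpath, LD_LIBRARY_PATH). */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *h = dlopen("libfoo.so", RTLD_LAZY);
    if (!h) {
        fprintf(stderr, "not found: %s\n", dlerror());
        return 1;
    }
    dlclose(h);
    puts("libfoo.so is resolvable");
    return 0;
}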
According to the FHS, the following are valid locations for dynamic libraries:
/lib*/
/opt/*/lib*/
/usr/lib*/
/usr/local/lib*/
(And most likely ~/lib*/ as well.)
All entries in my /etc/ld.so.conf.d/* conform to this. Some entries reference subdirectories below the FHS dirs, which probably means that you can use the libraries in there without path information.
Now, I don't know enough about LuaRocks. If you're limited to Lua-path-style globs (only ?), you cannot match these directories and will have to parse the configs. Otherwise, you could simply try to find the libraries anywhere in these directories.
This would break on non-FHS-conforming systems (where the only option is parsing the config), and if a directory is not included in the config, the installer might see libraries that the linker cannot find.
Both risks seem acceptable to me, so I'd simply ignore the config and look at these dirs.
(Another possibility would be trying to link against the library, which should automagically use the right path. However, that is platform-specific and may be dangerous.)

Recommended FHS compliant application test/install workflow under Linux?

I'm in the process of switching to Linux for development, and I'm puzzled about how to maintain good FHS compliance in my programs.
For example, under Windows I know that all the resources (bitmaps, audio data, etc.) that my program needs can be found with relative paths from the executable, so it's the same whether I'm running the program from my development directory or from an installation (under "Program Files", for example): the program can locate all its files either way.
Now, under Linux, I see that usually the executable goes under /usr/local/bin and its resources under /usr/local/share. (And the truth is that I'm not even sure of this.)
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
Is there any standard or recommended way that lets me just rebuild the binary for testing and use the files in the project/data directory, while still being able to locate the files when they are under /usr/local/share?
I thought, for example, of setting a symlink under /usr/local/share pointing to my resources dir and then just hardcoding that path inside my program, but that feels quite hackish and not very portable.
I also thought of running an install script that copies all the resources to /usr/local/share every time I change or add resources, but that doesn't feel like a good way to do it either.
Could anyone tell me, or point me to where I can read about, how this issue is usually resolved?
Thanks!
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
You can organize your source tree as you wish — it need not bear any resemblance to the FHS layout desired of installed software.
I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
The standard prefix is /usr. /usr/local is for, well, "local installations" as the FHS spec reiterates.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory
Definitely. Running ./configure --datadir=$PWD/share, for example, is the way to point your build at the data files from the source tree (substitute the proper path), and using something like -DDATADIR="'${datadir}'" in AM_CFLAGS makes the value known to the (presumably C) code. (All of that provided you are using autoconf/automake; similar options are available in other build systems.)
This sort of hardcoding is what is used in practice, and it suffices. For a development build within your own working copy, having a hardcoded path is not a problem, and final builds (those done by a packager) will simply use the standard FHS paths.
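
On the C side, consuming that macro might look like the following sketch (DATADIR, the fallback path, and the file name are all illustrative):

/* resources.c: locate a data file via the build-time DATADIR macro. */
#include <stdio.h>

#ifndef DATADIR
#define DATADIR "/usr/local/share/myapp"   /* fallback for ad-hoc builds */
#endif

int main(void)
{
    printf("loading resources from %s\n", DATADIR "/sprites.png");
    return 0;
}

A development build configured with --datadir=$PWD/share then reads from the source tree, while a packaged build reads from the standard FHS location.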
You could just test a few locations. For example, first check if you have a data directory within the directory you're currently running the program from. If so, just go ahead and use it. If not, try /usr/local/share/yourproject/data, and so on.
For developing/testing, you can use the data directory within your project folder, and for deploying, use the stuff in /usr/local/share/. Of course, you can test for even more locations (e.g. /usr/share).
Basically, the requirement for this method is that you have a function that builds the correct path for every filesystem access. Instead of fopen("data/blabla.conf", "w"), use something like fopen(path("blabla.conf"), "w"). path() constructs the correct path from the directory determined by the tests run when the program started. E.g. if the detected path was /usr/local/share/yourproject/data/, then path("blabla.conf") returns "/usr/local/share/yourproject/data/blabla.conf" - and there is your nice absolute path.
That's how I'd do it. HTH.
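
A minimal sketch of that scheme in C, with illustrative candidate directories and deliberately naive error handling (call init_datadir() once at program start):

/* Probe candidate data directories once, then build absolute paths. */
#include <stdio.h>
#include <sys/stat.h>

static char datadir[512];

void init_datadir(void)
{
    const char *candidates[] = {
        "data",                              /* running from the project dir */
        "/usr/local/share/yourproject/data",
        "/usr/share/yourproject/data",
    };
    struct stat st;
    for (size_t i = 0; i < sizeof(candidates) / sizeof(*candidates); i++) {
        if (stat(candidates[i], &st) == 0 && S_ISDIR(st.st_mode)) {
            snprintf(datadir, sizeof(datadir), "%s", candidates[i]);
            return;
        }
    }
}

const char *path(const char *file)
{
    static char buf[1024];                   /* not thread-safe: a sketch */
    snprintf(buf, sizeof(buf), "%s/%s", datadir, file);
    return buf;
}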
My preferred solution in cases like this is to use a configuration file, along with a command-line option that overrides its location.
For example, a configuration file for a fully deployed application named myapp could reside in /etc/myapp/settings.conf and a part of it could look like this:
...
confdir=/etc/myapp/
bindir=/usr/bin/
datadir=/usr/share/myapp/
docdir=/usr/share/doc/myapp/
...
Your application (or a launcher script) can parse this file to determine where to find the rest of the needed files.
I believe that you can reasonably assume in your code that the location of the configuration file is fixed under /etc/myapp - or any other location specified at compile time. Then you provide a command line option to allow that location to be overridden:
myapp --configfile=/opt/myapp/etc/settings.conf ...
It might also make sense to have options for some of the directory paths as well, so that the user can easily override any of the configuration file settings. This approach has a couple of advantages:
Your users can relocate the application very easily - just by moving the files, modifying the paths in the configuration file and then using e.g. a wrapper script to call the main application with the proper --configfile option.
You can easily support FHS, as well as any other scheme you need to.
While developing, you can have your testsuite use a specially crafted configuration file with the paths being wherever you need them to be.
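
As a sketch, honoring the --configfile override and reading the key=value format shown above might look like this in C (the default location and option name come from the example; everything else is illustrative):

/* myapp.c: pick the settings file, allowing --configfile=PATH to override. */
#include <stdio.h>
#include <string.h>

#define DEFAULT_CONF "/etc/myapp/settings.conf"

static void load_conf(const char *file)
{
    char line[512];
    FILE *f = fopen(file, "r");
    if (!f) {
        fprintf(stderr, "myapp: cannot open %s\n", file);  /* abort loudly */
        return;
    }
    while (fgets(line, sizeof(line), f)) {
        char *eq = strchr(line, '=');
        if (line[0] == '#' || !eq)
            continue;                        /* skip comments and junk */
        *eq = '\0';
        char *val = eq + 1;
        val[strcspn(val, "\n")] = '\0';      /* strip trailing newline */
        printf("key '%s' -> value '%s'\n", line, val);
    }
    fclose(f);
}

int main(int argc, char **argv)
{
    const char *conf = DEFAULT_CONF;
    for (int i = 1; i < argc; i++)
        if (strncmp(argv[i], "--configfile=", 13) == 0)
            conf = argv[i] + 13;             /* command line wins */
    load_conf(conf);
    return 0;
}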
Some people advocate probing the system at runtime to resolve issues like this. I usually suggest avoiding such solutions for at least the following reasons:
It makes your program non-deterministic. You can never tell at first glance which configuration file it picks up, especially if you have multiple versions of the application on your system.
At any installation mix-up, the application will remain fat and happy - and so will the user. In my opinion, the application should look at one specific and well-documented location and abort with an informative message if it cannot find what it is looking for.
It's highly unlikely that you will always get everything right. There will always be unexpected rare environments or corner cases that the application will not handle.
Such behaviour is against the Unix philosophy. Even command shells probe multiple locations only because each of those locations can hold a file that should be parsed.
EDIT:
This method is not mandated by any formal standard that I know of, but it is the prevalent solution in the Unix world. Most major daemons (e.g. BIND, sendmail, postfix, INN, Apache) will look for a configuration file at a certain location, but will allow you to override that location and - through the file - any other path.
This is mostly to allow the system administrator to implement whatever scheme they want or to set up multiple concurrent installations, but it does help during testing as well. This flexibility is what makes it a best practice, if not a proper standard.
