Access data from FUSE filesystem - fuse

Is there any way that I can access the data created by my FUSE filesystem process?
e.g.
In prefix_write() I store some data in memory and would like to access that data from another process.
Shared memory would work, but I'm looking for a more elegant solution, such as a custom field in fuse_operations that other processes could call as a function. As far as I know, though, the fields in fuse_operations correspond to the POSIX filesystem operations, so that's probably impossible. Please correct me if I'm wrong.
Thanks

Is the other process you are speaking of forked by a common parent? If so, it should be fairly easy to send data: create a pipe before forking, so the file descriptors returned by pipe() are inherited by the child process. Note that a single pipe is unidirectional; for bi-directional transfer you need two pipes or a socketpair().
If that is not your use case, can you explain why you want a foreign process to access another process's data?

Related

Linux kernel thread serialization

I'm writing a Linux kernel module (an LSM).
It is easy to hook several Linux kernel operations, but I'm wondering how the hooks are called from multiple threads (and processes).
I'm going to use shared data that may be read/written from multiple threads.
Is there any rule on how such shared data should be locked?
If the LSM hook calls were serialized by the caller, we wouldn't need to lock the shared data, but I couldn't find any documentation that says so specifically.
I also want to know the same thing for procfs and securityfs interface operations.
Are they serialized by the caller?
My guess is that they are NOT serialized, since no docs say so, but I'm still not sure.

Process-specific data in kernel

Say I have a process calling a file device operation like read. Before this read, the process also called a syscall (defined by me), providing me with some information relevant to the read (and possibly other future reads done by this process). What is the best way of achieving this sort of information flow in the kernel? Is there any good way to store process-specific information other than maintaining some pid-indexed list?
I'd like the syscall information stored in the kernel to be inherited by children of that process too. Would it be possible to achieve that without (somehow) traversing the parent-child process tree? (That wouldn't give me the inheritance I want anyway, because after forking I don't want changes in the parent to affect the child.)
Just like the init_task variable, which gives the starting point of the task list and is accessible anywhere in kernel code, you can add a variable that your system call sets to the appropriate value and that your read (and other relevant) methods then access.

fork and IPC mechanism

I'm writing a single-threaded, memory-heavy proof-of-concept application.
The application doesn't manipulate much data per se; it will mainly load GBs of data and then run some analysis on it.
I don't want to manage concurrency via a multi-threaded implementation and don't want to implement locks (mutexes, spinlocks, ...), so this time around I've decided to use the dear old fork().
On Linux, where fork()ed memory is copy-on-write (CoW), I should be able to analyse the same datasets efficiently, without copying them explicitly, and with simple parallel single-threaded logic (again, this is a proof of concept).
With fork() it is very easy to set up the input parameters for a sub-task (a sub-process in this case), but then I have to get the results back to the main process, and sometimes these results are tens of GB. The IPC mechanisms I have in mind are:
Pipes/sockets (then an epoll equivalent to wait for results in a single-threaded fashion)
Hybrid pipes/shared memory (an epoll equivalent to wait for results carrying a reference into shared memory, then copy the data from shared memory into the parent process and destroy the shared memory)
What else could I use? Apart from the obvious "go multi-threaded", I would really like to leverage CoW and the single-thread, multi-process architecture for this proof of concept. Any ideas?
Thanks
After some experimenting, the conclusion I came to is the following:
Before spawning a child process that has to communicate with the parent, I create a shared memory segment (e.g. 16 MB).
If coordination is needed, a semaphore is created in the shared memory segment.
Upon forking, I create a non-blocking pipe with pipe2() so the child can notify the parent when data is available.
The pipe's read end is then registered with epoll.
epoll is used level-triggered, so I can interleave requests if the child processes are very fast at sending data.
The shared memory segment is used to transfer the data itself: directly if the structures are POD, or via simple template<...> binary read/write functions if they are not.
I believe this is a good solution.
Cheers
You could also use a regular file.
The parent process could wait for the child process to exit (after it has analysed the data in memory and written its result to a file); once it does, the parent can read the data from the file. As you mentioned, input parameters are not a problem: you could just pass the name of the file to write to as one of them. This way no locking is required, apart from wait()ing on the exit status of the child process.
If each of your child processes returns tens of GB of data, regular files are much better, as you will have enough time to process each child's result. But is this data shared across the child processes? If it were, you would have preferred locks, so I assume it isn't.

Passing messages between processes

I need to write a simple function which does the following in linux:
Create two processes.
Have thread1 in Process1 do some small operation and send a message to Process2 via thread2 once the operation is completed.
*Process2 shall acknowledge the received message.
I have no idea where to begin.
I have written two simple programs which count from 0 to 1000 in a loop (the loop runs in a function called by a thread), and I have compiled them to get the binaries.
I am executing them one after the other (both running in the background) from a shell script.
Once process1 reaches 1000 in its loop, I want it to send a "Complete" message to the other process.
I am not sure if my approach on the process front is correct, and I have absolutely no idea how to make the two communicate.
Any help will be appreciated.
LostinSpace
You'd probably want to use pipes for this. Depending on how the processes are started, you either want named or anonymous pipes:
Use named pipes (aka fifo, man mkfifo) if the processes are started independently of each other.
Use anonymous pipes (man 2 pipe) if the processes are started by a parent process through forking. The parent process would create the pipes, the child processes would inherit them. This is probably the "most beautiful" solution.
In both cases, the end points of the pipes are used just like any other file descriptor (but more like sockets than files).
If you aren't familiar with pipes yet, I recommend getting a copy of Marc Rochkind's book "Advanced UNIX programming" where these techniques are explained in great detail and easy to understand example code. That book also presents other inter-process communication methods (the only really other useful inter-process communication method on POSIX systems is shared memory, but just for fun/completeness he presents some hacks).
Since you create the processes (I assume you are using fork()), you may want to look at eventfd().
eventfd() provides a lightweight mechanism for sending events from one process or thread to another.
More information on eventfd() and a small example can be found at http://man7.org/linux/man-pages/man2/eventfd.2.html.
Signals or named pipes (since you're starting the two processes separately) are probably the way to go here if you're just looking for a simple solution. For signals, your client process (the one sending "Done") will need to know the process ID of the server, and for named pipes both will need to know the location of a pipe file to communicate through.
However, I want to point out a neat IPC/networking tool that can make your job a lot easier if you're designing a larger, more robust system: 0MQ can make this kind of client/server interaction dead simple, and allows you to start up the programs in whatever order you like (if you structure your code correctly). I highly recommend it.

can a vfork child access parent variables?

How does a child process modify or read data in the parent process after vfork()?
Are the variables declared in the parent process directly accessible to the child?
I have a process which creates some data structures. I then need to fork a child process
which needs to read/write these data structures. The child will be exec'ed into a program different from the parent.
In general, one process cannot directly modify another's memory. (With vfork() specifically, the child borrows the parent's address space until it calls exec() or _exit(), but POSIX leaves the behaviour undefined if the child modifies anything there, so you cannot rely on that.) What you would typically do is create a pipe or another mechanism that can cross process boundaries. Open descriptors are inherited by the child process if you use fork(). The child can then send messages to the parent instructing it to modify the data structures as required.
The form of the messages can be the difficult part of this design. You can:
Design a protocol that carries values and instructions on what to do with them.
Use an existing marshaling tool such as Google Protocol Buffers.
Use Remote Procedure Calls with one of the existing RPC mechanisms (e.g. ONC/Sun RPC).
You can also use a manually set up shared memory scheme that allows both processes to access common memory. The parent process would allocate storage for its data structures in that shared memory; the child process would map it into its own address space and access the structures there. You would need some sort of synchronisation mechanism, depending on how you use the data.
