How is the lock keyword in Julia's open function used? - multithreading

The open(filename::AbstractString; lock = true, keywords...) function has a keyword lock. According to the official documentation, "The lock keyword argument controls whether operations will be locked for safe multi-threaded access." Even after setting the lock keyword to true explicitly, I can open the same file in a different process (for testing I used 2 Julia REPLs) and edit it from both processes. Now, my questions are:
What does lock keyword actually do?
Is there any way to open a file exclusively in one process and make other processes wait for it, to prevent a data race?
Even after lock=true, can the file get corrupted if two processes try to write to it at the same time?
What does "safe multi-threaded access" mean in this context?
I am using Julia 1.7.3 on Linux Mint 20.3.

Related

Workaround for ncurses multi-thread read and write

This is what it says at http://invisible-island.net/ncurses/ncurses.faq.html#multithread:
If you have a program which uses curses in more than one thread, you will almost certainly see odd behavior. That is because curses relies upon static variables for both input and output. Using one thread for input and other(s) for output cannot solve the problem, nor can extra screen updates help. This FAQ is not a tutorial on threaded programming.
Specifically, it mentions that it is not safe even if input and output are done on separate threads. Would it be safe if we further used a mutex around the whole ncurses library, so that at most one thread can be calling any ncurses function at a time? If not, what would be other cheap workarounds to use ncurses safely in a multi-threaded application?
I'm asking this question because I notice a real application often has its own event loop but relies on the ncurses getch function to get keyboard input. But if the main thread is blocked waiting in its own event loop, it has no chance to call getch. A seemingly applicable solution is to call getch in a different thread, which hasn't caused me a problem yet, but as the quote above says it is actually not safe, and this was verified by another user here. So I'm wondering what is the best way to merge getch into an application's own event loop.
I'm considering making getch non-blocking and waking up the main thread regularly (every 10-100 ms) to check if there is something to read. But this adds an additional delay between key events and makes the application less responsive. Also, I'm not sure if that would cause any problems with some ncurses internal delay such as ESCDELAY.
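For reference, a minimal sketch of that non-blocking variant (assuming a plain C ncurses program; timeout(100) makes getch() return ERR if nothing arrives within 100 ms):

#include <curses.h>

int main(void)
{
    initscr();
    cbreak();
    noecho();
    timeout(100);             /* getch() gives up after 100 ms and returns ERR */

    for (;;) {
        int ch = getch();     /* wakes the loop up at least every 100 ms */
        if (ch == 'q')
            break;            /* quit key chosen only for this example */
        if (ch != ERR) {
            /* handle the key event */
        }
        /* run one iteration of the application's own event loop here */
    }

    endwin();
    return 0;
}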
Another solution I'm considering is to poll stdin directly. But I guess ncurses should also be doing something like that and reading the same stream from two different places looks bad.
The text also mentions the "ncursest" or "ncursestw" libraries, but they seem to be less available, for example, if you are using a different language binding of curses. It would be great if there is a viable solution with the standard ncurses library.
Without the thread-support configuration, you're out of luck for using curses functions in more than one thread. That's because most of the curses calls use static or global data. The getch function, for instance, calls refresh, which can update the whole screen using the global pointers curscr and stdscr. The difference in the thread-support configuration is that global values are converted to functions and mutexes are added.
If you want to read stdin from a different thread and run curses in one thread, you probably can make that work by checking the file descriptor (i.e., 0) for pending activity and alerting the thread which runs curses to tell it to read data.
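A minimal sketch of that idea, assuming a POSIX system: a helper thread only watches fd 0 with poll() and alerts the curses thread through a pipe, so every curses call still happens in one thread. The pipe/semaphore handshake and the quit key are illustrative choices, not anything prescribed by ncurses.

#include <curses.h>
#include <poll.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static int notify_pipe[2];   /* watcher thread -> curses thread */
static sem_t consumed;       /* curses thread -> watcher: "stdin was drained" */

/* Watcher thread: never calls any curses function, it only reports
 * that fd 0 has pending input. */
static void *stdin_watcher(void *arg)
{
    struct pollfd pfd = { .fd = 0, .events = POLLIN };
    (void)arg;
    for (;;) {
        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            char token = 'k';
            write(notify_pipe[1], &token, 1);  /* alert the curses thread */
            sem_wait(&consumed);               /* wait until it has called getch() */
        }
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;

    initscr();
    cbreak();
    noecho();

    pipe(notify_pipe);
    sem_init(&consumed, 0, 0);
    pthread_create(&tid, NULL, stdin_watcher, NULL);

    /* The application's own event loop: the notify pipe can be watched
     * alongside any other descriptors the application cares about. */
    for (;;) {
        struct pollfd pfd = { .fd = notify_pipe[0], .events = POLLIN };
        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            char token;
            int ch;
            read(notify_pipe[0], &token, 1);
            ch = getch();            /* safe: only this thread touches curses */
            sem_post(&consumed);     /* let the watcher resume polling */
            if (ch == 'q')           /* quit key chosen only for this example */
                break;
        }
    }

    endwin();
    return 0;
}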

There is a __cond_lock(x,c) define in compiler.h, but no __cond_unlock(x,c) define?

In compiler.h, there is a macro defined as below:
# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
My question is this: there is a __cond_lock definition but no corresponding __cond_unlock, so how is the release of the lock kept consistent with the acquisition that __cond_lock annotates?
I also checked the definition of spin_trylock(): it uses __cond_lock, but it also calls a _spin_trylock function, and after a few calls _spin_trylock ends up using __acquire as well. That amounts to counting the acquisition twice, which makes Sparse emit a warning. I wrote some test code to check this, and the warning does indeed appear; if I write the unlock instruction twice, the warning goes away, but then the code is inconsistent with how the program actually runs.
Protecting critical sections using locking is up to the programmer. That means that if you hold a lock to protect a critical region, you must release the lock when you're finished.
There are various types of locking primitives inside the Linux kernel, e.g. spin_lock(), spin_lock_irq(), spin_trylock(). They have their own purposes. Now, spin_trylock() uses __cond_lock inside it to express that the lock is acquired only when it was actually available and the attempt succeeded. Take a look at a few examples of how spin_trylock or __cond_lock is used. For example, in kernel/sched/fair.c::rebalance_domain (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/fair.c?id=d8dfad3876e4386666b759da3c833d62fb8b2267#n5574) see how the balancing code uses spin_trylock() to take the lock and then releases it conditionally. Another example can be found in kernel/posix-timers.c, in the lock_timer() macro. If you look closely at the uses of lock_timer() you'll see how __cond_lock is used inside the kernel, and hopefully your confusion will disappear.
In other words, __cond_lock is used to take a lock conditionally and is not used directly on its own. It lets you express that a particular lock is held only on the success path, so it can be checked and released there, and that is what the examples above do.
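The kernel macro itself only matters for Sparse annotation, but the calling pattern it documents (take the lock only if the trylock succeeded, and unlock only on that same path) can be sketched in ordinary userspace code. This is a hypothetical pthreads analogue, not the actual kernel implementation:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Analogue of spin_trylock(): returns 1 with the lock held on success,
 * 0 otherwise. The conditional acquisition is what __cond_lock() tells
 * Sparse about; the release happens through the normal unlock call, so
 * no separate __cond_unlock is needed. */
static int try_enter(void)
{
    return pthread_mutex_trylock(&lock) == 0;
}

int main(void)
{
    if (try_enter()) {                 /* lock taken only on this branch */
        counter++;                     /* critical section */
        pthread_mutex_unlock(&lock);   /* matching release on the same branch */
    } else {
        printf("lock was busy, skipped the critical section\n");
    }
    return 0;
}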

cross-process locking in linux

I am looking to make an application in Linux where only one instance of the application can run at a time. I want to make it robust, such that if an instance of the app crashes, it won't block all the other instances indefinitely. I would really appreciate some example code on how to do this (as there's lots of discussion on this topic on the web, but I couldn't find anything that worked when I tried it).
You can use file locking facilities that Linux provides. You haven't specified the language, however you might find this capability pretty much everywhere in some form or another.
Here is a simple idea of how to do that in a C program. When the program starts, you can take an exclusive non-blocking lock on the whole file using the fcntl system call. When another instance of the application attempts to start, it will get an error trying to lock the file, which means the application is already running.
Here is a small example of how to take a full-file lock using fcntl (the call provides facilities for byte-range locks, but when the length is 0 the whole file is locked).
/* needs <fcntl.h>, <string.h>, <unistd.h>; the path is only an example */
int fd = open("/var/run/myapp.lock", O_RDWR | O_CREAT, 0644);
struct flock lock_struct;
memset(&lock_struct, 0, sizeof(lock_struct));
lock_struct.l_type = F_WRLCK;     /* exclusive (write) lock */
lock_struct.l_whence = SEEK_SET;
lock_struct.l_len = 0;            /* length 0 locks the whole file */
/* F_SETLK is non-blocking: it fails with EACCES/EAGAIN if the file is already locked */
int ret = fcntl(fd, F_SETLK, &lock_struct);
Please note that you need to open a file first to put a lock on it. This means you need to have a file around to use for locking. It might be useful to put it somewhere where it won't cause any distraction/confusion for other applications.
When the process terminates, all locks that it has taken will be released, so nothing will be blocked.
This is just one of the ideas. I'm pretty sure there are other ways to do it as well.
The conventional UNIX way of doing this is with PID files.
Before a process starts, it checks to see if a pre-determined file, usually /var/run/<process_name>.pid, exists. If found, it's an indication that a process is already running, and this process quits.
If the file does not exist, this is the first process to run. It creates the file /var/run/<process_name>.pid and writes its PID into it. The process unlinks the file on exit.
Update:
To handle cases where a daemon has crashed and left behind the PID file, additional checks can be made during startup if a PID file was found (a sketch follows the list):
Do a ps and ensure that a process with that PID doesn't exist.
If a process with that PID does exist, ensure that it is a different program (i.e. the PID has since been reused), either from the said ps output or from /proc/$PID/stat.
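A rough C sketch of that startup check; the PID-file path and the error handling are illustrative only, and kill(pid, 0) merely probes whether a process with that PID exists without sending a signal:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define PID_FILE "/var/run/myapp.pid"   /* assumed path for this example */

int main(void)
{
    FILE *f = fopen(PID_FILE, "r");
    if (f) {
        long old_pid = 0;
        int alive = fscanf(f, "%ld", &old_pid) == 1 && old_pid > 0 &&
                    (kill((pid_t)old_pid, 0) == 0 || errno == EPERM);
        fclose(f);
        if (alive) {
            /* A fuller check would also confirm it is the same program,
             * e.g. by looking at /proc/<pid>/stat or the ps output. */
            fprintf(stderr, "already running as pid %ld\n", old_pid);
            return 1;
        }
        /* stale PID file left by a crashed instance: fall through and reuse it */
    }

    f = fopen(PID_FILE, "w");
    if (!f) {
        perror("fopen");
        return 1;
    }
    fprintf(f, "%d\n", (int)getpid());
    fclose(f);

    /* ... do the real work ... */

    unlink(PID_FILE);   /* removed on normal exit; a crash leaves it behind */
    return 0;
}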

Is it good practice to use mkdir as file-based locking on linux?

I wanted to quickly implement some sort of locking in a Perl program on Linux, which would be shareable between different processes.
So I used mkdir as an atomic operation, which succeeds if the directory doesn't exist yet and fails if it does. I remove the directory right after the critical section.
Now, it was pointed out to me that this is not good practice in general (independently of the language). I think it's quite OK, but I would like to ask your opinion.
edit:
to show an example, my code looked something like this:
while (!mkdir "lock_dir") { sleep 1; }   # wait some time, then retry
# critical section
rmdir "lock_dir";
IMHO this is a very bad practice. What if the perl script which created the lock directory somehow got killed during the critical section? Another perl script waiting for the lock dir to be removed will wait forever, because it won't get removed by the script which originally created it.
To use safe locking, use flock() on a lock file (see perldoc -f flock).
This is fine until an unexpected failure (e.g. program crash, power failure) happens while the directory exists.
After this, the program will never run because the lock is locked forever (assuming the directory is on a persistent filesystem).
Normally I'd use flock with LOCK_EX instead.
Open a file for reading and writing, creating it if it doesn't exist. Then take the exclusive lock; if that fails (when you use LOCK_NB), some other process has it locked.
After you've got the lock, you need to keep the file open.
The advantage of this approach is that if the process dies unexpectedly (for example, it crashes, is killed, or the machine fails), the lock is automatically released.
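A minimal sketch of that approach in C (Perl's flock maps onto the same primitive; the lock-file path is just an example):

#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
    /* Open (or create) the lock file for reading and writing. */
    int fd = open("/tmp/myapp.lock", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* LOCK_NB makes the call fail immediately if another process holds the lock. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        fprintf(stderr, "another instance holds the lock\n");
        return 1;
    }

    /* critical section: keep fd open for as long as the lock is needed */

    flock(fd, LOCK_UN);   /* also released automatically if the process dies */
    close(fd);
    return 0;
}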

Thread-safety and concurrent modification of a table in SQLite3

Does thread-safety of SQLite3 mean different threads can modify the same table of a database concurrently?
No - SQLite does not support concurrent write access to the same database file. SQLite will simply block one of the transactions until the other one has finished.
Note that if you're using Python, to access a sqlite3 connection from different threads you need to disable the check_same_thread argument, e.g.:
sqlite3.connect(":memory:", check_same_thread=False)
As of 24 May 2010, the docs omit this option; the omission is listed as a bug here.
Not necessarily. If sqlite3 is compiled with the thread-safe macro (check via the int sqlite3_threadsafe(void) function), then you can try to access the same DB from multiple threads without the risk of corruption. Depending on the lock(s) required, however, you may or may not be able to actually modify data (I don't believe sqlite3 supports row locking, which means that to write, you'll need to get a table lock). However, you can try; if one thread blocks, it will automatically write as soon as the other thread finishes with the DB.
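For example, in C (a minimal sketch; "example.db" is a placeholder, and SQLITE_OPEN_FULLMUTEX requests serialized mode for this one connection):

#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db = NULL;
    int rc;

    if (sqlite3_threadsafe() == 0) {
        fprintf(stderr, "this SQLite build was compiled without thread safety\n");
        return 1;
    }

    /* Open the database so that this connection may be shared between
     * threads, with SQLite serializing access internally. */
    rc = sqlite3_open_v2("example.db", &db,
                         SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                         SQLITE_OPEN_FULLMUTEX, NULL);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        return 1;
    }

    /* ... run statements; writers still take turns on the database file ... */

    sqlite3_close(db);
    return 0;
}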
You can use SQLite in 3 different modes:
http://www.sqlite.org/threadsafe.html
If you choose multi-thread mode or serialized mode, you can easily use SQLite in a multi-threaded application.
In those situations you can read from all your threads simultaneously anyway. If you need to write simultaneously, the opened table will be locked automatically for the thread currently writing and unlocked after that (the next thread waits on a mutex for its turn until the table is unlocked). In all those cases, you need to create a separate connection for every thread (.NET Data.Sqlite.dll). If you're using another implementation (e.g. an Android wrapper), things are sometimes different.
