I have searched Stack Overflow and the rest of the internet for information about locks, but I only seem to get a general understanding that while one thread holds a lock, no other thread can acquire it.
I have multiple shared objects which are read and written constantly throughout the script, and I'm still not 100% sure how locking really works. When do you need to use it, when do you not, and is it worth creating an individual lock for each shared variable/object?
When a thread acquires a lock, does that mean other threads only pause at the particular part of the script where the lock was acquired, or does it somehow stop every thread from reading/writing any of the variables used between the acquire and release calls, anywhere in the script?
If I have a separate lock for each shared variable/object and one lock is acquired, does this affect the other locks too?
To summarise, I'm struggling to understand locking in depth; I have only been able to find general overviews in previous explanations online.
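To make the last two questions concrete, here is a minimal sketch using POSIX threads in C. That choice is an assumption, since the question does not name a language; acquire/release correspond to pthread_mutex_lock/pthread_mutex_unlock. The names counter_a, counter_b, lock_a, lock_b, run_counters and locks_are_independent are made up for illustration. It is meant to show that a lock only holds up threads that try to acquire that same lock, and that holding one lock has no effect on a different lock.

```c
#include <assert.h>
#include <pthread.h>

#define THREADS 4
#define ROUNDS  100000

/* Hypothetical shared state: each variable gets its own lock. */
static long counter_a = 0, counter_b = 0;
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        /* Only threads that also ask for lock_a wait here. */
        pthread_mutex_lock(&lock_a);
        counter_a++;
        pthread_mutex_unlock(&lock_a);

        /* lock_b is a separate lock with its own waiters. */
        pthread_mutex_lock(&lock_b);
        counter_b++;
        pthread_mutex_unlock(&lock_b);
    }
    return NULL;
}

/* Returns 1 if no increment was lost on either counter. */
int run_counters(void) {
    pthread_t t[THREADS];
    counter_a = counter_b = 0;
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return counter_a == (long)THREADS * ROUNDS &&
           counter_b == (long)THREADS * ROUNDS;
}

/* Returns 1 if lock_b can be taken while lock_a is held. */
int locks_are_independent(void) {
    pthread_mutex_lock(&lock_a);
    int rc = pthread_mutex_trylock(&lock_b);  /* 0 means "acquired" */
    if (rc == 0)
        pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return rc == 0;
}
```

A lock does not know which variables it protects. The protection comes only from every piece of code that touches counter_a agreeing to take lock_a first; code that touches counter_a without taking the lock is not stopped by anything.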
Will reading from the same file make threads run slower? If so, how do YouTube's or Netflix's servers handle so many people watching the same movie, each at a different place in it?
Or, if reading from the same file does make threads slow and space is not a concern, is it better to keep multiple copies of the file, or to split the file into parts?
Will reading from the same file make threads run slower?
No. Modern operating systems handle this situation extremely efficiently.
In my last job interview, I was asked to write a program that sorts data in a huge file. I implemented it in C++ using the WinAPI. Everything was fine until the interviewer noticed that I wrote to the file concurrently through a single file handle. He told me that I had to synchronize the writes with a mutex. I tried to argue that every thread wrote data to its own area of the file, explicitly specifying the offset from the beginning of the file, so synchronization was unnecessary.
Questions:
Is it safe to write (WriteFile) to a file concurrently through a single handle, assuming that every thread has its own part of the file?
Where can I find any information about it?
I'm working on a multithreaded application where multiple threads may want exclusive access to the same file. I'm looking for a way of serializing these operations. I was planning to use flock, lockf, or fcntl locking. However, it appears that with these methods an attempt by a second thread to lock a file that a first thread has already locked will be granted, because the two threads are in the same process. This is according to the manpages for flock and fcntl (and I guess that on Linux lockf is implemented with fcntl). It is also supported by this other question. So, are there other ways of locking a file on Linux that work at thread level instead of process level?
Some alternatives that I came up with which I do not like are:
1) Use a lockfile (xxx.lock) opened with O_CREAT | O_EXCL flags. This call will succeed only in one thread if there is contention. The problem with this is that then other threads have to spin on the call until they achieve the lock, meaning that I have to _yield() or sleep() which makes me think this is not a great option.
2) Keep a mutex'ed list of all open files. When a thread wants to open/close a file it has to lock the list first. When opening a file, it searches the list to see if it's open. This sounds particularly inefficient because it requires a significant amount of work even if the file is not owned yet.
Are there other ways of doing this?
Edit:
I just discovered this text in my system's manpages which isn't in the online man pages:
If a process uses open(2) (or similar) to obtain more than one descriptor for the same file, these descriptors are treated independently by flock(). An attempt to lock the file using one of these file descriptors may be denied by a lock that the calling process has already placed via another descriptor.
I'm not happy about the words "may be denied"; I'd prefer "will be denied", but I guess it's time to test that.
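One way to run that test is sketched below, assuming Linux and a local (non-NFS) file system. The function name second_flock_is_denied and the scratch path under /tmp are made up for illustration. The process opens the same file twice, takes an exclusive flock through the first descriptor, and then tries a non-blocking exclusive flock through the second.

```c
#define _DEFAULT_SOURCE   /* for flock and mkstemp under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns 1 if the second descriptor is refused the lock,
 * 0 if it is granted, and -1 if the setup itself failed. */
int second_flock_is_denied(void) {
    char path[] = "/tmp/flock_demo_XXXXXX";  /* hypothetical scratch file */
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    /* A second open() gives an independent descriptor for the same file. */
    int fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        unlink(path);
        return -1;
    }
    assert(fd1 != fd2);

    int result = -1;
    if (flock(fd1, LOCK_EX) == 0) {
        /* LOCK_NB makes the call fail at once instead of blocking. */
        if (flock(fd2, LOCK_EX | LOCK_NB) == 0)
            result = 0;
        else if (errno == EWOULDBLOCK)
            result = 1;
        flock(fd1, LOCK_UN);
    }

    close(fd2);
    close(fd1);
    unlink(path);
    return result;
}
```

If this returns 1, then giving each thread its own descriptor from a separate open() call would let flock serialize threads within one process. Descriptors obtained with dup() share the same lock, so they would not help here.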
If one of my processes opens a file, let's say for reading only, does the OS guarantee that no other process will write to it while I'm reading? Otherwise the reading process might be left with the first part of the old file version and the second part of the newer one, making data integrity questionable.
I am not talking about pipes, which cannot seek, but about regular files, which can (at least when opened by only one process).
No, other processes can change the file contents as you are reading it. Try running "man fcntl" and ignore the section on "advisory" locks; those are "optional" locks that processes only have to pay attention to if they want. Instead, look for the (alas, non-POSIX) "mandatory" locks. Those are the ones that will protect you from other programs. Try a read lock.
No, if you open a file, other processes can write to it, unless you use a lock.
On Linux, you can place an advisory lock on a file, which only holds off other processes that also call flock on it, with:
#include <sys/file.h>
...
flock(file_descriptor,LOCK_EX); // apply an advisory exclusive lock
Any process that can open the file for writing may write to it. Those writes can happen concurrently with your own writes, potentially leaving the file in an indeterminate state.
It is your responsibility as an application writer to ensure that Bad Things don't happen. In my opinion mandatory locking is not a good idea.
A better idea is not to grant write access to processes which you don't want to write to the file.
If several processes open a file, they will have independent file pointers, so they can seek() and not affect one another.
If a file is opened by a threaded program (or, more generally, by a task which shares its file descriptors with another), the file pointer is also shared, so you need another way of accessing the file to avoid race conditions causing chaos. Normally that means pread and pwrite, or their scatter/gather counterparts preadv and pwritev; plain readv and writev still use the shared file pointer.
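A minimal sketch of the pread/pwrite approach follows, assuming a POSIX system with pthreads. The names shared_fd, fill_region and regions_written_correctly, the chunk size and the scratch path are all made up for illustration. Several threads share one descriptor, each writes only its own region at an explicit offset, and the regions are then read back and checked.

```c
#define _DEFAULT_SOURCE   /* for pread, pwrite and mkstemp under strict -std= modes */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define THREADS 4
#define CHUNK   4096

static int shared_fd = -1;   /* one descriptor shared by all threads */
static int ids[THREADS];
static int ok[THREADS];

static void *fill_region(void *arg) {
    int idx = *(int *)arg;
    char buf[CHUNK];
    memset(buf, 'A' + idx, sizeof buf);
    /* pwrite takes the offset as an argument and leaves the shared
     * file pointer alone, so no seek/write race is possible. */
    ssize_t n = pwrite(shared_fd, buf, sizeof buf, (off_t)idx * CHUNK);
    ok[idx] = (n == (ssize_t)sizeof buf);
    return NULL;
}

/* Returns 1 if every region holds exactly what its thread wrote,
 * 0 if not, and -1 if the scratch file could not be created. */
int regions_written_correctly(void) {
    char path[] = "/tmp/pwrite_demo_XXXXXX";  /* hypothetical scratch file */
    shared_fd = mkstemp(path);
    if (shared_fd < 0)
        return -1;
    unlink(path);  /* the file stays usable until the descriptor is closed */

    pthread_t t[THREADS];
    for (int i = 0; i < THREADS; i++) {
        ids[i] = i;
        int rc = pthread_create(&t[i], NULL, fill_region, &ids[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);

    int good = 1;
    for (int i = 0; i < THREADS; i++) {
        char buf[CHUNK];
        ssize_t n = pread(shared_fd, buf, sizeof buf, (off_t)i * CHUNK);
        if (!ok[i] || n != (ssize_t)sizeof buf) {
            good = 0;
            continue;
        }
        for (size_t j = 0; j < sizeof buf; j++)
            if (buf[j] != 'A' + i)
                good = 0;
    }
    close(shared_fd);
    return good;
}
```

No mutex is used because the regions do not overlap and no thread relies on the shared file pointer. If two threads could write to overlapping byte ranges, they would still need to coordinate.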