I would like to reserve a large number of object IDs prior to a complex COPY operation. I know setval and nextval have atomic guarantees, but do these guarantees hold in a multithreaded environment if I'm using them in a compound statement such as the following? I'm using PostgreSQL 9.6.
SELECT setval('objects_id_seq', nextval('objects_id_seq') + 9999); -- returns the last reserved id
I know setval and nextval have atomic guarantees
Yes, but the guarantees may not be what you think they are. Remember that sequences are exempt from normal transaction boundaries: your setval will take effect immediately and be visible to other concurrent transactions. It's atomic - it either happens in its entirety or not at all - but that doesn't mean it obeys all the ACID properties.
Your query will definitely affect concurrent queries. Specifically, the value of the sequence will keep increasing between your nextval and your subsequent setval if other sessions call nextval concurrently. So if your intent is "reserve 9999 IDs", it won't work: if 10 other sessions call nextval in the small window between your nextval and setval calls, you only reserve 9989 - and worse, the 10 values those sessions consumed fall inside the range you think you reserved, so your COPY can collide with them.
You can't LOCK a sequence from SQL either, so you can't just take an EXCLUSIVE lock on it.
You can either:
Assign way more extra values than you need and hope you're fast enough;
Lock the table that uses the sequence in EXCLUSIVE mode and hope nobody concurrently calls nextval(...) on the sequence directly;
Let COPY assign generated keys from a DEFAULT nextval(...) as usual; or
Assign generated keys a different way, such as from a counter table, where you can apply stronger locking (sketched below).
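For that last option, here is a minimal sketch of block reservation with a counter table; the table and column names are hypothetical:

CREATE TABLE objects_id_counter (last_id bigint NOT NULL);
INSERT INTO objects_id_counter VALUES (0);

-- Reserve a block of 10000 ids atomically; the row lock taken by UPDATE
-- blocks concurrent reservers until this transaction commits or rolls back.
UPDATE objects_id_counter
SET last_id = last_id + 10000
RETURNING last_id - 9999 AS first_reserved, last_id AS last_reserved;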
What I think you really want in this case is a nextval variant that increments by your supplied value, e.g. 9999, not by 1. PostgreSQL doesn't have such a function yet, but it'd be really handy to have one. Patches welcome!
You might be thinking: "What if I ALTER SEQUENCE ... INCREMENT 9999, call nextval, then ALTER SEQUENCE ... INCREMENT 1?" Yeah, don't do that. Unfortunately, ALTER SEQUENCE's effects also become visible outside transaction boundaries, due to the same properties that make nextval work. So concurrent queries calling nextval might get a 9999-value jump too. You might not care, but it's worth knowing about. I wouldn't recommend relying on this behaviour.
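For concreteness, the pattern being warned against would look like this (using the sequence name from the question):

ALTER SEQUENCE objects_id_seq INCREMENT BY 9999;
SELECT nextval('objects_id_seq'); -- a concurrent nextval in another session may also jump by 9999 here
ALTER SEQUENCE objects_id_seq INCREMENT BY 1;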
MDB_NOLOCK as described at mdb_env_open() apidoc:
MDB_NOLOCK Don't do any locking. If concurrent access is anticipated, the caller must manage all concurrency itself. For proper operation the caller must enforce single-writer semantics, and must ensure that no readers are using old transactions while a writer is active. The simplest approach is to use an exclusive lock so that no readers may be active at all when a writer begins.
1. What if an RW txnA intends to modify a set of keys that has no key in common with the set of keys another RW txnB intends to modify? Couldn't they be sent concurrently?
2. Isn't the single-writer semantic wasteful for such situations? One txn waits for the previous one to finish, even though they intend to operate on entirely separate regions in an LMDB env.
3. In an environment opened with MDB_NOLOCK, what if the client app determines, in its own domain logic, that two write transactions intend to read/write mutually exclusive sets of keys anywhere in the LMDB environment, and sends such transactions concurrently anyway? What could go wrong?
4. Could such concurrent writes scale linearly with cores, like RO txns do, given the app is able to manage these concurrent writes in the manner described in 3?
1. No, since modifying key/value pairs also requires modifying the b-tree structure, and the two transactions would conflict with each other.
2. You should avoid doing long-running computations in the middle of a write transaction. Try to do as much as possible beforehand. If you can't, then LMDB might not be a great fit for your application. Usually you can, though.
3. Very bad stuff. Application crashes and DB corruption.
4. Writes are generally IO bound and will not scale with many cores anyway. There are some very hacky things you can do with LMDB's writemap and/or pwrite(2), but you are very much on your own there.
I'm going to assume that writing to the value part of a pre-existing key does not modify the b-tree, because you are not modifying the keys. So what Doug Hoyte said stands, except possibly point 3:
The key phrase here is "are intending to RW to mutually exclusive sets of keys". So, assuming the keys are pre-allocated and already in the DB, changing the values should not matter. I don't even know whether LMDB can store variable-sized values, in which case it could matter if the values are different sizes.
So, it should be possible to write with MDB_NOLOCK concurrently as long as you can guarantee to never modify, add, or delete any keys during the concurrent writes.
Empirically I can state that working with LMDB opened with MDB_NOLOCK (or lock=False in Python) and simply modifying values of pre-existing keys, or even only adding new key/values, seems to work well. Even when the LMDB itself is mounted across an NFS-like medium and queried from different machines.
@Doug Hoyte - I would appreciate more context as to what specific circumstances might lead to a crash or corruption. In my case there are many small, short-lived writes to the same DB.
Suppose we have resources A, B, C, and their dependencies are not cyclic:
B->A
C->A
This means B strongly depends on A and C strongly depends on A. For example, B and C are resources precomputed from A. So if A is updated, B and C should be updated too. But if B is updated, nothing changes except B.
And now the problem: given that each node of the graph can be accessed for Read, Write, or Read/Upgrade-to-Write in a multi-threaded manner, how is one supposed to manage locks in such a graph? Is there a generalization of this problem?
Update
Sorry for the unclear question. There is also one very important point:
If, for example, A changes and forces B and C to be updated, then the moment B and its dependencies have been updated, B's write lock is released.
Your question is a blend of transactions, locking, concurrency, and conflict resolution, so models used in relational databases might serve your purpose.
There are many methods defined for concurrency control.
In your case, some might apply depending on how optimistic or pessimistic your algorithm needs to be, how many reads and writes there are, and how much data is involved per transaction.
I can think of two methods that can help in your case:
1. Strict Two-Phase Locking (S2PL)
A transaction begins, locks on A, B, and C are obtained, and they are kept until the end of the transaction. Because multiple locks are held until the end of the transaction, a deadlock condition might be encountered while acquiring them. The set of locks held can still change during the transaction.
This approach is serializable, meaning that all events come in order and no other party can make changes while the transaction holds its locks.
This approach is pessimistic, and locks might be held for a good amount of time, so resources and time will be spent waiting.
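In SQL terms, a minimal sketch of this approach might look like the following (the resources table and its columns are illustrative):

BEGIN;
-- Acquire all locks up front, in a fixed order, to reduce deadlock risk.
SELECT * FROM resources WHERE id IN ('A', 'B', 'C') ORDER BY id FOR UPDATE;
-- ... do the work ...
UPDATE resources SET payload = '...' WHERE id = 'B';
COMMIT; -- all locks are released only here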
2. Multiversion
Instead of placing locks on A, B, and C, maintain version numbers and create a snapshot of each. All changes are made to the snapshots. At the end, the snapshots replace the previous versions. If the version of A, B, or C has changed in the meantime, an error condition occurs and the changes are discarded.
This approach places no read or write locks, so it is fast. But in case of conflict - when a version has changed in the interim - the work is discarded.
This is optimistic: it trades potentially wasted work for speed.
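A minimal sketch of the optimistic variant, assuming a version column on the same illustrative resources table:

-- Read: remember the version that came with the snapshot.
SELECT payload, version FROM resources WHERE id = 'B'; -- say it returns version 7

-- Write: succeeds only if nobody changed B in the meantime.
UPDATE resources
SET payload = '...', version = version + 1
WHERE id = 'B' AND version = 7;
-- Zero rows updated means a conflict: discard the snapshot and retry.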
Transaction log
In database systems there is also the concept of a "transaction log": every transaction, whether completed or pending, is recorded in it. Every operation done in any of the above methods is first written to the transaction log, and operations from the log are materialized in the main store at the right moment. In case of failure the log is analyzed: completed transactions are materialized to the main store, and pending ones are simply discarded.
This is also used in "log shipping", in order to ship the log to other servers for the purpose of replication.
Known Implementations
There are multiple in-memory databases that might save you the hassle of implementing your own solution.
H2 also provides a serializable isolation level that can match your use case.
go-memdb provides multiversion concurrency. It is built on an immutable radix tree, so it is also worth studying if you want to build your own solution.
Many more are defined here.
I am not aware of a specific pattern here, so my solution would go like this:
First of all, I would reverse the edges in your graph. You don't care that A is a dependency of B; rather, the other direction tells you what needs to be locked:
A->B
A->C
Because now you can say: if I want to do X on A, I need the X lock on A and on every object depending on A.
And now you can go: inspect A, then the objects depending on A, and so forth, to determine the set of objects you need an X lock on.
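If the reversed edges are kept in a table, the lock set can be computed and locked in one statement; a sketch, assuming hypothetical edges(src, dst) and resources tables:

WITH RECURSIVE affected AS (
    SELECT 'A'::text AS node
    UNION
    SELECT e.dst FROM edges e JOIN affected a ON e.src = a.node
)
SELECT r.*
FROM resources r
WHERE r.id IN (SELECT node FROM affected)
ORDER BY r.id -- a consistent lock order helps avoid deadlocks
FOR UPDATE;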
Regarding your comment "Because X in this case is either Read or UpgradedWrite, and if A need Write it doesn't clearly mean that B needs it to": for me, that translates to "the whole graph idea doesn't help". You see, such a graph is only useful to express direct relations, such as "if A, then B". If there is an edge between A and B, that means you would want to treat them the same way. If you are now saying that your objects might or might not need to be write-locked together, what would be the point of the graph? You would end up with a lot of actually independent objects, where a write to A sometimes needs a write lock on something else, and sometimes not.
Here is a nice article which describes what ES is and how to deal with it.
Everything is fine there, but one image in it is bothering me.
I understand that in distributed event-based systems we are able to achieve eventual consistency only. Anyway ... How do we ensure that we don't book more seats than available? This is especially a problem if there are many concurrent requests.
It may happen that n aggregate instances are populated with the same number of reserved seats, and all of these instances allow further reservations.
All events are private to the command running them until the book of record acknowledges a successful write. So we don't share the events at all, and we don't report back to the caller, without knowing that our version of "what happened next" was accepted by the book of record.
The write of events is analogous to a compare-and-swap of the tail pointer in the aggregate history. If another command has changed the tail pointer while we were running, our swap fails, and we have to mitigate/retry/fail.
In practice, this is usually implemented by having the write command to the book of record include an expected position for the write. (Example: ES-ExpectedVersion in GES).
The book of record is expected to reject the write if the expected position is in the wrong place. Think of the position as a unique key in a table in a RDBMS, and you have the right idea.
This means, effectively, that the writes to the event stream are actually consistent -- the book of record only permits the write if the position you write to is correct, which means that the position hasn't changed since the copy of the history you loaded was written.
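Taking that analogy literally, the append can be sketched as a conditional insert against a hypothetical events table, where the primary key enforces the compare-and-swap:

CREATE TABLE events (
    stream_id text NOT NULL,
    position bigint NOT NULL,
    payload jsonb NOT NULL,
    PRIMARY KEY (stream_id, position)
);

-- The command loaded the stream at position 41, so it expects to append at 42.
-- If a concurrent command appended first, this fails with a unique violation,
-- and the command must reload the stream and mitigate/retry/fail.
INSERT INTO events (stream_id, position, payload)
VALUES ('order-123', 42, '{"type": "SeatsReserved", "count": 2}');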
It's typical for commands to read event streams directly from the book of record, rather than the eventually consistent read models.
It may happen that n-AggregateRoots will be populated with the same amount of reserved seats, it means having validation in the reserve method won't help, though. Then n-AggregateRoots will emit the event of successful reservation.
Every bit of state needs to be supervised by a single aggregate root. You can have n different copies of that root running, all competing to write to the same history, but the compare and swap operation will only permit one winner, which ensures that "the" aggregate has a single internally consistent history.
There are going to be a couple of ways to deal with such a scenario.
First off, an event stream has, as its current version, the version of the last event added. This means that you would not, and should not, be able to persist the event stream if it is no longer at the version it had when it was loaded. Since the very first write increases the version of the event stream, the second write would not be permitted. And since events are not emitted, per se, but rather are a result of the event sourcing, we would not have the type of race condition in your example.
Well, if your commands are processed behind a queue, any failures should be retried. Should it not be possible to process the request, you would enter the normal "I'm sorry, Dave. I'm afraid I can't do that" scenario by letting the user know that they should try something else.
Another option is to start the processing by issuing an update against some table row, to serialize all calls to the aggregate. Probably not the most elegant solution, but it does cause a system-wide block on the processing.
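A sketch of that serialization trick (the aggregate_lock table and its columns are hypothetical):

BEGIN;
-- The row lock taken here blocks every other command for this aggregate.
UPDATE aggregate_lock SET bumped = bumped + 1 WHERE aggregate_id = 'order-123';
-- ... load the events, run the command, append new events ...
COMMIT; -- the lock is released and the next command may proceed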
I guess, to a large extent, one cannot really trust the read store when it comes to transactional processing.
Hope that helps :)
I am using shared variables in Perl with use threads::shared.
These variables can be modified from only a single thread; all other threads only 'read' them.
Is it required that the 'reading' threads lock, as in
{
    lock $shared_var;
    if ($shared_var > 0) .... ;
}
?
Isn't it safe to do a simple check without locking (in the 'reading' thread!), like
if ($shared_var > 0) ....
?
Locking is not required to maintain internal integrity when setting or fetching a scalar.
Whether it's needed or not in your particular case depends on the needs of the reader, the other readers and the writers. It rarely makes sense not to lock, but you haven't provided enough details for us to determine what your needs are.
For example, it might not be acceptable to use an old value after the writer has updated the shared variable. For starters, this can lead to a situation where one thread is still using the old value while another thread is using the new one, which can be undesirable if those two threads interact.
It depends on whether it's meaningful to test the condition at just some point in time or another. The problem, however, is that in the vast majority of cases that Boolean test stands in for other state, which might already have changed by the time you finish reading it - the condition only tells you about a previous state.
Think about it. If it's an insignificant test, then it means little--and you have to question why you are making it. If it's a significant test, then it is telltale of a coherent state that may or may not exist anymore--you won't know for sure, unless you lock it.
A lot of times, say in real-time reporting, you don't really care which snapshot the database hands you, you just want a relatively current one. But, as part of its transaction logic, it keeps a complete picture of how things are prior to a commit. I don't think you're likely to find this in code, where the current state is the current state--and even a state of being in a provisional state is a definite state.
I guess one of the times this can be different is a cyclical access of a queue. If one consumer doesn't get the head record this time around, then one of them will the next time around. You can probably save some processing time, asynchronously accessing the queue counter. But here's a case where it means little in context of just one iteration.
In the case above, you would just want to follow up with some locked instructions that expect the queue might actually be empty, even though your test suggested it had data. So, if it is just a preliminary test, you have to have logic that treats the test as being as unreliable as it actually is.
I mean, like, thousands of users updating values in the database at the same time?
Yes, nextval is safe to use from multiple concurrently operating transactions. That is its purpose and its reason for existing.
That said, it is not actually "thread safe" as such, because PostgreSQL uses a multi-processing model not a multi-threading model, and because most client drivers (libpq, for example) do not permit more than one thread at a time to interact with a single connection.
You should also be aware that while nextval is guaranteed to return distinct and increasing values, it is not guaranteed to do so without "holes" or "gaps". Such gaps are created when a generated value is discarded without being committed (say, by a ROLLBACK) and when PostgreSQL recovers after a server crash.
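A quick way to see such a gap appear (demo_seq is just a throwaway name):

CREATE SEQUENCE demo_seq;

BEGIN;
SELECT nextval('demo_seq'); -- returns 1
ROLLBACK; -- the transaction is undone ...

SELECT nextval('demo_seq'); -- ... but the sequence is not: returns 2, leaving 1 as a gap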
While nextval will always return increasing numbers, this does not mean that your transactions will commit in the order they got IDs from a given sequence in. It's thus perfectly normal to have something like this happen:
Start IDs in table: [1 2 3 4]
1st tx gets ID 5 from nextval()
2nd tx gets ID 6 from nextval()
2nd tx commits: [1 2 3 4 6]
1st tx commits: [1 2 3 4 5 6]
In other words, holes can appear and disappear.
Both these anomalies are necessary and unavoidable consequences of making one nextval call not block another.
If you want a sequence without such ordering and gap anomalies, you need to use a gapless sequence design that permits only one transaction at a time to have an uncommitted generated ID, effectively eliminating all concurrency for inserts in that table. This is usually implemented using SELECT FOR UPDATE or UPDATE ... RETURNING on a counter table.
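A common shape for such a gapless design, with illustrative names:

CREATE TABLE gapless_counter (name text PRIMARY KEY, last_id bigint NOT NULL);
INSERT INTO gapless_counter VALUES ('objects', 0);

-- Inside the inserting transaction: the row lock serializes all callers,
-- and a ROLLBACK undoes the increment, so no gap can appear.
UPDATE gapless_counter
SET last_id = last_id + 1
WHERE name = 'objects'
RETURNING last_id;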
Search for "PostgreSQL gapless sequence" for more information.
Yes, it is thread-safe.
From the manual:
nextval - Advance the sequence object to its next value and return that value. This is done atomically: even if multiple sessions execute nextval concurrently, each will safely receive a distinct sequence value.
(Emphasis mine)
Yes: http://www.postgresql.org/docs/current/static/functions-sequence.html
It wouldn't be useful otherwise.
Edit:
Here is how you use nextval and currval:
nextval returns a new sequence number, you use this for the id in an insert on the first table
currval returns the last sequence number obtained by this session, you use that in foreign keys to reference the first table
Each call to nextval returns another value; don't call it twice within the same set of inserts.
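Putting those three points together, with hypothetical users and orders tables where users.id is backed by users_id_seq:

INSERT INTO users (id, name)
VALUES (nextval('users_id_seq'), 'alice');

-- currval returns the value nextval just produced in this session,
-- so this is safe even with concurrent sessions inserting at the same time.
INSERT INTO orders (user_id, item)
VALUES (currval('users_id_seq'), 'widget');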
And of course, you should use transactions in any multiuser code.
This poster asked a different question here on the same flawed code. Point is: he does not seem to know how foreign keys work, and has them reversed (a sequence functioning as a foreign key is kind of awkward IMHO).
BTW: this should be a comment, not an answer, but I can't comment yet.