I need optimistic locking for my records. In our system we also map between layers: records to POJOs and back. store() works fine for new records, but I want store() to behave like an UPDATE statement on an existing record; when I convert the POJO back to a record, jOOQ treats it as a new record and I get a duplicate id exception.
I know I can save records with context.update(), but I need optimistic locking, and as far as I know it only works with UpdatableRecords (the store() method).
In the documentation I found:
"When loading records from POJOs, jOOQ will assume the record is a new record. It will hence attempt to INSERT it."
This means I can't update existing records from a POJO with store() and should use update() from the context instead, but that doesn't work with optimistic locking.
So is there any way to have both at the same time: optimistic locking and working with POJOs?
UPDATE:
Thanks, Lucas. I've started using record.update(), but I ran into another problem:
UserRecord userRecord = context.fetchOne(...);
UserPojo userPojo = mapper.toPojo(userRecord);
userPojo.setName("new_name");
UserRecord userRecordForSave = context.newRecord(USER, userPojo);
userRecordForSave.update(); // throws DataChangedException
I have optimistic locking turned on, and jOOQ's internal checkIfChanged() method validates the originals[] fields, but after mapping the POJO to a record I don't have those originals any more, hence the DataChangedException: "Database record has been changed". What is the correct way to map the POJO in this case?
Updating explicitly
UpdatableRecord.store() decides whether to insert() or update(), depending on whether you created the record, or whether it was fetched from the database via jOOQ. See the Javadoc:
If this record was created by client code, an INSERT statement is executed
If this record was loaded by jOOQ and the primary key value was changed, an INSERT statement is executed (unless Settings.isUpdatablePrimaryKeys() is set). jOOQ expects that primary key values will never change due to the principle of normalisation in RDBMS. So if client code changes primary key values, this is interpreted by jOOQ as client code wanting to duplicate this record.
If this record was loaded by jOOQ, and the primary key value was not changed, an UPDATE statement is executed.
But you don't have to use store(), you can use insert() or update() directly.
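For example, a minimal sketch of the difference (assuming a generated USER table with ID and NAME columns, a generated UserRecord, a DSLContext called ctx, and somePojo as a placeholder for your mapped POJO):

// store() decides for you:
UserRecord fetched = ctx.fetchOne(USER, USER.ID.eq(1L)); // loaded by jOOQ
fetched.setName("new_name");
fetched.store();                                         // executes an UPDATE

UserRecord created = ctx.newRecord(USER);                // created by client code
created.setId(2L);
created.setName("someone");
created.store();                                         // executes an INSERT

// Or be explicit and skip the heuristic entirely:
UserRecord record = ctx.newRecord(USER, somePojo);
record.update();                                         // always UPDATE, never INSERT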
Optimistic locking
There are 2 ways to implement optimistic locking in jOOQ:
Based on version or timestamp fields (see Settings.executeWithOptimisticLocking)
Based on changed values (see Settings.executeWithOptimisticLockingExcludeUnversioned)
The former works purely based on the value in the record, but in order to support the latter, you have to make sure jOOQ knows both:
The previous state of your record (Record.original())
The new state of your record (Record itself)
There's no way jOOQ can implement this functionality without knowing both states, so you can't implement your logic without keeping a hold of the previous state somehow.
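Applied to the DataChangedException from your update: keep the record that was fetched by jOOQ and copy the POJO's values back onto it with Record.from(), so that the originals are preserved. A minimal sketch, assuming the same generated USER table and POJO as in the question:

UserRecord userRecord = ctx.fetchOne(USER, USER.ID.eq(1L)); // originals are captured here
UserPojo userPojo = userRecord.into(UserPojo.class);        // or your own mapper
userPojo.setName("new_name");

userRecord.from(userPojo); // copy the POJO's values back onto the fetched record
userRecord.update();       // original() is still known, so the optimistic lock check can run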
But in most cases, the version based optimistic locking approach is the most desirable one.
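For example, something like this should work even when the record is rebuilt from a POJO, because the lock check only needs the version value carried in the record (a sketch, assuming a USER.VERSION column that is configured as the record version field in the code generator, and an existing JDBC connection):

Settings settings = new Settings().withExecuteWithOptimisticLocking(true);
DSLContext ctx = DSL.using(connection, SQLDialect.POSTGRES, settings);

UserRecord record = ctx.newRecord(USER, userPojo); // the version value comes from the POJO
record.update(); // UPDATE ... WHERE ID = ? AND VERSION = ?, and VERSION is incremented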
Related
I am trying to understand how a Cosmos DB update works. In Cosmos DB there is an upsert operation that updates or inserts depending on whether the item already exists in the container. Usually the flow is like this:
record = client.read_item(id, partition_key)
record['one_field'] = 'new_value'
client.upsert(record)
My doubt here is whether such an update operation will delete the original record even if only a single field is changed. If that is the case, then updates become expensive if the record is large. Is my understanding correct?
Cosmos DB updates a document by replacing it, not by in-place update.
If you query (or read) a document, and then update some properties, you would then replace the document. Or, as you've done, call upsert() (which is similar to a replace, except that it will create a new document if the specified partition+id doesn't exist already).
The notion of "expensive" is not exactly easy to quantify; look at the returned headers to see the RU charge for a given upsert/replace, to determine the overall cost, and whether you'll need to scale your RU/sec setting based on overall usage patterns.
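For instance, using the Java SDK (azure-cosmos 4.x) as an illustration -- the endpoint, key, database and container names below are placeholders -- the RU charge of an upsert can be read from the response object rather than the raw headers:

// Classes come from com.azure.cosmos and com.azure.cosmos.models
CosmosClient client = new CosmosClientBuilder()
        .endpoint("https://<account>.documents.azure.com:443/")
        .key("<key>")
        .buildClient();
CosmosContainer container = client.getDatabase("mydb").getContainer("mycontainer");

// Read, change a single property, then upsert (i.e. replace) the whole item
Map item = container.readItem("id1", new PartitionKey("pk1"), Map.class).getItem();
item.put("one_field", "new_value");

CosmosItemResponse<Map> response = container.upsertItem(item);
System.out.println("RU charge: " + response.getRequestCharge());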
So I've been trying to wrap my head around this one for weeks, but I just can't seem to figure it out. MongoDB isn't equipped to deal with rollbacks as we typically understand them (i.e. when a client adds information to the database, like a username, but quits in the middle of the registration process; now the DB is left with some "hanging" information that isn't associated with anything). How can MongoDB handle that? Or if no one can answer that question, maybe they can point me to a source/example that can? Thanks.
MongoDB does not support multi-statement transactions, so you can't perform atomic multi-statement operations to ensure consistency; an operation is only atomic at the level of a single document. When dealing with NoSQL databases you need to validate your data as much as you can, because they seldom complain about anything. There are workarounds and patterns to achieve SQL-like transactions. For example, in your case, you can store the user's information in a temporary collection, check its validity, and move it to the users collection afterwards.
That should be straightforward, but things get more complicated when we deal with multiple documents. In this case, you need to create a designated collection for transactions. For instance:
transaction collection:
{
    id: ...,
    state: "new_transaction",
    value1: <values from document_1 before updating document_1>,
    value2: <values from document_2 before updating document_2>
}
// update document 1
// update document 2
Ooohh!! something went wrong while updating document 1 or 2? No worries, we can still restore the old values from the transaction collection.
This pattern is known as compensation to mimic the transactional behavior of SQL.
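As an illustration of the compensation pattern with the MongoDB Java driver (a sketch only -- the collection names, field names and userId value are made up for the example):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

MongoClient mongo = MongoClients.create("mongodb://localhost:27017");
MongoDatabase db = mongo.getDatabase("app");
MongoCollection<Document> users = db.getCollection("users");
MongoCollection<Document> txns = db.getCollection("transactions");

String userId = "u1"; // hypothetical id of the document being updated

// 1. Record the pre-update state in the transaction collection
Document before = users.find(Filters.eq("_id", userId)).first();
Document txn = new Document("state", "new_transaction")
        .append("userId", userId)
        .append("oldName", before.getString("name"));
txns.insertOne(txn); // the driver fills in txn's generated _id

try {
    // 2. Perform the actual updates
    users.updateOne(Filters.eq("_id", userId), Updates.set("name", "new_name"));
    // ... update document 2 in the same way ...
    txns.updateOne(Filters.eq("_id", txn.get("_id")), Updates.set("state", "committed"));
} catch (RuntimeException e) {
    // 3. Compensation: restore the values captured before the update
    users.updateOne(Filters.eq("_id", userId), Updates.set("name", txn.getString("oldName")));
    txns.updateOne(Filters.eq("_id", txn.get("_id")), Updates.set("state", "rolled_back"));
}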
In my CouchDB database I'd like all documents to have an 'updated_at' timestamp added when they're changed (and have this enforced).
I can't modify the document with validation functions
Update functions won't run unless they're called specifically (so it'd be possible to update the document and not call the specific update function)
How should I go about implementing this?
There is no way to do this now without triggering _update handlers. It's a nice idea to track document modification times, but it runs into problems with replication.
Replication works on top of the public API, and this means that:
If you enforce such a trigger, replication breaks, since it becomes impossible to sync data as-is without modifying documents. Each modified document receives a new revision, which can easily lead to an endless loop if you replicate data from database A to B and from B to A in continuous mode.
In the other case, if replication is fixed to handle this, there will always be a way to work around your trigger.
I can suggest one workaround: create a view which emits the current date as a key (or as part of it):
function (doc) {
    emit(new Date(), null);
}
This will assign the current date to every document as soon as the view generation gets triggered (which happens on the first request to it), and it will assign a new date each time a specific document is updated.
Although the above should solve your issue, I would advise against using it for the reasons already explained by Kxepal: if you're on a replicated network, each node will assign its own dates. Taking this into account, the best I can recommend is to solve the issue on the client side and just post the documents with the date already embedded.
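For instance, a minimal client-side sketch over plain HTTP with Java 11's java.net.http (the database name, document id and _rev are made up; CouchDB requires the current _rev when replacing an existing document):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Instant;

HttpClient http = HttpClient.newHttpClient();

// Embed updated_at on the client before writing the document
String json = "{\"_id\":\"doc1\",\"_rev\":\"2-abc\",\"name\":\"example\","
        + "\"updated_at\":\"" + Instant.now() + "\"}";

HttpRequest put = HttpRequest.newBuilder(URI.create("http://localhost:5984/mydb/doc1"))
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(json))
        .build();

HttpResponse<String> resp = http.send(put, HttpResponse.BodyHandlers.ofString());
System.out.println(resp.statusCode() + " " + resp.body());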
In Azure Table Storage, is there a way to get the new Timestamp value after an update or insert? I am writing a 3-phase commit protocol to get Table Storage to support distributed transactions, and it involves multiple writes to the same entity. The operation order goes like this: read entity, write entity (lock item), write entity (commit new values). I would like to get the new timestamp after the lock-item operation so I don't have to unnecessarily read the item again before committing the new values. So does anyone know how to efficiently get the new timestamp value after a SaveChanges operation?
I don't think you need to do anything special or extra. When you read your entity you will get an ETag for it. When you save that entity (setting someLock=true), that save will only succeed if nobody else has updated the entity since your read. Hence you know you have the lock, and you can then do your second write as you please.
I don't believe it is possible. I would use your own timestamp and/or guid to mark entries.
If you're willing to go back to the Update REST API call, it does return the time that the response was generated. It probably won't be exactly the same as the time stamp on the record, but it will be close I'm sure.
You may need to hack your Azure Table drivers. In the Azure Python lib (TableStorage), for example, the Timestamp is simply skipped over:
# exclude the Timestamp since it is auto added by azure when
# inserting entity. We don't want this to mix with real properties
if name in ['Timestamp']:
    continue
I'm updating an object in Azure Table Storage using the StorageClient library with:
context.UpdateObject(obj);
context.SaveChangesWithRetries(obj);
when I do this, is there any way to get hold of the new timestamp for obj without making another request to the server?
Thanks
Stuart
To supplement Seva Titov's answer: the excerpt reported was valid at least until May 2013, but as of November 2013 it has changed (emphasis added):
The Timestamp property is a DateTime value that is maintained on the server side to record the time an entity was last modified. The Table service uses the Timestamp property internally to provide optimistic concurrency. The value of Timestamp is a monotonically increasing value, meaning that each time the entity is modified, the value of Timestamp increases for that entity. This property should not be set on insert or update operations (the value will be ignored).
Now the Timestamp property is no longer regarded as opaque, and it is documented that its value increases after each edit. This suggests that Timestamp could now be used to track subsequent updates (at least with regard to a single entity).
Nevertheless, as of November 2013 another request to Table Storage is still needed to obtain the new timestamp when you update an entity (see the documentation of the Update Entity REST method). Only when inserting an entity does the REST service return the entire entity with the timestamp (but I don't remember whether this is exposed by the StorageClient/Windows Azure Storage library).
MSDN page has some guidance on the usage of Timestamp field:
Timestamp Property
The Timestamp property is a DateTime value that is maintained on the server side to record the time an entity was last modified. The Table service uses the Timestamp property internally to provide optimistic concurrency. You should treat this property as opaque: It should not be read, nor set on insert or update operations (the value will be ignored).
This implies that it is really an implementation detail of Table Storage, and you should not rely on the Timestamp field to represent the time of the last update.
If you want a field which is guaranteed to represent the time of the last write, create a new field and set it on every update operation. I understand this is more work (and more storage space) to maintain the field, but it would actually resolve your question automatically -- how to get the timestamp back -- because you would already know it when calling context.UpdateObject().
The Timestamp property is actually a Lamport timestamp. It is guaranteed to always grow over time and while it is presented as a DateTime value it's really not.
On the server side, that is, Windows Azure Storage does this for each change:
nextTimestamp = Math.Max(currentTimestamp + 1, DateTime.UtcNow)
This is all there is to it. And it's of course guaranteed to happen in a transactional manner. The point of all this is to provide a logical clock (monotonic function) that can be used to ensure that the order of events happen in the intended order.
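In Java terms the rule is just a monotonic clock update, something like this (an illustration of the idea only, not the actual WAS code):

// Returns a value that is both larger than the previous timestamp and
// not behind wall-clock time (UTC milliseconds in this sketch).
static long nextTimestamp(long currentTimestampMillis) {
    long now = System.currentTimeMillis(); // stands in for DateTime.UtcNow
    return Math.max(currentTimestampMillis + 1, now);
}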
Here's a link to a version of the actual WAS paper, and while it doesn't contain any information on the timestamp scheme specifically, it has enough in it that you quickly realize there's only one logical conclusion you can draw from this. Anything else would be stupid. Also, if you have any experience with LevelDB, Cassandra, memtables and their ilk, you'll see that the WAS team went the same route.
Though I should add to clarify, since WAS provides a strong consistency model, the only way to maintain the timestamp is to do it under lock and key, so there's no way you can guess the correct next timestamp. You have to query WAS for the information. There's no way around that. You can however hold on to an old value and presume that it didn't change. WAS will tell you if it did and then you can resolve the race condition any way you see fit.
I am using Windows Azure Storage 7.0.0
You can check the result of the operation to get the ETag and Timestamp properties:
var tableResult = cloudTable.Execute(TableOperation.Replace(entity));
var updatedEntity = tableResult.Result as ITableEntity;
var eTag = updatedEntity.ETag;
var timestamp = updatedEntity.Timestamp;
I don't think so; as far as I know, Timestamp and ETag are set by Azure Storage itself.