I use operations in an operation queue that update data and send notifications when they finish.
My problem is that I may still be processing the old data while new entities are already available, and this causes an exception. I could process the old data first and then process the new data, but then I have to store/copy the old data anyway.
How can I solve this when Core Data doesn't allow retaining/copying its managed objects? I tried MagicalRecord, but neither it nor pure Core Data provides a ready-made solution for this.
I'm working on a new project, and I am still learning how to apply Microservice/Domain-Driven Design.
If the recommended architecture is to have a Database-Per-Service, and use Events to achieve eventual consistency, how does the service's database get initialized with all the data that it needs?
If the events indicating an update to the database occurred before the new service/db was ever designed, do I need to start with a copy of the previous database?
Or should I publish a 'New Service On The Block' event and let all the other services vomit everything back to me again? That could be a LOT of chattiness and cause performance issues.
how does the service's database get initialized with all the data that it needs?
It asks for it, which is to say that you design a protocol so that the service that is spinning up can get copies of all of the information that it needs. That often includes tracking checkpoints, and queries that allow you to ask what has happened since some checkpoint.
Think "pull", rather than "push".
Part of the point of "services" is designing the right data boundaries. Needing to copy a lot of data between services often indicates that the service boundaries should be reconsidered.
There is a streaming platform named Apache Kafka that solves something similar.
With Kafka you would publish events for other services to consume. What makes Kafka special is that events never get deleted (depending on configuration) and can be consumed again by new services spinning up. This feature can be used to initially populate the database, by setting the consumer's offset for a topic to 0 and re-reading the history of events.
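For example, a brand-new service could replay the full retained history of a topic by starting with a fresh consumer group. The sketch below uses the kafkajs client for Node as an illustration; the topic name, broker address, and event handling are assumptions.

```typescript
import { Kafka } from "kafkajs";

// Sketch: a new service re-reads the entire retained history of a topic
// by starting a consumer group that has no committed offsets yet.
const kafka = new Kafka({ clientId: "new-service", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "new-service-backfill" });

async function rebuildFromTopic(): Promise<void> {
  await consumer.connect();
  // fromBeginning: true makes a consumer group with no committed offsets
  // start at the earliest available offset, i.e. replay the history.
  await consumer.subscribe({ topic: "customer-events", fromBeginning: true });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value?.toString() ?? "{}");
      // ...apply the event to this service's own database.
      console.log("applied", event);
    },
  });
}

rebuildFromTopic().catch(console.error);
```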
There is also another feature, called GlobalKTable, which is a table view of all events for a particular topic. The GlobalKTable holds the latest value for each key (like a primary key) and can be turned into a state store (RocksDB under the hood), which makes it queryable. This state store initializes itself whenever the application starts up, so the application does not need a database of its own, because the state store is kept up to date automatically (consistency is still something to keep in mind). Only for more complex queries would the state store need to be accompanied by a database (with Kafka you would try to pre-compute the results of those queries and make them accessible in a distinct state store of their own).
This would be a complex endeavor, but if it suits your needs it is a fun thing to do!
I'm new to pouch/couch and looking for some guidance on handling conflicts. Specifically, I have an extension running PouchDB (distributed to two users). The idea is then to have a pouchdb-server or CouchDB instance running remotely (does it matter which, for this small a use case?). The crux of my concern is handling conflicts: the data will be changing frequently, and though the extensions won't be doing live sync, they will be syncing very often. I have conflict handling written into the data submission functions, but there could still be conflicts when syncing occurs with multiple users.
I was looking at the pouch-resolve-conflicts plugin and see immediately the author state:
"Conflict resolution should better be done server side to avoid hard to debug loops when multiple clients resolves conflicts on the same documents".
This makes sense to me, but I am unsure how to implement such conflict resolution. The only way I can think of would be to place a REST API layer in front of the remote database that handles all updates/conflicts etc. with custom logic. But then how could I use the pouch sync functionality? At that point I may as well just use a different database.
I've just been unable to find any resources discussing how to implement conflict resolution server-side; in fact, quite the opposite.
With your use case, you could probably write to a local PouchDB instance and sync it with the master database. Then, you could have a daemon that automatically resolves conflicts on your master database.
Below is my approach to solve a similar problem.
I have made a NodeJS daemon that automatically resolves conflicts. It integrates deconflict, a NodeJS library that allows you to resolve a document in three ways:
Merge all revisions together
Keep the latest revision (based on a custom key, e.g. updated_at); a sketch of this strategy follows the list
Pick a certain revision (here you can use your own logic)
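As a rough sketch of the second strategy, here is what it could look like written directly against the PouchDB API rather than the deconflict library itself. The database URL and the `updated_at` field are assumptions for the example.

```typescript
import PouchDB from "pouchdb";

const db = new PouchDB("http://localhost:5984/mydb");

// Minimal "keep the latest revision" resolver. It assumes every document
// carries an `updated_at` value that can be compared.
async function resolveByLatest(id: string): Promise<void> {
  const winner: any = await db.get(id, { conflicts: true });
  const losingRevs: string[] = winner._conflicts ?? [];
  if (losingRevs.length === 0) return;                 // nothing to resolve

  // Load every conflicting revision alongside the current winner.
  const losers: any[] = await Promise.all(
    losingRevs.map((rev) => db.get(id, { rev }))
  );

  // Pick the revision with the newest timestamp among winner and losers.
  const latest = [winner, ...losers].reduce((a, b) =>
    a.updated_at >= b.updated_at ? a : b
  );

  // Write the chosen content on top of the current winner...
  await db.put({ ...latest, _id: id, _rev: winner._rev });
  // ...and delete every losing revision so the conflict disappears.
  await Promise.all(losingRevs.map((rev) => db.remove(id, rev)));
}
```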
Revision deconflict
The way I use CouchDB, every write is partial: we always take some changes and apply them to the latest document. With this approach, we can easily take the merge-all-revisions strategy.
Conflict scanner
When the daemon boots, two processes are started. One goes through all the existing changes; if a conflict is detected, the document is added to a conflict queue.
The other process remains active as a continuous changes scanner: it listens to all new changes and adds conflicted documents to the conflict queue.
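A minimal sketch of such a scanner using PouchDB's live changes feed might look like this (the database URL is an assumption, and the queue is just an in-memory array for illustration):

```typescript
import PouchDB from "pouchdb";

const db = new PouchDB("http://localhost:5984/mydb");
const conflictQueue: string[] = [];   // ids of documents waiting to be resolved

// Listen to the live changes feed and enqueue any document that arrives
// with unresolved conflicts.
db.changes({ live: true, since: "now", include_docs: true, conflicts: true })
  .on("change", (change) => {
    const doc: any = change.doc;
    if (doc && doc._conflicts && doc._conflicts.length > 0) {
      conflictQueue.push(doc._id);
    }
  })
  .on("error", (err) => console.error("changes feed failed", err));
```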
Queue processing
Another process is started and keeps polling the queue for new conflicted documents. It gets conflicted documents in batches and resolves them one by one. If there are no documents, it waits for a certain period and then starts polling again.
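Continuing the sketch above, a very simple poller that drains such a queue could look like the following; the batch size, polling interval, and resolver callback are arbitrary choices for illustration.

```typescript
// Drain the conflict queue in small batches and sleep when it is empty.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function processQueue(
  queue: string[],                                // e.g. the conflictQueue above
  resolveConflict: (id: string) => Promise<void>  // e.g. resolveByLatest above
): Promise<void> {
  for (;;) {
    if (queue.length === 0) {
      await sleep(5000);                          // nothing to do, poll again later
      continue;
    }
    const batch = queue.splice(0, 10);            // take up to 10 document ids
    for (const id of batch) {
      await resolveConflict(id);
    }
  }
}
```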
Having worked a little bit with Redux, I realized that the same concept of unidirectional flow would help me avoid the problem of conflicts altogether.
Redux flows like this: the view dispatches an action, a reducer computes the new state from it, and the store pushes that state back out to the view; data only ever moves in one direction.
So my client-side code never writes definitive data to the master database; instead it writes insert/update/delete requests locally, which PouchDB then pushes to the CouchDB master database. On the same server as the master CouchDB I have PouchDB in NodeJS replicating these requests. "Supervisor" software in NodeJS examines each new request, changes its status to "processing", writes the requested updates, inserts, and deletes, then marks the request "processed". To ensure they're processed one at a time, the code that receives each request stuffs it into a FIFO, and the processing code pulls requests from the other end.
I'm not dealing with super high volume, so the latency is not a concern.
I'm also not facing a situation where numerous people might be trying to update exactly the same record at the same time. If that's your situation, your client-side update requests will need to specify the rev number and your "supervisors" will need to reject change requests that refer to a superseded version. You'll have to figure out how your client code would get and respond to those rejections.
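To make the idea concrete, here is a hedged TypeScript sketch of the request-document pattern with that rev check. The document shape, field names, and database URLs are my own illustrations, not the actual implementation described above.

```typescript
import PouchDB from "pouchdb";

// Hypothetical shape of a client-side "update request" document.
interface UpdateRequest {
  _id: string;
  _rev?: string;
  type: "request";
  targetId: string;
  targetRev: string;                     // rev of the target the client last saw
  payload: Record<string, unknown>;
  status: "new" | "processed" | "rejected";
}

const local = new PouchDB("requests");                       // client side
const master = new PouchDB("http://localhost:5984/master");  // server side

// Client: never write definitive data, only a request that replication
// pushes up to the master database.
async function requestUpdate(
  targetId: string,
  targetRev: string,
  payload: Record<string, unknown>
): Promise<void> {
  const req: UpdateRequest = {
    _id: `req:${Date.now()}:${targetId}`,
    type: "request",
    targetId,
    targetRev,
    payload,
    status: "new",
  };
  await local.put(req);
  await local.replicate.to(master);      // or rely on a continuous sync elsewhere
}

// Supervisor on the server: `req` arrives from the master's changes feed and
// therefore carries its current _rev. Requests that refer to a superseded
// revision of the target are rejected so the client can re-fetch and retry.
async function handleRequest(req: UpdateRequest): Promise<void> {
  const current: any = await master.get(req.targetId);
  if (req.targetRev !== current._rev) {
    await master.put({ ...req, status: "rejected" });
    return;
  }
  await master.put({ ...current, ...req.payload });   // the definitive write
  await master.put({ ...req, status: "processed" });
}
```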
I understand that events in event sourcing should never be allowed to change. But what about the in-memory state? If the domain model needs to be updated in some way, should old events still be replayed into the old models? In other words, should it always be possible to replay events and get the exact same state as before, or is it acceptable for this state to evolve too, as long as the stored events remain the same? Ideally I think I'd like to be able to get a state as it was, with its old models, rules, and so on. But other than that, I of course also want to replay old events into new models. What does the theory say about this?
Anticipate event structure changes
You should always try to reflect, in your event application mechanism (i.e. where you read events and apply them to the model), the fact that an event used to have a different structure. After all, the earlier structure of an event was a valid structure at that time.
This means that you need to be prepared for this situation. Design the event application mechanism to be flexible enough that you can support this case.
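One common way to build in that flexibility is to version events and "upcast" older shapes to the current one before applying them. The sketch below is purely illustrative; the event type, fields, and versions are made up.

```typescript
// Events carry a version; old versions are upgraded before being applied.
interface StoredEvent {
  type: string;
  version: number;
  data: any;
}

// v1 AddressChanged stored a single `address` string; v2 splits it up.
function upcastAddressChanged(event: StoredEvent): StoredEvent {
  if (event.version === 1) {
    const [street, city] = String(event.data.address)
      .split(",")
      .map((s) => s.trim());
    return { ...event, version: 2, data: { street, city } };
  }
  return event;                          // already the current shape
}

function applyToCustomer(
  state: { street?: string; city?: string },
  event: StoredEvent
) {
  if (event.type === "AddressChanged") {
    const current = upcastAddressChanged(event);
    return { ...state, ...current.data };
  }
  return state;
}
```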
Migrating stored events
Only as a very last resort should you migrate the stored events. If you do it, make sure you understand the consequences:
Which other systems consumed the legacy events?
Do we have a problem with them if we change a stored event?
Does the migration work for our system (verify in a QA environment with a full data set)?
Here is my scenario:
I have two servers with a multi-threaded message queuing consumer on each (two consumers total).
I have many message types (CreateParent, CreateChild, etc.)
I am stuck with bad legacy code (creating a child partially creates a parent; I know it is bad, but I cannot change that)
Message ordering cannot be assumed (message queuing principle!)
RabbitMQ is my message queuing broker.
My problem:
When two threads are running simultaneously (one executing a CreateParent, the other executing a CreateChild), they generate conflicts because the two threads both try to create the Parent in the database (remember the legacy code!)
My initial solution:
Inside the consumer, I created an "entity locking" concept. So when a thread processes a CreateChild message, for example, it locks both the Child and the Parent (legacy code!!) so that the CreateParent message processing has to wait. I used a basic .NET Monitor and a list of IDs to implement this concept. It works well.
My initial solution limitation:
My "entity locking" concept works well on a single consumer in a single process on a single server. But it will not works across multiple servers running multiple consumers.
I am thinking of using a shared database to "store" my entity locks, so each process (and thread) could check the database to see which entities are currently locked.
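Just to illustrate the idea (in TypeScript against Postgres, not my actual .NET stack, and the `entity_locks` table is hypothetical), the shared-database lock could boil down to a single atomic insert:

```typescript
import { Pool } from "pg";   // any shared store works; Postgres is only an example

const pool = new Pool();     // connection settings come from the environment

// A row in a shared `entity_locks` table acts as the lock. The unique primary
// key on entity_id makes acquisition atomic, so two consumers on different
// servers cannot both hold the same entity at once.
async function tryLock(entityId: string, owner: string): Promise<boolean> {
  const res = await pool.query(
    `INSERT INTO entity_locks (entity_id, owner, locked_at)
     VALUES ($1, $2, now())
     ON CONFLICT (entity_id) DO NOTHING`,
    [entityId, owner]
  );
  return res.rowCount === 1;            // true only if we inserted the lock row
}

async function unlock(entityId: string, owner: string): Promise<void> {
  await pool.query(
    `DELETE FROM entity_locks WHERE entity_id = $1 AND owner = $2`,
    [entityId, owner]
  );
}
```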
My question (finally!):
All this is becoming very complex, and it increases the risk of bugs and the code maintenance burden. I really don't like it!
Has anyone already faced this kind of problem? Are there acceptable workarounds for it?
Does anyone have an idea for a clean solution for my scenario?
Thanks!
In the end, simple solutions are always the better ones!
Instead of using all the complexity of my "entity locking" concept, I finally settled on pre-validating all the required data and entity states before executing the request.
More precisely, instead of letting the CreateChild processing crash by itself when it encounters data already created by the CreateParent, I fully validate that everything is okay in the database BEFORE executing the CreateChild message.
The drawback of this solution is that the CreateChild implementation must be aware of the specific data the CreateParent produces and verify its presence before starting execution. But seriously, this is far better than locking everything across systems!
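As a hedged sketch of what that pre-validation could look like (the Repo interface and message shape are hypothetical stand-ins for the existing data access code, not anything from the actual system):

```typescript
interface CreateChildMessage { parentId: string; childId: string }

// Hypothetical stand-in for the legacy data access layer.
interface Repo {
  parentExists(id: string): Promise<boolean>;
  createPartialParent(id: string): Promise<void>;   // the legacy side effect
  createChild(childId: string, parentId: string): Promise<void>;
}

// Before running the legacy CreateChild path, check the state that a
// concurrent CreateParent may already have produced, instead of letting the
// duplicate insert blow up.
async function handleCreateChild(repo: Repo, msg: CreateChildMessage): Promise<void> {
  if (!(await repo.parentExists(msg.parentId))) {
    // No parent yet, so the legacy partial-parent insert cannot collide.
    await repo.createPartialParent(msg.parentId);
  }
  // The parent row now exists either way; create the child against it.
  await repo.createChild(msg.childId, msg.parentId);
}
```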
What is the best way to check iCloud for existing data?
I need to check whether data that doesn't yet exist on the local device is available in iCloud, so I can then download it.
Since you included the core-data tag I'm assuming you mean that you're using Core Data rather than iCloud file APIs or the ubiquitous key-value store.
With Core Data's built-in iCloud support, you check on existing data in exactly the same way as if you were not using iCloud. Once you create your Core Data stack, you check what data exists by doing normal Core Data fetches. There's no (exposed) concept of local vs. cloud data; there's just one data store that happens to know how to communicate with iCloud. You don't explicitly initiate downloads; they happen automatically.
At app launch time when you call addPersistentStoreWithType:configuration:URL:options:error:, Core Data internally initiates the download of any data that's not available locally yet. As a result this method may block for a while. If it returns successfully, then all current downloads can be assumed to be complete.
If new changes appear while your app is running, Core Data will download and import them and, when it's finished, will post NSPersistentStoreDidImportUbiquitousContentChangesNotification to tell you what just happened.
This all describes how Core Data's iCloud support is supposed to work. In practice you'll probably find that it doesn't always work as intended.
Thanks to @Tom Harrington for pointing out that this error has nothing to do with the developer or the code; it's purely down to iCloud/Apple/connection issues.
There is more in this SO answer I found.