I'm currently working on a calendar system written in an event-sourced style.
What I'm currently struggling with is how to handle an event that creates lots of smaller events, and how to store those events in a way that allows them to be replayed.
For example, I may trigger a CreateReminderSchedule which then triggers the construction of many smaller events such as CreateReminder.
{
id: 1,
description: "Clean room",
weekdays: [5],
start: 01.12.2018,
end: 01.12.2018,
type: CREATEREMINDERSCHEDULE
}
This will then create loads of CreateReminder aggregates with different ids so you can edit the smaller ones individually, e.g.:
{
id: 2,
description: "Clean room",
date: 07.12.2018,
type: CREATEREMINDER,
scheduleId: 1
}
So to me one problem is that when I replay all the events, the CreateReminderSchedule will retrigger the CreateReminder events, which means I'll have more reminders than needed after the replay.
Is the answer to remove the smaller events and just have one big create event listing all the ids of the reminders within the event like:
{
id: 1,
description: "Clean room",
weekdays: [5],
start: 01.12.2018,
end: 01.12.2018,
type: CREATEREMINDERSCHEDULE,
reminderIds: [2,3,4,5,...]
}
But if I do it this way then I won't have the base event for all my reminder aggregates.
Note that the reminders must be aware of the reminderSchedule, so I can later change the reminderSchedule to update all the reminders related to it.
Perhaps you are confusing events with commands. You could have a command that is processed to create your reminders (in the form of events, i.e. ReminderCreated), which is then applied to your aggregate to create your reminder objects. This state is recreated in the same way every time you replay your events from their source.
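To make that concrete, here is a minimal sketch of the command/event split. All names and the occurrence-expansion logic are hypothetical, not taken from the question: the point is only that the command handler runs once and emits events, while replay folds stored events into state and never re-runs the handler.

```javascript
// Sketch (names and the occurrence list are hypothetical): the command
// handler runs exactly once and returns events; only the events are stored.
function handleCreateReminderSchedule(cmd) {
  // one event for the schedule itself...
  var events = [{ type: 'ScheduleCreated', id: cmd.id, description: cmd.description }];
  // ...plus one ReminderCreated per occurrence, each with its own id
  cmd.occurrences.forEach(function (date, i) {
    events.push({
      type: 'ReminderCreated',
      id: cmd.id + '-' + (i + 1),
      scheduleId: cmd.id,
      date: date
    });
  });
  return events;
}

// Replay folds stored events into state; it never re-runs the handler,
// so replaying cannot create extra reminders.
function replay(events) {
  var state = { schedules: {}, reminders: {} };
  events.forEach(function (e) {
    if (e.type === 'ScheduleCreated') state.schedules[e.id] = e;
    if (e.type === 'ReminderCreated') state.reminders[e.id] = e;
  });
  return state;
}
```

Since each reminder gets its own ReminderCreated event carrying a scheduleId, individual reminders stay independently editable while still being traceable to their schedule.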
I have a model like the following one in MongoDB using Mongoose:
Stuff
{
_id: ObjectId...,
stuff: String..,
someBoolean: Boolean,
description: String,
transactionsOfThisStuff: [{
transactionNumber: ObjectId?,
date: Date.now(),
info: String
}]
}
As you can see, the idea is to move stuff, and I need to register every movement, so I made an array of "transactions" where I keep the history.
To make a "transaction" there are some requirements, for example, the "someBoolean" must be in certain value, etc.
And when a transaction is made, some values of the stuff must be updated.
Also, I must be able to move multiple stuff at the same time (move a table, a plumbus, etc), so all of them will have the same "transactionNumber" in each document.
The problem I see with this model is that I can't easily, for example, list the last 10 movements, and I don't find it efficient to get the Stuff that has been moved with a given "transactionNumber".
If I use two models:
Stuff
{
_id: ObjectId...,
stuff: String..,
someBoolean: Boolean,
description: String,
}
Transaction
{
_id: ObjectId,
date: Date.now(),
info: String,
stuff: [{type: ObjectId, ref: 'Stuff', required: true}]
}
The problem with this idea is that I would need ACID guarantees, since if I move multiple Stuff, I also need to update some values in each Stuff document :/
Edit:
I save hardware parts, like CPU, Mouse, Keyboard, Monitor, etc. Each one of them is stored in a "logical" warehouse. I can make different types of "transactions", like moving an item to another warehouse, giving it to a person, taking it back, etc. Each transaction must be tracked, so I can have a history of that item. Also, I can move many items at the same time within the same transaction; for example, in transaction number 21361764 I moved 10 different items. At the same time, I need to update some info in the item, like isStored: false/true.
The transaction itself must have the date of the execution, a client, a user (who is doing the transaction), and an array of items, etc.
The ideas above the edit are what I have come up with so far, but each one has problems since I would need transactions. There must be a way to solve this problem without falling back to a relational database.
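One way to frame the two-model idea is to separate the pure data transformation (what documents and updates a movement produces) from how they are applied. A sketch, with field names taken from the schemas above and everything else (function name, argument shapes) assumed:

```javascript
// Sketch: given the items being moved, build the separate transaction
// document plus the per-item updates that must be applied together.
// Field names follow the question's schemas; the rest is illustrative.
function buildMove(transactionId, itemIds, info, now) {
  return {
    transactionDoc: {
      _id: transactionId,
      date: now,
      info: info,
      stuff: itemIds            // refs back to the moved Stuff documents
    },
    itemUpdates: itemIds.map(function (id) {
      return {
        filter: { _id: id },
        update: { $set: { isStored: false } } // per-item state change
      };
    })
  };
}
```

If you are on MongoDB 4.0+ with a replica set, the resulting insert and updates can be applied atomically inside a multi-document transaction (e.g. Mongoose's `session.withTransaction()`); on older versions you would fall back to an idempotent two-step write that can be safely retried.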
I have a model called a Transaction which has the following schema
var transactionSchema = new mongoose.Schema({
amount: Number,
status: String,
_recipient: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
_sender: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
});
I want both sender and recipient of this transaction to be able to 'confirm' that the transaction took place. The status starts out as "initial". So when only the sender has confirmed the transaction (but the recipient yet not), I want to update the status to "senderConfirmed" or something, and when the recipient has confirmed it (but sender has not), I want to update status to "recipientConfirmed". When they have both confirmed it, I want to update the status to "complete".
The problem is, how I can know when to update it to "complete" in a way that avoids race conditions? If both sender and recipient go to confirm the transaction at the same time, then both threads will think the status is "initial" and update it just to "senderConfirmed" or "recipientConfirmed", when in actuality it ought to go to "complete".
I read about MongoDB's two-phase commit approach here, but that doesn't quite fit my need, since I don't want (in the case that another thread is currently modifying a transaction) to prevent the second thread from making its update - I just want it to wait until the first thread is finished before doing its update, and then make the content of its update contingent on the latest status of the transaction.
Bottom line is you need "two" update statements to do this, for each of sender and recipient respectively. So basically one is going to try to set the "partial" status to complete, and the other will only match the "initial" status and set it to the "partial" state.
Bulk operations are the best way to implement multiple statements, so you should use these by accessing the underlying driver methods. Modern API releases have the .bulkWrite() method, which degrades nicely if the server version does not support the "bulk" protocol, and just falls back to issuing separate updates.
// sender confirmation
Transaction.collection.bulkWrite(
[
{ "updateOne": {
"filter": {
"_id": docId,
"_sender": senderId,
"status": "recipientConfirmed"
},
"update": {
"$set": { "status": "complete" }
}
}},
{ "updateOne": {
"filter": {
"_id": docId,
"_sender": senderId,
"status": "initial"
},
"update": {
"$set": { "status": "senderConfirmed" }
}
}}
],
{ "ordered": false },
function(err,result) {
// result will confirm only 1 update at most succeeded
}
);
And of course the same applies for the _recipient, except with the different status check and change. You could alternatively issue an $or condition on the _sender or _recipient and have a generic "partial" status instead of coding different update conditions, but the same basic "two update" process applies.
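For illustration, the recipient-side pair can be expressed as plain operation descriptors mirroring the sender code above (`recipientOps` is a name I've made up; the descriptor shape is what bulkWrite expects):

```javascript
// Sketch: build the recipient's two conditional updates as plain
// bulkWrite descriptors. At most one of them can ever match.
function recipientOps(docId, recipientId) {
  return [
    { updateOne: {
        // sender already confirmed: this confirmation completes it
        filter: { _id: docId, _recipient: recipientId, status: 'senderConfirmed' },
        update: { $set: { status: 'complete' } }
    }},
    { updateOne: {
        // nobody confirmed yet: record the partial state
        filter: { _id: docId, _recipient: recipientId, status: 'initial' },
        update: { $set: { status: 'recipientConfirmed' } }
    }}
  ];
}
```

You would pass the result straight to `Transaction.collection.bulkWrite(recipientOps(docId, recipientId), { "ordered": false }, callback)`.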
Of course again you "could" just use the regular methods and issue both updates to the server in another way, possibly even in parallel since the conditions remain "atomic", but that is also the reason for the { "ordered": false } option, since there is no determined sequence that needs to be respected here.
Bulk operations though are better than separate calls, since the send and return is only one request and response, as opposed to "two" of each, so the overhead using bulk operations is far less.
But that is the general approach. No single statement could possibly leave a "status" in "deadlock" or mark as "complete" before the other party also issues their confirmation.
There is a "possibility" and a very slim one that a status was changed from "initial" in between the first attempt update and the second, which would result in nothing being updated. In that case, you can "retry" the action on which it "should" update on the subsequent attempt.
This should only ever need "one" retry at most though. And very very rarely.
NOTE: Care should be taken when using the .collection accessor on Mongoose models. All the regular model methods have built in logic to "ensure" the connection to the database is actually present before they do anything, and in fact "queue" operations until a connection is present.
It's generally good practice to wrap your application startup in an event handler to ensure the database connection:
mongoose.connection.on("open", function() {
// App startup and init here
})
So use the "on" or "once" events for this case.
Generally though a connection is always present either after this event is fired, or after any "regular" model method has already been called in the application.
Possibly mongoose will include methods like .bulkWrite() directly on the model methods in future releases. But presently it does not, so the .collection accessor is necessary to grab the underlying Collection object from the core driver.
Update: I am clarifying my answer based on a comment that my original response did not provide an answer.
An alternative approach would be to keep track of the status as two separate properties:
senderConfirmed: true/false,
recipientConfirmed: true/false,
When the sender confirms you simply update the senderConfirmed field. When the recipient confirms you update the recipientConfirmed field. There is no way they will overwrite each other.
To determine if the transaction is complete you would merely query {senderConfirmed:true,recipientConfirmed:true}.
Obviously this is a change to the document schema, so it may not be ideal.
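A minimal sketch of the two-flag variant (assuming the schema change described above; the function names are mine, not Mongoose API):

```javascript
// Sketch of the two-flag variant: each party's confirmation touches only
// its own field, so concurrent confirmations cannot overwrite each other.
function senderConfirmUpdate(docId, senderId) {
  return {
    filter: { _id: docId, _sender: senderId },
    update: { $set: { senderConfirmed: true } }
  };
}

// The overall status is derived from the flags rather than stored,
// so there is no status field to race on.
function deriveStatus(doc) {
  if (doc.senderConfirmed && doc.recipientConfirmed) return 'complete';
  if (doc.senderConfirmed) return 'senderConfirmed';
  if (doc.recipientConfirmed) return 'recipientConfirmed';
  return 'initial';
}
```

The sender side would then run something like `Transaction.updateOne(op.filter, op.update)`, and "complete" is just the query `{senderConfirmed: true, recipientConfirmed: true}` as described above.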
Original Answer:
Is a change to your schema possible? What if you had two properties - senderStatus and recipientStatus? Sender would only update senderStatus and recipient would only update recipientStatus. Then they couldn't overwrite each others changes.
You would still need some other way to mark it as complete, I assume. You could use a cron job or something...
I have a "task" model (collection) in my app, it has a due date and I'd like to send out notifications once the task is past due.
How should I implement the "past due" property so the system can detect "past due" at any time?
Do I set up cron job to check every minute or is there a better way?
I'd recommend using synced-cron for this. It has a nice interface, and if you expand to multiple instances, you don't have to worry about each instance trying to execute the task. Here is an example of how you could use it:
SyncedCron.add({
name: 'Notify users about past-due tasks',
schedule: function(parser) {
// check every two minutes
return parser.recur().every(2).minute();
},
job: function() {
if (Tasks.find({dueAt: {$lte: new Date()}}).count())
emailUsersAboutPastDueTasks();
}
});
Of course, you'd also want to record which users had been notified or run this less frequently so your users don't get bombarded with notifications.
What is the best way to propagate updates when you have a denormalized Schema? Should it be all done in the same function?
I have a schema like so:
var Authors = new Schema({
...
name: {type: String, required: true},
period: {type: Schema.Types.ObjectId, ref: 'Periods'},
quotes: [{type: Schema.Types.ObjectId, ref: 'Quotes'}],
active: Boolean,
...
})
Then:
var Periods = new Schema({
...
name: {type: String, required:true},
authors: [{type: Schema.Types.ObjectId, ref:'Authors'}],
active: Boolean,
...
})
Now say I want to denormalize Authors, since the period field will always just use the name of the period (which is unique, there can't be two periods with the same name). Say then that I turn my schema into this:
var Authors = new Schema({
...
name: {type: String, required:true},
period: String, //no longer a ref
active: Boolean,
...
})
Now Mongoose no longer knows that the period field is connected to the Periods schema. So it's up to me to update the field when the name of a period changes. I created a service module that offers an interface like this:
exports.updatePeriod = function(id, changes) {...}
Within this function I go through the changes to update the period document that needs to be updated. So here's my question. Should I, then, update all authors within this method? Because then the method would have to know about the Author schema and any other schema that uses period, creating a lot of coupling between these entities. Is there a better way?
Perhaps I can emit an event that a period has been updated and all the schemas that have denormalized period references can observe it, is that a better solution? I'm not quite sure how to approach this issue.
Ok, while I wait for a better answer than my own, I will try to post what I have been doing so far.
Pre/Post Middleware
The first thing I tried was to use the pre/post middlewares to synchronize documents that referenced each other. (For instance, if you have Author and Quote, and an Author has an array of the type: quotes: [{type: Schema.Types.ObjectId, ref:'Quotes'}], then whenever a Quote is deleted, you'd have to remove its _id from the array. Or if the Author is removed, you may want all his quotes removed).
This approach has an important advantage: if you define each Schema in its own file, you can define the middleware there and have it all neatly organized. Whenever you look at the schema, right below you can see what it does, how its changes affect other entities, etc:
var Quote = new Schema({
//fields in schema
})
//it's quite clear what happens when you remove an entity
Quote.pre('remove', function(next) {
Author.update(
{ quotes: this._id },            //find authors referencing this quote
{ $pull: { quotes: this._id } }, //remove it from their quotes array
{ multi: true },
next
)
})
The main disadvantage, however, is that these hooks are not executed when you call update() or any of the Model's static updating/removing functions. Instead you need to retrieve the document and then call save() or remove() on it.
Another smaller disadvantage is that Quote now needs to be aware of anyone that references it, so that it can update them whenever a Quote is updated or removed. So let's say that a Period has a list of quotes, and Author has a list of quotes as well, Quote will need to know about these two to update them.
The reason for this is that those static functions send atomic queries to the database directly. While this is nice, I hate the inconsistency between using save() and Model.update(...). Maybe somebody else, or you in the future, accidentally uses the static update functions and your middleware isn't triggered, giving you headaches that you struggle to get rid of.
NodeJS Event Mechanisms
What I am currently doing is not really optimal, but it offers me enough benefits to actually outweigh the cons (or so I believe; if anyone cares to give me some feedback, that'd be great). I created a service that wraps around a model, say an AuthorService, that extends events.EventEmitter and is a constructor function that looks roughly like this:
function AuthorService() {
var self = this
this.create = function() {...}
this.update = function() {
...
self.emit('AuthorUpdated', before, after)
...
}
}
util.inherits(AuthorService, events.EventEmitter)
module.exports = new AuthorService()
The advantages:
Any interested function can register to the Service events and be notified. That way, for instance, when a Quote is updated, the AuthorService can listen to it and update the Authors accordingly. (Note 1)
Quote doesn't need to be aware of all the documents that reference it; the Service simply triggers the QuoteUpdated event, and all the documents that need to perform operations when this happens will do so.
Note 1: As long as this service is used whenever anyone needs to interact with mongoose.
The disadvantages:
Added boilerplate code, using a service instead of mongoose directly.
Now it isn't exactly obvious what functions get called when you trigger the event.
You decouple producer and consumer at the cost of legibility (since you just emit('EventName', args), it's not immediately obvious which Services are listening to this event).
Another disadvantage is that someone can retrieve a Model from the Service and call save(), in which case the events won't be triggered, though I'm sure this could be addressed with some kind of hybrid between these two solutions.
I am very open to suggestions in this field (which is why I posted this question in the first place).
I'm gonna speak more from an architectural point of view than a coding point of view, since when it comes right down to it, you can pretty much achieve anything with enough lines of code.
As far as I've been able to understand, your main concern has been keeping consistency across your database, mainly removing documents when their references are removed and vice-versa.
So in this case, rather than wrapping the whole functionality in extra code I'd suggest going for atomic Actions, where an Action is a method you define yourself that performs a complete removal of an entity from the DB (both document and reference).
So for example when you wanna remove an author's quote, you do something like removing the Quote document from the DB and then removing the reference from the Author document.
This sort of architecture ensures that each of these Actions performs a single task and performs it well, without having to tap into events (emitting, consuming) or any other stuff. It's a self-contained method for performing its own unique task.
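A sketch of such an Action, with the models injected so the function stays self-contained and testable (all names are illustrative, not from the question):

```javascript
// Sketch of an atomic-style "Action": one function owns the whole removal
// (the Quote document plus the Author's back-reference), so callers can
// never perform only half of it. Models are injected for testability.
function makeRemoveQuoteAction(models) {
  return function removeQuote(quoteId, authorId, callback) {
    models.Quote.deleteOne({ _id: quoteId }, function (err) {
      if (err) return callback(err);
      models.Author.updateOne(
        { _id: authorId },
        { $pull: { quotes: quoteId } }, // drop the dangling reference
        callback
      );
    });
  };
}
```

With real Mongoose models you would build it once, e.g. `var removeQuote = makeRemoveQuoteAction({ Quote: Quote, Author: Author })`, and make it the only code path for quote removal.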
I'm developing an auction-style web app, where products are available for a certain period of time.
I would like to know how would you model that.
So far, what I've done is storing products in DB:
{
...
id: p001,
name: Product 1,
expire_date: 'Mon Oct 7 2013 01:23:45 UTC',
...
}
Whenever a client requests that product, I test *current_date < expire_date*.
If true, I show the product data and, client side, a countdown timer. If the timer reaches 0, I disable the related controls.
But, server side, there are some operations that need to be done even if nobody has requested that product, for example, notifying the owner that his product has ended.
I could scan the whole collection of products on each request, but that seems cumbersome to me.
I thought on triggering a routine with cron every n minutes, but would like to know if you can think on any better solutions.
Thank you!
Some thoughts:
Index the expire_date field. You'll want to if you're scanning for auction items older than a certain date.
Consider adding a second field that is expired (or active) so you can also do other types of non-date searches (as you can always, and should anyway, reject auctions that have expired).
Assuming you add a second field active, for example, you can further limit the scans to only those auction items that are active and beyond the expiration date. Consider a compound index for those cases (as over time, you'll have more and more expired items you don't need to scan through).
Yes, you should add a timed task using your favorite technique to scan for expired auctions. There are lots of ways to do this -- your infrastructure will help determine what makes sense.
Keep a local cache of current auction items in memory if possible, to make scanning as efficient as possible. There's no reason to hit the database if nothing is expiring.
Again, always check when retrieving from the database to confirm that items are still active -- as there easily could be race conditions where expired items may expire while being retrieved for display.
You'll possibly want to store the state of status e-mails, etc. in the database so that any server restarts, etc. are properly handled.
It might be something like:
{
...
id: p001,
name: "Product 1",
expire_date: ISODate("Mon Oct 7 2013 01:23:45 UTC"),
active: true,
...
}
// console
db.auctions.ensureIndex({expire_date: -1, active: 1})
// javascript idea:
var theExpirationDate = new Date(2013, 9, 6, 0, 0, 0); // JS months are 0-indexed: 9 = October
db.auctions.find({ expire_date : { "$lte" : theExpirationDate }, active: true })
Scanning the entire collection on each request sounds like a huge waste of processing time.
I would use something like pm2 to handle both keeping track of your main server process as well as running periodic tasks with its built-in cron-like functionality.
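For example, a pm2 ecosystem file can declare a one-shot job that pm2 re-launches on a cron schedule (the script names here are placeholders; `cron_restart` together with `autorestart: false` is the relevant pair of options):

```javascript
// ecosystem.config.js — sketch: the main server plus a worker script
// that scans for expired auctions, re-launched by pm2 on a cron schedule.
module.exports = {
  apps: [
    { name: 'web', script: './server.js' },
    {
      name: 'expire-auctions',
      script: './jobs/expireAuctions.js', // placeholder path
      autorestart: false,                 // let it exit after each run
      cron_restart: '*/5 * * * *'         // re-launch every five minutes
    }
  ]
};
```

The worker itself would just run the expired-auction query from the previous answer, send notifications, flip `active` to false, and exit.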