I have tried researching and understanding the async and non-blocking abilities of Play.
What I understand (which may be wrong):
Action.async produces a Future[Result], which is a placeholder for a result that has yet to be received. A request comes in and a method handles it (say, with a database query), but as the db call is made, the thread is freed up to take another request. So how does the system not lose track of that db call once a thread is no longer attached to it?
Once the result is received, does an available thread then pick up that result and respond with it?
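For concreteness, here is the picture I have in my head, sketched in Node.js style with a made-up queryAsync function standing in for a real non-blocking database driver (I know Play is Scala; this is just the concept):

function queryAsync(sql, callback) {
  // stands in for real network I/O: returns immediately and
  // invokes the callback later, when the "result" arrives
  setTimeout(function () {
    callback(null, [{ id: 1, name: 'alice' }]);
  }, 100);
}

queryAsync('SELECT * FROM users WHERE id = 1', function (err, rows) {
  // nothing was blocked while the "query" ran; the runtime calls
  // back into this function once the result is ready
  console.log('got rows:', rows);
});

console.log('the thread is already free to handle another request');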
Usually when learning new concepts I'd like an animated or layman's-terms video that visually shows the threads and requests.
Also, isn't there waiting that has to be done on every request anyway? Is the difference just in how resources are used while that waiting happens?
Thanks in advance!
When you use Node's EventEmitter, you subscribe to a single event. Your callback is only executed when that specific event is fired:
eventBus.on('some-event', function (data) {
  // data is specific to 'some-event'
});
In Flux, you register your store with the dispatcher, and then your store is called when every single event is dispatched. It is the job of the store to filter through every event it gets and determine whether the event is important to it:
eventBus.register(function (data) {
  switch (data.type) {
    case 'some-event':
      // now data is specific to 'some-event'
      break;
  }
});
In this video, the presenter says:
"Stores subscribe to actions. Actually, all stores receive all actions, and that's what keeps it scalable."
Question
Why and how is sending every action to every store [presumably] more scalable than only sending actions to specific stores?
The scalability referred to here is more about scaling the codebase than scaling in terms of how fast the software runs. Data in Flux systems is easy to trace because every store is registered to every action, and the actions define every app-wide event that can happen in the system. Each store can determine how it needs to update itself in response to each action, without the programmer needing to decide which stores to wire up to which actions. In most cases, you can change or read the code for a store without needing to worry about how it affects any other store.
At some point the programmer will need to register the store. The store is very specific to the data it'll receive from the event. How exactly is looking up the data inside the store better than registering for a specific event, and having the store always expect the data it needs/cares about?
The actions in the system represent the things that can happen in a system, along with the relevant data for that event. For example:
A user logged in; comes with user profile
A user added a comment; comes with comment data, item ID it was added to
A user updated a post; comes with the post data
So, you can think of the actions as the database of things the stores can know about. Any time an action is dispatched, it's sent to each store, so at any given time you only need to think about your data mutations one store and one action at a time.
For instance, when a post is updated, you might have a PostStore that watches for the POST_UPDATED action, and when it sees it, it will update its internal state to store off the new post. This is completely separate from any other store which may also care about the POST_UPDATED event—any other programmer from any other team working on the app can make that decision separately, with the knowledge that they are able to hook into any action in the database of actions that may take place.
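As a minimal sketch of such a store (assuming a generic dispatcher object with a register method; the names are made up):

var posts = {};  // the store's internal state: posts keyed by id

dispatcher.register(function (action) {
  switch (action.type) {
    case 'POST_UPDATED':
      // store the new post; any other store can react to the
      // same action independently
      posts[action.post.id] = action.post;
      break;
    // actions this store doesn't care about simply fall through
  }
});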
Another reason this is useful and scalable in terms of the codebase is inversion of control: each store decides which actions it cares about and how to respond to each one, and all the data logic is centralized in that store. This is in contrast to a pattern like MVC, where a controller is explicitly set up to call mutation methods on models, and one or more other controllers may also be calling mutation methods on the same models at the same time (or at different times). There, the data-update logic is spread through the system, and understanding the data flow requires understanding each place the model might update.
Finally, another thing to keep in mind is that registering vs. not registering is sort of a matter of semantics; it's trivial to abstract away the fact that the store receives all actions. For example, in Fluxxor, the stores have a method called bindActions that binds specific actions to specific callbacks:
this.bindActions(
  "FIRST_ACTION_TYPE", this.handleFirstActionType,
  "OTHER_ACTION_TYPE", this.handleOtherActionType
);
Even though the store receives all actions, under the hood it looks up the action type in an internal map and calls the appropriate callback on the store.
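Conceptually, that lookup might be as simple as this (not Fluxxor's actual source, just the idea):

function makeHandler(bindings) {
  return function (action) {
    var callback = bindings[action.type];  // look up by action type
    if (callback) {
      callback(action.payload);  // only bound types run any code
    }
  };
}

var handler = makeHandler({
  FIRST_ACTION_TYPE: function (payload) { console.log('first', payload); },
  OTHER_ACTION_TYPE: function (payload) { console.log('other', payload); }
});

handler({ type: 'FIRST_ACTION_TYPE', payload: 42 });  // logs: first 42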
I've been asking myself the same question, and I can't see technically how registering per event adds much, beyond simplification. I will pose my understanding of the system so that, hopefully, if I am wrong, I can be corrected.
TL;DR: EventEmitter and Dispatcher serve similar purposes (pub/sub) but focus their efforts on different features. Specifically, the waitFor functionality (which allows one event handler to ensure that a different one has already been called) is not available with EventEmitter; Dispatcher has focused its efforts on the waitFor feature.
The final result of the system is to communicate to the stores that an action has happened. Whether the store 'subscribes to all events, then filters' or 'subscribes to a specific event' (with filtering at the dispatcher) should not affect the final result: data is transferred in your application either way. (A handler always just switches on the event type and processes it; it doesn't want to operate on ALL events.)
As you said, "At some point the programmer will need to register the store." It is just a question of the fidelity of the subscription. I don't think a change in fidelity has any effect on 'inversion of control', for instance.
The added (killer) feature in Facebook's Dispatcher is its ability to waitFor a different store to handle the event first. The question is: does this feature require that each store have only one event handler?
Let's look at the process. When you dispatch an action on the Dispatcher, it (omitting some details):
iterates all registered subscribers (to the dispatcher)
calls the registered callback (one per store)
the callback can call waitFor() and pass a dispatchToken; this internally references the callback registered by a different store. That callback is executed synchronously, causing the other store to receive the action and be updated first. This requires that waitFor() is called before the code which handles the action.
the callback invoked via waitFor switches on the action type to execute the correct code
the callback can now run its own code, knowing that its dependencies (other stores) have already been updated
the callback switches on the action type to execute the correct code
This seems a very simple way to allow event dependencies.
Basically, all callbacks are eventually called, but in a specific order, and each then switches so that only the code for the specific action runs. So it is as if we had triggered a handler for the 'add-item' event on each store, in the correct order.
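To make that concrete, here is a minimal sketch using the Dispatcher from the flux npm package (the store internals are made up):

var Dispatcher = require('flux').Dispatcher;
var dispatcher = new Dispatcher();

var items = [];
var count = 0;

var itemStoreToken = dispatcher.register(function (action) {
  if (action.type === 'add-item') {
    items.push(action.item);  // update the item list first
  }
});

dispatcher.register(function (action) {
  if (action.type === 'add-item') {
    // ensure the item store's callback has already run for this action
    dispatcher.waitFor([itemStoreToken]);
    count = items.length;  // safe: items is already up to date
  }
});

dispatcher.dispatch({ type: 'add-item', item: { id: 1 } });
console.log(count);  // 1 — the count store ran after the item store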
If subscriptions were at a callback level (not the 'store' level), would this still be possible? It would mean:
Each store would register multiple callbacks to specific events, keeping references to their 'dispatchTokens' (same as currently)
Each callback would have its own 'dispatchToken'
The user would still 'waitFor' a specific callback, but it would be a specific handler for a specific store
The dispatcher would then only need to dispatch to callbacks of a specific action, in the same order
Possibly, the smart people at Facebook have figured out that adding the complexity of individual callbacks would actually be less performant, or possibly it is not a priority.
I have defined a callback function with the Meteor.bindEnvironment wrapper as described in the
Meteor Async Guide. I used the wrapper so that a Meteor collection would be available to this asynchronous callback. Within the callback I am trying to insert only documents with unique values for an attribute called 'title'. I have found several resources demonstrating the Mongo way of handling this, but the required functions (e.g. findAndModify, or the upsert option for find) are not yet implemented by Meteor.
I have resorted to performing a query for the incoming title value and inserting a new document if the query returns no matching documents. However, this fails due to the asynchronous nature of the callback, and duplicates end up being inserted into the collection.
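Simplified, the pattern inside the wrapped callback looks like this (Posts stands in for my collection):

var existing = Posts.findOne({ title: doc.title });
if (!existing) {
  // two overlapping callbacks can both pass the findOne check
  // above before either insert runs, producing duplicates
  Posts.insert(doc);
}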
Is there a Meteor or Node.js pattern for wrapping a critical section like this with a lock?
Thanks!
If only one actor (client or server) is doing the polling, I don't see how you would get an async bypass of your 'already received' block list unless your update call was going off too often. Is there nothing in the API to return only messages after a timestamp, or to subscribe only to changes?
If you have multiple clients each polling a common outside source for updates, and then trying to sync to a merged common collection (which you don't say, but that setup would have the problem you describe), make each stream unique by title + userid.
You also may want just local collections on the client to track what the client has seen, and a separate one on the server if you are trying for an audit trail.
I used connect-redis for my session store, and when I use req.session, it seems all the operations on it are synchronized; it's like operating on ordinary JavaScript variables, and the code obeys the order. But when I checked the source code, it works asynchronously, so I wonder why req.session behaves like that.
Another question: if I have multiple Redis queries,
client.sadd('test', 1);
client.del('test');
client.sadd('test', 2);
client.sadd('test', 3);
no matter where I put the del operation, the result is always the same. I thought these queries might be run in any order, since they are all called asynchronously, so I expected the results to be different every time.
Thanks for your help
The fact that roundtrips to the Redis server are managed asynchronously does not mean the queries will be sent in random order.
Redis (and therefore most Redis client libraries) supports pipelining, generally used to optimize the number of roundtrips. The idea is to send multiple queries, and then wait for the replies. The order is critical, because it is used by the client to match queries and replies.
Node.js is very well suited to supporting this kind of mechanism. Matt Ranney's node_redis client supports pipelining in a transparent way. Provided the same client object is used, all the queries will be serialized and executed in order.
In your example, it is normal that the queries are always executed in the same order. You can check this point by using the MONITOR command to display the flow of queries sent to Redis.
Now, it is important that the last query of the pipeline be associated with a callback; otherwise your program will never know when the last query has completed.
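For example, with node_redis you can attach a callback to the last command of the sequence:

var redis = require('redis');
var client = redis.createClient();

// all four commands go out on the same connection and are
// executed by Redis in exactly this order
client.sadd('test', 1);
client.del('test');
client.sadd('test', 2);
client.sadd('test', 3, function (err, reply) {
  // when this fires, every earlier command has already completed
  client.smembers('test', function (err, members) {
    console.log(members);  // [ '2', '3' ] on every run
  });
});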
I've gathered that, due to the event-queue design of Node.js and the way that JavaScript references work, there's no effective way of figuring out which active request object (if any) spawned the currently executing function.
If I am in fact mistaken, how do I maintain some sort of statically-accessible context object that is available for the life of a request, short of manually passing it around every function in the event/callback chain?
If there's no way to do that, is there a way to somehow wrap every event call so that when it is spawned, it copies the context data from the event that spawned it?
What I'm actually trying to achieve is having every log message also contain the username of the person to whom the log message relates. Because the callback/event chain can become quite complex, I'd say trying to pass the request object around every function in the application is a bit too messy for my taste, so hopefully there's another way.
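To make the wrapping idea from my second question concrete, this is roughly the kind of helper I have in mind (a hypothetical sketch, not an existing API):

var currentContext = null;  // e.g. { username: 'alice' } for the active request

function wrap(fn) {
  var ctx = currentContext;  // capture the context at registration time
  return function () {
    var previous = currentContext;
    currentContext = ctx;  // restore it for the callback's duration
    try {
      return fn.apply(this, arguments);
    } finally {
      currentContext = previous;
    }
  };
}

// every async registration would have to go through wrap()
currentContext = { username: 'alice' };
setTimeout(wrap(function () {
  console.log('log message for', currentContext.username);
}), 100);
currentContext = null;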