If I'm using a Cloud Function triggered on Firestore database events such as onUpdate() or onCreate(), for most writes I can check the change.after.updateTime field to see when the write occurred, which will be different from Timestamp.now(), the time the function is invoked.
However, when the document is deleted, triggering onWrite() or onDelete(), change.after.exists is false and change.after.updateTime is undefined. Is there any way to get access to the time that the database event occurred, rather than the time that the function is invoked?
Currently, no information about when the exact transition occurred is available for such a delete operation. You'd have to pass or get something out of band (e.g. writing it into a separate document, or somehow reading it from the logs), but I can imagine how that'd get pretty hairy quickly.
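As a rough sketch of that "separate document" workaround, assuming the client that performs the delete first records the deletion time in a deletionLogs collection (all collection and field names here are made up for illustration):

const admin = require('firebase-admin');
const functions = require('firebase-functions');

admin.initializeApp();

// The client is expected to write deletionLogs/{itemId} with a server
// timestamp just before deleting items/{itemId}.
exports.onItemDelete = functions.firestore
  .document('items/{itemId}')
  .onDelete(async (snap, context) => {
    const log = await admin.firestore()
      .doc(`deletionLogs/${context.params.itemId}`)
      .get();
    // Falls back to null if the client didn't record anything.
    const deletedAt = log.exists ? log.get('deletedAt') : null;
    console.log('Client-reported deletion time:', deletedAt);
  });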
I recommend filing a feature request for this with the team.
I am trying to understand change feeds in Azure. I see I can trigger an event when something changes in Cosmos DB. This is useful. However, in some situations, I expect a document to be changed after a while: a question should eventually have its status changed to "answered", an order should after a while have its status changed to "confirmed", and a problem should have its status changed to "resolved" or its priority changed (to "low").
It is useful to trigger an event when such a change happens for a certain document. However, it is even more useful to trigger an event when such a change does not happen within a (specified) while (like 1 hour): a problem needs to be resolved after a while, an order needs to be confirmed after a while, etc. Can I use change feeds and Azure Functions for that too? Or do I need something different?
It is great that I can visualize changes (for example in Power BI) once they happen, but I am also interested in visualizing changes that do not occur within the time they are expected to occur.
Achieving that with the Change Feed alone doesn't sound possible because, as you describe it, the Change Feed reacts to operations/events that do happen.
In your case it sounds as if you need an agent that runs every X amount of time (maybe an Azure Function with a TimerTrigger?) and executes a query to find items in state X that have not been modified in the past pre-defined interval Y (possibly the time interval associated with the TimerTrigger). This could be done by checking the _ts field of the state documents or your own timestamp field, see https://stackoverflow.com/a/39214165/5641598.
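A minimal sketch of that TimerTrigger approach with the @azure/cosmos SDK; the database/container names, the "open" status value, and the one-hour window are all assumptions for illustration:

const { CosmosClient } = require('@azure/cosmos');

const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING);
const container = client.database('mydb').container('problems');

module.exports = async function (context, myTimer) {
  // _ts is the server-maintained last-write time, in seconds since the epoch.
  const cutoff = Math.floor(Date.now() / 1000) - 60 * 60;
  const querySpec = {
    query: 'SELECT * FROM c WHERE c.status = @status AND c._ts < @cutoff',
    parameters: [
      { name: '@status', value: 'open' },
      { name: '@cutoff', value: cutoff },
    ],
  };
  const { resources: stale } = await container.items.query(querySpec).fetchAll();
  context.log(`${stale.length} items have not been resolved within the hour`);
};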
If your goal is just to show it on a dashboard, you could run that query from Power BI too.
As long as you don't need too much time precision (the Change Feed notifications are usually delayed by a few seconds), the Azure Cosmos DB Change Feed could easily be used as a solution for this task, but it would require some extra work from the Microsoft team to also support capturing deletions caused by TTL expiration.
A potential solution, if the Change Feed were to capture such TTL expiration events, would be: whenever you insert (or, in your use case, change the priority of) a document for which you want to monitor the lack of changes, you also insert another document (possibly in another collection) that acts as a timer, with a TTL of 1 hour.
If a change actually happened, you would delete the timer document, either manually or from a consumer of the Change Feed.
You could then consume the TTL expiration event from the Change Feed and conclude that, since the TTL expired, there were no changes in the specified time window.
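The "timer document" half of that pattern is already possible today with per-item TTL. A sketch, assuming a hypothetical timers container partitioned on /relatedId with default TTL enabled (the part still missing is the Change Feed surfacing the expiration itself):

const { CosmosClient } = require('@azure/cosmos');

const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING);
const timers = client.database('mydb').container('timers');

// Arm a one-hour timer whenever the monitored document is inserted or changed.
async function armTimer(relatedId) {
  await timers.items.upsert({ id: `timer-${relatedId}`, relatedId, ttl: 3600 });
}

// Disarm the timer if the expected change actually happens in time.
async function disarmTimer(relatedId) {
  await timers.item(`timer-${relatedId}`, relatedId).delete();
}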
If you'd like this feature, you should consider voting for issues such as this one: https://github.com/Azure/azure-cosmos-dotnet-v2/issues/402 and feature requests such as this one: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/14603412-execute-a-procedure-when-ttl-expires, which would make the Change Feed a perfect fit for scenarios such as yours. Sadly it is not available yet :(
TL;DR: No, the Change Feed as it stands would not be the right fit for your use case. It would need some extra functionality that is planned but not implemented yet.
PS. In case you'd like to know more about the Change Feed and its main use cases anyway, you can check out this article of mine :)
I am using the Firebase Admin SDK with Cloud Functions. The function does multiple writes to several Firestore collections, which need to be consistent.
Now I am not sure how Firestore operations behave if a valid operation like a write to a document fails (maybe through cosmic radiation or something similarly unlikely).
Does the operation instantly return an error or is there some kind of retry or error correction mechanism?
Maybe this is a silly question and has nothing to do with the SDK itself.
First of all, if you have multiple documents to write that all must land at the same time, atomically, you should be using a batch or transaction in order to make that happen. If any document would fail to write for any reason, then nothing will happen for any of the documents referenced. If you instead choose to do several separate write operations, you would have to figure out how to reliably roll back each change individually, which is going to be a lot of work.
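A minimal sketch of such an atomic multi-document write with the Admin SDK (the collection and field names here are illustrative):

const admin = require('firebase-admin');

admin.initializeApp();
const db = admin.firestore();

async function placeOrder(orderId, userId) {
  const batch = db.batch();
  batch.set(db.doc(`orders/${orderId}`), { userId, status: 'pending' });
  batch.update(db.doc(`users/${userId}`), {
    openOrders: admin.firestore.FieldValue.increment(1),
  });
  // Either both writes are applied, or neither is.
  await batch.commit();
}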
If you do get an error, I don't believe there are any guarantees about the conditions of that error. You would likely want to retry on your own, unless you're able to determine that the error is not transient. To make retries reliable, you could enable the retry configuration on the function, allow the error to escape the function (don't catch the error), and let Cloud Functions invoke it again for you.
It will throw an error. If you notice, every method has a callback with success or error.
If you are using something like await in Node, you should then use try/catch.
If you have more than one operation and the procedure should be atomic and/or all-or-nothing, then use batches:
https://firebase.google.com/docs/firestore/manage-data/transactions
When a function encounters an error that is not handled, the function crashes; you can modify the retries for functions:
https://firebase.google.com/docs/functions/retries
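Tying those two points together, a hedged sketch: catch the error if you can handle it, otherwise let it escape so the function fails and, if retries are enabled for it, gets invoked again (writeAllDocsInOneBatch is a hypothetical helper):

const functions = require('firebase-functions');

exports.onOrderCreated = functions.firestore
  .document('orders/{orderId}')
  .onCreate(async (snap, context) => {
    try {
      await writeAllDocsInOneBatch(snap.data());
    } catch (err) {
      console.error('Batched write failed:', err);
      throw err; // rethrow so Cloud Functions can retry the invocation
    }
  });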
I'm using python-telegram-bot to build a bot that answers inline queries. The query result is kind of complex to process. I'm using @run_async.
As a user types an inline query, the client produces several queries, every one of them spawning a handler on my bot that takes time to process.
Of all of those queries, only the latest one is actually important. For example, if I query for "Beach Voley", the bot will receive successive queries for:
Bea
Beach
Beach Vo
Beach Voley
as I type, pause and continue typing.
Then, my bot lags processing the incomplete queries before processing the actually important one, and Telegram gives me invalid id errors for the outdated ones and a timeout for the actually important query.
I would like to, upon receiving a query, cancel any other inline queries being processed for that user with an older timestamp, killing their threads or something.
Telepot, another Telegram bot library for Python that I used, included this as a feature. From what I could understand of the source code, it keeps a running tasks queue internally.
How could I mimic this behavior in python-telegram-bot? Is this a feature that I'm just not finding?
You can store the query in context.user_data['query'] in the handler, overwriting the value every time. Then, when you need to use it, you'll have the last one.
When you use Node's EventEmitter, you subscribe to a single event. Your callback is only executed when that specific event is fired up:
eventBus.on('some-event', function(data){
// data is specific to 'some-event'
});
In Flux, you register your store with the dispatcher, then your store gets called when every single event is dispatched. It is the job of the store to filter through every event it gets, and determine if the event is important to the store:
eventBus.register(function(data){
switch(data.type){
case 'some-event':
// now data is specific to 'some-event'
break;
}
});
In this video, the presenter says:
"Stores subscribe to actions. Actually, all stores receive all actions, and that's what keeps it scalable."
Question
Why and how is sending every action to every store [presumably] more scalable than only sending actions to specific stores?
The scalability referred to here is more about scaling the codebase than scaling in terms of how fast the software is. Data in flux systems is easy to trace because every store is registered to every action, and the actions define every app-wide event that can happen in the system. Each store can determine how it needs to update itself in response to each action, without the programmer needing to decide which stores to wire up to which actions, and in most cases, you can change or read the code for a store without needing to worry about how it affects any other store.
At some point the programmer will need to register the store. The store is very specific to the data it'll receive from the event. How exactly is looking up the data inside the store better than registering for a specific event, and having the store always expect the data it needs/cares about?
The actions in the system represent the things that can happen in a system, along with the relevant data for that event. For example:
A user logged in; comes with user profile
A user added a comment; comes with comment data, item ID it was added to
A user updated a post; comes with the post data
So, you can think about actions as the database of things the stores can know about. Any time an action is dispatched, it's sent to each store. So, at any given time, you only need to think about your data mutations a single store + action at a time.
For instance, when a post is updated, you might have a PostStore that watches for the POST_UPDATED action, and when it sees it, it will update its internal state to store off the new post. This is completely separate from any other store which may also care about the POST_UPDATED event—any other programmer from any other team working on the app can make that decision separately, with the knowledge that they are able to hook into any action in the database of actions that may take place.
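As a concrete sketch of that idea, using the Dispatcher from the flux npm package (the store shape, PostStore, and POST_UPDATED are illustrative):

const { Dispatcher } = require('flux');

const dispatcher = new Dispatcher();

const PostStore = {
  posts: {},
  dispatchToken: dispatcher.register((action) => {
    switch (action.type) {
      case 'POST_UPDATED':
        // Only this store decides how POST_UPDATED mutates its own state.
        PostStore.posts[action.post.id] = action.post;
        break;
      // Every other action is received too, but simply ignored here.
    }
  }),
};

dispatcher.dispatch({ type: 'POST_UPDATED', post: { id: 1, title: 'Hello' } });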
Another reason this is useful and scalable in terms of the codebase is inversion of control; each store decides what actions it cares about and how to respond to each action; all the data logic is centralized in that store. This is in contrast to a pattern like MVC, where a controller is explicitly set up to call mutation methods on models, and one or more other controllers may also be calling mutation methods on the same models at the same time (or different times); the data update logic is spread through the system, and understanding the data flow requires understanding each place the model might update.
Finally, another thing to keep in mind is that registering vs. not registering is sort of a matter of semantics; it's trivial to abstract away the fact that the store receives all actions. For example, in Fluxxor, the stores have a method called bindActions that binds specific actions to specific callbacks:
this.bindActions(
"FIRST_ACTION_TYPE", this.handleFirstActionType,
"OTHER_ACTION_TYPE", this.handleOtherActionType
);
Even though the store receives all actions, under the hood it looks up the action type in an internal map and calls the appropriate callback on the store.
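A rough sketch of what that kind of abstraction could look like (this is not Fluxxor's actual implementation, just the general idea of routing by type inside the single registered callback):

function bindActions(dispatcher, handlers) {
  // One callback is registered; it receives every dispatched action and
  // routes to the bound handler, if any, based on the action type.
  return dispatcher.register((action) => {
    const handler = handlers[action.type];
    if (handler) {
      handler(action);
    }
  });
}

// Usage with the handler names from the example above:
const dispatchToken = bindActions(dispatcher, {
  FIRST_ACTION_TYPE: handleFirstActionType,
  OTHER_ACTION_TYPE: handleOtherActionType,
});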
I've been asking myself the same question, and can't see technically how registering adds much beyond simplification. I will pose my understanding of the system so that, hopefully, if I am wrong, I can be corrected.
TL;DR: EventEmitter and Dispatcher serve similar purposes (pub/sub) but focus their efforts on different features. Specifically, the 'waitFor' functionality (which allows one event handler to ensure that a different one has already been called) is not available with EventEmitter. Dispatcher has focused its efforts on the 'waitFor' feature.
The final result of the system is to communicate to the stores that an action has happened. Whether the store 'subscribes to all events, then filters' or 'subscribes to a specific event' (filtering at the dispatcher) should not affect the final result: data is transferred in your application. (The handler always only switches on the event type and processes it, e.g. it doesn't want to operate on ALL events.)
As you said "At some point the programmer will need to register the store.". It is just a question of fidelity of subscription. I don't think that a change in fidelity has any affect on 'inversion of control' for instance.
The added (killer) feature in Facebook's Dispatcher is its ability to 'waitFor' a different store to handle the event first. The question is, does this feature require that each store has only one event handler?
Let's look at the process. When you dispatch an action on the Dispatcher, it (omitting some details):
iterates all registered subscribers (to the dispatcher)
calls the registered callback (one per stores)
the callback can call 'waitFor()', passing a 'dispatchId'. This internally references the callback registered by a different store. It is executed synchronously, causing the other store to receive the action and be updated first. This requires that 'waitFor()' is called before your code which handles the action.
The callback called by 'waitFor' switches on action type to execute the correct code.
the callback can now run its code, knowing that its dependencies (other stores) have already been updated.
the callback switches on the action 'type' to execute the correct code.
This seems a very simple way to allow event dependencies.
Basically all callbacks are eventually called, but in a specific order, and each then switches to only execute specific code. So, it is as if we only triggered a handler for the 'add-item' event on each store, in the correct order.
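A hedged sketch of those mechanics with the flux package's Dispatcher (the store names and the way CartStore/TotalsStore hold state are illustrative):

const { Dispatcher } = require('flux');

const dispatcher = new Dispatcher();

const CartStore = {
  items: [],
  dispatchToken: dispatcher.register((action) => {
    if (action.type === 'add-item') {
      CartStore.items.push(action.item);
    }
  }),
};

const TotalsStore = {
  count: 0,
  dispatchToken: dispatcher.register((action) => {
    if (action.type === 'add-item') {
      // Block until CartStore's callback has handled this action.
      dispatcher.waitFor([CartStore.dispatchToken]);
      TotalsStore.count = CartStore.items.length;
    }
  }),
};

dispatcher.dispatch({ type: 'add-item', item: { id: 1 } });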
If subscriptions were at a callback level (not 'store' level), would this still be possible? It would mean:
Each store would register multiple callbacks to specific events, keeping references to their 'dispatchTokens' (same as currently)
Each callback would have its own 'dispatchToken'
The user would still 'waitFor' a specific callback, but it would be a specific handler for a specific store
The dispatcher would then only need to dispatch to callbacks of a specific action, in the same order
Possibly, the smart people at Facebook have figured out that adding the complexity of individual callbacks would actually be less performant, or possibly it is just not a priority.
I'm using the mongoose Schema.post('save', postSaveCallback) method in order to send updates over a socket to display the state of the world in the database to subscribed clients in a web browser. I am wondering if the post save callback is guaranteed to be executed in the same order that the save method was called? This would guarantee that the state represented in the client view is the accurate state of the world. If the ordering of these post save callbacks is not guaranteed to be in the same order that the mongoose save method is called, it would mean that the clients view could potentially get out of sync with the real database representation.
Is there a better way to do this or is my approach sensible?
Furthermore, is it guaranteed that when postSaveCallback is called the save operation on the underlying mongodb has fully completed and was successful?
Would be very grateful for any pointers on this.
Thanks in advance.
As with async things in general, order is not defined. The postSaveCallback is called when the save operation returns and then is executed when Node gets around to it. Some saves take longer than others which may have been kicked off before, so the callbacks could occur in pretty much any order. You'll have to modify how your callbacks coordinate with each other to ensure whatever kind of consistency you require in your state.
The save callback takes an err argument, so naturally just because the callback fired doesn't mean that the operation succeeded.
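As a hedged sketch of one way to coordinate, assuming Mongoose timestamps are enabled: attach the document's updatedAt to each notification and have subscribers drop anything older than what they already display (the io socket object and the event name are hypothetical):

const mongoose = require('mongoose');

const itemSchema = new mongoose.Schema(
  { name: String },
  { timestamps: true } // Mongoose maintains createdAt / updatedAt
);

itemSchema.post('save', function (doc) {
  // Runs after this particular save has completed, but hooks for different
  // saves can still fire in any order relative to one another.
  io.emit('item-updated', {
    id: doc._id,
    name: doc.name,
    updatedAt: doc.updatedAt,
  });
});

// On the client, keep the newest updatedAt seen per id and ignore any
// 'item-updated' payload whose updatedAt is older than the one already shown.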