Azure Change Feed Support and multiple local clients

We have a scenario where multiple clients would like to get updates from DocumentDB inserts, but they are not available online all the time.
Example: Suppose there are three clients registered with the system, but only one is online at present. When the online client inserts/updates a document, we want each offline client, when it wakes up, to read the change feed and update itself independently.
Is there a way for each client to maintain its own feed position on the same partition (from when it was last synced) and get the changes based on that last sync when it comes online?

When using the change feed, you use a continuation token per partition. Change feed continuation tokens do not expire, so you can resume from any point. Each client can keep its own continuation token and read changes whenever it wakes up; this essentially means that each client can keep its own feed position for each partition.
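A rough sketch of that pattern in TypeScript: `readChangeFeed` and `tokenStore` below are hypothetical stand-ins (for whichever SDK change feed call and durable key/value store you use); the point is only that each client persists its own continuation token per partition key range and resumes from it on wake-up.

```typescript
// Sketch only: per-client change feed bookmarks. `readChangeFeed` stands in for your
// SDK's change feed query (resumed via its continuation/request-continuation option);
// `tokenStore` is any durable key/value store. Both are assumptions, not a real API.

interface ChangeFeedPage {
  documents: unknown[];
  continuation: string; // token to resume from next time
}

// Hypothetical SDK wrapper: read one page of changes for a partition key range,
// starting from the supplied continuation token (or the beginning if none).
declare function readChangeFeed(
  partitionKeyRangeId: string,
  continuation: string | null
): Promise<ChangeFeedPage>;

// Hypothetical durable store, keyed by (clientId, partitionKeyRangeId).
declare const tokenStore: {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
};

// Called whenever a client wakes up; each client advances only its own bookmark.
async function catchUp(clientId: string, partitionKeyRangeId: string) {
  const key = `${clientId}:${partitionKeyRangeId}`;
  let token = await tokenStore.get(key);

  let page: ChangeFeedPage;
  do {
    page = await readChangeFeed(partitionKeyRangeId, token);
    applyLocally(clientId, page.documents);   // update this client's local state
    token = page.continuation;
    await tokenStore.set(key, token);         // persist the per-client bookmark
  } while (page.documents.length > 0);
}

function applyLocally(clientId: string, docs: unknown[]) {
  /* apply inserts/updates to the client's local copy */
}
```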

Should I store data that I fetch from a third-party database?

I have an API that fetches airplane schedules from third-party databases. When the frontend shows the data that the API fetches, should the application take the data from the local database or from the third party?
I am considering this data as dynamic in nature and pretty critical too (airplane schedules).
I am also assuming that you are aggregating this data from a number of providers and have transformed it into a generic structure (a common format across all providers).
In my opinion, you should save it into a local database, with a timestamp that indicates when the data was last refreshed.
Ideally, you should display the last-refreshed info against each provider (or airline) on your site. You could also run a scheduler to refresh the data at regular intervals.
It would be nice to show that the next refresh is in "nn" minutes (with a countdown).
If you can afford to, you can let users refresh the data themselves, but that is risky if you have a large number of concurrent users.
This is only my opinion.
If the API data is not subject to change, then saving it to a local database can be a good idea.
Users fetch the data from the local database, and to keep that database up to date you can create a separate program (running on the server) that fetches from the API. This way only a limited number of connections hit the API.
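For illustration, here is a minimal sketch of that refresh job in TypeScript. The `Schedule` shape, the provider ids, and the `fetchFromProvider`/`upsertSchedules` functions are assumptions, not part of any real API:

```typescript
// Rough sketch of the "local cache + scheduled refresh" idea. The provider client,
// the generic Schedule shape, and the storage functions are all assumed here.

interface Schedule { flight: string; departs: string; arrives: string; provider: string; }

declare function fetchFromProvider(provider: string): Promise<Schedule[]>;
declare function upsertSchedules(rows: Schedule[], refreshedAt: Date): Promise<void>;

const PROVIDERS = ["providerA", "providerB"];        // hypothetical provider ids
const REFRESH_INTERVAL_MS = 15 * 60 * 1000;          // refresh every 15 minutes

async function refreshAll() {
  for (const provider of PROVIDERS) {
    const rows = await fetchFromProvider(provider);  // already transformed to the common format
    await upsertSchedules(rows, new Date());         // store alongside a last-refreshed timestamp
  }
}

// The frontend reads only from the local database; this job keeps it fresh.
setInterval(() => refreshAll().catch(console.error), REFRESH_INTERVAL_MS);
```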

Client Server Data Exchange Persistence - Smells

Suppose I have a client that sends some RunLogicCommand with input to a server. The server responds with some output which is a report for the user to verify. At this point, the server has not persisted anything. The client then sends back the entire report in a separate SaveCommand which will then persist the report data.
To me, certain parts of this exchange seem unnecessary. That is, once the user has verified the report, it seems unnecessary for them to send the entire report back to the server for persistence. Perhaps there is a chance some sensitive data could be exposed here as well?
What is the typical approach in this case?
I can see two options:
1. The user sends the RunLogicCommand with its input again, with a flag specifying that the result should be persisted. I don't really like this option, since the logic could be complex and take some time to compute.
2. Cache the report on the server (or in a different service, or even the db), then just have the client send back the SaveCommand with the ID of the report to save.
Are there any problems with either of these approaches? Is there a better, more typical approach?
Thanks!
There is no single best solution here:
The cons of the approach you mentioned first (sending the entire report back) are:
Increased network traffic, potentially increasing costs and giving slower response times.
Can you be sure that the report you sent back is the same one that was received? You can, but it would require extra work.
As you mentioned, there is an increased risk that sensitive data is exposed. However, you are already sending it to the client.
The cons of the first of your two options are:
Running the report logic twice would increase the load on the server, adding cost due to the need for more processing capacity.
If the underlying data has changed between the two requests, then the report that was verified by the user and the report stored in the database may not be the same.
I would use a variation of your second option:
Store the report in the database as soon as it has been generated, with status "waiting for user verification"
When the user verifies the report, update the status as verified.
To avoid having many unverified reports in the database, you could have a batch job that checks for and deletes all unverified reports that are older than x days.
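As a sketch of that variation (TypeScript, with an assumed `reports` repository and hypothetical helpers; nothing here is a prescribed API):

```typescript
// Sketch of the status-based flow: persist on generation, verify by id, purge stale reports.

type ReportStatus = "pending_verification" | "verified";

interface Report { id: string; payload: unknown; status: ReportStatus; createdAt: Date; }

// Hypothetical repository and helpers.
declare const reports: {
  insert(r: Report): Promise<void>;
  setStatus(id: string, status: ReportStatus): Promise<void>;
  deleteOlderThan(cutoff: Date, status: ReportStatus): Promise<number>;
};
declare function runLogic(input: unknown): Promise<unknown>;   // the expensive computation
declare function newId(): string;

// RunLogicCommand: compute once, persist immediately as "pending", return the report for review.
async function handleRunLogicCommand(input: unknown): Promise<Report> {
  const report: Report = {
    id: newId(),
    payload: await runLogic(input),
    status: "pending_verification",
    createdAt: new Date(),
  };
  await reports.insert(report);
  return report;                       // the client shows this to the user for verification
}

// SaveCommand: the client sends back only the report id, not the whole report.
async function handleSaveCommand(reportId: string) {
  await reports.setStatus(reportId, "verified");
}

// Batch cleanup of reports that were never verified (e.g. run daily).
async function purgeStaleReports(maxAgeDays = 7) {
  const cutoff = new Date(Date.now() - maxAgeDays * 24 * 60 * 60 * 1000);
  return reports.deleteOlderThan(cutoff, "pending_verification");
}
```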

What's the strategy for keeping in sync with plaid transactions?

I need to keep in sync with transactions on accounts for a set of items. To me that means:
1. Do an initial download of all historical transactions.
2. Get new transactions when available.
3. Make sure no transactions are dropped on the floor.
It's not clear from the documentation and the API how this can be accomplished reliably.
The create API has a webhook parameter, so it seems I should have a webhook set up as soon as I start getting transactions. If I don't, have I missed out on all the transactions forever?
Can I pull all transactions via the API alone? I noticed the options have an offset. Is that a cursor? Can I ask for transactions far back in the past to trigger a re-download of transactions?
What if a webhook drops a batch of transactions? How can I tell? How can I redownload the missing transactions?
And I remember reading somewhere in the doc the account IDs and transaction IDs are associated with an ACCESS_TOKEN. Does this mean that the account IDs and transaction IDs can't be used to identify data uniquely across tokens?
Plaid states that it can fetch transaction data up to two years in the past. However, the amount of historical transaction data provided by banks varies from bank to bank. I've seen some banks provide data for the past three months, whereas others return data for the last two years. I've also seen some banks that don't support returning any transaction data at all.
As for the webhook, please note that the amount of time it takes to retrieve historic data after connecting an account varies. That’s where a webhook is useful as you can be notified when data is available for fetching.
Plaid returns only 500 transactions per call (I think). So, you are responsible for pagination while retrieving historic data.
You can always retrieve historical data, but you will only be able to get the past two years at most. It's a moving window: with every day that passes, the oldest day drops out of the range you can retrieve. I've generally cached the data on our side, since you will not be able to access data older than two years.
If I recall correctly, each institution that is connected has a unique access token. You can use account id to uniquely identify transactions, but you might have to store the relations in your database as the returned data doesn’t have that.
Hope that helps.
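For the pagination point above, here is a rough TypeScript sketch of walking /transactions/get with count/offset. `getTransactionsPage` is a hypothetical stand-in for your Plaid client call; the count/offset options and the total_transactions field come from /transactions/get, but everything else here is an assumption:

```typescript
// Sketch: page through transactions until total_transactions is reached, then cache locally.

interface TransactionsPage {
  transactions: Array<{ transaction_id: string; account_id: string }>;
  total_transactions: number;
}

// Hypothetical wrapper around your Plaid client's /transactions/get call.
declare function getTransactionsPage(
  accessToken: string,
  startDate: string,   // "YYYY-MM-DD"
  endDate: string,
  offset: number,
  count: number
): Promise<TransactionsPage>;

async function downloadAllTransactions(accessToken: string, startDate: string, endDate: string) {
  const pageSize = 500;                          // page size per call
  const all: TransactionsPage["transactions"] = [];
  let offset = 0;
  let total = Infinity;

  while (offset < total) {
    const page = await getTransactionsPage(accessToken, startDate, endDate, offset, pageSize);
    all.push(...page.transactions);
    total = page.total_transactions;             // tells you when you have everything
    offset += page.transactions.length;
    if (page.transactions.length === 0) break;   // defensive: avoid an infinite loop
  }
  return all;                                    // cache these locally (see the moving-window note)
}
```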

Achieving incremental CardDAV sync with Node dav client

I'm trying to write a simple node.js program to sync a few address books from a CardDAV server to a local MySQL database. I'm using the node dav client.
I know CardDAV supports only syncing changes since the last sync via sync-token and I see some references to sync tokens when I browse through the source and readme of the dav client. But, I'm very new to DAV, so I'm not 100% sure how to put it all together.
I'm guessing I need to store the sync token (and level?) the server sends back after I run a sync and then include that in my next sync request. Am I on the right track?
"Building a CardDAV client" is a great resource that describes how all of that works, including WebDAV Sync, which is what you are looking for.
Note that a server is not required to provide WebDAV Sync (and quite a few don't).
Also note that even if they support WebDAV Sync, they can expire the tokens however/whenever they want (e.g. some only store a single token, or only keep it for a limited time).
In short: do not rely on WebDAV Sync alone. If it is not available, or the token has expired, you need to fall back to a full, regular sync (comparing hrefs and etags).
I'm guessing I need to store the sync token (and level?) the server sends back after I run a sync and then include that in my next sync request. Am I on the right track?
Yes you are on the right track. Sync-tokens are usually per collection (Depth:1, I think they can be Depth:infinity, but I'm not sure).
So you need to store it alongside the URL of the collection you are syncing.
Then in the next sync-request, you embed it into the sync-report. If the token is still valid, you get back the new/deleted/changed records. If the token was invalidated, you need to perform a full sync.
Hope that helps :-)
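To make that flow concrete, here is a pattern-level TypeScript sketch. The `incrementalSync`/`fullSync` helpers and the `store` object are hypothetical stand-ins for the actual DAV requests (or dav client calls) and your MySQL persistence; the shape of the token-then-fallback logic is the part that matters:

```typescript
// Sketch: store a sync token per collection URL, try an incremental sync-collection
// REPORT, and fall back to a full href/etag comparison when the token is rejected.

interface Contact { href: string; etag: string; vcard: string; }
interface SyncResult { changed: Contact[]; deletedHrefs: string[]; newSyncToken: string; }

// Hypothetical: sync-collection REPORT using the stored token; rejects if the token is invalid.
declare function incrementalSync(collectionUrl: string, syncToken: string): Promise<SyncResult>;

// Hypothetical: list all hrefs + etags, then fetch the cards that differ from what we have.
declare function fullSync(collectionUrl: string, knownEtags: Map<string, string>): Promise<SyncResult>;

// Hypothetical MySQL-backed persistence.
declare const store: {
  getSyncToken(collectionUrl: string): Promise<string | null>;
  saveSyncToken(collectionUrl: string, token: string): Promise<void>;
  knownEtags(collectionUrl: string): Promise<Map<string, string>>;
  apply(collectionUrl: string, result: SyncResult): Promise<void>;   // upsert/delete local rows
};

async function syncAddressBook(collectionUrl: string) {
  const token = await store.getSyncToken(collectionUrl);
  let result: SyncResult;

  if (token) {
    try {
      result = await incrementalSync(collectionUrl, token);          // cheap path
    } catch {
      // Token expired or dropped by the server: fall back to a full comparison.
      result = await fullSync(collectionUrl, await store.knownEtags(collectionUrl));
    }
  } else {
    result = await fullSync(collectionUrl, await store.knownEtags(collectionUrl));
  }

  await store.apply(collectionUrl, result);
  await store.saveSyncToken(collectionUrl, result.newSyncToken);     // token is per collection URL
}
```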

Multiple pouchdbs vs single pouchdb

I set up CouchDB with multiple databases for use in my Ionic 3 app. When integrating it with PouchDB for client-side syncing, I created a separate PouchDB for each of the databases, five PouchDBs in total. My questions:
Is it a good idea to store multiple PouchDBs on the client side, given the number of HTTP connections created by syncing them? Or should I put all the CouchDB databases into one database and use type fields to separate the docs, so that only one PouchDB needs to be created and synced on the client?
Also, with the pouchdb-authentication plugin, authentication data is valid only for the database on which the signup/login methods were called; accessing other databases returns unauthenticated.
I would say that if your PouchDBs are syncing in realtime, it should be less expensive to reduce them to one and distinguish records by a type field.
That said, it is not that costly, and still very convenient, to set up a separate changes feed per ItemStore (e.g. TodoStore, CommentStore, etc.), each with a filter function that passes only docs of the matching type into the store they belong to. The same can be achieved by filtering on the basis of design_docs (I'm not sure whether that saves anything, at least in the browser).
A single changes feed distributing docs to the stores would probably be the cheapest solution, but I suppose the filter function can't be changed after the changes feed is established, so it must know about all the stores (i.e. doc types) beforehand.
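A minimal sketch of the single-database layout with one filtered changes feed per store (the database name, the remote URL, and the store callbacks are assumptions):

```typescript
// One local PouchDB, one replication, and per-type changes feeds feeding the stores.
import PouchDB from "pouchdb";

const db = new PouchDB("app");                                       // single local database
db.sync("http://localhost:5984/app", { live: true, retry: true });   // one replication stream

// Every doc carries a `type` field, e.g. { _id: "todo:123", type: "todo", ... }
function feedForType(type: string, onDoc: (doc: any) => void) {
  return db
    .changes({
      since: "now",
      live: true,
      include_docs: true,
      filter: (doc: any) => doc.type === type,   // client-side filter per store
    })
    .on("change", (change) => onDoc(change.doc));
}

// One feed per store; all of them share the single synced database.
feedForType("todo", (doc) => { /* TodoStore.upsert(doc) */ });
feedForType("comment", (doc) => { /* CommentStore.upsert(doc) */ });
```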
