Managing constantly changing data in a database - Node.js

I need some advice on how to architect my data in MongoDB. I have an app where users can view, add, edit, and remove credit and debit transactions. Below is how the data looks.
The balance column here is dynamic. For example, if someone adds a transaction dated 10-09-2017, every amount in the balance field thereafter needs to change at that moment to reflect the new transaction. Right now I am not saving this balance field in the database at all; I calculate it every time the user loads or reloads the page, and also whenever a transaction is added, edited, or deleted. For now it is fast, but I assume that in the future, when a user has a lot of transactions, things will slow down, since these calculations must finish before the data table can be displayed. Is there a more efficient way to do this?
Also, I am doing the calculations on the client side, so the load is on the client's device and not on the server. I think that if it were on the server side and a lot of users started using the app, the API requests would become much slower and eventually unusable. Is this the right approach?
PS: It was hard to phrase this so that the reader understands my questions, but I have tried my best. Please let me know if I should explain anything in more detail or add any more information.

This is not a question about MongoDB; it is a question about the user interface.
Will you really display the whole history of transactions at once?
You should either use pagination (simplest) or load data incrementally on scroll.
Before the balance-cell calculation causes problems, you are more likely to run into problems because of:
Slow loading from network (almost certainly)
Slow page interaction because of DOM size (maybe)
Show the first 100 to 500 transactions and provide the user with some way to load earlier entries.
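As a rough sketch of what that first page load could look like with the official MongoDB Node.js driver (the database name, collection name, and fields here are assumptions, not taken from the question):

    const { MongoClient } = require('mongodb');

    // Load one page of transactions, newest first.
    async function loadPage(page = 0, pageSize = 100) {
      const client = new MongoClient('mongodb://localhost:27017'); // hypothetical URI
      await client.connect();
      try {
        return await client
          .db('ledger')                  // hypothetical database name
          .collection('transactions')    // hypothetical collection name
          .find({})
          .sort({ date: -1 })            // newest first; wants an index on date
          .skip(page * pageSize)         // cheap while page numbers stay small
          .limit(pageSize)
          .toArray();
      } finally {
        await client.close();
      }
    }

In a real server you would of course reuse one connected client across requests rather than connecting per call.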
Update - Regarding server-side balance calculation:
You could calculate the balance on the server side and store it in a second collection that serves as a cache. If a transaction is inserted in the past, you recalculate the cache. To speed this up, you can use snapshots:
Within a third collection, you could store the current balance in certain intervals, e.g. with the following data structure:
{ Balance: 150000, Date: 2017-02-03, LastTransactionId: 546 }
When a transaction is inserted in the past, take the most recent snapshot before that past moment and recalculate the cache based on it. This way you can keep the number of recalculated transactions quite small.
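A minimal sketch of that recalculation, assuming the three collections are named transactions, balances (the cache), and snapshots, and that each transaction has date, amount, and type fields (all of these names are assumptions):

    // Recalculate the cached balances from the nearest snapshot onwards.
    async function recalcFrom(db, insertedDate) {
      // Most recent snapshot taken at or before the inserted transaction.
      const snap = await db.collection('snapshots')
        .find({ date: { $lte: insertedDate } })
        .sort({ date: -1 })
        .limit(1)
        .next();

      let balance = snap ? snap.balance : 0;

      // Replay only the transactions after the snapshot, rewriting the cache.
      const cursor = db.collection('transactions')
        .find({ date: { $gt: snap ? snap.date : new Date(0) } })
        .sort({ date: 1 });

      for await (const tx of cursor) {
        balance += tx.type === 'credit' ? tx.amount : -tx.amount;
        await db.collection('balances').updateOne(
          { transactionId: tx._id },
          { $set: { balance } },
          { upsert: true }
        );
      }
    }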

Related

Is it better to prewrite the dashboard data or fetch and do the calculation on demand?

So the thing is that I have some data in my MongoDB that I want to present in a dashboard,
and it is taking some time to fetch the selected documents from different collections and do the calculations needed to send the results back to the client.
So I had this idea to pre-write the required data, in the required format, into a dedicated collection, and whenever the client asks for the dashboard I just fetch its data directly, so that I don't have to wait on fetching data across different collections and doing the calculations on demand.
By the way, this data is not updated frequently... let's say about 100 updates max per day.
Does this idea sound right, or does it have some drawbacks that I didn't think about?
Thank you in advance.
That's caching; your idea sounds just right.
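A sketch of that write-through pattern (the collection names and the aggregation are placeholders for whatever currently runs on demand):

    // Recompute the dashboard document whenever the source data changes.
    // With ~100 updates per day, doing this on every write is cheap.
    async function onSourceDataChanged(db) {
      // Placeholder pipeline; substitute the real on-demand calculation here.
      const totals = await db.collection('orders').aggregate([
        { $group: { _id: '$status', count: { $sum: 1 } } }
      ]).toArray();

      await db.collection('dashboard_cache').updateOne(
        { _id: 'main-dashboard' },
        { $set: { totals, computedAt: new Date() } },
        { upsert: true }
      );
    }

The dashboard endpoint then becomes a single cheap read: db.collection('dashboard_cache').findOne({ _id: 'main-dashboard' }).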

Robot's Tracker Threads and Display

Application: The proposed application has a TCP server able to handle several connections from the robots.
I chose to work with a database rather than files, so I'm using an SQLite db to save information about the robots and their full history, models of robots, tasks, etc.
The robots send us various data, such as odometry, task information, and so on.
I create a thread for every new robot connection to handle the messages and update the robots' information in the database. Now let's talk about my problems:
The application has to show information about the robots in real time, and I was thinking about using QSqlQueryModel, setting the right query, and then showing it in a QTableView. But then I hit some problems/solutions to think about:
Problem number 1: There is information to show in the QTableView that is not in the database. I have the current consumption in the database and the actual charge (capacity) in the database, but I also want to show the remaining battery time in my table. How can I add that column, with the right behaviour (the math implemented), to my TableView?
Problem number 2: I will be receiving messages every second for each robot, so updating the db and then the GUI (re-running the query) may not be the best solution when I have a big number of robots connected. Would it be better to update the table directly and only write to the db every minute or so? If I use that method I can't drive the table with QSqlQueryModel, so what approach do you recommend?
Thanks
SancheZ
I have run into a similar problem before; my conclusion was that QSqlQueryModel is not the best option for display purposes. You may want some processing on query results, or you may want to create, remove, or change display data based on the results for a fancier GUI. I think it is best to implement your own delegates and override the view-related methods (setData, setEditorData).
This way you have control over all your columns and a direct mapping between the raw data and its display equivalent (i.e. the EditRole and UserRole data).
Yes, it is better to update your view in real time and run a batch execute at a lower frequency to update the database. In general, the app is the middle layer and the db is the bottom layer for data monitoring, unless you use an in-memory shared-cache db.
EDIT: One important point: you cannot run updates in multiple threads (you can, but SQLite blocks each thread until it gets the lock), so it is best to run updates from a single thread.

Running query on database after a document/row is of certain age

What is the best practice for running a database query once any document in a collection reaches a certain age?
Let's say this is a Node.js web system with MongoDB, with a collection of posts. After a new post is inserted, it should be updated with some data 60 minutes later.
Would a cron job that checks every minute or two for all posts with age < one hour be the best solution? And what would be the least stressful solution if this system has >10,000 active users?
Some ideas:
Create a second collection as a queue, with a "time to update" field containing the time at which the source record needs to be updated. Index that field and scan through it looking for values older than "now" (a sketch of this follows after the list).
Include the field mentioned above in the original document and index it the same way
You could just clear the value when done, or reset it to 60 minutes ahead, depending on the desired behavior (rather than inserting/deleting/re-inserting documents into the collection).
By keeping the update-collection distinct, you have a better chance of always keeping the entire working set of queued updates in memory (compared to storing the update info in your posts).
I'd kick off the update not as a web request to the same instance of Node but as a separate process, so as not to block user requests.
As to how you schedule it -- that's up to you and your architecture and what's best for your system. There's no right "best" answer, especially if you have multiple web servers or a sharded data system.
You might use a capped collection, although you'd run the risk of potentially losing records that need to be updated (though you'd gain performance).
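A minimal sketch of the queue idea above (all collection and field names are assumptions): each post insert also enqueues a job, and a separate worker process polls for due jobs every minute or two.

    // Called right after a post is inserted.
    async function enqueue(db, postId) {
      await db.collection('update_queue').insertOne({
        postId,
        updateAt: new Date(Date.now() + 60 * 60 * 1000) // one hour from now
      });
    }

    // Run from a separate worker process, e.g. every minute.
    // Assumes an index: db.collection('update_queue').createIndex({ updateAt: 1 })
    async function processDue(db) {
      const due = db.collection('update_queue')
        .find({ updateAt: { $lte: new Date() } });

      for await (const job of due) {
        await db.collection('posts').updateOne(
          { _id: job.postId },
          { $set: { matured: true } } // placeholder for the real update
        );
        await db.collection('update_queue').deleteOne({ _id: job._id });
      }
    }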

Users last-access time with CouchDB

I am new to CouchDB, but that is not related to the problem. The question is simple, yet not clear to me.
For example: Boris was on the site 5 seconds ago, and Ivan, viewing Boris's profile, sees that.
How to correctly implement this feature (users last-access time)?
The problem is that if we update the user's profile document in CouchDB (for example, a last_access_time property) each time a page is refreshed, then we will have the most relevant information (with MySQL we did it this way), but on the other hand the document's _rev will be somewhere around 100,000+ by the end of the day.
So, how do you do that or do you have any ideas?
This is not a full answer but a possible optimization. It will work in addition to any other answers here.
Instead of storing the latest timestamp every time, update the timestamp only if it is older than some threshold, e.g. 5 seconds or 60 seconds.
Assume a user refreshes every second for a day. That is 86,400 updates. But if you only update the timestamp at 5-second intervals, that is 17,280 updates; at 60-second intervals it is 1,440.
You can do this on the client side. When you want to update the timestamp, fetch the current document and check the old timestamp. If it is less than 5 seconds old, don't do anything. Otherwise, update it normally.
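A sketch of that client-side check against CouchDB's plain HTTP API (the database and field names are assumptions; uses the global fetch available in Node 18+):

    const THRESHOLD_MS = 5000;

    async function touchLastAccess(baseUrl, userId) {
      const res = await fetch(`${baseUrl}/users/${userId}`);
      const doc = await res.json();

      // Skip the write entirely if the stored timestamp is fresh enough.
      if (Date.now() - Date.parse(doc.last_access_time) < THRESHOLD_MS) return;

      doc.last_access_time = new Date().toISOString();
      await fetch(`${baseUrl}/users/${userId}`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(doc) // doc already carries its _rev
      });
    }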
You can also do it on the server side by writing an _update function in CouchDB, which you query with e.g. POST /db/_design/my_app/_update/last-access/the_doc_id?time=2011-01-31T05:05:31.872Z. The update function does the same thing: it checks the old timestamp and either does nothing or updates it, depending on the elapsed time.
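A possible _update function for that design document (the 5-second threshold and the response strings are just examples):

    function (doc, req) {
      var newTime = new Date(req.query.time);
      var oldTime = new Date(doc.last_access_time);

      // Below the threshold: return null so CouchDB writes nothing.
      if (newTime - oldTime < 5000) {
        return [null, 'unchanged'];
      }

      doc.last_access_time = req.query.time;
      return [doc, 'updated'];
    }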
If there was (a large) part of a document that is relatively static, and (a small) part that is highly dynamic, I would consider splitting it into two different documents.
Another option might be to use something more suited to the high write throughput of small pieces of data of that nature such as Redis or possibly MongoDB, and (if necessary) have a background task to occasionally write the info to CouchDB.
CouchDB has no problem with rapid document updates. Just do it, like MySQL. High _rev is no problem.
The only thing is, you have to be responsible about your couch from day 1. All CouchDB users must do this anyway; you may just have to do it sooner. (Applications with few updates have a lower risk of a full disk, so their developers can postpone this work.)
Poll your database and run compaction if it needs it (based on size, document count, or seq_id number); a sketch follows after this list.
Poll your views and run compaction too
Always have enough disk capacity and I/O bandwidth to support compaction. Mathematical worst case: you need 2x the database size and 2x the write speed; however, most applications require less. Since you are updating documents, not adding them, you will need far less.
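A sketch of such a compaction poll over CouchDB's HTTP API (the 2x threshold is arbitrary, and the disk_size/data_size field names follow CouchDB 1.x; newer versions nest them under sizes):

    async function compactIfNeeded(baseUrl, dbName) {
      const info = await (await fetch(`${baseUrl}/${dbName}`)).json();

      // disk_size vs data_size shows how much space old revisions waste.
      if (info.disk_size > 2 * info.data_size) {
        await fetch(`${baseUrl}/${dbName}/_compact`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' }
        });
      }
    }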

Which is the best method of pagination so that the load on the server is minimal?

I have done a bit of research on pagination, and from what I have read there are two contradictory ways of doing it:
Load a small set of data from the database each time a user clicks next
Problem: suppose a million rows meet the WHERE conditions. That means a million rows are retrieved, stored, and filesorted, and then most of them are discarded with only 20 returned. If the user clicks the "next" button, the same process happens again, just with a different 20 returned. (ref: http://www.mysqlperformanceblog.com/2008/09/24/four-ways-to-optimize-paginated-displays/) A keyset-pagination sketch that sidesteps this is shown after the question.
Load all the data from the database and cache it. This has a few problems too, mentioned here: http://www.javalobby.org/java/forums/t63849.html
So I know I will have to use a hybrid of both. However, the question boils down to this: which operation is more expensive:
making repeated queries to the database for small chunks of data
or
transferring a large result set over the network
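For reference, the large-offset problem described in the first option can often be sidestepped with keyset ("seek") pagination. A minimal sketch using the mysql2 package; the table, columns, and connection settings are assumptions:

    const mysql = require('mysql2/promise');

    // Fetch the page after the last row the client has seen. The WHERE
    // clause seeks directly via the primary-key index, so MySQL never
    // builds and discards a million-row result.
    async function nextPage(lastSeenId, pageSize = 20) {
      const conn = await mysql.createConnection({
        host: 'localhost', user: 'app', database: 'app' // hypothetical settings
      });
      try {
        const [rows] = await conn.query(
          'SELECT id, title FROM posts WHERE id < ? ORDER BY id DESC LIMIT ?',
          [lastSeenId, pageSize]
        );
        return rows;
      } finally {
        await conn.end();
      }
    }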
My company has exactly this situation, and we've chosen a bit of a hybrid. Our data is tabular, so we send it via AJAX to DataTables. This allows for good UI formatting, sorting, filtering, and show/hide of columns. DataTables has a great feature called "pipelining" that will "queue ahead": it grabs a quantity of data beyond the user's request (in our case, up to 5 times the records they ask for) and then pages through without further requests until it runs out of data. It's EXTREMELY easy to implement with DataTables, but I suspect a similar solution would not be difficult to write by hand using jQuery's AJAX functionality.
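A simplified sketch in the spirit of that pipelining approach, written with plain jQuery AJAX rather than DataTables' actual plug-in code (the endpoint and the 5x multiplier are assumptions):

    // Cache holding several pages' worth of rows from the last request.
    let cache = { start: 0, rows: [] };

    function fetchRows(start, length, render) {
      const end = cache.start + cache.rows.length;
      if (start >= cache.start && start + length <= end) {
        // Cache hit: page served locally, no request at all.
        const offset = start - cache.start;
        render(cache.rows.slice(offset, offset + length));
        return;
      }
      $.ajax({
        url: '/api/rows',                     // hypothetical endpoint
        data: { start, length: length * 5 },  // queue ahead: 5x the page size
        success(resp) {
          cache = { start, rows: resp.data };
          render(resp.data.slice(0, length));
        }
      });
    }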
I tried doing a full load-and-cache on a 1.5-million-record database, and it was a train wreck. The client almost dumped me because they got mad it was so slow. After a solid overnight of AJAX goodness, the client was happy once again. But it's best never to get to that point.
Good Luck.
