Best way to do a Ranking API - node.js

I'm developing a quiz app using the MERN stack (with Mongoose). I want to implement a ranking of all users ordered by the number of questions they have solved. For now, my API sorts users by the number of questions answered, and it does this every time a request for the ranking is made. When a user answers a question correctly, I want them to see the ranking change, and I want this to keep working with thousands of users. What is the best way to build a real-time ranking? Do I need a separate real-time database, and how often would I have to query it?
I know my current approach does not scale: with thousands or millions of users, each request would be slow because the data is not kept sorted in MongoDB.
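One way to picture the "keep it ordered as you go" idea is a small in-memory sketch. It is not a recommendation of a particular database; the class name `Leaderboard` and its methods are made up for illustration, and a real deployment would keep the counters in a shared store rather than in one Node.js process. The point is only that each correct answer updates a counter, so a rank lookup does not have to re-sort every user.

```javascript
// Illustrative in-memory leaderboard; every name here is hypothetical.
class Leaderboard {
  constructor() {
    this.solvedByUser = new Map(); // userId -> number of solved questions
    this.usersByScore = new Map(); // score -> how many users have exactly that score
  }

  // Call this when a user answers a question correctly.
  recordCorrectAnswer(userId) {
    const previous = this.solvedByUser.get(userId) ?? 0;
    const next = previous + 1;
    this.solvedByUser.set(userId, next);

    // Move the user from the old score bucket to the new one.
    if (previous > 0) {
      const remaining = this.usersByScore.get(previous) - 1;
      if (remaining === 0) this.usersByScore.delete(previous);
      else this.usersByScore.set(previous, remaining);
    }
    this.usersByScore.set(next, (this.usersByScore.get(next) ?? 0) + 1);
    return next;
  }

  // Rank = 1 + number of users with a strictly higher score (ties share a rank).
  // Cost depends on the number of distinct scores, not on the number of users.
  rankOf(userId) {
    const score = this.solvedByUser.get(userId);
    if (score === undefined) return null;
    let ahead = 0;
    for (const [otherScore, count] of this.usersByScore) {
      if (otherScore > score) ahead += count;
    }
    return ahead + 1;
  }

  // Top n users, highest score first. This sorts on demand, which is fine for a
  // sketch; a real system would cache this list or keep it in a sorted structure.
  top(n) {
    return [...this.solvedByUser.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, n)
      .map(([userId, solved]) => ({ userId, solved }));
  }
}
```

With something like this, `recordCorrectAnswer(userId)` would run in the handler that grades an answer, and `rankOf(userId)` or `top(10)` would serve the ranking request. In MongoDB itself the per-user counter could be kept with an `$inc` update on an indexed field; that part is not shown here.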

Related

How to better implement a more complex sorting strategy

I have an application with posts. The posts are shown in the home view in descending order of creation date.
I want to implement a more complex sorting strategy based on, for example, posts from users who have more posts, or posts that have more likes or views. Nothing complicated, just simple criteria, with some randomness on top: say I have the 100 most-liked posts, I pick 10 of them at random.
I don't want to do this in the same query, because I don't want to hurt its performance. I am using MongoDB and I would need $lookup, which is not advisable in the most critical query of the app.
What would be the best approach to implement this?
I thought of doing all those calculations every 30 seconds, using for example AWS Lambda or triggers in MongoDB Atlas, and storing the result in the database, where the main query could read it.
That way, every 30 seconds, let's say the first 30 posts would be updated according to the criteria.
I don't really know whether this is a good approach. I need something simple that can still "mix" all the posts and show first the ones that meet the criteria.
Thanks!
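As a rough sketch of the "take the 100 most-liked posts and pick 10 at random" step that the scheduled job would run, assuming each post is a plain object with a numeric `likes` field (the function name `pickFeatured` and its parameters are hypothetical):

```javascript
// Hypothetical helper: choose `pickCount` random posts out of the
// `poolSize` most-liked posts. `random` is injectable to make testing easy.
function pickFeatured(posts, poolSize, pickCount, random = Math.random) {
  // Copy before sorting so the caller's array is left untouched.
  const pool = [...posts].sort((a, b) => b.likes - a.likes).slice(0, poolSize);
  const count = Math.min(pickCount, pool.length);

  // Partial Fisher-Yates shuffle: only the first `count` slots need randomising.
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}
```

The job would then store the returned list (or just its ids) in a separate document that the home query reads, so the critical query itself stays a plain sorted read.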

Is the Twissandra data model an efficient one?

Help me please,
I am new to the Cassandra world, so I need some advice.
I am trying to design a data model for a Cassandra DB.
In my project I have
- users who can follow each other,
- articles which can be related to many topics.
Each user can follow many topics.
The goal is to build an aggregated feed where a user gets:
articles from all topics he follows +
articles from all friends he follows +
his own articles.
I searched for similar tasks and found the Twissandra example project.
As I understand it, that example stores only the ids of tweets in the timeline. To get a timeline, we fetch the tweet ids and then fetch each tweet by id in a separate non-blocking request. After collecting all the tweets, we return the list to the user.
So my question is: is this efficient?
Making ~41 requests to the DB to get one page of tweets?
My second question is about followers.
When someone creates a tweet, we get all of his followers and put the tweet id into each of their timelines,
but what if the user has thousands of followers?
Does that mean that creating a single tweet requires (1 + followers_count) writes to the DB?
Twissandra is more of a toy example. It will work for some workloads, but with more data you will probably need to partition the data further (break up huge rows).
Essentially though, yes, it is fairly efficient. It can be made more so by including the content in the timeline, but depending on requirements that may be a bad idea (for example, if tweets need to be deleted or edited). The writes should be a non-issue: 20k writes/sec/node is reasonable, provided you have adequate systems.
If I understand your use case correctly, you will probably be fine with a Twissandra-like schema, but be sure to test it with the expected workloads. Keep in mind that at a certain scale everything gets a little more complicated (i.e. if you expect millions of articles you will need further partitioning, see https://academy.datastax.com/demos/getting-started-time-series-data-modeling).
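To make the request counts in the question concrete, here is a small in-memory model of the fan-out-on-write pattern described above. It is only an illustration: `TimelineStore` and its fields are hypothetical names, not Twissandra's actual schema, and the `writes` and `reads` counters stand in for round trips to the database.

```javascript
// In-memory stand-in for the tables involved; all names are hypothetical.
class TimelineStore {
  constructor() {
    this.articles = new Map();  // articleId -> article
    this.followers = new Map(); // authorId -> Set of follower ids
    this.timelines = new Map(); // userId -> array of article ids, newest first
    this.writes = 0;            // simulated DB writes
    this.reads = 0;             // simulated DB reads
  }

  follow(followerId, authorId) {
    if (!this.followers.has(authorId)) this.followers.set(authorId, new Set());
    this.followers.get(authorId).add(followerId);
  }

  // Fan-out on write: one write for the article itself,
  // plus one timeline write per follower.
  publish(authorId, articleId, body) {
    this.articles.set(articleId, { id: articleId, authorId, body });
    this.writes += 1;
    for (const followerId of this.followers.get(authorId) ?? []) {
      if (!this.timelines.has(followerId)) this.timelines.set(followerId, []);
      this.timelines.get(followerId).unshift(articleId);
      this.writes += 1;
    }
  }

  // One read for the list of ids, then one read per article on the page.
  readPage(userId, pageSize) {
    const ids = (this.timelines.get(userId) ?? []).slice(0, pageSize);
    this.reads += 1;
    return ids.map((id) => {
      this.reads += 1;
      return this.articles.get(id);
    });
  }
}
```

In this model a page of 40 articles costs 41 reads, and publishing costs 1 + followers_count writes, which matches the numbers in the question. Storing the article content in the timeline, as the answer suggests, would collapse the page read into a single request at the price of harder edits and deletes.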

Architecting a Mongodb lottery app with bets and jackpots in different currencies

I am designing a nodejs lottery app, using MongoDB/Mongoose; it currently works with fake money.
I want users to still be able to bet in a 'sandbox' with fake money, but I also want to let them use one or more real currencies, each currency with its own jackpot.
I'm looking for the best way to architect this within MongoDB:
Some possibilities:
- Use an entirely separate database for each currency. Users will have to have 1 account for each currency. Not ideal.
- Have 'bet', 'jackpot', etc. schemas have a 'currency' field. Probably easiest, but not sure if this is a relational way of thinking. It doesn't feel particularly elegant.
- Have 2 separate databases for 'bet' and 'jackpot', but a shared database with 'user' information. Since I do use 'populate' a couple of times, this may or may not be feasible.
I appreciate any thoughts on this.
If you want the lotteries for each currency kept separate, you could put them in the same database but in a different collection per currency. That way you can easily decide which data is currency-agnostic (like user accounts) and which data is currency-dependent (like bets).
Keep in mind, though, that any queries which get data about lotteries in different currencies will get more complicated, because you can only query one collection at a time. If you need a lot of such queries (for example, a dashboard where the user sees all the lotteries he currently takes part in, regardless of currency), you should rather go for the solution with a currency field.
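A small sketch of why the currency-field option keeps cross-currency queries simple. Bets are plain objects here rather than Mongoose documents, the field names and the `'SANDBOX'` pseudo-currency for fake money are assumptions made for illustration, and amounts are integers in the smallest unit of each currency.

```javascript
// Hypothetical shape of a bet: { userId, currency, amount }.

// Sum the stakes per currency, giving one total per jackpot.
function jackpotsByCurrency(bets) {
  const totals = new Map();
  for (const bet of bets) {
    totals.set(bet.currency, (totals.get(bet.currency) ?? 0) + bet.amount);
  }
  return totals;
}

// Dashboard-style query: everything one user takes part in,
// regardless of currency, is a single filter over one collection.
function betsForUser(bets, userId) {
  return bets.filter((bet) => bet.userId === userId);
}
```

With one collection per currency, the second function would instead have to query each collection separately and merge the results.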

MongoDb for collection of production data

I am facing a new type of problem that I haven't tried tackling before. So I would like some pointers in the right direction by someone more knowledgeable than I :-)
I have been asked by a friend to help him design a control system for a production line. The project sounds really interesting, and I can't stop thinking about it.
I have already found that I can control the system using a node.js server. So far so good (HTML5 interface here we come)! But where I really want this system to stand out is in the collection of system metrics. The system reports all kinds of things such as temperature, flow etc, and these metrics are reported up to several hundred times per second per metric... and this runs 24/7.
My thought is to persist this in a MongoDb database, and do some realtime statistics on this. The "competition", if you will, seems to save this in a SQL server database and allow the operators to export aggregated data to Excel, and do statistics in Excel.
What are the strategies for doing real-time statistics with MongoDB?
I would really like to provide instant feedback and monitoring based on these metrics. Such as average temperature over the last 24 hours, spikes etc, and also enable alerts. There will not be much advanced statistics done on the server. If that is needed, I would enable export of data to a program such as SPSS.
Is MongoDb a good fit for that? I would love to use a Linux machine instead of a Windows machine with SQL Server and a WinForms Control Interface. The license fees alone are enough to put me off, although I know it probably isn't the case for the people buying the machinery.
This will not be placed in the cloud, but rather on a single server on the network. Next to the machine being operated, I will place a touch interface that through a browser will contact the node.js server to invoke PLC commands. There can be multiple machines that need controlling, and they would all be controlled by the same central node.js server.
The machinery is controlled by PLC controllers from http://beckhoff.com/.
I am not a complete novice when it comes to MongoDb, but I have never put anything I have made into production, and I wouldn't put MongoDb on my CV... yet!
EDIT: It seems that the $inc operator is the way to go. But what if I want both the daily and hourly averages, as well as a continuous feed that updates an on-screen chart every second using socket.io? Is it a good idea to update one document for each of the aggregates I need? I would also really like to save every measurement, but maybe I could aggregate on a per-second basis, so that I don't store up to 1000 records per second per metric?
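The "one document per aggregate" idea from the EDIT can be sketched without a database. Each bucket keeps a running sum and count, which is what a pair of `$inc` updates would maintain in MongoDB, and the average is derived on read. The class `MetricAggregator`, the key format, and the choice of UTC hour and day buckets are all assumptions made for this example.

```javascript
// Illustrative pre-aggregation: one bucket per metric per hour and per day.
class MetricAggregator {
  constructor() {
    this.buckets = new Map(); // key -> { sum, count }
  }

  // Build a bucket key from the UTC timestamp, e.g. "temp|hour|2024-01-01T10".
  static key(metric, granularity, timestampMs) {
    const iso = new Date(timestampMs).toISOString(); // "YYYY-MM-DDTHH:mm:ss.sssZ"
    const prefix = granularity === 'hour' ? iso.slice(0, 13) : iso.slice(0, 10);
    return `${metric}|${granularity}|${prefix}`;
  }

  // Every measurement increments the sum and count of each bucket it falls in.
  record(metric, value, timestampMs) {
    for (const granularity of ['hour', 'day']) {
      const key = MetricAggregator.key(metric, granularity, timestampMs);
      const bucket = this.buckets.get(key) ?? { sum: 0, count: 0 };
      bucket.sum += value;
      bucket.count += 1;
      this.buckets.set(key, bucket);
    }
  }

  // Average for the bucket containing `timestampMs`, or null if it is empty.
  average(metric, granularity, timestampMs) {
    const bucket = this.buckets.get(MetricAggregator.key(metric, granularity, timestampMs));
    return bucket ? bucket.sum / bucket.count : null;
  }
}
```

Reading an hourly or daily average then touches a single bucket instead of scanning raw measurements. A per-second bucket for the live chart could be added to the list in `record` in the same way.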
MongoDB can definitely be used for your scenario. Look at http://www.slideshare.net/pstokes2/social-analytics-with-mongodb, http://docs.mongodb.org/manual/use-cases/pre-aggregated-reports/ or the question "Real-time statistics: MySQL(/Drizzle) or MongoDB?" for more on this topic.
What I am really looking for is the Aggregation Framework: http://docs.mongodb.org/manual/tutorial/aggregation-examples/
That gives me exactly the kind of stats that I would like to see. Use this to calculate sums and averages as I write, and then also allow for ad-hoc queries should they be needed.
For a little insight on performance, read this awesome blogpost!
http://devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework
Also, anyone else looking to do something like this should take a look at this to see how to save the individual events. I don't need to save data longer than a week for example, so a rolling log should be more than enough for me: http://blog.mongodb.org/post/172254834/mongodb-is-fantastic-for-logging
With this I am very close to having a really sweet setup, and I am beginning to feel confident that this is a good choice over MySQL or MSSQL.

Paging among multiple aggregate root

I'm new to DDD, so please excuse me if some terms or my understanding are a bit off. Please correct me; any advice is appreciated.
Let's say I'm building a social job board site, and I've identified my aggregate roots: Candidates, Jobs, and Companies. They are very different things/contexts, so each has its own database table, repository, and service. But now I have to build a Pinterest-style homepage where each data block shows data for either a Candidate, a Job, or a Company.
The tricky part is that the data blocks have to be ordered by the last time something happened to the aggregate they represent (a company was liked or commented on, a job was updated, etc.), and paging takes the form of infinite scrolling, again just like Pinterest. Since things happen to these aggregates independently, I have no way of knowing how many of each aggregate are on any particular page. (And if I did, say with a table that tracks each aggregate's last update time, would I have no choice but to promote it to another aggregate root with its own repository?)
Where would I implement the paging logic? I read somewhere that there should be one service per repository per aggregate root, so should I sort and page in the controller (I'm using MVC, by the way)? Or should there be an independent Application Service that does cross-boundary work like this? In either case, do I have to fetch ALL entities for ALL aggregates from the db?
That's too many questions already, but I'm basically asking:
- Is paging presentation, business, or persistence logic? Which horizontal layer?
- Where should cross-boundary code reside in DDD? Which vertical stack?
Several things come to mind.
How fresh does this aggregated data need to be? I doubt real time is going to add much value. Talk to a business person and bargain for some latency. This will allow you to build a simpler solution to the problem.
Why not have some process do the scanning, aggregation, and sorting, and store the result asynchronously? It doesn't even need to live in a database; a store such as Redis would do. The bargained latency could be the interval at which the process runs.
Paging is hardly a business concern in your example. You just need to provide infinite scrolling and some ajax calls that fetch the cached, aggregated, sorted information. This has little to do with DDD.
Your UI artifacts and the aggregation and sorting process seem to be very much a thing of their own, working together with the data or, better yet, with a data component of each context that provides the data in the desired format.
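The background process and the paged fetch described above could look roughly like this. It is a sketch under assumptions: each aggregate is reduced to `{ id, lastActivity }` with a numeric timestamp, timestamps are distinct (a real feed would need a tie-breaker such as the id), and the names `buildFeed` and `feedPage` are invented for the example.

```javascript
// Run periodically: merge the three aggregates into one list, newest first.
function buildFeed(candidates, jobs, companies) {
  const tag = (type) => (item) => ({ type, id: item.id, lastActivity: item.lastActivity });
  return [
    ...candidates.map(tag('candidate')),
    ...jobs.map(tag('job')),
    ...companies.map(tag('company')),
  ].sort((a, b) => b.lastActivity - a.lastActivity);
}

// Serve one infinite-scroll page from the precomputed feed.
// `before` is the cursor returned by the previous page (omit it for the first page).
function feedPage(feed, pageSize, before = Infinity) {
  const items = feed.filter((entry) => entry.lastActivity < before).slice(0, pageSize);
  // A full page may have more behind it, so hand back a cursor; otherwise stop.
  const nextCursor = items.length === pageSize ? items[items.length - 1].lastActivity : null;
  return { items, nextCursor };
}
```

The controller only calls `feedPage` with the cursor sent by the client, so it never loads all entities of all aggregates, and the feed itself is just a read model rather than a new aggregate root.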
