I have a users collection (MongoDB) that contains millions of users.
I'm using Node.js as the backend, AngularJS as the frontend, and DataTables to display those users.
But DataTables loads all users in one API call, which fetches more than 1 million users.
This makes my API response too slow.
I want only the first 50 users, then the next 50, and so on.
Server stack: Node.js + AngularJS + MongoDB
Thanks
If you are using DataTables with a huge amount of data, you should consider using its server-side processing functionality.
Server-side processing for DataTables is described here: https://datatables.net/manual/server-side
If you would rather not implement this yourself on the server, you could use a third-party library such as:
https://github.com/vinicius0026/datatables-query
https://github.com/eherve/mongoose-datatable
Hope this helps.
The usual way to handle a client fetching users from your server (and DB) and rendering them in a datatable is pagination. There are a few ways of doing pagination that I have seen; let's assume you are using REST.
One way of doing this is having your API ending with:
/api/users?skip=100&limit=50
Meaning, the client asks your server for users (using the default sorting), skipping the first 100 results it finds and retrieving the next 50 users.
Another way is to have your API like this (I don't really like this approach):
/api/users?page=5&pageSize=50
Meaning, the client passes which page it wants and how many results per page to fetch. This requires a server-side calculation, because the server has to work out that this request means users 250-300 (counting pages from zero).
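The server-side calculation for the page/pageSize style can be sketched as below. This is only an illustration: the function name pageToSkipLimit is made up for this example, and it assumes zero-based page numbers, which is what makes page=5 with pageSize=50 land on users 250-300.

```javascript
// Hypothetical helper (not part of any library): converts page/pageSize
// into the skip/limit pair used by the first API style.
// Assumption: pages are zero-based, so page 0 is the first page.
function pageToSkipLimit(page, pageSize) {
  // Query-string values arrive as strings; fall back to safe defaults.
  const p = Math.max(0, Math.floor(Number(page)) || 0);
  const size = Math.max(1, Math.floor(Number(pageSize)) || 1);
  return { skip: p * size, limit: size };
}

console.log(pageToSkipLimit(5, 50));
```

With one-based page numbers the only change would be skip = (page - 1) * pageSize.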
You can read a lot more about pagination on the web.
Having said that, your next issue is fetching the desired users from the database. MongoDB cursors have skip and limit functions, which is why I like the first API better. You can do the query as follows:
users.find().skip(50).limit(50)
You can read more about the limit and skip functions in the MongoDB documentation.
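On the server, the skip and limit values arrive as strings in the query string, so it is worth validating them before handing them to MongoDB. A minimal sketch, assuming a hypothetical helper parsePaging, a default of 50 results and an arbitrary cap of 100 results per request (none of which come from the question):

```javascript
// Hypothetical helper: turns the raw query-string values of
// /api/users?skip=100&limit=50 into safe numbers.
// Assumptions: default limit 50, maximum limit 100.
function parsePaging(query, maxLimit = 100) {
  const skip = Number.parseInt(query.skip, 10);
  const limit = Number.parseInt(query.limit, 10);
  return {
    // Negative or missing values fall back to the first page.
    skip: Number.isInteger(skip) && skip > 0 ? skip : 0,
    // Cap the page size so one request can never ask for millions of rows.
    limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, maxLimit) : Math.min(50, maxLimit),
  };
}

const paging = parsePaging({ skip: '100', limit: '50' });
// The result would then feed a query like the one above, e.g.
// users.find().skip(paging.skip).limit(paging.limit)
console.log(paging);
```

Capping the limit matters here because an unbounded limit would bring back the original problem of loading every user in one call.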
First, you need to add skip and limit to your Mongo query, like this:
Model.find().skip(offset).limit(limit)
Next, you have to enable server-side processing in DataTables.
If you are using the plain JavaScript DataTables, this fiddle will work for you:
http://jsfiddle.net/bababalcksheep/ntcwust8/
For angular-datatables:
http://l-lin.github.io/angular-datatables/archives/#/serverSideProcessing
Another way, if you want to send your own parameters:
$scope.dtOptions = DTOptionsBuilder.newOptions()
  .withOption('serverSide', true)
  .withOption('processing', true)
  .withOption('ajax', function (data, callback, settings) {
    // make an ajax request using data.start and data.length
    $http.post(url, {
      draw: data.draw, // echo the draw counter DataTables sent with this request
      limit: data.length,
      offset: data.start,
      contains: data.search.value
    }).then(function (response) {
      // map your server's response to the DataTables format and pass it to
      // DataTables' callback
      var res = response.data;
      callback({
        recordsTotal: res.meta,
        recordsFiltered: res.meta,
        draw: res.draw,
        data: res.data
      });
    });
  });
You get the page length (data.length) and the offset (data.start) in the data object passed to the .withOption('ajax', fun...) callback. From there you can pass them as parameters in a GET request, e.g. /route?offset=data.start&limit=data.length, or use the POST request from the example above.
When the user hits the next button in the table, this function is triggered automatically with the limit, the start and many other DataTables-related values.
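For completeness, here is a rough sketch of what the server behind such an ajax function could return. It is not the author's implementation: the function name, the in-memory users array standing in for the MongoDB collection, and the name field used for the contains filter are all assumptions. The response keys (draw, meta, data) simply mirror what the client code reads from the response.

```javascript
// Hypothetical server-side handler body. `allUsers` is an in-memory array
// standing in for the MongoDB collection; `body` is the POSTed JSON
// ({ draw, limit, offset, contains }) sent by the client.
function buildUsersPage(allUsers, body) {
  const needle = String(body.contains || '').toLowerCase();
  // Assumption: the search box filters on a `name` field.
  const matching = needle
    ? allUsers.filter((u) => String(u.name).toLowerCase().includes(needle))
    : allUsers;
  const offset = Math.max(0, Number(body.offset) || 0);
  const limit = Math.max(1, Number(body.limit) || 10);
  return {
    draw: Number(body.draw) || 0, // echoed back so replies can be matched to requests
    meta: matching.length,        // total number of matching rows
    data: matching.slice(offset, offset + limit), // only the requested page
  };
}

const users = Array.from({ length: 120 }, (_, i) => ({ name: 'user' + i }));
const page = buildUsersPage(users, { draw: 3, limit: 50, offset: 100, contains: '' });
console.log(page.meta, page.data.length);
```

Against a real collection, the slice would be replaced by a query using .skip(offset).limit(limit), with a separate count for the total.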
@mahesh
When the page loads, create two variables, say skipVar = 0 and limit. When the user clicks next, send the skipVar value under the key skip.
var skipVar = 0
on page load: skip=skipVar&limit=limit
on next button:
skipVar = skipVar + limit
and send the query string as
skip=skipVar&limit=limit
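The same bookkeeping can be wrapped in a small helper. This is just a sketch; createPager and its method names are invented for the example. Note that the offset has to grow by limit on every click on next.

```javascript
// Hypothetical client-side pager: keeps track of the current offset and
// builds the query string for each request.
function createPager(limit) {
  let skipVar = 0; // offset of the page currently shown
  const queryString = () => 'skip=' + skipVar + '&limit=' + limit;
  return {
    current: queryString,
    next() {
      skipVar += limit; // advance by one page
      return queryString();
    },
    previous() {
      skipVar = Math.max(0, skipVar - limit); // never go below the first page
      return queryString();
    },
  };
}

const pager = createPager(50);
console.log(pager.current()); // query string for the initial page load
console.log(pager.next());    // query string after the first click on "next"
```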
I have run into an unforeseen problem with my socket.io setup.
I use socket.io to live load data from my database (mongoDB, nodejs, react).
To accomplish this, I use mongoDB's changestream to detect changes and then push them to the front-end via socket.io.
Now this works perfectly as long as the user is connected. Right now, when the user reconnects, it just reloads all data. While this is fine for most users, there is a small group with a very bad network connection, so their front-end is reloading data all the time, which makes it unresponsive for some time.
So, I am looking for a way to send only the events that occurred while the front-end was offline. The front-end can do this quite easily: https://socket.io/docs/v4/client-offline-behavior/
It doesn't seem possible to do this on the server side, though, since socket.io (server side) immediately forgets sockets that have disconnected and thus can't buffer events.
So, I was wondering if there is a good way to do this? Or would this need a full "wrapper" around socket.io that caches disconnected sockets?
Any help or advice would be appreciated!
I find this a really interesting and painful problem! ^^'
If you can give more details, it may help people give you a better answer.
For instance:
How much data is stored in the database, how much will a typical user receive, and how many events are triggered in a given time frame?
How quickly should an event become visible? I mean, if users receive an event with a 10 s, 30 s, ... delay, is that harmful for the service they provide?
How is your data structured? Is it a simple JSON array with the same fields, custom fields, dynamic JSON objects, etc.?
How is your React app structured? Do you run heavy logic when your data is updated, etc.?
I think you should put more controls in your front-end code and update only when there is new data.
Some paths to explore
1. Put more controls in your front end
As you stated, for users with a bad connection, the React client seems to update its state too often, because the data is reloaded again and again each time the websocket reconnects. The UI may freeze in this case, yes.
For this, I can think of two approaches:
Before updating the state, check whether the current React state is the same as the data you receive from the websocket connection. If the reconnection is quick enough and no new data has arrived, it should be the same, so in this case do not update the React state.
If too many events are triggered and new data arrives after each reconnection, you can buffer the data from the websocket and display it only once per time frame. By time frame I mean that you can use functions like setInterval or requestAnimationFrame to trigger the React update. Here is some pseudo React code to illustrate this.
import React, { useEffect, useRef, useState } from "react";

// `websocket` is assumed to be an already created socket.io client instance.
function App() {
  const [events, setEvents] = useState({ datas: [] });
  const bufferedEvents = useRef([]);
  useEffect(() => {
    // Buffer incoming events instead of updating the React state right away.
    const bufferEvents = (newEvents) => {
      bufferedEvents.current = bufferedEvents.current.concat(newEvents);
    };
    websocket.on("connected", bufferEvents);
    websocket.on("data", bufferEvents);
    // In the setInterval callback, take all the events received at connection time plus the new events,
    // use them to update the React state, and empty bufferedEvents at the same time.
    const intervalId = setInterval(() => {
      const pending = bufferedEvents.current;
      bufferedEvents.current = [];
      // update only if there is new data
      if (pending.length > 0) {
        setEvents((prevState) => ({ datas: prevState.datas.concat(pending) }));
      }
    }, 1000); // trigger the data update every second. You could replace this approach with requestAnimationFrame, and adapt the refresh time as needed.
    // Do not forget to clear the interval and remove the listeners when the component unmounts
    return () => {
      clearInterval(intervalId);
      websocket.off("connected", bufferEvents);
      websocket.off("data", bufferEvents);
    };
  }, []);
  return (
    <div>
      <span>Total events : {events.datas.length}</span>
      <br />
      {events.datas.map((event, index) => (
        <div key={index}>{event.data}</div>
      ))}
    </div>
  );
}
You can look at this article for details on using requestAnimationFrame.
I think that modifying the front end is needed in any case, but on its own it will not be enough for good performance.
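The first approach in this section (skip the state update when nothing new arrived) could look roughly like this. It is a sketch under the assumption that every event carries a unique id field, which the question does not confirm; mergeNewEvents is a made-up name.

```javascript
// Hypothetical helper: returns the same array instance when `incoming`
// contains nothing new, so a caller can skip the React state update.
// Assumption: each event has a unique `id`.
function mergeNewEvents(current, incoming) {
  const known = new Set(current.map((e) => e.id));
  const fresh = incoming.filter((e) => !known.has(e.id));
  return fresh.length === 0 ? current : current.concat(fresh);
}

const state = [{ id: 1, data: 'a' }, { id: 2, data: 'b' }];
const afterReconnect = mergeNewEvents(state, [{ id: 2, data: 'b' }, { id: 3, data: 'c' }]);
console.log(afterReconnect.length, mergeNewEvents(state, state) === state);
```

Returning the previous value unchanged from a state updater is what lets React skip the re-render, so a quick reconnect with no new data should cost almost nothing.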
2. Fetch only new data in your back end
For this approach, it really depends how your data is structured in the database.
If the data has a timestamp in it, I can think of a naive but simple approach: a cookie holding a timestamp.
When the user connects for the first time, this cookie is null.
When they fetch the data over the websocket connection, they receive all of it. When the data arrives, you update the cookie timestamp with the most recent date in the data.
When the websocket is disconnected, you open a new websocket and send the cookie timestamp with it. With this information you can query only the data more recent than the timestamp in the cookie.
This way, you don't have to download the entirety of the data, only the fresh part.
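A sketch of the timestamp idea, with a plain array standing in for the database. The createdAt field, the function names and the numeric timestamps are assumptions made for the example.

```javascript
// Assumption: every stored event has a numeric `createdAt` timestamp (ms).
// Returns only the events newer than the timestamp the client sent;
// a null timestamp (first connection) returns everything.
function eventsSince(allEvents, lastSeen) {
  if (lastSeen === null || lastSeen === undefined) return allEvents.slice();
  return allEvents.filter((e) => e.createdAt > lastSeen);
}

// The value the client would store (e.g. in the cookie) after receiving data.
function latestTimestamp(events, previous = null) {
  return events.reduce(
    (max, e) => (max === null || e.createdAt > max ? e.createdAt : max),
    previous
  );
}

const stored = [{ createdAt: 1000 }, { createdAt: 2000 }, { createdAt: 3000 }];
const first = eventsSince(stored, null);   // first connection: everything
const cookie = latestTimestamp(first);     // remember the newest event seen
stored.push({ createdAt: 4000 });          // an event arrives while offline
console.log(eventsSince(stored, cookie));  // reconnect: only the missed event
```

With MongoDB, the filter would typically become a query such as find({ createdAt: { $gt: lastSeen } }), ideally backed by an index on that field.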
Other approaches may be more helpful, but without more information on your data and more precise requirements, it is hard to say.
If you have a lot of data, I would personally look at some pagination mechanism, and maybe combine classic HTTP requests for fetching the data with websockets, SSE, or long polling for live events.
You can leave a comment if needed and I will update my answer!
Cheers
I am new to Node.js, and I have been reading questions and answers related to this issue, but I am still not sure I fully understand the concept in my case.
Suggested Code
router.post('/test123', function(req, res) {
  someAsyncFunction1(parameter1, function(result1) {
    someAsyncFunction2(parameter2, function(result2) {
      someAsyncFunction3(parameter3, function(result3) {
        var theVariable1 = req.body.something1;
        var theVariable2 = req.body.something2;
      });
    });
  });
});
Question
I assume there will be multiple (can be 10+, 100+, or whatever) requests to one certain place (for example, ajax request to /test123, as shown above) at the same time with some variables (something1 and something2). According to this, it would be impossible that one user's theVariable1 and theVariable2 are mixed up with (i.e, overwritten by) the other user's req.body.something1 and req.body.something2. I am wondering if this is true when there are multiple callbacks (three like the above, or ten, just in case).
I am also considering using res.locals to save some data from the callbacks (instead of using theVariable1 and theVariable2). Is that a good idea, i.e. can I rely on the data not being overwritten by multiple simultaneous requests from clients?
For each request, a Node.js/Express server generates a new req object.
So in the line router.post('/test123', function(req, res), the req object that's being passed in as an argument is unique to that HTTP request.
You don't have to worry about multiple functions or callbacks. In a traditional application, if I have two objects cat and dog that I can pass to the listen function, I would get back meow and bark, even though there's only one listen function. That's sort of how you can view an Express app. Even though you have all these get and post functions, every user's request is passed to them as a unique entity.
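To see this without running a server, the handler from the question can be imitated with fake req/res objects. In this sketch the "async" functions only queue their callbacks, so two requests can be interleaved deterministically; the fake objects and the someAsyncFunction stand-in are assumptions made for the illustration, not Express code.

```javascript
// Queue of pending callbacks, used here to simulate async work finishing
// later and in an interleaved order.
const pending = [];
function someAsyncFunction(param, callback) {
  pending.push(() => callback(param));
}

// Same shape as the route handler in the question.
function handler(req, res) {
  someAsyncFunction('p1', () => {
    someAsyncFunction('p2', () => {
      someAsyncFunction('p3', () => {
        const theVariable1 = req.body.something1; // `req` belongs to this call only
        res.send(theVariable1);
      });
    });
  });
}

const results = {};
const fakeRes = (user) => ({ send: (value) => { results[user] = value; } });

// Two "simultaneous" requests, started before any callback has run.
handler({ body: { something1: 'from-alice' } }, fakeRes('alice'));
handler({ body: { something1: 'from-bob' } }, fakeRes('bob'));

// Run the callbacks in arrival order, so the two requests interleave.
while (pending.length > 0) pending.shift()();

console.log(results);
```

Each call to handler creates its own closure over req and res, so the nesting depth of the callbacks does not matter. Values would only get mixed up if they were stored in a variable shared across requests, such as a module-level variable.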
UPDATE: See MarkLogic 8 - Stream large result set to a file - JavaScript - Node.js Client API for someone's answer on how to do this in Javascript. This question is specifically asking about XQuery.
I have a web application that consumes rest services hosted in node.js.
Node simply proxies the request to XQuery which then queries MarkLogic.
These queries already have paging setup and work fine in the normal case to return a page of data to the UI.
I need to have an export feature such that when I put a URL parameter of export=all on a request, it doesn't look up a single page anymore.
At that point it should get the whole result set, even if it's a million records, and save it to a file.
The actual request needs to return immediately saying, "We will notify you when your download is ready."
One suggestion was to use xdmp:spawn to call the XQuery in the background which would save the results to a file. My actual HTTP request could then return immediately.
For the spawn piece, I think the idea is that I run my query with different options in order to get all results instead of one page. Then I would loop through the data and create a string variable to call xdmp:save with.
Some questions: is this a good idea? Is there a better way? If I loop through the result set and it does happen to be very large (gigabytes), it could cause memory issues.
Is there no way to directly stream the results to a file in XQuery?
Note: Another idea I had was to intercept the request at the proxy (node) layer and then do an xdmp:estimate to get the record count and then loop through querying each page and flushing it to disk. In this case I would need to find some way to return my request immediately yet process in the background in node which seems to have some ideas here: http://www.pubnub.com/blog/node-background-jobs-async-processing-for-async-language/
One possible strategy would be to use a self-spawning task that, on each iteration, gets the next page of the results for a query.
Instead of saving the results directly to a file, however, you might want to consider using xdmp:http-post() to send each page to a server:
http://docs.marklogic.com/xdmp:http-post?q=xdmp:http-post&v=8.0&api=true
In particular, the server could be a Node.js server that appends each page as it arrives to a file or any other datasink.
That way, Node.js could handle the long-running asynchronous IO with minimal load on the database server.
When a self-spawned task hits the end of the query, it can again use an HTTP request to notify Node.js to close the file and report that the export is finished.
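The Node.js side of this strategy might look something like the sketch below. It leaves out the HTTP server and only shows the file handling; appendPage, finishExport and the temporary file location are hypothetical names chosen for the example, and the pages are simulated instead of being POSTed by MarkLogic.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helpers for the Node.js side. In a real server these would
// be called from the HTTP handlers that receive each POSTed page.
function appendPage(filePath, pageBody) {
  fs.appendFileSync(filePath, pageBody + '\n'); // creates the file if needed
}

function finishExport(filePath) {
  // Placeholder for "close the file and notify the user".
  return { file: filePath, bytes: fs.statSync(filePath).size };
}

// Simulate three pages arriving one after another.
const exportDir = fs.mkdtempSync(path.join(os.tmpdir(), 'export-'));
const exportFile = path.join(exportDir, 'all.txt');
['page 1', 'page 2', 'page 3'].forEach((body) => appendPage(exportFile, body));
const summary = finishExport(exportFile);
console.log(summary.bytes);
```

Because only one page is held in memory at a time, the size of the full result set should not matter. For very large exports a write stream (fs.createWriteStream) would probably be a better fit than synchronous appends.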
Hoping that helps,
I just started with Meteor.js, and I'm struggling with its publish method. Below is one publish method.
//Server side
Meteor.publish('topPostsWithTopComments', function() {
var topPostsCursor = Posts.find({}, {sort: {score: -1}, limit: 30});
var userIds = topPostsCursor.map(function(p) { return p.userId });
return [
topPostsCursor,
Meteor.users.find({'_id': {$in: userIds}})
];
});
// Client side
Meteor.subscribe('topPostsWithTopComments');
Now I don't understand how I can use the published data on the client. I mean, I want to use the data provided by topPostsWithTopComments.
The problem is detailed below:
When a new post enters the top 30 list, two things need to happen:
The server needs to send the new post to the client.
The server needs to send that post’s author to the client.
Meteor is observing the Posts cursor returned on line 6, and so will send the new post down as soon as it’s added, ensuring the client will receive the new post straight away.
However, consider the Meteor.users cursor returned on line 7. Even if the cursor itself is reactive, it’s now using an outdated value for the userIds array (which is a plain old non-reactive variable), which means its result set will be out of date as well.
This is why as far as that cursor is concerned, there is no need to re-run the query and Meteor will happily continue to publish the same 30 authors for the original 30 top posts ad infinitum.
So unless the whole code of the publication runs again (to construct a new list of userIds), the cursor is no longer going to return the correct information.
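The stale userIds problem described above is not specific to Meteor; it can be reproduced with plain arrays. In this sketch the topPosts array stands in for the reactive Posts cursor, and the data is invented.

```javascript
// `topPosts` plays the role of the (reactive) Posts result set.
const topPosts = [
  { title: 'A', userId: 'u1' },
  { title: 'B', userId: 'u2' },
];

// Computed once when the publication runs: a plain, non-reactive snapshot.
const userIds = topPosts.map((p) => p.userId);

// Later, a new post enters the top list...
topPosts.push({ title: 'C', userId: 'u3' });

// ...but the snapshot still holds only the original authors, so a query
// built from it cannot include the new post's author.
const freshUserIds = topPosts.map((p) => p.userId);
console.log(userIds, freshUserIds);
```

The list of authors only becomes correct again when the mapping is recomputed, which is the plain-JavaScript equivalent of re-running the whole publish function.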
Basically what I need is:
If any change happens in Posts, the published users list should be updated as well, without calling the users collection again. I found some useful mrt modules:
link1 |
link2 |
link3
Please share your views!
-Neelesh
When you publish data on the server you're just publishing what the client is allowed to query. This is for security. After you subscribe to your publication you still need to query what the publication returned.
if(Meteor.isClient) {
Meteor.subscribe('topPostsWithTopComments');
// This returns all the records published with topPostsWithComments from the Posts Collection
var posts = Posts.find({});
}
If you wanted to only publish posts that the current user owns you would want to filter them out in the publish method on the server and not on the client.
I think @Will Brock already answered your question, but maybe it becomes clearer with an abstract example.
Let's construct two collections named collectiona and collectionb.
// server and client
CollectionA = new Meteor.Collection('collectiona');
CollectionB = new Meteor.Collection('collectionb');
On the server you could now call Meteor.publish with 'collectiona' and 'collectionb' separately to publish both record sets to the client. This way the client could then also separately subscribe to them.
But instead you can also publish multiple record sets in a single call to Meteor.publish by returning multiple cursors in an array. Just like in the standard publishing procedure you can of course define what is being sent down to the client. Like so:
if (Meteor.isServer) {
Meteor.publish('collectionAandB', function() {
// constrain records from 'collectiona': limit number of documents to one
var onlyOneFromCollectionA = CollectionA.find({}, {limit: 1});
// all cursors in the array are published
return [
onlyOneFromCollectionA,
CollectionB.find()
];
});
}
Now on the client there is no need to subscribe to 'collectiona' and 'collectionb' separately. Instead you can simply subscribe to 'collectionAandB':
if (Meteor.isClient) {
Meteor.subscribe('collectionAandB', function () {
// callback to use collection A and B on the client once
// they are ready
// only one document of collection A will be available here
console.log(CollectionA.find().fetch());
// all documents from collection B will be available here
console.log(CollectionB.find().fetch());
});
}
So I think what you need to understand is that there is no array sent to the client that contains the two cursors published in the Meteor.publish call. This is because returning an array of cursors in the function passed as an argument to your call to Meteor.publish merely tells Meteor to publish all cursors contained in the array. You still need to query the individual records using your collection handles on the client (see @Will Brock's answer).
I have an ASP.NET MVC 3 (.NET 4) web application.
This app fetches data from an Oracle database and mixes some of that information with data from another SQL database.
Many tables are joined together and a lot of database reading is involved.
I have already optimized the fetching side as best I could and I don't have problems with that.
I've used caching to save information I don't need to fetch over and over.
Now I would like to build a responsive interface, and my goal is to present the filtered order headers to the users and load the order lines in the background.
I want to do that because I need to manage all the order lines as a whole for some calculations.
What I have done so far is using jQuery to make an Ajax call to my action where I fetch the order headers and save them in a cache (System.Web.Caching.Cache).
When the Ajax call has succeeded I fire off another Ajax call to fetch the lines (and, once again, save the result in a cache).
It works quite well.
Now I was trying to figure out if I can move some of this logic from the client to the server.
When my action is called I want to fetch the order header, start a new thread - responsible for fetching the order lines - and return the result to the client.
In a test app I tried both ThreadPool.QueueUserWorkItem and Task.Factory but I want the generated thread to access my cache.
I've put together a test app and done something like this:
TEST 1
[HttpPost]
public JsonResult RunTasks01()
{
var myCache = System.Web.HttpContext.Current.Cache;
myCache.Remove("KEY1");
ThreadPool.QueueUserWorkItem(o => MyFunc(1, 5000000, myCache));
return (Json(true, JsonRequestBehavior.DenyGet));
}
TEST 2
[HttpPost]
public JsonResult RunTasks02()
{
var myCache = System.Web.HttpContext.Current.Cache;
myCache.Remove("KEY1");
Task.Factory.StartNew(() =>
{
MyFunc(1, 5000000, myCache);
});
return (Json(true, JsonRequestBehavior.DenyGet));
}
MyFunc creates a list of items and saves the result in the cache; pretty silly, but it's just a test.
I would like to know if someone has a better solution or knows of any implications of accessing the cache from a separate thread.
Is there anything I need to be aware of, should avoid, or could improve?
Thanks for your help.
One possible issue I can see with your approach is that System.Web.HttpContext.Current might not be available in a separate thread, as this thread could run later, once the request has finished. I would recommend using the classes in the System.Runtime.Caching namespace, introduced in .NET 4.0, instead of the old HttpContext.Cache.