How to connect Pinoccio to Apache CouchDB - node.js

Is there anyone using the nice Pinoccio from www.pinocc.io?
I want to use it to post data into an Apache CouchDB using node.js. So I'm trying to poll data from the Pinoccio API, but I'm a little lost:
schedule the polls
do long polls
take a completely different approach
Any ideas are welcome.
Pitt

Sure. I wrote the Pinoccio API; here's how you do it:
https://gist.github.com/soldair/c11d6ae6f4bead140838
This example depends on the pinoccio npm module ~0.1.3, so make sure to npm install again to pick up the newest version.
You don't need to poll, because Pinoccio will send you changes as they happen if you have an open connection to either "stats" or "sync". You can poll if you want to, but then it's not "real time".
Sync gives you the current state plus a stream of changes as they happen, so it's perfect if you only need to save the changes to your troop while your script is running, or to show the current and last known state on a web page.
The solution that replicates every data point we store is stats, and that is what the example uses. Stats lets you read everything that has happened to a scout. Digital pins, for example, are the "digital" report. You can ask for data from a specific point in time or just from the current time (the default). Changes to this "digital" report will continue streaming live as they happen until the "end" time is reached, or unless "tail" equals 0 in the options passed to stats.
Hope this helps. I tested the script on my local CouchDB and it worked well. You would need to modify it to copy more stats from each scout. I hope that soon you will be able to request multiple reports from multiple scouts in the same stream; I just have some bugs to sort out ;)
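For reference, here is a minimal sketch of the overall shape: a streaming report source written into CouchDB through the nano client. The pinoccio stream interface shown here is an assumption (the gist above shows the real usage); the nano calls are standard, and the database name and token handling are placeholders.

// Sketch only: the pinoccio API shape below is assumed, not verified --
// see the gist linked above for real usage. The nano calls are standard.
var nano = require('nano')('http://localhost:5984');
var pinoccio = require('pinoccio');

var db = nano.db.use('troop-stats');            // placeholder database name
var api = pinoccio(process.env.PINOCCIO_TOKEN); // hypothetical token setup

// Assumed: stats() returns an object-mode stream of report events.
var stream = api.stats({ troop: 1, scout: 1, report: 'digital' });

stream.on('data', function (report) {
  // Each report becomes one CouchDB document.
  db.insert(report, function (err) {
    if (err) console.error('couch insert failed', err);
  });
});

stream.on('error', function (err) {
  console.error('stats stream error', err);
});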

You need to look at two dimensions:
node.js talking to CouchDB. This is well understood, and there are existing questions here that cover it.
Getting the data from the Pinoccio. The API suggests that as long as the connection is open, you get data. So use a short timeout and a loop (see the sketch below). You might want to run your own node.js instance for that.
Interesting fact: the CouchDB team seems to be working on replacing their internal JS engine with node.js.
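A minimal sketch of such a read loop: hold the connection open, consume data as it streams in, and reconnect after a short timeout whenever the server closes the connection. The URL is a placeholder, not the real Pinoccio endpoint.

// Reconnecting read loop against a streaming HTTP endpoint.
// The URL is a placeholder, not the real Pinoccio endpoint.
var https = require('https');

function connect() {
  https.get('https://api.example.com/v1/sync', function (res) {
    res.on('data', function (chunk) {
      process.stdout.write(chunk); // handle streamed report data here
    });
    res.on('end', function () {
      setTimeout(connect, 1000);   // server closed the stream: reconnect
    });
  }).on('error', function () {
    setTimeout(connect, 5000);     // back off a little on errors
  });
}

connect();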

Related

MongoDB change stream with load balancing in production env

Can anyone help with how we can achieve a MongoDB change stream with load-balanced servers?
We are working on a microservice architecture and facing an issue in production: with load balancing, the same code is deployed across 4 servers, and when I perform any operation on a single server, the change stream trigger is fired on all 4 servers.
What I want is for it to trigger only on the server where the operation was performed.
Thanks in advance
Change stream is a database-level concept. Data is inserted/updated/deleted from the database, this produces change events. Any number of subscribers can subscribe to the change events and do whatever they want to do with the changes. Each subscriber is notified of every change event.
A change stream is not meant to inform the application that originated a change about that same change. That would be redundant: the application already knows what it did.
Consider rephrasing your question to explain what you are trying to accomplish better.
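To make the fan-out concrete: every process that opens a watch() cursor receives every event, no matter which process wrote the data. A minimal subscriber with the official mongodb Node driver (the connection string and names are placeholders; note that change streams also require a replica set or sharded cluster):

// Every process running this receives every change event for the
// collection -- that is the designed behaviour, not a bug.
const { MongoClient } = require('mongodb');

async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017'); // placeholder URI
  const posts = client.db('app').collection('posts');                    // placeholder names

  const changeStream = posts.watch();
  changeStream.on('change', (event) => {
    // event.operationType is 'insert', 'update', 'delete', ...
    console.log('change seen by this process:', event.operationType);
  });
}

main().catch(console.error);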

Calling external API only when new data is available

I am serving my users with data fetched from an external API. Now, I don't know when this API will have new data, so what would be the best approach to handle that using Node, for example?
I have tried setInterval and node-schedule and got it working, but isn't that expensive for the CPU? For example, over a day I would hit this endpoint to check for new data every minute, but it could have new data only every five minutes or more.
The thing is, this external API isn't run by me. Would the only way to check for updates be hitting it every minute? Is there any module that can do that in Node, or any approach that fits better?
Use case 1: Call a weather API for every city in the country and only save data to my db when it is going to rain in a given city.
Use case 2: Send a notification to the user when a given Philips Hue lamp is turned on, at the moment it is turned on, without having to hit the endpoint to check whether it is on or not.
I appreciate the time to discuss this.
If this external API has no means of notifying you when there's new data, then the only thing you can do is to "poll" it to check for new data.
You will have to decide what an "efficient design" for polling is in your specific application and given the type of data and the needs of the client (what is an acceptable latency for new data).
You also need to be sure that your service is not violating any terms of service with your polling scheme or running afoul of rate limiting that may deny you access to the server if you use it "too much".
Would the only way to check for updates be hitting it every minute?
Unless the API offers some notification feature, there is no scheme other than polling at some interval. Polling every minute is fairly frequent. Do your clients really need information that is less than a minute old? Or would it really make no difference if the information was as much as 5 minutes old?
For example, with weather, a client wouldn't really need temperature updates more often than every 10-15 minutes.
Is there any module that can do that in Node, or any approach that fits better?
No, not really. You'll probably just use some sort of timer (either a repeated setTimeout() or setInterval()) in a node.js app to repeatedly carry out your API operations.
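A minimal polling loop might look like the sketch below, using a recursive setTimeout so a slow response never overlaps the next request (assumes Node 18+ for the global fetch; the URL and change detection are placeholders):

// Poll an external API on a fixed interval. Scheduling the next poll
// after each response (rather than using setInterval) guarantees
// requests never overlap when the API is slow.
const POLL_INTERVAL_MS = 60 * 1000;
let lastSeen = null; // naive change-detection state

async function poll() {
  try {
    const res = await fetch('https://api.example.com/data'); // placeholder URL
    const body = await res.text();
    if (body !== lastSeen) {
      lastSeen = body;
      console.log('new data available'); // react to new data here
    }
  } catch (err) {
    console.error('poll failed', err);
  } finally {
    setTimeout(poll, POLL_INTERVAL_MS); // schedule the next poll
  }
}

poll();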
Use case 1: Call a weather API for every city in the country and only save data to my db when it is going to rain in a given city.
Trying to pre-save every possible piece of data from an external API is probably a losing proposition. You're essentially trying to "scrape" all the data from the external API. That is likely against the terms of service and will likely also run afoul of rate limits. And, it's just not very practical.
Instead, you will probably want to fetch data on demand (when a client requests data for Phoenix, then, and only then, do you start collecting data for Phoenix). Once demand for a certain type of data (temperatures in a particular city) is established, you might want to pre-cache that data more regularly so you can notify clients of changes. If, after a while, no clients are asking for data from Phoenix, you stop requesting updates for Phoenix until a client establishes demand again.
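The demand-driven approach might look something like this sketch: track which cities clients have asked about recently, poll only those, and let interest expire. The names, intervals, and refreshWeather helper are all illustrative.

// Track which cities clients have asked about; poll only those, and
// drop a city once nobody has asked about it for a while.
const INTEREST_TTL_MS = 30 * 60 * 1000; // forget a city after 30 minutes
const interested = new Map();           // city -> timestamp of last client request

function recordDemand(city) {
  interested.set(city, Date.now());     // call whenever a client asks for a city
}

setInterval(() => {
  const now = Date.now();
  for (const [city, lastAsked] of interested) {
    if (now - lastAsked > INTEREST_TTL_MS) {
      interested.delete(city);          // demand expired: stop polling this city
    } else {
      refreshWeather(city);             // hypothetical per-city fetch-and-cache
    }
  }
}, 10 * 60 * 1000);                     // re-check every 10 minutes

function refreshWeather(city) {
  console.log('would fetch weather for', city); // placeholder
}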
I have tried setInterval and node-schedule and got it working, but isn't that expensive for the CPU? For example, over a day I would hit this endpoint to check for new data every minute, but it could have new data only every five minutes or more.
Making a remote network request is not a CPU intensive operation, even if you're doing it every minute. node.js uses non-blocking networking so most of the time during a network request, node.js isn't doing anything and isn't using the CPU at all. The only time the CPU would be briefly used is when you first send the API request and then when you receive back the result from the API call and need to process it.
Whether you really need to "poll" every minute depends upon the data and the needs of the client. I'd ask yourself if your app will work just fine if you check for new data every 5 minutes.
The method I would use to update would live outside of the code, in a scheduled batch/PowerShell/bash file. On Windows you can schedule tasks based on time of day or duration since the last run, so you could run a simple script that kills your application for five minutes, runs npm update, and then restarts your application before closing the shell.
That way you're staying out of the API and keeping code to a minimum, and if your update logic lives inside that Node package, it'll be there and ready once you make serious application changes or need to take the server down for maintenance and updates to the low-level code.
This is a lightweight solution, and it's a method I've used once or twice at my workplace. There are lots of options out there, and if this isn't what you're looking for I can keep looking for you.

What is the best way to keep a local copy of a Firebase Database in node.js

I have an app where I need to check people's posts constantly. I am trying to make sure that the server handles more than 100,000 posts. I have tried to explain the program below and numbered the issues I am worried about.
I am running a simple node.js program in my terminal that acts as a Firebase admin controlling the Firebase Database. The program has no connectivity with clients (users); it just keeps a local copy of the database to check users' posts every 2-3 seconds. I am keeping the posts in local hash variables by using on('child_added') to simply push each post into a posts hash, and likewise for on('child_removed') and on('child_changed').
1. Are these functions able to handle more than 5 requests per second?
2. Is this the proper way of keeping data locally for faster processing (and not abusing Firebase limits)? I need to check every post on the platform every 2-3 seconds, so I am trying to keep a local copy of the posts data.
That local copy of the posts is looped through every 2-3 seconds.
3. If there are thousands of posts, will a simple array variable handle that load?
Second part of the program:
I run a for loop over the posts in a function, and I run that function every 2-3 seconds using setInterval(). The program needs not only to check newly added posts; it constantly needs to check all posts in the database.
If (specific condition for a post) => the program changes the state of the post
.on('child_changed') => sends an API request to a website after that state change
4. Can this function run asynchronously? When it is called, the function should not wait for the previous call to finish, because the old call is sending an API request and might not complete quickly. How can I make sure that .on('child_changed') doesn't miss a single change on the posts data?
The Listen for Value Events documentation shows how to observe changes: you use the .on method.
In terms of backing up your Realtime Database, you can simply export the data manually, or, if you are on a paid plan, automate it.
I don't understand why you would want to reinvent the wheel, so to speak, and have your server ping Firebase for updates. Simply use Firebase observers.
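A minimal sketch of that observer approach with firebase-admin: keep the local copy in a Map and let the listeners maintain it, rather than re-reading on a timer. The credential path, database URL, and ref name are placeholders.

// Maintain a local mirror of /posts using Firebase observers.
// The listeners keep the Map current; no periodic re-fetching needed.
const admin = require('firebase-admin');

admin.initializeApp({
  credential: admin.credential.cert(require('./service-account.json')), // placeholder path
  databaseURL: 'https://your-project.firebaseio.com'                    // placeholder URL
});

const posts = new Map(); // local copy: postId -> post data
const ref = admin.database().ref('posts');

ref.on('child_added', (snap) => posts.set(snap.key, snap.val()));
ref.on('child_changed', (snap) => {
  posts.set(snap.key, snap.val());
  // React to the change here, instead of scanning every post on a timer.
});
ref.on('child_removed', (snap) => posts.delete(snap.key));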

Forcing a Redis snapshot / persistence via the SAVE command?

I am using the ioredis library for Node.js, and I am wondering how to send Redis a signal to force persistence. I am having a hard time finding out how to do this. The SAVE command seems to do it, but I can't verify that. Can anyone tell me for sure whether the SAVE command will tell Redis to write everything in memory to disk on command?
This article hints at it: https://community.nodebb.org/topic/932/redis-useful-info
So does this one: http://redis.io/commands/save
The answer is yes, SAVE will do the job for you, but it is synchronous: it blocks until the save is done, not letting other clients retrieve data in the meantime. As the docs say:
You almost never want to call SAVE in production environments where it
will block all the other clients
The better solution is BGSAVE: you can call BGSAVE and then check LASTSAVE, which returns the timestamp of the latest snapshot taken by the instance. http://redis.io/commands/lastsave
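With ioredis, both commands are available directly as methods (ioredis exposes a method per Redis command). A small sketch of triggering a background save and confirming it completed, assuming a local Redis on the default port:

// Trigger a non-blocking snapshot and confirm it completed by watching
// LASTSAVE, which returns the unix timestamp of the latest snapshot.
const Redis = require('ioredis');
const redis = new Redis(); // defaults to localhost:6379

async function forceSnapshot() {
  const before = await redis.lastsave();
  await redis.bgsave(); // returns immediately; the save runs in the background

  // Poll until LASTSAVE advances, i.e. the background save has finished.
  let after = before;
  while (after === before) {
    await new Promise((resolve) => setTimeout(resolve, 100));
    after = await redis.lastsave();
  }
  console.log('snapshot written at', new Date(after * 1000));
  redis.quit();
}

forceSnapshot().catch(console.error);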

How can I "break up" a long-running server-side function in a Meteor app?

I have, as part of a Meteor application, a server side that receives POST messages of information to feed to the web client via inserts/updates to a Collection. So far so good. However, sometimes these updates can be rather large (50K records at a go, every 5 seconds). I was having a hard time keeping up with this until I started using the batch-insert package and then the low-level batch.find.update() and batch.execute() from Mongo.
However, there is still a good amount of processing going on even with 50K records (it does some calculations, analytics, etc.). I would LOVE to be able to "thread" that logic so the main event loop can continue along. However, I am not sure there is a real easy way to create "real" threads for this within Meteor. So, barring that, I would like to know the best/proper way of at least "batching" the work so that every N (say 1K or so) records I can release the event loop back to process other events (like some client-side DDP messages and the like), then do another 1K records, and so on until however many records I need are done.
I am THINKING the solution lies in using Fibers/Futures, which appear to be the Meteor way, but I am not positive that is correct, or whether lower-level ideas like setTimeout() and/or setImmediate() are more appropriate.
TIA!
Meteor is not a one-size-fits-all tool. I think you should decouple your Meteor application from your batch processing. Set up a separate Meteor instance, or better yet set up a pure node.js server to handle these requests and batch processes. It would look like this:
Create a node.js instance that connects to the same Mongo database using the mongodb driver (https://www.npmjs.com/package/mongodb).
Use express in that node.js instance to handle the POST requests (https://www.npmjs.com/package/express).
Do the batch processing/inserts/updates in this instance.
The updates in Mongo will be reflected in Meteor very quickly. I had a similar situation and used a node server to do some batch data collection and then pass it into a Cassandra database. I then used Pig Latin to run some batch operations on that data, and then inserted it into Mongo. My Meteor application would reactively display the new data pretty much instantaneously.
You can call this.unblock() inside a server method to allow subsequent method calls from the same client to run in a new fiber, without waiting for this one to finish. See the example below.
Meteor.methods({
  longMethod: function() {
    this.unblock();
    Meteor._sleepForMs(1000 * 60 * 60);
  }
});
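Note that this.unblock() only prevents the method from blocking other method calls from the same client; a 50K-record loop will still monopolise the event loop while it runs. To actually yield between batches, as the question suggests, you can process in slices of ~1K and reschedule with setImmediate. A sketch (processOne is a placeholder for the per-record work):

// Process a large array in chunks, yielding the event loop between
// chunks so other events (DDP messages, timers) get a chance to run.
function processInChunks(records, chunkSize, processOne, done) {
  let i = 0;
  function nextChunk() {
    const end = Math.min(i + chunkSize, records.length);
    for (; i < end; i++) {
      processOne(records[i]); // per-record work (calculations, analytics)
    }
    if (i < records.length) {
      setImmediate(nextChunk); // yield, then continue with the next chunk
    } else {
      done();
    }
  }
  nextChunk();
}

// Usage sketch:
// processInChunks(records, 1000, function (r) { analyze(r); }, function () { console.log('done'); });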
