Forcing a Redis snapshot / persistence via SAVE command? - node.js

I am using the ioredis library for Node.js and I am wondering how to send Redis a signal to force persistence. I am having a hard time finding out how to do this. The SAVE command seems to do this, but I can't verify that. Can anyone tell me for sure whether the SAVE command will tell Redis to write everything in memory to disk on command?
This article hints at it: https://community.nodebb.org/topic/932/redis-useful-info
So does this one: http://redis.io/commands/save

The answer is yes, SAVE will do the job for you, but it is synchronous: it blocks until the snapshot is written, and other clients cannot retrieve data in the meantime. As the docs put it:
You almost never want to call SAVE in production environments where it
will block all the other clients
The better solution is BGSAVE, which forks a child process to write the snapshot while the server keeps serving clients. You can call BGSAVE and then check LASTSAVE, which returns the Unix timestamp of the latest successful snapshot taken from the instance: http://redis.io/commands/lastsave
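As a rough sketch of the BGSAVE-then-LASTSAVE approach: the helper below is hypothetical (waitForSnapshot, intervalMs and timeoutMs are made-up names) and only assumes a client that exposes bgsave() and lastsave() as promise-returning methods, which an ioredis instance does. LASTSAVE has one-second resolution, so if another snapshot finished within the same second the timestamp may not advance and the helper would time out.

```javascript
// Sketch: trigger BGSAVE, then poll LASTSAVE until the timestamp advances.
// `client` is assumed to be an ioredis instance (or anything with the same
// bgsave()/lastsave() methods). All names here are illustrative.
async function waitForSnapshot(client, { intervalMs = 500, timeoutMs = 30000 } = {}) {
  const before = Number(await client.lastsave()); // timestamp of the previous snapshot
  await client.bgsave();                          // returns as soon as the save has started
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const latest = Number(await client.lastsave());
    if (latest > before) return latest;           // a newer snapshot has been written
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for BGSAVE to complete');
}
```

With ioredis this would be called as something like `await waitForSnapshot(new Redis())`, where the connection settings are whatever your application already uses.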

Related

How do I automate redis flushall when it runs out of memory?

My Redis db runs out of memory after 2-3 days, and I flush it manually. But I want to run FLUSHALL automatically from Node.js.
Read about Redis' maxmemory-policy and then choose one of the allkeys-* policies.
You should first look at why it happens in the first place; maybe you could expire keys using a TTL.
Another option would be to set up a cron job to flush Redis once in a while (not recommended).
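The first two suggestions (an eviction policy and key expiry) might look roughly like this from Node.js. This is only a sketch: enableEviction and setWithTtl are hypothetical helper names, the 256mb limit and allkeys-lru policy are placeholder values, and `client` is assumed to be an ioredis instance.

```javascript
// Hypothetical helpers; `client` is assumed to be an ioredis instance.
async function enableEviction(client, maxmemory = '256mb', policy = 'allkeys-lru') {
  // CONFIG SET only changes the running server; put the same settings in
  // redis.conf if they should survive a restart.
  await client.config('SET', 'maxmemory', maxmemory);
  await client.config('SET', 'maxmemory-policy', policy);
}

async function setWithTtl(client, key, value, ttlSeconds) {
  // SET key value EX seconds: Redis removes the key once the TTL elapses.
  return client.set(key, value, 'EX', ttlSeconds);
}
```

With an allkeys-* policy in place Redis evicts keys on its own when it reaches maxmemory, so no manual or scheduled FLUSHALL is needed.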

Do I really need to call client.shutdown() when finished with Cassandra in Node.js script?

I've been trying to find information about Cassandra sessions relating to the Node.js cassandra-driver by Datastax. I read something which said that cassandra-driver automatically manages a session and that I don't need to call client.shutdown().
I'm looking for general information about how cassandra-driver manages sessions, how I can see all active Cassandra sessions, and whether I need to call shutdown() or whether that is counterproductive, since the session would have to be reopened every time the script is run.
Based on "pm2 info" I don't see a ton of active handles, so I don't think anything wrong is going on, but I may be mistaken. RAM usage does seem a bit high for a small script (85 MB).
In the DataStax drivers, Session is a stateful object that handles a pool of connections and is aware of the status of the nodes in the cluster at any time (which avoids sending requests to unavailable nodes). TCP sockets are opened, and it is a best practice to close them when you don't need them anymore. See here for more info: https://docs.datastax.com/en/developer/nodejs-driver-dse/2.1/features/connection-pooling/
Now session.connect() may take a bit of time: the more nodes you have in your cluster, the longer it takes to open connections to every single one. This is why it is better to initialize connections during a "cold start" when you work with FaaS (avoiding an open/close for each request).
So:
Always close your connections (shutdown()) when you don't need them anymore (use a shutdown hook in your application).
Keep your connections "alive" as long as you need them; do not shut down after each request, because this is NOT stateless.
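A shutdown hook along those lines could be sketched as follows. registerShutdownHook is a hypothetical helper, not part of cassandra-driver; it only assumes the client has a shutdown() method that returns a promise, as the driver's Client does. The `proc` parameter exists so the hook can be exercised without signalling the real process.

```javascript
// Hypothetical helper: closes the client once when the process is asked to stop.
function registerShutdownHook(client, proc = process) {
  let closing = null; // cached so shutdown() is called only once
  const close = () => {
    if (!closing) closing = Promise.resolve(client.shutdown());
    return closing;
  };
  for (const signal of ['SIGINT', 'SIGTERM']) {
    // Exit only after the driver has closed its pooled connections.
    proc.once(signal, () => close().finally(() => proc.exit(0)));
  }
  return close; // can also be awaited at the end of a one-off script
}
```

For a short script, awaiting the returned close() function as the last step covers the normal exit path, while the signal handlers cover pm2 restarts and Ctrl+C.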
Yes, it is "better" to connect the client outside of the handler function, to keep it stateful.
However, on AWS Lambda with Node.js, function execution by default continues until the event loop is empty or the function times out.
So create the client outside of the handler, set context.callbackWaitsForEmptyEventLoop = false, and don't call client.shutdown().
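A minimal sketch of that Lambda pattern, under stated assumptions: createHandler is a hypothetical factory, the query text and the `event.id` field are placeholders, and `client` stands for a cassandra-driver Client built once at module scope so that warm invocations reuse it.

```javascript
// Hypothetical factory. In a real function, `client` would be created once at
// module scope (outside the handler) and the returned handler exported.
function createHandler(client, query) {
  let connected = null; // cached connect() promise, reused across warm invocations
  return async function handler(event, context) {
    // Do not keep the invocation open just because pooled sockets are still alive.
    // (This flag mainly matters for callback-style handlers.)
    context.callbackWaitsForEmptyEventLoop = false;
    if (!connected) connected = client.connect();
    await connected;
    const result = await client.execute(query, [event.id], { prepare: true });
    return result.rows;
  };
}
```

Note that there is no shutdown() call: the connection pool is deliberately left open for the next invocation of the same container.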

knex migration error in node js app

I am using knex to connect to Postgres in my application. I am getting the following error when I run
knex migrate:latest
TimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
at Timeout._onTimeout
Referring to some threads, I understand that I have to add a transacting call, but do I need to add it to all the SQL calls in my app?
The documentation does not give me details about when to add it or why it is a must. My queries are mostly of type "GET", hence I am not sure whether those queries need transacting.
It seems like a library bug, probably.
Generally speaking, every operation, including SELECT, runs in a transaction and can involve read locking. The DB organizes the resource locking sequence according to the transaction isolation level setting, and READ COMMITTED is the most common default. In engines that take shared read locks, rows in a table cannot be deleted while a user is reading them: the delete (exclusive lock) waits until the select (shared read lock) releases them, even if we never issued an explicit begin transaction.
For this reason, most database connection libraries support an "auto commit" option that automatically wraps each request in a transaction when no explicit transaction has been started (or the DBMS supports it natively as a session option), so every request runs in a transaction block.
Knex does not seem to have this option explicitly. From what I can find, it may differ between DBMS dialects: while reading the code, I found that the Oracle implementation has it, but the PostgreSQL implementation does not have auto commit. It looks incomplete to me.
The documentation also says you can run a select query without a transacting call. If that leaks many open sessions, then it's obviously a bug. Please file a bug report with sample code that reproduces the issue.
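To illustrate the difference between a plain query and one that belongs to a transaction, here is a small sketch. It assumes `db` is a configured knex instance; the `users` table, its columns, and both function names are hypothetical.

```javascript
// `db` is assumed to be a configured knex instance; table and column names are made up.
async function getUser(db, id) {
  // A plain read: knex acquires a pooled connection, runs the query, and releases it.
  return db('users').where({ id }).first();
}

async function renameUser(db, id, name) {
  // A multi-statement write: every query built from `trx` shares one connection.
  // knex commits when the callback resolves and rolls back if it throws.
  return db.transaction(async (trx) => {
    const user = await trx('users').where({ id }).forUpdate().first();
    if (!user) throw new Error(`No user with id ${id}`);
    await trx('users').where({ id }).update({ name });
    return { ...user, name };
  });
}
```

In this style only the statements that must succeed or fail together go through `trx`; ordinary read queries do not need a transacting call. A transaction that is opened but never committed or rolled back keeps its connection, which is one way a pool can fill up.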
You could also inspect which queries are pending from the database side. All modern database systems can list sessions and their locking status. I suppose you have mixed plain select calls with transacting() calls, so that the plain select calls got appended to an uncommitted open transaction. You can watch what is happening with the DB's admin features.
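On PostgreSQL, one way to look at sessions from the database side is the pg_stat_activity view. The helper below is a hypothetical sketch (listActiveSessions is a made-up name) and assumes `db` is a knex instance using the pg client, for which raw() resolves to a result object with a `rows` array.

```javascript
// Hypothetical helper; `db` is assumed to be a knex instance using the pg client.
async function listActiveSessions(db) {
  const result = await db.raw(
    `SELECT pid, state, wait_event_type, xact_start, query
       FROM pg_stat_activity
      WHERE datname = current_database()
        AND pid <> pg_backend_pid()
      ORDER BY xact_start NULLS LAST`
  );
  return result.rows; // one row per session connected to this database
}
```

Sessions whose state is `idle in transaction` have started a transaction and not yet finished it, which is the situation to look for when pooled connections are held and never released.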

How can I "break up" a long running server side function in a Meteor app?

I have, as part of a Meteor application, a server side that gets POST messages of information to feed to the web client via inserts/updates to a Collection. So far so good. However, sometimes these updates can be rather large (50K records a go, every 5 seconds). I was having a hard time keeping up with this until I started using the batch-insert package and then the low-level batch.find.update() and batch.execute() from Mongo.
However, there is still a good amount of processing going on even with 50K records (it does some calculations, analytics, etc). I would LOVE to be able to "thread" that logic so the main event loop can continue along. However, I am not sure there is a really easy way to create "real" threads for this within Meteor. So barring that, I would like to know the best / proper way of at least "batching" the work so that every N (say 1K or so) records I can release the event loop to process other events (like some client side DDP messages and the like). Then do another 1K records, etc. until all the records I need are done.
I am THINKING the solution lies within using Fibers/Futures -- which appear to be the Meteor way -- but I am not positive that is correct, or whether low-level ideas like "setTimeout()" and/or "setImmediate()" are more appropriate.
TIA!
Meteor is not a one size fits all tool. I think you should decouple your meteor application from your batch processing. Set up a separate meteor instance, or better yet set up a pure node.js server to handle these requests and batch processes. It would look like this:
Create a node.js instance that connects to the same mongo database using the mongodb plugin (https://www.npmjs.com/package/mongodb).
Use express if you're using node.js to handle the post requests (https://www.npmjs.com/package/express).
Do the batch processing/inserts/updates in this instance.
The updates in Mongo will be reflected in Meteor very quickly. I had a similar situation and used a node server to do some batch data collection and then pass it into a Cassandra database. I then used Pig Latin to run some batch operations on that data, and then inserted it into Mongo. My Meteor application would reactively display the new data pretty much instantaneously.
You can call this.unblock() inside a server method so that subsequent method calls from the same client can start running without waiting for this method to finish. See example below.
Meteor.methods({
  longMethod: function() {
    this.unblock();
    Meteor._sleepForMs(1000 * 60 * 60);
  }
});
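Separately from this.unblock(), the "do 1K records, release the event loop, repeat" idea from the question can be sketched with setImmediate. This is plain Node.js rather than anything Meteor-specific; processInChunks and its parameters are hypothetical names, and code that relies on Meteor's fiber environment may need extra care around the yield.

```javascript
// Sketch: process a large array in chunks, yielding to the event loop between chunks.
async function processInChunks(records, processOne, chunkSize = 1000) {
  for (let start = 0; start < records.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, records.length);
    for (let i = start; i < end; i++) processOne(records[i], i);
    // Yield so pending I/O callbacks (for example incoming client messages)
    // can run before the next chunk starts.
    await new Promise((resolve) => setImmediate(resolve));
  }
  return records.length;
}
```

The chunk size trades throughput against latency: smaller chunks give other events more frequent turns, at the cost of more scheduling overhead.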

How to connect pinoccio to apache couchdb

Is there anyone using the nice pinoccio from www.pinocc.io ?
I want to use it to post data into an Apache CouchDB using node.js. So I'm trying to poll data from the pinoccio API, but I'm a little lost:
schedule the polls
do long polls
do a completely different approach
Any ideas are welcome
Pitt
Sure. I wrote the Pinoccio API, here’s how you do it
https://gist.github.com/soldair/c11d6ae6f4bead140838
This example depends on the pinoccio npm module ~0.1.3 so make sure to npm install again to pick up the newest version.
You don't need to poll, because pinoccio will send you changes as they happen if you have an open connection to either "stats" or "sync". If you want to poll you can, but it's not "real time".
sync gives you the current state and streams changes as they happen, so it's perfect if you only need to save the changes to your troop while your script is running, or to show the current and last known state on a web page.
The solution that replicates every data point we store is stats. This is the example provided. Stats lets you read everything that has happened to a scout. Digital pins, for example, are the "digital" report. You can ask for data from a specific point in time or just from the current time (default). Changes to this "digital" report will continue streaming live as they happen, until the "end" time is reached, or if "tail" equals 0 in the options passed to stats.
Hope this helps. I tested the script on my local couch and it worked well. You would need to modify it to copy more stats from each scout. I hope that soon you will be able to request multiple reports from multiple scouts in the same stream. I just have some bugs to sort out ;)
You need to look into 2 dimensions:
node.js talking to CouchDB. This is well understood and there are some questions you can find here.
Getting the data from the pinoccio. The API suggests that as long as the connection is open, you get data. So use a short timeout and a loop. You might want to run your own node.js instance for that.
Interesting fact: the CouchDB team seems to be working on replacing their internal JS engine with node.js.
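For the first dimension (node.js talking to CouchDB), storing one document comes down to an HTTP POST to the database URL. The sketch below is hypothetical: saveReport, couchUrl and dbName are placeholder names, the `receivedAt` field is an illustrative addition, and it assumes a runtime with a global fetch (Node 18 or later) unless another implementation is passed in. It does not cover the pinoccio side at all.

```javascript
// Hypothetical helper: stores one report as a CouchDB document over HTTP.
async function saveReport(couchUrl, dbName, report, fetchImpl = fetch) {
  const response = await fetchImpl(`${couchUrl}/${encodeURIComponent(dbName)}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // receivedAt is an illustrative extra field, not something CouchDB requires.
    body: JSON.stringify({ ...report, receivedAt: new Date().toISOString() }),
  });
  if (!response.ok) throw new Error(`CouchDB responded with status ${response.status}`);
  return response.json(); // on success CouchDB replies with { ok, id, rev }
}
```

Each report coming off the open pinoccio stream could then be handed to a helper like this one as it arrives.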
