Configuring the MongoDB Node.js driver to use snappy compression

We've recently upgraded our MongoDB replica set to v3.4.4; I notice that this release now supports compressed network communication with snappy. I've configured the set members to enable it, so they compress traffic between each other, and the mongo shell seems to support it natively, but I can't find any documentation on how to set up our Node.js clients to use it.
Everything still works fine, of course, since communication falls back to uncompressed wherever the client and server can't agree on a compressor, but it would be good to be able to take advantage of this for obvious reasons.
Has anyone else played with this or had any luck with it?
It does appear that MongoDB has an open ticket on this, but I wondered if anyone else had made any progress independently of that.

You have two options to enable compression:
1) Driver option
Set the driver option when you initialize the MongoClient. The exact syntax differs between drivers, so check your driver's docs; for example:
"options": {
"compression": [
"zstd",
"snappy",
"zlib"
]
}
2) Connection string
Add the compressors argument to your connection string, for example:
mongodb://example.com/?compressors=zstd,snappy,zlib
Note
The examples above set multiple compressors to better illustrate the syntax. If you want to set only one compressor, adjust accordingly.
You can find more examples of how to set compressors in the MongoDB Node.js driver test scripts.
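Putting this together for the Node.js driver, here is a minimal sketch; the hostname is a placeholder, and snappy support assumes the optional snappy npm package is installed alongside the driver:
// Minimal sketch: enable network compression in the Node.js driver.
const { MongoClient } = require('mongodb');

// Option 1: pass compressors as a client option
const client = new MongoClient('mongodb://example.com', {
  compressors: ['snappy', 'zlib'],
});

// Option 2: equivalently, in the connection string
// const client = new MongoClient('mongodb://example.com/?compressors=snappy,zlib');

async function main() {
  await client.connect();
  // The client and server negotiate the first compressor both support.
  await client.close();
}

main().catch(console.error);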
You can verify that network compression is being used, and which compression algorithm, in the mongod logs, as each compression/decompression generates a verbose-level log entry. Or you can run db.serverStatus()['network'] and compare the logical bytesIn/bytesOut against the physical bytes and the per-compressor counters, e.g.:
{
    "bytesIn" : NumberLong("1828061251"),
    "bytesOut" : NumberLong("57900955809"),
    "physicalBytesIn" : NumberLong("2720120753"),
    "physicalBytesOut" : NumberLong("32071382239"),
    "numRequests" : NumberLong("570858"),
    "compression" : {
        "snappy" : {
            "compressor" : {
                "bytesIn" : NumberLong("2215000774"),
                "bytesOut" : NumberLong("752759260")
            },
            "decompressor" : {
                "bytesIn" : NumberLong("226402961"),
                "bytesOut" : NumberLong("848171447")
            }
        }
    },
    "serviceExecutorTaskStats" : {
        "executor" : "passthrough",
        "threadsRunning" : 80
    }
}
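If you'd rather check from the client side, a small sketch (reusing the client from the sketch above, inside an async function, and assuming a user allowed to run serverStatus):
// Print the per-compressor counters shown above
const status = await client.db('admin').command({ serverStatus: 1 });
console.log(status.network.compression);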

Related

Searching huge number of keywords in terms filter using bool query crashes the terminal

I was able to search for a couple of keywords in 2 different fields using the code below:
curl -XGET 'localhost:9200/INDEXED_REPO/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "should": [
            { "terms": { "description": ["heart", "cancer"] } },
            { "terms": { "title": ["heart", "cancer"] } }
          ]
        }
      }
    }
  }
}'
However, when I put in 15,000 keywords, the server suddenly closed my terminal. I am using MobaXterm. What is the best solution for including this many keywords?
There is a limit on the maximum number of clauses you can use in a bool query. You can raise it, but doing so increases the server's CPU usage and might cause it to crash. If you didn't get a "max clauses reached" error, you may have already crashed the server.
I would find an optimal number that your server can take (see the batching sketch below), or if it's absolutely necessary to search for all of them at once, upgrade the server, set up extra nodes, and configure shards/replicas properly.
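One way to stay under the clause limit is to split the keyword list into batches and merge the hits client-side. A rough Node.js sketch (assuming Node 18+ for the built-in fetch; the index and field names come from the question, and the batch size is illustrative):
// Split the keywords into batches small enough to stay under
// indices.query.bool.max_clause_count, then merge the hits.
const keywords = [/* ...15000 terms... */];
const BATCH = 500; // pick a size your cluster handles comfortably

async function searchBatch(terms) {
  const res = await fetch('http://localhost:9200/INDEXED_REPO/_search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      query: {
        constant_score: {
          filter: {
            bool: {
              should: [
                { terms: { description: terms } },
                { terms: { title: terms } },
              ],
            },
          },
        },
      },
    }),
  });
  return (await res.json()).hits.hits;
}

async function searchAll() {
  const hits = [];
  for (let i = 0; i < keywords.length; i += BATCH) {
    hits.push(...await searchBatch(keywords.slice(i, i + BATCH)));
  }
  return hits;
}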
To allow more bool clauses, add the following to your elasticsearch.yml file:
indices.query.bool.max_clause_count: n
(where n is the new maximum number of clauses)
Refer to these for more details
Max limit on the number of values I can specify in the ids filter or generally query clause?
Elasticsearch - set max_clause_count
Also, Cerebro is a better alternative to MobaXterm for this; you can download it from https://github.com/lmenezes/cerebro. It gives you a nice interface to play with the queries before finalizing them in your code.

How can I get rid of "result", "insertedCount", and "insertedIds" from MongoDB callback and node and get only an array of database objects?

I just updated Node after not doing so for a while and had to reinstall MongoDB and other modules. Where I previously would get only an array of database objects when using the find() function, I'm now getting a JSON object that includes "result", "ops", "insertedCount", and "insertedIds". I can't remember what I might have done when I initially set it up, or maybe this is just an annoying change with Mongo, but I'd like to return to getting only an array of database objects so that I don't have to retest my entire server. I've tried several npm parse modules with no success.
Here's an example:
{ result: { ok: 1, n: 1 },
  ops:
   [ { user: '595ee2fec2924e5435dfdd2d',
       _id: 595f0fe55e84fa2468b17ce8 } ],
  insertedCount: 1,
  insertedIds: [ 595f0fe55e84fa2468b17ce8 ] }
Whereas previously, it would have only returned:
[ { user: '595ee2fec2924e5435dfdd2d',
    _id: 595f0fe55e84fa2468b17ce8 } ]
You can simply get the ops array.
result.ops;
Also make sure you're following your call stack correctly, as those objects are only returned on an insert, not on a find().
You can take the resulting object and strip it down to just the array by accessing the ops attribute.
runMyQuery().then(function (res) {
  return res.ops;
});
The previous example makes a lot of assumptions, so don't expect a copy-paste solution.
The most correct solution would be to continue running your project with the exact versions of everything your package.json depends on.
That said, I'm assuming you're encountering this issue because you're running Node locally on your system and needed to upgrade it for another project or a security fix. If this is the case, you may want to consider using a Node version management tool like nvm or nodenv. These allow you to have multiple versions of Node installed and associate them with individual projects so you don't run into compatibility issues.
For an even more powerful variation of this, you might want to virtualize your entire development environment using a virtual machine such as VirtualBox or a container system like Docker. Both let you create files that define how you'd like to provision your VMs or containers (what OS, what versions of software are installed, etc.). They're the most robust way of ensuring that when you come back to a project months, or even years later, it will still run exactly the way you left it.
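For example, with nvm you can pin the Node version per project (the version number here is purely illustrative):
echo "6.11.1" > .nvmrc   # record the project's Node version
nvm install              # installs the version named in .nvmrc
nvm use                  # switches the current shell to it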
Sounds like you inadvertently upgraded from the MongoDB 3.x driver to the MongoDB 4.x driver. The insert operations now return a much more developer-friendly "InsertManyResult" object. Unfortunately, as far as I can see, this change did not make it into the documentation on breaking changes.
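If you are on the 4.x driver, where ops was removed, one way to get the inserted documents back is to rely on the driver adding _id to the documents you pass in. A sketch, assuming an existing collection handle and an async context:
// 4.x driver: insertMany resolves to { acknowledged, insertedCount,
// insertedIds } and mutates the input docs to add their _id fields.
const docs = [{ user: '595ee2fec2924e5435dfdd2d' }];
const res = await collection.insertMany(docs);
console.log(res.insertedCount); // 1
console.log(docs);              // the original array, now with _id set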

Partial Update Document without script Elasticsearch

I am using the following code for partial update
POST /website/blog/1/_update
{
  "script" : "ctx._source.views+=1"
}
Is there any alternative way I can achieve the same thing? I don't want to change anything in the Groovy scripting settings, because the last time I changed them my server was compromised.
So please help me with a solution, or with some security measures if there is no workaround.
No, you cannot dynamically change a field value without using a script.
You can use file-based scripts though, which means that you can disable dynamic scripting (default in ES 1.4.3+) while still using scripting in a safe, trusted way.
config/
  elasticsearch.yml
  logging.yml
  scripts/
    your_custom_script.groovy
The script file, your_custom_script.groovy, would contain:
ctx._source.views += your_param
Once stored, you can then access the script by name, which bypasses dynamic scripting.
POST /website/blog/1/_update
{
  "script": "your_custom_script",
  "params" : {
    "your_param" : 1
  }
}
Depending on the version of Elasticsearch, the script parameter is better named (e.g., ES 2.0 uses "inline" for dynamic scripts), but this should get you off the ground.
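For instance, on ES 2.x the file-based variant names the file explicitly. A sketch issued from Node.js (assuming Node 18+ for the built-in fetch, reusing the hypothetical names from above, inside an async function):
// ES 2.x-style body: "file" refers to config/scripts/your_custom_script.groovy
const res = await fetch('http://localhost:9200/website/blog/1/_update', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    script: {
      file: 'your_custom_script',
      params: { your_param: 1 },
    },
  }),
});
console.log(await res.json());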

Puppet Servers of same type

I have a best practice question around Puppet when working in server/agent mode.
I have created a working solution using a manifest/sites.pp configuration that identifies the configuration using the hostname of the agent.
For example:
node 'puppetagent.somedomain.com' {
  include my_module

  notify { 'agent configuration applied': }
}
This works great for configuring a single node, but what if I had a scenario with multiple application servers, all with differing hostnames, all of which needed the same configuration?
Adding multiple node entries, a comma-separated hostname list, or regular expressions doesn't feel like the 'right' way to do this.
Are there alternative ways? Can you define node 'types'? What do the community consider best practice for this?
Many thanks
If all the servers have the same configuration, inheritance or the Hiera hierarchy are the easiest ways to achieve this.
Once you need to maintain a larger set of systems where certain nodes have types such as 'web server' or 'database server' the configurations will diverge and the single inheritance model is not entirely sufficient.
You can use composition in those places. Take a peek at this article for more details.
Regular expressions might not be so bad, but I suppose the current trend is to use hiera_include.
You can do something dirty like this:
$roles = { 'webserver' => [ 'server1', 'server2', 'server3' ]
         , 'smtp'      => [ 'gw1', 'gw2' ]
         }

node default {
  $roles.filter |$k, $v| { $hostname in $v }
        .each |$k, $v| { hiera_include($k) }
}
I would suggest taking a look at the concept of "roles and profiles" here: http://www.craigdunn.org/2012/05/239/
You can have multiple nodes, all of which get the same configuration, by assigning each of them a "role" that includes one or more "profiles".
As for defining multiple nodes with the same configuration, I would suggest using hiera_include like #bartavelle mentioned, except using a common environment variable for identifying the nodes rather than regular expressions. A minimal sketch of the roles-and-profiles layout follows.
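A minimal sketch of that layout (the class names are hypothetical):
# profile: wraps the actual implementation details
class profile::appserver {
  include my_module
}

# role: composes one or more profiles into one kind of server
class role::appserver {
  include profile::appserver
}

# every application server gets the same role, regardless of hostname
node default {
  include role::appserver
}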

How to generate CouchDB UUID with Node.js?

Is there a way to generate random UUID like the ones used in CouchDB but with Node.js?
There are different ways to generate UUIDs. If you are already using CouchDB, you can just ask CouchDB for some like this:
http://127.0.0.1:5984/_uuids?count=10
CouchDB has three different UUID generation algorithms. You can specify which one CouchDB uses in the CouchDB configuration as uuids/algorithm. There could be benefits to asking CouchDB for UUIDs; specifically, if you are using the sequential generation algorithm, the UUIDs you get from CouchDB will fall into that sequence.
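In Node, a tiny sketch of that approach (assuming Node 18+ for the built-in fetch and CouchDB on its default port, inside an async function):
// Ask a running CouchDB for ten server-generated UUIDs
const res = await fetch('http://127.0.0.1:5984/_uuids?count=10');
const { uuids } = await res.json();
console.log(uuids); // an array of ten 32-character hex strings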
If you want to do it in node.js without relying on CouchDB, then you'll need a UUID function written in JavaScript. node-uuid is a JavaScript implementation that supports "Version 4" (random numbers) or "Version 1" (timestamp-based) UUIDs. It works in node.js or hosted in a browser: https://github.com/broofa/node-uuid
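For example, with node-uuid's modern successor, the uuid package (the dash-stripping is just an assumption to mimic CouchDB's 32-character hex IDs):
// Generate a random (version 4) UUID locally, then strip the dashes so it
// resembles the hex strings CouchDB's /_uuids endpoint returns.
const { v4: uuidv4 } = require('uuid');

const couchStyleId = uuidv4().replace(/-/g, '');
console.log(couchStyleId); // 32 hex characters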
If you're on Linux, there is also a JavaScript wrapper for libuuid, called uuidjs. There is a performance comparison to node-uuid in the README of node-uuid.
If you want to do something and it doesn't look like it's supported in node.js, be sure to check the modules available on npm.
I had the same question and found that simply passing null for the CouchDB id in the insert call also did the trick:
var newdoc = {
  "foo": "bar",
  "type": "my_couch_doctype"
};

mycouchdb.insert(newdoc, null /* <- let couchdb generate for you. */, function (err, body) {
});
