I'm designing a Riak cluster at the moment and wondering whether it is possible to hint to Riak that a specific set of keys should be placed on a single node of the cluster.
For example, there is some private data for a user that only she is able to access. This data consists of ~10k documents (too large to be kept in one key/document), and to serve one page we need to retrieve ~100 of them. It would be better to keep the whole set on a single node and have the application on the same instance to make this faster.
AFAIK it is easy on Cassandra: just use OrderedPartitioner and keys like this: <hash(username)>/<private data key>. That way, almost all of a user's keys will be kept on a single node.
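For illustration, a rough sketch of the key scheme I have in mind (the function, hash choice, and names here are purely hypothetical):

```python
import hashlib

def private_data_key(username: str, doc_id: str) -> str:
    # Prefix every key with a hash of the username so that, under an
    # order-preserving partitioner, all of one user's keys sort together
    # and therefore end up on (mostly) the same node.
    user_prefix = hashlib.md5(username.encode("utf-8")).hexdigest()
    return f"{user_prefix}/{doc_id}"

print(private_data_key("alice", "doc-0042"))
```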
One of the points of using Riak is that your data is replicated and evenly distributed throughout the cluster, thus improving your tolerance for network partitions and outages. Placing data on specific nodes goes against that goal and increases your vulnerability.
Related
Consider a growing amount of data, and let's choose between two extreme options:
Evenly distribute all data across all nodes in the cluster
Pack the data onto as few nodes as possible
I prefer option 1 because as the volume of data grows, we can spread it across all nodes, so that each node carries the lowest possible load when queried.
However, some resources state that we shouldn't query all the nodes because that will slow down the query. Why would that slow the query? Isn't that just normal scatter-gather? They even claim this hurts linear scalability, as adding more nodes will further drag down the query.
(Maybe I am missing something about how Cassandra performs the query; a background reference is appreciated.)
On the other hand, some resources state that we should go with option 2 because it queries the fewest nodes.
Of course there are no black-and-white choices here; everything has a tradeoff.
I want to know the real difference between option 1 and option 2 and, regarding network querying, why option 1 would be slow.
I prefer option 1 because as the volume of data grows, we can spread it across all nodes, so that each node carries the lowest possible load when queried.
You definitely want to go with option #1. It is also preferable in that new or replacement nodes will bootstrap (stream data) much faster than in a cluster made of fewer, denser nodes.
However, some resources state that we shouldn't query all the nodes because that will slow down the query.
And those resources are absolutely correct. First of all, if you read through the resources which Alex posted above you'll discover how to build your tables so that your queries can be served by a single node. Running queries which only hit a single node is the best way around that problem.
Why would that slow the query?
Because in a distributed database environment, query time becomes network time. There are many people out there who like to run multi-key or unbound queries against Cassandra. When that happens, and the query is unable to find a single node with the data, Cassandra picks one node to designate as a "coordinator."
That node builds the result set with data from the other nodes. Which means in a 30 node cluster, that one node is now pulling data from the other 29. Assuming that these requests don't time-out, the likelihood that the coordinator will crash due to trying to manage too much data is very high.
The bottom line is that this is one of those tradeoffs between a CA relational database and an AP partitioned row store. Build your tables to support your queries, store data together which is queried together, and Cassandra will perform just fine.
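For example, here is a minimal sketch of "store data together which is queried together", assuming the DataStax Python driver and made-up keyspace, table, and column names:

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("myks")

# All documents of one user share a partition key, so a whole page of them
# lives together and is served by the replicas of that single partition.
session.execute("""
    CREATE TABLE IF NOT EXISTS user_documents (
        username text,
        doc_id   timeuuid,
        body     text,
        PRIMARY KEY ((username), doc_id)
    )
""")

# Single-partition query: the coordinator only talks to the replicas owning
# this one partition, not to every node in the cluster.
rows = session.execute(
    "SELECT doc_id, body FROM user_documents WHERE username = %s LIMIT 100",
    ("alice",),
)
```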
I am working on a highly I/O-intensive application (seat selection based on availability) using the MERN stack.
The app is expected to get 2000 concurrent users.
I want to know whether it's wise to use two instances of MongoDB, one in RAM (in-memory) and another on the hard drive.
The in-memory instance would store the available seats.
The hard-drive instance would back up the data at regular intervals.
But at the same time I know that if the server crashes, my MongoDB data in RAM is lost.
Could anyone guide me please?
I am using Socket IO instead of AJAX...
I don't think you need this. You can get a good server, with a good amount of RAM, and if you create your indexes correctly, everything should work fine.
Also Mongo 3 won't lock the entire database on each update, like Mongo 2 used to do.
I believe the best approach would be using something like Memcached in order to improve reads. Also, in order to improve database performance and have automated failover use sharding and replica sets.
Consider also that you would have headaches when your server restarts and you lose your data...
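As an illustration of the cache-in-front-of-MongoDB suggestion above, here is a rough cache-aside sketch; pymemcache, PyMongo, and all collection/key names here are my own assumptions, not anything from your stack:

```python
from pymemcache.client.base import Client as MemcacheClient
from pymongo import MongoClient
import json

cache = MemcacheClient(("localhost", 11211))
seats = MongoClient()["booking"]["seats"]

def available_seats(event_id: str):
    key = f"seats:{event_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    # Cache miss: read from MongoDB, then cache briefly since availability changes often.
    docs = list(seats.find({"eventId": event_id, "available": True}, {"_id": 0}))
    cache.set(key, json.dumps(docs).encode("utf-8"), expire=2)
    return docs
```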
This seems unnecessary, because MongoDB already behaves exactly like that out-of-the-box.
The old engine (MMAPv1) was using memory-mapped files, which means that if you have as much RAM as you have data, it practically behaves like an in-memory database with automatic hard-drive backing.
The new engine (WiredTiger) works a bit differently in detail, but the same in general. It allows you to set a cache size (config key storage.wiredTiger.engineConfig.cacheSizeGB). When the cache is large enough, you again have an in-memory database with automatic hard-drive mirroring.
More about that in the storage FAQ.
What you are talking about is a scaling problem. You have two options when it comes to scaling: add more of the resource causing the bottleneck to your existing setup (more RAM and faster disks, usually), or expand your setup. You should first add resources, almost up to the point where adding resources no longer gives you a corresponding bang for the buck.
At some point, this "scaling up" will not be feasible any more and you have to distribute the load amongst more nodes.
MongoDB comes with a feature for distributing load amongst (logical) nodes: sharding.
Basically, it works like this: multiple replica sets each form a logical node called a shard. Each shard in turn holds only a subset of your data. Instead of connecting to the shards directly, you access your data via a mongos query router, which is aware of which shard holds the data to answer the query and where to write new data.
By carefully selecting your shard key, your reads and writes should be evenly distributed between the shards.
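For a concrete (but hypothetical) illustration, here is a sketch of setting that up through a mongos with PyMongo; the database, collection, and shard-key names are made up:

```python
from pymongo import MongoClient

# Connect to a mongos query router, not to an individual shard.
client = MongoClient("mongodb://mongos-host:27017")

# Shard the bookings database and distribute the seats collection
# by a hashed key so reads and writes spread evenly across shards.
client.admin.command("enableSharding", "bookings")
client.admin.command(
    "shardCollection",
    "bookings.seats",
    key={"eventId": "hashed"},
)
```

A hashed shard key trades range-query efficiency for an even write distribution; whether that is the right call depends on your query pattern.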
Side note: putting production data on a standalone instance instead of a replica set crosses the border of negligence in my book. Given the prices of today's (rented) hardware, it has never been easier to eliminate a single point of failure than with a MongoDB replica set.
How can I use the ByteOrderedPartitioner (BOP) to force specific key values to be partitioned according to a custom requirement? I want to force Cassandra to partition and replicate data according to custom requirements, without introducing a custom partitioner. How far can I control this behavior, and how?
Overall: I want data starting with a particular ID to be on a predefined node, because I know that data will be accessed heavily from that node. I would also like the data to be replicated to nearby nodes.
I want data starting with a particular ID to be on a predefined node, because I know that data will be accessed heavily from that node.
It looks like you are talking about the data locality problem, which is really important in big-data computations (Spark, Hadoop, etc.). But the general approach there is not to pin data to a specific node; it is to move the whole computation to the data itself.
Pinning data to a specific node may cause problems like:
what should you do if your node goes down?
how evenly will the data be distributed among the cluster? Will there be any hotspots/bottlenecks because of node over- or under-utilization?
how can you scale your cluster in future?
Moving computation to the data has no issues with these questions, but the approach you are going to choose does.
Found the answer here...
http://www.mail-archive.com/user%40cassandra.apache.org/msg14997.html
By changing the "initial_token" setting in the cassandra.yaml file, we can divide the nodes into key ranges. The partitioner will then choose the node that stores the first replica of the data, and the SimpleStrategy replication class will place the additional replicas on the following nodes in the ring. So by arranging the nodes the way you want, you can exploit the replication strategy.
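As a toy illustration (plain Python, not Cassandra code) of how initial_token values carve an ordered key space into per-node ranges, with each node owning the range that ends at its own token:

```python
from bisect import bisect_left

# Hypothetical initial_token values, one per node, in sorted order.
ring = [("a", "node1"), ("h", "node2"), ("p", "node3")]
tokens = [t for t, _ in ring]

def first_replica(key: str) -> str:
    # A node owns the range ending at its own token: (previous_token, token].
    # Keys past the last token wrap around to the first node.
    idx = bisect_left(tokens, key)
    return ring[idx][1] if idx < len(ring) else ring[0][1]

print(first_replica("alice"))  # node2: "a" < "alice" <= "h"
print(first_replica("karen"))  # node3: "h" < "karen" <= "p"
print(first_replica("zoe"))    # node1: wraps around past "p"
```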
I'm going to implement consistent hashing over a bunch of nodes. Each node has a limited capacity (let's say 1 GB). I start with one node, and when it's getting full I'm going to add another node and use consistent hashing to redistribute the data, then move forward by adding new nodes. However, there is still a chance that a node gets full. I know some NoSQL databases such as Cassandra use consistent hashing to do something similar to what I'm doing. How can I keep nodes from overflowing when using consistent hashing?
Cassandra does not use consistent hashing in the way you described.
Each table has a partition key (you can think of it as a primary key, or the first part of one, in RDBMS terminology); this key is hashed using the murmur3 algorithm. The whole hash space forms a continuous ring from the lowest possible hash to the highest. This ring is then divided into chunks (vnodes, 256 per node by default) and these chunks are fairly distributed among the nodes. Each node not only hosts its own part of the ring, but also maintains replicated copies of other vnodes according to the replication factor.
This way of doing things helps to solve a lot of problems:
balance the data load among all cluster nodes, so no specific node can be overloaded (data size, reads, and writes are evenly distributed; no hot spots)
if you add a new node to the cluster, it will handle its own part of the ring and pull the required vnodes automatically from other nodes. No need for manual resharding.
if a node fails, replication means you won't lose any data because it is already stored on other nodes. In this case you can decommission the failed node so that all other nodes redistribute the failed part of the ring among themselves. No need for complex switching scenarios for failed DB nodes.
Of course, you can always implement similar DB behaviour on top of any RDBMS in your application layer, but it is always much harder and more error-prone than using an existing, battle-tested solution.
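Still, for intuition, here is a minimal self-contained sketch of a consistent-hash ring with virtual nodes (md5 stands in for murmur3; this illustrates the idea above, not Cassandra's actual implementation):

```python
import hashlib
from bisect import bisect_right

def _hash(value: str) -> int:
    # md5 stands in for murmur3; any well-mixed hash works for the sketch.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=256):
        self.vnodes = vnodes
        self._ring = {}       # hash position on the ring -> physical node
        self._sorted = []     # sorted hash positions, for binary search
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node gets many small slices (vnodes) of the hash space,
        # so joining or leaving only moves a small, evenly spread share of keys.
        for i in range(self.vnodes):
            self._ring[_hash(f"{node}#{i}")] = node
        self._sorted = sorted(self._ring)

    def remove_node(self, node):
        self._ring = {pos: n for pos, n in self._ring.items() if n != node}
        self._sorted = sorted(self._ring)

    def node_for(self, key: str):
        # A key belongs to the first vnode after its hash position, wrapping around.
        pos = _hash(key)
        idx = bisect_right(self._sorted, pos) % len(self._sorted)
        return self._ring[self._sorted[idx]]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))  # the same key always maps to the same node
```

Adding a node with add_node only remaps the keys that fall into that node's new vnode slices, which is the property that keeps rebalancing cheap.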
I guess you know how keys get moved from one node to another when a node is added or deleted. Coming to your question of how uniform distribution happens:
You can have your own logic here to make it happen. Keep monitoring all the nodes in the ring; if any node is getting hot (handling more keys), insert another node before it so that the load is distributed between the old and the new node. Similarly, if any of the nodes are under-utilised, you can delete them so that the load shifts to the next node.
Hope this helps!
I'm currently delving into CouchDB, and I am puzzled by the distribution of Map-Reduce computations in views. I see a lot of resources mentioning that Map-Reduce is inherently distributed, because you can process one half of your data on server A, the other half on server B, and then reduce both results. One example would be slide 16 of this presentation:
http://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288
This seems fairly logical, but:
CouchDB does not seem to provide an API for dispatching computations to several servers. The only distribution it appears to provide is replication of the entire data set to other servers (which would then, I assume, compute their own view data).
CouchDB uses a B-Tree to store view data based on keys that are generated in the Map step of the view algorithm, which precludes appropriate partitioning of documents based on what server they should be on.
So, does CouchDB distribute Map-Reduce computations at all? Or is the Map-Reduce property used merely to cache values in the B-Tree nodes?
You are looking for BigCouch; it enables a CouchDB cluster and uses distributed MapReduce.
CouchDB does NOT distribute views across nodes, since CouchDB is not a distributed application. You can only continuously replicate from one instance to another, but each instance still works alone.