I am redesigning our system (a P2P application) that was built using the "flat model" of k-buckets: each distance has its own k-bucket. The distance is the length of the identifier minus the length of the shared prefix (XOR). Everything is clear here.
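To make the flat model concrete, here is a minimal sketch of what I mean (assuming 160-bit IDs held as plain integers; the names are only illustrative):

```python
# A minimal sketch of the flat model: one k-bucket per distance, where the
# distance is the ID length minus the shared-prefix length of the XOR.
# Assumes 160-bit IDs held as Python ints; names are illustrative only.

ID_BITS = 160

def flat_bucket_index(own_id: int, other_id: int) -> int:
    """Index of the flat bucket other_id falls into, as seen from own_id."""
    xor = own_id ^ other_id
    if xor == 0:
        raise ValueError("a node does not bucket itself")
    # bit_length() of the XOR equals ID_BITS minus the shared-prefix length,
    # which is exactly the flat bucket index (1..ID_BITS).
    return xor.bit_length()

# IDs that differ in the very first bit land in the farthest bucket:
assert flat_bucket_index(0, 1 << (ID_BITS - 1)) == ID_BITS
```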
Now we want to use a binary tree to hold the buckets (as in the latest Kademlia docs).
The tree approach doesn't deal with distance "directly" when we "look for the bucket" to put a new contact in. This confuses me, since the paper says that a k-bucket should be split if the new node is closer to the local node than the K-closest node.
My question: how do I calculate the distance in this case? It cannot be the prefix (path) of the bucket, since a bucket may contain nodes with different prefixes.
What is a convenient way to find the K-closest node?
Thanks in advance.
It cannot be the prefix (path) of the bucket, since a bucket may contain nodes with different prefixes.
In the tree layout each bucket does have a prefix, but it is not implicit in its position in the routing table data structure; it must be tracked explicitly during split and merge operations instead, e.g. as a base address plus prefix length, similar to CIDR notation.
An empty routing table starts out with a prefix covering the entire keyspace (i.e. 0x00/0); after some splitting, one of the buckets might cover the range 0x0CFA0 - 0x0CFBF, which would be the bucket prefix 0x0CFA/15.
See this answer to another question, which contains an example routing table layout.
Additionally, see this answer for a simple and a more advanced bucket splitting algorithm.
How to find the matching bucket for a given ID depends on the data structure used. A sorted list will require a binary search; a Patricia trie with nearest-lookups is another option. Even the brute-force approach can be adequate as long as you do not have to handle too many operations per second.
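As a rough sketch of how such explicitly tracked prefixes and the brute-force lookup could look (only an illustration under the assumption of 160-bit IDs stored as integers, not a reference implementation):

```python
# Sketch of the tree layout: each bucket explicitly tracks its prefix as
# (base, prefix_len), similar to CIDR notation. Brute-force lookup shown;
# a sorted list with binary search would work the same way, just faster.
# 160-bit IDs as Python ints; all names are illustrative.

ID_BITS = 160

class Bucket:
    def __init__(self, base: int, prefix_len: int):
        self.base = base            # lowest ID covered by this bucket
        self.prefix_len = prefix_len
        self.contacts = []

    def covers(self, node_id: int) -> bool:
        """True if node_id shares this bucket's prefix."""
        shift = ID_BITS - self.prefix_len
        return (node_id >> shift) == (self.base >> shift)

def find_bucket(buckets: list, node_id: int) -> Bucket:
    # Bucket prefixes partition the keyspace, so exactly one bucket matches.
    for bucket in buckets:
        if bucket.covers(node_id):
            return bucket
    raise AssertionError("routing table must cover the whole keyspace")

# An empty table starts with one bucket covering everything (prefix length 0).
table = [Bucket(base=0, prefix_len=0)]
assert find_bucket(table, 0xABCDEF).prefix_len == 0
```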
Related
Initially this was a different question, but the questions I ended up answering, which gave me a good idea of how everything works, were:
How are buckets organized (how does the range system work)?
How does the XOR metric work for distance?
And how is the routing table organized?
This is an idea of how I think this works:
The routing table is a binary tree built from 1s and 0s. A bucket's range is made up of nodes with a similar prefix, sitting on a branch whose number is the count of prefix bits those nodes share. So, for example, branch 2 with prefix 10 can hold a bucket of (1011, 1000, 1011), and so on. Using your own ID, you XOR another node's ID; the number of leading 0s in the result is the branch number, and the prefix defining the bucket's range is the same as your own node's prefix up to that point. When you need to insert a new node into a full bucket, you split the bucket if your own ID falls within its range. This way you end up with a routing table composed mostly of nodes similar to you and a few that aren't. Also worth mentioning: the deeper you go in the tree, the closer you get to your own ID.
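A tiny sketch of the branch-number calculation as I understand it (4-bit IDs to match the example above; the names are made up and this is only an illustration):

```python
# Branch number = number of leading bits your ID shares with the other ID,
# i.e. the number of leading zero bits of the XOR. 4-bit IDs to match the
# example above; purely illustrative.

ID_BITS = 4

def branch_number(own_id: int, other_id: int) -> int:
    xor = own_id ^ other_id
    if xor == 0:
        return ID_BITS              # same ID: shares every bit
    return ID_BITS - xor.bit_length()

# With own ID 1010, the ID 1001 shares the prefix "10", so it sits on branch 2,
# while 0110 shares nothing and sits on branch 0.
assert branch_number(0b1010, 0b1001) == 2
assert branch_number(0b1010, 0b0110) == 0
```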
In the section "Routing" in the original paper, the problem of the normal bucket splitting rule is described as follows:
Every node with prefix 001 would have an empty k-bucket into which u should be inserted, yet u's bucket refresh would only notify k of the nodes.
I don't understand why this is a problem. Why does every node with prefix 001 need to receive u's bucket refresh?
I have read the original paper and have spent quite some time researching bucket splitting, but I couldn't figure it out. Can anyone explain?
Thank you
That's the highly unbalanced tree case. For kademlia to work properly, a node must have a nearly exact view of its own neighborhood while only needing a subsampled view of the more remote portions of the keyspace.
The initially described bucket splitting algorithm would only split a bucket when its ID-range covers the node's own ID. But if a node's nearest neighbors don't fall into this bucket then maintaining an exhaustive view of the neighborhood requires a non-covering bucket to be split (or in S/Kademlia: an explicit neighborhood list to be maintained).
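As a sketch of what such a relaxed splitting decision might look like (just one possible formulation under the assumptions above, not the exact rule used by any particular implementation):

```python
# One possible relaxed splitting rule (sketch only): split a full bucket if it
# covers our own ID (the paper's basic rule) OR if the new contact would belong
# to the k closest contacts we know, so the neighborhood view stays exhaustive.

def should_split(bucket_covers_own_id: bool, new_id: int, own_id: int,
                 known_contact_ids: list, k: int = 20) -> bool:
    if bucket_covers_own_id:
        return True                                   # basic rule from the paper
    by_distance = sorted(known_contact_ids, key=lambda c: c ^ own_id)
    if len(by_distance) < k:
        return True                                   # neighborhood not full yet
    kth_closest = by_distance[k - 1]
    # Relaxed rule: the newcomer is closer to us than our current k-th closest.
    return (new_id ^ own_id) < (kth_closest ^ own_id)
```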
Okay, I've been reading articles and the paper about Kademlia recently in order to implement a simple P2P program that uses the Kademlia DHT algorithm. Those papers say that the 160-bit keys in a Kademlia node are used to identify both the nodes (node IDs) and the data (which is stored in the form of tuples).
I'm quite confused about that 'both' part.
As far as my understanding goes, each node in a Kademlia binary tree uniquely represents a client (IP, port), each of which holds a list of files.
Here is the general flow as I understand it:
1. Client (.exe) gets booted.
2. It creates a node component.
3. The newly created node joins the network (bootstrapping).
4. It sends find_node(filehash) to the k closest nodes (let's say the hash is generated by hashing the binary of a file named file1.txt).
5. Each node that receives the query looks up the queried filehash in its own hash table (say, a hash map holding a list of files as (file hash, file location)).
6. Steps 4-5 are repeated until the node is found (meanwhile, all the nodes involved are updating their buckets); a toy sketch of this iterative lookup is below.
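(To make steps 4-5 concrete for myself, here is a toy sketch of the iterative lookup; the "network" is faked with a dict so the example runs stand-alone, and every name in it is made up.)

```python
# Toy sketch of steps 4-5: repeatedly ask the closest contacts we know for
# contacts even closer to the target hash, until no closer ones turn up.
# The "network" is faked with a dict so the example runs stand-alone; in a
# real client each query would be a find_node/find_value RPC over UDP.
import random

ALPHA, K = 3, 20            # typical Kademlia parameters
random.seed(1)

# node_id -> the contacts that node knows about (a static fake network)
ids = [random.getrandbits(160) for _ in range(200)]
network = {node_id: set(random.sample(ids, 10)) for node_id in ids}

def iterative_find_node(start_contacts, target):
    closest = sorted(start_contacts, key=lambda n: n ^ target)[:K]
    while True:
        learned = set(closest)
        for contact in closest[:ALPHA]:     # query a few contacts per round
            learned |= network[contact]     # stand-in for the RPC reply
        improved = sorted(learned, key=lambda n: n ^ target)[:K]
        if improved == closest:             # no closer nodes found: converged
            return closest
        closest = improved

target = random.getrandbits(160)
print(iterative_find_node(random.sample(ids, 3), target)[:3])
```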
Does this flow look all right?
Additionally, the bootstrapping method of Kademlia confuses me too.
When the node gets created (the user executes the program), it seems like it uses a bootstrapping node to fill up its buckets. But then, what is the bootstrapping node? Is it another process that's always running? What if the bootstrapping node gets turned off?
Can someone help me better understand the concept?
Thanks for the help in advance.
Does this flow look all right?
It seems roughly correct, but your wording is not very precise.
Each node has a routing table by which it organizes the neighbors it knows about and another table in which it organizes the data it is asked to store by others. Nodes have a quasi-random ID that determines their position in the routing keyspace. The hashes of keys for stored data don't precisely match any particular node ID, so the data is stored on the nodes whose IDs are closest to the hash, as determined by the distance metric. That's how the same keyspace is used for both node IDs and keys.
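A minimal sketch of that idea (purely illustrative, assuming SHA-1 key hashes so that keys live in the same 160-bit space as node IDs; the names are made up):

```python
# Node IDs and key hashes live in the same 160-bit keyspace, so a value is
# stored on the k nodes whose IDs are closest (by XOR) to the hash of its key.
import hashlib
import random

def key_hash(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def k_closest(node_ids, target: int, k: int = 20):
    """The k node IDs closest to target under the XOR metric."""
    return sorted(node_ids, key=lambda node_id: node_id ^ target)[:k]

# Toy example: 100 random node IDs; these are the nodes a store() for the
# value published under the hash of "file1.txt" would be sent to.
node_ids = [random.getrandbits(160) for _ in range(100)]
storers = k_closest(node_ids, key_hash(b"file1.txt"))
```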
When you perform a lookup for data (i.e. find_value) you ask the remote nodes for the k-closest neighbor set they have in their routing table, which will allow you to home in on the k-closest set for a particular target key. The same query also asks the remote node to return any data they have matching that target ID.
When you perform a find_node on the other hand you're only asking them for the closest neighbors but not for data. This is primarily used for routing table maintenance where you're not looking for any data.
Those are the abstract operations. If needed, an actual implementation could separate the lookup from the data retrieval, i.e. first perform a find_node and then use the result set to perform one or more separate get operations that don't involve additional neighbor lookups (similar to how the store operation works).
Since kademlia is UDP-based you can't really serve arbitrary files because those could easily exceed reasonable UDP packet sizes. So in practice kademlia usually just serves as a hash table for small binary values (e.g. contact information, public keys and such). Bulk operations are either performed by other protocols bootstrapped off those values or by additional operations beyond those mentioned in the kademlia paper.
What the paper describes is only the basic functionality for a routing algorithm and most basic key value storage. It is a spherical cow in a vacuum. Actual implementations usually need additional features or work around security and reliability problems faced on the public internet.
But then what's bootstrapping node? Is it another process that's always running? What if the bootstrapping node gets turned off?
That's covered in this question (using the bittorrent DHT as an example).
Friends,
I am modeling a table in Cassandra which contains a Map column. This Map will contain dynamic values and will be updated a lot for a given row (I will update it by primary key).
Is this an anti-pattern? Which other options should I consider?
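For reference, this is roughly the kind of table and update I have in mind (a simplified sketch with made-up names, using the Python driver; it assumes a reachable node and an existing keyspace):

```python
# Roughly the shape of the table and updates in question (made-up names,
# simplified). Assumes a reachable Cassandra node and an existing keyspace.
from uuid import uuid4
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("my_keyspace")   # assumed setup

session.execute("""
    CREATE TABLE IF NOT EXISTS user_profiles (
        user_id uuid PRIMARY KEY,
        attributes map<text, text>   -- the dynamic, frequently updated map
    )
""")

user_id = uuid4()
# Frequent updates of single map entries, always addressed by the primary key:
session.execute(
    "UPDATE user_profiles SET attributes[%s] = %s WHERE user_id = %s",
    ("last_login", "2016-01-01T10:00:00", user_id),
)
```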
What you're trying to do is possibly what I described here.
The first big limitations that come to mind are the ones given by the specification:
64KB is the max size of an item in a collection
65536 is the max number of queryable elements inside a collection
Moreover, there are the problems described in another post:
you cannot retrieve part of a collection: even if internally each entry of a map is stored as a column, you can only retrieve the whole collection (this can lead to very slow performance)
you have to choose whether to create an index on keys or on values; both simultaneously are not supported
since maps are typed, you can't put mixed values inside: you have to represent everything as a string or as bytes and then transform your data client-side
I personally consider this approach an anti-pattern for all these reasons: it provides a schema-less solution but reduces performance and introduces lots of limitations, like the ones on secondary indexes and typing.
HTH, Carlo
I know how data is (in theory) stored in a DHT. However, I am uncertain as to how one might go about updating a piece of data associated with a key. Is this possible? Also, how are conflicts handled in a DHT?
A DHT simply defines put(key, value) and get(key) operations, and the core of the various DHT algorithms revolves around how to locate the nodes responsible for a specific key.
What those nodes do on an incoming put request for a value already stored largely depends on the purpose and implementation of the DHT network, not on the algorithm itself.
E.g. a node might opt to timestamp all incoming values and return lists with multiple separate timestamped entries. Or it might return lists that also include the source address for each value. Or it might just overwrite the stored value.
If you have some relation between the key and a signature within the value or the source ID or something like that you can put enough intelligence into the nodes to verify the data cryptographically and thus allow them to keep a single canonical value for each key by replacing the old data.
In the case of bittorrent's DHT you wouldn't want that. Many different bittorrent peers announce their presence to a single key from different source addresses. Therefore the nodes actually store unique <key,IP,port> tuples where <IP,port> can be considered the value. Which means it'll return lists of IPs and ports on each lookup. And since a DHT will have multiple nodes responsible for one key you will actually have K (bucket size) nodes responding with varying lists.
TL;DR: It's implementation-dependent
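For illustration, a node-side store for that bittorrent-style case could be as simple as the following sketch (made-up names, not any real client's code):

```python
# Sketch of a node's local storage for the bittorrent-style case: each key
# (infohash) maps to a set of (IP, port) tuples, so a "put" from a new peer
# just adds another tuple and a "get" returns the whole list.
from collections import defaultdict

class AnnounceStore:
    def __init__(self):
        self._peers = defaultdict(set)          # key -> {(ip, port), ...}

    def put(self, key: bytes, ip: str, port: int) -> None:
        self._peers[key].add((ip, port))        # duplicates collapse naturally

    def get(self, key: bytes) -> list:
        return list(self._peers[key])

store = AnnounceStore()
store.put(b"\x01" * 20, "198.51.100.7", 6881)
store.put(b"\x01" * 20, "203.0.113.9", 51413)
print(store.get(b"\x01" * 20))      # two <IP,port> values for the same key
```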
It is possible. I've researched Pastry's DHT. It is possible to alter data stored under a given key, but Pastry's developers advise against it, as it can have nasty side effects, mainly with the replicas of the altered piece of data which are stored on other nodes (see the FAQ on FreePastry's home page).
I'm not sure how it would affect other DHTs such as Chord or Tapestry, however.
With regard to conflicts, again I only have experience with Pastry. If you try to store data under a key that's already in use, an exception will be thrown.