According to BEP-05, when you send a find_node or get_peers query, the response should contain either the target node's contact info (or peers for the infohash) or the K (8) good nodes closest to the target/infohash.
However, in my case, with the bootstrap node router.utorrent.com:6881, the remote node returned the 8 nodes closest to my own node ID instead. And for a get_peers request it always returned 8 nodes closest to my own ID plus 7 invalid peers. But when I query certain other nodes that redirect me toward the infohash, the protocol behaves normally.
[Wireshark capture of the unexpected responses]
[Wireshark capture of a normal exchange]
Any help would be appreciated!
You shouldn't pay too much attention to what the bootstrap nodes do as long as they allow you to populate your routing table, since that is their primary purpose.
They receive a disproportionate amount of traffic, and to avoid directing undue amounts of traffic to any particular node, they may deviate from the specification in a few ways that are harmless as long as only a vanishingly small fraction of the network behaves that way. There is only a single-digit number of bootstrap nodes among millions of nodes, so their behavior is negligible and should not be taken as a reference point.
It does not make sense to contact a bootstrap node via get_peers either. find_node queries are the correct choice for populating your routing table, and it is only necessary to contact a bootstrap node at all in the relatively rare case where other mechanisms have not been successful.
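For reference, a find_node query is just a small bencoded dictionary sent over UDP. A minimal sketch in Java (the IDs and transaction ID are placeholders, and router.bittorrent.com is only used as an example bootstrap host):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Minimal sketch of a KRPC find_node query (BEP-5). A real client uses its own
// 160-bit node ID and a meaningful target; all IDs here are placeholders.
public class FindNodeExample {
    static byte[] buildFindNode(byte[] ownId, byte[] targetId, byte[] txId) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Top-level dictionary keys must be in sorted order: a, q, t, y
        out.write("d1:ad2:id20:".getBytes("ISO-8859-1"));
        out.write(ownId);                                  // 20-byte querying node ID
        out.write("6:target20:".getBytes("ISO-8859-1"));
        out.write(targetId);                               // 20-byte target ID
        out.write("e1:q9:find_node1:t".getBytes("ISO-8859-1"));
        out.write((txId.length + ":").getBytes("ISO-8859-1"));
        out.write(txId);                                   // transaction ID, echoed back in the response
        out.write("1:y1:qe".getBytes("ISO-8859-1"));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] ownId = new byte[20];    // placeholder: all zeros
        byte[] target = new byte[20];   // placeholder: ask for nodes near this ID
        byte[] query = buildFindNode(ownId, target, "aa".getBytes("ISO-8859-1"));
        DatagramSocket socket = new DatagramSocket();
        InetAddress bootstrap = InetAddress.getByName("router.bittorrent.com");
        socket.send(new DatagramPacket(query, query.length, bootstrap, 6881));
        socket.close();
    }
}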
I am using IPFS (InterPlanetary File System) to store documents/files in a decentralized manner.
In order to search for a file on the network, is there a record of all the hashes on the network (like leeches)?
How does my request travel through the network?
Apologies, but it's unclear to me if you intend to search the contents of files on the network or to just search for files on the network. I'm going to assume the latter, please correct me if that's wrong.
What follows is a bit of an oversimplification, but here it goes:
In order to search for a file on the network, is there a record of all the hashes on the network (like leeches)?
There is not a single record, no. Instead, each of the ipfs nodes that make up the network holds a piece of the total record. When you add a block to your node, the node will announce to the network that it will provide that block if asked to. Announcing means letting a number of other ipfs nodes in the network know that you have that block. Essentially, your node asks its peers, who ask their peers, and so on, until you find some nodes with ids that are near the hash of the block. "Near" could be measured using something simple like xor.
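As a toy illustration of that nearness metric, here is a sketch of comparing xor distances between a block hash and two node ids (the ids are shortened stand-ins, not real multihashes):

import java.math.BigInteger;

// Toy illustration of the xor distance metric used for "nearness" in a
// Kademlia-style DHT. Real IPFS ids are full-length multihashes; these are
// shortened stand-ins.
public class XorDistance {
    // Distance between two equal-length ids: interpret (a xor b) as an unsigned integer.
    static BigInteger distance(byte[] a, byte[] b) {
        byte[] x = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            x[i] = (byte) (a[i] ^ b[i]);
        }
        return new BigInteger(1, x); // signum 1 = treat the bytes as an unsigned value
    }

    public static void main(String[] args) {
        byte[] blockHash = {0x12, 0x34, 0x56};
        byte[] nodeA     = {0x12, 0x30, 0x00};
        byte[] nodeB     = {0x7f, 0x00, 0x00};
        // nodeA shares a longer prefix with blockHash, so it is "nearer" and is the
        // kind of node the announcement would end up stored on.
        System.out.println(distance(blockHash, nodeA).compareTo(distance(blockHash, nodeB)) < 0); // true
    }
}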
The important thing to understand is that, given the hash for a block, your node finds other ipfs nodes in the network that have ids which are similar to the hash of the block, and tells them "if anyone asks, I have the block with this hash". This is important because someone who wants to go find the content for the same hash can use the same process to find nodes that have been told where the hash can be retrieved from.
How does my request travel through the network?
Basically the reverse of above.
You can read more about ipfs content routing in the following:
https://ipfs.io/ipfs/QmR7GSQM93Cx5eAg6a6yRzNde1FQv7uL6X1o4k7zrJa3LX/ipfs.draft3.pdf
https://discuss.ipfs.io/t/how-does-resolution-and-routing-work-with-ipfs/365/3
We have a Java 1.6 application that uses Hazelcast 3.7.4, with a topology of two nodes. The application operates mainly on 3 maps. Under normal operation, response times when querying the maps are generally on the order of tens of milliseconds.
I have observed that in some circumstances, such as network cuts, the response time increases to huge values, for example 20 or 30 seconds, and this is impacting the application's performance.
I would like to know whether network micro-cuts like this can increase query response times in this manner. I do not know if there is some specific configuration that can minimize this, or which other factors can produce such high times.
Here are some examples of the queries we execute.
Example 1:
String sqlPredicate = "acui='" + acui + "'";
Collection<Agent> agents =
        (Collection<Agent>) data.getMapAgents().values(new SqlPredicate(sqlPredicate));
Example 2:
boolean exist = data.getMapAgents().containsKey(agent);
Thanks so much for your help.
Best Regards,
Jorge
The map operations are all TCP-socket based and are thus subject to your operating system's TCP stack implementation.
See TCP_NODELAY
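For context, TCP_NODELAY is a plain socket option that disables Nagle's algorithm (which batches small writes and can add latency). The sketch below shows the option at the java.net level only; whether and how your Hazelcast version exposes it is a question for its configuration documentation, and the host below is a placeholder:

import java.net.Socket;

// Illustration of the TCP_NODELAY socket option itself, not of Hazelcast configuration.
public class NoDelayExample {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket("10.0.0.1", 5701); // placeholder member address; 5701 is Hazelcast's default port
        socket.setTcpNoDelay(true);                   // push small packets out immediately instead of batching them
        socket.close();
    }
}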
I'm working on implementing yet another BitTorrent client and am currently struggling with the DHT. It is implemented according to this specification, http://www.bittorrent.org/beps/bep_0005.html, but while debugging it I noticed that other nodes' responses on the network vary.
For example, find_node is supposed to return either the target node's info or the 8 closest nodes. Most nodes reply with 34 closest nodes, and usually only 1 - 3 of those 34 successfully reply to a subsequent ping request.
Is there another document with better implementation recommendations? Maybe it has already been shown that using a 15-minute interval to mark nodes as questionable is not efficient and I should use 10 or some other number? Where can I find the best up-to-date suggestions?
There is another strange thing. Bootstrap nodes like router.bittorrent.com reply with even more closest nodes, and usually the "nodes" BDictionary property's buffer length is not divisible by 6 (compact node info: 4 bytes for IP and 2 for port). For now, I simply cut the buffer off at the nearest length divisible by 6, but all of that is strange. Does anybody know why that might happen?
the spec says (emphasis mine):
When a node receives a find_node query, it should respond with a key "nodes" and value of a string containing the compact node info for [...]
Further down:
Contact information for nodes is encoded as a 26-byte string. Also known as "Compact node info" the 20-byte Node ID in network byte order has the compact IP-address/port info concatenated to the end.
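So each entry in the "nodes" value is 26 bytes, not 6. A minimal parsing sketch under that reading (assuming you already have the decoded value as a raw byte array; the class and method names are mine):

import java.net.InetAddress;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split the decoded "nodes" value of a find_node response into
// 26-byte entries: 20-byte node ID + 4-byte IPv4 address + 2-byte port.
public class CompactNodeInfo {
    static List<String> parseNodes(byte[] nodes) throws Exception {
        List<String> result = new ArrayList<String>();
        for (int off = 0; off + 26 <= nodes.length; off += 26) {
            byte[] id = Arrays.copyOfRange(nodes, off, off + 20);
            byte[] ip = Arrays.copyOfRange(nodes, off + 20, off + 24);
            int port = ((nodes[off + 24] & 0xff) << 8) | (nodes[off + 25] & 0xff); // big endian
            result.add(toHex(id) + " @ " + InetAddress.getByAddress(ip).getHostAddress() + ":" + port);
        }
        return result;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}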
Additionally you should read the original Kademlia paper since the bittorrent BEP builds on the concepts described therein and omits deeper explanations of those concepts.
You might also want to read about a few extensions that are more or less de-facto standard in most implementations now: http://libtorrent.org/dht_extensions.html
And read the other DHT-related BEPs, some are fairly widely adopted and modify/clarify BEP-5-specified behavior, but generally in a backward-compatible way.
For example, find_node is supposed to return either the target node's info or the 8 closest nodes
Nodes will return a variable number of entries. It could be more than 8, or fewer.
While thinking about how BitTorrent works, a few questions come to mind. I would appreciate it if somebody could share some possible answers.
Suppose a BitTorrent client gets 50 peers from the tracker and then establishes connections with 20 of them to form the peer set. Is this peer set selected randomly or based on bandwidth? (I understand that the peers to be unchoked are selected based on their offered bandwidth.) Subsequently, how is this bandwidth determined for each connection? (A ping can give us the latency but not the bandwidth, I assume.)
Optimistic unchoking leads to the problem of free riders in the system. Considering that an optimistic unchoke might not always turn up better peers, why is it not possible to discard this policy altogether? (I assume this policy helps peers with low bandwidth fulfill requests. Why can't BitTorrent adopt a policy that probes the bandwidth of the optimistic peer without sending data packets, and have another connection (maybe a 5th) for low-bandwidth peers so that they don't starve? This 5th channel would transmit at only a fraction of the bandwidth of the other 4 channels.) That might at least discourage free riding?
Traditionally the peers are selected at random. Some clients may have had weak biases based on previous interactions with the peers or on CIDR distance. However, there is a recent proposal (which uTorrent and libtorrent implement) that suggests a consistent but uniformly distributed peer selection/priority algorithm. For more information, see this blog post.

The unchoke algorithm is triggered every 15 seconds. The peers are sorted by the number of bytes they sent during the last 15 seconds; the ones that sent the most are unchoked, the rest are choked. So the download rate used is the 15-second average.
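A rough sketch of what one such regular choking round looks like under that description (the class layout and slot count are illustrative, not how any particular client structures it; the separate optimistic unchoke rotation is left out):

import java.util.Comparator;
import java.util.List;

// Sketch of one regular choking round: interested peers are ranked by how much
// they sent us in the last 15 seconds, the fastest few get upload slots, the
// rest are choked.
public class ChokeRound {
    static class Peer {
        long bytesReceivedLast15s; // measured over the 15-second window
        boolean interested;        // peer wants data from us
        boolean choked = true;
    }

    static void runRound(List<Peer> peers, int uploadSlots) {
        peers.sort(Comparator.comparingLong((Peer p) -> p.bytesReceivedLast15s).reversed());
        int unchoked = 0;
        for (Peer p : peers) {
            if (p.interested && unchoked < uploadSlots) {
                p.choked = false; // reward the fastest senders with an upload slot
                unchoked++;
            } else {
                p.choked = true;
            }
        }
    }
}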
If you don't optimistically unchoke peers, there's no way for you to prove to them that you are better than the other peers in their unchoke set, and they will never unchoke you back. Without optimistic unchokes (also assuming you don't have the allow-fast extension), there is no way to start a download: when you first join you won't have any pieces, you can't trade the first piece, and you have to rely on being optimistically unchoked. Estimating someone's bandwidth without sending bulk data is hard and probably unreliable. Even if you got a good estimate of someone's capacity, that wouldn't necessarily mean that capacity was available to you. The current mechanism is very robust in that it doesn't need to make assumptions about the network equipment between the peers (like packet-train bandwidth estimation needs to do) and it looks at actual data.
I'm trying to implement BitTorrent in C. First of all, before writing any code, I tried to use a web browser to send the following message (URL) to the tracker server.
You may try this URL:
http://torrent.ubuntu.com:6969/announce?info_hash=%9ea%80%ed%e7/%c4%ae%c8%de%8c%b0C%81c%fbq%3cJ%22&peer_id=M7-3-5--%eck%a8%2a%7f%e6%3ah%84%f2%9d%c5&port=43611&uploaded=0&downloaded=0&left=0&corrupt=0&key=00BA7F86&event=started&numwant=4&compact=0&no_peer_id=0
I downloaded the torrent file from this link; it is named xubuntu-13.04-desktop-i386.iso and has 9e6180ede72fc4aec8de8cb0438163fb713c4a22 as its SHA-1 (info_hash) value.
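(For reference, the info_hash parameter in that URL is just those raw 20 SHA-1 bytes percent-encoded. A minimal sketch of that escaping rule, independent of any particular library; some encoders leave characters such as '/' literal, which trackers also accept:)

// Sketch: percent-encode raw info_hash bytes for the announce URL.
// Unreserved characters stay literal, everything else becomes %XX.
public class InfoHashEncoder {
    static String encode(byte[] infoHash) {
        StringBuilder sb = new StringBuilder();
        for (byte b : infoHash) {
            int c = b & 0xff;
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~') {
                sb.append((char) c);                   // unreserved: keep as-is
            } else {
                sb.append(String.format("%%%02x", c)); // everything else: %XX
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 9e6180ede72fc4aec8de8cb0438163fb713c4a22, the info-hash mentioned above
        byte[] hash = {
            (byte) 0x9e, 0x61, (byte) 0x80, (byte) 0xed, (byte) 0xe7, 0x2f, (byte) 0xc4, (byte) 0xae,
            (byte) 0xc8, (byte) 0xde, (byte) 0x8c, (byte) 0xb0, 0x43, (byte) 0x81, 0x63, (byte) 0xfb,
            0x71, 0x3c, 0x4a, 0x22
        };
        System.out.println(encode(hash)); // an equivalent encoding of the info_hash in the URL above
    }
}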
However, after sending the above request, I get:
HTTP/1.0 200 OK
d8:completei357e10:incompletei8e8:intervali1800e5:peers24:l\262j"\310Հp\226\310\325G?\205^%!\221x \364\367\357e
But the BitTorrent specification says:
peers: The value is a list of dictionaries, each with the following keys:
- peer id: peer's self-selected ID, as described above for the tracker request (string)
- ip: peer's IP address (either IPv6 or IPv4) or DNS name (string)
- port: peer's port number (integer)
Why is the value of the peers field binary rather than a bencoded list?
Thank you in advance.
The peers value may be a string consisting of multiples of 6 bytes: the first 4 bytes are the IP address and the last 2 bytes are the port number, all in network (big-endian) byte order.
https://wiki.theory.org/BitTorrentSpecification#Tracker_HTTP.2FHTTPS_Protocol
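A minimal sketch of decoding such a compact peers string (assuming you already have the bencoded response decoded and the "peers" value as a raw byte array; the names are mine):

import java.net.InetAddress;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: decode the compact "peers" string, 6 bytes per peer:
// 4-byte IPv4 address followed by a 2-byte port, both big endian.
public class CompactPeers {
    static List<String> parsePeers(byte[] peers) throws Exception {
        List<String> result = new ArrayList<String>();
        for (int off = 0; off + 6 <= peers.length; off += 6) {
            byte[] ip = Arrays.copyOfRange(peers, off, off + 4);
            int port = ((peers[off + 4] & 0xff) << 8) | (peers[off + 5] & 0xff);
            result.add(InetAddress.getByAddress(ip).getHostAddress() + ":" + port);
        }
        return result;
    }
}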
The protocol you refer to was used in the early days of BitTorrent. However, as some trackers became increasingly popular without being scaled out significantly in terms of capacity, the size of the tracker responses became significant. One measure to deal with this was for clients to accept gzipped HTTP responses and compact peer responses (which are by far the most popular format among trackers these days). The compact peer response carries the same information in a significantly smaller response; it's defined in BEP 23.
However, even though the responses are relatively small now, the TCP handshake and teardown still impose a significant cost, which is why many trackers are moving over to UDP (BEP 15).