Confusion around the bitfield message - bittorrent

I'm a bit confused about the bitfield message in BitTorrent. I have noted my confusion in the form of questions below.
Optional vs Required
Bitfield to be sent immediately after the handshaking sequence is
completed
I'm assuming this is compulsory, i.e. after the handshake there must follow a bitfield message. Correct?
When to expect the bitfield?
The bitfield message may only be sent immediately after the
handshaking sequence is completed, and before any other messages are
sent
Assuming I read this correctly: although it is an optional message, a peer can still send the bitfield message prior to any other message (like request, choke, unchoke, etc.). Correct?
The high bit in the first byte corresponds to piece index 0
If I'm correct, the bitfield represents the state, i.e. whether or not the peer has a given piece.
Assuming that my bitfield is [1,1,1,1,1,1,1,1,1,1, ...], I establish that the peer has the 10th piece missing, and if the bitfield looks like [1,1,0,1,1,1,1,1,1,1, ...], the peer has the 3rd piece missing. Then what does "the high bit in the first byte corresponds to piece index 0" mean?
Spare bits
Spare bits at the end are set to zero
What does this mean? If I have a bit at the end that is 0, does it not mean that the peer has that piece missing? Why is the spare bit used?
Most important of all, what is the purpose of the bitfield?
My hunch is that the bitfield makes it easier to find the right peer for a piece, by knowing what is available from each peer, but am I correct on this?
@Encombe
here is how my bitfield payload looks:
\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFE

I'm assuming this is compulsory, i.e. after the handshake there must follow a bitfield message. Correct?
No, the bitfield message is optional, but if a client sends it, it MUST be the first message after the handshake.
Also, both peers must have sent their complete handshakes (i.e. the handshaking sequence is completed) before either of them starts to send any regular messages, including the bitfield message.
Assuming I read this correctly: although it is an optional message, a peer can still send the bitfield message prior to any other message (like request, choke, unchoke, etc.). Correct?
Yes, see above. If a client sends a bitfield message anywhere else, the connection must be closed.
Assuming that my bitfield is [1,1,1,1,1,1,1,1,1,1, ...], I establish that the peer has the 10th piece missing
No. It's unclear to me whether your numbers are bits (0b1111111111) or bytes (0x01010101010101010101).
If it's bits (0b1111111111): it means the peer has pieces 0 to 9.
If it's bytes (0x01010101010101010101): it means the peer has pieces 7, 15, 23, 31, 39, 47, 55, 63, 71 and 79.
if the bitfield looks like [1,1,0,1,1,1,1,1,1,1, ...], the peer has the 3rd piece missing.
No, pieces are zero-indexed. 0b1101111111 means piece 2 is missing.
Then what does "the high bit in the first byte corresponds to piece index 0" mean?
It means that the piece with index 0 is represented by the leftmost bit (the most significant bit, in big-endian terms).
eight bits = one byte

0b10000000 = 0x80
  ^ high bit set, meaning that the client has piece 0

0b00000001 = 0x01
         ^ low bit set, meaning that the client has piece 7
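To make the mapping concrete, here is a minimal Python sketch of how a client might test a piece index against a raw bitfield payload (has_piece is just an illustrative name, not part of any protocol library):

def has_piece(bitfield: bytes, index: int) -> bool:
    # piece index 0 is the high (leftmost) bit of the first byte
    byte_index, bit_offset = divmod(index, 8)
    return (bitfield[byte_index] >> (7 - bit_offset)) & 1 == 1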
why is the spare bit used
If the number of pieces in the torrent is not evenly divisible by eight, there will be bits left over in the last byte of the bitfield that don't represent any pieces. Those bits must be set to zero.
The size of the bitfield in bytes can be calculated this way:
size_bitfield = math.ceil( number_of_pieces / 8 )
and the number of spare bits is:
spare_bits = 8 * size_bitfield - number_of_pieces
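As a quick worked example (assuming a hypothetical torrent with 1001 pieces):

import math

number_of_pieces = 1001
size_bitfield = math.ceil(number_of_pieces / 8)    # 126 bytes
spare_bits = 8 * size_bitfield - number_of_pieces  # 7 spare bits, all set to zero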
what is the purpose of the bitfield
The purpose is to tell what pieces the client has, so the other peer knows what pieces it can request.

Related

Node.js peer wire protocol implementation

While implementing the BitTorrent protocol to communicate with peers and get pieces, I ran into a problem with some incoming peer messages:
The buffer of such messages contains about 200 "255" values and then about 200 random numbers. The problem is that I can't find a definition for such a payload in the specification. The type of a message is described by the first or fourth byte in the buffer; in my situation both of them are equal to 255, and there is no such message type (available types are: 1-8, 16, 21-23).
Array representation of the buffer:
[255,255,255,255,255,239,254,255,255,255,255,255,255,255,255,247,255,255,255,255,255,255,255,255,255,255,223,255,255,255,255,255,255,255,255,255,255,255,255,255,254,255,255,255,239,255,254,237,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,239,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,251,255,255,255,255,255,255,255,255,255,255,255,255,253,191,255,255,255,255,255,255,253,255,255,255,255,255,255,255,255,255,255,255,249,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,247,255,255,255,255,255,255,255,254,191,255,127,255,247,255,255,255,255,255,255,255,255,255,255,255,255,255,0,0,0,5,4,0,0,3,11,0,0,0,5,4,0,0,5,196,0,0,0,5,4,0,0,1,186,0,0,0,5,4,0,0,2,102,0,0,0,5,4,0,0,2,95,0,0,0,5,4,0,0,6,7,0,0,0,5,4,0,0,4,30,0,0,0,5,4,0,0,4,190,0,0,0,5,4,0,0,4,189,0,0,0,5,4,0,0,2,47,0,0,0,5,4,0,0,1,19,0,0,0,5,4,0,0,0,28,0,0,0,5,4,0,0,0,223,0,0,0,5,4,0,0,2,75,0,0,0,5,4,0,0,4,33,0,0,0,5,4,0,0,1,31,0,0,0,5,4,0,0,1,100,0,0,0,5,4,0,0,6,24,0,0,0,5,4,0,0,3,181,0,0,0,5,4,0,0,4,94,0,0,0,5,4,0,0,2,99,0,0,0,5,4,0,0,6,44,0,0,0,5,4,0,0,0,74,0,0,0,5,4,0,0,6,9,0,0,0,1,1]
What you have is a bitfield message that is missing its beginning (the length, the type, and probably some of the data), followed by 24 have messages and one unchoke message.
255,255,255,255,255,239,254,255,255,255,255,255,255,255,255,247,255,255,255,255,255,255,255,255,255,255,223,255,255,255,255,255,255,255,255,255,255,255,255,255,254,255,255,255,239,255,254,237,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,239,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,251,255,255,255,255,255,255,255,255,255,255,255,255,253,191,255,255,255,255,255,255,253,255,255,255,255,255,255,255,255,255,255,255,249,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,247,255,255,255,255,255,255,255,254,191,255,127,255,247,255,255,255,255,255,255,255,255,255,255,255,255,255,
0,0,0,5,4,0,0,3,11,
0,0,0,5,4,0,0,5,196,
0,0,0,5,4,0,0,1,186,
0,0,0,5,4,0,0,2,102,
0,0,0,5,4,0,0,2,95,
0,0,0,5,4,0,0,6,7,
0,0,0,5,4,0,0,4,30,
0,0,0,5,4,0,0,4,190,
0,0,0,5,4,0,0,4,189,
0,0,0,5,4,0,0,2,47,
0,0,0,5,4,0,0,1,19,
0,0,0,5,4,0,0,0,28,
0,0,0,5,4,0,0,0,223,
0,0,0,5,4,0,0,2,75,
0,0,0,5,4,0,0,4,33,
0,0,0,5,4,0,0,1,31,
0,0,0,5,4,0,0,1,100,
0,0,0,5,4,0,0,6,24,
0,0,0,5,4,0,0,3,181,
0,0,0,5,4,0,0,4,94,
0,0,0,5,4,0,0,2,99,
0,0,0,5,4,0,0,6,44,
0,0,0,5,4,0,0,0,74,
0,0,0,5,4,0,0,6,9,
0,0,0,1,1
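As a sanity check, the first of those have messages decodes like this (a quick Python sketch):

data = bytes([0, 0, 0, 5, 4, 0, 0, 3, 11])
length = int.from_bytes(data[0:4], "big")  # 5
msg_id = data[4]                           # 4 = have
piece = int.from_bytes(data[5:9], "big")   # piece index 779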
A BitTorrent peer-to-peer connection consists of two unidirectional byte streams, one in each direction. When reading from the receive buffer, don't expect to get exactly one complete message at a time; you must split the stream into messages yourself. Also, be prepared for the responding peer to start sending messages immediately after the end of the handshake.
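A minimal sketch of such stream splitting in Python (the function name and buffering strategy are illustrative assumptions, not a prescribed design):

def split_messages(buffer: bytearray):
    # Yield complete <length-prefix><message> frames; any trailing
    # partial frame stays in the buffer until more bytes arrive.
    while len(buffer) >= 4:
        length = int.from_bytes(buffer[:4], "big")
        if len(buffer) < 4 + length:
            break  # incomplete message: wait for more data from the socket
        message = bytes(buffer[4:4 + length])  # empty bytes = keep-alive
        del buffer[:4 + length]
        yield message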

What do xfrm_replay_state_esn fields mean?

I'm trying to understand a little bit more about Linux kernel IPSec networking by looking at the kernel source. I understand conceptually that IPSec prevents replay attacks with a sequence number and a replay window, i.e. if a recipient receives a packet with a sequence number that is not within the replay window, or one that it has received before, then it drops that packet and increments the replay counter.
I'm trying to correlate this to the structure xfrm_replay_state_esn which is defined as such:
struct xfrm_replay_state_esn {
    unsigned int bmp_len;
    __u32        oseq;
    __u32        seq;
    __u32        oseq_hi;
    __u32        seq_hi;
    __u32        replay_window;
    __u32        bmp[0];
};
I've tried searching for documentation, but it's scant and I haven't been able to find man pages for the various functions and structures, so I don't understand what the individual fields relate to.
XFRM is an IPSec implementation for the Linux kernel. The name XFRM stands for "transform", referring to the transformation of IP packets as per the IPSec protocol.
The following RFCs are relevant for IPSec:
RFC4301: Definition of the IPSec protocol.
RFC4302: Definition of the Authentication Header (AH) sub-protocol for ensuring authenticity of IP packets.
RFC4303: Definition of the Encapsulating Security Payload (ESP) sub-protocol for ensuring authenticity and secrecy of IP packets.
The IPSec protocol allows for sequence numbers of size 32 bits or 64 bits. The 64 bit sequence numbers are referred to as Extended Sequence Numbers (ESN).
The anti-replay mechanism is defined in the RFCs for both AH and ESP. The mechanism keeps a window of acceptable sequence numbers of incoming packets. The window extends back from the highest sequence number received so far, defining a lower bound for the acceptable sequence numbers. When receiving a sequence number below that bound, it is rejected. When receiving a sequence number higher than the current highest sequence number, the window is shifted forward. When receiving a sequence number within the window, the mechanism will mark this sequence number in a checklist for ensuring that each sequence number in the window is only received once. If the sequence number has already been marked, it is rejected.
This checklist can be implemented as a bitmap, where each sequence number in the window is represented by a single bit, with 0 meaning this sequence number has not been received yet, and 1 meaning it has already been received.
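As an illustration of that mechanism (simplified Python, not the kernel's actual implementation):

class ReplayWindow:
    def __init__(self, size: int):
        self.size = size     # window size in sequence numbers
        self.highest = 0     # highest sequence number accepted so far
        self.bitmap = 0      # bit i set = (highest - i) already received

    def check(self, seq: int) -> bool:
        if seq > self.highest:
            # slide the window forward and mark seq as received
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False     # below the window: reject
        if (self.bitmap >> offset) & 1:
            return False     # already received once: reject as replay
        self.bitmap |= 1 << offset
        return True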
Based on this information, the meaning of the fields in the xfrm_replay_state_esn struct can be given as follows.
The struct holds the state of the anti-replay mechanism with extended sequence numbers (64 bits):
The highest sequence number received so far is represented by seq and seq_hi. Each is a 32-bit integer, so together they can represent a 64-bit number, with seq holding the lower 32 bits and seq_hi holding the upper 32 bits. The reason for splitting the 64-bit value into two 32-bit variables, instead of representing it as a single 64-bit variable, is that the IPSec protocol mandates an optimization where only the lower 32 bits of the sequence number are included in the packet. For this reason, it is more convenient to have the lower 32 bits as a separate variable in the struct, so that it can be accessed directly without resorting to bit operations.
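In other words (illustrative Python, with made-up values):

seq_hi, seq = 0x00000001, 0xFFFFFFF0
full_sequence_number = (seq_hi << 32) | seq  # the full 64-bit ESN: 0x1FFFFFFF0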
The sequence number counter for outgoing packets is tracked in oseq and oseq_hi. As before, the 64-bit number is represented by two 32-bit variables.
The size of the window is represented by replay_window. The smallest acceptable sequence number is given by the sequence number expressed by seq and seq_hi, minus replay_window, plus one.
The bitmap for checking off received sequence numbers within the window is represented by bmp. It is defined as a zero-sized array, but when the memory for the struct is allocated, additional memory is reserved after the struct, which can then be accessed e.g. with bmp[i] (which is of course just syntactic sugar for *(bmp+i)). The size of the bitmap is held in bmp_len. It is of course related to the window size, i.e. the window size divided by 8*sizeof(__u32), rounded up. I would speculate that it is stored explicitly to avoid having to recalculate the value frequently.
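For example (a hypothetical window size, computed in Python):

import math

replay_window = 128                      # window size in bits
bmp_len = math.ceil(replay_window / 32)  # 4 u32 words in the bitmap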

How to divide blocks within pieces when they overlap

Some input: I'm looking to build a simple, minimal BitTorrent client.
I have been reading the protocol spec for 2-3 days now.
Here is my understanding of it thus far. Assume the torrent has a piece length of 26000 bytes and, according to the unofficial spec, a block size of 16384. Something like this.
Now, a request for a block of a piece would look like this:
piece 0
block offset 0
block length 16384
So far so good.
Now, for the next block, which overlaps pieces 0 and 1, what should the request look like?
piece 0 ## since the starting byte is in piece 0, use piece 0 instead of piece 1
block offset 16384
block length 16384
Now, on the receiving end, I need to recreate the 26000-byte piece so that I can compare its hash with the corresponding entry in pieces to verify the piece for correctness.
Is my understanding correct?
Also, let's suppose the piece verification failed, and maybe it is because of the first block, i.e. Block 0 (which is faulty or corrupt);
then I should re-queue Block 0 and Block 1 (which was valid, by the way, and also a part of piece 1) for retransmission.
And now suddenly the piece and block division becomes a bit more complex than I assumed it would be, and I'm hoping there is a simpler solution to this.
Any thoughts?
I will use the more distinct term 'chunk' instead of the ambiguous 'block'.
A torrent is divided into pieces.
A piece is divided into chunks.
A chunk is cut from one piece.
A torrent is divided into pieces when it's created. With the Request message, a piece is in turn further divided into chunks by the downloading BitTorrent client.
How the client cuts the chunks out from a piece doesn't matter, as long as no single chunk is larger than 16 KB (16384 bytes).
The simplest and most rational way to divide a piece is to do it in as few chunks as possible: divide it into 16 KB chunks and let the last chunk of the piece be smaller if necessary.
The Request message format: <len=0013><id=6><Piece_index><Chunk_offset><Chunk_length>
<Piece_index> integer specifying the zero-based piece index
<Chunk_offset> integer specifying the zero-based byte offset within the piece
<Chunk_length> integer specifying the requested number of bytes
When requesting a chunk:
the whole chunk must be within the piece specified by the Piece_index,
i.e. Chunk_offset+Chunk_length must be less than or equal to the size of that specific piece*.
the Chunk_length cannot be larger than 16 KB (16384 bytes) and must be at least 1 byte
the peer that gets the request must have the piece specified by the Piece_index
If any of these conditions is not met, the peer receiving the request will close the connection.
* For all pieces except the very last one, that is the 'piece length' defined in the info dictionary.
The size of the last piece can be calculated as:
size_last_piece = size_of_torrent - (number_of_pieces - 1) * 'piece length'
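A minimal Python sketch of this chunking strategy (the function name is illustrative):

def chunk_requests(piece_index: int, piece_size: int, chunk_size: int = 16384):
    # Yield (piece_index, offset, length) tuples covering one piece.
    for offset in range(0, piece_size, chunk_size):
        yield piece_index, offset, min(chunk_size, piece_size - offset)

# e.g. a 26000-byte piece yields (0, 0, 16384) and (0, 16384, 9616)
print(list(chunk_requests(0, 26000)))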
The maximum block size commonly accepted by clients is 16KiB. Clients are free to make smaller requests.
Pieces are commonly a multiple of 16KiB, but the current spec does not require it (this changes with BEP52) and some people use prime numbers or similar things for fun, so they do exist in the wild.
Blocks only exist in the sense that you need multiple requests to get a complete piece that is larger than 16KiB. In other words, blocks are the same thing as whatever you decide to request. You could request 500 bytes, then 1017 bytes and then 13016 bytes, ... until you got a complete piece. They are arbitrary subdivisions within a piece - there is no overlap - that you need to keep track of between the start of downloading a piece and finishing the piece.
They do not participate in hashing, they do not factor into the HAVE or BITFIELD messages. Only REQUEST, PIECE, CANCEL and REJECT messages concern themselves with blocks. And instead of blocks you could also call them sub-piece offset-length tuples or something to that effect.
The last block in a piece may be smaller than the transfer block size, i.e. 26000 - 16384 = 9616 bytes should be requested in the second request. As soon as all 26000 bytes have been received, the SHA-1 hash should be calculated and compared with the corresponding checksum from the pieces section of the metainfo dictionary. If the checksum does not match, you have no means of knowing which block contained invalid data, and should re-download all blocks of this piece.
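The verification step itself is a one-liner; a sketch in Python, assuming expected_sha1 is the 20-byte entry for this piece taken from the 'pieces' string of the metainfo dictionary:

import hashlib

def piece_ok(piece_data: bytes, expected_sha1: bytes) -> bool:
    return hashlib.sha1(piece_data).digest() == expected_sha1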
My advice would be not to depend on some particular partitioning of the piece, because:
1) peers may use a different transfer block size when requesting data
2) the SHA-1 algorithm is block-based, and the digester had better use a bigger block size (otherwise calculations will take more time)
A proper abstraction for a piece would be a generic data range with the following methods:
read(from:int, length:int):byte[]
write(offset:int, block:byte[]):()
Then you'll be able to read/write arbitrary subranges of data.
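A minimal Python sketch of such an abstraction (following the signatures above; the class name is made up):

class PieceBuffer:
    def __init__(self, size: int):
        self.data = bytearray(size)

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.data[offset:offset + length])

    def write(self, offset: int, block: bytes) -> None:
        self.data[offset:offset + len(block)] = block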

In an RPC, how does a receiver identify the sender's architecture?

I am trying to wrap my head around how the receiver identifies the endianness of the sender. I know the initial byte is usually the architecture/type of the sender; for example, 0x00 is i386, etc. However, how does the first byte help at all if the receiver has no idea how to interpret it?
Endianness refers to the ordering of bytes into larger numbers, not the order of bits inside a byte. A single byte is always endian-safe; networks transfer byte streams transparently (that is, bytes are received in the same order in which they were sent).
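A quick illustration in Python of why only multi-byte values are affected:

raw = bytes([0x12, 0x34])
big = int.from_bytes(raw, "big")        # 0x1234 = 4660
little = int.from_bytes(raw, "little")  # 0x3412 = 13330
first = raw[0]                          # 0x12 = 18, identical on every architecture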

To pad or not to pad - creating a communication protocol

I am creating a protocol to have two applications talk over a TCP/IP stream and am figuring out how to design a header for my messages. Using the TCP header as an initial guide, I am wondering if I will need padding. I understand that when we're dealing with a cache, we want to make sure that the data being stored fits in a cache line so that when it is retrieved it is done so efficiently. However, I do not understand how it makes sense to pad a header considering that an application will parse a stream of bytes and store it how it sees fit.
For example: I want to send a message header consisting of a 3-byte field followed by a 1-byte padding field for 32-bit alignment. Then I will send the message data.
In this case, the receiver will just take 3 bytes from the stream, throw away the padding byte, and then start reading the message data. As I see it, he will now be storing the 3 bytes and the message data the way he wants. The whole point of byte alignment is so that data is retrieved in an efficient manner. But if the receiver doesn't care about the padding, how will it be retrieved efficiently?
Without the padding, the receiver just takes the 3 header bytes from the stream and then takes the data bytes. Since the receiver stores these bytes however he wants, how does it matter whether or not the padding is done?
Maybe I'm missing the point of padding.
It's slightly hard to extract a question from this post, but with what I've said you guys can probably point out my misconceptions.
Please let me know what you guys think.
Thanks,
jbu
If word alignment of the message body is of some use, then by all means, pad the message to avoid other contortions. The padding will be of benefit if most of the message is processed as machine words with decent intensity.
If the message is a stream of bytes, for instance xml, then padding won't do you a whole heck of a lot of good.
As far as actually designing a wire protocol goes, you should probably consider using a plain-text protocol with compression (including the header), which will probably use less bandwidth than any hand-designed binary protocol you could possibly invent.
I do not understand how it makes sense to pad a header considering that an application will parse a stream of bytes and store it how it sees fit.
If I'm a receiver, I might pass a buffer (i.e. an array of bytes) to the protocol driver (i.e. the TCP stack) and say, "give this back to me when there's data in it".
What I (the application) get back, then, is an array of bytes which contains the data. Using C-style tricks like "casting" and so on, I can treat portions of this array as if they were words and double-words (not just bytes) ... provided that they're suitably aligned (which is where padding may be required).
Here's an example of a statement which reads a DWORD from an offset in a byte buffer:
#include <stdint.h>

typedef uint32_t DWORD;
typedef uint8_t  byte;

DWORD getDword(const byte* buffer)
{
    //we want the DWORD which starts at byte-offset 8
    buffer += 8;
    //dereference as if it were pointing to a DWORD
    //(this would fail on some machines if the pointer
    //weren't pointing to a DWORD-aligned boundary)
    return *((const DWORD*)buffer);
}
Here's the corresponding function in Intel assembly; note that it's a single opcode, i.e. quite an efficient way to access the data, more efficient than reading and accumulating separate bytes:
mov eax,DWORD PTR [esi+8]
One reason to consider padding is if you plan to extend your protocol over time. Some of the padding can be intentionally set aside for future assignment.
Another reason to consider padding is to save a couple of bits in length fields, i.e. if the length is always a multiple of 4 or 8, that saves 2 or 3 bits off the length field.
One other good reason that TCP has padding (which probably does not apply to you) is that it allows dedicated network-processing hardware to easily separate the data from the header. As the data always starts on a 32-bit boundary, it's easier to separate the header from the data when the packet gets routed.
If you have a 3-byte header and align it to 4 bytes, then designate the unused byte as 'reserved for future use' and require its bits to be zero (rejecting messages where they are not as malformed). That leaves you some extensibility. Or you might decide to use the byte as a version number: initially zero, then incrementing it if (when) you make incompatible changes to the protocol. Don't let the value be 'undefined' and "don't care"; you'll never be able to use it if you start out that way.
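As a small illustration of such a padded header (the field names are made up), Python's struct module makes the reserved byte explicit:

import struct

# hypothetical header: three 1-byte fields, one reserved pad byte ('x',
# always written as zero), then a 4-byte length, in network byte order
HEADER = struct.Struct("!BBBxI")  # 8 bytes total, 32-bit aligned

packed = HEADER.pack(1, 2, 3, 42)
msg_type, flags, version, length = HEADER.unpack(packed)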
