In the XMPP RFC, there are two MUST directives stating that the XML used for STARTTLS and SASL must not include any whitespace, for the sake of something that the spec states as "security layer byte precision". What is that?
Relevant extracts from the RFC:
...
During STARTTLS negotiation, the entities MUST NOT send any whitespace as separators between XML elements (i.e., from the last character of the first-level element qualified by the 'urn:ietf:params:xml:ns:xmpp-tls' namespace as sent by the initiating entity, until the last character of the first-level element qualified by the 'urn:ietf:params:xml:ns:xmpp-tls' namespace as sent by the receiving entity). This prohibition helps to ensure proper security layer byte precision.
...
During SASL negotiation, the entities MUST NOT send any whitespace as separators between XML elements (i.e., from the last character of the first-level element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace as sent by the initiating entity, until the last character of the first-level element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace as sent by the receiving entity). This prohibition helps to ensure proper security layer byte precision.
This directive is there to ensure proper handling of byte streams. Imagine a client that sends a newline after each XML fragment. It might send a response like this:
<response ... /> [LF]
The server will parse the XML incrementally up to the final '>', at which point it will send a <success/> element back to the client. Now the client will send a new stream start, i.e. <stream:stream ... >, using the security layer. This should cause the security layer to break on the server side, since the server will treat the stray LF character as the first byte of security-layer data when it is not.
You may say that the server should simply clear its receive buffer before issuing a <success/> packet, but this is not the proper way to treat a byte stream. After all, the underlying subsystem might have delayed the delivery of that LF character, and the server might receive it after sending the <success/> packet.
The solution, of course, is for the client to NOT send such extra data. You can read more about this specific discussion here on the mailing list.
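The failure mode can be sketched with a toy model. This is only an illustration: split_at_element_end is a hypothetical helper, not part of any XMPP library, and it assumes the received bytes contain a single element whose first '>' closes it.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Toy model (hypothetical helper, not a real XMPP API): split the bytes a
// server has received into the final negotiation element and whatever
// follows it. Once <success/> has been sent, every following byte is handed
// to the security layer as if it were protected data.
std::pair<std::string, std::string> split_at_element_end(const std::string& received) {
    const std::size_t end = received.find('>');
    if (end == std::string::npos) {
        return {received, std::string()};  // element not complete yet, nothing left over
    }
    return {received.substr(0, end + 1), received.substr(end + 1)};
}
```

For the input "<response/>\n" the second part is the single LF. That byte is exactly what the security layer would try to interpret as the start of protected data.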
Related
My question is about how to properly treat data received over a TCP connection. Establishing a TCP connection creates a stream. Suppose I want to send a message which has a beginning and an end. As the data flows through the stream without any boundaries, how can I identify the beginning and the end of a message? I thought of putting some special characters at the beginning and at the end of my message in order to recognize them, but I wonder if that is a proper way to do it. My question is therefore: how can I properly establish message boundaries on a TCP connection? (I'm using Node.js on the client side and Java on the server side.)
thank you in advance
A plain TCP connection needs some sort of protocol which defines the data format so the receiving end knows how to interpret what is being sent. For example, HTTP is one such protocol, WebSocket is another. There are thousands of existing protocols. I'd suggest you find one that is a good match for what you want to do and use it rather than building your own.
Different protocols use different schemes for defining the data format and thus different ways of delineating pieces of your data. For example, HTTP terminates each header line with CRLF (\r\n), writes each header as name: value, and uses a blank line to mark the end of the headers.
Other protocols use a binary format that define message packets with a message type, a message length and a message payload.
There are literally hundreds of different ways to do it. Since there are so many pre-built choices out there, one can usually find an existing protocol that is a decent match and use a pre-built server and client for each end rather than writing your own protocol generating and parsing code.
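If you do end up rolling your own, a delimiter-based scheme might look roughly like the sketch below. It is written in C++ only for illustration (the same idea carries over to Node.js or Java), LineFramer is a made-up name, and it assumes that '\n' never occurs inside a message. The point it shows is that one TCP read may contain half a message or several messages, so the receiver has to buffer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects raw TCP reads and yields complete messages terminated by '\n'.
// LineFramer is an illustrative name, not an existing library class.
class LineFramer {
public:
    // Append one chunk as returned by a socket read and return every
    // message completed so far. A trailing partial message stays buffered.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> messages;
        std::size_t start = 0;
        for (std::size_t nl = buffer_.find('\n', start); nl != std::string::npos;
             nl = buffer_.find('\n', start)) {
            messages.push_back(buffer_.substr(start, nl - start));
            start = nl + 1;
        }
        buffer_.erase(0, start);  // drop everything that was already delivered
        return messages;
    }

private:
    std::string buffer_;
};
```

Feeding it "hel" and then "lo\nwor" yields nothing on the first call and the single message "hello" on the second, with "wor" kept for later.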
Sometimes I receive these strange responses from other nodes. The transaction ID matches my request's transaction ID, and so does the remote IP, so I tend to believe that the node really responded with this, but it looks like a mix of a response and a request.
d1:q9:find_node1:rd2:id20:.éV0özý.?tjN.?.!2:ip4:DÄ.^7:nodes.v26:.ï?M.:iSµLW.Ðä¸úzDÄ.^æCe1:t2:..1:y1:re
Worst of all, it is malformed. Look at 7:nodes.v: it means that a key nodes.v is added to the dictionary. It is supposed to be 5:nodes. So, I'm lost. What is it?
The internet and remote nodes are unreliable or buggy. You have to code defensively. Do not assume that everything you receive will be valid.
Remote peers might
send invalid bencoding. Discard those; don't even try to recover.
send truncated messages. These are usually not recoverable unless the only missing byte happens to be the very last e of the root dictionary.
omit mandatory keys. You can either ignore those messages or return an error message.
send corrupted data.
include unknown keys beyond the mandatory ones. This is not an error; just treat them as if they weren't there, for the sake of forward compatibility.
actually be attackers trying to fuzz your implementation or use you as a DoS amplifier.
I also suspect that some really shoddy implementations are based on whatever string types their programming language supports and incorrectly handle encoding instead of using arrays of uint8 as bencoding demands. There's nothing that can be done about those. Ignore or occasionally send an error message.
Specified dictionary keys are usually ASCII-mappable, but this is not a requirement. E.g. there are some tracker response types that actually use random binary data as dictionary keys.
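To make "discard invalid bencoding" concrete, here is a minimal structural check, assuming the packet is held as raw bytes. The function names and the nesting cap of 32 are my own choices. The sketch only verifies framing: it does not reject leading zeros or unsorted dictionary keys, and it does not look at which keys are present.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <vector>

// Strict structural check for one bencoded value starting at data[pos].
// On success pos is advanced past the value. Names are illustrative.
bool skip_bencoded(const std::vector<std::uint8_t>& data, std::size_t& pos, int depth = 0) {
    if (depth > 32 || pos >= data.size()) return false;  // depth cap is an arbitrary choice
    const std::uint8_t c = data[pos];
    if (c == 'i') {  // integer: i<digits>e with an optional leading '-'
        ++pos;
        if (pos < data.size() && data[pos] == '-') ++pos;
        const std::size_t first = pos;
        while (pos < data.size() && std::isdigit(data[pos])) ++pos;
        if (pos == first || pos >= data.size() || data[pos] != 'e') return false;
        ++pos;
        return true;
    }
    if (c == 'l' || c == 'd') {  // list or dictionary: items until the closing 'e'
        ++pos;
        while (pos < data.size() && data[pos] != 'e') {
            if (c == 'd') {  // dictionary keys must be byte strings
                if (!std::isdigit(data[pos]) || !skip_bencoded(data, pos, depth + 1)) return false;
            }
            if (!skip_bencoded(data, pos, depth + 1)) return false;  // list item or dict value
        }
        if (pos >= data.size()) return false;  // truncated: closing 'e' is missing
        ++pos;
        return true;
    }
    if (std::isdigit(c)) {  // byte string: <length>:<bytes>
        std::size_t len = 0;
        while (pos < data.size() && std::isdigit(data[pos])) {
            len = len * 10 + static_cast<std::size_t>(data[pos] - '0');
            if (len > data.size()) return false;  // longer than the whole packet
            ++pos;
        }
        if (pos >= data.size() || data[pos] != ':') return false;
        ++pos;
        if (len > data.size() - pos) return false;  // declared length overruns the buffer
        pos += len;
        return true;
    }
    return false;  // unknown type marker
}

// True only if the buffer is exactly one well-formed value with no trailing bytes.
bool is_valid_bencoding(const std::vector<std::uint8_t>& data) {
    std::size_t pos = 0;
    return skip_bencoded(data, pos) && pos == data.size();
}
```

Working on a vector of uint8_t rather than a text string type sidesteps the encoding problems mentioned above, since byte strings and keys may hold arbitrary binary data.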
Here are a few examples of junk I'm seeing[1] that even fails bdecoding:
d1:ad2:id20:�w)��-��t����=?�������i�&�i!94h�#7U���P�)�x��f��YMlE���p:q9Q�etjy��r7�:t�5�����N��H�|1�S�
d1:e�����������������H#
d1:ad2:id20:�����:��m�e��2~�����9>inm�_hash20:X�j�D��nY��-������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
d1:ad2:id20:�����:��m�e��2~�����9=inl�_hash20:X�j�D��nY���������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
d1:ad2:id20:�����:��m�e��2~�����9?ino�_hash20:X�j�D��nY���������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
[1] preserved char count. replaced all non-printable, ASCII-incompatible bytes with the unicode replacement character.
I am left with a few questions after reading the RFC 6520 for Heartbeat:
https://www.rfc-editor.org/rfc/rfc6520
Specifically, I don't understand why a heartbeat needs to include arbitrary payloads or even padding for that matter. From what I can understand, the purpose of the heartbeat is to verify that the other party is still paying attention at the other end of the line.
What do these variable-length custom payloads provide that a fixed request and response do not?
E.g.
Alice: still alive?
Bob: still alive!
After all, FTP uses the NOOP command to keep connections alive, which seems to work fine.
There is, in fact, a reason for this payload/padding in RFC 6520.
From the document:
> The user can use the new HeartbeatRequest message, which has to be answered by the peer with a HeartbeartResponse immediately. To perform PMTU discovery, HeartbeatRequest messages containing padding can be used as probe packets, as described in [RFC4821].
> In particular, after a number of retransmissions without receiving a corresponding HeartbeatResponse message having the expected payload, the DTLS connection SHOULD be terminated.
> When a HeartbeatRequest message is received and sending a HeartbeatResponse is not prohibited as described elsewhere in this document, the receiver MUST send a corresponding HeartbeatResponse message carrying an exact copy of the payload of the received HeartbeatRequest.
> If a received HeartbeatResponse message does not contain the expected payload, the message MUST be discarded silently. If it does contain the expected payload, the retransmission timer MUST be stopped.
Credit to pwg at HackerNews. There is a good and relevant discussion there as well.
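The rules quoted above could be sketched as follows. This is my own illustration, not code from any TLS library: the struct and function names are made up, and the layout (1-byte type, 2-byte big-endian payload_length, payload, at least 16 bytes of padding) is taken from RFC 6520.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Heartbeat {
    std::uint8_t type;                  // 1 = request, 2 = response (RFC 6520)
    std::vector<std::uint8_t> payload;  // echoed verbatim by the peer
};

constexpr std::size_t kMinPadding = 16;  // RFC 6520: padding is at least 16 bytes

// Parse one HeartbeatMessage from a decrypted record body. Returns false when
// the declared payload_length does not fit into the record, in which case the
// message is meant to be discarded silently.
bool parse_heartbeat(const std::vector<std::uint8_t>& record, Heartbeat& out) {
    if (record.size() < 1 + 2 + kMinPadding) return false;
    const std::size_t payload_length = (std::size_t{record[1]} << 8) | record[2];
    if (payload_length > record.size() - 1 - 2 - kMinPadding) return false;
    out.type = record[0];
    const auto first = record.begin() + 3;
    out.payload.assign(first, first + static_cast<std::ptrdiff_t>(payload_length));
    return true;
}

// A response counts only if it carries exactly the payload that was sent.
bool matches_request(const Heartbeat& response, const std::vector<std::uint8_t>& sent_payload) {
    return response.type == 2 && response.payload == sent_payload;
}
```

The comparison in matches_request is what lets a sender tell a fresh response from a stale or reordered one. The bounds check on payload_length is the check whose absence in OpenSSL became known as Heartbleed.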
(The following is not a direct answer, but is here to highlight related comments on another question about Heartbleed.)
There are arguments against the protocol design that allowed an arbitrary limit - either that there should have been no payload (or even echo/heartbeat feature) or that a small finite/fixed payload would have been a better design.
From the comments on the accepted answer in Is the heartbleed bug a manifestation of the classic buffer overflow exploit in C?
(R..) In regards to the last question, I would say any large echo request is malicious. It's consuming server resources (bandwidth, which costs money) to do something completely useless. There's really no valid reason for the heartbeat operation to support any length but zero
(Eric Lippert) Had the designers of the API believed that then they would not have allowed a buffer to be passed at all, so clearly they did not believe that. There must be some by-design reason to support the echo feature; why it was not a fixed-size 4 byte buffer, which seems adequate to me, I do not know.
(R..) .. Nobody thinking from a security standpoint would think that supporting arbitrary echo requests is reasonable. Even if it weren't for the heartbleed overflow issue, there may be cryptographic weaknesses related to having such control over the content the peer sends; this seems unlikely, but in the absence of a strong reason to support a[n echo] feature, a cryptographic system should not support it. It should be as simple as possible.
While I don't know the exact motivation behind this decision, it may have been motivated by the ICMP echo request packets used by the ping utility. In an ICMP echo request, an arbitrary payload of data can be attached to the packet, and the destination server will return exactly that payload if it is reachable and responding to ping requests. This can be used to verify that data is being properly sent across the network and that payloads aren't being corrupted in transit.
https://www.rfc-editor.org/rfc/rfc6520 does not explain why a heartbeat request/response round-trip is supposed to contain a payload. It just specifies that there is room for payload and that the response has to contain the same payload as the request.
What is this payload good for? My questions are:
What could it be that the engineers thought when they designed the protocol to allow for including arbitrary payload into the heartbeat request? What are the advantages?
What are the reasons that this payload must be contained in the response?
I see that by allowing for arbitrary payload the application is able to unambiguously match a certain response with a certain request. Is that the only advantage? If yes, then why did one not force the payload to be of a certain length? What is the flexibility in the payload length good for? Does it have to do with a cryptographic concept, such that the length of heartbeat requests must be unpredictable?
Other "heartbeat"-like protocol extensions simply pre-define the exact request (e.g. "ping") and the corresponding response (e.g. "pong"). Why did https://www.rfc-editor.org/rfc/rfc6520 take a different route?
It is important to understand the reasoning behind the choices made in RFC6520 in order to properly assess hypotheses that all this might have been an intelligently placed backdoor.
Regarding the arbitrary size: the RFC abstract states that the Heartbeat extension is a basis for path MTU (PMTU) discovery for DTLS. Varying the size is the basis for implementing that discovery (http://en.wikipedia.org/wiki/Path_MTU_Discovery).
Regarding the arbitrary content: packet ordering may not be preserved, or packets may be lost. Varying the content helps to match each response to the request it answers.
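To illustrate why a variable size is useful, here is one possible probing strategy, a plain binary search. It is only a sketch: RFC 6520 does not prescribe a search strategy, the probe callback is hypothetical (it stands for "send a HeartbeatRequest padded to this size and wait for the matching response"), and the code assumes that a probe of `low` bytes succeeds and that success is monotonic in size.

```cpp
#include <cstddef>
#include <functional>

// Binary search for the largest probe size that is answered. `probe` is a
// hypothetical callback: it sends a HeartbeatRequest padded to `size` bytes
// and reports whether the matching HeartbeatResponse arrived in time.
std::size_t find_path_mtu(std::size_t low, std::size_t high,
                          const std::function<bool(std::size_t)>& probe) {
    while (low < high) {
        const std::size_t mid = low + (high - low + 1) / 2;  // round up so the loop terminates
        if (probe(mid)) {
            low = mid;       // this size got through, try larger
        } else {
            high = mid - 1;  // lost, so the limit is below mid
        }
    }
    return low;
}
```

With a fixed-size "ping"/"pong" exchange there would be nothing to vary, so this kind of probing would need a separate mechanism.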
Guys, I need some insight here.
I know the definition of a protocol, but being new to C++ programming, this is quite a challenging task. I am creating a multi-threaded chat using SDL/C++. This is a learning experience for me, and now I have hit a hump that I need to get over, but understanding it is a little more difficult than I had thought. I need to make a chat protocol of some sort, I think, but I am stumped. Up until this point I have been sending messages as strings of characters. Now that I am improving the application to the point where clients can register and log in, I need a better way of communicating between my clients and server.
thank you.
Create objects that represent a message, then serialize the object, send it over the network, then deserialize at the other end.
For example, you could create a class called LoginMessage that contains two fields. One for a user name, and one for a password. To login, you would do something like:
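LoginMessage msg;
msg.username = "Fred";
msg.password = "you'll never guess";
// serialize() stands for whatever encoder you choose; returning std::string
// keeps the length, which a bare char* would lose if the bytes contain '\0'
std::string serialized_msg = serialize(msg);
// send the bytes over the network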
You would do something similar at the other end to convert the byte stream back into an object.
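As a rough idea of what serialize and its counterpart might do if written by hand, here is a sketch. The wire format (each string preceded by a 2-byte big-endian length) and the helper names put_string and get_string are my own assumptions, not something the libraries mentioned in this answer prescribe.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct LoginMessage {
    std::string username;
    std::string password;
};

// Append a 2-byte big-endian length followed by the string bytes.
void put_string(std::string& out, const std::string& s) {
    assert(s.size() <= 0xFFFF);  // longer strings do not fit this sketch's format
    out.push_back(static_cast<char>((s.size() >> 8) & 0xFF));
    out.push_back(static_cast<char>(s.size() & 0xFF));
    out += s;
}

// Read a string written by put_string; returns false if the input is too short.
bool get_string(const std::string& in, std::size_t& pos, std::string& s) {
    if (pos > in.size() || in.size() - pos < 2) return false;
    const std::size_t len =
        (static_cast<std::size_t>(static_cast<unsigned char>(in[pos])) << 8) |
        static_cast<unsigned char>(in[pos + 1]);
    pos += 2;
    if (in.size() - pos < len) return false;
    s = in.substr(pos, len);
    pos += len;
    return true;
}

std::string serialize(const LoginMessage& msg) {
    std::string out;
    put_string(out, msg.username);
    put_string(out, msg.password);
    return out;
}

// Returns false for truncated input or input with trailing bytes.
bool deserialize(const std::string& bytes, LoginMessage& msg) {
    std::size_t pos = 0;
    return get_string(bytes, pos, msg.username) &&
           get_string(bytes, pos, msg.password) && pos == bytes.size();
}
```

The receiving side calls deserialize on the bytes it got and checks the return value before trusting the fields.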
There are APIs for creating message objects and serializing them for you. Here are two popular ones. Both should suit your needs.
Protocol Buffers by Google
Thrift By Facebook
If you want the serialized messages to be readable, you can use YAML. There is a C++ library called yaml-cpp for serializing data to YAML format.
UPDATE:
Those APIs are for making your own protocol. They just handle the conversion of messages from object form to byte stream form. They do have features for the actual transport of the messages over the network, but you don't need to use those features. How you design your protocol is up to you. But if you want to create messages by hand, you can do that too.
I'll give you some ideas for creating your own message format.
This is one way to do it.
Have the first 4 bytes of the message represent the length of the message as an unsigned integer. This is necessary to figure out where one message ends and where the next one starts. You will need to convert between host and network byte order when reading and writing to/from these four bytes.
Have the 5th byte represent the message type. For example, you could use a 1 to indicate a login request, a 2 to indicate a login response, and 3 to indicate a chat message. This byte is necessary for interpreting the meaning of the remaining bytes.
The remaining bytes would contain the message contents. For example, if it was a login message, you would encode the username and password into these bytes somehow. If it is a chat message, these bytes would contain the chat text.
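The layout described above could be sketched like this. Frame, encode_frame and decode_frame are names I made up, and the sketch assumes that the length field counts only the content bytes that follow the type byte. The bytes are written one at a time in big-endian order, which is network byte order, so no separate htonl/ntohl call is needed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

struct Frame {
    std::uint8_t type;    // e.g. 1 = login request, 2 = login response, 3 = chat
    std::string payload;  // message contents, interpreted according to type
};

// Layout: 4-byte big-endian length | 1-byte type | payload.
// Assumption of this sketch: the length counts the payload bytes only.
std::string encode_frame(const Frame& f) {
    const std::uint32_t length = static_cast<std::uint32_t>(f.payload.size());
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((length >> shift) & 0xFF));  // network byte order
    }
    out.push_back(static_cast<char>(f.type));
    out += f.payload;
    return out;
}

// Try to remove one complete frame from the front of `buffer`. Returns false
// and leaves the buffer untouched when more bytes are needed.
bool decode_frame(std::string& buffer, Frame& out) {
    constexpr std::size_t kHeader = 5;  // 4 length bytes + 1 type byte
    if (buffer.size() < kHeader) return false;
    std::uint32_t length = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        length = (length << 8) | static_cast<unsigned char>(buffer[i]);
    }
    if (buffer.size() - kHeader < length) return false;
    out.type = static_cast<std::uint8_t>(buffer[4]);
    out.payload = buffer.substr(kHeader, length);
    buffer.erase(0, kHeader + length);
    return true;
}
```

On the receiving side you would append every socket read to the buffer and call decode_frame in a loop until it returns false. A real server would probably also reject absurdly large length values instead of waiting for gigabytes that never arrive.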