UDP socket file transfer in Python 3.5

How do I transfer a large file (video, audio) from my client to the server on localhost using UDP sockets in Python 3.5? I was able to send a small .txt file, but not other file types. Please give me suggestions.
Thank you!
Here is my code to transfer a text file.
CLIENT CODE:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
host = '127.0.0.1'
port = 6000

msg = "Trial msg"
msg = msg.encode('utf-8')
while 1:
    s.sendto(msg, (host, port))        # initial hello to the server
    data, servaddr = s.recvfrom(1024)
    data = data.decode('utf-8')
    print("Server reply:", data)
    break

s.settimeout(5)
filehandle = open("testing.txt", "rb")
finalmsg = filehandle.read(1024)       # only the first 1024 bytes are read
s.sendto(finalmsg, (host, port))
SERVER CODE:
import socket

host = '127.0.0.1'
port = 6000
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("", port))
print("waiting on port:", port)

while 1:
    data, clientaddr = s.recvfrom(1024)
    data = data.decode('utf-8')
    print(data)
    s.settimeout(4)
    break

reply = "Got it thanks!"
reply = reply.encode('utf-8')
s.sendto(reply, clientaddr)
clientmsg, clientaddr = s.recvfrom(1024)   # receive the file contents

Don't use UDP for transferring large files; use TCP.
UDP does not guarantee that all the packets you send will arrive, or that they will arrive in order; they may even be duplicated. Furthermore, UDP is poorly suited to large transfers because (1) it has no congestion control, so you will simply flood the network and packets will be dropped, and (2) you have to break the data into small datagrams (about 1400 bytes is usually recommended, to stay under the MTU); if you instead rely on IP fragmentation and one fragment is lost, the whole datagram is lost. You would have to write custom code to fix all of these issues, since a file transfer requires everything to arrive reliably.
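For illustration only, a minimal sketch of that chunking step (the 1400-byte limit and the 4-byte sequence header are assumptions; there is no acknowledgment or retransmission here, so this is not yet a reliable transfer):
import socket
import struct

CHUNK = 1400  # stay below a typical Ethernet MTU to avoid IP fragmentation

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
with open("video.mp4", "rb") as f:   # hypothetical file name
    seq = 0
    while True:
        payload = f.read(CHUNK)
        if not payload:
            break
        # prepend a sequence number so the receiver can detect loss and
        # reordering; recovering from them is still up to your protocol
        s.sendto(struct.pack("!I", seq) + payload, ("127.0.0.1", 6000))
        seq += 1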
TCP, on the other hand, already does all of this. It is reliable, has congestion control, and is ubiquitous: you are viewing this web page over HTTP, which runs on top of TCP.
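To make that concrete, here is a minimal sketch of the same transfer over TCP (run as two separate programs; host, port, and file names are illustrative):
SERVER CODE:
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("", 6000))
srv.listen(1)
conn, addr = srv.accept()
with open("received.mp4", "wb") as out:
    while True:
        data = conn.recv(4096)   # read until the sender closes the connection
        if not data:
            break
        out.write(data)
conn.close()
CLIENT CODE:
import socket

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", 6000))
with open("video.mp4", "rb") as f:
    while True:
        chunk = f.read(4096)
        if not chunk:
            break
        cli.sendall(chunk)   # sendall loops until every byte is queued
cli.close()                  # the FIN tells the server the file is complete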

If you must use UDP instead of TCP or an application-level protocol on top of it, then you should be able to generate redundant blocks with a package like zfec, so that you can reconstruct the original data even if not all of the packets are received.
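A sketch of that idea, assuming zfec's Encoder/Decoder interface (the k, m, and block-size values are illustrative; check the package documentation for the exact API):
import zfec

k, m = 3, 5           # any k of the m blocks are enough to rebuild the data
data = b"x" * 3000    # must split evenly into k equal-sized blocks

primary = [data[i * 1000:(i + 1) * 1000] for i in range(k)]
blocks = zfec.Encoder(k, m).encode(primary)   # m blocks of 1000 bytes each

# pretend blocks 1 and 3 were lost in transit
survivors = [blocks[0], blocks[2], blocks[4]]
recovered = zfec.Decoder(k, m).decode(survivors, [0, 2, 4])
assert b"".join(recovered) == data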

Related

When using recv(n) with n greater than the MTU, are you guaranteed to read at least a whole layer-2 frame?

I was wondering: imagine there is no data to read on a TCP socket, and then a whole (full) frame of 1492 bytes arrives. In your code (C, or any language supporting TCP), say you call recv with 4096 bytes: will the OS guarantee that recv reads the whole 1492 bytes, or is it possible that the loading of the frame into memory and the recv are "interleaved", so the recv may return less?
TCP is a stream-oriented protocol. Data are received in order, but you must not make any assumption about how many times you have to call recv before you have received all of your data.
It is up to your application to repeat the calls to recv until you know you have received what you need.
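For example, a minimal helper that loops until exactly n bytes have been read (a common idiom, not part of the socket API itself):
def recv_exactly(sock, n):
    # call recv repeatedly until exactly n bytes have arrived
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:   # peer closed before sending n bytes
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf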
(1) TCP is a stream-oriented protocol. This means that it accepts a stream of data from the upper layer on the sender and returns a stream of data to the upper layer on the receiver. TCP itself receives packets from the IP layer and reconstructs the stream, so at some point the packets cease to exist. In theory it is possible that, somewhere during this reconstruction, only half of an incoming packet has been copied into the buffer, but it seems pretty unlikely that this would happen.
Now, the Linux man page states:
The receive calls normally return any data available up to the requested amount,
I would interpret that as "if one packet has arrived (correctly, in order, etc.), you will get that whole packet's worth of data". But there is no guarantee.
On the other hand, the Windows docs state:
recv will return as much data as is currently available—up to the size of the buffer specified.
Which sounds more like a guarantee.
Note, however, that data will only be returned if the packet was received correctly and is the next in-order packet (the one with the next expected sequence numbers).
(2) Now, the TCP layer works on complete packets; it is actually impossible for it to do any interleaving. Ethernet has a checksum, which cannot be verified unless the packet was received completely, and packets with an incorrect Ethernet checksum should be filtered out by the network card. TCP also has a checksum, which requires all of the packet data to compute. So if the network card has passed the packet to your OS, the data should be available.
(3) I don't think you can assume that if a packet is received it is immediately available. A pretty common feature of network cards is TCP segmentation offload (and its receive-side counterpart, large receive offload), which reconstructs part of the stream and results in the network card passing the OS one TCP packet that was reconstructed from multiple TCP packets. Other mechanisms exist to reduce the number of interrupts, which more or less result in several packets arriving at once. So the more likely situation is that there will be some delay and then you will receive data from several packets at once.
The point is that the opposite of what you described is more likely to happen. However, I still would not write an application that makes any assumptions about how large a chunk of data is available at a time; that negates the concept of a stream.

Send TCP SYN packet with payload

Is it possible to send a SYN packet with a self-defined payload when initiating a TCP connection? My gut feeling is that it is theoretically doable. I'm looking for an easy way to achieve this in Linux (with C, or perhaps Go), but because it is not standard behavior, I haven't found helpful information yet. (This post is quite similar, but not very helpful.)
Please help me, thanks!
EDIT: Sorry for the ambiguity. Beyond whether such a thing is possible, I'm also looking for a way, or even sample code, to achieve it.
As far as I understand (and as noted in a comment by Jeff Bencteux on another answer), TCP Fast Open addresses this for TCP.
See this LWN article:
Eliminating a round trip
Theoretically, the initial SYN segment could contain data sent by the initiator of the connection: RFC 793, the specification for TCP, does permit data to be included in a SYN segment. However, TCP is prohibited from delivering that data to the application until the three-way handshake completes.
...
The aim of TFO is to eliminate one round trip time from a TCP conversation by allowing data to be included as part of the SYN segment that initiates the connection.
Obviously, if you write your own software on both sides, you can make it work however you want. But if you are relying on standard software on either end (such as a standard Linux or Windows kernel), then no, it isn't possible, because according to TCP you cannot send data until the session is established, and the session isn't established until you get an acknowledgment of your SYN from the other peer.
So, for example, if you send a SYN packet that also includes a payload to a Linux kernel (caveat: this is speculation to some extent, since I haven't actually tried it), it will simply ignore the payload and proceed to acknowledge (SYN/ACK) or reject (with RST) the SYN, depending on whether there's a listener.
In any case, you could try it, but since you're going "off the reservation", so to speak, you would need to craft your own raw packets; you won't be able to convince your local OS to create them for you.
The Python scapy package can construct it:
#!/usr/bin/env python2
from scapy.all import *

sport = 3377
dport = 2222
src = "192.168.40.2"
dst = "192.168.40.135"

# hand-craft the whole frame: Ethernet / IP / TCP SYN / payload
ether = Ether(type=0x800, dst="00:0c:29:60:57:04", src="00:0c:29:78:b0:ff")
ip = IP(src=src, dst=dst)
SYN = TCP(sport=sport, dport=dport, flags='S', seq=1000)
xsyn = ether / ip / SYN / "Some Data"

packet = xsyn.build()
print(repr(packet))
# sendp(xsyn) would actually transmit the frame on the wire
TCP Fast Open does this, but both ends must speak TCP Fast Open. QUIC, a newer protocol, is designed to solve this problem (AKA 0-RTT).
I had previously stated it was not possible, and in the general sense I stand by that assessment.
However, for the client it is actually only impossible through the connect() API; there is an alternative connect API when using TCP Fast Open. Example:
sfd = socket(AF_INET, SOCK_STREAM, 0);

// replaces connect() + send()/write()
sendto(sfd, data, data_len, MSG_FASTOPEN,
       (struct sockaddr *) &server_addr, addr_len);

// read and write further data on connected socket sfd

close(sfd);
There is no API to allow the server to attach data to the SYN-ACK sent to the client.
Even so, enabling TCP Fast Open on both the client and the server may let you achieve your desired result, if you only mean data from the client, but it has its own issues.
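In Python terms, the same idea looks roughly like this (a sketch; it assumes a Linux kernel with TFO enabled via the net.ipv4.tcp_fastopen sysctl and a Python build that exposes socket.MSG_FASTOPEN and socket.TCP_FASTOPEN):
import socket

# server: advertise TFO; the option value is the queue length for
# pending Fast Open requests
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_FASTOPEN, 16)
srv.bind(("", 2222))
srv.listen(5)

# client: sendto() with MSG_FASTOPEN replaces connect() + send(); the
# payload rides in the SYN once a TFO cookie has been cached (the very
# first connection still pays the full round trip)
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.sendto(b"data in the SYN", socket.MSG_FASTOPEN, ("127.0.0.1", 2222))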
If you want the same reliability and data-stream semantics as TCP, you will need a new reliable protocol that carries data in its initial segment in addition to the rest of what TCP provides, such as congestion control and window scaling.
Luckily, you don't have to implement it from scratch. The UDP protocol is a good starting point, and can serve as your L3 for your new L4.
Other projects have done similar things, so it may be possible to use those instead of implementing your own. Consider QUIC or UDT. These protocols were implemented over the existing UDP protocol, and thus avoid the issues faced with deploying TCP Fast Open.

Finding out the number of dropped packets in raw sockets

I am developing a program that sniffs network packets using a raw socket (AF_PACKET, SOCK_RAW) and processes them in some way.
I am not sure whether my program runs fast enough to capture all the packets on the socket. I am worried that the receive buffer for this socket occasionally fills up (due to traffic bursts) and some packets are dropped.
How do I know if packets were dropped due to lack of space in the socket's receive buffer?
I have tried running ss -f link -nlp.
This outputs the number of bytes currently stored in the receive buffer for that socket, but I cannot tell whether any packets were dropped.
I am using Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-52-generic x86_64).
Thanks.
I was having a similar problem. I knew that tcpdump was able to generate statistics about packet drops, so I tried to figure out how it did that. Looking at the tcpdump code, I noticed that it does not generate those statistics by itself: it uses the libpcap library to get them. libpcap, in turn, gets those statistics by going through the if_packet.h header and using the PACKET_STATISTICS socket option (at least I think so, but I'm no C expert).
Therefore, I saw only two solutions to the problem:
interact somehow with the Linux header files from my Python script to get the packet statistics, which seemed a bit complicated; or
use the Python wrapper for libpcap, pypcap, to get that information.
Since I had no clue how to do the first, I implemented the second option. Here is an example of how to get packet statistics using pypcap and how to parse the packet data using dpkt:
import pcap
import dpkt
import socket

pc = pcap.pcap(name="eth0", timeout_ms=10000, immediate=True)

def packet_handler(ts, pkt):
    # print packet statistics: (packets received, packets dropped,
    # packets dropped by the interface)
    print(pc.stats())
    # example packet parsing using dpkt
    eth = dpkt.ethernet.Ethernet(pkt)
    if eth.type != dpkt.ethernet.ETH_TYPE_IP:
        return
    ip = eth.data
    layer4 = ip.data
    ipsrc = socket.inet_ntoa(ip.src)
    ipdst = socket.inet_ntoa(ip.dst)

pc.loop(0, packet_handler)
The tpacket_stats structure is defined in the linux/if_packet.h header file.
Create a variable of type tpacket_stats and pass it to getsockopt with the PACKET_STATISTICS option (at the SOL_PACKET level); this returns the received and dropped packet counts.
Sometimes drops are simply due to the buffer size, so if you want to reduce the drop count, try increasing the receive buffer size with setsockopt.
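From Python this can be done directly, without pypcap (a sketch; the numeric constants are taken from the Linux headers and are assumptions if your socket module does not expose them):
import socket
import struct

SOL_PACKET = getattr(socket, "SOL_PACKET", 263)   # from <linux/socket.h>
PACKET_STATISTICS = 6                             # from <linux/if_packet.h>
ETH_P_ALL = 0x0003

s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))

# ... receive and process packets for a while ...

# struct tpacket_stats { unsigned int tp_packets; unsigned int tp_drops; };
raw = s.getsockopt(SOL_PACKET, PACKET_STATISTICS, 8)
tp_packets, tp_drops = struct.unpack("II", raw)
print("received:", tp_packets, "dropped:", tp_drops)
# note: reading the statistics resets the kernel's counters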
First off, switch your operating system.
You need a reliable, network-oriented operating system, not some pink fluffy "ease of use" distribution with "security" functionality enabled: NetBSD, or Gentoo/Arch Linux (the bare installations, not the GUI-kitted ones).
Start a simultaneous tcpdump on a network tap and capture the traffic you're supposed to receive alongside your program, then compare the results.
There's no efficient way to check whether you've received all the packets you intended to on the receiving end, since packets might be dropped at a lower level than you anticipate.
Also, this is a question for Unix @ StackExchange; there's no programming here that I can see, or at least there's no code.
The only certain way to verify packet drops is to have a much beefier sender (perhaps a farm of machines sending packets) to a single client, and to record every packet sent to your receiver. Then analyze the statistical data and compare it against your senders to see how much you dropped.
The cheaper way is to buy a network tap, or, even more ad hoc, to enable port mirroring in your switch if possible. This lets you dump as much traffic as possible onto a second machine.
This gives you a more accurate result, because your application machine is busy as it is taking care of incoming traffic and processing it.
Furthermore, this is why network taps are effective: they split the communication into two channels, the receiving and sending directions of your traffic. This lets you capture traffic on two separate machines (again using tcpdump, but with more accurate mirroring than a mirrored switch port).
So either use port mirroring, or buy a dedicated network tap.

How to programmatically increase the per-socket buffer for UDP sockets on Linux?

I'm trying to understand the correct way to increase the socket buffer size on Linux for our streaming network application. The application receives variable bitrate data streamed to it on a number of UDP sockets. The volume of data is substantially higher at the start of the stream and I've used:
# sar -n UDP 1 200
to show that the UDP stack is discarding packets and
# ss -un -pa
to show that each socket's Recv-Q length grows to nearly the limit (124928, from sysctl net.core.rmem_default) before packets are discarded. This implies that the application simply can't keep up with the start of the stream. After discarding enough initial packets, the data rate slows down and the application catches up; Recv-Q trends toward 0 and remains there for the duration.
I'm able to address the packet loss by substantially increasing the rmem_default value, which increases the socket buffer size and gives the application time to recover from the large initial bursts. My understanding is that this changes the default allocation for all sockets on the system; I'd rather just increase the allocation for the specific UDP sockets and not modify the global default.
My initial strategy was to modify rmem_max and to use setsockopt(SO_RCVBUF) on each individual socket. However, this question makes me concerned about disabling Linux autotuning for all sockets, not just UDP.
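(For reference, that per-socket strategy looks roughly like this; a sketch, where the 4 MB figure is illustrative and the kernel silently caps the request at net.core.rmem_max:)
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("", 6000))

# request a 4 MB receive buffer for this socket only
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)

# the kernel doubles the requested value for bookkeeping overhead and
# silently caps it at net.core.rmem_max, so always verify what you got
actual = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("effective receive buffer:", actual)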
udp(7) describes the udp_mem setting, but I'm confused about how those values interact with rmem_default and rmem_max. The language it uses is "all sockets", so my suspicion is that these settings apply to the UDP stack as a whole, not to individual UDP sockets.
Is udp_rmem_min the setting I'm looking for? It seems to apply to individual sockets, but globally to all UDP sockets on the system.
Is there a way to safely increase the socket buffer length for the specific UDP ports used in my application without modifying any global settings?
Thanks.
Jim Gettys is armed and coming for you. Don't go to sleep.
The solution to network packet floods is almost never to increase buffering. Why is your protocol's queueing strategy not backing off? Why can't you just use TCP, if you're trying to send so much data in a stream (which is what TCP was designed for)?

sendto on Tru64 is returning ENOBUFS

I am currently running an old system on Tru64 which involves lots of UDP sockets using the sendto() function. The sockets are used in our code to send messages to/from various processes, and then eventually on to a thick-client app that is connected remotely. Occasionally the socket to the thick client gets stuck, and this can cause some of these messages to build up. My question is: how can I determine the current buffer size, and how do I determine the maximum message buffer? The code below gives a snippet of how I set up the port and use sendto.
/* need to adjust the maximum size we can send on this     */
/* socket, as it needs to be able to cope with the biggest */
/* messages we send                                        */
int lenlen, len;
lenlen = sizeof(len);
len = 2 * 32000;   /* allow double for when the system is under load */

msg_socket = socket(AF_UNIX, SOCK_DGRAM, 0);
result = setsockopt(msg_socket, SOL_SOCKET, SO_SNDBUF, (char *)&len, lenlen);

result = sendto(msg_socket,
                (char *)message,
                (int)message_len,
                flags,
                dest_addr,
                addrlen);
Note: we have ported this application to Linux and the problem does not seem to appear there.
Any help would be greatly appreciated.
Regards
The UDP send buffer size is different from TCP's: it just limits the size of the datagram. Quoting Stevens, UNP Vol. 1:
...
A UDP socket has a send buffer size (which we can change with SO_SNDBUF socket option, Section 7.5), but this is simply an upper limit on the maximum-sized UDP datagram that can be written to the socket. If an application writes a datagram larger than the socket send buffer size, EMSGSIZE is returned. Since UDP is unreliable, it does not need to keep a copy of the application's data and does not need an actual send buffer. (The application data is normally copied into a kernel buffer of some form as it passes down the protocol stack, but this copy is discarded by the datalink layer after the data is transmitted.)
UDP simply prepends an 8-byte header and passes the datagram to IP. IPv4 or IPv6 prepends its header, determines the outgoing interface by performing the routing function, and then either adds the datagram to the datalink output queue (if it fits within the MTU) or fragments the datagram and adds each fragment to the datalink output queue. If a UDP application sends large datagrams (say, 2,000-byte datagrams), there's a much higher probability of fragmentation than with TCP, because TCP breaks the application data into MSS-sized chunks, something that has no counterpart in UDP.
The successful return from write to a UDP socket tells us that either the datagram or all fragments of the datagram have been added to the datalink output queue. If there is no room on the queue for the datagram or one of its fragments, ENOBUFS is often returned to the application.
Unfortunately, some implementations do not return this error, giving the application no indication that the datagram was discarded without even being transmitted.
The last footnote deserves attention, but it looks like Tru64 does have this error code listed in its manual page.
The proper way to handle it, though, is to queue your outstanding messages in the application itself and to carefully check return values and errno after each system call. This still does not guarantee delivery (UDP receivers might drop packets without any notice to the sender). Check the UDP packet-discard counters with netstat -s on both/all sides and see if they are growing. There is really no way around this besides switching to TCP or implementing your own timeout/ack and retransmission logic.
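In Python terms (the idea is the same in C), a minimal sketch of checking for ENOBUFS and backing off before retrying:
import errno
import socket
import time

def send_with_retry(sock, data, addr, retries=5):
    # retry sendto() when the datalink output queue is full (ENOBUFS)
    for attempt in range(retries):
        try:
            return sock.sendto(data, addr)
        except OSError as e:
            if e.errno != errno.ENOBUFS:
                raise
            time.sleep(0.01 * (2 ** attempt))  # back off; let the queue drain
    raise OSError(errno.ENOBUFS, "output queue persistently full")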
You should probably be using some sort of congestion control to avoid overloading the network. By far the easiest way to do this is to use TCP instead of UDP.
It fails less often on Linux because there, UDP sockets wait for space in the local network interface queue (unless you set them non-blocking). However, on any operating system, if the overfull queue is not on the local system, the packet will be dropped silently.
