I'm trying to receive IEEE 1722 packets via a raw Ethernet socket on Ubuntu Linux.
The socket itself works fine: I receive every other packet (ARP, TCP, SSDP, ...) flowing around on the network, with the exception of the IEEE 1722 packets. They are somehow ignored by my read calls and I don't understand why. Maybe one of you has an idea.
The packets are 802.1 frames with a VLAN tag and EtherType 0x22f0.
Switching from ETH_P_ALL to ETH_P_8021Q or to htons(0x22f0) doesn't help either. If I change it, I don't receive anything anymore.
Here is my code. Does anyone have an idea what's wrong?
Creating the socket:
m_socket = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
if (m_socket < 0)
{
LOGERROR("EthRawSock", "Start(): SOCK_RAW creation failed! error: %d",errno);
m_socket = NULL;
return ErrorFileOpen;
}
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
strcpy(ifr.ifr_name, m_sznic.ptrz());
if (ioctl(m_socket, SIOCGIFINDEX, &ifr) < 0) {
LOGERROR("EthRawSock", "Start(): ioctl() SIOCGIFINDEX failed! error: %d (NIC: %s)",errno,ifr.ifr_name);
return ErrorFileOpen;
}
struct sockaddr_ll sll;
memset(&sll, 0, sizeof(sll));
sll.sll_family = AF_PACKET;
sll.sll_ifindex = ifr.ifr_ifindex;
sll.sll_protocol = htons(0x22f0);
if (bind((int)m_socket, (struct sockaddr *) &sll, sizeof(sll)) < 0) {
LOGERROR("EthRawSock", "Start(): bind() failed! error: %d",errno);
return ErrorFileOpen;
}
if (ioctl(m_socket, SIOCGIFHWADDR, &ifr) < 0)
{
LOGERROR("EthRawSock", "Start(): SIOCGIFHWADDR failed! error: %d",errno);
return ErrorFileOpen;
}
struct packet_mreq mr;
memset(&mr, 0, sizeof(mr));
mr.mr_ifindex = sll.sll_ifindex;
mr.mr_type = PACKET_MR_PROMISC;
if (setsockopt(m_socket, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mr, sizeof(mr)) < 0) {
LOGERROR("EthRawSock", "Start(): setsockopt() PACKET_ADD_MEMBERSHIP failed! error: %d",errno);
return ErrorFileOpen;
}
Reading via:
nsize = read(m_socket,m_recv_buffer,ETH_FRAME_LEN);
My two cents:
AVTP streams run in tagged frames. This means you won't find the EtherType 0x22f0 at the expected offset (12 octets from the start of the packet, just after the destination and source MAC addresses); it will be 4 octets after that. The EtherType at offset 12 of a VLAN-tagged frame is normally 0x8100.
Have you tried Wireshark (or tshark) on this interface? Wireshark should be able to capture those packets fine, though I'm not sure whether you need to enable something for it. If I'm not mistaken, all network ports must support 802.1AS. IEEE 1722 requires hardware support, and I think it would be impossible to help you out without knowing how this was set up.
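To make the offset argument concrete, here is a minimal sketch of how the EtherType could be located in a received buffer. It assumes the VLAN tag is still present in the bytes handed back by read(), which may not hold on every setup. The function name effective_ethertype and the two macros are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_VLAN_TAG 0x8100u  /* 802.1Q tag protocol identifier */
#define ETHERTYPE_AVTP     0x22F0u  /* IEEE 1722, value taken from the question */

/*
 * Return the EtherType that describes the payload, skipping one 802.1Q tag
 * if present. Returns 0 if the frame is too short to contain the field.
 * On success *payload_off (if non-NULL) receives the offset of the payload.
 */
uint16_t effective_ethertype(const uint8_t *frame, size_t len,
                             size_t *payload_off)
{
    size_t off = 12;                       /* after destination + source MAC */
    uint16_t type;

    if (len < off + 2)
        return 0;
    type = (uint16_t)((frame[off] << 8) | frame[off + 1]);
    if (type == ETHERTYPE_VLAN_TAG) {
        off += 4;                          /* skip TPID (2) + TCI (2) */
        if (len < off + 2)
            return 0;
        type = (uint16_t)((frame[off] << 8) | frame[off + 1]);
    }
    if (payload_off)
        *payload_off = off + 2;
    return type;
}
```

A receive loop could then compare the result against ETHERTYPE_AVTP instead of looking only at offset 12.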
I have written a Linux application program that receives UDP packets transmitted from a Desktop with fixed & known IP-address on the network. I am using a raw socket to receive packets on my system and filter the received packets based on the source address.
The problem I am facing is, the program runs fine for some time and I get all the required packets, but after a couple of hours, the application stops getting any packets. If I run the command,
tcpdump -i eth0 src 192.168.20.48 on my system, then I see that the system continues to receive the expected packets. But I am not sure what is causing my program to stop receiving packets.
Below is the code snippet used to open a raw socket, receive packets, and filter out the UDP packets transmitted from the known IP address.
int main()
{
int sockfd;
int one = 1;
struct timeval tv;
socklen_t len;
int bytes;
unsigned char tsptr[2048];
struct sockaddr_in cliaddr;
struct iphdr *iph;
int result=0;
char source_add[50];
char expected_source_add[50];
len = sizeof(struct sockaddr_in);
// Creating socket file descriptor
if ((sockfd = socket(AF_INET , SOCK_RAW , IPPROTO_UDP)) < 0 ) {
BRH_PERROR("socket creation failed");
return 1;
}
tv.tv_sec = 30;
tv.tv_usec = 0;
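// option names are not bit flags, so each one is set with its own call
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
setsockopt(sockfd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));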
setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO,(char*)&tv,sizeof(tv));
strcpy(expected_source_add, "192.168.20.48");
while (1) {
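/*Read fixed data count from socket*/
len = sizeof(cliaddr); // value-result argument, reset before every call
bytes =recvfrom(sockfd, tsptr, 1500, MSG_WAITALL, (struct sockaddr *)&cliaddr, &len);
if (bytes < (int)sizeof(struct iphdr)) continue; // timeout, error, or truncated packet
iph=(struct iphdr*)tsptr;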
//get only UDP packet
if (iph->protocol != 17) continue;
strcpy(source_add,inet_ntoa(cliaddr.sin_addr));
result = strcmp(expected_source_add,source_add);
/*receive data from expected IP address only*/
if( result == 0) {
//Consume the packet
}
}
return 0;
}
Any clue on why the packet receive stops on my application, even though tcpdump shows that packets are being received on the interface, will be helpful.
Nothing in the code you posted explains the problem you describe. I think you should try the following:
1. Use Wireshark or tcpdump to check whether the NIC is still receiving the packets.
2. Outside this snippet, do you use any buffers or message queues, and are they working properly?
3. Use tools to check for memory leaks.
4. Log every step, especially around recvfrom and strcmp.
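As an illustration of step 4, the check around recvfrom and strcmp could be moved into one small function that is easy to log and test. This is only a sketch: is_udp_from is a hypothetical helper, and it compares the source address taken from the IPv4 header in the buffer rather than the string produced by inet_ntoa. The caller is assumed to have checked that recvfrom returned a positive byte count first, since with SO_RCVTIMEO set recvfrom returns -1 on a timeout.

```c
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>   /* inet_pton, used by callers to build src */
#include <netinet/in.h>  /* IPPROTO_UDP, struct in_addr */

/*
 * Return 1 if buf holds an IPv4 packet that carries UDP and whose source
 * address equals src, otherwise 0. len is the byte count reported by
 * recvfrom() for this packet.
 */
int is_udp_from(const unsigned char *buf, size_t len, struct in_addr src)
{
    if (len < 20)                     /* shorter than a minimal IPv4 header */
        return 0;
    if ((buf[0] >> 4) != 4)           /* version field must be 4 */
        return 0;
    if (buf[9] != IPPROTO_UDP)        /* protocol field, 17 for UDP */
        return 0;
    /* source address sits at offset 12, in network byte order like s_addr */
    return memcmp(buf + 12, &src.s_addr, 4) == 0;
}
```

In the loop from the question this would be called as is_udp_from(tsptr, (size_t)bytes, expected), with expected filled once via inet_pton(AF_INET, "192.168.20.48", &expected).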
I need to find the specific interface which is used by a socket, so that I can keep stats for it, using the sysfs files (/sys/class/net/<IF>/statistics/etc).
I've tried two different approaches in the test code below, but both fail. The first one connects to a remote server, and uses ioctl with SIOCGIFNAME, but this fails with 'no such device'. The second one instead uses getsockopt with SO_BINDTODEVICE, but this again fails (it sets the name length to 0).
Any ideas on why these are failing, or how to get the I/F name? After compiling, run the test code as test "a.b.c.d", where a.b.c.d is any IPv4 address that is listening on port 80. Note that I've compiled this on CentOS 7, which doesn't appear to have IFNAMSZ in <net/if.h>, so you may have to comment out the #define IFNAMSZ line to get this to compile on other systems.
Thanks.
EDIT
I've since found that this is essentially a dupe of How can I get the interface name/index associated with a TCP socket?, so I should probably remove this. (Only) one of the answers there is correct (https://stackoverflow.com/a/37987807/785194) - get your local IP address with getsockname, and then look up this address in the list returned by getifaddrs.
On the general issue that sockets are essentially dynamic (mentioned below, and several times in the other question): not really relevant. I've checked the kernel source, and sockets have an interface index and interface name, and the API includes at least three ways to get the current name, and other routines to look up the name from the index, and vice versa. However, the index is sometimes zero, which is not valid, which is why the getsockopt version below fails. No idea why ioctl fails.
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <unistd.h> // close()
int main(int argc, char **argv) {
int sock;
struct sockaddr_in dst_sin;
struct in_addr haddr;
if(argc != 2)
return 1;
if(inet_aton(argv[1], &haddr) == 0) {
printf("'%s' is not a valid IP address\n", argv[1]);
return 1;
}
dst_sin.sin_family = AF_INET;
dst_sin.sin_port = htons(80);
dst_sin.sin_addr = haddr;
if((sock = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
perror("socket");
return 1;
}
if(connect(sock, (struct sockaddr*)&dst_sin, sizeof(dst_sin)) < 0) {
perror("connect");
return 1;
}
printf(
"connected to %s:%d\n",
inet_ntoa(dst_sin.sin_addr), ntohs(dst_sin.sin_port));
#if 0 // ioctl fails with 'no such device'
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
// get the socket's interface index into ifreq.ifr_ifindex
if(ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
perror("SIOCGIFINDEX");
return 1;
}
// get the I/F name for ifreq.ifr_ifindex
if(ioctl(sock, SIOCGIFNAME, &ifr) < 0) {
perror("SIOCGIFNAME");
return 1;
}
printf("I/F is on '%s'\n", ifr.ifr_name);
#else // only works on Linux 3.8+
#define IFNAMSZ IFNAMSIZ // Centos7 bug in if.h??
char optval[IFNAMSZ] = {0};
socklen_t optlen = IFNAMSZ;
if(getsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, &optval, &optlen) < 0) {
perror("getsockopt");
return 1;
}
if(!optlen) {
printf("invalid optlen\n");
return 1;
}
printf("I/F is on '%s'\n", optval);
#endif
close(sock);
return 0;
}
TCP (and UDP) sockets are not bound to interfaces, so there is really no facility for answering this query. Now it's true that in general, a given socket will end up passing packets to a specific interface based on the address of the peer endpoint, but that is nowhere encoded in the socket. That's a routing decision that is made dynamically.
For example, let's say that you are communicating with a remote peer that is not directly on your local LAN. And let's say you have a default gateway configured to be 192.168.2.1 via eth0. There is nothing to prevent your configuring a second gateway, say, 192.168.3.1 via eth1, then taking eth0 down. As long as the new gateway can also reach the remote IP, eth1 can now be used to reach the destination and your session should continue uninterrupted.
So, if you need this info, you'll need to infer it from routing entries (but realize that it is not guaranteed to be static, even though in practice it will likely be so). You can obtain the address of your peer from getpeername(2). You can then examine the available routes to determine which one will get you there.
To do this, you could parse and interpret /proc/net/route for yourself, or you can just ask the ip command. For example, my route to an (arbitrary) ibm.com address goes through my eth0 interface, and connecting a socket to there, my local address will be 192.168.0.102 (which should match what getsockname(2) on the connected socket returns):
$ ip route get 129.42.38.1
129.42.38.1 via 192.168.0.1 dev eth0 src 192.168.0.102
cache
I want to implement the equivalent of the command tcpdump -i eth0 arp to observe ARP packets on interface eth0 on my Ubuntu machine. I use libpcap, but the return value of pcap_next_ex is always 0. Running tcpdump -i eth0 arp at the same time does show ARP packets.
/*
* compile(root): gcc test.c -lpcap
* run : ./a.out
* output : time out
* time out
* time out
* ...
*/
#include <pcap.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#define ARP_REQUEST 1
#define ARP_REPLY 2
typedef struct arp_hdr_s arp_hdr_t;
struct arp_hdr_s {
u_int16_t htype;
u_int16_t ptype;
u_char hlen;
u_char plen;
u_int16_t oper;
u_char sha[6];
u_char spa[4];
u_char tha[6];
u_char tpa[4];
};
#define MAXBYTES2CAPTURE 2048
int
main(int argc, char **argv)
{
char err_buf[PCAP_ERRBUF_SIZE];
const unsigned char *packet;
int i;
int ret;
arp_hdr_t *arp_header;
bpf_u_int32 net_addr;
bpf_u_int32 mask;
pcap_t *desrc;
struct pcap_pkthdr *pkthdr;
struct bpf_program filter;
net_addr = 0;
mask = 0;
memset(err_buf, 0, PCAP_ERRBUF_SIZE);
desrc = pcap_open_live("eth0", MAXBYTES2CAPTURE, 0, 512, err_buf);
if (desrc == NULL) {
fprintf(stderr, "error: %s\n", err_buf);
exit(-1);
}
ret = pcap_lookupnet("eth0", &net_addr, &mask, err_buf);
if (ret < 0) {
fprintf(stderr, "error: %s\n", err_buf);
exit(-1);
}
ret = pcap_compile(desrc, &filter, "arp", 1, mask);
if (ret < 0) {
fprintf(stderr, "error: %s\n", pcap_geterr(desrc));
exit(-1);
}
ret = pcap_setfilter(desrc, &filter);
if (ret < 0) {
fprintf(stderr, "error: %s\n", pcap_geterr(desrc));
exit(-1);
}
while (1) {
ret = pcap_next_ex(desrc, &pkthdr, &packet);
if (ret == -1) {
printf("%s\n", pcap_geterr(desrc));
exit(1);
} else if (ret == -2) {
printf("no more\n");
} else if (ret == 0) { // here
printf("time out\n");
continue;
}
arp_header = (arp_hdr_t *)(packet + 14);
if (ntohs(arp_header->htype) == 1 && ntohs(arp_header->ptype == 0x0800)) {
printf("src IP: ");
for (i = 0; i < 4; i++) {
printf("%d.", arp_header->spa[i]);
}
printf("dst IP: ");
for (i = 0; i < 4; i++) {
printf("%d.", arp_header->tpa[i]);
}
printf("\n");
}
}
return 0;
}
Without getting too deep into your code, I can see a major problem:
In your call to pcap_open_live(), you do not set promiscuous mode: the third parameter should be non-zero. If the ARP request is not targeted at your interface's IP, pcap will not see it without promiscuous mode. tcpdump uses promiscuous mode unless specifically told not to with --no-promiscuous-mode (and hence requires the CAP_NET_ADMIN privilege, which you get via sudo, and which your program will require too).
Side note:
1/ Leak: you may want to free your filter using pcap_freecode() after your pcap_setfilter().
2/ I assume you've read the official tutorial here:
http://www.tcpdump.org/pcap.html
...if that's not the case you'd be well advised to do that first. I quote:
A note about promiscuous vs. non-promiscuous sniffing: The two
techniques are very different in style. In standard, non-promiscuous
sniffing, a host is sniffing only traffic that is directly related to
it. Only traffic to, from, or routed through the host will be picked
up by the sniffer. Promiscuous mode, on the other hand, sniffs all
traffic on the wire. In a non-switched environment, this could be all
network traffic. [... more stuff on promisc vs non-promisc]
EDIT:
Actually, looking deeper at your code and comparing it to my code, which has been running for over a year at production level (both in-house and at the customer), I can see many more things that could be wrong:
You never call pcap_create()
You never call pcap_set_promisc(), we've talked about this already
You never call pcap_activate(), this may be the core issue here
...pcap is very touchy about the sequence order of operations to first get a pcap_t handle, and then operate on it.
At the moment, the best advice I can give you (otherwise this is going to turn into a live debugging session between you and me) is:
1/ read and play/tweak with the code from the official tutorial:
http://www.tcpdump.org/pcap.html
This is mandatory.
2/ FWIW, my - definitely working - sequence of operations is this:
pcap_lookupnet()
pcap_create()
pcap_set_promisc()
pcap_set_snaplen(), you may or may not need this
pcap_set_buffer_size(), you may or may not need this
pcap_activate(), with a note. Very important: first activate, then set non-blocking. From PCAP_SETNONBLOCK(3PCAP): When first activated with pcap_activate() or opened with pcap_open_live(), a capture handle is not in ``non-blocking mode''; a call to pcap_setnonblock() is required in order to put it into ``non-blocking'' mode.
...and then, because I do not use stinking blocking/blocking with timeout, busy looping:
pcap_setnonblock()
pcap_get_selectable_fd()
...then and only then:
- pcap_compile()
- followed by a pcap_setfilter()
- and then as I mentioned a pcap_freecode()
- and then a select() (or another member of its family) on the file descriptor I get from pcap_get_selectable_fd(), leading to pcap_dispatch(), but this is another topic.
pcap is an old API dating back to the 80s, and it's really very touchy. But don't get discouraged! It's great once you get it right.
It would probably work better if you did
if (ntohs(arp_header->htype) == 1 && ntohs(arp_header->ptype) == 0x0800) {
rather than
if (ntohs(arp_header->htype) == 1 && ntohs(arp_header->ptype == 0x0800)) {
The latter evaluates arp_header->ptype == 0x0800, which, when running on a little-endian machine (such as a PC), will almost always evaluate to "false", because the value will look like 0x0008, not 0x0800, in an ARP packet (ARP types are big-endian, so they'll look byte-swapped on a little-endian machine). That means it'll evaluate to 0, and byte-swapping 0 gives you zero, so that if condition will evaluate to "false", and the printing code won't be called.
You'll still get lots of timeouts if you fix that, unless there's a flood of ARP packets, but at least you'll get the occasional ARP packet printed out. (I would advise printing nothing on a timeout; pcap-based programs doing live capturing should expect that timeouts should happen, and should not report them as unusual occurrences.)
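To show the corrected test in isolation, here is a small sketch that applies ntohs to each field first and only then compares. The helper name arp_is_ethernet_ipv4 is invented for this example, and it works on the raw bytes that follow the 14-byte Ethernet header rather than on the arp_hdr_t struct from the question.

```c
#include <arpa/inet.h>  /* ntohs */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 1 if the ARP header starting at arp (len bytes available)
 * describes Ethernet (htype 1) carrying IPv4 (ptype 0x0800), else 0.
 */
int arp_is_ethernet_ipv4(const unsigned char *arp, size_t len)
{
    uint16_t htype, ptype;

    if (len < 4)                  /* need htype and ptype, 2 bytes each */
        return 0;
    memcpy(&htype, arp, 2);       /* copy out to avoid unaligned reads */
    memcpy(&ptype, arp + 2, 2);
    /* Convert each field to host byte order, then compare. */
    return ntohs(htype) == 1 && ntohs(ptype) == 0x0800;
}
```

In the capture loop this would be called as arp_is_ethernet_ipv4(packet + 14, pkthdr->caplen - 14) once the captured length is known to be at least 14.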
This question is similar to Network port open, but no process attached? and netstat shows a listening port with no pid but lsof does not. But the answers to them can't solve mine, since it is so weird.
I have a server application called lps that waits for tcp connections on port 8588.
[root#centos63 lcms]# netstat -lnp | grep 8588
tcp 0 0 0.0.0.0:8588 0.0.0.0:* LISTEN 6971/lps
As you can see, nothing is wrong with the listening socket. But when I connect a few thousand test clients (written by a colleague) to the server, whether 2000, 3000, or 4000, there are always 5 clients (a different 5 each time) that connect and send a login request to the server but never receive a response. Take 3000 clients as an example. This is what the netstat command gives:
[root#centos63 lcms]# netstat -nap | grep 8588 | grep ES | wc -l
3000
And this is lsof command output:
[root#centos63 lcms]# lsof -i:8588 | grep ES | wc -l
2995
Those 5 connections are here:
[root#centos63 lcms]# netstat -nap | grep 8588 | grep -v 'lps'
tcp 92660 0 192.168.0.235:8588 192.168.0.241:52658 ESTABLISHED -
tcp 92660 0 192.168.0.235:8588 192.168.0.241:52692 ESTABLISHED -
tcp 92660 0 192.168.0.235:8588 192.168.0.241:52719 ESTABLISHED -
tcp 92660 0 192.168.0.235:8588 192.168.0.241:52721 ESTABLISHED -
tcp 92660 0 192.168.0.235:8588 192.168.0.241:52705 ESTABLISHED -
The 5 above show that they are connected to the server on port 8588 but with no program attached. The second column (Recv-Q) keeps increasing as the clients send requests.
The links above say something about NFS mounts and RPC. As for RPC, I used the command rpcinfo -p and the result has nothing to do with port 8588. As for NFS mounts, the nfsstat output says Error: No Client Stats (/proc/net/rpc/nfs: No such file or directory).
Question : How can this happen? Always 5 and also not from the same 5 clients. I don't think it's port conflict as the other clients are also connected to the same server IP and port and they are all properly handled by the server.
Note: I'm using Linux epoll to accept client requests. I also write debug code in my program and record every socket(along with the clients' information) that accept returns but cannot find the 5 connections. This is uname -a output:
Linux centos63 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Thanks for your kind help! I'm really confused.
Update 2013-06-08:
After upgrading the system to CentOS 6.4, the same problem occurs. Finally I went back to epoll and found this page, which says to set the listening fd to non-blocking and call accept until it returns EAGAIN or EWOULDBLOCK. And yes, it works. No more connections are left pending. But why is that? Unix Network Programming, Volume 1 says:
accept is called by a TCP server to return the next completed connection from the
front of the completed connection queue. If the completed connection queue is empty,
the process is put to sleep (assuming the default of a blocking socket).
So if there are still completed connections in the queue, why is the process put to sleep?
Update 2013-7-1:
I use EPOLLET when adding the listening socket, so I can't accept every connection unless I keep calling accept until EAGAIN. I just realized this. My fault. Remember: always read or accept until EAGAIN if using EPOLLET, even on a listening socket. Thanks again to Matthew for providing me with a testing program.
I've tried duplicating your problem using the following parameters:
The server uses epoll to manage connections.
I make 3000 connections.
Connections are blocking.
The server is basically 'reduced' to handling the connections only and performing very little complicated work.
I cannot duplicate the problem. Here is my server source code.
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <err.h>
#include <sysexits.h>
#include <string.h>
#include <unistd.h>
struct {
int numfds;
int numevents;
struct epoll_event *events;
} connections = { 0, 0, NULL };
static int create_srv_socket(const char *port) {
int fd = -1;
int rc;
struct addrinfo *ai = NULL, hints;
memset(&hints, 0, sizeof(hints));
hints.ai_flags = AI_PASSIVE;
if ((rc = getaddrinfo(NULL, port, &hints, &ai)) != 0)
errx(EX_UNAVAILABLE, "Cannot create socket: %s", gai_strerror(rc));
if ((fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol)) < 0)
err(EX_OSERR, "Cannot create socket");
rc = 1;
/* SO_REUSEADDR has to be set before bind() to have any effect */
if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &rc, sizeof(rc)) < 0)
err(EX_OSERR, "Cannot setup socket options");
if (bind(fd, ai->ai_addr, ai->ai_addrlen) < 0)
err(EX_OSERR, "Cannot bind to socket");
if (listen(fd, 25) < 0)
err(EX_OSERR, "Cannot setup listen length on socket");
return fd;
}
static int create_epoll(void) {
int fd;
if ((fd = epoll_create1(0)) < 0)
err(EX_OSERR, "Cannot create epoll");
return fd;
}
static bool epoll_join(int epollfd, int fd, int events) {
struct epoll_event ev;
ev.events = events;
ev.data.fd = fd;
if ((connections.numfds+1) >= connections.numevents) {
connections.numevents+=1024;
connections.events = realloc(connections.events,
sizeof(*connections.events)*connections.numevents); /* element size, not pointer size */
if (!connections.events)
err(EX_OSERR, "Cannot allocate memory for events list");
}
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
warn("Cannot add socket to epoll set");
return false;
}
connections.numfds++;
return true;
}
static void epoll_leave(int epollfd, int fd) {
if (epoll_ctl(epollfd, EPOLL_CTL_DEL, fd, NULL) < 0)
err(EX_OSERR, "Could not remove entry from epoll set");
connections.numfds--;
}
static void cleanup_old_events(void) {
if ((connections.numevents - 1024) > connections.numfds) {
connections.numevents -= 1024;
connections.events = realloc(connections.events,
sizeof(*connections.events)*connections.numevents); /* element size, not pointer size */
}
}
static void disconnect(int fd) {
shutdown(fd, SHUT_RDWR);
close(fd);
return;
}
static bool read_and_reply(int fd) {
char buf[128];
int rc;
memset(buf, 0, sizeof(buf));
if ((rc = recv(fd, buf, sizeof(buf), 0)) <= 0) {
if (rc < 0) warn("Cannot read from socket"); /* rc == 0 means the peer closed */
return false;
}
if (send(fd, buf, rc, MSG_NOSIGNAL) < 0) {
warn("Cannot send to socket");
return false;
}
return true;
}
int main()
{
int srv = create_srv_socket("8558");
int ep = create_epoll();
int rc = -1;
struct epoll_event *ev = NULL;
if (!epoll_join(ep, srv, EPOLLIN))
err(EX_OSERR, "Server cannot join epollfd");
while (1) {
int i, cli;
rc = epoll_wait(ep, connections.events, connections.numfds, -1);
if (rc < 0 && errno == EINTR)
continue;
else if (rc < 0)
err(EX_OSERR, "Cannot properly perform epoll wait");
for (i=0; i < rc; i++) {
ev = &connections.events[i];
if (ev->data.fd != srv) {
if (ev->events & EPOLLIN) {
if (!read_and_reply(ev->data.fd)) {
epoll_leave(ep, ev->data.fd);
disconnect(ev->data.fd);
}
}
if (ev->events & EPOLLERR || ev->events & EPOLLHUP) {
if (ev->events & EPOLLERR)
warn("Error in fd: %d", ev->data.fd);
else
warn("Closing disconnected fd: %d", ev->data.fd);
epoll_leave(ep, ev->data.fd);
disconnect(ev->data.fd);
}
}
else {
if (ev->events & EPOLLIN) {
if ((cli = accept(srv, NULL, 0)) < 0) {
warn("Could not add socket");
continue;
}
epoll_join(ep, cli, EPOLLIN);
}
if (ev->events & EPOLLERR || ev->events & EPOLLHUP)
err(EX_OSERR, "Server FD %d has failed", ev->data.fd);
}
}
cleanup_old_events();
}
}
Here is the client:
from socket import *
import time
scks = list()
for i in range(0, 3000):
s = socket(AF_INET, SOCK_STREAM)
s.connect(("localhost", 8558))
scks.append(s)
time.sleep(600)
When running this on my local machine I get 6001 sockets using port 8558 (1 listening, 3000 client side sockets and 3000 server side sockets).
$ ss -ant | grep 8558 | wc -l
6001
When checking the number of IP connections connected on the client I get 3000.
# lsof -p$(pgrep python) | grep IPv4 | wc -l
3000
I've also tried the test with the server on a remote machine with success too.
I'd suggest you attempt to do the same.
In addition, try turning off iptables completely, just in case it's some connection tracking quirk.
Sometimes the iptables option in /proc can help too. So try sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1.
Edit: I've done another test which produces the output you see on your side. Your problem is that you are shutting down the connection on the server side pre-emptively.
I can duplicate results similar to what you are seeing doing the following:
After reading some data in to my server, call shutdown(fd, SHUT_RD).
Do send(fd, buf, sizeof(buf)) on the server.
After doing this the following behaviours are seen.
On the client I get 3000 connections open in netstat/ss with ESTABLISHED.
In lsof output I get 2880 (nature of how I was doing shutdown) connections established.
The remainder of the connections lsof -i:8558 | grep -v ES are in CLOSE_WAIT.
This only happens on a half-shutdown connection.
As such I suspect this is a bug in your client or server program. Either you are sending something to the server which the server objects to, or the server is invalidly closing connections down for some reason.
You need to confirm what state the "anomalous" connections are in (CLOSE_WAIT or something else).
At this stage I also consider this a programming problem and not really something that belongs on serverfault. Without seeing the relevant portions of the source for the client/server it is not going to be possible for anybody to track down the cause of the fault. That said, I am pretty confident this has nothing to do with the way the operating system is handling the connections.
I am working on a Linux server that listens for UDP messages as part of a discovery protocol. My code for listening follows:
rcd = ::select(
socket_handle + 1,
&wait_list,
0, // no write
0, // no error
&timeout);
if(rcd > 0)
{
if(FD_ISSET(socket_handle,&wait_list))
{
struct sockaddr address;
socklen_t address_size = sizeof(address);
len = ::recvfrom(
socket_handle,
rx_buff,
max_datagram_size,
0, // no flags
&address,
&address_size);
if(len > 0 && address.sa_family == AF_INET)
{
struct sockaddr_in *address_in =
reinterpret_cast<struct sockaddr_in *>(&address);
event_datagram_received::cpost(
this,
rx_buff,
rcd,
ntohl(address_in->sin_addr.s_addr),
ntohs(address_in->sin_port));
}
}
}
In the meantime, I have written a Windows client that transmits the UDP messages. I have verified using Wireshark that the messages are being transmitted with the right format and length (five bytes). However, when I examine the length reported for the received datagram, this value is always one. The size of my receive buffer (max_datagram_size) is set to 1024. The one byte of the packet that we get appears to have the correct value. My question is: why am I not getting all of the expected bytes?
In case it matters, my Linux server is running under Debian 5 within a VirtualBox virtual machine.
nos answered my question in the first comment. I was using the wrong variable to report the buffer length.
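For anyone hitting the same symptom: in the code above, the value passed on as the length is rcd, the return value of select(), where len, the return value of recvfrom(), was presumably intended. A small sketch in plain C that keeps the two apart; wait_for_datagram is a made-up name, not part of the original code.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/*
 * Wait up to timeout_ms for a datagram on fd and read it into buf.
 * Returns the datagram length from recvfrom(), 0 on timeout (or for an
 * empty datagram), -1 on error. from and from_len may both be NULL.
 */
ssize_t wait_for_datagram(int fd, void *buf, size_t cap, int timeout_ms,
                          struct sockaddr *from, socklen_t *from_len)
{
    fd_set readable;
    struct timeval tv;
    int ready;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() reports how many descriptors are ready, here 0 or 1. */
    ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;

    /* recvfrom() reports the payload size; this is the length to pass on. */
    return recvfrom(fd, buf, cap, 0, from, from_len);
}
```

With a single descriptor in the set, select() returns 1 whenever data is ready, which matches the "always one" length seen in the question.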