I'm running a long-running socket connection, locally. When I kill the server application it closes cleanly. I now want to see how the client behaves when it does not close cleanly. If they were on different servers I could pull out the network cable. When the connection is local I have nothing to pull. [Insert rude joke here.]
I can do netstat -an | grep tcp | grep :80 and see the connection, which looks something like this:
tcp 0 0 10.1.2.3:80 10.1.2.3:40494 ESTABLISHED
Can I use that 40494 port number to kill or hang the socket, the same way I can use a program's PID to kill it?
Very similar question: How can I simulate a 'plugged network cable' (TCP/IP)? (That question is asking about Windows, I am on Linux)
Unloading the kernel module for the network driver might kill it the way you want.
Pick your network interface from the list of active ones (eth0, eth1, etc.):
ifconfig -a
eth0 Link encap:Ethernet HWaddr 18:A9:05:68:6F:B0
inet6 addr: fe80::1aa9:5ff:fe68:6fb0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
To find the driver module for the eth0 interface:
ls -l /sys/class/net/eth0/device/driver/module
lrwxrwxrwx 1 root root 0 Aug 15 18:39 module -> ../../../../module/bnx2
To list the kernel modules for your system, run:
lsmod | grep bnx
bnx2i 48158 0
cnic 55511 1 bnx2i
bnx2 81890 0
Removing the module from memory will require root or sudo access:
rmmod -f -w bnx2
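To bring networking back afterwards, reloading the driver module and re-enabling the interface should work (this assumes the bnx2 module and the ifupdown tooling from the example above):
modprobe bnx2
ifup eth0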
This is not a safe operation; you are messing with the operating system in an unsafe way. Expect bad things(tm) to happen when you do this. Don't do it in a production environment. Also, this may not work at all, depending on the driver's behavior.
That said, this is an interesting problem and I'm interested in hearing how it turns out.
It appears the answer is "no", there is no way to kill a particular connection.
What I ended up doing was sudo ifdown eth0 (then afterwards sudo ifup eth0 to restore the network). Obviously this is a sledgehammer approach, as it takes down all connections.
For completeness, another approach was to guess which Apache process was running the socket connection, and kill just that process. But apart from involving guessing, that is also more application specific than the question I was asking.
Related
I am not able to find an answer to a simple thing I am trying to achieve:
Once a TCP connection is established to my Linux server (say SSH on TCP 22, or an X11 display on TCP 6000), how do I close this connection without killing the process (sshd / the X11 display server)?
I also saw some suggestions to use iptables, but it does not work for me; the connection is still visible in netstat -an.
It would be good if someone could point me in the right direction.
What I have tried so far:
tcpkill: kills the process, which is not good for me
iptables: does not close the established connection, but prevents further connections
Thanks in advance
DJ
OK, I found at least one solution (killcx) which works. Maybe we will be able to find an easier one.
Also, I saw the comment from "zb" (thanks), which might also work, but I was not able to find a working syntax; that tool seems really useful but complex.
So here is an example of how to use the first solution, which is working for me:
netstat -anp | grep 22
output: tcp 0 0 192.168.0.82:22 192.168.0.77:33597 VERBUNDEN 25258/0
(VERBUNDEN is the German-locale rendering of ESTABLISHED)
iptables -A INPUT -s 192.168.0.77 -j DROP (to prevent reconnects)
perl killcx.pl 192.168.0.77:33597 (to kill the tcp connection)
killcx can be found here: http://killcx.sourceforge.net/
it "steals" the connection from the foreign host (192.168.0.77) and close it. So that solution is working fine, but to complex to setup quickly if you are under stress. Here are the required packages:
apt-get install libnetpacket-perl libnet-pcap-perl libnet-rawip-perl
wget http://killcx.sourceforge.net/killcx.txt -O killcx.pl
However, it would be good to have an easier solution.
tcpkill won't work, since it will only kill new connections; it doesn't kill existing ESTABLISHED connections.
Here's how you remove an established TCP connection.
Find the PID of the process and the IP of the connecting client.
Let's say you are on serverA and someone is connecting from serverB:
root#A> netstat -tulpan | grep ssh | grep serverB
You should see something like:
tcp 0 0 <serverA IP>:<port> <serverB>:<port> ESTABLISHED 221955/sshd
Use the lsof utility to get the file descriptor of this connection, using that PID:
root#A> lsof -np 221955 | grep '<serverB IP>'
You should see something like this:
sshd 221955 <user> 17u IPv4 2857516568 0t0 TCP <serverA IP>:<port>-><serverB IP>:<port> (ESTABLISHED)
Get the file descriptor number from the 4th column: 17 (the trailing "u" in lsof's output just means the fd is open for reading and writing).
Use GDB to shut down this connection without killing sshd:
root#A> gdb -p 221955 --batch -ex 'call shutdown(17, 2)'
You should see something similar to:
0x00007f0b138c0b40 in __read_nocancel () from /usr/lib64/libc.so.6
$1 = 0
[Inferior 1 (process 221955) detached]
That TCP connection should now be closed.
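For repeated use, the steps above can be rolled into a small script. This is only a rough sketch (the helper name is made up; it assumes lsof and gdb are installed and that the peer address matches exactly one ESTABLISHED connection):
#!/bin/sh
# close_peer_conn.sh <pid> <peer-ip-or-name> -- sketch, not production-hardened
PID=$1
PEER=$2
# pull the numeric fd (lsof's 4th column, minus the access-mode suffix) of the matching connection
FD=$(lsof -np "$PID" | awk -v peer="$PEER" '/ESTABLISHED/ && $0 ~ peer { gsub(/[a-z]/, "", $4); print $4; exit }')
# have the process itself shut the socket down (some gdb setups may need a cast, e.g. 'call (int)shutdown(...)')
gdb -p "$PID" --batch -ex "call shutdown($FD, 2)"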
I have a process that opens several tcp connections to several browsers on separate ports.
Using netstat, the output is something like this:
tcp 0 0 server1.something:myprog client1.something:49987 ESTABLISHED
tcp 0 0 server1.something:myprog client1.something:65987 ESTABLISHED
tcp 0 0 server1.something:myprog client1.something:89987 ESTABLISHED
Now I would like to kill exactly one of the connections. How do I do it? (Killing the process would kill all of the connections.)
On Linux kernel >= 4.9 you can use the ss command from iproute2 with the -K option:
ss -K dst client1.something dport = 49987
The kernel has to be compiled with the CONFIG_INET_DIAG_DESTROY option enabled.
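Before relying on ss -K, you can check whether your kernel was built with that option; depending on the distribution, the config is exposed under /boot or /proc:
grep CONFIG_INET_DIAG_DESTROY /boot/config-$(uname -r)
# or, if the running kernel exports its config:
zgrep CONFIG_INET_DIAG_DESTROY /proc/config.gz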
Here are some options:
Attach with gdb and call close() on the fd. You can map from addr/port to inode number via /proc/net/tcp and from inode number to FD inside the process with ls -la /proc/$pid/fd.
Spoof a RST packet. You'll need to generate it locally and guess the SEQ number somehow.
Maybe set up an iptables rule to generate an RST on the next packet (a rough sketch follows after this list).
Write a kernel module.
There doesn't seem to be a well supported way to do this. It is likely that processes will crash if their FDs are unexpectedly closed anyway.
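For the iptables idea specifically, a rough, untested sketch using the standard REJECT target might look like this (the address and port are taken from the netstat output in the question, so adjust them to your connection; note that this resets the remote end, while the local socket typically only notices on its next send):
iptables -A INPUT -p tcp -s 10.1.2.3 --sport 40494 -j REJECT --reject-with tcp-reset
# remove the rule again once the connection is gone
iptables -D INPUT -p tcp -s 10.1.2.3 --sport 40494 -j REJECT --reject-with tcp-reset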
You can't kill a single connection of a process.
But you could block it with iptables, so the connection can't send or receive data and the client will run into a timeout.
You can kill by destination port:
ss -K dport = 65987
I'm writing a testing tool that requires known traffic to be captured from a NIC (using libpcap), then fed into the application we are trying to test.
What I'm attempting to set-up is a web server (in this case, lighttpd) and a client (curl) running on the same machine, on an isolated test network. A script will drive the entire setup, and the goal is to be able to specify a number of clients as well as a set of files for each client to download from the web server.
My initial approach was to simply use the loopback (lo) interface... run the web server on 127.0.0.1, have the clients fetch their files from http://127.0.0.1, and run my libpcap-based tool on the lo interface. This works ok, apart from the fact that the loopback interface doesn't emulate a real Ethernet interface. The main problem with that is that packets are all inconsistent sizes... 32kbytes and bigger, and somewhat random... it's also not possible to lower the MTU on lo (well, you can, but it has no effect!).
I also tried running it on my real interface (eth0), but since it's an internal web client talking to an internal web server, traffic never leaves the interface, so libpcap never sees it.
So then I turned to tun/tap. I used socat to bind two tun interfaces together with a TCP connection, so in effect, I had:
10.0.1.1/24 <-> tun0 <-socat-> tcp connection <-socat-> tun1 <-> 10.0.2.1/24
This seems like a really neat solution... tun/tap devices emulate real Ethernet devices, so I can run my web server and my capture tool on tun0 (10.0.1.1), and bind my clients to tun1 (10.0.2.1)... I can even use tc to apply shaping rules to this traffic and create a virtual WAN inside my Linux box... but it just doesn't work...
Here are the socat commands I used:
$ socat -d TCP-LISTEN:11443,reuseaddr TUN:10.0.1.1/24,up &
$ socat TCP:127.0.0.1:11443 TUN:10.0.2.1/24,up &
Which produces 2 tun interfaces (tun0 and tun1), with their respective IP addresses.
If I run ping -I tun1 10.0.1.1, there is no response, but when I run tcpdump -n -i tun0, I see the ICMP echo requests making it to the other side, just no sign of the response coming back.
# tcpdump -i tun0 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tun0, link-type RAW (Raw IP), capture size 65535 bytes
16:49:16.772718 IP 10.0.2.1 > 10.0.1.1: ICMP echo request, id 4062, seq 5, length 64
<--- insert sound of crickets here (chirp, chirp)
So am I missing something obvious, or is this the wrong approach? Is there something else I can try (e.g. two physical interfaces, eth0 and eth1)?
The easiest way would be just to use two machines, but I want all of this self-contained, so it can all be scripted and automated on a single machine, without any other dependencies...
UPDATE:
There is no need for the two socat processes to be connected with a TCP connection; it's possible (and preferable for me) to do this:
socat TUN:10.0.1.1/24,up TUN:10.0.2.1/24,up &
The same problem still exists though...
OK, so I found a solution using Linux network namespaces (netns). There is a helpful article about how to use it here: http://code.google.com/p/coreemu/wiki/Namespaces
This is what I did for my setup....
First, download and install CORE: http://cs.itd.nrl.navy.mil/work/core/index.php
Next, run this script:
#!/bin/sh
core-cleanup.sh > /dev/null 2>&1
ip link set vbridge down > /dev/null 2>&1
brctl delbr vbridge > /dev/null 2>&1
# create a server node namespace container - node 0
vnoded -c /tmp/n0.ctl -l /tmp/n0.log -p /tmp/n0.pid > /dev/null
# create a virtual Ethernet (veth) pair, installing one end into node 0
ip link add name veth0 type veth peer name n0.0
ip link set n0.0 netns `cat /tmp/n0.pid`
vcmd -c /tmp/n0.ctl -- ip link set n0.0 name eth0
vcmd -c /tmp/n0.ctl -- ifconfig eth0 10.0.0.1/24 up
# start web server on node 0
vcmd -I -c /tmp/n0.ctl -- lighttpd -f /etc/lighttpd/lighttpd.conf
# create client node namespace container - node 1
vnoded -c /tmp/n1.ctl -l /tmp/n1.log -p /tmp/n1.pid > /dev/null
# create a virtual Ethernet (veth) pair, installing one end into node 1
ip link add name veth1 type veth peer name n1.0
ip link set n1.0 netns `cat /tmp/n1.pid`
vcmd -c /tmp/n1.ctl -- ip link set n1.0 name eth0
vcmd -c /tmp/n1.ctl -- ifconfig eth0 10.0.0.2/24 up
# bridge together nodes using the other end of each veth pair
brctl addbr vbridge
brctl setfd vbridge 0
brctl addif vbridge veth0
brctl addif vbridge veth1
ip link set veth0 up
ip link set veth1 up
ip link set vbridge up
This basically sets up 2 virtual/isolated/name-spaced networks on your Linux host, in this case, node 0 and node 1. A web server is started on node 0.
All you need to do now is run curl on node 1:
vcmd -c /tmp/n1.ctl -- curl --output /dev/null http://10.0.0.1
And monitor the traffic with tcpdump:
tcpdump -s 1514 -i veth0 -n
This seems to work quite well... still experimenting, but looks like it will solve my problem.
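For what it's worth, on newer distributions the same isolation can be built with plain iproute2 network namespaces, without installing CORE. This is a rough, untested equivalent of the setup above (the namespace names n0/n1 are arbitrary; lighttpd and curl are the same assumptions as before):
ip netns add n0
ip netns add n1
# a veth pair with one end in each namespace takes the place of the bridge setup
ip link add veth0 type veth peer name veth1
ip link set veth0 netns n0
ip link set veth1 netns n1
ip netns exec n0 ip addr add 10.0.0.1/24 dev veth0
ip netns exec n0 ip link set veth0 up
ip netns exec n1 ip addr add 10.0.0.2/24 dev veth1
ip netns exec n1 ip link set veth1 up
# server in n0, capture on its veth end, client in n1
ip netns exec n0 lighttpd -f /etc/lighttpd/lighttpd.conf
ip netns exec n0 tcpdump -s 1514 -i veth0 -n &
ip netns exec n1 curl --output /dev/null http://10.0.0.1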
We have a Linux application (we don't have the source) that seems to be hanging. The socket between the two processes is reported as ESTABLISHED, and there is some data in the kernel socket buffer (although nowhere near the configured 16M via wmem/rmem). Both ends of the socket seem to be stuck on a sendto().
Below is some investigation using netstat/lsof and strace:
HOST A (10.152.20.28)
[root@hosta ~]# lsof -n -u df01 | grep 12959 | grep 12u
q 12959 df01 12u IPv4 4398449 TCP 10.152.20.28:38521->10.152.20.29:gsigatekeeper (ESTABLISHED)
[root@hosta ~]# netstat -anp | grep 38521
tcp 268754 90712 10.152.20.28:38521 10.152.20.29:2119 ESTABLISHED 12959/q
[root@hosta ~]# strace -p 12959
Process 12959 attached - interrupt to quit
sendto(12, "sometext\0somecode\0More\0exJKsss"..., 542, 0, NULL, 0 <unfinished ...>
Process 12959 detached
[root@hosta ~]#
HOST B (10.152.20.29)
[root@hostb ~]# netstat -anp | grep 38521
tcp 72858 110472 10.152.20.29:2119 10.152.20.28:38521 ESTABLISHED 25512/q
[root@hostb ~]# lsof -n -u df01 | grep 38521
q 25512 df01 14u IPv4 6456715 TCP 10.152.20.29:gsigatekeeper->10.152.20.28:38521 (ESTABLISHED)
[root@hostb ~]# strace -p 25512
Process 25512 attached - interrupt to quit
sendto(14, "\0\10\0\0\0Owner\0sym\0Type\0Ctpy\0Time\0Lo"..., 207, 0, NULL, 0 <unfinished ...>
Process 25512 detached
[root@hostb ~]#
We have upgraded the NIC driver to the latest and greatest. The systems are running RHEL 5.6 x64 (2.6.18-238.el5). I have checked the errata for RHEL 5.7 and 5.8, but I can see no mention of bugs in the bnx2 driver or the kernel.
Does anyone have any ideas of how to debug this further?
Is either side actually reading? If not, it could be that both sides' receive buffers are full, leading to not sending data (due to the receive window being filled), leading to both send buffers being filled, which will cause sendto to block. (It's possible that this could happen despite your setting of wmem/rmem if the application is setting the SO_RCVBUF and SO_SNDBUF socket options.)
To debug this, I'd synchronize both machines' clocks, then run both applications under strace with the -e trace=network and -tt options, so you can compare the logs and see if the application isn't reading.
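Concretely, that could look something like this on each host, using the PIDs from the netstat output above and writing timestamped logs you can compare afterwards:
# on host A
strace -f -tt -e trace=network -p 12959 -o /tmp/hosta.trace
# on host B
strace -f -tt -e trace=network -p 25512 -o /tmp/hostb.trace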
You could also use a network analyzer (such as Wireshark) to determine if the TCP receive window gets stuck on 0.
If this is the case, you could probably work around this by creating a small caching proxy, which would recv/send from both sides, buffering whatever can't be sent at the time.
Is there a way to tie a network connection to a PID (process ID) without forking to lsof or netstat?
Currently lsof is being used to poll which connections belong to which process ID. However, lsof and netstat can be quite expensive on a busy host, and I would like to avoid forking out to these tools.
Is there someplace similar to /proc/$pid where one can look to find this information? I know what the network connections are by examining /proc/net but can't figure out how to tie this back to a pid. Over in /proc/$pid, there doesn't seem to be any network information.
The target hosts are Linux 2.4 and Solaris 8 to 10. If possible, I'd like a solution in Perl, but I am willing to do C/C++.
Additional notes:
I would like to emphasize that the goal here is to tie a network connection to a PID. Getting one or the other is trivial, but putting the two together in a low-cost manner appears to be difficult. Thanks for the answers so far!
I don't know how often you need to poll, or what you mean by "expensive", but with the right options both netstat and lsof run a lot faster than in the default configuration.
Examples:
netstat -ltn
shows only listening tcp sockets, and omits the (slow) name resolution that is on by default.
lsof -b -n -i4tcp:80
omits all blocking operations, name resolution, and limits the selection to IPv4 tcp sockets on port 80.
On Solaris you can use pfiles(1) to do this:
# ps -fp 308
UID PID PPID C STIME TTY TIME CMD
root 308 255 0 22:44:07 ? 0:00 /usr/lib/ssh/sshd
# pfiles 308 | egrep 'S_IFSOCK|sockname: '
6: S_IFSOCK mode:0666 dev:326,0 ino:3255 uid:0 gid:0 size:0
sockname: AF_INET 192.168.1.30 port: 22
For Linux, this is more complex (gruesome):
# pgrep sshd
3155
# ls -l /proc/3155/fd | fgrep socket
lrwx------ 1 root root 64 May 22 23:04 3 -> socket:[7529]
# fgrep 7529 /proc/3155/net/tcp
6: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 7529 1 f5baa8a0 300 0 0 2 -1
00000000:0016 is 0.0.0.0:22. Here's the equivalent output from netstat -a:
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
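If the goal is to avoid forking lsof/netstat entirely, the same lookup can be done straight from /proc. Here is a rough shell sketch (IPv4/TCP only, port 22 as in the example above) that resolves a local port to the owning PID via the socket inode; the same walk translates directly to Perl if that is the preferred language:
#!/bin/sh
# sketch: find the PID(s) owning the socket on local TCP port 22, without netstat or lsof
PORT_HEX=$(printf '%04X' 22)
# field 2 of /proc/net/tcp is local_address in hex, field 10 is the socket inode
INODE=$(awk -v p=":$PORT_HEX" '$2 ~ p"$" { print $10; exit }' /proc/net/tcp)
# walk every process's fd directory looking for socket:[inode]
for fd in /proc/[0-9]*/fd/*; do
    if [ "$(readlink "$fd" 2>/dev/null)" = "socket:[$INODE]" ]; then
        pid=${fd#/proc/}
        echo "${pid%%/*}"
    fi
done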
Why don't you look at the source code of netstat and see how it gets the information? It's open source.
For Linux, have a look at the /proc/net directory
(for example, cat /proc/net/tcp lists your tcp connections). Not sure about Solaris.
Some more information here.
I guess netstat basically uses this exact same information, so I don't know if you will be able to speed it up a whole lot. Be sure to try netstat's '-an' flags to NOT resolve IP addresses to hostnames in real time (as this can take a lot of time due to DNS queries).
The easiest thing to do is
strace -f netstat -na
On Linux (I don't know about Solaris). This will give you a log of all of the system calls made. It's a lot of output, some of which will be relevant. Take a look at the files in the /proc filesystem that it's opening; this should lead you to how netstat does it. Incidentally, ltrace will allow you to do the same thing through the C library. That's not useful for you in this instance, but it can be useful in other circumstances.
If it's not clear from that, then take a look at the source.
Take a look at these answers which thoroughly explore the options available:
How I can get ports associated to the application that opened them?
How to do like "netstat -p", but faster?