How to UDP Broadcast from Linux Kernel?

I'm developing an experimental Linux kernel module, so...
How to UDP Broadcast from Linux Kernel?

-13 is -EACCES. Do you have SO_BROADCAST set? I believe sock_sendmsg returns -EACCES if SO_BROADCAST isn't set and you're sending to a broadcast address.
You're looking for <errno.h> for error codes.
What kernel version are you developing under? I'd like to browse through the kernel source briefly. I'm not seeing how -ENOPKG can be returned from sock_set, but I do see that -ENOPROTOOPT can be returned (which is errno 92 in kernel 2.6.27).
Oh-- and repost that bit of code where you're setting SO_BROADCAST, if you would. I didn't make a note of it and I'd like to look at it again.
Try calling it with SOL_UDP. I think that's what you're looking for. I don't have a 2.6.18 build environment set up anywhere to play with this, but give that a shot.
No-- never mind-- that's not going to do what you want. I should've read a little further in the source. I'll keep looking. This is kinda fun.
I suppose you could just set the broadcast flag yourself! smile
lock_sock(sock->sk);
sock_set_flag(sock->sk, SOCK_BROADCAST); /* the old sk->broadcast member is gone in 2.6; the flag lives in sk_flags */
release_sock(sock->sk);
You've got me stumped, and I've got to head off to bed. I did find this bit of code that might be of some assistance, though these guys aren't doing broadcasts.
http://kernelnewbies.org/Simple_UDP_Server
Good luck-- I wish I could have solved it for you.

@adjuster:
Actually, I just got it. When I'm setting SO_BROADCAST, I'm receiving 92 ("Package not installed").
What package should I install, then?
Edit: The kernel version is 2.6.18, and you are right! 92 is ENOPROTOOPT
//Socket creation
sock_create(AF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
//Broadcasting
int broadcast = 1;
int err;
if( (err = sock->ops->setsockopt(sock, SOL_SOCKET, SO_BROADCAST, (char *)&broadcast, sizeof broadcast)) < 0 )
{
    printk(KERN_ALERT MODULE_NAME ": Could not configure broadcast, error %d\n", err);
    return -1;
}
Edit: I've got this from the setsockopt man page...
ENOPROTOOPT
The option is unknown at the level indicated.
...so, I suppose that SOL_SOCKET isn't the right value to pass. I've also tried IPPROTO_UDP instead of SOL_SOCKET, with no luck.
Edit: http://docs.hp.com/en/32650-90372/ch02s10.html says that SO_BROADCAST is an option of the SOL_SOCKET level, but I continue to get -92
Edit: I'm desperate, so I've tried SOL_UDP, still -92.
Yes, it is fun :) ... Good synergy! At the end (I hope we get there soon), let's assemble a clean and nice definitive answer! :)
Edit: Even if I hard-set the broadcast flag, sock_sendmsg will fail (-13, "Permission denied")
sock->sk->sk_flags |= SO_BROADCAST; /* note: SO_BROADCAST is the setsockopt option number, not an sk_flags bit - the flag itself is SOCK_BROADCAST */
I really need some help on this one..

Mm, I wish I had more time to help you out.
To get UDP multicasting to work, it has to be baked into your kernel: you have to enable it (CONFIG_IP_MULTICAST) when you configure your kernel. Google should have more info; I hope this puts you on the right track.

Look at the IPVS (linux virtual server) code in the Linux kernel. It already has a working implementation of UDP multicast, which it uses to share connection state for failover.
Having already taken a look at this, and knowing some people who have done it, I would really recommend creating a netlink link and using a userspace daemon to broadcast the information over the network.

The following worked for me (so this thread can finally be closed).
int yes = 1;
sock_setsockopt(sock, SOL_SOCKET, SO_BROADCAST, (char *)&yes, sizeof(yes));
sock->ops->connect(sock, (struct sockaddr *)&addr, sizeof(struct sockaddr), 0);
Here sock is an initialized struct socket, and addr should be a struct sockaddr_in with a broadcast address in it.
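Putting the whole thread together, here is a minimal sketch against the 2.6-era API discussed above (the helper name send_udp_broadcast is mine; also note that on some kernels sock_setsockopt expects a user-space pointer, so a set_fs(KERNEL_DS)/set_fs(old_fs) pair may be needed around that call):
#include <linux/net.h>
#include <linux/in.h>
#include <linux/string.h>
#include <linux/uio.h>
#include <net/sock.h>
/* Hypothetical helper: broadcast len bytes of buf to UDP port on 255.255.255.255. */
static int send_udp_broadcast(const char *buf, size_t len, u16 port)
{
    struct socket *sock;
    struct sockaddr_in addr;
    struct msghdr msg;
    struct kvec iov;
    int yes = 1;
    int err;
    err = sock_create(AF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
    if (err < 0)
        return err;
    /* sock_setsockopt (not sock->ops->setsockopt) is the path that handles SOL_SOCKET. */
    sock_setsockopt(sock, SOL_SOCKET, SO_BROADCAST, (char *)&yes, sizeof(yes));
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_BROADCAST); /* 255.255.255.255 */
    err = sock->ops->connect(sock, (struct sockaddr *)&addr, sizeof(addr), 0);
    if (err == 0) {
        iov.iov_base = (void *)buf;
        iov.iov_len = len;
        memset(&msg, 0, sizeof(msg));
        err = kernel_sendmsg(sock, &msg, &iov, 1, len);
    }
    sock_release(sock);
    return err;
}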

Related

Where does Wireshark/tcpdump/libpcap intercept packets inside the Linux kernel?

According to this, Wireshark is able to get packets before they are dropped (which is why I cannot get such packets myself). I'm still wondering about the exact location in the Linux kernel where Wireshark fetches the packets.
The answer goes "On UN*Xes, it uses libpcap, which, on Linux, uses AF_PACKET sockets." Does anyone have a more concrete example of using AF_PACKET sockets? If I understand Wireshark correctly, the network interface card (NIC) makes a copy of all incoming packets and sends it to a user-defined filter (Berkeley Packet Filter). But where does this happen? Or is my understanding wrong - am I missing something here?
Thanks in advance!
But where does this happen?
If I understood you correctly, you want to know where such a socket is initialized.
There is the pcap_create function, which tries to determine the type of the source interface, creates a duplicate of it, and activates it.
For network interfaces, see the pcap_create_interface function => pcap_create_common function => pcap_activate_linux function.
All initialization happens in pcap_activate_linux => the activate_new function => the iface_bind function
( copy the descriptor of the device with handlep->device = strdup(device);,
create the socket with socket(PF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL)),
bind the socket to the device with bind(fd, (struct sockaddr *) &sll, sizeof(sll)) ).
For more detailed information, read the comments in the source files of the mentioned functions - they are very detailed.
After initialization, all the work happens in a group of functions such as pcap_read_linux.
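In case it helps, here is a concrete, minimal user-space sketch of such a socket outside of libpcap (it needs root/CAP_NET_RAW; note that SOCK_RAW also delivers the link-layer header, whereas libpcap's "cooked" mode shown above uses SOCK_DGRAM):
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <arpa/inet.h>        /* htons */
int main(void)
{
    unsigned char buf[65536];
    int i;
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) {
        perror("socket");
        return 1;
    }
    /* Unbound, so it sees frames from every interface; bind() it to a
       struct sockaddr_ll (as iface_bind does) to watch just one. */
    for (i = 0; i < 10; i++) {
        struct sockaddr_ll addr;
        socklen_t alen = sizeof(addr);
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                             (struct sockaddr *)&addr, &alen);
        if (n < 0) {
            perror("recvfrom");
            break;
        }
        printf("ifindex %d: %zd bytes\n", addr.sll_ifindex, n);
    }
    close(fd);
    return 0;
}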
On Linux, you should be able to simply use tcpdump (which leverages the libpcap library) to do this. You can write to a file or to stdout, and you specify the filter at the end of the tcpdump command.
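For example (the interface name and filter here are only illustrative):
tcpdump -i eth0 -w capture.pcap 'udp port 5000'
Drop the -w option to get a decoded one-line summary of each packet on stdout instead.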

force socket disconnect without forging RST, Linux

I have a network client which is stuck in recvfrom on a connection to a server not under my control, which, after 24+ hours, is probably never going to respond. The program has processed a great deal of data, so I don't want to kill it; I want it to abandon the current connection and proceed. (It will do so correctly if recvfrom returns EOF or -1.) I have already tried several different programs that purport to be able to disconnect stale TCP channels by forging RSTs (tcpkill, cutter, killcx); none had any effect; the program remained stuck in recvfrom. I have also tried taking the network interface down; again, no effect.
It seems to me that there really should be a way to force a disconnect at the socket-API level without forging network packets. I do not mind horrible hacks, up to and including poking kernel data structures by hand; this is a disaster-recovery situation. Any suggestions?
(For clarity, the TCP channel at issue here is in ESTABLISHED state according to lsof.)
I do not mind horrible hacks
That's all you have to say. I am guessing the tools you tried didn't work because they sniff traffic to get an acceptable ACK number to kill the connection. Without traffic flowing they have no way to get hold of it.
Here are things you can try:
Probe all the sequence numbers
Where those tools failed, you can still do it: write a simple Python script with scapy that, for each sequence number, sends a RST segment with the correct 4-tuple (ports and addresses). There are at most 4 billion sequence numbers (actually fewer, assuming a decent window - you can find out the window for free using ss -i).
Make a kernel module to get hold of the socket
Make a kernel module that walks the list of TCP sockets: look for sk_nulls_for_each(sk, node, &tcp_hashinfo.ehash[i].chain)
Identify your victim sk
At this point you have intimate access to your socket. So:
You can call tcp_reset or tcp_disconnect on it. You won't be able to call tcp_reset directly (since it doesn't have an EXPORT_SYMBOL), but you should be able to mimic it: most of the functions it calls are exported.
Or you can get the expected ACK number from tcp_sk(sk) and directly forge a RST packet with scapy.
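For the second route, a minimal sketch of the field access (with sk pointing at the victim socket found above; exact tcp_sock field names vary a little across kernel versions):
struct tcp_sock *tp = tcp_sk(sk);
/* rcv_nxt is the sequence number the victim expects to receive next;
   snd_nxt is the next one it will send - both are what a forged RST needs. */
printk("snd_nxt %u rcv_nxt %u\n", tp->snd_nxt, tp->rcv_nxt);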
Here is a function I use to print established sockets - I scrounged it together from bits and pieces of the kernel some time ago:
#include <net/inet_hashtables.h>

#define NIPQUAD(addr) \
    ((unsigned char *)&addr)[0], \
    ((unsigned char *)&addr)[1], \
    ((unsigned char *)&addr)[2], \
    ((unsigned char *)&addr)[3]
#define NIPQUAD_FMT "%u.%u.%u.%u"

extern struct inet_hashinfo tcp_hashinfo;

/* Decides whether a bucket has any sockets in it. */
static inline bool empty_bucket(int i)
{
    return hlist_nulls_empty(&tcp_hashinfo.ehash[i].chain);
}

void print_tcp_socks(void)
{
    int i = 0;
    struct inet_sock *inet;

    /* Walk hash array and lock each if not empty. */
    printk("Established ---\n");
    for (i = 0; i <= tcp_hashinfo.ehash_mask; i++) {
        struct sock *sk;
        struct hlist_nulls_node *node;
        spinlock_t *lock = inet_ehash_lockp(&tcp_hashinfo, i);

        /* Lockless fast path for the common case of empty buckets */
        if (empty_bucket(i))
            continue;

        spin_lock_bh(lock);
        sk_nulls_for_each(sk, node, &tcp_hashinfo.ehash[i].chain) {
            if (sk->sk_family != PF_INET)
                continue;
            inet = inet_sk(sk);
            printk(NIPQUAD_FMT ":%hu ---> " NIPQUAD_FMT ":%hu\n",
                   NIPQUAD(inet->inet_saddr), ntohs(inet->inet_sport),
                   NIPQUAD(inet->inet_daddr), ntohs(inet->inet_dport));
        }
        spin_unlock_bh(lock);
    }
}
You should be able to pop this into a simple "Hello World" module, and after insmoding it you will see the sockets in dmesg (much like ss or netstat).
I understand that what you want is to automate the process for a test. But if you just want to check the correct handling of the recvfrom error, you could attach with GDB and close the fd with a close() call.
Here you can see an example.
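For instance, something along these lines (read the fd number from ls -l /proc/<pid>/fd; the cast helps when gdb has no debug symbols for close):
gdb -p <pid>
(gdb) call (int) close(<fd>)
(gdb) detach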
Another option is to use scapy for crafting proper RST packets (which is not in your list). This is the way I tested connection RSTs in a bridged system (IMHO it is the best option); you could also implement a graceful shutdown.
Here is an example of the scapy script.

Why would you ever call accept() with addr and addrlen set to 0?

While looking at the syscalls made by a Linux executable, I saw this one that struck me as odd:
accept(fd, 0, 0);
Why would addr and addrlen be set to 0?
I was also unable to connect to the port that the executable was listening on, but I don't think this accept() call has anything to do with that. Please correct me if I am wrong about this.
The second and third parameters are the protocol address and its length. If they are not NULL, accept will fill them in with the info of the client that connected. If you don't care who the client is, or have no need to know, you can pass both in as NULL and they won't be returned.
It would probably look more normal as
accept(fd, NULL, NULL);
In terms of usage, it's probably a little odd that we don't see this form more regularly. A lot of people go to the trouble of passing in a sockaddr struct and never use the returned information anyway. And if you do need the info down the line, you can always call getpeername on the connected socket.
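A small sketch of that pattern (the helper name is illustrative):
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
/* Accept without collecting the peer address up front... */
int accept_and_report(int listen_fd)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    int conn = accept(listen_fd, NULL, NULL);
    if (conn < 0)
        return -1;
    /* ...and recover it later, only if it turns out to be needed. */
    if (getpeername(conn, (struct sockaddr *)&peer, &len) == 0)
        printf("client: %s:%d\n", inet_ntoa(peer.sin_addr),
               ntohs(peer.sin_port));
    return conn;
}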

skb_dst() returns NULL

I'm trying to write a virtual netdevice driver on Linux kernel 3.3.2. Some features of my driver need the route info when transmitting packets, so I use the function skb_dst(struct sk_buff *) to get the dst_entry pointer. But whatever I do, wherever I ping, whenever I try, skb_dst() always returns NULL. I don't know why, and the bug has confused me for more than a week. Can anyone help me?
I've found the reason! It's because of a flag added to the kernel, IFF_XMIT_DST_RELEASE: if a virtual device is allocated with this flag set, the kernel will drop the routing information before handing the sk_buff to the device. Thanks to Kristof Provost for the reply all the same, and sorry for ending the question so late.
Ping uses RAW sockets. They probably bypass part of the routing infrastructure.
Try looking at raw_send_hdrinc and raw_sendmsg in net/ipv4/raw.c
To be clear, add dev->priv_flags &= ~IFF_XMIT_DST_RELEASE; to your setup function.
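That is, somewhere in the net_device setup callback passed to alloc_netdev() (mydev_setup is an illustrative name):
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
static void mydev_setup(struct net_device *dev)
{
    ether_setup(dev);
    /* Keep the dst_entry attached to outgoing skbs so that
       skb_dst() is non-NULL in the xmit handler. */
    dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
}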

Change congestion control algorithms per connection

The 'sysctl' command in Linux currently changes the congestion control algorithm globally for the entire system. But congestion control, where the TCP window size and other similar parameters are varied, is normally done per TCP connection. So my question is:
Does there exist a way where I can change the congestion control algorithm being used per TCP connection?
Or am I missing something trivial here? If so, what is it?
This is done in iperf using the -Z option - the patch is here.
This is how it is implemented (PerfSocket.cpp, line 93):
if ( isCongestionControl( inSettings ) ) {
#ifdef TCP_CONGESTION
    Socklen_t len = strlen( inSettings->mCongestion ) + 1;
    int rc = setsockopt( inSettings->mSock, IPPROTO_TCP, TCP_CONGESTION,
                         inSettings->mCongestion, len);
    if (rc == SOCKET_ERROR ) {
        fprintf(stderr, "Attempt to set '%s' congestion control failed: %s\n",
                inSettings->mCongestion, strerror(errno));
        exit(1);
    }
#else
    fprintf( stderr, "The -Z option is not available on this operating system\n");
#endif
}
where mCongestion is a string containing the name of the algorithm to use.
It seems this is possible via get/setsockopt. The only documentation I found is:
http://lkml.indiana.edu/hypermail/linux/net/0811.2/00020.html
In newer versions of Linux it is possible to set the congestion control for a specific destination using ip route ... congctl.
If anyone is familiar with this approach, please edit this post.
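For example, something like this (addresses and the algorithm are illustrative, and the chosen module must be available):
ip route add 10.0.0.0/24 via 192.168.1.1 congctl bbr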
As far as I know, there's no way to change the default TCP congestion control per process (I'd love for a bash script to be able to say that whatever it executes should default to the lp congestion control).
The only user mode API I'm aware of is as follows:
setsockopt(socket, SOL_TCP, TCP_CONGESTION, congestion_alg, strlen(congestion_alg));
where socket is an open socket, and congestion_alg is a string containing one of the words in /proc/sys/net/ipv4/tcp_allowed_congestion_control.
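Putting it together, a small self-contained sketch of the same call (IPPROTO_TCP equals SOL_TCP; "cubic" is illustrative and must appear in tcp_allowed_congestion_control):
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_CONGESTION */
int main(void)
{
    const char *alg = "cubic";
    char cur[16];            /* TCP_CA_NAME_MAX */
    socklen_t len = sizeof(cur);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    /* Set the algorithm for this one socket only. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, alg, strlen(alg)) < 0)
        perror("setsockopt(TCP_CONGESTION)");
    /* Read it back to verify. */
    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, cur, &len) == 0)
        printf("congestion control: %s\n", cur);
    return 0;
}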
Linux has pluggable congestion algorithms that let you change the algorithm used on the fly, but this is a system-wide setting, not per connection.
