Kernel panic on doing skb copy and sending to ipstack - linux

Code which doesn't crash:
int process(skb) {
    doing some changes to skb;
    netif_rx_ni(skb); // send to ip stack
    return 0;
}
skb knet_rx_cb() {
    process(skb);
    return skb;
}
Code which gives a kernel panic
48y(config)# do ping 10.20.0.2
Message from syslogd#48y at Jun 27 15:32:14 ...
kernel:[ 709.727421] skbuff: skb_under_panic: text:00000000a8ea24db len:-65279 put:-65339 head:0000000080c98bfe data:000000000d494eec tail:0x100 end:0x3ec0 dev:Eth0.1-0
int process(orig_skb)
{
    nskb = skb_copy(orig_skb);
    doing some changes to nskb;
    netif_rx_ni(nskb); // send copy skb to ip stack
    return 0;
}
skb knet_rx_cb()
{
    process(skb);
    return skb;
}

Related

Linux send with MSG_DONTWAIT blocks?

In my multithreaded Linux server application, a thread gets stuck forever in the send function, which I believe should be nonblocking. The code looks like this:
while(size) {
    const ssize_t sent = send(unixSocketFD, data, size, MSG_NOSIGNAL | MSG_DONTWAIT);
    if(sent > 0) {
        size -= sent;
        data += sent;
    } else {
        if(-1 == sent) {
            if(EINTR == errno) {
                continue;
            }
            if(EWOULDBLOCK == errno || EAGAIN == errno) {
                return 0; // need to be called again later
            }
        }
        return -1; // indicate error ...
    }
}
... and the stacktrace:
(gdb) bt
#0 0x00007ffff7bca9ff in __libc_send (fd=5, buf=0x7fffedd25880, n=87643, flags=16448) at ../sysdeps/unix/sysv/linux/x86_64/send.c:26
#1 0x00000000004076cf in Output::sendBlock (this=0x7fffdc0a3190, unixSocketFD=5) at OutputBlock.cpp:24
...
I believe that send should return immediately when the flags contain MSG_DONTWAIT. What's wrong with my code/assumptions? Thank you for any suggestions.
I'm sorry, this was my mistake: the send was not blocking. Instead, it was entered again and again even though it had indicated that the socket was not ready to accept more data.

Why does queue_work return 0 in linux kernel driver module

I have been porting a custom vendor class USB driver from WINCE 6.0 to the Linux kernel for a custom Android device. This driver has a bulk endpoint that is used to transfer data to a file. Please don't explain to me that this is a bad idea; I understand this.
In my bulkout_complete function I copy the data to a write buffer and then call queue_work to schedule a work item that writes the data to the file. To ensure that the host cannot send data again until the write has completed, I have a write status variable that gets cleared as the last operation in the work queue function. The host driver issues a vendor-specific request on the control endpoint to monitor the status of that variable. When the variable is cleared, the host sends the next packet to the bulk endpoint.
The problem is that even though the variable has been cleared, the next call to queue_work returns zero, which seems to indicate that the work is already in progress.
From my header file:
u8 mWriteBuffer[4096];
u32 mWriteLength;
static bool mWriteActive = false;
#define USB_WORKQUEUE_NAME "wqusb"
static struct workqueue_struct *wqUSB;
static struct work_struct writeFile_work;
DECLARE_WORK(writeFile_work,file_write_handler);
Creating and cleaning up the workqueue
static int __init init(void)
{
    wqUSB = create_workqueue(USB_WORKQUEUE_NAME);
    // other init stuff
}
static void __exit cleanup(void)
{
    flush_workqueue(wqUSB);
    destroy_workqueue(wqUSB);
    //other cleanup stuff
}
Bulk endpoint code
static void bulkout_complete(struct usb_ep *ep, struct usb_request *req)
{
    struct usb_composite_dev *cdev;
    struct f_myusb *ss = ep->driver_data;
    int status = req->status;
    cdev = ss->function.config->cdev;
    switch (status)
    {
    case 0: /* normal completion? */
        mWriteLength = (u32)req->length;
        memcpy(mWriteBuffer,(u8*)req->buf,req->length);
        mWriteActive = true;
        if(queue_work(wqUSB,&writeFile_work)==0)
        {
            printk(KERN_INFO "[USB DRIVER] - failed to queuework \n");
        }
        break;
    /* this endpoint is normally active while we're configured */
    case -ECONNABORTED: /* hardware forced ep reset */
    case -ECONNRESET: /* request dequeued */
    case -ESHUTDOWN: /* disconnect from host */
        VDBG(cdev, "%s gone (%d), %d/%d\n", ep->name, status,
             req->actual, req->length);
        free_ep_req(ep, req);
        return;
    case -EOVERFLOW: /* buffer overrun on read means that
                      * we didn't provide a big enough
                      * buffer.
                      */
    default:
#if 1
        DBG(cdev, "%s complete --> %d, %d/%d\n", ep->name,
            status, req->actual, req->length);
#endif
    case -EREMOTEIO: /* short read */
        break;
    }
    status = usb_ep_queue(ep, req, GFP_ATOMIC);
    if (status) {
        ERROR(cdev, "kill %s: resubmit %d bytes --> %d\n",
              ep->name, req->length, status);
        usb_ep_set_halt(ep);
    }
}
The workqueue function that writes data
void file_write_handler(struct work_struct *pwork)
{
    write_file(mWriteBuffer,mWriteLength);
    mWriteActive = false;
}

How to get ipv4 address of an interface using libnl3 (netlink version 3) on linux?

I'm learning the netlink library version 3 and I want to know how to get the IPv4 address of a specified network interface. I can get the MAC address and even requery the interface name from a link data structure, but I cannot figure out how to get the IP address using the libnl and libnl-route libs. I did find some code to get the IP address using the libnl-cli lib, but that is for dumping the results to a file descriptor (think stdout). I have sent mail to the mailing list for this library but have not gotten a response.
Here is my code:
https://gist.github.com/netskink/4f554ed6657954b17ab255ad5bc6d1f0
Here are my results:
./stats
Returned link name is enp6s0
Returned link addr is a0:36:9f:66:93:13
I've seen the mechanism to retrieve the IP address using ioctls, but since the netlink library can return the IP address using the cli sublibrary, I figure it can be done; I just cannot figure out how.
An interface can have multiple addresses (IPv4 and IPv6; the code sample gave me one of each), so there is no function that returns a single address for an interface. If you already knew the specific local address, you could call rtnl_addr_get. Otherwise, iterate over the addresses.
#include <libnl3/netlink/cache.h>
void addr_cb(struct nl_object *o, void *data)
{
    int ifindex = (int)(intptr_t)data;
    struct rtnl_addr *addr = (struct rtnl_addr *)o;
    if (NULL == addr) {
        /* error */
        printf("addr is NULL %d\n", errno);
        return;
    }
    int cur_ifindex = rtnl_addr_get_ifindex(addr);
    if(cur_ifindex != ifindex)
        return;
    const struct nl_addr *local = rtnl_addr_get_local(addr);
    if (NULL == local) {
        /* error */
        printf("rtnl_addr_get_local failed\n");
        return;
    }
    char addr_str[ADDR_STR_BUF_SIZE];
    const char *addr_s = nl_addr2str(local, addr_str, sizeof(addr_str));
    if (NULL == addr_s) {
        /* error */
        printf("nl_addr2str failed\n");
        return;
    }
    fprintf(stdout, "\naddr is: %s\n", addr_s);
}
You can iterate over the addresses in the cache and check whether each one belongs to the interface you need (by comparing ifindex). Please take a look at https://www.infradead.org/~tgr/libnl/doc/api/cache_8c_source.html for useful functions (there is a filter function, among others).
int ifindex = rtnl_link_get_ifindex(p_rtnl_link);
printf("ifindex: %d\n", ifindex);
bool empty = nl_cache_is_empty(addr_cache);
printf("empty: %d\n", empty);
nl_cache_foreach(addr_cache, addr_cb, (void*)(intptr_t)ifindex);
To check the IP version, use rtnl_addr_get_family.
Building upon user2518959's answer.
The rtnl_addr_alloc_cache and rtnl_link_alloc_cache both return a nl_cache object/structure. Even though these two results are of the same type, different routines can be used on each.
The nl_cache returned from rtnl_addr_alloc_cache can be used to get rtnl_addr objects/structures, which in turn can be passed to rtnl_addr_get_local to get the IPv4 or IPv6 address.
In contrast, the nl_cache returned from rtnl_link_alloc_cache can be used to get the interface name (eth0, enp6s0, ...) and the MAC address. The routines are rtnl_link_get_by_name and rtnl_link_get_addr respectively.
In either case, the common link between the two is the pair of routines rtnl_addr_get_ifindex and rtnl_link_get_ifindex, which return an interface index that can be used to relate entries from the two caches. That is, interface 1 from the addr nl_cache and interface 1 from the link nl_cache are the same interface. One gives the IP address and the other gives the MAC address and name.
Lastly, a tunnel will have an IP address but no MAC address, so the link cache may not give you a MAC address for it.
Here is some code which shows user2518959's approach and an alternate method which shows the relationship explicitly. user2518959 passed the interface index into the callback to filter out other interfaces.
#include <libnl3/netlink/netlink.h>
#include <libnl3/netlink/route/link.h>
#include <libnl3/netlink/route/addr.h>
#include <libnl3/netlink/cache.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
gcc ipchange.c -o ipchange $(pkg-config --cflags --libs libnl-3.0 libnl-route-3.0 libnl-cli-3.0)
*/
#define ADDR_STR_BUF_SIZE 80
void addr_cb(struct nl_object *p_nl_object, void *data) {
    int ifindex = (int) (intptr_t) data; // this is the link index passed as a parm
    struct rtnl_addr *p_rtnl_addr;
    p_rtnl_addr = (struct rtnl_addr *) p_nl_object;
    int result;
    if (NULL == p_rtnl_addr) {
        /* error */
        printf("addr is NULL %d\n", errno);
        return;
    }
    // This routine is not mentioned in the doxygen help.
    // It is listed under Attributes, but no descriptive text.
    // this routine just returns p_rtnl_addr->a_ifindex
    int cur_ifindex = rtnl_addr_get_ifindex(p_rtnl_addr);
    if(cur_ifindex != ifindex) {
        // skip interfaces where the index differs.
        return;
    }
    // Adding this to see if I can filter on ipv4 addr
    // this routine just returns p_rtnl_addr->a_family
    // this is not the one to use
    // ./linux/netfilter.h: NFPROTO_IPV6 = 10,
    // ./linux/netfilter.h: NFPROTO_IPV4 = 2,
    // this is the one to use
    // x86_64-linux-gnu/bits/socket.h
    // defines AF_INET6 = PF_INET6 = 10
    // defines AF_INET = PF_INET = 2
    result = rtnl_addr_get_family(p_rtnl_addr);
    // printf( "family is %d\n",result);
    if (AF_INET6 == result) {
        // early exit, I don't care about IPV6
        return;
    }
    // This routine just returns p_rtnl_addr->a_local
    const struct nl_addr *p_nl_addr_local = rtnl_addr_get_local(p_rtnl_addr);
    if (NULL == p_nl_addr_local) {
        /* error */
        printf("rtnl_addr_get_local failed\n");
        return;
    }
    char addr_str[ADDR_STR_BUF_SIZE];
    const char *addr_s = nl_addr2str(p_nl_addr_local, addr_str, sizeof(addr_str));
    if (NULL == addr_s) {
        /* error */
        printf("nl_addr2str failed\n");
        return;
    }
    fprintf(stdout, "\naddr is: %s\n", addr_s);
}
int main(int argc, char **argv, char **envp) {
    int err;
    struct nl_sock *p_nl_sock;
    struct nl_cache *link_cache;
    struct nl_cache *addr_cache;
    struct rtnl_addr *p_rtnl_addr;
    struct nl_addr *p_nl_addr;
    struct nl_link *p_nl_link;
    struct rtnl_link *p_rtnl_link;
    char addr_str[ADDR_STR_BUF_SIZE];
    char *pchLinkName;
    char *pchLinkAddr;
    char *pchIPAddr;
    char *interface;
    interface = "enp6s0";
    pchLinkAddr = malloc(40);
    pchIPAddr = malloc(40);
    strcpy(pchLinkAddr,"11:22:33:44:55:66");
    strcpy(pchIPAddr,"123.456.789.abc");
    p_nl_sock = nl_socket_alloc();
    if (!p_nl_sock) {
        fprintf(stderr, "Could not allocate netlink socket.\n");
        exit(ENOMEM);
    }
    // Connect to socket (nl_connect returns 0 on success, a negative error code otherwise)
    if ((err = nl_connect(p_nl_sock, NETLINK_ROUTE)) < 0) {
        fprintf(stderr, "netlink error: %s\n", nl_geterror(err));
        nl_socket_free(p_nl_sock);
        exit(EXIT_FAILURE);
    }
    // Either choice, the result below is a mac address
    err = rtnl_link_alloc_cache(p_nl_sock, AF_UNSPEC, &link_cache);
    //err = rtnl_link_alloc_cache(p_nl_sock, AF_INET, &link_cache);
    //err = rtnl_link_alloc_cache(p_nl_sock, IFA_LOCAL, &link_cache);
    if (0 != err) {
        /* error */
        printf("rtnl_link_alloc_cache failed: %s\n", nl_geterror(err));
        return(EXIT_FAILURE);
    }
    err = rtnl_addr_alloc_cache(p_nl_sock, &addr_cache);
    if (0 != err) {
        /* error */
        printf("rtnl_addr_alloc_cache failed: %s\n", nl_geterror(err));
        return(EXIT_FAILURE);
    }
    p_rtnl_link = rtnl_link_get_by_name(link_cache, "enp6s0");
    if (NULL == p_rtnl_link) {
        /* error */
        printf("rtnl_link_get_by_name failed\n");
        return(EXIT_FAILURE);
    }
    pchLinkName = rtnl_link_get_name(p_rtnl_link);
    if (NULL == pchLinkName) {
        /* error */
        printf("rtnl_link_get_name failed\n");
        return(EXIT_FAILURE);
    }
    printf("Returned link name is %s\n",pchLinkName);
    ////////////////////////////////// mac address
    p_nl_addr = rtnl_link_get_addr(p_rtnl_link);
    if (NULL == p_nl_addr) {
        /* error */
        printf("rtnl_link_get_addr failed\n");
        return(EXIT_FAILURE);
    }
    pchLinkAddr = nl_addr2str(p_nl_addr, pchLinkAddr, 40);
    if (NULL == pchLinkAddr) {
        /* error */
        printf("nl_addr2str failed\n");
        return(EXIT_FAILURE);
    }
    printf("Returned link addr is %s\n",pchLinkAddr);
    ////////////////////////////////// ip address
    // How to get ip address for a specified interface?
    //
    // The way she showed me.
    //
    // Return interface index of link object
    int ifindex = rtnl_link_get_ifindex(p_rtnl_link);
    printf("ifindex: %d\n", ifindex);
    // She gave me this but it's not necessary
    // Returns true if the cache is empty.
    // bool empty = nl_cache_is_empty(addr_cache);
    // printf("empty: %d\n", empty);
    // Call a callback on each element of the cache. The
    // arg is passed on to the callback function.
    // addr_cache is the cache to iterate on
    // addr_cb is the callback function
    // ifindex is the argument passed to the callback function
    //
    nl_cache_foreach(addr_cache, addr_cb, (void*)(intptr_t)ifindex);
    // This shows that the link index returned from rtnl_addr_get_ifindex
    // and rtnl_link_get_ifindex are equivalent when using the rtnl_addr
    // and rtnl_link from the two respective caches.
    // Another way...
    // This will iterate through the cache of ip's
    printf("Getting the list of interfaces by ip addr cache\n");
    int count = nl_cache_nitems(addr_cache);
    printf("addr_cache has %d items\n",count);
    struct nl_object *p_nl_object;
    p_nl_object = nl_cache_get_first(addr_cache);
    p_rtnl_addr = (struct rtnl_addr *) p_nl_object;
    for (int i=0; i<count; i++) {
        // This routine just returns p_rtnl_addr->a_local
        const struct nl_addr *p_nl_addr_local = rtnl_addr_get_local(p_rtnl_addr);
        if (NULL == p_nl_addr_local) {
            /* error */
            printf("rtnl_addr_get_local failed\n");
            return(EXIT_FAILURE);
        }
        int cur_ifindex = rtnl_addr_get_ifindex(p_rtnl_addr);
        printf("This is index %d\n",cur_ifindex);
        const char *addr_s = nl_addr2str(p_nl_addr_local, addr_str, sizeof(addr_str));
        if (NULL == addr_s) {
            /* error */
            printf("nl_addr2str failed\n");
            return(EXIT_FAILURE);
        }
        fprintf(stdout, "\naddr is: %s\n", addr_s);
        //
        printf("%d\n",i);
        p_nl_object = nl_cache_get_next(p_nl_object);
        p_rtnl_addr = (struct rtnl_addr *) p_nl_object;
        // Just for grins
    }
    // Another way...
    // This will iterate through the cache of LLC
    printf("Getting the list of interfaces by mac cache\n");
    count = nl_cache_nitems(link_cache);
    printf("link_cache has %d items\n",count);
    p_nl_object = nl_cache_get_first(link_cache);
    p_rtnl_link = (struct rtnl_link *) p_nl_object;
    for (int i=0; i<count; i++) {
        // This routine returns the link-layer (mac) address of the link
        const struct nl_addr *p_nl_addr_mac = rtnl_link_get_addr(p_rtnl_link);
        if (NULL == p_nl_addr_mac) {
            /* error */
            printf("rtnl_link_get_addr failed\n");
            return(EXIT_FAILURE);
        }
        int cur_ifindex = rtnl_link_get_ifindex(p_rtnl_link);
        printf("This is index %d\n",cur_ifindex);
        const char *addr_s = nl_addr2str(p_nl_addr_mac, addr_str, sizeof(addr_str));
        if (NULL == addr_s) {
            /* error */
            printf("nl_addr2str failed\n");
            return(EXIT_FAILURE);
        }
        fprintf(stdout, "\naddr is: %s\n", addr_s);
        //
        printf("%d\n",i);
        p_nl_object = nl_cache_get_next(p_nl_object);
        p_rtnl_link = (struct rtnl_link *) p_nl_object;
    }
    return(EXIT_SUCCESS);
}

dev_queue_xmit results in kernel panic

I am writing part of a kernel module which forwards an skbuff from one interface out to another interface, e.g. all packets coming in on eth0 get forwarded out on eth1. The problem is that even the first packet I try to transmit results in a kernel crash in dev_queue_xmit(). Can anyone help me understand what's going wrong here? The kernel panic occurs in
<2>kernel BUG at net/core/skbuff.c:927!
The function I am using to transmit the packet is below. Please let me know.
int txPktOnOtherIf(struct sk_buff *skb, struct net_device *tdev)
{
    int reservedSpace=max((int)LL_RESERVED_SPACE(tdev),(int)sizeof(struct ethhdr));
    int buffLen = reservedSpace - sizeof(struct ethhdr) + skb->len + skb->dev->needed_tailroom;
    struct sk_buff* nskb = NULL;
    int err = 0;
    printk("ReservedSpace is %d and buffer len is %d",reservedSpace,buffLen);
    nskb = alloc_skb(buffLen, GFP_KERNEL);
    if (!nskb) {
        printk("Couldn't allocate SKB\n");
        return -1;
    }
    skb_reserve(nskb, reservedSpace);
    skb_reset_network_header(nskb);
    skb_put(nskb, skb->len - sizeof(struct ethhdr));
    skb_push(nskb, sizeof(struct ethhdr));
    skb_reset_mac_header(nskb);
    skb_reset_mac_len(nskb);
    err = skb_store_bits(nskb, 0, skb->data, skb->len);
    if (err) {
        kfree_skb(nskb);
        printk("Error %d storing to SKB\n", err);
        return -1;
    }
    nskb->dev = tdev;
    nskb->protocol = skb->protocol;
    skb_get(nskb);
    err = dev_queue_xmit(nskb);
    err = net_xmit_eval(err);
    if (err) {
        kfree_skb(nskb);
        printk("Error %d sending frame\n", err);
        return -1;
    }
    return 0;
}

flush_cache_range() and flush_tlb_range() do not seem to work

Here is what I did:
A user space process uses malloc() to allocate memory on the heap, fills it with a specific pattern of characters, and then prints the address returned by malloc().
The process id and the address of the memory chunk are passed to a kernel module that looks like this:
int init_module(void) {
    int res = 0;
    struct page *data_page;
    struct task_struct *task = NULL;
    struct vm_area_struct *next_vma;
    struct mm_struct *mm;
    task = pid_task(find_vpid(pid), PIDTYPE_PID);
    if (pid != -1)
        target_process_id = pid;
    if (!task) {
        printk("Could not find the task struct for process id %d\n", pid);
        return 0;
    } else {
        printk("Found the task <%s>\n", task->comm);
    }
    mm = task->mm;
    if (!mm) {
        printk("Could not find the mmap struct for process id %d\n", pid);
        return 0;
    }
    next_vma = find_vma(mm, addr);
    down_read(&task->mm->mmap_sem);
    res = get_user_pages(task, task->mm, addr, 1, 1, 1, &data_page, NULL);
    if (res != 1) {
        printk(KERN_INFO "get_user_pages error\n");
        up_read(&task->mm->mmap_sem);
        return 0;
    } else {
        printk("Found vma struct and it starts at: %lu\n", next_vma->vm_start);
    }
    flush_cache_range(next_vma,next_vma->vm_start,next_vma->vm_end);
    flush_tlb_range(next_vma,next_vma->vm_start,next_vma->vm_end);
    up_read(&task->mm->mmap_sem);
    return 0;
}
I added a printk() statement to the handle_mm_fault() function in the Linux kernel to track page faults caused by target_process_id (set near the top of init_module above). Something like this:
if (unlikely(current->pid == target_process_id))
    printk("Target process <%d> generated a page fault at address %lu\n", current->pid, address);
Now, what I noticed is that this last printk() statement does not catch anything.
The function init_module is the initialization function for a kernel module. It is inserted into the running kernel with the command insmod module.ko pid=<processId> addr=<address>
Any idea what might be going wrong?

Resources