msg not getting printed in struct msghdr in linux kernel - linux

I am using kernel 4.5.3 , and i am working on to implement new Layer 4 transport protocol (essentially a new socket). From user space app, when i send data through my new socket, i am receiving it in kernel. In kernel i am trying to unwrap the buffer and trying to verify if all is correct. The user data send from user space is wrapped into struct msghdr structure by socket layer of linux kernel (so i this portion has to be bug free). It is this structure i am trying to print in my kernel handler of the new socket specific sendmsg fn. Though i am able to print all the fields, except when i print the actual msg send (char *) , it is printing blank.
Below is the dump fn i have written:
static
void dump_msg_hdr(struct msghdr *msg){
struct sockaddr_in *dest_addr = NULL;
struct in_addr dest_ip;
char ipv4[32], *usermsg = NULL;
struct iovec *iov = NULL;
memset(&dest_addr, 0, sizeof(struct sockaddr_in));
memset(&dest_ip, 0, sizeof(struct in_addr));
dest_addr = (struct sockaddr_in *)msg->msg_name;
dest_ip = dest_addr->sin_addr;
getDotDecimalIpV4(htonl(dest_ip.s_addr), ipv4);
printk(KERN_INFO "Dest ip address = %s", ipv4);
iov = msg->msg_iter.iov;
usermsg = (char *)kmalloc(iov->iov_len, GFP_USER);
memcpy(usermsg, iov->iov_base, iov->iov_len);
printk(KERN_INFO "user msg = %s\n", usermsg);
printk(KERN_INFO "msg->msg_iter.count (no of data bytes) = %d", msg->msg_iter.count);
printk(KERN_INFO "iov_base = 0x%x, iov_len = %d\n", iov->iov_base, iov->iov_len);
printk(KERN_INFO "msg->msg_iter.nr_segs (no of iovec segments) = %d", msg->msg_iter.nr_segs);
kfree(usermsg);
}
output
Dest ip address = 192.168.6.6
user msg = <blank>
sg->msg_iter.count (no of data bytes) = 27
iov_base = 0xbf95c153, iov_len = 27
msg->msg_iter.nr_segs (no of iovec segments) = 1
structures
struct iov_iter {
int type;
size_t iov_offset;
size_t count;
const struct iovec *iov; /* SIMPLIFIED - see below */
unsigned long nr_segs;
};
47 struct msghdr {
48 void *msg_name; /* ptr to socket address structure */
49 int msg_namelen; /* size of socket address structure */
50 struct iov_iter msg_iter; /* data */
51 void *msg_control; /* ancillary data */
52 __kernel_size_t msg_controllen; /* ancillary data buffer length */
53 unsigned int msg_flags; /* flags on received message */
54 struct kiocb *msg_iocb; /* ptr to iocb for async requests */
55 };

Related

How are the steps to access GPIOs in linux kernel modules?

I am struggling to find out, what steps are necessary to access a gpio-pin from a linux kernel module.
Maybe someone can explain it to me by a simple example. I like to use pin 4(input) and 33(output). My steps so far:
1.) Device Tree(dts): I leave the dts file untouched - Do I need to setup the pin 4 and 33 via pin control?
2.) kernel module: some pseudo code
gpio_is_valid(4)
gpio_request_one(4, GPIOF_DIR_IN | GPIOF_EXPORT_DIR_FIXED , "myPin4")
gpio_export(4, false)
gpio_get_value(4)
gpio_is_valid(33)
gpio_request_one(33, GPIOF_DIR_OUT | GPIOF_INIT_LOW | GPIOF_OPEN_SOURCE | GPIOF_EXPORT_DIR_FIXED , "myPin33")
gpio_export(33, false)
gpio_set_value(33, 1)
How to do it in a proper way?
I would suggest the combination of an own device tree file + a platform driver + character driver
0.) RTF
check how device trees(dts) are working
check how a platform device works
check how a character device works
gain some knowledge about gpios and dts
#gpio mappings
#subsystems using gpios
#Specifying GPIO information for devices
Read the informations provided by your SOC manufacturer.
The state-of-the-art way to access the gpios is via struct gpio_desc variables. They are created form the device tree.
1.) approach
To toggle a pin under linux you need to make shure, that 3 units are working togehter.
The pin-controller(pinctrl) defines how the output is driven. Open source, pull up etc.
The pin-multiplexer(pinmux) defines different functions for the pin.
The gpio-controller(gpioctrl) translates the gpio number. p.E.: 44 -> GPIO A 11
These components are implemented by the SOC manufacturer. For each platform there are differences. An example for the SAMA5D35xxx follows.
#the device tree
Define the pin controller
pinctrl#fffff200 {
pinctrl_myPins: myPins {
atmel,pins = <AT91_PIOA 2 AT91_PERIPH_GPIO AT91_PINCTRL_NONE // pin 1
AT91_PIOD 19 AT91_PERIPH_GPIO AT91_PINCTRL_NONE>; // pin 2
};
};
Create a node witch is linked to the own platform device:
myPins {
compatible = "myPlatformDevice";
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_myPins>;
pin1 = <&pioA 2 GPIO_ACTIVE_HIGH>;
pin2 = <&pioD 19 GPIO_ACTIVE_HIGH>;
};
#create platform + char driver (pseudo code):
// ----------------------
// kernel message support(via dmesg)
// ----------------------
#define KMSG_DEBUG(fmt,args...) printk(KERN_DEBUG "myDrv" ": "fmt"\n", ##args)
#define KMSG_PERR(fmt,args...) printk(KERN_ERR "myDrv" ": "fmt"\n", ##args)
#define KMSG_PINFO(fmt,args...) printk(KERN_INFO "myDrv" ": "fmt"\n", ##args)
// ----------------------
// trace support via defining dMyDrvTrace
// ----------------------
#ifndef dMyDrvTrace
#define TRACE(...)
#else
#define TRACE(fmt,args...) printk(KERN_INFO "myDrv" ": [%s] "fmt"\n", __FUNCTION__, ##args)
#endif
typedef struct SMyDrvDrvData {
struct platform_device *pdev; //!< next device
// here goes the local/private data
int gpiod_pin1;
int gpiod_pin2;
u32 pin1;
u32 pin2;
} TMyDrvDrvData;
static struct dentry * gmyPlattformDrvDebugfsRootDir; //!< root dir at debugfs
static int myPlattformDrv_probe(struct platform_device *pdev);
static int myPlattformDrv_remove(struct platform_device *pdev);
#if defined(CONFIG_OF)
//! filter for the device tree class
static struct of_device_id gMyPlattformDrvdtsFilter[] = {
{.compatible = "myPlatformDevice"},
{}
};
MODULE_DEVICE_TABLE(of, gMyPlattformDrvdtsFilter);
#else
#define gmyPlattformDrvdtsFilter (NULL)
#endif
static struct platform_device *MyPlattformDrv_devs[] = {
};
static struct platform_driver myPlattformDrv_driver = {
.driver = {
.name = dMyPlattformDrvdriver,
.owner = THIS_MODULE,
.of_match_table = of_match_ptr(gMyPlattformDrvdtsFilter),
},
.probe = myPlattformDrv_probe,
.remove = myPlattformDrv_remove,
};
// char device
static dev_t gMyCharDev;
static struct class *gMyCharDevClass;
static struct cdev gMyCharDev_cdev;
static int dev_open (struct inode *, struct file *);
static int dev_release (struct inode *, struct file *);
static ssize_t dev_read (struct file *, char *, size_t, loff_t *);
static ssize_t dev_write (struct file *, const char *, size_t, loff_t *);
static const struct file_operations gMyCharDevOps =
{
.read = dev_read,
.open = dev_open,
.write = dev_write,
.release = dev_release
};
//! looks up for the gpio name and request it
static int get_gpio(struct platform_device *pdev, const char * name, int * pGPIOnum)
{
int n,i;
int r;
struct device_node * pDN;
TRACE("look at %s for %s ...", pdev->name, name);
// reset return value
*pGPIOnum = 0;
// parse device tree
// get device tree entries associated with the device
pDN = of_find_node_by_name(NULL, pdev->name);
// parse pins
n = of_gpio_named_count(pDN, name);
if (n <= 0) {
TRACE("no gpios found");
return -1;
}
for (i = 0; i < n; i++) {
// get pin number
*pGPIOnum = of_get_named_gpio(pDN,name, i);
if (*pGPIOnum == -EPROBE_DEFER) {
return r;
}
// check if pin number is valid
if (gpio_is_valid(*pGPIOnum)) {
// yes
// request pin
r = devm_gpio_request(&pdev->dev, *pGPIOnum, name);
if (r) {
return r;
} else {
r = gpio_direction_output(*pGPIOnum, 0);
}
if (r) return r;
}
}
}
return 0;
}
//! probes the platform driver
static int myPlattformDrv_probe(struct platform_device *pdev)
{
struct TMyDrvDrvData *priv;
int i,j,r,gpioNum, ret;
KMSG_PINFO("probe my driver ...");
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv) {
KMSG_PERR("Failed to allocate memory for the private data structure");
return -ENOMEM;
}
priv->pdev = pdev;
platform_set_drvdata(pdev, priv);
TRACE("setup gpios ...");
r = get_gpio(pdev, "pin1", &gpioNum);
if (r) {
KMSG_PERR("Failed to find gpio \"pin1\" in device tree");
}
// save number
priv->gpiod_pin1 = gpioNum;
// create "pin1" debugfs entry
debugfs_create_u32("pin1", S_IRUGO, gmyPlattformDrvDebugfsRootDir, &priv->Pin1);
r = get_gpio(pdev, "pin2", &gpioNum);
if (r) {
KMSG_PERR("Failed to find gpio \"pin2\" in device tree");
}
// save number
priv->gpiod_pin2 = gpioNum;
// create "pin2" debugfs entry
debugfs_create_u32("pin1", S_IRUGO, gmyPlattformDrvDebugfsRootDir, &priv->Pin2);
// create device class
TRACE("create myCharDev char device class");
// create char dev region
ret = alloc_chrdev_region(&gMyCharDev, 0, 1, "myCharDev");
if( ret < 0) {
KMSG_PERR("alloc_chrdev_region error %i", ret);
goto error;
}
// create device class
if((gMyCharDevClass = class_create(THIS_MODULE, dSEK4DevClass)) == NULL)
{
KMSG_PERR("class_create error");
goto error_classCreate;
}
if(NULL == device_create(gMyCharDevClass, NULL, gMyCharDev, NULL, "myCharDev"))
{
KMSG_PERR("device_create error");
goto error_deviceCreate;
}
cdev_init(&gMyCharDev_cdev, &gMyCharDevOps);
ret = cdev_add(&gMyCharDev_cdev, gMyCharDev, 1);
if(-1 == ret) {
KMSG_PERR("cdev_add error %i", ret);
goto error_device_add;
return -1;
}
TRACE("added myCharDev char device");
return 0;
// error handling block
error_std:
error_device_add:
device_destroy(gMyCharDevClass, gMyCharDev);
error_deviceCreate:
class_destroy(gMyCharDevClass);
error_classCreate:
unregister_chrdev_region(gMyCharDev, 1);
error:
return -1;
}

skbuff packet sent with zero payload

Using Netfilter's hooks (NF_INET_PRE_ROUTING and NF_INET_POST_ROUTING) I implemented a kernel module that monitors all incoming and outgoing packets for a given host. By looking at the skuffs I can classify packets and identify those that I am interested in. Each time I detect such packets I want to generate my own packet and send it out in the network (note: I am not copying/cloning the original packet but instead create one from "scratch" and send it out using dev_queue_xmit. The only similarity with the original packet is that my packet goes towards the same destination, but has different port/protocol etc.).
The problem is that my generated packets are being sent out with empty payload. Packets successfully reach the destination, the size of the payload is correct, but its content is all set to zeros. Two observations: (1) The function that I am using to construct skbuffs and send out has been tested before and seemed to work. (2) The problem appear only when I classify outgoing packets and attempt to send my own along the way, seemingly the same code works fine when I classify packets using NF_INET_PRE_ROUTING.
Please see the code below:
static struct nf_hook_ops nfout;
static int __init init_main(void) {
nfout.hook = hook_func_out;
nfout.hooknum = NF_INET_POST_ROUTING;
nfout.pf = PF_INET;
nfout.priority = NF_IP_PRI_FIRST;
nf_register_hook(&nfout);
printk(KERN_INFO "Loading Postrouting hook\n");
return 0;
}
static unsigned int hook_func_out(const struct nf_hook_ops *ops,
struct sk_buff *skb, const struct net_device *in,
const struct net_device *out, int (*okfn)(struct sk_buff *)) {
if (skb is the packet that I am looking for, omitting details...) {
generate_send_packet(skb);
}
return NF_ACCEPT;
}
void generate_send_packet(struct sk_buff *target_skb) {
static struct tcphdr * tcph;
static struct iphdr * iph;
static struct ethhdr * ethh;
struct sk_buff * skb1;
iph = ip_hdr(target_skb);
tcph = tcp_hdr(target_skb);
ethh = eth_hdr(target_skb);
int payload = 123456789;
skb1 = construct_udp_skb(dev,
dev->dev_addr, mac,
iph->saddr, iph->daddr,
ntohs(tcph->source), dst_port,
(unsigned char *)&payload, sizeof(int)
);
if (dev_queue_xmit(skb1) != NET_XMIT_SUCCESS) {
printk(KERN_INFO "Sending failed\n");
}
}
struct sk_buff* construct_udp_skb(struct net_device *dev,
unsigned char * src_mac, unsigned char * dst_mac,
uint32_t src_ip, uint32_t dst_ip,
uint32_t src_port, uint32_t dst_port,
uint32_t ttl, uint32_t tcp_seq,
unsigned char * usr_data, uint16_t usr_data_len) {
static struct ethhdr *ethh;
static struct iphdr *iph;
struct udphdr * udph;
static struct sk_buff *skb;
static uint16_t header_len = 300;
unsigned char * p_usr_data;
int udplen;
skb = alloc_skb(1000, GFP_KERNEL);
skb_reserve(skb, header_len);
//-----------------------------------------------
udph = (struct udphdr*) skb_push(skb, sizeof(struct udphdr));
iph = (struct iphdr*) skb_push(skb, sizeof(struct iphdr));
ethh = (struct ethhdr*) skb_push(skb, sizeof(struct ethhdr));
memset(udph, 0 , sizeof(struct udphdr));
memset(iph, 0 , sizeof(struct iphdr));
skb_set_mac_header(skb, 0);
skb_set_network_header(skb, sizeof(struct ethhdr));
skb_set_transport_header(skb, sizeof(struct ethhdr) + sizeof(struct iphdr));
//ETH -------------------------------------------
memcpy(ethh->h_source, src_mac, 6);
memcpy(ethh->h_dest, dst_mac, 6);
ethh->h_proto = htons(ETH_P_IP);
//IP --------------------------------------------
iph->ihl = 5;
iph->version = 4;
iph->ttl = 128;
iph->tos = 0;
iph->protocol = IPPROTO_UDP;
iph->saddr = src_ip;
iph->daddr = dst_ip;
iph->check = ip_fast_csum((u8 *)iph, iph->ihl);
iph->id = htons(222);
iph->frag_off = 0;
iph->tot_len = htons(sizeof(struct iphdr) + sizeof(struct udphdr) + usr_data_len );
ip_send_check(iph);
//UDP--------------------------------------------
udph->source = htons(src_port);
udph->dest = htons(dst_port);
skb->dev = dev;
skb->protocol = IPPROTO_UDP;
skb->priority = 0;
skb->pkt_type = PACKET_OUTGOING;
p_usr_data = skb_put(skb, usr_data_len);
printk(KERN_INFO "Sending [%i]\n", *(int*)usr_data);
skb->csum = csum_and_copy_from_user(usr_data, p_usr_data, usr_data_len, 0, &err);
udplen = sizeof(struct udphdr) + usr_data_len;
udph->len = htons(udplen);
udph->check = udp_v4_check(udplen,
iph->saddr, iph->daddr,
0);
return skb;
}
What could be the possible reason why dev_queue_xmit would nullify the payload in the skbuff?
Thanks!
I assume that the result of debug print message for printk(KERN_INFO "Sending [%i]\n", (int)usr_data) is as expected. But what about the skb->data? Can you try to print it after csum_and_copy_from_user() to see whether it is empty or not? It seems very likely that you will see the zero payload at this point already.

how to implement splice_read for a character device file with uncached DMA buffer

I have a character device driver. It includes a 4MB coherent DMA buffer. The buffer is implemented as a ring buffer. I also implemente the splice_read call for the driver to improve the performance. But this implementation does not work well. Below is the using example:
(1)splice the 16 pages of device buffer data to a pipefd[1]. (the DMA buffer is managed as in page unit).
(2)splice the pipefd[0] to the socket.
(3)the receiving side (tcp client) receives the data, and then check the correctness.
I found that the tcp client got errors. The splice_read implementation is show below (I steal it from the vmsplice implementation):
/* splice related functions */
static void rdma_ring_pipe_buf_release(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
put_page(buf->page);
buf->flags &= ~PIPE_BUF_FLAG_LRU;
}
void rdma_ring_spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
{
put_page(spd->pages[i]);
}
static const struct pipe_buf_operations rdma_ring_page_pipe_buf_ops = {
.can_merge = 0,
.map = generic_pipe_buf_map,
.unmap = generic_pipe_buf_unmap,
.confirm = generic_pipe_buf_confirm,
.release = rdma_ring_pipe_buf_release,
.steal = generic_pipe_buf_steal,
.get = generic_pipe_buf_get,
};
/* in order to simplify the caller work, the parameter meanings of ppos, len
* has been changed to adapt the internal ring buffer of the driver. The ppos
* indicate wich page is refferred(shoud start from 1, as the csr page are
* not allowed to do the splice), The len indicate how many pages are needed.
* Also, we constrain that maximum page number for each splice shoud not
* exceed 16 pages, if else, a EINVAL will return. If a high speed device
* need a more big page number, it can rework this routing. The off is also
* used to return the total bytes shoud be transferred, use can compare it
* with the return value to determint whether all bytes has been transfered.
*/
static ssize_t do_rdma_ring_splice_read(struct file *in, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len,
unsigned int flags)
{
struct rdma_ring *priv = to_rdma_ring(in->private_data);
struct rdma_ring_buf *data_buf;
struct rdma_ring_dstatus *dsta_buf;
struct page *pages[PIPE_DEF_BUFFERS];
struct partial_page partial[PIPE_DEF_BUFFERS];
ssize_t total_sz = 0, error;
int i;
unsigned offset;
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
.nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &rdma_ring_page_pipe_buf_ops,
.spd_release = rdma_ring_spd_release_page,
};
/* init the spd, currently we omit the packet header, if a control
* is needed, it may be implemented by define a control variable in
* the device struct */
spd.nr_pages = len;
for (i = 0; i < len; i++) {
offset = (unsigned)(*ppos) + i;
data_buf = get_buf(priv, offset);
dsta_buf = get_dsta_buf(priv, offset);
pages[i] = virt_to_page(data_buf);
get_page(pages[i]);
partial[i].offset = 0;
partial[i].len = dsta_buf->bytes_xferred;
total_sz += partial[i].len;
}
error = _splice_to_pipe(pipe, &spd);
/* use the ppos to return the theory total bytes shoud transfer */
*ppos = total_sz;
return error;
}
/* splice read */
static ssize_t rdma_ring_splice_read(struct file *in, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len, unsigned int flags)
{
ssize_t ret;
MY_PRINT("%s: *ppos = %lld, len = %ld\n", __func__, *ppos, (long)len);
if (unlikely(len > PIPE_DEF_BUFFERS))
return -EINVAL;
ret = do_rdma_ring_splice_read(in, ppos, pipe, len, flags);
return ret;
}
The _splice_to_pipe is just the same one as the splice_to_pipe in kernel. As this function is not an exported symbol, so I re-implemented it.
I think the main cause is that the some kind of lock of pages are omitted, but
I don't know where and how.
My kernel version is 3.10.

How to add a new custom layer 4 protocol (a new Raw socket) in linux kernel?

i am trying adding my own customized layer 4 protocol in linux (ubuntu 14.04) - IPPROTO_MYPROTO as a loadable kernel module. I have done all necessary steps to register the protocol. Here i am sharing my code.
When i am trying to send a mesage from user space program using sendmsg(), i expect the corresponding fn myproto_sendmsg() registered via struct proto structure should be called in kernel space. But what i am observing is that though the myproto_sendmsg() in kernel space is not being called, yet destination machine is receiving the correct data. surprise ! surprise !. Is the default udp sendmsg() fn kicking in here which is like uninvited guest doing his work.
Here, sendmsg() call in user space returns as many bytes as send. Hence, fn returns success.
User space program :
void forwardUDP( int destination_node ,char sendString[] )
{
struct msghdr msg;
destination_node = destination_node % N; //destination node to which data is to be forwaded
int sock, rc;
struct sockaddr_in server_addr;
struct iovec iov;
struct hostent *host; //hostent predefined structure use to store info about host
host = (struct hostent *) gethostbyname(node[destination_node].ip_address);//gethostbyname returns a pointer to hostent
if ((sock = socket(AF_INET, SOCK_RAW, 5)) == -1)
{
perror("socket");
exit(1);
}
//destination address structure
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(node[destination_node].udpportno);
server_addr.sin_addr = *((struct in_addr *)host->h_addr); //host->h_addr gives address of host
bzero(&(server_addr.sin_zero),8);
/* fill the messsage structure*/
memset(&msg, 0 , sizeof(struct msghdr));
memset(&iov, 0, sizeof(struct iovec));
msg.msg_name = (void *)&server_addr;
msg.msg_namelen = sizeof(struct sockaddr_in);
printf("sendString = %s\n", sendString);
iov.iov_base = (void *)sendString;
iov.iov_len = strlen(sendString);
msg.msg_iov = &iov;
printf("len = %d\n", strlen(sendString));
#if 1
msg.msg_iovlen = 1;
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
#endif
//sendto(sock, sendString, strlen(sendString), 0,(struct sockaddr *)&server_addr, sizeof(struct sockaddr));
**rc = sendmsg(sock, &msg, 0);**
printf("rc = %d\n", rc);
//sendto() function shall send a message through a connectionless-mode socket.
printf("\nFORWARD REQUEST : '%s' has been forwarded to node ---->%d\n",sendString,destination_node);
//close(sock);
}
Kernel Module
/* Define the handler which will recieve all ingress packets for protocol = IPPROTO_MYPROTO
defined in net/protocol.h
*/
/* Resgiter the call backs for pkt reception */
static const struct net_protocol myproto_protocol = {
.handler = myproto_rcv,
.err_handler = myproto_err,
.no_policy = 1,
.netns_ok = 1,
};
static struct inet_protosw myproto_protosw;
int
myproto_rcv(struct sk_buff *skb){
printk(KERN_INFO "myproto_rcv is called\n");
return 0;
}
int
myproto_sendmsg(struct kiocb *iocb, struct sock *sk,
struct msghdr *msg, size_t len){
printk(KERN_INFO "myproto_sendmsg() is called\n");
return 0;
}
void myproto_lib_close(struct sock *sk, long timeout){
printk(KERN_INFO "close is called\n");
return;
}
int
myproto_recvmsg(struct kiocb *iocb, struct sock *sk,
struct msghdr *msg,
size_t len, int noblock, int flags,
int *addr_len){
printk(KERN_INFO "myproto_recvmsg() is called.\n");
printk(KERN_INFO "iocb = 0x%x,\nsk = 0x%x,\nmsg = 0x%x,\nlen = %d,\nnoblock = %d,\nflags = %d,\naddr_len = 0x%x", iocb, sk, msg, len, noblock, flags, addr_len);
return 0;
}
/* Socket structure for Custom protocol, see struct udp_sock for example*/
struct myproto_sock{
struct inet_sock inet; // should be first member
__u16 len;
};
void
myproto_lib_hash(struct sock *sk){
printk(KERN_INFO "myproto_lib_hash() is called");
}
/* Define the **struct proto** structure for the Custom protocol defined in
net/sock.h */
struct proto myproto_prot = {
.name = "MYPROTO",
.owner = THIS_MODULE,
.close = myproto_lib_close,
.sendmsg = myproto_sendmsg,
.hash = myproto_lib_hash,
.recvmsg = myproto_recvmsg,
.obj_size = sizeof(struct myproto_sock),
.slab_flags = SLAB_DESTROY_BY_RCU,
};
int init_module(void);
void cleanup_module(void);
int
init_module(void)
{
int rc = 0;
rc = proto_register(&myproto_prot, 1);
if(rc == 0){
printk(KERN_INFO "Protocol registration is successful\n");
}
else{
printk(KERN_INFO "Protocol registration is failed\n");
cleanup_module();
return rc;
}
rc = inet_add_protocol(&myproto_protocol, IPPROTO_MYPROTO);
if(rc == 0){
printk(KERN_INFO "Protocol insertion in inet_protos[] is successful\n");
}
else{
printk(KERN_INFO "Protocol insertion in inet_protos[] is failed\n");
cleanup_module();
return rc;
}
memset(&myproto_protosw, 0 ,sizeof(myproto_protosw));
myproto_protosw.type = SOCK_RAW;
myproto_protosw.protocol = IPPROTO_MYPROTO;
myproto_protosw.prot = &myproto_prot;
extern const struct proto_ops inet_dgram_ops; // defined in ipv4/af_inet.c
myproto_protosw.ops = &inet_dgram_ops;
myproto_protosw.flags = INET_PROTOSW_REUSE;
inet_register_protosw(&myproto_protosw);
return 0;
}
void cleanup_module(void)
{
int rc = 0;
rc = inet_del_protocol(&myproto_protocol, IPPROTO_MYPROTO);
if(rc == 0)
printk(KERN_INFO "Protocol removed successful\n");
else
printk(KERN_INFO "Protocol removal failed\n");
proto_unregister(&myproto_prot);
printk(KERN_INFO "Module cleaned up\n");
return;
}

How to make the copy of the packet?

I want to make a copy of the packet (and send it to queue that is made by me) at the Net Filter hook.
Will skb_copy work for me? i also have to add the seq no before the packet,skb_reserve will do that?
I have written the following code to capture packet
unsigned int hook_func(unsigned int hooknum,
struct sk_buff **skb,
const struct net_device *in,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
if (strcmp(in->name, drop_if) == 0) {
printk("Dropped packet on %s...\n", drop_if);
return NF_DROP;
} else {
return NF_ACCEPT;
}
}
/* Initialisation routine */
int init_module()
{
/* Fill in our hook structure */
nfho.hook = hook_func; /* Handler function */
nfho.hooknum = NF_IP_PRE_ROUTING; /* First hook for IPv4 */
nfho.pf = PF_INET;
nfho.priority = NF_IP_PRI_FIRST; /* Make our function first */
nf_register_hook(&nfho);
return 0;
}
I do agree with Rachit Jain, unless you have a valid reason to do this in Kernel space, I do suggest you use libpcap to do it in user-space.
Anyhow, if you just wanna copy the packet and then amend some data, I suggest you allocate a new skb with enough space to copy the data you already have in the skb you received + enough space to add a header.
Here's a code that I once used, it doesn't do any copying from an already existing skb but it can be useful to you. I am crafting a special kind of ICMP echo message here
int sendICMPEcho(unsigned char *msg, unsigned int length,
__be32 source, __be32 dest)
{
struct ethhdr *eth;
struct iphdr *iph;
struct icmphdr *icmph;
struct sk_buff *newPacket;
unsigned char *data;
unsigned int skbSize = length + sizeof(struct icmphdr)
+ sizeof(struct iphdr)
+ sizeof(struct ethhdr);
/* Allocate the skb */
newPacket = alloc_skb(skbSize, GFP_ATOMIC);
if(newPacket == NULL)
return SEND_FAIL_MEMORY;
/* Reserve the headers area */
skb_reserve(newPacket, sizeof(struct icmphdr)
+ sizeof(struct iphdr)
+ sizeof(struct ethhdr));
/* Extend the data area from 0 to the message length */
data = skb_put(newPacket, length);
/* Copy the data from the message buffer to the newPacket */
memcpy(data, msg, length);
/************** ICMP HEADER***************/
/* skb_push - pushing the icmp header in the packet data */
icmph = (struct icmphdr *) skb_push(newPacket,
sizeof(struct icmphdr));
/*set ICMP header here */
icmph->type = ICMP_ECHO;
icmph->code = 100; /* Our magic number */
icmph->un.echo.id = 0;
icmph->un.echo.sequence = htons(sendCounter);
icmph->checksum= 0;
icmph->checksum = in_cksum((unsigned short *)icmph,
sizeof(struct icmphdr) + length);
/************** END ICMP HEADER**************/
/************** IP HEADER ***************/
iph = (struct iphdr *) skb_push(newPacket,
sizeof(struct iphdr));
/* set IP header here */
iph->ihl = 5;/* 5 * 32(bits) */
iph->version = 4;
iph->tos = 255; /* Just a magic number - remove it */
iph->tot_len = htons( sizeof(struct iphdr)
+ sizeof(struct icmphdr)
+ length);
iph->id = 0;
iph->frag_off = 0; /* No fragementation */
iph->ttl = 65;
iph->protocol = IPPROTO_ICMP;
iph->saddr = source;
iph->daddr = dest;
iph->check = 0;
iph->check = in_cksum((unsigned short *)iph, sizeof(struct iphdr));
/************** END IP HEADER ***************/
/*WARNING: THE CODE BELOW SHOULD BE REPLACED BY SOMETHING MORE ROBUST
THAT USES THE KERNEL ROUTING!
AND USES IP_LOCAL_OUT INSTEAD OF WHAT WE ARE DOING */
/* Set up the net-device for the new packet */
/* In my code, there was a function findDeviceByIp that does the routing and return which net_device to use for transmission*/
newPacket->dev = findDeviceByIP(source);
if(newPacket->dev == NULL)
{
kfree_skb(newPacket);
return SEND_DEV_FAIL;
}
/************** ETH HEADER ***************/
eth = (struct ethhdr *) skb_push(newPacket, sizeof(struct ethhdr));
if(strcmp(newPacket->dev->name, "wlan0") == 0)
memcpy(eth->h_dest, wifiMAC, 6);
else if(strcmp(newPacket->dev->name, "eth0") == 0)
memcpy(eth->h_dest, etherMAC, 6);
else
{
kfree_skb(newPacket);
return SEND_FAIL_SEND;
}
memcpy(eth->h_source, newPacket->dev->dev_addr, 6);
eth->h_proto = htons(ETH_P_IP);
/************** END ETH HEADER ***************/
dev_queue_xmit(newPacket);/* Transmite the packet */
/* END OF THE WARNING AREA */
++sendCounter;
return SEND_SUCCESS;
}
There are many helper functions provided by linux kernel to work on skb's. it depends on the usecase which one you want to use.
skb_clone ==>copies the skbuff header and increments the reference counter for data buffer. If you are only interested in modifying the skbuff header then you can use skb_clone
skb_copy ==> copies skbuff header, data buffer as well as fragments. Use when you are interested in modifying the data in main buffer as well as in fragments buffer
pskb_copy ==> copies skbuff header + only the data buffer but not the fragments, thus if you want to modify skb except fragment buffer then you can use this one. which takes headroom also as arguement.
Better read the helper function provided by linux kernel(net/core/skbuff.c) to do the skb operations efficiently and to avoid any pitfalls.

Resources