Next Previous Contents

3. Socket Based Mechanisms

3.1 Introduction

Beside the file system based mechanism described in the last section, there are mechanisms based on the socket interface. Sockets allow the Linux kernel to send notifications to a user space application. This is in contrast to the file system mechanisms, where the kernel can alter a file, but the user program does not know of any change until it choses to access that file. The socket based mechanisms allow the applications to listen on a socket, and the kernel can send them messages at any time. This leads to a communication mechanism in which user space and kernel space are equal partners.

There are different socket families which can be used to achieve this goal:

AF_INET

designed for network communication, but UDP sockets can also be used for the communication between a kernel module and the user space. The use of UDP sockets for node local communication involves a lot of overhead.

AF_PACKET

allows the user to define all packet headers.

AF_NETLINK (netlink sockets)

They are especially designed for the communication between the kernel space and the user space. There are different netlink socket types currently implemented in the kernel, all of which deal with a specific subset of the networking part of the Linux kernel.

In this section we briefly cover UDP sockets, as they are the most flexible communication mechanism. Afterwards we look at netlink sockets, with a special focus on "generic netlink sockets".

3.2 UDP Sockets

Description

We briefly describe UDP sockets, since their usage in user space is well known and they provide a lot of flexibility. With UDP sockets it is possible to have communication between a kernel module on one system and a user space application on an other machine.

Implementation

udpRecvCallback.c kernel module that provides a callback function to receive messages on port number 5555.
udpUser.c user space program that sends a message to the kernel which sends the message back.

The user space program sends the string "hello world" to the kernel, which listens on port 5555. The module prints the payload to the system log file and sends it back to the application. The application receives the message and prints the payload to stdout.

The user space code does not need any explanation, since a lot of documentation is available for socket programming. The module implementation requires some understanding of the Linux kernel. The module_init function performs three actions:

  1. Create a socket to receive UDP packets. As is well known from user space, we first create the socket and then we bind it. In addition we specify a callback function, which is executed every time a packet is received on that socket. This callback function is executed in "interrupt context". This implies that only a restricted set of operations can be performed in that function, and it is not allowed to send any messages.
  2. Therefore we define a work_queue with which we can delay the sending of the answer until we have left interrupt context.
  3. Finally, we create a socket to send the answer back to the application.

Each time the module receives a packet, the callback function cb_data() is executed. The only thing this callback function does is to submit the send_answer() function to the work_queue. If the kernel decides that there is no better work to do, it executes the send_answer function. The send_answer() function receives the packets, by dequeuing them from the sockets-receive-queue, with skb_dequeue. Since more than one packet can be received by the socket between two consecutive executions of send_answer(), this function may need to dequeue multiple messages. The number of messages in the socket queue can be obtained with a call to skb_queue_len(). After having dequeued the packet the message is printed to the system log and a message is send back to the application with the sock_sendmsg() function. Since this function assumes to be executed from user space, we have to adjust the boundaries for the allowed memory region with the help of the set_fs macro.

3.3 Netlink Sockets

Description

Netlink is a special IPC used for transferring information between kernel and user space processes, and provides a full-duplex communication link between the Linux kernel and user space. It makes use of the standard socket APIs for user-space processes, and a special kernel API for kernel modules. Netlink sockets use the address family AF_NETLINK, as compared to AF_INET used by a TCP/IP socket.

Netlink sockets have the following advantages over other communication mechanisms:

Beside these advantages netlink sockets have two drawbacks:

To eliminate these drawbacks the "Generic Netlink Family" was implemented. It acts as a Netlink multiplexer, in a sense that different applications may use the generic netlink address family.

Generic Netlink communications are essentially a series of different communication channels which are multiplexed on a single Netlink family. Communication channels are uniquely identified by channel numbers which are dynamically allocated by the Generic Netlink controller. Kernel or user space users which provide services, establish new communication channels by registering their services with the Generic Netlink controller. Users of the service, then query the controller to see if the service exists and to determine the correct channel number.

Each generic netlink family can provide different "attributes" and "commands". Each command has its own callback function in the kernel module and may receive messages with different attributes. Both commands and attributes, are "addressed" by an identifier.

Implementation

gnKernel.c kernel module that implements a generic netlink family (CONTROL_EXMPL)
gnKernel_anurag_chugh.c kernel module for newer kernels. Tested with 2.6.35-22-generic. Thanks to Anurag Chugh!
gnUser.c user space program that sends a message to the kernel and receives an answer back
Here we describe how to use this generic netlink protocol to exchange data between user space and kernel space.

User Space:

Although the user space application could be written just with the help of the well known socket operations it is not reasonable to do so. For convenient user space programming there exists the libnl netlink library. It provides functions dedicated to be used for generic netlink socket communication. The libnetlink library supports generic netlink as of version 1.1, so probably you need to download the actual version from http://people.suug.ch/~tgr/libnl/. A program that uses this library needs to be compiled with -lnl specified.

User Space Sending Phase::

  1. nl_handle_alloc: create a socket
  2. genl_connect: connect to the NETLINK_GENERIC socket family.
  3. genl_ctrl_resolve: resolve the ID for the particular generic netlink family we want to talk with. In this example we have called the family CONTROL_EXMPL.
  4. genlmsg_put: create the generic netlink message header. In most cases you can leave all the arguments as in the example except the DOC_EXMPL_C_ECHO argument. This specifies which callback function of your kernel module gets executed. In this example there is just one callback function.
  5. nla_put_*: put the data into the message. All the possibilities are listed in the file attr.c of libnl. The second argument is used by the kernel module to distinguish which attribute was sent. In this example we have only one attribute: DOC_EXMPL_A_MSG which is a null terminated string.
  6. nl_send_auto_complete: send the message to the kernel
Receiving Phase:
  1. nl_socket_modify_cb: Add a callback function to the socket. This callback function gets executed when the socket receives a message. In the callback function the message needs to be decoded. We use nla_parse for this. Using genlmsg_parse would be more specific, but I could not link genlmsg_parse with my program. The nla_get_* functions are the counterpart of the nla_put_* functions, and are used to get a specific attribute from the message.
  2. nl_recvmsgs_default: wait until a message is received.

Kernel Module

In the module_init function we need to register our generic netlink family with a call to genl_register_family. As an argument we have to specify a struct genl_family which holds the name of our family (CONTROL_EXMPL). In a second step we have to register the functions that get executed upon receiving a message from the user space. This is done with genl_register_ops which takes as an argument the family to which this function belongs and a struct genl_ops. This struct specifies the actual callback function as well as a security policy checked before the actual callback function gets executed. This means that if you want to receive, for example an integer, but the user space program sends a string, your callback function does not get invoked.

The actual callback function is doc_exmpl_echo. It performs two things: prints the received message, and sends a message back to the user space process. The callback function has as an argument a struct genl_info *info which holds the already parsed message. This struct contains an array which has an entry for each possible attribute. Our example has only one attribute (DOC_EXMPL_A_MSG). The data related to this attribute is saved in info->attrs[DOC_EXMPL_A_MSG];. In order to obtain the data for a given attribute there is a simple function: nla_data.

The sending process is very similar to the user space sending process.

  1. genlmsg_new: this allocates a new skb that will be sent to the user space. Since we do not yet know the final size, we use the macro NLMSG_GOODSIZE.
  2. genlmsg_put: fills the generic netlink header. All messages sent by kernel have pid = 0
  3. nla_put_string: write a string into the message.
  4. genlmsg_end: finalize the message
  5. genlmsg_unicast: send the message back to the user space program.

Resources and Further Reading


Next Previous Contents