by Alessandro Rubini
Based on last month's article about invoking system calls from
within kernel code, this month's column shows how a complete network
server can be implemented as a kernel thread. The sample code
shown implements the skeleton of a simplified TFTP server.
While normally you shouldn't bring user-space processes to kernel space, there are times when this may be a good choice, either for performance or for size. The former reason is what led to kHTTPd; the latter may be relevant in small embedded systems devoted to a single task, where it allows avoiding libc altogether.
The discussion in this column is based on the kernel-based web
server released with version 2.4.0-test9 of the kernel, available at
http://people.redhat.com/mingo/TUX-patches/
.
Code excerpts included in this column are part of a sample package available at
http://ar.linux.it/docs/khttpd/ktftpd.tar.gz
. The
kernel-space daemon loosely mimics what a conventional user-space TFTP server does.
Our daemon serves files from the /tftp
file tree, where /tftp
could
also be a symbolic link to another directory.
Unlike a real tftpd, though, the daemon won't keep logs. It'll merely print a little
information using conventional printk() calls.
The structure of the skeletal module has been thought out so it can be
thoroughly understood in a reasonable time, but it has not been
completely implemented. In my opinion, reading this column alongside
the sample code should make the overall design clear.
The first step a programmer must take to run a server in kernel
space is forking a process. To create a new thread, you must
call the function kernel_thread().
Listing 1 shows how module initialization forks a new thread and
how the thread detaches itself from the user process that forked it.
/*
 * in init_module(): fork the main thread
 */
kernel_thread(ktftpd_main, NULL /* no arg */,
              0 /* no clone flags */);

/*
 * in ktftpd_main(): detach from the original process
 */
sprintf(current->comm, "ktftpd-main"); /* comm is 16 bytes */
lock_kernel(); /* This seems to be required for exit_mm */
exit_mm(current);
/* close open files too (stdin/out/err are open) */
exit_files(current);
In order to handle several clients at the same time, a daemon
usually forks several copies of itself, each of them in charge of a
single connection. In kernel space this is accomplished by calling
kernel_thread() once for each new connection.
You shouldn't be shy of forking copies of your kernel daemon, as the resources consumed by each of them are almost negligible when compared with the cost associated with forking a user-space server. A kernel thread requires no memory-management overhead: it only consumes a couple of memory pages for the stack and a few data structures that are replicated for each thread.
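As a sketch, spawning one such per-connection thread might look like the following lines (hedged: ktftpd_serve() and its request argument are hypothetical names, not taken from the sample package; the DaemonCount bookkeeping follows the convention described below):

```c
/* Hypothetical sketch: spawn one kernel thread per incoming request.
 * ktftpd_serve() and "request" are made-up names; CLONE_SIGHAND is
 * used so that a signal sent to any thread reaches them all. */
atomic_inc(&DaemonCount);
if (kernel_thread(ktftpd_serve, (void *)request,
                  CLONE_FS | CLONE_FILES | CLONE_SIGHAND) < 0) {
        atomic_dec(&DaemonCount); /* creation failed: undo the count */
        printk(KERN_WARNING "ktftpd: can't fork service thread\n");
}
```

The service thread itself decrements DaemonCount just before returning.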
To count the number of running threads, an atomic_t
data item is used. I called it DaemonCount
, the same
name used by kHTTPd.
Just before unloading the module you'll need to stop all the threads,
as the code they are executing is bound to disappear.
There are several ways to accomplish the task. The kHTTPd server uses
a sysctl interface to control its threads; see
http://www.linux.it/~rubini/docs/sysctl/sysctl.html
for more information. To keep the code shorter and simpler I chose a
different approach:
the individual thread doesn't add to the usage count for the module, and
the cleanup function sets a global flag and then waits for all the
threads to terminate.
Listing 2 shows the code that deals with thread termination.
int ktftpd_shutdown = 0; /* set at unload time */
DECLARE_WAIT_QUEUE_HEAD(ktftpd_wait_threads);

/*
 * In the code of each thread, the main loop depends
 * on the value of ktftpd_shutdown
 */
while (!signal_pending(current) && !ktftpd_shutdown) {
        /* .... */
}

/*
 * The following code is part of the cleanup function
 */

/* tell all threads to quit */
ktftpd_shutdown = 1;
/* kill the one listening (it would take too much time to exit) */
kill_proc(DaemonPid, SIGTERM, 1);
/* and wait for them to terminate (no signals accepted) */
wait_event(ktftpd_wait_threads, !atomic_read(&DaemonCount));
Additionally, the user is allowed to terminate each thread by
sending it a signal (as you may have imagined by looking at the
condition around the main loop above). Trivially, when a signal is
pending the thread exits. This is the same signal handling
implemented in the kernel web server, and it boils down to the few lines
of code shown in listing 3. The instructions shown are part of the
initialization code of the main thread. Other threads are created with
the CLONE_SIGHAND
flag, so sending a signal to any of
them will kill them all.
/* Block all signals except SIGKILL, SIGTERM */
spin_lock_irq(&current->sigmask_lock);
siginitsetinv(&current->blocked, sigmask(SIGKILL) | sigmask(SIGTERM));
recalc_sigpending(current);
spin_unlock_irq(&current->sigmask_lock);
The main task of a network server is, of course, exchanging data over
the network. As far as network access is concerned, what a server should
generally do reduces to the following few system calls:
fd = socket();
bind(fd); listen(fd);
while (1) {
        newfd = accept(fd);
        if (fork() == 0) {
                /* child: serve this connection */
                close(fd);
                /* .... */
                exit();
        } else {
                /* parent: go back to accept() */
                close(newfd);
        }
}
Performing the same task from kernel space reduces to similar
code, with in-kernel functions called in place of the system calls,
as the following excerpt from the sample package shows:
/* Open and bind a listening socket */
error = sock_create(PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
if (error < 0) {
        printk(KERN_ERR "ktftpd: can't create socket: errno == %i\n", -error);
        goto out;
}

/* Same as setsockopt(SO_REUSEADDR). Actually not needed for tftpd */
/* sock->sk->reuse = 1; --- needed for multi-thread TCP servers */

sin.sin_family = AF_INET;
sin.sin_addr.s_addr = INADDR_ANY;
sin.sin_port = htons((unsigned short)KTFTPD_PORT);
error = sock->ops->bind(sock, (struct sockaddr *)&sin, sizeof(sin));
if (error < 0) {
        printk(KERN_ERR "ktftpd: can't bind UDP port %i\n", KTFTPD_PORT);
        goto out;
}

#if 0 /* There is no need to listen() for UDP. It would be needed for TCP */
error = sock->ops->listen(sock, 5); /* "5" is the standard value */
if (error < 0) {
        printk(KERN_ERR "ktftpd: can't listen()\n");
        goto out;
}
#endif
Next, a TCP server would sleep in the accept() call, waiting for new
connections. A kernel-space server shouldn't resort to the system
call, though: it can invoke the socket's own accept method directly,
the way sys_accept() does. If the service is based on the UDP protocol,
the thread will instead usually sleep while receiving the next packet,
in the kernel-space equivalent of recvfrom().
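For the TCP case, the in-kernel accept step can be sketched as follows. This is a hedged sketch modeled on what sys_accept() does in 2.4 kernels (allocate a fresh socket, clone type and ops from the listening one, then call the protocol's accept method); error paths are abbreviated:

```c
/* Sketch of accepting a connection in kernel space (2.4-era API),
 * modeled on sys_accept(). "sock" is the listening socket. */
struct socket *new_sock;

new_sock = sock_alloc();
if (new_sock == NULL)
        return -ENOMEM;
new_sock->type = sock->type;
new_sock->ops = sock->ops;

error = sock->ops->accept(sock, new_sock, 0 /* blocking */);
if (error < 0) {
        sock_release(new_sock);
        return error;
}
/* new_sock is now ready for sock_recvmsg()/sock_sendmsg() */
```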
Sleeping on a system call invoked from kernel space is no different from
sleeping in user space: the system call handles its own wait queue.
The difference is, as outlined last month, in the need to use
set_fs() and get_fs() around the call, because the data buffers live
in kernel space. Two flags are worth noting here: kHTTPd uses the
MSG_PEEK
flag in order
not to flush the input queue until all headers are received, and the
MSG_DONTWAIT
flag in order not to block when no data is
there.
The procedure used by ktftpd to receive a packet is shown below:
/*
 * This procedure is used as a replacement for recvfrom(). Actually it
 * is based on the one in kHTTPd, which in turn is based on sys_recvfrom.
 * The addr is passed by the caller since it will host the peer's address,
 * and the buffer is passed by the caller because it can't be global
 * (all threads share the same address space)
 */
static inline int ktftpd_recvfrom(struct socket *sock,
                                  struct sockaddr_in *addr,
                                  unsigned char *buf)
{
        struct msghdr msg;
        struct iovec iov;
        int len;
        mm_segment_t oldfs;

        if (sock->sk == NULL)
                return 0;

        msg.msg_flags = 0;
        msg.msg_name = addr;
        msg.msg_namelen = sizeof(struct sockaddr_in);
        msg.msg_control = NULL;
        msg.msg_controllen = 0;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_iov->iov_base = buf;
        msg.msg_iov->iov_len = PKTSIZE;

        oldfs = get_fs(); set_fs(KERNEL_DS);
        len = sock_recvmsg(sock, &msg, PKTSIZE, 0);
        set_fs(oldfs);
        return len;
}
The code to transmit packets is similar to the code receiving them: it
exploits sock_sendmsg() in the same way.
The main difference is in how blocking is managed. A kernel thread that pushes data to a TCP socket should avoid finding itself with partially written data, as that situation would require extra data management.
In the implementation of kHTTPd, the problem is avoided by checking the free space on the output socket, and never reading from the file more data than the socket can accept:
int ReadSize, Space;
int retval;

Space = sock_wspace(sock->sk);
ReadSize = min(4*4096, FileLength - BytesSent);
ReadSize = min(ReadSize, Space);

if (ReadSize > 0) {
        oldfs = get_fs(); set_fs(KERNEL_DS);
        retval = filp->f_op->read(filp, buf, ReadSize, &filp->f_pos);
        set_fs(oldfs);

        if (retval > 0) {
                retval = SendBuffer(sock, buf, (size_t)retval);
                if (retval > 0)
                        BytesSent += retval;
        }
}
With UDP, on the other hand, each packet is sent as an individual item, so there is no
need to check for free output space beforehand.
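A UDP send helper, symmetrical to the ktftpd_recvfrom() shown earlier, could then look like the following sketch. This is hedged: the name ktftpd_sendto() is hypothetical and the code is modeled on sys_sendto() rather than copied from the sample package:

```c
/* Sketch of a kernel-space sendto() replacement (2.4-era API),
 * mirroring ktftpd_recvfrom(): same msghdr setup, but the buffer
 * is handed to sock_sendmsg(). Hypothetical helper name. */
static inline int ktftpd_sendto(struct socket *sock,
                                struct sockaddr_in *addr,
                                unsigned char *buf, int len)
{
        struct msghdr msg;
        struct iovec iov;
        mm_segment_t oldfs;
        int result;

        msg.msg_flags = 0;
        msg.msg_name = addr;                    /* destination address */
        msg.msg_namelen = sizeof(struct sockaddr_in);
        msg.msg_control = NULL;
        msg.msg_controllen = 0;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        iov.iov_base = buf;
        iov.iov_len = len;

        oldfs = get_fs(); set_fs(KERNEL_DS);    /* buf is a kernel buffer */
        result = sock_sendmsg(sock, &msg, len);
        set_fs(oldfs);
        return result;
}
```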
Alessandro Rubini can be reached as rubini@gnu.org
.
Thanks to <ciminaghi-at-prosa-dot-it>
and
Andrea Glorioso <andrea.glorioso-at-binary-only-dot-com>
for helping revise this article.