by Alessandro Rubini
Reprinted with permission of Linux Magazine
As everybody knows, the kernel's role is mostly hardware control, and
user-space programs need entry points to talk to each hardware
device. While every Unix offers a /dev directory where such entry
points are collected, there are several different ways to lay out
/dev, and each has its own advantages and disadvantages.
Simple systems, like Linux-1.0 and maybe Linux-2.0, are best served
by an on-disk /dev directory and 8-bit-wide major and minor numbers.
But as the number of supported devices grows and new entry points to
low-level information are conceived, the old and easy layout may no
longer fit. That's why version 2.3.46 finally introduced devfs
support in the official kernel tree. The facility is marked as
experimental, and its use is expected to remain optional, as some
environments (embedded systems being the most notable) may still
prefer the old approach.
In this article I'm only going to give a brief introduction to the
tool, as there is plenty of documentation about setting it up (a good
read, for instance, is Documentation/filesystems/devfs/README).
Instead, I'll show how device programmers can write code that fits in
the devfs environment. The discussion and the sample code are based
on version 2.2.14 of the kernel, patched with devfs-patch-v99.11.gz,
available from ftp://ftp.atnf.csiro.au/pub/people/rgooch/linux.
The sample module is called drums, short for ``Devfs Resources in User Module Sample'', and is available with the Makefile and this article as http://ar.linux.it/docs/devfs.tar.gz.
A device driver that wants to register its entry point within the
devfs filesystem should call one of the forms of the devfs_register
function. The devfs kernel interface is prototyped in the header file
<linux/devfs_fs_kernel.h>. If, for example, we want to register a
character device driver, the function to call is:
    devfs_handle_t devfs_register (devfs_handle_t dir,
                                   const char *name, unsigned int namelen,
                                   unsigned int flags,
                                   unsigned int major, unsigned int minor,
                                   umode_t mode, uid_t uid, gid_t gid,
                                   void *ops, void *info);
Given the huge list of arguments, the function can register pretty much anything and can assign the desired ownership and permissions to the file. The current version of devfs (at the time of writing) doesn't allow registration of directories and symbolic links using this function, but there are other functions to create such files.
In a perfectly devfs-ized world, devfs_register would be everything that's needed to create an entry point for a device. However, you may want to allow the superuser to create a non-devfs entry point, using the conventional mknod command. To this end, you need to register the file operations associated with your major number by calling devfs_register_chrdev, which takes the same arguments you used to pass to register_chrdev.
Both devfs_register_chrdev and devfs_register_blkdev are simple
wrappers around register_chrdev and register_blkdev. They either call
the old-style function or do nothing, depending on whether the
devfs=only command-line option was passed to the kernel at boot time.
If devfs is the only way to access devices, the functions do nothing,
so any device file created outside of devfs will not be associated
with any device driver.
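The wrapper behavior described above can be pictured with a small user-space model. This is only a sketch: every model_ name is hypothetical, the model_devfs_only variable merely stands in for booting with devfs=only, and none of this is the kernel's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: none of these names are kernel symbols. */
#define MODEL_MAX_MAJOR 256

struct model_fops { const char *owner; };

/* The classic table that maps a major number to a driver. */
static const struct model_fops *model_chrdevs[MODEL_MAX_MAJOR];

/* Stands in for booting the kernel with devfs=only. */
static int model_devfs_only;

/* Stand-in for the old-style register_chrdev(): fill the major table. */
int model_register_chrdev(unsigned int major, const struct model_fops *fops)
{
    assert(fops != NULL);
    if (major >= MODEL_MAX_MAJOR || model_chrdevs[major] != NULL)
        return -1;              /* out of range or already taken */
    model_chrdevs[major] = fops;
    return 0;
}

/* Stand-in for devfs_register_chrdev(): a no-op when devfs is the only way in. */
int model_devfs_register_chrdev(unsigned int major, const struct model_fops *fops)
{
    if (model_devfs_only)
        return 0;               /* succeed without touching the table */
    return model_register_chrdev(major, fops);
}

/* What opening a mknod-created node needs: find the driver by major number. */
const struct model_fops *model_lookup(unsigned int major)
{
    return major < MODEL_MAX_MAJOR ? model_chrdevs[major] : NULL;
}
```

In this model, a lookup by major number finds nothing after a devfs-only registration, which is why a node created by hand with mknod would have no driver behind it.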
With this background, listing 1 shows how drums registers its
entry points: a /dev/drums directory and a few files in there.
While the real source code has complete error checking and recovery,
I'd rather not print those lines here, as they may be distracting.
    devfs_register_chrdev(DRUMS_MAJOR, "drums", &drums_fops);
    drums_dir = devfs_mk_dir(NULL, "drums", 0, NULL);
    for (i=0; i<DRUMS_NR_DEV; i++) {
        drums_devs[i] = devfs_register(drums_dir /* parent dir */,
                                       drums_strings[i], DRUMS_NAME_LEN,
                                       DEVFS_FL_NONE, DRUMS_MAJOR, i/*minor*/,
                                       S_IFCHR | S_IRUGO, 0, 0,
                                       &drums_fops, NULL);
    }
Once registered, the devices behave pretty much like any
conventional device, and you can even chown and chmod
them. The sample drums are not very refined, and if you listen to them
they repeat the same note over and over.
borea.root# ls -l /dev/drums
total 0
cr--r--r-- 1 root root 60, 0 Jan 1 1970 bam
cr--r--r-- 1 root root 60, 1 Jan 1 1970 bum
cr--r--r-- 1 root root 60, 2 Jan 1 1970 pam
cr--r--r-- 1 root root 60, 3 Jan 1 1970 pum
cr--r--r-- 1 root root 60, 4 Jan 1 1970 tam
cr--r--r-- 1 root root 60, 5 Jan 1 1970 tum
borea.root# head -2 /dev/drums/bam
bam
bam
borea.root# head -100 /dev/drums/tum | uniq
tum
The implementation of the drums is pretty standard: the
minor number of the device being read is used to choose which string
to return to user space, and the string being returned is the same
drums_strings[i] used in registering the device name.
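    int minor = MINOR(inode->i_rdev);  /* which drum is being read */

    /* never return more than one string per call */
    if (count > DRUMS_TXT_LEN) count = DRUMS_TXT_LEN;
    if (copy_to_user(buf, drums_strings[minor], count))
        return -EFAULT;  /* the user buffer was not writable */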
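The minor-number dispatch can be tried out in user space with a small model. This is a sketch under stated assumptions: the model_ names are hypothetical, memcpy stands in for copy_to_user, and the strings are assumed to carry a trailing newline that the shorter name length leaves out of the device name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NR_DEV  6
#define MODEL_TXT_LEN 4   /* three letters plus the newline (assumed) */

/* One string per minor number; the index is the minor. */
static const char *model_strings[MODEL_NR_DEV] = {
    "bam\n", "bum\n", "pam\n", "pum\n", "tam\n", "tum\n"
};

/* Model of a read method that picks its text from the minor number. */
long model_read_by_minor(unsigned int minor, char *buf, size_t count)
{
    assert(buf != NULL);
    if (minor >= MODEL_NR_DEV)
        return -1;                  /* no such device in this model */
    if (count > MODEL_TXT_LEN)
        count = MODEL_TXT_LEN;      /* at most one beat per call */
    memcpy(buf, model_strings[minor], count);  /* copy_to_user() in a driver */
    return (long)count;
}
```

Because every call restarts from the beginning of the string, repeated reads return the same note over and over, which matches the behavior seen with head above.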
Unregistering the devices at unload time is easy: you just need to
call devfs_unregister for each entry point you registered.
Also, if you called devfs_register_chrdev you should now
call devfs_unregister_chrdev. Unregistering is shown in
listing 2.
    for (i=0; i<DRUMS_NR_DEV; i++)
        devfs_unregister(drums_devs[i]);
    devfs_unregister(drums_dir);
    devfs_unregister_chrdev(DRUMS_MAJOR, "drums");
If your device is meant to be available only via devfs, you can avoid dealing with major and minor numbers altogether. When a devfs node is opened, the kernel doesn't need the device numbers, as the driver has already provided the file_operations structure that must be used to act on that device.
To get automatic device numbers, the only thing that's needed is
specifying DEVFS_FL_AUTO_DEVNUM in the flags argument. The major and
minor arguments are then unused, and the filesystem will
automatically choose a major/minor pair for your device.
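The contract of the flag (caller-supplied numbers are ignored when automatic numbering is requested) can be sketched in user space. Everything here is an assumption made for illustration: the model_ names and flag values are hypothetical, and handing out sequential minors under a single placeholder major is not claimed to be the policy devfs really follows.

```c
#include <assert.h>

#define MODEL_FL_NONE         0u
#define MODEL_FL_AUTO_DEVNUM  1u    /* hypothetical value, not the kernel's */
#define MODEL_AUTO_MAJOR      144u  /* placeholder major for automatic numbers */
#define MODEL_AUTO_MINORS     256u  /* size of the pool in this model */

struct model_devnum { unsigned int major, minor; };

/* Next unused automatic minor; the allocation policy is assumed. */
static unsigned int model_next_minor;

/* Choose the device number for a new entry, honoring the flag. */
struct model_devnum model_pick_devnum(unsigned int flags,
                                      unsigned int major, unsigned int minor)
{
    struct model_devnum d;

    if (flags & MODEL_FL_AUTO_DEVNUM) {
        assert(model_next_minor < MODEL_AUTO_MINORS);  /* pool not exhausted */
        d.major = MODEL_AUTO_MAJOR;     /* caller's numbers are ignored */
        d.minor = model_next_minor++;
    } else {
        d.major = major;                /* caller's numbers are used as given */
        d.minor = minor;
    }
    return d;
}
```

The point of the model is only that two automatic registrations get distinct numbers that the driver did not choose and cannot know at compile time.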
What is most interesting about automatic device numbers is that the driver writer can't use the drums_read approach any more (choosing what to do according to the minor number), as the minor number isn't known at compile time.
What comes to the rescue is the private_data field that is part of
the file structure. Most drivers that use the field internally assign
its value at open time, based on the minor number being opened, and
use it in the other device methods (read, write, etc). With devfs you
are allowed to choose your private_data pointer before the device is
opened, and the chosen value can be passed to devfs_register as the
last argument.
devfs makes the historical role of device numbers vanish: the major number is unneeded because each device declares its operations, and the minor number is unneeded because each device declares its private information. The only remaining problem may lie in user-space programs that expect the major number to be constant across similar devices, but this behavior doesn't affect new devices (whose applications have not yet been written), so the problem doesn't really apply.
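The private_data mechanism can be modeled in user space to see why no device number is needed. This is a sketch only: the model_ types and functions are hypothetical stand-ins that mimic, but are not, the kernel's file structure and registration call, and memcpy replaces copy_to_user.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mimics the part of struct file the driver cares about. */
struct model_file {
    void *private_data;   /* per-open pointer, set before the driver runs */
    long  pos;
};

/* One registered entry: a name plus the pointer given at registration. */
struct model_entry {
    const char *name;
    void *info;
};

#define MODEL_MAX_ENTRIES 8
static struct model_entry model_entries[MODEL_MAX_ENTRIES];
static size_t model_nr_entries;

/* Stand-in for registration: remember the name and its private pointer. */
struct model_entry *model_register(const char *name, void *info)
{
    struct model_entry *e;

    if (model_nr_entries == MODEL_MAX_ENTRIES)
        return NULL;
    e = &model_entries[model_nr_entries++];
    e->name = name;
    e->info = info;
    return e;
}

/* Stand-in for open: the filesystem, not the driver, fills private_data. */
void model_open(const struct model_entry *e, struct model_file *filp)
{
    assert(e != NULL && filp != NULL);
    filp->private_data = e->info;
    filp->pos = 0;
}

/* One read method serves every entry and never looks at a device number. */
long model_read(struct model_file *filp, char *buf, size_t count)
{
    const char *txt = filp->private_data;
    size_t len = strlen(txt);

    if (count > len)
        count = len;
    memcpy(buf, txt, count);
    filp->pos += (long)count;
    return (long)count;
}
```

Two entries registered with different strings share the same read function yet return different text, because the only thing that distinguishes them is the pointer handed over at registration time.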
In the drums module, you'll find tambourine and timpani as examples
of automatic assignment of device numbers. The code lines that
implement them are shown in listing 3, and their appearance in the
system is shown by this screenshot.
borea.root# ls -l /devfs/timpani /devfs/tambourine
cr--r--r-- 1 root root 144, 3 Jan 1 1970 /devfs/tambourine
cr--r--r-- 1 root root 144, 4 Jan 1 1970 /devfs/timpani
borea.root# head -1 /devfs/timpani
boom
borea.root# head -1 /devfs/tambourine
rattle
    /* init_module: register tambourine and timpani */
    drums_tambourine = devfs_register(NULL, "tambourine", 0,
                                      DEVFS_FL_AUTO_DEVNUM, 0, 0,
                                      S_IFCHR | S_IRUGO, 0, 0,
                                      &drums_fops, (void *)"rattle\n");
    drums_timpani = devfs_register(NULL, "timpani", 0,
                                   DEVFS_FL_AUTO_DEVNUM, 0, 0,
                                   S_IFCHR | S_IRUGO, 0, 0,
                                   &drums_fops, (void *)"boom\n");

    /* this is the read() implementation */
    txt = filp->private_data;
    if (count > strlen(txt)) count = strlen(txt);
    copy_to_user(buf, txt, count);
    *offp += count;
    return count;
The ability to work without device numbers is very important, because the Linux device space is not far from exhaustion, due to its sparse nature: a major number is assigned to every important-enough device driver, even though most systems only have a dozen drivers installed. With devfs, drivers can work independently of major assignment, without resorting to hairy scripts that call mknod at load time when a dynamic major number is used.
While the sample drums module only shows the basic functionality of
devfs, the interface exported by devfs_fs_kernel.h offers much more.
The filesystem can host conventional files, symbolic links and
everything that can live in a conventional filesystem.
The filesystem is currently marked as experimental, even though the current devfs implementation is pretty stable. The problem with devfs as I write this is that actual use of its features must be somehow standardized, to prevent cluttering of the devfs name space. As a matter of fact, kernel developers are still discussing the suitability of procfs and devfs for all system configuration, in order to find the best and cleanest way to access system configuration and resources.
While non-devfs systems use less memory, and a tiny conventional /dev
directory is currently still the best option for small embedded
systems, the availability of devfs opens a new range of options for
driver developers. It also simplifies users' lives when adding a new
device driver to their system, as the driver module can now do
everything that is needed to grant user-space access to the hardware.
rubini-at-gnu-dot-org
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved