Secure Programming for Linux HOWTO: Background

2. Background

2.1 Linux and Open Source Software

In 1984 Richard Stallman's Free Software Foundation (FSF) began the GNU project, a project to create a free version of the Unix operating system. By free, Stallman meant software that could be freely used, read, modified, and redistributed. The FSF successfully built many useful components but was having trouble developing the operating system kernel [FSF 1998]. In 1991 Linus Torvalds began developing an operating system kernel, which he named ``Linux'' [Torvalds 1999]. This kernel could be combined with the FSF material and other components, producing a freely modifiable and very useful operating system. This paper will term the kernel itself the ``Linux kernel'' and the entire combination ``Linux'' (many use the term GNU/Linux instead for this combination).

Different organizations have combined the available components differently. Each combination is called a ``distribution,'' and the organizations that develop distributions are called ``distributors.'' Common distributions include Red Hat, Mandrake, SuSE, Caldera, Corel, and Debian. This paper is not specific to any distribution; it does presume Linux kernel version 2.2 or greater and the C library glibc 2.1 or greater, which are valid assumptions for essentially all current major Linux distributions.

Increased interest in such ``free software'' has made it increasingly necessary to define and explain it. A widely used term is ``open source software,'' which is further defined in [OSI 1999]. Eric Raymond [1997, 1998] wrote several seminal articles examining its development process.

Linux is not derived from Unix source code, but its interfaces are intentionally Unix-like. Therefore, Unix lessons learned apply to Linux, including information on security. Much of the information in this paper actually applies to any Unix-like system, but Linux-specific information has been intentionally added to enable those using Linux to take advantage of its capabilities. This paper intentionally focuses on Linux systems to narrow its scope; including all Unix-like systems would require an analysis of porting issues and other systems' capabilities, which would have greatly increased the size of this document.

Since Linux is intentionally Unix-like, it has Unix security mechanisms. These include user and group ids (uids and gids) for each process, a filesystem with read, write, and execute permissions (for user, group, and other), System V inter-process communication (IPC), socket-based IPC (including network communication), and so on. See Thompson [1974] and Bach [1986] for general information on Unix systems, including their basic security mechanisms. Section 3 summarizes key Linux security mechanisms.
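As a small illustration of the permission bits mentioned above, the following sketch renders the nine read, write, and execute bits (for user, group, and other) in the familiar ``rwxr-x---'' form. The helper name format_permissions is hypothetical and is not part of any library; only the standard S_I* mode macros from <sys/stat.h> are assumed.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not a library function): write the nine permission
   bits of mode into out as text such as "rwxr-x---".
   out must have room for 10 bytes (9 characters plus the terminating NIL). */
void format_permissions(mode_t mode, char out[10])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,   /* user  */
        S_IRGRP, S_IWGRP, S_IXGRP,   /* group */
        S_IROTH, S_IWOTH, S_IXOTH    /* other */
    };
    int i;

    assert(out != NULL);
    memcpy(out, "rwxrwxrwx", 10);    /* start with every bit shown as set */
    for (i = 0; i < 9; i++) {
        if (!(mode & bits[i]))
            out[i] = '-';            /* blank out each bit that is clear */
    }
}
```

A caller would typically pass the st_mode field filled in by stat(2) or fstat(2); the setuid, setgid, and sticky bits are deliberately ignored in this sketch.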

2.2 Security Principles

There are many general security principles which you should be familiar with; consult a general text on computer security such as [Pfleeger 1997].

Saltzer [1974] and Saltzer and Schroeder [1975] list the following principles of the design of secure protection systems, which are still valid:

Least privilege. Each user and program should operate using the fewest privileges possible. That way, damage from attack is minimized.
Economy of mechanism. The protection system's design should be small, simple, and straightforward.
Open design. The protection mechanism must not depend on the attacker's ignorance. Instead, the mechanism should be public, depending on the secrecy of relatively few (and easily changeable) items like passwords. This makes extensive public scrutiny possible. Bruce Schneier argues that smart engineers should ``demand open source code for anything related to security,'' as well as ensuring that it receives widespread review and that any identified problems are fixed [Schneier 1999].
Complete mediation. Every access attempt must be checked; position the mechanism so it cannot be subverted. For example, in a client-server model, generally the server must do all access checking because users can build or modify their own clients.
Permission-based. The default should be to deny access; access should be granted only when it is explicitly permitted.
Separation of privilege. Ideally, access to objects should depend on more than one condition, so that defeating one protection system won't enable complete access.
Least common mechanism. Shared objects provide potentially dangerous channels for information flow, so physically or logically separate them.
Easy to use. If a mechanism is easy to use it is unlikely to be avoided.
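
The permission-based principle in the list above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name is_allowed and the idea of an allowlist of names are hypothetical, chosen to show that every path through the check ends in denial unless access was explicitly granted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allowlist check (illustration only).
   Returns 1 only when name exactly matches an entry in allowed;
   every other case, including missing input, returns 0 (deny). */
int is_allowed(const char *name, const char *const *allowed, size_t count)
{
    size_t i;

    /* The list is supplied by the program itself, so a missing list with a
       nonzero count is treated as a programming error. */
    assert(allowed != NULL || count == 0);

    if (name == NULL)
        return 0;                    /* untrusted input is missing: deny */

    for (i = 0; i < count; i++) {
        if (allowed[i] != NULL && strcmp(name, allowed[i]) == 0)
            return 1;                /* explicitly permitted */
    }
    return 0;                        /* default: deny */
}
```

The design choice worth noting is that there is no ``deny list'': a name that nobody thought about is rejected rather than accepted.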

2.3 Types of Secure Programs

Many different types of programs may need to be secure programs (as the term is defined in this paper). Some common types are:

Application programs used as viewers of remote data. Programs used as viewers (such as word processors or file format viewers) are often asked to view data sent remotely by an untrusted user (this request may be automatically invoked by a web browser). Clearly, the untrusted user's input should not be allowed to cause the application to run arbitrary programs. It's usually unwise to support initialization macros (run when the data is displayed); if you must, then you must create a secure sandbox (a complex and error-prone task). Be careful of issues such as buffer overflow, discussed later, which might allow an untrusted user to force the viewer to run an arbitrary program.
Application programs used by the administrator (root). Such programs shouldn't trust information that can be controlled by non-administrators.
Local servers (also called daemons).
Network-accessible servers (sometimes called network daemons).
CGI scripts. These are a special case of network-accessible servers, but they're so common they deserve their own category. Such programs are invoked indirectly via a web server, which filters out some attacks but nevertheless leaves many attacks that must be withstood.
setuid/setgid programs. These programs are invoked by a local user and, when executed, are immediately granted the privileges of the program's owner and/or owner's group. In many ways these are the hardest programs to secure, because so many of their inputs are under the control of the untrusted user and some of those inputs are not obvious.
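
For the setuid/setgid case above, a program can tell that it has been granted privileges beyond those of the invoking user by comparing its real and effective ids. The sketch below uses only getuid(2), geteuid(2), getgid(2), and getegid(2); the helper names ids_differ and running_with_elevated_ids are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pure comparison, kept separate so the logic can be
   checked without special privileges. Returns nonzero if either pair differs. */
int ids_differ(uid_t ruid, uid_t euid, gid_t rgid, gid_t egid)
{
    return ruid != euid || rgid != egid;
}

/* Hypothetical helper: returns nonzero if the calling process appears to be
   running setuid or setgid, that is, if its effective ids differ from the
   real ids of the user who invoked it. */
int running_with_elevated_ids(void)
{
    return ids_differ(getuid(), geteuid(), getgid(), getegid());
}
```

A program might use such a check to decide when inputs such as environment variables must be treated as controlled by an untrusted user. It is not a complete test of privilege: a program started by root has equal real and effective ids but is still privileged.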

This paper merges the issues of these different types of program into a single set. The disadvantage of this approach is that some of the issues identified here don't apply to all types of program. In particular, setuid/setgid programs have many surprising inputs and several of the guidelines here only apply to them. However, things are not so clear-cut, because a particular program may cut across these boundaries (e.g., a CGI script may be setuid or setgid, or be configured in a way that has the same effect). The advantage of considering all of these program types together is that we can consider all issues without trying to apply an inappropriate category to a program. As will be seen, many of the principles apply to all programs that need to be secured.

There is a slight bias in much of this paper towards programs written in C, with some notes on other languages such as C++, Perl, Python, Ada95, and Java. This is because C is the most common language for implementing secure programs on Linux (other than CGI scripts, which tend to use Perl), and most other languages' implementations call the C library. This is not to imply that C is somehow the ``best'' language for this purpose, and most of the principles described here apply regardless of the programming language used.

2.4 Paranoia is a Virtue

The primary difficulty in writing secure programs is that it requires a different mindset: in short, a paranoid mindset. The reason is that the impact of errors (also called defects or bugs) can be profoundly different.

Normal non-secure programs have many errors. While these errors are undesirable, they usually involve rare or unlikely situations, and a user who stumbles upon one will usually try to avoid using the tool that way in the future.

In secure programs, the situation is reversed. Certain users will intentionally search out and cause rare or unlikely situations, in the hope that such attacks will give them unwarranted privileges. As a result, when writing secure programs, paranoia is a virtue.

2.5 Sources of Design and Implementation Guidelines

Several documents help describe how to write secure programs (or, alternatively, how to find security problems in existing programs), and were the basis for the guidelines highlighted in the rest of this paper.

AUSCERT has released a programming checklist [AUSCERT 1996], based in part on chapter 22 of Garfinkel and Spafford's book discussing how to write secure SUID and network programs [Garfinkel 1996]. Matt Bishop [1996, 1997] has developed several extremely valuable papers and presentations on the topic. Galvin [1998a] described a simple process and checklist for developing secure programs; he later updated the checklist in Galvin [1998b]. Sitaker [1999] presents a list of issues for the ``Linux security audit'' team to search for. Shostack [1999] defines another checklist for reviewing security-sensitive code. The Secure Unix Programming FAQ also has some useful suggestions [Al-Herbish 1999]. Some useful information is also available from Ranum [1998]. Some recommendations must be taken with caution; for example, Anonymous [unknown] recommends the use of access(3) without noting the dangerous race conditions that usually accompany it. Wood [1985] has some useful but dated advice in its ``Security for Programmers'' chapter. Bellovin [1994] and FreeBSD [1999] also include useful guidelines.
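
The race condition mentioned above arises because the file named by a path can be replaced between the access(3) check and the later open(2). The following is a minimal sketch of one common alternative, not taken from the documents cited above: open the file first, then examine the open descriptor with fstat(2), so that the check and the use refer to the same object. The helper name open_owned_regular_file and the particular checks (regular file, expected owner) are assumptions chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L      /* for O_NOFOLLOW */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only and verify, on the open
   descriptor, that it is a regular file owned by owner.
   Returns the descriptor on success, or -1 on any failure. */
int open_owned_regular_file(const char *path, uid_t owner)
{
    struct stat st;
    int fd;

    assert(path != NULL);

    /* O_NOFOLLOW refuses a symbolic link as the final path component;
       O_NONBLOCK keeps open() from blocking if the path names a FIFO. */
    fd = open(path, O_RDONLY | O_NOFOLLOW | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* fstat() inspects the file that was actually opened, so a later
       change to what the path names cannot affect the result. */
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode) || st.st_uid != owner) {
        close(fd);
        return -1;
    }
    return fd;
}
```

This does not reproduce what access(3) checks (whether the real user may access the file); it only shows the general pattern of checking an open descriptor instead of a path name.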

There are many documents giving security guidelines for programs using the Common Gateway Interface (CGI) to interface with the web. These include Gundavaram [unknown], Kim [1996], Phillips [1995], Stein [1999], and Webber [1999].

There are also many documents describing the issue from the other direction (i.e., ``how to crack a system''). One example is McClure [1999], and there is a great deal of material from that vantage point on the Internet.

This paper is a summary of what I believe are the most useful guidelines; it is not a complete list of all possible guidelines. The organization presented here is my own (every list has its own, different structure), and the Linux-unique guidelines (e.g., on capabilities and the fsuid value) are also my own. Reading all of the referenced documents listed above as well is highly recommended.

One question that could be asked is ``why did you write your own document instead of just referring to other documents?'' There are several answers:

Much of this information was scattered about; placing the critical information in one organized document makes it easier to use.
Some of this information is not written for the programmer, but is written for an administrator or user.
Some information isn't relevant to Linux. For example, many checklists warn against setuid shell scripts; since Linux doesn't permit them in the normal case, there's no need to warn against them.
Much of the available information emphasizes portable constructs (constructs that work on all Unix-like systems). It's often best to avoid Linux-unique abilities for portability's sake, but sometimes the Linux-unique abilities can really aid security. Even if non-Linux portability is desired, you may want to support Linux-unique abilities on Linux.
This approach isn't unique. Other operating systems, such as FreeBSD, have a security programming guide specific to their operating system.

2.6 Document Conventions

System manual pages are referenced in the format name(number), where number is the section number of the manual. C and C++ treat the character '\0' (ASCII 0) specially, and this value is referred to as NIL in this paper. The pointer value that means ``does not point anywhere'' is called NULL; C compilers will convert the integer 0 to the value NULL in most circumstances, but note that nothing in the C standard requires that NULL actually be implemented by a series of all-zero bits.
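
To make the NIL and NULL conventions concrete, here is a short sketch in C. The helper names count_nil_bytes and has_text are hypothetical. The first shows that a buffer of known length can contain NIL characters that the C string functions never look past; the second shows that a NULL pointer and an empty string (one whose first character is NIL) are different things that usually both need checking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the NIL ('\0') characters in a buffer of known
   length. Functions such as strlen(3) stop at the first NIL, so any data
   after an embedded NIL is invisible to them. */
size_t count_nil_bytes(const char *buf, size_t len)
{
    size_t i, n = 0;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == '\0')
            n++;
    }
    return n;
}

/* Hypothetical helper: returns 1 only if s is not NULL and points to at
   least one character before the terminating NIL. */
int has_text(const char *s)
{
    return s != NULL && s[0] != '\0';
}
```

For example, the array initialized from "ab\0cd" occupies six bytes and contains two NIL characters, yet strlen(3) reports a length of 2.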
