File System Design for Performance and Security

The Unix Family of Operating Systems

The Unix family includes a range of similar operating systems with some common characteristics. Formally, "Unix" means just those that have been certified as conforming to the Single Unix Specification. But practically, "Unix" is used as a generic term. Companies selling the commercial variants have used other names to distinguish their product — Sun (since purchased by Oracle) wants you to just buy Solaris, HP offers HP-UX, DEC developed Tru64 (and after being purchased by Compaq and then by HP, they still get more requests for Tru64 than HP-UX), IBM offers AIX, Silicon Graphics developed Irix, and so on.

Then there are the free Unix variants — Linux, FreeBSD, NetBSD, OpenBSD, Android, and a few others that are less commonly encountered.

As for Apple's OS X, many people say "OS X is based on BSD" or even "OS X is BSD", but neither of those is exactly right. It's more that both BSD and OS X have the same ancestor.

With a little thought and investigation, the advice on this page should apply to any member of the Unix family of operating systems.

Fundamentals — What Goes Where

The kernel, the operating system itself, is loaded into RAM by the boot loader. On many Unixes this is a large file in the root of the file system such as /vmunix or /bsd or similar. On Linux it is /boot/vmlinuz-release and the loadable modules are stored in a hierarchy below /lib/modules/release/.

Detected devices correspond to "device-special files" stored in /dev or possibly /devices. These include hardware devices such as disks and serial ports, and also software devices such as /dev/null and /dev/random. A detailed listing, the output of ls -l /dev, shows device-special files of type c for "character", meaning unbuffered or "raw", and type b for "block", meaning buffered.

Programs useful to all users are in /bin and /usr/bin. The most fundamental ones, needed while booting or during system maintenance, are in /bin while those probably not needed until after users have logged in are in /usr/bin.

Programs useful mostly for the system administrator and the booting process are in /sbin and /usr/sbin with the same distinction that the second of those holds those programs not needed while booting.

Shared libraries are in /lib or /usr/lib or possibly both, with the directory possibly named lib64 instead of lib (and in many Linux distributions, all four possible combinations are found).

The system configuration is in /etc.

Data and configuration files for applications are typically in /usr/share.

System services store their data and use working storage under /var. For example, Syslog can store log data anywhere but the standard place is /var/log or /var/adm. Mail messages are stored under /var/spool or /var/mail, and print jobs under /var/spool or /var/lp. Apache is typically configured to use some subdirectory of /var such as /var/www.

Source code for the kernel and applications goes under /usr/src.

Users' home directories and thus all their files are in /home. If you are using automounting, /home is an empty directory used to mount user subdirectories as needed and the actual home directories are in /export or /export/home or similar.

Kernel data structures may be found in /proc and, at least on Linux, /sys. If you only find files with numbers for names in /proc (as is the case on Solaris), each is the kernel data structure for the process with that numeric process ID. The Linux /proc is more user-friendly, and it also contains a rather large collection of named kernel data structures for the TCP/IP protocols (counts, timers, flags etc), memory management, detected devices, and so on. Other special in-memory file systems on Linux are /sys and /run, and /tmp might be implemented in RAM, which can easily be done by making /tmp a symbolic link pointing to /dev/shm.

Mount points for removable media are in /mnt and /media.

The system-wide temporary area is /tmp. Its permissions must be octal 1777, shown as drwxrwxrwt, allowing anyone to create a file because it's world-writable, but not allowing anyone (other than root of course) to delete another user's file because of the "sticky bit".
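As a minimal sketch, the mode can be checked from a script by reading the mode string that ls prints. The helper name check_tmp_mode is made up for this illustration.

```shell
#!/bin/sh
# Report whether a directory is world-writable with the sticky bit set
# (octal 1777, shown by ls as drwxrwxrwt).
# check_tmp_mode is a hypothetical helper name, not a standard tool.
check_tmp_mode() {
    dir=$1
    # Keep only the 10-character mode string from the long listing.
    mode=$(ls -ld "$dir" | cut -c1-10)
    if [ "$mode" = "drwxrwxrwt" ]; then
        echo "ok"
    else
        echo "unexpected mode: $mode"
    fi
}

# Example: check_tmp_mode /tmp
```

If the mode is wrong, chmod 1777 /tmp (as root) restores it.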

Software added from outside the original distribution is typically stored in a hierarchy under /usr/local, although some packages use /opt. The /usr/local hierarchy can contain many of the same subdirectories with the same purposes as found under /usr, including bin, sbin, lib, man, share and so on. The X graphical system has a similar hierarchy in /usr/X11R6 or /usr/X11.

It used to be the case that /usr contained things not needed until the system had already done the basic initialization work, and so it could and perhaps should be a separate file system. That is no longer the case. Current Linux distributions need /usr to boot. Many critical subsystems needed during the booting process have placed executable programs, shared libraries, and configuration files under /usr, so keep /usr in the root file system.
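One way to see whether /usr already shares the root file system is to compare the mount points that df reports for the two paths. This is only a sketch: the helper names are invented here, and it assumes mount point names contain no spaces.

```shell
#!/bin/sh
# Print the mount point of the file system that holds a given path.
# df -P produces POSIX-format output; the mount point is the sixth field.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Succeed if both paths live on the same file system.
# same_fs is a hypothetical helper name.
same_fs() {
    [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]
}

# Example:
#   same_fs / /usr && echo "/usr is part of the root file system"
```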

Improving File System Security

Example OpenBSD Partitioning

Partition   Size     Mount point
/dev/sd0a   1 GB     /
/dev/sd0b   3 GB     swap
/dev/sd0c   (spans entire device)
/dev/sd0d   4 GB     /tmp
/dev/sd0e   10 GB    /var
/dev/sd0f   2 GB     /usr
/dev/sd0g   1 GB     /usr/X11R6
/dev/sd0h   10 GB    /usr/local
/dev/sd0i   2 GB     /usr/src
/dev/sd0j   265 GB   /home

Red Hat's installer creates a small partition on the first disk and puts /boot there, keeping the kernel and boot loader files outside of LVM so that the boot loader does not have to read logical volumes. To support UEFI firmware, it next creates a small ESP, or (U)EFI System Partition, with a VFAT file system. It then uses the third partition of that disk plus all of the other disks as a volume group. Within that volume group it creates a logical volume for the swap area, sized according to the amount of RAM, and a second logical volume using all the remaining space. That logical volume is used for one large root file system.

OpenBSD has a better solution: its installer lays out multiple file systems in an intelligent fashion. The precise number of file systems it creates depends on the total amount of space. The example OpenBSD partitioning above is what I got with one 300 GB disk and 3 GB of RAM.

Red Hat's single large file system is easy to expand later. Add more disks, make them physical volumes with pvcreate, then extend the volume group with vgextend, extend the lv_root logical volume with lvextend, and grow the file system with resize2fs.
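The sequence just described can be sketched as a dry run that only prints the commands, so you can review them before running anything as root. The disk and volume group names below are placeholders, and the helper name plan_lvm_growth is invented for this illustration.

```shell
#!/bin/sh
# Print (not run) the commands that add one disk to a volume group and
# grow a logical volume and its file system into the new space.
# All names passed in below are placeholders, not values from a real system.
plan_lvm_growth() {
    disk=$1   # new disk, for example /dev/sdb
    vg=$2     # volume group name
    lv=$3     # logical volume name, for example lv_root
    echo "pvcreate $disk"
    echo "vgextend $vg $disk"
    echo "lvextend -l +100%FREE /dev/$vg/$lv"
    echo "resize2fs /dev/$vg/$lv"
}

plan_lvm_growth /dev/sdb vg_example lv_root
```

Note that resize2fs applies to ext2, ext3 and ext4 file systems. An XFS root file system would be grown with xfs_growfs instead.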

However, security can be improved with only a slight increase in complexity.

Improve your system security by splitting the storage into multiple file systems and mounting some of them in more cautious modes. These changes make it harder for an intruder or an insider to elevate their privileges.

First, though, realize that some things must be in the root file system. It must have /dev, /etc, /lib*, /bin, /sbin, and now much of /usr to boot. Other than those, subdirectories can be on independent file systems, and in many cases that leads to increased security.

Only /dev and therefore the / file system needs device-special files.
Mount all file systems other than / with the nodev option.
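Before adding nodev to a file system, it may be worth confirming that it holds no device-special files that something relies on (a chroot environment, for example). A minimal sketch, with a helper name invented for this page:

```shell
#!/bin/sh
# List block and character device-special files beneath a starting point,
# staying on that one file system (-xdev).
# list_device_files is a hypothetical helper name.
list_device_files() {
    find "$1" -xdev \( -type b -o -type c \) -print 2>/dev/null
}

# Example: anything printed here would stop working under nodev.
#   list_device_files /home
```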

Find all your SETUID/SETGID programs with this command:

# find / -type f \( -perm -4000 -o -perm -2000 \)

Now, get the list of the directories containing those:

# find / -type f \( -perm -4000 -o -perm -2000 \) -exec dirname {} \; | sort -u

You will probably discover that just these directories contain those files:

/bin
/sbin
/usr/X11R6
/usr/bin
/usr/sbin

Depending on what you have added, you may also find some in /usr/local/bin and /usr/local/sbin.
Mount all the other file systems with the nosuid option.

Mount all file systems that do not need to contain executable files (probably /tmp, /var and /usr/src in this case) with the noexec option. If you are quite certain that your users do not need executable programs, you could also mount /home with the noexec option. If, for example, you are building a Samba file server where the users' files are all remotely accessed from an operating system that will not interpret the Unix permissions anyway, this would be safe. But realize that you may confuse and frustrate your more advanced users by mounting their home directories this way. Both chmod and ls work as expected, allowing you to turn on and verify the execute permission, but the kernel mysteriously refuses to allow those programs to run.

# cp /bin/date /usr/src
# /usr/src/date
Sat Apr 20 13:04:10 UTC 2024
# umount /usr/src
# mount -o noexec /usr/src
# /usr/src/date
/usr/src/date: Permission denied.
# ls -l /usr/src/date
-r-xr-xr-x  1 root  wsrc  131528 Nov 18 13:25 /usr/src/date
    It looks like it should work!  Let's make sure:
# chmod +x /usr/src/date
# ls -l /usr/src/date
-r-xr-xr-x  1 root  wsrc  131528 Nov 18 13:25 /usr/src/date
# id
uid=0(root) gid=0(wheel) groups=0(wheel), 2(kmem), 3(sys), 4(tty), 5(operator), 20(staff)
    That looks good, and I am root.  Let's try again:
# /usr/src/date
/usr/src/date: Permission denied.
    Confusion deepens until it occurs to you to try the mount command.
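A quicker way out of that confusion is to look at the mount options directly. The sketch below tests whether a comma-separated option string contains a given option. The helper name is invented here, and the commented usage assumes a Linux system where findmnt from util-linux is installed. On other systems, read the options from the output of mount instead.

```shell
#!/bin/sh
# Succeed if a comma-separated mount option string contains the given option.
# has_mount_option is a hypothetical helper name.
has_mount_option() {
    # Wrap both sides in commas so "exec" does not match inside "noexec".
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example on Linux (assumes findmnt is available):
#   opts=$(findmnt -n -o OPTIONS --target /usr/src)
#   has_mount_option "$opts" noexec && echo "/usr/src is mounted noexec"
```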

Designing a Partitioning Scheme

How can you design your partitioning scheme in advance, anticipating just how much space you need in the various file systems? That's difficult, but there's no need to go about it the hard way. Here's the easy way:

Do a test installation onto one enormous root file system. Choose the software packages as you will for the final plan, but create just one file system to hold everything.

Decide how you want to partition the system, as a list of file system mount points.

Use the du command to see how big each one is. But there's no need to include /tmp or /home in this command as you have not added users or done anything on the system yet.

# du -sh / /usr/X11R6 /usr/local /usr/share /usr/src /var 

Now you need to do a little math and a little thinking. For the leaf file systems, the ones that do not themselves contain further mount points, you can directly use the sizes reported from the above command. For the others, subtract the sizes of all the file systems that will be mounted within them. Here is some example output and how to interpret it if all these are to be mount points for file systems:

# du -sh / /usr/X11R6 /usr/local /usr/share /usr/src /var
6.0G   /
200M   /usr/X11R6
100M   /usr/local
4.0G   /usr/share
800M   /usr/src
100M   /var

   / contains all the others, so on its own it now uses:
        6000 - (200 + 100 + 4000 + 800 + 100) = 800 MB
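The subtraction can be scripted so it is easy to repeat for each parent file system. This is a small sketch with an invented helper name, working in whole megabytes and counting every nested mount point in the sample listing, including /var.

```shell
#!/bin/sh
# Subtract the sizes of nested mount points from a parent's du total.
# All sizes must be in the same unit (megabytes here).
# own_size is a hypothetical helper name.
own_size() {
    total=$1
    shift
    for nested in "$@"; do
        total=$((total - nested))
    done
    echo "$total"
}

# Sample figures: / is 6000 MB in total; the nested mount points are
# /usr/X11R6 (200), /usr/local (100), /usr/share (4000),
# /usr/src (800) and /var (100).
own_size 6000 200 100 4000 800 100    # prints 800
```

For real numbers, du -sk reports kilobytes as plain integers, which are easier to feed into arithmetic than the rounded output of du -sh.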

Give your system plenty of room to grow! Disks are cheap, and there is no longer any reason to size a file system just a few tens of megabytes over the bare minimum.

You may need quite a bit of room in /tmp but this depends on the set of programs run on your system.

Logging will make /var grow quite a bit, although log rotation will impose some limit on that. You need plenty of space in /var to spool print jobs, and mail can grow without bounds.

Adding more software packages will grow their destination file systems — / and /usr/share for further components of your distribution, and either /usr/local or /usr/X11R6 plus /usr/share for add-ons. The packages that go into /opt will grow the / file system in this example plan.

Performance Considerations

Spread those file systems across multiple physical devices. Have no more than one busy partition (file system) per disk. LVM (Logical Volume Management) can have an effect similar to software RAID 0, striping with no redundancy, but only when you set up a striped logical volume across multiple disks at the beginning. If you start with a logical volume on a single disk, fill it, add a second disk, fill that, and add a third, the existing files are not re-striped across the added disks.

For guidance on selecting hardware and tuning file system I/O see: Linux Performance Tuning

How To Specify These Options

Read your manual pages for the mount command and your /etc/fstab file (on Solaris, it is /etc/vfstab). The OpenBSD installer will automatically do something very reasonable:

# cat /etc/fstab
/dev/sd0a  /           ffs  rw               1 1
/dev/sd0j  /home       ffs  rw,nodev,nosuid  1 2
/dev/sd0d  /tmp        ffs  rw,nodev,nosuid  1 2
/dev/sd0f  /usr        ffs  rw,nodev         1 2
/dev/sd0g  /usr/X11R6  ffs  rw,nodev         1 2
/dev/sd0h  /usr/local  ffs  rw,nodev         1 2
/dev/sd0i  /usr/src    ffs  rw,nodev,nosuid  1 2
/dev/sd0e  /var        ffs  rw,nodev,nosuid  1 2
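To audit a table like this one, a short awk filter can list the non-root file systems that lack a given option. This is a sketch only: missing_option is an invented name, and it assumes the usual whitespace-separated fstab layout with the options in the fourth field.

```shell
#!/bin/sh
# Read fstab-format lines on standard input and print the mount point of
# every non-root file system whose options (field 4) lack the given option.
# missing_option is a hypothetical helper name.
missing_option() {
    awk -v opt="$1" '
        /^[ \t]*#/ || NF < 4 { next }    # skip comments and short lines
        $2 == "/" || $2 == "none" || $3 == "swap" { next }
        {
            n = split($4, opts, ",")
            found = 0
            for (i = 1; i <= n; i++)
                if (opts[i] == opt) found = 1
            if (!found) print $2
        }
    '
}

# Example: missing_option nosuid < /etc/fstab
```

Run against the listing above, a check for nosuid would report /usr, /usr/X11R6 and /usr/local, which is expected because those are where the SETUID programs live.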
