File System Design for Performance and Security
The Unix Family of Operating Systems
The Unix family includes a range of similar operating systems with some common characteristics. Formally, "Unix" means just those that have been certified as conforming to the Single Unix Specification. But practically, "Unix" is used as a generic term. Companies selling the commercial variants have used other names to distinguish their product — Sun (since purchased by Oracle) wants you to just buy Solaris, HP offers HP-UX, DEC developed Tru64 (and after being purchased by Compaq and then by HP, they still get more requests for Tru64 than HP-UX), IBM offers AIX, Silicon Graphics developed Irix, and so on.
Then there are the free Unix variants — Linux, FreeBSD, NetBSD, OpenBSD, Android, and a few others that are less commonly encountered.
As for Apple's OS X, many people say "OS X is based on BSD" or even "OS X is BSD", but neither of those is exactly right. It's more that both BSD and OS X have the same ancestor.
With a little thought and investigation, the advice on this page should apply to any member of the Unix family of operating systems.
Fundamentals — What Goes Where
The kernel, the operating system itself, is loaded into RAM by the boot loader. On many Unixes this is a large file in the root of the file system such as /vmunix or /bsd or similar. On Linux it is /boot/vmlinuz-release and the loadable modules are stored in a hierarchy below /lib/modules/release/.
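As a small illustration of the Linux naming above: on most distributions the "release" part of those paths is the string printed by uname -r. The helper name kernel_paths is made up for this sketch, and some distributions place or name the kernel image differently, so treat the printed paths as the usual locations rather than a guarantee.

```shell
#!/bin/sh
# Print where a Linux system would normally keep the kernel image and
# its loadable modules for a given release string.
# kernel_paths is a hypothetical helper name used only in this sketch.
kernel_paths() {
    release="$1"
    printf '%s\n' "/boot/vmlinuz-$release" "/lib/modules/$release/"
}

# Use the release of the running kernel, as reported by uname -r.
kernel_paths "$(uname -r)"
```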
Detected devices correspond to "device-special files" stored in /dev or possibly /devices. These include hardware devices like disks and serial ports and so on, and also software devices like /dev/null and /dev/random and so on. A detailed listing, the output of ls -l /dev, shows device-special files of type c for "character", meaning unbuffered or "raw", and type b for "buffered" or "block".
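The type letter is the first character of each line in the long listing, so the two kinds of device-special file are easy to tally. The following is a minimal sketch; count_device_types is a hypothetical helper name, and it only inspects the text of an ls -l listing given on standard input.

```shell
#!/bin/sh
# Count character (c) and block (b) device-special files in an
# "ls -l" listing read from standard input.
# count_device_types is a hypothetical helper name for this sketch.
count_device_types() {
    awk '
        /^c/ { chr++ }
        /^b/ { blk++ }
        END  { printf "character=%d block=%d\n", chr, blk }
    '
}

# Typical use on a live system:
ls -l /dev | count_device_types
```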
Programs useful to all users are in /bin and /usr/bin. The most fundamental ones, needed while booting or during system maintenance, are in /bin while those probably not needed until after users have logged in are in /usr/bin.
Programs useful mostly for the system administrator and the booting process are in /sbin and /usr/sbin with the same distinction that the second of those holds those programs not needed while booting.
Shared libraries are in /lib or /usr/lib or possibly both, with the directory possibly named lib64 instead of lib (and in many Linux distributions, all four possible combinations are found).
The system configuration is in /etc.
Data and configuration files for applications are typically in /usr/share.
System services store their data and use working storage under /var. For example, Syslog can store log data anywhere but the standard place is /var/log or /var/adm. Mail messages are stored under /var/spool or /var/mail, and print jobs under /var/spool or /var/lp. Apache is typically configured to use some subdirectory of /var such as /var/www.
Source code for the kernel and applications goes under /usr/src.
Users' home directories and thus all their files are in /home. If you are using automounting, /home is an empty directory used to mount user subdirectories as needed and the actual home directories are in /export or /export/home or similar.
Kernel data structures may be found in /proc and, at least on Linux, /sys. If you only find files with numbers for names in /proc (as is the case on Solaris), each is the kernel data structure for the process with that numeric process ID. The Linux /proc is more user-friendly, and it also contains a rather large collection of named kernel data structures for the TCP/IP protocols (counts, timers, flags etc), memory management, detected devices, and so on. Other special in-memory file systems on Linux are /sys and /run, and /tmp might be implemented in RAM, which can easily be done by making /tmp a symbolic link pointing to /dev/shm.
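To see the per-process entries described above, it is enough to keep the names in /proc that consist only of digits. This is a sketch under the assumption that /proc is mounted; numeric_entries is a hypothetical helper name.

```shell
#!/bin/sh
# Keep only the names made up entirely of digits, which is how the
# per-process entries in /proc are named.
# numeric_entries is a hypothetical helper name for this sketch.
numeric_entries() {
    grep -E '^[0-9]+$'
}

# On a system with /proc mounted, show the five lowest process IDs.
ls /proc | numeric_entries | sort -n | head -n 5
```

On Linux the same listing without the filter also shows the named entries, which is a quick way to compare the two styles of /proc.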
Mount points for removable media are in /mnt and /media.
The system-wide temporary area is /tmp. Its permissions must be octal 1777, shown as drwxrwxrwt, allowing anyone to create a file because it's world-writable, but not allowing anyone (other than root of course) to delete another user's file because of the "sticky bit".
Software added from outside the original distribution is typically stored in a hierarchy under /usr/local, although some packages use /opt. The /usr/local hierarchy can contain many of the same subdirectories with the same purposes as found under /usr, including bin, sbin, lib, man, share and so on. The X graphical system has a similar hierarchy in /usr/X11R6 or /usr/X11.
It used to be the case that /usr contained things not needed until the system had already done the basic initialization work, and so it could and perhaps should be a separate file system. That is no longer the case. Current Linux distributions need /usr to boot. Many critical subsystems needed during the booting process have placed executable programs, shared libraries, and configuration files under /usr, so keep /usr in the root file system.
Improving File System Security
Example OpenBSD Partitioning
Partition | Size | Mount point
/dev/sd0a | 1 GB | /
/dev/sd0b | 3 GB | swap
/dev/sd0c | spans entire device |
/dev/sd0d | 4 GB | /tmp
/dev/sd0e | 10 GB | /var
/dev/sd0f | 2 GB | /usr
/dev/sd0g | 1 GB | /usr/X11R6
/dev/sd0h | 10 GB | /usr/local
/dev/sd0i | 2 GB | /usr/src
/dev/sd0j | 265 GB | /home
Red Hat's installer creates a small partition on the first disk and puts /boot there. Boot loaders need their own partition because they can't handle logical volumes. To support UEFI firmware, a small ESP or (U)EFI System Partition with a VFAT file system is created next. It then uses the third partition of that disk plus all of the other disks as a volume group. Within that volume group it creates a logical volume for the swap area, sized according to the amount of RAM, and a second logical volume using all the remaining space. That logical volume is used for one large root file system.
OpenBSD has a better solution: its installer lays out multiple file systems in an intelligent fashion. The precise number of file systems it creates depends on the amount of total space, but the example OpenBSD partitioning table above shows what I got with one 300 GB disk and 3 GB of RAM.
Red Hat's single large file system is easy to expand later. Add more disks, make them physical volumes with pvcreate, then extend the volume group with vgextend, extend the lv_root logical volume with lvextend, and grow the file system with resize2fs.
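The four steps above can be sketched as a dry run that only prints the commands. Everything specific here is an assumption for illustration: /dev/sdb as the new disk, vg_example as the volume group name, and an ext2/3/4 root file system (which is what resize2fs handles). The run and grow_root helpers are hypothetical names.

```shell
#!/bin/sh
# Dry-run sketch of growing the root file system after adding a disk.
# ASSUMPTIONS: /dev/sdb is the new disk, vg_example is the volume group,
# and the file system is ext2/3/4. Adjust all of these for a real system.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

grow_root() {
    new_disk="$1"; vg="$2"; lv="$3"
    run pvcreate "$new_disk"                  # make the disk a physical volume
    run vgextend "$vg" "$new_disk"            # add it to the volume group
    run lvextend -l +100%FREE "/dev/$vg/$lv"  # give the LV all free extents
    run resize2fs "/dev/$vg/$lv"              # grow the file system to match
}

grow_root /dev/sdb vg_example lv_root
```

Printing first and executing only after review is a deliberate choice here, since each of these commands changes on-disk metadata.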
However, security can be improved with only a slight increase in complexity.
Improve your system security by splitting the storage into multiple file systems and mounting some of them in more cautious modes. These changes make it harder for an intruder or an insider to elevate their privileges.
First, though, realize that some things must be in the root file system. It must have /dev, /etc, /lib*, /bin, /sbin, and now much of /usr to boot. But other than those, other subdirectories can be on independent file systems. Many of those cases can lead to increased security.
Only /dev and therefore the / file system needs device-special files. Mount all file systems other than / with the nodev option.
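Find all your SETUID/SETGID programs with this command:
# find / -type f \( -perm -4000 -o -perm -2000 \)
Now, get the list of the directories containing those. The parentheses matter here: without them, -exec binds only to the second -perm test and the SETUID files are silently skipped.
# find / -type f \( -perm -4000 -o -perm -2000 \) -exec dirname {} \; | sort -u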
You will probably discover that just these directories contain those files:
/bin /sbin /usr/X11R6 /usr/bin /usr/sbin
Depending on what you have added, you may also find some in /usr/local/bin and /usr/local/sbin. Mount all the other file systems with the nosuid option.
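When find combines two -perm tests with -o and then runs -exec, operator precedence decides which files reach the action. The sketch below shows the difference on a scratch directory it creates itself; the file and directory names are invented for the demonstration and nothing outside the scratch area is touched.

```shell
#!/bin/sh
# Demonstrate on a scratch tree why the two -perm tests need grouping.
scratch=$(mktemp -d)
mkdir "$scratch/a" "$scratch/b" "$scratch/c"
touch "$scratch/a/suid_prog" "$scratch/b/sgid_prog" "$scratch/c/plain_prog"
chmod 4755 "$scratch/a/suid_prog"   # SETUID
chmod 2755 "$scratch/b/sgid_prog"   # SETGID
chmod 0755 "$scratch/c/plain_prog"  # neither

# Grouped: -exec applies to files matching either test.
grouped=$(find "$scratch" -type f \( -perm -4000 -o -perm -2000 \) \
              -exec dirname {} \; | sort -u)

# Ungrouped: -exec binds only to the second test, so the directory
# holding the SETUID file is missing from the result.
ungrouped=$(find "$scratch" -type f -perm -4000 -o -perm -2000 \
                -exec dirname {} \; | sort -u)

echo "grouped:";   echo "$grouped"
echo "ungrouped:"; echo "$ungrouped"
rm -rf "$scratch"
```

The grouped form should list both the a and b directories, while the ungrouped form lists only b.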
Mount all file systems that do not need to contain executable files (probably /tmp, /var and /usr/src in this case) with the noexec option.
If you are quite certain that your users do not need executable programs, you could also mount /home with the noexec option. If, for example, you are building a Samba file server where the users' files are all remotely accessed from an operating system that will not interpret the Unix permissions anyway, this would be safe. But realize that you may confuse and frustrate your more advanced users by mounting their home directories this way. Both chmod and ls work as expected, allowing you to turn on and verify the execute permission, but the kernel mysteriously refuses to allow those programs to run.
# cp /bin/date /usr/src
# /usr/src/date
Wed Jan 08 12:01:02 UTC 2025
# umount /usr/src
# mount -o noexec /usr/src
# /usr/src/date
/usr/src/date: Permission denied.
# ls -l /usr/src/date
-r-xr-xr-x  1 root  wsrc  131528 Nov 18 13:25 /usr/src/date

It looks like it should work! Let's make sure:

# chmod +x /usr/src/date
# ls -l /usr/src/date
-r-xr-xr-x  1 root  wsrc  131528 Nov 18 13:25 /usr/src/date
# id
uid=0(root) gid=0(wheel) groups=0(wheel), 2(kmem), 3(sys), 4(tty), 5(operator), 20(staff)

That looks good, and I am root. Let's try again:

# /usr/src/date
/usr/src/date: Permission denied.

Confusion deepens until it occurs to you to try the mount command.
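One way to shortcut that confusion is to look up the mount options for the file system in question. The sketch below reads a mount table in the Linux /proc/mounts layout (device, mount point, type, options); that layout is an assumption, and on OpenBSD and other systems you would read the output of the mount command instead. The helper names mount_options and has_option are hypothetical.

```shell
#!/bin/sh
# Report the mount options for a given mount point, reading a table in
# /proc/mounts format (device, mount point, type, options, ...) on stdin.
# If a mount point appears more than once, the last entry wins.
mount_options() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }'
}

# has_option succeeds if a comma-separated option list contains a word.
has_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# On Linux, check whether /tmp is mounted noexec. Prints nothing if /tmp
# is not a separate mount point or /proc/mounts is unavailable.
if [ -r /proc/mounts ]; then
    opts=$(mount_options /tmp < /proc/mounts)
    if has_option "$opts" noexec; then
        echo "/tmp is mounted noexec"
    fi
fi
```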
Designing a Partitioning Scheme
How can you design your partitioning scheme in advance, anticipating just how much space you need in the various file systems? That's difficult, but there's no need to go about it the hard way. Here's the easy way:
Do a test installation onto one enormous root file system. Choose the software packages as you will for the final plan, but create just one file system to hold everything.
Decide how you want to partition the system, as a list of file system mount points.
Use the du command to see how big each one is. But there's no need to include /tmp or /home in this command as you have not added users or done anything on the system yet.
# du -sh / /usr/X11R6 /usr/local /usr/share /usr/src /var
Now you need to do a little math and a little thinking. For the leaf file systems, the ones that do not themselves contain further mount points, you can directly use the sizes reported from the above command. For the others, subtract the sizes of all the file systems that will be mounted within them. Here is some example output and how to interpret it if all these are to be mount points for file systems:
# du -sh / /usr/X11R6 /usr/local /usr/share /usr/src /var
6.0G    /
200M    /usr/X11R6
100M    /usr/local
4.0G    /usr/share
800M    /usr/src
100M    /var

/ contains all the others, including /var, so on its own it now uses:
6000 - (200 + 100 + 4000 + 800 + 100) = 800 MB
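The subtraction can be left to a few lines of awk. This sketch assumes the input is a list of "size path" pairs in one consistent unit (for example from du -sk, or megabytes as here) and that every listed path other than / will be its own file system mounted directly within /, so each one, /var included, is subtracted from the total. root_own_usage is a hypothetical helper name.

```shell
#!/bin/sh
# Compute the space used by / itself: the total for / minus every
# subtree that will become its own file system.
# Input: lines of "<size> <path>", all sizes in the same unit.
# root_own_usage is a hypothetical helper name for this sketch.
root_own_usage() {
    awk '
        $2 == "/" { total = $1; next }
        { nested += $1 }
        END { print total - nested }
    '
}

# Figures from the example listing, in MB.
printf '%s\n' \
    '6000 /' '200 /usr/X11R6' '100 /usr/local' \
    '4000 /usr/share' '800 /usr/src' '100 /var' | root_own_usage
```

If one planned file system were nested inside another non-root one, the inner size would need to be subtracted from its parent rather than from /, which this simple version does not handle.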
Give your system plenty of room to grow! Disks are cheap, and we no longer try to specify things down to just barely a few tens of megabytes over the bare minimum.
You may need quite a bit of room in /tmp but this depends on the set of programs run on your system. Logging will make /var grow quite a bit, although log rotation will impose some limit on that. You need plenty of space in /var to spool print jobs, and mail can grow without bounds. Adding more software packages will grow their destination file systems: / and /usr/share for further components of your distribution, and either /usr/local or /usr/X11R6 plus /usr/share for add-ons. The packages that go into /opt will grow the / file system in this example plan.
Performance Considerations
Spread those file systems across multiple physical devices. Have no more than one busy partition (file system) per disk. LVM (Logical Volume Management) can have an effect similar to software RAID 0, meaning striping but no redundancy, when a logical volume is set up striped across multiple disks from the beginning. If instead you start with a single-disk logical volume, fill it, add a second disk, fill that, and add a third, the existing files are not re-striped across the added disks.
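For reference, striping is requested when the logical volume is created. The sketch below only prints the command rather than running it. The volume group vg_example, the volume name lv_data, the 100G size, and the 64 KB stripe size are all made-up values, and the volume group is assumed to already contain at least as many physical volumes as stripes. striped_lv_command is a hypothetical helper name.

```shell
#!/bin/sh
# Dry-run sketch: print (not run) the command that would create a
# striped logical volume. All names and sizes are illustrative only.
striped_lv_command() {
    stripes="$1"; size="$2"; name="$3"; vg="$4"
    # -i sets the number of stripes, -I the stripe size in kilobytes.
    echo "lvcreate -i $stripes -I 64 -L $size -n $name $vg"
}

striped_lv_command 3 100G lv_data vg_example
```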
For guidance on selecting hardware and tuning file system I/O see: Linux Performance Tuning
How To Specify These Options
Read your manual pages for the mount command and your /etc/fstab file (on Solaris, it is /etc/vfstab). The OpenBSD installer will automatically do something very reasonable:
# cat /etc/fstab
/dev/sd0a / ffs rw 1 1
/dev/sd0j /home ffs rw,nodev,nosuid 1 2
/dev/sd0d /tmp ffs rw,nodev,nosuid 1 2
/dev/sd0f /usr ffs rw,nodev 1 2
/dev/sd0g /usr/X11R6 ffs rw,nodev 1 2
/dev/sd0h /usr/local ffs rw,nodev 1 2
/dev/sd0i /usr/src ffs rw,nodev,nosuid 1 2
/dev/sd0e /var ffs rw,nodev,nosuid 1 2
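A table like this is easy to audit against the advice given earlier. The following sketch lists the mount points, other than / and swap, whose option field lacks a given option. It assumes the usual fstab column order (device, mount point, type, options); Solaris vfstab uses a different layout and would need different field numbers. missing_option is a hypothetical helper name.

```shell
#!/bin/sh
# List file systems in an fstab-format table (read from stdin) that are
# not the root file system or swap and that lack a given mount option.
# missing_option is a hypothetical helper name for this sketch.
missing_option() {
    awk -v want="$1" '
        /^[ \t]*#/ || NF < 4 { next }       # skip comments and short lines
        $2 == "/" || $3 == "swap" { next }  # root and swap are exempt
        {
            n = split($4, opts, ",")
            found = 0
            for (i = 1; i <= n; i++) if (opts[i] == want) found = 1
            if (!found) print $2
        }
    '
}

# Report mount points on this system that are not mounted nodev.
if [ -r /etc/fstab ]; then
    missing_option nodev < /etc/fstab
fi
```

Running it with nosuid or noexec instead of nodev gives a starting list of file systems to review, keeping in mind that some of them legitimately need those capabilities.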