
Linux /dev/random and Other Sources of Entropy

Random Data on Linux
(And Other UNIX-Family Operating Systems)

A number of scientific, engineering, and cybersecurity tasks need random data. Scientific applications need it for Monte Carlo methods that simulate complex physical processes. Engineers need random noise signals in order to test digital signal processing techniques. Finally, several cryptographic tasks need unpredictable, and thus unguessable, data. These include long-term RSA and ECC key pairs for SSH and PGP, one-use-only session keys for encrypting SSH and TLS/SSL connections and for encrypting stored data, and the initialization vectors used for the various chaining and feedback modes of symmetric block ciphers.

The problem is that a computer is a completely deterministic device. If you run the same program multiple times with the same input, it should do the same thing each time. Otherwise it would have a serious problem. The best a program can do through arithmetic alone is to act as a pseudorandom generator, producing a sequence that looks random but is fully determined by its starting state. John von Neumann said "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."

An older way of generating a pseudorandom sequence on a Unix-family operating system (Linux, BSD, Apple OS X, Solaris, etc.) is to first seed the sequence generator with srand() and then repeatedly call rand() to obtain the sequence. The problem is that the output is too regular.

The GNU manual page describes these library calls bluntly:

NAME
     rand, srand - bad random number generator

SYNOPSIS
     #include <stdlib.h>

     void srand(unsigned int seed);
     int rand(void);
     int rand_r(unsigned int *seed);

     [...]

As the GNU manual page explains, rand() and srand() first appeared in Version 3 AT&T UNIX and conform to ANSI C89 (ANSI X3.159-1989). The low dozen bits go through a cyclic pattern. Things were different then.
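The cyclic pattern in the low bits is easy to reproduce with a toy linear congruential generator. The sketch below is not the code of any particular libc. It borrows the multiplier and increment from the sample rand() shown in the C standard, and the choice to reduce modulo 2^31 and return the whole state is an assumption made purely for illustration.

```python
# Illustrative linear congruential generator (LCG).
# A and C are the constants from the C standard's sample rand();
# the modulus 2**31 is an assumption chosen for this demonstration.
A = 1103515245
C = 12345
M = 2**31


def lcg(seed, count):
    """Return `count` successive LCG states starting from `seed`."""
    values = []
    state = seed
    for _ in range(count):
        state = (A * state + C) % M
        values.append(state)
    return values


values = lcg(seed=1, count=8192)

# With an odd multiplier and an odd increment, the lowest bit just alternates.
low_bits = [v & 1 for v in values[:8]]
print(low_bits)

# The low 12 bits repeat after 2**12 = 4096 steps, whatever the upper bits do.
period_ok = all(
    (values[i] & 0xFFF) == (values[i + 4096] & 0xFFF) for i in range(4096)
)
print(period_ok)
```

Any program that takes such a value modulo a small power of two, for example to pick one of 16 buckets, inherits that short cycle directly.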

The Solaris manual page is a little more kind but still makes the point that these old functions are not good:

USAGE
     The  spectral  properties  of rand() are limited.  The
     drand48(3C) function provides a better, more elaborate
     random-number generator.

The functions srandom() and random() seed and then generate a sequence with much better characteristics. The GNU manual page for random() explains that it uses a non-linear additive feedback random number generator with a period of approximately 16×(2^31-1) or 34,359,738,352.

For Monte Carlo simulations or digital signal processing, you just need pseudorandom data with the desired distribution. In fact, pseudorandom data may be preferred because you get the same sequence every time you start with the same seed, which makes a run reproducible.
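Here is a small sketch of that reproducibility, using Python's standard random module rather than the C library's srandom() and random(). Python's generator is a Mersenne Twister, so the numbers differ from what the C functions would give, but the seeding idea is the same. The function name estimate_pi and the seed value are made up for this example.

```python
import random


def estimate_pi(seed, samples=100_000):
    """Monte Carlo estimate of pi from points in the unit square."""
    rng = random.Random(seed)  # private generator, independent of global state
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


first = estimate_pi(seed=42)
second = estimate_pi(seed=42)
print(first, first == second)
```

Two runs with the same seed produce exactly the same estimate, which is what you want when debugging a simulation and exactly what you do not want when generating a key.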

However, security applications need truly unpredictable random numbers for purposes including the generation of cryptographic keys. This leads to the concept of a cryptographically strong pseudorandom number generator, something that makes it adequately difficult to predict the next values even after observing the sequence so far. These unpredictable sequences could be used to generate long-term SSH keys for servers and users, SSL keys for servers, or the session keys used to encrypt sensitive files or e-mail messages.
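In application code, the usual way to get such data is to ask the operating system for it rather than to build your own generator. A minimal sketch in Python, using the standard secrets module, which draws from the operating system's random source. The key and IV sizes are assumptions picked for illustration (a 256-bit key and a 128-bit IV).

```python
import secrets

# 32 bytes = 256 bits of key material from the operating system's CSPRNG.
session_key = secrets.token_bytes(32)

# 16 bytes = 128 bits, the block size of a cipher such as AES.
iv = secrets.token_bytes(16)

print(session_key.hex())
print(iv.hex())
```

There is no seed to set here, and that is the point: nothing in the program determines the output.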

A more mundane (and therefore frequently overlooked) need is for unpredictable TCP initial sequence numbers and DNS transaction ID numbers. The TCP risks were pointed out by Robert Morris in 1985 and Steven Bellovin in 1989, but we still had problems into the 2000s with operating systems implementing TCP in a way that allowed attackers to hijack connections. The DNS problems are more recent, with RFC 5452 suggesting some interesting extensions.
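To make the sizes concrete: a DNS transaction ID is a 16-bit field and a TCP sequence number is 32 bits. The sketch below only shows drawing values of those sizes from an unpredictable source. Real TCP stacks choose initial sequence numbers inside the kernel, so this is an illustration of the requirement and not how any operating system implements it.

```python
import secrets

# 16-bit DNS transaction ID, uniformly chosen from 0..65535.
dns_transaction_id = secrets.randbelow(2**16)

# 32-bit value covering the TCP sequence number space.
tcp_initial_sequence = secrets.randbits(32)

print(f"DNS ID 0x{dns_transaction_id:04x}, ISN {tcp_initial_sequence}")
```

With only 16 bits to guess, the transaction ID alone is a thin defense against forged answers.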

Random Devices

Linux was the first operating system to include a pseudo-device producing pseudorandom data seeded by sources of entropy or true randomness. The Solaris manual page for the random device says: "An implementation of the /dev/random and /dev/urandom kernel-based random number generator first appeared in Linux 1.3.30." Other Unix-family operating systems have since added them. The Linux random(4) manual page describes these pseudo-devices as follows:

The random number generator gathers environmental noise from device drivers and other sources into an entropy pool. The generator also keeps an estimate of the number of bits of noise in the entropy pool. From this entropy pool random numbers are created.

When read, the /dev/random device will only return random bytes within the estimated number of bits of noise in the entropy pool. /dev/random should be suitable for uses that need very high quality randomness such as one-time pad or key generation. When the entropy pool is empty, reads from /dev/random will block until additional environmental noise is gathered.

A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current unclassified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.

That same manual page continues with some guidelines for using these kernel features:

If you are unsure about whether you should use /dev/random or /dev/urandom, then probably you want to use the latter. As a general rule, /dev/urandom should be used for everything except long-lived GPG/SSL/SSH keys.

[...]

The amount of seed material required to generate a cryptographic key equals the effective key size of the key. For example, a 3072-bit RSA or Diffie-Hellman private key has an effective key size of 128 bits (it requires about 2^128 operations to break) so a key generator only needs 128 bits (16 bytes) of seed material from /dev/random.

While some safety margin above that minimum is reasonable, as a guard against flaws in the CPRNG algorithm, no cryptographic primitive available today can hope to promise more than 256 bits of security, so if any program reads more than 256 bits (32 bytes) from the kernel random pool per invocation, or per reasonable reseed interval (not less than one minute), that should be taken as a sign that its cryptography is not skillfully implemented.
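The arithmetic in that guidance is simple enough to sketch. The helper below is hypothetical. It converts a security level in bits into a byte count, refuses anything above the 256-bit ceiling mentioned in the manual page, and reads that many bytes with os.urandom(), which draws from the kernel's non-blocking random source.

```python
import os

MAX_SECURITY_BITS = 256  # ceiling quoted in the manual page text above


def seed_material(security_bits):
    """Return just enough kernel random bytes for the given security level."""
    if not 0 < security_bits <= MAX_SECURITY_BITS:
        raise ValueError(f"security level must be 1..{MAX_SECURITY_BITS} bits")
    nbytes = (security_bits + 7) // 8  # round up to whole bytes
    return os.urandom(nbytes)


# A 3072-bit RSA key has an effective key size of 128 bits, so 16 bytes.
seed = seed_material(128)
print(len(seed))
```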

Devices and Kernel Data Structures

The Linux kernel data structures in /proc/sys/kernel/random/* provide an additional interface to the /dev/random device. The read and write wakeup thresholds can be changed by writing to those files; the other values are read-only. All can be read by cat or sysctl.

boot_id                 Random string generated at boot time.
entropy_avail           The number of bits of available entropy.
poolsize                The size of the entropy pool, the maximum size of entropy_avail.
read_wakeup_threshold   The number of bits of entropy required for waking up processes that sleep waiting for entropy from /dev/random.
uuid                    Random UUID string generated afresh at each read.
write_wakeup_threshold  The number of bits of entropy below which we wake up processes that do a select() or poll() for write access to /dev/random.
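The same files can be read from a program. This is a rough sketch, assuming the file names listed above. The function name read_random_sysctls is invented for this example, and it quietly skips any file that is missing or unreadable, since the set of files can vary between kernel versions and the directory does not exist at all on other operating systems.

```python
from pathlib import Path

RANDOM_SYSCTL_DIR = "/proc/sys/kernel/random"
FIELDS = (
    "boot_id",
    "entropy_avail",
    "poolsize",
    "read_wakeup_threshold",
    "uuid",
    "write_wakeup_threshold",
)


def read_random_sysctls(base=RANDOM_SYSCTL_DIR):
    """Return {file name: contents} for the files that can be read."""
    result = {}
    for name in FIELDS:
        try:
            result[name] = (Path(base) / name).read_text().strip()
        except OSError:
            continue  # file missing or unreadable on this system
    return result


for name, value in read_random_sysctls().items():
    print(f"{name}: {value}")
```

Because uuid is generated afresh at each read, two consecutive runs should show different uuid values but the same boot_id.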

Below we check on the random devices available in various UNIX-family operating systems.

% uname -a ; ls -l /dev/*random*
Linux linux.example.org 3.12.0 #1 SMP Sun Nov 3 19:58:07 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
crw-rw-rw- 1 root root 1, 8 Nov  3 20:14 /dev/random
crw-rw-rw- 1 root root 1, 9 Nov  3 20:14 /dev/urandom

% ssh raspberrypi 'uname -a ; ls -l /dev/*random*'
Linux raspberrypi 3.6.11+ #538 PREEMPT Fri Aug 30 20:42:08 BST 2013 armv6l GNU/Linux
crw-rw-rw- 1 root root 1, 8 Dec 31  1969 /dev/random
crw-rw-rw- 1 root root 1, 9 Dec 31  1969 /dev/urandom

% ssh openbsd 'uname -a ; ls -l /dev/*random*'
OpenBSD openbsd.example.org 5.4 GENERIC#0 amd64
crw-r--r--  1 root  wheel   45,   3 Nov  8 09:05 /dev/arandom
crw-r--r--  1 root  wheel   45,   0 Nov  2 08:27 /dev/random
crw-r--r--  1 root  wheel   45,   1 Nov  2 08:27 /dev/srandom
crw-r--r--  1 root  wheel   45,   2 Nov  2 08:27 /dev/urandom

% ssh solaris 'uname -a ; ls -l /dev/*random* ; ls -lL /dev/*random*'
SunOS solaris.example.org 5.10 Generic_148888-01 sun4u sparc SUNW,Sun-Fire-V440
lrwxrwxrwx   1 root  root       33 May  3  2012 /dev/random -> ../devices/pseudo/random@0:random
lrwxrwxrwx   1 root  root       34 May  3  2012 /dev/urandom -> ../devices/pseudo/random@0:urandom
crw-r--r--   1 root  sys   190,  0 Mar 13  2013 /dev/random
crw-r--r--   1 root  sys   190,  1 Nov  8 15:07 /dev/urandom

All have random and urandom devices. OpenBSD is the odd one with its additional arandom and srandom. All of the OpenBSD devices have unique minor device numbers, but I think that they all use the same underlying arc4random algorithm. All four on OpenBSD are very fast and highly random.

Hardware Random-Number Generators

If your CPU or motherboard has a hardware random number generator, the corresponding Linux kernel module can create a random device.

The screenshot shows the graphical configuration tool used to define a kernel build. See my kernel building page for details on configuring, building, and installing a custom kernel.

The build configuration process is hardware specific. Here you see a kernel build being configured on AMD64 hardware, where these five hardware RNG devices may be found.

Selecting hardware random number generator support under Device drivers ⇒ Character devices in the Linux kernel build configuration.

The result is one or more kernel modules in /lib/modules/release/kernel/drivers/char/hw_random, including the following on x86_64 (Intel 64/AMD64) platforms:

% ls -l /lib/modules/`uname -r`/kernel/drivers/char/hw_random
total 88
drwxr-xr-x 2 root root  4096 Nov  3 20:02 ./
drwxr-xr-x 7 root root  4096 Nov  3 20:02 ../
-rw-r--r-- 1 root root  8080 Nov  3 20:02 amd-rng.ko
-rw-r--r-- 1 root root 13432 Nov  3 20:02 intel-rng.ko
-rw-r--r-- 1 root root 13512 Nov  3 20:02 rng-core.ko
-rw-r--r-- 1 root root 10952 Nov  3 20:02 timeriomem-rng.ko
-rw-r--r-- 1 root root  4560 Nov  3 20:02 tpm-rng.ko
-rw-r--r-- 1 root root  7424 Nov  3 20:02 via-rng.ko
-rw-r--r-- 1 root root  9000 Nov  3 20:02 virtio-rng.ko

The Raspberry Pi platform is based on the Broadcom BCM2835 system-on-a-chip with a low-power ARM1176JZ-F processor and a hardware random number generator. The bcm2708_rng kernel module detects and handles the hardware random number generator, creating device node hwrng:

# ls -l /lib/modules/`uname -r`/kernel/drivers/char/hw_random
-rw-r--r-- 1 root root 4752 Jun  1 12:02 bcm-2708-rng.ko
# ls -l /dev/*rng*
ls: cannot access /dev/*rng*: No such file or directory
# modprobe bcm2708_rng
# dmesg | tail
[....]
[ 8035.084620] bcm2708_rng_init=dc8d6000
# ls -l /dev/*rng*
crw------- 1 root root 10, 183 Nov  8 16:05 /dev/hwrng

Add the rng-tools package to fully take advantage of the hardware random number generator. You will also need to add the kernel module bcm2708_rng to the list of automatically loaded modules in /etc/modules.

# apt-get install rng-tools
# echo bcm2708_rng >> /etc/modules

On the Pidora distribution, Fedora ported to the Raspberry Pi, the driver is built into the monolithic kernel so there is no separate loadable module.

Raspberry Pi small Linux system.

IC2 is the SoC and RAM. It's the large module (12.5×12.5 mm) in the center of the board, between the yellow RCA connector and the orange-topped HDMI connector, to the right of the "Raspberry Pi" logo. The Samsung SDRAM is stacked on top of the Broadcom BCM2835 SoC.
IC3 is the combined USB and Ethernet controller. It's the chip between the blue audio connector, the USB connector and the Ethernet connector.

You could edit /etc/default/rng-tools to specify the hardware device, but as a comment in that file warns, you should just leave that commented out so the boot script will know to auto-detect the device.

The daemon will be automatically started after the next boot, although of course you can manually start it right away. On Pidora, this would be a matter of enabling and starting the rngd service with systemctl.

raspbian:# modprobe bcm2708_rng
raspbian:# /etc/init.d/rng-tools restart

pidora:# systemctl enable rngd
pidora:# systemctl start rngd

Once it is running, the rngd daemon reads from the hardware RNG /dev/hwrng and feeds that entropy into /dev/random.

# lsof -p $(pgrep rngd)
COMMAND  PID USER   FD   TYPE     DEVICE SIZE/OFF NODE NAME
rngd    2578 root  cwd    DIR      179,2     4096    2 /
rngd    2578 root  rtd    DIR      179,2     4096    2 /
rngd    2578 root  txt    REG      179,2    35536 1594 /usr/sbin/rngd
rngd    2578 root  mem    REG      179,2  1196144 9107 /lib/arm-linux-gnueabihf/libc-2.13.so
rngd    2578 root  mem    REG      179,2   116462 9093 /lib/arm-linux-gnueabihf/libpthread-2.13.so
rngd    2578 root  mem    REG      179,2    10170 1777 /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so
rngd    2578 root  mem    REG      179,2   126236 9095 /lib/arm-linux-gnueabihf/ld-2.13.so
rngd    2578 root    0u   CHR        1,3      0t0   18 /dev/null
rngd    2578 root    1u   CHR        1,3      0t0   18 /dev/null
rngd    2578 root    2u   CHR        1,3      0t0   18 /dev/null
rngd    2578 root    3r   CHR     10,183      0t0 1474 /dev/hwrng
rngd    2578 root    4u   CHR        1,8      0t0   21 /dev/random
rngd    2578 root    5u   REG       0,12        5 2947 /run/rngd.pid
rngd    2578 root    6u  unix 0xd9fb4780      0t0 3479 socket

Broadcom has not released any detailed documentation on their hardware random number generator, but this is better than nothing. It shouldn't make things any worse, because it is just being used as another source of entropy by the Linux kernel, and it should make things better. The Raspberry Pi does not have any traditional disk controllers, leaving it without the typical good sources of entropy.

It would make sense that the Broadcom hardware device works somewhat like the urandom device, generating output even when it has run low on entropy and the result becomes less random. Broadcom designed the device for use in telephone handsets, generating GSM and 3G/4G session keys. Users would not find it acceptable to have to wait through mysterious math-based delays before placing calls.

The rngd daemon sends its collected statistics to syslog every ten minutes and when it shuts down. Here is an example:

Nov  8 17:03:38 raspberrypi rngd[6032]: stats: bits received from HRNG source: 9140064
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: bits sent to kernel pool: 9099968
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: entropy added to kernel pool: 9099968
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2 successes: 457
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2 failures: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2(2001-10-10) Monobit: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2(2001-10-10) Poker: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2(2001-10-10) Runs: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2(2001-10-10) Long run: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: HRNG source speed: (min=300.573; avg=517.059; max=796.154)Kibits/s
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS tests speed: (min=638.109; avg=5811.021; max=6359.899)Kibits/s
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: Lowest ready-buffers level: 0
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: Entropy starvations: 438
Nov  8 17:03:38 raspberrypi rngd[6032]: stats: Time spent starving for entropy: (min=4195; avg=15914.203; max=32138)us
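If you want to track these numbers over time, the syslog lines are regular enough to parse. The sketch below assumes the line layout shown in the example above. The function name and the regular expression are my own, and other rngd versions may format their statistics differently.

```python
import re

# Matches "... rngd[PID]: stats: <name>: <value>" as in the log above.
STATS_LINE = re.compile(r"rngd\[\d+\]: stats: (?P<name>.+): (?P<value>.+)$")


def parse_rngd_stats(lines):
    """Return {statistic name: value string} from rngd syslog lines."""
    stats = {}
    for line in lines:
        match = STATS_LINE.search(line.strip())
        if match:
            stats[match.group("name")] = match.group("value")
    return stats


sample_log = [
    "Nov  8 17:03:38 raspberrypi rngd[6032]: stats: bits received from HRNG source: 9140064",
    "Nov  8 17:03:38 raspberrypi rngd[6032]: stats: bits sent to kernel pool: 9099968",
    "Nov  8 17:03:38 raspberrypi rngd[6032]: stats: FIPS 140-2 failures: 0",
]
stats = parse_rngd_stats(sample_log)
received = int(stats["bits received from HRNG source"])
sent = int(stats["bits sent to kernel pool"])
print(received - sent)
```

For the example above the difference is 40096 bits. With zero FIPS failures reported, my guess is that those bits were still waiting in rngd's buffers when the statistics were written, but the log itself does not say.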

How Random is the Result?

Analyze your random data with the ent program from John Walker, the founder of Autodesk and co-author of AutoCAD.

This table shows the results for a 1-megabyte file from each source, collected this way:

# dd if=/dev/name bs=1024 count=1024 

On Linux add the option iflag=fullblock, and on Linux on x86_64 be ready to wait for a long time for the random device. It took several hours to collect one megabyte.
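The reason for iflag=fullblock is that a read from a device can return fewer bytes than were requested, so counting blocks does not guarantee a full megabyte. The same collection can be sketched in Python by looping until the requested number of bytes has arrived. The function name read_exact is made up for this example, and /dev/urandom is used in the demonstration only because it will not block.

```python
import os


def read_exact(path, nbytes, chunk_size=1024):
    """Read exactly `nbytes` from `path`, tolerating short reads."""
    data = bytearray()
    with open(path, "rb", buffering=0) as source:
        while len(data) < nbytes:
            chunk = source.read(min(chunk_size, nbytes - len(data)))
            if not chunk:
                raise EOFError(f"{path} ended after {len(data)} bytes")
            data.extend(chunk)
    return bytes(data)


if os.path.exists("/dev/urandom"):
    sample = read_exact("/dev/urandom", 1024 * 1024)
    print(len(sample))
```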

However, see the above discussion of how much random data is really needed. A one-time pad is the only perfectly secure cipher, but it is an enormous bother to use one. If you really care enough to use a one-time pad, then it makes little sense to use a program (including kernel modules) to generate it. To really do randomness correctly, use physics. The Australian National University has built a quantum optics random number generator and you can even download a unique live random number stream from their system.

Source                       Entropy      Chi-square  Chi-square  Serial correlation
                             (bits/byte)  value       percentage  coefficient
Linux x86_64 random          7.999828     250.53      56.73%       0.000640
Linux x86_64 urandom         7.999826     253.41      51.64%       0.002495
Linux ARM random             7.999826     252.74      52.83%      -0.000071
Linux ARM urandom            7.999834     242.10      70.94%      -0.001153
OpenBSD amd64 random         7.999835     239.70      74.60%       0.000505
OpenBSD amd64 urandom        7.999822     259.16      41.58%      -0.000447
OpenBSD amd64 arandom        7.999822     259.12      41.65%       0.000647
OpenBSD amd64 srandom        7.999823     256.65      45.93%       0.000569
Solaris sun4u sparc random   7.999831     246.24      64.16%      -0.000271
Solaris sun4u sparc urandom  7.999847     222.58      92.94%       0.000538
Ideal                        8.0                      50%          0.0
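To get a feel for what the columns measure, here is a rough Python sketch of three of the statistics. It is not ent's code, and its serial correlation is a plain lag-one Pearson correlation between consecutive bytes, which may differ slightly from the formula ent uses. The chi-square percentage is left out because it needs the chi-square distribution function, which is not in the Python standard library.

```python
import math
import os
from collections import Counter


def byte_entropy(data):
    """Shannon entropy in bits per byte (8.0 for perfectly uniform bytes)."""
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def chi_square(data):
    """Chi-square statistic against a uniform distribution over 256 values."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def serial_correlation(data):
    """Lag-one correlation between each byte and the next (NaN if undefined)."""
    xs, ys = data[:-1], data[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan  # constant data has no defined correlation
    return cov / math.sqrt(var_x * var_y)


sample = os.urandom(1024 * 1024)  # one megabyte, as in the table
print(f"entropy     {byte_entropy(sample):.6f} bits/byte")
print(f"chi-square  {chi_square(sample):.2f}")
print(f"serial corr {serial_correlation(sample):+.6f}")
```

With 256 possible byte values the chi-square statistic has 255 degrees of freedom, so values in the neighborhood of 255 are what a good source should produce, which matches the table.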

For deeper details, see the October 2013 paper "Security Analysis of Pseudo-Random Number Generators with Input: /dev/random is not Robust", by Yevgeniy Dodis, David Pointcheval, Sylvain Ruhault, Damien Vergnaud, and Daniel Wichs. Its abstract reads:

A pseudo-random number generator (PRNG) is a deterministic algorithm that produces numbers whose distribution is indistinguishable from uniform. A formal security model for PRNGs with input was proposed in 2005 by Barak and Halevi (BH). This model involves an internal state that is refreshed with a (potentially biased) external random source, and a cryptographic function that outputs random numbers from the continually internal state. In this work we extend the BH model to also include a new security property capturing how it should accumulate the entropy of the input data into the internal state after state compromise. This property states that a good PRNG should be able to eventually recover from compromise even if the entropy is injected into the system at a very slow pace, and expresses the real-life expected behavior of existing PRNG designs. Unfortunately, we show that neither the model nor the specific PRNG construction proposed by Barak and Halevi meet this new property, despite meeting a weaker robustness notion introduced by BH. From a practical side, we also give a precise assessment of the security of the two Linux PRNGs, /dev/random and /dev/urandom. In particular, we show several attacks proving that these PRNGs are not robust according to our definition, and do not accumulate entropy properly. These attacks are due to the vulnerabilities of the entropy estimator and the internal mixing function of the Linux PRNGs. These attacks against the Linux PRNG show that it does not satisfy the "robustness" notion of security, but it remains unclear if these attacks lead to actual exploitable vulnerabilities in practice. Finally, we propose a simple and very efficient PRNG construction that is provably robust in our new and stronger adversarial model. 
We present benchmarks between this construction and the Linux PRNG that show that this construction is on average more efficient when recovering from a compromised internal state and when generating cryptographic keys. We therefore recommend to use this construction whenever a PRNG with input is used for cryptography.

Read the wonderful reader comments about the RAND book, the best content on all of Amazon.com.

See RFC 4086, Randomness Requirements for Security, for background on this topic.
