Rotors of M-209 cipher machine.

Ciphers — Encryption and Decryption

Encrypting and Decrypting with Ciphers

We need to protect the confidentiality of sensitive data, so we must encrypt the data before storing or transmitting it. Ciphers are the algorithms (that is, the detailed methods) used to do the work. They once were pencil-and-paper methods, then they became electromechanical, and now they are computer programs. A cipher always works in the same general way, but a key modifies the details so that every file or message can be protected in its own unique way.

Symmetric encryption.

The process starts with the plaintext that is to be protected. The plaintext was limited to very simple text for most of the history of cryptography. With the creation of electrical machine cryptography, the plaintext was often limited to being expressed in a very constrained alphabet — no digits, just letters, sometimes with relatively unneeded characters left out (Q, or J in some languages), and with no distinction of upper versus lower case. The plaintext alphabet was usually more constrained than the Morse code or other telegraphy used to transmit the ciphertext.

With the cultural shift in the use of computer network communication, the plaintext these days is often non-text data — multimedia such as image, sound, and video files; complex and bulky non-ASCII file formats such as those used by Libre Office and Microsoft Office; or even an archive containing an entire directory structure of multiple files of various data types.

However, we do not care about the internal format, if any, and the intended use. It's all simply plaintext to be protected.

The encryption operation uses some cipher algorithm E that always functions with the same general principles. However, its operation in this case is modified by the encryption key, Ke.

The resulting ciphertext is then transmitted across a network or stored on some media, in both cases with the possibility that an attacker will be able to get a copy of the ciphertext.

At the far end of the communication channel, or in the future when we want to recover the plaintext from the stored ciphertext, a corresponding decryption operation D applies the appropriate reverse algorithm, as directed by the decryption key, Kd.

Auguste Kerckhoffs mathematically proved that the security of a cryptosystem must not depend on keeping its algorithm secret [See his article "La cryptographie militaire", in Journal des sciences militaries, vol IX, pp 5-38, Jan 1883]. That means that an attacker will know the details of all the useful ciphers, and will be able to guess which one we are using. If we chose a good cipher algorithm E, we can feel comfortable about the security as long as:

1: We choose our keys wisely — an attacker has no advantage beyond simply guessing as to what the needed decryption key Kd might be. The keys would appear to be random strings of bits.

2: We maintain the secrecy of keys — the difficulty is getting the decryption key to the remote receiver, or safely storing it until needed to decrypt stored data.

3: We realize that we only have security for a limited amount of time — there is a finite number of possible keys, and an attacker could try all possible keys while monitoring the output for data that appears to be plaintext. So:

Choose a cipher with a large enough key space (that is, total number of possible keys) to keep an attacker fruitlessly searching until the plaintext is no longer sensitive.

Choose a cipher that, as far as we know now, has no flaws leaking partial information about the key or allowing an attacker to eliminate part of the brute-force search through the key space.

Realize that captured ciphertext never gets stronger over time, but cryptanalytic knowledge gets better and computers get much faster.

Auguste Kerckhoffs, the Dutch linguist and cryptographer.

Auguste Kerckhoffs, Dutch linguist and cryptographer.

U.S. Navy photo of the cryptanalyst Joseph Rochefort.

Joseph Rochefort, USN cryptanalyst during WWII.

Initialization Vectors

Let's say that you encrypt some plaintext using today's key. Maybe this is a routine report sent on a fixed schedule: ALL QUIET ON THE WESTERN FRONT. Later in the day, therefore using the same key, you need to encrypt the same plaintext message. The problem is that the two ciphertexts will be identical.

An attacker could use traffic analysis to spot the identical ciphertexts and correlate that with observations. This was used by the U.S. Navy cryptanalysis effort during World War II in the Pacific Theater to derive information about planned Japanese ship movements when the cipher or code itself could not yet be broken.

The fix to this is to add an initialization vector or IV to the plaintext to randomize the input to the encryption operation. Ciphers can be used in various chaining modes, where the operation applied to some block of input depends not only on the key but also on the preceding block of input (plaintext) or output (ciphertext). The IV therefore primes the cipher, getting it into some random state before encrypting that identical plaintext with today's fixed key.

The IV needs to be different for every message sent, but it does not need to be kept secret in a well-designed cryptosystem. So, if messages need to be sent no more often than once per second, a useful IV might be the number of seconds elapsed since some epoch, perhaps since 00:00:00 UTC 1 January 1970 (as time is reckoned in Unix), or since 00:00:00 UTC 1 January 2000. Or, if the messages to be protected are rapidly streamed network packets, the number of microseconds elapsed since midnight today.

Substitution and Transposition

Substitution ciphers replace letters, or bit patterns, or words. A trivial example is the Caesar Cipher, which replaces each letter with the letter N positions later or earlier in the alphabet. For a shift of +3, A becomes D, B becomes E, C becomes F, and so on.

Cleartext:
THE CAESAR CIPHER IS TRIVIAL
Ciphertext for N = +3:
WKH FDHVDU FLSKHU LV WULYLDO

The monoalphabetic substitution cipher has an enormous name although it is only slightly less trivial and almost as weak. You arbitrarily rearrange the alphabet to create the mapping between cleartext and ciphertext. You have the illusion of security if you only think about the number of possible mappings:
26! = 1×2×3×...×25×26 = 403,291,461,126,605,635,584,000,000

But, if you know a little about the statistics of the cleartext language, you can start to break these with only about 30 characters of ciphertext. See Edgar Allan Poe's story "The Gold Bug" for a good explanation of how to do this.

Polyalphabetic substitutions like those of Alberti (1467), Trithemius (ca. 1500), and Vigenère (1585) can be broken with pencil and paper attacks known since the mid 1800s.

Transposition ciphers rearrange letters, or words, or bits. For example, let's say that your cleartext is:
CONFIDENTIALITY IS IMPORTANT BUT IT CAN BE DIFFICULT TO ACHIEVE
and your key is:
SECRETKEY

Write down the key, deleting any repeated characters like the second and third "E" in this example:
SECRTKY

Then write the message in by rows, padding with random characters as needed to fill the last row:
SECRTKY
CONFIDE
NTIALIT
YISIMPO
RTANTBU
TITCANB
EDIFFIC
ULTTOAC
HIEVEJW

Now take the ciphertext out by columns ordered by the letters of the key: "C" first, then "E", "K", "R", "S", "T", "Y":
NISATITE OTITIDLI DIPBNIAJ FAINCFTV CNYRTEUH ILMTAFOE ETOUBCCW

Break it into groups of five characters each so your Morse code operators can accurately communicate the ciphertext to the other end. Pad again as needed to fill the last group:
NISAT ITEOT ITIDL IDIPB NIAJF AINCF TVCNY RTEUH ILMTA FOEET OUBCC WADSH

That may look as if it would be much more secure, but transposition like this is also quite easy to break. Helen Gaines wrote Cryptanalysis in the 1930s. It explains how to use just pencil and paper to solve pure transposition like this example, as well as combinations of transposition and polyalphabetic substitution.

Today's ciphers use several rounds of combined substitution and transposition on bits.

Symmetric vs Asymmetric Cryptography

Symmetric means that something is the same on both sides, while asymmetric means the opposite, the two sides are different.

For symmetric cryptographic algorithms, Ke = Kd. You are free to pick any appropriately long string of bits to function as the single key for an encryption and the following decryption. Once you have picked an encryption key, you must use exactly that key to decrypt. Unlike what you see in movies and TV shows, you can't tell if you are getting close to guessing the decryption key as long as you are using strong ciphers. Off by just one bit, and the output will be as meaningless as if half the bits had been wrong.

However, beware: some keys might be obvious guesses, and some algorithms have "weak keys", certain types of key patterns that allow enough information to leak into the ciphertext to make it easier to guess the key. Weak keys are often those with regular patterns — lots of consecutive 1s or 0s, or a repeated pattern like 0101010101 or 00110011. The keys need to be hard to guess, and so they should look very random.

For asymmetric cryptographic algorithms, when you encrypt with some K1 you must decrypt with a completely different K2 The keys are generated as pairs. You must use the correct math trick to generate a key pair: K1,K2. You can use either key to encrypt, and then you must decrypt using exactly the other key of the pair.

For useful asymmetric cipher algorithms, it is extremely difficult to derive one key from the other.

Asymmetric encryption.

Using Asymmetric-Key Cryptography

Each person has generated an asymmetric key pair. They call one of their pair of keys the public key and the other the private key. They must treat those keys as their names suggest. The private key must be kept secret, while the public key must be available to everyone else.

Now two people want to communicate with each other. Each of them knows three keys — their pair, and the public key of the other. What do they do? They will encrypt the message with one or more keys, but exactly which key or keys depends on what they want to accomplish:

Method 1 — Sender's private key only — Since the sender's public key is the only decryption key producing a coherent message, the receiver can trust that the sender sent it. Since slight changes in input make big changes in output, this also must have been the same message originally sent. But anyone can read it.

Method 2 — Receiver's public key only — Since the receiver's private key is the only decryption key producing a coherent message, the receiver can trust that no one else could have read it. But anyone could have sent it.

Method 3 — Sender's private key and receiver's public key — The receiver can trust that no one else could read it, that the sender must have sent it, and that it wasn't altered in transit.

How to Use Asymmetric-Key Cryptography to Accomplish Various Security Goals

Let's look at this the other way around: start with the goal and see what is needed.

I want to: Send a message in such a way that I could prove:
"I was the sender and this is precisely the message that I sent."
I must: Digitally sign the message, see a later page for the details.
I need: My private key.
 
I want to: Send a message which only the intended receiver can read.
I must: Encrypt the message.
I need: The receiver's public key.
 
I want to: Read a message which was sent so that only I can read it.
I must: Decrypt the message.
I need: My private key.
 
I want to: Verify the identity of the sender and the integrity of the received message.
I must: Verify the digital signature.
I need: The sender's public key.

Quantum Key Distribution

When we use asymmetric encryption, our security relies on our assumption that certain classes of mathematical problems are very difficult to solve. Discovering the key needed to decrypt the message would require the attack to factor a number with hundreds of digits (RSA) or solve a discrete logarithm problem (ECC). We don't mind making one key of a pair public if it would be hard enough to discover the other.

With symmetric encryption, we assume that an N-bit key cannot be guessed from the 2N possibilities before we no longer care about keeping the secret. But, we must agree on a shared secret.

Quantum Key Distribution or QKD is a physics-based method to securely communicate a randomly selected symmetric key to the other party. See "The security of practical quantum key distribution", Reviews of Modern Physics, 81, pg 1301, 29 Sep 2009, PDF here.

QKD uses single-photon signaling between the legitimate users at either end of a fibre link. An intruder could tap the fibre and monitor the key exchange, but that would corrupt the signal and warn both ends that the key was compromised and must not be used.

Many people think it sounds like science fiction or at least impractical, but it's been in use since 2004. In that year it was used to protect a transaction between the Vienna City Hall and the headquarters of Bank Austria Creditanstalt.

That seems to me to have been a proof of concept organized by the city to support development in a local university. But since 2007 the government of Geneva, Switzerland, has used QKD to protect election results.

In 2007, China began deploying a QKD network in Beijing. A larger network with over 45 nodes began operating in Hefei in 2010. Another large network began operating in Jinan in 2014, joining 28 organizations, including government agencies such as the China Banking Regulatory Commission plus commercial customers including the China Industrial and Commercial Bank, and the Xinhua News Agency. In 2016 China was planning to complete the world's longest QKD network with a 2000-kilometer backbone joining Beijing, Jinan, Hefei, and Shanghai.

In 2010, Japan began operating a QKD network developed by NEC, Toshiba, NTT and others and joining several nodes in Tokyo.

In 2013, Battelle Memorial Institute deployed a QKD network joining its headquarters in Columbus, Ohio, to a manufacturing facility 20 miles away.

In 2015 ETSI, the European Telecommunications Standards Institute, published a number of quantum key distribution specifications.

In 2016, a South Korean QKD network began operating between Seoul, Bundang, and Suwon.

In 2016, China was working on QKD between satellites and ground stations. See "China's quantum space pioneed: We need to explore the unknown", Nature, 13 Jan 2016.

ID Quantique in Geneva, Switzerland, sells QKD systems plus random-number generators based on quantum physics.

Quantum Key Distribution versus Quantum Cryptanalysis

Quantum Key Distribution is used by defenders, to protect secrets.

Quantum Cryptanalysis is used by attackers, to expose secrets.

If you generate a 256-bit symmetric key using quantum physics, and then distribute it with QKD so you know that there was no eavesdropping, you know that the attacker has no advantage in searching the space of 2256 possible keys beyond what is exposed by any weaknesses in the cipher algorithm.

Asymmetric ciphers, on the other hand, base their security on trap-door math problems. These are problems that are believed to be impractical to solve, but for which it is relatively easy to verify whether a proposed solution is correct or not. Examples including factoring a number with hundreds of digits (RSA) or solving a discrete logarithm problem (ECC).

Algorithms have been devised to solve some problems when run on a general-purpose quantum computer — Shor's algorithm for factoring, or Grover's algorithm for solving a black-box problem like finding an unknown symmetric key. Shor's algorithm could solve the factoring problem in polynomial time, an enormous speed-up.

The thing is, no one has explained just how to build a general-purpose quantum computer.

However, in 2015–2016 NSA and NIST worried that such a device might be built in the next ten to twenty years and called for the community to work on post-quantum cryptography.

Post-Quantum Cryptography

The current worry is that a general-purpose quantum computer could quickly solve the trap-door problems. In August, 2015, the NSA announced that ECC wasn't a backup for RSA when facing the threat of quantum computing cryptanalysis, to the point that government agencies and contractors considering a migration from RSA to ECC shouldn't bother.

We need post-quantum or quantum-safe asymmetric ciphers. Several families of algorithms are being explored: lattice-based cryptography, code-base cryptography, multivariate polynomial cryptography, and others.

Note that the first two URLs below are secured with a digital certificate not installed in public distributions of browsers. You can get there by clicking through the warnings, and you likely will want to install the DOD certificates in your browsers.

Commercial National Security Algorithm (CNSA) Suite and Quantum Computing FAQ Changes to CNSA Suite and Quantum Computing Policy NIST Report on Post-Quantum Cryptography "Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer", Peter W. Shor "Cybersecurity in an era with quantum computers: will we be ready?", Michele Mosca "Strengths and Weaknesses of Quantum Computing",
Charles H. Bennett, Ethan Bernstein, Gilles Brassard, and Umesh Vazirani
"Quantum Safe Cryptography and Security: An Introduction, Benefits, Enablers and Challenges",
European Telecommunications Standards Institute
"Quantum resistant public key cryptography: a survey", Ray Perlner and David Cooper

Meanwhile, symmetric ciphers are thought to be relatively safe against quantum computing attacks. Grover's algorithm on a quantum computer would speed up the search for a symmetric key, but only to the point that keys would need to be doubled in length to preserve the same resistance to attack.

Moving On

On to the next page:
One-Time Pads

At this point, you may be ready to move on to the next page, which is about one-time pads.

Specific Ciphers
and Key Length

But to dig deeper, scroll down to read about identity-based encryption, where to learn more about the available ciphers, and how AES and RSA work.

See a later page for specific ciphers and their key lengths.

Identity-Based Encryption

Identity-Based Encryption (IBE) is asymmetric cryptography where the public key is based on some unique public information about the user. Perhaps an electronic mail address, or a telephone number including country code.

There must be a completely trusted third party called the Private Key Generator (PKG). The PKG has published a master public key and retained the corresponding master private key, also called simply the master key.

Any party can generate any other party's public key using the master public key and the public information about the other party.

Only the PKG can generate the corresponding private keys using the master private key and the public information about parties.

Advantages:

Disadvantages:

For more information:

Getting the Ciphers

For an enormous list, see Applied Cryptography by Bruce Schneier. The book contains source code for many serious cryptographic algorithms.

For something more mathematically rigorous, see the CRC Handbook of Applied Cryptography by A. Menezes, P. van Oorschot and S. Vanstone. A free PDF version can be downloaded for personal use.

How AES Works

Jeff Moser has written a great explanation of the history and operation of the Advanced Encryption Standard, A Stick Figure Guide to the Advanced Encryption Standard (AES).

How RSA Works

Setting up the keys

  1. Randomly select two distinct large prime numbers p and q.
  2. Compute: n = pq
    n will be used as the modulus for both the private and public keys.
    Security comes from:
    • Difficulty of factoring product of two large prime numbers.
    • Ambiguity introduced by modulo function.
  3. Compute the least common multiple: λ(n) = lcm(p - 1, q - 1)
  4. Choose an integer e such that:
    • 1 < e < λ(n)
    • e and λ(n) share no factors other than 1. That is, so that e and φ(n) are coprime.
      A popular choice is e = 216 + 1 = 65537
      Some applications choose smaller e such as 3, 5, 17 or 256 to make encryption and signature verification fast on small devices like smart cards, at the expense of greater security risk.
  5. Compute d to satisfy the congruence relation:
    de = 1 + kφ(n)
    for some integer k.

e is the public key.

d is the private key.

As a simple example:

  1. p = 5
    q = 11
  2. n = pq = 55
  3. φ(n) = (p - 1)(q - 1) = (4)(10) = 40
    Least common multiple of (p - 1) and (q - 1) is 20 = 22 * 5.
  4. e ∈ {3, 7, 9, 11, 13, 17, 19, 21, 23, 27, 29, 31, 33, 37, 39, 41, 43, 47, 49, 51, 53}
  5. d is defined by:
    (de) mod 20 = 1
    so the possible key pairs are:
    e d
    37
    73
    99
    1111
    1317
    1713
    1919

Using the keys for encryption and decryption

  1. Sender obtains receiver's public key (n,e).
  2. Sender converts message M into a number m < n with an agreed-upon reversible padding scheme.
  3. Sender computes ciphertext c using recipient's public key e by:
    c = me mod n
  4. Sender transmits ciphertext.
  5. Receiver calculates cleartext:
    m = cd mod n
  6. Receiver reverses padding function to recover message M from cleartext m.

As a simple example:

Next — One-Time Pads