The idea of the one time pad is simple...
To transmit an $n$ bit message $M$, Alice generates an $n$ bit key $k$ at random.
She then somehow distributes $k$ to Bob (possibly the key was already distributed).
The ciphertext is $C = M \oplus k$.
After receiving $C$, Bob decrypts $C$ using $k$ to recover $M$:
decryption = $C \oplus k = M \oplus k \oplus k = M$
The OTP is interesting because it achieves information theoretic security.
This means that the ciphertext $C$ contains no information about $M$ in the absence of $k$.
This is in the sense of "information theory" as developed by Claude Shannon.
His original paper on the OTP is still very accessible: famous paper
The first cryptographer to realize the significance of OTP was Joseph Mauborgne.
The chances that an enemy with infinite computational resources can decrypt $C$ without $k$ are zero.
The reason is that for any possible decrypt $M'$, there is exactly one $k'$ such that
$$ M' = C \oplus k' $$This means that any decrypt is equally likely from Eve's point of view.
With no knowledge of $k$, why would Eve prefer $M$ to $M'$ as a possible decrypt?
She wouldn't... and therefore she can't decrypt $C$ without $k$, even with an infinitely fast computer.
You might notice that OTP is not very widely used. It is used sometimes, but rarely.
What are the disadvantages?
Issues similar to the ones found in reusing the key in OTP come up a lot in cryptography.
What goes wrong?
Suppose $C_1 = k \oplus M_1$ and $C_2 = k \oplus M_2$.
Then if Eve has $C_1$ and $C_2$, she can compute
$$C_1 \oplus C_2 = k \oplus M_1 \oplus k \oplus M_2 = M_1 \oplus M_2.$$If $M_1$ and $M_2$ are long enough and non-random enough, both $M_1$ and $M_2$ can be recovered from $M_1 \oplus M_2$.
The situation is even worse if $k$ is reused more than twice.
Also if $M_1$ is known then finding $M_2$ from $M_1 \oplus M_2$ is easy.
Modern ciphers try to imitate the strengths of OTP without the downsides.
The idea is to shorten the key to something reasonable: 128 or 256 bits.
A small key will then produce a huge pseudo-random key stream.
This means we lose information theoretic security.
But hopefully we can still have computational security.
Most modern ciphers rely on cryptosystems believed to be computationally secure.
This means that while the ciphertext may in principle contain information about the plaintext in the absence of the key, it takes too long to extract.
In practice, for symmetric key cryptography, if a key is $n$ bits we want an exhaustive check of all $2^n$ possible keys to be the only way to learn the plaintext.
Recall that a design goal for a cryptosystem is to produce ciphertext that is indistinguishable from random bits.
If you can crack a cipher then you can definitely distinguish ciphertext from randomness.
So "indistinguishability" is a very strong property: You can't even begin to crack the cipher.
We learned that the random pad in OTP gives us random ciphertext.
So what we need are good random numbers, but a small key.
There are at least 3 types of random number generators:
A common source of cryptographic weakness is to use a PRNG where a CSPRNG is needed.
True random number generators are based on the physical world...
White noise from electrical systems, typing delays, network delays, various jitterings, all pooled together.
The downside is that this is slow.
See this article on the Linux hardware TRNG/CSPRNG.
Usually we use a TRNG to produce a small seed for a PRNG.
A pseudo-random number generator takes a small amount of true randomness called the seed as input.
It then "blows up" this small amount of true randomness to an arbitrary amount of seeming randomness.
PRNGs are used a lot in gaming, simulations, experiments, and code testing.
A PRNG is usually not a CSPRNG.
The C++ rand()
function is a PRNG (and not a CSPRNG).
A cryptographically secure pseudo-random number generator is a PRNG with an extra property.
Roughly, the stream of bits it produces can't be extended without knowing the seed.
Precisely:
Given $s_1,s_2,\ldots,s_n$ bits of output from a CSPRNG it is computationally infeasible to predict the next bit, $s_{n+1}$, with accuracy significantly better than 0.5 (random guessing).
We'll see that most PRNGs emphatically lack this property!
Stream ciphers seek to imitate the OTP by using a CSPRNG.
A TRNG is used to seed the CSPRNG. The key is the seed, possibly together with some parameters of the CSPRNG.
The CSPRNG then generates a stream of pseudo-random bits called the key stream:
$$s_0,s_1,s_2,\ldots$$The plaintext $x_0,x_1,x_2,\ldots$ is then encrypted as $y_0,y_1,y_2,\ldots$, where
$$y_i = s_i + x_i\mod 2$$just as in OTP.
Suppose that Eve knows some plaintext $x_0,x_1,x_2$ and observes the ciphertext $y_0,y_1,y_2$.
She can then solve for $s_i$ in
$$y_i = s_i + x_i\mod 2$$to get
$$s_i = y_i + x_i\mod 2$$Thus it is realistic to assume Eve knows some of the keystream.
But because we use a CSPRNG, this little bit of stream does not allow Eve to produce more of the stream.
One PRNG that works fine for non-cryptographic applications is the linear congruential generator.
Basic algorithm:
$$S_{n+1} = S_na+c \mod\,m$$Here $m$ is public and $a,c$ are secret.
The seed is $S_0$.
m,a,c = 24,6,2
s = 3
outputs = []
while len(outputs)==len(set(outputs)):
s = (s*a+c)%m
outputs.append(s)
outputs
[20, 2, 14, 14]
It is useful to have a LCG with a long period.
The period is the number of updates before an output is repeated.
The period depends on the values of $m,a,c,$ and $S_0$.
m,a,c = 9,4,1
s = 0
outputs = []
while len(outputs)==len(set(outputs)):
s = (s*a+c)%m
outputs.append(s)
outputs
[1, 5, 3, 4, 8, 6, 7, 2, 0, 1]
It's informative to see how LCG fails as a CSPRNG.
Remember that the important property of a CSPRNG is that without knowledge of the key (including the seed and state variables) the sequence is unpredictable.
It must be unpredictable even if you have already seen some of it.
Suppose we use the above LCG with $m=9, a=4,c=1,$ and $S_0$.
Suppose Eve knows $m$ but not $a,c,$ or $S_0$.
Alice encrypts a message to Bob by using the keystream 1,5,3,4,8,6,7
.
Eve happens to know the first few chunks of plaintext and can use that to determine that the keystream begins 1,5,3
.
Now Eve can write down the equations
\begin{align} 5 &\equiv a\cdot 1 + c \mod 9 \\ 3 &\equiv a\cdot 5 + c \mod 9 \end{align}Simple linear algebra will solve this system of two equations in the two unknowns $a,b$.
Then, knowing $a$ and $b$, Eve can generate the rest of the keystream and decrypt the ciphertext.
Therefore \begin{align} 3 &\equiv a\cdot 5 + 5-a \mod 9 \\ -2 &\equiv a\cdot 4 \mod 9 \\ -2\cdot 7 &\equiv a\cdot 4 \cdot 7 \mod 9 \\ -14 &\equiv 4 \equiv a \mod 9 \end{align} Therefore \begin{align} c &\equiv 5-4 \equiv 1 \mod 9 \\ \end{align}
In cryptography, if the enemy can always solve for your key (even approximately) by doing a bunch of modular linear algebra, that's not good.
Good cryptosystems introduce nonlinearity to stop exactly this kind of thing.
Notice that the attack we just did on LCG would work even if $m,a,$ and $c$ were 1000s of bits long (though you would need to know more of the plaintext).
This may be an example of a cipher that is CTO secure, but not KPA secure.
The workings of an LFSR are well suited to hardware. Here is a diagram of a small LFSR:
There is a 16 bit state. During each clock cycle one bit is produced as the state shifts one place to the right.
The XOR of tap bits from the state produce a new bit introduced on the left.
This diagram numbers the bits by significance but it is more convenient to number the slots the other way: in the order in which bits are produced.
Thus 16 should be 0, 14 should be 2, 13 should be 3, and 11 should be 5,
Notice that if the internal state of an LFSR ever repeats, then the LFSR will begin to repeat its values.
The above LFSR has a 16 bit state, so the state can take on at most $2^{16}-1$ nonzero values.
If the all-zero state ever occurs, then the machine produces 0 forever after.
Therefore the period of this LFSR is at most $2^{16}-1$.
In general, the period of an $m$ bit LFSR is at most $2^m-1$.
As the LFSR described above yields bits, they satisfy the following equations.
The outputs $s_{16},s_{15},\ldots,s_1$ are the initial state.
Then
\begin{align} s_{16} &\equiv s_{5}+s_{3}+s_{2}+s_{0}\mod 2 \\ s_{17} &\equiv s_{6}+s_{4}+s_{3}+s_{1}\mod 2 \\ s_{18} &\equiv s_{7}+s_{5}+s_{4}+s_{2}\mod 2 \\ \vdots \\ s_{31} &\equiv s_{20}+s_{18}+s_{17}+s_{15}\mod 2\\ \vdots \\ s_{n} &\equiv s_{n-11}+s_{n-13}+s_{n-14}+s_{n-16}\mod 2 \end{align}The way we described the equations on the previous slide depended on the tap bits of the sample LFSR.
A superior notation uses variables $p_i$ to represent which of the state bits are tapped.
In the above diagram each $p_i \in \{0,1\}$.
The tapped bits in the state of the LFSR are the ones where $p_i = 1$.
For example, suppose $p_0=0,p_1=1,p_2=1,p_3=0,p_4=0$.
Then we get this LFSR:
Using this notation we can represent any $m$ bit LFSR as
\begin{align} s_{m} &\equiv p_{m-1}s_{m-1}+\cdots+p_{1}s_{1}+p_{0}s_{0}\mod 2 \\ s_{m+1} &\equiv p_{m-1}s_{m}+\cdots+p_{1}s_{2}+p_{0}s_{1}\mod 2 \\ s_{m+2} &\equiv p_{m-1}s_{m+1}+\cdots+p_{1}s_{3}+p_{0}s_{2}\mod 2 \\ \vdots \\ s_{m+i} &\equiv p_{m-1}s_{m+i-1}+\cdots+p_{1}s_{i+1}+p_{0}s_{i}\mod 2 \\ \end{align}or
$$s_{m+i} \equiv \sum_{j=0}^{m-1}p_js_{j+i}\mod 2.$$Consider a 5 bit LFSR.
Suppose $p_0=0,p_1=1,p_2=1,p_3=0,p_4=0$ and $s_0 = s_1 = s_3 = 1$ and $s_2 = s_4 = 0$.
What is $s_{10}$?
Because LFSR is totally linear, you might suspect it's not a CSPRNG.
Suppose that we try to base a stream cipher on an LFSR (the one from the previous slide).
The secret will be $p_0,p_1,p_2,p_3,$ and $p_4$. We assume that $m=5$ is known.
We will work in the KPA model, where we may assume that Eve knows $s_0,s_1,\ldots,s_9$.
Can Eve discover the secret? Yes!
Eve can write down the following equations.
\begin{align} s_5 &\equiv p_4s_4 + p_3s_3 + p_2s_2 + p_1s_1+p_0s_0 \mod 2 \\ s_6 &\equiv p_4s_5 + p_3s_4 + p_2s_3 + p_1s_2+p_0s_1 \mod 2 \\ s_7 &\equiv p_4s_6 + p_3s_5 + p_2s_4 + p_1s_3+p_0s_2 \mod 2 \\ s_8 &\equiv p_4s_7 + p_3s_6 + p_2s_5 + p_1s_4+p_0s_3 \mod 2 \\ s_9 &\equiv p_4s_8 + p_3s_7 + p_2s_6 + p_1s_5+p_0s_4 \mod 2 \end{align}This gives 5 equations in 5 unknowns which can easily be solved using techniques of linear algebra.