In order to make sense of some of the things we mentioned in class, it is useful to see them in action on image files; for we, as humans, are quite good at visually detecting patterns in an array of pixels. This "cloud" interface to Sage is somewhat similar to the regular "notebook" one; use SHIFT+ENTER (or the green "Run" button above) to evaluate a cell.

Internally, it's just a $288 \times 460$ matrix of pixels, each of which is colored according to a RGB triple of integers between 0 and 255:

(288, 460)

array([[[126, 157, 186],
[200, 231, 255],
[169, 199, 233],
...,
[ 33, 49, 20],
[ 83, 98, 59],
[ 72, 88, 43]],
[[ 94, 125, 154],
[145, 176, 207],
[143, 173, 207],
...,
[128, 143, 114],
[ 19, 34, 0],
[140, 155, 112]],
[[ 28, 59, 90],
[ 18, 49, 80],
[ 25, 55, 89],
...,
[ 47, 62, 33],
[ 70, 84, 48],
[202, 217, 176]],
...,
[[163, 195, 94],
[164, 196, 95],
[170, 202, 101],
...,
[162, 197, 97],
[162, 197, 97],
[156, 191, 91]],
[[168, 200, 99],
[166, 198, 97],
[173, 205, 104],
...,
[152, 187, 87],
[154, 189, 89],
[152, 187, 87]],
[[154, 186, 85],
[152, 184, 83],
[164, 196, 95],
...,
[161, 196, 96],
[162, 197, 97],
[159, 194, 94]]], dtype=uint8)

Recall that the goal of encryption would be to reversibly make it look like a random image of the same size just like this one:

As a single pixel is encoded on $3 \times 8 = 24$ bits, a $4 \times 4$ block of pixels is encoded on $16 \times 24 = 384$ bits, which give us more than enough room to store a secure symmetric encryption key.

Et voilà! Here's the picture, encrypted by a 384-bit key:

Discuss the result. Would you say that this cipher provides perfect secrecy?

This scheme certainly does not provide perfect secrecy, as we cleary recognize some features of the original image leaking into the ciphertext (structure of the mansion, patches of grass and sky...). In particular, note that repeating blocks in the plaintext yield repeating blocks in the ciphertext, thus breaking semantic security.

The problem is that the $\color{blue} 4 \times 4$ key is a perfectly good one-time pad that provides 384 bits of security if used to mask a *single* $\color{blue} 4 \times 4$ image block; but it is used here to encrypt $\color{blue} 72 \times 115 = 8280$ such blocks, thus providing (*cf.* discussion of the many-time pad) an effective $\color{blue} 384/8280 \approx 0.05$ bit of security...

Go through the same process, this time with a genuine randomly generated one-time pad. What's the key-length this time? Verify that the cipher decrypts correctly, *i.e.* play both the roles of Alice and Bob to see that everything works as it should. Also make sure to take a look at what Eve actually sees on the insecure channel.

Here's the one-time pad that we'll use:

The effective key length (assuming perfect strength of the random number generation) is the actual number of bits needed to specify this image:

3179520

To encrypt the image, Alice applies this random key as a mask:

Only the resulting ciphertext transits on the insecure communication channel; what Eve sees looks just like random noise.

To decrypt, Bob removes (i.e. reapplies) the pad:

You may check that the actual pixel values of the decrypted image are actually identical to the original ones:

True

Wow, this works well. In the excitation of the moment, I started encrypting all my images... but forgot to apply one of the most important usage rules of the one-time pad when I encrypted the following two images. Can you find out what they were?

The single most important thing to remember with the one-time pad is to use it only once... One may suspect that the same encryption key was used twice, and this would enable an attacker to access the sum (xor) or the plaintexts by xoring the ciphertexts:

Indeed! We clearly recognize Alice and Bob, semantic security is completely broken.

After realizing my mistake, I decided to generate pads from a 128-bit key by using it as a seed for a linear congruence generator modulo the following 128-bit prime:

True

Since such a PRNG is characterized by two 128-bit integers $a$ and $b$ which were chosen at random, there should be more than enough entropy to protect my 128-bit key, right? Prove me wrong by finding out what the seed used to produce the first few outputs from the LCG was:

217692597915196650809181220736554072509,

101697276836279744146238049998237762682,

265181937610212296333751058245677871006,

84171444745593992579687306707926595136,

12455596861516498286468598807461112654,

...

Recall that a linear congruence generator works by iterating a simple affine function modulo $\color{blue} p$: $\color{blue} x_{n+1} \equiv a \cdot x_n + b \ \mathrm{mod} \ p.$ By substrating the formula for $\color{blue} x_n$ from the formula for $\color{blue} x_{n+1}$, one sees that: $\color{blue} x_{n+1} - x_n \equiv a \cdot (x_n - x_{n-1}) \ \mathrm{mod} \ p;$ that is, the differences between successive outputs form a geometric progression.

To perform division mod $\color{blue} p$, we use the extended Euclidean algorithm to a modular inverse (or just regular integer division followed by the remainder operator since Sage is permissive):

29879779137056188643413695637347256028

Now the value of $\color{blue} b$ is easily recovered:

261355376501628057578013110734163562795

And we can even go backwards in time:

116262181799952103953797009259690012116

We verify that we are able to recover the given terms (three of them were actually enough to completely reverse engineer the LGC) by using $\color{blue} x_0$ as a seed (as well as predict future values):

217692597915196650809181220736554072509
101697276836279744146238049998237762682
265181937610212296333751058245677871006
84171444745593992579687306707926595136
12455596861516498286468598807461112654
105106464186866804329444885409649936144
144019025603440397490230044970459472584
265139309434915765374704941432985644451
289642631303735312067300979134840749521
119205904360883294127743631089967565464
...

Linear congruence generators (and linear shift feedback registers) were not designed to be *secure* pseudo-random number generators, and should never be used as such.

Ok, now it's time to start doing things properly. We will use the still-standard-albeit-somewhat-deprecated RC4 pseudo-random number generator to generate a key-stream from a 128-bit key. We first aquire 128 bits of "real" random data from the entropy pool of the system. x[1]

'42d03204f3581f4671f73463c788c1c0'

Use this to generate of pseudo-random pad from $k$, and then encrypt the Bletchley Park picture with it. Make sure that Bob will be able to decrypt it knowing only $k$.

Alice generates a pseudo-random pad of the required size by using the pixel values supplied by the PRNG:

and then applies it to the image to encrypt it:

Upon reception of the ciphertext, Bob initializes his own PRNG with the secret key $\color{blue} k$ to compute the exact same pad in order to remove it: