Cracking a LFSR stream cipher
Project 2
Due Th 10/5
Summary
In this project we will execute a known plaintext attack on a LFSR stream cipher.
In particular, we will crack an LFSR with degree . Given will be a long ciphertext which is known to be the encryption of a book from Project Gutenberg. The first few bytes of works on Project Gutenberg are all very similar, which allows for a known plaintext attack. The target ciphertext is in this directory and has the name target_ciphertext
.
Using known plaintext, we will recover bits of keystream, and solve a system of linear equations to determine the tap bits . If the basic idea of this plan is not clear to you, please read the section of the textbook on this topic (Chapter 2).
In order to do the attack you will first need to implement some parts of an LFSR. Then you will use Sage together with some known plaintext to produce the original state and tap variables used in the encryption of the target. Finally you will use your LFSR to decrypt the target and produce the solution.
lfsr.c
I have prepared an outline of the code you need to write in the file called lfsr.c
which you should find in this directory.
This program implements a 64 bit LFSR based on the C type uint64_t
. You can think of variables of this type as 64 bit unsigned integers. We use one of these types to represent the state of the LFSR (i.e. ) and one to represent the tap bits (i.e. ).
You can compile and run this code from a terminal on cocalc with these commands:
Optionally you could use the clang
compiler rather than gcc
; generally clang
has more helpful error messages than gcc
.
The output is a file called ct.dat
. Right now it is identical to the plaintext input, namely toy_pt.txt
. After you implement the missing functions in lfsr.c
, the output should be identical to the data in the file ct.toy
. You can check whether they agree from a cocalc terminal with this command:
If there is no output, then you have successfully implemented your LFSR. If there is output saying that the files differ, then your LFSR does not yet work correctly. By confirming that ct.dat
and ct.toy
agree, you can be confident that your LFSR works before moving on to other parts of the project.
We will say more below about the missing functions which you need to implement, after we introduce the basic datatypes used to implement our 64 bit LFSR.
The code in lfsr.c
represents a 64 bit LFSR in the following way. The 64 bits of both the state bits and the taps are represented using the uint64_t
, which is a 64 bit integer type. The least significant bits of a uint64_t
correspond to the small indexes for and . For example, the binary number (expressed in hex)
would indicate that , , , and when .
Similarly the binary number (expressed in hex)
indicates that , and when .
Most of the code you need to implement an LFSR has already been written. There are a few crucial functions left for you to write, which will be described in detail below.
The LFSR L is a struct type defined in the file as follows:
Thus L has two fields, which are both 64 bit integers. These represent the variables and as described above.
Currently the main()
function has this code:
The first four lines of this function declare and initialize the state and tap variables of the LFSR. Right now these are initialized with sample values. In the course of decrypting target_ciphertext
, you will determine the settings used to produce the keystream that encrypted that ciphertext.
The last two lines in this file are commented out, but you should comment them back in as you fill in the functions necessary for them to run. We will describe these functions in detail below.
The code currently in lfsr.c
should compile and run. There is a chance that there will be problems if the code is compiled on Visual Studio because GNU builtin commands are used in the parity()
function in lfsr.c
.
This version of the partity function should work on Windows:
You can comment out the provided parity function and replace it with the above if you prefer not to work in GNU.
Completing the LFSR
To complete the code in lfsr.c
and produce a working LFSR you need to write two functions. These are the read_lfsr
and next_state
functions shown below.
As described in the comments, the read_lfsr
function should return the rightmost bit of the LFSR state
variable. Note that L
is passed as a pointer, so the state
field must be accessed as L->state
not L.state
.
The next_state
function updates the state of the LFSR according to the value of the tap bits. The simplest way to do this is to AND the state and taps variables, and then return the parity of the result. This works because AND is coordinate-wise multiplication modulo 2, and parity is a summation of bits modulo 2.
As discussed above, you can check the correctness of your code by comparing ct.dat
and ct.toy
.
When you have a working LFSR, you can use it to decrypt target_ciphertext
as soon as you know the initial state and tap settings used to produce that ciphertext.
Recovering the state and tap variables
In order to recover the state and tap settings used to encrypt target_ciphertext
, we must first use a known-plaintext attack to recover the initial state, .
To do this, complete the function get_128_keystream_bits
, described below.
As discussed in the book, a known plaintext attack works by XORing the known plaintext with the ciphertext (in this case target_ciphertext
). This exposes a certain amount of the keystream. Because the period of the LFSR is 64, you need bytes of keystream. For your 16 bytes of known plaintext, you might want to investigate what the first 16 characters tend to be for books posted as plaintext on Project Gutenberg.
Once you have the first 8 bytes of keystream, you know the initial state of the LFSR, and you can fill in this value back in main()
.
That is, change
to
where 0xZZZZZZZZZZZZZZZZ
is the initial state expressed in hex. Recall that goes in the rightmost bit, and goes in the leftmost bit.
All that remains to be done is to recover the inital taps.
Recovering the taps
This task is outlined in detail in the file matrix.sagews
in this folder. Please read this document.
Grading:
In order to grade this project I will be testing the following:
Does "read_lfsr" work
Does "next_state" work
Does "get_128_keystream_bits" work
Does "shape_keystream" work
Did you decrypt
target_ciphertext
.
There are many opportunities for partial credit. Please begin early and post questions on Piazza or ask them in class.
Submission:
Please simply leave your files in the project_2
folder and they will be automatically collected.