Contact
CoCalc Logo Icon
StoreFeaturesDocsShareSupport News AboutSign UpSign In
| Download

IPython notebook Caesar/2015-02-16-144443.ipynb

Project: Crypto
Views: 99
Kernel: Python 2 (SageMath)

##Caesar's Cipher## Caesar's cipher is to move a letter 3 places forward in the alphabet.

def caesar_letter(letter): """Return the 3rd letter after the input""" assert type(letter) is str and len(letter)==1 letter = letter.upper() num = ord(letter) - ord('A') #now letter in [0,..,25] num = (num+3)%26 #shift letter = chr(num+ord('A')) #take number back to letter return letter
caesar_letter('a'),caesar_letter('b'),caesar_letter('c'),caesar_letter('d')
('D', 'E', 'F', 'G')
alphabet = "abcdefghijklmnopqrstuvwxyz" for a in alphabet: print a," ---> ", caesar_letter(a)
a ---> D b ---> E c ---> F d ---> G e ---> H f ---> I g ---> J h ---> K i ---> L j ---> M k ---> N l ---> O m ---> P n ---> Q o ---> R p ---> S q ---> T r ---> U s ---> V t ---> W u ---> X v ---> Y w ---> Z x ---> A y ---> B z ---> C
def caesar_string(S): T = str() for s in S: if s==' ': T = T+s else: T = T+caesar_letter(s) return T
caesar_string("all of gaul is divided into three parts")
'DOO RI JDXO LV GLYLGHG LQWR WKUHH SDUWV'

#Caesar's cipher is most effective against people who are illiterate

A generalization of Caesar's cipher is to shift not by 3, but by some amount kk.

This makes it a cipher with a key

def shift_letter(letter,k): """Return the kth letter after the input""" assert type(letter) is str and len(letter)==1 and type(k) is int letter = letter.upper() num = ord(letter) - ord('A') #now letter in [0,..,25] num = (num+k)%26 #shift letter = chr(num+ord('A')) #take number back to letter return letter
def shift_string(S,k): T = str() for s in S: if s==' ': T = T+s else: T = T+shift_letter(s,k) return T
shift_string("all of gaul is divided into three parts",3)
'DOO RI JDXO LV GLYLGHG LQWR WKUHH SDUWV'

#Caesar's cipher is just the shift cipher with kk=3

for i in range(27): print i,":",shift_string("all of gaul is divided into three parts",i)
0 : ALL OF GAUL IS DIVIDED INTO THREE PARTS 1 : BMM PG HBVM JT EJWJEFE JOUP UISFF QBSUT 2 : CNN QH ICWN KU FKXKFGF KPVQ VJTGG RCTVU 3 : DOO RI JDXO LV GLYLGHG LQWR WKUHH SDUWV 4 : EPP SJ KEYP MW HMZMHIH MRXS XLVII TEVXW 5 : FQQ TK LFZQ NX INANIJI NSYT YMWJJ UFWYX 6 : GRR UL MGAR OY JOBOJKJ OTZU ZNXKK VGXZY 7 : HSS VM NHBS PZ KPCPKLK PUAV AOYLL WHYAZ 8 : ITT WN OICT QA LQDQLML QVBW BPZMM XIZBA 9 : JUU XO PJDU RB MRERMNM RWCX CQANN YJACB 10 : KVV YP QKEV SC NSFSNON SXDY DRBOO ZKBDC 11 : LWW ZQ RLFW TD OTGTOPO TYEZ ESCPP ALCED 12 : MXX AR SMGX UE PUHUPQP UZFA FTDQQ BMDFE 13 : NYY BS TNHY VF QVIVQRQ VAGB GUERR CNEGF 14 : OZZ CT UOIZ WG RWJWRSR WBHC HVFSS DOFHG 15 : PAA DU VPJA XH SXKXSTS XCID IWGTT EPGIH 16 : QBB EV WQKB YI TYLYTUT YDJE JXHUU FQHJI 17 : RCC FW XRLC ZJ UZMZUVU ZEKF KYIVV GRIKJ 18 : SDD GX YSMD AK VANAVWV AFLG LZJWW HSJLK 19 : TEE HY ZTNE BL WBOBWXW BGMH MAKXX ITKML 20 : UFF IZ AUOF CM XCPCXYX CHNI NBLYY JULNM 21 : VGG JA BVPG DN YDQDYZY DIOJ OCMZZ KVMON 22 : WHH KB CWQH EO ZEREZAZ EJPK PDNAA LWNPO 23 : XII LC DXRI FP AFSFABA FKQL QEOBB MXOQP 24 : YJJ MD EYSJ GQ BGTGBCB GLRM RFPCC NYPRQ 25 : ZKK NE FZTK HR CHUHCDC HMSN SGQDD OZQSR 26 : ALL OF GAUL IS DIVIDED INTO THREE PARTS
shift_string("DOO RI JDXO LV GLYLGHG LQWR WKUHH SDUWV",-3)
'ALL OF GAUL IS DIVIDED INTO THREE PARTS'

#Shifting backwards by 3 undoes Caesar's cipher!

shift_string("KVV YP QKEV SC NSFSNON SXDY DRBOO ZKBDC",-10)
'ALL OF GAUL IS DIVIDED INTO THREE PARTS'

#Shifting backwards by kk undoes a shift forward by kk!

The shift cipher is not a good cipher!

But why, exactly?

Because the key space is very small.

We can check 26 possibilities in a millisecond.

Note that the first 5 characters of the message more or less tells us if the key is correct or not.

#Exercise: Decrypt KTBG BG LITBG YTEEL BG MAX IETBG

2**256-1
115792089237316195423570985008687907853269984665640564039457584007913129639935L
ciphertext = "KTBG BG LITBG YTEEL BG MAX IETBG" for i in range(27): print i,":",shift_string(ciphertext,i)
0 : KTBG BG LITBG YTEEL BG MAX IETBG 1 : LUCH CH MJUCH ZUFFM CH NBY JFUCH 2 : MVDI DI NKVDI AVGGN DI OCZ KGVDI 3 : NWEJ EJ OLWEJ BWHHO EJ PDA LHWEJ 4 : OXFK FK PMXFK CXIIP FK QEB MIXFK 5 : PYGL GL QNYGL DYJJQ GL RFC NJYGL 6 : QZHM HM ROZHM EZKKR HM SGD OKZHM 7 : RAIN IN SPAIN FALLS IN THE PLAIN 8 : SBJO JO TQBJO GBMMT JO UIF QMBJO 9 : TCKP KP URCKP HCNNU KP VJG RNCKP 10 : UDLQ LQ VSDLQ IDOOV LQ WKH SODLQ 11 : VEMR MR WTEMR JEPPW MR XLI TPEMR 12 : WFNS NS XUFNS KFQQX NS YMJ UQFNS 13 : XGOT OT YVGOT LGRRY OT ZNK VRGOT 14 : YHPU PU ZWHPU MHSSZ PU AOL WSHPU 15 : ZIQV QV AXIQV NITTA QV BPM XTIQV 16 : AJRW RW BYJRW OJUUB RW CQN YUJRW 17 : BKSX SX CZKSX PKVVC SX DRO ZVKSX 18 : CLTY TY DALTY QLWWD TY ESP AWLTY 19 : DMUZ UZ EBMUZ RMXXE UZ FTQ BXMUZ 20 : ENVA VA FCNVA SNYYF VA GUR CYNVA 21 : FOWB WB GDOWB TOZZG WB HVS DZOWB 22 : GPXC XC HEPXC UPAAH XC IWT EAPXC 23 : HQYD YD IFQYD VQBBI YD JXU FBQYD 24 : IRZE ZE JGRZE WRCCJ ZE KYV GCRZE 25 : JSAF AF KHSAF XSDDK AF LZW HDSAF 26 : KTBG BG LITBG YTEEL BG MAX IETBG

#The above is a ciphertext only attack because the plaintext is not known.

It is brute force because it just tries every key.

Sometimes we know the plaintext and the ciphertext and just want to find the key.

This is a known plaintext attack. Example:

"hot pie" --- > DKP LEA
ord('D')-ord('H')
-4

Thus the shift is k=4k=-4, or equivalently k=22k=22.

shift_string("DKP LEA",-22)
'HOT PIE'

Note that while the time necessary for a ciphertext only attack was (in this case) proportional to the keyspace size, the time necessary for a known plaintext attack is constant.

This makes sense: Having more information about the cipher helps you break it.

###Note that the algebraic form of the shift cipher is

Ek(x)=(x+k)%26E_k(x) = (x+k)\%26

Dk(x)=(xk)%26D_k(x) = (x-k)\%26

This assumes that the alphabet is {0,1,...,25} rather than {A,B,C,...,Z}.

But this is just a superficial difference.

###The Affine Cipher is slightly more complex

Ea,b=(ax+b)%26E_{a,b} = (a\cdot x + b)\%26

Now the key k=(a,b)k = (a,b) is a pair of numbers in {0..25} and the keyspace has size seems to be

262=67626^2 = 676

Actually we will see below that some aa are unsuitable and there are only

1226=31212\cdot26 = 312

pairs (a,b) that give an invertible Ea,bE_{a,b}.

It is important that EE be an invertible function, because we eventually want to decrypt things!

def affine_letter(letter,k): """Return the affine cipher with a = k[0], b = k[1]""" assert type(letter) is str and len(letter)==1 letter = letter.upper() x = ord(letter) - ord('A') #now letter in [0,..,25] x = (x*k[0]+k[1])%26 #affine cipher letter = chr(x+ord('A')) #take number back to letter return letter
affine_letter('A',(1,3)) #The Caesar cipher
'D'
def affine_string(S,k): T = str() for s in S: if s==' ': T = T+s else: T = T+affine_letter(s,k) return T
affine_string("hello how are you",(3,11))
'GXSSB GBZ LKX FBT'

How do we invert the affine cipher?

Does it even have an inverse?

for a in "abcdefghijklmnopqrstuvwxyz": print a,"-->",affine_letter(a,(2,0))
a --> A b --> C c --> E d --> G e --> I f --> K g --> M h --> O i --> Q j --> S k --> U l --> W m --> Y n --> A o --> C p --> E q --> G r --> I s --> K t --> M u --> O v --> Q w --> S x --> U y --> W z --> Y

##Note that E2,0(x)=(2x)%26E_{2,0}(x) = (2\cdot x)\%26 is not one-to-one! Therefore it has no inverse.

The problem is basically that there is no element gg in {0..25} for which

(2g)%26=1(2g)\%26=1

Note that if there were such a gg then we could invert E2,0E_{2,0} by applying Eg,0E_{g,0}.

Meaning, that:

Eg,0(E2,0(x))=(g2x)%26=xE_{g,0}(E_{2,0}(x)) = (g2x)\%26 = x

from fractions import gcd invertables = [a for a in range(26) if gcd(a,26)==1] invertables
[1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25]

####The 12 numbers in {0..26} which are relatively prime to 26 do have inverses.

def egcd(a, b): if a == 0: return (b, 0, 1) else: g, y, x = egcd(b % a, a) return (g, x - (b // a) * y, y) def modinv(a, m): g, x, y = egcd(a, m) if g != 1: raise Exception('modular inverse does not exist') else: return x % m
for i in invertables: print "%d * %d mod 26 = %d"%(i,modinv(i,26),1)
1 * 1 mod 26 = 1 3 * 9 mod 26 = 1 5 * 21 mod 26 = 1 7 * 15 mod 26 = 1 9 * 3 mod 26 = 1 11 * 19 mod 26 = 1 15 * 7 mod 26 = 1 17 * 23 mod 26 = 1 19 * 11 mod 26 = 1 21 * 5 mod 26 = 1 23 * 17 mod 26 = 1 25 * 25 mod 26 = 1

####These invertables (mod 26) are the only suitable choices for aa in Ea,bE_{a,b}.

If we denote the multiplicative inverse of aa mod 26 by a1a^{-1} then

for Ea,b(x)E_{a,b}(x) we decrypt by

Da,b(x)=Ea1,a1b(x)=(a1xa1b)%26D_{a,b}(x) = E_{a^{-1},-a^{-1}b}(x) = (a^{-1}x-a^{-1}b)\%26

Proof:

Da,b(Ea,b(x))=Ea1,a1b(ax+b)=D_{a,b}(E_{a,b}(x)) = E_{a^{-1},-a^{-1}b}(ax + b) =

a1(ax+b)a1b=x+a1ba1b=xa^{-1}(ax+b) - a^{-1}b = x+a^{-1}b-a^{-1}b = x

##The affine cipher is a terrible cipher

But it illustrates some important things.

For one thing, the keyspace is still too tiny.

For another thing, it does not disguise the statistics of the language enough.

We will crack it in a lab, or as an assignment (maybe).

#Cryptograms: Substitutions

##A cryptogram is a permutation of the alphabet.

###The key is the permutation used to mix up the alphabet.

import random abet = alphabet[:] bizzarrobet = str() while len(bizzarrobet) < 26: r = random.randint(0,len(abet)-1) print ("%3d %s")%(r,abet) bizzarrobet += abet[r] abet = abet.replace(abet[r],"")
5 abcdefghijklmnopqrstuvwxyz 3 abcdeghijklmnopqrstuvwxyz 21 abceghijklmnopqrstuvwxyz 6 abceghijklmnopqrstuvwyz 0 abceghjklmnopqrstuvwyz 6 bceghjklmnopqrstuvwyz 4 bceghjlmnopqrstuvwyz 17 bcegjlmnopqrstuvwyz 17 bcegjlmnopqrstuvwz 3 bcegjlmnopqrstuvw 15 bcejlmnopqrstuvw 6 bcejlmnopqrstuv 2 bcejlmopqrstuv 5 bcjlmopqrstuv 4 bcjlmpqrstuv 10 bcjlpqrstuv 5 bcjlpqrstu 1 bcjlprstu 5 bjlprstu 3 bjlprtu 5 bjlrtu 3 bjlrt 3 bjlt 2 bjl 1 bj 0 b
bizzarrobet
'fdxiakhyzgwneomvqcspurtljb'
Emap = dict(zip(alphabet,bizzarrobet)) Emap[' '] = ' ' Emap
{' ': ' ', 'a': 'f', 'b': 'd', 'c': 'x', 'd': 'i', 'e': 'a', 'f': 'k', 'g': 'h', 'h': 'y', 'i': 'z', 'j': 'g', 'k': 'w', 'l': 'n', 'm': 'e', 'n': 'o', 'o': 'm', 'p': 'v', 'q': 'q', 'r': 'c', 's': 's', 't': 'p', 'u': 'u', 'v': 'r', 'w': 't', 'x': 'l', 'y': 'j', 'z': 'b'}
S = "a stitch in time saves nine" EncryptedS = str() for s in S: EncryptedS += Emap[s] EncryptedS
'f spzpxy zo pzea sfras ozoa'

##The decrypt is the reverse permutation

Dmap = {v: k for k, v in Emap.items()} Dmap
{' ': ' ', 'a': 'e', 'b': 'z', 'c': 'r', 'd': 'b', 'e': 'm', 'f': 'a', 'g': 'j', 'h': 'g', 'i': 'd', 'j': 'y', 'k': 'f', 'l': 'x', 'm': 'o', 'n': 'l', 'o': 'n', 'p': 't', 'q': 'q', 'r': 'v', 's': 's', 't': 'w', 'u': 'u', 'v': 'p', 'w': 'k', 'x': 'c', 'y': 'h', 'z': 'i'}
SS = str() for s in EncryptedS: SS += Dmap[s] SS
'a stitch in time saves nine'

##Now the keyspace is huge ###There are n!n! ways to permute nn objects.

math.factorial(26)
403291461126605635584000000L

####But still, a simple substitution is easily cracked with statistics

S = """Describing him as a normal, well-adjusted individual in nearly every respect, acquaintances of 32-year-old Gary Morgan confirmed Monday that the otherwise reasonable man earnestly believes the U.S. government landed a spacecraft on the moon. "Gary's a pretty regular, levelheaded guy unless he gets going on one of his ridiculous rants about how American astronauts actually walked on the moon back in 1969,” said friend Jeremy Lesser, adding that you would never guess by looking at Morgan that he’s absolutely convinced NASA successfully sent six different crews on lunar missions. “Honestly, I’ve found it’s best just to avoid the topic of space exploration altogether when we hang out. I’ve come to realize that no matter what I say, I’m never going to change his mind about the Apollo program. And I’d rather talk to him about anything else than hear him go off about astronauts exploring the lunar surface and bringing samples back to Earth again.” At press time, Lesser was reportedly biting his tongue as Morgan explained the existence of several American flags allegedly planted on the moon.""" S = S.lower() S = S.replace('\xe2',"") S = S.replace('\x80',"") S = S.replace('\x9d',"") S = S.replace('\x99',"") S = S.replace('\x9c',"") Emap[',']=',' Emap["'"]="'" Emap["."]="." Emap["-"]="-" Emap["9"]="9" Emap["6"]="6" Emap["3"]="3" Emap["2"]="2" Emap["1"]="1" Emap['"']='"' EncryptedS = str() for s in S: EncryptedS += Emap[s] EncryptedS
'iasxczdzoh yze fs f omcefn, tann-figuspai zoizrziufn zo oafcnj aracj casvaxp, fxqufzopfoxas mk 32-jafc-mni hfcj emchfo xmokzceai emoifj pyfp pya mpyactzsa cafsmofdna efo afcoaspnj danzaras pya u.s. hmracoeaop nfoiai f svfxaxcfkp mo pya emmo. "hfcj\'s f vcappj cahunfc, naranyafiai huj uonass ya haps hmzoh mo moa mk yzs czizxunmus cfops fdmup ymt feaczxfo fspcmofups fxpufnnj tfnwai mo pya emmo dfxw zo 1969, sfzi kczaoi gacaej nassac, fiizoh pyfp jmu tmuni oarac huass dj nmmwzoh fp emchfo pyfp yas fdsmnupanj xmorzoxai ofsf suxxasskunnj saop szl izkkacaop xcats mo nuofc ezsszmos. ymoaspnj, zra kmuoi zps dasp gusp pm frmzi pya pmvzx mk svfxa alvnmcfpzmo fnpmhapyac tyao ta yfoh mup. zra xmea pm cafnzba pyfp om efppac tyfp z sfj, ze oarac hmzoh pm xyfoha yzs ezoi fdmup pya fvmnnm vcmhcfe. foi zi cfpyac pfnw pm yze fdmup fojpyzoh ansa pyfo yafc yze hm mkk fdmup fspcmofups alvnmczoh pya nuofc suckfxa foi dczohzoh sfevnas dfxw pm afcpy fhfzo. fp vcass pzea, nassac tfs cavmcpainj dzpzoh yzs pmohua fs emchfo alvnfzoai pya alzspaoxa mk saracfn feaczxfo knfhs fnnahainj vnfopai mo pya emmo.'
import string def hist(string): d = {} tot =0 for i in string.lower(): tot += 1 if i.isalpha(): d[i] = d[i] + 1 if i in d else 1 result = list() for k in d.keys(): result.append((k,float(d[k])/tot)) return result def getKey(item): return item[1]
S1 = sorted(hist(S),key=getKey) S2 = sorted(hist(EncryptedS),key=getKey) Z = zip(S1,S2) print S print EncryptedS Z
describing him as a normal, well-adjusted individual in nearly every respect, acquaintances of 32-year-old gary morgan confirmed monday that the otherwise reasonable man earnestly believes the u.s. government landed a spacecraft on the moon. "gary's a pretty regular, levelheaded guy unless he gets going on one of his ridiculous rants about how american astronauts actually walked on the moon back in 1969, said friend jeremy lesser, adding that you would never guess by looking at morgan that hes absolutely convinced nasa successfully sent six different crews on lunar missions. honestly, ive found its best just to avoid the topic of space exploration altogether when we hang out. ive come to realize that no matter what i say, im never going to change his mind about the apollo program. and id rather talk to him about anything else than hear him go off about astronauts exploring the lunar surface and bringing samples back to earth again. at press time, lesser was reportedly biting his tongue as morgan explained the existence of several american flags allegedly planted on the moon. iasxczdzoh yze fs f omcefn, tann-figuspai zoizrziufn zo oafcnj aracj casvaxp, fxqufzopfoxas mk 32-jafc-mni hfcj emchfo xmokzceai emoifj pyfp pya mpyactzsa cafsmofdna efo afcoaspnj danzaras pya u.s. hmracoeaop nfoiai f svfxaxcfkp mo pya emmo. "hfcj's f vcappj cahunfc, naranyafiai huj uonass ya haps hmzoh mo moa mk yzs czizxunmus cfops fdmup ymt feaczxfo fspcmofups fxpufnnj tfnwai mo pya emmo dfxw zo 1969, sfzi kczaoi gacaej nassac, fiizoh pyfp jmu tmuni oarac huass dj nmmwzoh fp emchfo pyfp yas fdsmnupanj xmorzoxai ofsf suxxasskunnj saop szl izkkacaop xcats mo nuofc ezsszmos. ymoaspnj, zra kmuoi zps dasp gusp pm frmzi pya pmvzx mk svfxa alvnmcfpzmo fnpmhapyac tyao ta yfoh mup. zra xmea pm cafnzba pyfp om efppac tyfp z sfj, ze oarac hmzoh pm xyfoha yzs ezoi fdmup pya fvmnnm vcmhcfe. foi zi cfpyac pfnw pm yze fdmup fojpyzoh ansa pyfo yafc yze hm mkk fdmup fspcmofups alvnmczoh pya nuofc suckfxa foi dczohzoh sfevnas dfxw pm afcpy fhfzo. fp vcass pzea, nassac tfs cavmcpainj dzpzoh yzs pmohua fs emchfo alvnfzoai pya alzspaoxa mk saracfn feaczxfo knfhs fnnahainj vnfopai mo pya emmo.
[(('q', 0.0009165902841429881), ('b', 0.0009165902841429881)), (('z', 0.0009165902841429881), ('q', 0.0009165902841429881)), (('j', 0.002749770852428964), ('g', 0.002749770852428964)), (('k', 0.00458295142071494), ('l', 0.00458295142071494)), (('x', 0.00458295142071494), ('w', 0.00458295142071494)), (('w', 0.00916590284142988), ('t', 0.00916590284142988)), (('v', 0.010999083409715857), ('r', 0.010999083409715857)), (('b', 0.012832263978001834), ('d', 0.012832263978001834)), (('p', 0.012832263978001834), ('v', 0.012832263978001834)), (('f', 0.013748854262144821), ('k', 0.013748854262144821)), (('y', 0.01833180568285976), ('j', 0.01833180568285976)), (('c', 0.021998166819431713), ('x', 0.021998166819431713)), (('m', 0.022914757103574702), ('e', 0.022914757103574702)), (('u', 0.026581118240146653), ('u', 0.026581118240146653)), (('d', 0.02841429880843263), ('i', 0.02841429880843263)), (('g', 0.02841429880843263), ('h', 0.02841429880843263)), (('h', 0.032080659945004586), ('y', 0.032080659945004586)), (('l', 0.04216315307057745), ('n', 0.04216315307057745)), (('i', 0.04857928505957837), ('c', 0.04857928505957837)), (('r', 0.04857928505957837), ('z', 0.04857928505957837)), (('s', 0.054995417048579284), ('s', 0.054995417048579284)), (('o', 0.06049495875343722), ('m', 0.06049495875343722)), (('t', 0.06416131989000917), ('p', 0.06416131989000917)), (('n', 0.06507791017415215), ('o', 0.06507791017415215)), (('a', 0.08157653528872594), ('f', 0.08157653528872594)), (('e', 0.09532538955087076), ('a', 0.09532538955087076))]

###Note that we didn't know the plaintext, only its statistical profile ####And we didn't just mechanically search the (enormous) keyspace. #####This is an example of a non-brute-force attack.

for z in Z: print z[0][0],"--->",z[1][0],
Q ---> B Z ---> Q J ---> G K ---> L X ---> W W ---> T V ---> R B ---> D P ---> V F ---> K Y ---> J C ---> X M ---> E U ---> U D ---> I G ---> H H ---> Y L ---> N I ---> C R ---> Z S ---> S O ---> M T ---> P N ---> O A ---> F E ---> A
Emap
{' ': ' ', '"': '"', "'": "'", ',': ',', '-': '-', '.': '.', '1': '1', '2': '2', '3': '3', '6': '6', '9': '9', 'a': 'f', 'b': 'd', 'c': 'x', 'd': 'i', 'e': 'a', 'f': 'k', 'g': 'h', 'h': 'y', 'i': 'z', 'j': 'g', 'k': 'w', 'l': 'n', 'm': 'e', 'n': 'o', 'o': 'm', 'p': 'v', 'q': 'q', 'r': 'c', 's': 's', 't': 'p', 'u': 'u', 'v': 'r', 'w': 't', 'x': 'l', 'y': 'j', 'z': 'b'}