CoCalc Public Files18.783 Lecture 3 Root finding in finite fields.ipynb
Author: Andrew Sutherland
Views : 92
Description: 18.783 Lecture 3: Demonstration of Rabin's root-finding algorithm and the Cantor-Zassenhaus algorithm for factoring polynomials in finite fields.
Compute Environment: Ubuntu 20.04 (Default)
 Problem: Given a polynomial $f\in \mathbb F_q[x]$, determine the $\mathbb F_q$-rational roots of $f$. Solution: Rabin's algorithm!
In [ ]:
p=61; F=GF(p); R.<x>=PolynomialRing(F); f = x^8-2*x+5

In [ ]:
f.roots()

In [ ]:
f.factor()

In [ ]:
f.change_ring(GF(p^6)).roots()

 Theorem: $\mathbb F_{p^n} = \{\alpha\in \overline\mathbb F_p: \alpha^{p^n}=\alpha\}$, in other words, we have $x^{p^n}-x=\prod_{\alpha\in \mathbb F_{p^n}}(x-\alpha)$ Proof: For $n=1$ this follows from Fermat's little theorem. If we identify $\mathbb F_p\simeq \mathbb Z/p\mathbb Z$, we have $a^{p-1}=1\bmod p$ for $a\in [0,p-1]$, so the $p$ elements of $\mathbb Z/p\mathbb Z$ are precisely the roots of $x^p-x$. For $n>1$ this is true by definition: $\mathbb F_{p^n}:=\overline \mathbb{F}_{p}^{\langle \alpha \mapsto \alpha^{p^n}\rangle}$ is the fixed field of the $p^n$-power Frobenius automorphism of $\overline\mathbb F_p$, equivalently, the splitting field of $x^{p^n}-x$ over $\mathbb F_p$. Corollary: For any prime power $q$ we have $\mathbb F_{q^n} = \{\alpha\in \overline\mathbb F_q: \alpha^{q^n}=\alpha\}$ and $x^{q^n}-x=\prod_{\alpha\in \mathbb F_{q^n}}(x-\alpha)$.
In [ ]:
product([x-i for i in F]) == x^p-x

In [ ]:
product([x.change_ring(GF(p^2))-i for i in GF(p^2)]) == x^(p^2)-x

 Corollary: Let $f\in \mathbb F_q[x]$ be nonzero and let $S:=\{\alpha\in \mathbb F_{q^n}:f(\alpha)=0\}$. Then $\gcd(f(x),x^{q^n}-x)=\prod_{\alpha\in S}(x-\alpha).$ In particular, $\#S=\deg(\gcd(f(x),x^{q^n}-x)$.
As usual, $\gcd(a,b)$ denotes the unique monic representative of the nonzero principal ideal $(a,b)$ of $\mathbb F_q[x]$.
In [ ]:
%time gcd(f,x^p-x).degree()

In [ ]:
%time gcd(f,x^(p^2)-x).degree()

In [ ]:
%time gcd(f,x^(p^3)-x).degree()

In [ ]:
%time gcd(f,x^(p^6)-x).degree()

 Key point: Make sure you exponentiate in the right ring! The first step in computing $\gcd(f(x),x^q-x)$ is to compute $x^q-x\bmod f$ (note that $q$ may be exponentially large, but $\deg f$ certainly won't be). If $\pi_f\colon \mathbb{F}_q[x]\to\mathbb{F}_q[x]/(f)$ is the ring homomorphism defined by $x\mapsto x\bmod f$, then $\pi_f(x^q-x)=\pi_f(x^q)-\pi_f(x)=\pi_f(x)^q-\pi_f(x),$ so the order of operations makes no mathematical difference, but the RHS can be computed exponentially faster than either the middle or the LHS!The complexity of computing $x^q\bmod f$ with Euclidean division is quasi-linear in $q$, The complexity of computing $\pi_f(x)^q$ using binary exponentiation is quasi-linear in $\log q$.
In [ ]:
pif=R.quotient(f)


In [ ]:
%time print(pif(x)^(p^6))

In [ ]:
p=2^255-19;  F=GF(p); R.<x>=PolynomialRing(F); f=x^8-2*x+5; pif=R.quotient(f);
%time print(f.factor())

In [ ]:
%time print(pif(x)^p)
%time print(pif(x)^(p^2))
%time print(pif(x)^(p^3))
%time print(pif(x)^(p^4))
%time print(pif(x)^(p^10))

In [ ]:
def distinct_root_poly(f):
q = f.base_ring().cardinality()
pif = f.parent().quotient(f)
return gcd(f,(pif(x)^q).lift()-x)

def count_distinct_roots(f):
return distinct_root_poly(f).degree()

In [ ]:
p=61;  F=GF(p); R.<x>=PolynomialRing(F); f=x^8-2*x+5; pif=R.quotient(f)
for n in range(1,7):
print("%d distinct roots over F_{%d^%d}" % (count_distinct_roots(f.change_ring(GF(p^n))),p,n))


Okay, we can count (distinct) roots, but what if we actually want to know what they are?

In [ ]:
gcd(f,(pif(x)^p-x).lift())

In [ ]:
s = (p-1) // 2 # we assume p (or more generally q) is odd
gcd(f,(pif(x)^s-1).lift())

In [ ]:
print(gcd(f,(pif(x)^p-x).lift())/gcd(f,(pif(x)^s-1).lift()))


Excellent, we managed to split the roots of $f$ by picking out two that are roots of $x^s-1$, where $s=(q-1)/2$ (we are assuming $q$ is odd).
What are the roots of $x^s-1$ anyway?

In [ ]:
print(sorted((x^s-1).roots()))


My keen sense of pattern recognition (or a recollection of Euler's criterion) tells me these are the roots of $x^s-1$ are precisely the squares (quadratic residues) in $\mathbb F_p^\times.$

In [ ]:
product([x-a^2 for a in F if a < p-a]) == (x^s-1)


But if $f$ has multiple roots that are quadratic residues (or maybe none), then what?

Rabin: Instead of $x^s-1$, use $(x+\delta)^s-1$ for a random $\delta\in \mathbb F_q$.

 Definition: Let us say that $\alpha,\beta\in \mathbb F_q$ are of different type if both are nonzero and $\alpha^s\ne \beta^s$, in other words, one is a quadratic residue and one is not. Theorem: For every pair of distinct $\alpha,\beta\in \mathbb F_q$ we have $\#\{\delta\in \mathbb F_q:\alpha+\delta \text{ and } \beta+\delta \text{ are of different type }\} = \frac{q-1}{2}.$ Proof: See Section 3.11 in the lecture notes for a short easy proof. Corollary: Let $f\in \mathbb F_q[x]$, let $g(x)=\gcd(f,x^q-x)$, let $s=(q-1)/2\in \Z$, and let $\delta$ be a uniformly random element of $\mathbb F_q$.If $g$ has more than one nonzero root, then $\Pr[0<\deg\gcd(g,(x+\delta)^s-1)<\deg g] \ge \frac{1}{2}.$
In [ ]:
def rational_root(f):
"""Returns a rational root of f or None if f has no rational roots."""
q = f.base_ring().cardinality()
assert q % 2 == 1
s = (q-1) // 2 # Note // yields an integer (whereas / would yield a rational)
x = f.parent().gen()
if f[0]==0:
return f.base_ring()(0)
g = distinct_root_poly(f)
print("Initial g = gcd(f,x^q-x) = %s" % g)
pig = g.parent().quotient(g)
while g.degree() > 1:
delta = g.base_ring().random_element()
print("delta = %s" % delta)
h = gcd(g, ((pig(x)-delta)^s-1).lift())
if 0 < h.degree() < g.degree():
g = h if 2*h.degree() <= g.degree() else g // h # as above, we use // to get a polynomial
pig = g.parent().quotient(g)
print("Better g = %s" % g)
return -g[0] if g.degree() == 1 else None


 Theorem: Let $n=\log q$ and $d=\deg f$. The expected running time of $\texttt{rational\_root}$ is $\tilde O(n^2d)$ bit operations.

Here the soft-O notation means up to a poly-logarithmic factors. See the lecture notes for a more precise bound and a proof.

In [ ]:
for i in range(5):
r = rational_root(f)
print("Found root %s\n" % r)
assert f(r) == 0


In [ ]:
h = product([x-i for i in range(1,20)])
for i in range(5):
r = rational_root(h)
print("Found root %s\n" % r)
assert h(r) == 0


In [ ]:
p=2^61-1;  F=GF(p); R.<x>=PolynomialRing(F);  f=x^8-2*x+5; h = product([x-i for i in range(1,20)])
for i in range(3):
%time print("Found root %s\n" % rational_root(h))


What if we want all the rational roots?

In [ ]:
def distinct_rational_roots(f,recurse=False):
"""Returns a list of all the distinct rational roots of f."""
q = f.base_ring().cardinality()
assert q % 2 == 1
s = (q-1) // 2
x = f.parent().gen()
roots = []
if recurse:
g = f
else:
g = distinct_root_poly(f)
if g(0) == 0:
roots = [g.base_ring()(0)]
g = g // x
if g.degree() <= 1:
return [-g[0]] if g.degree() == 1 else []
while True:
delta = g.base_ring().random_element()
pig = g.parent().quotient(g)
h = gcd(g, ((pig(x)-delta)^s-1).lift())
if 0 < h.degree() < g.degree():
roots += distinct_rational_roots(h, True)
roots += distinct_rational_roots(g//h, True)
return roots


 Theorem: Let $n=\log q$ and $d=\deg f$. The expected running time of $\texttt{distinct\_rational\_root}$ is $\tilde O(n^2d)$ bit operations.

This looks the same as the bound for $\texttt{rational\_root}$, but in fact there is an extra factor of $\log d$ hidden in the soft-O notation.

In [ ]:
%time distinct_rational_roots(f)

In [ ]:
%time distinct_rational_roots(product([x-F.random_element() for i in range(10)]))


What if we want all the rational roots and their multiplicities?

Yun: Compute the squarefree factorization!

 Definition: The squarefree factorization of a monic $f\in \mathbb F_q[x]$ is the unique list of monic squarefree polynomials $g_1,\ldots,g_m$ with $g_m\ne 1$ for which $f=g_1g_2^2\cdots g_m^m$. Key fact: If $f=f_1\cdots f_n$ is the factorization of a monic $f\in \mathbb F_q[x]$ into irreducibles then $f_i=f_j$ for some $i\ne j$ if and only if $f_i|\gcd(f,f')$.This holds in any perfect field.
In [ ]:
def squarefree_factorization(f):
"""Yun's algorithm to compute a list of polynomials g_1,..,g_m such that f=lc(f)g_1g_2^2...g_m^m with g_m != 1, assuming deg(f) < char(Fq).  We include g_0=1 for convenience."""
assert f.degree() < f.base_ring().characteristic()
if not f.is_monic:
f = f.monic()
df = f.derivative()
u = gcd(f,df)
v, w = f//u, df//u
gs =[f.parent()(1)]
while True:
g = gcd(v, w-v.derivative()); gs.append(g)
if v.degree() == g.degree():
return gs
v, w = v//g, (w-v.derivative())//g

 Theorem: Let $n=\log q$ and $d=\deg f$. The running time of $\texttt{squarefree\_factorization}$ is $\tilde O(nd)$ bit operations.

This is negligible compared to the complexity of $\texttt{distinct\_rational\_roots}$, so we might as well call $\texttt{squarefree\_factorization}$ first.
Even if we only want distinct roots, this will actually save time if $f$ is not squarefree.

In [ ]:
h = product([(x^2+F.random_element()*x+F.random_element())^i for i in range(1,10)])
%time gs = squarefree_factorization(h)
assert h == product([gs[i]^i for i in range(len(gs))])
gs

In [ ]:
def rational_roots(f):
"""Returns a list of tuples identifying the rational roots of f and their multiplicities."""
gs = squarefree_factorization(f)
return reduce(lambda x, y: x+y,[[(a,i) for a in distinct_rational_roots(gs[i])] for i in range(1,len(gs))])

In [ ]:
h = product([(x-F.random_element())^i for i in range(1,6)])
%time rational_roots(h)


What if we want to factor $f$ into irreducibles?

Rabin: You can do that with my root-finding algorithm.

Cantor-Zassenhaus: Yes, but that requires working in extension fields, and with our algorithm there is no need!

The first step is to compute the distinct degree factorization

 Definition: The distinct degree factorization of a monic $f\in \mathbb F_q[x]$ is the unique list of polynomials $g_1,\ldots,g_d$ with $f=g_1\cdots g_d$ such that each $g_i$ is a (possibly empty) product of distinct monic irreducible polynomials of degree $i$, for $1\le i\le d=\deg f$. Key point: This can be done deterministically by taking succesive gcds with $x^{q^i}-x$.
In [ ]:
def distinct_degree_poly(f,i):
q = f.base_ring().cardinality()
pif = f.parent().quotient(f)
return gcd(f,(pif(x)^(q^i)).lift()-x)

def distinct_degree_factorization(f, squarefree=False):
"""
Computes the distinct degree factorization of f as list gs with gs[0]=1 and gs[i]=g_i such that f=prod_i gs[i]^i.
The optional parameter squarefree inidicates whether f is known to be squarefree (usually one computes the squarefree factorization first).
"""
if not squarefree:
return naive_distinct_degree_factorization(f)
d = f.degree()
gs = [f.parent()(1) for i in range(d+1)]
for i in range(1,d+1):
if 2*i > f.degree():
gs[f.degree()] = f
break
gs[i] = distinct_degree_poly(f,i)
f = f // gs[i]
return gs

def naive_distinct_degree_factorization(f):
"""
Computed distinct degree factorization of arbitrary f by combining distinct degree factorization of polys in square free factorization.
There is no reason to ever use this, it is provided just for the sake of completeness
"""
d = f.degree()
gs = [f.parent()(1) for i in range(d+1)]
hs = [distinct_degree_factorization(g, squarefree=True) for g in squarefree_factorization(f)]
# combine results (this is silly, we are undoing work here)
for i in range(1,len(hs)):
for j in range(1,len(hs[i])):
gs[j] *= hs[i][j]^i
return gs

def facpat(f):
"""Returns a dictionary {d:i} whose keys are the degrees of the irreducible factors of f and whose values are the number of irreducible factors of that degree."""
X = {}

gs = squarefree_factorization(f)
for i in range(1,len(gs)):
gis = distinct_degree_factorization(gs[i],squarefree=True)
for j in range(1,len(gis)):
if gis[j].degree() > 0:
X[j] = X.get(j,0) + i * (gis[j].degree() // j)
return X

In [ ]:
%time facpat(f)

In [ ]:
h = (x^9-1)^2*(x^32-1)
assert h == product(distinct_degree_factorization(h))
%time print(facpat(h))
print([(r[0].degree(),r[1]) for r in h.factor()])


The key innovation introduced by Cantor-Zassenhaus is that when searching for irreducible factors $g$ of $f$ of degree $j$, rather than computing $\gcd(f,(x+\delta)^s-1))$ using a random $\delta\in \mathbb F_{q^j}$ and computing the gcd in $\mathbb F_{q^j}[x]$ (which is how one would go about finding a root of $g$ using Rabin's algorithm), it is better to compute $\gcd(f,u^s-1)$ in $\mathbb F_q[x]$ using a random $u\in \mathbb F_q[x]$ coprime to $f$ with $\deg u < \deg f$. This faster because we are working the ring $\mathbb F_q[x]/(f)$ rather than $\mathbb F_{q^j}[x]/(f)$ (so the bit-size of each element is smaller by a factor of $j$), and the probability of success is the same.

 Theorem: Let $f\in \mathbb F_q[x]$ be the product of $r\ge 2$ distinct monic irreducible polynomials of degree $j$ and let $u\in \mathbb F_q[j]$ be a random polynomial coprime to $f$ with $\deg u < \deg f$. Then $\Pr[0<\deg\gcd(f,u^s-1)<\deg f] = 2^{1-r} \ge \frac{1}{2}.$ Proof: This follows from applying the Chinese Remainder Theorem to $\mathbb F_q[x]/(f)\simeq \mathbb F_{q^j}^r$. See Section 3.12 of the lecture notes for details.
In [ ]:
def equal_degree_factorization(f,j):
"""Completely factors a monic polynomial f in Fq[x] that is known to be the product of distinct irreducible polynomials of degree j"""
d = f.degree()
assert d % j == 0
if d == 0:
return []
elif d == j:
return [f]
Fq = f.base_ring(); q = Fq.cardinality()
assert q%2 == 1
s = (q^j-1) // 2
x = f.parent().gen()
pif = f.parent().quotient(f)
while True:
# generate a random polynomial of degree < deg(f), note f.parent()(v) converts a vector of Fq coefficients to a polynomial
u = f.parent()([Fq.random_element() for i in range(d)])
h = gcd(f,u)
if 0 < h.degree() < d:
return equal_degree_factorization(h,j) + equal_degree_factorization(f//h,j)
# at this point we know u is coprime to f
h = gcd(f,(pif(u)^s-1).lift())
if 0 < h.degree() < d:
return equal_degree_factorization(h,j) + equal_degree_factorization(f//h,j)

def factorization(f):
"""Computes the complete factorization of a non-constant f in Fq[x] using the Cantor-Zassenhaaus algorithm (assumes q is odd)."""
assert f.degree() > 0
if not f.is_monic:
f = f.monic()
x = f.parent().gen()
res = []
gs = squarefree_factorization(f)
for i in range(1,len(gs)):
gis = distinct_degree_factorization(gs[i],squarefree=True)
for j in range(1,len(gis)):
res += [(g,i) for g in equal_degree_factorization(gis[j],j)]
# sort output to make it canonical (independent of random choices)
return sorted(res)


In [ ]:
assert factorization(f) == sorted(list(f.factor()))
print(factorization(f))
%timeit -n 10 factorization(f)
%timeit -n 10 f.factor()

In [ ]:
assert factorization(h) == sorted(list((h).factor()))
print(factorization(h))
%timeit -n 1 -r 5 factorization(h)
%timeit -n 1 -r 5 h.factor()

 Theorem: Let $n=\log q$ and $d=\deg f$. The expected running time of $\texttt{factorization}$ is $\tilde O(n^2d^2)$ bit operations.

There are other factorization methods based on linear algebra and modular composition that improve the dependence on $d$.
Currently the best asymptotic bound is achieved by the Kedlaya-Umans algorithm with an expected running time of $\tilde O(d^{3/2}n + dn^2)$.

In [ ]:
p=2^255-19;  F=GF(p); R.<x>=PolynomialRing(F);  f=x^8-2*x+5; h = product([x-i for i in range(1,20)])
assert factorization(f^3+f^2+1) == sorted(list((f^3+f^2+1).factor()))
%time print(factorization(f))


In [ ]: