Section 12.5 Introduction to Factorization
ΒΆSubsection 12.5.1 Factorization and the RSA
ΒΆLet's look at another toy RSA problem to get a sense of what is going on. First, I choose a modulus n=899. I will also use Sage to verify it has two prime factors, without telling you what they are.xxxxxxxxxx
n=899
print("There are %s prime factors and their powers are %s and %s."%(len(n.factor()), n.factor()[0][1], n.factor()[1][1]))
xxxxxxxxxx
e=13
print("We choose n=%s and exponent e=%s, and verify that gcd(e,phi(n))=1: %s"%(n,e,1==gcd(e,euler_phi(n))))
xxxxxxxxxx
x=11
message=mod(x,n)^e
message
euler_phi(899)
directly.) Well, we do know n=899 and that e=13. That could help. Remember, if we knew p and q, we could easily calculate Ο(n) without even using Sage, which should be enough.
Question 12.5.1.
Can you quickly now factor n=899 without using Sage?
Hint: be smart about it. Think strategically; how should I have chosen a public modulus \(n\) to make this hard to do? How should \(p\) and \(q\) relate?
Sage note 12.5.2. Trying your primes yourself.
You can fill in the values you got for p
and q
here to make things work. Try it!
xxxxxxxxxx
p=
q=
f=inverse_mod(e,(p-1)*(q-1))
f
When we decrypt, we should get the original message x=11 again.
xxxxxxxxxx
message^f
Fact 12.5.3.
If I know Ο(n) and n, and know that n is a product of exactly two distinct primes, I can easily compute them both.
Proof.
Of course, if we know \(\phi(n)\text{,}\) we already can crack the code, but who cares; maybe we are given \(\phi(n)\) and \(n\) and want the factorization. Here is the short proof.
Suppose the (as yet unknown) primes are \(p\) and \(q\text{.}\) Then expand our formula to
We now can represent both \(p+q\) and \(pq\) as formulas in \(n\) and \(\phi(n)\text{:}\)
\(p+q=n-\phi(n)+1\)
\(pq=n\)
Where might we have a formula with \(p+q\) and \(pq\text{?}\) That should seem familiar β¦
So we can simply use the quadratic formula on this expression to get the values for \(p\) and \(q\text{!}\)
Example 12.5.4.
Continuing the example above,
gives
Subsection 12.5.2 Trial division
ΒΆThe first, and oldest, method of factoring is one you already know, and maybe used a few minutes ago β trial factorization, or trial division. It is the method we used with the Sieve of Eratosthenes; you just try each prime number, one by one. In Algorithm 6.2.2, do you remember what the highest number you would have to try is in order to factor a given n by trial division? (Can you prove it?) The following algorithm does this very naively (and slowly, even for trial division). Let's try to talk through what each step does.Sage note 12.5.5. Code for trial division.
This is one of the few places where it really is important to follow the code. That said, the details of the syntax are not as important as the algorithm β unless you want to harness the power of computers more effectively!
xxxxxxxxxx
def TrialDivFactor(n): # We define the function
p = next_prime(1) # We start off by testing the next prime after 1
top = ceil(math.sqrt(n)) # This was proved to be the biggest number we need
while p < top: # As long as the prime is less than that bound, we keep going
if mod(n,p)==0: # In this case, p divides n and we're done!
break # This is Python's way of saying we are done searching
p=next_prime(p) # Otherwise, we try the next prime until we're done looking
if n==1: # We probably could have checked for this right away
print("1 is not prime") # Well, 1 is not a prime!
elif p==n: # If we get all the way through and end with a prime...
print(n,"is prime") # Then our number was prime
elif mod(n,p)==0: # But otherwise... (!)
print(n,"factors as",p,"times",n/p) # We have a factorization!
else: # And finally...
print(n,"is prime") # We must have gotten lucky.
Algorithm 12.5.6. Trial Factorization.
To factor n, first enumerate the primes in ascending order p1,p2,β―pk, where pk is the largest prime less than or equal to βn. For each prime in order, check whether piβ£n. If it does, terminate by returning pi and n/pi; otherwise n must be prime.
xxxxxxxxxx
for z in range(1,18):
TrialDivFactor(z)
xxxxxxxxxx
def _(n=6739815371):
TrialDivFactor(n)
timeit('TrialDivFactor(%s)'%n)
xxxxxxxxxx
def _(n=6739815371):
print(n.trial_division())
timeit('%s.trial_division()'%n)
xxxxxxxxxx
def _(n=6739815371):
print(n.factor())
timeit('%s.factor()'%n)
xxxxxxxxxx
timeit('TrialDivFactor(997*991)')
xxxxxxxxxx
timeit('(997*991).trial_division()')
xxxxxxxxxx
timeit('(997*991).factor()')
Subsection 12.5.3 Starting in the middle
ΒΆSo much for trial division! But we have other tools at our disposal. Some of you might have tried something other than straight trial factorization when attacking n=899 from our earlier problem. Reason this way; since we know that someone is trying to protect a secret, they probably are not going to pick a number with primes like 3 and 5 in it. After all, that would be too easy to factor. In fact, it stands to reason that the primes p and q should be relatively large compared to n β so why not start in the middle? This was Fermat's idea for factoring larger numbers. However, he didn't just start with primes in the middle; for one thing, if your number is even somewhat big and you don't have a computer or huge list of primes, how would you know where to start? So Fermat became clever, as always, and used an algebraic identity to help himself along.Fact 12.5.7.
Write n=ab, with a>b, and assume n is odd. Then we can write n as a difference of two square numbers.
Proof.
Namely, \(n\) is the difference of the squares of \(s=\frac{a+b}{2}\) and \(t=\frac{a-b}{2}\text{:}\)
Remark 12.5.8.
Why is it fine to assume n is odd in these circumstances?
Algorithm 12.5.9. The Fermat factorization algorithm.
To find a factor for a number n, begin by seeking a perfect square s2 bigger than n, but still as close as possible. Now, do the following until you succeed, increasing s by one each time.
Check whether s2βn is itself a perfect square t2.
-
That means we essentially turned
s2βt2=n around into s2βn=t2.
Once you succeed, then s and t are not the factors of n; rather, they are
Proof.
It should be clear why \(a\) and \(b\) are the factors. But how do we know this algorithm terminates?
Assuming you started with \(s\) as instructed, eventually you will reach \(s=(n+1)/2\text{,}\) which is much larger than \(\sqrt{n}\text{.}\) But then \(((n+1)/2)^2-n=\frac{n^2+2n+1-4n}{4}=((n-1)/2)^2\text{.}\) You should check that this gives us the trivial factorization \(n=n\cdot 1\text{,}\) though! (See Exercise 12.7.11)
xxxxxxxxxx
def FermatFactor(n,verbose=False):
if n%2==0:
raise TypeError("Input must be odd!")
s=ceil(math.sqrt(n))
top=(n+1)/2
while is_square(s^2-n)==0:
if verbose:
print(s,"squared minus",n,"is",s^2-n,"which is not a perfect square")
s=s+1
t=sqrt(s^2-n)
print("Fermat found that",s,"squared minus",t,"squared equals",n)
if s^2==n:
print("So",n,"was already a perfect square,",s,"times",s)
elif s<top:
print("So",s+t,"times",s-t,"equals",(s-t)*(s+t),"which is",n)
elif s==top:
print("So Fermat did not find a factor, which means",n,"is prime!")
Example 12.5.10.
Before we move on, let's try to factor 143 and 93 using this algorithm. Remember, we start with s2βn, where s is the next integer above βn, and see if it is a perfect square; then we increase s by one each time.
After you attempt this by hand, you can see what Sage does with them to check.
xxxxxxxxxx
FermatFactor(143,verbose=True)
Well, we struck gold on the first try here! That happens if your number is the product of two primes which are two apart. (Such primes are known as twin primes, and have some interesting stories. Among other things, calculating with them helped find a bug in the Pentium computer chip in 1995; see Subsection 22.3.2.)
xxxxxxxxxx
FermatFactor(93,verbose=True)
As you can see, we probably would have been better off with trial division for n=93. It's obvious that it's divisible by 3, but that takes a long time to reach from the middle.