Course: Math 311w-01
Laboratory 5: The RSA cipher and public-key cryptography, part 1
Today, we are starting a two-part laboratory on RSA encryption. RSA encryption
is a certain form of asymmetric public-key cryptography system. Traditionally,
ciphers have relied on a symmetric private key system. A special code is
created using a secret key. The key is used to encrypt the message. Then, the
message can only be decrypted by a person with the key. The cipher is symmetric
because the same key is used to both encrypt and decrypt the message. As long
as the key stays "private", the message is secure (well, at least that's what's
supposed to happen).
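To make the symmetric idea concrete, here is a small sketch of a shift cipher on capital letters, where one secret key both encrypts and decrypts. The names shift_encrypt and shift_decrypt and the key value 3 are made up for this illustration; it is a toy, not a secure cipher.

```python
# Toy symmetric cipher: shift each capital letter by a secret key.
# shift_encrypt / shift_decrypt are illustrative names, not lab files.

def shift_encrypt(message, key):
    out = []
    for ch in message:
        if 'A' <= ch <= 'Z':
            out.append(chr((ord(ch) - ord('A') + key) % 26 + ord('A')))
        else:
            out.append(ch)   # leave spaces and punctuation alone
    return "".join(out)

def shift_decrypt(ciphertext, key):
    # Decrypting uses the SAME key, applied in the opposite direction.
    return shift_encrypt(ciphertext, -key)

secret_key = 3
ciphertext = shift_encrypt("MEET AT NOON", secret_key)
print(ciphertext)
print(shift_decrypt(ciphertext, secret_key))
```

Anybody who learns secret_key can run shift_decrypt, which is exactly why a symmetric key has to stay private.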
An asymmetric public-key cryptography system uses two different keys, one to
encrypt the message, and a different one to decrypt the message. Because the
two keys are different, it is okay to make one of them public, while keeping the
other private -- even if a spy knows how the message was encrypted, they still
cannot decrypt it. This means that you can publish your public key, and anybody
who wants to send you a secure message can, as long as you keep your private key
private. You can learn more about the details of this latter at
My explanation simplifies matters a little more than it should. It turns out
that the RSA cipher relies on certain mathematical properties that nobody has
yet been able to establish the truth of. Thus, there is a fear that some day, a
smart person will figure out how to read all our secrets. The 1992 movie
"Sneakers" is based on the fundamental dilemma of the RSA encryption system --
it gives us vital flexibility in creating new communication channels with people
we have never met, but we don't actually know if these communication channels
are secure. Bitcoin currency is a technology developed using related ideas, and
suffering from similar potential limitations.
This lab will probably be too long to complete in a single sitting. In the
first part of the lab, we will put together the core number-theory functions we
will need. In the second part of the lab, we will implement the RSA cipher, and
see how RSA lets you encode a secret message that will stay secret EVEN IF SPIES
KNOW THE CODE YOU ARE USING. Because the second part makes use of the first
part, you will need the first part working by next Friday.
As you can gather from our progress in textbook section 1.6 so far, RSA makes
use of modular arithmetic. If you flip ahead, you can see that RSA is rather
simple in appearance, just using exponentiation and modulus operations, except
for a few things related to GCDs. Thus, we need a few functions from basic
number theory to help us as we go.
- We need to be able to calculate gcd(x,y).
- We need to be able to tell when an integer is prime.
* It will also be convenient to have a way to search for new primes.
- We need to be able to calculate the inverse of [a]_n.
We will use Python to write code for encrypting and decrypting a message. While
the RSA algorithm is (relatively) simple, it repeats certain calculations many
times, and the numbers used are much larger than the numbers we usually work
with by hand. The numbers are even larger than calculators can work with --
32-bit calculators can only represent integers as large as 2**31 ~= 2e9. In 2011,
RSA keys use integers about as large as 2**2000. However, some sophisticated
desktop calculator programs like "dc", "bc", and Python know how to represent
larger integers, only limited by the computer memory. While you can understand
the RSA algorithm, you would never want to do an encryption or decryption by
hand.
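You can check Python's large-integer support for yourself. The short sketch below is only an illustration (the variable names small and big are arbitrary); it prints 2**31, counts the decimal digits of 2**2000, and confirms that arithmetic at that size is still exact.

```python
# Python integers grow as needed, so they are not capped at 32 bits.
small = 2**31
big = 2**2000
print(small)              # 2147483648
print(len(str(big)))      # number of decimal digits in 2**2000
# Arithmetic on numbers this size is exact, not rounded.
print((big + 1) - big)
```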
Part 1: Number theory tools
First, we want to make a library of the number-theory functions we will need.
Open a new file called "libnumbertheory.py". This is the file where we will
write our basic mathematical functions. We will use the import command to
include these functions in our cipher code.
To speed things up, I have posted a script called "util.py" which contains some
basic utility functions that will eventually be useful, but which are not
particularly important or interesting. Download this file and put it in the
same directory as libnumbertheory.py. Make the second line of
"libnumbertheory.py" read "from util import * "
To calculate gcd(x,y), we can use Euclid's algorithm, like we learned in class.
Write such a function. It doesn't need to be very long. My implementation is
6 lines, and you can probably do it in fewer. Use the following code
to test your implementation.
    TEST = False   # set to True to run the check
    if TEST:
        for (x,y,g) in [(2*3*5, 3*5*7,3*5),(2*89,7*89,89)]:
            print "#Test gcd: gcd(%d,%d)=%d=%d"%(x,y,gcd(x,y),g)
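For reference, Euclid's algorithm repeatedly replaces the pair (x, y) with (y, x mod y) until the second entry is 0. The sketch below is one possible shape for such a function, assuming both inputs are non-negative integers; the name gcd_sketch is chosen here so that it does not collide with your own gcd, and it is not meant as the expected solution.

```python
def gcd_sketch(x, y):
    # Euclid's algorithm: gcd(x, y) = gcd(y, x mod y), and gcd(x, 0) = x.
    while y != 0:
        x, y = y, x % y
    return x

for (x, y, g) in [(2*3*5, 3*5*7, 3*5), (2*89, 7*89, 89)]:
    print("gcd(%d,%d) = %d, expected %d" % (x, y, gcd_sketch(x, y), g))
```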
Now, we need to be able to find prime numbers. We will use
a variant of Eratosthenes's Sieve (http://www.youtube.com/watch?v=9m2cdWorIq8).
The catch is that finding prime numbers is expensive, particularly as
numbers get large, so we do not want to waste more effort on this than
we have to. Write a function isprime(x) that returns True if x is prime
and False if x is composite. Use the following code to test.
    for x in range(2,20):
        if isprime(x):
            print "# %d is prime."%x
        else:
            print "# %d is composite."%x
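One simple way to meet this specification is trial division: test every candidate divisor up to the square root of x. This is a plain stand-in rather than the sieve variant mentioned above, and the name isprime_sketch is an assumption made for this example.

```python
def isprime_sketch(x):
    # Trial division: x >= 2 is composite exactly when some d with
    # d >= 2 and d*d <= x divides it.
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True

print([x for x in range(2, 20) if isprime_sketch(x)])
```

Stopping at the square root matters for speed: a composite x always has a divisor no larger than its square root, so there is no need to look further.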
Once your isprime(x) function is working, we can use it to find new prime
numbers with the following function.
    def nextprime(x):   # the name nextprime is a placeholder
        while not isprime(x):
            x += 1
        return x
Note the speed of your function. For lab purposes, we will need to work with
prime numbers having at least 3 digits. What's the largest prime number your
algorithm can quickly find?
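One way to answer that question is to time the search. The sketch below assumes nothing about your own code: timed_next_prime is a hypothetical helper that takes any primality test as an argument, and slow_isprime is a deliberately naive stand-in used only so the example runs by itself.

```python
import time

def slow_isprime(x):
    # Deliberately naive stand-in: tries every divisor below x.
    return x >= 2 and all(x % d != 0 for d in range(2, x))

def timed_next_prime(isprime, x):
    # Search upward from x and report how long the search took.
    start = time.time()
    while not isprime(x):
        x += 1
    return x, time.time() - start

for first in [100, 1000, 10000]:
    p, seconds = timed_next_prime(slow_isprime, first)
    print("first prime >= %d is %d (%.4f s)" % (first, p, seconds))
```

Pass your own isprime in place of slow_isprime and raise the starting values until the search stops feeling quick.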
Modular Arithmetic inverses
Now, we need to calculate an inverse of [a]_n. We can check if this inverse
exists using the gcd implementation above and the condition 1==gcd(a,n), but we
need to do the matrix calculation to find the inverse. This is a complicated
calculation, where it is easy to introduce bugs and hard to find them, so rather
than writing your own function, use the code below. Note that this code uses
the numpy library to facilitate matrix and vector calculations.
    import numpy

    def inverse(a,b):
        # This first part of the function
        # is a set of tests and conversions
        # that we use to set things up before
        # the main algorithm starts.
        assert isinstance(a,int) or isinstance(a,long)
        assert isinstance(b,int) or isinstance(b,long)
        if (b < 0): b = -b
        # Reduce a into the range 0..b-1 (this also handles negative a).
        a = a % b
        assert not 0==a
        assert 1 == gcd(b,a)
        # Now, we start the main algorithm.  Each row [r,s,t] of M
        # satisfies r = s*a + t*b (mod b, r = s*a).
        M = [ numpy.array([a,1,0]), numpy.array([b,0,1])]
        i_big = 0
        if ( a < b ):
            i_big = 1
        while M[i_big][0] > 0:
            i_big = 1 - i_big
            M[i_big] -= M[1-i_big]*(M[i_big][0]//M[1-i_big][0])
        # The algorithm has finished.  Now, we make sure the
        # result is in the form we need, and we return it.
        inv = long((M[1-i_big][1]+b)%b)
        return inv
    TEST = False   # set to True to run the check
    if TEST:
        for a, n in [ (5,26), (11,23), (27,128) ]:
            y = inverse(a,n)
            print "#Check inverse: [%d][%d] mod %d = %d"%(a,y,n,(a*y)%n)
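The idea behind the calculation is to carry, alongside each remainder r of Euclid's algorithm, a coefficient s with r = s*a (mod n); when the remainder reaches 1, s is the inverse of a. As a cross-check on the provided function, the following sketch does the same bookkeeping with plain integers instead of numpy rows. The name inverse_sketch is hypothetical, and the function assumes n > 1.

```python
def inverse_sketch(a, n):
    # Invariant: r0 == s0*a (mod n) and r1 == s1*a (mod n).
    r0, s0 = n, 0
    r1, s1 = a % n, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    # r0 is now gcd(a, n); an inverse exists only when it equals 1.
    if r0 != 1:
        raise ValueError("a has no inverse modulo n")
    return s0 % n

for a, n in [(5, 26), (11, 23), (27, 128)]:
    y = inverse_sketch(a, n)
    print("[%d][%d] mod %d = %d" % (a, y, n, (a * y) % n))
```

Each line of output should end in 1, since [a][y] = [1] is what it means for y to be the inverse of a modulo n.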
Part 2: RSA encryption
Now, we have all the parts we need to implement the RSA cipher. You may
discover that the functions you have written are not "good enough" because they
are slow or buggy, but that's something we'll assess as we make progress.
We will implement the RSA cipher as a pair of classes in Python. A class is a
concept from the somewhat obsolete object-oriented programming paradigm. It is a
set of data variables and some functions that operate on them in specific ways.
Here, we want to associate encryption and decryption routines with specific
public and private key pairs.
Open up a new file "RSA.py" and fill in the following code. Note that gcd,
isprime, and inverse all appear at least once. The symbol "**" appears for
exponentiation, while "%" appears for modulus.
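    from libnumbertheory import *

    # Note: only the name RSAKey is certain.  The names apply and
    # RSAKeyPair, the argument lists, and the two assert lines are
    # placeholders.
    class RSAKey:
        def __init__(self,n,k):
            self.n = n
            self.k = k
        def apply(self,m):
            # m is a list of integers, each smaller than n
            return [ long((m_i**self.k)%self.n) for m_i in m ]
        def __str__(self):
            return str( (self.n,self.k) )

    class RSAKeyPair:
        def __init__(self,p,q,a):
            p,q = long(p),long(q)
            assert isprime(p) and isprime(q)
            n = p*q
            totient = (p-1)*(q-1)
            assert 1 == gcd(a,totient)
            x = inverse(a,totient)%n
            self.public_key = RSAKey(n,a)
            self.private_key = RSAKey(n,x)
            # extra things that should be forgotten
            self.primes = (p,q)
            self.totient = totient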
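To see the arithmetic these classes carry out, here is a stand-alone sketch with deliberately tiny primes. It does not use RSA.py or libnumbertheory; the values p = 61, q = 53, a = 17, the sample message, and the helper small_inverse are all illustrative choices, and real keys use far larger primes.

```python
def small_inverse(a, m):
    # Brute-force inverse, fine only for a tiny modulus like the one below.
    for x in range(1, m):
        if (a * x) % m == 1:
            return x
    raise ValueError("no inverse")

p, q = 61, 53
n = p * q                        # modulus shared by both keys
totient = (p - 1) * (q - 1)
a = 17                           # public exponent, coprime to the totient
x = small_inverse(a, totient)    # private exponent

message = [65, 66, 67]           # each number must be smaller than n
cipher = [(m_i ** a) % n for m_i in message]    # encrypt with (n, a)
plain = [(c_i ** x) % n for c_i in cipher]      # decrypt with (n, x)

print("public key:  %s" % ((n, a),))
print("private key: %s" % ((n, x),))
print("cipher: %s" % cipher)
print("plain:  %s" % plain)
```

The last line should reproduce the original message. Knowing n and a alone is not enough to undo the encryption, because finding x requires the totient, which in turn requires the primes p and q.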
I've provided two scripts to show you how RSA.py can now be used
to encrypt and decrypt messages. First
shows how you can make your own keys using prime numbers.
Second, the script
shows how random keys can be generated.