===============================================================================

Course: Math 311w-01

Laboratory 5: The RSA cipher and public-key cryptography, part 1

Date: 2011-11-11

================================================================================

Introduction

================================================================================

Today, we are starting a two-part laboratory on RSA encryption. RSA encryption

is a form of asymmetric public-key cryptography. Traditionally,

ciphers have relied on a symmetric private-key system. A special code is

created using a secret key. The key is used to encrypt the message. Then, the

message can only be decrypted by a person with the key. The cipher is symmetric

because the same key is used to both encrypt and decrypt the message. As long

as the key stays "private", the message is secure (well, at least that's what's

supposed to happen).

An asymmetric public-key cryptography system uses two different keys, one to

encrypt the message, and a different one to decrypt the message. Because the

two keys are different, it is okay to make one of them public, while keeping the

other private -- even if a spy knows how the message was encrypted, they still

cannot decrypt it. This means that you can publish your public key, and anybody

who wants to send you a secure message can, as long as you keep your private key

private. You can learn more about the details of public-key systems at

http://en.wikipedia.org/wiki/Public-key_cryptography

My explanation simplifies matters a little more than it should. It turns out

that the security of the RSA cipher relies on mathematical assumptions, such as

the difficulty of factoring large integers, that nobody has yet been able to

prove. Thus, there is a fear that some day, a

smart person will figure out how to read all our secrets. The 1992 movie

"Sneakers" is based on the fundamental dilemma of the RSA encryption system --

it gives us vital flexibility in creating new communication channels with people

we have never met, but we don't actually know if these communication channels

are secure. Bitcoin currency is a technology developed using related ideas, and

suffering from similar potential limitations.

Goals

================================================================================

This lab will probably be too long to complete in a single sitting. In the

first part of the lab, we will put together the core number-theory functions we

will need. In the second part of the lab, we will implement the RSA cipher, and

see how RSA lets you encode a secret message that will stay secret EVEN IF SPIES

KNOW THE CODE YOU ARE USING. Because the second part makes use of the first

part, you will need the first part working by next Friday.

As you can gather from our progress in textbook section 1.6 so far, RSA makes

use of modular arithmetic. If you flip ahead, you can see that RSA is rather

simple in appearance, just using exponentiation and modulus operations, except

for a few things related to GCDs. Thus, we need a few functions from basic

number theory to help us as we go.

- We need to be able to calculate gcd(x,y).

- We need to be able to tell when an integer is prime.

  * It will also be convenient to have a way to search for new primes.

- We need to be able to calculate the inverse of [a]_n.

We will use Python to write code for encrypting and decrypting a message. While

the RSA algorithm is (relatively) simple, it repeats certain calculations many

times, and the numbers used are much larger than the numbers we usually work

with by hand. The numbers are even larger than calculators can work with --

32-bit calculators can only represent integers as large as 2**31 ~= 2e9. In 2011,

RSA keys use integers about as large as 2**2000. However, some sophisticated

desktop calculator programs like "dc", "bc", and Python know how to represent

larger integers, only limited by the computer memory. While you can understand

the RSA algorithm, you would never want to do an encryption or decryption by

hand.
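
To see the difference in scale for yourself, here is a minimal sketch you can
paste into a Python session. It uses only built-in integer arithmetic, and the
variable names are made up for this illustration.

```python
# Python integers grow as large as memory allows, so 2**2000 is exact.
calculator_limit = 2**31      # roughly the 32-bit limit mentioned above
rsa_sized = 2**2000           # roughly the size of a 2011-era RSA integer

digits_small = len(str(calculator_limit))
digits_big = len(str(rsa_sized))

print("2**31 has %d digits" % digits_small)
print("2**2000 has %d digits" % digits_big)
```

The second number runs to several hundred digits, which is why nobody does
these calculations by hand.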

Part 1: Number theory tools

================================================================================

First, we want to make a library of the number-theory functions we will need.

Open a new file called "libnumbertheory.py". This is the file where we will

write our basic mathematical functions. We will use the import command to

include these functions in our cipher code.

To speed things up, I have posted a script called "util.py" which contains some

basic utility functions that will eventually be useful, but which are not

particularly important or interesting. Download this file and put it in the

same directory as libnumbertheory.py. Make the second line of

"libnumbertheory.py" read "from util import *".

http://www.math.psu.edu/treluga/311w/lab5/util.py

GCDs

--------------

To calculate gcd(x,y), we can use Euclid's algorithm, like we learned in class.

Write such a function. It doesn't need to be very long. My implementation is

6 lines, and you can probably do it in fewer. Use the following code

to test your implementation.

##<start code>##

TEST = False
if TEST:
    for (x,y,g) in [(2*3*5, 3*5*7,3*5),(2*89,7*89,89)]:
        print "#Test gcd: gcd(%d,%d)=%d=%d"%(x,y,gcd(x,y),g)

##<end code>##
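
If you would like an independent check beyond the two test cases above, one
option is to compare your gcd against the definition itself: the largest
integer dividing both arguments. The sketch below is not Euclid's algorithm and
is far too slow for large numbers. The names gcd_bruteforce and
agrees_with_bruteforce are made up for this illustration, and positive integer
inputs are assumed.

```python
def gcd_bruteforce(x, y):
    """Largest d dividing both x and y, found by trying every candidate.

    Assumes x and y are positive integers. Only meant as a slow,
    independent check on small inputs.
    """
    for d in range(min(x, y), 0, -1):
        if x % d == 0 and y % d == 0:
            return d  # d = 1 always divides both, so we always return

def agrees_with_bruteforce(candidate, limit=30):
    """True if candidate(x, y) matches gcd_bruteforce for 1 <= x, y < limit."""
    for x in range(1, limit):
        for y in range(1, limit):
            if candidate(x, y) != gcd_bruteforce(x, y):
                return False
    return True
```

Once your own gcd is written, calling agrees_with_bruteforce(gcd) should return
True.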

Primality testing

-------------------

Now, we need to be able to find prime numbers. We will use

a variant of Eratosthenes's Sieve (http://www.youtube.com/watch?v=9m2cdWorIq8).

The catch is that finding prime numbers is expensive, particularly as

numbers get large, so we do not want to waste more effort on this than

we have to. Write a function isprime(x) that returns True if x is prime

and False if x is composite. Use the following code to test.

##<start code>##

TEST=False
if TEST:
    for x in range(2,20):
        if isprime(x):
            print "# %d is prime."%x
        else:
            print "# %d is composite."%x

##<end code>##
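
For reference, here is a minimal sketch of the classical sieve, which lists
every prime below a bound rather than testing a single number. It is not the
isprime(x) function the lab asks for, but its output gives you a list to
compare your results against. The name sieve and the parameter limit are
assumptions made for this illustration and do not come from util.py.

```python
def sieve(limit):
    """Return the list of all primes p with 2 <= p < limit."""
    is_candidate = [True] * max(limit, 2)
    is_candidate[0] = is_candidate[1] = False
    for p in range(2, limit):
        if is_candidate[p]:
            # p survived every earlier crossing-out, so it is prime;
            # cross out its multiples, starting at p*p.
            for multiple in range(p * p, limit, p):
                is_candidate[multiple] = False
    return [p for p in range(2, limit) if is_candidate[p]]
```

The numbers your test loop reports as prime for x in range(2,20) should be
exactly the entries of sieve(20).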

Once your isprime(x) function is working, we can use it to find new prime

numbers with the following function.

##<start code>##

def nextprime(x):
    while not isprime(x):
        x += 1
    return x

##<end code>##

Note the speed of your function. For lab purposes, we will need to work with

prime numbers having at least 3 digits. What's the largest prime number your

algorithm can quickly find?
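
One way to put a number on "quickly" is to time a single call. The helper
below is a minimal sketch using the standard time module. The name time_call is
made up for this illustration, and the built-in sum is used as a stand-in
workload so that the sketch runs on its own.

```python
import time

def time_call(func, arg):
    """Call func(arg) once and return (result, elapsed_seconds)."""
    start = time.time()
    result = func(arg)
    elapsed = time.time() - start
    return result, elapsed

# Stand-in workload; in the lab you would pass nextprime instead,
# for example time_call(nextprime, 10**6).
total, seconds = time_call(sum, range(1000))
print("sum took %.6f seconds" % seconds)
```

Try starting points with more and more digits until the call takes longer than
you are willing to wait.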

Modular Arithmetic inverses

-----------------------------

Now, we need to calculate an inverse of [a]_n. We can check if this inverse

exists using the gcd implementation above and the condition 1==gcd(a,n), but we

need to do the matrix calculation to find the inverse. This is a complicated

calculation, where it is easy to introduce bugs and hard to find them, so rather

than writing your own function, use the code below. Note that this code uses

the numpy library to facilitate matrix and vector calculations.

##<start code>##

import numpy

def inverse(a,b):
    # This first part of the function
    # is a set of tests and conversions
    # that we use to set things up before
    # the main algorithm starts.
    assert isinstance(a,int) or isinstance(a,long)
    assert isinstance(b,int) or isinstance(b,long)
    if (b < 0): b = -b
    # Reducing mod b also maps a negative a to its
    # representative in the range 0 to b-1.
    a = a % b
    assert not 0==a
    assert 1 == gcd(b,a)

    # Now, we start the main algorithm.
    # Each row [r,s,t] of M satisfies r = s*a + t*b.
    M = [ numpy.array([a,1,0]), numpy.array([b,0,1])]
    i_big = 0
    if ( a < b ):
        i_big = 1
    while M[i_big][0] > 0:
        i_big = 1 - i_big
        # Subtract a whole-number multiple of the other row
        # ("//" is integer division).
        M[i_big] -= M[1-i_big]*(M[i_big][0]//M[1-i_big][0])

    # The algorithm has finished. Now, we make sure the
    # result is in the form we need, and we return it.
    inv = long((M[1-i_big][1]+b)%b)
    return inv

TEST = False
if TEST:
    for a, n in [ (5,26), (11,23), (27,128) ]:
        y = inverse(a,n)
        print "#Check inverse: [%d][%d] mod %d = %d"%(a,y,n,(a*y)%n)

##<end code>##
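
If you are curious what the row reduction is doing, here is a stripped-down
sketch of the same idea in plain Python, without numpy. It keeps two pairs
(r, s) that always satisfy r = s*a (mod n) and reduces them just as Euclid's
algorithm does. This is an illustration under the stated assumptions, not a
replacement for the inverse function above; the name inverse_rows is made up
here.

```python
def inverse_rows(a, n):
    """Return y with (a*y) % n == 1, assuming n > 1 and gcd(a, n) == 1."""
    # Each pair (r, s) satisfies r == s*a (mod n) at every step.
    big, small = (n, 0), (a % n, 1)
    while small[0] > 0:
        q = big[0] // small[0]
        # Replace the larger pair by (larger - q * smaller).
        big, small = small, (big[0] - q * small[0], big[1] - q * small[1])
    # Now big[0] is gcd(a, n); an inverse exists only if it is 1.
    if big[0] != 1:
        raise ValueError("no inverse: gcd(a, n) is not 1")
    return big[1] % n

for a, n in [(5, 26), (11, 23), (27, 128)]:
    y = inverse_rows(a, n)
    print("[%d][%d] mod %d = %d" % (a, y, n, (a * y) % n))
```

Each printed line should end in 1, matching the check in the test code above.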

Part 2: RSA encryption

===========================================

Now, we have all the parts we need to implement the RSA cipher. You may

discover that the functions you have written are not "good enough" because they

are slow or buggy, but that's something we'll assess as we make progress.

RSA.py

--------------

We will implement the RSA cipher as a pair of classes in Python. A class is a

concept from the somewhat obsolete object-oriented programming paradigm. It is a

set of data variables and some functions that operate on them in specific ways.

Here, we want to associate encryption and decryption routines with specific

public and private key pairs.

Open up a new file "RSA.py" and fill in the following code. Note that gcd,

isprime, and inverse all appear at least once. The symbol "**" appears for

exponentiation, while "%" appears for modulus.

##<start code>##

from libnumbertheory import *

class RSAKey:
    # A key is a modulus n together with an exponent k.
    def __init__(self,n,k):
        self.n = n
        self.k = k
    def crypt(self,m):
        # m is a sequence of integers; each one is raised
        # to the power k and reduced mod n.
        return [ long((m_i**self.k)%self.n) for m_i in m ]
    def __str__(self):
        return str( (self.n,self.k) )

class RSACipher:
    def __init__(self,p,q,a):
        assert isprime(p)
        assert isprime(q)
        p,q = long(p),long(q)
        n = p*q
        totient = (p-1)*(q-1)
        # a must be invertible mod the totient
        assert 1==gcd(a,totient)
        x = inverse(a,totient)%n
        self.public_key = RSAKey(n,a)
        self.private_key = RSAKey(n,x)
        # extra things that should be forgotten
        self.primes = (p,q)
        self.totient = totient
    def encrypt(self,m):
        return self.public_key.crypt(m)
    def decrypt(self,b):
        return self.private_key.crypt(b)

##<end code>##
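
To check the arithmetic independently of the classes, here is a minimal worked
sketch with toy numbers. The primes, exponents, and message values are chosen
only for illustration and are far too small to be secure. The private exponent
is written out by hand rather than computed, and the built-in three-argument
pow(m, k, n) is used in place of (m**k) % n; it returns the same value but is
computed more efficiently when k is large.

```python
# Toy parameters, far too small to be secure.
p, q = 61, 53
n = p * q                      # 3233
totient = (p - 1) * (q - 1)    # 3120
a = 17                         # public exponent, coprime to the totient
x = 2753                       # private exponent, inverse of a mod totient
assert (a * x) % totient == 1

message = [65, 100, 3000]      # each entry must be smaller than n

# Encrypt with the public exponent, decrypt with the private one.
ciphertext = [pow(m, a, n) for m in message]
recovered = [pow(c, x, n) for c in ciphertext]

assert ciphertext[0] == (65 ** a) % n   # same value as the "**" and "%" form
assert recovered == message
print("ciphertext: %s" % ciphertext)
print("recovered:  %s" % recovered)
```

If your RSACipher(61, 53, 17) produces different numbers for the same message,
that is a hint that one of the number-theory functions needs another look.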

Using RSA.py

-------------------

I've provided two scripts to show you how RSA.py can now be used

to encrypt and decrypt messages. First, the script

http://www.math.psu.edu/treluga/311w/lab5/test_with_fixed_keys.py

shows how you can make your own keys using prime numbers.

Second, the script

http://www.math.psu.edu/treluga/311w/lab5/test_with_rand_keys.py

shows how random keys can be generated.
