Introduction to Cryptography

innocentsick, Artificial Intelligence and Robotics

21 Nov 2013 (3 years and 8 months ago)

CMPS 122, Spring 2004
What is cryptology?

Greek: krypto = hide
Cryptology: the science of hiding
  cryptography + cryptanalysis + steganography
Cryptography: secret writing
Cryptanalysis: analyzing (breaking) secrets
  Deciphering (decryption) is what we do
  Cryptanalysis is what they do

Steganography

"Covered" messages
Technical steganography
  Invisible ink, shaved heads, microdots
Linguistic steganography
  "Open code": the secret message appears innocent
    "East wind rain" = war with USA
    Broken dolls in WWII
Hide message in low-order bits in GIF
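
The low-order-bit idea can be sketched on a plain byte buffer. This is only an illustration under stated assumptions: real GIF pixel data is palette-indexed and compressed, so a working tool would first have to decode the image. The names `embed_bits` and `extract_bits` are hypothetical.

```python
def embed_bits(cover: bytes, message: bytes) -> bytes:
    """Hide each message bit in the lowest bit of one cover byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(stego)


def extract_bits(stego: bytes, length: int) -> bytes:
    """Rebuild `length` message bytes from the lowest bits of the stego bytes."""
    out = bytearray()
    for i in range(length):
        value = 0
        for b in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(64))          # stand-in for raw pixel values (assumption)
stego = embed_bits(cover, b"hi")
recovered = extract_bits(stego, 2)
```

Each cover byte changes by at most 1, which is why the altered image looks the same to a casual observer.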

Cryptology vs. security

Cryptology is a branch of mathematics
  Lots of formal representation
  Proofs about encryption are possible
Security is a system issue
  Easiest way to violate security is through people!
  Security uses cryptology and other tools

Terminology

[Diagram: Alice encrypts the plaintext and sends the ciphertext over an insecure channel, where Eve can listen; Bob decrypts the ciphertext to recover the plaintext.]
C = E(P)
P = D(C)
E must be invertible

Kerckhoffs's Principle

Cryptography always involves two things
  Transformation
  Secret
Security should depend only on the secrecy of the key
  Assume the enemy can get the algorithm
    Can capture machines (or people), disassemble programs, etc.
  Very expensive and difficult to invent a new algorithm if the old one might have been compromised
Security through obscurity isn't
  Look at history of examples
  Better to have scrutiny by open experts
"The enemy knows the system being used." (Claude Shannon)

Alice and Bob

[Diagram: Alice encrypts the plaintext with key $K_E$; Bob decrypts the ciphertext with key $K_D$ to recover the plaintext.]
$C = E(K_E, P) = E_{K_E}(P)$
$P = D(K_D, C) = D_{K_D}(C)$
$K_E = K_D$ => symmetric encryption
$K_E \neq K_D$ => asymmetric encryption

Overview of modern cryptography

Three basic types of algorithms
  Symmetric (shared) key encryption
  Asymmetric (public key) encryption
  Secure hash functions
For each type of algorithm, many choices
  Symmetric key: DES, AES, Blowfish, RC5, RC6
  Asymmetric key: RSA, ElGamal, elliptic curve
  Secure hash function: MD4, MD5, SHA-1, RIPEMD
Different implementations within a type of algorithm share many characteristics
  Goal, approach are similar
  Specific implementation details may differ
Good books on algorithms include Applied Cryptography (somewhat dated) and Practical Cryptography

Symmetric key encryption

Encryption key and decryption key are identical
Strength of algorithm is usually proportional to $2^{\text{key length}}$
  Assumes a truly random key!
Algorithm is usually fast
  Around 20 cycles per byte for many algorithms
  Upwards of 100 MB/s possible on today's CPUs
  Straightforward to build hardware to run the algorithm
Decryption may be the same algorithm as encryption, but isn't always
[Diagram: Alice encrypts the plaintext with the shared key $K_S$; Bob decrypts the ciphertext with the same $K_S$.]

Asymmetric key encryption

Keys come in pairs: <KU, KR> ($K_U$ is public, $K_R$ is private)
  Designation of which is public and which is private is arbitrary
  Knowing one key of a pair won't help you figure out the other one
Encryption and decryption are typically the same algorithm
  May be applied in either order (public or private encrypt first)
  $D_{K_R}(E_{K_U}(m)) = D_{K_U}(E_{K_R}(m)) = m$
Usually much slower than symmetric key encryption
  Speed much less than 1 MB/s
[Diagram: Alice encrypts the plaintext with $K_U$; Bob decrypts the ciphertext with $K_R$.]

Secure hash functions

Variable-length input produces fixed-size output
  Similar to encryption, but without a key and with the output blocks collapsed together
Secure: difficult to construct fake plaintexts
  Weak collision resistance: difficult to find a plaintext with the same hash value as a given, randomly chosen plaintext
  Strong collision resistance: difficult to find any pair of plaintexts with the same hash value
Useful because a secure hash value can serve as a stand-in for the plaintext in various other functions
[Diagram: Alice passes the plaintext through a secure hash and sends the hash value to Bob.]
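
The fixed-size-output property can be checked with Python's standard `hashlib`. SHA-256 is used here as a stand-in of my choosing; it is not one of the functions named on the overview slide.

```python
import hashlib

# Inputs of very different lengths produce digests of the same size.
short_digest = hashlib.sha256(b"a").hexdigest()
long_digest = hashlib.sha256(b"a" * 1_000_000).hexdigest()

# A different input gives an unrelated-looking digest.
near_digest = hashlib.sha256(b"b").hexdigest()
```

Both digests are 256 bits (64 hex characters) regardless of input length, which is what lets a hash value stand in for a long plaintext.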

Simple cipher: Substitution Cipher

$C = E_K(p)$
  $C_i = K[p_i]$
Key is alphabet mapping
  a => J, b => L, ...
Suppose attacker knows algorithm but not key, how many keys to try?
  Answer: 26! (26 factorial), about 4 × 10^26
If every person on earth (about 6 billion) tried one per second, it would take roughly 2 billion years
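
A minimal sketch of the substitution cipher and of the brute-force estimate. The key below agrees with the mappings used in this deck's example (a to J, b to L, and so on); the four letters the example never uses (j, q, x, z) are filled in arbitrarily, and the world population figure is an assumption.

```python
import math
import string

# Hypothetical full key: position i holds the ciphertext letter for the
# i-th plaintext letter (a -> J, b -> L, c -> P, ...).
KEY = "JLPAWIQBCTRZYDSKEGFXHUONVM"


def substitute(text: str, key: str) -> str:
    """Encrypt letter by letter: C_i = K[p_i]; non-letters pass through."""
    table = str.maketrans(string.ascii_lowercase, key)
    return text.lower().translate(table)


ciphertext = substitute("the urge", KEY)

# Size of the key space and a rough exhaustive-search time.
keyspace = math.factorial(26)
people = 6_000_000_000                  # assumption: approximate world population
seconds_per_year = 365.25 * 24 * 3600
years = keyspace / people / seconds_per_year
```

Under these assumptions the search takes on the order of billions of years, so the weakness of this cipher is not the size of its key space.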

Monoalphabetic Cipher

XBW HGQW XS ACFPSUWG FWPGWXF
CF AWWKZV CDQGJCDWA CD BHYJD
DJXHGW; WUWD XBW ZWJFX PHGCSHF
YCDA CF GSHFWA LV XBW KGSYCFW SI
FBJGCDQ RDSOZWAQW OCXBBWZA
IGSY SXBWGF.

We know:
  This is English text.
  It uses a monoalphabetic cipher

Frequency Analysis

XBW HGQW XS ACFPSUWG FWPGWXF CF AWWKZV
CDQGJCDWA CD BHYJD DJXHGW; WUWD XBW
ZWJFX PHGCSHF YCDA CF GSHFWA LV XBW
KGSYCFW SI FBJGCDQ RDSOZWAQW OCXBBWZA
IGSY SXBWGF.

Ciphertext counts    Normal English
W: 20                e  12%
C: 11                t   9%
F: 11                a   8%
G: 11
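
The letter counts above can be reproduced with a few lines of Python. This is a small sketch using `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

CIPHERTEXT = (
    "XBW HGQW XS ACFPSUWG FWPGWXF CF AWWKZV "
    "CDQGJCDWA CD BHYJD DJXHGW; WUWD XBW "
    "ZWJFX PHGCSHF YCDA CF GSHFWA LV XBW "
    "KGSYCFW SI FBJGCDQ RDSOZWAQW OCXBBWZA "
    "IGSY SXBWGF."
)

# Count letters only, ignoring spaces and punctuation.
counts = Counter(ch for ch in CIPHERTEXT if ch.isalpha())
most_common_letter, most_common_count = counts.most_common(1)[0]
```

W stands out clearly, which makes it the natural candidate for plaintext "e".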

Pattern Analysis

Most common trigrams in English:
  the = 6.4%
  and = 3.4%
XBe = "the"?
(Lowercase letters are guessed plaintext; W = e comes from the frequency counts.)

XBe HGQe XS ACFPSUeG FePGeXF CF AeeKZV CDQGJCDeA CD BHYJD DJXHGe; eUeD XBe ZeJFX PHGCSHF YCDA CF GSHFeA LV XBe KGSYCFe SI FBJGCDQ RDSOZeAQe OCXBBeZA IGSY SXBeGF.


Guessing

the HGQe tS ACFPSUeG FePGetF CF AeeKZV CDQGJCDeA CD hHYJD DJtHGe; eUeD the ZeJFt PHGCSHF YCDA CF GSHFeA LV the KGSYCFe SI FhJGCDQ RDSOZeAQe OCthheZA IGSY StheGF.

tS = "to" => S = "o"


Guessing

the HGQe to ACFPoUeG FePGetF CF AeeKZV CDQGJCDeA CD hHYJD DJtHGe; eUeD the ZeJFt PHGCoHF YCDA CF GoHFeA LV the KGoYCFe oI FhJGCDQ RDoOZeAQe OCthheZA IGoY otheGF.

F appears at the end of many words => likely a consonant
CF is a common two-letter word => C likely a vowel
F = "s" and C = "i"
otheGs = "others" => G = "r"


Guessing

the HrQe to AisPoUer sePrets is AeeKZV iDQrJiDeA iD hHYJD DJtHre; eUeD the ZeJst PHrioHs YiDA is roHseA LV the KroYise oI shJriDQ RDoOZeAQe OithheZA IroY others.

sePrets = "secrets" => P = "c"
AiscoUer = "discover" => A = "d", U = "v"
iD = "if" or "in", but D ends two words (unlikely to be "f")
oI = "on" or "of" ("r" already deciphered)
D = "n" and I = "f"


Guessing

the HrQe to discover secrets is deeKZV inQrJined in hHYJn nJtHre; even the ZeJst cHrioHs Yind is roHsed LV the KroYise of shJrinQ RnoOZedQe OithheZd froY others.

At this point, start completing individual words.
Yind = "mind" & froY = "from" => Y = "m"
Kromise = "promise" => K = "p"
cHrioHs = "curious" => H = "u"
And so on


Monoalphabetic Cipher

The urge to discover secrets is deeply ingrained
in human nature; even the least curious mind
is roused by the promise of sharing knowledge
withheld from others.
- John Chadwick, The Decipherment of Linear B

Why was it so easy?

Doesn't hide statistical properties of plaintext
  Common letters in plaintext will result in common symbols in ciphertext
Doesn't hide relationships in plaintext
  EE cannot match dg
English (and all natural languages) are very redundant
  About 1.3 bits of information per letter
  Many combinations of letters simply don't exist or aren't common
  Compressing English (e.g., with gzip) shrinks it several-fold; the ideal factor is about 6
    8 bits/letter / 1.3 bits of information per letter ≈ 6

How can we make it tougher?

Cosmetic: use different symbols
Hide statistical properties:
  Encrypt "e" with 12 different symbols, "t" with 9 different symbols, etc.
  Add nulls, remove spaces
Polyalphabetic cipher
  Use different substitutions
Transposition
  Scramble order of letters

Types of attacks

Ciphertext-only
  How much ciphertext is needed?
Known plaintext
  Often "guessed plaintext"
Chosen plaintext (get ciphertext)
  Not as uncommon as it sounds!
Chosen ciphertext (get plaintext)
Leave these to the professionals:
  Dumpster diving
  Social engineering
  "Rubber-hose cryptanalysis" (actually an advanced form of social engineering)
    Use threats, blackmail, torture, and bribery to get the key.

Really brief history: first 4000 years

[Timeline: cryptographers vs. cryptanalysts]
Cryptographers
  3000 BC: monoalphabetics
  1460: Alberti, first polyalphabetic cipher
  Vigenère
Cryptanalysts
  900: al-Kindi, frequency analysis
  1854: Babbage breaks Vigenère; Kasiski (1863) publishes

Really brief history: last 100 years

[Timeline starting at 1854, with marks at 1918, 1939, 1945, and 1973: cryptographers vs. cryptanalysts]
Cryptographers
  Mauborgne: one-time pad
  Mechanical ciphers - Enigma
  Enigma adds rotors, stops repeated key
  Feistel block cipher, DES
  Public-Key
  Quantum Crypto
Cryptanalysts
  Rejewski: repeated message-key attack
  Turing's loop attacks, Colossus
  Linear, Differential Cryptanalysis
  ?

How does cryptology advance?

Arms race between cryptographers and cryptanalysts
  Often, disconnect between the two (e.g., Mary Queen of Scots used a monoalphabetic cipher long after it was known to be breakable)
Multi-disciplinary field
  Linguists, classicists, mathematicians, computer scientists, physicists
Secrecy often means advances are rediscovered and miscredited
  Public-key cryptography first done by British security agency, rediscovered by Diffie & Hellman
Dominated by needs of government: war is the great catalyst
Cryptanalysis advances led by most threatened countries:
  France (1800s), Poland (1930s), England/US (WWII), Israel? (Today)

Security vs. Pragmatics

Trade-off between security and effort
  one-time pad: perfect security, but requires distribution and secrecy of long key
  DES: short key, fast algorithm, but breakable
  quantum cryptography: perfect security, guaranteed secrecy of key, slow, requires expensive hardware
Don't spend $10M to protect $1M
Don't protect $1B with encryption that can be broken for $1M

Unbreakable cipher: one-time pad

Mauborgne/Vernam [1917]
XOR ($\oplus$):
  $0 \oplus 0 = 0$    $1 \oplus 0 = 1$
  $0 \oplus 1 = 1$    $1 \oplus 1 = 0$
  $a \oplus a = 0$
  $a \oplus 0 = a$
  $a \oplus b \oplus b = a$
$E(P, K) = P \oplus K$
$D(C, K) = C \oplus K = (P \oplus K) \oplus K = P$
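
The XOR identities translate directly into code. A minimal sketch, assuming byte strings for plaintext and key and using `secrets.token_bytes` as the random source; the function name `otp_xor` is illustrative.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """E(P, K) = P xor K; the same function decrypts, since (P xor K) xor K = P."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"meet at noon"
key = secrets.token_bytes(len(plaintext))  # fresh random key, used once
ciphertext = otp_xor(plaintext, key)
decrypted = otp_xor(ciphertext, key)
```

Encryption and decryption are the same operation, which follows from a ⊕ b ⊕ b = a.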

Why perfectly secure?

For any given ciphertext, all plaintexts are equally possible.
Ciphertext: 0100111110101
  Key1: 1100000100110
    Plaintext1: 1000111010011 = "CS"
  Key2: 1100010100110
    Plaintext2: 1000101010011 = "BS"
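
The two decryptions can be checked by XORing the bit strings as integers. The helper `key_for` is a hypothetical name; it shows that for any plaintext an attacker would like to see, there is a key that produces it from this ciphertext.

```python
ciphertext = int("0100111110101", 2)
key1 = int("1100000100110", 2)
key2 = int("1100010100110", 2)

plaintext1 = format(ciphertext ^ key1, "013b")
plaintext2 = format(ciphertext ^ key2, "013b")


def key_for(target_bits: str) -> str:
    """Return the key that would decrypt the ciphertext to target_bits."""
    return format(ciphertext ^ int(target_bits, 2), "013b")
```

Because every 13-bit plaintext has exactly one such key and all keys are equally likely, the ciphertext alone says nothing about which plaintext was sent.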


Perfect security => our job is done?

Can't reuse K
  What if an eavesdropper has $C_1 = P_1 \oplus K$ and $C_2 = P_2 \oplus K$?
  $C_1 \oplus C_2 = P_1 \oplus K \oplus P_2 \oplus K = P_1 \oplus P_2$
Need to generate truly random bit sequence as long as all messages
Need to securely distribute keys
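
The key-reuse problem is easy to demonstrate. In this sketch the two messages are made-up examples of equal length, and `xor_bytes` is an illustrative helper.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


p1 = b"attack at dawn"
p2 = b"retreat at ten"
key = secrets.token_bytes(len(p1))

c1 = xor_bytes(p1, key)
c2 = xor_bytes(p2, key)     # same key reused: the mistake

leak = xor_bytes(c1, c2)    # the key cancels out, leaving P1 xor P2
```

Anyone who knows or guesses one plaintext can now recover the other with a single XOR, without ever learning the key.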

Vigenère

Invented by Blaise de Vigenère, ~1550
Considered unbreakable for 300 years
Broken by Charles Babbage but kept secret to help British in Crimean War (circa 1854)
Attack discovered independently by Friedrich Kasiski, 1863

Vigenère Encryption

Key is an N-letter string
Alphabet has Z symbols
$E_K(P) = C$ where $C_i = (P_i + K_{i \bmod N}) \bmod Z$
E_"KEY"("test") = DIQD
  C_0 = ("t" + "K") mod 26 = "D"
  C_1 = ("e" + "E") mod 26 = "I"
  C_2 = ("s" + "Y") mod 26 = "Q"
  C_3 = ("t" + "K") mod 26 = "D"
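
The formula can be written out as a short function. A minimal sketch assuming a 26-letter alphabet (Z = 26) and input made up of letters only; the function names are illustrative.

```python
A = ord("A")


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """C_i = (P_i + K_{i mod N}) mod 26, with letters numbered A=0 .. Z=25."""
    out = []
    for i, ch in enumerate(plaintext.upper()):
        p = ord(ch) - A
        k = ord(key[i % len(key)].upper()) - A
        out.append(chr((p + k) % 26 + A))
    return "".join(out)


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """P_i = (C_i - K_{i mod N}) mod 26."""
    out = []
    for i, ch in enumerate(ciphertext.upper()):
        c = ord(ch) - A
        k = ord(key[i % len(key)].upper()) - A
        out.append(chr((c - k) % 26 + A))
    return "".join(out)


encrypted = vigenere_encrypt("test", "KEY")
decrypted = vigenere_decrypt(encrypted, "KEY")
```

Note that the two "t" letters encrypt to the same "D" only because they happen to fall on the same key letter; in general a repeated plaintext letter maps to different ciphertext letters.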

Babbage's Attack

Use repetition to guess key length:
  Suppose sequence XFO appears at 65, 71, 122, 176
  Calculate distances between occurrences
    (71 - 65) = 6 = 3 * 2
    (122 - 65) = 57 = 3 * 19
    (176 - 122) = 54 = 3 * 18
  Key is probably 3 letters long
This approach isn't foolproof
  XFO could correspond to different sequences at different locations
  Use lots of different trigrams (or longer!) to find the key length
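
The key-length guess is the greatest common divisor of the distances between repeats. A small sketch using the positions from the slide; taking all pairwise distances is my choice, the slide lists only three of them.

```python
from functools import reduce
from itertools import combinations
from math import gcd

positions = [65, 71, 122, 176]   # where the trigram XFO appears

# Distances between every pair of occurrences.
distances = [later - earlier for earlier, later in combinations(positions, 2)]

# The key length should divide all of them, so take their GCD.
key_length_guess = reduce(gcd, distances)
```

A single coincidental repeat can spoil the GCD, which is why the slide recommends collecting many repeated sequences before settling on a length.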

Index of coincidence

Calculate index of coincidence by
  Taking two strings and pairing their letters by position
  Computing the fraction of paired letters that are the same
For English, index of coincidence is
  About 3.8% for randomly chosen letters (= 1/26)
  About 6.6% for real English text
  Reason: some letters (and sequences) are more common than others in English
Index of coincidence is unaffected by simple substitution ciphers (assuming both strings encrypted with the same key)!
Take the encrypted text and compare it with itself shifted (horizontally) by N positions (do this for values of N from 1 to the maximum key length)
  If N is a multiple of the key length, the index of coincidence will jump to a higher value
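
The shift-and-compare procedure can be sketched as follows. The function name `coincidence_rate` is hypothetical, and the toy string is chosen so that the result is obvious; real ciphertext gives values near 3.8% or 6.6% rather than 0 or 1.

```python
def coincidence_rate(text: str, shift: int) -> float:
    """Fraction of positions where the text matches itself shifted by `shift`."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    pairs = list(zip(letters, letters[shift:]))
    if not pairs:
        raise ValueError("shift is too large for this text")
    return sum(a == b for a, b in pairs) / len(pairs)


toy = "ABCABCABC"   # a pattern that repeats every 3 letters
rates = {shift: coincidence_rate(toy, shift) for shift in range(1, 5)}
```

For a Vigenère ciphertext, the shifts at which the rate jumps toward the English value are the candidate multiples of the key length.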

Key length and frequency

PAMP DOKW SCAO PBSJ VFSV HRGE ASEX BRQR AGMR KOPZ
HBOI KIZH LFSV HRGE ASEM UHQV LGFI KWZE UMAJ AVQW
LODI HGAJ YSEI HFOL PTKS BFDI ZSMV JVSS HZEQ HHOL
AVAW LCRT YCVI JHEJ VFIL PQTM OOHI LLFI YBMP ZIBT
VFFM TOKF LONP LHAW BDBS YHKS BOEE YSEI HFOL HGEM
ZHMR A

Once you think you know the key length
  Slice the ciphertext
  Use the frequency methods we looked at earlier
Example:
  Key length = 4
  For first letter, H=9, L=7 & A=6 are most common => guess a, e, t
  Keep going like this
Even if each position in key is fully scrambled (not just shifted), this mechanism works
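
Slicing the ciphertext by key position takes one line with Python's extended slices. A sketch using the ciphertext above and the assumed key length of 4:

```python
from collections import Counter

CIPHERTEXT = (
    "PAMP DOKW SCAO PBSJ VFSV HRGE ASEX BRQR AGMR KOPZ "
    "HBOI KIZH LFSV HRGE ASEM UHQV LGFI KWZE UMAJ AVQW "
    "LODI HGAJ YSEI HFOL PTKS BFDI ZSMV JVSS HZEQ HHOL "
    "AVAW LCRT YCVI JHEJ VFIL PQTM OOHI LLFI YBMP ZIBT "
    "VFFM TOKF LONP LHAW BDBS YHKS BOEE YSEI HFOL HGEM "
    "ZHMR A"
)

letters = CIPHERTEXT.replace(" ", "")
key_length = 4

# Slice i holds every letter encrypted with key position i.
slices = [letters[i::key_length] for i in range(key_length)]

# Each slice is a monoalphabetic cipher, so count its letters separately.
first_slice_top = Counter(slices[0]).most_common(3)
```

The same counting is then repeated for the other three slices, each of which is attacked as its own monoalphabetic cipher.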

Vigenère simplification

Use binary alphabet:
  $C_i = (P_i + K_{i \bmod N}) \bmod 2$
  $C_i = P_i \oplus K_{i \bmod N}$
Use a key as long as P:
  $C_i = P_i \oplus K_i$
One-time pad: perfect cipher!

How do you know the cipher's good?

"I tried really hard to break my cipher, but couldn't. I'm a genius, so I'm sure no one else can break it either."
"Lots of really smart people tried to break it, and couldn't."
Mathematical arguments
  Key size (dangerous!)
  Statistical properties of ciphertext
  Depends on some provably (or believed) hard problem
Invulnerability to known cryptanalysis techniques (but what about undiscovered techniques?)
Show that ciphertext could match multiple reasonable plaintexts without knowing key
  Simple monoalphabetic secure for about 10 letters of English:
    XBCF CF FWPHGW
    This is secure
    Spat at troner

Real world standard

Attacker almost certainly has details of algorithm
Attacker has access to
  Limited (maybe) amount of ciphertext
  Known plaintext (sometimes)
  Chosen plaintext (occasionally)
Breaking a cipher means the attacker can read a secret message
  May mean the attacker can read many secret messages if the key is reused (think PGP)

"Academic" standard

Harsher than real-world standard (but not always)
Assume the attacker has
  Full details of the algorithm
  An unlimited number of chosen plaintext/ciphertext pairs
Assume attacker can perform a very large number of computations
  Up to, but not including, $2^n$, where n is the key size in bits
  This means that the attacker can't mount a brute force attack, but can get close
Ciphers that meet this standard may be stronger than those designed for the "real world"
  Example: ENIGMA (more on this later) relied upon secrecy of the algorithm as well as the key

Showing a cipher is imperfect

Two (easy?) ways to show a cipher is imperfect
  Find a ciphertext that is more likely to be one message than another
  Show that there are more messages than keys
    Can be easy if message is longer than key
    Implies that there is some message more likely to be a given ciphertext, even if you can't find it
Since most ciphers have more messages than keys, they're imperfect
  One-time pad is an exception!

Entropy & rate

The entropy (H) of a message M is the amount of information in the message
  $H(M) = \log_2 n$, where n is the number of possible (equally likely) meanings
  Example: $H(\text{month of year}) = \log_2 12 \approx 3.6$ (need 4 bits to encode a month)
Rounding up can give misleading results
  Encoding three (independent) months requires $\log_2 12^3 \approx 10.8$ bits
  Using 4 bits per month would require 12 bits
Absolute rate: how much information can be encoded
  $R = \log_2 Z$, where Z is the size of the alphabet
  $R_{\text{English}} = \log_2 26 \approx 4.7$ bits/letter
Actual rate of a language: $r = H(M) / N$, where M is an N-letter message.
  r of months spelled out using ASCII: $r = \log_2 12 / (8 \text{ letters} \times 8 \text{ bits/letter}) \approx 0.06$
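
The numbers on this slide can be recomputed directly. A short sketch; the variable names are illustrative.

```python
import math

# Entropy of one month out of 12 equally likely choices.
h_month = math.log2(12)

# Three independent months encoded together vs. 4 bits each.
h_three_months = math.log2(12 ** 3)
bits_if_rounded = 3 * math.ceil(h_month)

# Absolute rate of a 26-letter alphabet.
r_absolute = math.log2(26)

# Actual rate of a month name spelled out in 8 ASCII letters of 8 bits each.
r_month_ascii = h_month / (8 * 8)
```

Rounded, these give 3.6, 10.8 (versus 12 when rounding per month), 4.7, and 0.06.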

Rate of English

r_English is about 1.3 bits/letter (.28 letters/letter).
  Many letter combinations don't occur (or don't occur frequently) in English (qz, xg, cfn)
  Many words don't occur together often ("educated car")
  This ratio can be derived by compressing English text and looking at the compression ratios (8/1.3 ≈ 6)
How many meaningful 20-letter messages are there in English?
  $r = H(M) / N$
  $1.3 = H(M) / 20$
  $H(M) = 26 = \log_2 n$
  $n = 2^{26} \approx 67$ million (of $26^{20} \approx 2 \times 10^{28}$ possible)
  One out of about $3 \times 10^{20}$ randomly selected 20-letter groups
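
The count of meaningful 20-letter messages follows from the rate by simple arithmetic. A sketch assuming r = 1.3 bits/letter and a 26-letter alphabet:

```python
RATE_ENGLISH = 1.3     # bits of information per letter
LENGTH = 20            # letters in the message

entropy_bits = RATE_ENGLISH * LENGTH          # H(M) = r * N
meaningful = 2 ** round(entropy_bits)         # n = 2^H(M)
possible = 26 ** LENGTH                       # all 20-letter strings
one_in = possible / meaningful                # odds that a random string is meaningful
```

This gives 2^26 = 67,108,864 meaningful messages out of roughly 2 × 10^28 strings, about one in 3 × 10^20.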

Redundancy & unicity

Redundancy (D) is defined as: $D = R - r$
Redundancy in English:
  $D_{\text{English}} = 4.7 - 1.3 = 3.4$ bits/letter
  Each letter is 1.3 bits of content, and 3.4 bits of redundancy. (~72%)
English encoded as ASCII: 1 byte per letter
  $D = 8 - 1.3 = 6.7$ bits/letter
  84% redundancy, 16% information
Unicity
  Theoretical and probabilistic measure of how much ciphertext is needed to determine a unique plaintext
  Does not indicate how much ciphertext is needed for cryptanalysis
  $U = H(K) / D$
  Minimum amount of ciphertext needed for brute-force attack to succeed.

Unicity Examples

One-Time Pad
  $H(K)$ = infinite
  $U = H(K)/D$ = infinite
Monoalphabetic Substitution
  $H(K) = \log_2 26! \approx 88$
  $D = 3.4$ (redundancy in English)
  $U = H(K)/D \approx 26$
  Intuition: if you have about 26 letters, the ciphertext probably matches only one possible plaintext.
Random bit stream (message)
  $D = 0$
  $U = H(K)/D$ = infinite
  No amount of text will be enough!
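
The monoalphabetic unicity distance can be recomputed from its definition. A sketch assuming R = 4.7 and r = 1.3 bits/letter for English:

```python
import math

REDUNDANCY_ENGLISH = 4.7 - 1.3                    # D = R - r, bits per letter

key_entropy = math.log2(math.factorial(26))       # H(K) for a random alphabet mapping
unicity = key_entropy / REDUNDANCY_ENGLISH        # U = H(K) / D, in letters
```

The key entropy comes out at a little over 88 bits and the unicity distance at about 26 letters, so a ciphertext of that length usually has only one sensible decryption.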