Linear Feedback Shift Registers, Galois Fields, and Stream Ciphers


Mike Thomsen

Cryptography II

May 18th, 2012



Abstract

Linear Feedback Shift Registers (LFSR) are shift registers that on each iteration produce a new state of $n$ bits. Each bit of the next state is either a shifted bit from the current state, or the result of a simple algebraic computation on the current state. The 'tap positions' are the positions of the current state from which bits are extracted and then XOR'd together to be fed back into the first bit of the next state. If the tap positions are 'maximal', they generate $2^n - 1$ states, as the zero state is not allowed. These registers have some interesting properties, including strong connections to Galois Fields. Moreover, the outputs of multiple LFSR can be combined and fed through a boolean function to create a stream cipher, a way to encrypt data bitwise or bytewise, in contrast to a block cipher, where a whole block of data is encrypted at once. Although LFSR-based stream ciphers are still used in some applications, including A5/1, which is used for over-the-air communication privacy with cellular devices, they can be susceptible to many attacks, including correlation attacks. Correlation attacks and the eventual breaking of A5/1 will be discussed.


I. Introduction


Linear Feedback Shift Registers

Linear Feedback Shift Registers (LFSR) are a collection of cyclic binary states where the current state is a direct computation of its predecessor. A simple XOR of particular bits (the tap positions) and a shift allow for a uniform serial computation until the start state repeats. The number of unique states depends on the tap positions that are used to create the 'feedback' bit. If the tap positions are maximal (discussed later), then there are $2^n - 1$ possible states, spanning all non-zero $n$-bit binary numbers. The zero state is not allowed in an LFSR because it would return the zero state indefinitely, since the XOR of any number of zeros is always zero.

These states are useful in digital electronics, and can be used for testing hardware in a somewhat 'random' manner since, as stated, they can span all non-zero $n$-bit binary numbers. In addition, they can be used for noise generation or scramblers. But the most common use of LFSR is in cryptographic applications. Whether used for a pseudo-random number generator (PRNG) or in a stream cipher, LFSR are attractive due to their quick hardware implementations. They can be awkward to implement efficiently in software, but in hardware they need only a simple shift and XOR to function. This allows cryptographic protocols to execute very quickly, an extremely desirable quality.



II. Properties of LFSR

We take a small toy example to illustrate the properties of all LFSR. Each new state is formed by XOR'ing the tapped bits into the first position and shifting the remaining bits one place to the right. Consider the following 3-bit register states, with tap positions [2,3] and start state [1,0,1]:

1 0 1
1 1 0
1 1 1
0 1 1
0 0 1
1 0 0
0 1 0

We first notice that there are 7 unique states, so we know that [2,3] are maximal length tap positions, and this sequence is maximal with $2^3 - 1 = 7$ states. We again notice that there is no zero state, but all non-zero 3-bit binary numbers have been enumerated in a seemingly 'random' manner.

An interesting property of LFSR is that if we take each column individually, we notice that it is simply a rotation of another column. Additionally, for a maximal length sequence (also known as an m-sequence) each column will always contain $2^{n-1}$ ones and $2^{n-1} - 1$ zeros. There is also the sliding window property: if we take a column of the LFSR and slide a window of $n$ bits successively (and cyclically) over it, the window will also span every possible state of the LFSR. Because of these properties and its cyclic nature, the LFSR is very predictable when it stands alone.

We now look at a non-maximal length LFSR of 3 bits with tap positions [1,2] and start state [1,0,1]:

1 0 1
1 1 0
0 1 1

What is interesting about this non-maximal length set of states is that, because we used [1,2] as the tap positions, the register effectively collapses into an LFSR of 2 bits: the last bit has no effect on the feedback, and hence on the future states of the register, at all.
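
The two toy registers can be reproduced with a few lines of Python. This is a minimal sketch, assuming the convention used in the examples (tap positions are 1-indexed, the feedback bit enters at position 1, and the other bits shift one place to the right); the function names are illustrative only.

```python
def lfsr_step(state, taps):
    """Return the next state: XOR the tapped bits (1-indexed) into
    position 1 and shift the remaining bits one place to the right."""
    feedback = 0
    for t in taps:
        feedback ^= state[t - 1]
    return [feedback] + state[:-1]


def lfsr_cycle(start, taps):
    """List the states visited, stopping as soon as any state repeats."""
    states = []
    state = list(start)
    while state not in states:
        states.append(state)
        state = lfsr_step(state, taps)
    return states


maximal = lfsr_cycle([1, 0, 1], taps=[2, 3])
short = lfsr_cycle([1, 0, 1], taps=[1, 2])

print(len(maximal), maximal)
print(len(short), short)

# First column of the maximal cycle, read top to bottom as one bit sequence.
column = [s[0] for s in maximal]
print(column.count(1), "ones,", column.count(0), "zeros")
```

Reading a column out of `maximal` in this way is also a convenient starting point for checking the rotation and sliding window properties by hand.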


Normally, and for most cases, if we have a non-maximal length LFSR, the number of states it cycles through must divide $2^n - 1$. Here it cannot: $2^3 - 1 = 7$ is a Mersenne prime (3 and 7 are both prime), and the cycle length of 3 obviously does not divide 7. However, given an LFSR of $n$-bit length for which $2^n - 1$ is not a Mersenne prime, any non-maximal length “group” of states must have a total number of elements that divides $2^n - 1$.


III. LFSR and Galois Fields


In order to find which tap positions will yield maximal length sequences, we must turn to mathematics. First, we enter the realm of $\mathrm{GF}(2)$, the binary field where all operations are done modulo 2. We call the ‘characteristic’, ‘generating’ or ‘feedback’ polynomial of an LFSR the polynomial that corresponds to the tap positions:

$$P(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$$

Each coefficient $c_i$ can be either 1 or 0, according to whether position $i$ is one to take bit information from or not. In all LFSR applications, $c_0$ must be 1, which refers to the fact that there is feedback coming into the new state of the LFSR.

It is known that if $P(x)$ is an irreducible polynomial of degree $n$ that divides $Q(x) = x^{2^n - 1} + 1$ but divides no $x^k + 1$ with $k < 2^n - 1$, then $P(x)$ is said to be primitive, and the corresponding tap positions will yield a maximal length sequence.

For example, if we take $n = 3$ we look at $Q(x) = x^{2^n - 1} + 1 = x^7 + 1$. We can factor this polynomial as follows:

$$Q(x) = x^7 + 1 = (x + 1)(x^3 + x^2 + 1)(x^3 + x + 1)$$

Therefore we can see that the last two factors are the ones we are interested in to make a maximal length sequence of 7 states. Note that $\deg(x + 1) = 1 < 3$, and therefore $x + 1$ is not a candidate feedback polynomial $P(x)$ to give maximal length tap positions. Therefore, the maximal tap positions for an LFSR of 3 bits are [1,3] and [2,3].
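
The factorization for $n = 3$ can be checked mechanically. The sketch below assumes a representation that is not part of the text: a polynomial over GF(2) is stored as a Python integer whose bit $i$ is the coefficient of $x^i$, so multiplication becomes shift-and-XOR.

```python
def gf2_mul(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # addition over GF(2) is XOR
        a <<= 1
        b >>= 1
    return result


def gf2_mod(a, m):
    """Remainder of a divided by m over GF(2)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


x_plus_1 = 0b11      # x + 1
p1 = 0b1101          # x^3 + x^2 + 1  (tap positions [2,3])
p2 = 0b1011          # x^3 + x + 1    (tap positions [1,3])
q = (1 << 7) | 1     # x^7 + 1

product = gf2_mul(gf2_mul(x_plus_1, p1), p2)
print(bin(product), product == q)
print(gf2_mod(q, p1), gf2_mod(q, p2))   # zero remainder means "divides"
```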


This process can be very tedious for larger $n$-bit registers. Tap positions for larger LFSR are known and publicly available. But this mathematical process can determine which tap positions are maximal and which are not. Any polynomial factors that have degree less than $n$ will always yield non-maximal length sequences. As stated previously, the order of these sequences must divide $2^n - 1$ if $2^n - 1$ is not a Mersenne prime.



While we are still in the realm of $\mathrm{GF}(2)$, we introduce the connections between LFSR and Galois Fields. The characteristic polynomials $P(x)$ that we just found to yield the maximal length sequences are actually irreducible polynomials over $\mathrm{GF}(2)$. If we find all the non-zero elements of a Galois Field built using a particular $P(x)$ (i.e. $\mathrm{GF}(2^n)$ modulo $P(x)$) and compare them to the enumeration of all states using $P(x)$ as the characteristic polynomial of an LFSR, they will be identical (the states may not be generated in the same order, however). Non-maximal length tap positions correspond to polynomials that are not primitive, such as reducible polynomials.
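
As a small illustration of this correspondence for $n = 3$, the sketch below lists the powers of $x$ modulo $x^3 + x^2 + 1$ and compares them, as a set, with the states of the 3-bit LFSR with tap positions [2,3]. The integer encoding (bit $i$ holds the coefficient of $x^i$) and the function names are assumptions made for the example.

```python
def field_elements(poly, n):
    """Successive powers of x modulo poly over GF(2), stored as n-bit ints."""
    elements = []
    value = 1
    while value not in elements:
        elements.append(value)
        value <<= 1            # multiply by x
        if value >> n:         # degree reached n: reduce modulo poly
            value ^= poly
    return elements


def lfsr_states(start, taps):
    """Set of LFSR states (as ints) reachable from start."""
    seen = set()
    state = list(start)
    while tuple(state) not in seen:
        seen.add(tuple(state))
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return {int("".join(map(str, s)), 2) for s in seen}


powers = field_elements(0b1101, 3)          # x^3 + x^2 + 1
states = lfsr_states([1, 0, 1], [2, 3])

print(powers)
print(sorted(states))
print(set(powers) == states)
```

The two lists come out in different orders, which matches the remark that the states need not be generated in the same order as the field elements.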


It is also known that if we take the mirror image of a set of tap positions, or of the polynomial that corresponds to them, we will produce a mirror image sequence. Generally, if the original feedback set is [m, A, B, C], the reversed feedback set is [m, m-C, m-B, m-A], where m is the number of LFSR stages [1]. This is an interesting property, essentially stating that given one irreducible polynomial over $\mathrm{GF}(2)$, it is trivial to determine a second one (its mirror image).
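
The mirror rule from [1] is short enough to write down directly. A minimal sketch, with `mirror_taps` as an illustrative name; it accepts the tap positions in any order and returns them in descending order:

```python
def mirror_taps(taps):
    """Map [m, A, B, C] to [m, m-C, m-B, m-A], where m is the largest tap."""
    m = max(taps)
    mirrored = sorted((m - t for t in taps if t != m), reverse=True)
    return [m] + mirrored


print(mirror_taps([3, 2]))         # the 3-bit example
print(mirror_taps([8, 4, 3, 1]))
```

Applying the function twice returns the original tap set, as expected of a mirror image.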



Moreover, what should now be clear is that each register state of an LFSR is an element of $\mathrm{GF}(2^n)$ as defined by $P(x)$.


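We can now take this relationship between LFSR and Galois Fields into another cryptographic application: AES. We know the polynomial of AES is $x^8 + x^4 + x^3 + x + 1$, which corresponds to the tap positions [8,4,3,1] on an LFSR of 8 bits, so every state of that register is an element of the AES field. One caveat is that the AES polynomial is irreducible but not primitive, so these tap positions are not maximal: starting from any non-zero state, the register cycles through 51 elements rather than all 255. Furthermore, we now know that the mirror image gives a corresponding set of tap positions: [8,7,5,4]. This also tells us that we have a second irreducible polynomial that defines the same field as AES up to isomorphism. Namely, we have the following:

$$\mathrm{GF}(2)[x] / (x^8 + x^4 + x^3 + x + 1) \cong \mathrm{GF}(2)[x] / (x^8 + x^7 + x^5 + x^4 + 1).$$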
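
How many states a given tap set actually visits can be checked by counting the cycle length directly. The sketch below uses the same register convention as the 3-bit examples (1-indexed taps, feedback into position 1). The third tap set, [8,4,3,2], is not from the text; it corresponds to $x^8 + x^4 + x^3 + x^2 + 1$, a primitive polynomial commonly used in Reed-Solomon codes, and is included only for comparison.

```python
def cycle_length(taps, n):
    """Number of steps until an n-bit LFSR returns to the state 1 0 0 ... 0.

    Returns None if the start state does not recur within 2^n - 1 steps.
    """
    start = [1] + [0] * (n - 1)
    state = start
    for steps in range(1, 2 ** n):
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
        if state == start:
            return steps
    return None


aes_taps = cycle_length([8, 4, 3, 1], 8)      # AES polynomial
mirror = cycle_length([8, 7, 5, 4], 8)        # its mirror image
primitive = cycle_length([8, 4, 3, 2], 8)     # comparison case (assumption)

print(aes_taps, mirror, primitive)
```

Because the AES polynomial is irreducible but not primitive, the first two tap sets should report a cycle of 51 states, while the primitive comparison case should report the full 255.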


IV. LFSR and Stream Ciphers


LFSR can be used for a number of practical cryptographic applications, but the main uses are stream ciphers and pseudo-random number generators (PRNG). In either case, LFSR are used to output one bit at a time, either to encrypt plaintext or to generate one bit of a ‘random’ looking number. Here, we will talk only about stream ciphers built from LFSR.


Due to its cyclical and predictable nature, a single LFSR cannot be used for a stream cipher. It would be a fatal mistake to use one LFSR alone, since an independent register is very easily broken. Therefore, multiple LFSR are run together in parallel. It is important to note that for each ‘clock’ or ‘new state’ of the registers, only one bit is output at a time so as to not give away too much information about the private, internal state.


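Consider the following diagram, which is a ‘general’ representation of how LFSR are strung together to create a stream cipher [2]:

[Figure: several LFSR in parallel feeding their output bits into a boolean function that produces the keystream]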


The idea is to put $k$ registers together (LFSR 1 to LFSR $k$ can each be any bit length desired), and feed their output bits into a boolean function ‘$f$’. This function contains some bit logic: a formula that takes $k$ input bits $(x_1, x_2, \dots, x_k)$ and outputs a single bit for the ‘keystream’. The function $f$ can be a simple XOR of all of the input bits, or it can be as complicated as the designer of the circuit requires, using AND’s and OR’s as desired. In a good scheme, $f$ will be non-linear, an attempt to stave off a very powerful class of attacks: correlation attacks.
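
A toy version of such a combining generator is sketched below. Everything specific in it is an assumption made for illustration: the three register lengths and tap sets, the start states, and the combining function, which is a Geffe-style selector (register 1 chooses between the bits of registers 2 and 3). It is meant to show the structure, not to be a secure design.

```python
def lfsr_bits(state, taps):
    """Yield the last bit of the register on each clock (1-indexed taps)."""
    state = list(state)
    while True:
        yield state[-1]
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]


def combine(x1, x2, x3):
    """Non-linear boolean function f: x1 selects x2 (if 1) or x3 (if 0)."""
    return (x1 & x2) ^ ((x1 ^ 1) & x3)


def keystream(registers):
    """Clock all registers together and emit one keystream bit per clock."""
    streams = [lfsr_bits(state, taps) for state, taps in registers]
    while True:
        yield combine(*(next(s) for s in streams))


def xor_with_keystream(bits, registers):
    """Encrypt or decrypt: XOR each data bit with one keystream bit."""
    return [b ^ k for b, k in zip(bits, keystream(registers))]


# (start state, tap positions) for each register; the start states are the key.
REGISTERS = [
    ([1, 0, 1, 1, 0], [5, 2]),
    ([0, 1, 1, 0, 0, 1], [6, 1]),
    ([1, 1, 0, 0, 1, 0, 1], [7, 1]),
]

plaintext = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0]
ciphertext = xor_with_keystream(plaintext, REGISTERS)
recovered = xor_with_keystream(ciphertext, REGISTERS)
print(ciphertext)
print(recovered == plaintext)
```

Since encryption is a plain XOR with the keystream, running the same operation with the same start states decrypts.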


V. LFSR and Stream Ciphers: Correlation Attacks


A very powerful attack on stream ciphers that use LFSR is the correlation attack. The basis for the attack is the notion that there is some relationship between one of the registers in the scheme and the output of the boolean function $f$. The attack starts as a brute force attack, but if a correlation can be determined, it becomes a divide and conquer attack. There are numerous ways to determine whether or not a register is correlated to the output of $f$, including an exhaustive search proposed by Siegenthaler [3].



When an attacker starts a correlation (brute force) attack on a stream cipher, he must consider the system as a whole. The states of the registers are usually independent of each other, but that information is private, so the attacker knows only the structure of the registers and the output bits of $f$. Therefore, he must break the entire system via brute force. However, if he can determine that a correlation exists between a single register and the output of $f$, then he can break that register individually, vastly improving the speed of the attack.
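
The divide and conquer step can be sketched on a toy generator. The setup below is an assumption made for illustration and is not taken from the text: three small registers combined with a Geffe-style selector function, whose output is known to agree with the second and third register about 75% of the time. The attacker tries every non-zero start state of one register on its own and keeps the candidate that agrees most often with the observed keystream.

```python
from itertools import islice, product


def lfsr_bits(state, taps):
    """Yield the last bit of the register on each clock (1-indexed taps)."""
    state = list(state)
    while True:
        yield state[-1]
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]


def geffe(x1, x2, x3):
    """Selector function: output x2 when x1 is 1, otherwise x3."""
    return (x1 & x2) ^ ((x1 ^ 1) & x3)


TAPS = ([5, 2], [6, 1], [7, 1])                      # public structure
SECRET = ([1, 0, 1, 1, 0], [0, 1, 1, 0, 0, 1],       # private start states
          [1, 1, 0, 0, 1, 0, 1])
N = 2000                                             # keystream bits observed

streams = [list(islice(lfsr_bits(s, t), N)) for s, t in zip(SECRET, TAPS)]
observed = [geffe(a, b, c) for a, b, c in zip(*streams)]


def best_guess(taps, length):
    """Search one register alone: return the start state whose output
    agrees most often with the observed keystream, and its agreement rate."""
    best, best_score = None, -1
    for candidate in product([0, 1], repeat=length):
        if not any(candidate):
            continue                                  # zero state is not allowed
        bits = islice(lfsr_bits(candidate, taps), N)
        score = sum(b == z for b, z in zip(bits, observed))
        if score > best_score:
            best, best_score = list(candidate), score
    return best, best_score / N


guess2, rate2 = best_guess(TAPS[1], 6)
guess3, rate3 = best_guess(TAPS[2], 7)
print(guess2, rate2)
print(guess3, rate3)
```

Each search here covers only 63 or 127 candidates, instead of the product of all three state spaces; wrong candidates hover near 50% agreement, which is why the correct start state stands out.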


Generally, we consider the following complexity of a brute force attack on a stream cipher using LFSR. Given $k$ registers $R_1, R_2, \dots, R_k$ with lengths $L_1, L_2, \dots, L_k$, to break the system we must exhaustively try all possible keys and compare, so we have a complexity on the order of $\prod_{i=1}^{k} 2^{L_i}$. However, if we can correlate one register, say $R_1$, then the complexity is reduced to the order of $2^{L_1} + \prod_{i=2}^{k} 2^{L_i}$.


We look at a concrete example to further this point. Assume we have a system with 3 LFSR, each of 16 bits. Then, to brute force the whole system and determine the key used, we have a complexity of $2^{16} \cdot 2^{16} \cdot 2^{16} = 2^{48}$. If we can attack one register with a correlation attack, then the complexity decreases to $2^{16} + 2^{32}$, a reduction by a factor of roughly 65535. If we can attack two registers with correlation attacks, then the complexity decreases even further to a much lower $2^{16} + 2^{16} + 2^{16} = 3 \cdot 2^{16}$, a massive savings.
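
The arithmetic for the three 16-bit registers can be spelled out in a few lines. This sketch assumes the simple cost model in which a register of $L$ bits costs $2^L$ trials, registers attacked by correlation are searched one at a time, and the remaining registers are searched jointly; the function names are illustrative.

```python
def brute_force_cost(lengths):
    """Trials needed to search all registers jointly."""
    cost = 1
    for bits in lengths:
        cost *= 2 ** bits
    return cost


def divide_and_conquer_cost(lengths, correlated):
    """The first `correlated` registers are searched separately,
    the rest jointly."""
    separate = sum(2 ** bits for bits in lengths[:correlated])
    remaining = lengths[correlated:]
    joint = brute_force_cost(remaining) if remaining else 0
    return separate + joint


lengths = [16, 16, 16]
full = brute_force_cost(lengths)
one = divide_and_conquer_cost(lengths, 1)
two = divide_and_conquer_cost(lengths, 2)

print(full, one, two)
print(full // one)      # reduction factor with one correlated register
print(full // two)      # reduction factor with two correlated registers
```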


Although this ‘correlation’ property seems unlikely if $f$ is a well-chosen, non-linear function, there have been improvements to [Siegenthaler’s] exhaustive search, including “a very interesting way of exploring the correlation in a fast correlation attack provided that the feedback polynomial of the LFSR has a very low weight” [3]. So, one way to combat correlation attacks on a stream cipher is to choose as many tap positions as possible, that is, to choose a polynomial that has as many non-zero coefficients as possible.


Correlation attacks have improved as time has gone by. A new attack that applies to any feedback polynomial for an LFSR stream cipher (regardless of the weight of the polynomial) is presented in [3] using convolutional codes. Therefore, it has become practice to avoid the use of LFSR in stream ciphers wherever possible; for example, the stream cipher RC4 uses no LFSR (RC5 and RC6 are block ciphers and likewise contain no LFSR).



VI. LFSR and Stream Ciphers: A5/1


A5/1 is a stream cipher that uses LFSR, and it is used by billions of people around the world. GSM (Global System for Mobile Communications) is a standard for cellular networks, and A5/1 provides data security for its over-the-air transmissions. A5/1 was developed in 1987 for use in GSM, and its use has only grown as cellular devices have become more and more popular.

Interestingly, A5/2 was developed alongside A5/1, but it was intended for use in “export regions”, namely not the US and UK. Its design is deliberately weaker and thus more vulnerable. The plan was to keep both the A5/1 and A5/2 algorithms secret, but in 1999 they were reverse engineered and the designs were made publicly available. The very insecure A5/2 is very similar to A5/1, and this fact helped bring about the demise of the A5/1 stream cipher.


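The layout of the A5/1 cipher is as follows [4]:

[Figure: the three registers R1, R2, and R3 of A5/1 with their tap positions and clock positions C1, C2, and C3]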
As we can see, there are three LFSR running in parallel in the A5/1 protocol. R1 has tap positions 13, 16, 17, and 18; R2 has tap positions 20 and 21; R3 has tap positions 7, 20, 21, and 22. The C1, C2, and C3 positions refer to the ‘clock positions’. The clocking bits at C1, C2, and C3 are taken, and a majority function is calculated (the output of this function is 0 if at least two of the three bits are 0, and 1 if at least two of the three bits are 1). If the bit at C1, C2 or C3 agrees with this majority bit, then that particular register is clocked (i.e. a new state for that register is computed). Note that at least two registers are therefore clocked in any given ‘round’. A bit of the keystream is determined by XOR’ing the output bits of the three registers together.
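
The stop/go rule is small enough to enumerate completely. A minimal sketch, with illustrative function names, that takes the three clocking bits and reports which registers would be clocked:

```python
from itertools import product


def majority(c1, c2, c3):
    """Return the value held by at least two of the three bits."""
    return (c1 & c2) | (c1 & c3) | (c2 & c3)


def registers_to_clock(c1, c2, c3):
    """For R1, R2, R3 in order: True if that register is clocked this round."""
    m = majority(c1, c2, c3)
    return [bit == m for bit in (c1, c2, c3)]


for bits in product((0, 1), repeat=3):
    print(bits, registers_to_clock(*bits))
```

In every one of the eight cases either two or three registers move, which is the "at least two" property noted above.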


A conversation using GSM sends a new ‘frame’ every 4.6 milliseconds. Therefore, we can see that for one second of secure data, about $1/0.0046 \approx 217$ frames are needed. Each conversation uses a new session key, $K$, and each frame incorporates the frame counter, $F_n$. Each one of these frames needs 228 total bits of keystream data from the LFSR: 114 bits are needed for the conversation from party $A$ to party $B$, and 114 bits are needed for the conversation from party $B$ back to party $A$. Therefore, we ask A5/1 to give us 228 total bits of keystream data for each frame. For each frame, the LFSR are initialized as follows [4]:



1. Registers are zeroed and then, ignoring the ‘stop/go’ clock control, clocked 64 times, being fed one bit of the session key $K$ on each clock; outputs are discarded.
2. Registers are clocked 22 more times (still ignoring the stop/go control), being fed the bits of the 22-bit frame counter $F_n$; outputs are discarded.
3. Registers are clocked 100 times without any input data (now using the stop/go control); outputs are still discarded.
4. Registers are then clocked 228 times (using the stop/go control), and each bit of the keystream is generated by XOR’ing the output bits of R1, R2, and R3.

These 228 bits are used for encryption of a frame between the two parties.
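
The four steps can be sketched as follows. Several details are assumptions drawn from the public descriptions of A5/1 rather than from the text above: the register lengths (19, 22, and 23 bits), the clocking bit positions (8, 10, and 10), tap positions counted from bit 0, the input bit being XOR'd into the feedback, and the order in which key and frame bits are fed. The sketch has not been checked against official test vectors.

```python
LENGTHS = (19, 22, 23)                                # assumed register sizes
TAPS = ((13, 16, 17, 18), (20, 21), (7, 20, 21, 22))  # 0-indexed tap positions
CLOCK_BITS = (8, 10, 10)                              # assumed positions of C1, C2, C3


def clock_register(reg, taps, in_bit=0):
    """Shift one place; the new bit 0 is the XOR of the taps and in_bit."""
    feedback = in_bit
    for t in taps:
        feedback ^= reg[t]
    return [feedback] + reg[:-1]


def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)


def clock_with_majority(regs):
    """Stop/go control: clock only registers whose clocking bit matches the majority."""
    m = majority(*(r[c] for r, c in zip(regs, CLOCK_BITS)))
    return [clock_register(r, t) if r[c] == m else r
            for r, t, c in zip(regs, TAPS, CLOCK_BITS)]


def frame_keystream(key_bits, frame_bits):
    """Return 228 keystream bits for one frame (key: 64 bits, counter: 22 bits)."""
    regs = [[0] * n for n in LENGTHS]
    for bit in list(key_bits) + list(frame_bits):     # steps 1 and 2: no stop/go
        regs = [clock_register(r, t, bit) for r, t in zip(regs, TAPS)]
    for _ in range(100):                              # step 3: outputs discarded
        regs = clock_with_majority(regs)
    out = []
    for _ in range(228):                              # step 4: produce keystream
        regs = clock_with_majority(regs)
        out.append(regs[0][-1] ^ regs[1][-1] ^ regs[2][-1])
    return out


key = [1, 0] * 32                  # hypothetical 64-bit session key
frame = [0] * 21 + [1]             # hypothetical 22-bit frame counter
ks = frame_keystream(key, frame)
a_to_b, b_to_a = ks[:114], ks[114:]
print(len(ks), len(a_to_b), len(b_to_a))
```

Splitting the 228 bits into two halves of 114 mirrors the two directions of the conversation described above.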

VII. LFSR and Stream Ciphers: A5/1 Attacks


A5/1 was scrutinized from its inception, but upon its reverse engineering and subsequent release to the public, attacks became even more plentiful. Many known-plaintext attacks on A5/1 were tried [4]. Most of these attacks required an unrealistic amount of either time or storage space, including Golic’s attack in 1997, which randomly probes into a precomputed table with $2^{42}$ 128-bit entries, needing an unreasonable 64 terabyte hard drive and months of computation time to complete.


New known-plaintext attacks were developed by Biryukov, Shamir, and Wagner in 2000. They combined Golic’s work with the idea of ‘special states’ (states that have a particularly useful prefix) and launched biased birthday attacks as well as random subgraph attacks [4]. They were able to determine the following about their attacks on A5/1:

[Table: summary of the attacks on A5/1 reported in [4]]

The most serious attack on A5/1 came after Barkan, Biham, and Keller identified an inherent weakness in all phones that were capable of using A5/2 encryption. There are numerous well-known attacks on A5/2, and they are known to be extremely fast [5]. While the details will not be discussed, the A5/2 cipher can have its key recovered in less than a second.


The idea that Barkan, Biham, and Keller proposed is an attack on either A5/1 or the newer A5/3 (which is based on the KASUMI block cipher). The attack is a man in the middle attack, where the adversary acts as the network to the victim, and as the victim to the network. This attack assumes the victim’s phone has the capability to encrypt using the A5/2 protocol. The steps to break A5/1 are [5]:



1. The network asks for authentication of the victim, and the attacker allows this to take place, so both the network and the victim believe they are talking securely.
2. The network asks the victim which protocol he would like to use for encryption. While this is taking place, the attacker asks the victim to start encrypting with A5/2. The attacker quickly recovers the key. This key is used for A5/1 or A5/3 as well as for A5/2, so the attacker now has the correct key to be used for decryption.
3. The attacker then tells the network that the victim would like either A5/1 or A5/3, and then lets the victim and the network talk normally, while being able to recover any transmissions and decrypt them.


With this attack and others (including a class-mark attack), A5/1 was officially broken by Barkan, Biham, and Keller in 2006. The interest in breaking this scheme has continued through today. More attacks have surfaced, including brute force attacks using highly parallel FPGA’s (a machine called COPACOBANA) and GPU’s, in an effort to recover the keys used for the stream cipher more efficiently.



VIII. Future

We took a look at LFSR, some of their interesting properties (including ties to Galois Fields), and how they can be used to construct cryptographic primitives. There are some serious vulnerabilities in general stream ciphers built from LFSR, including correlation attacks. Some steps to combat correlation attacks are available and should be considered when designing a stream cipher that uses LFSR.


A particular and concrete example of a stream cipher that uses LFSR, A5/1, was presented. This protocol was and is used for mobile phone data encryption, and in principle should be secure. However, due to new technologies and an inherent weakness in some cell phones (their capability to encrypt using A5/2), A5/1 was broken in 2006.


Looking ahead to the future of mobile encryption, the most obvious choice would be to disallow any phone from being capable of A5/2 encryption, to avoid Barkan et al.’s man in the middle attack described above. Assuming this becomes the case, and although the details are not discussed here, the A5/3 block cipher is known to be a much stronger protocol and perhaps should be used from now on for all cellular network security.



In conclusion, we arrive at the age-old trade-off: whether security or speed is most important. LFSR are extremely fast (and fairly simple) to use in hardware. However, their security is suspect. If LFSR are to be implemented for a protocol, it is vital that their vulnerabilities are addressed and combatted before use.



IX. References

[1] http://www.newwaveinstruments.com/resources/articles/m_sequence_linear_feedback_shift_register_lfsr.htm

[2] Patrik Ekdahl, On LFSR-based Stream Ciphers (PhD), 2003

[3] Thomas Johansson, Fredrik Jonsson, Improved Fast Correlation Attacks on Stream Ciphers via Convolutional Codes, 1999

[4] Alex Biryukov, Adi Shamir, David Wagner, Real Time Cryptanalysis of A5/1 on a PC, 2000

[5] Elad Barkan, Eli Biham, Nathan Keller, Instant Ciphertext-Only Cryptanalysis of GSM Encrypted Communication, 2003/2006