Modified Association Rule Mining Approach for the MHC-Peptide Binding Problem


Galip Gürkan Yardımcı, Alper Küçükural, Yücel Saygın, Uğur Sezerman

{yardimci, kucukural}@su.sabanciuniv.edu, {ysaygin, ugur}@sabanciuniv.edu

Faculty of Engineering and Natural Sciences, Sabanci University, Turkey

Abstract. Computational prediction of peptide binding to the major histocompatibility complex (MHC) is crucial for vaccine design, since these peptides can act as T-cell epitopes that trigger an immune response. Peptide prediction methods fall into two main branches: structural approaches and data mining approaches. These methods can be used successfully to predict T-cell epitopes in cancer, allergy and infectious diseases. In this paper, association rule mining methods are implemented to generate rules of peptide selection by MHCs. To capture the binding characteristics, we implement modified rule mining and data transformation methods. Peptides known to bind to the same MHC show sequence variability; to capture this characteristic, we use a reduced amino acid alphabet obtained by clustering amino acids according to their physico-chemical properties. The innovations of the method are the classification of amino acids and the use of the OR operator to combine rules, reflecting that different amino acid types and positions along the peptide may be responsible for binding. We can predict MHC Class I binding with 75-97% coverage and 76-100% accuracy.


Keywords: Peptides, MHC Class I, association rule mining, reduced amino acid alphabet, data mining.

1 Introduction

Peptide binding prediction is a crucial step in vaccine design, since it helps explain the mechanism of the immune response to foreign bodies and how vaccines work. Numerous experimental results have been reported on this subject. These experiments are time-consuming and costly, since a vast number of peptides must be tried as vaccine candidates even for a single MHC. Therefore, there is an urgent need for effective computational methods to solve the MHC-peptide binding problem. Methods developed for finding peptide sequences specific to target MHCs can also be used to develop therapeutic proteins for other types of receptors.


MHCs recognize antigens, which are foreign macromolecules that cause an immune response in the body. There are two types of immune response to antigens: humoral and cellular. Class II MHC molecules are involved in the humoral immune response, whereas Class I MHC molecules are involved in the cellular immune response, which is the response after the antigen enters the cell [9]. In this paper we focus on the cellular response, which involves recognition of antigenic fragments by Class I MHCs. After foreign bodies enter the cell, they are cleaved into smaller pieces called peptides. These peptides are picked up by Class I MHCs and brought to the cell surface. A human cell has on average three to four different types of Class I MHCs, each of which binds different types of peptides, including self and antigenic peptides. T-cells recognize the infected cell by binding to the antigenic peptide-MHC complex, which triggers a cascade of events leading to the cellular immune response to foreign bodies. In both the Class I and Class II pathways, the most important molecule for initiating the recognition of infected cells is the major histocompatibility complex (MHC). Knowing which of the peptides produced by the cleavage of antigens will be picked up by the MHC molecule, and understanding the mechanism of peptide binding (sequence motifs), will be of great use in vaccine design. A peptide presented to a T-cell together with an MHC molecule is called a T-cell epitope. If the cell is infected, the T-cell can induce it to undergo apoptosis. In this paper, we investigate the Class I pathway for the prediction of T-cell epitopes.


Laboratory experiments can determine which peptides bind to which kinds of MHC molecules. The peptides known to bind to Class I MHCs vary in length, but the majority have between 8 and 10 residues. Conducting laboratory experiments for all peptide binding combinations is not feasible, since there are $20^{8}$ to $20^{10}$ possible peptides over the 20-letter amino acid alphabet, but only a few are selected by the MHC [12]. We combine structural and data mining based methods for the prediction of T-cell epitopes. Association rule mining techniques are used to find correlations between positions of the bound peptides and to determine the binding motifs for each type of MHC. These rules will be useful for understanding the mechanism of peptide binding.

2 Background and Related Work

There are two main approaches to the peptide prediction problem: profile-based approaches and machine learning. Profile-based approaches build profile scoring matrices from the alignment of binding peptides. These methods check the peptide sequence for the presence of the preferred residues at certain positions of the peptide, as predicted by the scoring matrix. Up to now, the most successful methods have been machine learning methods such as SVMHC [7]. Profile-based methods such as SYFPEITHI [11], Rankpep [13] and ProPred1 [15] take only the positive cases into account when deriving information; therefore they do not have high specificity compared to machine learning approaches, where non-binder class information is also taken into account to distinguish the properties of binders [4].

The second group of researchers used machine learning approaches such as Support Vector Machines and Artificial Neural Networks to find correlations between the positions of the peptide and to build a valid probabilistic model using data on both binding and non-binding peptides [5], [6], [7]. Another method, by Milledge et al., predicts peptides for the HLA 0201 type of MHC; it creates sequence-structural patterns by using association rules to reflect the MHC binding characteristics of peptides [10].

2.1 Association rule mining

The problem of finding association rules among items is formally defined by Agrawal et al. in [1], [2] as follows:

Let $I = \{i_1, i_2, \ldots, i_m\}$ be the set of all items. Let $T$ be a transaction consisting of a set of items such that $T \subseteq I$. We call $D$ a database of transactions. We say that a transaction $T$ contains $X$, a set of some items in $I$, if $X \subseteq T$. An association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$ and $X \cap Y = \emptyset$. An item set $X$ has support $s$ if $s$% of the transactions in $D$ contain $X$. We say that the rule $X \Rightarrow Y$ holds with confidence $c$ if $c$% of the transactions in $D$ that contain $X$ also contain $Y$. The rule $X \Rightarrow Y$ has support $s$ if $s$% of the transactions in $D$ contain $X \cup Y$.
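
The definitions above can be illustrated with a minimal Python sketch. The function names `support` and `confidence` and the toy database `D` are assumptions made for this illustration only; they are not taken from the implementation used in the paper.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], database: List[Transaction]) -> float:
    """Percentage of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in database if items <= t)
    return 100.0 * hits / len(database)


def confidence(x: Iterable[str], y: Iterable[str],
               database: List[Transaction]) -> float:
    """Percentage of the transactions containing X that also contain Y."""
    x_items, y_items = frozenset(x), frozenset(y)
    with_x = [t for t in database if x_items <= t]
    if not with_x:
        return 0.0
    return 100.0 * sum(1 for t in with_x if y_items <= t) / len(with_x)


# Toy database of four transactions (made up for illustration).
D = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
)]

# Rule {a} => {b}: its support is the support of the union {a, b}.
rule_support = support({"a", "b"}, D)
rule_confidence = confidence({"a"}, {"b"}, D)
print(rule_support, round(rule_confidence, 2))
```

In this toy database two of the four transactions contain both `a` and `b`, and two of the three transactions containing `a` also contain `b`.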

Association rule mining algorithms scan the database of transactions and calculate the support and confidence of the candidate rules to determine whether they are significant. For that purpose, the algorithms use threshold values to prune the insignificant rules. A rule is significant if its support and confidence are higher than the user-specified minimum support and minimum confidence thresholds. In this way, the algorithms do not retrieve all the association rules that could possibly be generated from a database; instead, only the small subset of rules that satisfies the thresholds is retrieved.

Support of an association rule mimics the coverage of that rule, and confidence of the rule specifies its accuracy. Both measures are important for determining the significance of a rule. Therefore we used a combined support-confidence measure (CSC-Measure)¹. The CSC-Measure is the harmonic mean of the support and confidence measures:

$$\mathrm{CSC} = \frac{2\,s\,c}{s + c}$$

where $s$ is the support and $c$ is the confidence of the rule. The CSC-Measure takes both the confidence and the support of the rule into account, so rules that have high confidence and cover more transactions in the data set are more valuable.
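
As a small worked illustration of the harmonic mean described above, the following Python sketch computes the CSC-Measure from a support and a confidence given in percent. The function name `csc_measure` and the zero-division guard are assumptions of this sketch.

```python
def csc_measure(support: float, confidence: float) -> float:
    """Harmonic mean of support and confidence (both in percent)."""
    if support + confidence == 0:
        return 0.0  # guard added for this sketch; not discussed in the text
    return 2.0 * support * confidence / (support + confidence)


# A rule with low support is penalised even if its confidence is perfect.
low_support_rule = csc_measure(50.0, 100.0)
# When support and confidence are equal, the measure equals that value.
balanced_rule = csc_measure(80.0, 80.0)
print(round(low_support_rule, 2), balanced_rule)
```

Because the harmonic mean is pulled toward the smaller of its two arguments, a rule cannot reach a high CSC-Measure through confidence alone.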

3 Association Rule Mining Methods for (Peptide-Binding) Prediction

Our data set $D$ contains amino acid sequences of peptides that are known to bind to Class I MHC molecules [3]. In $D$, there are 198 transactions (peptides) known to bind to 4 different MHCs. We worked with nine-amino-acid-long sequences only, since the majority of the known peptides were nine-mers.

¹ In the information retrieval context, precision and recall measures are combined in the same way to calculate the F-measure.

Each peptide is represented by an item-set of nine elements, based on its sequence. In our case there are 180 different items, since there are nine positions and twenty amino acids. There are $20^{9}$ possible item-sets: each has nine elements, one per position, and each element can be any of the 20 amino acids. The position of each amino acid in the sequence is important, so we turned the sequences into item-sets whose items have the form $A_p$, where $A$ is the one-letter code of the amino acid and $p$ is its position in the sequence. The mined rules have the form $\{V_1\} \Rightarrow \{G_2\}$, meaning that the presence of a valine in the first position of the peptide sequence implies a glycine in the second position. For simplicity, we omit the curly brackets in the following sections.
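
The encoding of a peptide as positional items can be sketched as follows. The helper name `peptide_to_items` and the example string are hypothetical; the string is an arbitrary nine-letter sequence chosen for illustration, not a peptide from the data set.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
PEPTIDE_LENGTH = 9


def peptide_to_items(peptide: str) -> frozenset:
    """Turn a peptide into items of the form <amino acid><1-based position>."""
    return frozenset(f"{aa}{pos}" for pos, aa in enumerate(peptide, start=1))


items = peptide_to_items("VGACDEFHI")  # arbitrary illustrative sequence
n_items = len(AMINO_ACIDS) * PEPTIDE_LENGTH       # 20 amino acids x 9 positions
n_item_sets = len(AMINO_ACIDS) ** PEPTIDE_LENGTH  # 20^9 possible nine-mers
print(sorted(items), n_items, n_item_sets)
```

With this encoding, a rule such as V1 ⇒ G2 holds for a peptide exactly when both `"V1"` and `"G2"` are members of its item-set.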

However, MHC molecules are not very selective when binding peptides: they can accommodate different types of amino acids at the same position of the peptide. There are pockets at the binding site of the MHC, and some of these pockets have to be filled with certain types of amino acids for the binding requirements to be fulfilled [19]. Sometimes the second position of the peptide fills the appropriate pocket, and sometimes the third position occupies the same pocket. Therefore different amino acids and different positions of the peptide may play the same role in defining the peptide's binding characteristics; conventional association rules cannot capture this property well. We therefore changed the rule structure to deal with this problem.

Our association rules have the form $\{V_2\} \lor \{A_2\} \lor \{L_3\} \Rightarrow \{I_9\}$, meaning that the presence of a valine or an alanine in the second position, or a leucine in the third position of the peptide, implies that the ninth position of the peptide sequence will be an isoleucine. Such rules capture the binding characteristics better. This rule structure with ORs ($\lor$) also increases the CSC-Measure of the rules, resulting in binding rules that hold more globally. The definitions of the support and confidence measures remain unchanged; the only difference is that the calculations take the OR into account.
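
One possible reading of "taking the OR into account" is sketched below: the left-hand side of a rule is satisfied when at least one of its items is present in the peptide. The function names and the five toy peptides are assumptions for illustration; the exact computation used by the authors is not spelled out in the text.

```python
from typing import Iterable, List, Tuple


def to_items(peptide: str) -> frozenset:
    """Items of the form <amino acid><1-based position>."""
    return frozenset(f"{aa}{i}" for i, aa in enumerate(peptide, start=1))


def or_rule_stats(antecedent: Iterable[str], consequent: str,
                  peptides: List[str]) -> Tuple[float, float]:
    """Support and confidence (in percent) of (a1 OR a2 OR ...) => consequent."""
    lhs_items = frozenset(antecedent)
    item_sets = [to_items(p) for p in peptides]
    # The left-hand side matches if ANY of its items occurs in the peptide.
    lhs = [s for s in item_sets if lhs_items & s]
    both = sum(1 for s in lhs if consequent in s)
    support = 100.0 * both / len(item_sets)
    confidence = 100.0 * both / len(lhs) if lhs else 0.0
    return support, confidence


# Toy nine-mers, made up for illustration only.
toy_peptides = ["AVCDEFGHI", "KACDEFGHI", "KRLDEFGHI", "KVCDEFGHL", "KRCDEFGHI"]

single = or_rule_stats({"V2"}, "I9", toy_peptides)
combined = or_rule_stats({"V2", "A2", "L3"}, "I9", toy_peptides)
print(single, combined)
```

On this toy data the single rule V2 ⇒ I9 has 20% support and 50% confidence, while the combined rule V2 ∨ A2 ∨ L3 ⇒ I9 reaches 60% support and 75% confidence, which illustrates why OR-combined rules can score a higher CSC-Measure.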

3.1 Candidate generation and rule mining

The candidate generation step is generally done by the apriori algorithm and its variations [2]. Using OR in a rule increases the number of candidates so much that the apriori algorithm would not have a reasonable runtime. We therefore first extracted rules with one amino acid on each side using the conventional rule mining algorithm, and then combined these rules with the OR operator to yield rules that reflect the binding characteristics better. The confidence of a combined rule lies between the minimum and the maximum of the confidences of the rules that were combined. The support increases, since the OR operator enlarges the set of sequences that contain one of the amino acids on the left side.

First we mined the database for association rules of the form $X_i \Rightarrow Y_j$, where $X$ and $Y$ are amino acids and $i$ and $j$ are their positions. Small confidence (50%) and support (20%) thresholds are used for two reasons. The first is that we expect these values to go up as we combine the rules with the OR operator, so we want as many rules as possible. The second is that low support values imply that the number of sequences (transactions) which contain both of the combined amino acids will be small.
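
The first mining pass over single-item rules can be sketched as a brute-force scan. This is a minimal illustration under stated assumptions: the function name `mine_pair_rules` is hypothetical, the thresholds are treated as inclusive, and the toy peptides are made up; a real implementation would use a conventional association rule miner.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def mine_pair_rules(peptides: List[str], min_support: float = 20.0,
                    min_confidence: float = 50.0
                    ) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Return {(x, y): (support %, confidence %)} for rules x => y."""
    item_sets = [frozenset(f"{aa}{i}" for i, aa in enumerate(p, start=1))
                 for p in peptides]
    n = len(item_sets)
    all_items = sorted(set().union(*item_sets))
    rules = {}
    for x, y in permutations(all_items, 2):
        if x[1:] == y[1:]:
            continue  # two residues at the same position never co-occur
        n_x = sum(1 for s in item_sets if x in s)
        n_xy = sum(1 for s in item_sets if x in s and y in s)
        sup = 100.0 * n_xy / n
        conf = 100.0 * n_xy / n_x
        # Thresholds treated as inclusive here (an assumption of this sketch).
        if sup >= min_support and conf >= min_confidence:
            rules[(x, y)] = (sup, conf)
    return rules


# Toy nine-mers, made up for illustration only.
toy_peptides = ["AVCDEFGHI", "KACDEFGHI", "KRLDEFGHI", "KVCDEFGHL", "KRCDEFGHI"]
pair_rules = mine_pair_rules(toy_peptides)
print(pair_rules[("V2", "I9")], pair_rules[("K1", "I9")])
```

With the 20% and 50% defaults, the low-scoring rule V2 ⇒ I9 is retained, so it remains available for combination with other rules in the later OR step.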

The combining process is as follows: over the set of all one-amino-acid rules of the form $X_i \Rightarrow Y_j$, we combine the rules that have the same implication (the same right-hand side), generating all possible two-amino-acid combinations of these rules.

After we have the two-amino-acid rules, we again combine these rules to yield three-amino-acid rules. From this point the process is similar to the apriori algorithm: we combine $k$-amino-acid rules which share $k-1$ amino acids and which have the same implication. Combining these rules yields $(k+1)$-amino-acid rules. Using the OR operator guarantees that the support of a new rule never decreases, so we do not have to check support values. The pruning criterion is the CSC-Measure of the new rules: if the CSC-Measure does not improve by at least 2% with the addition of the new OR term, the new rule is pruned.
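
The level-wise combination with CSC-based pruning might look like the sketch below. Several points are assumptions of this illustration rather than details confirmed by the text: the 2% improvement is read as two percentage points over the better of the two parent rules, the names `or_stats` and `grow_or_rules` are hypothetical, and the toy peptides are made up.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def to_items(peptide: str) -> frozenset:
    return frozenset(f"{aa}{i}" for i, aa in enumerate(peptide, start=1))


def or_stats(antecedent: FrozenSet[str], consequent: str,
             item_sets: List[frozenset]) -> float:
    """CSC-Measure (in percent) of (a1 OR a2 OR ...) => consequent."""
    lhs = [s for s in item_sets if antecedent & s]
    both = sum(1 for s in lhs if consequent in s)
    sup = 100.0 * both / len(item_sets)
    conf = 100.0 * both / len(lhs) if lhs else 0.0
    return 2.0 * sup * conf / (sup + conf) if sup + conf else 0.0


def grow_or_rules(singles: Iterable[str], consequent: str,
                  item_sets: List[frozenset],
                  min_gain: float = 2.0) -> Dict[FrozenSet[str], float]:
    """Combine rules sharing the same consequent; keep those that gain CSC."""
    level = {frozenset([a]): or_stats(frozenset([a]), consequent, item_sets)
             for a in singles}
    kept = dict(level)
    while len(level) > 1:
        next_level: Dict[FrozenSet[str], float] = {}
        for (a, csc_a), (b, csc_b) in combinations(level.items(), 2):
            merged = a | b
            # Parents with k items must share k-1 of them.
            if len(merged) != len(a) + 1 or merged in next_level:
                continue
            csc = or_stats(merged, consequent, item_sets)
            # Prune unless CSC improves by min_gain points (assumed reading).
            if csc >= max(csc_a, csc_b) + min_gain:
                next_level[merged] = csc
        kept.update(next_level)
        level = next_level
    return kept


toy_peptides = ["AVCDEFGHI", "KACDEFGHI", "KRLDEFGHI", "KVCDEFGHL", "KRCDEFGHI"]
toy_sets = [to_items(p) for p in toy_peptides]
rules = grow_or_rules(["V2", "A2", "L3"], "I9", toy_sets)
print(len(rules), round(rules[frozenset({"V2", "A2", "L3"})], 2))
```

Support is never checked during growth, in line with the observation that OR can only keep or enlarge the set of matching sequences; only the CSC gain decides whether a longer rule survives.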


3.2 Amino acid classification

Evolution allows for sequence variability; to capture this information, we also classified the amino acids according to their physico-chemical properties, as given in Table 1. The classes of amino acids are taken from a previous study by Sezerman et al., which used an encoding-decoding algorithm to classify amino acids based on similarity scoring matrices [14]. The classification scheme given in Table 1 yielded the best results for us. Using the classification table enabled us to distinguish the binding rules according to physico-chemical properties; e.g., the HLA-A2 molecule prefers a peptide with a bulky hydrophobic residue at position two (Class F) and a small hydrophobic residue at position nine (Class A) for binding.

The classification step reduces the number of items and item-sets, reducing the number of rules but making the rules more compact. The number of possible item-sets reduces to $12^{9}$ from $20^{9}$, and the number of items reduces to 108 from 180.

Table 1. Classification of amino acids

Class   Amino Acid(s)     Class   Amino Acid(s)
A       I, V, L, M, A     G       W
B       R, K              H       H
C       D, E              J       G
D       S, T              K       Q, N
E       Y                 L       C
F       F                 M       P
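
The classification in Table 1 can be applied to a peptide with a simple lookup, as in this sketch. The names `CLASSES` and `reduce_peptide` and the example string are assumptions for illustration; the class memberships are copied from Table 1.

```python
# Class label -> member amino acids, copied from Table 1.
CLASSES = {
    "A": "IVLMA", "B": "RK", "C": "DE", "D": "ST", "E": "Y", "F": "F",
    "G": "W", "H": "H", "J": "G", "K": "QN", "L": "C", "M": "P",
}
AA_TO_CLASS = {aa: label for label, members in CLASSES.items() for aa in members}


def reduce_peptide(peptide: str) -> str:
    """Rewrite a peptide in the reduced 12-letter class alphabet."""
    return "".join(AA_TO_CLASS[aa] for aa in peptide)


reduced = reduce_peptide("AVCDEFGHI")  # arbitrary illustrative sequence
n_items = len(CLASSES) * 9        # 12 classes x 9 positions
n_item_sets = len(CLASSES) ** 9   # 12^9 possible reduced nine-mers
print(reduced, n_items, n_item_sets)
```

The twelve classes cover all 20 amino acids exactly once, which gives 12 × 9 = 108 items for nine-mers.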

4 Implementation and Experimental Results

First, the datasets were downloaded from SYFPEITHI [11]. As a preprocessing step, the peptide sequences were rewritten using the classes in Table 1.


Only nine-amino-acid-long binding sequences of the different kinds of MHC molecules are used for rule extraction. The number of binding peptides for the different kinds of MHC molecules varied from 24 to 107. The nature of our data set required data cleaning. Peptide sequences are obtained experimentally. In some cases, MHC-bound peptides are isolated, sequenced and stored in the databases. In other cases, experimenters artificially create polyalanine peptide sequences of length nine and check the binding affinity of this peptide to the specific MHC of interest. They then mutate each position separately to different amino acid types and compare the binding affinity of the mutated peptide with that of the original one. Therefore many binding peptides coming from these studies had alanine (a neutral small amino acid that would not have any impact on binding) in many positions. Since we are looking at the support and confidence of the binders, this would bias our association rules toward that amino acid type; we therefore cleaned our data of such sequences. A peptide sequence was removed from our data set if it had the same amino acid in four consecutive positions.
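
The cleaning rule stated above can be expressed with a short regular expression. The function names and the toy sequences are assumptions of this sketch.

```python
import re
from typing import List

# Any character followed by three more copies of itself: a run of four.
_RUN_OF_FOUR = re.compile(r"(.)\1{3}")


def has_run_of_four(peptide: str) -> bool:
    """True if the same amino acid occupies four consecutive positions."""
    return _RUN_OF_FOUR.search(peptide) is not None


def clean(peptides: List[str]) -> List[str]:
    """Drop peptides that contain a run of four identical residues."""
    return [p for p in peptides if not has_run_of_four(p)]


# Toy sequences, made up for illustration only.
raw = ["AAAAKLMNV", "AKAAALMNV", "KLAAAAAAV", "GILDEFKTL"]
cleaned = clean(raw)
print(cleaned)
```

A run of three identical residues, as in the second toy sequence, is kept; only runs of four or more trigger removal.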

Table 2. Some of the best rules for four types of MHC molecule using four-fold cross-validation.

Molecule      Rule                               Avg. Support %   Avg. Confidence %   Avg. CSC-Measure %   Avg. Accuracy %
HLA-A020110   A1 ∨ A5 ∨ A6 ∨ A7 ⇒ A2             69.3             83.22               75.59                76.38
              A1 ∨ A6 ∨ A7 ∨ A8 ⇒ A2             69.3             86.89               77.01                76.38
              A1 ∨ A5 ∨ A6 ∨ A7 ∨ A8 ⇒ A2        71.92            83.71               77.32                78.88
HLA-A02019    A1 ∨ A3 ∨ A5 ∨ A6 ⇒ A2             77.25            93.25               84.48                82.19
              A1 ∨ A3 ∨ A5 ∨ A6 ∨ A7 ⇒ A2        83.48            93.71               88.29                85.93
              A1 ∨ A3 ∨ A4 ∨ A6 ∨ A7 ∨ A8 ⇒ A9   85.66            93.85               89.56                87.82
HLA-B089      A1 ∨ A4 ∨ A2 ⇒ B3                  68.05            74.17               70.96                83.33
              A1 ∨ C1 ∨ M2 ∨ A6 ∨ A8 ⇒ A9        72.22            75.63               73.84                87.5
              C4 ∨ B5 ∨ A6 ∨ A7 ∨ A8 ⇒ A9        75               77.28               76.11                87.5
HLA-B27059    B1 ∨ A3 ∨ A6 ⇒ B2                  79.27            100                 88.41                92.85
              B1 ∨ A3 ∨ A6 ∨ A9 ⇒ B2             90.80            100                 95.17                100
              B1 ∨ A3 ∨ A5 ∨ A6 ∨ A7 ∨ B9 ⇒ B2   95.4             100                 97.64                100

4.1 Testing Method

The data set we used for association rule mining is non-redundant, and the number of sequences in it is not large enough, especially for certain MHCs, to split the database into separate test and training sets. We used only binding peptides (positives) for the rule mining and testing processes. Since we did not work with non-binding peptides (negatives), we can only calculate the sensitivity of our rules. Therefore we refer to sensitivity as the accuracy of our method. For the testing process, rules whose accuracy values are among the top 80% of all accuracy values are used. Some of the best rules are presented in Table 2. The values in Table 2 are obtained using a training set of 198 peptides in total for 4 different MHC types. We can predict binding with up to 100% accuracy and 97% coverage in some cases (Table 2).

We used four-fold cross-validation to test the accuracy and validity of the rules we mined. The data set was split randomly into four subsets. We obtained the association rules using three subsets and tested them on the fourth. We then switched the test and training sets until we had run all possible combinations. The testing procedure used the association rules generated from the training set to identify binders. The values in Table 2 are averages over the four tests. The cross-validation showed that association rules can predict binders with between 76% and 100% accuracy.
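
The four-fold procedure can be sketched generically as below. Since only binders are available, each fold is scored by sensitivity. The `train` and `predict` callables are placeholders: the toy versions shown here (most common residue at position two) are stand-ins made up for this illustration and are not the rule miner described in this paper; all names and the toy peptides are assumptions.

```python
import random
from collections import Counter
from typing import Any, Callable, List


def k_fold_sensitivity(peptides: List[str],
                       train: Callable[[List[str]], Any],
                       predict: Callable[[Any, str], bool],
                       k: int = 4, seed: int = 0) -> float:
    """Average percentage of held-out binders recognised over k folds."""
    shuffled = list(peptides)
    random.Random(seed).shuffle(shuffled)      # random split, reproducible
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i, test_fold in enumerate(folds):
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        model = train(training)                # e.g. mine rules on three folds
        hits = sum(1 for p in test_fold if predict(model, p))
        scores.append(100.0 * hits / len(test_fold))
    return sum(scores) / k


# Stand-in model: remember the most common residue at position two.
def toy_train(training: List[str]) -> str:
    return Counter(p[1] for p in training).most_common(1)[0][0]


def toy_predict(model: str, peptide: str) -> bool:
    return peptide[1] == model


# Toy binders, made up for illustration; all share L at position two.
toy_binders = [a + "LCDEFGHI" for a in "AKRSTVMN"]
avg_sensitivity = k_fold_sensitivity(toy_binders, toy_train, toy_predict)
print(avg_sensitivity)
```

To reproduce the setting of this section, `train` would be replaced by the OR-rule mining step and `predict` by a check of whether a peptide satisfies the mined rules; how exactly a rule flags a binder is left open here because the text does not specify it.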

The accuracy of the resulting rules depends on the confidence and support thresholds. For some MHC classes, the dataset is not sufficiently large, so small confidence and support thresholds must be used. For sufficiently large datasets, large support and confidence thresholds can be set, yielding 90% accuracy.


The CSC-Measure gives a better picture of the success of our method. CSC values varied between 71% and 92%. Our method yields approximately 81% accuracy.

Brusic et al. report a predictive value of 78% for binding to human MHC HLA-A2 and 88% for mouse MHC H-2K^b using ANNs [6]. Udaka et al. report approximately 80% accuracy using a scoring program for prediction on three mouse MHC binding sequences [17]. Dönnes et al. report that 90% of all the peptides known to bind to MHC can be predicted with 90% specificity using support vector machines on data for 21 MHC types [7]. In another article, Udaka et al. report that HMMs achieve 84% precision, assessing their method with a precision-recall curve analysis [16]. SYFPEITHI uses a profile-based method that evaluates the contribution of each amino acid in a peptide to the binding process and assigns an overall score to a given peptide. The scoring process is based on knowledge of anchor and auxiliary anchor positions. For a given protein, all possible octamers, nonamers and decamers are evaluated, and SYFPEITHI reports that the naturally presented epitope is among the top-scoring 2% of all peptides in 80% of all predictions [11]. The methods reported here use different data sets with varying data preprocessing steps, so our results are not directly comparable to theirs, except for SYFPEITHI, with which we share our dataset.

5 Discussion and Perspectives

The novelties of our approach are the use of the OR operator and the reduced amino acid alphabet classification. We used a new association rule mining operator (OR) to combine rules in order to describe the binding preferences of MHC molecules. This combination better explains the importance of specific sites on the binding peptide.

The second and ninth positions appear most frequently in the motifs. These positions have highly correlated hydrophobicity values, which is also supported by Zhang et al. [19]. Zhang et al. also report that HLA-A02 classes require isoleucine, valine, leucine or methionine (class A in our amino acid classification) as consensus anchors for binding, and that HLA-B classes need charged residues (classes B and C in our classification) as consensus anchors. These findings also agree with our rules. We also used a reduced amino acid alphabet, which helped us to determine the important physical and chemical properties of amino acids required at significant positions for successful binding to MHC. Deriving general rules for binding is a crucial contribution of our method. Profile-based methods assume a contribution from every position of the peptide, even though some positions contribute more than others depending on the frequency of occurrence at the given position. Even if a peptide has the binding motif at the key positions, the scores coming from the other sites can cause it to be classified as a non-binder. According to Gulukota et al. [8], profile-based methods have 30% accuracy in the prediction of binders. Our method points out the key positions and significant features for binding. Machine learning based methods can predict binders with high accuracy and specificity but cannot report the features that are important for binding, which is crucial information for vaccine design. Therefore, they are not well suited to this type of application.

So far, we have not considered the information coming from non-binders in this work. For future work, we are developing a new approach that also takes non-binder information into account. We are also trying to scan for explicit pairs or triplets in peptide sequences using a Bayesian approach, and to compare its efficiency with our method.

References

[1] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. 1993 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'93), Washington, DC, pp. 207-216, May 1993.

[2] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 1994 Int. Conf. Very Large Data Bases (VLDB'94), Santiago, Chile, pp. 487-499, Sept. 1994.

[3] M. Bhasin, H. Singh, G. P. S. Raghava. MHCBN: A Comprehensive Database of MHC Binding and Non-Binding Peptides. Nucleic Acids Research, Vol. 19, no. 5, pp. 665-666, 2002.

[4] V. Brusic, V. B. Bajica, N. Petrovsky. Computational methods for prediction of T-cell epitopes: a framework for modelling, testing, and applications. Methods, 34(4):436-43, 2004.

[5] V. Brusic and D. R. Flower. Bioinformatics tools for identifying T-cell epitopes. DDT: BIOSILICO, Vol. 2, No. 1, pp. 18-23, January 2004.

[6] V. Brusic, G. Rudy, L. C. Harrison. Prediction of MHC Binding Peptides Using Artificial Neural Networks. Complexity International, Volume 2, 1995.

[7] P. Dönnes, A. Elofsson. Prediction of MHC class I binding peptides, using SVMHC. Bioinformatics, 3:25, 2002.

[8] K. Gulukota, J. Sidney, A. Sette, C. DeLisi. Two complementary methods for predicting peptides binding major histocompatibility complex molecules. J. Mol. Biol., 267:1258-1267, 1997.

[9] P. M. Kloetzel. The proteasome and MHC class I antigen processing. Biochimica et Biophysica Acta 1695, pp. 217-225, 2004.

[10] T. Milledge, G. Zheng, G. Narasimhan. An Application of Association Rule Mining to HLA-A*0201 Epitope Prediction. ICBA, 2004.

[11] H. G. Rammensee, J. Bachmann, N. P. N. Emmerich, O. A. Bachor, and S. Stevanovic. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics, 50(3-4):213-219, 1999.

[12] H. G. Rammensee, T. Friede, S. Stevanovic. MHC ligands and peptide motifs: 1st listing. Immunogenetics 41, pp. 178-228, 1995.


[13] P. A. Reche, J. P. Glutting, and E. L. Reinherz. Prediction of MHC Class I Binding Peptides Using Profile Motifs. Hum. Immunol., 63:701-709, 2002.

[14] O. U. Sezerman, R. Islamaj and E. Alpaydin. Three dimensional representation of amino acid characteristics. IEEE EMBC, Vol. 3, pp. 2903-2906, 2001.

[15] H. Singh and G. P. S. Raghava. ProPred1: prediction of promiscuous MHC Class-I binding sites. Bioinformatics, Vol. 19, no. 8, pp. 1009-1014, 2003.

[16] K. Udaka, H. Mamitsuka, Y. Nakaseko and N. Abe. Empirical Evaluation of a Dynamic Experiment Design Method for Prediction of MHC Class I-Binding Peptides. The Journal of Immunology, 169:5744-5753, 2002.

[17] K. Udaka, K. H. Wiesmuller, S. Kienle, G. Jung, H. Tamamura, H. Yamagishi, K. Okumura, P. Walden, T. Suto, T. Kawasaki. An automated prediction of MHC class I-binding peptides based on positional scanning with peptide libraries. Immunogenetics, pp. 816-828, 2000.

[18] J. Zeng, H. R. Treutlein and G. B. Rudy. Predicting sequences and structures of MHC-binding peptides: a computational combinatorial approach. Journal of Computer-Aided Molecular Design, pp. 573-576, 2001.

[19] C. Zhang, A. Anderson, C. DeLisi. Structural principles that govern the peptide-binding motifs of class I MHC molecules. J. Mol. Biol., 929-947, 1998.