Dec 13, 2013 (3 years and 4 months ago)

Mike Arnoult

9/30/2010

The Role of Artificial Neural Networks in Phage Research

What is an Artificial Neural Network?

Mathematical and computational model

Motivated by biological neurons

Trained by using features to learn patterns and commonalities

Uses values of its neuron connections to classify an example

The neural network can be trained to recognize features of phage proteins and to distinguish between them.

I have trained ANNs to recognize and classify phage major capsid proteins.
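
To make "uses values of its neuron connections to classify an example" concrete, here is a minimal Python sketch of a forward pass through a network with one hidden layer. The slides do not state the architecture, activation function, or threshold that were actually used, so the layer sizes, the sigmoid activation, the 0.5 cutoff, and every weight below are illustrative assumptions.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Return a score in (0, 1) for one example.

    hidden_weights holds one weight list per hidden neuron; each list has
    one weight per input feature.
    """
    hidden = [
        sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

def looks_like_capsid(features, params, threshold=0.5):
    """True when the network score reaches the (assumed) threshold."""
    return forward(features, *params) >= threshold

# Made-up parameters: 2 input features, 2 hidden neurons, 1 output neuron.
example_params = ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], [2.0, -1.0], 0.0)
print(looks_like_capsid([0.2, 0.1], example_params))
```

Training amounts to adjusting these weights and biases until the scores agree with the labels of the training examples.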

Why Apply Artificial Neural Networks to Phage Research?

What is a Bacteriophage?

A virus that infects bacteria

The most common biological entity on Earth

Has a major impact on any environment that contains bacteria

A type of virus with a distinctive structure that injects its genome into a host through its tail

A possible alternative to antibiotics in medicine


How the ANN works:

Why Apply Artificial Neural Networks to Bioinformatics?

The neural network can be trained to recognize features of proteins and to distinguish between them.

In my research, I will train neural networks to recognize phage major capsid or tail proteins.



What I’ve done so far:

I’ve collected positive and negative data sets from NCBI

Positive data sets included phage major capsid proteins and synonyms:
Major Shell Protein
Major Head Protein
Major Coat Protein
Major Procapsid Protein
Major Prohead Protein…

Negative data sets included phage proteins unrelated to major capsid proteins:
Packaging proteins
Spike proteins
DNA and RNA Polymerase
Assembly proteins
Contractile Sheath proteins



What I’ve done so far:

I have written and used Perl scripts to filter the training data

Any sequences with conspicuously incorrect GenPept annotations were removed from the positive data set.

All sequences with major capsid protein related annotations were removed from the negative data set.
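
The negative-set rule above can be sketched as a keyword filter. The original filtering was done with Perl scripts that are not shown in the slides, so this Python version is only an illustration: the regular expression is built from the synonyms listed on the earlier slide, the function names are hypothetical, and the example records are made up. How an annotation was judged "conspicuously incorrect" for the positive set is not stated, so that step is not sketched.

```python
import re

# Built from the synonyms on the data-set slide; the real scripts may
# have matched additional or different terms.
MCP_PATTERN = re.compile(
    r"major\s+(capsid|shell|head|coat|procapsid|prohead)\s+protein",
    re.IGNORECASE,
)

def is_mcp_annotation(annotation):
    """True when the annotation text mentions a major capsid protein synonym."""
    return MCP_PATTERN.search(annotation) is not None

def filter_negative_set(records):
    """Drop (annotation, sequence) records whose annotation looks MCP-related."""
    return [(ann, seq) for ann, seq in records if not is_mcp_annotation(ann)]

# Made-up records for illustration only.
negative_candidates = [
    ("putative major coat protein", "MAAK"),
    ("DNA polymerase", "MSEL"),
]
print(filter_negative_set(negative_candidates))
```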


What I’ve done so far:

I’ve turned the sequences into percent compositions of amino acids and side-chain groups, to train neural networks

The positive entries are labeled with a 1 and the negative entries are labeled with a -1.

Using a MATLAB script, a random 20% of the positive data set is set aside and used as a test set against the other 80%.
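
The feature and hold-out steps on this slide can be sketched as follows. The original work used a MATLAB script that is not shown, so this Python version is a stand-in under stated assumptions: the function names are hypothetical, residues outside the 20 standard amino acids are simply ignored, and the hold-out size is rounded to the nearest whole example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def percent_composition(sequence):
    """Percent of each standard amino acid, in AMINO_ACIDS order."""
    counted = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not counted:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counted.count(aa) / len(counted) for aa in AMINO_ACIDS]

def split_test_set(examples, test_fraction=0.2, seed=None):
    """Shuffle a copy of examples and return (training, test) lists."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Made-up sequences for illustration only.
sequences = ["MKTAYIAK", "MSELDQLR", "MAAKGGVV", "MTTPLLQE", "MKKLLDEA"]
features = [percent_composition(s) for s in sequences]
training, test = split_test_set(features, test_fraction=0.2, seed=0)
print(len(training), len(test))
```

Fixing the seed makes a split repeatable; leaving it as None gives a fresh random split on each run, which is what repeated training runs would need.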


What I’m doing now:

To find which criteria are best suited to training the neural network to recognize phage major capsid proteins…

I am training neural networks using different characteristics of amino acid side-chains (polar, nonpolar, aromatic, positive and negative)

Adjusting parameters of the way the MATLAB script trains neural networks.
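
Side-chain group features can be computed the same way as amino acid percentages, by counting residues per group. The slides name the five groups but not which amino acids were assigned to each, and textbooks differ on borderline cases such as histidine, cysteine, and glycine. The assignment below is therefore one common convention chosen for illustration, not the one confirmed by the author.

```python
# Assumed assignment of the 20 standard amino acids to the five groups
# named on the slide; other conventions exist.
SIDE_CHAIN_GROUPS = {
    "nonpolar": "GAVLIMP",
    "aromatic": "FWY",
    "polar": "STCNQ",
    "positive": "KRH",
    "negative": "DE",
}

def group_composition(sequence):
    """Percent of residues falling in each side-chain group."""
    sequence = sequence.upper()
    known = set("".join(SIDE_CHAIN_GROUPS.values()))
    total = sum(1 for r in sequence if r in known)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {
        name: 100.0 * sum(1 for r in sequence if r in members) / total
        for name, members in SIDE_CHAIN_GROUPS.items()
    }

# Made-up sequence for illustration only.
print(group_composition("MKTAYIAKDE"))
```

Appending these five values to the 20 amino acid percentages gives a 25-value feature vector, which is one way to compare networks trained with and without side-chain information.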

Classification of Known Sequences:

The values are average percentages of correctly classified sequences across 1000 separately trained neural networks.

Amino acid and side-chain percent compositions used as features

Amino acid percent compositions used as features

No side chains: 92.9233%
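
The averaging described on this slide can be sketched as below: score each trained network on its test set, then take the mean of those scores. The function names and the tiny prediction lists are made up for illustration, and the sketch does not reproduce the reported figure.

```python
def percent_correct(predicted, actual):
    """Percent of test sequences whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * hits / len(actual)

def mean_percent_correct(runs):
    """Average percent_correct over runs, one (predicted, actual) pair per network."""
    scores = [percent_correct(p, a) for p, a in runs]
    return sum(scores) / len(scores)

# Two made-up runs standing in for the 1000 separately trained networks.
runs = [
    ([1, 1, 0, 1], [1, 1, 0, 0]),
    ([1, 0, 0, 1], [1, 0, 0, 1]),
]
print(mean_percent_correct(runs))
```

Averaging over many independently trained networks smooths out the effect of random weight initialisation and of which sequences happen to land in the test set.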


What I’m going to do soon:

Test the neural networks using other phage major capsid proteins

Ramy’s curated phage major capsid proteins

Eventually verify the neural network predictions in the lab.


THE END