Module 5: Algorithms in Bioinformatics

(Instructor: H. Ali)

Description:

The module introduces students to key problems in bioinformatics and shows how to solve them
using several problem-solving techniques. In particular, it covers algorithms for comparing
biological sequences, constructing evolutionary trees, and finding genes in sequenced genomes.
The module emphasizes how solving various bioinformatics problems has become a key contributor
to our biological knowledge. For example, the biological sciences have a long tradition of
discovery by comparison: information about an unknown biological element can be estimated by
comparing its attributes to those of known elements. With the current development of
Bioinformatics algorithms, it is natural to use biological sequences as the attributes for
exploring potential similarities between the unknown element and various known ones. The course
will present basic algorithmic concepts in computational biology and show how they are connected
to molecular biology and biotechnology. For example, students will be introduced to alignment
algorithms through a simple introduction to dynamic programming. The module will also include
algorithms for gene prediction, clustering, and constructing evolutionary trees.
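
To make the link between alignment and dynamic programming concrete, the following is a minimal Python sketch of global alignment scoring in the Needleman-Wunsch style. It is an illustration only, not necessarily the formulation used in the module; the function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of sequences a and b.

    The scoring values are illustrative assumptions, not taken from the module.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # a[i-1] aligned to a gap
                score[i][j - 1] + gap,       # b[j-1] aligned to a gap
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

The table has (len(a) + 1) x (len(b) + 1) cells and each cell is filled in constant time from three neighbors, which is where the dynamic programming saving comes from compared with enumerating every possible alignment.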


Homework:

As part of the module, students will be asked to apply the introduced problem-solving methods to
simple but critical Bioinformatics problems, with a focus on how sequence comparison techniques
can be used to classify and recognize organisms.


Intended audience:

At UNO, the target course for this module will be BIOL 4960 (Advanced Genetics). It would also
be very applicable to CSCI 3320 (Data Structures).


Module Outline:

I. General introduction to basic problems in Bioinformatics

   A. Very brief introduction to Bioinformatics
   B. Basic Biological (Algorithmic) concepts to Biology (Computer Science) students
   C. Overview of key Bioinformatics problems and associated algorithms
      - Sequence comparison
      - Sequencing and map assembly
      - Gene prediction
      - Phylogenetic trees

II. Sequence comparison and alignment algorithms

   A. Local and global alignment
   B. Multiple sequence alignment
   C. Applications of Sequence Comparison
      - Identification/classification of organisms
      - Gene Prediction

III. Clustering algorithms and evolutionary trees

   A. Brief introduction to clustering algorithms
   B. Using a simple algorithm to construct evolutionary trees
   C. Linking multiple sequence alignment to Phylogeny

Learning Objectives:

1. To gain a good understanding of what the field of Bioinformatics is and how it can play a
   significant role in solving various problems in the domain of biosciences.

Students will be asked various questions related to what Bioinformatics is and, since it is a
new, emerging multi-disciplinary field, how other traditional disciplines contribute to the
understanding of its basic concepts.

A) Which of the following statements best describes the field of bioinformatics?

   a. Applying biological concepts in the design of computer algorithms.
   b. Using mathematical and computational techniques to solve biological problems.
   c. Using computers to speed up what bioscientists do manually.
   d. Using supercomputers to store complex biological data.

B) Which of the following concepts are related to bioinformatics? (Check all that apply)

   [ ] Genetic algorithms
   [ ] Sequence comparisons
   [ ] Constructing evolutionary trees
   [ ] Swarming techniques
   [ ] Ant algorithms

C) Bioinformatics is a multi-disciplinary field of study. Which of the following traditional
   disciplines are necessary for the study of bioinformatics? (Check all that apply)

   [ ] Computer science
   [ ] Biology
   [ ] Information systems
   [ ] Pharmacy
   [ ] Mathematics and statistics
   [ ] Medicine
   [ ] Chemistry


2. To gain general knowledge of the main Bioinformatics algorithms and their main applications,
   with a particular focus on sequence comparison in the context of analyzing biological data.

Students will be asked to list the main Bioinformatics problems and suggest basic algorithms to
solve simplified versions of important Bioinformatics problems, such as how to measure the
similarity between two biological sequences. They will also be asked how sequence comparison
techniques can be used to recognize and/or classify various organisms.
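
One simplified way to picture "recognizing an organism by sequence comparison" is to score an unknown sequence against a few labeled reference sequences and report the best match. The Python sketch below uses a Smith-Waterman style local alignment score as the similarity measure. The function names, scoring values, species labels, and sequences are all hypothetical and chosen only for illustration; real classification pipelines are considerably more involved.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between any substrings of a and b.

    Scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, prev[j - 1] + pair, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def classify(unknown, references):
    """Return the label of the reference sequence most similar to `unknown`."""
    return max(references, key=lambda label: local_alignment_score(unknown, references[label]))


# Hypothetical reference sequences, made up for this example.
references = {"species_A": "ACGTACGTGA", "species_B": "TTGGCCAATT"}
print(classify("CGTACGT", references))
```

Only two rows of the scoring table are kept at a time because the sketch needs the score, not the alignment itself.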

A) Sequence alignment is a key operation for several bioinformatics applications. Which of the
   following statements are true about bioinformatics applications? (Check all that are true)

   [ ] Sequence alignment is a computationally intensive problem.
   [ ] Sequence alignment is the only method used to compare sequences.
   [ ] Dynamic Programming (DP) is widely used to solve the alignment problem.
   [ ] A solution to the global alignment problem can be achieved in quadratic time complexity.
   [ ] Heuristics have been used to find a near optimal solution in linear time for the global
       alignment problem.
   [ ] Pairwise sequence alignment can be easily extended to solve the alignment of multiple
       sequences.


B) Genes are important segments of DNA in every genome since they code for proteins. Various
   species are known to share genes; this can be used to search for genes by finding conserved
   regions in the genomes of the species. Which of the following statements is true?

   a. Multiple sequence alignment is ideal for finding conserved segments of DNA among various
      organisms.
   b. Finding genes by aligning genomes works better if the genomes belong to closely related
      organisms.
   c. Closely related species tend to share conserved DNA regions beyond genes.
   d. Classification of organisms can be obtained by comparing how many DNA segments are
      conserved among their genomes.


3. To develop a good understanding of how to apply known algorithmic techniques to solve specific
   Bioinformatics problems, such as how to construct evolutionary trees for a set of related
   organisms.

Students will be asked to compare various clustering algorithms and how suitable each one is for
constructing evolutionary trees. They will also be asked to demonstrate how trees can be
constructed using various algorithmic techniques.
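
As one example of how a clustering algorithm can yield a tree, the Python sketch below repeatedly merges the two closest clusters and averages their distances to the remaining clusters, in the style of UPGMA (average-linkage clustering). This is one simple option among several and is not necessarily the algorithm used in the module. The function name, organism labels, and distance values are hypothetical; in practice the distances would be derived from sequence comparisons.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree as nested tuples by average-linkage clustering.

    names: unique leaf labels.
    dist:  dict mapping a pair of labels (a, b) to their distance.
    """
    clusters = {name: 1 for name in names}  # subtree -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        x, y = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        nx, ny = clusters.pop(x), clusters.pop(y)
        merged = (x, y)
        # Distance to the merged cluster is the size-weighted average.
        for other in clusters:
            d[frozenset((merged, other))] = (
                nx * d[frozenset((x, other))] + ny * d[frozenset((y, other))]
            ) / (nx + ny)
        clusters[merged] = nx + ny
    return next(iter(clusters))


# Hypothetical distances, made up for this example.
distances = {("human", "chimp"): 2, ("human", "mouse"): 10, ("chimp", "mouse"): 10}
print(upgma(["human", "chimp", "mouse"], distances))
```

UPGMA assumes that all lineages evolve at roughly the same rate, so the resulting tree is only as trustworthy as that assumption and the input distances.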


A) Constructing the tree of life is one of the most outstanding problems in biosciences. Such a
   tree would depict how various organisms evolved from each other. Which of the following
   features would the tree of life have? (Check all that apply)

   [ ] The distance between two organisms in the tree would reflect how closely related they
       are.
   [ ] The tree would provide information about which species are threatened with extinction
       and hence need protection.
   [ ] The tree would be a binary tree.
   [ ] The path between two nodes would represent the number of evolutionary steps that took
       place to evolve one organism from another.
   [ ] Primates would be grouped together in the tree.


B) Constructing the tree of life is an intractable problem. As a result, many heuristics have
   been developed to find approximate trees. Which of the following statements are consistent
   with the previous sentence? (Check all that apply)

   [ ] Finding the optimal evolutionary tree is not computationally possible within a
       reasonable amount of time.
   [ ] Finding such a tree may be easier for a certain group of organisms but may be very hard
       to obtain for another group.
   [ ] Practically, all approaches used to construct evolutionary trees are heuristics since
       they may not produce the best possible tree for every given input.
   [ ] Constructing an evolutionary tree would take too much time using a regular computer, but
       would take a reasonable time using a supercomputer.