the laboratory for genomics and bioinformatics


Oct 2, 2013


Bioinformatics

Edgar Scott

What is Bioinformatics?


An interdisciplinary field that combines concepts from biology,
probability and statistics, and computer science to create and test
new hypotheses based on sequence data.


Resources


Databases (e.g., protein sequence databases, nucleotide sequence databases,
tertiary structure databases)


Computational Tools (e.g., sequence similarity searching, phylogenetic
analysis, tertiary structure prediction tools)


Bioinformatics Applied


Pharmacogenomics


Genome Sequence and Annotation


Forensic Entomology

EMBL


European Molecular Biology Laboratory (EMBL)


Part of International Nucleotide Sequence
Database Collaboration


EMBL


National Center for Biotechnology Information (NCBI)


DNA Databank of Japan (DDBJ)


Provides numerous databases and computational
tools

Databases and Database Records


Database - a collection of data or records stored in a
computer system


Database record - a data file that contains a sequence
and annotations


Sequence


…TAGCCTCCTTATTCGAGCCGAGCTGGGCCAGCCAGGCA
ACCTTCTAGGTAACGACCACATCTACAACGTT…


Annotation


Mitochondrial genome


DNA


Homo sapiens


AM948965



Accession number - a unique identifier for the database record
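To make the idea of a database record concrete, here is a minimal sketch of one as a Python data structure. The class name `DatabaseRecord` and its field names are hypothetical, chosen for illustration only; they do not reproduce the EMBL flat-file format. The sequence is the short excerpt shown above, not the full entry.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseRecord:
    """Hypothetical in-memory view of a record: sequence plus annotations."""
    accession: str                     # unique identifier for the record
    sequence: str                      # the nucleotide (or protein) sequence
    annotations: dict = field(default_factory=dict)


# Values taken from the slide; the sequence is only the excerpt shown there.
record = DatabaseRecord(
    accession="AM948965",
    sequence=("TAGCCTCCTTATTCGAGCCGAGCTGGGCCAGCCAGGCA"
              "ACCTTCTAGGTAACGACCACATCTACAACGTT"),
    annotations={
        "organism": "Homo sapiens",
        "molecule": "DNA",
        "source": "Mitochondrial genome",
    },
)

print(record.accession, record.annotations["organism"], len(record.sequence))
```

Keeping the annotations in a dictionary keyed by name mirrors how the accession number acts as the lookup key for the whole record.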

Example Database Record


Point your browser to
http://www.ebi.ac.uk/


Type the accession number AM948965 into the text box



Click on Nucleotide Sequences


Click on AM948965



Sequence Alignments


Sequence alignment - a comparison between two sequences to identify a
series of characters or character patterns in the same order in both sequences.


Pairwise Global Alignment


Pairwise Local Alignment


Multiple Sequence Alignment


Basic Local Alignment Search Tool (BLAST) - a sequence similarity search
tool.


Compares a query sequence to a sequence database using the local alignment
method.


Returns a list of sequences that are significantly similar to the query.
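The local alignment method mentioned above can be sketched with the Smith-Waterman dynamic program, which finds the best-scoring pair of matching regions inside two sequences. This is not how BLAST itself is implemented (BLAST uses a faster heuristic), and the function name and the scoring values (match +2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region "ACGT" scores 4 matches * 2 = 8 despite unrelated flanks.
print(local_alignment_score("TTTACGTAAA", "GGGACGTCCC"))
```

A global alignment would instead score the sequences end to end, so the unrelated flanking regions would lower the score.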


Types of BLAST programs


Blastp:

compares a protein sequence to a protein database


Blastn:

compares a DNA sequence to a DNA database


Blastx:

compares a translated DNA sequence to a protein database


Tblastn:

compares a protein sequence to a translated DNA database


Tblastx:

compares a translated DNA sequence to a translated DNA database
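The list above amounts to a lookup from query type and database type to a program name. The sketch below encodes it that way; the function `choose_blast_program` and its `translate_both` flag are hypothetical helpers for illustration and are not part of any BLAST distribution.

```python
# (query type, database type) -> BLAST program, as listed above.
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("dna", "dna"): "blastn",
    ("dna", "protein"): "blastx",    # query is translated
    ("protein", "dna"): "tblastn",   # database is translated
}


def choose_blast_program(query_type, database_type, translate_both=False):
    """Pick a BLAST program name for the given sequence types."""
    if translate_both:
        # tblastx translates both a DNA query and a DNA database.
        if (query_type, database_type) != ("dna", "dna"):
            raise ValueError("tblastx needs a DNA query and a DNA database")
        return "tblastx"
    return BLAST_PROGRAMS[(query_type, database_type)]


print(choose_blast_program("dna", "protein"))
print(choose_blast_program("dna", "dna", translate_both=True))
```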



Sequence Alignments


Alignment features


Identical matches


Conservative matches (conservative
substitutions)


mismatches


gaps


Alignment scoring


Percent identity = (identical matches / alignment length) * 100


Percent similarity = ((identical + conservative matches) / alignment length) * 100


Alignment score = a score that
measures the similarity between the
two sequences being compared that
takes into account all identical
matches, conservative matches,
mismatches, and gaps.


Expectation value (E-value) = an estimate of
the number of alignments with this
alignment score or better that would be
expected by random chance in a
database search.
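The percent identity and percent similarity formulas above can be worked through on a small aligned pair. In this sketch the function names and the amino acid groupings used to decide what counts as a conservative match are assumptions for illustration; real tools usually decide this from a substitution matrix.

```python
# Illustrative groupings of similar amino acids (an assumption, not a standard).
CONSERVATIVE_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ"]


def is_conservative(x, y):
    """True if x and y differ but fall in the same illustrative group."""
    return x != y and any(x in g and y in g for g in CONSERVATIVE_GROUPS)


def alignment_stats(row_a, row_b):
    """Count alignment features for two equal-length aligned rows ('-' = gap)."""
    if len(row_a) != len(row_b) or not row_a:
        raise ValueError("rows must be non-empty and of equal length")
    identical = conservative = gaps = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            gaps += 1
        elif x == y:
            identical += 1
        elif is_conservative(x, y):
            conservative += 1
    length = len(row_a)  # alignment length includes gap columns
    return {
        "percent_identity": 100 * identical / length,
        "percent_similarity": 100 * (identical + conservative) / length,
        "gaps": gaps,
    }


# 8 columns: 5 identical, 2 conservative (K/R, D/E), 1 gap.
stats = alignment_stats("MKV-LDEF", "MRVALEEF")
print(stats)
```

For this pair the alignment length is 8, so percent identity is 5/8 * 100 = 62.5 and percent similarity is 7/8 * 100 = 87.5.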


BLAST example


Point your browser to
http://www.ebi.ac.uk/Tools/blast2/nucleotide.html


From the Lab Home Page, copy and paste the
BLAST input sequence into the input text box.


Press the Run BLAST button.

Molecular Phylogenetics


The analysis of molecular sequences to infer
evolutionary relationships between a group of
sequences or a group of organisms.


ClustalW2 - a bioinformatics program that creates
multiple sequence alignments and phylogenetic trees


Multiple sequence alignment - an alignment with three or
more sequences.


Phylogenetic tree - a diagram of nodes and branching lines
depicting close and distant relationships between sequences
or organisms.

Molecular Phylogenetics

Example MSA


Point your browser to
http://www.ebi.ac.uk/Tools/clustalw2/


From the Lab home page, copy and paste the
ClustalW input sequences into the text box.


Press the Run button.


Example Tree


Point a second browser to
http://www.ebi.ac.uk/Tools/clustalw2/


From your previous browser, copy and paste the entire
multiple sequence alignment into the text box.


Change the “TREE TYPE” designation from “none”
to “nj”.


Change the “IGNORE GAPS” designation from “off”
to “on”.


Press the Run button.
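A neighbor-joining ("nj") tree is built from pairwise distances between the aligned sequences. The sketch below shows one plausible way such distances could be computed with gaps ignored: columns containing a gap in any sequence are dropped, then the fraction of differing positions (an uncorrected p-distance) is taken for each pair. The function names and the toy alignment are hypothetical, and this is an assumption about what the "IGNORE GAPS" option does, not ClustalW2's actual code. The tree-building step itself is not shown.

```python
def drop_gap_columns(alignment):
    """Remove every column that has a gap ('-') in at least one sequence."""
    rows = list(alignment.values())
    keep = [i for i in range(len(rows[0])) if all(r[i] != "-" for r in rows)]
    return {name: "".join(row[i] for i in keep)
            for name, row in alignment.items()}


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(alignment):
    """Pairwise p-distances after gap columns are dropped."""
    trimmed = drop_gap_columns(alignment)
    return {(m, n): p_distance(trimmed[m], trimmed[n])
            for m in trimmed for n in trimmed}


# Toy alignment (made up for illustration); columns 3 and 5 contain gaps.
msa = {
    "seq1": "ACGT-ACGT",
    "seq2": "ACGTTACGA",
    "seq3": "AC-TTACTA",
}
trimmed = drop_gap_columns(msa)
distances = distance_matrix(msa)
print(trimmed)
print(distances[("seq1", "seq3")])
```

In the toy alignment seq1 differs from seq3 at two of the seven gap-free columns and from seq2 at one, so a distance-based tree would group seq1 with seq2 more closely than with seq3.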