Bioinformatics AM


2 Oct 2013


Bioinformatics

Definitions:


“The application of computational techniques to the management and
analysis of biological information”


“The science of the analysis, handling and storage of biological information,
especially the large amounts of data emerging from genome-sequencing and
proteomics work”

Milestones

1887: NIH (National Institutes of Health) established

1974: EMBL (European Molecular Biology Laboratory) established

1977: Dideoxy DNA sequencing method developed (Sanger)

1986: DDBJ (DNA Data Bank of Japan) established

1988: NCBI (National Center for Biotechnology Information) established

1990: Official launch of the Human Genome Project

1995: First physical map of the human genome

2001: First draft of the human genome completed

2008: >50 eukaryote genomes completed

2010: >700 eukaryote genome projects finished or in progress

The main databases

Cross-referenced and updated daily

Major Sequencing centres


Washington University:
http://www.genome.wustl.edu/

Chicken, chimp, Drosophila simulans, Drosophila yakuba


Baylor College of Medicine:
http://www.hgsc.bcm.tmc.edu/

Cattle, rat, honeybee, Drosophila pseudoobscura


JCVI (J Craig Venter Institute):
http://www.jcvi.org/

Formerly TIGR and other institutions.

Rice, wheat, many eukaryotes


Wellcome Trust Sanger Institute:
http://www.sanger.ac.uk/

Humans, zebrafish, Caenorhabditis elegans, Ensembl


Broad Institute:
http://www.broadinstitute.org/

Chimp, dog

Visiting NCBI

http://www.ncbi.nlm.nih.gov/

This session: a walk through some of these resources

PubMed: a list of citations, references etc. Similar to Web of Science

All Resources: an entry point into all of the databases within NCBI

BLAST: Basic Local Alignment Search Tool

Bookshelf: Some classic textbooks online

Taxonomy: Species listed in hierarchical manner. Database stats & links

Genomes: portal to numerous genome projects


Navigating the NCBI toolbars: part 1

All Resources: very comprehensive listing.


Example features:

Stand-alone BLAST (software)

NCBI handbook and help

FTP sites


Navigating the NCBI toolbars: part 2

Looking at nucleotide sequences - NCBI

Enter a search term: ovis[orgn] AND PRNP

Or an accession number: NM_001009481
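
The same search can also be expressed programmatically. The sketch below only builds a query URL for the NCBI E-utilities ESearch service; it does not contact the server. Using E-utilities at all is an assumption of this example (the slides use the web interface), and the function name `esearch_url` and the `retmax` default are illustrative choices.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumed here; the slides themselves
# use the NCBI web pages rather than this programmatic interface).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="nuccore", retmax=20):
    """Build an ESearch URL for a query against an Entrez database.

    'nuccore' is the Entrez name of the Nucleotide database.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode percent-encodes the square brackets and turns spaces into '+'.
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


# The search term from the slide: sheep (Ovis) records mentioning PRNP.
search_url = esearch_url("ovis[orgn] AND PRNP")
print(search_url)
```

Opening the printed URL in a browser should return an XML list of matching record identifiers, which is the programmatic counterpart of the results page.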



Sequence length

GenBank identifier

RefSeq (non-redundant)

Taxonomy

Publication

Coding sequence

Amino acid sequence

Nucleotide sequence

Exploring a nucleotide record

FASTA-formatted sequences

>Sequence 1

gagatactagcggcttatataggtact

>Sequence 2

tgatatatatttaaggatcgtaactgg

Use a fixed-width font such as Courier New
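
Because each record is just a '>' header line followed by sequence lines, the format is easy to read in code. The following is a minimal sketch of a reader, assuming headers are unique and the whole file fits in memory; the function name `parse_fasta` is illustrative, and for real work an established library parser would be a safer choice.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # A sequence may be wrapped over several lines; collect them all.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# The two example records from the slide.
example = """>Sequence 1
gagatactagcggcttatataggtact
>Sequence 2
tgatatatatttaaggatcgtaactgg
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```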

Retrieving FASTA-formatted sequence - NCBI

Select ‘fasta’ from menu

Then save fasta file onto your computer by choosing ‘File’

Retrieving FASTA-formatted sequence - NCBI

Select the sequences you want

Then select ‘fasta’ from the Display menu and ‘save to file’ from the Send to menu
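
The menu steps above have a scripted equivalent. This sketch assumes the NCBI E-utilities EFetch service is used in place of the web menus, which the slides do not mention; `efetch_fasta_url` and `save_fasta` are illustrative names. Only the URL is built when the block runs, and `save_fasta` is defined but not called because it needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities (assumed; not part of the slides).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_fasta_url(accessions, db="nuccore"):
    """Build an EFetch URL that returns the given records as FASTA text."""
    params = {
        "db": db,
        "id": ",".join(accessions),  # several accessions may be requested at once
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def save_fasta(accessions, path):
    """Download the records and write them to a local file (needs network)."""
    with urlopen(efetch_fasta_url(accessions)) as response:
        data = response.read()
    with open(path, "wb") as handle:
        handle.write(data)


# The accession number used earlier in the slides.
fetch_url = efetch_fasta_url(["NM_001009481"])
print(fetch_url)
```

Calling `save_fasta(["NM_001009481"], "prnp.fasta")` would then play the role of ‘save to file’; the file name here is only an example.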

Other resources

Gene Ontology

Ontologies are 'specifications of a relational vocabulary'.


“Controlled vocabulary used to describe concepts and their inter-relationships”

http://www.geneontology.org/

Biologists often need to retrieve and analyse data from disparate sources.

If databases use different terms, e.g. 'translation' and 'protein synthesis', for the same
process, it is difficult to find functionally equivalent terms.


The Gene Ontology (GO) project is a collaborative effort to address the need for
consistent descriptions of gene products in different databases.
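
A toy illustration of why a controlled vocabulary helps: once free-text labels are mapped to a shared identifier, annotations from different databases become directly comparable. GO:0006412 is the GO identifier for translation; the synonym table, the gene names and the two "databases" below are hypothetical and exist only for this sketch.

```python
# Toy synonym table (hypothetical). GO:0006412 is the GO term "translation".
SYNONYM_TO_GO = {
    "translation": "GO:0006412",
    "protein synthesis": "GO:0006412",
    "protein biosynthesis": "GO:0006412",
}


def to_go_id(label):
    """Map a free-text process label to a GO identifier, or None if unknown."""
    return SYNONYM_TO_GO.get(label.strip().lower())


# Two hypothetical databases describing the same process in different words.
database_a = {"geneA": "translation"}
database_b = {"geneB": "Protein synthesis"}

id_a = to_go_id(database_a["geneA"])
id_b = to_go_id(database_b["geneB"])

# The raw labels differ, but the GO identifiers match.
print(database_a["geneA"] == database_b["geneB"], id_a == id_b)
```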



Biological process: behaviour, reproduction, death

Molecular function: antitoxin, immunity protein, cell cycle regulator

Cellular component: cell wall, intracellular, extracellular

Other resources (Gene Ontology cont)

Bioconductor

http://www.bioconductor.org/

Bioinformatics packages written in R

Other R resources

Possible to browse R packages by topic at http://cran.r-project.org/


Text Editors: get a free one

Programmer’s File Editor

http://www.winsite.com/bin/Info?500000017700

http://www.download.com/3640-2352-904159.html

Crimson Editor

http://www.crimsoneditor.com/

NotesPad (not Notepad)

http://www.newbie.com/NotesPad/

EditPad

http://www.editpadpro.com/editpadlite.html

Journals

http://bioinformatics.oxfordjournals.org/

Emailed table of contents biweekly

Other useful journals.

BMC Bioinformatics

Nucleic Acids Research

Molecular Biology and Evolution

Molecular Ecology Resources


Fewer software papers, but still some

Textbooks

£76.50

£40.99

£42.99

£50.99

£32.50

Afternoon session

Introduction to sequence similarity searching (BLAST)

Sequence alignment

Genome Browsers

Introduction to Bioconductor