NGS Bioinformatics Workshop

1.2 Tutorial

Sequence Formats, Databases and Visualization Tools

March 15th, 2012
BioSci room B9242

Facilitator: Richard Bruskiewich
Adjunct Professor, MBB

Learning Objectives

Linux revisited
Quick dive into the Open-Bio pool (BioPython)
A first look at NGS data:
  NCBI short read archive
  Processing NGS: FASTX tool kit et al.
  Visualization: IGV


Files and Permission

Linux user permissions: owner, group, or others
  Owner/user is the person who created the file
    “OWNS” the file / directory
  Group is a team of people who are associated together
    GROUP project / team work
  Others is just other people on the server

Each file / directory can have its permission set to (r)ead, (w)rite, or e(x)ecute

Do a long listing (ls -l)

dr-x-wxrw-

Separated into four sections:

(d)(r-x)(-wx)(rw-)
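The four-section split of the mode field can be sketched in Python. This is a minimal illustration only: the function name `split_permissions` is hypothetical, and the input is the example string from the slide, not the output of a real `ls -l` call.

```python
def split_permissions(mode_field):
    """Split a 10-character 'ls -l' mode field into its four sections."""
    if len(mode_field) != 10:
        raise ValueError("expected 10 characters, e.g. 'dr-x-wxrw-'")
    return {
        "type": mode_field[0],       # 'd' = directory, '-' = regular file
        "user": mode_field[1:4],     # permissions of the owner
        "group": mode_field[4:7],    # permissions of the group
        "others": mode_field[7:10],  # permissions of everyone else
    }


sections = split_permissions("dr-x-wxrw-")
for name, value in sections.items():
    print(f"{name:>6}: {value}")
```

Each three-character section reads as read, write, execute in that order, with `-` marking a permission that is not granted.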








Examples:

chmod o+x foo.txt
  grant ‘execute’ permission to ‘others’ on foo.txt

chmod g-rw foo.txt
  remove ‘read’ and ‘write’ permission from group

chmod ugo+rwx foo.txt
  grant all rights to everyone

To change the user/group (‘owner’) of a file:

chown ubuntu:ubuntu foo.txt
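The same three permission changes can be reproduced from Python with the standard `os` and `stat` modules. This is a sketch under assumptions: it works on a temporary scratch file standing in for foo.txt, and it first sets a known starting mode (rw-r--r--) so the result of each step is predictable.

```python
import os
import stat
import tempfile


def mode_of(path):
    """Return only the permission bits of a file's mode."""
    return stat.S_IMODE(os.stat(path).st_mode)


# Hypothetical scratch file standing in for foo.txt
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
os.chmod(path, 0o644)  # known starting point: rw-r--r--

# Equivalent of: chmod o+x foo.txt
os.chmod(path, mode_of(path) | stat.S_IXOTH)
after_o_plus_x = mode_of(path)

# Equivalent of: chmod g-rw foo.txt
os.chmod(path, mode_of(path) & ~(stat.S_IRGRP | stat.S_IWGRP))
after_g_minus_rw = mode_of(path)

# Equivalent of: chmod ugo+rwx foo.txt
os.chmod(path, mode_of(path) | stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO)
after_all = mode_of(path)

print(oct(after_o_plus_x), oct(after_g_minus_rw), oct(after_all))
os.remove(path)  # clean up the scratch file
```

The symbolic forms (`o+x`, `g-rw`) add or remove bits relative to the current mode, which is why each step reads the mode before changing it.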



chmod: change file permissions

(d)irectory or file (-)
(u)ser (owner)
(g)roup
(o)thers


A few useful tips…

Hitting “tab” will auto-complete file or program names (or suggest possible names)

Up arrow will let you return to previous commands

Editing of text files: “nano” is an easier alternative to “emacs”, but less powerful
  alternatively, use an SSH client to transfer files to your Windows desktop, edit them in Windows, then transfer back
  BUT: make sure you use a text editor that knows the difference between a Windows and a Linux text file (e.g. Notepad++)
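The difference between a Windows and a Linux text file is the line ending: Windows ends each line with carriage return plus line feed (`\r\n`), Linux with line feed only (`\n`). A minimal Python sketch for detecting and converting the Windows form follows; the function names and the sample script text are hypothetical.

```python
def has_windows_line_endings(data: bytes) -> bool:
    """Report whether the data contains any CR+LF line endings."""
    return b"\r\n" in data


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace every CR+LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical script content as saved by a Windows editor
windows_text = b"#!/bin/bash\r\necho hello\r\n"
unix_text = to_unix_line_endings(windows_text)

print(has_windows_line_endings(windows_text))  # file needs converting
print(has_windows_line_endings(unix_text))     # file is safe for Linux
```

Working on bytes rather than text avoids Python's own newline translation when the data is read from or written to a file.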

Some more useful basic Linux commands

“cd” changes your directory, e.g. ‘cd /usr/local’

“man” displays the manual for a command, e.g. ‘man ls’

“pwd” tells you the directory you are currently in (= working directory)

“history” will list recent commands, enumerated with line numbers. By typing an exclamation point with the line number (e.g. !123), you can redo the command

Accessing remote servers

ssh: Secure Shell

ssh -i private_keypair user@host

scp: Secure CoPy

scp -i private_keypair [user@host:]sourcefile [user@host:]targetfile

Where user is the account (default: local user) and host is the internet name of the computer (default: local host)
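The optional `[user@host:]` prefix can be illustrated with a small Python sketch that only assembles the scp command line and does not run it. The helper names, the account `ubuntu`, the host `example.org` and the file `reads.fastq` are all hypothetical placeholders.

```python
import shlex


def remote_path(filename, user=None, host=None):
    """Format [user@host:]filename; with no host the path is local."""
    if host is None:
        return filename
    prefix = f"{user}@{host}" if user else host
    return f"{prefix}:{filename}"


def scp_command(key_file, source, target):
    """Build the argument list for: scp -i key_file source target."""
    return ["scp", "-i", key_file, source, target]


# Copy a local file to a (hypothetical) remote account
cmd = scp_command(
    "private_keypair",
    remote_path("reads.fastq"),
    remote_path("reads.fastq", user="ubuntu", host="example.org"),
)
print(shlex.join(cmd))
```

Leaving the prefix off one side makes that side a local file, so swapping the two paths turns an upload into a download.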

OpenBio Case Study: BioPython

http://biopython.org/wiki/Biopython

http://biopython.org/DIST/docs/tutorial/Tutorial.html

FIRST LOOK AT NGS DATA


http://www.ncbi.nlm.nih.gov/sra/


http://hannonlab.cshl.edu/fastx_toolkit/


Linux, MacOSX or Unix only

Get the precompiled binary

wget http://hannonlab.cshl.edu/fastx_toolkit/fastx_toolkit_0.0.13_binaries_Linux_2.6_amd64.tar.bz2

bunzip2 fastx_toolkit_0.0.13_binaries_Linux_2.6_amd64.tar.bz2

tar xvf fastx_toolkit_0.0.13_binaries_Linux_2.6_amd64.tar

sudo mv bin/* /usr/local/bin


FASTX tool kit I

FASTQ-to-FASTA converter
  Convert FASTQ files to FASTA files.

FASTQ Information
  Chart Quality Statistics and Nucleotide Distribution

FASTQ/A Collapser
  Collapsing identical sequences in a FASTQ/A file into a single sequence (while maintaining read counts)

FASTQ/A Trimmer
  Shortening reads in a FASTQ or FASTA file (removing barcodes or noise).

FASTQ/A Renamer
  Renames the sequence identifiers in a FASTQ/A file.

FASTQ/A Clipper
  Removing sequencing adapters / linkers
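To make three of these operations concrete, here is a minimal Python sketch of FASTQ-to-FASTA conversion, collapsing and trimming. It is not the FASTX toolkit's implementation; it assumes simple four-line FASTQ records, and the three reads in `FASTQ_TEXT` are made-up example data.

```python
from collections import Counter

# Made-up example data: three reads in four-line FASTQ records
FASTQ_TEXT = """@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTACGT
+
IIII####
@read3
TTGCA
+
IIIII
"""


def parse_fastq(text):
    """Yield (identifier, sequence, quality) from four-line FASTQ records."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header[1:], seq, qual  # drop the leading '@'


def fastq_to_fasta(text):
    """Keep identifier and sequence, drop the quality line."""
    return "".join(f">{name}\n{seq}\n" for name, seq, _ in parse_fastq(text))


def collapse(text):
    """Count identical sequences, most frequent first."""
    return Counter(seq for _, seq, _ in parse_fastq(text)).most_common()


def trim(seq, first, last):
    """Keep bases first..last (1-based, inclusive)."""
    return seq[first - 1:last]


print(fastq_to_fasta(FASTQ_TEXT))
print(collapse(FASTQ_TEXT))
print(trim("ACGTACGT", 3, 6))
```

Collapsing keeps one copy of each distinct sequence together with its count, so read1 and read2 above become a single entry with a count of 2.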

FASTX tool kit II

FASTQ/A Reverse-Complement
  Producing the reverse-complement of each sequence in a FASTQ/FASTA file.

FASTQ/A Barcode splitter
  Splitting a FASTQ/FASTA file containing multiple samples

FASTA Formatter
  Changes the width of sequence lines in a FASTA file

FASTA Nucleotide Changer
  Converts FASTA sequences from/to RNA/DNA

FASTQ Quality Filter
  Filters sequences based on quality

FASTQ Quality Trimmer
  Trims (cuts) sequences based on quality

FASTQ Masker
  Masks nucleotides with 'N' (or other character) based on quality
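The reverse-complement, quality filter and masker operations can be sketched in a few lines of Python. This is an illustration, not the toolkit's code: it assumes Phred+33 quality encoding, and the thresholds (`min_score=20`, `min_fraction=0.8`) are arbitrary example values rather than the toolkit's defaults.

```python
# Translation table for complementing DNA bases (N stays N)
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def phred_scores(quality, offset=33):
    """Decode a quality string, assuming Phred+33 encoding."""
    return [ord(ch) - offset for ch in quality]


def passes_quality_filter(quality, min_score=20, min_fraction=0.8):
    """Keep a read if enough of its bases reach min_score."""
    scores = phred_scores(quality)
    good = sum(1 for s in scores if s >= min_score)
    return good / len(scores) >= min_fraction


def mask_low_quality(seq, quality, min_score=20, mask_char="N"):
    """Replace bases whose score is below min_score with mask_char."""
    return "".join(
        base if score >= min_score else mask_char
        for base, score in zip(seq, phred_scores(quality))
    )


print(reverse_complement("AACGT"))
print(passes_quality_filter("IIIIIIII"), passes_quality_filter("IIII####"))
print(mask_low_quality("ACGTACGT", "IIII####"))
```

Under Phred+33, the character `I` decodes to a score of 40 and `#` to a score of 2, which is why the second read fails the filter and has its last four bases masked.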

www.bioinformatics.bbsrc.ac.uk/projects/download.html

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

Integrative Genomics Viewer

http://www.broadinstitute.org/igv/