slide01 - Institute of Network and Information Systems, Peking University


Computer Networks and Distributed Systems

Department of Computer Science and Technology, Peking University

Wang Jianyong (王建勇)

Email: jwang@net.cs.pku.edu.cn

URL: HTTP://csnetlib.pku.edu.cn/~jwang/course/cnds.html

Why Do We Study Distributed Systems?

A dozen remaining IT problems proposed by James Gray:

- Scalability
- Passing the Turing test
- Speech to text
- Text to speech
- Machine vision
- Personal MEMEX
- World MEMEX
- TelePresence (i.e. VR)
- Trouble-free systems
- Secure systems
- AlwaysUp (highly available systems)
- Automatic programming

Textbook:

G. Coulouris, J. Dollimore, T. Kindberg. Distributed Systems: Concepts and Design. Addison-Wesley, 1994.

References:

Larry L. Peterson and Bruce S. Davie. Computer Networks: A Systems Approach. Morgan Kaufmann, 1996.

Andrew S. Tanenbaum. Distributed Operating Systems. Prentice Hall International, Inc., 1996.

Andrew S. Tanenbaum. Computer Networks. Prentice Hall International, Inc., 1996.

Xueliang Yang. Distributed Computer Systems. Graduate School of USTC, the Chinese Academy of Sciences, P.R. China.

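Grading:

One programming assignment (to be completed and submitted by 11/12), one literature-reading assignment (paper submission deadline: 2000/12/26), and one final exam (60% of the total grade).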



Networks and Distributed Systems: Literature Reading

URL: http://csnetlib.pku.edu.cn/~jwang/course/Assignment2.html

Security and Authentication

1. Jennifer G. Steiner, Clifford Neuman, and Jeffrey I. Schiller. "Kerberos: An Authentication Service for Open Network Systems." Proceedings of the 1988 USENIX Winter Conference, February 1988, Dallas, Texas, Pages 191-202. (MIT)

2. Butler Lampson, Martin Abadi, Michael Burrows, and Edward Wobber. "Authentication in Distributed Systems: Theory and Practice." Proceedings of the 13th Symposium on Operating Systems Principles, October 1991, Pacific Grove, CA, Pages 165-182. (DEC)

3. Victor L. Voydock and Stephen T. Kent. "Security Mechanisms in High-Level Network Protocols." ACM Computing Surveys, 15(2), June 1983, Pages 135-171.

Tentative course outline

- Introduction, basic networking, ISO model, networks & internetworking
- Inter-process communication: BSD sockets, client-server model
- RPC: Sun's RPC etc.
- DOS principles
- Name service: terminology, DFS & DNS
- Distributed file systems: concepts, design and implementation; DFS case studies: NFS, AFS, Coda, COSMOS (or S2FS)

Tentative course outline (continued)

- Distributed shared memory: IVY & Munin
- Coordination in distributed systems: potential causality, clock synchronization, logical time
- Replication: Gossip & Isis
- Transactions: ACID, locks, deadlocks, nested transactions, optimistic concurrency control, timestamp ordering, distributed transactions
- Recovery & fault tolerance
- Security in DS: DES, RSA, digital signatures, Needham-Schroeder model, Kerberos

Components of distributed system

Foundations of distributed system:
- Remote Procedure Calling
- Interprocess Communication
- Networking and Internetworking

The micro-kernel of DOS:
- Processes & threads, Naming & protection
- Communication & invocation, Virtual memory

Services provided by distributed systems:
- name services, distributed file systems
- distributed shared memory, time & coordination
- shared data services (distributed transactions & concurrency control, recovery)
- highly available services (replication & fault tolerance), security

Chapter 1 Introduction to Distributed Systems

- Review of computing history
- Why should we develop distributed systems
- Key characteristics of distributed systems

1.1 Review of computing history

- Physically distributed hardware
- Logically centralized software

1.1.1 The trend of hardware

- 1960s & 1970s: timesharing systems
- 1980s: personal computers & personal workstations
- 1990s: distributed computer systems
- 2000s: mass distributed systems
  >> e.g., Web OS, Cluster OS

1.1.2 The need for logically centralized software

User's requirement:
- a system built out of large numbers of powerful PCs or workstations
- but which act together in a coherent way
  >> that is as easy to use & understand as an old-fashioned timesharing system.

Providing this is the role of a new generation of operating system, the distributed operating system (DOS).

Review of computing history

Figure 1.2 Milestones in Distributed Systems

1.2 Why should we develop distributed systems

1.2.1 The most important reason is that applications are both the starting point and the end result of the development of distributed systems.

1.2.2 Many computer applications occur in a distributed or decentralized environment.

- Sharing expensive resources
- Exchanging data between systems

1.2.3 Proliferation of low-cost, high-performance PCs and workstations

1.2.4 The interface between users and computers is friendlier

1.2.5 LAN & Internet applications stimulate DOS's development

- It's the software, not the hardware, that determines whether a system is distributed or not

1.2.6 Examples of distributed systems and applications

1. Distributed UNIX:
- Berkeley BSD UNIX + NFS + NIS
- Amoeba, Mach, Chorus

2. Commercial applications:
- airline seat reservation and ticketing
- automatic teller machines
- Reliability, security

3. Wide area network applications:
- Internet, ARPAnet: 100 => 1 million
- Internet information services, such as Email, Web, WWW search engines, BBS, E-commerce, digital libraries

4. Cluster systems:
- IBM SP2
- Berkeley's NOW
- NCIC's Dawning superserver

5. Meta computing:
- idle computers are ubiquitous

6. Multimedia information access and conferencing applications:
- continuous media services, such as VOD servers, video phone and video conference; their main requirement is quality of service
- ATM, real-time OS, continuous media servers
1.3 Key characteristics of distributed systems

What is a distributed system?

Definition 1: A distributed system is one in which there exists a multiplicity of interconnected processing resources able to cooperate under system-wide control on a single problem with minimal reliance on centralized procedures, data or hardware.
- Formulated by the organizing committee for the 1st Conf. on DCS

Definition 2: A distributed system consists of a collection of autonomous computers linked by a computer network and equipped with distributed system software.
- From our textbook

1.3.1 Resource Decentralization and sharing

- Some or all of the computing resources should be decentralized in function as well as distance
- and this is a prerequisite for distinguishing distributed systems from other types of systems, such as time-sharing systems.
- Some resources are very expensive, and data sharing is an essential requirement in many computer applications

1.3.2 Cooperative Autonomy

- Cooperative autonomy, especially control autonomy, increases the overall reliability and availability of the system

1.3.3 Concurrency (i.e. work parallelism)

- Concurrent vs parallel
  >> MIMD (Multiple Instruction & Multiple Data stream) vs the concurrency of TSS (time-sharing systems)
- Two reasons:
  >> Many users simultaneously invoke commands or interact with application programs;
  >> Many server processes run concurrently, each responding to different requests from client processes.

1.3.4 System transparency

- it looks to its users like a centralized single computer system
- but runs on multiple independent machines, i.e. Single System Image.

ISO definition: access transparency, location transparency, concurrency transparency, replication transparency, failure transparency, migration transparency, performance transparency, scaling transparency

1.3.5 Fault tolerance

- Two approaches to the design of fault-tolerant computer systems:
  >> hardware redundancy: the use of redundant components;
  >> software recovery: the design of programs to recover from faults.
- Availability of a system is a measure of the proportion of time that it is available for use.

1.3.6 Scalability

- Scalable techniques:
  >> Re-configurable, removing performance bottlenecks {serverless, replicated data and services, caching}
  >> e.g., NFS falls short on scalability.

1.3.7 Openness

- the characteristic that determines whether the system can be extended in various ways.
- e.g., UNIX
  >> Open systems are characterized by the fact that their key interfaces are published;
  >> Open distributed systems are based on the provision of a uniform inter-process communication mechanism and published interfaces for access to shared resources;
  >> Open distributed systems can be constructed from heterogeneous hardware and software, possibly from different vendors.
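The availability measure from 1.3.5 (the proportion of time a system is available for use) and the hardware-redundancy approach can be made concrete with a small calculation. The sketch below is only illustrative: the function names and the numbers are made up, and the formula for replicated components assumes that replicas fail independently, which the slides do not state.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Proportion of the observed time during which the system was usable."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("observation period must be positive")
    return uptime_hours / total


def replicated_availability(single: float, replicas: int) -> float:
    """Availability of a service that is up while at least one replica is up.

    Assumption (not from the slides): replicas fail independently, so the
    service is down only when every replica is down at the same time.
    """
    if not 0.0 <= single <= 1.0 or replicas < 1:
        raise ValueError("need 0 <= single <= 1 and replicas >= 1")
    return 1.0 - (1.0 - single) ** replicas


if __name__ == "__main__":
    one = availability(uptime_hours=990.0, downtime_hours=10.0)
    print(f"one server:     {one:.4f}")
    print(f"three replicas: {replicated_availability(one, 3):.6f}")
```

Under the independence assumption, adding redundant components raises availability quickly, which is one reason replication appears among the scalable and fault-tolerant techniques listed above.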

To summarize: key characteristics of distributed system

- Resource sharing
- Openness
- Concurrency
- Fault tolerance
- Transparency
- Scalability

Chapter 2 Design Goals & Issues

- Introduction
- Basic technical issues
- Users' requirements
- Summary

Key design goals

- Scalability
- Reliability
- Performance
- Security
- Consistency

2.1 Basic design issues

Naming:
- global meaning & scalability

Communication:
- how to optimize the implementation of communication in a distributed system
- while retaining a high-level programming model for its use

Software structure:
- how to structure a system so that new services can be introduced
  >> that will interwork fully with existing services
  >> without duplicating existing service elements

Workload allocation:
- how to deploy the processing and communication resources in a network to optimum effect in the processing of a changing workload

Consistency maintenance:
- maintenance of consistency at reasonable cost

2.1.1 Naming

Name vs identifier

A resolved name is an identifier together with other attributes
- internet communication: IP + port number
- UNIX file system: index node number
- Mach communication system: port number

Naming design considerations
- choose an appropriate name space
- use a name service to resolve names to communication identifiers
- scalability considerations

Name contexts are represented by tables or databases
- file system: /etc/a.out vs /usr/a.out
- internet: www.cs.pku.edu.cn vs www.cs.tsinghua.edu.cn

Names may be structured or flat, readable or unreadable, location-independent or containing location clues

Naming schemes can incorporate security mechanisms
- file systems' directories
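The idea that name contexts are tables, and that resolving a structured name yields an identifier plus other attributes, can be sketched in a few lines. This is a toy illustration, not how DNS or any real name service is implemented: the class and function names are invented here, and the addresses are reserved documentation addresses, not the real hosts.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class ResolvedName:
    """An identifier together with other attributes."""
    identifier: str                      # e.g. an IP address
    attributes: Dict[str, int] = field(default_factory=dict)


@dataclass
class NameContext:
    """A name context, represented as a table of bindings."""
    table: Dict[str, Union["NameContext", ResolvedName]] = field(default_factory=dict)


def bind(root: NameContext, name: str, target: ResolvedName) -> None:
    """Bind a structured name such as 'www.cs.pku.edu.cn' under root."""
    components = name.split(".")[::-1]   # DNS-style names resolve right to left
    context = root
    for component in components[:-1]:
        child = context.table.setdefault(component, NameContext())
        if not isinstance(child, NameContext):
            raise ValueError(f"{component!r} is already bound to an object")
        context = child
    context.table[components[-1]] = target


def resolve(root: NameContext, name: str) -> ResolvedName:
    """Walk one context per name component until an object is reached."""
    node: Union[NameContext, ResolvedName] = root
    for component in name.split(".")[::-1]:
        if not isinstance(node, NameContext):
            raise KeyError(name)
        node = node.table[component]
    if not isinstance(node, ResolvedName):
        raise KeyError(f"{name} names a context, not an object")
    return node


root = NameContext()
# Hypothetical bindings: the addresses and port are made-up example values.
bind(root, "www.cs.pku.edu.cn", ResolvedName("192.0.2.10", {"port": 80}))
bind(root, "www.cs.tsinghua.edu.cn", ResolvedName("198.51.100.20", {"port": 80}))

if __name__ == "__main__":
    print(resolve(root, "www.cs.pku.edu.cn"))
    print(resolve(root, "www.cs.tsinghua.edu.cn"))
```

The same component, www, resolves to different objects because it is looked up in different contexts, which is the point of the /etc/a.out vs /usr/a.out example as well.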

2.1.2 Communication

Communication between a pair of processes involves:
- transfer of data & synchronization activity

Communication primitives: send & receive may be:
- synchronous (i.e. blocking) or asynchronous (i.e. non-blocking)

Two communication patterns:
- client-server model between pairs of processes
- group multicast model between groups of cooperating processes

2.1.2.1 Client-server Communication

It is oriented towards service provision, and an exchange consists of:
- transmission of a request from a client process to a server process;
- execution of the request by the server;
- transmission of a reply to the client.

It can be implemented in terms of message-passing operations (send & receive)
- but is commonly presented at the language level as RPC

Dynamic binding in the client-server model
- example: DNS name server

Function shipping in the client-server model
- example: PostScript with laser printers
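The three-step exchange described for client-server communication (request, execution, reply) can be sketched with blocking send and receive operations. The example below is a minimal illustration under stated assumptions: it uses a connected socket pair inside one process with a server thread instead of a real network, the "add" request format is invented for the sketch, and a single `recv` is assumed to return the whole small message (a real protocol would need message framing).

```python
import socket
import threading


def serve_one_request(conn: socket.socket) -> None:
    """Server side: receive a request, execute it, send a reply."""
    with conn:
        request = conn.recv(1024).decode()      # blocking receive
        parts = request.split()
        if len(parts) == 3 and parts[0] == "add":
            reply = str(int(parts[1]) + int(parts[2]))
        else:
            reply = "error: unknown request"
        conn.sendall(reply.encode())            # transmit the reply


def call(conn: socket.socket, request: str) -> str:
    """Client side: send the request, then block until the reply arrives."""
    conn.sendall(request.encode())
    return conn.recv(1024).decode()


def demo(request: str = "add 2 3") -> str:
    # socketpair() gives two already-connected endpoints; it stands in for
    # a client connecting to a server over a network.
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one_request, args=(server_end,))
    server.start()
    with client_end:
        reply = call(client_end, request)
    server.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

The `call` function is the part an RPC system would hide behind an ordinary-looking procedure call: the caller sees arguments going in and a result coming back, not the send and receive underneath.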

2.1.2.2 Group multicast

- sending a message to the members of a specified group of processes is known as multicasting

Motivation of group multicasting
- Locating an object
- Fault tolerance
- Multiple update
  >> e.g., maintaining cache coherence under a write-update mechanism
  >> e.g., time synchronization, RAID
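The "multiple update" motivation can be illustrated with a write-update scheme: every write is multicast to all members of a group so that each cached copy is updated. The sketch below is an in-process stand-in with invented class names; it uses plain method calls instead of network messages and ignores ordering, message loss and member failure, all of which a real multicast protocol has to handle.

```python
from typing import Dict, List


class CacheReplica:
    """One group member holding a cached copy of shared data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}

    def deliver(self, key: str, value: str) -> None:
        # Write-update: apply the new value instead of invalidating the entry.
        self.data[key] = value


class Group:
    """A specified group of processes that messages can be multicast to."""

    def __init__(self) -> None:
        self.members: List[CacheReplica] = []

    def join(self, member: CacheReplica) -> None:
        self.members.append(member)

    def multicast(self, key: str, value: str) -> int:
        """Send the same update to every member; return how many received it."""
        for member in self.members:
            member.deliver(key, value)
        return len(self.members)


if __name__ == "__main__":
    group = Group()
    replicas = [CacheReplica(f"cache-{i}") for i in range(3)]
    for replica in replicas:
        group.join(replica)
    group.multicast("x", "1")
    group.multicast("x", "2")
    print([replica.data for replica in replicas])
```

The sender addresses the group rather than individual members, so replicas can join without the writer's code changing.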

2.1.3 Software Structure

Components of DOS
- operating system kernel services
  >> extending a conventional Unix kernel, like BSD Unix
  >> microkernels, like Mach, Amoeba and Chorus
- open services
  >> DFS
  >> DSM
  >> other services, like electronic mail delivery service
- support for distributed programming
  >> RPC
  >> MPI or PVM

2.1.4 Workload allocation

Two main workload allocation models
- the processor pool model
- the use of idle workstations

2.1.4.1 The processor pool model

Figure 2.5 The processor pool model

- examples: Amoeba, Plan 9, Cambridge Distributed Computing System, Dawning 2000 super server

2.1.4.2 Use of idle workstations

- use of idle or under-utilized workstations as a fluctuating pool of extra computers
- examples: Sprite, LSF

2.1.4.3 Shared-memory multiprocessors

- also called Symmetric shared-memory Multi-Processor (or SMP)

2.1.5 Consistency maintenance

Update consistency
- there are likely to be many users accessing shared data;
- the operation of the system itself depends on the consistency of certain databases

Replication consistency

Cache coherency
- hypothesis of locality

Failure consistency

Clock consistency

User interface consistency

2.2 User requirements

Functionality
- what the system should do for users

Reconfigurability
- the need for a system to accommodate changes without causing disruption to existing service provision

Quality of service
- embracing issues of performance, reliability and security

2.2.1 Functionality

Key benefits of a distributed computer system:
- economy & convenience from resource sharing;
- potential improvement in performance & reliability from distributed resources.

Enhancements to the services provided by centralized computers:
- sharing across a network can bring access to a richer variety of resources than could be provided by any single computer;
- the advantages of distribution can be exploited so that explicit sharing, fault-tolerant or parallel applications can be programmed.

Three options when considering a migration from centralized computing to distributed computing:
- adapt existing operating systems for networking
  >> example: BSD Unix + NFS
- move to an entirely new operating system designed specifically for distributed systems
- emulation: move to a new DOS that can emulate one or more existing OSs.
  >> examples: Mach & Chorus

2.2.2 Reconfigurability

Requirements of a reconfigurable distributed system:
- the changes due to the scalability of a distributed system design and its ability to accommodate heterogeneity
- a failed process, computer or network component is replaced by another working counterpart;
- computational load is shifted from over-loaded to less-loaded machines, so as to increase the total throughput of the distributed system;
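The last requirement, shifting load from over-loaded to less-loaded machines, can be sketched as a simple rebalancing loop. This is a toy policy chosen for illustration, not the scheme used by Sprite, LSF or any system named in these slides: machines and jobs are just strings, every job is assumed to cost the same, and migration is assumed to be free.

```python
from typing import Dict, List


def rebalance(loads: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Move jobs from the busiest to the idlest machine until the job
    counts of any two machines differ by at most one.

    Assumptions (not from the slides): all jobs cost the same and can be
    migrated at no cost. The input mapping is left unmodified.
    """
    result = {machine: list(jobs) for machine, jobs in loads.items()}
    if not result:
        return result
    while True:
        busiest = max(result, key=lambda m: len(result[m]))
        idlest = min(result, key=lambda m: len(result[m]))
        if len(result[busiest]) - len(result[idlest]) <= 1:
            return result
        # Shift one job from the over-loaded to the less-loaded machine.
        result[idlest].append(result[busiest].pop())


if __name__ == "__main__":
    before = {"ws1": ["j1", "j2", "j3", "j4", "j5"], "ws2": [], "ws3": ["j6"]}
    after = rebalance(before)
    for machine, jobs in after.items():
        print(machine, jobs)
```

A real scheduler would also weigh migration cost and job size; the loop only shows the direction of the shift the slide describes.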

2.2.3 Quality of service

Performance: in terms of the response times experienced by its users
- Optimizing the performance of all of the software components that are involved:
  >> the OS's communication services
  >> distributed programming support (e.g., RPC)
  >> and the software that implements the service.

Security: there are two main threats
- against the privacy and integrity of users' data as it travels over the network
- the openness of distributed systems to interference with system software:
  >> not all machines on a network can in general be made physically secure

Reliability and availability:
- a fault-tolerant system is one
  >> which can detect a fault
  >> and either fail gracefully (that is, predictably)
  >> or mask the fault so that no failure is perceived by users of the system.
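The description of a fault-tolerant system (detect a fault, then either fail predictably or mask it) maps onto a small failover pattern. The sketch below is one possible illustration, not a method taken from the course: replicas are modeled as plain Python callables, a fault is modeled as a `ReplicaDown` exception, and all names are invented for the example.

```python
from typing import Callable, List, Sequence


class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request (a detected fault)."""


class ServiceUnavailable(Exception):
    """Raised when no replica could serve the request (predictable failure)."""


def masked_call(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in turn, masking faults from the caller when possible."""
    faults: List[str] = []
    for replica in replicas:
        try:
            return replica(request)          # success: the user sees no failure
        except ReplicaDown as fault:
            faults.append(str(fault))        # fault detected, try the next one
    # Nothing left to mask the fault with: fail gracefully, i.e. predictably.
    raise ServiceUnavailable("; ".join(faults) or "no replicas configured")


def crashed(request: str) -> str:
    raise ReplicaDown("replica A is down")


def healthy(request: str) -> str:
    return request.upper()


if __name__ == "__main__":
    print(masked_call([crashed, healthy], "ping"))
```

When every replica is down, the caller gets a single well-defined exception, which is the "fail gracefully" branch of the definition.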

In the next class we'll discuss:

Chapter 3 Networking & Internetworking

- Network technologies
- Protocols
- Technology case studies: Ethernet, Token Ring and ATM
- Protocol case studies: Internet protocols and FLIP

Thanks for your attention!