Computer System: General requirements

chainbirdinhand (Security)

23 Feb 2014





In general, you build a computer system to satisfy a specific need. A
"computer system" is made up of: 1) hardware: servers, PCs, networks,
storage and so on; 2) software: operating systems and applications.







Any system should meet the following requirements:

- Reliability
- Fault tolerant
- Scalable
- Security
- Upgrade and Update


Let's look at each of these terms in some detail:


1. Reliability



Reliability is the trustworthiness of a system to do what it is expected or
designed to do.

Reliability metrics include the following averages:




POFOD (probability of failure on demand): The likelihood that the
system will fail when a user requests service. A biometric authentication
device that fails to correctly identify or reject users an average of once out
of a hundred times has a POFOD of 1%.





ROCOF (rate of occurrence of failures): The number of unexpected events
over a particular time of operation. A firewall that crashes an average of
five times every 1,000 hours has a ROCOF of 5 per 1,000 hours.





AVAIL (availability or uptime): The percentage of time that a system is
available for use, taking into account planned and unplanned downtime. If
a system is down an average of four hours out of 100 hours of operation,
its AVAIL is 96%.
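
The three metrics above reduce to simple ratios. The following is a minimal Python sketch that reproduces the figures from the examples; the function names are illustrative and do not come from any library.

```python
def pofod(failed_requests, total_requests):
    """Probability of failure on demand, as a fraction of all requests."""
    return failed_requests / total_requests


def rocof(failures, hours_of_operation):
    """Rate of occurrence of failures, in failures per hour of operation."""
    return failures / hours_of_operation


def avail(downtime_hours, period_hours):
    """Availability, as a fraction of the measurement period."""
    return (period_hours - downtime_hours) / period_hours


# Figures taken from the examples in the text.
print(f"POFOD: {pofod(1, 100):.0%}")                          # biometric device
print(f"ROCOF: {rocof(5, 1000) * 1000:.0f} per 1,000 hours")  # firewall
print(f"AVAIL: {avail(4, 100):.0%}")                          # 4 h down in 100 h
```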

Availability is usually discussed in terms of uptime and downtime. Uptime is
the time during which a system is working without failure. Downtime, by
contrast, is the time during which a computer is not functioning due to
hardware, operating system or application program failure. In these terms,
the percentage of availability (AVAIL) is:

AVAIL = uptime / (downtime + uptime) * 100

Downtime costs a lot of money. For example, when eBay was down for
approximately three hours, the company had to pay around $12 million in
compensation. The cause was a crashed gateway router.


Note 1: Downtime + uptime = the measurement period.

Note 2: Sometimes the term “zero downtime” is used, meaning no downtime at
all; the "z" in IBM Eserver zSeries comes from this term.




Mean Time To Restore (MTTR): the average time taken to return a failed
component to a functioning state. In the previous firewall example (five
failures), if the first repair took 1 hour and each of the other four took
2 hours, then MTTR = (1 + 4 * 2) / 5 = 1.8 hours.




Mean Time Between Failures (MTBF), sometimes called MTTF (Mean Time To
Failure): an indicator of expected system reliability, calculated on a
statistical basis from the known failure rates of the various components of
the system. MTBF is usually expressed in hours. For a system observed over a
long performance measurement period, MTBF is the measurement period divided
by the number of failures that occurred during that period.


Availability % = MTBF_sys / (MTBF_sys + MTTR_sys) * 100


where MTBF is the Mean Time Between Failures and MTTR is the Mean Time To
Restore (also called Mean Time To Repair), both for the system as a whole.
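
As a worked sketch of these definitions, the Python below applies them to the firewall example: five failures in 1,000 hours, with the first repair taking 1 hour and the other four taking 2 hours each. The function names are illustrative assumptions, and MTBF is computed as the text defines it (measurement period divided by number of failures).

```python
def mttr(repair_hours):
    """Mean time to restore: average of the individual repair times."""
    return sum(repair_hours) / len(repair_hours)


def mtbf(period_hours, failures):
    """Mean time between failures: measurement period / number of failures."""
    return period_hours / failures


def availability_percent(mtbf_hours, mttr_hours):
    """Availability % = MTBF / (MTBF + MTTR) * 100."""
    return mtbf_hours / (mtbf_hours + mttr_hours) * 100


# Firewall example: 5 failures in 1,000 hours; first repair 1 h, the rest 2 h.
repairs = [1, 2, 2, 2, 2]
firewall_mttr = mttr(repairs)              # 1.8 hours
firewall_mtbf = mtbf(1000, len(repairs))   # 200 hours
print(f"MTTR = {firewall_mttr} h, MTBF = {firewall_mtbf} h")
print(f"Availability = {availability_percent(firewall_mtbf, firewall_mttr):.2f}%")
```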






MTBF for serial and parallel systems:



For serial systems:



For parallel systems:



Example: A = 0.99 and B = 0.999. What is the result for the serial system, and
what is it for the parallel system?
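
The formulas for the two cases are not reproduced in the text, so the sketch below assumes the standard availability rules for independent components: a serial system works only if every component works (availabilities multiply), while a parallel system fails only if every component fails (unavailabilities multiply). The function names are illustrative.

```python
from math import prod


def serial_availability(*components):
    """All components must work: A_serial = A1 * A2 * ... * An."""
    return prod(components)


def parallel_availability(*components):
    """At least one must work: A_parallel = 1 - (1 - A1) * ... * (1 - An)."""
    return 1 - prod(1 - a for a in components)


a, b = 0.99, 0.999
print(f"serial:   {serial_availability(a, b):.5f}")
print(f"parallel: {parallel_availability(a, b):.5f}")
```

Under these assumptions the serial arrangement comes out at about 0.98901, lower than either component, and the parallel arrangement at about 0.99999, higher than either component.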





2. Fault tolerant


To improve your system's reliability, you should be ready when a failure
occurs. Possible kinds of failure are:

- Software failures
- Hardware failures
- Human errors
- Disasters


Fault tolerance is the ability to continue operating non-stop when a failure
occurs. A fault-tolerant system is designed from the ground up for reliability
by building multiples of all critical components, such as CPUs, memories,
disks and power supplies, into the same computer. In the event one component
fails, another takes over without skipping a beat.
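
The takeover behaviour can be sketched in a few lines of Python. This is a toy model under simple assumptions, not how any real server implements failover: the class name RedundantGroup and the unit names are hypothetical, and the first healthy unit in the group is treated as the active one.

```python
class RedundantGroup:
    """A set of interchangeable units; the first healthy one is active."""

    def __init__(self, unit_names):
        # Map each unit name to its health status (True = working).
        self.healthy = {name: True for name in unit_names}

    def active(self):
        """Return the unit carrying the load, or None if all have failed."""
        for name, ok in self.healthy.items():
            if ok:
                return name
        return None

    def fail(self, name):
        """Mark a unit as failed; the next healthy unit takes over."""
        self.healthy[name] = False


power = RedundantGroup(["psu-1", "psu-2"])
print(power.active())   # psu-1
power.fail("psu-1")
print(power.active())   # psu-2 has taken over
```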


True fault-tolerant systems with redundant hardware are the most costly because
the additional components add to the overall system cost. However, fault-tolerant
systems provide the same processing capacity after a failure as before, whereas
high availability systems often provide reduced capacity.




Redundancy: multiple redundant, hot-swappable power supplies

There are several levels of fault tolerance, which can be expressed by the following terms:



Hot swap: To pull out a component from a system and plug in a new one
while the main power is still on. Also called "hot plug" and "hot insertion".



In fault-tolerant environments, hot swap is a desired feature of systems built
with redundant drives, circuit boards and power supplies, especially servers
that run 24/7. When a component fails and the redundant unit takes over, the
bad one can be replaced without stopping the operation.




Hot fix: To make a repair during normal operation. It often refers to
marking sectors in poor condition as bad and remapping the data to spare
sectors. Some SCSI drives can automatically move the data in sectors
that are becoming hard to read to spare sectors without the user,
operating system or even the SCSI host adapter being aware of it.
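
The sector remapping described under hot fix can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification (the Disk class, its methods and the sector counts are all hypothetical, and real drives do this in firmware): a failing sector's data is copied to a spare, the old sector is marked bad, and callers keep using the same logical sector number.

```python
class Disk:
    """Toy disk that remaps failing sectors to a pool of spare sectors."""

    def __init__(self, data_sectors, spare_sectors):
        # Physical storage: data sectors first, then the spares.
        self.sectors = [None] * (data_sectors + spare_sectors)
        self.spares = list(range(data_sectors, data_sectors + spare_sectors))
        self.remap = {}   # logical sector -> physical spare sector
        self.bad = set()  # physical sectors taken out of service

    def _physical(self, logical):
        return self.remap.get(logical, logical)

    def write(self, logical, data):
        self.sectors[self._physical(logical)] = data

    def read(self, logical):
        return self.sectors[self._physical(logical)]

    def hot_fix(self, logical):
        """Copy a failing sector to a spare and mark the old one as bad."""
        if not self.spares:
            raise RuntimeError("no spare sectors left")
        old = self._physical(logical)
        new = self.spares.pop(0)
        self.sectors[new] = self.sectors[old]
        self.bad.add(old)
        self.remap[logical] = new


disk = Disk(data_sectors=8, spare_sectors=2)
disk.write(3, "payroll")
disk.hot_fix(3)        # sector 3 is becoming hard to read
print(disk.read(3))    # the caller still sees "payroll"
print(disk.remap)      # {3: 8}
```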


3. Clustering



Clustering means using two or more computer systems that work together. It
generally refers to multiple servers that are linked together in order to
handle variable workloads or to provide continued operation in the event one
fails. Each computer may be a multiprocessor system itself. For example, a
cluster of four computers, each with four CPUs, would provide a total of
16 CPUs processing simultaneously.



A cluster of servers provides fault tolerance and/or load balancing. If one server fails, one or more
additional servers are still available. Load balancing distributes the workload over multiple
systems.
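
Both roles of a cluster can be shown in one small sketch. The Python below assumes round-robin as the load-balancing policy (the text does not name one) and uses hypothetical names throughout: requests are handed to each server in turn, and a server marked as failed is simply skipped.

```python
from itertools import cycle


class Cluster:
    """Round-robin dispatcher that skips servers marked as down."""

    def __init__(self, servers):
        self.up = {name: True for name in servers}
        self._ring = cycle(servers)

    def fail(self, name):
        self.up[name] = False

    def dispatch(self, request):
        """Return the next server that is still up to handle the request."""
        for _ in range(len(self.up)):
            server = next(self._ring)
            if self.up[server]:
                return server
        raise RuntimeError("all servers in the cluster are down")


cluster = Cluster(["node-a", "node-b", "node-c"])
print([cluster.dispatch(r) for r in range(3)])  # each node gets one request
cluster.fail("node-b")
print([cluster.dispatch(r) for r in range(4)])  # node-b is skipped
```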



4. Scalable




Scalable means expandable. Applied to hardware or software, the term has
become a popular buzzword in the IT world. A "highly scalable" device or
application is one that can handle a large increase in users, workload or
transactions without undue strain.


Scalable does not always mean that expansion is free (as adding a user to a
domain is, for example). Extra-cost hardware or software may be required to
achieve maximum scalability. Nevertheless, scalability is a positive feature of
a product sold to fast-growing companies.





5. Security

(Covered later, in chapter 14.)

6. Upgrade and Update

An upgrade usually means replacing an existing item with a newer one; the term
is used for both hardware and software. An update means replacing existing
data or files with the most recent available version. Some vendors speak of an
update when one or a few items are replaced, and of an upgrade when a
wholesale change is made.


The Servers:


1. Server classifications:

There is no general scheme for classifying servers; the classification depends
on the vendor. The main vendors in the market today are IBM, HP, Sun, Dell,
Fujitsu and others.

In general, servers can be classified by the following criteria:


- By model: depends on the brand names of each vendor.

  o IBM: In today's market, IBM has five server brands:

    - IBM Eserver zSeries®
    - IBM Eserver iSeries™
    - IBM Eserver pSeries®
    - IBM Eserver xSeries
    - IBM Eserver BladeCenter™

    Refer to www.ibm.com for more information.





  o HP: In today's market, HP has many server brands:

    - HP Integrity servers
    - HP Integrity NonStop servers
    - HP 9000 servers
    - HP AlphaServer systems
    - HP e3000 servers

    Refer to www.hp.com for more information.


- By processor: Intel-based servers, AMD-based servers and others.

- By architecture: three main architectures:

  o Tower or stand-alone servers:

- By operating system:



Performance:

The speed with which a computer processes data. It is a combination of internal
processing speed, peripheral speeds (I/O) and the efficiency of the operating
system and other system software all working together.




Throughput

1) In computer technology, throughput is the amount of work that a computer can
do in a given time period. Historically, throughput has been a measure of the
comparative effectiveness of large commercial computers that run many
programs concurrently. An early throughput measure was the number of batch
jobs completed in a day. More recent measures assume a more complicated
mixture of work or focus on some particular aspect of computer operation. While
"cost per million instructions per second (MIPS)" provides a basis for comparing
the cost of raw computing over time or by manufacturer, throughput theoretically
tells you how much useful work the MIPS are producing.

Another measure of computer productivity is performance, the speed with which
one or a set of batch programs run with a certain workload or how many
interactive user requests are being handled with what responsiveness. The
amount of time between a single interactive user request being entered and
receiving the application's response is known as response time.

A benchmark can be used to measure throughput.

2) In data transmission, throughput is the amount of data moved successfully
from one place to another in a given time period.
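
Both senses of throughput come down to work completed divided by elapsed time. A minimal Python sketch follows; the job count and transfer size are hypothetical figures chosen only for illustration.

```python
def throughput(units_of_work, elapsed_seconds):
    """Work completed per second over the measurement interval."""
    return units_of_work / elapsed_seconds


SECONDS_PER_DAY = 24 * 60 * 60

# Sense 1: batch jobs completed in a day (hypothetical figure).
jobs_per_day = throughput(1_200, SECONDS_PER_DAY) * SECONDS_PER_DAY

# Sense 2: data moved successfully in a given time (hypothetical transfer).
bytes_per_second = throughput(50_000_000, 40)

print(f"{jobs_per_day:.0f} jobs/day")
print(f"{bytes_per_second / 1_000_000:.2f} MB/s")
```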




Response time


According to the IBM Dictionary of Computing (which cites the International
Organization for Standardization Information Technology Vocabulary as the
source), response time is:

The elapsed time between the end of an inquiry or demand on a computer
system and the beginning of a response; for example, the length of the time
between an indication of the end of an inquiry and the display of the first
character of the response at a user terminal.

There is also the concept of perceived response time, which is the time a user
senses between the beginning of input and the end of the response. It is
actually possible (though not usual) for perceived response time to be too fast
(it can be mildly disconcerting if a system responds almost instantly). However,
this is not the usual complaint.
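
Under the dictionary definition, response time is the difference between two timestamps: the end of the inquiry and the start of the response. The Python sketch below applies that to a few hypothetical timestamps (in seconds); the values and names are made up for illustration.

```python
def response_time(inquiry_end, first_response):
    """Elapsed seconds from the end of an inquiry to the start of the response."""
    return first_response - inquiry_end


# Hypothetical timestamps, in seconds, for three interactive requests:
# (end of inquiry, first character of the response displayed)
requests = [(10.00, 10.25), (42.50, 43.10), (77.00, 77.40)]

times = [response_time(end, first) for end, first in requests]
print(f"mean response time:  {sum(times) / len(times):.2f} s")
print(f"worst response time: {max(times):.2f} s")
```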




MIPS


The number of MIPS (million instructions per second) is a general measure of
computing performance and, by implication, the amount of work a larger
computer can do. For large servers or mainframes, MIPS is a way to measure
the cost of computing: the more MIPS delivered for the money, the better the
value. Historically, the cost of computing measured in the number of MIPS has
been reduced by half on an annual basis for a number of years.

The number of MIPS attributed to a computer is usually determined by one or
more benchmark runs.
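
As a small illustration of how a MIPS rating and the cost per MIPS relate, here is a Python sketch. The instruction count, run time and system price are hypothetical, and the function names are illustrative.

```python
def mips(instructions_executed, elapsed_seconds):
    """Millions of instructions per second over a benchmark run."""
    return instructions_executed / elapsed_seconds / 1_000_000


def cost_per_mips(system_cost, mips_rating):
    """Price paid for each MIPS delivered; lower means better value."""
    return system_cost / mips_rating


# Hypothetical benchmark: 4.5 billion instructions executed in 3 seconds
# on a system that costs $30,000.
rating = mips(4_500_000_000, 3)
print(f"{rating:.0f} MIPS")
print(f"${cost_per_mips(30_000, rating):.2f} per MIPS")
```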


Performance


Performance seems to have two meanings:

1) The speed at which a computer operates, either theoretically (for example,
using a formula for calculating Mtops, millions of theoretical operations per
second) or by counting operations or instructions performed (for example,
MIPS, millions of instructions per second) during a benchmark test. The
benchmark test usually involves some combination of work that attempts to
imitate the kinds of work the computer does during actual use. Sometimes
performance is expressed for each of several different benchmarks.



2) The total effectiveness of a computer system, including throughput,
individual response time, and availability.