
Bonus Material B


File and Operating Systems




The two main software components of any computer platform are its operating system and its file system. In the beginning, computer systems had neither. Each entry into the processor was a program or a routine that told every element of the computer what to do, each time the program wanted that element to do it. Given that there were no graphical user interfaces, no keyboards, and not much more than a few toggle switches, the need for an operating system (OS) or a file system (FS) was not even apparent.

Today that has all changed. The OS is now the backbone of a computer, managing all of the software and hardware resources. Operating systems are responsible for everything from the control and allocation of memory to recognizing input from external devices and transmitting output to others. The OS also manages the files on physical memory, control systems, and peripherals such as printers and scanners.

We know the file system (sometimes written as one word, "filesystem") mostly by the way we name files or by the way they are placed logically so we can retrieve them from disks, USB sticks, or from a network. File systems form the intermediate layer between block-oriented disk drives and the applications that use the data residing on those drives.

KEY CHAPTER POINTS

Operating system and file system types that include conventional, time-sharing, real-time, embedded OSs, and mobile

Discussion of services, managers, and the layers of an operating system

File system categories and types including local, distributed, clustered, and those familiar to most PC users, as well as those not so familiar

Flash, journaling, and traditional file systems and their applications to mobile and media operations and users

Operating Systems

Most of us know of operating systems by their commercial names, e.g., Mac OS or Windows 7. Fundamentally, an operating system (OS) is a collection of software routines designed to run other programs on a computer. The software of an operating system controls, organizes, and accounts for the hardware blocks and turns those elements into a cohesive entity with powerful functionality.

The tasks that an OS must be capable of handling include processor management, memory and storage management, and device management. The OS provides a common application programming interface (API) and a common user interface (UI). Operating systems, based upon the kind of applications they support, can be described in four overall dimensions.



Single user/single task - managing the computer's tasks for a single user who is doing only one thing at a time

Single user/multitask - the more commonplace computer in the home, workplace, or laptop, such as a fully fledged "Windows-like" OS

Multiuser - a balanced system that can handle many users, and in turn many tasks, simultaneously, such as the Unix OS

Real-time operating system (RTOS) - tasks that must be handled (executed) with precise timing, every time

OS Infancy

The first-generation electronic digital computing systems of the 1940s had no operating systems. In that era, programs were entered one bit at a time on rows of mechanical switches physically mounted on "plug boards." No assembly languages or programming languages were known, and operating systems were unheard of.

By the early 1950s, the earliest implementation of any form of an operating system emerged. The second generation of computers, those whose only I/O was on punched cards, began to see the introduction of a crude operating system. General Motors Research Laboratories implemented a first-generation operating system in the early 1950s for the IBM 701, the first production computer from IBM, initially labeled the Defense Calculator. Systems of this generation generally processed one job at a time, called a single-stream batch processing system, because programs and data were submitted in groups or batches.

Third-generation computing systems of the 1960s took advantage of the computer's resources and processed (ran) several jobs simultaneously. Still referred to as batch processing systems, the designers needed to develop a concept for multiprogramming whereby several jobs could be held in main memory, and the processor would then switch from job to job as needed in order to advance multiple jobs, keeping the peripheral devices active. The operating system kept the CPU from sitting idle as each I/O process was in progress. This was accomplished by partitioning the host memory systems so that each job could process in a separate partition. The result was that wait cycles were reduced, and processing could continue as I/O processes were executed.

These third-generation computing systems employed an operating system that allowed for new features such as spooling (simultaneous peripheral operations online). Spooling is essentially a buffering technique that matches peripheral I/O operations to the computer memory and processing functions. In this operation, a high-speed device such as a disk drive is interposed between a running program and a low-speed device involved with the program during an input/output operation. Instead of the program writing directly to a printer, the output data is written to a disk drive, where it is spooled out to the printer when and as the printer is available to handle that data stream. Programs can be run to completion faster, and other programs can be initiated sooner. When the printer becomes available, the output data may be sent to that device and then printed.
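The spooling technique described above can be sketched in a few lines of Python. This is an illustrative toy, not a real OS spooler: the "spool" is a queue standing in for the high-speed disk buffer, and the worker thread plays the slow printer.

```python
import queue
import threading
import time

spool = queue.Queue()   # stands in for the high-speed disk buffer
printed = []            # what the slow printer has actually produced

def program(job_lines):
    """The running program: finishes as soon as its output is spooled."""
    for line in job_lines:
        spool.put(line)  # fast write to the spool, not to the printer
    spool.put(None)      # sentinel: no more output

def printer_worker():
    """The low-speed device: drains the spool at its own pace."""
    while True:
        line = spool.get()
        if line is None:
            break
        time.sleep(0.01)        # simulate slow, mechanical printing
        printed.append(line)

t = threading.Thread(target=printer_worker)
t.start()
program(["page 1", "page 2", "page 3"])  # returns almost instantly
t.join()                                 # printing completes later
```

The key point the sketch shows is the decoupling: `program()` runs to completion at memory speed, while the printer consumes the spooled output asynchronously.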

This generation of computing systems also employed a variant of the multiprogramming technique called time-sharing, in which each user has a directly connected (online) terminal console. In this new mode, which differs from the batching techniques, users are interacting with the computer in near real time. This operational mode requires that the computer system respond quickly to user requests to avoid a decrease in productivity. Time-sharing systems were developed to support the multiprogram mode whereby large numbers of simultaneous interactive users utilized the central processing units concurrently.

Operating systems entered the mainstream world of fourth-generation computer systems with the introduction of the microprocessor (MPU), large-scale integrated (LSI) circuits, and complex peripheral chip sets that spawned the evolution of the personal computer and the workstation age. The most noted of the modern operating systems are MS-DOS (Microsoft Disk Operating System), the one that dominated the personal computer scene, written by Microsoft for the IBM PC and those machines employing the Intel 80xx CPU and its successors, and UNIX, originally developed in 1969 by a group of AT&T employees at Bell Labs, which became most prevalent on larger personal computers, servers, and workstations using the Motorola 68000 CPU family.

A Time-Sharing Operating System

One of the earliest operating systems, Multics (Multiplexed Information and Computing Service), became an influential guide to what would be referred to as a time-sharing operating system. The project, started in 1964, was originally a cooperative project led by MIT (by Fernando Corbató) along with General Electric and Bell Labs. Bell Labs dropped out of the project in 1969, and in 1970 GE's computer business, including Multics, was taken over by Honeywell.

Multics implemented a single-level store for data access. This concept discarded the clear distinction between files (called segments in Multics) and process memory. The memory of a process consisted solely of segments (files), which were mapped into its address space. To read or write to them, the process simply used normal CPU instructions, and the operating system then managed the services to be sure that all the modifications were saved to disk.
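The single-level-store idea survives today in memory-mapped files. The sketch below uses Python's `mmap` to show the essential behavior: the process writes to the "segment" with ordinary memory operations, and the operating system persists the changes to disk. The file name is arbitrary, chosen for illustration.

```python
import mmap
import os
import tempfile

# Create a 4 KiB "segment" on disk.
path = os.path.join(tempfile.mkdtemp(), "segment.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# Map the file into our address space and modify it with a plain
# memory write; there is no explicit write() call to the file.
with open(path, "r+b") as f:
    seg = mmap.mmap(f.fileno(), 4096)
    seg[0:5] = b"hello"   # ordinary CPU store, as in Multics segments
    seg.flush()           # the OS ensures the change reaches the disk
    seg.close()

# Reading the file back shows the memory write was persisted.
with open(path, "rb") as f:
    data = f.read(5)
```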

Familiar Types

Besides being a program that controls the execution of application programs, the OS acts as the interface between the computer hardware, including storage, memory, and peripherals, and the user. The user may be a human, another computer, or a device that depends on the functions executed by a computer-like device.

The more notable operating systems are provided by those software vendors that have influenced and guided the development of the modern computer society. Recognized computer operating systems include the following:



UNIX (the OS whose source code and product implementation went to SCO and whose UNIX trademark is owned by The Open Group)

Mach (the operating system kernel developed at Carnegie Mellon University, which later served as a foundation for Mac OS X)

MS-DOS, MS-Windows, and Windows/NT (Microsoft Corporation's various OS platforms)

OS/2 and Warp (IBM's endeavor for the PS/2 era)

Mac OS (Apple's "system software")

OpenVMS (for VAX and AXP-Alpha CPUs)

FreeVMS (Virtual Memory System)

MVS (Multiple Virtual Storage)

VM (Virtual Machine), as in the software implementation of a machine that executes programs as though it were a physical machine

Needless to say, the term operating system, or simply OS, is associated with many different sets of software, hardware, and even virtual machines. The OS, by itself, can be confusing even though it essentially remains in the background, quietly performing the tasks that most users want to know nothing about.

Control

Controlling the computer involves software at several levels. In the same fashion, there are components of the operating system that must be differentiated, even though most of them are software based in nature. Of those elements that control the computer, it is the OS that is made up of kernel services, library services, and application-level services.

The core of the operating system is its kernel. A kernel is a control program that functions in a privileged state, an execution context that allows all the hardware instructions to be executed. The kernel reacts and responds to interrupts from external devices and to service requests and traps from processes. The operating system uses processes to run applications, which are linked together with libraries to perform standard services. It is the kernel that supports the processes by providing a path to the peripheral devices.

In most cases, the kernel is a permanent resident of the computer; it creates and terminates processes and then responds to their requests for service. Operating systems are the resource managers whose main resource is the computer hardware in the form of processors, storage, input/output devices, communication devices, and data.

OS Functions

The operating system enables and manages many functions which are essential to the successful "operations" of a computer system. These functions include the user interface, the sharing of hardware among users, the sharing of data among the systems in the computer, and the prevention of interference among those systems.

The OS further facilitates the I/O functions, the recovery from errors, and the accounting of resources and usage. Its tasks facilitate parallel operations and prioritization, the organization and securing of the software and data, and the handling of network communications internally (on the buses of the computer) and externally over public or private connections.

At a deeper level, the OS is responsible for the allocation of each resource. This responsibility extends to memory management of the hardware and of the processor. The OS determines when an I/O device can be used by the applications or programs running on the computer. It controls access to and the use of the files, although a great deal of that is managed independently by the "file system" manager.

At a critical level, the OS determines just how much processor time is devoted to the execution of programs, services, and other functions which occur internally and peripherally to the core processor or processors. In a failover state, it is the OS (in conjunction with the records kept by the file system) that picks up all the pieces and quickly resumes its business.

Resource Manager

The OS is a resource manager that handles the movement, storage, and processing of data, including the control of those functions. The OS functions much like ordinary software, except that its program(s) are executed by the processors themselves. An operating system will frequently relinquish its control, at which time it then depends upon the processor to recover that control.

The operating system's job is to direct the processor in the use of the system resources. In order for the processor to accomplish its tasks, it must temporarily cease the execution of the operating system code and execute other programs. How this situation is managed is dependent upon the functionality of the core processor and the relationship that it has at each level of program execution.

Operational Layers

File systems and volume managers are generic applications; that is, their services and the performance of those services are not necessarily optimized for any specific application. However, in media-specific storage systems, both may be tailored to the kinds of specialized applications that typically move larger, contiguous files with specific characteristics.

When applied to a more generic set of applications, such as databases or transactional data that is continually in flux, the volume manager and file systems must be configured for a variety of services and data sets. The performance of the file system and the volume manager are characteristic parameters from which a user may choose one storage platform over another.



Volume Manager

Within the operating system is an intermediate layer located between the file system and the standard small computer systems interface (SCSI) device drivers. It is the system whose basic function is to aggregate multiple hard disk drives (HDDs) to form a large virtual disk drive that is visible only to the higher levels of the operating system.



A file system, which is discussed in detail in the second part of this chapter, in the traditional sense resides on a single device. When control of more than one device is necessary, a volume manager is employed. A volume manager can provide the same function as that of an intelligent disk subsystem or a RAID controller.
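The aggregation a volume manager performs can be reduced to a simple address translation. The toy class below (hypothetical, for illustration only) concatenates several small "disks" into one large virtual disk and maps a virtual block number to a (disk, local block) pair.

```python
class Volume:
    """A concatenating volume: several drives appear as one big disk."""

    def __init__(self, disk_sizes):
        self.disk_sizes = disk_sizes   # blocks on each physical disk

    def locate(self, vblock):
        """Translate a virtual block number into (disk index, local block)."""
        for disk, size in enumerate(self.disk_sizes):
            if vblock < size:
                return disk, vblock
            vblock -= size             # fall through to the next disk
        raise IndexError("virtual block beyond end of volume")

# Three drives of 100, 200, and 50 blocks look like one 350-block disk.
vol = Volume([100, 200, 50])
```

Higher layers of the OS see only the 350-block virtual disk; the translation to a physical drive happens below the file system, exactly where the text places the volume manager.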

Embedded Operating Systems

An embedded system is an application-oriented special computer system. These systems are most often found in devices that are intrinsically designed to focus on a set of specific tasks, such as the control and instrumentation in an automobile, a GPS tracker, a mobile phone, or consumer electronics. Embedded systems must be scalable on both the software and hardware sides. As the applications that run on embedded systems get more complex, the need for an operating system and the support for its development become critical.

Linux Embedded

In recent times, Linux has become quite popular on embedded systems. The modular nature of Linux and its fully functional OS allow it to be slimmed down through the removal of unnecessary components, utilities, tools, and other software system services.

Windows CE and Windows Mobile

Another, more formidable, embedded OS is Microsoft Windows CE, which was first introduced in the Handheld PC (H/PC) set of products in November 1996. Currently at Version 6, the developers at Microsoft are headstrong in completing the next iteration, which is still scheduled for Q1-2011 but has had several delays over the course of nearly 3 years of waiting. The OS is officially known as Windows Embedded Compact (or Windows Embedded CE post Version 6.0).

Windows Mobile is yet another operating system developed for smart phones and other mobile devices. And now there is Microsoft's latest endeavor, Windows Phone 7, which is based on the Windows Embedded Compact 7 core, although it is not clear what the underlying OS will be. Compact 7 brings Silverlight 3 for Windows Embedded, an embedded Internet Explorer, and Adobe Flash 10.1 support, and has HD and MPEG-4 streaming support. A real-time operating system for "connected," battery-powered and handheld devices is included in the tool set.

Real-Time Operating Systems

Functions or events that must be processed in an isochronous (time-dependent) or instantaneous fashion are considered to occur in "real time." The real-time operating system (RTOS) enables computing where system correctness depends not only on the correctness of the logical result but also on the time of its delivery. Video and audio are the best examples of where the RTOS fits. Imagine that video to the home was randomly interrupted, lots of frames were skipped, or the audio was just placed wherever it fell regardless of what the image was. Without a real-time OS, this is what users in the digital world would experience.

The RTOS should have features to support these critical requirements and must have predictable behavior in response to unpredictable external events. A true RTOS will be deterministic under any condition.

Hard and Soft RTOS

A real-time operating system that cannot afford to miss a deadline without catastrophic consequences is called a hard real-time system. Systems that are allowed to miss deadlines and still recover are referred to as soft real-time systems. In actuality, not every computer system requires an operating system. A simple timer that uses a motor to advance a hub that activates a switch which opens or closes an electric circuit does not need an operating system. However, a timer that adjusts for daylight saving time, that has a digital display, that knows the seasonal timing of sunrise and sunset, and that can be remotely called into from a cell phone or a computer does need an operating system, and it might need to be a real-time OS if it is coupled to life safety or mission critical applications.
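The hard/soft distinction above can be illustrated with a small sketch (not a real scheduler): a hard task treats a missed deadline as a system failure, while a soft task merely records the miss and carries on. The function name and parameters are invented for this example.

```python
import time

def run_task(work, deadline_s, hard):
    """Run `work`, then check whether it met its deadline.

    Returns (result, missed). A hard task raises on a miss,
    modeling the 'catastrophic consequences' of a hard RTOS.
    """
    start = time.monotonic()
    result = work()
    elapsed = time.monotonic() - start
    missed = elapsed > deadline_s
    if missed and hard:
        raise RuntimeError("hard deadline missed")
    return result, missed

# A trivial task that finishes well inside a generous deadline.
result, missed = run_task(lambda: 2 + 2, deadline_s=1.0, hard=True)
```

A real RTOS enforces deadlines preemptively in the scheduler rather than checking after the fact; the sketch only shows the policy difference between the two classes of system.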

Adaptability

A well-designed RTOS will allow real-time applications to be easily designed and expanded upon. Functionality should be added without a major set of changes to the software. The RTOS should allow for the splitting of application code into separate tasks that can be modularized without recompiling lengthy strings of code sets. A preemptive RTOS will handle all critical events as quickly and efficiently as possible.

OS Services

The operating system needs to be capable of providing a mix of services to the user, the
applications, and the computer itself. These fundamental services need to include at least the
following:



Editors and debuggers - a variety of facilities and services that assist the user, at almost a programmer's level, in the diagnosis of issues that might be affecting the application, memory, storage, or utilities

Program execution - the tasks associated with the functional aspects of executing a program. The OS manages the instructions and data that must be loaded into main memory, the I/O devices and files that must be initialized, and the other resources that must be scheduled to perform

Access to I/O devices - the specific instructions that control and make functional items such as data file exchanges between processor and disk storage, input from the keyboard, or output to a display

Controlled access to files - dealing with the file format of the storage system and the interpretation of privileges or permissions necessary for users or devices that request access to data

System access - accessibility to the system as a whole and to specific system resources. Access functions need to provide protection for resources and data from unauthorized users. The OS must be able to resolve conflicts at the system level, at the I/O level, and more

Error detection and response - the ability to react and take a response that clears an error condition with the least impact on running applications

Accounting - the collection of usage statistics from various resources, the monitoring of performance parameters, the logging and classification of events, and the decision processes that can automatically improve overall system performance

File Systems

A file system is generally created at the time the storage system is first initialized. The file system is necessary to create new files and folders (formerly called directories), which are used by applications and users to organize, categorize, and segregate various elements of the operating system, applications, and data. These folders are in turn managed by the file system.

The way in which we name the files, and where they are placed so that they can be logically retrieved, is the underlying concept behind the file system. The most familiar file systems are those found in DOS, Windows, OS/2, Mac OS, and UNIX-based operating systems. Each of these OSs has file systems whereby files are hierarchically placed. Nearly all of these OSs use the "tree" structure, sometimes viewed as a "leaf and node" architecture. Files are placed in a directory (known as a folder in Windows and Mac OS) or subdirectory (subfolder in Windows) at the desired, hopefully logical, place in the tree structure. The naming conventions for folders, directories, subfolders, and subdirectories usually change depending upon the operating system.

File systems specify conventions for naming files. These conventions include the maximum number of characters in a name, which characters can be used, and in some systems, how long the file name suffix can be. A file system also includes a format for specifying the path to a file through the structure of directories.
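Those conventions are visible in any path-handling library. The short Python example below pulls apart a made-up POSIX-style path into the pieces the text describes: the route through the directory tree, the file name, and its suffix.

```python
from pathlib import PurePosixPath

# A hypothetical path, used only to illustrate the conventions.
p = PurePosixPath("/home/media/projects/clip.mov")

parts = p.parts     # each level of the directory tree along the path
name = p.name       # the file name itself
suffix = p.suffix   # the (often length-limited) file name suffix
```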

The Mac OS X (Version 10.5) uses the "home folder" as the location that stores the user's personal information, which is created for each user of the computer when that particular user's account is set up. The "home folder" is not typically named Home; it usually is given the user's short name as specified in the user account. The home folder is almost never changed once it is set and is usually subdivided into additional folders that make it easier and more logical to locate the various types of files (e.g., music, pictures, public or shared files, or sites).

The Windows OS (XP, Vista, and Windows 7) has a very similar architecture to the Mac OS, generally with the same concepts but using slightly different names.

These elements are the basic components at the surface level of the file system, the parts that users see the most and probably the only ones they ever need to directly interface with. Getting any deeper than the directory or folder level is more for the advanced user and becomes quite different for each operating system; therefore, we will leave those discussions out of this book. However, there is a need to understand what is under the hood if you are a media professional or IT administrator who needs to mix or associate different file systems in order to set up storage or configure a cross-platform architecture. The remainder of this chapter will cover the variances of some of the better known file systems.

File System Categories

Industry and users have different perceptions about what a file system is and what it is perceived to do versus what it really does. For the purposes of this discussion, we'll define a file system as that entity in a computer system that stores named data sets and attributes about those data sets so that they can be subsequently accessed and interpreted according to the needs of that system.

The attributes stored with the named data sets include data describing the originator and ownership, the access rights and date of the last access, and the physical location of those data sets. Like descriptive metadata, there may be other optional or advanced attributes that might include textual descriptions of the data, any migration or performance policies, and encryption keys.

It can be quite useful to break file systems into a group of categories. Examples include the
following:



Local file systems - providing for the naming, data access, and attribute interpretation for only the services they provide and only for the local system on which they run

Shared file systems - a means of sharing data between more than one system. Familiar examples include NFS (network file system) and CIFS (common Internet file system). Most shared file systems will require that a local file system supports the file server or network attached storage (NAS) appliance

Special file systems - those with properties specific to applications or uses, but which tend to store data in a conventional sense. An example is "tmpfs," which stands for "temporary file storage facility" (as in Unix or Unix-like OSs). Here the file system appears like a mounted file system, but the data is stored in volatile memory rather than persistent storage, such as a disk drive. RAM disks, which look like virtual disk drives but host a disk file system, could be a tmpfs

Media-specific file systems - those associated with the media on which they are typically found. This is not media as in images or sound, but the physical media such as a DVD-RW or DVD+RW. A more familiar example is the universal disk format (UDF), which is the file system format found on most DVDs
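The tmpfs idea from the "special file systems" category above can be mimicked at the application level. In the sketch below, an in-memory object behaves like a file (write, seek, read), but its contents live in volatile memory and vanish with the process, much as data on a RAM-backed mount does at power-off.

```python
import io

# A file-like object backed by RAM rather than persistent storage.
ramfile = io.BytesIO()
ramfile.write(b"scratch data")  # "written" to memory, not to a disk
ramfile.seek(0)
contents = ramfile.read()       # readable like any ordinary file
```

This is only an analogy: a real tmpfs is a kernel-level mount visible to every process, while `BytesIO` is private to one program.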

Over the next pages, we will look at "local file system" examples which, as most readers will see, are probably the most familiar of the file systems that they routinely come into contact with.

Local File System

The names that stand out most in file systems where personal computers are used consist of the following:

File Allocation Table (FAT)

New Technology File System (NTFS)

Universal Disk Format (UDF)

The number of other file systems, including those for mobile systems and those that are embedded, is beginning to grow as storage and memory move well outside the boundaries of spinning disks and CD-ROMs. To cover all of these file systems would be a journal unto itself, so this section will pick a few of the more recognizable file systems that are employed with or on the operating systems discussed in the first part of this chapter.

File Allocation Table

The FAT file system, developed during the period around 1976-1977, was first created for managing disks that supported the Intel 8086 microprocessor version of Microsoft Stand-Alone Disk BASIC, a BASIC interpreter with a built-in operating system. FAT is the file system supported by the Microsoft MS-DOS operating system.

FAT uses a table which centralizes the information that describes which areas of the storage media (drives) belong to files, which areas are free or possibly unusable, and the location of each file as it is stored on the disk. Disk space is allocated to files in contiguous groups of hardware sectors called clusters (i.e., allocation units). Early disk drives had a relatively low number of clusters compared with today's drives. Throughout the evolution of magnetic storage media, beginning with the flexible removable floppies and growing into comparatively huge fixed hard disk drives, the number of clusters that divided the storage areas on these drives grew significantly. This escalation required increasing the number of bits used to identify each cluster. Successive versions of the FAT format were created to address the number of table elements in terms of the bits that described the table entries, as in the following.
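The table works as a linked list of clusters: each entry names the next cluster of the same file, and a reserved marker ends the chain. The sketch below uses invented cluster numbers to show how a file's chain is followed.

```python
EOC = 0xFFF  # end-of-chain marker, from FAT12's reserved range

# fat[cluster] -> next cluster of the same file (toy values):
# file A occupies clusters 2 -> 5 -> 6; file B occupies 3 -> 4.
fat = {2: 5, 5: 6, 6: EOC, 3: 4, 4: EOC}

def cluster_chain(fat, start):
    """Follow a file's allocation chain from its starting cluster."""
    chain = []
    cluster = start
    while cluster != EOC:
        chain.append(cluster)
        cluster = fat[cluster]  # the table entry points to the next cluster
    return chain

chain = cluster_chain(fat, 2)
```

Note that a file's clusters need not be adjacent on disk; the chain through the table is what ties them together, which is also why a damaged FAT makes files unrecoverable.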

FAT12

An initial version designed as a file system for floppy disks, with cluster addresses limited to 12-bit values. This file system had a limited cluster count of 4078. The 12-bit value is derived from 2^12 (= 4096), which ordinarily would be the number of clusters (or allocation units) that can be addressed. However, the actual cluster count is less than this, given that 000h and 001h are not used, and FF0h to FFFh are reserved (or used for other purposes), leaving the possible range of clusters between 002h and FEFh (decimal numbers 2 through 4079).
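The FAT12 arithmetic above, spelled out:

```python
# 12-bit cluster numbers give 2**12 addressable values, but 000h,
# 001h, and the reserved range FF0h-FFFh are not usable for data,
# leaving clusters 002h through FEFh.
total = 2 ** 12            # 4096 addressable values
first_usable = 0x002
last_usable = 0xFEF        # FF0h-FFFh are reserved markers
usable = last_usable - first_usable + 1   # the chapter's count of 4078
```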

FAT16

The second implementation, introduced in 1988, became the primary file system for Microsoft MS-DOS 4.0 (disk operating system) up through Microsoft Windows 95. FAT16 would support disk drive capacities up to 2 Gbytes.

FAT32

This FAT system was introduced in 1996 for Windows 95 OSR2 users and became the primary file system for consumer versions of Windows up through Windows ME. FAT32 supports disk drive capacities up to 8 Tbytes.

New Technology File System

Windows New Technology File System, the formal name which quickly became simply NTFS, was introduced by Microsoft in 1993 with the Windows NT 3.1 operating system. NTFS is the primary file system used in Microsoft's Windows NT, Windows 2000, Windows XP, Windows 2003, and Windows Vista/Windows 7 operating systems. NTFS supports hard drive capacities up to 256 Tbytes.

When Microsoft first designed NTFS, it already supported both FAT and HPFS. The company had determined that it needed a new operating system, which it would name Windows NT, so the Windows NT File System was engineered to be a more robust replacement for HPFS. NTFS, as a mature journaling file system, would become a superlative file system for fixed hard drives because it provided for the essentials of recoverability and availability.

Availability meant that should a service or system crash, it would recover because of how the data was stored in the journaling process. The file system would recover in a few seconds without having to run CHKDSK as often as other systems without journaling.

NTFS provides recoverability because of the techniques employed that maintain, in the database-like logging structure, all the activities of the OS and other activities typical of a journaling file system, which commits metadata changes to the file system in transactions.

As explained earlier, should a crash or a power failure occur without backup, the NTFS file system rolls back any uncommitted transactions so as to quickly return the file system to a stable, consistent state. Then it continues to execute instructions.
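The commit/rollback behavior can be sketched as a miniature write-ahead log. This is an illustration of the general journaling idea, not NTFS's actual on-disk log format: only transactions that reached "commit" before the crash are replayed; staged changes without a commit are discarded.

```python
def replay(journal):
    """Rebuild file-system metadata state from a journal after a crash."""
    state, pending, in_txn = {}, {}, False
    for record in journal:
        if record == ("begin",):
            pending, in_txn = {}, True
        elif record == ("commit",):
            state.update(pending)   # the transaction is now durable
            pending, in_txn = {}, False
        elif in_txn:
            key, value = record
            pending[key] = value    # staged, but not yet committed
    return state  # uncommitted `pending` changes are rolled back

# Hypothetical journal: the second transaction crashed before commit.
journal = [
    ("begin",), ("fileA.size", 1024), ("commit",),
    ("begin",), ("fileB.size", 2048),
]
state = replay(journal)
```

After replay, the first change survives and the interrupted one is gone, which is exactly the "stable, consistent state" the text describes.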

Other FAT File Systems

Beyond the mainstream file systems developed for the consumer and industrial workspace, other proprietary and nonproprietary file systems have been developed for applications including high-performance storage systems, real-time operating systems, and embedded systems.

Some of those modified FAT file systems for Windows include FATX, designed for the Microsoft Xbox gaming system's hard drive and memory cards. Another incompatible replacement for the FAT file system, called exFAT, was introduced for Windows Embedded CE 6.0, where its intended use was for flash drives. exFAT introduced a free space bitmap that allowed faster space allocation and faster deletes, added support for files up to 2^64 bytes (= 18.435 Ebytes), and included much larger cluster sizes among other features.

Besides those file systems described previously, other examples of disk file systems include
the following:

HFS (and HFS+)

Formally named the Hierarchical File System, this is the proprietary file system developed by Apple for those running the Mac OS; HFS Plus is the extended implementation, also called Mac OS Extended. HFS divides a volume into logical blocks of 512 bytes. Those logical blocks are grouped into allocation blocks, each comprised of one or more logical blocks, depending upon the total size of the volume. HFS is limited to 65,536 allocation blocks because it was originally set up using a 16-bit value to address those allocation blocks.
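The consequence of that 16-bit limit can be shown with a little arithmetic: since a volume may contain at most 65,536 allocation blocks, the allocation block must grow with the volume (wasting space for small files on large volumes). The helper below is an illustrative sketch of that sizing rule, not Apple's actual allocator.

```python
LOGICAL_BLOCK = 512          # bytes per HFS logical block
MAX_ALLOC_BLOCKS = 2 ** 16   # 16-bit allocation block numbers

def alloc_block_size(volume_bytes):
    """Smallest allocation block (a whole number of 512-byte logical
    blocks) that lets the volume fit within 65,536 allocation blocks."""
    logical_blocks = -(-volume_bytes // LOGICAL_BLOCK)  # ceiling divide
    per_alloc = -(-logical_blocks // MAX_ALLOC_BLOCKS)  # ceiling divide
    return per_alloc * LOGICAL_BLOCK

size_1gib = alloc_block_size(1 * 1024**3)    # a 1 GiB volume
size_32mib = alloc_block_size(32 * 1024**2)  # a 32 MiB volume
```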

HPFS

The High-Performance File System was developed for the OS/2 operating system, initially created by IBM and Microsoft, then later exclusively developed by IBM, and is no longer marketed by them. HPFS allows for mixed-case file names, supports 255-character-long file names (FAT was limited to 8+3), stores files on a per-sector basis as opposed to using multisector clusters (this improves disk space utilization), and uses the B+ tree structure for directories.

B+ tree (B plus tree) uses a key identifier that allows for multilevel indexing, which improves record insertion, removal, and retrieval. The B+ tree is a variant of the B-tree, a structure generalized from the binary search tree in which a node may have more than two children; in a B+ tree, the records themselves are stored at the leaf level of the tree. B-trees are commonly used in databases and file systems and are optimized for systems that read and write large blocks of data.
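To make the multilevel-indexing idea concrete, here is a minimal two-level sketch in Python. The file names, record numbers, and `lookup` helper are invented for illustration; a real B+ tree is balanced, has much higher fan-out, and grows or shrinks its levels dynamically.

```python
import bisect

# Two-level sketch of B+-tree-style directory indexing: an inner level of
# separator keys routes a lookup to the single leaf holding the record.
# All data (file name -> record number) lives at the leaf level.

leaves = [
    [("apple.txt", 101), ("boot.cfg", 102)],
    [("movie.mp4", 103), ("notes.md", 104)],
    [("song.wav", 105), ("video.avi", 106)],
]
# Separator keys: the first key of every leaf after the first one.
separators = ["movie.mp4", "song.wav"]

def lookup(name):
    """Route through the index level, then scan only the one matching leaf."""
    leaf = leaves[bisect.bisect_right(separators, name)]
    for key, record in leaf:
        if key == name:
            return record
    return None
```

Because the index level narrows the search to one leaf, a directory of thousands of entries needs only a handful of comparisons per lookup rather than a linear scan, which is why HPFS (and NTFS after it) adopted tree-structured directories.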

UFS and V7

This stands for the Unix file system, which is used by Unix and Unix-like platforms. UFS is a hierarchical file system, with the root of the file system at inode 2. UFS is the default Solaris file system.

UFS is also known as the Berkeley Fast File System, the Berkeley Software Distribution (BSD) file system, sometimes just Berkeley Unix, or the FFS. All these names are historically, yet distantly, related to the original file system used by Unix, known as Version 7 Unix (or Version 7, or V7). V7 was developed for the PDP-11 from Digital Equipment Corporation (DEC). During the era of the minicomputer, V7 was the first "portable" version of Unix and was released in 1979. A public release of V7/x86 (for the x86-based IA-32 PC) has been made by Nordier & Associates, with the majority of files in the V7/x86 distribution derived from UNIX Version 7 source code, and some of them from 32V UNIX source code.

UDF

Defined by the Optical Storage Technology Association (OSTA), the Universal Disk Format is a file system that is compliant with ISO-13346/ECMA-167, the successor to the CD-ROM file system (CDFS or ISO-9660). All the standard formats used in video recording on DVD-like media use some version of the UDF file system.



• Philips' DVD+VR format uses UDF 1.02 with an ISO 9660 bridge for DVD+R and DVD+RW.

• The DVD Forum's DVD-VR format uses UDF 2.00 for DVD-R, DVD-RW, and DVD-RAM.

• Blu-ray Disc and the DVD Forum's HD-DVD use UDF 2.50 or UDF 2.60.

The UDF file system has grown popular as the need has increased to store large files on cross-platform media whose fully rewriteable storage capacities exceed several gigabytes. Those media include, for example, flash media (see Chapter Seven) and the 35-Gbyte and 70-Gbyte Iomega REV removable hard disks.

ZFS

Originally called the Zettabyte File System (a zettabyte (ZB) is 1000 Ebytes, with an exabyte equal to 1000 Pbytes), this file system was designed and implemented in the 2004 time frame by Sun Microsystems (now part of Oracle) and released for Solaris around mid-2006. The main focus of ZFS is to protect files against "silent corruption," which its designers claim is not currently possible with most of the other named file systems, including XFS, JFS, or NTFS.

Flash File Systems

Designed for storing files on flash memory devices, flash file systems are becoming more prevalent, especially as the number of mobile devices increases and the capacity of flash memories gets much larger. Flash file systems are optimized for the particular architecture of flash, which governs how data is written, read, and altered on the device.

Although a disk file system can be used on a flash device, this is suboptimal for several reasons:



• Block erasure: flash memory blocks have to be explicitly erased before they can be rewritten. The time required to erase these blocks can be significant; thus, it is advantageous to erase unused blocks while the device is idle.

• Random accessibility: a disk file system is optimized to avoid the latency associated with disk seeks (i.e., the activity of locating the blocks or data on the physical drive) whenever possible. Flash memory devices impose comparatively little to no seek latency.

• Wear leveling: when a single block in a flash memory device is continually overwritten, it will tend to wear out over time. A flash file system is designed to spread writes out evenly.

A log-structured file system, such as JFFS2, UBIFS, or YAFFS, will have many of the desirable properties for a flash file system.
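The circular-log idea behind wear leveling can be sketched in a few lines of Python. The `CircularLogFlash` class is a toy model invented for illustration (real flash file systems also track bad blocks, garbage-collect stale data, and map logical to physical addresses); it simply shows that writing at an ever-advancing log head spreads erase cycles evenly across the device.

```python
# Toy model of wear leveling via a circular log, in the spirit of JFFS's
# approach. Names and structure are illustrative, not any real on-flash format.

class CircularLogFlash:
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.erase_counts = [0] * num_blocks  # wear per physical block
        self.head = 0                         # next block to write

    def write_block(self, data):
        """Always write at the log head, so repeated rewrites of the same
        file land on a different physical block each time."""
        block = self.head
        self.erase_counts[block] += 1  # each reuse costs one erase cycle
        self.head = (self.head + 1) % self.num_blocks
        return block

flash = CircularLogFlash(4)
# Rewrite the "same" file eight times; the writes cycle through all blocks,
# so no single block absorbs all eight erase cycles.
placements = [flash.write_block(b"data") for _ in range(8)]
```

Contrast this with a disk file system, which would rewrite the same logical block in place eight times, concentrating all the wear on one physical location.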

JFFS2

This revision is the successor to the original Journaling Flash File System (JFFS), which was used only with NOR flash memory and was typically employed on the Linux operating system. JFFS enforced wear leveling by treating the flash device as a circular log. JFFS2 now supports NAND flash, whereas the original JFFS supported only NOR. NAND devices have a sequential I/O interface and thus cannot be memory-mapped for reading. JFFS2 remedies this and other issues, adding hard links, compression, and better performance, including support for larger flash devices in excess of 64-512 Mbytes.

The successor to JFFS2 is the Unsorted Block Image File System (UBIFS), designed when flash memory chips began to exceed 512 Mbytes. The on-flash indexing in UBIFS is a B+ tree with a wandering algorithm, and indexing in memory uses the Tree Node Cache (TNC), which caches indexing nodes and may be shrunk should the Linux virtual memory (VM) subsystem require more memory. UBIFS allows for write-back operations, which cache data at write time, as opposed to the write-through behavior of JFFS2.

YAFFS

As if three-letter acronyms (TLAs) were not enough, this five-letter acronym is named Yet Another Flash File System and was designed by New Zealander Charles Manning for Aleph One. Of course, there is a YAFFS1 and a YAFFS2, the latter designed because NAND flash memory evolved from the original 512-byte page (with a 16-byte spare) to newer, larger 2048-byte pages with 64-byte spares. Still a log-structured file system, it is designed with data integrity and robustness in mind and is used on Linux, pSOS, eCos, and various other "special purpose" OSs including WinCE.

The importance of these newer and relatively obscure file systems lies in the development and rising popularity of mobile devices and other portable platforms. Most of these systems use embedded OSs, which rely on flash memory for storage. Ordinary file systems place undue wear on the flash chips, causing them to wear out prematurely. These file systems are built around the needs of wear leveling, the practice of avoiding repetitive use of the same memory space in the flash chip before exhausting other areas of open/free memory on the chip.

Traditional File Systems

Traditional file systems offer facilities, i.e., capabilities or services, that can create, move, and delete both files and directories. However, they may also lack the ability to create supplemental links to a directory (called "hard links" or "parent links" in Unix and Unix-like OSs) or to create bidirectional links to files.

Traditional file systems may further offer facilities that can truncate, append to, create, move, delete, or modify files in place, but they lack the facilities to prepend to or truncate from the beginning of a file and may further preclude arbitrary insertion into or deletion from a file.

Editing systems for media-centric operations depend upon many functions that are outside the bounds of traditional operating systems. Because of the requirements that media activities present, such as the continual modification of files of enormous proportions, specialized consideration must be given to these types and sizes of files.

These sets of operations are highly asymmetric, and they lack the generality to be useful in unexpected contexts. In Unix, as an example, interprocess pipes have to be implemented outside of the file system because the file system does not offer truncation from the beginning of files. Other similar concepts may be necessary, and are often proprietary, to the architectures of file systems and operating systems for media-focused activities.

Networked or Distributed File Systems

One of the two possible references to a "networked file system" is the familiar NFS, more of a protocol that allows a client computer to access files across a network. NFS is referenced throughout this chapter and other chapters in this book.

The other version makes reference to a distributed file system. In a more generalized fashion, NFS (the protocol) is the implementation of a distributed file system. In this latter perspective, the explanation is quite similar. A distributed file system is any file system that allows access to and the sharing of files between two or more host computers (clients). As mainframe computer systems were experiencing their halcyon days and file servers were emerging (ca. 1970s), leaders in the computer business saw the need to allow multiple users on multiple machines to interchange files and data among themselves. For a mainframe implementation, long before the personal computer, this was rather an obvious requirement, especially since there was only a small subset of computer systems available versus users and/or terminals.

As more machines came into being, more users needed access, and eventually the minicomputer began performing the same kinds of tasks as the mainframes, but with less compute power. The minis also needed shared access so that they and their users could exchange files.

Files that are accessed over a network need to have the same core principles available as those accessed on a local drive. This is readily apparent when capturing a file over the Internet and copying it to the local drive. If another user on another machine has sharing abilities with your machine, the file on your machine should look and act exactly the same as the one viewed on the other machine, and both views should be the same as the one found originally on the Internet. Updates made to a file on one machine should not interfere with access or updates in another location.

Distributed file systems need to have protective measures implemented, such as access control lists (ACLs), that prevent unauthorized users from viewing, tampering with, or destroying files that are accessed over the network. Since the files may be location independent, this functionality is very important.

The file system needs to be secure. It may employ a file system client to mount and interact with directories and files that are located on a remote workstation or server. In this case, the client will most likely interact with the remote file system via the SSH File Transfer Protocol, also called Secure File Transfer Protocol (SFTP). There are many acceptable means of preventing unauthorized or hostile access, most of which came from the work of the IETF and employ security measures based more on networks than on file systems.

Some of the recognized distributed file systems in use include the following:

• Apple Filing Protocol (AFP)

• Server Message Block (SMB), also known as CIFS

• Amazon S3

• NetWare Core Protocol (NCP), from Novell, used on NetWare-based networks

Fault Tolerance

A distributed fault-tolerant file system is a closely associated, dedicated, and distributed file system that is engineered for high availability (HA) and supports activities in disconnected (offline) operation. These systems utilize synchronization methods (sometimes over HTTP), are available on various operating systems, and may further include encryption.

Clustered file systems and shared disk file systems are alternative names for what are
considered distributed file systems.

File System Processes

Within each file system, a number of processes are available that control access to files, permissions, and other security-related functions. Each activity or event associated with the file system can be controlled through a set of permissions, assignments, or allocations. This information can be obtained from the metadata associated with the file attributes and from a data rights and policies process called an access control list.

Access Control List

When referencing file systems, the access control list (ACL) is usually a data structure in the form of a table containing entries that describe and assign the permissions attached to an object. The topics of object-based storage (OBS) and those devices that support objects, i.e., object-based storage devices (OSDs), are discussed in greater depth in Chapter Eleven. The ACL specifies which system rights and processes, and which users, are granted access to those objects. The objects may be programs, processes, or files. The ACL further delegates which operations are allowed to be performed on which objects.

ACL models may be applied to a set (i.e., a collection) of objects as well as to individual entities within the system.

Security is a key consideration in any ACL-based model. Of concern is how these access control lists are edited, by which users, and under which processes. The ACL drives specific access rights, which show up as privileges or permissions for users, such as executing, writing to, or reading from an object. ACL processes can be extended to a digital rights management (DRM) system, whereby any modification process that could jeopardize the integrity of the content (that is, the files that are the content) can be restricted.
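The entry-by-entry permission check an ACL implies can be sketched in Python. The `ACL` table, object name, and `is_allowed` helper are all invented for illustration; real operating systems store ACEs in the file's security metadata and evaluate them in a defined order, including explicit deny entries.

```python
# Hypothetical sketch of an ACL: a table of (principal, permissions) entries
# attached to an object. Names are illustrative, not any OS's actual API.

ACL = {
    "report.doc": [
        ("alice", {"read", "write"}),   # alice may read and modify
        ("bob",   {"read"}),            # bob may only read
    ],
}

def is_allowed(user, obj, operation):
    """Walk the object's ACL entries; grant access only on a matching entry.
    Anything not explicitly granted is denied (default-deny)."""
    for principal, perms in ACL.get(obj, []):
        if principal == user and operation in perms:
            return True
    return False
```

The default-deny stance, where the absence of an entry means no access, is the usual design choice: an unlisted user such as "eve" gets nothing, rather than falling through to some implicit permission.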

The individual entries that make up an ACL are known as access control entries (ACEs) in the operating systems for Microsoft Windows NT, OpenVMS, Unix-like systems (e.g., Linux, BSD, or Solaris), and Apple's Mac OS X.

Attributes of Metadata

Files have their own set of attributes that are used as a means of bookkeeping for both the individual files and the file system as a whole. The information in those attributes is called metadata. The metadata is typically organized so that it associates with the structure of how that information is identified, tracked, and cataloged within the file system. The data length of a file may be stored as the number of blocks allocated for the file or as an exact byte count. When the file was last modified may be stored as the file's timestamp. Other metadata attributes employed by some file systems include the creation time, the time it was last accessed, and the time that the file's metadata was changed.
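These attributes can be inspected directly through Python's standard `os.stat` call, which surfaces exactly the metadata described above: the exact byte count and the modification and access timestamps (the temporary file used here is just for a self-contained demonstration).

```python
# Reading a file's metadata attributes via the standard os.stat interface.
import os
import tempfile

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello metadata")
    path = f.name

st = os.stat(path)
size = st.st_size     # data length as an exact byte count
mtime = st.st_mtime   # timestamp of last modification
atime = st.st_atime   # timestamp of last access

os.unlink(path)       # clean up the demonstration file
```

On file systems that track it, `st.st_ctime` (metadata change time on Unix) and, on some platforms, a creation time are exposed through the same structure.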

Additional information, some of which early PC operating systems did not track, includes the file's device type, its block, character, socket, subdirectory, or rights management data. The file may also catalog the owner's user ID or group ID and access permission settings, such as whether the file is read-only, whether it is an executable file, or the security level associated with the file.

Advanced file systems carry information considered arbitrary, such as in XFS, some versions of UFS, and HFS+. This information is cataloged using a set of "extended file attributes." Such features are implemented in the kernels of the Linux, FreeBSD, and Mac OS X operating systems, which enable metadata to be associated with the file at the file system level. The information could include, for example, the author of a document, the character encoding of a plain-text document, or even a data checksum for file integrity or recovery authentication.

Journaling File Systems

A journaling file system, in general principle, is a file system that tracks changes in a journal, which can be thought of as a form of register or log. In this context, the file system's "journal" is usually a circular log placed into a dedicated area of the file system. This is where any changes are placed before they are committed to the main file system. Should a failure occur, such as a system crash or power interruption, the journal, which has kept these potentially volatile changes, is utilized by the file system to quickly bring the system and its files back online with less likelihood of corruption.

The journaling file system is used in many product lines and on various operating systems. Because the file system uses a special area on the disk, referred to as the "journal," to write changes which occur during operation, this significantly reduces the time required for data updating while increasing file system reliability.
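The commit-then-replay behavior described above can be sketched in a few lines of Python. The tuple-based journal records and the `replay` helper are invented for illustration, not any real on-disk format; the point is that after a crash, only transactions that reached their commit record are applied, and everything else is discarded.

```python
# Minimal sketch of write-ahead journaling: changes are appended to a journal
# first, and only entries whose transaction was committed are replayed into
# the "main" store after a simulated crash.

def replay(journal):
    """Apply only committed transactions; anything uncommitted is dropped,
    which models the rollback a journaling file system performs on remount."""
    store = {}    # the main file system state
    pending = {}  # per-transaction staged changes
    for record in journal:
        if record[0] == "write":
            _, txn, key, value = record
            pending.setdefault(txn, {})[key] = value
        elif record[0] == "commit":
            # The commit record is the point of no return: publish the changes.
            store.update(pending.pop(record[1], {}))
    return store  # transactions still in `pending` never happened

journal = [
    ("write", 1, "inode:7", "len=512"),
    ("commit", 1),
    ("write", 2, "inode:9", "len=64"),   # crash occurred before commit
]
state = replay(journal)
```

Recovery time depends only on the journal's length, not on the volume's size, which is why a journaled volume remounts in seconds where a full consistency check could take hours.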

Modern versions of the journaling file system were developed for the OS/2 Warp Server for e-Business (from IBM). IBM calls their implementation a "journaled file system" (JFS), built on a 64-bit journaling file system architecture. The original IBM JFS was later ported to IBM AIX and Linux. The goal of the developers was to provide high performance, reliability, and scalability on symmetrical multiprocessing (SMP) machines.

When a computer's hardware architecture employs two or more identical microprocessors that share the same main memory and are controlled by a single instance of the operating system, the computer will generally operate in a symmetrical mode. When multiple processors are connected using cross-bar switching, on-chip mesh networks, or buses, the control of the interconnections is critical to avoid bottlenecks. Scaling SMP systems is complicated when using buses or cross-bar switching because of the interconnections that must be made between memory, storage, and the associated processors.

USN Journal

The Update Sequence Number Journal (USN Journal) is a system management feature that records changes to all files, streams, and directories on the volume, as well as their various attributes and security settings.

The USN Journal is a critical functionality of NTFS, and a feature that FAT/FAT32 does not provide, for ensuring that its complex internal data structures remain consistent should the system crash. These data structures are notably the volume allocation bitmap (data moves performed by the defragmentation API), modifications to Master File Table (MFT) records, and updates to the shared security descriptors or to the boot sector as well as its local mirrors. The latter structures are where the last USN transaction committed on the volume is stored. In the event of a crash, the indices for the directories and security descriptors will remain consistent, allowing for an easy rollback of those uncommitted changes to critical data structures once the volume is remounted.

In versions of Windows and Windows Server (2003 et al.), the USN journal was extended to trace the state of other transactional operations on other parts of the NTFS file system. Such services, like the Volume Shadow Copy Service, will create shadow copies of system files with, for example, cloning (a full copy/split mirror) or copy-on-write (i.e., differential copy) semantics. The Volume Snapshot Service (VSS) is another technology (included in Windows) that operates at the block level of the file system. VSS is similar in purpose, but different in functionality. Snapshotting is often found in intelligent disk storage systems.

Large Scale File Systems

Installing and configuring large-scale file systems can be extremely complex. Whether it is a clustered file system in front of SAN-attached storage or a number of traditional dual-controller NAS systems glued together with virtualization software, time and money can be wasted trying to deploy legacy systems for large-scale multimedia and video applications.

Multimedia and Video Applications

Meeting the requirements of a mission-critical system requires more than just duplicated or mirrored systems, which are designed mostly to prevent a complete failure of a primary system. In media and video applications, performance, capacity, and reliability become paramount in the design and deployment of servers and storage systems. Employing traditional, legacy storage architectures for these extremely demanding systems can be expensive, risky, and complex.

When the storage systems built for media or video applications utilize conventional approaches built around NAS and SAN (refer to Chapter Fifteen) storage systems, one may find limitations that constrain either the capacity or the scalability of the storage platform. These limitations may present themselves at unexpected moments when, for example, the maximum number of addressable or attachable NAS heads is reached, the expansion of a SAN is curtailed because the port count has been exhausted, or the finite number of addressable drives in a given volume is approached.

Some storage architectures also begin to suffer from performance (bandwidth) constraints as the number of drives or array components approaches some degree of physical capacity, whereby the only alternative is a forklift upgrade that replaces some or all of the drives.

There continue to be improvements in addressing the capacity and performance capabilities of media-centric storage systems. These issues have in recent years been resolved by various storage system approaches that include new approaches in grid storage and file system handling.

Grid Architectures

When a storage system is built such that it utilizes a distributed architecture that spreads slices of the files into sets of replicated components, the system may be referred to as grid storage or as a grid architecture. When the file data is segmented into chunks and then broken down further into slices, some form of tracking mechanism must be employed so that the file elements can be reconstructed when read from the store.

Metadata is used to track the locations of file data. This metadata is usually managed by a secondary server. In addition to being the mechanism for reassembling the files, the metadata controller is allocated the task of reorganizing the data when a storage element (e.g., a disk drive or content store) fails and must be reconstructed using some flavor of a parity system, similar to the way a RAID set is rebuilt.
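The slice-plus-parity idea can be sketched with XOR parity, the same primitive a RAID rebuild uses. The `split_with_parity` and `rebuild` helpers, the slice count, and the sample payload are all invented for this example; production grid systems use more sophisticated erasure codes and track slice placement in the metadata server described above.

```python
# Illustrative sketch of grid-style storage: a file is cut into equal slices
# spread across nodes, with one XOR parity slice so that any single lost
# slice can be rebuilt from the survivors (RAID-like reconstruction).
from functools import reduce

def split_with_parity(data, num_slices):
    """Chunk `data` into `num_slices` equal slices plus one XOR parity slice."""
    size = -(-len(data) // num_slices)  # ceiling division
    slices = [data[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(num_slices)]
    # Parity byte = XOR of the corresponding byte from every slice.
    parity = bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*slices))
    return slices, parity

def rebuild(slices, parity, lost_index):
    """Reconstruct a lost slice by XORing the surviving slices with parity."""
    survivors = [s for i, s in enumerate(slices) if i != lost_index] + [parity]
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*survivors))

slices, parity = split_with_parity(b"media file payload!!", 4)
# Simulate losing the node that held slice 2, then rebuild it.
recovered = rebuild(slices, parity, lost_index=2)
```

Because XOR is its own inverse, XORing the survivors with the parity slice yields exactly the missing slice, which is why the metadata controller only needs to know which slice lived on the failed element.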

Grid architectures are quite useful for optimizing performance and capacity scalability. When servers or storage components are added to a grid system, not only does storage capacity increase, but performance also increases as bandwidth and throughput are scaled up.

Grids and clusters are discussed in Chapter Fourteen.

Further Readings

ECMA-167. Volume and File Structure for Write-Once and Rewritable Media using Non-Sequential Recording for Information Interchange (1997, 3rd ed.). Specifies a format and associated system requirements for volume and boot block recognition, volume structure, file structure, and record structure for the interchange of information on media between users of information processing systems.
http://www.ecma-international.org/publications/Ecma-167.htm

Universal Disk Format (UDF) specification, Revision 2.50
http://www.osta.org/specs/

ISO-13346: the ISO number for the UDF file specification.
http://www.iso.org

The Filesystem Manager
http://www.swd.de/documents/manuals/sysarch/fsys_en.html