
ANS-1

File organization refers to the relationship of the key of a record to the physical location of that record in the computer file.

A file may be either a physical file or a logical file. A physical file is a physical unit, such as a magnetic tape or a disk. A logical file, on the other hand, is a complete set of records for a specific application or purpose. A logical file may occupy part of a physical file or may extend over more than one physical file.

The objectives of computer-based file organization are:

- Ease of file creation and maintenance.
- Efficient means of storing and retrieving information.

The various file organization methods are:

- Sequential access.
- Direct or random access.
- Indexed sequential access.

The selection of a particular method depends on:

- Type of application.
- Method of processing.
- Size of the file.
- File inquiry capabilities.
- File volatility.
- The response time.

1. Sequential access method: Here the records are arranged in ascending or descending order, or in chronological order, of a key field, which may be numeric, alphabetic, or both. Since the records are ordered by a key field, there is no separate storage location identification. It is used in applications like payroll management where the file is to be processed in its entirety, i.e. each record is processed. Here, to access a particular record, each record must be examined until the desired record is reached.

Sequential files are normally created and stored on magnetic tape using the batch processing method.

Advantages:

- Simple to understand.
- Easy to maintain and organize.
- Locating a record requires only the record key.
- Relatively inexpensive I/O media and devices can be used.
- Easy to reconstruct the files.
- Efficient when the proportion of file records to be processed is high.

Disadvantages:

- The entire file must be processed even to retrieve specific information.
- Inefficient when the activity rate is very low.
- Transactions must be stored and placed in sequence prior to processing.
- Data redundancy is high, as the same data can be stored at different places with different keys.
- Impossible to handle random enquiries.

2. Direct access file organization (random or relative organization): Files of this type are stored on direct access storage devices, such as magnetic disks, using an identifying key. The identifying key relates to the record's actual storage position in the file. The computer can use the key to locate the desired record directly, without having to search through any other records first. Here the records are stored randomly, hence the name random file. It is used in online systems where fast response and updating are required.

Advantages:

- Records can be immediately accessed for updating.
- Several files can be simultaneously updated during transaction processing.
- Transactions need not be sorted.
- Existing records can be amended or modified.
- Very easy to handle random enquiries.
- Most suitable for interactive online applications.

Disadvantages:

- Data may be accidentally erased or overwritten unless special precautions are taken.
- Risk of loss of accuracy and breach of security; special backup and reconstruction procedures must be established.
- Less efficient use of storage space.
- Expensive hardware and software are required.
- High complexity in programming.
- File updating is more difficult than with the sequential method.

3. Indexed sequential access organization: Here the records are stored sequentially on a direct access device such as a magnetic disk, and the data is accessible both randomly and sequentially. It combines the positive aspects of both sequential and direct access files.

This type of file organization is suitable for both batch processing and online processing.

Here, the records are organized in sequence for efficient processing of large batch jobs, but an index is also used to speed up access to the records.

Indexing permits access to selected records without searching the entire file.

Advantages:

- Permits efficient and economical use of sequential processing techniques when the activity rate is high.
- Permits quick access to records in a relatively efficient way when this activity is only a fraction of the workload.

Disadvantages:

- Slow retrieval when compared to other methods.
- Does not use storage space efficiently.
- Hardware and software used are relatively expensive.


ANS-2

ROLE OF DBA

One of the main reasons for using a DBMS is to have central control of both the data and the programs that access that data. A person who has such control over the system is called a Database Administrator (DBA). The following are the functions of a Database Administrator:

1. Schema Definition
2. Storage structure and access method definition
3. Schema and physical organization modification
4. Granting authorization for data access
5. Routine Maintenance


Schema Definition

The Database Administrator creates the database schema by executing DDL statements. The schema includes the logical structure of database tables (relations), such as the data types of attributes, the lengths of attributes, integrity constraints, etc.
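
For example, a minimal sketch of such a DDL statement (the table and column names are illustrative, echoing the EMPLOYEE table used in answer 8):

    -- Hypothetical schema: logical structure, data types, attribute
    -- lengths, and integrity constraints in one DDL statement.
    CREATE TABLE employee (
        emp_no CHAR(6)      PRIMARY KEY,           -- entity integrity
        e_name VARCHAR(40)  NOT NULL,              -- length constraint: 40
        salary DECIMAL(9,2) CHECK (salary >= 0),   -- domain constraint
        dept   VARCHAR(20)
    );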


Storage structure and access method definition

Database tables or indexes are stored in the following ways: flat files, heaps, B+ trees, etc. The DBA chooses appropriate storage structures and access methods for the data.
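
As a sketch, in most SQL systems the DBA influences the access method by creating indexes, which are commonly implemented as B+ trees (the index name and column are illustrative):

    -- A secondary index, typically a B+ tree, giving fast keyed
    -- access to employee rows in addition to sequential scans.
    CREATE INDEX idx_employee_name ON employee (e_name);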


Schema and physical organization modification

The DBA carries out changes to the existing schema and physical organization.
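
A minimal, hypothetical example of such a change:

    -- Schema modification: add a column to an existing table.
    ALTER TABLE employee ADD COLUMN phone VARCHAR(15);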


Granting authorization for data access

The DBA grants different access rights to users according to their level. Ordinary users might have highly restricted access to the data, while users higher in the hierarchy, up to the administrator, receive progressively more access rights.
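
For instance, a sketch of how such rights might be granted (the user names are illustrative):

    -- An ordinary user gets read-only access to one table...
    GRANT SELECT ON employee TO clerk1;
    -- ...while a user higher in the hierarchy may also modify it.
    GRANT SELECT, INSERT, UPDATE, DELETE ON employee TO payroll_mgr;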


Routine Maintenance

Some of the routine maintenance activities of a DBA are given below:

- Taking backups of the database periodically.
- Ensuring that enough disk space is available at all times.
- Monitoring the jobs running on the database.
- Ensuring that performance is not degraded by expensive tasks submitted by some users.
- Performance tuning.


USE OF DATA DICTIONARY BY DBA

A data dictionary, or metadata repository, is defined as a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format."

The term may have one of several closely related meanings pertaining to databases and database management systems (DBMS):



- a document describing a database or collection of databases
- an integral component of a DBMS that is required to determine its structure
- a piece of middleware that extends or supplants the native data dictionary of a DBMS


The terms Data Dictionary and Data Repository are used to indicate a more general software utility than a catalogue. A catalogue is closely coupled with the DBMS software; it provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as the DDL and DML compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. On the other hand, a data dictionary is a data structure that stores metadata, i.e., data about data. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users, and administrators of a computer system for information resource management. These systems are used to maintain information on system hardware and software configuration, documentation, applications and users, as well as other information relevant to system administration.

If a data dictionary system is used only by the designers, users, and administrators, and not by the DBMS software, it is called a Passive Data Dictionary; otherwise, it is called an Active Data Dictionary. An Active Data Dictionary is automatically updated as changes occur in the database. A Passive Data Dictionary must be manually updated.

The data dictionary consists of record types (tables) created in the database by system-generated command files, tailored for each supported back-end DBMS. Command files contain SQL statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER TABLE (for referential integrity), etc., using the specific statement syntax required by that type of database.

Database users and application developers can benefit from an authoritative data dictionary document that catalogues the organization, contents, and conventions of one or more databases. This typically includes the names and descriptions of the various tables and fields in each database, plus additional details, like the type and length of each data element. There is no universal standard as to the level of detail in such a document, but it is primarily a weak kind of metadata.
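
As a sketch, in DBMSs that implement the standard INFORMATION_SCHEMA views, the DBA can query the catalogue itself for exactly this kind of metadata (the table name is illustrative):

    -- Name, type, and length of every column of a table,
    -- read straight from the system catalogue.
    SELECT column_name, data_type, character_maximum_length
    FROM   information_schema.columns
    WHERE  table_name = 'employee';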


ANS-3

(I) INVERTED FILE ORGANISATION

REFER TO PAGE-76, BLOCK-1, TOPIC-3.6.3 & FIGURE-27


(II) REFERENTIAL INTEGRITY

Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table).

For referential integrity to hold in a relational database, any field in a table that is declared a foreign key can contain only values from a parent table's primary key or a candidate key. For instance, deleting a record that contains a value referred to by a foreign key in another table would break referential integrity. Some relational database management systems (RDBMS) can enforce referential integrity, normally either by also deleting the rows that hold the foreign key so as to maintain integrity, or by returning an error and not performing the delete. Which method is used may be determined by a referential integrity constraint defined in a data dictionary.
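
A hedged sketch of this enforcement behaviour, using the CUSTOMER/ORDERS tables introduced in part (III) below:

    -- Assuming orders.customer_id is a foreign key into customer:
    DELETE FROM customer WHERE customer_id = 42;
    -- If any order still references customer 42, an RDBMS enforcing
    -- referential integrity rejects the DELETE with an error, or,
    -- if ON DELETE CASCADE was declared, deletes those orders too.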

Benefits of Referential Integrity

Improved data quality

An obvious benefit is the boost to the quality of the data stored in a database. There can still be errors, but at least the data references are genuine and intact.

Faster development

Referential integrity is declared. This is much more productive (by one or two orders of magnitude) than writing custom programming code.

Fewer bugs

The declarations of referential integrity are more concise than the equivalent programming code. In essence, such declarations reuse the tried and tested general-purpose code in a database engine, rather than redeveloping the same logic on a case-by-case basis.

Consistency across applications

Referential integrity ensures the quality of data references across the multiple application programs that may access a database. Note that these definitions are expressed in terms of relational databases; however, the principle of referential integrity applies more broadly. It applies to both relational and OO databases, as well as to programming languages and modeling.

(III) FOREIGN KEY

A foreign key is a field in a relational table that matches a candidate key of another table. The foreign key can be used to cross-reference tables.

For example, say we have two tables: a CUSTOMER table that includes all customer data, and an ORDERS table that includes all customer orders. The intention here is that every order must be associated with a customer that is already in the CUSTOMER table. To do this, we place a foreign key in the ORDERS table and have it relate to the primary key of the CUSTOMER table.
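
A minimal sketch of this design in SQL (the column names are illustrative):

    CREATE TABLE customer (
        customer_id INT PRIMARY KEY,
        name        VARCHAR(40)
    );

    CREATE TABLE orders (
        order_id    INT PRIMARY KEY,
        customer_id INT NOT NULL,
        -- the foreign key: every order must refer to an existing customer
        FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
    );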

The foreign key identifies a column or set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The columns in the referencing table must reference the columns of the primary key or another superkey in the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table. Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). In this way, references can be made to link information together, and this is an essential part of database normalization. Multiple rows in the referencing table may refer to the same row in the referenced table. Most of the time, this reflects a one (parent, or referenced, table) to many (child, or referencing, table) relationship.

The referencing and referenced table may be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003 as a self-referencing or recursive foreign key.

A table may have multiple foreign keys, and each foreign key can have a different referenced table. Each foreign key is enforced independently by the database system. Therefore, cascading relationships between tables can be established using foreign keys. Improper foreign key/primary key relationships, or not enforcing those relationships, are often the source of many database and data modeling problems.

Foreign keys are defined in the ANSI SQL standard through a FOREIGN KEY constraint. The syntax to add such a constraint to an existing table is defined in SQL:2003 as shown below. Omitting the column list in the REFERENCES clause implies that the foreign key shall reference the primary key of the referenced table.

    ALTER TABLE <table identifier>
        ADD [ CONSTRAINT <constraint identifier> ]
        FOREIGN KEY ( <column expression> {, <column expression>} ... )
        REFERENCES <table identifier> [ ( <column expression> {, <column expression>} ... ) ]
        [ ON UPDATE <referential action> ]
        [ ON DELETE <referential action> ]





(IV) TRANSACTION

A transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions. Transactions in a database environment have two main purposes:

1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.

2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.

A database transaction, by definition, must be atomic, consistent, isolated and durable. Database practitioners often refer to these properties of database transactions using the acronym ACID.

Transactions provide an "all-or-nothing" proposition, stating that each work-unit performed in a database must either complete in its entirety or have no effect whatsoever. Further, the system must isolate each transaction from other transactions, results must conform to existing constraints in the database, and transactions that complete successfully must get written to durable storage.

The properties of database transactions are summed up with the acronym ACID:

Atomicity - all or nothing

- All of the tasks (usually SQL requests) of a database transaction must be completed.
- If they cannot be completed for any reason, the database transaction must be aborted.

Consistency - serializability and integrity

- The database must be in a consistent or legal state before and after the database transaction. This means that a database transaction must not break the database integrity constraints.

Isolation

- Data used during the execution of a database transaction must not be used by another database transaction until the execution is completed. Therefore, the partial results of an incomplete transaction must not be usable by other transactions until the transaction is successfully committed. It also means that the execution of a transaction is not affected by the database operations of other concurrent transactions.

Durability

- All the database modifications of a transaction will be made permanent, even if a system failure occurs after the transaction has been completed.

Theoretically, a database management system (DBMS) guarantees all the ACID properties for each database transaction. In reality, these ACID properties are frequently relaxed to some degree to improve performance.
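
A minimal sketch of such an all-or-nothing unit of work (the account table and the amounts are illustrative):

    -- Transfer 100 between two accounts: both updates take effect
    -- together, or, on failure, neither does.
    START TRANSACTION;
    UPDATE account SET balance = balance - 100 WHERE acc_no = 'A1';
    UPDATE account SET balance = balance + 100 WHERE acc_no = 'A2';
    COMMIT;  -- on error, ROLLBACK instead leaves the database unchanged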


(V) CANDIDATE KEY

In the relational model of databases, a candidate key of a relation is a minimal superkey for that relation; that is, a set of attributes such that:

1. the relation does not have two distinct tuples (i.e. rows or records, in common database language) with the same values for these attributes (which means that the set of attributes is a superkey);

2. there is no proper subset of these attributes for which (1) holds (which means that the set is minimal).

The constituent attributes are called prime attributes. Conversely, an attribute that does not occur in ANY candidate key is called a non-prime attribute.

Since a relation contains no duplicate tuples, the set of all its attributes is a superkey if NULL values are not used. It follows that every relation will have at least one candidate key.

The candidate keys of a relation tell us all the possible ways we can identify its tuples. As such they are an important concept for the design of a database schema.

For practical reasons, RDBMSs usually require that for each relation one of its candidate keys is declared as the primary key, which means that it is considered the preferred way to identify individual tuples. Foreign keys, for example, are usually required to reference such a primary key and not any of the other candidate keys.


A table may have more than one key; each such key is called a candidate key.

E.g (1) A table CUSTOMER consists of the columns Customer_Id, name, Address, etc. Customer_Id is the only key (unique), and thus a candidate key.

E.g (2) Consider a table CAR where we can have two keys, namely license_no and serial_no (each of which must be unique). Both license_no and serial_no are candidate keys.
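
A sketch of E.g (2) in SQL: one candidate key is declared as the primary key and the other as UNIQUE (the types are illustrative):

    CREATE TABLE car (
        license_no VARCHAR(10) PRIMARY KEY,      -- candidate key chosen as primary key
        serial_no  VARCHAR(17) NOT NULL UNIQUE,  -- the other candidate key
        model      VARCHAR(30)                   -- non-prime attribute
    );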


Q4

(I) PRIMARY & SECONDARY INDEXES

PAGE NO-47, PAGE NO-52, BLOCK-1, CH-3

(II) DISTRIBUTED VS CENTRALIZED DATABASE

A centralized database is a database in which data is stored and maintained in a single location. This is the traditional approach for storing data in large enterprises. A distributed database is a database in which data is stored in storage devices that are not located in the same physical location, but the database is controlled using a central Database Management System (DBMS).

What is a Centralized Database?

In a centralized database, all the data of an organization is stored in a single place, such as a mainframe computer or a server. Users in remote locations access the data through the Wide Area Network (WAN), using the application programs provided to access the data. The centralized database (the mainframe or the server) should be able to satisfy all the requests coming to the system, and therefore could easily become a bottleneck. But since all the data resides in a single place, it is easier to maintain and back up. Furthermore, it is easier to maintain data integrity, because once data is stored in a centralized database, outdated data is no longer available in other places.

What is a Distributed Database?

In a distributed database, the data is stored in storage devices that are located in different physical locations. They are not attached to a common CPU, but the database is controlled by a central DBMS. Users access the data in a distributed database by accessing the WAN. To keep a distributed database up to date, it uses the replication and duplication processes. The replication process identifies changes in the distributed database and applies those changes to make sure that all the distributed databases look the same. Depending on the number of distributed databases, this process can become very complex and time consuming. The duplication process identifies one database as a master database and duplicates that database. This process is not as complicated as the replication process, but it makes sure that all the distributed databases have the same data.

What is the difference between a Distributed Database and a Centralized Database?

While a centralized database keeps its data in storage devices that are in a single location connected to a single CPU, a distributed database system keeps its data in storage devices that are possibly located in different geographical locations and managed using a central DBMS. A centralized database is easier to maintain and keep updated, since all the data is stored in a single location. Furthermore, it is easier to maintain data integrity and avoid the need for data duplication. But all the requests to access data are processed by a single entity, such as a single mainframe, and therefore it could easily become a bottleneck. With distributed databases, this bottleneck can be avoided, since the databases are parallelized, balancing the load between several servers. But keeping the data up to date in a distributed database system requires additional work, which increases the cost of maintenance and the complexity, and also requires additional software for this purpose. Furthermore, designing databases for a distributed database is more complex than for a centralized database.


(III) BTREE & B+ TREE

PAGE 64 BTREE - DEFINITION & 5 POINTS
PAGE 69 BTREE - 7 FEATURES, 3 VARIATIONS OF BTREE, 6 ADVANTAGES, 4 DISADVANTAGES


(IV) DATA REPLICATION & DATA FRAGMENTATION

PAGE-56, BLOCK-2, DESIGN OF DISTRIBUTED DATABASES


(V) PROCEDURAL & NON-PROCEDURAL DML

Non-Procedural DML:

A high-level or non-procedural DML allows the user to specify what data is required without specifying how it is to be obtained. Many DBMSs allow high-level DML statements either to be entered interactively from a terminal or to be embedded in a general-purpose programming language.

End-users use a high-level query language to specify their requests to the DBMS to retrieve data. Usually a single statement is given to the DBMS to retrieve or update multiple records. The DBMS translates a DML statement into a procedure that manipulates the set of records. Examples of non-procedural DMLs are SQL and QBE (Query-By-Example), which are used by relational database systems. These languages are easier to learn and use. The part of a non-procedural DML that is related to data retrieval from the database is known as a query language.


Procedural DML:

A low-level or procedural DML allows the user, i.e. the programmer, to specify both what data is needed and how to obtain it. This type of DML typically retrieves individual records from the database and processes each separately. In this language, looping, branching, etc. statements are used to retrieve and process each record from a set of records. Programmers use the low-level DML embedded in a general-purpose programming language.
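
A hedged sketch of the contrast, using an illustrative employee table; the non-procedural form is a single declarative statement, while the procedural form (shown in embedded-SQL style) walks the records one at a time:

    -- Non-procedural: one statement; the DBMS decides how to find the rows.
    UPDATE employee SET salary = salary * 1.10 WHERE dept = 'SALES';

    -- Procedural (embedded-SQL style): the program fetches and processes
    -- one record at a time through a cursor, looping in the host language.
    DECLARE emp_cur CURSOR FOR
        SELECT emp_no, salary FROM employee WHERE dept = 'SALES';
    OPEN emp_cur;
    -- loop: FETCH emp_cur INTO :emp_no, :salary;
    --       UPDATE employee SET salary = :salary * 1.10
    --       WHERE CURRENT OF emp_cur;
    CLOSE emp_cur;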


ANS-5

NORMALISATION

Normalisation is the process of taking data from a problem and reducing it to a set of relations, while ensuring data integrity and eliminating data redundancy.

Data integrity - all of the data in the database is consistent and satisfies all integrity constraints.

Data redundancy - if data in the database can be found in two different locations (direct redundancy), or if data can be calculated from other data items (indirect redundancy), then the data is said to contain redundancy.

Data should only be stored once; avoid storing data that can be calculated from other data already held in the database. During the process of normalisation, redundancy must be removed, but not at the expense of breaking data integrity rules.


If redundancy exists in the database, then problems can arise when the database is in normal operation:

When data is inserted, the data must be duplicated correctly in all places where there is redundancy. For instance, if two tables in a database both contain the employee name, then creating a new employee entry requires that both tables be updated with the employee name.

When data is modified in the database, if the data being changed has redundancy, then all versions of the redundant data must be updated simultaneously. So in the employee example, a change to the employee name must happen in both tables simultaneously.

The removal of redundancy helps to prevent insertion, deletion, and update errors, since the data is then only available in one attribute of one table in the database.


2NF

A relation is in 2NF if, and only if, it is in 1NF and every non-key attribute is fully functionally dependent on the whole key.

Thus the relation is in 1NF with no repeating groups, and all non-key attributes must depend on the whole key, not just some part of it. Another way of saying this is that there must be no partial key dependencies (PKDs).

Problems arise when there is a compound key, e.g. the key to the Record relation: matric_no, subject. In this case it is possible for non-key attributes to depend on only part of the key, i.e. on only one of the two key attributes. This is what 2NF tries to prevent.

Consider again the Student relation from the flattened Student #2 table:

Student(matric_no, name, date_of_birth, subject, grade)

- There are no repeating groups, so the relation is already in 1NF.
- However, we have a compound primary key, so we must check all of the non-key attributes against each part of the key to ensure they are functionally dependent on it.
- matric_no determines name and date_of_birth, but not grade.
- subject together with matric_no determines grade, but not name or date_of_birth.
- So there is a problem with potential redundancies; a 2NF decomposition is sketched below.
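
A hedged sketch of the resulting 2NF decomposition in SQL (the types are illustrative): name and date_of_birth move to a table keyed on matric_no alone, while grade stays with the full compound key:

    CREATE TABLE student (
        matric_no     CHAR(8) PRIMARY KEY,
        name          VARCHAR(50),
        date_of_birth DATE
    );

    CREATE TABLE record (
        matric_no CHAR(8) REFERENCES student (matric_no),
        subject   VARCHAR(30),
        grade     CHAR(2),
        PRIMARY KEY (matric_no, subject)  -- grade depends on the whole key
    );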


3NF

3NF is an even stricter normal form and removes virtually all the redundant data:

- A relation is in 3NF if, and only if, it is in 2NF and there are no transitive functional dependencies.
- Transitive functional dependencies arise when one non-key attribute is functionally dependent on another non-key attribute (FD: non-key attribute -> non-key attribute), and when there is redundancy in the database.

By definition, a transitive functional dependency can only occur if there is more than one non-key field, so we can say that a relation in 2NF with zero or one non-key field must automatically be in 3NF.

What is the difference between 1NF, 2NF and 3NF?

1NF, 2NF and 3NF are normal forms that are used in relational databases to minimize redundancies in tables. 3NF is considered a stronger normal form than 2NF, which in turn is considered a stronger normal form than 1NF. Therefore, in general, obtaining a table that complies with 3NF will require decomposing a table that is in 2NF. Similarly, obtaining a table that complies with 2NF will require decomposing a table that is in 1NF. However, if a table that complies with 1NF contains candidate keys that are only made up of a single attribute (i.e. non-composite candidate keys), such a table automatically complies with 2NF. Decomposition of tables results in additional join operations (or Cartesian products) when executing queries, which increases the computational time. On the other hand, tables that comply with stronger normal forms have fewer redundancies than tables that only comply with weaker normal forms.





ANS-6: THIS WILL BE GIVEN BY RADIANT ON PAPER

ANS-7: BLOCK-1, CHAPTER-3, PAGE 70 TO PAGE-72, TOPIC 3.5 FULLY

ANS-8

i.  select d.day, d.shift
    from dutyallocation d, employee e
    where e.e_name = 'vijay' and e.emp_no = d.emp_no;

ii. select shift, count(emp_no)
    from dutyallocation
    group by shift;