
yakconspiracySoftware and s/w Development

Dec 14, 2013 (3 years and 3 months ago)


DS4: Advanced Data Objects



DS 4: Advanced Data Objects


Review


Structured data types (fixed or variable size, heterogeneous or homogeneous, array or something else such as a tree or list)


Strings (separate type or array of characters, static or dynamic)


Arrays (index types, allocation and sizing timing issues, slices)


Records (variant records)


Sets (size limits, binary encoding)

Synopsis



Pointers, Executable Data Objects, Files, Data Encapsulation & Abstraction

References



Pratt & Zelkowitz 5, 6, 7.1


Pointer Types

A pointer is a data type whose values are memory addresses (or the special value 'nil'). Pointers are mainly used:

to provide indirect addressing (to hold a memory location for parameters/arrays)

combined with dynamic allocation/deallocation operators, to provide dynamic storage management


Pointer operations:



referencing: setting a pointer to point at a memory location (either the same as
another pointer, or newly allocated memory)



dereferencing: obtaining, not the address given by the pointer, but the contents of that address



allocation: specifically setting the pointer to reference newly allocated memory



deallocation: specifically freeing up dynamic memory (and un-pointing the pointer)
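
The four operations above can be sketched in C as follows. This is only an illustrative sketch: the function name `pointer_ops_demo` and the value 42 are invented for the example, and `malloc`/`free` stand in for the generic allocation and deallocation operators.

```c
#include <stdlib.h>

/* Walks through the four pointer operations; returns the value read back,
   or -1 if allocation fails. (Hypothetical helper, for illustration only.) */
int pointer_ops_demo(void)
{
    int *p = malloc(sizeof *p);   /* allocation: p references new memory */
    if (p == NULL)
        return -1;

    int *q = p;                   /* referencing: q points where p points */
    *p = 42;                      /* dereferencing (write) through p */
    int value = *q;               /* dereferencing (read) through q */

    free(p);                      /* deallocation of the dynamic memory */
    p = NULL;                     /* "un-point" both pointers so neither */
    q = NULL;                     /* still refers to the freed memory */
    return value;
}
```

Note that `free` itself does not un-point anything in C; resetting `p` and `q` to `NULL` is left to the programmer.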



Problems:

Misdirection of pointers is easy to create and difficult to debug. Specific forms this takes are:



dangling pointers: the memory location has been deallocated, but a pointer still references it (some languages provide checks in the compiled code to prevent this).

a = new Person();

a = null;   // a no longer refers to the Person

a.name = "bob";   // error: access through a pointer that refers to nothing



lost objects: allocated memory to which no pointer any longer points, so the allocated memory cannot be referenced, even to deallocate it.



Garbage collection: deallocation of dynamic memory and the associated pointers leads to fragmentation of memory. Thus deallocation schemes require a garbage management mechanism.
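
As a rough sketch of how the first two problems can be avoided by convention in C, the hypothetical helper `release` below frees a block and resets the caller's pointer in one step. Both function names are assumptions made for this example, not part of any library.

```c
#include <stdlib.h>

/* Frees *pp and resets it to NULL, so the caller is not left holding a
   dangling pointer. (Hypothetical helper.) */
void release(int **pp)
{
    free(*pp);
    *pp = NULL;
}

/* Returns 1 if the sequence completed with no dangling pointer and no
   lost object, 0 if allocation failed. */
int safe_lifetime_demo(void)
{
    int *a = malloc(sizeof *a);
    if (a == NULL)
        return 0;
    *a = 7;

    /* Writing "a = malloc(sizeof *a);" at this point would overwrite the
       only pointer to the first block: a lost object that can never be
       freed. */

    release(&a);          /* a is now NULL rather than dangling */
    return a == NULL;
}
```

This convention only protects the pointer passed to `release`; any other copy of the same address would still dangle.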



In addition to the operations given above, C permits pointer arithmetic (adding or subtracting multiples of the size of the item pointed to), allowing arrays to be manipulated with pointers.
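
A minimal sketch of array traversal by pointer arithmetic; the function name `sum` is chosen for the example only.

```c
#include <stddef.h>

/* Sums n ints by stepping a pointer through the array. */
int sum(const int *first, size_t n)
{
    int total = 0;
    for (const int *p = first; p != first + n; ++p)
        total += *p;      /* ++p advances by sizeof(int) bytes, not by 1 */
    return total;
}
```

Here `first + n` is the address one past the last element, which C allows a pointer to hold for comparison but not to dereference.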


Alternatives to Pointers

In most imperative languages, pointers are the riskiest, most error-prone parts of the language. Yet they are normally used to implement only a small range of structures: lists, trees, rings, stacks.

Some languages provide standard implementations of these structures, either as part of the language or as part of their standard libraries.


Executable Data Objects

In C, Java, Perl and Clean, there is a clear distinction between code and data. However, this is not true of all languages. In Lisp, almost everything, including executable code, is a list. In most logic programming languages, executable code is built up out of structures (i.e. the logic programming analogue of records), and a structure may be passed to a special predicate (rule) for execution.



Files

The file types supported by a language usually fall into one of the following classes:

Sequential Files: Files that may only be accessed record by record, and only for either reading or writing. In many languages, each element of the file corresponds to one data object (which may be an array or object). Usually the data object may not contain pointers (which become meaningless once written to a file).

Text Files: A file of characters, but also usually provided with operations for reading or writing numbers.

Interactive File Input-Output: Not usually supported by languages; the problems are being in both read and write mode at once, buffering, and end-of-file.

Direct Access Files: Each data object in the file is assigned a key. A particular record may be read or written by specifying that key.

Indexed Sequential Files: As for direct access files, but files may also be read sequentially, starting from the location that was previously accessed.
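
The difference between sequential and direct access can be sketched in C with fixed-size records. This is a simplification under stated assumptions: the record's position in the file plays the role of the key, and the names `struct record` and `read_nth` are invented for the example.

```c
#include <stdio.h>

struct record { int key; int mark; };   /* contains no pointers */

/* Writes n records sequentially to a scratch file, then reads record
   number `index` back directly. Returns 0 on success, -1 on any I/O error
   (including an index past the end of the file). */
int read_nth(const struct record *recs, size_t n, size_t index,
             struct record *out)
{
    FILE *f = tmpfile();                 /* temporary binary file */
    if (f == NULL)
        return -1;

    int ok = fwrite(recs, sizeof *recs, n, f) == n       /* sequential write */
          && fseek(f, (long)(index * sizeof *recs),
                   SEEK_SET) == 0                        /* jump to record */
          && fread(out, sizeof *out, 1, f) == 1;         /* direct read */
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling `fread` again after the direct read would return the following record, which is roughly the behaviour described for indexed sequential files.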


Encapsulation

Say a program is being constructed to deal with a database organising the academy. It would be sensible to first define the major data components: a student, a staff member, a course, a list of students, a list of staff members, a list of courses. Then the common operations could be defined: enter a student, delete a student, assign/remove a staff member to a course, assign/remove a student to a course.

Ideally we would like the various data structures to be handled only as parameters passed to subprograms, so that once the subprograms are written, it is no longer necessary to know the structure of the data types. So we would develop a number of abstract data types, for example a student, that may be added to a list, removed from a list, added to a course, printed, or assigned a mark for a course.

Once the basic code is written, ideally there should be no obvious difference between an abstract data type and one of the provided data types.

So to develop an abstract data type, a language must provide



a way to define data objects

a way to define abstract operations on those objects




a way to encapsulate the data so it may only be manipulated by abstract operations.

Example of Abstract Data Type

The abstract data type 'stack' is a standard example in textbooks. It is a relatively widely used data type that is rarely provided as a built-in type in imperative or object-oriented languages, though it is very close to the list data type provided in most functional and logic languages.

A stack may be defined via the abstract operations:

create(stack)           creates/initialises a stack

destroy(stack)          deallocates storage for the stack

empty(stack)            boolean function: true if stack empty, else false

push(stack, element)    pushes the element onto the stack

pop(stack)              removes top element from the stack

top(stack)              returns top element of the stack

The following might turn up as code fragments:

stk1, stk2: stack of colour;

colour1, colour2, temp: colour;

.

push(stk1, colour1);

push(stk2, colour2);

if (not empty(stk1))

then temp := top(stk1);

.

push(stk2, temp);


You might decide to implement the stack data type using arrays:

#define MAX 100   /* capacity: an assumed value, not given in the original */

struct stack {
    int size;
    int elements[MAX];
};

int empty(struct stack current) { return current.size == 0; }

int main(void)

{ struct stack myStack; myStack.size = 0; if (empty(myStack)) … }

However, you might find, because of the way this particular program is used, that memory management of an array implementation becomes too expensive.






Because of the data abstraction, it is possible to change the underlying representation to a linked list without altering the rest of the program.

struct stack {
    linkedlist *head;   /* linkedlist: the list node type, assumed to be defined elsewhere */
};

int empty(struct stack current) { return current.head == NULL; }

int main(void)

{ struct stack myStack; myStack.head = NULL; if (empty(myStack)) … }

Some languages support data abstraction explicitly (e.g. Java, with a class); in others it can be implemented implicitly (e.g. as we are doing with C).

However, in languages with implicit abstract data types, the data type implementor cannot be certain that the data type user does not refer directly to the underlying representation. This would only be picked up when the using program failed to compile with the new representation.


int main(void) { struct stack myStack; if (myStack.size == 0) … }
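
One common way to get closer to explicit hiding in C is an incomplete ("opaque") type. The sketch below is an assumption-laden illustration rather than the approach taken in these notes: the int element type, the `CAPACITY` constant, and the client function `second_from_top` are all invented for the example, and the three sections would normally live in a header, a client file, and an implementation file.

```c
#include <stdlib.h>

/* --- interface: all that a client sees -------------------------------- */
typedef struct stack stack;          /* incomplete type: layout hidden */
stack *create(void);
void   destroy(stack *s);
int    empty(const stack *s);
int    push(stack *s, int element);  /* returns 0 if the stack is full */
void   pop(stack *s);
int    top(const stack *s);          /* caller must check empty() first */

/* --- client: compiled before the layout is known, so writing
       s->size here would be a compile-time error ----------------------- */
int second_from_top(stack *s)        /* needs at least two elements */
{
    int saved = top(s);
    pop(s);
    int result = top(s);
    push(s, saved);
    return result;
}

/* --- implementation: normally in its own .c file ---------------------- */
#define CAPACITY 100                 /* assumed fixed capacity */
struct stack { int size; int elements[CAPACITY]; };

stack *create(void)          { return calloc(1, sizeof(stack)); }
void   destroy(stack *s)     { free(s); }
int    empty(const stack *s) { return s->size == 0; }
int    push(stack *s, int element)
{
    if (s->size == CAPACITY)
        return 0;
    s->elements[s->size++] = element;
    return 1;
}
void   pop(stack *s)         { if (s->size > 0) s->size--; }
int    top(const stack *s)   { return s->elements[s->size - 1]; }
```

Because clients hold only a `stack *`, the array could be swapped for a linked list without recompiling them, which ties in with the pointer-type restriction discussed under design issues.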


Design Issues for Data Abstraction



must provide a syntactic unit to encapsulate the type definitions



must provide subprogram definitions of the abstraction operators



must be possible to hide the representation



must be possible to make the abstraction operators externally visible

The language should not automatically provide operations for ADTs other than the
bare essentials.


For assignment, languages either

i) treat it as an operator (so that the implementor can overload it),

ii) provide it as part of the language, or

iii) force the programmer to use conventional code.

Person a;

Person b;

a = b; // can = be defined by the programmer?

procedure "=" (Person a, Person b)

{ a.name = b.name; a.age = b.age; a.income = b.income; }
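
C cannot overload `=`, so a C programmer who wants anything other than the built-in memberwise copy falls under option iii. A minimal sketch, assuming a `Person` whose name is a heap-allocated string; the type layout and the function name `person_assign` are assumptions for the example.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;      /* heap-allocated (or NULL), owned by the Person */
    int   age;
    int   income;
} Person;

/* Conventional-code assignment: copies *b into *a, duplicating the name so
   the two Persons do not share storage. Returns 0 if allocation fails,
   in which case *a is left unchanged. */
int person_assign(Person *a, const Person *b)
{
    char *copy = malloc(strlen(b->name) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, b->name);

    free(a->name);           /* release a's old name */
    a->name   = copy;
    a->age    = b->age;
    a->income = b->income;
    return 1;
}
```

By contrast, the language-provided `*a = *b` (option ii) copies the `name` pointer itself, leaving both records sharing one buffer.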


Other issues



Should the encapsulation construct define only one ADT, or permit multiple
definitions in one unit?



Should there be any restriction on the data types that can be abstract? (Restriction to pointer types can avoid recompilation of calling units, since the representation of the pointer does not change, only the representation of what it points to.)



Can ADTs be generic (as the stack construct above was), or should we have to define separate ADTs for stacks of colour, of integer, of char....?
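
C has no built-in generic types, but one stack definition can serve several element types if elements are copied as raw bytes. The following is a sketch under that assumption; the `gstack` name, its fields, and its functions are all hypothetical, and type safety is given up in exchange for genericity.

```c
#include <stdlib.h>
#include <string.h>

/* A stack of elements of any one type, fixed at creation by its size. */
typedef struct {
    size_t elem_size;   /* bytes per element */
    size_t count;       /* elements currently stored */
    size_t capacity;    /* elements the buffer can hold */
    char  *data;
} gstack;

/* Returns 1 on success, 0 if the buffer could not be allocated. */
int gstack_create(gstack *s, size_t elem_size, size_t capacity)
{
    s->elem_size = elem_size;
    s->count = 0;
    s->capacity = capacity;
    s->data = malloc(elem_size * capacity);
    return s->data != NULL;
}

/* Copies *elem onto the stack; returns 0 if the stack is full. */
int gstack_push(gstack *s, const void *elem)
{
    if (s->count == s->capacity)
        return 0;
    memcpy(s->data + s->count * s->elem_size, elem, s->elem_size);
    s->count++;
    return 1;
}

/* Copies the top element into *out and removes it; returns 0 if empty. */
int gstack_pop(gstack *s, void *out)
{
    if (s->count == 0)
        return 0;
    s->count--;
    memcpy(out, s->data + s->count * s->elem_size, s->elem_size);
    return 1;
}

void gstack_destroy(gstack *s) { free(s->data); s->data = NULL; }
```

The same code then backs a stack of `char` (`gstack_create(&s, sizeof(char), 16)`) or of `int`, though the compiler cannot check that pushes and pops use the matching type.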



Review



Pointers (indirect addressing/memory allocation & deallocation)



Executable Data Objects (functional & logic languages)



Files (text/data types, simultaneous IO, indexed, indexed sequential)



Issues for Data Encapsulation Mechanisms (implicit/explicit, information hiding, restriction to pointers, generic)