Compilation - Ahmet Sayar


Dec 13, 2013


Programming Languages

Asst. Prof. Dr. Ahmet Sayar
Spring 2013
Kocaeli University
Computer Engineering Department



Language Translators and Lexical Analysis

Language Translation

Interpreters and Compilers


Layered language interfaces

Interpretation versus compilation


Compilation:
  A program called a compiler reads your program and translates it into machine code. Then the computer obeys the machine code.

Interpretation:
  A program called an interpreter looks at each line of your program in turn, works out what it means, obeys it, and then goes on to the next line.


Language Interpretation

Pure Interpretation
  One-step process, in which both the program and the input are provided to the interpreter, and the output is obtained
  Easy source-level debugging
  Much slower than compiled code
    Decoding is slow
    Same statement may be decoded many times
  More space required for symbol table
    Storage for symbol table
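The one-step process and the cost of repeated decoding can be sketched with a toy line-by-line interpreter. The three-instruction mini-language (`add`, `jumplt`, `print`) and the names `interpret` and `value` are invented for this illustration; they do not come from the slides.

```python
def value(token, env):
    """Return the value of a variable name or an integer literal."""
    return env[token] if token in env else int(token)


def interpret(program, inputs):
    """Run a toy program line by line; return (outputs, number_of_decodes)."""
    env = dict(inputs)            # program and input are handed over together
    lines = program.strip().splitlines()
    outputs = []
    decodes = 0
    pc = 0                        # index of the line to execute next
    while pc < len(lines):
        parts = lines[pc].split() # the line is decoded every time it is reached
        decodes += 1
        op = parts[0]
        if op == "add":           # add x y      ->  x = x + y
            env[parts[1]] += value(parts[2], env)
        elif op == "print":       # print x      ->  emit the value of x
            outputs.append(env[parts[1]])
        elif op == "jumplt":      # jumplt x y n ->  go to line n if x < y
            if env[parts[1]] < value(parts[2], env):
                pc = int(parts[3])
                continue
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return outputs, decodes


PROGRAM = """
add total i
add i 1
jumplt i n 0
print total
"""

outputs, decodes = interpret(PROGRAM, {"total": 0, "i": 0, "n": 3})
print(outputs, decodes)
```

The program has only four lines, yet the loop body is decoded again on every iteration, which is the overhead the slide refers to.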

Examples


Some examples of interpreted languages are PHP, BASIC, and QBASIC.

The Practical Extraction and Report Language, or Perl, is a script-based programming language whose syntax parallels that of the C language, but it is an interpreted language; Perl can optionally be compiled prior to execution into either C code or cross-platform bytecode.

JavaScript is another example of an interpreted script-based programming language.

Compiler


Compilation is a two-step process
  The original program (source program) is the input, and a new program (target program) is the output
  The target program may then be executed
Most commonly, the target language is assembly language
  The target program must be translated by an assembler into an object program
  Then linked with other object programs
  And then loaded into appropriate memory locations before it can be executed


Language Implementation

Compilation
  Programs are translated into machine language
  Generated code can be executed directly on the computer
  Linkers are used to link your program to system programs and other precompiled programs

See the link below for lexical analyzers and syntax analyzers:
http://www.pling.org.uk/cs/lsa.html

Compiler

Linking


Libraries of subroutines


From Source Code to Executable Code

program gcd(input, output);
var i, j: integer;
begin
  read(i, j);
  while i <> j do
    if i > j then i := i - j
    else j := j - i;
  writeln(i)
end.

Compilation

Hybrid Implementation System


Java uses both interpretation and compilation
  Source code is compiled into bytecode
  The JVM (Java Virtual Machine) then runs the bytecode, acting as an interpreter


Hybrid Implementation


Programs are translated into intermediate code
Intermediate code will be interpreted at runtime
Example: Perl, Java
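The same translate-then-interpret pattern can be observed in CPython, which compiles source text to bytecode and then interprets that bytecode. The sketch below is an analogy to the hybrid model, not a description of the JVM; the variable names and the sample statement are made up for illustration.

```python
import dis

source = "result = (a + b) * 2"

# Step 1: translate the source text into intermediate code (a code object
# holding bytecode). Nothing is executed yet.
code = compile(source, "<example>", "exec")

# Step 2: the virtual machine interprets the intermediate code at runtime.
namespace = {"a": 3, "b": 4}
exec(code, namespace)
print(namespace["result"])

# The intermediate instructions can be listed; the exact opcode names
# differ between Python versions, so none are assumed here.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
```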


Compiler vs. Interpretation


Compilation


Translate high-level program to machine code


Slow translation


Fast execution


Compiler vs. Interpretation


Compilers are better in terms of time efficiency
Interpreters are better in terms of memory usage
Interpreters are better in terms of exception handling


Language Definition



Language definition can be loosely divided into two parts
SYNTAX (structure): grammar of a language
  An if-statement consists of the word “if” followed by an expression inside parentheses, followed by…
SEMANTICS (meaning)
  An if-statement is executed by first evaluating its expression, which must have arithmetic or pointer type, including all side effects, and if it compares unequal to 0, ….


Syntax


The description of language syntax is one of the areas where formal definitions have gained acceptance, and the syntax of almost all languages is now given using context-free grammars

Example context-free grammar
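As one possible illustration, a small context-free grammar for arithmetic expressions can be written down and checked with a recursive-descent recognizer, where each grammar rule becomes one method. The grammar, the class name `Recognizer` and the function `is_valid` are assumptions chosen for this sketch; they are not necessarily the example shown on the original slide.

```python
import re

GRAMMAR = """
expr   -> term ('+' term)*
term   -> factor ('*' factor)*
factor -> NUMBER | '(' expr ')'
"""


class Recognizer:
    """One method per grammar rule; raises SyntaxError on a violation."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        self.term()
        while self.peek() == "+":
            self.pos += 1
            self.term()

    def term(self):
        self.factor()
        while self.peek() == "*":
            self.pos += 1
            self.factor()

    def factor(self):
        token = self.peek()
        if token is not None and token.isdigit():
            self.pos += 1
        elif token == "(":
            self.pos += 1
            self.expr()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.pos += 1
        else:
            raise SyntaxError(f"unexpected token {token!r}")


def is_valid(text):
    """Return True if text can be derived from expr in the grammar above."""
    tokens = re.findall(r"\d+|[+*()]", text)
    if "".join(tokens) != "".join(text.split()):
        return False              # text contains characters outside the grammar
    recognizer = Recognizer(tokens)
    try:
        recognizer.expr()
    except SyntaxError:
        return False
    return recognizer.pos == len(tokens)


print(is_valid("2 * (3 + 4)"), is_valid("2 + * 3"))
```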

Quicksort in Java

A programming language is a way of thinking
Different people think in different ways

qsort [] = []
qsort (x:xs) = qsort lt_x ++ mid ++ qsort ge_x
  where
    lt_x = [y | y <- xs, y < x]
    mid  = [y | y <- xs, y == x] ++ [x]
    ge_x = [y | y <- xs, y > x]


Quicksort in Haskell

Semantics


Much more complex than syntax.


Meaning can be defined in many ways.


No generally accepted method


Several notational systems for formal
definitions have been developed and are
increasingly in use.


Operational Semantics


Denotational Semantics


Axiomatic semantics

Phases of Compilation


The Structure of a Compiler

1. Lexical Analysis
2. Parsing
3. Semantic Analysis
4. Symbol Table
5. Optimization
6. Code Generation

The first 3, at least, can be understood by analogy to how humans comprehend English.


1. Lexical Analysis

First step: recognize words.
  Smallest unit above letters

This is a sentence.

Note the
  Capital “T” (start of sentence symbol)
  Blank “ ” (word separator)
  Period “.” (end of sentence symbol)


More Lexical Analysis


Lexical analysis is not trivial. Consider:

ist his ase nte nce

Plus, programming languages are typically more cryptic than English:

*p->f += -.12345e-5




And More Lexical Analysis


Lexical analyzer divides program text into “words” or “tokens”

if x == y then z = 1; else z = 2;

Units:

if, x, ==, y, then, z, =, 1, ;, else, z, =, 2, ;



2. Parsing


Once words are understood, the next step is to
understand sentence structure



Parsing = Diagramming Sentences


The diagram is a tree



Diagramming a Sentence

sentence
  subject: This (article) line (noun)
  verb: is
  object: a (article) longer (adjective) sentence (noun)


Parsing Programs


Parsing program expressions is the same
Consider:

if x == y then z = 1; else z = 2;

Diagrammed:

if-then-else
  predicate: relation (x == y)
  then-stmt: assign (z, 1)
  else-stmt: assign (z, 2)
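The diagram can be produced mechanically. The following minimal sketch parses exactly this one statement form and returns the tree as nested tuples; the function name `parse_if` and the tuple layout are assumptions for illustration, and a real parser would cover a full grammar rather than a single statement shape.

```python
import re


def parse_if(text):
    """Parse 'if a == b then x = v; else y = w;' into a nested-tuple tree."""
    # "==" is listed first so it is not split into two "=" tokens.
    tokens = re.findall(r"==|[A-Za-z_]\w*|\d+|[=;]", text)
    pos = 0

    def expect(value=None):
        """Consume the next token, optionally checking that it equals value."""
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        if value is not None and token != value:
            raise SyntaxError(f"expected {value!r}, got {token!r}")
        pos += 1
        return token

    def relation():
        left = expect()
        expect("==")
        right = expect()
        return ("relation", "==", left, right)

    def assign():
        target = expect()
        expect("=")
        value = expect()
        expect(";")
        return ("assign", target, value)

    expect("if")
    predicate = relation()
    expect("then")
    then_stmt = assign()
    expect("else")
    else_stmt = assign()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return ("if-then-else", predicate, then_stmt, else_stmt)


tree = parse_if("if x == y then z = 1; else z = 2;")
print(tree)
```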


3. Semantic Analysis

Once sentence structure is understood, we can try to understand “meaning”
  But meaning is too hard for compilers
Compilers perform limited analysis to catch inconsistencies
Some do more analysis to improve the performance of the program

Parse tree -> Semantic Analyzer -> Intermediate Program


Semantic Analysis in English


Example:

Jack said Jerry left his assignment at home.

What does “his” refer to? Jack or Jerry?



Even worse:

Jack said Jack left his assignment at home?

How many Jacks are there?

Which one left the assignment?


Semantic Analysis in Programming

Programming languages define strict rules to avoid such ambiguities

This C++ code prints “4”; the inner definition is used

{
    int Jack = 3;
    {
        int Jack = 4;
        cout << Jack;
    }
}
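One way a compiler can apply the "innermost definition wins" rule is to keep a chain of scopes and search it from the inside out. The sketch below models the two nested blocks with Python's `collections.ChainMap`; it is an illustration of the lookup rule only, not how any particular C++ compiler is implemented, and the variable names are made up.

```python
from collections import ChainMap

outer = ChainMap()             # {                 outer block opens
outer["Jack"] = 3              #   int Jack = 3;

inner = outer.new_child()      #   {               inner block opens
inner["Jack"] = 4              #     int Jack = 4; (hides the outer Jack)
printed = inner["Jack"]        #     cout << Jack; lookup starts in the inner scope
                               #   }               inner block closes
after_block = outer["Jack"]    # the outer definition is untouched

print(printed, after_block)
```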



More Semantic Analysis


Compilers perform many semantic checks besides variable bindings
Example:

Jack left her homework at home.

A “type mismatch” between her and Jack; we know they are different people
  Presumably Jack is male
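The programming counterpart is a type check: the declared type of a name must agree with the type of the value used with it. Below is a minimal sketch under strong simplifying assumptions (only `int` and `string` literals, only assignments); the helper names `literal_type` and `check_assignment` are hypothetical.

```python
def literal_type(text):
    """Classify a literal as 'int' or 'string' (the only types in this sketch)."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return "string"
    if text.lstrip("-").isdigit():
        return "int"
    raise ValueError(f"unrecognized literal: {text}")


def check_assignment(declared, name, literal):
    """Return None if the assignment type-checks, otherwise an error message."""
    if name not in declared:
        return f"'{name}' is not declared"
    expected = declared[name]
    actual = literal_type(literal)
    if expected != actual:
        return f"type mismatch: '{name}' is {expected}, value is {actual}"
    return None


declared = {"count": "int", "name": "string"}
print(check_assignment(declared, "count", "10"))
print(check_assignment(declared, "count", '"ten"'))
```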


4. Symbol Table

After lexical analysis and parsing, the symbol table is created.
The lexical analyzer gives the symbol table as an output.
Tokens for a Pascal statement:

toplam:=değer+10;
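As a rough illustration of what the lexical analyzer could emit for the statement `toplam:=değer+10;`, the sketch below produces a token list and a symbol table that maps each identifier to an index. The token category names, the table layout and the function `analyze` are assumptions; the layout of the table on the original slide is not known.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<IDENT>[^\W\d]\w*)"   # a letter or underscore, then letters or digits (Unicode-aware)
    r"|(?P<NUMBER>\d+)"
    r"|(?P<ASSIGN>:=)"
    r"|(?P<PLUS>\+)"
    r"|(?P<SEMI>;)"
    r"|(?P<SKIP>\s+)"
)


def analyze(statement):
    """Return (tokens, symbol_table) for a simple Pascal-like statement."""
    tokens = []
    symbol_table = {}           # identifier -> index in order of first appearance
    pos = 0
    while pos < len(statement):
        match = TOKEN_RE.match(statement, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {statement[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        if kind == "IDENT" and lexeme not in symbol_table:
            symbol_table[lexeme] = len(symbol_table)
        if kind != "SKIP":
            tokens.append((kind, lexeme))
        pos = match.end()
    return tokens, symbol_table


tokens, symbol_table = analyze("toplam:=değer+10;")
for kind, lexeme in tokens:
    print(f"{lexeme:8} {kind}")
print(symbol_table)
```

Only identifiers enter the symbol table in this sketch; operators, numbers and punctuation appear in the token stream alone.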


5. Optimization


No strong counterpart in English, but akin to editing
Automatically modify programs so that they
  Run faster
  Use less memory
  In general, conserve some resource

Result = function1(a+b) + function2(a+b)

How do you optimize this code?
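One likely answer is common subexpression elimination: `a+b` is computed twice, so it can be computed once and reused. The sketch below shows the before and after forms; the bodies of `function1` and `function2` are placeholders invented for illustration, since the slide does not define them.

```python
def function1(value):
    return value * 2      # placeholder body, not from the slides


def function2(value):
    return value + 1      # placeholder body, not from the slides


def unoptimized(a, b):
    # a + b is evaluated twice
    return function1(a + b) + function2(a + b)


def optimized(a, b):
    # the common subexpression is evaluated once and reused
    common = a + b
    return function1(common) + function2(common)


print(unoptimized(3, 4), optimized(3, 4))
```

The rewrite is safe here because evaluating `a + b` has no side effects, so doing it once or twice gives the same result.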

Machine independent optimization


Code motion
  Invariant expressions should be executed only once
  E.g.

for (int i = 0; i < x.length; i++)
    x[i] *= Math.PI * Math.cos(y);

can be rewritten as

double picosy = Math.PI * Math.cos(y);
for (int i = 0; i < x.length; i++)
    x[i] *= picosy;
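The same transformation can be tried out in Python to confirm that hoisting the invariant expression leaves the results unchanged. This is a hand-written sketch of what the optimizer would do automatically; the function names `scale_naive` and `scale_hoisted` are made up for illustration.

```python
import math


def scale_naive(x, y):
    """Recompute the loop-invariant product on every iteration."""
    for i in range(len(x)):
        x[i] *= math.pi * math.cos(y)
    return x


def scale_hoisted(x, y):
    """Compute the loop-invariant product once, before the loop."""
    picosy = math.pi * math.cos(y)
    for i in range(len(x)):
        x[i] *= picosy
    return x


print(scale_naive([1.0, 2.0, 3.0], 0.5) == scale_hoisted([1.0, 2.0, 3.0], 0.5))
```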


Machine independent optimization


Multiplying a number by a power of 2
  Shift to the left
Dividing a number by a power of 2
  Shift to the right
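The equivalence can be checked directly for integers. In the sketch below, `times_pow2` and `div_pow2` are hypothetical helper names; note that the right shift matches Python's floor division `//`, whereas in languages where signed division truncates toward zero (such as C), negative operands need extra care.

```python
def times_pow2(n, k):
    """Multiply the integer n by 2**k using a left shift."""
    return n << k


def div_pow2(n, k):
    """Floor-divide the integer n by 2**k using a right shift."""
    return n >> k


print(times_pow2(5, 3), 5 * 8)
print(div_pow2(40, 3), 40 // 8)
```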

Machine dependent optimization


Issues


Compiling is almost this simple, but there are many pitfalls.
Example: How are erroneous programs handled?
Language design has a big impact on the compiler
  Determines what is easy and hard to compile
  Course theme: many trade-offs in language design



Compilers Today


The overall structure of almost every compiler adheres to this outline
The proportions have changed since FORTRAN
  Early: lexing, parsing most complex, expensive
  Today: optimization dominates all other phases, lexing and parsing are cheap