GCC Hacks - Alexey Smirnov

GRC’06

http://gcchacks.info

Introduction

The GNU Compiler Collection includes compilers for C, C++, Java, and other languages, along with their libraries

Standard compiler for Linux

Latest release: GCC 4.1

http://gcc.gnu.org

Introduction

GEM: compiler extensibility framework

Examples: syntactic sugar, BCC, Propolice, etc.

Dynamically loaded modules simplify development and deployment

Overview

GCC 3.4 Tutorial

GEM Overview

Hacks, hacks, hacks.

GCC Architecture

Driver program gcc: finds the appropriate compiler, then calls the compiler, assembler, and linker

C language: cc1, as, collect2

This presentation: cc1

GCC Architecture

Front end, middle end, back end.

Representations

AST: abstract syntax tree

RTL: register transfer language

Object: assembly code of the target platform

Other representations are used for optimizations

GCC Initialization

cc1 is both the preprocessor and the compiler

toplev.c: toplev_main() handles command-line option processing, front end/back end initialization, and global scope creation

The front end is initialized with:

standard types: char_type_node, integer_type_node, unsigned_type_node

built-in functions: builtin_memcpy, builtin_strlen

These objects are instances of tree.

Tree data type

Each tree node has a code and operands.

MODIFY_EXPR: an assignment expression. TREE_OPERAND(t,0) is the left-hand side, TREE_OPERAND(t,1) is the right-hand side.

ARRAY_TYPE: an array type. TREE_TYPE(t) is the type of the array element, TYPE_DOMAIN(t) is the type of the index.

CALL_EXPR: a function call. TREE_OPERAND(t,0) is the function being called, TREE_OPERAND(t,1) is the list of function arguments.

debug_tree() prints out the AST
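The "code plus operands" layout can be tried outside the compiler. The following is a minimal sketch, not GCC's actual definition: struct node, NODE_CODE, NODE_OPERAND, make_assignment, and dump_node are hypothetical stand-ins for tree, TREE_CODE, TREE_OPERAND, the tree builders, and debug_tree(). Only the three code names are borrowed from GCC.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node codes; GCC defines its real ones in tree.def. */
enum node_code { INTEGER_CST, VAR_DECL, MODIFY_EXPR };

/* Simplified stand-in for GCC's tree: a code plus up to two operands. */
struct node {
    enum node_code code;
    struct node *operands[2];
    const char *name;   /* used by VAR_DECL only */
    long value;         /* used by INTEGER_CST only */
};

#define NODE_CODE(t)       ((t)->code)
#define NODE_OPERAND(t, i) ((t)->operands[i])

static struct node *make_node(enum node_code code) {
    struct node *t = calloc(1, sizeof *t);
    if (t == NULL) abort();
    t->code = code;
    return t;
}

/* Build the tree for "name = value". */
static struct node *make_assignment(const char *name, long value) {
    struct node *lhs = make_node(VAR_DECL);
    struct node *rhs = make_node(INTEGER_CST);
    struct node *t = make_node(MODIFY_EXPR);
    lhs->name = name;
    rhs->value = value;
    NODE_OPERAND(t, 0) = lhs;
    NODE_OPERAND(t, 1) = rhs;
    return t;
}

/* Render a node as text, loosely in the spirit of debug_tree(). */
static void dump_node(const struct node *t, char *buf, size_t size) {
    switch (NODE_CODE(t)) {
    case INTEGER_CST:
        snprintf(buf, size, "%ld", t->value);
        break;
    case VAR_DECL:
        snprintf(buf, size, "%s", t->name);
        break;
    case MODIFY_EXPR: {
        char lhs[64], rhs[64];
        dump_node(NODE_OPERAND(t, 0), lhs, sizeof lhs);
        dump_node(NODE_OPERAND(t, 1), rhs, sizeof rhs);
        snprintf(buf, size, "(%s = %s)", lhs, rhs);
        break;
    }
    }
}
```

Calling make_assignment("x", 5) and passing the result to dump_node() yields the text "(x = 5)". The real tree type is a union with many more fields; the sketch keeps only what the slide mentions.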

Parser

Identifier after identifier

get_identifier(): char* -> tree with IDENTIFIER_NODE code.

A declaration is a tree node with a _DECL code. lookup_name() returns the declaration corresponding to the symbol.

Symbol table not constructed.
C_DECL_INVISIBLE attribute used instead.
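The char* -> tree mapping performed by get_identifier() is an interning step: the same spelling always maps to the same node, so identifiers can be compared by pointer. A minimal sketch of that idea follows. The names struct ident and intern_identifier are hypothetical, and the linear list is an assumption made for brevity (GCC itself uses a hash table).

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an IDENTIFIER_NODE. */
struct ident {
    char *name;
    struct ident *next;
};

static struct ident *ident_table = NULL;

/* Return the unique node for `name`, creating it on first use. */
static struct ident *intern_identifier(const char *name) {
    struct ident *id;
    for (id = ident_table; id != NULL; id = id->next) {
        if (strcmp(id->name, name) == 0)
            return id;   /* already interned: reuse the node */
    }
    id = malloc(sizeof *id);
    if (id == NULL) abort();
    id->name = malloc(strlen(name) + 1);
    if (id->name == NULL) abort();
    strcpy(id->name, name);
    id->next = ident_table;
    ident_table = id;
    return id;
}
```

Two calls with "add" return the same pointer, while "sub" gets a node of its own.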

AST to RTL to assembly

start_decl() / finish_decl()

start_function() / finish_function()

tree build_function_call(tree function, tree params)

When a function is parsed, it is converted to RTL either immediately or after the whole file is parsed. This is controlled by the option -funit-at-a-time; see finish_function().

Assembly code is generated from RTL. output_asm_insn() is executed for each instruction.

GEM Framework

The idea is similar to that of LSM

A module is loaded using an option:

-fextension-module=test.gem

Hooks throughout GCC code


AST


Assembly output


New hooks added when needed

GEM Framework Hooks

gem_handle_option

gem_c_common_nodes_and_builtins

gem_macro_name, gem_macro_def

gem_start_decl, gem_start_func

gem_finish_function

gem_output_asm_insn
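The hook mechanism can be pictured as a table of function pointers that the compiler consults at fixed points. The sketch below is an illustration under stated assumptions, not GEM's real interface: struct hooks, its field names, module_init, and the compiler_* functions are all hypothetical, and the "module" is an ordinary function rather than a dynamically loaded file.

```c
#include <stddef.h>

/* Hypothetical hook table; the field names are illustrative, not GEM's. */
struct hooks {
    void (*on_start_decl)(const char *name);
    void (*on_finish_function)(const char *name);
};

static struct hooks active_hooks = { NULL, NULL };
static int decl_count = 0;

/* A "module" is code that fills in the hooks it cares about. */
static void count_decl(const char *name) {
    (void)name;
    decl_count++;
}

static void module_init(struct hooks *h) {
    h->on_start_decl = count_decl;
}

/* The compiler calls through a hook only if a module installed one. */
static void compiler_start_decl(const char *name) {
    if (active_hooks.on_start_decl != NULL)
        active_hooks.on_start_decl(name);
    /* ... the compiler's normal declaration handling would follow ... */
}

static void compiler_finish_function(const char *name) {
    if (active_hooks.on_finish_function != NULL)
        active_hooks.on_finish_function(name);
}
```

Before module_init(&active_hooks) runs, compiler_start_decl() behaves as if no extension exists. Afterwards every declaration increments decl_count, while the uninstalled on_finish_function hook stays a no-op.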

Traversing an AST

walk_tree

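static tree callback(tree *tp, …) {
  switch (TREE_CODE(*tp)) {
  case CALL_EXPR:
    /* handle a function call */
    break;
  case VAR_DECL:
    /* handle a variable declaration */
    break;
  }
  return NULL_TREE;  /* NULL_TREE means: keep walking */
}

walk_tree(&t, callback, NULL, NULL);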
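The callback pattern above can be tried on a toy tree. This is a minimal sketch with hypothetical names (struct wnode, walk, count_calls, and the W_ codes); it assumes a pre-order walk over at most two operands and, as with walk_tree, stops as soon as the callback returns a non-NULL node.

```c
#include <stddef.h>

/* Hypothetical node codes for the toy tree. */
enum wcode { W_INTEGER_CST, W_VAR_DECL, W_CALL_EXPR, W_MODIFY_EXPR };

struct wnode {
    enum wcode code;
    struct wnode *operands[2];
};

typedef struct wnode *(*walk_fn)(struct wnode **tp, void *data);

/* Pre-order walk: visit the node, then each operand. Stops and returns
   the callback's result as soon as that result is non-NULL. */
static struct wnode *walk(struct wnode **tp, walk_fn fn, void *data) {
    struct wnode *found;
    int i;
    if (*tp == NULL)
        return NULL;
    found = fn(tp, data);
    if (found != NULL)
        return found;
    for (i = 0; i < 2; i++) {
        found = walk(&(*tp)->operands[i], fn, data);
        if (found != NULL)
            return found;
    }
    return NULL;
}

/* Callback that counts call expressions and never stops the walk. */
static struct wnode *count_calls(struct wnode **tp, void *data) {
    if ((*tp)->code == W_CALL_EXPR)
        ++*(int *)data;
    return NULL;
}
```

Passing a pointer to the node, rather than the node itself, is what lets a callback replace a subtree in place, which the function-overloading hack relies on.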

Creating trees

t = build_int_2(val, 0);

build1(ADDR_EXPR, build_pointer_type(T_T(t)), t);

build(MODIFY_EXPR, TREE_TYPE(left), left, val);

Hacks

Syntactic sugar

Operating systems

Security


Syntactic Sugar

When a compiler error occurs, fix the compiler rather than the program.

Examples:


Function overloading as in C++


toString() in each structure as in Java


Invoke a block of code from a function, as in Ruby


Use functions to initialize a variable


Default argument values

Security

DIRA: detection, identification, and
repair of control hijacking attacks

PASAN: signature and patch generation

Propolice: -fstack-protector

Operating Systems

Dusk: develop in userland, install at kernel level.


Function Overloading

Two functions:


void add(int, int);


void add(int, char*);

The idea is to replace each function name with one that includes the argument types:


add_i_i


add_i_pch

gem_start_decl()

gem_start_function()

gem_build_function_call()
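The renaming scheme can be sketched as a small string-building routine. The suffixes "i" for int and "pch" for char* come from the slide. Everything else is an assumption for illustration: enum arg_kind, arg_suffix, and mangle_name are hypothetical, and a real module would derive the argument kinds from type trees rather than an enum.

```c
#include <stdio.h>

/* Hypothetical argument kinds; a real module would inspect type trees. */
enum arg_kind { ARG_INT, ARG_CHAR_PTR };

/* Suffixes follow the slide: int -> "i", char* -> "pch". */
static const char *arg_suffix(enum arg_kind kind) {
    switch (kind) {
    case ARG_INT:      return "i";
    case ARG_CHAR_PTR: return "pch";
    }
    return "?";
}

/* Write base plus "_<suffix>" for each argument into out.
   Returns 0 on success, -1 if out is too small. */
static int mangle_name(const char *base, const enum arg_kind *args,
                       size_t nargs, char *out, size_t size) {
    size_t used, i;
    int n = snprintf(out, size, "%s", base);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;
    for (i = 0; i < nargs; i++) {
        n = snprintf(out + used, size - used, "_%s", arg_suffix(args[i]));
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

With base "add", the argument list (int, int) produces add_i_i and (int, char*) produces add_i_pch, matching the two names on the slide.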

Alias Each Declaration

cfo_find_symtab(&t_func, func_name);

if (t_func == NULL_TREE || DECL_BUILT_IN(t_func)) { return; }

If found, then alias and create a new declaration.

Alias Each Declaration

strcpy(new_name, func_name);

strcat(new_name, cfo_build_name(TREE_PURPOSE(T_O(declarator, 1))));

cfo_find_symtab(&t_func_alias, name);

If not found:

t_alias_attr = tree_cons(get_identifier("alias"),
    tree_cons(NULL_TREE, get_identifier(name), NULL_TREE), NULL_TREE);
TYPE_ATTRIBUTES(T_T(t_func)) = t_alias_attr;
DECL_ATTRIBUTES(t_func) = t_alias_attr;

T_O(declarator, 0) = get_identifier(new_name);


Replace function calls

name = cfo_build_decl_name(t_func, t_parm);

t_new_func = get_identifier(name);

if (t_new_func) { t_new_func = lookup_name(t_new_func); }

*func = t_new_func;
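The call-site rewrite amounts to a lookup followed by a pointer swap. Here is a minimal sketch with hypothetical names: struct decl, find_decl, and redirect_call stand in for declaration trees, lookup_name(), and the hook body. Unlike the slide code, the sketch leaves the call untouched when no type-specific declaration exists; that fallback is an assumption, not something the slide states.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a declaration found by lookup_name(). */
struct decl {
    const char *name;
};

/* Find a declaration by name in a table; NULL when absent. */
static const struct decl *find_decl(const struct decl *table, size_t n,
                                    const char *name) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* Point *func at the declaration for the type-specific name if one
   exists; otherwise leave the call as it was.
   Returns 1 when the call was rewritten, 0 otherwise. */
static int redirect_call(const struct decl **func, const struct decl *table,
                         size_t n, const char *mangled) {
    const struct decl *target = find_decl(table, n, mangled);
    if (target == NULL)
        return 0;
    *func = target;
    return 1;
}
```

Given declarations for add, add_i_i, and add_i_pch, a call whose arguments mangle to add_i_pch is redirected to that declaration, while a name with no declaration leaves the original target in place.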


Conclusion

GCC is a big program, so we thought it would be a good idea to document it:


http://en.wikibooks.org/GNU_C_Compiler_Internals

GEM makes it possible to implement GCC extensions.

http://www.ecsl.cs.sunysb.edu/gem

Examples: programming languages, security, OS.

Thank you



http://gcchacks.info