Reverse Engineering of Design Patterns from Java Source Code

Nija Shi (shini@cs.ucdavis.edu)
Ron Olsson (olsson@cs.ucdavis.edu)

UC Davis, ASE 2006

Outline

- Design patterns vs. reverse engineering
- Reclassification of design patterns
- Pattern detection techniques
- PINOT
- Ongoing and future work


Design Patterns

- A design pattern offers guidelines on when, how, and why an implementation can be created to solve a general problem in a particular context.
  -- Design Patterns: Elements of Reusable Object-Oriented Software, Gang of Four (GoF)
- A few well-known uses
  - Singleton: Java AWT’s (GUI toolkit) Toolkit class
  - Proxy: CORBA’s (middleware) proxy and real objects
  - Chain of Responsibility: Tomcat’s (application server) request handlers


Reverse Engineering of Design Patterns

[Figure: UML class diagram of part of Java AWT.]

Toolkit
    private static Toolkit toolkit;
    public static synchronized Toolkit getDefaultToolkit()
    protected abstract ButtonPeer createButton(Button target)
    protected abstract TextFieldPeer createTextField(TextField target)
    protected abstract LabelPeer createLabel(Label target)
    protected abstract ScrollbarPeer createScrollbar(Scrollbar target)

Other classes in the diagram: Component, TextField, Button, Label, ComponentPeer, TextFieldPeer, LabelPeer, ButtonPeer, Container, LayoutManager (association layoutMgr, multiplicity 1 to 1).
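The Toolkit excerpt combines two shapes that a pattern detector would want to recover: a static accessor guarding one shared instance (Singleton) and a family of abstract creation methods (Abstract Factory). The following is a minimal sketch of that shape only. MiniToolkit, PlainToolkit, and ButtonPeerLike are hypothetical stand-ins invented for illustration; they are not the real java.awt classes.

```java
// Hypothetical stand-ins for the AWT types in the diagram; not the real java.awt API.
interface ButtonPeerLike {
    String describe();
}

abstract class MiniToolkit {
    // Singleton part: one shared instance behind a static accessor.
    private static MiniToolkit toolkit;

    public static synchronized MiniToolkit getDefaultToolkit() {
        if (toolkit == null) {
            toolkit = new PlainToolkit();
        }
        return toolkit;
    }

    // Abstract Factory part: subclasses decide which peer objects to create.
    protected abstract ButtonPeerLike createButton(String label);
}

class PlainToolkit extends MiniToolkit {
    @Override
    protected ButtonPeerLike createButton(String label) {
        return () -> "plain button: " + label;
    }
}

class MiniToolkitDemo {
    public static void main(String[] args) {
        MiniToolkit a = MiniToolkit.getDefaultToolkit();
        MiniToolkit b = MiniToolkit.getDefaultToolkit();
        System.out.println(a == b); // both calls return the same object
        System.out.println(a.createButton("OK").describe());
    }
}
```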


Representative Current Approaches

Tools            | Language  | Techniques                       | Case Study                               | Patterns Targeted
-----------------|-----------|----------------------------------|------------------------------------------|------------------
SPOOL            | C++       | Database query                   | ET++                                     | Template Method, Factory Method, Bridge
DP++             | C++       | Database query                   | DTK                                      | Composite, Flyweight, Class Adapter
Vokac et al.     | C++       | Database query                   | SuperOffice CRM                          | Singleton, Template Method, Observer, Decorator
Antoniol et al.  | C++       | Software metric                  | Leda, libg++, socket, galib, groff, mec  | Adapter, Bridge
SPQR             | C++       | Formal semantic                  | test programs                            | Decorator
Balanyi et al.   | C++       | XML matching                     | Jikes, Leda, Star Office Calc, Writer    | Builder, Factory Method, Prototype, Bridge, Proxy, Strategy, Template Method
PTIDEJ           | Java      | Constraint solver                | java.awt.*, java.net.*                   | Composite, Facade
FUJABA           | Java      | Fuzzy logic and dynamic analysis | Java AWT                                 | Bridge, Strategy, Composite
WoP Scanner      | Java      | AST query                        | AWT, Swing, JDBC API, etc.               | Abstract Factory
HEDGEHOG         | Java      | Formal semantic                  | PatternBox, Java 1.1, 1.2                | Most GoF patterns (discussed later)
Heuzeroth et al. | Java      | Dynamic analysis                 | Java Swing                               | Observer, Mediator, CoR, Visitor
KT               | Smalltalk | Dynamic analysis                 | KT                                       | Composite, Visitor, Template Method
MAISA            | UML       | UML matching                     | Nokia DX200 Switching System             | Abstract Factory


Current Approaches

- Limitations
  - Misinterpretation of pattern definitions
  - Limited detection scope on implementation variants
- Can be grouped as follows:
  - Targeting structural aspects
    - Analyze class/method declarations
    - Analyze inter-class relationships (e.g., whether one class extends another)
  - Targeting behavioral aspects
    - Analyze code semantics (e.g., whether a code segment is single entry)


Targeting Structural Aspects

- Method
  - Extract structural relationships (inter-class analysis)
  - For a pattern, check for certain structural properties
- Drawback
  - Relies only on structural relationships, which are not the only distinction between patterns



Targeting Behavioral Aspects

- Method
  - Narrow down the search space using inter-class relationships
  - Verify behavior in method bodies
    - Dynamic analysis
    - Machine learning
    - Static program analysis



Targeting Behavioral Aspects

- Drawback
  - Dynamic analysis:
    - Requires good data coverage
    - Verifies program behavior but does not verify the intent
    - Complicates the task of detecting patterns that involve concurrency
  - Machine learning:
    - Most patterns have concrete definitions, so learning them does not solve the fundamental problem
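The data-coverage drawback of dynamic analysis can be illustrated with a small sketch. The class below returns a single shared object on every run that never passes reset = true, so a probe that only observes executions would report it as a Singleton unless its inputs happen to exercise the other path. ResettableRegistry and DynamicSingletonProbe are hypothetical names made up for this example.

```java
// Hypothetical class: behaves like a Singleton unless a reset is requested.
class ResettableRegistry {
    private static ResettableRegistry instance;

    private ResettableRegistry() {}

    static ResettableRegistry getInstance(boolean reset) {
        if (instance == null || reset) {
            instance = new ResettableRegistry();
        }
        return instance;
    }
}

class DynamicSingletonProbe {
    /** Calls getInstance with each flag and reports whether one object was always returned. */
    static boolean alwaysSameInstance(boolean... resetFlags) {
        ResettableRegistry first = ResettableRegistry.getInstance(false);
        for (boolean flag : resetFlags) {
            if (ResettableRegistry.getInstance(flag) != first) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Inputs that never trigger the reset path: looks like a Singleton.
        System.out.println(alwaysSameInstance(false, false, false)); // true
        // Inputs that cover the reset path: the second object is observed.
        System.out.println(alwaysSameInstance(false, true));         // false
    }
}
```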



A Motivating Example

Detecting the Singleton Pattern:

- As detected by FUJABA
- Common search criteria
  - private Singleton()
  - private static Singleton instance
  - public static Singleton getInstance()
- Problem
  - No behavioral analysis on getInstance()
- Solution?

Correctly identified as a Singleton:

    public class Singleton {
        private static Singleton instance;
        private Singleton() {}

        public static Singleton getInstance() {
            if (instance == null)
                instance = new Singleton();
            return instance;
        }
    }

Inaccurately recognized as Singleton (same declarations, different bodies of getInstance()):

    instance = new Singleton();
    return instance;

    return new Singleton();
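The purely structural search criteria listed above can be sketched with Java reflection. This is only an illustration of why such criteria over-approximate; it is not how FUJABA or PINOT is implemented, and the class names LazySingleton, FreshEveryTime, and StructuralCheck are hypothetical.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical example classes, not taken from any benchmark.
class LazySingleton {
    private static LazySingleton instance;
    private LazySingleton() {}
    public static LazySingleton getInstance() {
        if (instance == null) {
            instance = new LazySingleton();
        }
        return instance;
    }
}

class FreshEveryTime {
    private static FreshEveryTime instance; // declared but never used
    private FreshEveryTime() {}
    public static FreshEveryTime getInstance() {
        return new FreshEveryTime(); // a new object on every call
    }
}

class StructuralCheck {
    /** True if the class shows the three structural marks of a Singleton. */
    static boolean looksLikeSingleton(Class<?> c) {
        boolean allCtorsPrivate = c.getDeclaredConstructors().length > 0;
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            allCtorsPrivate &= Modifier.isPrivate(k.getModifiers());
        }
        boolean hasStaticSelfField = false;
        for (Field f : c.getDeclaredFields()) {
            int m = f.getModifiers();
            hasStaticSelfField |= Modifier.isPrivate(m) && Modifier.isStatic(m) && f.getType() == c;
        }
        boolean hasStaticAccessor = false;
        for (Method md : c.getDeclaredMethods()) {
            int m = md.getModifiers();
            hasStaticAccessor |= Modifier.isPublic(m) && Modifier.isStatic(m) && md.getReturnType() == c;
        }
        return allCtorsPrivate && hasStaticSelfField && hasStaticAccessor;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeSingleton(LazySingleton.class));  // true
        System.out.println(looksLikeSingleton(FreshEveryTime.class)); // true, a false positive
        System.out.println(looksLikeSingleton(String.class));         // false
    }
}
```

Both classes pass because the check never looks inside getInstance(), which is the gap the behavioral analysis is meant to close.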


GoF Patterns Reclassified

[Figure: the 23 GoF patterns grouped into five categories, with the checks that tell related patterns apart.]

- Language-provided: Iterator, Prototype
- Structure-driven: Template Method, Composite, Decorator, Bridge, Adapter, Proxy, Facade
- Behavior-driven: Singleton, Abstract Factory, Factory Method, Flyweight, Chain of Responsibility, Visitor, Observer, Mediator, Strategy, State
- Domain-specific: Interpreter, Command
- Generic concepts: Builder, Memento

Checks labeled in the figure: context-interface association; check write access on context; factory interface; check object creation; backward data-flow analysis; Singleton class structure; forward data-flow analysis; call dependence analysis; 1:N; 1:1; push delegation; centralized delegation; verify implementation of flyweight pool; 1:N aggregation; virtual delegation; conditional delegation; unconditional delegation; grouping family of products.

Language-provided Patterns

- Patterns provided in the language or library
- The Iterator Pattern
  - “Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation” [GoF]
  - In Java:
    - Enumeration since Java 1.0
    - Iterator since Java 1.2
    - The for-each loop since Java 1.5
- The Prototype Pattern
  - “Specify the kinds of objects to create using a prototypical instance, and create new objects by copying this prototype” [GoF]
  - In Java:
    - The clone() method in java.lang.Object
- Pattern Detection
  - Recognizing variants in legacy code
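As a reminder of what these language-provided forms look like in source code, the sketch below walks one collection with Enumeration, Iterator, and the for-each loop, and copies an object through clone(). Sheep and LanguageProvidedDemo are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

// Hypothetical prototype class used only for this illustration.
class Sheep implements Cloneable {
    final String name;

    Sheep(String name) { this.name = name; }

    @Override
    public Sheep clone() {
        try {
            return (Sheep) super.clone(); // Prototype via java.lang.Object.clone()
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // not expected: Sheep implements Cloneable
        }
    }
}

class LanguageProvidedDemo {
    // Iterator pattern, Java 1.0 style.
    static int sumWithEnumeration(Vector<Integer> v) {
        int sum = 0;
        for (Enumeration<Integer> e = v.elements(); e.hasMoreElements();) {
            sum += e.nextElement();
        }
        return sum;
    }

    // Iterator pattern, Java 1.2 style.
    static int sumWithIterator(List<Integer> xs) {
        int sum = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext();) {
            sum += it.next();
        }
        return sum;
    }

    // Iterator pattern, Java 1.5 for-each loop.
    static int sumWithForEach(List<Integer> xs) {
        int sum = 0;
        for (int x : xs) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        Vector<Integer> v = new Vector<>(Arrays.asList(1, 2, 3));
        System.out.println(sumWithEnumeration(v)); // 6
        System.out.println(sumWithIterator(v));    // 6
        System.out.println(sumWithForEach(v));     // 6
        Sheep dolly = new Sheep("Dolly");
        Sheep copy = dolly.clone();
        System.out.println(copy != dolly && copy.name.equals("Dolly")); // true
    }
}
```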


Structure-driven Patterns

- Patterns that are driven by software architecture.
- Can be identified by inter-class relationships
- The Template Method, Composite, Decorator, Bridge, Adapter, Proxy, and Facade patterns
- Inter-class Relationships
  - Accessibility
  - Declaration
  - Inheritance
  - Delegation
  - Aggregation
  - Method invocation
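Two of the relationships listed above, inheritance and aggregation, can be read directly from class declarations. The sketch below uses Java reflection to flag a "wrapper candidate": a class that both implements a type and holds a field of that type, a shape shared by patterns such as Decorator and Proxy. It is a simplified illustration rather than PINOT's analysis, and Drawable, Circle, Bordered, and InterClassFacts are hypothetical names.

```java
import java.lang.reflect.Field;

// Hypothetical types for illustration.
interface Drawable {
    String draw();
}

class Circle implements Drawable {
    public String draw() { return "circle"; }
}

class Bordered implements Drawable {
    private final Drawable inner; // aggregation of the same interface

    Bordered(Drawable inner) { this.inner = inner; }

    public String draw() { return "[" + inner.draw() + "]"; } // delegation
}

class InterClassFacts {
    /** Inheritance: c is a proper subtype of base. */
    static boolean inheritsFrom(Class<?> c, Class<?> base) {
        return c != base && base.isAssignableFrom(c);
    }

    /** Aggregation: c declares a field whose type is part or a subtype of it. */
    static boolean aggregates(Class<?> c, Class<?> part) {
        for (Field f : c.getDeclaredFields()) {
            if (part.isAssignableFrom(f.getType())) {
                return true;
            }
        }
        return false;
    }

    /** Structural shape shared by wrappers: inherits from base and aggregates base. */
    static boolean isWrapperCandidate(Class<?> c, Class<?> base) {
        return inheritsFrom(c, base) && aggregates(c, base);
    }

    public static void main(String[] args) {
        System.out.println(isWrapperCandidate(Bordered.class, Drawable.class)); // true
        System.out.println(isWrapperCandidate(Circle.class, Drawable.class));   // false
    }
}
```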


Behavior-driven Patterns

- Patterns that are driven by system behavior.
- Can be detected using inter-class and program analyses.
- The Singleton, Abstract Factory, Factory Method, Flyweight, CoR, Visitor, Observer, Mediator, Strategy, and State patterns.
- Program analysis techniques:
  - Program slicing
  - Data-flow analysis
  - Call trace analysis


Domain-specific Patterns

- Patterns applied in a domain-specific context
- The Interpreter Pattern
  - “Given a language, define a representation for its grammar along with an interpreter that uses the representation to interpret sentences in the language” [GoF]
  - Commonly based on the Composite and Visitor patterns
- The Command Pattern
  - “Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations” [GoF]
  - Combines the Bridge and Composite patterns to separate the user interface from the actual command execution. The Memento pattern is also used to store a history of executed commands.
- Pattern Detection
  - Requires domain-specific knowledge


Generic Concepts

- Patterns that are generic concepts
- The Builder Pattern
  - “Separate the construction of a complex object from its representation so that the same construction process can create different representations” [GoF]
  - A system bootstrapping pattern; object creation is not necessary
- The Memento Pattern
  - “Without violating encapsulation, capture and externalize an object’s internal state so that the object can be restored to this state later” [GoF]
  - The implementation of the memo pool and the representation of states are not specifically defined.
- Pattern Detection
  - Lacks implementation traces



Recognizing the Singleton Pattern

- Structural aspect
  - private Singleton()
  - private static Singleton instance
  - public static Singleton getInstance()
- Behavioral aspect
  - Analyze the behavior in getInstance()
    - Check if lazy instantiation is implemented
    - Check if instance is returned
  - Slice the method body for instance and analyze the sliced program


Recognizing the Singleton Pattern

    public class SingleSpoon {
        private SingleSpoon() {}
        private static SingleSpoon theSpoon;

        public static SingleSpoon getTheSpoon() {
            if (theSpoon == null)
                theSpoon = new SingleSpoon();
            return theSpoon;
        }
    }

Conditions        | Statements
------------------|--------------------
theSpoon == null  | theSpoon (created)
theSpoon != null  | theSpoon (returned)
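The behavioral check on the sliced method body can be sketched over a toy statement model. The model below is a deliberate simplification and is not the Jikes AST that PINOT works on; Stmt, IfNull, AssignNew, ReturnVar, ReturnNew, and LazyInitCheck are hypothetical names. The check accepts a body only if the instance field is created under a null guard and the field itself is returned.

```java
import java.util.Arrays;
import java.util.List;

// Toy statement forms, hypothetical and far simpler than a real compiler AST.
abstract class Stmt {}

class AssignNew extends Stmt {          // field = new T();
    final String target;
    AssignNew(String target) { this.target = target; }
}

class ReturnVar extends Stmt {          // return field;
    final String var;
    ReturnVar(String var) { this.var = var; }
}

class ReturnNew extends Stmt {}         // return new T();

class IfNull extends Stmt {             // if (field == null) { ... }
    final String var;
    final List<Stmt> then;
    IfNull(String var, Stmt... then) {
        this.var = var;
        this.then = Arrays.asList(then);
    }
}

class LazyInitCheck {
    /** True if the body creates `field` only under a null guard and returns `field`. */
    static boolean isLazySingletonBody(List<Stmt> body, String field) {
        boolean guardedCreation = false;
        boolean returnsField = false;
        for (Stmt s : body) {
            if (s instanceof AssignNew && ((AssignNew) s).target.equals(field)) {
                return false; // unguarded creation: a new object on every call
            }
            if (s instanceof ReturnNew) {
                return false; // returns a fresh object instead of the field
            }
            if (s instanceof IfNull && ((IfNull) s).var.equals(field)) {
                for (Stmt t : ((IfNull) s).then) {
                    if (t instanceof AssignNew && ((AssignNew) t).target.equals(field)) {
                        guardedCreation = true;
                    }
                }
            }
            if (s instanceof ReturnVar) {
                returnsField = ((ReturnVar) s).var.equals(field);
            }
        }
        return guardedCreation && returnsField;
    }

    public static void main(String[] args) {
        List<Stmt> lazy = Arrays.asList(
                new IfNull("theSpoon", new AssignNew("theSpoon")), new ReturnVar("theSpoon"));
        List<Stmt> eager = Arrays.asList(new AssignNew("theSpoon"), new ReturnVar("theSpoon"));
        List<Stmt> fresh = Arrays.asList(new ReturnNew());
        System.out.println(isLazySingletonBody(lazy, "theSpoon"));  // true
        System.out.println(isLazySingletonBody(eager, "theSpoon")); // false
        System.out.println(isLazySingletonBody(fresh, "theSpoon")); // false
    }
}
```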


PINOT: Pattern INference and recOvery Tool

- PINOT
  - A fully automated pattern detection tool
  - Designed to be faster and more accurate
  - Detects structure-driven and behavior-driven patterns
- How PINOT works

[Figure: Java source code goes into PINOT, which emits pattern instances as text or as XMI; other labels in the figure: editors, view, UML.]


Implementation Alternatives

- Program analysis tools
  - Extract basic information of the source code
    - Class, method, and variable declarations
    - Class inheritance
    - Method invocations, call trace
    - Variable refers-to and referred-by relationships
- Parsers
  - Extract the abstract syntax tree (AST)
- Compilers
  - Extract the AST and provide related symbol tables and built-in functions operating on the AST


Implementation Overview

- A modification of Jikes (an open-source Java compiler written in C++)
- Analysis using the Jikes abstract syntax tree (AST) and symbol tables
- Identifying structure-driven patterns
  - Considers Java language constructs
  - Considers commonly used Java utility classes: java.util.Collection and java.util.Iterator
- Identifying behavior-driven patterns
  - Applies data-flow analysis, inter-procedural analysis, and alias analysis
- PINOT considers related patterns
  - Speeds up the process of pattern recognition
  - E.g., Strategy and State patterns, CoR and Decorator, etc.
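Strategy and State are a useful example of related patterns: both pair a context class with an interface it delegates to, so the shared structure can be matched once and the two told apart afterwards. One plausible cue, in the spirit of the "check write access on context" label in the reclassification figure, is whether the context's reference is overwritten while requests are handled. The sketch below shows the two shapes side by side; all names are hypothetical and the code illustrates the distinction, not PINOT's detection logic.

```java
// Strategy: the context holds an algorithm chosen by the client and never rewrites it.
interface Greeting {
    String greet(String who);
}

class Greeter {
    private final Greeting strategy;

    Greeter(Greeting strategy) { this.strategy = strategy; }

    String greet(String who) { return strategy.greet(who); }
}

// State: same structure, but handling a request overwrites the context's reference.
interface LightState {
    String name();
    LightState next();
}

class Red implements LightState {
    public String name() { return "red"; }
    public LightState next() { return new Green(); }
}

class Green implements LightState {
    public String name() { return "green"; }
    public LightState next() { return new Red(); }
}

class TrafficLight {
    private LightState state = new Red();

    String name() { return state.name(); }

    void tick() { state = state.next(); } // write access on the context's state field
}

class StateVsStrategyDemo {
    public static void main(String[] args) {
        Greeter formal = new Greeter(who -> "Good day, " + who);
        System.out.println(formal.greet("Ada")); // Good day, Ada

        TrafficLight light = new TrafficLight();
        System.out.println(light.name()); // red
        light.tick();
        System.out.println(light.name()); // green
    }
}
```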


Benchmarks

- Java AWT (GUI toolkit)
- javac (Sun Java Compiler)
- JHotDraw (GUI framework)
- Apache Ant (Build tool)
- Swing (Java Swing library)
- ArgoUML (UML editor tool)


PINOT Results

- PINOT works well in terms of accuracy: it recognizes many pattern instances in the benchmarks.
- Like other pattern detection tools, PINOT is not perfect:
  - False positives
    - Prototype vs. Factory Method
      - PINOT does not detect the Prototype pattern
      - The Prototype pattern involves object creation
      - PINOT identifies implementations of clone methods as factory methods
  - False negatives
    - User-defined data structures
      - Container structures are commonly used with the Observer, Mediator, Composite, and Chain of Responsibility patterns, etc.
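Both failure cases can be pictured with small examples. In the first, a clone() written with new creates and returns an object just as a factory method does. In the second, an Observer keeps its listeners in a hand-written linked list rather than a java.util.Collection, so a detector keyed to the standard container classes has less to recognize. All class names are hypothetical, and the sketch illustrates the code shapes only, not PINOT's internals.

```java
// Case 1 (false positive): clone() creates and returns an object, like a factory method.
abstract class Document {
    abstract Document create(); // Factory Method: subclasses pick the product
}

class Report extends Document {
    Document create() { return new Report(); }
}

class Figure implements Cloneable {
    int size;

    @Override
    public Figure clone() {
        Figure copy = new Figure(); // object creation, as in create() above
        copy.size = size;
        return copy;
    }
}

// Case 2 (false negative): an Observer whose listeners live in a user-defined list.
interface Listener {
    void onEvent(String event);
}

class ListenerNode {
    final Listener listener;
    final ListenerNode next;

    ListenerNode(Listener listener, ListenerNode next) {
        this.listener = listener;
        this.next = next;
    }
}

class Subject {
    private ListenerNode head; // no java.util.Collection involved

    void attach(Listener l) { head = new ListenerNode(l, head); }

    void publish(String event) {
        for (ListenerNode n = head; n != null; n = n.next) {
            n.listener.onEvent(event);
        }
    }
}

class DetectionCornerCases {
    public static void main(String[] args) {
        Figure original = new Figure();
        original.size = 3;
        Figure copy = original.clone();
        System.out.println(copy != original && copy.size == 3); // true

        StringBuilder log = new StringBuilder();
        Subject subject = new Subject();
        subject.attach(e -> log.append("a:").append(e));
        subject.attach(e -> log.append("b:").append(e));
        subject.publish("x");
        System.out.println(log); // b:xa:x (most recently attached listener first)
    }
}
```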


Pattern Interpretation

- Flyweight vs. Immutable
  - Immutable classes are sharable singletons
- Mediator vs. Facade
  - Colleagues participating in the Mediator pattern can have different types
  - A mediator class becomes a facade against an individual colleague class



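PINOT Results

[Figure: bar chart of the number of pattern instances (y-axis, 0 to 600) detected per pattern (AbstractFactory, FactoryMethod, Singleton, Adapter, Bridge, Composite, Decorator, Facade, Flyweight, Proxy, CoR, Mediator, Observer, State, Strategy, TemplateMethod, Visitor) for each benchmark (Ant, AWT, JHotDraw, Swing).]

Benchmark | No. of classes | KLOC  | Time (sec)
----------|----------------|-------|-----------
Ant       | 526            | 72.4  | 15.54
AWT       | 485            | 142.8 | 13.77
JHotDraw  | 464            | 71.7  | 15.73
Swing     | 1028           | 263.5 | 72.91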

Ongoing and Future Work

- Investigate other domain-specific patterns
  - High performance computing (HPC) patterns
  - Real-time patterns
- Extend usability of PINOT
  - Formalize pattern definitions
  - Visualize detection results


PINOT + Eclipse


Conclusion

- Reverse engineering of design patterns
- Reclassifying the GoF patterns for reverse-engineering
- PINOT
  - A faster and more accurate pattern detection tool
- Ongoing and future work

More information on our website:
http://www.cs.ucdavis.edu/~shini/research/pinot