A Programmer-Oriented Approach to Assurance of Mechanical Program Properties - Chains of Evidence


1

Chains of Evidence

(Thesis Proposal)


Tim Halloran


William L. Scherlis (advisor)

James D. Herbsleb

Mary Shaw

Joshua J. Bloch, Sun Microsystems Inc.

A Programmer-Oriented Approach to Assurance of Mechanical Program Properties

2

A thesis proposal should:


Explain the basic ideas of the thesis topic


Argue why the topic is interesting


I.e., scientific value and engineering impact


State what kinds of results are expected


Argue that these results are obtainable
within a reasonable amount of time


Demonstrate the student’s personal
qualifications for doing the proposed work

3

A “bug” description

I got a NullPointerException in CallStackRootNode.CallStackChildren.changeChildren() where the CallStackProducer returned a null Location[]. Now in this case my code is incomplete but it seems to me that there is a case for the producer not being able to furnish a stack, or the filter filtering it all out in which case some placeholder Location[] needs to be created and displayed.

NetBeans Bug Report #31423

4

The “answer”

I do not understand, you would prefer to return null rather than new Location[0]? There is something like a convention here that we prefer to not return null values from functions, its more “safe”. What is the problem here?

NetBeans Bug Report #31423

5

Loss of design intent


People leave and join software teams


Documents become out of date and inconsistent


Models are missing


Source code becomes the only authoritative
system artifact


Maintainability suffers because code does not reveal all the design intent behind it

Quality suffers because programmers make mistakes complying with tacit or informally expressed design intent

6

What models are missing?


Low-level models of design intent about “mechanical” program properties not expressible in the language

Focus on bureaucratic aspects of a program

E.g., concurrency policy, exception policy, mutability policy, type use policy, static program structure

Rather than functional ones

E.g., correctly sorting a data structure or correct computation of a value

We hypothesize that expression and assurance of mechanical program properties can provide great value

/** @typerecommendation Collection, List */

public class ArrayList extends...
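
As a minimal sketch of what a type use policy asks of client code: assume the @typerecommendation annotation above means that clients should declare variables as Collection or List rather than ArrayList. That reading is an assumption, not a documented semantics, and the class and method names below are hypothetical. The variable is declared against the interface, so the concrete class can change without touching callers.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical client code; the class and method names are illustrative only.
final class TypeUseExample {
    // Follows the recommendation: the variable is declared as List, so only
    // the constructor call names a concrete class.
    static List<String> collect(boolean linked) {
        List<String> results;
        if (linked) {
            results = new LinkedList<String>();
        } else {
            results = new ArrayList<String>();
        }
        results.add("first");
        results.add("second");
        return results;
    }
}
```

Callers see the same List contract whichever implementation is chosen, which is the flexibility the recommendation is meant to preserve.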

7

Reasons for this problem


Missing capability in today’s languages, models, tools, and processes to

Express and capture intent

Assure our implementations are faithful to that intent

Worse, we don’t know how to keep intent consistent with as-built reality of a system as both evolve.

My research addresses both these problems

8

NetBeans “bug” example


Capture the intent that the getCallStack() method should never return a null Location[]

Annotate the interface as follows:

package org.netbeans.modules.debugger;

public interface CallStackProducer extends CallStackRoot {
  ...
  public /* @not-null */ Location[] getCallStack();
  ...
}

Programmers might overlook the annotation or not be confident they followed it

To address this problem our approach uses a tool to statically assure consistency

9

NetBeans “bug” example


We can go further and annotate that filterCallStack() (within the CompactCallStackFilter class) should not raise NullPointerException

/** @never-throws java.lang.NullPointerException */
public Location[] filterCallStack(/* @not-null */ CallStackProducer producer) {
  ...
  Location[] stack = producer.getCallStack();
  ...
  int i, k = stack.length;
  ...
}

Now our programmer who reported NetBeans bug #31423 and implemented getCallStack() to return null could be informed two models of design intent are violated
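
The violation can be reproduced in a small, self-contained sketch. The type names follow the slide, but the bodies are hypothetical stand-ins rather than NetBeans code, and the two producer classes are invented for illustration. The sketch shows why the @never-throws claim on filterCallStack() depends on every implementation of getCallStack() honoring @not-null.

```java
// Hypothetical stand-ins for the NetBeans types named on the slide.
final class Location { }

interface CallStackProducer {
    /* @not-null */ Location[] getCallStack();
}

final class CompactCallStackFilter {
    /** @never-throws java.lang.NullPointerException */
    public Location[] filterCallStack(/* @not-null */ CallStackProducer producer) {
        Location[] stack = producer.getCallStack();
        // Dereference: safe only if getCallStack() honors its @not-null intent.
        int k = stack.length;
        Location[] copy = new Location[k];
        System.arraycopy(stack, 0, copy, 0, k);
        return copy;
    }
}

// Invented for illustration: violates the @not-null intent on getCallStack().
final class IncompleteProducer implements CallStackProducer {
    @Override
    public Location[] getCallStack() {
        return null;
    }
}

// Invented for illustration: honors the intent with an empty placeholder array.
final class PlaceholderProducer implements CallStackProducer {
    @Override
    public Location[] getCallStack() {
        return new Location[0];
    }
}
```

Passing an IncompleteProducer makes filterCallStack() throw at stack.length, far from the method that actually broke the intent. A static check could instead report the problem at the return statement in IncompleteProducer.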

10

NetBeans “bug” example


steps to assurance

1. Evidence filterCallStack() does not raise NullPointerException assuming the @not-null annotations are valid

2. Evidence that calls to the filterCallStack() method will never pass null in the producer parameter

3. Evidence that all implementations of getCallStack() never return null

Evidence that each step is valid, given its assumptions, is gathered by semantics-based program analysis

Each individual piece of evidence is useful to the programmer… but they can be linked
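
One way to picture how the three pieces of evidence link is the sketch below. It is a hypothetical data structure, not the architecture of the prototype tool: each link records a claim and the assumptions it relies on, and a chain is complete once every assumption is itself established by some link. The class names Evidence and EvidenceChain are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// One "link": a claim established by an analysis, valid under its assumptions.
final class Evidence {
    final String claim;
    final List<String> assumptions;

    Evidence(String claim, String... assumptions) {
        this.claim = claim;
        this.assumptions = Arrays.asList(assumptions);
    }
}

final class EvidenceChain {
    private final List<Evidence> links = new ArrayList<Evidence>();

    void add(Evidence link) {
        links.add(link);
    }

    // Assumptions that no link in the chain establishes as a claim.
    // Simplification: circular support between links is not detected.
    Set<String> openAssumptions() {
        Set<String> claims = new TreeSet<String>();
        for (Evidence e : links) {
            claims.add(e.claim);
        }
        Set<String> open = new TreeSet<String>();
        for (Evidence e : links) {
            for (String a : e.assumptions) {
                if (!claims.contains(a)) {
                    open.add(a);
                }
            }
        }
        return open;
    }

    boolean isComplete() {
        return openAssumptions().isEmpty();
    }
}
```

A chain holding only step 1 is partial: its two open assumptions tell the programmer exactly which evidence (steps 2 and 3) is still missing.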

11

NetBeans “bug” example


key points


Each step becomes a “link” of evidence we are able to “chain” together to give us program assurance.

Our annotations capture design intent and serve as cut-points for program analysis

The program properties in the example are confusing to the programmer because they are non-local

12

Adoption in practice


Consistency management


Stepwise approach to consistency


Support real-world inconsistencies


Avoiding programming language change


“Rising tide of abstraction”


Support extra-language assurance


User experience


Different from a compiler


Assurance selection

Where can we help?

24 October 2002 Post on Apache Jakarta General Mailing List:

Most of the automated code metrics I read complain about things like “duh its an API of course its an unused class” or “duh it a development utility or test case which isn’t MEANT to be flexible”

A Follow-up:

Exactly! Stuff like “This class is unused”; no, it’s just specified in a properties file somewhere and the static analysis is not picking that up! A couple of false positives like that and people start ignoring the tool. At least I do.

13

Research goals


Effective capture of implementation-level design decisions, incrementality, and tool supported consistency management

Assurance of properties not addressed by widely used programming languages

Design of an effective user experience for extra-language assurance

Understanding defects in widely deployed open source Java projects to understand where we can have the largest impact

14

Outline


Introduction

  Loss of design intent

  NetBeans “bug” example

  Adoption in practice

  Research goals

  Chains of evidence

Thesis Statement

Approach

Hypotheses

Preliminary Work

Validation

Schedule

Expected Contribution

15

Chains of evidence


Proofs that a software system satisfies the theorem that programmer-expressed models of design intent are consistent with source code

Models constructed from annotations within code and other documentation and focused on mechanical program properties

Assurance is formed by linking together “chains” forged from small “links” of evidence about the software system

16

Chains of evidence


Partial chains of evidence are essential: they enable focused engagement with the programmer to determine if

The design intent is wrong

The design intent is incomplete

The source code is wrong

The program analysis algorithms (due to limitations) have insufficient information to provide a result

17

Assurance spectrum of chains of evidence

[Figure: plot with axes “Scalability of Assurance Technique” (- to +) and “Semantic ‘Depth’ of Design Intent Assured” (- to +). Labels: Type Checking; Program Verification; Concurrency Policy (Greenhouse); Thread Coloring (Sutherland); Java Best Practice; Program Structure; Exception Policy; Mutability Policy; Type Use Policy. Region label: “Chains of Evidence: Assurance Focus (tractable)”.]
18

Thesis statement


Chains of evidence enable assurance of useful mechanical properties of programs with respect to explicit models of design intent, and the approach has the potential to be scalable and practical for working programmers to adopt

19

Key ideas


A set of representative and substantive
assurances available as part of our prototype tool
is necessary to show feasibility and flexibility of
our approach


An effective architecture for chains of evidence
is required to organize assurance results and
scale up to large Java systems


An effective user experience is needed to elicit
design intent from and communicate assurance
results to programmers

20

Key ideas


A prototype tool set within the context of a Java
IDE enables evaluation of the effectiveness of
our approach


Selection of what design intent to model and
how to assure it can be empirically informed
through (formative) analyses of bug and quality
practices and (evaluative) analysis and tool use


A business case analysis can show cost-effectiveness of our approach and assurances

21

Approach


Develop an architecture, framework, tools,
and user experience for chains of
evidence*


Develop specific assurances


Conduct three empirical investigations


Business case analysis

22

Assurance development


Concurrency Policy *


Mutability Policy


API Protocol Policy


NullPointerException Policy


Alias Policy


Types and Their Use *


Program Structure

Research challenge to design, using state-of-the-art program analysis, substantive assurances along a representative set of points on our curve

[Figure: the assurance spectrum plot, with axes “Scalability of Assurance Technique” and “Semantic ‘Depth’ of Design Intent Assured”, repeated here to show the candidate assurances as points on the curve.]
23

Empirical investigations


Survey of open source Java bugs (39,463)


Understand: “Where help is needed most?”


2 phases: bug selection and bug analysis


Sophomore experiment


Hypothesis: “Violations of Java best practice
correlate with software defects”


Prototype use studies


Qualitative use studies of our prototype tool


Understand utility and practicality of chains of
evidence

24

Business case analysis


Cost/Benefit Analysis (in the sense of Reifer) to evaluate the programmer time and effort required to provide and maintain design models as compared with the costs of using current techniques

Done for each individual assurance (eases identification of state-of-the-practice techniques that address similar concerns)

25

Hypotheses


Safe evolution of software systems can be carried out with less up-front effort using our incremental approach than in approaches that rely on full functional specification


Qualitative use studies of our prototype tool


Bugs of a non-local character (e.g., concurrency) are more difficult for programmers to solve and have great significance to engineering success


Survey of open source Java Bugs

26

Hypotheses


Cut points are feasible to provide scalability for a
wide range of important program analyses


Assurance development


Program analysis theory


Similar techniques can be used for assurances of
model compliance and assessment of Java best
practice (in the sense of Bloch)


Architecture for chains of evidence


Prototype tool coupled with assurance development

27

Hypotheses


Violations of Java best practice correlate with
software defects (and overall bad software
quality)


Sophomore experiment


Model compliance is a cost-effective approach to improve software quality


Business case analysis


Consistency management can be an independent
function that is not coupled to program analysis


Architecture for chains of evidence


Consistency management (part of user experience)

28

Evidence of Feasibility

Preliminary Work


Two preliminary assurance prototypes


“Models of Thumb”


Demonstration of lock policy assurance


Preliminary Architecture


Third prototype


Empirical investigations


Survey of open source quality practices


Preliminary survey of Java bugs

29

“Models of Thumb”


Assurance that Java “rules of thumb” are followed


Two cases investigated on 2 million SLOC corpus


Ignored exceptions

try {
  ...
} catch (Throwable t) {
  ;
}

Overspecific variable declarations

ArrayList results = new ArrayList();
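
To make the “ignored exception” rule of thumb concrete, here is a deliberately naive sketch that counts catch clauses whose body is empty or a lone semicolon. It is a textual heuristic written for illustration, not the analysis used in the prototype. It would miss bodies that contain only a comment and would be fooled by the word catch inside string literals or comments. The class name IgnoredExceptionScan is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive, text-based illustration only; a real assurance would work on the
// parsed program rather than on raw source text.
final class IgnoredExceptionScan {
    // A catch clause whose block holds nothing, or only a lone semicolon.
    private static final Pattern EMPTY_CATCH =
            Pattern.compile("catch\\s*\\([^)]*\\)\\s*\\{\\s*;?\\s*\\}");

    static int countIgnored(String source) {
        Matcher matcher = EMPTY_CATCH.matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }
}
```

Applied to the snippet on this slide (with the elided body filled in by any statements), the scan reports one ignored exception.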

30

31

Tomcat: 230 ignored exceptions

32

33

34

Tomcat: 485 overspecific variable declarations

35

36

User Experience


Early prototype reported the following:

Extension.java [line 297] change
  FROM: ArrayList results = new ArrayList();
  TO: List results = new ArrayList();
  WHY: Use most abstract interface possible

Mimicking compiler error message reporting

Not effective for extra-language assurance

Negative focus, no rationale, no next step

Research challenge to design an effective user experience for extra-language assurance

37

Rationale

38

Flexible Organization of Results

39

Empirical Results

                   Overspecific Variable Declarations     Ignored Exceptions
                                   Violations Found                       Violations Found
Name        kSLOC  Decl. uses (u)       #  % u  /kSLOC    catch uses (u)       #  % u  /kSLOC
Ant            64          13,953     434    3     6.7               916     163   18     2.5
Tomcat         66          13,970     485    3     7.3               964     230   24     3.5
J2SDK 1.4     508         116,397   3,650    3     7.2             3,239     686   21     1.4
NetBeans      571          99,201   5,851    6    10.2             5,085   1,048   21     1.8
Eclipse       792         178,872   8,325    5    10.5             6,511   1,110   17     1.4
Subtotal:   2,001         422,393  18,745    4     9.4            16,715   3,237   19     1.6
Whiteboard     38           6,823   1,205   18    28.0               199      40   20     1.4
Total:      2,039         429,216  19,950    5     9.8            16,914   3,257   19     1.6
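
The derived columns can be checked with a short calculation. The sketch below assumes “% u” is violations as a share of uses, rounded to a whole percent, and “/kSLOC” is violations per thousand source lines, rounded to one decimal. The rounding rule is an assumption. Because the kSLOC column is itself rounded, not every row needs to reproduce to the last digit. The class name ViolationRates is hypothetical.

```java
// Recomputes the derived table columns under the stated rounding assumption.
final class ViolationRates {
    // Violations as a percentage of uses, rounded to a whole percent.
    static long percentOfUses(int violations, int uses) {
        return Math.round(100.0 * violations / uses);
    }

    // Violations per kSLOC, rounded to one decimal place.
    static double perKsloc(int violations, int ksloc) {
        return Math.round(10.0 * violations / ksloc) / 10.0;
    }
}
```

For the Tomcat row, percentOfUses(485, 13970) gives 3 and perKsloc(485, 66) gives 7.3 for overspecific declarations. For ignored exceptions, percentOfUses(230, 964) gives 24 and perKsloc(230, 66) gives 3.5. All four match the table.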

40

Ignored Exceptions: Why?

                            Ignored Exceptions
                            Total (t)      Commented
Name        catch uses (u)       #  % u         #  % t
Ant                    916     213   23        59   28
Tomcat                 964     248   26        66   27
J2SDK                3,239     744   23       291   39
NetBeans             5,085   1,241   24       443   36
Eclipse              6,511   1,275   20       440   35

Tomcat: Sample of 50

Ignored exception                          #    %
Unfinished exception handling              1    2
Catch of an overly-broad exception         5   10
Unsure [comment or log]                    8   16
Default-try-catch [comment]                9   18
Thread: InterruptedException               8   16
IO: IOException [close()]                  7   14
Test code [wrapping test]                  3    6
OK, well commented [not formal]            9   18

We sampled 50 ignored exceptions from Tomcat and Eclipse and found roughly 90% are false positives (program correctness only)

Explicit design intent needed

41

Greenhouse Concurrency Assurance


Assurance obtained

All accesses to shared fields are protected with the correct lock

All lock preconditions are satisfied for method calls that require callers to hold locks

Constructor does not allow references to escape (i.e., avoiding leakage)
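
The three assurances can be illustrated on a small class. This is a sketch under stated assumptions: the annotation text in the comments is a hypothetical notation for lock policy intent, not necessarily the syntax of the Greenhouse tool, and the Account class is invented for illustration. The runtime check with Thread.holdsLock() stands in for what the tool would establish statically.

```java
/**
 * Hypothetical lock policy intent: the lock on "this" protects balance.
 */
final class Account {
    private long balance; // shared field, protected by the lock on "this"

    Account(long opening) {
        // The constructor hands "this" to no other code, so no reference
        // escapes before construction completes.
        this.balance = opening;
    }

    // Access to the shared field holds the declared lock.
    synchronized long getBalance() {
        return balance;
    }

    synchronized void deposit(long amount) {
        addLocked(amount); // lock precondition satisfied: caller holds "this"
    }

    /** Hypothetical lock precondition: callers must hold the lock on "this". */
    private void addLocked(long amount) {
        // Dynamic stand-in for the static assurance described on the slide.
        if (!Thread.holdsLock(this)) {
            throw new IllegalStateException("caller must hold the lock on this");
        }
        balance += amount;
    }
}
```

A static assurance would reject a call to addLocked() from an unsynchronized method at analysis time, where this sketch would only fail when that path actually runs.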

42

Greenhouse Concurrency Assurance


Complex design intent models

What is the next step?

43

Prototype problems


Difficult to understand the network of analyses that make up an
assurance


Difficult to reuse portions of an assurance for another assurance


No separation between data used to calculate results and actual
results


No benefit from building up assurances from smaller assurances


No standard approach to communicate results to higher-level analyses


No standard approach to communicate results to the user interface


No standard ability to maintain assurance as the software or the
design intent model is being changed by a programmer within the
IDE (i.e., truth maintenance)

Diagnosis: Our architecture is wrong

44

Toward an Architecture for

Chains of Evidence


Preliminary architecture for chains of evidence is based upon:

A categorized blackboard

A truth maintenance system

A network of program analysis components

[Diagram: “Sea Blackboard” holding categorized entries: region design intent; lock policy design intent; lock policy assurance; thread coloring design intent; thread coloring assurance; ignored exception assurance; ignored exception design intent. Example entries: “OK to ignore InterruptedException within fluid.ex.*”; “OK: Ignored InterruptedException on line 13 of Foo.java”; “ISSUE: Ignored IOException on line 56 of Bar.java”.]

Developed a feasibility prototype
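
A minimal sketch of how a blackboard and truth maintenance might interact, using the example entries from the diagram. It is an illustration under assumptions, not the prototype's design: facts are plain strings, each fact records the facts that justify it, and retracting a fact retracts everything that depended on it. The Blackboard class and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical blackboard: maps each posted fact to the facts that justify it.
final class Blackboard {
    private final Map<String, Set<String>> support =
            new LinkedHashMap<String, Set<String>>();

    // Design intent is posted with no justification; analysis results name
    // the facts they were derived from.
    void post(String fact, String... justifiedBy) {
        support.put(fact, new LinkedHashSet<String>(Arrays.asList(justifiedBy)));
    }

    boolean holds(String fact) {
        return support.containsKey(fact);
    }

    // Truth maintenance: removing a fact also removes its dependents.
    void retract(String fact) {
        if (support.remove(fact) == null) {
            return; // not on the blackboard, or already retracted
        }
        List<String> dependents = new ArrayList<String>();
        for (Map.Entry<String, Set<String>> entry : support.entrySet()) {
            if (entry.getValue().contains(fact)) {
                dependents.add(entry.getKey());
            }
        }
        for (String dependent : dependents) {
            retract(dependent);
        }
    }
}
```

If the programmer withdraws the intent “OK to ignore InterruptedException within fluid.ex.*”, the “OK” result for Foo.java that relied on it disappears. The unrelated “ISSUE” result for Bar.java stays in place.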

45

Toward an Architecture for

Chains of Evidence


Preliminary use has found this design:


Enhances a programmer’s ability to
understand and react to tool results


Allows a separation of analysis results and
design intent model information


Provides efficient maintenance of assurance
as models and program code evolve

Research challenge to design user experience, evaluate
(and enhance) scalability and flexibility

46

Validation


Prototype Tool Capabilities


Assurance Soundness


Empirical evidence of adoptability & utility

Bug survey, and Prototype use studies

Cost-Effectiveness

Business case analysis

Chains of evidence enable assurance of useful mechanical properties of programs with respect to explicit models of design intent, and the approach has the potential to be scalable and practical for working programmers to adopt

47

Schedule

Date

Milestone Tasks

Jul 2003


Architecture completed and documented


Prototype using updated architecture


Representative program assurances designed


Refine automatic selection for Java bug survey

Aug 2003


Draft ICSE paper


Complete sophomore experiment plan

Sep 2003


Complete Java bug survey

Dec 2003


Complete and document sophomore experiment


Complete representative program assurances

Jan 2004


Begin prototype tool use studies


Begin dissertation draft

Mar 2004


Dissertation draft completed

May 2004


Prototype tool use studies completed and documented


Oral and written thesis defense

48

Expected Contributions


I expect to


provide an effective architecture, framework,
tools, and user experience for chains of
evidence,


demonstrate the usefulness of the framework for representative assurances,


provide an empirically informed assessment
of the potential for adoption, and


qualitatively demonstrate cost effectiveness

49

Chains of Evidence




Questions

50

Backup Slides

51

My Proposal in One Slide


Problem: Increasing source code quality and assurance thereof

Idea: Chains of Evidence. A tool-supported method to assist programmers in expressing models of low-level design intent and assuring their consistency with code

Preliminary Results:

Java best practice and concurrency policy prototypes

Architecture for managing chains of evidence

Survey of open source quality practice and Java bugs

Approach (demonstrating potential for):

Develop a set of substantive assurances: feasibility & flexibility

Develop architecture: scalability

Design an effective user experience: adoption

Develop prototype tool in Java IDE: feasibility

Empirical investigation (bug survey, experiment): adoption/impact

Develop a business case analysis: practicability

52

Chains of Evidence


A well-formed code base using chains of evidence includes:

A collection of source code

A set of low-level design models that address semantic properties significant to the mechanical attributes of code

A linkage of the code base with models assuring consistency