The DPASA Survivable JBI - A High-Water Mark in Intrusion Tolerant Systems



Partha Pal

On Behalf of the Entire DPASA Team*

BBN Technologies, Adventium Labs, SRI, U Illinois and U Maryland



* The DPASA project was sponsored by DARPA under an AFRL contract during 2002-2005. The BBN-led DPASA team designed the survivability architecture, used it to defense-enable a DoD-relevant information system, and subjected it to multiple Red Team evaluations.

© BBN 2006-2007, ppal@bbn.com

Outline


Intrusion Tolerance



The DPASA Approach


Survivability Architecture


Design Principles



Baseline (undefended) and the Survivable System



Evaluation Results



Conclusion and Future Direction

Intrusion Tolerance


Generations of Security Research

No system is perfectly secure, only adequately secured with respect to the perceived threat.

1st Generation: Protection
  Prevent intrusions (access control and physical security, cryptography, trusted computing base)
  But intrusions will occur

2nd Generation: Detection
  Detect intrusions, limit damage (firewalls, intrusion detection systems, boundary controllers, virtual private networks, PKI)
  But some attacks will succeed

3rd Generation: Tolerance
  Tolerate attacks (redundancy, diversity, deception, wrappers, proof-carrying code, proactive secret sharing)
  Intrusion tolerance, big board view of attacks, real-time situation awareness and response, graceful degradation, hardened operating system


3rd (moving towards 4th) Generation

3rd Generation: Tolerance and Survivability
  Assumes that attacks and other bad things cannot be totally prevented: some attacks will even succeed, and may not even be detected in time
  Focuses on desired qualities or attributes that need to be preserved, retained or continued, even if in a degraded manner
    availability (of information and service)
    integrity (of information and service)
    confidentiality (of information)

Next Generation of Survivability
  Regain, recoup, regroup and even improve...

[Figure: level of service over time, from the start of a focused attack. Curves: without attack, undefended, 3rd generation (survivable), next generation?]

The DPASA Approach


Drivers and Contributing Factors


COTS: bugs and unknown vulnerabilities
Open/interoperable: discovery and use of new exploits
Distributed: more places to attack
Interconnected: attack initiation and propagation
Interdependent: cascade effect

Attack effects:
  Unavailability (can't reach server): wait or give up. Unavailability is detectable.
  Corruption (wrong answer): can get pretty bad.
  Exfiltration (stolen data)

The attacker may try to introduce corruption or steal rather than disrupt!


Contention Between Defense and the Adversary

Continued operation:
  Preserve C, I and A (confidentiality, integrity and availability)
  Degrade
Application security
Adaptive response

Attacker and application compete for the same resources (host, memory, CPU)
  corrupt
  consume

The game is inherently biased against the defense: the adversary needs to find only one way to win, whereas the defense needs to cover as many possibilities as it can. Therefore, in the short term, successfully denying or delaying the adversary is a win for the defense.


Defense Mechanisms

[Figure: applications and attacker contending for host, memory and CPU]

Defense mechanisms: mechanisms that do not contribute to the principal functionality of the system, but are included in the system to preserve or bolster C, I and A
  Tools, protocols, subsystems...
  Network, host (OS) and application layer mechanisms




Survivability Architecture


Survivability architecture:
  survivability goals + undefended system + design principles
  organization of components, both functional components from the undefended system and the added defense mechanisms, their interconnections, and the protocols that govern them
  Entities, interconnections, protocols...




Designing for Survivability


Key motto: combine protection-detection-adaptive response
  High barrier to entry from outside as well as when going from one part to another
  Improve the chance to spot attacker activity
  Adapt to changes caused by the attacker

[Figure label: Key Assets]


Dynamic Defense in Depth


Multiple layers of defense
  Unlikely that all layers have the same hole
Dynamically changing the defenses
  Analogous to changing your passwords
  Reduces the likelihood of success of dictionary attacks
Unpredictable to the attacker
  Disclose as little as possible to the attacker; confuse and obfuscate his view

Choice and organization of defenses: requirements + design principles


Design Principles


SPOF protection


Controlled use of diversity


Physical barriers before key assets


Robust basis of defense in depth


Containment layers


Modularity


Range of adaptive responses


Human override


Minimalism


Configuration generation from specs


Many of these are surprisingly simple and intuitive, but it is also surprising how many of them are routinely ignored in current system design


SPOF Protection


It may be impossible to protect all “single points of failure” in a system
  Depending on the level of abstraction/granularity there may be way too many
  Do not go overboard in choosing the “unit”
    A host, a process, an instance representing a physical object...
    Not the DMA controller, bus, or the CPU in a host
Units that perform key or essential functions and are exposed to the outside must not be left as SPOFs
  The web server that runs your electronic storefront, or facilitates collaboration
  The database or application server that your sales force or analysts constantly need
  Do not ignore how you access the network!
Typically mitigated by redundancy
  Spatial redundancy may not always be possible
  Redundancy in the time domain (restart)
  Managing redundancy
    Transparent (middleware)
    Applications are aware of the redundancy
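
As a rough illustration of redundancy managed transparently by middleware, the sketch below masks a silent or corrupt replica by taking a majority vote over replica replies. The slides do not describe a voting rule; the function name `vote`, the `quorum` parameter and the use of `None` for a missing reply are assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote(replies: Sequence[Optional[Hashable]], quorum: int) -> Optional[Hashable]:
    """Return the value reported by at least `quorum` replicas, else None.

    A reply of None stands for a replica that did not answer in time.
    """
    # Require a strict majority so two different values can never both win.
    if quorum * 2 <= len(replies):
        raise ValueError("quorum must be a strict majority of the replicas")
    counts = Counter(r for r in replies if r is not None)
    if not counts:
        return None
    value, seen = counts.most_common(1)[0]
    return value if seen >= quorum else None


# Four replicas, one of them corrupt: the corrupt reply is outvoted.
print(vote(["v1", "v1", "v1", "bad"], quorum=3))  # v1
# Only two replicas agree: no value reaches the quorum.
print(vote(["v1", "v1", "bad", None], quorum=3))  # None
```

With the caller seeing only the voted result, the application stays unaware of the replicas; an application-aware design would instead expose the individual replies.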




Diversity and Physical Barriers


Notion of “zones”: crumple zone (access points), operations zone (main operational functionality), executive zone (management and decision-making functions), with controlled communication between them

[Figure sequence:
  Applications accessing the “key asset” over the network. SPOF?
  Introduce redundancy (replicas a, b, c, d). Diversity? Run the same attack 4 times?
  Introduce diversity. Accessibility of 4 replicas? 4 replicas are still accessible.
  Introduce physical barriers using DMZ.]

Enablers
  Application level proxies
  Additional features
    Rate limiting
    Size limiting
    Learning usage pattern
    Tunnel termination
    Insertion of protocol diversity
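
The rate limiting and size limiting features listed for application level proxies can be sketched as follows. This is a minimal token-bucket example, not the DPASA proxy implementation: the slides do not say which algorithm was used, and the class name `ProxyGate` and its parameters are hypothetical.

```python
class ProxyGate:
    """Admit a request only if it is small enough and a token is available."""

    def __init__(self, rate_per_s: float, burst: int, max_bytes: int):
        self.rate_per_s = rate_per_s  # tokens added per second
        self.burst = burst            # maximum number of stored tokens
        self.max_bytes = max_bytes    # largest message the proxy forwards
        self.tokens = float(burst)
        self.last = 0.0               # time of the previous refill

    def admit(self, now: float, size: int) -> bool:
        # Size limiting: reject oversized messages outright.
        if size > self.max_bytes:
            return False
        # Rate limiting: refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_s)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


gate = ProxyGate(rate_per_s=1.0, burst=2, max_bytes=100)
print([gate.admit(0.0, 10) for _ in range(3)])  # [True, True, False]
print(gate.admit(0.0, 1000))                    # False (too large)
print(gate.admit(1.0, 10))                      # True (one token refilled)
```

Passing the clock value in as `now` keeps the example deterministic; a real proxy would read a monotonic clock instead.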


Controlled Use of Diversity

[Figure: replicas a, b, c and d arranged in quads (quad 1, quad 2, quad 4) on a network, with hosts running SELinux, Windows and Solaris. Logos are registered trademarks of their respective owners.]

Source of artificial diversity
  Hardware architecture
  OS
  Programming language
  Application
COTS
n-version programming?
Automated diversity generation?

Diversity is expensive
  Initial investment, continued maintenance and management

Controlled use of diversity
  In a given situation more diversity is not necessarily better
    Given the organization shown in the figure, using 4 different OSes is not better than using 3
  There are situations where a small additional investment provides a big payoff
    Identify and take advantage of these!


Robust Basis for Defense in Depth

It is likely that a majority of the defense mechanisms are “software”
  They depend on hardware, OS and network services
  They may depend on other software mechanisms as well!
How to avoid a “house of cards” in building defense in depth?
Forming a robust basis: useful things to consider while trying to satisfy a need
  Hardware-based mechanisms
  Cryptographic strengths
  Assumptions about operating environment

Interconnecting hosts in a network or inter-network: use of managed switches is better than programming it in
Storing and using private keys: smart cards or separate co-processors are better than using the main disk/memory/CPU
Fine-grained packet filtering and encryption: a NIC-based solution is better than software tools (iptables etc.)
Redundancy: hardware-based vs. software-based


Containment Layers

Containment layers: architectural constructs that help limit the spread of attacks and attack effects
Two main dimensions to consider: spatial and functional

[Figure: containment in the spatial dimension. Labels: process, host, network segment; crumple zone, operations zone, executive zone; quad 1 to quad 4]

[Figure: adding the functional dimension. Labels: system management function; operations zone proxy of the system management function; main functionality: PSQ (publish-subscribe and query server); application level proxies]


Modularity


Survivable system must adapt to changes caused by attacks
Is containment + redundancy enough to support adaptive response?
  Will the system still work if you kill the affected application?
  What if we have to go up in the spatial containment hierarchy: shut down the host, quarantine the host or the network containing the host?

[Figure: the zone and quad layout (crumple, operations and executive zones; quad 1 to quad 4; system management function, its operations zone proxy, the PSQ server and the application level proxies) with one component marked X]

Modularity is the design property that facilitates such responses

Enablers:
  Actuator mechanisms: to effect the response
  Post-action coordination (implemented in code): healing/recovery, masking/degradation
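
Going up the spatial containment hierarchy can be pictured as walking an ordered list of layers, each with its own actuator action. The sketch below is only an illustration: the layer ordering follows the labels on the Containment Layers slide, while the `ACTIONS` table and the function names are assumptions for this example.

```python
from typing import List, Optional

# Assumed ordering of spatial containment layers, innermost first.
SPATIAL_LAYERS = ["process", "host", "network segment"]

# Hypothetical actuator actions, one per layer, based on the slide's examples.
ACTIONS = {
    "process": "kill the affected application",
    "host": "shut down or quarantine the host",
    "network segment": "quarantine the network containing the host",
}


def escalate(layer: str) -> Optional[str]:
    """Return the next layer up the containment hierarchy, or None at the top."""
    i = SPATIAL_LAYERS.index(layer)
    return SPATIAL_LAYERS[i + 1] if i + 1 < len(SPATIAL_LAYERS) else None


def response_plan(start: str) -> List[str]:
    """List the actions in order, starting at `start` and moving outward."""
    plan = []
    layer: Optional[str] = start
    while layer is not None:
        plan.append(ACTIONS[layer])
        layer = escalate(layer)
    return plan


for step in response_plan("process"):
    print(step)
```

Each step only helps if the rest of the system keeps working once that unit is removed, which is the modularity property the slide asks for.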


Range of Adaptive Response


Rapid response: local scope, fully automated, local decision making based on local observation
  Spurious file [process]: delete [kill]
  Lost file: recover
Coordinated response: system-wide scope, mostly automated, coordinated decision making (multiple rounds of message exchanges) based on corroborated information from multiple parts of the system
  Restart a function, reboot a host, isolate a network
Human-assisted response:
  Clean a host and restart
  Examine the log (forensics) to identify a signature and patch

Survivable system must adapt to changes caused by attacks
It is important to have a range of adaptive responses
  Some symptoms are more critical than others, e.g., a port scan vs. all heartbeats going down
  In some cases a response delayed is a response failed, e.g., after observing an attack signature
  Some responses are more severe than others, e.g., restoring a file vs. isolating a network
Enablers:
  Advanced middleware, sensors and correlators, logical decision tools/expert systems
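
One way to picture the three response tiers is a lookup from observed symptom to a response and its tier, with anything unrecognized handed to a human operator. This is a sketch under stated assumptions: the symptom names and the symptom-to-response pairings are hypothetical, and only the tiers and the example actions come from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    RAPID = "rapid"              # local scope, fully automated
    COORDINATED = "coordinated"  # system-wide scope, mostly automated
    HUMAN = "human-assisted"     # operator in the loop


@dataclass(frozen=True)
class Response:
    tier: Tier
    action: str


# Hypothetical symptom names paired with the slide's example responses.
PLAYBOOK = {
    "spurious_file": Response(Tier.RAPID, "delete file"),
    "spurious_process": Response(Tier.RAPID, "kill process"),
    "lost_file": Response(Tier.RAPID, "recover file"),
    "function_unresponsive": Response(Tier.COORDINATED, "restart function"),
    "host_suspect": Response(Tier.COORDINATED, "reboot host"),
    "all_heartbeats_down": Response(Tier.COORDINATED, "isolate network"),
}


def select_response(symptom: str) -> Response:
    """Return the playbook response, or hand off to a human operator."""
    fallback = Response(Tier.HUMAN, "examine logs, identify signature, patch")
    return PLAYBOOK.get(symptom, fallback)


print(select_response("lost_file").action)          # recover file
print(select_response("unknown_anomaly").tier.value)  # human-assisted
```

A static table like this ignores corroboration across sensors; the coordinated tier on the slide relies on multiple rounds of message exchange before acting.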

Baseline and Survivable System


Baseline (Undefended System)

[Figure: baseline (undefended) system. Labels: CORE LAN with a hub and the JBOSS app server hosting the PSQ server and its repository (information object, metadata and security repositories); Client LANs 1 to 4 with Clients 1 to 10 (applications WxHaz, ChemHaz, TAP, AODB, TARGET, CAF, CombatOps, MAF, EDC, JEES) and servers SWDIST (BE SVR 1), AODBSVR (DB SVR 1) and TAPDB (DB SVR 2); public IP network (emulated IP network); hosts run various versions of Solaris and Windows.]


Defense-Enabled System

[Figure: defense-enabled system. Labels: the baseline clients and servers (WxHaz, ChemHaz, TAP, AODB, TARGET, CAF, CombatOps, MAF, EDC, JEES, SWDIST, AODBSVR, TAPDB) on Client LANs 1 to 4; VPN routers; hubs; NIDS; QIS; ADF NIC; Bump In Wire w/ADF; VLAN; host operating systems SELinux, WinXP Pro, WinXP, Win2000 and Solaris 8; experiment control/logging network; emulated IP network using VLANs in a single Cisco 3750.]


Key Aspects of the Survivability Architecture


Defense mechanisms
  Policy enforcement
  Encryption
  Authentication
  Detection and correlation
  Redundancy/redundancy management
  Adaptive response (recover, degrade)
Design principles and enablers
  Multiple layers: policy, encryption, authentication...
  SPOF, diversity, hardware grounding, modularity, containment, range of adaptive response
Architectural elements
  Zones, quadrants, survivable middleware, protection domains
  System Managers (SM), Access Proxies (AP), Local Controllers (LC)
Protocols
  Corruption-tolerant PSQ: embedded in the survivable middleware
  Heartbeats
  Alerts: embedded sensors
  Command: among SMs, SM-LC...
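
The heartbeat protocol is only named on the slide. A minimal sketch of the usual idea, assuming a fixed timeout and hypothetical component names, records the last time each component was heard from and reports those that have gone quiet:

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Track last-seen times and report components whose heartbeat is overdue."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def beat(self, component: str, now: float) -> None:
        # Record that `component` was alive at time `now`.
        self.last_seen[component] = now

    def missing(self, now: float) -> List[str]:
        # A component is overdue when its last beat is older than the timeout.
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("quad1-sm", now=0.0)  # hypothetical component names
monitor.beat("quad2-sm", now=0.0)
monitor.beat("quad1-sm", now=4.0)
print(monitor.missing(now=5.0))    # ['quad2-sm']
```

A missed heartbeat is a symptom rather than a diagnosis: it may indicate a crash, an attack, or a network problem, which is why the slides pair sensors with correlation before a coordinated response.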


Some Annotations



Redundancy/Diversity
  [Figure labels: VPN firewall, switches, JVM, ADF, CSA/SELinux, application; layers: application, process, host, network, system]

Policy enforcement (permissions, capabilities)
  JVM security policy
  SELinux/CSA policies
  Process protection domain
  System protection domain
  ADF policies
  Network protection domain

Encryption
  Outer VPN
  ADF VPGs
  Application level encryption

Authentication
  VPN level (router-router, router-hosts)
  ADF level (host to host)
  Application level

Detection and correlation
  Embedded sensors (applications, proxies, heartbeats)
  Policy engines
  NIDS
  EMERALD
  Advisor

Adaptive responses
  Restore files
  Kill processes
  Isolate host
  Reboot host
  Retry PSQ operations
  Adjust redundancy/level of tolerance (degrade)
  Restart application
  Quarantine network segments


Survivable Middleware

Undefended pub/sub middleware: password login, no redundancy at core

Survivable middleware adds a stronger level of authentication, access control and reliability: cryptography-based login, redundant core, transparent protocol based on weak assumptions, multiple transports

[Figure: undefended PSQ middleware. Client: application, common interface, common API implementation, PUB/SUB connector, transport substrate. Core: JBOSS hosting the PSQ platform (PSQ function, Rep), transport substrate, data from outside.]

[Figure: survivable PSQ middleware across the client zone, crumple zone, operations zone and executive zone. Client: application, common interface, survivability delegate, common API implementation, specialized stub. Other labels: PSQ proxy, protocol handler, PUB/SUB connector, JBOSS hosting the PSQ platform (PSQ function, Rep), data from outside. Other elements (e.g., management) are not shown.]

Evaluation Results


Red Team Evaluation (Adversarial)


Run 1
  Defended system ran for 14 hours with no visible impact
  The policies were so tight that the red team had no visibility of their actions or their impact
Run 2 (policy modified to give the red team the visibility they requested)
  12-hour scenario completed, but the red team was able to cause significant hiccups during the scenario
  With the added visibility they were able to DoS specific clients when they needed to publish information
Run 3 (different red team)
  Within an hour they took out the PIX VPNs!
  Residual flaw in the Cisco router configuration (recall that red teams have complete knowledge of everything): in addition to the agreed-upon span port, the configuration also gave them trunk port access!
  Rerunning the same attack without the trunk port did not succeed, but the red team was divided in their opinion about whether the attack could be customized to work without the trunk port access

Although having access to trunk ports in multiple routers in a backbone is a considerable amount of privilege, run 3 exposed and exploited the tradeoff we made in the design!

No loss of published information or corruption!


Red Team Evaluation (Cooperative)

Run  Date   Run Length  Outcome
1    11/7   165m        Scenario completed
2    11/8   60m         Scenario completed
3    11/8   53m         Scenario completed
4    11/8   53m         Scenario completed
5    11/9   61m         Scenario completed
6    11/9   148m        Scenario completed
7    11/10  65m         Scenario completed
8    11/10  156m        Scenario not completed
9    11/11  105m        Scenario completed
10   11/11  66m         Scenario completed
11   11/14  100m        Scenario completed
12   11/14  59m         Scenario completed
13   11/14  60m         Scenario completed
14   11/15  177m        Scenario not completed
15   11/15              Aborted
16   11/15  67m         Scenario completed
17   11/16  54m         Scenario completed
18   11/17  55m         Scenario completed
19   11/17  165m        Scenario not completed
20   11/18  227m        Scenario completed after 3 hrs
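
To make the run table easier to summarize, the short script below tallies the outcomes and computes the median length of the completed runs. The tuples are transcribed from the table; the variable and function names are only for this example, and the aborted run has no recorded length.

```python
from collections import Counter
from statistics import median

# (run, length in minutes or None, outcome), transcribed from the run table.
RUNS = [
    (1, 165, "completed"), (2, 60, "completed"), (3, 53, "completed"),
    (4, 53, "completed"), (5, 61, "completed"), (6, 148, "completed"),
    (7, 65, "completed"), (8, 156, "not completed"), (9, 105, "completed"),
    (10, 66, "completed"), (11, 100, "completed"), (12, 59, "completed"),
    (13, 60, "completed"), (14, 177, "not completed"), (15, None, "aborted"),
    (16, 67, "completed"), (17, 54, "completed"), (18, 55, "completed"),
    (19, 165, "not completed"), (20, 227, "completed"),
]


def tally(runs):
    """Count runs per outcome."""
    return Counter(outcome for _, _, outcome in runs)


def completed_lengths(runs):
    """Lengths in minutes of the runs whose scenario completed."""
    return [m for _, m, outcome in runs if outcome == "completed"]


counts = tally(RUNS)
print(counts["completed"], counts["not completed"], counts["aborted"])  # 16 3 1
print(median(completed_lengths(RUNS)))  # 63.0
```

So 16 of the 20 cooperative runs completed the scenario, 3 did not, and 1 was aborted.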
Compressed 3 hr scenario

“Traitor” blue team member(s) worked with the red team

Red team started inside the defended system with attack code pre-positioned

High-level access and higher privilege implied some of the sensors were blind

This extraordinary success of the defense required considerable human help

New flaws and defense opportunities exposed: bad refs + Spread, Java serialization, MSQL Injection, ADF Policy Server exploit

Conclusion and Future Direction


Current Conclusion


A high-water mark in survivable system design
  Proof that information systems can be made highly survivable
  Survivability architecture: individual mechanisms abound; this was a great first example of integrating them coherently with a tight and consistent policy
There is no such thing as an “improbable risk” against a highly motivated adversary
  Exploiting the SPOF PIX VPN routers was assessed to be an improbable risk
Created a daunting level of difficulty to breach confidentiality and integrity, but availability is not there yet
  That is despite all the redundancy, diversity and adaptive response
  Loss is easily detected
Human intelligence required in interpreting observed information and controlling the architecture


Future Direction


What to do with availability
  Beyond degradation?
  Regenerate? Learn while you regenerate?
  Artificial diversity?
Minimizing the need for human intelligence?
  Motivation
    Cost issue
    Response time
    Human factors
  Can there be an expert system/expert assistant?


Reference Material


Useful technologies
  ADF (3COM, Secure Computing, Adventium Labs)
    http://doi.ieeecomputersociety.org/10.1109/DSN.2006.17
    http://doi.ieeecomputersociety.org/10.1109/DISCEX.2001.932222
  SELinux, CSA
    http://www.nsa.gov/selinux/
    http://www.cisco.com/en/US/products/sw/secursw/ps5057/index.html
  EMERALD (SRI)
    http://www.csl.sri.com/projects/emerald/
  Routers, managed switches (various vendors: Cisco, HP etc.)
    http://www.cisco.com/warp/public/707/21.html
    http://www.hp.com/rnd/index.htm
  Tripwire (Tripwire Inc), Veracity (Rocksoft)
    http://www.tripwire.com/index.cfm
  Spread (JHU, Spread Concepts)
    http://www.spreadconcepts.com/
    http://www.dsn.jhu.edu/research/group/secure_spread/
  BFT protocols
    L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Trans. Program. Lang. Syst., 4(3):382-401, 1982.
    http://www.cs.cornell.edu/fbs/publications/2004-1924.pdf
    Malkhi, Reiter, Castro, Liskov etc.
  Advanced middleware like QuO
    http://quo.bbn.com

For more information
  Papers about this project:
    http://www.dist-systems.bbn.com/papers/2005/ACSAC/index2.shtml
    http://www.dist-systems.bbn.com/papers/2005/ACSAC/index.shtml
    http://www.dist-systems.bbn.com/papers/2006/NCA/index.shtml
    http://www.dist-systems.bbn.com/papers/2005/NCA/index.shtml
  Other BBN papers
    Michael Atighetchi, Partha Pal, Franklin Webber, Richard Schantz, Christopher Jones, Joseph Loyall. Adaptive Cyberdefense for Survival and Intrusion Tolerance. IEEE Internet Computing, Vol. 8, No. 6, November/December 2004, pp. 25-33.
    http://www.dist-systems.bbn.com/papers/2006/SPE/index.shtml
  COCA
    http://www.cs.cornell.edu/home/ldzhou/coca.htm
  MAFTIA paper
    http://www.maftia.org/
  OASIS book:
    http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/oasis/2003/2057/00/2057toc.xml