© INTELLIGENT AUTOMATION, INC. PROPRIETARY INFORMATION
Bayesian Security Analysis: Opportunities and Challenges
ARO Workshop, Nov 14, 2007
Jason Li
Intelligent Automation Inc
Peng Liu
Penn State University
Outline
Introductions
Overview of Bayesian Networks
Opportunities in Security Analysis
Challenges
Roadmap
Securing Large Networks
A network defender’s primary advantage over an
attacker is intimate knowledge of the network
Defender’s Arsenal
Vulnerability scanners
Firewall / Routers / other infrastructure
Databases
Intrusion Detection Systems
Network defense must fully leverage that advantage
Connecting the Dots
[Figure: an attack graph with nodes "Attacker start", "Vuln. 1", "Vuln. 4", "User on Host 2", and "Root on Host 1", annotated with NVD entries and IDS alerts of the form "src A dst B attack C".]
Where are the vulnerabilities?
What do they mean?
How can an attacker get to them?
What is the situation?
It is challenging to do this automatically and quickly
Dream Tools for System Administrator
Automatic tools to assist consistent and secure configuration to enable normal operations
Equipped with sufficient security sensors for a rainy day
No alarms under normal operations: life is beautiful
When the sensors go off, don’t flood me with just alarms
With tons of alarms: don’t know what’s going on; ignore them
Instead, tell me some in-depth knowledge
What’s wrong? (e.g. where, what, scope)
What does this mean? (e.g. severity, impact assessment)
What will happen next? (e.g. downstream)
What can I do? (e.g. suggestions please)
Better yet, tell me all these within several minutes of alarms
Preventive: Is there some layered protection so that most (common) attacks won’t even be able to cause damage?
From Information to Intelligence
However, today’s technology is far from being capable of reaching such goals.
Current security analysis tools for enterprise networks typically examine only individual firewalls, routers, or hosts separately
They do not comprehensively analyze overall network security.
Certainly not sufficient
Our observation: much work has gone into transforming “data” into “information” (e.g. alarms in IDS), but relatively little, and not enough, into transforming “information” into “intelligence” (e.g. situational awareness, action planning, etc.)
[Figure: attacks on networks and systems are monitored to produce data; alarms turn data into information. Then what?]
Introduction: Attack Graphs
To address this problem, attack graphs have emerged as the mainstream technology
Network-wide analysis
Multi-stage attacks
General Idea
Nodes represent network security states
Edges represent state transitions via exploits
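The node and edge semantics above can be sketched in a few lines of Python. This is only a minimal illustration: the state and exploit names are hypothetical (loosely borrowed from the "Connecting the Dots" figure), and the helper `attack_paths` is not part of any attack graph tool mentioned in this talk.

```python
from collections import deque

# Hypothetical attack graph: each key is a security state (node); each
# (exploit, next_state) pair is a state transition via an exploit (edge).
ATTACK_GRAPH = {
    "attacker_start": [("vuln_1", "user_on_host_2"), ("vuln_2", "user_on_host_1")],
    "user_on_host_2": [("vuln_4", "root_on_host_1")],
    "user_on_host_1": [("vuln_3", "root_on_host_1")],
    "root_on_host_1": [],
}


def attack_paths(graph, start, goal):
    """Return every cycle-free exploit sequence leading from start to goal."""
    paths = []
    queue = deque([(start, [], {start})])
    while queue:
        state, exploits, seen = queue.popleft()
        if state == goal:
            paths.append(exploits)
            continue
        for exploit, next_state in graph.get(state, []):
            if next_state not in seen:  # never revisit a state on the same path
                queue.append((next_state, exploits + [exploit], seen | {next_state}))
    return paths


paths = attack_paths(ATTACK_GRAPH, "attacker_start", "root_on_host_1")
for path in paths:
    print(" -> ".join(path))
```

Each returned list is one multi-stage attack: a sequence of exploits that moves the attacker from the start state to the goal state.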
To make attack graph tools useful, we identify the following requirements
Introduction: Requirements
Automatic generation algorithms
Attack graphs must be scalable
Thousands of nodes
Efficient and Powerful Analysis
The attack graph size must be scalable
The semantics must be rich enough, but no richer
Static analysis, situational awareness, what-if, etc.
Attack graphs must be practical
Attack graph tools that entail laborious manual efforts, poor scalability, and clumsy analysis are considered impractical
Network reachability information (e.g. analyze firewall rules)
Real-time software tool
Review of Prior Art
| Attack Graph Models | Semantics | Scalability | Analysis Capability |
|---|---|---|---|
| Carnegie Mellon University Attack Graphs (CMU-AG) [6] | Network states and transitions. Richest semantics | Extremely poor | Good analysis, but limited by its scalability |
| George Mason University Attack Graphs (GMU-AG) [1][2][4] | Similar semantics as CMU-AG | Better than CMU-AG. Still poor: O(N^6) | Good analysis |
| Kansas State University Attack Graphs (KSU-AG) [5] | Visualization of Datalog rule analysis | Between O(N^2) and O(N^3). The best upper bound | Limited analysis. Analysis done by XSB, a Prolog engine. |
| MIT Lincoln Lab Attack Graphs (MIT-LL-AG) [3] | Nodes represent hosts; edges represent attacks on vulnerabilities. Very simple | Between O(N^2) and O(N^3). Can be much larger | Static analysis only. No dynamic analysis or action planning |

Our goal: Appropriate Semantics and Powerful Analysis
What is a Bayesian Network?
A Bayesian network is a graphical model that represents the problem domain in a probabilistic manner.
Nodes represent propositions of interest
Directed links represent immediate influence
The parameters associated with each node represent the strength of such immediate influence
Conditional Probability Table (CPT)
Representation: Breaking the Joint
A joint distribution can always be broken down into a product of conditional probabilities using repeated applications of the product rule
$P(A,B,E,J,M) = P(A)\,P(B \mid A)\,P(E \mid A,B)\,P(J \mid A,B,E)\,P(M \mid A,B,E,J)$
We can order the variables however we like
$P(A,B,E,J,M) = P(B)\,P(E \mid B)\,P(A \mid B,E)\,P(J \mid B,E,A)\,P(M \mid B,E,A,J)$
Compact Representation
$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parent}(X_i))$
A Bayesian network represents the assumption that each node is conditionally independent of all its non-descendants given its parents
$P(J \mid B,E,A) = P(J \mid A)$
$P(M \mid B,E,A,J) = P(M \mid A)$
The joint as a product of CPTs
$P(A,B,E,J,M) = P(B)\,P(E)\,P(A \mid B,E)\,P(J \mid A)\,P(M \mid A)$
So the CPTs determine the full joint distribution ($n \cdot 2^k$ vs. $2^n$ parameters, for $n$ binary nodes with at most $k$ parents each)
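To make the factorization concrete, here is a minimal Python sketch that builds the joint from five CPTs and counts parameters. It assumes the letters stand for the familiar burglary-alarm textbook example (B = burglary, E = earthquake, A = alarm, J and M = two neighbors calling), and the probability values are the usual illustrative textbook numbers, not data from this work.

```python
from itertools import product

# Illustrative CPTs (assumed values). Each entry is P(node = True | parents).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (B, E)
P_J = {True: 0.90, False: 0.05}                      # keyed by A
P_M = {True: 0.70, False: 0.01}                      # keyed by A


def bern(p_true, value):
    """Probability of a binary value given P(value = True)."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(A,B,E,J,M) = P(B) P(E) P(A|B,E) P(J|A) P(M|A)."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


# The product of CPTs is a proper distribution: it sums to 1 over all 2^5 worlds.
total = sum(joint(*world) for world in product((True, False), repeat=5))

# Parameter counts: one number per CPT row vs. one per joint entry (minus one).
cpt_parameters = 1 + 1 + len(P_A) + len(P_J) + len(P_M)
full_joint_parameters = 2 ** 5 - 1

print(round(total, 12), cpt_parameters, full_joint_parameters)
```

Here the network needs 10 numbers where the unstructured joint needs 31; the gap widens exponentially as the number of nodes grows.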
The Basic Inference Problem
Given
1. A Bayesian network BN
2. Evidence e: an instantiation of some of the variables in BN (e can be empty)
3. A query variable Q
Compute $P(Q \mid e)$: the (marginal) conditional distribution over Q
Given what we do know, compute the distribution over what we don’t
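The inference problem stated above can be solved on small networks by brute-force enumeration of the joint. The sketch below is only an illustration under assumptions: it reuses the burglary-alarm reading of A, B, E, J, M with the usual textbook CPT values, and the function name `query` is hypothetical. Enumeration is exponential in the number of variables, so it is not a practical engine for large networks.

```python
from itertools import product

# Illustrative CPTs (assumed textbook values): P(node = True | parents).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}
NAMES = ("b", "e", "a", "j", "m")


def bern(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


def query(target, evidence):
    """Return P(target = True | evidence) by summing the joint over all worlds."""
    numerator = denominator = 0.0
    for values in product((True, False), repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # world contradicts the evidence
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


diagnosis = query("b", {"j": True, "m": True})   # effects observed, cause queried
prediction = query("j", {"b": True})             # cause observed, effect queried
prior = query("b", {})                           # empty evidence
print(round(diagnosis, 3), round(prediction, 3), round(prior, 3))
```

The same function answers diagnostic, predictive, and prior queries; only the choice of evidence and query variable changes.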
Why Bayesian Networks
Uncertainty management
Local independence structure and d-separation
Compact representation ($n \cdot 2^k$ vs. $2^n$)
Efficient inference
General expressiveness
Supporting planning and action modeling:
Provides belief states
Game theory, Markov Decision Processes
Scope
Focus on basic Bayesian networks for insights
Will not discuss other (more advanced) BN models
DBN (Dynamic BN)
MEBN (Multi-entity Bayesian net)
MSBN (Multi-Sectioned Bayesian net)
SLBN (Semantically Linked Bayesian net)
OOBN (Object-oriented Bayesian net)
Deep understanding is necessary
The problem domain
The appropriate BN models
High level security analysis (not alert correlation)
Powerful Analysis Made Possible
Look at a well-known example in the BN community
Our BN model for cyber security analysis will share a similar flavor (work in progress)
[Figure: Bayesian network with nodes Visit to Asia (A), Smoking (S), Tuberculosis (T), Lung cancer (L), Bronchitis (B), Either tuberculosis or cancer (E), Positive X-ray (X), and Dyspnoea (D).]
Support All Kinds of Inference
[Figure: the network above, shown three times with different nodes marked as evidence and as query.]
Diagnosis: from observed effects back to likely causes
Prediction: from known causes forward to likely effects
Mixed: evidence on both causes and effects
Inference with Intervention
Most probabilistic models (including general Bayesian nets) describe a distribution over possible events but say nothing about what will happen if a certain intervention occurs
A causal network adds the property that the parents of each node are its direct causes, and thus goes beyond regular probabilistic models
Mechanisms = stable functional relationships = graphs (equations)
Interventions = surgeries on mechanisms
Seeing vs. Doing
Seeing (passive observation): alerts
Would like to know the consequences of, and the possible causes for, such observations (via regular inference algorithms)
Doing (active setting): set the value of a node via active experiment
“Would the problematic circuit work normally if I replace this suspicious component with a good one?”
External reasons (the human diagnoser) explain why the suspicious component becomes good
All its parent nodes should not count as causes
Delete all links that point to this node
Other belief updating is not affected
An Example
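[Figure: the sprinkler network shown twice. Left: X1 (SEASON) points to X2 (RAIN) and X3 (SPRINKLER), both of which point to X4 (WET), which points to X5 (SLIPPERY). Right: the same network after the intervention SPRINKLER = ON.]
Before the intervention:
$P(x_1, x_2, x_3, x_4, x_5) = P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_1)\,P(x_4 \mid x_2, x_3)\,P(x_5 \mid x_4)$
After the intervention:
$P(x_1, x_2, x_4, x_5) = P(x_1)\,P(x_2 \mid x_1)\,P(x_4 \mid x_2, X_3 = \mathrm{on})\,P(x_5 \mid x_4)$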
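The difference between seeing and doing in this example can be checked numerically. The following is a minimal sketch with made-up CPT values (all numbers and the helper names `joint` and `query` are assumptions for illustration). Observing the sprinkler on changes our belief about the season; forcing it on removes the factor $P(x_3 \mid x_1)$ and leaves the belief about the season at its prior.

```python
from itertools import product

# Hypothetical CPTs for the sprinkler network (values are illustrative only).
P_SEASON = {"summer": 0.5, "winter": 0.5}             # P(x1)
P_RAIN = {"summer": 0.1, "winter": 0.6}               # P(x2 = True | x1)
P_SPRINKLER = {"summer": 0.7, "winter": 0.1}          # P(x3 = on | x1)
P_WET = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.9, (False, False): 0.0}     # P(x4 = True | x2, x3)
P_SLIPPERY = {True: 0.8, False: 0.0}                  # P(x5 = True | x4)

NAMES = ("season", "rain", "sprinkler", "wet", "slippery")
DOMAINS = (("summer", "winter"),) + ((True, False),) * 4


def bern(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(world, do_sprinkler=None):
    """Joint probability; with do_sprinkler set, P(x3 | x1) is cut out (graph surgery)."""
    p = P_SEASON[world["season"]] * bern(P_RAIN[world["season"]], world["rain"])
    if do_sprinkler is None:
        p *= bern(P_SPRINKLER[world["season"]], world["sprinkler"])
    elif world["sprinkler"] != do_sprinkler:
        return 0.0  # the intervened node is fixed to the forced value
    p *= bern(P_WET[(world["rain"], world["sprinkler"])], world["wet"])
    return p * bern(P_SLIPPERY[world["wet"]], world["slippery"])


def query(name, value, evidence=None, do_sprinkler=None):
    """P(name = value | evidence), optionally under the intervention do(sprinkler)."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product(*DOMAINS):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(world, do_sprinkler)
        denominator += p
        if world[name] == value:
            numerator += p
    return numerator / denominator


seeing = query("season", "summer", evidence={"sprinkler": True})
doing = query("season", "summer", do_sprinkler=True)
print(round(seeing, 3), round(doing, 3))
```

With these numbers, seeing the sprinkler on raises the belief in summer to 0.875, while turning it on by hand leaves it at 0.5, because the link from SEASON into SPRINKLER has been deleted.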
What-if Analysis Made Possible!
Provide a what-if dialog for the system admin
Execute “graph surgery”
Implement using multi-agent system paradigm for efficient inference
Provide timely results
What Bayesian Networks can do for us
Situational awareness: “what is going on?”
Prediction: “given the current situation, what is most likely to happen next?”
What-if analysis: “what will happen if I patch this service?”
Specify additional tests to perform: “which sensors to look at first to confirm/rule out?”
Suggest appropriate/cost-effective treatments/actions: “what to do first to maximize the gain?”
Preventive maintenance: “what are the most vulnerable spots?”
Challenges of Using Bayesian Networks
Representation
Capturing the uncertainty in the cyber security domain
From attack graphs to Bayesian networks
Semantics, semantics, semantics
Inference
Powerful and responsive
Learning
Tune the Bayesian networks
Challenges: Representation
Uncertainty management
Alerts themselves
Exploit sequence
Attack consequences
Attack intent
… and so on
Connecting uncertainty management with attack graph models
Semantics compatibility (node and link semantics)
Translation algorithm
Does this make sense?
Challenges: Inference
Tracking dynamic attacks on large-scale networks will be a very processor-intensive task.
Evaluating what-if solutions must be done in real time, in order to allow the human operator time to find and enforce his or her course of action.
Available standard BN products do not scale
A scalable, (much) faster inference engine is needed
Challenges: Learning
Mining from some dataset
What dataset
Appropriate for mining (relevant information)
Learn the structure
Model selection
Meaningful structure for security analysis
Expertise vs. learning
Learn the parameters
From dataset (e.g. EM algorithm)
Subjective nature of the parameters
Do the parameters reflect the situations?
How do we use Bayesian Nets?
Build Bayesian network models
Capturing uncertainty
Roadmap to build Bayesian network models
Powerful analysis algorithms
Clique tree based message passing algorithms
Multi-agent based approach
Learning (not included in this talk)
Capturing Uncertainty in Cyber Security
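Class 1: uncertainty about alerts
Whether the alert is a true positive or a false positive
Class 2: uncertainty about exploit sequence
Class 3: uncertainty about possible consequences
Misconfigurations
Inconsistent patches
[Figure: exploit e1 followed by e2 with probability $p(e_2 \mid e_1)$ or by e3 with probability $p(e_3 \mid e_1)$; a second diagram with states S1 and S2 and exploits e2 and e3.]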
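Class 1 uncertainty can be illustrated with a single application of Bayes' rule. The sketch below uses made-up rates (the prior, detection rate, and false-positive rate are assumptions, not measurements) to show how much an alert should raise the belief that an exploit really occurred.

```python
def posterior_given_alert(prior, detection_rate, false_positive_rate):
    """P(exploit | alert) from P(exploit), P(alert | exploit), P(alert | no exploit)."""
    true_alerts = detection_rate * prior
    false_alerts = false_positive_rate * (1.0 - prior)
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical numbers: exploits are rare, the sensor is good but noisy.
belief = posterior_given_alert(prior=0.01, detection_rate=0.9, false_positive_rate=0.05)
print(round(belief, 4))
```

Even with a 90% detection rate, the posterior here is only about 0.15 because false positives dominate when exploits are rare, which is one reason a single alert cannot be taken at face value.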
Building Bayesian Networks: Semantics
Nodes
Aggregate exploits
There are too many specific exploits
Checking each and every one is infeasible
Some exploits have common signatures
Aggregate states
Similar hosts (in terms of network segment, software configuration, etc.) are equivalent
May represent some intermediate stage of multi-stage attacks (e.g. gaining a user account, with the goal of root privilege)
Directed links
“lead to” (e.g., exploit e1 leads to aggregate state s3)
[Figure: exploit e1 with a directed link to aggregate state S3.]
Our Approach to Build Bayesian Networks
Structure
From the deterministic attack graph (with too many repetitive structures embodied, sometimes misleading)
Nodes are created based on aggregation techniques (reachability group, same enclave/configurations, etc.)
Develop an algorithm to generate links based on nodes and the attack graph
Similar to attack graph structure to some extent
Where do the numbers come from?
Frequency in the logs, subjective
Robust to parameter values
So what is it?
Hybrid model across abstract levels (exploit, state, aggregates, subgoals, goals)
What-if questions at such levels
Embeds intelligence from network, attack structures, human
An (Imaginary) Example
Bayesian Network Inference
Inference is NP-hard on general Bayesian networks
For tree-structured BNs, efficient algorithms exist based on message passing (J. Pearl)
But tree structure is too limited in practice
For multiply-connected BNs (each node can have multiple parent nodes)
This is the most applicable case
Clique tree based message passing algorithms
Shafer-Shenoy algorithm
Lauritzen-Spiegelhalter algorithm
Hugin Expert tool
Netica tool
Clique Tree based Inference
From variable elimination algorithms, the nodes can be organized into cliques
Rule 1: each clique node waits to send its message to a given neighbor until it has received messages from all its other neighbors
Rule 2: when a node is ready to send its message to a particular neighbor, it computes the message by collecting all its messages from other neighbors, multiplying its own table by these messages, and marginalizing the product to its intersection with the neighbor to whom it is sending
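Rules 1 and 2 can be sketched on a toy clique tree. The example below is an assumption-laden illustration, not the engine described in this talk: it uses a hypothetical chain A → B → C → D with made-up CPTs, cliques {A,B}, {B,C}, {C,D}, binary variables, and factors stored as plain dictionaries.

```python
from itertools import product


def multiply(f, g):
    """Pointwise product of two factors; a factor is (variables, table)."""
    (fv, ft), (gv, gt) = f, g
    out_vars = tuple(dict.fromkeys(fv + gv))  # ordered union of the scopes
    table = {}
    for values in product((0, 1), repeat=len(out_vars)):
        a = dict(zip(out_vars, values))
        table[values] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return out_vars, table


def marginalize(f, keep):
    """Sum out every variable of the factor that is not in `keep`."""
    fv, ft = f
    out_vars = tuple(v for v in fv if v in keep)
    table = {}
    for values, p in ft.items():
        a = dict(zip(fv, values))
        key = tuple(a[v] for v in out_vars)
        table[key] = table.get(key, 0.0) + p
    return out_vars, table


def cpt(parent, child, p_true):
    """Factor for P(child | parent), with p_true[parent_value] = P(child = 1)."""
    table = {(pv, cv): p_true[pv] if cv else 1.0 - p_true[pv]
             for pv, cv in product((0, 1), repeat=2)}
    return (parent, child), table


def pass_messages(potentials, neighbors):
    """Send one message along each direction of every clique-tree edge."""
    messages = {}
    pending = {(s, d) for s in neighbors for d in neighbors[s]}
    while pending:
        for src, dst in sorted(pending):
            others = [n for n in neighbors[src] if n != dst]
            if all((n, src) in messages for n in others):      # Rule 1
                prod = potentials[src]                          # Rule 2
                for n in others:
                    prod = multiply(prod, messages[(n, src)])
                sepset = set(potentials[src][0]) & set(potentials[dst][0])
                messages[(src, dst)] = marginalize(prod, sepset)
                pending.discard((src, dst))
                break
    return messages


def marginal(clique, var, potentials, neighbors, messages):
    """Normalized marginal of `var`, read off the belief of one clique."""
    belief = potentials[clique]
    for n in neighbors[clique]:
        belief = multiply(belief, messages[(n, clique)])
    _, table = marginalize(belief, {var})
    total = sum(table.values())
    return {key[0]: p / total for key, p in table.items()}


prior_a = (("A",), {(0,): 0.7, (1,): 0.3})
potentials = {
    "AB": multiply(prior_a, cpt("A", "B", {0: 0.2, 1: 0.9})),
    "BC": cpt("B", "C", {0: 0.1, 1: 0.7}),
    "CD": cpt("C", "D", {0: 0.05, 1: 0.8}),
}
neighbors = {"AB": ["BC"], "BC": ["AB", "CD"], "CD": ["BC"]}
messages = pass_messages(potentials, neighbors)
p_d = marginal("CD", "D", potentials, neighbors, messages)
p_a = marginal("AB", "A", potentials, neighbors, messages)
print(round(p_d[1], 4), round(p_a[1], 4))
```

Messages whose prerequisites are met do not depend on each other, which is why the schedule can be run in parallel rather than in the sequential loop used here.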
Opportunities and IAI Unique Expertise
Each clique can be modeled as an autonomous agent
The message passing can be run in parallel
The whole inference process can be modeled as a multi-agent system (MAS)
IAI is a leader in agent technology and MAS
Agent infrastructure: Cybele
Scalable multi-agent system: tens of thousands of agents
This unique combination will further improve the scalability and enhance the response time
Distributed Bayesian Network Engine
Why?
• Tracking dynamic attacks on large-scale networks will be a very processor-intensive task.
• Evaluating what-if solutions must be done in real time, in order to allow the human operator time to find and enforce his or her course of action.
• Available standard BN engines do not scale
Solution:
• Create a novel Distributed Bayesian Network engine to accommodate the kind of processing power needed.
• Use general software engineering rules and methodology so that the distributed BN engine can be reused in other domains.
Distributing a Bayesian Network
Conclusions
Graphical models can be powerful for cyber security analysis and management in enterprise networks
To enable powerful analysis, we look into the potential of Bayesian networks
Lots of opportunities, but also full of challenges
Our approach
Understand the problem domain and BN models
Capture uncertainty
Obtain Bayesian nets from attack graphs
Distributed agent based inference engine
The outcome can only be as good as your model …
A Look beyond …
[Figure: system architecture spanning two levels, linked by a Network-Application IF and an Application-Mission IF.
Networks and Systems Level: Attack Database, Attack Graph, IDS, Attack Analysis (alert correlation, filtering), Attack Prediction (reasoning, suggested actions), Protection Domain.
Missions and Applications Level: Application Dependency Graph, Mission Dependency Graph, Construction.
Visualization: Situational Awareness, Static Analysis, Damage Assessment, What-if Analysis, Action Planning, Root-cause Analysis, Containment Suggestions.]