Elicitation and Combination of Rank Correlations for Dependence Modeling

Nov 7, 2013



Oswaldo Morales Napoles

Outline

CATS model, Earth Dams Model


BBN recapitulation


Quantification through Expert Judgment


Some results

CATS

Netherlands ministry of Transport and
Water Management.


Causal model:

strengthening safety,

causes of incidents and accidents

probability of adverse events


Delft University of Technology (TUD), Det
Norske Veritas (DNV), the National
Aerospace Laboratory (NLR) and White
Queen (WQ).




Earth Dams safety in Mexico BNs

Joint work with Autonomous University of
the State of Mexico


Demonstrate method with data for 7 earth
dams in central Mexico


Possible extension to larger dams across
the country


November 2007 flooding was
observed in about 70% of the
Tabasco flatlands

Bayesian Networks (BNs)

BNs Directed Acyclic Graph (3)

Nodes represent random variables

A,B,C (parents) of E (child);

D NOT a direct influence for E
(ancestor)

A ⊥ B;  A ⊥ C;  A ⊥ D;  D ⊥ E | C

Information (influence) flow / sampling order:

{A, {D → C → B}} → E

{{D, A} → C → B} → E

[Figure: DAG on nodes A, B, C, D, E, with labels (1) to (4)]
Discrete BNs (Quantification)

Node               fail (%)   not fail (%)
LED_1_2_3_DEP_70   59.1       40.9
LED1_CAT           1.80       98.2
LED2_CAT           1.80       98.2
LED3_CAT           1.80       98.2
Catastrophic       5.30       94.7
LED_FAILURE        64.4       35.6

[Figure: fault tree. L1_C, L2_C and L3_C enter an OR gate (CAT. Failure); CAT. Failure and L1&L2&L3 < 70 enter an XOR gate (LED Failure)]

$P(Dep) = P(Dep \mid Cat)\,P(Cat) + P(Dep \mid \neg Cat)\,P(\neg Cat) = 0 + 0.62408 \times 0.947 \approx 0.591$
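
The arithmetic behind the node tables can be checked with a short script. This is only a sketch: it assumes the three catastrophic LED failures are independent and combined through an OR gate, and that the LED failure node adds the two mutually exclusive causes (the XOR gate). The variable names are illustrative and not taken from the slides.

```python
# Values read from the node tables and the formula above.
p_led_cat = 0.018            # P(fail) for each of LED1_CAT, LED2_CAT, LED3_CAT
p_dep_given_cat = 0.0        # first term of the total-probability formula
p_dep_given_no_cat = 0.62408

# OR gate (assumption: the three catastrophic failures are independent).
p_cat = 1 - (1 - p_led_cat) ** 3

# Law of total probability for the depreciation node.
p_dep = p_dep_given_cat * p_cat + p_dep_given_no_cat * (1 - p_cat)

# XOR gate: the two causes never occur together here, so the probabilities add.
p_led_failure = p_cat + p_dep

print(f"P(Cat)         = {p_cat:.3f}")
print(f"P(Dep)         = {p_dep:.3f}")
print(f"P(LED failure) = {p_led_failure:.3f}")
```

Under these assumptions the three results round to 0.053, 0.591 and 0.644, matching the 5.30, 59.1 and 64.4 percent shown in the tables.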

Data or Experts

Complex models demand far more data or expert input

18 LED System (failure is losing 70% of luminance)

Many conditional probabilities

Simple formulas may not be possible for exact inference: many operations required

Approximation methods are an option when exact inference is not feasible


[Figure: fault tree for a group of 6 LEDs (L1_C to L6_C). "6 LED. Failure" is the XOR of branches that combine "exactly k fail" with a depreciation threshold through AND gates. Branch labels as shown: 1 Cat & D < 74.11%, 2 Cat & D < 78.75%, 3 Cat & D < 84%, 4 Cat & D < 90%, 6 Cat & D < 96.92%]

$P(L) = 1 - \{(1 - P(C))\,(1 - P(D))\}$

$P(D) = P(D \mid C)\,P(C) + P(D \mid \neg C)\,P(\neg C)$

Bayesian Networks (BNs) - Quantification

Discrete BNs

(un)conditional distributions

$X_4$'s CPT has $k^4$ entries

Continuous BNs

discretising

$X_4$'s CPT has 10000 entries

$X_4$'s marginal & 3 (conditional) rank correlations
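
The difference in quantification burden can be made concrete with a small count. The sketch below assumes that X4 has three parents and that every variable is discretised into k = 10 states, which is the reading under which k^4 equals the 10000 entries quoted above. The function name is illustrative.

```python
def cpt_entries(child_states, parent_states):
    """Number of entries in the conditional probability table of one node."""
    entries = child_states
    for states in parent_states:
        entries *= states
    return entries

k = 10  # assumed number of states per variable after discretising
discrete_burden = cpt_entries(k, [k, k, k])  # X4 with three parents
npbn_burden = 1 + 3                          # one margin plus three rank correlations

print(discrete_burden, npbn_burden)
```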





$f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \prod_{i=1}^{n} f_{X_i \mid Pa(X_i)}\bigl(x_i \mid Pa(x_i)\bigr)$

$r\bigl(X_i, Pa_j(X_i) \mid Pa_1(X_i),\ldots,Pa_{j-1}(X_i)\bigr)$

[Figure: four scatter plots. Samples with Student's-t(5) and Gamma(2,1) margins joined by Clayton's copula (r = 0.7) and by Frank's copula (r = 0.7), and the corresponding copula samples on the unit square (axes x, y)]

Any continuous variables


possibility to extend to hybrid networks

marginal distributions

(un)conditional rank correlations

some assumption about the bivariate dependence - copula
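
The three ingredients can be illustrated by sampling a pair of variables with a chosen rank correlation. The sketch below uses the normal copula (the slides' figures use Clayton's and Frank's copulas instead) because its rank correlation r maps to a product-moment correlation through the known relation rho = 2 sin(pi r / 6). The margins (exponential and uniform) and all function names are assumptions chosen so that only the Python standard library is needed.

```python
import math
import random
from statistics import NormalDist

def spearman_to_pearson(r):
    """Normal copula: product-moment correlation that yields rank correlation r."""
    return 2.0 * math.sin(math.pi * r / 6.0)

def sample_normal_copula(r, n, seed=0):
    """Draw n pairs (u, v) on the unit square with rank correlation close to r."""
    rng = random.Random(seed)
    rho = spearman_to_pearson(r)
    std = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((std.cdf(z1), std.cdf(z2)))
    return pairs

def ranks(values):
    """Rank of each value (0 = smallest); ties are not expected here."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result

def spearman(xs, ys):
    """Sample rank correlation for data without ties."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

pairs = sample_normal_copula(r=0.7, n=5000)
# Hypothetical margins applied by inverse CDF: exponential(1) and uniform(0, 10).
xs = [-math.log(1.0 - min(u, 1.0 - 1e-12)) for u, _ in pairs]
ys = [10.0 * v for _, v in pairs]
print(round(spearman(xs, ys), 2))
```

Because rank correlation is invariant under increasing transformations, the margins can be swapped without changing the dependence estimate, which is what makes margins and rank correlations separable inputs.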

Non-Parametric Bayesian Networks (NPBNs)

Flight crew error model

[Figure: BBN with nodes 1 to 14. Rank correlations on the arcs: $r_{7,1}$, $r_{7,3|1}$, $r_{7,2|1,3}$, $r_{6,5}$, $r_{6,3|5}$, $r_{6,4|5,3}$, $r_{10,6}$, $r_{10,7|6}$, $r_{14,10}$, $r_{14,12|10}$, $r_{14,8|10,12}$, $r_{14,9|10,12,8}$, $r_{14,11|10,12,8,9}$, $r_{14,13|10,12,8,9,11}$]

BBN recapitulation ATC Error model

BBN recapitulation Maintenance error model

BBN recapitulation FC-ATC-MNT

CATS

FCP model in TO
FCP model in ER
FCP model in AL
MNTP model TO ER AL
ATCP model in TO
ATCP model in ER
ATC model in AL
Common nodes for the FCP in all flight phases
[Figure: BBN fragment with nodes X7, X1, X2, X3]

BBN dependence quantification (I)

$P_1^{e_i} \rightarrow r_{7,1}^{e_i}$

$P_2^{e_i} \rightarrow r_{7,3|1}^{e_i}$

$P_3^{e_i} \rightarrow r_{7,2|1,3}^{e_i}$

$P_1$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$)

$P_2$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$, Fatigue > $x_{3,50}$)

$P_3$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$, Fatigue > $x_{3,50}$, # Days since last training > $x_{2,50}$)


[Figure: $r_{7,1}^{e_i}$ as a function of $P_1^{e_i}$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$), next to the BBN fragment with nodes X7, X1, X2, X3]


BBN dependence quantification (I)

$P_1^{e_1} = 0.3 \;\Rightarrow\; r_{7,1}^{e_1} = -0.57$
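
The step from an elicited exceedance probability to a rank correlation can be sketched as follows. This assumes a normal copula (the copula named later in the slides) and uses two standard results for it: P(X > median | Y > median) = 1/2 + arcsin(rho)/pi, and r = (6/pi) arcsin(rho/2), where rho is the product-moment correlation. The function name is illustrative.

```python
import math

def exceedance_to_rank_correlation(p):
    """Rank correlation implied by p = P(X > median | Y > median), normal copula."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    rho = math.sin(math.pi * (p - 0.5))            # product-moment correlation
    return (6.0 / math.pi) * math.asin(rho / 2.0)  # Spearman rank correlation

print(round(exceedance_to_rank_correlation(0.3), 2))  # -0.57
```

An answer of 0.5 corresponds to independence (r = 0); answers below 0.5 give negative rank correlations, as in the expert's 0.3 above.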

BBN dependence quantification (I)

$P_2$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$, Fatigue > $x_{3,50}$)

$P_2$ in (0, 0.6)

If Fatigue does not add any information given experience, then $P_2 = P_1 = 0.3$

$P_2^{e_1} = 0.35 \;\Rightarrow\; r_{7,3|1}^{e_1} = 0.185$

[Figure: $r_{7,3|1}^{e_i}$ as a function of $P_2^{e_i}$]


BBN dependence quantification (I)

$P_3$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$, Fatigue > $x_{3,50}$, # Days since last training > $x_{2,50}$)

$P_3$ in (0.06, 0.6)

If Training does not add any information given experience and fatigue, then $P_3 = P_2 = 0.35$

$P_3^{e_1} = 0.5 \;\Rightarrow\; r_{7,2|1,3}^{e_1} = 0.554$

[Figure: $r_{7,2|1,3}^{e_i}$ as a function of $P_3^{e_i}$]


BBN dependence quantification (II)

$P_1^{e_1} \rightarrow r_{7,1}^{e_1}$

$r_{7,2}^{e_1} / r_{7,1}^{e_1} \rightarrow P_2^{e_1}$

$r_{7,3}^{e_1} / r_{7,1}^{e_1} \rightarrow P_3^{e_1}$



Ask for a probability of exceedance for the first parent


For each remaining parent, ask for the ratio of its unconditional rank correlation to the first elicited one


Translate the estimates into probabilities of exceedance




$P_1$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Hours flown > $x_{1,50}$)

$P_2$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | # Days since last training > $x_{2,50}$)

$P_3$ = P(#FO fail prof. check per 10,000 FO > $x_{7,50}$ | Fatigue > $x_{3,50}$)


Normal copula

$P(X_4 > \text{median} \mid X_3 > \text{median}) = 0.3$

$X_2$ and $X_3$ are independent.

BBN dependence quantification (II)

Given your previous estimates, what is the ratio $r_{4,2} / r_{4,3}$?

$P(X_4 > \text{median} \mid X_3 > \text{median}) = 0.3$

$X_2$ and $X_3$ are independent.

Given your previous estimates, what is the ratio $r_{4,1} / r_{4,3}$?

$P(X_4 > \text{median} \mid X_3 > \text{median}) = 0.3$

$X_1$, $X_2$ and $X_3$ are independent.

$r_{4,2} / r_{4,3} = 0.25$
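
The ratio questions can be turned back into exceedance probabilities with the inverse of the normal-copula relation. The sketch below reads the 0.25 on the slide as the ratio of the second parent's rank correlation to the first one; that reading, and the function names, are assumptions. It uses the first answer 0.3 and the two standard normal-copula relations between rank correlation, product-moment correlation and median exceedance probability.

```python
import math

def exceedance_to_rank(p):
    """Rank correlation implied by p = P(X > median | Y > median), normal copula."""
    rho = math.sin(math.pi * (p - 0.5))
    return (6.0 / math.pi) * math.asin(rho / 2.0)

def rank_to_exceedance(r):
    """Inverse map: P(X > median | Y > median) for rank correlation r."""
    rho = 2.0 * math.sin(math.pi * r / 6.0)
    return 0.5 + math.asin(rho) / math.pi

r_first = exceedance_to_rank(0.3)         # first parent, elicited directly
ratio = 0.25                              # elicited ratio (assumed reading)
r_second = ratio * r_first                # second parent's rank correlation
p_second = rank_to_exceedance(r_second)   # equivalent exceedance probability

print(f"r_first = {r_first:.3f}, r_second = {r_second:.3f}, P_second = {p_second:.3f}")
```

A ratio smaller than 1 in absolute value pulls the second probability towards 0.5, that is, towards independence.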
BBN dependence quantification

Use Calibration variables to derive weights for
experts


Combine their dependence estimates based on
their performance in calibration variables







$P_1 = \sum_i w_{e_i} P_1^{e_i} \rightarrow r_{7,1}$

$P_2 = \sum_i w_{e_i} P_2^{e_i} \rightarrow r_{7,3|1}$

$P_3 = \sum_i w_{e_i} P_3^{e_i} \rightarrow r_{7,2|1,3}$
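
A weighted combination of this kind can be sketched as a linear pool of the experts' exceedance probabilities, followed by a conversion of the pooled value to a rank correlation. The weights and the expert answers below are hypothetical, the normal copula is assumed for the conversion, and the function names are illustrative rather than taken from the slides.

```python
import math

def exceedance_to_rank(p):
    """Rank correlation implied by p = P(X > median | Y > median), normal copula."""
    rho = math.sin(math.pi * (p - 0.5))
    return (6.0 / math.pi) * math.asin(rho / 2.0)

def pool(weights, probabilities):
    """Weighted average of expert probabilities (weights need not sum to 1)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one expert needs a positive weight")
    return sum(w * p for w, p in zip(weights, probabilities)) / total

weights = [0.5, 0.3, 0.2]        # hypothetical performance-based weights
p_experts = [0.30, 0.40, 0.55]   # hypothetical answers for one dependence question

p_dm = pool(weights, p_experts)
r_dm = exceedance_to_rank(p_dm)
print(f"P_DM = {p_dm:.2f}, r_DM = {r_dm:.3f}")
```

Experts with weight 0 drop out of the pool, so the result depends only on the experts who pass the calibration cutoff.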



Combining Experts’ opinions

The DM median for FO Suitability is the combined opinion of experts.

It differs from each expert's individual median.

215 corresponds to the 0.421 quantile of expert 1's distribution.

$P_1^{e_1} = P(X_7 \ge 215 \mid X_1 \ge 10{,}700) = P\bigl(F^{e_1}_{X_7}(X_7) \ge 0.42 \mid F^{e_1}_{X_1}(X_1) \ge 0.5\bigr)$


It is a task of the analyst:

Compute the answer that the expert would have given had he or she been asked about the median of the decision maker, such that his or her estimated rank correlation remains unchanged.
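
One way to carry out this task is sketched below. It assumes a normal copula: the expert's original answer at the medians fixes the product-moment correlation, and the exceedance probability is then recomputed at the quantile where the decision maker's median falls in the expert's own distribution. The joint probability is obtained by simple midpoint integration so that only the standard library is needed. The function names and the number of integration steps are assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def exceedance_to_pearson(p):
    """Product-moment correlation implied by p = P(X > median | Y > median)."""
    return math.sin(math.pi * (p - 0.5))

def conditional_exceedance(rho, q_child, q_parent=0.5, steps=5000):
    """P(F(X) >= q_child | F(Y) >= q_parent) under a normal copula with |rho| < 1."""
    a = STD.inv_cdf(q_child)
    scale = math.sqrt(1.0 - rho * rho)
    width = (1.0 - q_parent) / steps
    joint = 0.0
    for k in range(steps):
        v = q_parent + (k + 0.5) * width     # midpoint of the k-th slice
        z = STD.inv_cdf(v)
        # Given Y at z, X is normal with mean rho * z and variance 1 - rho^2.
        joint += (1.0 - STD.cdf((a - rho * z) / scale)) * width
    return joint / (1.0 - q_parent)

rho = exceedance_to_pearson(0.3)                    # expert's answer at the medians
p_at_median = conditional_exceedance(rho, 0.5)      # reproduces the original answer
p_at_dm_value = conditional_exceedance(rho, 0.42)   # answer at the 0.42 quantile
print(f"{p_at_median:.3f} {p_at_dm_value:.3f}")
```

The correlation, and hence the rank correlation, is held fixed; only the threshold on the child variable moves.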


FCP Elicitation Results


Case name: FCEP Error 5 experts 31 jan 08    5/26/2009    Version W1.0

Results of scoring experts

Bayesian Updates: no    Weights: global    DM Optimisation: yes
Significance Level: 0.6638    Calibration Power: 1

Nr. | Id     | Calibr.    | Mean rel. inf. | Mean rel. inf.  | Number of    | Unnormalized | Normalized weight | Normalized weight
    |        |            | (total)        | (realizations)  | realizations | weight       | without DM        | with DM
----|--------|------------|----------------|-----------------|--------------|--------------|-------------------|------------------
  1 | C      | 0.001547   | 1.016          | 0.9689          | 8            | 0            | 0                 | 0
  2 | A      | 0.02651    | 0.7119         | 0.4991          | 8            | 0            | 0                 | 0
  3 | D      | 0.185      | 1.317          | 1.029           | 8            | 0            | 0                 | 0
  4 | B      | 0.6638     | 0.95           | 0.574           | 8            | 0.381        | 1                 | 0.5
  5 | E      | 5.115E-005 | 1.049          | 1.06            | 8            | 0            | 0                 | 0
  6 | GLOBAL | 0.6638     | 0.95           | 0.574           | 8            | 0.381        |                   | 0.5
  7 | EQUAL  | 0.2224     | 0.1046         | 0.09945         | 8            | 0.02212      |                   | 0.03636

(c) 1989-2005 TU Delft


What was the total number of passengers uplifted by British Airways
in April 2004?

What was the number of Mandatory Occurrence Reports in the UK
from January up to end of November 1998 on aircraft loading errors?

FCP Elicitation Results

ATCP Elicitation Results


Case name: ATCEP Error 5 experts 31 jan 08    5/26/2009    Version W1.0

Results of scoring experts

Bayesian Updates: no    Weights: global    DM Optimisation: yes
Significance Level: 0.00131    Calibration Power: 1

Nr. | Id     | Calibr.    | Mean rel. inf. | Mean rel. inf.  | Number of    | Unnormalized | Normalized weight | Normalized weight
    |        |            | (total)        | (realizations)  | realizations | weight       | without DM        | with DM
----|--------|------------|----------------|-----------------|--------------|--------------|-------------------|------------------
  1 | A      | 0.1012     | 0.5633         | 0.5034          | 10           | 0.05095      | 0.5208            | 0.2015
  2 | B      | 0.04706    | 1.03           | 0.9588          | 10           | 0.04512      | 0.4612            | 0.1784
  3 | C      | 0.00131    | 1.423          | 1.349           | 10           | 0.001767     | 0.01806           | 0.006987
  4 | D      | 2.795E-009 | 1.669          | 1.655           | 10           | 0            | 0                 | 0
  5 | E      | 2.501E-006 | 1.017          | 0.9624          | 10           | 0            | 0                 | 0
  6 | GLOBAL | 0.6827     | 0.3094         | 0.2271          | 10           | 0.1551       |                   | 0.6131
  7 | EQUAL  | 0.1242     | 0.2662         | 0.2472          | 10           | 0.0307       |                   | 0.2388

(c) 1989-2005 TU Delft
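
The weight columns in the ATCP output are consistent with a simple rule: an expert's unnormalized weight is the calibration score times the information score on the realizations, set to 0 when the calibration score falls below the significance level. The sketch below reproduces the "without DM" column from the table values. The function name and data layout are illustrative, and the rule is inferred from the numbers rather than quoted from the software.

```python
def performance_weights(experts, significance_level):
    """experts maps an id to (calibration, information on realizations)."""
    raw = {}
    for name, (calibration, information) in experts.items():
        passes = calibration >= significance_level
        raw[name] = calibration * information if passes else 0.0
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no expert passes the significance level")
    normalized = {name: weight / total for name, weight in raw.items()}
    return raw, normalized

# Calibration and information (realizations) columns of the ATCP table.
atcp = {
    "A": (0.1012, 0.5034),
    "B": (0.04706, 0.9588),
    "C": (0.00131, 1.349),
    "D": (2.795e-9, 1.655),
    "E": (2.501e-6, 0.9624),
}
raw, normalized = performance_weights(atcp, significance_level=0.00131)
for name in atcp:
    print(name, f"{raw[name]:.5f}", f"{normalized[name]:.4f}")
```

Experts D and E fall below the significance level and receive weight 0, as in the table.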


What was the number of infringements (unauthorized entries) of
controlled airspace in the Amsterdam FIR in 2006?


What is the missed approach rate at Schiphol (based on RWY 06
data)?

ATCP Elicitation Results

EJ Results

Consider the 7 day moving average of the daily average precipitation (mm)
from the two stations related to Embajomuy Dam from January 1961 to August
1999 in ERIC II of CONAGUA. What is the maximum moving average for the
time period of reference?


20 calibration questions were asked

Dependence results

Remark (changes in modelling)

A) discrete: Re-elicit $(k_1 - 1) \cdot k_2 \cdot k_3 \cdot k_4$ conditional probabilities

A) continuous: 1 conditional rank correlation and 1 univariate margin

B) discrete: Re-elicit $k_2 \cdot (k_3 - 1)$ conditional probabilities

B) continuous: Re-elicit possibly 2 rank correlations

Originally $r_{1,2} = -0.49$, $r_{3,1|2} = 0.93$ from $r_{3,1} / r_{1,2} = -1.68$

If $r_{3,2} = 0.1$, then $r_{3,1} / r_{1,2}$ is not valid anymore

[Figure: panels A) and B)]
Remark (combination)

Rank correlations obtained by the combination used might be very different from averaging individual estimates

[Figure: relation between $r_{9,8}$ (range -1 to 1) and $P(X_9 > x_{9,q_k} \mid X_8 > x_{8,q_k})$ (range 0 to 1); curves labelled C, A, B, D]

APPENDIX

Results

Base line: 95th / 5th = 104.5.

A: expectation 16.2 > base line. 95th / 5th = 484.3.

B: expectation 92 > base line. 95th / 5th = 171.9

Results

[Figure: two plots comparing 1. Base Line, 2. Conditional on oldest aircraft, 3. Conditional on oldest aircraft and low experience crew. The first plot has axes from 0 to 1 (one scaled by 10^-3); the second has a logarithmic axis from 10^-7 to 10^-3 and is annotated with 5.14E-005, 2.19E-004, 2.92E-004 and 1.19E-003]
Results

Dams safety results

Flooding vs. costs



Given 1 earthquake per year > 5 Richter, rainfall rate = 1 mm/day, maintenance performed every year

Given 21 earthquakes per year > 5 Richter, rainfall rate = 16 mm/day, maintenance performed every 66 years