New Directions in Data Analysis
Pushpalatha Bhat
Fermilab
DPF2000
Columbus, Ohio
August 11, 2000
“A reasonable man adapts himself to the world.
An unreasonable man tries to adapt the world to himself.
So, all progress depends on the unreasonable one.”
DPF2000, Aug. 9-12, 2000, Pushpa Bhat
Outline
Intelligent Detectors
Moving intelligence closer to action
Multivariate Methods
Neural Networks: The “New” Paradigm
New Searches & Precision Measurements: Some Examples
Measuring the Top Quark Mass
Discovery Reach for the Higgs
More Sophisticated Approaches
Probabilistic Approach to Analysis: Exploring Models
Summary
[Diagram: from the world before experiment/analysis to the world after experiment/analysis. Stages shown: data collection; data organization, reduction, analysis; data interpretation. Operations shown: transformation, feature extraction, global decision.]
Intelligent Detectors
Data analysis starts when a high-energy collision/event occurs
Transform electronic data into useful “physics” information in real time
Move intelligence closer to action!
Algorithm-specific hardware
Neural Network chips, for example
Configurable hardware
FPGAs, DSPs
Innovative data management on-line + “smart” algorithms in hardware
Data in RAM disk & AI algorithms in FPGAs
Expert systems for control & monitoring
Troubleshooting, diagnosis and fix
Smart Triggers
There are already Success Stories!
[Figure: 27.5 GeV e on 920 GeV p collisions; labels: neural nets, hardwired logic]
H1 Level-2 Trigger
• Trigger on rare ep collisions in an overwhelming beam-gas background
• NN hardware: the CNAPS 1064 chip
• 12 independent neural nets, each one trained for a specific physics process, in a total of 960 digital processors
• Successful operations since 1996
Multivariate Methods
Keep it simple
As simple as possible
Not any simpler
Einstein
Multivariate Methods
The measurements being multivariate, the optimal methods of analysis are necessarily multivariate
Many Applications:
Particle Identification
e-ID, τ-ID, b-ID, e/γ, q/g
Signal/Background Event Classification
New physics
Signals of new physics are rare and small (finding a “jewel” in a haystack)
Parameter Estimation
t mass, H mass, track parameters, for example
Function Approximation
Parametric methods:
Fisher discriminant, Kernel methods
Non-parametric Methods
Adaptive/AI methods
Optimal Event Selection
r(x,y) = constant defines an optimal decision boundary
[Figure: feature space with signal (S) and background (B) events, comparing conventional cuts with the optimal decision boundary]
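The definition of r(x,y) did not survive on this slide. The sketch below assumes it is the likelihood ratio p(x,y|S)/p(x,y|B), which is one standard choice of optimal discriminant, and compares a threshold on r with rectangular cuts at equal signal efficiency. The two Gaussian toy samples, their means, and the cut value 0.5 are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical toy model: unit-width Gaussians in the feature space (x, y).
mu_s = np.array([1.0, 1.0])   # assumed signal mean
mu_b = np.array([0.0, 0.0])   # assumed background mean
sig = rng.normal(mu_s, 1.0, size=(n, 2))
bkg = rng.normal(mu_b, 1.0, size=(n, 2))

def log_ratio(pts):
    """log r(x, y) = log p(x,y|S) - log p(x,y|B) for the toy Gaussians."""
    d2_b = np.sum((pts - mu_b) ** 2, axis=1)
    d2_s = np.sum((pts - mu_s) ** 2, axis=1)
    return 0.5 * (d2_b - d2_s)

def rect_cut(pts, a=0.5):
    """Conventional rectangular cuts: x > a and y > a."""
    return (pts[:, 0] > a) & (pts[:, 1] > a)

sig_eff_cuts = rect_cut(sig).mean()
bkg_eff_cuts = rect_cut(bkg).mean()

# Pick the threshold on log r that keeps the same fraction of signal.
threshold = np.quantile(log_ratio(sig), 1.0 - sig_eff_cuts)
sig_eff_ratio = (log_ratio(sig) > threshold).mean()
bkg_eff_ratio = (log_ratio(bkg) > threshold).mean()

print(f"cuts : signal eff {sig_eff_cuts:.3f}, background eff {bkg_eff_cuts:.3f}")
print(f"ratio: signal eff {sig_eff_ratio:.3f}, background eff {bkg_eff_ratio:.3f}")
```

In this toy the contour r = constant is a straight line in (x, y), and it accepts less background than the rectangular cuts for the same signal efficiency.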
Discriminant Approximation with Neural Networks
The output of a feed-forward neural network can approximate the Bayesian posterior probability p(s|x,y).
Calculating the Discriminant
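Consider the sum
$E(w) = \sum_{i=1}^{N} \left[ n(x_i, y_i, w) - d_i \right]^2$
where
$d_i = 1$ for signal
$d_i = 0$ for background
$w$ = vector of parameters
Then, at the minimum of $E$,
$n(x, y, w) = p(s|x,y)$
in the limit of large data samples and provided that the function $n(x, y, w)$ is flexible enough.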
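The statement above can be checked numerically. The sketch below is not a neural network: it swaps in a binned (piecewise-constant) function as the flexible function, because its squared-error minimum is known exactly (the mean of d_i in each bin). It uses a single feature x instead of (x, y), and the Gaussian signal and background samples are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical 1-D toy: signal ~ N(+1, 1), background ~ N(-1, 1), equal sizes.
x = np.concatenate([rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n)])
d = np.concatenate([np.ones(n), np.zeros(n)])  # targets: 1 = signal, 0 = background

# Flexible stand-in function: one free constant per bin.
edges = np.linspace(-2.0, 2.0, 21)
centers = 0.5 * (edges[:-1] + edges[1:])
idx = np.digitize(x, edges) - 1
inside = (idx >= 0) & (idx < len(centers))

# Minimizing sum_i (n(x_i) - d_i)^2 bin by bin gives the mean of d in each bin.
sum_d = np.bincount(idx[inside], weights=d[inside], minlength=len(centers))
count = np.bincount(idx[inside], minlength=len(centers))
n_hat = sum_d / count

# Exact posterior for this toy with equal priors: p(s|x) = 1 / (1 + exp(-2x)).
posterior = 1.0 / (1.0 + np.exp(-2.0 * centers))
max_dev = np.max(np.abs(n_hat - posterior))
print(f"largest |n_hat - p(s|x)| over bins: {max_dev:.4f}")
```

The fitted values track the posterior because the targets are 0 and 1, so the squared-error minimum in each bin is the signal fraction there. The priors are the ones implied by the sample composition.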
Neural Networks: The “New” Paradigm
Neural Networks (NN) are mathematical, adaptive systems (algorithms).
The “hidden” transformation functions, g, adapt themselves to the data as part of the training process.
The number of such functions needs to grow only as the complexity of the problem grows.
An NN estimates a mapping function without requiring a mathematical description of how the output formally depends on the input.
[Figure: feed-forward network with inputs x1, x2, x3, x4 and output D_NN]
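A minimal sketch of the mapping such a network computes, assuming one hidden layer, tanh for the hidden functions g, and a sigmoid output so that D_NN lies between 0 and 1. These choices, the layer sizes, and the random (untrained) weights are assumptions for illustration, not details taken from the slides.

```python
import numpy as np

def feed_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Map inputs x (n_events, n_inputs) to one discriminant value per event."""
    hidden = np.tanh(x @ w_hidden + b_hidden)   # the "hidden" functions g
    activation = hidden @ w_out + b_out          # weighted sum of the g's
    return 1.0 / (1.0 + np.exp(-activation))     # squash to (0, 1)

rng = np.random.default_rng(2)
n_inputs, n_hidden = 4, 5                        # x1..x4 and 5 hidden nodes (assumed)

# Untrained, random weights; training would adjust these to fit the data.
w_hidden = rng.normal(size=(n_inputs, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=n_hidden)
b_out = 0.0

events = rng.normal(size=(3, n_inputs))          # three hypothetical events
d_nn = feed_forward(events, w_hidden, b_hidden, w_out, b_out)
print(d_nn)
```

Adding hidden nodes adds more functions g to the sum, which is how the flexibility of the mapping grows with the complexity of the problem.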
Measuring the Top Quark Mass: The Discriminants
[Figure: distributions of the discriminant variables; shaded = top]
NN Discriminant (D_NN vs m_fit)
[Figure: D_NN vs m_fit for signal (170 GeV/c^2) and background]
Measuring the Top Quark Mass
DØ Lepton+jets
[Figure: background-rich and signal-rich panels]
m_t = 173.3 ± 5.6 (stat.) ± 6.2 (syst.) GeV/c^2
Strategy for Discovering the Higgs Boson at the Tevatron
P.C. Bhat, R. Gilmartin, H. Prosper, PRD 62 (2000), hep-ph/0001152
Hints from the Analysis of Precision Data
LEP Electroweak Group, http://www.cern.ch/LEPEWWG/plots/summer99
M_H = [value missing in source] GeV/c^2
M_H < 225 GeV/c^2 at 95% C.L.
Event Simulation
Signal Processes
Backgrounds
Event generation
WH, ZH, ZZ and Top with PYTHIA
Wbb, Zbb with CompHEP, fragmentation with PYTHIA
Detector modeling
SHW (http://www.physics.rutgers.edu/~jconway/soft/shw/shw.html)
Trigger, Tracking, Jet-finding
b-tagging (double b-tag efficiency ~ 45%)
Di-jet mass resolution ~ 14% (scaled down to 10% for Run II Higgs studies)
WH Results from NN Analysis
M_H = 100 GeV/c^2
WH (110 GeV/c^2) NN Distributions
WH Results
Is it worth it?
Combined Results (WH+ZH)
Results, Standard vs. NN
NN analyses require about half the luminosity of conventional analyses for the same discovery reach.
A good chance of discovery up to M_H = 130 GeV/c^2 with 20-30 fb^-1
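To see how better background rejection turns into a luminosity saving, the sketch below assumes the simple significance estimate Z = S/sqrt(B), with signal and background yields both proportional to integrated luminosity. The yields per fb^-1 are hypothetical and are not taken from the analysis.

```python
import math

def luminosity_for_significance(z, sig_per_fb, bkg_per_fb):
    """Integrated luminosity (fb^-1) at which S / sqrt(B) reaches z.

    With S = sig_per_fb * L and B = bkg_per_fb * L,
    z = (sig_per_fb / sqrt(bkg_per_fb)) * sqrt(L), so L = z^2 * bkg / sig^2.
    """
    return z ** 2 * bkg_per_fb / sig_per_fb ** 2

# Hypothetical yields per fb^-1: same signal, half the background after the NN.
lumi_conventional = luminosity_for_significance(5.0, sig_per_fb=2.0, bkg_per_fb=20.0)
lumi_nn = luminosity_for_significance(5.0, sig_per_fb=2.0, bkg_per_fb=10.0)

print(f"conventional: {lumi_conventional:.1f} fb^-1")
print(f"NN          : {lumi_nn:.1f} fb^-1")
print(f"ratio       : {lumi_nn / lumi_conventional:.2f}")
```

Under this approximation the required luminosity scales as B/S^2, so halving the background at fixed signal efficiency halves the luminosity needed.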
Improving the Higgs Mass Resolution
[Figure: network-improved Higgs mass distributions; resolutions quoted on the panels: 13.8%, 12.2%, 13.1%, 11.3%, 13%, 11%]
Use m_jj and H_T (= Σ E_T of the jets) to train a neural network to predict the Higgs boson mass
Newer Approaches
Ensembles of Networks
Committees of Networks
[Figure: input X fed to networks NN_1, NN_2, NN_3, ..., NN_M with outputs y_1, y_2, y_3, ..., y_M]
A decision by a committee has, on average, lower error than its individual members.
The performance of a committee can be better than the performance of the best single network used in isolation.
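A toy illustration of the committee effect, assuming the committee output is the plain average of the member outputs. Trained networks are replaced here by a stand-in: each member returns the true value plus its own independent noise. The member count, noise level, and target values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_members, n_events = 10, 5000

# Stand-in for trained networks: true target plus independent noise per member.
target = rng.uniform(0.0, 1.0, n_events)
outputs = target + rng.normal(0.0, 0.2, size=(n_members, n_events))

member_mse = np.mean((outputs - target) ** 2, axis=1)  # one error per member
committee = outputs.mean(axis=0)                        # average of y_1 .. y_M
committee_mse = np.mean((committee - target) ** 2)

print(f"mean member MSE: {member_mse.mean():.4f}")
print(f"best member MSE: {member_mse.min():.4f}")
print(f"committee MSE  : {committee_mse:.4f}")
```

The committee's squared error can never exceed the average of the members' squared errors, whatever the outputs are. Beating the best single member, as happens here, depends on the members' errors being weakly correlated.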
Probabilistic Approach to Data Analysis
Bayesian Methods
(The Wave of the future)
Bayesian Analysis
M: model
A: uninteresting parameters
p: interesting parameters
d: data
Likelihood
Prior
Posterior
Bayesian Analysis of Multi-source Data
P.C. Bhat et al., Phys. Lett. B 407 (1997) 73
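The equation that the labels Likelihood, Prior and Posterior pointed to did not survive on this slide. For reference, the standard form of Bayes' theorem in the slide's symbols is sketched below. Writing the likelihood as L, the prior as π, and the posterior as P is an assumption made here to avoid a clash with the parameter symbol p.

```latex
% Posterior = Likelihood x Prior / normalization
P(p, A \mid d, M) =
  \frac{L(d \mid p, A, M)\, \pi(p, A \mid M)}
       {\int L(d \mid p', A', M)\, \pi(p', A' \mid M)\, dp'\, dA'}

% Marginalization over the uninteresting parameters A
P(p \mid d, M) = \int P(p, A \mid d, M)\, dA
```

The second line is the step that removes the uninteresting parameters and leaves a probability statement about the interesting ones alone.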
Higgs Mass Fits
S = 80 WH events; assume the background distribution is described by Wbb.
Results
S/B = 1/10: M_fit = 114 ± 11 GeV/c^2
S/B = 1/5: M_fit = 114 ± 7 GeV/c^2
Solar Neutrino Problem
Electron neutrinos from the Sun seem to be lost en route to the Earth. That loss is described by the neutrino survival probability, P(E).
We have used solar neutrino data and standard solar model predictions to extract P(E) and its uncertainties.
Solar Neutrino Data 1998
Bayesian Analysis
C. Bhat, P.C. Bhat, M. Paterno, H.B. Prosper, Phys. Rev. Lett. 81, 5056 (1998)
Modeling the Survival Probability
The first term models the high-frequency components, which occur near the origin, while the second term models the lower-frequency components.
Take the likelihood to be a multivariate Gaussian; I is prior information.
Marginalization
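The survival-probability model itself did not survive on this slide, so the sketch below illustrates only the two ingredients that are named: a Gaussian likelihood and marginalization. It uses a hypothetical two-measurement toy, in which one measurement is sensitive to p + A and the other to A alone, with flat priors and a diagonal covariance, evaluated on a grid. None of the numbers come from the solar neutrino analysis.

```python
import numpy as np

# Hypothetical data: d[0] measures p + A, d[1] measures the nuisance A alone.
d = np.array([3.0, 1.0])
sigma = np.array([0.5, 0.3])           # assumed Gaussian uncertainties

p_grid = np.linspace(-2.0, 6.0, 401)   # interesting parameter
a_grid = np.linspace(-2.0, 4.0, 301)   # uninteresting (nuisance) parameter
P, A = np.meshgrid(p_grid, a_grid, indexing="ij")

# Gaussian likelihood with diagonal covariance, evaluated on the grid.
chi2 = ((d[0] - (P + A)) / sigma[0]) ** 2 + ((d[1] - A) / sigma[1]) ** 2
likelihood = np.exp(-0.5 * chi2)
prior = np.ones_like(likelihood)       # flat prior (assumption)

dp = p_grid[1] - p_grid[0]
da = a_grid[1] - a_grid[0]
joint = likelihood * prior
joint /= joint.sum() * dp * da         # normalize the joint posterior

marginal_p = joint.sum(axis=1) * da    # integrate out A
mean_p = np.sum(p_grid * marginal_p) * dp
std_p = np.sqrt(np.sum((p_grid - mean_p) ** 2 * marginal_p) * dp)
print(f"p = {mean_p:.3f} +/- {std_p:.3f}")
```

The marginal width comes out larger than sigma[0] alone because integrating over A folds the uncertainty on the nuisance parameter into the result for p.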
Neutrino Survival Probability
C. Bhat et al.
Advantages of Bayesian Approach
Provides probabilistic information on each parameter of a model (SUSY, for example) via marginalization over other parameters
Bayesian method enables straightforward and meaningful model comparisons.
Bayesian approach allows treatment of all uncertainties in a consistent manner.
Mathematically linked to adaptive algorithms such as Neural Networks (NN)
Hybrid methods involving NN for probability density estimation and Bayesian treatment can be very powerful
Summary
We are building very sophisticated equipment and will record unprecedented amounts of data in the coming decade
Use of advanced “optimal” analysis techniques will be crucial to achieve the physics goals
Multivariate methods, particularly Neural Network techniques, have already made an impact on discoveries and precision measurements and will be the methods of choice in future analyses
Hybrid methods combining “intelligent” algorithms and a probabilistic approach will be the wave of the future