No Fault Found, Retest-OK, Cannot Duplicate or Fault Not Found?




Towards a standardised Taxonomy

Dr Samir Khan

Dr Paul Phillips

Mr Chris Hockley

Prof Ian Jennions

5th November 2012



Agenda

Overview of the NFF problem

Cause and impact of NFF

Tackling NFF

The Multitude of Terms

Impact of inconsistent terminology

Maintenance culture

Concluding Remarks


The NFF problem

[Figure. Labels: No fault, Fault, Diagnostic failure, System, Diagnostic success, Strategy, No fault]



NFF Definition

Removals of equipment from service for reasons that cannot be verified by the maintenance process (shop or elsewhere).

(Source: ARINC Report 672 (2008), Guidelines for the reduction of No Fault Found (NFF). Avionics Maintenance Conference, Aeronautical Radio Inc)



To Achieve Diagnostic Success?

Implies identification of the root cause.

That enables the correct maintenance activity to be performed.

It suggests a closed-loop system that can relate the symptom to the fault to the correct maintenance solution.



Some NFF history

1993, BA concern at cost of removals where
- nothing found wrong
- same fault re-occurred

13.8% of all unscheduled removals = NFF

Cost £17.6M per year

80.4% of all NFF were avionics

26.6% of all avionics removals were NFF

(BA Presentation, ERA Conference 1996)
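
The three BA percentages quoted above are related to one another. As a rough cross-check, and assuming all three refer to the same population of unscheduled removals (the slide does not say so), they imply what share of unscheduled removals were avionics. The variable names are illustrative only.

```python
# Figures quoted from the BA presentation (ERA Conference 1996).
nff_share_of_removals = 0.138   # NFF as a fraction of all unscheduled removals
avionics_share_of_nff = 0.804   # fraction of NFF removals that were avionics
nff_rate_in_avionics = 0.266    # fraction of avionics removals that were NFF

# Assumption: all three figures describe the same set of unscheduled removals.
avionics_nff_share_of_removals = nff_share_of_removals * avionics_share_of_nff
avionics_share_of_removals = avionics_nff_share_of_removals / nff_rate_in_avionics

print(f"Avionics NFF as share of all removals: {avionics_nff_share_of_removals:.1%}")
print(f"Implied avionics share of all removals: {avionics_share_of_removals:.1%}")
```

Under that assumption, avionics NFF removals come to about 11.1% of all unscheduled removals, and avionics as a whole to about 41.7%.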

BA estimate costs at £20M per year
(Blischke WR & Murthy (2003), pub. John Wiley & Sons)

1996, Boeing state 40% rate of incorrect parts removal from the airframe
(R Knotts, MPhil Thesis, Exeter)

Avionics constitutes 75% of NFF occurrences in aerospace
(Aviation Week, Feb 9 2007)



Common Stages of NFF

[Figure. Labels: Stage 1, Stage 2, Contractors]

Results from Past Experiences and Studies

In aerospace 50% of LRUs are classified NFF

Satellite industry: 50% increase

Surveys in electronic equipment: NFF 19 -> 53%

F16 radar: 67%

85% of all operational faults in aircraft electronics

Train control system: 50% of faults

Some NFF events seem to be independent of life cycle stage

AMC 2004 voted NFF ‘the most important issue’

AC: avionics 74%, pneumatic 19%

NFF costs can be generated through the following activities:
- Removals for the wrong reasons
- Workshops failing to find and then repair the reported fault
- Inability to simulate the conditions in which the fault occurred

Cost of removals due to No-Fault-Found can be huge where:
- Nothing has been found at fault
- The same fault re-occurs in the next or a subsequent mission, risking safety and/or mission success


Common Stages of NFF


Reasons for NFF


Not enough emphasis on diagnostics training


Pressure on quantity in workshops not quality


No emphasis on history of components


No tracking of “rogue units”
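
Tracking "rogue units" amounts to keeping a removal history per serial number and flagging units that repeatedly come back NFF. A minimal sketch follows, assuming a hypothetical record layout (serial number, shop finding) and an arbitrary threshold of three NFF findings. Neither comes from the presentation.

```python
from collections import Counter

def find_rogue_units(removals, nff_threshold=3):
    """Return serial numbers with at least `nff_threshold` NFF shop findings.

    `removals` is an iterable of (serial_number, shop_finding) pairs.
    Both the record layout and the threshold are illustrative assumptions.
    """
    nff_counts = Counter(
        serial for serial, finding in removals if finding == "NFF"
    )
    return sorted(s for s, n in nff_counts.items() if n >= nff_threshold)

# Hypothetical removal history for three units.
history = [
    ("SN-001", "NFF"), ("SN-002", "FAULT CONFIRMED"), ("SN-001", "NFF"),
    ("SN-003", "NFF"), ("SN-001", "NFF"), ("SN-002", "NFF"),
]
rogues = find_rogue_units(history)
print(rogues)  # ['SN-001']
```

A real threshold would probably depend on the unit type and the time window considered. The point here is only that the component history has to be recorded per serial number before any such check is possible.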


Causes of NFF

[Figure. Labels: Reliability issues, Loose conn./corrosion, Poor maintenance, System design, intermittent, integration, Designer team, Maintenance team, complexity, BIT, Organisational setup, Management team, Poor understanding, BIT]


Why is Tackling NFF difficult?


The solutions to NFF problems may not be possible because of:
- Organisational culture
- Procedures and rules
- Technical inefficiencies
- Workforce behaviour




[Figure: factors leading to NFF. Labels:]

Insufficient time available for maintenance

Design related hard/software def.

Poor on board maint. system

Insufficient training of line and shop personnel

Inadequate design for testability

Poor test coverage

Performance measure

Inconsistent fault reporting


Technical Causes of NFF


Undefined or limited performance measures


BIT levels set too low


BITE ability to detect a fault


BITE inability to apply reliable fault diagnostics


Lack of information on operating environment


Inability to reproduce operating environment during test or diagnosis


Intermittent failure caused by stress not replicated on test


Inadequate design suitable for robust testing for all faults


Inadequate fault models and fault trees for determining root causes


Lack of understanding of interactions between different integrated systems and software


Reluctance to adopt new technologies (health monitoring) due to the need to alter system designs or because the data handling/decision-making infrastructures are not available


Organisational



Time pressures on maintenance operations



Organisational cultures with no cross-functionality, employee empowerment or encouragement to identify root causes



Inadequate training or lack of training tools



No commitment to sharing information and knowledge between
designers, manufacturers, service providers and operators



Solutions likely to be disruptive to normal working practices, leading to reluctance to change


Behavioural and Procedural


Discrepancies and faults in test procedures



Incorrect fault reporting



Wrong processes applied



Incomplete documentation



Lack of communication between maintenance personnel and
other experts.



Mitigation of NFF

Modelling of intermittent faults; e.g. detect probability of fault

Design of fault tolerant systems; add redundancy

Design methodology; e.g. include in-built health monitoring

Transfer of information

Diagnostic tests should be added before commissioning

Management policy; coherent teams

Tests should be certified by the supplier

Recognise fault alarms

Address root cause of BIT deficiencies


The Problem

Within this field there are several identified problems:
- A lack of common understanding on what constitutes the phenomena resulting in Diagnostic Failure
- Wide concept and lack of commonality in processes
- Several existing standards deal with aspects of the phenomena but may be overly specific
- Existing standards are incomplete and fail to fully define the problem in a way which has been accepted by industry
- A lack of interaction between the aspects, processes, activities and stakeholders who suffer the impacts


A Multitude of Terms


Missing published academic literature

Early calls made into testability attributes of electronic equipment, specifically to mitigate NFF, but this has not yet been achieved across all test/maintenance levels.

Cultural impacts

Copernicus Technology Ltd survey


Q1) How can a true gauge of the problem be investigated if there is no standardised term used in the maintenance history?

Q2) Are all of these terms accurate? Do they actually describe the same event, or are there subtle differences which need to be recognised?
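
Q1 can be made concrete: before NFF rates can be counted, the free-text terms in maintenance records have to be mapped onto one label. A minimal sketch follows, assuming a hypothetical synonym table built from the terms in this presentation's title. Whether these terms really describe the same event is exactly what Q2 asks, so the mapping is an assumption and not a proposed standard.

```python
from collections import Counter

# Hypothetical synonym table; treating these terms as equivalent is an assumption.
SYNONYMS = {
    "no fault found": "NFF",
    "nff": "NFF",
    "retest ok": "NFF",
    "retest-ok": "NFF",
    "cannot duplicate": "NFF",
    "fault not found": "NFF",
}

def normalise(term):
    """Map a free-text shop finding onto a single label, or 'OTHER'."""
    key = " ".join(term.lower().split())  # lower-case and collapse whitespace
    return SYNONYMS.get(key, "OTHER")

def nff_rate(findings):
    """Fraction of findings that normalise to NFF."""
    counts = Counter(normalise(f) for f in findings)
    total = sum(counts.values())
    return counts["NFF"] / total if total else 0.0

# Hypothetical maintenance-history entries.
records = ["No Fault Found", "Retest-OK", "connector replaced", "cannot  duplicate"]
rate = nff_rate(records)
print(rate)  # 0.75
```

Counting only the literal string "No Fault Found" in the same records would give a rate of 0.25, which illustrates how inconsistent terminology can hide the true scale of the problem.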


Do we need standards?

Adopting standards helps industries (and research) overcome technical barriers by promoting organisational success through better workflow paradigms and maintenance strategies

Most of the existing descriptions of these phenomena do not consider where exactly these diagnostic failures occur within the maintenance process

“No Fault Found is a reported failure that cannot be confirmed, recognised or localised during diagnosis and therefore cannot be repaired” (Roke, 2009)

“The inability to replicate field failure during repair shop test/diagnosis” (Kirkland, 2011)

“A failure that may have occurred but cannot be verified, or replicated at will, or attributed to a specific root cause, failure site, or failure mode” (Qi et al., 2008)

“Removals of equipment from service for reasons that cannot be verified by the maintenance process (shop or elsewhere)” (ARINC Report 672, 2008)


Is a single term enough?

The majority of definitions lead one to believe that a failure during operation (such as an intermittent fault) is the actual ‘No Fault Found’ event, which leads the majority of academic literature to classify NFF into three distinct categories:
- intermittent failures
- integration faults
- Built In Test Equipment (BITE) failures

However, such a practice may be incorrect, as these are primarily the root causes which begin a sequence of events through various levels of maintenance, and which blend with other factors such as organisational/behavioural/cultural and technical abilities to result in the final outcome of NFF (drivers towards diagnostic failure).

Maintenance culture


[Figure. Column headings: sources, drivers, faults. Captions: "Leads to a maintenance action", "Influencing factors that lead to a diagnostic failure", "Location of the fault".]

[Labels: Root causes, Intermittency, BIT, Operator error, Poor design, False alarm, Design defects, Connectors, Cables, Chassis (LRU), Components, Lack of communication, Test coverage, Incorrect reporting, Wrong process, integration, Test procedures, Insufficient training]


Maintenance culture

1) There is no evidence of faults during testing so the unit can be certified as serviceable as there probably never was a fault

2) There has been no evidence of faults during testing but the test coverage may be inadequate so it is best to replace the unit just to be on the safe side

3) Repeating NFF issue: not a single event but a sequence

4) Fault Not Found instead of No Fault Found?

“A reported fault for which the root cause cannot be found, in other words a diagnostic failure” (Cockram and Huby, 2009)

Research Scope


Lack of standards, with different terms describing largely the same problem

Human Factors is one of the core drivers towards diagnostic failure

To identify procedural, process and behavioural issues that need to be changed, learning from best practice in each industry.

To devise strategies, methodologies and system design rules to mitigate the occurrence of intermittent failure mechanisms and to demonstrate their effectiveness in reducing the likelihood of NFF occurrences.

To develop a multi-disciplinary approach at the System level for the effective analysis of the root causes of NFF in order to assist design activity across domains.




Concluding Remarks

Key Challenges


Providing solutions which do not have a negative impact on current operations, i.e. increased downtime means reduced availability

Specific training

Development and adoption of appropriate Standards

Diagnostic procedures which encourage identification of root causes

Cultural changes

Working with design to identify the necessary improvements to mitigate NFF through-life




Contact info

Samir: Samir.khan@cranfield.ac.uk

Paul: p.phillips@cranfield.ac.uk