DRIVER RATING SYSTEM USING DATA RECORDER





M.Tech Dissertation


Submitted in partial fulfillment of the requirements

for the degree of



MASTER OF TECHNOLOGY



By

Rahul Prakash Mundke

03329013


Under the guidance of

Prof. Kavi Arya
















Kanwal Rekhi School of Information Technology

Indian Institute of Technology Bombay

June 2005








ACKNOWLEDGEMENT

I wish to express my sincere thanks to my guide, Prof. Kavi Arya, for encouraging my efforts throughout the last one and a half years. I would also like to thank Prof. Krithi Ramamrithiam and Prof. Parmesh Ramanathan for their valuable feedback. I would especially like to thank my friend Sachitanand Malewar; this work would not have been possible without his assistance and suggestions.

I am thankful to the IIT Bombay transport department for their help with the project. I would like to thank my classmates; discussions with them yielded useful comments and criticism.

Finally, I express my gratitude to my parents and all those who were involved directly and indirectly in the completion of this project.

Rahul Mundke
03329013
June, 2005

CONTENTS

ACKNOWLEDGEMENT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACRONYMS
ABSTRACT

1 INTRODUCTION
  1.0 INTRODUCTION
  1.1 PROBLEM DEFINITION
  1.2 APPROACHES FOR IMPROVING FUEL EFFICIENCY
  1.3 ORGANIZATION OF REPORT

2 FUEL EFFICIENT DRIVER MODEL
  2.0 INTRODUCTION
  2.1 FUEL EFFICIENT DRIVING TIPS
  2.2 DIFFERENT DRIVING CONDITIONS
  2.3 FEATURE SELECTION USING DOMAIN KNOWLEDGE

3 DATA RECORDER DEVELOPMENT
  3.0 INTRODUCTION
  3.1 DEPLOYMENT PLATFORM
  3.2 DATA RECORDER VERSION 1.0
    3.2.1 Sensors interface
    3.2.2 ATMEGA16 microcontroller based interfacing unit
    3.2.3 Data logging section
  3.3 DATA RECORDER VERSION 2.0
    3.3.1 Sensors and its signal conditioning
    3.3.2 Interfacing unit
    3.3.3 Data logging section
  3.4 PROBLEMS ENCOUNTERED AND THEIR SOLUTIONS
    3.4.1 Clutch, brake and accelerator position sensors
    3.4.2 GPS receiver
    3.4.3 Gear Position Sensors
    3.4.4 Accelerometer (Acceleration based sensors)
    3.4.5 Sampling Rate
  3.5 DRIVER MONITORING APPLICATION

4 DATA MINING TECHNIQUES
  4.0 INTRODUCTION
  4.1 APPROACH
  4.2 DATA PRE-PROCESSING
  4.3 DIMENSIONALITY REDUCTION
    4.3.1 Fourier Transform
    4.3.2 Principal Component Analysis
    4.3.3 Higher Level Feature Extraction
  4.4 PREPARING TRAINING DATA SET
    4.4.1 Markov Chain Based Application
    4.4.2 RCD Model based Application
    4.4.3 Racer Application
  4.5 DIFFERENT CLASSIFICATION APPROACHES
    4.5.1 Comparing the system parameters
    4.5.2 Comparing Predictions
  4.6 CLASSIFIERS IN DETAIL
    4.6.1 Regression Based Classifiers
    4.6.2 Bayes Classifier
    4.6.3 Hidden Markov Model Based Classifier
    4.6.4 Neural Network Approach Based Classifier
  4.7 RESULTS BASED ON SYNTHETIC DATA
    4.7.1 Comparison with other methods

5 CONCLUSION

6 FUTURE WORK

REFERENCES
  BOOKS
  REPORTS AND PAPERS FROM CONFERENCES AND JOURNALS
  THESIS, REPORTS AND TUTORIALS
  WEBSITES

APPENDIX
  A. DEVICE DRIVER DEVELOPMENT
  B. DRIVER ASSIST SYSTEM OVERVIEW
  C. TOOLS
    1. Weka (Data Mining Platform)
    2. Racer Software






LIST OF FIGURES

Figure 1.1: Fuel Price Trends (Crude Oil)
Figure 2.1: Energy consumption by petrol engines (from US DOT, 1998)
Figure 3.1: IIT Bus on which experiments are conducted
Figure 3.2: Cut view of bus showing location of sensors
Figure 3.3: EDR Version 1.0
Figure 3.4: Accelerometer along with its housing
Figure 3.5: Placement of the potentiometers
Figure 3.6: Magnetic pickup engine RPM
Figure 3.8: Sensor for Velocity
Figure 3.10: EDR version 1
Figure 3.11: Block Diagram of EDR Version 1.0
Figure 3.12: EDR Version 2.0
Figure 3.13: Sensor for gear position
Figure 3.14: Interfacing unit
Figure 3.15: EDR version 2.0 block diagram
Figure 3.16: Screen shot of the Driver monitoring application
Figure 3.17: Screen shot of Vehicle Tracking Application
Figure 3.18: Screen shot of the Recording application with Graph plotting feature
Figure 4.1: Overview of the steps comprising KDD Process
Figure 4.2: State Model for Synthetic Data Generation application-1
Figure 4.3: Synthetic data generated using application-1
Figure 4.4: Synthetic data generated using application-2
Figure 4.5: Driving Profile generated using Racer tool: Good driving example
Figure 4.6: Driving Profile generated using Racer tool: Bad driving example
Figure 4.8: Parametric Regression finding a linear equation model
Figure B.1: Driving data Recording cum Advice Application
Figure C.1.1: Weka Application
Figure C.2.1: Racer Simulation Environment
Figure C.2.2: Racer Simulation Environment





LIST OF TABLES

Table 1: BEST Bus Mileage
Table 2: Traffic variables used in analysis
Table 3: Comparison on Synthetic Data
Table 4: Comparison on Synthetic Data




ACRONYMS

SAE   : Society of Automotive Engineers
EDR   : Event Data Recorder
KDD   : Knowledge Discovery in Databases
KMPL  : Kilometers Per Liter
RPM   : Revolutions Per Minute
BEST  : Brihanmumbai Electricity Supply and Transport
GPS   : Global Positioning System
USB   : Universal Serial Bus
PCA   : Principal Component Analysis
DFT   : Discrete Fourier Transform
STFT  : Short Time Fourier Transform
ADC   : Analog to Digital Converter
WEKA  : Waikato Environment for Knowledge Analysis


ABSTRACT

Whereas data recorders are usually utilized in vehicles for post-crash analysis, we propose a new approach of using this device to proactively improve the drive quality of a driver. Whereas vehicles in the West generally operate in an optimal environment with high-quality road infrastructure, the situation in developing countries such as India is different. The vehicles are often under-powered, sub-optimally maintained, over-loaded and ply on poor-quality roads where the driving style greatly affects the safety aspects and performance of the vehicle. In such an environment, fuel costs are also relatively high and greatly influenced by the driving style of a driver. By improving the driving style, fuel economy can be improved by up to 15%. The resulting system may be used by fleet managers to study driver behavior, to improve the safety of their services and to help drivers improve the fuel economy of their driving. An optimal driver model is developed along with data mining techniques. The solution would be able to rate driving style on a fuel-economy basis by analyzing the driver’s interaction data with the vehicle. Preliminary results derived from synthetic tests are presented.

Keywords

Fuel Efficiency, Driver Assist, Feature Selection, Driver Rating, Time Series Classification


CHAPTER ONE

INTRODUCTION

1.0 INTRODUCTION


A great deal has been written about the safety aspects of driving style. Well-researched issues include the relationships between driving speed and variables such as crash rate, severity of injury and likelihood of a fatality. Aggressive driving has received considerable public attention. With increasing concerns about air pollution and the greenhouse effect, there has also been extensive research examining the effect of driving style on the emissions of motor vehicles. However, there is not a large volume of research examining the relationship between driving style and fuel efficiency.


In countries like India, vehicle owners are more concerned about the running cost of a vehicle. Running cost mainly comprises expenditure on fuel and on scheduled and unscheduled maintenance. Different techniques to improve fuel economy are being developed, and existing ones are evolving. One of these techniques involves improving driving style. Technical studies of this technique have shown that there is a strong correlation between driving style and fuel consumption. Even without going to extremes of driving behavior, differences in fuel efficiency between different styles ran to tens of percent [18] [15]. This suggests a potential for reducing fuel consumption by influencing the driving style [5] [8].



Figure 1.1: Fuel Price Trends (Crude Oil) (Source: Indian Oil Corporation web site)


While interest in better fuel economy is mainly due to economic reasons, it is also tied to the question of pollutants, thus attracting the attention of organizations with an environmental agenda. Fuel efficient driving thus also helps in reducing air pollution [3-6]. A fuel efficient driving style is also found to be a good and safe driving style, thus preventing accidents, reducing wear and tear on vehicles as well as drivers, and resulting in better comfort for passengers.


Energy consumption in its present volume and composition is using up our scarce fossil fuel resources. Pollution resulting from this energy consumption has a negative effect on the environment. The combustion of petrochemical fuels emits various chemical compounds into the environment. The type and amount of pollution depends on many factors: the type of engine or vehicle (e.g., passenger car, heavy-duty truck, lawnmower, locomotive), the type of fuel used (e.g., gasoline, diesel, or alternative fuels), the type and condition of emission control devices (e.g., catalytic converters), and, importantly, how the engine is used or run.


Although several new energy-supply technologies are emerging, the world’s consumption of fuels derived from natural oil keeps increasing every year. From an environmental perspective, fuel consumption results in the production of vehicle emissions, which can be classified into air pollutants (which affect health) and greenhouse gases (which affect the environment). Fuel consumption also depletes stocks of non-renewable fossil fuels. Total fuel consumption can be decreased by reducing vehicle travel or by reducing the fuel consumption rate (improving fuel economy). In this project, the focus is on the measures that lead to fuel economy [13-17].

1.1 PROBLEM DEFINITION

The Brihanmumbai Electric Supply and Transport (BEST) undertaking provides a fleet of passenger buses for city transport. It owns more than 3500 buses running all over Mumbai. The average distance traveled by a bus per day is around 217 km. The bus service covers 90% of Mumbai's roads, serving around 4.5 million passengers every day. The bus variants and their current mileage are given below.




Bus Type                 Current Mileage (Kilometers per Liter)
Air Conditioned buses    1.9 KMPL
Double Decker            2.69 KMPL
Single Decker            3.0 KMPL

Table 1: BEST Bus Mileage


As per BEST data, fuel consumption per bus per day is in the range of 80 to 100 liters of diesel. The buses are generally of Ashok Leyland make and comprise 4 different engine models. With such a large number of buses, fuel cost is one of the important factors in operational expenditure. This project is one step towards helping BEST bring down this operating cost. A saving of around 5% on fuel would amount to roughly Rs. 10 million every month.
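The savings estimate can be sanity-checked with simple arithmetic. The sketch below uses only the fleet size, per-bus consumption and 5% saving quoted in the text. The 30-day month is an assumption, and the function names are illustrative. No diesel price is assumed: instead, the code back-calculates the price per liter at which a 5% saving would equal Rs. 10 million per month.

```python
# Figures quoted in the text (BEST data).
FLEET_SIZE = 3500                    # buses
LITERS_PER_BUS_PER_DAY = (80, 100)   # quoted range, liters of diesel
SAVINGS_FRACTION = 0.05              # 5% fuel saving
MONTHLY_SAVINGS_RS = 10_000_000      # Rs. 10 million per month

DAYS_PER_MONTH = 30                  # assumption, not from the text


def monthly_liters_saved(liters_per_bus_per_day: float) -> float:
    """Liters of diesel saved per month across the whole fleet."""
    return (FLEET_SIZE * liters_per_bus_per_day
            * DAYS_PER_MONTH * SAVINGS_FRACTION)


def implied_price_per_liter(liters_per_bus_per_day: float) -> float:
    """Diesel price (Rs./liter) at which the saving equals Rs. 10 million."""
    return MONTHLY_SAVINGS_RS / monthly_liters_saved(liters_per_bus_per_day)


if __name__ == "__main__":
    for liters in LITERS_PER_BUS_PER_DAY:
        saved = monthly_liters_saved(liters)
        price = implied_price_per_liter(liters)
        print(f"{liters} L/bus/day: {saved:,.0f} L saved per month, "
              f"implied price Rs. {price:.2f} per liter")
```

Under these assumptions the quoted saving corresponds to roughly 420,000 to 525,000 liters of diesel per month.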


BEST is interested in ways by which its crew (drivers) can be monitored, analyzed and advised to drive in a fuel efficient manner and thereby help the organization improve its bottom line. BEST is also particular about the solution and wishes to avoid any retro-fitment which involves downtime. Thus the solution needs to be portable and non-intrusive: some kind of box which can be carried by the crew.




BEST needs a new-generation, fuel-efficiency based driver rating tool. The rating tool includes a rule based model that formulates optimal driver behavior, which is used to differentiate between fuel-efficient and non fuel-efficient drivers. If actual driving behavior deviates from the optimal fuel-efficient behavior, detailed advice is presented to the driver on how to change driving behavior in order to minimize fuel consumption.

The tool needs to meet the following requirements.

- Driver rating should be done by analyzing a minimal set of sensor variables
- Rating should be fair and accurate to a satisfactory level as decided by BEST
- The tool needs to be non-intrusive to the existing setup on the bus
- The target price of the tool needs to be less than $100
- The tool should be portable across other fuel based vehicles wherever possible



1.2 APPROACHES FOR IMPROVING FUEL EFFICIENCY


The largest potential for improving fuel economy in road transport probably lies in enhancing vehicular technology. However, such an approach has a relatively long gestation time. The most effective way to reduce fuel consumption in the short term is to change driver behavior, which can lead to a reduction in fuel consumption of up to 15% [3]. An additional benefit is that the improvement achieved will still be valid when new vehicular technology becomes available. Together they can reduce fuel consumption even further.


The power delivered by the engine of a car is used to overcome air resistance, friction, rolling resistance and inertia (vehicle weight) during acceleration. When driving at low speed, the main factor determining power requirements is vehicle weight. Technological progress has helped vehicle manufacturers to reduce the weight of a vehicle. But there are limits to how far this can be stretched, as a lower-weight vehicle also means lower safety [3]. A report by the National Highway Transport states that making all passenger vehicles 100 pounds lighter could cost 1,000 more lives per year, mostly in crashes of vehicles that already are lighter [4].


One can see numerous concept vehicles based on futuristic technology on display at major automobile fairs. One such vehicle is the hybrid electric vehicle. These are powered by a conventional engine and a large battery; the latter is charged by the engine. Another actively researched technology is the fuel cell vehicle, which brings down running costs to a great extent.


Other approaches obtain fuel savings by restricting top speed and engine power (indirectly, engine volume and weight). It is also known that diesel engines are by nature more fuel efficient than equivalent petrol engines. The advantage tends to be greater in urban conditions, as petrol engines show a much greater decline in efficiency at part load than diesel engines. Direct injection engines are more fuel efficient than indirect injection engines.



Here are some of the techniques in use for improving the fuel efficiency of engines [4-6]:

- High powered ignition systems that ensure complete consumption of the fuel available
- Improved fuel injectors
- Computer controlled engine management
- Improved compression at low engine loads
- Reduced mechanical friction
- Reduced air drag and rolling resistance


But as can be seen, not all of these are add-ons to existing engines; these improvements would come along with newer vehicles. One cannot overlook the huge existing base of older vehicles. One of the approaches in such a case is improving the driving style to get better fuel efficiency.


Analyzing why the problem of fuel efficiency is of so much more importance in Asia than in western countries led us to the following observations:

1. In western countries, fuel is cheap; even though its price has risen in recent times, it is still significantly below its 20 year high. Asian countries, on the contrary, spend a significant percentage of their GDP on the import of crude oil. Any $1 change in the per-barrel price would mean a significant price hike in fuel in these countries.

2. The infrastructure, i.e. roads, is excellent in western countries, making it less vulnerable to traffic jams and other such fuel wasting phenomena.

3. Vehicles in the Western markets are generally technically superior and more fuel efficient than the ones that can be found in Asia.

4. Fuel efficiency features less on the wish-list of customers from western countries compared to Asian customers, who are much more cost sensitive.


1.3 ORGANIZATION OF REPORT


Chapter 2 explains the fuel efficient driver model, that is, which features and sensor data are essential in determining the fuel efficiency measure for a driver. It gives the details of the features which are used for rating the driver.

Chapter 3 gives the details of the custom made hardware prototype and of the recording and other auxiliary software. It also details the problems faced during development along with their solutions. The evolution of the hardware and software is described briefly in this chapter.

Various data mining techniques are discussed in Chapter 4. The suitability of one classification method over others is discussed. This chapter also describes the various applications which are used for generating synthetic data. Preliminary results based on the synthetic data are discussed in the chapter. The conclusion is given in Chapter 5. The scope of further work and the plan of action are mentioned in Chapter 6.

CHAPTER TWO

FUEL EFFICIENT DRIVER MODEL

2.0 INTRODUCTION


A lot of research is being done on finding the best driving styles [6-15]. In some approaches, a computer generated driver model is used for analysis, while other studies are based on experiments with human drivers. One needs a good amount of data, collected over a sufficient period of time and representing all possible road and traffic conditions, to come up with the best possible fuel efficient driver models. A literature survey revealed interesting data in line with our assumptions. Existing work is mostly carried out from the viewpoint of the effects of driving on pollution, addressing questions such as “does changing driving style affect the pollution level in emissions?” Interestingly, the pollution level in exhaust emissions depends on fuel efficiency, as most of the time the pollution in vehicle exhaust is due to incomplete combustion of fuel.



Generally, driving style is described in terms of acceleration pattern, mean acceleration, mean deceleration and time spent at a stable velocity. Mean acceleration and deceleration over a distance correlate fairly well with fuel consumption, as discussed in [14].
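As an illustration of how such descriptors could be computed from recorded data, the following is a minimal sketch that derives mean acceleration, mean deceleration and the fraction of time spent at a stable velocity from a uniformly sampled speed trace. The function name, the sampling interval and the stability threshold are assumptions made for this example; they are not taken from the recorder described in this report.

```python
from typing import Dict, Sequence


def driving_style_features(speeds_mps: Sequence[float],
                           dt_s: float = 1.0,
                           stable_threshold: float = 0.1) -> Dict[str, float]:
    """Summarize a speed trace sampled every dt_s seconds.

    stable_threshold (m/s^2) is a hypothetical cut-off: accelerations with
    a magnitude at or below it are counted as "stable velocity".
    """
    if len(speeds_mps) < 2:
        raise ValueError("need at least two speed samples")

    # Finite-difference acceleration between consecutive samples.
    accels = [(b - a) / dt_s for a, b in zip(speeds_mps, speeds_mps[1:])]

    speeding_up = [a for a in accels if a > stable_threshold]
    slowing_down = [a for a in accels if a < -stable_threshold]
    stable = [a for a in accels if abs(a) <= stable_threshold]

    return {
        "mean_acceleration":
            sum(speeding_up) / len(speeding_up) if speeding_up else 0.0,
        "mean_deceleration":
            sum(slowing_down) / len(slowing_down) if slowing_down else 0.0,
        "stable_time_fraction": len(stable) / len(accels),
    }


if __name__ == "__main__":
    trace = [0.0, 2.0, 4.0, 4.0, 4.0, 2.0, 0.0]  # made-up speeds in m/s
    print(driving_style_features(trace))
```

Treating small accelerations as "stable" keeps sensor noise from being counted as speed changes; a suitable threshold would have to be tuned on real recordings.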



G
ood driving is mainly about planning in advance to avoid breaking, using the
engine
to
brake

most of the time
, accelera
ting strongly and shifting into higher gear as per
recommended engine rpm speed, using a uniform throttle when a desirable speed is
attained, driving at highest possible gear

etc.

This verbal description is the base for
specific instructions with the aim o
f facilitating these behaviors; for example keeping an
eye on traffic
-
lights up ahead to prepare for early braking. Such instructions are readily
understandable to human
s

but it is not certain how they should be translated into
measurable variables apart f
rom fuel consumption.



The terms “fuel economy”, “fuel consumption” and “fuel efficiency” are often used interchangeably when discussing vehicles, initiatives and policies. There is some benefit to defining each of these terms, as technically they relate to different aspects of a vehicle’s performance.

1. Fuel Consumption

Fuel consumption is simply the “total quantity of fuel consumed by a vehicle, or specified segment of the vehicle fleet, in a road network in a specified area and time period”. In the metric system, this volume of fuel is generally expressed in litres. Generally there are two fuel consumption tests: one for city driving and one for highway driving. The city driving test simulates a 12-km, stop-and-go trip with an average speed of 32 km/h. The test includes time spent idling and cold and hot starts. The highway driving test represents ‘non-city’ driving over a distance of 16.48 km, at an average speed of 77 km/h. The test is run from a hot start and has little idling time and no stops.

Factors affecting fuel consumption rate

The Biggs-Akcelik instantaneous model of fuel consumption and emissions is described in [24]. In this model, the characteristics of the vehicle that affect fuel consumption are vehicle mass, the fuel used in maintaining engine operation (estimated by the idle rate), engine efficiency in general, energy efficiency during acceleration, rolling resistance and aerodynamic resistance.

The primary characteristic of the roadway that affects fuel consumption is percentage gradient. Fuel consumption increases with speed because the total tractive force needed to drive the vehicle increases. Aerodynamic resistance increases more than proportionally with speed. Fuel consumption also increases with acceleration.



2. Fuel Economy

Fuel economy is the inverse of fuel consumption rate: it is the distance that can be traveled using a certain amount of fuel. In the metric system, fuel economy is traditionally measured in kilometers per litre.


3. Fuel Efficiency

The standard dictionary definition of efficiency in mechanical terms is essentially the ratio of the work or energy output of a machine or process to the work or energy input, often expressed as a percentage. Due to forces such as friction and inertia, this ratio generally does not reach 100%. Fuel efficiency, therefore, is the work output of an engine in terms of vehicle travel as a function of the energy content of the fuel expended in the operation of the vehicle. As such, the fuel economy of a car can be enhanced by improving the fuel efficiency.
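The inverse relationship between fuel economy and fuel consumption rate defined above can be made concrete with a short sketch. The function names are illustrative, and expressing the consumption rate in litres per 100 km is a choice made for this example rather than something prescribed in this report.

```python
def fuel_economy_kmpl(distance_km: float, fuel_litres: float) -> float:
    """Fuel economy: distance traveled per litre of fuel (km/l)."""
    if fuel_litres <= 0:
        raise ValueError("fuel_litres must be positive")
    return distance_km / fuel_litres


def consumption_rate_l_per_100km(economy_kmpl: float) -> float:
    """Fuel consumption rate as the inverse of economy, in litres per 100 km."""
    if economy_kmpl <= 0:
        raise ValueError("economy_kmpl must be positive")
    return 100.0 / economy_kmpl


if __name__ == "__main__":
    # Mileage figures from Table 1 (KMPL).
    for bus_type, kmpl in [("Air Conditioned", 1.9),
                           ("Double Decker", 2.69),
                           ("Single Decker", 3.0)]:
        rate = consumption_rate_l_per_100km(kmpl)
        print(f"{bus_type}: {kmpl} km/l = {rate:.1f} l/100 km")
```

For example, the 3.0 KMPL quoted for single decker buses in Table 1 corresponds to about 33.3 litres per 100 km.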


As Figure 2.1 demonstrates, about 18% of the energy content of fuel is used to move a car along the road, split between overcoming rolling friction, aerodynamic drag, and inertia (US DOT, 1998). The remaining 82% of the initial energy is lost as heat in the engine.

Figure 2.1 Energy consumption by petrol engines (from US DOT, 1998)


2.1 FUEL EFFICIENT DRIVING TIPS

A quick literature survey suggests the following tips, which result in fuel efficient driving [6-16]. These tips form the basis for our optimal driver model, which would be generated by looking for the presence or absence of these traits in a driving profile.



- Use the clutch only at the right time.
- Drive as much as possible at the recommended optimal engine RPM.
- Accelerate strongly at first and then maintain the throttle position once the desired speed is achieved.
- Shift to a higher gear well before reaching high engine RPMs.
- When heading off, change up to second gear as soon as possible and then to higher gears at one-third to half-throttle.
- Engine speed should not exceed 3000 rpm (or the level of highest torque).
- Look and plan ahead and coast to traffic lights or intersections so that there is no unnecessary braking and the timing is such that the vehicle does not come to a complete stop.
- Drive to match the rhythm of the traffic.
- Use the upper gears as much as possible and keep engine speeds down.
- In vehicles of increased power and higher torque, make the engine work more rather than changing down a gear.
- Skip gears when it is appropriate.
- Keep engine idling to a minimum.
- No “warming-up” time is required when a car is first started.
- Avoid using air conditioners.
- Maintain the right pneumatic pressure in the wheels.
- Avoid driving at very high speed; maintain a speed between 60 and 80 km/h.

2.2 DIFFERENT DRIVING CONDITIONS

One limitation of this data is its inability to capture road and traffic conditions. The rating system therefore needs to consider the different driving conditions given below.


1. Driving on an interstate highway

One would expect to observe a large fraction of the journey traveled in a higher gear at a constant cruising speed. In this setting, different road terrains would be seen. Over a straight stretch of road, a fuel efficient driver would take care not to push the engine to extreme levels, i.e. driving at very high engine RPMs. Driving behavior at sharp turns would be one of the important judging criteria, as here a fuel efficient driver would stand out clearly from the rest. Since the distance traveled is larger, differences and savings among drivers would be more visible.


2. Driving on a high-traffic city road

In contrast to highway traffic, city driving faces varied traffic conditions, ranging from frequent stops at signals to traffic congestion to freeways. Here the driving style is more crucial, as the driver needs to make more, and quicker, interactions with the vehicle than in highway driving. The fraction of time the engine spends idling or in lower gears forms a significant proportion. Features for the algorithm addressing this condition need to be chosen carefully, taking the different traffic conditions into account.

2.3 FEATURE SELECTION USING DOMAIN KNOWLEDGE

Success in any data mining based solution depends on the right choice of features. It has been seen that, in some cases, the right choice of features gives comparable results for different classifiers [13]. Feature selection can significantly improve the comprehensibility of the resulting classifier model and often builds a model that generalizes better to unseen points. The main goal of feature selection is to find the subset of features that gives higher classification accuracy. A further goal is to choose a subset of optimal input variables by eliminating features with little or no predictive information.

In the driving domain, many variables affect driving performance. While the distance between our vehicle and the vehicle immediately in front of us is probably important, the distance from our vehicle to the one behind us may not be very important in most cases. Given below are the details of the features which are modelled for a fuel efficient driver.



1. Time Fraction in higher Gear

A vehicle gives maximum fuel efficiency in the highest gear possible for the given road conditions. This is mainly due to the gear ratio settings: in a higher gear, each engine revolution turns the wheel assembly further. A fuel efficient driver, depending on the torque requirement, would drive the vehicle in the highest possible gear without stalling the engine. The ratio of instances where the vehicle is driven in gear X to the total number of instances is taken. This derived feature goes through normalization to reduce the sample space for the algorithm so as to give better classification. [11-13, 19]

2. Time Fraction of violation as per Gear vis-à-vis Speed readings

Every manufacturer suggests an optimal gear for different engine RPMs and vehicle speeds. A fuel efficient driver takes care to comply with these recommendations. An engine operated outside these recommendations not only consumes more fuel but also suffers more wear and tear, reducing engine and vehicle life and further degrading the fuel efficiency of the engine. Again, to be fair to drivers who might have driven vehicles for different amounts of time, the fraction of this variable is taken as the feature. [11-13, 19]



3. Time Fraction of hard decelerations

Hard braking very quickly destroys the momentum gained by the vehicle. Frequent hard braking in the data is clearly a sign of a fuel inefficient driving style. Algorithmic interests in this feature included checking the sensitivity of measurements when decelerating at low speeds (< 5 m/s) and constructing algorithms that can distinguish long decelerations from (too many) short decelerations. [6-15, 18]

4. Time Fraction of too fast accelerations

A large number of too fast accelerations in the acceleration profile indicates the aggressiveness of the driver. A fuel efficient driver ideally shows a profile of smooth acceleration. Algorithmic interests in this feature included checking the sensitivity of measurements when accelerating from low speeds (< 5 m/s) and constructing algorithms which can distinguish long continuous accelerations from (many) short accelerations. [6-15, 18]


5. Average Speed during journey

This gives an estimate of the driver's ability to drive at optimal speeds. This data, collected over a period of time for a particular route, could become a quick check while rating a driver. Fuel inefficient drivers who exceed the speed limit are aggressive, time-pressured drivers who not only use more fuel due to their high speed, but are also more likely to accelerate and decelerate at higher rates as they thread their way through traffic, increasing their fuel consumption dramatically. [6-15, 18, 24]

6. Average Acceleration during journey

Acceleration patterns offer quite a different way of measuring driver behavior. They have been studied as predictors of traffic accidents, but also of fuel consumption [18]. The advantage of acceleration patterns as individual-difference variables is that they have been shown to be stable over time. A good driver would have a higher mean acceleration and a lower mean deceleration than other drivers. [12-14]

7. Standard deviation of trip speed

This measures the deviation of speed from the average speed, which helps in finding the extremes of the speed range covered by the driver. A good driver would show a lower deviation than other drivers; this number would also tend to be the same over multiple runs. [12-14]

8. Standard deviation of acceleration

Similar to the standard deviation of trip speed, this measures the deviation of acceleration from the average acceleration, and is a good candidate for comparing drivers in case of a tie on average acceleration.


9. Time Fraction of long idling period

In city traffic, where one can expect traffic jams and signal stops, idling periods are more likely to be seen. A long idling period indicates time for which the vehicle ignition could have been turned off, saving fuel.

10. Fraction of time misuse of clutch pedal

Ideally the clutch should be used only while shifting gears. Driving with the clutch pressed wastes some of the engine's energy and also results in wear of the clutch plate. Misuse of the clutch is detected by allowing a sufficient time frame before and after each gear change. Another method is to look at the ratio of engine RPM to vehicle RPM, measured just after the gear box. Given the gear ratio, one can determine whether the clutch pedal is pressed.


No | Variable | Unit | Formula
1 | Time fraction in gear-X | % | Time(current gear = gear-X) / Total Time
2 | Time fraction of violation as per gear vis-à-vis speed readings | % | Time((current velocity < min speed for current gear) or (current velocity > max speed for current gear)) / Total Time
3 | Time fraction of too hard decelerations | % | Time(current acceleration < -3 m/s^2) / Total Time
4 | Time fraction of too fast accelerations | % | Time(current acceleration > 2 m/s^2) / Total Time
5 | Average trip speed | m/s | $\bar{v} = \frac{1}{T}\int_0^T v(t)\,dt$
6 | Average acceleration | m/s^2 | $\bar{a} = \frac{1}{T}\int_0^T a(t)\,dt,\quad a(t) > 0,\ v(t) > 0$
7 | Standard deviation for trip speed | Km/h | $\sigma_v = \sqrt{\frac{1}{T}\int_0^T \left(v(t) - \bar{v}\right)^2 dt},\quad v(t) > 0$
8 | Standard deviation for acceleration | g | $\sigma_a = \sqrt{\frac{1}{T}\int_0^T \left(a(t) - \bar{a}\right)^2 dt},\quad a(t) > 0,\ v(t) > 0$
9 | Time fraction of vehicle stationary/engine idling | % | Time(velocity = 0 Km/h and engine RPM > 0) / Total Time
10 | Fraction of time misuse of clutch pedal | % | Time(engine rpm > current vehicle rpm * gear ratio) / Total Time

Table 2: Traffic variables used in analysis

Where
T : Total Trip Time
D : Total Trip Distance
a : Acceleration
v : Velocity
g : Gravitational constant (9.81 m/s^2)
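
As a minimal sketch of how several of the variables in Table 2 could be computed from a logged trip, the Python fragment below assumes samples of (velocity, engine RPM, gear) taken at a fixed interval. The function name, the tuple layout and the per-gear speed bands are illustrative assumptions and not part of the recorder software; only the -3 m/s^2 and 2 m/s^2 thresholds are taken from the table.

```python
import math

HARD_DECEL = -3.0  # m/s^2, threshold from Table 2
FAST_ACCEL = 2.0   # m/s^2, threshold from Table 2
# Hypothetical (min, max) speed in m/s recommended for each gear.
GEAR_SPEED_BAND = {1: (0.0, 5.0), 2: (3.0, 10.0), 3: (8.0, 16.0), 4: (14.0, 30.0)}


def trip_features(samples, dt, top_gear=4):
    """samples: list of (velocity m/s, engine_rpm, gear) taken every dt seconds."""
    n = len(samples)
    v = [s[0] for s in samples]
    # Acceleration as the finite difference of consecutive velocity samples.
    a = [0.0] + [(v[i] - v[i - 1]) / dt for i in range(1, n)]

    def frac(flags):
        # Fraction of samples for which the condition holds.
        return sum(flags) / n

    def band_violation(vel, gear):
        band = GEAR_SPEED_BAND.get(gear)
        if band is None:  # neutral or unknown gear: no recommendation to violate
            return False
        return vel < band[0] or vel > band[1]

    mean_v = sum(v) / n
    pos_a = [x for x in a if x > 0]
    return {
        "frac_top_gear": frac([s[2] == top_gear for s in samples]),
        "frac_gear_violation": frac([band_violation(s[0], s[2]) for s in samples]),
        "frac_hard_decel": frac([x < HARD_DECEL for x in a]),
        "frac_fast_accel": frac([x > FAST_ACCEL for x in a]),
        "mean_speed": mean_v,
        "mean_accel": sum(pos_a) / len(pos_a) if pos_a else 0.0,
        "std_speed": math.sqrt(sum((x - mean_v) ** 2 for x in v) / n),
        "frac_idle": frac([s[0] == 0 and s[1] > 0 for s in samples]),
    }
```

Because every time fraction is normalized by the number of samples, trips of different lengths remain comparable, which is the fairness argument made for features 1 and 2.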




CHAPTER THREE

DATA RECORDER DEVELOPMENT

3.0 INTRODUCTION

The project involved a significant amount of work in developing hardware and associated software for recording the data. Both hardware and software evolved during development.

In this chapter we discuss the types of sensors that are embedded in the bus. Further sections discuss the architecture of the data logger. The chapter then describes how the data logger evolved as we identified the fundamental sensor types, and the various difficulties encountered during development. Lastly, it describes the data logging and display software.

3.1 DEPLOYMENT PLATFORM

Experiments are conducted on one of the Institute buses (IIT Bombay).

Figure 3.1 IIT Bus on which experiments are conducted

Since the bus is in its warranty period, opening or modifying any of its hardware is not allowed. All the sensors are interfaced with the bus in an absolutely non-invasive way.

Figure 3.2 Cut view of bus showing location of sensors


The data recorder can be divided into three parts.

1. Sensors and their signal conditioning
2. Interfacing and analog to digital converter
3. Data logging section

3.2 DATA RECORDER VERSION 1.0

3.2.1 Sensors interface

In the first version, the following sensors are interfaced.

1. ADXL202 based 2-axis accelerometer (±2g)
2. Analog potentiometer based position sensors for the clutch, brake and accelerator
3. Magnetic reed switch based gear position sensor
4. AD22151 Hall effect sensor based engine RPM and velocity detection
5. Brake indication based upon feedback from the electrical system
6. USB based web camera for visual feedback
7. GPS receiver for recording the bus position while data logging is done
8. Audio pickup sensor used to perform analysis of engine sound



Figure 3.3 EDR version 1.0

3.2.1.1 ADXL202 based accelerometer

Figure 3.4 Accelerometer along with its housing

The ADXL202 is a two-axis accelerometer used for measuring acceleration along two axes (X and Y). It provides an analog output which is proportional to the acceleration. Using the X and Y components, the resultant acceleration vector is obtained.
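
The resultant of the two axis readings can be illustrated with a short Python sketch. It assumes the X and Y readings have already been converted to the same unit (for example g); the function name is hypothetical.

```python
import math


def resultant(ax, ay):
    """Return (magnitude, direction in degrees) of the in-plane acceleration."""
    magnitude = math.hypot(ax, ay)            # sqrt(ax^2 + ay^2)
    direction = math.degrees(math.atan2(ay, ax))  # angle measured from the X axis
    return magnitude, direction
```

The direction is measured counter-clockwise from the X axis, so which physical direction it corresponds to depends on how the sensor is mounted in its housing.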


3.2.1.2 Analog potentiometer as position sensor for the clutch, brake and accelerator pedal

Figure 3.5 Placement of the potentiometers

Analog linear potentiometers are mounted on the clutch, brake and accelerator pedals. When any of these three pedals is pressed, the mechanical link causes a linear shift in the corresponding potentiometer, which provides a linear analog output between 0 and 5 Volts. The outputs of these three potentiometers are given to three channels of the 10 bit Analog to Digital Converter (ADC) of the ATMEGA16 microcontroller.
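
To make the scaling concrete, the following Python sketch converts a raw 10 bit ADC count into a voltage and then into a pedal travel fraction. The 5 V reference matches the 0 to 5 Volt range stated above; the calibration counts for the released and fully pressed pedal are hypothetical values that would have to be measured on the bus, and mapping full scale (1023) to exactly 5 V is a simplifying assumption.

```python
ADC_BITS = 10
V_REF = 5.0  # volts, upper end of the potentiometer output range


def adc_to_volts(count):
    """Map a raw ADC count (0..1023) to the measured voltage."""
    full_scale = (1 << ADC_BITS) - 1
    if not 0 <= count <= full_scale:
        raise ValueError("ADC count out of range")
    return V_REF * count / full_scale


def pedal_fraction(count, released_count, pressed_count):
    """Linear pedal travel in [0, 1] from two calibration counts (assumed)."""
    span = pressed_count - released_count
    fraction = (count - released_count) / span
    # Clamp so that noise beyond the calibration points stays within [0, 1].
    return min(1.0, max(0.0, fraction))
```

With 10 bits over 5 V, one count corresponds to roughly 4.9 mV.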


3.2.1.3 Magnetic reed switch based gear position sensor

These sensors were used for gear position detection. Two magnets were mounted on the gear stick. Six magnetic reed switches were mounted on the aluminum frame around the gear box to detect six different gear positions.

3.2.1.4 AD22151 Hall effect sensor based Engine RPM and velocity detection




Fig 3.6 Magnetic pickup engine RPM

Fig 3.7 Deployment for engine RPM (air tight container)

Fig 3.8 Sensor for velocity

Fig 3.9 Deployment of sensor (air tight container)



AD22151 Hall effect sensors are used for the engine RPM and velocity measurement. Hall effect sensors detect a magnetic field. A bar magnet is mounted on the rotating engine pulley and on the engine drive shaft. The sensor gives a linear measure of the magnetic field. The sensor circuit is modified and converted into an open loop system so that the sensor provides a digital output. These sensors are fitted in an air tight container, considering the harsh environment in the engine compartment.

For engine rotation measurement, a small magnet is mounted on the front side pulley of the engine (just near the radiator). A magnetic sensor is fitted on the fiber cover which covers the radiator fan.

For velocity measurement, a magnet is fitted on the rod which connects the back wheels to the transmission case. Another magnetic pickup unit is fitted on the bus chassis.

The outputs of these sensors are given directly to the status port of the laptop for data logging in data recorder version 1.0.


3.2.1.5 Brake indication based upon feedback from the electrical system

The air brakes of the bus are connected to a pressure sensor which is used to switch on the brake lights. From this sensor, the brake status is extracted.

3.2.1.6 USB based web camera for visual feedback

A simple Universal Serial Bus (USB) based web camera is used to store video in MPEG format along with the recorded data. Player software is also developed, which allows the data to be reviewed offline.

3.2.1.7 Global Positioning System (GPS) receiver for position information

A BU203 chipset based GPS receiver is connected to the laptop through the USB port. The accuracy of the GPS receiver is 12 meters. It needs signals from at least three satellites at any given time to operate correctly.


3.2.1.8 Audio pickup (Mic)

A microphone is used to record the engine sound. Different types of audio pickup are tried and tested, e.g. condenser mic and inductive pickup mic.

3.2.2 ATMEGA16 microcontroller based interfacing unit



Fig 3.10 EDR version 1


The interfacing box interprets the sensor data. The data, converted to digital form, is then fed to the printer port of the laptop. The ATMEGA16 microcontroller is used as an analog to digital converter and as an encoder. The control port of the laptop decides which sensor's data needs to be picked up, and controls the data conversion process. When the microcontroller completes the conversion, it signals the software by interrupting status port pin S7.

Gear position data is encoded and fed directly to the S3 and S4 pins of the status port. The engine RPM and velocity data, after signal conditioning, are given to the S5 and S6 pins of the status port.

To make this device absolutely non-invasive, a sealed, leak proof lead acid battery pack is also embedded in the unit.


3.2.3 Data logging section

An IBM ThinkPad laptop is used for data logging. It can be operated for up to 3.5 hours without auxiliary power. Data logging is done on the Windows XP platform. The Visual Basic development environment is used for the data logging application. A few libraries and ActiveX controls for the web camera and GPS sensors are used in this application. Graph libraries for showing graphs and multimedia libraries for video playback are also used.

A kernel driver is used to bypass the security policy of the operating system, which restricts applications from using the parallel port data lines in read mode. A few other applications are developed to help with analysis of the recorded data. These include the player application, which shows the bus ride along the track using the GPS information. In this application, at any given instant, the user can see the various driving variables along with the vehicle position information.





Figure 3.11 Block Diagram of EDR Version 1.0

3.3 DATA RECORDER VERSION 2.0

Analyzing the data collected using the first version of the data recorder helped to pinpoint the fundamental sensors that are primarily required for analysis. Other sensor values can be inferred from these primary sensors.

This data recorder can be segmented into three units:

1. Sensors and their signal conditioning
2. Interfacing and analog to digital converter
3. Data logging section


Figure 3.12 EDR Version 2.0


3.3.1 Sensors and its signal conditioning


3.3.1.1 AD22151 Hall effect sensor based Engine RPM and velocity detection

The sensors from the earlier version are used in this version.

3.3.1.2 Brake indication based upon feedback from the electrical system

The air brake of the bus is connected to a pressure sensor which is used to switch on the brake lights. From this sensor, the brake status is extracted.

3.3.1.3 Mechanical de-bouncing switches based gear position sensing

In the first version, magnetic sensors are used. These sensors sometimes give wrong readings, so in the second version mechanically de-bouncing switches are used. The sensors are strategically placed very near the transmission case to minimize vibrations. Using these switches, better quality results are extracted.

3.3.1.4 Acceleration data

Acceleration data is inferred by finding the rate of change of the velocity.


3.3.1.5 Detection of Clutch position

At any given time the engine RPM, velocity and gear position are known. The final velocity is determined by the following formula:

Final velocity = Engine RPM * Gear ratio * Circumference of the wheel

If this calculated velocity differs from the actual recorded velocity, it indicates that the clutch is pressed.
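
The comparison described above can be sketched in Python as follows. This is only an illustration: the gear ratio is assumed here to mean wheel revolutions per engine revolution, the division by 60 converts the per-minute result to metres per second, and the 15% tolerance for measurement noise is a made-up value that would need tuning on real data.

```python
def expected_speed(engine_rpm, gear_ratio, wheel_circumference_m):
    """Speed in m/s implied by the engine RPM when the clutch is fully engaged."""
    return engine_rpm * gear_ratio * wheel_circumference_m / 60.0


def clutch_pressed(engine_rpm, measured_speed, gear_ratio,
                   wheel_circumference_m, tolerance=0.15):
    """Flag the clutch as pressed when measured and expected speed disagree."""
    expected = expected_speed(engine_rpm, gear_ratio, wheel_circumference_m)
    if expected == 0.0:
        return False  # engine stopped: nothing to compare against
    return abs(expected - measured_speed) / expected > tolerance
```

A relative tolerance is used instead of an exact comparison because both the RPM and the velocity readings are quantized pulse counts and will rarely agree exactly.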



Figure 3.13 Sensor for gear position

3.3.2 Interfacing unit

In the second version, the hardware is more optimized. In this unit all the sensor data is converted to 8 bit parallel format and then fed directly to the laptop. All the data gathered in this version is in digital form. The RPM and velocity pulse trains are fed directly to 8 bit counters. These two counters are then multiplexed and given to the data port of the printer port. Gear position is encoded and given to the status port. Multiplexing commands are given through the control port.
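
One way the logging software could turn successive 8 bit counter readings into an RPM value is sketched below in Python. The sketch assumes a free-running counter that is read at a known interval, fewer than 256 pulses between reads, and one pulse per revolution (a single magnet); these are assumptions for illustration, and the function names are hypothetical.

```python
def counter_delta(previous, current, bits=8):
    """Pulses counted between two reads, allowing for counter wrap-around."""
    return (current - previous) % (1 << bits)


def rpm_from_counts(previous, current, interval_s, pulses_per_rev=1):
    """Revolutions per minute from two counter reads taken interval_s apart."""
    revolutions = counter_delta(previous, current) / pulses_per_rev
    return revolutions / interval_s * 60.0
```

For example, a reading that goes from 250 to 4 over half a second corresponds to 10 pulses, i.e. 1200 RPM under the one-pulse-per-revolution assumption.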




Fig 3.14 Interfacing unit



Figure 3.15 EDR version 2.0 block diagram

3.3.3 Data logging section

As mentioned in section 3.2.3, the same laptop based solution is used for data logging.

3.4 PROBLEMS ENCOUNTERED AND THEIR SOLUTIONS

In this section the details about the various problems faced during the development
are discussed followed by the solutions.


3.4.1 Clutch, brake and accelerator position sensors

Initially, analog potentiometers are attached to mechanical links to sense the clutch, brake and accelerator positions. The links are designed in such a manner that they do not cause any discomfort to the driver while operating the pedals. A change in pedal position causes a change in the resistance of the potentiometer. It is observed that the clutch pedal, whose link structure differs from that of the other pedals, causes some trouble in the link design. Here the resistance decreases as the pedal is pressed up to some point and then increases again from that point onwards. This behavior is mainly due to the semi-circular link design. Very little space is available for a link design which can provide a linear response.

In the second version, none of these sensors is used. The position information is inferred from the primary sensor values.

3.4.2 Global Positioning System (GPS) receiver

The GPS receiver is used to capture the position of the vehicle along the track. For a proper GPS position lock, the receiver should be in line of sight of at least three GPS satellites. The GPS receiver often fails to satisfy this criterion in the urban environment. Satisfactory results are still obtained because the GPS receiver has a built-in battery pack which prevents it from cold starting. Using this feature it is able to relocate within a few seconds. In some places, GPS signals are not available over a large area. This shows that sole dependence on the GPS sensor for position information is not a good idea; it needs to be complemented by some other means, such as intermediate position keeping or using positional data from the fundamental sensors of the bus.


3.4.3 Gear Position Sensors

The space between any two adjacent gear positions is 9 mm, which is very close for the resolution of the magnetic reed switch. Magnetic reed switches are very sensitive to the magnetic field. Due to the small distance between two gear positions, the wrong switch is sometimes triggered. While the bus is in motion the gear stick vibrates, which leads to chattering in these magnetic reed switches. This problem is solved in the second version.


3.4.4 Accelerometer

Engine vibration causes the accelerometer to pick up vibrations as acceleration. To reduce this noise, the accelerometer is enclosed in a foam base. To further reduce the effect of the vibration, the accelerometer's bandwidth is reduced to 10 Hz. The outputs of the X and Y channels of this sensor are given to the 10 bit ADC of the ATMEGA16 microcontroller. In the final version, acceleration data is extracted by looking at the rate of change of velocity from the primary velocity sensor.


3.4.5 Sampling Rate

In the first version of the EDR, the RPM values are sent as pulses over a status pin of the parallel port. The software needs to measure the width of each pulse to determine the value of RPM. It is noticed that the application is not able to poll the status pin and record the other sensors at the same time. Initially, the standard parallel port driver available on the system is used. To address the sampling problem, a device driver for the parallel port which treats the signal on the status pin as an interrupt is developed. This solves the problem of sampling. A GUI in Visual Basic and interfacing code in a dynamic link library, together with the device driver, are developed as part of this application.

In the second version of the EDR, an on-board counter is used to feed the RPM value directly to the software. All the polling responsibility is taken over by the hardware.


3.5 DRIVER MONITORING APPLICATION

This application records and displays the driving profile in a user friendly way. It has the following features:

1. Camera window and video recording
2. Accelerometer display of X and Y axis
3. Velocity and engine RPM display
4. Brake status indication
5. Sampling rate setting
6. GPS data
7. Directory and file selection
8. Start and stop capture

1. Camera window and video recording
The view from the front mounted camera is displayed in this application. Video is also recorded for future reference.

2. Accelerometer display of the X and Y axis
This section displays acceleration along the X and Y axes in the range of ±2g. The resultant vector position can also be obtained.

3. Velocity and engine RPM display
This section of the software shows the current velocity and engine RPM.

4. Brake status indication
This button indicates the current brake position.

5. Sampling rate selection
In this section the sampling rate of the data logging is selected.

6. GPS data
This section shows the latitude and longitude of the vehicle. It also shows the vehicle's heading direction and current height. One can select the port and the baud rate for the receiver.

7. Directory and file name selection
In this section the location at which data is stored is selected.

8. Start and stop capture
This button is used to start and stop data capture.





Fig 3.16 Screen shot of the Driver monitoring application

Figure 3.17 Screen shot of Vehicle Tracking Application

Fig. 3.18 Screen shot of the Recording application with Graph plotting feature

CHAPTER FOUR

DATA MINING TECHNIQUES

4.0 INTRODUCTION


As said earlier in chapter one, fuel flow meters are not available to determine instantaneous mileage, so one needs to find out whether it is possible to infer fuel consumption by measuring other variables. The fuel consumed by a vehicle depends on various parameters with a complex dependency relation. These include engine type, power and capacity, number of cylinders, fuel injection type, vehicle weight, tire pressure, current vehicle load, air drag, road conditions, traffic conditions and driving style. Many of these variables are difficult or costly to measure. Hence there is a need for methods which can infer fuel consumption through indirect means requiring a smaller set of variables.


The problem of rating drivers is different in the following ways:

• It involves multiple variables, both continuous and discrete, which change over time.

• Since the sampling rate differs between variables, the data needs to go through a data construction step to fill in the missing values without affecting accuracy. This step would be required only if the built-in data recorder in the vehicle is used with the system.

• Classification of a driver for fuel efficiency on some scale is decided on a summary of the data.

• Final deployment needs to be cheap. Thus the solution has neither a high performance processor nor a large memory at its disposal. It also needs to be reliable and rugged enough to sustain rough conditions, which include jerks, heat, vibration, noise etc.

• Differences among driving styles, also known as driving variability, would be small on some routes. This poses a great challenge for driver model generation.

• A driver's driving style is dynamic. It keeps changing over a period of time due to various conditions such as the driver's mood, traffic conditions, experience gained over time, vehicle conditions etc. Even in the span of a day, one would see different driving styles. In such cases the time window picked for analysis becomes a crucial factor.



Data mining is known for its ability to construct a model, given a sufficient number of data instances. The success of a data mining solution depends heavily on the quality of the training data. If the training data covers a good proportion of the population, the model generated by data mining is fairly accurate. The raw input to the system is discrete time series data from various sensors, sampled at a periodic rate.

4.1 APPROACH




A time series is a sequence of data points in which the order of the data points is important. In many cases, each data point consists of both inputs and outputs. The time order of such a series is important because, at a certain time instant, the outputs are determined not only by the current inputs, but also by some of the recent inputs and outputs. If the input vector is extended to include those previous inputs and outputs in addition to the current inputs, then the outputs are fully determined by the expanded input vector. Thus, one can transform a time series into a set of data points where the time order is no longer important.
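
The expansion of the input vector described above can be sketched in Python as follows. The function name and the choice of a fixed number of lags are illustrative assumptions; the idea is only that each output is paired with the current input plus the preceding inputs and outputs, after which the rows can be shuffled freely.

```python
def lag_embed(inputs, outputs, n_lags):
    """Turn a time series into order-free (expanded_input, output) pairs."""
    rows = []
    for t in range(n_lags, len(inputs)):
        expanded = [inputs[t]]  # current input first
        for k in range(1, n_lags + 1):
            # Append the input and output observed k steps earlier.
            expanded.extend([inputs[t - k], outputs[t - k]])
        rows.append((expanded, outputs[t]))
    return rows
```

The first n_lags samples are dropped because they do not have a complete history.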



Given a time series, a system classifier's purpose is to determine to which category the underlying system belongs, among a set of pre-defined candidate categories. To do so, our system classification algorithm transforms the time series into a set of higher level data points, which are then passed on to the classifier.

Given below is the approach that needs to be taken as part of the data mining activity.


1. It is of utmost importance to first understand the application domain, the relevant domain knowledge, and the requirements of the end user [1]. After careful analysis of the problem and interaction with experts, a decision is taken on which variables need to be recorded.

2. The sensor data goes through cleaning and preprocessing. This involves operations like noise and outlier removal.

3. Next, the data goes through reduction and projection, i.e. finding useful features to represent the data. This process is also known as feature construction. This is the most important stage of the project. Here, summary features which help in discriminating different driving styles are extracted from the trip data.

4. The most suitable data mining classifier needs to be selected. There is an enormous number of classifiers, each with different variants. A set of relevant data mining classifiers used for problems of a similar nature is picked and experiments are performed on them. The performance of these classifiers is compared with one another.

5. After successfully testing the model to the desired accuracy levels, the discovered knowledge needs to be consolidated and incorporated into deployable systems.

4.2 DATA PRE-PROCESSING

The major barrier to obtaining high quality knowledge from data lies in the limitations of the data itself. The data may have limited breadth or coverage, e.g. it may cover only a small subset of the population; it may have limited depth, with essential variables missing; or it might be incomplete or noisy. The Knowledge Discovery in Databases (KDD) process is interactive and iterative, involving numerous steps with many decisions being made by the user.


Figure 4.1 Overview of the steps comprising KDD Process

4.3 DIMENSIONALITY REDUCTION




In general, classification or pattern matching on ‘raw data’ is not feasible because of the sheer volume of data. In addition, raw data may contain spikes, dropouts or other noise, which would confuse the classification process. To assist in classification, the actual time series data may be transformed in some manner. Transformation is used to address the dimensionality curse, i.e. the fact that many problems are caused by data sets with many dimensions. Data mining on time series data with many variables is not only difficult but also expensive. Data structures to store high dimensional data are not very efficient.


4.3.1 Fourier Transform

The most widely used transform in the time series domain is the Discrete Fourier Transform (DFT). This approach involves performing a discrete Fourier transform on the original time series, discarding all but the K most informative coefficients, and then mapping these coefficients into K-dimensional space. This changes the series from the time domain to the frequency domain. The DFT is very widely used for audio signals, where one can see a characteristic frequency. There are other variants of the discrete Fourier transform as well. Some time series do not have a characteristic frequency over the entire span but rather a characteristic frequency that changes locally with time. In such cases a windowed approach to the DFT, known as the Short Time Fourier Transform (STFT), helps. As the name suggests, a small time window of samples is taken for each DFT. A variation of this technique uses overlapping time windows.
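
Both ideas can be illustrated with a small pure-Python sketch. It uses a direct O(n^2) DFT for clarity rather than a fast transform, and it interprets "most informative" as "largest magnitude", which is an assumption (keeping the first K low-frequency coefficients is another common choice). The function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def top_k_coefficients(x, k):
    """Keep the k largest-magnitude DFT coefficients as (index, value) pairs."""
    coeffs = dft(x)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return [(i, coeffs[i]) for i in sorted(order[:k])]


def stft(x, window, hop):
    """DFT of successive windows; hop < window gives overlapping windows."""
    return [dft(x[start:start + window])
            for start in range(0, len(x) - window + 1, hop)]
```

With a window of 4 samples and a hop of 2, consecutive windows overlap by half, which corresponds to the overlapping variation mentioned above.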




2
8


A driving profile changes its acceleration frequency with respect to time hence
DFT is not a suitable candidate feature.
Although
the
Short Time Fourier Transform may
be used to
anal
yze the problem
, it is not sufficient enough to infer the driving quality
.


The problems with the DFT techniques are:

1. The average acceleration and its deviation are highly dependent on the traffic conditions. A fuel-efficient driver whose trip involves traffic jams will therefore be penalized by the DFT approach, as his profile would show higher frequencies similar to those of a fuel-inefficient driver.

2. The other driving traits that significantly affect fuel consumption are not captured by the DFT approach.

3. The frequency information of the acceleration may not always lead to a conclusion about fuel-efficient driving style. As seen in real driving data samples, a driver faced with frequent signal stops and traffic jams shows higher frequencies in acceleration.



In our case, other higher-level features are sufficient to reach the desired classification accuracy on the synthetic data. However, the need for the STFT has to be verified further by running the algorithm on the real data.


4.3.2 Principal Component Analysis



Principal Component Analysis (PCA) is an interesting method that has been used for some time series problems [21]. It is a way of identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. The other main advantage of PCA is that one can compress the data by reducing the number of dimensions without much loss of information.


The PCA representation can be derived from raw data by the following steps:

1. The raw time series is first converted into a new series by subtracting the mean of each column from every value in that column.

2. The preprocessed time series data is then analyzed to get the covariance matrix; if there are N features, one gets an N x N covariance matrix.

3. Then the unit-length eigenvectors and the eigenvalues of this matrix are calculated.

4. The N eigenvectors are sorted by eigenvalue in decreasing order, and the first M vectors are chosen along with their eigenvalues.

5. The transposed feature vector (the matrix formed from the chosen eigenvectors) is multiplied with the transposed mean-adjusted data.


At the end of step 5, the generated vector data is a compact version of the original time series. This data can be processed to get the original time series back with some loss of information, since not all of the eigenvectors are stored.
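The five steps and the lossy reconstruction can be sketched as follows. This is a minimal standard-library sketch under stated assumptions, not the implementation used in this work. Power iteration with deflation is swapped in for a library eigen-solver purely to keep the example self-contained, and all function names and sample values are hypothetical.

```python
import math


def mean_center(rows):
    """Step 1: subtract each column's mean from every value in that column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows], means


def covariance_matrix(centered):
    """Step 2: N x N sample covariance matrix of N mean-centered features."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def top_eigenpairs(matrix, m, iterations=500):
    """Steps 3-4: the m largest (eigenvalue, unit eigenvector) pairs.

    Assumption: power iteration with deflation stands in for a real
    eigen-solver; it suits small symmetric matrices with distinct eigenvalues.
    """
    size = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(m):
        start = [float(i + 1) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in start))
        vec = [x / norm for x in start]
        for _ in range(iterations):
            nxt = [sum(a * v for a, v in zip(row, vec)) for row in work]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:  # no variance left in the deflated matrix
                break
            vec = [x / norm for x in nxt]
        value = sum(v * sum(a * w for a, w in zip(row, vec))
                    for v, row in zip(vec, work))
        pairs.append((value, vec))
        # Deflate: remove the component that was just found.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(size)]
                for i in range(size)]
    return pairs


def pca_compress(rows, m):
    """Step 5: project the mean-centered rows onto the m kept eigenvectors."""
    centered, means = mean_center(rows)
    vectors = [vec for _, vec in top_eigenpairs(covariance_matrix(centered), m)]
    projected = [[sum(a * b for a, b in zip(row, vec)) for vec in vectors]
                 for row in centered]
    return projected, vectors, means


def reconstruct(projected, vectors, means):
    """Approximate the original rows from the compact representation."""
    return [[means[j] + sum(c * vec[j] for c, vec in zip(coords, vectors))
             for j in range(len(means))]
            for coords in projected]


# Hypothetical two-feature samples, for illustration only.
samples = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
compact, kept_vectors, column_means = pca_compress(samples, 1)
approximation = reconstruct(compact, kept_vectors, column_means)
```

Each row of compact holds M = 1 value instead of N = 2. The rows of approximation are close to, but generally not identical to, the original samples, because the discarded eigenvector carried the remaining variance.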