Identification of Mobile Devices

Identification of Mobile Devices from Network Traffic Measurements - a HTTP User Agent Method

Master’s Thesis
August 28, 2012

Supervisor: Prof. Heikki Hämmäinen
Instructor: M.Sc. Antti Riikonen

Aashish Adhikari


Background

Mobile device identification aids in profiling the mobile Internet usage
    Support the pricing and business development
    Tailor the services to attract more users

Device identification from network measurements
    Type Allocation Code (TAC)
    TCP Fingerprinting
    HTTP
        UAProf
        User Agent string parsing

User Agent-based device identification

UA-based identification relies on idiosyncrasies of UA string formats

Examples of UA string formats
    Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; NP07)
    Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; fi-fi) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7A341 Safari/528.16
    NokiaN70-1/3.0546.2.3 Series60/2.8 Profile/MIDP-2.0 Configuration/CLDC-1.1
    Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip

WURFL DDR and Java API (parser)
    Frequent updates by the active community
    Uses Two-Step UA String Analysis algorithm
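
To illustrate how format idiosyncrasies can be exploited, the following is a minimal Python sketch that classifies the four example UA strings with hand-written regular expressions. It is not WURFL's Two-Step analysis and does not use WURFL data; the labels, the pattern list, and the function name classify_user_agent are all assumptions made for this illustration.

```python
import re

# Illustrative, hand-written patterns (an assumption, not WURFL's algorithm
# or data). Each entry pairs a hypothetical label with a regex whose named
# groups capture details embedded in the UA string.
UA_PATTERNS = [
    ("iPhone browser", re.compile(r"\(iPhone;.*?iPhone OS (?P<os_version>[\d_]+)")),
    ("Nokia handset", re.compile(r"^Nokia(?P<model>[A-Za-z0-9]+)")),
    ("Android app", re.compile(r"^Android-(?P<app>\w+)/\S+ \((?P<model>[\w-]+) ")),
    ("Desktop MSIE", re.compile(r"MSIE (?P<browser_version>\d+\.\d+); Windows NT")),
]


def classify_user_agent(ua):
    """Return (label, captured details) for the first pattern that matches."""
    for label, pattern in UA_PATTERNS:
        match = pattern.search(ua)
        if match:
            return label, match.groupdict()
    return "Unknown", {}


EXAMPLES = [
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; NP07)",
    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; fi-fi) "
    "AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7A341 Safari/528.16",
    "NokiaN70-1/3.0546.2.3 Series60/2.8 Profile/MIDP-2.0 Configuration/CLDC-1.1",
    "Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip",
]

for ua in EXAMPLES:
    print(classify_user_agent(ua))
```

A handful of patterns like these breaks down quickly as new devices and apps appear, which is one reason a community-maintained repository is attractive.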









Research Questions & Objectives

R1: How can device and device features be identified based on HTTP User Agent from mobile Internet traffic traces?
R2: How can the identification of mobile devices (and features) aid in profiling the mobile Internet usage in Finland?

O1: Develop a tool to identify device type, model (and features) based on the HTTP request header User Agent field
O2: Study the output of the tool and compare it with an existing tool
O3: Provide descriptive statistics on the mobile Internet usage in Finland based on the identified devices

Measurement Setup

Measurement data
    IP traffic traces from the Gi interface in the packet core networks of two Finnish mobile network operators
    A week’s worth of data

Parameters utilized in this thesis
    User Agent string, total transferred bytes, and number of flows
    Also includes String Matching results

(Adopted from Kivi & Riikonen, 2009)
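
As a rough illustration of how the three parameters relate, the sketch below totals transferred bytes and counts flows per User Agent string. The input layout (one (UA string, bytes) pair per flow) and the function name aggregate_by_user_agent are assumptions for this example; the actual trace format is not shown in the slides.

```python
from collections import defaultdict


def aggregate_by_user_agent(flow_records):
    """Sum bytes and count flows for each User Agent string.

    flow_records is assumed to be an iterable of (ua, transferred_bytes)
    pairs, one pair per flow (a hypothetical layout for illustration).
    """
    totals = defaultdict(lambda: {"bytes": 0, "flows": 0})
    for ua, transferred_bytes in flow_records:
        totals[ua]["bytes"] += transferred_bytes
        totals[ua]["flows"] += 1
    return dict(totals)


# Made-up records, used only to show the shape of the output.
sample = [("UA-A", 100), ("UA-B", 50), ("UA-A", 25)]
print(aggregate_by_user_agent(sample))
```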

Analysis Process

Datasets
    TCP and UDP logs
    WURFL Repository
    Handset Feature List

WURFL API Implementation

Improvements to the WURFL output
    Custom patch file
    Custom rules
    New Releases
    String Matching results

Features from both WURFL and Handset Feature List





[Figure: analysis process flowchart. Box labels, in extraction order: TCP/UDP Logs; WURFL Repository; MoMIE Handset Feature List; WURFL API Implementation; Desired Output? (No / Yes); Apply Custom WURFL Patch and Rules; List of HTTP User Agent Strings; Integrate String Matching (SM) Results; Manual Update; Tool Output; Improved Result; Final List of UA Strings, SM Results, Device and Device Features, Bytes, and Flows; Map Selected Features; General Characteristics; Methods Comparison; Device Type Classification; Handset & Tablet Population; Other Handset Features; Handset Input Methods and Screen Sizes]

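
One plausible reading of the process is a chain of fallbacks: try the DDR-backed parser first, then custom rules, then the String Matching results. The Python sketch below follows that reading only; the exact ordering used in the thesis is not visible from the slides, and identify_device, the toy lookup, the toy rule, and the sample data are all hypothetical.

```python
def identify_device(ua, ddr_lookup, custom_rules, sm_results):
    """Try each identification source in turn; return (device, source)."""
    device = ddr_lookup(ua)              # step 1: DDR-backed parser
    if device is not None:
        return device, "ddr"
    for rule in custom_rules:            # step 2: hand-written custom rules
        device = rule(ua)
        if device is not None:
            return device, "custom_rule"
    device = sm_results.get(ua)          # step 3: String Matching results
    if device is not None:
        return device, "string_matching"
    return None, "unknown"


# Hypothetical stand-ins, used only to exercise the function.
def toy_ddr_lookup(ua):
    return "Apple iPhone" if "(iPhone;" in ua else None


def toy_youtube_rule(ua):
    if ua.startswith("Android-YouTube/") and "GT-I9000" in ua:
        return "GT-I9000"
    return None


toy_sm_results = {"NokiaN70-1/3.0546.2.3": "Nokia N70"}

print(identify_device("Android-YouTube/2 (GT-I9000 GINGERBREAD); gzip",
                      toy_ddr_lookup, [toy_youtube_rule], toy_sm_results))
```

Returning the source alongside the device makes it easy to count how many UA strings each stage resolves, which is the kind of comparison the methods comparison step calls for.
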
Tool Output

WURFL works well for web browser generated UA strings
    Identifies desktop devices
    Only ~0.5% false positives with the dataset

Additional programming required to extract device information from app-generated UA
    Enhanced WURFL tool increased the identification by 14 percentage points
    Still uncertainties with non-standard app-generated UAs

In comparison with the String Matching
    Facilitates manipulation of output
    Removes the issue of the identification of app-generated UA strings to some extent
    Not just the brand and model of the device, but an elaborated list of features including the OS, OS version, and mobile browser
    Partly removes the cumbersome task of manually updating the device database

Descriptive Results

[Figure: Share of all mobile devices generated traffic volume and flows. Categories: Unknown, Others, Tablet, Handset, PC. Series: Bytes, Flows. Data labels as extracted: 4%, 0%, 2%, 14%, 79% and 5%, 1%, 2%, 6%, 87%]

[Figure: Operating system distribution (bytes) among the handheld devices. Data labels as extracted: 47.9%, 37.2%, 11.3%, 1.3%, 0.3%, 1.6%, 0.5%]

Only Handset and Tablet device types considered for further analysis

Android based devices generating the most traffic

Contd...

[Figure: Shares of browser and app-generated bytes and flows for Handsets. Bytes and Flows bars per OS: Android, iOS, Symbian, MeeGo, Others, Unknown, Windows Phone. Series: Unknown, Browser, App]

Clear distinction between browser and app-generated UA for Android and iOS

Unrealistic results for Symbian and MeeGo OSs

Uncertainties probably arise because the tool cannot handle these OSs, or because app-generated UAs for these OSs fall under the Unknown category

Contd...

[Figure: Share of traffic volume for selected handset features. Axes: Handset Features vs. Share of traffic volume (0% to 100%)]

Error bars resulted from the terminals that do not have the feature or for which the data were not available

Many features close to saturation

Saturation level for FM radio?








Future Work

Application identification by the enhanced WURFL tool

Analysis of user sessions based on the device type, model, OS and device features

Business perspective to the current analysis

Conclusions

Tools used for the identification of mobile devices in web servers could be used to identify devices from mobile network traffic traces as well
    It is reliable to implement open source and community contributed DDR (such as WURFL) and its API

Descriptive results show
    Android based handheld devices gaining popularity, Samsung being the most popular among the brands
    Apple iPhone* generates the most traffic among the handsets
    Devices with advanced features, such as 3G and touchscreen, preferred for mobile Internet

* No clear distinction between the iPhone models

Thank You
