Unconstrained Endpoint Profiling

plantationscarf, Artificial Intelligence and Robotics

25 Nov 2013

Googling the Internet

Ionut Trestian, Supranamaya Ranjan, Aleksandar Kuzmanovic, Antonio Nucci

Reviewed by Lee Young Soo

Introduction






To understand what people are doing on the Internet: analyze operational network traces.

Obtaining ‘raw’ packet traces from operational networks can be very hard.

Accurately classifying traffic online at high speeds is an inherently hard problem.

Unconstrained Endpoint Profiling


Introduction of a novel methodology for three scenarios:

No operational traces are available

Packet-level traces are available

Sampled flow-level traces are available

Internet access trend analysis for four world regions.

Methodology


Rule Generation


Query Google using a sample ‘seed set’ of random IP addresses from the networks in four world regions.

Keep only the top N keywords that can be meaningfully used for endpoint classification.
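The keyword-selection step can be sketched as follows. This is only a minimal illustration, assuming the search hits for the seed IP addresses have already been collected as plain text; the function name `top_keywords`, the stop list, and the sample snippets are all hypothetical and do not come from the slides.

```python
import re
from collections import Counter

# Hypothetical stop list; the slides do not say which words are discarded.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "for", "on", "is", "at"}

def top_keywords(hit_texts, n):
    """Return the n most frequent non-stopword tokens across all hit texts."""
    counts = Counter()
    for text in hit_texts:
        # Keep alphabetic tokens only, so IP addresses and ports are ignored.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

# Made-up snippets standing in for search hits on seed IP addresses.
sample_hits = [
    "Free proxy list: 192.0.2.1 port 8080 proxy",
    "Forum post by user on the game server forum",
    "Proxy server list updated, forum thread",
]
print(top_keywords(sample_hits, 3))
```

In practice the resulting list would still need manual review, since frequent words are not necessarily meaningful for classification.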

Methodology


Web Classifier


Rapid URL search


Hit text search


Example URL: www.robtex.com/dns/32.net.ru.html

Methodology


IP tagging


URL based tagging


General hit text based tagging


Hit text based tagging for Forums


Post date and username in the vicinity of the IP address => forum user

Presence of the following keywords: http:\, ftp:\, ppstream:\, mms:\ => http share, ftp share, streaming node
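The two hit-text rules above can be sketched roughly as follows. The date and username patterns, the 80-character window used for "vicinity", and the assignment of both `ppstream:\` and `mms:\` to the streaming tag are assumptions made for illustration; the slides do not specify them.

```python
import re

# Keywords from the slide; mapping both streaming schemes to one tag is an assumption.
SCHEME_TAGS = {
    "http:\\": "http share",
    "ftp:\\": "ftp share",
    "ppstream:\\": "streaming node",
    "mms:\\": "streaming node",
}

# Hypothetical patterns for a post date and a username marker.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
USER_RE = re.compile(r"\b(?:posted by|user(?:name)?)\b[:\s]+\w+", re.IGNORECASE)
WINDOW = 80  # characters on each side of the IP counted as "in the vicinity"

def tag_hit_text(ip, hit_text):
    """Return the set of tags that the hit text suggests for this IP address."""
    tags = set()
    pos = hit_text.find(ip)
    if pos != -1:
        nearby = hit_text[max(0, pos - WINDOW): pos + len(ip) + WINDOW]
        # Forum rule: a post date and a username both appear near the IP.
        if DATE_RE.search(nearby) and USER_RE.search(nearby):
            tags.add("forum user")
    lowered = hit_text.lower()
    # Keyword rule: any of the listed scheme keywords appears in the text.
    for keyword, tag in SCHEME_TAGS.items():
        if keyword in lowered:
            tags.add(tag)
    return tags

print(tag_hit_text("61.172.249.13", "Posted by alice on 2008-01-05 from 61.172.249.13"))
```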

Methodology


Examples


200.101.18.182 - inforum.insite.com: URL based tagging

61.172.249.13 - ttzai.com: hit text based tagging for forums

Information comes from


Web logs


Proxy logs


Forums


Malicious list


Server list


P2P communication

Evaluation


When No Traces are Available.

When Packet-Level Traces are Available.

When Sampled Traces are Available.




When No Traces are Available







The unconstrained endpoint approach is applied to a subset of the IP range belonging to the four ISPs shown in the table above.

When No Traces are Available



Correlation with operational traces.

Correlation with other sources.

The unconstrained endpoint profiling approach can be used effectively to estimate application popularity trends.

When Packet-Level Traces are Available

BLINC

Off-line tool

In particular, cannot classify at the application level

Variable-quality results for different traces

UEP

Superior classification results

Operates efficiently online

When Packet-Level Traces are Available







Collect the most popular 5% of IP addresses and tag them by applying the methodology.

Use this information to classify the traffic flows.
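A rough sketch of these two steps is given below. It assumes that a flow is reduced to a (source IP, destination IP) pair and that popularity means the number of flows an address appears in; neither detail is stated on the slide. The tag dictionary stands in for the output of the tagging methodology, and all names and addresses are hypothetical.

```python
from collections import Counter

def popular_endpoints(flows, fraction=0.05):
    """Return the top `fraction` of IP addresses, ranked by flow count."""
    counts = Counter()
    for src, dst in flows:
        counts.update((src, dst))
    keep = max(1, int(len(counts) * fraction))  # always keep at least one
    return [ip for ip, _ in counts.most_common(keep)]

def classify_flows(flows, ip_tags):
    """Label each flow with the tag of whichever endpoint is already tagged."""
    return [ip_tags.get(dst) or ip_tags.get(src) or "unknown" for src, dst in flows]

# Made-up flows using documentation address ranges.
flows = [
    ("192.0.2.1", "198.51.100.7"),
    ("192.0.2.2", "198.51.100.7"),
    ("192.0.2.3", "203.0.113.9"),
]
to_tag = popular_endpoints(flows)
# Hypothetical result of tagging the popular endpoints.
ip_tags = {"198.51.100.7": "web server"}
print(to_tag, classify_flows(flows, ip_tags))
```

Only the popular endpoints need to be looked up, which keeps the number of search queries small relative to the number of flows they help label.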


When Sampled Traces are Available







Due to sampling, an insufficient amount of data remains in the trace, and hence the graphlets approach simply does not work.

Popular endpoints are still present in the trace despite sampling.
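A short calculation illustrates why popular endpoints tend to survive sampling. It assumes independent random packet sampling at a fixed rate, which the slides do not state; under that assumption, an endpoint that appears in n packets is seen at least once with probability 1 - (1 - rate)^n. The function name and the 1-in-1000 rate are illustrative only.

```python
def survival_probability(packets, sampling_rate):
    """Probability that at least one of `packets` packets is sampled,
    assuming each packet is kept independently with `sampling_rate`."""
    return 1.0 - (1.0 - sampling_rate) ** packets

# Compare rarely seen and heavily used endpoints at a 1-in-1000 sampling rate.
for packets in (1, 10, 1000, 100000):
    print(packets, survival_probability(packets, 0.001))
```

Under this model, an endpoint seen in a single packet almost always disappears from the sampled trace, while an endpoint seen in many thousands of packets is almost certain to remain.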


When Sampled Traces are Available







The endpoint approach remains largely unaffected by sampling.

Endpoint Profiling


Endpoint Clustering


Clustering has been employed in networking before: the Autoclass algorithm.

A set of tagged IP addresses from a region’s network is the input to the endpoint clustering algorithm.
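To make the input and output of this step concrete, the sketch below groups endpoints that share exactly the same set of tags. This is a deliberately simple stand-in and is not the Autoclass algorithm, which is a probabilistic clustering method; the function name, tags, and addresses are hypothetical.

```python
from collections import defaultdict

def group_by_profile(ip_tags):
    """Group IP addresses whose tag sets are identical (not Autoclass)."""
    groups = defaultdict(list)
    for ip, tags in ip_tags.items():
        groups[frozenset(tags)].append(ip)
    return dict(groups)

# Made-up tagged endpoints for one region.
tagged = {
    "192.0.2.1": {"browsing"},
    "192.0.2.2": {"browsing", "chat"},
    "192.0.2.3": {"browsing"},
}
for profile, ips in group_by_profile(tagged).items():
    print(sorted(profile), ips)
```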


Endpoint Profiling






Browsing, alone or combined with chat or mail, seems to be the most common behavior.

Endpoint Profiling


Traffic Locality

Conclusion


UEP


Accurately predicts application and protocol usage trends when no network traces are available.

Dramatically outperforms BLINC when packet traces are available.

Retains high classification capabilities when flow-level traces are available.

Profiles endpoints residing in four different world regions:

Network applications and protocols used in these regions.

Characteristics of endpoint classes that share similar access patterns.

Clients’ locality properties.