Internet Geolocation - University of Wisconsin-Platteville


Internet Geolocation



Brandon Koontz

(koontzb@uwplatt.edu)

Department of Computer Science and Software Engineering

University of Wisconsin-Platteville



Abstract


Internet geolocation is the process of mapping a host device to its real-world location. Geolocation has many uses, such as marketing, default language detection, content distribution, and emergency services. This article gives a brief explanation of a few of the traditional location techniques, as well as some of the more modern techniques used to find the location of a device, whether the device is stationary or mobile. A few ways that these geolocation techniques can be evaded, such as proxies and software like Tor, are also described. A simple example of counter-evasion is also provided.



Introduction


Internet geolocation is the process of finding the geographical location of a host device using network information that is accessible from the internet. There are many reasons why knowing the location of a device is beneficial: emergency services, financial applications, content delivery networks, marketing companies, and social networking sites all make use of it. There are traditional ways of locating a device, such as GPS or E911 services. If the device uses a dial-up connection, the public switched telephone network could be used. For mobile devices, different techniques would be used, such as radio frequency fingerprinting, GPS (if available), and wireless access points. Many of the techniques described here use the Internet Protocol (IP) address, the domain name (if available), and different routing numbers of a device to locate it. Companies like Google provide ways for a developer to locate a device very easily, and organizations like the Internet Corporation for Assigned Names and Numbers provide and maintain databases of related information.








IP Address Overview


Every host computer on the internet is assigned an Internet Protocol (IP) address. This IP address is a globally unique number given to a device that wishes to connect to a network. The global organization responsible for allocating and maintaining IP addresses is the Internet Assigned Numbers Authority (IANA). The IANA allocates blocks of IP addresses to the five Regional Internet Registries (RIRs), as shown in Figure 1. These RIRs then allocate smaller blocks of addresses to Internet Service Providers (ISPs) and some large companies. The ISPs then assign an IP address to each device in their network that wants to connect to the internet.





Figure 1: Regional Internet Registries [4]



Traditional Location Systems


Services that provide location have been around for many years. During the late 20th century, the public switched telephone network (PSTN) transitioned from a circuit-switched network to a packet-switched network. This allowed the calling line identity of the endpoints, also known as the phone number, to be transmitted through the network. For conventional fixed-line telephony, the association of the physical address to location only required the incorporation of a relatively static database associating the phone number with the known location to which the line serving that address terminated. [2] A crucial example of this association is the emergency call routing function, which allows someone to dial 911 (in the United States) and have the system find the address of the caller and dispatch emergency services. Other examples include caller ID and the ability to call a nationwide number that connects to a local business.


When mobile networks became more commonplace, this created problems for the PSTN. Mobile devices were able to move around within areas of coverage, but their phone number and physical address did not change in the PSTN. This was resolved in 1996 when the Federal Communications Commission (FCC) introduced the E911 regulation, which required that when a call was made to the 911 service, the wireless service provider deliver the location of the caller to the 911 service. This location was usually represented in geodetic form, namely latitude and longitude values. [2]



As time progressed, network equipment vendors found other ways to locate a mobile device. The mobile network is made up of cell towers, and existing measurements such as air interface timings and signal strength can be used to estimate where a device is. The technique of using these measurements is called enhanced cell identity. [2] There is also radio frequency fingerprinting, where signal strength is sampled across the coverage area, usually in a grid pattern, and stored in a database. When a location for a device is requested, the signal strength of the cell towers in the area is compared to the values in the database and the location is determined. [2]
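The database comparison used by radio frequency fingerprinting can be sketched as a nearest-neighbor search. This is only an illustration under stated assumptions: the grid points, tower names, and signal values below are made up, and the Euclidean distance measure is one plausible choice rather than the method a particular vendor uses.

```python
import math

# Hypothetical survey database: grid point (lat, lon) -> signal strength in dBm
# measured from each tower at that point. All values are invented.
FINGERPRINTS = {
    (42.730, -90.480): {"tower_a": -60, "tower_b": -85, "tower_c": -90},
    (42.730, -90.470): {"tower_a": -70, "tower_b": -75, "tower_c": -88},
    (42.740, -90.480): {"tower_a": -80, "tower_b": -65, "tower_c": -72},
}


def signal_distance(observed, stored):
    """Euclidean distance between two readings over the towers they share."""
    shared = observed.keys() & stored.keys()
    if not shared:
        return math.inf  # nothing to compare, treat as infinitely far apart
    return math.sqrt(sum((observed[t] - stored[t]) ** 2 for t in shared))


def locate(observed, fingerprints=FINGERPRINTS):
    """Return the surveyed grid point whose stored readings match best."""
    return min(fingerprints,
               key=lambda point: signal_distance(observed, fingerprints[point]))
```

Calling `locate({"tower_a": -69, "tower_b": -77, "tower_c": -87})` picks the grid point whose stored readings are closest to the observed ones. The resolution of the answer is limited by how finely the coverage area was sampled.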




Geolocation Uses


With the recent growth of mobile devices and how accessible the internet has become, companies are taking advantage of the new possibilities. Entrepreneurs are also using these possibilities to invent new services. Services like Gowalla and Foursquare use location information to create a location-based social network, where the user and friends of the user check in to locations using their mobile devices and interact with others. Web sites like Hulu, BBC, and YouTube use the location of the user to determine whether the user is in the right country, in order to limit what content may be accessed. Usually this is to comply with licenses that the company and the owner of the content agree upon. Companies like Google and Microsoft also use geolocation to tailor web searches, advertisements, and services like driving directions or finding restaurants. Many other websites commonly use a country-level location to determine what language should be displayed. Other sites like Google and CNN will redirect a user to a server that is closer, for example "Google.com.au" for Australia.



Techniques


Internet geolocation uses the IP address of the host to determine the geographical location of the user. These locations are usually in the form of a civic address (country, state, city, etc.) or a geocode (latitude and longitude). There are a few different techniques that can be used to determine a user's location, such as whois lookups and Domain Name System (DNS) queries.




Whois is a protocol that uses an IP address, an Autonomous System (AS) number, or a domain name to retrieve records from public whois databases provided by the RIRs. This service can be accessed by a command on the local host or from a remote host. A remote host can be used if it has the network information that is to be located. For example, the command "whois google.com" would return the information pertaining to google.com, such as who registered the domain and where it was registered.
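What the whois command does on the wire is simple enough to sketch. Under the WHOIS protocol (RFC 3912), the client opens a TCP connection to port 43, sends the query text followed by a carriage return and line feed, and reads until the server closes the connection. The function names below are illustrative, and unlike the command-line tool this sketch does not choose a server automatically, so the caller has to supply one.

```python
import socket


def build_whois_query(term):
    """A whois request is the query text followed by CRLF (RFC 3912)."""
    return term.strip().encode("ascii") + b"\r\n"


def whois(term, server, port=43, timeout=10.0):
    """Send one query to a whois server and return the raw text reply."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_whois_query(term))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

With network access, `print(whois("edu", "whois.iana.org"))` would print the record that IANA holds for the "edu" extension. The reply is free-form text, so any further processing has to parse it line by line.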


Whois can be used in conjunction with an IP address to obtain information about that IP address. For example, with the IP address "173.20.133.90", the response to the command "whois 173.20.133.90" says that the 173.16.0.0/12 block of IP addresses is registered to the organization "Mediacom Communications Corp", which is located in Middletown, New York, and is in the ARIN database, i.e. North America. This technique would locate this host in Middletown, NY, but the PC for this IP address is actually located in Dubuque, IA. With this IP address, this technique is only accurate to country-level resolution. An issue with this technique is that not all IP addresses are located at or around the location of the organization. The data in the whois database may also be inaccurate, since the registrants could submit false or incorrect data.
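The reason a whois record for the 173.16.0.0/12 block applies to the host 173.20.133.90 is that the address falls inside that block. A minimal sketch of the check, using Python's standard ipaddress module (the helper name `in_block` is illustrative):

```python
import ipaddress


def in_block(address, network):
    """Return True if the address lies inside the given CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


block = ipaddress.ip_network("173.16.0.0/12")
print(in_block("173.20.133.90", "173.16.0.0/12"))  # membership of the example host
print(block.num_addresses)                         # how many addresses share the record
```

A /12 block spans 2^20 addresses, which helps explain why a single registration address says little about where any one host in the block is located.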


An AS number is used by routing protocols like Border Gateway Protocol (BGP). Each RIR holds blocks of AS numbers, and organizations apply to their RIR to obtain them. [6] The RIR RIPE also has a service for finding AS numbers. For example, with the IP address "173.20.133.90", the command "whois -h riswhois.ripe.net 173.20.133.90" returns an AS number of 6478. Using this with the whois command, the organization that registered the AS number can be found. In this example the command is "whois -h whois.arin.net -- AS6478", which returns AT&T Services, Inc. as the organization, which is located in Middletown, NJ. This technique would locate the host in Middletown, NJ. Recall that the location resolved to a different city in the first example, and neither city is anywhere close to Dubuque, IA, where the host is located. One issue with this technique is that not all IP addresses are located where their AS numbers were registered. As with the first example, the whois databases may contain false or incorrect data. This technique did, however, locate the IP address to the correct country.


A domain name is used to represent an IP address, since humans remember a series of letters much more easily than a series of digits. There are utilities to perform reverse DNS lookups, such as "nslookup" for Windows and "dig" for UNIX systems, as well as a variety of web utilities. These can also be used along with the whois command to help determine the location of a particular host. First it must be determined whether the IP address in question maps to a domain name. For that, the command "dig -x 137.104.129.136" shows that this IP address maps to the domain name "www.uwplatt.edu", as in Figure 2.



;; ANSWER SECTION:
136.129.104.137.in-addr.arpa. 900 IN PTR www.uwplatt.edu.

Figure 2: Excerpt from executing the command "dig -x 137.104.129.136"

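The name queried in the answer section of Figure 2 is the IP address with its octets reversed and "in-addr.arpa" appended, which is how reverse DNS lookups are addressed. A small sketch using Python's standard library follows; the function names are illustrative, and `reverse_lookup` needs a working resolver, so its result depends on the network it runs on.

```python
import ipaddress
import socket


def ptr_name(address):
    """Build the fully qualified name that a reverse (PTR) lookup queries."""
    return ipaddress.ip_address(address).reverse_pointer + "."


def reverse_lookup(address):
    """Ask the system resolver for the host name; None if there is no answer."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
        return hostname
    except (socket.herror, socket.gaierror):
        return None


print(ptr_name("137.104.129.136"))
```

Not every IP address has a PTR record, which is why `reverse_lookup` returns None instead of raising an error when the lookup fails.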

Now the IP address is known to resolve to the domain name "uwplatt.edu". Next, a command is sent to the IANA whois server to determine which whois server is responsible for the "edu" extension. For this the command "whois -h whois.iana.org -- edu" is used, and the response is "whois.educause.edu". Then this new whois server is used to find the information on the whole domain name. For this the command "whois -h whois.educause.edu -- uwplatt.edu" is used.





Figure 3: Excerpt from running the command "whois -h whois.educause.edu -- uwplatt.edu". Note that the spelling of "Technolgy" is a spelling error from when the domain was registered.
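The two-step lookup described above (ask IANA which whois server handles an extension, then query that server) depends on pulling the server name out of IANA's text reply. The sketch below assumes the reply names the server on a line labeled "whois:" or "refer:". Those labels and the sample reply are assumptions made for illustration and should be checked against a real response.

```python
def find_referral(iana_reply):
    """Return the whois server named in an IANA reply, or None if absent."""
    for line in iana_reply.splitlines():
        label, sep, value = line.partition(":")
        # Assumed field labels; lines without a colon are skipped.
        if sep and label.strip().lower() in ("whois", "refer") and value.strip():
            return value.strip()
    return None


# Hypothetical, abbreviated reply used only to exercise the parser.
SAMPLE_REPLY = """\
% hypothetical, abbreviated reply
domain:       EDU
whois:        whois.educause.edu
"""

print(find_referral(SAMPLE_REPLY))
```

The returned server name would then be used for the second query, the one for the full domain name.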


In this example, the domain "uwplatt.edu" is located to UW-Platteville in Platteville, WI. With this information it could be assumed that the host behind the domain is also located there. For this technique, the information is correct for the city, state, and country.


Some domain names may contain geographic codes that can be used to locate a host. Such domain names contain letters that correspond to a particular country or province. For example, a Google search for site:.ca returns only results located in the .ca domain, which is the top-level domain for Canada. In some cases, this extends to finer granularity: .ab.ca is for Alberta, Canada; calgary.ab.ca is for Calgary, Alberta, Canada. [1] Realistically speaking, granularity beyond the country level cannot be expected (it is dependent on domain registrar policy), and even then accuracy is suspect. An example of this is the .tv domain, which corresponds to the country Tuvalu, but its registry does not restrict who can buy a domain with the .tv extension. Sites like TWiT.tv provide entertainment and are not based in Tuvalu.
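A country hint taken from the top-level domain can be sketched as a simple table lookup. The table below lists only the three country codes mentioned in this article, and the function name is illustrative. A real table would cover every country-code top-level domain.

```python
# Country-code top-level domains mentioned in this article (not a full list).
CCTLD_HINTS = {"ca": "Canada", "au": "Australia", "tv": "Tuvalu"}


def country_hint(hostname):
    """Return the country suggested by the top-level domain, or None."""
    labels = hostname.rstrip(".").lower().split(".")
    return CCTLD_HINTS.get(labels[-1])


print(country_hint("calgary.ab.ca"))
print(country_hint("TWiT.tv"))          # a hint only: the site is not based in Tuvalu
print(country_hint("www.uwplatt.edu"))  # generic domains carry no country code
```

The TWiT.tv case shows why the result should be treated as a hint rather than a location: the lookup reports Tuvalu even though the site is not based there.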


If a company doesn't want to go through the hassle of finding a device's location itself and is willing to pay, there are services that, when given an IP address, will return an approximate location of the user with varying degrees of accuracy. A few providers of this service are IP2Location, MaxMind, NetAcuity, and Google Location Service. Some of these services require payment, but a company may be willing to pay for less hassle.


Registrant:
University of Wisconsin - Platteville
Office of Information Technolgy
1 University Plaza
Platteville, WI 53818
UNITED STATES


With more modern web browsers, the addition of HTML5 and Application Programming Interfaces (APIs) such as the W3C Geolocation API allows a more accurate location to be determined. Browser implementations of the W3C Geolocation API can use a provider such as Google's Location Services to locate a host by retrieving network information, such as an IP address, from the web browser. Google's Location Services uses many different techniques to detect the location of a host, such as IP geolocation and location using wireless networks. For some mobile devices Google uses cellular tower triangulation. The wireless hardware on the device is also used to detect which wireless access points can be seen, and that information is compared to the wireless access point data that the Google Street View project has collected to determine the location of the device.


This technique works best in populated areas and not as well in rural areas, since it relies on wireless access points. In many rural areas wireless access points are more spread out, so GPS is usually more accurate. Mobile platforms, such as Google's Android, Apple's iPhone, and Windows Phone, have their own location services on the device that allow the user to select which location techniques, such as GPS or mobile networks, are used to find a location.
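The comparison of visible wireless access points against a database of surveyed access points can be illustrated with a weighted centroid. This is a simplified stand-in and not a description of how Google's service computes a position. The access point identifiers, coordinates, and the choice of weighting by received power are all assumptions made for the sketch.

```python
# Hypothetical survey data: access point identifier -> (latitude, longitude).
KNOWN_ACCESS_POINTS = {
    "ap-1": (42.7300, -90.4800),
    "ap-2": (42.7310, -90.4790),
    "ap-3": (42.7320, -90.4810),
}


def estimate_position(visible, known=KNOWN_ACCESS_POINTS):
    """Estimate (lat, lon) from visible access points.

    `visible` maps an access point identifier to its received signal
    strength in dBm. Stronger signals pull the estimate closer to that
    access point. Returns None if no visible access point is known.
    """
    total = lat_sum = lon_sum = 0.0
    for ap_id, rssi_dbm in visible.items():
        if ap_id not in known:
            continue  # not in the survey database, so it cannot help
        weight = 10 ** (rssi_dbm / 10.0)  # convert dBm to linear milliwatts
        lat, lon = known[ap_id]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total += weight
    if total == 0.0:
        return None
    return (lat_sum / total, lon_sum / total)
```

With few or no known access points in range, as in many rural areas, the function has little or nothing to work with. That matches the observation that the technique works best in populated areas.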


A technique that won't be described in depth is the use of the Global Positioning System (GPS), since many times the device may not have GPS hardware or the device is located where a GPS lock cannot be obtained. An example of this is an urban area like New York, where most of the sky is not visible.



Evasion and Counter Evasion of Internet Geolocation


With any technology there is always some way of getting around it. There are people or groups that want or need to get around that technology; for some the intent may be malicious, and for others it is relatively harmless. A malicious example would be credit-card fraud. A less harmful example would be someone outside the United States who would like to watch streaming video content from another country.


One example of evading internet geolocation is the use of a proxy server. A user connects to a proxy server, and any internet traffic they send or receive is sent through the proxy server, as in Figure 4. The web server therefore sees the IP address of the proxy rather than that of the user.





Figure 4: Example of a proxy server in between the clients and web servers.


Another evasion method is the use of services like Tor. Tor uses a system of relay computers located all across the internet. To create a private network pathway with Tor, the user's software or client incrementally builds a circuit of encrypted connections through relays on the network. [7] When a website receives data sent from the user's computer, the web server only sees the IP address of the last relay, or exit node. What is unique about Tor is that after a certain amount of time the pathway is changed, so that the user may stay hidden from others. One drawback is that Tor only works for TCP network streams, such as web browsing.


A project that Tor created is Vidalia. This program is a cross-platform graphical interface for the Tor network. It provides an easy way to configure, start, and stop Tor, as well as some statistics for the current connection.


Even though proxies and software like Tor provide ways to bypass some techniques of locating a device, these ways can also be bypassed.


An example of a counter-evasion technique is opening a non-proxy connection, which allows the host's IP address to be leaked. This can be accomplished by letting web content such as Adobe Flash and Java applets run. In Java, a new socket can be created with "Proxy.NO_PROXY", as in Figure 5; this bypasses any proxy settings set by the browser and even those set by the Java Control Panel.



Figure 5: Code to bypass a proxy, based on reference [6]







Conclusion


Internet geolocation is a valuable resource for any person or company that needs to determine the location of a device. Some of the techniques covered included traditional means such as the public switched telephone network and radio frequency fingerprinting. More modern means use internet network information such as an IP address, a domain name, and routing numbers. These techniques access public databases, such as the whois databases provided by IANA and the RIRs, to help decipher this data. Companies such as Google also provide services that give developers a way to access this data easily. Along with detection there are also evasion techniques, such as using a proxy or software such as Tor and the Vidalia project, to conceal some of the network information so the host may remain anonymous. Finally, this article briefly covers ways to get around this evasion by using web content such as Adobe Flash and Java to gain access to the network information that was concealed.



References


[1] Acton, R., Friess, N., & Aycock, J. (2007). Inverse geolocation: Worms with a sense of direction. Performance, Computing, and Communications Conference, 2007. IPCCC 2007. IEEE International, 487-493.

[2] Barnes, R., Winterbottom, J., & Dawson, M. (2011). Internet geolocation and location-based services. Communications Magazine, IEEE, 49(4), 102-108.

[3] Google Location Service. Retrieved from http://static.googleusercontent.com/external_content/untrusted_dlcp/www.google.com/en/us/intl/zh-CN/events/facultysummit/2010/files/mobile_location.pdf

[4] Internet Corporation for Assigned Names and Numbers. Retrieved from http://www.iana.org

[5] Muir, J. A., & Oorschot, P. C. V. (2009). Internet geolocation: Evasion and counterevasion. ACM Comput. Surv., 42(1), 4:1-4:23.

[6] Thorvaldsen, Ø. E. (2006). Geographical location of internet hosts using a multi-agent system.

[7] Tor Project. Retrieved from https://www.torproject.org/