
Dec 13, 2013


Social Network Mapping Visualization

Lourdes Chang

University of California, Santa Cruz


Social networks were launched to allow people to keep in touch, to let musicians get their
music out, and to give users a place to get together. Around twenty social networks have
come and gone since they first showed up. Each social network gives users the same
functionality: a way to connect to friends and to post messages and pictures. In this
paper, I present a way of taking a hometown attribute and creating multiple ways of
visualizing it.


Before social networks, there was no easy way of knowing whether a connection existed between
two people; the only way to find out was by word of mouth. However, once social
networks such as Facebook, MySpace, Twitter, and Friendster showed up, it became
easier to get information about a person. Each social network account provides
information about a person (the amount of information varies and depends on the


are friends of the users, but what about their friend’s friends? The only way to get that
type of information is to manually go to the person of interest and look at their first
degree of friends, and so on. This method is inefficient, to say the least. To that end, this
paper describes a web application that helps to visualize how people are connected,
gather statistics, and analyze these statistics.

Growing up, there are always quotes that parents, friends, teachers, and advisers will tell you.
Not surprisingly, the majority of them are about making good decisions in life and about the
people that a person associates with, such as: “A good friend is a connection to life, a tie
to the past, a road to the future, the key to sanity in a totally insane world” (Lois Wyse)
and “Show me your friends and I will show you your future” (John Kuebler).

My motivation for this project stems from these quotes. I started asking myself: who am I
associated with? Is there a correlation between my associations in the past and where I
am now? My initial attempt to answer these questions used Facebook as my
database. It has more than 400 million active users, roughly 70% of whom are
outside of the United States [1], which gave me a fairly good range of people. However,
Facebook has recently been criticized for security leaks and for its inability to protect user
information from worms. Instead of using Facebook data, I decided to create a generic
database, so that this web application will be ready to use once the correct type of
information becomes available.


Social networks expose information about a person, whether intended or not. We first
take a look at Facebook, which is currently one of the more popular social networks.
Next, we explain why it was not ideal to use its data and why a generic dataset was used
in its place. Once the data was gathered, it was displayed in different formats (Google Maps,
HyperTrees, charts) and analyzed to produce some results.

2.1 Facebook



Facebook started out as a social network among college students, so it required
that a user have a valid college email address to open an account (@[school]). Its main
advantage over other social networks such as MySpace was that a user had to be a student
in order to have an account. A person could not create a fake account and pretend to be
someone they weren’t, since each student was given a unique email account. However, as
its popularity grew, Facebook expanded and allowed high school students to join. Now it
is open to anyone who has an email address.


Facebook Example

Below is a fake account that someone created for a ‘Freddi Staur.’ Figure 1 shows
the type of information that is displayed on a profile. A user is allowed to post a
picture of anything as long as it falls under Facebook's regulations. Each Facebook
account comes with a profile section where the person can add varying amounts of
information about himself. Facebook also shows an activity log of what the
person has been doing, which can range from posting a comment on another
person’s profile wall to playing a Facebook game.

Figure 1


Facebook information

Every Facebook account comes with a profile section where a user can disclose as
much information about themselves as they choose. The list of information that a user
can provide is as follows:

Email address






Education history

Relationship status

Events/Social gatherings


Groups that they belong to

Friend list

Hometown/Current location


After gathering the list of possible data to analyze, I narrowed it down to the
following list that would be useful.


Education history

Groups that they belong to

Hometown/Current location

Friend list

In the end, I settled on hometown location and friend list. Hometown is the only piece
of information that stays unchanged: a person's hometown is fixed by where they were
born and grew up, so this information is static. The person’s current location can
change, but that is a separate field.

Facebook API

Once the data was settled, I looked into ways of extracting this information and
came across the Facebook API. The API is documented and even has examples, but what it
fails to have is fully working functions. I used my account and my list of friends as
the testing account. I was able to gather some data, but I noticed that the data was
incomplete. After reading forums, I found it noted that the API does not work well with
older accounts, and as such information could be missing. Instead of waiting for a fix,
I decided to go ahead and move on with my project using synthetic data.

2.2 Synthetic Data

2.2.1 Data gathering

From my initial project idea, I wanted to gather friends’ information and their
hometowns and plot them on a map using Google Maps. To map a marker to a
location, the latitude and the longitude are needed. The Google Maps API
provides a geocoder function that converts a location to a latitude and longitude
coordinate, which can then be plotted. However, if the dataset is too big, there is a
possibility of a time out, and markers would be missing.

For my approach, I gathered a list of all cities and their associated states in the
United States, and broke the data into files where each file contained roughly
1000 cities. I fed the files into the geocoder function and created a file in which
each entry consisted of a city/state, latitude, and longitude. That file made up my
database.
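
The batching step described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the project: the names `batches` and `build_lookup` are hypothetical, and `geocode` is a caller-supplied stand-in for whatever geocoding service is available (it is not a Google Maps API call).

```python
from typing import Callable, Dict, Iterator, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)


def batches(items: List[str], size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_lookup(cities: List[str],
                 geocode: Callable[[str], Coord],
                 size: int = 1000) -> Dict[str, Coord]:
    """Geocode every city batch by batch and collect the results in one table.

    `geocode` is an assumed callback that maps "City, State" to (lat, lng).
    """
    lookup: Dict[str, Coord] = {}
    for batch in batches(cities, size):
        for city in batch:
            lookup[city] = geocode(city)
    return lookup
```

Working through the cities in fixed-size batches keeps each geocoding run short, and the resulting table can be saved once and reused so that no geocoding has to happen when the map is drawn.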

I debated how big the dataset should be and settled on all cities within the
United States. This gave me a dataset large enough that duplicates would be
highly unlikely.

2.2.2 Create the data

The next step consisted of creating multiple test datasets. To do so, one person
is designated the host with x friends. Those friends are each assigned y
friends, and each of those friends is assigned another set of friends. I used a Perl
script to generate a random number for the number of friends, while another random
number was generated to pick an entry from the database.

However, at each level of friends, the range for the number of friends gets cut in
half, such that the process eventually terminates when there are no more friends.
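
The halving rule can be sketched as a short recursive generator. The original script was written in Perl and is not reproduced here; this Python version is only an illustration. The function names, the dictionary layout, and the choice to draw the friend count uniformly from 0 up to the current maximum are all assumptions.

```python
import random


def generate_friends(cities, max_friends, rng):
    """Build one person and, recursively, that person's friends.

    cities      -- list of database entries to pick a hometown from
    max_friends -- upper end of the range for this level (assumed inclusive)
    rng         -- a random.Random instance, so runs can be reproduced
    """
    person = {"city": rng.choice(cities), "friends": []}
    # Assumption: the number of friends is uniform in 0..max_friends.
    count = rng.randint(0, max_friends)
    for _ in range(count):
        # The range is cut in half for the next level; once it reaches 0
        # no further friends are generated and the recursion ends.
        person["friends"].append(
            generate_friends(cities, max_friends // 2, rng))
    return person


def tree_depth(person):
    """Number of levels in the tree, counting the host as level 1."""
    return 1 + max((tree_depth(f) for f in person["friends"]), default=0)
```

With a starting range of 8, the ranges per level are 8, 4, 2, 1, and 0, so a generated tree has at most five levels.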

The naming scheme used to tell the friends apart consists of a name and
number association. The host is known as “friend” and has x friends; his
friends are identified as friend[0], friend[1], … friend[x-1]. Friend0 would have y
friends, and those friends simply append an additional “.[number]”. An
example is as follows:

Host, known as “friend”, has 2 friends; they are known as Friend0 and Friend1.

Friend0 has 2 friends; they are known as Friend0.1 and Friend0.2.

Friend1 has 1 friend; he is known as Friend1.1.

This information is saved in an XML file where each friend is an entry with the
city/state, latitude, longitude, and number of friends as its attributes.
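
As an illustration of the naming scheme and the flat XML layout, the sketch below walks a nested friend structure and emits one entry per friend. The element name `entry`, the attribute names, and the input dictionary layout are hypothetical; the paper does not give the actual schema. The numbering follows the example above (the host's friends start at 0, deeper levels start at 1).

```python
import xml.etree.ElementTree as ET


def to_xml(host):
    """Return a <friends> element with one flat <entry> per person."""
    root = ET.Element("friends")
    _add_entry(root, host, "friend", 0)
    return root


def _add_entry(root, person, name, level):
    ET.SubElement(root, "entry", {
        "name": name,
        "city": person["city"],
        "lat": str(person["lat"]),
        "lng": str(person["lng"]),
        "friends": str(len(person["friends"])),
    })
    for i, child in enumerate(person["friends"]):
        if level == 0:
            child_name = "Friend%d" % i           # Friend0, Friend1, ...
        else:
            child_name = "%s.%d" % (name, i + 1)  # Friend0.1, Friend0.2, ...
        _add_entry(root, child, child_name, level + 1)
```

Calling `ET.tostring(to_xml(host), encoding="unicode")` yields the XML text, which could then be written to a file.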


2.3 Displaying data

2.3.1 Map

The map was created using the Google Maps API, and it consists of two
components: the map and the sidebar. For each entry in the XML file, a marker is
placed on the map while its associated link is placed in the sidebar. The map
contains different viewing options that affect how the markers are shown. For
example, the color-by-state option colors markers that fall in the same state the
same color, with each state given its own color. The clustering option takes markers
that are close to each other and, instead of showing multiple markers, displays an
icon with the number of markers that can be found in that area. As the user zooms in,
the individual markers become visible and the clusters decrease. The map also draws
lines from each node to its child nodes, to maintain a tree structure.
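
The clustering behavior can be illustrated with a simple grid-based sketch: markers that fall in the same grid cell are shown as one icon with a count, and the cells shrink as the zoom level increases. This is a simplified stand-in written for illustration, not the algorithm of the clustering library used in the project; the function name and the `base_cell_deg` parameter are assumptions.

```python
import math
from collections import Counter


def cluster_counts(points, zoom, base_cell_deg=32.0):
    """Group (lat, lng) points into grid cells and count the points per cell.

    The cell size halves with every zoom level, so zooming in splits
    clusters apart until each marker stands alone.
    """
    cell = base_cell_deg / (2 ** zoom)
    counts = Counter()
    for lat, lng in points:
        key = (math.floor(lat / cell), math.floor(lng / cell))
        counts[key] += 1
    return counts
```

Each key in the result is one icon on the map, and its value is the number shown on that icon.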

The sidebar lists all of the markers that are on the map and identifies them
by their city and state. When an entry is read from the XML file, its
marker is put on the map and the city/state string is put into the sidebar as a
link. The first entry in the sidebar is always the root of the tree, and the following
entries are its children. If a user clicks on the first entry in the sidebar, the
map displays a pop-up window next to the marker so that the user can locate
it on the map. If the user clicks on any other link in the sidebar, the
map refreshes using the clicked entry as the root node.

2.3.2 HyperTree

Another visualization technique used in this web application comes from the
JavaScript InfoVis Toolkit (JIT). JIT has several options for displaying data, including
the one I chose to use, HyperTree. In order to create the HyperTree, the data had to be
converted into JavaScript Object Notation (JSON) format and then fed into JIT.
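
The conversion to JSON can be sketched as a recursive mapping from the friend tree to nested nodes with `id`, `name`, `data`, and `children` fields, which is the general shape that tree visualizations in the toolkit consume. The input dictionary layout, the id scheme, and the contents of `data` below are assumptions made for illustration.

```python
import json


def to_tree_node(person, node_id="friend"):
    """Convert one person (and, recursively, their friends) to a JSON-ready dict."""
    return {
        "id": node_id,                      # must be unique within the tree
        "name": person["city"],             # label shown on the node
        "data": {"lat": person["lat"], "lng": person["lng"]},
        "children": [
            to_tree_node(child, "%s.%d" % (node_id, i))
            for i, child in enumerate(person["friends"])
        ],
    }


def to_json(person):
    """Serialize the whole tree to a JSON string."""
    return json.dumps(to_tree_node(person))
```

Making a different friend the focus then amounts to calling `to_tree_node` on that friend's subtree instead of on the host.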

As described on their website, HyperTree is “used to display large amount of
related data” [ ]. The HyperTree is an interactive component where each
element is a friend. Just as with the map, it has the option of changing the
focus and making a new node the root element.

Having two forms of visualization allows the user to see the tree structure better.
At times the map can be a little too cluttered, and it can be hard to determine how
friends are linked together unless the user zooms in.


The last forms of visualization are two graphs: a pie graph and a bar graph. The bar
graph analyzes the distances between parents and children. The distances are gathered
when the plotting on the map is handled, and are converted to kilometers. They are then
sorted, and the ranges are calculated by taking the difference between the maximum
distance and the minimum distance and dividing that difference by 10. The distances
are then placed within their respective ranges and plotted.
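
The distance calculation and the ten-range grouping can be sketched as below. The paper does not state which distance formula was used; the haversine great-circle formula with a mean Earth radius of 6371 km is assumed here, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lng) points in degrees."""
    lat1, lng1 = math.radians(a[0]), math.radians(a[1])
    lat2, lng2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def bin_distances(distances, bins=10):
    """Split [min, max] into `bins` equal ranges and count distances per range.

    Returns (minimum, range_width, counts). The maximum value is placed
    in the last range rather than in a range of its own.
    """
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / bins
    counts = [0] * bins
    for d in distances:
        idx = 0 if width == 0 else min(int((d - lo) / width), bins - 1)
        counts[idx] += 1
    return lo, width, counts
```

For example, if the parent-to-child distances run from 0 km to 100 km, each range is 10 km wide, and a distance of exactly 100 km is counted in the last range.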

The pie graph analyzes how many unique states are displayed on the map. This
data also had to be converted into JSON format and fed into a program that
creates pie graphs. It associates each unique state with a color and shows its
percentage next to its slice in the graph.
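
The per-state percentages behind the pie graph can be computed with a simple tally. This is a minimal sketch; the function name and the input format (one state abbreviation per marker) are assumptions.

```python
from collections import Counter


def state_percentages(states):
    """Map each unique state to its share of all markers, in percent."""
    counts = Counter(states)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: 100.0 * n / total for state, n in counts.items()}
```

The resulting dictionary can be serialized to JSON and handed to whichever charting program draws the slices.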


The result is a web application that has a map with options and a sidebar, as well as
a pie graph of the unique states found, as shown in Figure 2. Figure 3 shows the distance
bar graph and the HyperTree associated with the dataset.

Figure 2.

Figure 3

Related Work

There have been many Facebook applications such as [ ] that tried to display the hometown
locations of a user's friends. However, some of them either took too long to calculate or
could only get the first degree of friends. Currently, in order to get more than one degree
of friends, each friend would have to allow the application to run on their account.

There was a similar approach called Vizster that visualized social connections.
Vizster uses more attributes than just hometown to create its graphs
and has a more interactive user interface.
It allows users to click and drag the graphs and
play with them as they see fit.

Future Work

I would like to add features for coupling and decoupling the map and the HyperTree,
and to show statistically how many “friends” fall within a certain radius that the user can
specify.
As of now, the map and the HyperTree are two different entities. I would like to
allow the option of coupling the two entities, so that if an action is carried out on the
map, it will be reflected on the HyperTree, and vice versa.

One other aspect that I wanted to add, but time did not permit, was a radius input
from the user. The radius would use the host friend as the center point, and the
application would calculate how many friends fall within the radius.
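
Since this feature was not implemented, the following is only a sketch of how the radius count might work. It assumes the haversine great-circle formula with a mean Earth radius of 6371 km, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def distance_km(a, b):
    """Haversine distance in kilometers between two (lat, lng) points in degrees."""
    lat1, lng1 = math.radians(a[0]), math.radians(a[1])
    lat2, lng2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def friends_within(host, friends, radius_km):
    """Count the friends whose hometown lies within radius_km of the host."""
    return sum(1 for f in friends if distance_km(host, f) <= radius_km)
```

For instance, with the host at (0, 0) and friends at (0, 0), (0, 1), and (0, 10), a radius of 200 km covers the first two friends, since one degree of longitude at the equator is roughly 111 km.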


All in all, although the initial purpose of this project was to gather real data and come to a
conclusion about a person and their friend associations, the data was not available for me to
come to any conclusions. However, by creating synthetic data, I was still able to move on
with my project and create a web application that displays information about a host
friend and their friends. This web application can be used in the future for analyzing a
person and their friend associations once the data becomes available.



[1] “Statistics | Facebook”, 2010

[2] Sweeny, Charles. “List of Cities and Countries”.

[3] Belmonte, Nicolas Garcia. “JavaScript InfoVis Toolkit - Interactive Data
Visualizations for the Web”, 2010

[4] Mani, Venkat. “Google Map Facebook Mashup!!!”, 2009

[5] Williams, Mike. “Google Map API Tutorial”

[6] “MarkerClusterer for V3”

[7] Veness, Chris. “Calculate distance, bearing and more between Latitude/Longitude

[8] Tentler, Gerd. “Gerry’s Script Library”.

[9] Boyd, Danah. Heer, Jeffrey. “Vizster | Visualizing online social networks”, 2005

[10] Boyd, Danah, Ellison Nicole B. “Social Network Sites: Definition, History, and

[11] Facebook.