
Ubiquitous Semantics: how to create and navigate a
Personal Semantic Network
Oliver Brdiczka

Palo Alto Research Center (PARC)
3333 Coyote Hill Road
Palo Alto, CA, 94304, USA
brdiczka@parc.com



Abstract— Pervasive computing systems including smart phones,
home computing systems, or intelligent home appliances collect
and store more and more information about users. The amount
of information that is constantly being accumulated for each user
is enormous. Collected information comprises the user’s digital
data like messages, online social network connections, or tweets
and the user’s physical context data like user location, physical
proximity to other people, or physical activities. Increasingly
powerful content analysis and physical context inference
technologies enable the extraction of relationships among places,
people, and events, and use the resulting context to construct a
personal semantic network (PSN) of the contextual relationships
across one’s digital and physical interactions. Entities
representing locations, people, events, activities etc. serve as
pivotal elements empowering the user to navigate their PSN. This
paper will discuss and detail both the construction and navigation
of a PSN integrating a user’s physical and digital content.
Keywords – ubiquitous semantics; personal semantic network;
ubiquitous computing; context-awareness; semantic indexing;
semantic entities; integrating digital and physical data; relevance
calculation; priority calculation
I. INTRODUCTION

The first vision of a personalized “memory index” and
device, called memex, has been formulated by Vannevar Bush
in [1] and dates back to the year 1945. The memex was
intended to be an enlarged intimate supplement of a user’s
memory containing the user’s books, records and
communications. While the memex was mostly thought of as a
powerful storage device, its foreseen indexing and retrieval
capabilities remained rather obscure. With the explosion of
personal information produced on a daily basis nowadays, the
storage capabilities are less of a problem than the indexing and
navigation through the vast amount of information. The
progress in information retrieval research has led to a number
of approaches for semantic indexing of large corpora of digital
content (e.g., [2]). These approaches leverage some human
understanding of text, e.g., through extracting or highlighting
keywords with a specific meaning like company names or
street addresses. These semantic keywords or entities can serve
as connection points between pieces of information like
documents, emails, or tweets and enable the user to quickly
navigate through her digital content. However, today’s
information generation and gathering goes far beyond the pure
digital content. Pervasive computing systems like smart
phones, home computing systems, or intelligent home
appliances constantly collect and store information about users.
The amount of information that is being accumulated for each
user is enormous, and in particular, most pieces of information
are collected in situ, i.e. in a specific physical setting or
context. For example, taking a note using a mobile phone at a
specific physical location and time of day may indicate
something about the importance of that note as well as its
connections to other pieces of information that are in the user’s
personal information repository. The challenge is to inter-
connect both the physical context sensing and semantic content
indexing, and to create a representation that integrates both
physical context and digital content.
In order to tackle this problem, this paper proposes the
concept and architecture of the personal semantic network
(PSN). The PSN uses semantic entity extraction to index the
user’s digital content and connects extracted semantic entities
with physical context sensing. The semantic entities
representing locations, people, events, activities etc. serve as
pivotal elements empowering the user to navigate their PSN.
The navigation of the PSN is illustrated by two typical
applications: helping the user remember/retrieve information,
and helping the user filter/focus.
II. CONSTRUCTING THE PERSONAL SEMANTIC NETWORK

The personal semantic network will contain the user’s
digital content and sensed physical context behavior data. Both
will be inter-connected through semantic entities extracted
from the content and mapped to contextual cues. The actual
data and connections can be stored in various forms like a
RDF-based graph database (e.g., AllegroGraph [3]) for high-
performance storage and retrieval. Figure 1 shows an example
schema of a personal semantic network. The depicted schema
contains mostly connections between a user’s communication
data. Messages are inter-connected by the atomic entities:
location, street address, phone number, dateTime, person,
abbreviation, capitalized sequence, and topic. The atomic
entities also connect to meta-data like email addresses, URLs,
or DNS domains. The detection and amount of these semantic
entities will be determined by the semantic “density” of the
digital content data. Section A will detail the notion of
semantic “density” and discuss ways to increase the
density level. In order to integrate with physical context,
elements of the physical context need to be sensed and
extracted. Section B will detail different physical context cues
to be used for a mapping to digital content. In order to link
elements of the personal semantic network schema to physical
context, the atomic entities of the schema need to be
aggregated and mapped to physical entities. Section C will
detail and discuss this mapping and the disambiguation
between digital content and physical context data.
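As a toy illustration of how such inter-connections can be held (not the actual AllegroGraph schema), a PSN fragment might be sketched as a small set of triples; all item ids, predicate names, and entity values below are invented:

```python
# Minimal sketch of a PSN as (item, predicate, entity) triples, loosely
# mirroring an RDF-style graph store. Entities serve as connection points
# between pieces of information.
from collections import defaultdict

class PSN:
    def __init__(self):
        self.triples = set()
        self.by_entity = defaultdict(set)   # entity -> items mentioning it

    def add(self, item, predicate, entity):
        self.triples.add((item, predicate, entity))
        self.by_entity[entity].add(item)

    def connected_items(self, entity):
        """All pieces of information linked through a semantic entity."""
        return self.by_entity[entity]

psn = PSN()
psn.add("email:42", "mentions_person", "Alice Smith")
psn.add("tweet:7", "mentions_person", "Alice Smith")
psn.add("email:42", "mentions_location", "3333 Coyote Hill Road")

print(psn.connected_items("Alice Smith"))  # both items link via the person entity
```

Navigating from any entity to every item that mentions it is then a single lookup, which is the operation the relevance and priority calculations of Section III build on.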
A. Semantic “Density” of Content Data
The personal semantic network does not attempt to reach a
complete understanding of the user’s digital content data, but to
spot keywords that have a specific meaning to the user. These
keywords will provide us with the basic capabilities to index
and reason upon the user’s digital content and physical
contextual data. However, the amount of these semantic
entities in each piece of information that the user generates
or receives is crucial to the success of this indexing and
reasoning. We call this the semantic “density” that we can
reach with the employed method. Named entity recognition
(NER) [4] is normally used to extract semantic keywords or
entities. Traditional NER systems aim at locating and
classifying atomic elements (named entities) in text into
predefined categories such as person names, organizations, or
locations. Most NER systems are further optimized for a subset
of entity categories that are recognized with high confidence.
However, for some applications, and in particular for relevance
calculations (as we will see later), this limitation to recognizing
only a subset of entity categories with high confidence can
result in the extraction of an insufficient number of entities.
Some short pieces of information (e.g., tweets) might
frequently end up containing no semantic entities to connect to.
The extraction of more entity types leveraging the web [5] or
Wikipedia [6] can help alleviate this problem. Less strict
typing of entities, i.e. by just spotting meaningful words or
word combinations [7] with unclear connection to a specific
type will further increase the semantic density and will also
help identify new emerging entities (e.g., new company names,
or internal project names).
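To illustrate the contrast between strictly typed entities and looser spotting, a toy spotter for a few of the atomic entity types in the schema might look as follows; the regex patterns are simplistic stand-ins for a real NER system, not the method used here:

```python
import re

# Illustrative-only spotter for a few atomic entity types (email address,
# phone number, capitalized sequence). Loosely-typed patterns like the
# capitalized-sequence one raise the semantic "density" at the cost of
# type precision.
PATTERNS = {
    "email":   r"[\w.+-]+@[\w-]+\.[\w.]+",
    "phone":   r"\+?\d[\d\s().-]{7,}\d",
    "cap_seq": r"(?:[A-Z][a-z]+\s)+[A-Z][a-z]+",  # candidate names, projects
}

def spot_entities(text):
    found = []
    for etype, pattern in PATTERNS.items():
        for m in re.finditer(pattern, text):
            found.append((etype, m.group().strip()))
    return found

def semantic_density(text):
    """Entities found per word -- a rough proxy for the 'density' notion."""
    words = len(text.split())
    return len(spot_entities(text)) / max(words, 1)

msg = "Meet Alice Smith for lunch, mail her at alice@example.com"
print(spot_entities(msg))
```

A short tweet with no match under the strict patterns still may yield a capitalized sequence, which is exactly the density gain that looser typing provides.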
B. Physical Context Sensing and Integration
The idea for the integration of physical context is to map
semantic entities extracted as described in Section A to
physical context entities comprising location, people and their
activities. In the following, we will detail some of those
physical context entities and how they are sensed / detected.
1) Location
Location-based services and related context-aware
technologies have received a fair amount of attention in both
academia and industry in recent years. These services
typically provide information to the user that is adapted to
the user’s current location. An example would be location-based
restaurant recommendations that a user can query from a smart
phone and that are tailored to the user’s current location. The
exact location is typically sensed using the GPS signal
received by the phone (see [8] for details). Indoors, WiFi
access points can also be leveraged to determine a user’s exact
location [9], given that GPS signals cannot always be received
in buildings. Geolocation services like Google Maps [10]
can do a first translation of GPS coordinates to location names
and addresses. Recent location-based social networking sites
like foursquare [11] can be leveraged to further refine the
semantic location because of users self-reporting their semantic
location along with GPS data on these sites.
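As a sketch of this coordinate-to-place translation, a raw GPS fix can be mapped to the nearest self-reported venue (foursquare-style check-ins); the venue list, coordinates, and radius below are invented for illustration:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical venues with self-reported coordinates (check-in style data).
VENUES = [("PARC", 37.4419, -122.1430), ("Cafe Borrone", 37.4530, -122.1817)]

def semantic_location(lat, lon, max_radius_m=250):
    """Map a raw GPS fix to the nearest known venue name, if close enough."""
    name, dist = min(((n, haversine_m(lat, lon, vlat, vlon))
                      for n, vlat, vlon in VENUES), key=lambda t: t[1])
    return name if dist <= max_radius_m else None

print(semantic_location(37.4420, -122.1431))  # near the invented PARC entry
```

A production system would query a geolocation service rather than a fixed list, and would fall back to WiFi fingerprints indoors, as noted above.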

2) Person Identity and Proximity
Person identity in the physical context refers to
identification of people through physical devices (or direct
communication with them). Device proximity sensing
techniques using WiFi localization and Bluetooth (e.g., as
described in [12], [13]) can identify nearby people and
potential interactions between the user and these people.
Cross-correlations between ambient audio samples from several
phones will further help refine user location and person
proximity.

Figure 1. An example schema of a personal semantic network (from www.meshin.com)

person/contact that calls a user, thus indicating an offline
communication between the user and that contact. Sensed
person proximity and call information will be added to the PSN
similar to semantic entities from digital content sources. Phone
numbers will be mapped to identities (see section C) and
outgoing/incoming calls will be indexed as specific message
types, while person proximity can be used for “activity event”
sensing (see next subsection).
3) Activity
The term "activity" refers here to what people do as
perceived by deployed sensors (e.g., mobile phone traces). The
detection of an activity is mostly based on a combination or
fusion of several sensor values. For example, a meeting can be
perceived as being linked to a specific location and co-
location/proximity of specific devices/people. There has been a
lot of research related to the detection of basic activities [14],
[15] and aggregating these to higher-level behavior as
described by humans [16]. While there is a wide range of
possible mappings between what people say or refer to in their
digital content and what is perceived by physical sensors, we
want to focus on a basic fusion of contextual cues and a
mapping to what we call "activity event". Essentially, location,
people information and time as sensed by the various methods
mentioned in the previous subsections are combined and create
an event. This activity event then serves as a mapping point to
semantic entities in the PSN, in particular indexed calendar
entries. Besides indexing the physical context information, it
can also serve as trigger for querying further information
related to the current context. For example, the PSN can bundle
information for a spontaneous meeting.
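A minimal sketch of this fusion, assuming a stream of (timestamp, location, nearby-person) sensor samples; the gap threshold and field names are assumptions made for illustration:

```python
from dataclasses import dataclass, field

# Sketch of the "activity event" fusion described above: location, nearby
# people, and time are combined into a single event record that can later
# be matched against indexed calendar entries in the PSN.
@dataclass
class ActivityEvent:
    location: str                 # semantic location (WiFi/GPS derived)
    start: int                    # timestamp of first observation
    end: int                      # timestamp of last observation
    people: set = field(default_factory=set)  # ids sensed nearby

def fuse(samples, gap_s=900):
    """Group (timestamp, location, person_id) samples into activity events.
    A new event starts when the location changes or a >gap_s pause occurs."""
    events = []
    for t, loc, person in sorted(samples):
        cur = events[-1] if events else None
        if cur is None or cur.location != loc or t - cur.end > gap_s:
            cur = ActivityEvent(location=loc, start=t, end=t)
            events.append(cur)
        cur.end = t
        if person:
            cur.people.add(person)
    return events

samples = [(0, "Room 2101", "bob"), (300, "Room 2101", "carol"),
           (5000, "Cafeteria", None)]
for ev in fuse(samples):
    print(ev)
```

Each resulting event bundles location, people, and a time span, which is exactly the tuple that Section C matches against calendar entries.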
Table 1. Example mappings of Content and Context

Mapping          | Content                              | Context
-----------------|--------------------------------------|------------------------------------------
Location         | place name, street address           | GPS location, WiFi location
Person Identity  | email address, Facebook identity,    | phone number, device id
                 | Twitter identity, LinkedIn identity  | (Bluetooth, WiFi)
Events           | personal calendar entries            | GPS location + person proximity/identity
                 |                                      | sensing (WiFi, Bluetooth) + time

C. Mapping and Disambiguation between Content and
Context Data
In order to reliably link digital content and physical context
data, we need to create a mapping between semantic entities (as
extracted from the user’s digital content) and physical entities
(as sensed by the user’s devices). The problem is that some of
these mappings are ambiguous, e.g., sometimes only a first
name is associated with a phone number in the address book
and it is unclear which personal contact in the PSN it may refer
to. In the following, we will detail some possible mappings
and discuss disambiguation strategies.
Table 1 shows mappings between content and physical
context. The chosen connection points are location, person
identity, and events. For location, place names and street
addresses as indexed in the PSN can be translated to GPS
locations using geolocation services [10]. However, GPS
location sensing is not always very precise and does not work
indoors, so visible WiFi networks and their signal strength can
be used to further disambiguate the physical location.
Furthermore, person identity as created and maintained in the
PSN from email address, Facebook, Twitter, LinkedIn, etc.
needs to be linked to phone numbers and device ids (visible
through Bluetooth or WiFi). In order to enable a correct
mapping and disambiguate person identities, person name
normalization through name/nickname dictionaries (e.g., [17])
and dedicated name string distance measures (e.g., Jaro-
Winkler [18]) can be leveraged. Additionally, the user’s digital
content might contain references to physical entities (e.g.,
phone numbers or device information in messages or online
profiles). Parsing and extracting this information (e.g., through
email signature detection [19]) will enable the necessary
mappings and disambiguation between different users. Finally,
events can be synchronized between the user’s digital content
(calendar, scheduled events) and physical context. The user’s
PSN may already have indexed personal calendar entries (e.g.,
from the user’s email). The combination of physical context
sensors GPS location, person identity/proximity, and time will
create a separate stream of perceivable “events”. By leveraging
an adapted distance measure (and thus mapping) between
calendar entries and events perceivable through physical
context, the actual daily schedule of the user can be established
(attended and not attended calendar entries). Additional events
from the physical context that cannot be mapped to calendar
entries using the distance measure are interpreted as unplanned
meetings and will create further event entries in the PSN.
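For the person-name disambiguation step, the Jaro-Winkler measure [18] can be sketched as follows; the example names are illustrative and this is a plain reference implementation, not the paper's actual pipeline:

```python
# Plain implementation of the Jaro and Jaro-Winkler string similarities,
# as commonly used for matching person names across identity sources.
def jaro(s, t):
    if s == t:
        return 1.0
    ls, lt = len(s), len(t)
    match_dist = max(ls, lt) // 2 - 1
    s_matched, t_matched = [False] * ls, [False] * lt
    matches = 0
    for i, c in enumerate(s):                 # find matching characters
        lo, hi = max(0, i - match_dist), min(i + match_dist + 1, lt)
        for j in range(lo, hi):
            if not t_matched[j] and t[j] == c:
                s_matched[i] = t_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    k = transpositions = 0                    # count transposed matches
    for i in range(ls):
        if s_matched[i]:
            while not t_matched[k]:
                k += 1
            if s[i] != t[k]:
                transpositions += 1
            k += 1
    transpositions //= 2
    return (matches / ls + matches / lt
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s, t, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s, t)
    prefix = 0
    for a, b in zip(s, t):
        if a != b or prefix == 4:
            break
        prefix += 1
    return j + prefix * p * (1 - j)

print(jaro_winkler("martha", "marhta"))  # classic example, approx. 0.9611
```

In the mapping scenario above, a contact name from the phone's address book would be compared against normalized person identities in the PSN, and the pair with the highest score above a threshold would be linked.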
III. NAVIGATING THE PERSONAL SEMANTIC NETWORK

The amount of information that an average user needs to
deal with on a daily basis is constantly increasing. As the PSN
attempts to inter-link the user’s communication streams and
physical context, it can be leveraged to tackle two of the major
problems associated with information overload: 1)
remembering and retrieving relevant information from the
enormous amount of accumulated information, and 2) filtering
new incoming information and highlighting what is important.
This section describes how the PSN can be used to enable these
two capabilities. The first capability describes how to derive
the relevance of items stored in the PSN, while the second
capability leverages the PSN in order to assess the appropriate
priority of new incoming pieces of information.
A. Relevance: Help the User Remember / Retrieve
The basic idea for the relevance calculation is to leverage
existing connections between pieces of information through
semantic entities in the user’s PSN. Figure 2 shows the
connections of two documents through entity_1 to entity_4. Each
entity will have a different individual entity weight (idf) and
entity category weight (w) associated with it. Note that the
relevance calculation may also be triggered by a piece of
physical context instead of a document (e.g., the current
location + several person identities). In order to calculate the
relevance, a similarity measure needs to be defined and
calculated.
Figure 2. Illustration of semantic entity based relevance
calculation (two documents connected through entity_1 … entity_4,
each entity carrying weights idf_i and w_i)
Similarity between two pieces of information can be
derived by using the Dice, Jaccard or Cosine coefficients
applied to weighted semantic entity occurrences. The Dice
coefficient has been used in prior work [20] to filter emails
according to email sender/recipient correspondences. We give
the definitions of the different coefficients applied on weighted
entity occurrences below. The similarity metrics take into
account a generic entity weight (w_e, the weight of different entity
groups, e.g., ‘people names’, ‘street addresses’) and the weight
of the individual entities with regard to their occurrences (this
is reflected by using the idf_e value of each entity [21]).
Assuming that we have two pieces of information A and B, we
get:
Dice:

    sim_{Dice}(A,B) = \frac{2 \sum_{e \in A \cap B} idf_e \cdot w_e}
                           {\sum_{e \in A} idf_e \cdot w_e + \sum_{e \in B} idf_e \cdot w_e}

Jaccard:

    sim_{Jaccard}(A,B) = \frac{\sum_{e \in A \cap B} idf_e \cdot w_e}
                              {\sum_{e \in A \cup B} idf_e \cdot w_e}

Cosine:

    sim_{Cosine}(A,B) = \frac{\sum_{e \in A \cap B} (idf_e \cdot w_e)^2}
                             {\sqrt{\sum_{e \in A} (idf_e \cdot w_e)^2 \cdot \sum_{e \in B} (idf_e \cdot w_e)^2}}
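A minimal sketch of the three coefficients over entity sets, assuming a precomputed idf_e · w_e weight per entity; the entity names and weight values below are invented:

```python
import math

# Weighted entity-overlap similarities following the definitions above.
# WEIGHTS maps each entity to its precomputed idf_e * w_e product;
# documents are represented as sets of entities.
WEIGHTS = {"Alice Smith": 2.0, "PARC": 1.5, "Palo Alto": 0.8, "lunch": 0.3}

def w(e):
    return WEIGHTS.get(e, 1.0)  # default weight for unseen entities

def sim_dice(A, B):
    num = 2 * sum(w(e) for e in A & B)
    den = sum(w(e) for e in A) + sum(w(e) for e in B)
    return num / den if den else 0.0

def sim_jaccard(A, B):
    num = sum(w(e) for e in A & B)
    den = sum(w(e) for e in A | B)
    return num / den if den else 0.0

def sim_cosine(A, B):
    num = sum(w(e) ** 2 for e in A & B)
    den = math.sqrt(sum(w(e) ** 2 for e in A) * sum(w(e) ** 2 for e in B))
    return num / den if den else 0.0

doc1 = {"Alice Smith", "PARC", "lunch"}
doc2 = {"Alice Smith", "Palo Alto"}
print(sim_dice(doc1, doc2), sim_jaccard(doc1, doc2), sim_cosine(doc1, doc2))
```

Note that a piece of physical context (e.g., a sensed location plus nearby person identities) can be fed in as an entity set in exactly the same way as a document.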
These similarity metrics can be used interchangeably and
will derive the similarity between two pieces of information or
physical contexts. One piece of information or the current
physical context can be the seed for retrieving all related
information from the PSN. For example, an incoming call
from a contact will trigger the retrieval of all related
information concerning the associated person identity of the
contact in the PSN. The retrieved information can be bundled
and shown on the user’s mobile phone upon request.
B. Priority: Help the User Filter / Focus
The basic idea of the priority calculation is to leverage
stored communication stream and physical context patterns to
derive the priority that the user associates with specific people
and entities. We want to derive the user’s priority by answering
the following questions:
I. Time: With whom/what does the user spend most of her
time?
II. Offline Communication: To whom does the user talk
most?
III. Online Communication: Who/what dominates the user’s
online communications?
Time spent and offline communication can refer to both
physical context cues and digital content. Assume that we
want to derive the numeric priority value Priority(P) of a
person (or entity) P using a numeric time priority T, a numeric
offline communication priority F, and a numeric online
communication priority O. We can then state that

    Priority(P) = MAX[α_1·T(P), α_2·F(P), α_3·O(P)]

T(P), F(P) and O(P) are interpreted as independent factors
here and the different weights α_1, α_2, and α_3 are associated
with them. One way of deriving the individual factors is to
consider ranked lists of all people and entities with regard to
specific criteria that are relevant for calculating T(P), F(P) or
O(P). A criterion can be represented by a sub-factor normalized
to a value between 0.0 and 1.0. For example, a relevant sub-
factor for T(P) could be the rank of a person or semantic entity
with regard to the time the user spends with him/it (in terms of
being the topic or part of scheduled meetings or sensed
physical proximity). Similarly, a relevant sub-factor for F(P) is
the rank derived from the time the user spends talking to a
person. A sub-factor for O(P) is the rank of an entity or person
with regard to being included in messages received, or sent as
reply by the user. Each sub-factor f_i of T(P), F(P), and O(P) is
normalized to a value between 0.0 and 1.0 by:

    f_i(P) = 1.0 − rank_i(P) / #P

where rank_i(P) is the rank of P under criterion i and #P is the
total number of ranked people/entities.

Finally, T(P), F(P), and O(P) are calculated by combining
their individual sub-factors. Each sub-factor f_i is weighted
using individual weights β_1, …, β_n. For example, for T(P), we
get:

    T(P) = MAX[β_1·f_1(P), …, β_i·f_i(P), …, β_n·f_n(P)]

In order to assess the overall priority of a new incoming
piece of information or current physical context, the rank-based
priorities Priority(P_i) for all involved people/entities P_i will be
combined (e.g., by calculating the maximum priority among all
involved entities). For example, the priority calculation will
help the user decide which new emails to read each morning
(depending on the priority of the involved people/entities), or
whether to take an incoming phone call or not (depending on
the caller’s priority).
IV. CONCLUSION

This paper aimed at describing the vision of ubiquitous
semantics, i.e., the extraction of meaning across the user’s
digital content (messages, social network connections, etc.) and
physical context (physical location, person proximity, etc.). To
realize this vision, the paper introduced the concept of the
personal semantic network (PSN) that bridges digital content
and context data. Different methods for automatically
connecting digital content and physical context have been
sketched including semantic entity extraction, physical context
sensing, integration and mapping. The navigation of the
personal semantic network has been illustrated by two core
capabilities: the retrieval of relevant pieces of information, and
the determination of priorities for new incoming information or
physical contexts.
While this paper focused on the “big picture” and overall
vision of a personal semantic network (PSN), there are a
number of practical challenges when realizing this vision.
Efficient storage and access from everywhere is essential for
the PSN to work. While the ideal storage of the PSN is in a
cloud-like architecture, there are still a number of issues to
overcome today related to sufficient network bandwidth and
efficient indexing (when accessing the PSN from a mobile
device). Important questions are whether and which parts of the
PSN could be pre-cached on the mobile device and how offline
access/modifications to the PSN could be handled (if network
is unavailable). Assuming that at least some of the user’s
contacts, family and friends will also index their digital content
and physical contexts in PSNs, sharing semantic entities (in
terms of activity events, person contact info etc.) becomes an
interesting opportunity. New incoming pieces of information
can be automatically suggested to be shared among friends if
they cover shared semantic entities or their relevance to a
friend’s information stream is above a threshold. Similarly,
priority values can be shared among a network of users, e.g.,
should the important contacts of your important contacts be
important to you? While these cross-connections between
many PSNs offer interesting new opportunities, they do raise
major privacy and confidentiality concerns. What if
confidential corporate information is automatically shared
through a PSN network with friends that work at a competing
company? It remains unclear whether the potential benefits to
users will outweigh the risks involved in automated sharing of
information, semantic entities, or priorities, together with the
cost of the necessary risk mitigation strategies.
ACKNOWLEDGMENTS

The author thanks the meshin team (www.meshin.com) for
fruitful discussions and for making the first vision of the
personal semantic network a reality.
REFERENCES

[1] Bush, V. 1945. As we may think. The Atlantic Monthly 176, 1, 101-108.
[2] Whitelaw, C., Kehlenbeck, A., Petrovic, N., & Ungar, L. (2008). Web-
scale named entity recognition. Proceedings of the 17th ACM Conference
on Information and Knowledge Management (CIKM ’08), p. 123. New York,
New York, USA: ACM Press. doi: 10.1145/1458082.1458102.
[3] http://www.franz.com/agraph/allegrograph/ (Retrieved April 27, 2011)
[4] Nadeau, D., & Sekine, S. (2007). A survey of named entity recognition
and classification. Lingvisticae Investigationes, 30(1), 3-26. doi:
10.1075/li.30.1.03nad.
[5] Nadeau, D. (2007). Semi-Supervised Named Entity Recognition:
Learning to Recognize 100 Entity Types with Little Supervision. PhD
thesis. University of Ottawa, Canada.
[6] Nothman, J. (2008). Learning Named Entity Recognition from
Wikipedia. PhD thesis, The University of Sydney, Australia.
[7] Evert, S. (2004). The statistics of word cooccurrences: word pairs and
collocations. PhD thesis. University of Stuttgart, Germany. Retrieved
April 27, 2011, from http://en.scientificcommons.org/19948039.
[8] Hightower, J., Borriello, G. (2001). Location systems for ubiquitous
computing. IEEE Computer, vol.34, no.8, pp.57-66. doi:
10.1109/2.940014
[9] Ho, W., Smailagic, A., Siewiorek, D.P., Faloutsos, C. (2006). An
adaptive two-phase approach to WiFi location sensing. In Proceedings of
the IEEE International Conference on Pervasive Computing and
Communications (PerCom) Workshops.
[10] http://maps.google.com/ (Retrieved April 27, 2011)
[11] https://developer.foursquare.com/ (Retrieved April 27, 2011)
[12] Eagle, N., & Pentland, A. (2004). Social Serendipity: Proximity Sensing
and Cueing. MIT Media Laboratory Technical Note 580, May 2004.
[13] Muehlenbrock, M., Brdiczka, O., Snowdon, D., Meunier, J.-L. (2004).
Learning to detect user activity and availability from a variety of sensor
data. Proceedings of the Second IEEE Annual Conference on Pervasive
Computing and Communications (PerCom), pp. 13-22. doi:
10.1109/PERCOM.2004.1276841
[14] Brdiczka, O., Maisonnasse, J., Reignier, P., & Crowley, J. L. (2006).
Learning Individual Roles from Video in a Smart Home. Proceedings of
2nd IET International Conference on Intelligent Environments, Athens,
Greece (Vol. 1, pp. 61-69). doi: http://dx.doi.org/10.1049/cp:20060625.
[15] Shen, J. (2009). Activity Detection in Desktop Environments. PhD thesis,
Oregon State University, USA.
[16] Brdiczka, O. (2010). Integral framework for acquiring and evolving
situations in smart environments. JAISE 2(2): 91-108
[17] http://www.behindthename.com/ (Retrieved April 27, 2011)
[18] http://en.wikipedia.org/wiki/Jaro-Winkler_distance (Retrieved April 27, 2011)
[19] Carvalho, V., Cohen, W. (2004). Learning to extract signature and reply
lines from email. Proceedings of the Conference on Email and Anti-
Spam.
[20] Cselle, G., Albrecht, K., & Wattenhofer, R. (2007). Buzztrack: topic
detection and tracking in email. Proceedings of IUI, pp. 190-197.
[21] Salton, G. & McGill, M. J. (1983). Introduction to modern information
retrieval. McGraw-Hill. ISBN 0070544840.