Specification of uTagit - A Time-based Video Tagging System

Version: 1.0
Created by: Yuk Hui
Date of Creation: 11th June, 2009















Metadata Project, Goldsmiths, University of London, supported by The Leverhulme Trust


Disclaimer: All images used in this document are produced by Brigitte Kaltenbacher.

Table of Contents

1. Software
   1.1. Technical Objectives
   1.2. Theoretical Exploration
2. Users
   2.1. Registration
   2.2. Login
   2.3. Permission
   2.4. Bookmark
3. Video
   3.1. Tagging
      3.1.1. Type of Tags
      3.1.2. Structure of Tags
      3.1.3. Relevance
      3.1.4. Control Variables
      3.1.5. Tagging Mode 1
      3.1.6. Tagging Mode 2
      3.1.7. Tagging Mode 3
      3.1.8. Tagging Experience
   3.2. Import Video
      3.2.1. Link Validity
      3.2.2. Collision of Links
      3.2.3. Control Variables
      3.2.4. Comments
      3.2.5. Information Importation
   3.3. Visual Display
      3.3.1. Colour Grid Visualization
      3.3.2. Tag Cloud Visualization
      3.3.3. Wave Display
   3.4. Tag Editing
      3.4.1. Modifying Tags
      3.4.2. Supporting Tags
      3.4.3. Deleting Tags
4. Video Processing
   4.1. Segmentation
   4.2. Display of Segments
   4.3. Exportation
5. Searching
   5.1. Searching Criteria
   5.2. Sorting Criteria
6. Outlook
   6.1. Index Page
   6.2. Login Page
   6.3. User Page
   6.4. Searching Page
   6.5. Sorting Page
   6.6. Video Page
7. Future Development

1. Software

1.1. Technical Objectives

The rapid increase in information production demands more advanced ways of organizing information. We have already witnessed the semantic web movement, which advocates further annotation of web content according to different ontologies, so that the semantic meaning of this content can be interpreted and inferred by AI agents. With annotation there is always a recursive problem concerning the granularity of information analysis. Even though ontologies have been developed to describe a video or audio file, the content of the video itself remains underexplored. A simple example: even though there is a lot of information (including MPEG-7 metadata) about a two-hour-long video, the user is still unable to find the part he or she is looking for without watching most of the video. This problem is exacerbated when one has to deal with a whole repertoire of videos.


The project is initiated by the Tate (London)’s difficulty in dealing w
ith their videos.
The hours long videos are open to public, while the usage is relatively low due to the
lack of information for users, which causes difficulty in browsing. (In this case,
Youtube becomes a better alternative, since all videos are minutes l
ong with richer
annotation)


The software aims to solve this technical problem by proposing time-based tagging, which:

(1) allows users to tag the video at different time instances;

(2) lets users interact with tags (for example, negotiation and deletion) as a form of social computing, for accuracy as well as common understanding;

(3) makes it possible to extract more meaningful information and relations through normalization of vocabularies and regulation of relational predicates;

(4) creates portable tag and video metadata, so that users can bookmark videos/video segments and exchange these metadata with other platforms.

* The project is influenced by the Japanese video-sharing website NicoNico Douga; unlike NicoNico Douga, this project aims to investigate further granularity of annotation, time-based experience, and data portability.


1.2. Theoretical Exploration

Technically speaking, the software is an investigation of tagging as a cognitive activity and as an alternative to the formalized, taxonomical way of computation. The current understanding of tagging falls into one of three categories. The first comes from the media theorist Clay Shirky: in Shirky's view, tagging or folksonomy is a bottom-up movement, opposed to the ontology-driven information-retrieval paradigm advocated by the semantic web movement. Ontology is perceived by Shirky (2005) as a hierarchical power structure of computation, which becomes inefficient in the age of social computing characterized by "grassroots movements", or under other titles such as "web 2.0". The second view is from Teil and Latour (2003), sociologists who share with Shirky the belief that the closed-world assumption, that is, the fixed and formal rules of computation, falls short on flexibility, while free indexing has better computational power through the association of keywords. This idea derives from David Hume's association of ideas: resemblance, contiguity, and causality. The third view is a twisted version of David Lewis's concept of "social convention"; the idea of social convention attempts to explain why certain users share the same understanding of indexing (Halpin 2006). For example, Chinese people may tend to index a video with a word that is nevertheless different from the one American people would use. These three views have become the dominant discourse on tagging, while the question that remains unasked is precisely: what is a tag?

A tag, in our understanding, is a cognitive process. A tag actually corresponds to two activities, in the terminology of Edmund Husserl: noesis and noema. Noesis is the act of consciousness; the noema is the correlate of that act. For example, a particular scene of a video reminds me of a certain thing, such as a smell or a colour, just as the smell of cake triggered Proust's childhood memory; there is a change in intentionality from one to the other. Understanding the tag in terms of phenomenology has several significant implications: (1) if tagging corresponds to an intentional activity, then the software can suggest predicates to the user, which can help to calculate the relativity of the tags. For example, the triple "video moment + remind_me_of + God Father" further suggests the relation to "God Father" (this can be someone's godfather, as well as the movie The Godfather); (2) if a tag is thus considered a judgment, it corresponds to a logic based on subjective understanding, namely "intentional logic" (as opposed to external logic) in philosophy; the former refers to Husserlian logic, the latter to Fregean logic; (3) if an intentional logic is taken into account, we will be able to understand the concept of social convention in a broader aspect, namely the idea of intersubjectivity.

A tag is also a negotiation process, which extends the understanding of tagging as a cognitive process to an intersubjective understanding based not on convention but on agreement. This agreement-based computing is key to the concept of social computation, which is yet to be explored and theorized.

The above theoretical exploration of tagging does not mean to reduce this software to a theoretical footnote; rather, it is an attempt to push the concept of tagging towards more fruitful practices.




2. Users

Users are defined as those who register for the service; each user will be provided with a login name and password.

OpenID and open authentication will be considered at a later stage.


2.1. Registration

The registration information may vary from one website to another according to the demands of the (video) service providers. Some video service providers may already have a membership database which can be merged with the tagging service. At a minimum, a user must provide a valid email address for contact.


2.2. Login

It is suggested to use OpenID login: the user can obtain an OpenID URL from an OpenID provider (e.g. Google, Yahoo!, etc.). Authentication will be done by sending a request to the identity provider. The system will be able to get the registration information from the identity provider, without the tedious web-form-filling activity.


2.3. Permission

Each user will have a permission setting, which limits the user from performing certain activities. This restriction mainly concerns tagging; for example, other users are not allowed to modify or delete the tags provided by the user who imports/owns the video. (These are not time-based tags, but tags related to the video object in general.) For more information on tagging permissions, please refer to Section 3.1.1 (Type of Tags) and Section 3.4 (Tag Editing).

2.4. Bookmark

The user will have a unique bookmark list. When a video is imported, it will automatically be bookmarked for the user. The user can also choose to bookmark any other video.




3. Video

3.1. Tagging

3.1.1. Type of Tags

There are two types of tags. The FIRST type is tags provided by the user when he/she imports the video (in the case of Tate, the staff member who uploads the video to the website). These tags are considered an objective description of the video; they also help users to search for the video after it is imported/uploaded but before anyone has done any time-based tagging. This type of tag will be referred to as User Tags hereafter.

The SECOND type is time-based tags from all users regarding a particular moment. This type of tag will be referred to as Time-Based Tags hereafter.

There will be a MAXIMUM of 6 time-based tags for a particular moment. The purpose of this restriction is to encourage users to negotiate accurate and meaningful tags. For the mechanism of negotiation or deletion, please refer to Section 3.4 (Tag Editing).


3.1.2. Structure of Tags

A tag is related to the video instance by a triple structure, for example: The moment of video instance at time 2:35 + has + Heidegger. This is a basic structure of First Order Logic (FOL). We can break the tag structure into three parts, named Subject, Predicate, and Object. These names are assigned with reference to the common vocabularies of ontology design, in order to avoid misunderstanding (as we know, in Latin, "predicate" refers to the accident of the substance, which is "Heidegger" in the above example). So in this case:

Subject: The moment of video instance at time 2:35
Predicate: has
Object: Heidegger

In ontology design (refer to software like Protégé), a predicate such as "is_about" is understood as a relational function. Variation in the predicate function indicates different relations between the object and the subject, hence different senses/meanings of the tags. Another example would be "The moment of video instance at time 2:35 + is + Beautiful". "Has" and "is" are two fundamental types of relations in FOL; variation in these predicates can generate richer semantic meaning in the tags, for example "The moment of video instance at time 2:35 + remind_me_of + Heidegger".
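The triple structure above can be sketched as a simple data type (a minimal sketch; the class and field names are illustrative and not part of the uTagit API, and they fold in the relevance value of Section 3.1.3 and the creator control variable of Section 3.1.4):

```python
from dataclasses import dataclass

@dataclass
class TimeBasedTag:
    """One time-based tag, stored as a Subject-Predicate-Object triple."""
    video_id: str       # identifies the video instance
    time: float         # moment in seconds; the Subject is (video_id, time)
    predicate: str      # relational function, e.g. "is", "has", "remind_me_of"
    obj: str            # the Object of the triple, e.g. "Heidegger"
    relevance: int = 0  # 1-10; 0 means the user did not specify it (see 3.1.3)
    creator: str = ""   # control variable: id of the user who created the tag

# "The moment of video instance at time 2:35 + has + Heidegger"
tag = TimeBasedTag("tate-042", 155.0, "has", "Heidegger", relevance=8)
```

Keeping the predicate as its own field (rather than folding it into the tag text) is what allows the three tagging modes of Sections 3.1.5-3.1.7 to differ only in how the predicate is chosen.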


3.1.3. Relevance

A tag also has a relevance value, ranging from 1 to 10, indicating how relevant the tag is to that moment. It may be trivial when the user tags something factual, such as "Heidegger" when Heidegger appears on the screen, but it becomes useful when the user has to express something less sharp than a fact, for example "beautiful", "boring", etc.

The user can use a vertical slider to adjust the relevance after he/she enters the tag.

The following prototypes of tagging modes are an attempt to find out which is the best way of tagging, combining richness of semantic meaning with user experience.


3.1.4. Control Variables

Some control variables are proposed to enable further control functions, for example tracing user behaviour, as well as deducing relevance:

1) Creator (the id of the user who created the tag)
2) List of users who modified it
3) Click rate
4) Supported rate (see Section 3.4.2)



3.1.5. Tagging Mode 1

The first tagging prototype requests only simple, free indexing. The user does not have to choose a predicate function; the predicate function defaults to "is_". The user will also be requested to set the relevance of the tag to the video; it ranges from 1 to 10, and the default value is 0, which means the user has not specified the relevance of the tag.





(Fig. 001)

3.1.6. Tagging Mode 2

In this mode, the user is requested to choose the predicate function. These predicate functions are coded as "Tag2" in the uTagit API. The admin will have to create predicate functions as controlled vocabularies. The purpose of restricting them to controlled vocabularies is to avoid analysis of natural language, which is not the focus of this project. So, for example, the admin can limit the controlled vocabulary to two categories: is_ and has_. Users are requested to choose the predicate function when they tag.

The user will also be requested to set the relevance of the tag to the video; it ranges from 1 to 10, and the default value is 0, which means the user has not specified the relevance of the tag.





(Fig. 002)

3.1.7. Tagging Mode 3

In this mode, the user is allowed to input the predicate function. The format of the predicate function will be words separated by underscores, for example remind_me_of; the predicate should not be longer than 20 characters, including the underscores.

The user will also be requested to set the relevance of the tag to the video; it ranges from 1 to 10, and the default value is 0, which means the user has not specified the relevance of the tag.
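The format rule above can be checked with a small validation function (a sketch; the function name and the exact character class are illustrative assumptions, since the specification only fixes the underscore separator and the 20-character limit):

```python
import re

# Words of lowercase letters joined by single underscores, e.g. remind_me_of.
PREDICATE_RE = re.compile(r"^[a-z]+(_[a-z]+)*$")

def is_valid_predicate(predicate: str) -> bool:
    """True if the user-entered predicate follows the Mode 3 format rule:
    underscore-separated words, at most 20 characters in total."""
    return len(predicate) <= 20 and bool(PREDICATE_RE.match(predicate))
```

For example, is_valid_predicate("remind_me_of") accepts, while a 30-character predicate or one containing spaces is rejected.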

3.1.8. Tagging Experience

a) Play: the video will keep playing when the user presses the tag button, in order not to interrupt the user's experience. (This is undecided; it may work better the other way round.)

b) Hints: the interface will give hints to the user according to what he types. For example, when the user starts typing "Hei", the system will suggest vocabularies such as "Heidegger", "Heidelberg", etc.

The hints for the "Object" in Tagging Mode 1 (Section 3.1.5) can be produced by searching the table of tags using the API. The hints for the "Predicate" in Tagging Mode 3 (Section 3.1.7) can be produced by querying the Tag2 table using the API.
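The hint mechanism can be approximated by a case-insensitive prefix match over the existing vocabulary (a sketch; in the real system the candidates would come from the tags or Tag2 tables via the API, and the function name is illustrative):

```python
def suggest(prefix: str, vocabulary: list[str], limit: int = 5) -> list[str]:
    """Return up to `limit` vocabulary entries that start with `prefix`,
    ignoring case, in alphabetical order."""
    p = prefix.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(p))[:limit]

vocab = ["Heidegger", "Heidelberg", "Husserl", "Hegel"]
# suggest("Hei", vocab) -> ["Heidegger", "Heidelberg"]
```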



3.2. Import Video

When the user imports or uploads a video, he/she is prompted to provide a link. The link will be checked for validity. Once the link is proved to be a video object, the user will be prompted to enter:

a) Description of the video

b) User Tags (see Section 3.1.1)

c) Title (the title of the video should be imported as a default, but the user will be able to modify it)


3.2.1. Link Validity

The system will be able to check the validity of the link. There may be different sources of links, for example Vimeo, YouTube, etc.


3.2.2. Collision of Links

Users may import the same link, so the system will have to check for duplicate links. It is possible to identify the same source from the same website, but it is less feasible to identify the same video across different websites. In the former case, the user will be notified of the existing video, while the video will still appear in his bookmark list.
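Same-site duplicates can be caught by normalizing each link to a canonical (site, video-id) pair before comparison (a sketch; the URL patterns below are simplified assumptions about YouTube and Vimeo link shapes, not an exhaustive parser):

```python
import re
from urllib.parse import urlparse, parse_qs

def canonical_video_key(url: str):
    """Reduce a video link to a (site, id) pair for duplicate detection.
    Returns None for unknown sources (fall back to comparing raw URLs)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.")
    if host in ("youtube.com", "m.youtube.com"):
        vid = parse_qs(parsed.query).get("v", [None])[0]
        return ("youtube", vid) if vid else None
    if host == "youtu.be":              # short links carry the id in the path
        return ("youtube", parsed.path.lstrip("/"))
    if host == "vimeo.com":             # numeric id in the path
        m = re.fullmatch(r"/(\d+)", parsed.path)
        return ("vimeo", m.group(1)) if m else None
    return None
```

Two imports collide exactly when their canonical keys are equal and not None; the cross-site case remains open, as the section notes.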



3.2.3. Control Variables

Some control variables will be added to the video for further searching, sorting, or other control functions. The following variables are recommended:

1) Uploaded timestamp
2) Modified timestamp
3) View rate
4) Modified rate
5) Bookmark rate
6) List of users who modified the video
7) List of users who bookmarked the video



3.2.4. Comments

Users are allowed to leave comments on the video, as a space for negotiation.


3.2.5. Information Importation

The following data are suggested for import:

a) Length of the video
b) Title of the video
c) Source of the video

For future development, the system should be able to import not only the link and title, but also other information related to the video. For example, most video-sharing websites already have public APIs for accessing the information related to a video ID, including tags, timestamps, comments, descriptions, the creator, etc. This information can be useful for the integration of data.

In the case of Tate, the staff member who uploads the video will have to specify this information; any further information required may be subject to special needs.

3.3. Visual Display

3.3.1. Colour Grid Visualization

a) Visualization scheme proposed by Brigitte

(Fig. 003)

As illustrated in the graph above, the rows represent the time frame (x-axis) and the columns represent the tags. Each grid cell is one tag, and the colour code represents the relevance of the tag as indicated by the user.


b) Visualization scheme proposed by Darren

The other way of dealing with it is to re-organize the grid so that each horizontal line of cells corresponds to one tag and the variation of its relevance (in terms of colour). This may run up to hundreds of tags, which means a huge table of cells; one way of solving this problem is to have an AJAX function (or something similar) automatically select the related tags while the user is typing (like the hint service for tagging; refer to Section 3.1.8).

In EITHER case, the user will be able to press a grid cell to be taken to the indicated timestamp of the video, which starts playing from there.



3.3.2. Tag Cloud Visualization

a) ON Screen Display (default)

(Fig. 004)

The default mode is the ON screen display, which displays the tags on the video as the video plays. Before the user starts playing the video, all tags will be displayed as shown in the illustration above.

b) OFF Screen Display

The user can choose the OFF screen display, in which tags are shown in the form of a tag cloud.

The size of a tag in the tag cloud is determined by the frequency of the tag's appearance in the video PLUS the overall relevance of the tag.


The suggested simple algorithm:

function tag_cloud_weights:
    for each tag:
        sort its occurrences into Array_Tag by descending relevance
        weight = 0
        for each position i in Array_Tag (i = 1, 2, ...):
            weight = weight + Array_Tag[i] ^ i
        add weight for the tag to the tag cloud

The above algorithm collects, for each tag, its occurrences in the video (for example, the tag Heidegger may occur five times) in descending order of relevance; the contribution of each occurrence decreases exponentially, and the sum of the contributions indicates the importance of the tag in the tag cloud. Consider two tags, Heidegger [0.7, 0.6, 0.5, 0.2] and Husserl [0.9, 0.7]: weight[Heidegger] = 0.7 + 0.36 + 0.125 + 0.0016 = 1.1866, and weight[Husserl] = 0.9 + 0.49 = 1.39. Even though the tag Heidegger occurs more frequently, it has a lower weight than Husserl. As another case, consider Dreyfus [0.8, 0.6]: weight[Dreyfus] = 0.8 + 0.36 = 1.16, which is lower than Heidegger's weight.

A more advanced or optimal algorithm may be employed depending on need.
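The weighting can be written out directly (a sketch implementing the algorithm above; the function name is illustrative):

```python
def tag_cloud_weight(relevances: list[float]) -> float:
    """Weight of one tag in the tag cloud: sort its occurrence relevances
    in descending order and sum r_i ** i for positions i = 1, 2, ..."""
    ordered = sorted(relevances, reverse=True)
    return sum(r ** i for i, r in enumerate(ordered, start=1))

heidegger = tag_cloud_weight([0.7, 0.6, 0.5, 0.2])  # about 1.1866
husserl = tag_cloud_weight([0.9, 0.7])              # about 1.39
```

Because the exponent grows with the position, later (less relevant) occurrences contribute almost nothing, which is what lets the twice-occurring Husserl outweigh the four-times-occurring Heidegger.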






(Fig. 005)




The user can mouse over a tag to access its starting points (considering that a tag may appear more than once in the video).



3.3.3. Wave Display

Curves are used to display the variation of the relevance of tags over time. ONLY the 6 most supported tags can be displayed in wave form (for supported tags, please refer to Section 3.4.2). The user can browse over the wave to access the play point.

(Fig. 006)



3.4. Tag Editing

3.4.1. Modifying Tags

There are two types of tags (see Section 3.1.1). User Tags can only be modified by the user who imported or uploaded the video; other users are not allowed to modify these tags.

All users can modify Time-Based Tags. When the user puts the mouse over a tag on the screen, the following will be displayed:

a) number of supports
b) supporting function
c) deleting function

(Fig. 007)

3.4.2. Supporting Tags

Supporting tags is a mechanism which allows users to express a preference for certain tags. That a tag is supported does not mean that its relevance is higher; it only means that the tag is supported by other users. The 6 most supported tags will be displayed as waveforms. The supported tags will be ranked according to the number of supports they have received.

(Fig. 008)



3.4.3. Deleting Tags

The maximum number of tags for a time frame is 6. When there are already 6 tags, the user will have to delete a LEAST supported tag in order to input a new tag.

(Fig. 009)
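The 6-tag limit can be expressed as a small replacement rule (a sketch; the support counts are assumed to come from the supporting mechanism of Section 3.4.2, and the sketch removes the least supported tag automatically, whereas in the interface the user performs this deletion):

```python
MAX_TAGS_PER_MOMENT = 6

def add_tag(tags: dict[str, int], new_tag: str) -> dict[str, int]:
    """`tags` maps tag text -> number of supports for one time frame.
    If the frame already holds 6 tags, a least supported one must make way
    before the new tag (starting with 0 supports) can be added."""
    if len(tags) >= MAX_TAGS_PER_MOMENT:
        least = min(tags, key=tags.get)  # a least supported tag
        del tags[least]
    tags[new_tag] = 0
    return tags
```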


4. Video Processing

4.1. Segmentation

The user can choose the "Segmentation" editing tool. The segmentation tool allows users to extend a tag to other consecutive time frames.

On entering segmentation mode, the user will have to choose a tag from the Colour Grid Visualization (Section 3.3.1), and then slide the arrows to locate the starting point and ending point. While the user is sliding, the screen of the video should shift as well. Then the user will have to confirm the segmentation. The video segment will inherit all the metadata tagged by others.

Problem: if we adopt the colour grid visualization proposed by Darren, we will have to think about where this segmentation is going to happen.

(Fig. 010)

4.2. Display of Segments

The segments will be displayed either at the side or at the bottom of the webpage. Each segment will be titled according to the tag which the user intended to extend. A segment can represent a unique concept, for example a segment created where Heidegger is explaining the idea of "Gestell" in the video. This shares some similarities with Lignes de Temps.

The user can click into a segment, and it will display all the metadata it inherits. New metadata entered after the segment was created can also be displayed, since the segment does not create a new video instance, but only indicates a starting point and an ending point.

The user will be able to bookmark segments.



4.3. Exportation

Exportation of data will be considered at a later stage, to enhance data portability. One suggestion is to develop an ontology of tags, which does not mean formalizing tags, but allowing users to carry these tags across different platforms. It may be worth having a working group on this area soon.

5. Searching

5.1. Searching Criteria

The user will be able to search according to three criteria:

a) Search in all text information: this includes Titles + User Tags + Time-Based Tags

b) Search tags with relevance: this includes all Time-Based Tags + relevance; the default value is 0, which does not mean the user is searching for tags with 0 relevance, but defaults to the highest relevance

c) Search title only: only Titles




(Fig. 011)

5.2. Sorting Criteria

The following sorting criteria will be considered:

a) Title (in alphabetical order)
b) User Tag (in order of relevance)
c) Time-Based Tag (in order of view rate of the video)
d) Latest (sorted by timestamp of modification: uploaded time, modified timestamp)
e) Relevance
f) View rate

Depending on the searching criteria above, different sorting schemes will be available (since some of them are not applicable; e.g. you cannot sort by tag when you are searching for a tag, you can only sort it by relevance):

a) Search in all text information: the user will be able to sort by 1) Title; 2) User Tag; 3) Time-Based Tag; 4) Latest; 5) Relevance; 6) View rate.

b) Search tags with relevance X: the user will be able to sort by 1) Tag (descending from relevance X); 2) Latest (videos which were modified/uploaded and have the specific tag); 3) Relevance (by reversing the order of relevance); 4) View rate (click rate or supported rate of the tag).

c) Search title only: 1) Title; 2) Latest; 3) View rate.


6. Outlook

For a more detailed outlook of the pages, please refer to the abstraction produced by Brigitte Kaltenbacher:

http://sites.google.com/site/utagit/Home/interface-development
(video tagging 2_5.ppt, 1570k, uploaded on 31 Mar 2009 22:27 by Brigitte Kaltenbacher)


6.1. Index Page

The main page should have the following functions:

a) Displaying the most recently tagged videos
b) Displaying a list of recently added videos
c) A login function
d) A searching function


6.2. Login Page

The user will be prompted to enter the login information; if verified, the user will be redirected to the User Page. Once the user has logged in, a logout option is always available.


6.3. User Page

The user page will have the following basic functions:

a) Viewing the bookmarked videos
b) Searching options
c) A list of recommendations related to the user's activities


6.4. Searching Page

Please refer to Section 5; the user will be redirected to the sorting page.

6.5. Sorting Page

Please refer to Section 5.



6.6. Video Page

The following basic functions will be displayed:

a) Bookmarking
b) Video mode (with player)
c) On/Off screen tag cloud (Time-Based Tags)
d) Video description (creator, description, length, User Tags)
e) Colour Grid Visualization
f) Segmentation function
g) Recommended videos (related to the video; the user may be able to choose the relations: most supported tags; related User Tags; maximum number of matched tags in descending order)

(Fig. 012)



7. Future Development

For the future, the following technical developments are suggested:

a) The relevance of videos according to user behaviours. We have reserved some control variables to identify the interaction between the user and the data. An interesting, yet underexplored, topic in the industry is the relation between habit and identity. There have been many works on virtual/electronic identity, but rarely is there research on habit/context and user identity. The time-based tagging system will allow us to explore the tastes and habits of the user, because we allow emotional tags as well as factual tags.

b) There is currently much research on the relation between tagging and image recognition; it may become possible to identify the user's interests not through comparison of similarity, but through recognition of the object the user is referring to.

c) The portability of Time-Based Tags, that is: how can we share Time-Based Tags with other websites? We may consider developing an ontology of time-based metadata. This does not mean a controlled vocabulary, but a data format which allows these data to be exported, imported, merged, and manipulated. This ontology would be represented in RDF, in order to merge with the semantic web, since an ontology of tags is unthinkable in the current understanding of tags in the semantic web community, where tags are simply reduced to a has_tag() function.

d) Development of the API with OAuth, to allow open authentication of API access, so that other social-networking websites can exchange data with us. For example, OAuth allows us to exchange information with Twitter and other social-networking websites.
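Point c) can be illustrated by serializing a time-based tag as an RDF-style triple (a sketch; the namespace URI, the helper function, and the use of a #t= fragment to address a moment, borrowed from the W3C Media Fragments convention, are illustrative assumptions rather than a finished ontology):

```python
def tag_to_ntriple(video_url: str, seconds: int, predicate: str, obj: str) -> str:
    """Serialize one time-based tag as a single N-Triples line.
    The subject addresses a moment of the video via a #t= fragment,
    so the predicate itself (not just has_tag) carries the relation."""
    subject = f"<{video_url}#t={seconds}>"
    pred = f"<http://utagit.example/ns#{predicate}>"  # assumed namespace
    return f'{subject} {pred} "{obj}" .'

line = tag_to_ntriple("http://example.org/video/42", 155, "remind_me_of", "Heidegger")
# -> '<http://example.org/video/42#t=155> <http://utagit.example/ns#remind_me_of> "Heidegger" .'
```

Because each predicate becomes its own RDF property instead of a single has_tag() function, the exported data preserves the triple structure of Section 3.1.2 when merged with other platforms.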