Understanding the Semantics of Media



Lecture Notes on Video Search & Mining, Spring 2012

Presented by Jun Hee Yoo

Biointelligence Laboratory
School of Computer Science and Engineering
Seoul National University


http://bi.snu.ac.kr


Semantic Understanding


There are some tools which attempt to segment video at
a higher level.


But this level of analysis does not tell us much about
the meaning represented in the media.


Problem Statement

© 2012, SNU CSE Biointelligence Lab., http://bi.snu.ac.kr


Approach


Segmentation Literature


Use latent semantic indexing (LSI) because it allows us to quantify the position of a portion of the document in a multi-dimensional semantic space.


Propose to summarize the text with LSI and analyze the
signal with smooth Gaussians.


Semantic Retrieval Literature


Use mixtures of probability experts for semantic-audio retrieval (MPESAR), a more sophisticated model connecting words and media.


Analysis Tools


SVD


Reduces the dimensionality of a signal in a manner that is optimal in a least-squares sense.


Used here to reduce the dimensionality of both audio and video image data.
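The least-squares property can be illustrated with a small sketch. It assumes NumPy, uses synthetic random data as a stand-in for real audio or image features, and the name `reduce_dimensions` is hypothetical rather than taken from the lecture. Keeping the top k singular directions leaves a reconstruction error equal to the root sum of squares of the discarded singular values.

```python
import numpy as np

def reduce_dimensions(data, k):
    """Project the rows of `data` onto their top-k singular directions."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    # Thin SVD of the centered data: data - mean = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    coords = u[:, :k] * s[:k]            # k-dimensional coordinates per row
    approx = coords @ vt[:k] + mean      # best rank-k reconstruction
    return coords, approx, s

# Synthetic stand-in for 50 feature vectors of dimension 8 (not lecture data).
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))

coords, approx, s = reduce_dimensions(frames, k=3)
error = np.linalg.norm(frames - approx)          # Frobenius norm of the residual
expected_error = np.sqrt(np.sum(s[3:] ** 2))     # energy in discarded directions
```

The two error values agree up to rounding, which is what "optimal in a least-squares sense" means for a rank-k approximation.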


Color Space


$\mathrm{Bin} = 64\,\lfloor \log_2 R \rfloor + 8\,\lfloor \log_2 G \rfloor + \lfloor \log_2 B \rfloor$


Concatenate into 512 histogram bins.
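A minimal sketch of this binning, assuming 8-bit RGB values and NumPy. The slide does not say how a channel value of 0 is handled (its logarithm is undefined), so mapping 0 to level 0 is an assumption here. The names `LOG2_FLOOR`, `color_bin`, and `color_histogram` are hypothetical.

```python
import numpy as np

# floor(log2(v)) for v = 1..255 via integer bit length (exact, no float rounding).
# Assumption: a channel value of 0 is mapped to level 0.
LOG2_FLOOR = np.array([0] + [v.bit_length() - 1 for v in range(1, 256)])

def color_bin(r, g, b):
    """Bin index in 0..511 for a single 8-bit RGB pixel."""
    return int(64 * LOG2_FLOOR[r] + 8 * LOG2_FLOOR[g] + LOG2_FLOOR[b])

def color_histogram(frame):
    """512-bin histogram for an (height, width, 3) uint8 RGB frame."""
    levels = LOG2_FLOOR[np.asarray(frame, dtype=np.uint8)]   # 8 levels per channel
    bins = 64 * levels[..., 0] + 8 * levels[..., 1] + levels[..., 2]
    return np.bincount(bins.ravel(), minlength=512)

# Tiny all-white frame as a usage example (not lecture data).
white = np.full((2, 2, 3), 255, dtype=np.uint8)
hist = color_histogram(white)
```

Each channel contributes 8 logarithmic levels, so the three channels together address 8 x 8 x 8 = 512 bins.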


Word Space


Using Latent semantic indexing with SVD.


To measure the distance, use the angle between the two vectors:

$\cos\phi = \dfrac{\nu_1 \cdot \nu_2}{\lVert \nu_1 \rVert \, \lVert \nu_2 \rVert}$
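As a rough illustration of the word space, the sketch below builds a toy term-by-document count matrix, projects the documents into a low-dimensional space with SVD, and compares them by the cosine of the angle. The matrix values and the names `lsi_project` and `cosine` are assumptions made for this example, not material from the lecture.

```python
import numpy as np

def lsi_project(term_doc, k):
    """Return one k-dimensional vector per document (columns of term_doc)."""
    u, s, vt = np.linalg.svd(np.asarray(term_doc, dtype=float),
                             full_matrices=False)
    # Document coordinates in the top-k latent dimensions.
    return (s[:k, None] * vt[:k]).T

def cosine(v1, v2):
    """Cosine of the angle between two vectors."""
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# Toy counts: rows are terms, columns are documents.
# Documents 0 and 1 share vocabulary; document 2 uses a different term.
term_doc = np.array([[2, 1, 0],
                     [1, 2, 0],
                     [0, 0, 2]])

docs = lsi_project(term_doc, k=2)
similar = cosine(docs[0], docs[1])      # documents with shared terms
different = cosine(docs[0], docs[2])    # documents with no shared terms
```

In the reduced space the two documents that share terms point in the same direction, while the unrelated document is orthogonal to them.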


Segmenting Video


Temporal Properties of Video


Color:


It provides robust evidence for a shot change in a
video signal.


However, it cannot tell us the global structure of the video.



Random words from a transcript:


The words indicate a lot about the overall structure of
the story.



Segmenting Video


Test Material


CNN Headline News (30min TV show).


21st Century Jet (Documentary).


Use automatic speech recognition (ASR) to provide a transcript of the audio.




Segmenting Video


Scale Space


Convert the original signal into scale space.


In scale space, we analyze a signal with many
different kernels.
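One way to picture this is a stack of copies of the signal, each smoothed by a Gaussian kernel of a different width. The sketch below assumes NumPy, a one-dimensional signal, kernels truncated at three standard deviations, and edge padding; these choices and the names `gaussian_kernel` and `scale_space` are assumptions, not details given in the lecture.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian kernel truncated at 3 standard deviations."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def scale_space(signal, sigmas):
    """Return an array of shape (len(sigmas), len(signal)), one row per scale."""
    signal = np.asarray(signal, dtype=float)
    rows = []
    for sigma in sigmas:
        kernel = gaussian_kernel(sigma)
        radius = len(kernel) // 2
        # Repeat the end values so the output keeps the input length.
        padded = np.pad(signal, radius, mode="edge")
        rows.append(np.convolve(padded, kernel, mode="valid"))
    return np.vstack(rows)

# A single step edge viewed at three scales (synthetic example).
step = np.concatenate([np.zeros(50), np.ones(50)])
space = scale_space(step, sigmas=[1, 4, 16])
steepness = np.abs(np.diff(space, axis=1)).max(axis=1)
```

The edge becomes progressively shallower as the kernel widens, so `steepness` decreases from the finest scale to the coarsest.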




[Figure panels: "Histogram" and "With Low Pass Filter"]

Segmenting Video


Combined Image and Audio Data








Combined color, word, and scale-space analysis. The result is a 20-dimensional vector function of time and scale.



Segmenting Video


Hierarchical Segmentation Results









Color and word autocorrelations for the Boeing 777
video



Segmenting Video


Hierarchical Segmentation Results










Grouping 4-8 sentences produces a larger semantic autocorrelation.
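The effect of grouping can be reproduced on synthetic data. The sketch below is an illustration only: it assumes each sentence is a noisy 10-dimensional vector around a fixed topic direction, and it takes "autocorrelation" to mean the average cosine similarity between vectors a fixed lag apart, which the slide does not define. The names `group_vectors` and `autocorrelation` are hypothetical.

```python
import numpy as np

def group_vectors(vectors, group_size):
    """Sum consecutive runs of `group_size` vectors into one vector each."""
    n_groups = len(vectors) // group_size
    trimmed = vectors[: n_groups * group_size]
    return trimmed.reshape(n_groups, group_size, -1).sum(axis=1)

def autocorrelation(vectors, lag):
    """Mean cosine similarity between vectors that are `lag` steps apart."""
    a, b = vectors[:-lag], vectors[lag:]
    cos = (a * b).sum(axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return float(cos.mean())

# Synthetic "sentences": one shared topic direction plus heavy noise.
rng = np.random.default_rng(0)
topic = np.zeros(10)
topic[0] = 1.0
sentences = topic + rng.normal(scale=1.0, size=(400, 10))

single = autocorrelation(sentences, lag=1)
grouped = autocorrelation(group_vectors(sentences, 8), lag=1)
```

Summing several sentences averages out word-level noise while the shared topic accumulates, so neighboring groups look more alike than neighboring single sentences.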


Segmenting Video


Intermediate Results


A scale-space segmentation algorithm produced a boundary map showing the edges in the signal.
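A boundary map of this kind might be sketched as follows. This is not the lecture's algorithm: it assumes that an edge is a local maximum of the rate of change of a Gaussian-smoothed vector signal, evaluated independently at each scale. The names `smooth` and `boundary_map` and the `min_change` threshold are hypothetical.

```python
import numpy as np

def smooth(signal, sigma):
    """Gaussian-smooth each column of a (time, dims) signal along time."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(signal, ((radius, radius), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, d], kernel, mode="valid")
               for d in range(signal.shape[1])]
    return np.stack(columns, axis=1)

def boundary_map(signal, sigmas, min_change=1e-6):
    """Boolean array (scales, time - 1); True marks a local peak in change."""
    signal = np.asarray(signal, dtype=float)
    rows = []
    for sigma in sigmas:
        smoothed = smooth(signal, sigma)
        # Magnitude of the frame-to-frame change of the smoothed vector.
        change = np.linalg.norm(np.diff(smoothed, axis=0), axis=1)
        peaks = np.zeros(len(change), dtype=bool)
        peaks[1:-1] = ((change[1:-1] > change[:-2])
                       & (change[1:-1] >= change[2:])
                       & (change[1:-1] > min_change))  # skip flat regions
        rows.append(peaks)
    return np.vstack(rows)

# Two-dimensional synthetic signal with one abrupt change at t = 50.
signal = np.zeros((100, 2))
signal[50:] = 1.0
edges = boundary_map(signal, sigmas=[1, 2, 4])
```

For this single step, every scale marks the same boundary between samples 49 and 50; on real data, coarse scales would keep only the strongest boundaries.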


Segmenting Video


A comparison with the ground truth.


Left: estimated result.


Right: ground truth.


Segmenting Video


Shot Boundary Segmentation.


Use a commercial product designed by YesVideo.


Segmenting Video


Manual Segmentation result


Semantic Retrieval



MPESAR process

Semantic Retrieval


Acoustic Signal processing chain





Acoustic to Semantic Lookup



Semantic Retrieval



Testing

Retrieval Results


Histogram of true label ranks based on likelihoods from audio-to-semantic tests

Histogram of true label ranks based on likelihoods from semantic-to-acoustic tests
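
A histogram of true label ranks can be computed from a table of likelihoods as sketched below. The likelihood values are invented for illustration, and the convention that rank 1 means the true label received the highest likelihood is an assumption; the names `true_label_ranks` and `rank_histogram` are hypothetical.

```python
import numpy as np

def true_label_ranks(likelihoods, true_labels):
    """Rank of the true label per test item (1 = highest likelihood)."""
    likelihoods = np.asarray(likelihoods, dtype=float)
    true_labels = np.asarray(true_labels)
    true_scores = likelihoods[np.arange(len(true_labels)), true_labels]
    # Rank = 1 + number of labels scoring strictly higher than the true one.
    return 1 + (likelihoods > true_scores[:, None]).sum(axis=1)

def rank_histogram(ranks, n_labels):
    """Counts of items whose true label landed at rank 1, 2, ..., n_labels."""
    return np.bincount(ranks, minlength=n_labels + 1)[1:]

# Made-up likelihoods: rows are test items, columns are candidate labels.
likelihoods = [[0.7, 0.2, 0.1],
               [0.1, 0.3, 0.6],
               [0.2, 0.5, 0.3]]
true_labels = [0, 1, 2]

ranks = true_label_ranks(likelihoods, true_labels)
hist = rank_histogram(ranks, n_labels=3)
```

A retrieval model that works well concentrates the histogram mass at the low ranks.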