# Understanding the Semantics of Media

Lecture Notes on Video Search & Mining, Spring 2012

Presented by Jun Hee Yoo

Biointelligence Laboratory

School of Computer Science and Engineering

Seoul National University

http://bi.snu.ac.kr

Semantic Understanding

Some tools attempt to segment video at a higher level, but this level of analysis tells us little about the meaning represented in the media.

Problem Statement

Approach

Segmentation Literature

Use latent semantic indexing (LSI) because it allows us to quantify the position of a portion of the document in a multi-dimensional semantic space.

Propose to summarize the text with LSI and analyze the signal with smooth Gaussians.

Semantic Retrieval Literature

Use mixtures of probability experts for semantic-audio retrieval (MPESAR), a more sophisticated model connecting words and media.

Analysis Tools

SVD

Reduces the dimensionality of a signal in a manner that is optimal in a least-squares sense.

Used here to reduce the dimensionality of both the audio and the video image data.
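
As a reminder of what "optimal in a least-squares sense" means, the standard truncated-SVD statement can be written as below. The symbols are assumptions introduced for illustration, not notation from the slides: $X$ is a data matrix whose columns are feature vectors, $\sigma_i$, $u_i$, $v_i$ are its singular values and vectors, and $k$ is the number of dimensions kept.

```latex
X = U \Sigma V^{\top}, \qquad
X_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top}, \qquad
X_k = \arg\min_{\operatorname{rank}(Y) \le k} \lVert X - Y \rVert_F
```

Under these assumptions, the reduced $k$-dimensional representation of the data is $U_k^{\top} X$, where $U_k$ holds the first $k$ columns of $U$.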

Color Space

$$B_i = 64\,\lfloor \log_2 R \rfloor + 8\,\lfloor \log_2 G \rfloor + \lfloor \log_2 B \rfloor$$

Concatenate into 512 histogram bins.
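
The bin formula above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: it assumes 8-bit channel values (0 to 255) and clamps a value of 0 to 1 so that the logarithm is defined. The function names are hypothetical.

```python
def floor_log2(value):
    # Assumption: a channel value of 0 is clamped to 1 so that log2 is defined.
    # For a positive integer n, n.bit_length() - 1 equals floor(log2(n)).
    return max(value, 1).bit_length() - 1


def color_bin(r, g, b):
    # Each channel contributes a level in 0..7, so the index lies in 0..511.
    return 64 * floor_log2(r) + 8 * floor_log2(g) + floor_log2(b)


def color_histogram(pixels):
    # pixels: iterable of (R, G, B) tuples with 8-bit integer channels.
    hist = [0] * 512
    for r, g, b in pixels:
        hist[color_bin(r, g, b)] += 1
    return hist
```

With 8-bit channels each floored logarithm takes one of 8 values, which is where the 8 x 8 x 8 = 512 bins come from.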

Word Space

Uses latent semantic indexing (LSI) with SVD.

To measure the distance between two word vectors, use the angle between them:

$$\cos\phi = \frac{\nu_1 \cdot \nu_2}{\lVert \nu_1 \rVert \, \lVert \nu_2 \rVert}$$
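
The angle measure can be computed directly from two vectors. The sketch below is illustrative only: the function name is hypothetical, and returning 0.0 for a zero-length vector is an assumption, since the slides do not say how that case is handled.

```python
import math


def cosine_similarity(v1, v2):
    # cos(phi) = (v1 . v2) / (|v1| |v2|)
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        # Assumption: treat a zero vector as unrelated to everything.
        return 0.0
    return dot / (norm1 * norm2)
```

Vectors pointing in the same direction give a value near 1 regardless of their length, which is why the angle is preferred over Euclidean distance for comparing text portions of different sizes.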

Segmenting Video

Temporal Properties of Video

Color:

It provides robust evidence for a shot change in a video signal.

However, it cannot tell us the global structure of the video.

Random words from a transcript:

The words indicate a lot about the overall structure of the story.

Test Material

CNN Headline News (30-minute TV show).

21st Century Jet (documentary).

Automatic speech recognition (ASR) provides a transcript of the audio.

Scale Space

Convert the original signal into scale space.

In scale space, we analyze a signal with many different kernels.
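
A minimal sketch of a scale-space representation for a one-dimensional signal follows. It assumes Gaussian low-pass kernels of increasing width, truncated at three standard deviations, with the signal edges handled by repeating the nearest sample. These choices and the function names are illustrative assumptions, not details taken from the slides.

```python
import math


def gaussian_kernel(sigma):
    # Assumption: truncate the kernel at 3 standard deviations.
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # normalize so the weights sum to 1


def smooth(signal, sigma):
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Assumption: clamp indices so edges repeat the nearest sample.
            idx = min(max(t + k - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def scale_space(signal, sigmas):
    # One smoothed copy of the signal per scale (kernel width).
    return {sigma: smooth(signal, sigma) for sigma in sigmas}
```

For example, `scale_space(signal, [1.0, 2.0, 4.0, 8.0])` returns four progressively smoother versions of `signal`. Sharp changes survive only at small scales, while broad structure remains visible at large ones.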

With Low Pass Filter

Histogram

Combined Image and Audio Data

Combine color, word, and scale-space analysis. The result is a 20-dimensional vector function of time and scale.

Hierarchical Segmentation Results

Color and word autocorrelations for the Boeing 777 video.

Grouping 4-8 sentences produces a larger semantic autocorrelation.

Intermediate Results

A scale-space segmentation algorithm produced a boundary map showing the edges in the signal.

Comparison with ground truth.

Left: estimated result.

Right: ground truth.

Shot boundary segmentation.

Uses a commercial product designed by YesVideo.

Manual Segmentation result

Semantic Retrieval

MPESAR process

Acoustic Signal processing chain

Acoustic to Semantic Lookup

Testing

Retrieval Results

Histogram of true label ranks based on likelihoods from audio-to-semantic tests.

Histogram of true label ranks based on likelihoods from semantic-to-acoustic tests.
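
To make the rank histograms concrete, the sketch below shows one way such a histogram could be computed from per-label likelihoods. It is an assumption about the evaluation rather than the procedure used in the lecture: rank 1 means the true label received the highest likelihood, and the function names and data layout are hypothetical.

```python
def true_label_rank(likelihoods, true_label):
    # likelihoods: dict mapping each candidate label to its likelihood.
    # Rank 1 means no other label scored strictly higher than the true one.
    true_score = likelihoods[true_label]
    return 1 + sum(1 for score in likelihoods.values() if score > true_score)


def rank_histogram(trials):
    # trials: iterable of (likelihoods, true_label) pairs, one per test query.
    counts = {}
    for likelihoods, true_label in trials:
        rank = true_label_rank(likelihoods, true_label)
        counts[rank] = counts.get(rank, 0) + 1
    return counts
```

A retrieval model that works well concentrates the histogram mass at low ranks. A model no better than chance spreads it roughly evenly across all ranks.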