
Nov 17, 2013

Simon Winder

Microsoft Research

Image Descriptors

Applications at Microsoft

Augmented reality research


[Figure: Image 1, Image 2, and points P and Q in Descriptor Space]
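
The idea behind the figure can be sketched in a few lines: each patch becomes a point in descriptor space, and two patches are compared by the distance between their points. The sketch below is a minimal illustration, not the method from the talk. The function names, the toy 3-dimensional descriptors, and the ratio test with `max_ratio=0.8` are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(desc1, desc2, max_ratio=0.8):
    """For each descriptor in desc1, find its nearest neighbour in desc2.

    Returns a list of (index_in_desc1, index_in_desc2) pairs.
    """
    matches = []
    if not desc2:
        return matches
    for i, p in enumerate(desc1):
        ranked = sorted((euclidean(p, q), j) for j, q in enumerate(desc2))
        best_dist, best_j = ranked[0]
        # Ratio test (an assumption, not from the slides): keep the match only
        # if the best candidate is clearly closer than the second best.
        if len(ranked) == 1 or best_dist < max_ratio * ranked[1][0]:
            matches.append((i, best_j))
    return matches

# Toy descriptors standing in for patches from Image 1 and Image 2.
image1 = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
image2 = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]]
matches = match_descriptors(image1, image2)
```

A brute-force search like this is quadratic in the number of descriptors, so it only suits small sets; large collections need an index structure.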

“Learning Local Image Descriptors”, CVPR 2007

“Discriminant Embedding for Local Image Descriptors”, ICCV 2007

“Picking the best DAISYs”, CVPR 2009

Gang Hua (now at Nokia), Matthew Brown (now at EPFL)

“A performance evaluation of local descriptors”, Mikolajczyk and Schmid, 2005

“Keypoint recognition using randomized trees”, Lepetit and Fua, 2006

“Task specific local region matching”, Babenko et al., 2007

“PCA-SIFT”, Ke and Sukthankar, 2004

“Vector quantizing feature space with a regular lattice”, Tuytelaars and Schmid, 2007

“A fast local descriptor for dense matching”, Tola et al., 2008


Learn good descriptors

Design representative ground truth training set

Test best descriptor algorithms

Learn optimal parameters

Reduce dimensionality

Reduce bits per dimension

Find algorithms with low computational cost
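
As a rough illustration of the "reduce bits per dimension" goal, the sketch below quantizes each descriptor component to a small number of bits and computes the packed storage size. The slides do not say how the talk's descriptors are quantized, so the uniform quantizer, the value range `[lo, hi]`, and the function names are assumptions made for the example.

```python
def quantize(vec, bits, lo=0.0, hi=1.0):
    """Map each component in [lo, hi] to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    codes = []
    for x in vec:
        x = min(max(x, lo), hi)  # clamp out-of-range values
        codes.append(round((x - lo) / (hi - lo) * levels))
    return codes

def storage_bytes(dims, bits):
    """Bytes needed to pack `dims` components at `bits` bits each."""
    return (dims * bits + 7) // 8

codes = quantize([0.0, 0.3, 0.6, 1.0], bits=2)  # four components, 2 bits each
full_size = storage_bytes(128, 8)               # 128 dimensions at 8 bits
small_size = storage_bytes(64, 4)               # hypothetical 64 dimensions at 4 bits
```

Dimension count and bits per dimension trade off against each other: for example, 32 dimensions at 8 bits and 64 dimensions at 4 bits both pack into 32 bytes. Which combination works best is an empirical question of the kind the learning procedure is meant to answer.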

[Diagram: descriptor pipeline. Labels: Feature Detector, Normalized Image Patches, Algorithm, Descriptor Vectors, Summation, Robust Normalize, PCA Dimension Reduction, Quantize and Compress, 3D Point Cloud, Algorithm Parameters]
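
The diagram does not spell out what the "Robust Normalize" stage does. One common reading, assumed here, is the SIFT-style scheme: scale the vector to unit length, clip large components, then rescale. The sketch below shows that scheme for non-negative histogram values. The function names are hypothetical, and the clip threshold of 0.2 is the value used by SIFT, not one taken from the slides.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def robust_normalize(vec, clip=0.2):
    """Unit-normalize, clip components above `clip`, then renormalize.

    Assumes non-negative components, as in a gradient histogram. Clipping
    limits the influence of a few very large bins.
    """
    unit = l2_normalize(vec)
    clipped = [min(x, clip) for x in unit]
    return l2_normalize(clipped)

raw = [10.0, 1.0, 1.0, 1.0]   # one dominant bin
plain = l2_normalize(raw)
robust = robust_normalize(raw)
```

After robust normalization the dominant bin still has the largest value, but it accounts for a smaller share of the vector than under plain unit normalization.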

[Diagram: parameter learning loop. Labels: Training Pairs, Descriptor Distances, Correct Match %, Incorrect Match %, ROC area, Update Parameters]
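
The ROC area in this loop can be computed directly from descriptor distances. The sketch below uses a standard equivalence: the area under the ROC curve equals the fraction of (matching pair, non-matching pair) combinations in which the matching pair has the smaller distance. The function name and the toy distances are assumptions for the example, and the parameter update step itself is not shown because the slides do not name the optimizer.

```python
def roc_area(match_dists, nonmatch_dists):
    """Area under the ROC curve from two lists of descriptor distances.

    Counts how often a matching pair is closer than a non-matching pair.
    Ties count as half. 1.0 means perfect separation, 0.5 means chance.
    """
    wins = 0.0
    for m in match_dists:
        for n in nonmatch_dists:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(match_dists) * len(nonmatch_dists))

# Toy distances: matching training pairs should mostly be closer together.
match_dists = [0.1, 0.2, 0.4]
nonmatch_dists = [0.3, 0.5, 0.9]
area = roc_area(match_dists, nonmatch_dists)
```

In a learning loop of this kind, the algorithm parameters would be adjusted and the descriptors recomputed until the area stops improving.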

Various tradeoffs for different applications

Minimize error, storage cost, computational cost


We use a low complexity 32 byte descriptor

Half the ROC error of SIFT, which uses 128 bytes


Developed a highly optimized implementation

Face recognition

Gang Hua, Amir Akbarzadeh, ICCV 2009

Robot navigation

Panoramic Stitching

Microsoft ICE and Photo Gallery

http://tinyurl.com/5e99su


Photosynth

3D navigation of unstructured photo collections

Bing maps

Producing realistic 3D city and street views and transitions

Matching of crowd-sourced imagery and panoramas to city views

Bing image search

Descriptors are stored for each thumbnail

Clustering across 5 billion images

Lincoln

http://tinyurl.com/ya9muwk

Matching cellphone photos to products (2007-)

D. Nistér, H. Stewénius, CVPR 2006

Mobile matching to city street-view imagery

G. Schindler, M. Brown, and R. Szeliski, CVPR 2007

Bag of features, learned vocabulary, inverted index approaches
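
A bag-of-features search with an inverted index can be sketched as follows. Each descriptor is mapped to its nearest "visual word" in a vocabulary, the index records which images contain each word, and a query votes for images that share its words. This is a simplified illustration, not the systems cited above. The vocabulary here is fixed by hand instead of learned (for example by clustering), and all names and data are hypothetical.

```python
from collections import Counter, defaultdict

def nearest_word(desc, vocabulary):
    """Index of the vocabulary centre closest to the descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda w: sum((a - b) ** 2 for a, b in zip(desc, vocabulary[w])),
    )

def build_index(database, vocabulary):
    """Map each visual word to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, descriptors in database.items():
        for d in descriptors:
            index[nearest_word(d, vocabulary)].add(image_id)
    return index

def query(descriptors, index, vocabulary):
    """Rank images by how many query descriptors fall on a word they contain."""
    votes = Counter()
    for d in descriptors:
        for image_id in index.get(nearest_word(d, vocabulary), ()):
            votes[image_id] += 1
    return votes.most_common()

# Hand-picked 2-D vocabulary and a two-image database, for illustration only.
vocabulary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
database = {
    "street": [[0.9, 0.1], [0.1, 0.0]],
    "product": [[0.1, 0.9]],
}
index = build_index(database, vocabulary)
ranked = query([[1.0, 0.1], [0.0, 0.1]], index, vocabulary)
```

The inverted index means a query only touches images that share at least one visual word with it, which is what makes this approach practical for large collections.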

Augmented reality

Bill Gates keynote CES 2008

Microsoft Research Techfest 2009

Treasure hunt

http://tinyurl.com/bqclnv

Augmented reality tags

Attaching information feeds to visual locations

Matching appearance to city street view, photosynthed, crowd-sourced or user supplied imagery

Requirements:

Extracting descriptors on the device

Matching to locally relevant subset of appearance database

Robust real-time vision-based tracking of camera motion

3D pose enables 3D graphics but requires local SLAM (Georg Klein)

2D tracking sufficient for text/symbol overlays

Compact efficient descriptors

Used throughout Microsoft for search, matching, and recognition

Through Bing maps + Photosynth we are leveraging large volumes of location-related image content

Enabling mobile scenarios

http://tinyurl.com/yz6g7nl



© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.

MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.