
On the Limits of Dictatorial Classification

Reshef Meir
School of Computer Science and Engineering, Hebrew University

Joint work with Shaull Almagor, Assaf Michaely and Jeffrey S. Rosenschein

Strategy-Proof Classification

- An Example
- Motivation
- Our Model and previous results
- Filling the gap: proving a lower bound
- The weighted case


Strategic labeling: an example

[Figure: the ERM classifier makes 5 errors on the joint dataset. One agent: "There is a better classifier! (for me…)" "If I just change the labels…" After the manipulation, the chosen classifier makes 2 + 5 = 7 errors.]

Classification

The Supervised Classification problem:

- Input: a set of labeled data points {(x_i, y_i)}_{i=1..m}
- Output: a classifier c from some predefined concept class C (e.g., functions of the form f : X → {-,+})
- We usually want c to classify correctly not just the sample, but to generalize well, i.e., to minimize R(c) = E_{(x,y)~D}[ c(x) ≠ y ], the expected number of errors w.r.t. the distribution D (the 0/1 loss function)
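The definitions above can be sketched in a few lines of Python; the function names, the toy concept class, and the sample data are all illustrative, not from the talk.

```python
# Empirical 0/1 risk and ERM over a small, finite concept class C.
# A minimal sketch; names and toy data are illustrative.

def risk_01(c, samples):
    """Fraction of labeled points (x, y) that classifier c gets wrong."""
    return sum(1 for x, y in samples if c(x) != y) / len(samples)

def erm(concepts, samples):
    """Empirical Risk Minimizer: the concept in C with lowest risk on the sample."""
    return min(concepts, key=lambda c: risk_01(c, samples))

# Toy concept class: the two constant classifiers ("all positive", "all negative").
all_pos = lambda x: +1
all_neg = lambda x: -1

samples = [(0, +1), (1, +1), (2, -1)]
best = erm([all_pos, all_neg], samples)  # all_pos, with empirical risk 1/3
```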

Classification (cont.)

- A common approach is to return the ERM (Empirical Risk Minimizer), i.e., the concept in C that is best w.r.t. the given samples (has the lowest number of errors)
- This generalizes well under some assumptions on the concept class C (e.g., linear classifiers tend to generalize well)
- With multiple experts, we can't trust our ERM!
Where do we find "experts" with incentives?

Example 1: A firm learning purchase patterns
- Information gathered from local retailers
- The resulting policy affects them
- "The best policy is the policy that fits my pattern"

[Diagram: Users → Reported Dataset → Classification Algorithm → Classifier]

Example 2: Internet polls / polls of experts

Motivation from other domains

- Aggregating partitions
- Judgment aggregation
- Facility location (on the binary cube)

Judgment aggregation example:

Agent   A   B   A & B   A | ~B
1       T   F   F       T
2       F   T   F       F
3       F   F   F       T

A problem instance is defined by:

- A set of agents I = {1,...,n}
- A set of data points X = {x_1,...,x_m}
- For each x_k ∈ X, agent i has a label y_ik ∈ {-,+}
- Each pair s_ik = ⟨x_k, y_ik⟩ is a sample
- All samples of a single agent compose the labeled dataset S_i = {s_i1,...,s_i,m(i)}
- The joint dataset S = ⟨S_1, S_2,…,S_n⟩ is our input; m = |S|
- We denote the dataset with the reported labels by S'

Input: Example

[Figure: Agents 1, 2 and 3 each assign + and - labels to the same set of points X.]

X = ⟨x_1,…,x_m⟩
Y_1 ∈ {-,+}^m, Y_2 ∈ {-,+}^m, Y_3 ∈ {-,+}^m

S = ⟨S_1, S_2,…,S_n⟩ = ⟨(X,Y_1),…,(X,Y_n)⟩

Mechanisms

- A Mechanism M receives a labeled dataset S and outputs c = M(S) ∈ C
- Private risk of i: R_i(c, S) = |{k : c(x_ik) ≠ y_ik}| / m_i   (the fraction of errors on S_i)
- Global risk: R(c, S) = |{(i,k) : c(x_ik) ≠ y_ik}| / m   (the fraction of errors on S)
- We allow non-deterministic mechanisms, and measure the expected risk
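The two risk notions can be sketched directly, assuming each agent's dataset is a list of (x, y) pairs; the helper names and example data are illustrative, not from the talk.

```python
# Private and global risk as defined above; a sketch with illustrative names.

def private_risk(c, S_i):
    """R_i(c, S): fraction of agent i's own samples misclassified by c."""
    return sum(1 for x, y in S_i if c(x) != y) / len(S_i)

def global_risk(c, S):
    """R(c, S): fraction of all samples, over all agents, misclassified by c."""
    m = sum(len(S_i) for S_i in S)
    return sum(1 for S_i in S for x, y in S_i if c(x) != y) / m

# Example: two agents judged against the "all positive" classifier.
always_pos = lambda x: +1
S = [[(0, +1), (1, -1)], [(0, -1), (1, -1)]]
```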

ERM

We compare the outcome of M to the ERM:

- c* = ERM(S) = argmin_{c ∈ C} R(c, S)
- r* = R(c*, S)

Can our mechanism simply compute and return the ERM?

Requirements

(Most important slide)

1. Good approximation: ∀S: R(M(S), S) ≤ α · r*
2. Strategy-Proofness (SP): ∀i, S, S'_i: R_i(M(S_-i, S'_i), S) ≥ R_i(M(S), S)

- ERM(S) is 1-approximating but not SP
- ERM(S_1) is SP but gives a bad approximation
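The two requirements can be checked by brute force on tiny instances. The sketch below is entirely illustrative: it fixes two shared points, deterministic mechanisms, and the two constant classifiers, and verifies that the dictatorship "ERM on agent 1 only" is SP (as the slide notes, such a rule may still approximate poorly).

```python
from itertools import product

# Brute-force SP check on a toy instance: two shared points, labels in {-1, +1},
# concept class = the two constant classifiers. All names are illustrative.

POINTS = [0, 1]
LABELS = [-1, +1]
ALL_POS = lambda x: +1
ALL_NEG = lambda x: -1

def private_risk(c, ys):
    """Fraction of one agent's labels (aligned with POINTS) that c gets wrong."""
    return sum(1 for x, y in zip(POINTS, ys) if c(x) != y) / len(POINTS)

def is_sp(M, n_agents=2):
    """True iff no agent can ever lower its private risk by misreporting to M."""
    labelings = list(product(LABELS, repeat=len(POINTS)))
    for truth in product(labelings, repeat=n_agents):
        honest = M(list(truth))
        for i in range(n_agents):
            for lie in labelings:
                reported = list(truth)
                reported[i] = lie
                if private_risk(M(reported), truth[i]) < private_risk(honest, truth[i]):
                    return False
    return True

def dictator_1(S):
    """ERM computed on agent 1's report only: SP, but can approximate poorly."""
    return min([ALL_POS, ALL_NEG], key=lambda c: private_risk(c, S[0]))
```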


Related work

- A study of SP mechanisms in regression learning [supervised learning]:
  O. Dekel, F. Fischer and A. D. Procaccia, SODA (2008), JCSS (2009)
- No SP mechanisms for clustering [unsupervised learning]:
  J. Perote-Peña and J. Perote, Economics Bulletin (2003)

A simple case

- Tiny concept class: |C| = 2
- Either "all positive" or "all negative"
- Theorem:
  - There is an SP 2-approximation mechanism
  - There are no SP α-approximation mechanisms, for any α < 2

Previous work: Meir, Procaccia and Rosenschein, AAAI 2008
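The SP 2-approximation mechanism itself is not spelled out on the slide; the weighted-majority rule below, where each agent votes for the constant classifier that best fits its own data, is my reconstruction of a natural SP mechanism for this class, not necessarily the exact one from AAAI-08.

```python
# Sketch of an SP mechanism for the tiny class C = {all positive, all negative}.
# Each agent votes for the constant classifier fitting its own labels best,
# with voting weight equal to its number of samples. The rule is SP because an
# agent's report influences the outcome only through its (truthfully optimal) vote.

ALL_POS = lambda x: +1
ALL_NEG = lambda x: -1

def constant_mechanism(S):
    """S: list of per-agent datasets of (x, label) pairs, labels in {-1, +1}."""
    pos_weight = neg_weight = 0
    for S_i in S:
        positives = sum(1 for _, y in S_i if y == +1)
        if 2 * positives >= len(S_i):   # agent prefers "all positive"
            pos_weight += len(S_i)
        else:
            neg_weight += len(S_i)
    return ALL_POS if pos_weight >= neg_weight else ALL_NEG
```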

General concept classes

- Theorem: Selecting a dictator at random is SP and guarantees a 3 − 2/n approximation
- True for any concept class C
- Generalizes well from sampled data when C has a bounded VC dimension

Open question #1: are there better mechanisms?
Open question #2: what if agents are weighted?

Previous work: Meir, Procaccia and Rosenschein, IJCAI 2009
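The random-dictator mechanism from the theorem can be sketched directly; the data representation and concept class here are illustrative.

```python
import random

# Random dictator: pick one agent uniformly at random and return the ERM on
# that agent's data alone. Ignoring everyone else's labels is exactly what
# makes the mechanism strategyproof. A sketch with illustrative names.

def random_dictator(concepts, S, rng=random):
    """S: list of per-agent datasets of (x, label) pairs."""
    dictator = rng.choice(S)  # each agent is the dictator with probability 1/n
    return min(concepts, key=lambda c: sum(1 for x, y in dictator if c(x) != y))

all_pos = lambda x: +1
all_neg = lambda x: -1
```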

A lower bound

Our main result:

Theorem: There is a concept class C (where |C| = 3), for which any SP mechanism has an approximation ratio of at least 3 − 2/n.

- Matching the upper bound from IJCAI-09
- Proof is by a careful reduction to a voting scenario
- We will see the proof sketch

Proof sketch

Gibbard ['77] proved that every (randomized) SP voting rule for 3 candidates must be a lottery over dictators*.

We define X = {x, y, z}, and C as follows:

      x   y   z
c_x   +   -   -
c_y   -   +   -
c_z   -   -   +

We also restrict the agents, so that each agent can have mixed labels on just one point.

[Figure: an example dataset over x, y, z; each agent labels two of the points uniformly and mixes + and - labels only on the remaining point.]

Proof sketch (cont.)

Suppose that M is SP.

[Figure: the same example dataset over x, y, z.]

Proof sketch (cont.)

Suppose that M is SP. Then:

1. M must be monotone on the mixed point
2. M must ignore the mixed point
3. M is a (randomized) voting rule

(Each agent's labels induce a preference order over C, e.g. c_z > c_y > c_x or c_x > c_z > c_y.)

Proof sketch (cont.)

4. By Gibbard ['77], M is a random dictator
5. We construct an instance where random dictators perform poorly

Weighted agents

- We must select a dictator randomly
- However, the selection probability may be based on the agents' weights
- Naïve approach: only gives a 3-approximation
- An optimal SP algorithm: matches the lower bound
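Probability-proportional-to-weight selection (the naïve variant; the optimal probabilities are not given on the slide) can be sketched as:

```python
import random

# Naïve weighted dictator selection: agent i becomes the dictator with
# probability proportional to its weight w_i. A sketch; the optimal SP
# algorithm from the talk uses different probabilities, not specified here.

def weighted_dictator_index(weights, rng=random):
    """Return an agent index drawn with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```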


Future work

- Other concept classes
- Other loss functions (linear loss, quadratic loss, …)
- Alternative assumptions on the structure of the data
- Other models of strategic behavior