In search of anti-commons: Academic patenting and patent-paper pairs in biotechnology. An analysis of citation flows.
Tom Magerman, Bart Van Looy, Koenraad Debackere
(tom.magerman@econ.kuleuven.be)
INCENTIM (International Centre for Studies in Entrepreneurship and Innovation Management)
K.U.Leuven Managerial Economics, Strategy & Innovation
ECOOM (Centre for R&D Monitoring)
ESF-APE-INV workshop Scientists & Inventors, 10-11/5/2012
1957
TECHNOLOGY
SCIENCE
University-Industry linkages
Scientification of technology
Commercialization of science (Entrepreneurial University)
University-Industry linkages
+ Complementarities; Generation of new research ideas; Additional funding; Create a market of ideas
- Crowding out; Quality; Research orientation; Anti-commons and the end of open science
Anti-commons and the end of open science
“If I have seen a little further [than you and Descartes] it is by standing on the shoulders of Giants.”
Isaac Newton, letter to Robert Hooke (originated from John of Salisbury)
Anti-commons and the end of open science
Tragedy of the anticommons: underuse of scarce resources because too many owners can block each other
=> more intellectual property rights may paradoxically lead to fewer useful products
On the one hand, an incentive to undertake risky research
On the other hand, too many owners hold rights in previous discoveries that constitute obstacles to future research
=> high transaction costs lead to inefficiencies
Biomedical research has been moving from a commons model toward a privatization model
=> risk of anticommons tragedy
Influenced by the patent system: what is patentable (e.g. patents on gene fragments)
Influenced by the patent owner: licensing behavior (e.g. use of reach-through license agreements)
Transition or tragedy? Find ways to lower the transaction costs of bundling rights (intermediate organizations; patent pools; cross-licensing)
Anti-commons and the end of open science
Expansion of IPR is privatizing the scientific commons and limiting scientific progress: Heller and Eisenberg (1998); Argyres and Liebeskind (1998); David (2000); Lessig (2002); Etzkowitz (1998); Krimsky (2003)
Murray and Stern (2007): “Do formal intellectual property rights hinder the free flow of scientific knowledge? An empirical test of the anti-commons hypothesis”
• How do IPRs affect the propensity of future researchers to build upon knowledge?
• Compare citation patterns of publications in the pre-grant period and after grant
• 169 patent-paper pairs (Nature Biotechnology)
• Modest anti-commons effect: decline in citation rate by 10 to 20%
Detection of patent-publication pairs
Text Mining
Text mining refers to the automated extraction of knowledge and information from text by means of revealing relationships and patterns present, but not obvious, in a document collection.
Related to data mining, but with additional issues:
• a different scale of dimensionality (100,000+ ‘variables’)
• a different kind of variables (not really independent, and very, very sparse: 99.99%)
• language issues (homonymy/polysemy and synonymy)
Latent Semantic Analysis (LSA)
LSA was developed in the late 1980s at BellCore/Bell Laboratories by Landauer and his Cognitive Science Research team:
“Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the meaning of words. Meaning is estimated using statistical computations applied to a large corpus of text. The corpus embodies a set of mutual constraints that largely determine the semantic similarity of words and sets of words. These constraints can be solved using linear algebra methods, in particular, singular value decomposition.”
• LSA is a technique for analyzing text: extract (underlying or latent) meaning from text
• LSA is a theory of meaning: meaning is acquired by solving an enormous set of simultaneous equations that capture the contextual usage of words
• LSA is a new approach to cognitive science: use large text corpora to test cognitive theories
Linear algebra problem
The meaning of a passage of text must be the sum of the meanings of its words.
LSA models a large corpus of text as a large set of simultaneous equations.
The solution takes the form of a set of vectors, one for each word and passage, in a semantic space.
Similarity of meaning of two words is measured by the cosine between their vectors, and the similarity of two passages by the same measure on the sum or average of the vectors of all the words they contain.
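The cosine measure described above can be sketched in a few lines. This is only an illustration: the three-dimensional word vectors are invented for the example and the helper names (`add_vectors`, `cosine`) are hypothetical; in LSA the vectors would come out of the decomposition of a real corpus.

```python
import math

def add_vectors(vectors):
    """Sum a list of equal-length vectors component-wise."""
    return [sum(components) for components in zip(*vectors)]

def cosine(u, v):
    """Cosine of the angle between u and v (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical word vectors in a 3-dimensional "semantic space" (assumption).
word_vectors = {
    "gene": [0.9, 0.1, 0.0],
    "protein": [0.8, 0.3, 0.1],
    "patent": [0.0, 0.2, 0.9],
    "license": [0.1, 0.1, 0.8],
}

# A passage vector is the sum of the vectors of its words.
passage_a = add_vectors([word_vectors["gene"], word_vectors["protein"]])
passage_b = add_vectors([word_vectors["patent"], word_vectors["license"]])

word_similarity = cosine(word_vectors["gene"], word_vectors["protein"])
passage_similarity = cosine(passage_a, passage_b)
```

With these made-up vectors the two related words score much higher than the two unrelated passages, which is the behaviour the measure is meant to capture.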
SVD dimensionality reduction
Singular Value Decomposition:
$$A_{m \times n} = U \Sigma V^{T}$$
with $\Sigma$ a diagonal matrix of singular values.
Rank-k approximation (dimensionality reduction by taking the first k singular values):
$$A_{m \times n} \approx A_{k} = U_{m \times k} \, \Sigma_{k \times k} \, V^{T}_{k \times n}$$
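A rank-k approximation can be sketched with NumPy as follows. The small matrix is invented for illustration and `truncated_svd` is a hypothetical helper name; the slides report using Matlab on a far larger matrix, so this is not the authors' implementation.

```python
import numpy as np

def truncated_svd(a, k):
    """Return U_k, the k largest singular values, and V_k^T for matrix a."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)  # s is sorted descending
    return u[:, :k], s[:k], vt[:k, :]

# Hypothetical 4 x 3 matrix (assumption, for illustration only).
a = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 1.0, 2.0],
])

u_k, s_k, vt_k = truncated_svd(a, k=2)
a_k = u_k @ np.diag(s_k) @ vt_k  # rank-2 approximation of a

# The Frobenius-norm error equals the root of the summed squares
# of the discarded singular values.
all_singular_values = np.linalg.svd(a, compute_uv=False)
error = np.linalg.norm(a - a_k)
expected_error = np.sqrt(np.sum(all_singular_values[2:] ** 2))
```

Keeping fewer singular values gives a coarser approximation; choosing k is one of the options the next slide refers to.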
Practical application?
• SVD truncation
• Term weighting
• Pre-processing
Even when using LSA/SVD as a text mining method, many options remain!
Assessment of 40 measure variants
4 weighting methods × (9 SVD truncation levels + no SVD) = 40 similarity measures based on SVD and cosine
Full process
Construct DbT matrix
• Create full text index with stop word removal and stemming (Lucene)
• Convert full text index to document-by-term (DbT) matrix (Matlab)
• Weight DbT matrix (4 variants)
SVD truncation
• Decompose weighted DbT matrix into U∑V using the 1,000 largest singular values
• Generate document-by-concept (DbC) matrix V∑
• Truncate document-by-concept matrix (take first 1000, 500, …, 5 concepts)
Similarity calculation
• Normalise DbT and DbC matrices
• Calculate distance matrix (all patents to all publications) by calculating inner product of vectors
• Retain closest publication for every patent for all of the 43 variants
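The weighting and similarity steps (without the SVD stage) can be sketched in plain Python. The slides name the four weighting variants (RAW, BIN, IDF, TF-IDF) but do not define them, so the formulas below, including idf = log(N / df), are assumptions; the documents, identifiers, and function names are hypothetical as well.

```python
import math
from collections import Counter

def idf_table(docs):
    """Inverse document frequency log(N / df) for every term (assumed formula)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}

def weight(doc, idf, scheme):
    """Weight one tokenised document as a sparse term -> weight mapping."""
    tf = Counter(doc)
    if scheme == "RAW":      # raw term frequency
        return {t: float(c) for t, c in tf.items()}
    if scheme == "BIN":      # term present or not
        return {t: 1.0 for t in tf}
    if scheme == "IDF":      # presence weighted by idf
        return {t: idf[t] for t in tf}
    if scheme == "TF-IDF":   # frequency weighted by idf
        return {t: c * idf[t] for t, c in tf.items()}
    raise ValueError(f"unknown scheme: {scheme}")

def normalise(vec):
    """Scale a sparse vector to unit length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec

def inner(u, v):
    """Inner product of two sparse vectors; the cosine once both are normalised."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def closest_publication(patents, publications, scheme):
    """Map every patent id to the id of its most similar publication."""
    idf = idf_table(list(patents.values()) + list(publications.values()))
    pubs = {pid: normalise(weight(d, idf, scheme)) for pid, d in publications.items()}
    closest = {}
    for pat_id, doc in patents.items():
        pat_vec = normalise(weight(doc, idf, scheme))
        closest[pat_id] = max(pubs, key=lambda pid: inner(pat_vec, pubs[pid]))
    return closest

# Hypothetical stemmed documents (assumption, for illustration only).
patents = {"P1": ["gene", "sequenc", "clone", "vector"]}
publications = {
    "A": ["gene", "sequenc", "clone", "express"],
    "B": ["yeast", "ferment", "ethanol", "vector"],
}
result = closest_publication(patents, publications, "TF-IDF")
```

On a collection of the size described in the slides, the all-to-all comparison would need sparse matrix products rather than Python loops; the sketch only shows the logic of each step.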
Expert validation
R² per measure, by weighting method:

Measure     RAW    TF-IDF   BIN    IDF
No SVD      0.61   0.71     0.77   0.80
SVD 1000    0.34   0.45     0.65   0.63
SVD 500     0.31   0.34     0.63   0.57
SVD 300     0.30   0.26     0.58   0.54
SVD 200     0.31   0.21     0.51   0.51
SVD 100     0.30   0.17     0.45   0.49
SVD 25      0.22   0.14     0.38   0.46
SVD 5       0.11   0.11     0.20   0.21

Measure                                           R²
Common terms (weighted by min number of terms)    0.82
Common terms (weighted by max number of terms)    0.68
Common terms (weighted by avg number of terms)    0.75
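The slides do not spell out how the common-terms measures are computed. One plausible reading, sketched here as an assumption, is the number of distinct terms two documents share, divided by the smaller, larger, or average number of distinct terms of the two documents. The function name and sample terms are hypothetical.

```python
def common_terms_scores(terms_a, terms_b):
    """Shared distinct terms, weighted by the min, max and avg number of terms."""
    a, b = set(terms_a), set(terms_b)
    if not a or not b:
        return {"min": 0.0, "max": 0.0, "avg": 0.0}
    shared = len(a & b)
    return {
        "min": shared / min(len(a), len(b)),
        "max": shared / max(len(a), len(b)),
        "avg": shared / ((len(a) + len(b)) / 2),
    }

# Hypothetical stemmed term lists: 4 distinct patent terms, 8 distinct
# publication terms, 3 of them shared.
patent_terms = ["gene", "clone", "vector", "claim"]
publication_terms = ["gene", "clone", "vector", "yeast",
                     "express", "assay", "cell", "growth"]
scores = common_terms_scores(patent_terms, publication_terms)
```

Under this reading the min-weighted score is high whenever the shorter document is largely contained in the longer one, while the max-weighted score also requires the two documents to be of similar length.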
Methodology and data
Publication data
Selection of biotechnology publications from the Web of Science based on the subject classification (1991-2008):
• Core set of 243,361 publications: subject category Biotechnology & Applied Microbiology
• Extended set of 683,674 publications: publications of the following subject categories citing or cited by a publication of the core set: Biochemical Research Methods; Biochemistry & Molecular Biology; Biophysics; Plant Sciences; Cell Biology; Developmental Biology; Food Sciences & Technology; Genetics & Heredity; Microbiology Materials
• Multidisciplinary set of 97,970 publications: publications from the multidisciplinary journals Nature; Science; and Proceedings of the National Academy of Sciences of the United States of America
1,025,005 publications in total (948,432 suited for text mining)
478,361 publications published between 1991 and 2000
Methodology and data
Patent data
Selection of all granted EPO and USPTO biotechnology patents, applied for between 1991 and 2008, from PATSTAT using IPC-codes as listed in the OECD definition of biotechnology (‘A Framework for Biotechnology Statistics’, OECD, Paris, 2005)
27,241 EPO patents and 91,775 USPTO patents
119,016 patents in total (88,248 suited for text mining)
Methodology and data
Matching
Original document combinations: 83,697,227,136 patent-publication combinations
CommonTermsMin ≥ 0.60: 27,250 patent-publication combinations
And CommonTermsMax ≥ 0.30: 645 patent-publication combinations
And at least one shared inventor/author: 584 patent-publication pairs
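The three matching conditions can be sketched as a single filter. The thresholds come from the slide; the `Candidate` record, its field names, and the exact-string comparison of inventor and author names are hypothetical simplifications (real name matching would need normalisation and disambiguation). The last lines check that the number of original combinations equals the product of the text-mining-suited publications and patents reported earlier.

```python
from dataclasses import dataclass

COMMON_TERMS_MIN_THRESHOLD = 0.60  # from the slide
COMMON_TERMS_MAX_THRESHOLD = 0.30  # from the slide

@dataclass(frozen=True)
class Candidate:
    """One patent-publication combination (hypothetical record layout)."""
    patent_id: str
    publication_id: str
    common_terms_min: float
    common_terms_max: float
    inventors: frozenset
    authors: frozenset

def is_pair(c):
    """True when a combination passes all three matching conditions."""
    return (
        c.common_terms_min >= COMMON_TERMS_MIN_THRESHOLD
        and c.common_terms_max >= COMMON_TERMS_MAX_THRESHOLD
        and bool(c.inventors & c.authors)  # at least one shared name
    )

# Invented candidates, for illustration only.
candidates = [
    Candidate("P1", "A", 0.82, 0.41, frozenset({"smith j"}), frozenset({"smith j", "lee k"})),
    Candidate("P2", "B", 0.75, 0.12, frozenset({"doe a"}), frozenset({"doe a"})),
    Candidate("P3", "C", 0.90, 0.55, frozenset({"roe b"}), frozenset({"poe c"})),
]
pairs = [c for c in candidates if is_pair(c)]

# 948,432 publications x 88,248 patents suited for text mining.
total_combinations = 948_432 * 88_248
```

Only the first invented candidate survives: the second fails the max-weighted threshold and the third shares no name.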
Methodology and data
Pairs
584 patent-publication pairs identified
• 17 patents linked to multiple publications (up to 3)
• 115 publications linked to multiple patents (up to 7) (patent families)
• 566 distinct patents paired with a publication
• 400 distinct publications paired with a patent
Patentee type
• 292 University
• 128 Government / Non-profit
• 126 Company
• 38 Hospital
• 21 Individual
(42 patents have multiple patentees from different sectors)
Publication and citation numbers
Citation analysis
Match publications to deal with quality differences
Paired and non-paired publications matched by year and journal (1991-2000)

                                               PAIRS                      NONPAIRS
VY     SO                                      PUB   AVG_AU   AVG_CIT    PUB       AVG_AU   AVG_CIT
1991   BIOCHEMISTRY                            1     5.00     65.00      625       4.03     57.20
1991   BIOTECHNIQUES                           1     2.00     64.00      125       3.24     40.27
…
1992   BIOSCIENCE BIOTECH AND BIOCHEMISTRY     1     2.00     4.00       543       4.24     8.07
1992   BIOTECHNIQUES                           1     4.00     147.00     144       3.07     26.17
…
Total                                          328   5.18     130.47     117,909   4.42     67.03

328 paired publications versus 106,027 biotechnology publications
Before and after publication and grant
Variable: Ratio average citations pairs/non-pairs

Class        N     Lower CL mean   Mean    Upper CL mean
Pre-grant    288   1.42            1.71    2.00
Post-grant   288   1.48            1.74    2.00
Diff (1-2)         -0.43           -0.03   0.36

T-TESTS
Method          Variances   DF    t value   Pr > |t|
Pooled          Equal       574   -0.17     0.8666
Satterthwaite   Unequal     565   -0.17     0.8666

EQUALITY OF VARIANCES
Method     Num DF   Den DF   F value   Pr > F
Folded F   287      287      1.29      0.0299
Paired sample t-tests

Test                                                      N     Mean 1   Mean 2   Difference   t value   Pr > |t|
Paired vs non-paired
  Forward citations                                       190   130.47   74.24    56.23        4.33      0.0001
  Without self citations                                  190   116.01   65.02    50.99        4.07      0.0001
Paired vs non-paired (at least 2 paired publications)
  Forward citations                                       59    224.97   131.63   93.34        3.12      0.0028
  Without self citations                                  59    202.7    117.88   84.82        2.97      0.0043
Paired and grey zone vs all others
  Forward citations                                       764   60.57    42.69    17.88        5.72      0.0001
  Without self citations                                  764   53.09    36.48    16.61        5.59      0.0001
Paired and grey zone vs all others (at least 2 paired or grey zone publications)
  Forward citations                                       281   96.41    59.64    36.77        5.57      0.0001
  Without self citations                                  281   85.85    51.76    34.09        5.43      0.0001
Multivariate analysis (negative binomial)

                                                                95% Wald Confidence Interval   Hypothesis Test
Parameter                                    B       Std. Error   Lower     Upper               Wald Chi-Square   df   Sig.
(Intercept)                                  2.966   .1258        2.719     3.213               555.643           1    .000
Pair (Y/N)                                   .450    .0506        .350      .549                78.945            1    .000
Document type: Article                       -.574   .0113        -.596     -.552               2589.688          1    .000
Document type: Letter                        -.774   .0590        -.890     -.659               172.469           1    .000
Document type: Note                          -.567   .0175        -.601     -.533               1051.989          1    .000
Document type: Review                        0       .            .         .                   .                 .    .
Number of backward publication citations     .013    .0001        .013      .014                10416.453         1    .000
Number of authors                            .033    .0005        .032      .034                4613.407          1    .000
Time                                         .125    .0015        .122      .128                7191.199          1    .000
Time²                                        -.012   .0001        -.013     -.012               29450.994         1    .000
Journal dummies (n=104)                      Included
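To read the Pair (Y/N) coefficient, one can exponentiate it. This assumes the model uses a log link, which is the usual choice for negative binomial regression but is not stated on the slide. Under that assumption exp(B) is the multiplicative effect on the expected citation count, holding the other covariates fixed. The helper name is hypothetical; the numbers are taken from the table.

```python
import math

def rate_ratio(b):
    """Multiplicative effect on the expected count for a log-link coefficient."""
    return math.exp(b)

# Pair (Y/N) row of the table: B and 95% Wald confidence limits.
pair_b, pair_lower, pair_upper = 0.450, 0.350, 0.549

ratio = rate_ratio(pair_b)
ratio_interval = (rate_ratio(pair_lower), rate_ratio(pair_upper))
percent_more = (ratio - 1.0) * 100.0  # roughly 57
```

Read this way, paired publications are expected to receive roughly 57% more citations than comparable non-paired publications, with the interval lying well above a ratio of 1.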
Sector analysis

Pub sector   Pat sector   N     Mean    Median   Var         SD
COM          COM          21    71.6    34.0     5,999.6     77.5
KGI          COM          25    70.5    49.0     3,212.6     56.7
KGI+COM      COM          15    106.7   80.0     18,605.8    136.4
KGI          KGI          227   179.2   67.0     95,544.4    309.1
KGI+COM      KGI          16    282.0   131.5    231,467.6   481.1
KGI          KGI+COM      6     219.2   93.5     66,633.4    258.1
KGI+COM      KGI+COM      5     85.0    67.0     3,546.5     59.6
Total                     315   164.4   66.0     84,846.9    291.3
Sector analysis

Parameter                                    B        Std. Error   z        P>z     [95% Conf. Interval]
(Intercept)                                  4.326    0.292        14.800   0.000   3.753     4.899
Document type: Article
Document type: Note                          0.114    0.524        0.220    0.827   -0.913    1.141
Document type: Review                        0.309    1.130        0.270    0.784   -1.905    2.523
Number of backward publication citations     0.046    0.008        5.990    0.000   0.031     0.061
Number of authors                            0.141    0.019        7.350    0.000   0.103     0.179
Pat sector: KGI                              0.000    .            .        .       .         .
Pat sector: COM                              -0.627   0.206        -3.050   0.002   -1.030    -0.223
Pat sector: KGI+COM                          -0.917   0.355        -2.590   0.010   -1.612    -0.222
Aff sector: KGI                              0.000    .            .        .       .         .
Aff sector: COM                              0.051    0.314        0.160    0.870   -0.563    0.666
Aff sector: KGI+COM                          0.176    0.214        0.820    0.413   -0.245    0.596
Time                                         -0.301   0.122        -2.470   0.013   -0.539    -0.063
Time²                                        0.015    0.010        1.420    0.156   -0.006    0.035
Sector analysis

Count   Country   Patentee
26      US        THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
26      US        THE JOHNS HOPKINS UNIVERSITY
15      US        THE SALK INSTITUTE FOR BIOLOGICAL STUDIES
12      US        BOARD OF REGENTS, THE UNIVERSITY OF TEXAS SYSTEM
10      US        THE SCRIPPS RESEARCH INSTITUTE
9       US        THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
9       US        JOHNS HOPKINS UNIVERSITY
8       US        CITY OF HOPE
8       US        PRESIDENT AND FELLOWS OF HARVARD COLLEGE
8       US        WASHINGTON UNIVERSITY
8       FR        INSTITUT PASTEUR
7       US        THE ROCKEFELLER UNIVERSITY
7       US        THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF AGRICULTURE
7       US        THE UNITED STATES OF AMERICA AS REPRESENTED BY THE DEPARTMENT OF HEALTH
7       US        UNIVERSITY OF UTAH RESEARCH FOUNDATION
6       US        OKLAHOMA MEDICAL RESEARCH FOUNDATION
6       US        MASSACHUSETTS INSTITUTE OF TECHNOLOGY
6       US        THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF
6       US        THE JOHNS HOPKINS UNIVERSITY SCHOOL OF MEDICINE
6       US        ST. JUDE CHILDREN'S RESEARCH HOSPITAL
Conclusions
science-technology interactions
• We do not observe lower citation rates for publications that are part of a patent application (neither before and after grant, nor when matched by journal, nor when matched by author)
• Significant impact of KGIs at the patent side
• We miss patent-publication pairs
• Dig deeper into the sector dynamics
• Citation patterns are only one aspect of the diffusion of knowledge