
Part of the Commerce Business Apps Challenge

http://docbusinessapps.challenge.gov/

We're challenging developers to look for innovative ways to utilize DOC and other publicly available data to help businesses identify opportunities, grow, enhance productivity and create jobs.


$10,000 USD in prizes (1st - $5,000; 2nd - $3,000; and 3rd - $2,000)


Ends: April 30, 2012 @ 11:00 PM EDT


Introductions:


Mike Kruger, DOC Director of Digital Strategy (Host)

Christopher Leithiser (pronounced LightHizer), USPTO IT Specialist (Presenter)
Chris.Leithiser@uspto.gov
(703) 756-1244 Office

If you have questions regarding the USPTO Patent and Trademark Bulk Data available from Google, Inc. for no charge, send them to: IPD@uspto.gov




Agenda:



Open Government Initiative / Data.gov / Google, Inc.



(2) Datasets:



Innovative Ideas:



Mash-ups:



Questions:


Open Government Initiative / Data.gov / Google, Inc.:


PAST


PRESENT


FUTURE


Datasets: U.S. Patent Grant Bibliographic Text (2001 to Present)


Part (1 of 2):



Contains the bibliographic text (i.e., front page) of each patent grant issued weekly (Tuesdays) from January 2001 to Present (excludes images/drawings).

The file formats are Standard Generalized Markup Language (SGML) in accordance with the U.S. Patent Grant Version 2.4 Document Type Definition (DTD) and eXtensible Markup Language (XML) in accordance with the U.S. Patent Grant Version 2.5; 4.0 International Common Element (ICE); 4.1 ICE; and 4.2 ICE Document Type Definitions (DTDs).


XML Resources at the USPTO: http://www.uspto.gov/products/cis/patents_xml.jsp (These are being updated).


This product includes a pgbyyyymmdd_wknn.zip or ipgbyyyymmdd_wknn.zip file for each week [where "yyyymmdd" is a Tuesday issue date and "nn" is a two-digit, fixed-length number (with leading zero) representing the sequentially-numbered week of the year].
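The naming convention above can be checked mechanically. The following is a minimal sketch, not USPTO tooling: the function and class names are hypothetical, and the sample file name in the usage note is constructed from the pattern rather than taken from a real listing.

```python
import re
from datetime import date
from typing import NamedTuple

# Optional "i", then "pgb", an 8-digit date, "_wk", and a two-digit week number.
WEEKLY_ZIP = re.compile(r"^(i?pgb)(\d{4})(\d{2})(\d{2})_wk(\d{2})\.zip$")


class WeeklyArchive(NamedTuple):
    prefix: str        # "pgb" or "ipgb"
    issue_date: date   # the Tuesday issue date
    week: int          # the "nn" week number


def parse_weekly_zip(name: str) -> WeeklyArchive:
    """Split a weekly grant archive name into its documented parts."""
    m = WEEKLY_ZIP.match(name)
    if m is None:
        raise ValueError(f"not a weekly grant archive name: {name!r}")
    prefix, yyyy, mm, dd, nn = m.groups()
    issued = date(int(yyyy), int(mm), int(dd))
    if issued.weekday() != 1:  # Monday is 0, so Tuesday is 1
        raise ValueError(f"{issued} is not a Tuesday")
    return WeeklyArchive(prefix, issued, int(nn))
```

For example, parse_weekly_zip("ipgb20120103_wk01.zip") returns the prefix "ipgb", the date 2012-01-03, and week 1, while a name whose date is not a Tuesday raises ValueError.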


Within each weekly zip file are three (3) files:

pgbyyyymmdd.xml or ipgbyyyymmdd.xml (Bibliographic information in XML ICE);

pgbyyyymmddlst.txt or ipgbyyyymmddlst.txt (List of patent grant numbers in ascending order);

pgbyyyymmddrpt.txt or ipgbyyyymmddrpt.html (Statistical/summary report)
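One way to pick out the three files listed above after downloading an archive is to sort the member names by suffix. This is a sketch under assumptions: the role labels are made up for illustration, and the demo builds an in-memory archive with empty placeholder members instead of reading a real download.

```python
import io
import zipfile


def classify_members(zf: zipfile.ZipFile) -> dict:
    """Map each documented role to the member name that fills it."""
    roles = {}
    for name in zf.namelist():
        lower = name.lower()
        if lower.endswith(".xml"):
            roles["bibliographic"] = name
        elif lower.endswith("lst.txt"):
            roles["list"] = name
        elif lower.endswith(("rpt.txt", "rpt.html")):
            roles["report"] = name
    return roles


# Demo: an in-memory archive with placeholder (empty) members.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for member in ("ipgb20120103.xml",
                   "ipgb20120103lst.txt",
                   "ipgb20120103rpt.html"):
        zf.writestr(member, "")

with zipfile.ZipFile(buf) as zf:
    roles = classify_members(zf)
```

With a real download, zipfile.ZipFile(path) replaces the in-memory buffer, and zf.open(roles["bibliographic"]) gives a file object for the XML.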


Approximately 4,000 patent grants per week.

Approximately 5 MB per weekly zip file.


Available from Google: http://www.google.com/googlebooks/uspto-patents-grants-biblio.html

or

Available directly from the USPTO:

https://eipweb.uspto.gov/2012/PatentGrantBibICEXML/

...

https://eipweb.uspto.gov/2005/PatentGrantBibICEXML/

https://eipweb.uspto.gov/2004/PatentGrantBibXML/

https://eipweb.uspto.gov/2003/PatentGrantBibXML/

https://eipweb.uspto.gov/2002/PatentGrantBibXML/

https://eipweb.uspto.gov/2001/PatentGrantBibSGML/



Datasets: U.S. Patent Grant Bibliographic Text (1976 to 2001)

Part (2 of 2):




Contains the bibliographic text (i.e., front page) of each patent grant issued weekly (Tuesdays) from January 1976 to December 2001 (excludes images/drawings).

The file format is a subset of the Green Book, ASCII text: https://eipweb.uspto.gov/1976/PatentGrantFullTextAPS/PatentFullTextAPSDoc_GreenBook.pdf



It includes patent number, series code and application number, type of patent, filing date, title, issue date, inventor information, assignee name at time of issue, foreign priority information, related US patent documents, classification information, U.S. and foreign references, attorney, agent or firm/legal representative, Patent Cooperation Treaty (PCT) information, abstract, and if present Statement of U.S. Government Interest.


This product includes a yyyy.zip file for each year (1976 to 2001). All of the weekly files were concatenated into an annual file.

Within each annual zip file is one (1) file:

yyyy.dat (Bibliographic information in ASCII);


EXCEPTION 1: Beginning 09/03/1996 we also began providing the weekly zip files (e.g., pba19960903_wk36.zip, which contains pba19960903.txt).

EXCEPTION 2: Beginning 01/07/1997 the weekly files appear as pba19970107_wk01.zip, which contains:

pbayyyymmdd.txt (Bibliographic information in ASCII);

pbayyyymmddlst.txt (List of patent grant numbers in ascending order);

pbayyyymmddrpt.txt (Statistical/summary report)
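The two example names above suggest how the "wknn" part relates to the issue date. The sketch below assumes the week number counts seven-day blocks from January 1; that rule reproduces both examples (wk36 for 09/03/1996 and wk01 for 01/07/1997) but is an inference from them, not a documented USPTO rule. ISO 8601 week numbering would give week 2 for 01/07/1997, so it does not fit. The function names are hypothetical.

```python
from datetime import date


def week_of_year(d: date) -> int:
    """Week number, assuming weeks are seven-day blocks starting January 1."""
    day_of_year = d.timetuple().tm_yday
    return (day_of_year - 1) // 7 + 1


def weekly_zip_name(prefix: str, d: date) -> str:
    """Build a name of the form <prefix>yyyymmdd_wknn.zip for a given date."""
    return f"{prefix}{d:%Y%m%d}_wk{week_of_year(d):02d}.zip"
```

Calling weekly_zip_name("pba", date(1996, 9, 3)) and weekly_zip_name("pba", date(1997, 1, 7)) reproduces the two file names quoted above.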


Approximately 4,000 patent grants per week.

Approximately 1.6 GB total.


Available from Google: http://www.google.com/googlebooks/uspto-patents-grants-biblio.html

or

Available directly from the USPTO:

https://eipweb.uspto.gov/2001/PatentGrantBibAPS/

...

https://eipweb.uspto.gov/1977/PatentGrantBibAPS/

https://eipweb.uspto.gov/1976/PatentGrantBibAPS/



Datasets: U.S. Patent Application Publication Bibliographic Text (March 15, 2001 to Present):


Contains the bibliographic text (i.e., front page) of each patent application publication (non-provisional utility and plant) published weekly (Thursdays) from March 15, 2001 to Present (excludes images/drawings).

The file formats are eXtensible Markup Language (XML) in accordance with the U.S. Patent Application Version 1.5; 1.6; 4.0 International Common Element (ICE); 4.1 ICE; and 4.2 ICE Document Type Definitions (DTDs).


XML Resources at the USPTO: http://www.uspto.gov/products/cis/patents_xml.jsp (These are being updated).


This product includes a pabyyyymmdd_wknn.zip or ipabyyyymmdd_wknn.zip file for each week [where "yyyymmdd" is a Thursday publication date and "nn" is a two-digit, fixed-length number (with leading zero) representing the sequentially-numbered week of the year].
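When mirroring the weekly files, it can help to list the publication dates a given year should cover and compare them with what was actually downloaded. The sketch below assumes a publication on every Thursday from March 15, 2001 onward with no skipped weeks; the source says only "weekly (Thursdays)", so any gaps in the real series would have to be handled separately. The names are hypothetical.

```python
from datetime import date, timedelta

# First publication date given in the dataset description.
FIRST_PUBLICATION = date(2001, 3, 15)


def publication_dates(year: int):
    """Yield every Thursday in `year` on or after the first publication date."""
    d = max(date(year, 1, 1), FIRST_PUBLICATION)
    # Advance to the next Thursday (Monday is 0, so Thursday is 3).
    d += timedelta(days=(3 - d.weekday()) % 7)
    while d.year == year:
        yield d
        d += timedelta(days=7)
```

For example, list(publication_dates(2001)) starts at 2001-03-15, and a year before 2001 yields nothing.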


Within each weekly zip file are three (3) files:

pabyyyymmdd.xml or ipabyyyymmdd.xml (Bibliographic information in XML ICE)

pabyyyymmddlst.txt or ipabyyyymmddlst.txt (List of published patent application numbers in ascending order)

pabyyyymmddrpt.txt or ipabyyyymmddrpt.html (Statistical/summary report)


Approximately 5,000 patent application publications per week.

Approximately 2.7 MB per weekly zip file.


Available from Google: http://www.google.com/googlebooks/uspto-patents-applications-biblio.html

or

Available directly from the USPTO:

https://eipweb.uspto.gov/2012/PatentApplBibICEXML/

...

https://eipweb.uspto.gov/2005/PatentApplBibICEXML/

https://eipweb.uspto.gov/2004/PatentApplBibXML/

https://eipweb.uspto.gov/2003/PatentApplBibXML/

https://eipweb.uspto.gov/2002/PatentApplBibXML/

https://eipweb.uspto.gov/2001/PatentApplBibXML/



Innovative Ideas:


Homogenize the patent grant bibliographic text data (i.e., make it all the same
format).


Same for the patent application publication bibliographic data.


Capture patent grant bibliographic text data from 1790 to 1975 using the image
data.


Build a text searchable database (updated weekly) that includes both of the datasets discussed today. Search queries can be saved. Result sets can be saved/extracted/tailored.


Build a text searchable database (updated weekly) that includes subsets of both of the datasets discussed today (e.g., Green Technology related).


Same ideas as above, but use full-text (75 MB/104 MB per week) or full-text with embedded images (1.4 GB/1.5 GB per week):

http://www.google.com/googlebooks/uspto-patents.html
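The searchable-database ideas above (weekly updates, saved queries, topic subsets) can be sketched as a small in-memory inverted index. This is an illustration only: the class and method names are hypothetical, the sample records are invented, and a real entry would likely use a full-text search engine or database instead.

```python
import re
from collections import defaultdict


class BibIndex:
    """Toy keyword index over bibliographic records, with saved queries."""

    def __init__(self):
        self.records = {}                 # document number -> title
        self.postings = defaultdict(set)  # token -> document numbers
        self.saved = {}                   # query name -> query text

    @staticmethod
    def tokens(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, number, title, abstract=""):
        """Index one record; call once per record in each weekly file."""
        self.records[number] = title
        for tok in self.tokens(f"{title} {abstract}"):
            self.postings[tok].add(number)

    def search(self, query):
        """Return sorted numbers of records containing every query term."""
        terms = self.tokens(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)

    def save_query(self, name, query):
        self.saved[name] = query

    def run_saved(self, name):
        return self.search(self.saved[name])


# Invented sample records, for illustration only.
index = BibIndex()
index.add("0000001", "Solar panel mounting bracket")
index.add("0000002", "Wind turbine blade")
index.add("0000003", "Solar water heater")
index.save_query("solar", "solar")
```

Re-running a saved query after each weekly load picks up newly added records, and a saved query such as the one above is one simple way to define a topic subset.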



Mash-ups:


Combine USPTO applicant/inventor information with other USPTO datasets (e.g., with USPTO assignments (ownership) data):

http://www.google.com/googlebooks/uspto-patents-assignments.html

or

https://eipweb.uspto.gov/2012/PatentAsgnDailyXML/

https://eipweb.uspto.gov/2011/PatentAsgnAnnlRetroXML/



Combine USPTO patent grants and patent application publications with other DOC data (e.g., Census or Economic data).
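A mash-up of this kind usually comes down to joining two datasets on a shared key. The sketch below joins grant records to assignment records on the patent number. The field names ("patent_number", "inventors", "assignee") and the sample rows are assumptions made for illustration and are not the element names used in the USPTO files.

```python
from collections import defaultdict


def join_grants_with_assignments(grants, assignments):
    """Attach the list of recorded assignees to each grant record."""
    by_patent = defaultdict(list)
    for a in assignments:
        by_patent[a["patent_number"]].append(a["assignee"])
    return [
        {**g, "assignees": by_patent.get(g["patent_number"], [])}
        for g in grants
    ]


# Invented sample rows, for illustration only.
grants = [
    {"patent_number": "0000001", "inventors": ["A. Inventor"]},
    {"patent_number": "0000002", "inventors": ["B. Inventor"]},
]
assignments = [
    {"patent_number": "0000001", "assignee": "Example Corp"},
    {"patent_number": "0000001", "assignee": "Example Holdings"},
]
joined = join_grants_with_assignments(grants, assignments)
```

Grants with no recorded assignment keep an empty list, so the result always has one row per grant. The same pattern applies to a join with Census or economic data once a shared key (for example a location or industry code) has been chosen.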



Questions:


If you have questions regarding the USPTO Patent and Trademark Bulk Data available from Google, Inc. for no charge, send them to: IPD@uspto.gov


