NSF Funding of LT Resources

Tanya Korelsky, Program Director

Robust Intelligence Cluster

Division of Information and Intelligent Systems

Directorate for Computer and Information Science and Engineering

National Science Foundation

tkorelsk@nsf.gov

http://www.nsf.gov/


How NSF is organized

Biological Sciences

Computer and Information Science and Engineering

Education and Human Resources

Engineering

Geosciences

Mathematical and Physical Sciences

Social, Behavioral and Economic Sciences

Office of the Director

How CISE is organized

Office of the Assistant Director for CISE

CCF: Computing and Communications Foundations

CNS: Computer and Network Systems

IIS: Information and Intelligent Systems

OCI: Office of Cyberinfrastructure (formerly SCI, now with NSF-wide mission, reporting to Director of NSF)

Clusters (shown under each of CCF, CNS, and IIS)

Crosscutting Emphasis Areas

Funding Rate for Competitive Awards in CISE

[Chart: Competitive Proposal Actions, Competitive Awards, and Funding Rate by year, 1994 to 2004. Axes: Number of Proposals and Awards (0 to 7,000) and Funding Rate (0% to 100%).]

CISE Proposal/Award Statistics

FY     Proposals   Awards   Funding Rate   CGIs    Supplements
2005   4,962       1,086    23%            1,398   581
2004   6,266       1,017    16%            1,297   400
2003   5,346       1,174    22%            1,023   354
2002   4,314       1,038    24%            918     308
2001   3,579       885      25%            768     231
2000   2,853       903      32%            547     210
1999   2,209       746      34%            493     301
1998   1,885       667      35%            476     211
1997   1,894       684      36%            527     219
1996   1,760       601      34%            610     183
1995   1,941       708      36%            631     215

*ADJUSTED
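
As a rough cross-check of the table above, the funding rate can be recomputed as awards divided by proposals. The sketch below assumes that simple ratio; the slide does not state its formula, and the "*ADJUSTED" note suggests some printed values may rest on an adjusted count that this sketch does not model. The names `CISE_STATS`, `funding_rate`, and `compare` are illustrative only.

```python
# Rows copied from the table: (fiscal year, proposals, awards, printed rate in %).
CISE_STATS = [
    (2005, 4962, 1086, 23),
    (2004, 6266, 1017, 16),
    (2003, 5346, 1174, 22),
    (2002, 4314, 1038, 24),
    (2001, 3579, 885, 25),
    (2000, 2853, 903, 32),
    (1999, 2209, 746, 34),
    (1998, 1885, 667, 35),
    (1997, 1894, 684, 36),
    (1996, 1760, 601, 34),
    (1995, 1941, 708, 36),
]


def funding_rate(proposals: int, awards: int) -> float:
    """Awards as a percentage of proposals (assumed definition)."""
    return 100.0 * awards / proposals


def compare(rows):
    """Pair each fiscal year with its recomputed and printed rates."""
    return [
        (fy, funding_rate(proposals, awards), printed)
        for fy, proposals, awards, printed in rows
    ]


if __name__ == "__main__":
    for fy, computed, printed in compare(CISE_STATS):
        print(f"FY{fy}: computed {computed:.1f}%, printed {printed}%")
```

Under this assumption the recomputed rates track the printed column closely, with the largest gap in FY2005.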

CISE Budget: 2003-2007

[Chart: CISE budget by fiscal year, 2003 through the 2007 request, in millions of dollars (axis ticks at 475, 500, 525); labeled values: $496M and $527M]

The requested 6.1% increase includes $20M for cybersecurity and $10M for GENI.

The Human Language and Communication Program (HLC)

Initiated by Dr. Mary Harper

The HLC program emphasizes innovative advances in computer and information sciences relating to all forms of human communication.

High-level human communication topics:
    Text Processing
    Speech Processing
    Multimodal Communication Processing

HLC is attempting to strengthen current research while broadening future research directions of the language processing research community (e.g., multimodal communication).


HLC/ITR LT recent resource, annotation, and evaluation metrics awards

ITR ’03: Collaborative effort on Interlingual Annotation

HLC ’04: Constructing an Enhanced Version of WordNet, $100K (12 months)

HLC ’05:
    Rapid Development of Frame Semantic lexicon, to ICSI, UC Berkeley, $400K (36 months)
    SGER: Learning Syntax-based Evaluation Metrics for Machine Translation, Dr. Rebecca Hwa, University of Pittsburgh, $200K (24 months)
    A Framework for Learning High Accuracy Evaluation Metrics for NLP Applications, Dr. Alon Lavie, CMU, $150K (24 months)




CISE CRI (Computing Research Infrastructure) Program

Funds community resources for IIS programs; reviewers are supplied by the technical program directors

’04 LT resource planning award to Vassar College: An Open Linguistic Infrastructure for American English, $50K (12 months)

’05 LT resource/annotation awards:
    Towards a Comprehensive Linguistic Annotation of Language (Brandeis, UColorado, Pitt, Penn, NYU), $850K, 24 months; goals include achieving an international consensus on a meta-specification framework
    Another planning award ($100K) to Vassar College and Princeton University: An Open Linguistic Infrastructure for American English; goals include annotation of semantic categories using WordNet and FrameNet




Information and Intelligent Systems: Reorganization into Clusters

Robust Intelligence
    Artificial Intelligence, Human Language and Communication, Robotics, Computer Vision, Computational Neuroscience

Human-centered Computing
    Human Computer Interaction, Social Informatics, Universal Access

Information Integration and Informatics
    Data, Information, and Knowledge Management; Information Integration; Science and Engineering Informatics; Digital Libraries; Digital Government


Information and Intelligent Systems

New Cluster-oriented Solicitation

Scheduled to be published in May, with a submission deadline in late October to early November

One of the cross-cutting threads: Human-Robot Interaction

Implications for the HLC area: renewed attention to
    dialogue (human-human, machine-human);
    ASR of imperfect and affected speech;
    speech-to-concept understanding; concept-to-speech generation

Need corpora to support these research areas!


One Small Current Effort

SGER (Small Grant for Exploratory Research)

Creation of a Goal-Oriented, Human-Machine Spoken Corpus

ICSI (UC Berkeley), Dr. Dilek Hakkani-Tur

Building a spoken mixed-initiative dialogue system for conference services

Deploying the system for the IEEE SLT Workshop (December 2006)

Collecting and annotating the dialogue corpus


Digital Tools Summit at Michigan State University (June 2006)

Funded jointly by the Linguistics Program and (former) HLC program

Addresses a functionality gap between the tools that documentary linguists and typologists need and the ability of existing tools to annotate partially-understood linguistic data

Existing methods and tools presuppose a regularized digital corpus of a well-understood language and require a high degree of computational sophistication

Aims to develop a roadmap for creating regional and national language archives and the tools to achieve it

Brings together theoretical computational linguists and “data-driven” linguists to brainstorm the challenging issues



NSF perspective on funding LT resources

New corpora for dialogue research

New corpora for ASR research:
    mixed language (English-Spanish)
    affected speech (911 calls); senior speech

New general corpora (ANC), both text and speech

Dependency treebanks and parsers

Harmonization of existing semantic resources (WordNet and FrameNet)

Basic research on semantic annotation: ambivalent attitude to standardization



NSF perspective on funding LT resources (international resources)

Parallel corpora for new MT research on statistical methods applied to syntactic and semantic representations

Research on MT for minority languages (pending award to CMU for Inupiaq and Aymara)

Corpora for research on language identification

International collaboration on speech processing (NYU-EBIRE-CNRS) and on unified linguistic annotation

International workshop on dependency representations (2007 ACL in Prague)




Thank you

Tanya Korelsky

Robust Intelligence

Human Language and Communication

Division of Information and Intelligent Systems

Directorate for Computer and Information Science and Engineering

National Science Foundation

tkorelsk@nsf.gov

http://www.nsf.gov/


Digital Living 2010

People across the globe will have access to each other and to information provided by pervasive devices, embedded sensors, and systems, because all will be connected to the Internet.

[Diagram labels: Home Computer; PDA; Telephone; Entertainment Systems; Car; Surveillance and Security (at home, work, or in public); Building Automation; Banking and Commerce; Photography; Home Appliances; Games; Inventory/Sales tracking; Health/Medical; Communications]

Thanks to David Kotz at Dartmouth


Global Environment for Networking Innovations (GENI)

Limitations of the Internet

Security mechanisms not included in the IP layer

End-to-end robustness cannot be assumed or assured

Scaling limitations

Quality of service mechanisms have not diffused widely in the public Internet

Support for new technologies difficult (e.g., wireless, mobility, sensors)



Global Environment for Networking Innovations

New networking and distributed system architectures

Build in security and robustness

Enable pervasive computing, bridging the gap between the physical and virtual worlds by including mobile, wireless, and sensor networks

Enable control and management of other critical infrastructures

Include ease of operation and usability

New classes of societal-level services and applications




Global Environment for Networking Innovations: Research Program

Support research, design, and development of new networking and distributed systems

Build on many years of knowledge and experience, but reexamine all networking assumptions and reinvent where needed

Design for intended capabilities; deploy and validate architectures; build new services and applications

Encourage users to participate in experimentation

Take a system-wide approach to the synthesis of new architectures


Global Environment for Networking Innovations: Facility

Shared use through slicing and virtualization (where "slice" denotes the subset of resources bound to a particular experiment)

Access to physical facilities through programmable platforms (e.g., via customized protocol stacks)

Large-scale user participation by "user opt-in" and IP tunnels

Protection and collaboration among researchers by controlled isolation and connection among slices

A broad range of investigations using new classes of platforms and networks, a variety of access circuits and technologies, and global control and management software

Interconnection of independent facilities via federated design.
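
The slicing idea on this slide can be illustrated with a toy model. The sketch below is not GENI software or any real GENI API; `Facility`, `bind`, `connect`, and `may_communicate` are hypothetical names chosen for illustration. It assumes a slice is just a named subset of resource identifiers, that virtualization lets one resource appear in several slices, and that slices are isolated unless explicitly connected.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Facility:
    """Toy model of a shared facility; every name here is illustrative."""

    resources: frozenset[str]
    slices: dict[str, frozenset[str]] = field(default_factory=dict)
    connections: set[frozenset[str]] = field(default_factory=set)

    def bind(self, experiment: str, wanted: set[str]) -> None:
        """Bind a subset of the facility's resources to one experiment."""
        if experiment in self.slices:
            raise ValueError(f"{experiment!r} already has a slice")
        unknown = set(wanted) - self.resources
        if unknown:
            raise ValueError(f"not part of the facility: {sorted(unknown)}")
        # Virtualization is modeled only by allowing the same resource
        # name to appear in more than one slice.
        self.slices[experiment] = frozenset(wanted)

    def connect(self, a: str, b: str) -> None:
        """Explicitly allow two existing slices to exchange traffic."""
        for name in (a, b):
            if name not in self.slices:
                raise KeyError(name)
        self.connections.add(frozenset((a, b)))

    def may_communicate(self, a: str, b: str) -> bool:
        """Slices are isolated by default; only connected pairs may talk."""
        return a == b or frozenset((a, b)) in self.connections


if __name__ == "__main__":
    facility = Facility(resources=frozenset({"node1", "node2", "radio1"}))
    facility.bind("exp-a", {"node1", "radio1"})
    facility.bind("exp-b", {"node1", "node2"})  # node1 is shared
    print(facility.may_communicate("exp-a", "exp-b"))
    facility.connect("exp-a", "exp-b")
    print(facility.may_communicate("exp-a", "exp-b"))
```

Keeping isolation as the default and making connection an explicit step mirrors the slide's "controlled isolation and connection among slices"; a real facility would also need scheduling, authentication, and resource limits, which are left out here.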


Global Environment for Networking Innovations: Outreach

CISE has supported numerous community workshops in support of GENI.

CISE is supporting on-going planning efforts, including needs assessment and requirements for the GENI Facility.

CISE will hold town meetings and continue to support future workshops to broaden community participation.

CISE will work with industry, other US agencies, and international groups to broaden participation in GENI beyond NSF and the US government.