Leveraging Database Technologies in Condor

Jeff Naughton

March 14, 2005

Overview


Introducing ourselves


How we got involved


What we are doing and what we hope to do


Request for input

Who we are


Faculty: David DeWitt, Jeff Naughton


Students: Jiansheng Huang, Ameet Kini, Christine Reilly, Eric Robinson, Srinath Shankar, Lakshmikant Shrinivas

Wisconsin DB Group


A world-leading DB research group for over 20 years.


Strong presence in:


Research publications.


Grads on faculty at top schools (Berkeley x2, Cornell x2, CMU)


Grads at top industrial DB research centers (IBM Almaden, MS Research)


Grads in development organizations of main DB companies (IBM DB2, Oracle, MS SQL Server)


History of influential software artifacts (WiSS, Gamma, Exodus, SHORE, Paradise)

So how did we get to Condor/Paradyn week?


4th floor of CS building: 4361 Naughton, 4367 DeWitt, 4369 Livny (adjacent offices!)


Miron was very persuasive. His algorithm:

1. Enter our offices.

2. Describe some challenging and interesting data management problem Condor faces or will face.

3. Leave office, get on airplane.

4. Return to Madison, go to 1.


Why Condor and DBMS?


Premise: A running Condor system is awash in data:


Operational data


Historical data


User data


DBMS technology can help capture, organize, manage, archive, and query this data.

Three potential levels of involvement

1. Passively collect and organize data, expose it through DB query interfaces.

2. Move/extend some data-related portions of Condor to DBMS (Condor writes to and reads from DBMS).

3. Provide services to help users manage their data.

Why do this?


For Condor developers:


Easier to troubleshoot and debug the system;


Easier to implement new functionality;


Less time hassling with data management issues;


Power of declarative data management language.


Easier to make data management aspects of the system scalable;


Leverage 25 years of DBMS research on scalable data management.

Why do this?


For Condor administrators


Easier to analyze and troubleshoot;


Easier to audit;


Easier to explore current and past system status and behavior.


Why do this?


For Condor users:


An ever-improving system due to more productive developers and administrators.


Easier to monitor and understand performance of their jobs.


Easier to analyze history of their use of the system.


Complete record of every job they have submitted, and everything that happened to every job while it was running.


Support for detailed data lineage queries.


Data management facilities to assist them in handling large, complex, interrelated data sets.

Our projects and plans


Quill: Transparently provides a DBMS query interface to job_queue and history data. [ready to deploy!]


CondorDB: Transparently captures and provides an interface to critical data from all Condor daemons. [status: partial prototype working in our own “sandbox”]

Longer-term plans


Tight integration of DBMS technology and Condor. [status: thinking hard!]


DBMS-inspired data management services to help Condor users manage their own data. [status: thinking really hard!]

Why doesn’t Condor currently use DBMS technology?


Simple answer: Condor and DBMSs “grew up” together.


Condor project started 1986.


Postgres project started 1986.


Now both are ready for each other.

Project 1: Quill

> Non-invasive approach to capturing job-related information

> Works by sniffing updates to the job queue log

> Serves condor_q and condor_history queries

> Independent, reliable, and efficient querying of job-related information



So how does it work?


Quill Architecture

[Figure: Quill architecture. Labeled components: Master, Startd, Schedd, job queue log, Quill, and an RDBMS holding the queue and history tables.]
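
The log-sniffing idea can be sketched in a few lines of Python. This is an illustration only, not Quill's implementation: the line format (`SET <job> <attr> <value>`, `DONE <job>`), the table names `queue` and `history`, and the helpers `apply_log_line` and `sniff` are all assumptions made up for the example, and SQLite stands in for the RDBMS. The sniffer remembers how far it has read and applies only the new lines, so it never has to contact the schedd.

```python
import sqlite3

# Hypothetical schema: one row per (job, attribute) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS queue   (job_id TEXT, attr TEXT, value TEXT,
                                    PRIMARY KEY (job_id, attr));
CREATE TABLE IF NOT EXISTS history (job_id TEXT, attr TEXT, value TEXT,
                                    PRIMARY KEY (job_id, attr));
"""


def apply_log_line(conn, line):
    """Apply one line of the (made-up) log format to the tables."""
    fields = line.strip().split(None, 3)
    if not fields:
        return
    if fields[0] == "SET" and len(fields) == 4:
        _, job_id, attr, value = fields
        conn.execute("INSERT OR REPLACE INTO queue VALUES (?, ?, ?)",
                     (job_id, attr, value))
    elif fields[0] == "DONE" and len(fields) == 2:
        job_id = fields[1]
        # A finished job moves from the queue table to the history table.
        conn.execute("INSERT OR REPLACE INTO history "
                     "SELECT * FROM queue WHERE job_id = ?", (job_id,))
        conn.execute("DELETE FROM queue WHERE job_id = ?", (job_id,))


def sniff(conn, log_lines, offset=0):
    """Apply every line from `offset` onward; return the new offset."""
    for line in log_lines[offset:]:
        apply_log_line(conn, line)
    conn.commit()
    return len(log_lines)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

log = ["SET 1.0 Owner alice", "SET 1.0 JobStatus idle", "SET 2.0 Owner bob"]
pos = sniff(conn, log)                      # first pass over the log
log += ["SET 1.0 JobStatus running", "DONE 2.0"]
pos = sniff(conn, log, pos)                 # later pass picks up only new lines

queued = conn.execute(
    "SELECT job_id, value FROM queue WHERE attr = 'JobStatus'").fetchall()
finished = conn.execute("SELECT job_id, attr, value FROM history").fetchall()
```

Queries then run against the tables rather than against the daemon that owns the log.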

Querying Job Related Information

[Figure: two ways to query job information. Labeled components: RDBMS, Master, Startd, Schedd, Quill. Callouts: "Querying an already busy schedd!!" and "Independent and more powerful query functionality".]

Quill benefits


Robustness: Monitored by the master just like other Condor daemons; resilient to failure


Independence: Not in the critical path of any other Condor daemon


Performance: Derives the benefits of SQL, serving job-related queries an order of magnitude faster


Functionality: A broader range of queries


Extensibility: Easy to add more complex queries


Downside: only handles job queue and history data.
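
As an illustration of the "broader range of queries" point, here is a hedged sketch of an ad hoc aggregate over job history. The `history` table, its columns (`owner`, `runtime_s`, `exit_code`), the sample rows, and the function `summary_by_owner` are invented for the example and are not Quill's actual schema; SQLite stands in for the RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical history table; the real Quill schema may differ.
conn.execute("""CREATE TABLE history (
    job_id TEXT PRIMARY KEY, owner TEXT, runtime_s INTEGER, exit_code INTEGER)""")
conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?)", [
    ("1.0", "alice", 120, 0),
    ("2.0", "alice", 300, 1),
    ("3.0", "bob", 60, 0),
])


def summary_by_owner(conn):
    """Jobs, failed jobs, and mean runtime per owner, in one declarative query."""
    return conn.execute("""
        SELECT owner,
               COUNT(*)           AS jobs,
               SUM(exit_code != 0) AS failed,
               AVG(runtime_s)     AS avg_runtime_s
        FROM history
        GROUP BY owner
        ORDER BY owner
    """).fetchall()


report = summary_by_owner(conn)
```

Adding another question (say, failures per day) means writing another query, not changing a daemon.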

Project 2: CondorDB


CondorDB is a passive approach to capturing operational data in a Condor pool


Modified daemons log events to the database at run time (no log sniffing)


Central database serves entire pool


Web-based query GUI

Data Capture in CondorDB


Condor daemons augmented to record important events in a database


Database is in addition to standard daemon logs


Pool will run unaffected even in the absence of a database

[Figure: data capture in CondorDB. Labels: Schedd, Negotiator, Starter, Startd, Shadow, "A Machine".]
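
One way to read "pool will run unaffected even in the absence of a database" is as best-effort event capture: the daemon always writes its normal log, then tries the database and carries on if that fails. The sketch below shows that pattern under stated assumptions. `EventRecorder`, the `events` table, and the two connection helpers are hypothetical names, SQLite stands in for the central database, and none of this is taken from the CondorDB code.

```python
import logging
import sqlite3


class EventRecorder:
    """Writes every event to the normal log and, best effort, to a database."""

    def __init__(self, logger, connect):
        self.logger = logger
        self.connect = connect  # hypothetical: callable returning a DB connection
        self.conn = None

    def record(self, daemon, event, detail):
        # The standard daemon log is always written first.
        self.logger.info("%s %s %s", daemon, event, detail)
        try:
            if self.conn is None:
                self.conn = self.connect()
            self.conn.execute(
                "INSERT INTO events (daemon, event, detail) VALUES (?, ?, ?)",
                (daemon, event, detail),
            )
            self.conn.commit()
            return True
        except sqlite3.Error:
            # Database missing or failing: drop the connection and keep running.
            self.conn = None
            return False


def open_event_db():
    """Stand-in for connecting to the central database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (daemon TEXT, event TEXT, detail TEXT)")
    return conn


def unavailable_db():
    """Stand-in for a database that cannot be reached."""
    raise sqlite3.OperationalError("database unavailable")


log = logging.getLogger("schedd")
with_db = EventRecorder(log, open_event_db)
without_db = EventRecorder(log, unavailable_db)

stored = with_db.record("schedd", "job_submitted", "job 1.0")
skipped = without_db.record("schedd", "job_submitted", "job 2.0")
```

The return value only reports whether the database write succeeded; the caller is free to ignore it.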

CondorDB User Interface


Users can access Condor through a web interface


Job queue, job history, machine info, match and reject info, aggregates and summaries, etc.


The web server queries the database with PHP

Users see only their own job information

Users see only their own job queue on a shared machine

Drill-down to get detailed job information

Matchmaking data at your fingertips: matches and rejects

Machine information in a single central repository

The data-centric approach makes many tasks easier


Privacy enhanced by presenting user with queue/history information about her jobs only


Intuitive “drill-down” navigation to get increasingly detailed information


All information about a job from submit time until the present available from a single screen


Useful summary information presented in tabular and graphical format


Optionally query database directly for ad hoc information on job queue, job history, matchmaking and file usage
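
The "her jobs only" restriction can be enforced where the query is built. The slides say the web server uses PHP; the sketch below uses Python and SQLite only to keep the example self-contained. The `jobs` table, its columns, and `jobs_for_user` are assumptions for illustration. The owner filter is a bound parameter taken from the authenticated session, never from text the user typed into the page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; CondorDB's real schema is not shown in the slides.
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, owner TEXT, status TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, ?, ?)", [
    ("1.0", "alice", "running"),
    ("2.0", "bob", "idle"),
    ("3.0", "alice", "completed"),
])


def jobs_for_user(conn, authenticated_user):
    """Return only the rows owned by the logged-in user."""
    return conn.execute(
        "SELECT job_id, status FROM jobs WHERE owner = ? ORDER BY job_id",
        (authenticated_user,),
    ).fetchall()


alice_view = jobs_for_user(conn, "alice")
```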

Acknowledgement


The Condor team has been wonderfully responsive and supportive throughout this effort.

Demos!


Come see demos of Quill and CondorDB in room 4360 CS on Wed. afternoon.

Virtuous Cycle


As we learn where Condor can use DBMS technology, we also learn where DBMS technology can be (must be?) improved.


Support for dynamic-schema sparse data sets.


Extreme requirements of self-installation and self-maintenance.


Pushing matchmaking-style operations into DBMS.


Improving DBMS technology will lead to more places that it can be installed.
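
To make the "dynamic-schema sparse data" and "matchmaking in the DBMS" items concrete, here is one common workaround with today's relational engines, offered as a sketch and not as the group's proposed design. Each machine attribute is stored as a row in a vertical (machine, attr, value) table, so new attributes need no schema change, and a job's requirements become an INTERSECT of per-attribute lookups. The table `machine_attrs`, the function `matching_machines`, and the two supported operators are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Vertical layout: attributes are rows, so machines may advertise anything.
conn.execute("CREATE TABLE machine_attrs (machine TEXT, attr TEXT, value TEXT)")
conn.executemany("INSERT INTO machine_attrs VALUES (?, ?, ?)", [
    ("m1", "Arch", "X86_64"), ("m1", "Memory", "4096"),
    ("m2", "Arch", "X86_64"), ("m2", "Memory", "1024"),
    ("m3", "Arch", "ARM"),    ("m3", "Memory", "8192"),
    ("m3", "HasGPU", "true"),  # an attribute only one machine has
])

# Only two comparison forms are supported in this sketch.
OPS = {"=": "value = ?", ">=": "CAST(value AS REAL) >= ?"}


def matching_machines(conn, requirements):
    """Machines satisfying every (attr, op, wanted) requirement."""
    if not requirements:
        rows = conn.execute(
            "SELECT DISTINCT machine FROM machine_attrs ORDER BY machine")
        return [r[0] for r in rows]
    clauses, params = [], []
    for attr, op, wanted in requirements:
        clauses.append(
            f"SELECT machine FROM machine_attrs WHERE attr = ? AND {OPS[op]}")
        params += [attr, wanted]
    sql = " INTERSECT ".join(clauses) + " ORDER BY machine"
    return [r[0] for r in conn.execute(sql, params)]


matches = matching_machines(conn, [("Arch", "=", "X86_64"), ("Memory", ">=", 2048)])
```

The casts and repeated scans in this layout hint at why better native support for sparse, dynamic-schema data would be welcome.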

Request


We want your input!


We have a lot of ideas but want to filter, modify, and augment them through the benefit of your experience.


Send mail to naughton@cs.wisc.edu anytime.