CS 730R: Advanced Database Systems

and their application to biomedical problems

Spring 2011

Fusheng Wang

Center for Comprehensive Informatics


New applications and data types:

Spatial data and temporal data

XML data over the Web

Complex biomedical data

Semantic Web and

“Big data”

What are the limitations of RDBMS and SQL? To what
extent can they be extended?

What are the effective models, languages and
indexing methods to support new data types?

How to manage and integrate biomedical data?

Semantics enabled data management?

How to scale DBMSs (if possible) to manage and
query big data? How about MapReduce?

Course Introduction

The course covers recent advances in DBMSs and
their application to biomedical problems

Extensibility and extensions of database systems

XML database and


Spatial databases and medical applications

Temporal data modeling and queries

Biomedical data management and integration

Semantics enabled data management

Parallel and distributed databases

Course Information

Schedule: MW 4:00


Basic data structures and database background (CS 377)

Familiarity with Java preferred but not required

Grading: Homework + Project (no exams)

Project driven

Course projects from biomedical research environments:
real data and databases

Students will be mentored on projects

IBM DB2 used for all projects

Successful completion will lead to publications

Optimization of spatial queries for a large-scale
biomedical database

Comparative study of parallel database and

Semantics-enabled queries for an XML-based
biomedical database

Design and development of a relational and spatial
database for the Annotation and Image Markup
(AIM) standard

Modeling and implementation of a pathology image
database based on the latest DICOM standard

Rotation Projects and Student Job

These projects could also be taken as rotation

Student job opening: advanced biomedical
database research and development

Data modeling, database extensions, query
optimization, database maintenance, and
scaling the database to

of data with cluster

Course Wiki:



Advanced SQL Queries and Database

Advanced SQL queries

OLAP queries

Recursive queries

Database extensibility

Object relational databases

User defined functions

Stored procedures and PL/SQL

Database extenders
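
As an illustration of the recursive queries topic listed above, here is a minimal sketch of a recursive common table expression. It runs on SQLite through Python's sqlite3 module only so that it is self-contained; the course projects use IBM DB2, whose recursive query syntax may differ in details. The is_a table, its rows, and the ancestors helper are hypothetical and not taken from the course material.

```python
import sqlite3

def ancestors(conn, term):
    """Return every ancestor of `term` in the hypothetical is_a hierarchy, sorted."""
    rows = conn.execute(
        """
        WITH RECURSIVE anc(name) AS (
            -- base case: direct parents of the starting term
            SELECT parent FROM is_a WHERE child = ?
            UNION
            -- recursive step: parents of everything found so far
            SELECT is_a.parent FROM is_a JOIN anc ON is_a.child = anc.name
        )
        SELECT name FROM anc ORDER BY name
        """,
        (term,),
    ).fetchall()
    return [r[0] for r in rows]

# Hypothetical toy hierarchy (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE is_a (child TEXT, parent TEXT)")
conn.executemany(
    "INSERT INTO is_a VALUES (?, ?)",
    [
        ("glioblastoma", "glioma"),
        ("glioma", "brain tumor"),
        ("brain tumor", "neoplasm"),
        ("meningioma", "brain tumor"),
    ],
)

print(ancestors(conn, "glioblastoma"))
```

The query computes a transitive closure, which plain SQL joins cannot express for hierarchies of unknown depth.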

XML Data Management

Introduction to XML

XML query languages:


Native XML databases

XML data indexing methods

XML for biomedical applications

Spatial Data Management

Spatial logical models and query languages

Spatial access methods

Spatial joins

Spatial databases for biomedical imaging

Temporal Data Management

The structure of time and temporal data types

Temporal logics

Temporal modeling and databases in XML

Temporal management of RFID data

Temporal modeling and reasoning for biomedical

Semantic Data Modeling and Management

Overview of Semantic Web


Metadata and common data elements

Use of

and CDEs in biomedical data

Biomedical Data Management and

Biomedical data management overview

SciPort: an extensible platform for biomedical data
management and integration

PAIS: developing data model standards and high
performance databases for analytical medical

Biomedical data integration overview


cancer Biomedical Informatics Grid

Parallel and Distributed Databases

Introduction to parallel databases and distributed

DB2 data partitioning

Overview of MapReduce

Integration of SQL with MapReduce
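
To make the relationship between SQL and MapReduce concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, computing what SELECT modality, COUNT(*) ... GROUP BY modality would return. It is an illustration under stated assumptions, not a real MapReduce framework: the function names, the records, and the modality field are all hypothetical.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to each record and stream out (key, value) pairs."""
    for rec in records:
        yield from mapper(rec)

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer once per key to the list of grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def emit_modality(rec):
    # Mapper: emit (modality, 1), playing the role of the GROUP BY column.
    return [(rec["modality"], 1)]

def count_values(key, values):
    # Reducer: sum the ones, playing the role of COUNT(*).
    return sum(values)

def count_by_modality(records):
    return reduce_phase(shuffle(map_phase(records, emit_modality)), count_values)

# Hypothetical image metadata records (assumption, for illustration only).
images = [
    {"id": 1, "modality": "MR"},
    {"id": 2, "modality": "CT"},
    {"id": 3, "modality": "MR"},
]

print(count_by_modality(images))
```

In a real cluster the map and reduce calls run in parallel on different nodes and the shuffle moves data over the network; the sketch keeps only the data flow.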