PSU Research Proposal












PSU

Research Proposal









































Title: Health Informatics - Use of Medical Data Mining to Enhance Service, Diagnosis, and Reduce Costs

Department: Computer Science

PI Name: Ahmed Sameh

Duration: 1 Year

Budget Est.: SR 55,000

Date: 12/20/2010


I - PROPOSAL

I-1: PROPOSAL TITLE

(Provide a short descriptive title, give prominence to keywords)



I-2: COMMERCIAL POTENTIAL

Could this project have commercial potential? (Select one)

Yes

No

If yes, briefly elaborate on the commercial potential

Healthcare is significantly affected by technological advancements, as technology both shapes and changes healthcare systems. As the areas of computer science, information technology, and healthcare merge, it is important to understand the current and future implications of health informatics. Data mining is a recent advancement in empirical research. Its application to healthcare informatics will reveal the implications and consequences, both positive and negative, of health informatics for the health of individuals and of society. As such, we can create commercial strategies for health-based services and products to support new lines of health-aware businesses. For example, after producing a negative correlation between the overuse of wireless mobile phones and health hazards to the brain, we can set up a commercial strategy that promotes the product based on that correlation (another example: skin creams and skin cancer). We can also build a set of tools to be sold to healthcare providers to reduce healthcare costs by analyzing individuals’ healthcare data and generating deviation reports that describe excess spending. A methodology that was developed by one of the investigators has been enhanced and adapted to the Saudi environment.




I-3: CHECK-LIST

Have you checked to ensure all questions in the application form have been answered?

Have you checked to ensure you have included the correct costs in your budget?

The principal investigator and all co-principal investigators should sign.



I-4: PERSONNEL AND AUTHORIZATION

Health Informatics - Use of Medical Data Mining to Enhance Service, Diagnosis, and Reduce Costs

PRINCIPAL INVESTIGATOR [PI]

Academic Rank: Professor
Full Name: Ahmed Sameh
College: CIS
Department: Computer Science
Telephone: 494-8524
Ext: X8524
Mobile: 0544299846
E-Mail: asameh@cis.psu.edu.sa
Signature:          Date: 12/20/2010

CO-INVESTIGATOR(S) [CIs] (non-PSU CIs permitted)


1) Full Name: Mohamed El-Affendi
Academic Rank: Professor
E-Mail:
College: CIS
Department: Computer Science
Telephone:
Mobile:
Signature:          Date:   /   /

2) Full Name: Gregory Shapiro (University of Massachusetts, Lowell, USA)
Academic Rank: Associate Professor
E-Mail:
College: Science & Engineering
Department: Computer Science
Telephone:
Mobile:
Signature:          Date:   /   /

3) Full Name: Mohamed Tunsi
Academic Rank: Associate Professor
E-Mail:
College: CIS
Department: Computer Science
Telephone:
Mobile:
Signature:          Date:   /   /

4) Full Name: Ayman Kassem (King Fahd University)
Academic Rank: Associate Professor
E-Mail:
College:
Department: Computer Science
Telephone:
Mobile:
Signature:          Date:   /   /

5) Full Name:
Academic Rank:
E-Mail:
College:
Department:
Telephone:
Mobile:
Signature:          Date:   /   /

6) Full Name:
Academic Rank:
E-Mail:
College:
Department:
Telephone:
Mobile:
Signature:          Date:   /   /

II - DESCRIPTION

II-1: ABSTRACT

(Provide a statement of the project - maximum 200 words)



Health informatics (also called health care informatics, healthcare informatics, or medical informatics) is a discipline at the intersection of information science, computer science, and health care. It deals with the resources and methods required to optimize the acquisition, storage, retrieval, and use of information in health and medical research. It is applied to the areas of medical research, clinical care, dentistry, pharmacy, nursing, and public health.

In the area of medical research: With analytical data mining, the screening, diagnosis, and detection of diseases may become more efficient by reducing both the time and the cost of the corresponding procedures. It can also be used to improve individuals’ medication by using patients’ medication histories to promote specific drugs directly to certain patients. It can be effective in providing low-cost screening using disease models that require easily obtained attributes from historical cases. It can perform automated analysis of pathological signals (ECG, EEG, EMG) and medical images (MRI, CT, X-ray, and ultrasound). Data mining can also produce more accurate results in the field of empirical medical research, for example, the classification of urinary kidney stone patterns by clustering.

Data mining can be used in improving clinical care: For example, it can be used in healthcare management, where time series analysis algorithms can predict (based on historical data) patient volume per month, patient volume per medical specialization, length of stay for incoming patients per medical department, and ambulance run volume per month. It can also support clinical decision support systems and information workflows.

Data mining can be used in dentistry: It can produce dental and anatomical models for dentists. It can also be used to improve dental management, and it can be used effectively in dental marketing and teledentistry consultation services. It can classify full crown and bridge systems, implant systems, and cosmetic restorations. Lastly, it can be used for the analysis of X-rays of the head and neck region, and it can improve infection control and pharmaceuticals for dental use.

Data mining can be used in pharmacy: Classification and clustering algorithms can be used for grouping and recommending supplements, vitamins, and nutritional products. Association algorithms can be used to discover relationships between medications. Data mining techniques can be used to enhance alternative medicine (acupuncture, Chinese medicine, and herbs) by discovering correlative effects between these alternative medicines and their corresponding chemical ones.

Data mining can be used in nursing and public health care: It can discover better work needs for nursing specialization. It can be used to study epidemics and the way they spread in poor communities.

Data mining can also be used to provide summary medical reports to hand-held portable devices to assist providers with data entry/retrieval or medical decision making, sometimes called mHealth.

Acquisition of medical data for data mining algorithms is quite a difficult task, especially in Saudi Arabia. Although most healthcare and medical facilities in KSA collect large amounts of digital data, they are hesitant to make this data available for research. For this reason, the scope of this proposal is not very specific. In this project, we have some arrangements for collecting data that we hope will eventually work. The project will focus on whichever area we can secure data for.

As a start, we will explore only three of the above areas (management, medical research, and reducing healthcare costs) until we find an area rich in data, background knowledge, and specific investigation queries among the other areas. It is not clear at the moment which area will open up for us. Healthcare agencies in KSA are very reluctant to provide their own data and background knowledge.



II-2: PROJECT GOALS AND OBJECTIVES

The specific goals of this project are to demonstrate the power of data mining in using healthcare informatics to enhance:

1- Medical applications: screening, diagnosis, therapy, prognosis, monitoring, biomedical/biological analysis, epidemiological studies, and hospital management. Examples: classifying urinary stones by cluster analysis of ionic composition data; efficient screening tools that reduce demand on costly health care resources; forecasting ambulance run volume; predicting length of stay for incoming patients. Diagnosis and classification, e.g. ECG interpretation using a neural network (NN) to predict which output applies: SV tachycardia, ventricular tachycardia, LV hypertrophy, RV hypertrophy, or myocardial infarction. Diagnosis and classification can also assist decision making with a large number of inputs, e.g. automated analysis of pathological signals (ECG, EEG, EMG) and medical images (mammograms, ultrasound, X-ray, CT, and MRI), for conditions such as heart attacks, chest pains, rheumatic disorders, myocardial ischemia (using the ST-T ECG complex), and coronary artery disease (using SPECT images).

2- Patient medication: Medicine revolves around pattern recognition, classification, and prediction. Diagnosis: recognize and classify patterns in multivariate patient attributes. Therapy: select from the available treatment methods based on effectiveness and suitability to the patient. Prognosis: predict future outcomes based on previous experience and present conditions. Examples: forecasting patient volume using univariate time-series analysis; improving classification of multiple dermatology disorders by problem decomposition.

3- Modeling obesity in Saudi Arabian youth; modeling the educational score in Saudi school health surveys; better insight into medical survey data; building disease models for the instruction and assessment of undergraduate medical and nursing students. Epidemiological studies: the study of health, disease, morbidity, injuries, and mortality in human communities, e.g. predicting outbreaks in simulated populations and assessing asthma strategies in inner-city children; discovering patterns relating outcomes to exposures; studying independence or correlation between diseases. Detecting pathological conditions, e.g. tracking glucose levels. Accurate prognosis (prediction) and risk assessment for improved disease management and outcome, e.g. predicting ambulation following spinal cord injury, survival analysis for AIDS patients, predicting pre-term birth risk, and determining cardiac surgical risk.

A separate direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”.

Saudi Health Care Data:

One of the biggest problems in this research is where to get Saudi data from. The above list represents possible tracks for the project; which track we follow depends on the data that we find. We might find partial, incomplete data that needs preparation. Open questions include how to prepare and pre-process the data, and whether it is possible to make use of non-Saudi data for proof of concept.




For healthcare management and medical research, we were able to obtain two Saudi data sets. The first is the monthly patient volume at the “Family Community Primary Healthcare Clinic of King Faisal University”; we used this data set to forecast future volumes based on past data, using a time series analysis algorithm for prediction. The second data set is urinary kidney stones from the Division of Urology, Department of Surgery, King Khalid University Hospital; we used this data set to classify the samples by cluster analysis of ionic composition. These two experiments gave good results and stand as a positive indication that further data sets can be acquired and utilized by this project.
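The forecasting experiment described above can be illustrated with a minimal sketch. The proposal does not name the specific time series algorithm used, so simple exponential smoothing is swapped in here purely as an assumption; the function name, the smoothing factor, and the monthly counts are all hypothetical placeholders, not the clinic data.

```python
from typing import List


def exponential_smoothing_forecast(history: List[float], alpha: float = 0.5) -> float:
    """One-step-ahead forecast using simple exponential smoothing.

    history: past monthly patient volumes, oldest first.
    alpha:   smoothing factor in (0, 1]; larger values weight recent months more.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # start the smoothed level at the first observation
    for observed in history[1:]:
        # Blend the new observation with the running level.
        level = alpha * observed + (1.0 - alpha) * level
    return level


# Placeholder monthly visit counts, invented for illustration only.
monthly_visits = [100.0, 120.0, 110.0, 130.0]
next_month = exponential_smoothing_forecast(monthly_visits, alpha=0.5)
print(f"Forecast for next month: {next_month:.1f}")
```

A seasonal model would probably suit real clinic volumes better; the sketch only shows the shape of a univariate, history-based prediction.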

The third direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”.



III - INTRODUCTION

III-1: REVIEW AND ANALYSIS OF RELATED WORK

Medical expert systems such as MYCIN and Internist were among the first computerized systems in healthcare and medical applications. The use of data mining in healthcare informatics is a new direction of research.

In Saudi Arabia, the Saudi Association for Health Information (SAHI) was established in 2006 to work under the direct supervision of King Saud University for Health Sciences to practice public activities, develop theoretical and applicable knowledge, and provide scientific and applicable studies. SAHI is concerned with the use of information in health care by clinicians. SAHI transforms health care by analyzing, designing, implementing, and evaluating information and communication systems that enhance individual and population health outcomes, improve patient care, and strengthen the clinician-patient relationship. SAHI uses its knowledge of patient care, combined with its understanding of informatics concepts, methods, and health informatics tools, to:

- assess information and knowledge needs of health care professionals and patients,

- characterize, evaluate, and refine clinical processes,

- develop, implement, and refine clinical decision support systems, and

- lead or participate in the procurement, customization, development, implementation, management, evaluation, and continuous improvement of clinical information systems.

Physicians who are board-certified in clinical informatics collaborate with other health care and information technology professionals to develop health informatics tools which promote patient care that is safe, efficient, effective, timely, patient-centered, and equitable. The purpose of this project is to add to these health informatics tools.



III-2: SIGNIFICANCE OF WORK

The significance of this project lies in its field: health care is a very important and wide field. The outputs of this project can be new results in medical research: With analytical data mining, the screening, diagnosis, and detection of diseases may become more efficient by reducing both the time and the cost of the corresponding procedures. It can also be used to improve individuals’ medication by using patients’ medication histories to promote specific drugs directly to certain patients.

It can be effective in providing low-cost screening using disease models that require easily obtained attributes from historical cases. It can perform automated analysis of pathological signals (ECG, EEG, EMG) and medical images (MRI, CT, X-ray, and ultrasound). Data mining can also produce more accurate results in the field of empirical medical research, for example, the classification of urinary kidney stone patterns by clustering.

The outputs of this project can be used in improving clinical care, for example by applying data mining in healthcare management. Time series analysis algorithms can be used to predict (based on historical data) patient volume per month, patient volume per medical specialization, length of stay for incoming patients per medical department, and ambulance run volume per month. They can also support clinical decision support systems and information workflows.

The outputs of this project can also be used in dentistry, pharmacy, and nursing. For example, data mining can also be used to provide summary medical reports to hand-held portable devices to assist providers with data entry/retrieval or medical decision making, sometimes called mHealth.

The outputs of this project can be used for insurance fraud detection, infection control, and medical waste management. This direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”.

Acquisition of medical data for data mining algorithms is quite a difficult task, especially in Saudi Arabia. Although most healthcare and medical facilities in KSA collect large amounts of data, they are hesitant to make this data available for research. In this project, we have some arrangements for collecting data that we hope will eventually work.





IV - APPROACH AND METHODOLOGY

IV-1: METHODOLOGY

Many data mining methods can be applied to healthcare information, such as time series prediction, classification, clustering, and association. Such algorithms can be applied to the various domains of healthcare informatics:

1- Medical applications: screening, diagnosis, therapy, prognosis, monitoring, biomedical/biological analysis, epidemiological studies, and hospital management. Examples: forecasting patient volume using univariate time-series analysis; classifying urinary stones by cluster analysis of ionic composition data; optimizing the allocation of hospital resources; forecasting ambulance run volume; predicting length of stay for incoming patients. Therapy: based on modeled historical performance, select the best intervention course, e.g. the best treatment plans in radiotherapy, or using a patient model to predict the optimum medication dosage, e.g. for diabetics. Accurate prognosis (prediction) and risk assessment for improved disease management and outcome, e.g. predicting ambulation following spinal cord injury, survival analysis for AIDS patients, predicting pre-term birth risk, and determining cardiac surgical risk. Diagnosis and classification, e.g. ECG interpretation using a neural network (NN) to predict which output applies: SV tachycardia, ventricular tachycardia, LV hypertrophy, RV hypertrophy, or myocardial infarction. Diagnosis and classification can also assist decision making with a large number of inputs, e.g. automated analysis of pathological signals (ECG, EEG, EMG) and medical images (mammograms, ultrasound, X-ray, CT, and MRI), for conditions such as heart attacks, chest pains, rheumatic disorders, myocardial ischemia (using the ST-T ECG complex), and coronary artery disease (using SPECT images). Risk assessment for improved disease management, e.g. spinal cord injuries and heart attacks.

2- Modeling the educational score in Saudi school health surveys; modeling obesity in Saudi Arabian youth. Epidemiological studies: the study of health, disease, morbidity, injuries, and mortality in human communities, e.g. predicting outbreaks in simulated populations and assessing asthma strategies in inner-city children.

3- Better insight into medical survey data; effective data fusion from multiple sensors; efficient screening tools that reduce demand on costly health care resources; discovering patterns relating outcomes to exposures; studying independence or correlation between diseases; detecting pathological conditions, e.g. tracking glucose levels; data fusion from various sensing modalities in ICUs to assist overburdened medical staff.
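As an illustration of the clustering method named in item 1 (grouping stone samples by ionic composition), the following is a minimal k-means sketch in plain Python. It is an assumption-laden example, not the method used in the experiments: the choice of k-means, the two unnamed composition features, the sample values, and the function names are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Assign each point to one of k clusters and return the cluster labels.

    Centroids are initialised from the first k points so the result is
    deterministic; a real study would use a more careful initialisation.
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


# Hypothetical samples: two invented ionic-composition features per stone.
samples = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2)]
labels = k_means(samples, k=2)
print(labels)
```

With real composition data, the features would need to be scaled to comparable ranges before clustering, and the number of clusters would have to be chosen with clinical input.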


For example, the figure shows the medical chart of a patient. Methods from the above three categories can be applied to this chart to discover deviation measures. This direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter” (see figure below).



The proposed system will apply deviation analysis techniques to individuals’ healthcare data to find “interesting deviations”. The system will then augment these findings with plausible causes and suggest recommendations of appropriate actions. Each healthcare provider can apply the proposed system to its own set of insured individuals, across the possible medical areas covered: Inpatient, Outpatient, Surgical, Maternity, etc. For each area and patient, the proposed system will apply “measures and formulas” to discover large deviations from the norms, or deviations from the previous period and/or the next period. Through models and formulas, such deviations can be converted to costs.
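The idea of flagging large deviations per medical area and converting them to costs can be sketched as follows. This is a simplified illustration under stated assumptions and does not reproduce the actual KEFIR measures or formulas: the `Finding` class, the 10% threshold, the per-member cost measure, and all figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    area: str        # medical area, e.g. "Inpatient"
    observed: float  # observed cost per member in the current period (hypothetical measure)
    norm: float      # expected cost per member (the norm)
    members: int     # number of insured individuals in this area

    @property
    def deviation(self) -> float:
        """Signed deviation of the observed measure from the norm."""
        return self.observed - self.norm

    @property
    def cost_impact(self) -> float:
        """Deviation converted to a total cost over all members."""
        return self.deviation * self.members


def key_findings(findings: List[Finding], threshold_pct: float = 10.0) -> List[Finding]:
    """Keep deviations of at least threshold_pct of the norm, largest cost impact first."""
    flagged = [
        f for f in findings
        if f.norm > 0 and abs(f.deviation) / f.norm * 100.0 >= threshold_pct
    ]
    return sorted(flagged, key=lambda f: abs(f.cost_impact), reverse=True)


# Invented figures for illustration only.
data = [
    Finding("Inpatient", observed=120.0, norm=100.0, members=50),
    Finding("Outpatient", observed=52.0, norm=50.0, members=200),
    Finding("Surgical", observed=300.0, norm=250.0, members=10),
    Finding("Maternity", observed=85.0, norm=100.0, members=30),
]
ranked = key_findings(data)
for f in ranked:
    print(f"{f.area}: deviation {f.deviation:+.1f}, cost impact {f.cost_impact:+.1f}")
```

Ranking by absolute cost impact is one possible notion of an “interesting” deviation; comparisons against the previous or next period would follow the same pattern with a different `norm`.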


Deliverables in Phase I: Beta Version I + its Benchmark + its Tuning

Deliverables in Phase II: Beta Version II + its Benchmark + its Tuning

Deliverables in Phase III: Beta Version III + its Benchmark + its Tuning

Deliverables in Phase IV: Final Version + User Manual

The following is the project plan schedule. It represents the different tasks within the research and the estimated duration for each.






IV-2: AVAILABLE RESOURCES

Currently there are some open source data mining algorithms that can be used as tools in some of the above investigations.

IV-3: EXPECTED RESULTS/OUTPUTS




Health care is a very important and wide field. The outputs of this project can be new results in medical research: With analytical data mining, the screening, diagnosis, and detection of diseases may become more efficient by reducing both the time and the cost of the corresponding procedures. It can also be used to improve individuals’ medication by using patients’ medication histories to promote specific drugs directly to certain patients. It can be effective in providing low-cost screening using disease models that require easily obtained attributes from historical cases. It can perform automated analysis of pathological signals (ECG, EEG, EMG) and medical images (MRI, CT, X-ray, and ultrasound). Data mining can also produce more accurate results in the field of empirical medical research, for example, the classification of urinary kidney stone patterns by clustering.


The outputs of this project can be used in improving clinical care, for example by applying data mining in healthcare management. Time series analysis algorithms can be used to predict (based on historical data) patient volume per month, patient volume per medical specialization, length of stay for incoming patients per medical department, and ambulance run volume per month. They can also support clinical decision support systems and information workflows.

The outputs of this project can also be used in dentistry, pharmacy, and nursing. For example, data mining can also be used to provide summary medical reports to hand-held portable devices to assist providers with data entry/retrieval or medical decision making, sometimes called mHealth.

As a start, we will explore only two of the above areas (management and medical research) until we find an area rich in data, background knowledge, and specific investigation queries among the other areas. It is not clear at the moment which area will open up for us. Healthcare agencies in KSA are very reluctant to provide their own data and background knowledge. For healthcare management and medical research, we were able to obtain two Saudi data sets. The first is the monthly patient volume at the “Family Community Primary Healthcare Clinic of King Faisal University”; we used this data set to forecast future volumes based on past data, using a time series analysis algorithm for prediction. The second data set is urinary kidney stones from the Division of Urology, Department of Surgery, King Khalid University Hospital; we used this data set to classify the samples by cluster analysis of ionic composition. These two experiments gave good results and stand as a positive indication that further data sets can be acquired and utilized by this project.

The following diagrams are results from the two data sets:







The methods above can be applied to individual patients’ charts to discover deviation measures. This direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”. The proposed system will apply deviation analysis techniques to individuals’ healthcare data to find “interesting deviations”. The system will then augment these findings with plausible causes and suggest recommendations of appropriate actions. Each healthcare provider can apply the proposed system to its own set of insured individuals, across the possible medical areas covered: Inpatient, Outpatient, Surgical, Maternity, etc. For each area and patient, the proposed system will apply “measures and formulas” to discover large deviations from the norms, or deviations from the previous period and/or the next period. Through models and formulas, such deviations can be converted to costs.





V - REFERENCES

1- Saudi Ministry of Health, http://www.moh.gov.sa/english/index.php

2- SAMIRAD, http://www.saudinf.com/main/c6m.htm





VI - ROLE(S) OF THE INVESTIGATOR(S)

(Attach a brief CV for each investigator following the format in Appendix A)

#   Name of Investigator         Area of contribution to the project
1   Prof. Ahmed Sameh            System Design & Implementation
2   Prof. Mohamed El-Affendi     Data Collection & Preparation
3   Dr. Mohamed Tunsi            Data Mining Tools
4   Dr. Gregory Shapiro          System Design & Implementation
5   Dr. Ayman Kassem             Testing
6

VII - PROJECT SCHEDULE

PHASES OF PROJECT IMPLEMENTATION (SEE GANTT CHART ABOVE)

Steps: 1

Duration (Months): See Gantt chart within this proposal

Task:
- System requirements specifications: Sameh, Tunsi
- System Architecture: El-Affendi
- System Design: Sameh, Greg
- Databases Designs: Greg
- Prototyping of critical sub-systems: Tunsi, Sameh
- System Detailed Design: Sameh, Tunsi
- Beta Version Implementation: Sameh, Ayman
- Testing: El-Affendi
- Building Deployment Environment: Sameh
- Bench Marking and Collecting Results (First Round): Tunsi
- System Tuning (Based on First Round Results): Sameh
- Bench Marking and Collecting Results (Second Round Results): El-Affendi
- System Tuning (Based on Second Round Results): Sameh
- Bench Marking and Collecting Results (Third Round Results): Tunsi
- Version 1 Release: Tunsi
- Results Documentation and Analysis with the Performance requirements: Sameh
- Detailed Code Documentation: Sameh
- User and Installation Guide (Full How To): Ayman

Total duration for the proposed project: 12 months



VIII - BUDGET OF THE PROPOSED RESEARCH (Budget in SAR)

Priority: 1 = Max; 2 = Mod; 3 = Low. The “Amount Approved (SAR)” column is for official use.

Item                                                 Amount Requested (SAR)   Priority   Amount Approved (SAR)
A. Personnel* (Research Assistant)                   24,000                   1
   1- Student Ahmed Al-Jabreen
   2- Student Kamal Qarawi
   3- Student Omar Al-Moughnee
   4- Student Amro Al-Munajjed
B. Equipment* (List)                                 5,000                    1
   Development Server
C. Testing and Analysis* (Location/Laboratory)       5,000                    2
   Laptop Computer
D. Consumables* (List)                               1,000                    2
   Desk Tools
E. Travel* (Local/Internat)                          10,000                   1
   1- Travel for Gregory (Lowell, Massachusetts / Riyadh)
   2- Travel for Ayman (Zahram / Riyadh)
F. Software* (List)                                  10,000                   1
   - SAS Data Mining Tools
   - Oracle 9i Data Mining
   - Clementine from SPSS
   - Ants Model Builder
G. Other Items* (Itemize)                            ---

Total Amount Requested (SAR)                         55,000
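The totals above can be cross-checked with a few lines of Python. The amounts are taken from the budget table and the research-assistant salary from Section IX; the dictionary keys are shortened labels, and reading the SR 500 salary as a monthly rate is an assumption that the arithmetic supports.

```python
# Requested amounts in SAR, copied from the budget table (labels shortened).
budget_sar = {
    "A. Personnel": 24_000,
    "B. Equipment": 5_000,
    "C. Testing and Analysis": 5_000,
    "D. Consumables": 1_000,
    "E. Travel": 10_000,
    "F. Software": 10_000,
}

total = sum(budget_sar.values())

# Personnel: 4 student research assistants, SR 500 each (assumed monthly), for 12 months.
personnel = 4 * 500 * 12

print(f"Total requested: SAR {total:,}")
print(f"Personnel check: SAR {personnel:,}")
```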



IX - JUSTIFICATION OF BUDGET

(Justify each item listed in the budget in the previous section)

Item                            Justification
A   Student Research Assistants   Salary of SR 500 per month for each student for the 12-month duration of the project.
B   Development Server            For developing the proposed experiments.
C   Laptop Computer               For on-site data collection and on-site testing.
D   Desk Tools                    For general use by team members.
E   Travel                        For the two team members from outside PSU.
F   Software                      Data mining tools software.
G













X - RELEASE TIME FOR RESEARCH TEAM MEMBERS

RELEASE TIME FROM TEACHING LOAD

#     Team Member            Time Commitment (hrs/weeks/terms)   Teaching Load Max
PI    Ahmed Sameh            4 h/w                               e.g. 1 course FA11
CI1   Mohamed El-Affendi     2 h/w
CI2   Mohamed Tunsi          2 h/w
CI3   Gregory Shapiro        1 h/w
CI4   Ayman Kassem           1 h/w
CI5


XI - EXTERNAL FUNDING

#   Source of Funds   Amount (SAR)   Used for …… costs
1   None
2
3




Appendix A: CV Format for Principal Investigator and Co-Investigators

(Two pages maximum, material should be related to submitted project)

Title and Name: Professor Ahmed Sameh

Specialty: Artificial Intelligence, Modeling and Information Systems

Department and College: Computer Science
Summary of Experience/Achievements Related to Research Proposal:

1- Ahmed Sameh, Ayman Kassem, “Lumbar Spine: Parameter Estimation for Realistic Modelling”, WSEAS Transactions on Applied and Theoretical Mechanics, ISSN: 1991-8747, Issue 5, Volume 2, May 2008

2- Ahmed Sameh, Ayman Kassem, “A General Framework for Lumbar Spine Modelling and Simulation”, International Journal of Human Factors in Modelling and Simulation, IJHFMS, The North American Spine Society, Volume 1, Issue 2, January 2008

3- Dalia El-Mansy, Ahmed Sameh, “A Collaborative Inter-Data Grid Strong Semantic Model with Hybrid Namespaces”, Journal of Software (JSW), Academic Publisher, Volume 3, Issue 1, January 2008

4- Ahmed Sameh, “Simulating Lumbar Spine Motion”, Research in Computing Science (RCS) Journal, National Polytechnic Institute of Mexico, ISSN 1665-9899, Volume 18, Issue 4, June 2007

5- Ahmed Sameh and Ayman Kassem, “3D Modeling and Simulation of Lumbar Spine Dynamics”, in the International Journal of Human Factors Modelling and Simulation, Volume IJHFMS-942, 2007

6- Adhami Louai, Abdel-Malek Karim, McGowan Dennis, Mohamed A. Sameh, “A Partial Surface/Volume Match for High Accuracy Object Localization”, International Journal of Machine Graphics and Vision, vol 10, no. 2, 2001

7- Mohamed A. Sameh, “Interactive Learning in Artificial Neural Networks Through Visualization”, The International Journal of Computers and Applications (IJCA), Vol. 20, #2, 1998

8- Mohamed A. Sameh and Attia E. Emad, “Parallel 1D and 2D Vector Quantizers Using Kohonen Self-Organizing Neural Network”, in the International Journal of the Neural Computing and Applications, V. (4), no. 2, Springer Verlag, London, 1996

9- Ahmed Sameh, Amgad Madkour, “Intelligent open Spaces: Learning User History Using Neural Network for Future Prediction of Requested Resources”, Proceedings IEEE CSE'08, 11th IEEE International Conference on Computational Science and Engineering, 16-18 July 2008, São Paulo, SP, Brazil. IEEE Computer Society 2008, ISBN 978-0-7695-3193-9

10- Ahmed Sameh, Ayman Kassem, “Modelling and Simulation of Human Lumbar Spine”, Proceedings of the 2008 International Conference on Modelling, Simulation, and Visualization, MSV 2008, Las Vegas, Nevada, July 14-17, 2008, CSREA Press 2008, ISBN 1-60132-081-7

11- Ahmed Sameh, Dalia El-Mansy, “A Collaborative Inter-Data Grids Model with Hybrid Namespace”, 14th IEEE International Conference on Availability, Reliability, and Security (DAWAM ARES 2007), Vienna, Austria, April 10-13, 2007


16



12
-

Ahmed Sameh, “Simulating Lumbar Spine Motion: Parameter Estimation for Realistic Modelling”, The
S
th

Mexican International

Conference

on Artificial In
telligence (MICAI07), Aguascalientes, Mexico,
November 4
-
10, 2007


13- Sherif Akoush, Ahmed Sameh, “Bayesian Learning of Neural Networks for Mobile User Position
Prediction”, The International Workshop on Performance Modelling and Evaluation in Computers and
Telecommunication Networks (PMECT07) - part of the IEEE 16th International Conference on Computer
Communications and Networks, ICCCN 2007, Honolulu, Hawaii, August 13-16, 2007


14- Ahmed Sameh, “The Schlumberger High Performance Cluster at AUC”, Proceedings of the 13th
International Conference on Artificial Intelligence Applications, Cairo, February 4-6, 2005


15- Mohamed A. Sameh, Rehab El-Kharboutly, "Modeling a Service Discovery Bridge Using Rapide
Architecture Description Language", Proceedings of the 18th European Simulation Multiconference
(ESM 2004), Magdeburg, Germany, June 13-16, 2004


16- Mohamed A. Sameh, Rehab El-Kharboutly, and Hazem Al-Ashmawy, "Modeling Wireless Discovery
and Deployment of Hybrid Multimedia N/W-Web Services Using Rapide ADL", Proceedings of the 7th
IEEE International Conference on High Speed N/Ws and Multimedia Communications (HSNMC04),
Toulouse, France, June 30 - July 2nd, 2004


17- Mohamed A. Sameh, Rehab El-Kharboutly, "Modeling Jini-UPnP Using Rapide ADL", Proceedings of the
10th EUROMEDIA Conference (EUROMEDIA 2004), Hasselt, Belgium, April 19-21, 2004


18- Mohamed A. Sameh, "E-Access Custom Webber: A Multi-Protocol Stream Controller", Proceedings of
the IADIS International Conference on Applied Computing, Lisbon, Portugal, March 23-26, 2004


19- Ayman Kassem, A. Sameh, and Tony Keller, “Modeling and Simulation of Lumbar Spine Dynamics”,
Proceedings of the 15th IASTED International Conference on Modeling and Simulation and Optimization
(MSO 2004), Marina Del Rey, California, March 2004


20- Mohamed A. Sameh, and Shenouda S., "Tera-Scale High Performance Distributed and Parallel
Super-Computing at AUC", Proceedings of the 12th International Conference on Artificial
Intelligence, Cairo, Feb. 18-20, 2004


21- Shenouda S., Mohamed L., and Mohamed A. Sameh, "AUC Cluster Participation in Global Grid
Communities", Proceedings of the 12th International Conference on Artificial Intelligence, Cairo,
Feb. 18-20, 2004


22- El-Ashmawi Hazem, and Mohamed A. Sameh, “XML- cket Language-Independent Distributed Object
Computing Model”, Proceedings of the 15th International Conference on Parallel and Distributed
Computing Systems, Louisville, Kentucky, September, 2002


23- Mohamed Karasha, Greenshields Ian, and Mohamed A. Sameh, “HUSKY: A Multi-Agent Architecture
for Adaptive Scheduling of Grid Aware Applications”, Proceedings of the High Performance Computing
Symposium with the 2002 Advanced Simulation Technologies Conference (ASTC 2002), San Diego,
California, April 14-18, 2002


24- Atef Rania, Mohamed A. Sameh, and Abdel-Malek Karim, "Three Dimensional Deformable Modeling of
the Spinal Lumbar Region", Proceedings of the 11th International Conference on Intelligent Systems on
Emerging Technologies (ICIS-2002), Boston, July 18-20, 2002



25- Kassem Ayman, Mohamed A. Sameh, and Abdel-Malek Karim, "A Spring-Dashpot-String Element for
Modeling Spinal Column Dynamics", Proceedings of the International Workshop on Growth and Motion in
3D Medical Images, Copenhagen, Denmark, May 28 - June 1, 2002


26- Kassem Ayman, and Mohamed A. Sameh, “A Fast Technique for modeling and Control of Dynamic
System”, Proceedings of the 11th International Conference on Intelligent Systems on Emerging
Technologies (ICIS-2002), Boston, July 18-20, 2002


27- Mohamed A. Sameh, and Kaptan Noha, "Anytime Algorithms for Maximal Constraint Satisfaction",
Proceedings of the ISCA 14th International Conference on Computer Applications in Industry and
Engineering (CAINE' 2001), Nov. 27-29, at Las Vegas, Nevada, 2001



28- Mohamed A. Sameh, and Mansour Marwa, "Enhancing Partitionable Group Membership Service in
Asynchronous Distributed Systems", Proceedings of the ISCA 14th International Conference on Computer
Applications in Industry and Engineering (CAINE' 2001), Nov. 27-29, at Las Vegas, Nevada, 2001


29- Abdalla Mahmoud, Mohamed A. Sameh, Harras Khalid, Darwich Tarek, "Optimizing TCP in a Cluster of
Low-End Linux Machines", Proceedings of the 3rd WSEAS Symposium on Mathematical Methods and
Computational Techniques in Electrical Engineering, Athens, Greece, Dec. 29-31, 2001


30- Rania Abdel Hamid, and Mohamed A. Sameh, “Visual Constraint Programming Environment for
Configuration Problems”, Proceedings of the 15th International Conference on Computers and their
Applications, New Orleans, Louisiana, March 2000


31- Essam A. Lotfy, and Mohamed A. Sameh, “Applying Neural Networks in Case-Based Reasoning
Adaptation for Cost Assessment of Steel Buildings”, Proceedings of the 10th International Conference
on Computing and Information, ICCI-2000, Kuwait, Nov. 18-21, 2000


32- Ghada A. Nasr, and Mohamed A. Sameh, “Evolution of Recurrent Cascade Correlation Networks with a
Distributed Collaborative Species”, Proceedings of the IEEE Symposium on Computations of Evolutionary
Computation and Neural Networks, San Antonio, TX, May 2000


33- El-Beltagy S., Rafea A., and Mohamed A. Sameh, “An Agent Based Approach to Expert System
Explanation”, Proceedings of the 12th International FLAIRS Conference, Orlando, Florida, 1999


34- Mohamed A. Sameh, Botros A. Kamal, "2D and 3D Fractal Rendering and Animation", Proceedings of
the Seventh Eurographics Workshop on Computer Animation and Simulation, Aug. 31st - Sept. 2nd, in
Poitiers, France, 1996


35- Mohamed A. Sameh, "A Robust Vision System for three Dimensional Facial Shape Acquisition,
Recognition, and Understanding", Proceedings of the 1st Golden West International Conference on
Intelligent Systems, Reno, Nevada, 1991


36- Mohamed A. Sameh, "A Neural Trees Architecture for Fast Control of Motion", Proceedings of the
FLAIRS Artificial Intelligence Conference, Cocoa Beach, Florida, 1991


37- Mohamed A. Sameh, Armstrong W.W., "Towards a Computational Theory for Motion Understanding:
The Expert Animator Model", Proceedings of the 4th International Conference on Artificial
Intelligence for Space Applications, NASA, Huntsville, Alabama, 1988


CV of Gregory Shapiro:

Gregory Piatetsky-Shapiro, Ph.D. is the President of KDnuggets, which provides research and
consulting services in the areas of data mining, knowledge discovery, bioinformatics, and business
analytics. Previously, he led data mining and consulting groups at GTE Laboratories, Knowledge
Stream Partners, and Xchange. He has extensive experience developing CRM, customer attrition,
cross-sell, segmentation and other models for some of the leading banks, insurance companies, and
telcos. He also worked on clinical trial, microarray, and proteomic data analysis for several
leading biotech and pharmaceutical companies.

Gregory served as an expert witness and provided expert opinions in several cases.




Gregory is also the Editor and Publisher of KDnuggets News, the leading newsletter on data mining
and knowledge discovery (published since 1993), and of the KDnuggets.com website (published since
1997), the data mining community's top resource for data mining and analytics software, jobs,
solutions, courses, companies, publications, and more. From 1994 to 1997, while at GTE
Laboratories, he published the Knowledge Discovery Nuggets website, an earlier version of KDnuggets.

Gregory is the founder of the Knowledge Discovery in Databases (KDD) conferences. He organized and
chaired the first three Knowledge Discovery in Databases workshops in 1989, 1991, and 1993, and
then chaired the KDD Steering Committee until 1998, when he co-founded ACM SIGKDD, the leading
professional organization for Knowledge Discovery and Data Mining. He served as Director
(1998-2005) and was elected SIGKDD Chair (2005-2009 term).

Gregory has over 60 publications, including 2 best-selling books and several edited collections on
topics related to data mining and knowledge discovery, including SIGKDD Explorations Special Issue
on Microarray Data Mining (Vol 5, Issue 2, Dec 2003).

Gregory received the ACM SIGKDD Service Award (2000) and the IEEE ICDM Outstanding Service Award
(2007) for contributions to the data mining field and community.

Publication Record:



Data Mining and Knowledge Discovery - 1996 to 2005: Overcoming the Hype and moving from
"University" to "Business" and "Analytics", Gregory Piatetsky-Shapiro, Data Mining and Knowledge
Discovery journal, 2007.



What Are The Grand Challenges for Data Mining? KDD-2006 Panel Report, Gregory Piatetsky-Shapiro,
Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Zaki, KDD-06 Panel Report,
SIGKDD Explorations, 8(2), Dec 2006.



10 Challenging Problems in Data Mining Research, Qiang Yang, Xindong Wu, Pedro Domingos, Charles
Elkan, Johannes Gehrke, Jiawei Han, David Heckerman, Daniel Keim, Jiming Liu, David Madigan,
Gregory Piatetsky-Shapiro, Vijay V. Raghavan, Rajeev Rastogi, Salvatore J. Stolfo, Alexander
Tuzhilin, and Benjamin W. Wah, Spec. Issue of International Journal of Information Technology &
Decision Making, Vol. 5, No. 4 (2006).



On Feature Selection through Clustering, R. Butterworth, G. Piatetsky-Shapiro, Dan A. Simovici,
Proceedings of IEEE ICDM-2005 Conference, Nov 2005.



A Comprehensive Microarray Data Generator to Map the Space of Classification and Clustering
Methods, Piatetsky-Shapiro, Gregory, and Grinstein, Georges G., Tech. Report No. 2004-016, U.
Massachusetts Lowell, 2004.



Microarray Data Mining: Facing the Challenges, Gregory Piatetsky-Shapiro and Pablo Tamayo, SIGKDD
Explorations, Dec 2003.



Capturing Best Practice for Microarray Gene Expression Data Analysis, G. Piatetsky-Shapiro, T.
Khabaza, S. Ramaswamy, in Proceedings of KDD-2003 (ACM Conference on Knowledge Discovery and Data
Mining), Washington, D.C., 2003. (Honorable mention for best application paper).



Measuring Real-Time Predictive Models (poster presentation), S. Steingold, R. Wherry, G.
Piatetsky-Shapiro, in Proceedings of IEEE ICDM-2001 Conference, San Jose, CA, Nov 2001.



Measuring Lift Quality in Database Marketing, G. Piatetsky-Shapiro and S. Steingold, SIGKDD
Explorations, Dec 2000.



Knowledge Discovery in Databases: 10 years after, Gregory Piatetsky-Shapiro, SIGKDD Explorations,
Vol 1, No 2, Feb 2000.






Expert Opinion: The data-mining industry coming of age, Gregory Piatetsky-Shapiro, IEEE Intelligent
Systems, Vol. 14, No. 6, November/December 1999.



Estimating Campaign Benefits and Modeling Lift, Gregory Piatetsky-Shapiro and Brij Masand,
Proceedings of KDD-99 Conference, ACM Press, 1999.



Knowledge Discovery and Acquisition from Imperfect Information, G. Piatetsky-Shapiro, chapter in A.
Motro and P. Smets, eds., Uncertainty in Information Management, Kluwer, 1997.



From Data Mining to Knowledge Discovery in Databases, Usama Fayyad, Gregory Piatetsky-Shapiro, and
Padhraic Smyth. AI Magazine 17(3): Fall 1996, 37-54



Mining Business Databases, Ron Brachman, Tom Khabaza, Willi Kloesgen, Gregory Piatetsky-Shapiro,
and Evangelos Simoudis, Communications of ACM, 39:11, November 1996.



Data Mining and Knowledge Discovery in Databases: An overview, Usama M. Fayyad, Gregory
Piatetsky-Shapiro, Padhraic Smyth, Communications of ACM, 39:11, November 1996.



Improving Classification Accuracy by Automatic Generation of Derived Fields Using Genetic
Programming, B. Masand and G. Piatetsky-Shapiro, in Advances in Genetic Programming II, MIT Press,
1996.



An Overview of Issues in Developing Industrial Data Mining and Knowledge Discovery Applications,
Gregory Piatetsky-Shapiro, Ron Brachman, Tom Khabaza, Willi Kloesgen, and Evangelos Simoudis, in
KDD-96 Conference Proceedings, ed. E. Simoudis, J. Han, and U. Fayyad, AAAI Press, 1996.



A Comparison of Approaches For Maximizing Business Payoff of Prediction Models, Brij Masand and
Gregory Piatetsky-Shapiro, in KDD-96 Conference Proceedings, ed. E. Simoudis, J. Han, and U.
Fayyad, AAAI Press, 1996.



Knowledge Discovery and Data Mining: Towards a Unifying Framework, Usama Fayyad, Gregory
Piatetsky-Shapiro, and Padhraic Smyth in KDD-96 Conference Proceedings, ed. E. Simoudis, J. Han,
and U. Fayyad, AAAI Press, 1996.



From Data Mining to Knowledge Discovery: an Overview, U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, in
Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, 1996.



Selecting and Reporting What is Interesting: The KEFIR Application to Healthcare Data, C. Matheus,
G. Piatetsky-Shapiro, and D. McNeill, in Advances in Knowledge Discovery and Data Mining, AAAI/MIT
Press, 1996.



Knowledge Discovery in Personal Data vs. Privacy, G. Piatetsky-Shapiro, IEEE Expert, April 1995



KDD-93: Progress and Challenges in Knowledge Discovery in Databases, G. Piatetsky-Shapiro, C.
Matheus, P. Smyth, R. Uthurusamy, AI Magazine, 15(3): Fall 1994, 77-82.



The Interestingness of Deviations, G. Piatetsky-Shapiro, C. Matheus, in Proceedings of KDD-94
workshop, AAAI Press, 1994.



Systems for Knowledge Discovery in Databases, C. Matheus, P. Chan, G. Piatetsky-Shapiro, IEEE
Transactions on Knowledge and Data Engineering, 5(6), Dec. 1993.



Measuring Data Dependencies, G. Piatetsky-Shapiro and C. Matheus, in Proceedings of AAAI-93
Workshop on KDD, AAAI Press Report WS-02.

"Knowledge Discovery in Databases - An Overview", W. Frawley, G. Piatetsky-Shapiro, C. Matheus, in
Knowledge Discovery in Databases 1991, pp. 1-30. Reprinted in AI Magazine, Fall 1992.



Knowledge Discovery Workbench for Exploring Business Databases, G. Piatetsky-Shapiro and C.
Matheus, in Int. J. of Intelligent Systems, 7(7), Sep 1992.



Report on AAAI91 workshop on Knowledge Discovery in Databases, G. Piatetsky-Shapiro, IEEE Expert,
Fall 1991

"Discovery, Analysis, and Presentation of Strong Rules", G. Piatetsky-Shapiro (in Knowledge
Discovery in Databases 1991), pp. 229-248.



Knowledge Discovery in Real Databases: A workshop report, AI Magazine, vol. 11, no. 5, January
1991.

Books and Proceedings



ACM TKDD Special Issue on Knowledge Discovery for Web Intelligence, Guest Editors: Ning Zhong,
Gregory Piatetsky-Shapiro, Yiyu Yao, Philip S. Yu, Dec 2010.



SIGKDD Explorations Special Issue on Microarray Data Mining, Vol. 5, Issue 2, Dec 2003.



R. Agrawal, P. Stolorz, and G. Piatetsky-Shapiro, eds., Proceedings of KDD-98: 4th International
Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1998.



U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy, eds., Advances in Knowledge Discovery
and Data Mining, AAAI/MIT Press, 1996.



Mini-symposium on KDD vs. Privacy, IEEE Expert, April 1995.



Special issue of J. of Intelligent Information Systems on Knowledge Discovery in Databases, ed. G.
Piatetsky-Shapiro, 4(1), Jan 1995.



KDD-93: Proceedings of AAAI-93 Workshop on KDD, ed. G. Piatetsky-Shapiro, AAAI Press Report WS-02,
1993.



Gregory Piatetsky-Shapiro and William Frawley, eds., Knowledge Discovery in Databases, AAAI/MIT
Press, 1991











Appendix B: Evaluations and Approvals



COLLEGE REVIEW COMMITTEE

Evaluation and Recommendation


Item/Evaluation                          Excellent   Very Good   Good   Weak
Research methodology
Research objectives
Research originality
Research contribution
Research applicability and relevance
Overall evaluation

Recommendations of College Committee

Approved          Disapproved

Amount of Budget Approved by College Committee:            (SAR)

Chair College Committee - Title and Full Name:

Signature:                    Date:    /    /

Recommendations of the College Council

Approved          Disapproved


Dean of the College Council - Title and Full Name:



Signature:                    Date:    /    /




PSU INSTITUTIONAL RESEARCH COMMITTEE (IRC)

Recommendation



Recommendation of the PSU IRC

Approved          Disapproved

Chair IRC Committee - Title and Full Name:

Signature:                    Date:    /    /








PSU EXTERNAL REVIEW PANEL FOR RESEARCH PROPOSALS

Recommendation


Recommendation of the External Review Committee.



Approved:          Amount of grant approved:            (SAR)




Disapproved:





Postponed:




Directed to:


Chair of External Review Panel - Title and Full Name:





Signature:                    Date:    /    /

Recommendation of University Council








Approved          Disapproved

Signature:                    Date:    /    /