"Use of data to support medical decisions has existed for centuries. John Snow, considered to be the
father of modern epidemiology, used maps with early forms of bar graphs in 1854 to discover the
source of cholera and prove that it was transmitted through the water supply" (Tufte 1997).
"Florence Nightingale invented polar area diagrams in 1855 to show that many army deaths could be
traced to unsanitary clinical practises and were therefore preventable. She used the diagrams to
implement reforms that eventually reduced the number of deaths". (Audain
Today, the volume of data and the size of populations make it impractical to search the data
manually for patterns. There is also a growing need to estimate quickly, and in some cases even
predict, whether a disease could become a pandemic, so that timely action for detection and
management can be taken.
For pandemic detection and management, a number of techniques and applications have been
developed. Kellogg et al. (2006) described techniques based on spatial modeling, simulation and
spatial data mining to find characteristics of disease outbreaks. The WSARE algorithm (What's
Strange About Recent Events) examines historical and current data to find events that may be out of
the ordinary, for example in the number of hospital visits in a certain area. WSARE is based on
association rules and Bayesian networks.
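
To make the idea of rule-based outbreak screening concrete, the following is a minimal sketch and
not the published WSARE algorithm. It assumes each record is a small dictionary of attributes,
scores every single-attribute rule with a one-sided Fisher exact test against a fixed baseline
window, and leaves out the Bayesian network baseline and any correction for testing many rules at
once. The function names, field names and the 0.05 threshold are illustrative assumptions.

```python
from collections import Counter
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are (recent, baseline); columns are (matches rule, does not match).
    Returns the hypergeometric probability of seeing at least `a` matches
    among the recent records if recent and baseline behave the same.
    """
    recent_total = a + b
    match_total = a + c
    n = a + b + c + d
    upper = min(recent_total, match_total)
    numerator = sum(
        comb(recent_total, k) * comb(n - recent_total, match_total - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n, match_total)


def strange_rules(recent, baseline, alpha=0.05):
    """Return (rule, p_value) pairs that are over-represented in `recent`.

    A rule is a single (attribute, value) pair, e.g. ("symptom", "respiratory").
    No multiple-testing correction is applied (a simplifying assumption).
    """
    recent_counts = Counter((k, v) for rec in recent for k, v in rec.items())
    baseline_counts = Counter((k, v) for rec in baseline for k, v in rec.items())
    findings = []
    for rule, a in recent_counts.items():
        c = baseline_counts.get(rule, 0)
        p = fisher_exact_greater(a, len(recent) - a, c, len(baseline) - c)
        if p < alpha:
            findings.append((rule, p))
    return sorted(findings, key=lambda item: item[1])


# Hypothetical hospital-visit records, invented for illustration only.
baseline_visits = (
    [{"symptom": "respiratory", "area": "north"}] * 10
    + [{"symptom": "rash", "area": "south"}] * 90
)
recent_visits = (
    [{"symptom": "respiratory", "area": "north"}] * 8
    + [{"symptom": "rash", "area": "south"}] * 2
)

for rule, p_value in strange_rules(recent_visits, baseline_visits):
    print(rule, p_value)
```

In this invented data set, respiratory visits from the north area make up most of the recent
records but only a small share of the baseline, so those two rules are reported while the rash and
south rules are not.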
But data mining in health care faces a number of challenges. The biggest one is that although a
large amount of data is available, there is no standard format for this data. The same test may
have different names in different hospitals, and the same diagnosis may be recorded differently by
different doctors. Also, two different diseases can have similar symptoms. For example, influenza
and anthrax share the same respiratory symptoms, so a system can mistake one for the other in a
given data set. There is also the issue of patient confidentiality when accessing medical records.
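
The naming problem described above is usually handled by mapping local labels to one canonical
label before any mining takes place. The sketch below shows that step under simple assumptions:
the synonym table, the canonical label and the record fields are hypothetical, and a real system
would need a far larger, curated vocabulary.

```python
# Hypothetical synonym table: local test names mapped to one canonical label.
CANONICAL_TEST_NAMES = {
    "cbc": "complete_blood_count",
    "complete blood count": "complete_blood_count",
    "fbc": "complete_blood_count",
    "full blood count": "complete_blood_count",
}


def normalize_test_name(raw_name):
    """Return the canonical label for a local test name, or None if unmapped."""
    # Lower-case and collapse whitespace so "  Full  Blood Count " still matches.
    key = " ".join(raw_name.lower().split())
    return CANONICAL_TEST_NAMES.get(key)


def harmonize(records):
    """Split records into (mapped, unmapped) by whether the test name is known.

    Unmapped records are kept aside for manual review rather than guessed at.
    """
    mapped, unmapped = [], []
    for record in records:
        canonical = normalize_test_name(record["test"])
        if canonical is None:
            unmapped.append(record)
        else:
            mapped.append({**record, "test": canonical})
    return mapped, unmapped


# Invented records from two hypothetical hospitals.
records = [
    {"hospital": "A", "test": "CBC"},
    {"hospital": "B", "test": "Full  Blood Count"},
    {"hospital": "B", "test": "unknown panel"},
]
mapped, unmapped = harmonize(records)
print(mapped)
print(unmapped)
```

Keeping unmapped records in a separate list, instead of dropping or guessing them, makes gaps in
the synonym table visible to whoever maintains it.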