Data Mining and Information Security














Abstract

According to MIT Technology Review, data mining is one of the ten sectors that will change the world in the future. Large companies such as Oracle and IBM have recently entered this sector by supplying software and models for data mining, and other companies, such as Cisco, are interested in the security of data mining. What makes all these companies interested in data mining? What is behind the large profits that data mining companies earn? Many standards and rules have recently been added to help improve information security. These standards are defined and controlled by strong organizations, and sometimes by governments. One example is ISO 27001, the standard of the International Organization for Standardization (ISO) for managing information security.


In this paper, we try to link two important aspects of data: its security and its extraction, known as data mining. Data mining techniques have emerged alongside the huge databases in use today. The size of these data warehouses increases the risk of losing or damaging them, which creates the need for stronger security management to guarantee the reliability, privacy, and integrity of the data. Information security is needed by all organizations and businesses, and by individuals as well. We try to clarify, as far as possible, the relation between data mining and information security.

In this research we focus on the security side of data mining.


I. Introduction



We discuss a powerful new technology that helps firms and companies focus on the important information in their data warehouses. This technology is data mining, the extraction of information from large data sets. The future of data mining is bright: the field is growing very fast and now reaches into web and text mining, and much recent research serves its future development. Data mining allows businesses to make proactive, knowledge-driven decisions through tools that predict future trends and behaviors. Data mining tools help find predictive information that experts may miss because it lies outside their expectations.



Data mining techniques can be incorporated into new products and systems as they are brought online, and they can be implemented quickly on existing software and hardware platforms to increase the value of existing information resources.



Information security is an old concept that was already in use during the Second World War, but it has become a large sector because of the technological revolution. Information security reduces risks not only for individuals but also for organizations, businesses and, most importantly, governments. When we talk about information security, we are talking about one of the most important issues in data mining. It is a hard, complicated, and long-term undertaking. Information security can never be fully achieved; there is always some risk, and the goal is to reduce it as much as possible.



We first explain data mining, mention the most common techniques, and describe the data warehouse. We then discuss information security, and finally we move to the relation between data mining and information security.



II. Data Mining

a. What is data mining?

Data mining is the science of extracting useful information from large data sets or databases. It is a new discipline that lies at the intersection of machine learning, statistics, databases and data management, artificial intelligence, pattern recognition, and other areas. [1]


b. Data warehouse

"A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process." [2]

"Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject." [2]

Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.

Reham Jarman (1), Barea Alsa'awi (2), Maha Alazizy (3)
Computer Science Dept., Prince Sultan University, Riyadh, Saudi Arabia
(1) reham.jarman@hotmail.com
(2) e4_bebo0o@hotmail.com
(3) maha_alazizy@yahoo.com
























Fig 1: A data warehouse example


Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, or 12 months ago, or even older data, from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold the most recent address of a customer, whereas a data warehouse can hold all addresses associated with a customer.

Non-volatile: Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered. [2]
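The contrast between a transaction system and a data warehouse can be sketched in a few lines. This is only an illustration: the two stores are plain Python containers, and the function names, customer number, and addresses are hypothetical rather than taken from [2].

```python
from datetime import date

# Transaction system: one current address per customer, overwritten on change.
transactional = {}

# Data warehouse: append-only history, one row per (customer, address, date).
warehouse = []

def record_address(customer_id, address, effective):
    """Apply an address change to both stores."""
    transactional[customer_id] = address                 # the old value is lost
    warehouse.append((customer_id, address, effective))  # old rows stay untouched

def address_history(customer_id):
    """Return every address ever loaded for a customer, oldest first."""
    rows = [r for r in warehouse if r[0] == customer_id]
    return [address for _, address, _ in sorted(rows, key=lambda r: r[2])]

record_address(42, "12 King Fahd Rd", date(2010, 1, 5))
record_address(42, "7 Olaya St", date(2010, 9, 1))

print(transactional[42])    # only the latest address
print(address_history(42))  # both addresses, in order
```

The warehouse list is never updated in place, which mirrors the non-volatile property, while the dated rows mirror the time-variant property.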


c. Data Mining Techniques

We describe some of the most common data mining algorithms in use today. The techniques are divided into two groups [3]:

Classical Techniques:
o Statistics
o Neighborhoods
o Clustering

Next Generation Techniques:
o Decision Trees
o Neural Networks
o Rules


First: Classical techniques.

The classical techniques have been in use for decades. The descriptions below should help the reader understand the rough differences between the techniques and provide enough background not to be baffled by the vendors of different data mining tools.

1. Statistics

By strict definition, "statistics" or statistical techniques are not data mining. They were in use long before the term data mining was coined to apply to business applications. However, statistical techniques are driven by the data and are used to discover patterns and build predictive models. From the user's perspective, solving a "data mining" problem involves a conscious choice between attacking it with statistical methods and attacking it with other data mining techniques. For this reason it is important to have some idea of how statistical techniques work and how they can be applied. [3]


Regression is one of the oldest and best-known statistical techniques used in data mining; it expresses the relationship between variables as a function. The simplest form is linear regression, which predicts the value of one variable from the value of another. More advanced techniques, such as multiple regression, handle more complex relationships with several predictors. Successful data mining still requires skilled technical and analytical specialists who can structure the analysis and interpret the output. [4]
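As a minimal sketch of simple linear regression, the following fits a straight line by ordinary least squares and uses it for a prediction. The function name and the spend/sales figures are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend versus units sold.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, sales)
predicted = slope * 5.0 + intercept  # prediction for a spend of 5.0
print(slope, intercept, predicted)
```

Multiple regression follows the same idea with several predictor variables, which in practice is usually delegated to a statistics library.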


2. Neighborhoods

Clustering and the nearest neighbor prediction technique are among the oldest techniques used in data mining. Most people have an intuitive understanding of clustering: like records are grouped, or clustered, together. Nearest neighbor is a prediction technique that is quite similar to clustering. Its essence is that, to predict the value for one record, we look for records with similar predictor values in the historical database and use the prediction value from the record that is nearest to the unclassified record. [3]
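The nearest neighbor idea can be sketched as follows, assuming numeric predictors and Euclidean distance (the text does not fix a distance measure, so this choice is an assumption). The historical records and the function name are hypothetical.

```python
import math

def nearest_neighbor_predict(history, query):
    """history: list of (predictors, outcome). Return the outcome of the closest record."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, outcome = min(history, key=lambda rec: distance(rec[0], query))
    return outcome

# Hypothetical records: (age, income in thousands) -> response to an offer.
history = [((25, 30), "no"), ((47, 85), "yes"), ((52, 90), "yes"), ((23, 28), "no")]

print(nearest_neighbor_predict(history, (50, 88)))
```

In practice the predictors would be scaled to comparable ranges first, and several neighbors (k > 1) are often combined by voting.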

3. Clustering

"Clustering is a data mining (machine learning) technique used to place data elements into related groups without advance knowledge of the group definitions. Popular clustering techniques include k-means clustering and expectation maximization (EM) clustering." [5]

Another definition: a grouping of a number of similar things, such as a bunch of trees or a cluster of admirers.
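To make k-means concrete, here is a small sketch on two-dimensional points. It assumes squared Euclidean distance and, to keep the result deterministic, seeds the centroids with the first k points; real implementations usually pick random or smarter starting centroids. The sample points are hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its closest centroid.
            idx = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                              + (p[1] - centroids[i][1]) ** 2)
            clusters[idx].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return centroids, clusters

points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

No group definitions are supplied in advance; the two groups emerge only from the distances between the points.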







Second: Next Generation Techniques.

The next generation techniques are the most widely used of the techniques developed over the last two decades of research. They can be used either for building predictive models or for discovering new information within large databases.

1. Decision Trees

"Decision tree structure and nodes vary depending on the object of data mining and on the structure of information you possess." [5] Fig 2 shows an example.

Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).
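The core step of a CART-style tree is choosing the split that makes the resulting groups as pure as possible. The sketch below scores candidate thresholds on one numeric attribute with Gini impurity; it is a single split, not a full tree builder, and the loan data and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(values, labels):
    """Try each observed value as a 'value <= t' split; return (t, weighted impurity)."""
    best = None
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Hypothetical loan data: income (thousands) and whether the loan defaulted.
income = [20, 25, 30, 60, 70, 80]
default = ["yes", "yes", "yes", "no", "no", "no"]

print(best_threshold(income, default))
```

A full tree repeats this search on each resulting group until the groups are pure or too small to split further.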













Fig 2: An example for a Decision Tree.

http://www.cs.odu.edu/~toida/nerzic/390teched/computability/complexity.htm














Fig 3: A simplified view of a neural network for prediction of loan
default.



2. Neural Networks

"To be more precise with the term "neural network" one might better speak of an "artificial neural network". True neural networks are biological systems (a.k.a. brains) that detect patterns, make predictions and learn. The artificial ones are computer programs implementing sophisticated pattern detection and machine learning algorithms on a computer to build predictive models from large historical databases. Artificial neural networks derive their name from their historical development which started off with the premise that machines could be made to "think" if scientists found ways to mimic the structure and functioning of the human brain on the computer. Thus historically neural networks grew out of the community of Artificial Intelligence rather than from the discipline of statistics. Despite the fact that scientists are still far from understanding the human brain let alone mimicking it, neural networks that run on computers can do some of the things that people can do." [3]

Fig 3 shows a simplified view of a neural network.
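A forward pass through a very small artificial neural network can be sketched as follows. The layout (two inputs, two hidden nodes, one output, sigmoid activations, no bias terms) and the hand-picked weights are assumptions for illustration only; they are not read from Fig 3, and a real network learns its weights from historical data.

```python
import math

def sigmoid(z):
    """Squash any real number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, output_weights):
    """One hidden layer, no bias terms; returns a score between 0 and 1."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

# Hypothetical weights: one row per hidden node, one weight per input.
hidden_weights = [[0.5, -1.0], [-0.5, 1.0]]
output_weights = [1.0, -1.0]

# Inputs scaled to 0..1, for example (age, income).
score = forward([0.47, 0.65], hidden_weights, output_weights)
print(score)
```

The output can be read as a score for the predicted event, such as loan default, once the weights have been trained.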

3. Rules

Rule induction finds frequent patterns, associations, correlations, or causal structures among sets of items or objects in transactional databases, relational databases, and other information repositories. [6]
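Association rules are usually judged by their support and confidence. The sketch below computes both measures for a single rule over a handful of hypothetical shopping baskets; a full rule miner such as Apriori would search for all rules above chosen thresholds.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical baskets from a transactional database.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]

print(support(baskets, ["bread", "milk"]))
print(confidence(baskets, ["bread"], ["milk"]))
```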



































Fig 4: Data Mining Process

http://msdn.microsoft.com/en-us/library/ms174949.aspx




d. Data mining process

Data preparation comes before the algorithms, because the data must be brought into a form suitable for pattern identification. The process consists of six phases, as shown in Fig 4:

o Define the problem: define the variables, objectives, and requirements, then translate them into a problem definition.
o Prepare the data: construct the final data set, which should be clean (error-free) and consistently formatted. The major tasks in this phase are selecting tables, records, and attributes, and transforming the data for the next phase.
o Explore the data: collect and describe the data. Statistics are used in this phase.
o Build models: select a model and apply functions such as association, classification, and clustering. Different functions can be used for the same data type; some functions can only be used for specific data types.
o Evaluate the model: if the model does not satisfy expectations, it is rebuilt until it achieves the objectives.
o Deploy the results: present them as a simple report or as a complex database.
[7]


e. What can data mining do?

With data mining, a retailer can use point-of-sale records of customer purchases to send targeted promotions based on an individual's purchase history. By mining demographic data from comment or warranty cards, the retailer can develop products and promotions that appeal to specific customer segments.

Today, companies with a strong retail, communication, financial, or marketing focus use data mining. Data mining enables these companies to determine the impact on sales, customer satisfaction, and profits. It also makes it easier for them to determine relationships among factors such as product, price, staff skills, customer demographics, economic indicators, and positioning. Finally, data mining makes it easy to drill down from summary information to detailed transactional data. [8]


The following examples show how companies use data mining. First, American Express can suggest products to its cardholders based on an analysis of their monthly expenditures. Second, Blockbuster Entertainment mines its video rental history database to recommend rentals to individual customers. Third, Wal-Mart has over 2,900 stores in 6 countries and transmits data from these stores to its 7.5-terabyte data warehouse. It allows more than 3,500 suppliers to access the data and perform analyses. The suppliers use this information to manage local store inventory and identify new opportunities. [8]

III. Information Security

In the past, people carried their money, gold, and silver with them, with a high chance of losing them. They then realized that they needed a safe place for their valuables so they could avoid carrying them. Banks began operating by guaranteeing the security of their customers' savings. We are not straying from our topic here; we are trying to show its importance. Today, the information in a data warehouse can be much more important than savings in a bank, and transferring information needs to be as secure as transferring savings. Companies pay large amounts of money to make their data as secure, confidential, and available as possible.


















Fig 5: Governments Security Classification Cost 2009

http://www.govinfosecurity.com/articles



Fig 5 shows that the US government spends more on information security than on other security matters. Everyone is concerned about the security of his or her information. We need information security to minimize breaches and related crimes, even though it cannot end them.


a. History

During World War II, armies and governments needed to prevent information leaks. They focused on developing new technologies to help hide highly secret information. Cryptography, the study of hiding information, is one of the most popular and powerful of these techniques and is still in use today. "The US department of Defense and the Department of State improve this technique since the 1970s with expertise in cryptography." [9]



Encryption was once used only by governments, but it is now used by organizations and individuals as well. It is easy to encrypt an email so that no one other than the receiver can read it in transit. Information security has become an ongoing learning process in a large field that includes techniques, algorithms, and open issues. Cloud computing, for instance, is used to share and store information easily and safely on servers. Many sectors, such as business and healthcare, take information security very seriously. As the world grows more concerned about data security, governments and organizations add new principles and strict laws to guarantee it. The ISO27K standards were established by ISO (International Organization for Standardization) to protect the information on which we all depend. Although such laws exist, computer crime is increasing; raising people's awareness of how to avoid information security problems may increase the security of their information.


b. Definition

There is no universal definition of information security, but we can describe it as the process of protecting data by granting authorization to view and use particular data. To understand information security we need to understand its three aspects: confidentiality, integrity, and availability.

First, the data must be confidential: every user's information in a system is kept at a high level of privacy, and no one can reach it without his or her permission.



Passwords and user IDs help with this, but confidentiality cannot be achieved by the system, in other words the DBMS (database management system), alone. Consider a person who saves sensitive company information without any access control (anyone who holds the file can read it) on a USB drive, and one day the drive is stolen. Another example is someone who owns a credit card and sets its password to all zeros or to his birth date. In both cases the system provided a privacy option, but the user did not use it properly. Now consider a more complex situation: a company with a very large database of customer information. Hiding all the data is not a good idea, because users want to access as much data as possible with few constraints, and it is difficult for the security system to know which data is sensitive and which is not. Precision is an approach whose goal is to release as much non-sensitive data as possible while protecting the sensitive data.

The second aspect is integrity: the data must be reliable and consistent with the intended data, so that data loss and inconsistencies are minimized; information should not be changed or removed arbitrarily. "A successful attack can happen when integrity is violated first then the system availability or confidentiality" [10]. The DBMS supports this aspect by analyzing and reducing the failures that could occur. Because such failures are common and reconstruction is costly, integrity is very important for organizations.
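One common way to detect an integrity violation is to store a cryptographic digest of each record and compare it later. The sketch below uses SHA-256 from Python's standard hashlib module; the record format is hypothetical, and the approach only helps if the stored digest itself is kept where an attacker cannot change it (a keyed construction such as HMAC is the usual choice when that cannot be assumed).

```python
import hashlib

def fingerprint(record: bytes) -> str:
    """SHA-256 digest of a stored record."""
    return hashlib.sha256(record).hexdigest()

# Hypothetical record and its digest, computed when the record was stored.
original = b"customer=42;balance=1000"
stored_digest = fingerprint(original)

# A later, unauthorized change to the record.
tampered = b"customer=42;balance=9000"

print(fingerprint(original) == stored_digest)  # unchanged record matches
print(fingerprint(tampered) == stored_digest)  # modified record does not
```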



Last but not least, the availability aspect covers the sharing of information. A system whose controlling, storing, and communicating processes work correctly serves the availability aspect.


c. Risk Management

In the context of data, risk management refers to the guidelines used to reduce security risks to an acceptable level. This is done by identifying the weaknesses in the security system that give rise to threats. A security system needs risk management in order to deliver real security value; in other words, risk management provides a backup plan for when something goes wrong. It is not limited to the security issue itself. It extends to managing the operational and economic costs of establishing a high level of protection for the IT systems and data that support an organization. Some impacts cannot be measured in specific units but can be described as high, medium, or low, for instance loss of public confidence or loss of credibility. In this research, we are concerned only with information security risk management, not with business risk management.
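Qualitative ratings of this kind are often combined in a small risk matrix. The sketch below is one possible scheme and not a rule taken from any standard cited here: the multiplication of ranks and the cut-off values are illustrative assumptions.

```python
LEVELS = ["low", "medium", "high"]

def risk_level(likelihood, impact):
    """Combine two qualitative ratings; the scoring rule is an illustrative assumption."""
    score = (LEVELS.index(likelihood) + 1) * (LEVELS.index(impact) + 1)  # 1..9
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"

# Hypothetical assessment: loss of public confidence after a data breach.
print(risk_level("medium", "high"))
```

An organization would tune both the scale and the cut-offs to the level of risk it is prepared to accept.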




To manage risk in information security, we must first identify the factors that could affect it:

o Hardware
o Software
o People who use the system
o Sensitive data
o System interfaces
o Critical


"A threat is a circumstance or event with a harm effect to
an information system ".
Threat sources are common. They can be human threats, caused by people such as hackers, or environmental (physical) threats, such as a power failure. Also, some threats cause direct damage (primary threats), while others cause long-term damage (secondary threats).


d. RFID security

RFID refers to Radio Frequency Identification systems, a widely adopted technology for identifying objects and people that also offers security benefits. It works automatically over private networks and uses advanced technology to minimize failures and attacks.

RFID is widely used now because in almost all industries many things must be tracked, recorded, and identified easily and in a very short time. But can this technology put an end to hacking and leaks? Can people stop fearing for the security of their credit cards when they use it? As mentioned before, information security is an ongoing process, because there are always two groups of people working against each other: malicious people and well-intentioned people. A thief can steal a credit card from a wallet, but an electronic pickpocket with an RFID reader can steal the credit card information while the card is still in the wallet, without the owner even knowing. Unfortunately, this can put millions of people at risk. The electronic pickpocket uses RFID to scan a wallet or bag and immediately obtains the credit card information, such as the expiration date, number, and name. The risk is not limited to credit cards; it applies to anything that uses the technology, such as passports that contain RFID chips.


IV. Security Matter in Data Mining

Both data mining and information security have been the subject of much research in the last few years, and researchers suggest that raising security must be at the top of the list of data mining issues. Data mining techniques can be applied to handle security problems, but they can also cause other security problems. Data mining has become common in both the private and public sectors. Data mining is, in fact, a set of smart techniques for gathering and analyzing statistical information to help in decision making. Many organizations sell the data to other organizations, which use it for their own purposes. As a result, the privacy of individuals is affected without their consent.


a. Privacy Preserving Data Mining (PPDM)

Many institutions are spending more resources on developing their data mining skills and on looking for new research on data mining.

Privacy Preserving Data Mining (PPDM) is a new research area that helps researchers and practitioners identify data mining problems and solutions related to security concerns. Its aim is to secure information using different kinds of algorithms and techniques. Ignoring or limiting the need for information security can threaten to derail data mining projects. Privacy concerns have increased because of the misuse of information; PPDM aims to prevent this misuse and to ensure that no private data is revealed. Privacy preservation aims to ensure safe access to the data without requiring any privacy expertise from the data miner. Most research on privacy has focused on the theoretical properties of data mining. Recent studies focus on the use of privacy in practical applications such as banking, healthcare, and airlines.


PPDM deals with the problem of learning accurate models over aggregate data while protecting privacy at the level of individual records [9]. PPDM recognizes that individuals want more information security, which works against the knowledge discovery used for decision making. In short, there is a conflict between the privacy that individuals need and the analysis that organizations need. The question is: can we obtain accurate analysis without access to individuals' information?



Secure multiparty computation techniques allow servers to compute functions over their local data while ensuring that no server learns anything about the data of the other servers except the output of the function. The computation is considered secure if what a party observes during the run reveals nothing beyond that party's own input and output; this guarantees strong privacy.
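A classic small example of secure multiparty computation is a secure sum based on additive secret sharing. The sketch below simulates all parties inside one process, so it only illustrates the arithmetic; it assumes honest-but-curious parties that do not all collude against one victim, and the modulus and function names are illustrative choices.

```python
import random

MODULUS = 2**61 - 1  # all arithmetic is done modulo this value

def make_shares(secret, n_parties, rng):
    """Split `secret` into n random shares that sum to it modulo MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values, rng=None):
    """Each party shares its value; parties only ever see shares and partial sums."""
    rng = rng or random.SystemRandom()
    n = len(private_values)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [make_shares(v, n, rng) for v in private_values]
    # Party j adds up the shares it received and publishes only that partial sum.
    partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS

# Hypothetical private counts held by three servers.
print(secure_sum([120, 45, 300]))
```

Each individual share is a uniformly random number, so a server that sees one share of another server's value learns nothing about that value; only the final total is revealed.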



PPDM is not the only field that connects data mining with enhancing information security. Many articles, workshops, and research projects have been produced and used in sectors such as business, government, and healthcare. In short, PPDM is one field among many that address the same issue: security in data mining.

Some of the recent and most accessible research topics across these sectors are:

o Privacy and security when mining outsourced private data
o Privacy threats induced by data mining
o Data mining for anomaly detection
o Using data mining for intrusion detection and prevention
o Privacy-preserving link and social network analysis
o Security and privacy in spatio-temporal data mining



b. Security Classification for Information

It is important to know that not all information in a set requires the same level of protection. For instance, old information that has not been updated for a long time is usually no longer needed or no longer as private as it was. Data can be divided into classes according to the security level assigned to each class, as shown in Figure 6.





























Figure 6: The hierarchy of security classification among information

http://www.centos.org/docs/5/html/Deployment_Guide-en-US/sec-mls-ov.html






Classifying data according to its security level can help shape the data mining process, because it shows which data may be gathered and which may not, and it helps avoid using unneeded data such as old data. The company will also know which data may be sold and which may not. Handling noisy or incompatible data is an issue in data mining, and classifying information according to its security level can help reduce the problem. The information that requires protection should be described clearly according to the classification.

One reason for classifying data by security level is that assigning all data to a very high secrecy level would waste many resources.
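As a sketch of how classification can shape the mining process, the following filters a data set by security level before it is handed to a data miner. The level names, their ordering, and the record layout are illustrative assumptions and are not taken from Figure 6.

```python
# Ordered from least to most sensitive; the level names are illustrative.
CLASSIFICATION = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}

def minable(records, clearance):
    """Keep only records whose classification does not exceed the miner's clearance."""
    limit = CLASSIFICATION[clearance]
    return [r for r in records if CLASSIFICATION[r["level"]] <= limit]

# Hypothetical records, each tagged with a security level.
records = [
    {"id": 1, "level": "unclassified", "value": "store opening hours"},
    {"id": 2, "level": "confidential", "value": "purchase history"},
    {"id": 3, "level": "secret", "value": "credit card number"},
]

print([r["id"] for r in minable(records, "confidential")])
```

Only the records that pass the filter need the more expensive protection measures during mining, which is where the saving in resources comes from.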


c. Information Security in Data Mining

There is clearly a great need for learning and mining methods with sufficient privacy and security guarantees in fields that rely on decision making [11]. It is also important to develop mechanisms for processing data without compromising its privacy.


Differential privacy is a theory that serves both aspects at the same time: information privacy and data mining. Its aim is to give accurate answers to queries on statistical databases while minimizing the chances of identifying individual records. Data cleansing is a technique that identifies and removes suspicious data so that data mining works on the most effective and reliable data; the result is more secure information and more accurate analysis. Existing research efforts (Maletic and Marcus 2000; Orr 1998) suggest that the average error rate of a dataset in a data mining application is around 5%-10% [12].
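The standard building block of differential privacy is the Laplace mechanism, which adds calibrated random noise to a query answer. The sketch below applies it to a counting query; the epsilon value and the sample data are hypothetical, and Python's general-purpose random generator is used only for illustration, since a production system would need a carefully vetted noise source.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Noisy count; a counting query has sensitivity 1, so the scale is 1 / epsilon."""
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical statistical database: ages of seven individuals.
ages = [23, 35, 41, 29, 52, 47, 38]

# How many people are 40 or older? The released answer is close to 3, but noisy.
print(private_count(ages, lambda a: a >= 40, epsilon=0.5))
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon means a more accurate but less private answer. This is the balance between privacy and analysis that the theory makes explicit.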


Unfortunately, individuals are the victims, because they do not know what is happening behind their backs. Take social network databases as an example. Individuals share valuable information with each other, sometimes without meaning to. Some analysts then mine and analyze that information and sell it to other companies. The concern for the future is that if these companies keep tracing such data, privacy will become unattainable, because a person's data can appear in documents on other websites without his or her permission or knowledge. Spokeo is a website that aggregates and organizes people-related information from internet sources. It gives a comprehensive snapshot of people-related public data from the internet. A person can be found by name, phone number, username, email address, and even friends. Two points must be recognized about this website. First, it mines information: even though the information comes from public sources, gathering these sensitive data in one place makes them less secure and is intrusive. Second, the information may not be accurate.


Clickstream analysis is a technique that records what users click on while they browse the web. When someone browses a page, the URL of the page and the IP address of the user are saved on the web server. Clickstream data can be used to analyze the behavior of users or customers and how they interact with a particular website. In marketing, clickstream analysis can help companies choose the best website on which to publish their advertisements. They can also advertise by sending emails to the people who use that website most often. This is ideal for knowledge discovery but much less so for privacy. From the clickstream, a site can learn every page a user browsed and the exact time of each visit, and it can easily identify the user if the user has published some of his or her information. Some web providers have started to market these analyses and statistics. The practice is considered legal because the providers distribute only user behavior, in a way that helps businesses make decisions, and they are not allowed to hand over private information about users, such as names or IP addresses. Sometimes, however, such information is easy to obtain, because some people do not know what could happen if their information is published. Not all internet providers give their customers a description, or even a hint, of exactly what they do, especially when it comes to privacy.
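A clickstream summary that shares behavior without sharing raw IP addresses can be sketched as follows. The log format, salt, and function names are hypothetical. Note that salted hashing is only pseudonymization: the IPv4 address space is small, so the tokens can be reversed by brute force if the salt leaks.

```python
import hashlib
from collections import Counter

def pseudonymize(ip, salt):
    """Replace an IP address with a salted SHA-256 token (truncated for readability)."""
    return hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()[:12]

def summarize(log, salt):
    """log: list of (ip, url, timestamp). Return page counts and per-visitor paths."""
    page_views = Counter(url for _, url, _ in log)
    paths = {}
    for ip, url, ts in sorted(log, key=lambda entry: entry[2]):
        paths.setdefault(pseudonymize(ip, salt), []).append(url)
    return page_views, paths

# Hypothetical web server log entries.
log = [
    ("10.0.0.1", "/home", 1), ("10.0.0.2", "/home", 2),
    ("10.0.0.1", "/offers", 3), ("10.0.0.1", "/checkout", 4),
]

page_views, paths = summarize(log, salt="hypothetical-secret")
print(page_views)
print(paths)
```

The page counts support the marketing decisions described above, while the per-visitor paths show how much behavioral detail remains even after the addresses themselves are hidden.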


Google takes another view of customer privacy in relation to clickstream data: after a customer clears cookies and turns the cable modem off for a few minutes, the customer's IP address will be recognized as a new IP address.


Information security in healthcare is a good example of managing information security. Patient information must remain private and secure, because misuse, exposure, or loss of data may harm both individuals and organizations. To understand the security system, data miners should first understand the Generally Accepted System Security Principles (GASSP), published by the International Information Security Foundation and updated in 1997 [13]. Owners should provide a responsible and accountable system, and the security of information systems should be explicit. Security should be provided to a high standard for all users, without differentiation among them, and should respect the rights and interests of others. Systems should respond to breaches of and threats to the security of information and information systems. "Measures for the security of information systems should be coordinated and integrated with each other and with other measures, practices and procedures of the organization so as to create a coherent system of security" [14].



Dynamic Data Web technology was developed by the company Quiterian to enable multiple solutions to be developed in the business sector. Using Dynamic Data Web, companies can study their customers' behavior, identify the key factors of business success, and identify risks in order to make the best decisions; this is a continuous process. Dynamic Data Web is presented as the fastest and most powerful analytical business intelligence platform on the market. What makes it different is that it includes easy and powerful analytical techniques for big data. "It has very good security rules and personal data protection control (used in Police, Health or Banking)" [15]. Knowing that a company uses this kind of technology would make it more trustworthy. As a result, large companies such as Vodafone and TMB have started to use it.


V. Conclusion

In conclusion, data mining is the discipline of extracting helpful information from large data sets or databases. Technologies evolve every day, and more individuals, companies, and organizations are adopting them for convenience and to stay ahead of the competition. On the other hand, these technologies must offer a good level of security to guarantee the safety and reliability of the information they rely on. Information security is an old concept that was first used for military needs and was later needed by individuals and groups. Information security professionals always face new challenges, which push them to find the best (but never final) protection for particular information and to make backup plans. Information security has three aspects: confidentiality, integrity, and availability.

Much research on the security of information in data mining has been
carried out and adopted by big companies and universities. Protecting
the privacy of sensitive information used for data mining purposes is
a big issue discussed by researchers these days. Classifying
information by security level can provide more security for it. Some
organizations are mining individuals' information and selling it to
other companies. This becomes an ethical issue: companies gain more
profit and individuals become the victims. This might bring the era of
private information to an end.

Data mining could bring risks to the security of information and to
privacy, but researchers are developing new technologies and
algorithms to strike a balance between privacy on the individual's
side and data analysis on the organization's side.


References


[1]

Hand, David, Heikki Mannila, and Padhraic Smyth. Principles of Data
Mining. Library of Congress Cataloging-in-Publication Data, 2001.
Print. QA76.9.D343 H38 2001.

[2]

"Data Warehouse Definition - What Is a Data Warehouse." 1Keydata -
Home of Free Online Tutorials. Web. 04 Jan. 2011.
<http://www.1keydata.com/datawarehousing/data-warehouse-definition.html>.

[3]

Berson, Alex, Stephen Smith, and Kurt Thearling. Building Data Mining
Applications for CRM. McGraw-Hill Companies, December 22, 1999. Print.

[4]

Chapple, Mike. "Regression." About.com. About.com, 2007. Web.
Accessed 3 Dec. 2010.
<http://databases.about.com/od/datamining/g/regression.htm>.

[5]

Chapple, Mike. "Clustering (data Mining) Definition." About
Databases: Microsoft Access, SQL Server, Oracle and More! Web.
01 Jan. 2011.
<http://databases.about.com/od/datamining/g/clustering.htm>.

[6]

Kulkarni, Sushil. "Association Rules in Data Mining Ppt
Presentation." AuthorSTREAM Online PowerPoint Presentations and
Slideshow Sharing. Web. 04 Jan. 2011.
<http://www.authorstream.com/Presentation/sushiltry-108428-association-rules-data-mining-science-technology-ppt-powerpoint/>.

[7]

Andrea Andreescu, “Forecasting Corporate Earnings a Data Mining
Approach”. The Swedish School of Economics and Business
Administration, 2004.
<http://www.pafis.shh.fi/graduates/andand02.pdf>.

[8]

Palace, Bill. "Data Mining." Anderson. June 1996. Web. 14 Feb. 2011.
<http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/index.htm>.

[9]

Pfleeger, Charles P., and Shari Lawrence Pfleeger. "Elementary
Cryptography." Security in Computing. Third ed. New Jersey:
Prentice Hall, 2003. 35-91. Print.

[10]

Fraquad. "Privacy Preserving Data Mining." All About Education.
Inspire and Ignite, 20 Dec. 2009. Web. 6 Dec. 2010.
<http://www.inspirenignite.com/privcy-preserving-data-mining/>.

[11]

"Workshop on Privacy and Security Issues in Data Mining and Machine
Learning." ECML PKDD2010. ECML PKDD 2010. Web.
<http://fias.uni-frankfurt.de/~dimitrakakis/workshops/psdml-2010/>.

[12]

Marcus, Andarian, and Jonathan Maletic. "Data Cleansing." Data
Mining and Knowledge Discovery Handbook. New York: Springer, 2005.
50-55. Print.

[13]

Ralph Spencer Poore, International Information Security Foundation,
“Generally Accepted System Security Principle” 1999. Web.
<http://www.infosectoday.com/Articles/gassp.pdf>.

[14]

"Quiterian Data Mining Y Análisis Predictivo Para Usuarios De
Negocio." Quiterian - Dynamic Data Web - Análisis Dinámico De Datos -
HOME. 10 Jan. 2011. Web. Accessed 14 Jan. 2011.
<http://www.quiterian.com/site/index.php>.

[15]

Ted Cooper and Jeff Collman. Managing Information Security and
Privacy in Healthcare. Department of Ophthalmology, Stanford
University Medical School, Palo Alto, California; ISIS Center,
Georgetown University School of Medicine; Department of Radiology,
Georgetown University Medical Center, Washington D.C., 2005. Web.
<http://ai.arizona.edu/mis596a/book_chapters/medinfo/Chapter_04.pdf>.

[16]

"Confidentiality, Integrity, Availability (CIA) - Privacy / Data
Protection Project (c)2002-2005." Privacy / Data Protection Project.
University of Miami, 24 Apr. 2006. Web. Accessed 10 Dec. 2010.
<http://privacy.med.miami.edu/glossary/xd_confidentiality_integrity_availability.htm>.

[17]

Sieglein, William. "Assessments/Risk Assessments." Security Planning
& Disaster Recovery. By Eric Maiwald. California: Bradon A. Nordin,
2002. Print.

[18]

Montgomery, David. "Electronic Pickpocket Stoppers." The Washington
Post 2 Apr. 2008. Print. Accessed 14 Dec. 2010.




[19]

Thearling, Kurt. "Data Mining and Privacy: A Conflict in the
Making?" Data Mining and Analytic Technologies (Kurt Thearling).
Web. Accessed 14 Dec. 2010.
<http://www.thearling.com/text/dsstar/privacy.htm>.

[20]

Under, Filed. "Principles of Information Security."
Www.informationintegrity.org. Www.informationintegrity.org, 20
Oct. 2010. Web. Accessed 11 Dec. 2010.
<http://www.informationintegrity.org/principles-of-information-security/>.

[21]

Kimball, Ralph, and Margy Ross. The Data Warehouse Toolkit. 2nd ed.
Wiley, April 26, 2002. Print.

[22]

"ESTARD Software :: Data Mining Software :: ESTARD Data Miner."
ESTARD Software. Data Mining Software for Business & Science. Web.
Accessed 01 Jan. 2011.
<http://www.estard.com/products/>.