<Information and Data Privacy: An Indian Perspective>






Policy Brief


There have been considerable concerns in developed countries over the use of a customer's personal information or data for intrusive and malicious purposes. In developing countries like India, the issue of information and data privacy as it relates to individual customers has not been given much importance, primarily because of a lack of awareness among general consumers, law enforcement agencies and the various organizations with which the consumer has to interact, and also because the concept of privacy is perceived somewhat differently in most developing countries like India than in the developed western countries. With recent advances in the field of Data Mining it is now possible for an individual to use data, sometimes freely available on the web, to extract certain patterns or information about a consumer which can be used by some organizations to discriminate against that particular customer. Therefore the issue of preserving an individual customer’s privacy while using Data Mining techniques to extract useful and meaningful information from customer data has become even more significant. In this paper we look at the existing or stated privacy policies of some leading companies operating in India in the telecom, banking and insurance sectors. We then introduce the concept of Privacy Preserving Data Mining (PPDM) and describe the main approaches that are prevalent. Finally we propose a framework indicating which PPDM method may be applied in which domain.



Key Recommendations/Findings

Findings 1>

In the telecommunications domain, Vodafone Essar is the only company that emphasizes the issue of sharing customers’ information outside India.

Findings 2>

In the banking sector we find that only State Bank of India has a policy on how to limit access to customer information by its employees. On the other hand, HDFC Bank’s privacy policy does not allow it to share customers’ confidential information to protect its own interests (as ICICI Bank’s policy permits) but only as required by law.

Findings 3>

In the insurance sector, LIC’s privacy policy states that LIC may collect unnamed statistics, which do not personally identify the user, and that LIC reserves the right to perform statistical analyses but will provide only aggregated data from these analyses to third parties.

ICICI Lombard’s policy mentions that the log files are analyzed such that the individual user is not identified, while HDFC Standard Life’s policy retains the right to share aggregated non-personally identifiable information with third parties.

Recommendation 1>

For the telecom domain we suggest Data Transformation/randomization under the Privacy Preserving Data Mining (PPDM) approach.

Recommendation 2>

For the banking sector we suggest secure multiparty computation as the best-suited of the PPDM methods.

Recommendation 3>

For the insurance sector we suggest vertically partitioning the data to ensure that the personal data that uniquely identifies a person and that person's medical history are stored separately and cannot be brought together. This can be followed by a simple data transformation of the private data for additional security.

Justification

In the telecommunications domain, looking at the policies given by the three companies, we find that Vodafone Essar is the only company that emphasizes the issue of sharing customers’ information outside India. This is a very important issue in our judgment, since the applicability of Indian privacy policies to data held outside Indian jurisdiction makes the issue completely different. This is an area where the privacy laws of one country may or may not be applicable in other countries, and therefore an Indian customer’s privacy may be governed by the laws of a different country where the data is stored.

A comparison of the privacy policies in the banking sector shows that HDFC Bank may disclose information about a customer only as permitted or required by law, unlike ICICI Bank, which may disclose the information provided by customers to “Protect and defend ICICI Bank's or its Affiliates' rights, interests or property”. In other words, ICICI Bank’s interests seem to be given more importance than the customer’s right to privacy.

On the other hand, HDFC Bank’s privacy policy does not allow it to share customers’ confidential information to protect its own interests (as ICICI Bank’s policy permits) but only as required by law. SBI seems to be the only bank with a policy to limit bank employees’ access to customers’ information.

LIC’s policy, which states that only aggregated data will be given to third parties, is one of the main points that we would like to emphasize in the use of data mining techniques on large databases in the three domains that we have chosen to look at. Our objective is to emphasize those data analysis and data mining techniques that can reveal hidden patterns and aggregate behaviours in the data without revealing individual identities. Though LIC does not state explicitly the steps it takes to protect individual identities, it implicitly recognizes the need to protect individual privacy while sharing aggregate information with third parties.

In ICICI Lombard’s policy it is mentioned that the log files are analyzed such that the individual user is not identified, the goal being to analyze overall trends in user movements and demographic information without revealing the identity of a particular individual. In the case of HDFC SL, the policy states that it shall not share, rent or sell any of your personally identifiable information provided by you, unless otherwise stated at the time of collection or otherwise. HDFC SL retains the right to share aggregated non-personally identifiable information with third parties outside of the Website for business purposes, to assess website traffic, patterns and other such services.


Based on the PPDM methods that we have looked at, we now venture our suggestions for the three domains of interest. In the telecom domain the companies primarily collect personal data on the calling patterns of customers so that they can target their product recommendations. They also tend to conduct various surveys for planning their business, and customers may give more accurate information if they know that their privacy will be protected even when data is shared with other companies. For this we propose the Data transformation/randomization approach as a solution.
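
As a rough illustration of the randomization idea, the sketch below perturbs each customer's value with independent zero-mean noise before release, so that individual values are masked while the aggregate mean stays close to the true one. The additive uniform noise, the `perturb` function, the noise spread and the call-minute figures are all assumptions made for illustration; the brief does not specify a particular randomization scheme.

```python
import random
from statistics import mean


def perturb(values, spread, rng):
    """Add independent zero-mean uniform noise in [-spread, spread] to each value."""
    return [v + rng.uniform(-spread, spread) for v in values]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical monthly call minutes for 10,000 customers (illustrative data only).
true_minutes = [rng.randint(50, 600) for _ in range(10_000)]

# What the operator would share with another company: noisy values only.
released = perturb(true_minutes, spread=30.0, rng=rng)

true_mean = mean(true_minutes)
released_mean = mean(released)
print(f"true mean: {true_mean:.2f}, mean of released data: {released_mean:.2f}")
```

Because the noise averages out over many customers, the two means agree closely even though no released value equals the customer's real one. A wider spread gives more masking at the cost of less accurate aggregates.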


In the banking sector we suggest the second approach, i.e. secure multiparty computation. In this approach the different parties who own the data, here stored at several banks, agree to disclose the result of certain data mining calculations performed on the joint data, which can be horizontally partitioned. The parties use a cryptographic protocol to exchange encrypted messages, which makes some calculations efficient while making other calculations computationally intractable. For example, two or more banks may share their individual data mining results on ATM frauds without it being possible for individual banks to trace the particular customers or ATMs which were associated with the frauds.
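
One simple building block for this kind of joint calculation is a secure sum. The sketch below uses additive secret sharing rather than encryption, which is a substitution made here for brevity: each bank splits its private fraud count into random shares, hands one share to every participant, and only the sums of shares are published. The bank names, the counts and the helper names `make_shares` and `secure_sum` are hypothetical, and all parties are simulated in one process, assuming they follow the protocol honestly.

```python
import secrets

MODULUS = 2**61 - 1  # public modulus, assumed larger than any possible total


def make_shares(value, n_parties):
    """Split value into n random additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate the protocol: only partial sums of shares are ever published."""
    n = len(private_values)
    # Row i holds the shares created by party i; column j is sent to party j.
    share_matrix = [make_shares(v, n) for v in private_values]
    # Each party adds the shares it received and publishes that partial sum.
    partial_sums = [
        sum(share_matrix[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(partial_sums) % MODULUS


# Hypothetical ATM fraud counts held privately by three banks.
fraud_counts = {"bank_a": 17, "bank_b": 42, "bank_c": 8}
total = secure_sum(list(fraud_counts.values()))
print(total)  # 67
```

Each individual share is a uniformly random number, so a bank that sees one share from a peer learns nothing about that peer's count; only the joint total is revealed.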


Lastly, in the insurance sector we tend to work with the most sensitive types of private data, for example health records. In many countries the privacy standards in this domain are protected by law, such as HIPAA (Health Insurance Portability and Accountability Act) in the United States (Office for Civil Rights [OCR], 2003). In India an initiative in this direction has only recently been taken. Data mining over insurance records, particularly medical or health records, is important for pharmaceutical companies, insurance companies themselves and also government policy makers. In view of the recent progress in DNA sequencing and DNA mapping, it should be made mandatory to store the DNA sequences, the personal data that uniquely identifies the individual and their medical history in different data stores/repositories so that they cannot be brought together. Then we can perform PPDM over the vertically partitioned data to calculate the aggregate statistics while keeping the private data intact. As an additional level of security we suggest a simple transformation of the private data before it is made available to third parties for extracting hidden patterns using data mining algorithms. This is important in a country like India, with weak data privacy laws, to ensure there is no discrimination against an individual when he/she applies for insurance, and one way of doing that would be a combination of data transformation and vertical partitioning of the sensitive data as suggested above.
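
The separation of identifying data from medical history can be sketched as follows. This is only one possible arrangement and rests on assumptions not stated in the brief: the medical store is keyed by a keyed hash (HMAC) of the policy number, so it cannot be joined back to the identity store without a secret held by a separate custodian. The record fields, the key and the function names are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret, assumed to be held only by a separate custodian.
SECRET_KEY = b"held-by-custodian-only"


def pseudonym(policy_no, key):
    """Derive an opaque token from the policy number using HMAC-SHA256."""
    return hmac.new(key, policy_no.encode("utf-8"), hashlib.sha256).hexdigest()


def partition(records, key):
    """Split each record into an identity part and a medical part."""
    identity_store, medical_store = {}, {}
    for rec in records:
        identity_store[rec["policy_no"]] = {"name": rec["name"], "city": rec["city"]}
        # The medical store never sees the name, city or raw policy number.
        medical_store[pseudonym(rec["policy_no"], key)] = {"diagnosis": rec["diagnosis"]}
    return identity_store, medical_store


# Illustrative records only.
records = [
    {"policy_no": "P001", "name": "A. Rao", "city": "Kolkata", "diagnosis": "diabetes"},
    {"policy_no": "P002", "name": "B. Sen", "city": "Delhi", "diagnosis": "asthma"},
    {"policy_no": "P003", "name": "C. Das", "city": "Mumbai", "diagnosis": "diabetes"},
]

identity_store, medical_store = partition(records, SECRET_KEY)

# Aggregate statistics are computed on the medical partition alone.
diagnosis_counts = Counter(row["diagnosis"] for row in medical_store.values())
print(diagnosis_counts)
```

An analyst working with `medical_store` can still count diagnoses, but sees neither names nor policy numbers. A fuller PPDM protocol over vertically partitioned data would add cryptographic steps that this sketch omits.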


<R.P.Datta, rp.datta@gmail.com, Indian Inst. Of Foreign Trade, J-1/14, Block EP & GP, Sec-5, Salt Lake City, Kolkata-700091, India>