Data Mining and Privacy: A conflict in the making?

by Kurt Thearling

Published in the March 17, 1998 edition of DS*

Privacy. It’s a loaded issue. In recent years privacy concerns have taken on a more significant role in American society as merchants, insurance companies, and government agencies amass warehouses containing personal data. The concerns that people have over the collection of this data will naturally extend to any analytic capabilities applied to the data. Users of data mining should start thinking about how their use of this technology will be impacted by legal issues related to privacy.

Consider the recent uproar over CVS drug stores and their use of Elensys, a Massachusetts direct marketing company that sends reminders to customers who have not renewed their prescriptions (Boston Globe, Feb. 19, 1998). After receiving criticism over what was considered to be a violation of the privacy of its customers' medical records, CVS terminated its agreement with Elensys. Although there was no direct mention of data mining during the controversy, Evan Hendricks, editor of Privacy Times, said that recent Senate hearings on medical privacy (S.1368) included discussions of Elensys and the use of data mining for marketing activities. If and when this legislation is enacted, it could very well contain one of the first legal limitations on the use of data mining technology.

This is just the tip of the iceberg. In several recent issues of Privacy Times, Hendricks described a number of US, Canadian, and European policy decisions that could directly or indirectly impact the use of data mining technology by database marketing organizations. Although the United States takes a much more laissez-faire approach to privacy than Europe does, that approach is facing challenges. Beginning in October 1998, the European Union's Directive on Data Protection will bar the movement of personal data to countries that do not have sufficient data privacy laws in place. Industry groups are arguing that voluntary controls currently in place in the United States are sufficient. Privacy advocates, on the other hand, argue that any controls must be backed by legislation. Whether or not voluntary measures will suffice is still an open question, but recent comments by EU officials have been critical of this voluntary approach.

Another critical evaluation of data mining and privacy was recently released in a report by Ontario Information and Privacy Commissioner Ann Cavoukian. The report, "Data Mining: Staking a Claim on Your Privacy," said data mining "may be the most fundamental challenge that privacy advocates will face in the next decade…"

The report looks at data mining and privacy in the context of the international "fair information practice" principles. These principles, established in 1980, dictate how personal data should be protected in terms of quality, purpose, use, security, openness, individual participation, and accountability. According to Commissioner Cavoukian, a number of these principles conflict with many current uses of data mining technology. For example, looking at the "purpose" principle, she writes:

"For example, if the primary purpose of the collection of transactional information is to permit a credit card payment, then using the information for other purposes, such as data mining, without having identified this purpose before or at the time of the collection, is in violation of [the purpose and use limitation principles]. The primary purpose of the collection must be clearly understood by the consumer and identified at the time of the collection. Data mining, however, is a secondary, future use. As such it requires the explicit consent of the data subject or consumer."

Although broadly written use statements could be added to customer agreements to allow data mining, Cavoukian questions whether or not these waivers are truly meaningful to consumers. Since data mining is based on the extraction of unknown patterns from a database, "data mining does not know, cannot know, at the outset, what personal data will be of value or what relationships will emerge. Therefore, identifying a primary purpose at the beginning of the process, and then restricting one's use of the data to that purpose are the antithesis of a data mining exercise."

Cavoukian sees informed consumer consent as the key to the issue. Customers should be told how the data collected about them will be used and whether or not it will be disclosed to third parties. The report recommends that customers be given three levels of "opt-out" choices for any data that has been collected:

1. Do not allow any data mining of customer's data

2. Allow data mining only for internal use

3. Allow data mining for both internal and external uses
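
For readers who build customer databases, the three choices above can be pictured as a consent flag stored with each customer record and checked before any analysis is run. What follows is only a minimal sketch under stated assumptions: the names MiningConsent, may_mine, and minable_records, along with the sample records, are hypothetical illustrations and do not come from the Commissioner's report.

```python
from enum import Enum


class MiningConsent(Enum):
    """Hypothetical encoding of the report's three opt-out choices."""
    NONE = 1                   # 1. no data mining of the customer's data
    INTERNAL_ONLY = 2          # 2. data mining only for internal use
    INTERNAL_AND_EXTERNAL = 3  # 3. data mining for internal and external uses


def may_mine(consent, external_use):
    """Return True if a record with this consent level may be mined.

    external_use is True when the results would be used outside the
    organization that collected the data.
    """
    if consent is MiningConsent.NONE:
        return False
    if external_use:
        return consent is MiningConsent.INTERNAL_AND_EXTERNAL
    return True


def minable_records(records, external_use):
    """Keep only the records whose owners consented to this kind of use."""
    return [r for r in records if may_mine(r["consent"], external_use)]


# Made-up sample data, one customer per consent level.
customers = [
    {"id": "A", "consent": MiningConsent.NONE},
    {"id": "B", "consent": MiningConsent.INTERNAL_ONLY},
    {"id": "C", "consent": MiningConsent.INTERNAL_AND_EXTERNAL},
]

internal_ids = [r["id"] for r in minable_records(customers, external_use=False)]
external_ids = [r["id"] for r in minable_records(customers, external_use=True)]
```

In this sketch the filter runs before any mining takes place, so a customer who chose the first option never enters the analysis at all. That is one plausible reading of the recommendation, not the only one.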

Although Canada (except for Quebec) currently does not have laws limiting the use of personal information by private companies, Cavoukian calls for controls that are "codified through government enactment of data protection legislation for private sector businesses."

These collisions between data mining and privacy are just the beginning. Over the next few years we should expect to see an increased level of scrutiny of data mining in terms of its impact on privacy. The sheer amount of data that is collected about individuals, coupled with powerful new technologies such as data mining, will generate a great deal of concern among consumers. Unless this concern is effectively addressed, expect to see legal challenges to the use of data mining technology.
