The Myth of De-Identified Data: Sorrell v. IMS Health and the privacy risks of the prescription data trade
While my colleagues have recently identified many of the potential risks and benefits of electronic medical record keeping, a case before the Supreme Court this term presents questions about the potential dangers it poses for patient privacy in particular.
Background: Sorrell v. IMS Health
In Sorrell v. IMS Health, the plaintiffs (data-mining firms and PhRMA, an association representing pharmaceutical drug manufacturers) have challenged a Vermont law that prohibits drug manufacturers from using prescriber records for marketing purposes. The plaintiffs argue that this restriction on their use of information violates their free speech rights.
The Vermont law attempts to curb marketing uses of prescription records by targeting a common three-part transaction: First, upon filling prescriptions, pharmacies collect information including the prescriber’s name and address, the name, dosage, and quantity of the drug, the date and place the prescription is filled, and the patient’s age and gender. Pharmacies sell this information to data-mining firms who aggregate it to reveal individual physician prescribing patterns.
Second, the data-mining firms “de-identify” the aggregated data by stripping it of patient information and then sell it to drug manufacturers. The extent to which the firms de-identify the data is apparently left to their discretion, since no statute defines what constitutes sufficiently de-identified data.
Third, after purchasing the data, drug manufacturers use it in their marketing efforts. Most notably, manufacturers employ representatives to promote their products during visits with individual physicians, a process known as “detailing.”
The challenged Vermont law seeks to disrupt this transaction by prohibiting pharmacies from selling or using prescription records for any marketing purposes without the express consent of the prescribing physician. Put another way, the law prohibits part one of the transaction described above in order to prevent part three. The law permits pharmacies to continue to transmit the data for non-commercial purposes such as health care research, treatment, and safety-related uses.
The plaintiffs argue that the law restricts commercial speech and therefore violates their First Amendment rights. Vermont, in contrast, argues among other things that the law restricts not speech but merely conduct. Even if it were a restriction on commercial speech, Vermont argues, the law advances three substantial state interests: protecting public health, protecting patient privacy, and containing health care costs.
In November 2010, the Second Circuit agreed with the plaintiffs’ argument and struck down the law. The three-judge panel held that the statute restricted commercial speech, not merely conduct, and that it failed to advance the state’s asserted interests in lowering health care costs and protecting public health. The court determined that the state’s stated interest in protecting privacy was “too speculative” to qualify as substantial.
The State’s Interest in Privacy
In declining to treat the state’s interest in protecting patient privacy as substantial, the Second Circuit neglected to consider developments in technology and decryption techniques that pose a real and substantial threat to patient privacy. In fact, the state itself neglected these developments and instead argued (.pdf) that allowing marketing uses of prescription data undermined the privacy of the patient-doctor relationship.
In an amicus brief (.pdf) cited by the dissent, the Electronic Privacy Information Center (EPIC) emphasized the importance of the state’s interest in protecting patient privacy in light of recent technological developments. In particular, it explained the various ways in which de-identified data can be easily re-identified, and how this re-identification presents serious risks where medical records are at stake.
In its brief, EPIC describes one method of re-identifying anonymous data known as record linkage, which involves merging two or more databases (e.g., public census data and voting records) on the fields they share. This method has proven very effective at re-identifying individuals from supposedly anonymous data, even when run on ordinary desktop computers. For example, one privacy researcher employing this method was able to uniquely identify 87% of the US population using only date of birth, gender, and zip code. The same researcher also re-identified the full medical record of a former governor of Massachusetts by cross-referencing public census data with de-identified health data.
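To make record linkage concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two tiny datasets, the field names, and the `link` helper are assumptions, not data or code from the EPIC brief or the researcher's study. It only shows the general idea of joining a "de-identified" table to a public one on date of birth, gender, and zip code.

```python
from collections import Counter

# Hypothetical "de-identified" health records: names are removed, but
# date of birth, gender, and zip code are retained.
health_records = [
    {"dob": "1950-03-14", "gender": "M", "zip": "05401", "drug": "drug A"},
    {"dob": "1982-11-02", "gender": "F", "zip": "05602", "drug": "drug B"},
    {"dob": "1982-11-02", "gender": "F", "zip": "05602", "drug": "drug C"},
]

# Hypothetical public list (for example, a voter roll) that includes names.
public_records = [
    {"name": "Person One", "dob": "1950-03-14", "gender": "M", "zip": "05401"},
    {"name": "Person Two", "dob": "1975-06-30", "gender": "F", "zip": "05401"},
]

# Fields shared by both datasets; none is a name, yet together they can
# single out one person.
QUASI_IDENTIFIERS = ("dob", "gender", "zip")


def key(record):
    """Return the tuple of quasi-identifier values for a record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymous, public):
    """Attach a name to each anonymous record whose quasi-identifiers
    match exactly one person in the public list."""
    public_counts = Counter(key(r) for r in public)
    names = {key(r): r["name"] for r in public}
    matches = []
    for record in anonymous:
        k = key(record)
        if public_counts[k] == 1:  # unambiguous match only
            matches.append((names[k], record["drug"]))
    return matches


print(link(health_records, public_records))
```

In this toy data only the first health record shares its date of birth, gender, and zip code with exactly one public entry, so only that record is re-identified. A real linkage attack would run the same join over far larger datasets.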
Expanding on its amicus brief for the Second Circuit, EPIC’s recent amicus brief (.pdf) filed at the Supreme Court attacks the data-mining firm IMS Health’s method for encrypting the prescription data. According to the brief, the firm relies on a flawed algorithm known as MD5 (strictly speaking a hash function rather than an encryption method). MD5 has been abandoned not only by its inventor, Ron Rivest, who has deemed the method “clearly broken,” but also by the Department of Homeland Security, whose Computer Emergency Readiness Team concluded that it was “cryptographically broken and unsuitable for further use.”
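A small Python sketch can show one general reason why hashing an identifier does not by itself make it anonymous. This is not IMS Health's actual scheme, which is not described here. The four-digit identifier format and the helper names `md5_token` and `recover` are assumptions made for illustration. The weakness shown, enumerating a small input space, is separate from the collision attacks that led experts to call MD5 broken, and it would affect any fast, unsalted hash used this way.

```python
import hashlib


def md5_token(identifier: str) -> str:
    """Replace an identifier with its unsalted MD5 digest (hex string)."""
    return hashlib.md5(identifier.encode("utf-8")).hexdigest()


def recover(token, candidates):
    """Hash every candidate identifier and return the one whose digest
    equals the token, or None if no candidate matches."""
    for candidate in candidates:
        if md5_token(candidate) == token:
            return candidate
    return None


# Assumption for illustration: identifiers are four-digit numbers, so there
# are only 10,000 possible values to try.
all_identifiers = (f"{n:04d}" for n in range(10_000))

token = md5_token("4821")            # what a "de-identified" dataset would hold
print(recover(token, all_identifiers))
```

Because the set of possible inputs is small and MD5 is fast to compute, anyone holding the token can simply try every candidate until one matches. The same reasoning applies to larger but still guessable inputs such as names combined with dates of birth.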
The court’s failure to recognize these developments would be more understandable if computer scientists had only just discovered the risks of re-identification; however, these risks have been well documented for years, even in the popular press. In an article published nearly two years ago, The New York Times profiled several individuals whose prescription data had been sold to drug manufacturers without their consent and re-identified so that it could be used to market products directly to them.
The article describes one woman who began receiving promotional material for various pregnancy-related products after she bought fertility drugs at a pharmacy in San Diego. Although she was unsuccessful in having a baby, she continued to receive ads for over ten years promoting at first diapers and baby formula and later discounts on family photos and “gifts suitable for an elementary school graduate.” The woman describes the ads as painful reminders of a difficult time in her life: “To just go to the mailbox and get that stuff, time after time after time, it was just awful.”
While digitizing medical records provides several benefits, we must not ignore or underestimate the risks. Although one would like to find assurance in the notion of de-identified or anonymous data, the reality proves more troubling. At the very least, state attempts to protect this sensitive data should be carefully reviewed before being struck down—and an understanding of encryption technology and methods must be part of any meaningful review.