The Ethics of Hiding Your Data From the Machines

It’s one thing to try to keep personal information from Facebook. But what if a company is going to use it to save people’s lives?

I don’t know about you, but every time I figure out a way of sharing less information online, it feels like a personal victory. After all, who have I hurt? Advertisers? Oh, boo hoo.

But sharing your information, either willingly or not, is soon going to become a much more difficult moral choice. Companies may have started out hoovering up your personal data so they could deliver that now-iconic shoe ad to you over and over, everywhere you go. And, frankly, you did passively assent to the digital ad ecosystem.

Granted, you did not assent to constant data breaches and identity theft. And the bargain is perhaps not worth having your personal information used to, say, exclude you from a job or housing listing based on your race or gender.

Come to think of it, you also did not assent to companies slicing and dicing your information into a perfectly layered hoagie just waiting to be devoured by propagandists from Russia and China, who then burped out a potentially history-altering counterintelligence operation the likes of which humanity has never seen.

That’s all easy to hate—and maybe even to ban, from a PR or legislative perspective. But now that same data is being used to train artificial intelligence, and the insights those algorithms produce could quite literally save lives.

So while targeted advertising is an easy villain, data-hogging artificial intelligence is a dangerously nuanced and highly sympathetic bad guy, like Erik Killmonger in Black Panther. And it won’t be easy to hate.

I recently met with a company that wants to do a sincerely good thing. They’ve created a sensor that pregnant women can wear, and it measures their contractions. It can reliably predict when women are going into labor, which can help reduce preterm births and C-sections. It can get women into care sooner, which can reduce both maternal and infant mortality.

All of this is an unquestionable good.

And this little device is also collecting a treasure trove of information about pregnancy and labor that is feeding into clinical research that could upend maternal care as we know it. Did you know that the way most obstetricians learn to track a woman’s progress through labor is based on a single study from the 1950s, involving 500 women, all of whom were white?

It’s called the Friedman Curve, and it’s been challenged and refined since then—the American College of Obstetricians and Gynecologists officially replaced it in 2016—but it’s still the basis for a lot of treatment. Worse, it has been and continues to be the basis for a huge number of C-sections, because doctors making decisions based on these outdated numbers believe a woman’s labor has stalled.

So that’s bad.

But updated data on pregnancy and labor is a tricky thing to gather, because no doctor wants to increase the possibility of a baby dying or suffering dangerous or damaging stress in the womb while they wait and see if labor will continue.

Enter our little wearable, which can gather this data efficiently, safely, and way more quickly than existing years-long research efforts. It’s already being used by underserved women who are black or brown, or poor, or both—and that is a good thing. Black women in this country are three times more likely to die in childbirth than white women, and research into women’s health, pregnancy, and labor is woefully inadequate and only slowly increasing.

To save the lives of pregnant women and their babies, researchers and doctors, and yes, startup CEOs and even artificial intelligence algorithms, need data. To cure cancer, or at least offer personalized treatments that have a much higher possibility of saving lives, those same entities will need data.

The artificial intelligence necessary to make predictions, draw conclusions, or identify genes linked to ALS (that last was done with the help of IBM’s AI engine, Watson) requires thousands, maybe tens of thousands of times more data than a targeted ad exchange. In fact, the same is true for self-driving cars, for predicting the effects of climate change, or even for businesses trying to accurately measure productivity and toilet paper costs.

The need for our data is never going to diminish. If anything, it’s going to expand rapidly, like an unquenchable maw, a Sarlacc pit of appetite, searching for ever more information to examine and consume.

In the case of the company I met with, the data collection they’re doing is all good. They want every participant in their longitudinal labor study to opt in, and to be fully informed about what’s going to happen with the data about this most precious and scary and personal time in their lives.

But when I ask what’s going to happen if their company is ever sold, they go a little quiet.

For example: What if a company doing good, trying to improve birth outcomes for black and brown mothers, is sold and its data scattered to the winds, and years later an insurance company uses that information to decide that black and brown mothers pose a higher risk of pregnancy-related complications, making them more expensive to insure, and companies become loath to hire those women?

There’s also a real and reasonable fear that companies or individuals will take ethical liberties in the name of pushing hard toward a good solution, like curing a disease or saving lives. This is not an abstract problem: The co-founder of Google’s artificial intelligence lab, DeepMind, was placed on leave earlier this week after some controversial decisions—one of which involved the use of over 1.5 million hospital patient records, a data-sharing arrangement that UK regulators ruled unlawful in 2017.

So sticking with the medical kick I’m on here, I propose that companies work a little harder to imagine the worst-case scenario surrounding the data they’re collecting. Study the side effects like you would a drug for restless leg syndrome or acne or hepatitis, and offer us consumers a nice, long, terrifying list of potential outcomes so we actually know what we’re getting into.

And for us consumers, well, a blanket refusal to offer up our data to the AI gods isn’t necessarily the good choice either. I don’t want to be the person who refuses to contribute my genetic data via 23andMe to a massive research study that could, and I actually believe this is possible, lead to cures and treatments for diseases like Parkinson’s and Alzheimer’s and who knows what else.

I also think I deserve a realistic assessment of the potential for harm to find its way back to me because I didn’t think through, or wasn’t told, all the potential implications of that choice—like how, let’s be honest, we all felt a little stung when we realized the 23andMe research would be conducted through a partnership with drugmaker (and reliable drug price-hiker) GlaxoSmithKline. Drug companies, like targeted ads, are easy villains—even though this partnership actually could produce a Parkinson’s drug. But do we know what GSK’s privacy policy looks like? That deal was a level of sharing we didn’t necessarily expect.

To this end, all companies need to be incredibly, brutally, almost absurdly honest about how bad the side effects could really be. Like, you remember that fat-free potato chip back in the 1990s that was found to cause anal leakage? That’s the kind of information I want up front. The risk of data leakage (sorry), and the steps the company is taking to prevent it, should be clearly spelled out in every company’s terms of service, as well as in its long-term planning.

Ideally, the act of planning for the worst possible outcome will also help us avoid those outcomes. Companies can then design terms of service, or governments can write regulations, that require data deletion or force everyone to opt back in to data sharing in the event of an acquisition or merger. What if 23andMe had given users the explicit opportunity to delete all their genetic data at the moment it entered into a partnership with GSK? What if it had made that deal opt in for its existing customers, instead of opt out? Some of us would have opted out, and others wouldn’t, but we should have had the choice.

As the brief history of the tech industry has shown us, there’s no such thing as a simple good. The choice of trading information for free, ad-supported services turned out to be way more complicated than we realized, and no one in the future should have to choose between sharing data and saving lives.


Updated 8-22-19, 5 pm EST: This story was updated to clarify Mustafa Suleyman's title.
