Data Silos: Healthcare's Silent Shame

This article is more than 9 years old.

Deprivation has a way of making you feel excessively thankful for even the most meager offering.  Yoni Maisel, a reflective patient and patient advocate with a rare genetic primary immune deficiency disorder, conveyed just this sense of disproportionate gratitude in an exuberant recent piece describing the impact of technology on his life.

Inspired by Eric Topol’s new book, The Patient Will See You Now, highlighting the power of the smartphone (my WSJ review here), Maisel went out and bought one.  He then received (on his smartphone) an email from a doctor who had read about the symptoms Maisel had previously described on his blog related to a second, extremely rare disease he has (Sweet’s Syndrome), and thought she had a patient with a similar condition.  After viewing photos of skin lesions Maisel took (with his smartphone) and shared with her (with his smartphone), the doctor was reportedly convinced her patient had Sweet’s Syndrome as well.

Maisel tweeted enthusiastically that Topol’s book (and, implicitly, the technology he champions) “Just Played Part in Dx of 1in 1Million #RareDisease.”

A somewhat different reaction was shared by Rick Valencia, Head of Qualcomm Life, who commented via Twitter: “Shocking that buying a smartphone and sending a pic considered a tech breakthrough. #onlyinhealthcare”

On the one hand, of course, Maisel’s story obviously represents a terrific outcome for the newly-diagnosed patient, and – precisely as Maisel and Topol emphasize – highlights one way smartphone technology can improve medical care.

At the same time, Maisel’s joy, paradoxically, also reminds us of a deep flaw in the system, as Valencia’s comment begins to suggest.   At issue: poor data sharing, a medical tragedy of underappreciated dimension.  Valuable, even vital information often remains uncaptured, unanalyzed, and, especially, unshared.

The human consequences associated with poor data sharing were poignantly described by Seth Mnookin in his New Yorker article last year profiling a family whose son, Bertrand, was born with a mysterious disease that eluded rapid identification. The family (like an estimated 25% of patients with unknown genetic disorders) was able to obtain a diagnosis by exome sequencing, yet struggled to locate others with a similar condition.  It wasn’t until the father, Matt Might, blogged about it – and had the story picked up by Reddit and others – that he was able to locate others with the disease.

The key point is that the networks afforded by Reddit were fundamentally richer than any medical dataset.  If someone – the father in the Mnookin story, the doctor in Maisel’s story – wants to find others who have similar genetics and phenotypes, they need to rely on public, non-medically-specific networks because these networks, while not purpose-built, are nevertheless far denser, and often, it seems, the best option available.

The issue this speaks to is what I’ve heard referred to as Matticalfe’s Law, named by physician and informaticist John Mattison to suggest a variant of the familiar Metcalfe’s Law.  (Disclosure: while I’ve no business relationship with Mattison, he is co-chair of the Global Alliance for Genomics and Health eHealth working group, on which I serve.  Also, to offer my usual reminder/disclosure, I am CMO at DNAnexus, a company that makes a cloud-based platform for genomic data management and collaboration).

Metcalfe’s Law is the idea that the value of a network is proportional to the square of the number of participants – i.e. adding more people to a network increases value not linearly, but quadratically.  It’s a key principle underlying the concept (and power) of networks (though not without its critics – see here).
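To make the square relationship concrete, here is a minimal sketch. It assumes, as one common reading of Metcalfe’s Law, that a network’s value tracks the number of distinct pairwise connections among its participants, n(n-1)/2, which grows roughly as the square of n. The function name is illustrative and not part of any standard library.

```python
def pairwise_links(n: int) -> int:
    """Distinct pairs among n participants: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# A tenfold increase in participants yields roughly a hundredfold
# increase in possible connections.
for participants in (10, 100, 1000):
    print(participants, pairwise_links(participants))
# prints:
# 10 45
# 100 4950
# 1000 499500
```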

Matticalfe’s Law, as Mattison explains it, is that “the value of data silos is very limited, but when deployed in aggregate yields a law of accelerating returns rather than a law of diminishing returns, similar to the network effect of Metcalfe's law.”  Mattison adds he “hybridized the eponym to distinguish it from the classical network effect, hence Matticalfe's Law.”

One implication here is that if every cancer center, every medical center, every rare disease center shared their data fully, then as a whole, these data would be profoundly more valuable and useful.  The chances that a patient with an unusual mutation and phenotype would have someone like them, somewhere in the world, would be so much higher.
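A rough back-of-the-envelope sketch illustrates the gap between siloed and pooled data. It assumes, purely for illustration, that the usefulness of a dataset scales with the number of patient pairs that can be compared against each other, and that every center holds the same number of patients; the figures (50 centers, 200 patients each) are hypothetical, not drawn from any real institution.

```python
def comparable_pairs(patients: int) -> int:
    """Patient pairs that can be compared within a single dataset."""
    return patients * (patients - 1) // 2


def siloed_pairs(centers: int, patients_per_center: int) -> int:
    """Each center can only compare its own patients with one another."""
    return centers * comparable_pairs(patients_per_center)


def pooled_pairs(centers: int, patients_per_center: int) -> int:
    """All centers share data, so any two patients can be compared."""
    return comparable_pairs(centers * patients_per_center)


# Hypothetical numbers: 50 centers with 200 patients each.
siloed = siloed_pairs(50, 200)
pooled = pooled_pairs(50, 200)
print(siloed, pooled, round(pooled / siloed, 1))
# prints: 995000 49995000 50.2
```

Under these assumptions the same 10,000 patients support about fifty times as many comparisons once pooled, which is the accelerating return Mattison describes.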

So why isn’t this done?

For starters, most hospitals, even leading centers, are struggling to meaningfully organize the genetic and phenotypic data of their own patients in a fashion that can truly inform clinical decision making, as I discussed late last year; thus, you can argue that it’s hard to share with others what you can barely grasp yourself.

A second factor, of course, is privacy; medical centers typically emphasize the special nature of medical data, and express concern about the fate of rich information in a shared dataset.

Yet, many experts are skeptical that this represents the true (or only) explanation; as Mnookin writes,

“Isaac Kohane, a pediatric endocrinologist at Boston Children’s Hospital, told me that many researchers believe, incorrectly, that patient-privacy laws prohibit sharing useful information.

‘If you want to be charitable, you can say there’s just a lack of awareness’ about what kind of sharing is permissible, Kohane said. ‘If you want to be uncharitable, you can say that researchers use that concern about privacy as a shield by which they can actually hide their more selfish motivations.’”

In other words, even if top centers were able to collect and usefully organize phenotypic and genetic data on patients, would they share most of this information or silo it?

I’m not sure I know anyone who would bet against “silo.”

Whether consciously recognized or not, these data are perceived as representing a competitive advantage for the institutions and individuals who generated them (and notably, in this context, the “generating individual” is understood to be the researcher, not the patient!).

Leading cancer centers (for example) have more data than most other hospitals and practices – even though their total share of cancer patients is relatively small, as something like 85% of cancer care occurs in the community.  In a world without rich data sharing, today’s top cancer centers enjoy a distinct competitive advantage; their datasets (and more broadly, their experience sets), while individually small in the absolute sense, are large compared to most community hospitals and practices.  However, in a world with richer data sharing, these leading centers would arguably lose much of their competitive advantage – even though the global quality of cancer care would likely go up, driven by the knowledge the richer dataset would provide.  Thus, it’s perhaps not surprising that most leading cancer centers talk up data sharing far more than they engage in it – at least at anything like the rich level that would be ideal to advance medical science.  (Of course, there are encouraging exceptions to this generalization.)

The need for rich data sharing to accelerate what Andy Grove calls “knowledge turns” (link here – ironically but not surprisingly, preview only; JAMA has not made this open access) has both frustrated and motivated patient advocates such as Chordoma Foundation co-founder Josh Sommer, who has worked tirelessly to change the system (see here, also here).

Nevertheless, both in the context of scientific research and in the context of patient care, the unfortunate truth is that while it’s fashionable to profess commitment to data sharing, many hospitals, and many researchers, are reluctant to part with data.

Health economist Jason Shafrin recently boiled it down to this pithy explanation:

“What if you owned a business and one of your competitors said: ‘I would like a list of all your customers, as well as information on their demographics and health history.’  You would likely say, there is no way I’m giving you a list of my customers.

Well in the case of healthcare, customers = patients.”

Instead, the idea of the moment seems to be “federated” datasets – the idea that everyone can keep their own datasets, but query engines could specifically extract the exact, relatively limited data they need, affording, it’s suggested, many of the benefits of data pooling but without incurring many of the risks.  There’s a conspicuous “assume a can opener” quality to this strategy, but it’s worth watching because some very smart people (and organizations) are working intensively on this, and because it might be the best we can hope for.
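The federated idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not a description of any real system: the class names, the site names, and the placeholder variant and phenotype labels are all hypothetical. Each site keeps its records locally and answers a query with only an aggregate count, so no patient-level data leaves the institution.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PatientRecord:
    gene_variant: str  # placeholder label, not a real variant
    phenotype: str     # placeholder label, not a real diagnosis


@dataclass
class Site:
    name: str
    records: list[PatientRecord] = field(default_factory=list, repr=False)

    def count_matches(self, gene_variant: str, phenotype: str) -> int:
        """Runs locally; only the count is returned to the caller."""
        return sum(
            1
            for r in self.records
            if r.gene_variant == gene_variant and r.phenotype == phenotype
        )


def federated_count(sites: list[Site], gene_variant: str, phenotype: str) -> dict[str, int]:
    """Send the same query to every site and collect per-site counts."""
    return {s.name: s.count_matches(gene_variant, phenotype) for s in sites}


# Hypothetical sites and records, for illustration only.
sites = [
    Site("Center A", [PatientRecord("VARIANT_1", "PHENOTYPE_X"),
                      PatientRecord("VARIANT_2", "PHENOTYPE_Y")]),
    Site("Center B", [PatientRecord("VARIANT_1", "PHENOTYPE_X")]),
    Site("Center C", []),
]

print(federated_count(sites, "VARIANT_1", "PHENOTYPE_X"))
# prints: {'Center A': 1, 'Center B': 1, 'Center C': 0}
```

The “assume a can opener” problem shows up even here: the sketch works only because every site already stores variants and phenotypes in the same structured form, which is exactly what most institutions have yet to achieve.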

One alternative to this idea is that patients could contribute their own data into datasets, which could be used for the common good.    This is obviously attractive conceptually, but the challenge is more pragmatic: while some patients are both motivated and technologically adept, most patients struggle exhaustively just to get a handle on all of their own medical records, and most are unlikely to have the time, inclination, and ability to share – beneficial as this would be.

An idea I’ve been thinking about (see here, here) is the notion of the data-inhaling clinic: medical centers built around the premise of rich data collection and sharing, and offering genuine interoperability.  Patients choosing to seek care here would explicitly want their data shared, and in turn would benefit from the data sharing of others.  Consent to share data (which could always be withdrawn) would be a foundational condition of care at these centers, and a reason enlightened patients would seek treatment there (in addition to the empathetic care, which as always remains elemental). Institutions subscribing to this philosophy would not need to have the same owner, nor even the same EMR – just the same commitment to rich and complete data sharing among participating institutions.  (Sharing data only among participants seems necessary, at least initially, to avoid a free-rider problem; the point is that any organization willing to share appropriately-consented data in substantial fashion could belong to the network.)

While some patients might not like this approach, those in favor would vote with their feet, and I can imagine that the rich, consented dataset the subscribing, data-inhaling clinics would build would rapidly exceed those available elsewhere in the world.  Perhaps at this point, holdout institutions – which I imagine would include top academic medical centers – would finally relent and join as well.

The aspiration would be that in a world of rich data sharing, making diagnoses based on the combination of unusual symptoms and unusual genetics wouldn’t be exceptional, or even tweet-worthy; rather, it would be (and should be) the expectation.