How Big is Big Data in Healthcare?

 Big data is a lucrative business thanks to the targeted insights it produces about consumer behavior. It has long been a staple of consumer marketing and is becoming increasingly sophisticated thanks to detailed information from geolocation services that can posit how a billboard you drove by ties to your future purchases.

(Take a look at this New York Times exposé on how much not-so-anonymous location data companies have on us if you’d like the urge to throw your smartphone out the window.)

While these types of micro and macro population analyses have long been associated with commercial enterprises, healthcare is quickly catching up to get a piece of the action.  This isn’t to say that big data healthcare projects are uncommon; Congress passed the Cancer Registries Amendment Act in 1992 to administer the collection of cancer data representing 97% of the U.S. population.  This mandatory reporting allows for monitoring overall trends and the advancement of treatment.

Not all registries are sponsored by the government or even a nonprofit; private companies and healthcare organizations also receive this data.  Now it’s becoming commonplace for tech companies to bypass the registry model and offer hospitals cash to get ahold of valuable patient data.  We’ve seen some of these initiatives make headlines, like Google’s Project Nightingale or Amazon’s health data subscriptions, but there are numerous companies knocking on doors to source their own health databases.

Executives at hospitals now say they’re being ‘inundated’ with requests for their data, with the frequency increasing from once a quarter to once a day in just five years.  While this data can be analyzed to create new treatments or better care options, there’s a concern that, even when anonymized, it could easily be used to raise an individual’s premiums or to target advertising.

The problem with this de-identified data is that it doesn’t take many additional data points to make patients distinguishable.  Given the amount of data that’s freely accessible or for sale, it’s not as large of a leap to patient identification as you might imagine.
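To make that leap concrete, here is a minimal sketch in Python.  It is only an illustration: the records, field names, and the `unique_records` helper are all made up for this example and do not come from any real dataset or product.  It counts how many "de-identified" records become one of a kind once a few ordinary attributes are combined.

```python
from collections import Counter

# Hypothetical "de-identified" records: names are removed, but ZIP code,
# birth year, and sex are kept. All values are invented for illustration.
records = [
    {"zip": "53703", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"zip": "53703", "birth_year": 1984, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "53703", "birth_year": 1951, "sex": "M", "diagnosis": "melanoma"},
    {"zip": "53711", "birth_year": 1990, "sex": "F", "diagnosis": "migraine"},
    {"zip": "53711", "birth_year": 1990, "sex": "M", "diagnosis": "hypertension"},
]


def unique_records(rows, fields):
    """Return the rows whose combination of `fields` appears exactly once."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return [row for row in rows if counts[tuple(row[f] for f in fields)] == 1]


# With ZIP code alone, every record shares its value with at least one other.
zip_only = unique_records(records, ["zip"])

# Adding birth year and sex leaves several records that match only one person.
combined = unique_records(records, ["zip", "birth_year", "sex"])

print(f"Unique on ZIP only: {len(zip_only)} of {len(records)}")
print(f"Unique on ZIP + birth year + sex: {len(combined)} of {len(records)}")
```

In this toy data, no record stands out by ZIP code alone, but three of the five are unique once birth year and sex are added.  Anyone holding an outside list with those same three attributes plus a name could attach a diagnosis to a person.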

While many hospitals are hesitant to sell their patient data and have extensive governance over how it will be used, it’s no secret that other hospitals could use the income and might be more susceptible to playing somewhat fast and loose with their data (29 hospitals closed in 2019, many of which could have used this type of capital infusion).

At Extract, we believe in the power of data when it’s used responsibly.  For organizations that need to submit discrete data to a registry, our software can eliminate the step of entering it manually.  Even for internal use, we’ve aided clients in their research processes by automating tasks like entering genetic anomaly data.  That saves clinicians from re-entering scores of repeated data and automatically creates an individual record for each gene expression of a patient, enhancing the flexibility of their database.

Where data is shared, though, we use software, improved by machine learning, to automatically identify all pieces of personal information in a document.  Whether for public consumption or for aggregated research, our familiarity with unstructured documents creates the basis for our software to redact all kinds of sensitive information, alleviating privacy concerns to the degree that you desire.

If you’d like to learn more about our automated data and document handling platform, please reach out today for more information.


About the Author: Chris Mack

Chris is a Marketing Manager at Extract with experience in product development, data analysis, and both traditional and digital marketing.  Chris received his bachelor’s degree in English from Bucknell University and has an MBA from the University of Notre Dame.  A passionate marketer, Chris strives to make complex ideas more accessible to those around him in a compelling way.