Healthcare systems generate enormous amounts of sensitive data, but moving, sharing, and analyzing that data securely across organizations remains a major challenge. In this post, we’ll look at how we at Kythera Labs use Databricks and Delta Sharing to manage more than 300 million patient records and support collaborations across healthcare and life sciences. The blog will cover the practical issues with older data-sharing methods, why we adopted Delta Sharing, and the impact it’s had on our storage costs, efficiency, and real-time collaboration.
Making Data Work in Healthcare: Kythera’s Approach
Kythera Labs is a data technology company that empowers healthcare and life sciences organizations with a unified, high-fidelity healthcare data platform for analysis. As a Built-on Databricks Partner, we chose Databricks and Delta Sharing not just for internal data sharing but also to support seamless data exchange with external partners. Today, more than 80% of our customers use products built on the platform. We also support external collaborations, including organizations like Actual Sciences, using Delta Sharing across 50 active customer workspaces.
Why Delta Sharing?
Kythera Labs chose Delta Sharing to overcome significant challenges in securely sharing healthcare data. With over 300 million patient records spanning a decade of medical history, traditional methods required creating and moving multiple full copies of datasets, driving storage costs into the hundreds of thousands of dollars and slowing delivery.
Delta Sharing changes that by enabling secure, real-time access to live data without creating duplicate copies. Instead of storing and maintaining separate datasets for each partner or environment, we can share a single, governed source of truth directly. This approach has allowed us to power internal teams and external collaborations with just 3.5 PB of storage, rather than the 20-plus PB otherwise required.
Another complexity is meeting our customers where they are in the cloud. Healthcare providers often operate in Azure, while many pharmaceutical companies run on AWS or GCP. Without a technology like Delta Sharing, delivering large datasets across clouds would mean costly transfers, complex ETL work, and multiple stale copies scattered across clouds. With Delta Sharing, we can instantly provide secure access to the same live dataset, regardless of the cloud, while maintaining compliance and eliminating unnecessary copies.
This not only streamlines our internal workflows (moving from development to testing to production without re-copying data) but also makes it easy for customers to act faster, for example by immediately updating a cancer treatment model with the latest data.
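To make the "no copies" access pattern concrete, here is a minimal sketch of how a data recipient might read a shared table with the open source `delta-sharing` Python connector. The profile file name and the share, schema, and table names are hypothetical placeholders, not Kythera's actual objects, and the sketch assumes the provider has already issued a credential (profile) file.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address the connector expects."""
    for part in (share, schema, table):
        # The address is dot-delimited, so reject empty or ambiguous components.
        if not part or "." in part or "#" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"


def preview_shared_table(profile_path: str, share: str, schema: str,
                         table: str, rows: int = 10):
    """Read the first rows of a shared table into a pandas DataFrame."""
    # Imported lazily so the URL helper above works without the connector
    # installed (pip install delta-sharing).
    import delta_sharing

    url = table_url(profile_path, share, schema, table)
    # The data is read from the provider's storage; no local copy of the
    # full dataset is created or maintained by the recipient.
    return delta_sharing.load_as_pandas(url, limit=rows)


# Hypothetical names, for illustration only.
EXAMPLE_URL = table_url("config.share", "claims_share", "curated", "patient_events")
```

A recipient on any cloud would call `preview_shared_table("config.share", "claims_share", "curated", "patient_events")` with the names their provider actually granted; the same call works whether the provider's storage lives in Azure, AWS, or GCP.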
Replacing Legacy Approaches
Given the exponential growth in data volume and complexity, traditional data sharing methods like SFTP servers are no longer viable for modern needs. Moving large files back and forth introduces delays, adds security risks, and requires storage of multiple redundant datasets.
While APIs can be a useful resource, they’re insufficient for sharing the vast oceans of data that organizations like Kythera manage. Relying on APIs to share the immense volumes of data we manage would be like trying to fill a swimming pool with a garden hose: it’s technically possible, but too slow and inefficient for our needs.
Operationally, we handle 7–10 million transactions daily while ensuring compliance through our custom “Vault Architecture” built on Delta Sharing. Customers benefit from real-time updates via view sharing without manual intervention.
By adopting Delta Sharing, we’ve completely moved away from these legacy methods and gained operational efficiency while enabling seamless collaboration across clouds and organizations.
Delta Sharing ROI
“Delta Sharing has allowed us to eliminate legacy data-sharing methods, cut storage needs by over 80%, and save more than $2 million in the last 2 years.” - Jeff McDonald, CEO, Kythera Labs
Delta Sharing helped Kythera cut storage needs from a projected 24 PB to just 3.5 PB. Over three years, the storage avoided grew from 6 PB/month in 2022 to 12 PB/month in 2023 and 17 PB/month in 2024. These reductions add up to millions in savings. For context, large pharmaceutical companies can spend as much as $14 million each month just on storage.
Storage is only part of the story. The compute costs for performing the ETL copies could be even more significant, ranging from equal to the storage savings to potentially many times greater, depending on the use cases.
| Year | Reduction in storage needs | AWS S3 Standard Cost (per PB/month) | Yearly Savings (50% storage discount) |
|---|---|---|---|
| 2024 | 17 PB/month | $21K | $2.1M |
| 2023 | 12 PB/month | $21K | $1.5M |
| 2022 | 6 PB/month | $21K | $0.75M |
| TOTAL | | | $4.375M |
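The arithmetic behind the table can be reproduced in a few lines. This is a rough sketch using only the figures quoted above ($21K per PB per month, a 50% discount, and the per-year storage reductions); the variable and function names are ours. Computed this way, the total comes to about $4.41M, slightly above the table's $4.375M. The table's figure would match a rounded rate of $125K per PB per year, which is our assumption about how it was derived, not something the source states.

```python
S3_STANDARD_PER_PB_MONTH = 21_000   # USD list price per PB per month (from the table)
DISCOUNT = 0.50                     # storage discount (from the table header)
AVOIDED_PB = {2022: 6, 2023: 12, 2024: 17}  # "Reduction in storage needs" column


def yearly_savings(avoided_pb: float) -> float:
    """Discounted cost of storing `avoided_pb` petabytes for twelve months."""
    return avoided_pb * S3_STANDARD_PER_PB_MONTH * 12 * (1 - DISCOUNT)


savings = {year: yearly_savings(pb) for year, pb in AVOIDED_PB.items()}
total = sum(savings.values())

# Reduction from the projected 24 PB footprint to the actual 3.5 PB.
reduction_pct = (1 - 3.5 / 24) * 100

for year in sorted(savings):
    print(f"{year}: ${savings[year] / 1e6:.2f}M")
print(f"Total: ${total / 1e6:.2f}M")
print(f"Storage reduction: {reduction_pct:.1f}%")
```

The last line is consistent with the "over 80%" reduction cited in the quote: going from 24 PB to 3.5 PB is a reduction of roughly 85%.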
Key Takeaways
Delta Sharing has transformed our data-sharing capabilities by reducing costs, improving efficiency, and enabling real-time collaboration across clouds and organizations. The combination of Delta Sharing, Unity Catalog, and liquid clustering ensures scalability while maintaining compliance with healthcare data standards, exemplifying how open, modern data platforms can revolutionize healthcare analytics.
