8 Challenges for the Biomedical Industry in terms of Big Data
admin | March 21st, 2016
We anticipate exponential growth in the availability of genomics big data, thanks to next-generation sequencing (NGS) platforms that have reduced the time and cost of sequencing genomes. NIH data (see Figure 1) indicates that the cost of sequencing an entire genome fell from $100M in 2001 to around $1,000 in 2015. In a landmark diagnostic genomics case study in 2015, whole genome sequencing, analysis, and diagnosis of genetic disease in critically ill infants took 26 hours.
Figure 1: Reduction in sequencing cost
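To put the cost figures above in perspective, here is a minimal Python sketch of the arithmetic. It uses only the two data points quoted in the text ($100M in 2001, about $1,000 in 2015) and assumes, purely for illustration, a constant rate of exponential decline between them; the actual curve in Figure 1 is not that smooth. The function names are hypothetical helpers, not part of any library.

```python
import math


def fold_reduction(start_cost, end_cost):
    """How many times cheaper the end cost is than the start cost."""
    return start_cost / end_cost


def halving_time_years(start_cost, end_cost, years):
    """Average time for the cost to halve, assuming a constant exponential decline."""
    return years / math.log2(start_cost / end_cost)


# Figures quoted in the text (approximate).
start_cost = 100_000_000  # USD per genome, 2001
end_cost = 1_000          # USD per genome, 2015
years = 2015 - 2001

fold = fold_reduction(start_cost, end_cost)
halving = halving_time_years(start_cost, end_cost, years)

print(f"{fold:,.0f}-fold reduction over {years} years")
print(f"average halving time: {halving * 12:.1f} months")
```

Under that simplifying assumption, a 100,000-fold drop over 14 years works out to the cost halving roughly every ten months on average.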
Is the technology partnership between the Big Data and Biomedical Industries currently positioned to support the future proliferation of genomics data? This partnership must overcome the following key challenges:
- Realizing a value proposition;
- Varying computational infrastructure depending on the nature of the analyses;
- Expeditious transport of data from sequencing systems to the cloud;
- Privacy concerns and ownership of genomic data;
- Interoperability of genomics data systems with other clinical data systems;
- File formats enabling efficient parallel access to genomics sequence data;
- Cost of analysis (Informatics) including preparation, curation and aggregation;
- Sophisticated software for genomics data interpretation.
Let’s discuss a few of them:
- Although storage costs for genomic data may be minimal, the high cost of computational infrastructure (compute costs for processing and analyzing genomes) might negate the benefits of low sequencing costs. While cloud platforms bring flexibility, the prospect of streaming high volumes of genomics sequence data directly from sequencing platforms to the cloud can be a challenge.
- Privacy of genomic data is also a concern because it is uncertain how the data will be used and shared. Even when research data has been collected anonymously, genome privacy remains important because of the threat of re-identification and potential privacy breaches.
- Interoperability of genome databases with one another, and with other clinical data systems (such as Electronic Medical Records, or EMRs) is another challenge that must be overcome in order to realize the benefits to the general public of ongoing collaborative research projects such as the “100,000 Genomes Project”, and to individual patients whose ailments may have a hereditary component.
- The value proposition of translational genomics for the biomedical industry may not be obvious in the early stages. The costs that a biomedical organization would incur to store and analyze large volumes of its genomics data for insights outside its core product or service may not result in near-term financial gain. This might dissuade some organizations from investing in the effort. However, the knowledge gained from research-oriented discovery projects can benefit the community and contribute to product enhancements. The big data vendors that enable the genomics big data platform may be able to monetize it by providing curation and aggregation services.
Companies such as AWS, Oracle, and Google are positioning themselves as key players, forming the backbone for biomedical companies by providing the computational infrastructure for genomics data storage and analysis. These vendors recognize the potential value in bringing genomics research data onto their platforms.
As the cost of genomic sequencing continues to shrink and sequencing on a much larger scale becomes viable, we anticipate a shift from reactive medicine to predictive, proactive medicine. A vast genomics database will enable research leading to a better understanding of the genetic basis for a multitude of diseases. This knowledge will spur the development of drugs and other therapies that better target disease prevention. It will also enable the development of personal genome interpretation software that supports lifestyle counseling, providing steps individuals can take to mitigate the potential impact of a disease or condition that afflicts them or to which they have a genetic predisposition.