The growth of biomedical research data presents challenges to researchers in the field. One proposed solution is to develop a “Cancer Genomics Cloud” that provides co-located storage and computational infrastructure pre-loaded with National Cancer Institute (NCI) public data. The Cancer Genomics Cloud would reduce the need to spend millions of dollars on equipment and staff to download the entire body of NCI public data.
The problem is that the current model of downloading data and running analyses locally works well for smaller quantities of data but is costly and increasingly unsustainable for larger and more varied datasets such as those generated by The Cancer Genome Atlas (TCGA) and similar projects.
The NCI’s Board of Scientific Advisors and the National Cancer Advisory Board recently approved the launch of three Genomics Cloud Pilots. The pilots are expected to provide access to pre-loaded data from TCGA.
NCI intends to fund the development of the three pilot clouds, each handling up to 2.5 petabytes (PB) of core TCGA data along with at least one additional data type. The cloud pilots will be capable of supporting large numbers of simultaneous users, will use defined data standards, and will be easily interoperable with other databases and systems.
The design and implementation phase is expected to cost between $3 million and $5 million per cloud pilot, while the evaluation period is expected to cost about $500,000 per pilot. The ballpark estimate for the operational phase is between $3 million and $5 million per cloud pilot.
As part of the cloud effort, NCI issued a sources notice in July 2013, asking qualified groups to submit statements on their capabilities. In the future, NCI plans to issue a Broad Agency Announcement (BAA), a specific contract mechanism that would be used for the cloud pilot program.
Once winning applicants are selected, they will have three months to design their systems and twelve months to construct the cloud, including an application programming interface and initial user programs, as well as to create and supply operational cost estimates.
The use of the cloud pilots will be evaluated over a six-month period by both NCI and the community. The community will also have a chance to evaluate the usefulness of these cloud pilots by participating in contests similar to those offered by firms such as TopCoder.
NCI also plans to propose several possible long-term support models. Under one model, NCI would fund and expand the most successful clouds from the pilot phase. A second option would be to have institutions form consortia responsible for supporting the clouds, while a third option would be a commercial fee-for-service offering run by existing cloud providers.
For more information on biomedical informatics, go to http://ncip.nci.nih.gov