Google today announced that it has partnered with the Broad Institute of MIT and Harvard to launch a limited alpha of the institute’s Genome Analysis Toolkit (GATK) on Google’s Cloud Platform and make it available as a service. The software, which was developed by the Broad Institute and helps scientists to quickly analyze genomic sequencing data, will be offered to academic researchers at no charge (though they will still have to pay for using Google’s Cloud Platform). Business users will have to license the software from Broad.
The service will be offered as part of Google Genomics, the company’s cloud computing platform for life science research and a little-known part of the Google Cloud Platform. Google Genomics launched in early 2014, but there has been little news about the initiative since.
Bringing the GATK to Google Cloud Platform is the first result of this partnership between the two organizations.
DNA sequencing generates huge amounts of data (the raw data of one person’s genome takes up more than 100 gigabytes), and the Broad Institute has either sequenced or genotyped the equivalent of more than 1.4 million biological samples. Analyzing all of this data takes substantial computing resources, which is where this partnership with Google comes in. Beyond the computing platform itself, Google notes that the GATK, which is already in use by plenty of scientists outside of this partnership, gives researchers the confidence that they are “processing their data according to the best practices, without worrying about managing IT infrastructure.”
“Broad and Google share a culture of collaboration and open access to data,” says David Glazer, Director of Google Genomics. “Google Genomics is helping scientists make genomic information more accessible and useful. By making Broad’s GATK available through the Google Cloud platform, we hope to accelerate great science.”
For researchers, having access to this toolkit in the cloud means they don’t have to set up their own computing infrastructure, which can often take a lot of time and manpower away from the actual research and analysis.