Supercomputing's Role in Advancing Genetic Research

September 16th, 2016

Advances in data-intensive supercomputing have set the stage for tremendous strides in the study of genetics. In an article for HPCwire, Tiffany Trader cites Cray Marketing Director Maria McLaughlin’s recent account of how sophisticated supercomputing capabilities enabled scientists to pinpoint the genetic patterns underlying autism-spectrum disorders, schizophrenia and similar brain conditions.

With funding from the National Science Foundation, scientists from the San Diego Supercomputer Center (SDSC) and the Institut Pasteur “identified a time-dependent gene-expression process that could help medical professionals” one day eradicate these types of disorders.

According to a report in the journal Genes, Brain and Behavior, these life-changing breakthroughs are the result of a confluence of computational and life sciences.

Igor Tsigelny, a research scientist with SDSC and UC San Diego’s Moores Cancer Center, highlighted the role that data plays in the research:

“We live in the unique time when huge amounts of data related to genes, DNA, RNA, proteins, and other biological objects have been extracted and stored,” said Tsigelny.

“I can compare this time to a situation when the iron ore would be extracted from the soil and stored as piles on the ground. All we need is to transform the data to knowledge, as ore to steel. Only the supercomputers and people who know what to do with them will make such a transformation possible,” he added.

The project relied on the innovative flash-based Gordon supercomputer, a Cray CS300-AC Cluster system, installed at the San Diego Supercomputer Center.

“Gordon’s I/O nodes are specifically designed to handle large, complex data-intensive workloads that address I/O bottlenecks,” writes McLaughlin.

The Gordon supercomputer employs a massive amount of flash memory, which lets it power through problems that would bog down systems relying on slower spinning-disk storage. According to SDSC, this architecture allows Gordon to process large data-intensive problems about 10 times faster than other supercomputers, and its flash memory system can hold as many as 100,000 entire human genomes.

Gordon has been engaged in cutting-edge research since January 2012 and is a key resource for the NSF’s Extreme Science and Engineering Discovery Environment (XSEDE) program, a nationwide partnership that includes 16 high-performance computers and high-end visualization and data analysis resources.

Source: HPCwire