Multicore and Cloud-Based Solutions for Genomic Variant Analysis
|Title||Multicore and Cloud-Based Solutions for Genomic Variant Analysis|
|Publication Type||Journal Article|
|Year of Publication||2013|
|Authors||González CY, Bleda M, Salavert F, Sanchez R, Dopazo J, Medina I|
|Journal||Euro-Par 2012: Parallel Processing Workshops|
|Keywords||genomic variant analysis, Multicore, Mutation, OpenMP, web service|
Genomic variant analysis is a complex process for finding and studying genome mutations. It requires analyses and tests from both biological and statistical points of view. The biological data for this kind of analysis are typically stored in the Variant Call Format (VCF), in gigabyte-sized files that conventional software cannot process efficiently. In this paper, we introduce part of the High Performance Genomics (HPG) project, whose goal is to develop a collection of efficient, open-source software applications for genomics. The paper focuses mainly on HPG Variant, a suite that reports the effect of mutations and conducts genome-wide and family-based analyses, using a multi-tier architecture based on the CellBase database and a RESTful web service API. Two user clients are also provided: an HTML5 web client and a command-line interface, both using a back end parallelized with OpenMP. Along with HPG Variant, a library for handling VCF files and a collection of utilities for preprocessing VCF files have been developed. Performance results compare favorably with other applications such as PLINK, GenABEL, SNPTEST or VCFtools.
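To make the data format mentioned in the abstract concrete, here is a minimal Python sketch of reading VCF data lines one at a time, so that a gigabyte-sized file never has to be held in memory at once. It is not the HPG library's code or API: the names `VcfRecord`, `parse_vcf_lines`, `count_by_chromosome` and `SAMPLE_VCF` are hypothetical and exist only for illustration. The sketch relies only on the general VCF layout: header lines start with `#`, and data lines are tab-separated with CHROM, POS, ID, REF and ALT as the first five columns.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple, Tuple


class VcfRecord(NamedTuple):
    """Hypothetical container for the fields this sketch uses."""
    chrom: str
    pos: int
    ref: str
    alts: Tuple[str, ...]


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[VcfRecord]:
    """Yield one record per VCF data line, skipping header and blank lines.

    Works on any iterable of lines (an open file object included), so
    records are produced lazily rather than loaded all at once.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # "##" meta-information lines and the "#CHROM" column header
        chrom, pos, _variant_id, ref, alt = line.split("\t")[:5]
        # Multiple alternate alleles are comma-separated in the ALT column.
        yield VcfRecord(chrom, int(pos), ref, tuple(alt.split(",")))


def count_by_chromosome(lines: Iterable[str]) -> Counter:
    """Count variant records per chromosome in a single streaming pass."""
    return Counter(record.chrom for record in parse_vcf_lines(lines))


# Made-up example data, not taken from the paper.
SAMPLE_VCF = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12345\trs1\tA\tG\t50\tPASS\t.\n",
    "1\t22345\t.\tC\tT,G\t99\tPASS\t.\n",
    "2\t500\t.\tG\tA\t10\tq10\t.\n",
]

if __name__ == "__main__":
    print(count_by_chromosome(SAMPLE_VCF))
```

Because each record is parsed independently of the others, a pass like this could in principle be split into batches and distributed across cores, which is the kind of workload the abstract describes parallelizing with OpenMP. The batching itself is not shown here.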