Multicore and Cloud-Based Solutions for Genomic Variant Analysis

Title: Multicore and Cloud-Based Solutions for Genomic Variant Analysis
Publication Type: Journal Article
Year of Publication: 2013
Authors: González CY, Bleda M, Salavert F, Sanchez R, Dopazo J, Medina I
Journal: Euro-Par 2012: Parallel Processing Workshops
Start Page: 273
Date Published: 2013
ISBN Number: 978-3-642-36948-3
Keywords: genomic variant analysis, Multicore, Mutation, OpenMP, web service

Genomic variant analysis is a complex process for finding and studying genome mutations. It requires analyses and tests from both biological and statistical points of view. Biological data for this kind of analysis are typically stored in the Variant Call Format (VCF), in gigabyte-sized files that cannot be processed efficiently with conventional software. In this paper, we introduce part of the High Performance Genomics (HPG) project, whose goal is to develop a collection of efficient, open-source software applications for genomics. The paper focuses mainly on HPG Variant, a suite that reports the effect of mutations and conducts genome-wide and family-based analyses, using a multi-tier architecture based on the CellBase database and a RESTful web service API. Two user clients are provided: an HTML5 web client and a command-line interface, both of which use a back-end parallelized with OpenMP. Along with HPG Variant, a library for handling VCF files and a collection of utilities for preprocessing VCF files have been developed. Performance results compare favorably with those of other applications such as PLINK, GenABEL, SNPTEST, and VCFtools.
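
As an illustration of the kind of processing the abstract describes, the following minimal sketch parses VCF data lines and counts variants per chromosome by splitting the input into chunks that are handled by a pool of workers. It is not taken from HPG Variant: the function names (`parse_record`, `count_chunk`, `count_variants`), the chunk size, and the sample records are all assumptions made for this example, and Python threads stand in for the OpenMP worker threads of the real back-end without offering the same CPU parallelism. Only the eight fixed, tab-separated VCF columns come from the format itself.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# The eight fixed, tab-separated columns of a VCF data line.
VCF_COLUMNS = ("CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO")


def parse_record(line):
    """Split one VCF data line into its eight fixed fields (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    return record


def count_chunk(lines):
    """Count variants per chromosome in one chunk, skipping headers and blanks."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        counts[parse_record(line)["CHROM"]] += 1
    return counts


def count_variants(lines, chunk_size=2, workers=2):
    """Split the input into chunks, process them in a pool, and merge the counts."""
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunks):
            total += partial
    return total


# Made-up sample records, for illustration only.
SAMPLE = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t14370\trs1\tG\tA\t29\tPASS\tDP=14",
    "1\t17330\t.\tT\tA\t3\tq10\tDP=11",
    "2\t1110696\trs2\tA\tG,T\t67\tPASS\tDP=10",
]

if __name__ == "__main__":
    print(dict(count_variants(SAMPLE)))
```

Each chunk produces its own partial counter that is merged at the end, so the workers share no mutable state. This mirrors, in simplified form, the reduction pattern commonly used when a loop over records is parallelized.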