aRNApipe: A balanced, efficient and distributed pipeline for processing RNA-seq data in high performance computing environments

Academic Article

Abstract

  • Summary

    The wide range of RNA-seq applications and their high computational needs require the development of pipelines orchestrating the entire workflow and optimizing usage of available computational resources. We present aRNApipe, a project-oriented pipeline for processing of RNA-seq data in high performance cluster environments. aRNApipe is highly modular and can be easily migrated to any high performance computing (HPC) environment. The current applications included in aRNApipe combine the essential RNA-seq primary analyses, including quality control metrics, transcript alignment, count generation, transcript fusion identification, alternative splicing, and sequence variant calling. aRNApipe is project-oriented and dynamic so users can easily update analyses to include or exclude samples or enable additional processing modules. Workflow parameters are easily set using a single configuration file that provides centralized tracking of all analytical processes. Finally, aRNApipe incorporates interactive web reports for sample tracking and a tool for managing the genome assemblies available to perform an analysis.

    Availability and documentation

    https://github.com/HudsonAlpha/aRNAPipe ; DOI:10.5281/zenodo.202950

    Contact

    rmyers@hudsonalpha.org

    Supplementary information

    Supplementary data are available at Bioinformatics online.

Author List

  • Alonso A; Lasseigne B; Williams K; Nielsen J; Ramaker R; Hardigan A; Johnston B; Roberts B; Cooper S; Marsal S