Analysis of CUDA Approach to Next-Generation Sequencing Parallelization Using in compresso Application
Abstract
PhyNGSC is a novel parallel solution to speed up the compression of Next-Generation Sequencing (NGS) genomic data. In this project I discuss the effort to add a set of features named PhyNGSA. These features provide a way of accessing the compressed data without performing full decompression, and they also serve to test the potential value of using the GPU-based parallel computing platform CUDA for these parallel features, as opposed to the hybrid OpenMP-MPI technique used in the original design of the PhyNGSC algorithm. The features have proved capable of both pulling important data from the NGSC file without decompressing it entirely and modifying the data as it is being decompressed. When CUDA was applied to these features to test its applicability, I found that CUDA’s versatility is currently limited and that it is therefore only minimally useful for this project.