Presentation
TANGO: A GPU-Optimized Traceback Approach for Sequence Alignment Algorithms
Description
Sequence alignment algorithms play a central role in most bioinformatics software. However, porting these algorithms to GPUs can be challenging because they rely on irregular memory access patterns and integer-heavy operations. Here we present TANGO, an optimized GPU implementation of the Smith-Waterman (SW) algorithm with a focus on the traceback phase. We use stacked diagonal-major indexing and a compressed binary representation to adapt the traceback phase efficiently to GPUs. Our implementation achieves speedups of 12.6x for DNA alignments and 9.9x for protein alignments over state-of-the-art CPU libraries. It is the fastest SW library for protein alignments on GPU and performs comparably to other GPU libraries for DNA. Finally, we integrate TANGO into a large-scale metagenome assembly software package to speed up a production workflow.
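The abstract does not spell out TANGO's data layout, so the following is only a minimal CPU sketch of the two general ideas it names, not TANGO's implementation. It assumes a linear gap penalty, a plain (not "stacked") anti-diagonal-major layout, and 2-bit traceback codes packed four per byte. All function names and scoring parameters are hypothetical. Cells on the same anti-diagonal (i + j constant) do not depend on each other, which is why a GPU can compute them in parallel and why storing them contiguously suits coalesced memory access.

```python
# Illustrative sketch only: Smith-Waterman with a 2-bit packed traceback
# matrix stored in anti-diagonal-major order. Names and scores are assumptions.
STOP, DIAG, UP, LEFT = 0, 1, 2, 3  # 2-bit traceback codes


def diag_offsets(m, n):
    """Return (start index of each anti-diagonal d = i + j, total cell count)."""
    offsets, total = {}, 0
    for d in range(2, m + n + 1):
        offsets[d] = total
        total += min(m, d - 1) - max(1, d - n) + 1  # cells on diagonal d
    return offsets, total


def cell_index(i, j, n, offsets):
    """Flat diagonal-major index of cell (i, j), both 1-based."""
    d = i + j
    return offsets[d] + (i - max(1, d - n))


def set_code(packed, idx, code):
    """Store a 2-bit code at logical position idx (four codes per byte)."""
    packed[idx >> 2] |= code << ((idx & 3) * 2)


def get_code(packed, idx):
    """Read the 2-bit code at logical position idx."""
    return (packed[idx >> 2] >> ((idx & 3) * 2)) & 3


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local score, aligned a, aligned b)."""
    m, n = len(a), len(b)
    offsets, total = diag_offsets(m, n)
    packed = bytearray((total + 3) // 4)       # 2 bits per cell
    H = [[0] * (n + 1) for _ in range(m + 1)]  # full score matrix, for clarity
    best, best_pos = 0, (0, 0)

    # Sweep anti-diagonals; the inner loop is the part a GPU would parallelize.
    for d in range(2, m + n + 1):
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            sub = match if a[i - 1] == b[j - 1] else mismatch
            diag = H[i - 1][j - 1] + sub
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            score = max(0, diag, up, left)
            H[i][j] = score
            if score == 0:
                code = STOP
            elif score == diag:
                code = DIAG
            elif score == up:
                code = UP
            else:
                code = LEFT
            set_code(packed, cell_index(i, j, n, offsets), code)
            if score > best:
                best, best_pos = score, (i, j)

    # Traceback reads only the packed codes, never the score matrix.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0:
        code = get_code(packed, cell_index(i, j, n, offsets))
        if code == STOP:
            break
        if code == DIAG:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif code == UP:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `smith_waterman("TTACGTT", "GGACGGG")` recovers the shared local region `ACG`. Packing four codes per byte cuts the traceback matrix to a quarter of a byte-per-cell layout; an affine-gap variant would need more bits per cell, which this sketch does not cover.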