Sequencing framework for the sensitive detection and precise mapping of defective interfering particle-associated deletions across influenza A and B viruses

The mechanisms and consequences of defective interfering particle (DIP) formation during influenza virus infection remain poorly understood. The development of next generation sequencing (NGS) technologies has made it possible to identify large numbers of DIP-associated sequences, providing a powerful tool for understanding their biological relevance. However, NGS approaches pose numerous technical challenges, including the precise identification and mapping of deletion junctions in the presence of frequent mutation and base-calling errors, and the potential for numerous experimental and computational artifacts. Here we detail an Illumina-based sequencing framework and bioinformatics pipeline capable of generating highly accurate and reproducible profiles of DIP-associated junction sequences. We use a combination of simulated and experimental control datasets to optimize pipeline performance and demonstrate the absence of significant artifacts. Finally, we use this optimized pipeline to reveal how the patterns of DIP-associated junction formation differ among strains and subtypes of influenza A and B viruses and to demonstrate how these data can provide insight into mechanisms of DIP formation. Overall, this work provides a detailed roadmap for high resolution profiling and analysis of DIP-associated sequences within influenza virus populations.

IMPORTANCE Influenza virus defective interfering particles (DIPs) that harbor internal deletions within their genomes occur naturally during infection in humans and cell culture. They have been hypothesized to influence the pathogenicity of the virus; however, their specific function remains elusive. The accurate detection of DIP-associated deletion junctions is crucial for understanding DIP biology but is complicated by an array of technical issues that can bias or confound results.
Here we demonstrate a combined experimental and computational framework for detecting DIP-associated deletion junctions using next generation sequencing (NGS). We detail how to validate pipeline performance and provide the bioinformatics pipeline for groups interested in using it. Using this optimized pipeline, we detect hundreds of distinct deletion junctions generated during infection with a diverse panel of influenza viruses and use these data to test a long-standing hypothesis concerning the molecular details of DIP formation.