Subgenomic RNA identification in SARS-CoV-2 genomic sequencing data

Parker, Matthew D., Lindsey, Benjamin B., Leary, Shay, Gaudieri, Silvana, Chopra, Abha, Wyles, Matthew, Angyal, Adrienn, Green, Luke R., Parsons, Paul, Tucker, Rachel M., Brown, Rebecca, Groves, Danielle, Johnson, Katie, Carrilero, Laura, Heffer, Joe, Partridge, David G., Evans, Cariad, Raza, Mohammad, Keeley, Alexander J., Smith, Nikki, Filipe, Ana Da Silva, Shepherd, James G., Davis, Chris, Bennett, Sahan, Sreenu, Vattipally B., Kohl, Alain, Aranday-Cortes, Elihu, Tong, Lily, Nichols, Jenna, Thomson, Emma C., Wang, Dennis, Mallal, Simon, de Silva, Thushan I., The COVID-19 Genomics UK (COG-UK) Consortium, Bashton, Matthew, Young, Greg, Allan, John, Loh, Joshua, Nelson, Andrew, Smith, Darren and Yew, Wen Chyin (2021) Subgenomic RNA identification in SARS-CoV-2 genomic sequencing data. Genome Research, 31 (4). pp. 645-658. ISSN 1088-9051

Text (Final published version)
645.full.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Text (Advance online version)
gr.268110.120.full.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Text
Supplemental_Material_.pdf - Supplemental Material

Official URL: https://doi.org/10.1101/gr.268110.120

Abstract

We have developed periscope, a tool for the detection and quantification of subgenomic RNA (sgRNA) in SARS-CoV-2 genomic sequence data. The translation of the SARS-CoV-2 RNA genome for most open reading frames (ORFs) occurs via RNA intermediates termed “subgenomic RNAs.” sgRNAs are produced through discontinuous transcription, which relies on homology between transcription regulatory sequences (TRS-B) upstream of the ORF start codons and that of the TRS-L, which is located in the 5′ UTR. TRS-L is immediately preceded by a leader sequence. This leader sequence is therefore found at the 5′ end of all sgRNA. We applied periscope to 1155 SARS-CoV-2 genomes from Sheffield, United Kingdom, and validated our findings using orthogonal data sets and in vitro cell systems. By using a simple local alignment to detect reads that contain the leader sequence, we were able to identify and quantify reads arising from canonical and noncanonical sgRNA. We were able to detect all canonical sgRNAs at the expected abundances, with the exception of ORF10. A number of recurrent noncanonical sgRNAs are detected. We show that the results are reproducible using technical replicates and determine the optimum number of reads for sgRNA analysis. In VeroE6 ACE2+/− cell lines, periscope can detect the changes in the kinetics of sgRNA in orthogonal sequencing data sets. Finally, variants found in genomic RNA are transmitted to sgRNAs with high fidelity in most cases. This tool can be applied to all sequenced COVID-19 samples worldwide to provide comprehensive analysis of SARS-CoV-2 sgRNA.
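The abstract's central idea, detecting reads that contain the leader sequence by local alignment, can be illustrated with a minimal sketch. This is not periscope's code: the function names, the scoring scheme (match +2, mismatch -2, gap -3), the 90% score threshold, and the `LEADER` string are all assumptions chosen for illustration, and the leader string should be checked against the reference genome before any real use.

```python
# Illustrative sketch only; not the periscope implementation.
# LEADER is an assumed leader-like sequence; verify it against the reference.
LEADER = "AACCAACTTTCGATCTCTTGTAGATCTGTTCT"


def local_alignment_score(query, target, match=2, mismatch=-2, gap=-3):
    """Return the best Smith-Waterman local alignment score of query vs target."""
    prev = [0] * (len(target) + 1)  # previous row of the scoring matrix
    best = 0
    for q in query:
        curr = [0]  # first column is always 0 in local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def contains_leader(read, leader=LEADER, min_fraction=0.9, match=2):
    """Flag a read whose best local alignment to the leader is near-perfect.

    min_fraction is a hypothetical threshold: the fraction of the maximum
    possible score (a full-length exact match) required to call a hit.
    """
    max_score = match * len(leader)
    return local_alignment_score(leader, read, match=match) >= min_fraction * max_score


def count_sgrna_candidates(reads):
    """Return (reads containing the leader, reads without it)."""
    hits = sum(contains_leader(read) for read in reads)
    return hits, len(reads) - hits


if __name__ == "__main__":
    reads = [
        LEADER + "CTAAACGAACAATGTCTGATAATGGACCCC",  # leader joined to a downstream body
        "G" * 50,                                   # read with no leader
    ]
    print(count_sgrna_candidates(reads))
```

A real pipeline would also need to locate where the leader ends in each read to assign it to an ORF, and to normalize counts against genomic reads; neither step is shown here.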

Item Type: Article
Additional Information: Funding information: Sequencing of SARS-CoV-2 samples was undertaken by the Sheffield COVID-19 Genomics Group as part of the COG-UK Consortium and supported by funding from the Medical Research Council (MRC), part of UK Research and Innovation (UKRI), the National Institute of Health Research (NIHR), and Genome Research Limited, operating as the Wellcome Sanger Institute. M.D.P. and D.W. were funded by the NIHR Sheffield Biomedical Research Centre (BRC; IS-BRC-1215-20017). T.I.d.S. is supported by a Wellcome Trust Intermediate Clinical Fellowship (110058/Z/15/Z). A.K. is supported by the MRC (MC_UU_12014/8). A.d.S.F. and L.T. are supported by the MRC (MC_UU_12014/12). We thank Mark Dunning of the Sheffield Bioinformatics Core for useful discussions around the normalization techniques. We thank all partners of and contributors to the COG-UK Consortium, who are listed at https://www.cogconsortium.uk/about/. Full lists of Consortium authors and affiliations are located in the Supplemental Material.
Subjects: B100 Anatomy, Physiology and Pathology
C500 Microbiology
C700 Molecular Biology, Biophysics and Biochemistry
Department: Faculties > Health and Life Sciences > Applied Sciences
Depositing User: Elena Carlaw
Date Deposited: 18 Mar 2021 16:32
Last Modified: 31 May 2021 14:39
URI: http://nrl.northumbria.ac.uk/id/eprint/45736
