Publication:

Toward a Primordial Genetic Alphabet: Noncanonical Nucleotides in Nonenzymatic RNA Replication

Date

2025-06-05

The Harvard community has made this article openly available.

Citation

Jia, Xiwen. 2025. Toward a Primordial Genetic Alphabet: Noncanonical Nucleotides in Nonenzymatic RNA Replication. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Nonenzymatic RNA polymerization and copying are essential for the propagation of genetic information prior to the emergence of RNA polymerase. On the early Earth, in addition to the canonical nucleotides (A, U, C, G), alternative nucleotides composed of different sugars (arabino- and threo-nucleotides) and nucleobases (diaminopurine, 2-thiocytidine, 2-thiouridine, inosine) could have arisen through plausible prebiotic chemistry. Moreover, templated copying using the current genetic alphabet exhibits a strong bias toward G- and C-rich sequences. Given the potential diversity of nucleic acid building blocks and this inherent distribution bias, my research investigates how selection during both oligomerization and templated copying shapes the incorporation of noncanonical components. I study how variations in sugars and nucleobases influence RNA hybridization, structure, and sequence distribution.

At the sugar level, arabino- and threo-nucleotides likely emerged alongside ribonucleotides, as they share a common synthetic pathway. I find that ribo- and arabino-nucleotides exhibit comparable incorporation efficiencies during non-templated primer extension, whereas threo-nucleotides are significantly less reactive. Moreover, the incorporation of an arabino-nucleotide at the end of a primer acts as a chain terminator. Competition experiments further reveal a bias against the incorporation of threo-nucleotides. These inherent biases, when considered alongside selective prebiotic synthesis and the known preference for ribonucleotides in templated copying, offer a plausible explanation for the exclusion of arabino- and threo-nucleotides from primordial oligonucleotides.

At the nucleobase level, I investigate the diaminopurine:uracil (D:U) base pair. Replacing adenine with diaminopurine (D) enhances pairing with uracil, but also introduces potential fidelity issues, as D can form a wobble-type mismatch with cytosine. Reassuringly, D:C mismatches exhibit high stalling factors, limiting erroneous extension. Deep sequencing of templated copying reactions further demonstrates that the noncanonical DUCG system yields more uniform product distributions and fewer mismatches than the canonical AUCG system. These findings position diaminopurine as a promising nucleobase for artificial nonenzymatic RNA replication systems.

Expanding beyond diaminopurine, I further explore a noncanonical genetic alphabet composed of 2-thiouridine (s2U), 2-thiocytidine (s2C), inosine (I), and adenine (A) to address the distribution bias present in the canonical alphabet. Thermodynamic and crystallographic studies show that the I:s2C and A:s2U base pairs are both isomorphic and isoenergetic. While I:s2C is slightly weaker than the canonical G:C pair, A:s2U is stronger than A:U, resulting in a balanced base-pairing landscape. Consistent with this, kinetic analyses of nonenzymatic templated primer extension reveal similar binding in the s2U/s2C/I/A system. Together, these results support the feasibility of a primordial genetic system based on s2U, s2C, I, and A, providing a potential solution to the challenge of biased nucleotide incorporation in early RNA replication.

Overall, my work investigates the roles of noncanonical nucleotides in nonenzymatic RNA replication and provides new insights into constructing a primordial genetic alphabet.

Keywords

Chemistry

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
