
Compatibility of 5-ethynyl-2’F-ANA UTP with in vitro selection for the generation of base-modified, nuclease resistant aptamers

Fabienne Levi-Acobas,a,b‡ Adam Katolik,c,d‡ Pascal Röthlisberger,a,b Thomas Cokelaer,e,f Ivo Sarac,a,b Masad J. Damha,d Christian J. Leumann,c and Marcel Hollensteina,b*

a Institut Pasteur, Department of Structural Biology and Chemistry, Laboratory for Bioorganic Chemistry of Nucleic Acids, CNRS UMR 3523, 28, rue du Docteur Roux, 75724 Paris Cedex 15, France.

b Institut Pasteur, Department of Genome and Genetics

c Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, CH-3012 Bern, Switzerland

d Department of Chemistry, McGill University, 801 Rue Sherbrooke, Montréal, QC H3A 0B8, Canada

e Institut Pasteur, Bioinformatics and Biostatistics Hub, Department of Computational Biology, Institut Pasteur, USR 3756 CNRS, Paris, France.

f Institut Pasteur, Biomics Platform, C2RT, Paris, France

1. Materials and Methods

2. Synthesis of the modified triphosphate

3. List of oligonucleotides

4. Biochemical protocols

5. Additional figures

Electronic Supplementary Material (ESI) for Organic & Biomolecular Chemistry. This journal is © The Royal Society of Chemistry 2019


1. Materials and methods

All reactions were performed under Ar in flame-dried glassware. Anhydrous solvents for reactions were obtained by filtration through activated aluminum oxide or by storage over activated 4 Å molecular sieves. Flash chromatography was performed using silica gel (230–400 mesh) from Silicycle. Thin layer chromatography was carried out on precoated glass-backed plates of silica gel (0.25 mm, UV254) from Macherey-Nagel. All chemicals and solvents were purchased from Sigma-Aldrich unless stated otherwise. NMR spectra were recorded on a Bruker DRX-400 or a Bruker AC-300 spectrometer (400 or 300 MHz for 1H, 101 or 75.5 MHz for 13C, 121.4 MHz for 31P, and 376.5 MHz for 19F) and all spectra were referenced to the signals of the corresponding solvent. Chemical shifts are given in ppm (δ scale) and coupling constants (J) in Hz. NMR signals were assigned using a combination of 1H/1H-COSY, 13C-DEPT-135, and 13C/1H-HMBC experiments. High resolution electrospray ionization (ESI) mass spectra (MS, m/z) were recorded on a Thermo Scientific LTQ Orbitrap XL instrument. MALDI-TOF spectra were recorded on an Applied Biosystems Sciex QSTAR Pulsar instrument. HPLC purification was performed using an Äkta™ basic 10/100 system (Amersham Pharmacia Biotech) equipped with a semi-preparative Phenomenex Jupiter RP-HPLC column (5 µm, C18, 300 Å). The starting material 2-deoxy-2-fluoro-1,3,5-tri-O-benzoyl-D-arabinofuranose was purchased from TCI.

All DNA oligonucleotides were purchased from Microsynth. All DNA polymerases (Therminator, Vent (exo-), Deep Vent, Bst, Taq, Dpo4, Q5, and the Klenow fragment of DNA polymerase I exo- (Kf exo-)), the natural dNTPs, and the λ-exonuclease were purchased from New England Biolabs. PCR reactions were performed on a SimpliAmp thermal cycler from ThermoFisher.

Acrylamide/bisacrylamide (29:1, 40%) was obtained from Fisher Scientific. Visualization of PAGE gels was performed by fluorescence imaging using a Storm 860 phosphorimager with the ImageQuant software (both from GE Healthcare). Agarose gels (2%, stained with SYBR Green) were from Invitrogen and were run and analyzed on an E-gel iBase system from Invitrogen.


2. Synthesis of the modified triphosphate 8:

[Scheme: conversion of the tri-O-benzoyl sugar 1 into the 1-bromo sugar 2. Reagents and conditions: HBr in AcOH (33%), CH2Cl2, 0 °C → r.t., overnight.]

1-Bromo-2-deoxy-2-fluoro-3,5-di-O-benzoyl-alpha-D-arabinofuranose (2): A solution of 2-Deoxy-2-fluoro-1,3,5-tri-O-benzoyl-D-arabinofuranose (0.30 g, 6.43 mmol, 1 eq.) in CH2Cl2 (18.3 ml, dry) was placed in an ice bath to cool to 0°C and HBr (33% in acetic acid, 4.1 ml, 24.85 mmol, 3.9 eq.) was added in a dropwise manner. Next, the ice was removed and the mixture was allowed to slowly warm up to ambient temperature while stirring, and continued reacting at this temperature overnight. Importantly, the entire isolation proceeded at a reduced (near 0°C) temperature. Specifically, prior to stopping the reaction, aliquots of CH2Cl2, water, and saturated NaHCO3 solution were all pre-cooled along with the separatory funnel and collection flasks. The reaction mixture was then diluted with cold CH2Cl2 (50 mL) and washed with cold water (1 x 50 ml), cold saturated NaHCO3 solution (2 x 50 ml), dried over MgSO4, filtered and evaporated (without heating the water bath) to yield the product 2 (assumed 100%) as a brown oil. Because of its instability, product 2 was immediately reacted further without additional purification or characterization.

[Scheme: coupling of the bromo sugar 2 with silylated 5-(2-trimethylsilylethynyl)uracil to give nucleoside 3. Solvent: toluene. Conditions: reflux (80 °C), overnight.]

5-(2-Trimethylsilylethynyl)-2′-deoxy-2′-fluoro-3′,5′-di-O-benzoyl-D-arabinouridine (3): A suspension of 5-(2-trimethylsilylethynyl)-uracil (2.67 g, 12.8 mmol, 2 eq.) in dry toluene (46 ml) in a 250 mL round bottom flask was treated with BSA (6.34 ml, 9.7 mmol, 4 eq.) and stirred at reflux (80°C) for 30 min, after which it became a clear solution. The nucleobase mixture was then temporarily removed from reflux, the brominated sugar 2 (2.71 g, 12.8 mmol, 1 eq.) in dry toluene (18 ml) was added to it, and the reflux was re-applied overnight. Next, the reaction mixture was diluted with CH2Cl2 (100 mL) and a small quantity of saturated NaHCO3 solution (20 ml) was added, causing the remaining nucleobase to precipitate out of solution. The mixture was then filtered through Celite, which was washed with more CH2Cl2 (100 mL). Next, the filtrate was washed twice with saturated NaHCO3 (100 ml) and the aqueous phase was extracted with CH2Cl2 (2 x 50 ml). The organic phases were combined, dried over MgSO4, filtered, and evaporated. The reaction produced two anomers, 3α and 3β, of which the latter was the desired product. To achieve separation, a slow CC gradient was applied (hexane/EtOAc 4:1 → 3:2), which eluted the undesired α anomer first, followed by the desired β anomer at the 3:2 ratio (1.5 g, 42% overall yield from two reactions); however, 9% α anomer was still present and was removed at a later stage. Characterization of the β anomer:

1H NMR (400 MHz, CDCl3) δ 8.43 (s, 1H), 7.97 – 7.82 (m, 4H), 7.74 – 7.69 (m, 1H), 7.51 – 7.38 (m, 2H), 7.38 – 7.26 (m, 4H), 6.13 (dt, J = 21.2, 4.4 Hz, 1H), 5.50 – 5.41 (m, 1H), 5.25 – 5.09 (m, 1H), 4.73 – 4.55 (m, 2H), 4.41 – 4.34 (m, 1H), 0.09 – -0.08 (m, 9H).

Additional peaks corresponding to impurities not previously listed: δ 1.08 (dt, 2H) unknown impurity; δ 2.00 (s, 1H) ethyl acetate (2.05); δ 3.56 (q, 1H) unknown impurity.

13C NMR (101 MHz, CDCl3) δ 166.45, 165.36, 160.79, 149.08, 143.88, 143.85, 134.43, 133.69, 130.20, 129.98, 129.50, 128.99, 128.84, 128.35, 100.71, 100.06, 94.88, 93.55, 91.63, 85.55, 85.39, 82.07, 63.40, 58.70, 31.13, 18.66, 0.00.

19F NMR (376 MHz, CDCl3) δ -200.73, -200.78, -200.83, -200.86, -200.91, -200.96.

ESI+-HRMS: calculated for C28H28FN2O7Si+ ([M+H]+) 551.1644, found 551.1632
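The agreement between calculated and found HRMS values is commonly expressed as a mass error in ppm. The following is a minimal sketch of that arithmetic using the values reported above for compound 3; the function name `ppm_error` is illustrative and not part of the original work.

```python
def ppm_error(calculated_mz: float, found_mz: float) -> float:
    """Return the signed mass error (found - calculated) in parts per million."""
    return (found_mz - calculated_mz) / calculated_mz * 1e6


# [M+H]+ of compound 3, calculated and found values as reported above.
error_3 = ppm_error(551.1644, 551.1632)
print(f"Mass error for compound 3: {error_3:.1f} ppm")
```

The same helper can be applied to the other HRMS entries in this section by substituting the reported calculated and found values.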

[Scheme: conversion of 3 into 4. Reagents: 1. NaOH; 2. NH4Cl. Solvent: THF/MeOH/H2O 5:4:1. Conditions: 0 °C → r.t.]

5-Ethynyl-2′-deoxy-2′-fluoro-D-arabinouridine (4): The benzoyl protected nucleoside 3 (1.50 g, 2.79 mmol, 1 eq) was placed in a round bottom flask and cooled to 0°C. A solution of 0.2 M NaOH in THF/MeOH/H2O 5:4:1 (55.6 ml, 11.12 mmol of NaOH, 4 eq.) was cooled to 0°C and then added to the nucleoside, and the mixture was stirred at 0°C for 1h. Next, NH4Cl (0.753 g, 11.25 mmol, 5 eq.) was added and the mixture was stirred at r.t. for 20 min. After this, the pH was tested to ensure that the solution was no longer acidic, and the mixture was evaporated. Silica gel that had been pre-neutralized with triethylamine was then added along with CH2Cl2, and this mixture was evaporated to adsorb the product. Purification by CC (CH2Cl2/MeOH 95:5) afforded the debenzoylated nucleoside 4, still with 10% α anomer present (632 mg, 69%).

1H NMR (300 MHz, DMSO) δ 11.62 (s, 1H), 7.94 (d, J = 1.0 Hz), 5.89 (dd, J = 14.4, 4.4 Hz, 1H), 5.71 (d, J = 5.1 Hz, 1H), 5.04 (t, J = 5.5 Hz, 1H), 4.87 (dd, J = 56.7, 3.9 Hz, 1H), 4.04 (ddd, J = 20.6, 8.8, 5.2 Hz, 1H), 3.61 (d, J = 4.4 Hz, 1H), 3.47 (s, 1H), 3.42 (d, J = 5.2 Hz, 1H), 3.13 (s, 2H), 2.31 (dd, J = 3.6, 1.8 Hz, 1H).

1H NMR (400 MHz, DMSO) δ 11.62 (s, 1H), 7.94 (d, J = 1.1, 1H), 5.89 (dd, J = 14.4, 4.4, 1H), 5.70 (d, J = 5.1, 1H), 5.03 (t, J = 5.5, 1H), 4.96 – 4.79 (m, 1H), 4.10 – 3.99 (m, 1H), 3.94 (s, 1H), 3.61 (dd, J = 8.9, 4.3, 1H), 3.53 – 3.45 (m, 1H), 3.43 – 3.35 (m, 1H), 3.12 (s, 2H, impurity), 2.31 (dt, J = 3.6, 1.8, 1H, impurity).

13C NMR (101 MHz, DMSO) δ 161.89, 149.55, 145.11, 97.82, 96.89, 94.98, 84.30, 84.14, 84.09, 83.47, 83.31, 76.48, 72.90, 72.66, 59.98, 31.15.

19F NMR (376 MHz, DMSO) δ -198.49, -198.53, -198.54, -198.58, -198.63, -198.67, -198.69, -198.73.

ESI+-HRMS: calculated for C11H11FN2NaO5+ ([M+Na]+) 293.0544, found 293.0541.

[Scheme: conversion of 4 into 5. Reagent: DMTrCl.]

5-Ethynyl-2′-deoxy-2′-fluoro-5′-dimethoxytrityl-D-arabinouridine (5): The nucleoside 4 (581 mg, 2.15 mmol, 1 eq.) was dissolved in fresh dry pyridine (9.6 mL) and DMTr-Cl (785 mg, 2.32 mmol, 1.08 eq) was added in one portion, giving an orange solution that was left to stir overnight. The reaction mixture was diluted with CH2Cl2 (50 mL), then washed with sat. NaHCO3 solution (2 x 50 ml), and the combined aqueous layers were extracted with CH2Cl2 (2 x 50 mL). The combined organic phases were dried over MgSO4, filtered, and evaporated, resulting in an orange foam. This product was purified by CC (hexane/EtOAc 3:1→1:1) on silica gel that had been pre-neutralized with 3% triethylamine in CH2Cl2. At this point it was still not possible to remove the remaining α anomer. Compound 5 (1.05 g, 83%) was obtained as a white foam.

1H NMR (400 MHz, CDCl3) δ 8.65 (s, 1H), 7.80 (t, J = 4.5 Hz, 1H), 7.45 – 7.32 (m, 2H), 7.34 – 7.06 (m, 7H), 6.85 – 6.73 (m, 4H), 6.12 (dt, J = 11.4, 6.9 Hz, 1H), 5.10 – 4.85 (m, 1H), 4.37 – 4.27 (m, 1H), 3.97 (dd, J = 9.3, 4.7 Hz, 1H), 3.72 (d, J = 2.9 Hz, 6H), 3.42 – 3.26 (m, 2H), 3.00 (s, 1H), 2.56 (d, J = 4.3 Hz, 1H).

Additional peaks corresponding to impurities not previously listed: δ 1.97 (q, 1H) probably ethyl acetate (2.05); δ 1.57 (s, 1H) H2O; 1.18 (q, 1H) probably ethyl acetate (1.26).


13C NMR (101 MHz, CDCl3) δ 171.22, 161.06, 158.67, 148.82, 144.60, 144.45, 139.47, 135.63, 135.54, 130.01, 129.97, 129.14, 128.03, 127.85, 127.78, 127.01, 113.34, 113.19, 98.91, 95.73, 93.81, 86.71, 84.48, 84.31, 83.55, 82.15, 75.66, 75.40, 74.23, 62.63, 60.42, 55.26, 21.05, 14.19.

19F NMR (376 MHz, CDCl3) δ -197.91, -197.96, -198.01, -198.05, -198.10, -198.15.

ESI+-HRMS: calculated for C32H29FN2NaO7+ ([M+Na]+) 595.1851, found 595.1844.

[Scheme: conversion of 5 into 6. Reagent: Ac2O. Solvent: pyridine. Conditions: 0 °C → r.t.]

5-Ethynyl-2′-deoxy-2′-fluoro-3′-acetyl-5′-dimethoxytrityl-D-arabinouridine (6): Compound 5 (482 mg, 0.84 mmol, 1 eq.) was dissolved in pyridine (4.3 mL), the solution was cooled to 0°C, and acetic anhydride (1.07 mL, 1.16 g, 11.36 mmol, 13.5 eq.) was added in a dropwise manner. Next, the ice was removed and the reaction mixture was allowed to slowly warm to ambient temperature. After reacting for 1h, the reaction mixture was evaporated, leaving the crude mixture in the flask. According to TLC (hexanes/ethyl acetate 1:1), it now became possible to completely remove the 10% α anomer. The purification proceeded by CC (hexane/EtOAc 4:1→7:3); the β anomer eluted in the latter part of the gradient, while the α anomer eluted earlier. After evaporation, the product 6 was isolated as a white foam (415 mg, 80%).

1H NMR (400 MHz, CDCl3) δ 8.54 (s, 1H), 7.83 (d, J = 1.9 Hz, 1H), 7.41 – 7.34 (m, 2H), 7.30 – 7.10 (m, 7H), 6.81 – 6.72 (m, 4H), 6.06 (dt, J = 14.7, 7.3 Hz, 1H), 5.28 (dd, J = 18.0, 3.2 Hz, 1H), 5.01 (dd, J = 50.3, 3.1 Hz, 1H), 4.04 (dt, J = 7.1, 6.0 Hz, 2H), 3.71 (s, 6H), 3.46 – 3.27 (m, 2H), 2.97 (s, 1H), 2.04 (s, 3H), 1.97 (d, J = 7.3 Hz, 2H, unknown impurity), 1.53 (s, 2H, H2O), 1.17 (t, J = 7.1 Hz, 2H, unknown impurity).

13C NMR (101 MHz, CDCl3) δ 169.31, 160.83, 158.65, 148.82, 144.58, 144.55, 144.40, 135.65, 135.48, 130.02, 129.95, 128.01, 126.98, 113.33, 113.32, 99.09, 93.64, 91.72, 86.79, 84.76, 84.59, 82.49, 82.13, 76.32, 76.02, 74.10, 62.63, 60.40, 55.25, 21.04, 20.65, 14.20.

19F NMR (376 MHz, CDCl3) δ -199.05, -199.10, -199.15, -199.18, -199.24, -199.29.

ESI+-HRMS: calculated for C34H31FN2NaO8+ ([M+Na]+) 637.1957, found 637.201966.

calculated for C34H31FKN2O8+ ([M+K]+) 653.1696, found 653.1700.


[Scheme: conversion of 6 into 7. Reagents: TCA, DCM.]

5-Ethynyl-2′-deoxy-2′-fluoro-3′-acetyl-D-arabinouridine (7): The dried compound 6 (415 mg, 0.68 mmol) was first cooled to 0°C and a solution of trichloroacetic acid (3%) in CH2Cl2 (5.1 mL), pre-cooled to the same temperature, was added to it. The mixture immediately turned orange and was stirred while warming up to room temperature. After TLC confirmed that the detritylation was complete (t~30 min), methanol (10 mL) was added to the reaction mixture, and the total reaction mixture was evaporated. Unexpectedly, the product was black after evaporation, and when it was re-dissolved in MeOH (10 mL) the solution was green. The product was then re-evaporated to ensure that all the trichloroacetic acid was gone. Purification proceeded by CC (CH2Cl2/MeOH 50:1→10:1), and the product was isolated as a white foam (95 mg, 44.8%). There was also a blue-colored side product, which eluted earlier from the column (not characterized).

1H NMR (400 MHz, Acetone) δ 10.33 (s, 1H), 8.00 (d, J = 1.8 Hz, 1H), 6.10 – 6.02 (m, 1H), 5.28 (ddd, J = 19.8, 4.4, 1.6 Hz, 1H), 5.14 (ddd, J = 50.9, 3.6, 1.6 Hz, 1H), 4.31 (d, J = 5.2 Hz, 1H), 4.03 (dd, J = 8.0, 4.1 Hz, 1H), 3.75 (ddd, J = 16.8, 12.2, 4.0 Hz, 2H), 3.49 (s, 1H), 2.68 (d, J = 13.3 Hz, 2H), 1.99 (s, 3H), 1.92 (dt, J = 4.4, 2.2 Hz, 2H).

1H NMR (400 MHz, Acetone) δ 8.01 (d, J = 1.7 Hz, 1H), 6.07 (dd, J = 18.6, 3.6 Hz, 1H), 5.33 – 5.23 (m, 1H), 5.22 – 5.04 (m, 1H), 4.04 (dd, J = 7.9, 4.1 Hz, 1H), 3.77 (ddd, J = 16.8, 12.3, 3.9 Hz, 2H), 3.50 (s, 1H), 1.99 (d, J = 9.2 Hz, 3H).

13C NMR (101 MHz, Acetone) δ 169.40, 160.70, 149.12, 144.71, 144.68, 98.14, 94.41, 92.50, 84.17, 84.00, 82.71, 82.68, 81.67, 75.50, 75.48, 75.19, 60.59, 19.77.

19F NMR (376 MHz, Acetone) δ -200.65, -200.70, -200.75, -200.79, -200.84, -200.89.

ESI+-HRMS: calculated for C13H14FN2O6+ ([M+H]+) 313.0830, found 313.0839.

5-ethynyl-2’F-ANA UTP (8):

[Scheme: conversion of 7 into triphosphate 8 via intermediates 7.1 and 7.2. Reagents: salicyl phosphorochloridite; (nBu3NH)2H2P2O7; 1. iodine; 2. base.]

All reactions in this series proceeded in situ. A day before the reaction, nucleoside 7 (30.8 mg, 0.10 mmol, 1 eq.) and 2-chloro-1,3,2-benzodioxaphosphorin-4-one (22 mg, 0.11 mmol, 1.1 eq.) were placed in separate round bottom flasks. The nucleoside was co-evaporated twice with dry pyridine, and both compounds were then dried under reduced pressure overnight, released from vacuum under Ar, and sealed. Pyridine (0.14 mL, 1.73 mmol, 17.5 eq.) was added to the 2-chloro-1,3,2-benzodioxaphosphorin-4-one (forming a yellow solution), followed by dioxane (0.28 mL, 3.3 mmol, 33 eq., creating a suspension). Next, this suspension was withdrawn and added to the nucleoside to start the reaction, which was stirred for 1h at ambient temperature. Concurrently, tributylammonium pyrophosphate (70.4 mg, 0.13 mmol, 1.3 eq) was dried under vacuum in another vessel, released under Ar, and dissolved in dimethylformamide (0.12 mL, 1.5 mmol, 16 eq) and n-butylamine (0.04 mL, 0.4 mmol, 4 eq.); this mixture was added to the reaction mixture, which was allowed to stir. After 1h, iodine (40 mg, 0.16 mmol, 1.6 eq.) was dissolved in pyridine (0.68 mL, 1.73 mmol, 86 eq.) and water (0.0014 mL, 0.77 mmol, 7.8 eq.) and added to the stirring reaction mixture, which was stirred at room temperature for 30 min. Next, sodium thiosulfate (10% aq. m/v) was added to quench the excess iodine, and the mixture was transferred to a 10-mL flask and evaporated under reduced pressure (water bath ≤ 30°C). Next, water (3.5 mL) was added and the mixture was left to stand at room temperature for 30 min. The mixture was then transferred to a 25-mL vessel, concentrated aqueous NH4OH (7.0 mL) was added, and the mixture was stirred for 1h. Finally, NaClO4 (2% m/v in acetone) was added to trigger the precipitation of the product (observed as a yellow solid).
The product was dissolved in water, split over >20 portions, and each of these portions was separately purified by RP-HPLC (0 to 8 min: 0% B; 8 to 48 min: 0→40% B; 48 to 50 min: 40→100% B; eluent A: 50 mM TEAB, pH 7.7; eluent B: 50 mM TEAB in CH3CN/H2O 1:1). After collection, the fractions were frozen, lyophilized, dissolved in water, re-lyophilized, and this cycle was repeated twice with water, and twice with ethanol to remove the maximum possible quantity of triethylammonium bicarbonate. The products were then passed through a sodium charged Dowex ion exchange resin to exchange the counter ions from triethylammonium to sodium. They were then combined, evaporated, and re-dissolved in D2O for NMR, and quantified via UV260 (2 μmol, 2% yield)
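The RP-HPLC gradient described above is piecewise linear, so the nominal eluent composition at any time point can be obtained by interpolation. The sketch below encodes the gradient breakpoints from the text and evaluates the programmed % B at a given time; the names `GRADIENT` and `percent_b` are illustrative, and the calculation assumes an ideal gradient (no dwell volume or mixing delay).

```python
from bisect import bisect_right

# Gradient breakpoints from the protocol: (time in min, % eluent B).
GRADIENT = [(0.0, 0.0), (8.0, 0.0), (48.0, 40.0), (50.0, 100.0)]


def percent_b(t_min: float) -> float:
    """Return the programmed % B at time t_min by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that ends the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Nominal composition at the retention time reported for compound 8 (23.6 min).
print(f"{percent_b(23.6):.1f}% B")
```

Because eluent B is itself a 1:1 CH3CN/H2O mixture, the nominal acetonitrile content is half of the % B value.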

1H NMR (300 MHz, D2O_salt) δ 8.37 (s, 3H), 8.10 (s, 1H), 7.89 – 7.73 (m, 2H), 6.21 (dt, J = 13.7, 3.1 Hz, 1H), 5.23 – 5.01 (m, 1H), 4.49 (dd, J = 20.6, 9.9 Hz, 1H), 4.16 (d, J = 8.3 Hz, 3H), 3.63 (s, 1H), 2.59 (d, J = 22.6 Hz, 1H), 1.83 (s, 1H), 1.24 (d, J = 6.9 Hz, 1H).


31P NMR (162 MHz, D2O) δ 21.80 (d, J = 97.5 Hz), -6.45 (s), -10.10 – -11.45 (m), -21.59 (s).

MS MALDI: calculated for: C11H13FN2O14P3- ([M-H]-) 508.957, found 509.04

C11H12FN2NaO14P3- ([M-2H+Na]-) 530.939, found 531.04

C11H11FN2Na2O14P3- ([M-3H+2Na]-) 552.921, found 553.05

Rt (RP-HPLC): 23.6 min.
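The calculated m/z values listed for the MALDI analysis of 8 can be reproduced from the ion formulas by summing monoisotopic atomic masses and correcting for the electron mass. The following is a minimal sketch of that calculation; the function name `ion_mz` and the small mass table are illustrative, the atomic masses are standard monoisotopic values rounded to eight decimals, and the parser assumes simple formulas without brackets.

```python
import re

# Monoisotopic masses (u) of the elements needed here.
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "Na": 22.98976928,
    "P": 30.97376200,
}
ELECTRON = 0.00054858  # electron mass (u)


def ion_mz(formula: str, charge: int) -> float:
    """Return m/z of an ion given its elemental formula and signed charge."""
    mass = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[symbol] * int(count or 1)
    # A negative ion carries extra electrons; a positive ion lacks them.
    return (mass - charge * ELECTRON) / abs(charge)


for label, formula in [
    ("[M-H]-", "C11H13FN2O14P3"),
    ("[M-2H+Na]-", "C11H12FN2NaO14P3"),
    ("[M-3H+2Na]-", "C11H11FN2Na2O14P3"),
]:
    print(f"{label}: {ion_mz(formula, -1):.3f}")
```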


3. Oligonucleotides:

Table S1. Sequence composition of oligonucleotides used for PEX reactions.

Name  Sequence (5’ to 3’ direction)a
P1    FAM-GAC CCA CTC CAT CGA GAT TTC
T1    TGC CTG GTG TTT GGG AGA AAT CTC GAT GGA GTG GGT C
P2    FAM-CAT GGG CGG CAT GGG
T2    AAA AAA ACC CAT GCC GCC CAT G
P3    FAM-CAT GGG CGG CAT GGG
T3    CTA GCA TGA GCT CAG TCC CAT GCC GCC CAT G
P4    FAM-CAA GGA CAA AAT ACC TGT ATT CCT T
P5    CAA GGA CAA AAT ACC TGT ATT CCT T
P6    GAC ATC ATG AGA GAC ATC GC
T4    Phosphate-GAC ATC ATG AGA GAC ATC GCC TCT GGG CTA ATA GGA CTA CTT CTA ATC TGT AAG AGC AGA TCC CTG GAC AGG CAA GGA ATA CAG GTA TTT TGT CCT TG
P7    GTG GTG CGA AAT TTC TGA C
P8    CAC TCA CGT CAG TGA CAT GC
T5    CAC TCA CGT CAG TGA CAT GCA TGC CGA TGA CTA GTC GTC ACT AGT GCA CGT AAC GTG CTA GTC AGA AAT TTC GCA CCA C

a Underlined sections represent the primer binding regions
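The primer binding regions in Table S1 can be checked computationally: in a primer extension setup, the reverse complement of the primer should match the 3′ end of its template. The sketch below performs this check for two primer/template pairs from the table (5′ labels and the 5′-phosphate are omitted from the strings); the helper names are illustrative and not taken from the original work.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the triplet spacing used in Table S1 and normalise case."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def anneals_at_3prime_end(primer: str, template: str) -> bool:
    """True if the primer is fully complementary to the template 3' end."""
    return clean(template).endswith(reverse_complement(primer))


P1 = "GAC CCA CTC CAT CGA GAT TTC"
T1 = "TGC CTG GTG TTT GGG AGA AAT CTC GAT GGA GTG GGT C"
P4 = "CAA GGA CAA AAT ACC TGT ATT CCT T"
T4 = ("GAC ATC ATG AGA GAC ATC GCC TCT GGG CTA ATA GGA CTA "
      "CTT CTA ATC TGT AAG AGC AGA TCC CTG GAC AGG CAA GGA "
      "ATA CAG GTA TTT TGT CCT TG")

print(anneals_at_3prime_end(P1, T1))
print(anneals_at_3prime_end(P4, T4))
print(len(clean(T4)))
```

The last line prints the template length, which can be compared with the read length reported in the sequencing summary.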


4. Biochemical protocols:

i) Protocol for primer extension reactions

A 5'-FAM-labelled primer (10 pmol) was annealed to the appropriate template (15 pmol) in the presence of 10× enzyme buffer (provided by the supplier of the DNA polymerase) by heating to 95 °C and then gradually cooling to room temperature over 30 min. The appropriate DNA polymerase was then added to the annealed oligonucleotide mixture on ice. Finally, natural dNTPs and/or modified triphosphate (200 µM final concentrations) were added for a total reaction volume of 10 μL. Following incubation at the optimal temperature for the enzyme, the reactions were quenched by adding stop solution (10 μL; formamide (70%), ethylenediaminetetraacetic acid (EDTA; 50 mM), bromophenol blue (0.1%), xylene cyanol (0.1%)). The reaction mixtures were subjected to gel electrophoresis in denaturing polyacrylamide gel (20%) containing Tris-borate-EDTA (TBE) 1× buffer (pH 8) and urea (7 M). Visualization was performed by fluorescence imaging using a Storm 860 phosphorimager.
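The amounts given in this protocol translate directly into final concentrations, since 1 pmol/µL equals 1 µM. The following short sketch works through that arithmetic for the 10 µL reaction; the helper names are illustrative only.

```python
def final_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in µM for an amount in pmol in a volume in µL."""
    return amount_pmol / volume_ul


def amount_pmol(concentration_um: float, volume_ul: float) -> float:
    """Amount in pmol for a concentration in µM in a volume in µL."""
    return concentration_um * volume_ul


REACTION_VOLUME_UL = 10.0

primer_um = final_concentration_um(10.0, REACTION_VOLUME_UL)    # primer
template_um = final_concentration_um(15.0, REACTION_VOLUME_UL)  # template
ntp_pmol = amount_pmol(200.0, REACTION_VOLUME_UL)                # each triphosphate

print(f"primer {primer_um} µM, template {template_um} µM, "
      f"each triphosphate {ntp_pmol} pmol")
```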

ii) Protocol for PCR:

The PCR reaction mixtures (50 µL total) contained both primers (400 nM each of P7 and P8), template T5 (25 nM), dNTPs (200 µM for the natural dNTPs and 200 µM or 400 µM of the modified nucleotide), and the DNA polymerase (1 U) in the reaction buffer provided by the manufacturer. The PCRs were carried out in a SimpliAmp thermal cycler from ThermoFisher. 30 cycles were performed, preceded by an initial denaturation step (2 min at 94°C) and followed by a final extension step (5 min at 72°C). Each cycle included denaturation at 94°C for 1 min, annealing for 1 min at 50°C, and extension for 2 min at 72°C. All PCR products were analyzed by agarose gels (2.5 %) in TBE buffer, containing ethidium bromide. The gels were visualized by phosphorimager analysis.
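The cycling program above can be written down as a small data structure, which makes the total hold time easy to compute. This is a minimal sketch using the times and temperatures from the protocol; the class and variable names are illustrative, and ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


INITIAL = Step("initial denaturation", 94, 2)
CYCLE = (
    Step("denaturation", 94, 1),
    Step("annealing", 50, 1),
    Step("extension", 72, 2),
)
FINAL = Step("final extension", 72, 5)
N_CYCLES = 30


def total_hold_minutes() -> float:
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL.minutes + N_CYCLES * per_cycle + FINAL.minutes


print(f"Total programmed hold time: {total_hold_minutes()} min")
```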

iii) Click reaction with modified oligonucleotides

Primer extension reactions were carried out as described in protocol i) using primer P3 (30 pmol), template T3 (45 pmol), Therminator as DNA polymerase, and 60 min reaction times. The reaction mixtures were heat deactivated (90°C, 15 min) and 30 µL of DMSO were added. 2-azidoethanamine-biotin was then added to the reaction mixture (1.5 µL, 8.3 mM solution in DMF) along with sodium ascorbate (0.5 mM final concentration). The solution was then thoroughly degassed and Cu(II) with the TBTA ligand (0.5 mM final concentrations) were then added. The reaction mixture was again degassed, heated at 80°C for 3 min and then incubated at room temperature overnight. The reaction mixtures were then ethanol precipitated and dissolved in 20 µL of water. Avidin in water (2 µM final concentration) was then added and the resulting reaction mixtures were incubated at 37°C for 30 min. After addition of stop solution, the reaction products were analyzed by PAGE (20%) and visualized by fluorescence imaging using a Storm 860 phosphorimager.


iv) Sequencing and bioinformatics analysis

Scheme S1. Schematic representation of the synthesis of modified DNA and reverse-transcription to natural DNA. FANA-containing oligonucleotides are obtained by Therminator-mediated incorporation of the modified nucleotide by PEX reactions. The unmodified strand is removed by a λ-exonuclease treatment and the resulting modified ssDNA product is subjected to PCR with natural dNTPs.

The method is outlined in Scheme S1. Primer extension reactions were carried out as described in protocol i) using primer P4 (40 pmol), template T4 (60 pmol), Therminator as DNA polymerase, either natural dNTPs or a mixture of 5-ethynyl-2’F-ANA UTP, dCTP, dGTP, and dATP (200 µM final concentrations), and 60 min reaction times. After heat deactivation (90°C, 15 min), the reaction mixtures were treated with λ-exonuclease (20 U) at 37°C for 2h. The reaction mixtures were then purified by phenol-chloroform extraction followed by ethanol precipitation. After exonuclease digestion, the ssDNA products were dissolved in 10 µL H2O. The PCR reaction mixtures consisted of 2.5 µL ssDNA products, primers P5 and P6 (0.4 µM final concentrations), reaction buffer 1x (supplied with the polymerase), natural dNTPs (200 µM final concentration), and Q5 DNA polymerase (2 U). The PCR program consisted of an initial denaturation step (5 min at 94°C) followed by 30 PCR cycles (30 sec at 94°C, 30 sec at 54°C, 30 sec at 72°C) and a final elongation step (5 min at 72°C). Reaction products were analyzed by agarose gel electrophoresis (Figure 3). Of the different polymerases that were evaluated, the Q5 DNA polymerase gave the best amplification yields.

After PCR, sequencing of the products was performed at the Biomics platform of Institut Pasteur. A suitable library (average length of 245 nucleotides) was obtained using PCRFree kits from Illumina. The resulting, purified PCR products were quantified by Quant-iT PicoGreen dsDNA fluorometric quantification (Thermo Fisher Scientific, Waltham, MA, USA) and pooled for sequencing in paired-end (2x162 bases) on the Illumina MiSeq platform using the 500-cycle v2 chemistry (Illumina, San Diego, CA, USA). The samples contained 80% of the library (1 pM) and 20% of PhiX (12.5 pM). Sequencing analysis yielded a total of 791,572 reads (R1 and R2), and 404,650 reads after PhiX removal. All these reads were then analyzed as a function of their concordance, defined as 1 - (D+I+M) / (D+I+M+S) with D = total number of deletions, I = total number of insertions, and M = total number of mutations, and of the number of errors per read. These analyses are shown in Figure 3 of the manuscript and Figure S5, respectively.
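The concordance defined above can be computed per alignment from the counts of deletions, insertions, and mutations. The sketch below implements the stated formula; since S is not defined in the text, it is assumed here to be the number of matching positions in the alignment, and the function name and the example counts are purely illustrative.

```python
def concordance(deletions: int, insertions: int, mismatches: int,
                matches: int) -> float:
    """Concordance = 1 - (D + I + M) / (D + I + M + S).

    `matches` plays the role of S; treating S as the number of matching
    positions is an assumption, because the text does not define it.
    """
    errors = deletions + insertions + mismatches
    total = errors + matches
    if total == 0:
        raise ValueError("alignment contains no positions")
    return 1 - errors / total


# Hypothetical alignment: 96 matches, 1 mismatch, 1 deletion, no insertion.
example = concordance(deletions=1, insertions=0, mismatches=1, matches=96)
print(f"concordance = {example:.4f}")
```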

Summary of the results from data analysis:


1) About 800,000 reads available, of which about 400,000 are PhiX.

2) No SNVs identified.

3) No significant INDELs (<0.1%)

4) 39 million bases in all alignments, with 0.013 of insertions, 0.23% deletions, 0.47% substitutions

5) Average quality is 37.9 (phred score)

6) 99.2% of the alignments have a concordance > 95%

7) 99% of the alignments have length >70 nucleotides

8) Only 0.2 % of reads have concordance < 90%

9) 67% of the reads have exactly 98 bases
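The average quality in item 5 is a Phred score, which relates to the estimated base-calling error probability through Q = -10 log10(P). The following minimal sketch converts between the two using this standard definition; the function names are illustrative.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Estimated base-calling error probability for a Phred score Q."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred score for a base-calling error probability P."""
    return -10 * math.log10(p)


p_error = phred_to_error_probability(37.9)
print(f"Q = 37.9 corresponds to P = {p_error:.2e}")
```

A score of 37.9 therefore corresponds to an estimated error probability on the order of 1 in 6,000 base calls.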


5. Additional figures:

Figure S1. Gel image (PAGE 20%) of primer extension reactions with primer P1 and template T1. Reactions were carried out at the appropriate reaction temperature for 1 h with a mixture of 5-ethynyl-2’F-ANA UTP, dCTP, dGTP, and dATP (200 µM final concentrations). Lane 1: Therminator; lane 2: Vent (exo-); lane 3: Taq; lane 4: Bst; lane 5: Deep Vent; lane 6: Kf exo-; lane 7: control reaction with natural dNTPs and Taq; lane 8: control reaction with dATP, dCTP, dGTP, and Taq.


Figure S2. Gel image (PAGE 20%) of primer extension reactions with primer P2 and template T2. Reactions were carried out at the appropriate reaction temperature for 1 h with a mixture of 5-ethynyl-2’F-ANA UTP, dCTP, dGTP, and dATP (200 µM final concentrations). Lane 1: Therminator; lane 2: Vent (exo-); lane 3: Taq; lane 4: Bst; lane 5: Deep Vent; lane 6: Kf exo-.


Figure S3. Gel image analysis (PAGE 20%) of PEX reactions with template T4 and primer P4 with 2’-FANA-E-UTP 8 and Therminator (lane 1) and with natural dNTPs and the Vent (exo-) polymerase (lane 2). All reaction mixtures were incubated at 60°C for 60 min in the presence of 2U of polymerase and 200 µM triphosphates (modified and natural). P represents unreacted primer.


Figure S4. Gel image (agarose 2.5 %) analysis of the PCR products with template T5 and primers P7 and P8. Lane 1: Therminator and 200 µM 5-ethynyl-2’F-ANA UTP (final concentration); lane 2: Deep Vent and 200 µM 5-ethynyl-2’F-ANA UTP (final concentration); lane 3: Therminator and 400 µM 5-ethynyl-2’F-ANA UTP (final concentration); lane 4: Deep Vent and 400 µM 5-ethynyl-2’F-ANA UTP (final concentration); lane 5: Vent (exo-) and 200 µM 5-ethynyl-2’F-ANA UTP (final concentration); lane 6: negative control with only dCTP, dGTP, dATP and Taq; lane 7: positive control with natural dNTPs and Taq. L represents a DNA molecular weight ladder.


Figure S5. Analysis of the number of errors for all the reads obtained from the sequencing data.


Figure S6. 1H NMR of compound 3.

Figure S7. 13C NMR of compound 3.

Figure S8. 19F NMR of compound 3.

Figure S9. COSY spectrum of compound 3.

Figure S10. NOESY spectrum of compound 3.

Figure S11. HR-MS of compound 3.

Figure S12. 1H NMR of compound 4.

Figure S13. 13C NMR of compound 4.

Figure S14. 19F NMR of compound 4.

Figure S15. HR-ESI-MS spectrum of compound 4.

Figure S16. 1H NMR of compound 5.

Figure S17. 13C NMR of compound 5.

Figure S18. 19F NMR of compound 5.

Figure S19. HR-ESI-MS spectrum of compound 5.

Figure S20. 1H NMR of compound 6.

Figure S21. 13C NMR of compound 6.

Figure S22. 19F NMR of compound 6.

Figure S23. HR-ESI-MS spectrum of compound 6.

Figure S24. COSY spectrum of compound 6.

Figure S25. 1H NMR of compound 7.

Figure S26. 13C NMR of compound 7.

Figure S27. 19F NMR of compound 7.

Figure S28. HR-ESI-MS spectrum of compound 7.

Figure S29. 1H NMR of compound 8.

Figure S30. 31P NMR of compound 8.

Figure S31. MALDI-TOF analysis of compound 8.