Copyright 2001 by the Genetics Society of America
Trichinella spiralis mtDNA: A Nematode Mitochondrial Genome That Encodes a Putative ATP8 and Normally Structured tRNAs and Has a Gene
Arrangement Relatable to Those of Coelomate Metazoans
Dennis V. Lavrov and Wesley M. Brown
Department of Biology, University of Michigan, Ann Arbor, Michigan 48109-1048
Manuscript received March 28, 2000
Accepted for publication October 11, 2000
ABSTRACT
The complete mitochondrial DNA (mtDNA) of the nematode Trichinella spiralis has been amplified in four overlapping fragments and 16,656 bp of its sequence has been determined. This sequence contains the 37 genes typical of metazoan mtDNAs, including a putative atp8, which is absent from all other nematode mtDNAs examined. The genes are transcribed from both mtDNA strands and have an arrangement relatable to those of coelomate metazoans, but not to those of secernentean nematodes. All protein genes appear to initiate with ATN codons, typical for metazoans. Neither TTG nor GTT start codons, inferred for several genes of other nematodes, were found. The 22 T. spiralis tRNA genes fall into three categories: (i) those with the potential to form conventional “cloverleaf” secondary structures, (ii) those with TΨC arm + variable arm replacement loops, and (iii) those with DHU-arm replacement loops. Mt-tRNA(R) has a 5′-UCG-3′ anticodon, as in most other metazoans, instead of the very unusual 5′-ACG-3′ present in the secernentean nematodes. The sequence also contains a large repeat region that is polymorphic in size at the population and/or individual level.
Corresponding author: Dennis Lavrov, Département de Biochimie, Université de Montréal, C.P. 6128, Montréal, QC H3C 3J7, Canada. E-mail: [email protected]
Genetics 157: 621–637 (February 2001)

MITOCHONDRIAL DNAs (mtDNAs) vary extensively in size and gene content across diverse eukaryotic groups; those of animals (Metazoa), however, are surprisingly uniform (Lang et al. 1999). A typical metazoan mtDNA is a circular molecule of 14–18 kb and encodes 37 genes: 13 for proteins [subunits 6 and 8 of the F0 ATPase (atp6 and atp8), cytochrome c oxidase subunits 1–3 (cox1–cox3), apocytochrome b (cob), and NADH dehydrogenase subunits 1–6 and 4L (nad1–6 and nad4L)]; 2 for ribosomal RNAs [small and large subunit rRNAs (rrnS and rrnL)]; and 22 for tRNAs [designated by the one-letter code, with the two leucine and two serine tRNAs differentiated by their anticodon sequences (uag/uaa and ucu/uga, respectively)] (Wolstenholme 1992). The arrangement of these genes in metazoan mtDNA is relatively well conserved, with some blocks of genes shared even among different phyla (Boore 1999).

One metazoan group with mtDNA that deviates from the pattern just described is the phylum Nematoda. Complete mitochondrial gene arrangements are available for four nematode species: Ascaris suum and Caenorhabditis elegans (Okimoto et al. 1992), Meloidogyne javanica (Okimoto et al. 1991), and Onchocerca volvulus (Keddie et al. 1998); complete sequences are available for all except M. javanica. With the exception of the latter species, which has an unusually large noncoding region, the nematode mtDNAs are remarkably compact, ranging in size from 13,747 bp for O. volvulus to 14,284 bp for A. suum. These nematode mitochondrial genomes share several unusual features: most of their protein, rRNA, and tRNA genes are smaller than in other metazoans and one (atp8) is missing altogether; several of their protein genes initiate at the unusual start codons GTT and TTG, and none at an orthodox ATG codon (Okimoto et al. 1990); and all tRNAs encoded have secondary structures that lack either a TΨC or a DHU arm (Okimoto and Wolstenholme 1990). The gene arrangements of these nematode mtDNAs are also very unusual: although the four share some arrangements among themselves, they differ at nearly every gene boundary from all other metazoans (Boore 1999). An extreme case of nematode mitochondrial genome organization has been reported recently for the potato cyst nematode Globodera pallida (Armstrong et al. 2000). The mitochondrial genome of this animal is multipartite and exists as a population of small circular DNAs of different sizes and gene contents, a unique organization among studied metazoans. The above features have helped to reinforce a widely held view that nematodes are a bizarre group, with unclear phylogenetic affinities to the major metazoan lineages.

The four species of nematodes for which complete mtDNA sequences and/or complete gene arrangements have been published are all in the class Secernentea, one of two traditionally recognized nematode classes (Brusca and Brusca 1990). There is far less information about mtDNA from representatives of another class, Adenophorea, which is often considered more primitive. A limited amount of sequence and gene arrangement data is available for the adenophorean species Romanomermis culicivorax (Azevedo and Hyman 1993), and partial cox1 sequences are available from several species in the genus Trichinella (Nagano et al. 1999). To complicate the matter, the monophyly of the class Adenophorea is questionable: no synapomorphies support its monophyly, and both morphological and DNA sequence data indicate that it may be paraphyletic (Adamson 1987; Malakhov 1994; Blaxter et al. 1998; Voronov et al. 1998). We describe here the mitochondrial genome of Trichinella spiralis, the first comprehensively studied mtDNA from a nematode outside the class Secernentea.

MATERIALS AND METHODS

mtDNA amplification and sequencing: Total DNA from ~10,000 larvae of the nematode T. spiralis was a gift from D. Despommier. Conserved primers designed in our laboratory were used to amplify portions of cox3, cob, nad5, and nad1. We designed two primers going in opposite directions for each of these gene fragments, designated:

Trichi-cox3-F1, 5′-TACGTAGAATACCACACATCCAC-3′;
Trichi-cox3-R1, 5′-ATTCTTCCGTTTACTCCTCTCGA-3′;
Trichi-cob-F1, 5′-CAATCCATTAGGTACACACTCAC-3′;
Trichi-cob-R1, 5′-CCTGTAATTCTGTATCCTCCTCA-3′;
Trichi-nad5-F1, 5′-TTGGTAGTTGTGGTGGGTAAGTC-3′;
Trichi-nad5-R1, 5′-AACAACACCACCAACCTGAGCAC-3′;
Trichi-nad1-F1, 5′-CACTAGCACTTACCATTCCAGCC-3′;
Trichi-nad1-R1, 5′-GGTTGTTGCTAGGTTGTATGAGTC-3′.

Using a Perkin Elmer (Norwalk, CT) XL PCR kit and primer pairs cox3-F1-nad1-R1, cox3-R1-cob-F1, and cob-R1-nad5-R1, we amplified regions between nad1 and cox3 (~4.4 kb), cox3 and cob (~3.6 kb), and cob and nad5 (~3.0 kb), respectively. Each PCR reaction yielded a single band when visualized with ethidium bromide staining after electrophoresis in a 1% or 0.7% agarose gel. Amplification of the remaining portion of mtDNA, downstream from nad1 and nad5, was very problematic. The flanking sequences of this region were amplified using Step-Out PCR (Wesley and Wesley 1997), and the entire region was later amplified using a TaKaRa LA Taq kit (Takara Shuzo Co.), but several products of different sizes were produced in all of the latter amplifications.

PCR reaction products were purified by three serial passages through Ultrafree [30,000 nominal molecular weight limit (NMWL)] columns (Millipore, Bedford, MA) and used as templates in dye-terminator cycle-sequencing reactions according to supplier’s (Perkin Elmer) instructions. Both strands of each amplification product were sequenced by primer walking, using an ABI Prism 377 automated DNA sequencer (Perkin Elmer). The sequence has been submitted to GenBank under accession no. AF293969.

Sequence analysis: Sequences were assembled using Sequencing Analysis and Sequence Navigator software (Perkin Elmer) and analyzed with MacVector 6.5 and GCG (Oxford Molecular Group) programs. Protein and ribosomal RNA gene sequences were identified by their similarity to published metazoan mtDNA sequences; tRNA genes were recognized initially by their potential to be folded into tRNA-like secondary structures, after which they were identified specifically by their anticodon sequences. The secondary structures of rRNA genes were derived by analogy to other published rRNA gene structures and drawn using the RnaViz program (De Rijk and De Wachter 1997).

The amino acid sequences were inferred from mitochondrial protein genes of T. spiralis, A. suum, C. elegans (Okimoto et al. 1992), O. volvulus (Keddie et al. 1998), and Limulus polyphemus (Lavrov et al. 2000) and aligned using the ClustalW program (Thompson et al. 1994) in MacVector 6.5 (gap penalty = 5; extension penalty = 1; no gap separation distance; all other options at default settings), and percentage of their similarity was determined. Amino acid and codon usage on the different mtDNA strands were compared using χ² analyses of contingency tables; when a 2 × 2 contingency table was used, the Yates correction for continuity was applied (Yates 1934). To illustrate the quantitative difference, the odds ratios (ORs) were calculated as the ratio of a particular amino acid (group of amino acids, codon, group of codons) to all other amino acids (codons) for one strand, divided by the same ratio for the second strand.

RESULTS AND DISCUSSION

Genome size and organization: The estimated size of T. spiralis mtDNA varies between ca. 21 and 24 kb. This variation is due to an apparent size polymorphism of a region downstream from nad1 and nad2, as indicated by the results of PCR amplification and by Southern hybridization analysis (data not shown). Partial sequencing from the two ends of this region revealed the presence of two repeat units of 1323 bp, the first overlapping nad1 by 3 nucleotides and the second ending 153 nucleotides downstream from nad2 (Figure 1). The repeat unit closest to nad2 contains 50 bp of the inferred trnK; the remaining 12 bp of that gene is located in the adjacent sequence. The partially sequenced region between these repeat units includes smaller repeats and homopolymer runs, which interfere with further sequencing. The results of PCR amplifications using one primer complementary to a sequence inside the large repeat unit and a second primer complementary to a sequence in either nad1 or nad2 suggest the presence of additional large repeat units in this region (data not shown). The whole region downstream from nad1 and nad2 will, hereafter, be referred to as the repeat region. The sequence of the T. spiralis mtDNA, excluding the repeat region, is 13,902 bp in size and encodes 36 of the 37 genes (all but trnK).

In contrast with other nematodes studied, the gene arrangement of T. spiralis mtDNA can be easily related to those of several other metazoans by invoking a moderate number of rearrangements. The greatest similarity is to the primitive arthropod gene arrangement [exemplified by L. polyphemus (Staton et al. 1997)] with which it shares three different blocks of three or more genes and four additional two-gene boundaries (Figure 2). In addition, the location of two tRNA(S) genes is similar in the two genomes: one is situated immediately downstream from cob, and the second is in the tRNA cluster downstream from nad3. However, the specificity of these genes is reversed in the two species [cob-trnS(ucu)/nad3-
trnS(uga) in T. spiralis and cob-trnS(uga)/nad3-trnS(gcu) in L. polyphemus; Figure 2]. Mechanistically, this reversal could have arisen by either multiple rearrangements or anticodon switching. The latter hypothesis is supported by the phylogenetic analysis of mitochondrial trnS sequences, which tends to group T. spiralis trnS(ucu) and trnS(uga) with the trnS(kcu) and trnS(uga), respectively, of other animals (data not shown). A plausible mechanism for anticodon switching, involving tRNA gene duplication with consecutive changes in the anticodon sequence, has been proposed (Cantatore et al. 1987). However, since the anticodons of the two serine tRNAs (UCU and UGA) differ at two positions and since a change at either would create an anticodon for a different amino acid, two simultaneous substitutions would be needed for conversion of one serine tRNA gene to the other. Evidence for a relatively high frequency of such mutational events has been recently provided (Averof et al. 2000).

Aguinaldo et al. (1997) proposed that arthropods, nematodes, and several other “minor” phyla form a monophyletic group, the Ecdysozoa. While the T. spiralis mitochondrial gene arrangement is most similar to that primitive for arthropods, we found no synapomorphies in this or other metazoan gene arrangements that either support or refute the Ecdysozoa hypothesis.

Figure 1.—Gene map of T. spiralis mtDNA. Protein and rRNA genes are abbreviated as in the text; tRNA genes are abbreviated using the one-letter amino acid code; the two leucine and two serine tRNA genes are additionally identified by their anticodon sequences with trnL(uag) marked as L1, trnL(uaa) as L2, trnS(ucu) as S1, and trnS(uga) as S2. Arrows indicate the direction of transcription of each gene. Positive numbers at gene boundaries indicate the number of intergenic nucleotides; negative numbers indicate the number of overlapping nucleotides. Asterisks mark incomplete stop codons (T or TA). The size of the repeat-containing region is not to scale; the unsequenced portion of this region is demarcated by curved lines.

Figure 2.—Comparison of gene arrangements in the mtDNAs of A. suum (Okimoto et al. 1992), L. polyphemus (Staton et al. 1997), and T. spiralis. Only coding sequences are shown. Protein and rRNA genes are indicated by open boxes, tRNA genes by hatched boxes. No pairwise gene arrangement is identical between A. suum and T. spiralis or A. suum and D. yakuba. Blocks of three or more genes shared between T. spiralis and D. yakuba are underlined and interconnected with arrows, and shared boundaries between two genes outside these blocks are marked with asterisks. All abbreviations and other symbols are as in Figure 1.

Nucleotide composition: The A + T content of T. spiralis mtDNA, excluding the repeat region, is 65.2%, lower than those reported for other nematodes. Each of the large repeat units is 77.7% A + T. The two strands of T. spiralis mtDNA have significantly different nucleotide composition. The strand that contains the sense sequence of nine mRNAs, both ribosomal RNAs, and 12 tRNAs (hereafter referred to as the a-strand) is AC rich (i.e., its A/T and C/G ratios are >1) and the other strand (hereafter the b-strand) is GT rich. The difference is especially pronounced in the region containing coding sequences on the b-strand (clockwise, from nad2 to trnP in Figure 1; nucleotides 1–4307 in the GenBank sequence) and is less extreme in the repeat region. The corresponding GC and AT skews [GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T); Perna and Kocher 1995] for these two regions are −0.59, 0.48 and −0.25, 0.03, respectively; for the rest of the genome, GC skew = −0.33 and AT skew = 0.14. If the AT and GC skews are a consequence of asymmetrical mtDNA replication, as has been suggested (Brown and Simpson 1982; Asakawa et al. 1991; Reyes et al. 1998), the difference in nucleotide composition between the two strands of T. spiralis mtDNA implies that b-strand replication precedes a-strand replication in this mtDNA. This would be similar to the situation in arthropod and vertebrate mtDNAs, in which the sense sequence of most genes is located on the AC-rich strand that is also the lagging strand in mtDNA replication. By contrast, the sense sequence of all genes in the secernentean nematode mtDNAs that have been studied is located on the GT-rich strand, which, by the above criterion, is also the leading strand in mtDNA replication.

Protein genes

Size and sequence similarity: Thirteen protein genes are commonly present in metazoan mtDNAs; however, one of them (atp8) is absent from all nematode mtDNAs previously examined. Eleven T. spiralis protein genes (all but atp6 and atp8) were easily identified by sequence comparisons with other species’ mtDNAs. In addition, two open reading frames (ORFs) were tentatively identified as atp6 and atp8. The first ORF, located between rrnL and cox3, has some sequence similarity to other metazoan atp6’s, but is significantly larger [276 sense codons vs. 199 in A. suum and C. elegans (Okimoto et al. 1992), 224 in L. polyphemus (Lavrov et al. 2000), and 226 in human (Anderson et al. 1981)]. The second ORF, located between trnD and nad3, also has some sequence similarity to other metazoan atp8’s, but is smaller than those (41 sense codons vs. 51 in L. polyphemus and 68 in human). In addition to sequence similarities, the hydropathy profiles of the ORF-encoded proteins are similar to those of ATP6 and ATP8 in Limulus and human (Figure 3). The similarities are further enhanced by ending the presumptive atp6 with an abbreviated stop codon 45 codons upstream from the end of the ORF and by starting atp8 at the ATC codon 18 nucleotides (nt) downstream from the beginning of the ORF. This would result in a putative ATP6 of 232 amino acids, with a well-conserved C-terminal motif (EX2VX3QX2FX2LX3YX2EXn), and a putative ATP8 of 35 amino acids, with a partially conserved sequence at the N terminus. Both with and without the first 6 amino acids, the putative ATP8 would be shorter than its counterparts in other species; however, most of the size reduction is in the positively charged, hydrophilic domain, which is known to vary greatly in length in this protein (Gray et al. 1998). Additional evidence that both ORFs encode functional proteins comes from the similarity in their codon nucleotide composition with those of other genes for a-strand-encoded proteins (Figure 4, A, C, and F), which have T-rich second and AC-rich third codon positions. Similar patterns of nucleotide usage prevail when only the first or, to a lesser extent, the last 50 codons are analyzed for the presumptive atp6 (Figure 4, D and E), which argues against the presence of an internal initiation codon and/or truncated stop codon in this ORF. Taking the analyses of hydropathy and codon nucleotide composition together, it is unclear if the presumptive atp6 ends with an incomplete termination codon or has a greatly expanded 3′ end.

Most mitochondrial protein genes in T. spiralis are slightly larger than their counterparts in other nematodes and slightly smaller than those in L. polyphemus (Table 1). The differences are within 5% of the T. spiralis gene length for all genes except atp6 and atp8 (discussed above); cox1 and nad3 (6.4% longer and 7.4% shorter, respectively, in O. volvulus); nad2, nad4, nad4L, and nad5 (>5% longer in L. polyphemus); and nad6 (8.3 and 7.6% shorter in A. suum and C. elegans, respectively). The comparison of amino acid sequences inferred from the protein genes of T. spiralis with those of three other nematode species and L. polyphemus revealed cox1 as the most conserved and atp6, nad2, and nad6 as the least-conserved genes, with amino acid identities of the encoded proteins ranging from 8.3 to 59.9% (Table 1). The size differences and low amino acid similarity of the putative ATP6 and ATP8 proteins made their alignments difficult, and the reported sequence identities for them should be regarded as preliminary estimates.

Translation initiation and termination signals: An
Figure 3.—Comparisons of T. spiralis, L. polyphemus, and human ATP6 and ATP8 hydropathy profiles. Each was calculated by the method of Kyte and Doolittle (1982). Window size = 7. Numbers below profiles designate amino acid positions in each protein. Arrows indicate a possible alternative end of ATP6 and a possible alternative beginning of ATP8 in T. spiralis, both of which would increase the similarity in hydropathy of the T. spiralis proteins to those of L. polyphemus and human.
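A sliding-window hydropathy profile of the kind plotted in Figure 3 can be sketched as follows. This is a minimal illustration, assuming the standard Kyte and Doolittle hydropathy index and a simple unweighted mean over each full window of 7 residues; the function name, the handling of sequence ends, and the toy peptide are assumptions made for the example and are not taken from the paper.

```python
# Kyte-Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Return the mean hydropathy of every full window along the protein.

    Sequences shorter than the window give an empty list.
    """
    scores = [KD_SCALE[residue] for residue in protein.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# Hypothetical toy peptide for illustration, NOT a T. spiralis sequence.
toy_peptide = "MPQLSPMNWLILFLVV"
for position, value in enumerate(hydropathy_profile(toy_peptide), start=1):
    print(position, round(value, 2))
```

Comparing two proteins then amounts to plotting their profiles on a common axis; positive values mark hydrophobic stretches and negative values mark hydrophilic ones.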
Figure 4.—Comparisons of nucleotide composition at first, second, and third codon positions of T. spiralis genes for a-strand-encoded proteins (except ATP6 and ATP8) (A), b-strand-encoded proteins (B), putative ATP6 (C–E), and putative ATP8 (F). For atp6, nucleotide composition is shown for all codons (C), for the first 50 codons (D), and for the last 50 codons (E). Nucleotide percentages: black bars, T; dark gray bars, C; light gray bars, A; white bars, G. 1, 2, and 3 indicate first, second, and third codon positions, respectively; 3* indicates third codon positions in fourfold degenerate codon families.
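The per-position nucleotide percentages summarized in Figure 4 can be tallied with a few lines of code. The sketch below assumes an in-frame coding sequence given as a plain string; the function name and the toy input are hypothetical, and the separate 3* category (third positions of fourfold degenerate families) is not handled here.

```python
from collections import Counter


def codon_position_composition(cds):
    """Percent A, C, G, and T at codon positions 1, 2, and 3.

    `cds` is assumed to be in frame; a trailing incomplete codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    composition = {}
    for offset in range(3):
        counts = Counter(cds[offset:usable:3])
        total = sum(counts[base] for base in "ACGT")
        composition[offset + 1] = {
            base: (100.0 * counts[base] / total if total else 0.0)
            for base in "ACGT"
        }
    return composition


# Toy input built from the three initiation codons named in the text.
example = codon_position_composition("ATGATTATA")
print(example[1]["A"], example[2]["T"])  # first positions all A, second all T
```

Restricting the input to the first or last 50 codons of a gene (`cds[:150]` or `cds[-150:]`) gives the kind of partial comparison described for the presumptive atp6.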
ATG, ATT, or ATA codon occurs at the beginning of all inferred protein genes in T. spiralis mtDNA. Neither TTG nor GTT, both of which were reported as initiation codons of several protein genes in other nematodes (Okimoto et al. 1990), is used as such in T. spiralis. The use of ATG as an initiation codon in five mitochondrial protein genes of T. spiralis is also a departure from O. volvulus, A. suum, and C. elegans, none of which use it in this function (Okimoto et al. 1992; Keddie et al. 1998). Among the five T. spiralis genes initiated by ATG, three (cox1, cox2, and cob) share a sequence motif [5′-ATGATAAAATSA-3′ (S = G or C)] at their 5′ ends, and a fourth (cox3) has a slightly modified version of this motif (5′-ATGAATAAATCC-3′). The fifth gene with an ATG initiation codon (nad4) does not share this pattern.

All genes except cob and nad4 appear to end with complete termination codons (seven with TAA, four with TAG). The truncated stop codons inferred for cob (T) and nad4 (TA) are parts of TAG triplets that also contain the 5′ ends of adjacent tRNA genes and are assumed to be completed by polyadenylation to TAA codons after tRNA excision (Yokobori and Paabo 1997; Reichert et al. 1998). The observation that the next one or two nucleotides after a truncated stop codon may form a complete stop codon is quite frequent for metazoan mtDNAs and suggests that this may be a conserved feature to prevent readthrough of unprocessed transcripts. As presently inferred, atp6 overlaps cox3 by 8 bp and terminates with TAA. It is also possible that atp6 terminates after the T preceding the 5′ end of cox3, or even earlier (see above). If this is the case, however, it would be unclear how the 3′ end of atp6 transcript is formed, since there are no obvious sequence cues that could guide RNA processing at these positions (e.g., potential stem-loop structures; see Bibb et al. 1981; Okimoto et al. 1992).

Codon usage: In contrast to the other nematode species examined, the proteins are encoded by both strands of T. spiralis mtDNA. Nine (ATP6, ATP8, COX1, COX2, COX3, COB, NAD1, NAD3, and NAD6) are encoded by the a-strand, and four (NAD2, NAD4, NAD4L, and NAD5) are encoded by the b-strand. Since the two strands have very different nucleotide compositions, the pattern of codon usage in protein genes with coding sequences on different strands was analyzed separately.

Nonsynonymous codon usage (amino acid composition): The amino acid frequencies differ significantly (χ² = 398, d.f. = 19, P < 0.001) in proteins encoded by the
TABLE 1
Comparison of mitochondrial protein genes in T. spiralis with those of other nematodes and the horseshoe crab L. polyphemus

          No. of encoded amino acids                                        % amino acid identity                                      Predicted initiation and termination codons in T. spiralis [c]
Protein   T.         A.         C.            O.             L.               Trichinella/   Trichinella/      Trichinella/   Trichinella/
gene      spiralis   suum [a]   elegans [a]   volvulus [a]   polyphemus [a]   Ascaris        Caenorhabditis    Onchocerca     Limulus        Initiation   Termination
atp6      276        199        199           199            224              10.5 [b]       9.0 [b]           11.9 [b]       12.3 [b]       ATT(0)       TAA(−8)
atp8      41         NF [d]     NF            NF             51               —              —                 —              18.5 [b]       ATT(0)       TAA(6)
cob       371        365        370           361            377              33.5           36.2              35.9           40.8           ATG(12)      T(AG) [e] (0)
cox1      514        525        525           549            511              53.5           54.3              43.4           59.8           ATG(22)      TAA(5)
cox2      225        232        231           233            228              33.6           34.9              32.3           43.3           ATG(5)       TAA(7)
cox3      257        255        255           260            261              31.6           31.2              25.6           32.7           ATG(−8)      TAG(25)
nad1      299        290        291           291            310              37.3           40.0              35.1           35.5           ATA(19)      TAA(1)
nad2      295        281        282           284            338              18.7           17.1              19.7           18.9           ATT(0)       TAG(AT [f])
nad3      116        111        111           108            114              31.0           31.9              31.6           30.8           ATT(6)       TAA(40)
nad4      411        409        409           411            445              25.8           26.5              28.8           28.0           ATG(1)       TA(G)(0)
nad4L     81         77         77            81             99               21.7           22.9              19.0           21.2           ATT(1)       TAG(1)
nad5      518        528        528           532            571              27.7           27.1              27.9           24.4           ATT(0)       TAG(0)
nad6      156        144        145           150            153              15.9           16.6              17.1           20.4           ATA(28)      TAA(12)

[a] Data for A. suum and C. elegans are from Okimoto et al. (1994), for O. volvulus from Keddie et al. (1998), and for L. polyphemus from Lavrov et al. (2000).
[b] Accuracy of the number is uncertain due to alignment ambiguities.
[c] The numbers in parentheses after initiation and termination codons show the number of noncoding nucleotides upstream and downstream of a gene. The negative numbers indicate that the genes are overlapping.
[d] NF, not found.
[e] Nucleotides in parentheses indicate a potential for complete termination codon overlapping the downstream gene.
[f] AT indicates AT-rich repeat region is adjacent to the gene.
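The percentage size differences quoted in the text can be recomputed from the amino acid counts in Table 1. The sketch below is illustrative only: the dictionary holds a subset of the table, the function name is hypothetical, and the choice of denominator (the length in the comparison species rather than in T. spiralis) is an inference from the fact that it reproduces the quoted 6.4, 7.4, 8.3, and 7.6%, not something the authors state.

```python
# Encoded amino acid counts copied from Table 1 (subset used in the text).
LENGTHS = {
    "cox1": {"T. spiralis": 514, "O. volvulus": 549},
    "nad3": {"T. spiralis": 116, "O. volvulus": 108},
    "nad6": {"T. spiralis": 156, "A. suum": 144, "C. elegans": 145},
}


def percent_difference(gene, other_species):
    """Absolute length difference as a percentage of the other species' length.

    Assumption: the denominator is the comparison species, which is what
    matches the percentages quoted in the text.
    """
    reference = LENGTHS[gene]["T. spiralis"]
    other = LENGTHS[gene][other_species]
    return 100.0 * abs(other - reference) / other


for gene, species in [("cox1", "O. volvulus"), ("nad3", "O. volvulus"),
                      ("nad6", "A. suum"), ("nad6", "C. elegans")]:
    print(gene, species, round(percent_difference(gene, species), 1))
```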
TABLE 2
Amino acid composition of inferred proteins in T. spiralis

                        a-Strand-encoded    b-Strand-encoded
                        proteins [a]        proteins [b]        Both strands
Amino acid              No.      %          No.      %          No.      %        OR [c]   χ² test [d]
Nonpolar
  Alanine (GCN)         81       3.59       40       3.07       121      3.40     1.18     0.55
  Isoleucine (ATY)      200      8.87       50       3.83       250      7.02     2.44     31.36***
  Leucine (Total)       348      15.43      203      15.56      551      15.48    0.99     0.00
    — (CTN)             237      10.51      38       2.91       275      7.72     3.92     65.89***
    — (TTR)             111      4.92       165      12.64      276      7.75     0.36     67.83***
  Methionine (ATR)      193      8.56       96       7.36       289      8.12     1.18     1.45
  Phenylalanine (TTY)   147      6.52       101      7.74       248      6.97     0.83     1.72
  Proline (CCN)         98       4.35       28       2.15       126      3.54     2.07     11.09***
  Tryptophan (TGR)      73       3.24       46       3.52       119      3.34     0.92     0.13
  Valine (GTN)          75       3.33       251      19.23      326      9.16     0.14     249.55***
  Total                 1215     53.88      815      62.45      2030     57.02    0.70     24.44***
Polar
  Asparagine (AAY)      113      5.01       35       2.68       148      4.16     1.91     10.68***
  Cysteine (TGY)        14       0.62       37       2.84       51       1.43     0.21     27.16***
  Glutamine (CAR)       31       1.37       9        0.69       40       1.12     2.01     2.90
  Glycine (GGN)         112      4.97       86       6.59       198      5.56     0.74     3.84*
  Serine (Total)        229      10.16      143      10.96      372      10.45    0.92     0.49
    — (AGN)             103      4.57       68       5.21       171      4.80     0.87     0.61
    — (TCN)             126      5.59       75       5.75       201      5.65     0.97     0.02
  Threonine (ACN)       214      9.49       27       2.07       241      6.77     4.96     70.96***
  Tyrosine (TAY)        92       4.08       55       4.21       147      4.13     0.97     0.01
  Total                 805      35.70      392      30.04      1197     33.62    1.29     11.61***
Acidic
  Aspartate (GAY)       38       1.69       19       1.46       57       1.60     1.16     0.15
  Glutamate (GAR)       50       2.22       22       1.69       72       2.02     1.32     0.93
  Total                 88       3.90       41       3.14       129      3.62     1.25     1.16
Basic
  Arginine (CGN)        35       1.55       11       0.84       46       1.29     1.85     2.73
  Histidine (CAY)       45       2.00       14       1.07       59       1.66     1.88     3.77
  Lysine (AAR)          67       2.97       32       2.45       99       2.78     1.22     0.64
  Total                 147      6.52       57       4.37       204      5.73     1.53     6.69**
Grand total             2255                1305                3560

[a] ATP6, ATP8, COX1, COX2, COX3, COB, NAD1, NAD3, NAD6.
[b] NAD2, NAD4, NAD4L, NAD5.
[c] OR, odds ratio, the proportion of an amino acid (or a group of amino acids) to all other amino acids encoded by the a-strand over the same proportion for the amino acids encoded by the b-strand.
[d] χ² test of the difference in the frequency of an amino acid or a group of amino acids encoded by the two strands. *, **, and *** indicate the probabilities P < 0.05, 0.01, and 0.001, respectively, that this difference would be observed by chance. No asterisk indicates P > 0.05.
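The odds ratio and the Yates-corrected χ² statistic defined in the footnotes to Table 2 can be recomputed from the counts in the table. The following is a minimal sketch using only the standard 2 × 2 formulas; the function names are hypothetical and the code is not the authors' own analysis pipeline. The inputs are the count of one amino acid and the grand total of codons for each strand.

```python
def odds_ratio(a_count, a_total, b_count, b_total):
    """Odds of an amino acid on the a-strand divided by its odds on the b-strand."""
    return (a_count / (a_total - a_count)) / (b_count / (b_total - b_count))


def yates_chi2(a_count, a_total, b_count, b_total):
    """Chi-square statistic for a 2 x 2 table with Yates continuity correction."""
    a, b = a_count, a_total - a_count   # a-strand: this amino acid, all others
    c, d = b_count, b_total - b_count   # b-strand: this amino acid, all others
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Valine (GTN) row of Table 2: 75 of 2255 a-strand codons, 251 of 1305 b-strand codons.
print(round(odds_ratio(75, 2255, 251, 1305), 2))   # 0.14
print(round(yates_chi2(75, 2255, 251, 1305), 2))   # 249.55
```

Both printed values agree with the valine row of the table, and the threonine row (214 and 27) is reproduced the same way.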
a- and b-strands of T. spiralis mtDNA: all amino acids with A- and/or C-rich (AC-rich) codons are more frequent in a-strand-encoded proteins; those with GT-rich codons are more frequent in b-strand-encoded proteins (Table 2). When the amino acids represented by GT- or AC-rich codon families were pooled in two groups and their frequencies in proteins encoded by the a- and b-strands were compared, we found them to be significantly different (P ≪ 0.001), with the ratios of amino acids specified by GT-rich codon families to those specified by AC-rich equal to 0.74 and 3.6 for a- and b-strand encoded proteins, respectively. Individual differences were statistically significant for seven amino acids; six of those are specified by either AC-rich or GT-rich codon families and one (isoleucine) is specified by ATY codon family (Table 2). Thus, there exists a strong correlation between the biased nucleotide composition of the a- and b-strands and the amino acid composition of the proteins encoded by them. It is likely that asymmetrical mutational pressure, rather than specific amino acid requirements of the proteins, determines the observed codon-usage differences between the strands, since both the protein and ribosomal genes on each strand demonstrate similar nucleotide biases and since different proteins encoded by the same strand have similar biases in amino acid compositions (data not shown). We have made a similar observation for the mt-proteins of L. polyphemus (Lavrov et al. 2000).

Synonymous codon usage: Each amino acid in nematode mtDNAs is specified by either a two- or four-codon family, or by a combination of two such families. In all cases, when an amino acid is specified by a two-codon family, the two members of such a family [ending with either a purine (A or G) or a pyrimidine (T or C)] occur with significantly different frequencies in protein genes transcribed from different strands, in accordance with the nucleotide compositional biases of the two strands (Table 3). Likewise, the usage of codons within four-codon families is also significantly different in protein genes transcribed from the different strands. However, when the frequencies of individual codons from each four-codon family were compared in these genes, we found several cases in which they were not significantly different. Those cases, underlined in Table 3, may be due either to other constraints on codon usage, such as selection or dinucleotide bias (Karlin and Burge 1995), or to an artifact of insufficient sampling. The two amino acids that are each specified by two different codon families (serine and leucine) occur with similar frequencies on the two strands. However, the representation of the two leucine families (CTN and TTR) is highly uneven in protein genes encoded by the different strands (Table 2). In contrast, the frequencies of the two serine codon families (AGN and TCN) are not statistically different between these genes. This observation also accords with the strand biases: the TTR family of leucine is T rich, whereas both serine families lack a GT/AC bias.

The strong influence of mutational pressure on both synonymous and nonsynonymous codon usage can affect phylogenetic reconstruction, as suggested by Foster and Hickey (1999). It can also, in principle, explain the observation that highly rearranged mt-genomes often produce long branches in sequence comparisons (J. Boore, personal communication). If rearrangements result in strand exchange (inversions) or in a change in the polarity of mtDNA replication, the new mutational pattern might “overwrite” the nucleotide and amino acid compositions of the genes transferred, thus creating long branches on phylogenetic trees inferred using those sequences.

rRNA genes

The T. spiralis mt-small and -large subunit ribosomal

metazoan mtDNAs. Both genes are encoded by the a-strand and are separated from each other by trnV (Figure 1), an arrangement typical for many metazoan mtDNAs, but unlike that in the other nematode species examined. The 5′ and 3′ ends of rrnS are tentatively defined to be immediately adjacent to the 3′ end of trnS(ucu) and the 5′ end of trnV; those of rrnL are assumed to be immediately adjacent to the 3′ end of trnV and the 5′ end of atp6. Secondary structure models for both srRNA and lrRNA (Figures 5 and 6) were derived based on the structures of the corresponding rRNAs of Escherichia coli (Noller and Woese 1981; Noller et al. 1981), two other nematode species (Okimoto et al. 1994), and on generalized patterns of phylogenetic conservation observed in ribosomal genes across many different taxa (Gutell et al. 1993; Gutell 1994).

rrnS: The size of T. spiralis mt-rrnS, as defined above, is 688 bp, similar to those of other nematodes (697 bp in C. elegans; 700 bp in A. suum; 684 bp in O. volvulus), but shorter than those of most other metazoans [e.g., 789 bp in Drosophila yakuba (Clary and Wolstenholme 1985); 955 bp in mouse (Bibb et al. 1981)]. In conformity with the general model (Noller and Woese 1981), the structure we propose for T. spiralis mt-srRNA (Figure 5) can be partitioned into four domains bounded by the three sets of long-range interactions that form helices 3, 22, and 32. The structures at the domain boundaries are well conserved in T. spiralis, as are most other elements of the core structure (Raue et al. 1988; Gutell 1994), with the notable exception of helices 31 and 48, which, if real, are much shorter than those in other srRNAs. [We note, however, that the reductions in the lengths of both helices are unaccompanied by a decline in the total numbers of nucleotides in the corresponding stem-loop structures, which are about the same or even greater than in the related secondary elements in the C. elegans/A. suum model (Okimoto et al. 1994).] In addition, alternative folding is possible for several structures (e.g., helices 3, 22, 23, and 39), and some of these determine the way other structures are formed. Thus, two alternatives are possible for helix 3, which, in turn, lead to alternative foldings for helices 1, 4, and 16 (Figure 5). Since alternative foldings of the 5′ end domain have also been proposed for the other nematode srRNAs (e.g., compare Okimoto et al. 1994 and Gutell 1994), it is clear that further studies of srRNAs from closely related species are needed to test these structural alternatives.

rrnL: The estimated size of T. spiralis mt-rrnL, 947 bp, is similar to those of other nematodes (953 bp in C. elegans; 960 bp in A. suum; 987 bp in O. volvulus), but shorter than those of most other metazoans (e.g., 1325 bp in D. yakuba; 1581 bp in mouse). The two 3′-most nucleotides of helix H5N (Figure 6) plus the six nucleotides following them form an octamer (5′-GUACAAAA-
RNA genes (rrnS and rrnL, respectively) were identified 39) that is complementary to the sequence 27 nu-cleotides downstream from the inferred 59 end of rrnLby their sequence similarities to rrnS and rrnL in other
TABLE 3
Percentage and number of codons in genes for proteins encoded by different strands of Trichinella mtDNA
              Genes for α-strand-encoded proteinsᵃ                Genes for β-strand-encoded proteinsᵇ
Amino acid    NNT          NNC          NNA          NNG          NNT          NNC          NNA          NNG          ORᶜ      χ² testᵈ
Nonpolar
  Ala (GCN)   24.7 (20)    24.7 (20)    49.4 (40)    1.2 (1)      70.0 (28)    2.5 (1)      7.5 (3)      20.0 (8)     25.71    47.3***
  Ile (ATY)   43.5 (87)    56.5 (113)   —            —            98.0 (49)    2.0 (1)      —            —            63.64    45.7***
  Leu (CTN)   12.7 (30)    20.3 (48)ᵉ   63.7 (151)   3.4 (8)      57.9 (22)    10.5 (4)     7.9 (3)      23.7 (9)     23.19    77.2***
  Leu (TTR)   —            —            96.4 (107)   3.6 (4)      —            —            23.6 (39)    76.4 (126)   86.42    138.1***
  Met (ATR)   —            —            93.8 (181)   6.2 (12)     —            —            12.5 (12)    87.5 (84)    105.58   187.3***
  Phe (TTY)   38.1 (56)    61.9 (91)    —            —            99.0 (100)   1.0 (1)      —            —            162.50   92.6***
  Pro (CCN)   15.3 (15)    17.3 (17)    67.3 (66)    0.0 (0)      60.7 (17)    3.6 (1)      14.3 (4)     21.4 (6)     25.45    52.6***
  Trp (TGR)   —            —            95.9 (70)    4.1 (3)      —            —            21.7 (10)    78.3 (36)    84.00    67.1***
  Val (GTN)   16.0 (12)    14.7 (11)    65.3 (49)    4.0 (3)      52.2 (131)   2.4 (6)      7.6 (19)     37.8 (95)    36.16    148.3***
Polar
  Asn (AAY)   30.1 (34)    69.9 (79)    —            —            97.1 (34)    2.9 (1)      —            —            79.00    45.7***
  Cys (TGY)   50.0 (7)     50.0 (7)     —            —            100.0 (37)   0.0 (0)      —            —            ∞        17.4***
  Gln (CAR)   —            —            96.8 (30)    3.2 (1)      —            —            0.0 (0)      100.0 (9)    ∞        29.9***
  Gly (GGN)   13.4 (15)    17.9 (20)    65.2 (73)    3.6 (4)      57.0 (49)    0.0 (0)      10.5 (9)     32.6 (28)    41.88    104.4***
  Ser (AGN)   14.6 (15)    18.4 (19)    66.0 (68)    1.0 (1)      45.6 (31)    1.5 (1)      4.4 (3)      48.5 (33)    87.00    108.8***
  Ser (TCN)   19.0 (24)    33.3 (42)    46.0 (58)    1.6 (2)      64.0 (48)    1.3 (1)      9.3 (7)      25.3 (19)    32.21    94.0***
  Thr (ACN)   9.3 (20)     26.2 (56)    63.6 (136)   0.9 (2)      59.3 (16)    3.7 (1)      3.7 (1)      33.3 (9)     109.09   115.3***
  Tyr (TAY)   34.8 (32)    65.2 (60)    —            —            100.0 (55)   0.0 (0)      —            —            ∞        57.9***
Acidic
  Asp (GAY)   39.5 (15)    60.5 (23)    —            —            100.0 (19)   0.0 (0)      0.0 (0)      0.0 (0)      ∞        16.8***
  Glu (GAR)   —            —            94.0 (47)    6.0 (3)      —            —            13.6 (3)     86.4 (19)    99.22    42.8***
Basic
  Arg (CGN)   11.4 (4)     11.4 (4)     77.1 (27)    0.0 (0)      27.3 (3)     9.1 (1)      9.1 (1)      54.5 (6)     34.88    26.9***
  His (CAY)   22.2 (10)    77.8 (35)    —            —            92.9 (13)    7.1 (1)      —            —            45.50    19.5***
  Lys (AAR)   —            —            94.0 (63)    6.0 (4)      —            —            6.3 (2)      93.8 (30)    236.25   70.2***
ᵃ atp6, atp8, cox1, cox2, cox3, cob, nad1, nad3, nad6.
ᵇ nad2, nad4, nad4L, nad5.
ᶜ OR, odds ratio, the proportion of codons ending with G/T to those ending with A/C on β-strand over the same proportion on α-strand.
ᵈ χ² test of the difference in the frequencies of codons in codon families on two strands. For two-codon families d.f. = 1, for four-codon families d.f. = 3. *** indicates that this difference would be observed by chance with P < 0.001.
ᵉ The frequency of all codons except those underlined differs significantly in genes for proteins encoded by different strands of mtDNA.
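The OR and χ² columns of Table 3 can be recomputed from the codon counts. The sketch below is illustrative only: the function names are hypothetical, and the use of the Yates continuity correction for two-codon families is an assumption inferred from the Yates (1934) citation and from agreement with the tabulated values, not a method quoted from the text. Four-codon families (d.f. = 3) are not covered by the χ² helper.

```python
from math import inf


def odds_ratio(alpha_gt, alpha_ac, beta_gt, beta_ac):
    """Odds ratio as defined in footnote c of Table 3.

    (G/T-ending : A/C-ending codons on the beta strand) divided by the
    same proportion on the alpha strand. Returns infinity when the
    denominator is zero, as in the rows reported with an infinite OR.
    """
    numerator = beta_gt * alpha_ac
    denominator = beta_ac * alpha_gt
    return inf if denominator == 0 else numerator / denominator


def yates_chi2(a, b, c, d):
    """Chi-square for a 2x2 table [[a, b], [c, d]] with Yates' correction.

    Assumption: this is the test applied to two-codon families (d.f. = 1).
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column total is zero")
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / margins


# Ile (ATY): alpha strand ATT=87, ATC=113; beta strand ATT=49, ATC=1.
ile_or = odds_ratio(87, 113, 49, 1)
ile_chi2 = yates_chi2(87, 113, 49, 1)
print(round(ile_or, 2), round(ile_chi2, 1))  # 63.64 45.7

# Ala (GCN): pool NNT+NNG (G/T-ending) and NNC+NNA (A/C-ending) per strand.
ala_or = odds_ratio(20 + 1, 20 + 40, 28 + 8, 1 + 3)
print(round(ala_or, 2))  # 25.71
```

Both printed values match the Ile and Ala rows of the table, which supports the reading of footnote c used here.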
Figure 5.—Secondary structure model for T. spiralis mt-small subunit rRNA. The sequence is numbered every 25 nt from the 5′ end. Helices that appear to be conserved relative to the E. coli 16S rRNA model (Noller and Woese 1981; Gutell 1994) are numbered in boldface; numbering is according to Van de Peer et al. (1994). An alternative secondary structure (Alt) is shown for the boxed 5′ end region.
(5′-UUUUGUAU-3′). Although the potential for pairing of the two ends of lrRNA is known for Eubacteria and most Archaea, it has not been observed previously in either cytoplasmic or mitochondrial lrRNAs of eukaryotes (De Rijk et al. 1999), with the exception of Metridium senile mt-lrRNA (Beagley et al. 1998).
Structurally, T. spiralis mt-lrRNA is typical of mt-lrRNAs from other triploblastic metazoans: the 5′ half is drastically reduced in size, with a concomitant loss of structures, whereas the 3′ half is conserved and structurally similar to even E. coli’s lrRNA (Raue et al. 1988; Gutell et al. 1993). The structural loss in the 5′ half of the molecule is especially extreme in T. spiralis, as in other nematodes (Okimoto et al. 1994; Keddie et al. 1998); most of the distinctive elements in domains A–C and F are gone, and only those bounded by helices D6 and D17 are identifiable in domain D.
By contrast, structures in domains E and G are relatively well conserved in T. spiralis and other triploblasts, with the exception of helices E19 and E20, which are missing, and of helices E23, E25, and several in the terminal region bounded by helix G2, which are reduced in size relative to those in E. coli lrRNA (Figure 6). The reduction in the region bounded by helix G2, which is believed to be associated with the ribosomal E site, is more extreme in nematodes than in other metazoans (Okimoto et al. 1994) and is especially pronounced in T. spiralis, which has lost all helices between G2 and G6 (Figure 6).
Figure 6.—Secondary structure model for T. spiralis mt-large subunit rRNA. The sequence is numbered every 25 nt from the 5′ end. Helices that appear to be conserved relative to the E. coli 23S rRNA model (Noller et al. 1981; Gutell et al. 1993) are numbered in boldface; numbering is according to De Rijk et al. (1999). Two helices potentially present in T. spiralis (H5N and D10N) are not present in E. coli. The boxed nucleotides at the 5′ and 3′ ends of the rRNA can form a helix similar to one observed in Eubacteria and most Archaea. Nucleotides in domains G and E that are identical to those in similar locations of the E. coli 23S rRNA model are shown in boldface.
tRNA genes
T. spiralis has the 22 mt-tRNA genes typical of metazoans; the genes vary in size from 53 (trnH) to 65 (trnW) bp. Twelve can be folded into structures characteristic for other nematodes (Wolstenholme et al. 1987); in those, the TΨC arm and variable loop are replaced by a loop (the TV loop). Similarly, in both serine tRNAs, the DHU arms are replaced by unpaired loops (Figure 7). The remaining 8 can be folded into conventional cloverleaf structures.
Each tRNA has been inferred to have an aminoacyl acceptor stem of 7 bp, an anticodon stem of 5 bp, and an anticodon loop of 7 nt. Fifteen mismatches were found among the aminoacyl acceptor stems, and three were found at the base of anticodon stems. The most common mismatch position was between nucleotides 7 and 66, at the base of the aminoacyl acceptor stem. This position was mismatched in 8 of the 12 tRNAs with TV loops (R, A, N, E, Q, G, F, P, and V), but in no others. Interestingly, mismatches at this position are also common in the tRNAs of the other nematodes (Wolstenholme et al. 1987, 1994; Keddie et al. 1998). Additional mismatches in the aminoacyl acceptor stems of T. spiralis mt-tRNAs are between nucleotides 1 and 72 in tRNAs R and V and between nucleotides 3 and 70 in tRNA Y. Mismatches at the base of the anticodon stem occur in tRNAs E, L(uag), and V.
In tRNAs with a DHU arm, the stem is usually 4 bp long [3 bp in tRNAs C, L(uag), L(uaa), and Y] and the loop is between 3 nt [tRNA(H)] and 12 nt [tRNA(E)]. The DHU-replacement loops are 5 and 4 nt in tRNA(S)(ucu) and tRNA(S)(uga), respectively. The TΨC arm, when present, has a stem of 2, 3, or 5 bp and a loop of 3 to 8 nt. The variable loop in these cases is either 4 or 5 nt. When the variable loop and TΨC arm are absent, they are replaced by a TV loop of 6 to 8 nt.
The anticodons in T. spiralis mt-tRNAs are generally the same as those in other nematode mt-tRNAs. However, that for T. spiralis tRNA(R) is 5′-TCG-3′, as in most other metazoans, instead of the very unusual 5′-ACG-3′ present in the secernentean nematodes (Watanabe et al. 1997).
Conserved nucleotides and possible tertiary interactions: The tRNAs encoded by prokaryotic, nonanimal organellar and nuclear genomes (referred to as standard tRNAs) have several invariable and semi-invariable nucleotide positions (Dirheimer et al. 1995). Nucleotides at some of these positions are also conserved in previously studied nematode mt-tRNAs (Wolstenholme et al. 1994) and T. spiralis mt-tRNAs (Figure 7), but not in mt-tRNAs from most other metazoans (Wolstenholme 1992). Since nucleotides at several conserved positions were shown to be involved in the tertiary interactions in standard tRNAs and nematode mt-tRNAs (Kim et al. 1974; Robertus et al. 1974; Watanabe et al. 1994; Ohtsuki et al. 1998), we evaluated the potential for similar tertiary interactions in T. spiralis mt-tRNAs. We found such potential for the previously described hydrogen bonds between nucleotides 8•4•21, L3(46)•22-13, L2(45)•10-25, 15•L4(48), 9•23-12, and 26•44(L1) (Figure 7). We also found three deviations from the previously described patterns of nucleotide conservation.
First, while a strong correlation exists in the occurrence of nucleotides at positions 13-22 of the DHU stem and L3(46) of the TV-replacement (variable size) loop in the inferred T. spiralis mt-tRNAs, its pattern differs from the usual R46(L2)•R22-Y13. We found that in all but one case the same nucleotide, and not necessarily a purine, is present at position 22 of the Watson-Crick 13-22 pair and at L3(46) [G46(L2)•G22-U13 in six tRNAs, U46(L2)•U22-A13 in three tRNAs, and A46(L2)•A22-T13 in six tRNAs]. The only exception is tRNA(K), which has an A22-T13 pair in the DHU stem but G at position 46. There are mismatches between positions 13 and 22 in the DHU stems of four additional tRNAs. In all these cases there are different nucleotides at position 22 and L3(46).
Second, the nature of bond III (usually RL2(45)•R10-Y25 in standard tRNAs and in the mt-tRNAs of other nematodes) appears to vary among T. spiralis tRNAs with different secondary structures. Although the R10-Y25 pair is present in all tRNAs except tRNA(I), six of eight tRNAs with cloverleaf structures have a pyrimidine at position 45, whereas all those with TV loops have a purine at the corresponding position (L2). A purine is also present at L2 in all other nematode mt-tRNAs with TV loops.
Third, there are differences between T. spiralis and other nematode mt-tRNAs in the presence of specific nucleotides at positions 9, 12, and 23, which are involved in the formation of the hydrogen bond V. In secernentean nematodes nucleotide 9 is always A and the 12-23 pair is always W12-W23 (Wolstenholme et al. 1994; Keddie et al. 1998). However, nucleotide 9 is G in four T. spiralis mt-tRNAs [E, G, L(uag), L(uaa)], and the 12-23 pair is S12-S23 in six [A, R, E, G, H, M]. Combinations of nucleotides at these positions other than A9•W23-W12 are also very common in standard tRNAs.
tRNA-like structure: In addition to the set of 22 tRNA genes commonly present in metazoan mtDNAs, a sequence between trnG and trnD, designated trnM2 in Figure 1, has the potential to form a tRNA-like structure with an anticodon (5′-UAU-3′) that would recognize methionine codons. Two genes for tRNA(M), one with
Figure 7.—Consensus secondary structures for three groups of tRNAs in T. spiralis. Numbering of nucleotides is based on the convention used for yeast tRNA(F) (Robertus et al. 1974); numbering in TV loops follows Wolstenholme et al. (1994). Open circles with numbers, nucleotides present in all tRNAs in each group; solid gray circles, nucleotides present in some, but not all, tRNAs; solid black circles with letters, nucleotides conserved in anticodon loops, TΨC, DHU, and variable arms, or their replacement loops, of all or all but one of the tRNAs in each of these groups. K = G or T; R = A or G; Y = C or T; W = A or T. The pattern of nucleotide conservation is not shown for tRNAs with a DHU-replacement loop, due to the limited sample size. Broken lines indicate possible tertiary interactions.
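The degenerate-base symbols defined in the Figure 7 caption, and used in the text for patterns such as R10-Y25 or W12-W23, can be made concrete with a small lookup. This is a minimal sketch with hypothetical names (`IUPAC`, `matches`, `fits`). The symbol S (G or C) is not defined in the caption; it is taken here from the standard IUPAC nucleotide code because the text uses it for the S12-S23 pair.

```python
# Degenerate-base symbols. K, R, Y and W follow the Figure 7 caption;
# S (G or C) is the standard IUPAC code and is an addition made here.
IUPAC = {
    "A": {"A"},
    "C": {"C"},
    "G": {"G"},
    "T": {"T"},
    "K": {"G", "T"},
    "R": {"A", "G"},
    "Y": {"C", "T"},
    "W": {"A", "T"},
    "S": {"C", "G"},
}


def _normalize(letter):
    """Upper-case a letter and treat RNA 'U' as DNA 'T'."""
    return letter.upper().replace("U", "T")


def matches(symbol, base):
    """Return True if `base` is allowed by the degenerate `symbol`."""
    return _normalize(base) in IUPAC[_normalize(symbol)]


def fits(pattern, bases):
    """Check a run of bases against a pattern of the same length."""
    if len(pattern) != len(bases):
        raise ValueError("pattern and bases must have equal length")
    return all(matches(s, b) for s, b in zip(pattern, bases))


print(fits("RY", "GC"))  # True: a purine at 10 with a pyrimidine at 25
print(fits("WW", "GC"))  # False: a G-C pair is not W12-W23
print(fits("SS", "GC"))  # True: a G-C pair is S12-S23
```

Treating U as T keeps the lookup usable for both gene (DNA) and tRNA (RNA) notation, both of which appear in the text.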
anticodon 5′-CAU-3′ and the second with 5′-UAU-3′, were reported in Mytilus edulis mtDNA (Hoffmann et al. 1992). Both trnM(uau) and its transcription product have also been found in the related species M. californianus (Beagley et al. 1999). The location of T. spiralis trnM2 within a set of six tRNA genes suggests that it is transcribed and most likely processed. However, it lacks some well-conserved nucleotides (T in position 33, R in position 37), has an unusual secondary structure with a very large (21 nt) TΨC loop, and overlaps the downstream trnG by 2 bp, all of which suggest that it may not be functional. Interestingly, a sequence identical to part of trnM2 is found in the noncoding region bounded by trnT and trnP.
Noncoding regions
The region between nad1 and nad2 contains at least two copies of a large (1232 bp) repeat, which, though mostly noncoding, also includes part of trnK. The repeat units proximal to each end of the region were completely sequenced and found to differ at three positions. Two potential stem-loop structures were found in each repeat unit. Both have 14-bp stems; the one proximal to nad1 has a 7-nt loop, while that proximal to nad2 has a 15-nt loop; the latter also has a poly(T) tract, a feature common to the class of stem-loop structures implicated as possible origins of mtDNA replication in metazoans (Wolstenholme 1992). The structures do not appear to be artifactual: the probability of their occurring by chance in a sequence of equal length and nucleotide composition to that of the repeat unit is < 0.01, as estimated by computer simulation (Lavrov et al. 2000).
Only a small amount of sequence between the two flanking repeat units was determined. However, as stated above, it is likely that additional repeat units are present between the two sequenced. It is also likely that the size variation in this region is caused by the differences in the number of repeat units among different mtDNA molecules in the same and/or different individuals, as previously observed (e.g., Densmore et al. 1985; Moritz and Brown 1987; La Roche et al. 1990). Another relatively large noncoding region (168 bp), between trnT and trnP, contains a 56-bp sequence identical to part of trnM2, a part of which has the potential to form a structure with an 11-bp stem and a 3-nt loop.
Aside from the two noncoding regions just described, 117 additional noncoding base pairs are present in 15 small intergenic regions. These range in size from 1 to 40 bp, have no shared sequence motifs or potential to form structures, and can be characterized as intergenic “spacers” (Figure 1).
CONCLUSIONS
The mtDNA of T. spiralis establishes a link between typical metazoan mtDNAs and the nematode mtDNAs described previously. In several respects, T. spiralis mtDNA is more similar to those of non-nematode metazoans: it has the 37 genes typical of most metazoan mtDNAs; its gene arrangement has clear affinities with those of coelomate metazoans; its protein genes initiate with standard ATN codons; and the encoded tRNA(R) has a typical metazoan 5′-UCG-3′ anticodon. Thus, the unusual gene arrangements, initiation codons, 5′-ACG-3′ anticodon in tRNA(R), and the lack of atp8 observed in the mtDNAs of secernentean nematodes appear to be derived features that arose within that lineage after the divergence of secernentean nematodes from other metazoan groups. In other respects, T. spiralis mtDNA is more similar to those of other nematodes or intermediate between them and those of non-nematode metazoans: it encodes rRNAs that are similar to their counterparts in other nematodes both in size and structure; most of its protein genes are intermediate in size between those of other nematodes and coelomate metazoans; and some of its tRNAs have conventional cloverleaf structures, whereas others have the “bizarre” structures that are characteristic of secernentean nematode mt-tRNAs.
We thank D. Despommier and R. Polvere for T. spiralis DNA, J. Boore for help with data analysis, and K. Helfenbein and three anonymous reviewers for helpful comments and suggestions on an earlier version of this manuscript. This work was supported by National Science Foundation (NSF) dissertation improvement grant DEB 9972712 (to W.M.B. and D.V.L.) and NSF grant DEB 9807100 (to W.M.B.).
LITERATURE CITED
Adamson, M. L., 1987 Phylogenetic analysis of the higher classification of the Nematoda. Can. J. Zool. 65: 1478–1482.
Aguinaldo, A. M. A., J. M. Turbeville, L. S. Linford, M. C. Rivera, J. R. Garey et al., 1997 Evidence for a clade of nematodes, arthropods and other moulting animals. Nature 387: 489–493.
Anderson, S., A. T. Bankier, B. G. Barrell, M. H. L. De Bruijn, A. R. Coulson et al., 1981 Sequence and organization of the human mitochondrial genome. Nature 290: 457–465.
Armstrong, M. R., V. C. Block and M. S. Phillips, 2000 A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida. Genetics 154: 181–192.
Asakawa, S., Y. Kumazawa, T. Araki, H. Himeno, K. Miura et al., 1991 Strand-specific nucleotide composition bias in echinoderm and vertebrate mitochondrial genomes. J. Mol. Evol. 32: 511–520.
Averof, M., A. Rokas, K. H. Wolfe and P. M. Sharp, 2000 Evidence for a high frequency of simultaneous double-nucleotide substitutions. Science 287: 1283–1286.
Azevedo, J. L., and B. C. Hyman, 1993 Molecular characterization of lengthy mitochondrial DNA duplications from the parasitic nematode Romanomermis culicivorax. Genetics 133: 933–942.
Beagley, C. T., R. Okimoto and D. R. Wolstenholme, 1998 The mitochondrial genome of the sea anemone Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near-standard genetic code. Genetics 148: 1091–1108.
Beagley, C. T., R. Okimoto and D. R. Wolstenholme, 1999 Mytilus mitochondrial DNA contains a functional gene for a tRNASer(UCN) with a dihydrouridine arm-replacement loop and a pseudo-tRNASer(UCN) gene. Genetics 152: 641–652.
Bibb, J. M., R. A. Van Etten, C. T. Wright, M. W. Walberg and D. A. Clayton, 1981 Sequence and gene organization of mouse mitochondrial DNA. Cell 26: 167–180.
Blaxter, M. L., P. De Ley, J. R. Garey, L. X. Liu, P. Scheldeman et al., 1998 A molecular evolutionary framework for the phylum Nematoda. Nature 392: 71–75.
Boore, J. L., 1999 Animal mitochondrial genomes. Nucleic Acids Res. 27: 1767–1780.
Brown, F. F., and M. V. Simpson, 1982 Novel features of animal mtDNA evolution as shown by sequences of two rat cytochrome oxidase subunit II genes. Proc. Natl. Acad. Sci. USA 79: 3246–3250.
Brusca, R. C., and G. J. Brusca, 1990 Invertebrates. Sinauer Associates, Sunderland, MA.
Cantatore, P., M. N. Gadaleta, M. Roberti, C. Saccone and A. C. Wilson, 1987 Duplication and remoulding of tRNA genes during the evolutionary rearrangement of mitochondrial genomes. Nature 329: 853–855.
Clary, D. O., and D. R. Wolstenholme, 1985 The ribosomal RNA genes of Drosophila mitochondrial DNA. Nucleic Acids Res. 13: 4029–4045.
Densmore, L. D., J. W. Wright and W. M. Brown, 1985 Length variation and heteroplasmy are frequent in mitochondrial DNA from parthenogenetic and bisexual lizards (genus Cnemidophorus). Genetics 110: 687–707.
De Rijk, P., and R. De Wachter, 1997 RnaViz, a program for the visualization of RNA secondary structure. Nucleic Acids Res. 25: 4679–4684.
De Rijk, P., E. Robbrecht, S. de Hoog, A. Caers, Y. Van de Peer et al., 1999 Database on the structure of large subunit ribosomal RNA. Nucleic Acids Res. 27: 174–178.
Dirheimer, G., G. Keith, P. Dumas and E. Westhof, 1995 Primary, secondary and tertiary structures of tRNAs, pp. 93–126 in tRNA: Structure, Biosynthesis, and Function, edited by D. Soll and U. RajBhandary. American Society for Microbiology, Washington, DC.
Foster, P. G., and D. A. Hickey, 1999 Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. J. Mol. Evol. 48: 284–290.
Gray, M. W., B. F. Lang, R. Cedergren, G. B. Golding, C. Lemieux et al., 1998 Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Res. 26: 865–878.
Gutell, R. R., 1994 Collection of small subunit (16S- and 16S-like) ribosomal RNA structures: 1994. Nucleic Acids Res. 22: 3502–3507.
Gutell, R. R., M. W. Gray and M. N. Schnare, 1993 A compilation of large subunit (23S- and 23S-like) ribosomal RNA structures. Nucleic Acids Res. 21: 3055–3074.
Hoffmann, J. R., J. L. Boore and W. M. Brown, 1992 A novel mitochondrial genome organization for the blue mussel, Mytilus edulis. Genetics 131: 397–412.
Karlin, S., and C. Burge, 1995 Dinucleotide relative abundance extremes: a genomic signature. Trends Genet. 11: 283–290.
Keddie, E. M., T. Higazi and T. R. Unnasch, 1998 The mitochondrial genome of Onchocerca volvulus: sequence, structure and phylogenetic analysis. Mol. Biochem. Parasitol. 95: 111–127.
Kim, S. H., F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman et al., 1974 Three-dimensional tertiary structure of yeast phenylalanine transfer RNA. Science 185: 435–440.
Kyte, J., and R. F. Doolittle, 1982 A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105–132.
Lang, B. F., M. W. Gray and G. Burger, 1999 Mitochondrial genome evolution and the origin of eukaryotes. Annu. Rev. Genet. 33: 351–397.
La Roche, J., M. Snyder, D. I. Cook, K. Fuller and E. Zouros, 1990 Molecular characterization of a repeat element causing large-scale size variation in the mitochondrial DNA of the sea scallop Placopecten magellanicus. Mol. Biol. Evol. 7: 45–64.
Lavrov, D. V., J. L. Boore and W. M. Brown, 2000 The complete mitochondrial DNA sequence of the horseshoe crab Limulus polyphemus. Mol. Biol. Evol. 17: 813–824.
Malakhov, V. V., 1994 Nematodes: Structure, Development, Classification, and Phylogeny. Smithsonian Institution Press, Washington, DC.
Moritz, C., and W. M. Brown, 1987 Tandem duplications in animal mitochondrial DNAs: variation in incidence and gene content among lizards. Proc. Natl. Acad. Sci. USA 84: 7183–7187.
Nagano, I., Z. Wu, A. Matsuo, E. Pozio and Y. Takahashi, 1999 Identification of Trichinella isolates by polymerase chain reaction—restriction fragment length polymorphism of the mitochondrial cytochrome c-oxidase subunit I gene. Int. J. Parasitol. 29: 1113–1120.
Noller, H. F., and C. R. Woese, 1981 Secondary structure of 16S ribosomal RNA. Science 212: 403–411.
Noller, H. F., J. Kop, V. Wheaton, J. Brosius, R. R. Gutell et al., 1981 Secondary structure model for 23S ribosomal RNA. Nucleic Acids Res. 9: 6167–6189.
Ohtsuki, T., G. Kawai and K. Watanabe, 1998 Stable isotope-edited NMR analysis of Ascaris suum mitochondrial tRNAMet having a TV-replacement loop. J. Biochem. 124: 28–34.
Okimoto, R., and D. R. Wolstenholme, 1990 A set of tRNAs that lack either the TΨC arm or the dihydrouridine arm: towards a minimal tRNA adaptor. EMBO J. 9: 3405–3411.
Okimoto, R., J. L. Macfarlane and D. R. Wolstenholme, 1990 Evidence for the frequent use of TTG as the translation initiation codon of mitochondrial protein genes in the nematodes, Ascaris suum and Caenorhabditis elegans. Nucleic Acids Res. 18: 6113–6118.
Okimoto, R., H. M. Chamberlin, J. L. Macfarlane and D. R. Wolstenholme, 1991 Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification. Nucleic Acids Res. 19: 1619–1626.
Okimoto, R., J. L. Macfarlane, D. O. Clary and D. R. Wolstenholme, 1992 The mitochondrial genomes of two nematodes, Caenorhabditis elegans and Ascaris suum. Genetics 130: 471–498.
Okimoto, R., J. L. Macfarlane and D. R. Wolstenholme, 1994 The mitochondrial ribosomal RNA genes of the nematodes Caenorhabditis elegans and Ascaris suum: consensus secondary-structure models and conserved nucleotide sets for phylogenetic analysis. J. Mol. Evol. 39: 598–613.
Perna, N. T., and T. D. Kocher, 1995 Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J. Mol. Evol. 41: 353–358.
Raue, H. A., J. Klootwijk and W. Musters, 1988 Evolutionary conservation of structure and function of high molecular weight ribosomal RNA. Prog. Biophys. Mol. Biol. 51: 77–129.
Reichert, A., U. Rothbauer and M. Morl, 1998 Processing and editing of overlapping tRNAs in human mitochondria. J. Biol. Chem. 273: 31977–31984.
Reyes, A., C. Gissi, G. Pesole and C. Saccone, 1998 Asymmetrical directional mutation pressure in the mitochondrial genome of mammals. Mol. Biol. Evol. 15: 957–966.
Robertus, J. D., J. E. Ladner, J. T. Finch, D. Rhodes, R. S. Brown et al., 1974 Structure of yeast phenylalanine tRNA at 3 Å resolution. Nature 250: 546–551.
Staton, J. L., L. L. Daehler and W. M. Brown, 1997 Mitochondrial gene arrangement of the horseshoe crab Limulus polyphemus L.: conservation of major features among arthropod classes. Mol. Biol. Evol. 14: 867–874.
Thompson, J. D., D. G. Higgins and T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
Van de Peer, Y., I. Van den Broeck, P. De Rijk and R. De Wachter, 1994 Database on the structure of small ribosomal subunit RNA. Nucleic Acids Res. 22: 3488–3494.
Voronov, D. A., Y. V. Panchin and S. E. Spiridonov, 1998 Nematode phylogeny and embryology. Nature 395: 28.
Watanabe, Y., H. Tsurui, T. Ueda, R. Furushima, S. Takamiya et al., 1994 Primary and higher order structures of nematode (Ascaris suum) mitochondrial tRNAs lacking either the T or D stem. J. Biol. Chem. 269: 22902–22906.
Watanabe, Y., H. Tsurui, T. Ueda, R. Furusihima-Shimogawara, S. Takamiya et al., 1997 Primary sequence of mitochondrial tRNA(Arg) of a nematode Ascaris suum: occurrence of unmodified adenosine at the first position of the anticodon. Biochim. Biophys. Acta 1350: 119–122.
Wesley, U. V., and C. S. Wesley, 1997 Rapid directional walk within DNA clones by step-out PCR. Methods Mol. Biol. 67: 279–285.
Wolstenholme, D. R., 1992 Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141: 173–216.
Wolstenholme, D. R., J. L. Macfarlane, R. Okimoto, D. O. Clary and J. A. Wahleithner, 1987 Bizarre tRNAs inferred from DNA sequences of mitochondrial genomes of nematode worms. Proc. Natl. Acad. Sci. USA 84: 1324–1328.
Wolstenholme, D. R., R. Okimoto and J. L. Macfarlane, 1994 Nucleotide correlations that suggest tertiary interactions in the TV-replacement loop-containing mitochondrial tRNAs of the nematodes, Caenorhabditis elegans and Ascaris suum. Nucleic Acids Res. 22: 4300–4306.
Yates, F., 1934 Contingency tables involving small numbers and the χ² test. J. R. Stat. Soc. 1 (Suppl.): 217–235.
Yokobori, S., and S. Paabo, 1997 Polyadenylation creates the discriminator nucleotide of chicken mitochondrial tRNA(Tyr). J. Mol. Biol. 265: 95–99.
Communicating editor: N. Takahata