Nucleotide sequence of the genome of eggplant mosaic tymovirus



VIROLOGY 172, 547-554 (1989)

Nucleotide Sequence of the Genome of Eggplant Mosaic Tymovirus

MARIA ELENA OSORIO-KEESE, PAUL KEESE,¹ AND ADRIAN GIBBS²

Research School of Biological Sciences, Australian National University, Canberra, A.C.T. 2601, Australia

Received May 27, 1988; accepted May 22, 1989

The sequence of the RNA genome of an isolate of eggplant mosaic tymovirus from Trinidad (EMV-Trin) has been determined. The genome is 6330 nucleotide residues in length and contains three open reading frames: two overlapping genes near the 5’ terminus, whose initiation codons are separated by seven nucleotide residues (nucleotide residues 102-2051 and 109-5628), and the virion protein gene, which is near the 3’ terminus (nucleotide residues 5633-6199). The genomes of EMV-Trin and turnip yellow mosaic tymovirus have the same genomic organization and similar nucleotide and encoded amino acid sequences. The nucleotide residues adjacent to the initiation codons of tymoviral overlapping genes have closely similar sequences, which may form a weak stem-loop secondary structure that regulates their translation. © 1989 Academic Press, Inc.

INTRODUCTION

Eggplant mosaic tymovirus (EMV) was first reported to occur in Trinidad, West Indies (Ferguson, 1951; Dale, 1954), where it caused mosaic diseases of eggplant and tomato. Host range studies indicate that EMV mainly infects species of the Solanaceae and Chenopodiaceae (Dale, 1954; Gibbs et al., 1966; Gibbs and Harrison, 1969). Variants of EMV have been reported that differ in virulence, but not antigenic specificity, from the EMV type strain (Gibbs and Harrison, 1969), which is called EMV-Trin in this paper.

The unipartite single-stranded EMV RNA genome has been estimated to have an M_r of about 2.0 × 10⁶ (Gibbs and Harrison, 1973). EMV viral RNA has a 5’ 7-methylguanosine cap (G. W. Both, personal communication) and can be efficiently and specifically aminoacylated by valyl-tRNA synthetase (Pinck et al., 1974). The 3’-terminal 80 nucleotide residues can form a tRNA-like secondary structure (van Belkum et al., 1987). Purified virions of EMV separate into three components when centrifuged in CsCl gradients. The bottom, infectious component contains genomic RNA. The middle and top components contain host tRNAs and the subgenomic mRNA for the virion protein (Szybiak et al., 1978).

EMV virion protein is composed of 188 amino acid residues (Dupin et al., 1984) and has 32% sequence similarity with the related virion protein of turnip yellow mosaic tymovirus (TYMV) (Dupin et al., 1985). However, the regions of the EMV virion protein that correspond to the possible antigenic sites of TYMV show only 20% sequence similarity (Dupin et al., 1985), which may account for the lack of a detectable serological cross-reaction between EMV and TYMV virions (Koenig, 1976). Furthermore, EMV and TYMV have distinct host ranges; TYMV is restricted to members of the Brassicaceae and two closely related families. Nonetheless, these two tymoviruses have genomes of similar base composition and were found to be closely related in viral RNA-cDNA in vitro hybridization tests (Blok et al., 1987).

The determination of the nucleotide sequences of the genomes of EMV-Trin and the Club Lake isolate of TYMV (Keese et al., 1989) has allowed common genetical features to be identified and has provided further details of the evolutionary relationships of tymoviruses.

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. J04374.

¹ Present address: CSIRO Division of Plant Industry, Black Mountain, Canberra, A.C.T. 2601, Australia.

² To whom reprint requests should be addressed.

MATERIALS AND METHODS

The materials and methods used were those described by Keese et al. (1989), except for the following details.

The enzymes used included Escherichia coli poly(A) polymerase obtained from Bresatec. Primers for cDNA synthesis included random synthetic hexamers (Bresatec), dT,G (P. L. Biochemicals), and EMV-specific oligodeoxynucleotides kindly synthesized by G. Mayo.

The Trinidad isolate of EMV used was that described by Gibbs and Harrison (1969). It was propagated in plants of Nicotiana clevelandii. Viral RNA was extracted from particles purified as described previously (Gibbs et al., 1966; Guy and Gibbs, 1985).

DNA complementary to the genome was synthesized using reverse transcriptase and, as primers, either synthetic random hexanucleotides or dT,G with EMV-Trin genomic RNA that had been polyadenylated according to the supplier’s instructions. Double-stranded DNA was then generated from the cDNA using the method of Gubler and Hoffman (1983) and hydrolyzed with one or other of the restriction endonucleases &al, HaeIII, AluI, Sau3AI, TaqI, and HinPI. The fragments were fractionated by electrophoresis in 5% polyacrylamide gels and ligated into the appropriately cleaved M13mp18 RF vectors. The ligated vectors were used to infect E. coli JM101 cells (Hanahan, 1983) and cloned, and the EMV-encoded DNA inserts of selected recombinant bacteriophages were sequenced by the dideoxynucleotide chain termination method (Sanger et al., 1980). The larger (500-1000 bp) &al and Sau3AI fragments were ligated into the SmaI and BamHI sites of pGEM1 and subcloned into M13mp18 and 19 vectors.

Synthetic oligodeoxynucleotides, complementary to selected sequences of the EMV-Trin genome, were used to generate additional cDNA clones. The primers 5’-TXGGGATCGCAGCT-3’ (complementary to nucleotide residues 1488-1502) and 5’-GGTGGGATTGGCGTA-3’ (complementary to nucleotide residues 2251-2265) were phosphorylated using T4 polynucleotide kinase in the presence of ATP and used to produce a 657-bp and a 500-bp fragment, using TaqI and HindIII restriction endonuclease, respectively. These two fragments were transcribed with the Klenow fragment in the presence of all four deoxynucleotides to produce flush ends. The primer 5’-GGAACTGGTTGCTCT-3’ (complementary to nucleotide residues 4208-4222) was partly end-labeled with T4 polynucleotide kinase and [γ-³²P]ATP and used to generate a single 163-bp fragment after hydrolysis with the restriction endonuclease &al. All three fragments were isolated by electrophoresis in 5% polyacrylamide gels and cloned into the SmaI site of M13mp18 and 19.

Three EMV-Trin-specific oligodeoxynucleotide primers (5’-TTGTAGAAGCATC-3’, 5’-CAGTTCTACGCAGTC-3’, and 5’-AGTCACAGATGAATG-3’) were used for direct RNA sequencing by the dideoxynucleotide chain termination method (Ou et al., 1981) and initiated cDNA synthesis at nucleotide residues 154, 1020, and 2589.
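Because each primer is quoted as the complement of a stretch of the genomic (+) RNA, the stretch it anneals to can be recovered by taking the reverse complement of the primer and writing it as RNA. The following is a minimal sketch of that bookkeeping, not part of the published methods; the function names `genome_target` and `covers` are hypothetical and introduced here only for illustration.

```python
# Watson-Crick partners for DNA bases.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def genome_target(primer_dna: str) -> str:
    """Return the genome-sense RNA stretch (5'->3') that a DNA primer anneals to."""
    reverse_complement = "".join(
        DNA_COMPLEMENT[base] for base in reversed(primer_dna.upper())
    )
    # The genome is RNA, so write thymine as uracil.
    return reverse_complement.replace("T", "U")


def covers(start: int, end: int, primer_dna: str) -> bool:
    """Check that a 1-based inclusive coordinate range has the primer's length."""
    return end - start + 1 == len(primer_dna)


# Primer quoted in the text as complementary to nucleotide residues 2251-2265.
primer = "GGTGGGATTGGCGTA"
print(genome_target(primer))        # genome-sense RNA over residues 2251-2265
print(covers(2251, 2265, primer))   # coordinate range and primer length agree
```

The same check can be run on the other fully legible primers to confirm that each quoted coordinate range has the same length as its primer.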

The sequences of the 5’- and 3’-terminal 40 nucleotide residues of the EMV-Trin genome were determined by direct RNA enzymatic sequencing as described by Haseloff and Symons (1981).

The sequences were compiled by the Staden (1982) library of computer programs for shotgun sequencing and analyzed using the SEQ program package of the Research School of Biological Sciences and the alignment method of Feng and Doolittle (1987).

RESULTS AND DISCUSSION

EMV sequence

EMV-Trin genomic RNA is 6330 nucleotide residues long. Its complete sequence is shown in Fig. 1, together with the predicted amino acid sequences of its three longest open reading frames (ORFs). The nucleotide sequence was compiled from many short overlapping sequences. These were obtained by sequencing M13 recombinant clones generated by hydrolysis of EMV DNA with one of seven restriction endonucleases; by direct RNA enzymatic sequencing of the 5’ and 3’ termini; and by direct sequencing of the EMV-Trin genomic RNA template using reverse transcriptase and one of three synthetic oligodeoxynucleotide primers adjacent to regions of the EMV-Trin genome not represented in the recombinant M13 DNA clones. The primers 5’-TTGTAGAAGCATCT-3’ and 5’-CAGTTCTACGCAGTC-3’ hybridize close to EcoK restriction recognition sequences in EMV-Trin DNA at nucleotide positions 90 and 931. The EcoK regions of EMV-Trin cDNA were not represented in M13 clones derived from transfections into E. coli JM101. Except for these two regions, the remainder of the EMV-Trin genomic sequence was derived from two or more M13 clones, and 95% of the sequence was determined in both orientations. A total databank of about 50,000 nucleotides was used to compile the complete sequence. The sequence presented in Fig. 1 was obtained by choosing only those nucleotide residues that were represented in two or more independent cDNA clones, to avoid possible artifacts arising from transcription errors by AMV reverse transcriptase (Keese et al., 1989).
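The rule of accepting only residues seen in two or more independent cDNA clones can be illustrated with a small consensus routine. This is a hedged sketch and not the Staden programs actually used; the read layout (clone identifier, 1-based start, sequence), the function name `consensus`, and the use of `N` for unsupported positions are all assumptions made for the example.

```python
def consensus(reads, length, min_clones=2):
    """Call a base at each position only if >= min_clones independent clones agree.

    reads: iterable of (clone_id, start, sequence), with start 1-based.
    Positions without enough independent support are reported as 'N'.
    """
    # For every position, map each observed base to the set of clones showing it.
    support = [dict() for _ in range(length)]
    for clone_id, start, seq in reads:
        for offset, base in enumerate(seq.upper()):
            pos = start - 1 + offset
            if 0 <= pos < length:
                support[pos].setdefault(base, set()).add(clone_id)

    called = []
    for column in support:
        # Pick the base backed by the most clones (ties resolve to the first seen).
        best = max(column.items(), key=lambda item: len(item[1]), default=None)
        if best is not None and len(best[1]) >= min_clones:
            called.append(best[0])
        else:
            called.append("N")
    return "".join(called)


# Toy reads (hypothetical): clone c3 carries a single copying error at position 4.
toy_reads = [("c1", 1, "AUGGCC"), ("c2", 3, "GGCCUU"), ("c3", 1, "AUGACC")]
print(consensus(toy_reads, length=8))
```

In the toy data the lone error in one clone is outvoted, and the last two positions, covered by a single clone, stay uncalled. For scale, about 50,000 nucleotides of reads over a 6330-residue genome is roughly 7.9-fold average redundancy.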

The 3’-terminal 74 nucleotide residue sequence of EMV-Trin genomic RNA has three nucleotide differences and two extra nucleotides compared with the sequence reported by van Belkum et al. (1987). The 3’ noncoding sequence of EMV-Trin genomic RNA may be folded into a five-stemmed base-paired tRNA-like structure (Fig. 2) similar to that proposed for the 3’ noncoding region of TYMV RNA (Florentz et al., 1982; Rietveld et al., 1982; van Belkum et al., 1988).

Open reading frames

Figure 3 shows the ORFs, with standard start and stop codons, found in EMV-Trin genomic RNA and its complementary sequence. Only three large ORFs are found in the positive strand. The 3’-terminal ORF encodes the virion protein (VP) (M_r 19,769) and has the same sequence as that reported by Dupin et al. (1984), who directly sequenced the virion protein, except that there is a glutamine rather than glutamic acid at position 39. The virion protein sequence begins at nucleotide residue 5633 rather than at the first AUG codon of the VP ORF at nucleotide residue 5268. Thus, although the VP ORF overlaps the RP ORF (Fig. 3), the virion protein initiation codon lies five nucleotides to the 3’ side of the RP ORF, and the virion protein is expressed from an EMV subgenomic mRNA, the reported size of which (Szybiak et al., 1978) is close to that expected for a start at nucleotide 5633 rather than at one of the earlier initiation codons.

FIG. 1. Nucleotide sequence of EMV-Trin genomic RNA together with the encoded amino acid sequence of the three largest open reading frames. Dots over the sequence occur every 10 bases. RP is a possible replicase protein; OP is an overlapping out-of-phase protein; and VP is the virion protein.

FIG. 2. Possible secondary structure of the 3’-terminal 103 nucleotide residues of EMV-Trin genomic RNA. The stem-loops are numbered in the same order as those of the equivalent region of the TYMV genome; the loops of the TYMV genome were determined experimentally by Florentz et al. (1982). Stem-loop III corresponds to the anticodon loop of valine tRNAs. The secondary structure is presented in a form that more closely resembles the postulated tertiary structure of the TYMV 3’-terminal sequence than the cloverleaf structure.

FIG. 3. Diagram illustrating the positions of open reading frames (open rectangles) in the three codon phases of the viral (+) strand and the complementary (-) strand of the EMV-Trin genome. The left border of each open rectangle indicates where an AUG codon occurs, and its right border indicates the position of the first in-phase UGA, UAG, or UAA triplet. The scale units are kilobases.

TABLE 1

PERCENTAGE NUCLEOTIDE AND AMINO ACID SEQUENCE SIMILARITY BETWEEN EMV-Trin AND TYMV-CL GENOMES

                    Percentage similarity
Region              Nucleotide    Amino acid
5’ noncoding        44
OP ORF              51            28
RP ORF              51            49
VP ORF              50            33
3’ noncoding        52
Total               51

The remaining two overlapping ORFs of EMV-Trin genomic RNA are similar in size and arrangement to the overlapping ORFs of TYMV-CL (Keese et al., 1989). The longest EMV-Trin ORF (nucleotide residues 109-5628) encodes a protein of M_r 204,731 that is probably a replicase protein (RP), because it is homologous to the corresponding protein in TYMV-CL and shares amino acid sequence similarities with other viral RNA replicases (Keese et al., 1989). The shorter of the two overlapping ORFs (nucleotide residues 102-2051) encodes an out-of-phase protein (OP) that has a predicted M_r of 70,233 and an unknown function.
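The quoted ORF coordinates can be checked with simple arithmetic: each range should be a whole number of codons, and the start positions show which of the three codon phases each gene occupies. The sketch below assumes that the quoted ranges are 1-based, inclusive, and include the termination codon; that assumption is consistent with the 188-residue virion protein reported by Dupin et al. (1984), but it is an inference, not a statement from the paper. The name `orf_summary` is hypothetical.

```python
# ORF coordinates as quoted in the text (1-based, inclusive).
# Assumption: each range includes its termination codon.
ORFS = {"OP": (102, 2051), "RP": (109, 5628), "VP": (5633, 6199)}


def orf_summary(start: int, end: int) -> dict:
    """Return codon phase (0, 1, or 2), codon count, and encoded residue count."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return {
        "frame": (start - 1) % 3,   # codon phase relative to nucleotide 1
        "codons": codons,
        "residues": codons - 1,     # the stop codon encodes no residue
    }


for name, (start, end) in ORFS.items():
    print(name, orf_summary(start, end))
```

Under these assumptions the three genes fall in three different codon phases, and the VP range gives 188 residues, in agreement with the protein sequencing result cited in the Introduction.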

Szybiak et al. (1978) reported that when EMV genomic RNA was translated in vitro in rabbit reticulocyte lysates it yielded three major products with M_r values of 180,000, 150,000, and 70,000. The largest two proteins probably correspond to the largest two in vitro translation products of TYMV genomic RNA (Mellema et al., 1979). They are thought to result from the full-length and partial translation of the RP ORF; it is concluded that the partial translation terminates at a leaky intermediate termination signal, the exact form of which is unknown. The M_r 70,000 in vitro translation product may correspond to either the OP or a proteolytic fragment of RP (M_r 180,000), as reported for the full-length RP of TYMV (Morch and Benicourt, 1980).

                       -4   +1   +5
Plant consensus         AACA AUG GC

OP   EMV-Trin        UACGUCA AUG CCU
     TYMV-CL         AUUGCAA AUG AGU
     OYMV-Tin        UGAAUUC AUG UCU

RP   EMV-Trin        AUGCCUC AUG GCC
     TYMV-CL         AUGAGUA AUG GCC
     OYMV-Tin        AUGUCUA AUG GCC

VP   EMV-Trin        UAGAUCA AUG GAA
     TYMV-CL         CCCCGAC AUG GAA
     OYMV-Tin        UUCAAUC AUG GAA

FIG. 4. Nucleotide sequences surrounding the postulated AUG initiation codons (set off by spaces) of the three major ORFs (RP, OP, and VP) of EMV-Trin, TYMV-CL (Keese et al., 1989), and OYMV-Tin (Ding et al., 1989) aligned with the consensus sequence determined for plant genes (Lütcke et al., 1987).


FIG. 5. Possible secondary structures with maximal base pairing, adjacent to the initiation codons (underlined) of some overlapping viral genes. The free energy value of each structure was calculated using the parameters of Freier et al. (1986) for RNA in 1 M NaCl at 37°. (A) Tymoviral OP and RP genes of EMV-Trin, OYMV-Tin (Ding et al., 1989), TYMV-CL (Keese et al., 1989), and the Bawley Point isolate of kennedya yellow mosaic virus (A. Mackenzie, unpublished results). (B) P/C genes of measles virus (Bellini et al., 1985) and parainfluenza virus-3 (Luk et al., 1986).

Not only do the EMV-Trin and TYMV-CL genomes have the same organization, but clear similarities are also found in the nucleotide and encoded amino acid sequences of the genes of both viruses (Table 1). They have about 51% overall nucleotide sequence similarity, although somewhat lower amino acid sequence similarity. The RPs are about 49% similar, but the OPs and VPs are only 28 and 33% similar, respectively.
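Percentages of this kind are computed from a pairwise alignment. The sketch below shows one common convention, identical columns divided by all aligned columns with gaps counted as mismatches; the paper does not state which convention was applied after the Feng and Doolittle (1987) alignment, so this is an assumption, and the function name and the toy sequences are hypothetical.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns holding the same residue in both sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical sequences): 7 identical columns out of 9.
print(round(percent_similarity("AUGGCC-UU", "AUGACCAUU"), 1))
```

Other conventions, such as excluding gapped columns or scoring conservative amino acid replacements as similar, would give somewhat different percentages from the same alignment.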

Tin) (Ding er al., 1989) have the same genomic arrange- ment. Near the 5’terminus of all three viruses, the initia- tion codon of the RP ORF starts seven nucleotide resi- dues to the 3’side of the initiation codon of the OP ORF, and the 3’terminal virion protein gene is not in the same reading frame as the RP gene. This common arrange- ment of the genes of all three tymoviruses is probably essential for the correct in viva expression and regula- tion of the genes.

Regulation of translation initiation in tymoviral overlapping genes

EMV-Trin, TYMV-CL (Keese et al., 1989), and the type strain of ononis yellow mosaic tymovirus (OYMV-

Several animal viruses with negative-sense single-stranded RNA genomes also initiate translation of two out-of-phase overlapping ORFs less than 24 nucleotide residues apart. This arrangement of overlapping genes is found in the P/C genes of Sendai paramyxovirus (Curran and Kolakofsky, 1988), parainfluenza-3 paramyxovirus (Luk et al., 1986), and measles morbillivirus (Bellini et al., 1985); the N and NSs genes of snowshoe hare bunyavirus (Bishop et al., 1982); and the NA and NB genes of influenza B orthomyxovirus (Shaw et al., 1983).
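Whether two ORFs are out of phase can be checked by simple arithmetic on their start coordinates. The following minimal sketch applies this to the three EMV-Trin ORF coordinates quoted in the abstract; the function names are illustrative, and the assumption that each quoted range includes its termination codon is ours rather than something stated in the text.

```python
# EMV-Trin ORF coordinates (1-based, inclusive) as quoted in the abstract.
# Assumption: each range runs from the initiation codon through the stop codon.
ORFS = {
    "OP": (102, 2051),
    "RP": (109, 5628),
    "VP": (5633, 6199),
}


def reading_frame(start):
    """Return the reading frame (0, 1, or 2) of an ORF from its 1-based start."""
    return (start - 1) % 3


def codon_count(start, end):
    """Return the number of complete codons spanned by start..end inclusive."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


for name, (start, end) in ORFS.items():
    print(name, "frame", reading_frame(start), "codons", codon_count(start, end))

# Starts seven residues apart cannot share a frame, because 7 is not a
# multiple of 3.
print("OP/RP offset:", ORFS["RP"][0] - ORFS["OP"][0])
```

Under these assumptions the three ORFs fall in three different frames, and the virion protein range spans 189 codons, which would correspond to a 188-residue protein plus a termination codon.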

There are several ways in which the translation of overlapping genes could be regulated and coordinated. For example, one of the two genes could be expressed from a subgenomic RNA; alternatively, both could start at the same AUG, with one involving a shift of reading frame during translation; or there could be some form of competition between the initiation codons. In the bifunctional mRNAs of animal viruses, the first AUG codon lies in an unfavorable sequence context for initiation compared to the optimal sequence determined for preproinsulin, -CCACCAUGG- (Kozak, 1986a); thus the 40 S ribosomal subunit may bypass the first initiation codon on some occasions, allowing protein synthesis to initiate at the following AUG codon (Kozak, 1986b).

A similar argument could be applied to tymoviruses, taking into account the fact that the conserved sequence context of plant gene initiation sites is different from that of animals (Heidecker and Messing, 1986; Lütcke et al., 1987) and is affected differently by mutational changes when tested in plant and animal assays both in vitro (Lütcke et al., 1987) and in vivo (Gallie et al., 1988). Such an analysis indicates that some generalizations can usefully be made. For example, in plant genes the +4 position is the least variable and is usually guanine, whereas in animal genes one of the most crucial nucleotides for regulating initiation is a purine (usually adenine) at position -3. Most plant genes also have adenine at position -3, but less frequently than in animal genes (Lütcke et al., 1987). The initiation codon of all three tymovirus OP genes appears to have nucleotides at positions +4 and -3 that least favor initiation (Fig. 4): a pyrimidine at position -3 and no guanine at position +4. The only plant viral gene sequence known to have a similar sequence pattern is the initiation codon of brome mosaic virus RNA3 (Ahlquist et al., 1981). Thus, if Kozak's (1986b) proposal is correct, the ribosomal initiation complex may usually bypass the AUG of the OP and preferentially synthesize RP.
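The positional convention used in this argument (the A of the AUG is +1, so +4 is the residue immediately after the codon and -3 lies three residues upstream of the A) can be made concrete with a short sketch. The function below is illustrative only; the second test sequence is hypothetical and is not taken from any viral genome.

```python
PURINES = {"A", "G"}


def initiation_context(seq, aug_index):
    """Report the residues at positions -3 and +4 around an AUG codon.

    seq is an RNA (or DNA) string; aug_index is the 0-based index of the
    A of the AUG, which is position +1 in the convention used in the text.
    """
    rna = seq.upper().replace("T", "U")
    if rna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG codon at the given index")
    if aug_index < 3 or aug_index + 3 >= len(rna):
        raise ValueError("sequence too short to read positions -3 and +4")
    minus3 = rna[aug_index - 3]   # three residues upstream of the A
    plus4 = rna[aug_index + 3]    # residue immediately after the AUG
    return {
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in PURINES,
        "g_at_plus4": plus4 == "G",
    }


# Optimal animal context quoted in the text (Kozak, 1986a).
print(initiation_context("CCACCAUGG", 5))

# Hypothetical sequence with a pyrimidine at -3 and no guanine at +4.
print(initiation_context("UUCUUAUGUC", 5))
```

An initiation codon for which both flags are false has the pattern described above for the tymovirus OP genes.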

Another notable feature of the sequence adjacent to the OP initiation site of tymoviruses is that it could form a secondary structure that would preferentially influence initiation of translation of the OP and RP ORFs (Fig. 5a). Kozak (1986c) found that not only would stable stem-loops (ΔG = -50 kcal/mol) prevent initiation at a downstream AUG codon, but that less stable structures (ΔG = -30 kcal/mol) could significantly decrease initiation of translation. Therefore, weak stem structures such as those depicted in Fig. 5a may, in conjunction with the poor sequence context of the OP initiation codon, play a role in the regulation and differential translation of OP and RP. Similar secondary structures can also be proposed for some of the overlapping genes of animal viruses, such as the P/C genes of parainfluenza-3 virus and measles virus (Fig. 5b).
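To show how weak the proposed structures are relative to those tested by Kozak (1986c), the sketch below compares the free energy values legible in Fig. 5 with the two reference values quoted above. The constant and function names are illustrative, and no assignment of a given value to a particular virus is assumed.

```python
# Reference free energies (kcal/mol) quoted in the text from Kozak (1986c).
BLOCKING_DG = -50.0   # stable stem-loop reported to prevent initiation
REDUCING_DG = -30.0   # less stable structure reported to decrease initiation

# Free energy values legible in Fig. 5 (kcal/mol). Which structure each
# value belongs to is deliberately not asserted here.
FIG5_DG = [-10.1, -5.3, -3.2]


def classify_stem(delta_g):
    """Place a stem-loop free energy relative to the two reference values."""
    if delta_g <= BLOCKING_DG:
        return "at least as stable as the blocking structure"
    if delta_g <= REDUCING_DG:
        return "at least as stable as the reducing structure"
    return "weaker than both reference structures"


for dg in FIG5_DG:
    fraction = dg / REDUCING_DG
    print(f"{dg:6.1f} kcal/mol: {classify_stem(dg)} "
          f"({fraction:.0%} of the {REDUCING_DG} kcal/mol reference)")
```

All of the Fig. 5 values fall well short of the -30 kcal/mol reference, which is why the text proposes a combined effect with the poor sequence context rather than an effect of structure alone.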

In summary, the EMV-Trin genomic RNA has the same genome organization as other tymoviral genomes and has significant sequence similarity with that of TYMV-CL.

ACKNOWLEDGMENTS

We thank A. Mackenzie, M. Torronen, and J. Howe for excellent technical assistance.

REFERENCES

AHLQUIST, P., LUCKOW, V., and KAESBERG, P. (1981). Complete nucleotide sequence of brome mosaic virus RNA3. J. Mol. Biol. 153, 23-38.

BELLINI, W., ENGLUND, G., ROZENBLATT, S., ARNHEITER, H., and RICHARDSON, C. D. (1985). Measles virus P gene codes for two proteins. J. Virol. 53, 908-919.

BISHOP, D. H. L., GOULD, K. G., AKASHI, H., and VAN HAASTER, M. C. (1982). The complete sequence and coding content of snowshoe hare bunyavirus small (S) viral RNA species. Nucleic Acids Res. 10, 3703-3713.

BLOK, J., GIBBS, A., and MACKENZIE, A. (1987). The classification of tymoviruses by cDNA-RNA hybridization and other measures of relatedness. Arch. Virol. 96, 225-240.

CURRAN, J., and KOLAKOFSKY, D. (1988). Ribosomal initiation from an ACG codon in the Sendai virus P/C mRNA. EMBO J. 7, 245-251.

DALE, W. T. (1954). Sap-transmissible mosaic diseases of Solanaceous crops in Trinidad. Ann. Appl. Biol. 41, 240-247.

DING, S., KEESE, P., and GIBBS, A. (1989). Nucleotide sequence of the ononis yellow mosaic tymovirus genome. Virology 172, 555-563.

DUPIN, A., COLLOT, D., PETER, R., and WITZ, J. (1985). Comparisons between the primary structure of the coat proteins of turnip yellow mosaic virus and eggplant mosaic virus. J. Gen. Virol. 66, 2571-2579.

DUPIN, A., PETER, R., COLLOT, D., DAS, B. C., PETER, C., BOUILLON, P., and DURANTON, H. (1984). The primary structure of the eggplant mosaic virus (EMV) coat protein. C. R. Acad. Sci. Paris Ser. C 298, 219-221.

FERGUSON, I. A. C. (1951). Four virus diseases of solanaceous plants in Trinidad. Plant Dis. Rep. 35, 102-105.

FENG, D.-F., and DOOLITTLE, R. F. (1987). Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351-360.

FLORENTZ, C., BRIAND, J. P., ROMBY, P., HIRTH, L., EBEL, J. P., and GIEGÉ, R. (1982). The tRNA-like structure of turnip yellow mosaic virus RNA: Structural organization of the last 159 nucleotides from the 3' OH terminus. EMBO J. 1, 269-276.

FREIER, S. M., KIERZEK, R., JAEGER, J. A., SUGIMOTO, N., CARUTHERS, M. H., NEILSON, T., and TURNER, D. H. (1986). Improved free-energy parameters for predictions of RNA duplex stability. Proc. Natl. Acad. Sci. USA 83, 9373-9377.

GALLIE, D. R., SLEAT, D. E., WATTS, J. W., TURNER, P. C., and WILSON, T. M. A. (1988). Mutational analysis of the tobacco mosaic virus 5'-leader for altered ability to enhance translation. Nucleic Acids Res. 16, 883-893.

GIBBS, A. J., and HARRISON, B. D. (1969). Eggplant mosaic virus, and its relationship to Andean potato latent virus. Ann. Appl. Biol. 64, 225-231.

GIBBS, A. J., and HARRISON, B. D. (1973). Eggplant mosaic virus. “C.M.I./A.A.B. Descrip. Plant Viruses,” Set 7, No. 124.

GIBBS, A. J., HECHT-POINAR, E., and WOODS, R. D. (1966). Some properties of three related viruses: Andean potato latent, dulcamara mottle, and ononis yellow mosaic. J. Gen. Microbiol. 44, 177-193.

GUBLER, U., and HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.

GUY, P. L., and GIBBS, A. J. (1985). Further studies on turnip yellow mosaic tymovirus isolates from an endemic Australian Cardamine. Plant Pathol. 34, 532-544.

HANAHAN, D. (1983). Studies on transformation of Escherichia coli with plasmids. J. Mol. Biol. 166, 557-580.

HASELOFF, J., and SYMONS, R. H. (1981). Chrysanthemum stunt viroid: Primary sequence and secondary structure. Nucleic Acids Res. 9, 2741-2752.

HEIDECKER, G., and MESSING, J. (1986). Structural analysis of plant genes. Annu. Rev. Plant Physiol. 37, 439-466.

KEESE, P., MACKENZIE, A., and GIBBS, A. (1989). Nucleotide sequence of the genome of an Australian isolate of turnip yellow mosaic tymovirus. Virology 172, 536-546.

KOENIG, R. (1976). A loop-structure in the serological classification system of tymoviruses. Virology 72, 1-5.

KOZAK, M. (1986a). Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.

KOZAK, M. (1986b). Regulation of protein synthesis in virus-infected animal cells. Adv. Virus Res. 31, 229-292.

KOZAK, M. (1986c). Influence of mRNA secondary structure on initiation by eukaryotic ribosomes. Proc. Natl. Acad. Sci. USA 83, 2850-2854.

LUK, D., SANCHEZ, A., and BANERJEE, A. K. (1986). Messenger RNA encoding the phosphoprotein (P) gene of human parainfluenza virus 3 is bicistronic. Virology 153, 318-325.

LÜTCKE, H. A., CHOW, K. C., MICKEL, F. S., MOSS, K. A., KERN, H. F., and SCHEELE, G. A. (1987). Selection of AUG initiation codons differs in plants and animals. EMBO J. 6, 43-48.

MELLEMA, J.-R., BÉNICOURT, C., HAENNI, A.-L., NOORT, A., PLEIJ, C. W. A., and BOSCH, L. (1979). Translational studies with turnip yellow mosaic virus RNAs isolated from major and minor virus particles. Virology 96, 38-46.

MORCH, M.-D., and BÉNICOURT, C. (1980). Post-translational proteolytic cleavage of in vitro synthesized turnip yellow mosaic virus RNA-coded high-molecular-weight proteins. J. Virol. 34, 85-94.

OU, J.-H., STRAUSS, E. G., and STRAUSS, J. H. (1981). Comparative studies of the 3'-terminal sequences of several alphavirus RNAs. Virology 109, 281-289.

PINCK, M., GENEVAUX, M., and DURANTON, H. (1974). Studies on the amino acid acceptor activities of the eggplant mosaic viral RNA and its satellite RNA. Biochimie 56, 423-428.

RIETVELD, K., VAN POELGEEST, R., PLEIJ, C. W. A., VAN BOOM, J. H., and BOSCH, L. (1982). The tRNA-like structure at the 3' terminus of turnip yellow mosaic virus RNA. Differences and similarities with canonical tRNA. Nucleic Acids Res. 10, 1929-1946.

SANGER, F., COULSON, A. R., BARRELL, B. G., SMITH, A. J. H., and ROE, B. A. (1980). Cloning in a single-stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol. 143, 161-178.

SHAW, M. W., CHOPPIN, P. W., and LAMB, R. A. (1983). A previously unrecognized B virus glycoprotein from a bicistronic mRNA that also encodes the viral neuraminidase. Proc. Natl. Acad. Sci. USA 80, 4879-4883.

STADEN, R. (1982). Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing. Nucleic Acids Res. 10, 4731-4751.

SZYBIAK, U., BOULEY, J. P., and FRITSCH, C. (1978). Evidence for the existence of a coat protein messenger RNA associated with the top component of each of three tymoviruses. Nucleic Acids Res. 5, 1821-1831.

VAN BELKUM, A., JIANG, B., RIETVELD, K., PLEIJ, C. W. A., and BOSCH, L. (1987). Structural similarities among valine-accepting tRNA-like structures in tymoviral RNAs and elongator tRNAs. Biochemistry 26, 1144-1151.

VAN BELKUM, A., VERLAAN, P., JIANG, B., PLEIJ, C., and BOSCH, L. (1988). Temperature dependent chemical and enzymatic probing of the tRNA-like structure of TYMV RNA. Nucleic Acids Res. 16, 1931-1950.