DNA Sequencing and Analysis of a 40 kb Region from the Right Arm of Chromosome II from Schizosaccharomyces pombe

Yeast 15, 419–426 (1999)

MANUEL SÁNCHEZ, FRANCISCO DEL REY, ANGEL DOMÍNGUEZ, SERGIO MORENO AND JOSÉ L. REVUELTA*

Departamento de Microbiología y Genética, Instituto de Microbiología Bioquímica, Universidad de Salamanca/CSIC, Campus Miguel de Unamuno, 37007 Salamanca, Spain

We have determined the complete nucleotide sequence of a 39 648 bp segment, contained in cosmid c32F12, derived from the right arm of chromosome II from the fission yeast Schizosaccharomyces pombe. Computer analysis of the sequence revealed the presence of 15 non-overlapping open reading-frames (ORFs) longer than 300 bp and one tRNA-Thr gene. Six ORFs correspond to the previously known rec14+, tug1+, rum1+, pch1+, gpd1+ and cyr1+ genes. Five ORFs code for putative proteins with significant homology to proteins from other organisms. SPBC32F12.01c shows considerable similarity to human neutral sphingomyelinase, whereas SPBC32F12.03c, SPBC32F12.10 and SPBC32F12.14 exhibit strong homology to glutathione peroxidase, phosphoglucomutase and ubiquitin–protein ligase E3 components from various organisms, respectively. The four remaining ORFs identified show weak or non-significant homology to previously sequenced genes. The nucleotide sequence has been submitted to the EMBL database under Accession Number AL023796. Copyright © 1999 John Wiley & Sons, Ltd.

KEY WORDS: Schizosaccharomyces pombe; genome sequencing; chromosome II; ORF analysis; neutral sphingomyelinase; glutathione peroxidase; phosphoglucomutase; ubiquitin protein ligase; rec14; cyr1; rum1; tug1; pch1; gpd1

*Correspondence to: J. L. Revuelta, Departamento de Microbiología y Genética, Instituto de Microbiología Bioquímica, Universidad de Salamanca/CSIC, Campus Miguel de Unamuno, 37007 Salamanca, Spain. Tel.: 34-923-294671; fax: 34-923-224876; e-mail: [email protected]
Contract/grant sponsor: European Commission, Schizosaccharomyces genome sequencing project; Contract/grant number: BIO-4-CT96-0159.
Contract/grant sponsor: Comisión Interministerial de Ciencia y Tecnología; Contract/grant number: BIO97-1535-C04-CE.
CCC 0749–503X/99/050419–08 $17.50
Received 25 July 1998; accepted 12 October 1998

INTRODUCTION

Chromosomes II and III of the fission yeast Schizosaccharomyces pombe have been selected to be sequenced by a team of 14 European laboratories, with funding from the European Commission, to complement work carried out elsewhere on the genome and to enable the complete sequence to be obtained over the next 3 years. Within the framework of the European Union programme for systematic sequencing of the entire Sz. pombe genome, we have sequenced and analysed a DNA fragment of about 40 kb from the right arm of chromosome II. This fragment corresponds to the entire insert of cosmid c32F12 and is located approximately 1390 kb from the centromere (Hoheisel et al., 1993). In addition to six previously known genes, the sequence of this fragment revealed the presence of one tRNA gene and nine novel open reading frames (ORFs) of at least 100 amino acids.


MATERIALS AND METHODS

Cosmids, plasmids and strains

Cosmid c32F12 was provided by the DNA co-ordinator E. Maier (Max-Planck Institut, Berlin). It contains a 40 kb insert of chromosome II obtained by Sau3A partial digestion of Sz. pombe DNA (strain 972h−) and cloned into the BamHI site of the cosmid vector Lawrist4 (Hoheisel et al., 1993). The insert of cosmid c32F12 partially overlaps the insert of cosmid c19C7, assigned to C. Gaillardin (Institut National Agronomique, Thiverval-Grignon, France). Cosmid c32F12 was mapped by digestions with EcoRI, BamHI, HindIII and SalI followed by gel electrophoresis and hybridization, using gel-purified EcoRI and BamHI fragments of c32F12 as probes. The map is shown in Figure 1. The phagemid pBluescript KS+ (Stratagene) was used as vector for all subsequent subcloning and sequencing steps.

The Escherichia coli strain used as host for transformation and amplification of plasmids was DH5α (supE44 ΔlacU169 [φ80 lacZΔM15] hsdR17 recA1 endA1 gyrA96 thi-1 relA1; Sambrook et al., 1989). E. coli transformants were selected on LB media supplemented with 100 mg/l ampicillin.


Manipulation of nucleic acids

Routine DNA manipulations, cosmid preparation, subcloning, Southern blotting, restriction enzyme digestions, agarose gel electrophoresis, ligation of DNA fragments and E. coli transformation were performed according to standard techniques (Sambrook et al., 1989). Plasmid preparations were carried out using Wizard miniprep columns (Promega).

Figure 1. Genomic organization of the 39 648 bp DNA fragment from the right arm of chromosome II contained in the insert of cosmid c32F12. The position and orientation of the open reading-frames (ORFs) are indicated by arrows, where black rectangles represent putative intervening sequences. Only ORFs longer than 100 amino acids are shown. Previously identified genes are labelled with their names. The portion of the insert contained in cosmid c19C7 that partially overlaps with cosmid c32F12 is indicated at the bottom of the figure. Abbreviations for the restriction enzymes used are: B, BamHI; E, EcoRI; H, HindIII; S, SalI.

Sequencing strategy

The DNA sequence was determined using a random approach. A shotgun library of short fragments of the 40 kb insert of c32F12 was obtained as follows: 10 µg of purified cosmid DNA was subjected to sonication in an Eppendorf tube, using an MSE Soniprep 150 sonicator–cell disruptor. After sonication for 5 s, fragments ranging in size between 100 and 5000 bp were obtained. Sonication products were end-repaired using T4 DNA polymerase and electrophoresed on a 1% agarose gel. Fragments in the size range 1–5 kb were extracted from the agarose by electroelution and inserted into the EcoRV site of the pBluescript KS+ vector. The recombinant plasmids obtained were used to transform the E. coli DH5α strain. A total of 350 clones were selected and stored at −30°C in 96-well plates. The actual size of the inserts ranged from 2 to 4 kb, with a mean size of 3·5 kb. Random sequencing reactions were made using universal and reverse primers. Gap-filling sequencing reactions were performed using custom-synthesized primers. Sequencing was performed on an ABI 377 sequencer (Applied Biosystems, Inc.) using the Taq DyeDeoxy™ Terminator Cycle Sequencing Kit, as supplied by the manufacturer. The kit uses dITP as a standard substitute for dGTP, which effectively eliminates compressions formed during polyacrylamide gel electrophoresis.

In total, 551 random sequences (248 direct and 253 reverse reads) and 50 custom primer-directed sequences were obtained. Altogether, raw data from 297 350 bases were aligned to assemble the final contig, the average reading number per base pair being 7·5 and each base pair being sequenced on both strands and at least three times (upper and lower strand together). The quality of the final sequence was ensured by visual inspection of the sequencing profiles at each position on each DNA strand. The sequence was considered final only when an unambiguous reading of each nucleotide on each strand was achieved.
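The average reading number per base pair is the total number of aligned raw bases divided by the length of the finished contig. The short Python sketch below recomputes it from the two figures quoted in this section; the function name `mean_coverage` is illustrative and was not part of the original analysis.

```python
def mean_coverage(raw_bases: int, contig_length: int) -> float:
    """Average number of reads covering each base pair of the contig."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return raw_bases / contig_length


RAW_BASES = 297_350      # aligned raw bases reported in the text
CONTIG_LENGTH = 39_648   # length of the final contig in bp

coverage = mean_coverage(RAW_BASES, CONTIG_LENGTH)
print(f"{coverage:.1f}-fold average coverage")
```

This is a mean only; it says nothing about the minimum coverage at any single position, which the authors checked separately by inspecting both strands.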


Computer-assisted sequence analysis

Assembly of the sequences was done with the SeqMan program of the DNASTAR program package (DNASTAR Ltd). ORFs were predicted with the help of computer analysis using POMBE, a fission yeast gene-finding and exon–intron structure prediction program (Chen and Zhang, 1998), with additional predictions for the branch-acceptor sites supplied by the program Sp3splice (B. G. Barrell, unpublished). ORFs were named according to the working nomenclature of the European Union Sz. pombe Genome Sequencing Project. The letters SP stand for Sz. pombe and the letter B for chromosome number (B=chromosome II); the following alphanumeric symbols indicate the cosmid name (C32F12) and the last two digits refer to consecutive ORFs in the cosmid. An additional ‘c’ letter indicates a complementary strand.
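The naming scheme just described is regular enough to be split mechanically. The following Python sketch is one possible parser; the regular expression, the `OrfName` type and the function name are assumptions made for illustration. Only the letter B is mapped to a chromosome, because that is the only assignment stated in the text.

```python
import re
from typing import NamedTuple, Optional


class OrfName(NamedTuple):
    organism: str               # "SP" stands for Sz. pombe
    chromosome: Optional[str]   # None when the letter is not defined in the text
    cosmid: str                 # e.g. "C32F12"
    number: int                 # consecutive ORF number within the cosmid
    complementary: bool         # True when the name ends in "c"


# Only B = chromosome II is given in the text; other letters stay unmapped.
CHROMOSOME_LETTERS = {"B": "II"}

ORF_NAME_RE = re.compile(r"^(SP)([A-Z])(C\d+[A-Z]\d+)\.(\d{2})(c?)$")


def parse_orf_name(name: str) -> OrfName:
    """Split a systematic ORF name such as 'SPBC32F12.01c' into its parts."""
    match = ORF_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised ORF name: {name!r}")
    organism, letter, cosmid, number, strand = match.groups()
    return OrfName(
        organism=organism,
        chromosome=CHROMOSOME_LETTERS.get(letter),
        cosmid=cosmid,
        number=int(number),
        complementary=(strand == "c"),
    )


print(parse_orf_name("SPBC32F12.01c"))
```

For example, `parse_orf_name("SPBC32F12.14")` reports ORF number 14 on cosmid C32F12 with `complementary=False`.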

The database scan for similar sequences was done using the BLAST (Altschul et al., 1997) and FASTA (Pearson and Lipman, 1988) programs (parameters: BLOSUM62 matrix for BLAST; and Ktup=2 for FASTA). Multiple-sequence alignments were obtained using the CLUSTALW program (Thompson et al., 1994) or PILEUP (GCG package). Protein patterns (motifs) were identified by the ProfileScan and ScanProsite programs of the ExPASy WWW server (Appel et al., 1994) in the PROSITE database of protein sites and patterns (Bairoch et al., 1997). Putative transmembrane domains were defined using the TMAP program (Persson and Argos, 1994). The results were compared with the analysis of the sequence performed at the Sanger Centre.
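The Ktup parameter sets the length of the exact-match words that FASTA uses to seed a comparison. The Python sketch below illustrates only that word-lookup idea: it indexes the k-tuples of a query and reports the diagonal of the comparison matrix that collects the most shared words. It is a simplified illustration, not the FASTA program, and the function name and test peptides are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Optional, Tuple


def best_diagonal(query: str, subject: str, ktup: int = 2) -> Optional[Tuple[int, int]]:
    """Return (diagonal, hits) for the diagonal sharing the most k-tuples.

    The diagonal is the subject offset minus the query offset of a shared
    word. None is returned when the two sequences share no k-tuple.
    """
    # Index every k-tuple of the query by its starting offset.
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)

    # Count, per diagonal, the words shared with the subject.
    diagonals = Counter()
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], ()):
            diagonals[j - i] += 1

    if not diagonals:
        return None
    return diagonals.most_common(1)[0]


# Toy peptides: the query occurs in the subject at an offset of four residues.
print(best_diagonal("FPCNQF", "AALGFPCNQFKK", ktup=2))
```

A smaller Ktup finds more seed words and is therefore more sensitive but slower, which is why Ktup=2 is a common choice for protein searches.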

RESULTS AND DISCUSSION

Sequence analysis

Cosmid c32F12 contained an insert of 40 kb from the right arm of chromosome II of Sz. pombe. The nucleotide sequence of the insert was determined and the sequence of 39 648 bp was analysed as described in Materials and Methods. A large portion of the insert of cosmid c32F12 overlaps with the insert of cosmid c19C7, sequenced at the laboratory of C. Gaillardin (unpublished), which, according to the published map (Hoheisel et al., 1993), should not be near c32F12.

The sequenced region has an overall G+C content of 36·5%, while the coding region alone has a slightly higher G+C content of 39·9%. The sequenced segment carries 15 ORFs (two of them partial) longer than 300 bp and one tRNA gene (Figure 1). Of the 15 ORFs, six correspond to previously identified genes; thus, 36·5% of the sequence was already available in the databases. The 15 coding sequences cover 54·1% of the total sequence, a value much lower than the 72% described for Saccharomyces cerevisiae (Dujon, 1996) and in accordance with the lower gene density expected for the fission yeast.
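Both statistics in this paragraph are simple ratios. The Python sketch below shows a G+C calculation on a toy string and recomputes the coding fraction from the ORF and intron coordinates listed in Table 1. It assumes that coding length means ORF span minus predicted introns; that is our reading of the figure, not a definition given by the authors, and the function names are illustrative.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int]  # inclusive (start, end) coordinates in bp

SEGMENT_LENGTH = 39_648

# ORF coordinates transcribed from Table 1 (both partial ORFs included).
ORF_SPANS = [
    (1, 390), (1817, 2778), (3910, 4386), (6309, 7981), (8061, 8866),
    (9805, 10933), (11155, 12177), (12846, 13346), (15428, 16120),
    (17356, 19020), (22476, 23486), (25588, 26566), (27355, 27993),
    (28188, 34064), (34847, 39646),
]

# Predicted intron coordinates transcribed from Table 1.
INTRON_SPANS = [
    (2106, 2158),
    (6358, 6419), (6533, 6581), (6636, 6704), (7182, 7225), (7691, 7740), (7857, 7914),
    (8567, 8657), (8791, 8851),
    (9984, 10040), (10586, 10628),
    (25789, 25910), (26096, 26407), (26445, 26495),
    (27920, 27967),
]


def gc_content(sequence: str) -> float:
    """Percentage of G and C residues in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def span_length(span: Span) -> int:
    start, end = span
    return end - start + 1


def coding_percentage(orfs: Iterable[Span], introns: Iterable[Span], total: int) -> float:
    """Percentage of the segment covered by exonic ORF sequence."""
    coding = sum(map(span_length, orfs)) - sum(map(span_length, introns))
    return 100.0 * coding / total


print(f"{gc_content('GGCCAATT'):.1f}% G+C in the toy sequence")
print(f"{coding_percentage(ORF_SPANS, INTRON_SPANS, SEGMENT_LENGTH):.1f}% coding")
```

Under that assumption the coordinates reproduce the coding fraction quoted in the text.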

ORF analysis

Figure 2. Local alignment of the Sz. pombe SPBC32F12.01c protein with neutral sphingomyelinase from human (GenBank Accession Number: AJ222801) and rat (AJ222800) and homologues from S. cerevisiae (U18778) and C. elegans (Z82060) showing (underlined) the conserved ATP/GTP-binding site motif (Saraste et al., 1990).

SPBC32F12.01c. Starting at the centromere-proximal end of the sequenced segment, the first ORF is the partial SPBC32F12.01c ORF that extends beyond the left part of the segment. Homology searches revealed that the putative protein encoded by this ORF shows significant similarities to human and mouse neutral sphingomyelinase (GenBank Accession Numbers AJ222801 and AJ222800). In mammalian cells, two sphingomyelinases (SMase; E.C. 3.1.4.12), the lysosomal acid sphingomyelinase and the plasma membrane-bound neutral sphingomyelinase, determine the major route of sphingomyelin degradation in a phospholipase hydrolysis reaction, yielding ceramides and phosphocholine. Activation of the ‘sphingomyelin pathway’ by SMases has been described to increase the production of ceramide, which subsequently triggers signalling pathways leading to either cell proliferation and differentiation or to apoptosis (Tomiuk et al., 1998). In addition to mammalian SMases, the SPBC32F12.01c gene product also showed similarity to proteins of unknown function from S. cerevisiae (Yer019w) and Caenorhabditis elegans (T27F6.6). An ATP/GTP-binding site motif A (P-loop) (Saraste et al., 1990) is shared by this group of proteins (Figure 2).


SPBC32F12.02. This ORF is identical to the coding region of rec14+, a gene involved in an early step of meiotic recombination in Sz. pombe (Evans et al., 1997). The encoded protein contains six Trp–Asp (WD) repeat motifs found in the G-β-transducin family of proteins, including the S. cerevisiae Ski8 (Rec103) protein. β-Transducin is one of the three subunits (α, β and γ) of the guanine nucleotide-binding proteins which act as intermediates in the transduction of signals generated by transmembrane receptors (Gilman, 1987).

Figure 3. Multiple sequence alignment. The CLUSTAL program was used to align the deduced protein sequence of SPBC32F12.03c with glutathione peroxidases (GSHPx) from various organisms: human (GenBank Accession Number: Y00483), S. cerevisiae (U22446), C. reinhardii (AF014927) and E. coli (M14031). Black boxes indicate identical residues in at least three sequences. Dashes denote gaps introduced to improve alignment. The two signature patterns of GSHPx are underlined. *Position of the catalytic active site selenocysteine residue.

SPBC32F12.03c. This ORF, recently designated as gene gpx1+, codes for a putative glutathione peroxidase (GSHPx) involved in the oxidative stress response (EMBL Accession Number AB012395). The gpx1+ DNA sequence was found to be identical to the SPBC32F12.03c sequence, except for the insertion of a single C nucleotide in the 3′ non-coding region of the gpx1+ sequence (at position 1006) which is not present in the SPBC32F12.03c sequence. The SPBC32F12.03c predicted protein shows strong homology to glutathione peroxidases (EC 1.11.1.9), which catalyse the reduction of hydroxyperoxides by glutathione. Their main function is to protect against the damaging effect of endogenously formed hydroxyperoxides. Selenium, in the form of selenocysteine, is part of the catalytic site of GSHPx (Stadtman, 1990). The sequence around the selenocysteine residue is moderately well conserved in GSHPx proteins and related proteins and can be used as a signature pattern. This motif ([GN]–[RKHNFYC]–x–[LIVMFC]–[LIVMF](2)–x–N–[VT]–x–[STC]–x–C–[GA]–x–T; where C is the active site selenocysteine residue) appears at positions 24–39 in SPBC32F12.03cp (Figure 3). A second signature for this family of proteins, consisting of a highly conserved octapeptide ([LIV]–[AGD]–F–P–[CS]–[NG]–Q–F) located in the central section of these proteins, is also present in SPBC32F12.03cp (positions 60–67). Downstream from SPBC32F12.03cp, and in the same direction of transcription, a tRNA-Thr gene (AGT anticodon) was detected.
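Signature patterns written in this PROSITE-style notation translate directly into regular expressions. The Python sketch below converts the two GSHPx signatures quoted above and scans a protein for them. It handles only the syntax elements that occur in this paper (residue sets, `x` wildcards and `(n)` repeats). The test peptide is a made-up sequence built to match the first signature, not the SPBC32F12.03c protein, and the function names are illustrative.

```python
import re
from typing import List, Tuple

GSHPX_SIGNATURE_1 = "[GN]-[RKHNFYC]-x-[LIVMFC]-[LIVMF](2)-x-N-[VT]-x-[STC]-x-C-[GA]-x-T"
GSHPX_SIGNATURE_2 = "[LIV]-[AGD]-F-P-[CS]-[NG]-Q-F"

# One pattern element: a residue set, a single residue or x, optionally repeated.
ELEMENT_RE = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a regular expression."""
    tokens = []
    # Accept both ASCII hyphens and en dashes as element separators.
    for element in pattern.replace("\u2013", "-").split("-"):
        match = ELEMENT_RE.fullmatch(element.strip())
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residues, repeat = match.groups()
        token = "." if residues == "x" else residues
        if repeat is not None:
            token += "{" + repeat + "}"
        tokens.append(token)
    return "".join(tokens)


def find_motif(pattern: str, protein: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of every match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end()) for m in regex.finditer(protein)]


# Hypothetical peptide constructed to contain the first signature.
toy_protein = "MAAA" + "GKSVLIANVASKCGFT" + "KK"
print(prosite_to_regex(GSHPX_SIGNATURE_2))
print(find_motif(GSHPX_SIGNATURE_1, toy_protein))
```

Positions are reported as 1-based and inclusive so that they can be compared directly with residue ranges such as 24–39 in the text.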


SPBC32F12.04. The ORF SPBC32F12.04 DNA sequence is identical to the tug1+ gene sequence, except for two nucleotide differences in the 5′ non-coding sequence (at positions 2 and 122) and nucleotide changes in the last three positions of the 3′ non-coding region of the submitted tug1+ DNA sequence (GenBank Accession Number M63447). The tug1+ gene, which contains six putative introns (Table 1), encodes γ-tubulin. This essential protein is specifically found at microtubule organizing centres such as the spindle poles or the centrosome, suggesting that it is involved in the minus-end nucleation of microtubule assembly (Stearns et al., 1991; Horio et al., 1991).

SPBC32F12.05c. This ORF codes for a putative protein of 217 amino acids. The FASTA search revealed weak homologies to the proteins of unknown function F53B7.3 from C. elegans and Yjr050wp (Utr3p) from S. cerevisiae. SPBC32F12.05c contains two introns, as predicted by computer analysis (Table 1). Consensus splice donor (GTAAGG, at positions 8846–8851 for the first intron; and GTACGT, at positions 8652–8657 for the second intron) and branchpoint and acceptor sequences (CTAACCCCTTTGTTATAG, at positions 8791–8808 for the first intron; CTAACCATGAATAG, at positions 8567–8580 for the second intron) were identified.
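Because SPBC32F12.05c lies on the complementary strand, each splice donor has higher cosmid coordinates than its branchpoint and acceptor sequence. The Python sketch below checks that the quoted motifs are consistent with their coordinates and derives the intron boundaries from them. The data structure and function names are illustrative; the motifs and positions are those given above.

```python
from typing import NamedTuple, Tuple


class SpliceSignals(NamedTuple):
    donor: str
    donor_span: Tuple[int, int]      # inclusive cosmid coordinates
    acceptor: str                    # branchpoint through the acceptor AG
    acceptor_span: Tuple[int, int]   # inclusive cosmid coordinates


SPBC32F12_05C = [
    SpliceSignals("GTAAGG", (8846, 8851), "CTAACCCCTTTGTTATAG", (8791, 8808)),
    SpliceSignals("GTACGT", (8652, 8657), "CTAACCATGAATAG", (8567, 8580)),
]


def is_consistent(signals: SpliceSignals) -> bool:
    """Motif lengths must equal their coordinate spans; introns run GT...AG."""
    donor_len = signals.donor_span[1] - signals.donor_span[0] + 1
    acceptor_len = signals.acceptor_span[1] - signals.acceptor_span[0] + 1
    return (
        len(signals.donor) == donor_len
        and len(signals.acceptor) == acceptor_len
        and signals.donor.startswith("GT")
        and signals.acceptor.endswith("AG")
    )


def intron_span(signals: SpliceSignals) -> Tuple[int, int]:
    """Lowest and highest coordinates covered by the intron on the cosmid."""
    coordinates = signals.donor_span + signals.acceptor_span
    return min(coordinates), max(coordinates)


for signals in SPBC32F12_05C:
    start, end = intron_span(signals)
    print(f"{start}-{end} ({end - start + 1} bp), consistent: {is_consistent(signals)}")
```

The two spans obtained this way, 8791–8851 and 8567–8657, are the intron coordinates listed for this ORF in Table 1.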

SPBC32F12.06. The ORF SPBC32F12.06 corresponds to the essential gene pch1+, which encodes a cyclin C homologue. This gene was isolated in a two-hybrid screening for proteins that interact with Cdc2p (Furnari et al., 1997). The cyclin box region of pch1+ protein shares the highest sequence identity with mammalian and Drosophila C-type cyclins (approximately 33% identity). Pch1p is significantly less similar to Mcs2p (19% identity), a second essential member of the C-type cyclin family in Sz. pombe. It has been described that the pch1+ ORF is encoded by a single exon (Furnari et al., 1997). In contrast, our sequence analysis predicts that the pch1+ coding sequence is interrupted by two introns of 57 bp and 43 bp in length (Table 1). In addition, at the protein level our sequence predicts a valine instead of a glycine residue at amino acid position 157.

SPBC32F12.07c. This ORF encodes a putative protein of 340 residues which shows weak similarity (FASTA score: 125) to a hypothetical 40·4 kDa protein (Q20846) encoded by chromosome III of C. elegans (Wilson et al., 1994). Comparison of the two proteins revealed that sequence similarities are essentially confined to their N-terminal regions (positions 3–63 of SPBC32F12.07c and 29–86 of the C. elegans protein) with conservation of a cysteine-rich domain, called the C3HC4 zinc-finger or ‘RING’ finger, known to participate mainly in protein–protein interactions (Borden and Freemont, 1996).

SPBC32F12.08c. This ORF encodes a putative polypeptide of 19·5 kDa which shows no significant homology to known proteins or ESTs. The closest homologue found (FASTA optimal score: 149) was a replication factor (called Rep-like; TrEMBL Accession Number: G3068583) encoded by the Dictyostelium discoideum nuclear plasmid Ddp5 (Rieben et al., 1998). The SPBC32F12.08c gene product has a high content of glutamine (12·7%), glutamic acid (11·5%), and serine (10·5%) residues.
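Residue percentages such as these are obtained by counting each amino acid and dividing by the protein length. The Python sketch below does this for a short made-up peptide; it does not use the SPBC32F12.08c sequence, and the function name is illustrative.

```python
from collections import Counter
from typing import Dict


def amino_acid_composition(protein: str) -> Dict[str, float]:
    """Percentage of each residue, ordered from most to least frequent."""
    residues = protein.upper()
    if not residues:
        raise ValueError("empty protein sequence")
    counts = Counter(residues)
    return {aa: 100.0 * n / len(residues) for aa, n in counts.most_common()}


# Hypothetical peptide used only to demonstrate the calculation.
composition = amino_acid_composition("QQEESQAK")
for residue, percent in composition.items():
    print(f"{residue}: {percent:.1f}%")
```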

SPBC32F12.09. This ORF corresponds to the previously sequenced rum1+ gene, which encodes an inhibitor of cdc2/cyclin B complexes (Correa-Bordes and Nurse, 1995). This CDK inhibitor is important for regulating the G1 phase of the fission yeast cell cycle (Moreno and Nurse, 1994; Correa-Bordes and Nurse, 1995). The DNA sequence of SPBC32F12.09 is 99% identical to the rum1+ sequence (GenBank Accession Number: X77730). Two consecutive base pair changes (cosmid coordinates: 15 644–15 645) led to two different amino acids (I72→M and V73→L).

SPBC32F12.10. This ORF encodes a protein of 60·6 kDa which exhibits extensive sequence similarity over its entire length to phosphoglucomutases from different organisms, the best homologue, with 53·4% identity in a 567 amino acid region, being a phosphoglucomutase A from D. discoideum (TrEMBL Accession Number: Q23919). Phosphoglucomutase (PGM; EC 5.4.2.2) is an enzyme responsible for the conversion of D-glucose 1-phosphate into D-glucose 6-phosphate that participates in both the breakdown and synthesis of glucose. The catalytic mechanism of PGM involves the formation of a phosphoserine intermediate (Dai et al., 1992). The sequence around the serine residue ([GSA]–[LIVM]–x–[LIVM]–[ST]–[PGA]–S–H–x–P–x(4)–[GNHE]; S being the phosphoserine residue) is well conserved among PGMs and can be found in SPBC32F12.10p at positions 108–121.

Table 1. Characteristics of open reading frames (ORFs) identified in cosmid c32F12.

| ORF name(a) | Position (bp) | Size (aa) | MW (kDa) | pI | Introns (bp) | Best homology(b) | FASTA opt. score | FASTA self score |
|---|---|---|---|---|---|---|---|---|
| SPBC32F12.01c | 1–390 | 130 | | | | Neutral sphingomyelinase, human | 342 | 904 |
| SPBC32F12.02 | 1817–2778 | 302 | 32·9 | 4·9 | 1 (2106–2158) | rec14+ (S.p.) | 2051 | 2051 |
| SPBC32F12.03c | 3910–4386 | 158 | 18·0 | 8·2 | | Putative glutathione peroxidase (S.p.) | 1078 | 1078 |
| tRNA-Thr | 5214–5285 | | | | | | | |
| SPBC32F12.04 | 6309–7981 | 446 | 49·9 | 5·9 | 1 (6358–6419); 2 (6533–6581); 3 (6636–6704); 4 (7182–7225); 5 (7691–7740); 6 (7857–7914) | tug1+ (S.p.) | 2916 | 2916 |
| SPBC32F12.05c | 8061–8866 | 217 | 25·6 | 6·1 | 1 (8567–8657); 2 (8791–8851) | Hypothetical protein F53B7.3 (C.e.) | 403 | 1441 |
| SPBC32F12.06 | 9805–10 933 | 342 | 38·3 | 5·9 | 1 (9984–10 040); 2 (10 586–10 628) | pch1+ (S.p.) | 2203 | 2203 |
| SPBC32F12.07c | 11 155–12 177 | 340 | 39·0 | 8·1 | | Hypothetical protein CEMSC22F (C.e.) | 125 | 2311 |
| SPBC32F12.08c | 12 846–13 346 | 166 | 19·5 | 8·6 | | Rep-like (D.d.) | 149 | 1101 |
| SPBC32F12.09 | 15 428–16 120 | 230 | 25·3 | 9·4 | | rum1+ (S.p.) | 1538 | 1545 |
| SPBC32F12.10 | 17 356–19 020 | 554 | 60·6 | 6·3 | | Phosphoglucomutase (D.d.) | 1937 | 3690 |
| SPBC32F12.11 | 22 476–23 486 | 336 | 35·9 | 6·5 | | gpd1+ (S.p.) | 2183 | 2183 |
| SPBC32F12.12c | 25 588–26 566 | 164 | 17·9 | 6·9 | 1 (25 789–25 910); 2 (26 096–26 407); 3 (26 445–26 495) | Hypothetical protein YMR071C (S.c.) | 416 | 1068 |
| SPBC32F12.13c | 27 355–27 993 | 197 | 22·6 | 9·9 | 1 (27 920–27 967) | Hypothetical protein C35E7.9 (C.e.) | 161 | 1262 |
| SPBC32F12.14 | 28 188–34 064 | 1958 | 225·8 | 6·0 | | Putative ubiquitin–protein ligase E3 component (S.p.) | 13 194 | 13 194 |
| SPBC32F12.15 | 34 847–39 646 | 1600 | 180·1 | 6·4 | | cyr1+ (S.p.) | 10 347 | 10 347 |

(a) c indicates complementary strand.
(b) Only the highest score is shown. S.p.=Schizosaccharomyces pombe; S.c.=Saccharomyces cerevisiae; C.e.=Caenorhabditis elegans; D.d.=Dictyostelium discoideum.

SPBC32F12.11. This ORF corresponds to the gpd1+ gene encoding glyceraldehyde 3-phosphate dehydrogenase (EC 1.2.1.12; Orlandi et al., 1996). FASTA analysis revealed 100% identity within the total 336 amino acids with the protein encoded by the submitted cDNA sequence (GenBank Accession Number: X85332).

SPBC32F12.12c. This ORF encodes a potential 164 amino acid protein which contains three predicted transmembrane spans and shows significant similarity (FASTA optimal score: 416) to an S. cerevisiae ORF (YMR071c) of unknown function (Bowman et al., 1997). The occurrence in the sequence of three pairs of Sz. pombe canonical splice donor and branch end acceptor sequences (GTATGC/CTAACTTATTGTAG, GTATGT/CTAACCTACTACCTTCAG and GTAAGT/TTAACTCTTTTAG; at positions 26 490–26 495/26 446–26 459, 26 402–26 407/26 096–26 113 and 25 905–25 910/25 789–25 801, respectively) predicts that the SPBC32F12.12c coding sequence is interrupted by three introns, which are not present in the S. cerevisiae YMR071c ORF (Table 1).

SPBC32F12.13c. The amino acid sequence of this ORF exhibited weak similarity (FASTA optimal score: 161) to the C. elegans hypothetical protein C35E7.9 (GenBank Accession Number: AF067216). The search for introns revealed the presence of a single putative intervening sequence (splice donor sequence GTATGG, at positions 27 962–27 967 and branch end acceptor sequence TTGACTAAAGTTTTTATTTAG at positions 27 920–27 940).

SPBC32F12.14. This ORF encodes a large protein of 1958 amino acids which can be unambiguously aligned to the ubiquitin–protein ligase E3 component (the recognition component of the N-end rule pathway) from human, mouse and budding yeast (GenBank Accession Numbers: AF061556; AF061555; X53747). The N-end rule pathway targets proteins for ubiquitin-dependent proteolysis according to a degradation signal which comprises a destabilizing amino-terminal residue and a specific internal lysine residue. This protein binds to proteins bearing amino-terminal residues that are destabilizing according to the N-end rule, but does not bind to otherwise identical proteins bearing stabilizing N-terminal residues (Bartel et al., 1990; Kwon et al., 1998). A zinc-finger motif found in ubiquitin hydrolases and other proteins was identified between amino acid positions 1827 and 1850.

SPBC32F12.15. The last ORF is partial and contains the sequence encoding the first 1600 out of 1692 residues of the previously known cyr1+ gene (Yamawaki-Kataoka et al., 1989). The DNA sequence is identical to the cyr1+ sequence (GenBank Accession Number: M24942). The cyr1+ gene encodes adenylate cyclase (E.C. 4.6.1.1), which converts ATP into the second messenger, cAMP, as part of many eukaryotic signal transduction pathways. In the fission yeast, adenylate cyclase plays essential roles in the regulation of cellular metabolism, including sexual differentiation in response to nutritional conditions and the onset of gluconeogenesis in response to glucose starvation (Maeda et al., 1990; Hoffman and Winston, 1991).

ACKNOWLEDGEMENTS

This work was supported by the European Commission in the framework of the European Schizosaccharomyces genome sequencing project (BIO-4-CT96-0159) and by the Comisión Interministerial de Ciencia y Tecnología (BIO97-1535-C04-CE). We are indebted to V. Wood, M. A. Rajandream and B. G. Barrell, of the Sanger Centre team (Cambridge, UK), for their help with the sequence analysis.

REFERENCES

Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

Appel, R. D., Bairoch, A. and Hochstrasser, D. F. (1994). A new generation of information retrieval tools for biologists: the example of the ExPASy WWW server. Trends Biochem. Sci. 19, 258–260.

Bairoch, A., Bucher, P. and Hofmann, K. (1997). The PROSITE database, its status in 1997. Nucleic Acids Res. 25, 217–221.

Bartel, B., Wünning, I. and Varshavsky, A. (1990). The recognition component of the N-end rule pathway. EMBO J. 9, 3179–3189.

Borden, K. L. and Freemont, P. S. (1996). The RING finger domain: a recent example of a sequence-structure family. Curr. Opin. Struct. Biol. 6, 395–401.


Bowman, S., Churcher, C., Badcock, K. et al. (1997). The nucleotide sequence of Saccharomyces cerevisiae chromosome XIII. Nature (suppl.) 387, 90–93.

Chen, T. and Zhang, M. Q. (1998). Pombe: a fission yeast gene-finding and exon–intron structure prediction system. Yeast 14, 701–710.

Correa-Bordes, J. and Nurse, P. (1995). p25rum1 orders S phase and mitosis by acting as an inhibitor of the p34cdc2 mitotic kinase. Cell 83, 1001–1009.

Dai, J. B., Liu, Y., Ray, W. J. and Konno, M. (1992). The crystal structure of muscle phosphoglucomutase refined at 2.7 Å resolution. J. Biol. Chem. 267, 6322–6337.

Dujon, B. (1996). The yeast genome project: what did we learn? Trends Genet. 12, 263–270.

Evans, D. H., Li, Y. F., Fox, M. E. and Smith, G. R. (1997). A WD repeat protein, Rec14, essential for meiotic recombination in Schizosaccharomyces pombe. Genetics 146, 1253–1264.

Furnari, B. A., Russell, P. and Leatherwood, J. (1997). pch1+, a second essential C-type cyclin gene in Schizosaccharomyces pombe. J. Biol. Chem. 272, 12 100–12 106.

Gilman, A. G. (1987). G proteins: transducers of receptor-generated signals. Ann. Rev. Biochem. 56, 615–649.

Hoffman, C. S. and Winston, F. (1991). Glucose repression of transcription of the Schizosaccharomyces pombe fbp1 gene occurs by a cAMP signaling pathway. Genes Dev. 5, 561–571.

Hoheisel, J. D., Maier, E., Mott, R., McCarthy, L., Grigoriev, A. V., Schalkwyk, L. C., Nizetic, D., Francis, F. and Lehrach, H. (1993). High resolution cosmid and P1 maps spanning the 14 Mb genome of the fission yeast S. pombe. Cell 73, 109–120.

Horio, T., Uzawa, S., Jung, M. K., Oakley, B. R., Tanaka, K. and Yanagida, M. (1991). The fission yeast γ-tubulin is essential for mitosis and is localized at microtubule organizing centers. J. Cell Sci. 99, 693–700.

Kwon, Y. T., Reiss, Y., Fried, V. A., Hershko, A., Yoon, J. K., Gonda, D. K., Sangan, P., Copeland, N. G., Jenkins, N. A. and Varshavsky, A. (1998). The mouse and human genes encoding the recognition component of the N-end rule pathway. Proc. Natl Acad. Sci. USA 95, 7898–7903.

Maeda, T., Mochizuki, N. and Yamamoto, M. (1990). Adenylyl cyclase is dispensable for vegetative cell growth in the fission yeast Schizosaccharomyces pombe. Proc. Natl Acad. Sci. USA 87, 7814–7818.


Moreno, S. and Nurse, P. (1994). Regulation of progression through the G1 phase of the cell cycle by the rum1+ gene. Nature 367, 236–242.

Orlandi, I., Popolo, L., Cavadini, P. and Vai, M. (1996). Cloning and characterization of a cDNA encoding glyceraldehyde-3-phosphate dehydrogenase from the fission yeast Schizosaccharomyces pombe. Rend. Fis. Acc. Lincei 7, 315–322.

Pearson, W. R. and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA 85, 2444–2448.

Persson, B. and Argos, P. (1994). Prediction of transmembrane segments in proteins utilising multiple sequence alignments. J. Mol. Biol. 237, 182–192.

Rieben, W. K., Gonzales, C. M., Gonzales, S. T., Pilkington, K. J., Kiyosawa, H., Hughes, J. E. and Welker, D. L. (1998). Dictyostelium discoideum nuclear plasmid Ddp5 is a chimera related to the Ddp1 and Ddp2 plasmid families. Genetics 148, 1117–1125.

Sambrook, J., Fritsch, E. and Maniatis, T. (Eds) (1989). Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory Press, New York.

Saraste, M., Sibbald, P. R. and Wittinghofer, A. (1990). The P-loop: a common motif in ATP- and GTP-binding proteins. Trends Biochem. Sci. 15, 430–434.

Stadtman, T. C. (1990). Selenium biochemistry. Ann. Rev. Biochem. 59, 111–127.

Stearns, T., Evans, L. and Kirschner, M. (1991). Gamma-tubulin is a highly conserved component of the centrosome. Cell 65, 825–836.

Thompson, J. D., Higgins, D. G. and Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

Tomiuk, S., Hofmann, K., Nix, M., Zumbansen, M. and Stoffel, W. (1998). Cloned mammalian neutral sphingomyelinase: functions in sphingolipid signaling? Proc. Natl Acad. Sci. USA 95, 3638–3643.

Wilson, R., Ainscough, R., Anderson, K. et al. (1994). 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans. Nature 368, 32–38.

Yamawaki-Kataoka, Y., Tamaoki, T., Choe, H.-R., Tanaka, H. and Kataoka, T. (1989). Adenylate cyclases in yeast: a comparison of the genes from Schizosaccharomyces pombe and Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 86, 5693–5697.
