

Proc. Natl. Acad. Sci. USA
Vol. 85, pp. 7307-7311, October 1988
Genetics

Human genes for complement components C1r and C1s in a close tail-to-tail arrangement

(domain structure/serine protease/tissue specificity)

H. KUSUMOTO*, S. HIROSAWA*, J. P. SALIER*†, F. S. HAGEN‡, AND K. KURACHI*§

*Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109-0618; and ‡ZymoGenetics, Inc., 2121 North 35th Street, Seattle, WA 98103

Communicated by James V. Neel, June 20, 1988

ABSTRACT Complementary DNA clones for human C1s were isolated from cDNA libraries that were prepared with poly(A)+ RNAs of human liver and HepG2 cells. A clone with the largest cDNA insert of 2664 base pairs (bp) was analyzed for its complete nucleotide sequence. It contained 202 bp of a 5' untranslated region, 45 bp of coding for a signal peptide (15 amino acid residues), 2019 bp for complement component C1s zymogen (673 amino acid residues), 378 bp for a 3' untranslated region, a stop codon, and 17 bp of a poly(A) tail. The amino acid sequence of C1s was 40.5% identical to that of C1r, with excellent matches of tentative disulfide bond locations conserving the overall domain structure of C1r. DNA blotting and sequencing analyses of genomic DNA and of an isolated genomic DNA clone clearly showed that the human genes for C1r and C1s are closely located in a "tail-to-tail" arrangement at a distance of about 9.5 kilobases. Furthermore, RNA blot analyses showed that both C1r and C1s genes are primarily expressed in liver, whereas most other tissues expressed both C1r and C1s genes at much lower levels (less than 10% of that in liver). Multiple molecular sizes of specific mRNAs were observed in the RNA blot analyses for both C1r and C1s, indicating that alternative RNA processing(s), likely an alternative polyadenylylation, might take place for both genes.
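The segment lengths quoted in the abstract can be checked against the total insert length with a few lines of arithmetic. The sketch below is only an illustration: the dictionary keys are descriptive labels chosen here, and the 3-bp length of the stop codon is assumed from the standard triplet code rather than stated in the abstract.

```python
# Segment lengths (bp) of the largest C1s cDNA insert, as quoted in the abstract.
# The stop codon length (3 bp) is an assumption based on the triplet code.
segments = {
    "5' untranslated region": 202,
    "signal peptide": 45,
    "C1s zymogen": 2019,
    "stop codon": 3,
    "3' untranslated region": 378,
    "poly(A) tail": 17,
}

total_bp = sum(segments.values())

# Each amino acid residue is encoded by one 3-bp codon.
signal_residues = segments["signal peptide"] // 3
zymogen_residues = segments["C1s zymogen"] // 3

print(total_bp)          # 2664
print(signal_residues)   # 15
print(zymogen_residues)  # 673
```

Under these assumptions the parts add up to the reported 2664 bp, and the coding lengths agree with the reported 15-residue signal peptide and 673-residue zymogen.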

The human complement system, which is composed of the classical and alternative pathways, involves about two dozen plasma proteins, including proteases and cofactors (1-3). These proteins are sequentially activated to form a lytic complex that attacks the foreign cell. Anaphylatoxic and vasoactive peptides are also generated during the course of this reaction (4). C1r and C1s are single-chain plasma glycoproteins of about 85 kDa. These proteins are highly homologous to each other, and both are subcomponents of the complement C1 complex in which two molecules of each C1r and C1s form a complex with one molecule of C1q, another subcomponent of C1, in the presence of calcium ions (2, 3). C1q in the C1 complex binds to the antigen-antibody immune complex through the constant region of immunoglobulin heavy chains, resulting in autoactivation of C1r (5). The activated C1r then proteolytically activates C1s, which in turn activates C2 and C4 in the complement cascade reactions. Upon proteolytic activation, both C1r and C1s are converted to two-chain-form proteases comprised of a heavy chain (58 kDa) and a light chain (27 kDa) and are readily inhibited by C1 inhibitor, forming a stoichiometric complex.

Recently, an isolation and characterization of C1r and its unique domain structures have been reported (6). In addition to the serine protease module contained in the light chain of C1r, the heavy chain of C1r contains five distinct structural domains. Domains I and III are homologous repeats, and domain II is an epidermal growth factor precursor-like sequence that is also found in many other proteins such as blood coagulation factors and low density lipoprotein receptors. Domains IV and V are homologous repeats that are also found in several proteins, including factor XIIIb; complement factors B, H, and C4BP; β2-glycoprotein I; haptoglobin; and interleukin-2 receptor (6-8). The complete amino acid sequence of the heavy chain and the partial amino acid sequence of the light chain of C1s have been reported (9, 10).

In this report we first describe an isolation and nucleotide sequence¶ of an essentially full-length cDNA for human C1s that complements well the cDNA data recently reported for this protein (11, 12). Then we describe the close location of the C1r and C1s genes in a "tail-to-tail" orientation. We also report that these two genes are expressed primarily in the liver but also in many other tissues with virtually identical tissue specificities.

MATERIALS AND METHODS

Materials. Construction of a human liver cDNA library in pUC13 plasmid vector and a HepG2 cDNA library made in λgt11 phage have been described (13, 14). The C1r cDNA (designated HC1r2200) used in the present study also has been described (6). Human genomic DNA library constructed in λ phage (Charon 4A) was a gift of T. Maniatis (Harvard University). The DNA sequencing kit with Sequenase and the pTZ18 vector were obtained from United States Biochemical (Cleveland). Deoxy and dideoxy nucleotides and restriction enzymes were obtained from Boehringer Mannheim. Phage T4 DNA ligase, Klenow fragment, RNase A, and DNA-modifying enzymes were from Bethesda Research Laboratories. GeneScreenPlus nylon membranes were from New England Biolabs. Radiolabeled nucleotides (³⁵S-substituted dATP, [³²P]dCTP, and [³²P]ATP) were from Amersham. Synthetic oligonucleotides were obtained from the oligonucleotide synthesis service laboratory of Howard Hughes Medical Institute at the University of Washington. Various tissues of an adult male baboon used in RNA preparation were provided by Judy Johnson through the organ distribution program of the Regional Primate Center at University of Washington, Seattle.

Screening and Characterization of Human Liver and HepG2 cDNAs. Screening of human liver cDNA library in pUC13 plasmid with a ³²P-labeled oligonucleotide probe (about 1.5 × 10⁹ cpm/µg) was carried out by using 2.5 × 10⁶ cpm/ml at 55°C in a modification of the procedure previously described (13, 15). A cDNA insert isolated from a strongly hybridizing clone (designated phC1s450) was then radiolabeled (1 × 10⁹ cpm/µg) by an oligonucleotide-priming method (16) and used to screen the HepG2 cDNA library with 5 × 10⁵ cpm/ml. Positive clones obtained in the primary screening were subjected to three rounds of plaque purification. Liquid phage stocks of purified clones were prepared by the standard method (17). A clone, designated λhC1s2700, containing the largest insert [2.7 kilobases (kb)] was subjected to sequence analysis by the dideoxy chain-termination method with Sequenase. In this experiment, pTZ18 vectors containing a set of end-staggered fragments prepared by sequential deletion of the insert DNA with BAL-31 were used as sequencing templates. Six percent polyacrylamide gels with gradient buffer were used in the sequencing experiments. The dried gels were exposed to x-ray films for 16-48 hr.

†On leave of absence from Institut National de la Santé et de la Recherche Médicale, Bois-Guillaume, France.
§To whom reprint requests should be addressed.
¶The sequence reported in this paper is being deposited in the EMBL/GenBank data base (IntelliGenetics, Mountain View, CA, and Eur. Mol. Biol. Lab., Heidelberg) (accession no. J04080).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Screening and Characterization of Human Genomic DNA Clones for C1r and C1s. The human genomic library made in λ Charon 4A vector was screened with the C1r cDNA probe (5 × 10⁵ cpm/ml). Aliquots (1 µl) of purified phage stocks of positive clones were then spotted on a pregrown Escherichia coli (LE 392) lawn in a grid prepared on duplicated agar plates. The phage plaques produced were then blotted to nitrocellulose filters. The filters were treated by a standard procedure (17) for hybridization with either radiolabeled C1r or C1s cDNA probe. Dot-blot analysis of the purified phage DNAs (50-ng aliquots) obtained from liquid cultures was carried out by a standard method (17) using the ³²P-labeled whole cDNAs for C1r and C1s or 5'- and 3'-specific cDNAs as probes (5 × 10⁵ cpm/ml).

DNA Blot Analysis of Human Genomic DNA and Cloned Genomic DNA Fragments. Aliquots (10 µg) of human genomic DNA prepared from peripheral blood of a normal individual (17) were digested with various restriction enzymes (20 units) or with mixtures of two selected enzymes overnight at 37°C. The digests were subjected to 0.8% agarose gel electrophoresis followed by blotting of the gels against GeneScreenPlus membranes. Aliquots (3 µg) of cloned genomic DNAs also were used in similar analyses. Membranes then were hybridized to either C1s or C1r cDNA probe and were exposed to x-ray films (X-Omat AR, Kodak; sandwiched with two enhancing screens) for 5 days for the whole genomic DNAs or 20 hr for the cloned DNAs. Based on these DNA blot analyses, a restriction map of the region connecting C1r and C1s gene loci was constructed.

RNA Blot Analysis. Total RNAs of various baboon tissues were prepared by the guanidinium isothiocyanate procedure, followed by a centrifugation for 21 hr through a 5.7 M CsCl cushion (17). RNA preparations (20 µg for each tissue) were electrophoresed in 1.5% agarose gel containing 6.7% formaldehyde in 20 mM phosphate buffer (pH 7.0). The agarose gels then were blotted onto a GeneScreenPlus membrane. Prehybridization, hybridization with radiolabeled cDNA probe(s), washing, dehybridization, and rehybridization were carried out as described for GeneScreenPlus membrane by the manufacturer. Chicken actin cDNA probe was used to confirm the presence of RNAs on the blots.

RESULTS AND DISCUSSION

About 3.5 × 10⁵ recombinant E. coli colonies (strain TB-1) of human liver cDNA library in pUC13 were screened with an oligonucleotide probe (5'-ATYTCNCCRTACATTGTTGGYTC-3', where Y is T or C, N is G or A or T or C, and R is G or A) that was synthesized on the basis of the amino-terminal sequence (Glu-Pro-Thr-Met-Tyr-Gly-Glu-Ile) of the human C1s heavy chain (9). One positive clone, designated phC1s450, was obtained. The cDNA insert [about 450 base pairs (bp) in length] of this clone was then used to screen about 2.5 × 10⁶ plaque-forming units of a HepG2 cDNA library in λgt11, resulting in seven strongly hybridizing positives. One clone, designated λhC1s2700, contained the largest insert of about 2.7 kb. The insert DNA was released with EcoRI from the phage arms and subcloned into pUC18 and into pTZ18 and was subjected to sequencing analyses. All sequences were analyzed multiple times in independent experiments, and any ambiguous sequences were further analyzed by sequencing both strands with specifically synthesized oligonucleotides as primers. The complete sequence is shown in Fig. 1. This cDNA was found to be 2664 bp in length, containing 45 bp for a signal peptide of 15 amino acids and 2019 bp for the C1s mature protein with 673 amino acids. Our data successfully corrected two artifacts, an inverted sequence (59 bp) in the 5'-end untranslated region and an insertion of a short foreign sequence of 30 bp in the protein coding region, which were recently described for the cDNA sequence by MacKinnon et al. (11). Present data, furthermore, filled a gap of sequence of about 100 bp spanning the 3'-end region of a cDNA clone for C1s that was recently reported by Tosi et al. (12).

Human C1s was found to have an amino acid composition of Asp₄₁Asn₃₄Thr₃₃Ser₄₂Glu₅₃Gln₁₉Pro₄₉Gly₆₁Ala₃₅Val₅₂Met₁₂Ile₃₀Leu₃₈Tyr₃₁Phe₃₁Lys₃₇His₁₁Arg₂₆Cys₂₆Trp₁₂ with a calculated Mr of 74,891 without carbohydrates. The amino acid sequences of C1s and C1r were found to be identical at 40.5% of the positions, including all of the 13 potential disulfide bonds as well as the essential residues involved in the formation of the protease-active sites of these proteins (data not shown). This strongly indicates that the overall domain structure of C1r is also retained in the C1s molecule.

The amino-terminal sequence of C1s is reported to be Glu-Pro-Thr (9), whereas the amino-terminal residue of C1r is not known because its α-amino group is apparently blocked and undetectable in the sequence analysis (18). The chemical nature of the blocking group is not known at the present time. The signal peptide sequences of C1s and C1r were also found to have a significant similarity (presence of a tryptophan residue next to the first methionine residue in both C1r and C1s) in addition to the overall hydrophobic sequences.

These data clearly indicate that C1s and C1r genes were generated from a common ancestral gene, probably about 600 million years ago, when one assumes 1% divergence in amino acid sequence per 10 million years (19). After the gene duplication event, each gene has gone through an independent evolutionary process involving mainly multiple-point mutations, minor deletions, or insertions at their close loci, resulting in the modern genes for C1r and C1s molecules.

When 2 × 10⁶ recombinant phage plaques of a human genomic DNA library were screened with C1r cDNA (HC1r2200) as the radiolabeled probe, eight positive phage clones were obtained. These clones were then subjected to a cross-hybridization experiment with C1s cDNA probe (λhC1s2700), resulting in identification of one positive clone with an insert about 13 kb long. Designated λhgC1rs19, this clone hybridized with the 5'- and 3'-specific probes of C1s and with the 3'-specific probe of C1r but not with the 5'-specific probe of C1r (Fig. 2). In these experiments, the 5'- and 3'-specific probes for C1s (1591 bp and 1073 bp, respectively) and for C1r (528 bp and 637 bp, respectively) were prepared by digesting λhC1s2700 with EcoRI and Hha I or HC1r2200 with EcoRI and BamHI, respectively. The hybridization results suggested two possible relative orientations for the C1r and C1s genes: (i) a tandem arrangement with the C1r gene at the 5' edge or (ii) a tail-to-tail (3' end to 3' end) arrangement.

DNA blot analyses of the isolated DNA of λhgC1rs19 and of genomic DNAs after digestions with various restriction enzymes were also carried out. A DNA fragment of 4.6 kb generated by a digestion of the genomic DNA with Sst I strongly hybridized with both whole C1r or C1s cDNA probes and weakly hybridized with the 5'-specific probe of


1 5' GGGCCGGAGTTCCTGC

17 AGAGGGAGCGTCAAGGCCCTGTGCTGCTGTCCCTGGGGGCCAGAGGGGTTGCCCAGCATGCCCACTGGCAGGAGAGAGGGAACTGACCCACTTGCTCCTACCAGCTTCTGAAGGCTCCAAAGT

-15 Met Trp Cys Ile Val Leu Phe Ser Leu Leu Ala Trp Val Tyr Ala▾
140 CCGGAGTGCAGAAAGCCAGGACCAAGAGACAGGCAGCTCACCAGGGTGGACAAATCGCCAGAG ATG TGG TGC ATT GTC CTG TTT TCA CTT TTG GCA TGG GTT TAT GCT

1 Glu Pro Thr Met Tyr Gly Glu Ile Leu Ser Pro Asn Tyr Pro Gln Ala Tyr Pro Ser Glu Val Glu Lys Ser Trp Asp Ile Glu Val Pro Glu
248 GAG CCT ACC ATG TAT GGG GAG ATC CTG TCC CCT AAC TAT CCT CAG GCA TAT CCC AGT GAG GTA GAG AAA TCT TGG GAC ATA GAA GTT CCT GAA

32 Gly Tyr Gly Ile His Leu Tyr Phe Thr His Leu Asp Ile Glu Leu Ser Glu Asn Cys Ala Tyr Asp Ser Val Gln Ile Ile Ser Gly Asp Thr
341 GGG TAT GGG ATT CAC CTC TAC TTC ACC CAT CTG GAC ATT GAG CTG TCA GAG AAC TGT GCG TAT GAC TCA GTG CAG ATA ATC TCA GGA GAC ACT

63 Glu Glu Gly Arg Leu Cys Gly Gln Arg Ser Ser Asn Asn Pro His Ser Pro Ile Val Glu Glu Phe Gln Val Pro Tyr Asn Lys Leu Gln Val
434 GAA GAA GGG AGG CTC TGT GGA CAG AGG AGC AGT AAC AAT CCC CAC TCT CCA ATT GTG GAA GAG TTC CAA GTC CCA TAC AAC AAA CTC CAG GTG

94 Ile Phe Lys Ser Asp Phe Ser Asn Glu Glu Arg Phe Thr Gly Phe Ala Ala Tyr Tyr Val Ala Thr Asp Ile Asn Glu Cys Thr Asp Phe Val
527 ATC TTT AAG TCA GAC TTT TCC AAT GAA GAG CGT TTT ACG GGG TTT GCT GCA TAC TAT GTT GCC ACA GAC ATA AAT GAA TGC ACA GAT TTT GTA

125 Asp Val Pro Cys Ser His Phe Cys Asn Asn Phe Ile Gly Gly Tyr Phe Cys Ser Cys Pro Pro Glu Tyr Phe Leu His Asp Asp Met Lys Asn
620 GAT GTC CCT TGT AGC CAC TTC TGC AAC AAT TTC ATT GGT GGT TAC TTC TGC TCC TGC CCC CCG GAA TAT TTC CTC CAT GAT GAC ATG AAG AAT

156 Cys Gly Val Asn Cys Ser Gly Asp Val Phe Thr Ala Leu Ile Gly Glu Ile Ala Ser Pro Asn Tyr Pro Lys Pro Tyr Pro Glu Asn Ser Arg
713 TGC GGA GTT AAT TGC AGT GGG GAT GTA TTC ACT GCA CTG ATT GGG GAG ATT GCA AGT CCC AAT TAT CCC AAA CCA TAT CCA GAG AAC TCA AGG

187 Cys Glu Tyr Gln Ile Arg Leu Glu Lys Gly Phe Gln Val Val Val Thr Leu Arg Arg Glu Asp Phe Asp Val Glu Ala Ala Asp Ser Ala Gly
806 TGT GAA TAC CAG ATC CGG TTG GAG AAA GGG TTC CAA GTG GTG GTG ACC TTG CGG AGA GAA GAT TTT GAT GTG GAA GCA GCT GAC TCA GCG GGA

218 Asn Cys Leu Asp Ser Leu Val Phe Val Ala Gly Asp Arg Gln Phe Gly Pro Tyr Cys Gly His Gly Phe Pro Gly Pro Leu Asn Ile Glu Thr
899 AAC TGC CTT GAC AGT TTA GTT TTT GTT GCA GGA GAT CGG CAA TTT GGT CCT TAC TGT GGT CAT GGA TTC CCT GGG CCT CTA AAT ATT GAA ACC

249 Lys Ser Asn Ala Leu Asp Ile Ile Phe Gln Thr Asp Leu Thr Gly Gln Lys Lys Gly Trp Lys Leu Arg Tyr His Gly Asp Pro Met Pro Cys
992 AAG AGT AAT GCT CTT GAT ATC ATC TTC CAA ACT GAT CTA ACA GGG CAA AAA AAG GGC TGG AAA CTT CGC TAT CAT GGA GAT CCA ATG CCC TGC

280 Pro Lys Glu Asp Thr Pro Asn Ser Val Trp Glu Pro Ala Lys Ala Lys Tyr Val Phe Arg Asp Val Val Gln Ile Thr Cys Leu Asp Gly Phe
1085 CCT AAG GAA GAC ACT CCC AAT TCT GTT TGG GAG CCT GCG AAG GCA AAA TAT GTC TTT AGA GAT GTG GTG CAG ATA ACC TGT CTG GAT GGG TTT

311 Glu Val Val Glu Gly Arg Val Gly Ala Thr Ser Phe Tyr Ser Thr Cys Gln Ser Asn Gly Lys Trp Ser Asn Ser Lys Leu Lys Cys Gln Pro
1178 GAA GTT GTG GAG GGA CGT GTT GGT GCA ACA TCT TTC TAT TCG ACT TGT CAA AGC AAT GGA AAG TGG AGT AAT TCC AAA CTG AAA TGT CAA CCT

342 Val Asp Cys Gly Ile Pro Glu Ser Ile Glu Asn Gly Lys Val Glu Asp Pro Glu Ser Thr Leu Phe Gly Ser Val Ile Arg Tyr Thr Cys Glu
1271 GTG GAC TGT GGC ATT CCT GAA TCC ATT GAG AAT GGT AAA GTT GAA GAC CCA GAG AGC ACT TTG TTT GGT TCT GTC ATC CGC TAC ACT TGT GAG

373 Glu Pro Tyr Tyr Tyr Met Glu Asn Gly Gly Gly Gly Glu Tyr His Cys Ala Gly Asn Gly Ser Trp Val Asn Glu Val Leu Gly Pro Glu Leu
1364 GAG CCA TAT TAC TAC ATG GAA AAT GGA GGA GGT GGG GAG TAT CAC TGT GCT GGT AAC GGG AGC TGG GTG AAT GAG GTG CTG GGC CCG GAG CTG

404 Pro Lys Cys Val Pro Val Cys Gly Val Pro Arg Glu Pro Phe Glu Glu Lys Gln Arg▼Ile Ile Gly Gly Ser Asp Ala Asp Ile Lys Asn Phe
1457 CCG AAA TGT GTT CCA GTC TGT GGA GTC CCC AGA GAA CCC TTT GAA GAA AAA CAG AGG ATA ATT GGA GGA TCC GAT GCA GAT ATT AAA AAC TTC

435 Pro Trp Gln Val Phe Phe Asp Asn Pro Trp Ala Gly Gly Ala Leu Ile Asn Glu Tyr Trp Val Leu Thr Ala Ala His Val Val Glu Gly Asn
1550 CCC TGG CAA GTC TTC TTT GAC AAC CCA TGG GCT GGT GGA GCG CTC ATT AAT GAG TAC TGG GTG CTG ACG GCT GCT CAT GTT GTG GAG GGA AAC

466 Arg Glu Pro Thr Met Tyr Val Gly Ser Thr Ser Val Gln Thr Ser Arg Leu Ala Lys Ser Lys Met Leu Thr Pro Glu His Val Phe Ile His
1643 AGG GAG CCA ACA ATG TAT GTT GGG TCC ACC TCA GTG CAG ACC TCA CGG CTG GCA AAA TCC AAG ATG CTC ACT CCT GAG CAT GTG TTT ATT CAT

497 Pro Gly Trp Lys Leu Leu Glu Val Pro Glu Gly Arg Thr Asn Phe Asp Asn Asp Ile Ala Leu Val Arg Leu Lys Asp Pro Val Lys Met Gly
1736 CCG GGA TGG AAG CTG CTG GAA GTC CCA GAA GGA CGA ACC AAT TTT GAT AAT GAC ATT GCA CTG GTG CGG CTG AAA GAC CCA GTG AAA ATG GGA

528 Pro Thr Val Ser Pro Ile Cys Leu Pro Gly Thr Ser Ser Asp Tyr Asn Leu Met Asp Gly Asp Leu Gly Leu Ile Ser Gly Trp Gly Arg Thr
1829 CCC ACC GTC TCT CCC ATC TGC CTA CCA GGC ACC TCT TCC GAC TAC AAC CTC ATG GAT GGG GAC CTG GGA CTG ATC TCA GGC TGG GGC CGA ACA

559 Glu Lys Arg Asp Arg Ala Val Arg Leu Lys Ala Ala Arg Leu Pro Val Ala Pro Leu Arg Lys Cys Lys Glu Val Lys Val Glu Lys Pro Thr
1922 GAG AAG AGA GAT CGT GCT GTT CGC CTC AAG GCG GCA AGG TTA CCT GTA GCT CCT TTA AGA AAA TGC AAA GAA GTG AAA GTG GAG AAA CCC ACA

590 Ala Asp Ala Glu Ala Tyr Val Phe Thr Pro Asn Met Ile Cys Ala Gly Gly Glu Lys Gly Met Asp Ser Cys Lys Gly Asp Ser Gly Gly Ala
2015 GCA GAT GCA GAG GCC TAT GTT TTC ACT CCT AAC ATG ATC TGT GCT GGA GGA GAG AAG GGC ATG GAT AGC TGT AAA GGG GAC AGT GGT GGG GCC

621 Phe Ala Val Gln Asp Pro Asn Asp Lys Thr Lys Phe Tyr Ala Ala Gly Leu Val Ser Trp Gly Pro Gln Cys Gly Thr Tyr Gly Leu Tyr Thr
2108 TTT GCT GTA CAG GAT CCC AAT GAC AAG ACC AAA TTC TAC GCA GCT GGC CTG GTG TCC TGG GGG CCC CAG TGT GGG ACC TAT GGG CTC TAC ACA

652 Arg Val Lys Asn Tyr Val Asp Trp Ile Met Lys Thr Met Gln Glu Asn Ser Thr Pro Arg Glu Asp Stop
2201 CGG GTA AAG AAC TAT GTT GAC TGG ATA ATG AAG ACT ATG CAG GAA AAT AGC ACC CCC CGT GAG GAC TAA TCCAGATACATCCCACCAGCCTCTCCAAGGG

2301 TGGTGACCMATGCATTACCTTCTGTTCCTTATGATATTCTCATTATTTCATCATGACTGAAAGMAGACACGAGCGAATGATTTAAATAGMACTTGATTGTTGAGACGCCTTGCTAGAGGTAGA

2424 GTTTGATCATAGMATTGTGCTGGTCATACATTTGTGGTCTGACTCCTTGGGGTCCTTTCCCCGGAGTACCTATTGTAGATMACACTATGGGTGGGGCACTCCTTTCTTGCACTATTCCACAGG

2547 GATACCTT3TTCTTTGTTTCCTCTTTACCTGTTCAAA'TTCCATTTACTTGATCATTCTCAGTATCCACTGTCTATGTAC'TAAAGGATGTTTATAA A AAA 3'

FIG. 1. Complete nucleotide and amino acid sequence of λhC1s2700. Amino acid sequence of the leader sequence is shown as the reverse negative number. A small arrowhead indicates a tentative cleavage site by signal peptidase. A large arrowhead indicates the site of proteolytic activation by the activated C1r. Residues involved in the active site formation of C1s are shown by solid circles. A putative β-hydroxylated asparagine residue and potential carbohydrate attachment sites are marked with + and stars, respectively. The polyadenylylation signal, AATAAA, and potential alternative signal sites are shown with solid and dotted underlines, respectively.
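The relationship between the degenerate screening probe and the cloned sequence can be illustrated with a short script. This is a minimal sketch, not part of the original analysis: the probe string is the printed probe with its line-break space and hyphen removed, the target string is nucleotides 248-271 of Fig. 1 (the codons for Glu-Pro-Thr-Met-Tyr-Gly-Glu-Ile), and the helper names are hypothetical.

```python
# IUPAC codes used in the probe: Y = T or C, R = G or A, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "R": "Y", "Y": "R", "N": "N"}

# Probe as printed in the text (antisense strand, 23 nucleotides).
PROBE = "ATYTCNCCRTACATTGTTGGYTC"
# Sense strand of the cDNA, nucleotides 248-271 in Fig. 1.
TARGET = "GAGCCTACCATGTATGGGGAGATC"


def reverse_complement(seq):
    """Return the reverse complement, keeping IUPAC ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def degeneracy(seq):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


def mismatches(pattern, seq):
    """0-based positions where seq is not allowed by the degenerate pattern."""
    return [i for i, (p, s) in enumerate(zip(pattern, seq))
            if s not in IUPAC[p]]


sense = reverse_complement(PROBE)
print(sense)                      # GARCCAACAATGTAYGGNGARAT
print(degeneracy(PROBE))          # 32
print(mismatches(sense, TARGET))  # [5, 8]
```

Read as codons, the sense form of the probe is GAR CCA ACA ATG TAY GGN GAR AT, which encodes the amino-terminal peptide used for its design. Under the assumptions above, the probe pool contains 32 sequences and differs from the cloned cDNA only at the third positions of the Pro and Thr codons (CCT and ACC in Fig. 1).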

C1s but not with the 5'-specific probe of C1r (data not shown). However, the Sst I fragment of 4.6 kb hybridized with the 3'-specific probes of both C1r and C1s (Fig. 3). In detailed hybridization and DNA subcloning analyses (data not shown), this Sst I fragment was found to be a mixture of two unrelated fragments with almost identical sizes. Furthermore, these fragments did not overlap structurally with each other, as evidenced by DNA blotting analysis of the subcloned fragments (data not shown). These two Sst I fragments subcloned were then subjected to sequence analysis. The 5'-end nucleotide sequence of one of the fragments matched with the sequence starting at the unique Sst I site (nucleotide 1650) within C1r cDNA and also included the rest of the entire 3' portion of the cDNA spanning the catalytic subunit and the 3' untranslated sequence. The other fragment contained the 3'-end portion of the C1s gene, including part of the last intron sequence located between nucleotide 1472 and 1473 (see Fig. 1 for numbering) and also the rest of the 3' portion of the cDNA sequence. The genomic sequences encoding the catalytic subunit and the 3' untranslated region completely matched with those of the corresponding sequences of cDNAs for C1r or C1s, indicating no introns in the catalytic subunit and in the 3' untranslated regions of these genes. Based on these results, a detailed restriction map of the region connecting C1r and C1s genes was constructed (Fig. 4). The BamHI fragment (3.8 kb) located in the middle of the


FIG. 2. Dot-blot analysis of the cloned λhgC1rs19 with the 5'- and 3'-specific probes of C1r and C1s. The isolated recombinant phage DNA (aliquots of 50 ng) was applied to four separate small pieces of nitrocellulose filter. After prehybridization, filters were separately hybridized to the C1r 5'-specific, C1r 3'-specific, C1s 5'-specific, or C1s 3'-specific probes, washed, and exposed to x-ray film.
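The reasoning that leads from the dot-blot pattern to two candidate gene orientations can be sketched as a small enumeration. This is an illustrative model only, built on assumptions that are not in the original text: each gene is reduced to a 5' block and a 3' block, the genomic insert is treated as one contiguous stretch of the chromosome, and a probe gives a signal exactly when its block lies inside the insert. The names `ARRANGEMENTS`, `OBSERVED`, and `consistent` are hypothetical.

```python
# Order of the four probe regions along the chromosome for each arrangement.
# Mirror images are equivalent, so one order per arrangement is enough.
ARRANGEMENTS = {
    "tandem, C1r upstream of C1s": ["C1r 5'", "C1r 3'", "C1s 5'", "C1s 3'"],
    "tandem, C1s upstream of C1r": ["C1s 5'", "C1s 3'", "C1r 5'", "C1r 3'"],
    "head-to-head (5' end to 5' end)": ["C1r 3'", "C1r 5'", "C1s 5'", "C1s 3'"],
    "tail-to-tail (3' end to 3' end)": ["C1r 5'", "C1r 3'", "C1s 3'", "C1s 5'"],
}

# Probes that hybridized to the genomic clone; the C1r 5' probe did not.
OBSERVED = {"C1r 3'", "C1s 5'", "C1s 3'"}


def consistent(order, observed):
    """True if some contiguous run of regions contains exactly the observed set."""
    n = len(order)
    for start in range(n):
        for end in range(start + 1, n + 1):
            if set(order[start:end]) == observed:
                return True
    return False


compatible = [name for name, order in ARRANGEMENTS.items()
              if consistent(order, OBSERVED)]
for name in compatible:
    print(name)
# tandem, C1r upstream of C1s
# tail-to-tail (3' end to 3' end)
```

Under this simplified model only two arrangements survive, matching the two possibilities considered in the text. The dot blot alone cannot separate them, which is why the restriction mapping and sequencing of the Sst I and BamHI fragments were needed.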

interspacing sequence was also subcloned and sequenced for both of its end regions. The DNA sequences obtained completely matched with those of the 3'-end regions of the two 4.6-kb Sst I fragments, confirming that this BamHI fragment bridged the two unique 4.6-kb Sst I fragments (Fig. 4). Three BamHI fragments (4.4, 2.7, and 0.6 kb) in Fig. 4 correspond to those observed in Fig. 3. These data unambiguously determine that C1r and C1s genes are located at a distance of about 9.5 kb in a tail-to-tail arrangement. The regional assignment for both genes to chromosome 12 at p13 has recently been reported by Van Cong et al. (20) in good agreement with the present data. It is also noteworthy that the organizations of the C1r and C1s genes rather resemble that of haptoglobin and are different from those for factor IX, prothrombin, plasminogen activator, chymotrypsin, and their related proteins, which have multiple introns located within the region of the catalytic subunit (21).

Blot-hybridization analyses of total RNA prepared from various tissues of baboon and from human cultured cells clearly indicated that tissue specificities of expression of C1r and C1s genes are essentially identical (Fig. 5). Both C1r and C1s genes are primarily expressed in the liver. However, these genes are also expressed in most other tissues tested, such as brain and kidney, albeit at much lower levels (less than 10% of that for liver). This is in good agreement with the previous observation that C1s is expressed in other cells like monocytes in addition to hepatocytes (22). As expected from a C1r cDNA with different polyadenylylation sites (6), two mRNA bands of 2.8 and 3.1 kb were observed for C1r. Two


FIG. 3. DNA blot analyses of human genomic DNA with the 3'-specific probes of C1s and C1r genes. In this experiment, restriction fragments generated by digestion of normal genomic DNAs with BamHI (lanes 1 and 3) and Sst I (lanes 2 and 4) are shown. GeneScreenPlus filters blotted with DNA fragments electrophoresed in a 0.8% agarose gel were hybridized with the radiolabeled 3'-specific probes for C1r (lanes 1 and 2) or C1s (lanes 3 and 4). The solid dot indicates the well position.

distinct sizes of mRNA (2.55 and 2.85 kb) were also found for C1s (Fig. 5). Potential alternative polyadenylylation sites in the C1s nucleotide sequence were identified in the λhC1s2700 clone at about 250 bases upstream of its polyadenylylation site. Tosi et al. (12) reported that mRNAs for C1s contain three distinct molecular sizes. In our data, however, only two RNA molecular species were observed in autoradiographs, even with various lengths of exposure time (data not shown). The discrepancy may be due to the possible heterogeneous


FIG. 4. Restriction map of the connecting region of human C1r and C1s genes. Regions for the C1r and C1s genes are shown by an open unclosed box, and the interspacing region between the two genes is shown by a fat solid line. The closed open-box region for the C1s gene corresponds to the last exon of the gene, which was unambiguously defined by sequencing. Sequencing has not yet been extended to the corresponding splicing site to define the last exon of the C1r gene. Dashed lines indicate the undefined region of the genes. The two opposite arrows between 5' and 3' indicate the transcriptional directions of the C1r and C1s genes. The region contained in the insert of λhgC1rs19 is shown by a thin line. Small arrows with short vertical bars indicate the area sequenced, the direction of sequencing, and the restriction sites that were used as initial starting points for sequencing.


FIG. 5. RNA blot analysis of total RNAs from baboon tissues and human hepatoma cell lines. Radiolabeled whole C1s (A) and C1r (B) cDNAs were used in the hybridizations. Size markers were ribosomal RNAs (calf thymus, 28S and 18S; and E. coli, 23S and 16S). An arrowhead indicates well positions. Lanes: a, adrenal gland; b, brain; c, heart; d, kidney; e, liver; f, lung; g, pancreas; h, salivary gland; i, spleen; j, testis; k, thyroid; l, thymus; m, HT-1080 cells (human hepatoma); n, HTC PAI cells (human hepatoma).

length of poly(A) tails in mRNA preparations used in our blotting analysis.

The high similarity in tissue specificity of expression of the C1r and C1s genes is in good agreement with their close relative locations and structural similarity. It is also noteworthy that the concentrations in plasma (about 50 μg/ml) are almost identical for C1s and C1r (2). Whether or not the C1r and C1s genes are controlled by a common regulatory mechanism for their tissue expression remains to be determined. Some other homologous human genes, such as the hemoglobin genes on chromosomes 11 and 16 (23) and the genes of immunoglobulin heavy chains on chromosome 14 (24), are also known for their close gene loci. In contrast to the C1r and C1s genes, however, these genes are apparently under significantly different regulation from one another (23, 25).
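The transcript sizes reported above for each gene (2.8 and 3.1 kb for C1r; 2.55 and 2.85 kb for C1s) can be compared with the candidate polyadenylylation site located about 250 bases upstream in the C1s cDNA. The following is a minimal Python sketch of that arithmetic only; the variable and function names are illustrative assumptions and not part of the original work, and sizes estimated from blots are approximate.

```python
# Transcript sizes in bases, taken from the RNA blot results in the text (Fig. 5).
MRNA_SIZES = {
    "C1r": (2800, 3100),
    "C1s": (2550, 2850),
}

# Approximate offset of the candidate upstream polyadenylylation site in the
# C1s cDNA clone; the text gives only "about 250 bases" (assumed exact here).
C1S_UPSTREAM_SITE_OFFSET = 250


def size_difference(sizes):
    """Return the difference in bases between the longest and shortest transcript."""
    return max(sizes) - min(sizes)


# Band spacing for each gene.
differences = {gene: size_difference(s) for gene, s in MRNA_SIZES.items()}

for gene, diff in differences.items():
    print(f"{gene}: transcripts differ by {diff} bases")

# Gap between the observed C1s band spacing and the stated site offset.
residual = differences["C1s"] - C1S_UPSTREAM_SITE_OFFSET
print(f"C1s band spacing minus site offset: {residual} bases")
```

Under these assumptions the two bands differ by the same amount for both genes, and the C1s band spacing is of the same order as the stated site offset. The sketch does not account for heterogeneity in poly(A) tail length.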

This work was supported in part by National Institutes of Health Research Grant HL38644 to K.K., and Department of Energy Research Grant DE-F60287ER60533 to J. V. Neel.

1. Reid, K. B. M. & Porter, R. R. (1981) Annu. Rev. Biochem. 50, 443-464.
2. Reid, K. B. M. (1986) Essays Biochem. 22, 27-68.
3. Cooper, N. R. (1985) Adv. Immunol. 37, 151-216.
4. Perez, H. D. (1984) CRC Crit. Rev. Oncol. Hematol. 1, 199-225.
5. Arlaud, G. J., Gagnon, J., Villiers, C. L. & Colomb, M. E. (1986) Biochemistry 25, 5177-5182.
6. Leytus, S. P., Kurachi, K., Sakariassen, K. S. & Davie, E. W. (1986) Biochemistry 25, 4855-4863.
7. Klickstein, L. B., Wong, W. W., Smith, J. A., Morton, C., Fearon, D. T. & Weis, J. H. (1985) Complement 2, 44.
8. Ripoche, J., Day, J., Harris, T. J. R. & Sim, R. B. (1988) Biochem. J. 249, 593-602.
9. Spycher, S. E., Nick, H. & Rickli, E. E. (1986) Eur. J. Biochem. 156, 49-57.
10. Carter, P. E., Dunbar, B. & Fothergill, J. E. (1984) Philos. Trans. R. Soc. London Ser. B 306, 293-299.
11. MacKinnon, C. M., Carter, P. E., Smyth, S. J., Dunbar, B. & Fothergill, J. E. (1987) Eur. J. Biochem. 169, 547-553.
12. Tosi, M., Duponchel, C., Meo, T. & Julier, C. (1987) Biochemistry 26, 8516-8524.
13. Kurachi, K., Davie, E. W., Strydom, D. J., Riordan, J. F. & Vallee, B. L. (1985) Biochemistry 24, 5494-5499.
14. Hagen, F. S., Gray, C. L., O'Hara, P., Grant, F. J., Saari, G. C., Woodbury, R. G., Hart, C. E., Insley, M., Kisiel, W., Kurachi, K. & Davie, E. W. (1986) Proc. Natl. Acad. Sci. USA 83, 2412-2416.
15. Wallace, R. B., Johnson, M. J., Hirose, T., Miyake, T., Kawashima, E. H. & Itakura, K. (1981) Nucleic Acids Res. 9, 879-894.
16. Feinberg, A. & Vogelstein, B. (1984) Anal. Biochem. 137, 266-267.
17. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY).
18. Gagnon, J. & Arlaud, G. J. (1985) Biochem. J. 225, 135-142.
19. Nathans, J., Thomas, D. & Hogness, D. S. (1986) Science 232, 193-202.
20. Van Cong, N., Tosi, M., Gross, M. S., Cohen-Haguenauer, O., Meo, T. & Frezal, J. (1987) in Genetic Maps 1987, ed. O'Brien, S. J. (Cold Spring Harbor Lab., Cold Spring Harbor, NY), p. 563.
21. Rogers, J. (1985) Nature (London) 315, 458-459.
22. Reboul, A., Prandini, M.-H., Bensa, J.-C. & Colomb, M. E. (1985) FEBS Lett. 190, 65-68.
23. Stamatoyannopoulos, G. & Nienhuis, A. W. (1987) in The Molecular Basis of Blood Diseases, eds. Stamatoyannopoulos, G., Nienhuis, A. W., Leder, P. & Majerus, P. W. (Saunders, Philadelphia), pp. 66-93.
24. Kirsch, I. R., Morton, C. C., Nakahara, K. & Leder, P. (1982) Science 216, 301-303.
25. Morrison, C. (1984) Immunol. Today 5, 37-38.
