The Nucleotide and Derived Amino Acid Sequence of...



THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1986 by The American Society of Biological Chemists, Inc.

Vol. 261, No. 5, Issue of February 15, pp. 1998-2002, 1986 Printed in U.S.A.

The Nucleotide and Derived Amino Acid Sequence of Human Apolipoprotein A-IV mRNA and the Close Linkage of Its Gene to the Genes of Apolipoproteins A-I and C-III*

(Received for publication, August 14, 1985)

Nabil A. Elshourbagy, David W. Walker, Mark S. Boguski, Jeffrey I. Gordon, and John M. Taylor

From The Gladstone Foundation Laboratories for Cardiovascular Disease, the Cardiovascular Research Institute, and the Department of Physiology, University of California, San Francisco, California 94140-0608 and the Departments of Biological Chemistry and Medicine, Washington University School of Medicine, St. Louis, Missouri 63130

Both cDNA and genomic clones encoding human apolipoprotein (apo-) A-IV have been isolated and characterized. Southern blot analyses of apo-A-IV gene-containing cosmids revealed that the apo-A-IV gene is linked to the apo-A-I and apo-C-III genes within a 20-kilobase span of chromosome 11 DNA. The apo-A-IV gene is located about 14 kilobases downstream from the apo-A-I gene in the same orientation, with the apo-C-III gene located between them in the opposite orientation. The nucleotide sequence of the corresponding human apo-A-IV mRNA was determined, and the derived amino acid sequence showed that mature plasma apo-A-IV contained 376 residues. Throughout most of its length, human apo-A-IV was found to contain multiple tandem 22-residue repeated segments having amphipathic, α-helical potential. Amino acid substitutions within these homologous segments were generally conservative in nature. A comparison of the sequences of human and rat apo-A-IV revealed a 79% identity of amino acid positions in the amino-terminal 60 residues and a 58% identity in the remainder of the sequences, with the human protein containing 5 extra residues near the carboxyl terminus. An examination of the distribution of apo-A-IV mRNA in different tissues of the rat, marmoset, and man showed that apo-A-IV mRNA was abundant in both the liver and small intestine of the rat, but abundant in only the small intestine of the marmoset and man. It was expressed in only trace amounts in all other tissues that were examined. These findings on the structure and expression of apo-A-IV and the close linkage of its gene to those of apo-A-I and apo-C-III suggest a regulatory relationship between the three genes.

Human apolipoprotein (apo-¹) A-IV is a major component of newly synthesized chylomicrons, but it is not found in significant amounts in other lipoproteins, including chylomicron remnants (reviewed in Ref. 1). In the rat, apo-A-IV is also a major component of high density lipoproteins (2). In all species examined, about half of the circulating apo-A-IV is found in the lipoprotein-free fraction of plasma (3), but it may be redistributed to lipoproteins according to the requirements of extracellular lipid metabolism (4). While the overall metabolic role of apo-A-IV is unknown, Steinmetz and Utermann (5) have demonstrated that human apo-A-IV can be a significant activator of lecithin:cholesterol acyltransferase. This finding is consistent with the structure of rat apo-A-IV that we determined previously (6).

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¹ The abbreviations used are: apo-, apolipoprotein; kb, kilobases.

Rat plasma apo-A-IV, a single chain protein of 371 amino acids (Mr = 44,465), contains closely homologous, tandem repetitions of a 22-residue segment with amphipathic α-helical potential (6). The amphipathic character of these repeated units is probably responsible for the lecithin:cholesterol acyltransferase-activating capability of apo-A-IV (7, 8). The sequence and organization of the docosapeptide units are remarkably homologous to the tandemly repeated segments found in apo-A-I (7, 9), which suggests that the corresponding genes might have arisen from an unequal gene duplication event and that they may be closely linked (6).

This study reports the complete structure of human plasma apo-A-IV and the finding of an extensive domain of repeated amphipathic segments within the protein. In addition, the gene that encodes human apo-A-IV is linked closely to the apo-A-I gene, with the apo-C-III gene interposed. Our results expand the previously reported linkage of the apo-A-I and apo-C-III genes found on chromosome 11 (10-12), raising the possibility of a coordinate control of the expression of all three genes.

EXPERIMENTAL PROCEDURES

Human apo-A-IV cDNA clones were selected from a λgt10 (13) liver cDNA library (provided by Dr. Beatriz Levy-Wilson, Gladstone Foundation Laboratories) by screening (14) at a reduced stringency with a previously characterized rat apo-A-IV cDNA (6). The hybridization probe was a 1227-base pair XmnI-BstXI restriction endonuclease fragment of the rat cDNA (6) that was ³²P-labeled by random priming (15). Plaque hybridizations were carried out at 42 °C in a buffer containing 20% deionized formamide, 0.9 M NaCl, 50 mM sodium phosphate at pH 7.0, 5 mM EDTA, 0.1% sodium dodecyl sulfate, and 200 µg/ml denatured herring sperm DNA. The cDNA inserts from positive recombinants were subcloned into bacteriophages M13mp18 and M13mp19 for nucleotide sequence determinations by the dideoxynucleotide chain termination method (16). Human apo-A-IV sequence-specific oligodeoxynucleotide primers were synthesized with an Applied Biosystems (Foster City, CA) Model 380A synthesizer.

Human apo-A-IV genomic clones were selected from a cosmid (17) genomic DNA library (provided by Dr. Chris Lau, University of California at San Francisco) by screening with the ³²P-labeled insert prepared from a cloned human apo-A-IV cDNA that had been characterized as indicated above. Positive recombinants were selected, and their inserts were subcloned for partial nucleotide sequence determinations as described above.

Total cellular RNA was prepared from adult rat, marmoset, and human tissues (18) and examined by dot blot and Northern blot analyses as described previously (18). Animals were maintained and tissues were collected as described (18). Tissues from four rats, four marmosets, and two human trauma victims were examined.

RESULTS AND DISCUSSION

The screening of about 500,000 recombinants yielded 10 clones that were candidates for containing human apo-A-IV cDNA inserts. DNA was prepared from each of them for analysis, and examination by agarose gel electrophoresis (14) indicated that the insert sizes ranged from ~300 to 1100 base pairs in length. The DNAs from six of these candidates were examined further by hybridization selection and translation (19) of human intestine mRNA. In each case, the hybridization-selected mRNA directed the synthesis of a Mr = 46,000 protein that was immunoprecipitated by an antibody specific for human apo-A-IV (data not shown). The identity of the cloned inserts was subsequently demonstrated by nucleotide sequence analysis as described below.

Linkage of the Apolipoprotein A-I, C-III, and A-IV Genes - To select the corresponding gene, a human genomic cosmid library was screened with the longest cloned apo-A-IV cDNA insert. Three positive recombinants were identified that had an average insert length of 34 kb and some, but not all, restriction endonuclease fragments in common. The cloned DNAs were digested with EcoRI and HindIII, resolved by agarose gel electrophoresis, and blotted to a nitrocellulose filter (14). The filter was probed with a cloned human apo-A-IV cDNA as well as with previously characterized human apo-A-I (13) and apo-C-III (20) cDNAs. Fig. 1A shows that all of the probes hybridized to each of the cosmids and that the probes bound to a single HindIII fragment of about 19 kb that was common to each cosmid insert. Therefore, the apo-A-IV gene was linked closely to the apo-A-I and apo-C-III genes.

FIG. 1. Linkage of the human apo-A-I, apo-C-III, and apo-A-IV genes. A, the DNAs from cosmid clones pHA4G21, pHA4G51, and pHA4G52 were digested with EcoRI or HindIII and examined by Southern blot hybridization to ³²P-labeled cDNA probes specific for apo-A-I, apo-C-III, and the 5' and 3' portions of apo-A-IV. After each hybridization, the probe was removed from the filter (14), and the same filter was rehybridized to another probe. B, the linkage of the apo-A-I, apo-C-III, and apo-A-IV genes is shown with a few of the known restriction endonuclease sites in this gene locus. The arrows indicate the direction of transcription of these genes.

[Figure labels: panel A probes A-I, C-III, A-IV 5', and A-IV 3'; cosmid lanes 51, 52, and 21; size markers 23.1, 9.4, 6.6, and 4.4.]

FIG. 2. Nucleotide sequence strategy for apo-A-IV mRNA. The solid bar indicates the nucleotide sequence that was determined from the cDNA inserts that were cloned in λgt10. The stippled and hatched bars and the open bar indicate the sequence that was determined from two exons of a cosmid clone. In this cosmid gene clone, the regions indicated by the stippled and hatched bars were separated by an intron, which is illustrated by the line connecting the bars in the inset. The sites of restriction endonucleases that were used for the subcloning of DNA fragments into bacteriophage M13 for sequence analysis are indicated. The arrows indicate the direction and length of the sequence determination, with the solid arrows indicating the use of a universal primer (16) and the dashed arrows indicating the use of oligonucleotide primers that were synthesized to correspond to sequences that were determined in this study. The λ cDNA clones and the cosmid gene clone employed are identified by their clone numbers.

[Figure labels: cDNA clones λHA4C202, λHA4C181, λHA4C201, and λHA4C151; cosmid clone pHA4G51; restriction sites PstI and EcoRI.]

The orientation of the apo-A-IV gene with respect to the apo-A-I and apo-C-III genes was investigated by Southern blot hybridization. To identify the apo-A-IV gene, two probes were prepared from a cloned cDNA: cleavage of an EcoRI site within this cDNA yielded a 5'-terminal fragment of about 400 base pairs and a 3'-terminal fragment of about 700 base pairs. The cDNA probes for both the apo-A-I and apo-C-III genes had been characterized previously by restriction endonuclease mapping and nucleotide sequencing (13, 20) and corresponded to about three-fourths of the lengths of their corresponding mRNAs (data not shown). A single Southern blot of the three unique cosmid DNAs that had been digested with EcoRI was prepared and hybridized to each probe sequentially (Fig. 1A). The EcoRI enzyme was chosen because the cosmid vector (17) contained unique sites for this enzyme on either side of the genomic DNA insertion site (BamHI).

The apo-A-I probe hybridized to a different size EcoRI fragment in each cosmid DNA, indicating that the apo-A-I gene was located at one end of the human genomic DNA insert. The apo-C-III probe bound to this same fragment in each cosmid, consistent with the previously described finding that it was about 2.6 kb downstream from the apo-A-I gene and in the opposite orientation (10, 11). The apo-C-III probe also bound to a 3.0-kb fragment that was identical in size in each of the cosmids, indicating that it contained the 5' portion of the apo-C-III gene (according to the previously described (10, 11) linkage orientation) and that it was contained within the interior of the genomic DNA insert. The 5'-terminal apo-A-IV probe bound to a 1.2-kb fragment that was the same size in each cosmid, indicating that it also was contained within the interior of the cloned human DNA. The 3'-terminal apo-A-IV probe bound to a fragment that varied in size in each cosmid, but was different from the fragments that bound the apo-A-I probe, indicating that the 3'-terminal portion of the apo-A-IV gene was contained within an EcoRI fragment located at the opposite end of the insert from the apo-A-I gene. Our results, together with the previously determined linkage and orientation of the apo-A-I and apo-C-III genes (10, 11), also indicated that the apo-A-IV gene had the same transcriptional orientation as the apo-A-I gene. Additional Southern blot analyses with other restriction endonucleases were consistent with these conclusions (data not shown). The linkage of the three genes is illustrated in Fig. 1B. The apo-A-IV gene is located about 8 kb away from the apo-C-III gene and about 14 kb downstream from the apo-A-I gene.

Glu Val Ser Ala Asp Gln Val Ala Thr Val Met Trp Asp Tyr Phe Ser Gln Leu Ser Asn Asn Ala Lys Glu Ala Val Glu His Leu Gln
GAG GTC AGT GCT GAC CAG GTG GCC ACA GTG ATG TGG GAC TAC TTC AGC CAG CTG AGC AAC AAT GCC AAG GAG GCC GTG GAA CAT CTC CAG  90

Lys Ser Glu Leu Thr Gln Gln Leu Asn Ala Leu Phe Gln Asp Lys Leu Gly Glu Val Asn Thr Tyr Ala Gly Asp Leu Gln Lys Lys Leu
AAA TCT GAA CTC ACC CAG CAA CTC AAT GCC CTC TTC CAG GAC AAA CTT GGA GAA GTG AAC ACT TAC GCA GGT GAC CTG CAG AAG AAG CTG  180

Val Pro Phe Ala Thr Glu Leu His Glu Arg Leu Ala Lys Asp Ser Glu Lys Leu Lys Glu Glu Ile Gly Lys Glu Leu Glu Glu Leu Arg
??? CCC TTT GCC ACC GAG CTG CAT GAA CGC CTG GCC AAG GAC TCG GAG AAA CTG AAG GAG GAG ATT GGG AAG GAG CTG GAG GAG CTG AGG  270

Ala Arg Leu Leu Pro His Ala Asn Glu Val Ser Gln Lys Ile Gly Asp Asn Leu Arg Glu Leu Gln Gln Arg Leu Glu Pro Tyr Ala Asp
GCC CGG CTG CTG CCC CAT GCC AAT GAG GTG AGC CAG AAG ATC GGG GAC AAC CTG CGA GAG CTT CAG CAG CGC CTG GAG CCC TAC G?? GAC  360

Gln Leu Arg Thr Gln Val Asn Thr Gln Ala Glu Gln Leu Arg Arg Gln Leu Thr Pro Tyr Ala Gln Arg Met Glu Arg Val Leu Arg Glu
CAG CTG CGC ACC CAG GTC AAC ACG CAG GCC GAG CAG CTG CGG CGC CAG CTG ACC CCC TAC GCA CAG CGC ATG GAG AGA GTG CTG CGG GAG  450

Asn Ala Asp Ser Leu Gln Ala Ser Leu Arg Pro His Ala Asp Glu Leu Lys Ala Lys Ile Asp Gln Asn Val Glu Glu Leu Lys Gly Arg
AAC GCC GAC AGC CTG CAG GCC TCG CTG AGG CCC CAC GCC GAC GAG CTC AAG GCC AAG ATC GAC CAG AAC GTG GAG GAG CTC AAG GGA CGC  540

Leu Thr Pro Tyr Ala Asp Glu Phe Lys Val Lys Ile Asp Gln Thr Val Glu Glu Leu Arg Arg Ser Leu Ala Pro Tyr Ala Gln Asp Thr
CTT ACG CCC TAC GCT GAC GAA TTC AAA GTC AAG ATT GAC CAG ACC GTG GAG GAG CTG CGC CGC AGC CTG GCT CCC TAT GCT CAG GAC ACG  630

Gln Glu Lys Leu Asn His Gln Leu Glu Gly Leu Thr Phe Gln Met Lys Lys Asn Ala Glu Glu Leu Lys Ala Arg Ile Ser Ala Ser Ala
CAG GAG AAG CTC AAC CAC CAG CTT GAG GGC CTG ACC TTC CAG ATG AAG AAG AAC GCC GAG GAG CTC AAG GCC AGG ATC TCG GCC AGT GCC  720

Glu Glu Leu Arg Gln Arg Leu Ala Pro Leu Ala Glu Asp Val Arg Gly Asn Leu Arg Gly Asn Thr Glu Gly Leu Gln Lys Ser Leu Ala
GAG GAG CTG CGG CAG AGG CTG GCG CCC TTG GCC GAG GAC GTG CGT GGC AAC CTG AGG GGC AAC ACC GAG GGG CTG CAG AAG TCA CTG GCA  810

Glu Leu Gly Gly His Leu Asp Gln Gln Val Glu Glu Phe Arg Arg Arg Val Glu Pro Tyr Gly Glu Asn Phe Asn Lys Ala Leu Val Gln
GAG CTG GGT GGG CAC CTG GAC CAG CAG GTG GAG GAG TTC CGA CGC CGG GTG GAG CCC TAC GGG GAA AAC TTC AAC AAA GCC CTG GTG CAG  900

Gln Met Glu Gln Leu Arg Thr Lys Leu Gly Pro His Ala Gly Asp Val Glu Gly His Leu Ser Phe Leu Glu Lys Asp Leu Arg Asp Lys
CAG ATG GAA CAG CTC AGG ACG AAA CTG GGC CCC CAT GCG GGG GAC GTG GAA GGC CAC TTG AGC TTC CTG GAG AAG GAC CTG AGG GAC AAG  990

Val Asn Ser Phe Phe Ser Thr Phe Lys Glu Lys Glu Ser Gln Asp Lys Thr Leu Ser Leu Pro Glu Leu Glu Gln Gln Gln Glu Gln His
GTC AAC TCC TTC TTC AGC ACC TTC AAG GAG AAA GAG AGC CAG GAC AAG ACT CTC TCC CTC CCT GAG CTG GAG CAA CAG CAG GAA CAG CAT  1080

Gln Glu Gln Gln Gln Glu Gln Val Gln Met Leu Ala Pro Leu Glu Ser ***
CAG GAG CAG CAG CAG GAG CAG GTG CAG ATG CTG GCC CCT TTG GAG AGC TGA GCTGCCCCTG GTGCACTGGC CCCACCCTCG TGGACACCTG  1171

CCCTGCCCTG CCACCTGTCT GTCTGTCCCA AAGAAGTTCT GGTATGAACT TGAGGACACA TGTCCAGTGG GAGGTGAGAC CACCTCTCAA TATTCAATAA

AGCTGCTGAG AATCTAGCCT C-Poly(A)

FIG. 3. Nucleotide and amino acid sequence of human apo-A-IV. The numbers indicate nucleotide sequence positions. The sequence begins with the first amino acid of the mature plasma protein. The translation termination codon is indicated by asterisks.
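The correspondence between the nucleotide and amino acid lines of Fig. 3 can be checked by translating the coding sequence with the standard genetic code. The following is a minimal Python sketch for the first 90 nucleotides only; the helper name `translate` and the constant names are illustrative and are not part of the original study.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA coding sequence (whitespace ignored) to one-letter amino acids."""
    seq = "".join(nucleotides.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Nucleotides 1-90 as printed in Fig. 3 (first 30 codons of the mature protein).
FIRST_90_NT = (
    "GAG GTC AGT GCT GAC CAG GTG GCC ACA GTG ATG TGG GAC TAC TTC "
    "AGC CAG CTG AGC AAC AAT GCC AAG GAG GCC GTG GAA CAT CTC CAG"
)

print(translate(FIRST_90_NT))  # EVSADQVATVMWDYFSQLSNNAKEAVEHLQ
```

The one-letter output matches the first 30 residues given in three-letter code in Fig. 3 (Glu-Val-Ser-Ala-Asp-Gln ...).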

Structure of Human Plasma Apolipoprotein A-IV - The nucleotide sequence of human liver apo-A-IV mRNA was determined according to the strategy shown in Fig. 2. Since none of the cDNAs were full-length copies of the mRNA, the sequence of the coding portion that contained amino acids 1-55 of plasma apo-A-IV was determined from the gene, and the remaining portion was determined from the cloned cDNAs. The amino-terminal boundary of the mature plasma protein coding region was identified by comparison to the previously determined partial amino acid sequence of this region (21).² In addition, the amino-terminal sequence of rat apo-A-IV (6) was found to be particularly homologous to the human sequence. Furthermore, the carboxyl-terminal amino acid of the 20-residue signal peptide sequence of human apo-A-IV (21) was identified as alanine (data not shown), which is consistent with the position of the mature amino terminus of human plasma apo-A-IV. This residue was apparently adjacent to an upstream intron similar to the intron location of other apolipoprotein genes (22).

The nucleotide sequence of the apo-A-IV plasma protein coding region of the gene and its corresponding mRNA, together with the 3'-terminal noncoding region, is shown in Fig. 3. The derived amino acid sequence is also shown. Human plasma apo-A-IV contains 376 amino acids, with a Mr = 45,150. There are no cysteine residues. Analysis of the sequence by the correlogram algorithm of Kubota et al. (23) as well as the comparison matrix algorithm of McLachlan (9) indicated the presence of multiple repeated segments within human apo-A-IV. As observed in rat apo-A-IV (6), these segments are not exact duplications, but have conservative amino acid substitutions that generally preserve their physical and chemical characteristics (Fig. 4). The repeated units correlated approximately with the positions of proline residues, which are located with unusual regularity throughout the sequence. Accordingly, these analyses indicated that nearly the entire human apo-A-IV protein consisted of 14.5 tandem repetitions of a 22-residue element that is itself the product of an 11-mer duplication (data not shown). The optimal alignment of these repeat units is illustrated in Fig. 4, which indicates a limited amount of length polymorphism. One repeat unit contained a deletion of 4 residues, equivalent to one turn of an α-helix, which should not disrupt the surface properties of the overall alignment of repeat units.

² The amino acid sequence of 39 residues beginning at the amino terminus of human plasma apo-A-IV has been determined by Drs. Stanley C. Rall, Jr., and Karl H. Weisgraber of the Gladstone Foundation Laboratories (unpublished observations). It is in complete agreement with the amino acid sequence that was derived from the nucleotide sequence.

      human  EVSADQVATVMW
             ::  :::: :::
      rat    EVTSDQVANVMW

  13  human  DYFSQLSNNAKEAVEHLQKSELTQQLN
             ::: ::::::::::: :::   :::::
      rat    DYFTQLSNNAKEAVEQLQKTDVTQQLN

  40  human  ALFQDKLGEVNTYAGDLQKKLV
              :::::::  :::: ::: :::
      rat    TLFQDKLGNINTYADDLQNKLV

  62  human  PFATELHERLAKDSEKLKEEIG
             :::  :   : :  :   :::
      rat    PFAVQLSGHLTKETERVREEIQ

  84  human  KELEELRARLL
             :::: :::
      rat    KELEDLRANMM

  95  human  PHANEVSQKIGDNLRELQQRLE
             :::: :::  :::   ::  :
      rat    PHANKVSQMFGDNVQKLQEHLR

 117  human  PYADQLRTQVNTQAEQLRRQLT
             :::  :  : : :     ::::
      rat    PYATDLQAQINAQTQDMKRQLT

 139  human  PYAQRMERVLRENADSLQASLR
             :: :::      :   :: :
      rat    PYIQRMQTTIQDNVENLQSSMV

 161  human  PHADELKAKIDQNVEELKGRLT
             : : ::: :  :: : ::: ::
      rat    PFANELKEKFNQNMEGLKGQLT

 183  human  PYADEFKVKIDQTVEELRRSLA
             : : : :  :::  : ::  ::
      rat    PRANELKATIDQNLEDLRSRLA

 205  human  PYAQDTQEKLNHQLEGLTFQMK
             : :   ::::::: ::: ::::
      rat    PLAEGVQEKLNHQMEGLAFQMK

 227  human  KNAEELKARISASAEELRQRLA
             ::::::    :     :   ::
      rat    KNAEELHTKVSTNIDQLQKNLA

 249  human  PLAEDVRGNLRGNTEGLQ
             :: :::   : :::::::
      rat    PLVEDVQSKLKGNTEGLQ

 267  human  KSLAELGGHLDQQVEEFRRRVE
             :::  :   :::::: ::: ::
      rat    KSLEDLNKQLDQQVEVFRRAVE

 289  human  PYGENFNKALVQQMEQLRTKLG
             : :  :: :::::::  :  ::
      rat    PLGDKFNMALVQQMEKFRQQLG

 311  human  PHAGDVEGHLSFLEKDLRDKVN
                :::: ::::::: :: ::
      rat    SDSGDVESHLSFLEKNLREKVS

      human  SFFSTFKEKESQDKTLSLPELEQQQEQHQEQQQEQVQMLAPLES
             :: ::   : : :  : ::  :: ::: ::: :       ::::
      rat    SFMSTLQKKGSPDQPLALPLPEQVQEQVQEQVQPK-----PLES

FIG. 4. Alignment of the amino acid sequences of human and rat apo-A-IV. The optimal alignment of the human and rat amino acid sequences was made with the use of the PRTALN program (ktuple = 1, window = 20, gap penalty = 1) obtained from Dr. David Lipman, National Institutes of Health. The methods used for defining the boundaries of the repeated sequence units have been described previously (6, 24). Amino acid identities are indicated by a colon. The numbers indicate the first residue in each repeat unit.
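The periodicity described above can be read directly from the repeat-unit numbering of Fig. 4. The sketch below, a rough illustration rather than the correlogram or comparison matrix analysis used in the study, computes the length of each unit from the first-residue numbers printed in the figure. The variable and function names are hypothetical, and the end of the repeat domain is taken as an assumption from the 44-residue carboxyl-terminal segment that shows no homology to the repeats.

```python
# First residue of each repeat unit, as numbered in Fig. 4.
REPEAT_STARTS = [13, 40, 62, 84, 95, 117, 139, 161, 183, 205, 227, 249, 267, 289, 311]

MATURE_LENGTH = 376    # residues in mature human plasma apo-A-IV
C_TERMINAL_TAIL = 44   # carboxyl-terminal residues outside the repeat domain (assumed boundary)


def unit_lengths(starts, end_exclusive):
    """Return the length of each unit given its start positions and the first position after the last unit."""
    bounds = list(starts) + [end_exclusive]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# First residue of the carboxyl-terminal tail (residue 333).
repeat_domain_end = MATURE_LENGTH - C_TERMINAL_TAIL + 1

lengths = unit_lengths(REPEAT_STARTS, repeat_domain_end)
print(lengths)
# [27, 22, 22, 11, 22, 22, 22, 22, 22, 22, 22, 18, 22, 22, 22]
print(sum(lengths), round(sum(lengths) / 22, 1))
# 320 14.5
```

Under these assumptions, twelve units are exactly 22 residues long, the unit at residue 84 is a single 11-mer, the unit at residue 249 is 18 residues long (the 4-residue deletion), and the first numbered unit is 27 residues long as the figure numbering implies. The 320 residues spanned correspond to about 14.5 repeats of 22 residues.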

FIG. 5. Distribution and size of apo-A-IV mRNA in the tissues of rats, marmosets, and man. A, aliquots containing different amounts of total cellular RNA were supplemented with yeast RNA, denatured, applied to nitrocellulose filters, and examined by hybridization and autoradiogram densitometry as described previously (18). Rat RNA dot blots were hybridized to a previously characterized rat apo-A-IV cDNA (6), and the marmoset and human RNA dot blots were hybridized to a human apo-A-IV cDNA that was characterized in the present study. Autoradiograms of the blots are shown here. B, samples containing 30 µg of total cellular RNA were denatured with glyoxal, electrophoresed in 1.1% agarose gels, blotted to nitrocellulose, and hybridized to homologous ³²P-labeled cDNA probes (18). Autoradiograms of the blots are shown here. The positions of HindIII-digested SV40 DNA fragments are indicated as molecular size standards.

[Figure labels: panel A, rat, marmoset, and man, with RNA amounts of 3, 2, 1, and 0.5 and tissue rows for liver, small intestine, large intestine, testes, spleen, pancreas, kidney, lung, stomach, brain, adrenal, and heart; panel B, rat and human liver and small intestine, with size markers at 1169, 1101, 526, 447, and 215.]

The human apo-A-IV amino acid sequence was compared to that of the rat (Fig. 4), and they were found to be identical in 61% of their overall positions. The most highly conserved portion of the two proteins was in the amino-terminal region, where there was a 79% identity in the first 60 residues. The significance of this domain is unknown. The remainder of apo-A-IV was found to have a 58% homology between the human and rat proteins. The repeating units in the human protein are more highly conserved with respect to each other than those in the rat. This difference may reflect either the finding that rodent genes evolved considerably faster than those of man (25) or that there were particular selection pressures on these proteins.
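The identity figures quoted above can be illustrated with a small calculation. The Python sketch below defines a hypothetical helper, `percent_identity`, applies it to one aligned repeat unit transcribed from Fig. 4 (the unit beginning at residue 227), and then checks that the regional values of 79% and 58% are arithmetically consistent with the overall value of 61%. Using the 376 residues of the human protein as the denominator is an assumption; the text does not state how gap positions were counted.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Repeat unit starting at residue 227, transcribed from Fig. 4.
HUMAN_227 = "KNAEELKARISASAEELRQRLA"
RAT_227 = "KNAEELHTKVSTNIDQLQKNLA"

print(round(percent_identity(HUMAN_227, RAT_227), 1))  # 45.5

# Consistency of the regional and overall identities reported in the text.
N_TERMINAL = 60   # residues in the highly conserved amino-terminal region
TOTAL = 376       # residues in mature human apo-A-IV (assumed denominator)
overall = (0.79 * N_TERMINAL + 0.58 * (TOTAL - N_TERMINAL)) / TOTAL
print(round(100 * overall))  # 61
```

Individual repeat units can therefore fall well below the overall average, while the length-weighted combination of the two regional values reproduces the reported 61%.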

To achieve an optimal alignment in the carboxyl-terminal domain, a 5-residue deletion in the rat sequence was postulated. In this region, the sequence E-Q-X-Q was repeated four times in the human protein and three times in the rat protein, which may have significance with respect to the rat deletion. Analysis of the carboxyl-terminal 44 residues of apo-A-IV did not reveal a significant sequence homology with the preceding repeated segments.
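Counting occurrences of a short motif such as E-Q-X-Q, where X is any residue, can be sketched with a regular expression. The helper name `count_motif` and the toy sequences below are hypothetical; the actual carboxyl-terminal sequences are those shown in Fig. 4 and are not reproduced here.

```python
import re


def count_motif(sequence: str, motif: str = "EQ[A-Z]Q") -> int:
    """Count motif occurrences in a one-letter protein sequence.

    A zero-width lookahead is used so that overlapping occurrences
    (for example E-Q-E-Q-A-Q) are each counted.
    """
    return sum(1 for _ in re.finditer(f"(?={motif})", sequence))


# Toy sequences (hypothetical, not the apo-A-IV carboxyl terminus).
print(count_motif("AAEQLQKKEQAQ"))  # two separate occurrences
print(count_motif("EQEQAQ"))        # two overlapping occurrences
```

Whether overlapping occurrences should be counted as separate repeats is a design choice; a plain `re.findall` on the motif would count only non-overlapping matches.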

Tissue Distribution of Apolipoprotein A-IV mRNA-The comparison of rat and human apo-A-IV was extended to an examination of the expression of the corresponding mRNAs in various tissues. Total cellular RNAs from different tissues of the rat, the marmoset (a new-world primate), and man were examined by dot blot and Northern blot hybridizations, followed by quantitative scanning densitometry (18). Fig. 5A shows that apo-A-IV mRNA is most abundant in the small intestine in each animal species. In the adult rat, apo-A-IV mRNA was present in the liver at a level that was 12% of that observed in the small intestine, with no observable difference in size between the two tissues (Fig. 5B). In contrast, apo-A-IV mRNA in the marmoset liver or human liver was <2% of that observed in the small intestine. The significance of this difference is unclear, but it may reflect differences in lipid metabolism among these species. No other tissue in any of these three species contained significant amounts of apo-A-IV mRNA. These findings indicate that the intestine may be the only significant source of apo-A-IV in humans and that the liver may contribute only minor amounts of this apolipoprotein to the plasma.
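A relative mRNA level such as "12% of that observed in the small intestine" can be estimated from a dot blot dilution series by comparing the densitometric signal per unit of RNA in the two tissues. The sketch below fits a line through the origin for each tissue and takes the ratio of the slopes. The function names and all numerical values are invented for illustration, and the procedure of reference 18 may differ in detail.

```python
def slope_through_origin(amounts: list[float], signals: list[float]) -> float:
    """Least-squares slope of signal versus RNA amount for a line through the origin."""
    if len(amounts) != len(signals) or not amounts:
        raise ValueError("amounts and signals must be non-empty and of equal length")
    return sum(x * y for x, y in zip(amounts, signals)) / sum(x * x for x in amounts)


def relative_level(amounts, tissue_signals, reference_signals) -> float:
    """Signal per unit RNA in a tissue as a percentage of a reference tissue."""
    tissue = slope_through_origin(amounts, tissue_signals)
    reference = slope_through_origin(amounts, reference_signals)
    return 100.0 * tissue / reference


# Hypothetical densitometry readings (arbitrary units) for a four-point series.
rna_amounts = [0.5, 1.0, 2.0, 3.0]          # amount of total RNA per dot (assumed units)
intestine = [50.0, 100.0, 200.0, 300.0]     # invented values
liver = [6.0, 12.0, 24.0, 36.0]             # invented values

print(f"liver relative to intestine: {relative_level(rna_amounts, liver, intestine):.1f}%")
```

Using the whole dilution series rather than a single dot reduces the influence of any one reading, provided the signals remain within the linear range of the film.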

The precise function of apo-A-IV is unclear. The multiple repeated amphipathic segments of this protein suggest that it can be a significant activator of lecithin:cholesterol acyltransferase (6); this cofactor activity has been demonstrated recently for human apo-A-IV (5). However, the close association of apo-A-IV with triglyceride-rich lipoproteins further suggests a potential role in the metabolism or structure of these particles. In this regard, a recent examination of the properties of human plasma apo-A-IV has suggested that its lipid binding properties are especially sensitive to microenvironmental factors (26). The amphipathic structure of apo-A-IV reported in this study suggests that the association of apo-A-IV with lipoproteins may be particularly sensitive to their surface characteristics. Thus, the distribution of apo-A-IV between lipoproteins and the lipoprotein-free fraction of plasma may be a function of potential changes in the surface properties of lipoproteins as a consequence of their metabolism during circulation.

Acknowledgments-We thank Dr. Karl H. Weisgraber for providing antibodies to human apo-A-IV and Dr. Beatriz Levy-Wilson for providing a human apo-C-III cDNA. We thank Drs. Stanley C. Rall, Jr., and Karl H. Weisgraber for their determination of the amino acid sequence of the amino-terminal 39 residues of human plasma apo-A-IV. Gratitude is expressed to Dr. Robert W. Mahley for his interest and support of these studies and to Drs. Brian McCarthy, Beatriz Levy-Wilson, and Stanley C. Rall, Jr., for their helpful discussions. We thank James X. Warger and Norma Jean Gargasz for graphics assistance and Barbara Allen and Sally Gullatt Seehafer for editorial assistance.

REFERENCES

1. Mahley, R. W., Innerarity, T. L., Rall, S. C., Jr., and Weisgraber, K. H. (1984) J. Lipid Res. 25, 1277-1294
2. Swaney, J. B., Braithwaite, F., and Eder, H. A. (1977) Biochemistry 16, 271-278
3. Fidge, N. H. (1980) Biochim. Biophys. Acta 619, 129-141
4. DeLamatre, J. G., Hoffmeier, C. A., Lacko, A. G., and Roheim, P. S. (1983) J. Lipid Res. 24, 1578-1585
5. Steinmetz, A., and Utermann, G. (1985) J. Biol. Chem. 260, 2258-2264
6. Boguski, M. S., Elshourbagy, N. A., Taylor, J. M., and Gordon, J. I. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 5021-5025
7. Segrest, J. P., Jackson, R. L., Morrisett, J. D., and Gotto, A. M., Jr. (1974) FEBS Lett. 38, 247-253
8. Kaiser, E. T., and Kezdy, F. J. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 1137-1143
9. McLachlan, A. D. (1977) Nature 267, 465-466
10. Karathanasis, S. K., McPherson, J., Zannis, V. I., and Breslow, J. L. (1983) Nature 304, 371-373
11. Protter, A. A., Levy-Wilson, B., Miller, J., Bencen, G., White, T., and Seilhamer, J. J. (1984) DNA (N. Y.) 3, 449-456
12. Bruns, G. A. P., Karathanasis, S. K., and Breslow, J. L. (1984) Arteriosclerosis 4, 97-102
13. Seilhamer, J. J., Protter, A. A., Frossard, P., and Levy-Wilson, B. (1984) DNA (N. Y.) 3, 309-317
14. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
15. Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
16. Messing, J. (1983) Methods Enzymol. 101, 20-78
17. Lau, Y.-F., and Kan, Y. W. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 5225-5229
18. Elshourbagy, N. A., Liao, W. S., Mahley, R. W., and Taylor, J. M. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 203-207
19. Ricca, G. A., Hamilton, R. W., McLean, J. W., Conn, A., Kalinyak, J. E., and Taylor, J. M. (1981) J. Biol. Chem. 256, 10362-10368
20. Levy-Wilson, B., Appleby, V., Protter, A., Auperin, D., and Seilhamer, J. J. (1984) DNA (N. Y.) 3, 359-364
21. Gordon, J. I., Bisgaier, C. L., Sims, H. F., Sachdev, O. P., Glickman, R. M., and Strauss, A. W. (1984) J. Biol. Chem. 259, 468-474
22. Paik, Y.-K., Chang, D. J., Reardon, C. A., Davies, G. E., Mahley, R. W., and Taylor, J. M. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 3445-3449
23. Kubota, Y., Takahashi, S., Nishikawa, K., and Ooi, T. (1981) J. Theor. Biol. 91, 347-361
24. Boguski, M. S., Elshourbagy, N. A., Taylor, J. M., and Gordon, J. I. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 992-996
25. Wu, C.-I., and Li, W.-H. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 1741-1745
26. Weinberg, R. B., and Spector, M. S. (1985) J. Biol. Chem. 260, 4914-4921
