On the evolutionary origin of angiosperms: Characterization of MADS-box floral homeotic gene homologues in
Ephedra andina (Gnetales)
Department of Biology McGill University
Montréal, Québec, Canada
A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfilment of the requirements of the degree of Master of Science
National Library of Canada
Acquisitions and Bibliographic Services
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Despite a century of research, the evolutionary origin of angiosperms remains
uncertain. Morphological studies have identified the gnetophytes as the sister group
of angiosperms, mainly because of the similar organization of their reproductive
structures. Molecular studies have been ambiguous as to whether these two groups
are closely related. Study of the development of seed plant reproductive structures
can help to untangle this issue. Here, I report the cloning of five MADS-box floral
homeotic gene homologues from the gnetophyte Ephedra andina. Three of these
genes belong to the AG, AGL6 and TM3 subfamilies. These monophyletic groups
comprise angiosperm as well as conifer homologues. Phylogenetic analysis of the
plant MADS-box gene family reveals that within subfamilies, Ephedra genes always
form subclades with other gymnosperm genes to the exclusion of all angiosperm
genes. These results suggest that gnetophytes are more closely related to conifers than
to angiosperms.
TABLE OF CONTENTS
Résumé
Table of contents
List of figures
Acknowledgments
Introduction
  Origin of angiosperms
  Relationships of angiosperms to other seed plants
  Developmental genes as a new class of evidence
  The ABC of floral development
  The MADS-box gene family of transcription factors
  The plant MADS-box genes
  Evolution of the plant MADS-box gene family
  MADS-box genes and the origins of flowering plants
Material & methods
  Plant material and total RNA extraction
  Isolation of cDNAs
  Phylogenetic analysis
Results
  Cloning of MADS-box gene cDNA from Ephedra andina
  Structural evaluation of EAM1-5
  Phylogenetic analysis
LIST OF FIGURES
Figure 1A. EAM1 cDNA and deduced amino-acid sequences
Figure 1B. EAM2 cDNA and deduced amino-acid sequences
Figure 1C. EAM3 cDNA and deduced amino-acid sequences
Figure 1D. EAM4 cDNA and deduced amino-acid sequences
Figure 1E. EAM5 cDNA and deduced amino-acid sequences
Figure 2. Sequence alignment of several plant MADS-box genes illustrating the organization of the conserved domains
Figure 3. Phylogenetic tree of a subset of the plant MADS-box protein family inferred using the neighbor-joining method
Figure 4A. Phylogenetic tree of the AG subfamily inferred using the neighbor-joining method
Figure 4B. Phylogenetic tree of the AGL6 subfamily inferred using the neighbor-joining method
Figure 4C. Phylogenetic tree of the TM3 subfamily inferred using the neighbor-joining method
First and foremost, I wish to thank my supervisors, Graham Bell and Thomas
Bureau, for their support, trust and guidance. You initiated me in the study of
evolution, and most importantly you helped me to find my own niche in the world of
biological sciences. You gave me a chance and I am indebted to you for that.
There are many other people without whom this project would have been
impossible. Thanks to the members of my supervisory committee, Daniel Schoen and
Leslie Sieburth, for critical evaluation of my research proposal; Denis Barabé and the
Jardin Botanique de Montréal, who permitted me to collect Ephedra andina samples;
Janet George for support and technical advice; and Chris Olive for helping me with the
computer work. I also wish to thank my colleagues and friends from the labs: Hien,
Monique, Rees, Ruying, Sujatha, Vincent and Zhihui a.k.a. Lily. Without you, life at
McGill would never have been so exciting. Special thanks to Stephen for helpful
discussions about evolution.
Thank you to Mom, Dad and the whole familia: I love you!
There is also my little diaspora: Jeff, Yannick, Vincent, Jean-Luc and Candice.
Each in your own way, you helped me, encouraged me and criticized me, even though
some of you were at the other end of the world. Thank you very much. Higher, bigger,
farther, crazier!
Origin of angiosperms
The angiosperms are the most diverse group of plants, comprising about
300,000 extant species. According to the fossil record, angiosperms seem to have
appeared and diversified rather suddenly during the Early Cretaceous, between 130
and 90 million years ago (MYA) (Wolfe et al., 1975; Doyle & Hickey, 1976; Hickey
& Doyle, 1977). Despite the large number of taxa described from early in this
diversification, no direct evidence of the phylogenetic origin of the group has been
found. The contentious question of the evolutionary origin of angiosperms has thus
fascinated plant evolutionary biologists for more than a century.
There are three reasons for the difficulty of understanding the origin and early
evolution of angiosperms. First, the number of available early angiosperm fossils
is very small. Secondly, the relationships among extant angiosperms are uncertain,
especially at the base of the clade. Thirdly, there is a very large morphological gap
between angiosperms and their putative gymnosperm relatives, extant or fossil. These
uncertainties about almost every aspect of angiosperm evolutionary history leave
unanswered basic questions concerning the ancestral floral structure, the closest extant
relative of flowering plants, and the relationships of modern forms to their Paleozoic
and Mesozoic ancestors.
Relationships of angiosperms to other seed plants
The angiosperms are usually considered to be a monophyletic group closely
related to gymnosperms (Crane, 1985 and references therein). The name angiosperm
(Greek angeion, vessel or little case, and sperma, seed) emphasizes that seeds develop
within an enclosed ovary formed by the fusion of the carpels. The morphologically
heterogeneous gymnosperm group is composed of four groups of uncertain phylogenetic
relationship: conifers, cycads, Ginkgo biloba and Gnetales (also called gnetophytes).
Gymnosperms (Greek gymnos, naked, and sperma, seed) produce unisexual cones
that are structurally very different from flowers. Female cones produce "naked seeds"
which develop from ovules borne on an ovuliferous scale. Angiosperms and
gymnosperms differ in a wide range of characters, most notably in their reproductive
structures. These differences make structural homologies between the two groups
very difficult to assess. For that reason, the phylogenetic relationship between
angiosperms and gymnosperms is uncertain and the closest gymnosperm relative or
ancestor of angiosperms is still to be identified. However, several hypotheses
concerning the relationship of angiosperms to other seed plants have been suggested
during the last century.
In the early 1900s, Wettstein (1907) proposed that angiosperms were derived
from Gnetales, which include the three dissimilar genera Ephedra, Gnetum and
Welwitschia. The gnetophytes are unique among the gymnosperms because they have
features otherwise restricted to angiosperms: presence of water-conducting vessels in
the wood, reproductive structures composed of flower-like units arranged in whorls,
and seeds with a micropylar tube. Recently, it has been shown that the Gnetales also
have a double fertilization process (Friedman, 1990, 1992; Carmichael & Friedman,
1996). Wettstein based his hypothesis on structural similarities between the
compound strobili of Gnetales and the inflorescence of Amentiferae. A competing
theory, proposed by Arber & Parkin (1907), was that angiosperms and the Mesozoic
order Bennettitales were derived from a common ancestor, both groups having
flower-like structures. They also thought that Gnetales were related to angiosperms
(Arber & Parkin, 1908), but they interpreted their "flowers" as reduced, rather than
primitively simple as argued by Wettstein.
In subsequent decades, there was a movement against the idea that Gnetales
were related to angiosperms, and the morphological differences between these two
groups were stressed. For example, Bailey (1944) argued that the water-conducting
vessels in the two groups, previously considered to be evidence for their relationship,
originated from different kinds of tracheids. He also showed that the Gnetales have
conifer-like primary xylem. The angiosperms have also been associated with the
Mesozoic seed ferns such as the Triassic genus Caytonia (Gaussen, 1946) or with the
Permian glossopterids (Stebbins, 1974).
In the mid-eighties, cladistic analysis based on morphological data was used to
study the angiosperm-gymnosperm relationship. All the studies agreed that Gnetales
were the closest extant relatives of angiosperms (Crane, 1985; Doyle & Donoghue,
1986; Loconte & Stevenson, 1990; Doyle & Donoghue, 1992; Doyle et al., 1994;
Nixon et al., 1994; Rothwell & Serbet, 1994; Doyle, 1996; Hickey & Taylor, 1996).
The resulting group comprising angiosperms, Gnetales, and Bennettitales (extinct),
and in some studies Pentoxylales (extinct) also, was named "anthophyte" to
emphasize the concentric whorled arrangement of the reproductive units of its
members (Doyle & Donoghue, 1986). However, the studies differ in how Gnetales
and angiosperms are related. In some studies Gnetales and angiosperms are both
monophyletic (Crane, 1985; Doyle & Donoghue, 1986; Loconte & Stevenson, 1990;
Doyle & Donoghue, 1992; Doyle et al., 1994; Rothwell & Serbet, 1994; Doyle,
1996), whereas in others the angiosperms are nested within the Gnetales (Nixon et al.,
1994; Hickey & Taylor, 1996).
In the 1990s, molecular studies addressing the question of the angiosperm-gymnosperm
relationship have led to conflicting conclusions. In general,
molecular analyses agree that Gnetales and angiosperms are both monophyletic, but
disagree on their relationship. Some studies based on sequences of 18S and 26S
rRNA (Hamby & Zimmer, 1992; Doyle et al., 1994), 28S rRNA (Stefanovic et al.,
1998) and rbcL (Chase et al., 1993) have supported Gnetales as the sister group of
angiosperms, whereas other studies based on rbcL (Hasebe et al., 1992; Albert et al.,
1994), chloroplast intergenic transcribed spacers (cpITS) (Goremykin et al., 1996) and
18S rRNA (Chaw et al., 1997) concluded that they were not closely related.
Developmental genes as a new class of evidence
How can the question of the origin of angiosperms be resolved? Molecular
studies are contradictory and statistical support for any hypothesis is weak. In
comparison, phylogenetic trees based on morphology all tend to support the
anthophyte theory, although the resolution of the clade is poor and the relationship of
anthophytes to other seed plants (extant or fossil) is unclear. Uncertainties in seed
plant relationships and in the homology of reproductive structures suggest that the
structural and developmental similarities between reproductive structures of
angiosperms and other groups must be clarified. Most current molecular phylogenetic
studies have been done with slowly evolving genes such as chloroplast (rbcL) and
ribosomal RNA genes (Lewin, 1996). Such slowly evolving genes resolve much of
the rapid early evolution of angiosperms only poorly. Molecular markers such as the
master developmental genes can provide new phylogenetic information, since genes
that control the development of organs are believed to also play important roles during
the evolution of these organs (Doebley, 1993).
The ABC of floral development
The diversity of angiosperm flowers is truly astounding. The hundreds of
thousands of different flowering plant species bear flowers covering an enormous
range of form, size, color, and odour, to name only the most familiar traits. The
comparative study of the development of floral structures might help to understand
better the evolutionary diversification of seed plants.
Floral development is a complex process controlled by both genetic and
environmental factors (Steeves & Sussex, 1989). During floral induction the shoot
apical meristem is converted to an inflorescence meristem that will give rise to one or
several floral meristems. Each of these floral meristems will then differentiate into a
set of floral organ primordia from which the different organs will develop. In
Arabidopsis thaliana, which is a model system used to study the genetic control of
floral development, flowers are composed of four concentric whorls occupied by
different organs. The innermost whorl contains carpels that are surrounded by
stamens, further surrounded by a sterile perianth composed of petals and sepals.
Although this organ arrangement is similar among all angiosperms, the morphology
and number of organs in each whorl vary enormously between taxa.
Several floral homeotic genes have been identified in Arabidopsis (Coen &
Meyerowitz, 1991; Weigel & Meyerowitz, 1994). These genes fall into one or more
of three different classes: organ identity, cadastral or floral meristem identity genes.
Organ identity genes define the organs of the flower, cadastral genes spatially regulate
organ identity gene expression, and meristem identity genes specify the floral
meristem and induce organ identity genes. Floral organ identity genes can be further
subdivided into three classes of organ identity functions: A, B and C. The ABC
model of floral development explains how the fates of floral organ primordia are
determined (Bowman et al., 1991b; Coen & Meyerowitz, 1991; Meyerowitz et al.,
1991; reviewed in Weigel & Meyerowitz, 1994 and Riechmann & Meyerowitz,
1997). This genetic model states that the interaction between homeotic genes of A, B
or C genetic activity specifies whether a particular region of the floral meristem will
develop into sepals, petals, stamens or carpels. A genes alone give rise to sepals,
A+B genes to petals, B+C genes to stamens and C genes alone to carpels. In
addition to its combinatorial basis, this model has two other basic features. First,
organ identity activities are independent of their position in the floral primordium.
Second, A and C functions are antagonistic: if one of the activities is absent, the
domain of the other expands to occupy the entire floral primordium. In Arabidopsis,
class A genes are APETALA1 (AP1) and APETALA2 (AP2) (Bowman et al., 1989,
1993; Kunst et al., 1989; Irish & Sussex, 1990; Meyerowitz et al., 1989, 1991;
Mandel et al., 1992b; Okamuro et al., 1993; Gustafson-Brown, 1994; Jofuku et al.,
1994), class B genes include APETALA3 (AP3) and PISTILLATA (PI) (Bowman et al.,
1989, 1991b; Hill & Lord, 1989; Meyerowitz et al., 1989, 1991; Jack et al., 1992;
Goto & Meyerowitz, 1994) and the only known class C gene is AGAMOUS (AG)
(Bowman et al., 1989; Yanofsky et al., 1990). All of these genes have been cloned
(Yanofsky et al., 1990; Jack et al., 1992; Mandel et al., 1992b; Goto & Meyerowitz,
1994; Jofuku et al., 1994). AP1, AP3, PI and AG belong to the MADS-domain family
of DNA-binding proteins, whereas AP2 belongs to another family of DNA-binding
proteins (Weigel, 1995; Okamuro et al., 1997).
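The combinatorial logic of the ABC model can be illustrated with a short sketch. It is only a toy model: the function and variable names are hypothetical, and the wild-type activity domains (A in whorls 1 and 2, B in whorls 2 and 3, C in whorls 3 and 4) are the usual textbook pattern, assumed here rather than taken from the passage above. The A/C antagonism is modelled by letting the remaining activity spread to all four whorls when the other is lost.

```python
# Toy model of the ABC rules; names and whorl domains are illustrative assumptions.
ORGAN = {
    frozenset("A"): "sepal",
    frozenset("AB"): "petal",
    frozenset("BC"): "stamen",
    frozenset("C"): "carpel",
}

# Assumed wild-type activities in whorls 1 to 4 (outermost to innermost).
WILD_TYPE = [{"A"}, {"A", "B"}, {"B", "C"}, {"C"}]


def flower(lost=()):
    """Predict the organ in each whorl when the activities in `lost` are absent."""
    lost = set(lost)
    organs = []
    for whorl in WILD_TYPE:
        active = set(whorl)
        # A and C are antagonistic: losing one lets the other expand everywhere.
        if "A" in lost:
            active.add("C")
        if "C" in lost:
            active.add("A")
        active -= lost
        organs.append(ORGAN.get(frozenset(active), "undetermined"))
    return organs


print(flower())        # wild type
print(flower({"A"}))   # loss of A activity
print(flower({"B"}))   # loss of B activity
print(flower({"C"}))   # loss of C activity
```

Under these assumptions, loss of C activity replaces stamens and carpels with petals and sepals, which is the pattern the antagonism rule is meant to capture.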
The MADS-box gene family of transcription factors
The MADS-box gene family encodes proteins characterized by the presence of
the MADS-domain, a conserved DNA-binding/dimerization region 56 amino acids
in length. Members of this multigene family were first identified in yeast and in
vertebrates, where they encode transcription factors (Dubois et al., 1987; Norman et
al., 1988; Passmore et al., 1988). This class of transcription factors is presumably
very ancient, given the variety of kingdoms (plants, animals and fungi) in which
they are found. Like other eukaryotic transcription factors, the MADS-domain
proteins have a modular structure that includes DNA-binding and dimerization
domains. Sequence similarity between different kingdoms is limited to the
MADS-domain (Shore & Sharrocks, 1995). The name "MADS" is an acronym of the four
founding members of the family: MCM1 (from the yeast Saccharomyces cerevisiae),
AGAMOUS (from Arabidopsis thaliana), DEFICIENS (from Antirrhinum majus) and
SRF (from human) (Schwarz-Sommer et al., 1990). The biological functions of
MADS-domain proteins are quite diverse, reflecting the variety of genetic
backgrounds where they are expressed. They are involved in mating type
determination, pheromone response and arginine metabolism in yeast, growth factor
response and muscle development in vertebrates and insects, and inflorescence and
flower development in higher plants (reviewed in Shore & Sharrocks, 1995). In
general, MADS-domain proteins are mainly involved in regulating aspects of cell
differentiation and developmental processes.
The plant MADS-box genes
In vascular plants, the MADS-box genes represent a large multigene family.
Since the first members of the plant MADS family were identified in the model
systems Arabidopsis thaliana (AGAMOUS; Yanofsky et al., 1990) and Antirrhinum
majus (snapdragon) (DEFICIENS; Sommer et al., 1990), more than 300 MADS genes
have been cloned from various monocots and eudicots such as rice (Chung et al.,
1994, 1995; Kang et al., 1995, 1998), wheat (Murai et al., 1997), sorghum (Greco et
al., 1997), maize (Schmidt et al., 1993; Mena et al., 1995; Theißen et al., 1995),
Brassica (Mandel et al., 1992a; Heck et al., 1995; Kempin et al., 1995), tomato
(Pnueli et al., 1991), tobacco (Hansen et al., 1993; Kempin et al., 1993; Mandel et al.,
1994), and petunia (Angenent et al., 1992, 1993, 1995), as well as from the lower
angiosperm clades Magnoliales and Piperales (Kramer et al., 1998), conifers (Tandre
et al., 1995; Mouradov et al., 1998; Rutledge et al., 1998), Gnetales (Winter et al.,
1999) and ferns (Münster et al., 1997; Hasebe et al., 1998). Many angiosperm genes
of the MADS family are implicated in crucial steps of floral development.
However, the roles of MADS-box genes are not restricted to floral homeotic
functions. As suggested by their expression patterns, they might have other functions
in vegetative growth, root and fruit development, and embryogenesis (Ma et al., 1991;
Pnueli et al., 1991; Rounsley et al., 1995; Heck et al., 1995; Huang et al., 1995;
Carmona et al., 1998; Gu et al., 1998; Yao et al., 1999).
All known plant MADS-domain proteins display a very similar modular
organization with separate functional domains and regions including the
MADS-domain, the I region, the K-domain, and the N- and C-terminal regions (Ma et al.,
1991). The MADS-domain is the most highly conserved domain of the protein
(Purugganan et al., 1995). It contains amino acids that are involved in contact with
DNA as well as in dimerization (Pellegrini et al., 1995). Plant MADS-domain
proteins bind to DNA as dimers and recognize A+T-rich sequences called
CArG-boxes, whose consensus is CC(A/T)6GG (Schwarz-Sommer et al., 1992; Tröbner et
al., 1992; Huang et al., 1993, 1995, 1996; Shiraishi et al., 1993; Davies et al., 1996;
Mizukami et al., 1996; Riechmann et al., 1996a, 1996b). In vascular plants, the
MADS-domain is located at the amino-terminus of the protein except for AGAMOUS
and its closest relatives, which have an amino-terminal extension (N-terminal region)
variable in length and sequence (Riechmann & Meyerowitz, 1997). The function of
this amino-terminal extension is unknown. The K-box encodes a 66-amino-acid
domain which shows low but significant similarity to the coiled-coil domain of
keratin, hence its name (Ma et al., 1991). This region can potentially form two
amphipathic α-helices which are involved in dimerization through interaction between
the K-domains of different proteins (Ma et al., 1991; Pnueli et al., 1991; Davies &
Schwarz-Sommer, 1994). Between the MADS- and the K-domain is the I region (for
intervening or inter-domain; it has also been referred to as L, for linker). This segment
is variable in sequence and length (Purugganan et al., 1995). It is an essential part of
the minimal DNA-binding domain (Riechmann et al., 1996a). The C-terminal region
is the least conserved part of the protein, both in sequence and length (Purugganan et
al., 1995). It has been shown to be an important region (Kempin et al., 1995) even if
it does not always contribute to the functional specificity of the protein (Krizek &
Meyerowitz, 1996). One possible role of the C-terminal region could be to act in
transcriptional activation (Riechmann & Meyerowitz, 1997). The domains and
regions presented here are somewhat preliminary, because structural information is
not currently available for plant MADS proteins. However, these divisions broadly
correlate with the intron-exon structure of MADS-box genes (Riechmann &
Meyerowitz, 1997).
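As an illustration of the CArG-box consensus, CC(A/T)6GG (two cytosines, six A or T residues, two guanines), the following minimal sketch scans a DNA string for matching sites. The function name and the test sequence are hypothetical and serve only as an example; real binding-site searches usually tolerate mismatches, which this sketch does not.

```python
import re

# CC, then six A/T residues, then GG; the lookahead allows overlapping hits.
CARG_BOX = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(sequence):
    """Return (start, site) pairs for every exact CArG-box match in `sequence`."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in CARG_BOX.finditer(sequence)]


# Made-up promoter fragment used purely for demonstration.
print(find_carg_boxes("GGCCATATATGGTT"))
print(find_carg_boxes("CCAAGTTTGG"))  # a G inside the A/T core, so no match
```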
Evolution of the plant MADS-box gene family
The fact that MADS-domain proteins show high levels of sequence similarity
and a similar organization of domains suggests that these genes share a common
evolutionary ancestor. Analyses of the molecular evolution of the multigene
family have led to several relevant conclusions. In phylogenetic trees, most members
of the gene family are grouped in distinct subfamilies or monophyletic gene groups
(Doyle, 1994; Purugganan et al., 1995; Theißen et al., 1996). These subfamilies
contain highly homologous genes of a given species as well as orthologues from other
species. Members of the same monophyletic group tend to have, in addition to
sequence similarities, related functions and expression patterns (Theißen et al., 1996).
Finally, gene duplication and sequence diversification are the mechanisms most
commonly implicated in generating new genes, whereas exon shuffling does not seem
to have played a major role (Tandre et al., 1995; Theißen et al., 1996).
Molecular clock estimation suggests that the establishment of the floral
homeotic gene lineages predates the appearance of flowering plants and took place in
a relatively short span of time 340 MYA (Purugganan et al., 1995). This was
demonstrated by the cloning of C-type AGAMOUS orthologues from conifers (Tandre
et al., 1995; Mouradov et al., 1998; Rutledge et al., 1998). These orthologues have
been shown to be functional homologues of the Arabidopsis AGAMOUS gene
(Rutledge et al., 1998; Tandre et al., 1998). Moreover, TM3-, AGL2- and AGL6-like
MADS-box genes have also been found in conifers (Tandre et al., 1995; Mouradov et
al., 1998). These results indicate that the MADS-box multigene family known from
flowering plants is at least an ancestral character of seed plants and not a novelty of
the angiosperms. They also suggest that some of the genetic pathways controlling the
development of the reproductive structures of conifers and angiosperms have a
common origin and were already established before the divergence of conifers and
angiosperms during the Carboniferous, approximately 285-350 MYA (Beck, 1988;
Martin et al., 1993; Stewart & Rothwell, 1993; Savard et al., 1994). It is noteworthy
that no homologues of genes with A or B genetic activity have been found in conifers.
A MADS-box multigene family also exists in ferns, but no MADS-box floral
homeotic gene orthologues have been identified to date (Münster et al., 1997; Hasebe
et al., 1998). All the fern MADS genes cloned belong to phylogenetically distinct
subfamilies. According to Münster et al. (1997), this finding may be explained by the
absence of seed plant-specific structures in ferns. Ferns and seed plants are thought to
have diverged about 400 MYA during the mid-Paleozoic (Stewart & Rothwell, 1993).
Collectively, these data suggest that some floral homeotic gene clades were
established in the time interval between the divergence of ferns and the radiation
of seed plants (Middle Devonian to Early Carboniferous) (Münster et al., 1997).
MADS-box genes and the origins of flowering plants
Seed plant evolution has been characterized by extensive divergence in the
morphology of reproductive structures. Today, the angiosperm flower and the
gymnosperm cone are morphologically very different despite the fact that they
perform the same basic function. It is a common theme in comparative biology that
development and evolution are related phenomena: master developmental genes that
play key roles in patterning body structures are also important in the evolution of these
structures. Given the crucial role of MADS proteins in flower and cone development,
the evolution of the plant MADS gene family needs to be compared with the
morphological diversification of angiosperm and gymnosperm reproductive structures.
In this respect, gnetophytes play a central role in the understanding of seed plant
evolution and the origin of flowers.
Here, I present the characterization of MADS-box floral homeotic gene
homologues from the Gnetales species Ephedra andina. The objectives of this study are
as follows: first, to test the anthophyte theory of the origin of flowers; secondly, to
investigate further the phylogenetic relationships among spermatophytes; finally, to
present additional data concerning the evolution of development of reproductive
structures in seed plants.
Plant material and total RNA extraction
Cones were collected in April and May from male and female Ephedra andina
(Ephedra americana var. andina Stapf) plants growing at the Jardin Botanique de
Montréal.
Total RNA extraction was performed using a guanidine thiocyanate method
modified from Lessard et al. (1997). Plant material was ground to a fine powder in
liquid nitrogen and mixed with 10 volumes (w/v) of extraction buffer I (5,3 M
guanidine thiocyanate, 4% PVP-40, 25 mM Tris-HCl pH 8,0,1% β-mercaptoethanol).
Nucleic acids were then isopropanol precipitated and resuspended in 10 ml of
extraction buffer II (50 mM Tris-HCl pH 8, 10 mM EDTA, 100 mM NaCl, 0,2%
SDS). After one extraction with phenol:chloroform:isoamyl alcohol (25:24:1) and
one with chloroform:isoamyl alcohol (24:1), nucleic acids were isopropanol
precipitated. The nucleic acid pellet was then air-dried and resuspended in 0,5 ml of
DEPC-H2O. Total RNA was precipitated overnight at 4°C with LiCl (2,5 M final
concentration). Traces of gDNA were eliminated by treatment with DNase I
(GibcoBRL).
Isolation of cDNAs
Partial MADS-box cDNA sequences were obtained using a 3'-RACE
approach. RT-PCR was performed using 1 µg of male or female cone total RNA as
template. First-strand cDNA was synthesized using SuperScript II reverse
transcriptase (Gibco/BRL) and the adapter primer 5'-GACCACGCGTATCGATGTCGACT[...].
The reaction mix was then treated with RNase H (Boehringer-Mannheim).
PCR was performed using 1/20th volume of the RT-PCR reaction, AmpliTaq
Gold DNA polymerase (0,025 U/µl) (PerkinElmer), a MADS-box specific primer and an
anchor primer (200 nM each), dNTP (200 µM) and MgCl2 (1,5 mM). The MADS-box
specific primer, of sequence 5'-CGICARGTIACITTCTSIAARCG-3', targeted
the highly conserved amino-acid sequence RQVTFSKR. The anchor primer was
derived from the adapter primer used in the RT-PCR reaction and had the sequence
5'-GACATGCCGTTATCAGTCATTAACGG-3'. The PCR reaction parameters were: 3
min at 94°C, 40 cycles of 30 s at 94°C, 30 s at 60°C and 1 min at 72°C,
followed by a final extension of 7 min at 72°C.
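To see how the degenerate MADS-box primer relates to the target peptide RQVTFSKR, the sketch below expands each codon of the primer and lists the amino acids it can encode. Two assumptions are made: inosine (I) is treated as pairing with any base, like the IUPAC code N, and the standard genetic code is used. The function and constant names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes used by the primer; inosine (I) is treated as N (assumption).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT", "I": "ACGT",
}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def encoded_residues(primer):
    """For each complete codon of a degenerate primer, return the set of residues it can encode."""
    residues = []
    for pos in range(0, len(primer) - len(primer) % 3, 3):
        codon = primer[pos:pos + 3]
        expansions = product(*(IUPAC[base] for base in codon))
        residues.append({CODON_TABLE["".join(bases)] for bases in expansions})
    return residues


PRIMER = "CGICARGTIACITTCTSIAARCG"
for codon_residues in encoded_residues(PRIMER):
    print(sorted(codon_residues))
```

Under these assumptions the first five codons and the seventh each encode a single residue (R, Q, V, T, F and K), while the sixth codon, TSI, covers serine (TCN) as well as the TGN codons. The two trailing bases, CG, are the first two positions of an arginine codon (CGN), which is consistent with the final R of the target peptide.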
PCR products were cloned in the vector pCR2.1 of the TA cloning kit
(Invitrogen). Clones were then analyzed by digestion with EcoRI and inserts between
0,5 kb and 1,5 kb were sequenced. Sequencing of both strands was done with the
SequiTherm EXCEL II Long-Read DNA sequencing kit (Epicentre) and the
IRD-labeled M13 reverse and forward primers (LI-COR).
Phylogenetic analysis
Alignment of the putative amino-acid sequences of 260 complete or near
complete MADS-box genes available through GenBank was performed with the
multiple sequence alignment program CLUSTALW version 1.8 (Thompson et al., 1994)
and then refined manually (see appendix 1). A gap opening penalty of 10 and a gap
extension penalty of 6 were used. Subsets of this alignment were created to construct
phylogenetic trees of the protein family and of the AG, AGL6 and TM3 subfamilies.
The alignments were further modified for the phylogenetic analysis by encoding gaps
as a single character to avoid over-weighting (based on gap length) of these inferred
evolutionary events.
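One way to picture the gap recoding described above is sketched below. The exact scheme used for the thesis alignments is not specified, so this is an assumption: the first position of each gap run is kept as a gap character and the remaining positions are marked as missing data, so that a long indel counts as a single event. The function name and the example sequence are hypothetical.

```python
import re


def recode_gaps(seq, gap="-", missing="?"):
    """Keep the first position of each gap run; mark the rest as missing data.

    Assumes `gap` and `missing` are single characters, so the aligned
    length of the sequence is unchanged.
    """
    return re.sub(
        re.escape(gap) + "+",
        lambda m: gap + missing * (len(m.group()) - 1),
        seq,
    )


example = "RQVT---FSK-R"  # made-up aligned fragment
recoded = recode_gaps(example)
print(recoded)
```

With this recoding, a three-residue gap and a one-residue gap each contribute a single gap character to any downstream comparison.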
Phylogenetic trees were constructed based only on the MIK-domains of
MADS proteins because the C-region is too poorly conserved to be properly aligned.
Evolutionary distances were estimated under the PAM model of amino-acid
substitution (Dayhoff, 1979) using the program PROTDIST as implemented in the
Phylogeny Inference Package (PHYLIP) version 3.57c (Felsenstein, 1993). Trees were
generated with the neighbor-joining algorithm (Saitou & Nei, 1987) using the
program NEIGHBOR of the PHYLIP package. In each case, a random addition of
sequences was performed. The subfamily trees were rooted with a single MADS-
domain protein sequence as outgroup: the AG tree was rooted with TM3, the AGL6 tree
with AGL2 and the TM3 tree with AGL13. The bootstrap values in the consensus trees were
obtained from 1000 replicate bootstrapping runs.
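The bootstrap procedure can be illustrated with a small sketch. It is not the PHYLIP implementation: a simple proportion of differing sites stands in for the PAM distance, a three-sequence comparison stands in for a full neighbor-joining tree, and the sequences and function names are invented for illustration.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    names = list(alignment)
    length = len(alignment[names[0]])
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


def p_distance(a, b):
    """Fraction of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def support_for_pair(alignment, a, b, other, n_replicates=1000, seed=0):
    """Percentage of replicates in which `a` is closer to `b` than to `other`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_replicates):
        rep = bootstrap_replicate(alignment, rng)
        if p_distance(rep[a], rep[b]) < p_distance(rep[a], rep[other]):
            hits += 1
    return 100.0 * hits / n_replicates


# Made-up toy alignment (not thesis data).
toy = {"seqA": "RQVTFSKRRN", "seqB": "RQVTFSKRRS", "seqC": "RQVTYAKRRN"}
support = support_for_pair(toy, "seqA", "seqB", "seqC")
print(f"bootstrap support for (seqA, seqB): {support:.1f}%")
```

The percentage reported plays the same role as the bootstrap values on the tree nodes: it measures how often a grouping is recovered when the alignment columns are resampled.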
Cloning of MADS-box gene cDNAs from Ephedra andina
A total of 129 cDNAs were cloned from Ephedra cones using a 3'-RACE approach. Twenty-seven
positive clones, sequenced over their entire length on both strands, were classified into
5 groups comprising 4, 8, 11, 2 and 2 sequences respectively. These groups were
found to represent 5 MADS-box genes named EAM1-5 for Ephedra andina MADS-box
gene (figure 1 A-E). Each gene has been cloned at least once from both male
and female cones. Every EAM cDNA is slightly truncated at the 5' end because of the
cloning procedure used (see Material & methods). They all contain a single large
open reading frame that encodes a putative MADS-domain protein, a 3'-untranslated
region and a poly(A) tail. It is not known whether EAM1-5 are single-copy genes since
Southern blot analysis has not been performed successfully.
Structural evaluation of EAM1-5
The alignment of EAM putative amino-acid sequences along with closely
related MADS-domain proteins from gnetophytes, conifers and angiosperms reveals
structural as well as sequence similarities (figure 2). EAM proteins display the
domain/region organization characteristic of plant MADS-domain proteins. The
MADS-domain of EAM1-5, although incomplete, shows a high level of sequence
conservation. This suggests that, as in other plant MADS proteins, this domain is
subject to strong selective pressure in Ephedra. The I-region of EAMs is variable in
length (29-35 amino-acids) and in sequence. The K-domain of Ephedra MADS
proteins, as defined by alignment with a region of the human type II keratin protein
(Krr) (Tyner et al., 1985; Ma et al., 1991), comprises 66 amino-acids. Although this
domain is generally variable in amino-acid sequence, it exhibits hydrophobic residues
at conserved positions (Ma et al., 1991). EAM proteins display the hydrophobic motif
characteristic of this region. Therefore, the capacity that angiosperm MADS proteins
have to form dimers through this region seems to be a property of Ephedra MADS
proteins as well. The C-region is, as expected, the least conserved part of the proteins.
EAM C-regions vary in length from 57 to 88 amino-acids.
Some Ephedra MADS-domain proteins display high sequence similarity to
other plant MADS proteins (figure 2). Over its entire length, EAM2 displays high
amino-acid sequence similarity with AG-like proteins. It is 83% similar to GGM3
from the gnetophyte Gnetum gnemon and 79% to DAL2 from the conifer Picea abies.
EAM2 exhibits 54% similarity with AG, which is fairly high for such distantly related
proteins. EAM3 is most similar to GGM11 from Gnetum gnemon, with 70% sequence
identity, and to other AGL6-like proteins such as DAL1 (48%) and AGL6 itself (47%).
EAM4 shows similarity with proteins of the TM3 group. It is 67% identical to GGM1,
57% to DAL3 and 47% to TM3. EAM5 is only slightly similar to GGM12, with 41%
amino-acid identity (mainly in the MADS and K-domains), and EAM1 shows very low
sequence similarity with GGM1, 6 and 8 (17%, 22% and 22% respectively).
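Percent identity values of this kind are computed from pairs of aligned sequences. The sketch below shows one common convention, assumed here because the thesis does not state which denominator was used: identical residues are counted over the columns in which neither sequence has a gap. The function name and the two aligned fragments are made up for illustration.

```python
def percent_identity(a, b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments (not thesis data): 12 comparable columns, 6 matches.
frag_1 = "RGRLYEFANN-SVK"
frag_2 = "RGKLYEFGS-AGTL"
identity = percent_identity(frag_1, frag_2)
print(f"{identity:.1f}% identity")
```

Other conventions (for example dividing by the full alignment length or by the shorter sequence) give different percentages, so values are only comparable when the same convention is used throughout.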
Phylogenetic analysis
To infer evolutionary relationships between plant MADS-domain proteins, a
phylogenetic analysis was done using the neighbor-joining method (figure 3). For
convenience, the tree presented here includes only a subset of the plant MADS-
domain protein family. However, the alignment on which the tree is based includes
260 complete or near complete plant MADS sequences available through GenBank
(appendix 1). The relevant topology of this tree is identical to the one that contains all
known MADS-domain proteins.
Some EAM proteins are part of spermatophyte MADS-protein subfamilies
while others do not belong to any known subfamily (figure 3). EAM2 belongs to the
AGAMOUS subfamily. This subfamily includes floral homeotic genes of C genetic
activity such as AG and OsMADS3 from Arabidopsis and Oryza sativa as well as
genes expressed in developing ovules like FBP11 from Petunia hybrida and AGL11
from Arabidopsis (Yanofsky et al., 1990; Angenent et al., 1995; Rounsley et al.,
1995). This grouping is very well supported by bootstrap analysis with a score of
99% over 1000 replicates. EAM3 is part of the AGL6 clade. Bootstrap support for
this relationship is 84%. AGL6-like genes such as ZAG3 from Zea mays and GGM11
from Gnetum gnemon are expressed in reproductive structures whereas other genes
like DAL1 and PrMADS3 from conifers are expressed in reproductive as well as
vegetative structures (Mena et al., 1995; Tandre et al., 1995; Mouradov et al., 1998;
Winter et al., 1999). EAM4 clusters within the TM3 subfamily. This relationship
shows moderate bootstrap support (68%). All TM3-like genes are expressed in
vegetative and in reproductive structures except for the Arabidopsis AGL14 gene,
whose transcript is only found in roots (Rounsley et al., 1995). EAM1 forms a
monophyletic group with GGM1, 6 and 8 from Gnetum gnemon. However, this
relationship is not supported by bootstrap analysis (score of 16%). EAM5 is
more closely related to GGM12 and to some extent to GGM5, both from Gnetum
gnemon. Bootstrap support for the EAM5/GGM12 group is 92% whereas monophyly
of the GGM5/EAM5/GGM12 clade is only weakly supported (21%). The
relationships of EAM1 and EAM5/GGM12 with these Gnetum gnemon genes should
therefore be considered unresolved. These genes probably represent the first members of
new subfamilies containing other genes yet to be identified. Finally, it is noteworthy
that genes related to floral homeotic genes of A or B genetic activities have not been
identified in Ephedra.
Because EAM2-4 fall within well characterized plant MADS-box subfamilies,
they can be considered homologous to the other spermatophyte genes belonging to
these clades. However, orthology or paralogy of EAM2-4 cannot be inferred since no
information concerning their copy number has been obtained (it is usually assumed
that single homologous loci from different species are orthologous). Furthermore, it is
not clear whether EAM2-4 are functional homologues of proteins in their respective
clades. As proposed above, a single subfamily may include members that have
different functions, as suggested by their expression patterns. Expression patterns of
EAM1-5 are unknown.
Relationships at the subclade level
From the phylogenetic tree (figure 3), it is clear that in subfamilies where
genes from angiosperms, conifers and gnetophytes are available (AG, AGL6 and TM3
subfamilies), gnetophyte genes from Ephedra and Gnetum gnemon always form a
group closely related to conifer genes. Together, angiosperm genes form separate
clades. These relationships are strongly supported by the bootstrap values for the AG
and TM3 subfamilies. The relationship is less obvious in the case of the AGL6 clade,
where two Gnetum gnemon genes form sister clades with conifer (Picea abies) and
Ephedra genes. This probably reflects a case of paralogy where only one of two loci
has been identified in Picea abies and Ephedra. Bootstrap supports for the
monophyly of gymnosperm and angiosperm groups are moderate (48 and 43%
respectively). To further evaluate the statistical significance of these results,
phylogenetic trees of single subfamilies were constructed using single genes from
other subfamilies as outgroups (figure 4A-C). Monophyly of the conifer/Gnetales
subclades in the AG, AGL6 and TM3 subfamilies is supported at 98%, 51% and 97%
respectively, whereas bootstrap support for the monophyly of the angiosperm subclades
is 45%, 97% and 81%. In a general way, monophyly is always favored but bootstrap
supports vary depending on the genes used to construct the phylogenetic tree.
CTCATCGTCTTCTCCACCACCGGAAAGCTCACCGAATGGGCGCGACCATGGGAT 120 L Z V F S T T G X L T E W A S D N M K D 40
ACTCTCAAGAAGTTCGAAGCCGTCTCTGGGATTGTTTCTTCGGACTATCAGCGCCCAG 180 T L K K F E A V S G I V S S D Y Q R Q Q 60
CTACGCCTGGAGATGGCTAGAATCGCTCGAGAAAATGAACAACTCATGGCCCAGATAAGG 240 L R L E M A R I A R E N E Q L M A Q I R 80
TACAGGAAAGGCGAGGACATTCAACACTTGACmCCGATCAGCTGGCACGCTTAGMGGG 300 Y R K G E D I Q H L T T D Q L A R L E G 100
GACCTGCAGAATGTTGTCACCGAAGTGCGAAAGAAAAAGTGTGATTTCCTGGAGCT 360 D L Q N V V T E V R K K K C D F L E K T 120
ACTGACCGCCTAAAGAAAAAGGTCGGTTACCAGGATGAAATCCGTATGGATAAGATAGAG 420 T D R L K K K V G Y Q D E I R M D K I E 140
AGGCTGGAGAGAAACAATGTGTACGTAGAGAAAGACTTGACTTGATGTCGTACTACTATCAGCAC 480 R L E R N N V Y V E K D L M S Y Y Y Q H 160
ATAAGCCAGAAACCTAACCCTGCGGGTGTGGGTGCTGCCTCAGTTTACCATCATCAGGTT 540 I S Q K P N P A G V G A A S V Y H H Q V 180
CAAGGAGAGGACCAAGCCCAAGCGGATCATTTGCCTATMCGCACAGTATCTGMGTTG 600 Q G E D Q A Q A D H L P I N A Q Y L K L 200
ACTGAAGCGAGTTCCTCGTATGCTCCTGGGTTTGCTAAGACCATTCACCAT 660 T E A S S S Y A P G F A K T I K end 216
ATCTTAGTGTGCAACACCATTATATTA 720
CATGGATTTCTGTCAGCTTACGAACATATATTCCTCCACTTTTGTTTTATTGGAATAAAT 780
GGTACTTCGTTTTGATGTTA 800
Figure 1A-E. EAM1-5 cDNA and deduced amino-acid sequences. The MADS-boxes are shown in boldface and the K-boxes are underlined. A "<" sign at the beginning of the sequences indicates that they are incomplete at the 5'-end or at the N terminus because of the cloning procedure used.
CTTGATCGTCTTCTCCAGCCGCGGCAGGCTCTACWTTCGCCAATAACAGCAGCGTGAA 120
L I V F S S R G R L Y E F A N N S S V K 40
ACGAACGATTGAAAGGTACAAGAAAACATGCGCTGACTCCATGGCATTGCTATCTC 180
R T I E R Y K K T C A D N N H G I A I S 60
CGAGTCAAATGCACAGTATTGGCAACAGGAGGCTGTAAAGCTGAAGCAACAGATAGAAGT 240
E S N A Q Y W Q Q E A V K L K Q Q I E V 80
TCTCAATAACCAATTCAGACACTACATGGGTGATAGCATTCAGTCCATGACTGTGA 300
L N N Q F R H Y M G D S I Q S M T V K E 100
GCTGAAGCAGCTGGAGGGAAGGCTAGAGAAAGGC 360
L K Q L E G R L E K G L G R V R A K R N 120
TGAAAGCCTTCTTGAGGAGATTGAGATTATGCAAAGGAGGGAGCATCCCTCATTCGA 420
E S L L E E I E I M Q R R E H H L I Q E 140
GAATGAATTCCTTCGTGCGAAGATAGCAGAATGCCAAAGCTGCCGCAGTCCCATGCTATGTT 480
N E F L R A K I A E C Q S S H H A N M L 160
GCCAGCACAAGAGTATGAGGCTCTGCCAGCACCCTACGACTCTAGAAACTTTATGCATGC 540
P A Q E Y E A L P A P Y D S R N F M H A 180
AAACCTGATAGAGGCAGCAGCTGCTCAGCATTATGCCCGTCGCAGACAGCTCTTCA 600
N L I E A A A A Q H Y A R Q E Q T A L Q 200
GCTTGGGTGGGTTTGAGTTGTTTGTAAATATTTTATTTTAAGGCAACTCAATAAAAAGACCATTT 660
L G W V end204
CTCATCATCTTCTCCAGCCGCGGMGCTCTACGAGTTCGGCAGCGCCGGCACGTTGG 120
ACACTGGAGCGCTATCAAAAATGCTCATATTCAATGCTCATATTCmTGCMGMGmTTCTTCAGACCGC 180
T L E R Y Q K C S Y S M Q E E N S S D R 60
GAGGCACAGAACTGGCATCATGAGGTCAGCAAACTAAAAGCAAAGGTTGAATTGCTGCAA 240
E A Q N W H H E V S K L K A K V E L L Q 80
CGCTCGC-4AAGGCACTTGATGGGAGAGGACCTTGGACCCCTGAGTATGGGAGCTACAG 300
R S Q R H L M G E D L G P L S I R E L Q 100
AACCTTGAAAGACAAATAGAGGCTGCATTGACACAAGTCAGAGCTAGGACACTTG 360
N L E R Q I E A A L T Q V R A R K T Q L 120
ATGCTAGATATGATGGAAGACCTMGGAGAGGC 420
M L D M M E D L R R K E R L L Q E I N K 140
TCATTGCGTAAAAAGCTCCAGGATGCAGAAGGGCAAGCTTATAATTCCATTCAAATTCCT 480
S L R K K L Q D A E G Q A Y N S I Q I P 160
CAAGAATGGAATTCAAATGCAATTGCAAACCCCTCAAACCATATTACATGTGAACCTACA 540
Q E W N S N A I A N P S N H I T C E P T 180
TTACAGATTGGGTACTATGATCCTCAGAATTCATCAGCTCCTGCCTGAGAGCTT 600
L Q I G Y Y D P Q N S S A P K P E S N N 200
AACTACATTCATGGATGGATGATTTGATAATGGATTCACATTAGTTCTTCTTGTATATGA 660
N Y I H G W M I end208
ACTTATTATGTATTAATAAACTAAATAGTTATTATTCTCCTACMGMTmCTTTGGATA 720
CTCTAATCAACTTCATGTGGTATTTCTTCAAAAATATTATTACATATTCTTATTTGTTAGCTT 780
AATATATTGAGTATTCAAAACTTGGCAn 807
CTCATCATCTTCTCTCCCCGCGGCAAGCTCTACGAGTTCGCCAGTCCCTGCATGCAAAAG 120
L I Z P S P R G K L T E F A S P C M Q K 40
ATGCTGGAAAGATATCAAAAATGTTGTTGTCAAGAAGCAAATCCAAATTCGAGGrAAAACATTA 180
M L E R Y Q K C C Q E A N P N S S K T L 60
GAAGAAGATACCCAGCATTTGAAGCAAGAGATTGCTCATATGGAGGAGMGATT-GG 240
E E D T Q H L K Q E I A H M E E K I K G 80
CTCGAATCAGCACAGAGAAAATTGCTTGCTTGGCGMGMTTGTCTTGTTTGACMT.ZmGAT 300
L E S A Q R K L L G E E L S C L T M K D 100
GAGTTATTGATGGACCAAATCAACCAGCTTAAGAAAAAGGCTCAGATATTAGGAGAGGAA 420
E L L M D Q I N Q L K K K A Q I L G E E 140
AATGCCATTTTACGAAAAAAGTGCACAAATGTTCCTTATGGGGATGGCATTGTATCACAT 480
N A I L R K K C T N V P Y G D G I V S H 160
ATGGGAACTGCTAATAGCAACTCTATGGGAAACATTGAAGATGTGGAAACACAATTCTC 540
M G T A N S N S M G N I E D V E T Q L L 180
ATAGGTCCGCCTGACAACCATTGTAGCTTGGACTGATTC 600
I G P P D N H C S L D end
TGGCTCCTTCAGCAACAGCTTGCTCAAAAGAGAGAACATGTTGTCTCTGCCTGAAAATAACC 660
TAAAATAGCACTCAGGTGCATTCTTAAATAATGGATCATATTATTCACCTAGACACCTTT 720
ATAGTTTTCATGTTCTAGAGTTCAACATAAACTACAAATTGTTAACGTGGTTATATTTCC 780
TAAATAACTCTTTCCATGTA 800
GTCATAGTCTTCTCTTCCACCGGCAGACTCTACGAGTTCTGCCGCCAGCATGGGAT 120
V I V F S S T G R L Y t B C N A S M E D 40
GTTTTGGACAAGTACAACAGAAATTTTCCAGGGAAAAGAACAAAGACATGAGATTAAGATT 180
V L D K Y N R N F Q G K E Q R H E I K I 60
GATAGCCCAGAAATTATGGCTGCCCAACAACAATTAACTGAGCTTCAACACAGGCAAAGG 240
D S P E I M A A Q Q Q L T E L Q H R Q R 80
CAACTTTTGGGAGAAAACTTGGAAGGACTTTCTCAAGAAGAGCTTCAAACTTTAGAAACT 300
Q L L G E N L E G L S Q E E L Q T L E T 100
AAGCTTGAAACAACCTTAAAACTAGTCCGATTACAGAAAGTACAAAAATTACAAGGAAAT 360
K L E T T L K L V R L Q K V Q K L Q G N 120
ATTCACAATCTGCWTAAGGTAAAGACAATGATAGAGGATAACGACACTCTTCGCAAG 420
I H N L Q N K V K T M I E D N D T L R K 140
CAATTGGAAGAAACACAAGGAACAATCTTAAGCTCAAGGAACAAAGAAAGTGAAGATATC 480
Q L E E T Q G T I L S S R N K E S E D I 160
TTCCCTCTGAAACAAAGAGGAGACACACAACCATCTACTCAGTTTGTTTCCACAACTT 540
F P L K Q R G D T Q P S T Q F V S T T L 180
AGTTTTTCTTTCAACAAATAAACATTATGAGmCmGMGCCTTTGGCACATCCTTACAT 600
S F S F N K end
TTGGGGCTAAAAACTCTTTCTTAAGGTTTTAATAAATGAGTATTTCATGAAAAATGGTAT 660
TAGAGAGCTTGTATTGTGAGGACCACTCCTGTAAATTTCTTTGGATACATCCATAAATTA 720
TTAAGCCAAATACATTTTAGGACCATGGTTGTGAATATCTTGTTATAACTCCAATAATAT 780
AAATAAGTAAATTACCTTTGCTTCTATCA 809
Figure 2. Sequence alignment of several plant MADS-box genes illustrating the conserved domain organization. Deduced amino-acid sequences of EAM1-5 were aligned with gnetalean, conifer and angiosperm homologs. N-, MADS-, I-, K- and C-regions are identified. Hydrophobic residues of the K-domain are shaded dark gray. "<" indicates that sequences are incomplete, "//" that a partial sequence is presented and dashes denote gaps.
Figure 3. Phylogenetic tree of a subset of the plant MADS-domain protein family inferred using the neighbor-joining method. Genus names from which genes were isolated are between parentheses. Ephedra genes are in black, Gnetum in dark gray and conifers in light gray boxes. The remainder are angiosperm genes. Subfamilies are identified with brackets on the right. Numbers next to some nodes give bootstrap percentages from 1000 replicates. Only bootstrap values at relevant nodes are presented. Nodes with less than 15% support are collapsed.
Figure 4A-C. Phylogenetic trees of the AG, AGL6 and TM3 subfamilies inferred using the neighbor-joining method. A) AG subfamily. B) AGL6 subfamily. C) TM3 subfamily. Genus names from which genes were isolated are between parentheses. Ephedra genes are in black, Gnetum in dark gray and conifers in light gray boxes. The remainder are angiosperm genes. Numbers next to some nodes give bootstrap percentages from 1000 replicates. Only bootstrap values at relevant nodes are presented.
The evolutionary origin of angiosperms and their flowers is a contentious
question. The large morphological diversity of seed plants makes structural
homologies difficult to assess, especially between reproductive units. This creates
uncertainties regarding the evolutionary relationships of angiosperms to other seed
plants. Studies based on morphological characters have identified the Gnetales as the
sister group of angiosperms whereas molecular studies have been equivocal as to
whether these two groups are closely related. Developmental genes can help to
untangle this issue since genes that are involved in the control of structural
development (e.g. floral development) are believed to also be implicated in the
evolution of these structures.
Ephedra and the anthophyte theory of origin of flowers
The anthophyte theory states that Gnetales and angiosperms along with
Bennettitales and Pentoxylales form a monophyletic group, and suggests that their
common ancestor had flower-like structures (Doyle & Donoghue, 1986). If true, this
implies that the concentric whorled arrangement of the reproductive units of Gnetales
and angiosperms evolved once and is homologous in the two groups. In Gnetales,
ovules have one integument and are further surrounded by an outer envelope
composed of decussate pairs of organs that may correspond to the perianth of
Bennettitales and angiosperms (Martens, 1971; Doyle, 1998). From a developmental
perspective, if angiosperm and Gnetales reproductive structures are homologous,
developmental pathways specifying these structures should also be homologous.
Consequently, homologues of ABC floral homeotic genes should be present in the
developing Gnetalean cones.
In this study, a homologue of a C-type gene (EAM2) has been cloned. No
indication of the presence of A- or B-type gene homologues has been found in
Ephedra. However, the search was not exhaustive and it is possible that AP1-, AP3- or
PI-like genes exist in the genome of Ephedra. Searches for ABC-type genes in conifers
and Gnetum gnemon have also led to the identification of putative C-type gene
orthologs (DAL2 in Tandre et al., 1995; SAG1 in Rutledge et al., 1998; GGM3 in
Winter et al., 1999). In Gnetum gnemon, GGM3 is expressed early in all organs of
male and female cones and becomes restricted to the outer envelope later in
development. This expression pattern suggests that the outer envelope of Gnetum
gnemon cones is not homologous to the angiosperm perianth, where C-type genes are
never expressed, but to the outer integument of the angiosperm ovule, where C-type
gene expression is found (Winter et al., 1999). Unfortunately, the expression pattern
of EAM2 is currently unknown, preventing a similar conclusion from being drawn for
Ephedra. Nevertheless, if EAM2 is a true homologue of GGM3, similar results should
be obtained. AP3/PI-like genes have also been cloned from Gnetum gnemon and
Picea abies (GGM2 in Winter et al., 1999; DAL13, Engström, P., unpublished but
included in the Winter et al. phylogenetic analysis). A putative homologue of
GGM2/DAL13 probably exists in Ephedra but attempts to clone its cDNA have not
been successful. Interestingly, the expression of GGM2 is restricted to the
antherophores of Gnetum gnemon male cones, which seems to indicate that this gene
has a different function than angiosperm B-type genes, which are expressed in petals and
stamens. No A-type genes have been found in conifers or Gnetales. Taken together,
these results indicate that the whorled arrangement of flower and cone reproductive
units is not homologous between angiosperms and Gnetales. Still, it should be noted
that dissimilarities between the developmental pathways specifying angiosperm and
Gnetales reproductive structures do not necessarily imply that homology at the
subunit level does not exist. On the contrary, the expression of AG-like genes in
structures surrounding ovules in Gnetales (outer envelope) and angiosperms (outer
integument, carpel) suggests that homologies may exist between female reproductive
subunits of the two groups.
Three of the five Ephedra MADS-box genes cloned in this study belong to the
previously characterized spermatophyte subfamilies AG, AGL6 and TM3. At the
subclade level, the topology of the tree indicates that Gnetales genes are more closely
related to conifer than to angiosperm genes. This result suggests that the
Gnetales, probably as a monophyletic group, are a sister group of conifers and not of
angiosperms as stated by the anthophyte theory. This conclusion, very well supported
by bootstrap analyses, contradicts morphological studies (Crane, 1985; Doyle, 1996
and many others) but corroborates results previously obtained in molecular studies
(Hasebe et al., 1992; Goremykin et al., 1996; Chaw et al., 1997). The monophyly of
the Gnetales clade suggested here cannot be demonstrated unequivocally because
MADS gene homologues from Welwitschia mirabilis (Gnetales) have not been cloned
yet. Nevertheless, monophyly of this group is usually the consensus of morphological
and molecular studies (see Doyle, 1998). The phylogenetic tree also suggests that
angiosperms are a monophyletic group. Similar conclusions have been drawn in
several morphological as well as molecular phylogenetic studies (see Doyle, 1998).
Since MADS-box genes are not known from cycads and Ginkgo biloba
(gymnosperms), it cannot be decided from these data whether gymnosperms are a
monophyletic group, thus leaving undesignated the extant sister group of angiosperms.
Conflicts between morphological and molecular phylogenetic analyses
This study is not unusual in that, like many molecular studies, it is in
contradiction with the conclusions obtained in morphological studies. Here, I will
argue that conflicts between molecular and morphological studies are caused by
erroneous inference of structural homologies in morphological analyses.
The major weakness of morphological data is that character analysis is
somewhat subjective compared to molecular data, where character states are
objectively inferred through the alignment of homologous sequences. To better
illustrate this point, I will consider the most recent overall reanalysis of morphological
data performed by Doyle (1996). Although this study dealt with the phylogeny of the
seed plants, I will focus on the relationship of Gnetales to angiosperms. In this work,
Doyle found the Gnetales to be the sister group of angiosperms, thus supporting the
anthophyte theory. This result was based on the analysis of 91 characters covering
aspects of the vegetative and reproductive morphology of seed plants. Of these 91
characters, only 9 clearly associated Gnetales and angiosperms to the exclusion of all
the other seed plant groups. Although the homology of characters such as an apical
meristem with a tunica, lignin with a Mäule reaction, fused microsporangia and
cellular embryogenesis is plausible, others are questionable. Strikingly, water-
conducting vessels and double fertilization have been assigned the same character
state (implying homology) in Gnetales and angiosperms solely on the basis of their
presence. It is quite surprising that anyone should attempt to describe so complex a
structure as vessels, or so complex a process as double fertilization, with a single
character state, thus making the inference of homology subjective or at least
superficial. In agreement with this position, a recent review of wood anatomy in
Gnetales presented evidence that favors an independent origin of vessels in angiosperms
and in Gnetales (Carlquist, 1996). Likewise, the different outcomes of double
fertilization in Gnetales and angiosperms suggest that it might have evolved
independently on two occasions. In Gnetales, the product of the second fertilization
event is not a triploid endosperm but an abortive diploid embryo (Friedman and
Carmichael, 1996). This fact has been recognized and consequently scored by Doyle.
However, since the biological significance of double fertilization in Gnetales is
unknown, a further characterization of this process at the cellular level is required to
clearly infer evolutionary homology. Another example of subjectivity in character
state inference is megaspore wall thickness. In this case, a thin wall has been
considered homologous to the absence of a wall. Conifers, cycads and Ginkgo have a
thick wall, the wall in Gnetales is thin and angiosperms do not have a megaspore wall.
How can the absence of a trait be considered homologous to its presence, even if weak?
Surprisingly, 11 characters describing fundamental aspects of seed plant reproductive
structures (perianth, seed coat, male and female subunit morphology) were scored as
unknown in all seed plants but angiosperms. This is justifiable insofar as it makes it
possible to avoid cases of problematic homology among reproductive structures of
seed plants. However, the major character shared by seed plants, that is reproduction
through seeds, is intimately linked to the reproductive structures. It is therefore hardly
conceivable that one can attempt to resolve the seed plant phylogeny without
considering these characters in all seed plants. This again emphasizes the importance
of using developmental genes to help infer structural homologies.
Evolution of the MADS-box gene family and relationship to the origin of
reproductive structures in seed plants
Three MADS-box gene subfamilies were found to include sequences from
conifers, Gnetales and angiosperms. Another subfamily, AGL2, has been found to
include genes from both conifers and angiosperms. Consequently, it is likely that at
least some pathways that control the development of reproductive structures in these
groups were present in their common ancestor about 300 MYA. The function of AG-
like genes in the specification of ovules, stamens and carpels indicates that these
developmental processes might have a common origin in these seed plant groups. The
function of TM3- and AGL6-like genes is currently unknown but their expression in
the seed plant reproductive structures suggests a similar conclusion. Data from ferns
and molecular clock estimates suggest that these patterns were established after the
divergence of ferns and seed plants 400-340 MYA (Münster et al., 1997; Purugganan
et al., 1995). This corresponds to the Devonian period, when progymnosperms were
widespread (Beck, 1988). Ephedra and Gnetum gnemon also contain genes unique to
Gnetales (e.g. EAM1, 5, GGM1-6, 8 and 12), suggesting that the Gnetales MADS-box
gene family has further diversified after divergence from conifers and angiosperms.
Future directions
Although this study and others present evidence concerning the origin of seed
plants, more results need to be obtained to clearly understand the evolutionary history
of spermatophytes. Only a few genes have been characterized in Ephedra. More can
be found (e.g. a GGM2/DAL13 homologue). In addition, expression patterns as
determined by Northern blots and in situ hybridization, and the function of the proteins,
definitely need to be determined. To further assess relationships between seed plant
groups, Ginkgo and cycads also need to be investigated. For future characterization of
MADS-box genes in seed plants I would suggest the use of genomic or cDNA
libraries. The PCR approach used in this study, although efficient for the cloning of
MADS-box genes, has shown some bias concerning the cDNAs amplified. For
example, EAM3 was cloned at least 11 times whereas EAM4 was cloned only twice.
The potentially large diversity of the MADS-box gene family of Ephedra therefore
remains somewhat unappreciated. Moreover, plants such as cycads are much more
technically challenging since these plants usually produce only one cone once a year.
The use of cDNA libraries would facilitate the cloning of MADS-box genes
specifically expressed in male or female reproductive units, which is highly relevant
to the establishment of homologies among parts of the reproductive structures of seed
plants.
The goal of this study was to investigate further the phylogenetic relationship
of Gnetales to other seed plants and to provide more data bearing on the controversial
question of the origin and evolution of flowers. The data I obtained show that the
gnetophyte Ephedra andina is more closely related to conifers than to angiosperms
and that Ephedra cones are less closely related to angiosperm flowers than previously
thought. These conclusions are irreconcilable with the anthophyte theory of origin of
flowers. In the light of these results, new unifying theories concerning the
phylogenetic relationships of seed plants and the evolution of their reproductive
structures need to be developed.
This study shows how developmental genes can be useful in resolving
complex questions of phylogenetic relationships and structural homologies.
Notwithstanding my criticism of morphological phylogenetic studies, I do not reject
the morphological data. On the contrary, I believe that different types of data should
be additive rather than exclusive. The difficulty of resolving the question of the origin
and evolution of seed plants emphasizes the need for a multidisciplinary approach,
including morphological and paleontological as well as molecular, developmental and
biochemical information.
Albert, V.A.. Backlund, A., Bremer. K.. Chase. M.W., Manhart, J.B.. Mishler. B.D. &
Nixon, K.C. (1994) Functional constraints and rbcL evidence for the land plant
phylogeny. Ann. Mo. Bot. Gard 81,534-567.
Angenent. G.C., Busscher, M., Franken, J.. Mol, J.N.M. & van Tunen. A.J. (1992)
Differential expression of two MADS-box genes in wild-type and mutant petunia
flowers. Plant Ce12 1,983-993.
Angenent, G.C., Franken, J., Busscher, M., Colombo. L. Br van Tunen, A.J. (1993)
Petal and stamen formation in petunia is regulated by the homeotic gene jbpl. Plant
J . 4, 101-1 12.
Angenent, G.C., Franken, J., Busscher, M., van Dijken, A., van Went, J.L., Dons,
H.J.M. & van Tunen, A.J. (1995) A novel class of MADS-box genes is involved in
ovule development in petunia. Plant Ce11 7, 1 569- 1 5 82.
Arber, E.A.N. & Parkin, J. (1907) On the ongin of angiosperms. J Linn. Soc. London
Bot. 38,29080.
Arber, E.A.N. & Parker, J. (1908) Studies on the evolution of the angiospems: he
relationship of the angiospems to the Gnetales. Ann. Bot. London 22,489-5 15.
Bailey, LW. (1944) The development of vessels in angiospenns and the significance
in rnorphological research. Am. J. Bot. 31,421428.
Beck, C.B. (1988) Origin and Evolution of Gymnosperms, Columbia Univ. Press,
New York. 504p.
Bowman, J.L., Smyth, D.R. & Meyerowitz, E.M. (1989) Genes directing flower
development in Arabidopsis. Plant Cell 1 , 3 7-52.
Bowman, J.L., Smyth, D.R. & Meyerowitz, E.M. (1991) Genetic interactions among
floral homeotic genes of Arabidopsis. Developmenr 112, 1-20.
Bowman, J.L., Alavarez, J., Weigel, D., Meyerowitz, E.M. & Smith, D.R. (1 993)
Control of flower development in Arabidopsis thaliana by A P E TA L A 1 and
interacting genes. Developrnent 11 9,72 1 -743.
Carlquist, S. (1996) Wood, bark, and stem anatomy of Gnetales: a summary. Ini. J.
Plant Sci. 157, S58476.
Carmichaet, J.S. & Friedman, W.E. (1996) Double fertilization in Gnetum Gnemon
(Gnetaceae): it's bearing on the evolution of sexual reproductive within the Gnetales
and the antophyte clade. Am. J. Bot. 83,767-780.
Carmona, M.J., Ortega, N. & Garcia-Maroto, F. (1998) Isolation and molecular
characterization of a new vegetative MADS-box gene from Solunum tuberosum L.
Planta 207, 181488.
Chase, M.W., Solstis, D.E., Olmstead, R.G., Morgan, D., Les, D.H., Mishler, B.D.,
Duvall, MA., Price, R.A., Hills, H.G., Qiu, Y.-L., Kron, K.A., Retting, J.H., Conti,
E., Palmer, J.D., Manhart, J.R., Sytsma, K.J., Michaels. H.J., Kress, W.J., Karol,
K.G., Clarck, W.D., Hedrén, M., Gaut, B.S., Jansen, R.K., Kim, K.-J., Wimpee,
C.F., Smith, J.F., Fumier, G.R., Strauss, S.H., Xiang, Q.-Y., Plunkett, G.M., Solstis,
P.S., Swensen, S.M., Williams, S.E., Gadek, P.A., Quinn, C.J., Eguiarte, L.E.,
Golenberg, E., Learn, Jr. G.H., Graham, S.W., Barrett, S.C.H., Dayanandan, S. &
Albert, V.A. (1993) Phylogenetics of seed plants: an analysis of nucleotide
sequences fiom the plastid gene rbcL. Ann. Mo. Bof. Gard 80,528-580.
Chaw, S.M., Zharkikh, A., Sung, H.M., Lau, T.C. & Li, W.H. (1997) Molecular
phylogeny of extant gyrnnosperrns and seed plant evolution: analysis of nuclear 18s
rRNA sequences. Mol. Biol. Evol. 14,5668.
Chung, Y.-Y., Kim, S.-R., Finkel, D., Yanofsky, M.F. & An G. (1994) Early
flowering and reduced apical dominance result corn ectopic expression of a rice
MADS-box gene. Plant Mol. Biol. 26,657-665.
Chung, Y.-Y., Kim, S.-R., Kang, H.-G., Noh, Y.-S., Park, M.C., Finkel, D. & An G.
(1995) Characterization of two rice MADS-box genes homologous to GLOBOSA.
P h Sci. 109,45-56.
Coen, E.S. & Meyerowitz, E.M. (1991) The war of the whorls: genetic interactions
controlling flower development. Nature 323,3 1 -37.
Crane, P.R. (1 985) Phylogenetic analysis of seed plants and the origin of angiospems.
Ann, Mo. Bot. G a d 72, 7 16-793.
Davies, B. & Schwarz-Sommer, 2. (1994) Control of floral organ identity by
homeotic MADS-box transcription factors, in Plant, Promoters and Transcription
Factors (Nover, L. ed.) pp. 235-258, Springer-Verlag, Berlin.
Davies, B., Egea-Cortines, M., de Andrade Silva, E., Saedler, H. & Sommer, H.
(1 9%) Multiple interactions amongst floral homeotic MADS-box proteins. EMBO J.
15,43304343.
Dayhoff, M.O. (1979) Atlas of protein sequence and structure, vol. 5, supp. 3.
National Biomedical Research Foundation, Washington, D.C.
Doebley, J. (1993) Genetics, development and plant evolution. Curr. Opin. Genet.
Dev, 3,865-872.
Doyle, J.A. & Hickey, L.J. (1976) Pollen and leaves fiom the mid-Cretaceous
Potomac group and their bearing on early angiosperm evolution, in Origin and Early
Evolution of Angiosperms (Beck, C.B. ed.) pp. 139-206, Columbia Univ. Press, New
York.
Doyle, J.A. & Donoghue, M.J. (1986) Seed plant phylogeny and the origin of the
angiosperms: an experimental cladistic approach. Bot. Rev. 52, 32 1 4 3 1.
Doyle, J.A. & Donoghue, M.J. (1992) Fossils and seed plant phylogeny reanalysed.
Brittonia 4489- 106.
Doyle, J.A., Donoghue, M.J. & Zimmer, E.A. (1994) Integration of morphological
and ribosomal RNA data on the ongin of angiosperms. Ann. Mo. Bot. Gard 81,4 19-
450.
Doyle, J.A. (1996) Seed plant phylogeny and the relationships of Gnetales. Int. J .
Plmt Sci. 157(6 suppl.), S3439.
Doyle, LA. (1998) Molecules, morphology, and the relationship of angiosperms and
Gnetales. Mol. Phyl. Evol. 9,448462.
Doyle, J.J. ( 1 994) Evolution of a plant homeotic multigene farnily : toward comecting
molecular systematics and molecular development genetics. Syst. Biol. 13,307-328.
Dubois, E., Bercy, J. & Messenguy, F. (1987) Characterization of two genes, ARGRI
and ARGRII required for specific regulation of arginine metabolism in yeast. Mol.
Gen. Genet. 207, 142-148.
Felsenstein, J. ( 1 993) PH Y L P (Phylogeny Inference Package) version 3 S7c.
Distributed by the author. Departement of Genetics, University of Washighton,
Seattle.
Friedman, W.E. (1990) Sexual reproduction in Ephedro nevadensis (Ephedraceae):
M e r evidence of double fertilization in non-flowering seed plant. Am. J. Bot. 77,
1 582- 1598.
Friedman, W.E. (1992) Evidence of a pre-angiosperm origin of endosperm:
implications for the evolution of flowering plants. Science 255,3369339.
Gaussen, H. (1946) Les gymnospermes, actuelles et fossiles. Trav. Lab. Forest.
Toulouse, Tome II Etud. Dendrol., Sect. 1, Vol. 1, Fasc. 3, Chap. 5, pp. 1-26.
Goremykin, V., Bobrova, V., Pahnke, J., Troitsky, A., Antonov, A. & Martin, W.
(1 996) Noncoding sequences fiom the slowly evolving chloroplast inverted repeat in
addition to rbcL data do not suppm Gnetaiean f in i t i es of angiosperms. Mol. Biol.
EvoZ. 13,383-396.
Goto, K. & Meyerowitz, E.M. (1994) Function and regdation of the Arabidupsis
floral homeotic gene PISTILLA TA. Genes Dev. 8, 1548- 1 560.
Greco, R., Stagi, L., Colombo, L., Angenent, G.C., Sari-Goda M. & Pe, M.E. (1997)
MADS-box genes expressed in developing inflorescences of rice and sorghum. Md.
Gen. Genet. 253,6 1 5-623.
Gu, Q., Ferrandiz, C., Yanofsky, M.F. & Martienssen, R (1998) The miitfull MADS-
box gene mediates ce11 differentiation dunng Arabidopsis fruit development.
Development 125, 1 5094 5 1 7.
Gustafson-Brown, C., Savidge, B. & Yanofsky, M.F. (1994) Regulation of
Arabidopsis floral homeotic gene APETALAI. Cell 76, 13 1 - 143.
Hamby, R.K. & Zirnmer, E.A. (1 992) in Molecular Systematics of Plants (Solstis,
P.S., Solstis, D.E. & Doyle J.J. eds) pp. 50-91, Chapman & Hall, New York.
Hansen, G., Estruch, J.J., Sommer, H. & Spena, A. (1993) NTGLO: a tobacco
homologue of the GLOBOSA floral homeotic gene of Antirrhinum majus: cDNA
sequence and expression pattern. Mol. Gen. Genet. 239,3 10-3 12.
Hasebe, M., Kofugi, R., Ito, M., Kato, M. & Ueda, K. (1992) Phylogeny of
gymnosperms inferred fiom rbcL gene sequences. Bor. Mag. Tokyo 105,673-679.
Hasebe, M., Wen, C.K., Kato, M. & Banks, J.A. (1998) Characterization of MADS
homeotic genes in the fem Ceratopteris richardii. Proc. Nat1 Acad. Sci. USA 95,
622296227.
Heck, G.R., Perry, S.E., Nichols, K.W. & Femandez, D.E. (1995) AGLIS, a MADS-
domain protein expressed in developing embryos. PIanr Cell. 7, 127 1 - 1282.
Hickey, L.S. & Doyle, J.A. (1 977) Early Cretaceous fossil evidence for angiosperm
evolution. Bot. Rev. 43,3- 1 04.
Hickey, L.J. & Taylor, D.W. (1996) Origin of the angiosperm flower, in FZowering
Piani Origin. Evolution and Phylogeny (Taylor, D.W. & Hickey, L J . eds.) pp. 176-
23 1, Chapman & Hall, New York.
Hill, J.P. & Lord, E.M. (1 989) Floral development in Arabidopsis thaliana: a
cornparison of the wildtype and the homeotic pistillata mutant. Can. J. Bot. 67,
2922-2936.
Huang, H., Mizukami, Y., Hu, Y. & Ma, H. (1993) Isolation and chanicterization of
the binding sequences for the product of the Arabidopsis floral homeotic gene
AGAMOUS. Nucleic Acids Res. 2 1,47694776.
Huang, H., Tudor, M., Weiss, C.A., Hu, Y. Br Ma, H. (1995) The Arabidopsis MADS-
box gene AGL3 is widely expressed and encodes a sequence-specific DNA-binding
protein. Plant Moi. Bioi. 28,549-567.
Huang, H., Tudor, M., Su, T., Zhang, Y., Hu, Y. & Ma, H. (1996) DNA-binding
properties of two Arabidopsis MADS-domain protein: binding consensus and dimer
formation. Plant Ce Il 8, 8 1 -94.
Irish, V.F. & Sussex, I.M. (1990) Function of the apefalal-l gene during Arabidopsis
floral development. Plant Cell2,74 1-753.
Jack, T., Brockman, L.L. & Meyerowitz, E.M. (1992) The homeotic gene APETALA3
of Arabidopsis thaliana encodes a MADS-box and is expressed in petals and
starnens. Cell68,683-697.
Jofuku, K.D., den Boer, B.G.W., van Montagu, M. & Okamuro, J.K. (1994) Control
ofArabidopsis fiower and seed development by the homeotic gene APETALAZ.
Planr Cell6, 121 1-1225.
Kang, HA., Noh, Y.-S., Chung, Y.-Y., Costa, M.A., An, K. & An, G. (1995)
Phenotypic alterations of petal and sepal by ectopic expression of a rice MADS-box
gene in tobacco. Plunt M d . Biol. 29,l- 1 0.
Kang, H.-G., Jeon, I.-S., Lee, S. & An, G. (1998) Identification of class B and class C
floral organ identity genes nom rice plants. Planr Mol. Biol. 38, 1 02 1 - 1029.
Kempin, S.A., Mandel, M.A. & Yanofsky, M.F. (1993) Conversion of perianth into
reproductive organs by ectopic expression of the tobacco floral homeotic gene
NAGI. Plunr Physiol. 103, 1041-1046.
Kempin, S.A., Savidge, B. & Yanofsky, M.F. (1995) Molecular basis of the
caulijower phenotype in Arabidopsis. Science 267,522-525.
Kramer, E.M., Dorit, R.L. & Irish, V.F. (1998) Molecular evolution of genes
controlling petal and stamen development: duplication and divergence within the
APETALA3 and PISTILLA TA MADS-box gene lineages. Genefics 149,765-783.
Krizek, B.A. & Meyerowitz, E.M. (1996) Mapping the protein regions responsible for
functional specificities of the Arabidopsis MADS-domain organ identity proteins.
Proc. Natl Acad Sci USA 93,40634070.
Kunst, L., Klenz, J.E., Martinez-Zapater, J. & Haughn, G. (1989) A P 2 determines the
identity of perianth organs in flowers of Arubidopsis îhalianu. Plant Cell2,741-753.
Lessard P., Decroocq, V. & Thomas, M. (1997) Isolation and analysis of messenger
RNA from plant cells: cloning of cDNAs, in Plant Molecuiur Biology - A
L a b o r a t o ~ Manual (Clarck, M.S. ed.) pp. 154-20 1. Springer-Verlag , Berlin.
Lewin, R. (1996) The molecular evolutionary clock, in Patterns in Evolution: the New
Molecular View. Pp. 1 06- 1 1 9, Scientific American Library, New York.
Loconte, H. & Stevenson, D. W. ( 1 990) C ladistics of the Spermatophyta. Brittonia 42,
197-21 1.
Ma, H., Yanofsky, M.F. & Meyerowitz, E.M. (1991) AGL1-AGL6, an Arabidopsis
gene family with similarity to floral homeotic and transcription factor genes. Genes
Dev. 5,484-495.
Mandel, M.A., Bowman, J.L., Kempin, S.A., Ma, H., Meyerowitz, E.M. & Yanofsky,
M.F. (1992a) Manipulation of flower structure in transgenic tobacco. Cell 71, 133-
143.
Mandel, M.A., Gustafson-Brown, C., Savidge, B. & Yanofsky, M.F. (1992b)
Moiecular characterization of the Arabidopsis homeotic gene APETALA1. N-re
360,2739277.
Mandel, T., Lutziger, 1. & Kuhlemeier, C. (1994) A ubiquitously expressed MADS-
box gene fiom Nicotiunu tabacum. Plant Mol. Biol. 25,3 19-32 1.
Martens, P. (1971) Les gnétophytes, in Handbuch der Pjanzenanatomie, Vol. 12
(Zimmermann, W., Carlquist, S., Ozenda, P., Wulff, H.D. eds.) Bomstraeger, Berlin.
295 p.
Martin, W., Lydiare, D., Brinkmann, H., Forkmann, G., Saedler, H. & Cerff, R.
(1 993) Molecular phylogenies in angiosperm evolution. Mol. Biol. Evol. 10, 140-
162.
Mena, M., Mandel, M.A., Lemer, D.R., Yanofsky, M.F. & Schmidt, R.J. (1 995) A
characterization of the MADS-box gene famiIy in maize. Plant J. 8,845-854.
Meyerowitz, E.M., Bowrnan, J.L., Brockman, L.L., Drews, G.N., Jack, T., Sieburth,
L.E. & Weigel, D. (1991) A genetic and molecular mode1 for flower development in
Arabidopsis thdiana. Development Su& 1, 1 5 7- 167.
Meyerowitz, E.M., Smyth, D.R. & Bowrnan, J.L. (1989) Abnormal flowers and
pattern formation in floral development. Development 106,209-2 1 7.
Minikami, Y., Huang, H., Tudor, M., Hu, Y. & Ma, H. (1996) Functional domains of
the floral regulator AGA MO US: characterïzation of the DNA-binding domain and
analy sis of dominant negative mutations. Plant Cell8,83 1 -845.
Mouradov, A., Glassick, T.V., Hamdorf, B.A., Murphy, L.C., Marla, S.S., Yang, Y .M.
& Teasdale, R.D. (1998) Family of MADS-box genes expressed early in male and
female reproductive structures of Monterey pine. Plant Physiol. 11 7,55-61.
Münster, T., Pahnke, J., Di Rosa, A., Kim, J.T., Martin, W., Saedler, H. & Theikn G.
(1997) Floral homeotic genes were recmited fiom homologous MADS-box genes
preexisting in the common ancestor of fems and seed plants. Proc. Narl Acad Sci-
USA 94,24 1 5-2420.
Murai, K., Murai, R. & Ogihara Y. (1997) Wheat MADS-box genes, a multigene
family dispersed throughout the genome. Genes Genet. Syst- 72,3 17-32 1.
Nixon, KC., Crepet, W.L., Stevenson, D. & Friis, E.M. (1994) A reevaluation of seed
plant phylogeny. Ann. Mo. Bot. Gard 81,484-533.
Norman, C., Runswick, M., Pollock, R. & Treisman, R. (1988) Isolation and
properties of cDNA clones encoding SRF, a transcription factor that binds to the c-
fos serum response element. Cell jS,989- 1003.
Okamuro, J.K., den Boer, B.G.W. & Jofuku K.D. (1993) Regulation of Arabidopsis
Bower development . Plant Cell5, 1 1 8 3 - 1 1 93.
Okamuro, J.K., Caster, B., Villarroel, R.. van Montagu M. & JoNtu, K.D. (1 997)
The AP2 domain of APETALAt defines a large new farnily of DNA-binding proteins
in Arabidopsis. Proc. Nat1 Acad Sci. USA 94, 7076-708 1.
Passmores, S., Maine, G.T., Elble, R., Christ, C. & Tye, B.-K. (1 988) Saccharomyces
cerevisiae protein involved in plasmid maintenance is necessary for mating of
MATa cells. J. Mol, Biol. 204,593-606.
Pellegnni, L., Tan, S. & Richmond, T.J. (1995) Structure of serum response factor
cote bound to DNA. Nature 376,4900498.
Pnueli, L., Abu-Abeid, M., Zamir, D., Nacken, W., Schwarz-Sommer, 2. & Lifschitz
E. (1991) The MADS-box gene farnily in tomato: temporal expression during floral
development, conserved secondary structures and homology with homeotic genes
fiom Antirrhinum and Arabidopsis. Plunt J. 1,255-266.
Pumgganan, M.D., Rounsley, S.D., Schmidt, R.J. & Yanofsky, M.F. (1995) Molecular
evolution of flower development: diversification of the plant MADS-box regdatory
gene farnily. Genetics 140,345-356.
Riechmann, J.L.. Krizek, B.A. & Meyerowitz, E.M. (1 996a) Dimerization specificity
of Arabidopsis MADS-domain homeotic proteins A PE TA LA 1 , A PE TALA3,
PISTILLA TA, and AGAMOUS. Proc. Akd Acad Sci. USA 93,479394798.
Riechmann, J.L., Wang, M. & Meyerowitz, E.M. (1996b) DNA-binding properties of
Arabidopsis MADS-domain homeotic proteins APETALA 1 , APETALA3,
PISTILLA TA, and AGA MOUS. Nucleic Acids Res. 24,3 1 34-3 14 1.
Riechmann, J.L. & Meyerowitz, E.M. (1997) MADS-domain proteins in plant
deveiopment. Biol. Chem. 3 78, 1079- 1 10 1 .
Rothwell, G.W. & Serbert, R. (1994) Lignophyte phylogeny and the evolution of
spermatophytes: a numerical cladistic analy sis. Syst. Bor. 1 9,443482.
Rounsley, S.D., Ditta, G.S. & Yanofsky, M.F. (1995) Diverse roles for MADS-box
genes in Arabidopsis development. Plant Ceil 7, 1 25 9- 1 269.
Rutledge, R., Regan, S., Nicolas, O., Fobert, P.. Coté, C., Bosnich, W., Kauffeldt, C.,
Sunohara, G., Séguin, A. & Stewart, D. (1998) Characterization of an AGAMOUS
homologue from the conifer black spmce (Piceo mariana) that produces floral
homeotic conversions when expressed in Arabidopsis. P i m J. 15,6250634.
Saitou, N. & Nei, M. (1987) The neighbor-joining method: a new method for
reconstwting phylogenetic trees. Mol. B id Evol. 4,406-425.
Savard, L., Li, P., Strauss, S.H., Chase, M.W., Michaud, M. & Bousquet, J. (1994)
Chloroplast and nuclear gene sequences indicate late Pemsylvanian time for last
common ancestor of extant seed plants. Proc- Nafl Acad. Sci. USA 91, 5 163-5 167.
Schmidt, RJ., Veit, B., Mandel, M.A., Mena, M., Hake, S. & Yanofsky, M.F. (1993)
Identification and molecular characterization of ZAGI, the maize homologue of the
Arabidopsis floral homeotic gene AGAMOUS. Plant Cell5,729-737.
Schwarz-Sommer, Z., Huijser, P., Nacken, W., Saedter, H. & Sommer, H. (1990)
Genetic control of flower development by homeotic genes in Antirrhinum majus.
Science 250,93 1 -936.
Schwarz-Sommer, Z., Hue, I., Huijser, P., Flor, P.J., Hansen, R., Tetens, F., Lonnig,
W.E., Saedler, H. & Sommer, H. (1992) Characterization of the Anfirrhinum floral
homeotic MADS-box gene deficiens: evidence for DNA-binding and autoregulation
of its persistent expression throughout flower development. EMBO J . 1 1,25 1 -263.
Shiraishi, H., Okada, K. & Shimura, Y. (1993) Nucleotide sequences recognized by
the AGAMOUS MADS-domain of Arabidopsis thaliana in vitro. Plant J. 1,3 85-398.
Shore, P. & Sharrocks, A.D. (1995) The MADS-box family of transcription facton.
Eur. J. Biochem. 229, 1-1 3.
Sommer, H., Beltriin, J.-P., Huijser, P., Pape, H., Lonnig, W.-E., Saedler, H. &
Schwarz-Sommer, 2. (1990) Deficiens, a homeotic gene involved in the control of
flower morphogenesis in Antirrhinum majus: the protein shows homology to
transcription facton. EMBO J. 9,6056 1 3.
Stebbins, G.L. (1974) Flowering Plants: Evolution Above the Species Level, Harvard
Univ. Press, Cambridge. 39%.
Steeves, T.A. & Sussex, LM. (1989) Determinate shoots: thorns and flowers, in
Patterns in P Ianî Developmenf pp. 176-202, Cambridge Univ. Press, New York.
Stefanovic, S., Jager, M., Deutch, J., Boutin, J. & Masselot, M. (1998) Phylogenetic
relationships of conifers infierred fiom partial 28s rRNA gene sequences. Am. J . Bor.
85,688-697.
Stewart, W.N. & Rothwell, G.W. (1993) Paleobotany and the Evolution of Plants,
Cambridge Univ. Press, New York. 52 1 p.
Tandre, K., Albert, V.A., Sun&, A. & Engstam, P. (1995) Conifer homologues to
genes that control floral development in angiospems. Plant Mol. Biol. 27,69-78.
Tandre, K., Svenson, M., Svensson, M.E. & Engstrom, P. (1998) Conservation of
gene structure and activity in the regulation of reproductive organ development of
conifers and angiospems. PIonl J . 25,6 1 5-623.
TheiOen, G., Kim, J.T. & Saedler, H. (1996) Classification and phylogeny of the
MADS-box multigene family suggest defined roles of MADS-box gene subfamilies
in the morphological evolution of eukaryotes. J. Mol. Evol. 13,484-5 16.
Theikn, G., Strater, T., Fischer, A. & Saedler, H. (1995) Structural characterization,
chromosomal localization and phylogenetic evaluation of two pairs of AGAMOUS-
like MADS-box genes fkom maize. Gene 156, 155- 166.
Thompson, AD., Higgins, D.H. & Gibson, T.J. (1994) CLUSTALW: improving the
sensitivity of progressive multiple sequence alignment through sequence weighting,
positions-specific gap penalties and weight ma& choice. Nue. Acids Res. 22,4673-
4680.
Trobner, W., Ramirez, L., Motte, P., Hue, I., Huijser, P., Lonnig, W.E., Saedler, H.,
Sommer, H. & Schwarz-Sommer, 2. (1992) GLOBOSA: a homeotic gene which
interacts with DEFICIENS in the control of Ant i r rhum floral organogenesis.
EMBO J. I 1,4693-4704.
Tyner, A.L., Eichman, MJ. & Fuchs, E. (1985) The sequence of the type II keratin
gene expressed in human skin: conservation of structure among al1 intemediate
filament genes. P roc. Nafl Acad Sei. USA 82,468304687.
Weigel, D. ( 1995) The APETALAZ domain is related to a novel type of DNA-binding
domain. Plant CeIl 7,388-389.
Weigel, D. & Meyerowitz, E.M. (1994) The ABCs of floral homeotic genes. Cell 78,
203 -209.
Wettstein, R.R. von (1907) Handbuch der systematischen Botanik, II. Band. Franz
Deuiicke, Leipzig, Wien. 577p.
Winter, L U . , Becker, A., Münster, T., Kim, J.T., Saedler, H. & TheiBen, G. (1 999)
MADS-box genes reveal that gnetophytes are more closely related to conifers than to
flowering plants. Proc. Nad Acad Sci. 96,7342-7347.
Wolfe, J.A., Doyle, J.A. & Page, V.M. (1 975) The bases of angiospenns phylogeny:
paleo botany . Ann. Mo. Bot. Gard 62, 80 1-824.
Yanofsky, M.F., Ma, H., Bowrnan, J.L., Drews, G.N., Feldmann, K.A. & Meyerowitz,
E.M. (1990) The protein encoded by the Arabidopsis homeotic gene agamous
resernbles transcription factors. Nature 346,3 5-3 9.
Yao, J.-L., Kvamheden, A. & Morris, B. (1999) Seven MADS-box genes in apple are
expressed in different parts of the h i t . J. Am. Soc. Hort. Sei. 124, 8-13.
Appendix 1
Alignment of the MIK regions of the plant MADS-domain proteins
           1                                                   50
AF101420   MAREKIKIRK IDNITARQVT FSKRRRGLLK KAEELAVLCD ADVALVIFSA
StMADS16   MAREKIKIKK IDNITARQVT FSKRRRGLFK KAEELSVLCD ADVALIIFSS
AGL24      NAREKIRIKK IDNITARQVT FSKRRRGIFK KADELSVLCD ADVALIIFSA
StMADS11   MVRQKIQIKK IDNLTARQVT FSKRRRGLFK KAQELSTLCD ADIGLIVFSA
AF006210   MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
AF023615   MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
SAG-a      MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
SAG-d      ----KIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
SAG-b      MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
SAG-c      MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
DAL2       MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
EAM2       ---------- ---------- ----RNGLLK KAYELSVLCD AEVALIVFSS
GGM3       MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
AG         SGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
BAG1       AGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSWCD  AEVALIVFSS
MZEAGAMOU  QGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSS
ZMM2       ---------- ---TTSRQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSS
OsMADS3    MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
HAG1       -----IEIKR IENTTSRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFST
ZAG1       RGKGKTEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
CAG1       MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
CUM10      MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
MdMADS10   MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSILCD AEVALIVFST
AGL11      MGRGKIEIKR IENSTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFST
FBP11      MGRGKIEIKR IENNTNRQVT FCKRRNGLLK KAYELSVLCD AEIALIVFST
FBP7       MGRGKIEIKR IENNTNRQVT FCKRRNGLLK KAYELSVLCE AEIALIVFST
AGL1       LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALVIFST
AGL5       IGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALVIFST
LAG        MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEIALIVFSS
FBP6       SGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
PAGL1      SGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
CAG3       MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
CUM1       MGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
PTAG1      LGRGKVEIKR IENTTNRQVT FCKRRSGLLK KAYELSVLCD AEVALIVFSS
PTAG2      LGRGKVEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
CaMADS1    LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEIALIVFSS
PLE        NGRGKIEIKR IENITNRQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSS
M G        LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSN
NAG        LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
PMADS3     LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
TAG1       LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSN
FAR        IGRGKIEIKR IENKTNQQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSS
GAGA1      MGKGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
GAGA2      LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
GAG2       LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFST
CAG2       TGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
CUS1       TGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
RAD2-2     MGRGKIEIKR IENVTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
RAP1       MGRGKIEIKR IENVTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
SLM1       LGRGKIEIKR IENTTNRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
ZAG2       MGRGRIEIKR IENNTSRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
ZmOV13     MGRGRIEIKR IENNTSRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSS
ZMM1       MGRGRIEIKR IENNTSRQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSS

           1                                                   50
ZmOV23     MGRGRIEIKR IENNTSRQVT FCKRRNGLLK KAYELSVLCD AEVALVVFSS
GGM10      ---------- ------RQVT FCKRRGGLMK KAYELSVLCD AEVALIIFSS
AGL12      MARGKIQLKR IENPVHRQVT FCKRRTGLLK KAKELSVLCD AEIGVVIFSP
PrMADS7    TVRGKTQLKR IENGTSRQVT FCKRRNGLLK KAYELSVLCD AEVALIVFSP
AGL14      ------EMKR IENATSRQVT FSKRRNGLLK KAFELSVLCD AEVALIIFSP
ETL        MVRGKTQMKR IENDTSRQVT FSKRRNGLLK KAFELSVLCD AEVALIIFSP
PbMADS1    MVRGKTQMRR IENATSRQVT FSKRRNGLLK KAFELSVLCD AEVALIIFSP
SaMADSA    MVRGKTQMKR IENATSRQVT FSKRRNGLLK KAFELSVLCD AEVSLIIFSP
TM3        MVRGKTQMRR IENATSRQVT FSKRRNGLLK KAFELSVLCD AEVGLIIFSP
TobMADS1   MVRGKTQMRR IENATSRQVT FSKRRNGLLK KAFELSVLCD AEVGLVIFSP
FDRMADS8   MVRGRTELKR IENPTSRQVT FSKRRNGLLK KAFELSVLCD AEVALIVFSP
PrMADS6    MARGKTQMRK IESATSRQVT FSKRRNGLLK KAYEMSVLCD AQLGLIVFSP
PrMADS8    MARGKTQMRK IESATSRQVT FSKRRNGLMK KAYELSVLCD AQLGLIVFSP
PrMADS4    MGRGRIRLRK IESATSRQVT FSKRKNGLLK KAYELSVLCD VELGLIVLSP
DAL3       MVRGKTQMKR IENDTSRQVT FSKRRNGLLK KAYELSVLCD AEVALIVFSP
PrMADS5    MVRGKTQMKR IENATSRQVT LSKRRNGLLK KAYELSVLCD AEXGLIVFSP
PrMADS9    MVRGKTQMKR IENDTSRQVT FSKRRNGLLK KAYELSVLCD AEVGLIIFSP
EAM4       ---------- ---------- ----RNGLLK KAYELSVLCD AEVGLIIFSP
GGM1       ---------- ---------- ----RNGLLK KAYELSILCD AEVGLIIFSP
AGL13      MGRGKVEVKR IENKITRQVT FSKRKSGLLK KAYELSVLCD AEVSLIIFST
AGL6       MGRGRVEMKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSS
OsMADS6    MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSS
ZAG3       MGRGRVELKR IENKINRQVT FSKXRNGLLK KAYELSVLCD AEVALIIFSS
TaMADS12   MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSS
ZAG5       MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSG
MdMADS11   MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVGLIIFSS
DAL1       MGRGRVQLRR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFST
PrMADS3    MGRGRVQLRR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFST
EAM3       ---------- ---------- ----RNGLLK KAYELSVLCD AEVALIIFSS
GGM11      MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSS
MADS1      MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSS
PrMADS2    MGRGRVELKR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFSS
GGM9       MGRGRVQLRR IENKINRQVT FSKRRNGLLK KAYELSVLCD AEVALIIFST
AGL2       MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
AGL4       MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVSLIVFSN
MdMADS1    MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
MdMADS8    MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
MdMADS9    ----RVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIVFSN
AGL9       MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
SaMADSD    MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
DEFH200    MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
DEFH72     MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
FBP2       MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
NsMADS3    MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
TM5        MGRGRVELKR IEGKINRQVT FAKRRNGLLK KAYELSVLCD AEVRLIIFSN
MTF1       MGRGRVELKR VENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIVFSN
EGM1       MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD ACVGLIIFSN
OM1        MGRGRVELKM IENKINRQVT FAKRRKRLLK KAYELSVLCD AEVALIIFSN
OTG7       MGRGRVEMKR IENKINRQVT FAKRRTGLLK KAYELSVLCD VEVALIIFSN
CMB1       MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIVFSN
EGM3       MGRGKVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
PrMADS1    MGRGKVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
MdMADS3    MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSS
MdMADS7    YGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSS
MdMADS6    ---GKVELKR IENKSNRQVT FSKRRNGLLK KAFELSVLCD AEVALIIFSG
DEFH49     MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
AGL3       MGRGKVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEIALLIFSN
FDRMADS1   ---------- -ENSTNRQVT FAKRRNGLLK KAYELSVLCD AEVALIVFSN
OsMADS45   MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
OsMADS7    MGRGRVELKR IENKINRKVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
M 7 9 OsMADS2 4 OsMADS8 SbMADS 1 ZMM7 MdMADS4 FDRMADSS OsMADS5 ZMM3 OsMADSl ZMM8 AGL8 SaMADSB B o A P 1 Boi l A P l B o i 2 A P 1 SaMADSC-2 SaMADSC A P 1 BOCAL B o i C A L BobCAL CAL BpMADS 3 MdMADS5 BpMADS5 MdMADS2 NAP1-1 NsSMADSl POTM1-1 POTM1-2 SCMl TM4 SLM5 NAP1-2 NsMADS2 NtMADS5 GSQUAl SQUA SLM4 BpMADS4 LtMADS r TaMADS11 SbMADS2 ZAPl LtMADS2 A G L I S - 1 AGL15-2 AGLI 5 GGM13 FDRMADS 5 TM8 FL F CerMADS l c m 4 CMADS3 CRMl cm5 CMADS2 CRM2
1 50 MGRGKVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN ---------- -ENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN MGRGRVELKR IENKINRQVT FAKRRNGLLK KFrYELSVLCD AEVALIIFSN MGRGRVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALI 1 FSN ---------- ---KINRQVT FAKRRNGLLK KAYELSVLCD AEVALIIFSN
MGRGKVELKR IENKINRQVT FAKRRNGLLK KAYELSVLCD AEVALIVFST ---------- -ENKMNRQVT FAKRRNGLLK KAYELSVLCD A E V A L I I F S T
MGRGKVELKR IENKISRQVT FAKRRNGLLK KAYELSVLCD A E V A L I I F S T ---------- ---KISRQVT FAKRRNGLLK KAYELSVLCD AEVALI 1 F S S
MGRGKVELKR I E N K I SRQVT FAKRRNGLLK KAYELSLLCD AEVALI 1 F S G ---------- ---KISRQVT FAKRRNGLLK KAYELSLLCD AEVALI 1 FSG
MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVALIVFSS MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVALVI F S S MGRGRVQLKR IENKINRQVT FSKRRAGLFK KAHEISVLCD AEVALWFSH MGRGRVQLKR IENKINRQVT F SKRRAGLFK KAHEISVLCD AEVALWFSH MGRGRVQLKR I E N K I NRQVT FSKRRAGLMK KAHEISVLCD AEVALWFSH MGRGRVQLKR IENKINRQVT FSKRRAGLMK KAHE ISVLCD AEVALWFS H MGRGRVQLKR IENKINRQVT FSKRRAGLLK KAHEISVLCD AEVALWFSH MGRGRVQLKR IENKINRQVT FSKRRAGLLK KAHEISVLCD AEVALWFSH MGRGRVEMKR IENKINRQVT FSKRRAGLLK K A H E I S I L C D AEVSL I V F S H MGRGRVEMKR IENKINRQVT FSKRRAGLLK K A H E I S I L C D AEVSLIVFSH MGRGRVEMKR IENKINRQVT FSKRRAGLLK K A H E I S I L C D AEVSLIVFSH MGRGRVELKR IENKINRQVT FSKRRTGLLK KAQEISVLCD AEVSLIVFSH MGRGRVQLKR IENKINRQVT FSKRRGGLLK KAHEISVLCD AEVAVIVFSH MGRGRVQLKR IENKINRQVT FSKRRTGLLK KAHE ISVLCD AQVALIVFSN MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVALIVFST MGRGRVQLKR IENKINRQVT FSKRRSGLMK KAHEISVLCD A E V A L I I F S T MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVGLIVFST MGRGRVQLKR IENKINRQVT FSXRASGLLK KAHEISVLCD AEVGLIVFST MGRGRVQLKR IENKINRQVT FSKRRSGLLK ECAHEISVLCD AEVGLIVFST MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVGLIVFST MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVGLIVFST MGRGRVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVGLIVFST MGRGRVQLKR IENKINRQVT FSKRRTGLLK KAHEISVLCD ADVGLIVFST MGRGKVQLRR IENKINRQVT FSKRRGGLVK KAHEISVLCD AEVALIVFSH MGRGKVQLRR IENKINRQVT FSKRRGGLVK KALEISVLCD AEVALIVFSH MGRGKC'QLRR IENKINRQVT FSKRRGGLAK KAHE ISVLCD AEVALIVFSH MGRGKVQLECR IENKINRQVT FSKRRGGLLK KAHEISVLCD AEVALIVFSA MGRGKVQLKR IENKINRQVT FSKRRGGLLK KAHELSVLCD AEVALIVFSN MGRGRVQLKM IENKINRQVT FSKRRSGI I K KAHEISVLCD AEVALI I F S H MGRGRVQLKR IENKISRQVT FSKRRTGLLK KAHEISVLCD AEVALIVFST MGRGKVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVGLIIFST MGRGKVQLKR IENKINRQVT FSKRRSGLLK KAHEISVLCD AEVGLIIFÇT ---- KVQLKR IENKINRQVT FSKRRNGLLK KAHEISVLCD AEVAVIVFSP
MGRGKVQLKR IENKINRQVT FSKRRNGLLK KAHEISVLCD AEVAVIVFSP
MGRGKVQLKR IENKINRQVT FSKRRNGLLK KAHEISVLCD AEVAVWFSP
MGRGKIEIKR IENANSRQVT FSKRRAGLLK =HELSVLCD AEVAVIVFSK
MDRGKIEIKR IENANSRQVT FSKRRAGLLK KAHELSVLCD SEVAVIVFSK
------EIKR IENANSRQVT FSKRRSGLLK KARELSVLCD AEVAVIVFSK
MGRGKIEIKR IENTTNRQVT FSKRRGGLLK KAHELSVLCD AELGLIIFSS
---------- -ENKTNRQVR FSKRRAGLFK KAFELALLCD AEVALLVFSP
MGRGKVELKR IENQTNRQVT FSKRRNGLLK KAYELSILCD AEVALLLFSP
MGRKKLEIKR IENKSSRQVT FSKRRNGLIE KARQLSVLCD ASVALLWSA
MVRTKIKIKR IENATTRQVT FSKRRGGLFK KAHDLSVLCD AEVAVIIFSS
---------- ---------- -SKRRGGLFK KAHDLSVLCD AEVAVIIFSS
MVRRKIKIKR IENATTRQVT FSKRRGGLLK KAHDLSVLCD AEVAVIIFSS
MVRRKIKIKR IENATTRQVT FSKRRGGLLK KAHDLSVLCD AEVAVIIFSS
---------- ---------- -SKRRGGLIK KAHDLSVLCD ADVAVIIFSS
MVRRKIKIKR IENATTRQVT FSKRRGGLLK KAHDLSVLCD ADVGVIIFSS
---------- ---------- ---------- --------CD VEVGVIIFSS
CMADS4 CMADS6 CRM3 ANR1 DEFH125 NMHC5 AGL17 EAM5 GGM12 CerMADS2 CRM6 OPM1 CerMADS3 CMADS1 CRM7 EAM1 GGM6 GGM7 GGM4 GGM8 GGM5 CUM26 SLM2 FBP3 PMADS2 GLO SvPI-1 FBP1 NTGLO GGLO1 EGM2 PI DePI-1 ScPI LtPI MfPI DaPI HPI-1 HPI-2 RbPI-1 RfPI-1 RbPI-2 RfPI-2 PnPI-1 PnPI-2 OsMADS2 OsMADS4 PhPI PmPI-1 BobAP3 Boi2AP3 Boi1AP3 AP3 D1 RaD1 D2 RaD2-1 DEF SvAP3 LeAP3
50 FSKRRAGLFK KAHDLSVLCD AKVAVIVFSE ---------- ---------- -DVAHIVFSS FCKRRAGLVK KARELSVLCD ADVALIVFSS FSKRRSGLLK KAKELSILCD AEVGVIIFSS FSKRRSGLLK EViKELAI LCD AEVGWI FSS FSKRRNGLLK KAKELAILCD AEVGVMIFSS FSKRRKGLIK KAKELAILCD AEVCLI 1 FSN ---- RRGIFK KATELSVLCD AELSVIVFSS -------- LK KAYELSILCD AEVALIIFSS FSKRRNGLLK KAHELSVLCD AEIALIIFSS ---- RNGLLK KAHELSVLCD AEIALI 1 FSS -SKRRNGLLK KAYELSVLCD AEVAVIVFSG FSKRRNGLLK KASELSILCD AEIAAIVFSS FSKRRNGLLK KAYELSVLCD AEVGVMVFSA ---- RTGLLK KAYELSVLCD AEVAVIVFSS ---- RGGLLK KARELSVLCE AEVALIVFST ---- KNGLKK KVTELSILCG AEIALVIFSN ---- RSGLLK KAHELSVLCD AEVAVI IFSN ---- KNGLLK KALELGVLCD VDVAVLIRSD ---- RNGLLK KAFELSVLCD ADTALLVFSE ---- RGGLLK KAHELSVLCD AEIALIIFSS YSKRRNGIIK KAKEITVLCD AQVSLVIFAS YSKRRNGIIK KAGEITVLCE AKVSLIIFSN YSKRRNGIIK KAKEITVLCD AKVSLIIFGN YSKRRNGIIK KAKEITVLCD AKVSLIIFGN YSKRRNGIMK KAKEISVLCD AHVSVIIFAS YSERRNGLMK KAKEISVLCD AQVSWIFAS YSKRRNGILK KAKEISVLCD ARVSVIIFAS YSKRRNGILK KAKEISVLCD ARVSVIIFAS YSKRKNGIIK KAKEITVLCD ANVSLVIYGS YSKRRNGLIK KAKEISVLCD AQVSVIIFGS FSKRRNGLVK KAKEITVLCD AKVALIIFAS YSKRRNGIMK KAKEITVLCC AEVSLVIFSS ----- NGILK KAKEITVLCD AKVSLVIFSS ----- GGf LK KAKEITVLCD AQVSLVIFSS ----- GGILK KAKEITVLCD AQVSLVIFSS YSKRRNGILK KAKEITVLCD AEVSLWFSS FSKRRNGIIK KAKEISVLCE SEI AIWFSS FSKRLNGIIK KAKEISVLCE SEIAIVIFSS FSKRRNGILK KAKEIAILCE AKVSLVIFTS YSKRRNGIIK KAKEIAILCA AEVSLVIFSS YTKRKNGILK KAKEISILCS AEVSLVIFSC ----- TGITK KAKEISILCA AEISLVIFNS YSKRKNGILK KAKEITILCD AHVSLVIFSS YSKRRNGILK KAKEISILCD ANLSLVMISE FSKRRSGILK KAREISVLCD AEVGWIFSS FSKRRSGILK KAREIGVLCD RCVGWIFSS ----- KGIIK KAQEISVLCD THVSVLIFSS YSKRKKGI IK KAQEISVLCD TQVSLVIFSS YSKRRNGLFK KAHELTVLCD ARVSIIMFSS YSKRRNGLFK KAHELTVLCD ARVSIIMFSS YSKRRNGLFK KAHELTVLCD ARVSIIMFSS YSKRRNGLFK KAHELTVLCD ARVSIIMFSS YSKRRSGLFK KAKELTILCD AKVSIIMISN YSKRRSGLFK KAKELTILCD AKVSIIMISN YS KRRNGLFK KAQELWLCD AKVS I IMISS YSKRRNGLFK KAQELWLCD AKVSIIMISS YSKRRNGLFK KAHELSVLCD AKVSIIMISS -a--- NGLFK KAHELTVLCD AKVSIIMISS ----- NGLFK KANELTVLCD 
AKVSIVMISS
PD4 PMADS1 NTDEF NMH7 GDEF2 SLM3 DeAP3-1 ScAP3 LtAP3 MfAP3 PnAP3-2 RfAP3-2 PcAP3 PnAP3-1 RbAP3-1 RfAP3-1 GDEF1 TM6 PTD PtAP3-1 PtAP3-2 OsMADS16 TaMADS51 CMB2 PhAP3 GGM2
AF101420 StMADS16 AGL24 StMADS11 AF006210 AF023615 SAG-a SAG-d SAG-b SAG-c DAL2 EAM2 GGM3 AG BAG1 MZEAGAMOU ZMM2 OsMADS3 HAG1 ZAG1 CAG1 CUM10 MdMADS10 AGL11 FBP11 FBP7 AGL1 AGL5 LAG FBP6 PAGL1 CAG3
1 MARGKIQIKK MARGKIQIKR MARGKIQIKR MARGKIQIKR MARGKIQIKK MARRKIQIKK
MGRGKIEIKR
MGRGKIEIKK ---GKIEIKK MGRGKIEIKK
MGRGKIEIKR MGRGKIEIKR MGRGKLEIRK
MGRGKIEMKK
51 TGKLFEYASS TGKLFDFAST TGKLFEFSSS TGKLFEYSSS RGRLYEFANH RGRLYEFANH RGRLYEFANH RGRLYEFANH RGRLYEFANH RGRLYEFANH RGRLYEFANH RGRLYEFANN RGRLYEFANN RGRLYEYSNN RGRLYEYSNN RGRLYEYANN RGRLYEYANN RGRLYEYANN RGRLYEYSNS RGRLYEYANN RGRLYEYSNN RGRLYEYSNN RGRLYEYSNN RGRLYEYANN RGRVYEYANN RGRVYEYSNN RGRLYEYANN RGRLYEYANN RGRLYEYANN RGRLYEYANN RGRLYEYANN
50
IENQTNRQVT YSKRRNGLFK KANELTVLCD AKVSIVMISS
IENQTNRQVT YSKRRNGLFK KANELTVLCD AKVSIIMISS
IENQTNRQVT YSKRRNGLFK KANELTVLCD AKVSIIMISS
IENTTNRQVI YSKRRNGLFK KANELTVLCD AKVSIIMFSS
IENSTNRQVT YSKRRNGLFK KASELTVLCD AKVSIIMVSC
YSKRRNGLFK -----AGIMK -----AGIMK -----GGIMK -----GGIMK YSKRRSGILK -----AGIMK YSKRRSGIFK -----SGIFK -----AGIIK YSKRRAGIMK YSKRRNGIFK YSKRRNGIFK YSKRRNGIFK -----NGIFK -----NGIFK YSKRRTGIMK YSKRRSGIMK FSKRRNGIMK FSKRRNGLFK FSKRRNGLMK
--S--MQELL GKYKLHSTNN
--S--MKDIL GKYKLQSAS-
--R--MRDIL GRYSLHASNI
--S--V?QLI EKHKMQSERD
--S--VKRTI ERYKKTCVDN
--S--VKRTI ERYKKTCVDN
--S--VKRTI ERYKKTCVDN
--S--VKRTI ERYKKTCVDN
--S--VKRTI ERYKKTCVDN
--S--VKRTI ERYKKTCVDN
--S--VKRTI ERYKKTCVDN
-SS--VKRTI ERYKKTCADN
--S--VKRTI ERYRKTCADN
--S--VKGTI ERYKKAISDN
--S--VKGTI ERYKKAISDN
--S--VKSTI ERYKKANSDS
--S--VKSTI ERYKKANSDS
--S--VKSTV ERYKKANSDT
N-S--VKTTI ERYKKACTDT
--S--VKGTI ERYKKATSDN
--S--IKTTI ERYKKACSDS
--S--IKTTI ERYKKACSDS
N-S--IRNTI ERYKKACSDS
N----IRSTI ERYKKACSDS
N----IKGTI ERYKKATAET
N----IRAII DRYKKATVET
--S--VRGTI ERYKKACSDA
--S--VRGTI ERYKKACSDA
--S--VKSTI ERYKKAS-DT
--S--VRATI DRYKKHHADS
--S--VRATI DRYKKHHADS
KANELTVLCD ATVSIIXLSS
KARELTVLCD AZVSLIMFSS
KARELTVLCD AEVSLIMFSS
KAKELTVLCD AEVSLIMFSS
KAKELTVLCD AQVSLIMFSS
KAKELTVLCD AEVSLIMFSS
KAQELTVLCD AKVALLMFSS
KAKELTILCD AQVCLIMFSN
KAKELTILCD AQVCLIMFSN
KAQELTVLCD AEVSLIMVSS
KAKELTVLCC AKVSLIMFSS
KAHELTVLCD AKVSLIMFSN
KRKELTVLCD AKISLIMLSS
KAQELTVLCD PXVSLIIVPN
KALELSVLCD AKVSIIMVAT
KAKELSVLCD AKVSIIMVAT
KARELTVLCD AQVAIIMFSS
KARELTVLCD AQVAIIMFSS
KAQELTVLCD AKVSLLMISS
KAQELTVLCD AQISIILISS
KAQELAVLCD AEVGLIIFSS
100
-----VNKVD EPSLD-----
-----LEKVD EPSLD-----
-----NKLMD PPSTH-----
SM---DNPEQ LHSSN-----
NH---GGAIS ESNSQ-----
NH---GGAIS ESNSQ-----
NH---GGAIS ESNSQ-----
NR---GGAIS ESNSQ-----
NH---GGAIS ESNSQ-----
NH---GGAIS ESNSQ-----
NH---GGVIS ESNSQ-----
NH---GIAIS ESNAQ-----
NQ---GGAIA ESNAQ-----
SN---TGSVA EINAQ-----
SN---TGSVA EINAQ-----
SN---SGTVA EVNAQ-----
SN---SGTVA EVNAQ-----
SN---SGTVA EVNAQ-----
TN---TGTVS EANSQ-----
SSA--AGTIA EVTIQ-----
SA---TSSVT ELNTQ-----
SA---TSSVT ELNTQ-----
TG---SSSVT EINAQ-----
TN---TSTVQ EINAA-----
SN---ACTTQ ELNAQ-----
SN---AFTTQ ELNAQ-----
VN---PPSVT EANTQ-----
VN---PPTIT EANTQ-----
SN---PGSVS ETNAQ-----
TS---TGSVS EANTQ-----
TS---TGSVS EANTQ-----
RGRLYEYANN --S--VKATI DRYKKASSDS SN---TGSTS EANTQ-----
CUM1 PTAG1 PTAG2 CaMADS1 PLE RAG NAG1 PMADS3 TAG1 FAR GAGA1 GAGA2 GAG2 CAG2 CUS1 RaD2-2 RAP1 SLM1 ZAG2 ZmOV13 ZMM1 ZmOV23 GGM10 AGL12 PrMADS7 AGL14 ETL PbMADS1 SaMADSA TM3 TobMADS1 FDRMADS8 PrMADS6 PrMADS8 PrMADS4 DAL3 PrMADS5 PrMADS9 EAM4 GGM1 AGL13 AGL6 OsMADS6 ZAG3 TaMADS12 ZAG5 MdMADS11 DAL1 PrMADS3 EAM3 GGM11 MADS1 PrMADS2 GGM9 AGL2 AGL4 MdMADS1 MdMADS8 MdMADS9 AGL9
51 1 0 0 RGRLYEYANN --S--VKATI DRYKKASSDS SN---TGSTS EANTQ----- RGRLYEYSND --S--VKSTI ERYKKASADS SN---TGSVS EANAQ----- RGRLYEYSNN --S--VKSTI ERYKKACADS SN---NGSVS EANAQ----- RGRLYEYANN -SS--VKTTI ERYKKACADS SN---SGSVS EANTQ----- RGRLYEYANN --S--VRATI ERYKKASADS SN---SVSTS EANTQ----- RGRLYEYSNN --S--VRETI ERYKKACADS SN---NGSVS EATTQ----- RGRLYEYANN --S--VKATI ERYKKACSDS SN---TGSIS EANAQ----- RGRLYEYANN --S--VKATI ERYKKACSDS SN---TGSIA EANAQ----- RGRLYEYANN --S--VKATI ERYKKACSDS SN---TGSVS EANAQ----- RGRLYEYANN --S--VKATI DRYKKASSDS SL---NGSIS EANTQ----- RGRLYEYANN --S--VKGTI DKYKKACLDP PT---SGTVA EANTQ----- RGZLYEYANN --S--VKGTI DRYKKACLDP PS---SGSVA EANAQ----- RGRLYEYANN --S--VKGTI ERYKKACTDS PN---TSSVS EANAQ----- RGRLYEYANN --S--VRATI SRYKKAYSDP ST---AMTVS EANTQ----- RGRLYEYANN --S--VRATI SRYKKAYSDP ST---AMTVS EANTQ----- RGRLYEYANH --S--VKATI ERYKKTCSDS TG---VTSVE EANAQ----- RGRLYEYANH --S--VKATI ERYKECKSDS TG---VTSVE EANAQ----- RGRLYEYANH --S--VKGTI DRYKKASSDN SG---ASSAI4 EANAQ----- RGRLYEYANN --S--VKATV ERYKECAHTV- -GSSSGPPLL EHNAQQ---- RGRLYEYANN --S--VKATV ERYKKAHTV- -GSSSGPPLL EHNAQQ---- RGRLYEYANN --S--VKATI ERYKKAHAV- -GSSSGPPLL EHNAQQ---- RGRLYEYANN --S--VKATf ERY-V- -GSSSGPPLL EHNAQQ---- RGKLYELATS NKS--MMSTL ERYQRSSAT- -GKQLNLYPG SSNEK----- QGKLFELATK G-T--MSGMI DKYMKCTGGG RGSSSATFTA QEQLQPP-NL RGKRYEFANP --S--MQKML ARYENFSEGS KATSTA---K EQDVQEW-IL RGKLYEFSSS S-S--1PKTV ERYQKRIQDL GS--NHK--R NDNSQQ---- RGKLYEFSSS S---- LCKTI EKYQTRAKDM EA--KTA--E 1s-MQP---- RGKLHEFASS S---- MHETI ERYRKHTKDV QS-NNTP--V VQNMQH---- KGKLYEFASS N----MQDTV DRYLRHTKDR VS-SKPV--S EENMQH---- RGKLYEFASS S---- T Q E I I RGNKRHTKDR VQPENQA--G PQYLQY---- RGKLYEFASS S----MQEII ERYKRHTKDK VQPENQV--G EQNLQH---- RGRLYEFASA P-S--LQKTI DRYKAYTKDH VN--NKT--1 QQDIQQ---- RGKVYEFSST C----MQKML ARYENFSEGS K-ATSTA--K EQDVQG---- RGWYEFSST C---- MQKML ARYEKCSEGS --DTSTS--K EQDVQC---- RGKVHEFSST C----MQKML ERYEKCSEGS K-TTSIA--K EEDPKA---- RGKLYEFANP S---- MQKML ERYDKCSEGS N-TTNTT--K ERDIQY---- 
SGKLYEFAST S---- MQKLL EKYEICSQEC G-TSESN--K KQDPQC---- RGKLYEFASP S---- MEEIL EKYKKRSKEN G-MAQTT--K EQDTQYS-KH RGKLYEFASP C----MQKML ERYQKCCQEA NPNSSKT--L EEDTQH---- RGKLYEFANP S----MQKML DRYQKCCQES TANTSKN--L VEDTQH---- GGKLYEFSN- -VG--VGRTI ERYYRCKDNL LDN---- DT- LEDT----- Q RGKLYEFGS- -VG--1ESTI ERYNRCYNCS LSN---- NK- PEETT----Q RGKLYEFGS- -AG--1TKTL ERYQHCCYNA QDS----NN- ALSET----Q RGKLYEFGS- -AG--1TKTL ERYQHCCYNA QDS---- NG- ALSET---+ RGKLYEFGS- -AG--TTKTL ERYQHCCYMA QDS---- NG- ALSET----Q RGKLYEFGS- -AG--VTKTL ERYQHCCYNA QDS---- NNS ALSES----Q RGKLYEFAS- -AG--MSKTL ERYQRCSFTP PEN---- S 1- -ERET----Q RGKLYEFAS- -SS--MNKTL ERYEKCSYAM QDT---- TGV SDREA----Q RGKLYEFAS- -SS--MNKTL ERYEKCSYAM QDT---- TGV SDREA----Q RGKLYEFGS- -AG--TLKTL ERYQKCSYSM QEE---- N-S SDREA----Q RGKLYEFGS- -AG--TLKTL ERYQKCSYAL QES---- N-N SDRDA----Q RGKLYEFGS- -AG--MLKTL ERYQKCSYVL QDA----T-V SDREA----Q RGKLYEFGS- -AG--MLKTL ERYQKCSYVL QDA----T-V SDREA----Q RGKLYEFAS- -SS--MSKTL ERYEKCSYSM QEN----A-S TDRDA----Q RGKLYEFCS- SSN--MLKTL DRYQKCSYGS IEV----NNK PAKEL---- E RGKLYEFCS- TSN--MLKTL ERYQKCSYGS IEV----NNK PAKEL---- E RGKLYEFCS- SSS--MLKTL DRYQKCSYGA VDQ----VNR PAKEL---- E RGKLYEFCS- SSS--1LECTL DRYQKCSYGA VDQ----VNR PAKEL---- E RGKLYEFCS- SPS--1LQTV DRYQKCSYGA VDQ----VNI PAKEL---- E RGKLYEFCS- SSS--MLRTL ERYQKCNYGA PEP----- NV PS REALAVEL
SaMADSD DEFH200 DEFH72 FBP2 NsMADS3 TM5 MTF1 EGM1 OM1 OTG7 CMB1 EGM3 PrMADS1 MdMADS3 MdMADS7 MdMADS6 DEFH49 AGL3 FDRMADS1 OsMADS45 OsMADS7 M79 OsMADS24 OsMADS8 SbMADS1 ZMM7 MdMADS4 FDRMADS2 OsMADS5 ZMM3 OsMADS1 ZMM8 AGL8 SaMADSB BoAP1 Boi1AP1 Boi2AP1 SaMADSC-2 SaMADSC AP1 BOCAL BoiCAL BobCAL CAL BpMADS3 MdMADS5 BpMADS5 MdMADS2 NAP1-1 NsMADS1 POTM1-1 POTM1-2 SCM1 TM4 SLM5 NAP1-2 NsMADS2 NtMADS5 GSQUA1
5 1 IO0 RGKLYEFCS- SSS--MIRTL ERYQKCNYGP PEP----- NV PSREALAVEL RGKLYEFCS- STS--MLNTL ERYQKCNYGP PET----- NV STREA--LEI, RGECLYEFCSN SGT--MLKTL ERYQKCNYGA PEA----- NV STREA--LEL RGKLYEFCS- SSS--MLKTL ERYQKCNYGA PET----- N I STREA--LEI RGKLYEFCS- SSS--MLKTL ERYQKCNYGA PET----- N I STREA--LEI RGKLYEFCS- SSS--MLKTL ERYQKCNYGA PEP----- N I STREA--LEI RGKLYEFCS- TSS--MLKTL ERYQKCNYGA PEG----- NV T S KEALVLEL RGKLYEFCS- SSS--MLKTL ERYQKCNYGA LEP----- NV SARES--LEL RGKLYEFCY- STS--ML-CL EKYQKCNFGS PES----- T I ISRET----Q RGKLYEFCS- SRS--MLKTL EKYQKCSDGA PEM----- TM TSRET----Q RGKLYEFCS- TSC--MNKTL ERYQRCSYGS LET----- S Q PSKET----5 RGKLYEFCS- SSS--MMKTI EKYQKCSYGS LET----- NC SINEM----Q RGKLYEFCS- SSS--MMKTI EKYQKCSYGS LET----- NC SINEM----Q RGKLYEFCS- SFS--MMKTL EKYQSCSYGS LEA----- NL PANET----Q RGKLYEFCS- SES--MMKTL EKYQSCSYGS LEA----- NL PANET----Q RGKLYEFSS- SLS--MMKTL ERYQRCSYSS LDA----- NR PANET----Q RGKLYEFCS- SSN--MLKTI ERYQKSSYGS LEV----- NH QAKDI---EA RGKLYEFCSS PSG--MARTV DKYRKHSYAT MDP----- NQ SAKDL----Q RGKLYEFCS- TQS--MTKTL EKYQKCSYAG PET----AVQ NRESEQ--LK RGKLYEFCS- TQS--MTKTL EKYQKCSYAG PET----AVQ NRESEQ--LK RGKLYEFCS- TQS--MTKTL EKYQKCSYAG PET----AVQ NRESEQ--LK RGKLYEFCS- TQS--MTKTL EKYQKCSYAG PET----AVQ NRESEQ--LK RGKLYEFCS- GQS--MTRTL ERYQKFSYGG PDT----AIQ NKENEL--VQ RGKLYEFCS- GQS--MTRTL ERYQKFSYGG PDT----AIQ NKENEL--VQ RGKLYEFCS- GQS--1TKTL ERYEKHMR-- PDT----AVQ NKENEL--VQ RGKLYEFCS- GQS--1TKTL CRYEKNSYGG PDT----AVQ NKENEL--VQ SGKLYEFCS- GPS--IAKTL ERHQRCTYGE LGA----- S Q SAEDE----Q RGRLFEFST- SSC--MYKTL ERYRSCNYNL NSC---- EAS AALET---EL RGRLFEFST- SSC--MYKTL ERYRSCNYNL NSC---- EAS AALET---EL RGRLFEFST- SSC--1YKTL ERYRSCSF-- -AS---- EAS APLEA---EL RGRLFEFSS- SSC--MYKTL ERYRÇCNYN- -SQ---- DAA AP-EN---EI RGRLFEFSS- SSC--MYKTL ERYRSSNYS- -TQ---- EVK APLES---EI KGKLFEYST- DSC--MERIL ERYDRYLYSD KQL----VGR DVSQS----E KGKLFEYST- DSC--MEKIL ERYDRYLYSD KQL----VCR DISQS----E KGKLFEYST- DPC--MEKIL ERYERYSYAE RQL----1AP ESDVN----T KGKLFEYST- DSC--MEKIL ERYERYSYAE 
RQL---- I A P ESDVN----T KGKLFEYST- DSC--MEKIL ERYERYSYAE RQL---- I A P ESDSN----T KGKLFEYST- DSC--MEKIL EAYERYSYAE RQL---- I A P ESDSN----T KGKLFEYST- DSC--MEKIL ERYERYSYAE RQL---- I A P ESDVN----T KGKLFEYST- DSC--MEKIL ERYERYSYAE RQL---- IAP ESDVN----T KGKLFEYSS- ESC--MEKVL EHYERYSYAE KQL----KVP DSHVNA--QT KGKLFEYSS- FSC--MEKVL EHYERYSYAE KQL----KVP DSHVNA--QT KGKLFEYSS- E S C - - M E m ERYERYSYAE KQL----KM DSHVNA--QT KGKLFEYSS- ESC--MEKVL ERYERYSYAE RQL----1AP DSHVNA--QT KGKLFEYAT- DSS--MEKIL ERYERYSYAE AQL----VAA DSEGQ----G KGKLFEYAT- DSC--MEQIL ERYERYSYAE RQL----VEP DFESQ----G KGKLFEYST- DSC--MERIL ERYERYSYAD RQL----LAN DLEQN----G KGKLFEYSN- DSC--MERIL ERYERYSYTE RQL----LAN DNEST----G KGKLFEYST- DSC--MERIL ERYERYSYAE RQL----TAT DHETP----G KGKLFEYST- DSC--MERIL ERYERYSYAE RQL----TAT DDETP----G KGKLFEYAN- DSC--MERLL ERYERYSFAE RQL----VPT DHTSP----G KGKLFEYAN- DSC--MERLL ERYERYSFAE RQL----VPT DHTSP----G KGKLFEYAT- DSC--MERLL ERYERYSFAE KQL----VPT DHTSP----G KGKLFEYAN- DSC--MERIL ERYERYSFAE KQL----VPT DHTSP----V KGKLFEYAT- DSC--MEKIL ERYERYSYAE RQL----TAP DPDSH----V KGKIFEYSS- DSC--MEQIL ERYERYSYTE RRL----LAS NSESSV--QE KGKIFEYSS- DSC--MEQIL ERYERYSYTE RRL----LAS NSESSV--QE KGKIFEYSS- DSC--MEQIL ERYERYSYAE RRL----LSS NSESSV--QE KGKLFEYST- DSC--MENIL DRYEQYSNID RQH----VAV DTDSP---- 1
SQUA KGKLFEYST- DSC--MDRIL EKYERYSFAE RQL----VSN EPQSP----A
SLM4 BpMADS4 LtMADS1 TaMADS11 SbMADS2 ZAP1 LtMADS2 AGL15-1 AGL15-2 AGL15 GGM13 FDRMADS5 TM8 FLF CerMADS1 CRM4 CMADS3 CRM1 CRM5 CMADS2 CRM2 CMADS4 CMADS6 CRM3 ANR1 DEFH125 NMHC5 AGL17 EAM5 GGM12 CerMADS2 CRM6 OPM1 CerMADS3 CMADS1 CRM7 EAM1 GGM6 GGM7 GGM4 GGM8 GGM5 CUM26 SLM2 FBP3 PMADS2 GLO SvPI-1 FBP1 NTGLO GGLO1 EGM2 PI DePI-1 ScPI LtPI MfPI DaPI HPI-1 HPI-2
51 RGKLFDFAS- KGKLFEFSS- KGKLYEFAT- KGKLYEFST- KGKLYEYAT- KGKLYEYAT- KGKLYEYAT- SGKLFEFS-- SGKLFEFS-- SGKLFEYS-- SGKLFEYSSA AGKLYEYSS- SGKAYHFAS- SGKLYSFSSG KGKLFHFG-- KGKLFHFG-- KGKLFQFA-- KGKLFQFA-- KGKLFHFA-- KGKLFQFA-- RGRLFQFA-- KGRLFEFA-- SGRLFEYAG- SGRLFEYAG- TGKLYDYASN TGKLYEFSS- TAKLYDFAS- TDKLYDFAS- TGRLYEFCN- TGKLYDYCS- TGKLFEYSSS TGKLFEHSSS TGKLSEFASS TGRLSEFASS TGRLSEFAST TGRLSEFATP TGKLTEWASD TGKLYSHVGK TGKLYEYASS AGELYHFSNP SGKLHQFSNR TGKLFEFSSA SGKMHEYCSP NGKMHAYHSP SGKMHEYCSP SGKMHEYCSP SGKMHEFCSP SGKMHEECSP SGKMHEFSS- SGKMHEFSS- SGKMYEYCSP SGKMHEYCSS NGKMIDYCCP TGKMSEYCSP TGKMAEYCSP TGKISEYCSP TGKMSEYCSP TGKMSEFISP LSKMSEFCSP
1 0 0 DSC--MEKIL ERYERYCYAE KQL----ASN DPDAQ----V DSS--MDRIL ERYERYSYAD ML----MAT ESESQ----G DSC--MDKIL ERYERYSYAE KVL----1ST ESEfQ----G ESC--MEKIL ERYERYSYAE WL----VSS ESEIQ----G DSR--MDKI L ERYERYSYAE KAL---- I S A ESESE----G DSR--MDKIL ERYERYSYAE ELAL---- I S A ESESE----G DSS--MDKIL ERYERYSYAE KAL---- I S A ESESE----G STS--MKKTL LRYGNYQISS DVP----- G I STG--MKRTV LRYENYQRSS DAP----- L I STG--MKQTL SRYGNHQSSS AS-------- SSS--MKKII ERYQKVSGAR IT-------- -SS--1EDTY DRY-QQFAGA RRD----LN= -HD--1ERTI LRY-KNEVGL SKN----SDQ -DN--LVKIL DRYGKQWDD LKA----LDH NPS--METVL KRYMKANGDP ------- KAG NPS--METVL KRYMKANGDP ------- KAG NPS--METVL GRYVKASRDP ------- FAG NPS--METVL GRYVKASRDP ------- EAG NPS--METVL GKYVKASGDP ------- GAG NPS--MKSVL ERYYKAQGDA ------- ESA SPS--MPSVL KRYLKAQTGA ------- KSA SPS--MESIL KRYMDSQKYL G------ I S E SRS--MREII QAWDAHEDS SSL--LQLRS SRS--MREII Q A W D W E D S SSL--LQLRS SS---MKTfI ERYNRVKEEQ HQLL--NHAS TS---MKSII ERHTKTKZDH HQLL--NHGS TS---LRSVI GRYNKSKEEH NQLG--STAS SS---VKSTI ERFNTAKMEE QELM--NPAS AS---MEDVL DKYNRNFQGK ------ EQRH SS---MKVLL ERYENDFREK ------ GTAR RG---IKKIL ERYKRCSGIL QDVGG-TVIR RG---IKKIL ERYKRCSGIL QDVGG-TVIR SM---MRRIL EKHRQWEGS QSIKP-TSQD s---- MDKII RRYEDLQSQS ASR---ALLH s---- MQKVL ERYQEHSNGA PSRK--VLLQ s---- MPKML EKYHRAIQRS QGNE--HSLQ N---- MKDTL KKFEAVSGIV SSDYQRQQLR HGS--LNQfI HRYLQNPHAQ LRYDQIFQTT s---- MRKTI EKYQRFEENS TNSTKSFKIK E---- MNTML AKYKVYSLKE GNLDRGMDI D E---- M N T I I TRYKSFIERH GDSKN-NELH N---- MKDLL DKYNKYLDGS NSSPT--ECD STP--LVDIL DKYHKQSGKR -------- LW ETA--VEDIL DQYHKISGKR -------- LW STT--LPDML DGYQKTSGRR -------- LW STT--LPDML DGYQKTSGRR -------- LW STT--LVDML DHYHKLSGKR -------- LW STT--LIDML DQYHKLSGKK -------- LW -TS--LVDIL DQYHKLTGRR -------- LL -TS--LVDIL DQYHKLTGRR -------- LW KTN--LIDML DRYQRLSGNK -------- LW NTS--LVDIL DQYHTQCGKR -------- LW SMD--LGAML DQYQKLSGKK -------- LW S T T - - L I K I L DRYQKASGKR -------- LW CP-- -LIEIL DRYQKSSGKR -------- LW S T T - - L W I L DRYQKSSGKK -------- 
LW
NCKTE--NQE KYKPE--NQE -------KAE ---EY--DNQ GSTSIN---- GPRAMEV--- QSKALN---- CNGSS---TD DNGSS---TD DNGSS---TD DNGSS---TD DNGSS---TD DNVSS---AE DNGSS---IK EENSS---AR EERCV---SQ EEACV---SQ EIK------- EVK------- EIK------- EVK------- EIK------- DQE------- DVE------- DVE------- VVE------- QRE------- DVE------- NIE------- LEMAR----- LTYAK----- SEAGS----- TARMS----- QTRG------ NINNQ----- DAKHEN---- DAKHEN---- DAKHEN---- DAKHEN---- DPKHEH---- DAKHEH---- DAKHEN---- DAKHEN---- DAKHEN---- DAKQEN---- DAKHEN---- DAKHEY---- DAKHEY---- DAKHEH----
STE--LVNVL DRYHKSAGKK --------LW DAKHEH----
NAS--LIKIL DKYQRTSGRR --------LW DAKHEY----
NTT--FPKML EKYQQHSGKK --------LW DAKHEN----
LNKISDFCSP NTS--LPKML EKYQQHSGKK --------LW DAKHEN----
RbPI-1 RfPI-1 RbPI-2 RfPI-2 PnPI-1 PnPI-2 OsMADS2 OsMADS4 PhPI PmPI-1 BobAP3 Boi2AP3 Boi1AP3 AP3 D1 RaD1 D2 RaD2-1 DEF SvAP3 LeAP3 PD4 PMADS1 NTDEF NMH7 GDEF2 SLM3 DeAP3-1 ScAP3 LtAP3 MfAP3 PnAP3-2 RfAP3-2 PcAP3 PnAP3-1 RbAP3-1 RfAP3-1 GDEF1 TM6 PTD PtAP3-1 PtAP3-2 OsMADS16 TaMADS51 CMB2 PhAP3 GGM2
51 100
NSKMFEFNSH P----LPEIL HKYQKDTGNK --------LW DAKHEY----
SD~MSEFNSR P----LPQIL EKYQKSSGNK --------LW DAKHEY----
NNKMSEFHSH S----LINSL KRYHKLCPGK -R------LW DAKHEF----
NGKMDEFHSH S----LINSL KRFHHKCPEK TR------LW DPKHEY----
TGKMNEYCSS P----LIKQL DRYQKASGNK --------LW DAKHEY----
AGSIDEYSSS P----LVEQL ARYQKETGNK --------LW DAKHER----
AGKLYDYCSP KTS--LSRIL EKYQTNSGKI --------LW DEKHKS----
AGKLSDYCTP KTT--LSRIL EKYQTNSGKI --------LW DEKHKS----
AGNMGEFCSP KTT--MDAIL TRYQNSTGTQ --------LW NAKHES----
AGNMGEFCSP KTT--MDAIL TRYQNSKGSQ --------LW NAKHEY----
SNKLHEFISP --NTTTKEIL DLYQTVSDVD --------VW NAHYER----
LTTKEIL DLYQTVSDVD -------- SNKLHEFISP --Nm VW SAHYER---- SNKLHEFISP --NTTTKEII DLYQTVSDVD -------- VW SAHYER---- SNKLHEYISP --NTTTKEIV DLYQTISDVD -------- W ATQYER---- TNKLHEFISP --NITTKQVY DAYQTTFSPD -------- LW TSHYAK---- TNKLHEFISP --NITTKQVY DAYQTTFSPD -------- LW TSHYAK---- RNKLHEFTTP --GTTTKQIY DMYQQLSGND -------- VW SSQYAM---- RNKLHEFTTP --GTTTKQIY DMYQQLSGND -------- VW SSQYAM---- TQKLHEYISP --TTATKQLF DQYQKAVGVD -------- LW SSHYEK---- TQKIHEYISP --TSSTKQLF DLYQTTVGVD -------- LW ITHYER---- TGKLHEFISP --SITTKQLF DLYQKTIGVD -------- I W TTHYEK---- TGKLHEFISP --SITTNNLF DLYQKTIGVD -------- IW TSHYEK---- TGKLHEFISP --SITTKQLF DLYQKTVGVD -------- LW NSHYEK---- TGKLHEFISP --SVTTKQLF DLYQKTVGVD -------- LW NSHYEK---- TGKLHEYISP --SASTKQFF DQYQTTVGID -------- LW NSHYEN---- TDZnHEYISP --SITTKQFF DQYQKASGID -------- LW NSHYEK---- NLKLHEFLSP GSNLTTKDVY DRYQKALGVD -------- I W VTHEKR---- TGKFSEYISP SAT--TKRMF DRYQQVSGVN -------- LW NSHYER---- TGKFSEYISP SAT--TKRIY DRYQQVSGTN -------- LW NSHYES---- TGKFSEYCSP STT--TKKIF DRYQQVSGSS -------- LW NSHYEK---- TGKFSEYCSP STT--TKNIF DRYQQASGTS -------- LW NSHYER---- TGKMTEYLSP SLNGNTKRVY DKYQQLSGIS -------- LW NSHYES---- SGKVSEYVSP GTS--FKSVY DQYQAINKMS -------- LW DSEYEK---- TGKVCEYVSP STT--MKEFF DRFRRVTNID -------- LW ASQYET---- TGKVCEYVSP STT--MKEFF DRFRRITNID -------- LW ASQYET---- SGKCVDFISP TIS--QKEFY DKYQKITKQD -------- LW KSQYDE---- TGKCVDFISP SIS--PKAFY DKYRDVTGDD -------- LW KSQYDK---- TGKFHEYISP STT--TKKMY DQYQSTVGFD -------- LW SSHYER---- TRKYHEYTSP NTT--TKKMI DQYQSALGVD -------- I W SIHYEK---- TNKLNEYISP STS--TKKIY DQYQNALGID -------- LW GTQYEK---- NRKLHEYTSP HTT--TKELY DLYQKASGKS -------- LW NSHYER---- NRKLHEYTSP HTT--TKDLY DLYQKASGNS -------- LW NSHYER---- TGKYHEFCSP STD--1KGIF DRYQQAIGTS -------- LW IEQYEN---- TGKYHEFCST GTD--1KGIF DRYQQAIGTS -------- LW IEQYEN---- THKLHHYLSP GVS--LKKMY DEYQKIEGVD -------- LW RKQWER---- TNRLYDYCSP STS--HKKVY DQYQDGRKVD -------- LW KKRYEN---- TGKLFQYCNT --S--MSQVL EKYHKSPGVD 
-------- HW DIELQI----
101 150
LQLVESQESR MSQEVLEKDR --ELSQLR-G --EDLQGLTL EELQRLESLL
LQLENSLNMR LSKQVADKTR --ELRQMR-G --EELEGLSL EELQQIEKRL
LRLENCNLSR LSKEVEDKTK --QLRKLR-G --EDLDGLNL EELQRLEKLL
LLSEKKTHAM LSRDFVEKNR --ELRQLH-G --EELQGLGL DDLMKLEKLV
YWQQEAGKLR QQIEISQNAN ----RHLM-G --DGLTALNI KELKQLEVRL
YWQQEAGKLR QQIDILQNAN ----RHLM-G --DGLTALNI KELKQLEVRL
YWQQEAGKLR QQIEILQNAN ----RHLM-G --DGLTALNI KELKQLEVRL
YWQQEAGKLR QQIEILQNAN ----RHLM-G --DGLTALNI KELKQLEVRL
YWQQEAGKLR QQIEILQNAN ----RHLM-G --DGLTALNI KELKQLEVRL
YWQQEAGKLR QQIEILQNAN ----WLM-G --DGLTALNI KELKQLEVRL
YWQQEAGKLR QQIEILQNAN ----RHLM-G --DGLTALNI KELKQLEVRL
EAM2 GGM3 AG BAG1 MZEAGAMOU ZMM2 OsMADS3 HAG1 ZAG1 CAG1 CUM10 MdMADS10 AGL11 FBP11 FBP7 AGL1 AGL5 LAG FBP6 PAGL1 CAG3 CUM1 PTAG1 PTAG2 CaMADS1 PLE RAG NAG1 PMADS3 TAG1 FAR GAGA1 GAGA2 GAG2 CAG2 CUS1 RaD2-2 RAP1 SLM1 ZAG2 ZmOV13 ZMM1 ZmOV23 GGM10 AGL12 PrMADS7 AGL14 ETL PbMADS1 SaMADSA TM3 TobMADS1 FDRMADS8 PrMADS6 PrMADS8 PrMADS4 DAL3 PrMADS5 PrMADS9 EAM4
101 150
WQQEAVKLK QQIEVLNNQF ----RHYM-G --DSIQSMTV KELKQLEGRL
YWQQEAVKLK QQIDVLNNQI ----RHYM-G --ECLQSMTI KELKQLEGKL
YYQQESAKLR QQIISIQNSN ----RQLM-G --ETIGSMS? KELRNLEGRL
YYQQESAKLR QQIISIQNSN ----RQLM-G --ETIGSMSP KELRNLEGRL
3 YYQQESSKLR QMIHSLQNAN T---RNIV-G --DSIHTMGL RDLKQMEGKL YYQQESSKLR QMIHSLQNAN T--- RNIV-G --DS IHTMGL RDLKQMEGKL HYQQESSKLR QQISSLQNAN S---RTIV-G --DSINTMSL RDLKQVENRL YYQQEATKLR QQITNLQNTN ---- RTLM-G --ESLSTMSL RELKQLEGRL HYKQESARLR QQIVNLQNSN ---- RALI-G --DSITTMSH KELKHLETRL YYQQESAKLR QQIQMLQNSN ---- RHLM-G --DSLSALTV KELKQLENRL YYQQESAKLR QQIQMLQNSN SNLVRHLM-G --DSLSALTV KELKQLENRL ---------- ------- NSN ---- MLM-G --DALSTLTV KELKQVENRL YYQQESAKLR QQIQTIQNSN ---- RNLM-G --DSLSSLSV KELKQVENRL FYQQESKKLR QQIQLLQNTN ---- RHLV-G --EGLSALNV RELKQLENRL FYQQESKKLR QQIQLIQNSN ---- RHLV-G --EGLSSLNV RELKQLENRL YYQQEASKLR RQI RDIQNSN ---- RHIV-G --ESLGSLNF KELKNLEGRL YYQQEASKLR RQIRDIQNLN ---- RHIL-G --ESLGSLNF KELKNLESRL FYQQESSKLR RQIRDIQNLN ---- RHIM-G --EALSSLTF RELKNLEGRL YYQQEAAKLR RQIRDIQTYN ---- RQIV-G --EALSSLSP RGLKNLEGKL YYQQEAAKLR RQIRDIQTYN ---- RQIV-G --EALSSLSP RDLKNLEGKL FYQQEAAKLR VQIGNLQNSN ---- RNML-G --ESLSSLTA KDLKGLETKL FYQQEAAKLR VQIGNLQNSN ---- RNML-G --ESLSSLTA KDLKGLETKL YYQQEAAKLR SQIGNLQNSN ---- RHML-G --EALSSLSV KELKSLEIRL FYQQEAAKLR SQIGNLQNSN ---- RNML-G --ESLSALSV KELKSLEIKL FYQQEMKLR GQIRSVQDSN ---- RHML-G --EALSELNF KELKNLEKNL FYQQEANKLR RQIREIQTSN ---- RQML-G --EGVSNMAL KDLKSTEAKV YYQQEAAKLR AQITTLQNSN ---- RGYM-A --EGLSNMS I KELKGVETKL YYQQEASKLR AQIGNLQNQN ---- RNML-G --ESLAALSL RDLKNLEQKI YYQQEASKLR AQIGNLQNQN ---- RNFL-G --ESLAALNL RDLRNLEQKI YYQQEASKLR AQIGNLMNQN ---- RNMM-G --EALAGMKL KELKNLEQRI YYQQEASKLR P-QISNLQNQN ---- RNML-G --ESLGALSL RELKNLESRV YYQQEAAKLR QQIANLQNQN RQFYRNIM-G --ESLGDMPV KDLKNLEGKL FYQQEAAKLR QQIANLQNQN RQFYRNIM-G --ESLGNMPA KDLKNLESKL FYQQEASKLR QEISSIQKNN ---- RNMM-G --ESLGSLTV RDLKGLETKL FYQQESAKLR AQIGNLQNLN ---- RHLL-G - -ESISSLSV KDLKSLEVKL FYQQESAKLR AQIGNLQNLN ---- RHLL-G - -ESISSLSV KDLKSLEVKL ---QEAAKLR NQIRTLQNQT RNTSRNLM-G --EGLTSMNM KDLKNLETRL ---QEAAKLR NQIRTLQNQT RNTSRNLM-G --EGLTSMNM KDLKNLETRL YYQQEAAKLR NQIRTVTENN ---- RHLM-G --EGLSSLNM KDLKSLENKL FYQQESAKLR NQIQMLQNTN ---- RHLV-G -- 
DSVGNLSL KELKQLESRL FYQQESAKLR NQIQMLQNTN - --- RHLV-G --DSVGNLSL KELKQLESRL FYQQESVKLR NQIQMLQNTN ---- RHLV-G --DSVGNLSL KELKQLESRL FYQQESVKLR NQIQMLQNTN ---- RHLV-G --DSVGNLSL KELKQLESRL -LDLEVKFLR NQVEQLKATN ---- RYLM-G --EELATMSL DELNELEAQL DPKDEINVLK QEIEMLQKGI ---- SYMF-G --GGDGAMNL EELLLLEKHL SEENAFLGKK ,WDPHSVSKT ---- PGSE-S --GSIQNSEV ETQLVMRPPC -SKDETYGLA RKIEDLEIST ---- RKMM-G --EGLDASSI EELQQLENQL -SKGNTLDME KKIEHFEISR ---- RRLL-G --EGLDSCSV EELQQTENQL -LKHETASLA KKIELLEVSK ---- RKLL-G --EGLGTCSI NELQQIEQQL -FKHEAANMM KKIEQLEASK ---- RKLL-G --EGIGSCSI EELQQIEQQL -MQHEAANLM KKIELLETAK ---- RKFL-G --EGLQSCTL QEVQQIEKQL -MQHAAASLM KKIELLEESK ---- RKLL-G --EGLQSCSL VELQQIEKQL -VKDDTLGLA KKLEALDESR ---- RKI L-G --ENLEGCS 1 EELRGLEMKL -LKRQIANME ERIEILDSMH ---- RKML-G --DELASCAL KDLNELESQV -LKRESANME ERIEILESMQ ---- RKML-G --EELASCAL KDLNQLESQV -LKREIANME ERIEILERTQ ---- RKML-G --EELASCAL KDLNQLESQV -LKREIANRE ERIKILESRQ ---- RKMV-G --EELASCAL SDLNLLESQV -LKQEIENME KRVRILQSTQ ---- RKML-G --EGLALCS 1 KELNQLEGQV -SKQKLANME EQIRILESTQ ---- RKML-G --EGLESCSM AELNKLESQA -LKQEIAHME EKIKGLESAQ ---- RKLL-G --EELSCLTM KDLNQLENQA
GGM1 AGL13 AGL6 OsMADS6 ZAG3 TaMADS12 ZAG5 MdMADS11 DAL1 PrMADS3 EAM3 GGM11 MADS1 PrMADS2 GGM9 AGL2 AGL4 MdMADS1 MdMADS8 MdMADS9 AGL9 SaMADSD DEFH200 DEFH72 FBP2 NsMADS3 TM5 MTF1 EGM1 OM1 OTG7 CMB1 EGM3 PrMADS1 MdMADS3 MdMADS7 MdMADS6 DEFH49 AGL3 FDRMADS1 OsMADS45 OsMADS7 M79 OsMADS24 OsMADS8 SbMADS1 ZMM7 MdMADS4 FDRMADS2 OsMADS5 ZMM3 OsMADS1 ZMM8 AGL8 SaMADSB BoAP1 Boi1AP1 Boi2AP1 SaMADSC-2 SaMADSC
101 -LKREVAIME EKIKMLEYAQ GLRQEVTKLK CKYESLLRTH SWCQEVTKLK SKYESLVRTN SWYHEMSKLK AKFEALQRTQ SWYQEMSKLR AKFEALQRTQ SWYQEMSKLK AKFEALQRTQ SWYQEMSKLR AKFEALQRTQ SWYQEVTKLK AKYESLQRTQ NWHQEVTKLK GKVELLQRSQ NWHQEVTKLK GKVELLQRSQ NWHHEVSKLK AKVELLQRSQ TWHHEVSKLK TKVEILQRSQ NWHQEVGKLK ARVELLQRSQ NWHQEVGKLK ARVELLQRSQ NWHHEVTKLK AKLESLHKAQ NSYREYLKLK GRYENLQRQQ NSYREYLKLK GRYENLQRQQ SSYREYMKLK GRYESLQRTQ SSYREYMKLK GRYESLQRTQ SSYREYMKLK GRCESLQRTQ SSQQEYLKLK ERYDALQRTQ SSQQEYLKLK ERYDALQRTQ SSQQEYLKLK ARYEALQRSQ SSQQEYLKLK ARYEALQRSQ SSQQEYLKLK ARYEALQRSQ SSQQEYLKLK ARYEALQRSQ SSQQEYLKLK GRYEALQRSQ SSQQEYLKLK ARYESLQRSQ SCQQEYLRLK ARYEGLQRTQ SSQQEYLKLK NRVEALQRSQ SSQVEYLKLK SQVEALQRSQ SSYQEYLKLK AKVDVLQRSH NSYQDYLKLK ARVEVLQRSQ NSYQDYLELK ARVEVLQRSQ NSYQDYLMLK ARVEVLQQSQ NSYQDYLMLK ARVEVLQQSQ NSYQEYLQLE TRVEALQQSQ SSYKEYLKLK SKYESLQGYQ DKYQDYLKLK SRVEILQHSQ ASRNEYLKLK ARVENLQRTQ ASRNEYLKLK ARVENLQRTQ ASRNEYLKLK ARVENLQRTQ ASRNEYLKLK ARVENLQRTQ SSRNEYLKLK ARVENLQRTQ SSRNEYLKLK ARVENLQRTQ SSRNEYLKLK ARVDNLQRTQ SSRNEYLKLK ARVDNLQRTQ SRYQEYLKLK TKVEALQRTQ SNYQEYLKLK TRVEFLQTTQ SNYQEYLKLK TRVEFLQTTQ NNYQEYLKLK TRVEFLQTTQ N-YQEYLKLK TRVEFLQTTQ 14-YQDYLKLR TRVDFLQTTQ NWVLEKAKLK ARVEVLEKNK NWVLEHAECK ARVEVLEKNK NWSMEYNRLK AKIELLERNQ NWSMEYNRLK AKIELLERNQ NWSMEYNRLK AKIELLERNQ NWSMEYNRLK AKIELLERNQ
150 ---- KKLL-G --ENLESLSM KELTQLENQA ---- RNLV-G --EDLEGMSI KELQTLERQL ---- RNLL-G --EDLGEMGV KELQALERQL ---- RELL-G --EDLGPLSV KELQQLEKQL ---- RHLL-G --EELGPLSV KELQQLEKQL ---- RWLL-G --EDLGPLSV KELQQLEKQL ---- RHLL-G --EDLGPLSV KELQQLEKQL ---- W L L - G --EDLGPLSV KELQNLEKQL ---- RHLL-G --EDLGPLNV KELQQLERQL ---- RHLL-G --EDLGPLNV KELQQLERQL ---- RHLM-G --EDLGPLSI RELQNLERQI ---- RHLL-G --EDLGPLSI RELQTLERQI ---- RHLL-G --EDLGPLSI KELQQLERQL ---- RHLL-G --EDLGPLSI KELQQLERQL ---- RSLM-G --EDLGPLNI KELQSLEQQL ---- RNLL-G --EDLGPLNS KELEQLERQL ---- RNLL-G --EDLGPLNS KELEQLERQL ---- RNLL-G --EDLGPLNT KELEQLERQL ---- RNLL-G --EDLGPLNT KELEQLERQL ---- RNLL-G --EELGPLNT KELEQLERQL ---- RNLL-G --EDLGPLST KELESLERQL ---- RNLL-G --EDLGPLST KELELLERQL ---- RNLL-G --EDLGPLNS KELESLERQL ---- RNLL-G --EDLGPLNS KELESLERQL ---- RNLL-G --EDLGPLNS KELESLERQL ---a RNLL-G --EDLGPLNS KELESLERQL ---- RNLL-G --EDLGPLNS KELESLERQL ---- RNLM-G --EDLGPLSS KDLETLERQL ---- RNLL-G --EELGQLCS KELESLERQL ---- RNLL-G --EDLGPLGS KELCQLERQL ---- RNLL-G --EDLNPLGG KDLDQLERQL ---- RNLL-G --EDLGELST KELEQLEHQL ---- RNPP-W --EELGPLNS KELEQLEHQL ---- F.:dLL-G --EELGPLNS KELEQLERQL ---- RNLL-G --EDLSHLNT KELEHLEHQL ---- RNLL-G --EDLSHLNT KELEHLEHQL ---- RNLL-G --EDLATLNT KKLEELEHQL ---- RHLL-G --DDLGPLNM NDLEHLEHQL ---- RHLL-G --EELSEMDV NELEHLERQV ---- RNLL-G --EDLDSLGI KELESLEKQL ---- RNLL-G --EDLDSLGI KELESLEKQL ---- RNLL-G --PDLDSLGI KELESLEKQL ---- RNLL-G --EDLDSLGI KELESLEKQL ---- RNLL-G --EDLGTLGI KELEQLEKQL ---- RNLL-G --EDLGTLGI KELEQLEKQL ---- RNLL-G --EDLGSLGI KELEQLEKQL ---- RNLL-G --EDLGSLGV KELDQLEKQI ---- RHLL-G --EDLVHLGT KELQQLENQL ---- RNLL-G --EDLVPLSL KELEQLENQI ---- RNLL-G --EDLVPLSL KELEQLENQI ---- RNLL-G --EDLGPLSV KELEQLENQI ---- RNIL-G --EDLGPLSM KELEQLENQI ---- RNIL-G --EDLGPLSM KELEQLENQI ---- RNFM-G --EDLDSLSL KELQSLEHQL ---- RNFM-G --EDLDSLSL KELQSLEHQL ---- RHYL-G --EDLQAMSP KELQNLEQQL ---- RHYL-G --EDLQAMSP KELQNLEQQL ---- RHYL-G --EDLQAMSP KELQNLEQQL ---- RHYL-G 
--EDLQAMSP KELQNLEQQL
NWSMEYNRLK AKIELLERNQ ----RHYL-G --CDLQAMSS KELQNLEQQL
AP1 BOCAL BoiCAL BobCAL CAL BpMADS3 MdMADS5 BpMADS5 MdMADS2 NAP1-1 NsMADS1 POTM1-1 POTM1-2 SCM1 TM4 SLM5 NAP1-2 NsMADS2 NtMADS5 GSQUA1 SQUA SLM4 BpMADS4 LtMADS1 TaMADS11 SbMADS2 ZAP1 LtMADS2 AGL15-1 AGL15-2 AGL15 GGM13 FDRMADS5 TM8 FLF CerMADS1 CRM4 CMADS3 CRM1 CRM5 CMADS2 CRM2 CMADS4 CMADS6 CRM3 ANR1 DEFH125 NMHC5 AGL17 EAM5 GGM12 CerMADS2 CRM6 OPM1 CerMADS3 CMADS1 CRM7 EAM1 GGM6 GGM7
1 0 1 NWSMEYNRLK NWSVEYSRLK NWSVEYSRLK NWSMEYSRLK NWSMEYSRLK SWTMEFARLK NWT FEYSRLK SWTLEEfAKLK S W T L E M L K SWTLEPAKLK SWTLEHAKLK SWTLEHAKLK SWTLEHAKLK SWTLENAKLK SWTLEKRKLK SWTLEHAKLK NWSLEYAKLK NWSLEYAKLK NWSLEYAKLK RWTQECNKLK NWTLEYSKLK NWT FDYAKï,K SWSLEFPKLS NWCHEYRKLK NWCHEYRKLK NWCHEYRKLK NWCHEYRKLK NWCHEYRKLK E-CTEVDLLK EDCTEVDFLK EDCAEVDILK HLYCEMTRMK --SDENASIH -WRTKIDDMT -------- YG NVEADH--LT NVEADH--LT NVEADR--LT NVEADR--LT NLDTDR--LT NIEIDR--1T NLEI DR--LT LLNGDKHWVT DLRAELTELR DLRAELTELR EWQREVASLQ EWQREAATLR FC-QREAAVLR FWQREAETLR IDSPEIMAAQ I DNGDVLKAQ YWKQEAERLK YWKQEAERLK YWRHEATRLK YWKNQALHLR FWKREVL FLR HWQHEAZNLR --IARENEQL --EDKKFDLC
AKIELLERNQ AKIELLERNQ AKIELLERNQ AKIELWERNQ AKIELLERNQ GKVELLQRNH Am-EVLQRNH ARIEVLQRNQ ARVEVLQRNQ GRLEVLQRNQ ARLEVLQRNQ ARLEVLQRNQ ARLEVLQRNQ ARLEVLQRNE ARLEVLQRNQ ARLEILQKNH A K I DLLQRNH A K I DLQQRNH AKI DLLQRNH SRAELLQRNL ARIELLQRNH AKLDLLQRNH ARIEVLERNI AKVET IQRCQ AKVET IQKCQ AKIETIQKCH AKIET IQKCH AKIETIQKCH DEISMLQEKH NEISKLQEKH DQLSKLQEKH NENEKLQTNI YRLRDIT--- RTIHELEARD SHYELLELVD VFTEKLKLLQ VFTEKLKLLQ VFTEKLKMLQ VFTEKLKMLQ VFTEKLKMLQ LFTEKLKALQ PLIQKLKALQ TLRNRLKMLR KEVESLRQEK KEVESLRQEK QQLQHLQECH QQLQDLQENH QQLHNLQESH QELHSLQENY QQLTELQHRQ QQVAELERAR ERLTYMEEIQ ERLTYMEEIQ HQLGWQETQ RQVGCMNDIQ DQLFHLKNYE KRIQLLQLRQ MAQIRYRKG- RFFSDLRELM
1 5 0 ---- RHYL-G --EDLQPMSP KELQNLEQQL ---- RHYL-G - - E D L E S I S I KELQNLEQQL ---- RKYL-G - - E D L E S I S I KELQNLEQQL ---- RHYL-G - - E D L E S I S I KELQNLEQQL ---- RHYL-G --EELEPMSL KDLQNLEQQL ---- REYL-G --DDLESLSH KELQNLEQQL ---- RHYL-G --EDLDSLTL KEIQNLEQQL ---- KHFV-G --EDLDSLSL KELQNLEQQL ---- RHYM-G --EDLQSLSL KELQNLEQQL ---- GHYA-G --EDLDSLCM KELQNLEHQL ---- RHYA-G --EDLDSLSM KELQNLEHQL ---- KHYV-G --EDLESLNM KELQNLEHQL ---- KHYV-G --EDLESLNM KELQNLEHQL ---- KLYV-G --EDLESLNM KELQNLEHQL ---- KHYV-G --EDLESLSM KELQNLEHQL ---- RHYM-G --EDLDTLSL KELQNFEHQL ---- KHYM-G --EDLDSLNL KDLQNLEQQL ---- KHYM-G --EDLDSLSL KDLQNLEQQL ---- KHYM-G --EDLDSLNL KDLQNLEQQL ---- RHYM-G --EDIESLGL REIQNLEQQL ---- RHYM-G --EDLDSMSL KEIQSLEQQL ---- RQYL-G --QDLDALNL KELQSLEQQL ---- RNLL-G --EDLDPLSL RELQNMEQQL ---- KHLM-G --EDLESLNL KELQQLEQQL ---- KHLM-G --EDLESLNL KELQQLEQQL ---- KHLM-G --EDLESLNP KELQQLEQQL ---- KHLM-G --EDLESLNP KELQQLEQQL ---- KHLM-G --EDLECLNL KELQQLEQQL ---- LHMQ-G --KPLNLLSL KELQHLEKQL ---- LQMQ-G --KGLNALCL KELQHLEQQL ---- LQLQ-G --KGLNPLTF KELQSLEQQL ---- R-WM-G --EDLTSLTM TELHHLGQQL ---- AWSL-Q --NNADESDA NQLEKLEKLL ---- KHFC-W --RRVIKSWY ERLKQLERQL ---- SKLV-G --SNVKNVSI DALVQLEEHL ---- S W I - G --DDLERLSV RDIIYLEQQF ---- S W I - G --DDLERLSV RD1 IYLEQQF ---- SNVI-G --DDLERSSL RDLIHLEQQV ---- SNVI-G --DDLERLSL RDLIHLEQQV ---- SNVI-G --DDLERLSL RDLIHLEQQV ---- RNVI-G --DDLERLSL RDLIHLEQQI ---- S N I I - G --DDLEGLSL RDLIYLEQQI ---- SDLV-E --LDLERLSG GNLIRLEQEM ---- RRKD-G DIHDLKLLSA DELDSLEGEV ---- RRKD-G DIHDLKLLSA DELDSLEGEV ---- RKLV-G --EELSGMNA NDLQNLEDQL ---- RKLM-G --EELQGLNV EDLHRLENQL ---- RQIM-G --EELSGLTV KELQGLENQL ---- RQLT-G --VELNGLSV KELQNIESQL ---- RQLL-G --ENLEGLSQ EELQTLETKL ---- RQML-G --EDLEGLSL KQLQILEANL ---- RNML-G --ESLGSLQI KDLQNLEAKL ---- RNML-G --ESLGSLQI KDLQNLEAKL ---- RHML-G --FSLETLTY RDLQKLESKL ---- SCIM-G --ENAAALSL DELQNTEARL ---- NHIL-G --ENQIPLDL AEIQRVETRL ---- SHLM-G --ENLTCFQL KDLDILESRL ---------- 
--EDIQHLTT DQLARLEGDL ---------- --QELESVPN LELQSLEDEL
--TADVRSLL LEMKAMENKH ---- RNSM-G --EDLSSLSV PELKRLEQEL
GGM4 GGM8 GGM5 CUM26 SLM2 FBP3 PMADS2 GLO SvPI-1 FBP1 NTGLO GGLO1 EGM2 PI DePI-1 ScPI LtPI MfPI DcPI HPI-1 HPI-2 RbPI-1 RfPI-1 RbPI-2 RfPI-2 PnPI-1 PnPI-2 OsMADS2 OsMADS4 PhPI PmPI-1 BobAP3 Boi2AP3 Boi1AP3 AP3 D1 RaD1 D2 RaD2-1 DEF SvAP3 LeAP3 PD4 PMADS1 NTDEF NMH7 GDEF2 SLM3 DeAP3-1 ScAP3 LtAP3 MfAP3 PnAP3-2 RfAP3-2 PcAP3 PnAP3-1 RbAP3-1 RfAP3-1 GDEF1 TM6
101 150 --SIKLDQLL SaIKSLRRTQ ---- KYIM-G --EDLDSLPT KSLERLHKKL ---WKLQKLL GRIERLHKMK ---- KNIG-G --EDLDSLSF KALDRLQRQL --IQEVSRLK LQIENLQLKQ ---- KHIM-G --EQLENLSF EELDQLEKQM -LSNEMDRVK KENDNMQIEL ---- RHLR-G --EDITSLNL KELMALEEAL -LSNEIDRVK KENDNMQIEL ---- RHLK-G --EDITSLPY PDLMRLEDAL -LSNEIDRIK KENDSMQVKL ---- RHLK-G --EDINSLNH KELMVLEEGL -LSNEIDRIK KENDNMQVKL ---- RHLK-G --EDINSLNH KELMVLEEGL -LDNEINRVK KENDSMQIEL ---- RHLK-G --EDITTLNY KELMVLEDAL -LDNEINRIK KENDRMQTEL ---- RHLV-G --EDITTLNY KELMVLEEVL -LDNEINKVK KDNDNMQIEL ---- RHLK-G --EDITSLNH RELMILEDAL -LDNEINKVK KDNDNMQIEL ---- RHLK-G --EDITSLNH RELMMLEDAL -LQNEIDRIK KENESMQIEL ---- RHLK-G --EDITSLNY EELIAYEDAL -LSNELDRIK KENDNLQIQL ---- RHLK-G --EDITSLNH RELIILEDTL -LSNEIDRIK KENDSLQLEL ---- RHLK-G --EDIQSLNL KNLMAVEHAI -LSSEVDRVK KENDNMQIEL ---- RHLK-G --EDLTSLHP KELISIEDAL -LSAEVDRIK KENDNMQIEL ---- RHLK-G --EDLTSLHP KELIPIEKAL -LSNEVERIK KENDSMQIKL ---- RHLK-G --EDITSLHP RELLPIEEAL -LSNELERIK KENDSMHVKL ---- RHLK-G --EDITSLHP KELIPIEEAL -LNSEVERIK KENDNMQIEL ---- WLK-G --EDLTSLNP KELIPIEML -LSAEIDRIK RENDNMQIEL ---- MLK-G --EDLSSLNP RELIPIEEAL -LSAEIDRIK RENDNMQIEL ---- RRLK-G --DDLTSLNP RELIPIEEAL -LHQECARIK KENESMQREL ---- GHLK-G --EDINSLQP IELIPIEQAL -LSQEIARVE KENQSMRQEL ---- KHLK-G --EEINSLQP KELIPIEKAL -LHQEIERIE NENKSMQIEL ---- RHLK-G --EDINSLQP RELIPIERAL -LHQEIERIK KENSSMKTQL ---- KHLK-G --EDLNVLQP TELIPIEHAL -LSAEVDRVK KENDNMQIEL ---- RHLK-G --EDLTPLNP RELIPIESAL -LSAEIDRVK KENSNLQIEI ---- FUiLK-G --EDLKPLGP RELYAIENDL -LSAEIDRIK KENDNMQIEL ---- RHLK-G --EDLNSLQP KELIMIEEAL -LSAEIDRVK KENDNMQIEL ---- RHMK-G --EDLNSLQP KELIAIEEAL -LKQEKERIE KENGRLQLRL ---- RQLK-G --EDITSLKP EELIEIESIL -LKQEKERIE KENGRLQLRL ---- RQLK-G --EDITSLKP EELIEIENIL -MQETKRKLL ETNRNLRTQI ---- KQRL-G --ECLDEFDI QELCSLEEEM -MQETKRKLL ETNRNLRTQI ---- KQRL-G --ECLDEFDI QELLSLEEEM -MQETKRKLL ETNRKLRTQL ---- KQRL-G --ECLDELDT QELRSLEEEM -MQETKRKLL ETNRNLRTQI ---- KQRL-G --ECLDELDI
QELRRLEDEM -MEQELRNLN EVNRQIRKEI ---- RRRM-G --CCLEDMSY QELVFLQQDM -MEQELRNLN EVNRQIRKEI ---- RRRM-G --CCLEDMSY QELVFLQQDM -MLEELRKIK EANGNIRKEI ---- RRRM-G --FSMEDMSF RELVILQQDM -MLEELRKIK EANGNIRKEI ---- RRRM-G --FSMEDMSF RELVILQQDM -MQEHLKKLN EVNRNLRREI ---- RQRM-G --ESLNDLGY EQIVNLIEDM -MQEHLRKLK DINKNLRREI ---- RQRM-G --ESLNDLNY DQIVSLIEDV -MQEQLRKLK DVNRNLRKEI ---- RQW-G --ESLNDLNY EQLEELMENV -MQEQLRKLK DVNRNLRKEI ---- RQRM-G --ESLNDLNF EQLEELMENV -MQEQLRKLK EVNRNLRKEI ---- RQRM-G --ESLNDLNY EQLEELMENV -MQEQLRKLK DVNRNLRREI ---- RQRM-G --ESLNDLNY EQLEELNENV -MQENLKKLK DVNRNLRKEI ---- RQGM-G --ECLNDLSM EELRLLEDCM -MQEELRQLK EVNRNLRRQI ---- RQRL-G --DCLEDLGC EEFLDLEKES -MQDDLQKLN ELNRKLQTDI ---- RQRM-G --DCLEDLSF EELCRLGQEM -MQDNLNKQK EINNKLRREI ---- RQRM-G --EDLNDLSI EELRGLEQNM -MQNHLNKQK EINNRLRREI ---- RQRM-G --EDLDDLTF EELRGLEQNL -MQSHLNKLK EDNNSLRRAI ---- RHRI-G --EDLDDLEI EELRGLEQNL -MQGHLIKLK EENNNLRREI ---- RQRI-G --EDLDDLEI EZLRGLEQNL -LQNALNKQK EINRRLRREI ---- RQRM-G --EDLDELTI EELRSLEQNL -MQGTLKKVK ETNNNLRREI ---- RQRQ-G --DDLDGLSF MELRGLEQNL -LQEELKKQK EINSRLKKEI ---- RQRT-G Q-DDLNELTF EELRSLEANL -LQEELKTQK EINNKLKKEI ---- RQRT-G Q-DDLSELSL DEMRILEKNL -MQERFKHLM ETNRKLRREI ---- GQRV-G --EDLEGLGI HELRSLEQDL -MQQELKTLV ETNRKLRREI ---- GQRV-G --EDLSNLSI KELRGLEQDL -MKETMKKLK DTNNKLRREI ---- RQRVLG --EDFDGLDM NDLTSLEQHM -MQENLKXLK EINNKLRREI ---- RQRT-G --EDMSGLNL QELCHLQENI
PTD PtAP3-1 PtAP3-2 OsMADS16 TaMADS51 CMB2 PhAP3 GGM2
AF101420 StMADS16 AGL24 StMADS11 AF006210 AF023615 SAG-a SAG-d SAG-b SAG-c DAL2 EAM2 GGM3 AG BAG1 MZEAGAMOU ZMM2 OsMADS3 HAG1 ZAG1 CAG1 CUM10 MdMADS10 AGL11 FBP11 FBP7 AGL1 AGL5 LAG FBP6 PAGL1 CAG3 CUM1 PTAG1 PTAG2 CaMADS1 PLE RAG NAG1 PMADS3 TAG1 FAR GAGA1 GAGA2 GAG2 CAG2 CUS1 RaD2-2 RAP1 SLM1
101 -MQEHLRKLN DINHKLRQEI -MNDNLNKLK EINNKLRTEI -MKDNLNKLK DINNKLRTEI -MQRTLSHLK DINRNLRTEI -MQRTLSHLK DINRNLRTEI -MQEQHRKVL ELNSLLRREI -MKHQLNEQS ERSNKLKKEI -MGQELIKER RENEKLRSKL
151 EGRLNRVAPD KG-------- EAGFNRVLEI KGTRIMDEIT ESGLSRVSEK KGECVMSQIF EGGISRVLRI KGDKFMKEIS EKGISRVRSK KNEMLLEEID EKGISRVRSK KNEMLLEEID EKGIGRVRSK KNEMLLEEID EKGIGRVRSK KNEMLLEEID EKGIGRVRSK KNEMLLEEID EKGIGRVRSK KNEMLLEEID EKGIGRVRSK KNEMLLEEID EKGLGRVRAK RNESLLEEIE EKGLGRVRSK RNEKLLEDID ERSITRIRSK KNELLFSEID DRSVNRIRSK KNELLFAEID EKAIIKIRAR KNELLYAEVD EKAIIKIRAR KNELLYAEVD EKGIAKIRAR KNELLYAEVE ERGINKIRTK KNELLSAEIE DKALGKIRAK KNDVLCSEVE ERGITRIRSK KHEMLLAEIE ERGITRIRSK KHEMLLAEIE ERGITRIRSK KHELLLAEIE EKAISRIRSK KHELLLVEIE ERGITRIRSK KHEMILAETE ERGIARIRSK KHEMILAESE EKGISRVRSK KNELLVAEIE EKGISRVRSK KREMLVAEIE EKGISRIRSK KNELLFAEIE
150 ---- RQRR-G --EGLNDLSI DHLRGLEQHM ---- RQRM-G --EDLNELRL DELRGLEQNM ---- RQRM-G --EDLNDLRL EELRGLEQNI ---- RQRM-G --EDLDGLEF DELRGLEQNV ---- RQRM-G --EDLDALEF EELRDLEQNV ---- SRRM-G --GDLEGLTL VELSALQQEM ---- RQFM-G --EELDGLSF EQLHGLEQKV ---- RYMM-G --EDIGELKI AQLEKLEHDL
176 ------ NLQRKG SLEKRG SLKKKE IMQRRE IMQRRE IMQRRE IMQRRE IMQRRE IMQRRE IMQRRE IMQRRE TLQRRE YMQKRE YMQKRE YMQKRE YMQKRE YMQKRE YMQKRE YMQRRE YLQKRE YLQKRE YFQKKE NAQKRE NLQKRE DLQKRE YMQKRE YMQKRE YMQKRE
EKAIGRVRSK KNELLFSEIE LMQKRE EKAIGRVRSK KNELLFSEIE LMQKRE EKGISRIRSK KNELLFAEIE YMRKRE EKGISRIRSK KNELLFAEIE YMRKRE EKGISRIRSK KNELLFAEIE YMQKRE EKGIGRIRSK KNELLFAEIE YMQKRE EKGINRIRSK KNELLLAEIE YMHKRE EKAISRIRSK KNELLFAEIE HMQKRE EKAISRIRSK KNELLFAEIE YMQKRE EKGISKIRSK KNELLFAEIE YMQKRE EKGISKIRAK KNELLFAEIE YMQKRE EKGISKIRSK KNELLFAEIE YMQKRE ERGISRIRSK KNELLFAEIE YMQKRQ EKAISRIRAK KNELLFAEIE YMQKRE EKGIGKIRSK KNEILFAEIE YMQKRE EKGISRIRSK KNELLFAEIE YMQKKE EKGISRIRSR KNELLFSEIE YMQKRE EKGISRIRSR KNELLFSEIE YMQKRE EKGISRVRAK KNEELFGEIE FMQKKE EKGISRVRAK KNELLFGEIE FMQKKE ERGISRIRSK KNELLFAEIE FMQKRE
ZAG2 ZmOV13 ZMM1 ZmOV23 GGM10 AGL12 PrMADS7 AGL14 ETL PbMADS1 SaMADSA TM3 TobMADS1 FDRMADS8 PrMADS6 PrMADS8 PrMADS4 DAL3 PrMADS5 PrMADS9 EAM4 GGM1 AGL13 AGL6 OsMADS6 ZAG3 TaMADS12 ZAG5 MdMADS11 DAL1 PrMADS3 EAM3 GGM11 MADS1 PrMADS2 GGM9 AGL2 AGL4 MdMADS1 MdMADS8 MdMADS9 AGL9 SaMADSD DEFH200 DEFH72 FBP2 NsMADS3 TM5 MTF1 EGM1 OM1 OTG7 CMB1 EGM3 PrMADS1 MdMADS3 MdMADS7 MdMADS6 DEFH49 AGL3
151 176 EKGISKIRAR KSELLAAEIS YMAKRE EKGISKIRAR KSELLMIS YMAKRE EKGISKIRAR KSELLAAEIN YMAKRE EKGISKIRAR KSELLAAEIN YMAKRE QKGINQVRAK KTDLMLEEIK ALQNKE EYWISQIRSA KMDVMLQEIQ SLRNKE TNMFLINSSH --------- ------ DRSLMKIRAK KYQLLREETE KLKEKE ERSLTKIRAR KNHLIREHIE RLKAEE EKSVCTVRAR KMQVFKEQIE QLKEKE CKSVKCVRAR KTQVFKEQIE QLKQKE ERSVGTIRAR KLQVFKEQVE RLKKKK ERSVSTIRAR KIQVFKEQIE RLKEKE EKSLHNIRLK KTELLERQIA KLKEZ ERGLRNVRAR KTEILVTEIE QLQRKE ERXLRNVRAR KERILSEENA FLSKKF ERGLRNIRAR KSEILVTQIE QLQRKE ERGLRHIRAR KTQILVAEIE ELKRKE ZRGLNHVRAT KTKVLLDEIE KLKQKE ERGLSHZRAR KTEILVDQIE CLKRKE ERGLSHIRAR KTELLMDQIN QLKKKA ERGLVNIRAR KTEILMDQIN QLKRKS EGALSATRKQ KTQVMMEQME ELRRKE EAALTATRQR KTQVMMEEME DLRKKE ECALSQARQR KTQLMMEQVE ELRRKE ECALSQARQR KTQLMMEQVE ELRRKE ECSLSLARQR KTQLMMEQVE ELRRKE ECALSQARQR KTQVMMEQVE ELRRTE EGALAQTRQR KTQLMIEQME DLRKKE EVALAHLRSR KTQVMLDQIE ELRQRE EVALTHLRSR KTQVMLDQIE ELRQRE EAALTQV-UR KTQLMLDMME DLRRKE EVALTQVRAR KTQVMMDMMD DLKKKE EVALTHVRSR KTQVMLEMMD ELRRKE EVALTHVRSR KTQVMLEMMD ELRRKE EVALGHVRNR KTQLLIQTID ELRDKE DGSLKQVRSI KTQYMLDQLS DLQNKE DGSLKQVRCI KTQYMLDQLS DLQGKE EGSLKQVRST KTQYMLDQLS DLQNKE EGSLKQVRST KTQYMLDQLS DLQNKE EASLKQVRST KTQYMLDQLS ALQNKE DSSLKQIRAL RTQFMLDQLN DLQSKE DSSLKQIRAL RTQFMLDQLN DLQSKE DMSLKQIRST RTQAMLDTLT DLQRKE DMSLKQIRST RTQAMLCTLT DLQRKE DMSLKQIRST RTQLMLDQLQ DLQRKE DMSLKQIRST RTQLMLDQLT DLQRKE DMSLKQIRST RTQLMLDQLT DYQRKE DSSLKQIRST RTQFMLDQLG DLQR-XE DGSLKQIRSR RTQYMLDQVT DLQHRE DSSLRQIRST RTQFMLDQLA DLQRRE EASLKQIIST RMQYMLDQLG DLQQRE DKSLRQIRSI KTQHMLDQLA DLQKKE ENSLKQIRSA KTQE'MFDQLX HLQHKE ENSLKQIRSA KTQFMFDQLA HLQH-KE ETSLKQIRSR KTQFILDQLS DLQNRE ETSLKQIRSR KTQFILDQLS DLQNRE ETSLNKIRST KTQFMLDQLS DLQNRE ETSLKHIRST RTQVMLDQLS DLQTKE DASLRQIRST KARSMLDQLS DLKTKE
FDRMADS1 OsMADS45 OsMADS7 M79 OsMADS24 OsMADS8 SbMADS1 ZMM7 MdMADS4 FDRMADS2 OsMADS5 ZMM3 OsMADS1 ZMM8 AGL8 SaMADSB BoAP1 Boi1AP1 Boi2AP1 SaMADSC-2 SaMADSC AP1 BoCAL BoiCAL BobCAL CAL BpMADS3 MdMADS5 BpMADS5 MdMADS2 NAP1-1 NsMADS1 POTM1-1 POTM1-2 SCM1 TM4 SLM5 NAP1-2 NsMADS2 NtMADS5 GSQUA1 SQUA SLM4 BpMADS4 LtMADS1 TaMADS11 SbMADS2 ZAP1 LtMADS2 AGL15-1 AGL15-2 AGL15 GGM13 FDRMADS5 TM8 FLF CerMADS1 CRM4 CMADS3 CRM1
151 DSSLKHVRTT DSSLKHVRTT DSSLKHVRTT DSYLKHVRTT DSSLRHIRST DSSLRHIRST DSSLRHIRST DSSLSHIRST DVSMKKIRST EISLMNIRSS EISLMNIRSS SISLKQIRSS EVSLKQIRSR EVSLKHIRSR DAAIKSIRSR HPAIKSIRSR DTALKHIRSR DTALKHIRSR DTALKHIRSR DTALKHIRSR DTP-LKHIRSR DTALKHIRTR DTSLKHIRSR DTSLKHIRSR DTSLKHIRSR ETALKHIRSR DTALKHVRTR DTALKQIRLR
176 RTKHLVDQLT ELQRKE RTKHLVDQLT ELQRKE RTKHLVDQLT ELQRKE RTKHLVDQLT ELQRKE RTQHMLDQLT DLQRRE RTQHMLDQLT DLQRRE RTQHMLDQLT DLQRRE RTQHMLDQLT DLQRRE KTQFMHVQIS ELQRKE KNQQLLDQVF ELKRKE KNQGLLDQVF ELKRKE KNQQMLDQLF DLKRKE KNQALLDQLF DLKSKE KNQMLLDQLF DLKSKE KNQAMFESIS ALQKKD KNQAMFESIS ALQKKD KNQLMYESLN ELQRKE KNQLMYDSVN ELQRKE KNQLMYDSIN ELQRKE KNQLMYDSIN ELQRKE KNQLMHDSIN ELQRKE KNQLMYESIN ELQKKE KNQLMHESLN HLQRKE KNQLMHESLN HLQRKE KNQLMH---- ------ KNQLMNESLN HLQRKE KNQLMYESIS QLQKKE KNQLMNESIS ELQRKR
DSALKHIRSR KNQLMYESIS ELQRKD DSALKHIRSR KNQVMYESIS ELQKKD DSALKHIRSR KNQLMHESIS ELQKKD DSALKHIRSR KNQLMHESIS ELQKKD DSALKHIRSR KNQLMHESIS VLQKQD DSALKHIRSR KNQLMHESIS VLQKQD ASALKHIRSR KNQLMHESIS VLQKQD DSALKHIRSR KNQLMHESIS VLQKKD DTALKHIRSK KNQLMYESIH ELQKKD DTSLKLIRSR KNQLMHESIS MLQKKE DTSLKLIRSR KNQLMHESIS MLQKKE DTSLKLIRSR KNQLMHESIS MLQKKE DTALKRIHSK KNQLLHQSIS ELQKKE DTALKNIRTR KNQLLYDSIS ELQHKE DVGLKHIRSK KNQLMHDSIS ELQKKE DTGLKRLRTR KNQVMHESIM ELQKKE ESSLKHIRSR KSQLMHESIS ELQKKE ESSLKHIRSR KNQLMHESIS ELQKKE ESSLKHIRSR KSHLMAESIS ELQKKE DSSLKHIRSR KSHLMAESIS ELQKKE ESSLKHIRSR KSHLMMESIS ELQKKE NFSLISVRER KELLLTKQLE ESRLKE NVSLISVRER KELLLTKQIE ZSRïRE YHALITVRER KERLLTNQLE ESRLKE ESASSRVRSR KNQLMLQQLE NLRRKE TNALRDTKSK KMUKQNGEG SRSRAN RVGVERIRSK KHKILHEENI HLQKQV ETALSVTRAK KTELMLKLVE NLKEKE HENLGRIRAK KDELMLERNN DLMQKM HENLGRIRAK KDELMLERNN DLMQKV HESLGHIRAK KDELILEQID EFKQKM HESLGHIRAK KDELILEQID EFKQKM
CRM5 CMADS2 CRM2 CMADS4 CMADS6 CRM3 ANR1 DEFH125 NMHC5 AGL17 EAM5 GGM12 CerMADS2 CRM6 OPM1 CerMADS3 CMADS1 CRM7 EAM1 GGM6 GGM7 GGM4 GGM8 GGM5 CUM26 SLM2 FBP3 PMADS2 GLO SvPI-1 FBP1 NTGLO GGLO1 EGM2 PI DePI-1 ScPI LtPI MfPI DcPI HPI-1 HPI-2 RbPI-1 RfPI-1 RbPI-2 RfPI-2 PnPI-1 PnPI-2 OsMADS2 OsMADS4 PhPI PmPI-1 BobAP3 Boi2AP3 Boi1AP3 AP3 D1 RaD1 D2 RaD2-1
151 HESLGRIRAK KDELILDQID HESLGRIRAK KEEMILDQLE HRRLGCIRAK KEEMILDELD NYNLGRLRAK KDQLILREIE ETSLCSIRKR QKQLYREKMN ETSLCSIRKR QKQLYREKMN VTSLKGVRLK KDQLMTNEIR EMSLRGVKMK WQMLTDEVH EISLRGVRMK KEQLFMDEIQ EMSLRGIRMK REQILTNEIK ETTLKLVRLQ KVQKLQGNIH ETALNRVRNR KGVQILKDIN DSGLYKIRGA KTQLMARQVQ DSGLYKIRGA KTQLMVRQVQ NGALNQVRGR KNQIISERLV QIALDKIRTR RNELLAMQTQ ENALNKIRIQ KVQVLHGEMQ SVTVEKIREE KMRRFHVHAE QNWTEVRKK KCDFLEKTTD QLATYKVRKK KEEAAAKEYD ELGIHRVRAR QNELFEAEIC HNAKRRVFNR KMKLLQEESN EGAKKRVFNR KIKILSQTVK EISMNRIKTK KDQSLFKRIE ENGLTGVREK QSEE'MKMMRT ENGLVGVREK QMEMYKLHKK TNGLSSISAK QSEILRIVRK TNGLSSISAK QSEILRMVRK ENGTSALKNK QMEFVWRK ENGISSLKAK QME-WPAMRK ENGLTSIRNK QNEVLRMMRK DNGLTSIRNK QNDLLRMMRK ENGLTNIREK KDEIPKIMRK
176 DFNQKV DFKKKV -LKREV SYSNKE ETFRKE ETFRKE ELNRKG ZLRRKG ELNRKG ZLTRiCR NLQNW DLQRKG ELQKKE ELQKKE YLQEKE NIISKG QIYKQS EMHKQE RLKKKV SLQMDL GLKRKE SLAQEV LLTNEV EIEVGN NERMME NHKMLE NDQILE NDQILE HNEMVE HNEMLG KTQSME KTQSME REQVLE
ENGVGCVRDQ KDEVLMTHRR NQKQLE EHGLDKVRDH QMEILISKRR NEKMMA QNGLVGVRAK QMEE'MKMMKK NERMLE EVGYASVRAK QMEIWKTLKK NGRLLE QNGLACVRSK QMEYLBILKK NERTLE ENGLACVRSR QMQCLKMLKK NERSLE QNGLTEVMK QAZVWKMMKK NDiULE QNGVTGARAK QMEFLKMMKL NGKLLE QNGVTGACAK QMEFLKMMKL NGKLLE DDGIARVKER KNEIYRMMKR NDKMLE ENGITKVKEK QMEIYRMMKR NDRKLE DNGITKVRAK IDEIPRILEK NGRMIE DNGITKLRAK LDNIPRIMEK NGRRIE DDGIVGVKAK IKEHYRALKK RTRMLE EDGYACVRDK IMEQWKKLKR NGRRLE DNGIVNVNDK LMDHWEWVR TDKMLE NNGQANLRDK MMDHWRMHKR NEKMLE DDGLTNIRNK QDKGFG-RKE PALGLH EDGLTNIRNK SDGLLEGSNQ EYKGFG ENTFKLVRER KFKSLGNQIE TTKKKT ENTFKLVRER KFKSLGNQIE TTKKKN ENTFKLVRER KFKSLGNQIE TTKKKN ENTFKLVRER KFKSLGNQIE TTKKKN ENAVTNLSER KYKVLSNQIE TGKKKL ENAVTNLSER KYKVLSNQIE TGKKKL QDSVAKISER KYKAIANQIE TTRKKL QDSVAKISER KYKAIANQIE TTRKKL
DEF SvAP3 LeAP3 PD4 PMADS1 NTDEF NMH7 GDEF2 SLM3 DeAP3-1 ScAP3 LtAP3 MfAP3 PnAP3-2 RfAP3-2 PcAP3 PnAP3-1 RbAP3-1 RfAP3-1 GDEF1 TM6 PTD PtAP3-1 PtAP3-2 OsMADS16 TaMADS51 CMB2 PhAP3 GGM2
151 176 DNSLKLIRER KYKVISNQID TSKKKV DDSLRKIRER KYKVIGNQIE TSKKKL DNSLKLIRER KFKVIGNQIE TYRKKV DNSLKLIRER KYKVIGNQIE TYRKKV DNSLKLIRER KYKVIGNQIE TFKK'rN DNSLKLIRER KYKVIGNQID TYKKKV DKALKAIRER KYKVITNQID TQRKKF QEAVYIfRER KLKVIGNKLE TSKKKV QEAVTLIRER KYKKIDNQID TTKKKV DNSLKIVRDR KYHVITTQTE TYRKKL DTSLKWRDR KYHVITTQTD TTRKKI ESSIKWRER KYHVINTQTE TYKKKL ESSIKVVRER KYHVIQTQTE TYKKKL EASVKVVRDR KYHVIITQTE TTRKKL ESSVDRVRHR KNHVIRTQTD TTNKKI LSSVEIVRLR KFHVLGSHTE TSKKRN IDSADIVRNR KNHVLNSHTE TSKKRN RNSAKVVRLR KFGLLSSQGE TQKKKI RDTEKVVRQR KFGLLSSQGE TQRKKI QDSLTLVRER KWVIKTQTD TCRKRV TESVAEIRER KYHVIKNQTD TCKKKA TEALNGVRGR KYKVIKTQNE TYRKKV EECLKNIRDR KEHQLRNQIG TSKKKT QESLMIVGDR KEHQLRNQIG TYKKKS DAALKEVRHR KYHVISTQTE TYKKKV DAALKEVRQR KYHVITTQTE TYKKKV EEAIIQIRNK KYHTIKNQTG TTRKKI ERASNIVRER KEKAISTKVD TLNKKV ESALRLVRRK KDHAWDYQRT ILLKKV