
Plastid Genome Evolution and Flowering Plant Phylogenetics

angiosperm phylogeny images
Above: A circle tree depicting the relationships among the major lineages of flowering plants as determined from plastid genome sequence data. Starting at the bottom center (the base of the tree) and proceeding counterclockwise, the images show Amborella, Nymphaea, Illicium, Chloranthus, Piper, Liriodendron, Ceratophyllum, Ranunculus, Pelargonium, Helianthus, Yucca, Triticum, and Acorus. The background picture is of the island of New Caledonia, the home of Amborella.

Plants contain genetic material in three separate cellular compartments: the nucleus, the mitochondrion, and the plastid (called a chloroplast when it carries out photosynthesis, as it usually does).  It may seem strange that two organelles should have their own genomes.  Why do mitochondria and plastids have DNA?  The circular nature of both genomes reveals the answer: mitochondria and plastids are highly reduced versions of what were once (~2 billion years ago!) free-living bacteria.  As these ancestral bacteria took up residence within the eukaryotic cell and became fully integrated organelles, most of their genes were either lost or transferred to the host cell’s nucleus, leaving only the several dozen genes that are most important to the day-to-day functioning of these organelles.  As a result, plant mitochondrial and plastid genomes generally range from 150,000 to 500,000 base pairs (bp) in length, while a typical plant nuclear genome ranges from 0.5 to 4.0 billion bp!

The two plant organellar genomes differ in several fundamental ways.  Unlike the animal mitochondrial genome, which has an extremely compact organization, the plant mitochondrial genome has a more fluid organization due to its relatively high proportion of A/T-rich (as opposed to G/C-rich) intergenic spacer regions, which tend to accumulate repeat sequences.  In addition, the plant mitochondrial genome has a relatively high propensity to incorporate foreign DNA, such that many plant mitochondrial genomes contain genes or entire regions that were acquired from other genomes at some point in their evolutionary history.  As a result of these characteristics, plant mitochondrial genomes vary in size by roughly an order of magnitude across flowering plants (from 400,000 bp in some species to 4 million bp in the basal angiosperm Amborella!).  The chloroplast genome stands in stark contrast.  The chloroplast genomes of almost all flowering plants share the same gene content and arrangement, with much less intergenic spacer and, in general, few or no significant repeat sequences.  Plastid genomes essentially never incorporate foreign DNA; despite active searching, almost no examples of horizontal gene transfer have been detected in any plants (flowering or otherwise).  Plastid genomes can vary in size, although nearly all flowering plant plastid genomes are 150,000 to 170,000 bp in length.

plastid genome map
Above: A map of a typical angiosperm plastid genome. Below: An overview of angiosperm relationships, taken from Moore et al. (2007), "Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms", PNAS. The five major lineages of mesangiosperms are indicated by different colors.
overview of flowering plant phylogeny

Because of its structural stability and its uniparental inheritance (in other words, recombination among different copies of DNA does not occur, as it does in the nucleus, where there are maternal and paternal copies of all genes), the plastid genome provides a critical source of information for reconstructing the evolutionary relationships of flowering plants.  Indeed, the plastid genome has been the workhorse of plant systematics since the beginning of the molecular phylogenetic era.  However, until recently most researchers relied on only a handful of genes to reconstruct plant phylogeny (phylogeny is the term we use to refer to the evolutionary relationships within a group).  Advances in DNA sequencing technology over the past five years now enable the rapid and (relatively) cheap sequencing of entire plastid genomes, thus increasing the amount of available sequence data 10- to 50-fold over what has historically been used in plant phylogenetics.

Plant systematists have made great strides in reconstructing angiosperm (flowering plant) phylogeny over the past 20 years, mostly through the analysis of a handful of plastid genes.  The overall backbone of angiosperm phylogeny is now well established (see the overview of flowering plant phylogeny figure).  This new view of angiosperm relationships has played a central role in our developing understanding of numerous aspects of angiosperm evolution, including the origin and evolution of the flower.  However, many of the relationships at the deeper levels of angiosperm phylogeny are still unclear.  The most critical unresolved relationships among flowering plants involve the mesangiosperms, which collectively represent 99% of flowering plant diversity.  We have strong evidence that mesangiosperms fall into five major lineages (see the same figure), but the relationships among these lineages have so far been impossible to resolve confidently, even in analyses that use as many as 9,000 bp of DNA sequence data.

Plastid genomics raises the possibility of finally resolving outstanding issues in the flowering plant Tree of Life.  Theoretical as well as some experimental work has suggested that greatly increasing the number of base pairs in a phylogenetic analysis can resolve recalcitrant phylogenetic relationships under certain conditions.  I have been part of a large group of plant systematists who have sequenced the entire plastid genomes of representatives from all the major basal angiosperm groups, including all five mesangiosperm lineages, in an effort to test this idea in flowering plants.  This sequencing effort has allowed phylogenetic data sets up to 70,000 bp in length to be analyzed.  These analyses provide the first reasonably strong support for a completely resolved tree of basal angiosperms, although sophisticated statistical analyses still indicate that some of these relationships cannot yet be accepted with full confidence.
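As a rough illustration of why more sites can help, the expected number of substitutions on a short internal branch grows linearly with the length of the alignment. The sketch below compares the 9,000 bp and 70,000 bp data set sizes using a hypothetical branch length of 0.001 substitutions per site. That value is an assumption chosen for illustration, not a figure from the analyses described here, and not every substitution is phylogenetically informative.

```python
def expected_changes(branch_length: float, n_sites: int) -> float:
    """Expected number of substitutions on one branch across an alignment.

    branch_length is in substitutions per site (a hypothetical value here).
    """
    return branch_length * n_sites


# Hypothetical short internal branch; an assumption for illustration only.
SHORT_BRANCH = 0.001

for n_sites in (9_000, 70_000):
    changes = expected_changes(SHORT_BRANCH, n_sites)
    print(f"{n_sites:>6} bp -> about {changes:.0f} expected substitutions")
```

Under this assumed branch length, the larger data set offers several times as many changes that could mark the short branch, which is the intuition behind sequencing whole plastid genomes.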

Many avenues remain open for achieving strong statistical support for mesangiosperm relationships.  Perhaps the most promising route is to sequence the plastid genomes of strategically selected unsampled lineages of basal angiosperms.  Much theoretical and experimental work has indicated that strategic phylogenetic sampling can be a powerful technique (even more powerful than additional sequence data) for overcoming difficulties in recovering evolutionary relationships.  To that end, colleagues and I will be sequencing the plastid genomes of several other important basal angiosperms in the near future.

Plastid Genome Evolution

Although plastid genomes are typically strongly conserved structurally, several lineages of angiosperms have experienced significant structural rearrangement or gene loss.  Understanding these phenomena could provide powerful insight into both the molecular evolution of the angiosperm plastid and the evolution of the associated plants.  For example, plastids in the angiosperm order Santalales, which is composed mainly of parasitic plants (including all mistletoe species), display a number of interesting gene losses that may be related to their adoption of a parasitic lifestyle.  Outside of Santalales, plants in other groups such as Rhododendron (azaleas) possess plastid genomes that have a more or less typical complement of genes, but these genes are highly rearranged relative to the usual angiosperm plastid genome.  I plan to explore the phylogenetic extent of gene loss in Santalales and of gene rearrangements in other groups in an effort to elucidate the mechanisms that may have produced the unusual features of these genomes.

 


last updated November 30, 2007

All images are the copyright of Michael J. Moore