Wednesday, November 29, 2006

The Three Domain Hypothesis (part 4)

[Part 1][Part 2][Part 3]

Ludwig and Schleifer question the reliability of the SSU tree. They begin by comparing trees constructed from the small subunit (SSU) and large subunit (LSU) ribosomal RNAs. Their example is 18 species of Enterococcus, and they show that there are significant differences between the two trees. Surprisingly, they dismiss these differences as “minor local differences.” These authors are convinced that “SSU and LSU rRNA genes fulfill the requirements of ideal phylogenetic markers to an extent far greater than do protein coding genes.”
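
To make "differences between the two trees" a little more concrete, here is a minimal sketch of one common way to compare two rooted trees: count the clades that appear in one tree but not the other. The nested-tuple tree format, the function names, and the five placeholder species labels are all assumptions made for illustration. They are not the Enterococcus data and not the method Ludwig and Schleifer used.

```python
def clades(tree):
    """Return (leaves under this node, set of clades at or below this node).

    A tree is a nested tuple; a leaf is a plain string.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the group of leaves under this internal node
    return leaves, found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    _, a = clades(tree_a)
    _, b = clades(tree_b)
    return len(a ^ b)


# Hypothetical toy trees: the two markers disagree about sp1, sp2, and sp3.
ssu_tree = ((("sp1", "sp2"), "sp3"), ("sp4", "sp5"))
lsu_tree = ((("sp1", "sp3"), "sp2"), ("sp4", "sp5"))

print(clade_distance(ssu_tree, lsu_tree))
```

A distance of zero means the two trees have identical branching order. Whether a nonzero value counts as a "minor local difference" is exactly the kind of judgment call at issue here.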

In spite of this bias, they compiled a database of protein trees from conserved genes that are found in all three of the proposed Domains. According to them, the Three Domain Hypothesis is supported by EF-Tu, the large subunits of RNA polymerase, Hsp60, and some aminoacyl-tRNA synthetases (aspartyl, leucyl, tryptophanyl, and tyrosyl).

The Three Domain Hypothesis is refuted by ATPase, DNA gyrase A, DNA gyrase B, Hsp70, RecA, and some aminoacyl-tRNA synthetases. Note the inclusion of ATPase in this list. The phylogeny of ATPase was one of the strongest bits of evidence for the Three Domain Hypothesis back in 1989 but further work has shown that these genes (proteins) now refute the hypothesis.

My own favorite is the HSP70 gene family, arguably the most highly conserved gene in all of biology and therefore an excellent candidate for studies of deep phylogeny. Hsp70 is the main chaperone in all species. It is responsible for the correct folding of proteins as they are synthesized. It forms a complex with DnaJ and GrpE in bacteria and similar proteins in eukaryotes. The complex associates with the translation machinery (ribosomes, etc.) during protein synthesis.

The conflict between trees constructed with HSP70 and the ribosomal RNA trees has been known for a long time. The actual pattern of the HSP70 tree can be interpreted in two different ways depending on where you place the root [see 1995] but neither one agrees with the Three Domain Hypothesis.

Here’s an example of an HSP70 tree that I just created using the latest sequences. It’s fairly typical of the trees that do not support the Three Domain Hypothesis. Eukaryotes cluster as a monophyletic group (lower left) and all prokaryotes form another distinct clade. The archaebacteria sequences (black dots) do not form a single clade, let alone a “domain.” Instead, they tend to be dispersed among the other bacterial groups.
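
The claim that a group does or does not "form a single clade" can be checked mechanically. The sketch below is a toy example under stated assumptions: the nested-tuple tree, the function names, and the six made-up sequence labels are hypothetical and only mimic the pattern described above. They are not the actual HSP70 tree.

```python
def node_leaf_sets(tree):
    """Return (leaves under this node, leaf sets of every node at or below it).

    A tree is a nested tuple; a leaf is a plain string.
    """
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves = frozenset()
    collected = []
    for child in tree:
        child_leaves, child_sets = node_leaf_sets(child)
        leaves |= child_leaves
        collected.extend(child_sets)
    collected.append(leaves)
    return leaves, collected


def is_monophyletic(tree, group):
    """True if some node of the rooted tree contains exactly the given leaves."""
    target = frozenset(group)
    _, all_sets = node_leaf_sets(tree)
    return target in all_sets


# Hypothetical rooted tree: archaebacterial sequences scattered among bacteria.
toy_tree = ((("bac1", "arc1"), ("bac2", "arc2")), ("euk1", "euk2"))

print(is_monophyletic(toy_tree, ["euk1", "euk2"]))                   # eukaryotes
print(is_monophyletic(toy_tree, ["bac1", "bac2", "arc1", "arc2"]))   # all prokaryotes
print(is_monophyletic(toy_tree, ["arc1", "arc2"]))                   # archaebacteria
```

Note that the answer depends on where the tree is rooted, which is why the position of the root matters so much in these arguments.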


Note that this tree, like many others, shows numerous short branches at the bottom of the bacteria tree, suggesting that the diversity among bacteria is ancient.

Philippe and Forterre (1999) were among the first to document the serious differences between conserved protein trees and rRNA trees in “The Rooting of the Universal Tree of Life Is Not Reliable” (J. Mol. Evol. 49:509-523). It’s worth quoting their abstract in order to emphasize the controversy, since Ludwig and Schleifer don’t do a very good job of conveying it.

“Several composite universal trees connected by an ancestral gene duplication have been used to root the universal tree of life. In all cases, this root turned out to be in the eubacterial branch. However, the validity of results obtained from comparative sequence analysis has recently been questioned, in particular, in the case of ancient phylogenies. For example, it has been shown that several eukaryotic groups are misplaced in ribosomal RNA or elongation factor trees because of unequal rates of evolution and mutational saturation. Furthermore, the addition of new sequences to data sets has often turned apparently reasonable phylogenies into confused ones. We have thus revisited all composite protein trees that have been used to root the universal tree of life up to now (elongation factors, ATPases, tRNA synthetases, carbamoyl phosphate synthetases, signal recognition particle proteins) with updated data sets. In general, the two prokaryotic domains were not monophyletic with several aberrant groupings at different levels of the tree. Furthermore, the respective phylogenies contradicted each others, so that various ad hoc scenarios (paralogy or lateral gene transfer) must be proposed in order to obtain the traditional Archaebacteria-Eukaryota sisterhood. More importantly, all of the markers are heavily saturated with respect to amino acid substitutions. As phylogenies inferred from saturated data sets are extremely sensitive to differences in evolutionary rates, present phylogenies used to root the universal tree of life could be biased by the phenomenon of long branch attraction. Since the eubacterial branch was always the longest one, the eubacterial rooting could be explained by an attraction between this branch and the long branch of the outgroup. Finally, we suggested that an eukaryotic rooting could be a more fruitful working hypothesis, as it provides, for example, a simple explanation to the high genetic similarity of Archaebacteria and Eubacteria inferred from complete genome analysis.”

The problem is obvious. All trees, RNA and protein, have potential problems of saturation and long branch attraction. Although Ludwig and Schleifer argue in favor of the ribosomal RNA tree, there is still serious debate over which sequences are revealing the “true” phylogeny. Are there good reasons for rejecting those trees that refute the Three Domain Hypothesis, as its supporters maintain?
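
Saturation is easier to appreciate with numbers. The sketch below assumes the simplest possible model of protein evolution: 20 amino acids, all equally frequent and equally exchangeable, with the same rate at every site. That is a deliberate oversimplification chosen for illustration, not the model used by Philippe and Forterre or by anyone building real trees. Under it, the expected fraction of differing sites after d substitutions per site is p = (19/20)(1 − e^(−20d/19)), which levels off near 0.95 no matter how much more change occurs.

```python
import math

STATES = 20  # assumption: 20 equally frequent, equally exchangeable amino acids


def expected_difference(d):
    """Expected fraction of differing sites after d substitutions per site."""
    k = STATES
    return (k - 1) / k * (1.0 - math.exp(-k * d / (k - 1)))


def corrected_distance(p):
    """Invert the model: substitutions per site implied by observed difference p."""
    k = STATES
    limit = (k - 1) / k  # the plateau, 0.95 for 20 states
    if not 0.0 <= p < limit:
        raise ValueError("p must be at least 0 and below the saturation plateau")
    return -limit * math.log(1.0 - p / limit)


# Small changes in observed difference near the plateau imply large changes
# in the estimated amount of evolution.
for p in (0.50, 0.90, 0.94):
    d = corrected_distance(p)
    print(f"observed difference {p:.2f} -> about {d:.2f} substitutions per site")
```

Near the plateau, a few percent of sampling noise in the observed difference translates into a very large uncertainty in the inferred branch length. This toy calculation only illustrates saturation; it does not model long branch attraction itself.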


Microbial Phylogeny and Evolution: Concepts and Controversies, Jan Sapp, ed., Oxford University Press, Oxford UK (2005)

Jan Sapp The Bacterium’s Place in Nature

Norman Pace The Large-Scale Structure of the Tree of Life.

Wolfgang Ludwig and Karl-Heinz Schleifer The Molecular Phylogeny of Bacteria Based on Conserved Genes.

Carl Woese Evolving Biological Organization.

W. Ford Doolittle If the Tree of Life Fell, Would it Make a Sound?

William Martin Woe Is the Tree of Life.

Radhey Gupta Molecular Sequences and the Early History of Life.

C. G. Kurland Paradigm Lost.
