Join the Meeting Place for Moms!
Talk to other moms, share advice, and have fun!


Gorilla - evidence for or against?

Posted by on Apr. 20, 2012 at 2:58 AM
  • 4 Replies
  • 347 Total Views

Replies (1-4):
Clairwil
by Member on Apr. 20, 2012 at 2:58 AM

(source)

A tiny bit of knowledge is a dangerous thing


Good news! The gorilla genome sequence was published in Nature last week, and adds to our body of knowledge about primate evolution. Here's the abstract:

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.

I've highlighted one phrase in that abstract because, surprise surprise, creationists read the paper and that was the only thing they saw, and in either dumb incomprehension or malicious distortion, took an article titled "Insights into hominid evolution from the gorilla genome sequence" and twisted it into a bumbling mess of lies titled "Gorilla Genome Is Bad News for Evolution". They treat a phenomenon called Incomplete Lineage Sorting (ILS) as an obstacle to evolution rather than an expected outcome.

This problem is related to a biological paradigm called independent lineage sorting. To illustrate this concept among humans and primates, some segments of human DNA seem more related to gorilla DNA than chimpanzee DNA, and vice versa. This well-established fact produces different evolutionary trees for humans with various primates, depending on the DNA sequence being analyzed.

In a significant number of cases, evolutionary trees based on DNA sequences show that humans are more closely related to gorillas or orangutans than chimpanzees--again, all depending on which DNA fragment is used for the analysis. The overall outcome is that no clear path of common ancestry between humans and various primates exists, so no coherent model of primate evolution can be achieved.

The recent release of the gorilla genome spectacularly highlights this evolutionary quandary. According to the Nature study, "in 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other."

When you compare the genomic sequences of three related species, such as the human, chimp, and gorilla, you'll typically find from an average that a pattern of relatedness is revealed: humans and chimps are closer to each other than they are to gorillas, indicating a more recent divergence between humans and chimps than between humans and gorillas. However, that's an average result: if you compare them base by base, you'll find genes and regions of the chromosomes in which the gorilla sequence is more similar to the human sequence than to the chimpanzee sequence; if you looked at only that gene, you'd conclude that humans and gorillas were closer cousins, and chimpanzees were more distant.

Is ILS a problem? It complicates the analysis of sequences for sure (although it also can be used as a probe to look at evolution). But it's not a problem that calls evolution into question; to the contrary, it's an expected phenomenon.

Here's why. This diagram illustrates the simplistic, naïve expectation you might have.

hc1.jpeg

The outline of the tree illustrates the average pattern of sequence similarity, with the conclusion that humans (H) and chimpanzees (C) diverged more recently than humans/chimps and gorillas (G), which diverged more recently than humans/chimps/gorillas and orangutans (O). The solid line inside the outline illustrates the history of a single gene, drawn in black to represent the ancestral state, and then drawn in blue at the time humans and chimps diverged.

This is a gene that acquired its unique differences in the two lineages at the time of the human-chimpanzee split. It fits perfectly with the average pattern.

But just ask yourself: how likely is that? There are tens of thousands of genes in each of these species. Do you really think all the differences popped into existence simultaneously, at one instant when two populations of our last common ancestor discretely and completely separated? Of course not: you'd have to be a creationist to believe in something that stupid.

Here's another possibility. Speciation wasn't instantaneous, but a matter of multiple populations existing in parallel, with changes in genes appearing in different subsets at different times, spread out over long periods of time. So sometimes a mutation unique to one extant lineage appeared long before the split, and was just sorted at the time of separation into one lineage or the other.

hc2.jpeg

In this case, comparison of the gene in question would give the same qualitative answer — humans and chimps are most closely related — but a different quantitative difference in the time of divergence. But as you can see, it requires nothing weird or unexplainable or contradictory to evolutionary theory: you just have to appreciate the population nature of evolution.

We can go further: different forms of the genes can be sorted into different lineages entirely by chance.

hcgalts.jpeg

In these cases, we have two different forms of a gene that arose in an ancestral population, ancestral to humans, chimps, and gorillas. By drift, one form was lost in the gorilla lineage, but both forms continued to be found in the ancestral manpanzee population; at the time of human/chimp divergence, these gene forms were sorted into different lineages. By chance, these will show a closer relationship either between humans and gorillas or between chimpanzees and gorillas.

And the HC2, HG, and CG outcomes above are all equally probable!
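That equal-probability claim can be checked with a small simulation. The sketch below is a toy model, not the method used in the Nature paper: it assumes a neutral three-species coalescent in which the human and chimp lineages coalesce at rate 1 per coalescent time unit inside the human-chimp ancestor, and `t` (the length of that ancestral branch in coalescent units) is a free parameter chosen for illustration. The labels HC1, HC2, HG and CG follow the diagrams described above; the function names are made up for this example.

```python
import math
import random

def discordance_probability(t):
    """Chance that a gene tree groups gorilla with human or chimp (HG or CG),
    given a human-chimp ancestral branch of length t in coalescent units."""
    return (2.0 / 3.0) * math.exp(-t)

def simulate_topology(t, rng):
    """Return which history one gene follows: 'HC1', 'HC2', 'HG' or 'CG'."""
    # Inside the human-chimp ancestor the two lineages coalesce at rate 1.
    if rng.expovariate(1.0) < t:
        return "HC1"  # sorted within the human-chimp ancestor
    # Otherwise three unsorted lineages reach the common ancestor with
    # gorilla, and every pair is equally likely to coalesce first.
    return rng.choice(["HC2", "HG", "CG"])

def simulate_fractions(t, loci, seed=0):
    """Fraction of simulated loci that follow each history."""
    rng = random.Random(seed)
    counts = {"HC1": 0, "HC2": 0, "HG": 0, "CG": 0}
    for _ in range(loci):
        counts[simulate_topology(t, rng)] += 1
    return {name: count / loci for name, count in counts.items()}

def branch_length_for_discordance(p):
    """Invert discordance_probability: branch length giving discordance p."""
    return math.log(2.0 / (3.0 * p))

if __name__ == "__main__":
    t = branch_length_for_discordance(0.30)
    print("branch length for 30% discordance:", round(t, 3))
    print(simulate_fractions(t, 100_000))
```

Under these assumptions HC2, HG and CG turn up in roughly equal proportions, and a 30% discordance rate corresponds to an ancestral branch of about 0.8 coalescent units. The real analysis also has to deal with recombination, selection and varying mutation rates, which this sketch ignores.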

So the creationist argument against evolution on the basis of incomplete lineage sorting is very, very silly. The only way you would fail to see ILS is if every genetic difference between two species emerged simultaneously, in lockstep, in one grand swoop. That is, the observation of ILS contradicts creationism, not evolution.

The authors of the Nature paper were well aware of this, and even illustrated it in their first figure.

hcgo_tree.jpeg
Phylogeny of the great ape family, showing the speciation of human (H), chimpanzee (C), gorilla (G) and orang-utan (O). Horizontal lines indicate speciation times within the hominine subfamily and the sequence divergence time between human and orang-utan. Interior grey lines illustrate an example of incomplete lineage sorting at a particular genetic locus--in this case (((C, G), H), O) rather than (((H, C), G), O). Below are mean nucleotide divergences between human and the other great apes from the EPO alignment.

We can measure the average genetic distance between the species (the percentages at the bottom of the figure), but we can still see individual genes (the gray line) that branched at different points in their history. This is simply not a problem for evolutionary theory; once again, the creationists rely on their proponents having a foolishly cartoonish version of evolution in their heads in order to raise a false objection.


Scally A, Dutheil JY, Hillier LW et al. (2012) Insights into hominid evolution from the gorilla genome sequence. Nature 483:169-175.

Dutheil JY, Ganapathy G, Hobolth A, Mailund T, Uyenoyama MK, Schierup MH (2009) Ancestral Population Genomics: The Coalescent Hidden Markov Model Approach. Genetics 183: 259-274

Piskie
by Member on Apr. 20, 2012 at 3:09 AM
That's expected. Some parts of our ancestor gene were more useful for gorillas and humans than chimpanzees. On average, we are closer to chimps.
Siblings are on average a lot more genetically similar than cousins, but that doesn't mean cousins won't share some traits closer than some siblings.

My dh has very thin, very straight hair, just like his cousin Emma. They look very similar.
My BIL however looks just like Hurley from Lost. Very big, tightly curled hair.

On average they'd be very genetically similar, but taking just the hair gene the cousins would be more similar.

It's not a problem for evolution at all.
Posted on CafeMom Mobile
Clairwil
by Member on Feb. 6, 2013 at 4:52 AM

(source)

THE circle of ideas that has come to be known as the coalescent has proved to be a useful tool in a range of genetical problems, both in modeling biological phenomena and in making statistical sense of the rich data now available. In the 20 years since the concept crystallized, it has been extended in a number of directions. It is not the purpose of this note to document recent developments or to record the way in which others later arrived at similar conclusions by different routes; for that, consult, for instance, Hudson (1990), Donnelly and Tavaré (1995), and Stephens and Donnelly (2000).

The technique has been widely applied in recent studies of evolution, thanks to advances in molecular biology and computer technology. By being sample based, it has provided rigorous statistical analyses of population data and has provided a rationale for designing simulations. It has led to two different estimators of the key parameter, θ = 4Nμ, where N is the effective population number and μ is the mutation rate, and, therefore, to a test of neutrality. It has provided estimates of the time to a common ancestor, and, in particular, a very long time estimate provided strong evidence for balancing selection in the ancestry of the HLA and other loci. This technique has also provided estimates of recombination and rate of selfing. It has been helpful in assessing migration patterns in human ancestry, in particular, sex differences as revealed by comparison of within- and between-group variability of Y chromosome and mitochondrial DNA. For a recent review, see Fu and Li (1999).
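The passage does not name the two estimators of θ. A common reading, assumed here, is that they are Watterson's estimator (based on the number of segregating sites) and the average number of pairwise differences (Tajima's estimator), whose disagreement underlies Tajima's test of neutrality. A minimal sketch of both, for aligned haplotypes under the infinite-sites assumption, with function names invented for the example:

```python
from itertools import combinations

def segregating_sites(haplotypes):
    """Count alignment columns in which more than one allele is present."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)

def watterson_theta(haplotypes):
    """Watterson's estimator: S / a_n, with a_n = 1 + 1/2 + ... + 1/(n-1)."""
    n = len(haplotypes)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(haplotypes) / a_n

def pairwise_theta(haplotypes):
    """Average number of differences between all pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    differences = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return differences / len(pairs)

if __name__ == "__main__":
    # Hypothetical four-sequence sample, invented for illustration.
    sample = ["AAAA", "AAAT", "AATT", "CATT"]
    print("Watterson:", watterson_theta(sample))
    print("pairwise: ", pairwise_theta(sample))
```

Both functions estimate θ for the whole locus rather than per site. Under neutrality and constant population size the two values should agree on average, which is what makes a large gap between them informative.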

I shall not discuss these applications either. Mine is the much simpler aim of describing the way in which the ideas first came together, in the period leading to my 1982 articles. This is inevitably a personal account, but one that I hope is accurate, being based on records from these years. I have had the benefit of comments from Warren Ewens and Peter Donnelly, for which I am most grateful, but the interpretations and emphases are mine.

Three insights, in combination, comprise the essential basis of the coalescent. The first is the idea of tracing the ancestry of a gene backward in time and building up the family tree of the genes (at a particular locus) in a population sample back to the point at which they have a single common ancestor. This is just a generalization of Malécot’s “identity by descent” (Nagylaki 1989) to more than two genes. It becomes powerful because of the second insight, that for a large class of demographic models, characterized by selective neutrality and constrained population size, the stochastic structure of the genealogy does not depend on the detail of the reproductive mechanism. Finally, for such models the effect of mutation is statistically independent of the genealogy.

What is surprising is that these rather simple ideas took so long to emerge. For me the story begins in 1974, when I was traveling in Australia meeting mathematicians who shared my interests in random processes and their applications. I had not worked on genetics since, as a Cambridge undergraduate, I had published juvenilia on polymorphisms maintained by single locus selection. But in Melbourne I encountered Warren Ewens, who was exploring some ideas of Ohta and Kimura (1973) on neutral evolution in finite populations (the “charge-state” model). Moving on to Canberra, I found that Pat Moran was working on a similar problem (Moran 1975), and the enthusiasm of these two Australians inspired their English visitor.

They considered a locus at which the different alleles can be labeled by a single numerical quantity and in which mutation causes a random addition or subtraction. Thus, a single line of descent will have genes that perform a random walk on the line. There was little biological credibility in such a description, but it accorded with the experimental techniques of gel electrophoresis, which were then the best way of distinguishing alleles (e.g., Singh et al. 1976). A haploid population of fixed size N was supposed to evolve in discrete generations, the numbers of children of the members of one generation having a symmetric multinomial joint distribution (the Wright-Fisher model).

Thus, the genes of one generation are represented by N points on the line. As we observe from generation to generation, the N points perform a “coherent random walk.” The group strays farther and farther from its starting point, but the extent of the group remains relatively limited, and the distribution of the relative positions of the points converges to a proper limit. In my 1976 article I quoted Ewens as explaining this phenomenon by noting that “the probability that two points of $G_t$ (the $t$th generation) have a common ancestor in $G_s$ is $1 - (1 - N^{-1})^{t-s}$, which is near unity when $(t - s)$ is large compared with $N$.

Thus, the whole of $G_t$ is descended from a common ancestor in $G_{t-\Delta}$, where the random integer Δ = Δ(t) remains stochastically bounded as t → ∞. The relative distances are the result, therefore, only of displacements in these Δ generations.”

This is tantalizingly close to the idea of deducing the genetic structure of the population from the genealogy back to the common ancestor, but the article then goes off into complicated mathematical analysis, which adds little to our understanding of the model. There are, however, two pointers to later work. First, the algebra is such that it forces consideration of samples of size n from the population and produces a recursion between n and n + 1. This led me to an interest in the Ewens sampling formula (Ewens 1972), which had already begun to take a central place on the population genetics stage.

The second pointer was the use of Fourier transforms, which made easy a generalization from a gene as a single number to one described by a family of d numbers. This led Kingman (1977a) to a theory of coherent random walks in space of d dimensions, at the price of no extra algebra. I think that I felt that escaping from the restriction to one dimension could lead to more realistic models, and this was confirmed when it became clear that all the formulae were simplified when d was allowed to tend to infinity. This corresponds to an assumption that a mutation always produces a new allele, the “infinite alleles” hypothesis (see Crow 1989).

It would not have been difficult to use the machinery of this article (Kingman 1977a) to derive the Ewens sampling formula, but only a special case was carried through. In Kingman (1977b), a different approach to that formula was introduced. Watterson (1974, 1976) had derived the sampling formula from the assumption that the population gene frequencies had a Dirichlet joint distribution, an assumption derived from the diffusion theory approach of the Kimura school, as well as earlier results of Wright. These frequencies, when arranged in descending order, have a limiting joint distribution known as the Poisson-Dirichlet limit, which I happen to have come across in a quite different connection (Kingman 1975). Watterson shows precisely that a population whose gene frequencies have this joint distribution will satisfy the Ewens sampling formula. The converse is proved in Kingman (1977b), and this is linked in Kingman (1978) to the consistency of the Ewens formula between different sample sizes.

Thus, by the end of 1978, the nature of the Ewens sampling formula and its links with, on the one hand, nonrecurrent mutation and, on the other, the classical Kimura diffusion approach to neutral evolution were well understood. Moreover, I had noted in Kingman (1977a) that the results were robust, in that they held when the Wright-Fisher multinomial model was replaced by other symmetric reproductive processes. What was still missing was the crucial connection with the genealogy.

As a result not only of this work but also of research into deterministic selective models, I was invited to be the speaker at a conference held at Iowa State University in June 1979 under the auspices of the Conference Board of the Mathematical Sciences and the National Science Foundation. The proceedings of that conference were published as Kingman (1980), and only one of its four chapters is devoted to neutral evolution. This adds little to the earlier articles, and for present purposes the most interesting feature is an annex, Appendix II, entitled “The genealogy of the Wright-Fisher model.” This takes a rather subtle inequality, used in Kingman (1976) to prove a convergence result, and gives it a probabilistic meaning in terms of the random variable called Δ(t) above. It shows, in fact, that the probability that Δ(t) is greater than any integer r is at most $3(1 - N^{-1})^r$, the constant 3 being the best possible. But despite the title, there is no exploration of the family tree beyond the number of generations back to the common ancestor.

Our host at Iowa State, Oscar Kempthorne, had gathered an impressive group of participants, both mathematicians and biologists, and we discussed the problems of population genetics far into the night. It does not appear that the structure of the family tree entered into these debates, but it must have been there that the crucial idea was conceived, because the first account of the coalescent appears in Kingman (1982a), submitted for publication less than a year after the conference, in May 1980.

A footnote on the first page of that article observes that “genealogy means the whole family tree structure,” so the cat is out of the bag. The argument starts from the observation that the Wright-Fisher multinomial model is equivalent to the rule that each member of a generation chooses its mother at random from the previous generation, the choices of different members being independent. This means that two members of the same generation have a probability $(1 - N^{-1})^r$ of having different ancestors r generations back. If time is measured in units of N generations and N → ∞, the time to a common ancestor for the two has a negative exponential distribution with probability density $e^{-t}$.
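The "choose your mother at random" rule is easy to simulate. The following is a minimal sketch under the stated haploid Wright-Fisher assumptions; the function names and the values of N and the trial count are arbitrary choices for the example. It follows two lines of descent backward until they pick the same mother, then compares the results with the probability given above and with the exponential limit, whose mean is 1 in units of N generations.

```python
import random

def pair_coalescence_generations(N, rng):
    """Generations back until two lines of descent pick the same mother."""
    generations = 1
    # Each generation, both lines choose a mother uniformly from N candidates.
    while rng.randrange(N) != rng.randrange(N):
        generations += 1
    return generations

def prob_distinct_ancestors(N, r):
    """Probability that two members still have different ancestors
    r generations back: (1 - 1/N)^r."""
    return (1.0 - 1.0 / N) ** r

def simulate_times(N, trials, seed=0):
    """Coalescence times (in generations) for many independent pairs."""
    rng = random.Random(seed)
    return [pair_coalescence_generations(N, rng) for _ in range(trials)]

if __name__ == "__main__":
    N = 50
    times = simulate_times(N, 20_000)
    still_distinct = sum(1 for t in times if t > N) / len(times)
    print("simulated P(distinct after N generations):", still_distinct)
    print("formula:", prob_distinct_ancestors(N, N))
    print("mean time in units of N generations:", sum(times) / (len(times) * N))
```

Even for modest N the simulated mean is close to 1 and the fraction of pairs still distinct after N generations is close to 1/e, as the limiting density predicts.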

Now consider n members of a particular generation, and trace their family tree backward through time. For some time there will be n ancestors, but at some instant two of the lines come together. The probability density of this coalescence time (in the limit as N → ∞) is $k e^{-kt}$, where k = n(n - 1)/2 is the number of pairs that might coalesce. Now trace back the n - 1 lines until they coalesce; the argument is the same with n replaced by n - 1 and so on, until the number of lines is reduced to 1. The article sets this up formally, by means of a Markov chain whose states are equivalence relations on {1, 2,..., n}, and relates it to a representation of the Ewens sampling formula in terms of a certain “random paintbox.” Thus the circle of ideas is complete.
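The backward-in-time construction just described can be sketched in a few lines. This is an illustrative simulation under the limiting model only, not code from any of the cited articles: time is measured in units of N generations, the blocks of the partition stand in for the equivalence classes on {1, 2,..., n}, and the function names are invented for the example. The expected time to the common ancestor of the whole sample follows from summing the mean waiting times 1/k, which gives 2(1 - 1/n).

```python
import random

def simulate_genealogy(n, rng):
    """Trace n lines back to one ancestor.
    Returns a list of (time, merged_block) coalescence events."""
    blocks = [frozenset([i]) for i in range(1, n + 1)]
    time = 0.0
    history = []
    while len(blocks) > 1:
        # k pairs could coalesce, so the waiting time has density k*exp(-k*t).
        k = len(blocks) * (len(blocks) - 1) / 2
        time += rng.expovariate(k)
        # Every pair of lines is equally likely to be the one that merges.
        i, j = rng.sample(range(len(blocks)), 2)
        merged = blocks[i] | blocks[j]
        blocks = [b for idx, b in enumerate(blocks) if idx not in (i, j)]
        blocks.append(merged)
        history.append((time, merged))
    return history

def expected_tmrca(n):
    """Sum of 1/k over k = j(j-1)/2 for j = 2..n, which equals 2(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)

def mean_tmrca(n, trials, seed=0):
    """Average simulated time to the most recent common ancestor."""
    rng = random.Random(seed)
    return sum(simulate_genealogy(n, rng)[-1][0] for _ in range(trials)) / trials

if __name__ == "__main__":
    for time, block in simulate_genealogy(5, random.Random(1)):
        print(round(time, 3), sorted(block))
    print(mean_tmrca(10, 20_000), expected_tmrca(10))
```

Most of the total depth of the tree comes from the final step, when only two lines remain and the waiting time has mean 1.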

Kingman (1982b) covers much the same ground in a more mathematical way, but a more important article is Kingman (1982c). This proves strong robustness results and goes a long way to explaining why the coalescent is useful for a wide range of neutral models. It also shows directly how, by allowing mutation to act on the branches of the family tree, the Ewens formula follows.
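For readers who have not met it, the Ewens sampling formula gives the probability that a sample of n genes contains a_1 alleles seen once, a_2 alleles seen twice, and so on. The sketch below uses the formula as it is usually stated, P = n!/(θ(θ+1)...(θ+n-1)) × Π θ^a_j / (j^a_j a_j!), and is offered as an illustration rather than as anything taken from the articles discussed; the function names are made up for the example.

```python
from math import factorial

def rising_factorial(theta, n):
    """theta * (theta + 1) * ... * (theta + n - 1)."""
    result = 1.0
    for i in range(n):
        result *= theta + i
    return result

def ewens_probability(counts, theta):
    """Probability of an allelic configuration under the Ewens formula.
    counts[j - 1] is a_j, the number of alleles seen exactly j times."""
    n = sum(j * a for j, a in enumerate(counts, start=1))
    probability = factorial(n) / rising_factorial(theta, n)
    for j, a in enumerate(counts, start=1):
        probability *= theta ** a / (j ** a * factorial(a))
    return probability

def expected_allele_count(n, theta):
    """Expected number of distinct alleles in a sample of size n."""
    return sum(theta / (theta + i) for i in range(n))

if __name__ == "__main__":
    # All five configurations of a sample of size 4, written as (a_1..a_4).
    configurations = [(4, 0, 0, 0), (2, 1, 0, 0), (0, 2, 0, 0),
                      (1, 0, 1, 0), (0, 0, 0, 1)]
    for config in configurations:
        print(config, ewens_probability(config, theta=1.0))
    print("total:", sum(ewens_probability(c, 1.0) for c in configurations))
```

The probabilities over all configurations of a given sample size sum to 1 for any θ > 0, which is a quick sanity check on an implementation.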

There is a moral to this tale. The first articles on coherent random walks (Ewens 1974; Moran 1975; Kingman 1976) were bogged down in complex algebra. If we had asked what the equations meant in probabilistic terms, we could not have missed the significance of the family tree or the simplification that comes if mutation is nonrecurrent (or if the mutant is independent of the parent). Those who analyze stochastic models should always lift their eyes from their equations to ask what they actually mean.

Clairwil
by Member on Feb. 6, 2013 at 4:54 AM

Who needs an IQ test when you’ve got coalescence?

(source)

I am just blown away by the consistency of this observation. You know, the creationists are not all stupid; there’s a wide range of intelligence in their camp, even if they are all wrong. But this one recent paper on the gorilla genome has become such an excellent tool for discriminating the competent from the incompetent.

This was the paper that unsurprisingly explained that gorilla genes reveal a mosaic; that some gorilla genes are closer to human or chimpanzee than the latter are to each other. If you understand the logic of coalescent theory at all, you know this is an expected result. The only way you could fail to see the distribution we observe is if the population went through a bottleneck of exactly two individuals.

But once again, one of the so-called scientists of intelligent design creationism blows it. Doug Axe has announced that the ape family tree is hopelessly broken, and that the gorilla data should call evolutionary theory into question.

Until recently, the answer was that a real family tree should generate a fully consistent pattern of similarities. [Not true at all. Coalescent theory is an extension of Fisher/Wright models of large populations, and the formal mathematics were worked out in the 1980s] For example, we are told that chimps and humans came from the same ancestral stock (call it CH stock) and that gorillas, chimps and humans all came from an earlier ancestral stock (GCH stock) [Correct so far]. If so, then the human and chimp genomes should consistently be more similar to each other than either is to the gorilla genome [WRONG. They should not be consistently more similar. Does he know nothing of probability?], since the human and chimp histories were one and the same thing more recently than the human and gorilla (or chimp and gorilla) histories were.

Well, the recent publication of the gorilla genome sequence shows that the expected pattern just isn’t there [Jebus. Read the paper. The pattern observed is the expected pattern.]. Instead of a nested hierarchy of similarities, we see something more like a mosaic [AS WE'D EXPECT.]. According to a recent report, “In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other…”

That’s sufficiently difficult to square with Darwin’s tree that it ought to bring the whole theory into question. And in an ideal world where Darwinism is examined the way scientific theories ought to be examined, I think it would. But in the real world things aren’t always so simple [And yet the creationists keep throwing up their simplistic models and being surprised that they're wrong].

Axe is the one guy the creationists keep touting as a real scientist, a guy with genuine chops in molecular biology, the man who is doing serious scientific work. You know, if you’re going to publicly criticize an observation and claim it calls into question the entirety of evolutionary theory, you ought to first look into it and see whether that observation actually fits a prediction of evolution — actual evolutionary theory, not that cartoonishly naive caricature of evolution the creationists all have in their heads.

Here’s a nice, short history of coalescent theory by Kingman. It’s been around for decades, long before the gorilla genome was sequenced, and it predicted what kinds of distributions we ought to see in our comparisons of different species…predictions that were borne out by the paper Axe thinks contradicts evolutionary theory.
