Humans and Chimps

Administrator2 · Feb 15, 2002

[Administrator: Dave Plaisted sent in five emails for this thread. All appear in this post, with separators between them.]

DAVID PLAISTED

In response to Froggie: I looked up the data about introns and here is the
result:
Although gene-dense clusters are obvious, almost half the genes are
dispersed in low G+C sequence separated by large tracts of apparently
noncoding sequence. Only 1.1% of the genome is spanned by exons,
whereas 24% is in introns, with 75% of the genome being intergenic
DNA.

This is from the abstract to the following article:

The sequence of the human genome, by Venter et al., Science 2001 Feb. 16,
p. 1304-1351.

In fact one table suggests that the amount of the genome in introns
could be considerably higher, if there are somewhat more genes. This
shows that the total number of base pairs in each gene is almost 25
times that devoted to coding DNA.
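
As a rough check on that ratio, here is a minimal sketch that uses only the three percentages quoted from the abstract. It treats "spanned by exons" as the coding fraction, which is an approximation, and the function name is just an illustrative choice. From the abstract's figures alone the ratio comes out a little under 23; the table mentioned above, which allows for more genes, would push it higher.

```python
# Percentages of the genome quoted in the abstract above.
EXON_PCT = 1.1
INTRON_PCT = 24.0


def gene_to_exon_ratio(exon_pct: float, intron_pct: float) -> float:
    """Ratio of total gene span (exons + introns) to exon span.

    Assumption: exon span stands in for coding DNA, which slightly
    overstates the coding fraction because exons include untranslated parts.
    """
    return (exon_pct + intron_pct) / exon_pct


ratio = gene_to_exon_ratio(EXON_PCT, INTRON_PCT)
print(f"gene span / exon span = {ratio:.1f}")
```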

I also read somewhere that in the fruit fly, some entire genes are in
the introns of other genes -- I wonder if this has been considered in
the human genome as well.

* * *

In response to a query by Froggie (I think) here is the reference for
the statement about rapid evolution of finches:
EVOLUTIONARY BIOLOGY:
Finches Adapt Rapidly to New Homes

Elizabeth Pennisi

Science, Volume 295, Number 5553, Issue of 11 Jan 2002, pp. 249-250.

Birds of a feather don't necessarily stick together. A study of house
finches has demonstrated that in just 30 years, finches newly settled
in Montana and Alabama begin to look and act quite different from each
other, despite being close kin. Alexander Badyaev, an evolutionary
ecologist at Auburn University in Alabama, and his colleagues have
also shown that these flourishing avian pioneers improve their chances
of success in part by controlling the sex of their eggs as they lay
them. In this way, mothers influence the size of their offspring, an
important survival trait.

The new work, reported on page 316 of this issue of Science, shows
that "the time scale of decades [not centuries] is really enough for
animals to evolve," notes David Reznick, an evolutionary biologist at
the University of California, Riverside. "The idea that the
[divergence] could be that rapid is really remarkable," adds Ben
Sheldon, an evolutionary biologist at Oxford University, United
Kingdom.

* * *

I just want to clarify one remark of Froggie's -- she gave results
about the calculation of sequence divergence from the thermal
stability of DNA heteroduplexes. She quotes an article that shows a
linear relationship between sequence divergence and the thermal
stability of DNA heteroduplexes. If anyone thinks this is evidence
for evolution, it is not -- it just shows that one can estimate the
sequence divergence of DNA by forming a heteroduplex (a double-stranded
molecule made by pairing single strands from two different sources) and
testing its thermal stability.

* * *

More responses to Froggie:
Froggie said that only a third of the amino acids in Cytochrome C are
necessary for its function. It's not that you can just delete 2/3 of
the protein, however. This just means that 1/3 of the amino acids are
the same in all Cytochrome C. The others typically vary a little but
generally only among amino acids that have similar properties, so they
are also necessary for function.

If the sequence of Cytochrome C does not affect its function then why
do E. coli all over the world have the same Cytochrome C sequence
(which I assume they do)? With such a huge population and short
generation times one would expect many mutations to arise and many
versions of Cytochrome C to be found.

The fact that a different version can be substituted in an organism
and still function does not mean it functions as well. A good test
would be to take two organisms, identical but with different Cytochrome
C and see how they fare in competition. Even if the Cytochrome C
functions it could impair the organism in some way. Since species
are so highly uniform in their Cytochrome C (and indeed all proteins)
it suggests that these sequences have a benefit to the organism and
mutations are eliminated from the population.

More to come ...

* * *

Froggie asserts that the differences between Cytochrome C in different organisms support the theory of evolution. However, the patterns of molecular evolution do not seem to fit what one would expect:

The paper, Vagaries of the molecular clock, by Francisco J. Ayala, from Proc. Natl. Acad. Sci. USA Vol. 94, pp. 7776-7783, July 1997, illustrates some of the problems with the so-called molecular clocks. The abstract follows:
The hypothesis of the molecular evolutionary clock asserts that informational macromolecules (i.e., proteins and nucleic acids) evolve at rates that are constant through time and for different lineages. The clock hypothesis has been extremely powerful for determining evolutionary events of the remote past for which the fossil and other evidence is lacking or insufficient. I review the evolution of two genes, Gpdh and Sod. In fruit flies, the encoded glycerol-3-phosphate dehydrogenase (GPDH) protein evolves at a rate of 1.1 x 10^-10 amino acid replacements per site per year when Drosophila species are compared that diverged within the last 55 million years (My), but a much faster rate of 4.5 x 10^-10 replacements per site per year when comparisons are made between mammals (70 My) or Dipteran families (100 My), animal phyla (650 My), or multicellular kingdoms (1100 My). The rate of superoxide dismutase (SOD) evolution is very fast between Drosophila species (16.2 x 10^-10 replacements per site per year) and remains the same between mammals (17.2) or Dipteran families (15.9), but it becomes much slower between animal phyla (5.3) and still slower between the three kingdoms (3.3). If we assume a molecular clock and use the Drosophila rate for estimating the divergence of remote organisms, GPDH yields estimates of 2,500 My for the divergence between the animal phyla (occurred 650 My) and 3,990 My for the divergence of the kingdoms (occurred 1,100 My). At the other extreme, SOD yields divergence times of 211 My and 224 My for the animal phyla and the kingdoms, respectively. It remains unsettled how often proteins evolve in such erratic fashion as GPDH and SOD.
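
To make the extrapolation in the abstract concrete, here is a minimal sketch of the arithmetic, not the paper's own calculation. It assumes the amount of change per site is simply rate times time, so dating a split with the Drosophila rate gives (true age) x (observed rate) / (Drosophila rate). Rates are in units of 10^-10 replacements per site per year, and the function name is my own. With the rounded rates from the abstract, SOD comes out near 213 and 224 My, close to the quoted 211 and 224. GPDH comes out near 2,660 and 4,500 My, which overshoots in the same direction as the quoted 2,500 and 3,990 but does not reproduce them, presumably because the paper works from unrounded per-comparison rates.

```python
def clock_estimate_my(true_age_my: float, observed_rate: float,
                      reference_rate: float) -> float:
    """Age (My) inferred when a constant reference rate is assumed.

    Change per site accumulated over the true age is
    observed_rate * true_age_my; dividing by the assumed reference
    rate gives the age a strict clock would report. The rate units
    cancel, so any consistent unit works.
    """
    return true_age_my * observed_rate / reference_rate


# Rounded rates from the abstract, in units of 1e-10 per site per year.
SOD_DROSOPHILA, SOD_PHYLA, SOD_KINGDOMS = 16.2, 5.3, 3.3
GPDH_DROSOPHILA, GPDH_REMOTE = 1.1, 4.5

sod_phyla = clock_estimate_my(650, SOD_PHYLA, SOD_DROSOPHILA)
sod_kingdoms = clock_estimate_my(1100, SOD_KINGDOMS, SOD_DROSOPHILA)
gpdh_phyla = clock_estimate_my(650, GPDH_REMOTE, GPDH_DROSOPHILA)
gpdh_kingdoms = clock_estimate_my(1100, GPDH_REMOTE, GPDH_DROSOPHILA)

for label, value in [("SOD phyla", sod_phyla), ("SOD kingdoms", sod_kingdoms),
                     ("GPDH phyla", gpdh_phyla), ("GPDH kingdoms", gpdh_kingdoms)]:
    print(f"{label}: {value:.0f} My")
```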

The text of the paper reveals the puzzlement of the author as to how this could occur. Also, Helen posted an article stating that no known combination of mechanisms can explain the observed pattern of molecular evolution -- I don't have the reference, however.

Dave Plaisted

Administrator2 · Feb 15, 2002

DAVID PLAISTED

A good web page (I finally found it again) about self-complementary DNA is the following:
http://post.queensu.ca/~forsdyke/bioinfo2.htm

Note especially the following (from section 5):

"We propose above that in some circumstances evolutionary selective pressures have acted to preserve nucleic acid secondary structure, sometimes at the expense of an encoded protein. That this might also apply to the species-dependent component of the base composition, (C+G)%, arose from Noboru Sueoka's demonstration in 1961, before the genetic code was deciphered, that the amino acid composition of the proteins of microorganisms is influenced, not just by the demands of the environment on the proteins, but also by the base composition of the genome encoding those proteins. The observation has since been abundantly confirmed in a wide variety of animal and plant species (Lobry, 1997)."

Sueoka (1961) further pointed out that for individual "strains" of Tetrahymena the (C+G)% (referred to as "GC") tends to be uniform throughout the genome:

"If one compares the distribution of DNA molecules of Tetrahymena strains of different mean GC contents, it is clear that the difference in mean values is due to a rather uniform difference of GC content in individual molecules. In other words, assuming that strains of Tetrahymena have a common phylogenetic origin, when the GC content of DNA of a particular strain changes, all the molecules undergo increases or decreases of GC pairs in similar amounts. This result is consistent with the idea that the base composition is rather uniform not only among DNA molecules of an organism, but also with respect to different parts of a given molecule."

Again, this observation has been abundantly confirmed for a wide variety of species (Muto and Osawa, 1987), although many organisms considered higher on the evolutionary scale have their genomes sectored into regions of low or high (C+G)% (Bernardi and Bernardi, 1986; Bernardi 2000; see Section 9).
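
For anyone unsure what (C+G)% measures, here is a minimal sketch that computes it for a whole sequence and for consecutive windows, which is one simple way to see whether a sequence is uniform or "sectored". The sequence is a made-up toy, not real data, and the function names are my own. It assumes the input contains only A, C, G and T.

```python
def gc_percent(seq: str) -> float:
    """Percentage of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int) -> list[float]:
    """(C+G)% of consecutive non-overlapping windows of the given size.

    A trailing stretch shorter than one window is ignored.
    """
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


# Toy sequence (hypothetical): an AT-only stretch followed by a GC-only one.
toy = "ATATATATAT" + "GCGCGCGCGC"

print(gc_percent(toy))       # overall value hides the two sectors
print(windowed_gc(toy, 10))  # per-window values expose them
```

The overall value for the toy sequence is 50%, while the two windows come out at 0% and 100%, which is the kind of contrast the "sectored" genomes above are described as having, in exaggerated form.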

Sueoka (1961) also noted a link between (C+G)% and reproductive isolation for strains of Tetrahymena:

"DNA base composition is a reflection of phylogenetic relationship. Furthermore, it is evident that those strains which mate with one another (i.e. strains within the same 'variety') have similar base compositions. Thus strains of variety 1 ..., which are freely intercrossed, have similar mean GC content."

When the genetic code was deciphered in the early 1960s, it was observed that there are more codons than amino acids, so that most amino acids can correspond to more than one triplet codon. This gives some flexibility to a nucleic acid sequence. Sometimes an amino acid can be encoded from among as many as six possible synonymous codons. Walter Fitch (1974) noted that "the degeneracy of the genetic code provides an enormous plasticity to achieve secondary structure without sacrificing specificity of the message".
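
The "as many as six" figure can be checked directly against the standard genetic code. The sketch below builds the 64-codon table from the compact one-letter layout commonly used for the standard code (bases in the order T, C, A, G, with `*` marking stop codons) and counts how many codons map to each amino acid. The variable names are my own.

```python
from collections import Counter
from itertools import product

# Standard genetic code in compact form: codons enumerated with the
# first base varying slowest and the third fastest, bases ordered TCAG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid, ignoring stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
six_fold = sorted(aa for aa, n in degeneracy.items() if n == 6)

print(f"{len(CODON_TABLE)} codons encode {len(degeneracy)} amino acids")
print("six synonymous codons:", six_fold)
print("single codon only:", sorted(aa for aa, n in degeneracy.items() if n == 1))
```

Leucine (L), arginine (R) and serine (S) are the three amino acids with six synonymous codons; methionine (M) and tryptophan (W) have only one each.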

Yet, as outlined above, sometimes even this "plasticity" is insufficient, so that, with the exception of genes under positive Darwinian selection (Forsdyke, 1995b, 1996a), genomic secondary structure ("fold pressure") and (C+G)% "call the tune". Non-synonymous codon changes modify the amino acid sequence, sometimes at the expense of protein structure and function. A protein has to adapt to the demands of the environment, but it also has to adapt to genomic forces which we will show have derived, not from the conventional environment acting upon the conventional ("classical") phenotype, but from what we call the "reproductive environment" acting on the "genome phenotype", or "reprotype". Thus Bernardi and Bernardi noted in 1986 that:

"The organismal phenotype comprises two components, the classical phenotype, corresponding to the 'gene products', and a 'genome phenotype' which is defined by [base] compositional constraints."

This suggests that even the structure of proteins is biased to preserve this C+G property, and could explain why the same protein differs between organisms even if this difference does not directly influence its function.

Dave Plaisted
