One of the great transitions in the history of life on our planet was that from single-celled blob to multicellular… well, probably also blob. But a better blob; a more complex blob, with aspirations of one day having tentacles or growing leaves or inventing the talking thermometer.
Unfortunately, this precise moment in the lineages we care about (i.e. animals; or, more specifically, us) has been lost to the sands of time. There is no surviving species we can point to as how we would have looked at that crucial time in our past. Thus, although we can talk all we want as to how it might have happened, we are adrift on a sea of speculation, without the land of empiricism in sight.
However, there exist other systems that we can study to get some idea of the changes involved in moving from unicellular to multicellular life.
Prochnik SE, Umen J, Nedelcu AM, Hallmann A, Miller SM, Nishii I, Ferris P, Kuo A, Mitros T, Fritz-Laylin LK, Hellsten U, Chapman J, Simakov O, Rensing SA, Terry A, Pangilinan J, Kapitonov V, Jurka J, Salamov A, Shapiro H, Schmutz J, Grimwood J, Lindquist E, Lucas S, Grigoriev IV, Schmitt R, Kirk D, & Rokhsar DS (2010). Genomic analysis of organismal complexity in the multicellular green alga Volvox carteri. Science (New York, N.Y.), 329 (5988), 223-6 PMID: 20616280
The Volvocine algae apparently have nothing to do with Swedish cars, but are a morphologically diverse lineage of green algae, containing both unicellular and multicellular members. Not only this, but they exhibit different levels of complexity, from simple sheets of identical cells as in Gonium, to differentiated somatic and reproductive cells as in Volvox.
(The idea of somatic cells is worth thinking about for a second. We humans take it for granted that we have regular body (somatic) cells, and specialised sex cells. But from the point of view of a single-celled organism that’s really weird. Some cells give up their own reproduction to help someone else? What’s in it for them?? So while just two different cell types may seem very basic to us, it’s a big deal for the algae.)
The authors of this paper attempted to characterise the changes from single- to multi-celled in Volvocine algae at the genomic level. The single-celled alga Chlamydomonas reinhardtii already had a genome assembly available; they used whole-genome shotgun sequencing to get the Volvox carteri genome. The idea is to look at the primitive multicellular organism and see what it has that its single-celled relative doesn’t.
So what did they find? Well, the two genomes were very similar sizes: 138 Mb for Volvox compared to 118 Mb for Chlamydomonas. In fact, the difference in size was largely made up of more repetitive DNA (junk DNA) in Volvox; the number of protein-coding genes was almost identical (14,520 and 14,516).
Not a lot to go on there. The difference must be in the type of genes, not the number. They looked in detail at genes involved in pathways they thought would be important to Volvox: protein secretion and trafficking; the cytoskeleton; the extracellular matrix (ECM); and cell-cycle regulation.
They found that, and I quote, “The components of these pathways are nearly identical in Volvox and Chlamydomonas.”
The ECM proteins showed the biggest differences, with 2 families in particular having substantially more members in Volvox. They also found a few extra genes involved in regulating the cell cycle; important if you have different cell types (somatic and reproductive) supposed to be doing different things at the same time.
But that was it! Fundamentally, they found that a few minor changes and duplications of existing genes were enough to let the unicellular ancestor of Volvox make the supposedly giant leap to multicellular co-operative harmony.
There are a few caveats, though. Firstly, Chlamydomonas has more genes than most other single-celled organisms to start with. Thus the gains in gene number required to “go multi” may already have been largely met.
Secondly, this paper is not saying, “oh, this is how it happened for all multicellular life.” Not at all. This is one possible pathway. There have almost certainly been multiple independent lineages that made the leap at some stage; probably many more than have survived to let us know.
Thirdly, and perhaps most importantly, the authors focused almost exclusively on the protein-coding content. As we now know, there’s a lot more to genetics than proteins. There might be regulatory RNAs running the whole show in there; we have no idea. Not only that, but they were looking at known protein content. If it couldn’t be identified or a function deduced from identified functional domains (that’s where you look at individual bits of a protein to figure out what it does), they couldn’t do much with it.
But that said: the expectation was that you’d find a whole lot of functional innovation in the genome that allows Volvox to be multicellular while Chlamydomonas is not. And they didn’t. Even if there were important genes in the pool that they couldn’t identify, there were very few genes that didn’t have a counterpart in Chlamydomonas.
The more we learn about genetics, the more we find that what we think of as big changes needing brand new genes is just an illusion. It’s easy to look at something as complex as a hand or a flower or an insect’s wing and think there’s no way that could have just grown. But they do, all around us, every day. The bacterial flagellum is just the product of a protein-trafficking gene gone a bit wrong. The human eye is only a sheet of light-receptive cells with accessories tacked on a bit at a time. And as it turns out, multicellular organisms are just single cells that stuck together.
Never fear – the CDC is on it!
Click here for everything you will need to be ready for the upcoming zombie invasion.
…you become great, apparently, by being xkcd.
It is done. I am shortly to be the proud recipient of a set of T-Rex shot glasses.
I would like to thank my agent; my management team; my wonderful co-stars, the box and the anonymous cat – you guys were great, I couldn’t have done it without you; my parents; my husband; the entire LOLcatz internet community; and most importantly, everyone who I coerced into voting for me. Ceiling Cat bless you all.
Have you ever watched interviews with tennis players immediately after the match?
“Well, I thought it was a good game, but at the end of the day, I thought maybe I played a bit better in the first set, but then he played a bit better in the next two sets, and, at the end of the day, he won, but it was a good game, both of us played really well, but, at the end of the day, he won.”
Yeah. That’s what writing up negative results feels like.
“Well, I thought it was a good experiment, but at the end of the day, I thought maybe it was going well at first, but then the results were all crap, and, at the end of the day, it didn’t work, but it was a good experiment, everything went really well, but at the end of the day, it didn’t work.”
Everyone* knows that DNA codes for proteins. In between the DNA and the protein though, there is an intermediate molecule called RNA – ribonucleic acid (DNA is deoxyribonucleic acid). RNA is similar to DNA in many respects – a long chain of bases that we can consider “letters”, which form a code – but it’s less stable, and thus less suitable for long-term storage of information (that’s DNA’s job). RNA has many uses in the cell – the actual machinery that makes proteins? That’s largely made of RNA! – but for now we’ll just concentrate on the mRNA.
That’s right, I said mRNA. It stands for messenger-RNA, to distinguish it from the various other types of RNA. This is the bit that gets directly copied (transcribed) from DNA; and this is what forms the template for translation to a protein. (Transcription and translation are technical terms. Don’t get them mixed up.)
But sometimes, things get a little mixed up. Sometimes, instead of going on to make a nice protein, the mRNA gets reverse transcribed back into DNA – and gets snatched back into a random place in the genome. This is called a retrocopy.
Now even a perfectly good bit of code is useless in a genome without something to flag it up to the cell’s machinery. These bits are called promoters. Promoters are in the non-coding DNA just outside of the gene proper, so they aren’t found in the subsequent mRNA. Thus retrocopies don’t have any promoters.
Without promoters, the retrocopies are just playing to an empty room, so they tend to degenerate into pseudogenes – they look like the parent gene, but they’re non-functional, and they tend to accumulate mutations.
These exist primarily to screw up my alignments.
But once in a while, a retrocopy will find itself inserted next to some pre-existing promoters. Now it’s a whole new copy of the original gene – a retrogene.
The twist? Many genes in eukaryotes (that’s essentially every living thing except bacteria, archaea and viruses) have gaps in the coding sequence – bits of DNA that get transcribed but are cut out again before the mRNA is translated. These are called introns, because they’re in-between the coding sequence. (The bits that get expressed are called exons. Again, don’t get them mixed up.)
But the retrogenes come from mRNA. And mRNA has the introns already cut (spliced) out. So retrogenes don’t have any introns – that’s how you can tell which one is the original, or parent gene, and which one is the copy. Neat huh?
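If it helps to see that logic laid out, here is a toy sketch in Python. Everything in it is made up for illustration: the sequences are not real genes, the function names are my own, and I've left the letters as DNA throughout rather than swapping T for U in the message. Real retrocopy detection works on genome alignments, not tidy lists like this.

```python
# Toy model: a gene is a list of (kind, sequence) segments.
# The sequences are invented for illustration, not taken from any real gene.
parent_gene = [
    ("exon", "ATGGCC"),
    ("intron", "GTAAGTTTTCAG"),
    ("exon", "GATTAC"),
    ("intron", "GTACGTCCCTAG"),
    ("exon", "TGA"),
]

def splice(gene):
    """Return the mature message: exons joined together, introns cut out."""
    return "".join(seq for kind, seq in gene if kind == "exon")

def retrocopy(gene):
    """Model a retrocopy as the spliced message pasted back as one intron-free block."""
    return [("exon", splice(gene))]

def count_introns(gene):
    """Count the intron segments in a gene."""
    return sum(1 for kind, _ in gene if kind == "intron")

def pick_parent(gene_a, gene_b):
    """Guess that whichever copy has more introns is the parent."""
    return gene_a if count_introns(gene_a) >= count_introns(gene_b) else gene_b

copy = retrocopy(parent_gene)
print(splice(parent_gene))         # ATGGCCGATTACTGA
print(count_introns(parent_gene))  # 2
print(count_introns(copy))         # 0
```

The copy carries exactly the same coding letters as the parent, but the intron count gives the game away.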
The final thing that can happen is that a retrocopy ends up not only with a new promoter, but also close enough to another bit of coding sequence that this new bit gets transcribed along with it, all as one gene. This is called a chimeric retrogene – a gene composed of different original bits.
OK? Everyone up to speed?
Zhu, Z., Zhang, Y., & Long, M. (2009). Extensive Structural Renovation of Retrogenes in the Evolution of the Populus Genome PLANT PHYSIOLOGY, 151 (4), 1943-1951 DOI: 10.1104/pp.109.142984
Up until recently, it was thought that retrocopies and retrogenes were mostly an animal thing, and didn’t really play a big role in plant evolution. This was mostly based on the fact that very few retrogenes were found in Arabidopsis thaliana, the major model species in plant genetics. This rather neatly highlights the problem of basing judgements about a whole kingdom on one species. As soon as researchers started looking elsewhere, they found LOTS of retrogenes, including lots of chimeric retrogenes (Charlesworth et al., 1998; Wang et al., 2006).
The current paper looked at retrocopies in Populus trichocarpa, still the major model tree in genetics (yes, I know I just said that using model species was inherently flawed… but you gotta start somewhere, that’s why we have them).
The first step, and really the core of the paper, is their “pipeline” for identifying potential retrocopies. They started with the whole Populus genome, got 71,278 candidates… and ended up with 106 retrocopies. That’s a hell of a lot of narrowing down; I’m not going to break down the specifics here, but they admit that their criteria for inclusion tended towards the stringent. By using the same method with Arabidopsis, they ended up with 32 out of 69 previously identified – clearly, they were prioritising finding unambiguous cases rather than all possible cases.
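To put numbers on just how stringent that is, here is a quick back-of-the-envelope calculation in Python. The four counts are the ones quoted above; the variable names and the `percent` helper are mine.

```python
# Counts as quoted in the paragraph above.
populus_candidates = 71_278
populus_retrocopies = 106
arabidopsis_recovered = 32
arabidopsis_known = 69

def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

kept = percent(populus_retrocopies, populus_candidates)
recovered = percent(arabidopsis_recovered, arabidopsis_known)

print(f"Populus candidates kept: {kept:.2f}%")                        # 0.15%
print(f"Known Arabidopsis retrocopies recovered: {recovered:.1f}%")   # 46.4%
```

So roughly one candidate in 670 made it through, and the same method recovers a bit under half of the Arabidopsis retrocopies that were already known.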
Then they took their 106 retrocopies and tried to figure out if they were just copies or actual functional genes – retrogenes; and if so, whether they were straight-up copies of the parent, or if they were chimerised. They found 95 out of their 106 copies showed evidence that they were being transcribed. That’s 89.6%. As they state, for comparison, only 16% of retrocopies so far identified in the human genome are potentially functional.
Whoa, that’s some serious stuff. But wait, let’s think about this… Their “pipeline” was super-stringent, as we already said. They were most likely throwing out a lot of potential retrocopies along the way. Have they really found that “OMG 89% of Populus retrocopies are functional!?!” Or is it that functional retrocopies are more likely to survive their inclusion process, giving a biased proportion? I would guess the latter, and my major criticism of this paper is that I think they were a bit premature in throwing this out as the major finding.
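Here is a small Python sketch of the kind of bias I mean. To be clear, every number in it is an assumption I picked for illustration: I have no idea what the true functional fraction in Populus is, nor how likely the pipeline is to keep a functional copy versus a degenerate one. The point is only that a filter which favours well-preserved sequences can turn a modest true fraction into a very high observed one.

```python
def surviving_functional_fraction(true_fraction, pass_if_functional, pass_if_dead):
    """Fraction of copies that are functional among those that pass a filter.

    true_fraction:      share of all retrocopies that are functional (assumed)
    pass_if_functional: chance a functional copy survives the filter (assumed)
    pass_if_dead:       chance a non-functional copy survives the filter (assumed)
    """
    functional_kept = true_fraction * pass_if_functional
    dead_kept = (1 - true_fraction) * pass_if_dead
    return functional_kept / (functional_kept + dead_kept)

# Hypothetical scenario: only 16% of copies are truly functional, but the
# filter keeps half of those and just 2% of the degenerate ones.
observed = surviving_functional_fraction(0.16, 0.50, 0.02)
print(f"{observed:.3f}")  # 0.826
```

With those invented numbers, a true fraction of 16% shows up as about 83% after filtering. If the filter treated both kinds of copy equally, the observed fraction would simply match the true one.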
Because – let’s not mess around here – they found 95 potentially functional retrogenes in the Populus genome! In one study! That’s awesome! And wait – it gets better!
Not only were there 95 retrogenes, but there were 12 chimeric retrogenes. This is pretty cool in itself. But they also found what may be a NEW way for retrogenes to produce novel genetic information: intronisation. Some of the retrogenes generated a new intron out of previously coding DNA.
This is big! As you’ll recall, one of the features we expect from retrogenes is that they don’t have any introns – because they come from mRNA, where the introns are already spliced out. These ones not only have introns – they’re different introns from the parent gene.
So to sum up: plants have retrogenes; they may have a LOT of retrogenes; intronisation of retrogenes is potentially a new mechanism for generating new genetic information; and the authors really downplayed that in their own paper in favour of a number (89%!) which I don’t trust. Oh well.
Charlesworth, D., Liu, F.L. & Zhang, L. The evolution of the alcohol dehydrogenase gene family by loss of introns in plants of the genus Leavenworthia (Brassicaceae). Molecular Biology and Evolution 15, 552-559 (1998).
Wang, W. et al. High Rate of Chimeric Gene Origination by Retroposition in Plant Genomes. The Plant Cell Online 18, 1791-1802 (2006).
Zhu, Z., Zhang, Y. & Long, M. Extensive structural renovation of retrogenes in the evolution of the Populus genome. Plant Physiol 151, 1943-1951 (2009).