Papaya sex chromosomes
Papaya was an early success story of transgenics research: the aim was to stop a nasty disease called papaya ringspot virus from wiping out the industry in Hawaii (and elsewhere – but Hawaii is where they got the funding and did the work). Resistant transgenic varieties were developed in the early 1990s. Substantial genetic data was generated, eventually leading to a full genome sequence (published 2008)*. One of the findings was that one of the nine pairs of chromosomes showed signs of being incipient sex chromosomes.
Papaya is “trioecious” (that’s what the authors call it, ‘k? I’m still not totally sure about it) – that is, it has female, male and hermaphrodite forms. There is some commercial interest in sex-determination – obviously males don’t produce fruit, but also hermaphrodites are preferred to females (apparently they taste better). The three sexes are morphologically indistinguishable until flowering.
It was found that papaya has an XY sex chromosome system with a twist. Females are XX, males are XY, and hermaphrodites are XYh. (The Y and Yh were designated such because they are substantially identical in sequence.) There is a region of very low recombination and high divergence, as you would expect from sex chromosomes. BUT only a little bit of each is divergent – and you can’t tell them apart just by looking at them, like you can with the human X and Y.
So what’s going on with these “incipient sex chromosomes”?
Yu Q, Hou S, Hobza R, Feltus FA, Wang X, Jin W, Skelton RL, Blas A, Lemke C, Saw JH, Moore PH, Alam M, Jiang J, Paterson AH, Vyskot B, & Ming R (2007). Chromosomal location and gene paucity of the male specific region on papaya Y chromosome. Molecular genetics and genomics : MGG, 278 (2), 177-85 PMID: 17520292
This paper did two things: first, they used fluorescent in-situ hybridisation (FISH) to look at the bits of the chromosome that are sex-specific; and secondly they looked at the actual sequence in that region to see if it had features that are thought to be common to Y-chromosomes.
For both, they constructed some neat things called BACs – Bacterial Artificial Chromosomes. These are big bits of DNA that you trick bacteria into making for you. Normal sequencing methods only let you make bits that are a few kilo-bases long (bases being nucleotides, or “letters” of DNA). BACs can be upwards of 700kb, although 150kb is more typical.
So they made 5 BACs that were complementary to pieces of the male-specific Y region (MSY) of the papaya Y chromosome. Now – and this is the clever bit – they stick a probe onto a BAC that fluoresces a certain colour under UV light. Then, they stick the BACs into a cell and let them do what DNA does – find another, complementary copy of itself, and bind to it. This is how you can see where a particular sequence of DNA really is.
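If it helps to see the "find your complementary copy" idea spelled out, here is a toy sketch in Python. It is not anything from the paper: the sequences are made up, the function names are my own, and real hybridisation tolerates some mismatches whereas this only looks for exact matches. It just shows that a probe sticks wherever the target contains its reverse complement, and nowhere else.

```python
# Toy model of probe hybridisation (illustrative only; sequences are invented).
# A single-stranded probe can base-pair wherever the target strand contains
# the probe's reverse complement. Real FISH tolerates mismatches; this sketch
# assumes exact matching for simplicity.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT)[::-1]


def binding_sites(probe: str, target: str) -> list[int]:
    """Return 0-based start positions on `target` where `probe` can base-pair."""
    site = reverse_complement(probe)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Hypothetical stand-ins for a Y-specific stretch and its diverged X counterpart.
y_like = "TTGACCGTAGGCTAAC"
x_like = "TTGACCATAGACTAAC"

# A probe built to be complementary to the Y-specific stretch.
probe = reverse_complement("CCGTAGGC")

print(binding_sites(probe, y_like))  # one binding site on the Y-like sequence
print(binding_sites(probe, x_like))  # no binding site on the X-like sequence
```

In the real experiment the "probe" is an entire BAC of 100kb or more and the "target" is a whole chromosome spread out on a slide, but the logic of where the signal shows up is the same.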
What they found was that the BACs specific to the MSY only bound (hybridised) to ONE chromosome in a cell, but that other non-MSY BACs hybridised to TWO (ie. a matching pair of) chromosomes. (In the figure, somewhat confusingly, the MSY-specific BAC is green in one, and red in the other; and all the arrows are red; but hopefully you can see two spots of one colour, and one spot of the other in each frame. Unless you’re colour-blind. Sorry.)
This is pretty damn good evidence that the mapping was correct, and these BACs are indeed hitting the Y-chromosome – the odd one out. In another frame, they show that a different BAC gets a strong response out of one chromosome, and a much weaker response from another – that is thought to be the X chromosome, which still shares some sequence with the Y.
Using the same technique, they also established that the MSY BACs hybridised near the centromere. The centromere is a knot of proteins and DNA that many chromosomes have somewhere along their length, usually towards the middle, which plays a role in sending one of each pair in opposite directions during cell division. DNA is a physical thing as well as an abstract “information carrier”, and centromeres are a reminder of that. The areas around the centromere, known as pericentromeric regions, are typically gene-poor and full of repetitive sequences, and characteristically have very low recombination.
Huh, what else do we know that’s like that? *think, think*
The analysis of the sequence of the 5 BACs found… well, nothing unexpected. They found a lot of gypsy-type retroelements. Retroelements are a major source of repetitive sequences. They are kind of like viruses within our genomes (in fact some of them are, or were, viruses); they have mechanisms that allow them to copy themselves over and over again, and in different places. I can’t possibly do them justice here; you’ll have to go read ERV until you get it.
Anyway, the pericentromeric regions in plants typically have lots of gypsy-type retroelements. And so did this one. They found numerous small duplications (where the DNA copying machinery gets lost mid-sequence and puts a paragraph in twice – we’ve all been there). They found that three of the BACs had matching sequences to the Arabidopsis genome**, mostly in pericentromeric regions.
What I’m getting at here is that they mostly found that the pericentromeric region of this chromosome was like other Y chromosomes everywhere. But also like pericentromeric regions of plant chromosomes everywhere. High retroelements; high small duplications; low recombination.
Then they looked for functional genes. They searched their five MSY BACs against a database of known papaya genes. They searched the protein database of GenBank (the NIH-uber-database of all things genetic). They used Gene-Scan software to predict where functional genes ought to be, and attempted to find them in the RNA content of 3 different tissues.
Wind whistling through trees, tumbleweed blowing across road, nothing. Not a single functional gene.
That is very low gene content; zero in 714kb of sequence between the combined BACs. Even assuming that there were some not found, that is low. The authors have to resort to calling on unpublished data where they did find 4 genes in the MSY; they calculate the gene density at 1 gene per 257 kb; this compares to 1 gene per 294 kb in the human Y chromosome, which in all other respects is significantly more degenerate.
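To put those two figures on the same footing, here is the arithmetic as a few lines of Python. The only inputs are the numbers quoted above (1 gene per 257 kb and 1 gene per 294 kb); the function name and the genes-per-megabase unit are my own choices for illustration.

```python
# Convert "1 gene per N kb" into genes per megabase so the two figures
# quoted above can be compared directly. Inputs are the values from the text.

def genes_per_mb(kb_per_gene: float) -> float:
    """Gene density in genes per megabase, given kilobases per gene."""
    return 1000.0 / kb_per_gene


papaya_msy = genes_per_mb(257)  # papaya MSY, using the authors' unpublished 4 genes
human_y = genes_per_mb(294)     # human Y chromosome

print(f"papaya MSY: {papaya_msy:.2f} genes/Mb")
print(f"human Y:    {human_y:.2f} genes/Mb")
print(f"ratio:      {papaya_msy / human_y:.2f}")
```

So on these numbers the papaya MSY comes out at a little under 4 genes per megabase and the human Y at a little under 3.5: the same ballpark, despite the human Y being far older and more degenerate.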
Disappointingly, they did not compare this to the gene content of the equivalent region in the X chromosome, although this may have been because the X and Y were too diverged to easily map the corresponding region. Indeed, if I have a major criticism of this paper it is that they did not compare their BAC sequences to the matching regions of the X chromosome, or even to other pericentromeric regions in papaya; but I recognise that, given that divergence, it would have been a lot more work to find BACs of similar size and location.
So: what are we looking at here?
They found that their MSY BACs were characterised by features that are often found in both pericentromeric regions and in Y chromosomes. They found that yes, the location of the sex-determination locus in a pericentromeric region provided the initial restricted recombination required to allow the divergence of X and Y. They postulate in the Discussion section that this may mean that sex chromosomes universally originate in areas of restricted recombination; or that it might mean that it is only one of many ways in which the initial suppression of recombination might occur. They go on to discuss other early-stage sex chromosomes, such as those found in sticklebacks and medaka (both fish).
While I was reading this paper, I felt a little bit… unsatisfied. It didn’t feel like they were really finding anything new. The pericentromeric region/MSY had high “junk” and suppressed recombination. Big whoop. No one could have predicted that.
But then it hit me: THIS IS *SCIENCE*! This is how it’s supposed to work! The Y chromosome is showing EXACTLY the sort of behaviour you would expect a young Y chromosome to be showing – originating out of a non-related mechanism for suppression of recombination; accumulating junk and losing genes. Over time, we would expect the region of divergence to grow, and the whole chromosome to become morphologically distinct from the X – as appears to have happened in Silene latifolia (another dioecious plant with apparently older sex chromosomes).
This is why science works.
*Once there is some data available on a particular system/organism, it tends to attract more research, which generates more data. The whole thing snowballs very quickly, and before you know it you have a genome project on your hands.
**Arabidopsis is like the Walmart of the plant genetics world; sooner or later you end up there because nothing else is open, and they’ve got one of everything. Basically, it’s been used in genetic studies of all kinds since forever; it was the first plant genome published; and there are a bajillion different inbred lines being used in labs around the world.
Yu Q, Hou S, Hobza R, Feltus FA, Wang X, Jin W, Skelton RL, Blas A, Lemke C, Saw JH, Moore PH, Alam M, Jiang J, Paterson AH, Vyskot B, Ming R (2007) Chromosomal location and gene paucity of the male specific region on papaya Y chromosome. Mol Genet Genomics 278: 177-185.
Ming et al. (and seriously, al is like a gazillion other authors, sorry for not naming you all) (2008) The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature 452: 991-997.