MOLECULAR GENETICS OF RETROTRANSPOSONS
Henry L. Levin, Ph.D., Principal Investigator
Retroelements are a large class of genetic elements that multiply by the reverse transcription of an RNA intermediate. The resulting cDNA is incorporated into the genome of host cells. In eukaryotes, the widespread success of long terminal repeat (LTR)-containing retroelements has led to replication mechanisms that are conserved among diverse families of retrotransposons and retroviruses. The medical importance of retroviruses such as HIV has intensified the need to understand the molecular details of the mechanisms responsible for the propagation of LTR retroelements. Given that LTR-retrotransposons exist in yeast, the powerful techniques of yeast genetics can be applied to answer basic questions about the function of LTR-retroelements, perhaps yielding fundamental information that may identify new antiviral targets or strategies to combat the spread of retroviruses such as HIV.

The Section on Eukaryotic Transposable Elements is studying the retrotransposon Tf1, which is found in the fission yeast Schizosaccharomyces pombe, with the aim of understanding the molecular mechanisms of reverse transcription, transport of Tf1 into the nucleus, and integration of Tf1 cDNA into the host genome. Since the process of integration has the potential to compromise the fitness of the host, we wish to understand the balance between the ability of the transposon to insert into the host genome and the efforts of the host to maintain its viability.

The recently completed sequence of the S. pombe genome has allowed us to ask whether Tf1 integration results in the disruption of host genes. Much of what is known about the insertion sites of LTR-retrotransposons indicates that the process of integration is specifically controlled to avoid the disruption of host genes. In fact, all five transposons of Saccharomyces cerevisiae select sites for integration that lack coding sequences for host genes.
Ty1, Ty2, Ty3, and Ty4 integrate into gene-poor regions associated with the 5' ends of tRNA genes. Ty5 avoids the disruption of host genes by selecting regions of silent chromatin for its integration. Much less is known about the interactions between the genome of S. pombe and its transposons. The recent completion of the genome sequence of S. pombe allowed us to study the full set of transposon sequences and their relationship to the host genes.

Transposon Content of S. pombe

On a larger scale, the LTRs were found to be widely distributed throughout each of the three chromosomes of S. pombe. However, it was particularly surprising that the concentration of LTRs on chromosome 3 was twice that of the other two chromosomes. The position of Tf LTRs in intergenic regions and their high density in chromosome 3 could be the result of either specific biases in the removal of LTRs or preferences in the selection of insertion sites. Our recently completed study of transposition events allowed us to distinguish between these two possibilities.

Association of Tf LTRs with Chromosome 3 and with Intergenic Segments Is the Result of Preferences in the Selection of Insertion Sites

An analysis of the intergenic regions disrupted by the insertions revealed a strong preference for sequences between gene pairs that were transcribed in either divergent or tandem directions. While 18 insertions occurred between divergent genes and 32 occurred between tandem genes, none of the insertions occurred between genes transcribed in convergent directions. Given that 1,299 gene pairs in the genome of S. pombe are divergent, 1,302 are convergent, and 2,289 are tandem, we predicted that 13.3 of our inserts should be in divergent regions, 13.8 in convergent regions, and 24 in tandem regions. These results are surprising in that we expected a greater lack of insertion between a convergent pair of genes. However, this calculation does not reflect the fact that the regions between divergent and tandem genes are, on average, larger than those between convergent genes. If we take into consideration that the average size of intergenic regions between divergent genes is 1.34 kb and that the average sizes of regions between tandem and convergent genes are 0.97 and 0.56 kb, respectively, we would expect that unbiased insertion of the 51 events into intergenic regions would produce 18.9 insertions between divergent pairs, 24 between tandem pairs, and 7.6 between convergent pairs. Since no inserts were found between convergent genes, the insertion of Tf1-ori/neo demonstrated a strong bias against intergenic regions associated with convergent genes.

To identify the sequences within the intergenic regions that are recognized by Tf1, we mapped the distance from each insertion to the 5' and 3' ends of the adjacent coding sequences. Seventy-four percent of the insertions were closer to the 5' ends of genes than to the 3' ends. Even though the association with the 5' ends of genes occurred with distances of up to 1.54 kb, significant clustering occurred within 300 nucleotides of the start of translation.
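The expected insertion counts quoted above can be approximated with a short calculation. The sketch below is illustrative only and is not the analysis used in the study: it assumes that unbiased insertion distributes events in proportion either to the number of gene pairs in each class or to the total intergenic length of each class (number of pairs multiplied by the mean intergenic size). All function and variable names are hypothetical. The output is close to, but not identical with, the figures in the text; the small differences may reflect rounding or a slightly different event count. The final line adds a simple binomial estimate, which is not part of the original report, of how likely it would be to see no convergent insertions under the length-weighted model.

```python
# Gene-pair counts and mean intergenic sizes (kb) as quoted in the text.
PAIR_COUNTS = {"divergent": 1299, "convergent": 1302, "tandem": 2289}
MEAN_SIZE_KB = {"divergent": 1.34, "convergent": 0.56, "tandem": 0.97}
OBSERVED = {"divergent": 18, "convergent": 0, "tandem": 32}


def expected_inserts(n_events, weights):
    """Distribute n_events across classes in proportion to the given weights."""
    total = sum(weights.values())
    return {k: n_events * w / total for k, w in weights.items()}


def prob_no_hits(n_events, fraction):
    """Chance that none of n_events independent inserts lands in a class that
    makes up the given fraction of the target (assumed binomial model)."""
    return (1.0 - fraction) ** n_events


n_events = 51  # number of intergenic events quoted in the text

# Model 1: every gene pair is an equally likely target.
by_count = expected_inserts(n_events, PAIR_COUNTS)

# Model 2: targets weighted by total intergenic length (pairs x mean size).
length_kb = {k: PAIR_COUNTS[k] * MEAN_SIZE_KB[k] for k in PAIR_COUNTS}
by_length = expected_inserts(n_events, length_kb)

conv_fraction = length_kb["convergent"] / sum(length_kb.values())
p_zero = prob_no_hits(n_events, conv_fraction)

for k in PAIR_COUNTS:
    print(f"{k:>10}: observed {OBSERVED[k]:2d}, "
          f"by count {by_count[k]:5.1f}, by length {by_length[k]:5.1f}")
print(f"P(no convergent inserts | length model) = {p_zero:.1e}")
```

Under the length-weighted model, convergent regions still make up roughly 15 percent of the intergenic target, so an outcome with no convergent insertions among about 50 events is very unlikely by chance, which is consistent with the bias described in the text.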
To determine whether the transposition required specific classes of intergenic spaces, we compared the sizes of the intergenic regions disrupted by Tf1-ori/neo with the average sizes in the genome. For the transposition events that occurred between tandem gene pairs, the average size of the intergenic space was 1.44 kb, which was larger than 0.97 kb, the average intergenic space between tandem genes within the genome. The average size of the divergent intergenic spaces that received inserts was 1.55 kb, which was somewhat larger than 1.3 kb, the genome average for divergent spaces. In summary, Tf1 integration occurred in intergenic regions that were larger than average.

Perhaps the most striking result was the observation that Tf1 integration had a significant preference for chromosome 3. Per unit length of DNA, chromosome 3 received approximately twice as many inserts as chromosome 1 or 2. This preference could not be attributed to differences in the distribution or composition of intergenic sequences among the three chromosomes. Our results demonstrate that chromosome 3 was, in some physiological aspect, distinct from the other chromosomes, perhaps because of a unique form of chromatin structure or the presence of chromosome-specific factors. One possible role for the chromosome 3 preference is to equalize the probability of insertion among the chromosomes. Given that chromosome 3 is about half the size of chromosomes 1 and 2, an equal number of insertions per chromosome would cause chromosome 3 to have twice the density of events. Support for such a model comes from the observation that a similarly equalized process has been observed for the Ty elements of S. cerevisiae. Despite differences in size of as much as six-fold, the three smallest chromosomes have approximately the same number of Ty1 elements per tRNA gene as the three largest chromosomes.
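The equalization argument above amounts to a one-line calculation, sketched here for illustration. The relative chromosome sizes are hypothetical round numbers chosen only to match the statement that chromosome 3 is about half the size of chromosomes 1 and 2; they are not measured lengths, and the insert count per chromosome is arbitrary.

```python
def insertion_density(inserts, relative_sizes):
    """Return inserts per unit length for each chromosome."""
    return {c: inserts[c] / relative_sizes[c] for c in relative_sizes}


# Hypothetical relative sizes: chromosome 3 is taken as half the size
# of chromosomes 1 and 2, as stated in the text.
relative_sizes = {"chr1": 2.0, "chr2": 2.0, "chr3": 1.0}

# Assumed equalized model: the same (arbitrary) number of inserts per chromosome.
equal_inserts = {c: 10 for c in relative_sizes}

density = insertion_density(equal_inserts, relative_sizes)
ratio = density["chr3"] / density["chr1"]
print(f"chr3 density relative to chr1: {ratio:.1f}x")
```

With equal insert numbers per chromosome, the density ratio depends only on the size ratio, so a chromosome of half the size shows twice the density regardless of the absolute number of events.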
Functions of RT of Tf1 that Are Specifically Required after DNA Synthesis Is Initiated

The Domains of Nup124p that Contribute to the Nuclear Import of Tf1
To investigate the contribution of host factors to the transposition of Tf1, we conducted large-scale screens for mutations in host genes that cause reduced Tf1 mobility. Interestingly, we identified one gene that encodes Nup124p, a nuclear pore factor that possesses a specific activity required for the nuclear import of Tf1 cDNA and protein. The protein contains an N-terminal domain that interacts with the Gag of Tf1. The C-terminal domain of Nup124p contains 11 copies of FXFG, a motif associated with the binding of transport receptors. The repeats are predicted to contribute to the nuclear import of Tf1 by binding to transport receptors associated with Gag and IN.

To investigate which sequences of Nup124p contribute to Tf1 activity and to its association with the nuclear pore complexes, we generated an extensive series of deletions, some of which removed individual groups of the FXFG repeats. All of these alleles of nup124 retained their ability to support Tf1 transposition. Surprisingly, when all FXFG repeats were removed, transposition activity was severely reduced. The results indicate that, although no specific set of FXFG repeats is required for transposition, a threshold number must be present. Interestingly, the removal of FXFG repeats did not reduce the association of Nup124p with the nuclear pores.

We tested whether the domain of Nup124p that interacts with Gag is required for transposition. By deleting different segments of the N-terminal domain, we identified a segment of 300 residues of Nup124p that is necessary for Tf1 transposition. The allele lacking this segment retains its association with the nuclear pores. We are currently testing whether the 300 residues are necessary for the interaction with Gag.

The deletion of residues in the C-terminus of Nup124p revealed a domain that is required for its association with nuclear pores. The deletion of as few as 10 residues from the C-terminus of Nup124p disrupted the association with the nuclear pores, as indicated by immunofluorescence microscopy. As expected, the removal of the 10 residues from the C-terminus of Nup124p caused a severe defect in Tf1 transposition.
PUBLICATIONS
a V. Wood, The Sanger Centre, Cambridge, U.K.