Henry L. Levin, PhD, Head, Section on Eukaryotic Transposable Elements
Angela Atwood-Moore, BA, Senior Research Assistant
Tracy Ripmaster, PhD, Research Assistant
Atreyi Chatterjee, PhD, Visiting Fellow
Hirotaka Ebina, PhD, Visiting Fellow
Young-Eun Leem, PhD, Visiting Fellow
Xu Lin, PhD, Visiting Fellow
Anasua Majumdar, PhD, Visiting Fellow
Adam Evertts, BA, Postbaccalaureate Fellow
Marc Heincelman, BA, Postbaccalaureate Fellow
Robert Judson, BA, Postbaccalaureate Fellow
Christopher Plymire, BA, Postbaccalaureate Fellow
Ryan Rampersaud, BA, Postbaccalaureate Fellow

Retroviral diseases such as AIDS and leukemia have intensified the need to understand the mechanisms of retrovirus replication. Our primary objectives are to understand how reverse transcription of viral mRNA occurs and how cDNA products are integrated into the genome of infected cells. Owing to their similarity to retroviruses, long terminal repeat (LTR) retrotransposons are important models for retrovirus replication. The retrotransposon under study in our laboratory is the Tf1 element of the fission yeast Schizosaccharomyces pombe. During the synthesis of cDNA, reverse transcriptase (RT) generates a series of highly specific intermediates. We identify which amino acids of RT recognize specific cDNA intermediates. Our experiments allow us to attribute specialized functions to specific amino acids that are widely conserved. Once in the nucleus, the cDNA of retroviruses and retrotransposons is integrated into specific targets in the host chromosomes. Tf1 has a strong integration preference for pol II promoters. The similarity between this pattern and the integration preference of HIV-1 motivates us to study the structures of Tf1 IN and the promoters responsible for this intriguing mechanism.
Integration preference of Tf1 for Pol II promoters
Leem, Ripmaster, Heincelman, Levin; in collaboration with Hoffman
Our analysis of the genome sequence of S. pombe revealed a strong clustering of pre-existing LTRs associated with the 5′ end of open reading frames (ORFs). Experiments based on the production of new integration events revealed that this association was the result of integration preference. To define the determinants of the target sites, we developed an in vivo assay for integration by using a plasmid that contained ade6 as the target and a plasmid with Tf1 that induced transposition. The version of Tf1 that we expressed in our experiments contained a neo gene that causes target plasmids with insertions to gain resistance to kanamycin. When Tf1-neo was expressed, the plasmid with ade6 served as an efficient target for integration. We isolated 50 separate insertions in the intact target plasmid and found that 95 percent occurred within a 160 nt region in the ade6 promoter. To determine which sequences of ade6 were required for efficient integration, we created a series of 10 deletions within the target plasmid. The analysis revealed that the 160 nt region of the promoter was the only sequence required for efficient integration. We asked whether promoter activity was required for integration by measuring transcript levels of ade6. Deletions of sequence on either side of the 160 nt region caused five- to ten-fold reductions in ade6 mRNA. Nevertheless, the deletions caused no reduction in integration efficiency. The results indicated that transcription was not important for target site activity. We next considered whether transcription factors themselves were directing the integration of Tf1. To identify positions at which factors bind in the promoter of ade6, we used micrococcal nuclease mapping and observed a strong correlation between micrococcal nuclease-sensitive sites and the positions of the prominent insertion sites. Our observations suggested that transcription factors played a role in directing Tf1 integration.
Hoffman and colleagues showed previously that the transcription factor Atf1p binds to and activates the promoter of fbp1. Using the target plasmid assay, we tested whether the promoter of fbp1 is a target of Tf1 integration. We found that the fbp1 promoter was a target for Tf1 insertion and that the majority of the insertions occurred 40 nt from the position at which Atf1p binds. A mutation that blocks the binding of Atf1p caused a significant reduction in Tf1 integration at the promoter of fbp1. The data indicate that Atf1p is responsible for targeting Tf1 to specific insertion sites in the fbp1 promoter.
The domains of Tf1 IN
Ebina, Hizi,¹ Lin, Levin
To combat HIV-1, there is great need for new drugs with lowered toxicity and greater potency. Although drugs that inhibit the protease and reverse transcriptase of HIV-1 are in use, inhibitors of integrase (IN) are not yet available. Designing anti-IN drugs is difficult in part because no three-dimensional structure exists for an intact IN. Our objective is to determine whether Tf1 IN can serve as a model for HIV-1 IN and whether it is amenable to crystallographic analysis. The IN of Tf1 contains the Zn-binding and catalytic motifs present in the INs of retroviruses. In addition, the C-terminus of Tf1 IN possesses a GPY domain and a chromodomain. The GPY domain is a prominent motif present in retrotransposons related to Tf1 and in gamma retroviruses such as Moloney murine leukemia virus. Although the GPY motif is widespread, its function is unknown. We previously found that Tf1 IN can be purified in large quantities when expressed in bacteria and that the protein has both the processing and strand-transfer activities typical of retrovirus INs. Perhaps our most significant observation was that Tf1 IN was substantially more soluble than the INs of retroviruses. We obtained concentrations of 22 mg/ml without any notable precipitation.
The structural domains of HIV-1 IN are the N-terminal domain containing the Zn-binding structure, the catalytic core, and the C-terminal domain with DNA-binding activity. To compare the structure of Tf1 IN to that of the HIV-1 protein, we subjected Tf1 IN to partial proteolysis. A time course of IN treated with trypsin revealed three dominant fragments. The products were purified and their N-termini sequenced. One fragment contained the Zn finger-like motif and corresponded to the Zn-binding domain of HIV-1 IN. Another fragment constituted a stable domain containing the catalytic residues and therefore corresponded to the catalytic core of HIV-1 IN. The last fragment corresponded to the C-terminal domain of HIV-1 IN. The results indicated that the domain structures of the Tf1 and HIV-1 INs share significant similarities.
An analysis of the domains in HIV-1 IN led to the surprising discovery that the catalytic core was sufficient for catalytic activity. Our detection of activity relied on an assay that measures the reverse of integration called disintegration. To test whether the catalytic activity of Tf1 IN was also compartmentalized, we assayed each region of the protein. Fifteen proteins derived from Tf1 IN were purified and assayed for disintegration activity. As described in our report last year, IN lacking the chromodomain (CH-) has substantially more activity than the full-length IN. In comparison, the catalytic core exhibited no activity. To determine which regions of the IN were necessary for activity, we generated a set of sequential truncations. Each deletion in the N-terminus of CH- rendered the protein inactive. Similarly, deletions in the C-terminus of the CH- protein inactivated the enzyme. Thus, the Zn-binding motif in the N-terminal domain and the GPY motif in the C-terminal domain were necessary for catalytic activity. This finding stands in sharp contrast to the IN of HIV-1, in which the central domain is sufficient for activity.
The INs of retroviruses form multimers, including dimers, tetramers, and larger assemblies of subunits. To test whether Tf1 IN forms multimers, we injected a sample of the purified IN onto a Superdex 200 column in an AKTA chromatography system. The profile produced by the UV monitor showed a single peak that was largely symmetric. We relied on protein standards to calculate an observed molecular weight for IN of 122.3 kDa, which is very close to 115 kDa, the expected weight of a dimer. Thus, under the conditions tested, Tf1 IN forms a stable dimer. Taken together, the structure of its domains and its formation of a dimer complex indicate that Tf1 IN is an important model for the IN of HIV-1.
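The comparison of the observed and expected molecular weights can be written out as a short calculation. The following is a minimal Python sketch, not the analysis actually used in the laboratory; the function name is hypothetical, and the monomer weight is an assumption inferred by halving the 115 kDa expected weight of a dimer quoted above.

```python
def oligomeric_state(observed_kda, monomer_kda):
    """Return the observed/monomer weight ratio and the nearest whole number of subunits."""
    ratio = observed_kda / monomer_kda
    return ratio, round(ratio)


# Values taken from the text; the monomer weight is inferred (115 kDa dimer / 2).
OBSERVED_KDA = 122.3
EXPECTED_DIMER_KDA = 115.0
MONOMER_KDA = EXPECTED_DIMER_KDA / 2

ratio, subunits = oligomeric_state(OBSERVED_KDA, MONOMER_KDA)

# Relative deviation of the observed weight from the expected dimer weight.
deviation = (OBSERVED_KDA - EXPECTED_DIMER_KDA) / EXPECTED_DIMER_KDA

print(f"subunits per complex (nearest integer): {subunits}")
print(f"observed/monomer ratio: {ratio:.2f}")
print(f"deviation from expected dimer weight: {deviation:.1%}")
```

Under these assumptions the observed weight lies within about 6 percent of the dimer value and well away from the weights expected for a monomer or a tetramer.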
The self-primer of Tf1 is not removed during reverse transcription
Atwood-Moore, Judson, Levin
The reverse transcription of retroviruses and LTR retrotransposons is primed by specific tRNA species for minus strand initiation and by polypurine tracts (PPT) for plus strand initiation. During reverse transcription, the RNase H activity of RT removes the tRNA and PPT primers from the 5′ ends of the cDNA so that their sequences are not copied into the 3′ termini of the complementary strand of cDNA. This process is critical because the addition of nucleotides after the conserved "CA" at the 3′ ends of the cDNA would block integration.
Tf1 uses a unique mechanism of self-priming to initiate reverse transcription. Instead of using a tRNA, Tf1 primes minus strand synthesis with an 11-nucleotide RNA removed from the 5′ end of its own transcript. An increasing number of LTR elements in eukaryotes from yeast to vertebrates have been found to use this self-priming mechanism. A recent study of mutations in the RT of Tf1 revealed that RNase H does remove the PPT from the 5′ end of the plus strand cDNA. Random mutagenesis of RT resulted in a cluster of mutations in RNase H that inhibited the removal of the PPT without reducing the amount of cDNA produced. It was not surprising that RNase H was responsible for primer removal because the PPT of Tf1 is similar to those of other LTR elements. However, it is not known whether, owing to its unique nature, the self-primer of Tf1 is also removed by RNase H.
To determine whether the self-primer of Tf1 was removed during reverse transcription, we sequenced the ends of cDNA produced by Tf1. We purified the cDNA from virus-like particles and determined the sequence of the cDNA at the 3′ end of the plus strand by ligating an oligonucleotide to the cDNA and using a complementary oligonucleotide to amplify the terminal sequences by PCR. By far, the most common cDNA contained the entire self-primer. The self-primer sequence found at the 3′ end of the plus strand was undoubtedly templated during reverse transcription by the primer itself, which is present at the 5′ end of the minus strand. The data demonstrated that the self-primer had not been removed from the cDNA before RT completed the synthesis of the plus strand. These findings are particularly surprising because they differ from the results associated with the processing of tRNA primers. Evidence from many laboratories demonstrates that tRNA primers of both retroviruses and retrotransposons are efficiently removed by RNase H before completion of reverse transcription of the termini. The removal of the tRNA primers is essential for positioning the "CA" dinucleotide at the 3′ end of the plus strand where it can be recognized by IN. Although the processing activity of INs that removes nucleotides 3′ of the "CA" is capable of removing one or two nucleotides, it cannot remove the extensive sequences of the tRNA primers. Thus, cDNAs that retain the tRNA primers would be inactive for integration.
The presence of the self-primer on the 3′ end of the cDNA suggests that Tf1 IN may have a novel processing activity capable of removing the 11 nt primer. Consistent with this model is the recent finding that the IN of Tf1 does possess a processing function that is capable of removing several nucleotides. We tested whether recombinant IN was capable of removing the self-primer from the 3′ ends of model substrates. The oligonucleotide substrates mimicked the U5 end of the LTR. While we were able to detect precise removal of the intact self-primer when present as a single-stranded DNA extension, we were unable to detect specific removal of the self-primer when present as a double-stranded extension of a DNA/DNA duplex or an RNA/DNA duplex. The possibility remains that RNase H and IN function together during integration to remove the complement of the self-primer when annealed to the primer itself. A growing body of experimental evidence indicates that retroelement IN and RT proteins cooperate during their replication. Based on the dominant levels of cDNA with the primer, we propose that IN or IN with RT removes the primer.
The Hermes transposon of the house fly is highly active in S. pombe
Plymire, Evertts, Levin
Integration of Tf1 occurs primarily into pol II promoters. Although we currently believe that such preference is the result of a mechanism that actively targets Tf1, it is possible that the insertion bias is attributable to greater accessibility at the promoter sequences. We are currently testing such a possibility by studying in S. pombe cells the integration pattern of Hermes, a "cut and paste" transposon isolated from the house fly. Given that Hermes propagates in a host that is evolutionarily distant from S. pombe, a mechanism that would actively position insertion sites probably does not exist. Thus, any integration of Hermes in S. pombe would likely occur at positions accessible to the transposase. In addition, unbiased activity of a transposon in S. pombe could be widely adapted as a tool for random mutagenesis. No methods currently exist for transposon mutagenesis of S. pombe.
The transposase of Hermes was expressed in S. pombe by fusing its gene to the promoter of nmt1. We used three versions of the nmt1 promoter to express various levels of the transposase. Immunoblots of cell extracts demonstrated that the transposase was expressed in a stable form. Cells that expressed the transposase also contained a plasmid-encoded copy of neo flanked by the terminal inverted repeats (TIRs) of Hermes. We tested the ability of the transposase to cut out neo with the TIRs and insert the DNA into the S. pombe genome. Once the transposase was expressed, cells were grown on medium containing 5-fluoro-orotic acid, a treatment that removes the plasmid carrying the Hermes TIRs and neo. We then grew the strains on agar plates containing G418 to select for cells that had transposed copies of Hermes. The strains expressing the transposase generated surprisingly high levels of resistance to G418. We analyzed 30 independent strains that became G418-resistant for potential insertion events. It was significant that each strain had acquired a copy of Hermes as the result of a bona fide integration event. Analysis of the inserted copies revealed that 60 percent of them disrupted ORFs. Given that 60 percent of the S. pombe genome is coding sequence, our results indicate that the insertion of Hermes is largely random, a finding that stands in strong contrast to the integration of Tf1, where virtually none of the inserts occurs in ORFs. The data indicate that the integration bias of Tf1 is not the result of access to insertion sites. In addition, we found that 60 percent of cells induced for the expression of Hermes transposase acquired an insertion. This high frequency of insertion indicates that Hermes can be used as a tool for the random disruption of S. pombe genes.
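The reasoning that Hermes insertion is largely random can be illustrated with an exact binomial test against the 60 percent coding fraction of the genome. The following is a minimal Python sketch, not the analysis performed in the study. It assumes that the 60 percent figure corresponds to 18 of the 30 analyzed insertions, and the Tf1 comparison (0 of 30 insertions in ORFs) is a hypothetical stand-in for "virtually none" rather than a measured count. The function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_two_sided_p(k_obs, n, p):
    """Exact two-sided p-value: total probability of outcomes no more likely than the observed one."""
    pmf_obs = binom_pmf(k_obs, n, p)
    threshold = pmf_obs * (1 + 1e-9)  # small relative tolerance for floating-point ties
    total = sum(
        binom_pmf(k, n, p) for k in range(n + 1) if binom_pmf(k, n, p) <= threshold
    )
    return min(1.0, total)


CODING_FRACTION = 0.60  # fraction of the S. pombe genome that is coding sequence (from the text)
N_INSERTIONS = 30       # independent G418-resistant strains analyzed (from the text)

# Assumption: 60 percent of 30 insertions = 18 insertions in ORFs.
p_hermes = binom_two_sided_p(18, N_INSERTIONS, CODING_FRACTION)

# Hypothetical contrast: an element with no ORF insertions among the same number of events.
p_tf1_like = binom_two_sided_p(0, N_INSERTIONS, CODING_FRACTION)

print(f"Hermes-like pattern (18/30 in ORFs): p = {p_hermes:.3f}")
print(f"Tf1-like pattern (0/30 in ORFs, hypothetical): p = {p_tf1_like:.2e}")
```

Under these assumptions the Hermes pattern is indistinguishable from random insertion with respect to ORFs, whereas a pattern with no ORF insertions would be extremely unlikely to arise by chance. A test of this kind addresses only the ORF versus non-ORF distribution and says nothing about finer-scale sequence preferences.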
¹ Amnon Hizi, PhD, former Oak Ridge Senior Fellow
COLLABORATOR
Charles Hoffman, PhD, Boston College, Newton, MA
For further information, contact henry_levin@nih.gov.

