Transcriptomics is generally associated with efforts to probe gene expression levels within total mRNA samples, an area of research that has yielded significant insights into processes such as carcinogenesis and cellular differentiation, especially following advances in high-throughput technology. However, there is more to transcriptomics than analysing the end product of transcription alone: the process itself also opens numerous avenues for exploration. The question therefore arises of how this complex process can be captured on a genome-wide scale. In a Method article in Genome Biology, Moshe Oren, Gilad Fuchs, Yoav Voichek and colleagues from the Weizmann Institute of Science, Israel, present a novel method for measuring genome-wide transcriptional elongation rates termed 4sUDRB-seq. Here Oren, Fuchs and Voichek discuss the challenges they faced during its development and the surprising results this method revealed.
What led you to develop the 4sUDRB-seq method?
We became interested in transcription elongation mainly due to our interest in chromatin biology. More specifically, we are interested in histone post-translational modifications (PTMs) localised to the transcribed regions of genes. We wanted to employ a suitable method in order to explore the role of those histone PTMs in transcription elongation at a high resolution (minute scale) and across the genome. We were quite surprised to discover that, at the time, a method to measure genome-wide transcription elongation rates did not exist.
How does this fit in with your earlier work?
Our lab didn’t really focus directly on transcription elongation rates in the past. However, we are interested in the role of chromatin in regulating gene expression. More specifically, we previously reported that histone H2B monoubiquitylation (H2Bub1) is needed for the induction of relatively long genes during stem cell differentiation (Molecular Cell, 2012, 46: 662–673). Based on this observation as well as on earlier studies in yeast and in cell-free transcription systems, we speculated that H2Bub1 is needed specifically for optimal transcriptional elongation during differentiation. We also recently identified the SWI/SNF remodeling complex as an H2Bub1 interactor, necessary for the transcription of a subset of genes, presumably through interaction with H2Bub1 within the transcribed regions (Cell Reports, 2013, Aug 15, 4, 3:601-8). Furthermore, we found that H2Bub1 can negatively affect transcription elongation of specific genes by preventing the binding of the elongation factor TFIIS (Molecular Cell, 2011, May 20, 42, 4:477-88). Now, by using the 4sUDRB-seq method we can assess more specifically the impact of each factor on genome-wide transcriptional elongation and initiation rates.
What were the biggest challenges you encountered while developing the 4sUDRB-seq method?
Since we wanted to measure the elongation rates of as many genes as possible, we decided to perform the measurements four and eight minutes after removal of the reversible transcription inhibitor DRB. Working with these very short time points and obtaining reproducible results was quite a challenge. We actually calculated the time it takes to open the incubator door and take out the plates for harvesting. An additional challenge was a computational one: we needed to employ an algorithm that can identify very accurately, for each gene, the front edge of the advancing transcription wave. However, we were not sufficiently pleased with the algorithms that are commonly used for that purpose, and we therefore decided to develop a completely new method. After extensive optimisation and comparison of different approaches, we concluded that a logic based on estimation of the local background of each gene gave the best results.
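To make the idea of a local-background logic concrete, the following Python fragment is a minimal sketch and not the algorithm published with 4sUDRB-seq. It assumes binned read coverage for one gene, starting at the transcription start site, estimates background as the median of the most downstream bins of that same gene, and calls the wave front where coverage first stays at or below a multiple of that background. The function name, the bin layout and all thresholds (`background_fraction`, `fold`, `min_run`, `min_background`) are hypothetical choices made for illustration.

```python
from statistics import median


def find_wave_front(coverage, bin_size_bp, background_fraction=0.25,
                    fold=3.0, min_run=3, min_background=1.0):
    """Return the approximate distance (bp) from the TSS to the wave front.

    coverage: per-bin read counts for one gene, ordered from the TSS
    downstream (an assumed input layout, not the authors' data format).
    Returns None when no front can be called.
    """
    n_bins = len(coverage)
    # Local background: median of the last `background_fraction` of the gene,
    # which is assumed not to have been reached by the wave yet.
    tail_start = int(n_bins * (1 - background_fraction))
    background = median(coverage[tail_start:])
    threshold = fold * max(background, min_background)

    run = 0  # number of consecutive bins at or below the threshold
    for i, value in enumerate(coverage):
        if value > threshold:
            run = 0
            continue
        run += 1
        if run == min_run:
            front_bin = i - min_run + 1  # first bin of the low-signal run
            if front_bin == 0:
                return None  # no signal above local background at the TSS
            return front_bin * bin_size_bp
    return None  # signal never dropped to background within the gene


# Hypothetical coverage in 1 kb bins: signal for six bins, then background.
example = [50, 48, 52, 47, 45, 40, 3, 2, 3, 2, 2, 3, 2, 3, 2, 2]
front_bp = find_wave_front(example, bin_size_bp=1000)
```

Requiring several consecutive low bins (`min_run`) is one simple way to avoid calling the front at an isolated dip in coverage; a gene in which the wave has already reached the region used for background estimation returns `None` rather than a misleading position.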
Were you surprised by any of the results you obtained on transcription elongation rates and initiation frequencies?
We were a bit surprised by the fact that dimethylation of histone H3 on lysine 79 (H3K79me2) was significantly correlated with the transcription elongation rate. Methylations are usually quite stable modifications. It therefore seems more probable that the local levels of H3K79me2 are not a direct consequence of the dynamic elongation by RNA polymerase II, but rather that H3K79me2 has a regulatory impact on elongation rate. It will be interesting to address this notion by depleting Dot1 (the enzyme that methylates H3K79) in vivo or testing the elongation rate in vitro in an H3K79-methylated nucleosome array.
Another pleasant surprise was that the method was found to be highly reproducible. Since we were dealing with relatively short measurements (four and eight minutes) we expected that in each biological repeat the transcription wave would not reach exactly the same point within each gene. However, we were happy to see that the variation between biological repeats was lower than we had anticipated.
Was there anything you wanted to incorporate into the 4sUDRB-seq method but could not due to technical, resource or time limitations?
Since we sequenced all tagged newly transcribed RNA (without poly(A) selection), part of our reads originated from rRNA. We would have liked to add an rRNA depletion step in order to reduce rRNA contamination. However, due to time constraints and concerns that we might end up with too little RNA for the final RNA-seq analysis, we have not done it so far.
An exciting possibility that may not be presently feasible is to use this method in order to measure transcription elongation rates and initiation frequencies at single cell resolution. It will be very interesting to figure out how similar or different elongation rates and initiation frequencies are in different cells within a population.
What’s next for your research using the 4sUDRB-seq method?
We are now contemplating various directions in which the method can be implemented. One major direction is to figure out whether, and to what extent, differential gene expression can be regulated at the level of transcription elongation rates. Specifically, we would like to identify biological conditions where transcription elongation rates might be altered in a manner that affects the biological outcome. One such example is differentiation of embryonic stem cells. Since we and others have reported that the chromatin of embryonic stem cells is significantly altered during differentiation (we observed a significant increase in H2Bub1 levels), we would like to test whether such changes affect the transcriptional elongation rates in the differentiated cell in a manner that makes it more suitable for its new functions.
An additional direction is to use this method in order to test if changes in elongation rates contribute to specific pathologies. For example, aggressive acute leukaemia is known to be driven by translocations between the MLL gene and various components of the super elongation complex (SEC). It will be interesting to use the 4sUDRB-seq method in order to test whether the fusion between MLL and SEC components drives leukaemia through enhancing or decreasing the transcription elongation rates of specific genes.
A similar method was published by Artur Veloso, Mats Ljungman and colleagues in Genome Research. What are the main differences between the methods?
The two methods are indeed similar in concept, although they vary in technical detail (e.g. the use of bromouridine (BrU) versus 4-thiouridine (4sU) for labelling nascent RNA). Our calculated elongation rates are generally faster than those deduced by Veloso and colleagues. This might be due to several reasons. One possibility is that in the HeLa cells employed by us, transcription elongation is faster than in the cell lines examined by Ljungman’s group. We note, however, that a previous study that also used DRB but without using a nucleotide analogue has estimated the average transcription elongation rate to be ~3.8 Kb/min in both Tet-21 and HEK293 cells (Nat Struct Mol Biol. 2009 Nov;16(11):1128-33). Incidentally, this is very similar to the value deduced in our study (3.6 Kb/min). We also note that Ljungman’s group measured relatively similar transcription elongation rates in five different cell lines.
An additional possibility is that different regions of the same gene are transcribed at different rates. This was actually suggested by Danko and colleagues (Mol Cell. 2013 Apr 25;50(2):212-22). Since we measured transcription elongation four and eight minutes after DRB removal while Veloso’s measurements were performed ten and 20 minutes after DRB removal, the two studies actually assessed elongation rates in different regions of the same genes. However, initial analysis of recent measurements that we performed eight and 12 minutes after DRB removal indicates that the elongation rates within these more downstream regions are quite similar to those measured at four and eight minutes after DRB removal, which would imply that elongation rates are relatively constant throughout genes.
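The arithmetic behind this comparison can be illustrated with a short Python sketch. It is not taken from either study: it assumes a constant elongation rate and that the transcription wave starts at the moment of DRB removal, and the function names are hypothetical. The only figure borrowed from the interview is the 3.6 Kb/min average.

```python
def elongation_rate_kb_per_min(front_early_kb, front_late_kb,
                               t_early_min, t_late_min):
    """Rate from the wave-front positions at two time points after DRB removal."""
    return (front_late_kb - front_early_kb) / (t_late_min - t_early_min)


def region_covered_kb(rate_kb_per_min, t_early_min, t_late_min):
    """Stretch of the gene (kb from the TSS) traversed between the two time points,
    assuming a constant rate and no lag after DRB removal."""
    return (rate_kb_per_min * t_early_min, rate_kb_per_min * t_late_min)


# Hypothetical wave-front positions for one gene at four and eight minutes.
rate = elongation_rate_kb_per_min(14.4, 28.8, 4, 8)

# Regions probed by the two sampling schemes at the 3.6 Kb/min average.
window_4_8 = region_covered_kb(3.6, 4, 8)
window_10_20 = region_covered_kb(3.6, 10, 20)
```

Under these assumptions the four and eight minute time points probe roughly 14 to 29 kb downstream of the transcription start site, whereas ten and 20 minutes would probe roughly 36 to 72 kb, so the two designs sample non-overlapping stretches of a long gene.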
Lastly, as noted above, Veloso used BrU in order to tag nascent RNA, while we used 4sU. It is possible that different nucleotide analogues affect the RNA polymerisation rate differently.
In addition to the possible impact of the technical differences between the methods, we believe that some of the apparent incongruence between our conclusions and those of Veloso and colleagues may stem from differences in the algorithms employed to determine the exact position of the elongation wave front. Of note, we measured elongation rates for several genes by qRT-PCR analysis, and the values obtained were in good agreement with those calculated by us from the 4sUDRB-seq analysis.
In addition, we also provide a method to calculate transcription initiation frequencies. We believe that the ability to determine transcription elongation rates and initiation frequencies in a single experiment is advantageous for understanding more precisely how specific transcription factors regulate gene expression. So far, when researchers deplete a specific transcription factor and perform RNA-seq, they cannot discriminate at which stage of the transcription process this specific factor has a role. Hence, combining 4sUDRB-seq with depletion of a specific transcription factor can provide a more comprehensive understanding of its role. Lastly, we believe that our method to calculate transcription initiation frequencies can also be applied successfully to data generated by Veloso and colleagues.