

This report was generated with RLSeq v0.9.0.

Sample information

Sample name: TC32 DRIP-Seq

Sample type: DRIP

Label: POS

Genome: hg38

Time: Wed Sep 29 11:42:15 2021


1. RLFS Analysis

Z-Score distribution

R-loop forming sequences (RLFS) were compared to the ranges in TC32 DRIP-Seq to measure enrichment. The resulting Z-score distribution is visualized below:

Note: for samples that map R-loops successfully, enrichment is expected. See representative examples for POS and NEG sample types here.


Additional details

RLFS were derived across the genome using QmRLFS-finder.py. R-loop broad peaks were called with macs and then compared with RLFS using permTest from the regioneR R package. An empirical distribution of RLFS was generated using the circularRandomizeRegions method and compared to the peaks to calculate the enrichment p value and Z-score (the effect size of enrichment).
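The randomization step can be illustrated with a minimal Python sketch. It is not the regioneR implementation: it assumes a single circular chromosome, half-open (start, end) intervals, and an overlap statistic defined as the number of peaks touched by at least one RLFS. The function names (circular_shift, count_overlaps, permutation_counts) are hypothetical.

```python
import random

def circular_shift(regions, chrom_len, offset):
    """Shift (start, end) intervals by offset, wrapping around the chromosome end."""
    shifted = []
    for start, end in regions:
        width = end - start
        new_start = (start + offset) % chrom_len
        new_end = new_start + width
        if new_end <= chrom_len:
            shifted.append((new_start, new_end))
        else:
            # The interval crosses the chromosome end: split it into two pieces.
            shifted.append((new_start, chrom_len))
            shifted.append((0, new_end - chrom_len))
    return sorted(shifted)

def count_overlaps(query, subject):
    """Count query intervals that overlap at least one subject interval."""
    return sum(
        any(q_start < s_end and s_start < q_end for s_start, s_end in subject)
        for q_start, q_end in query
    )

def permutation_counts(rlfs, peaks, chrom_len, n_perm, seed=0):
    """Overlap statistic for n_perm circular randomizations of the RLFS."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        offset = rng.randrange(1, chrom_len)
        counts.append(count_overlaps(peaks, circular_shift(rlfs, chrom_len, offset)))
    return counts
```

A circular shift preserves the widths and relative spacing of the RLFS, so the null distribution reflects random placement rather than a change in sequence structure.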

From this analysis, the empirically determined p value was 0.009901 (with 100 permutations, the minimum possible p value was 0.009901). The enrichment Z-score was 30.0936.
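The relationship between the number of permutations and the minimum possible p value can be checked with a short sketch. It assumes the common empirical formula p = (r + 1) / (n + 1), where r is the number of permuted statistics at least as large as the observed one, and a Z-score computed against the mean and sample standard deviation of the permuted statistics. The function name perm_test_summary is hypothetical, and the exact conventions used by permTest may differ.

```python
from statistics import mean, stdev

def perm_test_summary(observed, permuted):
    """Return (p_value, z_score) for an enrichment-side permutation test."""
    n = len(permuted)
    # Permutations at least as extreme as the observed statistic.
    n_extreme = sum(1 for value in permuted if value >= observed)
    p_value = (n_extreme + 1) / (n + 1)
    z_score = (observed - mean(permuted)) / stdev(permuted)
    return p_value, z_score

# With 100 permutations and no permuted value reaching the observed one,
# the smallest attainable p value is 1 / 101.
min_p = 1 / (100 + 1)
print(round(min_p, 6))
```

Under this formula, a p value equal to the minimum means that none of the 100 randomizations produced an overlap statistic as large as the observed one.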

2. Sample classification

Predicted label for sample TC32 DRIP-Seq is “POS” (i.e., robust R-loop mapping).


Additional Details

To evaluate sample quality, a binary classifier was developed via the online-learning approach described in the RLSuite manuscript. The classifier evaluates the following features, which were engineered from the RLFS Z-score distribution:

Abbreviations: Z, Z-score distribution; ACF, autocorrelation function; FT, Fourier transform.

feature  description                                raw_value  processed_value
Z1       mean of Z                                  0.6549506       26.8252132
Z2       variance of Z                              0.3684280      381.5501762
Zacf1    mean of Z ACF                             -0.9372186        0.0116647
Zacf2    variance of Z ACF                         -0.7892965       28.1122824
ReW1     mean of FT of Z (real part)                1.1992794       23.3756193
ReW2     variance of FT of Z (real part)            0.3716032     5409.3785964
ImW1     mean of FT of Z (imaginary part)           0.2556674        0.0000000
ImW2     variance of FT of Z (imaginary part)      -2.0495079       17.6387940
ReWacf1  mean of FT of Z ACF (real part)           -0.9203275        4.6892162
ReWacf2  variance of FT of Z ACF (real part)       -0.7343688      334.2000706
ImWacf1  mean of FT of Z ACF (imaginary part)      -0.5411562        0.0000000
ImWacf2  variance of FT of Z ACF (imaginary part)  -0.8089526      217.1651382
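To make the feature names concrete, the following sketch computes the same twelve summaries from a list of Z-scores in plain Python. It is an illustration only: the ACF normalization, the number of lags, the variance estimator, and any scaling applied by RLSeq are assumptions here, so it is not expected to reproduce the values in the table. The function names are hypothetical.

```python
import cmath
from statistics import fmean, variance

def autocorrelation(z):
    """Normalized autocorrelation at every lag (assumes z is not constant)."""
    n = len(z)
    mu = fmean(z)
    centered = [v - mu for v in z]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / denom
        for lag in range(n)
    ]

def dft(x):
    """Naive discrete Fourier transform; adequate for short inputs."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def summarize(prefix, values):
    """Mean (suffix 1) and sample variance (suffix 2) of a sequence."""
    return {prefix + "1": fmean(values), prefix + "2": variance(values)}

def engineer_features(z):
    acf = autocorrelation(z)
    w = dft(z)
    wacf = dft(acf)
    features = {}
    features.update(summarize("Z", z))
    features.update(summarize("Zacf", acf))
    features.update(summarize("ReW", [c.real for c in w]))
    features.update(summarize("ImW", [c.imag for c in w]))
    features.update(summarize("ReWacf", [c.real for c in wacf]))
    features.update(summarize("ImWacf", [c.imag for c in wacf]))
    return features
```

For a real-valued input, the imaginary parts of the Fourier coefficients sum to zero, so under these assumptions the ImW1 and ImWacf1 features are zero up to floating-point error.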

From these features, the classifier predicts a label indicating whether the sample mapped R-loops. In short, “POS” indicates any sample for which all of the following are true:

  1. Criterion 1: The RLFS permutation test p value is significant (p < 0.05).
  2. Criterion 2: The middle of the Z-score distribution is > 0.
  3. Criterion 3: The middle of the Z-score distribution is greater than both its start and its end.
  4. Criterion 4: The model predicts a label of “POS”.
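The four checks can be combined as in the sketch below. It assumes the Z-score distribution is available as an ordered list, that the "middle" is its central element, and that the "start" and "end" are its first and last elements; the report does not define these precisely. The function name classify_sample and the fallback label "NEG" are also assumptions.

```python
def classify_sample(p_value, z_scores, model_label, alpha=0.05):
    """Return (label, criteria) from the four quality checks."""
    middle = z_scores[len(z_scores) // 2]
    start, end = z_scores[0], z_scores[-1]
    criteria = {
        "PVal Significant": p_value < alpha,
        "ZApex > 0": middle > 0,
        "ZApex > ZEdges": middle > start and middle > end,
        "Predicted 'POS'": model_label == "POS",
    }
    # All four criteria must hold for a final "POS" label.
    label = "POS" if all(criteria.values()) else "NEG"
    return label, criteria

# Made-up Z-scores, for illustration only.
label, criteria = classify_sample(0.009901, [1.0, 5.0, 30.0, 6.0, 2.0], "POS")
print(label, criteria)
```

Because the criteria are combined with a logical AND, a sample with a "POS" model prediction is still labeled otherwise if, for example, the permutation test is not significant.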

The criteria for TC32 DRIP-Seq are shown below:

Results from quality analysis of TC32 DRIP-Seq
Criteria Result
  1. PVal Significant
  2. ZApex > 0
  3. ZApex > ZEdges
  4. Predicted ‘POS’

These results led to the final prediction: “POS” (i.e., robust R-loop mapping).

3. Feature enrichment test

Enrichment plots

The results were then visualized with the plotEnrichment() function:

Note: If the user-supplied sample contains fewer than 200 peaks, the ◇ marker will be missing from the plots.

Summary table

Additional Details

Annotations were derived from a variety of sources and accessed using RLHub (unless custom annotations were supplied by the user). Detailed explanations of each database and type can be found here. The valr R package was used to test the enrichment of these features within the supplied ranges for TC32 DRIP-Seq.

4. Correlation analysis

Using the method described in Chedin et al. 2020, the inter-sample correlations between TC32 DRIP-Seq and the samples in RLBase were calculated.

In the resulting heatmap, TC32 DRIP-Seq is identified via the group annotation.

Note: In the plot legend (mode panel), misc includes the modes with < 12 samples: BisMapR, DREAM, DRIP-RNA-Seq, DRIPc-HBD, DRIVE, DRNA, m6A-DIP, RNH-CnR, RR-ChIP, S1-DRIP.

5. Gene Annotations

hg38 gene annotations were downloaded from AnnotationHub and overlapped with the R-loop ranges in TC32 DRIP-Seq. The resulting gene table was then filtered for the top 2000 peaks (by adjusted p value) and is shown here:
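The overlap-then-filter procedure can be sketched as follows. The record layout (dictionaries with chrom, start, end, name, and padj keys) and the function name gene_table are hypothetical, and the sketch assumes that padj is a plain adjusted p value for which smaller means more significant.

```python
def gene_table(peaks, genes, top_n=2000):
    """Overlap peaks with genes, then keep rows for the top_n peaks by padj."""
    rows = [
        {"peak": peak["name"], "padj": peak["padj"], "gene": gene["name"]}
        for peak in peaks
        for gene in genes
        if peak["chrom"] == gene["chrom"]
        and peak["start"] < gene["end"]
        and gene["start"] < peak["end"]
    ]
    # Rank the peaks that overlap at least one gene; smallest padj first.
    ranked = sorted({(row["padj"], row["peak"]) for row in rows})
    keep = {name for _, name in ranked[:top_n]}
    return [row for row in rows if row["peak"] in keep]
```

A peak that overlaps several genes contributes one row per gene, so the filtered table can contain more than top_n rows.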

6. RL-Regions Test

RL-Regions are consensus R-loop sites derived from a meta-analysis of all high-confidence R-loop mapping samples in RLBase (see the RLSuite manuscript for a full description). The ranges supplied for TC32 DRIP-Seq were compared to the RL-Regions to determine the degree and significance of overlap.


For more information about RLSeq please visit the package homepage here.

Note: if you use RLSeq in published research, please reference:

Miller et al., RLSeq, (2021), GitHub repository, Bishop-Laboratory/RLSeq

