This report was generated with RLSeq v0.99.6.
Sample name: RDIP-Seq +RNH1
Sample type: RDIP
Label: NEG
Genome: hg38
Time: Thu Oct 14 14:26:57 2021
Result | Available |
---|---|
RLFS Analysis | TRUE |
Sample classification | TRUE |
Feature enrichment test | TRUE |
Correlation analysis | TRUE |
Gene Annotations | TRUE |
RL-Regions Test | TRUE |
R-loop forming sequences (RLFS) were compared to the ranges in RDIP-Seq +RNH1 to measure enrichment. The resulting Z-score distribution is visualized below:
Note: for samples that map R-loops successfully, enrichment is expected. See representative examples for POS and NEG sample types here.
RLFS were derived across the genome using QmRLFS-finder.py. R-loop broad peaks were called with macs and then compared with RLFS using permTest from the regioneR R package. An empirical distribution of RLFS was generated using the circularRandomizeRegions method and compared to the peaks to calculate the enrichment p value and z-score (the effect size of enrichment). For additional detail, please refer to the RLSeq::analyzeRLFS documentation (link).
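The circular randomization step can be illustrated with a small, self-contained sketch. This is not the regioneR implementation: it treats one hypothetical chromosome of length `chrom_len` as a circle, shifts every region by the same random offset, and counts how many shifted regions overlap a fixed set. The function names (`circular_shift`, `count_overlaps`) and the toy coordinates are assumptions made for illustration only.

```python
import random


def circular_shift(regions, offset, chrom_len):
    """Shift half-open (start, end) regions by `offset` on a circular chromosome.

    Returns one list of pieces per input region; a region that wraps past the
    chromosome end is represented by two pieces so it is still counted once.
    """
    shifted = []
    for start, end in regions:
        new_start = (start + offset) % chrom_len
        new_end = new_start + (end - start)
        if new_end <= chrom_len:
            shifted.append([(new_start, new_end)])
        else:
            shifted.append([(new_start, chrom_len), (0, new_end - chrom_len)])
    return shifted


def count_overlaps(shifted, subject):
    """Count shifted regions that overlap at least one subject region."""
    return sum(
        any(a < s_end and s_start < b for a, b in pieces for s_start, s_end in subject)
        for pieces in shifted
    )


# Toy coordinates (hypothetical): three peaks and three RLFS on a 1000 bp circle.
peaks = [(100, 200), (450, 520), (900, 980)]
rlfs = [(150, 180), (500, 510), (700, 720)]
chrom_len = 1000

observed = count_overlaps(circular_shift(peaks, 0, chrom_len), rlfs)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
permuted = [
    count_overlaps(circular_shift(peaks, rng.randrange(chrom_len), chrom_len), rlfs)
    for _ in range(100)
]
```

Shifting all regions by a single offset preserves their widths and spacing, which is the property that distinguishes circular randomization from placing each region independently.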
From this analysis, the empirically-determined p value was 0.009901 (with 100 permutations, the minimum possible p value was 0.009901). The enrichment z-score was 4.2791.
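The minimum possible p value follows from the usual permutation-test formula: with n permutations, the smallest attainable value is 1/(n + 1), which for n = 100 is about 0.009901. The sketch below shows that arithmetic, together with a z-score taken as the observed statistic minus the permuted mean, divided by the permuted standard deviation. The function name `perm_summary` and the toy counts are hypothetical, and the exact standard-deviation convention used by regioneR is an assumption here.

```python
import statistics


def perm_summary(observed, permuted):
    """Empirical one-sided (greater) p value and z-score for an enrichment test."""
    n = len(permuted)
    as_extreme = sum(1 for value in permuted if value >= observed)
    # Adding 1 to numerator and denominator keeps the p value above zero.
    p_value = (as_extreme + 1) / (n + 1)
    z_score = (observed - statistics.mean(permuted)) / statistics.stdev(permuted)
    return p_value, z_score


# Toy data (hypothetical): 100 permuted overlap counts, none reaching the observed count.
permuted = [40 + (i % 7) for i in range(100)]
observed = 60

p_value, z_score = perm_summary(observed, permuted)
print(round(p_value, 6))  # 0.009901
```

When no permutation matches or exceeds the observed value, the p value sits at this floor, so running more permutations is the only way to resolve a smaller p value.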
Predicted label for sample RDIP-Seq +RNH1 is “NEG” (i.e., poor R-loop mapping).
To evaluate sample quality, a binary classifier was developed via the online-learning approach described in the RLSuite manuscript. The classifier evaluates features engineered from the RLFS Z score distribution, specifically, the following features:
feature | description | raw_value | processed_value |
---|---|---|---|
Z1 | mean of Z | -1.0410735 | 3.7278413 |
Z2 | variance of Z | -1.2084578 | 56.2603406 |
Zacf1 | mean of Z ACF | -0.9787910 | 0.0046035 |
Zacf2 | variance of Z ACF | -1.2978184 | 5.2776535 |
ReW1 | mean of FT of Z (real part) | -0.5141733 | 4.2790722 |
ReW2 | variance of FT of Z (real part) | -1.2069132 | 769.0159302 |
ImW1 | mean of FT of Z (imaginary part) | -2.8001678 | 0.0000000 |
ImW2 | variance of FT of Z (imaginary part) | 1.0383560 | 211.7189408 |
ReWacf1 | mean of FT of Z ACF (real part) | -1.2202493 | 1.8505918 |
ReWacf2 | variance of FT of Z ACF (real part) | -1.2602061 | 59.5376294 |
ImWacf1 | mean of FT of Z ACF (imaginary part) | -0.5630640 | 0.0000000 |
ImWacf2 | variance of FT of Z ACF (imaginary part) | -1.2682520 | 45.3194174 |
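As a rough illustration of how features like these can be derived from a Z-score vector, the sketch below computes the mean and variance of Z, of its autocovariance, and of the real and imaginary parts of their Fourier transforms. It uses only the Python standard library with a naive DFT. The choice of a biased autocovariance over all lags and of the sample variance are assumptions; the report does not state which conventions RLSeq uses, and the helper names are hypothetical.

```python
import cmath
import statistics


def autocovariance(x):
    """Biased sample autocovariance for lags 0 .. len(x) - 1."""
    n = len(x)
    mean = sum(x) / n
    return [
        sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n
        for lag in range(n)
    ]


def dft(x):
    """Naive O(n^2) discrete Fourier transform (unnormalised)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def z_features(z):
    """Return the 12 summary features as a dict keyed like the table rows."""
    acf = autocovariance(z)
    w = dft(z)
    w_acf = dft(acf)
    series = [
        ("Z", z),
        ("Zacf", acf),
        ("ReW", [c.real for c in w]),
        ("ImW", [c.imag for c in w]),
        ("ReWacf", [c.real for c in w_acf]),
        ("ImWacf", [c.imag for c in w_acf]),
    ]
    features = {}
    for name, values in series:
        features[name + "1"] = statistics.mean(values)      # mean
        features[name + "2"] = statistics.variance(values)  # sample variance
    return features


# Toy Z-score profile (hypothetical values).
z = [0.5, 1.0, 2.0, 4.0, 2.0, 1.0, 0.5]
feats = z_features(z)
```

Under this definition, the mean of an unnormalised DFT equals the first element of the input, so for real-valued input the mean imaginary part is zero. That is consistent with the 0.0000000 entries in the ImW1 and ImWacf1 rows.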
From these features, classification was performed to derive a prediction (predicted label) regarding whether the sample mapped R-loops or not. In short, “POS” indicates any sample for which all the following are true:
The criteria for RDIP-Seq +RNH1 are shown below:
Criteria | Result |
---|---|
| | TRUE |
| | TRUE |
| | FALSE |
| | FALSE |
These results led to the final prediction: “NEG” (i.e., poor R-loop mapping).
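The decision rule stated above, where "POS" requires every criterion to be true, can be written as a one-line check. The function name `final_label` is hypothetical; the four boolean values are taken from the Result column of the criteria table.

```python
def final_label(criteria_results):
    """Return "POS" only when every criterion is TRUE, otherwise "NEG"."""
    return "POS" if all(criteria_results) else "NEG"


# Result column of the criteria table for this sample.
results = [True, True, False, False]
print(final_label(results))  # NEG
```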
For additional detail, please refer to the RLSeq::predictCondition documentation (link).
The results were then visualized with the plotEnrichment() function:
Note: If < 200 peaks in user-supplied sample, ◇ will be missing from plots.
Annotations were derived from a variety of sources and accessed using RLHub (unless custom annotations were supplied by the user). Detailed explanations of each database and type can be found here. The valr R package was used to test the enrichment of these features within the supplied ranges for RDIP-Seq +RNH1. For additional detail, please refer to the RLSeq::featureEnrich documentation (link).
Using the method described in Chedin et al. 2020, the inter-sample correlations between RDIP-Seq +RNH1 and the samples in RLBase were calculated. For additional detail, please refer to the RLSeq::corrAnalyze documentation (link).
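The report does not spell out the correlation statistic. As a minimal sketch, assuming a Pearson correlation over binned signal values, a pairwise correlation matrix of the kind that feeds a heatmap could be built as follows. The sample names, the signal vectors, and the helper names are all hypothetical, and this is not necessarily the procedure used by RLSeq::corrAnalyze.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(samples):
    """Nested dict of pairwise Pearson correlations between named samples."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}


# Toy binned signal (hypothetical): one user sample and two reference samples.
samples = {
    "user_sample": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ref_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with user_sample
    "ref_B": [5.0, 3.0, 4.0, 1.0, 2.0],   # negatively correlated with user_sample
}
corr = correlation_matrix(samples)
```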
In the resulting heatmap, RDIP-Seq +RNH1 is identified via the group annotation.
Note: In the plot legend (mode panel), misc includes the modes with < 12 samples: BisMapR, DREAM, DRIP-RNA-Seq, DRIPc-HBD, DRIVE, DRNA, m6A-DIP, RNH-CnR, RR-ChIP, S1-DRIP.
hg38 gene annotations were downloaded from AnnotationHub and overlapped with R-loop ranges in RDIP-Seq +RNH1. For additional detail, please refer to the RLSeq::geneAnnotation documentation (link). The resulting gene table was then filtered for the top 2000 peaks (by p-adjusted value) and is shown here:
RL-Regions are consensus R-loop sites derived from a meta-analysis of all high-confidence R-loop mapping samples in RLBase (see the RLSuite manuscript for a full description). The ranges supplied for RDIP-Seq +RNH1 were compared to the RL-Regions to determine the degree and significance of overlap. For additional detail, please refer to the RLSeq::rlRegionTest documentation (link).
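The report does not say which test is used to assess significance. One conventional choice for overlap between two interval sets is a one-sided Fisher's exact test on a 2x2 table of counts, sketched below with the standard library only. Treating the comparison as a 2x2 table, the function name `fisher_greater`, and the toy counts are assumptions; this is not necessarily what RLSeq::rlRegionTest does.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided (greater) Fisher's exact test p value for the table [[a, b], [c, d]].

    a: sites in both sets        b: sites in the sample only
    c: sites in RL-Regions only  d: sites in neither set
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Hypergeometric tail: probability of seeing `a` or more shared sites by chance.
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Toy counts (hypothetical), chosen so the result is easy to verify by hand:
# (C(4,3) * C(4,1) + C(4,4) * C(4,0)) / C(8,4) = 17 / 70
p_value = fisher_greater(3, 1, 1, 3)
```

The degree of overlap can be read directly from the same table, for example as a / (a + b), the fraction of sample sites that fall in RL-Regions.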
For more information about RLSeq please visit the package homepage here.
Note: if you use RLSeq in published research, please reference:
Miller et al., RLSeq, (2021), GitHub repository, Bishop-Laboratory/RLSeq
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] RLHub_0.99.4 RLSeq_0.99.6 dplyr_1.0.7 magrittr_2.0.1
##
## loaded via a namespace (and not attached):
## [1] circlize_0.4.13 AnnotationHub_3.1.5
## [3] BiocFileCache_2.1.1 systemfonts_1.0.2
## [5] plyr_1.8.6 splines_4.1.1
## [7] BiocParallel_1.27.10 crosstalk_1.1.1
## [9] listenv_0.8.0 GenomeInfoDb_1.29.8
## [11] ggplot2_3.3.5 digest_0.6.28
## [13] yulab.utils_0.0.2 foreach_1.5.1
## [15] htmltools_0.5.2 magick_2.7.3
## [17] fansi_0.5.0 memoise_2.0.0
## [19] BSgenome_1.61.0 cluster_2.1.2
## [21] doParallel_1.0.16 ComplexHeatmap_2.9.4
## [23] recipes_0.1.16 globals_0.14.0
## [25] Biostrings_2.61.2 gower_0.2.2
## [27] matrixStats_0.61.0 svglite_2.0.0
## [29] colorspace_2.0-2 rappdirs_0.3.3
## [31] blob_1.2.2 rvest_1.0.1
## [33] xfun_0.26 crayon_1.4.1
## [35] RCurl_1.98-1.5 jsonlite_1.7.2
## [37] survival_3.2-13 iterators_1.0.13
## [39] glue_1.4.2 kableExtra_1.3.4
## [41] gtable_0.3.0 ipred_0.9-12
## [43] zlibbioc_1.39.0 XVector_0.33.0
## [45] webshot_0.5.2 GetoptLong_1.0.5
## [47] DelayedArray_0.19.4 shape_1.4.6
## [49] future.apply_1.8.1 BiocGenerics_0.39.2
## [51] scales_1.1.1 futile.options_1.0.1
## [53] DBI_1.1.1 Rcpp_1.0.7
## [55] xtable_1.8-4 viridisLite_0.4.0
## [57] clue_0.3-59 gridGraphics_0.5-1
## [59] bit_4.0.4 stats4_4.1.1
## [61] lava_1.6.10 prodlim_2019.11.13
## [63] DT_0.19 htmlwidgets_1.5.4
## [65] httr_1.4.2 RColorBrewer_1.1-2
## [67] ellipsis_0.3.2 pkgconfig_2.0.3
## [69] XML_3.99-0.8 farver_2.1.0
## [71] nnet_7.3-16 sass_0.4.0
## [73] dbplyr_2.1.1 utf8_1.2.2
## [75] caret_6.0-88 ggplotify_0.1.0
## [77] AnnotationDbi_1.55.1 later_1.3.0
## [79] tidyselect_1.1.1 labeling_0.4.2
## [81] rlang_0.4.11 reshape2_1.4.4
## [83] munsell_0.5.0 BiocVersion_3.14.0
## [85] tools_4.1.1 cachem_1.0.6
## [87] ggprism_1.0.3 generics_0.1.0
## [89] RSQLite_2.2.8 ExperimentHub_2.1.4
## [91] evaluate_0.14 stringr_1.4.0
## [93] fastmap_1.1.0 yaml_2.2.1
## [95] ModelMetrics_1.2.2.2 knitr_1.34
## [97] bit64_4.0.5 purrr_0.3.4
## [99] KEGGREST_1.33.0 pbapply_1.5-0
## [101] future_1.22.1 nlme_3.1-153
## [103] mime_0.12 formatR_1.11
## [105] xml2_1.3.2 caretEnsemble_2.0.1
## [107] compiler_4.1.1 rstudioapi_0.13
## [109] png_0.1-7 interactiveDisplayBase_1.31.2
## [111] filelock_1.0.2 curl_4.3.2
## [113] tibble_3.1.4 bslib_0.3.0
## [115] stringi_1.7.4 futile.logger_1.4.3
## [117] highr_0.9 lattice_0.20-44
## [119] Matrix_1.3-4 vctrs_0.3.8
## [121] pillar_1.6.2 lifecycle_1.0.1
## [123] BiocManager_1.30.16 GlobalOptions_0.1.2
## [125] jquerylib_0.1.4 data.table_1.14.0
## [127] bitops_1.0-7 httpuv_1.6.3
## [129] rtracklayer_1.53.1 GenomicRanges_1.45.0
## [131] R6_2.5.1 BiocIO_1.3.0
## [133] promises_1.2.0.1 gridExtra_2.3
## [135] IRanges_2.27.2 parallelly_1.28.1
## [137] codetools_0.2-18 lambda.r_1.2.4
## [139] MASS_7.3-54 assertthat_0.2.1
## [141] SummarizedExperiment_1.23.4 rjson_0.2.20
## [143] withr_2.4.2 regioneR_1.25.1
## [145] GenomicAlignments_1.29.0 Rsamtools_2.9.1
## [147] S4Vectors_0.31.3 GenomeInfoDbData_1.2.7
## [149] parallel_4.1.1 VennDiagram_1.6.20
## [151] grid_4.1.1 rpart_4.1-15
## [153] timeDate_3043.102 class_7.3-19
## [155] rmarkdown_2.11 MatrixGenerics_1.5.4
## [157] pROC_1.18.0 shiny_1.7.0
## [159] Biobase_2.53.0 lubridate_1.7.10
## [161] restfulr_0.0.13
RLSeq © 2021, Bishop Lab, UT Health San Antonio
RLSeq maintainer: Henry Miller