This report was generated with RLSeq v1.0.3.
Sample name: SH-SY5Y shCTR
Sample type: DRIP
Label: POS
Genome: hg38
Time: Thu Jul 7 20:26:41 2022
Result | Available |
---|---|
RLFS analysis | TRUE |
Sample classification | TRUE |
Noise analysis | TRUE |
Feature enrichment test | TRUE |
Transcript feature overlap | TRUE |
Correlation analysis | TRUE |
Gene annotations | TRUE |
RL-Regions test | TRUE |
R-loop forming sequences (RLFS) were compared to the ranges in SH-SY5Y shCTR to measure enrichment. The resulting Z-score distribution is visualized below:
Note: for samples that map R-loops successfully, enrichment is expected. See representative examples for POS and NEG sample types here.
RLFS were derived across the genome using QmRLFS-finder.py. R-loop broad peaks were called with macs and then compared with RLFS using permTest from the regioneR R package. An empirical distribution of RLFS was generated using the circularRandomizeRegions method and compared to the peaks in order to calculate the enrichment p value and Z-score (the effect size of enrichment). For additional detail, please refer to the RLSeq::analyzeRLFS documentation (link).
From this analysis, the empirically-determined p value was 0.009901 (with 100 permutations, the minimum possible p value was 0.009901). The enrichment z-score was 40.3393.
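The relationship between the number of permutations and the minimum possible p value can be illustrated with a minimal sketch. It assumes the common convention in which the observation itself is counted among the permutations, giving p = (r + 1) / (n + 1); the function name and the example counts are hypothetical and are not taken from RLSeq or from this sample.

```python
from statistics import mean, stdev

def permutation_summary(observed, permuted):
    """Return (p_value, z_score) for an observed overlap count.

    observed: number of overlaps between the real peaks and RLFS.
    permuted: overlap counts obtained from randomized region sets.
    """
    n = len(permuted)
    # Count permutations at least as extreme as the observation. The +1
    # terms count the observation itself, so p can never be exactly 0.
    extreme = sum(1 for value in permuted if value >= observed)
    p_value = (extreme + 1) / (n + 1)
    # Effect size: distance of the observation from the null mean,
    # expressed in null standard deviations.
    z_score = (observed - mean(permuted)) / stdev(permuted)
    return p_value, z_score

# Hypothetical counts for illustration only.
null_counts = [100 + (i % 7) for i in range(100)]
p, z = permutation_summary(500, null_counts)
print(round(p, 6))  # 0.009901, i.e. 1 / 101
```

With 100 permutations and no permuted count reaching the observed value, the p value bottoms out at 1/101, which is why a larger number of permutations is needed to report smaller p values.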
Predicted label for sample SH-SY5Y shCTR is “POS” (i.e., robust R-loop mapping).
To evaluate sample quality, a binary classifier was developed via the online-learning approach described in the RLSuite manuscript. The classifier evaluates the following features, engineered from the RLFS Z-score distribution:
feature | description | raw_value | processed_value |
---|---|---|---|
Z1 | mean of Z | 1.2186307 | 37.2195411 |
Z2 | variance of Z | 0.8306204 | 529.9093513 |
Zacf1 | mean of Z ACF | -0.8338110 | 0.0292045 |
Zacf2 | variance of Z ACF | -0.4505824 | 69.4893772 |
ReW1 | mean of FT of Z (real part) | 1.7594982 | 32.1432798 |
ReW2 | variance of FT of Z (real part) | 0.8288778 | 7512.6761798 |
ImW1 | mean of FT of Z (imaginary part) | -0.1184944 | 0.0000000 |
ImW2 | variance of FT of Z (imaginary part) | -1.3255137 | 35.8420653 |
ReWacf1 | mean of FT of Z ACF (real part) | -0.5486679 | 11.7401943 |
ReWacf2 | variance of FT of Z ACF (real part) | -0.3981260 | 835.4160391 |
ImWacf1 | mean of FT of Z ACF (imaginary part) | -0.5221044 | 0.0000000 |
ImWacf2 | variance of FT of Z ACF (imaginary part) | -0.5032904 | 522.1719304 |
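As a rough illustration of how summary features of this kind can be derived from a Z-score profile, the sketch below computes the mean and variance of the profile, of its autocorrelation function (ACF), and of the real and imaginary parts of their discrete Fourier transforms. This is a sketch under stated assumptions, not RLSeq's implementation: the ACF normalization, the use of the sample variance, and the function names are all assumptions, and the transformation from raw to processed values is not reproduced.

```python
import cmath
from statistics import mean, variance

def autocorrelation(z):
    """Autocorrelation of z at every lag, normalized so that lag 0 equals 1."""
    n = len(z)
    mu = mean(z)
    total = sum((v - mu) ** 2 for v in z)
    return [
        sum((z[i] - mu) * (z[i + lag] - mu) for i in range(n - lag)) / total
        for lag in range(n)
    ]

def dft(z):
    """Naive discrete Fourier transform (O(n^2)); fine for short profiles."""
    n = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def z_features(z):
    """Mean (suffix 1) and sample variance (suffix 2) of each derived series."""
    acf = autocorrelation(z)
    w_z, w_acf = dft(z), dft(acf)
    series = {
        "Z": z,
        "Zacf": acf,
        "ReW": [c.real for c in w_z],
        "ImW": [c.imag for c in w_z],
        "ReWacf": [c.real for c in w_acf],
        "ImWacf": [c.imag for c in w_acf],
    }
    feats = {}
    for name, values in series.items():
        feats[name + "1"] = mean(values)
        feats[name + "2"] = variance(values)
    return feats

# Hypothetical five-point profile, for illustration only.
feats = z_features([2.0, 1.0, 4.0, 1.0, 0.0])
```

For any real-valued input, the imaginary parts of the DFT sum to zero, so a mean of zero for the imaginary components is expected; this is consistent with the processed values of 0.0000000 listed for ImW1 and ImWacf1.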
From these features, classification was performed to derive a prediction (predicted label) regarding whether the sample mapped R-loops or not. In short, “POS” indicates any sample for which all the following are true:
The criteria for SH-SY5Y shCTR are shown below:
Criteria | Result |
---|---|
Criterion 1 | TRUE |
Criterion 2 | TRUE |
Criterion 3 | TRUE |
Criterion 4 | TRUE |
These results led to the final prediction: “POS” (i.e., robust R-loop mapping).
For additional detail, please refer to the RLSeq::predictCondition documentation (link).
To visualize the results of noiseAnalyze, we can use a “fingerprint plot” (named after the deepTools implementation of the same name).
This plot shows the proportion of signal contained in the corresponding proportion of coverage bins. In the plot above, we can observe that relatively few bins contain nearly all the signal. This is exactly what we would expect to see when our sample has good signal-to-noise ratio, a sign of good quality in R-loop mapping datasets.
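The idea behind a fingerprint curve can be sketched in a few lines: sort the coverage bins from lowest to highest signal, then track the cumulative share of total signal against the share of bins. This is a minimal sketch assuming simple per-bin counts; the function name and the example values are hypothetical and do not reproduce the noiseAnalyze or deepTools implementations.

```python
def fingerprint_curve(bin_counts):
    """Return (fraction_of_bins, fraction_of_signal) points for sorted bins."""
    ordered = sorted(bin_counts)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no signal in any bin")
    n = len(ordered)
    points = []
    running = 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        points.append((rank / n, running / total))
    return points

# Hypothetical examples: evenly spread signal versus concentrated signal.
noisy = fingerprint_curve([5, 5, 5, 5])
clean = fingerprint_curve([0, 0, 0, 20])
```

Evenly spread signal follows the diagonal, while signal concentrated in a few bins stays near zero and rises sharply at the end, which is the shape associated with a good signal-to-noise ratio.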
While a fingerprint plot is useful for getting a quick view of the dataset, it is also useful to compare the analyzed sample to the publicly available datasets provided by RLBase. The noiseComparisonPlot enables this comparison.
The results were then visualized with the plotEnrichment() function:
Note: If the user-supplied sample contains fewer than 200 peaks, the ◇ marker will be missing from the plots.
Annotations were derived from a variety of sources and accessed using RLHub (unless custom annotations were supplied by the user). Detailed explanations of each database and type can be found here. The valr R package was used to test the enrichment of these features within the supplied ranges for SH-SY5Y shCTR. For additional detail, please refer to the RLSeq::featureEnrich documentation (link).
The results were then visualized with the plotTxFeatureOverlap() function:
Using the method described in Chedin et al. 2020, the inter-sample correlations between SH-SY5Y shCTR and the samples in RLBase were calculated. For additional detail, please refer to the RLSeq::corrAnalyze documentation (link).
In the resulting heatmap, SH-SY5Y shCTR is identified via the group annotation.
Note: In the plot legend (mode panel), misc includes the modes with < 12 samples: BisMapR, DREAM, DRIP-RNA-Seq, DRIPc-HBD, DRIVE, DRNA, m6A-DIP, RNH-CnR, RR-ChIP, S1-DRIP.
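To make the idea of an inter-sample correlation matrix concrete, the sketch below computes pairwise Pearson correlations between signal vectors. It assumes each sample has already been summarized as a numeric vector over a shared set of genomic bins; the choice of Pearson correlation, the function names, and the sample values are assumptions for illustration and are not taken from RLSeq::corrAnalyze.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(samples):
    """Nested dict of pairwise correlations, keyed by sample name."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}

# Hypothetical signal vectors over four shared bins.
signals = {
    "query": [1.0, 2.0, 3.0, 4.0],
    "ref_similar": [2.0, 4.0, 6.0, 8.0],
    "ref_opposite": [4.0, 3.0, 2.0, 1.0],
}
corr = correlation_matrix(signals)
```

A matrix of this form is what a clustered heatmap displays: samples with similar signal profiles score near 1 and group together.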
hg38 gene annotations were downloaded from AnnotationHub and overlapped with R-loop ranges in SH-SY5Y shCTR. For additional detail, please refer to the RLSeq::geneAnnotation documentation (link). The resulting gene table was then filtered for the top 2000 peaks (by p-adjusted value) and is shown here:
RL-Regions are consensus R-loop sites derived from a meta-analysis of all high-confidence R-loop mapping samples in RLBase (see the RLSuite manuscript for a full description). The ranges supplied for SH-SY5Y shCTR were compared to the RL-Regions to determine the degree and significance of overlap. For additional detail, please refer to the RLSeq::rlRegionTest documentation (link).
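One way to think about "degree and significance of overlap" is sketched below: count the query ranges that overlap any reference range, then evaluate a 2x2 contingency table with a one-sided Fisher exact test. This is only a sketch. The specific statistical test used by RLSeq::rlRegionTest is not stated in this report, and the function names, the half-open interval convention, and the example ranges are assumptions.

```python
from math import comb

def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def count_overlapping(query, reference):
    """Number of query intervals that overlap at least one reference interval."""
    return sum(1 for q in query if any(overlaps(q, r) for r in reference))

def fisher_greater(a, b, c, d):
    """One-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums hypergeometric probabilities for tables at least as extreme as
    the observed one in the direction of more overlap (larger a).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, row1)

# Hypothetical ranges for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
regions = [("chr1", 150, 300), ("chr2", 300, 400)]
n_hit = count_overlapping(peaks, regions)
p_value = fisher_greater(3, 0, 0, 3)
```

How the four cells of the contingency table are derived from genomic intervals is itself a design choice (for example, how "non-overlapping" background intervals are counted) and is left out of this sketch.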
For more information about RLSeq please visit the package homepage here.
Note: if you use RLSeq in published research, please reference:
Miller et al., RLSeq, (2021), GitHub repository, Bishop-Laboratory/RLSeq
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices datasets utils methods base
##
## other attached packages:
## [1] RLHub_1.2.0 RLSeq_1.0.3 dplyr_1.0.9 magrittr_2.0.3
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.2 tidyselect_1.1.2
## [3] RSQLite_2.2.14 AnnotationDbi_1.58.0
## [5] htmlwidgets_1.5.4 grid_4.2.0
## [7] BiocParallel_1.30.3 pROC_1.18.0
## [9] aws.signature_0.6.0 munsell_0.5.0
## [11] codetools_0.2-18 DT_0.23
## [13] future_1.26.1 withr_2.5.0
## [15] colorspace_2.0-3 Biobase_2.56.0
## [17] filelock_1.0.2 highr_0.9
## [19] knitr_1.39 rstudioapi_0.13
## [21] stats4_4.2.0 listenv_0.8.0
## [23] MatrixGenerics_1.8.1 labeling_0.4.2
## [25] GenomeInfoDbData_1.2.8 bit64_4.0.5
## [27] farver_2.1.0 parallelly_1.32.0
## [29] vctrs_0.4.1 generics_0.1.3
## [31] lambda.r_1.2.4 ipred_0.9-13
## [33] xfun_0.31 BiocFileCache_2.4.0
## [35] doParallel_1.0.17 regioneR_1.28.0
## [37] R6_2.5.1 GenomeInfoDb_1.32.2
## [39] clue_0.3-61 gridGraphics_0.5-1
## [41] bitops_1.0-7 cachem_1.0.6
## [43] DelayedArray_0.22.0 assertthat_0.2.1
## [45] promises_1.2.0.1 BiocIO_1.6.0
## [47] scales_1.2.0 nnet_7.3-17
## [49] gtable_0.3.0 valr_0.6.4
## [51] globals_0.15.1 processx_3.6.1
## [53] timeDate_3043.102 rlang_1.0.3
## [55] systemfonts_1.0.4 GlobalOptions_0.1.2
## [57] splines_4.2.0 rtracklayer_1.56.1
## [59] ModelMetrics_1.2.2.2 broom_1.0.0
## [61] BiocManager_1.30.18 yaml_2.3.5
## [63] reshape2_1.4.4 GenomicFeatures_1.48.3
## [65] crosstalk_1.2.0 backports_1.4.1
## [67] httpuv_1.6.5 caret_6.0-92
## [69] tools_4.2.0 lava_1.6.10
## [71] ggplotify_0.1.0 ggplot2_3.3.6
## [73] ellipsis_0.3.2 kableExtra_1.3.4
## [75] jquerylib_0.1.4 RColorBrewer_1.1-3
## [77] BiocGenerics_0.42.0 Rcpp_1.0.8.3
## [79] plyr_1.8.7 base64enc_0.1-3
## [81] progress_1.2.2 zlibbioc_1.42.0
## [83] purrr_0.3.4 RCurl_1.98-1.7
## [85] ps_1.7.1 prettyunits_1.1.1
## [87] rpart_4.1.16 GetoptLong_1.0.5
## [89] pbapply_1.5-0 S4Vectors_0.34.0
## [91] cluster_2.1.3 SummarizedExperiment_1.26.1
## [93] futile.options_1.0.1 data.table_1.14.2
## [95] caretEnsemble_2.0.1 circlize_0.4.15
## [97] matrixStats_0.62.0 hms_1.1.1
## [99] mime_0.12 evaluate_0.15
## [101] xtable_1.8-4 XML_3.99-0.10
## [103] VennDiagram_1.7.3 shape_1.4.6
## [105] IRanges_2.30.0 gridExtra_2.3
## [107] compiler_4.2.0 biomaRt_2.52.0
## [109] tibble_3.1.7 crayon_1.5.1
## [111] htmltools_0.5.2 later_1.3.0
## [113] tzdb_0.3.0 ggprism_1.0.3
## [115] tidyr_1.2.0 lubridate_1.8.0
## [117] aws.s3_0.3.21 DBI_1.1.3
## [119] formatR_1.12 ExperimentHub_2.4.0
## [121] ComplexHeatmap_2.12.0 dbplyr_2.2.1
## [123] MASS_7.3-57 rappdirs_0.3.3
## [125] Matrix_1.4-1 readr_2.1.2
## [127] cli_3.3.0 parallel_4.2.0
## [129] gower_1.0.0 GenomicRanges_1.48.0
## [131] pkgconfig_2.0.3 GenomicAlignments_1.32.0
## [133] recipes_1.0.0 xml2_1.3.3
## [135] foreach_1.5.2 svglite_2.1.0
## [137] bslib_0.3.1 hardhat_1.2.0
## [139] webshot_0.5.3 XVector_0.36.0
## [141] prodlim_2019.11.13 rvest_1.0.2
## [143] yulab.utils_0.0.5 stringr_1.4.0
## [145] callr_3.7.0 digest_0.6.29
## [147] Biostrings_2.64.0 rmarkdown_2.14
## [149] restfulr_0.0.15 curl_4.3.2
## [151] shiny_1.7.1 Rsamtools_2.12.0
## [153] rjson_0.2.21 lifecycle_1.0.1
## [155] nlme_3.1-157 jsonlite_1.8.0
## [157] futile.logger_1.4.3 viridisLite_0.4.0
## [159] BSgenome_1.64.0 fansi_1.0.3
## [161] pillar_1.7.0 lattice_0.20-45
## [163] KEGGREST_1.36.2 fastmap_1.1.0
## [165] httr_1.4.3 survival_3.2-13
## [167] interactiveDisplayBase_1.34.0 glue_1.6.2
## [169] png_0.1-7 iterators_1.0.14
## [171] BiocVersion_3.15.2 bit_4.0.4
## [173] class_7.3-20 stringi_1.7.6
## [175] sass_0.4.1 blob_1.2.3
## [177] AnnotationHub_3.4.0 memoise_2.0.1
## [179] renv_0.15.5 future.apply_1.9.0
RLSeq © 2022, Bishop Lab, UT Health San Antonio
RLSeq maintainer: Henry Miller