// Mean of the scores after clamping each to [0, 1]; missing or NaN scores count as 0.
function aggregate_scores(obj) {
  return d3.mean(obj.map(val => {
    if (val.score === undefined || isNaN(val.score)) return 0;
    return Math.min(1, Math.max(0, val.score));
  }));
}

// Turn a list of row objects into an object of column arrays (keys come from the first row).
function transpose_list_of_objects(list) {
  return Object.fromEntries(Object.keys(list[0]).map(key => [key, list.map(d => d[key])]));
}

// Format a duration in seconds as a short label.
function label_time(time) {
  if (time < 1e-5) return "0s";
  if (time < 1) return "<1s";
  if (time < 60) return `${Math.floor(time)}s`;
  if (time < 3600) return `${Math.floor(time / 60)}m`;
  if (time < 3600 * 24) return `${Math.floor(time / 3600)}h`;
  if (time < 3600 * 24 * 7) return `${Math.floor(time / 3600 / 24)}d`;
  return ">7d"; // Also reached for missing values encoded as NaN: every comparison above is false for NaN
}

// Format a memory size given in MB as a short label.
// With include_mb = false, anything below 1e3 MB is shown as "<1G".
function label_memory(x_mb, include_mb = true) {
  if (!include_mb && x_mb < 1e3) return "<1G";
  if (x_mb < 1) return "<1M";
  if (x_mb < 1e3) return `${Math.round(x_mb)}M`;
  if (x_mb < 1e6) return `${Math.round(x_mb / 1e3)}G`;
  if (x_mb < 1e9) return `${Math.round(x_mb / 1e6)}T`;
  return ">1P";
}

// Mean of the entries that are not NaN.
function mean_na_rm(x) {
  return d3.mean(x.filter(d => !isNaN(d)));
}
GRN Inference
Benchmarking GRN inference methods
5 datasets · 10 methods · 3 control methods · 8 metrics
Info
Repository
Issues
dev
MIT
geneRNIB is a living benchmark platform for GRN inference. This platform provides curated datasets for GRN inference and evaluation, standardized evaluation protocols and metrics, computational infrastructure, and a dynamically updated leaderboard to track state-of-the-art methods. It runs novel GRNs in the cloud, offers competition scores, and stores them for future comparisons, reflecting new developments over time.
The platform supports the integration of new inference methods, datasets and protocols. When a new feature is added, previously evaluated GRNs are re-assessed, and the leaderboard is updated accordingly. The aim is to evaluate both the accuracy and completeness of inferred GRNs. It is designed for both single-modality and multi-omics GRN inference.
In the current version, geneRNIB contains 11 inference methods, covering both single-omics and multi-omics approaches, 8 evaluation metrics, and five datasets (OPSCA, Nakatake, Norman, Adamson, and Replogle).
See our publication for the details of methods.
Summary
Results
Results table of the scores per method, dataset and metric (after scaling). Use the filters to make a custom subselection of methods and datasets. The “Overall mean” dataset is the mean value across all datasets.
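As a rough illustration of how an "Overall mean" entry could be derived, the sketch below averages a method's scaled scores within each dataset and then across datasets. It assumes, following the page's `aggregate_scores` helper, that scores are clamped to [0, 1] and that missing values count as 0. The row shape (`method_id`, `dataset_id`, `score`) and the names `clampScore`, `mean`, and `overallMean` are hypothetical, and averaging per dataset first is itself an assumption.

```javascript
// Clamp a scaled score to [0, 1]; missing or NaN values count as 0
// (assumption borrowed from the page's aggregate_scores helper).
function clampScore(score) {
  if (typeof score !== "number" || Number.isNaN(score)) return 0;
  return Math.min(1, Math.max(0, score));
}

// Arithmetic mean of a non-empty array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// rows: [{ method_id, dataset_id, score }, ...] (hypothetical shape).
// Returns { method_id: { dataset_id: mean, ..., overall_mean: mean } }.
function overallMean(rows) {
  const byMethod = {};
  for (const { method_id, dataset_id, score } of rows) {
    if (!byMethod[method_id]) byMethod[method_id] = {};
    if (!byMethod[method_id][dataset_id]) byMethod[method_id][dataset_id] = [];
    byMethod[method_id][dataset_id].push(clampScore(score));
  }
  const result = {};
  for (const [method, datasets] of Object.entries(byMethod)) {
    const perDataset = {};
    for (const [dataset, scores] of Object.entries(datasets)) {
      perDataset[dataset] = mean(scores);
    }
    // The overall value is the mean of the per-dataset means.
    result[method] = { ...perDataset, overall_mean: mean(Object.values(perDataset)) };
  }
  return result;
}
```

Treating missing scores as 0 penalizes methods that did not run on a dataset, which matters here because several methods have results for only a few datasets.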
Dataset info
OPSCA
Data source · 19-02-2025 · 10.28 MiB
scRNA-seq data from PBMCs perturbed with chemical compounds (originally 146 perturbations). Multiome data are available for the control compound.
Novel single-cell perturbational dataset in human peripheral blood mononuclear cells (PBMCs). 144 compounds were selected from the Library of Integrated Network-Based Cellular Signatures (LINCS) Connectivity Map dataset (PMID: 29195078), and single-cell gene expression profiles were measured after 24 hours of treatment. The experiment was repeated in three healthy human donors, and the compounds were selected based on diverse transcriptional signatures observed in CD34+ hematopoietic stem cells (data not released). This experiment was performed in human PBMCs because the cells are commercially available with pre-obtained consent for public release, and because PBMCs are a primary, disease-relevant tissue that contains multiple mature cell types (including T-cells, B-cells, myeloid cells, and NK cells) with established markers for cell type annotation. To supplement this dataset, joint scRNA and single-cell chromatin accessibility measurements were taken from the baseline compound using the 10x Multiome assay.
Norman
Data source · 19-02-2025 · 10.28 MiB
Single-cell RNA-seq data with 231 perturbations (activation) on K562 cells.
How cellular and organismal complexity emerges from combinatorial expression of genes is a central question in biology. High-content phenotyping approaches such as Perturb-seq (single-cell RNA-seq pooled CRISPR screens) present an opportunity for exploring such genetic interactions (GIs) at scale. Here, we present an analytical framework for interpreting high-dimensional landscapes of cell states (manifolds) constructed from transcriptional phenotypes. We applied this approach to Perturb-seq profiling of strong GIs mined from a growth-based, gain-of-function GI map. Exploration of this manifold enabled ordering of regulatory pathways, principled classification of GIs (e.g. identifying suppressors), and mechanistic elucidation of synergistic interactions, including an unexpected synergy between CBL and CNN1 driving erythroid differentiation. Finally, we apply recommender system machine learning to predict interactions, facilitating exploration of vastly larger GI manifolds.
Adamson
Data source · 19-02-2025 · 10.28 MiB
Single-cell RNA-seq data with 82 perturbations (KD) on K562 cells.
Functional genomics efforts face tradeoffs between number of perturbations examined and complexity of phenotypes measured. We bridge this gap with Perturb-seq, which combines droplet-based single-cell RNA-seq with a strategy for barcoding CRISPR-mediated perturbations, allowing many perturbations to be profiled in pooled format. We applied Perturb-seq to dissect the mammalian unfolded protein response (UPR) using single and combinatorial CRISPR perturbations. Two genome-scale CRISPR interference (CRISPRi) screens identified genes whose repression perturbs ER homeostasis. Subjecting ∼100 hits to Perturb-seq enabled high-precision functional clustering of genes. Single-cell analyses decoupled the three UPR branches, revealed bifurcated UPR branch activation among cells subject to the same perturbation, and uncovered differential activation of the branches across hits, including an isolated feedback loop between the translocon and IRE1α. These studies provide insight into how the three sensors of ER homeostasis monitor distinct types of stress and highlight the ability of Perturb-seq to dissect complex cellular responses.
Replogle
Data source · 19-02-2025 · 10.28 MiB
Single-cell RNA-seq data with 9722 perturbations (KO) on K562 cells.
A central goal of genetics is to define the relationships between genotypes and phenotypes. High-content phenotypic screens such as Perturb-seq (CRISPR-based screens with single-cell RNA-sequencing readouts) enable massively parallel functional genomic mapping but, to date, have been used at limited scales. Here, we perform genome-scale Perturb-seq targeting all expressed genes with CRISPR interference (CRISPRi) across >2.5 million human cells. We use transcriptional phenotypes to predict the function of poorly characterized genes, uncovering new regulators of ribosome biogenesis (including CCDC86, ZNF236, and SPATA5L1), transcription (C7orf26), and mitochondrial respiration (TMEM242). In addition to assigning gene function, single-cell transcriptional phenotypes allow for in-depth dissection of complex cellular phenomena, from RNA processing to differentiation. We leverage this ability to systematically identify genetic drivers and consequences of aneuploidy and to discover an unanticipated layer of stress-specific regulation of the mitochondrial genome. Our information-rich genotype-phenotype map reveals a multidimensional portrait of gene and cellular function.
Nakatake
Data source · 19-02-2025 · 10.28 MiB
RNA-seq data with 463 perturbations (overexpression) on SEES3 cells.
Transcription factors (TFs) play a pivotal role in determining cell states, yet our understanding of the causative relationship between TFs and cell states is limited. Here, we systematically examine the state changes of human pluripotent embryonic stem cells (hESCs) by the large-scale manipulation of single TFs. We establish 2,135 hESC lines, representing three clones each of 714 doxycycline (Dox)-inducible genes including 481 TFs, and obtain 26,998 microscopic cell images and 2,174 transcriptome datasets, from RNA sequencing (RNA-seq) or microarrays, 48 h after the presence or absence of Dox. Interestingly, the expression of essentially all the genes, including genes located in heterochromatin regions, is perturbed by these TFs. TFs are also characterized by their ability to induce differentiation of hESCs into specific cell lineages. These analyses help to provide a way of classifying TFs and identifying specific sets of TFs for directing hESC differentiation into desired cell types.
Method info
portia
Documentation · Repository · Source Code · Container · dev
GRN inference using PORTIA.
ppcor
Documentation · Repository · Source Code · Container · dev
GRN inference using PPCOR.
scenic
Documentation · Repository · Source Code · Container · dev
GRN inference using the SCENIC pipeline.
scenicplus
Documentation · Repository · Source Code · Container · dev
GRN inference using SCENIC+.
scprint
Documentation · Repository · Source Code · Container · dev
GRN inference using scPRINT.
grnboost2
Documentation · Repository · Source Code · Container · dev
GRN inference using GRNBoost2.
scglue
Documentation · Repository · Source Code · Container · dev
GRN inference using scGLUE.
granie
Documentation · Repository · Source Code · Container · dev
GRN inference using GRaNIE.
figr
Documentation · Repository · Source Code · Container · dev
GRN inference using FigR.
celloracle
Documentation · Repository · Source Code · Container · dev
GRN inference using CellOracle.
Control method info
pearson_corr
Documentation · Repository · Source Code · Container · dev
Baseline based on correlation
Baseline GRN inference method using Pearson correlation.
Negative control
Documentation · Repository · Source Code · Container · dev
Source-target links based on random assignment
Randomly assigns regulatory links between TFs and targets drawn from a given TF list and target list. It is expected to perform close to random.
positive_control
Documentation · Repository · Source Code · Container · dev
Baseline based on correlation
Baseline model based on Pearson correlation that uses both the inference and evaluation datasets to infer the GRN.
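The two simplest control ideas described above, a correlation baseline and random link assignment, can be sketched as follows. This is a minimal illustration and not the platform's implementation: the input shape (`expression` as an object of per-gene sample arrays), the `topK` and `nLinks` cut-offs, the seeded generator, and all function names are assumptions introduced here.

```javascript
// Pearson correlation between two equally long numeric arrays.
function pearson(x, y) {
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - mx, dy = y[i] - my;
    sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
  }
  // Constant vectors have no defined correlation; report 0 for them.
  return sxx === 0 || syy === 0 ? 0 : sxy / Math.sqrt(sxx * syy);
}

// expression: { gene: [value per sample] } (hypothetical shape).
// Keeps the topK TF-target pairs with the largest absolute correlation.
function pearsonBaseline(expression, tfs, topK) {
  const links = [];
  for (const tf of tfs) {
    for (const target of Object.keys(expression)) {
      if (target === tf) continue;
      links.push({ source: tf, target, weight: pearson(expression[tf], expression[target]) });
    }
  }
  return links.sort((a, b) => Math.abs(b.weight) - Math.abs(a.weight)).slice(0, topK);
}

// Small seeded generator (mulberry32) so the random control is reproducible.
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draws up to nLinks distinct TF-target pairs uniformly at random.
function randomControl(tfs, targets, nLinks, seed = 0) {
  const rand = mulberry32(seed);
  const pairs = [];
  for (const tf of tfs) {
    for (const target of targets) {
      if (tf !== target) pairs.push({ source: tf, target });
    }
  }
  // Fisher-Yates shuffle, then keep the first nLinks pairs.
  for (let i = pairs.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [pairs[i], pairs[j]] = [pairs[j], pairs[i]];
  }
  return pairs.slice(0, nLinks);
}
```

A random control of this kind gives a floor against which inferred networks can be compared, while the correlation baseline shows how far a method improves on plain co-expression.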
Metric info
R1 (all)
Regression 1 score for all genes, with mean gene expression set for missing genes.
R1 (grn)
Regression 1 score for only the genes in the network.
R2 (precision)
Captures the performance for the top regulatory links.
R2 (balanced)
Balanced performance score considering both precision and recall.
R2 (recall)
Captures the performance for the broader set of regulatory links (recall).
WS (precision)
Captures the performance for the top regulatory links.
WS (balanced)
Balanced performance score considering both precision and recall.
WS (recall)
Captures the performance for the broader set of regulatory links (recall).
Quality control results
Category | Name | Value | Condition | Severity |
---|---|---|---|---|
Raw results | Method 'celloracle' %missing | 0.8750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Method 'figr' %missing | 0.8750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Method 'granie' %missing | 0.8750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Method 'scenicplus' %missing | 0.8750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Method 'scglue' %missing | 0.8750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Dataset 'nakatake' %missing | 0.6634615 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'ws-theta-0.0' %missing | 0.6461538 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'ws-theta-0.5' %missing | 0.6461538 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'ws-theta-1.0' %missing | 0.6461538 | pct_missing <= .1 | ✗✗✗ |
Raw results | Method 'scprint' %missing | 0.4750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Dataset 'adamson' %missing | 0.4615385 | pct_missing <= .1 | ✗✗✗ |
Raw results | Dataset 'norman' %missing | 0.3846154 | pct_missing <= .1 | ✗✗✗ |
Raw results | Dataset 'replogle' %missing | 0.3846154 | pct_missing <= .1 | ✗✗✗ |
Raw results | Dataset 'op' %missing | 0.3750000 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'r1_all' %missing | 0.3384615 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'r1_grn' %missing | 0.3384615 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'r2-theta-0.0' %missing | 0.3384615 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'r2-theta-0.5' %missing | 0.3384615 | pct_missing <= .1 | ✗✗✗ |
Raw results | Metric 'r2-theta-1.0' %missing | 0.3384615 | pct_missing <= .1 | ✗✗✗ |
Dataset info | Pct 'data_reference' missing | 1.0000000 | percent_missing(dataset_info, field) | ✗✗ |
Dataset info | Pct 'data_url' missing | 1.0000000 | percent_missing(dataset_info, field) | ✗✗ |
Dataset info | Pct 'task_id' missing | 1.0000000 | percent_missing(dataset_info, field) | ✗✗ |
Method info | Pct 'paper_reference' missing | 0.7692308 | percent_missing(method_info, field) | ✗✗ |
Metric info | Pct 'paper_reference' missing | 1.0000000 | percent_missing(metric_info, field) | ✗✗ |
Scaling | Worst score ppcor r1_all | -2.3365000 | worst_score >= -1 | ✗✗ |
Scaling | Best score grnboost2 r1_all | 4.1990000 | best_score <= 2 | ✗✗ |
Scaling | Best score portia r1_all | 3.3190000 | best_score <= 2 | ✗ |
Scaling | Best score grnboost2 r1_grn | 3.3110000 | best_score <= 2 | ✗ |
Raw results | Method 'grnboost2' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Raw results | Method 'negative_control' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Raw results | Method 'pearson_corr' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Raw results | Method 'portia' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Raw results | Method 'positive_control' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Raw results | Method 'ppcor' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Raw results | Method 'scenic' %missing | 0.1500000 | pct_missing <= .1 | ✗ |
Scaling | Best score ppcor r1_all | 2.8383000 | best_score <= 2 | ✗ |
Scaling | Worst score portia r2-theta-0.5 | -1.0274000 | worst_score >= -1 | ✗ |
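To make the `pct_missing` rows in the table concrete, the sketch below computes the fraction of expected dataset and metric combinations for which a method has no usable score. The row shape and the function names are hypothetical. The `severity` rule (whole multiples of the threshold, capped at 3) is only an assumption that happens to reproduce the marks on the "Raw results" and "Scaling" rows above; the page does not state how severity is assigned.

```javascript
// rows: observed results, one per (method_id, dataset_id, metric_id), each with
// a numeric score (hypothetical shape).
// Returns, per method, the fraction of expected combinations without a usable score.
function pctMissingByMethod(rows, methods, datasets, metrics) {
  const present = new Set(
    rows
      .filter(r => typeof r.score === "number" && !Number.isNaN(r.score))
      .map(r => `${r.method_id}|${r.dataset_id}|${r.metric_id}`)
  );
  const expected = datasets.length * metrics.length;
  const out = {};
  for (const method of methods) {
    let found = 0;
    for (const dataset of datasets) {
      for (const metric of metrics) {
        if (present.has(`${method}|${dataset}|${metric}`)) found++;
      }
    }
    out[method] = 1 - found / expected;
  }
  return out;
}

// Assumed severity rule: number of whole multiples of the threshold, capped at 3.
// For a negative threshold (e.g. worst_score >= -1) the ratio is still positive
// whenever the value lies beyond the threshold.
function severity(value, threshold) {
  return Math.max(0, Math.min(3, Math.floor(value / threshold)));
}
```

Under this reading, a value of 0.8750000 with 5 datasets and 8 metrics would correspond to 35 of 40 combinations missing, and `severity(0.875, 0.1)` gives 3, matching the ✗✗✗ mark.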