This guide explains how to extend SigBridgeR with custom algorithms.
Installation
We recommend installing these packages for checking the code:
pak::pkg_install(c(
  "tictoc",
  "yonicd/tidycheckUsage",
  "codetools",
  "knitr",
  "lintr"
))
Prepare a custom function
As of v3.2.0, SigBridgeR supports registering custom algorithms for screening phenotype-associated cells. Let’s walk through a detailed example:
my_screen_function <- function(
  matched_bulk,
  sc_data,
  phenotype,
  label_type = NULL,
  phenotype_class = c("binary", "survival", "continuous"),
  ...
) {
  dots <- list(...)
  verbose <- dots$verbose %||% TRUE
  # do something; as a placeholder, label the first half of the cells
  # "Positive" and the remaining cells "Negative"
  modified_sc_data <- SeuratObject::AddMetaData(
    sc_data,
    c(
      rep("Positive", floor(ncol(sc_data) / 2)),
      rep("Negative", ceiling(ncol(sc_data) / 2))
    ),
    col.name = "my_method"
  ) %>%
    # record parameters
    AddMisc(
      my_method_label = label_type,
      phenotype = phenotype_class
    )
  intermediate_var <- "value"
  if (verbose) {
    cli::cli_alert_success("my_screen_function finished")
  }
  list(
    scRNA_data = modified_sc_data,
    intermediate_var = intermediate_var
  )
}
Format requirements for custom extension functions:
- The input arguments must include:
  - sc_data (required): A Seurat object.
  - matched_bulk (required): A data.frame/matrix/Matrix (genes × samples) containing RNA-seq counts:
    - Genes must overlap with those in sc_data.
    - Samples must correspond to those in phenotype.
  - phenotype (required):
    - For survival: a data.frame with columns time and status; rownames must match colnames(matched_bulk).
    - For binary or continuous: a named vector; names must match colnames(matched_bulk).
  - label_type (required): A character string used to label cells with study cases.
  - phenotype_class (required): One or more of "binary", "survival", "continuous".
- The output must be a list containing at least one element:
  - scRNA_data (required): A Seurat object with modified meta.data.
  - Other elements are optional.
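The input requirements above can also be checked defensively at the top of a custom function. The following is a minimal sketch in base R and is not part of SigBridgeR: the helper name check_screen_inputs and its sc_genes argument (the gene names of the single-cell object, for example rownames(sc_data)) are assumptions made for illustration.

```r
# Hypothetical helper (not a SigBridgeR function): checks that the bulk
# matrix, the phenotype and the single-cell gene names are consistent.
check_screen_inputs <- function(matched_bulk, phenotype, phenotype_class, sc_genes) {
  phenotype_class <- match.arg(
    phenotype_class,
    c("binary", "survival", "continuous")
  )
  # Sample identifiers come from rownames (survival) or names (otherwise)
  if (phenotype_class == "survival") {
    stopifnot(
      is.data.frame(phenotype),
      all(c("time", "status") %in% colnames(phenotype))
    )
    pheno_ids <- rownames(phenotype)
  } else {
    stopifnot(is.atomic(phenotype), !is.null(names(phenotype)))
    pheno_ids <- names(phenotype)
  }
  # Samples in the bulk data must correspond to those in the phenotype
  if (!setequal(colnames(matched_bulk), pheno_ids)) {
    stop("Samples in `matched_bulk` and `phenotype` do not match")
  }
  # At least some genes must be shared between bulk and single-cell data
  shared <- intersect(rownames(matched_bulk), sc_genes)
  if (length(shared) == 0L) {
    stop("No genes shared between `matched_bulk` and the single-cell data")
  }
  invisible(shared)
}

# Toy inputs: 3 genes x 2 samples, binary phenotype named by sample
bulk <- matrix(
  1:6,
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), c("s1", "s2"))
)
pheno <- c(s1 = 0, s2 = 1)
shared_genes <- check_screen_inputs(
  bulk,
  pheno,
  "binary",
  sc_genes = c("gene2", "gene3", "gene9")
)
```

The helper stops with an error on the first violated requirement and otherwise returns the shared gene names invisibly, so a custom function could reuse them to subset its inputs.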
To facilitate format validation, we provide a function
ValidateScreenFunc to check the above requirements. The
validation output resembles that of rcmdcheck; please
ensure there are no errors or warnings, and as few notes as
possible.
ValidateScreenFunc(my_screen_function)
# ── Screening Function Validation ──────────────────────────────────────────────────────────────────────
# Start at 2026/01/19 22:04:17
# ✔ All input arguments explicitly specified
# ✔ Verbose control supported
# ✔ Syntax check passed
# ✔ Return value is a list with `scRNA_data` slot
# Duration: 0.095 sec elapsed
# 0 error ✔ | 0 warning ✔ | 0 note ✔
By contrast, a function that does not meet the requirements is reported like this:
bad_fun <- function(x) {
  z <- x + 1
  y
  return(NULL)
}
ValidateScreenFunc(bad_fun)
# ── Screening Function Validation ───────────────────────────────────────────────────────────────────
# Start at 2026/01/19 22:07:43
# ❯ Missing required arguments ... ERROR
# More arguments should be added:
# sc_data: A fully preprocessed Seurat object
# matched_bulk: Bulk RNA-seq matrix (gene * samples):
# • Samples match `phenotype` in number/order;
# • Senes overlap with `sc_data`
# phenotype: Phenotype: a named vector or data.frame, names/rownames match `matched_bulk`
# • For binary/continuous: named vector recommended
# • For survival: data.frame with `time` (1st col) and `status` (2nd col) recommended.
# label_type: Labeling phenotype-associated cell with real study identifiers
# phenotype_class: Phenotype types:
# • Must be one or more of `binary`, `continuous` and `survival`
# ❯ Verbose control not supported ... NOTE
# Consider adding `verbose` control to ease error tracing
# ❯ Syntax error in function ... ERROR
# | line | object | col1 | col2 | warning_type | warning |
# |:----:|:------:|:----:|:----:|:-----------------:|:-----------------------------------------------:|
# | 2 | z | 3 | 3 | unused_local | local variable ‘z’ assigned but may not be used |
# | 3 | y | 3 | 3 | no_global_binding | no visible binding for global variable ‘y’ |
# ❯ Return value is not a list ... ERROR
# scRNA_data: recommended to be the first element of the return value
# • Should be of class <Seurat>
# Duration: 0.158 sec elapsed
# 3 errors ✖ | 0 warning ✔ | 1 note ✖
Registering the function
Now we can register the function with the package:
RegisterScreenMethod(
  my_method = my_screen_function,
  supported_phenotypes = c("binary", "survival"),
  parameter_mapper = function(params) {
    params$a <- params$a %||% 123
    params
  },
  registry = ScreenStrategy,
  verbose = TRUE
)
# ✔ Registered `my_method`
Details of the arguments:
- my_method = my_screen_function: Given in key = func form. The key names the method; if no key is provided, the function name is used.
- supported_phenotypes: The phenotype types supported by the function.
- parameter_mapper: A function that transforms the input parameter list before passing it to the executor. Useful for adapting parameters passed from the interface function. Receives a named list and must return a modified list.
- registry: The registry in which to register the function.
- verbose: Whether to print messages.
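To illustrate the parameter_mapper contract described above (a named list in, a modified list out), here is a minimal standalone sketch in base R. The mapper name my_mapper and the parameter names n_perm and nperm are made up for illustration and are not SigBridgeR arguments; only the default for a mirrors the registration example.

```r
# Hypothetical mapper: `n_perm` and `nperm` are made-up parameter names.
my_mapper <- function(params) {
  # Fill in a default when the caller did not supply the parameter
  if (is.null(params$a)) {
    params$a <- 123
  }
  # Rename an interface-level argument to the name the method expects
  if (!is.null(params$n_perm)) {
    params$nperm <- params$n_perm
    params$n_perm <- NULL
  }
  # Always return the (modified) list
  params
}

# Calling the mapper directly shows the transformation it applies
mapped <- my_mapper(list(n_perm = 10, verbose = FALSE))
names(mapped)
```

Because the mapper is an ordinary function, it can be tested on plain lists like this before it is passed to RegisterScreenMethod.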
Let’s check whether it has indeed been registered:
GetExistingStrategy()
# [1] "Scissor" "scPP" "LP_SGL" "my_method" "PIPET" "DEGAS" "scAB" "scPAS"
Use the function
Now we can call the registered method through the Screen function:
a_seurat <- SeuratObject::CreateSeuratObject(Matrix::Matrix(
  1:100,
  nrow = 10,
  dimnames = list(paste0("gene", 1:10), paste0("cell", 1:10))
))
bulk <- matrix(
  1:100,
  nrow = 10,
  dimnames = list(paste0("gene", 1:10), paste0("sample", 1:10))
)
pheno <- setNames(sample(0:1, 10, TRUE), paste0("sample", 1:10))
my_res <- Screen(
  sc_data = a_seurat,
  matched_bulk = bulk,
  phenotype = pheno,
  label_type = "Test",
  phenotype_class = "binary",
  screen_method = "my_method",
  a = 123
)
# ✔ my_screen_function finished
my_res$scRNA_data |> class()
# [1] "Seurat"
# attr(,"package")
# [1] "SeuratObject"
If you still have questions, please use GitHub Issues or Discussions.