Integrates matched bulk expression data and phenotype information to identify phenotype-associated cell populations in single-cell RNA-seq data using one of four computational methods. Ensures consistency between bulk and phenotype data before analysis.
Arguments
- matched_bulk
Matrix or data frame of preprocessed bulk RNA-seq expression data (genes x samples). Column names must match names/IDs in phenotype.
- sc_data
A Seurat object containing scRNA-seq data to be screened.
- phenotype
Phenotype data, either:
  - Named vector (names match matched_bulk columns), or
  - Data frame with row names matching matched_bulk columns
- label_type
Character specifying phenotype label type (e.g., "SBS1", "time")
- phenotype_class
Type of phenotypic outcome (must be consistent with input data):
  - "binary": Binary traits (e.g., case/control)
  - "continuous": Continuous measurements (only for Scissor, scPAS, scPP)
  - "survival": Survival objects
- screen_method
Screening algorithm to use. There are four options:
  - "Scissor": see also DoScissor()
  - "scPP": see also DoscPP()
  - "scPAS": see also DoscPAS()
  - "scAB": see also DoscAB(); no continuous support
- ...
Additional method-specific parameters:
- Scissor
- alpha
(numeric or NULL) Significance threshold. When NULL, alpha will keep increasing iteratively until the corresponding cells are screened out, default 0.05
- cutoff
(numeric) Threshold for terminating the alpha iteration; only used when alpha is NULL, default 0.2
- path2load_scissor_cache
(character) Path from which previously saved intermediary data is loaded, default NULL
- path2save_scissor_inputs
(character) Path where the intermediary data is saved. The saved data can later be loaded from this path through path2load_scissor_cache. default "Scissor_inputs.RData"
- nfold
(integer) Cross-validation folds for reliability test, default 10
- reliability_test
(logical) Whether to perform reliability test, default FALSE
- scPP
- ref_group
(integer or character) Reference group or baseline for binary comparisons, e.g. "Normal" for Tumor/Normal studies and 0 for 0/1 case-control studies. default: 0
- Log2FC_cutoff
(numeric) Minimum log2 fold-change for binary markers, default 0.585
- estimate_cutoff
(numeric) Effect size threshold for continuous traits, default 0.2
- probs
(numeric) Quantile cutoff for cell classification, default 0.2
- scPAS
- assay
(character) Assay to use from sc_data, default "RNA"
- imputation
(logical) Whether to perform imputation, default FALSE
- nfeature
(integer) Number of features to select, default 3000
- alpha
(numeric or NULL) Significance threshold. When NULL, alpha will keep increasing iteratively until the corresponding cells are screened out, default 0.01
- independent
(logical) The background distribution of risk scores is constructed independently of each cell. default: TRUE
- network_class
(character) Network class to use. default: 'SC', indicating gene-gene similarity networks derived from single-cell data. The alternative is 'bulk'.
- permutation_times
(integer) Number of permutations, default 2000
- FDR_threshold
(numeric) FDR threshold for identifying phenotype-associated cells, default 0.05
- scAB
- alpha
(numeric) Coefficient of phenotype regularization, default 0.005
- alpha_2
(numeric) Coefficient of cell-cell similarity regularization, default 5e-05
- maxiter
(integer) NMF optimization iterations, default 2000
- tred
(integer) Z-score threshold, default 2
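The method-specific parameters passed through `...` can be thought of as overrides on the documented defaults. The following is a minimal sketch of that idea for the Scissor parameters, not the package's own implementation: `scissor_defaults` and `resolve_scissor_args()` are hypothetical names introduced only for illustration, and the default values are copied from the list above.

```r
# Documented Scissor defaults, copied from the argument list above.
scissor_defaults <- list(
  alpha = 0.05,
  cutoff = 0.2,
  path2load_scissor_cache = NULL,
  path2save_scissor_inputs = "Scissor_inputs.RData",
  nfold = 10,
  reliability_test = FALSE
)

# Hypothetical helper (not part of the package): overlay user-supplied
# values on the defaults and reject names that are not documented.
resolve_scissor_args <- function(...) {
  user_args <- list(...)
  unknown <- setdiff(names(user_args), names(scissor_defaults))
  if (length(unknown) > 0) {
    stop("Unknown Scissor parameter(s): ", paste(unknown, collapse = ", "))
  }
  # keep.null = TRUE keeps entries set to NULL instead of dropping them,
  # which matters because alpha = NULL is a meaningful setting.
  modifyList(scissor_defaults, user_args, keep.null = TRUE)
}

scissor_args <- resolve_scissor_args(alpha = NULL, nfold = 5)
```

Without `keep.null = TRUE`, `modifyList()` would remove `alpha` from the result rather than store `NULL`, so the "search over alpha" setting would be lost.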
Value
A list containing:
- scRNA_data
Filtered Seurat object with phenotype-associated cells
- Some screen_result
Important information about the screened result related to the selected method
Data Matching Requirements
- matched_bulk column names and phenotype names/rownames must be identical
- Phenotype values must correspond to bulk samples (not directly to single cells)
- A built-in pre-run check triggers an error on any mismatch before analysis begins
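The matching requirement can be verified by hand before calling the function. The sketch below uses base R only and toy data; `phenotype_ids()` and `check_sample_match()` are hypothetical helpers, not the package's built-in check. It reads "identical" strictly (same sample IDs in the same order); whether the package tolerates reordering is not stated here.

```r
# Toy bulk matrix: 3 genes x 4 samples (values are made up for illustration).
matched_bulk <- matrix(
  c(5.1, 3.2, 0.0, 4.8, 2.9, 0.3, 1.2, 6.4, 2.2, 0.9, 7.1, 2.5),
  nrow = 3,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("S1", "S2", "S3", "S4"))
)

# Binary phenotype as a named vector; the names are bulk sample IDs.
phenotype <- c(S1 = 0, S2 = 0, S3 = 1, S4 = 1)

# Hypothetical helper: sample IDs for either accepted phenotype form.
phenotype_ids <- function(phenotype) {
  if (is.data.frame(phenotype)) rownames(phenotype) else names(phenotype)
}

# Hypothetical helper: stop early if bulk and phenotype samples differ.
check_sample_match <- function(matched_bulk, phenotype) {
  bulk_ids <- colnames(matched_bulk)
  pheno_ids <- phenotype_ids(phenotype)
  if (is.null(bulk_ids) || is.null(pheno_ids)) {
    stop("Both matched_bulk columns and phenotype entries must be named.")
  }
  if (!identical(bulk_ids, pheno_ids)) {
    stop("Sample IDs differ between matched_bulk and phenotype.")
  }
  invisible(TRUE)
}

check_sample_match(matched_bulk, phenotype)
```

The same helper accepts a data frame whose row names are the sample IDs, since `phenotype_ids()` switches between `names()` and `rownames()`.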
Method Compatibility
| Method | Supported Phenotypes | Additional Parameters |
| --- | --- | --- |
| Scissor | All three types | alpha, cutoff, path2load_scissor_cache, path2save_scissor_inputs, nfold, reliability_test, reliability_test_n |
| scPP | All three types | ref_group, Log2FC_cutoff, estimate_cutoff, probs |
| scPAS | All three types | n_components, assay, imputation, nfeature, alpha, network_class, permutation_times, FDR_threshold, independent |
| scAB | Binary/Survival | alpha, alpha_2, maxiter, tred |
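The compatibility table can be encoded as a small lookup to validate a method/phenotype combination before running anything expensive. This is an illustrative sketch in base R; `supported_phenotypes` and `is_supported()` are hypothetical names, and the contents simply restate the table above.

```r
# Phenotype classes supported by each screening method, per the table above.
supported_phenotypes <- list(
  Scissor = c("binary", "continuous", "survival"),
  scPP    = c("binary", "continuous", "survival"),
  scPAS   = c("binary", "continuous", "survival"),
  scAB    = c("binary", "survival")
)

# Hypothetical helper: TRUE if the method supports the phenotype class.
is_supported <- function(screen_method, phenotype_class) {
  if (!screen_method %in% names(supported_phenotypes)) {
    stop("Unknown screen_method: ", screen_method)
  }
  phenotype_class %in% supported_phenotypes[[screen_method]]
}

is_supported("scAB", "continuous")  # FALSE: scAB has no continuous support
is_supported("scPP", "survival")    # TRUE
```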