Writes the input data files for DEGAS model training. The function converts the supplied expression and label objects to CSV using data.table's fwrite for fast output, and aborts with an informative error message if any file cannot be written.

Usage

writeInputFiles.optimized(scExp, scLab = NULL, patExp, patLab = NULL, tmpDir)

Arguments

scExp

A matrix, data frame, or Matrix object containing single-cell expression data. Rows typically represent genes and columns represent cells.

scLab

A matrix or data frame containing single-cell labels corresponding to the expression data. Can be NULL if no labels are available.

patExp

A data frame or Matrix object containing patient-level expression data. Rows typically represent genes and columns represent patients.

patLab

A matrix or data frame containing patient-level labels corresponding to the patient expression data. Can be NULL if no labels are available.

tmpDir

Character string specifying the directory path where input files will be written. The directory will be created if it doesn't exist.

Value

Invisibly returns TRUE if all files are successfully written. If any error occurs during file writing, the function will abort with an informative error message.
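
The abort-on-failure behaviour described above could be built along the following lines. This is a minimal sketch, not the package's actual code: it assumes the purrr and data.table packages are installed, and `safe_fwrite` and `write_or_abort` are hypothetical names introduced only for illustration.

```r
library(purrr)
library(data.table)

# Wrap fwrite() so that a failure is captured instead of thrown immediately.
# safely() returns a list with elements `result` and `error`.
safe_fwrite <- purrr::safely(data.table::fwrite)

# Hypothetical helper (not part of DEGAS): write one table, abort with
# context on failure, and return TRUE invisibly on success.
write_or_abort <- function(x, path) {
  res <- safe_fwrite(x, file = path)
  if (!is.null(res$error)) {
    stop("Failed to write ", path, ": ", conditionMessage(res$error),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Success path: the target directory exists.
ok <- write_or_abort(data.frame(a = 1:2), file.path(tempdir(), "demo.csv"))

# Failure path: the parent directory does not exist, so fwrite() errors.
bad_path <- file.path(tempdir(), "no_such_dir", "demo.csv")
err <- tryCatch(
  write_or_abort(data.frame(a = 1:2), bad_path),
  error = function(e) e
)
```

Capturing the error first and re-raising it with the file path attached is one way to make the message informative; the real function may format its messages differently.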

Details

This function provides an optimized pipeline for writing input files required by DEGAS models with the following features:

File Output:

The function creates up to four CSV files in the specified directory:

  • scExp.csv: Single-cell expression data

  • scLab.csv: Single-cell labels (if provided)

  • patExp.csv: Patient-level expression data

  • patLab.csv: Patient-level labels (if provided)

Note

The function uses comma-separated values (CSV) format without row names to ensure compatibility with Python-based DEGAS training scripts. All input data is converted to dense format during writing, so ensure sufficient memory is available for large datasets.
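
To make the memory note concrete, here is a minimal sketch of densifying a sparse matrix and writing it without row names. It is an illustration under stated assumptions, not the function's internal code: it assumes the Matrix and data.table packages are installed, `write_dense_csv` is a hypothetical helper, and whether the real function writes a header row is not stated in this page.

```r
library(Matrix)
library(data.table)

# Hypothetical helper (not part of DEGAS): densify the input and write it
# as CSV without row names.
write_dense_csv <- function(x, path) {
  # as.matrix() expands a sparse Matrix into a dense base matrix; this is
  # the step that can exhaust memory for large datasets.
  dense <- as.matrix(x)
  # as.data.table() drops row names by default, so only the column header
  # and the values reach the file. This sketch keeps the header row.
  data.table::fwrite(data.table::as.data.table(dense), file = path)
  invisible(path)
}

# Toy 3-gene x 2-cell sparse matrix.
sc <- Matrix::sparseMatrix(
  i = c(1, 3), j = c(1, 2), x = c(5, 2),
  dims = c(3, 2),
  dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2"))
)

out <- file.path(tempdir(), "scExp.csv")
write_dense_csv(sc, out)

# Read the file back to confirm its shape and that gene names were dropped.
back <- data.table::fread(out)
```

Because gene names are not written, any downstream consumer has to rely on row order, so the expression and label inputs should be ordered consistently before writing.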

References

Johnson TS, Yu CY, Huang Z, Xu S, Wang T, Dong C, et al. Diagnostic Evidence GAuge of Single cells (DEGAS): a flexible deep transfer learning framework for prioritizing cells in relation to disease. Genome Med. 2022 Feb 1;14(1):11.

See also

data.table::fwrite() for the underlying fast writing implementation, purrr::safely() for the error handling mechanism.

Other DEGAS: DoDEGAS(), LabelBinaryCells(), LabelContinuousCells(), LabelSurvivalCells(), Vec2sparse(), predClassBag.optimized(), readOutputFiles.optimized(), runCCMTL.optimized(), runCCMTLBag.optimized()

Examples

if (FALSE) { # \dontrun{
# Write input files for DEGAS training
writeInputFiles.optimized(
  scExp = single_cell_expression,
  scLab = single_cell_labels,
  patExp = patient_expression,
  patLab = patient_labels,
  tmpDir = "/tmp/degas_input"
)
} # }