These functions collapse duplicated row names (e.g., gene symbols) or column names (e.g., sample IDs) in matrix-like objects by aggregating values using configurable methods. They support:
- Rows: AggregateDupRows merges rows sharing the same row name.
- Columns: AggregateDupCols merges columns sharing the same column name.
- Both: AggregateDups is a convenience wrapper applying row-then-column aggregation.
Designed for expression matrices, count tables, or any numeric data where feature/sample duplication occurs.
Accepts matrix, data.frame, and S4 Matrix inputs (e.g., dgCMatrix).
Convenience wrapper that first aggregates duplicated rows, then duplicated columns. Useful for cleaning matrices where both feature and sample duplication may occur.
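To make the idea of collapsing duplicated row names concrete, here is a minimal base R sketch. The helper `collapse_dup_rows` is hypothetical and written only for illustration; it is not the package's implementation and ignores data.frame and S4 Matrix inputs.

```r
# Hypothetical helper (illustration only): collapse rows that share a name,
# applying `fun` within each column across the duplicated rows.
collapse_dup_rows <- function(x, fun = max) {
  keys <- rownames(x)
  uniq <- unique(keys)  # unique() keeps first-occurrence order
  rows <- lapply(uniq, function(k) {
    apply(x[keys == k, , drop = FALSE], 2, fun)
  })
  out <- do.call(rbind, rows)
  dimnames(out) <- list(uniq, colnames(x))
  out
}

# Toy expression matrix with a duplicated gene symbol.
m <- matrix(
  c(1, 5, 3,   # column s1
    4, 2, 6),  # column s2
  nrow = 3,
  dimnames = list(c("TP53", "TP53", "MYC"), c("s1", "s2"))
)

collapsed <- collapse_dup_rows(m, fun = max)
print(collapsed)
#      s1 s2
# TP53  5  4
# MYC   3  6
```

Each output row holds the per-column maximum of the input rows that shared its name, so the two TP53 rows become one.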
Usage
AggregateDupRows(
  x,
  method = c("max", "sum", "mean", "median", "first"),
  verbose = TRUE,
  ...
)

AggregateDupCols(
  x,
  method = c("max", "sum", "mean", "median", "first"),
  verbose = TRUE,
  ...
)

AggregateDups(
  x,
  method = c("max", "sum", "mean", "median", "first"),
  row_method = NULL,
  col_method = NULL,
  verbose = TRUE,
  ...
)

Methods
Supported methods (applied column-wise for rows, row-wise for columns):
- "max": Maximum value per group (default).
- "sum": Sum of values per group.
- "mean": Arithmetic mean (uses na.rm = TRUE).
- "median": Median value.
- "first": First occurrence in original order.
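The difference between the methods is easiest to see on a single group of values. The following base R sketch applies each one to the same vector; the `methods` list is a stand-in built from base functions, not the package's internal dispatch, and only "mean" is given `na.rm = TRUE` because that is the only NA behaviour stated above.

```r
# Values of one column across three rows that share a row name.
group <- c(4, 10, 1)

# Stand-in implementations built from base R (an assumption for illustration).
methods <- list(
  max    = max,
  sum    = sum,
  mean   = function(v) mean(v, na.rm = TRUE),
  median = median,
  first  = function(v) v[1]
)

results <- vapply(methods, function(f) f(group), numeric(1))
print(results)
#    max    sum   mean median  first
#     10     15      5      4      4
```

Note that "median" and "first" happen to agree here only because the first value is also the middle one.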
Input Types and Return Types
| Input class | Output class (unless noted) |
| --- | --- |
| matrix | matrix |
| data.frame | data.frame |
| S4 Matrix | matrix (dense); S4 attributes dropped for generality |
Row/column order in output follows first occurrence of each unique name in rownames(x) / colnames(x).
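The row-then-column order and the first-occurrence ordering can be sketched for the "sum" case with base R's `rowsum()`, whose `reorder = FALSE` argument keeps groups in the order they are first encountered. This is an illustration under that assumption, not the package's code, and it covers plain matrices only.

```r
# Toy matrix with duplicated row and column names, deliberately unsorted.
m <- matrix(
  1:9,
  nrow = 3,
  dimnames = list(c("g2", "g1", "g2"), c("sB", "sA", "sB"))
)

# Step 1: sum rows that share a name, keeping first-occurrence order.
rows_done <- rowsum(m, group = rownames(m), reorder = FALSE)

# Step 2: sum columns that share a name by transposing, grouping, and
# transposing back.
both_done <- t(rowsum(t(rows_done), group = colnames(rows_done), reorder = FALSE))

print(both_done)
#    sB sA
# g2 20 10
# g1 10  5
```

The output keeps g2 before g1 and sB before sA, matching their first appearance in the input rather than alphabetical order.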