Identifies the features that best discriminate between groups of MAGs by combining clustering, indicator analysis, differential testing, and Random Forest importance in a single workflow.

get_discriminant_features(
  tibble_profile,
  analysis,
  feature_col,
  metadata,
  group_col = NULL,
  norm = "hellinger",
  min_presence = 2
)

Arguments

tibble_profile

Wide profile table with features as rows and MAGs as columns

analysis

Character string naming the annotation type: one of "KEGG", "Pfam", "INTERPRO", "dbCAN", or "MEROPS"

feature_col

Name of the column that contains the feature IDs

metadata

MAG-level metadata table whose row names are the MAG identifiers

group_col

Name of the metadata column whose groups are to be discriminated (e.g. "Depth", "Class", "Phylum"). Defaults to NULL.

norm

Normalization method: "hellinger" (default) or "clr"

min_presence

Minimum number of MAGs in which a feature must appear to be retained (default 2)
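
The min_presence and norm arguments can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the toy count matrix, the feature and MAG names, the choice to normalize within each MAG (column), and the pseudocount of 1 used before the log transform are all assumptions made for illustration.

```r
# Toy count matrix: 4 features (rows) x 3 MAGs (columns); values are invented.
counts <- matrix(
  c(4, 0, 1,
    0, 0, 3,
    2, 5, 0,
    1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("K00001", "K00002", "K00003", "K00004"),
    c("MAG_1", "MAG_2", "MAG_3")
  )
)

# Prevalence filter: keep features present (count > 0) in at least
# min_presence MAGs.
min_presence <- 2
keep <- rowSums(counts > 0) >= min_presence
filtered <- counts[keep, , drop = FALSE]

# Hellinger transform: square root of the relative abundance,
# computed within each MAG (assumed orientation).
hellinger <- sqrt(sweep(filtered, 2, colSums(filtered), "/"))

# Centered log-ratio (CLR) transform: log counts minus the mean log count
# within each MAG. The pseudocount is an assumption needed to handle zeros.
pseudocount <- 1
log_counts <- log(filtered + pseudocount)
clr <- sweep(log_counts, 2, colMeans(log_counts), "-")
```

In this sketch K00002 is dropped because it occurs in only one MAG. After the Hellinger transform the squared values of each MAG sum to 1, and after the CLR transform the values of each MAG sum to 0, which are quick sanity checks for either matrix.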

Value

A list containing the matrices (counts, hellinger, clr), clustering results, indicator results, differential tests, Random Forest importance, and the consensus ranking.