Reads the output of the read_interpro or mapping_ko functions and extracts the IDs that are unique to a metadata group. The function identifies the IDs that are present only in one particular metadata group (e.g. KOs present only in environment "a" and absent from all other environments) and returns them as a tibble object.
get_subset_unique(tibble_rbims, data_experiment, experiment_col,
experiment_col_element, analysis=c("KEGG", "PFAM", "INTERPRO"))
tibble_rbims: a tibble object created with the read_interpro or mapping_ko functions.
data_experiment: a data frame object that contains the metadata (e.g. taxonomy, sampling site).
experiment_col: the name of a metadata column. This column is used for subsetting.
experiment_col_element: a string giving the metadata value of interest. It must be one of the values found in the experiment_col column.
analysis: a character string indicating which input the unique abundance profile is extracted from. Valid options are "KEGG", "PFAM" or "INTERPRO".
This function is part of a package used for the analysis of the metabolism of bins.
get_subset_unique(ko_bin_mapp, metadata, Sample_site,
"Water_column", analysis="KEGG")
#> # A tibble: 2,553 × 19
#> Module Module_description Pathway Pathway_description Cycle Pathway_cycle
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 NA NA map00740 Riboflavin metabolism NA NA
#> 2 NA NA map01100 Metabolic pathways NA NA
#> 3 NA NA map01110 Biosynthesis of secon… NA NA
#> 4 NA NA map00740 Riboflavin metabolism NA NA
#> 5 NA NA map01100 Metabolic pathways NA NA
#> 6 NA NA NA NA NA NA
#> 7 NA NA map03050 Proteasome NA NA
#> 8 NA NA map03008 Ribosome biogenesis i… NA NA
#> 9 NA NA map00760 Nicotinate and nicoti… NA NA
#> 10 NA NA map01100 Metabolic pathways NA NA
#> # ℹ 2,543 more rows
#> # ℹ 13 more variables: Detail_cycle <chr>, Genes <chr>, Gene_description <chr>,
#> # Enzyme <chr>, KO <chr>, rbims_pathway <chr>, rbims_sub_pathway <chr>,
#> # Bin_10 <int>, Bin_12 <int>, Bin_56 <int>, Bin_113 <int>, Bin_1 <int>,
#> # Bin_2 <int>
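
To make the notion of "IDs only present in one metadata group" concrete, the selection rule can be sketched in base R on toy data. This is only an illustration of the idea under stated assumptions, not the package's implementation: the objects `profile` and `metadata`, the `Bin_name` column, and all values are hypothetical, and returning only the in-group bin columns is a choice made for this sketch.

```r
# Toy abundance profile: one row per KO, one column per bin (hypothetical values).
profile <- data.frame(
  KO    = c("K00001", "K00002", "K00003", "K00004"),
  Bin_1 = c(2L, 0L, 1L, 0L),
  Bin_2 = c(1L, 0L, 0L, 0L),
  Bin_3 = c(0L, 3L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Toy metadata: the sampling site of each bin (hypothetical column names).
metadata <- data.frame(
  Bin_name    = c("Bin_1", "Bin_2", "Bin_3"),
  Sample_site = c("Water_column", "Water_column", "Sediment"),
  stringsAsFactors = FALSE
)

# Split the bins into the group of interest and all remaining bins.
in_group  <- metadata$Bin_name[metadata$Sample_site == "Water_column"]
out_group <- setdiff(metadata$Bin_name, in_group)

# A KO is unique to the group if it occurs in at least one in-group bin
# and in none of the out-group bins.
present_in <- rowSums(profile[, in_group, drop = FALSE] > 0) > 0
absent_out <- rowSums(profile[, out_group, drop = FALSE] > 0) == 0

# Keep the matching KOs together with the in-group bin columns only.
unique_kos <- profile[present_in & absent_out, c("KO", in_group), drop = FALSE]

print(unique_kos)
```

In this toy data only K00001 is retained: K00002 occurs only in the sediment bin, K00003 occurs in both groups, and K00004 is absent everywhere.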