Reads the output of the read_interpro or mapping_ko functions and extracts the IDs that are unique to one metadata group. An ID is kept only if it is present in the chosen group and absent from all the others (e.g. KOs present in environment "a" and absent from the rest of the environments). Returns a tibble object.

get_subset_unique(tibble_rbims, data_experiment, experiment_col, 
experiment_col_element, analysis=c("KEGG", "PFAM", "INTERPRO"))

Arguments

tibble_rbims

a tibble object, created with the read_interpro or mapping_ko functions.

data_experiment

a data frame object that contains the metadata (e.g. taxonomy, sampling site).

experiment_col

the name of the metadata column used for subsetting.

experiment_col_element

a string giving the metadata value of interest. It is one of the values found in the experiment_col column.

analysis

a character string indicating which input the unique abundance profile is taken from. Valid options are "KEGG", "PFAM" or "INTERPRO".

Details

This function is part of a package used to analyze the metabolism of bins.
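
The idea of an ID being "unique" to one metadata group can be illustrated with a minimal base R sketch. This is not the rbims implementation: the toy tables ko_long and metadata, their column names, and the values in them are hypothetical and serve only to show the set logic (present in the chosen group, absent from every other group).

```r
# Toy long-format table: one row per (bin, KO) observation.
# All object and column names here are hypothetical, not part of rbims.
ko_long <- data.frame(
  Bin_name = c("Bin_1", "Bin_1", "Bin_2", "Bin_2", "Bin_3"),
  KO       = c("K00001", "K00002", "K00002", "K00003", "K00001"),
  stringsAsFactors = FALSE
)

# Toy metadata: one row per bin with its sampling site.
metadata <- data.frame(
  Bin_name    = c("Bin_1", "Bin_2", "Bin_3"),
  Sample_site = c("Water_column", "Sediment", "Water_column"),
  stringsAsFactors = FALSE
)

# Attach the metadata group to every observation.
annotated <- merge(ko_long, metadata, by = "Bin_name")

# IDs seen in the group of interest vs. IDs seen in any other group.
in_group  <- unique(annotated$KO[annotated$Sample_site == "Water_column"])
in_others <- unique(annotated$KO[annotated$Sample_site != "Water_column"])

# Keep only IDs that never occur outside the group of interest.
unique_ids <- setdiff(in_group, in_others)
print(unique_ids)
#> [1] "K00001"
```

Here K00002 is dropped because it also occurs in a sediment bin, while K00001 is kept because it occurs only in water column bins.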

Examples

get_subset_unique(ko_bin_mapp, metadata, Sample_site, "Water_column", analysis="KEGG")
#> # A tibble: 2,553 x 19
#>    Module Module_descripti… Pathway  Pathway_description     Cycle Pathway_cycle
#>    <chr>  <chr>             <chr>    <chr>                   <chr> <chr>
#>  1 NA     NA                map00740 Riboflavin metabolism   NA    NA
#>  2 NA     NA                map01100 Metabolic pathways      NA    NA
#>  3 NA     NA                map01110 Biosynthesis of second… NA    NA
#>  4 NA     NA                map00740 Riboflavin metabolism   NA    NA
#>  5 NA     NA                map01100 Metabolic pathways      NA    NA
#>  6 NA     NA                NA       NA                      NA    NA
#>  7 NA     NA                map03050 Proteasome              NA    NA
#>  8 NA     NA                map03008 Ribosome biogenesis in… NA    NA
#>  9 NA     NA                map00760 Nicotinate and nicotin… NA    NA
#> 10 NA     NA                map01100 Metabolic pathways      NA    NA
#> # … with 2,543 more rows, and 13 more variables: Detail_cycle <chr>,
#> #   Genes <chr>, Gene_description <chr>, Enzyme <chr>, KO <chr>,
#> #   rbims_pathway <chr>, rbims_sub_pathway <chr>, Bin_10 <int>, Bin_12 <int>,
#> #   Bin_56 <int>, Bin_113 <int>, Bin_1 <int>, Bin_2 <int>