Reads the output of the read_interpro or mapping_ko functions and extracts the IDs that are unique to a metadata group. The function identifies the IDs that are present only in one metadata group (e.g. KOs present in environment "a" and absent from all other environments) and returns them as a tibble object.

get_subset_unique(tibble_rbims, data_experiment, experiment_col, 
experiment_col_element, analysis=c("KEGG", "PFAM", "INTERPRO"))

Arguments

tibble_rbims

a tibble object, created with the read_interpro or mapping_ko functions.

data_experiment

a data frame object that contains the metadata (e.g. taxonomy, sampling site).

experiment_col

the name of a metadata column in data_experiment. This column is used for subsetting.

experiment_col_element

a string giving the metadata value of interest. It must be a value found in the experiment_col column.

analysis

a character string indicating which input the unique abundance profile is extracted from. Valid options are "KEGG", "PFAM" or "INTERPRO".

Details

This function is part of a package used for analyzing the metabolism of bins.
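
The idea of "IDs present only in one metadata group" can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper unique_to_group, the toy tables, and the Bin_name metadata column are all hypothetical and serve only to show the selection logic, assuming a wide table with one row per ID and one abundance column per bin.

```r
# Toy wide table: one row per KO, one column per bin (hypothetical data).
abundance <- data.frame(
  KO    = c("K00001", "K00002", "K00003", "K00004"),
  Bin_1 = c(2L, 0L, 1L, 0L),
  Bin_2 = c(1L, 0L, 0L, 3L),
  Bin_3 = c(0L, 4L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Toy metadata: the group each bin belongs to (hypothetical data and
# hypothetical column name Bin_name).
metadata <- data.frame(
  Bin_name    = c("Bin_1", "Bin_2", "Bin_3"),
  Sample_site = c("Water_column", "Water_column", "Sediment"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT the rbims implementation: keep the IDs that are
# present in at least one bin of the chosen group and absent from every
# bin outside that group.
unique_to_group <- function(abundance, metadata, group_col, group_value,
                            id_col = "KO", bin_col = "Bin_name") {
  in_group  <- metadata[[bin_col]][metadata[[group_col]] == group_value]
  out_group <- setdiff(metadata[[bin_col]], in_group)

  # An ID counts as present in a set of bins if its summed abundance is > 0.
  present_in  <- rowSums(abundance[, in_group,  drop = FALSE]) > 0
  present_out <- rowSums(abundance[, out_group, drop = FALSE]) > 0

  # Return the ID column plus the columns of the bins in the chosen group.
  abundance[present_in & !present_out, c(id_col, in_group), drop = FALSE]
}

water_only <- unique_to_group(abundance, metadata,
                              "Sample_site", "Water_column")
print(water_only$KO)
# K00001 and K00004 are kept; K00003 is dropped because it also occurs in
# the Sediment bin, and K00002 never occurs in a Water_column bin.
```

In this sketch an ID occurring in both groups is discarded, even if it is far more abundant in the group of interest; the filter is based on presence or absence, not on relative abundance.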

Examples

get_subset_unique(ko_bin_mapp, metadata, Sample_site, 
"Water_column", analysis="KEGG")
#> # A tibble: 2,553 × 19
#>    Module Module_description Pathway  Pathway_description    Cycle Pathway_cycle
#>    <chr>  <chr>              <chr>    <chr>                  <chr> <chr>        
#>  1 NA     NA                 map00740 Riboflavin metabolism  NA    NA           
#>  2 NA     NA                 map01100 Metabolic pathways     NA    NA           
#>  3 NA     NA                 map01110 Biosynthesis of secon… NA    NA           
#>  4 NA     NA                 map00740 Riboflavin metabolism  NA    NA           
#>  5 NA     NA                 map01100 Metabolic pathways     NA    NA           
#>  6 NA     NA                 NA       NA                     NA    NA           
#>  7 NA     NA                 map03050 Proteasome             NA    NA           
#>  8 NA     NA                 map03008 Ribosome biogenesis i… NA    NA           
#>  9 NA     NA                 map00760 Nicotinate and nicoti… NA    NA           
#> 10 NA     NA                 map01100 Metabolic pathways     NA    NA           
#> # ℹ 2,543 more rows
#> # ℹ 13 more variables: Detail_cycle <chr>, Genes <chr>, Gene_description <chr>,
#> #   Enzyme <chr>, KO <chr>, rbims_pathway <chr>, rbims_sub_pathway <chr>,
#> #   Bin_10 <int>, Bin_12 <int>, Bin_56 <int>, Bin_113 <int>, Bin_1 <int>,
#> #   Bin_2 <int>