First, load the rbims package.
Next, read the KofamKOALA/KofamScan output.
rbims uses the read_ko function to read the raw KofamKOALA/KofamScan output and build a table from it.
?read_ko
rbims contains a test dataset that allows us to test this function. This dataset is saved in the objects ko_bin_table, ko_bin_mapp, and metadata. First, download the KofamKOALA/KofamScan example file. It is recommended that you save this file in its own folder, since this function reads all the text files in your path and concatenates them. An example of a path input is shown below:
ko_bin_table <- read_ko(data_kofam = "C:/Users/Bins")
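Because every text file in the folder is concatenated, it can help to check the folder contents before calling read_ko. The following is a minimal base R sketch, not part of rbims: the folder and file names are hypothetical placeholders created in a temporary directory, and the .txt extension is an assumption about how the KofamKOALA/KofamScan output was saved.

```r
# Hypothetical folder standing in for the KofamKOALA/KofamScan output directory.
kofam_dir <- file.path(tempdir(), "kofam_example")
dir.create(kofam_dir, showWarnings = FALSE)

# Create two empty placeholder output files and one unrelated file (illustration only).
file.create(file.path(kofam_dir, c("Bin_1.txt", "Bin_2.txt", "notes.csv")))

# List only the files ending in .txt (assumed extension of the saved output).
txt_files <- list.files(kofam_dir, pattern = "\\.txt$", full.names = TRUE)
basename(txt_files)
```

If a file you did not expect shows up in this listing, moving it out of the folder first avoids it being merged into the table.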
The read_ko function will create a table that contains the abundance of each KO within each bin.
head(ko_bin_table)
#> # A tibble: 6 x 4
#> Bin_name Scaffold_name KO Abundance
#> <chr> <chr> <chr> <int>
#> 1 Bin_10 scaffold_10_c1_10 K09800 1
#> 2 Bin_10 scaffold_10_c1_100 K01126 2
#> 3 Bin_10 scaffold_10_c1_103 K00616 1
#> 4 Bin_10 scaffold_10_c1_104 K05539 1
#> 5 Bin_10 scaffold_10_c1_107 K13936 1
#> 6 Bin_10 scaffold_10_c1_109 K01885 1
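As a quick sanity check on a table of this shape, the abundances can be summed per bin with base R. This is a sketch under assumptions: ko_example is a hand-built stand-in that copies the six rows printed above, not the object returned by read_ko.

```r
# Stand-in data frame mirroring the six rows shown in the printed output above.
ko_example <- data.frame(
  Bin_name = rep("Bin_10", 6),
  Scaffold_name = c("scaffold_10_c1_10", "scaffold_10_c1_100", "scaffold_10_c1_103",
                    "scaffold_10_c1_104", "scaffold_10_c1_107", "scaffold_10_c1_109"),
  KO = c("K09800", "K01126", "K00616", "K05539", "K13936", "K01885"),
  Abundance = c(1L, 2L, 1L, 1L, 1L, 1L)
)

# Total KO abundance per bin.
abundance_per_bin <- aggregate(Abundance ~ Bin_name, data = ko_example, FUN = sum)
abundance_per_bin
```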
The mapping_ko function can now be used to map the KOs and their abundances to the remaining features in the KEGG and rbims databases.
ko_bin_mapp <- mapping_ko(ko_bin_table)
head(ko_bin_mapp)
#> # A tibble: 6 x 19
#> Module Module_description Pathway Pathway_descript~ Cycle Pathway_cycle
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 <NA> <NA> <NA> <NA> <NA> <NA>
#> 2 <NA> <NA> map00564 Glycerophospholi~ <NA> <NA>
#> 3 M00004 Pentose phosphate pathw~ map00030 Pentose phosphat~ <NA> <NA>
#> 4 M00007 Pentose phosphate pathw~ map00030 Pentose phosphat~ <NA> <NA>
#> 5 M00004 Pentose phosphate pathw~ map01100 Metabolic pathwa~ <NA> <NA>
#> 6 M00007 Pentose phosphate pathw~ map01100 Metabolic pathwa~ <NA> <NA>
#> # ... with 13 more variables: Detail_cycle <chr>, Genes <chr>,
#> # Gene_description <chr>, Enzyme <chr>, KO <chr>, rbims_pathway <chr>,
#> # rbims_sub_pathway <chr>, Bin_10 <int>, Bin_12 <int>, Bin_56 <int>,
#> # Bin_113 <int>, Bin_1 <int>, Bin_2 <int>
You can export this table to a tab-separated file as follows:
write.table(ko_bin_mapp, "KEGG_mapped.tsv", quote = FALSE, sep = "\t", row.names = FALSE, col.names = TRUE)
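To confirm that an export of this kind round-trips cleanly, the file can be read back with base R. The sketch below uses a small stand-in data frame with placeholder values (mapped_example is hypothetical, not the real ko_bin_mapp) and writes to a temporary file rather than KEGG_mapped.tsv.

```r
# Small stand-in table with placeholder values (not real rbims output).
mapped_example <- data.frame(
  KO = c("K00616", "K01126"),
  Pathway = c("map00030", "map00564"),
  Bin_10 = c(1L, 2L)
)

# Write to a temporary tab-separated file using the same options as the export call.
out_file <- tempfile(fileext = ".tsv")
write.table(mapped_example, out_file, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = TRUE)

# Read the file back and compare dimensions and column names with the original.
reloaded <- read.delim(out_file, stringsAsFactors = FALSE)
identical(dim(reloaded), dim(mapped_example))
identical(names(reloaded), names(mapped_example))
```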