Create KEGG profile • rbims

First, load the rbims package.

library(rbims)

Next, read the KofamKOALA/KofamScan output.

Read the KEGG results

rbims uses the read_ko function to read the raw KofamKOALA/KofamScan output and build a table from it.

?read_ko

rbims contains a test dataset that allows us to try out this function. The dataset is stored in the objects ko_bin_table, ko_bin_mapp, and metadata. First, download the KofamKOALA/KofamScan example file. We recommend saving this file in its own folder, because read_ko reads all the text files in the given path and concatenates them. An example of a path input is shown below:

ko_bin_table <- read_ko(data_kofam = "C:/Users/Bins")

The read_ko function creates a table that contains the abundance of each KO within each bin.
The write argument controls whether the formatted table is also saved as a .tsv file. With write = F, read_ko returns the table but does not save it in your current directory.
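Since read_ko concatenates every text file it finds in the folder, it may be worth listing the folder's contents before calling it. The sketch below uses only base R. The folder and file names are hypothetical placeholders, and the pattern assumes the KofamKOALA/KofamScan output files end in .txt.

```r
# Hypothetical folder standing in for the one passed to read_ko(data_kofam = ...).
kofam_dir <- file.path(tempdir(), "Bins")
dir.create(kofam_dir, showWarnings = FALSE)

# Placeholder files; real KofamKOALA/KofamScan output would go here instead.
file.create(file.path(kofam_dir, c("Bin_10.txt", "Bin_12.txt")))

# List the text files in the folder (assumes a .txt extension).
kofam_files <- list.files(kofam_dir, pattern = "\\.txt$")
print(kofam_files)
```

If the listing shows files that are not KofamKOALA/KofamScan output, moving them elsewhere first keeps them out of the concatenated table.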

head(ko_bin_table)
#> # A tibble: 6 × 4
#>   Bin_name Scaffold_name      KO     Abundance
#>   <chr>    <chr>              <chr>      <int>
#> 1 Bin_10   scaffold_10_c1_10  K09800         1
#> 2 Bin_10   scaffold_10_c1_100 K01126         2
#> 3 Bin_10   scaffold_10_c1_103 K00616         1
#> 4 Bin_10   scaffold_10_c1_104 K05539         1
#> 5 Bin_10   scaffold_10_c1_107 K13936         1
#> 6 Bin_10   scaffold_10_c1_109 K01885         1
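To get a feel for the table, the abundances can be summed per bin. The sketch below retypes the six rows printed above as a plain data frame (ko_head, a name introduced here) so that it runs without the example file; with real data, ko_bin_table could be used in its place.

```r
# The six rows shown in the head() output above, retyped as a data frame.
ko_head <- data.frame(
  Bin_name = rep("Bin_10", 6),
  Scaffold_name = c("scaffold_10_c1_10", "scaffold_10_c1_100", "scaffold_10_c1_103",
                    "scaffold_10_c1_104", "scaffold_10_c1_107", "scaffold_10_c1_109"),
  KO = c("K09800", "K01126", "K00616", "K05539", "K13936", "K01885"),
  Abundance = c(1L, 2L, 1L, 1L, 1L, 1L)
)

# Total KO abundance per bin.
per_bin <- aggregate(Abundance ~ Bin_name, data = ko_head, FUN = sum)
print(per_bin)
```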

Map to the KEGG database

The mapping_ko function can now be used to map the KOs and their abundances to the rest of the features in the KEGG and rbims databases.

ko_bin_mapp <- mapping_ko(ko_bin_table)

head(ko_bin_mapp)
#> # A tibble: 6 × 19
#>   Module Module_description      Pathway Pathway_description Cycle Pathway_cycle
#>   <chr>  <chr>                   <chr>   <chr>               <chr> <chr>        
#> 1 NA     NA                      NA      NA                  NA    NA           
#> 2 NA     NA                      map005… Glycerophospholipi… NA    NA           
#> 3 M00004 Pentose phosphate path… map000… Pentose phosphate … NA    NA           
#> 4 M00007 Pentose phosphate path… map000… Pentose phosphate … NA    NA           
#> 5 M00004 Pentose phosphate path… map011… Metabolic pathways  NA    NA           
#> 6 M00007 Pentose phosphate path… map011… Metabolic pathways  NA    NA           
#> # ℹ 13 more variables: Detail_cycle <chr>, Genes <chr>, Gene_description <chr>,
#> #   Enzyme <chr>, KO <chr>, rbims_pathway <chr>, rbims_sub_pathway <chr>,
#> #   Bin_10 <int>, Bin_12 <int>, Bin_56 <int>, Bin_113 <int>, Bin_1 <int>,
#> #   Bin_2 <int>
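The mapped table has one column per bin, whereas ko_bin_table has one row per bin, scaffold, and KO. The sketch below shows that kind of long-to-wide reshaping on a toy table using base R. It only illustrates the shape of the output and is not a description of how mapping_ko works internally; the abundance values are made up.

```r
# Toy long-format table (hypothetical values).
ko_long <- data.frame(
  Bin_name = c("Bin_10", "Bin_10", "Bin_12", "Bin_12"),
  KO = c("K00616", "K01126", "K00616", "K09800"),
  Abundance = c(1L, 2L, 3L, 1L)
)

# One row per KO, one column per bin; combinations absent from the
# toy table are filled with 0 by xtabs().
ko_wide <- xtabs(Abundance ~ KO + Bin_name, data = ko_long)
print(ko_wide)
```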

You can export this to a table like this:

write.table(ko_bin_mapp, "KEGG_mapped.tsv", quote = FALSE, sep = "\t", row.names = FALSE, col.names = TRUE)

Alternatively, set write = T.
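An exported table can be read back with base R. The sketch below writes a small stand-in data frame with a write.table call like the one above and then re-imports it. The data frame and the temporary file are hypothetical stand-ins for ko_bin_mapp and KEGG_mapped.tsv.

```r
# Small stand-in for ko_bin_mapp (hypothetical values).
mapped <- data.frame(KO = c("K00616", "K01126"), Bin_10 = c(1L, 2L))

# Write a tab-separated file without quotes or row names, to a temporary path.
tsv_path <- tempfile(fileext = ".tsv")
write.table(mapped, tsv_path, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = TRUE)

# Read the tab-separated file back in.
mapped_back <- read.delim(tsv_path, stringsAsFactors = FALSE)
print(mapped_back)
```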