Reads a table produced by InterProScan and generates an abundance profile table from the hits against the KEGG, Pfam or INTERPRO databases. The output for the KEGG database can be used with mapping_ko.
read_interpro(
  data_interpro,
  database = c("KEGG", "Pfam", "INTERPRO", "TIGRFAM", "SUPERFAMILY", "SMART",
               "SFLD", "ProSiteProfiles", "ProSitePatterns", "ProDom", "PRINTS",
               "PIRSF", "MobiDBLite", "Hamap", "Gene3D", "Coils", "CDD"),
  profile = TRUE
)
data_interpro: a table, the output of InterProScan in TSV format. InterProScan must have been run with the -pa option for the "KEGG" value of the database argument to be usable.
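For orientation, the base R sketch below loads a file of this kind by hand. It assumes the 15-column TSV layout that InterProScan writes when the InterPro, GO and pathway lookups are all enabled. The column labels are names chosen here for illustration, not names defined by InterProScan or rbims, and the two records are made-up toy data.

```r
# Illustrative labels for the 15 tab-separated columns (names chosen here).
ips_cols <- c("protein_accession", "md5", "seq_length", "analysis",
              "signature_accession", "signature_description", "start", "stop",
              "score", "status", "date", "interpro_accession",
              "interpro_description", "go_terms", "pathways")

# Two made-up records written as tab-separated text; "-" marks an empty field.
toy_rows <- c(
  paste(c("Bin_113_scaffold_145_c1_85", "md5_placeholder_1", "210", "Pfam",
          "PF_placeholder", "toy signature", "10", "120", "1.0E-20", "T",
          "01-01-2021", "IPR025996", "WHG domain", "-", "-"), collapse = "\t"),
  paste(c("Bin_10_scaffold_441_c1_24", "md5_placeholder_2", "305", "SMART",
          "SM_placeholder", "toy signature", "5", "290", "3.5E-12", "T",
          "01-01-2021", "-", "-", "-", "-"), collapse = "\t")
)

# Read every column as character so values such as "T" are not coerced.
ips <- read.delim(text = toy_rows, header = FALSE, sep = "\t", quote = "",
                  col.names = ips_cols, colClasses = "character",
                  stringsAsFactors = FALSE)

# Keep only the rows that carry an InterPro accession.
with_ipr <- ips[ips$interpro_accession != "-", ]
```

With a real file, the text argument would be replaced by the path to the TSV. read_interpro takes that path directly, so this step is only needed when inspecting the raw table.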
database: a character string indicating the database for which the abundance profile is built, for example "KEGG", "Pfam" or "INTERPRO". The usage signature lists the full set of accepted values.
profile: a logical value indicating whether to return a profile. This option is valid for the "Pfam" and "INTERPRO" databases.
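One way to picture the effect of profile is the base R sketch below. It assumes that a profile is a wide table with one row per identifier and one column per bin. That is a reading of the description above, not a statement about how rbims builds the table internally, and the hits data frame is toy data.

```r
# Toy hits: one row per (bin, InterPro identifier) occurrence.
hits <- data.frame(
  Bin_name = c("Bin_10", "Bin_10", "Bin_12", "Bin_113", "Bin_113"),
  INTERPRO = c("IPR004695", "IPR038665", "IPR004695", "IPR004695", "IPR025996"),
  stringsAsFactors = FALSE
)

# Long form: one row per (identifier, bin) pair with its count.
long <- as.data.frame(table(INTERPRO = hits$INTERPRO, Bin_name = hits$Bin_name),
                      responseName = "Abundance", stringsAsFactors = FALSE)
long <- long[long$Abundance > 0, ]

# Wide form: identifiers as rows, bins as columns, counts as cells.
wide <- xtabs(~ INTERPRO + Bin_name, data = hits)
wide_df <- as.data.frame.matrix(wide)
```

In this sketch the long form resembles the example output shown for profile = FALSE, while wide_df holds the same counts with absent combinations filled with zero.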
This function is part of a package used for the analysis of bin metabolism.
filepath <- system.file("extdata", "Interpro_test.tsv", package = "rbims")
read_interpro(data_interpro = filepath, database = "INTERPRO", profile = FALSE)
#> # A tibble: 217 × 5
#> Scaffold_name Bin_name INTERPRO domain_name Abundance
#> <chr> <chr> <chr> <chr> <int>
#> 1 Bin_10_scaffold_441_c1_24 Bin_10 IPR004695 Voltage-dependent an… 1
#> 2 Bin_10_scaffold_441_c1_24 Bin_10 IPR038665 Voltage-dependent an… 1
#> 3 Bin_12_scaffold_69_c1_124 Bin_12 IPR004695 Voltage-dependent an… 1
#> 4 Bin_12_scaffold_69_c1_124 Bin_12 IPR038665 Voltage-dependent an… 1
#> 5 Bin_56_scaffold_71_c1_69 Bin_56 IPR004695 Voltage-dependent an… 1
#> 6 Bin_56_scaffold_71_c1_69 Bin_56 IPR038665 Voltage-dependent an… 1
#> 7 Bin_113_scaffold_145_c1_85 Bin_113 IPR001647 DNA-binding HTH doma… 1
#> 8 Bin_113_scaffold_145_c1_85 Bin_113 IPR025996 WHG domain 1
#> 9 Bin_113_scaffold_145_c1_85 Bin_113 IPR009057 Homeobox-like domain… 1
#> 10 Bin_113_scaffold_145_c1_85 Bin_113 IPR036271 Tetracyclin represso… 1
#> # ℹ 207 more rows
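As a hedged illustration of working with a result shaped like the one above, the base R sketch below rebuilds the first ten rows by hand, derives a bin label from each scaffold name, and totals the abundance per bin. The regular expression is an assumption based on the names shown in this output. rbims may derive Bin_name differently, and the column Bin_from_scaffold is a hypothetical name used only here.

```r
# First ten rows of the printed result, retyped without the domain names.
res <- data.frame(
  Scaffold_name = c(rep("Bin_10_scaffold_441_c1_24", 2),
                    rep("Bin_12_scaffold_69_c1_124", 2),
                    rep("Bin_56_scaffold_71_c1_69", 2),
                    rep("Bin_113_scaffold_145_c1_85", 4)),
  INTERPRO = c("IPR004695", "IPR038665", "IPR004695", "IPR038665",
               "IPR004695", "IPR038665", "IPR001647", "IPR025996",
               "IPR009057", "IPR036271"),
  Abundance = 1L,
  stringsAsFactors = FALSE
)

# Assumed naming scheme: everything before "_scaffold_" is the bin label.
res$Bin_from_scaffold <- sub("_scaffold_.*$", "", res$Scaffold_name)

# Total abundance per bin over these ten rows.
per_bin <- tapply(res$Abundance, res$Bin_from_scaffold, sum)
```

Over these ten rows, per_bin holds 2 for Bin_10, Bin_12 and Bin_56, and 4 for Bin_113. The totals for the full 217-row result would differ.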