The MEROPS database classifies peptidases (proteases) and their inhibitors using a hierarchical, structure-based system. Peptidases are grouped into Families based on significant sequence similarities, and related families are further grouped into Clans, indicating evolutionary relationships. This classification helps researchers understand enzyme function, structure, and evolution. The database provides sequence identifiers, structural data (if available), and literature references for deeper exploration.

First, load the rbims package.

library(rbims)

The function that loads MEROPS data is read_merops.

merops_profile <- read_merops("../inst/extdata/peptidase_2/", profile = T)
  • The first argument is the path to the directory that contains the MEROPS output to parse. In this example, that directory is ../inst/extdata/peptidase_2/.

  • The profile argument selects the output format. When profile = T, the function returns a wide table with one row per peptidase entry and one abundance column per bin.

  • The write argument saves the formatted table as a .tsv file. When write = F, the function returns the table without saving it to your current directory.

head(merops_profile)
#>                                                        MEROPS_family
#> 1                                        CTP synthetase |  | C26.964
#> 2                       CTP synthetase | Ruegeria sp. TW15 | C26.964
#> 3                   CTP synthetase | Silicibacter pomeroyi | C26.964
#> 4                      family S14 unassigned peptidases |  | S14.UPW
#> 5                      family S14 unassigned peptidases |  | S14.UPW
#> 6 family S14 unassigned peptidases | Actibacterium mucosum | S14.UPW
#>   domain_name X5mSIPHEX1_0 X5mSIPHEX1_1 X5mSIPHEX1_10 X5mSIPHEX1_11
#> 1  MER0450018            1            0             0             0
#> 2  MER0450113            1            0             0             0
#> 3  MER0174339            1            0             0             0
#> 4  MER1318229            2            0             0             0
#> 5  MER1320661            2            0             0             0
#> 6  MER1012580            1            0             0             0
#>   X5mSIPHEX1_13 X5mSIPHEX1_18 X5mSIPHEX1_19 X5mSIPHEX1_25 X5mSIPHEX1_26
#> 1             0             0             0             0             0
#> 2             0             0             0             0             0
#> 3             0             0             0             0             0
#> 4             0             0             0             0             0
#> 5             0             0             0             0             0
#> 6             0             0             0             0             0
#>   X5mSIPHEX1_32 X5mSIPHEX1_33 X5mSIPHEX1_37 X5mSIPHEX1_8 X5mSIPHEX1_9
#> 1             0             0             2            1            0
#> 2             0             0             1            1            0
#> 3             0             0             1            1            0
#> 4             0             0             0            0            0
#> 5             0             0             4            0            0
#> 6             0             0             0            0            0
#>   X5mSIPHEX2_10 X5mSIPHEX2_14 X5mSIPHEX2_16 X5mSIPHEX2_18 X5mSIPHEX2_25
#> 1             1             0             0             0             1
#> 2             1             0             0             0             1
#> 3             1             0             0             0             1
#> 4             2             0             0             0             0
#> 5             2             0             0             0             0
#> 6             1             0             0             0             0
#>   X5mSIPHEX2_3 X5mSIPHEX2_5 X5mSIPHEX2_7 X700mSIPHEX1_0 X700mSIPHEX1_1
#> 1            2            0            0              2              0
#> 2            1            0            0              1              0
#> 3            1            0            0              1              0
#> 4            0            0            0              0              0
#> 5            4            0            0              0              0
#> 6            0            0            0              0              0
#>   X700mSIPHEX1_12 X700mSIPHEX1_15 X700mSIPHEX1_17 X700mSIPHEX1_18
#> 1               0               0               0               0
#> 2               0               0               0               0
#> 3               0               0               0               0
#> 4               0               0               0               0
#> 5               0               0               0               0
#> 6               0               0               0               0
#>   X700mSIPHEX1_2 X700mSIPHEX1_20 X700mSIPHEX1_3 X700mSIPHEX1_8 X700mSIPHEX2_13
#> 1              0               0              0              2               2
#> 2              0               0              0              1               1
#> 3              0               0              0              1               1
#> 4              0               0              0              0               0
#> 5              0               0              0              4               4
#> 6              0               0              0              0               0
#>   X700mSIPHEX2_14 X700mSIPHEX2_16 X700mSIPHEX2_21 X700mSIPHEX2_22
#> 1               0               0               0               0
#> 2               0               0               0               0
#> 3               0               0               0               0
#> 4               0               0               0               0
#> 5               0               0               0               0
#> 6               0               0               0               0
#>   X700mSIPHEX2_23 X700mSIPHEX2_24 X700mSIPHEX2_9
#> 1               0               0              0
#> 2               0               0              0
#> 3               0               0              0
#> 4               0               0              0
#> 5               0               0              0
#> 6               0               0              0
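
The bin columns in the wide table carry an X prefix (for example, X5mSIPHEX1_0). This is what base R produces when a name that starts with a digit is converted into a syntactically valid column name. The sketch below shows that conversion with make.names; whether read_merops relies on exactly this mechanism is an assumption, and the two bin names are taken from the outputs in this document only for illustration.

```r
# Bin names that start with a digit are not syntactically valid R names.
bin_names <- c("5mSIPHEX1_0", "700mSIPHEX2_9")

# make.names() prepends "X" to names that begin with a digit.
valid_names <- make.names(bin_names)
print(valid_names)
# → "X5mSIPHEX1_0" "X700mSIPHEX2_9"

# Strip one leading "X" to recover the original bin names.
# Caution: this would also alter a bin whose real name starts with "X".
original <- sub("^X", "", valid_names)
print(original)
```

Keeping this in mind helps when joining the wide table with metadata that uses the original bin names.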

Alternatively, set profile = F to obtain a long table, with one row per bin and peptidase entry.

merops_profile_long <- read_merops("../inst/extdata/peptidase_2/", profile = F)
head(merops_profile_long)
#>      Bin_name
#> 1 5mSIPHEX1_0
#> 2 5mSIPHEX1_0
#> 3 5mSIPHEX1_0
#> 4 5mSIPHEX1_0
#> 5 5mSIPHEX1_0
#> 6 5mSIPHEX1_0
#>                                                        MEROPS_family
#> 1                                        CTP synthetase |  | C26.964
#> 2                       CTP synthetase | Ruegeria sp. TW15 | C26.964
#> 3                   CTP synthetase | Silicibacter pomeroyi | C26.964
#> 4                      family S14 unassigned peptidases |  | S14.UPW
#> 5                      family S14 unassigned peptidases |  | S14.UPW
#> 6 family S14 unassigned peptidases | Actibacterium mucosum | S14.UPW
#>   domain_name Abundance
#> 1  MER0450018         1
#> 2  MER0450113         1
#> 3  MER0174339         1
#> 4  MER1318229         2
#> 5  MER1320661         2
#> 6  MER1012580         1
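
The long and wide formats hold the same information. As a minimal sketch of how they relate, the base R code below pivots a toy long table into a wide one with xtabs. The data values and the names long_tbl, wide_mat, and wide_tbl are made up for illustration, only domain_name is used as the row key for brevity, and this is not presented as how read_merops builds its wide output.

```r
# Toy long table with the same column names as the long output
# (all values are invented for illustration).
long_tbl <- data.frame(
  Bin_name      = c("bin_A", "bin_A", "bin_B", "bin_B"),
  MEROPS_family = c("fam_1", "fam_2", "fam_1", "fam_3"),
  domain_name   = c("MER_x", "MER_y", "MER_x", "MER_z"),
  Abundance     = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per domain_name, one column per bin.
# Bin/domain combinations absent from the long table become 0.
wide_mat <- xtabs(Abundance ~ domain_name + Bin_name, data = long_tbl)

# Convert the contingency table into a regular data frame.
wide_tbl <- as.data.frame.matrix(wide_mat)
print(wide_tbl)
```

The long format is usually more convenient for plotting and grouping, while the wide format is easier to read as an abundance matrix.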

You can export this table to a tab-separated file like this:

write.table(merops_profile_long, "merops_profile_long.tsv", quote = F, sep = "\t", row.names = F, col.names = T)

Or set write = T when calling read_merops.
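
To check an exported file, it can be read back and compared with the table in memory. The sketch below does this with a toy data frame and a temporary file; the names toy, out_file, and back are hypothetical, and the same write.table arguments as in the export call above are assumed.

```r
# Toy table standing in for the long output (values are invented).
toy <- data.frame(
  Bin_name    = c("bin_A", "bin_B"),
  domain_name = c("MER_x", "MER_y"),
  Abundance   = c(1L, 2L),
  stringsAsFactors = FALSE
)

# Write to a temporary .tsv file so nothing is left in the working directory.
out_file <- tempfile(fileext = ".tsv")
write.table(toy, out_file, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = TRUE)

# Read the file back; read.delim() expects a header and tab separators.
back <- read.delim(out_file, stringsAsFactors = FALSE)

# Compare structure and content with the original table.
print(identical(names(back), names(toy)))
print(identical(back$Abundance, toy$Abundance))
```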