MBGD: Microbial genome database for comparative analysis featuring enhanced functionality to characterize gene and genome functions through large-scale orthology analysis

J Mol Biol. 2025 Jan 16:168957. doi: 10.1016/j.jmb.2025.168957. Online ahead of print.

Abstract

Microbial Genome Database for Comparative Analysis (MBGD) is a comprehensive ortholog database encompassing published complete microbial genomes. The ortholog tables in MBGD are constructed in a hierarchical manner. The top-level ortholog table is now constructed from 1,812 genus-level pan-genomes, 6,268 species-level pan-genomes, and 34,079 genomes in total. To support analyses of newly sequenced genomes, MBGD has updated the MyMBGD functionality, which offers two analysis modes: assignment mode and clustering mode. Assignment mode rapidly classifies genes in the query genomes into existing MBGD ortholog groups, while clustering mode performs de novo clustering of query genomes using the DomClust program. In assignment mode, users can evaluate the presence of genomic functions, as defined in the KEGG Module database, in each query genome using the Genomaple software and compare the results across multiple genomes. To enhance this analysis, we developed a method to subdivide MBGD ortholog groups as needed to improve cross-references to the KEGG Orthology groups. Another notable feature is the phylogenetic profile search interface, which enables users to specify a set of organisms in which orthologs are present or absent (i.e., a phylogenetic profile) and to search for ortholog groups with similar phylogenetic profiles. To construct a phylogenetic profile, users can search for organisms by specifying a phenotype, environment, taxonomy, or particular ortholog group. MBGD is available at https://mbgd.nibb.ac.jp/.
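
The phylogenetic profile search described above can be illustrated with a minimal sketch. The abstract does not state how MBGD scores profile similarity, so the example assumes a simple matching fraction over the organisms the user has constrained. The function names (`profile_match_score`, `search_profiles`), the `min_score` threshold, and the toy organism and ortholog group identifiers are all hypothetical and are not part of MBGD.

```python
from typing import Dict, List, Set, Tuple


def profile_match_score(
    present_in: Set[str], want_present: Set[str], want_absent: Set[str]
) -> float:
    """Fraction of constrained organisms whose presence/absence matches the query.

    Assumption: a simple matching fraction, not MBGD's actual similarity measure.
    """
    constrained = len(want_present) + len(want_absent)
    if constrained == 0:
        return 0.0
    # Organisms required to have the ortholog and that do have it.
    present_hits = len(want_present & present_in)
    # Organisms required to lack the ortholog and that do lack it.
    absent_hits = len(want_absent - present_in)
    return (present_hits + absent_hits) / constrained


def search_profiles(
    groups: Dict[str, Set[str]],
    want_present: Set[str],
    want_absent: Set[str],
    min_score: float = 0.75,
) -> List[Tuple[str, float]]:
    """Return (group_id, score) pairs at or above min_score, best match first."""
    scored = [
        (gid, profile_match_score(orgs, want_present, want_absent))
        for gid, orgs in groups.items()
    ]
    hits = [(gid, score) for gid, score in scored if score >= min_score]
    # Sort by descending score, then by group id for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy data: each hypothetical ortholog group maps to the organisms that carry it.
groups = {
    "OG1": {"orgA", "orgB", "orgC"},
    "OG2": {"orgA", "orgB"},
    "OG3": {"orgC", "orgD"},
    "OG4": {"orgA", "orgB", "orgD"},
}

# Query profile: ortholog present in orgA and orgB, absent in orgC and orgD.
results = search_profiles(
    groups, want_present={"orgA", "orgB"}, want_absent={"orgC", "orgD"}
)
print(results)
```

In this toy query, a group found only in `orgA` and `orgB` matches the profile exactly, while groups that also occur in one of the excluded organisms rank lower. A production search would likely weight organisms by phylogenetic relatedness or use a statistical measure such as mutual information, which this sketch does not attempt.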

Keywords: comparative genomics; function prediction; microbial genomes; ortholog.