PanGBank is a web-based platform for exploring, analyzing, and downloading pangenomes generated with PPanGGOLiN. The pangenomes are built with a Nextflow workflow (labgem/PanGBank-wf) and are accessible through a public REST API.
Pangenomes are organized into collections based on genome source and taxonomy, providing a structured and scalable framework for comparative genomics. The web application allows users to search pangenomes by taxonomy, genome, or strain; visualize statistics; browse annotations with CGView genome maps; navigate between genomes; and download the corresponding PPanGGOLiN files.
A companion command-line tool (labgem/PanGBank-cli) provides easy access to the API for querying and downloading pangenome data directly from the terminal. Use case examples are available in our labgem/PanGBank-tutorial repository.
Our pangenome data and associated resources are made freely available under the Creative Commons Attribution 4.0 International (CC BY 4.0) License.
This license allows you to share and adapt the material, as long as you give appropriate credit, provide a link to the license, and indicate if changes were made. For full license details, please see the official CC BY 4.0 page.
PPanGGOLiN (Gautreau et al., 2020) is a software suite for constructing and analyzing prokaryotic pangenomes. It uses a statistical model instead of fixed thresholds to partition the pangenome. This approach enables robust analysis of lower-quality genomes such as Metagenome-Assembled Genomes (MAGs) and Single-cell Amplified Genomes (SAGs), making it suitable for large-scale environmental studies and uncultivable species.
PPanGGOLiN builds a Partitioned Pangenome Graph by integrating gene content and genomic neighborhood information. Gene families form nodes, and edges represent genetic contiguity. The statistical model partitions gene families into persistent, shell, and cloud genomes, with neighboring families more likely to belong to the same partition.
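To make the graph structure concrete, here is a minimal Python sketch of how gene families can become nodes and genomic adjacency can become edges. It is an illustration only and not PPanGGOLiN's implementation: the function name `build_pangenome_graph`, the input layout (one ordered list of gene family identifiers per genome) and the toy family names are assumptions, and the statistical partitioning step is not shown.

```python
from collections import defaultdict


def build_pangenome_graph(genomes):
    """Build a toy pangenome graph.

    genomes: dict mapping a genome name to the ordered list of gene
    family identifiers along a single contig (hypothetical input layout).
    Returns (nodes, edges), where each value is the set of genomes
    supporting that node or edge.
    """
    nodes = defaultdict(set)  # family -> genomes containing it
    edges = defaultdict(set)  # unordered family pair -> genomes where adjacent
    for name, families in genomes.items():
        for fam in families:
            nodes[fam].add(name)
        # Consecutive genes on the contig define genetic contiguity.
        for left, right in zip(families, families[1:]):
            if left != right:
                edges[frozenset((left, right))].add(name)
    return dict(nodes), dict(edges)


# Toy data: three genomes sharing most of their gene families.
genomes = {
    "g1": ["famA", "famB", "famC", "famD"],
    "g2": ["famA", "famB", "famD"],
    "g3": ["famA", "famC", "famD"],
}
nodes, edges = build_pangenome_graph(genomes)
```

In this toy example, `famA` and `famD` occur in every genome, while `famB` and `famC` are each missing from one genome. The number of genomes supporting each node and edge is the kind of information a partitioning model can draw on.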
The panRGP method (Bazin et al., 2020) identifies Regions of Genome Plasticity (RGPs), which are clusters of shell and cloud genes often acquired through horizontal gene transfer and corresponding to Genomic Islands. RGPs are grouped into spots of insertion based on shared flanking persistent genes. These regions can be further divided into conserved modules using panModule (Bazin et al., 2021), which identifies co-occurring, co-localized genes that are gained or lost together in the variable regions of the pangenome.
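The idea of grouping RGPs into spots can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the panRGP algorithm: it treats any run of non-persistent gene families flanked on both sides by persistent families as an RGP and uses only the single nearest persistent family on each side as the spot key, whereas the published method is more involved. The function names `find_rgps` and `group_into_spots` and the toy data are hypothetical.

```python
from collections import defaultdict


def find_rgps(families, persistent):
    """Return (left_flank, rgp_families, right_flank) tuples for one contig.

    A run of non-persistent families counts as an RGP here only if it is
    bordered by a persistent family on both sides (simplifying assumption);
    runs touching a contig end are ignored.
    """
    rgps = []
    run = []
    left = None
    for fam in families:
        if fam in persistent:
            if run and left is not None:
                rgps.append((left, tuple(run), fam))
            run = []
            left = fam
        else:
            run.append(fam)
    return rgps


def group_into_spots(genomes, persistent):
    """Group RGPs from all genomes by their unordered pair of flanking families."""
    spots = defaultdict(list)
    for name, families in genomes.items():
        for left, rgp, right in find_rgps(families, persistent):
            # An unordered key makes the grouping independent of strand orientation.
            spots[frozenset((left, right))].append((name, rgp))
    return dict(spots)


persistent = {"P1", "P2", "P3"}
genomes = {
    "g1": ["P1", "s1", "c1", "P2", "P3"],
    "g2": ["P1", "c2", "P2", "s2", "P3"],
    "g3": ["P2", "s3", "P1", "P3"],  # same borders as g1, opposite orientation
}
spots = group_into_spots(genomes, persistent)
```

Here the three regions bordered by `P1` and `P2` fall into one spot even though their gene content differs, which reflects the notion of a shared insertion site defined by its conserved borders.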
PanGBank is mainly developed by the LABGeM team at Genoscope:
Jean Mainguy
David Vallenet
Alexandra Calteau
Téo Lemane
We also thank Adelme Bazin and Guillaume Gautreau for their earlier contributions.
This work was financially supported by:
We use Google Analytics to understand how visitors use our website. It collects usage data (IP address, browser information, and visit timestamps) only if you accept cookies. You can manage your cookie preferences at any time. For more details, please read our Privacy Policy.
If you have any questions, please open an issue on our GitHub repository at https://github.com/labgem/PanGBank-api or contact us at labgem@genoscope.cns.fr.