What is GeneBe Hub?
GeneBe Hub is an open, public repository of genetic variant annotation databases, much as PyPI is a repository for Python packages and Docker Hub is a repository for Docker images.
The GeneBe Hub system consists of three elements:
- The GeneBe Hub repository, where you can browse ready-made variant annotation databases (such as ClinVar or gnomAD) and publish your own.
- The GeneBe Client, which you use to annotate your VCF files.
- A standardized annotation format that aims to unify how variant annotation databases are stored and shared.
Visit the Getting Started guide to begin using GeneBe. To learn about creating databases, refer to the Creating a Database page. If you’d like to delve deeper into GeneBe Hub, continue reading.
Background
Interpreting genetic variants requires a vast amount of data. Before assigning pathogenicity to a variant, an expert must consider its frequency, potential effects on splicing, associations with phenotypes, and more. There are excellent tools and databases to address these needs. For example:
- SnpEff or VEP can assign variant effects and identify affected genes.
- gnomAD or ALFA provide population frequency data.
- ClinVar is a rich source of known pathogenic variants.
In addition to these, there are numerous specialized databases offering population-specific frequencies or computational scores for certain genes.
However, there is no universal format for sharing these databases. Some are distributed as VCF files, others as TSV files or bigWig files. If a database isn’t widely popular or integrated into commonly used, often proprietary annotation software, bioinformaticians must parse it themselves. Moreover, there is no standard format or centralized platform for sharing new databases.
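To make the parsing burden concrete, here is a minimal Python sketch of what reading the same allele frequency from two differently shaped files can look like. The sample records, the `AF` INFO key, and the TSV column names (`chrom`, `pos`, `ref`, `alt`, `af`) are invented for illustration and are not taken from any real database; the sketch also ignores multi-allelic records.

```python
import csv
import io

# Hypothetical sample data: the same allele frequency shipped in two layouts.
VCF_TEXT = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t12345\t.\tA\tG\t.\t.\tAF=0.01;AC=2\n"
)
TSV_TEXT = "chrom\tpos\tref\talt\taf\n1\t12345\tA\tG\t0.01\n"


def parse_vcf(text):
    """Map (chrom, pos, ref, alt) to AF, read from the VCF INFO column."""
    result = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip meta-information and header lines
        chrom, pos, _id, ref, alt, _qual, _filter, info = line.split("\t")[:8]
        # INFO is a semicolon-separated list of KEY=VALUE pairs (flags have no "=").
        fields = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
        result[(chrom, int(pos), ref, alt)] = float(fields["AF"])
    return result


def parse_tsv(text):
    """Map (chrom, pos, ref, alt) to AF, read from named TSV columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        (row["chrom"], int(row["pos"]), row["ref"], row["alt"]): float(row["af"])
        for row in reader
    }


# Two parsers are needed to arrive at one and the same lookup table.
vcf_table = parse_vcf(VCF_TEXT)
tsv_table = parse_tsv(TSV_TEXT)
```

Each additional source format means another parser of this kind, which is the overhead a shared format is meant to remove.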
Problem
- Lack of a standard format: There is no universal format for sharing variant annotations.
- No centralized distribution: There is no standard platform for distributing variant annotations.
- Inconsistent application: There is no standardized method for applying variant annotations to genetic data.
Solution
GeneBe Hub addresses these challenges by providing:
- A standardized format for storing and sharing annotations (https://genebe.net/about/hub-format).
- A client application to create and apply annotations in this format to VCF files (https://github.com/pstawinski/genebe-cli/releases).
- A centralized platform for users to share and distribute annotations: GeneBe Hub (https://genebe.net/hub).
As a result, applying annotations to your VCF file has never been easier. Just find the annotation you need in GeneBe Hub and apply it to your VCF with GeneBe Client by calling:
java -jar GeneBeClient.jar vcf annotate \
--input-vcf "input.vcf" \
--output-vcf "output.vcf" \
--annotations "namespace/database_name"
For example, to annotate with the current version of ClinVar:
java -jar GeneBeClient.jar vcf annotate \
--input-vcf "input.vcf" \
--output-vcf "output.vcf" \
--annotations "@genebe/clinvar"
See the Getting Started guide to learn more.
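If you need to annotate many files, the command above can be scripted. The following is a minimal Python sketch that assumes `GeneBeClient.jar` is in the working directory and uses only the subcommand and flags shown above; the helper names `build_annotate_command` and `annotate_directory` are hypothetical and not part of GeneBe Client.

```python
import subprocess
from pathlib import Path


def build_annotate_command(input_vcf, output_vcf, annotation, jar="GeneBeClient.jar"):
    """Build the argument list for one 'vcf annotate' call (hypothetical helper)."""
    return [
        "java", "-jar", str(jar), "vcf", "annotate",
        "--input-vcf", str(input_vcf),
        "--output-vcf", str(output_vcf),
        "--annotations", annotation,
    ]


def annotate_directory(in_dir, out_dir, annotation, dry_run=True):
    """Annotate every *.vcf file in in_dir, writing results to out_dir.

    With dry_run=True the commands are only built and returned, not executed.
    """
    out_dir = Path(out_dir)
    commands = []
    for vcf in sorted(Path(in_dir).glob("*.vcf")):
        cmd = build_annotate_command(vcf, out_dir / vcf.name, annotation)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if the client exits with an error.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: inspect the command for a single file without executing it.
example = build_annotate_command("input.vcf", "output.vcf", "@genebe/clinvar")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.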
NOTE: Currently, GeneBe Hub stores precomputed annotations that are applied locally on your computer, plus one special remote annotation, @genebe/base, which is applied by default. It assigns ACMG criteria and provides many basic, useful annotations that are used by default when applying those criteria.