gpn-msa
version: 0.0.1 (latest)
GPN-MSA - genomic pretrained network with multiple-sequence alignment
Description
GPN-MSA - DNA Language Model for Variant Effect Prediction
GPN-MSA, developed by the Song Lab at UC Berkeley, is a DNA language model that uses whole-genome sequence alignments across 100 vertebrate species to predict variant effects genome-wide, in both coding and non-coding regions of the human genome (hg38). It is described in Benegas et al. (2023) on bioRxiv (https://www.biorxiv.org/content/10.1101/2023.10.10.561776v1). According to the preprint, it trains in hours and outperforms existing models on benchmarks such as ClinVar and gnomAD, which makes it a lightweight alternative to protein-focused predictors. The alignment-based approach is what improves its deleteriousness prediction.
Build instructions
Script for converting to GeneBe Hub format is here: https://github.com/genebe-net/annotation-builder-scripts/tree/main/gpn-msa
Meta Information
Access:
PUBLIC
Author:
@genebe
URL:
Created:
01 Mar 2025, 17:26:51 UTC
Type:
VARIANT
Genome:
GRCh38
Status:
ACTIVE