MissenseViewer

What is Missense Viewer?

Missense Viewer is an interactive web application that consolidates predicted effects of missense mutations from state-of-the-art deep learning models, namely AlphaMissense, ESM1b, and VESPA. It allows users to visualize mutation effects on both experimentally determined and AlphaFold-predicted protein structures. The platform also provides insights into the pathogenicity effects of residues constituting ligand-binding sites. This resource computes an average score per position, indicating how well the protein is predicted to tolerate changes at that site. This is also referred to as susceptibility.


What is AlphaMissense?

AlphaMissense is a model designed to predict the pathogenicity of single-point missense mutations across the human proteome. Developed by Google DeepMind, it assigns scores indicating whether a given mutation is likely to be benign or pathogenic.

For more information, visit the AlphaMissense publication.


What is ESM1b?

ESM1b (Evolutionary Scale Modeling 1b) is a model developed by Meta. In the context of Missense Viewer, ESM1b contributes pathogenicity predictions for single-point mutations, complementing the per-site information presented by MissenseViewer.

For more information, visit the ESM1b publication.


What is VESPA?

VESPA is another mutation effect prediction model that evaluates the potential pathogenicity of single-point mutations. Because its architecture differs from that of the other models, it is a valuable addition to MissenseViewer.

For more information, visit the VESPA publication.


What is Susceptibility?

All of the above models predict a score for each possible change at each site. Since a residue can be replaced by any of the other 19 standard amino acids, every site has 19 scores. Susceptibility is the average of those 19 scores. A susceptibility score > 0.5 signifies that most changes at the site are detrimental, whereas a score < 0.5 indicates that most changes are tolerated. This provides a reasonable summary metric, which is scaled (susceptibility * 100) and visualized as B-factors, giving a structural view of per-site tolerance as predicted by each method.
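The averaging described above can be sketched in a few lines of Python. This is only an illustration: the function names and the input layout (a mapping from each alternative residue to its score) are assumptions, not part of MissenseViewer, and the label returned for a score of exactly 0.5 is a placeholder because the text does not define that case.

```python
from statistics import mean

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def susceptibility(ref: str, scores: dict) -> float:
    """Average the scores of the 19 possible substitutions at one site.

    `scores` maps each alternative residue (one-letter code) to the
    predicted score for that substitution. This layout is hypothetical.
    """
    alternatives = [aa for aa in AMINO_ACIDS if aa != ref]
    missing = [aa for aa in alternatives if aa not in scores]
    if missing:
        raise ValueError(f"missing scores for: {''.join(missing)}")
    return mean(scores[aa] for aa in alternatives)


def interpret(value: float) -> str:
    """Apply the 0.5 threshold described in the text."""
    if value > 0.5:
        return "mostly detrimental"
    if value < 0.5:
        return "mostly tolerated"
    return "undetermined"  # exactly 0.5 is not covered by the text
```

Calling susceptibility("A", scores) with 19 scores for a site whose reference residue is alanine returns their mean, which interpret() then maps to one of the two categories.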


Susceptibility as B-factors

In MissenseViewer, susceptibility scores are multiplied by 100, a scale appropriate for their use as B-factors in protein structures. Structures are then colour-coded by these B-factors, which lets users identify pathogenicity "hot spots" directly on 3D protein models and offers a visual overview of the positions most susceptible to mutations.
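As a rough illustration of this encoding, the sketch below writes susceptibility * 100 into the temperature-factor field (columns 61 to 66) of ATOM records in a PDB-format file. The function name and the (chain, residue number) keying are assumptions made for the example; this is not MissenseViewer's own code, and insertion codes are ignored for brevity.

```python
def set_bfactors(pdb_lines, susceptibility_by_residue):
    """Return PDB lines with B-factors replaced by susceptibility * 100.

    `susceptibility_by_residue` maps (chain ID, residue number) to a
    score between 0 and 1. Residues without a score are left unchanged.
    """
    updated = []
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) >= 66:
            chain = line[21]            # column 22: chain identifier
            resnum = int(line[22:26])   # columns 23-26: residue number
            score = susceptibility_by_residue.get((chain, resnum))
            if score is not None:
                # Columns 61-66 hold the B-factor, formatted as 6.2f.
                line = line[:60] + f"{score * 100:6.2f}" + line[66:]
        updated.append(line)
    return updated
```

Most molecular viewers can then colour the structure by B-factor to show the per-site susceptibility.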


What is a Binding Site?

In the context of MissenseViewer, a binding site is a region on a protein where molecules such as ligands or ions bind. It is empirically defined as the region within 5 Å of the binding entity.
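The distance criterion can be sketched as follows. This is a minimal example under stated assumptions: the function name and the input layout are hypothetical, any atom of a residue within the cutoff counts, and a distance of exactly 5 Å is treated as inside the site, which the definition above leaves open.

```python
import math


def binding_site_residues(protein_atoms, ligand_coords, cutoff=5.0):
    """Collect residues with at least one atom within `cutoff` Å of the ligand.

    `protein_atoms` is an iterable of (residue_id, (x, y, z)) pairs and
    `ligand_coords` is a list of (x, y, z) ligand atom coordinates.
    """
    site = set()
    for residue_id, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            site.add(residue_id)
    return site
```

The brute-force loop is fine for a single structure; a spatial index would be a natural replacement for larger batches.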


How are Ligands Determined?

In the context of MissenseViewer, an entity is recognised as a ligand if its HET code, as defined in the Chemical Component Dictionary, does not correspond to one of the 20 standard amino acids or to a modified amino acid. Ligands are also filtered to exclude non-functional components such as water or buffers.

To qualify for inclusion, the entity must then be within 5 Å of a protein that is part of the MissenseViewer database.
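The HET-code part of this filter might look like the sketch below. The exclusion sets are short illustrative samples chosen for this example; the actual lists of modified amino acids and non-functional components used by MissenseViewer are not given here. The 5 Å distance requirement would be checked separately.

```python
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

# Illustrative samples only, not the lists used by MissenseViewer.
MODIFIED_AMINO_ACIDS = {"MSE", "SEP", "TPO", "PTR"}
NON_FUNCTIONAL = {"HOH", "DOD", "GOL", "EDO", "TRS", "MES", "EPE"}


def is_candidate_ligand(het_code: str) -> bool:
    """Return True if the HET code passes the exclusion rules above."""
    code = het_code.strip().upper()
    if not code:
        return False
    excluded = STANDARD_AMINO_ACIDS | MODIFIED_AMINO_ACIDS | NON_FUNCTIONAL
    return code not in excluded
```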


Visualizing Missense Mutations

Missense Viewer allows users to query this section using a valid UniProt accession.

The results page will allow users to:

  1. Visualize substitution sites on both experimentally determined and AlphaFold-predicted structures. Users can select these structures from a table listing all structures and their coverage of the protein.
  2. View susceptibility scores from each of the three methods, AlphaMissense (AM), ESM1b and VESPA, for each residue, both in the sequence viewer at the bottom and encoded as B-factors in the visualized structures.
  3. Click a residue in the sequence track (the first track of the feature viewer) to display the original per-mutation prediction values for that site.

Visualizing Ligand Information

Missense Viewer allows users to query this section using a valid UniProt accession.

The results page is divided into three columns:

  1. Column 1 is a table listing all ligands found in any RCSB PDB structure of the query UniProt protein. Selecting a ligand from the table populates columns 2 and 3.
  2. Column 2 shows a summary distribution plot of susceptibility scores. The scores are gathered from all residues within 5 Å of the chosen ligand, across all structures for the UniProt entry.
  3. Column 3 breaks these scores down per structure. A drop-down menu selects a structure, and a plot then shows the susceptibility scores for the chosen ligand in that structure only.

API Protein Endpoint


This API provides susceptibility scores for each position of the protein identified by a given UniProt ID.

Endpoint

GET /api_prot/{uniprot_id}

Returns: A table with columns:

Uniprot Pos     Ref     AM      ESM1b   VESPA

Each row represents a residue position and its susceptibility scores from the three methods.

Example cURL Request

curl -X GET "https://biosig.lab.uq.edu.au/missenseviewer/api_prot/P12345" -H "Accept: text/plain"
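
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names are made up for illustration, UTF-8 decoding of the response is an assumption, and the table is returned as raw text because its exact delimiter is not specified above.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://biosig.lab.uq.edu.au/missenseviewer"


def protein_url(uniprot_id: str) -> str:
    """Build the protein endpoint URL for a UniProt ID."""
    return f"{BASE_URL}/api_prot/{quote(uniprot_id.strip(), safe='')}"


def fetch_protein_table(uniprot_id: str, timeout: float = 30.0) -> str:
    """Download the susceptibility table as plain text."""
    request = Request(protein_url(uniprot_id), headers={"Accept": "text/plain"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")  # encoding is assumed


# Example (performs a network request):
#   table_text = fetch_protein_table("P12345")
#   print(table_text.splitlines()[0])
```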

API Ligand Endpoint

This API retrieves ligand information given a UniProt ID.

Endpoint

GET /api_lig/{uniprot_id}

Returns: JSON data containing ligand information, beta factors from different structures, and residue-level scores.

Example cURL Request

curl -X GET "https://biosig.lab.uq.edu.au/missenseviewer/api_lig/P12345" -H "Accept: application/json"

Response Schema

{
    "ligands": {
        "<ligand_id>": {
            "beta_factors": {
                "AM": [<float>, <float>, ...],
                "ESM1b": [<float>, <float>, ...],
                "VESPA": [<float>, <float>, ...]
            }
        }
    },
    "structures": {
        "<pdb_chain>": {
            "<ligand_id>": {
                "residues": {
                    "AM": [["<chain_residue_id>", <float>], ...],
                    "ESM1b": [["<chain_residue_id>", <float>], ...],
                    "VESPA": [["<chain_residue_id>", <float>], ...]
                }
            }
        }
    }
}
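
A response following this schema could be read in Python roughly as shown below. The helper names are hypothetical, and the code assumes the keys appear exactly as in the schema above; it is a sketch rather than an official client.

```python
import json
from statistics import mean
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://biosig.lab.uq.edu.au/missenseviewer"


def fetch_ligands(uniprot_id: str, timeout: float = 30.0) -> dict:
    """Download and decode the ligand JSON for a UniProt ID."""
    url = f"{BASE_URL}/api_lig/{quote(uniprot_id.strip(), safe='')}"
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def mean_beta_factors(payload: dict) -> dict:
    """Average the pooled beta factors per ligand and per method."""
    summary = {}
    for ligand_id, entry in payload.get("ligands", {}).items():
        summary[ligand_id] = {
            method: mean(values)
            for method, values in entry.get("beta_factors", {}).items()
            if values  # skip methods with no values
        }
    return summary


def residue_scores(payload: dict, pdb_chain: str, ligand_id: str, method: str) -> dict:
    """Map residue ID to score for one structure, ligand and method."""
    pairs = payload["structures"][pdb_chain][ligand_id]["residues"][method]
    return {residue_id: score for residue_id, score in pairs}


# Example (performs a network request):
#   payload = fetch_ligands("P12345")
#   print(mean_beta_factors(payload))
```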

Contact

For questions, suggestions, or feedback, please reach out:

Ashar Malik (ashar.malik@uq.edu.au)

We welcome feedback, collaborations and contributions.