VIVID

About VIVID

VIVID is a user-friendly web server for visualising genomic mutations, from single-individual to population scale, and their impacts on protein structure, function, and evolution.

A step-by-step VIVID tutorial is also available as a video and a PDF.

Read more here: Pre-print manuscript

Download example dataset 1
Plasmodium falciparum: Erythrocyte binding antigen protein (source)

Download example dataset 2
SARS-CoV-2: Spike protein (source)

Run Analysis

The job submission page for input files can be accessed via the Run item in the top menu bar.

Provide the following information:
At this input step, users have two options for providing a protein structure (PDB file):
(a) Upload a protein structure from the local system or access it via the RCSB PDB database.

(b) Ask VIVID to check whether a protein structure is available in the AlphaFold Protein Structure Database. First, the query sequence is searched with BLAST against the SWISSPROT database to obtain the UniProt ID of the top hit.
Second, this UniProt ID is searched in the AlphaFold Protein Structure Database to obtain the protein structure.
(c) After this search, VIVID reports the UniProt ID, the sequence identity, and the average pLDDT score of the model structure so that the user can decide whether to proceed with the analysis.
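The average pLDDT score mentioned in (c) can also be checked locally. AlphaFold model files store the per-residue pLDDT in the B-factor column of each ATOM record, so a mean over alpha carbon atoms gives one value per model. The following is a minimal Python sketch, not VIVID's own code; the function names and the helper `atom_line` (used only to build a small demonstration input) are hypothetical.

```python
def atom_line(serial, name, res, chain, resseq, x, y, z, bfac):
    """Build one fixed-width PDB ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:>5} {name:^4} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00{bfac:6.2f}")


def average_plddt(pdb_text):
    """Mean pLDDT over alpha carbon atoms of an AlphaFold-style PDB file."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB columns (1-based): atom name 13-16, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no alpha carbon ATOM records found")
    return sum(scores) / len(scores)


# Demonstration input: two residues with made-up pLDDT values.
demo = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 0.0, 0.0, 0.0, 50.0),
    atom_line(2, "CA", "MET", "A", 1, 1.5, 0.0, 0.0, 80.0),
    atom_line(3, "CA", "ALA", "A", 2, 5.3, 0.0, 0.0, 90.0),
])
print(average_plddt(demo))
```

Only the alpha carbon records contribute, so the nitrogen atom in the demonstration input is ignored.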

Provide the following information:
The complete nucleotide coding sequence of a gene can be provided either by uploading a FASTA file from the local system (1a) or by pasting the sequence along with the gene ID into the box (1b).

Please note:
The header in a FASTA file should contain only '>gene_id', without any additional information such as delimiters ('|', space, tab, etc.) or annotation (1b).
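A header can be checked against this rule before submission. The sketch below is an illustration in Python, not the validation VIVID itself applies; it assumes that a gene ID consists only of letters, digits, underscores, dots, and hyphens, which is stricter than the note above strictly requires.

```python
import re

# Assumption: a gene ID uses only letters, digits, '_', '.', and '-'.
HEADER_PATTERN = re.compile(r">[A-Za-z0-9_.\-]+")


def is_simple_fasta_header(header):
    """True if the header is '>' followed by a bare gene ID and nothing else."""
    return HEADER_PATTERN.fullmatch(header.rstrip("\r\n")) is not None


for header in [">gene_id", ">gene_id some annotation", ">sp|gene_id|name"]:
    print(header, is_simple_fasta_header(header))
```

Headers containing a space, a tab, or '|' are rejected, so they should be trimmed to the bare gene ID before upload.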

VCF file
Provide a VCF file that contains only bi-allelic SNPs. If the VCF file is large (> 500 MB), we suggest reducing its size by keeping only SNPs in the coding region of the query sequence to avoid lengthy processing times. If a VCF file is unavailable, SNP information can also be provided in tabular format (download example VCF file).

Please note:
The nucleotide coding sequence should be from the same version of the reference genome used to call SNPs.
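A large VCF file can be reduced locally to bi-allelic SNPs within the gene region before upload. The following Python sketch assumes an uncompressed, tab-delimited VCF; the function name, the chromosome name `chr1`, and the interval are hypothetical and only serve the demonstration.

```python
def filter_biallelic_snps(vcf_lines, chrom, start, end):
    """Keep header lines and bi-allelic SNP records inside chrom:start-end.

    start and end are 1-based and inclusive, like the VCF POS column.
    """
    kept = []
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if line.startswith("#"):
            kept.append(line)  # keep meta-information and the column header
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # skip empty or malformed lines
        rec_chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # A bi-allelic SNP has a single-base REF and exactly one single-base ALT;
        # multi-allelic sites list several ALT alleles separated by commas.
        is_snp = len(ref) == 1 and len(alt) == 1 and ref in "ACGT" and alt in "ACGT"
        if is_snp and rec_chrom == chrom and start <= pos <= end:
            kept.append(line)
    return kept


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",    # bi-allelic SNP inside the region
    "chr1\t150\t.\tA\tG,T\t.\tPASS\t.",  # multi-allelic site
    "chr1\t160\t.\tAT\tA\t.\tPASS\t.",   # deletion, not a SNP
    "chr1\t900\t.\tC\tT\t.\tPASS\t.",    # SNP outside the region
]
for line in filter_biallelic_snps(example, "chr1", 90, 200):
    print(line)
```

For very large or compressed files, a dedicated tool is likely a better choice than this line-by-line sketch.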

GFF file
Provide a GFF annotation file that contains the "CDS" feature. If a GFF file is unavailable, CDS coordinates can also be provided in tabular format (download example GFF file).
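To confirm that a GFF file contains usable "CDS" features, the coordinates can be extracted and inspected locally. This is a minimal Python sketch assuming the standard nine-column, tab-delimited GFF layout; the function name and the example records are hypothetical, and the tabular format VIVID accepts is defined by its own example file, not by this output.

```python
def cds_coordinates(gff_lines):
    """Return (seqid, start, end, strand) for each CDS feature, sorted by position."""
    cds = []
    for raw in gff_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete nine-column feature line
        seqid, _source, feature_type, start, end, _score, strand = fields[:7]
        if feature_type == "CDS":
            cds.append((seqid, int(start), int(end), strand))
    return sorted(cds, key=lambda row: (row[0], row[1]))


example = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t350\t.\t+\t.\tID=gene_id",
    "chr1\tsrc\tCDS\t300\t350\t.\t+\t0\tParent=gene_id",
    "chr1\tsrc\tCDS\t100\t198\t.\t+\t0\tParent=gene_id",
]
segments = cds_coordinates(example)
total_length = sum(end - start + 1 for _, start, end, _ in segments)
print(segments, total_length)
```

A total CDS length that is not a multiple of three usually indicates that the annotation and the submitted coding sequence do not match.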

Protein structure (PDB)
If you have your own PDB structure, provide a protein structure file (4a or 4c) and chain ID (4b). The structure can be provided either by uploading a PDB file from the local system (4a) or by accessing it via the RCSB PDB with a PDB ID (4c).
These details are auto-filled if an AlphaFold model is selected at the beginning.

Genetic code
Select the genetic code used to translate the queried coding sequence into amino acids from the drop-down menu (default: 'Standard Code').
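The effect of the genetic code choice can be seen by translating a short sequence by hand. The Python sketch below builds the standard code and translates codon by codon; it is an illustration only, and the function name and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, table=STANDARD_CODE):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return "".join(table[cds[i:i + 3]] for i in range(0, len(cds), 3))


print(translate("ATGGCCTGGTAA"))
```

Other genetic codes reassign a few codons. In the vertebrate mitochondrial code, for example, TGA encodes tryptophan instead of a stop, so selecting the wrong code can truncate or alter the translated protein.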

Thresholds
These inputs are optional and apply to the "Contact Map" and "BioStructmap" analyses, which display pairwise residue interactions and perform 3D sliding window population genetics analyses, respectively. The Ångstrom (Å; 1 Å = 0.1 nm) is the standard unit of measurement for protein crystal structures.

Here, the Ångstrom threshold (6a) is the maximum Euclidean distance in 3D space between the alpha carbon atoms of two amino acids (default: 10 Å) in the contact map. The primary distance threshold (6b) is the number of amino acids separating two residues in the primary sequence (default: 6 amino acids) in the contact map. Radius Ångstrom (6c) is the Euclidean distance from the alpha carbon atoms of mutated residues (default: 15 Å) within which 3D sliding window population genetic analyses are performed.
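How the two contact map thresholds combine can be sketched in a few lines of Python. This is an illustration under stated assumptions, not VIVID's implementation: residues are represented by their alpha carbon coordinates in sequence order, a contact requires a distance of at most the Ångstrom threshold, and the sequence separation must be strictly greater than the primary distance threshold. The function name and coordinates are hypothetical.

```python
import math


def contact_pairs(ca_coords, angstrom_threshold=10.0, primary_distance=6):
    """List (i, j, distance) for residue pairs that count as contacts.

    ca_coords holds one (x, y, z) tuple per residue, in sequence order.
    """
    pairs = []
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if j - i <= primary_distance:
                continue  # too close in the primary sequence
            distance = math.dist(ca_coords[i], ca_coords[j])
            if distance <= angstrom_threshold:
                pairs.append((i, j, distance))
    return pairs


# Nine residues: the first and last are close in space but far apart in sequence.
coords = [(0.0, 0.0, 0.0)]
coords += [(100.0 * k, 0.0, 0.0) for k in range(1, 8)]
coords += [(3.0, 4.0, 0.0)]
print(contact_pairs(coords))
```

Raising the primary distance threshold removes contacts between sequence neighbours, while raising the Ångstrom threshold admits more distant pairs.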

Email
By providing an email address, the user can receive a notification email after job completion.
Click SUBMIT to get your results.

Results

External Information
Provides links to the UniProt and Pfam databases for additional biological information. Users can render structural and functional domain information (e.g., conserved structural domains, active sites, etc.) on the protein sequence and structure to identify mutational hotspots.

Please note:
The UniProt and Pfam IDs are obtained by performing a BLAST search of the nucleotide query sequence against the SWISSPROT database.

Protein Sequence Viewer
The nucleotide coding sequence of a gene is translated into the amino acid residues of a protein, with synonymous (purple) and non-synonymous (yellow) mutations highlighted by default. Only residues present in the PDB file are shown in the primary sequence. This interactive panel can be used to select External Information from above and highlight it in the primary sequence and the 3D visualisation.
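The distinction between synonymous and non-synonymous mutations can be illustrated with a short Python sketch. It assumes the standard genetic code and a SNP position already expressed in 1-based coding sequence coordinates on the coding strand; the function name and example sequence are hypothetical, and this is not VIVID's own code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(
        product("TCAG", repeat=3),
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    )
}


def classify_snp(cds, pos, alt):
    """Classify a SNP at 1-based CDS position pos with alternate base alt.

    Returns (effect, reference amino acid, alternate amino acid, residue number).
    """
    index = pos - 1
    offset = index % 3            # position of the SNP within its codon
    codon_start = index - offset
    ref_codon = cds[codon_start:codon_start + 3].upper()
    if len(ref_codon) != 3:
        raise ValueError("position falls in an incomplete codon")
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    effect = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return effect, ref_aa, alt_aa, codon_start // 3 + 1


print(classify_snp("ATGGCCTAA", 6, "T"))  # GCC -> GCT
print(classify_snp("ATGGCCTAA", 4, "A"))  # GCC -> ACC
```

Both example SNPs fall in the second codon: the change at the third codon position leaves alanine unchanged, while the change at the first codon position replaces it with threonine.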

Contact Map
This represents pairwise residue-residue interactions. Interactions that fall within the user-defined Ångstrom threshold in 3D space and involve residues separated by more than the primary distance threshold are shown in blue; interactions involving mutated residues are highlighted in pink. This interactive panel allows users to zoom in/out by selecting a box. Users can also hover over interactions to display details of the interacting residues.

Please note:
If users want to display more long-range interactions in 2D space and avoid closely associated residues in the primary sequence, they can go back to the submission page and increase the Ångstrom threshold and the Primary distance threshold.

3D Visualisation
This panel shows an interactive 3D viewer. The default view displays interatomic interactions between the wild-type residue and nearby residues.

Interactions can be hidden or displayed using the controls provided (4a). The viewer can be manipulated using the buttons at the bottom of the panel (4b). VIVID allows users to render multiple 3D views by selecting options from the 'colour scheme' drop-down menu in the control panel (4c). The default colour scheme shows synonymous and non-synonymous mutations. Other informative colour schemes include Tajima's D, Nucleotide diversity, SNP frequency, PSSM score, and Dynamut2 (ΔΔG).

Please note:
Population genetics indices such as Tajima's D and Nucleotide diversity are calculated with a 3D sliding window analysis (default Radius Ångstrom = 15) in the BioStructmap program.
Depending on the size of the protein, the default Radius Ångstrom value might not be suitable; in that case, the user can go back to the submission page and increase or decrease the Radius Ångstrom threshold. For more details, please refer to the BioStructmap publication.
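The idea of a 3D sliding window can be sketched as follows: for each residue, the window contains all residues whose alpha carbon lies within the chosen radius, and a statistic is then computed over that set. The Python sketch below only builds the windows. It does not reproduce the BioStructmap implementation or the statistics themselves, and the function name and coordinates are hypothetical.

```python
import math


def spatial_windows(ca_coords, radius=15.0):
    """Map each residue index to the indices of residues within radius of it.

    ca_coords holds one (x, y, z) alpha carbon coordinate per residue.
    """
    return {
        i: [j for j, other in enumerate(ca_coords) if math.dist(centre, other) <= radius]
        for i, centre in enumerate(ca_coords)
    }


coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(spatial_windows(coords, radius=15.0))
print(spatial_windows(coords, radius=20.0))
```

With the smaller radius the third residue sits alone in its window, which illustrates why the radius may need adjusting to the size of the protein.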

Mutational Analysis
This panel is shown once the Dynamut2 computations are completed. It displays predicted changes in folding free energy (ΔΔG) for substituted amino acids, indicating their effect on protein structure stability and flexibility, as computed by the Dynamut2 program.
ΔΔG values are shown as a bar chart (5a) and in tabular format (5b). They can also be visualised in the 3D visualisation panel by selecting the option from the 'colour scheme' drop-down menu. For more details about the ΔΔG calculation, please refer to the Dynamut2 publication.

Arpeggio Results
This panel displays information about interatomic interactions between the substituted residue and nearby residues in protein structures. It mainly reports changes among 20 interatomic interaction types when wild-type and mutant residues are compared. Users can use this table to find interactions lost and gained after a substitution in a protein structure.
By default, information about nine types of interactions is shown in the table (6a). Users can click on 'column visibility' (6b) to display additional interactions. For more details, please refer to the Arpeggio publication.

Download
Users can download the Dynamut2 results and the Arpeggio interactions for substituted protein residues.

Contact Us
If you experience any issues using VIVID, or if you have any suggestions or comments, please do not hesitate to contact us via email or our group website.
If you are contacting us about a job submission, please include details such as the input information and the job identifier.