Readme

What is a Structural Alphabet?

A structural alphabet is a way to represent protein structures using a discrete set of symbols, similar to how amino acids represent protein sequences. These symbols capture local backbone conformations, allowing complex 3D structures to be encoded as linear sequences. The alphabet used here is the 3Di alphabet generated by Foldseek.

How Can These Characters Be Used to Build Alignments?

Once 3D structures are converted into sequences of structural characters, traditional sequence alignment methods can be applied to compare protein structures. The resulting multiple structure alignments can be written out as multiple sequence alignments (MSAs) and used in downstream analyses.
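As an illustration of the idea above, the following minimal sketch aligns two toy strings of structural characters with Needleman-Wunsch global alignment, a classic sequence alignment method. The strings and the match, mismatch, and gap values are invented for illustration only; a real pipeline would use a 3Di substitution matrix and its own gap penalties, which this sketch does not reproduce.

```python
def global_align(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a toy scoring scheme."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    row_a, row_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                row_a.append(a[i - 1])
                row_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b)), score[n][m]


# Toy structural-character strings (not taken from real structures).
aligned_a, aligned_b, total = global_align("DVVLC", "DVLC")
```

The two returned rows have equal length and can be read column by column, which is the form in which an MSA of structural characters is inspected.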

Why Are Correct Alignments Important?

Alignments determine how structures are compared and analyzed. Poor alignments can lead to incorrect evolutionary interpretations, misidentification of structural similarities, and errors in downstream applications like phylogenetic analysis.

What Does This App Do?

Structome-AlignViewer lets users visualize 3Di character alignments alongside the corresponding molecular structures. It provides tools for inspecting alignment quality against the structures themselves, which helps ensure accuracy in structure-based evolutionary analyses.

Input Data Format

The input is a list of PDB chain IDs separated by semicolons (e.g., 1hv4_A;1hv4_B;1hv4_C).
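For anyone preparing input programmatically, a small sketch of splitting and checking such a string is shown here. The function name `parse_chain_ids` is hypothetical, and the pattern assumes the classic four-character PDB entry ID followed by an underscore and a chain label; the server's own validation rules may differ.

```python
import re

# Assumed ID shape: 4-character PDB entry ID, underscore, chain label.
CHAIN_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}_[A-Za-z0-9]+$")


def parse_chain_ids(text):
    """Split a ';'-delimited string into chain IDs and reject malformed ones."""
    ids = [part.strip() for part in text.split(";") if part.strip()]
    bad = [chain_id for chain_id in ids if not CHAIN_ID.match(chain_id)]
    if bad:
        raise ValueError(f"Unrecognised chain IDs: {bad}")
    return ids


chains = parse_chain_ids("1hv4_A;1hv4_B;1hv4_C")
```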

What to Look for in Structures Before Providing Them to Structome-AlignViewer

Results Page

The results page displays:

Per-Column Confidence Score and Global Confidence Score

When analyzing alignments, it is crucial to assess the reliability of each column. The per-column confidence score is a proxy for distinguishing regions where structures are well aligned from regions of uncertainty. It is derived from the substitution matrix, which assigns a value to each character replacement based on its likelihood in structural alignments. Higher scores indicate greater agreement between structures, while lower scores suggest high variability or misalignment.

The Average (Avg) is the global confidence score: the mean of the per-column confidence scores over the well-aligned columns only, after low-information regions have been filtered out. This keeps poorly aligned regions from biasing the global score. A higher global confidence score suggests a well-conserved alignment, while a lower score may indicate inconsistent structural relationships. This information can be used to refine alignments or to flag regions that need further inspection.
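The exact scoring formula is not given here, so the sketch below shows only one plausible reading of the description. It assumes that a column's score is the mean pairwise substitution score of its non-gap characters, rescaled to the range 0 to 1, and that "low-information" columns are those with too many gaps. The tiny matrix, the 0.5 gap threshold, and all function names are hypothetical stand-ins, not the values or code used by the server.

```python
from itertools import combinations

# Toy symmetric substitution scores for three characters (hypothetical values).
TOY_MATRIX = {
    ("A", "A"): 4, ("D", "D"): 4, ("V", "V"): 4,
    ("A", "D"): -2, ("A", "V"): 0, ("D", "V"): -4,
}
LO = min(TOY_MATRIX.values())
HI = max(TOY_MATRIX.values())


def sub_score(a, b):
    """Look up a pair in either order, since the matrix is symmetric."""
    return TOY_MATRIX[(a, b)] if (a, b) in TOY_MATRIX else TOY_MATRIX[(b, a)]


def column_confidence(column):
    """Mean pairwise score of the non-gap characters, rescaled to [0, 1]."""
    residues = [c for c in column if c != "-"]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0  # fewer than two characters: nothing to compare
    mean = sum(sub_score(a, b) for a, b in pairs) / len(pairs)
    return (mean - LO) / (HI - LO)


def global_confidence(rows, max_gap_fraction=0.5):
    """Return per-column scores and their average over gap-poor columns."""
    columns = list(zip(*rows))
    scores = [column_confidence(col) for col in columns]
    kept = [
        s for col, s in zip(columns, scores)
        if col.count("-") / len(col) <= max_gap_fraction
    ]
    avg = sum(kept) / len(kept) if kept else 0.0
    return scores, avg


# Toy alignment: the last column is mostly gaps and is left out of the average.
per_column, avg = global_confidence(["DVA-", "DVD-", "DAVV"])
```

In this toy alignment the first column is fully conserved and scores 1.0, while the gap-heavy last column is excluded from the average, so it cannot drag the global score down.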

Update (26 March 2025): The per-column confidence is now encoded as the B-factor and visualized on each structure, and these structures can be downloaded. Low-confidence regions appear blue (score approaching 0), whereas high-confidence regions appear red (score approaching 1).
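To make the B-factor encoding concrete, here is a minimal sketch of how per-column scores could be written into a structure file. It assumes fixed-column PDB format, where the B-factor occupies columns 61 to 66 of each ATOM record, and it assumes that the residues in the file appear in the same order as the non-gap characters of that structure's row in the alignment. The function names are hypothetical, and the server's own implementation may differ.

```python
def residue_scores(aligned_row, column_scores):
    """Keep the score of every column in which this structure has a character."""
    return [s for ch, s in zip(aligned_row, column_scores) if ch != "-"]


def write_bfactors(pdb_lines, scores):
    """Return PDB lines (no trailing newlines) with B-factors set per residue."""
    out = []
    index = -1       # index of the current residue in `scores`
    last_key = None  # chain ID + residue number + insertion code
    for line in pdb_lines:
        if line.startswith("ATOM"):
            key = line[21:27]
            if key != last_key:
                # A new residue starts whenever the residue key changes.
                index += 1
                last_key = key
            if index < len(scores):
                line = line.ljust(66)
                # Columns 61-66 (0-based slice 60:66) hold the B-factor.
                line = f"{line[:60]}{scores[index]:6.2f}{line[66:]}"
        out.append(line)
    return out
```

A viewer that colours by B-factor will then show the scores directly on the structure, with values near 0 and near 1 at opposite ends of the colour scale.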

Alignment Statistics Panel

The statistics panel summarizes the quality of the alignment. This panel provides:

These statistics help users quickly assess the reliability and structural consistency of an alignment.

What Can Be Downloaded?

Users can download:

Citation

The preprint should be available soon.

Contact

For any issues with the server or its functionality, please email the address below and include the relevant job IDs.

Ashar Malik

Email: ashar.malik@uq.edu.au