StructomeDB

Name: StructomeDB
Creator: Ashar J. Malik

v1.0 · 2026

Vanessa Robinson, Haoyue Hu, Ashar J. Malik* and David B. Ascher*

Structome-wide pairwise protein structural and sequence comparison database

StructomeDB holds 104,528,817 structurally detected pairs from an exhaustive all-vs-all comparison of 61,631 UniProt-linked representative chains drawn from the full experimental PDB, spanning all domains of life as well as viruses. Each pair carries a unified feature vector of structural and sequence alignment metrics, together with full biological annotation.

104.5M unique pair records

61,631 representative chains

7,351 species represented

All domains of life covered, plus viruses

Sequence–Structure Similarity Landscape

[Interactive plot. Axes: sequence similarity (BLASTP) and structural similarity (Foldseek TM-score). Adjustable thresholds, shown at struct ≥ 0.5 and seq ≥ 0.50.]

++ Homologs: high structural and high sequence similarity

+− Convergent candidates: high structural, low sequence similarity

−+ Divergent candidates: low structural, high sequence similarity

−− Unrelated: low structural and low sequence similarity

Export

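Exports are drawn with a user-supplied random seed; the same seed returns the same sample.

Each export samples up to 10,000 pairs per quadrant (fewer if the quadrant holds fewer), with the quadrants defined by the user's chosen thresholds.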

Rate limited to 5 exports / minute · 20 / hour per IP.
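
The seeded export behaviour can be illustrated with a short sketch. How the server actually draws its sample is not documented on this page; the function name `sample_quadrant` and the placeholder pair identifiers below are hypothetical, and the code only shows the general idea that a fixed seed makes a capped random sample reproducible.

```python
import random

def sample_quadrant(pairs, seed, k=10_000):
    """Return up to k pairs, chosen reproducibly for a given seed.

    Hypothetical helper: illustrates seeded sampling only, not the
    database's own export code.
    """
    if len(pairs) <= k:
        return list(pairs)        # fewer than k available: return them all
    rng = random.Random(seed)     # private generator, global state untouched
    return rng.sample(pairs, k)

# Placeholder identifiers standing in for the pairs of one quadrant.
pairs = [f"pair_{i}" for i in range(50_000)]

a = sample_quadrant(pairs, seed=42)
b = sample_quadrant(pairs, seed=42)
c = sample_quadrant(pairs, seed=43)
print(len(a), a == b, a == c)
```

Using a private `random.Random` instance rather than the module-level functions keeps the result independent of any other random calls made elsewhere in the same program.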

Compare two proteins

On-demand structural and sequence comparison

Analyse →

Two-axis classification

Each pair is assigned to one of four quadrants according to its Foldseek TM-score (structural axis) and its BLASTP similarity (sequence axis). Both axes are binned at 0.1 resolution, giving a 10×10 grid.
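
As a rough illustration of the two-axis scheme, the sketch below assigns a quadrant and a grid cell to a pair of scores. It assumes that both scores are scaled to the range 0 to 1, that thresholds and bin edges are inclusive on the lower side, and that a score of exactly 1.0 falls in the last bin; none of these details are confirmed here. The function names and the ASCII quadrant codes are hypothetical.

```python
# Quadrant codes use ASCII "+" and "-" for convenience.
LABELS = {
    "++": "homolog",
    "+-": "convergent candidate",
    "-+": "divergent candidate",
    "--": "unrelated",
}

def quadrant(struct, seq, struct_thr=0.5, seq_thr=0.5):
    """Two-character code: structural sign first, then sequence sign."""
    s = "+" if struct >= struct_thr else "-"
    q = "+" if seq >= seq_thr else "-"
    return s + q

def grid_bin(score, n_bins=10):
    """Index (0..n_bins-1) of the bin containing a score in [0, 1]."""
    # min() folds a score of exactly 1.0 into the last bin (assumption).
    return min(int(score * n_bins), n_bins - 1)

def classify(struct, seq):
    code = quadrant(struct, seq)
    return code, LABELS[code], (grid_bin(struct), grid_bin(seq))

# High structural similarity with low sequence similarity.
print(classify(0.82, 0.31))
```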

Full feature vectors

Every pair stores the alignment coordinates, the rotation matrix U, and the translation vector T, so the two structures can be superimposed immediately without re-running any search.
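
A stored rotation and translation can be applied to atomic coordinates with a few lines of code. The sketch below assumes the common convention x' = U·x + T, with U given as a 3×3 row-major matrix and T as a 3-vector. Which chain is the moving one, and whether the database uses this exact convention, should be checked against its documentation. The function name `superimpose` and the example values are hypothetical.

```python
def superimpose(coords, U, T):
    """Apply x' = U.x + T to each (x, y, z) point (assumed convention)."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            U[i][0] * x + U[i][1] * y + U[i][2] * z + T[i]
            for i in range(3)
        ))
    return moved

# Toy transform: rotate 90 degrees about the z axis, then shift by +1 in x.
U = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
T = [1.0, 0.0, 0.0]

moved = superimpose([(1.0, 0.0, 0.0), (0.0, 2.0, 5.0)], U, T)
print(moved)
```

Because the transform is a single matrix multiplication and addition per atom, it costs far less than recomputing the structural alignment.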

Analyse any pair

Submit any two PDB chains via the Analyse endpoint to compute the full pairwise feature vector on demand, regardless of whether the pair is in the pre-computed database.