p(Zn): A Background-Calibrated Classifier for Zn/Mn Binding Pocket Determination
Abstract. Distinguishing zinc-binding from manganese-binding sites remains challenging because many metalloproteins share similar folds while differing in subtle local coordination chemistry. We define p(Zn) as the probability that a metal-binding pocket is zinc-like rather than manganese-like, conditioned on structural and chemical descriptors of the local environment.
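One way to make this definition concrete is sketched below. The notation is assumed here and does not come from the text: x is the descriptor vector of a pocket, M is its bound metal, \pi_{Zn} and \pi_{Mn} are the fractions of Zn and Mn sites in a reference population, and f(x | ·) is the class-conditional density of descriptors. With only two classes, Bayes' rule gives:

```latex
p(\mathrm{Zn}) \;=\; P(M = \mathrm{Zn} \mid x)
\;=\; \frac{\pi_{\mathrm{Zn}}\, f(x \mid \mathrm{Zn})}
           {\pi_{\mathrm{Zn}}\, f(x \mid \mathrm{Zn}) + \pi_{\mathrm{Mn}}\, f(x \mid \mathrm{Mn})},
\qquad
p(\mathrm{Mn}) \;=\; 1 - p(\mathrm{Zn}).
```

Under this reading, p(Zn) depends on the composition of the reference population through \pi_{Zn} and \pi_{Mn}, so the same pocket can receive a different value if the reference set of sites changes.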
To estimate p(Zn), we train a supervised predictor on a background population of curated metal sites drawn from experimentally determined structures. For each metal site, features are extracted from the first coordination shell (ligand identity and geometry) and from the second shell (the surrounding residue environment), capturing both direct coordination and contextual constraints. The resulting model assigns each site a continuous score, the estimated p(Zn), that quantifies how Zn-like versus Mn-like its pocket is, so that sites can be ranked and compared across proteins, conformations, and experimental conditions.
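The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in this work: the descriptor set (first-shell ligand counts plus coordination number relative to four), the choice of logistic regression, the helper names `featurize`, `fit_logistic`, and `p_zn`, and the eight training sites are all hypothetical. The sites are invented toy examples, not curated data, and second-shell features are omitted for brevity.

```python
import math

ACIDS = ("ASP", "GLU")

def featurize(ligands):
    """Map first-shell ligand residue names to a small descriptor vector.

    Returns [n_Cys, n_His, n_Asp_or_Glu, n_water, coordination_number - 4].
    Hypothetical descriptor set; second-shell features are left out.
    """
    n_acid = sum(ligands.count(a) for a in ACIDS)
    return [ligands.count("CYS"), ligands.count("HIS"), n_acid,
            ligands.count("HOH"), len(ligands) - 4]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_score(x, w, b):
    return b + sum(wj * xj for wj, xj in zip(w, x))

def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Batch gradient descent on L2-regularized logistic loss (label 1 = Zn)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(linear_score(xi, w, b)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Invented toy "background population" (illustrative only, not curated data).
ZN_SITES = [
    ["CYS", "CYS", "CYS", "CYS"],
    ["CYS", "CYS", "HIS", "HIS"],
    ["HIS", "HIS", "HIS", "HOH"],
    ["HIS", "HIS", "GLU", "HOH"],
]
MN_SITES = [
    ["ASP", "ASP", "GLU", "HOH", "HOH", "HOH"],
    ["HIS", "ASP", "ASP", "HOH", "HOH", "HOH"],
    ["HIS", "ASP", "GLU", "GLU", "HOH", "HOH"],
    ["ASP", "GLU", "GLU", "HOH", "HOH"],
]

X = [featurize(site) for site in ZN_SITES + MN_SITES]
Y = [1] * len(ZN_SITES) + [0] * len(MN_SITES)
W, B = fit_logistic(X, Y)

def p_zn(ligands):
    """Estimated probability that a pocket is Zn-like rather than Mn-like."""
    return sigmoid(linear_score(featurize(ligands), W, B))

# Score two unseen toy pockets and print them for comparison.
for site in (["CYS", "HIS", "HIS", "HIS"],
             ["ASP", "ASP", "GLU", "GLU", "HOH", "HOH"]):
    print(site, round(p_zn(site), 3))
```

Because the output is a continuous value in [0, 1], pockets can be sorted by `p_zn` directly. In practice the toy lists would be replaced by descriptors computed from curated structures, and the scores would need checking against held-out sites before being read as probabilities.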