Effects of target structure distortions on the quality of local ligand binding site alignments. MCC is the Matthews correlation coefficient calculated against the reference alignments constructed using target crystal structures.
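As a rough illustration of how an MCC value of this kind can be obtained, the sketch below scores a predicted alignment against a reference alignment by treating every possible (query residue, target residue) pair as a binary decision. The function names, the pair-based counting scheme, and the convention of returning 0.0 when the coefficient is undefined are assumptions made for this example; they are not taken from eMatchSite.

```python
import math


def matthews_cc(tp, fp, tn, fn):
    """MCC from confusion-matrix counts; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention for a degenerate confusion matrix
    return (tp * tn - fp * fn) / denom


def alignment_mcc(predicted_pairs, reference_pairs, n_query, n_target):
    """Score a predicted alignment against a reference alignment.

    Each alignment is a collection of (query_index, target_index) pairs.
    All n_query * n_target possible pairs form the classification universe
    (a hypothetical choice for this sketch).
    """
    predicted = set(predicted_pairs)
    reference = set(reference_pairs)
    tp = len(predicted & reference)   # pairs present in both alignments
    fp = len(predicted - reference)   # predicted but not in the reference
    fn = len(reference - predicted)   # reference pairs that were missed
    tn = n_query * n_target - tp - fp - fn
    return matthews_cc(tp, fp, tn, fn)


if __name__ == "__main__":
    reference = [(0, 0), (1, 1), (2, 2)]
    predicted = [(0, 0), (1, 1), (2, 3)]
    print(alignment_mcc(predicted, reference, n_query=4, n_target=4))
```

Under this scheme, an alignment identical to the reference scores 1, and values near 0 indicate agreement no better than chance.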

ROC plots for the prediction of equivalent residue pairs using SVC and target structures of different quality. TPR and FPR are the true and false positive rates, respectively; the gray area corresponds to a random prediction.

Binding site matching is conducted using (A-C) adenine-binding and (D-F) other proteins. The accuracy of local alignment predictors is compared to that of global sequence and structure alignments for (A, D) crystal target structures, (B, E) high-quality and (C, F) moderate-quality protein models.

High- and moderate-quality models are constructed by eThread. Ligand binding sites and residues are detected by eFindSite. The accuracy is assessed by the area under the ROC curve.
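For readers unfamiliar with the metric, the following minimal sketch shows one common way to build ROC points from classifier scores and integrate them with the trapezoidal rule. The function names and the toy scores are hypothetical, and the code assumes binary 0/1 labels with at least one positive and one negative example; it is not the evaluation code used in the study.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from high to low.

    labels are 1 for true equivalent pairs and 0 otherwise; tied scores
    are processed together so that they produce a single ROC point.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.6, 0.4, 0.2]
    labels = [1, 1, 0, 1, 0]
    print(auc(roc_points(scores, labels)))
```

A perfect ranking gives an area of 1.0, whereas a random prediction, shown as the gray area in the plots, corresponds to 0.5.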

The performance of local alignment predictors is compared to that of global sequence and structure alignments for different target structures from the SOIPPA, Kahraman, Homogeneous and Steroid datasets. The alignment accuracy is assessed by a ligand heavy-atom RMSD calculated upon the superposition of aligned binding residues.
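The ligand RMSD measure can be sketched as follows: the aligned binding residues of one protein are superposed onto those of the other, the resulting rigid-body transformation is applied to the bound ligand, and the deviation between the two ligands is computed. The example below uses the standard Kabsch algorithm for the superposition. It assumes one representative coordinate per aligned residue (for instance the C-alpha atom) and a one-to-one correspondence between ligand heavy atoms; both are assumptions of this sketch, as are the function names.

```python
import numpy as np


def kabsch(mobile, reference):
    """Least-squares rotation R and translation t mapping mobile onto reference.

    Both inputs are (N, 3) arrays of paired coordinates.
    """
    mob_c = mobile.mean(axis=0)
    ref_c = reference.mean(axis=0)
    H = (mobile - mob_c).T @ (reference - ref_c)   # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ref_c - R @ mob_c
    return R, t


def ligand_rmsd(site_a, site_b, ligand_a, ligand_b):
    """RMSD between two ligands after superposing aligned binding residues.

    site_a[i] and site_b[i] are the coordinates of the i-th aligned residue
    pair; ligand_a[j] and ligand_b[j] are matching ligand heavy atoms.
    """
    R, t = kabsch(site_a, site_b)
    moved = ligand_a @ R.T + t                     # carry ligand A along
    return float(np.sqrt(((moved - ligand_b) ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    site = rng.normal(size=(8, 3))
    ligand = rng.normal(size=(10, 3))
    angle = 0.5
    rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                    [np.sin(angle), np.cos(angle), 0.0],
                    [0.0, 0.0, 1.0]])
    shift = np.array([5.0, -3.0, 2.0])
    # A correct alignment of identical sites should give an RMSD close to zero.
    print(ligand_rmsd(site, site @ rot.T + shift, ligand, ligand @ rot.T + shift))
```

Because the transformation is derived only from the aligned residues, an incorrect residue alignment misplaces the ligand and inflates the RMSD, which is what makes this quantity usable as a measure of alignment accuracy.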

Abstract

Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery.

Local binding site alignments can be constructed using sequence order-independent techniques. However, to achieve high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state.

This, in turn, complicates proteome-scale applications, where only structure models of varying quality are available for the majority of gene products. To advance the state of the art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models.

Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA.

More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models.

This represents a significant improvement over other algorithms, e. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility to investigate drug-protein interaction networks for complete proteomes with prospective systems-level applications in polypharmacology and rational drug repositioning.

