Increasing the fraction of correct solutions in ensembles of protein-protein docking models by an iterative consensus algorithm
Oliva, Romina
2025-01-01
Abstract
Protein-protein interactions play a pivotal role in numerous biological processes, making their structural prediction essential for molecular biology and medicine. Computational docking methods generate ensembles of models for protein complexes, but identifying correct solutions within these ensembles remains a challenge. This study introduces Iter-CONSRANK, an iterative consensus-based scoring algorithm designed to enrich the fraction of correct models in such ensembles. Building on the established CONSRANK algorithm, Iter-CONSRANK incorporates iterative filtering to progressively discard lower-ranked models based on inter-residue contact conservation, thus refining the ensemble. Performance evaluation using the 3K-BM5up dataset (~1.6 × 10⁵ models), the 30K-BM5 dataset (~6.4 × 10⁶ models), and the CAPRI-derived Score_set (~2.0 × 10⁴ models) demonstrated significant improvement in the proportion of correct solutions across model difficulty categories. Iter-CONSRANK effectively identified correct models within the top-ranking positions, outperforming over 150 other scoring functions. For moderately challenging targets, the algorithm enriched correct solutions by up to eightfold, making subsequent analyses more straightforward. The software is publicly available, enabling its application for pre-processing docking ensembles or independent scoring. Iter-CONSRANK is a promising tool for advancing the accuracy of protein-protein docking model evaluations.
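The iterative filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the published Iter-CONSRANK implementation. It assumes each model is already reduced to a set of inter-residue contacts (the real software derives these from 3D coordinates), scores a model by the mean conservation rate of its contacts across the current ensemble, and discards the lowest-ranked models at each round. The function names, the `keep_fraction` and `min_models` parameters, and the stopping rule are hypothetical choices made for illustration only.

```python
from collections import Counter


def consensus_scores(models):
    """Score each model by the mean conservation rate of its contacts.

    `models` maps a model name to a set of inter-residue contacts.
    A contact's conservation rate is the fraction of models containing it.
    """
    n = len(models)
    counts = Counter(c for contacts in models.values() for c in contacts)
    scores = {}
    for name, contacts in models.items():
        if contacts:
            # Integer sum first, so equal contact profiles give equal scores.
            scores[name] = sum(counts[c] for c in contacts) / (n * len(contacts))
        else:
            scores[name] = 0.0
    return scores


def iterative_consensus_filter(models, keep_fraction=0.5, min_models=2):
    """Re-rank and drop the lowest-ranked models until no more can be dropped.

    keep_fraction and min_models are illustrative assumptions, not the
    settings used by the published algorithm.
    """
    current = dict(models)
    while True:
        scores = consensus_scores(current)
        # Highest score first; ties broken by model name for determinism.
        ranked = sorted(current, key=lambda m: (-scores[m], m))
        n_keep = max(min_models, int(len(ranked) * keep_fraction))
        if n_keep >= len(ranked):
            return ranked
        current = {m: current[m] for m in ranked[:n_keep]}


# Toy ensemble: three models share an interface, two are outliers.
models = {
    "m1": {("A10", "B5"), ("A11", "B6"), ("A12", "B7")},
    "m2": {("A10", "B5"), ("A11", "B6"), ("A13", "B8")},
    "m3": {("A10", "B5"), ("A11", "B6")},
    "m4": {("A40", "B90"), ("A41", "B91")},
    "m5": {("A70", "B20")},
}
survivors = iterative_consensus_filter(models)
print(survivors)
```

In this toy ensemble the two outlier models are removed in the first round. Because conservation rates are recomputed on the reduced ensemble at each round, the surviving models are scored only against each other, which is the intuition behind enriching the fraction of correct solutions.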