Low-Quality Structural and Interaction Data Improves Binding Affinity Prediction via Random Forest

Abstract : Docking scoring functions can be used to predict the strength of protein-ligand binding. It is widely believed that training a scoring function with low-quality data is detrimental to its predictive performance. Nevertheless, there is a surprising lack of systematic validation experiments in support of this hypothesis. In this study, we investigated to what extent training a scoring function on training sets containing low-quality structural and binding data is detrimental to predictive performance. We found that low-quality data is not only non-detrimental but beneficial for the predictive performance of machine-learning scoring functions, though the improvement is smaller than that coming from high-quality data. Furthermore, we observed that classical scoring functions are not able to effectively exploit data beyond an early threshold, regardless of its quality. This demonstrates that exploiting a larger data volume is more important for the performance of machine-learning scoring functions than restricting training to a smaller set of higher-quality data.
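To make the comparison in the abstract concrete, the following is a minimal sketch of how a high-quality-only training set and a larger set that also admits low-quality entries might be assembled. It is not the authors' code. The `Complex` record, the function names, the 2.5 Å resolution cutoff, and the rule that only Kd/Ki measurements count as high quality are all assumptions made for illustration, and the example entries are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class Complex:
    """One protein-ligand complex with a measured binding affinity."""
    complex_id: str              # hypothetical identifier, not a real PDB code
    resolution: Optional[float]  # structure resolution in angstroms; None if unavailable
    measure: str                 # type of binding measurement: "Kd", "Ki" or "IC50"
    affinity: float              # measured affinity on a log scale (e.g. pKd)


def is_high_quality(c: Complex, max_resolution: float = 2.5) -> bool:
    """Assumed quality rule: a resolved structure at or below the cutoff
    AND a direct binding constant (Kd or Ki) rather than IC50."""
    return (
        c.resolution is not None
        and c.resolution <= max_resolution
        and c.measure in ("Kd", "Ki")
    )


def nested_training_sets(
    complexes: Sequence[Complex], max_resolution: float = 2.5
) -> Dict[str, List[Complex]]:
    """Return two nested training sets: the high-quality subset, and the
    full set that additionally contains the low-quality entries."""
    high = [c for c in complexes if is_high_quality(c, max_resolution)]
    return {
        "high_quality_only": high,
        "high_plus_low_quality": list(complexes),
    }


# Hypothetical entries covering each way a complex can fail the quality rule.
example = [
    Complex("cplx_a", 1.8, "Kd", 7.2),    # passes both criteria
    Complex("cplx_b", 3.1, "Ki", 5.9),    # resolution too coarse
    Complex("cplx_c", 2.0, "IC50", 6.4),  # indirect affinity measurement
    Complex("cplx_d", None, "Kd", 8.1),   # no resolution available
]

sets = nested_training_sets(example)
for name, members in sets.items():
    print(name, [c.complex_id for c in members])
```

A scoring function would then be trained once per set and evaluated on the same held-out test set, so that any difference in predictive performance can be attributed to the added low-quality data.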
Document type :
Journal articles

Cited literature [24 references]

https://hal-amu.archives-ouvertes.fr/hal-01205333
Contributor : Administrateur Hal Amu
Submitted on : Friday, September 25, 2015 - 12:42:59 PM
Last modification on : Friday, February 14, 2020 - 10:13:53 AM
Long-term archiving on : Tuesday, December 29, 2015 - 10:01:49 AM

File

molecules-20-10947.pdf
Publisher files allowed on an open archive

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Hongjian Li, Kwong-Sak Leung, Man-Hon Wong, Pedro J. Ballester. Low-Quality Structural and Interaction Data Improves Binding Affinity Prediction via Random Forest. Molecules, MDPI, 2015, ⟨10.3390/molecules200610947⟩. ⟨hal-01205333⟩
