Water Dataset
Water molecules play crucial roles in many biological processes, particularly in mediating protein-ligand interactions. Despite numerous attempts in recent years, accurately predicting structurally and energetically favorable placements of water molecules remains a grand challenge. One reason is the lack of appropriate experimental data. Furthermore, the energetic contributions of water molecules can only be measured indirectly. On the structural side, however, the electron density in crystal structures clearly shows the positions of structurally relevant water molecules. This information has the potential to improve models for placing water molecules and assessing their energetic contribution to binding in proteins and at protein interfaces.
We have compiled a high-resolution subset of the Protein Data Bank, containing 2.3 million water molecules.[1] Furthermore, we have discriminated between well-resolved water molecules and those lacking supporting electron density. To perform this classification, we measured the electron density around individual atoms (EDIAscorer), enabling the automatic quantification of experimental support. Finally, we have characterized the water molecules with a detailed profile of geometric and structural features.
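The classification step described above can be illustrated with a minimal sketch. It assumes each water molecule already carries a per-atom electron density support score. The record type `Water`, the field `density_score`, the function `split_by_density_support`, and the cutoff value are hypothetical names and numbers chosen for illustration; they are not the EDIAscorer interface, the file format of the published dataset, or the threshold used in the publication.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Water:
    """Hypothetical record for one crystallographic water molecule."""
    structure_id: str      # placeholder identifier, not a real PDB entry
    residue_label: str     # e.g. chain and residue number of the water
    density_score: float   # assumed per-atom electron density support score


def split_by_density_support(
    waters: Iterable[Water], threshold: float
) -> Tuple[List[Water], List[Water]]:
    """Split waters into (supported, unsupported) by a score cutoff.

    The threshold is supplied by the caller; no value from the
    publication is assumed here.
    """
    supported: List[Water] = []
    unsupported: List[Water] = []
    for water in waters:
        if water.density_score >= threshold:
            supported.append(water)
        else:
            unsupported.append(water)
    return supported, unsupported


# Toy records for illustration only; these are not taken from the dataset.
toy_waters = [
    Water("demo1", "A 301", 0.92),
    Water("demo1", "A 302", 0.31),
    Water("demo2", "B 410", 0.75),
]

# 0.5 is an arbitrary illustrative cutoff.
supported, unsupported = split_by_density_support(toy_waters, threshold=0.5)
```

In practice the cutoff would have to be taken from the publication or calibrated against the data, and the scores would come from the density analysis itself rather than from hand-written records.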
These data are freely available in the Supporting Information of the corresponding publication at https://pubs.acs.org/doi/10.1021/ci500662d and can be used to develop and validate new water models in structural biology, as well as in molecular design.
[1] Nittinger, E.; Schneider, N.; Lange, G.; Rarey, M. Evidence of Water Molecules - A Statistical Evaluation of Water Molecules Based on Electron Density. J. Chem. Inf. Model. 2015, 55 (4), 771-783. DOI: https://doi.org/10.1021/ci500662d