ActivityFinder
Overview
ActivityFinder automatically links protein–ligand crystal structures to bioactivity data without requiring external services or continuous data connections. It relies exclusively on structural information from PDB files and on bioactivity data stored in a structured SQL database such as ChEMBL. The linking procedure is based on sequence alignments with explicit mutation identification and on detailed chemical structure matching. It captures and reports discrepancies such as binding-site mutations, sequence mismatches, and inconsistencies in ligand modeling. Reporting these discrepancies helps prevent unintended errors in derived data sets and downstream analyses. Although the published ActivityDB is built on public data, ActivityFinder can also be applied to proprietary data sets. In its published application, ActivityFinder linked more than 20,000 PDB structures to over one million bioactivity data points from ChEMBL.
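To illustrate what "explicit mutation identification" from a sequence alignment can look like, the following is a minimal sketch and not ActivityFinder's actual implementation. It assumes the two sequences have already been aligned and padded with `-` gap characters. The function name `find_mutations`, the parameter `ref_start`, and the output labels (for example `T3S`) are hypothetical choices made for this example.

```python
def find_mutations(aligned_ref: str, aligned_query: str, ref_start: int = 1) -> list[str]:
    """List differences between two pre-aligned sequences.

    Hypothetical helper for illustration only. Both inputs must have the
    same length and use '-' for gaps. Positions refer to the reference.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    pos = ref_start - 1  # number of the last reference residue seen so far
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res == "-" and query_res == "-":
            continue  # gap-only column carries no information
        if ref_res == "-":
            # Extra residue in the query, inserted after reference position pos
            mutations.append(f"ins{pos}{query_res}")
            continue
        pos += 1
        if query_res == "-":
            # Reference residue missing in the query
            mutations.append(f"{ref_res}{pos}del")
        elif ref_res != query_res:
            # Point substitution, e.g. T3S
            mutations.append(f"{ref_res}{pos}{query_res}")
    return mutations


if __name__ == "__main__":
    print(find_mutations("MKTAYIAK", "MKSAY-AK"))  # ['T3S', 'I6del']
```

A real pipeline would additionally need to decide which of the reported positions belong to the binding site, which this sketch does not attempt.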
Limitations
ActivityFinder can annotate bioactivity data only for PDB files that contain ligands. It cannot currently distinguish between contradicting stereoisomers and cases with unspecified stereocenters. For ligand entries with multiple molecular components, only the largest component is retained. The ligand topology is annotated solely from the ligand coordinates in the PDB files. An update mechanism for the resulting ActivityDB has not yet been implemented.
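The effect of retaining only the largest component can be sketched as follows. This is an illustration under stated assumptions, not the method used by ActivityFinder: it assumes the ligand entry is available as a SMILES string with components separated by `.`, and it defines "largest" as the component with the most heavy (non-hydrogen) atoms. The helper names `heavy_atom_count` and `largest_component` are hypothetical, and the regex-based atom counting is a simplification that a cheminformatics toolkit would replace.

```python
import re

# Atoms in SMILES: bracket atoms, two-letter halogens, then the organic
# subset (aliphatic and aromatic). Simplified for illustration.
_ATOM = re.compile(r"\[([^\]]+)\]|Cl|Br|[BCNOPSFI]|[bcnops]")
# Inside brackets: optional isotope digits followed by the element symbol.
_BRACKET_SYMBOL = re.compile(r"\d*([A-Z][a-z]?|[a-z]{1,2})")


def heavy_atom_count(smiles: str) -> int:
    """Count non-hydrogen atoms in a SMILES string (simplified parser)."""
    count = 0
    for match in _ATOM.finditer(smiles):
        bracket = match.group(1)
        if bracket is None:
            count += 1  # organic-subset atoms are never hydrogen
            continue
        symbol = _BRACKET_SYMBOL.match(bracket)
        if symbol and symbol.group(1) != "H":
            count += 1
    return count


def largest_component(smiles: str) -> str:
    """Return the component with the most heavy atoms (first one on ties)."""
    components = [part for part in smiles.split(".") if part]
    if not components:
        raise ValueError("empty SMILES string")
    return max(components, key=heavy_atom_count)


if __name__ == "__main__":
    # A sodium salt: the counter-ion is dropped, the acetate is kept.
    print(largest_component("[Na+].CC(=O)[O-]"))  # CC(=O)[O-]
```

One consequence of this kind of rule is that small counter-ions and solvent molecules are discarded, while two components of similar size are resolved arbitrarily, here by input order.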
Software Availability
ActivityFinder is freely available for Linux to academic and non-commercial users as part of our NAOMI ChemBio Suite. To download ActivityFinder, register at https://software.zbh.uni-hamburg.de. Non-academic users can obtain an evaluation license free of charge. Only minimal setup steps are required to run ActivityFinder. Feedback is highly appreciated and can be sent to software.zbh(at)uni-hamburg.de.
References
Ehmki, E. S. R.; Gutermuth, T.; Harren, T.; Kurtz, S.; Rarey, M. ActivityFinder: Toward the Fully Automatic Integration of Structural and Binding Affinity Data. J. Chem. Inf. Model. 2026. DOI: https://doi.org/10.1021/acs.jcim.5c02505