SIENA Dataset
Protein binding site ensembles are essential for a comprehensive analysis of protein flexibility. In structure-based design, they help account for the conformational degrees of freedom of protein structures. SIENA is an automated five-phase pipeline that compiles a conformational ensemble for a protein binding site of interest from structures in the Protein Data Bank (PDB). The process enables on-the-fly structure selection and superposition.
Using SIENA, we created the non-intersecting binding site ensemble dataset (NBSE).[1] The set comprises 182 ensembles with more than 9000 aligned PDB structures. Moreover, it includes alignments, ligand files in SDF format, and reduced ensembles.
Users can download the dataset at https://fiona.uni-hamburg.de/37ebe22f/nbse.zip (1.2 GB).
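Since the archive is large, it can help to get an overview of its contents before unpacking it. The following is a minimal sketch that counts the files in a zip archive by extension, using only the Python standard library. The function name `count_by_extension` is hypothetical, and the internal layout of the archive is not assumed: the sketch only reports whatever file types it finds.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(archive):
    """Count the files in a zip archive by lowercase file extension.

    `archive` may be a path or a binary file-like object. Files without
    an extension are counted under the key "(none)". Nothing is extracted.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(none)"] += 1
    return counts
```

After downloading the file, calling `count_by_extension("nbse.zip")` returns a `Counter` that can be inspected with `.most_common()`, for example to see how many SDF ligand files the archive contains.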
[1] Bietz, S.; Rarey, M. SIENA: Efficient Compilation of Selective Protein Binding Site Ensembles. J. Chem. Inf. Model. 2016, 56 (1), 248-259. DOI: https://doi.org/10.1021/acs.jcim.5b00588