SIENA Dataset
Protein binding site ensembles are essential for a comprehensive analysis of protein flexibility. In structure-based design, they help account for the protein's conformational degrees of freedom. SIENA is an automated five-phase pipeline that creates an ensemble for a protein of interest from structures in the Protein Data Bank (PDB). The pipeline selects and superposes structures on the fly.
Using SIENA, we created the non-intersecting binding site ensemble dataset (NBSE). It comprises 182 ensembles with more than 9,000 aligned PDB structures. It also includes the alignments, ligand files in SDF format, and reduced ensembles. The dataset can be downloaded from https://fiona.uni-hamburg.de/37ebe22f/nbse.zip (1.2 GB in total).
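
As a convenience, the following minimal Python sketch shows one way to fetch the archive and inspect it before unpacking all 1.2 GB. The URL is the one given above; the helper names (download, count_by_extension, extract_with_suffix) and the local paths are hypothetical, and the internal directory layout of nbse.zip is not described here, so the sketch only relies on file extensions such as .sdf.

```python
import shutil
import urllib.request
import zipfile
from collections import Counter
from pathlib import Path

# URL as stated in the dataset description.
NBSE_URL = "https://fiona.uni-hamburg.de/37ebe22f/nbse.zip"


def download(url: str, target: Path) -> Path:
    """Stream the archive to disk unless a local copy already exists."""
    if not target.exists():
        with urllib.request.urlopen(url) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    return target


def count_by_extension(archive: Path) -> Counter:
    """Count files per lowercase extension without extracting anything."""
    with zipfile.ZipFile(archive) as zf:
        return Counter(
            Path(info.filename).suffix.lower()
            for info in zf.infolist()
            if not info.is_dir()
        )


def extract_with_suffix(archive: Path, suffix: str, dest: Path) -> list[Path]:
    """Extract only members whose name ends with `suffix` (case-insensitive)."""
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if not info.is_dir() and info.filename.lower().endswith(suffix.lower()):
                extracted.append(Path(zf.extract(info, dest)))
    return extracted


if __name__ == "__main__":
    # Hypothetical local file and directory names; adjust as needed.
    archive = download(NBSE_URL, Path("nbse.zip"))
    print(count_by_extension(archive))
    ligands = extract_with_suffix(archive, ".sdf", Path("nbse_ligands"))
    print(f"extracted {len(ligands)} ligand files")
```

Counting extensions first gives a quick overview of what the archive contains, and extracting by suffix avoids unpacking the aligned structures when only the ligand files are needed.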