FSees
Overview
The fragment space exhaustive enumeration system (FSees) efficiently enumerates all molecules within a specific chemical subspace. This subspace is described as a fragment space constrained by user-defined physicochemical properties. FSees uses file-based data structures so that it is not limited by the computer's main memory, which makes it possible to enumerate large chemical spaces. The resulting chemical library can serve as a starting point for computational lead-finding technologies such as similarity searching, pharmacophore mapping, docking, or virtual screening.
Method
An FSees calculation requires a so-called fragment space as input. Such a space consists of two parts: a set of fragments, each decorated with linker atoms of specific types, and a collection of rules defining how fragments can be combined via their linkers to form molecules. Fragment spaces are very generic definitions of chemical space. For publicly available examples, see the BRICS space at https://www.zbh.uni-hamburg.de/forschung/amd/datasets/brics.html or BioSolveIT's Knowledge Space at https://www.biosolveit.de/chemical-spaces/#knowledgespace.
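As a rough illustration of this definition, a fragment space can be modeled as a set of fragments plus a set of compatible linker-type pairs. The sketch below is not the FSees input format or API; the class names, the linker labels (L1, L2), and the fragments A, B, and C are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    linkers: tuple[str, ...]  # one linker type per open attachment point


@dataclass(frozen=True)
class FragmentSpace:
    fragments: tuple[Fragment, ...]
    rules: frozenset[frozenset[str]]  # unordered pairs of compatible linker types

    def compatible(self, a: str, b: str) -> bool:
        """True if a linker of type `a` may be joined with one of type `b`."""
        return frozenset((a, b)) in self.rules

    def partners(self, linker: str) -> list[tuple[str, int]]:
        """All (fragment name, linker index) pairs that can attach to `linker`."""
        return [
            (frag.name, i)
            for frag in self.fragments
            for i, other in enumerate(frag.linkers)
            if self.compatible(linker, other)
        ]


# Toy space: all names and linker types are made up for illustration.
space = FragmentSpace(
    fragments=(
        Fragment("A", ("L1",)),
        Fragment("B", ("L2", "L1")),
        Fragment("C", ("L2",)),
    ),
    rules=frozenset({frozenset(("L1", "L2"))}),
)
```

Storing each rule as an unordered pair means that compatibility is symmetric by construction: `space.compatible("L1", "L2")` and `space.compatible("L2", "L1")` give the same answer.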
FSees iteratively extends an initial fragment by systematically attaching all compatible fragments. Dedicated protocols ensure that no duplicate molecules are created along the way. During enumeration, several physicochemical properties are calculated, and every fully enumerated molecule and, where possible, every partially enumerated one is tested against the user-defined property ranges. FSees works with file-based data structures, which lets it enumerate spaces too large for main memory. FSees is typically used in combination with a compound shredder (a tool that creates a fragment space from a compound collection) or a reaction-based fragment space builder. Typical tasks for FSees are:
- create all compounds with up to three fragments by recombining fragments appearing in drug molecules
- create all compounds with a molecular weight between 350 and 450 g/mol, up to 5 hydrogen bond acceptors, and up to 10 hydrogen bond donors from drug-like fragments
- take a starting fragment F (for example, an initial hit) and create all compounds that contain F and up to three further building blocks from an in-house library, with at most one stereocenter and a clogP value between 3 and 5
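These tasks share one pattern: grow molecules from a start fragment, stop at a fragment limit, and keep only results inside a property range. The following is a minimal sketch of that pattern under several stated assumptions, not the FSees algorithm itself. The fragments, weights, and linker types are invented, weight is treated as additive over fragments, a molecule counts as complete once it has no open linkers, and an in-memory `seen` set stands in for FSees's own duplicate-avoidance protocols (fragment symmetry is not handled). It also shows one plausible reading of why partial molecules can only sometimes be tested: an upper weight bound can prune a partial molecule because weight only grows, whereas a lower bound can be checked only on complete molecules.

```python
from collections import deque

# Hypothetical toy space: name -> (linker types, weight without linkers).
FRAGMENTS = {
    "A": (("L1",), 78.0),
    "B": (("L2", "L1"), 42.0),
    "C": (("L2",), 16.0),
}
RULES = {frozenset(("L1", "L2"))}  # unordered compatible linker-type pairs
BOUND = "^"  # marks the linker a fragment used to attach to its parent


def open_slots(node, path=()):
    """Yield (path, linker_type) for every unused linker in the tree."""
    _, slots = node
    for i, slot in enumerate(slots):
        if isinstance(slot, tuple):  # an attached child fragment
            yield from open_slots(slot, path + (i,))
        elif slot != BOUND:
            yield path + (i,), slot


def attach(node, path, child):
    """Return a copy of the tree with `child` placed at `path`."""
    name, slots = node
    i = path[0]
    new = child if len(path) == 1 else attach(slots[i], path[1:], child)
    return (name, slots[:i] + (new,) + slots[i + 1:])


def enumerate_space(start, max_fragments, weight_range):
    """Breadth-first growth from `start`; returns (tree, weight) pairs."""
    lo, hi = weight_range
    root = (start, FRAGMENTS[start][0])
    queue = deque([(root, FRAGMENTS[start][1], 1)])
    seen = {root}
    results = []
    while queue:
        tree, weight, count = queue.popleft()
        slots = list(open_slots(tree))
        if not slots:  # complete molecule: test the full range
            if lo <= weight <= hi:
                results.append((tree, weight))
            continue
        if count == max_fragments:
            continue  # fragment limit reached, cannot be completed
        for path, linker in slots:
            for name, (linkers, frag_weight) in FRAGMENTS.items():
                if weight + frag_weight > hi:
                    continue  # upper bound prunes the partial molecule
                for j, other in enumerate(linkers):
                    if frozenset((linker, other)) not in RULES:
                        continue
                    child = (name, linkers[:j] + (BOUND,) + linkers[j + 1:])
                    grown = attach(tree, path, child)
                    if grown not in seen:  # skip trees reached before
                        seen.add(grown)
                        queue.append((grown, weight + frag_weight, count + 1))
    return results


hits = enumerate_space("A", max_fragments=3, weight_range=(100.0, 150.0))
weights = [w for _, w in hits]  # only A-B-C (136.0) is in range
```

In this toy space, A-C is complete but too light (94.0) and is rejected only once it is finished, while any extension that would exceed 150.0 is discarded before it is ever built.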
FSees was used to create a collection of about 0.5 billion molecules enumerated by recombining fragments from DrugBank (see also the HELLS dataset at https://www.zbh.uni-hamburg.de/forschung/amd/datasets/hells-datasets.html).
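An enumeration of this size cannot keep all of its pending work in RAM, which is the motivation for file-based data structures. The sketch below shows the general idea only, using a generic append-only file as a first-in, first-out work queue. It is not the data structure FSees uses; the class `DiskQueue`, the JSON-lines layout, and the file name are assumptions made for this example.

```python
import json
import os
import tempfile


class DiskQueue:
    """FIFO queue that keeps its items in a file instead of in RAM."""

    def __init__(self, path):
        # Separate handles: one appends at the end, one reads from the front.
        self._writer = open(path, "ab")
        self._reader = open(path, "rb")

    def push(self, item):
        """Append one JSON-serializable item as a single line."""
        self._writer.write(json.dumps(item).encode("utf-8") + b"\n")

    def pop(self):
        """Return the oldest unread item, or None if the queue is empty."""
        self._writer.flush()  # make pending writes visible to the reader
        line = self._reader.readline()
        return json.loads(line) if line else None

    def close(self):
        self._writer.close()
        self._reader.close()


# Usage with made-up partial molecules.
with tempfile.TemporaryDirectory() as tmp:
    queue = DiskQueue(os.path.join(tmp, "frontier.jsonl"))
    queue.push({"fragments": ["A"], "weight": 78.0})
    queue.push({"fragments": ["A", "C"], "weight": 94.0})
    first = queue.pop()
    queue.close()
```

Only the line currently being read or written is held in memory, so the queue can grow to the size of the disk rather than the size of RAM. The file is never truncated in this sketch; a production structure would also reclaim space that has already been consumed.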
Software Availability
FSees is available free of charge to academic and non-commercial users on Linux as part of our NAOMI ChemBio Suite. To download FSees, register at https://software.zbh.uni-hamburg.de. Non-academic users can obtain a free evaluation license. Only minimal setup is required to run FSees. Feedback is highly appreciated (software.zbh(at)uni-hamburg.de).
Datasets
The package download includes two example fragment spaces and instructions on enumerating them (“molecule_shredder”). For the result of an extensive enumeration of lead-like molecules, see the Hamburg Enumerated Lead-Like Set (HELLS) at https://www.zbh.uni-hamburg.de/forschung/amd/datasets/hells-datasets.html.
References
Lauck, F.; Rarey, M. FSees: Customized Enumeration of Chemical Subspaces with Limited Main Memory Consumption. J. Chem. Inf. Model. 2016, 56 (9), 1641–1653. DOI: https://doi.org/10.1021/acs.jcim.6b00117