FSees
Overview
The fragment space exhaustive enumeration system (FSees) is an efficient method for exhaustively enumerating all molecules within a defined chemical subspace. This subspace is described as a fragment space and constrained by a set of user-defined physicochemical properties. The FSees algorithm uses file-based data structures to overcome the limits of main memory, which allows it to enumerate very large chemical spaces. The resulting chemical library can serve as a starting point for computational lead-finding technologies such as similarity searching, pharmacophore mapping, docking, or virtual screening.
Method
An FSees calculation requires a fragment space as input. A fragment space consists of a set of fragments, each decorated with linker atoms of specific types, and a collection of rules defining how fragments can be connected via their linkers to form molecules. Fragment spaces are very generic definitions of chemical space. For publicly available examples, see the BRICS space or BioSolveIT's Knowledge Space.
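To make the description concrete, the following is a minimal sketch of how a fragment space could be represented in Python. It is not the FSees file format or its internal data model. The class names, the fragment names, and the linker types "A", "B", and "C" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    linkers: tuple  # linker types of the open attachment points


@dataclass
class FragmentSpace:
    fragments: list
    rules: set  # unordered pairs of linker types that may be bonded

    def compatible(self, linker_a, linker_b):
        """Return True if the rules allow a bond between the two linker types."""
        return frozenset((linker_a, linker_b)) in self.rules

    def partners(self, linker):
        """List (fragment name, linker index) pairs that can attach to `linker`."""
        return [
            (frag.name, i)
            for frag in self.fragments
            for i, other in enumerate(frag.linkers)
            if self.compatible(linker, other)
        ]


# Hypothetical toy space: three fragments and two connection rules.
space = FragmentSpace(
    fragments=[
        Fragment("core", ("A", "C")),
        Fragment("amide", ("B", "A")),
        Fragment("methyl", ("B",)),
    ],
    rules={frozenset(("A", "B")), frozenset(("C", "B"))},
)
```

Storing each rule as an unordered pair means that compatibility does not depend on which of the two fragments is considered first.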
FSees iteratively extends an initial fragment by systematically adding all compatible fragments. Sophisticated protocols ensure that no duplicates are created during this process. As molecules are enumerated, several physicochemical properties are calculated. Every fully enumerated molecule, and where possible every partially enumerated one, is tested against user-defined property ranges. Because FSees works with file-based data structures, even huge spaces can be enumerated. FSees is typically used in combination with a compound shredder (a tool that creates a fragment space from a compound collection) or a reaction-based fragment space builder. Typical tasks for FSees are:
- Create all compounds with up to three fragments by recombination of fragments appearing in drug molecules
- Create all compounds with a molecular weight between 350 and 450 and with up to 5 hydrogen bond acceptors and 10 donors from drug-like fragments
- Taking a starting fragment F (for example, an initial hit), create all compounds containing F and up to three further building blocks from our in-house library with at most one stereocenter and a clogP between 3 and 5
FSees was used to create a collection of about 0.5 billion molecules by recombining fragments from DrugBank (see also the HELLS dataset).
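The enumeration scheme described in this section can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the FSees implementation. The fragment names, weights, linker types, and rules are invented. Duplicates are avoided by always saturating the first open linker, which is a simple stand-in for the protocols FSees uses. Molecular weight is the only property, and a molecule counts as complete once it has no open linkers. Each generation of partial molecules is kept in a temporary JSON-lines file rather than in a list, to mimic the idea of file-based working data.

```python
import json
import os
import tempfile

# Hypothetical toy fragment space: name -> (weight, linker types).
FRAGMENTS = {
    "core": (100.0, ["A", "C"]),
    "amide": (43.0, ["B", "A"]),
    "methyl": (15.0, ["B"]),
}
# Hypothetical connection rules: unordered pairs of linker types that may bond.
RULES = {frozenset(("A", "B")), frozenset(("C", "B"))}


def extensions(state, max_weight, max_fragments):
    """Yield every state obtained by attaching a fragment to the first open linker."""
    if len(state["frags"]) >= max_fragments:
        return
    first, rest = state["open"][0], state["open"][1:]
    for name, (weight, linkers) in FRAGMENTS.items():
        new_weight = state["weight"] + weight
        if new_weight > max_weight:
            continue  # weight only grows, so partial molecules can be pruned here
        for i, linker in enumerate(linkers):
            if frozenset((first, linker)) in RULES:
                yield {
                    "frags": state["frags"] + [f"{name}@{i}"],
                    "open": rest + linkers[:i] + linkers[i + 1:],
                    "weight": new_weight,
                }


def enumerate_space(start, min_weight, max_weight, max_fragments):
    """Return all complete molecules grown from `start` within the given ranges."""
    weight, linkers = FRAGMENTS[start]
    results = []
    with tempfile.TemporaryDirectory() as workdir:
        current = os.path.join(workdir, "gen0.jsonl")
        with open(current, "w") as fh:
            seed = {"frags": [start], "open": linkers, "weight": weight}
            fh.write(json.dumps(seed) + "\n")
        generation = 0
        while os.path.getsize(current) > 0:
            generation += 1
            nxt = os.path.join(workdir, f"gen{generation}.jsonl")
            with open(current) as src, open(nxt, "w") as dst:
                for line in src:  # stream one partial molecule at a time
                    state = json.loads(line)
                    if not state["open"]:
                        # Complete molecule: apply the full property range.
                        if min_weight <= state["weight"] <= max_weight:
                            results.append(
                                {"frags": state["frags"], "weight": state["weight"]}
                            )
                        continue
                    for child in extensions(state, max_weight, max_fragments):
                        dst.write(json.dumps(child) + "\n")
            os.remove(current)
            current = nxt
    return results
```

With this toy data, `enumerate_space("core", 0, 200, 3)` returns a single molecule, the core with a methyl group on each linker. The upper weight bound is checked on partial molecules because adding fragments can only increase the weight, whereas the lower bound can only be checked once a molecule is complete.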
Software Availability
FSees is freely available to academic users for Linux (64-bit and 32-bit). Non-academic users can obtain an evaluation licence free of charge. No setup steps are needed to run FSees. All feedback is highly appreciated.
FSees is part of the AMD tools software bundle. To download FSees, register at https://software.zbh.uni-hamburg.de.
Datasets
Two example fragment spaces and instructions for enumerating them are included in the download package. For the result of an extensive enumeration of lead-like molecules, see the Hamburg Enumerated Lead-Like Set (HELLS).
People and References
FSees has been developed by Florian Lauck in the research group of Prof. Matthias Rarey at the Center for Bioinformatics, University of Hamburg. Please cite FSees with:
Lauck, F., Rarey, M. (2016). FSees: Customized Enumeration of Chemical Subspaces with Limited Main Memory Consumption. Journal of Chemical Information and Modeling, submitted for publication.
Our first prototype of a space enumerator was developed by Juri Paern in 2006. The prototype ran in main memory, which made it impossible to create collections of more than a few million molecules. FSees is therefore the first and, to our knowledge, the only exhaustive chemical space enumerator available.
Paern, J., Degen, J., Rarey, M. (2007). Exploring Fragment Spaces Under Multiple Physicochemical Constraints. Journal of Computer-Aided Molecular Design, 21(6):327-340.