Test data
MetaGenomeThreader
Synthetic Metagenome
For the synthetic metagenome, sub-sequences were extracted from only three species, because complete genomes would have required processing a large amount of data. Restricting the data in this way makes it easier to manage and control. The two bacterial species are examples of dominant members of maritime microbial communities, whereas the archaeal species is an example of an under-represented member.
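The extraction of a sub-sequence from a genome can be illustrated with a minimal sketch. It is not the procedure used to build the synthetic metagenome; it assumes 1-based, inclusive coordinates (as in NCBI records) and that a '-' strand is read as the reverse complement. The function name extract_subsequence and the toy genome are hypothetical.

```python
# Hypothetical helper: cut a window out of a genome string.
# Assumption: coordinates are 1-based and inclusive; '-' means reverse complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_subsequence(genome: str, start: int, end: int, strand: str = "+") -> str:
    """Return genome[start..end]; reverse-complement the window for the '-' strand."""
    if not (1 <= start <= end <= len(genome)):
        raise ValueError("coordinates outside the genome")
    window = genome[start - 1:end]  # convert to 0-based, end-exclusive slicing
    if strand == "-":
        window = window.translate(COMPLEMENT)[::-1]
    return window


# Toy data, not taken from any of the genomes listed in this document.
toy_genome = "ATGCGTACGTTAGC"
forward = extract_subsequence(toy_genome, 2, 7)              # "TGCGTA"
reverse = extract_subsequence(toy_genome, 2, 7, strand="-")  # "TACGCA"
```

For a real genome, the string would be read from the FASTA record of the accession in question, and the call would use the window coordinates from the table.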
Tab. 1.1 shows the domains, the species, the extracted genome sub-sequences and the coding DNA sequences of these sub-sequences, which together describe the synthetic metagenome. The coding DNA sequences were used to evaluate the MetaGenomeThreader results.
Domain    Species                                                   Extracted DNA sub-sequence  Coding DNA sequences
Bacteria  Candidatus Pelagibacter ubique HTCC1062                   +4 000...+7 000             -3 891...-4 832
          NC_007205.1                                                                           -4 835...-6 247
                                                                                                -6 249...-6 434
                                                                                                -6 444...-7 310
Bacteria  Vibrio cholerae O1 biovar eltor str. N16961 chromosome I  +795 000...+798 000         +794 241...+795 380
          NC_002505.1                                                                           +795 485...+795 817
                                                                                                +795 839...+797 692
                                                                                                +797 707...+798 654
Archaea   Pyrococcus horikoshii OT3                                 +172 000...+174 000         -171 622...-172 662
          NC_000961.1                                                                           -172 610...-173 815
                                                                                                +173 822...+174 907
Tab. 1.1: Species and NCBI accession numbers of the synthetic metagenome DNA sequences. The '+' stands for the forward strand, the '-' for the reverse strand.
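Several coding DNA sequences in Tab. 1.1 extend beyond the extracted window, so only part of them lies inside the sub-sequence. The following sketch shows how that overlap can be computed for the Candidatus Pelagibacter ubique row. It assumes the coordinates are 1-based and inclusive; the names Interval and overlap_length are hypothetical and not part of MetaGenomeThreader.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    strand: str  # '+' forward strand, '-' reverse strand
    start: int   # assumed 1-based, inclusive
    end: int     # assumed 1-based, inclusive


def overlap_length(window: Interval, cds: Interval) -> int:
    """Number of positions of `cds` that fall inside `window` (strand is ignored)."""
    lo = max(window.start, cds.start)
    hi = min(window.end, cds.end)
    return max(0, hi - lo + 1)


# Coordinates copied from the Candidatus Pelagibacter ubique row of Tab. 1.1.
window = Interval("+", 4000, 7000)
cds_list = [
    Interval("-", 3891, 4832),
    Interval("-", 4835, 6247),
    Interval("-", 6249, 6434),
    Interval("-", 6444, 7310),
]

covered = [overlap_length(window, cds) for cds in cds_list]
print(covered)  # [833, 1413, 186, 557]
```

The first and last coding sequences are only partly covered (833 of 942 and 557 of 867 positions), which may matter when predictions on the sub-sequence are compared against the full annotations.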
Test data sets are provided with long DNA sequences (Ø 850 bp) or short DNA sequences (Ø 120 bp).