Identified PCS's (Long DNA Sequences)
The MetaGenomeThreader test run used the default settings, with two exceptions: the minimum length of the amino acid sequences reported in the result output was set to 50 amino acids, and the extended mode was activated (cf. test data: short DNA sequences for the settings used with short DNA sequences). The extended mode extends the DNA sequence of each identified PCS upstream to the next start signal and downstream to the next stop signal, where possible.
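The extension step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MetaGenomeThreader's actual implementation: the function name extend_pcs is hypothetical, only ATG is treated as a start signal, the region is assumed to lie in frame on the forward strand, and a boundary is left unchanged when no signal is found.

```python
# Assumption: only ATG counts as a start signal; the tool may accept others.
START_CODONS = {"ATG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_pcs(dna, start, end):
    """Extend the in-frame region dna[start:end] to the nearest start/stop signals.

    Coordinates are 0-based and half-open; (end - start) should be a multiple of 3.
    Returns the new (start, end). A boundary stays unchanged if no signal is found.
    """
    dna = dna.upper()
    new_start, new_end = start, end

    # Walk upstream codon by codon until a start codon appears.
    pos = start - 3
    while pos >= 0:
        codon = dna[pos:pos + 3]
        if codon in STOP_CODONS:   # an in-frame stop ends the upstream search
            break
        if codon in START_CODONS:
            new_start = pos
            break
        pos -= 3

    # Walk downstream codon by codon until a stop codon appears.
    pos = end
    while pos + 3 <= len(dna):
        if dna[pos:pos + 3] in STOP_CODONS:
            new_end = pos + 3      # include the stop codon itself
            break
        pos += 3

    return new_start, new_end


# Hypothetical toy sequence: the hit CCCGGG (positions 8..13) is extended
# to the enclosing ATG ... TAA region.
toy = "CCATGAAACCCGGGTTTTAACC"
s, e = extend_pcs(toy, 8, 14)
print(toy[s:e])
```

Stepping in whole codons keeps the extension in the same reading frame as the original hit, which is why the search ignores start and stop triplets that occur out of frame.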
To evaluate the MetaGenomeThreader result, the identified PCS's were grouped into clusters according to their related target sequence. In each cluster, the bottom amino acid sequence (lowercase) is the protein sequence of the target protein (cf. Tab. 1.1), and the identified PCS's are written in uppercase letters.
Pyrococcus horikoshii; DNA primase small subunit; -171 622...-172 662; reading frame: -3
Pyrococcus horikoshii; DNA primase large subunit; -172 610...-173 815; reading frame: -2
Vibrio cholerae; preprotein translocase subunit YajC; +795 485...+795 817; reading frame: +2
Vibrio cholerae; protein export protein SecD; +795 839...+797 692; reading frame: +2
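Each target above is annotated with a signed reading frame. As a rough illustration of why the frame matters, the following sketch translates one DNA string in any of the six frames using the standard genetic code. The function name translate, the toy sequence, and the frame numbering (offset from the first base of the given strand) are assumptions made for this example; MetaGenomeThreader may number its frames differently.

```python
from itertools import product

# Standard genetic code in the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, frame):
    """Translate dna in frame +1..+3 (forward) or -1..-3 (reverse complement).

    Assumption: frame n starts at offset |n| - 1 of the strand being read.
    Codons containing characters other than A, C, G, T are rendered as 'X'.
    """
    dna = dna.upper()
    if frame < 0:
        dna = dna.translate(COMPLEMENT)[::-1]   # reverse complement
    offset = abs(frame) - 1
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


# The same hypothetical sequence yields a different peptide in every frame.
for f in (1, 2, 3, -1):
    print(f, translate("ATGGCCTAA", f))
```

Shifting the frame by a single base changes every codon, so a hit assigned to the wrong frame cannot reproduce the target protein sequence.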
No correct PCS could be identified for Candidatus Pelagibacter. MetaGenomeThreader returned only three falsely predicted PCS's.
The following example shows that the false PCS prediction was caused by detection of the wrong reading frame.
Candidatus Pelagibacter; probable periplasmic serine protease DO-like precursor; -4 835...-6 247
Result of the PCS identification if the correct reading frame had been detected:
Test Results: Statistic Section
Test Results: Interpretation