The known genetic codes are tabulated in a list
compiled by Andrzej Elzanowski and Jim Ostell of the NCBI.
During a search of a nucleic acid database, Mascot uses the taxonomy of each entry
to choose the correct genetic code. If no taxonomy information is present, it defaults to the standard code.
Taxonomy can also be defined at the database level, to handle species-specific databases such as EST_human.
In general, the code is different for mitochondrial and nuclear proteins. Although Mascot could try to
determine whether a database entry is mitochondrial by performing a keyword search of the FASTA description,
this is unreliable. In any case, mitochondrial proteins will usually represent only
a very small fraction of the entries in any
comprehensive database. The most important requirement is to use the correct code for a database that consists
specifically of mitochondrial proteins. The solution adopted in Mascot is to include a flag in the taxonomy
definition to specify whether nuclear or mitochondrial codes should be used.
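
The selection logic described above can be sketched as follows. This is a minimal illustration, not Mascot's actual implementation or configuration format: the names TaxonomyDef, TAXONOMY and select_genetic_code are hypothetical, and the rule that an entry's own taxonomy takes precedence over the database-level taxonomy is an assumption. The table numbers are the NCBI translation table ids (1 = standard, 2 = vertebrate mitochondrial, 3 = yeast mitochondrial).

```python
from dataclasses import dataclass
from typing import Optional

STANDARD_CODE = 1  # NCBI translation table 1 (the standard code)


@dataclass(frozen=True)
class TaxonomyDef:
    """Hypothetical taxonomy record holding one code per compartment."""
    name: str
    nuclear_table: int
    mito_table: int


# Hypothetical lookup keyed by NCBI taxonomy id.
TAXONOMY = {
    9606: TaxonomyDef("Homo sapiens", nuclear_table=1, mito_table=2),
    4932: TaxonomyDef("Saccharomyces cerevisiae", nuclear_table=1, mito_table=3),
}


def select_genetic_code(entry_taxid: Optional[int],
                        db_taxid: Optional[int] = None,
                        use_mitochondrial: bool = False) -> int:
    """Return the NCBI translation table id to use for one database entry.

    Assumption: the entry's own taxonomy, when present, overrides the
    database-level taxonomy. The mitochondrial flag comes from the
    taxonomy definition, not from a keyword search of the description.
    """
    taxid = entry_taxid if entry_taxid is not None else db_taxid
    tax = TAXONOMY.get(taxid)
    if tax is None:
        # No usable taxonomy information: fall back to the standard code.
        return STANDARD_CODE
    return tax.mito_table if use_mitochondrial else tax.nuclear_table
```

For a database of human mitochondrial sequences with no per-entry taxonomy, a call such as select_genetic_code(None, db_taxid=9606, use_mitochondrial=True) would select table 2, while an entry with no taxonomy at all falls back to table 1.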