The euGenes/Arthropods database at http://arthropods.eugenes.org/ is a comparative reference of current gene sets, gene orthology and summary data for arthropod genomes. A recent OrthoMCL orthology for 263,000 genes of 14 species was computed on the TeraGrid (Dec 2009). Web-searchable common gene summary pages describe these orthologs, along with GBrowse genome maps and BLAST services. Analytical summaries include gene structure statistics, EST assemblies, gene quality statistics, and species ortholog gain/loss tables.
Summary findings include:
(1) Certain features of gene structure are conserved (coding length), while others distinguish species (intron size and number, gene duplications).
(2) EST assemblies are useful to validate gene predictions and genome assemblies.
(3) Daphnia pulex maintains the strongest homology to human and other non-arthropod eukaryotes, followed by Ixodes. Among insects, Tribolium has the strongest non-insect homology, partly due to good-quality gene modeling.
(4) Gene duplication rate is more variable than singleton gene rate, and lacks a strong phylogenetic trend.
(5) As-yet-unpredicted orthologous genes exist in most of these genomes, and the species' gene sets differ in how many such genes they lack.
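
As an illustration of the kind of tally behind a species ortholog gain/loss table, the following is a minimal sketch and not the database's own pipeline. It assumes ortholog groups are available as text lines of the form `group_id: taxon|gene taxon|gene ...` (the layout used by OrthoMCL v2 group files; the actual files behind this resource may differ). The group identifiers, species codes (`dpul`, `tcas`, `dmel`) and gene names are hypothetical.

```python
from collections import Counter


def parse_groups(lines):
    """Parse 'group_id: taxon|gene ...' lines into {group_id: Counter(taxon -> copies)}.

    The line layout is an assumption (OrthoMCL v2 style), not taken from the source.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, _, members = line.partition(":")
        # Keep only the taxon prefix of each 'taxon|gene' member and count copies.
        groups[group_id.strip()] = Counter(m.split("|", 1)[0] for m in members.split())
    return groups


def summarize(groups, species):
    """Count, per species, the groups where it is single-copy, duplicated, or absent."""
    summary = {sp: {"single_copy": 0, "duplicated": 0, "absent": 0} for sp in species}
    for counts in groups.values():
        for sp in species:
            n = counts.get(sp, 0)
            if n == 0:
                summary[sp]["absent"] += 1
            elif n == 1:
                summary[sp]["single_copy"] += 1
            else:
                summary[sp]["duplicated"] += 1
    return summary


# Hypothetical input: three ortholog groups over three species codes.
example_lines = [
    "ARP1_G1: dpul|g1 dpul|g2 tcas|g7 dmel|g3",
    "ARP1_G2: dpul|g5 dmel|g9",
    "ARP1_G3: tcas|g8 tcas|g9 tcas|g10 dmel|g4",
]

table = summarize(parse_groups(example_lines), ["dpul", "tcas", "dmel"])
for sp, row in table.items():
    print(sp, row)
```

In such a table, an "absent" entry cannot by itself distinguish a true gene loss from a gene that exists in the genome but was not predicted; separating the two would need additional evidence, such as EST assemblies or a homology search against the genome sequence.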