Evidential Genes for Nasonia vitripennis (OGS2)

We are pleased to announce a new official gene set for Nasonia vitripennis (OGS version 2), incorporating extensive new expression evidence and Hymenoptera/Arthropoda protein homology. Nasonia now has one of the most complete hymenopteran gene sets, with the fewest missing orthology gene groups and the most species-unique genes supported by gene transcript evidence. Of the 8147 gene groups common to 6 or more species among 6 hymenopterans and 4 other arthropods, Nasonia is missing only 375, compared with 429 missing in Camponotus (carpenter ant), 481 missing in Bombus (bumble bee), and 632 missing in Apis (honey bee).

Gene data files in several formats, including detailed evidence annotations, are available here:
http://arthropods.eugenes.org/EvidentialGene/nasonia/

A web genome map with GBrowse, BLAST sequence search of these genes, and gene orthology assessment are available through the link above.

Gene homology evidence is collected from 220,000 proteins of 2 ants, 3 bees, Drosophila, pea aphid, Tribolium, Daphnia, and human. Gene expression evidence includes 164,000 ESTs, 188 million RNA-Seq reads, and genome tiling-array expression data. Intron splice junctions, transcript assemblies and alternative splice forms are derived from these. This combined evidence supports 24,525 good genes with 7836 alternate transcripts.

This new N. vitripennis annotation uncovers twice as many duplicated genes as in Tribolium and Drosophila, yet fewer than in pea aphid, Daphnia or human. Nasonia shows a small increase in single-copy genes, similar to ants, but 1.4 times more than in fruit fly. The average gene coding size is 265 amino acids, the average transcript size is 1.4 Kb, and 97% of gene models now have UTRs, versus 37% in OGS1.

RNA evidence supports 7836 alternate transcripts from 4248 genes. One gene (lola) stands out with 86 annotated alternate transcripts.
The next largest set is 17 alternates, for the fruitless gene, which is related to lola. There are 3395 genes that are both expressed and transposon associated, while 1777 others are expressed but come from noncoding or aberrant gene models.

This OGS2 gene set is built on the first Nasonia genome assembly rather than the current NCBI Nvit_2.0 assembly, which has only minor assembly improvements. OGS2 also provides information that can be used to improve the genome assembly further. There are 550 genes curated from transcript assemblies that improve on genome sequence gaps and resolve putative frame-shifts, and 833 genes are an expert's choice, including genes split over scaffolds, odorant genes, and others.

Comparison of OGS2 with the Nasonia gene sets OGS1.2 (2009) and NCBI RefSeq2 (Sept 2011) shows substantial overlap among their gene models. Of the 12,989 NCBI RefSeq2 genes, 10,362 are the same loci as in OGS2 and 1655 mostly overlap. Whereas 88 NCBI RefSeq2 loci are missing from OGS2, 12,588 OGS2 loci are not found in NCBI RefSeq2. Of the 18,941 OGS1.2 loci, 10,583 are the same loci as in OGS2, 4226 mostly overlap, and 412 are missing from OGS2. There are 7495 OGS2 loci that are not found in OGS1.2.

This table summarizes gene evidence recovered in these gene sets.

Gene evidence summary
                 OGS2   RefSeq2   OGS1
Introns           97%     90%      85%
EST coverage      72%     67%      51%
RNA assembly      63%     36%      29%
Homolog found    100%     89%      89%
Homolog score     679     635      --
===================================
  Introns       : match to EST/RNA spliced introns
  EST coverage  : overlap with EST exons
  RNA assembly  : >=66% equivalence with 28016 RNA/EST assemblies
  Homolog found : n found for 13772 protein loci common to these gene sets
  Homolog score : blastp bitscore average for the homologs found among the 13772

Further details on the Nasonia vitripennis OGS2 will be presented in a forthcoming publication.
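As a rough cross-check of the locus comparison above, the quoted counts can be turned into percentages of each older gene set. The Python sketch below is illustrative only: the names `COMPARISONS` and `summarize` are invented for this example, and the "other" category is simply the remainder left by the three quoted categories, which this announcement does not classify.

```python
# Locus counts quoted in the comparison of OGS2 with the older gene sets.
COMPARISONS = {
    "NCBI RefSeq2": {"total": 12989, "same": 10362, "mostly_overlap": 1655, "missing": 88},
    "OGS1.2": {"total": 18941, "same": 10583, "mostly_overlap": 4226, "missing": 412},
}


def summarize(counts):
    """Return each category as a percentage of the gene set's total loci.

    'other' is whatever the three quoted categories leave unaccounted for;
    how those loci are classified is not stated (assumption: a remainder).
    """
    total = counts["total"]
    other = total - counts["same"] - counts["mostly_overlap"] - counts["missing"]
    parts = {name: n for name, n in counts.items() if name != "total"}
    parts["other"] = other
    return {name: round(100.0 * n / total, 1) for name, n in parts.items()}


if __name__ == "__main__":
    for gene_set, counts in COMPARISONS.items():
        print(gene_set, summarize(counts))
```

With these counts, roughly 80% of NCBI RefSeq2 loci and roughly 56% of OGS1.2 loci are the same loci as in OGS2.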
Access to OGS2 is available at the link above, with further information on arthropod EvidentialGene sets (arthropods.eugenes.org/EvidentialGene/). Please contact us by email with your questions.

Best wishes,
Don Gilbert
John Colbourne
John (Jack) Werren