
EvidentialGene : Evidence Directed Gene Construction for Eukaryotes

2016 Sep.: Evigene draft gene set for Zea mays (corn plant)
2016 Mar.: Evigene draft gene sets for two Anopheles and Aedes mosquitoes
2015 July: Evigene gene-ome set for Atlantic Killifish
2015 Mar.: Evigene gene-ome set for Daphnia magna
2014 Nov.: EvidentialGene sets are Best of clade in gene set completeness for Arthropods, Plants and Fishes. 2015-Mar update.
2014 : Evigene gene-ome sets for Honey Bee, Deer Tick, and Tribolium Beetle are publicly available.
2014 : Evigene sets for Cacao and Killifish are Best of clade for gene set completeness (size, orthology) for plants and fishes.

      Name                    Last modified

[DIR] Parent Directory        19-Mar-2015 00:56
[DIR] about/                  05-Jul-2016 20:05
[DIR] arthropods/             10-Mar-2016 20:56
[DIR] cacao/                  30-Jul-2014 13:32
[DIR] daphnia/                12-Nov-2014 13:09
[DIR] evigene/                15-Nov-2013 21:47
[DIR] killifish/              14-Jul-2015 14:16
[DIR] nasonia/                01-Sep-2016 17:16
[DIR] other/                  15-Aug-2016 21:15
[DIR] pea_aphid2/             30-Jul-2014 13:32
[DIR] plants/                 16-Jul-2016 14:08
[DIR] vertebrates/            22-Feb-2016 15:30

The evigene/ folder holds this software; about/ describes this work in posters and similar documents. Most of the other folders contain gene-ome sets of animals and plants constructed with EvidentialGene software. Some of these are genome-free; others are a mix of genome-guided and mRNA-assembled models.

See also EvidentialGene @

2016: EvidentialGene Explained at Galaxy & GMOD conference, 2016 June

2014: Honey Bee, Deer Tick and Water Flea gene-omes with Evigene-mRNA

EvidentialGene mRNA Transcript Assembly Software

2013: Gene-omes built from mRNA-seq not genome DNA

Independent comparison of EvidentialGene_trassembly doi: 10.1371/journal.pone.0091776

2012: Perfect Genes Constructed with Gigabases of RNA

EvidentialGene Animal & Plant Genes Quality Comparison

EvidentialGene :
Evidence Directed Gene Construction for Eukaryotes

  Don Gilbert, gilbertd at indiana edu, 2010 .. 2013

Annotation summary

Informant evidence for gene models is transcript and protein
data.  Transcripts from long reads (e.g. dbEST) and short reads
(e.g. Illumina) are mapped to the genome assembly with GMAP (long)
or GSNAP (short).

EST assemblies are constructed using PASA, from all transcript
data.  RNA-seq assemblies are first constructed from aligned short
reads with Cufflinks, and combined with EST reads for a full 

Protein genes from related sequenced genomes are BLASTX-aligned to
the repeat/transposon-soft-masked genome, then refined with Exonerate
into protein gene models.
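
As a small illustration of what "soft-masked" means here: repeat and
transposon regions are conventionally written in lowercase while the
rest of the sequence stays uppercase, so aligners can down-weight them
without losing the bases. The sketch below is not part of EvidentialGene;
the function names masked_intervals and masked_fraction are hypothetical,
and it assumes the lowercase convention only.

```python
def masked_intervals(seq):
    """Return 0-based, half-open (start, end) spans of lowercase (soft-masked) bases."""
    intervals = []
    start = None
    for i, base in enumerate(seq):
        if base.islower():
            if start is None:
                start = i          # a masked run begins here
        elif start is not None:
            intervals.append((start, i))  # the run ended at the previous base
            start = None
    if start is not None:
        intervals.append((start, len(seq)))  # run extends to the sequence end
    return intervals


def masked_fraction(seq):
    """Fraction of the sequence covered by soft-masked spans (0.0 if empty)."""
    if not seq:
        return 0.0
    masked = sum(end - start for start, end in masked_intervals(seq))
    return masked / len(seq)


# Toy sequence: two lowercase (masked) runs inside uppercase sequence.
toy = "ACGTacgtACGTaa"
spans = masked_intervals(toy)
```

Checking the masked fraction per scaffold is one quick way to confirm
that a genome file really was soft-masked before protein alignment.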

Gene models are predicted with the evidence-directed AUGUSTUS
predictor. Augustus is trained for gene parameters
using full cDNA genes derived from the PASA EST assemblies.

Several prediction sets from different evidence sets and
parameters are combined, selecting the highest evidence-scored model
per locus, and avoiding artifactual gene splits and joins.  
This consensus gene set has the best match to EST and protein 
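
The per-locus selection step can be sketched roughly as follows. This
is a minimal illustration, not the EvidentialGene implementation: it
assumes each model already carries a single combined evidence score,
defines a locus simply as a run of overlapping models on the same
sequence and strand, and does not attempt the split/join checks
mentioned above. The GeneModel fields and function names are
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneModel:
    model_id: str
    seqid: str
    strand: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    score: float  # hypothetical combined evidence score


def group_into_loci(models):
    """Cluster models that overlap on the same sequence and strand."""
    ordered = sorted(models, key=lambda m: (m.seqid, m.strand, m.start, m.end))
    loci = []
    current = []
    current_end = None
    for m in ordered:
        same_place = (
            current
            and m.seqid == current[0].seqid
            and m.strand == current[0].strand
            and m.start <= current_end
        )
        if same_place:
            current.append(m)
            current_end = max(current_end, m.end)
        else:
            if current:
                loci.append(current)
            current = [m]
            current_end = m.end
    if current:
        loci.append(current)
    return loci


def best_per_locus(models):
    """Keep the single highest-scoring model from each locus."""
    return [max(locus, key=lambda m: m.score) for locus in group_into_loci(models)]


# Toy input: two prediction sets (augA, augB) with made-up scores.
models = [
    GeneModel("augA_g1", "scaffold1", "+", 100, 900, 12.5),
    GeneModel("augB_g1", "scaffold1", "+", 150, 950, 18.0),
    GeneModel("augA_g2", "scaffold1", "+", 2000, 2600, 7.0),
    GeneModel("augB_g7", "scaffold1", "-", 120, 880, 5.0),
]
consensus = best_per_locus(models)
```

In this toy input the two overlapping plus-strand models compete and
the higher-scoring one is kept, while the minus-strand model forms its
own locus because strands are clustered separately.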

Predicted genes are UTR-extended and/or improved by PASA. Genes
are annotated with UniProt descriptions, and classified by
evidence scores including transposable elements.

Software references:

Update 2015.07: killifish genes and daphnia genes

Developed at the Genome Informatics Lab of Indiana University Biology Department