
Index of /genes2/pea_aphid2/docs

      Name                             Last modified       Size

[DIR] Parent Directory                 30-Jul-2014 13:32      -
[TXT] aphid-bestpicker.html            19-Apr-2011 23:59     2k
[IMG] aphid-bestpicker.png             18-Apr-2011 19:53   166k
[TXT] aphid-genes-brief.txt            24-Jan-2011 12:53     3k
[TXT] aphid-genes-details.txt          20-Apr-2011 16:27   227k
[TXT] aphid-genes-summary.txt          05-Oct-2010 13:02     7k
[TXT] aphid2-genes2_201104.sum1.txt    13-Apr-2011 14:25     6k
[TXT] aphid2-genes2_201104v7.news      19-Apr-2011 23:19     4k
[TXT] aphid2-reconcile-best.notes      09-Jun-2011 18:18     7k
[TXT] aphid2_bestgene-compute.txt      06-Jul-2011 14:56     9k
[TXT] aphid2asm.repeatmasker_tbl.txt   04-Oct-2010 11:34     2k
[TXT] aphid2evigene.news               09-Jun-2011 11:26     7k


Draft pea aphid gene predictions for assembly v2.0
by Don Gilbert, gilbertd at indiana edu  
http://arthropods.eugenes.org/genes2/  see pea_aphid2

Annotation summary

Transcripts from Acyrthosiphon pisum (210k ESTs and 56,522k
Illumina single reads) are mapped to the A. pisum genome with
GMAP and GSNAP.  RNA-seq assemblies are constructed from the
aligned short reads with Cufflinks, and then assembled with
EST reads using PASA.

Arthropod proteins from Daphnia pulex, Drosophila melanogaster,
Apis, Pediculus and Tribolium are BLASTX aligned to the
repeat-masked genome, then refined with Exonerate into protein
gene models.

Gene models are predicted with the evidence-directed AUGUSTUS
predictor. AUGUSTUS is trained for pea aphid gene parameters
using full cDNA genes derived from the PASA EST assembly.

Several prediction sets, built from different evidence sets and
parameters, are combined by selecting the highest-evidence model
at each locus.  The resulting consensus gene set has the best
match to EST and protein evidence.
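
The per-locus selection described above can be sketched as follows.
This is only an illustration, not the procedure actually used for
this gene set: the Model record, the single evidence_score field,
the rule that overlapping same-strand spans form one locus, and the
example coordinates and scores are all assumptions made here.

```python
from dataclasses import dataclass

@dataclass
class Model:
    gene_set: str          # which prediction set the model came from
    scaffold: str
    strand: str
    start: int
    end: int
    evidence_score: float  # hypothetical combined evidence score

def pick_best_per_locus(models):
    """Group overlapping same-strand models into loci; keep the top scorer of each."""
    ordered = sorted(models, key=lambda m: (m.scaffold, m.strand, m.start))
    picked = []
    locus, locus_key, locus_end = [], None, -1
    for m in ordered:
        same_locus = bool(locus) and (m.scaffold, m.strand) == locus_key and m.start <= locus_end
        if same_locus:
            locus.append(m)
            locus_end = max(locus_end, m.end)
        else:
            if locus:
                picked.append(max(locus, key=lambda x: x.evidence_score))
            locus, locus_key, locus_end = [m], (m.scaffold, m.strand), m.end
    if locus:
        picked.append(max(locus, key=lambda x: x.evidence_score))
    return picked

# Made-up example models, not taken from the real data.
example = [
    Model("epi4",  "scaf1", "+", 100, 900, 0.66),
    Model("epir3", "scaf1", "+", 150, 950, 0.72),
    Model("a1Gno", "scaf1", "+", 2000, 2600, 0.52),
    Model("epi4",  "scaf1", "-", 120, 800, 0.40),
]
best = pick_best_per_locus(example)
print([m.gene_set for m in best])
```

A real selection would weigh several evidence types (EST, protein,
RNA-seq) rather than one number; a single score keeps the sketch short.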


Gene Models
===============
  32967 mRNA genes with evidence (protein homology or EST)
  21326 have protein homology (e-value <= 1e-5)
   8810 have only protein evidence
  24064 have EST or RNA-seq evidence
  11548 have only EST/RNA-seq evidence
-------------------------------------------------------------
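
As a quick consistency check on the counts above, the overlap between
the two evidence classes can be derived by subtraction.  The variable
names are chosen here for illustration; the numbers are the ones
listed.

```python
# Counts copied from the Gene Models list.
total_with_evidence = 32967
protein = 21326          # any protein homology
protein_only = 8810
transcript = 24064       # any EST or RNA-seq evidence
transcript_only = 11548

# Models supported by both evidence types, derived two ways.
both_via_protein = protein - protein_only
both_via_transcript = transcript - transcript_only

# Models supported by at least one evidence type.
union = protein_only + transcript_only + both_via_protein

print(both_via_protein, both_via_transcript, union, total_with_evidence - union)
```

Both subtractions give 12516 models with protein and transcript
support.  The union comes to 32874, which is 93 fewer than the 32967
total; the summary does not say how those remaining models are
classified.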

Acyrthosiphon pisum asm 2.0 gene quality scores
     
Evid.   Nevd    a1Gno   epi4    epir3   mix3 

                ==== Exon Sensitivity & Specificity ====
EST     34 Mb   0.516   0.663   0.717   0.813
Protein 25 Mb   0.536   0.757   0.785   0.786
RNAseq  17 Mb   0.575   0.839   0.792   0.875
T'poson 40 Mb   0.024   0.078   0.068   0.068
Specif  56 Mb   0.549   0.591   0.603   0.621

                ==== Gene model Accuracy ====
EST-genes n=10371  
 Perfect        1911    1968    1939    2143  
 Sensitv.       0.588   0.717   0.693   0.775 
 Specifc.       0.602   0.414   0.454   0.402 

Protein-genes n=12860
 Perfect        3487    4028    4071    4548 
 Sensitv.       0.414   0.500   0.510   0.512
 Specifc.       0.665   0.484   0.479   0.578

                ==== Genome Coverage ====
 Statistic      a1Gno   epi4    epir3   mix3 
 Coding bases   29.2    43.4    47.3    40   
 Exon bases     34.1    70.4    68.3    70.7 
 Gene count     37782   35961   39905   32967
------------------------------------------------------

Gene sets: 
a1Gno     = aphid assembly 1, Gnomon models, mapped to asm 2
epir1..3  = evidence-directed predictions, aphid training
epi4..5   = EST assembly + protein evidence prediction
mix3      = mix, picking best-evidence models from several gene sets
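
For readers unfamiliar with the exon sensitivity and specificity
rows in the quality table, here is a minimal sketch of one common
base-level definition.  It assumes sensitivity is the fraction of
evidence bases covered by predicted exons, and specificity is the
fraction of predicted exon bases covered by evidence.  The scores in
the table may have been computed differently, and the intervals
below are made up.

```python
def merge(intervals):
    """Merge overlapping half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def total_bases(intervals):
    """Number of distinct bases covered by the intervals."""
    return sum(end - start for start, end in merge(intervals))

def overlap_bases(a, b):
    """Number of bases covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared

def sensitivity_specificity(predicted_exons, evidence):
    shared = overlap_bases(predicted_exons, evidence)
    return shared / total_bases(evidence), shared / total_bases(predicted_exons)

predicted = [(100, 200), (300, 400)]   # hypothetical predicted exons
evidence = [(150, 250), (300, 350)]    # hypothetical EST alignments
sens, spec = sensitivity_specificity(predicted, evidence)
print(round(sens, 3), round(spec, 3))
```

In this toy case 100 of the 150 evidence bases are covered
(sensitivity about 0.667) and 100 of the 200 predicted bases are
supported (specificity 0.5).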

  

Developed at the Genome Informatics Lab of Indiana University Biology Department