
Index of /EvidentialGene/arthropods/daphnia/daphnia_magna/Genes/modelled_on_genome

      Name                                      Last modified       Size

[DIR] Parent Directory                          22-Sep-2016 14:57   -
[   ] daphmagna_201104m8.aa.gz                  10-Apr-2011 13:54   6.9M
[   ] daphmagna_201104m8.alttransnew.aa.gz      12-Apr-2011 15:49   1.3M
[TXT] daphmagna_201104m8.annot.readme.txt       30-Oct-2011 14:03   3k
[   ] daphmagna_201104m8.annotable.txt.gz       10-Apr-2011 15:36   1.4M
[   ] daphmagna_201104m8.cds.gz                 10-Apr-2011 14:00   8.7M
[   ] daphmagna_201104m8.gff.gz                 10-Apr-2011 15:46   17.5M
[   ] daphmagna_201104m8.pasaupdate.gff.gz      12-Apr-2011 15:21   12.4M
[   ] daphmagna_201104m8.pasaupdate.gtf.gz      12-Apr-2011 15:33   4.2M
[TXT] daphmagna_201104m8.summary.txt            12-Apr-2011 23:10   5k
[   ] daphmagna_201104m8.tr.gz                  10-Apr-2011 13:54   18.1M



Gene files:
  1505610 Apr 10 15:36 daphmagna_201104m8.annotable.txt.gz  gene annotation table
 18297922 Apr 10 15:46 daphmagna_201104m8.gff.gz     gene feature locations
  7183738 Apr 10 13:54 daphmagna_201104m8.aa.gz      fasta protein
  9070872 Apr 10 14:00 daphmagna_201104m8.cds.gz     fasta coding na
 18979950 Apr 10 13:54 daphmagna_201104m8.tr.gz      fasta transcript
 12978345 Apr 12 15:21 daphmagna_201104m8.pasaupdate.gff.gz   gene feature updates
 12978345 Apr 12 15:21 daphmagna_201104m8.pasaupdate.gtf.gz   above file in GTF format
  1397110 Apr 12 15:49 daphmagna_201104m8.alttransnew.aa.gz   fasta alt-proteins not same as primary
    daphmagna_201104m8.pasaupdate includes PASA EST/RNA-seq validated gene model corrections
         and added alternate transcripts, plus 201 additional good genes missed before.
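
The fasta files listed above are gzip-compressed, so they can be read without unpacking them first. A minimal sketch, assuming standard FASTA layout (header lines beginning with '>'); the function name count_fasta_records is hypothetical and not part of EvidentialGene:

```python
import gzip


def count_fasta_records(path):
    """Return (record count, total sequence length) for a gzipped FASTA file.

    Assumes standard FASTA: each record starts with a '>' header line,
    followed by one or more sequence lines.
    """
    n_records = 0
    total_length = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                n_records += 1
            else:
                # Sequence line: count residues, ignoring the line ending.
                total_length += len(line.strip())
    return n_records, total_length
```

For example, count_fasta_records("daphmagna_201104m8.aa.gz") would report the number of protein records and total residues, assuming that file has been downloaded to the working directory.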

Gene Evidence Summary for daphnia_magna, 2011mar

Evid.   Nevd    Statistic        2011m8  2010ep24   2011m8update
------  ------  -------------    ------  ------     ------
EST     26Mb    BaseOverlap      0.822   0.722      0.848
Pro     27Mb    BaseOverlap      0.819   0.727      0.823
RNA     32Mb    BaseOverlap      0.684   0.551      0.714
Intron  88981   SplicesHit       0.937   0.861      0.937

T'poson 844Kb   BaseOverlap      0.634   0.486      0.635
Specif  54Mb    BaseOverlap      0.398   0.418      0.407

ESTgene 12009   Perfect          6495    1516       8038
ESTgene 12009   Sensitv.         0.841   0.652      0.890
ESTgene 12009   Specifc.         0.720   0.626      0.751

Progene 19252   Perfect          8859    6722       9235
Progene 19252   Sensitv.         0.745   0.690      0.749
Progene 19252   Specifc.         0.719   0.700      0.723

Homolog --      homolog.Nfound   21028   17856      --    
Homolog --      homolog.bits/aa  0.995   0.989      --
Paralog --      paralog.Nfound   19228   14475      --
Paralog --      paralog.bits/aa  0.926   0.786      --

Genome  --      Coding Mb        31mb    27mb       31mb 
Genome  --      Exon Mbase       53mb    41mb       55mb  
Genome  --      Gene count       34614   23478     34815 
Genome  --       alternate tr    --      --        12641 on 5883 genes 
------------------------------------------------------------
  Predictions: 2011m8=bestgenes_of8.an8b.gff, April 2011
  ep24 = dmag_ep24augmap2an2.gff, prerelease3 on 2 May 2010  
  2011m8pp= daphmagna_201104m8.pasaplus.gff, April 2011
  ..............................................

Gene Models Summary for daphnia_magna, 2011mar
------------------------------------------------------------
 Count of genes from 2011m8 = genes/bestgenes_of7.an8b.gff
 
 34614 Genes (version: 2011m8, mix8b)
       Evidence support:
 33897 have evidence (homology, EST or RNAseq)
     26870 have Protein homology
         12256 have orthologs (>=33%)
         10185 have in-paralogs (>=33%)
     28619 have Expression (EST or RNAseq)
         15297 have EST (>=33%)
         15928 have RNAseq (>=33%)
 22753 have >= 95% evidence coverage
 28126 have >= 66% evidence coverage
 31608 have >= 33% evidence coverage

       Quality of models:
 18176 are full protein genes (complete and protein coding)
  6388 are partial protein genes (for missing start, internal stops)
 10050 are noncoding genes (for pCDS <= 0.33 or lenCDS < 120)
   374 have transposon match >=33% 

... 2011m8update ...
 Count of genes from genes/daphmagna_201104m8.pasaplus.gff.gz
 34815 Genes (version: daphmagna_201104m8update; 201 genes added)
     12641 Alternate Transcripts on 5883 genes 
------------------------------------------------------------
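
The quality classes and counts in the Gene Models Summary above can be cross-checked with a short sketch. The noncoding thresholds (pCDS <= 0.33 or lenCDS < 120) are taken from the summary; treating pCDS as the coding fraction of the transcript, the is_complete flag, the order in which the rules are applied, and the function name classify_model are all assumptions made for illustration, not the EvidentialGene implementation:

```python
def classify_model(p_cds, len_cds, is_complete):
    """Assign a quality class using the thresholds quoted in the summary.

    p_cds: assumed to be the coding fraction of the transcript (0..1).
    len_cds: CDS length, in the same units the summary uses for lenCDS.
    is_complete: hypothetical flag, True when the protein is complete.
    """
    if p_cds <= 0.33 or len_cds < 120:
        return "noncoding"
    return "full" if is_complete else "partial"


# Counts copied from the summary text.
QUALITY_COUNTS = {"full": 18176, "partial": 6388, "noncoding": 10050}
GENES_2011M8 = 34614
GENES_ADDED = 201
GENES_UPDATE = 34815
ALT_TRANSCRIPTS = 12641
GENES_WITH_ALTS = 5883

# The three quality classes should account for every 2011m8 gene.
classes_total = sum(QUALITY_COUNTS.values())
# The update adds 201 genes to the 2011m8 set.
update_total = GENES_2011M8 + GENES_ADDED
# Mean number of alternate transcripts per gene that has any.
alts_per_gene = ALT_TRANSCRIPTS / GENES_WITH_ALTS

print("classes sum to gene count:", classes_total == GENES_2011M8)
print("update count consistent:", update_total == GENES_UPDATE)
print("alternate transcripts per gene:", round(alts_per_gene, 2))
```

Both consistency checks hold for the numbers in this summary: the three quality classes add up to the 34614 genes of 2011m8, and 34614 plus 201 gives the 34815 genes of the update.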

D.magna potential missed genes from un-mapped rna-seq reads:
  transcript first assemblies: n=38653, size= 5 megabases
  transcript all   assemblies: n=42549, size= 6 megabases 
 
 Summary comparison of 3 arthropod gene and transcript evidence
          Daphnia magna,  Daphnia pulex, Pea aphid

daphnia_magna, 2011mar      daphnia_pulex, 2010     pea_aphid, 2011mar
Evid.   Nevd     Dmagm8    Evid.   Nevd   Dplx2     Evid.   Nevd   Aphid2
------  ------   ------    ------  ------ ------    ------  ------ ------
EST     26mb     0.822     EST     12mb    0.884    EST     36mb   0.813   
Pro     27mb     0.819     Pro     21mb    0.831    Pro     27mb   0.786   
RNA     32mb     0.684     RNA     42mb    0.677    RNA     55mb   0.428  
Intron   89k     0.937     Intron   68k    0.963    Intron  127k   0.690
                           TAR     36Mb    0.791          
                              
Coding span      31 mb     Coding span    48 mb     Coding span   42 mb
Exon span        53 mb     Exon span      71 mb     Exon span     74 mb
Genome size     131 mb     Genome size   227 mb     Genome size   541 mb
Gene count       34614     Gene count     47712     Gene count    32967 
--------------------------------------------------------------------------
Nevd for EST, Pro, RNA is the total span of non-overlapping reads or alignments;
for Intron it is the count of good, unique intron sites from spliced reads.
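
The span and count rows of the three-species table can be put on a common scale by dividing by genome size. A minimal sketch using only the values in the table above; the dictionary layout and the function name density are illustrative assumptions:

```python
# Values copied from the comparison table (spans and genome size in megabases).
GENOMES = {
    "daphnia_magna": {"coding_mb": 31, "exon_mb": 53, "genome_mb": 131, "genes": 34614},
    "daphnia_pulex": {"coding_mb": 48, "exon_mb": 71, "genome_mb": 227, "genes": 47712},
    "pea_aphid": {"coding_mb": 42, "exon_mb": 74, "genome_mb": 541, "genes": 32967},
}


def density(stats):
    """Return coding fraction, exon fraction and genes per megabase for one genome."""
    return {
        "coding_fraction": stats["coding_mb"] / stats["genome_mb"],
        "exon_fraction": stats["exon_mb"] / stats["genome_mb"],
        "genes_per_mb": stats["genes"] / stats["genome_mb"],
    }


for name, stats in GENOMES.items():
    d = density(stats)
    print(
        f"{name:14s} coding={d['coding_fraction']:.3f} "
        f"exon={d['exon_fraction']:.3f} genes/Mb={d['genes_per_mb']:.1f}"
    )
```

On these figures the two Daphnia genomes are far more gene-dense than the pea aphid genome: roughly a fifth to a quarter of each Daphnia assembly is coding sequence, against under a tenth for the aphid.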

Best protein homology for Daphnia magna is slightly lower than for Daphnia pulex:
                      N_match Align Bitscore
d.pulex x tribolium:  10739   392   332   d.pulex genes2010
d.magna x tribolium:  10713   376   319   d.magna 2011m8update
d.magna x tribolium:  10526   369   309   d.magna 2010 alpha 

d.pulex x human:      14356   378   301   d.pulex genes2010
d.magna x human:      14335   368   290   d.magna 2011m8update
#............................................................

produced using EvidentialGene annotation software from
http://arthropods.eugenes.org/EvidentialGene/
by Don Gilbert


Developed at the Genome Informatics Lab of Indiana University Biology Department