
Index of /EvidentialGene/other/genefilters_compared

      Name                       Last modified       Size

[DIR] Parent Directory           30-Oct-2019 21:28      -
[TXT] evg19genefilt7_abs.txt     02-Nov-2019 21:17     1k
[   ] evg19genefilt7_paper.pdf   02-Nov-2019 15:52   342k
[   ] evg19genefiltsupp.zip      31-Oct-2019 13:03    92k


  Gilbert, DG. (2019). Longest protein, longest transcript or most expression,
  for accurate gene reconstruction of transcriptomes?
  bioRxiv 829184; doi: https://doi.org/10.1101/829184

---
Title:  Longest protein, longest transcript or most expression, for accurate gene reconstruction of transcriptomes?
Author: Donald G. Gilbert
Affiliation: Indiana University, Bloomington, IN, USA
Email address: gilbertd@indiana.edu or gilbert.bionet@gmail.com 
Date: 2 Nov 2019, draft 7h

Abstract

Methods of transcript assembly and reduction filters are compared for recovery
of reference gene sets of human, pig and plant, including longest
coding sequence with EvidentialGene, longest transcript with CD-HIT, and most
RNA-seq expression with TransRate. EvidentialGene methods are the most accurate
in recovering reference genes, and maintain accuracy for alternate transcripts
and paralogs. In comparison, filtering large over-assemblies by longest-RNA
measures or by most RNA-seq expression measures discards a large portion of
accurate models, especially alternates and paralogs. Accuracy of protein
calculations is compared, with errors found in popular methods, as is accuracy
of transcript assemblers. Gene reconstruction accuracy depends upon the
underlying measurements, where protein criteria, including homology among
species, have the strength of evolutionary biology that other criteria lack.
EvidentialGene provides a gene reconstruction algorithm that is consistent with
genome biology.
----
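
Illustration (not part of the paper): the three reduction criteria named in
the abstract can pick different representatives from the same set of
redundant transcript models. The sketch below is a toy example under stated
assumptions. The Transcript record, its field names, and all values are
hypothetical, and the selection functions only mimic the criteria in spirit;
they do not reproduce what EvidentialGene, CD-HIT, or TransRate actually
compute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one assembled transcript model."""
    name: str
    rna_len: int       # full transcript length, in bases
    cds_len: int       # coding-sequence length, in bases
    expression: float  # RNA-seq expression in arbitrary units


def longest_cds(cluster):
    """Keep the model with the longest coding sequence (longest protein)."""
    return max(cluster, key=lambda t: t.cds_len)


def longest_rna(cluster):
    """Keep the model with the longest transcript."""
    return max(cluster, key=lambda t: t.rna_len)


def most_expressed(cluster):
    """Keep the model with the highest RNA-seq expression."""
    return max(cluster, key=lambda t: t.expression)


# Made-up values for three redundant models of one locus.
cluster = [
    Transcript("t1", rna_len=4200, cds_len=900, expression=5.0),
    Transcript("t2", rna_len=2500, cds_len=1800, expression=12.0),
    Transcript("t3", rna_len=1200, cds_len=600, expression=85.0),
]

choices = {
    "longest_cds": longest_cds(cluster).name,
    "longest_rna": longest_rna(cluster).name,
    "most_expressed": most_expressed(cluster).name,
}
print(choices)
```

With these made-up values each criterion keeps a different model, which is
the kind of disagreement the comparison in the abstract is concerned with.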


Developed at the Genome Informatics Lab of Indiana University Biology Department