Index of /EvidentialGene/daphnia/daphnia_similoides
Name Last modified Size
Parent Directory 12-Jun-2023 21:42 -
evg1dapsim/ 30-May-2017 14:31 -
evg1dapsim
----------
Gene assembly for the water flea Daphnia_similoides, built from RNA-Seq with EvidentialGene methods.
The assembly is "reference-free": neither chromosome sequences nor genes of other species were used to
assemble the Daphnia genes.
RNA-Seq source data are from NCBI SRA, listed in rnasra_daphsim16huau.csv
Four gene assembly runs were done, each multi-kmer, with velvet/oases and idba_tran;
they are summarized in rna_assemblies.aastat.txt
The 3 million transcripts from those runs, input to tr2aacds.pl of evigene, were reduced to
157,459 non-redundant transcripts, comprising 46,000 putative coding gene loci.
The tr2aacds results are the class table evg1dapsim.trclass.gz and the intermediate
classified gene set in okayset/, summarized in evg1dapsim.tr2aacds.info.
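
As a rough illustration of the scale of this reduction, the figures above can be turned into a retained fraction and an average number of transcripts per locus. This is only a sketch in Python: the function and constant names are made up for this note, and the 3 million and 46,000 values are the rounded figures quoted above, not exact counts.

```python
# Rough arithmetic on the tr2aacds reduction, using the figures quoted in this README.
# The input count (3 million) and locus count (46,000) are rounded values.

INPUT_TRANSCRIPTS = 3_000_000   # rounded: "3 million transcripts"
NONREDUNDANT = 157_459          # non-redundant transcripts after tr2aacds
PUTATIVE_LOCI = 46_000          # rounded: putative coding gene loci


def retained_fraction(kept: int, total: int) -> float:
    """Fraction of the input transcripts kept as non-redundant."""
    return kept / total


def transcripts_per_locus(transcripts: int, loci: int) -> float:
    """Average number of retained transcripts per putative locus."""
    return transcripts / loci


if __name__ == "__main__":
    frac = retained_fraction(NONREDUNDANT, INPUT_TRANSCRIPTS)
    per_locus = transcripts_per_locus(NONREDUNDANT, PUTATIVE_LOCI)
    print(f"retained: {frac:.1%} of input transcripts")
    print(f"average transcripts per locus: {per_locus:.1f}")
```

With these rounded inputs, about one input transcript in twenty is retained.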
These are then the inputs to two further steps, a reference protein blast and construction of the public
annotated sequence set, using the evigene scripts run_evgaablast.sh and run_evgmrna2tsa.sh.
evgmrna2tsa uses the reference protein scores and other coding metrics to reclassify
transcripts and remove some redundant or fragment transcripts, resulting in
31,000 putative loci with alternates (main class), plus 95,000 alternates,
and 7,000 loci without alternates (noclass). Removed were 23,000
redundant/fragment/nohomology transcripts.
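
The class counts above can be cross-checked against the 157,459 transcripts that came out of tr2aacds. The tally below is a sketch in Python; the variable names are invented for this note, and all four class counts are the rounded thousands quoted above, so only approximate agreement should be expected.

```python
# Cross-check of the evgmrna2tsa class counts quoted in this README.
# All class counts are rounded to thousands in the text, so the sums are approximate.

TR2AACDS_TRANSCRIPTS = 157_459  # non-redundant transcripts from tr2aacds

counts = {
    "main": 31_000,      # loci with alternates (one main transcript each)
    "alternate": 95_000, # alternate transcripts of main loci
    "noclass": 7_000,    # loci without alternates
    "removed": 23_000,   # redundant/fragment/nohomology transcripts
}

# Transcripts kept in the final set: everything except the removed class.
kept_transcripts = counts["main"] + counts["alternate"] + counts["noclass"]

# Loci in the final set: one per main transcript plus one per noclass transcript.
final_loci = counts["main"] + counts["noclass"]

# All classified transcripts, to compare with the tr2aacds output count.
classified_total = kept_transcripts + counts["removed"]
rounding_gap = TR2AACDS_TRANSCRIPTS - classified_total

if __name__ == "__main__":
    print(f"kept transcripts: {kept_transcripts}")
    print(f"final loci: {final_loci}")
    print(f"classified total: {classified_total}")
    print(f"gap to tr2aacds count (rounding): {rounding_gap}")
```

The rounded classes sum to within about 1% of the tr2aacds count, which is consistent with the counts having been rounded to thousands.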
Run information is summarized in runevg.info, with scripts and logs in the folder evigene_methods/.
The final public sequence set, with names from the reference protein blast, is in publicset/.
Intermediate transcript assemblies are not provided here.