Pacific white shrimp transcriptome assembled with EvidentialGene methods
arthropods.eugenes.org/EvidentialGene/whiteshrimp/
2012 Oct, Don Gilbert, gilbertd near indiana edu

RNA-Seq Data:
  NCBI BioProject:PRJNA73443
  Litopenaeus vannamei transcriptome sequencing (SRP008317)
  see sralist.txt

Assembly:
  diginorm.sh  : redundancy filtering of reads by kmer digital normalization, using khmer package, 2012 Sept
  velrun*.sh   : Velvet/Oases, multi-kmer assembly, versions current as of 2012 July
  processtr.sh : protein finding, best protein/transcript selection using evigene/cdna_bestorf, cd-hit, scripts

Summary whiteshrimp1best:
  Total ntr= 111090 (many are poor, to be filtered, below)

  whiteshrimp longest 1k aa:
                    aa1k  trlen  aafull  utrok  aarange
  whishrimp1vel12   1723   6532      78     92  1081,1421,8754   # litova_vel1vel3.cd90
  whishrimp1tri     1281   4356      51     97  823,1090,5854    # litova1tri1.all_cd
  whishrimp0tsa      722   2351      29     99  516,657,2502     # litova0tsa.all_cd
  #...................

FIXME: separate these 150k aa,tr into good, poor subsets
  good:
    1. included in orthomcl groups; bugs/omclbugs10/omcl10a1_all/bug10a_omclgn.tab, 50k
    2. else, >=60aa? and complete, or >=120aa and partial == 70k ?
  poor:
    remainder, mostly partial, under 60 aa
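
Sketch of the good/poor split from the FIXME note above. This is only an illustration, not the evigene code: the Protein record, its field names, and the function names are hypothetical, and it assumes the orthomcl group membership has already been read into a set of IDs. The 60 aa and 120 aa cutoffs are the tentative ones from the note (the 60 aa value is marked uncertain there).

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

# Tentative cutoffs from the FIXME note; the 60 aa value is marked "?" there.
MIN_AA_COMPLETE = 60    # complete proteins: keep if >= 60 aa
MIN_AA_PARTIAL = 120    # partial proteins: keep if >= 120 aa


@dataclass
class Protein:
    """Hypothetical record for one best-ORF protein; field names are assumptions."""
    prot_id: str
    aa_len: int
    complete: bool  # True if the ORF has both start and stop codon


def classify(p: Protein, omcl_ids: Set[str]) -> str:
    """Return 'good' or 'poor' following the two rules in the FIXME note."""
    # Rule 1: any protein placed in an orthomcl group is good.
    if p.prot_id in omcl_ids:
        return "good"
    # Rule 2: otherwise apply a length cutoff that depends on completeness.
    min_len = MIN_AA_COMPLETE if p.complete else MIN_AA_PARTIAL
    return "good" if p.aa_len >= min_len else "poor"


def split_good_poor(
    proteins: Iterable[Protein], omcl_ids: Set[str]
) -> Tuple[List[Protein], List[Protein]]:
    """Partition proteins into (good, poor) lists, preserving input order."""
    good: List[Protein] = []
    poor: List[Protein] = []
    for p in proteins:
        (good if classify(p, omcl_ids) == "good" else poor).append(p)
    return good, poor


if __name__ == "__main__":
    # Made-up IDs and lengths, for illustration only.
    demo = [
        Protein("tr_a", 45, False),   # short partial, but in an orthomcl group
        Protein("tr_b", 75, True),    # complete, over 60 aa
        Protein("tr_c", 90, False),   # partial, under 120 aa
    ]
    good, poor = split_good_poor(demo, omcl_ids={"tr_a"})
    print("good:", [p.prot_id for p in good])
    print("poor:", [p.prot_id for p in poor])
```

Checking orthomcl membership first means a short or partial protein with ortholog support is kept regardless of length, which matches the ordering of rules 1 and 2 in the note.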