# daphnia_pulex_evigene17_homology.txt

EvidentialGene 2017 set for Daphnia_pulex
---------------------------------------------------
Protein Orthology summary for 3 Daphnia pulex gene sets.

Protein alignments are done with BLASTp against reference gene sets of
3 Daphnia species (D. magna, D. galeata, D. similoides) and 3 insect
species (Dros. mel., Tribolium c., Bemisia tab.). The best-aligning insect
set is Bemisia (whitefly). Among the Daphnia references, D. magna has the
most complete gene set and the best overall alignments to D. pulex, while
D. galeata is closer, with higher alignment identity.

This orthology assessment per gene locus is tabulated in XXXXX.txt, for
these gene sets. The genes found in Evigene sets but missing from MLMaker17
are of all categories: long/complex, short/simple, strong and weak homology,
tandem paralogs and unique genes.

Daphnia magna REFERENCE (nr=29127)
            Evigene17  MLMaker17  Evigene10b
  found       72.6%      59.6%      71.4%
  align       91.3%      70.8%      87.3%
  tiny         0.2%      23.8%       1.5%
  best        75.4%       3.1%       --       equal 21.3%

Drosophila mel. REFERENCE (nr=10902)
            Evigene17  MLMaker17  Evigene10b
  found       71.4%      68.3%      70.6%
  align       81.9%      74.6%      78.5%
  tiny         0.8%       6.5%       2.2%
  best        65.4%       5.2%       --       equal 29.2%

Highly conserved Drosophila REFERENCE (BUSCO subset, nr=3038)
            Evigene17  MLMaker17  Evigene10b
  found       98.1%      94.6%      97.4%
  align       84.0%      76.7%      80.5%
  tiny         0.4%       5.6%       1.9%
  best        63.7%       4.7%       --       equal 31.5%

---------------------------------------------------
Dapplx Evigene17 of 2017 from
  http://arthropods.eugenes.org/EvidentialGene/daphnia/daphnia_pulex/daphnia_pulex_genes2017/
Dapplx Evigene10b of 2010 from wfleabase.org and
  http://arthropods.eugenes.org/EvidentialGene/daphnia/daphnia_pulex/daphnia_pulex_genes2010/
Dapplx MLMaker17 genes of 2017 from report of doi:10.1534/g3.116.038638
  "A New Reference Genome Assembly for the Microcrustacean Daphnia pulex"

Method:
  BLASTP -query reference.aa -db dapplx_twogenesets.aa -evalue 1e-9

Statistics (tabulated above as percentages):
  found = count of significant alignments to reference
  align = average % alignment to reference genes (align-aa/ref-aa)
  tiny  = count of alignments with target length < 50% of reference length
  best  = count of longest alignments per reference gene, for each of two targets

===============================
Daphnia_pulex Evigene17 gene locus homology summary

Homology Statistic
  Ref        Signif.    Signif.
  Species    AA-Align   CDS-Ka/Ks
  ---------------------------
  any         25310      19523
  dapmag      22802      16629
  dapgal       7603      17373
  dapsim       9223      15290
  drosmel     12551      na
  bemtab       8525      na
  tribcas      8254      na
  ---------------------------

Daphnia_pulex evg17 gene loci with homology=25310, of approx. 30,000 loci.

Counts here are of the best-aligned reference, not of all species alignments
(i.e. dapmag counts many more loci than dapgal or dapsim due to its longer
protein alignments, though the closer dapgal has more CDS-Ka/Ks alignments).
Will revise to count all alignments per species.

Reference species are 3 Daphnia species (D. magna, D. galeata, D. similoides)
and 3 insect species (Drosophila mel., Tribolium cas., Bemisia tab.).
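
The found / align / tiny / best statistics defined under Method above can be
sketched in a few lines of Python. This is only an illustration, not the
script used to produce the tables. The record layout (ref_id, target_set,
align_aa, ref_aa, target_aa), the function name summarize, and the demo values
are all hypothetical. The choice of denominators is also an assumption:
found is taken as a share of all reference genes, tiny as a share of found
genes, and best/equal as a share of reference genes hit in either target set.

```python
from collections import defaultdict


def summarize(hits, n_ref, targets):
    """Summarize best BLASTP hits per reference gene for two target gene sets.

    hits    : iterable of (ref_id, target_set, align_aa, ref_aa, target_aa),
              one best hit per reference gene and target set (hypothetical layout)
    n_ref   : number of reference proteins (the "nr=" value in the tables)
    targets : names of the two target gene sets being compared
    """
    # ref_id -> {target_set: (align_aa, ref_aa, target_aa)}
    per_ref = defaultdict(dict)
    for ref_id, tset, align_aa, ref_aa, target_aa in hits:
        if tset in targets:
            per_ref[ref_id][tset] = (align_aa, ref_aa, target_aa)

    stats = {}
    for t in targets:
        rows = [h[t] for h in per_ref.values() if t in h]
        n = len(rows)
        n_tiny = sum(1 for _, ref_aa, tgt_aa in rows if tgt_aa < 0.5 * ref_aa)
        stats[t] = {
            # assumed denominator: all reference genes
            "found": 100.0 * n / n_ref,
            # average of align-aa/ref-aa over found genes
            "align": sum(100.0 * a / r for a, r, _ in rows) / n if n else 0.0,
            # assumed denominator: found genes
            "tiny": 100.0 * n_tiny / n if n else 0.0,
        }

    # best/equal: which target has the longer alignment for each reference gene
    a, b = targets
    wins = {a: 0, b: 0, "equal": 0}
    for h in per_ref.values():
        len_a = h[a][0] if a in h else 0
        len_b = h[b][0] if b in h else 0
        if len_a > len_b:
            wins[a] += 1
        elif len_b > len_a:
            wins[b] += 1
        else:
            wins["equal"] += 1
    total = len(per_ref)
    for t in targets:
        stats[t]["best"] = 100.0 * wins[t] / total if total else 0.0
    equal = 100.0 * wins["equal"] / total if total else 0.0
    return stats, equal


# Made-up demo records, not real Daphnia data.
demo_hits = [
    ("r1", "A", 90, 100, 95), ("r1", "B", 40, 100, 45),
    ("r2", "A", 80, 100, 100), ("r2", "B", 80, 100, 100),
    ("r3", "B", 50, 100, 60),
]
demo_stats, demo_equal = summarize(demo_hits, n_ref=4, targets=("A", "B"))
for name, row in demo_stats.items():
    print(name, {k: round(v, 1) for k, v in row.items()})
print("equal", round(demo_equal, 1))
```

In practice the input records would be derived from the BLASTP output, after
keeping one best hit per reference gene and target gene set. That parsing
step is left out here because the output format used is not stated above.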