A public Daphnia magna gene set and database are at
http://arthropods.eugenes.org/EvidentialGene/daphnia/daphnia_magna_new/

This gene set is summarized here:
http://arthropods.eugenes.org/EvidentialGene/daphnia/daphnia_magna_new/Genes/evg7finloc9_sum.txt

  29121 gene loci, all supported by mRNA-seq and/or protein homology evidence
  18962 (65%) are properly mapped to the D. magna 2010 partial genome assembly
  22063 (76%) have some homology to other species (18% to other Daphnia only)

Almost all of the 7 billion RNA-seq reads (98%) are recovered in transcripts, and almost all gene loci (99%) have uniquely mapped RNA-seq reads. Differentially expressed (DE) genes include proportionally fewer orthologs, and more species-unique/novel and inparalog genes, than non-DE genes.

This gene set includes the first evidence of trans-spliced genes in Daphnia (400+). A few other crustaceans show trans-splicing, but it generally isn't well known among arthropods (e.g. the well-known fruitfly mod(mdg4) trans-spliced gene).

These data searches are available:

  Search Daphnia magna Genes: searches gene annotations (names, IDs, ..) and displays gene annotation reports
  Search Daphnia/Arthropod Gene Families: searches gene families shared with other species (10 spp)
  BLAST: searches the new gene sequences
  GBrowse to daphnia_magna7: displays locations of genes, plus summaries of expression and protein evidence, on the D. magna 2010 genome assembly

-----

You can, for instance, find the 18 gene families that all other 9 species have but for which D. magna lacks a gene (a few of those are findable..), or find 319 families missing in insects but found in Daphnia + human + fish.

A gene search for "Any field: globin" will pull in hemoglobins + related families, including 2 named "cuticle protein" that contain a Globin conserved domain and are located in the D. magna hemoglobin cluster (a naming mistake). The hemoglobin cluster loci here remain a mess; they need human expert curation to resolve.
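The gene family searches mentioned above (families that all other species have but D. magna lacks, or families missing in insects but found in Daphnia + human + fish) amount to presence/absence filters over a family-by-species table. Here is a minimal sketch of that idea in Python. It is not the code behind the web search: the family IDs, species labels, and function names are all hypothetical, and the toy table has 5 species rather than the 10 in the database.

```python
# Toy presence/absence table: family ID -> set of species with a member gene.
# All IDs and species labels are made up for illustration only.
ALL_SPECIES = {"daphnia_magna", "daphnia_pulex", "fruitfly", "human", "zebrafish"}

families = {
    "FAM1": {"daphnia_magna", "daphnia_pulex", "fruitfly", "human", "zebrafish"},
    "FAM2": {"daphnia_pulex", "fruitfly", "human", "zebrafish"},       # no D. magna gene
    "FAM3": {"daphnia_magna", "daphnia_pulex", "human", "zebrafish"},  # no insect gene
    "FAM4": {"fruitfly"},
}


def families_missing_in(table, target, all_species):
    """Families present in every species except `target`."""
    others = all_species - {target}
    return sorted(
        fam for fam, present in table.items()
        if target not in present and others <= present
    )


def families_matching(table, required, excluded):
    """Families present in all `required` species and in none of `excluded`."""
    return sorted(
        fam for fam, present in table.items()
        if required <= present and not (excluded & present)
    )


# Families all other species have, but D. magna lacks.
print(families_missing_in(families, "daphnia_magna", ALL_SPECIES))  # ['FAM2']

# Families missing in insects but found in Daphnia + human + fish.
print(families_matching(
    families,
    required={"daphnia_magna", "human", "zebrafish"},
    excluded={"fruitfly"},
))  # ['FAM3']
```

With the real family table, the same two filters would correspond to the "18 families" and "319 families" queries, given the actual 10-species list in place of the toy one.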
This gene set also has a complete DSCAM locus (ID Dapma7bEVm000001t1), with ~125 alternates, and includes a couple of new exons in the multiplex parts, compared to the Dieter Ebert lab's 2011 paper, doi:10.1371/journal.pone.0027947
http://arthropods.eugenes.org/EvidentialGene/daphnia/daphnia_magna_new/Examples/Dmagna7map_DSCAM.html
http://arthropods.eugenes.org/genepage/daphniamagna/Dapma7bEVm000001t1

-----

Producing this gene set has taken much more effort than expected; a basic reason is the greater complexity of Daphnia genes versus other animal & plant species I've worked on. There are mistakes in this gene set: both false positive and false negative locus calls exist, i.e., some called loci are instead alternate transcripts, and some called alternate transcripts are separate paralog loci. I've tried to minimize those with extended effort, but mistakes remain.

My hope is that this gene set can be published in some short time frame (months), including placement into public databanks like NCBI and EBI. That however will require placing the RNA-seq and/or genome assembly into public databanks also, to tie the gene transcripts to (they won't just take my word for it that these are accurate genes :)

It has been a complicated year for me: lots of problems mixed with interesting successes. The results I get for producing high quality Daphnia magna genes also apply to other animals and plants. I've lots of objective support for this now, but others who could benefit are still mostly not finding these results.

For the arthropods, Daphnia (magna and pulex), Honey bee, 3 Ticks, 2 Beetles, as well as a Fish and two Plants, these EvidentialGene methods produce the most orthology-complete gene sets (i.e., they have the most ortholog families, and the most complete ortholog proteins). They are >90% complete, and are significantly better in objective comparison to other current gene sets for these species. This Daphnia magna gene set may well surpass the model fruitfly and other insects for completeness/accuracy.
A Daphnia scientist has applied my methods of transcript assembly to Daphnia galeata: Mathilde Cordellier and colleagues at Uni. Hamburg, DE. That will be interesting gene data to look at.

I'm finding some interesting agreement between the D. pulex and D. magna gene sets, where they differ from e.g. model insect fruitfly genes (and other insects). Mostly these are facets of genome biology that I lack expertise in, and I would welcome insights from you & other Daphnia researchers. For instance, this list of Human-D. pulex shared genes of interest is mostly corroborated with D. magna genes:
http://server7.wfleabase.org/prerelease4/gene-predictions/daphnia_genes2010_human_unique.txt

The Dicer gene family appears to differ from insects, maybe human-like or other, and I've a list of Centromere genes involved in mitosis, DNA-repair, etc. in which the two Daphnia appear to differ from insects, possibly shedding some light on the mechanism of gene duplications in Daphnia.

-----

Those of you who are willing to pay toward my costs to produce this have early access to the bulk gene data (sequence fasta and GFF map locations).

A limitation on this project is my need for funds to support this work. I have contributed substantial effort without salary to this in genome information engineering and dissemination, amounting to over $60,000 salary in 2013-2014. To enable use of this work and future Daphnia genome informatics, I am asking those with research budgets who wish to use these D. magna genes to contribute significant funds to defray my contribution. Those without non-personnel research funding are welcome to freely use this genome information.

-- Don Gilbert, gilbertd at indiana edu, 2014 November
----------------------------------------------------------------------------