Table G1. Sus_scrofa (pig) gene set numbers, version Susscr4EVm
---------------------------------------------------
 39879 gene loci, all supported by RNA-seq, most also have protein homology evidence
 39879 (100%) are protein coding, 0 are non-coding
 NA_n_teloci are protein coding, expressed, loci with transposon domains
 All genes (100%) are assembled from RNA evidence, 0 are genome-modelled
 25383/39879 (64%) have protein homology to other species genes.

 316491 alternate transcripts are at 25512 (64%) loci, with 5 median, 12.4 ave,
   transcripts per locus, with 756 alts maximum, 1079 loci have 50+ alts, 8453 have 10+ alts,
 27473 (69%) have complete proteins, 12406 have partial proteins, of 39879 coding genes

 37918 (95%) are properly mapped to chromosome assembly (>=80% align),
   1144 partial-mapped coverage (10% < align < 80%),
   817 are ~un-mapped genes (align < 10%),
 6746/37918 (18%) are single-exon loci of those mapping >= 50% to genome,
   3274 of these have homology to other species genes.

 92627 are culled loci, not in public gene set, but with some unique sequences.
   99 culls are multi-exon, well aligned; 87515 are single exon, well aligned,
   1082 are partially mapped, and 3931 are poorly aligned to chromosomes.
   13658 culls have protein homology, 78969 lack it.
 175793 are culled alternate transcripts, at both public and culled loci,
   redundant in splicing patterns to public alternates, or lacking in alignment
   or evidence, though differing somewhat by sequence alignment.

 Gene locus IDs: Susscr4EVm000001t1 .. Susscr4EVm137575t1,
 Alternate transcripts have ID suffix t2 .. t100.
 EVm000001 is the longest protein, larger ID numbers are mostly shorter, but not for all.
--------------------------------------------------------

Gene set classification notes:
Culled transcripts are classified as unique by transcriptome alignments, but re-classified as redundant, or lacking sufficient evidence, by chromosome alignments.
These are separated from the public gene set as redundant or low quality, but are available as valid evidence transcripts, for instance by reclassifying with an updated chromosome assembly.
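
The tallies in Table G1 can be cross-checked with a little arithmetic. The sketch below is only an illustration and not part of the pipeline that produced the table: the counts are copied from the table, while the variable names and the pct() helper are hypothetical and chosen here for readability.

```python
# Counts copied from Table G1; all names below are illustrative assumptions.
loci = 39879
complete, partial = 27473, 12406
mapped, partial_mapped, unmapped = 37918, 1144, 817
culled = 92627
cull_classes = {
    "multi_exon_well_aligned": 99,
    "single_exon_well_aligned": 87515,
    "partially_mapped": 1082,
    "poorly_aligned": 3931,
}
cull_homology, cull_no_homology = 13658, 78969


def pct(part, whole):
    """Percentage rounded to the nearest integer, as the table reports it."""
    return round(100 * part / whole)


# Each entry pairs a description with whether the table's numbers agree.
checks = {
    "complete + partial proteins == loci": complete + partial == loci,
    "mapping classes sum to loci": mapped + partial_mapped + unmapped == loci,
    "cull classes sum to culled loci": sum(cull_classes.values()) == culled,
    "cull homology split sums to culled loci": cull_homology + cull_no_homology == culled,
    "protein homology is 64%": pct(25383, loci) == 64,
    "loci with alternates is 64%": pct(25512, loci) == 64,
    "complete proteins is 69%": pct(complete, loci) == 69,
    "properly mapped is 95%": pct(mapped, loci) == 95,
    "single-exon share is 18%": pct(6746, mapped) == 18,
    "average alts per locus is 12.4": round(316491 / 25512, 1) == 12.4,
}

for label, ok in checks.items():
    print(("ok   " if ok else "FAIL ") + label)
```

A check of this kind is also a quick way to compare tallies after culled transcripts are reclassified against an updated chromosome assembly, since the class counts should still sum to their totals.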