This is the GENOA file server at MIT. It provides access to genome alignments that identify gene loci and their alternative transcript structures in the human genome.
Related publications and online supplementary material:
D. Holste, C.B. Burge, et al. The making of mRNAs: computationally dissecting the alternative splicing of precursors by using the Hollywood database. In preparation. [ World-wide web | Supplementary material ]
D. Holste, G. Huo, V. Tung, and C.B. Burge. Hollywood: a comparative genomics relational database of alternative splicing. In preparation. [ World-wide web | Supplementary material ]
E. van Nostrand, D. Holste, and C.B. Burge. Orthology-based characterization of human intron retention. In preparation. [ World-wide web ]
G.W. Yeo, E. van Nostrand, D. Holste, T. Poggio, and C.B. Burge. Identification and analysis of alternative splicing events conserved in human and mouse. Proc Natl Acad Sci USA 102(8):2850 (2005). [ World-wide web | Supplementary material ]
G. Yeo*, D. Holste*, G. Kreimann, and C.B. Burge. Variation in alternative splicing across human tissues. Genome Biol 5(10):R74 (2004). [ World-wide web | Supplementary material ]
W.G. Fairbrother, D. Holste, C.B. Burge, and P.A. Sharp. Single nucleotide polymorphism-based validation of exonic splicing enhancers. PLoS Biol 2(9):E268 (2004). [ World-wide web | Supplementary material ]
Genome Annotation (GENOA) pipeline
Identification of pre-mRNA alternative splice forms
GENOA aligns spliced cDNA and EST sequences to the human genome and computationally
identifies the constitutive and alternative exons of each locus (see supportive schematic plot).
To this end,
GENOA uses BLASTN to detect significant blocks of identity between
repeat-masked cDNAs (rm-cDNAs)
and genomic DNA, and then aligns cDNAs to the genomic loci identified by BLASTN using the spliced
alignment algorithm MRNAVSGEN.
MRNAVSGEN is similar in
concept to SIM4, but was developed specifically to
align high quality cDNAs rather than ESTs and thus requires higher alignment quality (at least 93% identity) and
consensus terminal dinucleotides at the ends of all introns. ESTs were aligned using
SIM4 to
those genomic regions that had both a significant BLASTN hit to an rm-cDNA and an aligned cDNA. Again, stringent
alignment criteria were imposed: (1) ESTs were required to overlap cDNAs (so all of the genes studied were
supported by at least one cDNA:genomic alignment); (2) the first and last aligned segments of ESTs were required to be at least
30 nucleotides in length, with 90% sequence identity; and (3) the entire EST alignment was required to extend over
at least 90% of the length of the EST with at least 90% sequence identity. [ supportive schematic plot | data | webpage | information ]
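The three EST criteria above can be sketched as a simple filter. This is a minimal illustration, not the GENOA implementation: the `Segment` record, the function names, and the half-open genomic coordinates are all hypothetical, criterion (1) is approximated by overlap of the genomic spans, and the 90% figure for terminal segments is read as a minimum.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One aligned block of an EST (hypothetical record, not a SIM4 format)."""
    genome_start: int  # 0-based, inclusive
    genome_end: int    # 0-based, exclusive
    aligned_len: int   # EST nucleotides aligned in this block
    matches: int       # identical positions in this block

    @property
    def identity(self) -> float:
        return self.matches / self.aligned_len if self.aligned_len else 0.0


def spans_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def passes_est_criteria(
    segments: List[Segment],
    est_length: int,
    cdna_spans: List[Tuple[int, int]],
    min_terminal_len: int = 30,
    min_identity: float = 0.90,
    min_coverage: float = 0.90,
) -> bool:
    """Apply criteria (1)-(3) to one EST alignment (segments in genomic order)."""
    if not segments or est_length <= 0:
        return False

    # (1) The EST must overlap at least one aligned cDNA.
    # Assumption: overlap is tested on whole genomic spans.
    est_span = (segments[0].genome_start, segments[-1].genome_end)
    if not any(spans_overlap(est_span, c) for c in cdna_spans):
        return False

    # (2) First and last aligned segments: length and identity thresholds.
    for seg in (segments[0], segments[-1]):
        if seg.aligned_len < min_terminal_len or seg.identity < min_identity:
            return False

    # (3) Whole alignment: coverage of the EST and overall identity.
    total_aligned = sum(s.aligned_len for s in segments)
    total_matches = sum(s.matches for s in segments)
    coverage = total_aligned / est_length
    identity = total_matches / total_aligned if total_aligned else 0.0
    return coverage >= min_coverage and identity >= min_identity
```

Keeping the thresholds as keyword arguments makes it easy to see how many ESTs each criterion removes when it is relaxed or tightened.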
An internal exon was identified as a skipped exon (SE) if it was included in at least one transcript and skipped in at least one other,
and if the boundaries of both the
5' and 3' flanking exons were shared by the transcripts that included and skipped that exon (see supportive schematic plot).
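The skipped-exon rule can be sketched as follows. This is an illustrative reading of the definition, not the GENOA code: transcripts are assumed to be lists of plus-strand exons as half-open `(start, end)` pairs sorted by position, "shared boundaries" is interpreted as another transcript containing an intron that joins the end of the upstream exon directly to the start of the downstream exon, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]    # (start, end), half-open, plus strand (assumption)
Transcript = List[Exon]   # exons sorted by genomic position


def introns(transcript: Transcript) -> Set[Tuple[int, int]]:
    """Return the introns of a transcript as (donor, acceptor) coordinate pairs."""
    return {
        (transcript[i][1], transcript[i + 1][0])
        for i in range(len(transcript) - 1)
    }


def find_skipped_exons(transcripts: List[Transcript]) -> Set[Exon]:
    """Internal exons that some other transcript skips with shared flanking boundaries."""
    all_introns: Set[Tuple[int, int]] = set()
    for t in transcripts:
        all_introns |= introns(t)

    skipped: Set[Exon] = set()
    for t in transcripts:
        # Only internal exons (not first or last) are candidates.
        for i in range(1, len(t) - 1):
            upstream, exon, downstream = t[i - 1], t[i], t[i + 1]
            # A skipping transcript splices the upstream exon's end
            # directly to the downstream exon's start.
            if (upstream[1], downstream[0]) in all_introns:
                skipped.add(exon)
    return skipped
```

For example, with transcripts `[(0, 100), (200, 300), (400, 500)]` and `[(0, 100), (400, 500)]`, the exon `(200, 300)` is reported because the second transcript joins position 100 directly to position 400.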
Similarly, an internal exon was identified as alternative 3' splice site (ss)
exon (A3E) or alternative 5'ss exon (A5E), if that exon was altered in another transcript
at the corresponding 3'ss (5'ss). The exon core sequence of an A3E (A5E) is defined as the shortest exonic sequence
common to transcripts used to
infer the A3E (A5E) event, and the exon extension sequence of an A3E (A5E) is the exonic sequence
added to the core by the alternative 3'ss (5'ss). [ hollywood.mit.edu ] [ supportive schematic plot ]
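The core and extension definitions can be made concrete with a short sketch. It assumes plus-strand, half-open `(start, end)` coordinates, so the 3'ss of an exon is its start and the 5'ss is its end; the function names and the input format (all observed forms of one exon) are hypothetical and not taken from GENOA or Hollywood.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), half-open, plus strand (assumption)


def split_a3e(variants: List[Exon]) -> Tuple[Exon, List[Exon]]:
    """Core and extensions of an A3E: forms share the 5'ss (end), differ in 3'ss (start)."""
    starts = {s for s, _ in variants}
    ends = {e for _, e in variants}
    if len(ends) != 1 or len(starts) < 2:
        raise ValueError("A3E forms must share one end and differ in start")
    end = ends.pop()
    core_start = max(starts)  # the shortest form is common to all forms
    core = (core_start, end)
    extensions = sorted((s, core_start) for s in starts if s < core_start)
    return core, extensions


def split_a5e(variants: List[Exon]) -> Tuple[Exon, List[Exon]]:
    """Core and extensions of an A5E: forms share the 3'ss (start), differ in 5'ss (end)."""
    starts = {s for s, _ in variants}
    ends = {e for _, e in variants}
    if len(starts) != 1 or len(ends) < 2:
        raise ValueError("A5E forms must share one start and differ in end")
    start = starts.pop()
    core_end = min(ends)  # the shortest form is common to all forms
    core = (start, core_end)
    extensions = sorted((core_end, e) for e in ends if e > core_end)
    return core, extensions
```

On the minus strand the roles of start and end swap, so a real implementation would need to take strand into account before calling either function.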
Please address comments, questions, or suggestions regarding this webpage to Dirk Holste or Chris Burge.
Copyright © Chris Burge Lab, MIT, 2005