Basser Seminar Series

Noncoding sequence, noncoding gene and noncoding RNA

Speaker: Professor Runsheng Chen, Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences

Time: Thursday 27 November 2008, 4-5pm (note: different day)

Location: The University of Sydney, School of IT Building, Lecture Theatre (Room 123), Level 1

Abstract

Recent evidence points to considerable transcription occurring in non-protein-coding regions of eukaryote genomes. However, the lack of conservation and of demonstrated function among these transcripts has created controversy over whether they are functional. Applying a novel cloning strategy, we have cloned 100 novel Caenorhabditis elegans full-length ncRNAs. Studies of their genomic environment and transcriptional characteristics have shown that two-thirds of all ncRNAs, including many intronic snoRNAs, are independently transcribed under the control of ncRNA-specific upstream promoter elements. Furthermore, the transcription levels of at least 60% of the ncRNAs vary with developmental stage. We identified two new classes of ncRNAs, stem–bulge RNAs (sbRNAs) and snRNA-like RNAs (snlRNAs), both featuring distinct internal motifs, secondary structures, upstream elements, and high and developmentally variable expression. Most of the novel ncRNAs are conserved in Caenorhabditis briggsae, but only one homolog was found outside the nematodes. Preliminary estimates indicate that the C. elegans transcriptome contains 2000 small non-coding RNAs, potentially acting as regulatory elements in nematode development.

Expression profiling on whole-genome tiling microarrays applied to a mixed-stage C. elegans population verified the expression of 71% of all annotated exons. Only a small fraction (11%) of the polyadenylated transcription is non-annotated and appears to consist of 3200 missed or alternative exons and 7800 small transcripts of unknown function (TUFs). Almost half (44%) of the detected transcriptional output is non-polyadenylated and probably not protein coding, and of this, 70% overlaps the boundaries of protein-coding genes in a complex manner. Specific analysis of small non-polyadenylated transcripts verified 97% of all annotated small ncRNAs and suggested that the transcriptome contains 1200 small (<500 nt) unannotated noncoding loci. After combining overlapping transcripts, we estimate that at least 70% of the total C. elegans genome is transcribed.

Speaker's biography

Professor Runsheng Chen is one of the pioneers of research on theoretical biology and bioinformatics in China. For more than twenty years, Professor Chen has led his group in a series of systematic studies in bioinformatics, including the whole-genome assembly and annotation of T. tengcongensis B4 (the first bacterial genome completely sequenced in China), the 1% Human Genome Project, and the Draft Sequence of the Rice Genome. To date, Professor Chen has published more than 120 SCI-indexed papers. For his outstanding early studies on genomic informatics, he was invited to give the “Kotani Memorial Lecture” at the 15th International CODATA Conference in Tsukuba, Japan (29 September to 2 October 1996), and was then selected as the winner of the “Kotani Prize”. He was elected a Member of the Chinese Academy of Sciences in 2007 and was awarded the “HLHI Advancement Prize” in 2008. Professor Chen is currently a professor at the Institute of Biophysics, Chinese Academy of Sciences.