Detection of RNA structures in porcine EST data and related mammals

猪EST数据及相关哺乳动物RNA结构的检测

阅读:1

Abstract

BACKGROUND: Non-coding RNAs (ncRNAs) are involved in a wide spectrum of regulatory functions. Within recent years, there have been increasing reports of observed polyadenylated ncRNAs and mRNA like ncRNAs in eukaryotes. To investigate this further, we examined the large data set in the Sino-Danish PigEST resource http://pigest.ku.dk which also contains expression information distributed on 97 non-normalized cDNA libraries. RESULTS: We constructed a pipeline, EST2ncRNA, to search for known and novel ncRNAs. The pipeline utilises sequence similarity to ncRNA databases (blast), structure similarity to Rfam (RaveNnA) as well as multiple alignments to predict conserved novel putative RNA structures (RNAz). EST2ncRNA was fed with 48,000 contigs and 73,000 singletons available from the PigEST resource. Using the pipeline we identified known RNA structures in 137 contigs and single reads (conreads), and predicted high confidence RNA structures in non-protein coding regions of additional 1,262 conreads. Of these, structures in 270 conreads overlap with existing predictions in human. To sum up, the PigEST resource comprises trans-acting elements (ncRNAs) in 715 contigs and 340 singletons as well as cis-acting elements (inside UTRs) in 311 contigs and 51 singletons, of which 18 conreads contain both predictions of trans- and cis-acting elements. The predicted RNAz candidates were compared with the PigEST expression information and we identify 114 contigs with an RNAz prediction and expression in at least ten of the non-normalised cDNA libraries. We conclude that the contigs with RNAz and known predictions are in general expressed at a much lower level than protein coding transcripts. In addition, we also observe that our ncRNA candidates constitute about one to two percent of the genes expressed in the cDNA libraries. Intriguingly, the cDNA libraries from developmental (brain) tissues contain the highest amount of ncRNA candidates, about two percent. These observations are related to existing knowledge and hypotheses about the role of ncRNAs in higher organisms. Furthermore, about 80% porcine coding transcripts (of 18,600 identified) as well as less than one-third ORF-free transcripts are conserved at least in the closely related bovine genome. Approximately one percent of the coding and 10% of the remaining matches are unique between the PigEST data and cow genome. Based on the pig-cow alignments, we searched for similarities to 16 other organisms by UCSC available alignments, which resulted in a 87% coverage by the human genome for instance. CONCLUSION: Besides recovering several of the already annotated functional RNA structures, we predicted a large number of high confidence conserved secondary structures in polyadenylated porcine transcripts. Our observations of relatively low expression levels of predicted ncRNA candidates together with the observations of higher relative amount in cDNA libraries from developmental stages are in agreement with the current paradigm of ncRNA roles in higher organisms and supports the idea of polyadenylated ncRNAs.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。