Optimizing strategy for the discovery of compositionally-biased or low-complexity regions in proteins

优化蛋白质中组成偏倚或低复杂度区域的发现策略

阅读:1

Abstract

Proteins can contain tracts dominated by a subset of amino acids and that have a functional significance. These are often termed 'low-complexity regions' (LCRs) or 'compositionally-biased regions' (CBRs). However, a wide spectrum of compositional bias is possible, and program parameters used to annotate these regions are often arbitrarily chosen. Also, investigators are sometimes interested in longer regions, or sometimes very short ones. Here, two programs for annotating LCRs/CBRs, namely SEG and fLPS, are investigated in detail across the whole expanse of their parameter spaces. In doing so, boundary behaviours are resolved that are used to derive an optimized systematic strategy for annotating LCRs/CBRs. Sets of parameters that progressively annotate or 'cover' more of protein sequence space and are optimized for a given target length have been derived. This progressive annotation can be applied to discern the biological relevance of CBRs, e.g., in parsing domains for experimental constructs and in generating hypotheses. It is also useful for picking out candidate regions of interest of a given target length and bias signature, and for assessing the parameter dependence of annotations. This latter application is demonstrated for a set of human intrinsically-disordered proteins associated with cancer.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。