We demonstrate that gaps and distributional patterns embedded within real-valued measurements are inseparable parts of a system's biological and mechanistic information content. Such patterns are discovered through a data-driven, possibly gapped histogram, which in turn leads to the geometry-based analysis of histogram (ANOHT). Constructing a possibly gapped histogram is a complex problem of statistical mechanics because the ensemble of candidate histograms is captured by a two-layer Ising model. The construction is also a distinctive problem of information theory from the perspective of data compression via uniformity. When the Hamiltonian (or energy) is defined as the sum of the total coding lengths of the bin boundaries and the total decoding errors within bins, the problem of computing the minimum-energy macroscopic states is, surprisingly, resolved by applying the hierarchical clustering algorithm. A possibly gapped histogram thus corresponds to a macro-state. The first phase of ANOHT is developed for the simultaneous comparison of multiple treatments; the second phase is developed from classical empirical process theory for a tree geometry and can check the authenticity of branches of the treatment tree. The well-known Iris data are used to illustrate the technical developments. A large baseball pitching dataset and heavily right-censored divorce data are also analysed to showcase the existential gaps and the utility of ANOHT.
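To make the idea of an energy built from boundary coding cost plus within-bin decoding error more concrete, here is a minimal Python sketch. It is not the authors' algorithm or their Hamiltonian: the function names `gapped_histogram` and `energy`, the flat `boundary_cost` per boundary, and the squared deviation from an evenly spaced grid (used as a stand-in for "decoding error under uniformity") are all assumptions made for illustration. The binning step cuts the sorted data at its largest gaps, which is what single-linkage hierarchical clustering yields in one dimension.

```python
def gapped_histogram(values, n_bins):
    """Split 1D data into n_bins bins by cutting at the largest gaps.

    Illustrative only: in one dimension, cutting at the largest gaps
    between consecutive sorted values matches single-linkage
    hierarchical clustering. The source does not specify the linkage.
    """
    xs = sorted(values)
    if not 1 <= n_bins <= len(xs):
        raise ValueError("n_bins must be between 1 and len(values)")
    # Index i stands for a possible cut between xs[i] and xs[i + 1].
    gap_order = sorted(
        range(len(xs) - 1),
        key=lambda i: xs[i + 1] - xs[i],
        reverse=True,
    )
    cuts = sorted(gap_order[: n_bins - 1])
    bins, start = [], 0
    for c in cuts:
        bins.append(xs[start : c + 1])
        start = c + 1
    bins.append(xs[start:])
    return bins


def energy(bins, boundary_cost=1.0):
    """Toy energy: boundary coding cost plus within-bin non-uniformity.

    Assumption: each bin pays a flat cost for its two boundaries, and
    the decoding error is the squared distance between the sorted points
    and an evenly spaced grid over the bin's range.
    """
    total = 0.0
    for b in bins:
        total += 2 * boundary_cost
        lo, hi, n = b[0], b[-1], len(b)
        if n > 1 and hi > lo:
            total += sum(
                (x - (lo + (hi - lo) * i / (n - 1))) ** 2
                for i, x in enumerate(b)
            )
    return total


data = [5.1, 1.0, 1.2, 5.0, 1.1, 5.2]
two_bins = gapped_histogram(data, 2)
one_bin = gapped_histogram(data, 1)
print(two_bins, energy(two_bins), energy(one_bin))
```

On this toy sample the two-bin histogram, which leaves the interval between 1.2 and 5.0 empty, has lower energy than the single bin spanning the gap. That is the sense in which a gapped histogram can be the preferred low-energy state under these assumed costs.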
Complexity of possibly gapped histogram and analysis of histogram.
Authors: Fushing Hsieh, Roy Tania
| Journal: | Royal Society Open Science | Impact factor: | 2.900 |
| Year: | 2018 | Citation: | 2018 Feb 28; 5(2):171026 |
| DOI: | 10.1098/rsos.171026 | | |
