Interactive visualization of metric distortion in nonlinear data embeddings using the distortions package

Abstract

Nonlinear dimensionality reduction methods like Uniform Manifold Approximation and Projection (UMAP) and t-distributed stochastic neighbor embedding ($t$-SNE) can help organize high-dimensional genomics data into manageable low-dimensional representations that reveal structure such as cell types or differentiation trajectories. Such reductions can be powerful, but they inevitably introduce distortion. A growing body of work has demonstrated that this distortion can have serious consequences for downstream interpretation, for example by suggesting clusters that do not exist in the original data. Motivated by these developments, we implemented a software package, distortions, which builds on state-of-the-art methods for measuring local distortions and displays them in an intuitive and interactive way. Through case studies on simulated and real data, we find that the visualizations can help flag fragmented neighborhoods, support hyperparameter tuning, and enable method selection. We believe that this extra layer of information will help practitioners use nonlinear dimensionality reduction methods more confidently. The package documentation and notebooks reproducing all case studies are available online at https://krisrs1128.github.io/distortions/site/.
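To make the idea of local distortion concrete, the following is a minimal sketch that does not use the distortions package or its API, and does not reproduce the distortion measures the package builds on. It assumes a much simpler proxy: for each point, the fraction of its k nearest neighbors in the original space that are still among its k nearest neighbors in the embedding. The function names, the toy data, and the coordinate-truncation "embedding" are all hypothetical and chosen only for illustration.

```python
import math
import random


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def neighborhood_preservation(high_dim, low_dim, k=5):
    """Per-point fraction of k nearest neighbors shared by the two spaces.

    A score near 1 means the local neighborhood survived the embedding;
    a score near 0 means it was fragmented. This is an illustrative proxy,
    not the measure implemented in the distortions package.
    """
    if len(high_dim) != len(low_dim):
        raise ValueError("both inputs must describe the same points")
    original_nn = knn_indices(high_dim, k)
    embedded_nn = knn_indices(low_dim, k)
    return [len(a & b) / k for a, b in zip(original_nn, embedded_nn)]


# Toy data: 50 points in 10 dimensions (hypothetical, seeded for repeatability).
rng = random.Random(0)
original = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(50)]

# Stand-in "embedding": keep only the first two coordinates. A real analysis
# would use the output of UMAP or t-SNE here instead.
embedding = [row[:2] for row in original]

scores = neighborhood_preservation(original, embedding, k=5)

# Sanity check: comparing the data with itself preserves every neighborhood.
identity_scores = neighborhood_preservation(original, original, k=5)
```

Points with low scores are candidates for the kind of fragmented neighborhoods the abstract mentions. The brute-force neighbor search is quadratic in the number of points, so this sketch suits small examples only.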
