Multi-sample ζ-mixup: richer, more realistic synthetic samples from a p-series interpolant

Abstract

Modern deep learning training procedures rely on model regularization techniques such as data augmentation, which generates training samples that increase the diversity of the data and the richness of the label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples that lie off the data manifold and can carry incorrect labels. We propose ζ-mixup, a generalization of mixup with provably and demonstrably desirable properties. It allows convex combinations of T ≥ 2 samples and uses a p-series interpolant to produce more realistic and diverse outputs that incorporate information from T original samples. We show that, compared to mixup, ζ-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of ζ-mixup is faster than mixup, and extensive evaluation on controlled synthetic data and on 26 diverse real-world natural and medical image classification datasets shows that ζ-mixup outperforms mixup, CutMix, and traditional data augmentation techniques. The code will be released at https://github.com/kakumarabhishek/zeta-mixup.
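
As a rough illustration of the idea of a convex combination of T samples weighted by a p-series, the following minimal sketch may help. It is not the authors' released implementation. It assumes that the weights are the normalized terms i^(-γ) for i = 1..T, and that each synthetic sample is formed by applying those weights to an independent random ordering of the T originals. The function names, the default γ = 2.8, and the plain-list data layout are all hypothetical choices made for this example.

```python
import random


def p_series_weights(T, gamma):
    """Normalized p-series weights w_i proportional to i**(-gamma), i = 1..T.

    The weights are positive and sum to 1, so they define a convex combination.
    """
    raw = [i ** (-gamma) for i in range(1, T + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def zeta_mixup_sketch(samples, labels, gamma=2.8, rng=None):
    """Sketch: build T synthetic (sample, label) pairs from T originals.

    samples: list of T equal-length lists of floats.
    labels:  list of T one-hot (or soft) label lists.
    gamma:   assumed exponent; a larger value puts more weight on one sample.
    """
    rng = rng or random.Random()
    T = len(samples)
    weights = p_series_weights(T, gamma)
    mixed_x, mixed_y = [], []
    for _ in range(T):
        # Assumption: each output uses its own random ordering of the originals.
        order = list(range(T))
        rng.shuffle(order)
        x = [sum(w * samples[j][d] for w, j in zip(weights, order))
             for d in range(len(samples[0]))]
        y = [sum(w * labels[j][c] for w, j in zip(weights, order))
             for c in range(len(labels[0]))]
        mixed_x.append(x)
        mixed_y.append(y)
    return mixed_x, mixed_y
```

Because the weights decay as a power of the rank, a sufficiently large γ makes the first sample in each ordering dominate the combination, so under these assumptions each synthetic sample stays close to one original while still mixing in information from the others. Original mixup, by contrast, combines only two samples per output.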
