Scaling up graph-based semisupervised learning via prototype vector machines.

When the amount of labeled data are limited, semisupervised learning can improve the learner's performance by also using the often easily available unlabeled data. In particular, a popular approach requires the learned function to be smooth on the underlying data manifold. By approximating this manifold as a weighted graph, such graph-based techniques can often achieve state-of-the-art performance. However, their high time and space complexities make them less attractive on large data sets. In this paper, we propose to scale up graph-based semisupervised learning using a set of sparse prototypes derived from the data. These prototypes serve as a small set of data representatives, which can be used to approximate the graph-based regularizer and to control model complexity. Consequently, both training and testing become much more efficient. Moreover, when the Gaussian kernel is used to define the graph affinity, a simple and principled method to select the prototypes can be obtained. Experiments on a number of real-world data sets demonstrate encouraging performance and scaling properties of the proposed approach. It also compares favorably with models learned via l1 -regularization at the same level of model sparsity. These results demonstrate the efficacy of the proposed approach in producing highly parsimonious and accurate models for semisupervised learning.

期刊：	IEEE Trans Neural Netw Learn Syst	影响因子：	0.000
时间：	2015	起止号：	2015 Mar;26(3):444-57
doi：	10.1109/TNNLS.2014.2315526

Scaling up graph-based semisupervised learning via prototype vector machines.

通过原型向量机扩展基于图的半监督学习

特别声明