Edge-Centric Embeddings of Digraphs: Properties and Stability Under Sparsification

有向图的边中心嵌入:稀疏化下的性质和稳定性

阅读:1

Abstract

In this paper, we define and characterize the embedding of edges and higher-order entities in directed graphs (digraphs) and relate these embeddings to those of nodes. Our edge-centric approach consists of the following: (a) Embedding line digraphs (or their iterated versions); (b) Exploiting the rank properties of these embeddings to show that edge/path similarity can be posed as a linear combination of node similarities; (c) Solving scalability issues through digraph sparsification; (d) Evaluating the performance of these embeddings for classification and clustering. We commence by identifying the motive behind the need for edge-centric approaches. Then we proceed to introduce all the elements of the approach, and finally, we validate it. Our edge-centric embedding entails a top-down mining of links, instead of inferring them from the similarities of node embeddings. This analysis is key to discovering inter-subgraph links that hold the whole graph connected, i.e., central edges. Using directed graphs (digraphs) allows us to cluster edge-like hubs and authorities. In addition, since directed edges inherit their labels from destination (origin) nodes, their embedding provides a proxy representation for node classification and clustering as well. This representation is obtained by embedding the line digraph of the original one. The line digraph provides nice formal properties with respect to the original graph; in particular, it produces more entropic latent spaces. With these properties at hand, we can relate edge embeddings to node embeddings. The main contribution of this paper is to set and prove the linearity theorem, which poses each element of the transition matrix for an edge embedding as a linear combination of the elements of the transition matrix for the node embedding. As a result, the rank preservation property explains why embedding the line digraph and using the labels of the destination nodes provides better classification and clustering performances than embedding the nodes of the original graph. In other words, we do not only facilitate edge mining but enforce node classification and clustering. However, computing the line digraph is challenging, and a sparsification strategy is implemented for the sake of scalability. Our experimental results show that the line digraph representation of the sparsified input graph is quite stable as we increase the sparsification level, and also that it outperforms the original (node-centric) representation. For the sake of simplicity, our theorem relies on node2vec-like (factorization) embeddings. However, we also include several experiments showing how line digraphs may improve the performance of Graph Neural Networks (GNNs), also following the principle of maximum entropy.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。