Abstract
Heterogeneous graph representation learning methods often rely heavily on manually crafted meta-paths and neighbor aggregation mechanisms to handle diverse node types and complex relations. This dependence makes it difficult for models to adapt when information is scarce or nodes have too few neighbors, and it frequently leads to over-smoothing and limited representational capacity, which in turn reduce the distinctiveness of node representations and hurt downstream task performance. To address these limitations, we propose GraphFlow, a framework built on information flow optimization and potential neighbor selection that strengthens the learning of inter-node associations by optimizing how information propagates. Rather than relying on a fixed neighborhood, GraphFlow dynamically optimizes information flow paths and adaptively selects potential neighbors, which allows it to capture latent yet highly relevant neighbors even when information is scarce or direct neighbors are missing. Specifically, by integrating HodgeRank ranking with adaptive meta-path generation, the approach refines neighbor selection and adaptively models deep semantic relationships among nodes in multi-level, multi-relational graph structures. This significantly improves the distinctiveness of node representations and supports effective propagation of information. Extensive experiments on multiple publicly available heterogeneous graph datasets show that GraphFlow outperforms the strongest baselines on node classification and link prediction across most evaluation metrics. Notably, it performs particularly well on heterogeneous graph datasets with complex node types and multiple relation types, markedly improving the distinctiveness of the learned representations and the model's generalization.