Abstract
Power grid businesses handle massive volumes of data, and data retrieval is frequently hampered by redundancy. Conventional decision-tree-based methods struggle to retrieve data accurately in the presence of redundant interference. To address this problem, this study proposes a multi-level redundant data retrieval method for the grid resource business middle platform, based on an improved decision tree algorithm. The method first builds a multi-level data decision tree from the middle-platform data and then prunes it with an algorithm based on the Akaike information criterion (AIC). An ant colony algorithm optimizes the pruning parameters, and the optimal parameters are applied to the middle-platform data decision tree to produce the improved tree. Building on the improved tree, the retrieval method uses a purpose-designed duplicate-data processing flow and a multi-level redundant data discrimination mechanism to quickly retrieve hierarchical redundant data in the grid resource business. Experimental results show that the improved decision tree algorithm raises multi-level redundant data retrieval accuracy by 14%. The optimized decision tree model represents the hierarchy of grid resource business data more completely and supports effective retrieval of multi-level redundant data, in both image and text categories, from the middle platform. The maximum F1-score reaches 0.99 with a retrieval time of only 4.5 s, which is 1.5 s below the predefined threshold, confirming the method's practical performance.
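As a rough illustration of what an AIC-based pruning decision can look like, the sketch below compares the AIC of a subtree (one class distribution per leaf) with the AIC of the same samples collapsed into a single leaf, using the standard definition AIC = 2k - 2 ln L. The function names, the multinomial leaf likelihood, and the parameter count of (classes - 1) per leaf are assumptions made for this example, not the pruning rule used in the study, and the ant colony tuning of the pruning parameters is not shown.

```python
import math
from collections import Counter


def leaf_log_likelihood(labels):
    """Multinomial log-likelihood of the labels under the leaf's own class frequencies."""
    n = len(labels)
    counts = Counter(labels)
    return sum(c * math.log(c / n) for c in counts.values())


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def should_prune(leaf_label_groups, n_classes):
    """Return True if collapsing the subtree into one leaf does not increase AIC.

    leaf_label_groups: one list of class labels per leaf of the candidate subtree
    (hypothetical input format chosen for this sketch).
    """
    params_per_leaf = n_classes - 1  # assumed: free class probabilities per leaf

    # AIC of the subtree as it stands: every leaf keeps its own distribution.
    subtree_ll = sum(leaf_log_likelihood(g) for g in leaf_label_groups if g)
    subtree_aic = aic(subtree_ll, params_per_leaf * len(leaf_label_groups))

    # AIC after pruning: all samples share a single leaf distribution.
    merged = [label for group in leaf_label_groups for label in group]
    merged_aic = aic(leaf_log_likelihood(merged), params_per_leaf)

    return merged_aic <= subtree_aic


if __name__ == "__main__":
    # A split that separates nothing is pruned; a clean split is kept.
    print(should_prune([["a", "b"], ["a", "b"]], n_classes=2))
    print(should_prune([["a"] * 10, ["b"] * 10], n_classes=2))
```

In this sketch, a split is kept only when the gain in likelihood outweighs the penalty for the extra leaf parameters, which is the general trade-off an information-criterion-based pruning step is meant to capture.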