Abstract
Power equipment maintenance work orders are vital to power equipment management because they record detailed information such as equipment specifications, defect reports, and specific maintenance activities. However, because automated information extraction from these documents has received little research attention, valuable operation and maintenance data remain underutilized. A key challenge is recognizing entities in unstructured Chinese maintenance texts, which are filled with specialized and abbreviated terms unique to the power sector. Existing named entity recognition (NER) solutions often fail to handle these complexities effectively. To address this, this paper proposes an NER model tailored to power equipment maintenance work orders. First, a dataset called power equipment maintenance work orders (PE-MWO) is constructed, covering seven entity categories. Next, a novel position- and similarity-aware attention module is proposed, in which a new position embedding method and attention score calculation are designed to improve the model's contextual understanding while keeping computational costs low. Then, with this module as the main body, an efficient NER model is built by combining it with the BERT-wwm-ext and conditional random field (CRF) modules. Finally, validated on PE-MWO and five public datasets, our model shows high accuracy in recognizing power sector entities and outperforms the comparative models on the public datasets.