Integrating graph convolutional networks with large language models for structured biomedical material knowledge representation

Abstract

Automated literature mining is key to building structured biomedical materials databases, yet current methods struggle with large publication volumes, complex entity relations and domain-specific terminology. We propose a hierarchical natural language processing (NLP) framework for extracting structured data from biomedical materials texts. Our pipeline uses named entity recognition (NER) to identify entities such as compositions, synthesis methods and properties. Sentence-level relation extraction captures direct associations (e.g. temperature, morphology), while a paragraph-level graph convolutional network (GCN) module resolves cross-sentence co-references. Rule-based templates enhance precision in specific cases. Extracted relations are integrated into a biomedical materials knowledge graph, enabling scalable and extensible data representation. Experiments show that the sentence-level model achieves 84.7% accuracy and the GCN-based module achieves 84.0%. This approach offers an efficient pipeline for structuring complex scientific texts, reducing manual effort and supporting large-scale knowledge extraction in biomedical materials and related domains.
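The paragraph-level GCN step can be illustrated with a small sketch. The abstract does not give the exact formulation, so the code below assumes the widely used propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W) with self-loops and symmetric normalization. The toy graph, the node labels, the feature vectors and the identity weight matrix are all hypothetical, chosen only to show how a co-referring mention in one sentence can pick up information from mentions it is linked to in other sentences.

```python
import math


def normalized_adjacency(n, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph with n nodes."""
    # Start from the identity so every node has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical paragraph graph whose nodes are entity mentions:
#   0: "hydroxyapatite" (sentence 1)
#   1: "the scaffold"   (sentence 2, co-refers with node 0)
#   2: "porous"         (sentence 2, morphology linked to node 1)
edges = [(0, 1), (1, 2)]
a_hat = normalized_adjacency(3, edges)

# Hypothetical 2-dimensional features: [is_material, is_property].
h = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]
w = [[1.0, 0.0],
     [0.0, 1.0]]  # identity weights, for readability only

h_next = gcn_layer(a_hat, h, w)
for node, row in enumerate(h_next):
    print(node, [round(v, 3) for v in row])
```

After one step, node 1 ("the scaffold") starts with an all-zero feature vector but ends up with non-zero values in both dimensions, because it aggregates from the material mention and the property mention it is connected to. A trained model would learn `w` and typically stack several such layers; this sketch only shows the message-passing mechanics.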
