A Generative AI-Based Technical Data Extraction Tool for IoT Application Systems

Abstract

Internet of Things (IoT) application systems play an essential role in smart cities, industry, healthcare, agriculture, and smart homes. For non-expert users, designing and implementing such systems remains challenging, especially when configuring sensors, edge devices, and server platforms. To support these configuration tasks, we have developed an AI-based setup assistance tool. However, AI models still fail to reliably support newly released or previously unseen devices, sometimes producing incomplete or erroneous outputs that can lead to configuration failures. Incorporating the technical documentation of these devices into Retrieval-Augmented Generation (RAG) is an effective way to supplement the model's knowledge and improve reliability. In this paper, we propose a generative AI-based technical data extraction tool to address this challenge. It applies schema-based extraction to PDF or HTML datasheets to obtain essential technical information and converts it into a structured format suitable for AI-supported configuration. A local vector database enables semantic similarity retrieval and provides document-grounded evidence for RAG-based answering, so that previously unseen IoT devices receive consistent support. For evaluation, we first applied the tool to several sensor and device datasheets and compared the extracted specifications with ground-truth values to measure accuracy and completeness. We then compared end-to-end configuration QA reliability against a commercial baseline (ChatPDF) on a golden benchmark. The results show that the proposed tool reliably acquires key specifications and significantly improves end-to-end configuration QA reliability. Across 960 golden QA pairs, the proposed method improves Recall from 0.636 to 0.926 and Accuracy from 0.595 to 0.807 compared with ChatPDF.
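
The comparison of extracted specifications against ground-truth values could be scored roughly as in the sketch below. It is only an illustration: the abstract does not define the metrics, so here completeness is assumed to be the fraction of ground-truth fields for which the extractor returned any value, and accuracy the fraction whose value matches after simple normalization. The function names, the field names, and the sample values are all hypothetical and do not come from the paper or from a real datasheet.

```python
from typing import Dict, Optional


def normalize(value: Optional[str]) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(str(value).lower().split()) if value is not None else ""


def score_extraction(
    extracted: Dict[str, Optional[str]], truth: Dict[str, str]
) -> Dict[str, float]:
    """Score one datasheet's extracted fields against its ground-truth fields.

    Assumed definitions (not taken from the paper):
      completeness = ground-truth fields with a non-empty extracted value / all fields
      accuracy     = ground-truth fields with a matching extracted value / all fields
    """
    if not truth:
        raise ValueError("ground truth must contain at least one field")
    filled = sum(1 for key in truth if normalize(extracted.get(key)))
    correct = sum(
        1 for key, value in truth.items()
        if normalize(extracted.get(key)) == normalize(value)
    )
    total = len(truth)
    return {"completeness": filled / total, "accuracy": correct / total}


# Hypothetical schema fields and values, made up for illustration only.
truth = {
    "supply_voltage": "3.3 V",
    "interface": "I2C",
    "i2c_address": "0x76",
    "temp_range": "-40 to 85 C",
}
extracted = {
    "supply_voltage": "3.3 v",   # matches after normalization
    "interface": "I2C",          # exact match
    "i2c_address": "0x77",       # filled but wrong
    "temp_range": None,          # missing
}

scores = score_extraction(extracted, truth)
print(scores)
```

In this made-up case three of the four fields are filled and two are correct, so a wrong value lowers accuracy while a missing value lowers both scores. Exact string matching is a deliberately strict choice; comparing numeric values with units would need a more careful normalizer.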
