A Semi-Automatic Labeling Framework for PCB Defects via Deep Embeddings and Density-Aware Clustering

Abstract

(1) Background. Printed circuit board (PCB) inspection is increasingly constrained by the cost and latency of obtaining reliable labels, because defects are tiny and low-contrast, sit in complex backgrounds, and are severely class-imbalanced. (2) Methods. We propose a semi-automatic labeling pipeline that converts anomaly-detection proposals into class labels through small-margin cropping of the proposals from the images, interchangeable embeddings (HOG, ResNet-50, ViT-B/16), clustering (k-means, GMM, or HDBSCAN), and cluster-level verification using representative montages. (3) Results. On 9354 cropped defects spanning 10 categories (imbalance ratio IR ≈ 1542, Gini ≈ 0.642), ResNet-50 + HDBSCAN achieved NMI ≈ 0.290, AMI ≈ 0.283, and purity ≈ 0.624 with ~47 clusters; ViT + HDBSCAN was comparable (NMI ≈ 0.281, AMI ≈ 0.274, ~44 clusters). With a fixed taxonomy, k-means (K = 10) yielded the strongest ARI (0.169 with ResNet-50; 0.158 with ViT). Macro-purity exceeded micro-purity, indicating many small, homogeneous clusters that an operator can accept or reject in a single decision; this allows up to a ~200× reduction in operator decisions relative to per-image labeling. (4) Conclusions. The workflow provides an auditable, resource-flexible path from normal-only localization to scalable supervision. It prioritizes labeling productivity over state-of-the-art detector performance and directly addresses the industrial labeling bottleneck in the development lifecycle for PCB inspection.
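The purity and decision-reduction figures above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels, and the reading of the ~200× bound as "crops divided by clusters" are assumptions made here for illustration. Micro-purity weights each cluster by its size, while macro-purity averages per-cluster purity, so many small, clean clusters push the macro value above the micro value.

```python
from collections import Counter, defaultdict


def cluster_purity(labels_true, labels_pred):
    """Return (micro_purity, macro_purity) for a hard clustering.

    Assumption: every predicted label counts as a cluster, including any
    noise label (e.g. -1 from HDBSCAN); the paper may treat noise differently.
    """
    members_by_cluster = defaultdict(list)
    for true_label, cluster_id in zip(labels_true, labels_pred):
        members_by_cluster[cluster_id].append(true_label)

    majority_total = 0
    per_cluster_purity = []
    for members in members_by_cluster.values():
        # Size of the most frequent true class inside this cluster.
        majority = Counter(members).most_common(1)[0][1]
        majority_total += majority
        per_cluster_purity.append(majority / len(members))

    micro = majority_total / len(labels_true)                  # size-weighted
    macro = sum(per_cluster_purity) / len(per_cluster_purity)  # unweighted mean
    return micro, macro


def imbalance_ratio(labels_true):
    """Largest class count divided by smallest class count (assumed IR definition)."""
    counts = Counter(labels_true)
    return max(counts.values()) / min(counts.values())


def decision_reduction(n_items, n_clusters):
    """Upper bound on saved decisions if each cluster needs one accept/reject."""
    return n_items / n_clusters


# Toy data (hypothetical): one large mixed cluster and two small pure ones.
labels_true = ["short"] * 4 + ["open"] * 2 + ["short"] * 2 + ["spur"] * 2
labels_pred = [0] * 6 + [1] * 2 + [2] * 2

micro, macro = cluster_purity(labels_true, labels_pred)
ir = imbalance_ratio(labels_true)
reduction = decision_reduction(9354, 47)  # counts quoted in the abstract
print(f"micro={micro:.3f} macro={macro:.3f} IR={ir:.1f} reduction={reduction:.0f}x")
```

In the toy data the mixed cluster drags micro-purity down to 0.8, while the two pure clusters lift macro-purity to about 0.889, the same pattern the abstract reports. Dividing the 9354 crops by ~47 clusters gives roughly 199, consistent with the stated ~200× upper bound under the one-decision-per-cluster assumption.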
