Abstract
Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality worldwide, and accurate classification of liver lesions on ultrasound remains challenging. We present SMC-LUD (Samsung Medical Center - Liver Ultrasound Dataset), a publicly available dataset of B-mode liver ultrasound images collected at Samsung Medical Center, Seoul, Korea, between 2015 and 2024. The dataset comprises 5,385 anonymized ultrasound images from 1,021 patients in two clinically relevant classes: hepatocellular carcinoma (2,716 images) and hemangioma (2,669 images). All HCC cases were histopathologically confirmed by surgical resection or biopsy, whereas hemangioma cases were diagnosed radiologically on the basis of characteristic imaging features. Each image was labeled and verified by board-certified radiologists and pathologists. Images are grouped by patient, so training and evaluation splits can be made at the patient level. This resource addresses the scarcity of large, well-annotated ultrasound datasets for liver lesion classification and provides a foundation for developing and validating deep learning models for liver cancer screening and diagnosis.
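
Because the images are grouped by patient, one natural use of that grouping is to split the data so that no patient contributes images to both the training and the test set. The following is a minimal sketch of such a split, not the dataset's official protocol. The record layout `(patient_id, image_path, label)`, the example patient IDs and file paths, the label strings, and the function name `patient_level_split` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def patient_level_split(records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_path, label) records by patient.

    All images from one patient land on the same side of the split.
    """
    # Collect every image belonging to each patient.
    by_patient = defaultdict(list)
    for patient_id, image_path, label in records:
        by_patient[patient_id].append((image_path, label))

    # Shuffle patients (not images) with a fixed seed for reproducibility.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    # Hold out at least one patient for testing.
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_patients = set(patient_ids[:n_test])

    train, test = [], []
    for pid in patient_ids:
        target = test if pid in test_patients else train
        target.extend((pid, path, label) for path, label in by_patient[pid])
    return train, test


# Hypothetical records; real IDs, paths, and label names may differ.
records = [
    ("P001", "P001/img_01.png", "HCC"),
    ("P001", "P001/img_02.png", "HCC"),
    ("P002", "P002/img_01.png", "hemangioma"),
    ("P003", "P003/img_01.png", "HCC"),
    ("P003", "P003/img_02.png", "HCC"),
    ("P004", "P004/img_01.png", "hemangioma"),
]

train, test = patient_level_split(records, test_fraction=0.25)
print(len(train), "training images;", len(test), "test images")
```

This sketch does not stratify by class. If class balance across the two sets matters, the patients of each class could be shuffled and split separately.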