Abstract
BACKGROUND: The Ki-67 proliferation index (PI) is essential for grading well-differentiated neuroendocrine tumors (WD-NETs). Pathologists traditionally assess Ki-67 PI by identifying hotspots and manually counting positive cells among negative cells, expressed as a percentage. We developed an algorithm to objectively determine Ki-67 hotspots and calculate the PI in WD-NETs, and compared its results with pathologists' selected hotspots to assess reliability.

METHODS: Hotspots in gastroenteropancreatic WD-NETs (n = 20) were manually annotated on whole-slide images (WSIs) by six pathologists and compared with algorithm-selected areas. Ki-67 (DAKO, MIB-1 clone, prediluted) scoring was performed using QuPath's custom object classification algorithm. Pathologists identified hotspots on WSIs, captured images, and submitted them for PI determination using the same algorithm. Ki-67 PI was translated to grade per the WHO classification (G1: <3%; G2: 3-20%; G3: >20%). A pathologist consensus grade was assigned for each case when a majority of pathologists (>3/6) agreed. Fleiss's kappa was used to assess inter-pathologist agreement, Cohen's kappa to evaluate agreement between pathologists and the algorithm, and the Friedman test to analyze hotspot area variability.

RESULTS: Pathologists showed moderate agreement (Fleiss's kappa = 0.42, 80% agreement), whereas pathologist-algorithm agreement was fair (Cohen's kappa = 0.32, 58.9% agreement). Among cases with a pathologist consensus grade (n = 19), the algorithm assigned a higher grade in 8 cases (42%). Hotspots overlapped between the two methods in 60% of cases. Hotspot area varied significantly across pathologists (Friedman statistic = 95.97, p < 0.001).

CONCLUSION: Manual Ki-67 hotspot assessment is subjective, leading to grading variability. Algorithm-based assessment enhances reproducibility, though it occasionally upgrades tumors, highlighting the need for standardization and further validation.
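The WHO grade cut-offs used in the study can be sketched as a simple threshold function. This is an illustrative helper (the function name and boundary handling at exactly 3% and 20% are assumptions, not taken from the study's code), following the convention that G2 spans 3-20% inclusive:

```python
def ki67_grade(pi_percent: float) -> str:
    """Map a Ki-67 proliferation index (%) to WHO grade for WD-NETs.

    Cut-offs per the abstract: G1 < 3%, G2 3-20%, G3 > 20%.
    Boundary values (3% and 20%) are assigned to G2 here; the
    abstract does not specify boundary handling.
    """
    if pi_percent < 3:
        return "G1"
    if pi_percent <= 20:
        return "G2"
    return "G3"
```

For example, a hotspot PI of 2.1% maps to G1, 5% to G2, and 22% to G3 under these cut-offs.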