Abstract
BACKGROUND AND OBJECTIVE: Building on our previous work on artificial intelligence-driven food classification, this study aims to improve portion size estimation for rice and congee, using milk as a low-glycemic index reference, in selected Taiwanese hospitals.
METHODS: We used a RealSense D435i depth camera to capture both 2-dimensional (2D) and 3-dimensional (3D) images of rice, congee, and milk served in standard hospital bowls. The images were labeled and classified by a convolutional neural network, Xception, and food weight and volume were estimated from 25 depth points using multilayer perceptron (MLP) models. MLP models were trained on 80% of the data set and tested on the remaining 20%, with predictions refined through a weighted ensemble of selected MLP models. We validated MLP performance through 3 comparisons: 1) MLP versus traditional statistical methods (linear regression), 2) MLP versus clinical dietitians, and 3) MLP performance across 3 different hospital settings.
RESULTS: The Xception model accurately identified foods with high glycemic index and recognized milk as a low-glycemic reference, achieving a training accuracy of 1.0 with the loss approaching zero by epoch 22. Test accuracy stabilized early with consistently low loss. MLP models effectively predicted food weights, with models E, F, and G yielding the lowest mean squared errors for rice, congee, and milk, respectively. Across all food types, MLP models outperformed linear regression, with a lower mean absolute percentage error (MAPE: 0.3%-3.3% versus 4%-9.2%). The MLP models were also markedly more accurate than estimates from 6 experienced dietitians (MAPE: 39%-41%). Performance remained stable across the 3 hospital sites, indicating strong generalizability.
CONCLUSIONS: The Xception and MLP models effectively quantify foods with high glycemic index and identify milk as a low-glycemic reference, demonstrating strong predictive performance and generalizability. Further studies are warranted to extend this approach to other food types.
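For readers unfamiliar with the evaluation metric, the following minimal Python sketch illustrates how a mean absolute percentage error and a weighted ensemble of model predictions can be computed. It is not the study's implementation: the function names, the example weights, and the gram values are hypothetical, and the abstract does not specify how the ensemble weights were chosen.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def weighted_ensemble(predictions, weights):
    """Combine per-model predictions with a normalized weighted average.

    predictions: one list of predicted values per model.
    weights: one non-negative weight per model (hypothetical values here).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Hypothetical food weights in grams, for illustration only.
true_weights = [100.0, 200.0]
model_a = [100.0, 200.0]
model_b = [110.0, 190.0]

combined = weighted_ensemble([model_a, model_b], weights=[1.0, 3.0])
error_percent = mape(true_weights, combined)
```

In this sketch a model with a larger weight pulls the combined estimate toward its own prediction; any scheme for deriving the weights, such as from validation error, would be an additional assumption.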