Author: Machindra K. Gaikwad
Abstract: Food object detection has emerged as a critical research area at the intersection of computer vision and nutritional informatics. This paper presents a comprehensive study on the application of YOLO (You Only Look Once) models for real-time food item recognition and classification. Accurate food detection is fundamental to calorie estimation, dietary tracking, and smart kitchen applications. We investigate the evolution from YOLOv1 through YOLOv8, analyzing architectural improvements, training strategies, and performance trade-offs on benchmark food datasets including FOOD-101, UEC-Food256, and a custom annotated dataset of Indian cuisines. Experimental results demonstrate that YOLOv8 achieves a mean Average Precision (mAP@0.5) of 91.3% on the FOOD-101 dataset while maintaining real-time inference speeds of 42 FPS on standard GPU hardware. The study further explores transfer learning, data augmentation, and anchor-box optimization as techniques to improve detection accuracy across diverse food categories. Our findings suggest that YOLO-based architectures are well suited for deployment in mobile and edge computing environments for dietary assessment applications.
International Journal of Science, Engineering and Technology
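The abstract reports detection quality as mAP@0.5, i.e. mean Average Precision at an Intersection-over-Union (IoU) threshold of 0.5. As a minimal, self-contained sketch of the IoU criterion behind that metric (the box format and variable names are illustrative assumptions, not taken from the paper):

```python
# Hedged sketch, not from the paper: the IoU computation underlying
# the mAP@0.5 metric. Boxes are assumed to be (x1, y1, x2, y2) tuples.

def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection rectangle
    # (clamped to zero when the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

# At mAP@0.5, a predicted box counts as a true positive only when its
# IoU with a matching ground-truth box is at least 0.5.
pred = (0.0, 0.0, 5.0, 10.0)
truth = (0.0, 0.0, 10.0, 10.0)
print(iou(pred, truth))         # 0.5 -> exactly at the threshold
print(iou(pred, truth) >= 0.5)  # True
```

Precision and recall are then computed from these matches at each confidence level, and the area under the resulting precision-recall curve, averaged over classes, gives the mAP figure quoted above.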