Energy Optimized YOLO: Quantized Inference for Real-Time Edge AI Object Detection

Hwee Min Chiam; Yan Chiew Wong; Ranjit Singh Sarban Singh; T. Joseph Sahaya Anand

doi:10.54554/jtec.2025.17.01.003

Authors

Hwee Min Chiam, Faculty of Electronics and Computer Technology and Engineering, Universiti Teknikal Malaysia Melaka (UTeM), 76100 Durian Tunggal, Melaka, Malaysia.
Yan Chiew Wong, Faculty of Electronics and Computer Technology and Engineering, Universiti Teknikal Malaysia Melaka (UTeM), 76100 Durian Tunggal, Melaka, Malaysia.
Ranjit Singh Sarban Singh, School of Engineering and Technology, Sunway University, Selangor, Malaysia.
T. Joseph Sahaya Anand, School of Computing, MIT Vishwaprayag University, Solapur, 413255, India.

DOI:

https://doi.org/10.54554/jtec.2025.17.01.003

Keywords:

Object detection, Real-time, Edge Device, Quantization, FPGA, Jetson Nano, YOLO

Abstract

Efficient real-time object detection is a critical requirement in edge computing applications such as smart surveillance, where resource constraints pose significant challenges. Existing deep learning methods often struggle to balance accuracy and efficiency, particularly when deployed on hardware with limited computational resources. This work develops a quantized object detection system that uses deep learning models to improve inference performance on two edge devices, the Zedboard and the Jetson Nano. The Zedboard, an FPGA platform without GPU acceleration, executes a quantized YOLOv3-tiny model at an ultra-low power consumption of 2.2 W but requires over 3 seconds per inference, making it unsuitable for real-time applications. In contrast, the Jetson Nano, running an optimized YOLOv7-tiny model with FP16 quantization and GPU acceleration, achieves a processing speed of 38 FPS with an mAP of 46.3% while maintaining a low power consumption of 5.1 W. These results demonstrate that combining quantized deep learning models with GPU acceleration offers a practical solution for real-time object detection in resource-constrained environments. Future work could focus on fine-tuning models for specific applications, such as traffic monitoring, to improve the detection of vehicles, pedestrians, and traffic signs in dynamic environments.
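The power and speed figures in the abstract can be combined into an energy cost per inference, which is one way to compare the two platforms on energy rather than on power alone. The sketch below is illustrative arithmetic, not a calculation taken from the paper. It assumes the quoted power values are average draw during inference, and it treats the Zedboard latency as exactly 3 seconds, so that result is only a lower bound. The function name is hypothetical.

```python
def energy_per_inference_joules(power_watts: float, seconds_per_inference: float) -> float:
    """Energy in joules = average power in watts x time per inference in seconds."""
    return power_watts * seconds_per_inference


# Figures quoted in the abstract.
# Zedboard: 2.2 W, "over 3 seconds" per inference, so 3.0 s gives a lower bound.
zedboard_j = energy_per_inference_joules(2.2, 3.0)
# Jetson Nano: 5.1 W at 38 FPS, i.e. 1/38 s per frame.
jetson_j = energy_per_inference_joules(5.1, 1.0 / 38)

print(f"Zedboard:    at least {zedboard_j:.2f} J per inference")
print(f"Jetson Nano: about {jetson_j:.3f} J per frame")
print(f"Ratio:       at least {zedboard_j / jetson_j:.0f}x")
```

Under these assumptions the Zedboard draws less power but spends far more energy on each inference, because the energy cost scales with latency as well as with power.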
