Authors :
Surya Thummalapeta; Dr. Narender Reddy Kampelli
Volume/Issue :
Volume 11 - 2026, Issue 1 - January
Google Scholar :
https://tinyurl.com/7a7yuyy3
Scribd :
https://tinyurl.com/4pm9aech
DOI :
https://doi.org/10.38124/ijisrt/26jan329
Abstract :
Hand gesture recognition enables natural interaction between humans and machines and is widely used in
vision-based embedded applications. Although convolutional neural networks provide strong recognition capability, their
deployment on resource-constrained platforms presents challenges related to computation, latency, and system integration.
This paper presents the design and implementation of a real-time hand gesture recognition system using a lightweight CNN
accelerator deployed on a PYNQ-Z2 FPGA platform. The proposed system adopts a hardware–software co-design
approach, where image acquisition and control are handled by the processing system, while CNN inference is offloaded to
programmable logic for acceleration. A compact CNN architecture based on depthwise separable convolutions is
employed to reduce computational complexity and resource usage. The system supports live camera input, real-time
inference, and web-based visualization. Experimental observations demonstrate the feasibility of deploying CNN-based
hand gesture recognition on low-cost FPGA platforms, highlighting practical design trade-offs and implementation
considerations.
Keywords :
Hand Gesture Recognition, FPGA Acceleration, Convolutional Neural Networks, PYNQ-Z2, Hardware–Software Co-Design.
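The abstract states that depthwise separable convolutions reduce computational complexity. The following is a minimal sketch of why, counting multiply-accumulate operations (MACs) for one layer. The layer dimensions and function names are hypothetical and chosen only for illustration; they are not taken from the paper's network.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a k x k standard convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 32 x 32 feature map, 16 -> 32 channels, 3 x 3 kernel.
h, w, c_in, c_out, k = 32, 32, 16, 32, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
sep = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k**2 for any layer shape

print(f"standard:  {std} MACs")
print(f"separable: {sep} MACs")
print(f"ratio:     {ratio:.4f}")
```

For this assumed layer the separable form needs roughly one seventh of the MACs of the standard convolution. The ratio 1/c_out + 1/k^2 depends only on the output channel count and the kernel size, so the saving is larger for wider layers.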