FPGA-Based Real-Time Hand Gesture Recognition Using a Lightweight CNN on PYNQ-Z2


Authors : Surya Thummalapeta; Dr. Narender Reddy Kampelli

Volume/Issue : Volume 11 - 2026, Issue 1 - January


Google Scholar : https://tinyurl.com/7a7yuyy3

Scribd : https://tinyurl.com/4pm9aech

DOI : https://doi.org/10.38124/ijisrt/26jan329



Abstract : Hand gesture recognition enables natural interaction between humans and machines and is widely used in vision-based embedded applications. Although convolutional neural networks provide strong recognition capability, their deployment on resource-constrained platforms presents challenges related to computation, latency, and system integration. This paper presents the design and implementation of a real-time hand gesture recognition system using a lightweight CNN accelerator deployed on a PYNQ-Z2 FPGA platform. The proposed system adopts a hardware–software co-design approach, where image acquisition and control are handled by the processing system, while CNN inference is offloaded to programmable logic for acceleration. A compact CNN architecture based on depthwise separable convolutions is employed to reduce computational complexity and resource usage. The system supports live camera input, real-time inference, and web-based visualization. Experimental observations demonstrate the feasibility of deploying CNN-based hand gesture recognition on low-cost FPGA platforms, highlighting practical design trade-offs and implementation considerations.

Keywords : Hand Gesture Recognition, FPGA Acceleration, Convolutional Neural Networks, PYNQ-Z2, Hardware–Software Co-Design.
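
Illustrative note : The abstract states that depthwise separable convolutions reduce computational complexity. The sketch below makes that claim concrete by counting multiply-accumulate (MAC) operations for one layer, using the standard cost formulas for the two convolution types. The layer dimensions (32 x 32 feature map, 16 input channels, 32 output channels, 3 x 3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper's network.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only to illustrate the cost ratio.
    h, w, c_in, c_out, k = 32, 32, 16, 32, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    sep = depthwise_separable_macs(h, w, c_in, c_out, k)
    print(f"standard conv MACs:       {std}")
    print(f"depthwise separable MACs: {sep}")
    print(f"cost ratio:               {sep / std:.3f}")
```

For this assumed layer the standard convolution needs 4,718,592 MACs and the depthwise separable version 671,744, a ratio of about 0.142. In general the ratio equals 1/c_out + 1/k^2, so the saving grows with the number of output channels and the kernel size, which is why this structure suits logic-constrained devices such as the PYNQ-Z2.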


