A Review of Artificial Intelligence-Based Real-Time Sign Language Recognition Using Computer Vision
Abstract
Sign Language Recognition serves as an essential tool to bridge the communication gap
between hearing and Deaf communities. This paper presents a comprehensive review of
AI-based sign language recognition models for real-time applications using computer vision. The study explores recent advances incorporating Convolutional Neural Networks
(CNNs), Recurrent Neural Networks (RNNs), and transformer architectures, analyzing 20
research articles published within the last two years to identify current methodologies,
performance metrics, and implementation trends. Analysis reveals that spatial-temporal
learning models achieve accuracies exceeding 95% under supervised learning for isolated
sign recognition and 89-96% for continuous sign sequences. Skeleton-based Graph
Convolutional Networks demonstrate superior performance (96.1% accuracy) compared
to RGB-based methods, while multimodal fusion strategies yield 2-8 percentage point
improvements over unimodal approaches. However, significant challenges persist in
practical deployment, including environmental robustness (10-25% accuracy degradation
across different settings), signer variability, continuous sign segmentation (accuracy
drops from 80-92% to 65-75% in natural streams), and limited dataset vocabulary
coverage (5-10% of full sign language dictionaries). This review contributes a systematic
synthesis of state-of-the-art techniques, identifies critical implementation barriers, and
provides researchers and practitioners with clear directions for advancing real-time
sign language recognition toward practical, inclusive communication systems.
