Arabic Lipreading Using YOLO and CNN Models

Lecture Notes in Networks and Systems, ISSN: 2367-3389, Vol: 1145 LNNS, Page: 13-23
2024

Conference Paper Description

Lipreading is a vital aspect of human communication and requires effective computational methods. Lip movements, integral to this process, present challenges such as variability and context dependence. Recent developments in deep learning show potential for enhancing Arabic visual speech recognition (VSR) systems. This paper focuses on leveraging deep learning to assist Arabic-speaking individuals with hearing impairments, reduce their communication barriers, and enhance their quality of life. We employed our own Arabic dataset, using YOLOv7 as a frontend for mouth detection and two CNN models, (i) InceptionV3 and (ii) a custom CNN, for speech classification. Our approach aims to address the complexities of lipreading. Our results are promising, with a speech recognition accuracy of 90%. These results underscore the capacity of deep learning to improve visual speech recognition, facilitating the development of more effective and precise methods for detection and recognition.
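The two-stage pipeline described in the abstract (a detector frontend that localizes the mouth region, followed by a CNN classifier over the cropped region) can be sketched roughly as below. This is a minimal illustration only: the `detect_mouth` and `classify` functions are placeholder stubs standing in for the paper's trained YOLOv7 detector and InceptionV3/custom-CNN classifiers, and all box coordinates and class counts are assumptions, not values from the paper.

```python
import numpy as np

def detect_mouth(frame):
    """Stand-in for the YOLOv7 frontend: returns a mouth bounding box
    (x, y, w, h). A real system would run a trained detector here."""
    h, w = frame.shape[:2]
    # Hypothetical fixed box around the lower-center of the face.
    return (w // 4, h // 2, w // 2, h // 4)

def crop_and_resize(frame, box, size=(64, 64)):
    """Crop the detected mouth region and resize it (nearest-neighbour)
    to the classifier's expected input size."""
    x, y, bw, bh = box
    roi = frame[y:y + bh, x:x + bw]
    ys = np.linspace(0, roi.shape[0] - 1, size[0]).astype(int)
    xs = np.linspace(0, roi.shape[1] - 1, size[1]).astype(int)
    return roi[np.ix_(ys, xs)]

def classify(mouth_roi, n_classes=10):
    """Stand-in for the CNN backend (InceptionV3 or a custom CNN):
    returns a probability distribution over hypothetical word classes."""
    rng = np.random.default_rng(0)
    logits = rng.normal(size=n_classes)
    return np.exp(logits) / np.exp(logits).sum()  # softmax

# Dummy grayscale frame in place of a real video frame.
frame = np.zeros((128, 128), dtype=np.uint8)
box = detect_mouth(frame)
roi = crop_and_resize(frame, box)
probs = classify(roi)
print(roi.shape, round(float(probs.sum()), 6))
```

In the actual system, `classify` would be applied to sequences of mouth crops rather than a single frame, since lipreading depends on temporal lip movement.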
