PlumX Metrics

Spectral–temporal saliency masks and modulation tensorgrams for generalizable COVID-19 detection

Computer Speech & Language, ISSN: 0885-2308, Vol: 86, Page: 101620
2024
  • Citations: 1
  • Usage: 0
  • Captures: 5
  • Mentions: 1
  • Social Media: 0

Metrics Details

  • Citations: 1
  • Captures: 5
  • Mentions: 1
    • News Mentions: 1
      • News: 1

Most Recent News

New Findings from University of Quebec in COVID-19 Provides New Insights (Spectral-temporal Saliency Masks and Modulation Tensorgrams for Generalizable Covid-19 Detection)

2024 MAY 30 (NewsRx) -- By a News Reporter-Staff News Editor at NewsRx COVID-19 Daily -- Research findings on Coronavirus - COVID-19 are discussed in

Article Description

Speech-based COVID-19 detection systems have gained popularity as an easy-to-use, low-cost solution that is well suited to long-term at-home monitoring of patients with persistent symptoms. Recently, however, serious concerns have been raised about the limited ability of existing systems based on deep neural networks to generalize to unseen datasets, as well as about their limited interpretability. In this study, we aim to develop an interpretable and generalizable speech-based COVID-19 detection system. First, we propose the use of a 3-dimensional modulation frequency tensor (called the modulation tensorgram representation, MTR) as input to a convolutional recurrent neural network for COVID-19 detection. The MTR is known to capture long-term dynamics of speech that correlate with articulation and respiration, making it a promising candidate for characterizing COVID-19 speech. The customized network exploits both the spectral and the temporal patterns in the MTR to learn the underlying COVID-19 speech pattern. Next, we design a spectro-temporal saliency mask that aggregates the regions of the MTR related to COVID-19, further improving the generalizability and interpretability of the model. Experiments are conducted on three public datasets, and the results show that the proposed solution consistently outperforms two benchmark systems in within-, across-, and unseen-dataset tests. The learned salient regions are shown to correlate with whispered speech and vocal hoarseness, which explains the increased generalizability. Furthermore, our model relies on a small number of parameters, thus offering a promising solution for on-device remote monitoring of COVID-19 infection.
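
To make the idea of a 3-dimensional modulation frequency tensor more concrete, the following is a minimal sketch of one common way to build such a tensor: take a short-time magnitude spectrogram, then apply a second Fourier transform along the time axis over sliding segments, giving axes of segment time, acoustic frequency, and modulation frequency. Everything here is an assumption made for illustration and is not taken from the article: the function names (`stft_magnitude`, `modulation_tensorgram`, `apply_saliency_mask`), the Hann windows, the frame and segment lengths, and in particular the saliency mask, which is passed in by hand here, whereas the article describes a mask that is learned. The convolutional recurrent network is not reproduced.

```python
import numpy as np


def stft_magnitude(signal, frame_len=256, hop=64):
    """Magnitude spectrogram of shape (n_frames, frame_len // 2 + 1)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one analysis frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def modulation_tensorgram(signal, frame_len=256, hop=64, mod_len=32, mod_hop=16):
    """Tensor of shape (n_segments, n_acoustic_bins, n_modulation_bins).

    All window sizes are illustrative defaults, not the article's settings.
    """
    spec = stft_magnitude(signal, frame_len, hop)
    if spec.shape[0] < mod_len:
        raise ValueError("signal is too short for one modulation segment")
    n_segments = 1 + (spec.shape[0] - mod_len) // mod_hop
    mod_window = np.hanning(mod_len)[:, None]
    segments = []
    for s in range(n_segments):
        # Block of consecutive spectrogram frames: (mod_len, n_acoustic_bins).
        block = spec[s * mod_hop: s * mod_hop + mod_len] * mod_window
        # Second transform along time gives the modulation-frequency axis.
        mod_spec = np.abs(np.fft.rfft(block, axis=0))
        segments.append(mod_spec.T)
    return np.stack(segments)


def apply_saliency_mask(tensor, mask):
    """Weight every segment by a mask over (acoustic, modulation) bins.

    The mask is a hand-set placeholder (assumption); it is not learned here.
    """
    mask = np.asarray(mask, dtype=float)
    if mask.shape != tensor.shape[1:]:
        raise ValueError("mask must match the last two tensor dimensions")
    return tensor * mask[None, :, :]
```

With these defaults, a one-second signal sampled at 8 kHz yields a tensor of shape (6, 129, 17). Keeping the mask as a separate element-wise step means the retained acoustic-frequency and modulation-frequency regions can be inspected directly, which is the sense in which such a mask supports interpretability.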
