Article

Classification of laryngeal pathologies using audio, bioimpedance measurements and deep learning

Author: Julia Zofia Tomaszewska (University of West London)

Abstract

Presented at the UWL Annual Doctoral Students' Conference, Friday 12 July 2024.

Keywords: deep learning

How to Cite:

Tomaszewska, J. (2025) “Classification of laryngeal pathologies using audio, bioimpedance measurements and deep learning”, New Vistas 11(1). doi: https://doi.org/10.36828/newvistas.276


Published on 2025-02-19

Peer Reviewed


Julia Zofia Tomaszewska, School of Computing and Engineering

Supervisor: Dr Apostolos Georgakis, School of Computing and Engineering

Vocal tract pathologies encompass a wide spectrum of disorders, from functional impairments to structural abnormalities. Their early detection and classification are critical for efficient treatment and recovery. In this study, we design and implement a digital classification system for vocal tract pathologies based on audio signals and bioimpedance (electroglottographic, EGG) measurements. In doing so, we aim to contribute to the development of an accurate and reliable diagnostic tool.

For the development of the envisaged system, three classifiers are implemented. A Random Forest (RF) classifier is employed to assess the effectiveness of various feature extraction methods. The RF is also evaluated on binary classification of control signals (obtained from participants unaffected by the investigated pathologies) against pathological signals, achieving a maximum of 99.85% overall accuracy when using Mel-Frequency Cepstral Coefficients (MFCCs) derived from audio recordings. For the classification of specific vocal tract pathologies, two types of deep learning classifiers are developed and tested: the Convolutional Neural Network (CNN) and the Long Short-Term Memory network (LSTM).
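The binary RF stage described above can be sketched in a few lines of scikit-learn. This is a minimal illustration only, not the study's implementation: MFCC extraction is assumed to have already happened, and synthetic Gaussian feature vectors (with shifted means for the pathological class) stand in for real recordings; the dimensions and class sizes are arbitrary choices for the example.

```python
# Hypothetical sketch: control-vs-pathology binary classification with a
# Random Forest over precomputed MFCC feature vectors. Synthetic data
# stands in for real audio features; 13 MFCCs per sample is an assumption.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_per_class, n_mfcc = 200, 13

# Synthetic stand-ins: pathological class mean is shifted relative to control
control = rng.normal(0.0, 1.0, size=(n_per_class, n_mfcc))
pathological = rng.normal(1.0, 1.0, size=(n_per_class, n_mfcc))
X = np.vstack([control, pathological])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = control, 1 = pathology

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"held-out accuracy: {acc:.2f}")
```

In practice each feature vector would be derived from one utterance (e.g. frame-wise MFCCs averaged over time), and accuracy would be estimated with cross-validation rather than a single split.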

The CNN emerged as the most effective, achieving a maximum of 86.61% accuracy for Gammatone Cepstral Coefficients (GTCCs) derived from audio speech data and 84.91% accuracy for GTCCs derived from bioimpedance speech data. These findings underscore the potential of approaches tailored to specific data modalities and pathologies. Future work will explore a multi-modal approach, merging the audio- and EGG-based systems to obtain more accurate and reliable laryngeal pathology classification, thereby moving closer to a practical diagnostic tool for vocal tract disorders.
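To make concrete how cepstral features feed a convolutional classifier, the following is an illustrative forward pass of a tiny CNN in plain NumPy over a GTCC-like "image" of shape (coefficients × time frames). This is not the paper's architecture: the input shape, the single untrained 3×3 filter, and the four-class output head are all assumptions made for the example.

```python
# Illustrative, untrained CNN forward pass over a GTCC-like feature matrix.
# Shapes and the 4-class output are assumptions, not the study's design.
import numpy as np

rng = np.random.default_rng(1)
gtcc = rng.normal(size=(13, 40))  # assumed: 13 GTCCs over 40 analysis frames

def conv2d_valid(x, k):
    """Valid 2-D cross-correlation of x with kernel k."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling (trailing odd row/col dropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Random (untrained) parameters: one 3x3 filter, then a dense softmax head
kernel = rng.normal(size=(3, 3))
feat = np.maximum(conv2d_valid(gtcc, kernel), 0.0)  # conv + ReLU
feat = max_pool2x2(feat).ravel()                    # pool + flatten
W = rng.normal(size=(feat.size, 4))                 # 4 pathology classes (assumed)
logits = feat @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                # softmax over classes
print("class probabilities:", np.round(probs, 3))
```

A trained system would stack several learned filter banks, fit the weights by backpropagation on labelled recordings, and take the argmax of the softmax output as the predicted pathology class.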