Implementation of an Automatic Vocal Diagnosis System Based on Acoustic Indicators 


Vol. 14, No. 10, pp. 764-774, Oct. 2025
https://doi.org/10.3745/TKIPS.2025.14.10.764


  Abstract

This paper presents an automatic speech analysis system that segments syllables, classifies pitch ranges, and analyzes key acoustic features in repeated utterances by beginner-level vocalists. The system combines timestamp-based coarse segmentation with pitch-based fine segmentation: within each coarse segment defined by timestamps, pitch transitions are detected to determine fine-grained syllable boundaries. The boundaries derived from a reference utterance (t1) are then applied consistently to the repeated utterances (t2–t4), which enables precise alignment across repetitions. Each syllable is classified by its average pitch as low (<220 Hz), mid (220–349 Hz), or high (≥350 Hz), and the resulting segments are automatically named and stored accordingly. To validate the system, a dataset was constructed from more than 130 voluntary participants. Participants were categorized into three performance groups based on breath control, pitch/rhythm accuracy, and high-pitch ability, and the lowest-performing group (10 male participants in their 20s, each with no more than one year of vocal training) was selected for controlled experiments. Each of these participants recorded the same designated song segment four times (t1–t4). Although the participant pool was small, the repeated performances yielded a large and diverse dataset: 1,153 manually segmented syllables and 3,560 automatically segmented syllables were obtained for performance evaluation. Compared with expert manual segmentation, the proposed system achieved an average deviation within ±5% and a Pearson correlation coefficient above 0.95 across all acoustic measures (pitch, F1, F2, and intensity). These results demonstrate that the system can provide real-time, quantitative analysis of unstable pitch, timing, and breathing patterns in beginner vocalists, supporting practical use in vocal training and assessment.
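The pitch-range classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The thresholds (220 Hz and 350 Hz) come from the abstract; everything else is an assumption made for illustration: the function names `classify_pitch` and `label_syllable`, the convention that unvoiced frames are marked as 0 Hz, the treatment of averages between 349 and 350 Hz as mid, and the segment naming pattern.

```python
from statistics import mean

# Thresholds taken from the abstract: low < 220 Hz, mid 220-349 Hz, high >= 350 Hz.
LOW_UPPER_HZ = 220.0
HIGH_LOWER_HZ = 350.0


def classify_pitch(mean_f0_hz: float) -> str:
    """Map a syllable's average pitch (Hz) to 'low', 'mid', or 'high'.

    Assumption: values in the gap between 349 and 350 Hz count as 'mid'.
    """
    if mean_f0_hz <= 0:
        raise ValueError("average pitch must be positive")
    if mean_f0_hz < LOW_UPPER_HZ:
        return "low"
    if mean_f0_hz < HIGH_LOWER_HZ:
        return "mid"
    return "high"


def label_syllable(take: str, index: int, f0_frames_hz: list[float]) -> str:
    """Build a segment name from the take, syllable index, and pitch category.

    Assumptions: unvoiced frames are encoded as 0 Hz and are ignored when
    averaging; the '<take>_syl<NN>_<category>' pattern is hypothetical.
    """
    voiced = [f for f in f0_frames_hz if f > 0]
    if not voiced:
        raise ValueError("syllable contains no voiced frames")
    category = classify_pitch(mean(voiced))
    return f"{take}_syl{index:02d}_{category}"


if __name__ == "__main__":
    # Third syllable of the reference take t1, with one unvoiced frame.
    print(label_syllable("t1", 3, [0.0, 200.0, 210.0]))
```

Because the category depends only on the per-syllable average, the same function could be applied to syllables from t2–t4 once the t1 boundaries have been transferred to them.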

  Cite this article

[IEEE Style]

L. H. Woo and K. Y. Han, "Implementation of an Automatic Vocal Diagnosis System Based on Acoustic Indicators," The Transactions of the Korea Information Processing Society, vol. 14, no. 10, pp. 764-774, 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.10.764.

[ACM Style]

Lee Hyun Woo and Kim Young Han. 2025. Implementation of an Automatic Vocal Diagnosis System Based on Acoustic Indicators. The Transactions of the Korea Information Processing Society, 14, 10, (2025), 764-774. DOI: https://doi.org/10.3745/TKIPS.2025.14.10.764.