Establishment of a normative cepstral pediatric acoustic database

JAMA Otolaryngol Head Neck Surg. 2015 Apr;141(4):358-63. doi: 10.1001/jamaoto.2014.3545.

Abstract

Importance: Few studies have used objective measures to evaluate the development of the normal pediatric voice. Cepstral analysis of continuous speech samples is a reliable method for gathering acoustic data; however, it has not been used to examine the changes that occur with voice development.

Objective: To establish and characterize acoustic patterns of the normal pediatric voice using cepstral analysis of voice samples from a normal pediatric voice database.

Design, setting, and participants: Cross-sectional study of 218 children aged 4 to 17 years, for whom English was the primary language spoken at home, conducted at a pediatric otolaryngology practice and pediatric practice in a tertiary hospital (April 2012-May 2014).

Interventions and exposures: Sustained vowel utterances and continuous speech samples (4 Consensus Auditory-Perceptual Evaluation of Voice [CAPE-V] sentences and 2 sentences from the Rainbow Passage) were recorded from children with normal voices and analyzed.

Main outcomes and measures: Normal values were collected for the acoustic measures studied (ie, fundamental frequency, cepstral peak fundamental frequency, cepstral peak prominence [CPP], low-to-high spectral ratio [L/H ratio], and cepstral-spectral index of dysphonia in recorded phrases) and compiled into a normative acoustic database.

Results: Significant changes in fundamental frequency were observed with a distinct shift in slope at ages 11 and 14 years in boys for sustained vowel (ages 4-11 years, -6.83 Hz/y [P < .001]; 11-14 years, -27.62 Hz/y [P < .001]; and 14-17 years, -5.68 Hz/y [P = .001]), all voiced (ages 4-11 years, -4.19 Hz/y [P = .002]; 11-14 years, -29.42 Hz/y [P < .001]; and 14-17 years, -4.63 Hz/y [P < .001]), glottal attack (ages 4-11 years, -4.51 Hz/y; 11-14 years, -27.23 Hz/y; and 14-17 years, -1.70 Hz/y [P < .001 for all]), and rainbow (ages <14 years, -20.68 Hz/y [P < .001]; and 14-17 years, -4.50 Hz/y [P = .001]) recordings. A decreasing linear trend in fundamental frequency among all recordings (vowel, all voiced, easy onset, glottal attack, plosives, and rainbow) was found in girls (-2.56 Hz/y [P < .001], -3.48 Hz/y [P < .001], -2.82 Hz/y [P < .001], -3.49 Hz/y [P < .001], -2.30 Hz/y [P < .001], and -2.98 Hz/y [P = .01], respectively). A linear increase in CPP was seen with age in boys, with significant changes seen in recordings for vowel (0.10 dB/y [P = .05]), all voiced (0.2 dB/y [P < .001]), easy onset (0.13 dB/y [P < .001]), glottal attack (0.12 dB/y [P < .001]), plosives (0.15 dB/y [P < .001]), and rainbow (0.17 dB/y [P = .006]). A significant linear increase in CPP for girls was only seen in all voiced (0.13 dB/y [P < .001]). L/H ratio showed a linear increase with age among all speech samples (vowel, all voiced, easy onset, glottal attack, plosives, and rainbow) in boys (1.14 dB/y [P < .001], 0.92 dB/y [P < .001], 1.19 dB/y [P < .001], 0.79 dB/y [P < .001], 0.69 dB/y [P < .001], and 0.54 dB/y [P = .002], respectively) and girls (0.96 dB/y, 0.60 dB/y, 0.75 dB/y, 0.37 dB/y, 0.44 dB/y, and 0.58 dB/y, respectively [P ≤ .001 for all]).
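
To give a sense of scale for the slopes above, the following minimal Python sketch sums each reported sustained-vowel slope over the age interval it covers. It assumes that each slope applies uniformly across its interval and that the cross-sectional slopes can be read as an age trajectory; neither assumption is stated in the abstract, and the totals it prints are illustrative arithmetic, not reported results. The names `cumulative_change`, `BOYS_VOWEL_SEGMENTS`, and `GIRLS_VOWEL_SEGMENTS` are hypothetical.

```python
# Sustained-vowel fundamental frequency slopes (Hz per year) from the Results.
# Each tuple is (start_age, end_age, slope). Treating a slope as constant
# across its whole interval is an assumption made for this illustration.
BOYS_VOWEL_SEGMENTS = [
    (4, 11, -6.83),
    (11, 14, -27.62),
    (14, 17, -5.68),
]
GIRLS_VOWEL_SEGMENTS = [
    (4, 17, -2.56),  # single linear trend reported for girls
]


def cumulative_change(segments, start_age, end_age):
    """Sum slope * years over the part of each segment inside [start_age, end_age]."""
    total = 0.0
    for seg_start, seg_end, slope in segments:
        years = max(0.0, min(seg_end, end_age) - max(seg_start, start_age))
        total += slope * years
    return total


boys_total = cumulative_change(BOYS_VOWEL_SEGMENTS, 4, 17)
girls_total = cumulative_change(GIRLS_VOWEL_SEGMENTS, 4, 17)

print(f"Boys, ages 4-17:  {boys_total:.2f} Hz")
print(f"Girls, ages 4-17: {girls_total:.2f} Hz")
```

Under these assumptions the implied net change from age 4 to 17 years is roughly -148 Hz for boys, more than half of it falling in the 11-14 year interval, versus roughly -33 Hz for girls.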

Conclusions and relevance: This work represents the first pediatric voice database using frequency-based acoustic measures. Our goal was to characterize the changes that occur in both male and female voices as children age. These findings illustrate how acoustic measurements change with development and may aid in our understanding of the developing voice, pathologic changes, and response to treatment.

MeSH terms

  • Adolescent
  • Age Factors*
  • Child
  • Child, Preschool
  • Cross-Sectional Studies
  • Databases, Factual*
  • Female
  • Humans
  • Male
  • Phonation / physiology
  • Reference Values
  • Sex Factors
  • Sound Spectrography
  • Speech Acoustics*
  • Voice / physiology*