Dysfluency Recognition by using Spectral Entropy Features
Vinay N A¹, Bharathi S H²
¹Vinay N A, School of ECE, REVA University, Bangalore, India.
²Bharathi S H, School of ECE, REVA University, Bangalore, India.
Manuscript received on July 20, 2019. | Revised Manuscript received on August 10, 2019. | Manuscript published on August 30, 2019. | PP: 517-520 | Volume-8 Issue-6, August 2019. | Retrieval Number: F7881088619/2019©BEIESP | DOI: 10.35940/ijeat.F7881.088619
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: In recent decades, speech recognition technology has improved significantly, but its output is still limited to a stream of words. Such systems assist humans effectively with structured speech, but are less effective for unstructured speech, because an unstructured stream of words carries little useful information about pronunciation and punctuation. This structural information can be recovered by detecting the position of each phone in a sentence, that is, by locating the sentence boundaries, repeated words, and missing phones in each phrase. The proposed work investigates spectral entropy features for the automatic detection of voiced and non-voiced regions in dysfluent speech recognition. The entropy features are estimated by normalizing the Fourier transform spectrum so that it can be treated as a probability mass function (PMF). Entropy is low for speech with clear formants and high for the flat spectral distribution of silence or of noise in the speech sample. A comparison of entropy features with Word Error Rate is presented in the proposed work.
Keywords: Dysfluency, MHFCC, Spectral Entropy, Speech recognition, IMF, Phones, Word Error Rate (WER).
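The entropy estimate described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Hann window, the frame length and hop (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate), the base-2 logarithm, and the fixed decision threshold are all assumptions made for illustration. Each frame's power spectrum is normalized to sum to one so that it behaves as a PMF, and the Shannon entropy of that PMF is the feature.

```python
import numpy as np


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of one frame's normalized power spectrum."""
    power = np.abs(np.fft.rfft(frame)) ** 2      # one-sided power spectrum
    pmf = power / (power.sum() + eps)            # normalize so the bins sum to ~1
    return float(-np.sum(pmf * np.log2(pmf + eps)))


def frame_entropies(signal, frame_len=400, hop=160):
    """Spectral entropy of each Hann-windowed frame (frame sizes are assumed)."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array(
        [spectral_entropy(signal[s:s + frame_len] * window) for s in starts]
    )


def voiced_mask(entropies, threshold):
    """Mark frames as voiced when entropy falls below an assumed threshold."""
    return np.asarray(entropies) < threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(1600) / 16000.0                # 0.1 s at an assumed 16 kHz
    # Harmonic tone as a crude stand-in for a voiced sound with clear peaks.
    tone = sum(np.sin(2 * np.pi * f * t) for f in (200.0, 400.0, 600.0))
    # Low-level white noise as a stand-in for a noisy pause.
    pause = 0.01 * rng.standard_normal(t.size)
    print("tone :", frame_entropies(tone).round(2))
    print("pause:", frame_entropies(pause).round(2))
```

Because the spectrum is normalized per frame, the feature depends on spectral shape rather than signal level, which is why a quiet noisy pause still yields a high value. A frame of exact digital zeros has no spectrum to normalize and returns zero in this sketch, so a practical detector would probably need an additional energy check; the threshold itself would have to be tuned on real data.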