Xavier Domont, Martin Heckmann, Heiko Wersing, Frank Joublin, and Christian Goerick (2007)
A Hierarchical Model for Syllable Recognition
In: European Symposium on Artificial Neural Networks (ESANN). d-side publications, Bruges, Belgium, pages 573--578.
Inspired by recent findings on the similarities between the primary auditory and visual cortex, we propose a neural network for speech recognition based on a hierarchical feedforward architecture for visual object recognition. When a Gammatone filterbank is used for the spectral analysis, the resulting spectrograms of syllables can be interpreted as images. After a preprocessing step that enhances the formants in the speech signal, followed by a length normalization, the images can then be fed into the visual hierarchy. We demonstrate the validity of our approach on the recognition of 25 different monosyllabic words and compare the results to the Sphinx-4 speech recognition system. Our hierarchical model achieves an improvement for high noise levels.
Created by mheckmann - 2007-01-26 14:05
Last modified by - 2007-11-13 11:03


