Past, present and future of the Speech Transmission Index
Herman Steeneken, TNO Human Factors, Holland

Connected discourse can be considered as a sequence of the smallest speech items, the phonemes. Each phoneme is represented by a specific frequency spectrum. For a speech token to be recognized, the differences between the spectra of the phonemes must be preserved to some degree. These spectral differences are related to the fluctuations in the envelope of the speech signal or, more specifically, to the envelope functions of the several frequency bands that together cover the frequency range of a speech signal.

Distortion of the speech signal, such as noise or reverberation, reduces the differences between the spectra of the various phonemes. The degradation of the original envelope functions is related to the intelligibility of the corresponding speech signal. The objective speech intelligibility prediction method, the Speech Transmission Index (STI), is based on these phenomena. Rather than determining speech intelligibility with real speech from a talker and judgments by listeners, the method uses an artificial, speech-like test signal. The degradation of this test signal after transmission through a channel is analysed and expressed as an index that ranges from 0 to 1. A unique relation between various subjective intelligibility measures and the objective STI has been derived. This paper summarizes subjective intelligibility measures and discusses their relation to the STI.

The first developments of the STI started in 1970 and resulted in a simple measuring device for use with radio communication channels. Since then, the method has been improved by the introduction of the modulation transfer function (MTF, related to the envelope function) and by its application to temporally distorted speech (reverberation, echoes, automatic gain control). The STI method is standardized by IEC (IEC 60268-16), and intelligibility criteria for use in auditoria, offices and workshops are standardized by ISO (ISO 9921) and in various national standards. Although the concept of the STI will not change, practical improvements will be made in the future. For example, the test signal will be replaced by a speech signal, which will facilitate the use of the STI during a live performance in an auditorium. Applications for non-native speakers and the hearing impaired, and predictions for room acoustics, are also in progress. Several commercially available, low-cost measuring devices are on the market, which makes the method easy to apply.
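
To make concrete how a loss of envelope modulation can be mapped onto an index between 0 and 1, the following is a minimal sketch and not the standardized measurement procedure. It assumes the commonly cited closed-form MTF for stationary noise combined with an exponentially decaying reverberant field, treats a single frequency band only, and averages the modulation frequencies without weighting. The function names, the chosen modulation frequencies and the parameters `snr_db` and `rt60_s` are illustrative assumptions that do not appear in the text above.

```python
import math

# Assumed set of 14 modulation frequencies (Hz) at one-third-octave spacing.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_index(f_mod_hz, snr_db, rt60_s):
    """Modulation reduction factor m(F) for noise plus reverberation.

    Assumes stationary noise at signal-to-noise ratio snr_db (dB) and an
    exponential decay with reverberation time rt60_s (seconds).
    """
    noise_term = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    reverb_term = 1.0 / math.sqrt(
        1.0 + (2.0 * math.pi * f_mod_hz * rt60_s / 13.8) ** 2
    )
    return noise_term * reverb_term


def transmission_index(m):
    """Map a modulation index m (0..1) to a transmission index (0..1)."""
    # Keep m away from 0 and 1 so the logarithm stays finite.
    m = min(max(m, 1e-6), 1.0 - 1e-6)
    apparent_snr_db = 10.0 * math.log10(m / (1.0 - m))
    # Limit the apparent signal-to-noise ratio to a 30 dB range.
    apparent_snr_db = min(max(apparent_snr_db, -15.0), 15.0)
    return (apparent_snr_db + 15.0) / 30.0


def simplified_sti(snr_db, rt60_s):
    """Unweighted mean transmission index over the modulation frequencies.

    Single-band simplification: band weighting and auditory masking
    corrections are deliberately left out.
    """
    indices = [
        transmission_index(modulation_index(f, snr_db, rt60_s))
        for f in MOD_FREQS_HZ
    ]
    return sum(indices) / len(indices)


if __name__ == "__main__":
    for snr_db, rt60_s in [(30.0, 0.5), (30.0, 2.0), (0.0, 0.5)]:
        value = simplified_sti(snr_db, rt60_s)
        print(f"SNR {snr_db:5.1f} dB, RT60 {rt60_s:.1f} s -> index {value:.2f}")
```

Under these assumptions, both a lower signal-to-noise ratio and a longer reverberation time reduce the index, which mirrors the qualitative behaviour described above for noise and temporal distortion.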