Statistical Speaker Diarization Using Dependent Combination of Extracted Features

Kadhim, Hasan Almgotir, Woo, Wai Lok and Dlay, Satnam (2016) Statistical Speaker Diarization Using Dependent Combination of Extracted Features. In: AIMS 2015 - 3rd International Conference on Artificial Intelligence, Modelling and Simulation, 2nd - 4th December 2015, Kota Kinabalu, Sabah, Malaysia.

Full text not available from this repository.

The paper describes a novel method that improves the procedure for supervised speaker diarization. The procedure assumes that a database of the speakers is available. First, the speaker database and the observation signal are prepared, and audio features are extracted from both. Instead of using only one of Mel Frequency Cepstral Coefficients, Perceptual Linear Prediction, or Power Normalized Cepstral Coefficients, a combination of all three is used. The features are combined independently, i.e. they are concatenated in the feature matrix. The features of the observation signal are then compared with the statistical properties of the database features, and the result of this comparison determines a logical mask. Bottom-up and top-down scenarios work together to reach the final decisions. A Diarization Error Rate test indicates that the combination of features produces fewer errors than any single feature set alone.
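
The concatenation and comparison steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of per-dimension mean and standard deviation as the "statistical properties", and the normalized-distance rule for building the logical mask are all assumptions made for illustration. The feature matrices are assumed to be already extracted, with one row per frame.

```python
import numpy as np


def concatenate_features(mfcc, plp, pncc):
    """Concatenate per-frame feature matrices (frames x dims) column-wise."""
    return np.hstack([mfcc, plp, pncc])


def speaker_statistics(features):
    """Per-dimension mean and standard deviation of one speaker's features."""
    # Small constant avoids division by zero (an assumption, not from the paper).
    return features.mean(axis=0), features.std(axis=0) + 1e-8


def logical_mask(observation, database):
    """Return speaker names and a boolean mask (frames x speakers).

    Each observation frame is assigned to the speaker whose statistics it
    is closest to, using a normalized Euclidean distance (assumed rule).
    """
    names = list(database)
    distances = []
    for name in names:
        mean, std = speaker_statistics(database[name])
        z = (observation - mean) / std
        distances.append(np.sqrt((z ** 2).sum(axis=1)))
    distances = np.stack(distances, axis=1)  # frames x speakers
    best = distances.argmin(axis=1)
    mask = np.zeros(distances.shape, dtype=bool)
    mask[np.arange(len(best)), best] = True
    return names, mask
```

In this sketch each column of the mask marks the frames attributed to one database speaker. Segment-level smoothing and the bottom-up and top-down decision stages mentioned in the abstract are not modelled.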

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Clustering, Mel Frequency Cepstral Coefficient, Perceptual Linear Predictive, Power Normalized Cepstral Coefficient, Segmentation, Speaker Diarization
Subjects: G400 Computer Science
Department: Faculties > Engineering and Environment > Computer and Information Sciences
Depositing User: Paul Burns
Date Deposited: 09 Apr 2019 12:01
Last Modified: 10 Oct 2019 20:16
