Retta, Ephrem Afele and Sutcliffe, Richard and Almekhlafi, Eiad and Enku, Yosef Kefyalew and Alemu, Eyob and Gemechu, Tigist Demssice and Berwo, Michael Abebe and Mhamed, Mustafa and Feng, Jun (2023) Kiñit classification in Ethiopian chants, Azmaris and modern music: A new dataset and CNN benchmark. PLoS One, 18 (4). e0284560-e0284560. DOI https://doi.org/10.1371/journal.pone.0284560
Abstract
In this paper, we create EMIR, the first-ever Music Information Retrieval dataset for Ethiopian music. EMIR is freely available for research purposes and contains 600 sample recordings of Orthodox Tewahedo chants, traditional Azmari songs and contemporary Ethiopian secular music. Each sample is classified by five expert judges into one of four well-known Ethiopian Kiñits: Tizita, Bati, Ambassel and Anchihoye. Each Kiñit uses its own pentatonic scale and also has its own stylistic characteristics. Thus, Kiñit classification needs to combine scale identification with genre recognition. After describing the dataset, we present the Ethio Kiñits Model (EKM), based on VGG, for classifying the EMIR clips. In Experiment 1, we investigated whether Filterbank, Mel-spectrogram, Chroma, or Mel-frequency Cepstral coefficient (MFCC) features work best for Kiñit classification using EKM. MFCC was found to be superior and was therefore adopted for Experiment 2, where the performance of EKM models using MFCC was compared using three different audio sample lengths. A length of 3 s gave the best results. In Experiment 3, EKM and four existing models were compared on the EMIR dataset: AlexNet, ResNet50, VGG16 and LSTM. EKM was found to have the best accuracy (95.00%) as well as the fastest training time. However, the performance of VGG16 (93.00%) was found not to be significantly worse (P < 0.01). We hope this work will encourage others to explore Ethiopian music and to experiment with other models for Kiñit classification.
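As a rough illustration of the fixed-length clips compared in Experiment 2, the sketch below cuts a waveform into non-overlapping 3 s segments, each of which could then be turned into features such as MFCCs. It is not the authors' preprocessing code: the function name `split_into_clips`, the 16 kHz sampling rate, the synthetic test tone, and the choice to drop the trailing partial segment are all assumptions made for the example.

```python
import numpy as np


def split_into_clips(waveform, sample_rate, clip_seconds=3.0):
    """Cut a 1-D waveform into non-overlapping clips of clip_seconds each.

    Hypothetical helper: any trailing audio shorter than one clip is dropped.
    """
    clip_len = int(round(clip_seconds * sample_rate))
    if clip_len <= 0:
        raise ValueError("clip_seconds * sample_rate must be positive")
    n_clips = len(waveform) // clip_len
    # Trim to a whole number of clips, then view as (n_clips, clip_len).
    return waveform[: n_clips * clip_len].reshape(n_clips, clip_len)


# Assumed sampling rate; the abstract does not state one.
sr = 16000
# Ten seconds of a 440 Hz sine tone standing in for a real recording.
t = np.arange(10 * sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

clips = split_into_clips(tone, sr, clip_seconds=3.0)
print(clips.shape)  # (3, 48000): three full 3 s clips, the last 1 s is dropped
```

Changing `clip_seconds` gives the other segment lengths one might compare; the abstract names only the 3 s setting explicitly.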
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Benchmarking; Datasets as Topic; Ethiopia; Humans; Music; Singing |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 26 Sep 2023 15:40 |
Last Modified: | 30 Oct 2024 21:01 |
URI: | http://repository.essex.ac.uk/id/eprint/36489 |
Available files
Filename: journal.pone.0284560.pdf
Licence: Creative Commons: Attribution 4.0