Chen, Jiacheng and Sun, Xia and Jin, Xin and Sutcliffe, Richard (2022) Extracting drug-drug interactions from no-blinding texts using key semantic sentences and GHM loss. Journal of Biomedical Informatics, 135. p. 104192. DOI https://doi.org/10.1016/j.jbi.2022.104192
Abstract
The extraction of drug-drug interactions (DDIs) is an important task in biomedical research, as it can reduce unexpected health risks during patient treatment. Previous work indicates that methods using external drug information perform much better than methods that do not. However, using external drug information is time-consuming and resource-intensive. In this work, we propose a novel method for extracting DDIs which does not use external drug information but still achieves comparable performance. First, we no longer convert drug names to standard tokens such as DRUG0, the method commonly used in previous research. Instead, full drug names with drug entity marking are input to BioBERT, allowing us to enhance the selected drug entity pair. Second, we adopt the Key Semantic Sentence approach to emphasize the words closely related to the DDI relation of the selected drug pair. Together, these two steps significantly reduce the misclassification of similar instances that are created from the same sentence but correspond to different pairs of drug entities. Third, we employ the Gradient Harmonizing Mechanism (GHM) loss to reduce the weight of mislabeled instances and easy-to-classify instances, both of which can lead to poor performance in DDI extraction. Overall, we demonstrate that it is better not to use drug blinding with BioBERT, and show that GHM loss performs better than Cross-Entropy loss when the proportion of label noise is less than 30%. The proposed model achieves state-of-the-art results with an F1-score of 84.13% on the DDIExtraction 2013 corpus (a standard English DDI corpus), closing the 4% performance gap between methods that rely on external drug information and those that do not.
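The difference between drug blinding and entity marking described in the abstract can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function names `blind` and `mark`, the marker tokens `<e1>`/`<e2>`, the example sentence, and the character-offset span format are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def blind(sentence: str, spans: List[Span]) -> str:
    """Drug blinding: replace every drug mention with DRUG0, DRUG1, ..."""
    out, prev = [], 0
    for i, (start, end) in enumerate(sorted(spans)):
        out.append(sentence[prev:start])
        out.append(f"DRUG{i}")
        prev = end
    out.append(sentence[prev:])
    return "".join(out)


def mark(sentence: str, spans: List[Span], pair: Tuple[int, int]) -> str:
    """No blinding: keep full drug names, wrap only the selected pair.

    The <e1>/<e2> marker tokens are hypothetical placeholders.
    """
    tags: Dict[int, str] = {pair[0]: "e1", pair[1]: "e2"}
    out, prev = [], 0
    for i, (start, end) in enumerate(sorted(spans)):
        out.append(sentence[prev:start])
        name = sentence[start:end]
        if i in tags:
            out.append(f"<{tags[i]}> {name} </{tags[i]}>")
        else:
            out.append(name)  # unselected drugs stay as plain text
        prev = end
    out.append(sentence[prev:])
    return "".join(out)


# Made-up example sentence with three drug mentions.
sentence = "Aspirin may increase the effect of warfarin and heparin."
names = ["Aspirin", "warfarin", "heparin"]
spans = [(sentence.index(n), sentence.index(n) + len(n)) for n in names]

print(blind(sentence, spans))
print(mark(sentence, spans, pair=(0, 2)))
```

With blinding, every instance built from this sentence shares the same anonymised token pattern, whereas marking keeps the drug names and changes only which pair is wrapped, so instances for different pairs differ visibly in the model input.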
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Drug–drug interactions; Drug blinding; Data imbalance; Label-noise |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 03 Nov 2022 17:53 |
| Last Modified: | 30 Oct 2024 21:03 |
| URI: | http://repository.essex.ac.uk/id/eprint/33824 |
Available files
Filename: 1-s2.0-S1532046422001988-main.pdf
Licence: Creative Commons: Attribution-Noncommercial-No Derivative Works 3.0