NEREL-BIO: A dataset of biomedical abstracts annotated with nested named entities
Electronic Scientific Archive of UrFU
Field | Value |
Title |
NEREL-BIO: A dataset of biomedical abstracts annotated with nested named entities
|
|
Author |
Loukachevitch, N.
Manandhar, S. Baral, E. Rozhkov, I. Braslavski, P. Ivanov, V. Batura, T. Tutubalina, E. |
|
Subject |
ARTICLE
HUMAN; HUMAN EXPERIMENT; LANGUAGE; MEDLINE; READING; NATURAL LANGUAGE PROCESSING; SEMANTICS; PUBMED |
|
Description |
Motivation: This article describes NEREL-BIO, an annotation scheme and corpus of PubMed abstracts in Russian and a smaller number of abstracts in English. NEREL-BIO extends the general-domain dataset NEREL by introducing domain-specific entity types. The NEREL-BIO annotation scheme covers both the general and biomedical domains, making it suitable for domain transfer experiments. NEREL-BIO provides annotation for nested named entities as an extension of the scheme employed for NEREL. Nested named entities may cross entity boundaries to connect to shorter entities nested within longer entities, making them harder to detect. Results: NEREL-BIO contains annotations for 700+ Russian and 100+ English abstracts. All English PubMed annotations have corresponding Russian counterparts. Thus, NEREL-BIO has the following specific features: it annotates nested named entities, and it can be used as a benchmark for cross-domain (NEREL → NEREL-BIO) and cross-language (English → Russian) transfer. We experiment with both transformer-based sequence models and machine reading comprehension models and report their results. © 2023 The Author(s). Published by Oxford University Press.
This work was supported by the Russian Science Foundation (RSF) [20-11-20166]. |
|
Date |
2024-04-05T16:20:28Z
2023 |
|
Type |
Article
Journal article (info:eu-repo/semantics/article) | info:eu-repo/semantics/publishedVersion |
|
Identifier |
Loukachevitch, N, Manandhar, S, Baral, E, Rozhkov, I, Braslavski, P, Batura, T, Ivanov, V & Tutubalina, E 2023, 'NEREL-BIO: a dataset of biomedical abstracts annotated with nested named entities', Bioinformatics, Vol. 39, No. 4, btad161. https://doi.org/10.1093/bioinformatics/btad161
Loukachevitch, N., Manandhar, S., Baral, E., Rozhkov, I., Braslavski, P., Batura, T., Ivanov, V., & Tutubalina, E. (2023). NEREL-BIO: a dataset of biomedical abstracts annotated with nested named entities. Bioinformatics, 39(4), [btad161]. https://doi.org/10.1093/bioinformatics/btad161
1367-4803
Final
All Open Access, Gold, Green
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85153975102&doi=10.1093%2fbioinformatics%2fbtad161&partnerID=40&md5=ef500623e5009d9fb96fe419fc49b5e0
https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btad161/49726688/btad161.pdf
http://elar.urfu.ru/handle/10995/130432
10.1093/bioinformatics/btad161
85153975102
000978997300001 |
|
Language |
en
|
|
Related resources |
info:eu-repo/grantAgreement/RSF//20-11-20166
|
|
Rights |
Open access (info:eu-repo/semantics/openAccess)
cc-by https://creativecommons.org/licenses/by/4.0/ |
|
Format |
application/pdf
|
|
Publisher |
Oxford University Press
|
|
Source |
Bioinformatics |
|