
Natural language processing for clusterization of genes according to their functions

Electronic Scientific Archive of UrFU

 
 
Field Value
 
Title Natural language processing for clusterization of genes according to their functions
 
Author Dordiuk, V.
Demicheva, E.
Espino, F. P.
Ushenin, K.
 
Subject BERT
CLUSTERIZATION
DIFFERENTIAL GENE EXPRESSION ANALYSIS
GENE EXPRESSION
GENE ONTOLOGY
NATURAL LANGUAGE PROCESSING
SEMANTIC ANALYSIS
NATURAL LANGUAGE PROCESSING SYSTEMS
PIPELINES
SEMANTICS
TEXT PROCESSING
DIFFERENTIAL GENE EXPRESSION ANALYSE
DIFFERENTIAL GENE EXPRESSIONS
GENE EXPRESSION ANALYSIS
GENES EXPRESSION
LANGUAGE PROCESSING
NATURAL LANGUAGES
 
Description There are hundreds of methods for analyzing data obtained from mRNA sequencing. Most of them focus on a small number of genes. In this study, we propose an approach that reduces the analysis of several thousand genes to the analysis of several clusters. The list of genes is enriched with information from open databases. Then, the descriptions are encoded as vectors using a pretrained language model (BERT) and some text processing approaches. The encoded gene functions pass through dimensionality reduction and clusterization. To find the most efficient pipeline, 180 pipeline variants with different methods at the major pipeline steps were analyzed. Performance was evaluated with clusterization indices and expert review of the results. © 2022 IEEE.
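The encode, reduce, cluster flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: bag-of-words counts stand in for BERT embeddings, PCA (computed by power iteration) stands in for the dimensionality reduction step, and k-means stands in for clusterization, since this record does not say which methods were compared. The gene identifiers and descriptions are hypothetical toy data, and all function names are illustrative.

```python
import math
from collections import Counter


def encode(descriptions):
    """Turn each description into a bag-of-words count vector.

    Stand-in only: the study encodes descriptions with a pretrained BERT model.
    """
    tokenized = [text.lower().split() for text in descriptions]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[word]) for word in vocab])
    return vectors


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reduce_pca(vectors, n_components=2, iterations=100):
    """Project vectors onto their leading principal components.

    Uses power iteration with deflation. Assumes the fixed start vector is
    not orthogonal to the component being sought.
    """
    n, dim = len(vectors), len(vectors[0])
    means = [sum(col) / n for col in zip(*vectors)]
    rows = [[x - m for x, m in zip(v, means)] for v in vectors]  # center the data
    reduced = [[] for _ in vectors]
    for _component in range(n_components):
        w = [1.0 / (j + 1) for j in range(dim)]  # deterministic start vector
        for _step in range(iterations):
            scores = [dot(r, w) for r in rows]  # X w
            w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]  # X^T (X w)
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break  # no variance left to explain
            w = [x / norm for x in w]
        scores = [dot(r, w) for r in rows]
        for point, s in zip(reduced, scores):
            point.append(s)
        # Deflate: remove the found direction before looking for the next one.
        rows = [[x - s * wj for x, wj in zip(r, w)] for r, s in zip(rows, scores)]
    return reduced


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, max_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _step in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy descriptions; real input would come from open databases.
descriptions = {
    "gene_a": "dna repair enzyme",
    "gene_b": "dna repair protein",
    "gene_c": "membrane ion channel",
    "gene_d": "membrane ion transporter",
}
labels = kmeans(reduce_pca(encode(list(descriptions.values()))), k=2)
clusters = dict(zip(descriptions, labels))
print(clusters)
```

On this toy input the two "dna repair" genes end up in one cluster and the two "membrane ion" genes in the other. Swapping any of the three functions for another method is how a grid of pipeline variants, like the 180 mentioned above, could be explored.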
 
Date 2024-04-08T11:07:10Z
2022
 
Type Conference paper
Conference object (info:eu-repo/semantics/conferenceObject)
info:eu-repo/semantics/submittedVersion
 
Identifier Dordiuk, V, Demicheva, E, Espino, FP & Ushenin, K 2022, Natural language processing for clusterization of genes according to their functions. In Proceedings - 2022 Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2022. Proceedings - 2022 Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2022, Institute of Electrical and Electronics Engineers Inc., pp. 1-4. https://doi.org/10.1109/CSGB56354.2022.9865330
Dordiuk, V., Demicheva, E., Espino, F. P., & Ushenin, K. (2022). Natural language processing for clusterization of genes according to their functions. In Proceedings - 2022 Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2022 (pp. 1-4). (Proceedings - 2022 Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2022). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CSGB56354.2022.9865330
978-166545288-5
Final
All Open Access; Green Open Access
https://arxiv.org/pdf/2207.08162
http://elar.urfu.ru/handle/10995/131416
10.1109/CSGB56354.2022.9865330
85138478040
 
Language en
 
Rights Open access (info:eu-repo/semantics/openAccess)
 
Format application/pdf
 
Publisher Institute of Electrical and Electronics Engineers Inc.
 
Source 2022 Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine (CSGB)
Proceedings - 2022 Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2022