Collaborative relation annotation and quality analysis in Markyt environment

Bibliographic details
Main author: Pérez-Pérez, Martín
Publication date: 2017
Other authors: Pérez-Rodríguez, Gael; Fdez-Riverola, Florentino; Lourenço, Anália
Document type: Article
Language: English
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/1822/48123
Abstract: Text mining is showing potential to help in biomedical knowledge integration and discovery at various levels. However, results depend largely on the specifics of the knowledge problem and, in particular, on the ability to produce high-quality benchmarking corpora that may support the training and evaluation of automatic prediction systems. Annotation tools enabling the flexible and customizable production of such corpora are thus pivotal. The open-source Markyt annotation environment brings together the latest web technologies to offer a wide range of annotation capabilities in a domain-agnostic way. It enables the management of multi-user and multi-round annotation projects, including inter-annotator agreement and consensus assessments. Markyt also supports the description of entity and relation annotation guidelines on a per-project basis, and it accommodates partial word tagging and overlapping annotations. This paper describes the current release of Markyt, namely new annotation perspectives, which enable the annotation of relations among entities, and enhanced analysis capabilities. Several demos, inspired by public biomedical corpora, are presented as a means to better illustrate such functionalities. Markyt aims to bring together annotation capabilities of broad interest to those producing annotated corpora. Markyt demonstration projects describe 20 different annotation tasks of varied document sources (e.g. abstracts, tweets or drug labels) and languages (e.g. English, Spanish or Chinese). Continuous development is based on feedback from practical applications as well as community reports on short- and medium-term mining challenges. Markyt is freely available for non-commercial use at http://markyt.org.
id RCAP_3e6a6bb61884638b8fbfecb5481983e3
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/48123
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
Funding: This work was partially supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). The authors also acknowledge the PhD grants of M.P.-P. and G.P.-R., funded by the Xunta de Galicia.
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Pérez-Pérez, Martín
Pérez-Rodríguez, Gael
Fdez-Riverola, Florentino
Lourenço, Anália
dc.subject.por.fl_str_mv Science & Technology
publishDate 2017
dc.date.none.fl_str_mv 2017
2017-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/48123
url http://hdl.handle.net/1822/48123
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Fdez-Riverola, Florentino; Lourenço, Anália, Collaborative relation annotation and quality analysis in Markyt environment. Database, 2017(bax090), 2017.
0162-4105
1758-0463
10.1093/database/bax090
29220479
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Oxford University Press
publisher.none.fl_str_mv Oxford University Press
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP