Learning from multiple annotators: distinguishing good from random labelers

Rodrigues, Filipe; Pereira, Francisco; Ribeiro, Bernardete

Bibliographic details
Main author:	Rodrigues, Filipe
Publication date:	2013
Other authors:	Pereira, Francisco; Ribeiro, Bernardete
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10316/27407 https://doi.org/10.1016/j.patrec.2013.05.012
Abstract:	With the increasing popularity of online crowdsourcing platforms such as Amazon Mechanical Turk (AMT), building supervised learning models for datasets with multiple annotators is receiving increasing attention from researchers. These platforms provide an inexpensive and accessible resource for obtaining labeled data, and in many situations the quality of the labels competes directly with that of experts. For these reasons, much attention has recently been given to annotator-aware models. In this paper, we propose a new probabilistic model for supervised learning with multiple annotators in which the reliability of the different annotators is treated as a latent variable. We show empirically that this model achieves state-of-the-art performance while reducing the number of model parameters, thus avoiding potential overfitting. Furthermore, the proposed model is easier to implement and to extend to other classes of learning problems, such as sequence labeling tasks.

Item metadata

id	RCAP_3a9a30386cabdc89299a164113f89048
oai_identifier_str	oai:estudogeral.uc.pt:10316/27407
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
title	Learning from multiple annotators: distinguishing good from random labelers
author	Rodrigues, Filipe
author_facet	Rodrigues, Filipe; Pereira, Francisco; Ribeiro, Bernardete
author_role	author
author2	Pereira, Francisco; Ribeiro, Bernardete
author2_role	author author
dc.contributor.author.fl_str_mv	Rodrigues, Filipe; Pereira, Francisco; Ribeiro, Bernardete
dc.subject.por.fl_str_mv	Multiple annotators; Crowdsourcing; Latent variable models; Expectation–Maximization; Logistic Regression
topic	Multiple annotators; Crowdsourcing; Latent variable models; Expectation–Maximization; Logistic Regression
publishDate	2013
dc.date.none.fl_str_mv	2013-09-01
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10316/27407 https://doi.org/10.1016/j.patrec.2013.05.012
url	http://hdl.handle.net/10316/27407 https://doi.org/10.1016/j.patrec.2013.05.012
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	RODRIGUES, Filipe; PEREIRA, Francisco; RIBEIRO, Bernardete - Learning from multiple annotators: distinguishing good from random labelers. "Pattern Recognition Letters". ISSN 0167-8655. Vol. 34, No. 12 (2013), p. 1428-1436. http://www.sciencedirect.com/science/article/pii/S016786551300202X
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Elsevier
publisher.none.fl_str_mv	Elsevier
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_	1799133873990270976
