D-Confidence: an active learning strategy to reduce label disclosure complexity in the presence of imbalanced class distributions
Main author: | Alípio Jorge |
---|---|
Publication date: | 2012 |
Other authors: | Nuno Escudeiro |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://repositorio.inesctec.pt/handle/123456789/2829 http://dx.doi.org/10.1007/s13173-012-0069-3 |
Abstract: | In some classification tasks, such as those related to the automatic building and maintenance of text corpora, it is expensive to obtain labeled instances to train a classifier. In such circumstances it is common to have massive corpora where a few instances are labeled (typically a minority) while others are not. Semi-supervised learning techniques try to leverage the intrinsic information in unlabeled instances to improve classification models. However, these techniques assume that the labeled instances cover all the classes to learn, which might not be the case. Moreover, in the presence of an imbalanced class distribution, getting labeled instances from minority classes might be very costly, requiring extensive labeling, if queries are randomly selected. Active learning allows asking an oracle to label new instances, which are selected by criteria, aiming to reduce the labeling effort. D-Confidence is an active learning approach that is effective when in pres- enc |
id |
RCAP_ea3c5d20806c36b3438671f37521414d |
---|---|
oai_identifier_str |
oai:repositorio.inesctec.pt:123456789/2829 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
author |
Alípio Jorge |
author_facet |
Alípio Jorge Nuno Escudeiro |
author_role |
author |
author2 |
Nuno Escudeiro |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Alípio Jorge Nuno Escudeiro |
publishDate |
2012 |
dc.date.none.fl_str_mv |
2012-01-01T00:00:00Z 2012 2017-11-16T14:10:56Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://repositorio.inesctec.pt/handle/123456789/2829 http://dx.doi.org/10.1007/s13173-012-0069-3 |
dc.language.iso.fl_str_mv |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799131607618027520 |