The impact of NLP techniques in the multilabel text classification problem

Gonçalves, Teresa; Quaresma, Paulo


Bibliographic details
Main author:	Gonçalves, Teresa
Publication date:	2004
Other authors:	Quaresma, Paulo
Document type:	Article
Language:	English
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10174/2558
Abstract:	Support Vector Machines have been used successfully to classify text documents into sets of concepts. However, typically, linguistic information is not being used in the classification process or its use has not been fully evaluated. We apply and evaluate two basic linguistic procedures (stop-word removal and stemming/lemmatization) to the multilabel text classification problem. These procedures are applied to the Reuters dataset and to the Portuguese juridical documents from Supreme Courts and Attorney General’s Office.
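
The two linguistic procedures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the suffix-stripping rule, and the function names `naive_stem`, `preprocess`, and `binary_targets` are all assumptions made for illustration, and the one-binary-problem-per-label decomposition is a common way to feed multilabel data to a binary classifier such as an SVM, not something this record confirms the paper uses.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are", "on"}

# Naive suffix stripping that stands in for a real stemmer or lemmatizer.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, remove_stop_words=True, stem=True):
    """Lowercase and tokenize, then optionally drop stop words and stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return tokens


def binary_targets(label_sets, label):
    """Turn multilabel annotations into +1/-1 targets for a single label."""
    return [1 if label in labels else -1 for labels in label_sets]
```

Switching `remove_stop_words` and `stem` on and off independently gives the four preprocessing variants whose effect on classification quality could then be compared.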

Item metadata

id	RCAP_4f6e13ffb7f611e5be0ca3dc916f5273
oai_identifier_str	oai:dspace.uevora.pt:10174/2558
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	The impact of NLP techniques in the multilabel text classification problem
author	Gonçalves, Teresa
author_facet	Gonçalves, Teresa Quaresma, Paulo
author_role	author
author2	Quaresma, Paulo
author2_role	author
dc.contributor.author.fl_str_mv	Gonçalves, Teresa Quaresma, Paulo
dc.subject.por.fl_str_mv	machine learning Text classification
publishDate	2004
dc.date.none.fl_str_mv	2004-01-01T00:00:00Z 2011-02-15T11:25:04Z 2011-02-15
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10174/2558
dc.language.iso.fl_str_mv	eng
dc.relation.none.fl_str_mv	424-428 Advances in Soft Computing livre tcg@uevora.pt pq@uevora.pt IIPWM-04, Intelligent Information Processing and Web Mining Klopotek, M. Weirzchon, S. Trojanowski, K. 498
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	168602 bytes application/pdf
dc.publisher.none.fl_str_mv	Springer-Verlag
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
_version_	1799136465771298816
