Extração de relações hiponímicas em corpora de língua portuguesa

Machado, Pablo Neves

Bibliographic details
Main author:	Machado, Pablo Neves
Publication date:	2015
Document type:	Master's thesis
Language:	por
Source title:	Biblioteca Digital de Teses e Dissertações da PUC_RS
Full text:	http://tede2.pucrs.br/tede2/handle/tede/6108
Abstract:	Natural Language Processing (NLP) is an area of Computer Science notable for its relevance to the development of applications that process large amounts of text or speech. This work focuses on texts in Portuguese, extracting hyponymic relations between entities from them with a rule-based approach that uses rules adapted from the work of Hearst for English and of Freitas and Quental and Taba and Caseli for Portuguese, and complemented here. To validate the proposal, a prototype was developed that extracts hyponymic relations from Portuguese-language corpora. The prototype was run over a corpus of Portuguese texts, and the results were analyzed both by reference source and by rule group. The evaluation followed the process proposed by Freitas and Quental, with human judgment, and the measures obtained are compared with those reported in the main references. The dissertation also studies in detail the most frequent errors identified.
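
As a rough illustration of the rule-based idea mentioned in the abstract, the sketch below applies one Hearst-style lexical pattern to a Portuguese sentence. The pattern ("X tais como A, B e C"), the function name `extract_hyponyms`, and the example sentence are assumptions made for illustration; they are not taken from the dissertation's rule set, and single words stand in for the noun phrases a real system would identify with a tagger or chunker.

```python
import re

# Hypothetical, simplified rule: "<hypernym> tais como <hyponym>, <hyponym> e <hyponym>".
# Single words stand in for noun phrases; this is an assumption of the sketch.
PATTERN = re.compile(
    r"(?P<hyper>\w+) tais como (?P<hypos>\w+(?:, \w+)*(?: (?:e|ou) \w+)?)"
)


def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs matched by the illustrative pattern."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        hypernym = match.group("hyper")
        # Split the enumeration on commas and on the conjunctions "e" / "ou".
        for hyponym in re.split(r", | e | ou ", match.group("hypos")):
            pairs.append((hyponym, hypernym))
    return pairs


print(extract_hyponyms("Ele estuda animais tais como gatos, cachorros e cavalos."))
# → [('gatos', 'animais'), ('cachorros', 'animais'), ('cavalos', 'animais')]
```

A purely lexical pattern like this one can match text where no hyponymic relation holds, so its output still needs to be checked, for example by human judges.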

Item metadata

id	P_RS_c21dd38dfd8822025c0ec9613e15c13d
oai_identifier_str	oai:tede2.pucrs.br:tede/6108
network_acronym_str	P_RS
network_name_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
spelling	Submitted by Setor de Tratamento da Informação - BC/PUCRS (tede2@pucrs.br) on 2015-06-08T11:20:00Z. No. of bitstreams: 1. 470106 - Texto Completo.pdf: 1241867 bytes, checksum: fb5ae9bcc63565dabf9bfb2f5c3ed3ad (MD5). Made available in DSpace on 2015-06-08T11:20:00Z (GMT). Previous issue date: 2015-03-26.
dc.title.por.fl_str_mv	Extração de relações hiponímicas em corpora de língua portuguesa
title	Extração de relações hiponímicas em corpora de língua portuguesa
spellingShingle	Extração de relações hiponímicas em corpora de língua portuguesa Machado, Pablo Neves INFORMÁTICA PROCESSAMENTO DA LINGUAGEM NATURAL ANÁLISE SEMÂNTICA (PROGRAMAÇÃO) CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short	Extração de relações hiponímicas em corpora de língua portuguesa
title_full	Extração de relações hiponímicas em corpora de língua portuguesa
title_fullStr	Extração de relações hiponímicas em corpora de língua portuguesa
title_full_unstemmed	Extração de relações hiponímicas em corpora de língua portuguesa
title_sort	Extração de relações hiponímicas em corpora de língua portuguesa
author	Machado, Pablo Neves
author_facet	Machado, Pablo Neves
author_role	author
dc.contributor.advisor1.fl_str_mv	Lima, Vera Lúcia Strube de
dc.contributor.advisor1ID.fl_str_mv	265.515.190-91
dc.contributor.advisor1Lattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4781127A8
dc.contributor.authorID.fl_str_mv	021.323.160-31
dc.contributor.authorLattes.fl_str_mv	http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4294042A5
dc.contributor.author.fl_str_mv	Machado, Pablo Neves
contributor_str_mv	Lima, Vera Lúcia Strube de
dc.subject.por.fl_str_mv	INFORMÁTICA; PROCESSAMENTO DA LINGUAGEM NATURAL; ANÁLISE SEMÂNTICA (PROGRAMAÇÃO)
topic	INFORMÁTICA; PROCESSAMENTO DA LINGUAGEM NATURAL; ANÁLISE SEMÂNTICA (PROGRAMAÇÃO); CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
dc.subject.cnpq.fl_str_mv	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
description	Natural Language Processing (NLP) is an area of Computer Science notable for its relevance to the development of applications that process large amounts of text or speech. This work focuses on texts in Portuguese, extracting hyponymic relations between entities from them with a rule-based approach that uses rules adapted from the work of Hearst for English and of Freitas and Quental and Taba and Caseli for Portuguese, and complemented here. To validate the proposal, a prototype was developed that extracts hyponymic relations from Portuguese-language corpora. The prototype was run over a corpus of Portuguese texts, and the results were analyzed both by reference source and by rule group. The evaluation followed the process proposed by Freitas and Quental, with human judgment, and the measures obtained are compared with those reported in the main references. The dissertation also studies in detail the most frequent errors identified.
publishDate	2015
dc.date.accessioned.fl_str_mv	2015-06-08T11:20:00Z
dc.date.issued.fl_str_mv	2015-03-26
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://tede2.pucrs.br/tede2/handle/tede/6108
url	http://tede2.pucrs.br/tede2/handle/tede/6108
dc.language.iso.fl_str_mv	por
language	por
dc.relation.program.fl_str_mv	1974996533081274470
dc.relation.confidence.fl_str_mv	600 600 600
dc.relation.department.fl_str_mv	-3008542510401149144
dc.relation.cnpq.fl_str_mv	3671711205811204509
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	PUCRS
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	Faculdade de Informática
publisher.none.fl_str_mv	Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) instacron:PUC_RS
instname_str	Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str	PUC_RS
institution	PUC_RS
reponame_str	Biblioteca Digital de Teses e Dissertações da PUC_RS
collection	Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv	http://tede2.pucrs.br/tede2/bitstream/tede/6108/4/470106+-+Texto+Completo.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/6108/3/470106+-+Texto+Completo.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/6108/2/470106+-+Texto+Completo.pdf http://tede2.pucrs.br/tede2/bitstream/tede/6108/1/license.txt
bitstream.checksum.fl_str_mv	58170f033414adfe8093d00c50239ec4 cb37092d07cd71fa4b012efbc221cce4 fb5ae9bcc63565dabf9bfb2f5c3ed3ad 5a9d6006225b368ef605ba16b4f6d1be
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv	biblioteca.central@pucrs.br
_version_	1799765313200324608
