Event extraction and representation: A case study for the Portuguese language

Quaresma, Paulo; Nogueira, Vítor; Raiyani, Kashyap; Bayot, Roy

Bibliographic details
Main author:	Quaresma, Paulo
Publication date:	2019
Other authors:	Nogueira, Vítor; Raiyani, Kashyap; Bayot, Roy
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10174/27059 https://doi.org/10.3390/info10060205
Abstract:	Text information extraction is an important natural language processing (NLP) task that aims to automatically identify, extract, and represent information from text. In this context, event extraction plays a relevant role, allowing actions, agents, objects, places, and time periods to be identified and represented. The extracted information can be represented by specialized ontologies, supporting knowledge-based reasoning and inference processes. In this work, we describe in detail our proposal for event extraction from Portuguese documents. The proposed approach is based on a pipeline of specialized natural language processing tools: a part-of-speech tagger, a named entity recognizer, a dependency parser, a semantic role labeler, and a knowledge extraction module. The architecture is language-independent, but its modules are language-dependent and can be built using suitable AI methodologies (i.e., rule-based or machine learning). The developed system was evaluated on a corpus of Portuguese texts, and the results obtained are presented and analysed. Current limitations and future work are discussed in detail.
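
The pipeline idea in the abstract (a language-independent driver with pluggable, language-dependent modules) can be illustrated with a minimal Python sketch. This is not the authors' system: every name below (`Event`, `run_pipeline`, the `toy_*` stages, the tiny verb list and gazetteer) is a hypothetical stand-in invented for illustration, the stages are trivial rule-based toys, and the dependency parsing step is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List

Doc = dict                       # annotations accumulated by the stages
Stage = Callable[[Doc], Doc]     # every module shares this interface


@dataclass
class Event:
    """Hypothetical event record; field names are assumptions."""
    action: str
    agent: str = ""
    place: str = ""
    time: str = ""


def run_pipeline(text: str, stages: List[Stage]) -> Doc:
    """Language-independent driver: pass the document through each stage."""
    doc: Doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc


# Toy language-dependent resources (assumed, for Portuguese examples only).
TOY_VERBS = {"visitou"}
TOY_GAZETTEER = {"Maria": "PERSON", "Lisboa": "PLACE"}


def toy_pos_tagger(doc: Doc) -> Doc:
    """Whitespace tokenizer plus a lookup-based VERB/OTHER tagger."""
    tokens = doc["text"].rstrip(".").split()
    doc["tokens"] = tokens
    doc["pos"] = ["VERB" if t.lower() in TOY_VERBS else "OTHER" for t in tokens]
    return doc


def toy_ner(doc: Doc) -> Doc:
    """Gazetteer lookup; four-digit tokens are treated as years."""
    labels = []
    for tok in doc["tokens"]:
        if tok in TOY_GAZETTEER:
            labels.append(TOY_GAZETTEER[tok])
        elif tok.isdigit() and len(tok) == 4:
            labels.append("TIME")
        else:
            labels.append("O")
    doc["ner"] = labels
    return doc


def toy_srl(doc: Doc) -> Doc:
    """Assign roles from entity labels and position relative to the verb."""
    verb_idx = doc["pos"].index("VERB")  # raises ValueError if no verb is found
    roles = {"action": doc["tokens"][verb_idx]}
    for i, (tok, label) in enumerate(zip(doc["tokens"], doc["ner"])):
        if label == "PERSON" and i < verb_idx:
            roles["agent"] = tok
        elif label == "PLACE" and i > verb_idx:
            roles["place"] = tok
        elif label == "TIME":
            roles["time"] = tok
    doc["roles"] = roles
    return doc


def toy_event_builder(doc: Doc) -> Doc:
    """Knowledge extraction step: turn the role map into an Event record."""
    doc["events"] = [Event(**doc["roles"])]
    return doc


stages = [toy_pos_tagger, toy_ner, toy_srl, toy_event_builder]
result = run_pipeline("Maria visitou Lisboa em 2019.", stages)
print(result["events"][0])
# Event(action='visitou', agent='Maria', place='Lisboa', time='2019')
```

Because the driver only depends on the shared stage interface, any toy stage could in principle be swapped for a rule-based or machine-learned module for another language without changing `run_pipeline`.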

Item metadata

id	RCAP_0b3362d0f66365087ff32955dc33a5f8
oai_identifier_str	oai:dspace.uevora.pt:10174/27059
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Event extraction and representation: A case study for the Portuguese language
title	Event extraction and representation: A case study for the Portuguese language
spellingShingle	Event extraction and representation: A case study for the Portuguese language; Quaresma, Paulo; Events; Information extraction; Natural language processing; Ontologies population; Text mining
title_short	Event extraction and representation: A case study for the Portuguese language
title_full	Event extraction and representation: A case study for the Portuguese language
title_fullStr	Event extraction and representation: A case study for the Portuguese language
title_full_unstemmed	Event extraction and representation: A case study for the Portuguese language
title_sort	Event extraction and representation: A case study for the Portuguese language
author	Quaresma, Paulo
author_facet	Quaresma, Paulo; Nogueira, Vítor; Raiyani, Kashyap; Bayot, Roy
author_role	author
author2	Nogueira, Vítor; Raiyani, Kashyap; Bayot, Roy
author2_role	author author author
dc.contributor.author.fl_str_mv	Quaresma, Paulo; Nogueira, Vítor; Raiyani, Kashyap; Bayot, Roy
dc.subject.por.fl_str_mv	Events; Information extraction; Natural language processing; Ontologies population; Text mining
topic	Events; Information extraction; Natural language processing; Ontologies population; Text mining
description	Text information extraction is an important natural language processing (NLP) task that aims to automatically identify, extract, and represent information from text. In this context, event extraction plays a relevant role, allowing actions, agents, objects, places, and time periods to be identified and represented. The extracted information can be represented by specialized ontologies, supporting knowledge-based reasoning and inference processes. In this work, we describe in detail our proposal for event extraction from Portuguese documents. The proposed approach is based on a pipeline of specialized natural language processing tools: a part-of-speech tagger, a named entity recognizer, a dependency parser, a semantic role labeler, and a knowledge extraction module. The architecture is language-independent, but its modules are language-dependent and can be built using suitable AI methodologies (i.e., rule-based or machine learning). The developed system was evaluated on a corpus of Portuguese texts, and the results obtained are presented and analysed. Current limitations and future work are discussed in detail.
publishDate	2019
dc.date.none.fl_str_mv	2019-06-01T00:00:00Z 2020-02-19T11:57:14Z 2020-02-19
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10174/27059 http://hdl.handle.net/10174/27059 https://doi.org/10.3390/info10060205
url	http://hdl.handle.net/10174/27059 https://doi.org/10.3390/info10060205
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	pq@uevora.pt vbn@uevora.pt nd nd 283
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	MDPI AG
publisher.none.fl_str_mv	MDPI AG
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799136654579990528
