PetroBERT: A Domain Adaptation Language Model for Oil and Gas Applications in Portuguese
Main author: | Rodrigues, Rafael B. M. [UNESP] |
---|---|
Publication date: | 2022 |
Other authors: | Privatto, Pedro I. M. [UNESP]; de Sousa, Gustavo José [UNESP]; Murari, Rafael P. [UNESP]; Afonso, Luis C. S. [UNESP]; Papa, João P. [UNESP]; Pedronette, Daniel C. G. [UNESP]; Guilherme, Ivan R. [UNESP]; Perrout, Stephan R.; Riente, Aliel F. |
Document type: | Conference paper |
Language: | English |
Source: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1007/978-3-030-98305-5_10 http://hdl.handle.net/11449/234320 |
Abstract: | This work proposes PetroBERT, a BERT-based model adapted to the oil and gas exploration domain in Portuguese. PetroBERT was pre-trained on the Petrolês corpus and a private daily drilling report corpus, starting from multilingual BERT and BERTimbau. The model was evaluated on NER and sentence classification tasks and achieved promising results that show its potential for this domain. To the best of our knowledge, this is the first BERT-based model for the oil and gas context. |
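The abstract describes continued (domain-adaptive) pre-training of BERT-style models on an oil and gas corpus. The paper's actual training pipeline is not reproduced here; the following is a minimal, self-contained sketch of the masked-language-modeling corruption step that such pre-training applies to each sentence of the domain corpus. The example sentence and the small vocabulary are hypothetical illustrations, not taken from the Petrolês corpus.

```python
import random

MASK = "[MASK]"
# Hypothetical domain vocabulary used for the "random token" replacement case.
VOCAB = ["poço", "broca", "fluido", "pressão", "perfuração"]

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """BERT-style MLM corruption: select ~15% of positions as prediction
    targets; of those, 80% become [MASK], 10% become a random vocabulary
    token, and 10% are left unchanged (the model must still predict them).
    Returns (corrupted token list, list of target positions)."""
    rng = rng or random.Random(0)
    out, targets = list(tokens), []
    for i in range(len(tokens)):
        if rng.random() < mask_prob:
            targets.append(i)
            r = rng.random()
            if r < 0.8:
                out[i] = MASK
            elif r < 0.9:
                out[i] = rng.choice(VOCAB)
            # else: keep the original token unchanged
    return out, targets

# Example: corrupt one (hypothetical) drilling-report sentence.
sentence = "perfuração do poço interrompida por perda de fluido".split()
corrupted, targets = mask_tokens(sentence, rng=random.Random(42))
```

During pre-training, the model is trained to recover the original token at each target position; repeating this over the domain corpus is what adapts the general-purpose checkpoint (e.g. BERTimbau) to domain text.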
Keywords: BERT; Domain adaptation; Oil and gas
Authors: Rodrigues, Rafael B. M. [UNESP]; Privatto, Pedro I. M. [UNESP]; de Sousa, Gustavo José [UNESP]; Murari, Rafael P. [UNESP]; Afonso, Luis C. S. [UNESP]; Papa, João P. [UNESP]; Pedronette, Daniel C. G. [UNESP]; Guilherme, Ivan R. [UNESP]; Perrout, Stephan R.; Riente, Aliel F.
Affiliations: UNESP - São Paulo State University (School of Technology and Sciences; Institute of Geosciences and Exact Sciences; School of Sciences); Universidade Estadual Paulista (UNESP); Petróleo Brasileiro S.A. - Petrobras; Centro de Pesquisas da Petróleo Brasileiro S.A. - CENPES/Petrobras
Venue: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13208 LNAI, p. 101-109
ISSN: 0302-9743 (print); 1611-3349 (online)
DOI: 10.1007/978-3-030-98305-5_10
Scopus ID: 2-s2.0-85127159496
Rights: open access
Published: 2022-01-01 (deposited in the repository 2022-05-01)