PetroBERT: A Domain Adaptation Language Model for Oil and Gas Applications in Portuguese

Bibliographic details
Lead author: Rodrigues, Rafael B. M. [UNESP]
Publication date: 2022
Other authors: Privatto, Pedro I. M. [UNESP], de Sousa, Gustavo José [UNESP], Murari, Rafael P. [UNESP], Afonso, Luis C. S. [UNESP], Papa, João P. [UNESP], Pedronette, Daniel C. G. [UNESP], Guilherme, Ivan R. [UNESP], Perrout, Stephan R., Riente, Aliel F.
Affiliations: Universidade Estadual Paulista (UNESP); Petróleo Brasileiro S.A. - Petrobras; Centro de Pesquisas da Petróleo Brasileiro S.A. - CENPES/Petrobras
Document type: Conference paper
Language: English (eng)
Published in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13208 LNAI, p. 101-109
ISSN: 0302-9743 (print); 1611-3349 (online)
Keywords: BERT; Domain adaptation; Oil and gas
Access: Open access
Scopus ID: 2-s2.0-85127159496
Source: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1007/978-3-030-98305-5_10
http://hdl.handle.net/11449/234320
Abstract: This work proposes PetroBERT, a BERT-based model adapted to the oil and gas exploration domain in Portuguese. PetroBERT was pre-trained on the Petrolês corpus and a private corpus of daily drilling reports, starting from multilingual BERT and BERTimbau. The model was evaluated on named entity recognition (NER) and sentence classification tasks and achieved promising results, showing its potential for this domain. To the best of our knowledge, this is the first BERT-based model for the oil and gas context.
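For context, the minimal sketch below illustrates the general recipe the abstract describes: continuing masked-language-model pre-training of an existing Portuguese BERT checkpoint on domain text before fine-tuning on downstream tasks such as NER. It assumes the Hugging Face transformers and datasets libraries; the checkpoint name, corpus file, and hyperparameters are illustrative placeholders, not the authors' actual configuration.

```python
# Sketch of domain-adaptive pre-training: continue masked-language-model (MLM)
# training of a Portuguese BERT checkpoint on domain text.
# All names, paths, and hyperparameters below are illustrative assumptions.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# BERTimbau checkpoint (assumed ID); multilingual BERT would be "bert-base-multilingual-cased".
base_checkpoint = "neuralmind/bert-base-portuguese-cased"
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForMaskedLM.from_pretrained(base_checkpoint)

# Hypothetical plain-text file standing in for the (partly private) domain corpora.
corpus = load_dataset("text", data_files={"train": "oil_and_gas_corpus.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Standard BERT-style MLM objective: randomly mask 15% of tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir="petrobert-mlm",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)
Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()
```

After this step, the adapted encoder would typically be loaded with a task head (e.g., AutoModelForTokenClassification for NER) and fine-tuned on labeled domain data.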