BB25HLegalSum : a method for legal document summarization that leverages BM25 and BERT-based clustering

Bibliographic details
Main author: Andrade, Leonardo Bonalume de
Publication date: 2024
Document type: Master's thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da UFRGS
Full text: http://hdl.handle.net/10183/273151
Abstract: Legal document summarization aims to provide a clear understanding of the main points and arguments in a legal document, contributing to the efficiency of the judicial system. In this work, we propose BB25HLegalSum, a method that combines BERT clusters with the BM25 algorithm to summarize legal documents and present them to users with the important information highlighted. The process involves selecting unique sentences from the original document, clustering them to find sentences about a similar subject, scoring clusters and sentences to generate a summary according to three strategies, and highlighting the selected sentences in the original document. Legal workers positively assessed the highlighted presentation.
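The abstract names BM25 as the scoring component but gives no formula or parameters. As a rough illustration only, the sketch below scores tokenized sentences against a query with the standard Okapi BM25 formula. The function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, the non-negative IDF variant, and the toy sentences are all assumptions made for this example and are not taken from the thesis; the BERT-based clustering step is not shown.

```python
import math
from collections import Counter


def bm25_scores(sentences, query, k1=1.5, b=0.75):
    """Score each tokenized sentence against a tokenized query (Okapi BM25).

    Hypothetical helper: expects a non-empty list of non-empty token lists.
    """
    n = len(sentences)
    avg_len = sum(len(s) for s in sentences) / n
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for s in sentences for term in set(s))
    scores = []
    for s in sentences:
        tf = Counter(s)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed here, not from the thesis).
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with sentence-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(s) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy data, invented for illustration.
sentences = [
    "the court dismissed the appeal".split(),
    "the appellant filed the appeal late".split(),
    "costs were awarded to the respondent".split(),
]
scores = bm25_scores(sentences, "appeal dismissed".split())
best = max(range(len(sentences)), key=scores.__getitem__)
```

Here `best` is the index of the highest-scoring sentence; a sentence that shares no term with the query scores zero.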
id URGS_5cb79977f7c372f081d8446792d95177
oai_identifier_str oai:www.lume.ufrgs.br:10183/273151
network_acronym_str URGS
network_name_str Biblioteca Digital de Teses e Dissertações da UFRGS
repository_id_str 1853
spelling Andrade, Leonardo Bonalume de | Becker, Karin | 2024-03-09T05:01:38Z | 2024 | http://hdl.handle.net/10183/273151 | 001198196 | application/pdf | eng | Sumarizador de conteúdo | Processamento de linguagem natural | Resumo automático de texto | Algoritmos | Text summarization | Legal documents | BERT | Multiple color highlighting | Multiple criteria highlighting | BB25HLegalSum : a method for legal document summarization that leverages BM25 and BERT-based clustering | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | Universidade Federal do Rio Grande do Sul | Instituto de Informática | Programa de Pós-Graduação em Computação | Porto Alegre, BR-RS | 2024 | mestrado | info:eu-repo/semantics/openAccess | reponame:Biblioteca Digital de Teses e Dissertações da UFRGS | instname:Universidade Federal do Rio Grande do Sul (UFRGS) | instacron:UFRGS
TEXT | 001198196.pdf.txt | 001198196.pdf.txt | Extracted Text | text/plain | 153441 | http://www.lume.ufrgs.br/bitstream/10183/273151/2/001198196.pdf.txt | e70d89e67b521cdf0d12f9d7003f8eaf | MD5 | 2
ORIGINAL | 001198196.pdf | Texto completo (inglês) | application/pdf | 1283120 | http://www.lume.ufrgs.br/bitstream/10183/273151/1/001198196.pdf | b55d1c796d237662ea0e60ea0cb8ae05 | MD5 | 1
10183/273151 | 2024-03-21 05:04:54.603996 | oai:www.lume.ufrgs.br:10183/273151 | Biblioteca Digital de Teses e Dissertações | https://lume.ufrgs.br/handle/10183/2 | PUB | https://lume.ufrgs.br/oai/request | lume@ufrgs.br||lume@ufrgs.br | opendoar:1853 | 2024-03-21T08:04:54 | Biblioteca Digital de Teses e Dissertações da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS) | false
dc.title.pt_BR.fl_str_mv BB25HLegalSum : a method for legal document summarization that leverages BM25 and BERT-based clustering
author Andrade, Leonardo Bonalume de
author_facet Andrade, Leonardo Bonalume de
author_role author
dc.contributor.author.fl_str_mv Andrade, Leonardo Bonalume de
dc.contributor.advisor1.fl_str_mv Becker, Karin
contributor_str_mv Becker, Karin
dc.subject.por.fl_str_mv Sumarizador de conteúdo
Processamento de linguagem natural
Resumo automático de texto
Algoritmos
topic Sumarizador de conteúdo
Processamento de linguagem natural
Resumo automático de texto
Algoritmos
Text summarization
Legal documents
BERT
Multiple color highlighting
Multiple criteria highlighting
dc.subject.eng.fl_str_mv Text summarization
Legal documents
BERT
Multiple color highlighting
Multiple criteria highlighting
publishDate 2024
dc.date.accessioned.fl_str_mv 2024-03-09T05:01:38Z
dc.date.issued.fl_str_mv 2024
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10183/273151
dc.identifier.nrb.pt_BR.fl_str_mv 001198196
url http://hdl.handle.net/10183/273151
identifier_str_mv 001198196
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UFRGS
instname:Universidade Federal do Rio Grande do Sul (UFRGS)
instacron:UFRGS
instname_str Universidade Federal do Rio Grande do Sul (UFRGS)
instacron_str UFRGS
institution UFRGS
reponame_str Biblioteca Digital de Teses e Dissertações da UFRGS
collection Biblioteca Digital de Teses e Dissertações da UFRGS
bitstream.url.fl_str_mv http://www.lume.ufrgs.br/bitstream/10183/273151/2/001198196.pdf.txt
http://www.lume.ufrgs.br/bitstream/10183/273151/1/001198196.pdf
bitstream.checksum.fl_str_mv e70d89e67b521cdf0d12f9d7003f8eaf
b55d1c796d237662ea0e60ea0cb8ae05
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS)
repository.mail.fl_str_mv lume@ufrgs.br||lume@ufrgs.br
_version_ 1810085640368291840