BB25HLegalSum : a method for legal document summarization that leverages BM25 and BERT-based clustering
Main author: | Andrade, Leonardo Bonalume de |
---|---|
Publication date: | 2024 |
Document type: | Master's thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da UFRGS |
Full text: | http://hdl.handle.net/10183/273151 |
Abstract: | Legal document summarization aims to provide a clear understanding of the main points and arguments in a legal document, contributing to the efficiency of the judicial system. In this work, we propose BB25HLegalSum, a method that combines BERT-based clustering with the BM25 algorithm to summarize legal documents and present them to users with the important information highlighted. The process has four steps: selecting the unique sentences of the original document, clustering them to group sentences about a similar subject, scoring clusters and sentences to generate a summary according to one of three strategies, and highlighting the selected sentences in the original document. Legal workers assessed the highlighted presentation positively. |
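
The steps named in the abstract can be illustrated with a small sketch. It is not the thesis implementation: this record does not describe the three selection strategies, the BERT model, or how BM25 is applied, so everything below is an assumption. The functions `tokenize`, `bm25_scores` and `summarize`, the `query` argument, and the rule "keep the best-scoring sentence of each cluster" are hypothetical, and the cluster labels are passed in by hand where the method would obtain them from BERT-based clustering. The scoring function is the standard Okapi BM25 formula with the commonly used defaults `k1=1.5` and `b=0.75`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(sentences, query, k1=1.5, b=0.75):
    """Okapi BM25 score of each sentence (treated as a document) for the query."""
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of sentences containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def summarize(sentences, cluster_labels, query):
    """Assumed strategy: keep the best BM25-scored sentence of each cluster."""
    # Step 1: keep unique sentences, preserving document order.
    seen, unique, labels = set(), [], []
    for sentence, label in zip(sentences, cluster_labels):
        if sentence not in seen:
            seen.add(sentence)
            unique.append(sentence)
            labels.append(label)
    # Step 2 (clustering) is assumed done elsewhere; labels are given.
    # Step 3: score sentences and pick the top one per cluster.
    scores = bm25_scores(unique, query)
    best = {}
    for i, label in enumerate(labels):
        if label not in best or scores[i] > scores[best[label]]:
            best[label] = i
    # Step 4: return the picks in document order, ready for highlighting.
    return [unique[i] for i in sorted(best.values())]


sentences = [
    "The court dismissed the appeal.",
    "The court dismissed the appeal.",
    "The appellant filed the appeal late.",
    "Costs were awarded to the respondent.",
    "The respondent requested costs.",
]
labels = [0, 0, 0, 1, 1]  # hypothetical cluster assignments
summary = summarize(sentences, labels, "appeal dismissed costs awarded")
print(summary)
```

Returning the selected sentences in document order keeps them easy to map back onto the original text, which is what a highlighting view needs.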
id |
URGS_5cb79977f7c372f081d8446792d95177 |
---|---|
oai_identifier_str |
oai:www.lume.ufrgs.br:10183/273151 |
network_acronym_str |
URGS |
network_name_str |
Biblioteca Digital de Teses e Dissertações da UFRGS |
repository_id_str |
1853 |
dc.title.pt_BR.fl_str_mv |
BB25HLegalSum : a method for legal document summarization that leverages BM25 and BERT-based clustering |
title |
BB25HLegalSum : a method for legal document summarization that leverages BM25 and BERT-based clustering |
author |
Andrade, Leonardo Bonalume de |
author_facet |
Andrade, Leonardo Bonalume de |
author_role |
author |
dc.contributor.author.fl_str_mv |
Andrade, Leonardo Bonalume de |
dc.contributor.advisor1.fl_str_mv |
Becker, Karin |
contributor_str_mv |
Becker, Karin |
dc.subject.por.fl_str_mv |
Sumarizador de conteúdo (content summarizer); Processamento de linguagem natural (natural language processing); Resumo automático de texto (automatic text summarization); Algoritmos (algorithms) |
topic |
Content summarizer; Natural language processing; Automatic text summarization; Algorithms; Text summarization; Legal documents; BERT; Multiple color highlighting; Multiple criteria highlighting |
dc.subject.eng.fl_str_mv |
Text summarization; Legal documents; BERT; Multiple color highlighting; Multiple criteria highlighting |
publishDate |
2024 |
dc.date.accessioned.fl_str_mv |
2024-03-09T05:01:38Z |
dc.date.issued.fl_str_mv |
2024 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10183/273151 |
dc.identifier.nrb.pt_BR.fl_str_mv |
001198196 |
url |
http://hdl.handle.net/10183/273151 |
identifier_str_mv |
001198196 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da UFRGS instname:Universidade Federal do Rio Grande do Sul (UFRGS) instacron:UFRGS |
instname_str |
Universidade Federal do Rio Grande do Sul (UFRGS) |
instacron_str |
UFRGS |
institution |
UFRGS |
reponame_str |
Biblioteca Digital de Teses e Dissertações da UFRGS |
collection |
Biblioteca Digital de Teses e Dissertações da UFRGS |
bitstream.url.fl_str_mv |
http://www.lume.ufrgs.br/bitstream/10183/273151/2/001198196.pdf.txt http://www.lume.ufrgs.br/bitstream/10183/273151/1/001198196.pdf |
bitstream.checksum.fl_str_mv |
e70d89e67b521cdf0d12f9d7003f8eaf b55d1c796d237662ea0e60ea0cb8ae05 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS) |
repository.mail.fl_str_mv |
lume@ufrgs.br||lume@ufrgs.br |
_version_ |
1810085640368291840 |