Jurisprudence search based on facts similarity using NLP and ML techniques.

Ruiz, Rodrigo Amorim

Bibliographic details
Main author:	Ruiz, Rodrigo Amorim
Publication date:	2021
Document type:	Master's thesis
Language:	eng
Source title:	Biblioteca Digital de Teses e Dissertações da USP
Full text:	https://www.teses.usp.br/teses/disponiveis/3/3141/tde-14022022-122906/
Abstract:	Part of a lawyer's job is to understand the client's problem, to describe its facts in writing, and to apply the sources of law. To support a new legal case, lawyers typically rely on a handful of past judgments on similar cases, but finding them is currently a time-consuming procedure. To address this problem, we built a machine learning model that classifies the similarity between two facts descriptions. This similarity metric measures, on a scale from 0 to 1, how well one legal decision may be used to support another. We trained different model architectures combining several state-of-the-art natural language processing and machine learning techniques, using a dataset of past judgments extracted from the Superior Court of Justice website, which enabled the dynamic construction of facts description pairs whenever one case cites another as a reference. The best final architecture employs TF-IDF to encode our input documents and reduce their dimensionality, a Siamese Neural Network (SNN) with a Multilayer Perceptron (MLP) for feature extraction, and a final layer, another MLP, responsible for concatenating the features and classifying them into the similarity metric, achieving 85.98% accuracy, 83.89% precision, and 89.06% recall. Such a model would enable lawyers to compare a case's facts description with several judgments from the jurisprudence and start their search with the most similar ones.
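
The pipeline named in the abstract (TF-IDF encoding, a shared-weight Siamese MLP, and a second MLP that classifies the concatenated features) might be sketched roughly as follows. This is only an illustration of the data flow, not the author's implementation: the toy sentences, layer sizes, activations, and random untrained weights are all assumptions, the function names are hypothetical, and the dimensionality-reduction step mentioned in the abstract is omitted.

```python
import math
import random
from collections import Counter

def tfidf_vectors(docs):
    """Encode tokenized documents as TF-IDF vectors over a shared vocabulary."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF so terms present in every document keep a nonzero weight.
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[tok] / len(doc) * idf[tok] for tok in vocab])
    return vectors

def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation):
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def similarity(vec_a, vec_b, encoder, classifier):
    """Score a pair of facts descriptions in the open interval (0, 1)."""
    # Siamese step: the same encoder weights process both inputs.
    feat_a = dense(vec_a, encoder, relu)
    feat_b = dense(vec_b, encoder, relu)
    # Final step: concatenate both feature vectors and classify the pair.
    hidden_layer, output_layer = classifier
    hidden = dense(feat_a + feat_b, hidden_layer, relu)
    return dense(hidden, output_layer, sigmoid)[0]

rng = random.Random(0)
# Hypothetical facts descriptions, invented for this sketch.
docs = [
    "tenant failed to pay rent and was evicted".split(),
    "landlord evicted tenant after unpaid rent".split(),
    "driver injured pedestrian in traffic accident".split(),
]
vectors = tfidf_vectors(docs)
encoder = init_layer(len(vectors[0]), 8, rng)                # assumed size: 8 features
classifier = (init_layer(16, 4, rng), init_layer(4, 1, rng))  # 16 = 2 x 8 concatenated
score = similarity(vectors[0], vectors[1], encoder, classifier)
print(f"untrained similarity score: {score:.3f}")
```

Because the weights are random, the printed score carries no meaning. In a trained version, pairs in which one judgment cites the other would presumably serve as positive examples, as the abstract suggests.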

Item metadata

id	USP_d6b5f1a36a0c9261e99fc31cd85e0ee2
oai_identifier_str	oai:teses.usp.br:tde-14022022-122906
network_acronym_str	USP
network_name_str	Biblioteca Digital de Teses e Dissertações da USP
repository_id_str	2721
dc.title.none.fl_str_mv	Jurisprudence search based on facts similarity using NLP and ML techniques. Pesquisa de jurisprudência baseada na semelhança de fatos usando técnicas de PNL e ML.
title	Jurisprudence search based on facts similarity using NLP and ML techniques.
spellingShingle	Jurisprudence search based on facts similarity using NLP and ML techniques. Ruiz, Rodrigo Amorim Aprendizado computacional Artificial intelligence Bag-of words Cosine similarity Deep learning FastText GloVe Inteligência artificial Jurisprudence Jurisprudência Linguagem Natural Logistic regression Long short-term memory Machine learning Multilayer perceptron Naive bayes Natural language processing Neural network Redes neurais Siamese neural network TF-IDF Transfer learning Transformer Word embedding Word2Vec
title_short	Jurisprudence search based on facts similarity using NLP and ML techniques.
title_full	Jurisprudence search based on facts similarity using NLP and ML techniques.
title_fullStr	Jurisprudence search based on facts similarity using NLP and ML techniques.
title_full_unstemmed	Jurisprudence search based on facts similarity using NLP and ML techniques.
title_sort	Jurisprudence search based on facts similarity using NLP and ML techniques.
author	Ruiz, Rodrigo Amorim
author_facet	Ruiz, Rodrigo Amorim
author_role	author
dc.contributor.none.fl_str_mv	Bona, Glauber De
dc.contributor.author.fl_str_mv	Ruiz, Rodrigo Amorim
dc.subject.por.fl_str_mv	Aprendizado computacional; Artificial intelligence; Bag-of-words; Cosine similarity; Deep learning; FastText; GloVe; Inteligência artificial; Jurisprudence; Jurisprudência; Linguagem Natural; Logistic regression; Long short-term memory; Machine learning; Multilayer perceptron; Naive Bayes; Natural language processing; Neural network; Redes neurais; Siamese neural network; TF-IDF; Transfer learning; Transformer; Word embedding; Word2Vec
topic	Aprendizado computacional; Artificial intelligence; Bag-of-words; Cosine similarity; Deep learning; FastText; GloVe; Inteligência artificial; Jurisprudence; Jurisprudência; Linguagem Natural; Logistic regression; Long short-term memory; Machine learning; Multilayer perceptron; Naive Bayes; Natural language processing; Neural network; Redes neurais; Siamese neural network; TF-IDF; Transfer learning; Transformer; Word embedding; Word2Vec
description	Part of a lawyer's job is to understand the client's problem, to describe its facts in writing, and to apply the sources of law. To support a new legal case, lawyers typically rely on a handful of past judgments on similar cases, but finding them is currently a time-consuming procedure. To address this problem, we built a machine learning model that classifies the similarity between two facts descriptions. This similarity metric measures, on a scale from 0 to 1, how well one legal decision may be used to support another. We trained different model architectures combining several state-of-the-art natural language processing and machine learning techniques, using a dataset of past judgments extracted from the Superior Court of Justice website, which enabled the dynamic construction of facts description pairs whenever one case cites another as a reference. The best final architecture employs TF-IDF to encode our input documents and reduce their dimensionality, a Siamese Neural Network (SNN) with a Multilayer Perceptron (MLP) for feature extraction, and a final layer, another MLP, responsible for concatenating the features and classifying them into the similarity metric, achieving 85.98% accuracy, 83.89% precision, and 89.06% recall. Such a model would enable lawyers to compare a case's facts description with several judgments from the jurisprudence and start their search with the most similar ones.
publishDate	2021
dc.date.none.fl_str_mv	2021-08-24
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://www.teses.usp.br/teses/disponiveis/3/3141/tde-14022022-122906/
url	https://www.teses.usp.br/teses/disponiveis/3/3141/tde-14022022-122906/
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv	Release the content for public access. info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Release the content for public access.
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv	Biblioteca Digitais de Teses e Dissertações da USP
publisher.none.fl_str_mv	Biblioteca Digitais de Teses e Dissertações da USP
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP
instname_str	Universidade de São Paulo (USP)
instacron_str	USP
institution	USP
reponame_str	Biblioteca Digital de Teses e Dissertações da USP
collection	Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv	virginia@if.usp.br || atendimento@aguia.usp.br || virginia@if.usp.br
_version_	1815256508315729920