Assessing the hardness of SVP algorithms in the presence of CPUs and GPUs

Bibliographic details
Main author: Correia, Fábio José Gonçalves
Publication date: 2014
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/1822/36762
Abstract: Lattice-based cryptography has been a hot topic in the past decade, since lattice-based cryptosystems are believed to be resistant to attacks mounted with quantum computers. The security of this type of cryptography is gauged by the performance of the algorithms that solve hard lattice problems, namely the Shortest Vector Problem (SVP). It is therefore important to assess the performance of such algorithms on High Performance Computing (HPC) systems. This dissertation compares several algorithms that solve the SVP (SVP-solvers), namely the Voronoi cell-based algorithm and two enumeration-based solvers, SE++ and ENUM. We show that techniques and optimizations that significantly improve the performance of ENUM, namely extreme pruning and the optimization that avoids symmetric branches of the enumeration tree, can also be applied to other enumeration algorithms. We present the first practical results of the Voronoi cell-based algorithm and compare its performance to that of the enumeration algorithms. The Voronoi cell-based algorithm performed considerably worse, although it displays potential for parallelization. The optimization that avoids the computation of symmetric branches improved the performance of SE++ by almost 50%, so that it outperforms ENUM by 3% on average. Parallel versions of the enumeration algorithms were implemented on a shared-memory system based on a dual (8+8)-core device; in some instances they scale super-linearly for up to 8 threads and linearly for 16 threads. The enumeration algorithms with extreme pruning were parallelized with MPI and achieved speedups of up to 12.96x with 16 processes, for ENUM on a lattice in dimension 74.
We show that an efficient parallel implementation of ENUM can be integrated into BKZ, a lattice basis reduction algorithm, to parallelize it efficiently, since for large block sizes almost all of the execution time of BKZ is spent in ENUM. This implementation of BKZ achieves speedups of up to 13.72x for a lattice in dimension 60, reduced with block size 50. We also compared the quality of the output bases against other BKZ implementations, namely AC_BKZ, an implementation developed in collaboration with Thomas Arnreich, and G_BKZ_FP, an implementation publicly available in the NTL library. In the general case, our implementation computed bases of better quality. Finally, a novel parallel approach for enumeration with extreme pruning is proposed. This approach promises to significantly improve the performance of enumeration with extreme pruning, since a much higher number of cores can be used to achieve higher speedups.
id RCAP_c569f8b5ae01dac56d33981779c2180c
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/36762
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Assessing the hardness of SVP algorithms in the presence of CPUs and GPUs
author Correia, Fábio José Gonçalves
author_role author
dc.contributor.none.fl_str_mv Proença, Alberto José
Mariano, Artur Miguel Matos
Universidade do Minho
dc.contributor.author.fl_str_mv Correia, Fábio José Gonçalves
dc.subject.por.fl_str_mv Parallel computing
Lattice based cryptography
SVP
Voronoi cell
681.3
519.68
Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
publishDate 2014
dc.date.none.fl_str_mv 2014-12-19
2014-12-19T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/36762
url http://hdl.handle.net/1822/36762
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 201195224
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação