Assessing the hardness of SVP algorithms in the presence of CPUs and GPUs

Correia, Fábio José Gonçalves

Bibliographic details
Main author:	Correia, Fábio José Gonçalves
Publication date:	2014
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/1822/36762
Abstract:	Lattice-based cryptography has been a hot topic over the past decade, since lattice-based cryptosystems are believed to resist attacks mounted with quantum computers. The security of this type of cryptography is measured by the effectiveness of the algorithms that solve lattice problems, namely the Shortest Vector Problem (SVP). It is therefore important to assess the performance of such algorithms on High Performance Computing (HPC) systems. This dissertation compares several SVP-solvers: the Voronoi cell-based algorithm and two enumeration-based solvers, SE++ and ENUM. We show that techniques and optimizations that significantly improve the performance of ENUM can also be applied to other enumeration algorithms, namely the extreme pruning technique and the optimization that avoids symmetric branches of the enumeration tree. We present the first practical results for the Voronoi cell-based algorithm and compare its performance with that of the enumeration algorithms. The Voronoi cell-based algorithm performed considerably worse, although it shows potential for parallelization. The optimization that avoids computing symmetric branches improved the performance of SE++ by almost 50%, allowing it to outperform ENUM by about 3% on average. Parallel versions of the enumeration algorithms without extreme pruning were implemented on a shared-memory system based on a dual (8+8)-core device; in some instances they scale super-linearly for up to 8 threads and linearly for 16 threads. The enumeration algorithms with extreme pruning were parallelized with MPI and achieved speedups of up to 12.96x with 16 processes, with ENUM on a lattice in dimension 74.
We show that an efficient parallel implementation of ENUM can be integrated into BKZ, a lattice basis reduction algorithm, to parallelize it efficiently, since for large block sizes almost all of the execution time of BKZ is spent in ENUM. This implementation of BKZ achieves speedups of up to 13.72x for a lattice in dimension 60, reduced with block size 50. We also compared the quality of the output bases with that of other BKZ implementations, namely AC_BKZ, developed in collaboration with Thomas Arnreich, and G_BKZ_FP, publicly available in the NTL library. In the general case, our implementation computed bases of better quality. Finally, a novel parallel approach to enumeration with extreme pruning is proposed. This approach promises to significantly improve the performance of enumeration with extreme pruning, since a much higher number of cores can be used to achieve higher speedups.
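
The symmetric-branch optimization mentioned in the abstract relies on the fact that a lattice vector v and its negation -v have the same length, so only one of the two needs to be examined. The following is a minimal sketch of that idea, not the SE++ or ENUM implementation from the dissertation: it is a toy brute-force search over a bounded box of integer coefficients, and the function name `shortest_vector`, the `bound` parameter, and the rule "keep only coefficient vectors whose first nonzero entry is positive" are assumptions chosen for illustration. A fixed coefficient bound does not guarantee that the true shortest vector is found in general.

```python
from itertools import product


def shortest_vector(basis, bound=3, skip_symmetric=True):
    """Toy bounded search for a short nonzero lattice vector.

    basis: list of integer row vectors (hypothetical input format).
    bound: coefficients range over [-bound, bound] (illustrative assumption).
    Returns (vector, squared_norm, number_of_candidates_visited).
    """
    dim = len(basis[0])
    best, best_norm2, visited = None, None, 0
    for coeffs in product(range(-bound, bound + 1), repeat=len(basis)):
        if not any(coeffs):
            continue  # skip the zero vector
        if skip_symmetric:
            # v and -v have equal norm: keep only the representative
            # whose first nonzero coefficient is positive.
            first = next(c for c in coeffs if c != 0)
            if first < 0:
                continue
        visited += 1
        vec = [sum(c * row[j] for c, row in zip(coeffs, basis)) for j in range(dim)]
        norm2 = sum(x * x for x in vec)
        if best_norm2 is None or norm2 < best_norm2:
            best, best_norm2 = vec, norm2
    return best, best_norm2, visited
```

For the basis [[2, 0], [1, 2]] with bound 3, the symmetric variant visits 24 of the 48 nonzero coefficient vectors and finds the same minimum squared norm as the full search, which matches the intuition that skipping symmetric branches removes roughly half of the work.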

Item metadata

id	RCAP_c569f8b5ae01dac56d33981779c2180c
oai_identifier_str	oai:repositorium.sdum.uminho.pt:1822/36762
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
dc.title.none.fl_str_mv	Assessing the hardness of SVP algorithms in the presence of CPUs and GPUs
author	Correia, Fábio José Gonçalves
author_facet	Correia, Fábio José Gonçalves
author_role	author
dc.contributor.none.fl_str_mv	Proença, Alberto José; Mariano, Artur Miguel Matos; Universidade do Minho
dc.contributor.author.fl_str_mv	Correia, Fábio José Gonçalves
dc.subject.por.fl_str_mv	Parallel computing; Lattice based cryptography; SVP; Voronoi cell; 681.3; 519.68; Engineering and Technology::Electrical, Electronic and Information Engineering
topic	Parallel computing; Lattice based cryptography; SVP; Voronoi cell; 681.3; 519.68; Engineering and Technology::Electrical, Electronic and Information Engineering
publishDate	2014
dc.date.none.fl_str_mv	2014-12-19 2014-12-19T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/1822/36762
url	http://hdl.handle.net/1822/36762
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	201195224
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799132499549356032
