Adjoint differentiation for matrix spectrum cutoff operation

Bibliographic details
Main author: Rodrigues, Alexandre da Rocha
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/41040
Abstract: Adjoint differentiation is based on a sequence of arithmetic operations, so it requires a very large amount of memory to store all intermediate derivatives: $O(N^3)$ for matrix multiplication. With an analytical formula, memory usage can be reduced to that of the matrices themselves, $O(N^2)$. Consider an operation $f(A)$, where $f$ is a scalar function and $A$ is a symmetric matrix. This operation is usually performed in the orthonormal basis (matrix $U$) formed by the eigenvectors of $A$. There is an analytical formula for the adjoint $\bar{A}$ that does not require $\bar{U}$. In some applications, for example multi-linear regression, it is very common to have large matrices with a considerable number of eigenvalues close to 0, so it can be beneficial to use a low-rank (rank $k$) approximation of the original matrix. This can be done with several reduced Singular Value Decomposition (SVD) algorithms. However, the complete matrix $U$ is then not known, and the analytical formula cannot be applied in a straightforward manner. We define $U_k$ as the first $k$ columns of $U$ (the eigenvectors of $A$ associated with its $k$ largest eigenvalues); this matrix can be found using a reduced SVD algorithm. Given a threshold $\epsilon$, we define the spectrum cutoff function as $f(A) = \sum_{\lambda_i > \epsilon} \lambda_i u_i u_i^T$. The threshold is related to $k$ by $\lambda_{k+1} < \epsilon < \lambda_k$ for eigenvalues sorted in descending order. In this work, we propose an approximate formula for $\bar{A}$ that uses the results of a projection-based randomized SVD algorithm and the reduced matrix $U_k$, which reduces memory usage. We also created an interface that can be used with any given matrix and configured with several parameters. Our findings showed that $U_k$ must be extended, for example using a QR decomposition, with additional column vectors that preserve the orthogonality of $U$. A default value must also be assumed for the unknown eigenvalues of $A$. Our final results showed an MRE of 1.85% for both small and large matrices when they have $k$ small eigenvalues and $N - k$ large ones. We did not find an improvement in computational time, but the approach still saves a considerable amount of memory. We concluded that this is a good approach for computing adjoint derivatives of large matrices, in particular for the applications mentioned above.
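The two building blocks named in the abstract, the spectrum cutoff $f(A)$ and the QR-based extension of $U_k$ to a full orthonormal basis, can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: it uses a full eigendecomposition (`np.linalg.eigh`) in place of the randomized reduced SVD, and the function names `spectrum_cutoff` and `extend_to_orthonormal_basis` are hypothetical, chosen only for illustration.

```python
import numpy as np


def spectrum_cutoff(A, eps):
    """Return (f(A), U_k) with f(A) = sum_{lambda_i > eps} lambda_i u_i u_i^T.

    Assumption: A is symmetric. A full eigendecomposition is used here for
    clarity; the thesis obtains U_k from a randomized reduced SVD instead.
    """
    lam, U = np.linalg.eigh(A)   # eigenvalues in ascending order
    keep = lam > eps             # mask of eigenvalues above the cutoff
    Uk = U[:, keep]
    # Scale each kept eigenvector by its eigenvalue, then project back.
    return (Uk * lam[keep]) @ Uk.T, Uk


def extend_to_orthonormal_basis(Uk):
    """Complete an N x k matrix with orthonormal columns to an N x N orthogonal matrix.

    The trailing N - k columns of the complete QR factor are orthogonal
    to the column space of U_k, so appending them keeps U orthogonal.
    """
    k = Uk.shape[1]
    Q, _ = np.linalg.qr(Uk, mode="complete")
    return np.hstack([Uk, Q[:, k:]])
```

The appended columns are not eigenvectors of $A$ in general; they only complete the basis, which is why a default value for the corresponding unknown eigenvalues has to be assumed, as the abstract notes.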
id RCAP_c5cd6ecb48a7dc5c5ecdb6dfbacc63eb
oai_identifier_str oai:ria.ua.pt:10773/41040
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.subject.por.fl_str_mv Adjoint differentiation
Eigenvalues and eigenvectors
Matrix decomposition
QR
SVD
publishDate 2023
dc.date.none.fl_str_mv 2023-07-21T00:00:00Z
2023-07-21
2024-03-12T11:42:08Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/41040
url http://hdl.handle.net/10773/41040
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138193895849984