Adjoint differentiation for matrix spectrum cutoff operation
Main author: | Rodrigues, Alexandre da Rocha |
---|---|
Publication date: | 2023 |
Document type: | Master's thesis |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10773/41040 |
Abstract: | Adjoint differentiation is based on a sequence of arithmetic operations, so it requires a very large amount of memory to store all intermediate derivatives: O(N^3) for matrix multiplication. With an analytical formula we can reduce the memory usage to only the matrices themselves, O(N^2). Consider an operation f(A), where f is a scalar function and A is a symmetric matrix. Usually this operation is performed in the orthonormal basis (matrix U) formed by the eigenvectors of A. There is an analytical formula for the adjoint Ā that does not require Ū. In some applications, for example multi-linear regression, it is very common to have large matrices with a considerable number of eigenvalues close to 0, so it can be beneficial to use a low-rank (k) approximation of the original matrix. This can be done with several reduced Singular Value Decomposition (SVD) algorithms. This implies that the complete matrix U is not known, so the analytical formula cannot be applied in a straightforward manner. We define U_k as the first k columns of U (eigenvectors of A); this matrix can be found using a reduced SVD algorithm. Given an ϵ, we define the spectrum cutoff function as f(A) = ∑_{λ_i > ϵ} λ_i u_i u_i^T. Epsilon should be related to k by λ_{k+1} < ϵ < λ_k, for eigenvalues sorted in descending order. In this work, we propose an approximate formula for Ā that uses the results of a projection-based randomized SVD algorithm and the reduced matrix U_k, saving memory. We also created an interface that can be used with any given matrix and configured with several parameters. Our findings showed that we need to extend the matrix U_k, namely using a QR decomposition, to find additional column vectors that maintain the orthogonality of U. We also need to assume a default value for the unknown eigenvalues of A. Our final results showed a mean relative error (MRE) of 1.85% for both small and large matrices when they have k small eigenvalues and N − k large ones.
We did not find an improvement in computational time, but we can still save a considerable amount of memory. We concluded that this is a good approach to computing adjoint derivatives of large matrices, namely for the previously mentioned popular applications. |
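A minimal NumPy sketch of the spectrum cutoff f(A) = ∑_{λ_i > ϵ} λ_i u_i u_i^T and of a full-basis analytical adjoint can illustrate the setup above. The adjoint shown is the standard divided-difference (Daleckii-Krein) formula for symmetric matrix functions, which is an assumption about the formula the abstract alludes to; function names and tolerances are illustrative, not the thesis' actual interface.

```python
import numpy as np

def spectrum_cutoff(A, eps):
    """f(A) = sum over eigenvalues above eps of lam_i * u_i u_i^T."""
    lam, U = np.linalg.eigh(A)              # symmetric A = U diag(lam) U^T
    keep = lam > eps
    return (U[:, keep] * lam[keep]) @ U[:, keep].T

def adjoint_cutoff(A, C_bar, eps):
    """Adjoint A_bar for C = f(A) given C_bar, via the divided-difference
    formula: A_bar = U (F * (U^T C_bar U)) U^T, where
    F_ij = (f(lam_i) - f(lam_j)) / (lam_i - lam_j) and F_ii = f'(lam_i)."""
    lam, U = np.linalg.eigh(A)
    f = np.where(lam > eps, lam, 0.0)       # cutoff applied to eigenvalues
    fp = (lam > eps).astype(float)          # f'(lam) away from the cutoff
    dl = lam[:, None] - lam[None, :]
    same = np.abs(dl) < 1e-9                # equal or repeated eigenvalues
    F = np.empty_like(dl)
    F[~same] = (f[:, None] - f[None, :])[~same] / dl[~same]
    F[same] = np.broadcast_to(fp[:, None], dl.shape)[same]
    return U @ (F * (U.T @ C_bar @ U)) @ U.T
```

Because the cutoff derivative is an indicator function, the divided-difference matrix F contains only divided differences, ones, and zeros, which is what makes a reduced-basis approximation of the formula plausible.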
id |
RCAP_c5cd6ecb48a7dc5c5ecdb6dfbacc63eb |
---|---|
oai_identifier_str |
oai:ria.ua.pt:10773/41040 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
Adjoint differentiation for matrix spectrum cutoff operation Adjoint differentiation Eigenvalues and eigenvectors Matrix decomposition QR SVD Adjoint differentiation is based on a sequence of arithmetic operations, so it requires a very large amount of memory to store all intermediate derivatives: O(N^3) for matrix multiplication. With an analytical formula we can reduce the memory usage to only the matrices themselves, O(N^2). Consider an operation f(A), where f is a scalar function and A is a symmetric matrix. Usually this operation is performed in the orthonormal basis (matrix U) formed by the eigenvectors of A. There is an analytical formula for the adjoint Ā that does not require Ū. In some applications, for example multi-linear regression, it is very common to have large matrices with a considerable number of eigenvalues close to 0, so it can be beneficial to use a low-rank (k) approximation of the original matrix. This can be done with several reduced Singular Value Decomposition (SVD) algorithms. This implies that the complete matrix U is not known, so the analytical formula cannot be applied in a straightforward manner. We define U_k as the first k columns of U (eigenvectors of A); this matrix can be found using a reduced SVD algorithm. Given an ϵ, we define the spectrum cutoff function as f(A) = ∑_{λ_i > ϵ} λ_i u_i u_i^T. Epsilon should be related to k by λ_{k+1} < ϵ < λ_k, for eigenvalues sorted in descending order. In this work, we propose an approximate formula for Ā that uses the results of a projection-based randomized SVD algorithm and the reduced matrix U_k, saving memory. We also created an interface that can be used with any given matrix and configured with several parameters. Our findings showed that we need to extend the matrix U_k, namely using a QR decomposition, to find additional column vectors that maintain the orthogonality of U. We also need to assume a default value for the unknown eigenvalues of A.
Our final results showed a mean relative error (MRE) of 1.85% for both small and large matrices when they have k small eigenvalues and N − k large ones. We did not find an improvement in computational time, but we can still save a considerable amount of memory. We concluded that this is a good approach to computing adjoint derivatives of large matrices, namely for the previously mentioned popular applications.
2024-03-12T11:42:08Z 2023-07-21T00:00:00Z 2023-07-21 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis application/pdf http://hdl.handle.net/10773/41040 eng Rodrigues, Alexandre da Rocha info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2024-03-18T01:48:08Z oai:ria.ua.pt:10773/41040 Portal AgregadorONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-20T04:02:08.938093 Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
Adjoint differentiation for matrix spectrum cutoff operation |
title |
Adjoint differentiation for matrix spectrum cutoff operation |
spellingShingle |
Adjoint differentiation for matrix spectrum cutoff operation Rodrigues, Alexandre da Rocha Adjoint differentiation Eigenvalues and eigenvectors Matrix decomposition QR SVD |
title_short |
Adjoint differentiation for matrix spectrum cutoff operation |
title_full |
Adjoint differentiation for matrix spectrum cutoff operation |
title_fullStr |
Adjoint differentiation for matrix spectrum cutoff operation |
title_full_unstemmed |
Adjoint differentiation for matrix spectrum cutoff operation |
title_sort |
Adjoint differentiation for matrix spectrum cutoff operation |
author |
Rodrigues, Alexandre da Rocha |
author_facet |
Rodrigues, Alexandre da Rocha |
author_role |
author |
dc.contributor.author.fl_str_mv |
Rodrigues, Alexandre da Rocha |
dc.subject.por.fl_str_mv |
Adjoint differentiation Eigenvalues and eigenvectors Matrix decomposition QR SVD |
topic |
Adjoint differentiation Eigenvalues and eigenvectors Matrix decomposition QR SVD |
description |
Adjoint differentiation is based on a sequence of arithmetic operations, so it requires a very large amount of memory to store all intermediate derivatives: O(N^3) for matrix multiplication. With an analytical formula we can reduce the memory usage to only the matrices themselves, O(N^2). Consider an operation f(A), where f is a scalar function and A is a symmetric matrix. Usually this operation is performed in the orthonormal basis (matrix U) formed by the eigenvectors of A. There is an analytical formula for the adjoint Ā that does not require Ū. In some applications, for example multi-linear regression, it is very common to have large matrices with a considerable number of eigenvalues close to 0, so it can be beneficial to use a low-rank (k) approximation of the original matrix. This can be done with several reduced Singular Value Decomposition (SVD) algorithms. This implies that the complete matrix U is not known, so the analytical formula cannot be applied in a straightforward manner. We define U_k as the first k columns of U (eigenvectors of A); this matrix can be found using a reduced SVD algorithm. Given an ϵ, we define the spectrum cutoff function as f(A) = ∑_{λ_i > ϵ} λ_i u_i u_i^T. Epsilon should be related to k by λ_{k+1} < ϵ < λ_k, for eigenvalues sorted in descending order. In this work, we propose an approximate formula for Ā that uses the results of a projection-based randomized SVD algorithm and the reduced matrix U_k, saving memory. We also created an interface that can be used with any given matrix and configured with several parameters. Our findings showed that we need to extend the matrix U_k, namely using a QR decomposition, to find additional column vectors that maintain the orthogonality of U. We also need to assume a default value for the unknown eigenvalues of A. Our final results showed a mean relative error (MRE) of 1.85% for both small and large matrices when they have k small eigenvalues and N − k large ones.
We did not find an improvement in computational time, but we can still save a considerable amount of memory. We concluded that this is a good approach to computing adjoint derivatives of large matrices, namely for the previously mentioned popular applications. |
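The approximation described in the abstract, extending U_k with extra orthogonal columns from a QR decomposition and assuming a default value for the unknown eigenvalues, could be sketched roughly as below. This is a hypothetical reconstruction from the abstract alone: `extend_basis`, `approx_adjoint`, the random completion columns, and the default value 0 are all assumptions, and the thesis' projection-based randomized SVD step is not reproduced.

```python
import numpy as np

def extend_basis(Uk, rng):
    """Complete the n x k orthonormal Uk to a full n x n orthonormal basis:
    append random columns and re-orthogonalize with QR."""
    n, k = Uk.shape
    M = np.concatenate([Uk, rng.standard_normal((n, n - k))], axis=1)
    Q, _ = np.linalg.qr(M)
    return Q        # first k columns span the same subspace as Uk

def approx_adjoint(Uk, lam_k, C_bar, eps, default_lam=0.0, seed=1):
    """Approximate adjoint when only Uk and its k eigenvalues are known:
    the basis is completed by QR and the unknown eigenvalues are replaced
    by an assumed default value (0 here)."""
    n, k = Uk.shape
    U = extend_basis(Uk, np.random.default_rng(seed))
    lam = np.concatenate([lam_k, np.full(n - k, default_lam)])
    f = np.where(lam > eps, lam, 0.0)       # spectrum cutoff on eigenvalues
    fp = (lam > eps).astype(float)          # its derivative away from eps
    dl = lam[:, None] - lam[None, :]
    same = np.abs(dl) < 1e-9
    F = np.empty_like(dl)
    F[~same] = (f[:, None] - f[None, :])[~same] / dl[~same]
    F[same] = np.broadcast_to(fp[:, None], dl.shape)[same]
    return U @ (F * (U.T @ C_bar @ U)) @ U.T
```

When the hidden eigenvalues really are close to the default value, the completed basis can be arbitrary within the unknown subspace, because the divided-difference matrix F is constant on that block; this is consistent with the small mean relative error the abstract reports in that regime.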
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-07-21T00:00:00Z 2023-07-21 2024-03-12T11:42:08Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10773/41040 |
url |
http://hdl.handle.net/10773/41040 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799138193895849984 |