Tensor Decomposition for Compressing Recurrent Neural Network

Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

2018

In machine learning, the Recurrent Neural Network (RNN) has become a popular architecture for sequential data modeling. However, behind their impressive performance, RNNs require a large number of parameters for both training and inference. In this paper, we aim to reduce the number of parameters while maintaining the expressive power of the RNN. We use several tensor decomposition methods, including CANDECOMP/PARAFAC (CP), Tucker decomposition, and Tensor Train (TT), to re-parameterize the Gated Recurrent Unit (GRU) RNN. We evaluate the performance of all tensor-based RNNs on sequence modeling tasks across a range of parameter counts. Based on our experimental results, TT-GRU achieved the best results across the range of parameter counts compared to the other decomposition methods.
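
To give a rough sense of why such a re-parameterization reduces the parameter count, the sketch below compares a dense weight matrix with a Tensor Train (TT) factorization of the same shape. It uses the standard TT-matrix convention, in which core k has shape (r_k, m_k, n_k, r_{k+1}) and the boundary ranks equal 1. The mode sizes and ranks are illustrative assumptions, not values taken from the paper, and the helper names `dense_params` and `tt_params` are hypothetical.

```python
from math import prod  # available in Python 3.8+


def dense_params(in_modes, out_modes):
    """Parameter count of a dense matrix of shape prod(in_modes) x prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Parameter count of a TT-matrix with cores of shape (r_k, m_k, n_k, r_{k+1}).

    `ranks` holds d + 1 values for d cores; the boundary ranks must be 1.
    """
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    assert ranks[0] == 1 and ranks[-1] == 1
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Illustrative (assumed) sizes: a 256 x 256 matrix, with each side factored as 4 * 8 * 8.
in_modes = (4, 8, 8)
out_modes = (4, 8, 8)
ranks = (1, 4, 4, 1)

dense = dense_params(in_modes, out_modes)
tt = tt_params(in_modes, out_modes, ranks)
print(f"dense: {dense}, TT: {tt}")
```

Under these assumed sizes the TT form stores far fewer values than the dense matrix. Larger ranks bring the count back up, which is the trade-off between compression and expressive power that the abstract refers to.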

Keywords:

Artificial intelligence
Recurrent neural network
Matrix decomposition
Machine learning
Tensor
Pattern recognition
Stress (mechanics)
Architecture
Tucker decomposition
Inference
Computer science
Data modeling
Expressive power
Sequential data
Tensor train
Tensor decomposition
Mathematics
Algorithm
