Communication-avoiding kernel ridge regression on parallel and distributed systems

2021 
Kernel ridge regression (KRR) is a fundamental method in machine learning. Given an n-by-d data matrix as input, a traditional implementation requires $$\Theta (n^2)$$ memory to form an n-by-n kernel matrix and $$\Theta (n^3)$$ flops to compute the final model. These time and storage costs prohibit KRR from scaling up to large datasets. For example, even on a relatively small dataset (a 520k-by-90 input requiring 357 MB), KRR requires 2 TB memory just to store the kernel matrix. The reason is that n usually is much larger than d for real-world applications. On the other hand, weak scaling becomes a problem: if we keep d and n/p fixed as p grows (p is the number of machines), the memory needed grows as $$\Theta (p)$$ per processor and the flops as $$\Theta (p^2)$$ per processor. In the perfect weak scaling situation, both the memory needed and the flops grow as $$\Theta (1)$$ per processor (i.e. memory and flops are constant). The traditional Distributed KRR implementation (DKRR) only achieved 0.32% weak scaling efficiency from 96 to 1536 processors. We propose two new methods to address these problems: the balanced KRR (BKRR) and K-means KRR (KKRR). These methods consider alternative ways to partition the input dataset into p different parts, generating p different models, and then selecting the best model among them. Compared to a conventional implementation, KKRR2 (optimized version of KKRR) improves the weak scaling efficiency from 0.32 to 38% and achieves a 591 $$\times $$ speedup for getting the same accuracy by using the same data and the same hardware (1536 processors). BKRR2 (optimized version of BKRR) achieves a higher accuracy than the current fastest method using less training time for a variety of datasets. For the applications requiring only approximate solutions, BKRR2 improves the weak scaling efficiency to 92% and achieves 3505 $$\times $$ speedup (theoretical speedup: 4096 $$\times $$ ). The source code of this paper is available at https://people.eecs.berkeley.edu/~youyang/cakrr.zip .
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    0
    Citations
    NaN
    KQI
    []