Luo Luo

Fudan University

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Trends

Author Order

Document Type

Co-Authors

Zhihua Zhang

China Communications Construction Company (China)

Haishan Ye

Xi'an Jiaotong University

Guangzeng Xie

Peking University

Cheng Chen

Shanghai Jiao Tong University

Chengchang Liu

Chinese University of Hong Kong

Zhihua Zhang

Peking University

Zhihua Zhang

Shanghai Jiao Tong University

Wu-Jun Li

Nanjing University

Weinan Zhang

Shanghai Jiao Tong University

Zihao Chen

Harbin Institute of Technology

Cooperative Institutions

Shanghai Jiao Tong University

University of Hong Kong

Hong Kong University of Science and Technology

"National Medical Research Center for Rehabilitation and Balneology" of the Ministry of Health of the Russian Federation

University of Chinese Academy of Sciences

Peking University

Southeast University

Research Medical Center

Chinese Academy of Sciences

Xi'an Jiaotong University

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Field

Decentralized Gradient-Free Methods for Stochastic Non-smooth Non-convex Optimization

Proceedings of the AAAI Conference on Artificial Intelligence (2024)

Zhenwei Lin Jingfan Xia Qi Deng Luo Luo

We consider decentralized gradient-free optimization of minimizing Lipschitz continuous functions that satisfy neither smoothness nor convexity assumption. We propose two novel gradient-free algorithms, the Decentralized Gradient-Free Method (DGFM) and its variant, the Decentralized Gradient-Free Method+ (DGFM+). Based on the techniques of randomized smoothing and gradient tracking, DGFM requires the computation of the zeroth-order oracle of a single sample in each iteration, making it less demanding in terms of computational resources for individual computing nodes. Theoretically, DGFM achieves a complexity of O(d^(3/2)δ^(-1)ε^(-4)) for obtaining a (δ,ε)-Goldstein stationary point. DGFM+, an advanced version of DGFM, incorporates variance reduction to further improve the convergence behavior. It samples a mini-batch at each iteration and periodically draws a larger batch of data, which improves the complexity to O(d^(3/2)δ^(-1)ε^(-3)). Moreover, experimental results underscore the empirical advantages of our proposed algorithms when applied to real-world datasets.

Proximal Gradient Methods

10.1609/aaai.v38i16.29697

Cite

Citations (0)

Efficient and Robust High-Dimensional Linear Contextual Bandits

Cheng Chen Luo Luo Weinan Zhang Yong Yu Yijiang Lian

The linear contextual bandits is a sequential decision-making problem where an agent decides among sequential actions given their corresponding contexts. Since large-scale data sets become more and more common, we study the linear contextual bandits in high-dimensional situations. Recent works focus on employing matrix sketching methods to accelerating contextual bandits. However, the matrix approximation error will bring additional terms to the regret bound. In this paper we first propose a novel matrix sketching method which is called Spectral Compensation Frequent Directions (SCFD). Then we propose an efficient approach for contextual bandits by adopting SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method only needs O(md) space and O(md) updating time in each round, where d is the dimensionality of the data and m is the sketching size. Theoretical analysis reveals that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.

Matrix (chemical analysis)

10.24963/ijcai.2020/588

Cite

Citations (3)

On Hankel Operators between Bergman Spaces on The Unit Ball

HAL (Le Centre pour la Communication Scientifique Directe) (2005)

Aline Bonami Luo Luo

We study the boundedness of a (small) Hankel operator between different Bergman spaces on the unit ball B in Cn. We give conditions on its symbol which are necessary and/or sufficient for the continuity of the corresponding operator from Ap(B) into Aq(B), for all finite p,q >0.

Unit sphere

Bergman space

Ball (mathematics)

Bergman kernel

Source

Cite

Citations (16)

Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates

arXiv (Cornell University) (2024)

Zhuanghua Liu Luo Luo Bryan Kian Hsiang Low

We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian. The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate that is dependent on the condition number of the problem. This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework, which results in the condition-number-free local superlinear convergence rate. Furthermore, we can boost our method by applying the block update on the Hessian approximation, which leads to an even faster local convergence rate. The numerical experiments show the proposed methods significantly outperform the baseline methods.

10.48550/arxiv.2402.02359

Cite

Citations (0)

Robust Frequent Directions with Application in Online Learning

Luo Luo Cheng Chen Zhihua Zhang Wu-Jun Li Tong Zhang

The frequent directions (FD) technique is a deterministic approach for online sketching that has many applications in machine learning. The conventional FD is a heuristic procedure that often outputs rank deficient matrices. To overcome the rank deficiency problem, we propose a new sketching strategy called robust frequent directions (RFD) by introducing a regularization term. RFD can be derived from an optimization problem. It updates the sketch matrix and the regularization term adaptively and jointly. RFD reduces the approximation error of FD without increasing the computational cost. We also apply RFD to online learning and propose an effective hyperparameter-free online Newton algorithm. We derive a regret bound for our online Newton algorithm based on RFD, which guarantees the robustness of the algorithm. The experimental studies demonstrate that the proposed method outperforms state-of-the-art second order online learning algorithms.

Robustness

Regularization

Online algorithm

Hyperparameter

Sketch

Matrix Completion

Cite

Citations (9)

Fast stochastic second-order method logarithmic in condition number

Pattern Recognition (2018)

Haishan Ye Guangzeng Xie Luo Luo Zhihua Zhang

10.1016/j.patcog.2018.11.031

Cite

Citations (1)

A Proximal Stochastic Quasi-Newton Algorithm

arXiv (Cornell University) (2016)

Luo Luo Zihao Chen Zhihua Zhang Wujun Li

In this paper, we discuss the problem of minimizing the sum of two convex functions: a smooth function plus a non-smooth function. Further, the smooth part can be expressed by the average of a large number of smooth component functions, and the non-smooth part is equipped with a simple proximal mapping. We propose a proximal stochastic second-order method, which is efficient and scalable. It incorporates the Hessian in the smooth part of the function and exploits multistage scheme to reduce the variance of the stochastic gradient. We prove that our method can achieve linear rate of convergence.

10.48550/arxiv.1602.00223

Cite

Citations (4)

Robust crowdsourced learning

Zhiquan Liu Luo Luo Wu-Jun Li

In general, a large amount of labels are needed for supervised learning algorithms to achieve satisfactory performance. It's typically very time-consuming and money-consuming to get such kind of labeled data. Recently, crowdsourcing services provide an effective way to collect labeled data with much lower cost. Hence, crowdsourced learning (CL), which performs learning with labeled data collected from crowdsourcing services, has become a very hot and interesting research topic in recent years. Most existing CL methods exploit only the labels from different workers (annotators) for learning while ignoring the attributes of the instances. In many real applications, the attributes of the instances are actually the most discriminative information for learning. Hence, CL methods with attributes have attracted more and more attention from CL researchers. One representative model of such kind is the personal classifier (PC) model, which has achieved the state-of-the-art performance. However, the PC model makes an unreasonable assumption that all the workers contribute equally to the final classification. This contradicts the fact that different workers have different quality (ability) for data labeling. In this paper, we propose a novel model, called robust personal classifier (RPC), for robust crowdsourced learning. Our model can automatically learn an expertise score for each worker. This expertise score reflects the inherent quality of each worker. The final classifier of our RPC model gives high weights for good workers and low weights for poor workers or spammers, which is more reasonable than PC model with equal weights for all workers. Furthermore, the learned expertise score can be used to eliminate spammers or low-quality workers. Experiments on simulated datasets and UCI datasets show that the proposed model can dramatically outperform the baseline models such as PC model in terms of classification accuracy and ability to detect spammers.

Crowdsourcing

Discriminative model

Labeled data

10.1109/bigdata.2013.6691593

Cite

Citations (6)

Sketched Follow-The-Regularized-Leader for Online Factorization Machine

Luo Luo Wenpeng Zhang Zhihua Zhang Wenwu Zhu Tong Zhang

Factorization Machine (FM) is a supervised machine learning model for feature engineering, which is widely used in many real-world applications. In this paper, we consider the case that the data samples arrive sequentially. The existing convex formulation for online FM has the strong theoretical guarantee and stable performance in practice, but the computational cost is typically expensive when the data is high-dimensional. To address this weakness, we devise a novel online learning algorithm called Sketched Follow-The-Regularizer-Leader (SFTRL). SFTRL presents the parameters of FM implicitly by maintaining low-rank matrices and updates the parameters via sketching. More specifically, we propose Generalized Frequent Directions to approximate indefinite symmetric matrices in a streaming way, making that the sum of historical gradients for FM could be estimated with tighter error bound efficiently. With mild assumptions, we prove that the regret bound of SFTRL is close to that of the standard FTRL. Experimental results show that SFTRL has better prediction quality than the state-of-the-art online FM algorithms in much lower time and space complexities.

Rank (graph theory)

Learning to Rank

Feature (linguistics)

10.1145/3219819.3220044

Cite

Citations (7)

Optimizing over Multiple Distributions under Generalized Quasar-Convexity Condition

arXiv (Cornell University) (2024)

Shihong Ding Yang Long Luo Luo Cong Fang

We study a typical optimization model where the optimization variable is composed of multiple probability distributions. Though the model appears frequently in practice, such as for policy problems, it lacks specific analysis in the general setting. For this optimization problem, we propose a new structural condition/landscape description named generalized quasar-convexity (GQC) beyond the realms of convexity. In contrast to original quasar-convexity \citep{hinder2020near}, GQC allows an individual quasar-convex parameter $\gamma_i$ for each variable block $i$ and the smaller of $\gamma_i$ implies less block-convexity. To minimize the objective function, we consider a generalized oracle termed as the internal function that includes the standard gradient oracle as a special case. We provide optimistic mirror descent (OMD) for multiple distributions and prove that the algorithm can achieve an adaptive $\tilde{\mathcal{O}}((\sum_{i=1}^d1/\gamma_i)\epsilon^{-1})$ iteration complexity to find an $epsilon$-suboptimal global solution without pre-known the exact values of $\gamma_i$ when the objective admits "polynomial-like" structural. Notably, it achieves iteration complexity that does not explicitly depend on the number of distributions and strictly faster $(\sum_{i=1}^d 1/\gamma_i \text{ v.s. } d\max_{i\in[1:d]} 1/\gamma_i)$ than mirror decent methods. We also extend GQC to the minimax optimization problem proposing the generalized quasar-convexity-concavity (GQCC) condition and a decentralized variant of OMD with regularization. Finally, we show the applications of our algorithmic framework on discounted Markov Decision Processes problem and Markov games, which bring new insights on the landscape analysis of reinforcement learning.

Convexity

10.48550/arxiv.2407.14839

Cite

Citations (0)