26,397 Hits in 3.5 sec

Improved Sample Complexity for Stochastic Compositional Variance Reduced Gradient [article]

Tianyi Lin, Chenyou Fan, Mengdi Wang, Michael I. Jordan
2020 arXiv   pre-print
In this paper, we develop a new stochastic compositional variance-reduced gradient algorithm with a sample complexity of O((m+n)log(1/ϵ)+1/ϵ^3), where m+n is the total number of samples.  ...  Convex composition optimization is an emerging topic that covers a wide range of applications arising from stochastic optimal control, reinforcement learning and multi-stage stochastic programming.  ...  Conclusions: We propose a stochastic compositional variance-reduced gradient algorithm for convex composition optimization with an improved sample complexity.  ... 
arXiv:1806.00458v5 fatcat:2u3hxtqg4nao3iom4owbas7ire
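
For orientation, the convex composition problem these entries refer to is usually stated as a two-level finite sum; a generic textbook formulation (not quoted from this particular paper) is
\[
\min_{x \in \mathbb{R}^d} \; F(x) = f\big(g(x)\big),
\qquad
f(y) = \frac{1}{m}\sum_{i=1}^{m} f_i(y),
\qquad
g(x) = \frac{1}{n}\sum_{j=1}^{n} g_j(x),
\]
and the quoted sample complexity O((m+n)log(1/ϵ)+1/ϵ^3) counts evaluations of the component functions f_i, g_j and their gradients needed to reach an ϵ-accurate solution.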

Efficient Smooth Non-Convex Stochastic Compositional Optimization via Stochastic Recursive Gradient Descent

Huizhuo Yuan, Xiangru Lian, Chris Junchi Li, Ji Liu, Wenqing Hu
2019 Neural Information Processing Systems  
Such a complexity is known to be the best one among IFO complexity results for non-convex stochastic compositional optimization.  ...  We employ a recently developed idea of Stochastic Recursive Gradient Descent to design a novel algorithm named SARAH-Compositional, and prove a sharp Incremental First-order Oracle (IFO) complexity upper  ...  Conclusion In this paper, we propose a novel algorithm called SARAH-Compositional for solving stochastic compositional optimization problems using the idea of a recently proposed variance reduced gradient  ... 
dblp:conf/nips/YuanLLLH19 fatcat:l3ld7pyycbdjnmdca7jvjxl2qq
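
As background for this entry, here is a minimal sketch of the plain (single-level) SARAH recursive gradient estimator that SARAH-Compositional builds on; the compositional version additionally tracks inner-function values and Jacobians, which is omitted here. The callables grad_full and grad_i are assumed helpers, not taken from the paper.

import numpy as np

def sarah_inner_loop(x0, grad_full, grad_i, n, step=0.01, inner_iters=50, seed=0):
    # v_0 is the full gradient at the anchor point x_0
    rng = np.random.default_rng(seed)
    x_prev, v = x0, grad_full(x0)
    x = x_prev - step * v
    for _ in range(inner_iters):
        i = rng.integers(n)                        # sample one component index
        v = grad_i(x, i) - grad_i(x_prev, i) + v   # recursive estimator (biased, unlike SVRG)
        x_prev, x = x, x - step * v
    return x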

Accelerated Method for Stochastic Composition Optimization With Nonsmooth Regularization

Zhouyuan Huo, Bin Gu, Ji Liu, Heng Huang
2018 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
; for the general composition problem, our algorithm significantly improves the state-of-the-art convergence rate from O(T^{-1/2}) to O((n_1+n_2)^{2/3} T^{-1}).  ...  To the best of our knowledge, our method admits the fastest convergence rate for stochastic composition optimization: for the strongly convex composition problem, our algorithm is proved to admit linear convergence  ...  x^{s+1}_{t+1} ← prox_{ηh}(x^{s+1}_t − η v^{s+1}_t); end for; x̄^{s+1} ← x^{s+1}_m; end for. Variance Reduced Stochastic Compositional Proximal Gradient: In this section, we propose a variance reduced stochastic compositional  ... 
doi:10.1609/aaai.v32i1.11795 fatcat:tzozwu24undftfo2frzjxwgoem
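
To illustrate the "nonsmooth regularization" part of this entry: a minimal sketch of a proximal update x ← prox_{ηh}(x − ηv) for the common special case h = λ‖·‖_1 (soft thresholding). This is a generic example rather than the paper's full algorithm; v stands for whatever variance-reduced gradient estimate the method produces.

import numpy as np

def soft_threshold(z, tau):
    # proximal operator of tau * ||.||_1, applied coordinate-wise
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_gradient_step(x, v, eta, lam):
    # one proximal step for an objective of the form f(g(x)) + lam * ||x||_1
    return soft_threshold(x - eta * v, eta * lam)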

Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex Compositional Optimization [article]

Huizhuo Yuan, Xiangru Lian, Ji Liu
2020 arXiv   pre-print
Such a complexity is known to be the best one among IFO complexity results for non-convex stochastic compositional optimization, and is believed to be optimal.  ...  We employ a recently developed idea of Stochastic Recursive Gradient Descent to design a novel algorithm named SARAH-Compositional, and prove a sharp Incremental First-order Oracle (IFO) complexity upper  ...  Improved oracle complexity for stochastic compositional variance reduced gradient. arXiv preprint arXiv:1806.00458, 2018. Liu Liu, Ji Liu, and Dacheng Tao.  ... 
arXiv:1912.13515v2 fatcat:764ag2w3ujaydpseimqu6mi2ra

Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems

Tianyi Chen, Yuejiao Sun, Wotao Yin
2021 Neural Information Processing Systems  
Under certain regularity conditions, applying our results to stochastic compositional, min-max, and reinforcement learning problems either improves or matches the best-known sample complexity in the respective  ...  This paper unifies several SGD-type updates for stochastic nested problems into a single SGD approach that we term ALternating Stochastic gradient dEscenT (ALSET) method.  ...  As a by-product, this general result also improves the existing sample complexity of the min-max and compositional cases. It matches the sample complexity of SGD for single-level stochastic problems.  ... 
dblp:conf/nips/ChenSY21 fatcat:b6r6djgpcbf73ajblcoptppuuy
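
For context, the stochastic nested (bilevel) problem that ALSET targets is commonly written as follows; this is a generic formulation, not quoted from the paper, with the compositional and min-max settings of the abstract recovered as special cases:
\[
\min_{x} \; \mathbb{E}_{\xi}\!\left[ f\big(x, y^{*}(x); \xi\big) \right]
\quad \text{s.t.} \quad
y^{*}(x) \in \arg\min_{y} \; \mathbb{E}_{\phi}\!\left[ g(x, y; \phi) \right].
\]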

Improved Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization [article]

Tianyi Lin, Chenyou Fan, Mengdi Wang
2019 arXiv   pre-print
We consider the nonsmooth convex composition optimization problem where the objective is a composition of two finite-sum functions and analyze stochastic compositional variance reduced gradient (SCVRG)  ...  More specifically, our method achieves a total IFO complexity of O((m+n)log(1/ϵ)+1/ϵ^3), which improves on the O(1/ϵ^{3.5}) and O((m+n)/√ϵ) complexities obtained by SCGD and accelerated gradient descent (AGD), respectively  ...  Fast stochastic variance reduced ADMM for stochastic composition optimization.  ... 
arXiv:1802.02339v7 fatcat:b55jnnsukngwddiqevwtjxxkfy

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks [article]

Weilin Cong, Rana Forsati, Mahmut Kandemir, Mehrdad Mahdavi
2021 arXiv   pre-print
We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and explicitly reduces the variance introduced by embedding  ...  However, existing sampling methods are mostly based on the graph structural information and ignore the dynamicity of optimization, which leads to high variance in estimating the stochastic gradients.  ...  reduce the variance in unbiased stochastic gradients.  ... 
arXiv:2006.13866v2 fatcat:em7gsyj23vcrbfgkt4dzwo45fa

Momentum Schemes with Stochastic Variance Reduction for Nonconvex Composite Optimization [article]

Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh
2019 arXiv   pre-print
Two new stochastic variance-reduced algorithms named SARAH and SPIDER have been recently proposed, and SPIDER has been shown to achieve a near-optimal gradient oracle complexity for nonconvex optimization  ...  the near-optimal gradient oracle complexity for achieving a generalized first-order stationary condition.  ...  Such an issue has been successfully resolved by using more advanced stochastic variance-reduced gradient estimators that induce a smaller variance, leading to the design of a variety of stochastic variance-reduced  ... 
arXiv:1902.02715v3 fatcat:arhpvqvorngv3pqlktcslgkbbu

Stochastic Gradient Made Stable: A Manifold Propagation Approach for Large-Scale Optimization [article]

Yadong Mu and Wei Liu and Wei Fan
2016 arXiv   pre-print
A stochastic gradient is typically calculated from a limited number of samples (known as a mini-batch), so it potentially incurs a high variance and causes the estimated parameters to bounce around the optimal  ...  This way, S3GD is able to generate a highly accurate estimate of the exact gradient from each mini-batch with largely reduced computational complexity.  ...  For example, the work in [28] explicitly expresses the stochastic gradient variance and proves that constructing the mini-batch using a special nonuniform sampling strategy is able to reduce the stochastic  ... 
arXiv:1506.08350v2 fatcat:hw4n7r64p5ddfic7lacyofjjza

Fast Training Method for Stochastic Compositional Optimization Problems

Hongchang Gao, Heng Huang
2021 Neural Information Processing Systems  
To address this problem, we propose novel decentralized stochastic compositional gradient descent methods to efficiently solve the large-scale stochastic compositional optimization problem.  ...  Existing methods for the stochastic compositional optimization problem only focus on the single-machine scenario, which is far from satisfactory when data are distributed on different devices.  ...  But it has a worse convergence rate than the standard stochastic gradient descent method. To improve it, a series of variance-reduced methods have been proposed.  ... 
dblp:conf/nips/GaoH21 fatcat:lf4sg4ivefahxdpi6yf2pdwt2q

Achieving Linear Speedup in Decentralized Stochastic Compositional Minimax Optimization [article]

Hongchang Gao
2023 arXiv   pre-print
To address this issue, we developed a novel decentralized stochastic compositional gradient descent ascent with momentum algorithm to reduce the consensus error in the inner-level function.  ...  The stochastic compositional minimax problem has attracted a surge of attention in recent years since it covers many emerging machine learning models.  ...  In addition, some efforts have been made to improve the sample complexity and communication complexity by compressing the communicated variables [18, 11] or reducing the gradient variance [17, 33] .  ... 
arXiv:2307.13430v2 fatcat:esxqcr3odfbrlnv7j4bffufpiq

On the Convergence of Local Stochastic Compositional Gradient Descent with Momentum

Hongchang Gao, Junyi Li, Heng Huang
2022 International Conference on Machine Learning  
In this paper, we developed a novel local stochastic compositional gradient descent with momentum method, which facilitates Federated Learning for the stochastic compositional problem.  ...  Meanwhile, our communication complexity O(1/ϵ^3) can match existing methods. To the best of our knowledge, this is the first work achieving such favorable sample and communication complexities.  ...  2021) and reducing the variance of stochastic gradients (Khanduri et al., 2021; Karimireddy et al., 2020a; Das et al., 2020).  ... 
dblp:conf/icml/GaoLH22 fatcat:mhwjrykvtzg2re5luccjcxjiuu

Optimal Algorithms for Stochastic Multi-Level Compositional Optimization [article]

Wei Jiang, Bokun Wang, Yibo Wang, Lijun Zhang, Tianbao Yang
2022 arXiv   pre-print
To address these limitations, we propose a Stochastic Multi-level Variance Reduction method (SMVR), which achieves the optimal sample complexity of 𝒪(1 / ϵ^3) to find an ϵ-stationary point for non-convex  ...  Furthermore, when the objective function satisfies the convexity or Polyak-Łojasiewicz (PL) condition, we propose a stage-wise variant of SMVR and improve the sample complexity to 𝒪(1 / ϵ^2) for convex  ...  The authors would like to thank the anonymous reviewers for their helpful comments.  ... 
arXiv:2202.07530v4 fatcat:42fwzrnfqbf7jlltx75ibpfuou
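
The multi-level problem in this entry generalizes the two-level composition above; a standard way to write the K-level stochastic objective (generic formulation, not quoted from the paper) is
\[
\min_{x} \; F(x) = f_{K}\big( f_{K-1}( \cdots f_{1}(x) \cdots ) \big),
\qquad
f_{k}(\cdot) = \mathbb{E}_{\xi_{k}}\!\left[ f_{k}(\cdot\,;\xi_{k}) \right], \quad k = 1,\dots,K.
\]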

Randomized Smoothing SVRG for Large-scale Nonsmooth Convex Optimization [article]

Wenjie Huang
2018 arXiv   pre-print
We develop and analyze a new algorithm that achieves a robust linear convergence rate, and both its time complexity and gradient complexity are superior to those of state-of-the-art nonsmooth algorithms and subgradient-based  ...  In [Johnson and Zhang, 2013] and its proximal extension [Xiao and Zhang, 2014], stochastic variance reduced gradient (SVRG) is proposed, which reduces the variance of stochastic gradient descent  ...  In fact, the same reduced-variance bounds and convergence rate hold for problems with or without the composite function R(x).  ... 
arXiv:1805.05189v1 fatcat:7mjcgk7acrg2zgw3nv2tqc5cr4
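
Since this entry builds directly on SVRG [Johnson and Zhang, 2013], here is a minimal sketch of the classic SVRG variance-reduced estimator v = ∇f_i(x) − ∇f_i(x̃) + ∇f(x̃), where x̃ is a periodically refreshed snapshot; grad_full and grad_i are assumed helper callables, not part of the cited papers.

import numpy as np

def svrg_epoch(x_snapshot, grad_full, grad_i, n, step=0.1, inner_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    mu = grad_full(x_snapshot)          # full gradient at the snapshot point
    x = x_snapshot.copy()
    for _ in range(inner_iters):
        i = rng.integers(n)             # sample one component index
        # unbiased, variance-reduced gradient estimate at the current iterate
        v = grad_i(x, i) - grad_i(x_snapshot, i) + mu
        x = x - step * v
    return x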

Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient [article]

Haonan Jia, Xiao Zhang, Jun Xu, Wei Zeng, Hao Jiang, Xiaohui Yan, Ji-Rong Wen
2020 arXiv   pre-print
Stochastic variance-reduced gradient methods such as SVRG have been applied to reduce the estimation variance (Zhao et al. 2019).  ...  Deep Q-learning algorithms often suffer from poor gradient estimations with an excessive variance, resulting in unstable training and poor sampling efficiency.  ...  of our stochastic recursive gradient for the variance reduction in DQN.  ... 
arXiv:2007.12817v1 fatcat:n4ta5mxeyjeulbilojziu7kvnq
Showing results 1 — 15 out of 26,397 results