When Are Nonconvex Optimization Problems Not Scary?
2017
This class of nonconvex problems has two distinctive features: (i) All local minimizers are also global. ...
This thesis focuses on a class of nonconvex optimization problems which can be solved to global optimality with polynomial-time algorithms. ...
An important class of nonconvex problems are discrete optimization problems. ...
doi:10.7916/d8251j7h
fatcat:aridvancfvfq5acwn2or3lxazy
When Are Nonconvex Problems Not Scary?
[article]
2016
arXiv
pre-print
In this note, we focus on smooth nonconvex optimization problems that obey: (1) all local minimizers are also global; and (2) around any saddle point or local maximizer, the objective has a negative directional ...
Finally we highlight alternatives, and open problems in this direction. ...
Introduction General nonconvex optimization problems (henceforth "NCVX problems" for brevity) are NP-hard, even when the goal is computing only a local minimizer [MK87, Ber99]. ...
arXiv:1510.06096v2
fatcat:r2jzsjmhfzgufprx3aklv3ofde
Finding the Sparsest Vectors in a Subspace: Theory, Algorithms, and Applications
[article]
2020
arXiv
pre-print
In this paper, we overview recent advances on global nonconvex optimization theory for solving this problem, ranging from geometric analysis of its optimization landscapes, to efficient optimization algorithms ...
for solving the associated nonconvex optimization problem, to applications in machine intelligence, representation learning, and imaging sciences. ...
Wright, "When are nonconvex problems not scary?," arXiv preprint arXiv:1510.06096, 2015.
[106] Y. Zhai, Z. Yang, Z. Liao, J. Wright, and Y. ...
arXiv:2001.06970v1
fatcat:zluhhl3635bzrnnk7fjw5tvi7a
Convergence of Cubic Regularization for Nonconvex Optimization under KL Property
[article]
2018
arXiv
pre-print
Cubic-regularized Newton's method (CR) is a popular algorithm that guarantees to produce a second-order stationary solution for solving nonconvex optimization problems. ...
However, existing understandings of the convergence rate of CR are conditioned on special types of geometrical properties of the objective function. ...
When Are Nonconvex Problems Not Scary? arXiv:1510.06096v2. [Sun et al., 2017] Sun, J., Qu, Q., and Wright, J. (2017). A geometrical analysis of phase retrieval. ...
arXiv:1808.07382v1
fatcat:7eutnjrplfckzbt5na2satbbbe
Entropic metric alignment for correspondence problems
2016
ACM Transactions on Graphics
These applications expand the scope of entropic GW correspondence to major shape analysis problems and are stable to distortion and noise. ...
With these applications in mind, we present an algorithm for probabilistic correspondence that optimizes an entropy-regularized Gromov-Wasserstein (GW) objective. ...
That said, γ is meaningful even when Σ0 and Σ are not isometric, measuring the optimal deviation from preserving the distance structure of a surface. ...
doi:10.1145/2897824.2925903
fatcat:u22rmttwwfcypdkxzpgniyjurq
Generalized Orthogonal Procrustes Problem under Arbitrary Adversaries
[article]
2024
arXiv
pre-print
The highlight of our work is that the theoretical guarantees are purely algebraic and do not assume any statistical priors of the additive adversaries, and thus it applies to various interesting settings ...
Despite its tremendous practical importance, it is generally an NP-hard problem to find the least squares estimator. ...
Regarding the optimization landscape, many works have shown that the seemingly nonconvex objective function is not as "scary" as expected [11, 50, 51]: the landscape is usually benign and contains only one ...
arXiv:2106.15493v2
fatcat:4fpcshkosrhpbgcroheu6adoti
Analysis of the Optimization Landscapes for Overcomplete Representation Learning
[article]
2019
arXiv
pre-print
In this work, we show these problems can be formulated as ℓ^4-norm optimization problems with spherical constraint, and study the geometric properties of their nonconvex optimization landscapes. ...
Despite the empirical success of simple nonconvex algorithms, theoretical justifications of why these methods work so well are far from satisfactory. ...
When are nonconvex problems not scary? arXiv preprint arXiv:1510.06096, 2015.
[SQW16a] Ju Sun, Qing Qu, and John Wright. ...
arXiv:1912.02427v2
fatcat:fb3x6iimyjcjdpworgsqkne5w4
Efficient Dictionary Learning with Gradient Descent
[article]
2018
arXiv
pre-print
Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule ...
For some highly structured nonconvex problems however, the success of gradient descent can be understood by studying the geometry of the objective. ...
When are nonconvex problems not scary? arXiv preprint arXiv:1510.06096, 2015. [37] Ju Sun, Qing Qu, and John Wright. A geometric analysis of phase retrieval. ...
arXiv:1809.10313v1
fatcat:2iopwmq3p5hxxkzux7c3cww4ha
Sharp Analysis for Nonconvex SGD Escaping from Saddle Points
[article]
2019
arXiv
pre-print
optimization problems, when the objective function satisfies gradient-Lipschitz, Hessian-Lipschitz, and dispersive noise assumptions. ...
Such an SGD rate matches, up to a polylogarithmic factor of problem-dependent parameters, the rate of most accelerated nonconvex stochastic optimization algorithms that adopt additional techniques, such as ...
When are nonconvex problems not scary? arXiv preprint arXiv:1510.06096. Sun, J., Qu, Q., & Wright, J. (2017). Complete dictionary recovery over the sphere i: Overview and the geometric picture. ...
arXiv:1902.00247v2
fatcat:q2olwny57revbl5z7vytcn5gfq
On Gradient Descent Algorithm for Generalized Phase Retrieval Problem
[article]
2016
arXiv
pre-print
Although the cost function is nonconvex, the global convergence of the gradient descent algorithm from a random initialization is studied, when m is large enough. ...
The problem can be reformulated as a least-squares minimization problem. ...
ACKNOWLEDGMENTS The authors are indebted to Stefano Marchesini for providing us with the gold balls data set used in numerical simulations. The first author would like to thank Ms. ...
arXiv:1607.01121v1
fatcat:xb3va5dj4ng25asiiw5yhve6zu
Stochastic Approximation for Online Tensorial Independent Component Analysis
[article]
2021
arXiv
pre-print
In this paper, we present a convergence analysis for an online tensorial ICA algorithm, by viewing the problem as a nonconvex stochastic approximation problem. ...
Natasha 2: Faster non-convex optimization than SGD. In Advances in Neural Information Processing Systems, pages 2676-2687, 2018. (Cited on page 12.) Zeyuan Allen-Zhu and Yuanzhi Li. ...
First efficient convergence for streaming k-PCA: a global, gap-free, and near-optimal rate. The 58th Annual Symposium on Foundations of Computer Science, 2017. (Cited on page 13.) ...
arXiv:2012.14415v2
fatcat:c2yhwj3mifcazgymxpmsvwjgxi
Beyond Gradient and Priors in Privacy Attacks: Leveraging Pooler Layer Inputs of Language Models in Federated Learning
[article]
2024
arXiv
pre-print
However, when the new metric is used in discrete and continuous optimization together, the results are not always a win-win. ...
As a result, the reliability of these recovered features might be diminished when they are used as ground truth during optimization. ...
arXiv:2312.05720v4
fatcat:u7cm2ts7lzhtvpa6g23o5pnale
Escaping Saddle Points Faster with Stochastic Momentum
2020
International Conference on Learning Representations
Stochastic gradient descent (SGD) with stochastic momentum is popular in nonconvex stochastic optimization and particularly for the training of deep neural networks. ...
In standard SGD, parameters are updated by improving along the path of the gradient at the current iterate on a batch of examples, where the addition of a "momentum" term biases the update in the direction ...
When are nonconvex problems not scary? NIPS Workshop on Non-convex Optimization for Machine Learning: Theory and Practice, 2015. Ju Sun, Qing Qu, and John Wright. ...
dblp:conf/iclr/WangLA20
fatcat:53xyuzycvbb3lbwmwxpdhtl7zq
Recovery and Generalization in Over-Realized Dictionary Learning
[article]
2020
arXiv
pre-print
In over two decades of research, the field of dictionary learning has gathered a large collection of successful applications, and theoretical guarantees for model recovery are known only whenever optimization ...
When are nonconvex problems not scary? arXiv preprint arXiv:1510.06096, 2015.
Yuandong Tian. Over-parameterization as a catalyst for better generalization of deep relu network. ...
Nonetheless, one should keep in mind that the optimization landscape of these optimization problems is still not fully understood, and practical local-search algorithms may converge to a local minimum ...
arXiv:2006.06179v2
fatcat:lvku4p56szcslnim3hgvy4pvpi
An Unconstrained Layer-Peeled Perspective on Neural Collapse
[article]
2022
arXiv
pre-print
Empirically, we show that our results also hold during the training of neural networks in real-world tasks when explicit regularization or weight decay is not used. ...
We prove that gradient flow on this model converges to critical points of a minimum-norm separation problem exhibiting neural collapse in its global minimizer. ...
ACKNOWLEDGMENTS We are grateful to Qing Qu and X.Y. ...
arXiv:2110.02796v2
fatcat:rpnczdoqhvff3g6y7smk4spkzu
Showing results 1 — 15 out of 24 results