22,449 Hits in 2.4 sec

On the Local Minima of the Empirical Risk [article]

Chi Jin, Lydia T. Liu, Rong Ge, Michael I. Jordan
2018 arXiv   pre-print
Our objective is to find the ϵ-approximate local minima of the underlying function F while avoiding the shallow local minima---arising because of the tolerance ν---which exist only in f.  ...  Population risk is always of primary interest in machine learning; however, learning algorithms only have access to the empirical risk.  ...  In the context of empirical risk minimization, such a result would allow fewer samples to be taken while still providing a strong guarantee on avoiding local minima.  ... 
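The contrast drawn here between population and empirical risk can be made concrete with a small sketch; the per-sample loss and the Gaussian data below are hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(theta, x):
    # Hypothetical nonconvex per-sample loss, chosen only for illustration.
    return np.cos(theta) + (theta - x) ** 2 / 10

# Empirical risk: average loss over the n samples the learner actually sees.
x_samples = rng.normal(loc=0.0, scale=1.0, size=50)
def empirical_risk(theta):
    return np.mean([loss(theta, x) for x in x_samples])

# The population risk E_x[loss(theta, x)] is approximated with many samples.
x_many = rng.normal(loc=0.0, scale=1.0, size=100_000)
def population_risk_estimate(theta):
    return np.mean([loss(theta, x) for x in x_many])
```

With only 50 samples the empirical curve fluctuates around the population curve, which is the gap (the tolerance ν) that creates shallow spurious minima in f but not in F.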
arXiv:1803.09357v2 fatcat:6zvclcqzanhdtl4fjfhsuufkii

Regularizing Neural Networks via Adversarial Model Perturbation [article]

Yaowei Zheng, Richong Zhang, Yongyi Mao
2021 arXiv   pre-print
This work proposes a new regularization scheme, based on the understanding that the flat local minima of the empirical risk cause the model to generalize better.  ...  Compared with most existing regularization schemes, AMP has strong theoretical justification, in that minimizing the AMP loss can be shown theoretically to favour flat local minima of the empirical risk  ...  Figure 2 (left) sketches an empirical risk curve L_ERM, which contains two local minima, a sharp one on the left and a flat one on the right.  ... 
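The preference for flat minima can be made concrete: an AMP-style loss evaluates the worst-case empirical risk under a bounded weight perturbation, so sharp minima are penalized. A toy one-dimensional sketch (the risk, radius, and grid search below are hypothetical stand-ins for the paper's inner maximization):

```python
import numpy as np

def risk(theta):
    # Toy risk with a sharp minimum near theta=0 and a flat one near theta=4.
    return np.minimum(50 * theta ** 2, 0.5 * (theta - 4) ** 2)

def amp_loss(theta, eps=0.5, n_grid=101):
    # Worst-case risk over perturbations |delta| <= eps.
    # (A grid search stands in for the inner gradient ascent used in practice.)
    deltas = np.linspace(-eps, eps, n_grid)
    return np.max(risk(theta + deltas))

# The sharp minimum is heavily penalized; the flat one is barely affected.
sharp = amp_loss(0.0)
flat = amp_loss(4.0)
```

Minimizing `amp_loss` instead of `risk` therefore steers the optimizer toward the flat basin, which is the mechanism the abstract describes.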
arXiv:2010.04925v4 fatcat:cscc5e5tczhurgh7ctz3ldimlq

Characterization of Excess Risk for Locally Strongly Convex Population Risk [article]

Mingyang Yi, Ruoyu Wang, Zhi-Ming Ma
2022 arXiv   pre-print
For non-convex problems with d model parameters such that d/n is smaller than a threshold independent of n, the order of (1/n) can be maintained if the empirical risk has no spurious local minima with  ...  We establish upper bounds on the expected excess risk of models trained by proper iterative algorithms which approximate the local minima.  ...  We first show that the local strong convexity around the local minima of the population risk (population local minima) carries over to the local minima of the empirical risk (empirical local minima  ... 
arXiv:2012.02456v4 fatcat:5dscclksvzgcjb3cijfyp6357m

Theory II: Landscape of the Empirical Risk in Deep Learning [article]

Qianli Liao, Tomaso Poggio
2017 arXiv   pre-print
We further experimentally explored and visualized the landscape of the empirical risk of a DCNN on CIFAR-10 during the entire training process, and especially the global minima.  ...  Previous theoretical work on deep learning and neural network optimization tends to focus on avoiding saddle points and local minima.  ...  The second part is about the landscape of the minima of the empirical risk: what can we say in general about global and local minima?  ... 
arXiv:1703.09833v2 fatcat:gpqjwxkajzcxhgi4cog7junc6y

Piecewise linear activations substantially shape the loss surfaces of neural networks [article]

Fengxiang He, Bohan Wang, Dacheng Tao
2020 arXiv   pre-print
We first prove that the loss surfaces of many neural networks have infinitely many spurious local minima, which are defined as local minima with higher empirical risk than the global minima.  ...  The constructed spurious local minima are concentrated in one cell as a valley: they are connected with each other by a continuous path, on which the empirical risk is invariant.  ...  All local minima in a cell are concentrated as a local minimum valley: on a local minimum valley, all local minima are connected with each other by a continuous path, on which the empirical risk is invariant  ... 
arXiv:2003.12236v1 fatcat:r54rh2tczzbkhgbduhwcocl3zm

Distribution-Dependent Analysis of Gibbs-ERM Principle [article]

Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári
2019 arXiv   pre-print
The first part of our analysis focuses on the localized excess risk in the vicinity of a fixed local minimizer.  ...  This result is then extended to bounds on the global excess risk, by characterizing the probabilities of local minima (and their complement) under Gibbs densities, a result which might be of independent interest  ...  ACKNOWLEDGMENTS The authors would like to thank Olivier Bousquet, Sébastien Gerchinovitz, and Abbas Mehrabian for stimulating discussions on this work.  ... 
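The Gibbs densities mentioned in the snippet weight parameters by exp(-β · risk), so probability mass concentrates near low-risk regions as the inverse temperature β grows; a toy one-dimensional illustration (the double-well risk below is hypothetical):

```python
import numpy as np

# Discretized parameter space and a toy risk with minima at theta = +/-1.
thetas = np.linspace(-3, 3, 601)
risk = (thetas ** 2 - 1) ** 2 / 4
beta = 50.0

# Gibbs density proportional to exp(-beta * risk), normalized over the grid.
weights = np.exp(-beta * risk)
gibbs = weights / weights.sum()

# Fraction of probability mass within 0.3 of either local minimum.
near_minima = gibbs[np.abs(np.abs(thetas) - 1) < 0.3].sum()
```

At this β almost all mass sits near the two minima, which is why bounding the probability of local-minimum neighborhoods (and their complement) under the Gibbs density controls the global excess risk.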
arXiv:1902.01846v1 fatcat:jro2br2slzg25fodq257s5wo24

The landscape of empirical risk for nonconvex losses

Song Mei, Yu Bai, Andrea Montanari
2018 Annals of Statistics  
2), the empirical risk has exactly two local minima θ_+, θ_-, related by an exchange of the two classes.  ...  In general, gradient descent and other local optimization procedures are expected to converge to local minima of the empirical risk R_n(θ).  ... 
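The convergence of gradient descent to one of two local minima, as described, can be sketched on a toy double-well risk (this risk is illustrative, not the paper's classification model):

```python
import numpy as np

def grad(theta):
    # Gradient of the toy risk R(theta) = (theta^2 - 1)^2 / 4,
    # which has exactly two local minima, at theta = +1 and theta = -1.
    return theta * (theta ** 2 - 1)

def gradient_descent(theta0, lr=0.1, steps=200):
    theta = theta0
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta

# The initialization decides which of the two minima is reached.
theta_plus = gradient_descent(0.5)
theta_minus = gradient_descent(-0.5)
```

The two basins of attraction mirror the θ_+/θ_- symmetry in the abstract: a local method lands in whichever minimum its starting point favors.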
doi:10.1214/17-aos1637 fatcat:646bhqjaovclbdqcsuof2esoam

Theory of Deep Learning IIb: Optimization Properties of SGD [article]

Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio
2018 arXiv   pre-print
In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent.  ...  The main new result in this paper is theoretical and experimental evidence for the following conjecture about SGD: SGD concentrates in probability -- like the classical Langevin equation -- on large volume  ...  We gratefully acknowledge the support of NVIDIA Corporation with the donation of the DGX-1 used for this research.  ... 
arXiv:1801.02254v1 fatcat:osy2wb6cojh3ze5yqybg3nf5zy

Page 149 of Neural Computation Vol. 7, Issue 1 [page]

1995 Neural Computation  
While it is possible that the error functions studied contain local minima, a mathematical study of these local minima is beyond the scope of this paper.  ...  Empirical Risk Minimization 149 Finally, in the Hebb learning algorithm (recently studied in a very similar context by Barkai et al. 1993) one has w = (1/m) Σ_μ y^μ x^μ  ... 
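The Hebb rule referred to in the snippet sets the weight vector to the average input-output correlation in a single pass, with no iterative empirical-risk minimization; a minimal sketch (the teacher-perceptron data below is a hypothetical setup for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# m labeled examples: inputs x^mu in {-1, +1}^n, labels y^mu in {-1, +1}
# generated by a hypothetical teacher perceptron (n odd avoids zero sums).
m, n = 200, 11
X = rng.choice([-1.0, 1.0], size=(m, n))
teacher = rng.choice([-1.0, 1.0], size=n)
y = np.sign(X @ teacher)

# Hebb rule: w = (1/m) * sum_mu y^mu x^mu -- a single correlation pass.
w = (y[:, None] * X).mean(axis=0)

train_accuracy = np.mean(np.sign(X @ w) == y)
```

Because the weights are a closed-form average rather than the output of a descent procedure, the Hebb rule sidesteps the local-minima questions raised for gradient-trained error functions.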

The rare event risk in African emerging stock markets

Konstantinos Tolikas, Suzanne G.M. Fifield
2011 Managerial Finance  
Findings: The empirical results indicate that the GL distribution best fitted the empirical data over the period of study.  ...  Purpose: To investigate the asymptotic distribution of the extreme daily stock returns in African stock markets over the period 1996 to 2007 and examine the implications for downside risk measurement.  ...  The determination of these capital requirements is based on inputs provided by models (i.e. Value-at-Risk) which are based on distributional assumptions.  ... 
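Historical Value-at-Risk, the model input mentioned above, is in its simplest form an empirical lower quantile of the return distribution; a minimal sketch using simulated (not African-market) data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated daily log returns standing in for a historical sample.
returns = rng.normal(loc=0.0005, scale=0.01, size=2500)

def historical_var(returns, alpha=0.99):
    # alpha-level VaR: the loss threshold exceeded on the worst
    # (1 - alpha) fraction of days, reported as a positive number.
    return -np.quantile(returns, 1 - alpha)

var_99 = historical_var(returns)
```

Extreme-value approaches like the GL fit in the paper target exactly the tail this empirical quantile estimates, where scarce observations make the plain historical estimate noisy.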
doi:10.1108/03074351111113324 fatcat:nxscip6wcvhsnoszfcmuk36xva

Are deep ResNets provably better than linear predictors? [article]

Chulhee Yun, Suvrit Sra, Ali Jadbabaie
2019 arXiv   pre-print
First, we show that there exist datasets for which all local minima of a fully-connected ReLU network are no better than the best linear predictor, whereas a ResNet has strictly better local minima.  ...  Recent results in the literature indicate that a residual network (ResNet) composed of a single residual block outperforms linear predictors, in the sense that all local minima in its optimization landscape  ...  Acknowledgments All the authors acknowledge support from DARPA Lagrange. Chulhee Yun also thanks Korea Foundation for Advanced Studies for their support.  ... 
arXiv:1907.03922v2 fatcat:4yf3rjp5oveujme23y5iotyy2q

Fast Rates for Empirical Risk Minimization of Strict Saddle Problems [article]

Alon Gonen, Shai Shalev-Shwartz
2017 arXiv   pre-print
We derive bounds on the sample complexity of empirical risk minimization (ERM) in the context of minimizing non-convex risks that admit the strict saddle property.  ...  Our results and techniques may pave the way for statistical analyses of additional strict saddle problems.  ...  More generally, these works consider the task of approximating the k leading eigenvectors. It is not hard to extend our results to this task as well.  ... 
arXiv:1701.04271v4 fatcat:4ebqm7probetfoi5rb4uemmzbi

Optimization Landscapes of Wide Deep Neural Networks Are Benign [article]

Johannes Lederer
2021 arXiv   pre-print
We highlight the importance of constraints for such networks and show that constrained -- as well as unconstrained -- empirical-risk minimization over such networks has no confined points, that is, suboptimal  ...  We analyze the optimization landscapes of deep learning with wide networks.  ...  We prove that the optimization landscapes of empirical-risk minimizers over wide feedforward networks have no spurious local minima.  ... 
arXiv:2010.00885v2 fatcat:pp2cmwcr5bgfbgazc3mw7x7v7y

Small nonlinearities in activation functions create bad local minima in neural networks [article]

Chulhee Yun, Suvrit Sra, Ali Jadbabaie
2019 arXiv   pre-print
We investigate the loss surface of neural networks. We prove that even for one-hidden-layer networks with "slightest" nonlinearity, the empirical risks have spurious local minima in most cases.  ...  .), for which there exists a bad local minimum. Our results make the least restrictive assumptions relative to existing results on spurious local optima in neural networks.  ...  But success stories of deep learning suggest that local minima of the empirical risk could be close to global minima. Choromanska et al.  ... 
arXiv:1802.03487v4 fatcat:fuxyiuxmejem7o3tjd7xbejfb4

LOCAL ESTIMATION OF DYNAMIC COPULA MODELS

BEATRIZ V. M. MENDES, EDUARDO F. L. DE MELO
2010 International Journal of Theoretical and Applied Finance  
Results indicate that volatility does affect the strength of dependence. The in-sample Value-at-Risk based on the dynamic model outperforms those based on the empirical estimates.  ...  In this paper we exploit this stylized fact combined with local maximum likelihood estimation of copula models to analyze the dynamic joint behavior of series of financial log returns.  ...  Acknowledgments The authors wish to thank the Editor and an anonymous referee whose comments and suggestions helped to greatly improve the quality of the paper.  ... 
doi:10.1142/s0219024910005759 fatcat:3rdqr2on2ffvnpbiee5ppcul24
Showing results 1 – 15 of 22,449