6,104 Hits in 3.2 sec

First-Order Optimization (Training) Algorithms in Deep Learning

Oleg Rudenko, Oleksandr Bezsonov, Kyrylo Oliinyk
2020 International Conference on Computational Linguistics and Intelligent Systems  
Studies show that for this task a simple gradient descent algorithm is quite effective.  ...  In the given paper, a comparative analysis of convolutional neural network training algorithms used in image recognition tasks is provided.  ...  This publication reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained therein.  ...
dblp:conf/colins/RudenkoBO20 fatcat:urkwrrkkq5fqvjcrxkmto4move
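
The result above notes that plain gradient descent is already effective for the training task it studies. As a point of reference, here is a minimal numpy sketch of full-batch gradient descent; the `grad_fn` callback, learning rate, and step count are illustrative placeholders, not anything taken from the paper.

```python
import numpy as np

def gradient_descent(grad_fn, w0, lr=0.1, n_steps=100):
    """Plain (full-batch) gradient descent: w <- w - lr * grad L(w)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - lr * grad_fn(w)
    return w

# Example: minimise the quadratic loss L(w) = ||w - 3||^2.
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=np.zeros(2))
print(w_star)  # close to [3., 3.]
```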

Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty [article]

Moritz Knolle, Alexander Ziller, Dmitrii Usynin, Rickmer Braren, Marcus R. Makowski, Daniel Rueckert, Georgios Kaissis
2021 arXiv   pre-print
We show that differentially private stochastic gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.  ...  We highlight and exploit parallels between stochastic gradient Langevin dynamics, a scalable Bayesian inference technique for training deep neural networks, and DP-SGD, in order to train differentially  ...  Results on MNIST: Results for a five-layer convolutional neural network (CNN) trained on MNIST with the compared optimization procedures.  ...
arXiv:2107.04296v2 fatcat:mlfw7oo6f5htzf7zg54yj7bks4
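
For readers unfamiliar with the DP-SGD baseline this paper analyses, a minimal sketch of one DP-SGD step (per-example gradient clipping followed by Gaussian noise) is given below. The clip norm, noise multiplier, and the assumption that per-example gradients arrive as rows of a matrix are illustrative choices, not the paper's configuration.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.1, rng=np.random.default_rng(0)):
    """One DP-SGD update: clip each per-example gradient to L2 norm C,
    sum, add Gaussian noise with std sigma*C, average, then step."""
    batch_size = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=w.shape)
    return w - lr * noisy_sum / batch_size
```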

Gradient Regularization as Approximate Variational Inference

Ali Unlu, Laurence Aitchison
2021 Entropy  
We developed Variational Laplace for Bayesian neural networks (BNNs), which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling  ...  of the neural-network weights.  ...  First, their approach connects full-batch gradient descent to squared-gradient regularizers. Of course, most neural network training is stochastic gradient descent based on mini-batches.  ...
doi:10.3390/e23121629 pmid:34945935 pmcid:PMC8700595 fatcat:rzd4l4arhbggzdeq3x6yras35m
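
The snippet mentions a connection between full-batch gradient descent and squared-gradient regularizers. Purely as an illustration of that regularizer (not of the paper's variational Laplace method), the sketch below computes the gradient of L(w) + lam * ||grad L(w)||^2 using a finite-difference Hessian-vector product; `grad_fn`, `lam`, and `eps` are hypothetical.

```python
import numpy as np

def sq_grad_reg_grad(grad_fn, w, lam=0.1, eps=1e-4):
    """Gradient of L(w) + lam*||grad L(w)||^2.  Uses the identity
    grad(||g||^2) = 2 H g, with H g approximated by a finite difference
    of the gradient along the direction g."""
    g = grad_fn(w)
    hvp = (grad_fn(w + eps * g) - g) / eps  # ~ H @ g
    return g + 2.0 * lam * hvp
```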

Calibration and Uncertainty Quantification of Bayesian Convolutional Neural Networks for Geophysical Applications [article]

Lukas Mosser, Ehsan Zabihi Naeini
2021 arXiv   pre-print
networks, and finally, we apply SWAG, a recent method that is based on the Bayesian inference equivalence of mini-batch Stochastic Gradient Descent.  ...  We compare three different approaches to obtaining probabilistic models based on convolutional neural networks in a Bayesian formalism, namely Deep Ensembles, Concrete Dropout, and Stochastic Weight Averaging-Gaussian  ...  We thank the Netherlands Organization for Applied Scientific Research for releasing the F3 seismic dataset.  ... 
arXiv:2105.12115v1 fatcat:5ah5twtcyjc7fbm2xil7q4ln3q
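
SWAG, mentioned in the snippet above, builds a Gaussian posterior over weights from SGD iterates. A minimal sketch of the simplified diagonal variant (mean and diagonal covariance of the collected snapshots) follows; the full method also keeps a low-rank deviation matrix, and the function names here are illustrative.

```python
import numpy as np

def swag_diagonal(iterates):
    """Fit a diagonal SWAG posterior from a list of flattened SGD weight
    snapshots: running mean and (second moment - mean^2) variance."""
    W = np.stack(iterates)                      # (n_snapshots, n_params)
    mean = W.mean(axis=0)
    var = np.maximum((W ** 2).mean(axis=0) - mean ** 2, 1e-12)
    return mean, var

def swag_sample(mean, var, rng=np.random.default_rng(0)):
    """Draw one set of weights from the fitted Gaussian posterior."""
    return mean + np.sqrt(var) * rng.normal(size=mean.shape)
```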

Object tracking in siamese network with attention mechanism and Mish function

2021 Academic Journal of Computing & Information Science  
Finally, gradient centralization is embedded in the stochastic gradient function so as to improve the generalization performance of the network and make training more efficient and stable.  ...  In order to improve the recognition and tracking ability of fully-convolutional siamese networks for object tracking in complex scenes, this paper proposes an improved object tracking algorithm with  ...  Gradient Centralization: Optimization technology is essential for efficient training of deep neural networks. Gradient descent has always been a crucial part of training deep neural networks.  ...
doi:10.25236/ajcis.2021.040112 fatcat:bxukpjbncfb43kn22v7brgw4bq
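
Gradient centralization, as used in this result, simply removes the mean from each weight gradient before the optimizer step. A minimal sketch under the usual convention (mean taken over all axes except the output-channel axis) is shown below; the learning rate and function names are illustrative.

```python
import numpy as np

def centralize_gradient(grad):
    """Gradient centralization: subtract, from each output channel's
    gradient, its mean over the remaining axes, so it has zero mean."""
    if grad.ndim > 1:
        axes = tuple(range(1, grad.ndim))
        grad = grad - grad.mean(axis=axes, keepdims=True)
    return grad

def sgd_gc_step(w, grad, lr=0.01):
    """Plain SGD step applied to the centralized gradient."""
    return w - lr * centralize_gradient(grad)
```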

A Continuous Deep Learning System Study of Tennis Player Health Information and Professional Input

Lina Gong, Rahim Khan
2022 Computational Intelligence and Neuroscience  
The experimental results show that the application of the convolutional neural network method in the system improves the response speed to the physical fitness state of tennis players by 5%.  ...  This provides technical support for timely understanding of tennis players' physical health information and helps prevent on-court mistakes caused by physical condition.  ...  For simplicity, this article briefly introduces the mini-batch stochastic gradient descent algorithm.  ...
doi:10.1155/2022/8599894 pmid:35942453 pmcid:PMC9356835 fatcat:o3dwsz2xsjhtpiz4oeanrjsmce

Convolutional Neural Networks for Pose Recognition in Binary Omni-directional Images [chapter]

S. V. Georgakopoulos, K. Kottari, K. Delibasis, V. P. Plagianakos, I. Maglogiannis
2016 IFIP Advances in Information and Communication Technology  
To train this network, Stochastic Gradient Descent is usually employed with mini-batches [16].  ...  The CNN learning algorithm was implemented using Stochastic Gradient Descent (SGD) with a learning rate of 0.01 and momentum of 0.9, for 30000 iterations on 50% of the whole dataset and a mini-batch of  ...
doi:10.1007/978-3-319-44944-9_10 fatcat:bwilgihpkjgx5dtvrj6gq7mwkm
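
The snippet quotes a concrete setup (SGD, learning rate 0.01, momentum 0.9). A minimal sketch of that classic SGD-with-momentum update is given below; only the two quoted hyperparameters come from the snippet, everything else is a generic placeholder.

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update: v <- mu*v - lr*g; w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```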

Langevin algorithms for Markovian Neural Networks and Deep Stochastic control [article]

Pierre Bras, Gilles Pagès
2023 arXiv   pre-print
Stochastic Gradient Descent Langevin Dynamics (SGLD) algorithms, which add noise to the classic gradient descent, are known to improve the training of neural networks in some cases where the neural network  ...  In this paper we study the possibilities of training acceleration for the numerical resolution of stochastic control problems through gradient descent, where the control is parametrized by a neural network  ...  Acknowledgements The authors thank Idris Kharroubi for helpful discussions.  ... 
arXiv:2212.12018v2 fatcat:e7ycv675evg2tlmmii6qui6ngu
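
As the snippet says, Langevin-type algorithms add noise to the classic gradient step. A minimal sketch of one Stochastic Gradient Langevin Dynamics update is given below, assuming `grad` is the stochastic gradient of the negative log-posterior; the step size is illustrative.

```python
import numpy as np

def sgld_step(w, grad, lr=1e-3, rng=np.random.default_rng(0)):
    """One SGLD step: a gradient step plus Gaussian noise with variance
    2*lr, so the iterates approximately sample the posterior instead of
    collapsing to a point estimate."""
    noise = rng.normal(0.0, np.sqrt(2.0 * lr), size=w.shape)
    return w - lr * grad + noise
```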

A New Approach to Automatically Calibrate and Detect Building Cracks

Zongchao Liu, Xiaoda Li, Junhui Li, Shuai Teng
2022 Buildings  
The results illustrate that: (1) the image registration technology achieves excellent calibration, with an average error of only 4%; (2) with resnet50 selected as the backbone network  ...  Firstly, the moving images are calibrated by image registration, and the similarity method is adopted to evaluate the calibrated results.  ...  In order to explore the most suitable gradient descent algorithm for crack detection, the sgdm, rmsprop, and adam  ...  Analyses of Parameters: The neural network adopts gradient descent algorithms to obtain  ...
doi:10.3390/buildings12081081 fatcat:hdc6t45245g6thchk3wlprhfje
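
This result compares sgdm, rmsprop, and adam. For reference, a minimal sketch of one Adam update (the third of those optimizers) is shown below; the hyperparameters are the common textbook defaults, not values reported by the paper.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its
    square, bias-corrected with the 1-based step count t, then an
    element-wise adaptive step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```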

Table of contents

2021 IEEE Transactions on Neural Networks and Learning Systems  
Wang  ...  EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression, X. Ruan, Y. Liu, C.  ...  Gao  ...  Aware Convolutional Recurrent Neural Network for Irregular Medical Time Series, Q. Tan, M. Ye, A. J. Ma, B. Yang, T. C.-F. Yip, G. L.-H.  ...
doi:10.1109/tnnls.2021.3112415 fatcat:76cvoarxv5gfxca5ziikpsss5m

Detecting Problem Statements in Peer Assessments [article]

Yunkai Xiao, Gabriel Zingle, Qinjin Jia, Harsh R. Shah, Yi Zhang, Tianyi Li, Mohsin Karovaliya, Weixiang Zhao, Yang Song, Jie Ji, Ashwin Balasubramaniam, Harshit Patel (+3 others)
2020 arXiv   pre-print
This is followed by the Stochastic Gradient Descent model and the Logistic Regression model, with 89.70% and 88.98%, respectively.  ...  The best non-neural network model was the support vector machine, with a score of 89.71%.  ...  Support Vector Machine with Stochastic Gradient Descent.  ...
arXiv:2006.04532v1 fatcat:7s5jxsawt5aubfczwkcvpud2va
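
One of the models cited above is a support vector machine trained with stochastic gradient descent. A minimal sketch of a linear SVM fitted by SGD on the regularized hinge loss is given below; labels are assumed to be in {-1, +1}, and the learning rate, regularization strength, and epoch count are illustrative.

```python
import numpy as np

def linear_svm_sgd(X, y, lr=0.01, lam=1e-4, epochs=5,
                   rng=np.random.default_rng(0)):
    """SGD on the regularized hinge loss lam/2*||w||^2 + max(0, 1 - y*w.x),
    visiting examples in a random order each epoch."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * X[i].dot(w)
            grad = lam * w - (y[i] * X[i] if margin < 1 else 0.0)
            w -= lr * grad
    return w
```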

Haze Grading Using the Convolutional Neural Networks

Lirong Yin, Lei Wang, Weizheng Huang, Jiawei Tian, Shan Liu, Bo Yang, Wenfeng Zheng
2022 Atmosphere  
Subsequently, to simplify the parameter complexity of the traditional inversion method, we propose using a convolutional neural network in its place and constructing a  ...  Compared with traditional aerosol depth inversion, we found that convolutional neural networks can provide a higher correlation between PM2.5 concentration and satellite imagery through a more simplified  ...  The core of the stochastic gradient descent method is to approximate the full-data gradient from a small random sample.  ...
doi:10.3390/atmos13040522 fatcat:zprwmfh7ofa6zkrexrsaeadnry
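
The last snippet sentence describes the core idea of mini-batch SGD: estimating the full-data gradient from a small random subset. A minimal sketch of that loop follows; `grad_fn` (returning the mean gradient on a batch), the batch size, and the learning rate are hypothetical placeholders.

```python
import numpy as np

def minibatch_sgd(grad_fn, X, y, w0, lr=0.01, batch_size=32, epochs=10,
                  rng=np.random.default_rng(0)):
    """Mini-batch SGD: each step uses the gradient estimated on a small
    random subset of the data instead of the full dataset."""
    w = np.asarray(w0, dtype=float)
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            w = w - lr * grad_fn(w, X[batch], y[batch])
    return w
```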

Deep Online Convex Optimization by Putting Forecaster to Sleep [article]

David Balduzzi
2016 arXiv   pre-print
Methods from convex optimization such as accelerated gradient descent are widely used as building blocks for deep learning algorithms.  ...  However, the reasons for their empirical success are unclear, since neural networks are not convex and standard guarantees do not apply.  ...  I am grateful to Jacob Abernethy, Samory Kpotufe and Brian McWilliams for useful conversations.  ... 
arXiv:1509.01851v2 fatcat:c7vzvwe66reu5pvzacvfaortnu
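
The snippet refers to accelerated gradient descent as a building block for deep learning algorithms. A minimal sketch of one Nesterov accelerated gradient step (evaluate the gradient at a look-ahead point, then update) is shown below; the hyperparameters and `grad_fn` are illustrative.

```python
import numpy as np

def nesterov_step(w, velocity, grad_fn, lr=0.01, momentum=0.9):
    """One Nesterov accelerated gradient step: gradient at the look-ahead
    point w + mu*v, then v <- mu*v - lr*g and w <- w + v."""
    lookahead_grad = grad_fn(w + momentum * velocity)
    velocity = momentum * velocity - lr * lookahead_grad
    return w + velocity, velocity
```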

Variational Laplace for Bayesian neural networks [article]

Ali Unlu, Laurence Aitchison
2021 arXiv   pre-print
We develop variational Laplace for Bayesian neural networks (BNNs), which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling  ...  of the neural-network weights.  ...  Several approaches to Bayesian inference in neural networks are available, including stochastic gradient Langevin dynamics (Welling & Teh, 2011), Laplace's method (Azevedo-Filho & Shachter, 1994; MacKay  ...
arXiv:2011.10443v2 fatcat:yhrbwcdrujf3niyu5nde3fuwte

Can we achieve robustness from data alone? [article]

Nikolaos Tsilivis, Jingtong Su, Julia Kempe
2023 arXiv   pre-print
In parallel, we revisit prior work that also focused on the problem of data optimization for robust classification, and show that being robust to adversarial attacks after standard (gradient descent)  ...  Once the dataset has been created, in principle no specialized algorithm (besides standard gradient descent) is needed to train a robust model.  ...  The most common way to approximate the solution of this optimization problem for a neural network f is to first generate adversarial examples by running multiple steps of projected gradient descent (PGD  ...
arXiv:2207.11727v2 fatcat:wcbpv2bmzjhbjnzvlzuhraszbi
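
The snippet describes generating adversarial examples with projected gradient descent. A minimal sketch of an L-infinity PGD attack is given below; `grad_fn` (the gradient of the loss with respect to the input), the epsilon budget, step size, and step count are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def pgd_attack(x, grad_fn, epsilon=0.03, step_size=0.007, n_steps=10,
               rng=np.random.default_rng(0)):
    """L-infinity PGD: repeatedly step in the sign of the input-gradient of
    the loss, projecting back into the epsilon-ball around the clean input
    and into the valid pixel range [0, 1]."""
    x_adv = x + rng.uniform(-epsilon, epsilon, size=x.shape)  # random start
    for _ in range(n_steps):
        x_adv = x_adv + step_size * np.sign(grad_fn(x_adv))   # ascend loss
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)      # project
        x_adv = np.clip(x_adv, 0.0, 1.0)                      # valid range
    return x_adv
```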
Showing results 1 — 15 out of 6,104 results