6,104 Hits in 3.2 sec

First-Order Optimization (Training) Algorithms in Deep Learning

Oleg Rudenko, Oleksandr Bezsonov, Kyrylo Oliinyk
2020 International Conference on Computational Linguistics and Intelligent Systems  
Studies show that for this task a simple gradient descent algorithm is quite effective.  ...  In the given paper, a comparative analysis of convolutional neural network training algorithms used in image recognition tasks is provided.  ...  This publication reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained therein.  ...
dblp:conf/colins/RudenkoBO20 fatcat:urkwrrkkq5fqvjcrxkmto4move
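
The result above notes that plain gradient descent is already effective for the training task it studies. As a point of reference, here is a minimal numpy sketch of full-batch gradient descent; the `grad_fn` callback, learning rate, and step count are illustrative placeholders, not anything taken from the paper.

```python
import numpy as np

def gradient_descent(grad_fn, w0, lr=0.1, n_steps=100):
    """Plain (full-batch) gradient descent: w <- w - lr * grad L(w)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - lr * grad_fn(w)
    return w

# Example: minimise the quadratic loss L(w) = ||w - 3||^2.
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=np.zeros(2))
print(w_star)  # close to [3., 3.]
```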

Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty [article]

Moritz Knolle, Alexander Ziller, Dmitrii Usynin, Rickmer Braren, Marcus R. Makowski, Daniel Rueckert, Georgios Kaissis
2021 arXiv   pre-print
We show that differentially private stochastic gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.  ...  We highlight and exploit parallels between stochastic gradient Langevin dynamics, a scalable Bayesian inference technique for training deep neural networks, and DP-SGD, in order to train differentially  ...  Results on MNIST: Results for a five-layer convolutional neural network (CNN) trained on MNIST with the compared optimization procedures.  ...
arXiv:2107.04296v2 fatcat:mlfw7oo6f5htzf7zg54yj7bks4
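
For readers unfamiliar with the DP-SGD baseline this paper analyses, a minimal sketch of one DP-SGD step (per-example gradient clipping followed by Gaussian noise) is given below. The clip norm, noise multiplier, and the assumption that per-example gradients arrive as rows of a matrix are illustrative choices, not the paper's configuration.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.1, rng=np.random.default_rng(0)):
    """One DP-SGD update: clip each per-example gradient to L2 norm C,
    sum, add Gaussian noise with std sigma*C, average, then step."""
    batch_size = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=w.shape)
    return w - lr * noisy_sum / batch_size
```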

Gradient Regularization as Approximate Variational Inference

Ali Unlu, Laurence Aitchison
2021 Entropy  
We developed Variational Laplace for Bayesian neural networks (BNNs), which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling  ...  of the neural-network weights.  ...  First, their approach connects full-batch gradient descent to squared-gradient regularizers. Of course, most neural network training is stochastic gradient descent based on mini-batches.  ...
doi:10.3390/e23121629 pmid:34945935 pmcid:PMC8700595 fatcat:rzd4l4arhbggzdeq3x6yras35m
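
The snippet mentions a connection between full-batch gradient descent and squared-gradient regularizers. Purely as an illustration of that regularizer (not of the paper's variational Laplace method), the sketch below computes the gradient of L(w) + lam * ||grad L(w)||^2 using a finite-difference Hessian-vector product; `grad_fn`, `lam`, and `eps` are hypothetical.

```python
import numpy as np

def sq_grad_reg_grad(grad_fn, w, lam=0.1, eps=1e-4):
    """Gradient of L(w) + lam*||grad L(w)||^2.  Uses the identity
    grad(||g||^2) = 2 H g, with H g approximated by a finite difference
    of the gradient along the direction g."""
    g = grad_fn(w)
    hvp = (grad_fn(w + eps * g) - g) / eps  # ~ H @ g
    return g + 2.0 * lam * hvp
```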

Calibration and Uncertainty Quantification of Bayesian Convolutional Neural Networks for Geophysical Applications [article]

Lukas Mosser, Ehsan Zabihi Naeini
2021 arXiv   pre-print
networks, and finally, we apply SWAG, a recent method that is based on the Bayesian inference equivalence of mini-batch Stochastic Gradient Descent.  ...  We compare three different approaches to obtaining probabilistic models based on convolutional neural networks in a Bayesian formalism, namely Deep Ensembles, Concrete Dropout, and Stochastic Weight Averaging-Gaussian  ...  We thank the Netherlands Organization for Applied Scientific Research for releasing the F3 seismic dataset.  ... 
arXiv:2105.12115v1 fatcat:5ah5twtcyjc7fbm2xil7q4ln3q
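
SWAG, mentioned in the snippet above, builds a Gaussian posterior over weights from SGD iterates. A minimal sketch of the simplified diagonal variant (mean and diagonal covariance of the collected snapshots) follows; the full method also keeps a low-rank deviation matrix, and the function names here are illustrative.

```python
import numpy as np

def swag_diagonal(iterates):
    """Fit a diagonal SWAG posterior from a list of flattened SGD weight
    snapshots: running mean and (second moment - mean^2) variance."""
    W = np.stack(iterates)                      # (n_snapshots, n_params)
    mean = W.mean(axis=0)
    var = np.maximum((W ** 2).mean(axis=0) - mean ** 2, 1e-12)
    return mean, var

def swag_sample(mean, var, rng=np.random.default_rng(0)):
    """Draw one set of weights from the fitted Gaussian posterior."""
    return mean + np.sqrt(var) * rng.normal(size=mean.shape)
```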

Object tracking in siamese network with attention mechanism and Mish function

2021 Academic Journal of Computing & Information Science  
Finally, gradient centralization is embedded in the stochastic gradient function so as to improve the generalization performance of the network and make training more efficient and stable.  ...  In order to improve the recognition and tracking ability of fully-convolutional siamese networks for object tracking in complex scenes, this paper proposes an improved object tracking algorithm with  ...  Gradient Centralization: Optimization technology is essential for efficient training of deep neural networks. Gradient descent has always been a crucial part of training deep neural networks.  ...
doi:10.25236/ajcis.2021.040112 fatcat:bxukpjbncfb43kn22v7brgw4bq
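
Gradient centralization, as used in this result, simply removes the mean from each weight gradient before the optimizer step. A minimal sketch under the usual convention (mean taken over all axes except the output-channel axis) is shown below; the learning rate and function names are illustrative.

```python
import numpy as np

def centralize_gradient(grad):
    """Gradient centralization: subtract, from each output channel's
    gradient, its mean over the remaining axes, so it has zero mean."""
    if grad.ndim > 1:
        axes = tuple(range(1, grad.ndim))
        grad = grad - grad.mean(axis=axes, keepdims=True)
    return grad

def sgd_gc_step(w, grad, lr=0.01):
    """Plain SGD step applied to the centralized gradient."""
    return w - lr * centralize_gradient(grad)
```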

A Continuous Deep Learning System Study of Tennis Player Health Information and Professional Input

Lina Gong, Rahim Khan
2022 Computational Intelligence and Neuroscience  
The experimental results show that the application of the convolutional neural network method in the system improves the response speed to the physical fitness state of tennis players by 5%.  ...  This provides technical support for timely understanding of tennis players' physical health information and helps prevent on-court mistakes caused by physical condition.  ...  For simplicity, this article briefly introduces the mini-batch stochastic gradient descent algorithm.  ...
doi:10.1155/2022/8599894 pmid:35942453 pmcid:PMC9356835 fatcat:o3dwsz2xsjhtpiz4oeanrjsmce

Convolutional Neural Networks for Pose Recognition in Binary Omni-directional Images [chapter]

S. V. Georgakopoulos, K. Kottari, K. Delibasis, V. P. Plagianakos, I. Maglogiannis
2016 IFIP Advances in Information and Communication Technology  
To train this network, Stochastic Gradient Descent is usually employed with mini-batches [16].  ...  The CNN learning algorithm was implemented using Stochastic Gradient Descent (SGD) with a learning rate of 0.01 and momentum of 0.9, for 30000 iterations on 50% of the whole dataset and a mini-batch of  ...
doi:10.1007/978-3-319-44944-9_10 fatcat:bwilgihpkjgx5dtvrj6gq7mwkm
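
The snippet quotes a concrete setup (SGD, learning rate 0.01, momentum 0.9). A minimal sketch of that classic SGD-with-momentum update is given below; only the two quoted hyperparameters come from the snippet, everything else is a generic placeholder.

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update: v <- mu*v - lr*g; w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```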

Langevin algorithms for Markovian Neural Networks and Deep Stochastic control [article]

Pierre Bras, Gilles Pagès
2023 arXiv   pre-print
Stochastic Gradient Descent Langevin Dynamics (SGLD) algorithms, which add noise to the classic gradient descent, are known to improve the training of neural networks in some cases where the neural network  ...  In this paper we study the possibilities of training acceleration for the numerical resolution of stochastic control problems through gradient descent, where the control is parametrized by a neural network  ...  Acknowledgements The authors thank Idris Kharroubi for helpful discussions.  ... 
arXiv:2212.12018v2 fatcat:e7ycv675evg2tlmmii6qui6ngu
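
As the snippet says, Langevin-type algorithms add noise to the classic gradient step. A minimal sketch of one Stochastic Gradient Langevin Dynamics update is given below, assuming `grad` is the stochastic gradient of the negative log-posterior; the step size is illustrative.

```python
import numpy as np

def sgld_step(w, grad, lr=1e-3, rng=np.random.default_rng(0)):
    """One SGLD step: a gradient step plus Gaussian noise with variance
    2*lr, so the iterates approximately sample the posterior instead of
    collapsing to a point estimate."""
    noise = rng.normal(0.0, np.sqrt(2.0 * lr), size=w.shape)
    return w - lr * grad + noise
```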

A New Approach to Automatically Calibrate and Detect Building Cracks

Zongchao Liu, Xiaoda Li, Junhui Li, Shuai Teng
2022 Buildings  
The results illustrate that: (1) the image registration technology achieves excellent calibration, with an average error of only 4%; (2) with resnet50 selected as the backbone network  ...  Firstly, the moving images are calibrated by image registration, and the similarity method is adopted to evaluate the calibrated results.  ...  In order to explore the most suitable gradient descent algorithm for crack detection, the sgdm, rmsprop, and adam  ...  Analyses of Parameters: The neural network adopts gradient descent algorithms to obtain  ...
doi:10.3390/buildings12081081 fatcat:hdc6t45245g6thchk3wlprhfje
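
This result compares sgdm, rmsprop, and adam. For reference, a minimal sketch of one Adam update (the third of those optimizers) is shown below; the hyperparameters are the common textbook defaults, not values reported by the paper.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its
    square, bias-corrected with the 1-based step count t, then an
    element-wise adaptive step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```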

Table of contents

2021 IEEE Transactions on Neural Networks and Learning Systems  
Wang  ...  EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression, X. Ruan, Y. Liu, C.  ...  Gao  ...  Aware Convolutional Recurrent Neural Network for Irregular Medical Time Series, Q. Tan, M. Ye, A. J. Ma, B. Yang, T. C.-F. Yip, G. L.-H.  ...
doi:10.1109/tnnls.2021.3112415 fatcat:76cvoarxv5gfxca5ziikpsss5m

Detecting Problem Statements in Peer Assessments [article]

Yunkai Xiao, Gabriel Zingle, Qinjin Jia, Harsh R. Shah, Yi Zhang, Tianyi Li, Mohsin Karovaliya, Weixiang Zhao, Yang Song, Jie Ji, Ashwin Balasubramaniam, Harshit Patel (+3 others)
2020 arXiv   pre-print
This is followed by the Stochastic Gradient Descent model and the Logistic Regression model, with 89.70% and 88.98%, respectively.  ...  The best non-neural network model was the support vector machine, with a score of 89.71%.  ...  Support Vector Machine with Stochastic Gradient Descent.  ...
arXiv:2006.04532v1 fatcat:7s5jxsawt5aubfczwkcvpud2va
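
One of the models cited above is a support vector machine trained with stochastic gradient descent. A minimal sketch of a linear SVM fitted by SGD on the regularized hinge loss is given below; labels are assumed to be in {-1, +1}, and the learning rate, regularization strength, and epoch count are illustrative.

```python
import numpy as np

def linear_svm_sgd(X, y, lr=0.01, lam=1e-4, epochs=5,
                   rng=np.random.default_rng(0)):
    """SGD on the regularized hinge loss lam/2*||w||^2 + max(0, 1 - y*w.x),
    visiting examples in a random order each epoch."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * X[i].dot(w)
            grad = lam * w - (y[i] * X[i] if margin < 1 else 0.0)
            w -= lr * grad
    return w
```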

Haze Grading Using the Convolutional Neural Networks

Lirong Yin, Lei Wang, Weizheng Huang, Jiawei Tian, Shan Liu, Bo Yang, Wenfeng Zheng
2022 Atmosphere  
Subsequently, to simplify the parameter complexity of the traditional inversion method, we propose using a convolutional neural network in its place and constructing a  ...  Compared with traditional aerosol depth inversion, we found that convolutional neural networks can provide a higher correlation between PM2.5 concentration and satellite imagery through a more simplified  ...  The core of the stochastic gradient descent method is to approximate the full-data gradient from a small random sample.  ...
doi:10.3390/atmos13040522 fatcat:zprwmfh7ofa6zkrexrsaeadnry
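
The last snippet sentence describes the core idea of mini-batch SGD: estimating the full-data gradient from a small random subset. A minimal sketch of that loop follows; `grad_fn` (returning the mean gradient on a batch), the batch size, and the learning rate are hypothetical placeholders.

```python
import numpy as np

def minibatch_sgd(grad_fn, X, y, w0, lr=0.01, batch_size=32, epochs=10,
                  rng=np.random.default_rng(0)):
    """Mini-batch SGD: each step uses the gradient estimated on a small
    random subset of the data instead of the full dataset."""
    w = np.asarray(w0, dtype=float)
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            w = w - lr * grad_fn(w, X[batch], y[batch])
    return w
```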

Deep Online Convex Optimization by Putting Forecaster to Sleep [article]

David Balduzzi
2016 arXiv   pre-print
Methods from convex optimization such as accelerated gradient descent are widely used as building blocks for deep learning algorithms.  ...  However, the reasons for their empirical success are unclear, since neural networks are not convex and standard guarantees do not apply.  ...  I am grateful to Jacob Abernethy, Samory Kpotufe and Brian McWilliams for useful conversations.  ... 
arXiv:1509.01851v2 fatcat:c7vzvwe66reu5pvzacvfaortnu
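
The snippet refers to accelerated gradient descent as a building block for deep learning algorithms. A minimal sketch of one Nesterov accelerated gradient step (evaluate the gradient at a look-ahead point, then update) is shown below; the hyperparameters and `grad_fn` are illustrative.

```python
import numpy as np

def nesterov_step(w, velocity, grad_fn, lr=0.01, momentum=0.9):
    """One Nesterov accelerated gradient step: gradient at the look-ahead
    point w + mu*v, then v <- mu*v - lr*g and w <- w + v."""
    lookahead_grad = grad_fn(w + momentum * velocity)
    velocity = momentum * velocity - lr * lookahead_grad
    return w + velocity, velocity
```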

Variational Laplace for Bayesian neural networks [article]

Ali Unlu, Laurence Aitchison
2021 arXiv   pre-print
We develop variational Laplace for Bayesian neural networks (BNNs), which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling  ...  of the neural-network weights.  ...  Several approaches to Bayesian inference in neural networks are available, including stochastic gradient Langevin dynamics (Welling & Teh, 2011), Laplace's method (Azevedo-Filho & Shachter, 1994; MacKay  ...
arXiv:2011.10443v2 fatcat:yhrbwcdrujf3niyu5nde3fuwte

Can we achieve robustness from data alone? [article]

Nikolaos Tsilivis, Jingtong Su, Julia Kempe
2023 arXiv   pre-print
In parallel, we revisit prior work that also focused on the problem of data optimization for robust classification, and show that being robust to adversarial attacks after standard (gradient descent)  ...  Once the dataset has been created, in principle no specialized algorithm (besides standard gradient descent) is needed to train a robust model.  ...  The most common way to approximate the solution of this optimization problem for a neural network f is to first generate adversarial examples by running multiple steps of projected gradient descent (PGD  ...
arXiv:2207.11727v2 fatcat:wcbpv2bmzjhbjnzvlzuhraszbi
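
The snippet describes generating adversarial examples with projected gradient descent. A minimal sketch of an L-infinity PGD attack is given below; `grad_fn` (the gradient of the loss with respect to the input), the epsilon budget, step size, and step count are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def pgd_attack(x, grad_fn, epsilon=0.03, step_size=0.007, n_steps=10,
               rng=np.random.default_rng(0)):
    """L-infinity PGD: repeatedly step in the sign of the input-gradient of
    the loss, projecting back into the epsilon-ball around the clean input
    and into the valid pixel range [0, 1]."""
    x_adv = x + rng.uniform(-epsilon, epsilon, size=x.shape)  # random start
    for _ in range(n_steps):
        x_adv = x_adv + step_size * np.sign(grad_fn(x_adv))   # ascend loss
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)      # project
        x_adv = np.clip(x_adv, 0.0, 1.0)                      # valid range
    return x_adv
```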
Showing results 1 — 15 out of 6,104 results