Learning Halfspaces and Neural Networks with Random Initialization
[article]
2015
arXiv
pre-print
We study non-convex empirical risk minimization for learning halfspaces and neural networks. ...
For loss functions that are L-Lipschitz continuous, we present algorithms to learn halfspaces and multi-layer neural networks that achieve arbitrarily small excess risk ϵ>0. ...
MJ and YZ were partially supported by the U.S. ARL and the U.S. ARO under contract/grant number W911NF-11-1-0391. We thank Sivaraman Balakrishnan for helpful comments on an earlier draft. ...
arXiv:1511.07948v1
fatcat:3oiaouh33zc25d4wismghcdpra
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
[article]
2021
arXiv
pre-print
We prove that SGD produces neural networks that have classification accuracy competitive with that of the best halfspace over the distribution for a broad class of distributions that includes log-concave ...
To the best of our knowledge, this is the first work to show that overparameterized neural networks trained by SGD can generalize when the data is corrupted with adversarial label noise. ...
We thank Maria-Florina Balcan for pointing us to a number of works on learning halfspaces in the presence of noise. ...
arXiv:2101.01152v3
fatcat:rhygrb6cmrcslbumz3panrv6ym
From Local Pseudorandom Generators to Hardness of Learning
[article]
2021
arXiv
pre-print
Our results include: hardness of learning shallow ReLU neural networks under the Gaussian distribution and other distributions; hardness of learning intersections of ω(1) halfspaces, DNF formulas with ...
We also establish lower bounds on the complexity of learning intersections of a constant number of halfspaces, and ReLU networks with a constant number of hidden neurons. ...
Acknowledgements: We thank Benny Applebaum and anonymous reviewers for their valuable comments. This research is partially supported by ISF grant 2258/19. ...
arXiv:2101.08303v2
fatcat:mej6qudnvjeata6mhuyugmwjai
Gated Linear Networks
[article]
2020
arXiv
pre-print
What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target, forgoing the ability to learn ...
We show that this architecture gives rise to universal learning capabilities in the limit, with effective model capacity increasing as a function of network size in a manner comparable with deep ReLU networks ...
Unlike contemporary neural networks, we demonstrate that the halfspace-gated GLN architecture and learning rule is naturally robust to catastrophic forgetting without any modifications or knowledge of ...
arXiv:1910.01526v2
fatcat:fbgnq4rfwzbspis4jmv6qeudgq
Gated Linear Networks
2021
Proceedings of the AAAI Conference on Artificial Intelligence
What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target, forgoing the ability to learn ...
We show that this architecture gives rise to universal learning capabilities in the limit, with effective model capacity increasing as a function of network size in a manner comparable with deep ReLU networks ...
Unlike contemporary neural networks, we demonstrate that the halfspace-gated GLN architecture and learning rule is naturally robust to catastrophic forgetting without any modifications or knowledge of ...
doi:10.1609/aaai.v35i11.17202
fatcat:c57f567vajdrnjxfbbzlwol7xm
STRIP - a strip-based neural-network growth algorithm for learning multiple-valued functions
2001
IEEE Transactions on Neural Networks
We construct two neural networks based on these hidden units and show that they correctly compute the given but arbitrary multiple-valued function. ...
Preliminary experimental results are presented and discussed. Index Terms: Constructive algorithm, genetic algorithm, multiple-threshold perceptron, multiple-valued logic, neural network, partitioning. ...
Acknowledgment: The authors would like to thank the referees for their important and interesting suggestions. ...
doi:10.1109/72.914519
pmid:18244379
fatcat:ohwx3vafybeaphciww5ewsm7my
Neural Abstractions
[article]
2023
arXiv
pre-print
Neural networks have extensively been used before as approximators; in this work, we make a step further and use them for the first time as abstractions. ...
By using neural ODEs with ReLU activation functions as abstractions, we cast the safety verification problem for nonlinear dynamical models into that of hybrid automata with affine dynamics, which we verify ...
Alec was supported by the EPSRC Centre for Doctoral Training in Autonomous Intelligent Machines and Systems (EP/S024050/1). ...
arXiv:2301.11683v1
fatcat:nhwahth75fhspowkvyu2lajjqq
In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning
[article]
2015
arXiv
pre-print
We present experiments demonstrating that some other form of capacity control, different from network size, plays a central role in learning multilayer feed-forward networks. ...
We argue, partially through analogy to matrix factorization, that this is an inductive bias that can help shed light on deep learning. ...
Hence, the hypothesis class of neural intersection of k/2 halfspaces is a subset of the hypothesis class of feed-forward neural networks with k hidden units in a single hidden layer. ...
arXiv:1412.6614v4
fatcat:vsmrbijxrfd2zhgtirk5p32gfu
Adversarial Spheres
[article]
2018
arXiv
pre-print
As a result of the theory, the vulnerability of neural networks to small adversarial perturbations is a logical consequence of the amount of test error observed. ...
Despite substantial research interest, the cause of the phenomenon is still poorly understood and remains unsolved. ...
Acknowledgments: Special thanks to Surya Ganguli, Jascha Sohl-Dickstein, Jeffrey Pennington, and Sam Smith for interesting discussions on this problem. ...
arXiv:1801.02774v3
fatcat:tneoo2mzpzbdxag4uzha6hecgm
An Efficient Explorative Sampling Considering the Generative Boundaries of Deep Generative Neural Networks
[article]
2019
arXiv
pre-print
Deep generative neural networks (DGNNs) have achieved realistic and high-quality data generation. ...
We define generative boundaries which determine the activation of nodes in the internal layer and probe inside the model with this information. ...
Explaining deep neural networks: One can explain the output of a neural network by sensitivity analysis, which aims to determine which portion of an input contributes to the output. ...
arXiv:1912.05827v1
fatcat:w5gypgetnjbltdjb4omkpc6zfi
Learning Neural Networks with Two Nonlinear Layers in Polynomial Time
[article]
2018
arXiv
pre-print
This is the first assumption-free, provably efficient algorithm for learning neural networks with two nonlinear layers. ...
We give a polynomial-time algorithm for learning neural networks with one layer of sigmoids feeding into any Lipschitz, monotone activation function (e.g., sigmoid or ReLU). ...
[ZLJ16] to obtain results for learning sparse neural networks with certain smooth activations, and Goel et al. ...
arXiv:1709.06010v4
fatcat:dfy27fty6vfwzagcbl7vbeq4h4
Online Learning in Contextual Bandits using Gated Linear Networks
[article]
2020
arXiv
pre-print
This algorithm is based on Gated Linear Networks (GLNs), a recently introduced deep learning architecture with properties well-suited to the online setting. ...
We empirically evaluate GLCB compared to 9 state-of-the-art algorithms that leverage deep neural networks, on a standard benchmark suite of discrete and continuous contextual bandit problems. ...
• Neural Greedy estimates action-values with a neural network and follows an ε-greedy policy. • Neural Linear utilizes a neural network to extract latent features, from which action values are estimated ...
arXiv:2002.11611v2
fatcat:vm65osogrrh2vbreila2kvrk3a
From average case complexity to improper learning complexity
[article]
2014
arXiv
pre-print
Agnostically learning halfspaces with a constant approximation ratio is hard. 3. Learning an intersection of ω(1) halfspaces is hard. ...
There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in (Kearns and Valiant 89) and relies on cryptographic assumptions. ...
Acknowledgements: Amit Daniely is a recipient of the Google Europe Fellowship in Learning Theory, and this research is supported in part by this Google Fellowship. ...
arXiv:1311.2272v2
fatcat:4j35d76anjalradrkkois6fu7m
An Efficient Explorative Sampling Considering the Generative Boundaries of Deep Generative Neural Networks
2020
Proceedings of the AAAI Conference on Artificial Intelligence
Deep generative neural networks (DGNNs) have achieved realistic and high-quality data generation. ...
We define generative boundaries which determine the activation of nodes in the internal layer and probe inside the model with this information. ...
Explaining deep neural networks: One can explain the output of a neural network by sensitivity analysis, which aims to determine which portion of an input contributes to the output. ...
doi:10.1609/aaai.v34i04.5852
fatcat:q7vli524vzffpf5j5ppjxbeooq
Learning Stable Deep Dynamics Models
[article]
2020
arXiv
pre-print
We show that such learning systems are able to model simple dynamical systems and can be combined with additional deep generative models to learn complex dynamics, such as video textures, in a fully end-to-end ...
The approach works by jointly learning a dynamics model and Lyapunov function that guarantees non-expansiveness of the dynamics under the learned Lyapunov function. ...
Specifically, we let f be defined by a 2-100-100-2 fully connected network, and V be a 2-100-100-1 ICNN, with both networks initialized via the default weights of PyTorch (the Kaiming uniform initialization ...
arXiv:2001.06116v1
fatcat:5ixd3au4wjhplhifpwab7vtooy