9,165 Hits in 4.2 sec

Measuring Neural Net Robustness with Constraints [article]

Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi
2017 arXiv   pre-print
We propose metrics for measuring the robustness of a neural net and devise a novel algorithm for approximating these metrics based on an encoding of robustness as a linear program.  ...  We show how our metrics can be used to evaluate the robustness of deep neural nets with experiments on the MNIST and CIFAR-10 datasets.  ...  The aim of our paper is to provide metrics for evaluating robustness, and to demonstrate the importance of using such impartial measures to compare robustness.  ... 
arXiv:1605.07262v2 fatcat:fdx3sqki6reahmk25xdsu4vr3m

Investigating the Corruption Robustness of Image Classifiers with Random Lp-norm Corruptions [article]

Georg Siedel, Weijia Shao, Silvia Vock, Andrey Morozov
2024 arXiv   pre-print
We evaluate the model robustness against imperceptible random p-norm corruptions and propose a novel robustness metric.  ...  We empirically investigate whether robustness transfers across different p-norms and derive conclusions on which p-norm corruptions a model should be trained and evaluated.  ...  corruptions for calculating mCE Lp and iCE metrics at test time.  ... 
arXiv:2305.05400v4 fatcat:335enfgtdrhqrnntmbs6u4ltie
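
The random Lp-norm corruptions studied in this entry can be illustrated with a minimal sketch: sample a random direction and rescale it to a fixed Lp norm before adding it to the input. The helper name `random_lp_corruption` is hypothetical and this is not the authors' implementation; it only shows how an Lp-bounded perturbation can be generated.

```python
import random

def random_lp_corruption(x, eps, p):
    """Add a random perturbation with Lp norm eps to a flat list of floats.

    Toy sketch: draw a Gaussian direction, rescale its Lp norm to eps,
    and add it elementwise to the input x.
    """
    delta = [random.gauss(0.0, 1.0) for _ in x]
    norm = sum(abs(d) ** p for d in delta) ** (1.0 / p)
    delta = [eps * d / norm for d in delta]
    return [xi + di for xi, di in zip(x, delta)]
```

For p = 2 this projects a Gaussian direction onto the L2 sphere of radius eps; sampling uniformly inside the Lp ball would additionally require a random radial draw.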

Scalable and Robust Self-Learning for Skill Routing in Large-Scale Conversational AI Systems [article]

Mohammad Kachuee, Jinseok Nam, Sarthak Ahuja, Jin-Myung Won, Sungjin Lee
2022 arXiv   pre-print
To enable such robust frequent model updates, we suggest a simple and effective approach that ensures controlled policy updates for individual domains, followed by an off-policy evaluation for making deployment decisions without any need for lengthy A/B experimentation.  ...  In the pre-deployment evaluation, a set of expert-defined guard-rails is applied to the evaluation results to ensure robust model updates, especially for business-critical cases.  ... 
arXiv:2204.07135v1 fatcat:lz4r3xv5brhzzgqigkjtuqhfti

Robust Speech Recognition Using Warped Dft-Based Cepstral Features In Clean And Multistyle Training

Md. Jahangir Alam, Pierre Dumouchel, Patrick Kenny, D. O'Shaughnessy
2014 Zenodo  
Word error rate (WER) is used as an evaluation metric.  ...  Results and Discussion Word error rate (WER) is used as an evaluation metric for performance evaluation and comparison of the warped DFTbased cepstral feature extraction methods.  ... 
doi:10.5281/zenodo.54513 fatcat:ujwo2mtr5nfclgplxxiuelyha4
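
Word error rate, the evaluation metric named in this entry, is the word-level Levenshtein distance between hypothesis and reference, divided by the reference length. A minimal sketch (the function name `word_error_rate` is illustrative, not from the paper; it assumes a non-empty reference):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)
```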

LLM-based Frameworks for Power Engineering from Routine to Novel Tasks [article]

Ran Li, Chuanqing Pu, Junyi Tao, Canbing Li, Feilong Fan, Yue Xiang, Sijie Chen
2023 arXiv   pre-print
Bard in terms of success rate, consistency, and robustness.  ...  Here, we propose LLM-based frameworks for different programming tasks in power systems.  ...  A set of evaluation metrics is designed to assess LLMs on a multi-metric scale, encompassing pre-knowledge in the prompt, model assessment metrics, and code assessment metrics in terms of success rate, consistency  ... 
arXiv:2305.11202v3 fatcat:qhgdn6lsnbggnnv3qstxwtkswq

Robustness Verification of Semantic Segmentation Neural Networks Using Relaxed Reachability [chapter]

Hoang-Dung Tran, Neelanjana Pal, Patrick Musau, Diego Manzanas Lopez, Nathaniel Hamilton, Xiaodong Yang, Stanley Bak, Taylor T. Johnson
2021 Lecture Notes in Computer Science  
of intersection-over-union (IoU), the typical performance evaluation measure for segmentation tasks.  ...  Abstract: This paper introduces robustness verification for semantic segmentation neural networks (in short, semantic segmentation networks [SSNs]), building on and extending recent approaches for robustness  ...  Additionally, we define and evaluate several metrics for robustness, as the robustness evaluation is more sophisticated for segmentation.  ... 
doi:10.1007/978-3-030-81685-8_12 fatcat:3co6pfnvxbahzjrgyq6rjuthrm
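
Intersection-over-union, the segmentation measure referenced here, can be sketched per class as the ratio of pixels where both prediction and ground truth carry the class to pixels where either does. A toy version over flat label lists (the helper `iou` is hypothetical, not from the paper):

```python
def iou(pred, target, cls):
    """Intersection-over-union of class `cls` between two flat label maps."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # Convention: a class absent from both maps scores a perfect 1.0.
    return inter / union if union else 1.0
```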

Evaluating the Effectiveness of Margin Parameter when Learning Knowledge Embedding Representation for Domain-specific Multi-relational Categorized Data [article]

Matthew Wai Heng Chung, Hegler Tissot
2019 arXiv   pre-print
We evaluate the effects of distinct values for the margin parameter focused on translational embedding representation models for multi-relational categorized data.  ...  Finally, the correlation between link prediction and classification accuracy shows traditional validation protocol for embedding models is a weak metric to represent the quality of embedding representation  ...  traditional LP metrics within the embedding training and evaluation protocols.  ... 
arXiv:1912.10264v1 fatcat:i5kbtznn5fcsnmb5uwwsehkbcq
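
The margin parameter evaluated in this entry appears in the standard margin-based ranking loss used by translational embedding models such as TransE: a positive triple's dissimilarity score should beat a corrupted triple's by at least the margin. A minimal sketch under those textbook definitions, not the paper's code (both helper names are illustrative):

```python
def transe_score(h, r, t):
    """L1 dissimilarity ||h + r - t||_1 for a translational embedding
    (lower is better); h, r, t are equal-length lists of floats."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def margin_ranking_loss(pos_score, neg_score, margin):
    """Hinge loss max(0, margin + pos_score - neg_score): pushes the
    positive triple's score below the corrupted one's by `margin`."""
    return max(0.0, margin + pos_score - neg_score)
```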

A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias [article]

Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan
2023 arXiv   pre-print
However, in this paper, we find that when adaptation protocols (LP, FT, LP+FT) are also evaluated on a variety of safety objectives (e.g., calibration, robustness, etc.), a complementary perspective to feature  ...  Going beyond conventional linear probing (LP) and fine-tuning (FT) strategies, protocols that can effectively control feature distortion, i.e., the failure to update features orthogonal to the in-distribution  ...  ACKNOWLEDGMENTS We thank Ekdeep Singh Lubana for several helpful discussions during the course of this project. This work was performed under the auspices of the U.S.  ... 
arXiv:2303.13500v1 fatcat:l53glz2a7vbsxd2nr4lvwzoipm

AutoFT: Learning an Objective for Robust Fine-Tuning [article]

Caroline Choi, Yoonho Lee, Annie Chen, Allan Zhou, Aditi Raghunathan, Chelsea Finn
2024 arXiv   pre-print
We propose AutoFT, a data-driven approach for robust fine-tuning. Given a task, AutoFT searches for a fine-tuning procedure that enhances out-of-distribution (OOD) generalization.  ...  Specifically, AutoFT uses bi-level optimization to search for an objective function and hyperparameters that maximize post-adaptation performance on a small OOD validation set.  ...  Acknowledgements We thank Kyle Hsu, Lukas Haas, and other members of the IRIS lab for helpful feedback and discussions. We also thank Sachin Goyal for help with ImageNet experiments.  ... 
arXiv:2401.10220v2 fatcat:vrxnqn7tmza27bts5xmoi5cbbe

Joint Lp-Norm and L2,1-Norm Constrained Graph Laplacian PCA for Robust Tumor Sample Clustering and Gene Network Module Discovery

Xiang-Zhen Kong, Yu Song, Jin-Xing Liu, Chun-Hou Zheng, Sha-Sha Yuan, Juan Wang, Ling-Yun Dai
2021 Frontiers in Genetics  
In this article, a novel method named Lp-norm and L2,1-norm constrained graph Laplacian principal component analysis (PL21GPCA) based on traditional principal component analysis (PCA) is proposed for robust  ...  Third, to retain the geometric structure of the data, we introduce the graph Laplacian regularization term to the PL21GPCA optimization model.  ...  ACKNOWLEDGMENTS Many thanks to my co-tutor Yong Xu, who is now a professor at Harbin Institute of Technology, Shenzhen, China.  ... 
doi:10.3389/fgene.2021.621317 pmid:33708239 pmcid:PMC7940841 fatcat:rjqiv52dwfazzgzyt7yxfflszm

Optimizing Dynamic Trajectories for Robustness to Disturbances Using Polytopic Projections [article]

Henrique Ferrolho, Wolfgang Merkt, Vladimir Ivan, Wouter Wolfslag, Sethu Vijayakumar
2020 arXiv   pre-print
This paper focuses on robustness to disturbance forces and uncertain payloads. We present a novel formulation to optimize the robustness of dynamic trajectories.  ...  The non-trivial transcription proposed allows trajectory optimization frameworks to converge to highly robust dynamic solutions.  ...  We would also like to thank the anonymous reviewers for their constructive comments.  ... 
arXiv:2003.00609v2 fatcat:pnpwhyammvbntlb7gbnfnqlfbi

Quantifying Degrees of Controllability in Temporal Networks with Uncertainty

Shyan Akmal, Savana Ammons, Hemeng Li, James C. Boerkoel Jr.
2019 International Conference on Automated Planning and Scheduling  
We introduce new methods for predicting the degrees of strong and dynamic controllability for uncontrollable networks.  ...  In addition, we show empirically that both metrics are good predictors of the actual dispatch success rate.  ...  Finally, we thank Jordan Abrahams, Susan Martonosi, and Mohamed Omar for offering their expertise in temporal networks, optimization, and convex geometry respectively.  ... 
dblp:conf/aips/AkmalALB19 fatcat:5w6cojo2mrgfxahlgwolwn6d6q

Robust Validation of Network Designs under Uncertain Demands and Failures

Yiyang Chang, Sanjay G. Rao, Mohit Tawarmalani
2017 Symposium on Networked Systems Design and Implementation  
Acknowledgements We thank our shepherd Nate Foster, and the reviewers for their insightful feedback.  ...  Beyond networking, the complexity status of robust optimization formulations has been investigated and tractable formulations derived for various special cases [12, 14] .  ...  We show that these techniques lead to tighter bounds on the validation problem than existing state-of-the-art approaches in robust optimization, a finding that has applications beyond networking.  ... 
dblp:conf/nsdi/ChangRT17 fatcat:b6bogltymnbbpesab3wt2pvcuu

Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models [article]

Dheeraj Mekala, Alex Nguyen, Jingbo Shang
2024 arXiv   pre-print
Utilizing open-sourced OPT and Llama-2 models up to 13B in size, two publicly available instruction-tuning training datasets, and evaluation by both automatic metrics and humans, our paper introduces a novel  ...  Our experiments span different-sized models, revealing that this characteristic holds for models ranging from 1B (small) to 13B (large) in size.  ...  The authors thank Shang Data Lab, Palash Chauhan, Amulya Bangalore, Shreyas Rajesh, Gautham Reddy, Sanjana Garg, Ethan Thai, Queso Tran, Rahul Mistry, Sandy La, and Sophia Do for their valuable contributions  ... 
arXiv:2402.10430v1 fatcat:npxudxgqejhlfecre7l36pmnta

Beyond BLEU:Training Neural Machine Translation with Semantic Similarity

John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such  ...  In this paper, we introduce an alternative reward function for optimizing NMT systems that is based on recent work in semantic similarity.  ...  Cer et al. (2010) compared several metrics to optimize for SMT, finding BLEU to be robust as a training metric and finding that the most effective and most stable metrics for training are not necessarily  ... 
doi:10.18653/v1/p19-1427 dblp:conf/acl/WietingBGN19 fatcat:ckylq5pjtbfhhpswp2lkgpplwe
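
BLEU, the baseline metric this paper moves beyond, combines clipped n-gram precisions with a brevity penalty. A toy sentence-level version (the helpers `ngrams` and `simple_bleu` are illustrative and omit the smoothing used in practice):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(reference, hypothesis, max_n=2):
    """Geometric mean of clipped n-gram precisions (n = 1..max_n),
    multiplied by a brevity penalty for short hypotheses."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        ref_ngrams = ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        precisions.append(overlap / max(1, sum(hyp_ngrams.values())))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * geo_mean
```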