
Showing 1–50 of 320 results for author: Jha, S

Searching in archive cs.
  1. arXiv:2405.09854  [pdf, other]

    cs.CL

    On the relevance of pre-neural approaches in natural language processing pedagogy

    Authors: Aditya Joshi, Jake Renzella, Pushpak Bhattacharyya, Saurav Jha, Xiangyu Zhang

    Abstract: While neural approaches using deep learning are the state-of-the-art for natural language processing (NLP) today, pre-neural algorithms and approaches still find a place in NLP textbooks and courses of recent years. In this paper, we compare two introductory NLP courses taught in Australia and India, and examine how Transformer and pre-neural approaches are balanced within the lecture plan and ass…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Under review at Teaching NLP workshop at ACL 2024; 8 pages

  2. arXiv:2405.07764  [pdf, other]

    cs.CL cs.SI physics.soc-ph

    LGDE: Local Graph-based Dictionary Expansion

    Authors: Dominik J. Schindler, Sneha Jha, Xixuan Zhang, Kilian Buehling, Annett Heft, Mauricio Barahona

    Abstract: Expanding a dictionary of pre-selected keywords is crucial for tasks in information retrieval, such as database query and online data collection. Here we propose Local Graph-based Dictionary Expansion (LGDE), a method that uses tools from manifold learning and network science for the data-driven discovery of keywords starting from a seed dictionary. At the heart of LGDE lies the creation of a word…

    Submitted 13 May, 2024; originally announced May 2024.

  3. arXiv:2405.07333  [pdf, other]

    quant-ph cs.DC

    Quantum Mini-Apps: A Framework for Developing and Benchmarking Quantum-HPC Applications

    Authors: Nishant Saurabh, Pradeep Mantha, Florian J. Kiwit, Shantenu Jha, Andre Luckow

    Abstract: With the increasing maturity and scale of quantum hardware and its integration into HPC systems, there is a need to develop robust techniques for developing, characterizing, and benchmarking quantum-HPC applications and middleware systems. This requires a better understanding of interaction, coupling, and common execution patterns between quantum and classical workload tasks and components. This p…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  4. arXiv:2405.03513  [pdf, other]

    cs.CR cs.CE

    QBER: Quantifying Cyber Risks for Strategic Decisions

    Authors: Muriel Figueredo Franco, Aiatur Rahaman Mullick, Santosh Jha

    Abstract: Quantifying cyber risks is essential for organizations to grasp their vulnerability to threats and make informed decisions. However, current approaches still need to work on blending economic viewpoints to provide insightful analysis. To bridge this gap, we introduce QBER approach to offer decision-makers measurable risk metrics. The QBER evaluates losses from cyberattacks, performs detailed risk…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 10 pages, 9 equations, 3 tables, 2 figures

  5. arXiv:2404.18094  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

    Authors: Wenbin Wang, Yang Song, Sanjay Jha

    Abstract: Conventional text-to-speech (TTS) research has predominantly focused on enhancing the quality of synthesized speech for speakers in the training dataset. The challenge of synthesizing lifelike speech for unseen, out-of-dataset speakers, especially those with limited reference data, remains a significant and unresolved problem. While zero-shot or few-shot speaker-adaptive TTS approaches have been e…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 15 pages, 13 figures. Copyright has been transferred to IEEE

    Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024

  6. arXiv:2404.15293  [pdf, other]

    eess.IV cs.GR q-bio.NC

    Interactive Manipulation and Visualization of 3D Brain MRI for Surgical Training

    Authors: Siddharth Jha, Zichen Gui, Benjamin Delbos, Richard Moreau, Arnaud Leleve, Irene Cheng

    Abstract: In modern medical diagnostics, magnetic resonance imaging (MRI) is an important technique that provides detailed insights into anatomical structures. In this paper, we present a comprehensive methodology focusing on streamlining the segmentation, reconstruction, and visualization process of 3D MRI data. Segmentation involves the extraction of anatomical regions with the help of state-of-the-art de…

    Submitted 24 March, 2024; originally announced April 2024.

  7. arXiv:2404.08509  [pdf, other]

    cs.DC cs.CL cs.LG

    Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction

    Authors: Haoran Qiu, Weichao Mao, Archit Patke, Shengkun Cui, Saurabh Jha, Chen Wang, Hubertus Franke, Zbigniew T. Kalbarczyk, Tamer Başar, Ravishankar K. Iyer

    Abstract: Large language models (LLMs) have been driving a new wave of interactive AI applications across numerous domains. However, efficiently serving LLM inference requests is challenging due to their unpredictable execution times originating from the autoregressive nature of generative models. Existing LLM serving systems exploit first-come-first-serve (FCFS) scheduling, suffering from head-of-line bloc…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted at AIOps'24

  8. arXiv:2404.07139  [pdf, other]

    cs.AI cs.GT

    Towards a Game-theoretic Understanding of Explanation-based Membership Inference Attacks

    Authors: Kavita Kumari, Murtuza Jadliwala, Sumit Kumar Jha, Anindya Maiti

    Abstract: Model explanations improve the transparency of black-box machine learning (ML) models and their decisions; however, they can also be exploited to carry out privacy threats such as membership inference attacks (MIA). Existing works have only analyzed MIA in a single "what if" interaction scenario between an adversary and the target ML model; thus, it does not discern the factors impacting the capab…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2202.02659

  9. arXiv:2403.19837  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.LO

    Concept-based Analysis of Neural Networks via Vision-Language Models

    Authors: Ravi Mangal, Nina Narodytska, Divya Gopinath, Boyue Caroline Hu, Anirban Roy, Susmit Jha, Corina Pasareanu

    Abstract: The analysis of vision-based deep neural networks (DNNs) is highly desirable but it is very challenging due to the difficulty of expressing formal specifications for vision tasks and the lack of efficient verification procedures. In this paper, we propose to leverage emerging multimodal, vision-language, foundation models (VLMs) as a lens through which we can reason about vision models. VLMs have…

    Submitted 10 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  10. arXiv:2403.19137  [pdf, other]

    cs.CV

    CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models

    Authors: Saurav Jha, Dong Gong, Lina Yao

    Abstract: Continual learning (CL) aims to help deep neural networks to learn new knowledge while retaining what has been learned. Recently, pre-trained vision-language models such as CLIP, with powerful generalizability, have been gaining traction as practical CL candidates. However, the domain mismatch between the pre-training and the downstream CL tasks calls for finetuning of the CLIP on the latter. The…

    Submitted 23 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Work under review

  11. arXiv:2403.18073  [pdf, other]

    cs.DC

    Workflow Mini-Apps: Portable, Scalable, Tunable & Faithful Representations of Scientific Workflows

    Authors: Ozgur Ozan Kilic, Tianle Wang, Matteo Turilli, Mikhail Titov, Andre Merzky, Line Pouchard, Shantenu Jha

    Abstract: Workflows are critical for scientific discovery. However, the sophistication, heterogeneity, and scale of workflows make building, testing, and optimizing them increasingly challenging. Furthermore, their complexity and heterogeneity make performance reproducibility hard. In this paper, we propose workflow mini-apps as a tool to address the challenges in building and testing workflows while contro…

    Submitted 26 March, 2024; originally announced March 2024.

  12. arXiv:2403.17155  [pdf, other]

    cs.CL cs.CR

    Task-Agnostic Detector for Insertion-Based Backdoor Attacks

    Authors: Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen

    Abstract: Textual backdoor attacks pose significant security threats. Current detection approaches, typically relying on intermediate feature representation or reconstructing potential triggers, are task-specific and less effective beyond sentence classification, struggling with tasks like question answering and named entity recognition. We introduce TABDet (Task-Agnostic Backdoor Detector), a pioneering ta…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Findings of NAACL 2024

  13. arXiv:2403.15721  [pdf, other]

    cs.DC

    Design and Implementation of an Analysis Pipeline for Heterogeneous Data

    Authors: Arup Kumar Sarker, Aymen Alsaadi, Niranda Perera, Mills Staylor, Gregor von Laszewski, Matteo Turilli, Ozgur Ozan Kilic, Mikhail Titov, Andre Merzky, Shantenu Jha, Geoffrey Fox

    Abstract: Managing and preparing complex data for deep learning, a prevalent approach in large-scale data science can be challenging. Data transfer for model training also presents difficulties, impacting scientific fields like genomics, climate modeling, and astronomy. A large-scale solution like Google Pathways with a distributed execution environment for deep learning models exists but is proprietary. In…

    Submitted 7 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: 14 pages, 16 figures, 2 tables

    ACM Class: H.2.4; D.2.7; D.2.2

  14. Loss Regularizing Robotic Terrain Classification

    Authors: Shakti Deo Kumar, Sudhanshu Tripathi, Krishna Ujjwal, Sarvada Sakshi Jha, Suddhasil De

    Abstract: Locomotion mechanics of legged robots are suitable when pacing through difficult terrains. Recognising terrains for such robots are important to fully yoke the versatility of their movements. Consequently, robotic terrain classification becomes significant to classify terrains in real time with high accuracy. The conventional classifiers suffer from overfitting problem, low accuracy problem, high…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Preliminary draft of the work published in IEEE conference 2023

  15. arXiv:2402.18649  [pdf, other]

    cs.CR cs.AI

    A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems

    Authors: Fangzhou Wu, Ning Zhang, Somesh Jha, Patrick McDaniel, Chaowei Xiao

    Abstract: Large Language Model (LLM) systems are inherently compositional, with individual LLM serving as the core foundation with additional layers of objects such as plugins, sandbox, and so on. Along with the great potential, there are also increasing concerns over the security of such probabilistic intelligent systems. However, existing studies on LLM security often focus on individual LLM, but without…

    Submitted 28 February, 2024; originally announced February 2024.

  16. arXiv:2402.15911  [pdf, other]

    cs.CR cs.CL

    PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails

    Authors: Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, Atul Prakash

    Abstract: Large language models (LLMs) are typically aligned to be harmless to humans. Unfortunately, recent work has shown that such models are susceptible to automated jailbreak attacks that induce them to generate harmful content. More recent LLMs often incorporate an additional layer of defense, a Guard Model, which is a second LLM that is designed to check and moderate the output response of the primar…

    Submitted 24 February, 2024; originally announced February 2024.

  17. arXiv:2402.05980  [pdf, other]

    cs.SE cs.AI cs.LG cs.PL

    Do Large Code Models Understand Programming Concepts? A Black-box Approach

    Authors: Ashish Hooda, Mihai Christodorescu, Miltiadis Allamanis, Aaron Wilson, Kassem Fawaz, Somesh Jha

    Abstract: Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still unclear as to why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We prop…

    Submitted 23 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  18. arXiv:2402.02047  [pdf, other]

    cs.SE cs.LG

    Calibration and Correctness of Language Models for Code

    Authors: Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Susmit Jha, Prem Devanbu, Toufique Ahmed

    Abstract: Machine learning models are widely used but can also often be wrong. Users would benefit from a reliable indication of whether a given output from a given model should be trusted, so a rational decision can be made whether to use the output or not. For example, outputs can be associated with a confidence measure; if this confidence measure is strongly associated with likelihood of correctness, the…

    Submitted 16 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  19. arXiv:2401.07886  [pdf, other]

    cs.LG cs.AI cs.CL cs.DC

    Learned Best-Effort LLM Serving

    Authors: Siddharth Jha, Coleman Hooper, Xiaoxuan Liu, Sehoon Kim, Kurt Keutzer

    Abstract: Many applications must provide low-latency LLM service to users or risk unacceptable user experience. However, over-provisioning resources to serve fluctuating request patterns is often prohibitively expensive. In this work, we present a best-effort serving system that employs deep reinforcement learning to adjust service quality based on the task distribution and system load. Our best-effort syst…

    Submitted 15 January, 2024; originally announced January 2024.

  20. arXiv:2311.13713  [pdf, other]

    cs.CR cs.CV cs.LG

    A Somewhat Robust Image Watermark against Diffusion-based Editing Models

    Authors: Mingtian Tan, Tianhao Wang, Somesh Jha

    Abstract: Recently, diffusion models (DMs) have become the state-of-the-art method for image synthesis. Editing models based on DMs, known for their high fidelity and precision, have inadvertently introduced new challenges related to image copyright infringement and malicious editing. Our work is the first to formalize and address this issue. After assessing and attempting to enhance traditional image water…

    Submitted 7 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

  21. arXiv:2311.10571  [pdf, other]

    stat.ML cs.LG stat.CO

    Direct Amortized Likelihood Ratio Estimation

    Authors: Adam D. Cobb, Brian Matejek, Daniel Elenius, Anirban Roy, Susmit Jha

    Abstract: We introduce a new amortized likelihood ratio estimator for likelihood-free simulation-based inference (SBI). Our estimator is simple to train and estimates the likelihood ratio using a single forward pass of the neural estimator. Our approach directly computes the likelihood ratio between two competing parameter sets which is different from the previous approach of comparing two neural network ou…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 12 Pages, 10 Figures, GitHub: https://github.com/SRI-CSL/dnre

  22. arXiv:2311.04824  [pdf, other]

    cs.DB cs.DC cs.PL

    Bilevel Relations and Their Applications to Data Insights

    Authors: Xi Wu, Xiangyao Yu, Shaleen Deep, Ahmed Mahmood, Uyeong Jang, Stratis Viglas, Somesh Jha, John Cieslewicz, Jeffrey F. Naughton

    Abstract: Many data-insight analytic tasks in anomaly detection, metric attribution, and experimentation analysis can be modeled as searching in a large space of tables and finding important ones, where the notion of importance is defined in some adhoc manner. While various frameworks have been proposed (e.g., DIFF, VLDB 2019), a systematic and general treatment is lacking. This paper describes bilevel rela…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Some overlap on examples and experiments with arXiv:2302.00120. The latter draft will be revised to focus on implementation

  23. arXiv:2311.00429  [pdf, other]

    eess.IV cs.LG

    Crop Disease Classification using Support Vector Machines with Green Chromatic Coordinate (GCC) and Attention based feature extraction for IoT based Smart Agricultural Applications

    Authors: Shashwat Jha, Vishvaditya Luhach, Gauri Shanker Gupta, Beependra Singh

    Abstract: Crops hold paramount significance as they serve as the primary provider of energy, nutrition, and medicinal benefits for the human population. Plant diseases, however, can negatively affect leaves during agricultural cultivation, resulting in significant losses in crop output and economic value. Therefore, it is crucial for farmers to identify crop diseases. However, this method frequently necessi…

    Submitted 6 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

  24. arXiv:2310.19272  [pdf, other]

    cs.LG cs.AI cs.CV

    NPCL: Neural Processes for Uncertainty-Aware Continual Learning

    Authors: Saurav Jha, Dong Gong, He Zhao, Lina Yao

    Abstract: Continual learning (CL) aims to train deep neural networks efficiently on streaming data while limiting the forgetting caused by new tasks. However, learning transferable knowledge with less interference between tasks is difficult, and real-world deployment of CL models is limited by their inability to measure predictive uncertainties. To address these issues, we propose handling CL tasks with neu…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted as a poster at NeurIPS 2023

  25. arXiv:2310.19137  [pdf, other]

    cs.LG cs.AI

    Automaton Distillation: Neuro-Symbolic Transfer Learning for Deep Reinforcement Learning

    Authors: Suraj Singireddy, Andre Beckus, George Atia, Sumit Jha, Alvaro Velasquez

    Abstract: Reinforcement learning (RL) is a powerful tool for finding optimal policies in sequential decision processes. However, deep RL methods suffer from two weaknesses: collecting the amount of agent experience required for practical RL problems is prohibitively expensive, and the learned policies exhibit poor generalization on tasks outside of the training distribution. To mitigate these issues, we int…

    Submitted 29 October, 2023; originally announced October 2023.

  26. arXiv:2310.18924  [pdf, other]

    cs.LG

    Remaining useful life prediction of Lithium-ion batteries using spatio-temporal multimodal attention networks

    Authors: Sungho Suh, Dhruv Aditya Mittal, Hymalai Bello, Bo Zhou, Mayank Shekhar Jha, Paul Lukowicz

    Abstract: Lithium-ion batteries are widely used in various applications, including electric vehicles and renewable energy storage. The prediction of the remaining useful life (RUL) of batteries is crucial for ensuring reliable and efficient operation, as well as reducing maintenance costs. However, determining the life cycle of batteries in real-world scenarios is challenging, and existing methods have limi…

    Submitted 6 June, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  27. arXiv:2310.18491  [pdf, other]

    cs.LG cs.CL cs.CR

    Publicly-Detectable Watermarking for Language Models

    Authors: Jaiden Fairoze, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang

    Abstract: We present a highly detectable, trustless watermarking scheme for LLMs: the detection algorithm contains no secret information, and it is executable by anyone. We embed a publicly-verifiable cryptographic signature into LLM output using rejection sampling. We prove that our scheme is cryptographically correct, sound, and distortion-free. We make novel uses of error-correction techniques to overcom…

    Submitted 28 May, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  28. arXiv:2310.17064  [pdf, other]

    cs.AI cs.CL cs.LG cs.LO

    math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories

    Authors: Hassen Saidi, Susmit Jha, Tuhin Sahai

    Abstract: As artificial intelligence (AI) gains greater adoption in a wide variety of applications, it has immense potential to contribute to mathematical discovery, by guiding conjecture generation, constructing counterexamples, assisting in formalizing mathematics, and discovering connections between different mathematical areas, to name a few. While prior work has leveraged computers for exhaustive mat…

    Submitted 25 October, 2023; originally announced October 2023.

  29. arXiv:2310.16678  [pdf, other]

    cs.LG cs.CR

    Robust and Actively Secure Serverless Collaborative Learning

    Authors: Olive Franzese, Adam Dziedzic, Christopher A. Choquette-Choo, Mark R. Thomas, Muhammad Ahmad Kaleem, Stephan Rabanser, Congyu Fang, Somesh Jha, Nicolas Papernot, Xiao Wang

    Abstract: Collaborative machine learning (ML) is widely used to enable institutions to learn better models from distributed data. While collaborative approaches to learning intuitively protect user data, they remain vulnerable to either the server, the clients, or both, deviating from the protocol. Indeed, because the protocol is asymmetric, a malicious server can abuse its power to reconstruct client data…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023

  30. arXiv:2310.11689  [pdf, other]

    cs.CL cs.LG

    Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs

    Authors: Jiefeng Chen, Jinsung Yoon, Sayna Ebrahimi, Sercan O Arik, Tomas Pfister, Somesh Jha

    Abstract: Large language models (LLMs) have recently shown great advances in a variety of tasks, including natural language understanding and generation. However, their use in high-stakes decision-making scenarios is still limited due to the potential for errors. Selective prediction is a technique that can be used to improve the reliability of the LLMs by allowing them to abstain from making predictions wh…

    Submitted 11 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Paper published at Findings of the Association for Computational Linguistics: EMNLP, 2023

  31. arXiv:2310.08015  [pdf, other]

    cs.LG cs.CR

    Why Train More? Effective and Efficient Membership Inference via Memorization

    Authors: Jihye Choi, Shruti Tople, Varun Chandrasekaran, Somesh Jha

    Abstract: Membership Inference Attacks (MIAs) aim to identify specific data samples within the private training dataset of machine learning models, leading to serious privacy violations and other sophisticated threats. Many practical black-box MIAs require query access to the data distribution (the same distribution where the private data is drawn) to train shadow models. By doing so, the adversary obtains…

    Submitted 11 October, 2023; originally announced October 2023.

  32. arXiv:2310.06758  [pdf, other]

    cs.SE cs.PL

    slash: A Technique for Static Configuration-Logic Identification

    Authors: Mohannad Alhanahnah, Philipp Schubert, Thomas Reps, Somesh Jha, Eric Bodden

    Abstract: Researchers have recently devised tools for debloating software and detecting configuration errors. Several of these tools rely on the observation that programs are composed of an initialization phase followed by a main-computation phase. Users of these tools are required to manually annotate the boundary that separates these phases, a task that can be time-consuming and error-prone (typically, th…

    Submitted 20 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  33. arXiv:2310.03371  [pdf, ps, other]

    cs.IT cs.DC

    Fundamental Limits of Distributed Optimization over Multiple Access Channel

    Authors: Shubham Jha

    Abstract: We consider distributed optimization over a $d$-dimensional space, where $K$ remote clients send coded gradient estimates over an additive Gaussian Multiple Access Channel (MAC) with noise variance $σ_z^2$. Furthermore, the codewords from the clients must satisfy the average power constraint $P$, resulting in a signal-to-noise ratio (SNR) of $KP/σ_z^2$. In this paper, we study the fundamen…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: Submitted to IEEE for possible publication

  34. arXiv:2309.16436  [pdf, other]

    cs.AI cs.LO

    Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving

    Authors: Sumit Kumar Jha, Susmit Jha, Patrick Lincoln, Nathaniel D. Bastian, Alvaro Velasquez, Rickard Ewetz, Sandeep Neema

    Abstract: Generative large language models (LLMs) with instruct training such as GPT-4 can follow human-provided instruction prompts and generate human-like responses to these prompts. Apart from natural language responses, they have also been found to be effective at generating formal artifacts such as code, plans, and logical specifications from natural language prompts. Despite their remarkably improved…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 25 pages, 7 figures

  35. arXiv:2309.15386  [pdf, other]

    cs.LG cs.AI

    Neural Stochastic Differential Equations for Robust and Explainable Analysis of Electromagnetic Unintended Radiated Emissions

    Authors: Sumit Kumar Jha, Susmit Jha, Rickard Ewetz, Alvaro Velasquez

    Abstract: We present a comprehensive evaluation of the robustness and explainability of ResNet-like models in the context of Unintended Radiated Emission (URE) classification and suggest a new approach leveraging Neural Stochastic Differential Equations (SDEs) to address identified limitations. We provide an empirical demonstration of the fragility of ResNet-like models to Gaussian noise perturbations, wher…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 11 pages, 3 figures, 4 tables

  36. arXiv:2309.09258  [pdf, other]

    cs.LG math.OC stat.ML

    Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets

    Authors: Pulkit Gopalani, Samyak Jha, Anirbit Mukherjee

    Abstract: In this note, we demonstrate a first-of-its-kind provable convergence of SGD to the global minima of appropriately regularized logistic empirical risk of depth $2$ nets -- for arbitrary data and with any number of gates with adequately smooth and bounded activations like sigmoid and tanh. We also prove an exponentially fast convergence rate for continuous time SGD that also applies to smooth unbou…

    Submitted 17 March, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

    Comments: 18 Pages, 1 figure. Published in the Transactions on Machine Learning Research (TMLR) in Feb, 2024. arXiv admin note: substantial text overlap with arXiv:2210.11452

  37. arXiv:2309.05070  [pdf, other]

    cs.RO cs.AI eess.SY

    Chasing the Intruder: A Reinforcement Learning Approach for Tracking Intruder Drones

    Authors: Shivam Kainth, Subham Sahoo, Rajtilak Pal, Shashi Shekhar Jha

    Abstract: Drones are becoming versatile in a myriad of applications. This has led to the use of drones for spying and intruding into the restricted or private air spaces. Such foul use of drone technology is dangerous for the safety and security of many critical infrastructures. In addition, due to the varied low-cost design and agility of the drones, it is a challenging task to identify and track them usin…

    Submitted 10 September, 2023; originally announced September 2023.

  38. Identifying and Mitigating the Security Risks of Generative AI

    Authors: Clark Barrett, Brad Boyd, Elie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

    Abstract: Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well…

    Submitted 28 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Journal ref: Foundations and Trends in Privacy and Security 6 (2023) 1-52

  39. arXiv:2308.14659  [pdf, other]

    cs.LG

    RESTORE: Graph Embedding Assessment Through Reconstruction

    Authors: Hong Yung Yip, Chidaksh Ravuru, Neelabha Banerjee, Shashwat Jha, Amit Sheth, Aman Chadha, Amitava Das

    Abstract: Following the success of Word2Vec embeddings, graph embeddings (GEs) have gained substantial traction. GEs are commonly generated and evaluated extrinsically on downstream applications, but intrinsic evaluations of the original graph properties in terms of topological structure and semantic information have been lacking. Understanding these will help identify the deficiency of the various families…

    Submitted 5 September, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  40. arXiv:2308.13007  [pdf, other]

    cs.SD cs.AI eess.AS

    Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations

    Authors: Wenbin Wang, Yang Song, Sanjay Jha

    Abstract: While most research into speech synthesis has focused on synthesizing high-quality speech for in-dataset speakers, an equally essential yet unsolved problem is synthesizing speech for unseen speakers who are out-of-dataset with limited reference data, i.e., speaker adaptive speech synthesis. Many studies have proposed zero-shot speaker adaptive text-to-speech and voice conversion approaches aimed…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: 5 pages, 3 figures. Accepted by Interspeech 2023, Oral

  41. arXiv:2308.06608  [pdf, other]

    quant-ph cs.DC

    A Conceptual Architecture for a Quantum-HPC Middleware

    Authors: Nishant Saurabh, Shantenu Jha, Andre Luckow

    Abstract: Quantum computing promises potential for science and industry by solving certain computationally complex problems faster than classical computers. Quantum computing systems evolved from monolithic systems towards modular architectures comprising multiple quantum processing units (QPUs) coupled to classical computing nodes (HPC). With the increasing scale, middleware systems that facilitate the eff…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 12 pages, 3 figures

    ACM Class: D.m

  42. arXiv:2308.03906  [pdf, other]

    cs.CV

    TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models

    Authors: Indranil Sur, Karan Sikka, Matthew Walmer, Kaushik Koneripalli, Anirban Roy, Xiao Lin, Ajay Divakaran, Susmit Jha

    Abstract: We present a Multimodal Backdoor Defense technique TIJO (Trigger Inversion using Joint Optimization). Recent work arXiv:2112.07668 has demonstrated successful backdoor attacks on multimodal models for the Visual Question Answering task. Their dual-key backdoor trigger is split across two modalities (image and text), such that the backdoor is activated if and only if the trigger is present in both…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: Published as conference paper at ICCV 2023. 13 pages, 6 figures, 7 tables

  43. arXiv:2308.03664  [pdf, other]

    cs.LG

    Two-stage Early Prediction Framework of Remaining Useful Life for Lithium-ion Batteries

    Authors: Dhruv Mittal, Hymalai Bello, Bo Zhou, Mayank Shekhar Jha, Sungho Suh, Paul Lukowicz

    Abstract: Early prediction of remaining useful life (RUL) is crucial for effective battery management across various industries, ranging from household appliances to large-scale applications. Accurate RUL prediction improves the reliability and maintainability of battery technology. However, existing methods have limitations, including assumptions of data from the same sensors or distribution, foreknowledge…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: Accepted at the 49th Annual Conference of the IEEE Industrial Electronics Society (IECON 2023)

  44. arXiv:2307.16331  [pdf, other]

    cs.LG cs.CR

    Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks

    Authors: Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash

    Abstract: Adversarial examples threaten the integrity of machine learning systems with alarming success rates even under constrained black-box conditions. Stateful defenses have emerged as an effective countermeasure, detecting potential attacks by maintaining a buffer of recent queries and detecting new queries that are too similar. However, these defenses fundamentally pose a trade-off between attack dete…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: 2nd AdvML Frontiers Workshop at ICML 2023

  45. arXiv:2307.08813  [pdf, other]

    cs.CL cs.LG

    Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge

    Authors: Gilchan Park, Byung-Jun Yoon, Xihaier Luo, Vanessa López-Marrero, Shinjae Yoo, Shantenu Jha

    Abstract: Understanding protein interactions and pathway knowledge is crucial for unraveling the complexities of living systems and investigating the underlying mechanisms of biological functions and complex diseases. While existing databases provide curated biological data from literature and other sources, they are often incomplete and their maintenance is labor-intensive, necessitating alternative approa…

    Submitted 18 October, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

  46. PSI/J: A Portable Interface for Submitting, Monitoring, and Managing Jobs

    Authors: Mihael Hategan-Marandiuc, Andre Merzky, Nicholson Collier, Ketan Maheshwari, Jonathan Ozik, Matteo Turilli, Andreas Wilke, Justin M. Wozniak, Kyle Chard, Ian Foster, Rafael Ferreira da Silva, Shantenu Jha, Daniel Laney

    Abstract: It is generally desirable for high-performance computing (HPC) applications to be portable between HPC systems, for example to make use of more performant hardware, make effective use of allocations, and to co-locate compute jobs with large datasets. Unfortunately, moving scientific applications between HPC systems is challenging for various reasons, most notably that HPC systems have different HP…

    Submitted 20 September, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

  47. arXiv:2307.01292  [pdf, other]

    cs.CR cs.AI cs.LG

    Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems

    Authors: Debopam Sanyal, Jui-Tse Hung, Manav Agrawal, Prahlad Jasti, Shahab Nikkhoo, Somesh Jha, Tianhao Wang, Sibin Mohan, Alexey Tumanov

    Abstract: Model-serving systems have become increasingly popular, especially in real-time web applications. In such systems, users send queries to the server and specify the desired performance metrics (e.g., desired accuracy, latency). The server maintains a set of models (model zoo) in the back-end and serves the queries based on the specified metrics. This paper examines the security, specifically robust…

    Submitted 6 August, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 17 pages, 9 figures, 6 tables

  48. arXiv:2306.05562  [pdf, other]

    cs.RO cs.AI cs.CE

    AircraftVerse: A Large-Scale Multimodal Dataset of Aerial Vehicle Designs

    Authors: Adam D. Cobb, Anirban Roy, Daniel Elenius, F. Michael Heim, Brian Swenson, Sydney Whittington, James D. Walker, Theodore Bapty, Joseph Hite, Karthik Ramani, Christopher McComb, Susmit Jha

    Abstract: We present AircraftVerse, a publicly available aerial vehicle design dataset. Aircraft design encompasses different physics domains and, hence, multiple modalities of representation. The evaluation of these cyber-physical system (CPS) designs requires the use of scientific analytical and simulation models ranging from computer-aided design tools for structural and manufacturing analysis, computati…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: The dataset is hosted at https://zenodo.org/record/6525446, baseline models and code at https://github.com/SRI-CSL/AircraftVerse, and the dataset description at https://aircraftverse.onrender.com/

  49. arXiv:2305.20052  [pdf, other]

    cs.LG cs.CV

    Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision

    Authors: Chase Walker, Sumit Jha, Kenny Chen, Rickard Ewetz

    Abstract: Attribution algorithms are frequently employed to explain the decisions of neural network models. Integrated Gradients (IG) is an influential attribution method due to its strong axiomatic foundation. The algorithm is based on integrating the gradients along a path from a reference image to the input image. Unfortunately, it can be observed that gradients computed from regions where the output log…

    Submitted 18 December, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: 16 pages, 11 figures, accepted at AAAI 2024. The full code implementation of the paper's results is located at: https://github.com/chasewalker26/Integrated-Decision-Gradients

    ACM Class: I.4.7

  50. arXiv:2305.17528  [pdf, other]

    cs.LG

    Two Heads are Better than One: Towards Better Adversarial Robustness by Combining Transduction and Rejection

    Authors: Nils Palumbo, Yang Guo, Xi Wu, Jiefeng Chen, Yingyu Liang, Somesh Jha

    Abstract: Both transduction and rejection have emerged as important techniques for defending against adversarial perturbations. A recent work by Tramèr showed that, in the rejection-only case (no transduction), a strong rejection-solution can be turned into a strong (but computationally inefficient) non-rejection solution. This detector-to-classifier reduction has been mostly applied to give evidence that c…

    Submitted 27 May, 2023; originally announced May 2023.