Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 2,255 results for author: Liu, C

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.06028  [pdf, other

    cs.CV

    ReCon1M:A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery

    Authors: Xian Sun, Qiwei Yan, Chubo Deng, Chenglong Liu, Yi Jiang, Zhongyan Hou, Wanxuan Lu, Fanglong Yao, Xiaoyu Liu, Lingxiang Hao, Hongfeng Yu

    Abstract: Scene Graph Generation (SGG) is a high-level visual understanding and reasoning task aimed at extracting entities (such as objects) and their interrelationships from images. Significant progress has been made in the study of SGG in natural images in recent years, but its exploration in the domain of remote sensing images remains very limited. The complex characteristics of remote sensing images ne… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  2. arXiv:2406.05898  [pdf, other

    cs.IR cs.AI cs.LG

    Async Learned User Embeddings for Ads Delivery Optimization

    Authors: Mingwei Tang, Meng Liu, Hong Li, Junjie Yang, Chenglin Wei, Boyang Li, Dai Li, Rengan Xu, Yifan Xu, Zehua Zhang, Xiangyu Wang, Linfeng Liu, Yuelei Xie, Chengye Liu, Labib Fawaz, Li Li, Hongnan Wang, Bill Zhu, Sri Reddy

    Abstract: User representation is crucial for recommendation systems as it helps to deliver personalized recommendations by capturing user preferences and behaviors in low-dimensional vectors. High-quality user embeddings can capture subtle preferences, enable precise similarity calculations, and adapt to changing preferences over time to maintain relevance. The effectiveness of recommendation systems depend… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by workshop on Multimodal Representation and Retrieval at SIGIR 2024, Washington DC

  3. arXiv:2406.05779  [pdf, other

    cs.CV

    Learning to utilize gradient information for crisp edge detection

    Authors: Changsong Liu, Wei Zhang, Yanyan Liu, Yuming Li, Wenlin Li, Yimeng Fan, Liang Zhang

    Abstract: Edge detection is a fundamental task in computer vision and it has made great progress under the development of deep convolutional neural networks (DCNNs), some of them have achieved a beyond human-level performance. However, recent top-performing edge detection methods tend to generate thick and blurred edge lines. In this work, we propose an effective method to solve this problem. Our approach c… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  4. arXiv:2406.05682  [pdf, other

    cs.LG cs.AI

    From Basic to Extra Features: Hypergraph Transformer Pretrain-then-Finetuning for Balanced Clinical Predictions on EHR

    Authors: Ran Xu, Yiwen Lu, Chang Liu, Yong Chen, Yan Sun, Xiao Hu, Joyce C Ho, Carl Yang

    Abstract: Electronic Health Records (EHRs) contain rich patient information and are crucial for clinical research and practice. In recent years, deep learning models have been applied to EHRs, but they often rely on massive features, which may not be readily available for all patients. We propose HTP-Star, which leverages hypergraph structures with a pretrain-then-finetune framework for modeling EHR data, e… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: CHIL 2024

  5. arXiv:2406.05221  [pdf, other

    cs.DC

    GCAPS: GPU Context-Aware Preemptive Priority-based Scheduling for Real-Time Tasks

    Authors: Yidi Wang, Cong Liu, Daniel Wong, Hyoseung Kim

    Abstract: Scheduling real-time tasks that utilize GPUs with analyzable guarantees poses a significant challenge due to the intricate interaction between CPU and GPU resources, as well as the complex GPU hardware and software stack. While much research has been conducted in the real-time research community, several limitations persist, including the absence or limited availability of GPU-level preemption, ex… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by ECRTS 2024. arXiv admin note: substantial text overlap with arXiv:2401.16529

  6. arXiv:2406.04975  [pdf, other

    cs.LG cs.AI

    UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting

    Authors: Juncheng Liu, Chenghao Liu, Gerald Woo, Yiwei Wang, Bryan Hooi, Caiming Xiong, Doyen Sahoo

    Abstract: Transformer-based models have emerged as powerful tools for multivariate time series forecasting (MTSF). However, existing Transformer models often fall short of capturing both intricate dependencies across variate and temporal dimensions in MTS data. Some recent models are proposed to separately capture variate and temporal dependencies through either two sequential or parallel attention mechanis… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  7. arXiv:2406.04637  [pdf

    cs.HC cs.CY

    Accessible Adventures: Teaching Accessibility to High School Students Through Games

    Authors: Kyrie Zhixuan Zhou, Chunyu Liu, Jingwen Shan, Devorah Kletenik, Rachel F. Adler

    Abstract: Accessibility education has been rarely incorporated into the high school curricula. This is a missed opportunity to equip next-generation software designers and decision-makers with knowledge, awareness, and empathy regarding accessibility and disabilities. We taught accessibility to students (N=93) in a midwestern high school through empathy-driven games and interviewed three Computer Science hi… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 87th Annual Meeting of the Association for Information Science and Technology (ASIS&T)

  8. arXiv:2406.04207  [pdf, other

    cs.CV

    CDMamba: Remote Sensing Image Change Detection with Mamba

    Authors: Haotian Zhang, Keyan Chen, Chenyang Liu, Hao Chen, Zhengxia Zou, Zhenwei Shi

    Abstract: Recently, the Mamba architecture based on state space models has demonstrated remarkable performance in a series of natural language processing tasks and has been rapidly applied to remote sensing change detection (CD) tasks. However, most methods enhance the global receptive field by directly modifying the scanning mode of Mamba, neglecting the crucial role that local information plays in dense p… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  9. arXiv:2406.04149  [pdf

    eess.IV cs.AI

    Characterizing segregation in blast rock piles a deep-learning approach leveraging aerial image analysis

    Authors: Chengeng Liu, Sihong Liu, Chaomin Shen, Yupeng Gao, Yuxuan Liu

    Abstract: Blasted rock material serves a critical role in various engineering applications, yet the phenomenon of segregation-where particle sizes vary significantly along the gradient of a quarry pile-presents challenges for optimizing quarry material storage and handling. This study introduces an advanced image analysis methodology to characterize such segregation of rock fragments. The accurate delineati… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  10. arXiv:2406.04052  [pdf, ps, other

    cs.LG cs.AI

    Multivector Neurons: Better and Faster O(n)-Equivariant Clifford Graph Neural Networks

    Authors: Cong Liu, David Ruhe, Patrick Forré

    Abstract: Most current deep learning models equivariant to $O(n)$ or $SO(n)$ either consider mostly scalar information such as distances and angles or have a very high computational complexity. In this work, we test a few novel message passing graph neural networks (GNNs) based on Clifford multivectors, structured similarly to other prevalent equivariant models in geometric deep learning. Our approach lever… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  11. arXiv:2406.03930  [pdf, other

    cs.CL

    Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art

    Authors: Chen Cecilia Liu, Iryna Gurevych, Anna Korhonen

    Abstract: The surge of interest in culturally aware and adapted Natural Language Processing (NLP) has inspired much recent research. However, the lack of common understanding of the concept of "culture" has made it difficult to evaluate progress in this emerging area. Drawing on prior research in NLP and related fields, we propose an extensive taxonomy of elements of culture that can provide a systematic fr… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  12. arXiv:2406.03794  [pdf, other

    cs.LG

    Infusing Self-Consistency into Density Functional Theory Hamiltonian Prediction via Deep Equilibrium Models

    Authors: Zun Wang, Chang Liu, Nianlong Zou, He Zhang, Xinran Wei, Lin Huang, Lijun Wu, Bin Shao

    Abstract: In this study, we introduce a unified neural network architecture, the Deep Equilibrium Density Functional Theory Hamiltonian (DEQH) model, which incorporates Deep Equilibrium Models (DEQs) for predicting Density Functional Theory (DFT) Hamiltonians. The DEQH model inherently captures the self-consistency nature of Hamiltonian, a critical aspect often overlooked by traditional machine learning app… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  13. arXiv:2406.03725  [pdf, other

    cs.CL

    LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification

    Authors: Chun Liu, Hongguang Zhang, Kainan Zhao, Xinghai Ju, Lin Yang

    Abstract: With the booming of Large Language Models (LLMs), prompt-learning has become a promising method mainly researched in various research areas. Recently, many attempts based on prompt-learning have been made to improve the performance of text classification. However, most of these methods are based on heuristic Chain-of-Thought (CoT), and tend to be more complex but less efficient. In this paper, we… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ACL 2024 main conference

  14. arXiv:2406.03695  [pdf, other

    cs.CR

    FACOS: Enabling Privacy Protection Through Fine-Grained Access Control with On-chain and Off-chain System

    Authors: Chao Liu, Cankun Hou, Tianyu Jiang, Jianting Ning, Hui Qiao, Yusen Wu

    Abstract: Data-driven landscape across finance, government, and healthcare, the continuous generation of information demands robust solutions for secure storage, efficient dissemination, and fine-grained access control. Blockchain technology emerges as a significant tool, offering decentralized storage while upholding the tenets of data security and accessibility. However, on-chain and off-chain strategies… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  15. arXiv:2406.03317  [pdf, other

    cs.HC cs.MM

    Save It for the "Hot" Day: An LLM-Empowered Visual Analytics System for Heat Risk Management

    Authors: Haobo Li, Wong Kam-Kwai, Yan Luo, Juntong Chen, Chengzhong Liu, Yaxuan Zhang, Alexis Kai Hon Lau, Huamin Qu, Dongyu Liu

    Abstract: The escalating frequency and intensity of heat-related climate events, particularly heatwaves, emphasize the pressing need for advanced heat risk management strategies. Current approaches, primarily relying on numerical models, face challenges in spatial-temporal resolution and in capturing the dynamic interplay of environmental, social, and behavioral factors affecting heat risks. This has led to… ▽ More

    Submitted 7 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  16. arXiv:2406.02509  [pdf, other

    cs.CV

    CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation

    Authors: Dejia Xu, Weili Nie, Chao Liu, Sifei Liu, Jan Kautz, Zhangyang Wang, Arash Vahdat

    Abstract: Recently video diffusion models have emerged as expressive generative tools for high-quality video content creation readily available to general users. However, these models often do not offer precise control over camera poses for video generation, limiting the expression of cinematic language and user control. To address this issue, we introduce CamCo, which allows fine-grained Camera pose Contro… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Project page: https://ir1d.github.io/CamCo/

  17. arXiv:2406.01901  [pdf, other

    cs.LG

    Bifurcated Generative Flow Networks

    Authors: Chunhui Li, Cheng-Hao Liu, Dianbo Liu, Qingpeng Cai, Ling Pan

    Abstract: Generative Flow Networks (GFlowNets), a new family of probabilistic samplers, have recently emerged as a promising framework for learning stochastic policies that generate high-quality and diverse objects proportionally to their rewards. However, existing GFlowNets often suffer from low data efficiency due to the direct parameterization of edge flows or reliance on backward policies that may strug… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  18. arXiv:2406.01638  [pdf, other

    cs.LG cs.AI cs.CL

    TimeCMA: Towards LLM-Empowered Time Series Forecasting via Cross-Modality Alignment

    Authors: Chenxi Liu, Qianxiong Xu, Hao Miao, Sun Yang, Lingzheng Zhang, Cheng Long, Ziyue Li, Rui Zhao

    Abstract: The widespread adoption of scalable mobile sensing has led to large amounts of time series data for real-world applications. A fundamental application is multivariate time series forecasting (MTSF), which aims to predict future time series values based on historical observations. Existing MTSF methods suffer from limited parameterization and small-scale training data. Recently, Large language mode… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  19. arXiv:2406.01380  [pdf, other

    cs.CV stat.AP

    Convolutional Unscented Kalman Filter for Multi-Object Tracking with Outliers

    Authors: Shiqi Liu, Wenhan Cao, Chang Liu, Tianyi Zhang, Shengbo Eben Li

    Abstract: Multi-object tracking (MOT) is an essential technique for navigation in autonomous driving. In tracking-by-detection systems, biases, false positives, and misses, which are referred to as outliers, are inevitable due to complex traffic scenarios. Recent tracking methods are based on filtering algorithms that overlook these outliers, leading to reduced tracking accuracy or even loss of the objects… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures

  20. arXiv:2406.01359  [pdf, other

    cs.CL cs.SE

    R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

    Authors: Ken Deng, Jiaheng Liu, He Zhu, Congnan Liu, Jingxin Li, Jiakai Wang, Peng Zhao, Chenchen Zhang, Yanan Wu, Xueqiao Yin, Yuanxing Zhang, Wenbo Su, Bangyu Xiang, Tiezheng Ge, Bo Zheng

    Abstract: Code completion models have made significant progress in recent years. Recently, repository-level code completion has drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fall short of fully using the extensive context of a project repository, such as the intricacies of… ▽ More

    Submitted 3 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  21. arXiv:2406.01356  [pdf, other

    cs.CV

    MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images

    Authors: Ke-Lei Wang, Pin-Hsuan Chou, Young-Ching Chou, Chia-Jen Liu, Cheng-Kuan Lin, Yu-Chee Tseng

    Abstract: While there are a lot of models for instance segmentation, PolarMask stands out as a unique one that represents an object by a Polar coordinate system. With an anchor-box-free design and a single-stage framework that conducts detection and segmentation at one time, PolarMask is proved to be able to balance efficiency and accuracy. Hence, it can be easily connected with other downstream real-time a… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  22. arXiv:2406.01059  [pdf, other

    cs.CV

    VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model

    Authors: Jinze Yang, Haoran Wang, Zining Zhu, Chenglong Liu, Meng Wymond Wu, Zeke Xie, Zhong Ji, Jungong Han, Mingming Sun

    Abstract: In this paper, we focus on resolving the problem of image outpainting, which aims to extrapolate the surrounding parts given the center contents of an image. Although recent works have achieved promising performance, the lack of versatility and customization hinders their practical applications in broader scenarios. Therefore, this work presents a novel image outpainting framework that is capable… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 15 pages

  23. arXiv:2406.01044  [pdf

    physics.med-ph cs.AI

    Nuclear Medicine Artificial Intelligence in Action: The Bethesda Report (AI Summit 2024)

    Authors: Arman Rahmim, Tyler J. Bradshaw, Guido Davidzon, Joyita Dutta, Georges El Fakhri, Munir Ghesani, Nicolas A. Karakatsanis, Quanzheng Li, Chi Liu, Emilie Roncali, Babak Saboury, Tahir Yusufaly, Abhinav K. Jha

    Abstract: The 2nd SNMMI Artificial Intelligence (AI) Summit, organized by the SNMMI AI Task Force, took place in Bethesda, MD, on February 29 - March 1, 2024. Bringing together various community members and stakeholders, and following up on a prior successful 2022 AI Summit, the summit theme was: AI in Action. Six key topics included (i) an overview of prior and ongoing efforts by the AI task force, (ii) em… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  24. arXiv:2406.00960  [pdf, other

    cs.GR cs.RO

    PDP: Physics-Based Character Animation via Diffusion Policy

    Authors: Takara E. Truong, Michael Piseno, Zhaoming Xie, C. Karen Liu

    Abstract: Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods, as proposed by the robotics community, have demonstrated the ability to capture highly diverse and multi-modal skills. However, naively training a diffusion policy often results in unstable motions for high-fr… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  25. arXiv:2406.00625  [pdf, other

    cs.CV

    SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection

    Authors: Yun Peng, Xiao Lin, Nachuan Ma, Jiayuan Du, Chuangwei Liu, Chengju Liu, Qijun Chen

    Abstract: Visual anomaly detection is vital in real-world applications, such as industrial defect detection and medical diagnosis. However, most existing methods focus on local structural anomalies and fail to detect higher-level functional anomalies under logical conditions. Although recent studies have explored logical anomaly detection, they can only address simple anomalies like missing or addition and… ▽ More

    Submitted 5 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  26. arXiv:2406.00403  [pdf, other

    cs.LG cs.AI

    Dual-perspective Cross Contrastive Learning in Graph Transformers

    Authors: Zelin Yao, Chuang Liu, Xueqi Ma, Mukun Chen, Jia Wu, Xiantao Cai, Bo Du, Wenbin Hu

    Abstract: Graph contrastive learning (GCL) is a popular method for leaning graph representations by maximizing the consistency of features across augmented views. Traditional GCL methods utilize single-perspective i.e. data or model-perspective) augmentation to generate positive samples, restraining the diversity of positive samples. In addition, these positive samples may be unreliable due to uncontrollabl… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures, submitted to IEEE TKDE

  27. arXiv:2406.00219  [pdf, other

    cs.CV

    Fairness in Autonomous Driving: Towards Understanding Confounding Factors in Object Detection under Challenging Weather

    Authors: Bimsara Pathiraja, Caleb Liu, Ransalu Senanayake

    Abstract: The deployment of autonomous vehicles (AVs) is rapidly expanding to numerous cities. At the heart of AVs, the object detection module assumes a paramount role, directly influencing all downstream decision-making tasks by considering the presence of nearby pedestrians, vehicles, and more. Despite high accuracy of pedestrians detected on held-out datasets, the potential presence of algorithmic bias… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 14 pages

  28. arXiv:2405.20608  [pdf, other

    cs.CL

    Identifying while Learning for Document Event Causality Identification

    Authors: Cheng Liu, Wei Xiang, Bang Wang

    Abstract: Event Causality Identification (ECI) aims to detect whether there exists a causal relation between two events in a document. Existing studies adopt a kind of identifying after learning paradigm, where events' representations are first learned and then used for the identification. Furthermore, they mainly focus on the causality existence, but ignoring causal direction. In this paper, we take care o… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted at ACL 2024

  29. arXiv:2405.20313  [pdf, other

    cs.LG q-bio.BM

    Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation

    Authors: Guillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, Riashat Islam, Cheng-Hao Liu, Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael Bronstein, Alexander Tong, Avishek Joey Bose

    Abstract: Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFl… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: preprint

  30. arXiv:2405.19221  [pdf

    q-bio.QM cs.LG

    Domain adaptation in small-scale and heterogeneous biological datasets

    Authors: Seyedmehdi Orouji, Martin C. Liu, Tal Korem, Megan A. K. Peters

    Abstract: Machine learning techniques are steadily becoming more important in modern biology, and are used to build predictive models, discover patterns, and investigate biological problems. However, models trained on one dataset are often not generalizable to other datasets from different cohorts or laboratories, due to differences in the statistical properties of these datasets. These could stem from tech… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: main manuscript + supplement

  31. arXiv:2405.19038  [pdf, other

    cs.RO

    PointNetPGAP-SLC: A 3D LiDAR-based Place Recognition Approach with Segment-level Consistency Training for Mobile Robots in Horticulture

    Authors: T. Barros, L. Garrote, P. Conde, M. J. Coombes, C. Liu, C. Premebida, U. J. Nunes

    Abstract: This paper addresses robotic place recognition in horticultural environments using 3D-LiDAR technology and deep learning. Three main contributions are proposed: (i) a novel model called PointNetPGAP, which combines a global average pooling aggregator and a pairwise feature interaction aggregator; (ii) a Segment-Level Consistency (SLC) model, used only during training, with the goal of augmenting t… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: This preprint has been submitted to IEEE RA-L for review

  32. arXiv:2405.18726  [pdf, other

    cs.SD cs.CV cs.MM eess.AS

    Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI

    Authors: Che Liu, Changde Du, Xiaoyu Chen, Huiguang He

    Abstract: Drawing inspiration from the hierarchical processing of the human auditory system, which transforms sound from low-level acoustic features to high-level semantic understanding, we introduce a novel coarse-to-fine audio reconstruction method. Leveraging non-invasive functional Magnetic Resonance Imaging (fMRI) data, our approach mimics the inverse pathway of auditory processing. Initially, we utili… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  33. arXiv:2405.18240  [pdf, other

    cs.CV

    MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution

    Authors: Wenzhuo Liu, Fei Zhu, Shijie Ma, Cheng-Lin Liu

    Abstract: Although Vision Transformers (ViTs) have recently advanced computer vision tasks significantly, an important real-world problem was overlooked: adapting to variable input resolutions. Typically, images are resized to a fixed resolution, such as 224x224, for efficiency during training and inference. However, uniform input size conflicts with real-world scenarios where images naturally vary in resol… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  34. arXiv:2405.18234  [pdf, other

    cs.RO

    Cooperative Relative Localization in MAV Swarms with Ultra-wideband Ranging

    Authors: Changrui Liu, Sven U. Pfeiffer, Guido C. H. E. de Croon

    Abstract: Relative localization (RL) is essential for the successful operation of micro air vehicle (MAV) swarms. Achieving accurate 3-D RL in infrastructure-free and GPS-denied environments with only distance information is a challenging problem that has not been satisfactorily solved. In this work, based on the range-based peer-to-peer RL using the ultra-wideband (UWB) ranging technique, we develop a nove… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 16 pages, 18 figures

  35. arXiv:2405.17830  [pdf, other

    cs.CL

    More Than Catastrophic Forgetting: Integrating General Capabilities For Domain-Specific LLMs

    Authors: Chengyuan Liu, Shihang Wang, Yangyang Kang, Lizhi Qing, Fubang Zhao, Changlong Sun, Kun Kuang, Fei Wu

    Abstract: The performance on general tasks decreases after Large Language Models (LLMs) are fine-tuned on domain-specific tasks, the phenomenon is known as Catastrophic Forgetting (CF). However, this paper presents a further challenge for real application of domain-specific LLMs beyond CF, called General Capabilities Integration (GCI), which necessitates the integration of both the general capabilities and… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  36. arXiv:2405.17229  [pdf, other

    cs.HC

    InsigHTable: Insight-driven Hierarchical Table Visualization with Reinforcement Learning

    Authors: Guozheng Li, Peng He, Xinyu Wang, Runfei Li, Chi Harold Liu, Chuangxin Ou, Dong He, Guoren Wang

    Abstract: Embedding visual representations within original hierarchical tables can mitigate additional cognitive load stemming from the division of users' attention. The created hierarchical table visualizations can help users understand and explore complex data with multi-level attributes. However, because of many options available for transforming hierarchical tables and selecting subsets for embedding, t… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  37. arXiv:2405.16887  [pdf

    cs.AI cs.MA cs.RO

    A Large Language Model-based multi-agent manufacturing system for intelligent shopfloor

    Authors: Zhen Zhao, Dunbing Tang, Haihua Zhu, Zequn Zhang, Kai Chen, Changchun Liu, Yuchen Ji

    Abstract: As productivity advances, the demand of customers for multi-variety and small-batch production is increasing, thereby putting forward higher requirements for manufacturing systems. When production tasks frequent changes due to this demand, traditional manufacturing systems often cannot response promptly. The multi-agent manufacturing system is proposed to address this problem. However, because of… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  38. arXiv:2405.16874  [pdf, other

    cs.CV

    CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild

    Authors: Xingqun Qi, Hengyuan Zhang, Yatian Wang, Jiahao Pan, Chen Liu, Peng Li, Xiaowei Chi, Mengfei Li, Qixun Zhang, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo

    Abstract: Deriving co-speech 3D gestures has seen tremendous progress in virtual avatar animation. Yet, the existing methods often produce stiff and unreasonable gestures with unseen human speech inputs due to the limited 3D speech-gesture data. In this paper, we propose CoCoGesture, a novel framework enabling vivid and diverse gesture synthesis from unseen human speech prompts. Our key insight is built upo… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: The dataset will be released as soon as possible

  39. arXiv:2405.16479  [pdf, other

    cs.CV

    Differentiable Proximal Graph Matching

    Authors: Haoru Tan, Chuang Wang, Xu-Yao Zhang, Cheng-Lin Liu

    Abstract: Graph matching is a fundamental tool in computer vision and pattern recognition. In this paper, we introduce an algorithm for graph matching based on the proximal operator, referred to as differentiable proximal graph matching (DPGM). Specifically, we relax and decompose the quadratic assignment problem for the graph matching into a sequence of convex optimization problems. The whole algorithm can… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

  40. arXiv:2405.16225  [pdf, ps, other

    cs.LG cs.AI

    Local Causal Structure Learning in the Presence of Latent Variables

    Authors: Feng Xie, Zheng Li, Peng Wu, Yan Zeng, Chunchen Liu, Zhi Geng

    Abstract: Discovering causal relationships from observational data, particularly in the presence of latent variables, poses a challenging problem. While current local structure learning methods have proven effective and efficient when the focus lies solely on the local relationships of a target variable, they operate under the assumption of causal sufficiency. This assumption implies that all the common cau… ▽ More

    Submitted 6 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  41. arXiv:2405.15843  [pdf, other

    cs.CV cs.AI

    SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception

    Authors: Louis Foucard, Samar Khanna, Yi Shi, Chi-Kuei Liu, Quinn Z Shen, Thuyen Ngo, Zi-Xiang Xia

    Abstract: In this paper, we propose SpotNet: a fast, single stage, image-centric but LiDAR anchored approach for long range 3D object detection. We demonstrate that our approach to LiDAR/image sensor fusion, combined with the joint learning of 2D and 3D detection tasks, can lead to accurate 3D object detection with very sparse LiDAR support. Unlike more recent bird's-eye-view (BEV) sensor-fusion methods whi… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  42. arXiv:2405.15293  [pdf, other

    cs.CR

    Transaction Fee Estimation in the Bitcoin System

    Authors: Limeng Zhang, Rui Zhou, Qing Liu, Chengfei Liu, M. Ali Babar

    Abstract: In the Bitcoin system, transaction fees serve as an incentive for blockchain confirmations. In general, a transaction with a higher fee is likely to be included in the next block mined, whereas a transaction with a smaller fee or no fee may be delayed or never processed at all. However, the transaction fee needs to be specified when submitting a transaction and almost cannot be altered thereafter.… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  43. arXiv:2405.15279  [pdf, other

    cs.CV

    Towards Global Optimal Visual In-Context Learning Prompt Selection

    Authors: Chengming Xu, Chen Liu, Yikai Wang, Yanwei Fu

    Abstract: Visual In-Context Learning (VICL) is a prevailing way to transfer visual foundation models to new tasks by leveraging contextual information contained in in-context examples to enhance learning and prediction of query sample. The fundamental problem in VICL is how to select the best prompt to activate its power as much as possible, which is equivalent to the ranking problem to test the in-context… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  44. arXiv:2405.15082   

    cs.RO

    ETA-INIT: Enhancing the Translation Accuracy for Stereo Visual-Inertial SLAM Initialization

    Authors: Han Song, Zhongche Qu, Zhi Zhang, Zihan Ye, Cong Liu

    Abstract: As the current initialization method in the state-of-the-art Stereo Visual-Inertial SLAM framework, ORB-SLAM3 has limitations. Its success depends on the performance of the pure stereo SLAM system and is based on the underlying assumption that pure visual SLAM can accurately estimate the camera trajectory, which is essential for inertial parameter estimation. Meanwhile, the further improved initia… ▽ More

    Submitted 9 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: this paper needs revision and I'll publish new version in the future

  45. arXiv:2405.15020  [pdf, other

    cs.CV cs.AI

    AdjointDEIS: Efficient Gradients for Diffusion Models

    Authors: Zander W. Blasingame, Chen Liu

    Abstract: The optimization of the latents and parameters of diffusion models with respect to some differentiable metric defined on the output of the model is a challenging and complex problem. The sampling for diffusion models is done by solving either the probability flow ODE or diffusion SDE wherein a neural network approximates the score function or related quantity, allowing a numerical ODE/SDE solver t… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Initial pre-print

  46. arXiv:2405.14913  [pdf, other

    stat.ME cs.LG math.PR stat.ML

    High Rank Path Development: an approach of learning the filtration of stochastic processes

    Authors: Jiajie Tao, Hao Ni, Chong Liu

    Abstract: Since the weak convergence for stochastic processes does not account for the growth of information over time which is represented by the underlying filtration, a slightly erroneous stochastic model in weak topology may cause huge loss in multi-periods decision making problems. To address such discontinuities Aldous introduced the extended weak convergence, which can fully characterise all essentia… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  47. arXiv:2405.14824  [pdf, other

    cs.CV cs.RO

    Camera Relocalization in Shadow-free Neural Radiance Fields

    Authors: Shiyao Xu, Caiyun Liu, Yuantao Chen, Zhenxin Zhu, Zike Yan, Yongliang Shi, Hao Zhao, Guyue Zhou

    Abstract: Camera relocalization is a crucial problem in computer vision and robotics. Recent advancements in neural radiance fields (NeRFs) have shown promise in synthesizing photo-realistic images. Several works have utilized NeRFs for refining camera poses, but they do not account for lighting changes that can affect scene appearance and shadow regions, causing a degraded pose optimization process. In thi… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted by ICRA 2024. 8 pages, 5 figures, 3 tables. Codes and dataset: https://github.com/hnrna/ShadowfreeNeRF-CameraReloc

  48. arXiv:2405.14696  [pdf, other

    cs.CL cs.AI cs.DB

    A Declarative System for Optimizing AI Workloads

    Authors: Chunwei Liu, Matthew Russo, Michael Cafarella, Lei Cao, Peter Baille Chen, Zui Chen, Michael Franklin, Tim Kraska, Samuel Madden, Gerardo Vitagliano

    Abstract: A long-standing goal of data management systems has been to build systems which can compute quantitative insights over large corpora of unstructured data in a cost-effective manner. Until recently, it was difficult and expensive to extract facts from company documents, data from scientific papers, or metrics from image and video corpora. Today's models can accomplish these tasks with high accuracy… ▽ More

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 29 pages, 9 figures

    ACM Class: H.2.3; I.2.5

  49. arXiv:2405.14578  [pdf, other

    cs.LG

    Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

    Authors: Shuaipeng Li, Penghao Zhao, Hailin Zhang, Xingwu Sun, Hao Wu, Dian Jiao, Weiyan Wang, Chengjun Liu, Zheng Fang, Jinbao Xue, Yangyu Tao, Bin Cui, Di Wang

    Abstract: In current deep learning tasks, Adam style optimizers such as Adam, Adagrad, RMSProp, Adafactor, and Lion have been widely used as alternatives to SGD style optimizers. These optimizers typically update model parameters using the sign of gradients, resulting in more stable convergence curves. The learning rate and the batch size are the most critical hyperparameters for optimizers, which require c… ▽ More

    Submitted 4 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  50. arXiv:2405.14295  [pdf, other

    cs.CV

    Focus Anywhere for Fine-grained Multi-page Document Understanding

    Authors: Chenglong Liu, Haoran Wei, Jinyue Chen, Lingyu Kong, Zheng Ge, Zining Zhu, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang

    Abstract: Modern LVLMs still struggle to achieve fine-grained document understanding, such as OCR/translation/caption for regions of interest to the user, tasks that require the context of the entire page, or even multiple pages. Accordingly, this paper proposes Fox, an effective pipeline, hybrid data, and tuning strategy, that catalyzes LVLMs to focus anywhere on single/multi-page documents. We introduce a… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.