309 Hits in 4.5 sec

Design and application of adaptive PID controller based on asynchronous advantage actor–critic learning method

Qifeng Sun, Chengze Du, Youxiang Duan, Hui Ren, Hongqiang Li
2019 Wireless networks  
To address the problems of the slow convergence and inefficiency in the existing adaptive PID controllers, we propose a new adaptive PID controller using the asynchronous advantage actor-critic (A3C) algorithm  ...  Firstly, the controller can train the multiple agents of the actor-critic structures in parallel exploiting the multi-thread asynchronous learning characteristics of the A3C structure.  ...  Google's DeepMind team proposed the asynchronous advantage actor-critic (A3C) learning algorithm [14, 15].  ... 
doi:10.1007/s11276-019-02225-x fatcat:jqlgojvabrd7famurtm5ggwhea

Online Reinforcement Learning-Based Control of an Active Suspension System Using the Actor Critic Approach

Ahmad Fares, Ahmad Bani Younes
2020 Applied Sciences  
The Temporal Difference (TD) advantage actor critic algorithm is used with the appropriate reward function.  ...  In this paper, a controller learns to adaptively control an active suspension system using reinforcement learning without prior knowledge of the environment.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app10228060 fatcat:t3a2sa527ffonophc2u3y5f5ye

MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding [article]

Haolin Zhou, Chaoqi Yang, Xiaofeng Gao, Qiong Chen, Gongshen Liu, Guihai Chen
2022 arXiv   pre-print
To address the challenge, we propose a Multi-ObjecTive Actor-Critics algorithm based on reinforcement learning (RL), named MoTiAC, for the problem of bidding optimization with various goals.  ...  In MoTiAC, objective-specific agents update the global network asynchronously with different goals and perspectives, leading to a robust bidding policy.  ...  We generalize the popular asynchronous advantage actor-critic (A3C) [13] reinforcement learning algorithm for multiple objectives in the RTB setting.  ... 
arXiv:2002.07408v2 fatcat:cbyh5qfrmnb4jbhhzqyvwp76b4

Research on Obstacle Avoidance Planning for UUV Based on A3C Algorithm

Hongjian Wang, Wei Gao, Zhao Wang, Kai Zhang, Jingfei Ren, Lihui Deng, Shanshan He
2023 Journal of Marine Science and Engineering  
As a type of deep reinforcement learning algorithm, the A3C (Asynchronous Advantage Actor-Critic) algorithm can effectively utilize computer resources and improve training efficiency by synchronously training  ...  Actor-Critic in multiple threads.  ...  Acknowledgments: The authors would like to thank the anonymous reviewers and the handling editors for their constructive comments that greatly improved this article from its original form.  ... 
doi:10.3390/jmse12010063 fatcat:3lha3tclbveqjdjzmlmajwzaxe

Autotuning PID control using Actor-Critic Deep Reinforcement Learning [article]

Vivien van Veldhuizen
2022 arXiv   pre-print
To study this, an algorithm called Advantage Actor Critic (A2C) is implemented on a simulated robot arm. The simulation primarily relies on the ROS framework.  ...  Initial tests show that the model is indeed able to adapt its predictions to apple locations, making it an adaptive controller.  ...  Secondly, I would like to thank David Speck, for teaching us the much appreciated basics of ROS.  ... 
arXiv:2212.00013v1 fatcat:lec5qlaykbhufac2b5ntk3jhkq

Editorial: Advance of simulations and techniques for communication networks and information systems

Dingde Jiang, Houbing Song, Liuwei Huo
2021 Wireless networks  
function virtual(NFV), network slicing, edge computing, cloud computing, machine learning and swarm intelligent algorithm, are deployed into the network.  ...  How to simulate the behavior of services and verify the functions of applications in the network?  ...  We also thank the Editor-in-Chief, Dr. Imrich Chlamtac for his supportive guidance during the entire process. The special issue is sponsored by the National Natural  ... 
doi:10.1007/s11276-021-02601-6 fatcat:6zvqqznkpregnf7feeern63fhu

Deep Deterministic Policy Gradient to Regulate Feedback Control Systems Using Reinforcement Learning

Samir Salem Al-Bawri, Mohammad Tariqul Islam, Mandeep Jit Singh, Mohd Faizal Jamlos, Adam Narbudowicz, Max J. Ammann, Dominique M. M. P. Schreurs
2022 Computers Materials & Continua  
We propose an adaptive speed control of the motor system based on depth deterministic strategy gradient (DDPG). The actor-critic scenario using DDPG is implemented to build the RL agent.  ...  However, the existing algorithms are unable to provide satisfactory results. Therefore, this research uses a reinforcement learning (RL) algorithm to manage the control system.  ...  Acknowledgement: The authors extend their appreciation to King Saud University for funding this work through Researchers Supporting Project number (RSP-2021/387), King Saud University, Riyadh, Saudi Arabia  ... 
doi:10.32604/cmc.2022.021917 fatcat:db33p2eadrgxpdq52db6l6eh5e

Research on PID Parameter Tuning and Optimization Based on SAC-Auto for USV Path Following

Lifei Song, Chuanyi Xu, Le Hao, Jianxi Yao, Rong Guo
2022 Journal of Marine Science and Engineering  
To satisfy the time-varying demand of PID parameters for guiding control, especially when the USV moves in waves, the soft actor-critic auto (SAC-auto) method is presented to adjust the PID parameters  ...  In this paper, a PID parameter tuning and optimizing method based on deep reinforcement learning was proposed to control the USV heading.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/jmse10121847 fatcat:grkhapqk3fa6ldmfbypsuxeyom

A Survey on Reinforcement Learning in Aviation Applications [article]

Pouria Razzaghi and Amin Tabrizian and Wei Guo and Shulu Chen and Abenezer Taye and Ellis Thompson and Alexis Bregeon and Ali Baheri and Peng Wei
2022 arXiv   pre-print
Compared with model-based control and optimization methods, reinforcement learning (RL) provides a data-driven, learning-based framework to formulate and solve sequential decision-making problems.  ...  Some of them are offline planning problems, while others need to be solved online and are safety-critical. In this survey paper, we first describe standard RL formulations and solutions.  ...  Asynchronous advantage actor-critic (A3C) [25] uses advantage estimates rather than discounted returns in the actor-critic framework and asynchronously updates both the policy and value networks on multiple  ... 
arXiv:2211.02147v2 fatcat:y4tqirja3nffpdpmjhka22ibee

Intelligent Controller Based on Distributed Deep Reinforcement Learning for PEMFC Air Supply System

Jiawen Li, Tao Yu
2021 IEEE Access  
Compared with other control methods, the proposed intelligent controller exhibits better control performance and robustness.  ...  The control algorithm proposed in this paper is of significance to future PEMFC air flux control research.  ...  In order to prove the availability of the CIED-MD3 controller, the TD3 controller, DDPG controller [25], PSO-fuzzy-PID controller [16], Fuzzy-PID controller [19] and PID controller are used as the  ... 
doi:10.1109/access.2021.3049162 fatcat:rzys2tzozzg6rp55t4ndhmeh3i

One-Layer Real-Time Optimization Using Reinforcement Learning: A Review with Guidelines

Ruan de Rezende Faria, Bruno Didier Olivier Capron, Maurício B. de Souza, Argimiro Resende Secchi
2023 Processes  
The literature about each mentioned layer is reviewed, supporting the proposal of a benchmark study of reinforcement learning using a one-layer approach.  ...  The multi-agent deep deterministic policy gradient algorithm was applied for economic optimization and control of the isothermal Van de Vusse reactor.  ...  ] are variations of the actor-critic algorithm with agents learning asynchronously, with two or three agents in parallel.  ... 
doi:10.3390/pr11010123 fatcat:tlthimzdzrhyzmkyrgopbzmmki

Adaptive Nonlinear Model Predictive Horizon Using Deep Reinforcement Learning for Optimal Trajectory Planning

Younes Al Younes, Martin Barczyk
2022 Drones  
This is done by tuning the NMPH's parameters online using two different actor-critic DRL-based algorithms, deep deterministic policy gradient (DDPG) and soft actor-critic (SAC).  ...  The results demonstrate the learning curves, sample complexity, and stability of the DRL-based adaptation scheme and show the superior performance of adaptive NMPH relative to our earlier designs.  ...  (TD3) [21], soft actor-critic (SAC) [22], and asynchronous advantage actor-critic (A3C) [23]) algorithms.  ... 
doi:10.3390/drones6110323 fatcat:olu36goyzvganhre7kjkbiogzm

Data-Driven Control Algorithm for Snake Manipulator

Kai Hu, Lang Tian, Chenghang Weng, Liguo Weng, Qiang Zang, Min Xia, Guodong Qin
2021 Applied Sciences  
It is difficult to obtain an ideal control effect by using the traditional manipulator control method. In view of this, this paper proposes a data-driven snake manipulator control algorithm.  ...  After collecting data, the algorithm uses the strong learning and decision-making ability of the deep deterministic strategy gradient to learn these system data.  ...  Hybrid algorithm actor-critic Asynchronous advantage actor-critic (A3C) Asynchronous training framework, network structure optimization, and evaluation point optimization.  ... 
doi:10.3390/app11178146 fatcat:cxrby7fyqbfklm6nbpfqjn25sy

Drone Deep Reinforcement Learning: A Review

Ahmad Taher Azar, Anis Koubaa, Nada Ali Mohamed, Habiba A. Ibrahim, Zahra Fathy Ibrahim, Muhammad Kazim, Adel Ammar, Bilel Benjdira, Alaa M. Khamis, Ibrahim A. Hameed, Gabriella Casalino
2021 Electronics  
In this paper, we described the state of the art of one subset of these algorithms: the deep reinforcement learning (DRL) techniques.  ...  To ensure this level of autonomy, many artificial intelligence algorithms were designed. These algorithms targeted the guidance, navigation, and control (GNC) of UAVs.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/electronics10090999 doaj:57ededb7d1a0445eaf34975cb6625c1f fatcat:kya3fbblszd27i4exlybnji4ni

Improved Generalized Predictive Control for High-Speed Train Network Systems Based on EMD-AQPSO-LS-SVM Time Delay Prediction Model

Xiangyu Kong, Tong Zhang, Georgios I. Giannopoulos
2020 Mathematical Problems in Engineering  
Further, based on actor-critic reinforcement learning algorithm, an improved generalized predictive control method is proposed for the train network system.  ...  The actor-critic network is used to predict the future output of the system, and the recursive least squares identification algorithm with the variable forgetting factor is adopted to identify the future  ...  Acknowledgments: This research was funded by the Natural Science Foundation of Liaoning Province (grant no. 20180551003).  ... 
doi:10.1155/2020/6913579 fatcat:6tkum3wr6vaqpjci7pm2r55kva
Showing results 1 to 15 out of 309 results