
Data-driven control of hydraulic servo actuator based on adaptive dynamic programming

This research has been supported in part by the Serbian Ministry of Education, Science and Technological Development under grant 451-03-9/2021-14/200108, the National Natural Science Foundation of China under grants 61976081 and 62073001, and the Natural Science Fund for Excellent Young Scholars of Henan Province under grant 202300410127.

Hydraulic servo actuators (HSAs) are widely used in industry for tasks that require high power, high accuracy, and dynamic motion. An HSA is a highly complex nonlinear system whose parameters cannot be determined accurately because of various uncertainties, the inability to measure some parameters, and disturbances. This paper considers the control problem of an HSA with unknown dynamics, based on adaptive dynamic programming (ADP) via output feedback. To make the control algorithm easier to apply in practice, a linear discrete-time model of the HSA is considered, and an online-learning data-driven controller is used that relies on measured input and output data instead of unmeasurable states and unknown system parameters. Hence, the ADP-based data-driven controller in this paper requires knowledge of neither the HSA dynamics nor the exosystem dynamics. The convergence of the ADP-based control algorithm is also shown theoretically. Simulation results verify the feasibility and effectiveness of the proposed approach in solving the optimal control problem of the HSA.
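The iteration behind the abstract can be illustrated with a small sketch. The code below is not the paper's data-driven output-feedback algorithm: it is a model-based, state-feedback policy iteration (Hewer-style) for a discrete-time linear-quadratic problem, which is the kind of iteration that ADP methods typically approximate from measured data. The matrices `A`, `B`, `Q`, `R` and the helper names `solve_dlyap` and `policy_iteration` are hypothetical and chosen only for illustration; they are not the HSA parameters.

```python
import numpy as np

def solve_dlyap(Ac, Qb):
    """Solve P = Ac^T P Ac + Qb by vectorization (adequate for small systems)."""
    n = Ac.shape[0]
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(M, Qb.reshape(-1)).reshape(n, n)

def policy_iteration(A, B, Q, R, K0, iters=50, tol=1e-10):
    """Model-based policy iteration; K0 must stabilize A - B K0."""
    K = K0
    P_prev = None
    for _ in range(iters):
        Ac = A - B @ K
        # Policy evaluation: cost matrix of the current gain K.
        P = solve_dlyap(Ac, Q + K.T @ R @ K)
        # Policy improvement: gain that is greedy with respect to P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        if P_prev is not None and np.linalg.norm(P - P_prev) < tol:
            break
        P_prev = P
    return P, K

# Hypothetical stable second-order system (assumption, not the HSA model).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

# A is stable, so the zero gain is a valid stabilizing initial policy.
P, K = policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))

# Residual of the discrete-time algebraic Riccati equation at the result.
residual = (A.T @ P @ A - P + Q
            - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
print(np.linalg.norm(residual))
```

A small Riccati residual indicates that the iterates have settled at the optimal pair. A data-driven variant would replace the two model-dependent steps with least-squares estimates built from input and output measurements.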

Mathematics Subject Classification: Primary: 49M25, 76-02, 90C39.


Figure 1.  The HSA configuration

Figure 2.  ADP based control algorithm for the discretized HSA system

Figure 3.  Flowchart of ADP based controller design

Figure 4.  Hybrid nature of control signal

Figure 5.  Trajectories of the input and output of the HSA

Figure 6.  Trajectory of states

Figure 7.  Convergence of $\bar{P}_j$ and $\bar{K}_j$ to their optimal values $\bar{P}^*$ and $\bar{K}^*$ during the learning process

Figure 8.  (a) Comparison of the cost functions during learning; (b) Error between the optimal and approximated cost function

Figure 9.  (a) Comparison of the control policies during learning process; (b) Error between the optimal and approximated input signal

Table 1.  Parameters of the HSA

| Notation | Description |
|---|---|
| $x_v$ | Spool valve displacement |
| $p_a$, $p_b$ | Forward and return pressures |
| $q_a$, $q_b$ | Forward and return flows |
| $y$ | Piston displacement |
| $L$ | Piston stroke |
| $K_e$ | Load spring gradient |
| $p_S$, $p_0$ | Supply and tank pressures |
| $m_t$, $m_p$, $m$ | Total mass, piston mass, payload mass |
| $F_f$ | Friction forces |
| $F_{ext}$ | Disturbance forces |
| $A_a$, $A_b$ | Effective areas of the head and rod piston sides |
| $V_a$, $V_b$, $V_{a0}$, $V_{b0}$ | Fluid volumes of the head and rod piston sides and corresponding initial volumes |
| $q_{Li}$, $q_{Le}$ | Internal and external leakage flows |
| $\beta_e$ | Bulk modulus of the fluid |
