Adaptive ankle impedance control for bipedal robotic upright balance

Upright balance control is a fundamental skill of bipedal robots for various tasks that are usually performed by human beings. Conventional robotic control is often realized by developing accurate dynamic models using a series of fixed torque-ankle states, but its success depends on accurate physical and kinematic models. This can be particularly challenging when external disturbing forces are present, which is common in unstructured robotic working environments, leading to ineffective robotic control. To address this limitation, this paper presents an adaptive ankle impedance control method supported by advances in adaptive fuzzy inference systems, by which the desired ankle torques are generated in real time to adaptively meet the dynamic control requirements. In particular, the control method is first initialised with specific external disturbing forces representing a general situation, and then evolves whilst operating in a real-world working environment by acting on the feedback from the control system. This is implemented by initialising a rule base for a typical situation, and then allowing the rule base to evolve to suit specific robotic working environments. This closed-loop feedback and action mechanism configures the control system in a timely and effective manner to meet the dynamic control requirements. The proposed control method was applied to a bipedal robot on a moving vehicle for system validation and evaluation, with robotic loads ranging from 0 to 1.65 kg and external disturbances in terms of vehicle acceleration ranging from 0.5 to 1.5 m/s², leading to robotic swing angles of up to 7.6° and anti-disturbance timespans of up to 8.5 s. These experimental results demonstrate the power of the proposed upright balance control method in improving the robustness, and thus applicability, of bipedal robots.

First, bipedal robots often work in an unstructured environment, with the presence of various uncertain external disturbances. Second, a bipedal robot has a relatively high Centre of Mass (CoM) and is equipped with a relatively small base of support for flexible movement (Yin et al., 2021). Therefore, effective upright balance control of bipedal robots represents one of the main challenges in the field of bipedal robotic research (Rajasekaran et al., 2015). Recent studies on bipedal robotic balance control have led to multiple control strategies, including the stepping strategy, the hip control strategy, and the ankle control strategy (Jeong et al., 2019). The hip control strategy and the stepping strategy are usually employed when significant external disturbances are present, whereas the ankle control strategy has been widely used in upright balance control, where the external disturbances are relatively small (Yin et al., 2020). This paper targets robotic upright balance control, and hence focuses on the ankle control strategy. The majority of implementations of the ankle control strategy are realized by developing accurate physical and kinematic models (Tokur et al., 2020).
There are generally three groups of approaches for bipedal robotic upright balance control based on the ankle control strategy. The first group of approaches focuses on the bipedal robotic Zero Moment Point (ZMP), which is defined as the point on the robotic foot at which the resultant moment acting on the bipedal robot, due to gravity and the ground reaction force, is equal to zero in the upright state (Al-Shuka et al., 2016). This approach utilizes the joint torque as the feedback information to estimate the actual bipedal robotic ZMP in real time. Then, a control law is designed based on the actual ZMP to adjust the joint torque of the bipedal robot, in an effort to continuously reduce the gap between the expected ZMP and the actual value (Vukobratović et al., 2006). This group of approaches has been utilized by a large number of bipedal robotic upright balance control projects, with many successful cases reported in the literature (Al-Shuka et al., 2016; Ando et al., 2021; Shin & Kim, 2011).
Despite this success, some limitations remain: (1) this approach assumes that the bipedal robotic feet are planar and have sufficient friction to keep the feet from sliding; (2) the real-time bipedal robotic ZMP estimated by torque sensors usually lags behind the actual attitude changes, resulting in delayed actions in the control loop.
The second group of approaches is based on the law of momentum conservation, and simultaneously adjusts the linear and angular momentum to perform bipedal robotic upright balance control (Azad & Featherstone, 2016; Lee & Goswami, 2012). For instance, Hinata and Nenchev (2018) proposed a decomposition momentum control system which deals with linear momentum and angular momentum at the same time; this has been developed as a balance control framework and applied to the whole robotic body. In addition, Liu et al. (2018) proposed an angular momentum inverted pendulum model to realize effective bipedal robotic upright balance control and to better handle potential sudden changes of angular momentum. By studying the combinational configuration of linear and angular momenta, a practical method of humanoid-ground interaction was proposed for the generation of the joint torques. Given that the success of this momentum balance approach for bipedal robotic upright balance control is largely dependent on accurate momentum information derived from a robotic dynamic model, further improvement to this approach can be very challenging.
The third group of approaches for bipedal robotic balance control takes advantage of advances in artificial intelligence, and especially its sub-field of machine learning, thanks to their generality and thus great adaptability to complex problems (He & Dong, 2017; Yang, Jiang, et al., 2018). For example, a neural network was established to obtain the relationship between the CoM of a human body, joint angle, and its velocity (Jiang et al., 2016), and the developed model was then applied to bipedal robotic balance control. In addition to standard approaches, adaptive control approaches have also been developed to meet individual needs. For instance, an adaptive control approach was developed for personalized robotic exoskeleton upright balance to adaptively support various robot shapes and robotic tasks (Yin et al., 2019), and a hybrid autonomous controller was developed using deep reinforcement learning for bipedal robotic balance control (Kouppas et al., 2021). Despite the promising results, these approaches either do not target dynamic external forces, or are of limited applicability due to the prerequisite of a large labelled dataset for model training, which may not always be available and can be very difficult to obtain in practice. This work proposes an adaptive ankle impedance control method for bipedal robotic upright balance, aiming to mitigate the aforementioned limitations. Note that the structural variability and system non-linearity of bipedal robots usually present great challenges to robotic motion control (Yang, Bellingham, et al., 2018). Therefore, following common practice, this work also ignores the non-direct factors in robotic upright balance control, such as body bending, stepping, and arm swing. This effectively simplifies robotic upright balance control to an ankle impedance model representing the dynamic relationship between the robotic ankle and the working environment.
In particular, the impedance gains are adaptively updated according to the bipedal robotic balance states, and the impedance model is used to calculate the ankle anti-disturbance torque. Here, the ankle dynamic torque is estimated by the constructed inverted pendulum (IP) model of the bipedal robot. These two interconnected components are then used jointly to obtain the desired ankle torque for bipedal robotic upright balance control. The adaptive update of the impedance gains is performed by employing the experience-based fuzzy rule interpolation (E-FRI) system. Briefly, the E-FRI system in this work evolves the rule base with the performance feedback provided by the control system, to address the lack of a training dataset or expert knowledge about bipedal robotic upright balance control, and to dynamically respond to the unstructured working environment.
The E-FRI is also used to enable effective updates of the impedance gains, which improves the robustness of the control system.
The proposed control method was validated through experimentation based on simulations on the OpenSim platform. Various external disturbances and robotic loads were implemented in the experiments to simulate real-world situations, and the experimental results demonstrate the power of the proposed control method for bipedal robotic upright balance with improved robustness. The novel contributions of this work are twofold: (1) proposing an adaptive ankle impedance control method for bipedal robotic upright balance control in response to disturbance from the robotic working environment, (2) developing an adaptive impedance gain update method through the application of E-FRI.
The rest of this paper is organized as follows. Section 2 reviews the technical background on the impedance model and fuzzy rule interpolation. The proposed adaptive ankle impedance control method is detailed in Section 3. Section 4 reports the application of the proposed control method to a bipedal robot on a moving vehicle, and analyses the experimental results. The paper is concluded in Section 5, with possible directions of future work suggested.

2 | BACKGROUND
The technical background of the proposed adaptive ankle impedance control method, including impedance control and fuzzy rule interpolation, is reviewed in this section.

2.1 | Impedance model
The impedance model resembles a virtual spring-damper system representing the dynamic relationship between the robotic end-effector and the working environment, which ensures that the robot interacts with the environment in an energy-efficient and safe way (Song et al., 2019). The impedance model can thus estimate the external force from the environment on the robotic end-effector. The standard impedance model in the time domain can be defined as: where $x_d$ denotes the desired trajectory or the goal positions, and $x_t$ is the actual trajectory of the robotic end-effector; $M$, $B$, and $K$ refer to the desired inertia, damping, and stiffness matrices, respectively, which are referred to as the impedance gains; $F(t)$ is the actual external force from the environment. When the impedance gains are specified, the external force can be determined from the difference between the desired trajectory and the actual trajectory.
The impedance model provides a measure of the dynamic compliance of the robotic end-effector, and this compliance does not rely on the intrinsic property of robotic end-effector hardware. Therefore, the impedance model can be modified according to the designed control law.
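As a minimal illustrative sketch of how an impedance model maps a trajectory error to an external force, the Python snippet below evaluates a mass-damper-spring relation for given impedance gains. The specific form $F = M\ddot{e} + B\dot{e} + Ke$ with $e = x_d - x_t$, its sign convention, and the function name `impedance_force` are assumptions made for illustration only, and may differ from Equation (1).

```python
import numpy as np


def impedance_force(M, B, K, e, de, dde):
    """Force implied by a mass-damper-spring impedance model.

    ASSUMED form (illustrative, sign convention not taken from the paper):
        F = M * dde + B * de + K * e,   with e = x_d - x_t.

    M, B, K : scalars or (n x n) gain matrices (inertia, damping, stiffness).
    e, de, dde : trajectory error and its first and second time derivatives.
    """
    # Promote scalars to 1x1 matrices / length-1 vectors so that the same
    # code handles the single-axis and multi-axis cases.
    M, B, K = (np.atleast_2d(np.asarray(g, dtype=float)) for g in (M, B, K))
    e, de, dde = (np.atleast_1d(np.asarray(v, dtype=float)) for v in (e, de, dde))
    return M @ dde + B @ de + K @ e


# Single-axis example: a 1 cm static error against a stiffness of 50 N/m.
force = impedance_force(M=1.0, B=2.0, K=50.0, e=0.01, de=0.0, dde=0.0)
```

Because the gains enter only as parameters of this relation, changing them changes the apparent compliance of the end-effector without any change to the hardware, which is the property exploited by variable impedance control.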
Depending on the applications of the robots, there are two groups of approaches to implement impedance control, namely constant impedance control and variable impedance control (Abu-Dakka & Saveriano, 2020). In particular, the constant impedance control is realized with predefined constant impedance. In many tasks, the robotic end-effector needs to vary its impedance along with the execution of the task. For instance, a bipedal robot in an unstructured environment often suffers from external perturbation, and the bipedal robot needs to adapt its joint impedance to accommodate the degree of the external perturbation for upright balance.
The variable impedance control approaches focus on the search for appropriate impedance gains according to the robotic control states, with the block scheme of variable impedance control shown in Figure 1. In this figure, $\tau_t$ represents the torque applied to the robotic end-effector, and $q_t$ is the robotic joint angle. Based on such a control approach, the impedance gains are updated online using an impedance adaptation strategy, which can be fulfilled by imitation learning, iterative learning, reinforcement learning, or a predefined model (Duan et al., 2018; Roveda et al., 2020). The impedance gain has three parameters, namely inertia, damping, and stiffness. In particular, the inertia and damping can be retrieved from a parameter update model. In general, the stiffness is increased when a high trajectory tracking error is present, allowing the robotic end-effector to track the desired trajectory more accurately (Yang et al., 2017). The following predefined model is an example of stiffness update: where $k_0$ is a predefined small stiffness, $e_t$ represents the trajectory tracking error, and $\alpha$ indicates a positive gain. Various variable impedance control approaches have been investigated in the past decades for a wide range of robotic applications, such as industry operations, human-robot cooperation, and rehabilitation (Duan et al., 2018; Li et al., 2018; Roveda et al., 2020).
FIGURE 1 The block scheme of variable impedance control
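The idea of raising the stiffness with the tracking error can be sketched in a few lines of Python. The affine rule $k = k_0 + \alpha\,|e_t|$ used here is only one plausible instance of a predefined stiffness update model; it is an assumption for illustration and is not claimed to be the form of Equation (2). The function name `update_stiffness` is likewise hypothetical.

```python
def update_stiffness(k0: float, alpha: float, e_t: float) -> float:
    """Return a stiffness that grows with the magnitude of the tracking error.

    ASSUMED rule (illustrative only): k = k0 + alpha * |e_t|.

    k0    : predefined small baseline stiffness (must be positive).
    alpha : positive gain controlling how fast the stiffness grows.
    e_t   : current trajectory tracking error.
    """
    if k0 <= 0.0 or alpha <= 0.0:
        raise ValueError("k0 and alpha must both be positive")
    return k0 + alpha * abs(e_t)


# The stiffness stays at the baseline when tracking is perfect and
# increases monotonically as the error grows in either direction.
stiffness_profile = [update_stiffness(10.0, 200.0, e) for e in (0.0, 0.01, -0.05)]
```

Any monotonically increasing function of the error magnitude would serve the same purpose; the affine choice simply keeps the example easy to check by hand.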

2.2 | Fuzzy rule interpolation
Fuzzy inference is a mechanism that formulates a mapping from an input domain to an output domain for a given problem using fuzzy logic and fuzzy set theory. Fuzzy inference is regarded as a powerful tool for representing non-linear functions in control and other applications involving uncertain and non-linear systems, based on fuzzy rule bases. Fuzzy rule bases are translated from prior knowledge, which can be either in the form of expert knowledge or labelled data (Cordón et al., 2001). Given an input, conventional fuzzy inference can generate an output using the fuzzy rule base by firing those rules whose antecedents overlap with the given input. However, due to the imbalanced distribution and sparsity of prior knowledge, the generated rule base may not cover the entire problem domain. Under such circumstances, conventional fuzzy inference systems will fail to produce conclusions, as the given inputs are not covered by any rules in the rule base. Fuzzy rule interpolation (FRI) enhances the power of conventional fuzzy inference by allowing the interpolation or extrapolation of conclusions when the inputs are not covered by the rule base (Kóczy & Hirota, 1993; Yang & Shen, 2013).
Theoretically, FRI is a fuzzy extension of linear interpolation or extrapolation using fuzzy sets and fuzzy logic. Consequently, FRI takes advantage of both linear interpolation/extrapolation for knowledge generalization and fuzzy logic for uncertainty management. Given an input which is not covered by any fuzzy rule in the rule base, the two closest neighbouring rules are selected based on a specific distance metric to allow the interpolation or extrapolation of a conclusion. For instance, the popular transformation-based fuzzy interpolation approach first generates an intermediate rule such that its antecedent is as 'close' to the given input as possible based on the distance metric (Huang & Shen, 2006). An interpolated rule is then produced by ensuring that the shape transformation between the antecedents of the intermediate rule and those of the interpolated rule is the same as that between the conclusion of the intermediate rule and that of the interpolated rule.
After years of research, FRI has been further developed from different perspectives to improve its performance and applicability, such as adaptive FRI (Chen & Barman, 2019; Yang et al., 2016), dynamic FRI (Naik et al., 2017), rough-FRI (Chen et al., 2016), experience-based FRI (E-FRI), and other advances of FRI. In particular, E-FRI was developed to address the lack of prior knowledge, either in the form of expert knowledge or labelled data. E-FRI first initialises a rule base from the limited available labelled data or expert knowledge. The rule base is then improved during operation. A traditional single-input and single-output rule $R_i$ can be expressed as: where $A_i$ and $B_i$ are fuzzy sets, $EF_i$ refers to an experience factor indicating the usage information of the particular rule, $CD_i$ stands for the cooling down factor representing the time for which the particular rule has not been used for FRI, and $w_i$ represents the inherent weight of the particular rule. The inherent weight is decided by the two factors $EF_i$ and $CD_i$: where $a$, $b$ and $n$ are sensitivity factors. The values of these factors are usually problem-specific.
For a specific given input, the importance factor (IF) of a particular rule can be represented as a function of the inherent weight and the distance between the rule antecedents and the given input based on a certain distance metric: where $d_i$ represents the distance between the rule antecedents and the given input, and $n$ is the number of rules in the current rule base. From this, the two most 'informative' rules can be identified to perform FRI.
The rule base evolves through a rule base revision mechanism whilst the FRI performs inferences. In particular, the performance of a particular rule through FRI is provided by system feedback, and is used to update the inherent weight of the rule. Depending on the performance, there are three types of revisions to evolve the rule base, namely updating the weights of existing rules, removing useless rules, and adding new rules.
These three types of rule base revisions jointly ensure the freshness, effectiveness, and conciseness of the rule base.
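To illustrate how an importance factor can be used to pick the two most 'informative' rules, the following Python sketch ranks rules by a score that increases with the inherent weight and decreases with the distance to the input. The score used here, the weight multiplied by a normalised inverse distance, is an assumption made purely for illustration and is not claimed to be Equation (5); the function name and the small constant `eps` are also hypothetical.

```python
def select_informative_rules(weights, distances, eps=1e-9):
    """Return the indices of the two highest-scoring rules and all scores.

    ASSUMED importance factor (illustrative only):
        IF_i = w_i * (1 / (d_i + eps)) / sum_j (1 / (d_j + eps))

    weights   : inherent weight w_i of every rule in the rule base.
    distances : distance d_i between each rule antecedent and the input.
    eps       : small constant guarding against division by zero.
    """
    if len(weights) != len(distances):
        raise ValueError("weights and distances must have the same length")
    if len(weights) < 2:
        raise ValueError("at least two rules are needed for interpolation")

    closeness = [1.0 / (d + eps) for d in distances]
    total = sum(closeness)
    importance = [w * c / total for w, c in zip(weights, closeness)]

    # Sort rule indices from the most to the least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return ranked[0], ranked[1], importance


# With equal weights the two nearest rules win; a heavily used rule can
# still be preferred over a slightly closer but rarely used one.
nearest = select_informative_rules([1.0, 1.0, 1.0], [0.1, 0.5, 0.2])[:2]
weighted = select_informative_rules([1.0, 10.0, 1.0], [0.1, 0.5, 0.2])[:2]
```

The point of the sketch is the trade-off itself: selection is driven jointly by how close a rule is to the input and by how well it has performed in the past.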

3 | ADAPTIVE ANKLE IMPEDANCE CONTROL METHOD
The proposed adaptive ankle impedance control method is used to estimate the desired ankle torque ($\tau_q$) for bipedal robotic upright balance control. It comprises three major components: an inverted pendulum model to estimate the ankle dynamic torque, an adaptive impedance model to calculate the ankle anti-disturbance torque, and an impedance gain update component to revise the impedance gains in real time according to the bipedal robotic balance states. An overall view of the proposed system and the three major components are detailed in this section.

3.1 | Method overview
The overall architecture of the proposed bipedal robotic system with the proposed adaptive ankle impedance control method is illustrated in Figure 2. The control approach proposed herein is implemented using the ankle control strategy as discussed in Section 1, and thus the humanoid robot is driven by the torque actuator of the ankle with the desired ankle torque ($\tau_q$) provided by the control system. The ankle torque ($\tau_q$) is obtained through two control loops in parallel, denoted as the dynamic loop and the impedance loop, with the dynamic model and the impedance model being the main components of these two control loops, respectively. Both control loops take the robotic ankle angle ($\theta_{foot}$) and its velocity ($\dot{\theta}_{foot}$) as inputs, which effectively represent the robotic upright balance states. The inverted pendulum model in the dynamic loop estimates the ankle dynamic torque ($\tau_r$), which is detailed in Section 3.2; the impedance loop calculates the anti-disturbance torque ($\tau_e$) by using the impedance model, which is detailed in Section 3.3. Note that the impedance model uses three key parameters, that is, the impedance gains of stiffness, damping and inertia ($K$, $B$, $M$), which are dynamically generated by applying the experience-based fuzzy rule interpolation approach according to the bipedal robotic balance states. The impedance gain update mechanism is detailed in Section 3.4. The combination of the outputs of these two control loops (i.e., $\tau_r$ and $\tau_e$) forms the final output of the control system, that is, the desired ankle torque ($\tau_q$). This torque is then passed to the robotic ankle torque actuator to drive the bipedal robot to maintain upright balance.

3.2 | Inverted pendulum model
The establishment of an accurate dynamic model for a bipedal robot can be very challenging, due to the complexity of the force information of such a complex multi-link system. In order to focus on the study of ankle strategy-based bipedal robotic upright balance control, non-direct factors such as arm swing, body bending, and stepping are ignored in this work. In this case, the bipedal robot can be simplified as an Inverted Pendulum (IP) model which swings around the ankle joint. Therefore, this work takes the bipedal robotic upright balance control process as an IP swinging process. For the convenience of analysis, an x-y coordinate system for the IP model is established to better represent the bipedal robotic dynamic model. Specifically, the position of the robotic ankle joint in its initial state is taken as the origin, and the angle of the swing of the body, that is, the ankle angle, is denoted as $\theta_{foot}$, as illustrated in Figure 3.
FIGURE 2 The overall architecture of adaptive ankle impedance control for bipedal robotic upright balance
According to the established x-y coordinate system, when the bipedal robot is subject to a small external disturbance, the differential equation of the robot rotating around the ankle joint can be expressed as: where $F_x$ and $F_y$ represent the horizontal and vertical forces at the CoM of the bipedal robot, respectively. Here, the horizontal movement of the CoM of the bipedal robot can be expressed as: This equation can be rewritten as: Similarly, the vertical movement of the CoM of the bipedal robot can be represented as: which can be rewritten as: Applying Equations (8) and (10) to Equation (6), the dynamic torque $\tau_r$ during bipedal robotic upright balance control can be concisely expressed as: The desired ankle torque $\tau_q$ can thus be calculated as the summation of the dynamic torque $\tau_r$ and the anti-disturbance torque $\tau_e$: $\tau_q = \tau_r + \tau_e$ (12), where the anti-disturbance torque is estimated by the ankle impedance model, as detailed in the next section.
FIGURE 3 The inverted pendulum model of the bipedal robot
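The role of the dynamic loop can be illustrated with a short Python sketch based on the textbook point-mass inverted pendulum. This is a simplified stand-in under stated assumptions, not a reproduction of Equations (6) to (11): the angle is measured from the vertical, the body is lumped into a mass at distance `length` from the ankle, positive torque acts in the direction of increasing angle, and all function and parameter names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def dynamic_torque(theta, ddtheta, mass, length, inertia=0.0, g=G):
    """Ankle torque needed for a point-mass inverted pendulum to follow ddtheta.

    ASSUMED textbook model (illustrative only):
        (inertia + mass * length**2) * ddtheta = tau_r + mass * g * length * sin(theta)

    theta   : ankle (swing) angle from the vertical, in rad.
    ddtheta : angular acceleration, in rad/s^2.
    mass    : lumped body mass in kg; length : ankle-to-CoM distance in m.
    """
    return (inertia + mass * length ** 2) * ddtheta - mass * g * length * math.sin(theta)


def desired_ankle_torque(tau_r, tau_e):
    """Sum the dynamic torque and the anti-disturbance torque."""
    return tau_r + tau_e


# Holding a static lean of 0.1 rad requires a torque that opposes gravity.
tau_r = dynamic_torque(theta=0.1, ddtheta=0.0, mass=60.0, length=0.9)
tau_q = desired_ankle_torque(tau_r, tau_e=0.0)
```

In the upright equilibrium the dynamic torque vanishes, so under this model any remaining torque demand comes from the anti-disturbance term supplied by the impedance loop.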

3.3 | Impedance model
The impedance model aims to simulate the movements of robotic joints using a mass-damper-spring system to react to external disturbance. The schematic diagram of the impedance model representing the bipedal robotic ankle joint in this work is illustrated in Figure 4. In the Cartesian coordinate system, the impedance model derives the force $f_e$ based on the input position error caused by the disturbance from the external environment, which is expressed as: where $k$, $b$ and $m$ refer to the stiffness, damping and inertia gains in the impedance model, respectively, and $x_e$ is the offset of the CoM of the bipedal robot on the x-coordinate. Note that the Jacobian matrix $J(\theta_{foot})$ describes the coordinate transformation from the Cartesian coordinate system to the polar coordinate system, which is defined as: where $\theta_{foot}$ is the ankle angle of the bipedal robot. From this, the anti-disturbance torque can be computed as: The kinematic relationship between the representation in the IP model and that using the impedance model can be described as: where $l$ is the distance between the ankle joint and the CoM of the bipedal robot. In the case of bipedal robotic upright balance control, the robotic ankle angle $\theta_{foot}$, that is, the swing angle, is usually marginal, such that $\sin\theta_{foot} \approx \theta_{foot}$. Therefore, the anti-disturbance torque as expressed in Equation (15) can be rewritten as: where $\theta_e = \theta_{foot} - \theta_{ref}$ is the ankle offset angle, and $\theta_{ref}$ refers to the ankle reference angle, which is the equilibrium position of the bipedal robot.
FIGURE 4 The bipedal robot ankle joint impedance model schematic diagram
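The chain from a Cartesian impedance force to an ankle torque, and the effect of the small-angle approximation, can be checked numerically with the Python sketch below. It rests on assumptions that should be read as illustrative rather than as the paper's equations: the CoM offset is taken as $x_e = l\sin\theta$, the Jacobian as $J = l\cos\theta$, the force as $f_e = k x_e + b\dot{x}_e + m\ddot{x}_e$, the torque as $\tau_e = J f_e$, and the reference angle as zero. The function names are hypothetical.

```python
import math


def anti_disturbance_torque(theta, dtheta, ddtheta, k, b, m, l):
    """Ankle torque from a Cartesian impedance model (ASSUMED forms, see text).

    k, b, m : Cartesian stiffness, damping and inertia gains.
    l       : distance from the ankle joint to the CoM.
    """
    # CoM offset along x and its time derivatives for x_e = l * sin(theta).
    x_e = l * math.sin(theta)
    dx_e = l * math.cos(theta) * dtheta
    ddx_e = l * math.cos(theta) * ddtheta - l * math.sin(theta) * dtheta ** 2

    f_e = k * x_e + b * dx_e + m * ddx_e   # Cartesian impedance force
    jacobian = l * math.cos(theta)          # d(x_e) / d(theta)
    return jacobian * f_e


def anti_disturbance_torque_small_angle(theta_e, dtheta_e, ddtheta_e, k, b, m, l):
    """Small-angle version with joint-space gains K = k*l^2, B = b*l^2, M = m*l^2."""
    K, B, M = k * l ** 2, b * l ** 2, m * l ** 2
    return K * theta_e + B * dtheta_e + M * ddtheta_e


# For a swing angle of about one degree the two versions agree closely.
state = dict(k=500.0, b=20.0, m=1.0, l=0.9)
tau_full = anti_disturbance_torque(0.02, 0.1, 0.5, **state)
tau_small = anti_disturbance_torque_small_angle(0.02, 0.1, 0.5, **state)
```

Under these assumptions the joint-space gains are simply the Cartesian gains scaled by $l^2$, which is why a single set of ankle-level gains is sufficient for small swing angles.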
Expanding Equation (17) leads to: where $K$, $B$, and $M$ refer to the stiffness, damping, and inertia gains in the robotic ankle impedance model, respectively. According to Equations (17) and (18), these values can be expressed as: Applying Equations (11) and (18) to Equation (12), the desired ankle torque can be re-expressed as:

3.4 | Impedance gain update
The impedance gain needs to be constantly updated to maintain bipedal robotic upright balance subject to random external disturbance. The study of human upright balance control reveals that human ankle impedance is controlled by the central nervous system according to the motion states, which can be simulated on the mechanical ankles of humanoid robots (Pang et al., 2017). Inspired by this, this work proposes an impedance gain updating method by employing E-FRI, as introduced in Section 2.2, based on the bipedal robotic ankle angle information. The complexity of the proposed adaptive ankle impedance control strategy largely depends on the complexity of the E-FRI, and the reasonably low complexity of the E-FRI has been empirically confirmed by its successful applications. Therefore, the complexity of the proposed system is expected to be sufficiently low to support online robotic control, which is evidenced by the experiments reported in Section 4. The framework of the adapted E-FRI, as shown in Figure 5, consists of three major parts, namely rule base initialisation, fuzzy rule interpolation, and rule base revision. Specifically, the rule base is initialised using limited expert knowledge about robotic upright balance control based on a 'standard human body'. For a given robotic state, that is, a specific ankle joint angle, the two most 'informative' rules, based on a certain metric, are selected to perform FRI.
In this work, the scale and move transformation-based fuzzy rule interpolation is applied to perform FRI due to its wide application and effectiveness for impedance gain estimation. The implementation of these three main components is detailed below.

3.4.1 | Rule base initialisation
There is a reasonable understanding of human upright balance control, but this knowledge is not complete. Therefore, the rule base in this work is initialised partially with such limited expert knowledge. Note that this work only considers the ankle impedance adjustment mechanism with the support of bipedal robotic upright balance states. In particular, the upright balance state is mainly represented by the ankle joint information, i.e. the ankle angle and its velocity. Other factors may also marginally influence the ankle impedance, but these are not the predominant factors, and thus are not considered in this work.
FIGURE 5 Experience-based fuzzy interpolation for impedance gain update
Biological research suggests that the square root of the joint mechanical stiffness is linearly related to the joint mechanical damping value (Yang et al., 2017). Therefore, the target damping value can be estimated by: $B = v\sqrt{K}$ (21), where $v$ is a pre-defined constant coefficient. In the process of bipedal robotic upright balance control, the robotic ankle inertia value changes marginally, so this work sets the inertia value as a constant coefficient. Therefore, the stiffness value in the impedance model is the only variable requiring constant updating. Consequently, a typical fuzzy rule $R_i$ in the rule base has two inputs and one output: where $A_{1i}$, $A_{2i}$ and $B_i$ are fuzzy sets, $x_1$ and $x_2$ represent the ankle angle and its velocity, and $y$ stands for the stiffness value in the impedance model.
The meanings of $w_i$, $EF_i$ and $CD_i$ are detailed in Section 2.2. $EF_i$ is initialised based on experience; in general, a smaller EF value may lead to unexpected rule removal, and a larger EF value may result in a longer convergence time. $CD_i$ is initialised as 0, and then $w_i$ can be calculated using Equation (4). Without loss of generality, triangular membership functions are utilized in this work to represent fuzzy sets. Note that it is impossible to implement a complete rule base utilizing the limited expert knowledge about bipedal robotic upright balance control due to the complexity of the problem; that is, the initialised rule base should be adaptively revised for enhanced performance using the E-FRI approach.
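One possible in-memory representation of such a rule base is sketched below in Python. The class and field names are hypothetical, the representative value of a triangular fuzzy set is taken as the mean of its three defining points (a common choice in transformation-based FRI, assumed here), and the numeric values in the example rule are placeholders rather than values from this work. The damping helper implements the square-root relationship between damping and stiffness described above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzySet:
    """Triangular membership function defined by points a <= b <= c."""
    a: float
    b: float
    c: float

    def __post_init__(self):
        if not (self.a <= self.b <= self.c):
            raise ValueError("points must satisfy a <= b <= c")

    def membership(self, x: float) -> float:
        """Degree to which x belongs to the fuzzy set, in [0, 1]."""
        if x == self.b:
            return 1.0
        if x <= self.a or x >= self.c:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.c - x) / (self.c - self.b)

    def representative(self) -> float:
        """ASSUMED representative value: mean of the three defining points."""
        return (self.a + self.b + self.c) / 3.0


@dataclass
class Rule:
    """Two-input, one-output rule with its experience-related parameters."""
    A1: TriangularFuzzySet   # antecedent on the ankle angle
    A2: TriangularFuzzySet   # antecedent on the ankle angular velocity
    B: TriangularFuzzySet    # consequent: stiffness of the impedance model
    EF: int                  # experience factor
    CD: int = 0              # cooling down factor, initialised as 0
    w: float = 0.0           # inherent weight, computed from EF and CD


def damping_from_stiffness(K: float, v: float) -> float:
    """Target damping B = v * sqrt(K) for a non-negative stiffness K."""
    if K < 0.0:
        raise ValueError("stiffness must be non-negative")
    return v * math.sqrt(K)


# Placeholder rule: small angle and small velocity map to a low stiffness.
example_rule = Rule(
    A1=TriangularFuzzySet(0.0, 1.0, 2.0),
    A2=TriangularFuzzySet(0.0, 2.0, 4.0),
    B=TriangularFuzzySet(80.0, 100.0, 120.0),
    EF=5,
)
```

Keeping EF, CD and the weight alongside each rule makes the later revision step a matter of updating these fields in place.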

3.4.2 | Fuzzy rule interpolation
As described above, the initialised rule base is very sparse. Assume that a new robotic upright balance state, including the ankle angle ($\theta_{foot}$) and its velocity ($\dot{\theta}_{foot}$), is available, and that this new input, as represented by fuzzy sets $A_1^{*\prime}$ and $A_2^{*\prime}$, is not covered by the initialised rule base. Then the two most 'informative' rules regarding this new input are identified using Equation (5), to support the interpolation of the stiffness value for the given new robotic upright balance state. Suppose that the two most 'informative' rules have been selected and denoted as $R_i$ and $R_j$ in the sparse rule base for interpolation, and denote the interpolated result as $B^{*}$. An intermediate rule $R^{*}$ is interpolated first using analogy-based reasoning, which can be expressed as: where $A_1^{*\prime}$ and $A_1^{*}$ have the same representative value, and so do $A_2^{*\prime}$ and $A_2^{*}$. From this, the shape difference between the given input and the antecedent of the intermediate rule can be estimated using a transformation metric in multiple ways (Salahshour & Allahviranloo, 2013). In this work, the move and scale transformation-based approach is applied due to its effectiveness, and more details can be found in Huang and Shen (2006). Once the transformations, including the move rate ($M$) and the scale rate ($S$), are computed, the interpolated consequence $B^{*}$ can be calculated by applying $S$ and $M$ to the consequence $B^{*\prime}$ of the intermediate rule. The defuzzified consequence $B^{*}$ is then used to update the stiffness value of the impedance model.
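The first stage of the interpolation, the construction of an intermediate rule from two selected rules, can be sketched as follows in Python. This is a deliberately reduced illustration under several assumptions: fuzzy sets are triangles stored as `(a, b, c)` tuples, the representative value is the mean of the three points, the consequent ratio is the average of the per-antecedent ratios, and the scale and move transformations of the full method are omitted, so the output is the defuzzified intermediate consequence only. All names are hypothetical.

```python
def rep(fuzzy_set):
    """ASSUMED representative value of a triangular set (a, b, c)."""
    return sum(fuzzy_set) / 3.0


def interpolate_stiffness(rule_i, rule_j, observation):
    """Interpolate a crisp stiffness from two rules for an uncovered input.

    rule_i, rule_j : tuples (A1, A2, B) of triangular sets (a, b, c).
    observation    : tuple (A1_obs, A2_obs) describing the new balance state.

    Simplification: the scale and move transformations are NOT applied.
    """
    ratios = []
    for idx in range(2):
        r_i, r_j = rep(rule_i[idx]), rep(rule_j[idx])
        if r_i == r_j:
            ratios.append(0.5)  # neutral choice when both rules coincide
        else:
            ratios.append((rep(observation[idx]) - r_i) / (r_j - r_i))

    # Average the per-antecedent ratios to position the consequent.
    lam = sum(ratios) / len(ratios)
    b_i, b_j = rule_i[2], rule_j[2]
    intermediate_b = tuple((1.0 - lam) * p + lam * q for p, q in zip(b_i, b_j))
    return rep(intermediate_b)


# An observation halfway between two rules yields a stiffness halfway
# between their consequents.
rule_lo = ((0.0, 1.0, 2.0), (0.0, 1.0, 2.0), (10.0, 20.0, 30.0))
rule_hi = ((4.0, 5.0, 6.0), (4.0, 5.0, 6.0), (50.0, 60.0, 70.0))
stiffness = interpolate_stiffness(rule_lo, rule_hi, ((2.0, 3.0, 4.0), (2.0, 3.0, 4.0)))
```

A ratio outside the interval from 0 to 1 corresponds to extrapolation beyond the two selected rules, which the same expression handles without modification.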

3.4.3 | Rule base revision
Note that the rule base generated from the limited expert knowledge on an exemplar situation may not perform well for a specific robot with a different working environment, weight, or body shape. Therefore, the fuzzy rule base must be revised based on the feedback from the immediate previous upright balance control performance whilst it performs fuzzy rule interpolation. The rule base revision mechanism is then designed to guarantee that the impedance gain is estimated accurately.
The working process of the rule base revision involves two fundamental operations, namely rule parameter updating and similarity degree calculation. Once an interpolated rule is produced, the rule parameters, including $CD$, $EF$, and $w_i$, will be updated. In particular, if a rule has been selected to perform FRI, the CD value of the rule will be reset to 0. If the performance feedback is positive, the EF value of the selected rule will be increased by 1; otherwise, the EF value will be decreased by 1. Meanwhile, the CD values of all unselected rules in the current rule base will be increased by 1, and their EF values will remain the same. From this, the weights of all rules in the current rule base will be re-calculated according to Equation (4).
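The parameter update described above translates almost directly into code. The Python sketch below follows the stated steps; the weight calculation is passed in as a function because Equation (4) is not reproduced here, and the `placeholder_weight` function is a hypothetical stand-in used only to make the example runnable. The class and function names are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Experience-related parameters attached to one rule."""
    EF: int          # experience factor
    CD: int = 0      # cooling down factor
    w: float = 0.0   # inherent weight


def update_parameters(rules, selected, positive_feedback, weight_fn):
    """Update CD, EF and w of every rule after one interpolation.

    rules             : list of RuleStats, modified in place.
    selected          : set of indices of the rules used for the interpolation.
    positive_feedback : True if the control performance feedback was positive.
    weight_fn         : callable (EF, CD) -> weight, standing in for Equation (4).
    """
    for idx, rule in enumerate(rules):
        if idx in selected:
            rule.CD = 0                                   # rule was just used
            rule.EF += 1 if positive_feedback else -1     # reward or penalise
        else:
            rule.CD += 1                                  # rule keeps cooling down
        rule.w = weight_fn(rule.EF, rule.CD)              # refresh every weight


def placeholder_weight(ef, cd):
    """HYPOTHETICAL weight: grows with EF and decays with CD (not Equation (4))."""
    return ef / (1.0 + cd)


rule_stats = [RuleStats(EF=5), RuleStats(EF=5), RuleStats(EF=5, CD=2)]
update_parameters(rule_stats, selected={0, 1}, positive_feedback=True,
                  weight_fn=placeholder_weight)
```

After this call the two selected rules have gained experience and been reset, whereas the unused rule has aged by one step and lost weight under the placeholder function.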
Suppose that the interpolated rule is $R^{*}$; the similarity degree between the newly interpolated rule and each rule $R_i$ in the current rule base can be computed as: where $S(\cdot)$ denotes the similarity degree calculation between two fuzzy sets. In this work, the similarity degree between fuzzy sets $A_{i1}$ and $A_1^{*}$ can be computed as: Note that a number of approaches have been proposed for the calculation of the similarity degree (Zhu & Xu, 2012), and thus other approaches may also be employed here.
Based on these operations, the rule base can be revised in three ways, including updating the current rules, adding new rules, and removing useless or outdated rules. Suppose that the current rule base has n fuzzy rules and the interpolated rule is denoted as R Ã ; the working progress of rule base revision is summarized in Figure 6. In this figure, the value of i is initialized as 0. In order to realize the rule base revision, the performance of the newly interpolated rule is evaluated based on the result of the bipedal robotic upright balance control. A newly interpolated rule will be discarded if the performance feedback is negative, and the weights of all other rules in the current rule base will be updated. Otherwise, based on a pre-defined similarity degree threshold δ, there are two actions to revise the rule base: (1) if the similarity degree between the newly interpolated rule and each rule in the current rule base is less than the pre-specified threshold δ, the newly interpolated rule will be added into the current rule base; (2) if there exist similar rules which weights are smaller than the weight of the newly interpolated rule, those similar rules will be removed from the current rule base and the newly interpolated rule will be added into the current rule base; otherwise, the newly interpolated rule will be discarded. Through the above rule base revision mechanism, only accurate and up to date rules are included in the current rule base.
FIGURE 6 The flowchart of rule base revision

This rule base revision mechanism will effectively update the impedance gains of the impedance model for better upright balance control performance.

| Stability analysis
The proposed control system uses an IP model and an adaptive impedance model to estimate the desired ankle torque ($\tau_q$), as described in Equation 20. For the convenience of analysis, the desired ankle torque ($\tau_q$) can be re-expressed as: where In order to analyse the stability properties of the proposed control system, the well-established class of Lyapunov function was employed here. Consider the following parameter dependent Lyapunov candidate function: For simplicity, denote $V_1(\theta_e, \dot{\theta}_e, t)$ as $V_1$ in the remainder of this paper. Keep $M$ as a constant and $\tau_q = C$, the following can be generated by differentiating $V_1$ with respect to time: Substituting Equation (20) in the above equation yields: Note that positive stiffness has been used in this work. Discard the mixed term $\theta_{foot}$ and $\dot{\theta}_{foot}$, and keep the potentially positive term $\dot{K}(t)$; Equation 29 can be rewritten as the following by considering $B = v\sqrt{K}$: In the case of bipedal robotic upright balance control, the robotic ankle angle or the swing angle, that is, $\theta_{foot}$, is usually small. However, the angular velocity $\dot{\theta}_{foot}$ is relatively large, and thus $\dot{\theta}_{foot} > \theta_{foot}$. If there exists a positive value of $v$, subject to $\dot{K}_q(t) < 2v\sqrt{K_q(t)}$, the above equation is negative. This means the proposed control system is stable.
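The condition $\dot{K}_q(t) < 2v\sqrt{K_q(t)}$ stated above can be checked numerically on a sampled stiffness trajectory. The sketch below is only an illustration under stated assumptions: the function name, the forward-difference estimate of $\dot{K}_q$, the sampling interval, and the example trajectory are all hypothetical. It tests the quoted inequality sample by sample and is not a substitute for the Lyapunov argument.

```python
import math


def satisfies_stiffness_rate_bound(k_samples, dt, v):
    """Check dK/dt < 2*v*sqrt(K) at every sample using forward differences."""
    if v <= 0 or dt <= 0:
        raise ValueError("v and dt must be positive")
    for k_now, k_next in zip(k_samples, k_samples[1:]):
        if k_now <= 0:
            raise ValueError("stiffness samples must be positive")
        k_dot = (k_next - k_now) / dt  # finite-difference estimate of dK/dt
        if k_dot >= 2.0 * v * math.sqrt(k_now):
            return False
    return True


# Hypothetical trajectory: stiffness ramps linearly from 2 to 37 Nm/rad in 1 s.
ramp = [2.0 + 35.0 * i / 100 for i in range(101)]
holds_for_large_v = satisfies_stiffness_rate_bound(ramp, dt=0.01, v=20.0)
holds_for_small_v = satisfies_stiffness_rate_bound(ramp, dt=0.01, v=1.0)
```

For this ramp the stiffness rate is about 35 Nm/rad per second, so the bound holds for the larger $v$ but not for the smaller one, which illustrates how the admissible rate of stiffness change depends on $v$.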

| EXPERIMENTATION
The proposed adaptive ankle impedance control method was applied to a bipedal robot on a moving vehicle for system validation and robustness evaluation. The acceleration and deceleration of a moving vehicle provide a good simulation of a robotic working environment and thus an ideal testing bed for the proposed control approach.

| Experiment conditions
A bipedal robot model and a vehicle model, which jointly form the simulation platform, were constructed using OpenSim to facilitate the verification of the proposed adaptive ankle control method. The models were implemented using XML, leading to .osim files for OpenSim. In particular, the .osim file of the bipedal robot model provides details of each segment, joint constraint, and joint drive of the robot, amongst other information. The 3D mechanical structure was implemented using SolidWorks, which provides the graphic display of the simulated bipedal robot and the vehicle. The resulting simulation platform is illustrated in Figure 7. The controller programme of the simulated models was developed in C++, which works with the OpenSim API in Visual Studio to drive the simulated models.
In this experiment, the bipedal robotic ankle joint was driven by two actuators. The mechanical parameters of this model are proportional to those of a typical adult male (Hamner et al., 2010), as listed in Table 1. Over the course of the experiment, different loads were applied to the 'standard bipedal robot' to change its CoM, in order to explore the robustness of the proposed control method. The CoM of the bipedal robot can be calculated as:

$$\mathrm{CoM} = \frac{1}{M} \sum_{i=1}^{n} m_i \, \mathrm{CoM}_i$$

where $M = \sum_{i=1}^{n} m_i$ is the total mass of the bipedal robot, and $\mathrm{CoM}_i$ and $m_i$ are the CoM and mass of the $i$th body segment, respectively.

In order to quantify the performance of bipedal robotic upright balance control, a two-dimensional performance evaluation index representing the swing angle and the anti-disturbance timespan is used in this work, as shown in Figure 8. The swing angle is defined as the angle range of the bipedal robot deviating from the equilibrium position, which represents the stability of the upright balance control. A smaller value indicates better performance. The anti-disturbance timespan is defined as the time period from the application of the external force to the full stable condition at the equilibrium position, which represents the rapidity of balance re-establishment by the bipedal robotic upright balance control. The smaller the anti-disturbance timespan is, the better the performance of the controller.
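The two components of the performance evaluation index can be computed from a recorded angle trajectory along the following lines. This is a hedged sketch: the function names, the tolerance band used to decide when the robot is "fully stable", and the sample data are assumptions introduced for illustration and are not taken from the experiments.

```python
def swing_angle(angles):
    """Angle range covered during the response (max minus min), in degrees."""
    return max(angles) - min(angles)


def anti_disturbance_timespan(times, angles, t_disturb, equilibrium, tol):
    """Time from the disturbance until the angle stays within tol of equilibrium.

    Returns None if the record ends while the angle is still outside the band.
    """
    last_outside = None
    for i, (t, a) in enumerate(zip(times, angles)):
        if t >= t_disturb and abs(a - equilibrium) > tol:
            last_outside = i  # remember the latest sample outside the band
    if last_outside is None:
        return 0.0  # the angle never left the band
    if last_outside == len(times) - 1:
        return None  # not settled within the recorded window
    return times[last_outside + 1] - t_disturb


# Synthetic samples (seconds, degrees) chosen only to illustrate the two measures.
times = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
angles = [0.0, -2.2, 2.6, 2.1, 1.9, 1.8, 1.8, 1.8]
swing = swing_angle(angles)
settle = anti_disturbance_timespan(times, angles, t_disturb=5.0, equilibrium=1.8, tol=0.15)
```

The choice of tolerance directly affects the reported timespan, so it would need to be fixed once and applied to every controller being compared.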

| System validation
In this experiment, an external force was applied to the vehicle platform, which was used to represent the external disturbance to the 'standard bipedal robot'. The movement of the vehicle platform includes four states: stationary, acceleration, constant speed, and deceleration. Therefore, the vehicle platform provides external disturbance to the bipedal robotic upright balance only during the acceleration and deceleration stages.
The reaction of the robot as instructed by the upright balance control system will be utilized to verify the effectiveness of the proposed adaptive ankle impedance control method. As detailed in Section 3.4, the impedance gains are updated by the E-FRI for different robot shapes/sizes and  Table 2. In this experiment, the experience factor (EF) of all rules was initialised as 100, the cooling down factor (CD) was initially configured as 0, the similarity degree threshold was configured as 0.6, and the weight ($w$) of each rule was then updated using Equation 4, with parameter values $n = 500$, $a = 100$, $b = 5$.
Based on the initialised rule base, the proposed adaptive ankle impedance control method can perform upright balance control tasks for the 'standard bipedal robot', but may not perform to an acceptable level on a specific robot in a specific environment. Therefore, the rule base must be revised to ensure continuing performance improvement based on the feedback of previous performance. To validate this functionality, the acceleration of the vehicle platform was set to 0.5 m/s² as the disturbance in this experiment, and the initialised controller was utilized to perform the upright balance control tasks. Of course, the performance feedback from the bipedal robot was not available at the time of performing the task; thus, the rule base revision process would only occur once the performance feedback became available. The process of fuzzy rule interpolation and rule base revision is detailed in Section 3.4. After 2000 FRI performances, the rule base was stabilized with 80 rules, and 9 randomly selected rules from the stabilized rule base are illustrated in Table 3.
FIGURE 8 The performance evaluation index of bipedal robotic upright balance control

Based on the stabilized rule base, the experimental results regarding the bipedal robotic ankle angle, the impedance model stiffness gain, and the desired ankle torque are illustrated in Figure 9. During the bipedal robot upright balancing process, the counter-clockwise swing is denoted as the negative direction, and the clockwise swing is denoted as the positive direction. There was no external interference during the vehicle's stationary state, and the bipedal robot quickly adjusted itself to an upright balance state in this situation. After the bipedal robot was adjusted to a balanced position, it maintained the upright balance, and the desired ankle torque barely changed. Then, an acceleration of 0.5 m/s² was applied to the vehicle platform at the 5 s time point; the bipedal robotic body offset angle $\theta_{foot}$ is shown in Figure 9a.
Figure 9a shows that after a lean forward position of about −2.2° and a lean backward position of about 2.6°, the bipedal robot gradually stabilized at around 1.8°, which is consistent with the research results regarding the human upright balance position (Yin et al., 2020). The target stiffness is automatically adjusted by the E-FRI according to the bipedal robotic tilt angle and its velocity, as shown in Figure 9b. In particular, the target stiffness ranged from 2 to 37 Nm/rad. When the bipedal robotic body deviated away from the equilibrium position, the target stiffness was increased, and vice versa, which is consistent with the properties of the human joint. The desired ankle torque is summarized in Figure 9c.
At the initial stage of the experiment, a backward rotation torque was produced by the change of the bipedal robotic gravity, and thus the desired ankle torque should be a forward rotation torque. That is, the desired ankle torque presents a tendency to increase first and then decrease. The vehicle platform advanced at a constant speed after the speed reached 0.1 m/s; the bipedal robot then experienced a period without external interference and quickly adjusted to a stable upright balance state.
In summary, the proposed adaptive ankle impedance control method can effectively calculate the desired ankle torque for bipedal robot upright balance control, which drives the bipedal robot through the ankle torque actuator to realize the upright balance control.

| Robustness evaluation
Note that the rule base developed from expert knowledge based on a 'standard bipedal robot' may not be suitable for other robotic working environments or for robots of different body sizes and shapes, and thus the controller must be able to adapt to the customized needs of individual robots and their working environments. Therefore, two further experiments were performed to evaluate the robustness of the proposed adaptive ankle impedance control method for different bipedal robots. These two experiments considered different external disturbances and different robotic physical parameters independently, as reported below.

| Robust control for different external disturbance
FIGURE 9 The results of bipedal robotic upright balance control (0.5 m/s² acceleration)

Different vehicle acceleration rates were applied to the moving vehicle as the external disturbance of the bipedal robot in this experiment to verify the effectiveness of the proposed adaptive ankle impedance control method for various external disturbances. To facilitate a comparative study, the experiments were performed on both the proposed adaptive ankle impedance control method and the virtual-ankle stiffness control method (Emmens et al., 2018). Briefly, the virtual-ankle stiffness controller aims to keep the robotic ankle in the desired position, which can be regarded as a proportional controller. By taking the ankle angle as the input, the desired ankle torque $\tau_{vas}$ can be expressed as: where $\theta_d$ is the desired ankle angle in the robot equilibrium position. The control gain $K$ must be adaptively adjusted according to the robotic CoM, so as to effectively simulate the adjustment mechanism of the humanoid ankle stiffness, which can be expressed as: where $M$ is the robotic mass, and $l_{CoM}$ is the height of the robotic CoM.
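For orientation, a generic proportional law of the kind described above can be sketched as follows. This is an assumption-laden illustration, not the controller of Emmens et al. (2018): the function name, the sign convention, and the numerical values are hypothetical, and the gain is passed in as a plain number because the gain law in terms of $M$ and $l_{CoM}$ is not reproduced in this section.

```python
def proportional_ankle_torque(theta, theta_d, gain):
    """Generic proportional law: torque drives the ankle angle towards theta_d."""
    return gain * (theta_d - theta)


# Hypothetical values: 0.05 rad forward lean, upright target, gain of 30 Nm/rad.
torque = proportional_ankle_torque(theta=0.05, theta_d=0.0, gain=30.0)
```

Because the torque depends only on the instantaneous angle error and a gain that does not react to the disturbance, such a controller has no mechanism for adapting its stiffness online, which is the contrast the comparison with the proposed method is designed to show.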
Case A: The proposed adaptive ankle impedance control method

When the acceleration of the vehicle platform was 0.5, 1.0 and 1.5 m/s², the resultant bipedal robot body tilt angles and the desired ankle torques were recorded, as illustrated in Figure 10. From this figure, it is clear that for different external disturbances, the upright balance process was basically the same and the proposed control method could successfully realize upright balance control. The swing range of the bipedal robot body tilt angle $\theta_{foot}$ increased from 4.8° to 5.9° and 7.2° as the acceleration of the vehicle platform increased from 0.5 to 1.0 and 1.5 m/s², respectively. The anti-disturbance timespan of the bipedal robot changed only slightly, being 6.8, 7.2 and 7.7 s, respectively. After this period, the bipedal robot returned to the basic equilibrium position within the range of 1.8°-2.0°.
When the acceleration of the vehicle was 0 m/s², the bipedal robot did not swing, that is, the robot was in the stable upright equilibrium state, and thus the swing range of the bipedal robot body offset angle was around 0°. Under this premise, the swing ranges under different acceleration values of the vehicle platform are shown in Figure 11a. When the vehicle platform acceleration was in the range of 0-1.5 m/s², the upright balance control process was basically the same, with a relatively small anti-disturbance period.

Case B: The virtual-ankle stiffness control method
In this experiment, the moving process or the external disturbance of the vehicle was the same as that of Case A, and the acceleration rates of the vehicle platform were 0.5, 1.0 and 1.5 m/s². The swing range under different acceleration values of the vehicle platform is shown in Figure 11b. When the vehicle platform acceleration was 1.0 m/s², the anti-disturbance period was 0°; the bipedal robot failed to achieve upright balance when the vehicle platform acceleration was greater than 1.0 m/s². The comparative experimental results demonstrate that the proposed adaptive ankle impedance control method improves robustness against the external disturbance caused by the acceleration of the vehicle platform.

| Robust control for different robotic physical parameters
The most important parameter of the robotic body is its CoM. In this experiment, the moving process of the vehicle was kept the same as that in the previous experiment, and the acceleration was set to 0.8 m/s². In order to show the actual working process of the bipedal robot, different loads were applied to the robot to effectively revise its CoM. In particular, the loads ranged from 0 to 1.65 kg, and the total mass was in the range of 3.5-5.15 kg, which means the CoM was in the range of 0.4-0.68 m.
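The effect of an added load on the CoM follows from the mass-weighted average of the segment CoMs used in the experiment conditions. The sketch below is illustrative only: the function name is hypothetical, the robot body is lumped into a single 3.5 kg segment at a height of 0.40 m for simplicity, and the 1.0 m mounting height of the 1.65 kg load is an assumed value, not one reported in this work.

```python
def centre_of_mass(masses, heights):
    """Mass-weighted average height of a set of segments (kg, m)."""
    total_mass = sum(masses)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    return sum(m * h for m, h in zip(masses, heights)) / total_mass


# Unloaded robot lumped into one segment (assumption for illustration).
unloaded = centre_of_mass([3.5], [0.40])
# Same body with a 1.65 kg load at a hypothetical height of 1.0 m.
loaded = centre_of_mass([3.5, 1.65], [0.40, 1.0])
```

Placing the same load higher on the body raises the combined CoM further, which is why varying the load is a convenient way to vary the CoM in this experiment.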
The experimental results regarding the bipedal robotic swing angle and the anti-disturbance timespan with different bipedal robotic loads are illustrated in Figure 12. From this figure, it can be seen that the swing angle and anti-disturbance timespan were within reasonable boundaries. In particular, the robotic swing angle ranged from 5.5° to 7.6°, and the anti-disturbance timespan ranged from 7.0 to 8.5 s. The swing angle and anti-disturbance timespan both increased with the robotic total mass and CoM, and the increases shared a very similar trend. This experiment shows the effectiveness of applying fuzzy rule interpolation to the impedance gain update in order to respond dynamically to various bipedal robotic working states and robotic loads, and thus demonstrates the applicability of the proposed adaptive impedance model.

FIGURE 10 The results of bipedal robotic upright balance control with different vehicle acceleration

| CONCLUSION
This study proposes an adaptive ankle impedance control method for bipedal robotic upright balance control by dynamically adjusting the desired ankle torque, aiming to mitigate the poor robustness of traditional bipedal robot control methods. This is achieved by applying the E-FRI approach for adaptive impedance gain update based on the states of a bipedal robot. The proposed controller was applied to a bipedal robot on a moving vehicle with various external disturbances and robotic loads, and the promising experimental results demonstrate the working and robustness of the proposed method for bipedal robotic upright balance control under uncertain and unstructured environments. This can be of high financial interest towards the commercial mass production of bipedal robots for different tasks. Although the proposed controller was only applied to bipedal robot ankle control in this work, it is readily applicable to other robot joints, which remains as future work. In addition, it would be valuable to further evaluate the proposed robotic control approach by applying it to a wider range of real-world robotic working environments. Furthermore, the introduction of the extra E-FRI may make the control system more complex, and thus a formal analysis of system complexity is required.

FIGURE 11 The swing range under different acceleration rates of the vehicle platform

FIGURE 12 Evaluation index based on various bipedal robotic loads (a) swing angle (b) anti-disturbance timespan

DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.