Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping...

8
Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1 , Patrick M. Wensing 1 , David E. Orin 1 , James P. Schmiedeler 2 1 Department of Electrical and Computer Engineering, The Ohio State University (liu.805 / wensing.2 / [email protected]) 2 Department of Aerospace and Mechanical Engineering, University of Notre Dame ([email protected]) Abstract— Current biped robots with articulated legs, even the most impressive to date, still lack the ability to execute dynamic motions such as jumping and running with compa- rable performance to biological systems. This work explores dynamic jumping with the planar biped prototype KURMET, which employs unidirectional series-elastic actuation at each of its joints. While this actuation scheme enables the performance of high-power dynamic movements like the jump, its presence complicates the jumping control problem and has prevented previous researchers from obtaining precise jump control in systems of reasonable complexity. To manage this problem, this paper develops a layered fuzzy control system for KURMET that realizes repeated dynamic hopping and accurate control of both torso height and velocity at each top of flight. An effective two-stage training approach is used for the fuzzy controller to learn the required, yet highly nonlinear, relationships between its inputs and outputs. Finally, the state machine employed at the lowest-level of control is used to achieve a maximal normalized jump height that outperforms most humans and can be sequenced with the hopping movement. I. I NTRODUCTION Due to their structural similarities with human beings, biped robots with articulated legs offer advantages to assist and interact with humans and their immediate environments. However, in order to leverage the agility legs offer and function in realistic scenarios, biped robots must be able to perform dynamic movements without compromising their dynamic stability. The development of methods to address dynamic stability in legged machines remains a research challenge. Although much recent work has focused on dynamic movements in legged locomotion, only a small portion has aimed to address controlled hopping. A number of recent contributions to legged hopping have focused on the de- velopment of new actuation schemes [1], [2]. While these design efforts have led to impressive high jump performance, their control approaches do not manage the jump height or velocity and thus, fail to address the underactuation problem inherent to jump control. Raibert’s bipeds with prismatic legs [3], on the other hand, demonstrated jump height and velocity control, yet the control approach has not proven to be more broadly applicable to articulated morphologies. Overall, jump control for articulated mechanisms has not reached the maturity observed in state-of-the-art dynamic walking and running machines [4], [5], [6]. The incorporation of compliance into actuation schemes for dynamic bipeds has emerged as a standard in the robotics Fig. 1. KURMET, an experimental biped designed for the study of dynamic movements. community [4], [7], [8], inspired by biological systems. Sci- entists have cited tendon elasticity as a source of efficiency for high power movements in mammals and insects [9]. The advantages afforded by compliant mechanisms are most pronounced for jump performance, where elasticity helps to provide the explosive leg power required for liftoff [10]. Although methods have been developed to manage the complexities associated with compliant actuators in biped running [11], corresponding developments have not been achieved for precision controlled jumping. The work de- scribed herein represents an attempt to provide such control for a jumping biped with a compliant actuation scheme. To investigate the problem of dynamic jumping, a new experimental biped robot, KURMET, shown in Fig. 1, has been developed at The Ohio State University. KURMET’s most prominent features are its unidirectional series-elastic actuators (USEAs) embedded into its torso and employed to drive the two hip and two knee joints. The USEAs provide leg compliance in the direction of impact loading during ground contact and rigidity in the direction of leg return for higher precision leg positioning during the swing phase [12]. Previous work has shown superior jump performance when USEAs are included at both the hip and knee joints [13]. Although the USEAs offer performance advantages, their

Transcript of Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping...

Page 1: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

Fuzzy Controlled Hopping in a Biped Robot

Yiping Liu1, Patrick M. Wensing1, David E. Orin1, James P. Schmiedeler21Department of Electrical and Computer Engineering, The Ohio State University

(liu.805 / wensing.2 / [email protected])2Department of Aerospace and Mechanical Engineering, University of Notre Dame

([email protected])

Abstract— Current biped robots with articulated legs, eventhe most impressive to date, still lack the ability to executedynamic motions such as jumping and running with compa-rable performance to biological systems. This work exploresdynamic jumping with the planar biped prototype KURMET,which employs unidirectional series-elastic actuation at each ofits joints. While this actuation scheme enables the performanceof high-power dynamic movements like the jump, its presencecomplicates the jumping control problem and has preventedprevious researchers from obtaining precise jump control insystems of reasonable complexity. To manage this problem, thispaper develops a layered fuzzy control system for KURMETthat realizes repeated dynamic hopping and accurate control ofboth torso height and velocity at each top of flight. An effectivetwo-stage training approach is used for the fuzzy controller tolearn the required, yet highly nonlinear, relationships betweenits inputs and outputs. Finally, the state machine employedat the lowest-level of control is used to achieve a maximalnormalized jump height that outperforms most humans andcan be sequenced with the hopping movement.

I. INTRODUCTION

Due to their structural similarities with human beings,biped robots with articulated legs offer advantages to assistand interact with humans and their immediate environments.However, in order to leverage the agility legs offer andfunction in realistic scenarios, biped robots must be ableto perform dynamic movements without compromising theirdynamic stability. The development of methods to addressdynamic stability in legged machines remains a researchchallenge.

Although much recent work has focused on dynamicmovements in legged locomotion, only a small portion hasaimed to address controlled hopping. A number of recentcontributions to legged hopping have focused on the de-velopment of new actuation schemes [1], [2]. While thesedesign efforts have led to impressive high jump performance,their control approaches do not manage the jump height orvelocity and thus, fail to address the underactuation probleminherent to jump control. Raibert’s bipeds with prismaticlegs [3], on the other hand, demonstrated jump height andvelocity control, yet the control approach has not provento be more broadly applicable to articulated morphologies.Overall, jump control for articulated mechanisms has notreached the maturity observed in state-of-the-art dynamicwalking and running machines [4], [5], [6].

The incorporation of compliance into actuation schemesfor dynamic bipeds has emerged as a standard in the robotics

Fig. 1. KURMET, an experimental biped designed for the study of dynamicmovements.

community [4], [7], [8], inspired by biological systems. Sci-entists have cited tendon elasticity as a source of efficiencyfor high power movements in mammals and insects [9].The advantages afforded by compliant mechanisms are mostpronounced for jump performance, where elasticity helps toprovide the explosive leg power required for liftoff [10].Although methods have been developed to manage thecomplexities associated with compliant actuators in bipedrunning [11], corresponding developments have not beenachieved for precision controlled jumping. The work de-scribed herein represents an attempt to provide such controlfor a jumping biped with a compliant actuation scheme.

To investigate the problem of dynamic jumping, a newexperimental biped robot, KURMET, shown in Fig. 1, hasbeen developed at The Ohio State University. KURMET’smost prominent features are its unidirectional series-elasticactuators (USEAs) embedded into its torso and employed todrive the two hip and two knee joints. The USEAs provideleg compliance in the direction of impact loading duringground contact and rigidity in the direction of leg return forhigher precision leg positioning during the swing phase [12].Previous work has shown superior jump performance whenUSEAs are included at both the hip and knee joints [13].

Although the USEAs offer performance advantages, their

Page 2: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

presence significantly increases control challenges. The useof USEAs to actuate the articulated legs, along with thenonlinear contact-flight hybrid dynamics, produce a verycomplex system that is difficult to control through traditionalanalytical approaches. Recently, intelligent control strategieshave emerged as a necessary alternative to address suchcomplexities in dynamic machines [14]. Intelligent con-trol strategies do not require a closed-form control model,which is usually intractable to generate. Instead, through aprogressive learning process, intelligent methods can helpthe controller gain the essential control knowledge neededto manage the complex system. Previous applications ofintelligent control schemes in dynamic legged locomotionat The Ohio State University have also shown promisingresults [15], [16], [17]. Still, these results have largely beenlimited to simulation environments.

This paper is part of the effort to realize controlleddynamic movements in a mechanically complex biped robotusing an intelligent control approach. The first contributionof this paper is the development of a layered fuzzy controlsystem capable of repeated dynamic hopping with preciseheight and velocity control at each top of flight (TOF). Thelayered control system uses a lower level state machine anda higher level discrete-time fuzzy controller that regulateskey parameters of the state machine at TOF. Simplifyingtechniques, such as the use of a common state machine fordifferent hopping scenarios and the design of a minimumnumber of fuzzy controller outputs, are adopted to keep thecontrol system as compact as possible. These techniques al-low rapidly convergent learning algorithms to be designed forthe physical system. The second contribution is the introduc-tion of an effective two-stage training method that allows thecontroller to iteratively learn to manage the highly nonlinearrelationships between its inputs and outputs. Through the useof simple heuristics, the training is first performed offline in adynamic simulation environment. The preliminary controlleracquired from simulation is then refined through physicalexperiments with an online adaptation algorithm.

The organization of this paper is as follows. First, descrip-tions of the KURMET hardware and its model are given, withan emphasis on the series-elastic elements. Next, the layeredcontrol structure is detailed, and the heuristics to developthe controller are discussed. Then, the training and onlineadaptation algorithms are described. Finally, test results arepresented to verify the validity of the control approach andhighlight the advantages of the online adaptation algorithm.

II. HARDWARE AND MODEL

KURMET is a five-link planar biped robot constrained bya boom [18]. Each leg is comprised of a thigh and a shanksegment, with revolute joints at the hips and knees. The thighand shank are actuated in parallel by identical unidirectionalseries-elastic actuators (USEAs) [12]. Each USEA includesa Maxon Powermax EC-30 brushless motor and a planetarygearhead in series with a spiral torsion spring, as shown inFig. 2. To reduce the leg inertia, the USEAs are located in thetorso and connected to their respective leg segments through

Fig. 2. Exploded view of the unidirectional series-elastic actuator onKURMET. (Courtesy of B. Knox [12].)

a cable drive system. KURMET’s torso is rigidly mountedto a carbon-fiber boom that is coaxial with the hip axis.The boom is mounted to the wall at a free 2-DOF joint thatallows boom pitch and yaw and is aligned to be level whenKURMET stands on the ground with legs fully extended.KURMET has a total mass of approximately 15 kg and astanding hip height of 50 cm with legs fully extended.

As the initial fuzzy controller training process employsoffline dynamic simulation, a dynamic model of the hardwarethat faithfully includes the most important dynamic charac-teristics of the system is essential to the physical applicabilityof the trained controller. All the links, including the boom,are modeled as rigid bodies with realistic mass, inertia, andgeometric properties (see [19] for details). The model of theUSEAs is split into two parts: a model for the DC motor anda hybrid-dynamics model for the unidirectional direct-driveand series-elastic regimes of operation. The model for DCmotor has been developed by Curran [13], while the hybrid-dynamic model is detailed here.

The USEA has two regimes of operation. The Series-Elastic Actuation Mode occurs when the motor is away fromthe unidirectional hardstop and primarily occurs when theleg is compressed against the ground. In this mode, theactuator drives the link through the tension torque of thespiral spring. The Direct-Drive Actuation Mode occurs whenthe motor is compressed against the hardstop and primarilyoccurs when the leg is in the air. An elastic element existson the hardstop in order to reduce impacts that occur withinthe actuator during the transition between these modes. Thishardstop is modeled as a stiff linear contact spring and anonlinear damper in parallel. A nonlinear damper model isused to avoid the instantaneous change in contact force thatis observed at initial contact with a linear model [20]. Toreduce link oscillations during flight, the spiral spring is pre-tensioned.

The behavior of the USEA for the thigh can be describedby

τl =

Ks (θm − θl) θm − θp > θl

Ks (θm − θl) +Kc (θm − θp − θl)+λc · |θm − θl − θp| ·

(θ̇m − θ̇l

)θm − θp ≤ θl ,

(1)

Page 3: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

Hip joint

Thigh motor

Thigh

Spiral spring

Contact damper

Stiff contact spring

Hardstop

Fig. 3. The detailed thigh USEA model [19]. The hardstop is physicallyattached to the link, and its contact is modeled with a stiff spring anddamper. Note that the thigh USEA is pictured in the series-elastic mode ofoperation.

where τl is the joint torque on the link, θm is the motorposition after the gear reduction, θl is the link position, θpis the pretensioning angle, Ks is the spring constant of thespiral spring, Kc is the spring constant of the unidirectionalhardstop contact spring (in the model, Kc ≤ 10Ks) and λcis the coefficient of the nonlinear contact damper. Figure 3provides a graphical explanation of these variables for thethigh. The shank has a similar equation with flexure in theopposite direction of the thigh.

In this work, dynamic simulation was performed with thegraphical simulation package RobotBuilder [21], which isbuilt upon the DynaMechs library [22]. RobotBuilder pro-vides efficient calculations of the articulated body’s forwarddynamics and admits the integration of additional actuatordynamics and user-defined control policies. Further details onthe dynamic model used for simulation, including a groundcontact model, can be found in [19].

III. LOW-LEVEL HOPPING CONTROL

KURMET’s low-level hopping control is a modular foun-dation that allows effective function of the fuzzy controller.The low-level control uses a state machine, shown in Fig-ure 4, to coordinate a series of motor trajectories and realizethe hop motion. Research in neuromotor physiology alsoindicates that the central nervous system of animals mayuse a linear combination of fixed pattern muscle synergiesto generate a variety of complex movements [23]. This lendscredibility to a modular approach, such as the state machine,for low-level hopping control.

Table I summarizes the motor actions taken during eachstate and the trigger to exit each state. A few key pointsshould be noted about this state machine:

I. The hopping state machine only controls motor po-sitions; it never attempts to directly control the linkpositions. This allows the controller to use the naturaldynamics of the USEAs and the biped mechanism togood advantage while eliminating the problems associ-ated with direct link control through elastic elements.

Touchdown (TD)

Deflection

threshold p%

Time threshold T2

Top of flight

(TOF)

Zero

deflections

&

lift-off (LO)

PRE_TD

HOLD

THRUST

PRE_LO

PRE_TOF

after

Fig. 4. Hopping state machine for KURMET. In the states with a boldborderline, KURMET is in contact with the ground.

0 100 200 300 400 500 600 700 800 900 1000-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

Time (ms)

Left s

hank / s

hank m

oto

r positio

n (

rad)

HO

LD

PRE_TD THRUST PRE_LO PRE_TOF HO

LD

PRE_TD

Desired motor position

Actual motor position

Actual link position

Singularity hardstop position

Fig. 5. Typical motor/link trajectories (left shank) in one hop from TOFto TOF, partitioned by the system states.

II. In this hopping state machine, the motor positionsare under closed-loop control at all times, which isa departure from Hester’s jumping state machine forKURMET [24]. Previous approaches used open-loopmotor currents, as opposed to closed-loop motor trajec-tories, to provide thrust. Open-loop control is sensitiveto modeling errors, such as inaccurate backward effi-ciency of the gear reducer, and therefore cannot reliablyprovide the desired thrusts. Further, it leaves the motorposition unknown during the critical instant of lift-off,which significantly increases the risk of violating jointkinematic limits. With motor position under closed-loopcontrol, the two problems can be avoided.

III. Whenever a new motor position is commanded, a cubic-spline trajectory is generated for the motor to makea smooth transition. This strategy helps to reduce un-wanted oscillations in the leg positions during flight.

IV. The left and right legs are synchronized at all times.

The parameterization of the hopping state machine hasbeen refined to accommodate a wide range of initial top-of-flight (TOF) system conditions, including those for a high

Page 4: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

TABLE IHOPPING STATE MACHINE SUMMARY

STATE MOTOR ACTIONS EXIT TRIGGER

PRE TD Servo thigh and shank motors to updated pre-touchdownpositions in T1 time. Touchdown

HOLD Both thigh and shank motors hold their current positions. T2 time after Touchdown

THRUST Servo thigh and shank motors to updated thrust positions inT3 time. Spring deflection < p% of the maximum†

PRE LO Servo thigh and shank motors to pre-liftoff positions in T4time. Lift-off, and all USEAs achieving zero deflections

PRE TOF Servo thigh and shank motors to previous pre-touchdownposition in T5 time.‡

Top of flight (TOF)

† The thigh and shank do not necessarily exit the THRUST state at the same time. ‡ T5 is always set longer than the duration of the PRE TOF state.

jump. This range is constrained mainly by the maximumallowable motor torque and speed, as well as the joint limits.Fig. 5 shows the typical motor and link position trajectories(left shank) in one hop from TOF to TOF. Right after TOF,the motor directly drives the link to the desired touchdownposition while the link is still in the air. Upon touchdown,the motor holds its position, and the spring starts to deflect.After a short period of time, the motor starts to thrust in theopposite direction to the link movement, and by increasingthe spring deflection, it injects extra energy into the spring.To avoid knee hyperextension, however, the motor has tounwind right before lift-off to keep the link away from theapproaching singularity hardstop. (The singularity hardstopposition is shown with the green dashed line in Fig. 5.)

IV. FUZZY CONTROLLER

The fuzzy controller is invoked at TOF in each jump.When called, it evaluates the current system status andconsults its rulebase to determine appropriate values of keyparameters in the hopping state machine for the next hopcycle.

A. Fuzzy Controller Inputs and Outputs

Well chosen fuzzy inputs and outputs are an importantdesign decision in the construction of the high-level fuzzycontroller. A minimal selection of inputs and outputs cannot only increase the inferential accuracy of the fuzzycontroller, but also can simplify the training process requiredto initialize the fuzzy rulebase.

The inputs of the fuzzy controller are the system variablesthat directly reflect the control objectives. In this work, theyare the torso height at current TOF ho, the torso velocityat current TOF vo, the desired change in torso height at thenext TOF ∆hd, and the desired change in torso velocity atthe next TOF ∆vd.

The choice of the outputs requires more consideration.Previously, Hester et al. [24] developed a fuzzy jump con-troller for KURMET with 6 outputs, 4 of which were open-loop thrust currents. While the weakness of using open-loopcurrent has been discussed, the number of outputs also causesproblems. The non-minimal selection of the fuzzy outputs

gives extra and unnecessary flexibility to the controller,potentially causing the training to converge to differentmodes of operation. When multiple modes of operation existin the fuzzy rulebase, the controller is likely to generatepoor outputs when inferring between rules. Sophisticatedtraining laws help, but are fundamentally unable to solvethis problem.

The final fuzzy controller outputs selected, a set of two,are the common-mode thrust angle θthr,comm and the dif-ferential thrust angle θthr,diff. These outputs are convertedto more straightforward motor position commands whichare key state machine parameters. (All other state machineparameters are either fixed or assigned heuristically.)

θthr,t = θthr,comm + θthr,diff , (2)

θthr,s = θthr,comm − θthr,diff , (3)

where θthr,t is added to the current thigh motor position togenerate the motor thrust position for the thigh motors in theTHRUST state (see Table I). A similar strategy is employedfor the shank motors.

The output θthr,comm predominantly influences the hopheight, and the output θthr,diff mainly influences the hopvelocity; the effects of the output variables were found tobe only weakly coupled. This selection keeps the numberof fuzzy controller outputs minimal yet ensures that thecontroller has enough flexibility to reach its height andvelocity objectives.

B. Fuzzy Controller Structure

The fuzzy controller provides a nonlinear mapping be-tween its inputs and outputs. As shown in Fig. 6, it iscomprised of four functional components: the fuzzificationinterface, the fuzzy rulebase, the inference mechanism, andthe defuzzification interface.

The fuzzification interface preprocesses the inputs. It con-verts a specific input value into certainties of membershipfunctions that are associated with each input. In this work,identical, equilateral-triangular input membership functionsare used for all inputs. The membership function centers

Page 5: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

Fuzzification

Defuzzific

ation

Fuzzy

Inference

Mechanism

Execution of One

Cycle in the Low-level

Hopping

State Machine

(TOF to TOF)

Adaptation

Mechanism

Rulebase

Fuzzy Controller

Fig. 6. Block diagram of the fuzzy controller.

Fig. 7. Input membership functions for input vo. The outmost membershipfunctions saturate at 1.0 when |vo| > 0.6.

(where the certainty of a membership function is 1) areequally-spaced, and the certainty of a triangular membershipfunction becomes zero at the centers of its two neighboringmembership functions. Due to this design, at any given time,any input value will have at most two membership functionswith non-zero certainties. Fig. 7 shows the membershipfunctions used to characterize vo. For example, if the currenthop velocity vo is 0.1 m/s, the only non-zero certainties areµvo

0.0 = µvo0.2 = 0.5. The membership function centers for all

of the fuzzy controller inputs are listed in Table II.The fuzzy rulebase provides control knowledge for the

system. It contains a set of rules that specifies the con-troller outputs for every combination of input membershipfunctions. In this work, the inputs ho, ∆hd, and ∆vdhave 3 membership functions each, and the input vo has 7membership functions, so there are 3×7×3×3 = 189 rules.The number of membership functions, and correspondinglynumber of rules, was experimentally determined to providea good trade-off between the control resolution and therequired training/adaptation time (see IV-C). The additionalmembership functions for the vo input are required to managethe nonlinearities in resultant jump behavior that are mostpronounced for different initial velocities. Each input mem-

TABLE IIFUZZY CONTROLLER INPUT MEMBERSHIP FUNCTION CENTERS

Input Membership function centers Unitsho 0.54, 0.58, 0.62 mvo -0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6 m/s

∆hd -0.02, 0.0, 0.02 m∆vd -0.2, 0.0, 0.2 m/s

bership function is represented by its center, and an examplerule looks like the following:

If ho = 0.58 m, vo = 0 m/s, ∆hd = 0 m and ∆vd = 0 m/s,then θthr,comm = 61.8 rad and θthr,diff = 7.4 rad ,

where the ‘if’ part of the rule is called the premise and the‘then’ part is called the consequent.

The fuzzy rulebase only specifies the controller outputsfor certain input cases. The controller outputs for generalinput cases are inferred from the existing rules. The inferencemechanism is the component that determines the applicabil-ity of each rule to the current set of input values. In this work,the applicability of a rule, or the certainty for the premiseof the rule, is calculated as the product of the certainties ofthe input membership functions. For example, the certaintyof the above rule would be (suppose z is the rule index)

µz = µho0.58 × µ

vo0.0 × µ

∆hd0.0 × µ

∆vd0.0 . (4)

Since each fuzzy controller input can have a maximum oftwo membership functions with non-zero certainty at a time,there are at most 24 = 16 applicable rules for any possibleset of input values. In other words, the control output for anyarbitrary input can be inferred from a maximum of 16 rulesinstead of the whole rulebase. Such a feature reduces thecomputational complexity of the fuzzy controller and enablesreal-time calculation, even for much larger rulebases.

The defuzzification interface then converts the result of theinference mechanism into the final fuzzy controller outputs.The outputs are calculated as the weighted average of theconsequents of all applicable rules.

yi =

∑z∈S

µz · uz,i∑z∈S

µz

=∑z∈S

µz · uz,i , (5)

where z is the rule index and S is the set of the indices ofall the applicable rules. The subscript i can be 1 for θthr,commor 2 for θthr,diff, and uz,i is the ith output singleton for rule z.Note that this is a zero-order Sugeno-type fuzzy controller.

C. Fuzzy Training

In traditional fuzzy controllers, the consequents of a ruleare usually assigned by the user based on previous expe-rience. For the fuzzy controller developed here, though, theuser initially has no knowledge as to how to assign the valuesof outputs θthr,comm and θthr,diff for each rule. Therefore, atraining process is needed for the fuzzy controller to learnthe consequents of each rule in a progressive manner.

Marhefka et al. used the idea of training to initialize therulebase for a fuzzy running controller in a quadruped [15].In this work, a training cycle is a single test hop that starts atTOF and terminates at the next TOF. The detailed trainingprocess for one rule is summarized in Table III. The twosimple heuristics used are:• Increase θthr,comm if the desired change of TOF torso

height is positive.

Page 6: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

TABLE IIITRAINING PROCESS FOR ONE RULE

1) Initialize the biped at TOF with ho as the torso height, vo asthe torso velocity and the legs in an initial static configuration.ho and vo are specified by the rule premise.

2) Initialize θ̂thr,comm, the estimated value for θthr,comm, and θ̂thr,diff,the estimated value for θthr,diff, from an adjacent rule.

3) Perform the test hop.4) Evaluate the hop at the next TOF in terms of the actual hop

height h and the actual hop velocity v.a) If |ho +∆hd−h| < δh and |vo +∆vd−v| < δv , then

θ̂thr,comm ⇒ θthr,comm and θ̂thr,diff ⇒ θthr,diff .The rule is considered trained, and thetraining process for this rule terminates.

b) Elsei) Update θ̂thr,comm and θ̂thr,diff:wh · (ho + ∆hd − h) + θ̂thr,comm ⇒ θ̂thr,comm,-wv · (vo + ∆vd − v) + θ̂thr,diff ⇒ θ̂thr,diff,where wh and wv are the positive weights used toregulate the update rate;

ii) Re-initialize the biped at TOF with ho and vo andthe legs in the initial static configuration. Repeatthe training process from Step 3 while applying theupdated θ̂thr,comm and θ̂thr,diff.

• Increase θthr,diff if the desired change of TOF torsovelocity is negative (forward).

Once a rule is trained, training moves on to the next ruleuntil all 189 rules are trained. The training efficiency forone rule can be greatly improved by setting its initial thrustangles to the consequents of an adjacent trained rule. (Twoadjacent rules are any two rules whose premises only differby one adjacent input center.) Typically, the training processfor one rule takes about 100 to 200 iterations, depending onthe values for wh, wv , δh and δv in Table III.

For this work, the order in which rules were trained didnot have a large impact on the resultant rulebase. That is, aslong as each rule’s consequents were seeded from an adjacenttrained rule, the final rulebase was the same regardless of ruletraining order. This observation is largely attributed to the2-output setup of the fuzzy controller. In contrast, Hester’s6-output controller was much more sensitive to rule trainingorder due to the unneeded flexibility of its outputs [24].

D. Online Adaptation

Due to modeling errors and the assumption in the trainingthat the legs are static at the initial TOF, the fuzzy rulebasefrom offline training will not immediately produce accurateperformance on the prototype biped. An online adaptationprocess was developed to fine-tune the fuzzy rulebase. If atthe TOF before the jth jump, the desired torso height for thenext TOF is hd(j) and the desired torso velocity for the nextTOF is vd(j), and at the next TOF, the actual torso heightis h(j) and the actual torso velocity is v(j), then

eh(j) = hd(j)− h(j) , (6)ev(j) = vd(j)− v(j) , (7)

-0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8

120

100

80

60

40

20

0

120

100

80

60

40

20

0-0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8

Fig. 8. θthr,t/θthr,s vs. vo fuzzy curves corresponding to different ho with∆vd = 0 m/s and ∆hd = 0 m. Note that the values for the motor anglesare for the input side of the gear. These curves highlight the necessity fora nonlinear controller to select appropriate control outputs.

and the updated output singletons are

uz,1(j + 1) = uz,1(j) +K1 · µz(j) · eh(j) , (8)uz,2(j + 1) = uz,2(j) +K2 · µz(j) · ev(j) , (9)

where z ∈ S and S is the set of indices of rules that areactivated to achieve the jth TOF. The adaptation heuristicsare the same as the ones used in the offline training. K1 andK2 are the adaptation gains for the two output singletons andare tuned experimentally. The certainty of the rule is usedto scale the update so that the output singletons of moreapplicable rules are changed more. It should be noted thatif the performance error of the original offline rulebase goesbeyond a certain limit, the online adaptation process willbecome significantly less efficient and may even fail if thejumping becomes unstable.

V. RESULTS

A. Fuzzy Control Surfaces

Fig. 8 shows some of the highly nonlinear relationships be-tween the fuzzy controller inputs and outputs that are learnedby the offline fuzzy training. These curves are consistent withthe training heuristics. Note that θthr,t/θthr,s are related to theimmediate fuzzy controller outputs θthr,comm/θthr,diff throughEqs. 2 and 3.

B. Online Adaptation

A simple hopping velocity profile is used to examinethe effectiveness of the online adaptation. The blue curves

Page 7: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

0 5 10 15 20 25 30-50

-40

-30

-20

-10

0

10

20

30

40

50

Hop count

Hop v

elo

city (

cm

/s)

0 5 10 15 20 25 3050

52

54

56

58

60

62

64

Hop count

Hop h

eig

ht (c

m)

Desired Profile

Actual Before

Actual After

Fig. 9. The controller’s performance in tracking a simple hop profile before and after online adaptation. Note, the desired change insuccessive TOF heights are limited to ±1.5 cm and the desired change in successive TOF velocities to ±15 cm/s.

10 20 30 40 50-50

-40

-30

-20

-10

0

10

20

30

40

50

Hop v

elo

city (

cm

/s)

Hop count

10 20 30 40 5054

55

56

57

58

59

60

61

62

Hop h

eig

ht (c

m)

Hop count

Actual Profile

Desired Profile

Fig. 10. The controller’s performance in executing a 55-hop profile after adequate online adaptation. (The adaptation gain has alreadybeen set to a very small value.) Note, the desired change in successive TOF heights are limited to ±1.5 cm and the desired change insuccessive TOF velocities to ±15 cm/s.

in Fig. 9 show the performance prior to online adaptation.After the velocity profile is executed twice with the onlineadaptation mechanism turned on, the cumulative-updatedrulebase allows the fuzzy controller to track the desiredprofile much better (the red curves in Fig. 9) than the originaloffline rulebase. The offline training provided a rulebasethat gave sufficient initial tracking of the velocity profile sothat online adaptation could effectively improve the trackingperformance. The online adaptation was very fast, such thatgood tracking could be achieved after only the second exe-cution of the velocity profile. Further improvement, however,is constrained by the mechanical limitations of the physicalsystem.

C. Tracking a Velocity and Height Profile

After similar online adaptation, the controller is ableto follow a relatively complex hopping profile with goodaccuracy. A 55-hop profile was tested on KURMET, andthe results are shown in Fig. 10. This profile highlights theability of the controller to track sharp changes in the velocitysetpoint, as well as simultaneous changes to the velocity andheight setpoints. While the majority of hops enjoy a heighterror of under 0.5 cm, the controller is still able to maintain

height control to within 1 cm in the worst case. In steadystate, the velocity error remains under 4 cm/s.

D. High Jump Test

KURMET’s mechanical design is specifically optimizedfor high jumping. The state machine presented previously isintended to support continuous hopping and to accommodatea wide range of TOF system conditions. To demonstrate thehigh jump ability of KURMET, all the parameters of thestate machine are tuned to produce maximal TOF height ina single jump. As a preliminary investigation, the parameterswere refined in this work based on qualitative experimentalobservations. Later, this approach may be improved by an in-telligent search method such as Particle Swarm Optimizationor Genetic Algorithms to optimize the numerical parametersfor a high jump. Still, the hand-tuned controller is able toachieve a height performance well beyond the range of thefuzzy controller. The maximum jump height achieved byKURMET (from hip joint to ground) is 75 cm, which isequivalent to a normalized jump height (ratio of the jumpheight to the leg segment length [25]) of 3. In this sense,KURMET outperforms most humans. The hip height for ahigh jump is shown in Fig. 11. The smoothness of this curve

Page 8: Fuzzy Controlled Hopping in a Biped Robotpwensing/Papers/Liu11-ICRA.pdf · Fuzzy Controlled Hopping in a Biped Robot Yiping Liu 1, Patrick M. Wensing , David E. Orin , James P. Schmiedeler2

200 400 600 800 1000 1200 14000.2

0.3

0.4

0.5

0.6

0.7

0.75

0.8

High jumptouchdown

Time (ms)

Tors

o h

eig

ht (m

)

Fig. 11. Torso height over time for a high jump. The jump begins ata height that can be reached with the hopping fuzzy controller. Note thesmoothness of the controlled trajectory.

provides evidence that the low-level state machine promotesthe natural dynamics of the system. It is useful to note thatthe high jump starts from a height that is within the rangeof the fuzzy controller.

A video of a high jump for this system can be found at

http://go.osu.edu/KURMET_Fuzzy_Control_Video

This video also includes a demonstration of the fuzzy con-troller, of the system’s ability to reject a push disturbance,and of the system’s ability to perform a high jump during asequence of fuzzy controlled hops. It should be noted that alow-gain position controller is used to command the desiredvelocity for the system to robustly hop in place.

VI. CONCLUSIONS AND FUTURE WORK

The work in this paper has realized repeated dynamichopping in the planar biped robot KURMET with accuratecontrol of the top-of-flight height and velocity. The layeredcontrol approach and the two-stage training process are itstwo major contributions. The promising results encourage ex-tension of this approach to study other dynamic movementswith similar experimental platforms. The major challengeto such an extension is the development of a suitable low-level controller as well as proper determination of the fuzzyinput/output structure. While the system has demonstrateda high level of performance, evolutionary strategies or otherintelligent methods should be employed to investigate its trueperformance boundaries. Additionally, it would be of interestto compare our control system’s accuracy, robustness, andadaptation ability to a traditional model-based controller.

VII. ACKNOWLEDGMENTS

The authors gratefully thank Brian Knox for the designand construction of KURMET and Matthew Hester, who’swork laid the foundation for many of the ideas describedabove. Support for this work was provided by Grant No.IIS-0535098 from the National Science Foundation to TheOhio State University.

REFERENCES

[1] J. Babic, B. Lim, D. Omrcen, J. Lenarcic, and F. C. Park, “Abiarticulated robotic leg for jumping movements: theory and exper-iments,” ASME Journal of Mechanisms and Robotics, vol. 1, no. 1,pp. 011013:1–9, 2009.

[2] R. Niiyama, A. Nagakubo, and Y. Kuniyoshi, “Mowgli: A bipedaljumping and landing robot with an artificial musculoskeletal sys-tem,” IEEE International Conference on Robotics and Automation,pp. 2546–2551, April 2007.

[3] M. Raibert, Legged Robots That Balance. MIT Press, 1986.[4] T. Yang, E. Westervelt, J. Schmiedeler, and R. Bockbrader, “Design

and control of a planar bipedal robot ERNIE with parallel kneecompliance,” Autonomous Robots, vol. 25, no. 4, pp. 317–330, 2008.

[5] B. Morris, E. R. Westervelt, C. Chevallereau, G. Buche, and J. W.Grizzle, “Achieving bipedal running with RABBIT: Six steps towardinfinity,” in Fast Motions in Biomechanics and Robotics, pp. 277–297,Springer Berlin / Heidelberg, 2006.

[6] B. Dynamics, “Petman - BigDog gets a big brother.” http://bostondynamics.com/robot_petman.html, 2009.

[7] G. A. Pratt and M. M. Williamson, “Series elastic actuators,” in Pro-ceedings of the 1995 IEEE/RSJ International Conference on IntelligentRobots and Systems, vol. 1, pp. 399–406, Aug. 1995.

[8] J. W. Hurst and A. A. Rizzi, “Series compliance for an efficientrunning gait,” IEEE Robotics and Automation Magazine, vol. 15,pp. 42–51, 2008.

[9] R. Alexander and H. Bennet-Clark, “Storage of elastic strain energyin muscle and other tissue,” Nature, vol. 265, pp. 114–117, 1976.

[10] T.J. Roberts and R.L. Marsh, “Probing the limits to muscle-poweredaccelerations: lessons from jumping bullfrogs,” The Journal of Exper-imental Biology, vol. 206, no. 15, pp. 2567–2580, 2003.

[11] I. Poulakakis, “Spring loaded inverted pendulum embedding: Exten-sions toward the control of compliant running robots,” in Proc. of IEEEInternational Conference on Robotics and Automation, (Anchorage,AK), pp. 5219–5224, 2010.

[12] B. Knox and J. Schmiedeler, “A unidirectional series-elastic actuatordesign using a spiral torsion spring,” Journal of Mechanical Design,vol. 131, pp. 125001:1–5, 2009.

[13] S. Curran, D. Orin, B. Knox, and J. Schmiedeler, “Analysis andoptimization of a series-elastic actuator for jumping in robots witharticulated legs,” in Proc. of 2008 ASME Dynamic Systems and ControlConference, pp. DSCC2008–2142:1–8, October 2008.

[14] P. Antsaklis and K. Passino, eds., An Introduction to Intelligent andAutonomous Control. Kluwer Academic Publishers, 1993.

[15] D. W. Marhefka, D. E. Orin, J. P. Schmiedeler, and K. J. Waldron,“Intelligent control of quadruped gallops,” IEEE/ASME Transactionson Mechatronics, vol. 8, pp. 446–456, December 2003.

[16] D. P. Krasny and D. E. Orin, “Generating high-speed dynamic runninggaits in a quadruped robot using an evolutionary search,” IEEETransactions on Systems, Man, and Cybernetics, Part B: Cybernetics,vol. 34, pp. 1685–1696, August 2004.

[17] L. Palmer and D. Orin, “Intelligent control of high-speed turning in aquadruped trot,” Journal of Intelligent and Robotic Systems, pp. 47–68,2009.

[18] B. Knox, “Design of a biped robot capable of dynamic maneuvers,”Master’s thesis, The Ohio State University, Dept. of MechanicalEngineering, 2008.

[19] Y. Liu, “Fuzzy control of hopping in a biped robot,” Master’s thesis,The Ohio State Univ., Dept. of Electrical & Computer Engineering,2010.

[20] D. Marhefka and D. Orin, “A compliant contact model with nonlineardamping for simulation of robotic systems,” IEEE Transactions onSystems, Man and Cybernetics, Part A, vol. 29, pp. 566–572, Novem-ber 1999.

[21] S. Rodenbaugh, “RobotBuilder: a graphical software tool for the rapiddevelopment of robotic dynamic simulations,” Master’s thesis, TheOhio State Univ., Dept. of Electrical & Computer Engineering, 2003.

[22] S. McMillan, D. Orin, and R. McGhee, “DynaMechs: An objectoriented software package for efficient dynamic simulation of under-water robotic vehicles,” in Underwater Robotic Vehicles: Design andControl, pp. 73–98, TSI Press, 1995.

[23] E. Bizzi, V. C. K. Cheung, A. d’Avella, P. Saltiel, and M. Tresch,“Combining modules for movement,” Brain Research Reviews, vol. 57,no. 1, pp. 125–133, 2008.

[24] M. Hester, J. P. Schmiedeler, P. M. Wensing, and D. E. Orin, “Fuzzycontrol of vertical jumping with a planar biped,” in Proc. of ASMEInternational Design Engineering Technical Conferences, (Montreal,Canada), pp. DETC2010–28857:1–8, August 2010.

[25] R. M. Alexander, “Leg design and jumping technique for humans,other vertebrates, and insects.,” Philosophical Transactions: BiologicalSciences, vol. 347, pp. 235–248, 2009.