HIL Implementation of Harmonic Drive Motor (Part III)

Part III: Migration to Target Processor

This material continues from our previous discussion

  • Part I: Discretization and Simulation
  • Part II: Algorithm Formulation

If you have not done so, make sure you read them first to understand the whole development process.

Recall from last time, we are able to implement our HDM model as an algorithm written using Scilab script, and test that it yields similar response to Xcos simulation. The rest of the work is to rewrite the algorithm using a programming language that some cross compiler supports, since it has to be run on a target system. In our case, we will be using PIC24EP product line from Microchip. The advantages are its low cost and simplicity, yet with proper setting it could run at a high performance of 70 MIPS.

Figure 1 shows the hardware prototype that I solder and wire on a general-purpose PCB. Well, it has more parts than necessary because I use it for some other embedded applications. What we need for this HDM implementation is of course the processor, a PIC24EP32GP202, crystal oscillator circuit, 3.3 V regulator, a push-button switch and a USB-to-serial convertor for data communication with a host PC.

Figure 1 target prototype board with PIC24EP32GP202

Even though this prototype is fairly simple, it does require some time and patience to build. So if you have some commercial board handy, feel free to use it. The idea is to understand the basics of discrete-time system implementation. That essence remains the same for all embedded products.

In any case, a prototype like this should cost less than 10 bucks. And most microcontroller products provide compilers for free (with some limitations such as code size or optimization level. Not really an issue for our simple HDM application). For Microchip products, MPLABX and compilers for all their product lines are available for free download. Figure 2 shows a screenshot of MPLABX.

Figure 2 MPLABX IDE from Microchip

You also need some hardware for programming the PIC. I use a PICKIT3 shown in the top image. This cost about USD 60 in Thailand. It is a one-time investment for all embedded applications using PIC products. I like this modest debugger/programmer more than the more expensive and bulky ICD3. That’s something for a more serious programmer.

Pure Algorithm Test

As stated before, the whole HIL hardware not only involves a processor, but also contains peripherals such as ADC, DAC (or PWM) on the signal path. These modules add complexity to the plant transfer function. So before we reach that stage, it is a good idea to test only the pure HDM plant developed from last time, to see if the algorithm executes correctly on the target processor. If it does not, the problem could be numerically-related, or caused by bugs in the program.

If you have some experience with C language, you’d know too well that it is not as user friendly as Scilab. Writing a byte beyond an allocated array does not result in any error or warning, but the program crashes. Well, even with my more than 10-year experience with C, I just did that before writing this article. That ‘s why I strongly recommend that you test your code in steps and then add on. For example, before implementing the HDM algorithm, I first test that the timer works properly by blinking an LED at 1 sec period.

So before getting to this point, I separately test all the support functions such as serial communication, timer, port initialization, clock switching, etc. These are not included in the article but some can be read from companion website www.dewplanet.com

So to define the process for our pure algorithm test, the input is generated in the algorithm. The outputs ql and qm are computed, and both input and output samples are sent via the serial port to a terminal program in PC. This process is executed when a push button is pressed. The data is captured for 2 seconds. With 0.01 sec sampling period, this results to about 200 data points each.

HDM DFII Algorithm in C

Refer to our Scilab algorithm developed from Part II, we have to rewrite it in C language. First the parameters and coefficients are declared as global variables.

// ******** ----- HDM parameters ------ *********
double km = 100;  // torque constant
double kb = 1;  // back EMF constant
double ks = 1000;  // torsional stiffness of harmonic drive
double gr = 10;   // gear ratio
double La = 0.1;  // armature inductance
double Ra = 1;  // armature resistance
double Jm = 1;  //  motor inertia
double Bm = 0.01;  // motor shaft friction
double Jl = 3;  // load inertia
double Bl = 0.05;  // load friction
double Smax = 3000;  // maximum motor speed (RPM)
double Ts = 0.01;   // sampling time for discrete model
// ******** ----- DFII gains and coefficients ------ *********
double coeff1a[5] = {0,0,0,0,0};  // denominator coefficients of P1
double coeff1b[6] = {0,0,0,0,0,0}; // numerator coefficients of P1
double k1 = 0;  // gain of P1
double coeff2a[2] = {0,0}; // denominator coefficients of P2
double coeff2b[3] = {0,0,0}; // numerator coefficients of P2
double k2 = 0;  // gain of P2

The gains and coefficients need to be computed later. It’s a common programming practice to initialize them to zero, though not necessary. Also, we declare global arrays to keep the states of P1 and P2

 double p1x[6] = {0,0,0,0,0,0};  // state of p1
double p2x[3] = {0,0,0};  // state of p2

The trick I use to capture data for this initial test is to create a loop counter, and a flag to turn the HDM on or off.

#define OFF 0
#define ON 1
int lcount = 0;   // loop count
int hdmrun = OFF;  // a flag to start the motor

The algorithm is not invoked when hdmrun is OFF. When a push-button is pressed, hdmrun is set to ON and the algorithm starts. lcount keeps the loop count. Whenever the loop count matches our specified data points, say, 200, hdmrun is set to OFF and the algorithm stops.

Coefficient computation is put into a function coeffcompute().

void coeffcompute(void)
    double kv;
    double ac1,ac2,ac3,ac4,ac5;
    double aLz[3];  // denominator coefficients of aL
    double a1z[6];  // unnormalized denominator coefficients of P1
    double Ts2, Ts3, Ts4, Ts5;
    Ts2 = Ts*Ts;
    Ts3 = pow(Ts,3);
    Ts4 = pow(Ts,4);
    Ts5 = pow(Ts,5);
    ac5 = La*Jm*Jl;
    ac4 = Ra*Jm*Jl+La*(Jm*Bl + Jl*Bm);
    ac3 = La*(Jm*ks + Bm*Bl + Jl*ks) + Ra*(Jm*Bl + Jl*Bm) + gr*km*kb*Jl;
    ac2 = La*ks*(Bm+Bl)+Ra*(Jm*ks + Bm*Bl + Jl*ks) + gr*km*kb*Bl;
    ac1 = ks*(La*ks + Ra*(Bm+Bl) - ks*La + gr*km*kb);
    kv = 0.1333*Smax*ac1/(km*gr*ks);
    // unnormalized denominator coefficients of p1(z)
    a1z[5] = 32*ac5 + 16*Ts*ac4 + 8*Ts2*ac3 + 4*Ts3*ac2 + 2*Ts4*ac1;
    a1z[4] = -160*ac5 - 48*Ts*ac4 - 8*Ts2*ac3 + 4*Ts3*ac2 + 6*Ts4*ac1;
    a1z[3] = 320*ac5 + 32*Ts*ac4 - 16*Ts2*ac3 - 8*Ts3*ac2 + 4*Ts4*ac1;
    a1z[2] = -320*ac5 + 32*Ts*ac4 + 16*Ts2*ac3 - 8*Ts3*ac2 - 4*Ts4*ac1;
    a1z[1] = 160*ac5 - 48*Ts*ac4 + 8*Ts2*ac3 + 4*Ts3*ac2 - 6*Ts4*ac1;
    a1z[0] = -32*ac5 + 16*Ts*ac4 - 8*Ts2*ac3 + 4*Ts3*ac2 - 2*Ts4*ac1;
    // compute gain and coefficients of P1
    k1 = kv*km*ks/a1z[5];
    coeff1b[0] = Ts5;
    coeff1b[1] = 5*Ts5;
    coeff1b[2] = 10*Ts5;
    coeff1b[3] = 10*Ts5;
    coeff1b[4] = 5*Ts5;
    coeff1b[5] = Ts5;
    coeff1a[0] = a1z[4]/a1z[5];
    coeff1a[1] = a1z[3]/a1z[5];
    coeff1a[2] = a1z[2]/a1z[5];
    coeff1a[3] = a1z[1]/a1z[5];
    coeff1a[4] = a1z[0]/a1z[5];
    // compute gain and coefficients of P2
    aLz[0] = Ts2;
    aLz[1] = 2*Ts2;
    aLz[2] = Ts2;
    k2 = gr/(ks*aLz[2]);
    coeff2b[0] = 4*Jl + 2*Bl*Ts + ks*Ts2;
    coeff2b[1] = 2*ks*Ts2 - 8*Jl;
    coeff2b[2] = 4*Jl - 2*Bl*Ts + ks*Ts2;
    coeff2a[0] = aLz[1]/aLz[2];
    coeff2a[1] = aLz[0]/aLz[2];

The code is quite similar to Scilab, with only two adjustments

  • Scilab array index starts from 1. C starts from 0. This is important and can cause system crash easily when your index exceeds the allocated memory for an array.
  • The ^ symbol that means “raised to the power of” in Scilab is interpreted as bitwise exclusive-or in C. So to compute x raised to the power of n, we need to use C math function pow(x,n). Don’t forget to include header file “math.h” at the top of program

Now we show the HDM DFII algorithm, implemented in an ISR for timer 1.
Note: To save space, the timer initialization to 0.01 sec is not shown. Please consult Microchip documents for information, or see some example from www.dewplanet.com

void __attribute__((interrupt, auto_psv)) _T1Interrupt(void)
// Timer 1 interrupt every 10 mS
    if (hdmrun)  {
        // pointers to coefficient and state arrays
        double *p1a_coeff, *p1b_coeff, *p1_state; // for p1
        double *p2a_coeff, *p2b_coeff, *p2_state;  // for p2
        double vin, qlout, qmout; // input and output
        p1a_coeff = coeff1a;
        p1b_coeff = coeff1b;
        p1_state = p1x;
        p2a_coeff = coeff2a;
        p2b_coeff = coeff2b;
        p2_state = p2x;
        // generate input
        if (inputtype == PULSE)  {
            if (lcount <= 60) vin = 100; // pulse from 0 to 0.6 sec
            else vin = 0;
        else if (inputtype == PRBS)   {
            int j;
            feedin = bvec[0]^(bvec[2]^(bvec[3]^bvec[12]));
            for (j = 12;j>=1;j--) {
            bvec[0] = feedin;
            if (bvec[12]==0) vin = -100;
            else vin = 100;
	 // ******--------  HDM DFII algorithm --------------*****
        // ----- P1 computation ------------- 
        // state updates
        // difference equation computation
        *p1_state = vin - *(p1a_coeff)*(*(p1_state+1)) - *(p1a_coeff+1)*(*(p1_state+2)) - *(p1a_coeff+2)*(*(p1_state+3)) - *(p1a_coeff+3)*(*(p1_state+4)) - *(p1a_coeff+4)*(*(p1_state+5));
        qlout = *(p1b_coeff)*(*(p1_state)) + *(p1b_coeff+1)*(*(p1_state+1))+ *(p1b_coeff+2)*(*(p1_state+2))+ *(p1b_coeff+3)*(*(p1_state+3))+ *(p1b_coeff+4)*(*(p1_state+4))+ *(p1b_coeff+5)*(*(p1_state+5));
        qlout = k1*qlout;
        // ------ P2 computation ---------
        // state updates
        // difference equation computation
        *p2_state = qlout - *(p2a_coeff)*(*(p2_state+1)) - *(p2a_coeff+1)*(*(p2_state+2));
        qmout = *(p2b_coeff)*(*(p2_state)) + *(p2b_coeff+1)*(*(p2_state+1))+ *(p2b_coeff+2)*(*(p2_state+2));
        qmout = k2*qmout;
        f2a(msgbuf,vin); // f2a is a custom function written by the author
        PutStrU1(", ");
        PutStrU1(", ");
    }  // if (hmdrun)
    //if (lcount>=201)  {  // stop the HDM after specified loop counts
    if (lcount>=201)  {
        hdmrun = OFF;
        lcount = 0;
    _T1IF = 0;

Focus for the moment on the part below comment line HDM DFII algorithm. This is just the same algorithm implemented as Scilab script before, but some pointer increments are used in the state update and coefficient selection. In C programming, pointer arithmetic is an efficient approach for array indexing. But, well, if you are not comfortable with it, feel free to implement it as standard array indexing, the same way used in Scilab. For this HDM transfer function with overall order of 7 , I don’t think the advantage of using pointer would be significant. I demonstrate it here for you to be familiar in case you might find its use in other digital signal processing applications.

Running the Algorithm

After successfully compile and download the algorithm to target, now we are ready to test the program. First the inputtype variable is set to PULSE to generate the pulse with magnitude 100 and duration 0.6 sec, the same input used in Scilab. The outputs are converted to ASCII message* and sent via UART1 port to a terminal program in PC. Figure 3 and 4 show the message being sent to HyperTerminal program.

*I write my own function, f2a(), to convert from a vairable of type double to ASCII format. Using standard C function like sprintf() causes a lot of overhead and hence not suitable to put in a timer routine with small sampling period.

Figure 3 capturing the data via UART1

Figure 4 strings of data in HyperTerminal program

What we want to do next is to compare the outputs with those computed by Scilab. In the future I plan to develop a tool using some serial port package in Scilab for easier data processing, but right now we have to do a little more work. Note that data is sent in a format suitable for matrix creation, with columns as input, ql, and qm, in that order. So I just save data in a text file, copy, and paste to a Scilab script in a bracket of matrixname = [ ];, and we’re done. The data from pulse input is now contained in pulseio.sce. Then I simulate the continuous and DFII in Scilab using same input. Figure 5 shows the comparison. We see a close match of the responses. For qm, the response from target processor tends to wiggle a bit. Future test with longer execution time is required to observe if this oscillation grows to instability.

Figure 5 response comparison from pulse input

We ‘d like to close this article by testing with another interesting input known as the Pseudo-Random Binary Sequence (PRBS). This is claimed in system identification literature as a rich input that could capture plant dynamics well. It is also very easy to generate.

First we show how to create an input superblock for PRBS generation. As shown in Figure 6, it consists of 13 delay stages, with feedback from some of the outputs XORed. I don’t want to go into detail about the theory of such structure since it deviates too much from our main discussion.

Figure 6 a PRBS generator created as superblock in Xcos

The output of this generator is shown in Figure 7. It is basically a 2-valued sequence with switching pattern that does not repeat itself until some specified points (determined by number of delays). The standard generator outputs values that switch between 0 and 1. We shift down the level and amplify to yield a signal that switches between 100 and -100.

Figure 7 PRBS signal generated from the diagram in Figure 6

The input superblock can then be used in Xcos simulation as in Figure 8.

Figure 8 HMD Xcos diagram with PRBS generator superblock as input

This PRBS is used as test input for the Xcos simulation, Scilab algorithm execution, and the algorithm implemented on the target processor. The procedure is the same as we did with pulse input. Notice how we create the PRBS in C code in the section if (inputtype == PRBS). That one uses standard array reference that seems easier for a non-programmer to understand. I don’t mind using pointer operation for this because in real application the input sequence can be generated outside the control loop.

Note: since each delay outputs only binary 0 or 1. An experienced programmer would be tempted to implement the whole delay line in just one 16-bit integer variable and use shift instructions and bitwise logic. Though memory efficient, that scheme would be hard for a general audience to understand.

Set inputtype variable to PRBS. Run the algorithm on target and capture data as before. Save it to prbsio.sce. Then plot all responses for comparison. The result is shown in Figure 9. The outputs from C and Scilab algorithms match almost perfectly. Both responses deviate slightly from continuous-time, but show similar trends.

Figure 9 response comparison from PRBS input


The experiment in this article might be considered a preliminary step for target implementation, since it bypasses the input and output modules of the microcontroller and observes only computational results from the algorithm executed by the target processor. Being a 16-bit microcontroller with fixed-point math units, the PIC24EP is no match to most sophisticated PC processors. Nevertheless, we see that the responses from computation are almost identical. Of course, this would not be the case when peripheral modules are involved, together with some added components such as low-pass filters. Things would be more challenging when this HDM plant is put into a feedback loop with a controller.

Related Files

  • All Scilab and Xcos models are contained in hilhdm3.zip
  • C source file : main.c (Require good fishing skill since this file is quite a mess. It has many unrelated parts. Many are commented out.)
  • Comments


Comments are closed.