Jiyu Wang’s MSc Thesis Defence

Title: “Behavior shift to altered physics law of standing: a prediction from the Reinforcement Learning Controller of postural control”

Thesis Supervisor: Dr. Jean-Sébastien Blouin
Committee members: Dr. Calvin Kuo, Dr. Romeo Chua
Chair: Dr. Kayla Fewster

Abstract: 

Background: Computational models help us understand how the brain controls movement and enable testable predictions. Models of standing balance typically follow optimal feedback control principles, which optimize a set of well-defined rules or cost functions (Todorov, 2004). A question central to our understanding of postural control is the overall goal of standing balance. Current opinions on the topic diverge: researchers have argued that minimizing either movement variability or overall exerted torque could be the goal of balance control. Recent findings suggest that variability in motor commands is not always detrimental and may help to explore the environment and facilitate motor learning (Wu et al., 2014). However, most commonly used models, including Proportional-Integral-Derivative controllers and Linear Quadratic Regulators, rely on minimization of movement variability (Van der Kooij et al., 1999; Lockhart & Ting, 2017). To reconcile the conflicting roles of movement variability, we modeled human postural control with the Markov Decision Process (MDP) framework, which searches for the decision that minimizes the defined cost functions while exploring all possible actions and their consequences.

Objectives: The purposes of the thesis were to (1) model standing balance control using the MDP framework and identify the parameter combinations that best represent physiological characteristics, and (2) test predictions from the model using a custom-designed robotic balancing platform and altered standing balance dynamics.

Methods: The biomechanics of standing were represented as an inverted pendulum, and the Q-learning algorithm was applied to solve the control problem. Grid searches were performed to evaluate the model by comparing the range and root mean square (RMS) of the angle time series, the mean power frequency (MPF), and the 99% power bandwidth with previously reported data. In the experiment, participants (n = 3) were asked to balance on the custom robotic balancing platform during perturbations in which torque bias terms were added. The exerted torque and body angles were recorded and analyzed.
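
For readers unfamiliar with the approach, the following is a minimal sketch of tabular Q-learning applied to an inverted pendulum. It is not the thesis implementation: the body parameters, torque set, state discretization, reward (the negative of a squared-angle cost, with a penalty for falling), and learning settings are all hypothetical values chosen for illustration.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the thesis.
MASS, HEIGHT, GRAVITY = 70.0, 1.0, 9.81     # kg, m (centre-of-mass height), m/s^2
INERTIA = MASS * HEIGHT ** 2                # point mass rotating about the ankle
DT = 0.01                                   # integration step, s
TORQUES = np.linspace(-90.0, 90.0, 7)       # discrete ankle torques, N*m
ANGLE_LIMIT = np.deg2rad(6.0)               # beyond this angle the model "falls"
ANGLE_BINS = np.linspace(-ANGLE_LIMIT, ANGLE_LIMIT, 13)
VEL_BINS = np.linspace(-0.3, 0.3, 13)       # rad/s


def step(angle, velocity, torque):
    """One Euler step of I * acc = m * g * h * sin(angle) + torque."""
    acc = (MASS * GRAVITY * HEIGHT * np.sin(angle) + torque) / INERTIA
    velocity = velocity + acc * DT
    angle = angle + velocity * DT
    return angle, velocity


def discretize(angle, velocity):
    """Map the continuous state to a pair of bin indices."""
    return int(np.digitize(angle, ANGLE_BINS)), int(np.digitize(velocity, VEL_BINS))


def train(episodes=200, max_steps=500, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; returns the learned Q-table."""
    rng = np.random.default_rng(seed)
    q = np.zeros((len(ANGLE_BINS) + 1, len(VEL_BINS) + 1, len(TORQUES)))
    for _episode in range(episodes):
        angle, velocity = rng.uniform(-0.02, 0.02), 0.0
        for _t in range(max_steps):
            s = discretize(angle, velocity)
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = int(rng.integers(len(TORQUES)))
            else:
                a = int(np.argmax(q[s]))
            angle, velocity = step(angle, velocity, TORQUES[a])
            fallen = abs(angle) > ANGLE_LIMIT
            # Reward is the negative cost: squared angle, or a large penalty on a fall.
            reward = -100.0 if fallen else -(angle ** 2)
            s_next = discretize(angle, velocity)
            target = reward if fallen else reward + gamma * np.max(q[s_next])
            # Q-learning update toward the bootstrapped target.
            q[s + (a,)] += alpha * (target - q[s + (a,)])
            if fallen:
                break
    return q
```

Calling `train()` returns a Q-table indexed by angle bin, velocity bin, and torque index; a greedy policy then selects `TORQUES[np.argmax(q[state])]`. Replacing the squared-angle term with a squared-torque term would express the alternative goal of minimizing exerted torque.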

Results: Of the 3520 simulations obtained from the grid searches, 1497 successfully learnt to balance. Twenty-nine simulations fell within the limits for MPF and 99% power bandwidth, but none of them fell within the limits for range and RMS. In the experiment, one participant maintained exerted torque under the altered dynamics relative to normal standing, whereas two participants showed a mixed strategy of maintaining both torque and body angle.
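
The four measures used to screen the simulations can be computed from an angle time series roughly as sketched below. This is an illustrative version under stated assumptions, not the analysis code used in the thesis: the function name `sway_metrics` is hypothetical, the spectrum is a plain periodogram of the mean-removed signal, and the 99% power bandwidth is taken to be the frequency below which 99% of the total power lies.

```python
import numpy as np


def sway_metrics(angle, fs):
    """Range, RMS, mean power frequency, and 99% power bandwidth.

    angle: 1-D angle time series; fs: sampling rate in Hz.
    Assumes the signal is not constant (total power must be non-zero).
    """
    angle = np.asarray(angle, dtype=float)
    centred = angle - angle.mean()              # remove the mean lean angle

    sway_range = angle.max() - angle.min()
    rms = np.sqrt(np.mean(centred ** 2))

    # Periodogram of the mean-removed signal (assumed spectral estimate).
    power = np.abs(np.fft.rfft(centred)) ** 2
    freqs = np.fft.rfftfreq(centred.size, d=1.0 / fs)
    total = power.sum()

    # Mean power frequency: power-weighted average frequency.
    mpf = np.sum(freqs * power) / total

    # 99% power bandwidth: first frequency at which cumulative power reaches 99%.
    cumulative = np.cumsum(power) / total
    bandwidth_99 = freqs[np.searchsorted(cumulative, 0.99)]

    return {"range": sway_range, "rms": rms, "mpf": mpf, "bandwidth_99": bandwidth_99}
```

A simulation would then count as a match for a given measure when its value falls inside the interval reported for human standing; those intervals are not reproduced here.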

Conclusions: The MDP model can generate behavior close to human balance control given specific parameters. The experimental results showed different behavioral patterns among participants under altered standing dynamics, providing inconclusive evidence as to whether the goal of standing is to minimize torque or movement variability.