Nonlinear Trajectory Optimization and DDP Methods

Jul 16, 2024

Review of DDP (Differential Dynamic Programming)

  • Wrapped up the DDP discussion with some implementation details:
    • Backward pass: Strategy for propagating derivatives
    • Line search: Techniques for ensuring convergence
    • Constraints in DDP:
      • For control limits: ad hoc fixes, e.g. clamping the controls or passing them through a squashing function
      • For state constraints: wrap DDP with an augmented Lagrangian method (AL)
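The line search bullet above can be illustrated with a simple backtracking scheme. This is a minimal sketch, not the lecture's implementation: `rollout_cost` is a hypothetical callback assumed to run the forward pass with the feedforward term scaled by `alpha` and return the trajectory cost, and `expected_decrease` is assumed to be the positive cost reduction predicted by the backward pass for a full step.

```python
def backtracking_line_search(rollout_cost, current_cost, expected_decrease,
                             shrink=0.5, armijo=1e-4, max_iters=20):
    """Return (alpha, new_cost) for the first step that decreases the cost enough.

    All argument names are illustrative assumptions, not a library API.
    """
    alpha = 1.0
    for _ in range(max_iters):
        new_cost = rollout_cost(alpha)
        # Armijo condition: require a fraction of the predicted decrease.
        if new_cost <= current_cost - armijo * alpha * expected_decrease:
            return alpha, new_cost
        alpha *= shrink  # step was too aggressive, try a shorter one
    raise RuntimeError("line search did not find a step with sufficient decrease")


# Toy stand-in for a forward rollout: a scalar "trajectory" x moved along d.
x, d = 1.0, -3.0


def toy_cost(alpha):
    return (x + alpha * d) ** 2


# The linear model predicts a decrease of -(2 * x) * d = 6 for a full step.
alpha, cost = backtracking_line_search(toy_cost, current_cost=x ** 2,
                                       expected_decrease=6.0)
```

In the toy problem the full step overshoots and increases the cost, so the search halves the step once before accepting it.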

Minimum Time Problems

  • Applications: autonomous drone racing, race cars, minimum time-to-climb for airplanes
  • Goal: Minimize time to reach a goal state
  • Problem formulation: Minimize over the trajectory and the final time
    • Cost function: $J = \int_{t_0}^{t_{final}} 1 \, dt = t_{final} - t_0$
    • Constraints: Dynamics, torque limits, etc.
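
Written out in full, the problem above might look like the following. This is a sketch under assumed notation: the dynamics $f$, the boundary states $x_0$ and $x_{goal}$, the control bounds $u_{min}$, $u_{max}$, and the start time $t_0$ are symbols introduced here for illustration.

$$
\begin{aligned}
\min_{x(t),\,u(t),\,t_{final}} \quad & \int_{t_0}^{t_{final}} 1 \, dt \\
\text{s.t.} \quad & \dot{x}(t) = f(x(t), u(t)) \\
& x(t_0) = x_0, \quad x(t_{final}) = x_{goal} \\
& u_{min} \le u(t) \le u_{max}
\end{aligned}
$$

The final time appears both in the cost and as a decision variable, which is what distinguishes this from a fixed-horizon problem.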

Discretization Strategy

  • Use a fixed number of knot points
  • Make the time steps (Δt) decision variables by treating them as additional control inputs
  • Bound each Δt from below so that time steps stay positive, and from above so that the solver cannot exploit large steps where the discretized dynamics no longer match the real physics
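
A minimal sketch of this idea, assuming a double integrator with explicit Euler integration (both chosen here only for brevity): each knot point carries its own step `h` next to the acceleration command, and the cost is the sum of the steps. The bounds `h_min` and `h_max` are illustrative values; in an actual NLP they would be passed to the solver as variable bounds rather than checked with an exception.

```python
def rollout_with_time_steps(x0, controls, h_min=0.01, h_max=0.2):
    """Roll out a double integrator where every step has its own duration.

    x0:       (position, velocity) at the first knot point
    controls: list of (acceleration, h) pairs, one per interval
    Returns the list of states and the total elapsed time (the cost).
    """
    pos, vel = x0
    total_time = 0.0
    traj = [(pos, vel)]
    for accel, h in controls:
        # Keep steps positive and small enough for the integrator to be trusted.
        if not (h_min <= h <= h_max):
            raise ValueError(f"time step {h} outside [{h_min}, {h_max}]")
        # Explicit Euler: the right-hand side uses the old pos and vel.
        pos, vel = pos + h * vel, vel + h * accel
        total_time += h
        traj.append((pos, vel))
    return traj, total_time


traj, total_time = rollout_with_time_steps((0.0, 0.0), [(1.0, 0.1), (1.0, 0.2)])
```

Without the upper bound, a solver could pick a few very large steps and reach the goal in a way the integrator permits but the real system does not.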

Nonlinear Trajectory Optimization: Direct Methods

  • Types of algorithms:
    • Indirect methods (leverage dynamic programming, optimal control ideas): iLQR, DDP
    • Direct methods: Discretize problem and solve as a nonlinear programming problem (NLP)
  • Advantages of direct methods: Flexible; can use any off-the-shelf NLP solver such as IPOPT, SNOPT, or KNITRO
  • Sequential Quadratic Programming (SQP):
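    • Solves a nonlinear program by iteratively solving a sequence of quadratic programs (QPs)
    • Each QP uses a second-order Taylor expansion of the cost (more precisely, of the Lagrangian) together with a linearization of the constraints
    • Can be viewed as Newton's method generalized to constrained optimization, so it converges quickly near a solution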
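
To make the SQP bullets concrete, here is a minimal sketch on a toy problem chosen for illustration: minimize $x_0 + x_1$ subject to $x_0^2 + x_1^2 = 2$. It takes full Newton steps on the KKT conditions with no line search and no inequality handling, both of which production solvers such as SNOPT add. The function names and the small dense solver are assumptions made to keep the example dependency-free.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def sqp_equality(x, lam, iters=10):
    """Newton iterations on the KKT system of the toy problem."""
    for _ in range(iters):
        x0, x1 = x
        grad_f = [1.0, 1.0]               # gradient of the cost x0 + x1
        c = x0 ** 2 + x1 ** 2 - 2.0       # constraint value
        A = [2.0 * x0, 2.0 * x1]          # constraint Jacobian
        h = 2.0 * lam                     # Lagrangian Hessian is h * identity
        # QP optimality conditions: [H A^T; A 0] [dx; lam_new] = [-grad_f; -c]
        K = [[h, 0.0, A[0]],
             [0.0, h, A[1]],
             [A[0], A[1], 0.0]]
        dx0, dx1, lam = solve_linear(K, [-grad_f[0], -grad_f[1], -c])
        x = [x0 + dx0, x1 + dx1]
    return x, lam


x_star, lam_star = sqp_equality([-1.5, -0.5], lam=1.0)
```

Started close to the solution, the iterates approach $x = (-1, -1)$ with multiplier $0.5$; from a poor initial guess a globalization strategy such as a line search would be needed.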

Direct Collocation Methods

  • Strategy: Represent trajectories using polynomial splines, enforce dynamics constraints via spline derivatives
  • Common choices: Cubic splines for states and piecewise linear for controls
  • Hermite-Simpson integration: Implicit method that provides third-order integration accuracy
    • Dynamics evaluations at the knot points are shared between neighboring intervals, so each constraint costs on average only two dynamics evaluations per time step
  • Implementation:
    • Use IPOPT or other NLP solvers for solving optimal control problems
    • Can be started from dynamically infeasible initial guesses, which is useful when seeding a direct method with a path from a sampling-based planner
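
The Hermite-Simpson constraint described above can be sketched as a defect function that an NLP solver would drive to zero for every pair of neighboring knot points. This is an illustrative version using plain lists; the dynamics callback `f(x, u)` and the double integrator used to exercise it are assumptions, not the lecture's code.

```python
def hermite_simpson_defect(f, x_k, u_k, x_next, u_next, h):
    """Return the Hermite-Simpson defect for one interval of length h.

    f(x, u) returns the state derivative as a list. The defect is zero when
    the cubic spline through (x_k, x_next) satisfies the dynamics at the
    interval midpoint.
    """
    f_k = f(x_k, u_k)
    f_next = f(x_next, u_next)
    # Midpoint state from the cubic Hermite spline through the two knots.
    x_mid = [0.5 * (a + b) + (h / 8.0) * (fa - fb)
             for a, b, fa, fb in zip(x_k, x_next, f_k, f_next)]
    # Piecewise-linear control: midpoint is the average of the endpoints.
    u_mid = [0.5 * (a + b) for a, b in zip(u_k, u_next)]
    f_mid = f(x_mid, u_mid)
    # Simpson's rule applied to the integral of the dynamics over the interval.
    return [b - a - (h / 6.0) * (fa + 4.0 * fm + fb)
            for a, b, fa, fm, fb in zip(x_k, x_next, f_k, f_mid, f_next)]


def double_integrator(x, u):
    """Toy dynamics: x = [position, velocity], u = [acceleration]."""
    return [x[1], u[0]]


# Exact motion under unit acceleration for 0.5 s, starting from rest.
defect = hermite_simpson_defect(double_integrator,
                                [0.0, 0.0], [1.0], [0.125, 0.5], [1.0], 0.5)
```

In a full transcription, `f_k` and `f_next` would be cached and reused by the neighboring intervals, which is where the saving in dynamics evaluations comes from.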

Example:

  • Implementing direct collocation for an Acrobot
  • Use Hermite-Simpson collocation to enforce the dynamics constraints
  • Example code demonstrates setting up and solving a nonlinear trajectory optimization problem with IPOPT
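
Before handing such a problem to an NLP solver, the states and controls at all knot points have to be stacked into one decision vector. The sketch below shows one possible layout, `[x_0, u_0, x_1, u_1, ...]`, plus the start and goal constraints; the layout and the helper names are assumptions, and the lecture's example code may organize things differently. For an Acrobot the state has four entries (two joint angles, two joint velocities) and the control has one (the elbow torque).

```python
def unpack(z, num_knots, nx, nu):
    """Split the flat decision vector z into per-knot states and controls."""
    stride = nx + nu
    if len(z) != num_knots * stride:
        raise ValueError("decision vector has the wrong length")
    xs = [z[k * stride: k * stride + nx] for k in range(num_knots)]
    us = [z[k * stride + nx: (k + 1) * stride] for k in range(num_knots)]
    return xs, us


def boundary_residual(z, num_knots, nx, nu, x_init, x_goal):
    """Equality constraints pinning the first and last state (zero when met)."""
    xs, _ = unpack(z, num_knots, nx, nu)
    start = [a - b for a, b in zip(xs[0], x_init)]
    end = [a - b for a, b in zip(xs[-1], x_goal)]
    return start + end


# Three knot points with Acrobot dimensions: nx = 4, nu = 1.
z = [float(i) for i in range(15)]
xs, us = unpack(z, num_knots=3, nx=4, nu=1)
residual = boundary_residual(z, 3, 4, 1,
                             x_init=[0.0, 1.0, 2.0, 3.0],
                             x_goal=[10.0, 11.0, 12.0, 13.0])
```

The solver would receive this residual, stacked with the collocation constraints for each interval, as its equality constraint vector.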

Applications

  • Robotics applications where the dynamics are jerky or non-smooth
  • Use spline approaches for complex trajectory optimizations
  • Collocation methods useful for efficiently solving for feasible trajectories

Summary

  • Understanding tradeoffs between indirect and direct methods
  • Practical strategies for implementing and solving complex trajectory optimization problems