🚗

Unified Planning for Autonomous Driving

Mar 23, 2025

Planning-Oriented Autonomous Driving

Authors

  • Yihan Hu 1,2, Jiazhi Yang 1, Li Chen 1, Keyu Li 1, Chonghao Sima 1, Xizhou Zhu 3,1
  • Siqi Chai 2, Senyao Du 2, Tianwei Lin 2, Wenhai Wang 1, Lewei Lu 3, Xiaosong Jia 1
  • Qiang Liu 2, Jifeng Dai 1, Yu Qiao 1, Hongyang Li 1

Affiliations

  • 1: OpenDriveLab and OpenGVLab, Shanghai AI Laboratory
  • 2: Wuhan University
  • 3: SenseTime Research

Overview

  • Framework: Unified Autonomous Driving (UniAD)
  • Objective: Improve autonomous driving systems with a planning-oriented design that integrates full-stack driving tasks into one network.
  • Key Themes: Perception, prediction, and planning as unified processes; focus on planning as the main objective.

Abstract

  • Autonomous driving systems traditionally split tasks into separate modules (perception, prediction, planning).
  • Issues: error accumulation across modules, misaligned optimization targets, and poor cross-task coordination.
  • Proposal: UniAD integrates tasks into a unified framework, focusing on planning.
  • Results show significant improvements in performance over standalone and multi-task learning models.

Introduction

  • Traditional designs either deploy standalone task modules in sequence or use multi-task learning with separate heads on a shared backbone.
  • Problems include information loss, error accumulation, and feature misalignment.
  • Multi-task learning (MTL) can cause negative transfer across tasks.
  • End-to-end approaches integrate perception, prediction, and planning, but prior work often coordinates the intermediate components poorly with respect to the planning objective.
  • UniAD is built around planning: a query-based design connects the task nodes so that each preceding module contributes to the final plan.

Methodology

  • Architecture: Transformer-decoder-based modules for perception and prediction feed into a planner at the end (a minimal code sketch of the query flow follows this list).
  • Feature Extraction: Multi-camera images transformed into a unified bird’s-eye-view (BEV) feature.
  • Modules:
    • TrackFormer: Detects and tracks agents across frames.
    • MapFormer: Segments road elements (e.g., lanes, dividers, drivable area) with panoptic segmentation.
    • MotionFormer: Predicts future trajectories of agents.
    • OccFormer: Predicts future occupancies with agent identities.
    • Planner: Uses ego-vehicle query to plan safe trajectories.
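
The query-based flow between these modules can be pictured with a short PyTorch sketch. Everything below (module internals, query counts, feature dimensions, the `UniADSketch` and `QueryModule` names) is a hypothetical stand-in to show how a BEV feature map and a chain of query sets could connect tracking, mapping, motion forecasting, and planning; it is not the authors' implementation.

```python
# Hypothetical sketch of a UniAD-style query flow (module internals, names, and
# shapes are illustrative stand-ins, not the authors' implementation).
import torch
import torch.nn as nn


class QueryModule(nn.Module):
    """Generic decoder block: a query set cross-attends to a context tensor."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, dim * 4), nn.ReLU(), nn.Linear(dim * 4, dim))

    def forward(self, queries: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(queries, context, context)  # cross-attention
        return queries + self.ffn(queries + attended)       # residual update


class UniADSketch(nn.Module):
    """Toy pipeline: BEV -> TrackFormer -> MapFormer -> MotionFormer -> Planner."""

    def __init__(self, dim: int = 256, n_agents: int = 50, n_map: int = 30, horizon: int = 6):
        super().__init__()
        self.track_queries = nn.Parameter(torch.randn(n_agents, dim))  # agent queries
        self.map_queries = nn.Parameter(torch.randn(n_map, dim))       # road-element queries
        self.ego_query = nn.Parameter(torch.randn(1, dim))             # ego-vehicle query
        self.trackformer = QueryModule(dim)
        self.mapformer = QueryModule(dim)
        self.motionformer = QueryModule(dim)
        self.planner = QueryModule(dim)
        self.traj_head = nn.Linear(dim, horizon * 2)                   # (x, y) waypoints
        self.horizon = horizon

    def forward(self, bev: torch.Tensor) -> torch.Tensor:
        # bev: (B, H*W, dim) flattened bird's-eye-view features from the image backbone
        B = bev.shape[0]
        agents = self.trackformer(self.track_queries.expand(B, -1, -1), bev)
        roads = self.mapformer(self.map_queries.expand(B, -1, -1), bev)
        # Motion queries read the scene assembled from agents, map elements, and BEV
        motion = self.motionformer(agents, torch.cat([agents, roads, bev], dim=1))
        # The ego query attends to everything upstream before decoding a plan
        ego = self.planner(self.ego_query.expand(B, -1, -1),
                           torch.cat([motion, roads, bev], dim=1))
        return self.traj_head(ego).view(B, self.horizon, 2)


# Usage on a downsampled 50x50 BEV grid: returns a (2, 6, 2) ego trajectory.
plan = UniADSketch()(torch.randn(2, 50 * 50, 256))
```
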

Perception

  • TrackFormer: Performs detection and multi-object tracking jointly, end to end, with no non-differentiable post-processing for track association (see the track-query sketch below).
  • MapFormer: Provides a sparse, query-based representation of road elements that downstream motion forecasting can attend to.
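
To make the "no post-processing" point concrete, here is a minimal, assumption-laden sketch of track-query propagation: queries that survive a confidence threshold in one frame are fed back as the next frame's track queries, so identity is carried by the query itself rather than by a separate association step. The class name, threshold, and heads are illustrative, not the paper's exact recipe.

```python
# Minimal sketch of track-query propagation (the class name, confidence head,
# and keep-threshold are illustrative assumptions, not the paper's exact recipe).
from typing import Optional

import torch
import torch.nn as nn


class TrackFormerSketch(nn.Module):
    def __init__(self, dim: int = 256, n_det: int = 100, keep_thresh: float = 0.5):
        super().__init__()
        self.det_queries = nn.Parameter(torch.randn(n_det, dim))   # detect newborn agents
        self.decoder = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.score_head = nn.Linear(dim, 1)                        # objectness per query
        self.keep_thresh = keep_thresh

    def forward(self, bev: torch.Tensor, track_queries: Optional[torch.Tensor] = None):
        # bev: (1, H*W, dim); track_queries: (1, N_track, dim) carried over from the last frame
        det = self.det_queries.unsqueeze(0)
        queries = det if track_queries is None else torch.cat([track_queries, det], dim=1)
        updated, _ = self.decoder(queries, bev, bev)               # attend to current-frame BEV
        scores = self.score_head(updated).sigmoid().squeeze(-1)    # (1, N) confidences
        keep = scores[0] > self.keep_thresh
        return updated[:, keep]                                    # survivors become next frame's tracks


tracker = TrackFormerSketch()
tracks = None
for frame_bev in torch.randn(3, 1, 50 * 50, 256):                 # three toy frames
    tracks = tracker(frame_bev, tracks)                           # identity rides along with each query
```
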

Prediction

  • MotionFormer: Predicts multimodal future trajectories for all agents jointly with a transformer decoder (a decoding sketch follows this list).
  • Occupancy Prediction: OccFormer rolls out future occupancy maps via scene-agent interaction while preserving agent identities, yielding instance-level occupancy.
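
A sketch of the multimodal decoding idea: each tracked agent is paired with K learnable mode embeddings, the combined queries attend to scene context, and a head decodes K candidate trajectories plus per-mode scores. Shapes, names, and the number of modes are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative sketch of multimodal trajectory decoding: each tracked agent is paired
# with K mode embeddings, the joint queries attend to scene context, and a head decodes
# K candidate futures plus per-mode scores. Names and numbers are assumptions.
import torch
import torch.nn as nn


class MotionFormerSketch(nn.Module):
    def __init__(self, dim: int = 256, n_modes: int = 6, horizon: int = 12):
        super().__init__()
        self.mode_emb = nn.Parameter(torch.randn(n_modes, dim))      # one embedding per mode
        self.decoder = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.traj_head = nn.Linear(dim, horizon * 2)                 # T future (x, y) points
        self.score_head = nn.Linear(dim, 1)                          # confidence per mode
        self.n_modes, self.horizon = n_modes, horizon

    def forward(self, agent_queries: torch.Tensor, scene: torch.Tensor):
        # agent_queries: (B, N, dim) from tracking; scene: (B, M, dim) agent + map + BEV context
        B, N, D = agent_queries.shape
        # Pair every agent with every mode: (B, N*K, dim)
        q = (agent_queries.unsqueeze(2) + self.mode_emb).reshape(B, N * self.n_modes, D)
        q, _ = self.decoder(q, scene, scene)
        trajs = self.traj_head(q).view(B, N, self.n_modes, self.horizon, 2)
        scores = self.score_head(q).view(B, N, self.n_modes).softmax(dim=-1)
        return trajs, scores                                         # K futures + mode probabilities


# 50 agents, 2500 scene tokens -> trajs: (2, 50, 6, 12, 2), scores: (2, 50, 6)
trajs, scores = MotionFormerSketch()(torch.randn(2, 50, 256), torch.randn(2, 2500, 256))
```
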

Planning

  • Planner: Converts the high-level navigation command into a learned embedding that conditions the ego-vehicle query used to regress a trajectory.
  • Optimization: Refines the regressed trajectory with non-linear optimization against the predicted occupancy so the final plan stays collision-free despite upstream perceptual uncertainty (a toy refinement sketch follows this list).
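
The two planning steps can be sketched as (1) a command-conditioned ego query decoded into waypoints and (2) a refinement loop that trades off staying close to the regressed plan against staying away from predicted obstacles. The refinement below uses plain gradient descent on a toy cost; the paper's actual solver and cost terms are not reproduced here, and every name is illustrative.

```python
# Hedged sketch of command-conditioned planning plus collision-aware refinement.
# The refinement uses plain gradient descent on a toy cost (stay near the regressed
# plan, stay away from obstacle centers); the paper's solver and cost terms are not
# reproduced here, and every name below is illustrative.
import torch
import torch.nn as nn


class PlannerSketch(nn.Module):
    def __init__(self, dim: int = 256, n_commands: int = 3, horizon: int = 6):
        super().__init__()
        self.cmd_emb = nn.Embedding(n_commands, dim)           # e.g. left / straight / right
        self.decoder = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.traj_head = nn.Linear(dim, horizon * 2)
        self.horizon = horizon

    def forward(self, ego_query: torch.Tensor, scene: torch.Tensor, command: torch.Tensor):
        # ego_query: (B, 1, dim); scene: (B, M, dim); command: (B,) integer ids
        q = ego_query + self.cmd_emb(command).unsqueeze(1)     # condition the ego query
        q, _ = self.decoder(q, scene, scene)
        return self.traj_head(q).view(-1, self.horizon, 2)     # (B, T, 2) waypoints


def refine(plan: torch.Tensor, obstacles: torch.Tensor, steps: int = 50, lr: float = 0.05):
    """Nudge waypoints away from obstacle centers while staying close to the plan."""
    plan = plan.detach()                                       # reference trajectory, fixed
    traj = plan.clone().requires_grad_(True)
    for _ in range(steps):
        dist = torch.cdist(traj, obstacles)                    # (B, T, n_obstacles)
        cost = ((traj - plan) ** 2).sum() + torch.relu(2.0 - dist).pow(2).sum()
        grad, = torch.autograd.grad(cost, traj)
        traj = (traj - lr * grad).detach().requires_grad_(True)
    return traj.detach()


planner = PlannerSketch()
plan = planner(torch.randn(2, 1, 256), torch.randn(2, 2500, 256), torch.tensor([0, 2]))
safe_plan = refine(plan, obstacles=torch.randn(2, 10, 2) * 5)  # obstacle centers, e.g. from occupancy
```
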

Experiments

  • Conducted on nuScenes dataset.
  • Joint Results: Task coordination improves planning and safety.
  • Perception Results: Outperforms previous methods in tracking and mapping.
  • Prediction Results: Significant improvement in motion forecasting and occupancy prediction.
  • Planning Results: Lower collision rates and better trajectory planning.

Qualitative Results and Ablation Studies

  • Visualization shows integration of tasks in complex driving scenarios.
  • Ablation studies confirm effectiveness of specific design choices, such as scene-level anchors and attention mechanisms.

Conclusions and Future Work

  • UniAD demonstrates significant advantages of planning-oriented design.
  • Future work includes lightweight system development and integration of additional tasks like depth estimation and behavior prediction.

Acknowledgements

  • Supported by National Key R&D Program of China, Shanghai Committee of Science and Technology, and NSFC.

References

  • Includes key papers on autonomous driving, perception, prediction, and planning methodologies.