COMPASS: Cross-Embodiment Mobility Policy via Residual RL and Skill Synthesis

Overview

This repository provides the official PyTorch implementation of COMPASS.


COMPASS is a novel framework for cross-embodiment mobility that combines:

  • Imitation Learning (IL) for strong baseline performance
  • Residual Reinforcement Learning (RL) for embodiment-specific adaptation
  • Policy distillation to create a unified, generalist policy

Quick Start

🚀 Get started in 3 steps:

  1. Install Isaac Lab and dependencies
  2. Train your own specialists or deploy on robots
  3. Generate data for GR00T post-training

Installation

📦 Complete Installation Guide (click to expand)

1. Isaac Lab Installation

  • Install Isaac Lab and the residual RL mobility extension by following these instructions.

2. Environment Setup

  • Create and activate a virtual environment:
    python3 -m venv venv
    source venv/bin/activate

3. Dependencies

  • Install the required packages:
    ${ISAACLAB_PATH}/isaaclab.sh -p -m pip install -r requirements.txt
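
  • Optionally, verify the setup with a minimal sanity check (the check_env.py file name is hypothetical; run it with ${ISAACLAB_PATH}/isaaclab.sh -p check_env.py):
    # check_env.py - hypothetical helper script.
    # Confirms that the Isaac Lab Python environment sees PyTorch and a CUDA-capable GPU.
    import torch

    print("PyTorch version:", torch.__version__)
    print("CUDA available :", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU            :", torch.cuda.get_device_name(0))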

4. X-Mobility Installation

5. Residual RL environment USDs

Usage

🤖 Training Residual RL Specialists (click to expand)

Residual RL Specialists

  • Train with the default configurations in configs/train_config.gin:

    ${ISAACLAB_PATH}/isaaclab.sh -p run.py \
        -c configs/train_config.gin \
        -o <output_dir> \
        -b <path/to/x_mobility_ckpt> \
        --enable_camera
  • Evaluate trained model:

    ${ISAACLAB_PATH}/isaaclab.sh -p run.py \
        -c configs/eval_config.gin \
        -o <output_dir> \
        -b <path/to/x_mobility_ckpt> \
        -p <path/to/residual_policy_ckpt> \
        --enable_camera \
        --video \
        --video_interval <video_interval>

NOTE: GPU memory usage in residual RL training is proportional to the number of environments. For example, 32 environments use around 30 GB of GPU memory, so reduce the number of environments if your GPU memory is limited.
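
As a rough sizing aid (an optional sketch, not part of the training scripts), the figure above works out to about 1 GB per environment, so you can estimate a workable environment count from your free GPU memory before editing configs/train_config.gin:

    # Rough heuristic only: ~30 GB for 32 environments, i.e. about 1 GB per environment.
    import torch

    free_bytes, total_bytes = torch.cuda.mem_get_info()
    gib = 1024 ** 3
    per_env_gib = 30 / 32  # derived from the note above; actual usage varies by scene and embodiment
    print(f"Free GPU memory: {free_bytes / gib:.1f} / {total_bytes / gib:.1f} GiB")
    print(f"Estimated max environments: {int((free_bytes / gib) / per_env_gib)}")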

🧠 Policy Distillation (click to expand)

Policy Distillation

  • Collect specialist data:

    • Update the specialist policy checkpoint paths in dataset_config_template
    • Run data collection (a sketch for inspecting the resulting dataset follows this section):
      ${ISAACLAB_PATH}/isaaclab.sh -p record.py \
          -c configs/distillation_dataset_config_template.yaml \
          -o <output_dir> \
          -b <path/to/x_mobility_ckpt> \
          --dataset-name <dataset_name>
  • Train generalist policy:

    python3 distillation_train.py \
        --config-files configs/distillation_config.gin \
        --dataset-path <path/to/specialists_dataset> \
        --output-dir <output_dir>
  • Evaluate generalist policy:

    ${ISAACLAB_PATH}/isaaclab.sh -p run.py \
        -c configs/eval_config.gin \
        -o <output_dir> \
        -b <path/to/x_mobility_ckpt> \
        -d <path/to/generalist_policy_ckpt> \
        --enable_camera \
        --video \
        --video_interval <video_interval>
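
The data collection step above writes the specialist rollouts to an HDF5 dataset. To sanity-check its contents before distillation, a minimal sketch (the file path is a placeholder and the internal group/dataset names depend on record.py):

    import h5py

    # List every dataset in the recorded HDF5 file with its shape and dtype.
    dataset_path = "specialists_dataset/specialist_data.hdf5"  # placeholder path

    def describe(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")

    with h5py.File(dataset_path, "r") as f:
        f.visititems(describe)
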
📤 Model Export (click to expand)

Model Export

  • Export RL specialist policy to ONNX or JIT formats:

    python3 onnx_conversion.py \
        -b <path/to/x_mobility_ckpt> \
        -r <path/to/residual_policy_ckpt> \
        -o <path/to/output_onnx_file> \
        -j <path/to/output_jit_file>
  • Export generalist policy to ONNX or JIT formats:

    python3 onnx_conversion.py \
        -b <path/to/x_mobility_ckpt> \
        -g <path/to/generalist_policy_ckpt> \
        -e <embodiment_type> \
        -o <path/to/output_onnx_file> \
        -j <path/to/output_jit_file>
  • Convert the ONNX to TensorRT:

    python3 trt_conversion.py -o <path/to/onnx_file> -t <path/to/trt_engine_file>
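
To verify an exported ONNX policy before TensorRT conversion, a minimal sketch using onnxruntime (the file name is a placeholder; input/output names and shapes depend on onnx_conversion.py):

    import onnxruntime as ort

    # Load the exported policy on CPU and print its input/output signature.
    session = ort.InferenceSession("compass_policy.onnx", providers=["CPUExecutionProvider"])

    for inp in session.get_inputs():
        print("input :", inp.name, inp.shape, inp.type)
    for out in session.get_outputs():
        print("output:", out.name, out.shape, out.type)
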
🔧 Add New Embodiment or Scene (click to expand)

Add New Embodiment or Scene

  • Follow these instructions to add a new embodiment or scene to the Isaac Lab RL environment.
  • Register the new embodiment or scene in the EmbodimentEnvCfgMap and EnvSceneAssetCfgMap in run.py (see the sketch below), then update the configs or use command-line arguments to select it.
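
For illustration only, the registration might look like the following sketch; the config class names are hypothetical stand-ins for the Isaac Lab env/scene configs created in the previous step, and the actual map structure in run.py may differ:

    from dataclasses import dataclass

    # Hypothetical config classes standing in for the real Isaac Lab configs.
    @dataclass
    class ForkliftEnvCfg:
        usd_path: str = "assets/forklift.usd"

    @dataclass
    class WarehouseSceneCfg:
        usd_path: str = "assets/warehouse.usd"

    # Sketch of the registration step in run.py, assuming the maps are plain
    # dictionaries keyed by the embodiment/scene name selected via config or CLI.
    EmbodimentEnvCfgMap = {"forklift": ForkliftEnvCfg}
    EnvSceneAssetCfgMap = {"warehouse": WarehouseSceneCfg}
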
🚀 ROS2 Deployment (click to expand)

ROS2 Deployment

To deploy COMPASS in Isaac Sim or on real robots using ROS2, please follow the detailed instructions in ros2_deployment/README.md. This guide covers containerized workflows and Isaac Sim integration.

📊 Logging Options (click to expand)

Logging

The training and evaluation scripts use TensorBoard for logging by default. Weights & Biases (W&B) logging is also supported for more advanced experiment tracking features.

To use TensorBoard (default):

  • Logs will be saved to <output_dir>/tensorboard/
  • View logs with: tensorboard --logdir=<output_dir>/tensorboard/
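
To pull logged scalars out of the TensorBoard event files programmatically (e.g., to export metrics for a report), a minimal sketch; the scalar tag names written by the training script are not fixed here, so inspect the tags reported by ea.Tags():

    from tensorboard.backend.event_processing import event_accumulator

    # Load the event files from the training run and list the available scalar tags.
    ea = event_accumulator.EventAccumulator("<output_dir>/tensorboard/")
    ea.Reload()
    tags = ea.Tags()["scalars"]
    print("scalar tags:", tags)

    # Example: dump the first scalar series, if any.
    if tags:
        for event in ea.Scalars(tags[0]):
            print(event.step, event.value)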

To use Weights & Biases:

  1. Install and set up W&B: pip install wandb and follow the setup instructions
  2. Log in to your W&B account: wandb login
  3. Add the --logger wandb flag to your command:
    ${ISAACLAB_PATH}/isaaclab.sh -p run.py \
        -c configs/train_config.gin \
        -o <output_dir> \
        -b <path/to/x_mobility_ckpt> \
        --enable_camera \
        --logger wandb \
        --wandb-run-name "experiment_name" \
        --wandb-project-name "project_name" \
        --wandb-entity-name "your_username_or_team"

GR00T Post-training with COMPASS Datasets

The COMPASS distillation datasets can also be used to train VLA models such as GR00T to enhance their navigation capabilities.

🤖 GR00T Post-training Steps (click to expand)

Step 1: Collect the datasets and convert to GR00T Lerobot format

Follow the steps described above in the "🤖 Training Residual RL Specialists" and "🧠 Policy Distillation" sections to train a specialist policy and generate the corresponding specialist datasets.

Use the following command to convert the distillation dataset from HDF5 to the GR00T Lerobot episodic format:

python scripts/hdf5_to_lerobot_episodic.py --hdf5-dir <path/to/hdf5/directory> --output-path <path/to/lerobot/format>

Step 2: Post-train the GR00T model

Once the dataset is converted, follow the post-training instructions provided in the GR00T repo. A ready-to-use navigation data configuration for post-training is available in this branch.

Step 3: Evaluate the post-trained GR00T model

To evaluate the post-trained GR00T model, first launch the inference server from within the GR00T repository by following the setup instructions provided there. Ensure that the data configuration matches the one used during training and that the port number is set to 8888.
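
Before starting the evaluation, you can quickly confirm that the inference server is reachable (a minimal sketch assuming the server runs on the same machine):

    import socket

    # Check whether anything is listening on localhost:8888 (the port assumed above).
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2.0)
        if s.connect_ex(("localhost", 8888)) == 0:
            print("GR00T inference server is reachable on port 8888.")
        else:
            print("Nothing is listening on port 8888 - start the inference server first.")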

Once the server is running, start the COMPASS evaluation with the following command:

${ISAACLAB_PATH}/isaaclab.sh -p run.py \
    -c configs/eval_config.gin \
    -o <output_dir> \
    -b <path/to/x_mobility_ckpt> \
    --enable_camera \
    --gr00t-policy

You can modify the evaluation parameters in the eval_config.gin file as needed.

License

COMPASS is released under the Apache License 2.0. See LICENSE for additional details.

Core Contributors

Wei Liu, Huihua Zhao, Chenran Li, Joydeep Biswas, Soha Pouya, Yan Chang

Acknowledgments

We would like to acknowledge the following projects, from which parts of the code in this repository are derived:

Citation

If you find this work useful in your research, please consider citing:

@article{liu2025compass,
  title={COMPASS: Cross-embodiment Mobility Policy via Residual RL and Skill Synthesis},
  author={Liu, Wei and Zhao, Huihua and Li, Chenran and Biswas, Joydeep and Pouya, Soha and Chang, Yan},
  journal={arXiv preprint arXiv:2502.16372},
  year={2025}
}
