English | 中文
- News
- Why RM-Gallery?
- Installation
- RM Gallery Walkthrough
- Documentation
- Contribute
- Citation
- [2025-07-09] We release RM Gallery v0.1.0 now, which is also available in PyPI!
RM-Gallery is a one-stop platform for training, building and applying reward models. It provides a comprehensive solution for implementing reward models at both task-level and atomic-level, with high-throughput and fault-tolerant capabilities.
- Integrated RM Training Pipeline: Provides an RL-based framework for training reasoning reward models, compatible with popular frameworks (e.g., verl), and offers examples for integrating RM-Gallery into the framework.
*The RM training pipeline improves accuracy on RM-Bench.*
- Unified Reward Model Architecture: Flexible implementation of reward models through standardized interfaces, supporting various architectures (model-based/model-free), reward formats (scalar/critique), and scoring patterns (pointwise/listwise/pairwise).
- Comprehensive RM Gallery: Provides a rich collection of ready-to-use reward model instances for diverse tasks (e.g., math, coding, preference alignment) at both the task level (RMComposition) and the component level (RewardModel). Users can directly apply an RMComposition/RewardModel to a specific task or assemble a custom RMComposition from component-level RewardModels.
- Rubric-Critic-Score Paradigm: Adopts the Rubric+Critic+Score reasoning reward model paradigm, with best practices to help users generate rubrics from limited preference data.
- Multiple Usage Scenarios: Covers multiple reward model (RM) usage scenarios with detailed best practices, including Training with Rewards (e.g., post-training) and Inference with Rewards (e.g., Best-of-N, data correction).
- High-Performance RM Serving: Leverages the New API platform to deliver high-throughput, fault-tolerant reward model serving, enhancing feedback efficiency.
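The pointwise/listwise/pairwise distinction above can be sketched with stand-in interfaces. This is a minimal illustration of the three scoring patterns; the class and method names here are hypothetical, not RM-Gallery's actual API (see the docs for the real base classes):

```python
from abc import ABC, abstractmethod

# Illustrative stand-ins for the three scoring patterns; names are
# hypothetical and do not mirror RM-Gallery's real interfaces.
class PointwiseReward(ABC):
    """Scores a single response independently."""
    @abstractmethod
    def score(self, prompt: str, response: str) -> float: ...

class PairwiseReward(ABC):
    """Compares exactly two responses; returns the preferred index (0 or 1)."""
    @abstractmethod
    def compare(self, prompt: str, a: str, b: str) -> int: ...

class ListwiseReward(ABC):
    """Ranks a whole candidate list; returns indices from best to worst."""
    @abstractmethod
    def rank(self, prompt: str, responses: list[str]) -> list[int]: ...

# Trivial pointwise example: prefer concise responses.
class LengthReward(PointwiseReward):
    def score(self, prompt: str, response: str) -> float:
        return 1.0 / (1.0 + len(response))
```

A pairwise or listwise model can always be derived from a pointwise one (compare or sort by score), but not vice versa, which is why the architecture keeps the patterns distinct.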
RM-Gallery requires Python >= 3.10 and < 3.13.
```shell
# Pull the source code from GitHub
git clone https://github.com/modelscope/RM-Gallery.git
cd RM-Gallery

# Install the package
pip install .
```

Alternatively, install the latest release from PyPI:

```shell
pip install rm-gallery
```

```python
from rm_gallery.core.reward.registry import RewardRegistry

# 1. Choose a pre-built reward model
rm = RewardRegistry.get("safety_listwise_reward")

# 2. Prepare your data
from rm_gallery.core.data.schema import DataSample

sample = DataSample(...)  # See docs for details

# 3. Evaluate
result = rm.evaluate(sample)
print(result)
```

That's it!
- 5-Minute Quickstart Guide - Get started in minutes
- Interactive Notebooks - Try it hands-on
Choose from 35+ pre-built reward models or create your own:
```python
from rm_gallery.core.reward.registry import RewardRegistry

# Use pre-built models
rm = RewardRegistry.get("math_correctness_reward")
rm = RewardRegistry.get("code_quality_reward")
rm = RewardRegistry.get("helpfulness_listwise_reward")

# Or build custom models
class CustomReward(BasePointWiseReward):
    def _evaluate(self, sample, **kwargs):
        # Your custom logic here
        return RewardResult(...)
```

See all available reward models →
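To make the custom-model skeleton concrete, here is a self-contained toy version of the same pointwise pattern: a reward that scores the fraction of required keywords present in a response. `BasePointWiseReward` and `RewardResult` are stubbed locally as assumptions, so consult the docs for the real import paths and signatures:

```python
from dataclasses import dataclass, field

# Local stand-ins for RM-Gallery's base classes; the real import paths
# and signatures may differ -- see the API reference.
@dataclass
class RewardResult:
    score: float
    details: dict = field(default_factory=dict)

class BasePointWiseReward:
    def evaluate(self, sample, **kwargs):
        return self._evaluate(sample, **kwargs)

# A toy custom reward: fraction of required keywords found in the response.
class KeywordCoverageReward(BasePointWiseReward):
    def __init__(self, keywords):
        self.keywords = [k.lower() for k in keywords]

    def _evaluate(self, sample, **kwargs):
        text = sample["response"].lower()
        hits = [k for k in self.keywords if k in text]
        return RewardResult(
            score=len(hits) / len(self.keywords),
            details={"matched": hits},
        )

rm = KeywordCoverageReward(["reward", "model"])
result = rm.evaluate({"response": "A reward model scores responses."})
print(result.score)  # 1.0
```

The same shape carries over to a real subclass: implement `_evaluate` over a sample and return a result object with a score plus any diagnostic details.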
Train your own reward models with the VERL framework:
```shell
# Prepare data and launch training
cd examples/train/pointwise
./run_pointwise.sh
```

Training guide →
Test your models on standard benchmarks:
- RewardBench2 - Latest reward model benchmark
- RM-Bench - Comprehensive evaluation
- Conflict Detector - Detect evaluation conflicts
- JudgeBench - Judge capability evaluation
Evaluation guide →
- Best-of-N Selection - Choose the best from multiple responses
- Data Refinement - Improve data quality with reward feedback
- Post Training (RLHF) - Integrate with reinforcement learning
- High-Performance Serving - Deploy as scalable service
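Best-of-N selection from the list above reduces to scoring every candidate and keeping the argmax. A minimal sketch, with a hypothetical stand-in reward function where any pointwise RM would slot in:

```python
from typing import Callable

def best_of_n(prompt: str, candidates: list[str],
              reward_fn: Callable[[str, str], float]) -> str:
    """Score every candidate with the reward function and return the best one."""
    return max(candidates, key=lambda c: reward_fn(prompt, c))

# Stand-in reward (illustrative only): count prompt words echoed in the response.
def toy_reward(prompt: str, response: str) -> float:
    return sum(word in response.lower() for word in prompt.lower().split())

best = best_of_n(
    "explain reward models",
    ["I don't know.", "Reward models score candidate responses.", "42"],
    toy_reward,
)
print(best)  # Reward models score candidate responses.
```

In practice the N candidates come from sampling the policy model, and `reward_fn` is a served reward model, so throughput of the serving layer directly bounds how large N can be.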
- Complete Documentation - Full documentation site
- 5-Minute Quickstart - Get started fast
- Interactive Examples - Hands-on Jupyter notebooks
- Building Custom RMs - Create your own
- Training Guide - Train reward models
- API Reference - Complete API docs
- Changelog - Version history and updates
Contributions are always encouraged!
We highly recommend installing the pre-commit hooks in this repo before submitting pull requests. These hooks are small housekeeping scripts executed on every git commit that take care of formatting and linting automatically.
```shell
pip install -e .
pre-commit install
```

Please refer to our Contribution Guide for more details.
If you use RM-Gallery in your work, please cite:
```bibtex
@software{rm_gallery,
  title  = {RM-Gallery: A One-Stop Reward Model Platform},
  author = {The RM-Gallery Team},
  url    = {https://github.com/modelscope/RM-Gallery},
  month  = {07},
  year   = {2025}
}
```


